Re: Hebrew problems in GNOME
aronovitch at gmail.com
Sun Jan 18 15:29:02 IST 2009
2009/1/18 Dov Grobgeld <dov.grobgeld at gmail.com>
Hi Dov, glad to see you are on this list :-)
BTW, please have a look at my recent comment on your Wrap-off+scrollbar bug
report, to see whether I describe the current situation well.
A few comments.
> 1. The Unicode algorithm actually does not specify how the paragraph
> direction is determined. Instead it delegates the decision to a "higher level
> protocol". It was therefore necessary to create a strategy for defining how
> the GtkTextView should determine it. In the end I wrote a complex routine
> that sweeps backwards and, if necessary, forwards.
Right. I was referring to the default paragraph direction rules (paragraph
rules P2 and P3 in UAX #9), which you (or someone else?) employ by using
pango_find_direction. Your algorithm takes control when this returns
neutral. I was a bit sloppy in the description; thanks for clarifying this.
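For anyone who has not looked at UAX #9, rules P2 and P3 boil down to a scan for the first strong character. Here is a minimal Python sketch of that idea, using only the standard unicodedata module. It is not Pango's code: the function name first_strong_direction and the "neutral" return value are made up for illustration, and the sketch does no special handling of bidi control characters.

```python
import unicodedata


def first_strong_direction(text):
    """Return 'ltr', 'rtl', or 'neutral' based on the first strong character.

    Hypothetical helper: a rough approximation of the first-strong scan
    described by UAX #9 rules P2/P3, not an implementation of Pango.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    # No strong character found. UAX #9 P3 would default to LTR here;
    # returning 'neutral' instead lets a caller apply its own fallback,
    # for example by looking at neighbouring paragraphs.
    return "neutral"


if __name__ == "__main__":
    for sample in ["hello", "\u05e9\u05dc\u05d5\u05dd", "123 ..."]:
        print(repr(sample), first_strong_direction(sample))
```

Returning a distinct "neutral" value is the design choice that leaves room for a higher-level fallback such as the backwards/forwards sweep described in point 1.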
> 2. Even if it is possible by some higher order routine to insert the
> necessary paragraph direction overrides that Amit mentions, it is still
> necessary to decide what to do when there is text only. That is all the
> current algorithm is doing.
I did not say anything was wrong with the current algorithm. If you are
editing plaintext, tags are not saved and this info is lost, so a reasonable
default algorithm is a must. However, if we want to export to HTML or other
rich formats, we need to provide the user with more fine-grained control over
directionality (as with other tags).
> 3. The automatic routine can not work in certain cases. E.g. (consider
> capital characters to be RTL):
> 1. c++ IS A LANGUAGE.
> 2. <h1>SHALOM</h1>
> 3. moshe: HOW ARE YOU
> 4. MOSHE: how are you
> In the first example "c++" must be skipped to determine the base direction.
> One idea I had would be to embed c++ in "neutral override" to show that it
> is not to be used for automatic paragraph direction determination.
> The same is true for the rest of the examples, but it may be done more
> automatically. If the source editor would embed "<h1>" in neutral override and
> the IM client would embed the "moshe:" and "MOSHE:" via the same protection,
> then it would be rendered directly.
Interesting. There is a draft for a new standard by Matitiahu Allouche that
should be able to handle at least (2) above (bidi in structured
expressions). It does, however, describe desired display (not necessarily
the means) in terms of existing control-chars (no "neutral override"...).
I'll forward it to you. Note, however, that this is still under discussion.
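To make the cases in point 3 concrete, here is a small Python sketch with real Hebrew in place of the capitals. It shows that a plain first-strong scan picks LTR for the first three examples, and that prefixing an existing control character, RIGHT-TO-LEFT MARK (U+200F), flips the result. This is only an illustration of one existing control character and is not taken from the draft; the helper name base_direction is hypothetical.

```python
import unicodedata

RLM = "\u200f"  # RIGHT-TO-LEFT MARK: invisible character with bidi class R


def base_direction(text):
    """Hypothetical first-strong scan; not Pango's or GTK's actual routine."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return "neutral"


# Hebrew stand-ins for examples 1 to 3 (written with escapes to keep the
# source readable in a left-to-right editor).
examples = [
    "c++ \u05d4\u05d9\u05d0 \u05e9\u05e4\u05d4.",             # "c++ IS A LANGUAGE."
    "<h1>\u05e9\u05dc\u05d5\u05dd</h1>",                      # "<h1>SHALOM</h1>"
    "moshe: \u05de\u05d4 \u05e9\u05dc\u05d5\u05de\u05da",     # "moshe: HOW ARE YOU"
]

if __name__ == "__main__":
    for s in examples:
        # First column: direction as typed; second: with an RLM prefix.
        print(base_direction(s), base_direction(RLM + s))
```

The mark only changes which character the scan sees first; whether it also gives the desired display inside the paragraph is a separate question.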
> 4. It would be great if there would be a common BiDi interaction document
> that would describe the interaction with regards to issues that are not
> handled by the Unicode algorithm. Things like cursor motion, split cursor,
> right clip override characters.
I happen to sit on a committee at the SII (Standards Institution of Israel,
machon hatkanim) which is working on exactly these things.
If other people here are interested in helping with these issues, please
contact me (I already mailed some of you, but it would not hurt to have, e.g.,
a Qt-er on board...).