Decomposed Greek forms in Accordance 13.1

June 9, 2020

Accordance 13.1 Release Notes: Importing/Exporting Content Improvements:

"Changed the export of Greek and Hebrew to unicode to be fully decomposed, rather than electing to use Extended Greek (U+1F00…) or Presentation Hebrew (U+FB20…) forms. This is due to an incompatibility with Nota Bene."

Whoa! That's a pretty significant change. I've been using Unicode since 2003, when I went through the painstaking process of converting my old ASCII Word files into Mellel Unicode files. I have many, many hundreds of files, all using composed Greek forms, i.e. the Greek Extended block U+1Fxx. My Greek keyboard renders composed forms. My raw GNT files are in composed forms. I'm not too happy about now mixing in decomposed forms whenever I copy+paste from Accordance (which is nearly every day). Several of my fonts which render composed Greek forms just fine do not render decomposed forms at all well. And all this just to provide compatibility with Nota Bene. Could this "Export Greek as fully decomposed" not be made optional, set in Preferences? I don't consider this an improvement.

June 9, 2020

We can consider making the change optional. However, according to the Unicode Consortium:

Q: Can one use the presentation forms in a data file?

A: Use of presentation forms is not recommended because it does not guarantee data integrity and interoperability. In the particular case of Arabic, data files should include only the characters in the main Arabic block (U+0600..U+06FF) and Arabic supplement blocks (U+0750..U+07FF, U+08A0..U+08FF), rather than the presentation form blocks.

This is one of the other reasons we made the change. It keeps Unicode more interoperable. If a font or Unicode implementation doesn't understand a presentation form (e.g. U+1F71 "GREEK SMALL LETTER ALPHA WITH OXIA"), it simply cannot render anything. But if the characters are U+03B1 "GREEK SMALL LETTER ALPHA" + U+0301 "COMBINING ACUTE ACCENT", then you at least get the alpha, the oxia, or both. And if the font is well designed, it knows to combine them when presenting the characters (thus called presentation forms) and swap in the glyph for U+1F71. So, the design of Unicode is to keep the raw characters in their decomposed form, but allow for composed presentation characters for clearer rendering to the screen.
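To make the composed/decomposed distinction concrete, here is a minimal sketch, assuming Python 3 and its standard `unicodedata` module (neither is mentioned in the release notes, and this is not how Accordance itself performs the export). It decomposes Greek Extended characters with NFD and recomposes them with NFC; the helper name `codepoints` is made up for illustration.

```python
import unicodedata

def codepoints(text):
    """Return the code points of text as 'U+XXXX' strings (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]

# U+1F00 GREEK SMALL LETTER ALPHA WITH PSILI, a composed Greek Extended character.
alpha_psili = "\u1f00"
decomposed = unicodedata.normalize("NFD", alpha_psili)
print(codepoints(decomposed))   # ['U+03B1', 'U+0313']
recomposed = unicodedata.normalize("NFC", decomposed)
print(codepoints(recomposed))   # ['U+1F00']

# U+1F71 GREEK SMALL LETTER ALPHA WITH OXIA, the example used in the post above.
alpha_oxia = "\u1f71"
oxia_nfd = unicodedata.normalize("NFD", alpha_oxia)
oxia_nfc = unicodedata.normalize("NFC", alpha_oxia)
print(codepoints(oxia_nfd))     # ['U+03B1', 'U+0301']
print(codepoints(oxia_nfc))     # ['U+03AC']
```

Note that recomposition does not always return the original code point: under NFC, alpha with oxia comes back as U+03AC (alpha with tonos) from the basic Greek block, because U+1F71 is canonically equivalent to it. Files that rely on the U+1F71 code point specifically would not round-trip through normalization unchanged.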

June 9, 2020

Thanks for replying Joel.

I've done a lot more investigation this morning, and the problem seems to be in the Mellel text engine. I've tested with five different fonts that ought to render polytonic Greek well: SBL BibLit and SBL Greek; GraecaU (Linguists' Software); Galatia SIL and Gentium Plus (also SIL). Mellel, which uses its own text engine, presents composed forms well in all fonts. It properly presents all decomposed forms in SBL Greek and GraecaU, but does not properly present more complex decomposed forms (combined breathing mark + accent) in SBL BibLit or the SIL fonts. It's curious that Mellel presents decomposed Greek text well in SBL Greek but not in SBL BibLit.

Pages, which uses the macOS text engine, presents decomposed forms correctly in all five fonts.

I'll take this up with Mellel. I've used Mellel as my word processor since 2003, when it was the only one that could handle RTL Unicode Hebrew.

So, I think I will eventually adjust to decomposed forms, but it's going to take some work.
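When reporting a rendering problem like this, it can help to list exactly which letters carry two or more combining marks (for example, breathing mark + accent). The following is a minimal sketch, assuming Python 3; the function names `mark_clusters` and `stacked_marks` are hypothetical, and the grouping is a simplification that treats every character with a non-zero combining class as a mark rather than doing full grapheme segmentation.

```python
import unicodedata

def mark_clusters(text):
    """Group NFD text into (base, marks) pairs.

    A character with a non-zero canonical combining class is attached
    to the preceding base character.
    """
    clusters = []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) and clusters:
            clusters[-1][1].append(ch)
        else:
            clusters.append([ch, []])
    return [(base, "".join(marks)) for base, marks in clusters]

def stacked_marks(text):
    """Return only the clusters whose base carries two or more combining marks."""
    return [(base, marks) for base, marks in mark_clusters(text) if len(marks) >= 2]

# "anthropos" with composed U+1F04 (alpha with psili and oxia) as its first letter.
word = "\u1f04\u03bd\u03b8\u03c1\u03c9\u03c0\u03bf\u03c2"
for base, marks in stacked_marks(word):
    names = [unicodedata.name(m) for m in marks]
    print(f"U+{ord(base):04X}", names)
```

Pasting a short passage through `stacked_marks` gives a list of the sequences worth testing font by font.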

June 20, 2020

I was surprised by this change also. This probably doesn't affect many people (since it hasn't come up yet), but searching in MS Word or LibreOffice Writer breaks if you copied and pasted Hebrew/Greek text from a version of Accordance prior to 13.1 and then try to search for text copied from 13.1.

The bigger issue for me is with Anki. I still use version 2.x, and it doesn't do Unicode normalization when searching or saving cards as Anki 2.1+ does, so if I search for a word I copied from Accordance 13.1, it won't find it even if it is there (in normalized form).

So the upgrades to Anki 2.1 and Accordance 13.1+ have to happen at the same time.

The issue with the word processor programs remains, however. They don't seem to allow Unicode normalization (or compare normalized text when searching).
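The search mismatch can be reproduced, and worked around outside the word processor, by normalizing both strings to the same form before comparing. This is a minimal sketch, assuming Python 3; `normalized_contains` is a hypothetical helper, not a feature of Word, LibreOffice, or Anki.

```python
import unicodedata

def normalized_contains(haystack, needle, form="NFD"):
    """Return True if needle occurs in haystack once both are in the same normalization form."""
    return unicodedata.normalize(form, needle) in unicodedata.normalize(form, haystack)

# Composed text, as a document pasted before the change might contain it.
old_paste = "\u1f00\u03c1\u03c7\u03ae"
# The same word fully decomposed, as described for exports from 13.1.
new_paste = unicodedata.normalize("NFD", old_paste)

plain_hit = new_paste in old_paste                        # code-point comparison
normalized_hit = normalized_contains(old_paste, new_paste)
print(plain_hit, normalized_hit)                          # False True

# Hebrew: presentation form U+FB2A versus shin + shin dot (U+05E9 U+05C1).
hebrew_hit = normalized_contains("\ufb2a", "\u05e9\u05c1")
print(hebrew_hit)                                         # True
```

The same idea applies to converting a whole file: running its text through `unicodedata.normalize` once, in whichever form the rest of the collection uses, makes plain searches match again.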

Edited June 20, 2020 by Elijah

PakBell

Joel Brown

PakBell

Elijah
