Proper Rendering of Tibetan in Android

January 18th, 2010

Summary: I managed to get Tibetan script rendering accurately on Android, even without Complex Script support, thanks to a Tibetan Unicode font with support for the GB/T20524-2006 standard.

Since I’ve been studying Tibetan, and have started programming on Android, I thought I’d try out Nathan Freitas’ instructions on enabling Tibetan font support in Android.  I’m not going to duplicate his instructions here, but they involve rooting your phone and installing a Tibetan Unicode font to replace the built-in DroidSansFallback font, used to display Chinese, Japanese, and Korean.

However, once I got it working, I realized that, though the characters were indeed Tibetan, they weren’t properly laid out. Tibetan is a fairly complex script to render properly, since it allows for tall vertical stacks of characters with idiosyncratic joining rules. Here is an example, taken from a supplication to Tara, showing how it was originally rendered on my cell phone:

[Image: the Tibetan sample as originally rendered on the phone]

OM JE TSUN MA PAK MA DROL MA (Om, the Revered Lady Noble Tara)

If you don’t know Tibetan, that could look decent (and certainly mysteriously beautiful), so let me show the end result, how it’s supposed to look:

[Image: the Tibetan sample, properly rendered]

OM JE TSUN MA PAK MA DROL MA (Om, the Revered Lady Noble Tara)

Note that the letters don’t overlap anymore, and some of them have changed shape to accommodate their companions.  For example, the first syllable, an OM, has had its accents adjusted so they don’t overlap, and the shape of the NA-RO, the seagull-like accent, has been changed to anchor it off of the middle line.  The second syllable has more drastic changes, where the RA is simplified to blend in with the JA below it properly.  In the third syllable, the SHAB-KYU, the hook accent underneath, has been moved up snug to its letter.  The last major change is toward the end, where a RA subscript has been moved down to join up properly with the bottom of its stack.

Though the individual glyphs were being rendered perfectly, they were not being combined properly. It turns out the need for better language support is a long-standing complaint about Android, and is in the Issues list as #5925. Comment 15 is a very good description by an unidentified Google engineer of the core problems with Complex Script Support. (If you’d like to encourage Android to support this properly, please add a comment to that issue.) The main problem is that, though FreeType (used in Android for font rendering) renders fonts perfectly, there is no support for the ligatures necessary to render complex writing systems such as Arabic, Thai, Devanagari, and Tibetan.

Chris Fynn is one of the main people who’s been working on Tibetan Unicode standards, and his font, Jomolhari, is my favorite Tibetan font (you can see it in the above pictures, and it’s available under the Open Font License).  He helped with Tibetan rendering in Pango, HarfBuzz, and ICU, any of which would render Tibetan properly, but none of these have yet been integrated into Android.  He lives in Bhutan, and I met him when I visited there.

Chris mentioned that there is a less-well-known Tibetan text encoding named GB/T20524-2006, which resides in the 0xF300 to 0xF8FF space. The “right way” to handle Tibetan has always been to combine glyphs on the fly.  However, since the Chinese are used to having a separate encoding of each unique combination of characters, they created a Chinese National Standard using the Unicode Private Use Area (PUA) to encode each combination of Tibetan glyphs as a single precomposed ligature.  Chris implemented support for this in Jomolhari partly in an effort to get Tibetan fonts working properly in some products (such as Adobe InDesign and Photoshop) that didn’t fully support Tibetan at the time.
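To make the two encodings concrete, here is a minimal sketch (not taken from TibConvert.java) that tells them apart by code point. The class and method names are my own inventions for illustration; the only facts it relies on are the Unicode Tibetan block (U+0F00 to U+0FFF) and the 0xF300 to 0xF8FF range quoted above.

```java
// Sketch only: classify characters by the two ranges discussed in the text.
// TibetanRanges and its method names are hypothetical, not part of any library.
final class TibetanRanges {
    // Standard Unicode Tibetan block.
    static final int TIBETAN_START = 0x0F00;
    static final int TIBETAN_END = 0x0FFF;

    // Private Use Area range quoted above for GB/T20524-2006 precomposed glyphs.
    static final int PRECOMPOSED_START = 0xF300;
    static final int PRECOMPOSED_END = 0xF8FF;

    static boolean isTibetanUnicode(int codePoint) {
        return codePoint >= TIBETAN_START && codePoint <= TIBETAN_END;
    }

    static boolean isPrecomposed(int codePoint) {
        return codePoint >= PRECOMPOSED_START && codePoint <= PRECOMPOSED_END;
    }

    // Both ranges sit in the Basic Multilingual Plane, so scanning
    // char by char is enough here.
    static boolean containsTibetanUnicode(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (isTibetanUnicode(text.charAt(i))) {
                return true;
            }
        }
        return false;
    }
}
```

A check like `containsTibetanUnicode` could be used to skip the conversion step entirely for pages that have no Tibetan in them.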

He suggested using Andrew West’s BabelPad, which is a Unicode text editor that also supports all sorts of interesting conversions (including rewriting various Tibetan, Mongolian, Manchu, and Vietnamese encodings as Unicode), to re-encode whatever I wanted to display in Tibetan. This worked once I installed Jomolhari on the handset as the DroidSansFallback.ttf font.

I realized that, using Android’s WebView and WebViewClient classes, I could build a Tibetan web browser that intercepted URL requests, downloaded the HTML, and rewrote any Tibetan Unicode as precomposed characters. I would just have to implement a massive conversion table to do it. Fortunately, Andrew West gave me a code sample in C++ that implemented this 1500-line table, and I was able to quickly port it to Java. (BTW, for future reference, don’t cut-and-paste large C++ code files directly into Eclipse’s Java editor. It grinds away for a long time, patiently highlighting all the syntax errors for you, before it lets you fix any of them.) It worked the first time I got it to compile, resulting in a beautiful, properly rendered web page.
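For readers wondering what a table-driven rewrite like this looks like, here is a toy sketch of one plausible approach: greedy longest-match substitution. It is not the code from TibConvert.java, and I am not claiming the real table works this way. The class name is hypothetical, and the two table entries are placeholders: the input sequences are real Tibetan Unicode (RA, subjoined JA, vowel sign E), but the PUA values they map to are invented and are NOT the GB/T20524-2006 assignments.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of greedy longest-match substitution. Not the real converter.
final class PrecomposeSketch {
    private static final Map<String, String> TABLE = new HashMap<>();
    private static int maxKeyLength = 0;

    private static void put(String sequence, String precomposed) {
        TABLE.put(sequence, precomposed);
        maxKeyLength = Math.max(maxKeyLength, sequence.length());
    }

    static {
        // PLACEHOLDER VALUES: the right-hand sides are made up for illustration
        // and do not come from the standard. The real table has ~1500 lines.
        put("\u0F62\u0F97\u0F7A", "\uF300"); // RA + subjoined JA + vowel E
        put("\u0F62\u0F97", "\uF301");       // RA + subjoined JA
    }

    // At each position, try the longest table key first so that a full stack
    // wins over any shorter stack that is a prefix of it.
    static String convert(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            boolean matched = false;
            int longest = Math.min(maxKeyLength, input.length() - i);
            for (int len = longest; len >= 1; len--) {
                String replacement = TABLE.get(input.substring(i, i + len));
                if (replacement != null) {
                    out.append(replacement);
                    i += len;
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                // No stack starts here: copy the character through unchanged.
                out.append(input.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String converted = convert("\u0F62\u0F97\u0F7A");
        System.out.println(Integer.toHexString(converted.charAt(0)));
    }
}
```

Trying the longest key first matters because a shorter stack can be a prefix of a longer one, as with the two placeholder entries here.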

A screenshot of the Tibetan WebView app.

I’m planning to clean up the rough edges on the browser, and release that code soon as the seed of a Tibetan open-source Android suite.  I thought that I’d make the Java code implementing the conversion available right away, even though it could also use some optimizing & clean-up, so that others can work with it.  It’s available at:

http://tom.to/code/TibConvert.java

It’s very easy to use:

composedData = TibConvert.convertUnicodeToPrecomposedTibetan(data);
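To show where that call might sit in the browser flow described earlier, here is a rough plain-Java sketch of the step between downloading a page and displaying it. It assumes the page is UTF-8, which will not always be true. The class name is hypothetical, and the converter is passed in as a function so the sketch stands alone; in practice it would be the TibConvert call above.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

// Sketch only: decode downloaded bytes, run a converter over the text,
// and return the string that would then be handed to the view.
final class HtmlRewriteSketch {
    // ASSUMPTION: the downloaded bytes are UTF-8 encoded HTML.
    static String rewrite(byte[] downloadedHtml, UnaryOperator<String> converter) {
        String html = new String(downloadedHtml, StandardCharsets.UTF_8);
        return converter.apply(html);
    }
}
```

On Android, the returned string could then be given to the WebView (for example via `loadDataWithBaseURL`), though that is only one way to wire it up.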

I look forward to being able to help out with other useful Tibetan apps, such as a custom keyboard, IM and e-mail apps, and a dictionary.  Please contact me if you’re interested in working together on these types of things.

I’ve really been enjoying the Android development process.   Running code and debugging code directly on the handset is working well, and I like how the libraries are structured.  This might be what it takes for me to get back into having fun with developing client-side Java.

May there be a great flourishing of Tibetan apps on the Android!  Ge-lek pel!

BTW, I spent some time trying to get Tibetan font support working on a stock Android phone, without having to root it, using the font embedding technique described here. I couldn’t get it to work (just lots of beautiful boxes showing up instead of Tibetan), but reportedly some fonts just can’t be embedded. Of course, embedding a 2.2MB font inside an app might be pushing the envelope a bit, but it would be great if someone could get this to work. It also appears that the only widgets you can set the font for are TextView widgets, so you wouldn’t get a web browser that way, and you’d have to implement a new widget set derived from TextView to get Tibetan to show up properly.

