Supplementary Characters in the Java Platform
Abstract
This article describes how supplementary characters are supported in the Java platform. Supplementary characters are characters in the Unicode standard whose code points are above U+FFFF, and which therefore cannot be described as single 16-bit entities such as the char data type in the Java programming language. Such characters are generally rare, but some are used, for example, as part of Chinese and Japanese personal names, and so support for them is commonly required for government applications in East Asian countries.
The Java platform is being enhanced to enable processing of supplementary characters with minimal impact on existing applications. New low-level APIs enable operations on individual characters where necessary. Most text-processing APIs, however, use character sequences, such as the String class or character arrays. These are now interpreted as UTF-16 sequences, and the implementations of these APIs have been changed to correctly handle supplementary characters. The enhancements are part of version 1.5 of the Java 2 Platform, Standard Edition (J2SE).
Besides explaining these enhancements in detail, this article also gives application developers guidelines for determining and implementing the changes needed to use the complete Unicode character set.
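To make the distinction between char values and code points concrete, the following minimal sketch shows how one supplementary character occupies two char values in a String. It assumes a J2SE 1.5 or later runtime and uses only methods of java.lang.Character and java.lang.String. The class name SupplementaryDemo, the helper sample(), and the choice of U+2000B as the example character are illustrative assumptions, not taken from this article.

```java
// Hypothetical demo class; the name and the sample character are illustrative only.
final class SupplementaryDemo {
    // U+2000B lies above U+FFFF, so it is a supplementary character.
    static final int CODE_POINT = 0x2000B;

    // Builds a String containing exactly one supplementary character.
    static String sample() {
        // Character.toChars returns the UTF-16 encoding of a code point:
        // one char for U+0000..U+FFFF, a surrogate pair above that.
        return new String(Character.toChars(CODE_POINT));
    }

    public static void main(String[] args) {
        String s = sample();

        // Number of 16-bit char values in the String.
        System.out.println("char count:       " + s.length());
        // Number of Unicode code points in the same String.
        System.out.println("code point count: " + s.codePointCount(0, s.length()));

        // The two char values are a high surrogate followed by a low surrogate.
        System.out.println("high surrogate:   " + Character.isHighSurrogate(s.charAt(0)));
        System.out.println("low surrogate:    " + Character.isLowSurrogate(s.charAt(1)));

        // codePointAt reassembles the pair into the original code point.
        System.out.println("code point:       U+"
                + Integer.toHexString(s.codePointAt(0)).toUpperCase());
    }
}
```

Code that steps through a String one char at a time would see the two surrogates here as separate values, which is the kind of case the guidelines for application developers need to address.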