This document presents information that will help font developers create or support OpenType fonts for the Khmer script languages covered by the Unicode Standard. In this specification, font developers will learn how to encode complex script features in their fonts, choose character sets, organize font information, and use existing tools to produce Khmer fonts. Registered features of the Khmer script are defined and illustrated, encodings are listed, and templates are included for compiling Khmer layout tables for OpenType fonts. This document also presents information about the Khmer OpenType shaping engine of Uniscribe, the Windows component responsible for text layout. In addition to being a primer and specification for the creation and support of Khmer fonts, this document is intended to more broadly illustrate the OpenType Layout architecture, feature schemes, and operating system support for shaping and positioning text.

The following terms are useful for understanding the layout features and script rules discussed in this document.

Base Glyph – The one and only consonant, independent vowel, or number in the syllable that is written in its “full” (nominal) form. In Khmer, the first consonant or independent vowel of the syllable usually forms the base glyph. Layout operations are defined in terms of a base glyph, not a base character, since the results of the shaping process are a series of glyphs.

U+17D2 (COENG) – Code point placed before a consonant or independent vowel, which causes the formation of the subscript form of that letter. The COENG is always tied to the letter following it and is always handled as a unit with that letter. NOTE: The shape of the COENG is arbitrary and is not rendered.

Consonant – Represents a single consonant sound. Consonants may exist in different contextual forms, and have an inherent vowel (usually, the long vowel “A”). Therefore, those illustrated in the examples to follow are named, for example, “KA” and “TA,” rather than just “K” or “T.”

Consonant Shifters – Used to shift the base consonant between registers (U+17C9, U+17CA).

Khmer Syllable – Effective orthographic “unit” of Khmer writing systems, consisting of a consonant and a vowel core, optionally with one or two subscripts inserted between the two, and followed by signs. Syllables are composed of consonant letters, independent vowels, dependent or inherent vowels, and signs. In a text sequence, these characters are stored in phonetic order, although they may not be represented in phonetic order when displayed. Once a syllable is shaped, it is indivisible (but deletions of its characters may take place starting from the end). The cursor cannot be positioned within the syllable. Transformations discussed in this document do not cross syllable boundaries.

ROBAT (U+17CC) – Above-base or combining form of the letter RO, used in most scripts if RO is the first consonant in the syllable and is not the base consonant. It is ordered as one would write the text. For example, the word KARMA would be encoded as KA + MA + ROBAT.

Subscript Glyph – Subscript form of a consonant or independent vowel. Subscripts are formed by a combination of the COENG, followed by a consonant or independent vowel. COENG does not have a conventional visual form in Khmer, as it is a control character to cause the formation of a subscript. An implementer of Khmer should never allow the entry of the COENG character by itself.
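
As a rough illustration of the COENG and ROBAT rules described in this section, the following Python sketch groups a stored character sequence into units, keeping each COENG attached to the letter that follows it and rejecting a COENG that stands by itself. It is only a sketch under stated assumptions, not the behavior of Uniscribe or any other shaping engine: the names `is_letter` and `group_units` are hypothetical, and the code point range used for consonants and independent vowels (U+1780 to U+17B3) is taken from the Unicode Khmer block rather than from this document.

```python
# Code points named in this document.
COENG = "\u17D2"   # subscript-forming control character
ROBAT = "\u17CC"   # above-base sign

# Letters used in the examples (Unicode Khmer block).
KA, TA, MA = "\u1780", "\u178F", "\u1798"


def is_letter(ch: str) -> bool:
    """Assumption: consonants and independent vowels occupy U+1780..U+17B3."""
    return "\u1780" <= ch <= "\u17B3"


def group_units(text: str) -> list[str]:
    """Split text into units, treating COENG + following letter as one unit.

    Raises ValueError if a COENG is not followed by a consonant or
    independent vowel, since COENG should never be entered by itself.
    """
    units = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == COENG:
            if i + 1 >= len(text) or not is_letter(text[i + 1]):
                raise ValueError(f"COENG at index {i} has no letter after it")
            units.append(ch + text[i + 1])  # COENG travels with its letter
            i += 2
        else:
            units.append(ch)
            i += 1
    return units


# KARMA is stored as KA + MA + ROBAT, in the order one would write it.
karma = KA + MA + ROBAT
# A base consonant followed by a subscript: KA + COENG + TA.
subscripted = KA + COENG + TA

print(len(group_units(karma)), len(group_units(subscripted)))
```

Treating the COENG and its letter as a single unit also suits end-first deletion: removing the last unit of `group_units(subscripted)` drops the COENG together with its letter, so no lone COENG is left behind.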