Script (Unicode)
In Unicode, a script is a collection of letters and other written signs used to represent textual information in one or more writing systems.[1] Some scripts support only one writing system and language, for example Armenian. Other scripts support many different writing systems; for example, the Latin script supports English, French, German, Italian, Vietnamese, Latin itself, and several other languages. Some languages use more than one writing system and therefore more than one script. Turkish, for example, was written in the Arabic script until the early 20th century, when it switched to the Latin script. For a list of languages supported by each script, see the list of languages by writing system. More or less complementary to scripts are symbols and Unicode control characters.
The unified diacritical characters and unified punctuation characters frequently have the "common" or "inherited" script property. However, the individual scripts often have their own punctuation and diacritics. As a result, many scripts include not only letters but also diacritics and other marks, punctuation, numerals and even their own idiosyncratic symbols and space characters.
Unicode 8.0 includes over 80 modern scripts plus over 40 ancient (out of use for a thousand years or more) and historic (out of use for several hundred years) scripts.[2][3] More scripts are in the process of being encoded or have been tentatively allocated for encoding in roadmaps.[4]
Definition and classification
When multiple languages use the same script, there are frequently some differences, particularly in diacritics and other marks. For example, Swedish and English both use the Latin script. However, Swedish includes the character ‘å’ (sometimes called a "Swedish O"), while English has no such character, nor does English use the combining ring above (U+030A) diacritic with any letter. In general, languages that share a script share many of the same characters. Despite these peripheral differences, the Swedish and English writing systems are said to use the same Latin script. The Unicode abstraction of scripts is thus a basic organizing technique. The differences between alphabets or writing systems remain, and are supported through Unicode’s flexible scripts, combining marks and collation algorithms.
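As a small illustration of how a letter such as ‘å’ relates to its diacritic, the sketch below uses Python's standard `unicodedata` module to split the precomposed letter into a Latin base letter plus a combining mark. The variable names are illustrative only.

```python
import unicodedata

precomposed = "\u00e5"  # å as a single precomposed code point

# Canonical decomposition (NFD) splits the letter into base letter + combining mark.
decomposed = unicodedata.normalize("NFD", precomposed)
parts = [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in decomposed]

for code_point, name in parts:
    print(code_point, name)

# Canonical composition (NFC) gives back the single precomposed code point.
recomposed = unicodedata.normalize("NFC", decomposed)
```

The base letter is an ordinary Latin letter shared with English; only the combining mark distinguishes the Swedish letter.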
Script versus writing system
"Writing system" is sometimes treated as a synonym for script. However, it can also refer to a specific, concrete writing system that a script supports. For example, the Vietnamese writing system is supported by the Latin script. A writing system may also cover more than one script; for example, the Japanese writing system makes use of the Han, Hiragana and Katakana scripts.
Most writing systems can be broadly divided into several categories: logographic, syllabic, alphabetic (or segmental), abugida, abjad and featural. However, features of several of these categories may be found in any given writing system in varying proportions, which often makes it difficult to categorize a system purely. The term complex system is sometimes used to describe those where the admixture makes classification problematic.
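The Japanese case can be sketched in Python. The standard `unicodedata` module does not expose the Script property, so this example guesses a script from each character's name. That is only a rough heuristic, assumed here for illustration, and not how the Unicode Character Database assigns scripts. The function name `rough_script` is hypothetical.

```python
import unicodedata

def rough_script(ch):
    """Guess a script label from the character name (heuristic, NOT the real Script property)."""
    name = unicodedata.name(ch, "")
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "Han"
    if name.startswith("HIRAGANA"):
        return "Hiragana"
    if name.startswith("KATAKANA"):
        return "Katakana"
    if name.startswith("LATIN"):
        return "Latin"
    return "Other"

# A short Japanese phrase mixing ideographs and both syllabaries.
text = "日本語のテキスト"
scripts_used = sorted({rough_script(ch) for ch in text})
print(scripts_used)
```

A production implementation would read script assignments from the Unicode Character Database rather than rely on name prefixes.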
Unicode supports all of these types of writing systems through its numerous scripts. Unicode also adds further properties to characters to help differentiate the various characters and the ways they behave within Unicode text processing algorithms.
Special script property values
In addition to explicit or specific script properties Unicode uses three special values:[5]
- Common: Unicode can assign each character in the UCS to only a single script. However, many characters may be used in more than one script, either because they are not part of a formal natural-language writing system or because they are unified across many writing systems. Examples include currency signs, symbols, numerals and punctuation marks. Unicode assigns these characters to the "common" script (ISO 15924 code Zyyy).
- Inherited: Many diacritics and non-spacing combining characters may be applied to characters from more than one script. In these cases Unicode assigns them to the "inherited" script (ISO 15924 code Zinh), which means that they have the same script class as the base character with which they combine, and so in different contexts they may be treated as belonging to different scripts. For example, U+0308 ̈ COMBINING DIAERESIS may combine with either U+0065 e LATIN SMALL LETTER E to create a Latin "ë", or with U+0435 е CYRILLIC SMALL LETTER IE for the Cyrillic "ё". In the former case it inherits the Latin script of the base character whereas in the latter case it inherits the Cyrillic script of the base character.
- Unknown: The value of "unknown" script (ISO 15924 code Zzzz) is given to unassigned, private use, noncharacter, and surrogate code points.
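The handling of the "inherited" value described above can be sketched as follows. The `SCRIPT_OF` table is a hypothetical, hand-written excerpt covering only the characters mentioned in the example. A real implementation would load the full assignments from the Unicode Character Database. As a simplification, any code point missing from this toy table is reported as "Unknown".

```python
# Hypothetical excerpt of script assignments, for illustration only.
SCRIPT_OF = {
    0x0065: "Latin",      # e  LATIN SMALL LETTER E
    0x0435: "Cyrillic",   # е  CYRILLIC SMALL LETTER IE
    0x0308: "Inherited",  # COMBINING DIAERESIS
    0x0021: "Common",     # !  EXCLAMATION MARK
}

def resolve_scripts(text):
    """Return one script label per character, replacing Inherited
    with the script of the preceding character."""
    resolved = []
    previous = "Unknown"  # nothing to inherit from at the start of the text
    for ch in text:
        script = SCRIPT_OF.get(ord(ch), "Unknown")  # toy fallback, see lead-in
        if script == "Inherited":
            script = previous
        resolved.append(script)
        previous = script
    return resolved

latin_case = resolve_scripts("e\u0308")            # Latin e + diaeresis
cyrillic_case = resolve_scripts("\u0435\u0308!")   # Cyrillic ie + diaeresis + "!"
```

The same combining diaeresis resolves to Latin in the first string and to Cyrillic in the second, while the exclamation mark stays Common.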
Character categories within scripts
Unicode provides a general category property for each character, so in addition to belonging to a script, every character also has a general category. Scripts typically include letter characters of several kinds: uppercase letters, lowercase letters and modifier letters. A few precomposed ligatures, such as Dz (U+01F2), are categorized as titlecase letters. Such titlecase ligatures are all in the Latin and Greek scripts and are all compatibility characters, so Unicode discourages authors from using them. It is unlikely that new titlecase letters will be added in the future.
Most writing systems do not differentiate between uppercase and lowercase letters. For those scripts, all letters are categorized as "other letter" or "modifier letter". Ideographs such as Unihan ideographs are also categorized as "other letters". A few scripts, however, do differentiate between uppercase and lowercase: Latin, Cyrillic, Greek, Armenian, Georgian, and Deseret. Even in these scripts there are some letters that are neither uppercase nor lowercase.
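The letter categories and the case distinction can be inspected with Python's standard `unicodedata` module, which reports the two-letter general category codes (Lu, Ll, Lt, Lm, Lo). The sample characters below are chosen for illustration only.

```python
import unicodedata

samples = {
    "Latin capital A": "\u0041",
    "Latin small a": "\u0061",
    "titlecase Dz": "\u01F2",
    "modifier small h": "\u02B0",
    "Hebrew alef": "\u05D0",
    "Han ideograph": "\u4E2D",
}

# Look up the general category of each sample character.
categories = {label: unicodedata.category(ch) for label, ch in samples.items()}
for label, category in categories.items():
    print(f"{label}: {category}")

# Armenian distinguishes case; Hebrew does not, so upper() leaves alef unchanged.
armenian_upper = "\u0561".upper()   # ARMENIAN SMALL LETTER AYB
hebrew_upper = "\u05D0".upper()     # HEBREW LETTER ALEF
```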
Scripts can also contain any other general category character such as marks (diacritic and otherwise), numbers (numerals), punctuation, separators (word separators such as spaces), symbols and non-graphical format characters. These are included in a particular script when they are unique to that script. Other such characters are generally unified and included in the punctuation or diacritic blocks. However, the bulk of characters in any script (other than the common and inherited scripts) are letters.
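As a rough sketch of how the non-letter categories show up in ordinary text, the following Python snippet tallies characters by the first letter of their general category code. The `MAJOR_CLASS` labels and the function name `category_profile` are illustrative choices, not Unicode terminology.

```python
import unicodedata
from collections import Counter

# First letter of the general category code mapped to a readable label.
MAJOR_CLASS = {
    "L": "letter",
    "M": "mark",
    "N": "number",
    "P": "punctuation",
    "S": "symbol",
    "Z": "separator",
    "C": "other",  # control, format, surrogate, private use, unassigned
}

def category_profile(text):
    """Count the characters of text by major general category class."""
    return Counter(MAJOR_CLASS[unicodedata.category(ch)[0]] for ch in text)

# "Cafe" followed by a combining acute accent, then spaces, a currency sign,
# a digit and a punctuation mark.
profile = category_profile("Cafe\u0301 costs $5!")
print(dict(profile))
```

Even in this short string, letters make up the bulk of the characters, and the remaining categories each contribute only one or two.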
Table of scripts in Unicode
Unicode defines over a hundred script names (called "Alias" or "Property value alias"), based on the ISO 15924 list. Unicode uses the "Common" script name for ISO 15924's Zyyy (code for undetermined script), "Inherited" for ISO 15924's Zinh (code for inherited script), and "Unknown" for ISO 15924's Zzzz (code for uncoded script). Some ISO 15924 script codes are not used, among them Zsym (Symbols) and Zmth (Mathematical notation); these are not considered scripts in the Unicode sense.
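The relationship between ISO 15924 codes and Unicode aliases can be sketched as a simple lookup. The dictionary below is a small hand-copied excerpt for illustration, and the function name `unicode_alias` is hypothetical. A complete mapping would be taken from the Unicode Character Database.

```python
# Excerpt of ISO 15924 code -> Unicode script alias (illustrative, not complete).
ISO_TO_UNICODE_ALIAS = {
    "Latn": "Latin",
    "Armn": "Armenian",
    "Cyrl": "Cyrillic",
    "Zyyy": "Common",
    "Zinh": "Inherited",
    "Zzzz": "Unknown",
}

# ISO 15924 codes that Unicode does not treat as scripts.
NOT_UNICODE_SCRIPTS = {"Zsym", "Zmth"}

def unicode_alias(iso_code):
    """Return the Unicode alias for an ISO 15924 code, or None if the code
    is not a Unicode script (or is missing from this excerpt)."""
    if iso_code in NOT_UNICODE_SCRIPTS:
        return None
    return ISO_TO_UNICODE_ALIAS.get(iso_code)

print(unicode_alias("Zyyy"))
print(unicode_alias("Zmth"))
```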
The Code, No. and Name columns come from ISO 15924; the remaining columns describe the script in Unicode.[e]
Code | No. | Name | Alias[f] | Direction | Version | Characters | Remark |
---|---|---|---|---|---|---|---|
Adlm | 166 | Adlam | | R-to-L | | | Approved for inclusion in a future version of the Unicode Standard[6][7] |
Afak | 439 | Afaka | | L-to-R | | | Not in Unicode, proposal under review by the Unicode Technical Committee[6] |
Aghb | 239 | Caucasian Albanian | Caucasian Albanian | L-to-R | 7.0 | 53 | Ancient/historic |
Ahom | 338 | Ahom, Tai Ahom | Ahom | L-to-R | 8.0 | 57 | Ancient/historic |
Arab | 160 | Arabic | Arabic | R-to-L | 1.0 | 1,257 | |
Aran | 161 | Arabic (Nastaliq variant) | | R-to-L | | | Typographic variant of Arabic |
Armi | 124 | Imperial Aramaic | Imperial Aramaic | R-to-L | 5.2 | 31 | Ancient/historic |
Armn | 230 | Armenian | Armenian | L-to-R | 1.0 | 93 | |
Avst | 134 | Avestan | Avestan | R-to-L | 5.2 | 61 | Ancient/historic |
Bali | 360 | Balinese | Balinese | L-to-R | 5.0 | 121 | |
Bamu | 435 | Bamum | Bamum | L-to-R | 5.2 | 657 | |
Bass | 259 | Bassa Vah | Bassa Vah | L-to-R | 7.0 | 36 | Ancient/historic |
Batk | 365 | Batak | Batak | L-to-R | 6.0 | 56 | |
Beng | 325 | Bengali | Bengali | L-to-R | 1.0 | 93 | |
Bhks | 334 | Bhaiksuki | | L-to-R | | | Approved for inclusion in a future version of the Unicode Standard[6] |
Blis | 550 | Blissymbols | | L-to-R | | | Not in Unicode, proposal in initial/exploratory stage[6] |
Bopo | 285 | Bopomofo | Bopomofo | L-to-R | 1.0 | 70 | |
Brah | 300 | Brahmi | Brahmi | L-to-R | 6.0 | 109 | Ancient/historic |
Brai | 570 | Braille | Braille | L-to-R | 3.0 | 256 | |
Bugi | 367 | Buginese | Buginese | L-to-R | 4.1 | 30 | |
Buhd | 372 | Buhid | Buhid | L-to-R | 3.2 | 20 | |
Cakm | 349 | Chakma | Chakma | L-to-R | 6.1 | 67 | |
Cans | 440 | Unified Canadian Aboriginal Syllabics | Canadian Aboriginal | L-to-R | 3.0 | 710 | |
Cari | 201 | Carian | Carian | L-to-R | 5.1 | 49 | Ancient/historic |
Cham | 358 | Cham | Cham | L-to-R | 5.1 | 83 | |
Cher | 445 | Cherokee | Cherokee | L-to-R | 3.0 | 172 | |
Cirt | 291 | Cirth | | L-to-R | | | Not in Unicode |
Copt | 204 | Coptic | Coptic | L-to-R | 1.0 | 137 | Ancient/historic, Disunified from Greek in 4.1 |
Cprt | 403 | Cypriot | Cypriot | R-to-L | 4.0 | 55 | Ancient/historic |
Cyrl | 220 | Cyrillic | Cyrillic | L-to-R | 1.0 | 434 | |
Cyrs | 221 | Cyrillic (Old Church Slavonic variant) | | L-to-R | | | Not in Unicode |
Deva | 315 | Devanagari (Nagari) | Devanagari | L-to-R | 1.0 | 154 | |
Dsrt | 250 | Deseret (Mormon) | Deseret | L-to-R | 3.1 | 80 | |
Dupl | 755 | Duployan shorthand, Duployan stenography | Duployan | L-to-R | 7.0 | 143 | |
Egyd | 070 | Egyptian demotic | | R-to-L | | | Not in Unicode |
Egyh | 060 | Egyptian hieratic | | R-to-L | | | Not in Unicode |
Egyp | 050 | Egyptian hieroglyphs | Egyptian Hieroglyphs | L-to-R | 5.2 | 1,071 | Ancient/historic |
Elba | 226 | Elbasan | Elbasan | L-to-R | 7.0 | 40 | Ancient/historic |
Ethi | 430 | Ethiopic (Geʻez) | Ethiopic | L-to-R | 3.0 | 495 | |
Geok | 241 | Khutsuri (Asomtavruli and Nuskhuri) | Georgian | L-to-R | | | Unicode groups Geok and Geor together as "Georgian" |
Geor | 240 | Georgian (Mkhedruli) | Georgian | L-to-R | 1.0 | 127 | For Unicode, see also Geok |
Glag | 225 | Glagolitic | Glagolitic | L-to-R | 4.1 | 94 | Ancient/historic |
Goth | 206 | Gothic | Gothic | L-to-R | 3.1 | 27 | Ancient/historic |
Gran | 343 | Grantha | Grantha | L-to-R | 7.0 | 85 | Ancient/historic |
Grek | 200 | Greek | Greek | L-to-R | 1.0 | 516 | |
Gujr | 320 | Gujarati | Gujarati | L-to-R | 1.0 | 85 | |
Guru | 310 | Gurmukhi | Gurmukhi | L-to-R | 1.0 | 79 | |
Hanb | 503 | Han with Bopomofo (alias for Han + Bopomofo) | | L-to-R | | | See Hani, Bopo |
Hang | 286 | Hangul (Hangŭl, Hangeul) | Hangul | L-to-R | 1.0 | 11,739 | Hangul syllables relocated in 2.0 |
Hani | 500 | Han (Hanzi, Kanji, Hanja) | Han | L-to-R | 1.0 | 81,734 | |
Hano | 371 | Hanunoo (Hanunóo) | Hanunoo | L-to-R | 3.2 | 21 | |
Hans | 501 | Han (Simplified variant) | | L-to-R | | | Subset of Hani |
Hant | 502 | Han (Traditional variant) | | L-to-R | | | Subset of Hani |
Hatr | 127 | Hatran | Hatran | R-to-L | 8.0 | 26 | Ancient/historic |
Hebr | 125 | Hebrew | Hebrew | R-to-L | 1.0 | 133 | |
Hira | 410 | Hiragana | Hiragana | L-to-R | 1.0 | 91 | |
Hluw | 080 | Anatolian Hieroglyphs (Luwian Hieroglyphs, Hittite Hieroglyphs) | Anatolian Hieroglyphs | L-to-R | 8.0 | 583 | Ancient/historic |
Hmng | 450 | Pahawh Hmong | Pahawh Hmong | L-to-R | 7.0 | 127 | |
Hrkt | 412 | Japanese syllabaries (alias for Hiragana + Katakana) | Katakana or Hiragana | L-to-R | | | See Hira, Kana |
Hung | 176 | Old Hungarian (Hungarian Runic) | Old Hungarian | R-to-L | 8.0 | 108 | Ancient/historic |
Inds | 610 | Indus (Harappan) | | R-to-L | | | Not in Unicode, proposal in initial/exploratory stage[6] |
Ital | 210 | Old Italic (Etruscan, Oscan, etc.) | Old Italic | L-to-R | 3.1 | 36 | Ancient/historic |
Jamo | 284 | Jamo (alias for Jamo subset of Hangul) | | L-to-R | | | Subset of Hang |
Java | 361 | Javanese | Javanese | L-to-R | 5.2 | 90 | |
Jpan | 413 | Japanese (alias for Han + Hiragana + Katakana) | | L-to-R | | | See Hani, Hira and Kana |
Jurc | 510 | Jurchen | | L-to-R | | | Not in Unicode |
Kali | 357 | Kayah Li | Kayah Li | L-to-R | 5.1 | 47 | |
Kana | 411 | Katakana | Katakana | L-to-R | 1.0 | 300 | |
Khar | 305 | Kharoshthi | Kharoshthi | R-to-L | 4.1 | 65 | Ancient/historic |
Khmr | 355 | Khmer | Khmer | L-to-R | 3.0 | 146 | |
Khoj | 322 | Khojki | Khojki | L-to-R | 7.0 | 61 | Ancient/historic |
Kitl | 505 | Khitan large script | | L-to-R | | | Not in Unicode |
Kits | 288 | Khitan small script | | T-to-B | | | Not in Unicode |
Knda | 345 | Kannada | Kannada | L-to-R | 1.0 | 87 | |
Kore | 287 | Korean (alias for Hangul + Han) | | L-to-R | | | See Hani and Hang |
Kpel | 436 | Kpelle | | L-to-R | | | Not in Unicode, proposal in initial/exploratory stage[6] |
Kthi | 317 | Kaithi | Kaithi | L-to-R | 5.2 | 66 | Ancient/historic |
Lana | 351 | Tai Tham (Lanna) | Tai Tham | L-to-R | 5.2 | 127 | |
Laoo | 356 | Lao | Lao | L-to-R | 1.0 | 67 | |
Latf | 217 | Latin (Fraktur variant) | | L-to-R | | | Typographic variant of Latin |
Latg | 216 | Latin (Gaelic variant) | | L-to-R | | | Typographic variant of Latin |
Latn | 215 | Latin | Latin | L-to-R | 1.0 | 1,349 | See Latin script in Unicode |
Leke | 364 | Leke | | L-to-R | | | Not in Unicode |
Lepc | 335 | Lepcha (Róng) | Lepcha | L-to-R | 5.1 | 74 | |
Limb | 336 | Limbu | Limbu | L-to-R | 4.0 | 68 | |
Lina | 400 | Linear A | Linear A | L-to-R | 7.0 | 341 | Ancient/historic |
Linb | 401 | Linear B | Linear B | L-to-R | 4.0 | 211 | Ancient/historic |
Lisu | 399 | Lisu (Fraser) | Lisu | L-to-R | 5.2 | 48 | |
Loma | 437 | Loma | | L-to-R | | | Not in Unicode, proposal in initial/exploratory stage[6] |
Lyci | 202 | Lycian | Lycian | L-to-R | 5.1 | 29 | Ancient/historic |
Lydi | 116 | Lydian | Lydian | R-to-L | 5.1 | 27 | Ancient/historic |
Mahj | 314 | Mahajani | Mahajani | L-to-R | 7.0 | 39 | Ancient/historic |
Mand | 140 | Mandaic, Mandaean | Mandaic | R-to-L | 6.0 | 29 | |
Mani | 139 | Manichaean | Manichaean | R-to-L | 7.0 | 51 | Ancient/historic |
Marc | 332 | Marchen | | L-to-R | | | Approved for inclusion in a future version of the Unicode Standard[6][7] |
Maya | 090 | Mayan hieroglyphs | | | | | Not in Unicode |
Mend | 438 | Mende Kikakui | Mende Kikakui | R-to-L | 7.0 | 213 | |
Merc | 101 | Meroitic Cursive | Meroitic Cursive | R-to-L | 6.1 | 90 | Ancient/historic |
Mero | 100 | Meroitic Hieroglyphs | Meroitic Hieroglyphs | R-to-L | 6.1 | 32 | Ancient/historic |
Mlym | 347 | Malayalam | Malayalam | L-to-R | 1.0 | 100 | |
Modi | 324 | Modi, Moḍī | Modi | L-to-R | 7.0 | 79 | Ancient/historic |
Mong | 145 | Mongolian | Mongolian | T-to-B | 3.0 | 153 | Includes Clear, Manchu scripts |
Moon | 218 | Moon (Moon code, Moon script, Moon type) | | | | | Not in Unicode, proposal in initial/exploratory stage[6] |
Mroo | 199 | Mro, Mru | Mro | L-to-R | 7.0 | 43 | |
Mtei | 337 | Meitei Mayek (Meithei, Meetei) | Meetei Mayek | L-to-R | 5.2 | 79 | |
Mult | 323 | Multani | Multani | L-to-R | 8.0 | 38 | Ancient/historic |
Mymr | 350 | Myanmar (Burmese) | Myanmar | L-to-R | 3.0 | 223 | |
Narb | 106 | Old North Arabian (Ancient North Arabian) | Old North Arabian | R-to-L | 7.0 | 32 | Ancient/historic |
Nbat | 159 | Nabataean | Nabataean | R-to-L | 7.0 | 40 | Ancient/historic |
Newa | 333 | Newa, Newar, Newari, Nepāla lipi | | L-to-R | | | Approved for inclusion in a future version of the Unicode Standard[6][7] |
Nkgb | 420 | Nakhi Geba ('Na-'Khi ²Ggŏ-¹baw, Naxi Geba) | | L-to-R | | | Not in Unicode, proposal in initial/exploratory stage[6] |
Nkoo | 165 | N’Ko | NKo | R-to-L | 5.0 | 59 | |
Nshu | 499 | Nüshu | | L-to-R | | | Approved for inclusion in a future version of the Unicode Standard[8][7] |
Ogam | 212 | Ogham | Ogham | | 3.0 | 29 | Ancient/historic |
Olck | 261 | Ol Chiki (Ol Cemet’, Ol, Santali) | Ol Chiki | L-to-R | 5.1 | 48 | |
Orkh | 175 | Old Turkic, Orkhon Runic | Old Turkic | R-to-L | 5.2 | 73 | Ancient/historic |
Orya | 327 | Oriya | Oriya | L-to-R | 1.0 | 90 | |
Osge | 219 | Osage | | L-to-R | | | Approved for inclusion in a future version of the Unicode Standard[6][7] |
Osma | 260 | Osmanya | Osmanya | L-to-R | 4.0 | 40 | |
Palm | 126 | Palmyrene | Palmyrene | R-to-L | 7.0 | 32 | Ancient/historic |
Pauc | 263 | Pau Cin Hau | Pau Cin Hau | L-to-R | 7.0 | 57 | |
Perm | 227 | Old Permic | Old Permic | L-to-R | 7.0 | 43 | Ancient/historic |
Phag | 331 | Phags-pa | Phags-pa | T-to-B | 5.0 | 56 | Ancient/historic |
Phli | 131 | Inscriptional Pahlavi | Inscriptional Pahlavi | R-to-L | 5.2 | 27 | Ancient/historic |
Phlp | 132 | Psalter Pahlavi | Psalter Pahlavi | R-to-L | 7.0 | 29 | Ancient/historic |
Phlv | 133 | Book Pahlavi | | R-to-L | | | Not in Unicode |
Phnx | 115 | Phoenician | Phoenician | R-to-L | 5.0 | 29 | Ancient/historic |
Piqd | 293 | Klingon (KLI pIqaD) | | L-to-R | | | Rejected for inclusion in the Unicode Standard[9][10] |
Plrd | 282 | Miao (Pollard) | Miao | L-to-R | 6.1 | 133 | |
Prti | 130 | Inscriptional Parthian | Inscriptional Parthian | R-to-L | 5.2 | 30 | Ancient/historic |
Qaaa | 900 | Reserved for private use (start) | | | | | Not in Unicode |
Qaai | 908 | (Private use) | | | | | Not in Unicode (before version 5.2, this was used instead of Zinh) |
Qabx | 949 | Reserved for private use (end) | | | | | Not in Unicode |
Rjng | 363 | Rejang (Redjang, Kaganga) | Rejang | L-to-R | 5.1 | 37 | |
Roro | 620 | Rongorongo | | | | | Not in Unicode, proposal in initial/exploratory stage[6] |
Runr | 211 | Runic | Runic | L-to-R | 3.0 | 86 | Ancient/historic |
Samr | 123 | Samaritan | Samaritan | R-to-L | 5.2 | 61 | |
Sara | 292 | Sarati | | | | | Not in Unicode |
Sarb | 105 | Old South Arabian | Old South Arabian | R-to-L | 5.2 | 32 | Ancient/historic |
Saur | 344 | Saurashtra | Saurashtra | L-to-R | 5.1 | 81 | |
Sgnw | 095 | SignWriting | SignWriting | T-to-B | 8.0 | 672 | |
Shaw | 281 | Shavian (Shaw) | Shavian | L-to-R | 4.0 | 48 | |
Shrd | 319 | Sharada, Śāradā | Sharada | L-to-R | 6.1 | 94 | |
Sidd | 302 | Siddham, Siddhaṃ, Siddhamātṛkā | Siddham | L-to-R | 7.0 | 92 | Ancient/historic |
Sind | 318 | Khudawadi, Sindhi | Khudawadi | L-to-R | 7.0 | 69 | |
Sinh | 348 | Sinhala | Sinhala | L-to-R | 3.0 | 110 | |
Sora | 398 | Sora Sompeng | Sora Sompeng | L-to-R | 6.1 | 35 | |
Sund | 362 | Sundanese | Sundanese | L-to-R | 5.1 | 72 | |
Sylo | 316 | Syloti Nagri | Syloti Nagri | L-to-R | 4.1 | 44 | |
Syrc | 135 | Syriac | Syriac | R-to-L | 3.0 | 77 | |
Syre | 138 | Syriac (Estrangelo variant) | | R-to-L | | | Typographic variant of Syriac |
Syrj | 137 | Syriac (Western variant) | | R-to-L | | | Typographic variant of Syriac |
Syrn | 136 | Syriac (Eastern variant) | | R-to-L | | | Typographic variant of Syriac |
Tagb | 373 | Tagbanwa | Tagbanwa | L-to-R | 3.2 | 18 | |
Takr | 321 | Takri, Ṭākrī, Ṭāṅkrī | Takri | L-to-R | 6.1 | 66 | |
Tale | 353 | Tai Le | Tai Le | L-to-R | 4.0 | 35 | |
Talu | 354 | New Tai Lue | New Tai Lue | L-to-R | 4.1 | 83 | |
Taml | 346 | Tamil | Tamil | L-to-R | 1.0 | 72 | |
Tang | 520 | Tangut | | L-to-R | | | Approved for inclusion in a future version of the Unicode Standard[6][7] |
Tavt | 359 | Tai Viet | Tai Viet | L-to-R | 5.2 | 72 | |
Telu | 340 | Telugu | Telugu | L-to-R | 1.0 | 96 | |
Teng | 290 | Tengwar | | L-to-R | | | Not in Unicode |
Tfng | 120 | Tifinagh (Berber) | Tifinagh | L-to-R | 4.1 | 59 | |
Tglg | 370 | Tagalog (Baybayin, Alibata) | Tagalog | L-to-R | 3.2 | 20 | |
Thaa | 170 | Thaana | Thaana | R-to-L | 3.0 | 50 | |
Thai | 352 | Thai | Thai | L-to-R | 1.0 | 86 | |
Tibt | 330 | Tibetan | Tibetan | L-to-R | 2.0 | 207 | Added in 1.0, removed in 1.1 and reintroduced in 2.0 |
Tirh | 326 | Tirhuta | Tirhuta | L-to-R | 7.0 | 82 | |
Ugar | 040 | Ugaritic | Ugaritic | L-to-R | 4.0 | 31 | Ancient/historic |
Vaii | 470 | Vai | Vai | L-to-R | 5.1 | 300 | |
Visp | 280 | Visible Speech | | L-to-R | | | Not in Unicode |
Wara | 262 | Warang Citi (Varang Kshiti) | Warang Citi | L-to-R | 7.0 | 84 | |
Wole | 480 | Woleai | | R-to-L | | | Not in Unicode, proposal in initial/exploratory stage[6] |
Xpeo | 030 | Old Persian | Old Persian | L-to-R | 4.1 | 50 | Ancient/historic |
Xsux | 020 | Cuneiform, Sumero-Akkadian | Cuneiform | L-to-R | 5.0 | 1,234 | Ancient/historic |
Yiii | 460 | Yi | Yi | L-to-R | 3.0 | 1,220 | |
Zinh | 994 | Code for inherited script | Inherited | Inherited | | 563 | |
Zmth | 995 | Mathematical notation | | L-to-R | | | Not a 'script' in Unicode |
Zsym | 996 | Symbols | | | | | Not a 'script' in Unicode |
Zsye | 993 | Symbols (emoji variant) | | | | | Not a 'script' in Unicode |
Zxxx | 997 | Code for unwritten documents | | | | | Not a 'script' in Unicode |
Zyyy | 998 | Code for undetermined script | Common | | | 7,179 | |
Zzzz | 999 | Code for uncoded script | Unknown | | | 993,309 | All other code points |
Notes
References
- [1] Glossary of Unicode Terms
- [2] Unicode Character Database: Scripts
- [3] "Chapter 14: Additional Ancient and Historic Scripts". The Unicode Standard, Version 6.2 (PDF). Mountain View, CA: Unicode, Inc. September 2012. p. 473. ISBN 978-1-936213-07-8.
- [4] Roadmaps to Unicode. http://www.unicode.org/roadmaps/
- [5] Unicode Standard Annex #24: Unicode Script Property
- [6] "Proposed New Scripts". Unicode Consortium. 2015-06-12. Retrieved 2015-07-16.
- [7] "Roadmap to the SMP". Unicode Consortium. 2015-03-26. Retrieved 2015-05-22.
- [8] "Proposed New Characters: Pipeline Table". Unicode Consortium. 2015-07-08. Retrieved 2015-07-16.
- [9] Michael Everson (1997-09-18). "Proposal to encode Klingon in Plane 1 of ISO/IEC 10646-2".
- [10] The Unicode Consortium (2001-08-14). "Approved Minutes of the UTC 87 / L2 184 Joint Meeting".