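This page summarizes the supported Unicode blocks. To refer to a block in a lexical rule, use the construct `ub{NAME}`, where NAME is a block name from the first column of the table below, for example `ub{BasicLatin}`.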
In the table, the Start and End columns give the inclusive bounds of each block, expressed as Unicode code points. See Unicode blocks.
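To make the "bounds included" convention concrete, here is a minimal Python sketch that reads the `U+XXXX` notation used in the table and tests whether a character falls inside a block. The helper names `parse_code_point` and `in_block` are hypothetical and only serve this illustration; they are not part of the tool documented here. The sample bounds are copied from the BasicLatin and Emoticons rows.

```python
def parse_code_point(text: str) -> int:
    """Convert 'U+0041' or 'U+0001F600' notation to an integer code point."""
    if not text.startswith("U+"):
        raise ValueError(f"expected U+ notation, got {text!r}")
    return int(text[2:], 16)


def in_block(ch: str, start: str, end: str) -> bool:
    """Return True if ch lies in [start, end]; both bounds are included."""
    return parse_code_point(start) <= ord(ch) <= parse_code_point(end)


# BasicLatin spans U+0000..U+007F.
print(in_block("A", "U+0000", "U+007F"))       # True
print(in_block("\u007f", "U+0000", "U+007F"))  # True: the End bound itself is included
print(in_block("\u0080", "U+0000", "U+007F"))  # False: first code point of Latin-1Supplement

# Emoticons uses the 8-digit form for code points above U+FFFF.
print(in_block("\U0001F600", "U+0001F600", "U+0001F64F"))  # True
```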
Block Name | Start | End |
---|---|---|
BasicLatin | U+0000 | U+007F |
Latin-1Supplement | U+0080 | U+00FF |
LatinExtended-A | U+0100 | U+017F |
LatinExtended-B | U+0180 | U+024F |
IPAExtensions | U+0250 | U+02AF |
SpacingModifierLetters | U+02B0 | U+02FF |
CombiningDiacriticalMarks | U+0300 | U+036F |
GreekandCoptic | U+0370 | U+03FF |
Cyrillic | U+0400 | U+04FF |
CyrillicSupplement | U+0500 | U+052F |
Armenian | U+0530 | U+058F |
Hebrew | U+0590 | U+05FF |
Arabic | U+0600 | U+06FF |
Syriac | U+0700 | U+074F |
ArabicSupplement | U+0750 | U+077F |
Thaana | U+0780 | U+07BF |
NKo | U+07C0 | U+07FF |
Samaritan | U+0800 | U+083F |
Mandaic | U+0840 | U+085F |
SyriacSupplement | U+0860 | U+086F |
ArabicExtended-A | U+08A0 | U+08FF |
Devanagari | U+0900 | U+097F |
Bengali | U+0980 | U+09FF |
Gurmukhi | U+0A00 | U+0A7F |
Gujarati | U+0A80 | U+0AFF |
Oriya | U+0B00 | U+0B7F |
Tamil | U+0B80 | U+0BFF |
Telugu | U+0C00 | U+0C7F |
Kannada | U+0C80 | U+0CFF |
Malayalam | U+0D00 | U+0D7F |
Sinhala | U+0D80 | U+0DFF |
Thai | U+0E00 | U+0E7F |
Lao | U+0E80 | U+0EFF |
Tibetan | U+0F00 | U+0FFF |
Myanmar | U+1000 | U+109F |
Georgian | U+10A0 | U+10FF |
HangulJamo | U+1100 | U+11FF |
Ethiopic | U+1200 | U+137F |
EthiopicSupplement | U+1380 | U+139F |
Cherokee | U+13A0 | U+13FF |
UnifiedCanadianAboriginalSyllabics | U+1400 | U+167F |
Ogham | U+1680 | U+169F |
Runic | U+16A0 | U+16FF |
Tagalog | U+1700 | U+171F |
Hanunoo | U+1720 | U+173F |
Buhid | U+1740 | U+175F |
Tagbanwa | U+1760 | U+177F |
Khmer | U+1780 | U+17FF |
Mongolian | U+1800 | U+18AF |
UnifiedCanadianAboriginalSyllabicsExtended | U+18B0 | U+18FF |
Limbu | U+1900 | U+194F |
TaiLe | U+1950 | U+197F |
NewTaiLue | U+1980 | U+19DF |
KhmerSymbols | U+19E0 | U+19FF |
Buginese | U+1A00 | U+1A1F |
TaiTham | U+1A20 | U+1AAF |
CombiningDiacriticalMarksExtended | U+1AB0 | U+1AFF |
Balinese | U+1B00 | U+1B7F |
Sundanese | U+1B80 | U+1BBF |
Batak | U+1BC0 | U+1BFF |
Lepcha | U+1C00 | U+1C4F |
OlChiki | U+1C50 | U+1C7F |
CyrillicExtended-C | U+1C80 | U+1C8F |
SundaneseSupplement | U+1CC0 | U+1CCF |
VedicExtensions | U+1CD0 | U+1CFF |
PhoneticExtensions | U+1D00 | U+1D7F |
PhoneticExtensionsSupplement | U+1D80 | U+1DBF |
CombiningDiacriticalMarksSupplement | U+1DC0 | U+1DFF |
LatinExtendedAdditional | U+1E00 | U+1EFF |
GreekExtended | U+1F00 | U+1FFF |
GeneralPunctuation | U+2000 | U+206F |
SuperscriptsandSubscripts | U+2070 | U+209F |
CurrencySymbols | U+20A0 | U+20CF |
CombiningDiacriticalMarksforSymbols | U+20D0 | U+20FF |
LetterlikeSymbols | U+2100 | U+214F |
NumberForms | U+2150 | U+218F |
Arrows | U+2190 | U+21FF |
MathematicalOperators | U+2200 | U+22FF |
MiscellaneousTechnical | U+2300 | U+23FF |
ControlPictures | U+2400 | U+243F |
OpticalCharacterRecognition | U+2440 | U+245F |
EnclosedAlphanumerics | U+2460 | U+24FF |
BoxDrawing | U+2500 | U+257F |
BlockElements | U+2580 | U+259F |
GeometricShapes | U+25A0 | U+25FF |
MiscellaneousSymbols | U+2600 | U+26FF |
Dingbats | U+2700 | U+27BF |
MiscellaneousMathematicalSymbols-A | U+27C0 | U+27EF |
SupplementalArrows-A | U+27F0 | U+27FF |
BraillePatterns | U+2800 | U+28FF |
SupplementalArrows-B | U+2900 | U+297F |
MiscellaneousMathematicalSymbols-B | U+2980 | U+29FF |
SupplementalMathematicalOperators | U+2A00 | U+2AFF |
MiscellaneousSymbolsandArrows | U+2B00 | U+2BFF |
Glagolitic | U+2C00 | U+2C5F |
LatinExtended-C | U+2C60 | U+2C7F |
Coptic | U+2C80 | U+2CFF |
GeorgianSupplement | U+2D00 | U+2D2F |
Tifinagh | U+2D30 | U+2D7F |
EthiopicExtended | U+2D80 | U+2DDF |
CyrillicExtended-A | U+2DE0 | U+2DFF |
SupplementalPunctuation | U+2E00 | U+2E7F |
CJKRadicalsSupplement | U+2E80 | U+2EFF |
KangxiRadicals | U+2F00 | U+2FDF |
IdeographicDescriptionCharacters | U+2FF0 | U+2FFF |
CJKSymbolsandPunctuation | U+3000 | U+303F |
Hiragana | U+3040 | U+309F |
Katakana | U+30A0 | U+30FF |
Bopomofo | U+3100 | U+312F |
HangulCompatibilityJamo | U+3130 | U+318F |
Kanbun | U+3190 | U+319F |
BopomofoExtended | U+31A0 | U+31BF |
CJKStrokes | U+31C0 | U+31EF |
KatakanaPhoneticExtensions | U+31F0 | U+31FF |
EnclosedCJKLettersandMonths | U+3200 | U+32FF |
CJKCompatibility | U+3300 | U+33FF |
CJKUnifiedIdeographsExtensionA | U+3400 | U+4DBF |
YijingHexagramSymbols | U+4DC0 | U+4DFF |
CJKUnifiedIdeographs | U+4E00 | U+9FFF |
YiSyllables | U+A000 | U+A48F |
YiRadicals | U+A490 | U+A4CF |
Lisu | U+A4D0 | U+A4FF |
Vai | U+A500 | U+A63F |
CyrillicExtended-B | U+A640 | U+A69F |
Bamum | U+A6A0 | U+A6FF |
ModifierToneLetters | U+A700 | U+A71F |
LatinExtended-D | U+A720 | U+A7FF |
SylotiNagri | U+A800 | U+A82F |
CommonIndicNumberForms | U+A830 | U+A83F |
Phags-pa | U+A840 | U+A87F |
Saurashtra | U+A880 | U+A8DF |
DevanagariExtended | U+A8E0 | U+A8FF |
KayahLi | U+A900 | U+A92F |
Rejang | U+A930 | U+A95F |
HangulJamoExtended-A | U+A960 | U+A97F |
Javanese | U+A980 | U+A9DF |
MyanmarExtended-B | U+A9E0 | U+A9FF |
Cham | U+AA00 | U+AA5F |
MyanmarExtended-A | U+AA60 | U+AA7F |
TaiViet | U+AA80 | U+AADF |
MeeteiMayekExtensions | U+AAE0 | U+AAFF |
EthiopicExtended-A | U+AB00 | U+AB2F |
LatinExtended-E | U+AB30 | U+AB6F |
CherokeeSupplement | U+AB70 | U+ABBF |
MeeteiMayek | U+ABC0 | U+ABFF |
HangulSyllables | U+AC00 | U+D7AF |
HangulJamoExtended-B | U+D7B0 | U+D7FF |
PrivateUseArea | U+E000 | U+F8FF |
CJKCompatibilityIdeographs | U+F900 | U+FAFF |
AlphabeticPresentationForms | U+FB00 | U+FB4F |
ArabicPresentationForms-A | U+FB50 | U+FDFF |
VariationSelectors | U+FE00 | U+FE0F |
VerticalForms | U+FE10 | U+FE1F |
CombiningHalfMarks | U+FE20 | U+FE2F |
CJKCompatibilityForms | U+FE30 | U+FE4F |
SmallFormVariants | U+FE50 | U+FE6F |
ArabicPresentationForms-B | U+FE70 | U+FEFF |
HalfwidthandFullwidthForms | U+FF00 | U+FFEF |
Specials | U+FFF0 | U+FFFF |
LinearBSyllabary | U+00010000 | U+0001007F |
LinearBIdeograms | U+00010080 | U+000100FF |
AegeanNumbers | U+00010100 | U+0001013F |
AncientGreekNumbers | U+00010140 | U+0001018F |
AncientSymbols | U+00010190 | U+000101CF |
PhaistosDisc | U+000101D0 | U+000101FF |
Lycian | U+00010280 | U+0001029F |
Carian | U+000102A0 | U+000102DF |
CopticEpactNumbers | U+000102E0 | U+000102FF |
OldItalic | U+00010300 | U+0001032F |
Gothic | U+00010330 | U+0001034F |
OldPermic | U+00010350 | U+0001037F |
Ugaritic | U+00010380 | U+0001039F |
OldPersian | U+000103A0 | U+000103DF |
Deseret | U+00010400 | U+0001044F |
Shavian | U+00010450 | U+0001047F |
Osmanya | U+00010480 | U+000104AF |
Osage | U+000104B0 | U+000104FF |
Elbasan | U+00010500 | U+0001052F |
CaucasianAlbanian | U+00010530 | U+0001056F |
LinearA | U+00010600 | U+0001077F |
CypriotSyllabary | U+00010800 | U+0001083F |
ImperialAramaic | U+00010840 | U+0001085F |
Palmyrene | U+00010860 | U+0001087F |
Nabataean | U+00010880 | U+000108AF |
Hatran | U+000108E0 | U+000108FF |
Phoenician | U+00010900 | U+0001091F |
Lydian | U+00010920 | U+0001093F |
MeroiticHieroglyphs | U+00010980 | U+0001099F |
MeroiticCursive | U+000109A0 | U+000109FF |
Kharoshthi | U+00010A00 | U+00010A5F |
OldSouthArabian | U+00010A60 | U+00010A7F |
OldNorthArabian | U+00010A80 | U+00010A9F |
Manichaean | U+00010AC0 | U+00010AFF |
Avestan | U+00010B00 | U+00010B3F |
InscriptionalParthian | U+00010B40 | U+00010B5F |
InscriptionalPahlavi | U+00010B60 | U+00010B7F |
PsalterPahlavi | U+00010B80 | U+00010BAF |
OldTurkic | U+00010C00 | U+00010C4F |
OldHungarian | U+00010C80 | U+00010CFF |
RumiNumeralSymbols | U+00010E60 | U+00010E7F |
Brahmi | U+00011000 | U+0001107F |
Kaithi | U+00011080 | U+000110CF |
SoraSompeng | U+000110D0 | U+000110FF |
Chakma | U+00011100 | U+0001114F |
Mahajani | U+00011150 | U+0001117F |
Sharada | U+00011180 | U+000111DF |
SinhalaArchaicNumbers | U+000111E0 | U+000111FF |
Khojki | U+00011200 | U+0001124F |
Multani | U+00011280 | U+000112AF |
Khudawadi | U+000112B0 | U+000112FF |
Grantha | U+00011300 | U+0001137F |
Newa | U+00011400 | U+0001147F |
Tirhuta | U+00011480 | U+000114DF |
Siddham | U+00011580 | U+000115FF |
Modi | U+00011600 | U+0001165F |
MongolianSupplement | U+00011660 | U+0001167F |
Takri | U+00011680 | U+000116CF |
Ahom | U+00011700 | U+0001173F |
WarangCiti | U+000118A0 | U+000118FF |
ZanabazarSquare | U+00011A00 | U+00011A4F |
Soyombo | U+00011A50 | U+00011AAF |
PauCinHau | U+00011AC0 | U+00011AFF |
Bhaiksuki | U+00011C00 | U+00011C6F |
Marchen | U+00011C70 | U+00011CBF |
MasaramGondi | U+00011D00 | U+00011D5F |
Cuneiform | U+00012000 | U+000123FF |
CuneiformNumbersandPunctuation | U+00012400 | U+0001247F |
EarlyDynasticCuneiform | U+00012480 | U+0001254F |
EgyptianHieroglyphs | U+00013000 | U+0001342F |
AnatolianHieroglyphs | U+00014400 | U+0001467F |
BamumSupplement | U+00016800 | U+00016A3F |
Mro | U+00016A40 | U+00016A6F |
BassaVah | U+00016AD0 | U+00016AFF |
PahawhHmong | U+00016B00 | U+00016B8F |
Miao | U+00016F00 | U+00016F9F |
IdeographicSymbolsandPunctuation | U+00016FE0 | U+00016FFF |
Tangut | U+00017000 | U+000187FF |
TangutComponents | U+00018800 | U+00018AFF |
KanaSupplement | U+0001B000 | U+0001B0FF |
KanaExtended-A | U+0001B100 | U+0001B12F |
Nushu | U+0001B170 | U+0001B2FF |
Duployan | U+0001BC00 | U+0001BC9F |
ShorthandFormatControls | U+0001BCA0 | U+0001BCAF |
ByzantineMusicalSymbols | U+0001D000 | U+0001D0FF |
MusicalSymbols | U+0001D100 | U+0001D1FF |
AncientGreekMusicalNotation | U+0001D200 | U+0001D24F |
TaiXuanJingSymbols | U+0001D300 | U+0001D35F |
CountingRodNumerals | U+0001D360 | U+0001D37F |
MathematicalAlphanumericSymbols | U+0001D400 | U+0001D7FF |
SuttonSignWriting | U+0001D800 | U+0001DAAF |
GlagoliticSupplement | U+0001E000 | U+0001E02F |
MendeKikakui | U+0001E800 | U+0001E8DF |
Adlam | U+0001E900 | U+0001E95F |
ArabicMathematicalAlphabeticSymbols | U+0001EE00 | U+0001EEFF |
MahjongTiles | U+0001F000 | U+0001F02F |
DominoTiles | U+0001F030 | U+0001F09F |
PlayingCards | U+0001F0A0 | U+0001F0FF |
EnclosedAlphanumericSupplement | U+0001F100 | U+0001F1FF |
EnclosedIdeographicSupplement | U+0001F200 | U+0001F2FF |
MiscellaneousSymbolsandPictographs | U+0001F300 | U+0001F5FF |
Emoticons | U+0001F600 | U+0001F64F |
OrnamentalDingbats | U+0001F650 | U+0001F67F |
TransportandMapSymbols | U+0001F680 | U+0001F6FF |
AlchemicalSymbols | U+0001F700 | U+0001F77F |
GeometricShapesExtended | U+0001F780 | U+0001F7FF |
SupplementalArrows-C | U+0001F800 | U+0001F8FF |
SupplementalSymbolsandPictographs | U+0001F900 | U+0001F9FF |
CJKUnifiedIdeographsExtensionB | U+00020000 | U+0002A6DF |
CJKUnifiedIdeographsExtensionC | U+0002A700 | U+0002B73F |
CJKUnifiedIdeographsExtensionD | U+0002B740 | U+0002B81F |
CJKUnifiedIdeographsExtensionE | U+0002B820 | U+0002CEAF |
CJKUnifiedIdeographsExtensionF | U+0002CEB0 | U+0002EBEF |
CJKCompatibilityIdeographsSupplement | U+0002F800 | U+0002FA1F |
Tags | U+000E0000 | U+000E007F |
VariationSelectorsSupplement | U+000E0100 | U+000E01EF |
SupplementaryPrivateUseArea-A | U+000F0000 | U+000FFFFF |
SupplementaryPrivateUseArea-B | U+00100000 | U+0010FFFF |
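
The listed ranges do not cover every code point. For instance, SyriacSupplement ends at U+086F and ArabicExtended-A starts at U+08A0, so U+0870 belongs to no block in this table. The following Python sketch shows one way to look up the block of a character with a binary search over the Start column. It assumes a small excerpt of the table, and the names `BLOCKS`, `STARTS` and `block_of` are hypothetical, chosen only for this illustration.

```python
from bisect import bisect_right
from typing import Optional

# Excerpt of the table above as (start, end, name), sorted by start.
# A full lookup would list every row; this subset is only for illustration.
BLOCKS = [
    (0x0000, 0x007F, "BasicLatin"),
    (0x0080, 0x00FF, "Latin-1Supplement"),
    (0x0370, 0x03FF, "GreekandCoptic"),
    (0x0860, 0x086F, "SyriacSupplement"),
    (0x08A0, 0x08FF, "ArabicExtended-A"),
    (0x1F600, 0x1F64F, "Emoticons"),
]
STARTS = [start for start, _, _ in BLOCKS]


def block_of(ch: str) -> Optional[str]:
    """Return the name of the block containing ch, or None if no row matches."""
    cp = ord(ch)
    # Index of the last block whose start is <= cp.
    i = bisect_right(STARTS, cp) - 1
    if i < 0:
        return None
    start, end, name = BLOCKS[i]
    # The End bound is inclusive, so cp == end still matches.
    return name if cp <= end else None


print(block_of("A"))           # BasicLatin
print(block_of("\u03b1"))      # GreekandCoptic (Greek small letter alpha)
print(block_of("\u0870"))      # None: falls in the gap before ArabicExtended-A
print(block_of("\U0001F600"))  # Emoticons
```

Because the blocks are sorted and do not overlap, a single binary search on the start bounds followed by one comparison against the end bound is enough.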