

This page lists the supported Unicode blocks. To refer to a block in a lexical rule, use the construct `ub{NAME}`, where `NAME` is a block name from the table.
In the table, the Start and End columns give the inclusive bounds of each block, expressed as Unicode code points. See Unicode blocks.
| Block Name | Start | End |
|---|---|---|
| BasicLatin | U+0000 | U+007F |
| Latin-1Supplement | U+0080 | U+00FF |
| LatinExtended-A | U+0100 | U+017F |
| LatinExtended-B | U+0180 | U+024F |
| IPAExtensions | U+0250 | U+02AF |
| SpacingModifierLetters | U+02B0 | U+02FF |
| CombiningDiacriticalMarks | U+0300 | U+036F |
| GreekandCoptic | U+0370 | U+03FF |
| Cyrillic | U+0400 | U+04FF |
| CyrillicSupplement | U+0500 | U+052F |
| Armenian | U+0530 | U+058F |
| Hebrew | U+0590 | U+05FF |
| Arabic | U+0600 | U+06FF |
| Syriac | U+0700 | U+074F |
| ArabicSupplement | U+0750 | U+077F |
| Thaana | U+0780 | U+07BF |
| NKo | U+07C0 | U+07FF |
| Samaritan | U+0800 | U+083F |
| Mandaic | U+0840 | U+085F |
| SyriacSupplement | U+0860 | U+086F |
| ArabicExtended-A | U+08A0 | U+08FF |
| Devanagari | U+0900 | U+097F |
| Bengali | U+0980 | U+09FF |
| Gurmukhi | U+0A00 | U+0A7F |
| Gujarati | U+0A80 | U+0AFF |
| Oriya | U+0B00 | U+0B7F |
| Tamil | U+0B80 | U+0BFF |
| Telugu | U+0C00 | U+0C7F |
| Kannada | U+0C80 | U+0CFF |
| Malayalam | U+0D00 | U+0D7F |
| Sinhala | U+0D80 | U+0DFF |
| Thai | U+0E00 | U+0E7F |
| Lao | U+0E80 | U+0EFF |
| Tibetan | U+0F00 | U+0FFF |
| Myanmar | U+1000 | U+109F |
| Georgian | U+10A0 | U+10FF |
| HangulJamo | U+1100 | U+11FF |
| Ethiopic | U+1200 | U+137F |
| EthiopicSupplement | U+1380 | U+139F |
| Cherokee | U+13A0 | U+13FF |
| UnifiedCanadianAboriginalSyllabics | U+1400 | U+167F |
| Ogham | U+1680 | U+169F |
| Runic | U+16A0 | U+16FF |
| Tagalog | U+1700 | U+171F |
| Hanunoo | U+1720 | U+173F |
| Buhid | U+1740 | U+175F |
| Tagbanwa | U+1760 | U+177F |
| Khmer | U+1780 | U+17FF |
| Mongolian | U+1800 | U+18AF |
| UnifiedCanadianAboriginalSyllabicsExtended | U+18B0 | U+18FF |
| Limbu | U+1900 | U+194F |
| TaiLe | U+1950 | U+197F |
| NewTaiLue | U+1980 | U+19DF |
| KhmerSymbols | U+19E0 | U+19FF |
| Buginese | U+1A00 | U+1A1F |
| TaiTham | U+1A20 | U+1AAF |
| CombiningDiacriticalMarksExtended | U+1AB0 | U+1AFF |
| Balinese | U+1B00 | U+1B7F |
| Sundanese | U+1B80 | U+1BBF |
| Batak | U+1BC0 | U+1BFF |
| Lepcha | U+1C00 | U+1C4F |
| OlChiki | U+1C50 | U+1C7F |
| CyrillicExtended-C | U+1C80 | U+1C8F |
| SundaneseSupplement | U+1CC0 | U+1CCF |
| VedicExtensions | U+1CD0 | U+1CFF |
| PhoneticExtensions | U+1D00 | U+1D7F |
| PhoneticExtensionsSupplement | U+1D80 | U+1DBF |
| CombiningDiacriticalMarksSupplement | U+1DC0 | U+1DFF |
| LatinExtendedAdditional | U+1E00 | U+1EFF |
| GreekExtended | U+1F00 | U+1FFF |
| GeneralPunctuation | U+2000 | U+206F |
| SuperscriptsandSubscripts | U+2070 | U+209F |
| CurrencySymbols | U+20A0 | U+20CF |
| CombiningDiacriticalMarksforSymbols | U+20D0 | U+20FF |
| LetterlikeSymbols | U+2100 | U+214F |
| NumberForms | U+2150 | U+218F |
| Arrows | U+2190 | U+21FF |
| MathematicalOperators | U+2200 | U+22FF |
| MiscellaneousTechnical | U+2300 | U+23FF |
| ControlPictures | U+2400 | U+243F |
| OpticalCharacterRecognition | U+2440 | U+245F |
| EnclosedAlphanumerics | U+2460 | U+24FF |
| BoxDrawing | U+2500 | U+257F |
| BlockElements | U+2580 | U+259F |
| GeometricShapes | U+25A0 | U+25FF |
| MiscellaneousSymbols | U+2600 | U+26FF |
| Dingbats | U+2700 | U+27BF |
| MiscellaneousMathematicalSymbols-A | U+27C0 | U+27EF |
| SupplementalArrows-A | U+27F0 | U+27FF |
| BraillePatterns | U+2800 | U+28FF |
| SupplementalArrows-B | U+2900 | U+297F |
| MiscellaneousMathematicalSymbols-B | U+2980 | U+29FF |
| SupplementalMathematicalOperators | U+2A00 | U+2AFF |
| MiscellaneousSymbolsandArrows | U+2B00 | U+2BFF |
| Glagolitic | U+2C00 | U+2C5F |
| LatinExtended-C | U+2C60 | U+2C7F |
| Coptic | U+2C80 | U+2CFF |
| GeorgianSupplement | U+2D00 | U+2D2F |
| Tifinagh | U+2D30 | U+2D7F |
| EthiopicExtended | U+2D80 | U+2DDF |
| CyrillicExtended-A | U+2DE0 | U+2DFF |
| SupplementalPunctuation | U+2E00 | U+2E7F |
| CJKRadicalsSupplement | U+2E80 | U+2EFF |
| KangxiRadicals | U+2F00 | U+2FDF |
| IdeographicDescriptionCharacters | U+2FF0 | U+2FFF |
| CJKSymbolsandPunctuation | U+3000 | U+303F |
| Hiragana | U+3040 | U+309F |
| Katakana | U+30A0 | U+30FF |
| Bopomofo | U+3100 | U+312F |
| HangulCompatibilityJamo | U+3130 | U+318F |
| Kanbun | U+3190 | U+319F |
| BopomofoExtended | U+31A0 | U+31BF |
| CJKStrokes | U+31C0 | U+31EF |
| KatakanaPhoneticExtensions | U+31F0 | U+31FF |
| EnclosedCJKLettersandMonths | U+3200 | U+32FF |
| CJKCompatibility | U+3300 | U+33FF |
| CJKUnifiedIdeographsExtensionA | U+3400 | U+4DBF |
| YijingHexagramSymbols | U+4DC0 | U+4DFF |
| CJKUnifiedIdeographs | U+4E00 | U+9FFF |
| YiSyllables | U+A000 | U+A48F |
| YiRadicals | U+A490 | U+A4CF |
| Lisu | U+A4D0 | U+A4FF |
| Vai | U+A500 | U+A63F |
| CyrillicExtended-B | U+A640 | U+A69F |
| Bamum | U+A6A0 | U+A6FF |
| ModifierToneLetters | U+A700 | U+A71F |
| LatinExtended-D | U+A720 | U+A7FF |
| SylotiNagri | U+A800 | U+A82F |
| CommonIndicNumberForms | U+A830 | U+A83F |
| Phags-pa | U+A840 | U+A87F |
| Saurashtra | U+A880 | U+A8DF |
| DevanagariExtended | U+A8E0 | U+A8FF |
| KayahLi | U+A900 | U+A92F |
| Rejang | U+A930 | U+A95F |
| HangulJamoExtended-A | U+A960 | U+A97F |
| Javanese | U+A980 | U+A9DF |
| MyanmarExtended-B | U+A9E0 | U+A9FF |
| Cham | U+AA00 | U+AA5F |
| MyanmarExtended-A | U+AA60 | U+AA7F |
| TaiViet | U+AA80 | U+AADF |
| MeeteiMayekExtensions | U+AAE0 | U+AAFF |
| EthiopicExtended-A | U+AB00 | U+AB2F |
| LatinExtended-E | U+AB30 | U+AB6F |
| CherokeeSupplement | U+AB70 | U+ABBF |
| MeeteiMayek | U+ABC0 | U+ABFF |
| HangulSyllables | U+AC00 | U+D7AF |
| HangulJamoExtended-B | U+D7B0 | U+D7FF |
| PrivateUseArea | U+E000 | U+F8FF |
| CJKCompatibilityIdeographs | U+F900 | U+FAFF |
| AlphabeticPresentationForms | U+FB00 | U+FB4F |
| ArabicPresentationForms-A | U+FB50 | U+FDFF |
| VariationSelectors | U+FE00 | U+FE0F |
| VerticalForms | U+FE10 | U+FE1F |
| CombiningHalfMarks | U+FE20 | U+FE2F |
| CJKCompatibilityForms | U+FE30 | U+FE4F |
| SmallFormVariants | U+FE50 | U+FE6F |
| ArabicPresentationForms-B | U+FE70 | U+FEFF |
| HalfwidthandFullwidthForms | U+FF00 | U+FFEF |
| Specials | U+FFF0 | U+FFFF |
| LinearBSyllabary | U+00010000 | U+0001007F |
| LinearBIdeograms | U+00010080 | U+000100FF |
| AegeanNumbers | U+00010100 | U+0001013F |
| AncientGreekNumbers | U+00010140 | U+0001018F |
| AncientSymbols | U+00010190 | U+000101CF |
| PhaistosDisc | U+000101D0 | U+000101FF |
| Lycian | U+00010280 | U+0001029F |
| Carian | U+000102A0 | U+000102DF |
| CopticEpactNumbers | U+000102E0 | U+000102FF |
| OldItalic | U+00010300 | U+0001032F |
| Gothic | U+00010330 | U+0001034F |
| OldPermic | U+00010350 | U+0001037F |
| Ugaritic | U+00010380 | U+0001039F |
| OldPersian | U+000103A0 | U+000103DF |
| Deseret | U+00010400 | U+0001044F |
| Shavian | U+00010450 | U+0001047F |
| Osmanya | U+00010480 | U+000104AF |
| Osage | U+000104B0 | U+000104FF |
| Elbasan | U+00010500 | U+0001052F |
| CaucasianAlbanian | U+00010530 | U+0001056F |
| LinearA | U+00010600 | U+0001077F |
| CypriotSyllabary | U+00010800 | U+0001083F |
| ImperialAramaic | U+00010840 | U+0001085F |
| Palmyrene | U+00010860 | U+0001087F |
| Nabataean | U+00010880 | U+000108AF |
| Hatran | U+000108E0 | U+000108FF |
| Phoenician | U+00010900 | U+0001091F |
| Lydian | U+00010920 | U+0001093F |
| MeroiticHieroglyphs | U+00010980 | U+0001099F |
| MeroiticCursive | U+000109A0 | U+000109FF |
| Kharoshthi | U+00010A00 | U+00010A5F |
| OldSouthArabian | U+00010A60 | U+00010A7F |
| OldNorthArabian | U+00010A80 | U+00010A9F |
| Manichaean | U+00010AC0 | U+00010AFF |
| Avestan | U+00010B00 | U+00010B3F |
| InscriptionalParthian | U+00010B40 | U+00010B5F |
| InscriptionalPahlavi | U+00010B60 | U+00010B7F |
| PsalterPahlavi | U+00010B80 | U+00010BAF |
| OldTurkic | U+00010C00 | U+00010C4F |
| OldHungarian | U+00010C80 | U+00010CFF |
| RumiNumeralSymbols | U+00010E60 | U+00010E7F |
| Brahmi | U+00011000 | U+0001107F |
| Kaithi | U+00011080 | U+000110CF |
| SoraSompeng | U+000110D0 | U+000110FF |
| Chakma | U+00011100 | U+0001114F |
| Mahajani | U+00011150 | U+0001117F |
| Sharada | U+00011180 | U+000111DF |
| SinhalaArchaicNumbers | U+000111E0 | U+000111FF |
| Khojki | U+00011200 | U+0001124F |
| Multani | U+00011280 | U+000112AF |
| Khudawadi | U+000112B0 | U+000112FF |
| Grantha | U+00011300 | U+0001137F |
| Newa | U+00011400 | U+0001147F |
| Tirhuta | U+00011480 | U+000114DF |
| Siddham | U+00011580 | U+000115FF |
| Modi | U+00011600 | U+0001165F |
| MongolianSupplement | U+00011660 | U+0001167F |
| Takri | U+00011680 | U+000116CF |
| Ahom | U+00011700 | U+0001173F |
| WarangCiti | U+000118A0 | U+000118FF |
| ZanabazarSquare | U+00011A00 | U+00011A4F |
| Soyombo | U+00011A50 | U+00011AAF |
| PauCinHau | U+00011AC0 | U+00011AFF |
| Bhaiksuki | U+00011C00 | U+00011C6F |
| Marchen | U+00011C70 | U+00011CBF |
| MasaramGondi | U+00011D00 | U+00011D5F |
| Cuneiform | U+00012000 | U+000123FF |
| CuneiformNumbersandPunctuation | U+00012400 | U+0001247F |
| EarlyDynasticCuneiform | U+00012480 | U+0001254F |
| EgyptianHieroglyphs | U+00013000 | U+0001342F |
| AnatolianHieroglyphs | U+00014400 | U+0001467F |
| BamumSupplement | U+00016800 | U+00016A3F |
| Mro | U+00016A40 | U+00016A6F |
| BassaVah | U+00016AD0 | U+00016AFF |
| PahawhHmong | U+00016B00 | U+00016B8F |
| Miao | U+00016F00 | U+00016F9F |
| IdeographicSymbolsandPunctuation | U+00016FE0 | U+00016FFF |
| Tangut | U+00017000 | U+000187FF |
| TangutComponents | U+00018800 | U+00018AFF |
| KanaSupplement | U+0001B000 | U+0001B0FF |
| KanaExtended-A | U+0001B100 | U+0001B12F |
| Nushu | U+0001B170 | U+0001B2FF |
| Duployan | U+0001BC00 | U+0001BC9F |
| ShorthandFormatControls | U+0001BCA0 | U+0001BCAF |
| ByzantineMusicalSymbols | U+0001D000 | U+0001D0FF |
| MusicalSymbols | U+0001D100 | U+0001D1FF |
| AncientGreekMusicalNotation | U+0001D200 | U+0001D24F |
| TaiXuanJingSymbols | U+0001D300 | U+0001D35F |
| CountingRodNumerals | U+0001D360 | U+0001D37F |
| MathematicalAlphanumericSymbols | U+0001D400 | U+0001D7FF |
| SuttonSignWriting | U+0001D800 | U+0001DAAF |
| GlagoliticSupplement | U+0001E000 | U+0001E02F |
| MendeKikakui | U+0001E800 | U+0001E8DF |
| Adlam | U+0001E900 | U+0001E95F |
| ArabicMathematicalAlphabeticSymbols | U+0001EE00 | U+0001EEFF |
| MahjongTiles | U+0001F000 | U+0001F02F |
| DominoTiles | U+0001F030 | U+0001F09F |
| PlayingCards | U+0001F0A0 | U+0001F0FF |
| EnclosedAlphanumericSupplement | U+0001F100 | U+0001F1FF |
| EnclosedIdeographicSupplement | U+0001F200 | U+0001F2FF |
| MiscellaneousSymbolsandPictographs | U+0001F300 | U+0001F5FF |
| Emoticons | U+0001F600 | U+0001F64F |
| OrnamentalDingbats | U+0001F650 | U+0001F67F |
| TransportandMapSymbols | U+0001F680 | U+0001F6FF |
| AlchemicalSymbols | U+0001F700 | U+0001F77F |
| GeometricShapesExtended | U+0001F780 | U+0001F7FF |
| SupplementalArrows-C | U+0001F800 | U+0001F8FF |
| SupplementalSymbolsandPictographs | U+0001F900 | U+0001F9FF |
| CJKUnifiedIdeographsExtensionB | U+00020000 | U+0002A6DF |
| CJKUnifiedIdeographsExtensionC | U+0002A700 | U+0002B73F |
| CJKUnifiedIdeographsExtensionD | U+0002B740 | U+0002B81F |
| CJKUnifiedIdeographsExtensionE | U+0002B820 | U+0002CEAF |
| CJKUnifiedIdeographsExtensionF | U+0002CEB0 | U+0002EBEF |
| CJKCompatibilityIdeographsSupplement | U+0002F800 | U+0002FA1F |
| Tags | U+000E0000 | U+000E007F |
| VariationSelectorsSupplement | U+000E0100 | U+000E01EF |
| SupplementaryPrivateUseArea-A | U+000F0000 | U+000FFFFF |
| SupplementaryPrivateUseArea-B | U+00100000 | U+0010FFFF |
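
To make the inclusive bounds concrete, here is a minimal Python sketch that looks up the block containing a given character. It is only an illustration and not part of the tool: the names `BLOCKS`, `parse_code_point` and `block_of` are hypothetical, and only a handful of rows from the table are copied in.

```python
from typing import Optional

# A few rows copied from the table above: (block name, start, end).
# This sample is deliberately incomplete; it is not the full block list.
BLOCKS = [
    ("BasicLatin", "U+0000", "U+007F"),
    ("Latin-1Supplement", "U+0080", "U+00FF"),
    ("GreekandCoptic", "U+0370", "U+03FF"),
    ("Emoticons", "U+0001F600", "U+0001F64F"),
]


def parse_code_point(text: str) -> int:
    """Convert the table's 'U+XXXX' notation to an integer code point."""
    if not text.startswith("U+"):
        raise ValueError(f"not a code point: {text!r}")
    # Leading zeros (as in 'U+0001F600') are harmless for int().
    return int(text[2:], 16)


def block_of(char: str) -> Optional[str]:
    """Return the name of the sample block containing char, or None."""
    code_point = ord(char)
    for name, start, end in BLOCKS:
        # Both bounds are included, hence <= on each side.
        if parse_code_point(start) <= code_point <= parse_code_point(end):
            return name
    return None


if __name__ == "__main__":
    for sample in ["A", "\u007f", "\u0080", "\u03b1", "\U0001F600", "\u0100"]:
        print(f"U+{ord(sample):04X} -> {block_of(sample)}")
```

Because both bounds are included, U+007F still falls in `BasicLatin` and U+0080 is the first code point of `Latin-1Supplement`. A character whose block is not in the sample list, such as U+0100, yields `None`.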