This page summarizes the supported Unicode categories. To refer to a category in a lexical rule, use the construct `uc{Name}`, where `Name` is one of the names listed below. The following table lists the supported Unicode character categories:

Name | Meaning | Comment |
---|---|---|
L | Letter | |
Lu | Letter, Uppercase | |
Ll | Letter, Lowercase | |
Lt | Letter, Titlecase | |
Lm | Letter, Modifier | |
Lo | Letter, Other | |
M | Mark | |
Mn | Mark, Non-spacing | |
Mc | Mark, Spacing combining | |
Me | Mark, Enclosing | |
N | Number | |
Nd | Number, Decimal Digit | |
Nl | Number, Letter | Includes Roman numerals |
No | Number, Other | |
P | Punctuation | |
Pc | Punctuation, Connector | Includes the underscore |
Pd | Punctuation, Dash | Includes hyphen characters |
Ps | Punctuation, Open | Opening brackets |
Pe | Punctuation, Close | Closing brackets |
Pi | Punctuation, Initial quote | Opening quotation mark |
Pf | Punctuation, Final quote | Closing quotation mark |
Po | Punctuation, Other | |
S | Symbol | |
Sm | Symbol, Math | |
Sc | Symbol, Currency | |
Sk | Symbol, Modifier | |
So | Symbol, Other | |
Z | Separator | |
Zs | Separator, Space | Includes the ASCII space (U+0020) |
Zl | Separator, Line | Only U+2028 LINE SEPARATOR |
Zp | Separator, Paragraph | Only U+2029 PARAGRAPH SEPARATOR |
C | Other | |
Cc | Other, Control | |
Cf | Other, Format | |
Cs | Other, Surrogate | |
Co | Other, Private Use | |
Cn | Other, Not assigned | |
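
To illustrate what these names denote, the sketch below looks up a character's general category with Python's standard `unicodedata` module, which is independent of the grammar tooling described on this page. The helper `in_category` is a hypothetical name introduced for this example. It assumes that a one-letter name such as `L` covers every two-letter subcategory beginning with that letter, as in the Unicode standard. Results depend on the Unicode version shipped with the Python interpreter.

```python
import unicodedata


def in_category(ch: str, name: str) -> bool:
    """Return True if the single character `ch` falls in category `name`.

    `name` is a one- or two-letter name from the table above, e.g. "L" or "Lu".
    Hypothetical helper, for illustration only.
    """
    # unicodedata.category returns the two-letter general category, e.g. "Lu".
    # A one-letter name matches by prefix, so "L" covers Lu, Ll, Lt, Lm and Lo.
    return unicodedata.category(ch).startswith(name)


if __name__ == "__main__":
    # Print the general category of a few sample characters.
    samples = ["A", "a", "_", "-", " ", "\t", "\u2028", "\u2163"]
    for ch in samples:
        print(f"U+{ord(ch):04X} -> {unicodedata.category(ch)}")
```

For instance, `in_category("_", "Pc")` and `in_category("\u2163", "Nl")` (ROMAN NUMERAL FOUR) both hold, in line with the comments in the table, while the tab character is reported as `Cc` rather than `Zs`.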