Hime Grammar Language - Lexical Fragments

This page explains the use of lexical fragments in Hime 2.0.0 and up. The basic features of lexical rules are explained here. A lexical fragment is simply a lexical rule that can be used to build the definition of other terminals but is never matched by the lexer on its own. This is useful when a terminal has a particularly complex definition that is best split into multiple rules, but the individual parts should not be matched themselves. For example, consider:

```
INTEGER -> '0' | [1-9] [0-9]* ;
EXPONENT -> [eE] ('+'|'-')? INTEGER ;
REAL -> INTEGER? '.' INTEGER EXPONENT? | INTEGER EXPONENT ;
```

In this example, the `INTEGER` rule defines how to match an integer in decimal notation. The `EXPONENT` rule then defines how to match an exponent, reusing the definition of `INTEGER`. Finally, the `REAL` rule uses the two previous rules to define how to match a floating-point number.
The `EXPONENT` rule is very useful here because it simplifies the definition of `REAL`. However, as written, it is itself a terminal that the associated lexer can match. Consider the following input:

`x = e+1`

With this input, the lexer would match the `EXPONENT` rule as a terminal on the `e+1` part, which is probably not what was intended. One way to prevent this is to remove the `EXPONENT` rule and replace each use of it with its definition, but this would make the grammar more complex. To keep the definition of the `EXPONENT` rule while preventing it from being matched, one can use the `fragment` keyword as a prefix:

`fragment EXPONENT -> [eE] ('+'|'-')? INTEGER ;`

This means that the `EXPONENT` rule only defines a fragment of a terminal that can be reused in the definition of other terminals; it can never be matched by itself. The lexer will never produce an `EXPONENT` terminal.
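The difference can be illustrated with a small simulation. The sketch below is not Hime's lexer; it is a minimal longest-match tokenizer written in Python, with the three rules from this page transcribed as regular expressions. The `IDENTIFIER` and `OPERATOR` terminals and the whitespace skipping are hypothetical additions, included only so that the input `x = e+1` can be tokenized at all. Declaring `EXPONENT` as a fragment is modeled by leaving it out of the set of matchable terminals while still using it inside `REAL`.

```python
import re

# The rules from this page, transcribed as Python regular expressions.
INTEGER = r"(?:0|[1-9][0-9]*)"
EXPONENT = rf"(?:[eE][+-]?{INTEGER})"
REAL = rf"(?:{INTEGER}?\.{INTEGER}{EXPONENT}?|{INTEGER}{EXPONENT})"

# Hypothetical extra terminals (not part of the original grammar).
EXTRA = {"IDENTIFIER": r"[a-zA-Z_][a-zA-Z0-9_]*", "OPERATOR": r"[=+\-]"}

# EXPONENT declared as a regular terminal: the lexer may produce it.
with_exponent = {"INTEGER": INTEGER, "EXPONENT": EXPONENT, "REAL": REAL, **EXTRA}
# EXPONENT declared as a fragment: still used inside REAL, never produced.
with_fragment = {"INTEGER": INTEGER, "REAL": REAL, **EXTRA}


def tokenize(text, terminals):
    """Longest-match tokenizer over {name: regex}; whitespace is skipped.

    Ties are broken in favor of the first terminal listed, which is an
    assumption of this sketch and not a statement about Hime.
    """
    compiled = [(name, re.compile(rx)) for name, rx in terminals.items()]
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best_name, best_end = None, pos
        for name, rx in compiled:
            m = rx.match(text, pos)
            if m and m.end() > best_end:
                best_name, best_end = name, m.end()
        if best_name is None:
            raise ValueError(f"no terminal matches at position {pos}")
        tokens.append((best_name, text[pos:best_end]))
        pos = best_end
    return tokens


print(tokenize("x = e+1", with_exponent))
# [('IDENTIFIER', 'x'), ('OPERATOR', '='), ('EXPONENT', 'e+1')]
print(tokenize("x = e+1", with_fragment))
# [('IDENTIFIER', 'x'), ('OPERATOR', '='), ('IDENTIFIER', 'e'), ('OPERATOR', '+'), ('INTEGER', '1')]
```

In the second call, `e+1` is split into an identifier, an operator and an integer, while an input such as `1.5e+3` is still recognized as a single `REAL` because the exponent pattern remains part of that terminal's definition.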