General Structure

The Hime Parser Generator provides its own language for expressing context-free grammars. The language closely resembles standard BNF (Backus-Naur Form), with some enhancements for expressivity. The annotated example below walks through the general structure of a grammar.

// Single-line comments begin with a double slash
/*
 * This is a multiline comment
 */
grammar MathExp
{
    // The 'options' section specifies various compilation options for the grammar.
    // This section must be present, but it may be empty.
    // For more information about this section, see: options reference
    options
    {
        // The Axiom option specifies the top rule for the grammar.
        Axiom = "exp";
        // The Separator option specifies the separator terminal (usually white space).
        Separator = "SEPARATOR";
    }
    // The 'terminals' section specifies the lexical rules for terminals.
    // This section is optional.
    // For more information about this section, see: terminals reference
    terminals
    {
        // This is a lexical rule that defines the WHITE_SPACE terminal.
        // By convention, terminal names are generally UPPER_CASE.
        // U+XXXX denotes the Unicode character with hexadecimal code point XXXX.
        WHITE_SPACE -> U+0020 | U+0009 | U+000B | U+000C ;
        // This lexical rule reuses the previous definition of WHITE_SPACE.
        // Note that this rule defines the SEPARATOR terminal referred to in the Separator option above.
        // A terminal must be defined here before being used.
        SEPARATOR   -> WHITE_SPACE+;
        // This set of three lexical rules defines the NUMBER terminal.
        // Their order of appearance is significant, since each rule builds on the ones defined before it.
        INTEGER     -> [1-9] [0-9]* | '0' ;
        // Now we can use INTEGER for the definition of REAL.
        REAL        -> INTEGER? '.' INTEGER (('e' | 'E') ('+' | '-')? INTEGER)?
                    |  INTEGER ('e' | 'E') ('+' | '-')? INTEGER ;
        // Now we can use both INTEGER and REAL for the definition of NUMBER.
        NUMBER      -> INTEGER | REAL ;
    }

    // The 'rules' section specifies the syntactic rules for variables.
    // This section must be present, but it may be empty.
    // For more information about this section, see: rules reference
    rules
    {
        // This is a syntactic rule that defines the exp_atom variable.
        // By convention, variable names are generally snake_case.
        // Note that the rule's definition refers to the NUMBER terminal.
        exp_atom   -> NUMBER
                   |  '(' exp ')' ;
        // The order of the syntactic rules is not significant.
        exp_factor -> exp_atom
                   |  exp_factor '*' exp_atom
                   |  exp_factor '/' exp_atom ;
        exp_term   -> exp_factor
                   |  exp_term '+' exp_factor
                   |  exp_term '-' exp_factor ;
        exp        -> exp_term ;
    }
}
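
To make the meaning of the MathExp grammar more concrete, here is a minimal hand-written Python sketch that mirrors it. This is only an illustration: it is not code generated by Hime and does not use any Hime API, and the names `tokenize` and `evaluate` are hypothetical. The regular expressions transcribe the INTEGER, REAL and NUMBER lexical rules, and each nested function corresponds to one syntactic rule. The left-recursive alternatives of exp_factor and exp_term cannot be followed literally by a recursive-descent function, so the sketch replaces them with loops that yield the same left-to-right grouping.

```python
import re

# Lexical rules, transcribed from the 'terminals' section.
INTEGER = r"(?:[1-9][0-9]*|0)"
REAL = rf"(?:{INTEGER}?\.{INTEGER}(?:[eE][+-]?{INTEGER})?|{INTEGER}[eE][+-]?{INTEGER})"
# Python alternation is ordered, so REAL is tried first to prefer the longer match.
NUMBER = rf"(?:{REAL}|{INTEGER})"
# SEPARATOR is WHITE_SPACE+ (U+0020, U+0009, U+000B, U+000C).
TOKEN = re.compile(
    rf"(?P<SEPARATOR>[ \t\x0b\x0c]+)|(?P<NUMBER>{NUMBER})|(?P<OP>[-+*/()])"
)


def tokenize(text):
    """Split text into (kind, value) pairs, dropping SEPARATOR matches."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SEPARATOR":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def evaluate(text):
    """Evaluate an expression following the exp rules (hypothetical helper)."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos][1] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        return token

    def exp_atom():
        # exp_atom -> NUMBER | '(' exp ')'
        kind, value = advance()
        if kind == "NUMBER":
            return float(value)
        if value == "(":
            result = exp()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return result
        raise SyntaxError(f"unexpected token {value!r}")

    def exp_factor():
        # exp_factor -> exp_atom | exp_factor ('*' | '/') exp_atom, as a loop
        left = exp_atom()
        while peek() in ("*", "/"):
            op = advance()[1]
            right = exp_atom()
            left = left * right if op == "*" else left / right
        return left

    def exp_term():
        # exp_term -> exp_factor | exp_term ('+' | '-') exp_factor, as a loop
        left = exp_factor()
        while peek() in ("+", "-"):
            op = advance()[1]
            right = exp_factor()
            left = left + right if op == "+" else left - right
        return left

    def exp():
        # exp -> exp_term
        return exp_term()

    result = exp()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos][1]!r}")
    return result
```

With this sketch, `evaluate("2 + 3 * 4")` returns `14.0` and `evaluate("8 - 3 - 2")` returns `3.0`. Multiplication binds tighter than addition because exp_term is built from exp_factor, and operators of equal precedence group from the left because the recursive reference appears on the left-hand side of each rule.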