Grammar

Internally, aCLImatise uses a Parsing Expression Grammar (PEG), a class of recursive grammar commonly used to parse programming languages. The grammar is both defined and executed using the PyParsing Python library. To help visualise the grammar used to parse command-line help, here is a railroad diagram generated using PyParsing.

The “terminal” nodes (circular) are either:

  • In quotes, e.g. ':', which indicates a literal string

  • In the form W:(start, body), e.g. W:(0-9@-Za-z, \--9@-Z\\_a-z|), which indicates a word where the first character comes from the start list of characters, and the remaining characters come from the body characters

  • In the form Re: pattern, which indicates a regular expression pattern used to match this terminal

  • Whitespace nodes, e.g. <SP><TAB><CR><LF>, which list the types of whitespace being parsed by that terminal

  • Certain other special nodes, such as Empty and LineStart, which match based on custom code. Where possible, these are annotated with what they are designed to match; for example, Unindent matches an unindent in the input file.

The “non-terminal” nodes (square) refer to subsections of the diagram, which are spelled out under the subheading with the same name.
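A non-terminal can be thought of as a named sub-expression that other parts of the grammar refer to, possibly recursively. The sketch below loosely mirrors the shape of the OptionalArg subsection (a word, optionally followed by a bracketed, comma-prefixed OptionalArg), but it is a simplified assumption for illustration and not aCLImatise's actual definition.

```python
import pyparsing as pp

# Illustrative character sets, not taken from aCLImatise
word = pp.Word(pp.alphanums, pp.alphanums + "-_.")

# Forward declares a non-terminal before it is defined, so that the
# definition can refer to itself. setName gives it a readable label.
optional_arg = pp.Forward().setName("OptionalArg")
optional_arg <<= word + pp.Optional(
    pp.Suppress("[") + pp.Suppress(",") + optional_arg + pp.Suppress("]")
)

# Each nested bracket is handled by a recursive use of optional_arg
result = optional_arg.parseString("a[,b[,c]]", parseAll=True)
print(list(result))
```

With parseAll=True, an unbalanced input such as "a[,b" is rejected, because the optional bracketed part fails to match and the leftover text cannot be consumed.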

To read the diagram, start with FlagList, the start node, and from there follow the lines along any branch of the path that goes forward (although some paths end up turning backwards to indicate loops). Any string that matches the sequence of tokens you encounter along that path will be parsed by the grammar.
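To make the idea of following a path concrete, here is a small, hypothetical grammar in the same spirit as the FlagWithArg subsection: one or two dashes, a name, and an optional "=ARG" suffix. The names flag_name and flag_with_arg, and the exact structure, are assumptions for this sketch only. In PyParsing the | operator is an ordered choice, as in a PEG, so "--" is listed before "-".

```python
import pyparsing as pp

# Illustrative character sets, not taken from aCLImatise
name = pp.Word(pp.alphanums, pp.alphanums + "-_.")

# Ordered choice: alternatives are tried left to right, first match wins
dashes = pp.Literal("--") | pp.Literal("-")

# Combine joins the dashes and the name into a single token
flag_name = pp.Combine(dashes + name)

# One path through the diagram: flag name, then optionally '=' and an argument
flag_with_arg = flag_name("name") + pp.Optional(pp.Suppress("=") + name("arg"))

long_form = flag_with_arg.parseString("--output=FILE", parseAll=True)
print(long_form["name"], long_form["arg"])

# A different path: the optional branch is skipped entirely
short_form = flag_with_arg.parseString("-v", parseAll=True)
print(list(short_form))
```

Each input corresponds to one route through the sketch's diagram: "--output=FILE" takes the branch containing '=', while "-v" bypasses it.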

FlagList

':' Suppress Unnamed 3 IndentCheckpoint LineStart <SP><TAB><CR><LF> Suppress Unnamed 3 IndentCheckpoint LineStart Suppress flag Unnamed 4

Unnamed 3

Unnamed 2 Pop Unnamed 2 IndentCheckpoint Unnamed 4 IndentCheckpoint Pop

Unnamed 2

Indent flag W:(0-9@-Za-z, --.0-9@-Z_a-z){2,...} <SP><TAB><CR><LF> Suppress LineEnd DescriptionLine

flag

FlagWithArg <SP><TAB><CR><LF> <SP><TAB><CR><LF> (,/|) <SP><TAB><CR><LF> Suppress FlagWithArg LineEnd DescriptionLine

FlagWithArg

'-' '-' W:(0-9@-Za-z, --.0-9@-Z_a-z) '=' ' ' Suppress W:(0-9@-Za-z, --.0-9@-Z_a-z|) RepeatedSegment W:(0-9@-Za-z, --.0-9@-Z_a-z|) '[' Suppress RepeatedSegment ']' Suppress ChoiceArg OptionalArg W:(0-9@-Za-z, --.0-9@-Z_a-z) '<' Suppress W:(0-9@-Za-z, --9@-Z\_a-z|) '>' Suppress

RepeatedSegment

W:(0-9@-Za-z, --.0-9@-Z_a-z|) '.' '.' '.' Suppress W:(0-9@-Za-z, --.0-9@-Z_a-z|)

ChoiceArg

'{' Suppress Re:('"(?:[^"\n\r\\]|(?:"")|(?:\\(?:[^x]|x[0-9a-fA-F]+)))*') '"' Re:("'(?:[^'\n\r\\]|(?:'')|(?:\\(?:[^x]|x[0-9a-fA-F]+)))*") "'" quotedString using single or double quotes ChoiceArg W:(0-9@-Za-z, --.0-9@-Z_a-z) ',' Suppress W:(0-9@-Za-z, --.0-9@-Z_a-z) '}' Suppress

OptionalArg

W:(0-9@-Za-z, --.0-9@-Z_a-z|) '[' ',' OptionalArg W:(0-9@-Za-z, --.0-9@-Z_a-z|) ']'

Unnamed 4

Indent Empty Peer LineEnd DescriptionLine Empty Unindent