Module parser

Module implementing the SurrealQL parser.

The SurrealQL parser is a relatively simple recursive descent parser. Most of the parser's functions peek a token from the lexer and then decide which path to take depending on which token is next.
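As a minimal illustration of this peek-then-branch style (a toy sketch, not the actual SurrealQL token or parser types), consider a parser for simple sums:

```rust
// Toy recursive descent parser: peek the next token and branch on it.
// The names here are illustrative only; they do not match this module's API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Number(i64),
    Plus,
    Eof,
}

struct Parser {
    tokens: Vec<Token>,
    pos: usize,
}

impl Parser {
    fn peek(&self) -> Token {
        *self.tokens.get(self.pos).unwrap_or(&Token::Eof)
    }

    fn next(&mut self) -> Token {
        let t = self.peek();
        self.pos += 1;
        t
    }

    // expr := number ( '+' number )*
    fn parse_expr(&mut self) -> Result<i64, String> {
        let mut value = self.parse_number()?;
        // Peek the next token to decide whether to keep descending.
        while self.peek() == Token::Plus {
            self.next(); // consume '+'
            value += self.parse_number()?;
        }
        Ok(value)
    }

    fn parse_number(&mut self) -> Result<i64, String> {
        match self.next() {
            Token::Number(n) => Ok(n),
            t => Err(format!("expected a number, found {:?}", t)),
        }
    }
}

fn main() {
    let mut p = Parser {
        tokens: vec![Token::Number(1), Token::Plus, Token::Number(2)],
        pos: 0,
    };
    assert_eq!(p.parse_expr(), Ok(3));
}
```

The real parser follows the same shape, with a much larger token set and the lexer producing tokens on demand.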

§Implementation Details

There are a number of common patterns for which this module provides convenience functions.

  • Whenever only one token can be next, use the expected! macro. This macro ensures that the given token kind is next and, if it is not, returns a parser error.
  • Whenever a limited set of tokens can be next, it is common to match on the token kind and have a catch-all arm which calls the unexpected! macro. This macro raises a parse error with information about the token it received and what was expected.
  • If a single token can optionally be next, use Parser::eat. This function returns a bool indicating whether the given token kind was eaten.
  • If a closing delimiter is expected, use Parser::expect_closing_delimiter. This function raises an error if the expected delimiter isn't the next token; the error also points to the opening delimiter the parser expected to be closed.

§Far Token Peek

Occasionally the parser needs to look further ahead than peeking allows. This is done with the Parser::peek1 function, which peeks one token further than peek.
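For example (again a toy sketch with hypothetical token names), two tokens of lookahead can disambiguate productions that start with the same token:

```rust
// Illustrative two-token lookahead; not this module's actual types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Ident,
    Equals,
    OpenParen,
    Eof,
}

struct Parser {
    toks: Vec<Tok>,
    pos: usize,
}

impl Parser {
    fn peek(&self) -> Tok {
        *self.toks.get(self.pos).unwrap_or(&Tok::Eof)
    }

    // Like Parser::peek1: look one token further than peek, without consuming.
    fn peek1(&self) -> Tok {
        *self.toks.get(self.pos + 1).unwrap_or(&Tok::Eof)
    }

    // Both productions start with an identifier; only the second token
    // tells them apart.
    fn classify(&self) -> &'static str {
        match (self.peek(), self.peek1()) {
            (Tok::Ident, Tok::Equals) => "assignment",
            (Tok::Ident, Tok::OpenParen) => "call",
            _ => "other",
        }
    }
}

fn main() {
    let call = Parser { toks: vec![Tok::Ident, Tok::OpenParen], pos: 0 };
    assert_eq!(call.classify(), "call");
}
```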

§WhiteSpace Tokens

The lexer produces whitespace tokens. These are normally ignored in most places in the syntax, as they have no bearing on the meaning of a statement, and Parser::next and Parser::peek automatically skip over them. However, in some places, like in a record-id and when gluing tokens, these whitespace tokens are required for correct parsing. In those cases Parser::next_whitespace and the other functions suffixed with _whitespace are used; these functions do not skip whitespace tokens. They also cannot undo the skipping of whitespace tokens that has already happened, so implementers must be careful not to call a function which requires whitespace tokens when those tokens may already have been skipped.
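The distinction can be sketched as follows (a hypothetical miniature, not this module's actual implementation):

```rust
// Toy sketch: `next` skips whitespace tokens, `next_whitespace` does not.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Word,
    Colon,
    Whitespace,
    Eof,
}

struct Parser {
    toks: Vec<Tok>,
    pos: usize,
}

impl Parser {
    // Like Parser::next_whitespace: return every token, including whitespace.
    fn next_whitespace(&mut self) -> Tok {
        let t = *self.toks.get(self.pos).unwrap_or(&Tok::Eof);
        if self.pos < self.toks.len() {
            self.pos += 1;
        }
        t
    }

    // Like Parser::next: skip whitespace tokens automatically. Note that the
    // skipped tokens are consumed for good; a later whitespace-sensitive call
    // cannot get them back.
    fn next(&mut self) -> Tok {
        loop {
            let t = self.next_whitespace();
            if t != Tok::Whitespace {
                return t;
            }
        }
    }
}

fn main() {
    let mut skipping = Parser { toks: vec![Tok::Whitespace, Tok::Word, Tok::Colon], pos: 0 };
    assert_eq!(skipping.next(), Tok::Word); // whitespace silently consumed

    let mut sensitive = Parser { toks: vec![Tok::Whitespace, Tok::Word], pos: 0 };
    assert_eq!(sensitive.next_whitespace(), Tok::Whitespace); // whitespace visible
}
```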

§Compound Tokens and Token Gluing

SurrealQL has a number of tokens with complex rules for when they are allowed and for the value they contain. Such tokens are called compound tokens; examples include a JavaScript body, strand-like tokens, regexes, and numbers.

These tokens need to be manually requested from the lexer with the Lexer::lex_compound function.

This manual requesting of tokens leads to a problem when used in conjunction with peeking. Take for instance the production { "foo": "bar"}. "foo" is a compound token, so when it is initially encountered the lexer only returns a " token, which then needs to be collected into the full strand token. However, the parser needs to figure out whether it is parsing an object or a block, so it needs to look past the compound token to see if the next token is :. This is where gluing comes in. Calling Parser::glue checks if the next token could start a compound token and, if so, combines the pieces into a single token. This can only be done in places where we know that encountering the leading token of a compound token will result in the ‘default’ compound token.
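The object-versus-block example can be sketched like this (a hypothetical miniature of gluing; the real Parser::glue and Lexer::lex_compound are more involved):

```rust
// Toy sketch: glue the pieces of a strand (`"` + text + `"`) into one
// token so the parser can peek past it to the `:` that follows.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    DoubleQuote,
    Text(String),
    Strand(String),
    Colon,
    Eof,
}

struct Parser {
    toks: Vec<Tok>,
    pos: usize,
}

impl Parser {
    fn peek(&self) -> Tok {
        self.toks.get(self.pos).cloned().unwrap_or(Tok::Eof)
    }

    fn peek1(&self) -> Tok {
        self.toks.get(self.pos + 1).cloned().unwrap_or(Tok::Eof)
    }

    // Like Parser::glue: if the next token could start a compound token,
    // combine the pieces in place into a single token.
    fn glue(&mut self) {
        if self.peek() == Tok::DoubleQuote {
            // Assume the lexer produced: `"` Text `"`.
            if let (Some(Tok::Text(s)), Some(Tok::DoubleQuote)) = (
                self.toks.get(self.pos + 1).cloned(),
                self.toks.get(self.pos + 2).cloned(),
            ) {
                self.toks.splice(self.pos..self.pos + 3, [Tok::Strand(s)]);
            }
        }
    }
}

fn main() {
    // The start of `{ "foo": ... }` after `{` has been consumed.
    let mut p = Parser {
        toks: vec![
            Tok::DoubleQuote,
            Tok::Text("foo".into()),
            Tok::DoubleQuote,
            Tok::Colon,
        ],
        pos: 0,
    };
    p.glue();
    // After gluing, one extra peek reaches the `:`, so this is an object.
    assert_eq!(p.peek(), Tok::Strand("foo".into()));
    assert_eq!(p.peek1(), Tok::Colon);
}
```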

Structs§

Parser
The SurrealQL parser.
ParserSettings
StatementStream
A struct which can parse queries statement by statement.

Enums§

GluedValue
PartialResult
A result of trying to parse a possibly partial query.

Type Aliases§

ParseResult
The result returned by most parser functions.