A new Gobstones lexer for a given input.
GErrors.NoInputError if there are no documents in the input.
The language options read during the source processing.
Gets the attributes read since they were last retrieved, and clears the pending ones, so that none remain pending.
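The get-and-reset behavior described above can be sketched as follows. This is only an illustration: the class `PendingAttributes`, its methods `add` and `take`, and the `Attributes` shape are hypothetical names, not part of the lexer's documented API.

```typescript
// Hypothetical names throughout; this is not the library's actual code.
type Attributes = Record<string, string[]>;

class PendingAttributes {
    private pending: Attributes = {};

    // Record a value under a key (for instance, a comment read by a lexer).
    add(key: string, value: string): void {
        const values = this.pending[key] ?? [];
        values.push(value);
        this.pending[key] = values;
    }

    // Return everything accumulated so far and start over with none pending.
    take(): Attributes {
        const taken = this.pending;
        this.pending = {};
        return taken;
    }
}
```

A second call to `take()` right after the first returns an empty object, since nothing has been read in between.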
A getter for the warnings produced during the source processing.
It returns the next proper token in the source, reading characters from it, and skipping them in the source, leaving the source ready to read the next token. If the source has no more characters, it fails.
PRECONDITION: this.hasNextGToken()
GErrors.NoMoreInputErrorIn if there is no next GToken
It returns the next token in the source, reading characters from it, and skipping them, leaving the source ready to read the next token. If the source has no more characters, it fails.
A token may be either a proper token or a filler token (usually whitespace, comments, and pragmas). Filler tokens (fillers) are ignored by the parser, but this function returns all of them, so that every character from the input is grouped into some token.
Filler tokens that have some usefulness are properly processed (pragmas are evaluated, and their effects carried out -- whether modifying the lexer behavior or adding attributes --, comments are added as attributes).
Using this function, tools that must not ignore filler tokens can be built. The parser should use nextGToken, which skips whitespace and comments, and processes pragmas.
PRECONDITION: this.hasNextToken()
GErrors.NoMoreInputErrorIn if there is no next Token
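The difference between reading every token and reading only proper tokens can be illustrated with a toy lexer. This is a sketch under assumptions: `ToyLexer`, its token kinds, and the `//` comment syntax are hypothetical, and the method names merely mirror the ones documented above; it does not reproduce the real implementation, pragmas, or attributes.

```typescript
// A toy sketch with hypothetical names; not the Gobstones lexer's real code.
type Kind = 'whitespace' | 'comment' | 'word';
interface Tok { kind: Kind; text: string; }

class ToyLexer {
    private readonly input: string;
    private pos = 0;

    constructor(input: string) {
        this.input = input;
    }

    hasNextToken(): boolean {
        return this.pos < this.input.length;
    }

    // Every character of the input ends up in exactly one token.
    nextToken(): Tok {
        if (!this.hasNextToken()) throw new Error('no more input');
        const rest = this.input.slice(this.pos);
        const rules: Array<[Kind, RegExp]> = [
            ['whitespace', /^\s+/],
            ['comment', /^\/\/[^\n]*/],
            ['word', /^\S+/],
        ];
        for (const [kind, pattern] of rules) {
            const match = pattern.exec(rest);
            if (match !== null) {
                this.pos += match[0].length;
                return { kind, text: match[0] };
            }
        }
        throw new Error('unreachable: some rule always matches');
    }

    // Looks ahead for a proper token without consuming anything.
    hasNextGToken(): boolean {
        const saved = this.pos;
        try {
            while (this.hasNextToken()) {
                if (this.nextToken().kind === 'word') return true;
            }
            return false;
        } finally {
            this.pos = saved;
        }
    }

    // Skips fillers and returns the next proper token.
    nextGToken(): Tok {
        while (this.hasNextToken()) {
            const tok = this.nextToken();
            if (tok.kind === 'word') return tok;
        }
        throw new Error('no more proper tokens');
    }
}
```

In this sketch, concatenating the texts returned by repeated `nextToken()` calls reproduces the input exactly, while repeated `nextGToken()` calls yield only the words.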
Replaces the input with the given one, and resets the lexer so that it starts reading at the beginning of the new input.
with the new input to process.
PRECONDITION: there is at least one input document.
GErrors.NoInputError if there are no documents in the input.
Private
_evaluatePrivate
Evaluates a language pragma, passing it as an attribute. If the language pragma is not known, it is ignored, and a warning is generated.
Private
_evaluatePrivate
Evaluates the given pragma, carrying out its effect.
Triggers a warning if the pragma is not one of those defined by the language.
Private
_verifyPrivate
Verify that the option selected for the pragma is the same as the one in the WordsDef, failing with an error if that is not the case.
PRECONDITION: selectedOption is the same as the baseOption
GErrors.WrongPragmaOptionError when the selected option is not the same as the base one
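A minimal sketch of such a check is shown next. The class `WrongPragmaOptionError` defined here is a local stand-in for the error named above, and `verifyOption`, together with the use of strict string equality for "the same", is an assumption made for illustration.

```typescript
// Hypothetical stand-in for GErrors.WrongPragmaOptionError.
class WrongPragmaOptionError extends Error {
    constructor(selected: string, base: string) {
        super(`Pragma option "${selected}" does not match "${base}"`);
        this.name = 'WrongPragmaOptionError';
    }
}

// Fails unless the selected option equals the base one.
// Comparing with === is an assumption about what "the same" means.
function verifyOption(selectedOption: string, baseOption: string): void {
    if (selectedOption !== baseOption) {
        throw new WrongPragmaOptionError(selectedOption, baseOption);
    }
}
```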
Private
_readPrivate
Read an escape char from the _source if it is possible, or return undefined if it is not.
Emits a warning if the escape character is incomplete or invalid.
PRECONDITION:
_source is not at the end of input or the end of a string
_source is an escape sigil char according to _langWords
Private
_readPrivate
_validatePrivate
_readEODPrivate
_readPrivate
Read a maximal group of regular chars at the beginning of the source, advancing it, producing the corresponding token. The token produced may be:
PRECONDITIONS:
Private
_readPrivate
Read a line comment at the beginning of the source, advancing it, producing the corresponding token. Add the comment to the pending attributes, to offer the parser the possibility to add it to the next token.
PRECONDITIONS:
Private
_readPrivate
Read a maximal group of digit chars at the beginning of the source, advancing it, producing the corresponding token. It also triggers a warning if the digits do not form a proper number token according to the rules.
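Reading a maximal group of digits can be sketched as below. The function `readDigits` and its result shape are hypothetical, and the leading-zero rule is an assumption chosen only to show where a warning would be produced; the actual rules for a proper number are not stated here.

```typescript
// Hypothetical helper; the real lexer's rules for numbers may differ.
interface NumberRead {
    text: string;       // the maximal run of digits
    next: number;       // position just after the run
    warnings: string[]; // problems found, if any
}

function isDigit(ch: string): boolean {
    return ch.length === 1 && ch >= '0' && ch <= '9';
}

function readDigits(source: string, start: number): NumberRead {
    // Advance while the current char is a digit (maximal group).
    let end = start;
    while (end < source.length && isDigit(source.charAt(end))) {
        end++;
    }
    const text = source.slice(start, end);
    const warnings: string[] = [];
    // Assumed rule, for illustration only: multi-digit numbers have no leading zero.
    if (text.length > 1 && text.charAt(0) === '0') {
        warnings.push(`"${text}" has leading zeros`);
    }
    return { text, next: end, warnings };
}
```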
PRECONDITIONS:
Private
_readPrivate
Read a paragraph comment at the beginning of the source, advancing it, producing the corresponding token. Add the comment to the pending attributes, to offer the parser the possibility to add it to the next token.
It also triggers a warning if the comment reaches the end of file without closing.
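The handling of an unclosed paragraph comment can be sketched as follows. The delimiters `/*` and `*/`, the function `readParagraphComment`, and its result shape are assumptions for illustration; the sketch also leaves out adding the comment to the pending attributes.

```typescript
// Hypothetical delimiters and names; the real ones come from the language definition.
const OPEN = '/*';
const CLOSE = '*/';

interface CommentRead { text: string; next: number; warnings: string[]; }

// Assumes the source has OPEN at position start.
function readParagraphComment(source: string, start: number): CommentRead {
    // Search for the closer only after the opener, so "/*/" is not taken as closed.
    const closeAt = source.indexOf(CLOSE, start + OPEN.length);
    if (closeAt === -1) {
        // The comment swallows the rest of the input and a warning is recorded.
        return {
            text: source.slice(start),
            next: source.length,
            warnings: ['comment reaches end of input without closing'],
        };
    }
    const next = closeAt + CLOSE.length;
    return { text: source.slice(start, next), next, warnings: [] };
}
```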
PRECONDITIONS:
Private
_readPrivate
Read a pragma at the beginning of the source, advancing it, producing the corresponding token.
Pragmas are evaluated, and their effects carried out -- whether modifying the lexer behavior or adding attributes.
It also triggers a warning if the pragma is malformed.
PRECONDITIONS:
Private
_readPrivate
Read a string at the beginning of the source, advancing it, producing the corresponding token. Strings are sequences of characters surrounded by the string delimiter char, and they may contain any character different from that delimiter. In order to be able to use the string delimiter char inside the string, strings may contain escaped characters. The sigil that marks the occurrence of an escaped char, and the exact list of permitted escaped chars (including the string delimiter), are specified by the GBSWordsDef argument given on Lexer creation.
Emits a warning if the string reaches the end of file without proper closing. Also emits a warning if there is an incomplete escaped char, or an invalid one.
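The string-reading rules above, including the three warning cases, can be sketched like this. The delimiter, the escape sigil, the table of permitted escapes, and the function `readString` are all hypothetical choices; in the lexer these values are taken from the words definition, whose actual contents are not shown here.

```typescript
// Hypothetical configuration; in the lexer these come from the words definition.
const DELIMITER = '"';
const ESCAPE_SIGIL = '\\';
const ESCAPES = new Map<string, string>([
    ['n', '\n'],
    ['t', '\t'],
    ['"', '"'],
    ['\\', '\\'],
]);

interface StringRead { value: string; next: number; warnings: string[]; }

// Assumes the char at position start is the opening delimiter.
function readString(source: string, start: number): StringRead {
    const warnings: string[] = [];
    let value = '';
    let i = start + 1;
    while (i < source.length) {
        const ch = source.charAt(i);
        if (ch === DELIMITER) {
            return { value, next: i + 1, warnings };
        }
        if (ch === ESCAPE_SIGIL) {
            const code = source.charAt(i + 1);
            if (code === '') {
                // The sigil is the last char of the input.
                warnings.push('incomplete escaped char');
                i += 1;
            } else {
                const escaped = ESCAPES.get(code);
                if (escaped === undefined) {
                    warnings.push(`invalid escaped char: ${ESCAPE_SIGIL}${code}`);
                } else {
                    value += escaped;
                }
                i += 2;
            }
            continue;
        }
        value += ch;
        i += 1;
    }
    warnings.push('string reaches end of input without closing');
    return { value, next: i, warnings };
}
```

In this sketch an invalid escape is dropped from the resulting value; whether the real lexer drops it or keeps it verbatim is not specified above.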
PRECONDITIONS:
Private
_readPrivate
Read a maximal group of punctuation chars at the beginning of the source, advancing it, producing the corresponding token (taking symbolic keywords into consideration).
PRECONDITIONS:
Private
_readPrivate
_readPrivate
Read the next word (whitespace, pragma, comment, symbol, number, or identifier), producing a token.
Filler tokens that have some usefulness are properly processed (pragmas are evaluated, and their effects carried out -- whether modifying the lexer behavior or adding attributes --, comments are added as attributes).
It triggers warnings for some tokens (pragmas, comments, and numbers) if they are ill-formed.
PRECONDITION: the source is not at the end of input
Private
_addPrivate
_emitPrivate
A convenient abbreviation to produce warnings.
Private
_replacePrivate
_lang
The Words object managing the definition of words for the Gobstones language to use for recognition of characters, symbols, and identifiers.
Private
_language
The language options read from the source.
Private
_pending
The attributes read from the source since the last time they were consulted.
Private
_source
The [SourceReader](https://gobstones.github.io/gobstones-core/modules/SourceReader.html) to use as source of tokens.
Private
_warnings
Warnings generated by the reading process.
This class defines the basic structure of the Gobstones Lexer. It is parametrized, to allow different variations of the language to be used (both in different natural languages, like English or Spanish, and in the domain-specific primitives the language handles).
On creation, an input program and the language definition must be provided. Particular versions of the lexer can be created by subclasses.