This class defines the basic structure of the Gobstones Lexer. It is parametrized to allow different variations of the language to be used (both in different natural languages, like English or Spanish, and in the Domain Specific primitives the language handles).

On creation, an input program and the language definition must be provided. Particular versions of the lexer can be created by subclasses.
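As a hedged sketch of how such a lexer is typically driven (the `Token` shape and the `GTokenSource` interface below are illustrative stand-ins for the documented `hasNextGToken`/`nextGToken` surface, not the actual class):

```typescript
// Illustrative only: a minimal interface mirroring the documented Lexer
// surface, plus a generic helper that drains all proper tokens.
interface Token {
    kind: string;
    text: string;
}

interface GTokenSource {
    hasNextGToken(): boolean;
    nextGToken(): Token; // PRECONDITION: hasNextGToken()
}

// Drain every proper token, respecting the documented precondition
// before each call to nextGToken.
function collectGTokens(lexer: GTokenSource): Token[] {
    const tokens: Token[] = [];
    while (lexer.hasNextGToken()) {
        tokens.push(lexer.nextGToken());
    }
    return tokens;
}
```

A parser built on top of the lexer would follow the same pattern, checking `hasNextGToken` before each read to satisfy the precondition.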

Type Parameters

Hierarchy (view full)

Implements

Constructors

Accessors

  • get langWords(): Words
  • The Words used to access the language elements.

    Returns Words

  • get languageMods(): OptionsTable
  • The language options read during the source processing.

    Returns OptionsTable

  • get pendingAttributes(): OptionsTable
  • Gets the attributes read since they were last retrieved, and resets the pending attributes, so that none remain pending.

    Returns OptionsTable
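The "get and reset" semantics can be sketched as follows (a toy stand-in, not the real implementation; `OptionsTableSketch` and `PendingAttributes` are hypothetical names):

```typescript
// Sketch of the documented behavior: reading the pending attributes
// returns everything accumulated so far and leaves the table empty.
type OptionsTableSketch = Record<string, string[]>;

class PendingAttributes {
    private pending: OptionsTableSketch = {};

    // Accumulate a value under a key, appending to any previous values.
    add(key: string, args: string[]): void {
        this.pending[key] = (this.pending[key] ?? []).concat(args);
    }

    // Return all pending attributes and reset, so none remain pending.
    take(): OptionsTableSketch {
        const result = this.pending;
        this.pending = {};
        return result;
    }
}
```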

  • get warnings(): LexerGWarning[]
  • A getter for the warnings produced during the source processing.

    Returns LexerGWarning[]

Methods

  • Indicates if there are more proper tokens to be read.

    Returns boolean

  • Indicates if there are more tokens to be read.

    Returns boolean

  • It returns the next proper token in the source, reading characters from it and skipping them, leaving the source ready to read the next token. If the source has no more characters, it fails.

    PRECONDITION: this.hasNextGToken()

    Returns Token

    Throws

    GErrors.NoMoreInputErrorIn if there is no GToken next

  • It returns the next token in the source, reading characters from it, and skipping them, leaving the source ready to read the next token. If the source has no more characters, it fails.

    A token includes both proper tokens and filler tokens (usually whitespaces, comments, and pragmas). Filler tokens (fillers) are ignored by the parser, but this function provides all, in order to keep all characters from the input grouped into tokens.

    Filler tokens that have some usefulness are properly processed (pragmas are evaluated, and their effects carried on -- whether modifying the lexer behavior or adding attributes --, comments are added as attributes).

    By using this function, other tools that must not ignore filler tokens can be built. The parser should use nextGToken, which skips whitespaces and comments, and processes pragmas.

    PRECONDITION: this.hasNextToken()

    Returns Token

    Throws

    GErrors.NoMoreInputErrorIn if there is no Token next
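The key invariant here, that proper and filler tokens together cover the entire input, can be illustrated with a toy tokenizer (not the real lexer; it only distinguishes whitespace runs from non-whitespace runs):

```typescript
// Toy illustration: when filler tokens are produced alongside proper
// tokens, concatenating the text of every token reconstructs the
// original input exactly, character by character.
type ToyToken = { kind: "word" | "whitespace"; text: string };

function toyTokenize(input: string): ToyToken[] {
    const tokens: ToyToken[] = [];
    let i = 0;
    while (i < input.length) {
        const isSpace = /\s/.test(input[i]);
        let j = i;
        // Take the maximal run of characters of the same class.
        while (j < input.length && /\s/.test(input[j]) === isSpace) j++;
        tokens.push({
            kind: isSpace ? "whitespace" : "word",
            text: input.slice(i, j)
        });
        i = j;
    }
    return tokens;
}
```

A syntax highlighter or pretty-printer is the typical consumer of such a full token stream, since it must reproduce whitespace and comments.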

  • Replaces the input with the given one, resetting the lexer to start reading the input from the beginning.

    Parameters

    • input: SourceInput

      with the new input to process.

      PRECONDITION: there is at least one input document.

    Returns void

    Throws

    GErrors.NoInputError if there are no documents in the input.

Implementation: Auxiliaries -- Pragma processing

  • Private

    Evaluates a language pragma, passing it as an attribute. If the language pragma is not known, it is ignored, and a warning is generated.

    Parameters

    • span: Span
    • value: string

    Returns void

  • Private

    Evaluates the given pragma, carrying its effect.

    Triggers a warning if the pragma is not one of those defined by the language.

    Parameters

    • span: Span
    • name: string
    • args: string[]

    Returns void

  • Private

    Verify that the option selected for the pragma is the same as the one in the WordsDef, failing with an error if that is not the case.

    PRECONDITION:

    • the selectedOption is the same as the baseOption

    Parameters

    • optionPragma: string
    • selectedOption: string
    • baseOption: string
    • span: Span

    Returns void

    Throws

    GErrors.WrongPragmaOptionError when the selected option is not the same as the base one

Implementation: Auxiliaries -- Processing part of tokens

  • Private

    Reads an escape char from the _source if possible, or returns undefined if it is not. Emits a warning if the escape character is incomplete or invalid.

    PRECONDITION:

    • the _source is not at the end of input or the end of a string
    • the first character in the _source is an escape sigil char according to _langWords

    Parameters

    • start: SourcePosition
    • tokenStrs: string[]

    Returns string

  • Private

    Read chars from the _source until either an escape char or a string delimiter sigil is found, or the end of the string is reached.

    PRECONDITION:

    • there is a token String reading in process

    Returns string

  • Private

    Validates the number according to the rules of the language. Emits a warning if the number is not valid.

    Parameters

    • tokenStr: string
    • start: SourcePosition
    • end: SourcePosition

    Returns void

Implementation: Auxiliaries -- Processing words

  • Private

    Read the end of a document in the source. It just returns the EOD token and advances the source.

    PRECONDITION: the source is not at the end of input, and it is at the end of a document

    Returns Token

  • Private

    Read a maximal group of regular chars at the beginning of the source, advancing it, producing the corresponding token. The token produced may be:

    • a keyword
    • a language primitive
    • a regular identifier, either upper or lower depending on the concrete token read and the concrete definition of the language given for the lexer instance.

    PRECONDITIONS:

    • the source is not at the end of input
    • the source starts with a regular char different from a digit char

    Returns Token
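The classification step can be sketched like this (the word sets and category names are made up for illustration; the real lexer takes them from the language definition given at creation):

```typescript
// Sketch: a read word is checked against keywords, then primitives,
// and otherwise becomes an identifier whose kind is decided by the
// case of its first character.
const KEYWORDS = new Set(["program", "procedure", "function"]);
const PRIMITIVES = new Set(["Poner", "Sacar", "Mover"]);

type WordKind = "keyword" | "primitive" | "upperId" | "lowerId";

function classifyWord(word: string): WordKind {
    if (KEYWORDS.has(word)) return "keyword";
    if (PRIMITIVES.has(word)) return "primitive";
    return word[0] === word[0].toUpperCase() ? "upperId" : "lowerId";
}
```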

  • Private

    Read a line comment at the beginning of the source, advancing it, producing the corresponding token. Add the comment to the pending attributes, to offer the parser the possibility to add it to the next token.

    PRECONDITIONS:

    • the source is not at the end of input
    • the source starts with a line comment opener sigil

    Returns Token

  • Private

    Read a maximal group of digit chars at the beginning of the source, advancing it, producing the corresponding token. It also triggers a warning if the digits do not form a proper number token according to the rules.

    PRECONDITIONS:

    • the source is not at the end of input
    • the source starts with a digit char

    Returns Token

  • Private

    Read a paragraph comment at the beginning of the source, advancing it, producing the corresponding token. Add the comment to the pending attributes, to offer the parser the possibility to add it to the next token.

    It also triggers a warning if the comment reaches the end of file without closing.

    PRECONDITIONS:

    • the source is not at the end of input
    • the source starts with a paragraph comment opener sigil

    Returns Token

  • Private

    Read a pragma at the beginning of the source, advancing it, producing the corresponding token.

    Pragmas are evaluated, and their effects carried on -- whether modifying the lexer behavior or adding attributes.

    It also triggers a warning if the pragma is malformed.

    PRECONDITIONS:

    • the source is not at the end of input
    • the source starts with a pragma opener sigil

    Returns Token

  • Private

    Read a string at the beginning of the source, advancing it, producing the corresponding token. Strings are sequences of characters surrounded by the string delimiter char, and they may contain any character different from that delimiter. In order to be able to use the string delimiter char inside the string, strings may contain escaped characters. The sigil marking the occurrence of an escaped char, and the exact list of permitted escaped chars (including the string delimiter), are specified by the GBSWordsDef argument given on Lexer creation.

    Emits a warning if the string reaches the end of file without proper closing. Also emits a warning if there is an incomplete escaped char, or an invalid one.

    PRECONDITIONS:

    • the source is not at the end of input
    • the source starts with a string delimiter char

    Returns Token
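A toy version of this reading logic, with an arbitrary delimiter and escape sigil standing in for the ones the words definition would supply:

```typescript
// Sketch: consume characters until the delimiter closes the string;
// an escape sigil lets the delimiter itself appear inside the string.
// Reaching end of input without a closing delimiter is the case the
// real lexer reports with a warning.
function readToyString(
    src: string,
    delimiter = '"',
    sigil = "\\"
): { text: string; closed: boolean } {
    // PRECONDITION: src starts with the delimiter.
    let text = "";
    let i = 1; // skip the opening delimiter
    while (i < src.length) {
        if (src[i] === sigil && i + 1 < src.length) {
            text += src[i + 1]; // take the escaped char literally
            i += 2;
        } else if (src[i] === delimiter) {
            return { text, closed: true };
        } else {
            text += src[i++];
        }
    }
    return { text, closed: false }; // unterminated string
}
```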

  • Private

    Read a maximal group of punctuation chars at the beginning of the source, advancing it, producing the corresponding token (taking symbolic keywords into consideration).

    PRECONDITIONS:

    • the source is not at the end of input
    • the source starts with a punctuation char, different from a string delimiter char

    Returns Token

  • Private

    Read a maximal group of whitespaces at the beginning of the source, advancing it, and producing the corresponding token.

    PRECONDITIONS:

    • the source is not at the end of input
    • the source starts with a whitespace char

    Returns Token

  • Private

    Read the next word (whitespace, pragma, comment, symbol, number, or identifier), producing a token.

    Filler tokens that have some usefulness are properly processed (pragmas are evaluated, and their effects carried on -- whether modifying the lexer behavior or adding attributes --, comments are added as attributes).

    It triggers warnings for some tokens (pragmas, comments, and numbers) if they are ill-formed.

    PRECONDITION: the source is not at the end of input

    Returns Token
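The dispatch described above can be sketched as a decision on the first character of the source (categories and character classes here are illustrative; the real lexer takes them from _langWords):

```typescript
// Sketch: the next word's category is decided by the first character
// (and a lookahead for comments); each category is then handled by
// its own reader, as in the private methods documented above.
type WordCategory =
    | "whitespace" | "number" | "string"
    | "comment" | "identifier" | "symbol";

function categorize(firstChar: string, nextChar = ""): WordCategory {
    if (/\s/.test(firstChar)) return "whitespace";
    if (/[0-9]/.test(firstChar)) return "number";
    if (firstChar === '"') return "string";
    if (firstChar === "/" && (nextChar === "/" || nextChar === "*")) {
        return "comment";
    }
    if (/[A-Za-z_]/.test(firstChar)) return "identifier";
    return "symbol";
}
```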

Implementation: Auxiliaries -- Side information

  • Private

    Adds a new attribute to the list of pending attributes, appending the new value to any previous ones.

    Parameters

    • key: string
    • args: string[]

    Returns void

  • Private

    A convenient abbreviation to produce warnings.

    Parameters

    Returns void

  • Private

    Adds a new option to the list of language options, replacing the old value if it exists.

    Parameters

    • key: string
    • args: string[]

    Returns void

Implementation: Properties

_langWords: Words

The Words object managing the definition of words for the Gobstones language to use for recognition of characters, symbols, and identifiers.

_languageMods: OptionsTable

The language options read from the source.

_pendingAttributes: OptionsTable

The attributes read from the source since the last time they were consulted.

_source: SourceReader

The [SourceReader](https://gobstones.github.io/gobstones-core/modules/SourceReader.html) to use as source of tokens.

_warnings: LexerGWarning[]

Warnings generated by the reading process.