Gobstones words are the basic groups of characters that make up a Gobstones program, and are detected by the Gobstones lexer to build tokens. Characters are divided into three groups: whitespaces, symbol chars, and regular chars; groups of those characters then define special identifiers and keywords. This interface defines the basic structure used to specify a particular list of Gobstones concrete words, by stating which character belongs to which group, and which are the identifiers and keywords. These concrete words shape the precise form of a Gobstones program. Relating them with abstract words allows the internationalization of the Gobstones Language.

Characters and identifiers are organized according to the function they have in the language, thus allowing easy identification of their roles. In addition to the basic groups, the interface is generic, accepting as parameter an interface for domain specific primitives. The parameter has to be an extension of DomainSpecificPrimitives, to organize those primitives according to Gobstones language categories. An additional specification of the behavior of those domain specific primitives must also be provided for full customization of the language.

Providing the Gobstones lexer with a particular locale that conforms to this interface allows it to parse Gobstones programs in different languages (like Spanish, English, or Portuguese). Consult the documentation of the Gobstones lexer to relate word definitions with actual Tokens.
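
As an illustration of the generic parameter and of how two locales can share one set of abstract words, here is a minimal sketch. Everything in it is an assumption made for the example: `DomainPrimsSketch`, `WordsDefSketch`, `BoardPrims`, `concreteOperation`, the abstract names `Drop` and `Move`, and the domain name `Board` are hypothetical stand-ins, far smaller than the real `DomainSpecificPrimitives` and `WordsDef` interfaces.

```typescript
// Hypothetical, simplified stand-in for the real DomainSpecificPrimitives
// interface: each group maps an abstract word to its concrete spelling.
interface DomainPrimsSketch {
    LanguageDomain: string;
    Types: Record<string, string>;
    Operations: Record<string, string>;
}

// Reduced version of WordsDef keeping only two fields, with the constraint
// on the type parameter written out explicitly.
interface WordsDefSketch<DSPrims extends DomainPrimsSketch> {
    LanguageLocale: string;
    DomainSpecificPrimitives: DSPrims;
}

// A hypothetical board domain that fixes which abstract operations exist.
interface BoardPrims extends DomainPrimsSketch {
    Operations: { Drop: string; Move: string };
}

// Two locales for the same domain: same abstract words, different concrete ones.
const spanish: WordsDefSketch<BoardPrims> = {
    LanguageLocale: 'es',
    DomainSpecificPrimitives: {
        LanguageDomain: 'Board',
        Types: { Color: 'Color' },
        Operations: { Drop: 'Poner', Move: 'Mover' },
    },
};

const english: WordsDefSketch<BoardPrims> = {
    LanguageLocale: 'en',
    DomainSpecificPrimitives: {
        LanguageDomain: 'Board',
        Types: { Color: 'Color' },
        Operations: { Drop: 'Drop', Move: 'Move' },
    },
};

// Looks up the concrete word that a locale uses for an abstract operation.
function concreteOperation(
    words: WordsDefSketch<DomainPrimsSketch>,
    abstractName: string,
): string | undefined {
    return words.DomainSpecificPrimitives.Operations[abstractName];
}
```

With these definitions, `concreteOperation(spanish, 'Drop')` and `concreteOperation(english, 'Drop')` resolve the same abstract word to each locale's own spelling.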

Different fields organize the information into different groups according to their role in the language. There is a field to identify the particular LanguageLocale, and groups for Structural Elements, for Language Primitives and for Domain Specific Primitives. The purpose and restrictions of each group are the following.

  • LanguageLocale, together with the field LanguageVariant of the language primitives -- see LangPrimitives --, and the field LanguageDomain of the domain parameter -- see DomainSpecificPrimitives --, allows a particular program to identify, via language pragmas, the specific locale of the specific Gobstones implementation that it was built for.

  • Structural Elements are those that are used to provide the basic structure of the language, independently of the particular language types and operations. They are Symbol chars (basic and extra), Pragma elements, Muncher sigils, Symbolic keywords, Keywords, and Special regular chars (basic and extra), following the interfaces SymbolChars, PragmaElements, MuncherSigils, SymbolicKeywords, Keywords, and BasicSpecialRegularChars and SpecialRegularChars, respectively.

    • Symbol chars are characters used by the Gobstones lexer to separate and organize words in a Gobstones program, when they do not appear inside a muncher -- see MuncherSigils documentation for an explanation of munchers. All characters that are not specified as a symbol are recognized as regular characters. All maximal groups of contiguous regular characters not appearing inside a muncher are classified as either numbers or identifiers.

      Symbol chars are used for different purposes -- see SymbolChars documentation. Some symbols are built-in, as they are used inside symbolic keywords, but extra symbols may be specified to modify the behavior of the Gobstones lexer, for example to change the line counting, to ignore chars, or to exclude them from the group of regular chars. Additional symbol chars have some restrictions:

      • The string LineSeparatorChars cannot contain any of the whitespace or symbol chars, the underscore '_', nor any basic regular character.
      • The string WhitespaceChars cannot contain any of the line separator or symbol chars, the underscore '_', nor any basic regular character.
      • The string PunctuationChars cannot contain any of the line separator or whitespace chars, the underscore '_', nor any basic regular character. The chars added here are excluded from identifiers and cannot appear in programs without modifications to the Gobstones lexer.

    • Pragmas are particular directives to the Gobstones lexer or other Gobstones language components to modify their behavior. Pragma elements establish the ways to identify pragmas in a Gobstones program. See PragmaElements documentation.

    • Muncher sigils are groups of symbol chars that indicate the start or end of parts of the code that allow the occurrence of characters following no particular structure (such as strings or comments) -- thus the name "munchers". Characters appearing inside a muncher are in general not further grouped or classified, except for escaped chars. See MuncherSigils documentation.

    • Symbolic keywords are groups of symbol chars that have a special purpose for the Gobstones parser. See SymbolicKeywords documentation.

    • Keywords are particular identifiers (maximal groups of contiguous regular characters not recognized as numbers) used to build the different constructions of a Gobstones program. See Keywords documentation.

    • Special regular chars are particular subsets of regular chars used to further distinguish the words formed by maximal groups of contiguous regular characters, such as numbers and identifiers that are not keywords. Digit chars are those used to recognize numbers: a maximal group of regular digit chars is recognized as a number. Maximal groups starting with a non-digit regular char are recognized as identifiers. Further explanation is required to understand the remaining groups of special regular chars.

      In western cultures each letter may have two forms: a bigger, taller version, called "capital" or "uppercase", and a smaller, shorter one, called "lowercase" -- although the exact shape of each version may be very different, not just a difference in size. Gobstones uses capitalization of letters at the start of identifiers to distinguish two groups of them (for example, a non-keyword identifier starting with an uppercase character is recognized as a procedure or constructor, and a non-keyword identifier starting with a lowercase character is recognized as a function). As not all non-western cultures have this distinction, a specification of Gobstones words includes the possibility to add special regular characters as uppercase or lowercase, to keep the ability to differentiate identifiers this way. See BasicSpecialRegularChars and SpecialRegularChars documentation.

  • Language Primitives are particular identifiers used for the definition of basic language identifiers. They are not structural, as they follow the lexical rules of identifiers and the constructions they represent follow the syntactic rules of the language, but they are required by the Gobstones compiler to function properly (e.g. the IF construction requires an expression with type BOOL to work properly). Their keywords have to be organized in the groups Types, Data Constructors by Types, and Operations, as defined by the restriction on the type parameter, LangPrimitives. See LangPrimitives interface.

  • DomainSpecificPrimitives are similar to language primitives, but correspond to the specific domain used to provide the "concrete" elements needed for didactic purposes. Their keywords are parametric, but they also have to be organized in the groups Types, Data Constructors by Types, and Operations, as defined by the restriction on the type parameter, DomainSpecificPrimitives. See DomainSpecificPrimitives interface.
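
The classification rules described in the list above (symbol chars separate words, maximal groups of regular chars become numbers or identifiers, and the first character of an identifier decides its group) can be sketched as follows. This is not the Gobstones lexer: `CharGroups`, `splitRegularWords`, and `classify` are hypothetical names, munchers and keywords are ignored, and the `unclassified` kind covers groups that start with a digit but also contain non-digits, a case the text above does not specify.

```typescript
// Hypothetical, simplified inputs; the real definitions live in SymbolChars
// and SpecialRegularChars, whose exact shapes are not shown in this section.
interface CharGroups {
    symbolChars: string;    // every char treated as a symbol (whitespace included)
    digitChars: string;     // regular chars used to recognize numbers
    uppercaseChars: string; // extra regular chars to be treated as uppercase
    lowercaseChars: string; // extra regular chars to be treated as lowercase
}

type WordKind = 'number' | 'upperId' | 'lowerId' | 'otherId' | 'unclassified';

interface Word {
    kind: WordKind;
    text: string;
}

const has = (set: string, ch: string): boolean => set.indexOf(ch) !== -1;

// Classifies one maximal group of regular chars (always non-empty).
function classify(word: string, groups: CharGroups): WordKind {
    let allDigits = true;
    for (const ch of word) {
        if (!has(groups.digitChars, ch)) allDigits = false;
    }
    if (allDigits) return 'number';
    const first = word.charAt(0);
    // Starts with a digit but is not a number: left unspecified here.
    if (has(groups.digitChars, first)) return 'unclassified';
    // A char counts as uppercase if listed as such or if it has a lowercase form.
    if (has(groups.uppercaseChars, first) || first !== first.toLowerCase()) return 'upperId';
    if (has(groups.lowercaseChars, first) || first !== first.toUpperCase()) return 'lowerId';
    return 'otherId';
}

// Splits the input at symbol chars and classifies each maximal regular group.
function splitRegularWords(input: string, groups: CharGroups): Word[] {
    const words: Word[] = [];
    let current = '';
    const flush = (): void => {
        if (current !== '') {
            words.push({ kind: classify(current, groups), text: current });
            current = '';
        }
    };
    for (const ch of input) {
        if (has(groups.symbolChars, ch)) {
            flush();
        } else {
            current += ch;
        }
    }
    flush();
    return words;
}
```

Listing a caseless character in `uppercaseChars` or `lowercaseChars` is what lets a script without capitalization still produce both groups of identifiers.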

interface WordsDef<DSPrims> {
    DomainSpecificPrimitives: DSPrims;
    LanguageLocale: string;
    LanguagePrimitives: LangPrimitives;
    StructuralElements: {
        Keywords: Keywords;
        LanguageVariant: string;
        MuncherSigils: MuncherSigils;
        PragmaElements: PragmaElements;
        SpecialRegularChars: {
            Basic: BasicSpecialRegularChars;
            Extra: SpecialRegularChars;
        };
        SymbolChars: {
            Basic: SymbolChars;
            Extra: SymbolChars;
        };
        SymbolicKeywords: SymbolicKeywords;
    };
}
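
The restrictions on additional symbol chars listed earlier lend themselves to a mechanical check. The sketch below is one possible reading of them, not part of the library: `ExtraSymbolChars` models only the three strings named in those restrictions, `checkExtraSymbolChars` and `offending` are hypothetical helpers, and the built-in symbol chars and basic regular chars are passed in as plain strings because their actual contents are not given in this section.

```typescript
// Hypothetical subset of the extra symbol chars: only the three strings
// named in the restrictions are modeled.
interface ExtraSymbolChars {
    LineSeparatorChars: string;
    WhitespaceChars: string;
    PunctuationChars: string;
}

// Returns the distinct chars of `candidate` that also occur in `forbidden`.
function offending(candidate: string, forbidden: string): string[] {
    const found: string[] = [];
    for (const ch of candidate) {
        if (forbidden.indexOf(ch) !== -1 && found.indexOf(ch) === -1) {
            found.push(ch);
        }
    }
    return found;
}

// Returns one message per violated restriction; an empty array means valid.
// Assumption: "symbol chars" is read as the built-in symbols plus the
// punctuation chars, and is applied only where the restrictions mention it.
function checkExtraSymbolChars(
    extra: ExtraSymbolChars,
    builtinSymbols: string,
    basicRegular: string,
): string[] {
    const common = '_' + basicRegular;
    const rules: Array<[keyof ExtraSymbolChars, string]> = [
        ['LineSeparatorChars',
            extra.WhitespaceChars + extra.PunctuationChars + builtinSymbols + common],
        ['WhitespaceChars',
            extra.LineSeparatorChars + extra.PunctuationChars + builtinSymbols + common],
        ['PunctuationChars',
            extra.LineSeparatorChars + extra.WhitespaceChars + common],
    ];
    const errors: string[] = [];
    for (const [field, forbidden] of rules) {
        const bad = offending(extra[field], forbidden);
        if (bad.length > 0) {
            errors.push(`${field} contains forbidden chars: ${JSON.stringify(bad)}`);
        }
    }
    return errors;
}
```

Running such a check before handing a definition to the lexer would surface overlapping groups early, instead of as confusing tokenization results.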

Type Parameters

Properties

DomainSpecificPrimitives: DSPrims
LanguageLocale: string
LanguagePrimitives: LangPrimitives
StructuralElements: {
    Keywords: Keywords;
    LanguageVariant: string;
    MuncherSigils: MuncherSigils;
    PragmaElements: PragmaElements;
    SpecialRegularChars: {
        Basic: BasicSpecialRegularChars;
        Extra: SpecialRegularChars;
    };
    SymbolChars: {
        Basic: SymbolChars;
        Extra: SymbolChars;
    };
    SymbolicKeywords: SymbolicKeywords;
}

Type declaration