ParseState

The parser is stateful. This struct keeps track of the entire state.

```solidity
struct ParseState {
    uint256 activeSourcePtr;
    uint256 topLevel0;
    uint256 topLevel1;
    uint256 parenTracker0;
    uint256 parenTracker1;
    uint256 lineTracker;
    uint256 subParsers;
    uint256 sourcesBuilder;
    uint256 fsm;
    uint256 stackNames;
    uint256 stackNameBloom;
    uint256 literalBloom;
    uint256 constantsBuilder;
    bytes literalParsers;
    bytes operandHandlers;
    uint256[] operandValues;
    ParseStackTracker stackTracker;
    bytes data;
    bytes meta;
}
```

Properties

| Name | Type | Description |
| --- | --- | --- |
| activeSourcePtr | uint256 | The pointer to the current source being built. The active source being pointed to is: - low 16 bits: bitwise offset into the source for the next word to be written. Starts at 0x20. Once a source is no longer the active source, i.e. it is full and a member of the LL tail, the offset is replaced with a pointer to the next source (towards the head) to build a doubly linked list. - mid 16 bits: pointer to the previous active source (towards the tail). This is a linked list of sources that are built RTL and then reversed to LTR for eval. - high bits: 4 byte opcode and operand pairs. |
| topLevel0 | uint256 | Memory region for stack word counters. The first byte is a counter/offset into the region, which increments for every top level item parsed on the RHS. The remaining 31 bytes are the word counters for each stack item, which are incremented for every op pushed to the source. This is reset to 0 for every new source. |
| topLevel1 | uint256 | 31 additional bytes of stack words, allowing for 62 top level stack items total per source. The final byte is used to count the stack height according to the LHS for the current source. This is reset to 0 for every new source. |
| parenTracker0 | uint256 | Memory region for tracking pointers to words in the source, and counters for the number of words in each paren group. The first byte is a counter/offset into the region. The second byte is a phantom counter for the root level. The remaining 30 bytes are the paren group words. |
| parenTracker1 | uint256 | 32 additional bytes of paren group words. |
| lineTracker | uint256 | A 32 byte memory region for tracking the current line. Partially reset for each line when endLine is called. Fully reset when a new source is started. Bytes from low to high: - byte 0: the number of LHS items parsed. This is the low byte so that a simple ++ is a valid operation on the line tracker while parsing the LHS. This is reset to 0 for each new line. - byte 1: a snapshot of the first high byte of topLevel0, i.e. the offset of top level items as of the beginning of the line. This is reset to the high byte of topLevel0 on each new line. - bytes 2+: a sequence of 2 byte pointers to before the start of each top level item, which is implicitly after the end of the previous top level item. Allows us to quickly find the start of the RHS source for each top level item. |
| subParsers | uint256 | |
| sourcesBuilder | uint256 | A builder for the sources array. This is a 256 bit integer where each 16 bits is a literal memory pointer to a source. |
| fsm | uint256 | The finite state machine representation of the parser. - bit 0: LHS/RHS => 0 = LHS, 1 = RHS - bit 1: yang/yin => 0 = yin, 1 = yang - bit 2: word end => 0 = not end, 1 = end - bit 3: accepting inputs => 0 = not accepting, 1 = accepting - bit 4: interstitial => 0 = not interstitial, 1 = interstitial |
| stackNames | uint256 | A linked list of stack names. As the parser encounters named stack items it pushes them onto this linked list. The linked list is in FILO order, so the first item on the stack is the last item in the list. This makes it more efficient to reference more recent stack names on the RHS. |
| stackNameBloom | uint256 | |
| literalBloom | uint256 | A bloom filter of all the literals that have been encountered so far. This is used to quickly dedupe literals. |
| constantsBuilder | uint256 | A builder for the constants array. - low 16 bits: the height (length) of the constants array. - high 240 bits: a linked list of constant values. Each constant value is stored as a 256 bit key/value pair. The key is the fingerprint of the constant value, and the value is the constant value itself. |
| literalParsers | bytes | A 256 bit integer where each 16 bits is a function pointer to a literal parser. |
| operandHandlers | bytes | |
| operandValues | uint256[] | |
| stackTracker | ParseStackTracker | |
| data | bytes | |
| meta | bytes | |
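
The bit layout described for fsm can be read and updated with plain masks. The following is a minimal sketch, not the parser's actual code: the constant names and the LibFsmSketch library are hypothetical and exist only for illustration. Only the bit positions are taken from the fsm description above.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical mask names. Only the bit positions come from the fsm
// description: bit 0 LHS/RHS, bit 1 yin/yang, bit 2 word end,
// bit 3 accepting inputs, bit 4 interstitial.
uint256 constant FSM_RHS_MASK = 1;
uint256 constant FSM_YANG_MASK = 1 << 1;
uint256 constant FSM_WORD_END_MASK = 1 << 2;
uint256 constant FSM_ACCEPTING_INPUTS_MASK = 1 << 3;
uint256 constant FSM_INTERSTITIAL_MASK = 1 << 4;

/// Illustrative helpers for a flags word laid out like `fsm`.
library LibFsmSketch {
    /// True if every bit in `mask` is set in `fsm`.
    function hasFlag(uint256 fsm, uint256 mask) internal pure returns (bool) {
        return (fsm & mask) == mask;
    }

    /// Returns `fsm` with the bits in `mask` set to 1.
    function setFlag(uint256 fsm, uint256 mask) internal pure returns (uint256) {
        return fsm | mask;
    }

    /// Returns `fsm` with the bits in `mask` cleared to 0.
    function clearFlag(uint256 fsm, uint256 mask) internal pure returns (uint256) {
        return fsm & ~mask;
    }
}
```

Under this sketch, a zero fsm reads as LHS, yin, not at a word end, not accepting inputs, and not interstitial. Moving to the RHS would be `LibFsmSketch.setFlag(fsm, FSM_RHS_MASK)`.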
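
Several properties pack fixed-width fields into a single uint256. As a rough illustration, the low-order fields of lineTracker and constantsBuilder can be extracted with shifts and masks. The library and function names below are hypothetical. The sketch covers only the fields whose positions are stated in the descriptions above and makes no assumption about how the remaining bits are interpreted.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

/// Illustrative field accessors. Names are hypothetical, not the parser's API.
library LibPackedFieldsSketch {
    /// lineTracker byte 0 (lowest byte): number of LHS items parsed.
    function lhsItemCount(uint256 lineTracker) internal pure returns (uint256) {
        return lineTracker & 0xFF;
    }

    /// lineTracker byte 1: snapshot of the high byte of topLevel0.
    function topLevelSnapshot(uint256 lineTracker) internal pure returns (uint256) {
        return (lineTracker >> 8) & 0xFF;
    }

    /// constantsBuilder low 16 bits: height (length) of the constants array.
    function constantsHeight(uint256 constantsBuilder) internal pure returns (uint256) {
        return constantsBuilder & 0xFFFF;
    }

    /// constantsBuilder high 240 bits, returned uninterpreted.
    function constantsListBits(uint256 constantsBuilder) internal pure returns (uint256) {
        return constantsBuilder >> 16;
    }
}
```

Because the LHS item count occupies the lowest byte, `lineTracker + 1` increments it without touching the other fields, as long as the count stays below 0xFF. Beyond that the addition would carry into byte 1.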