Mutable Symbol Tables

The previous example showed how to use a symbol table as a fixed lookup table. What if we want to add things to the table during the parse? We can do that, but we need to do so within a semantic action. First, here is our symbol table, already with a single value in it:

bp::symbols<int> const symbols = {{"c", 8}};
assert(parse("c", symbols));

Unsurprisingly, using the symbol table as a parser successfully parses the one string it contains. Now, here's our parser:

auto const parser = (bp::char_ >> bp::int_)[add_symbol] >> symbols;

Here, we've attached the semantic action not to a simple parser like double_, but to the sequence parser (bp::char_ >> bp::int_). This sequence parser contains two parsers, each with its own attribute, so it produces two attributes as a tuple.

auto const add_symbol = [&symbols](auto & ctx) {
    using namespace bp::literals;
    // symbols::insert() requires a string, not a single character.
    char chars[2] = {_attr(ctx)[0_c], 0};
    symbols.insert(ctx, chars, _attr(ctx)[1_c]);
};

Inside the semantic action, we can get the elements of the attribute tuple using the UDLs provided by Boost.Hana and boost::hana::tuple::operator[](). The first attribute, from the char_, is _attr(ctx)[0_c], and the second, from the int_, is _attr(ctx)[1_c]. (If boost::parser::tuple aliases to std::tuple, you'd use std::get or boost::parser::get instead.) To add the symbol to the symbol table, we call insert().
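For the std::tuple case mentioned above, the extraction might look like the following sketch. It deliberately uses no Boost.Parser code: a plain std::tuple<char, int> stands in for what _attr(ctx) would yield, and make_entry is a hypothetical helper name introduced only for illustration.

```cpp
#include <string>
#include <tuple>
#include <utility>

// Hypothetical helper: turns a (char, int) attribute into the
// (string, value) pair that a symbol table insertion needs.
std::pair<std::string, int> make_entry(std::tuple<char, int> const & attr)
{
    // A one-character std::string plays the role of the char[2] buffer
    // used in the semantic action.
    std::string key(1, std::get<0>(attr));
    return {key, std::get<1>(attr)};
}
```

Calling make_entry(std::tuple<char, int>{'X', 9}) yields the key "X" paired with the value 9, which are the two arguments the insertion needs after the context.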

auto const result = parse("X 9 X", parser, bp::ws);
assert(result && *result == 9);

During the parse, ("X", 9) is parsed and added to the symbol table. Then, the second 'X' is recognized by the symbol table parser. However:

assert(!parse("X", symbols));

If we parse again, we find that "X" did not stay in the symbol table. The fact that symbols was declared const might have given you a hint that this would happen. Also, notice that the call to insert() in the semantic action uses the parse context; that's where all the symbol table changes are stored during the parse.
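One way to picture this behavior is a table whose per-parse changes live in a scratch copy that is thrown away when the parse ends. The sketch below is a standalone model in plain C++, not Boost.Parser's implementation; toy_symbols, toy_context, and parse_and_insert_x are hypothetical names, and the real library may store its changes quite differently.

```cpp
#include <map>
#include <string>

// Toy model: `stored` is what the table holds between parses.
struct toy_symbols
{
    std::map<std::string, int> stored;
};

// Toy stand-in for the parse context: it owns a working copy of the
// table for the duration of one top-level parse.
struct toy_context
{
    explicit toy_context(toy_symbols const & table) : working(table.stored) {}

    // Models symbols.insert(ctx, ...): only the working copy changes.
    void insert(std::string const & key, int value) { working[key] = value; }

    bool contains(std::string const & key) const
    {
        return working.count(key) != 0;
    }

    std::map<std::string, int> working;
};

// Runs one "parse" that inserts ("X", 9) and reports whether "X" was
// visible during that parse.
bool parse_and_insert_x(toy_symbols const & table)
{
    toy_context ctx(table);
    ctx.insert("X", 9);
    return ctx.contains("X");
} // ctx, and with it the inserted "X", is discarded here
```

In this model the table can be const, just like symbols in the example, because the insertion only ever touches state owned by the context.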

The full program:

#include <boost/parser/parser.hpp>

#include <cassert>
#include <iostream>
#include <string>


namespace bp = boost::parser;

int main()
{
    bp::symbols<int> const symbols = {{"c", 8}};
    assert(parse("c", symbols));

    auto const add_symbol = [&symbols](auto & ctx) {
        using namespace bp::literals;
        // symbols::insert() requires a string, not a single character.
        char chars[2] = {_attr(ctx)[0_c], 0};
        symbols.insert(ctx, chars, _attr(ctx)[1_c]);
    };
    auto const parser = (bp::char_ >> bp::int_)[add_symbol] >> symbols;

    auto const result = parse("X 9 X", parser, bp::ws);
    assert(result && *result == 9);
    (void)result;

    assert(!parse("X", symbols));
}

	Tip
	`symbols` also has a call operator that does exactly what `.insert_for_next_parse()` does. This allows you to chain additions with a convenient syntax, like this:

	symbols<int> roman_numerals;
	roman_numerals.insert_for_next_parse("I", 1)("V", 5)("X", 10);

	Important
	`symbols` stores all its strings in UTF-32 internally. If you do Unicode or ASCII parsing, this will not matter to you at all. If you do non-Unicode parsing of a character encoding that is not a subset of Unicode (EBCDIC, for instance), it could cause problems. See the section on Unicode Support for more information.

It is possible to add symbols to a symbols object permanently. To do so, you have to use a mutable symbols object s, and add the symbols by calling s.insert_for_next_parse() instead of s.insert(). These two operations are orthogonal, so if you want to both add a symbol to the table for the current top-level parse and leave it in the table for subsequent top-level parses, you need to call both functions.

It is also possible to erase a single entry from the symbol table, or to clear the symbol table entirely. Just as with insertion, there are versions of erase and clear that apply to the current parse, and others that apply only to subsequent parses. The full set of operations can be found in the symbols API docs.
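The split between current-parse and next-parse operations can be illustrated with a small standalone model. This is a sketch in plain C++ rather than Boost.Parser code: symbol_model, begin_parse, and matches are hypothetical names, and the erase and clear member names simply follow the insert() / insert_for_next_parse() pattern, so check the symbols API docs for the real spellings and signatures.

```cpp
#include <map>
#include <string>

// Toy model, not Boost.Parser's implementation. `persistent` survives
// between top-level parses; `current` exists only while one parse runs.
struct symbol_model
{
    std::map<std::string, int> persistent;
    std::map<std::string, int> current;

    // Start of a top-level parse: the working table starts out as a
    // copy of the persistent one.
    void begin_parse() { current = persistent; }

    // Current-parse operations: only the working table changes.
    void insert(std::string const & key, int value) { current[key] = value; }
    void erase(std::string const & key) { current.erase(key); }
    void clear() { current.clear(); }

    // Next-parse operations: only the persistent table changes, so the
    // parse that is already running does not see them.
    void insert_for_next_parse(std::string const & key, int value)
    {
        persistent[key] = value;
    }
    void erase_for_next_parse(std::string const & key)
    {
        persistent.erase(key);
    }
    void clear_for_next_parse() { persistent.clear(); }

    // Lookup as seen by the parse that is currently running.
    bool matches(std::string const & key) const
    {
        return current.count(key) != 0;
    }
};
```

In this model, a symbol that should be visible both now and later needs both insert() and insert_for_next_parse(), which mirrors the orthogonality described above.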