Terminology

First, let's cover some terminology that we'll be using throughout the docs:

A semantic action is an arbitrary bit of logic associated with a parser, that is only executed when the parser matches.

Simpler parsers can be combined to form more complex parsers. Given some combining operation C, and parsers P0, P1, ... PN, C(P0, P1, ... PN) creates a new parser Q. This creates a parse tree. Q is the parent of P1, P2 is the child of Q, etc. The parsers are applied in the top-down fashion implied by this topology. When you use Q to parse a string, it will use P0, P1, etc. to do the actual work. If P3 is being used to parse the input, that means that Q is as well, since the way Q parses is by dispatching to its children to do some or all of the work. At any point in the parse, there will be exactly one parser without children that is being used to parse the input; all other parsers being used are its ancestors in the parse tree.

A subparser is a parser that is the child of another parser.

The top-level parser is the root of the tree of parsers.

The current parser or bottommost parser is the parser with no children that is currently being used to parse the input.

A rule is a kind of parser that makes building large, complex parsers easier. A subrule is a rule that is the child of some other rule. The current rule or bottommost rule is the one rule currently being used to parse the input that has no subrules. Note that while there is always exactly one current parser, there may or may not be a current rule — rules are one kind of parser, and you may or may not be using one at a given point in the parse.

The top-level parse is the parse operation being performed by the top-level parser. This term is necessary because, though most parse failures are local to a particular parser, some parse failures cause the call to parse() to indicate failure of the entire parse. For these cases, we say that such a local failure "causes the top-level parse to fail".

Throughout the Boost.Parser documentation, I will refer to "the call to parse()". Read this as "the call to any one of the functions described in The parse() API". That includes prefix_parse(), callback_parse(), and callback_prefix_parse().

There are some special kinds of parsers that come up often in this documentation.

One is a sequence parser; you will see it created using operator>>, as in p1 >> p2 >> p3. A sequence parser tries to match all of its subparsers to the input, one at a time, in order. It matches the input iff all its subparsers do.

Another is an alternative parser; you will see it created using operator|, as in p1 | p2 | p3. An alternative parser tries to match all of its subparsers to the input, one at a time, in order; it stops after matching at most one subparser. It matches the input iff one of its subparsers does.

Finally, there is a permutation parser; it is created using operator||, as in p1 || p2 || p3. A permutation parser tries to match all of its subparsers to the input, in any order. So the parser p1 || p2 || p3 is equivalent to (p1 >> p2 >> p3) | (p1 >> p3 >> p2) | (p2 >> p1 >> p3) | (p2 >> p3 >> p1) | (p3 >> p1 >> p2) | (p3 >> p2 >> p1). Hopefully its terseness is self-explanatory. It matches the input iff all of its subparsers do, regardless of the order they match in.

Boost.Parser parsers each have an attribute associated with them, or explicitly have no attribute. An attribute is a value that the parser generates when it matches the input. For instance, the parser double_ generates a double when it matches the input. ATTR() is a notional macro that expands to the attribute type of the parser passed to it; ATTR(double_) is double. This is similar to the attribute type trait.

Next, we'll look at some simple programs that parse using Boost.Parser. We'll start small and build up from there.