First, let's cover some terminology that we'll be using throughout the docs:
A semantic action is an arbitrary bit of logic associated with a parser; it is executed only when the parser matches.
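To make the "only executed when the parser matches" part concrete, here is a minimal sketch in plain standard C++. It does not use Boost.Parser and is not how the library implements semantic actions; toy_parse_double() and toy_parse_with_action() are hypothetical names invented for this illustration.

```cpp
#include <cstdlib>
#include <functional>
#include <optional>
#include <string>

// Hypothetical toy parser: matches only if the whole string is a number.
std::optional<double> toy_parse_double(std::string const & input)
{
    if (input.empty())
        return std::nullopt;
    char * end = nullptr;
    double const value = std::strtod(input.c_str(), &end);
    if (end != input.c_str() + input.size())
        return std::nullopt; // trailing junk, or nothing consumed
    return value;
}

// Hypothetical toy "parser with a semantic action": the action is invoked
// with the parsed value on a match, and is never invoked on a failure.
bool toy_parse_with_action(
    std::string const & input, std::function<void(double)> const & action)
{
    std::optional<double> const value = toy_parse_double(input);
    if (!value)
        return false;
    action(*value);
    return true;
}
```

Calling toy_parse_with_action("3.5", action) runs the action once; calling it with "abc" returns false and leaves the action untouched.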
Simpler parsers can be combined to form more complex parsers. Given some
combining operation C, and parsers P0, P1, ... PN, C(P0, P1, ... PN) creates
a new parser Q. This creates a parse tree. Q is the parent of P1; P2 is a
child of Q; and so on. The parsers are applied in the top-down fashion
implied by this topology. When you use Q to parse a string, it will use P0,
P1, etc. to do the actual work. If P3 is being used to parse the input, that
means that Q is as well, since the way Q parses is by dispatching to its
children to do some or all of the work. At any point in the parse, there
will be exactly one parser without children that is being used to parse the
input; all other parsers being used are its ancestors in the parse tree.
A subparser is a parser that is the child of another parser.
The top-level parser is the root of the tree of parsers.
The current parser or bottommost parser is the parser with no children that is currently being used to parse the input.
A rule is a kind of parser that makes building large, complex parsers easier. A subrule is a rule that is the child of some other rule. The current rule or bottommost rule is the one rule currently being used to parse the input that has no subrules. Note that while there is always exactly one current parser, there may or may not be a current rule — rules are one kind of parser, and you may or may not be using one at a given point in the parse.
The top-level parse is the parse operation being performed by the top-level
parser. This term is necessary because, though most parse failures are local
to a particular parser, some parse failures cause the call to parse() to
indicate failure of the entire parse. For these cases, we say that such a
local failure "causes the top-level parse to fail".
Throughout the Boost.Parser documentation, I will refer to "the call to
parse()". Read this as "the call to any one of the functions described in
The parse() API". That includes prefix_parse(), callback_parse(), and
callback_prefix_parse().
There are some special kinds of parsers that come up often in this documentation.
One is a sequence parser; you will see it created using operator>>, as in
p1 >> p2 >> p3. A sequence parser tries to match all of its subparsers to
the input, one at a time, in order. It matches the input iff all its
subparsers do.
Another is an alternative parser; you will see it created using operator|,
as in p1 | p2 | p3. An alternative parser tries to match each of its
subparsers to the input, one at a time, in order; it stops as soon as one
subparser matches. It matches the input iff one of its subparsers does.
Finally, there is a permutation parser; it is created using operator||, as
in p1 || p2 || p3. A permutation parser tries to match all of its subparsers
to the input, in any order. So the parser p1 || p2 || p3 is equivalent to
(p1 >> p2 >> p3) | (p1 >> p3 >> p2) | (p2 >> p1 >> p3) |
(p2 >> p3 >> p1) | (p3 >> p1 >> p2) | (p3 >> p2 >> p1). Hopefully its
terseness is self-explanatory. It matches the input iff all of its
subparsers do, regardless of the order they match in.
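The matching rules for the three combinators above can be sketched with a toy model in plain standard C++. This is an illustration only: it does not use Boost.Parser, it ignores attributes, and the greedy strategy in toy_perm() is an assumption made for brevity, not a description of the library's implementation. All of the toy_ names are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <string_view>
#include <vector>

// Toy model: a parser consumes a prefix of `in` and returns true on a match,
// or leaves `in` unchanged and returns false.
using toy_parser = std::function<bool(std::string_view &)>;

// Matches exactly the character c.
toy_parser toy_char(char c)
{
    return [c](std::string_view & in) {
        if (in.empty() || in.front() != c)
            return false;
        in.remove_prefix(1);
        return true;
    };
}

// Sequence (models p1 >> p2 >> ...): every subparser must match, in order.
toy_parser toy_seq(std::vector<toy_parser> subparsers)
{
    return [subparsers](std::string_view & in) {
        std::string_view const saved = in;
        for (auto const & p : subparsers) {
            if (!p(in)) {
                in = saved; // undo partial progress
                return false;
            }
        }
        return true;
    };
}

// Alternative (models p1 | p2 | ...): stop at the first subparser that matches.
toy_parser toy_alt(std::vector<toy_parser> subparsers)
{
    return [subparsers](std::string_view & in) {
        for (auto const & p : subparsers) {
            if (p(in))
                return true;
        }
        return false;
    };
}

// Permutation (models p1 || p2 || ...): every subparser must match exactly
// once, in any order. Each round greedily picks the first unused subparser
// that matches at the current position.
toy_parser toy_perm(std::vector<toy_parser> subparsers)
{
    return [subparsers](std::string_view & in) {
        std::string_view const saved = in;
        std::vector<bool> used(subparsers.size(), false);
        for (std::size_t round = 0; round < subparsers.size(); ++round) {
            bool matched = false;
            for (std::size_t i = 0; i < subparsers.size() && !matched; ++i) {
                if (!used[i] && subparsers[i](in)) {
                    used[i] = true;
                    matched = true;
                }
            }
            if (!matched) {
                in = saved; // undo partial progress
                return false;
            }
        }
        return true;
    };
}

// True only if p matches and consumes all of `input`.
bool toy_parse(std::string_view input, toy_parser const & p)
{
    return p(input) && input.empty();
}
```

With a, b, and c built from toy_char(), toy_seq({a, b, c}) accepts only "abc", toy_alt({a, b, c}) accepts any single one of the three characters, and toy_perm({a, b, c}) accepts all six orderings, such as "cab". Each combinator also shows the parent/child relationship described earlier: the combined parser does its work by dispatching to its subparsers.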
Boost.Parser parsers each have an attribute associated with them, or
explicitly have no attribute. An attribute is a value that the parser
generates when it matches the input. For instance, the parser double_
generates a double when it matches the input. ATTR() is a notional macro
that expands to the attribute type of the parser passed to it; ATTR(double_)
is double. This is similar to the attribute type trait.
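As a rough analogy for what ATTR() denotes, the sketch below computes "the type of value a parser generates" for a toy parser in plain standard C++. It does not use Boost.Parser; toy_double_parser and toy_attr_t are hypothetical names, and the real attribute type trait may well be declared differently.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <type_traits>

// Hypothetical toy parser: generates a double when the whole input matches.
struct toy_double_parser
{
    std::optional<double> operator()(std::string const & input) const
    {
        try {
            std::size_t consumed = 0;
            double const value = std::stod(input, &consumed);
            if (consumed != input.size())
                return std::nullopt; // trailing characters: no match
            return value;
        } catch (std::logic_error const &) {
            // std::stod reports failure via std::invalid_argument or
            // std::out_of_range, both derived from std::logic_error.
            return std::nullopt;
        }
    }
};

// Toy counterpart of the notional ATTR(): the value type inside the
// optional that the parser returns.
template<typename Parser>
using toy_attr_t = typename std::invoke_result_t<
    Parser const &, std::string const &>::value_type;

static_assert(std::is_same_v<toy_attr_t<toy_double_parser>, double>);
```

Here toy_attr_t<toy_double_parser> plays the role that ATTR(double_) plays in the text: it names the type of the generated value without running a parse.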
Next, we'll look at some simple programs that parse using Boost.Parser. We'll start small and build up from there.