So far, we've seen only simple parsers that parse the same value repeatedly (with or without commas and spaces). It's also very common to parse a few values in a specific sequence. Let's say you want to parse an employee record. Here's a parser you might write:
namespace bp = boost::parser;

auto employee_parser = bp::lit("employee")
    >> '{'
    >> bp::int_ >> ','
    >> quoted_string >> ','
    >> quoted_string >> ','
    >> bp::double_
    >> '}';
The attribute type for employee_parser is boost::parser::tuple<int, std::string, std::string, double>. That's great, in that you got all the parsed data for the record without having to write any semantic actions. It's not so great that you now have to get all the individual elements out by their indices, using get(). It would be much nicer to parse into the final data structure that your program is going to use. This is often some struct or class. Boost.Parser supports parsing into arbitrary aggregate structs, and non-aggregates that are constructible from the tuple at hand.
If we have a struct that has data members of the same types listed in the boost::parser::tuple attribute type for employee_parser, it would be nice to parse directly into it, instead of parsing into a tuple and then constructing our struct later. Fortunately, this just works in Boost.Parser. Here is an example of parsing straight into a compatible aggregate type.
#include <boost/parser/parser.hpp>

#include <iostream>
#include <string>

struct employee
{
    int age;
    std::string surname;
    std::string forename;
    double salary;
};

namespace bp = boost::parser;

int main()
{
    std::cout << "Enter employee record. ";
    std::string input;
    std::getline(std::cin, input);

    auto quoted_string = bp::lexeme['"' >> +(bp::char_ - '"') >> '"'];
    auto employee_p = bp::lit("employee")
        >> '{'
        >> bp::int_ >> ','
        >> quoted_string >> ','
        >> quoted_string >> ','
        >> bp::double_
        >> '}';

    // The parser's attribute is a tuple, but a compatible aggregate can be
    // passed as the out-param.
    employee record;
    auto const result = bp::parse(input, employee_p, bp::ws, record);

    if (result) {
        std::cout << "You entered:\nage: " << record.age
                  << "\nsurname: " << record.surname
                  << "\nforename: " << record.forename
                  << "\nsalary : " << record.salary << "\n";
    } else {
        std::cout << "Parse failure.\n";
    }
}
Unfortunately, this is taking advantage of the loose attribute assignment logic; the employee_p parser still has a boost::parser::tuple attribute. See The parse() API for a description of attribute out-param compatibility.
For this reason, it's even more common to want to make a rule that returns a specific type like employee. Just by giving the rule a struct type, we make sure that this parser always generates an employee struct as its attribute, no matter where it is in the parse. If we made a simple parser P that uses the employee_p rule, like bp::int_ >> employee_p, P's attribute type would be boost::parser::tuple<int, employee>.
#include <boost/parser/parser.hpp>

#include <iostream>
#include <string>
#include <type_traits>

struct employee
{
    int age;
    std::string surname;
    std::string forename;
    double salary;
};

namespace bp = boost::parser;

bp::rule<struct quoted_string, std::string> quoted_string = "quoted name";
bp::rule<struct employee_p, employee> employee_p = "employee";

auto quoted_string_def = bp::lexeme['"' >> +(bp::char_ - '"') >> '"'];
auto employee_p_def = bp::lit("employee")
    >> '{'
    >> bp::int_ >> ','
    >> quoted_string >> ','
    >> quoted_string >> ','
    >> bp::double_
    >> '}';

BOOST_PARSER_DEFINE_RULES(quoted_string, employee_p);

int main()
{
    std::cout << "Enter employee record. ";
    std::string input;
    std::getline(std::cin, input);

    static_assert(std::is_aggregate_v<std::decay_t<employee &>>);

    // No out-param this time: the rule's attribute type is employee, so the
    // result holds an employee on success.
    auto const result = bp::parse(input, employee_p, bp::ws);

    if (result) {
        std::cout << "You entered:\nage: " << result->age
                  << "\nsurname: " << result->surname
                  << "\nforename: " << result->forename
                  << "\nsalary : " << result->salary << "\n";
    } else {
        std::cout << "Parse failure.\n";
    }
}
Just as you can pass a struct as an out-param to parse() when the parser's attribute type is a tuple, you can also pass a tuple as an out-param to parse() when the parser's attribute type is a struct:
// Using the employee_p rule from above, with attribute type employee...
boost::parser::tuple<int, std::string, std::string, double> tup;
auto const result = bp::parse(input, employee_p, bp::ws, tup); // Ok!
Important | |
---|---|
This automatic use of |
class types as attributes
Many times you don't have an aggregate struct that you want to produce from your parse. It would be even nicer than the aggregate code above if Boost.Parser could detect that the members of a tuple that is produced as an attribute are usable as the arguments to some type's constructor. So, Boost.Parser does that.
#include <boost/parser/parser.hpp>

#include <iostream>
#include <string>
#include <vector>

namespace bp = boost::parser;

int main()
{
    std::cout << "Enter a string followed by two unsigned integers. ";
    std::string input;
    std::getline(std::cin, input);

    constexpr auto string_uint_uint =
        bp::lexeme[+(bp::char_ - ' ')] >> bp::uint_ >> bp::uint_;
    std::string string_from_parse;
    if (parse(input, string_uint_uint, bp::ws, string_from_parse))
        std::cout << "That yields this string: " << string_from_parse << "\n";
    else
        std::cout << "Parse failure.\n";

    std::cout << "Enter an unsigned integer followed by a string. ";
    std::getline(std::cin, input);
    std::cout << input << "\n";

    constexpr auto uint_string = bp::uint_ >> bp::char_ >> bp::char_;
    std::vector<std::string> vector_from_parse;
    if (parse(input, uint_string, bp::ws, vector_from_parse)) {
        std::cout << "That yields this vector of strings:\n";
        for (auto && str : vector_from_parse) {
            std::cout << " '" << str << "'\n";
        }
    } else {
        std::cout << "Parse failure.\n";
    }
}
Let's look at the first parse.
constexpr auto string_uint_uint =
    bp::lexeme[+(bp::char_ - ' ')] >> bp::uint_ >> bp::uint_;
std::string string_from_parse;
if (parse(input, string_uint_uint, bp::ws, string_from_parse))
    std::cout << "That yields this string: " << string_from_parse << "\n";
else
    std::cout << "Parse failure.\n";
Here, we use the parser string_uint_uint, which produces a boost::parser::tuple<std::string, unsigned int, unsigned int> attribute. When we try to parse that into an out-param std::string attribute, it just works. This is because std::string has a constructor that takes a std::string, an offset, and a length. Here's the other parse:
constexpr auto uint_string = bp::uint_ >> bp::char_ >> bp::char_;
std::vector<std::string> vector_from_parse;
if (parse(input, uint_string, bp::ws, vector_from_parse)) {
    std::cout << "That yields this vector of strings:\n";
    for (auto && str : vector_from_parse) {
        std::cout << " '" << str << "'\n";
    }
} else {
    std::cout << "Parse failure.\n";
}
Now we have the parser uint_string, which produces a boost::parser::tuple<unsigned int, std::string> attribute; the two chars at the end combine into a std::string. Those two values can be used to construct a std::vector<std::string>, via the count, T constructor.
Just like with using aggregates in place of tuples, non-aggregate class types can be substituted for tuples in most places. That includes using a non-aggregate class type as the attribute type of a rule.
However, while compatible tuples can be substituted for aggregates, you can't substitute a tuple for some class type T just because the tuple could have been used to construct T. Think of trying to invert the substitution in the second parse above. Converting a std::vector<std::string> into a boost::parser::tuple<unsigned int, std::string> makes no sense.