
Case Mapping

Case mapping is conceptually simple. There are three kinds of case: lower-case, upper-case, and title-case. Title-cased text has an upper-case letter at the beginning of each word. Boost.Text provides six operations, each with a few overloads. Three are case-mapping algorithms: to_lower(), to_title(), and to_upper(). Each of these writes its result (as code points) to an out-iterator. The other three are case predicates: is_lower(), is_title(), and is_upper().

For each of these, there are overloads that take the input code point sequence as a code_point_range, a pair of code_point_iters, or a grapheme_range:

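// Three code points, stored in a container that is usable as a code point range.
std::array<uint32_t, 3> cps = {{'A', 'n', 'd'}};

assert(!boost::text::is_lower(cps));
assert(boost::text::is_title(cps));
assert(!boost::text::is_upper(cps));

// Lower-casing these ASCII code points produces exactly three code points.
std::array<uint32_t, 3> lowered_cps;
boost::text::to_lower(cps, lowered_cps.begin());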
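
Because the result is written through an out-iterator, the destination does not have to be a fixed-size array. This matters because Unicode's full case mappings can change the number of code points; for example, U+00DF (ß) upper-cases to "SS". The sketch below illustrates the pattern using only the standard library. toy_to_upper() and toy_upper_copy() are hypothetical helpers that handle just ASCII letters and U+00DF. They are not part of Boost.Text and do not reflect how it is implemented.

```cpp
#include <cstdint>
#include <iterator>
#include <vector>

// Toy upper-casing over code points. Hypothetical helper for illustration
// only: it knows about ASCII a-z and U+00DF, nothing else.
template<typename CPRange, typename OutIter>
OutIter toy_to_upper(CPRange const & cps, OutIter out)
{
    for (std::uint32_t cp : cps) {
        if (cp == 0x00DF) {
            // Full case mapping: one code point in, two code points out.
            *out++ = std::uint32_t('S');
            *out++ = std::uint32_t('S');
        } else if (0x61 <= cp && cp <= 0x7A) {
            // ASCII 'a'..'z' maps to 'A'..'Z'.
            *out++ = cp - 0x20;
        } else {
            // Everything else is passed through unchanged.
            *out++ = cp;
        }
    }
    return out;
}

// Collects the result in a growable container, so the caller does not
// need to know the output length in advance.
inline std::vector<std::uint32_t>
toy_upper_copy(std::vector<std::uint32_t> const & cps)
{
    std::vector<std::uint32_t> result;
    toy_to_upper(cps, std::back_inserter(result));
    return result;
}
```

With a growable destination such as std::back_inserter, an input of six code points can safely produce seven, which a same-sized output array could not hold.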

One complication is that some languages have case-mapping rules that differ from the default ones. The to_*() functions therefore take an optional case_language parameter that selects the language-specific behavior:

boost::text::text t = "ijssel";

// Default rules: only the first letter of the word is upper-cased.
boost::text::text default_titled_t;
boost::text::to_title(
    t, std::inserter(default_titled_t, default_titled_t.end()));
assert(default_titled_t == boost::text::text("Ijssel"));

// Dutch rules: a word-initial "ij" is title-cased as "IJ".
std::string dutch_titled_t;
boost::text::to_title(
    t,
    boost::text::from_utf32_inserter(dutch_titled_t, dutch_titled_t.end()),
    boost::text::case_language::dutch);
assert(dutch_titled_t == "IJssel");

Another complication is that the title-case functions need to know where word boundaries are. By default, they use an instance of next_word_break_callable, which in turn just calls next_word_break(). You can supply your own callable instead if you need tailored word breaking.
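
To make the role of the word-break callable concrete, here is a minimal standard-library sketch of title-casing that is parameterized on "where does the next word end?". Every name in it (toy_to_title(), next_space_break, next_space_or_hyphen_break) is hypothetical, it handles ASCII only, and the callable's signature is an assumption chosen for the illustration; it is not Boost.Text's interface or implementation.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical default: a word runs up to and including the next space.
struct next_space_break
{
    template<typename Iter>
    Iter operator()(Iter first, Iter last) const
    {
        Iter it = std::find(first, last, ' ');
        return it == last ? last : std::next(it);
    }
};

// Hypothetical tailoring: a hyphen also ends a word.
struct next_space_or_hyphen_break
{
    template<typename Iter>
    Iter operator()(Iter first, Iter last) const
    {
        Iter it = std::find_if(
            first, last, [](char c) { return c == ' ' || c == '-'; });
        return it == last ? last : std::next(it);
    }
};

inline char ascii_upper(char c)
{
    return ('a' <= c && c <= 'z') ? char(c - 'a' + 'A') : c;
}

inline char ascii_lower(char c)
{
    return ('A' <= c && c <= 'Z') ? char(c - 'A' + 'a') : c;
}

// ASCII-only title-casing: upper-case the first character of each word and
// lower-case the rest. Word boundaries come entirely from next_break.
template<typename OutIter, typename NextWordBreak = next_space_break>
OutIter toy_to_title(
    std::string const & in,
    OutIter out,
    NextWordBreak next_break = NextWordBreak{})
{
    auto first = in.begin();
    auto const last = in.end();
    while (first != last) {
        auto const word_end = next_break(first, last);
        *out++ = ascii_upper(*first);
        out = std::transform(std::next(first), word_end, out, ascii_lower);
        first = word_end;
    }
    return out;
}
```

Under these assumptions, title-casing "well-known FACT" with the default callable gives "Well-known Fact", while passing next_space_or_hyphen_break{} gives "Well-Known Fact". The casing logic is unchanged; only the notion of a word differs.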

There are a few case mapping behaviors that are common in various languages, but that are not accounted for by the default Unicode case mapping rules. For instance, Dutch capitalizes "IJ" at the beginning of title-cased words. This is available in Boost.Text's implementation if you use case_language::dutch, as seen above. Boost.Text also implements the somewhat complicated rules for upper-casing modern Greek.
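
As a rough illustration of what such a language tailoring amounts to, the following standard-library sketch title-cases a single ASCII word and applies the Dutch "IJ" rule on request. toy_language and toy_title_word() are hypothetical names invented for this example. The sketch ignores everything outside ASCII and is not how Boost.Text implements case_language::dutch.

```cpp
#include <string>

// Hypothetical stand-in for a language selector; not Boost.Text's enum.
enum class toy_language { standard, dutch };

inline char to_ascii_upper(char c)
{
    return ('a' <= c && c <= 'z') ? char(c - 'a' + 'A') : c;
}

inline char to_ascii_lower(char c)
{
    return ('A' <= c && c <= 'Z') ? char(c - 'A' + 'a') : c;
}

// Title-cases one ASCII word: first letter upper-case, the rest lower-case.
// With toy_language::dutch, a word-initial "ij" becomes "IJ".
inline std::string
toy_title_word(std::string word, toy_language lang = toy_language::standard)
{
    if (word.empty())
        return word;

    for (char & c : word)
        c = to_ascii_lower(c);
    word[0] = to_ascii_upper(word[0]);

    // Language-specific step: treat the digraph "ij" as a single letter.
    if (lang == toy_language::dutch && word.size() >= 2 &&
        word[0] == 'I' && word[1] == 'j') {
        word[1] = 'J';
    }
    return word;
}
```

Here toy_title_word("ijssel") yields "Ijssel", and toy_title_word("ijssel", toy_language::dutch) yields "IJssel", matching the results of the to_title() calls shown earlier.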

Finally, there are in-place versions of the case-mapping functions available for use with text and rope:

boost::text::text t = "a title";
boost::text::in_place_to_upper(t);
assert(t == boost::text::text("A TITLE"));

boost::text::rope r = "another title";
boost::text::in_place_to_title(r);
assert(r == boost::text::text("Another Title"));

