Boost.Text is composed of two main layers: the Unicode layer and the text layer.
There are also a few assorted bits that were necessary or useful to have around
when implementing various parts of Boost.Text: segmented_vector, unencoded_rope,
unencoded_rope_view, and trie/trie_map/trie_set.
The Unicode layer provides a few Unicode-related utility types, but consists primarily of the Unicode algorithms. These algorithms are written in the style of the standard algorithms, with range-friendly interfaces. For each of the Unicode algorithms there is a corresponding view. The algorithms cover a number of Unicode operations, including case mapping (to_upper(), is_lower(), etc.).
These algorithms are independent of the text layer; it is possible to
use Boost.Text as a Unicode library without using the text layer at all.
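To make "in the style of the standard algorithms, with range-friendly interfaces" concrete, here is a minimal sketch of what such an interface shape can look like. It is not taken from Boost.Text: the name sketch_utf8_to_utf32 is hypothetical, UTF-8 to UTF-32 transcoding is used only as a convenient example operation, and the code assumes well-formed input.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical name, not a Boost.Text function. Assumes well-formed UTF-8;
// a real transcoder would also validate and handle ill-formed sequences.
// Iterator-pair form, shaped like std::copy(first, last, out).
template <typename InIter, typename OutIter>
OutIter sketch_utf8_to_utf32(InIter first, InIter last, OutIter out) {
    while (first != last) {
        const unsigned char lead = static_cast<unsigned char>(*first++);
        std::uint32_t cp = lead;          // 1-byte (ASCII) case
        int continuation_bytes = 0;
        if (lead >= 0xF0) {               // 4-byte sequence
            cp = lead & 0x07;
            continuation_bytes = 3;
        } else if (lead >= 0xE0) {        // 3-byte sequence
            cp = lead & 0x0F;
            continuation_bytes = 2;
        } else if (lead >= 0xC0) {        // 2-byte sequence
            cp = lead & 0x1F;
            continuation_bytes = 1;
        }
        // Each continuation byte contributes its low 6 bits.
        for (int i = 0; i < continuation_bytes && first != last; ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(*first++) & 0x3F);
        }
        *out++ = cp;
    }
    return out;
}

// Range-friendly overload: forwards to the iterator-pair form.
template <typename Range, typename OutIter>
OutIter sketch_utf8_to_utf32(const Range& r, OutIter out) {
    return sketch_utf8_to_utf32(std::begin(r), std::end(r), out);
}

// Usage: transcode "a", U+00E9, U+20AC from a std::string.
inline std::vector<std::uint32_t> example_transcode() {
    const std::string utf8 = "a\xC3\xA9\xE2\x82\xAC";
    std::vector<std::uint32_t> code_points;
    sketch_utf8_to_utf32(utf8, std::back_inserter(code_points));
    assert(code_points.size() == 3);
    return code_points;
}
```

Nothing in the sketch depends on a particular string type, which is the sense in which algorithms of this shape can be used without any text-layer types.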
The text layer is built on top of the Unicode layer. Its types encode text as
UTF-8 and maintain normalization. Much of their implementation is done in terms
of the algorithms from the Unicode layer. The types in this layer are: text,
text_view, rope, and rope_view. The layer also contains templates that can be
instantiated with different UTF formats, normalization forms, and/or underlying
storage.
Finally, there are some items that I wrote in the process of implementing everything else that rise to the level of general utility.
First is segmented_vector. This is a discontiguous sequence of T for which
insertions anywhere in the sequence are cheap, with very cheap copies provided
via a copy-on-write mechanism. It is a generalization of unencoded_rope for
arbitrary T.
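The combination of cheap insertion and copy-on-write copies can be sketched as follows. This is an illustration of the general idea only, not Boost.Text's implementation: cow_segmented_seq and its members are hypothetical names, and a flat table of small shared segments stands in for whatever structure the library actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical sketch; not Boost.Text's segmented_vector.
template <typename T>
class cow_segmented_seq {
    using segment = std::vector<T>;
    static constexpr std::size_t max_segment_size = 4;  // tiny, for illustration

    // Copying the sequence copies only this table of pointers;
    // the segments themselves are shared and never mutated in place.
    std::vector<std::shared_ptr<const segment>> segments_;
    std::size_t size_ = 0;

public:
    std::size_t size() const { return size_; }

    const T& operator[](std::size_t i) const {
        for (const auto& s : segments_) {
            if (i < s->size()) return (*s)[i];
            i -= s->size();
        }
        throw std::out_of_range("cow_segmented_seq index");
    }

    void insert(std::size_t pos, const T& value) {
        if (pos > size_) throw std::out_of_range("cow_segmented_seq insert");
        if (segments_.empty()) {
            segments_.push_back(std::make_shared<segment>(segment{value}));
            size_ = 1;
            return;
        }
        // Find the segment that holds pos; pos becomes an offset within it.
        std::size_t idx = 0;
        while (idx + 1 < segments_.size() && pos > segments_[idx]->size()) {
            pos -= segments_[idx]->size();
            ++idx;
        }
        // Copy-on-write: edit a private copy of one small segment only.
        auto edited = std::make_shared<segment>(*segments_[idx]);
        edited->insert(edited->begin() + static_cast<std::ptrdiff_t>(pos), value);
        if (edited->size() > max_segment_size) {
            // Split an overfull segment into two halves.
            auto mid = edited->begin() +
                       static_cast<std::ptrdiff_t>(edited->size() / 2);
            auto tail = std::make_shared<segment>(mid, edited->end());
            edited->erase(mid, edited->end());
            segments_[idx] = std::move(edited);
            segments_.insert(
                segments_.begin() + static_cast<std::ptrdiff_t>(idx) + 1,
                std::move(tail));
        } else {
            segments_[idx] = std::move(edited);
        }
        ++size_;
    }
};

// Usage: a copy shares segments with the original until one of them is edited.
inline void example_cow_segmented_seq() {
    cow_segmented_seq<int> original;
    for (int i = 0; i < 10; ++i) original.insert(original.size(), i);
    cow_segmented_seq<int> copy = original;  // copies pointers, not elements
    copy.insert(3, 42);                      // rewrites one segment of the copy
    assert(original[3] == 3);
    assert(copy[3] == 42);
}
```

An insertion touches at most one small segment, so its cost depends on the segment size rather than on the length of the whole sequence. In this simplified version the pointer table still grows linearly with the number of segments; a tree of segments would avoid that.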
The remaining assorted types are trie, trie_map, and trie_set. The first of
these is a trie that is not a valid C++ container. The latter two are analogous
to std::map and std::set, respectively, just built on a trie instead of a binary
tree.
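As a rough illustration of what "a set built on a trie" means, here is a minimal sketch of a set of strings stored one character per trie node. It is not Boost.Text's trie_set: the class simple_trie_set and its members (insert, contains, has_prefix, size) are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical sketch; not Boost.Text's trie_set.
class simple_trie_set {
    struct node {
        bool terminal = false;  // true if some stored key ends at this node
        std::map<char, std::unique_ptr<node>> children;
    };
    node root_;
    std::size_t size_ = 0;

    // Walks the trie along s; returns nullptr if the path does not exist.
    const node* find_node(const std::string& s) const {
        const node* n = &root_;
        for (char c : s) {
            auto it = n->children.find(c);
            if (it == n->children.end()) return nullptr;
            n = it->second.get();
        }
        return n;
    }

public:
    std::size_t size() const { return size_; }

    // Returns true if the key was newly added, like std::set::insert().second.
    bool insert(const std::string& key) {
        node* n = &root_;
        for (char c : key) {
            std::unique_ptr<node>& child = n->children[c];
            if (!child) child.reset(new node);
            n = child.get();
        }
        if (n->terminal) return false;
        n->terminal = true;
        ++size_;
        return true;
    }

    bool contains(const std::string& key) const {
        const node* n = find_node(key);
        return n != nullptr && n->terminal;
    }

    // A query a tree-based std::set does not offer directly:
    // does any stored key start with this prefix?
    bool has_prefix(const std::string& prefix) const {
        return size_ != 0 && find_node(prefix) != nullptr;
    }
};

// Usage: keys with a common prefix share the nodes for that prefix.
inline void example_simple_trie_set() {
    simple_trie_set names;
    names.insert("text");
    names.insert("text_view");
    assert(names.contains("text"));
    assert(!names.contains("tex"));   // a prefix of a key is not itself a key
    assert(names.has_prefix("tex"));
}
```

Lookup cost here depends on the length of the key rather than on the number of stored keys, which is the usual reason to choose a trie over a binary tree.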