How it works...

This recipe first collects all words in a std::map, then moves all items out of the map into a std::vector, which is sorted by a different criterion before the data is printed. Why?

Let's look at an example. When we count the word frequency in the string "a a b c b b b d c c", we would get the following map content:

a -> 2
b -> 4
c -> 3
d -> 1

However, that is not the order in which we want to present the data to the user. The program should print b first because it has the highest frequency, then c, then a, then d. Unfortunately, a map is always ordered by its keys, so we cannot ask it for the "key with the highest associated value", then the "key with the second highest associated value", and so on.
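To make the example concrete, here is a minimal sketch of the counting step. The helper name count_words is hypothetical and not part of the recipe; it assumes that words are separated by whitespace only.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not from the recipe): counts how often each
// whitespace-separated word occurs in the given text.
std::map<std::string, std::size_t> count_words(const std::string &text)
{
    std::map<std::string, std::size_t> words;
    std::istringstream in {text};
    std::string word;

    while (in >> word) {
        // operator[] creates a zero-initialized counter for new words.
        ++words[word];
    }
    return words;
}
```

Iterating over the result of count_words("a a b c b b b d c c") visits the keys in the order a, b, c, d, regardless of their counters, because the map keeps its items sorted by key.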

This is where the vector comes into play. We typed the vector to contain pairs of strings and counter values, so it can hold the items in exactly the form in which they drop out of the map.

vector<pair<string, size_t>> word_counts;

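Then we fill the vector with the word-frequency pairs using the std::move algorithm. The intent is that the heap-allocated part of each string is handed over from the map to the vector instead of being duplicated, which would avoid a lot of copies. One caveat: the map's value type is pair<const string, size_t>, and a const key cannot be moved from, so the key strings are in fact copied here. The call is still correct, it just does not save those copies.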

move(begin(words), end(words), back_inserter(word_counts));

Some STL implementations use small string optimization: if a string is short enough, it is not allocated on the heap but stored directly in the string object. In that case, a move is not faster than a copy. But moves are also never slower!
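The transfer step on its own can be sketched as follows. The helper name to_vector is hypothetical; the call to reserve is an optional addition that is not in the recipe and only avoids repeated reallocations of the vector.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the recipe): transfers all word-frequency
// pairs of the map into a vector, preserving the map's key order.
std::vector<std::pair<std::string, std::size_t>>
to_vector(std::map<std::string, std::size_t> &words)
{
    std::vector<std::pair<std::string, std::size_t>> word_counts;
    word_counts.reserve(words.size()); // optional, not in the recipe

    // back_inserter calls push_back for every item the algorithm hands over.
    std::move(std::begin(words), std::end(words),
              std::back_inserter(word_counts));
    return word_counts;
}
```

After the call, the vector holds the pairs in the same order in which the map iterates over them, that is, still sorted by word. The map keeps its entries, but they should no longer be relied upon.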

The next interesting step is the sort operation, which uses a lambda as a custom comparison operator:

sort(begin(word_counts), end(word_counts),
     [](const auto &a, const auto &b) { return a.second > b.second; });

The sort algorithm repeatedly takes two items and compares them. By default, it checks whether a is smaller than b. With our lambda, it instead checks whether a.second is larger than b.second. All objects are pairs of strings and their counter values, so a.second accesses a word's counter value. This way, high-frequency words move toward the beginning of the vector and low-frequency ones toward the back. For the example string, the vector ends up in the order b, c, a, d.
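The sorting step can be tried in isolation with a small sketch like the one below. The alias word_count and the helper sort_by_frequency are hypothetical names introduced only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical alias for one word and its counter value.
using word_count = std::pair<std::string, std::size_t>;

// Hypothetical helper (not from the recipe): sorts the pairs so that the
// highest counter value comes first.
void sort_by_frequency(std::vector<word_count> &word_counts)
{
    std::sort(std::begin(word_counts), std::end(word_counts),
              [](const auto &a, const auto &b) {
                  // "Greater than" on the counter yields descending order.
                  return a.second > b.second;
              });
}
```

Applied to the pairs (a, 2), (b, 4), (c, 3), (d, 1), this yields b, c, a, d. Note that std::sort is not a stable sort, so words with equal counter values may end up in any relative order; std::stable_sort would keep their alphabetical order from the map.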