Simon Rhymesfilter,
map,
fold,
das Glück ist dir hold.Doch std::transform,
das wurmt dich enorm.I wrote this cheesy poem in 2016. Detailed explanation follows -- not all readers are proficient both in software development
and in German here. A typical piece of code that appears every single time in an introduction to functional pipelines is:
const int y = [1, 2, 3, 4, 5]
.filter!(x => x > 2)
.map!(x => x*x)
.sum();
This sets
y to 50 because 50 is 3×3 + 4×4 + 5×5.
Now for the explanation of the poem. {
Functional pipelines for data are nice: Each step relies only on the
output of the function from the previous step, not on any properties/mutation/earlier side effects to/of a publicly visible buffer containing the original input. Fewer bugs because we avoided mutable data.
Even though you can chain higher-order functions such as
filter,
map,
fold in any sensible order, e.g., map first, then filter some, then map some more, then filter some more, ..., the most common order is exactly the order of the poem:
- filter: First discard some elements that we want to ignore.
- map: Transform the remaining elements.
- fold: Collect the mapped elements into a final result.
These functional pipelines are notoriously hard to express in C++ because you can't write them as pipes even with the C++17 standard library. You don't pass the collection itself as a single argument, instead you pass two arguments, a start iterator and an end interator. Therefore, you can't chain the calls à la
source.f().g().h(); but have to put every call into a separate statement, introducing a new name for every intermediate result.
In particular
std::transform is the equivalent of
map in that world, and the annoyance (having to put every step of the functional chain into a separate statement) feels already big enough here that it's nicer to write a conventional loop over the collection instead of calling
std::transform.
You're annoyed enough in C++17 that you might consider the
range-v3 library for C++17 or to move your project to C++20 where these pipeline-friendly ranges are part of the standard library.
}
Back from C++ to D. When I write functional pipelines in D, I rarely call an explicit
fold() in the last step. It's much more common to call one of these:
- join() some mapped strings into a single long string, possibly join(" ") or join(", ").
- array() to keep the results in an eagerly-allocated random-access array, usually because we'll only store the result for now.
- sum() on a list of numbers.
- any() or all() on a list of bools.
All of these are (semantically, not implementation-wise) special cases of
fold().
For example,
list.sum() is
list.fold!((a, b) => a + b)(0).
The better name goes a long way though to make the pipeline more readable, especially if the more specialized function comes directly from the standard library.
Even
map(list, func) and
filter(list, predicate) are special cases of
fold(), as explained in the blog article
The Universal Properties of Map, Fold, and Filter with the maximum-possible amount of category theory. The gist is:
filter() accumulates (folds) the list into a new list, possibly doing nothing instead of appending the next element. Similar for
map().
-- Simon