Transducers promise to separate the element transformation from the
actual traversal of the input data. This allows the very same transducer
pipeline to be used in reduce
(resp. the transducer specific
variant transduce
) and at the same time apply the exact same
transformation logic to asynchronous channels as provided by
core.async
.
Just as well you could create a traditional lazy sequence from the transducer. The promise is to provide the laziness we know from sequences, but save the cons cells created in each step of the transformation pipeline. This is more efficient and reduces the load on the garbage collector.
A very attractive promise indeed. But do transducers deliver on that promise?
To illustrate this principle let's have a look at an example. Assume the following transducer:
(defn neighbourhood
[n]
(list (dec n) n (inc n)))
(def example-t
(comp
(mapcat neighbourhood)
(map str)))
This is a very simple transducer pipeline. It specifies a transformation,
which first turns a given natural number into a vector of its neighbours
and itself. Since we use mapcat
this list is flattened into
the further pipeline steps, which – in this case – just plainly turns the
numbers into strings.
To get a lazy sequence from this transducer we use the
sequence
function which was augmented to understand
transducers.
user=> (sequence example-t (list 1 4 7))
("0" "1" "2" "3" "4" "5" "6" "7" "8")
This looks quite similar to our traditional approach using the sequence functions.
(defn example-s
[coll]
(->> coll
(mapcat neighbourhood)
(map str)))
This result is the same:
user=> (example-s (list 1 4 7))
("0" "1" "2" "3" "4" "5" "6" "7" "8")
However, the latter is tied to lazy sequences while we may use the transducer in various other scenarios.
user=> (transduce example-t str (list 1 4 7))
"012345678"
user=> (async/chan 0 example-t)
#<channel>
As stated above this is a very tempting promise transducers give here. We get all the niceties of the functions we are used to plus some real improvements in terms of efficiency and reuse.
However, there is a little bit of a problem. The transducer version is not as lazy as the traditional sequence form:
user=> (first (example-s (list 1 4 7)))
">0<"
"0"
user=> (first (sequence example-t (list 1 4 7)))
">0<"
">1<"
">2<"
"0"
Note: I replaced the call to str
in the
map
pipeline step with a different function which also prints
the result in order to show the effect.
As you can see the sequence is realized further than actually needed. In
fact the sequence is realized up to the length of the first return value in
the mapcat
step. To understand why this happens, we have to
have look under the hood.
Here are – roughly – the implementations of the two mapcats.
(defn lazy-mapcat
[f coll]
(->> coll
(map f)
(apply concat)))
(defn transducer-mapcat
[f]
(comp
(map f)
(cat)))
This looks quite similar. And a quick confirmation in the REPL shows
that the map
is not the problem. So the culprit must lie in
the difference between the lazy concat
and the transducer
cat
.
The main difference between the concatenating flavours is, that the lazy
concat
is based on the pull principle while the transducer
cat
is based on the push principle.
cat
is called with an input. It has no way to delay
processing the input to a later point it time. So it has no choice but
immediately walk the input and push its elements further downstream.
The lazy concat
can do that however. It keeps the input in
mind, but the work is only done, when specifically being asked for it. That
way concat
can also choose to process the input only step by
step.
When using sequence
to apply a transducer pipeline to a
collection the result may not be as lazy as it is when resorting to the
traditional sequence functions.
Beware the edge cases!
Published by Meikel Brandmeyer on .
I'm a long-time Clojure user and the developer of several open source projects mostly involving Clojure. I try to actively contribute to the Clojure community.
My most active projects are at the moment VimClojure, Clojuresque and ClojureCheck.
Copyright © 2009-2014 All Right Reserved. Meikel Brandmeyer