On the Clojure group every once in a while someone posts a question
concerning the interaction of binding and lazy sequences. Let's have a short
look at what the problem is, why it arises and how a solution can be
implemented.
People happily create lazy sequences depending on dynamic variables.
;; Since Clojure 1.3 a Var must be marked ^:dynamic to be rebindable.
(def ^:dynamic x 1)
(defn make-seq
  [n]
  (lazy-seq
    (cons (+ x n) (make-seq (inc n)))))
Now they create an instance of the sequence in the scope of a binding
call.
(def s (binding [x 100] (make-seq 5)))
Question: What is the result of (first s)? Right. 6. Huh? Yes. Plenty of
Clojure newcomers have been puzzled when they first saw this phenomenon.
So, why is that so?
binding changes the value a Var is bound to for the code executed inside its
scope. The + call, where x shows up, looks like it is inside the scope of the
binding. So why is it not affected? Because it is not executed inside the
scope of the binding. The lazy-seq defers the execution until first retrieves
the first item. Only then does the + call actually happen. But by now x has
its old value again. When the call to first happens inside the binding,
everything is fine.
(let [s (binding [x 100]
          (let [s (make-seq 5)]
            (println (first s))
            s))]
  (println (first s))
  (println (second s)))
Question: What is printed? Right. 105, 105, and 7. Huh? The first call to first
realises the first item of the sequence. Since the call happens inside the
binding, the binding is still in effect. So we print 105. The second call to
first happens outside the binding, but the previously realised value was
cached. So we get 105 again. Last but not least, the call to second realises
the second item of the sequence, but since the binding is now gone, we get the
original value of x and 7 is printed.
How can the problem be solved?
As we saw above, we can exploit the caching of the lazy sequence! The simplest
thing is to wrap the sequence in a doall before we pass it out of the binding.
However this might not be very useful if the sequence is very large or, like
make-seq, infinite.
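A minimal sketch of the doall approach, assuming Clojure 1.3 or later (hence the ^:dynamic marker). Because make-seq never ends, the sketch cuts it down with take before forcing it; the name realised is only for illustration.

```clojure
(def ^:dynamic x 1)

(defn make-seq
  [n]
  (lazy-seq
    (cons (+ x n) (make-seq (inc n)))))

;; make-seq is infinite, so limit it with take before forcing it.
;; doall realises all three items while the binding is still in effect.
(def realised
  (binding [x 100]
    (doall (take 3 (make-seq 5)))))

realised
;; => (105 106 107)
```

All three items are computed and cached inside the binding, so reading them later no longer depends on the value of x.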
When realising everything up front is not an option, we have to deploy bigger
guns and dive a little into the thread bindings interface of Clojure. The
solution is to install the required bindings every time we realise one item
of the sequence. In order to do this we first capture the required bindings
with get-thread-bindings.
(defn make-bound-seq
  [n]
  (let [bindings (get-thread-bindings)
        step     (fn step [n]
                   (lazy-seq
                     (push-thread-bindings bindings)
                     (try
                       (cons (+ x n) (step (inc n)))
                       (finally
                         (pop-thread-bindings)))))]
    (step n)))
Always, always, always, ALWAYS follow this style! First a
push-thread-bindings, then a try with your code and a finally with a
pop-thread-bindings. This ensures that every push-thread-bindings is
complemented with a pop-thread-bindings, even if an Exception is thrown.
Failing to do so will mess up your bindings completely, leaving the running
program in a broken state.
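As a quick check, here is how make-bound-seq might be used. The sketch repeats the definitions so that it runs on its own, and it assumes Clojure 1.3 or later for the ^:dynamic marker. The sequence is created inside the binding but only realised after the binding is gone.

```clojure
(def ^:dynamic x 1)

(defn make-bound-seq
  [n]
  (let [bindings (get-thread-bindings)   ; captured when the seq is created
        step     (fn step [n]
                   (lazy-seq
                     (push-thread-bindings bindings)
                     (try
                       (cons (+ x n) (step (inc n)))
                       (finally
                         (pop-thread-bindings)))))]
    (step n)))

;; Created inside the binding, nothing realised yet.
(def bs (binding [x 100] (make-bound-seq 5)))

;; Realised outside the binding, yet each step sees x = 100.
(take 3 bs)
;; => (105 106 107)
```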
Update: As suggested by Chouser this can be simplified with with-bindings.
Together with the very good comment by Graham Fawcett we can actually define
a helper which turns any input sequence into a bound sequence.
(defn bound-seq*
  [bind-map inner-seq]
  (lazy-seq
    (with-bindings bind-map
      (when-let [s (seq inner-seq)]
        (cons (first s) (bound-seq* bind-map (rest s)))))))
(defmacro bound-seq
  ([inner-seq]
   `(bound-seq* (get-thread-bindings) ~inner-seq))
  ([bind-map inner-seq]
   `(bound-seq* (hash-map ~@(mapcat (fn [[k v]] [`(var ~k) v]) bind-map))
                ~inner-seq)))
Now we can take our original sequence and easily turn it into a bound sequence.
(def bs (bound-seq {x 100} (make-seq 5)))
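The following sketch exercises both arities of bound-seq. It repeats the definitions from above so that it runs standalone, assumes Clojure 1.3 or later for ^:dynamic, and uses the illustrative names explicit and captured.

```clojure
(def ^:dynamic x 1)

(defn make-seq
  [n]
  (lazy-seq
    (cons (+ x n) (make-seq (inc n)))))

(defn bound-seq*
  [bind-map inner-seq]
  (lazy-seq
    (with-bindings bind-map
      (when-let [s (seq inner-seq)]
        (cons (first s) (bound-seq* bind-map (rest s)))))))

(defmacro bound-seq
  ([inner-seq]
   `(bound-seq* (get-thread-bindings) ~inner-seq))
  ([bind-map inner-seq]
   `(bound-seq* (hash-map ~@(mapcat (fn [[k v]] [`(var ~k) v]) bind-map))
                ~inner-seq)))

;; Two-argument form: the bindings are given explicitly.
(def explicit (bound-seq {x 100} (make-seq 5)))
(take 3 explicit)
;; => (105 106 107)

;; One-argument form: the bindings in effect at creation time are captured.
(def captured (binding [x 10] (bound-seq (make-seq 5))))
(take 2 captured)
;; => (15 16)
```

In both cases the items are realised outside of any binding form, and each one is still computed with the bound value of x.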
One has to understand the interactions between dynamic variables and laziness.
As with everything else in life, you have to understand first, and then you
have to do the Right Thing. The above seems noisy. But as Rich once stated:
running into a lot of such trouble is a sign that you are misusing dynamic
variables. Use them wisely.
What was said above also applies to new threads. Bindings of dynamic variables
are thread-local. The „Good Kirk - Bad Kirk“ scenario of multithreading. So
when you start a new thread by virtue of a function, the function does not
have access to the bindings of the old thread. This can be easily remedied by
using the bound-fn* helper. Simply wrap the function into a bound-fn* and
you'll be fine. Or define your function directly with bound-fn.
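A small sketch of the difference, assuming Clojure 1.3 or later for ^:dynamic. The helper run-in-thread is hypothetical and exists only for this illustration. It starts a plain java.lang.Thread and hands back the function's result.

```clojure
(def ^:dynamic x 1)

(defn run-in-thread
  "Hypothetical helper: runs f on a fresh java.lang.Thread and returns
  the value f produced."
  [f]
  (let [result (promise)
        t      (Thread. (fn [] (deliver result (f))))]
    (.start t)
    (.join t)
    @result))

(def results
  (binding [x 100]
    [(run-in-thread (fn [] x))               ; plain fn: sees the root value
     (run-in-thread (bound-fn [] x))         ; bound-fn: carries the bindings along
     (run-in-thread (bound-fn* (fn [] x)))])) ; bound-fn*: wraps an existing fn

results
;; => [1 100 100]
```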
Published by Meikel Brandmeyer on .
Copyright © 2009-2014 All Right Reserved. Meikel Brandmeyer