And what is elegance anyway? Of course this is a subjective topic. So I can only speak about my own sense of aesthetics. A recent thread on the clojure google group had some examples which touched me exactly there—at my aesthetic sense.
The problem the OP tried to solve was to provide something similar
to take-while
and drop-while
but working more like partition-by
.
That is, instead of working with a predicate it works with any
function and takes resp. drops items until this function returns a
different value. Here is an example for both functions:
user=> (take-by #(> % 6) (range 14))
(0 1 2 3 4 5 6)
user=> (drop-by #(> % 6) (range 14))
(7 8 9 10 11 12 13 14)
Note: Although the function is actually a predicate, you see that
the return value is not interpreted as would have been by eg.
take-while
.
We need a small utility: report-seq
. It just provides some means
to show when things are realised in the produced lazy sequences. Just
keep this definition in your mind when reading the rest of this article.
(defn report-seq
[]
(letfn [(this
[i]
(lazy-seq
(prn (str ">" i "<"))
(cons i (this (inc i)))))]
(take 14 (this 0))))
Equipped with this tool, let's have a look at some of the solutions from the thread.
partition-by
One of the simplest things to do is to leverage existing tools. Indeed
partition-by
really cries for this use case. The definitions are very
short.
(defn take-by-partition-by
[f coll]
(first (partition-by f coll)))
(defn drop-by-partition-by
[f coll]
(apply concat (rest (partition-by f coll))))
So what can I say about these definitions by just looking at them?
take-by-partition-by
looks innocent. Short. Simple. Good? To me
it's ugly! Define a lazy seq just to take the first element of it?
Then why defining the seq in the first place. I know, the seq is not
realised if not accessed. But it still feels like waste.
Similarly the definition of drop-by-partition-by
. It creates a
seq which is flattened again after cutting of the first element,
which is not used, by some apply-concat-rest
magic. There is no
obvious explanation by the function itself, why I need to do this.
It just drops out of nowhere to magically provide the desired result.
Additionally the – potentially expensive – function f
is called on
the whole sequence. Even if not necessary.
So let's see how these functions actually perform.
user=> (def s (take-by-partition-by #(> % 6) (report-seq)))
">0<"
">1<"
">2<"
">3<"
">4<"
">5<"
">6<"
">7<"
#'user/s
user=> (def t (drop-by-partition-by #(> % 6) (report-seq)))
">0<"
">1<"
">2<"
">3<"
">4<"
">5<"
">6<"
">7<"
">8<"
">9<"
">10<"
">11<"
">12<"
">13<"
#'user/t
Huh? Weren't these supposed to be lazy sequences? This result is
clearly undesirable. This solution – while looking superficially
simple – does not cut it. The culprit here is actually partition-by
.
It does more than we actually want by walking the groups it returns.
In particular it is a bad idea to use drop-by-partition-by
with
an infinite sequence in combination with a function which returns
only two different values like a predicate.
Note: Of course this could be fixed by making partition-by
build the groups lazily. But then you have to traverse the input
sequence twice and you would hold unto the head in some way. Calling
rest
on the output sequence would not let go of the items of the
first group. So you trade one drawback for others.
Then there is the variant using higher-order functions like map
to build the take-by
and drop-by
from basic pieces. This sounds
like a reasonable approach. Isn't this what functional programming
is about? Create complex functions by combining simple ones?
(defn take-by-map
[f coll]
(let [fs (map f coll)
ps (map = fs (rest fs))
zs (map list ps (rest coll))]
(cons (first coll) (map second (take-while first zs)))))
(defn drop-by-map
[f coll]
(let [fs (map f coll)
ps (map = fs (rest fs))
zs (map list ps (rest coll))]
(map second (drop-while first zs))))
This seems like putting map
to good use. It even uses patterns
like (map f x (rest x))
which should belong into everyone's toolbox.
Then there is some beauty in the symmetry of delegating some of the
work to existing take-while
and drop-while
functions.
But to be honest: for my taste it uses too much of these patterns
and again I get this magical feeling: How is the
map second drop-while first
incantation related to the original
problem?
This solution creates 5 intermediate sequences to construct a single
one. And still, it is not fully lazy. Another wrapping lazy-seq
is
required.
user=> (def s (take-by-map #(> % 6) (report-seq)))
">0<"
#'user/s
user=> (def t (drop-by-map #(> % 6) (report-seq)))
">0<"
#'user/t
Quiz: The first zero is obvious from the code. But why the second. Isn't that all supposed to be lazy? [Answer]
Note: The original take-by-map
version by Ken Wesson
used a sentinel, which I took the liberty to remove to unify both
implementations.
I also delegated the work to take-while
and drop-while
.
(defn take-by-minimalist
[f coll]
(lazy-seq
(when-let [s (seq coll)]
(let [fst (first s)
value (f fst)]
(cons fst (take-while #(-> % f (= value)) (rest s)))))))
(defn drop-by-minimalist
[f coll]
(lazy-seq
(when-let [s (seq coll)]
(let [fst (first s)
value (f fst)]
(drop-while #(-> % f (= value)) (rest s))))))
Let's restate the original problem. The functions should take
resp. drop items from the input collection until the given function
f
returns a different value. But written in different words this
can be rephrased as “while the given function f
returns the same
value.”
So all we need is a special treatment of the very first sequence
item to capture the required value. And that's all these functions
do. They retrieve the first item in a way which is “lazy-safe,”
calculate the value of this item under f
and then filter the
tail based on this cached value. This is basically a direct
transcript of the problem description.
user=> (def s (take-by-minimalist #(> % 6) (report-seq)))
#'user/s
user=> (def t (drop-by-minimalist #(> % 6) (report-seq)))
#'user/t
In the end it is mood what I conceive as “clear” or “elegant.” It depends on you – the astute reader – how you judge things and what is “elegant” to you. I can only present my point of view as input for your personal considerations.
I'm coming from a culture where awareness about the environment is very important. The green, ecologist party is very popular here in Germany. Even more so due to the recent events in Japan.
Being influenced by this background I detest waste of any kind. Be it in the trash can or in my programs. I try to be as minimalistic as possible. Just do what is necessary. No more, no less.
And considering the above comparison, I believe I am not very far off-track—be it in terms of self-explaining code or in terms of runtime performance.
Find your style. Be happy. Don't care what the others do. They are usually bad advisors. Don't listen to me.
Always question yourself.
The call to rest
has to realise the sequence. The rest
is kept lazily, but the sequence and hence the first item
is always realised.
Published by Meikel Brandmeyer on .
I'm a long-time Clojure user and the developer of several open source projects mostly involving Clojure. I try to actively contribute to the Clojure community.
My most active projects are at the moment VimClojure, Clojuresque and ClojureCheck.
Copyright © 2009-2014 All Right Reserved. Meikel Brandmeyer