What is the value of elegance?

And what is elegance anyway? Of course this is a subjective topic. So I can only speak about my own sense of aesthetics. A recent thread on the clojure google group had some examples which touched me exactly there—at my aesthetic sense.

The task

The problem the OP tried to solve was to provide something similar to take-while and drop-while but working more like partition-by. That is, instead of working with a predicate it works with any function and takes resp. drops items until this function returns a different value. Here is an example for both functions:

user=> (take-by #(> % 6) (range 14))
(0 1 2 3 4 5 6)
user=> (drop-by #(> % 6) (range 14))
(7 8 9 10 11 12 13 14)

Note: Although the function is actually a predicate, you see that the return value is not interpreted as would have been by eg. take-while.

A utility

We need a small utility: report-seq. It just provides some means to show when things are realised in the produced lazy sequences. Just keep this definition in your mind when reading the rest of this article.

(defn report-seq
  []
  (letfn [(this
            [i]
            (lazy-seq
              (prn (str ">" i "<"))
              (cons i (this (inc i)))))]
    (take 14 (this 0))))

Equipped with this tool, let's have a look at some of the solutions from the thread.

Using partition-by

One of the simplest things to do is to leverage existing tools. Indeed partition-by really cries for this use case. The definitions are very short.

(defn take-by-partition-by
  [f coll]
  (first (partition-by f coll)))

(defn drop-by-partition-by
  [f coll]
  (apply concat (rest (partition-by f coll))))

So what can I say about these definitions by just looking at them? take-by-partition-by looks innocent. Short. Simple. Good? To me it's ugly! Define a lazy seq just to take the first element of it? Then why defining the seq in the first place. I know, the seq is not realised if not accessed. But it still feels like waste.

Similarly the definition of drop-by-partition-by. It creates a seq which is flattened again after cutting of the first element, which is not used, by some apply-concat-rest magic. There is no obvious explanation by the function itself, why I need to do this. It just drops out of nowhere to magically provide the desired result. Additionally the – potentially expensive – function f is called on the whole sequence. Even if not necessary.

So let's see how these functions actually perform.

user=> (def s (take-by-partition-by #(> % 6) (report-seq)))
">0<"
">1<"
">2<"
">3<"
">4<"
">5<"
">6<"
">7<"
#'user/s
user=> (def t (drop-by-partition-by #(> % 6) (report-seq)))
">0<"
">1<"
">2<"
">3<"
">4<"
">5<"
">6<"
">7<"
">8<"
">9<"
">10<"
">11<"
">12<"
">13<"
#'user/t

Huh? Weren't these supposed to be lazy sequences? This result is clearly undesirable. This solution – while looking superficially simple – does not cut it. The culprit here is actually partition-by. It does more than we actually want by walking the groups it returns. In particular it is a bad idea to use drop-by-partition-by with an infinite sequence in combination with a function which returns only two different values like a predicate.

Note: Of course this could be fixed by making partition-by build the groups lazily. But then you have to traverse the input sequence twice and you would hold unto the head in some way. Calling rest on the output sequence would not let go of the items of the first group. So you trade one drawback for others.

Building on higher-order functions

Then there is the variant using higher-order functions like map to build the take-by and drop-by from basic pieces. This sounds like a reasonable approach. Isn't this what functional programming is about? Create complex functions by combining simple ones?

(defn take-by-map
  [f coll]
  (let [fs (map f coll)
        ps (map = fs (rest fs))
        zs (map list ps (rest coll))]
    (cons (first coll) (map second (take-while first zs)))))

(defn drop-by-map
  [f coll]
  (let [fs (map f coll)
        ps (map = fs (rest fs))
        zs (map list ps (rest coll))]
    (map second (drop-while first zs))))

This seems like putting map to good use. It even uses patterns like (map f x (rest x)) which should belong into everyone's toolbox. Then there is some beauty in the symmetry of delegating some of the work to existing take-while and drop-while functions.

But to be honest: for my taste it uses too much of these patterns and again I get this magical feeling: How is the map second drop-while first incantation related to the original problem?

This solution creates 5 intermediate sequences to construct a single one. And still, it is not fully lazy. Another wrapping lazy-seq is required.

user=> (def s (take-by-map #(> % 6) (report-seq)))
">0<"
#'user/s
user=> (def t (drop-by-map #(> % 6) (report-seq)))
">0<"
#'user/t

Quiz: The first zero is obvious from the code. But why the second. Isn't that all supposed to be lazy? [Answer]

Note: The original take-by-map version by Ken Wesson used a sentinel, which I took the liberty to remove to unify both implementations.

Let the problem guide you

I also delegated the work to take-while and drop-while.

(defn take-by-minimalist
  [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (let [fst   (first s)
            value (f fst)]
        (cons fst (take-while #(-> % f (= value)) (rest s)))))))

(defn drop-by-minimalist
  [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (let [fst   (first s)
            value (f fst)]
        (drop-while #(-> % f (= value)) (rest s))))))

Let's restate the original problem. The functions should take resp. drop items from the input collection until the given function f returns a different value. But written in different words this can be rephrased as “while the given function f returns the same value.”

So all we need is a special treatment of the very first sequence item to capture the required value. And that's all these functions do. They retrieve the first item in a way which is “lazy-safe,” calculate the value of this item under f and then filter the tail based on this cached value. This is basically a direct transcript of the problem description.

user=> (def s (take-by-minimalist #(> % 6) (report-seq)))
#'user/s
user=> (def t (drop-by-minimalist #(> % 6) (report-seq)))
#'user/t

Pick your style

In the end it is mood what I conceive as “clear” or “elegant.” It depends on you – the astute reader – how you judge things and what is “elegant” to you. I can only present my point of view as input for your personal considerations.

I'm coming from a culture where awareness about the environment is very important. The green, ecologist party is very popular here in Germany. Even more so due to the recent events in Japan.

Being influenced by this background I detest waste of any kind. Be it in the trash can or in my programs. I try to be as minimalistic as possible. Just do what is necessary. No more, no less.

And considering the above comparison, I believe I am not very far off-track—be it in terms of self-explaining code or in terms of runtime performance.

Upshot

Find your style. Be happy. Don't care what the others do. They are usually bad advisors. Don't listen to me.

Always question yourself.

Anwser to the quiz

The call to rest has to realise the sequence. The rest is kept lazily, but the sequence and hence the first item is always realised.

Published by Meikel Brandmeyer on .