Beauty in a bug?

I must confess that I haven't checked who provided the core implementation of partition-by (presumably Rich). But whoever it was: he clearly knows what he's doing.

However there is still a bug spoiling the show a little bit. The investigations lead us directly to the basic question of how partition-by should behave.

How should partition-by behave?

There are basically two approaches.

  1. partition-by could return a sequence of unrealised sequences. So even the groups would be generated lazily. The down side on this, is that the rest of the output sequence has to retain a reference to head of the input sequence. But then (rest s) doesn't let go of the items of the head group.
  2. partition-by could a sequence of realised sequences. The groups would be generated eagerly. So we loose a bit of laziness. However now (rest s) doesn't have to hold onto the head anymore.

Which way you choose is maybe a question of taste. I personally prefer the second because there are no hidden references kept in innocent looking code.

However, the core version combines the drawbacks of both approaches: it generates the groups eagerly and holds onto the head. Fixing this bug leads us directly to the question on how partition-by should behave.

; Copyright © Rich Hickey. All rights reserved.
; The use and distribution terms for this software are covered
; by the Eclipse Public License 1.0.
; http://opensource.org/licenses/eclipse-1.0.php
;
(defn partition-by
  "Applies f to each value in coll, splitting it each time f returns
   a new value.  Returns a lazy seq of partitions."
  {:added "1.2"}
  [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (let [fst (first s)
            fv  (f fst)
            run (cons fst
                      (take-while #(= fv (f %)) (rest s)))]
        (cons run (partition-by f (lazy-seq (drop (count run) s))))))))

Adding a lazy-seq call around the drop fixes the problem of generating the groups eagerly. However the drop still keeps the head of the input sequence.

; Copyright © Rich Hickey. All rights reserved.
; The use and distribution terms for this software are covered
; by the Eclipse Public License 1.0.
; http://opensource.org/licenses/eclipse-1.0.php
;
(defn partition-by
  "Applies f to each value in coll, splitting it each time f returns
   a new value.  Returns a lazy seq of partitions."
  {:added "1.2"}
  [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (let [fst (first s)
            fv  (f fst)
            run (cons fst
                      (take-while #(= fv (f %)) (rest s)))]
        (cons run (partition-by f (seq (drop (count run) s))))))))

Adding a seq call fixes the problem of keeping the head, but now the count realises the head group eagerly.

True elegance

lazy-seq delays the execution and stands for laziness. seq forces the realisation of a sequence and stands for eagerness.

By choosing the function to plug-in – without changing anything else – we can change the behaviour of the function completely! And the function plugged-in basically represents the particular behaviour.

This is true elegance!

Published by Meikel Brandmeyer on .