I must confess that I haven't checked who provided the core
implementation of partition-by
(presumably Rich). But
whoever it was: he clearly knows what he's doing.
However there is still a bug spoiling the show a little bit.
The investigations lead us directly to the basic question of
how partition-by
should behave.
partition-by
behave?There are basically two approaches.
partition-by
could return a sequence of unrealised sequences.
So even the groups would be generated lazily. The down side on
this, is that the rest of the output sequence has to retain a
reference to head of the input sequence. But then (rest s)
doesn't let go of the items of the head group.partition-by
could a sequence of realised sequences.
The groups would be generated eagerly. So we loose a bit of
laziness. However now (rest s)
doesn't have to hold onto the
head anymore.Which way you choose is maybe a question of taste. I personally prefer the second because there are no hidden references kept in innocent looking code.
However, the core version combines the drawbacks of both approaches:
it generates the groups eagerly and holds onto the head. Fixing this
bug leads us directly to the question on how partition-by
should
behave.
; Copyright © Rich Hickey. All rights reserved.
; The use and distribution terms for this software are covered
; by the Eclipse Public License 1.0.
; http://opensource.org/licenses/eclipse-1.0.php
;
(defn partition-by
"Applies f to each value in coll, splitting it each time f returns
a new value. Returns a lazy seq of partitions."
{:added "1.2"}
[f coll]
(lazy-seq
(when-let [s (seq coll)]
(let [fst (first s)
fv (f fst)
run (cons fst
(take-while #(= fv (f %)) (rest s)))]
(cons run (partition-by f (lazy-seq (drop (count run) s))))))))
Adding a lazy-seq
call around the drop
fixes the problem of
generating the groups eagerly. However the drop
still keeps the
head of the input sequence.
; Copyright © Rich Hickey. All rights reserved.
; The use and distribution terms for this software are covered
; by the Eclipse Public License 1.0.
; http://opensource.org/licenses/eclipse-1.0.php
;
(defn partition-by
"Applies f to each value in coll, splitting it each time f returns
a new value. Returns a lazy seq of partitions."
{:added "1.2"}
[f coll]
(lazy-seq
(when-let [s (seq coll)]
(let [fst (first s)
fv (f fst)
run (cons fst
(take-while #(= fv (f %)) (rest s)))]
(cons run (partition-by f (seq (drop (count run) s))))))))
Adding a seq
call fixes the problem of keeping the head, but
now the count
realises the head group eagerly.
lazy-seq
delays the execution and stands for laziness. seq
forces the realisation of a sequence and stands for eagerness.
By choosing the function to plug-in – without changing anything else – we can change the behaviour of the function completely! And the function plugged-in basically represents the particular behaviour.
This is true elegance!
Published by Meikel Brandmeyer on .
I'm a long-time Clojure user and the developer of several open source projects mostly involving Clojure. I try to actively contribute to the Clojure community.
My most active projects are at the moment VimClojure, Clojuresque and ClojureCheck.
Copyright © 2009-2014 All Right Reserved. Meikel Brandmeyer