What are Pods?

Update: Ignore the following, as it is pure speculation. As Rich points out in his comment, a lot of the details are not yet clear. And since he is not actively working on pods right now, it will likely take some more time before there is something more concrete.

Currently there are a few new things in the pipeline, e.g. enhanced primitive support. One of the more mysterious additions is pods. Relatively little is known about them so far, but in a recent interview on InfoQ Rich Hickey spoke a little about them. So let's have a deeper look at pods.

State and coordination

Clojure places a strong emphasis on immutability and freedom from side effects. A lot of trouble in classic programs comes from uncoordinated changes to mutable structures. Avoiding that trouble requires a lot of coordination between the different threads accessing the same memory in the same time frame, and this coordination is hard to get right.

Clojure helps the programmer here in two ways. On the one hand, Clojure's native data structures are immutable and persistent. So a thread can hold on to a data structure without fear of some other thread changing it underneath.
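As a small illustration of that immutability (the names original and extended are made up for this sketch), "adding" to a vector yields a new value and leaves the old one alone:

```clojure
;; Illustrative names only; nothing here is pod-specific.
(def original [1 2 3])

;; conj does not modify original, it returns a new vector.
(def extended (conj original 4))

;; Any thread still holding original keeps seeing three elements,
;; no matter what others derive from it.
(println original extended)
```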

On the other hand, Clojure provides reference types, which handle state with well-defined semantics. So if several threads have to access the same state, it happens in a clean manner. This makes reasoning about the program easier. Note: I said "easier", not "easy"!

Mutability is bad, right?

However, Clojure's data structures come at a price. Since they are persistent, some copying is going on. Not as much as a naive implementation would do, since as much as possible of the existing structure is shared. Nevertheless, this has an impact on performance.
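Structural sharing is easiest to observe with a plain list, where the new value simply points at the old one. Vectors and maps share parts of their internal trees in a similar spirit, but that is harder to show from the outside. A minimal sketch, with made-up names:

```clojure
;; Illustrative names only.
(def tail (list 2 3 4))

;; conj on a list prepends; the new list reuses tail as its rest.
(def longer (conj tail 1))

;; No copy of tail was made: the rest of longer is the very same object.
(println (identical? (rest longer) tail))
```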

Of course Clojure tries to be as fast as possible. So is there some way to remedy this performance hit? Indeed there is, but to see why it makes sense, we have to take a short look at how updating such a reference type works.

  1. Read the value from the reference type.
  2. Produce a new value—likely based on the old one.
  3. Store the new value in the reference type.

The outside world sees a consistent view of the value. Only after step three does the value change, atomically, to the new value. So no other thread sees inconsistent values. Since this is such a common operation, Clojure even provides special update functions like swap!, alter and send, which take a producer function and take away the reading and storing boilerplate.
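The three steps and the update functions can be sketched like this. The names counter, account and log are invented for the example; the hand-written version is only there to make the steps visible and is not safe under contention.

```clojure
(def counter (atom 0))

;; The three steps spelled out by hand. As written this is NOT atomic:
;; another thread could change counter between the read and the store.
(let [old-value @counter            ; 1. read
      new-value (inc old-value)]    ; 2. produce a new value
  (reset! counter new-value))       ; 3. store

;; swap! takes the producer function and does read, produce and store
;; as one atomic update, retrying if another thread got in between.
(swap! counter inc)

;; alter is the counterpart for Refs and must run inside a transaction.
(def account (ref 100))
(dosync (alter account + 50))

;; send is the counterpart for Agents; the update happens asynchronously.
(def log (agent []))
(send log conj :started)
(await log)                         ; block until the action has run
```

In a standalone script, a final call to shutdown-agents lets the JVM exit once agents have been used.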

The important part here is step number 2. No one actually sees what happens there! But hey: if no one sees what happens, you don't have to synchronise changes! So one could use the classical approach and modify things in place. Then no copying has to be done and we get a speed improvement.

This is exactly what transients are for. You can turn a Clojure vector or map into a transient in constant time, use it much as you would its persistent counterpart (through the bang variants such as conj! and assoc!) but with faster updates, and then turn it back into a persistent structure in constant time.
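A minimal sketch of that round trip, using the existing transient, conj! and persistent! functions. The function names fill-persistent and fill-transient and the atom state are made up for the example.

```clojure
;; Builds a vector the purely persistent way.
(defn fill-persistent [n]
  (reduce conj [] (range n)))

;; Same result, but the intermediate steps mutate a transient in place.
;; Always use the return value of conj!, as reduce does here.
(defn fill-transient [n]
  (persistent! (reduce conj! (transient []) (range n))))

;; The transient can live entirely inside step 2 of a reference update:
;; the outside world only ever sees persistent values in the atom.
(def state (atom [1 2 3]))
(swap! state
       (fn [v]
         (persistent! (reduce conj! (transient v) [4 5 6]))))
```

Because the transient is created and sealed again inside the producer function, a retry by swap! simply starts over from the persistent value.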

So what are pods actually?

Transients are nice, but they conflate two different things: actual change and change policy. Transients protect themselves from changes by other threads. Only the thread which created them is allowed to actually change them.

However, there might be other situations where it is perfectly reasonable for several threads to make changes to a transient. This is not possible at the moment.

Pods will change that. They will be a new reference type which actually holds transients internally. Storing a value will make it a transient, and retrieving a value will automatically turn it into a persistent structure again. Updating the value, however, will work on the transient and hence be fast. So transients will become a low-level implementation detail hidden behind pods.

On the other hand, pods will provide a strategy for allowing the actual modifications. So one is not locked into the "one thread" policy of transients. For different situations one might want to choose a different update policy.

Finally, pods can be coordinated in much the same way as Refs. *However, Ref updates join a surrounding transaction, while pod updates probably do not. I'm not sure about this one, so anyone with more information please correct me.*

A question

One thought which occurred to me about this proposal is that it again conflates two things: the coordination property and the transparent transient handling. Why shouldn't the latter be added to Atoms, Refs and Agents? They would profit from this approach in a similar way. Also, the coordination properties of pods would be interesting even without a transient inside.
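To make the question concrete, here is a rough sketch of what transparent transient handling could look like for an Atom. The helper swap-transiently! is hypothetical: it is not part of Clojure and says nothing about how pods would actually be designed. It only wraps the existing swap!, transient and persistent! functions.

```clojure
;; HYPOTHETICAL helper, not part of Clojure or of any pod proposal.
;; f receives a transient version of the atom's value plus args and
;; must return the (possibly new) transient, as conj! and assoc! do.
(defn swap-transiently!
  [a f & args]
  (swap! a (fn [v]
             (persistent! (apply f (transient v) args)))))

;; Illustrative usage with a made-up atom.
(def cache (atom {}))
(swap-transiently! cache assoc! :a 1)
(swap-transiently! cache (fn [t] (-> t (assoc! :b 2) (assoc! :c 3))))
```

Readers of the atom still only see persistent maps; the transient never escapes the update function.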

So for me, pods are not really the interesting thing. The interesting thing is the question whether this behaviour could be retrofitted to the other reference types, to get the same advantages for existing parts of Clojure. Pods would then just complete the spectrum of reference types with one more type of semantics: the "limited transaction". (If my assumption is correct…)

Published by Meikel Brandmeyer on .