Publication & Postmortem: SPICLUM Library

A drawing showing the evolution of different types of spears, one of which is the spiculum.

Introduction

I’ve finished version 0.0 of SPICLUM! “System Prevalence In Common Lisp Using MOP, pronounced however spiculum’s pronounced. In case you’d like to disregard the collective wisdom of roughly sixty years of database usage, and instead keep an object store in memory, serialize to and from text files, and rely on as much automagic as possible.” For more information about SPICLUM as-is, go to the linked page. For more information about the development of SPICLUM – read on!

Project Conception & Raison d’Être

While working on a project in late August 2019, I reached a point where I needed to persist CLOS instances/their data. This offered several options: direct use of a relational DB, use of an ORM, or something weird. So I discussed the options with a friend and espoused the rapid prototyping that should be theoretically possible if the persistence automagically took care of itself. Besides, it’d be aesthetically pleasing to not see e.g. explicit SQL snippets strewn about. So I decided to go for something weird.

The two existing options I considered the most were Elephant and CL-PREVALENCE. Elephant is a persistent object database, and looked interesting – but required configuring a database backend, which I didn’t want to bother with. I wanted persistence to be as automagic as possible, so it could just get out of the way and let me focus on prototyping. CL-PREVALENCE is an object prevalence system, and looked interesting – but required explicit transactions, which was a no-go. Replacing the stray SQL snippets with stray transactions seemed almost as bad.

Since neither Elephant nor CL-PREVALENCE satisficed, and since Lisp is supposed to be powerful, and because it seemed like a good way to learn about a multitude of things, I decided to write my own persistence system. I reasoned that a majority of effort on existing RDBMSes had probably gone into efficiency/optimization. Since I wouldn’t care about that, I should have a decent early version ready in a couple months.

Project Order

I quickly discovered a need to supply CLOS slots with new initargs. That meant I needed to change how classes worked in Lisp, which meant it was time to learn about the Metaobject Protocol. The Art of the Metaobject Protocol is a great book and blew my mind. Of course a language should be designed to fit a region of language/problem space rather than a single point! I.e. just make the object system malleable, and make the malleability come through the object system itself. There’s an elegant recursive aspect to that design (and The Art of the Metaobject Protocol presents a metacircular version of an implementation of CLOS – nice to see metacircularity outside of SICP!).

After reading the book, I searched for research papers about using MOP for persistence purposes. Unfortunately the few papers I found weren’t illuminating and mostly just talked about their project outcomes in a positive light. That is, the papers had little to no information about challenges, particular techniques, or any such project reflection. At least not at the level of granularity I was interested in – I didn’t find anything about the relevant architectures, or the MOP generics they’d found relevant, or any code. In retrospect, it might be I could’ve learned significant amounts by digging through Elephant’s source code.

Initial progress was slow thanks to stumbling over the Metaobject Protocol at the same time. First I implemented a keyable-slot class to support the new initargs I wanted, followed by the prevalence-class metaclass to specialize all the persistence on. Eventually I had a rough version of some of SPICLUM’s transaction points: make-instance and (setf slot-value-using-class). Then I implemented a rough version of the object-store, a simple nested hash-table. Next came writing tests for the code I had already written, followed by gradually expanding the set of supported transaction points in a loop: Add another transaction point, change the object-store as necessary, add tests. Detour to add utilities and refactor with supportive language as necessary. Not all the necessary intersection points specialized on the metaclass, so I had to introduce a prevalence-object class as well. I only started on the serialization once all intersection points were finished. After the serialization worked, I moved on to the query language and integration in terms of saving/loading worlds.
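For illustration, here is a minimal sketch of what a nested hash-table object store could look like. The class → slot → value → objects layout and the names *store*, store-table, store-insert, store-remove and store-lookup are assumptions made for this example; they are not SPICLUM's actual code.

```lisp
;; Hypothetical sketch, not SPICLUM's implementation.
;; Assumed layout: class name -> slot name -> slot value -> list of objects.
(defvar *store* (make-hash-table :test #'eq)
  "Outer table, keyed by class name.")

(defun store-table (class-name slot-name)
  "Return (creating it if needed) the value->objects table for CLASS-NAME and SLOT-NAME."
  (let ((slots (or (gethash class-name *store*)
                   (setf (gethash class-name *store*)
                         (make-hash-table :test #'eq)))))
    (or (gethash slot-name slots)
        (setf (gethash slot-name slots)
              (make-hash-table :test #'equal)))))

(defun store-insert (class-name slot-name value object)
  "Index OBJECT under VALUE for the given class and slot."
  (pushnew object (gethash value (store-table class-name slot-name)))
  object)

(defun store-remove (class-name slot-name value object)
  "Remove OBJECT from the index entry for VALUE, dropping empty entries."
  (let ((table (store-table class-name slot-name)))
    (setf (gethash value table)
          (remove object (gethash value table)))
    (unless (gethash value table)
      (remhash value table))
    object))

(defun store-lookup (class-name slot-name value)
  "Return the list of objects indexed under VALUE."
  (values (gethash value (store-table class-name slot-name))))
```

With a layout like this, each new transaction point only has to call the insert and remove functions at the right moments, which fits the add-a-point, adjust-the-store, add-tests loop.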

My colleagues reacted differently when I discussed the project with them (around the time of having a rough version of some transaction points in place): K seemed enthused, recounted some of his experience with MOP (related to web frameworks and generic functions), and mentioned that persistence as I planned it was a difficult project to do right/well thanks to the many possible different semantics. S warned that I would have efficiency issues compared with a full-fledged RDBMS, since those have received many years of work towards making them performant, and (unless I am mixing up two unrelated memories) recounted a time he had rolled his own persistence and run into performance issues. He’s right, of course, though automation vs performance trade-offs are common enough. D gave his unconditional moral support, called the project cool, talked enthusiastically about The Art of the Metaobject Protocol, and concluded that it’s Lisp – mold it and make it do what you want!

Project Postmortem

The project took a bit more than two months, namely about sixteen more months (chronologically). However, part of that is due to working on-and-off on it. Part of it is definitely from having little idea about all the involved parts of MOP when I started, though – not to mention the only transaction points I had in mind back then were make-instance and (setf slot-value-using-class). Perhaps the biggest reason the project took so many more calendar months than anticipated, though, was that I still mentally accounted time as if I were at university. Apparently, time is a lot more malleable as a student than as an employee. Full-time employment plus an hour’s worth of commuting eat up time in a way I simply hadn’t internalized.

Let’s take a look at some charts (yay!):

A chart of the number of commits over time, showing a spike in early 2020, and then another spike, twice as big, in late 2020.

A chart of the frequency of added and deleted lines of code over time, showing a spike of 800 LOCs in early 2020, a spike of 600 LOCs in October 2020, and a spike of around 1200 LOCs around late 2020.

A majority of direct programming work happened in early 2020 and late 2020, followed by finalization in February/March 2021. The Pareto principle’s clearly at work in terms of which months led to progress. The picture gets complicated by the fact that I sometimes spent a few weeks mulling over design issues without writing any code, though. Squinting at these charts, I actively worked on SPICLUM for 1 month in 2021, 2 months in late 2020, May 2020, January 2020, and let’s call it 2 months for 2019. Add in another month of mull-work, and we get an “effective part-time” work-length of 8 months. That still vastly overshoots the initial expectation of a couple of months, but provides a clearer image than the noisy chronological time that’s happened to elapse.

The three thousand lines of the repository are split like this: 1016 on testing, 1675 on source code, and 290 on assorted files like the README and ASDF build files.

A more interesting question than the time and size of the project, though, is whether it succeeded: Did I create a persistence system that gets out of the way, and requires minimal configuration (just the place to store the world and log files)? Yes, I think so. The qualification there is because I haven’t yet used SPICLUM as the persistence system for another project, since I’ve been busy developing SPICLUM itself. But in terms of the overall design and use during testing/development, I’m pretty satisfied. There are some shortcomings, though – let’s return to those in the failures section.

Successes

Most obviously, I have a persistence system with semantics and automagic that I like (modulo the inevitable flaws in design/approach I’ll find as I start using it as part of other projects). It gets out of the way – there’s nothing to do when using it except 1) loading/saving the world as appropriate 2) using defpclass 3) keeping the mutable data structure modification-without-slot-access in mind 4) keeping the supported data structures in mind 5) querying out relevant objects as relevant 6) deleting objects as appropriate 7) handling errors as appropriate.

I’m pleased with the query language. It’s based on classes (and their hierarchies) and slots – which seem the natural options when dealing with querying CLOS instances. If you supply each slot comparison with a function, it tests the function, whereas if you supply a value, it tests the value using the slot’s equality comparison. It’s flexible and just does what I want. I chose to use keywords for the compound query operations (:and :or :not), on the off-chance some hypothetical user might want to use (and or not) as slot names. As with the rest of SPICLUM, little to no effort has been put into performance optimization of the queries or the object-store, though – there’s a soft attempt at finding the smallest possible initial set of candidate objects to start filtering on, but it’s pretty stupid.
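A minimal sketch of that function-or-value dispatch and the keyword compound operators, assuming plain CLOS instances and using EQUAL as a stand-in for the slot's own equality comparison. The names person, slot-test-p, matches-p and select are hypothetical and not SPICLUM's API, and the class-based part of the query language is left out.

```lisp
;; Hypothetical sketch, not SPICLUM's query language.
(defclass person ()
  ((name :initarg :name)
   (age  :initarg :age)))

(defun slot-test-p (object slot-name expected)
  "If EXPECTED is a function, call it on the slot value; otherwise compare with EQUAL.
EQUAL stands in for a per-slot equality comparison."
  (and (slot-boundp object slot-name)
       (let ((value (slot-value object slot-name)))
         (if (functionp expected)
             (funcall expected value)
             (equal value expected)))))

(defun matches-p (object query)
  "QUERY is (:and q...), (:or q...), (:not q), or (slot-name expected)."
  (destructuring-bind (head &rest args) query
    (case head
      (:and (every (lambda (q) (matches-p object q)) args))
      (:or  (some  (lambda (q) (matches-p object q)) args))
      (:not (not (matches-p object (first args))))
      ;; Anything else is treated as a slot name, so slots named
      ;; AND, OR or NOT remain usable.
      (t    (slot-test-p object head (first args))))))

(defun select (objects query)
  "Return the members of OBJECTS that satisfy QUERY."
  (remove-if-not (lambda (object) (matches-p object query)) objects))
```

For example, (select people (list :and (list 'age #'plusp) '(:not (name "Ada")))) keeps everyone with a positive age who is not named "Ada". A linear filter like this is also where an unoptimized implementation would start.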

I’ve said enough about ensure-class-using-class and ensure-class-using-metaclass elsewhere – I do think ensure-class-using-metaclass provides an improved interface, though. There’s also not that much to say about the serialization to Lisp code: there’s something fun and circular about having Lisp procedures that (through other procedures) serialize (something equivalent to) the calls to themselves, though. Some of the serializations were tricky to figure out – in particular the ensure-class one, thanks to SBCL’s extra system-supplied arguments. A neat side-effect of serializing as Lisp code is that loading the world reduces to loading the relevant world file, log file, and some bookkeeping.
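To illustrate the idea of serializing as Lisp code, here is a small sketch that turns data into a form which rebuilds that data when evaluated. It only handles numbers, strings, characters, symbols, conses and hash-tables; serialize-form and round-trip are hypothetical names, and SPICLUM's real serializer (which covers classes, instances and transactions) is not shown here.

```lisp
;; Hypothetical sketch, not SPICLUM's serializer.
(defun serialize-form (data)
  "Return a form that, when evaluated, rebuilds a copy of DATA."
  (etypecase data
    ((or number string character) data)   ; self-evaluating objects
    (null nil)
    (symbol `(quote ,data))
    (cons `(cons ,(serialize-form (car data))
                 ,(serialize-form (cdr data))))
    (hash-table
     ;; Rebuild the table entry by entry. TABLE is a plain symbol so that
     ;; the printed form can be read back in.
     `(let ((table (make-hash-table :test ',(hash-table-test data))))
        ,@(loop for key being the hash-keys of data using (hash-value value)
                collect `(setf (gethash ,(serialize-form key) table)
                               ,(serialize-form value)))
        table))))

(defun round-trip (data)
  "Print the serialized form to a string, read it back, and evaluate it."
  (eval (read-from-string (prin1-to-string (serialize-form data)))))
```

Writing such forms to a file instead of a string means that restoring the data is just a call to load on that file, which is the side-effect described above.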

I’ve also learnt a lot. About MOP and CLOS, naturally – nothing like making changes to an object system to learn how it works. The use of MOP in SPICLUM’s limited to metaclasses and slot definition metaobjects and the intersection points listed in the README, though – so no need for method combinations, custom specializers, custom funcallables, etc. I’ve also gotten some more experience with testing and its usefulness – I lost count of how many times my tests would catch errors in design or implementation. Sometimes they caught regressions too. In implementing multi-setf and multi-psetf (atomic operators, if used with certain constraints), I also had to write my first uses of setf expansions.

Besides learning about MOP, though, the most interesting learning outcome for me personally was the opportunity to test out the sort of program-language fit I’d read about so many times. From small generic language utilities like prog1-let, partial function application (lfix and rfix, sometimes misnomered as curry and rcurry), ignore-args, key-args… to utilities to capture common CLOS/MOP actions, like slot->value-map, find-slot-defining-class, do-bound-slots, slot-by-name… My favourite utility to capture the semantics of what I was doing, though, was as-transaction: transactions had to be atomic. as-transaction takes a list of dos and undos, so as to unroll completed sub-actions of a transaction in case of non-local exits. If we guarantee that those sub-actions are transactions, i.e. atomic, then that atomicity carries over to the whole as-transaction form. This greatly simplified the implementation of many things. Here’s an example usage:

(defmethod (setf c2mop:slot-value-using-class) :around (new-value
                                                        (class prevalence-class)
                                                        instance
                                                        slotd)
  "SETFs the NEW-VALUE of INSTANCE for SLOTD, as a transaction.

Must atomically update indexes and persist as appropriate."
  (assert (acceptable-persistent-slot-value-type-p new-value))
  (multiple-value-bind (old-value slot-boundp)
      (guarded-slot-value instance (c2mop:slot-definition-name slotd))
    (let (;; CLHS specifies that SETF may return multiple values
          (results (multiple-value-list (call-next-method))))
      (with-recursive-locks (prevalence-slot-locks class (list slotd))
        (as-transaction
            ((:do (when slot-boundp
                    (prevalence-remove-class-slot class slotd old-value instance))
              :undo (if slot-boundp
                        (progn (call-next-method old-value class instance slotd)
                               (prevalence-insert-class-slot class slotd old-value instance))
                        (slot-makunbound instance (c2mop:slot-definition-name slotd))))
             (:do (prevalence-insert-class-slot class slotd new-value instance)
              :undo (prevalence-remove-class-slot class slotd new-value instance)))
          (key-args (new-value class instance slotd) serialize :setf-slot-value-using-class)
          (values-list results))))))
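
The do/undo idea can be sketched with unwind-protect. This is a guess at the general technique based on the usage above, not SPICLUM's actual as-transaction; the macro name as-transaction-sketch is hypothetical. Each step's :do form runs in order, an undo closure is recorded once it completes, and on a non-local exit the recorded undos run most recent first.

```lisp
;; Hypothetical sketch, not SPICLUM's as-transaction.
(defmacro as-transaction-sketch ((&rest steps) &body body)
  "Run each step's :DO form in order, then BODY, returning BODY's values.
On a non-local exit, run the :UNDO forms of the completed steps, most
recent first."
  (let ((undos (gensym "UNDOS"))
        (done  (gensym "DONE")))
    `(let ((,undos '())
           (,done nil))
       (unwind-protect
            (multiple-value-prog1
                (progn
                  ,@(loop for step in steps
                          collect `(progn
                                     ,(getf step :do)
                                     ;; Only completed steps get an undo.
                                     (push (lambda () ,(getf step :undo))
                                           ,undos)))
                  ,@body)
              (setf ,done t))
         ;; Cleanup runs on every exit; undo only if we did not finish.
         (unless ,done
           (mapc #'funcall ,undos))))))
```

If the body signals an error after both steps have run, the second step's undo runs before the first's, and nothing is undone when the form completes normally.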

Part of the language-problem fit also comes from SPICLUM’s own code structure (as separate from the utilities produced): Conditions and keyable-slots are low-level and don’t depend on anything except the language. The object-store relies on keyable-slots and conditions, but on nothing about prevalence-classes or prevalence-objects. It does store the world/log pathnames, though, which is a slightly muddled layering. In theory, the object-store could simply provide lookup access to objects for arbitrary purposes. Queries rely on the object-store, whereas serialization doesn’t rely on anything except the language again. And then on top of all those pieces of disconnected functionality, the MOP intersection points for the prevalence-class and prevalence-object classes tie them together and “drive” the whole system. (And then there’s some interface functions of their own atop that again, to provide a programmer interface.) It took some thought to arrive at the way it’s layered – I had the confusing idea early on that it should basically be reversed, and that prevalence-class and prevalence-object would form the (near-) bottom of the system. But the current stratification is much neater, I think.

Failures

Besides the overshot time prognosis, the biggest failure is that SPICLUM doesn’t get entirely out of the way: from the list above under the successes header, points 3 and 4 are shortcomings. There’s no way to get around point 3 that I can tell – if you’re extracting a mutable structure from a slot, and then passing that around and mutating it, then the “parent” object simply can’t know about the mutation at the time it happens. Saving the world captures such mutated states, but they’ll be absent from the transaction log. It’d be possible to avoid the problem by having prevalent objects return copies of their mutable structures by default, unless within some special modification context, I suppose – it’d mitigate any accidental mutation causing cascades of changes, at least.
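The mutation problem can be shown with plain CLOS, without any MOP machinery. In this sketch, an :after method on a setf accessor stands in for the transaction log; the basket class, *observed-writes* and demo are hypothetical names invented for the example.

```lisp
;; Hypothetical sketch of the problem, not SPICLUM code.
(defclass basket ()
  ((items :initarg :items :accessor items)))

(defvar *observed-writes* 0
  "Counts writes that go through the accessor; stands in for a transaction log.")

(defmethod (setf items) :after (new-value (basket basket))
  (declare (ignorable new-value))
  (incf *observed-writes*))

(defun demo ()
  "Return the first item seen through the object after an in-place mutation,
the write count after that mutation, and the write count after a slot write."
  (let* ((basket (make-instance 'basket :items (list :apple)))
         (extracted (items basket)))
    (setf *observed-writes* 0)
    ;; Destructive change to the extracted list: no slot write happens.
    (setf (first extracted) :pear)
    (let ((seen-by-object (first (items basket)))
          (writes-after-mutation *observed-writes*))
      ;; Replacing the slot value goes through the accessor and is observed.
      (setf (items basket) (list :plum))
      (values seen-by-object writes-after-mutation *observed-writes*))))
```

The object's state changes with the in-place mutation, yet the observer only notices the later slot write. Returning copies from the reader would keep the stored list intact, at the cost of copying on every read.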

But the fact that SPICLUM doesn’t support all types of data annoys me more. Serializable closures would be neat. Some of the unsupported data types are because of the approach SPICLUM takes: displaced arrays are unsupported because they’d require extensive world analysis to figure out; anonymous classes are unsupported because SPICLUM uses the names of classes as parts of internal keys; non-prevalent CLOS instances are unsupported because they seem silly in context – why store them when prevalent CLOS instances already exist?

Some of the unsupported data is because of laziness/not-getting-around-to-it-yet-itis. Pathnames, packages and circular data all fall into this category (unless my preliminary analysis of those groups is erroneous). Figuring out a way to serialize general circular data sounds interesting, but like a significant piece of work. Doing so in a nice way seems like it’d require a general data walker, too. I might write a post on data walking – SPICLUM walks arbitrary data as part of serialization and as part of forcing thunks already.

The third group of unsupported data types falls into weaknesses/shortcomings in Common Lisp itself. I’ll write a separate post on the topic of closures. For now, let’s consider serializable named functions, and serializable structs. Actually the latter might deserve its own post, too. Named functions, then: An easy option would be to serialize a named function as a function lookup on the function’s name. This doesn’t capture every single possible use of a named function – maybe you’d like to store the previous version of the named function, despite a new one existing. But it’s one option for sensible semantics – if a slot’s value is a named function, the slot, in the reconstructed world, should hold whatever function has that name. Unfortunately, it turns out that there’s no portable way to extract the name of a function: function-lambda-expression has the function name as its third returned value, but an implementation is permitted to return nil as the name of any function, whenever. Well, seems stupid to me!
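A best-effort version of the lookup-by-name idea might look like the sketch below. The helper names function-name-or-nil and serialize-named-function are hypothetical, only symbol names are handled (names of the form (setf foo) are left out), and whether a name comes back at all depends on the implementation.

```lisp
;; Hypothetical sketch; the result depends on the implementation.
(defun function-name-or-nil (function)
  "Return the third value of FUNCTION-LAMBDA-EXPRESSION, which may be NIL
(or a non-symbol) even for a named function."
  (nth-value 2 (function-lambda-expression function)))

(defun serialize-named-function (function)
  "Return a form that looks FUNCTION up by name when evaluated.
Signal an error if no usable name can be recovered."
  (let ((name (function-name-or-nil function)))
    (if (and name (symbolp name) (fboundp name))
        `(fdefinition ',name)
        (error "Cannot recover a name for ~S" function))))
```

Evaluating the returned form in the reconstructed world yields whatever function currently has that name, which matches the semantics proposed above. The error branch is the price of the missing portability guarantee.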

Summa Summarum

Any significant project seems fractal in nature: Progress involves turning over rocks and finding more possible work/improvements. What started as wanting automagic persistence turns out to border on circular data structures, and shortcomings in the language. If I solve those, I’d probably find something tangential to each again that would warrant further improvement, and so forth. Or, for another example, writing this write-up has revealed two other posts to write.

The shortcomings in Common Lisp that I’ve found mostly revolve around a lack of introspection, which is ironic since the amount of introspection Lisp offers initially blew my mind (particularly when considering MOP on top). Suffice to say SPICLUM’s hammered home some subset of the myriad uses of introspection.

Apart from that, developing SPICLUM has been fun, and I’m excited to see how well its current design will hold up once I get around to actually using it for persistence. Once I decide on an application project, I should have a decent early version ready in a couple of months.