In the last post the question was: why was this written? And the harder version: what would it take to make the "why" survive? This post is the first real answer, including the version we got wrong before we got it right.

The short version: memory alone isn't enough, and git history alone isn't enough, and the interesting part is what happens when you make them the same thing.

Two halves that don't work on their own

Start with memory on its own. You let agents and engineers write down decisions (we chose this database because we couldn't depend on a separate server) and you store those notes somewhere. This helps for about a month. Then the notes drift. The code moves on, the note doesn't, and now you have a confident statement of intent that no longer matches reality and no way to tell which one is stale. Memory without provenance is just notes that rot.

Now take git history on its own. It has the opposite problem. It is rigorously tied to the code (every line, every change, every author) but it records what changed and almost never why. A wall of diffs with no intent. You can reconstruct the entire mechanical history of a file and still have no idea what anyone was thinking.

Each half fails in exactly the spot where the other is strong. Memory has the intent and loses the grounding. Git has the grounding and loses the intent. The obvious move is to put them together. The non-obvious part was where.

The wrong turn: memory beside the code

Our first instinct was the boring, sensible one: store the memory next to the repo. A file in the project, or a small local store the tool manages. The code is here, the memory is right here next to it, done.

It works on one machine. Then someone clones the repo and the memory doesn't come with it. Or it lives in a file that gets committed, and now every decision note is a merge conflict waiting to happen, sitting in the same diff as the code change it describes, cluttering reviews. Or it lives in a sidecar store and you've quietly introduced a second source of truth that has to be backed up, synced, and remembered: the exact kind of infrastructure we were trying to avoid.

Every version of "beside the code" had the same defect: the memory and the code could drift apart, because they were two things that merely lived near each other. We wanted them to be one thing that moved together.

The right place was inside git history

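Git already has a feature for this, and most people have never used it: git notes. Notes let you attach data to a commit after the fact, without rewriting the commit. The note is keyed to the object it annotates and stored under its own ref (refs/notes/...), in the same object database as the code. One caveat: a default clone or fetch does not bring notes refs along, so the ref has to be pushed and fetched explicitly. Once it is, the notes are versioned and travel like any other ref.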

So that's where the memory went. A decision in Spelunk is written through to git notes, anchored to a commit, carrying the commit SHA as provenance. The intent and the grounding stop being two things you have to keep in sync: they're the same object. Clone the repo along with its notes ref and the why comes with it. There's no separate store to stand up, no index to build, nothing to back up beyond the thing you were already backing up: the repository.
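To make the write-through concrete, here is a minimal sketch in Python that shells out to the git CLI. It is not Spelunk's actual implementation: the ref name "decisions", the JSON record shape, and the helper names are all assumptions made for illustration. It attaches a decision to a commit as a note, checks that the commit itself was not rewritten, and reads the note back.

```python
import json
import subprocess
import tempfile

NOTES_REF = "decisions"  # hypothetical ref name, stored as refs/notes/decisions


def git(repo: str, *args: str) -> str:
    """Run a git command inside `repo` and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def record_decision(repo: str, commit: str, text: str) -> dict:
    """Attach a decision to `commit` as a git note, with the SHA as provenance."""
    sha = git(repo, "rev-parse", commit)
    entry = {"decision": text, "commit": sha}  # hypothetical record shape
    # Without -f, git refuses to replace an existing note on this commit.
    git(repo, "notes", f"--ref={NOTES_REF}", "add", "-m", json.dumps(entry), sha)
    return entry


def read_decision(repo: str, commit: str) -> dict:
    """Read the decision note attached to `commit` back as a dict."""
    return json.loads(git(repo, "notes", f"--ref={NOTES_REF}", "show", commit))


def demo() -> dict:
    """Create a throwaway repo, annotate its only commit, and read the note back."""
    with tempfile.TemporaryDirectory() as repo:
        git(repo, "init", "-q")
        # Local identity: both the commit and the notes commit need one.
        git(repo, "config", "user.name", "Demo")
        git(repo, "config", "user.email", "demo@example.com")
        git(repo, "config", "commit.gpgsign", "false")
        git(repo, "commit", "-q", "--allow-empty", "-m", "Add embedded database")
        head = git(repo, "rev-parse", "HEAD")
        record_decision(
            repo, "HEAD",
            "Chose this database because we couldn't depend on a separate server.",
        )
        return {
            "head": head,
            "head_unchanged": git(repo, "rev-parse", "HEAD") == head,
            "note": read_decision(repo, "HEAD"),
        }


if __name__ == "__main__":
    print(json.dumps(demo(), indent=2))
```

For a teammate to see notes written this way, the ref has to move explicitly, for example with `git push origin refs/notes/decisions` on one side and `git fetch origin refs/notes/decisions:refs/notes/decisions` on the other.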

This is the mechanism that shipped, and it's the one we'd point at if you asked what Spelunk actually is under the marketing. Memory, written through to git notes, with provenance, no infrastructure required to start.

(For the record, we did prototype a second, lower-level backend that wrote into git's own metadata more directly, and we dropped it. It didn't earn the complexity it added over plain notes write-through. That's a thread we'll pull on properly in the next post, which is entirely about the things we deleted.)

Supersede, don't overwrite

There's one more decision in here that matters more than it sounds.

When a decision changes (you chose the database for one reason, then a year later you swap it for a different reason), the naive thing is to overwrite the old note with the new one. Don't. The moment you overwrite, you've destroyed exactly the thing this whole exercise was about: the history of the why.

So new entries supersede old ones rather than replacing them. The current answer is what you get by default. But the superseded entry is still there, still anchored to its commit, so the question "why did we use to do it the other way, and what changed?" stays answerable. The decision has a past tense, and the past tense survives.
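One way to picture supersession is as a linked chain of entries. The sketch below is an illustration under assumptions, not Spelunk's data model: the Entry fields, the ids, the placeholder SHAs, and the second decision's wording are all made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    id: str
    topic: str
    text: str
    commit: str                       # SHA the entry is anchored to
    supersedes: Optional[str] = None  # id of the entry this one replaces


def current(entries: List[Entry], topic: str) -> List[Entry]:
    """Entries for `topic` that no other entry supersedes (the default answer)."""
    replaced = {e.supersedes for e in entries if e.supersedes}
    return [e for e in entries if e.topic == topic and e.id not in replaced]


def history(entries: List[Entry], entry: Entry) -> List[Entry]:
    """Walk the supersedes chain from `entry` back to the original decision."""
    by_id = {e.id: e for e in entries}
    chain = [entry]
    seen = {entry.id}
    prev = entry.supersedes
    while prev in by_id and prev not in seen:  # `seen` guards against cycles
        chain.append(by_id[prev])
        seen.add(prev)
        prev = by_id[prev].supersedes
    return chain


# Illustrative data: placeholder SHAs, invented second decision.
old = Entry("d1", "database",
            "Embedded database: we couldn't depend on a separate server.",
            commit="a1b2c3d")
new = Entry("d2", "database",
            "Swapped the database a year later, for a different reason.",
            commit="e4f5a6b", supersedes="d1")
log = [old, new]

if __name__ == "__main__":
    print([e.id for e in current(log, "database")])
    print([e.id for e in history(log, new)])
```

Nothing is ever deleted from `log`. The default view filters out superseded entries, and the chain walk recovers the past tense on demand.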

This also does something quietly useful on the first query: when two old notes disagree with each other (a stale convention and the decision that replaced it), the contradiction surfaces instead of hiding. You find out the docs disagree with themselves at the moment you ask, not three weeks into a refactor.
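A rough sketch of how such a contradiction could surface, under a strong simplifying assumption: two notes "disagree" when they cover the same topic and neither supersedes the other. How Spelunk actually detects disagreement is not described here, and the Note type and sample notes are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Note(NamedTuple):
    id: str
    topic: str
    text: str
    supersedes: Optional[str] = None  # id of the note this one replaces


def contradictions(notes: List[Note]) -> Dict[str, List[Note]]:
    """Topics that still have more than one live (un-superseded) note."""
    replaced = {n.supersedes for n in notes if n.supersedes}
    live = defaultdict(list)
    for note in notes:
        if note.id not in replaced:
            live[note.topic].append(note)
    return {topic: group for topic, group in live.items() if len(group) > 1}


# Hypothetical notes: a stale convention and the decision that replaced it,
# recorded without a link between them.
stale = Note("n1", "error-handling", "Return error codes from every handler.")
newer = Note("n2", "error-handling", "Raise exceptions; handlers don't return codes.")
other = Note("n3", "database", "Embedded database, no separate server.")

if __name__ == "__main__":
    for topic, group in contradictions([stale, newer, other]).items():
        print(topic, [n.id for n in group])
```

Recording the link resolves the conflict: once the newer note carries `supersedes="n1"`, the topic has a single live answer again and the older note becomes history.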

Why this is the load-bearing decision

Everything else Spelunk does leans on this one choice. Because the memory lives in git rather than in a vendor's store, it isn't tied to any particular agent, which is the subject of a later post in this series and the thing we'd defend hardest. Because it needs no separate infrastructure, the tool can start in the first minute instead of after a setup ritual. And because the why travels with the repo, the agent that reads your codebase next week starts with the intent that this week's agent wrote down, instead of guessing.

It's a small, almost unglamorous mechanism. Git had the feature the whole time. We just decided the why deserved to live in the one place that already survives every clone.


Spelunk is open source and code-aware, callable from whatever agent you already use. Repo and docs: spelunk.cloud.