Your knowledge base remembers what changed, not why

How to engraph a folder of markdown and assets so a growing wiki doesn't drift into slop - and how to work in it without ever touching git.

Stoyan Stoyanov

May 28, 2026

context, ai-engineering, productivity, knowledge-bases

In April 2026 Andrej Karpathy posted that he was spending less of his token budget on generating code and more on building knowledge bases out of markdown and images - a folder of interlinked notes that an agent compiles, links, and keeps tidy as new material comes in. The post went everywhere, and within days there were gists, write-ups, and a wave of people pointing their agent at a folder and watching a personal wiki assemble itself.

The pattern works while the loop is closed: one person, one stretch of work, the reasons still fresh in their head. The trouble shows up the moment it isn't, and most of the ways it isn't have nothing to do with other people.

Open the folder three months in. You wrote a section then and you can't remember why it reads the way it does, so you "improve" it, and now it's wrong - you've quietly walked over a constraint nobody recorded. Without versioning there's no walking it back; with versioning but no reasoning, the old text is right there in the history and you still can't tell which version was the right one.

Or there's the agent version of the same problem. You keep asking it to update the same area, and every single time it forgets the same three considerations and you re-explain them. It isn't being slow - nothing in the repo is telling it those considerations exist, so it rediscovers them at the start of each session and loses them again at the end. The cost compounds session over session.

Then hand the folder to a team. Version it with git so changes are tracked. Let a few people and a few agents edit it through the week. The articles still compile. The links still resolve. But nobody remembers why the pricing doc was split in two, the onboarding section quietly contradicts the policy section, and someone has just re-added a thing the team deliberately removed three months ago, because the reason for removing it lived in a Slack thread that's gone.

That's the drift. The files are fine. The reasoning behind them has evaporated, and an automated consistency check can't bring it back - those checks confirm the wiki is consistent, not that it's right.

This is the gap engraph fills. We built it for codebases. The prose case was a side experiment, and the side experiment ended up teaching us something we weren't expecting.

What engraph is, without the jargon

Engraph is a layer you add to a git repository that keeps three things the files themselves can't keep:

A map of how the folder is organized and how things connect. Which docs link which, where assets live, which sections sit inside which document.

A history of decisions, written into the record at the moment each change is made. Not "this paragraph changed" but "we changed it because X, we considered Y and rejected it, and here's the constraint that forced our hand."

A set of house rules and checks - conventions for how docs in a given folder should look, and verifications that confirm a change was made correctly before it lands.

The first one a folder of files mostly gives you for free; more on that below. The other two are the point. They're what let anyone - a new teammate, or an agent that's never seen the repo - open it months later and not just read what's there but understand why it's there and how they're expected to add to it.

Engraph is built for codebases - that's the design center, and it's where it earns its keep. Code has rich structure engraph can parse and address against (functions, modules, types), and conventions in a codebase live at fine granularity, the kind of "this layer can't import from that one" rules that are hard to keep alive in anyone's head as the team grows. An agent working on code with this layer in place behaves like an engineer who's actually read the history. Without it, like an engineer on their first day, every day.

The three things it keeps aren't intrinsically about code, though. A folder of markdown is also a structure you can map. The decisions behind every prose change are still worth keeping. House rules - what frontmatter to require, which sections a given kind of doc must contain, which links must resolve - are real conventions that benefit from real enforcement. The mechanics translate, and when we pointed engraph at a folder of prose we found something we hadn't expected. The next section is about that.

Engraph doesn't move your knowledge anywhere. Everything stays in your folder, in plain markdown, owned by you, readable by any tool. Git already remembers every version and who touched it. Engraph makes sure each of those versions arrives with its reasoning attached and its house rules respected.

What we didn't expect

Here's the part we didn't see coming. When we built this layer for codebases, capturing the why behind every change was one improvement among several - the structural map, the rule enforcement, the agent context were all doing real work too. When we pointed the same layer at a folder of prose, the why-capture piece got dramatically more important than it had been. The reason turned out to be obvious in hindsight.

Read a code change and you get some help from the change itself - a renamed function, a new conditional, a different return type carry meaning you can partly reconstruct. Read a prose change and you get almost nothing. A diff that turns "8px" into "12px" tells you a number moved. It doesn't tell you the spacing was bumped after a usability test showed people mis-tapping on mobile, or that 16px was tried first and broke a dense table layout. The artifact shows the current answer. The reasoning that produced it is invisible in a way code rarely is.

There's a useful split here between knowledge and expertise. Knowledge is what you can read straight off the page - the spacing scale is 4px, the button doc says use sentence case. Expertise is the part that lives in someone's head - why the scale is 4px and not 8px, what we tried before sentence case, which constraint we're not allowed to break. A growing knowledge base accumulates knowledge naturally. Expertise leaks out the side, one departed contributor and one closed thread at a time. Engraph's job is to catch the expertise on the way past, at the only moment it's cheap to catch: when the change is being made and the person making it still knows why.

The order matters, and it's worth being honest about it. We didn't start with the prose case. Engraph was built for code. We also had our own internal knowledge base - concepts, articles, strategies, plans, specs that we'd been sharing the hard way between us - and once engraph was working we plugged it into that folder as an experiment on ourselves. It worked. The structural backbone that maps a codebase mapped the folder of docs cleanly. The why-capture caught the same kind of thing it catches on commits. The rules and checks behaved the same way they behave on code.

What surprised us was the realization underneath. The concepts engraph is built on - a structural backbone over the artifacts, decisions captured at the moment they're made, house rules with enforcement that actually runs - aren't about code specifically. They're about catching expertise on any artifact that lives in a file structure. Code is still where engraph shines brightest, because the structure there is the richest and the agent loop runs hardest. But the underlying idea is more general than the target it was built for. And the reflection that probably matters most if you came to this article from the developer side isn't that the same layer happens to work on prose. It's the inverse: if it does this much for a folder of words, what is it doing in the codebase it was actually designed for?

A running example: a cross-functional team's knowledge base

Picture a team - UI designers, UX designers, and a few product people - keeping a shared knowledge base in a folder of markdown and exported images. The design system lives there: color, type, spacing, a doc per component with screenshots, brand voice, accessibility standards. The UX work lives there too: journey maps, research synthesis, the hypotheses that got tested and what came back, the reasoning behind why the signup flow looks the way it does and the approaches that didn't survive testing. And so does the product side: positioning, scope rationale, what got cut from the roadmap and why, the customer conversations that pushed a feature one direction and not another. A lot of reasoning lives in this folder - far more than any of the artifacts on their own would suggest. They've started versioning it in git so the whole org can see it, and now a dozen people and a couple of agents edit it through the week.

The cracks are already visible. The spacing doc got updated but three component docs still quote the old values. A new contributor added a "ghost button" variant that the team killed last spring for contrast-ratio reasons nobody wrote down. The journey map shows a wizard-style onboarding as "under consideration" - but the team tried it in February, watched retention drop, rolled it back, and the rollback rationale lived in a Slack thread that's gone. The positioning doc still describes the old competitive framing from before the pricing change. Someone is about to re-propose a scope cut that was already evaluated and rejected six months ago. None of this is catastrophic on its own, and none of it gets caught by reading the files, because each file is internally fine.

Here's how engraph changes the shape of that team's week.

Part 1 - Setting it up (once)

This is the only technical part, it happens one time, and one person can do it - or you can ask your agent to do it for you inside whatever environment you already use to edit the wiki.

Initialize. Point engraph at the folder:

npx engraph init

This sets up engraph's layer alongside your files. It doesn't rewrite anything you already have.

Agree on frontmatter. Most well-run markdown bases already put a small block of metadata at the top of each file - Karpathy's pattern uses one, and so should yours. For this team it might be:

---
id: component-button
status: active        # draft | active | deprecated
owner: design-systems
last_reviewed: 2026-05-20
---

The id matters more than it looks. If conventions are pinned to a document, you want them pinned to something stable. File paths move when folders get reorganized, and knowledge bases do get reorganized; headings get reworded every other week. A frontmatter id is the durable handle that survives both. If you don't have frontmatter yet, this is the one habit worth adopting before anything else - and a simple agent skill can add and maintain it across the whole folder so nobody does it by hand.
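To make the "durable handle" idea concrete, here is a minimal Python sketch that builds an id-to-path index from frontmatter and shows the id surviving a folder reorganization. It is an illustration only, not how engraph resolves documents: `read_frontmatter` and `build_id_index` are hypothetical names, and the parser assumes flat `key: value` frontmatter rather than full YAML.

```python
import tempfile
from pathlib import Path


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse a leading '---' block of flat 'key: value' lines (not full YAML)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        body = line.split(" #", 1)[0]  # drop a trailing ' # comment'
        if ":" in body:
            key, value = body.split(":", 1)
            meta[key.strip()] = value.strip()
    return {}  # no closing '---': treat the file as having no frontmatter


def build_id_index(root: Path) -> dict[str, Path]:
    """Map each frontmatter id to the file's path relative to root."""
    index: dict[str, Path] = {}
    for path in sorted(root.rglob("*.md")):
        doc_id = read_frontmatter(path.read_text(encoding="utf-8")).get("id")
        if doc_id:
            index[doc_id] = path.relative_to(root)  # on duplicate ids, the last file wins
    return index


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        old = root / "components" / "button.md"
        old.parent.mkdir(parents=True)
        old.write_text("---\nid: component-button\nstatus: active\n---\n# Button\n",
                       encoding="utf-8")
        print(build_id_index(root))

        # Reorganize the folder: the path changes, the id does not.
        new = root / "design-system" / "components" / "button.md"
        new.parent.mkdir(parents=True)
        old.rename(new)
        print(build_id_index(root))
```

A rule addressed to `component-button` keeps pointing at the right file after the move, whereas a rule addressed to `components/button.md` would not.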

Let it take a first pass. Ask the agent to do an initial extraction over the folder. It walks the tree, maps the structure and the links between docs, and reads enough of the existing content to propose a starter set of conventions - things it noticed you already do. You review those suggestions and keep the ones that are real. Nothing is enforced that you didn't approve. This first pass is where the map gets built and where your house rules start as something concrete rather than a wiki page nobody reads.
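One way to picture "things it noticed you already do" is a plain frequency count. The sketch below proposes frontmatter fields as candidate house rules when most docs already carry them. This is a hypothetical heuristic for illustration, not engraph's extraction logic; the function name and the threshold are assumptions, and the output is a list of suggestions for a person to review, not rules to enforce.

```python
from collections import Counter


def propose_required_fields(docs: list[dict[str, str]], threshold: float = 0.75) -> list[str]:
    """Suggest fields that appear in at least `threshold` of the docs."""
    if not docs:
        return []
    counts = Counter(key for meta in docs for key in meta)
    return sorted(key for key, n in counts.items() if n / len(docs) >= threshold)


if __name__ == "__main__":
    # Frontmatter from five hypothetical docs, already parsed into dicts.
    docs = [
        {"id": "component-button", "status": "active", "owner": "design-systems"},
        {"id": "component-card", "status": "active"},
        {"id": "spacing-scale", "status": "active", "owner": "design-systems"},
        {"id": "brand-voice", "status": "draft"},
        {"id": "journey-signup"},
    ]
    print(propose_required_fields(docs))
```

Here `id` and `status` clear the bar and `owner` does not, so `owner` would show up as a gap to discuss rather than a rule to adopt.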

The exact command names track whatever version of engraph you've installed. If a command here doesn't match, ask your agent - it knows the current surface - or check the docs.

A note on what the map can and can't see. Your folders, files, and the links between them become a clean structure engraph can address conventions against. Inside a long document, engraph can also see the heading structure if you want rules at that level, though headings are wobbly anchors because people rename them - lean on frontmatter ids for anything that needs to stay put. Images are along for the ride as references, not as something engraph reads: it knows "the button doc embeds these three screenshots and they live in this folder," which is exactly enough to check that links resolve and assets aren't orphaned, and nothing more.

Part 2 - Working in it without touching git

This is the part that matters most for a team that doesn't think in commits, and it's mostly about how you talk to your agent.

Here's the mental model. Git is a filing cabinet that never forgets - every version of every file, who changed it, when. You never open the drawers yourself. You talk to your agent, and the agent files things for you. The one new habit is that when you ask it to save a change, you tell it why, and it writes that reasoning into the record alongside the change. In git terms it's making a commit; you can just think of it as saving with a reason.

So instead of editing the button doc and moving on, the exchange looks like this:

You: Update the button doc - primary buttons go to 12px vertical padding. Save it. We bumped it because the usability round showed mis-taps on mobile at the old size, and we tried 16px first but it broke the pricing table layout.
Agent: Done. Saved with the reasoning attached.

What the agent actually wrote into the record reads like this:

docs(button): primary button vertical padding 8px -> 12px

intent(component-button): reduce mobile mis-taps on the primary action
decision(component-button): set vertical padding to 12px
rejected(component-button): 16px - broke the dense pricing table layout
constraint(component-button): must stay legible in the dense table context
learned(component-button): touch-target comfort beats visual tightness on mobile here

The top line is the human-readable summary. The lines under it are the structured reasoning - intent, the decision taken, what was rejected and why, the constraint in play, what was learned. Each is tagged to the component it belongs to. You didn't write any of that syntax. You said the why in plain language and the agent shaped it.
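Because each reasoning line follows the same `kind(scope): text` shape, the record is easy to read back mechanically. The sketch below parses the message shown above into structured entries. The five kinds are taken from that example; the names `Reasoning` and `parse_message` are hypothetical, and engraph's own parser may treat the format differently.

```python
import re
from dataclasses import dataclass

KINDS = ("intent", "decision", "rejected", "constraint", "learned")
RECORD = re.compile(r"^(" + "|".join(KINDS) + r")\(([^)]+)\):\s*(.+)$")


@dataclass(frozen=True)
class Reasoning:
    kind: str   # one of KINDS
    scope: str  # the id the line is tagged to, e.g. component-button
    text: str   # the free-text reasoning


def parse_message(message: str) -> tuple[str, list[Reasoning]]:
    """Split a saved message into its summary line and its reasoning entries."""
    lines = message.strip().splitlines()
    summary = lines[0] if lines else ""
    records = []
    for line in lines[1:]:
        match = RECORD.match(line.strip())
        if match:  # blank lines and unrecognized lines are skipped
            records.append(Reasoning(*match.groups()))
    return summary, records


MESSAGE = """\
docs(button): primary button vertical padding 8px -> 12px

intent(component-button): reduce mobile mis-taps on the primary action
decision(component-button): set vertical padding to 12px
rejected(component-button): 16px - broke the dense pricing table layout
constraint(component-button): must stay legible in the dense table context
learned(component-button): touch-target comfort beats visual tightness on mobile here
"""

if __name__ == "__main__":
    summary, records = parse_message(MESSAGE)
    print(summary)
    for record in records:
        print(f"{record.kind:<10} {record.scope}: {record.text}")
```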

Now the value lands on the other side. Three months later a new designer is about to widen that padding again. They ask:

New designer: Why is the primary button padding 12px? Can I make it bigger?

Agent: It was set to 12px in May to cut mobile mis-taps. 16px was tried and rejected because it broke the dense pricing table - there's a standing constraint that the button has to stay legible in that table. If you want it bigger, that's the thing you'd need to solve first.

The agent didn't guess and it didn't read your mind. It reconstructed that answer from the reasoning you filed when you made the change. That's the whole loop: capture the why at the cheap moment, get it back at the expensive one. The "why was the ghost button removed" question that used to dead-end in a deleted thread now has an answer, because the removal was filed with its reason.
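The retrieval half of that loop is, at its simplest, a filter over what was filed. A minimal sketch, assuming the reasoning lines have already been pulled out of history into records: the `Record` type, the `why` function, the dates, and the `spacing-scale` entry are all hypothetical, and a real agent would reason over these records rather than just list them.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    saved_on: str  # ISO date of the save (hypothetical values below)
    kind: str      # intent, decision, rejected, constraint, learned
    scope: str     # the frontmatter id the reasoning is tagged to
    text: str


def why(history: list[Record], scope: str) -> dict[str, list[str]]:
    """Group everything filed against one scope by kind, oldest first."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for record in sorted(history, key=lambda r: r.saved_on):
        if record.scope == scope:
            grouped[record.kind].append(record.text)
    return dict(grouped)


HISTORY = [
    Record("2026-05-12", "decision", "component-button", "set vertical padding to 12px"),
    Record("2026-05-12", "rejected", "component-button", "16px - broke the dense pricing table layout"),
    Record("2026-05-12", "constraint", "component-button", "must stay legible in the dense table context"),
    Record("2026-04-02", "decision", "spacing-scale", "keep the 4px base unit"),
]

if __name__ == "__main__":
    for kind, texts in why(HISTORY, "component-button").items():
        print(kind, "->", texts)
```

Asking about a scope nothing was filed against returns an empty result, which is the honest answer: the why was never captured, so it can't be given back.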

You can save as often or as rarely as you like. One change with a reason, or a morning's worth of edits saved together with a note covering them. The agent handles the mechanics either way.

Part 3 - Conventions and verification

This is what this team asked for without quite naming it: a way to set house rules per folder and confirm they're followed, and a way to check that a specific kind of change was done correctly.

It helps to separate two kinds of check, because they're enforced differently.

The mechanical kind is anything with a definite right answer. Every doc in /components must have frontmatter with status, owner, and last_reviewed. Every component doc must contain a Usage section, a Do / Don't section, and an Accessibility section. Internal links must resolve. Images must live in the component's own folder, not be pasted in from elsewhere. File names follow the agreed pattern. These need no judgment, so they run as a plain check:
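Two of those rules, required frontmatter fields and required sections, can be sketched as a plain function to show what "no judgment" means in practice. This is an illustration, not what `engraph validate` runs: `check_component_doc` is a hypothetical name, and the sketch assumes sections are markdown `#` headings and frontmatter is flat `key: value` lines.

```python
import re

REQUIRED_FIELDS = ("status", "owner", "last_reviewed")
REQUIRED_SECTIONS = ("Usage", "Do / Don't", "Accessibility")
HEADING = re.compile(r"^#{1,6}\s+(.+?)\s*$")


def check_component_doc(text: str) -> list[str]:
    """Return a list of problems; an empty list means the doc passes."""
    lines = text.splitlines()
    fields: dict[str, str] = {}
    body = lines
    if lines and lines[0].strip() == "---":
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                body = lines[i + 1:]  # everything after the closing '---'
                break
            key, sep, value = line.split(" #", 1)[0].partition(":")
            if sep:
                fields[key.strip()] = value.strip()
    headings = set()
    for line in body:
        match = HEADING.match(line)
        if match:
            headings.add(match.group(1))
    # A field that is present but empty counts as missing.
    problems = [f"missing frontmatter field: {name}"
                for name in REQUIRED_FIELDS if not fields.get(name)]
    problems += [f"missing section: {name}"
                 for name in REQUIRED_SECTIONS if name not in headings]
    return problems


if __name__ == "__main__":
    doc = (
        "---\nid: component-button\nstatus: active\nowner: design-systems\n---\n"
        "# Button\n\n## Usage\nText.\n\n## Accessibility\nText.\n"
    )
    for problem in check_component_doc(doc):
        print(problem)
```

Every check here either passes or fails, with nothing to argue about, which is what makes it safe to run unattended.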

npx engraph validate

By default that only reads and reports, changing nothing: which doc is missing required frontmatter, which links are broken, which sections aren't where they should be. When the fix is obvious and safe, such as a missing frontmatter field, you can let it correct things:

npx engraph validate --fix

The judgment kind is where there's no single right answer and someone has to reason about it. Was this component actually deprecated correctly - status flipped to deprecated, a replacement linked, the entry pulled from the index, and a note left on why? Does this rewritten Usage section still match how the team actually talks about the component, or has the voice drifted? Those can't be settled by a rule, so the agent does the reasoning, checking the change against the conventions and the decision history before it lands. The mechanical checks are the guardrail; the judgment checks are the reviewer who knows the house.

For this team this is the difference between a folder that slowly contradicts itself and one where deprecating a component is a single understood move that always gets done the same way, where a new contributor's first edit is held to the same standard as a five-year veteran's, and where the reason behind every convention is one question away instead of lost.

None of this asks the team to change how they work. They still edit markdown. They still ask their agent questions. The only thing that's different is that saving a change now carries a reason, and the folder has rules it actually enforces instead of a style guide nobody opens.

That habit is worth having even if you never run engraph at all - a knowledge base where every change records why it was made, and where the house rules are checked rather than hoped for, is a better knowledge base. Engraph just makes the why something you get back later instead of something you trusted yourself to remember. Which, a few contributors and a few months in, you won't.