The Context That Didn’t Transfer
How to document AI-assisted code by capturing decisions, trade-offs, and definition of done
When AI ships code without the definition of done
A colleague once built a feature with AI that none of us on the team had touched before.
It was a row configuration component: something to let users dynamically control how rows and sub-rows appeared in a table. An improvement to the user flow. And my colleague wasn’t a frontend specialist. He leaned on AI to get it done, and by most measures, it worked.
Then it became my problem to understand it.
The code was clean. Readable, even. But the more I traced through it, the more unsettled I felt. The component impacted multiple views across the application, and I couldn’t work out how it was supposed to behave in each of them. There was no definition of done anywhere. No clear record of what “correct” looked like before the first line was written.
The code looked fine. I just had no way to know if it was right.
I started pulling on threads trying to understand the effect of any change I made. Every time I thought I’d found the boundary, another piece of logic turned out to be connected.
We eventually fixed it, not by reading the code more carefully, but by running a design collaboration session. We got the team in a room, aligned on what the feature was actually supposed to do and look like in its first iteration, and worked backward from there. Once we had a shared definition of done, the code finally started to make sense.
But that workshop should have happened before the AI started working on it and before the first prompt was written.
The real gap: the conversation never makes it into the repo
For a while, I assumed this kind of thing was an AI problem. That something about how AI generates code makes it harder to follow.
Over time, I started to think that wasn’t quite right.
The code wasn’t undocumented. The conversation was.
When you build with AI, you’re not just writing code. You’re working through a problem in dialogue. You’re trying things, rejecting things, refining things. That back-and-forth, the questions you asked, the dead ends you ruled out, the constraint you finally remembered that changed everything, is where the real understanding lives.
The assumptions the model made. The decision context that shaped each iteration. The moment you said “no, that doesn’t work because...” and watched it adjust. None of that survives the handoff. It stays in the chat window, or it disappears entirely.
And when you hand off the code without any of that, you’re handing off the answer without the question.
When that person moves on, or just moves to a different part of the codebase, the context moves with them.
That’s not a new problem. Undocumented knowledge has always been expensive. But AI accelerates the gap between output and understanding in a way that feels different. When you write code by hand, the thinking tends to leave traces - comments, naming choices, the shape of the function. Not all of the reasoning, but some.
When you build with AI, the reasoning lives in the prompt. None of it makes it into the file. The code looks finished because it is finished. But the understanding is still sitting in a chat window somewhere. Or it’s already gone.
What to document when code was built with AI
Most documentation advice is about what the code does. API references, function signatures, data flows. That kind of documentation matters. But it doesn’t capture why the code exists in this shape, and that’s what a new developer actually needs when something breaks.
What I’ve started doing (and I’m still figuring out whether it’s the right approach) is treating context as something that needs to be transferred deliberately, not incidentally.
Write the decision, not the description
Instead of a comment that says // returns filtered list, write:
// filtering here instead of at the API level because the endpoint doesn't support partial queries - if that changes, this can be simplified
The second one tells the next person something they can actually use. The first one just describes what they can already read.
This applies to every meaningful choice made during an AI-assisted build: why this component structure, why this data shape, why this approach over the one the model suggested first.
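As a minimal sketch of what a decision-comment looks like in real code: the names here (`Row`, `filterRows`, the `status` field) are hypothetical, not taken from the actual component, and the backend limitation in the comment is the example constraint from above.

```typescript
type Row = { name: string; status: string };

// Decision: filtering client-side instead of at the API level because
// the endpoint doesn't support partial queries. If the API gains that
// support, this helper can be deleted and the filter pushed to the server.
function filterRows(rows: Row[], status: string): Row[] {
  return rows.filter((row) => row.status === status);
}
```

The comment costs one extra line, but it tells the next developer under what condition the code can safely change, which the code itself never could.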
Use PRs as decision logs — with a simple template
A PR description that says “adds filtering logic” is technically accurate and almost useless.
One that captures the actual decisions made during the build gives the next developer a starting point before they touch a single line. A template that I believe works better:
Problem: [what you were solving]
What I tried: [approaches that didn't make the cut, and why]
What I shipped: [the chosen approach]
Trade-offs: [what this decision costs, what it protects against]
Success criteria: [how you'd know this is working correctly]
It takes five minutes. It saves hours later.
Have the conversation out loud: a 15-minute handoff
Doing a short knowledge-sharing session after finishing anything complex built with AI. Not a formal presentation, sometimes just fifteen minutes with one or two people.
Walking through what the problem was, what you tried, what the AI suggested that you rejected and why. Something changes when you say it out loud. The person asking questions catches gaps you didn’t know were there. And you’re forced to articulate things that felt obvious in the moment but aren’t obvious at all to someone coming in fresh.
The design workshop that finally unlocked the row configuration component? That was this, just a bit late in the process.
Ask AI to surface its assumptions before you merge
Before accepting a generated solution, ask the model to explain what assumptions it made.
Something like: “What are you assuming about the data structure here?” or “What would break if this constraint changed?”
That usually surfaces two or three things you hadn’t consciously considered — and at least one of them is worth writing down before you close the tab.
Common mistakes that make AI-assisted code brittle
The code itself is rarely the problem. What makes AI-assisted code hard to inherit is usually one of a few things:
Clean code, unclear invariants. The implementation is tidy but no one wrote down what must always be true for it to work.
I ran into this with the same row configuration component. I extended the grouping logic to include additional columns — it looked like a straightforward improvement. What I didn’t know was that the backend wasn’t ready to handle those new aggregations. The requests started failing, and the fix was to pull back: restrict the dynamic configuration to a more static, pre-defined set of options.
The code had never said what it was assuming about the backend’s capabilities. So I changed something that looked safe. It wasn’t.
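One way to keep an assumption like that from staying invisible is to encode it as data plus a guard, so a "safe-looking" change fails loudly at the boundary instead of in production. This is a hypothetical sketch, not the actual fix we shipped; the names and the set of supported aggregations are invented for illustration.

```typescript
// Assumption made explicit: the backend currently supports only these
// aggregations for dynamic row grouping. Extending the set without
// backend support will break requests, so the guard names the constraint.
const SUPPORTED_AGGREGATIONS = new Set(["count", "sum"]);

function assertSupportedAggregation(kind: string): void {
  if (!SUPPORTED_AGGREGATIONS.has(kind)) {
    throw new Error(
      `Aggregation "${kind}" is not supported by the backend yet; ` +
        "check the row-configuration decision log before extending this set."
    );
  }
}
```

Now the invariant lives in the code, and the error message points the next developer at the reasoning instead of letting them rediscover the wall.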
No named trade-offs. Every non-trivial solution sacrifices something — and when that sacrifice isn’t named, the next person can’t tell if it was intentional or accidental.
The static configuration we landed on was a deliberate trade-off: fully dynamic row grouping wasn’t possible yet because the backend couldn’t support those aggregations. We knew that. But we didn’t write it down. So the constraint looked like a design choice, and the temporary fix looked like the intended solution.
Someone would have changed it again eventually. And hit the same wall.
No success criteria. This was the core problem with the row configuration component. Without a definition of done written down somewhere, there was no way to know if a change improved things, broke things, or just shifted the problem somewhere else.
Trade-offs and when lightweight ADRs help
Not everything needs a formal Architecture Decision Record. Most changes are small enough that a well-written PR description is the right level of documentation.
But some decisions have longer reach — a choice about state management, a structural pattern that will be replicated across the codebase, a trade-off that will shape how the next five features get built. Those are worth a short ADR: a single document that names the decision, the context behind it, the options considered, and the reasoning for what was chosen.
A lightweight format that works:
Decision: [one sentence]
Context: [what made this decision necessary]
Options considered: [what else was on the table]
Chosen approach: [what you went with and why]
Consequences: [what this makes easier, what it makes harder]
The rule of thumb: use a PR decision log for most changes, and reach for an ADR when the decision will still matter six months from now.
Checklist: context transfer before you ship
Before merging AI-assisted code, check:
[ ] Definition of done is written down — what does correct look like, and where is that documented?
[ ] Key decisions are named in comments — not what the code does, but why it’s shaped this way
[ ] PR description captures trade-offs — what was tried, what was chosen, what it costs
[ ] AI’s assumptions have been surfaced — you asked the model what it was assuming and wrote down anything non-obvious
[ ] Success criteria exist — there’s a way to know if a future change breaks this
[ ] At least one other person has heard the reasoning — a 15-minute walkthrough, a 1:1, a team session
[ ] For significant decisions: a lightweight ADR exists — especially if the choice will shape future work
I think about that row configuration feature sometimes. The colleague who built it wasn’t cutting corners. He was moving at the pace the team needed, using the tools available to him. The gap wasn’t effort or intention.
The gap was that no one had thought to ask: when this leaves your hands, what goes with it?
The code can outlast the person who wrote it. The question is whether the understanding can too.
Until next time,
Stefania
Articles from the ♻️ Knowledge seeks community 🫶 collection: https://stefsdevnotes.substack.com/t/knowledgeseekscommunity
Articles from the ✨ Dev Shorts collection:
https://stefsdevnotes.substack.com/t/frontendshorts
Articles from 🚀 The Future of API Design series:
https://stefsdevnotes.substack.com/t/futureofapidesign
🗣️ If this article resonated with you
If this article resonated with you, consider recommending Stef’s Dev Notes to someone who might enjoy it.
This newsletter grows mostly through thoughtful readers sharing it with others in the industry. I really appreciate every recommendation.
🤍 Some news, and a small gift for you
Stef’s Dev Notes is now a community partner for WeAreDevelopers.
If you enjoy thoughtful conversations about frontend, systems, and growth, this is one of the spaces where those ideas come together at a larger scale.
🎟 You can join WeAreDevelopers World Congress (23–25 September, San José)
and use this code Community_StefsDevsNotes for 10% off.
🚀 Something worth checking out
Elena | AI Product Leader launched draftkit.app last week — a shared workspace for creators and collaborators. I was one of the early beta testers and watching it take shape from the inside was a valuable lesson in what good iteration looks like. Her article on auditing her own workflow and finding a hidden 8.5-hour coordination tax is worth a read on its own, separate from the tool.
👋 Get in touch
Feel free to reach out to me, here, on Substack or on LinkedIn.





