Managing Tribal Knowledge for Engineers: A Practical Guide
By Kaia Colban
Someone quits. Within a week you realize they were the only one who knew why the payment service retries exactly three times. Not in the code. Not in any ticket. Just in their head, because they were there when the decision was made, six months ago, in a Slack thread nobody bookmarked.
This is tribal knowledge. And every engineering team has more of it than they think.
What tribal knowledge actually is
Tribal knowledge isn't secrets. It's the gap between what's in the codebase and what someone needs to know to work in it safely. It includes:
- Architectural decisions without artifacts. Why did you pick Postgres over MongoDB? Why does the auth layer sit outside the API gateway? If the answer is "someone decided that two years ago," it's tribal.
- Implicit contracts between systems. The billing service expects the user ID to always be a UUID, never an integer, even though the database would accept either. That constraint exists only in the muscle memory of the engineers who were burned by it.
- Failed approaches. The migration that got halfway and had to be rolled back. The library that was evaluated and rejected. The refactor that was abandoned. These dead ends are some of the most valuable information a team has, and the hardest to find.
- Context behind closed tickets. The Jira ticket says "fix race condition in checkout." It doesn't say what you tried first, why it failed, or what the actual root cause turned out to be.
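The implicit-contract case above is one of the few kinds of tribal knowledge that can be moved into code directly. The following is a minimal sketch, not a real billing API: `parse_user_id` and `BillingRequest` are hypothetical names invented for illustration. It assumes the constraint is exactly the one described (UUIDs only, never integers) and enforces it at the service boundary, with the reason written next to the check.

```python
import uuid
from dataclasses import dataclass


def parse_user_id(raw: object) -> uuid.UUID:
    """Accept only UUID user IDs (hypothetical billing contract).

    The database column would accept an integer too, so this check is
    the only place the UUID-only rule is written down.
    """
    if isinstance(raw, uuid.UUID):
        return raw
    if not isinstance(raw, str):
        # Integers (and anything else non-string) are rejected outright.
        raise TypeError(f"user_id must be a UUID string, got {type(raw).__name__}")
    try:
        # uuid.UUID accepts any string form the standard library can parse.
        return uuid.UUID(raw)
    except ValueError as exc:
        raise ValueError(f"user_id is not a valid UUID: {raw!r}") from exc


@dataclass(frozen=True)
class BillingRequest:
    """Hypothetical request type; the field names are assumptions."""

    user_id: uuid.UUID
    amount_cents: int

    @classmethod
    def from_raw(cls, user_id: object, amount_cents: int) -> "BillingRequest":
        return cls(parse_user_id(user_id), amount_cents)
```

With a check like this, an engineer who passes an integer ID gets an immediate, explained failure instead of learning the rule from an incident.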
Engineers spend 19% of their time searching for information or asking colleagues. That's not a productivity problem. That's a knowledge problem.
Why the standard playbook doesn't work
The standard answer to tribal knowledge is documentation. Write it down. Run post-mortems. Keep a decision log. Maintain an ADR folder.
None of this scales.
Documentation is the past trying to remember itself. By the time an engineer sits down to write up why they made a decision, the context is already stale. The subtleties that mattered in the moment (the three alternatives they rejected, the constraint they discovered halfway through, the thing that nearly broke prod) don't make it into the doc. What makes it in is a sanitized summary written by someone whose next meeting starts in ten minutes.
Post-mortems help for incidents. They don't capture the hundreds of smaller decisions that accumulate into a codebase's character over years.
ADRs are better, but they require engineers to proactively write them, maintain them, and link to them. In practice, ADR folders fill up for a few months and then go dark. The team moves fast. The ADR becomes another thing to feel guilty about.
AI coding tools made this significantly worse
Here's a thing most engineering leaders haven't fully reckoned with yet: the shift to AI coding tools didn't just change how engineers work. It moved where decisions happen.
When an engineer writes code with a compiler and their own brain, the reasoning lives in their head and sometimes in comments. When they work with Claude Code or Cursor, the reasoning plays out in a conversation, a back-and-forth with the model where they explore approaches, reject options, discover constraints, and arrive at a solution.
That conversation is where the why lives now. And it disappears the moment the session ends.
Teams with high AI adoption merge 98% more PRs. PR review time increases 91%. Engineers touch 47% more PRs per day. The output is larger and faster. The reasoning behind each decision is less visible than ever.
The commit message says "refactor auth middleware." The AI session contains: the three approaches that were considered, the security constraint that ruled out the first one, the performance issue that killed the second, and the specific edge case that the third one handles. None of that makes it into the commit. None of it makes it into the ticket. It exists only in a chat window that's already been closed.
A practical guide to actually managing it
1. Stop trying to capture knowledge after the fact
The reason documentation fails is timing. Once a decision is made, the engineer moves on to the next task. Ask them to document it a week later and you'll get a skeleton. Ask them six months later and you'll get nothing.
The only moment when a decision is fully understood is when it's being made. That's when the tradeoffs are alive in the engineer's mind. That's when the reasoning is complete.
Capture needs to happen at the point of work, not after it.
2. Make the capture zero-cost
Any system that asks engineers to add steps to their workflow will be abandoned. This is not a character flaw. It's rational behavior under deadline pressure.
The only documentation that survives in practice is documentation that is a byproduct of work, not additional work. Commit messages survive because they're required to commit. Lint rules survive because they run in CI. Post-mortems survive because there's a template, a deadline, and a post-mortem meeting.
If your knowledge management system adds friction, it will fail. Full stop.
3. Treat search as the primary interface
The goal of managing tribal knowledge is not to create a database. It's to make the right information findable at the moment someone needs it.
A wiki nobody searches is not a knowledge system. An ADR folder with no search is not a knowledge system. A Slack channel with answers buried in threads is not a knowledge system.
Whatever form your knowledge takes, the test that matters is: "How long does it take a new engineer to find out why this decision was made?" If the answer is "they can't," the system isn't working.
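To make "search as the interface" concrete, here is a deliberately naive sketch of a keyword search over a set of notes. Everything in it is an assumption for illustration: the `search` function, the `STOPWORDS` list, and the idea that notes are available as a name-to-text mapping. A real system would more likely use an existing search engine, but the shape of the question is the same.

```python
import re

# Words that carry no signal in a "why" question (illustrative list only).
STOPWORDS = {"a", "an", "the", "why", "does", "do", "is", "was", "of", "to", "in"}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query: str, documents: dict[str, str], limit: int = 5) -> list[str]:
    """Rank document names by how many distinct query terms each contains."""
    terms = tokenize(query) - STOPWORDS
    scored = []
    for name, text in documents.items():
        hits = len(terms & tokenize(text))
        if hits:
            scored.append((hits, name))
    # Most matching terms first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]
```

Called with a query such as "why does the payment service retry three times", it returns the names of the notes that share the most words with the question. It matches exact words only, so "retry" will not find "retries"; that limitation is one reason to reach for a real search tool once the approach proves useful.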
4. Prioritize decisions, not just outcomes
Most knowledge management systems index on outcomes: this PR, this ticket, this commit. The thing that shipped.
Tribal knowledge lives in the reasoning that produced the outcome. The approach that didn't ship matters as much as the one that did, because it prevents the next engineer from repeating the same exploration.
Capture decisions. Capture the alternatives that were rejected and why. Capture the constraints that weren't in the original ticket but showed up during implementation.
5. Keep a team "why log" for high-stakes decisions
Not everything needs to be captured. Most decisions are reversible, low-stakes, and would produce boring documentation.
The decisions worth capturing are the ones that would be expensive to re-derive. Architectural choices. Security constraints. Performance tradeoffs. Integrations with external systems. Things that are load-bearing.
A lightweight "why log" (a Notion page, a markdown file in the repo, a dedicated Slack channel) forces the team to at least identify which decisions matter enough to record. The bar should be: "Would a new senior engineer get burned by not knowing this?"
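If the why log lives as a markdown file in the repo, a fixed entry shape keeps it cheap to write and easy to scan. The sketch below is one possible shape, not a standard: `WhyEntry` and its fields are hypothetical, chosen to mirror what this guide says is worth recording (the decision, the reasoning, and the rejected alternatives).

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class WhyEntry:
    """One why-log entry; the field names are assumptions for illustration."""

    title: str
    decision: str
    reasoning: str
    rejected: list[str] = field(default_factory=list)
    decided_on: date = field(default_factory=date.today)

    def to_markdown(self) -> str:
        """Render the entry as a markdown section ready to append to a file."""
        lines = [
            f"## {self.title} ({self.decided_on.isoformat()})",
            "",
            f"Decision: {self.decision}",
            "",
            f"Why: {self.reasoning}",
        ]
        if self.rejected:
            lines += ["", "Rejected alternatives:"]
            lines += [f"- {alt}" for alt in self.rejected]
        return "\n".join(lines) + "\n"


# Illustrative values only: the real reasoning is whatever your team decided.
entry = WhyEntry(
    title="Payment service retry count",
    decision="Retry exactly three times",
    reasoning="(placeholder) record the actual constraint here",
    rejected=["(placeholder) alternative that was considered"],
)
print(entry.to_markdown())
```

The entry deliberately has no field for implementation detail. The code already records what was built; the log only needs what the code cannot say.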
6. Use onboarding as a diagnostic
The best way to find out where your tribal knowledge is concentrated is to watch a new engineer try to become productive.
New senior engineers take an average of nine months to reach full productivity. That's not incompetence. That's knowledge transfer latency. Every question they ask is a data point: this is something the team knows that isn't written down anywhere.
Instrument onboarding. Track where new engineers get stuck. Turn those sticking points into documentation or, better yet, into searchable artifacts that don't require documentation at all.
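Instrumenting onboarding can be as simple as tallying the questions new engineers ask. A minimal sketch, assuming someone logs each question with a topic and whether existing docs answered it; the `Question` type and `undocumented_hotspots` function are hypothetical names, not part of any tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    """One question asked during onboarding (hypothetical record shape)."""

    asker: str
    topic: str
    answered_from_docs: bool


def undocumented_hotspots(questions: list[Question], top: int = 3) -> list[tuple[str, int]]:
    """Return the topics with the most questions that the docs could not answer."""
    counts = Counter(q.topic for q in questions if not q.answered_from_docs)
    return counts.most_common(top)
```

The topics at the top of that list are where tribal knowledge is most concentrated, which makes them the first candidates for a why-log entry or a searchable artifact.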
Where AI goes from problem to solution
The same AI sessions that create the tribal knowledge problem can solve it, if you capture them.
When an engineer works through a problem in Claude Code, the session contains everything: the initial approach, the dead ends, the pivots, the final solution, and the reasoning at every step. It's more complete than any post-hoc documentation could be, because it was written in real time by the person who understood the problem most fully.
Teams that capture their AI sessions get a searchable record of how every decision was made. A new engineer inheriting a system can read through the sessions that built it. They can search for "why does the payment service retry three times" and find the actual conversation where that was decided, including the alternatives that were considered and rejected.
This is what Lore does. Engineers work normally. Sessions sync automatically. The team gets a searchable knowledge base that reflects actual work, not sanitized summaries written after the fact.
The thinking is already happening. Now your team can keep it.
The honest summary
Tribal knowledge is unavoidable. Every team has it. The question is whether it's managed or unmanaged.
The playbook that works:
- Capture at the point of work, not after
- Make capture zero-cost or don't bother
- Build for search, not for storage
- Focus on decisions and reasoning, not just outcomes
- Keep a "why log" for the decisions that would be expensive to re-derive
- Use onboarding as a diagnostic for what's missing
The AI era has made this harder by moving reasoning into sessions that disappear. It has also made it easier: if you capture those sessions, you have something more complete than any documentation standard could produce.
The work is already happening. The question is whether it vanishes when the session ends, or stays with the team.