The Age of the Personal OS

Where AI Earns Its Keep
Anatomy of a Personal OS
Why Grep Beats Vectors (At This Scale)
The Value Inversion
How This Scales to Teams
The Substrate Is the Strategy

Most of the conversation around AI in 2026 is about model capabilities. New benchmarks, new context windows, new reasoning modes — every week another chart claiming the frontier moved. The more interesting question, the one that gets less airtime, is what the model is consuming. The output quality of any AI workflow is bounded less by the model these days than by the substrate it is given to chew on. Change the substrate and the same model gives you a different system.

Andrej Karpathy published a quiet signal in April. In a short gist titled LLM Wiki, he described how he had stopped using LLMs primarily for code generation and started using them to maintain a personal markdown knowledge base — plain text files, no vector database, no RAG pipeline, no embeddings (Karpathy, 2026). His own framing of the division of labor is the line worth holding onto: "You're in charge of sourcing, exploration, and asking the right questions. The LLM does all the grunt work — the summarizing, cross-referencing, filing, and bookkeeping." That is not a tooling preference. It is a thesis about what AI is good at, and the inversion of value it implies.

The thesis: AI is excellent at curation and execution and bad at originating thought. That asymmetry, taken seriously, says the new commodity is not the model. It is the individual who has structured their expertise into something the model can act on. The personal OS — an opinionated, machine-consumable knowledge base — is the substrate of the AI era. And the same architecture, scaled to a team, is how organizational knowledge stops evaporating.

Where AI Earns Its Keep

The argument that LLMs do not really think has been made repeatedly, and the strongest version of it has come from the loudest critic. Yann LeCun has spent the last two years calling pure autoregressive models a dead end on the way to anything resembling human-level intelligence (pattern matching at scale, not reasoning), and in late 2025 he left Meta and raised over a billion dollars to build a different architecture (Newsweek, 2025). Whether or not you find his alternative bet convincing, the diagnosis is widely shared even among people who disagree with the cure: LLMs cannot reliably plan, cannot reason over long horizons, and cannot be trusted to be right about facts without external grounding. They confidently generate the shape of an answer regardless of whether the answer exists.

What gets less attention is the part LeCun and his opponents both concede: LLMs work well when paired with the right external scaffolding. Give a model good context, the right tools, and a tight loop, and it produces remarkably good work. Strip those away and it confabulates. The capability ceiling is largely a context problem, not a model problem. I have argued this from a different angle in "LLMs Are Not Intelligent, and That's Okay": the point of acknowledging the absence of intelligence is not to dismiss the technology but to figure out what it is actually useful for.

What it is actually useful for is a specific shape of work: summarizing, cross-referencing, restating, traversing, filing, formatting, comparing, and executing well-specified tasks against well-specified context. That is not nothing. It is, in fact, most of the busywork that surrounds knowledge work — the bookkeeping, in Karpathy's framing. The trick is that the model needs someone else to decide what is worth doing, what is true, and how to recognize when the output is wrong. That someone else is you. The reframe that follows is simple: stop asking how smart the model is. Start asking how good the substrate you are giving it is.

Anatomy of a Personal OS

I have been running my own personal OS for about a year. The specifics of how it is wired together matter less than the four ideas it is built on — those ideas are the part that travels, regardless of which tools or formats you end up choosing. None of them are obvious, and most "second brain" advice in circulation manages to miss all four.

The first idea is that structure should live in the writing, not in the folder tree. A neat hierarchy of folders is a navigation crutch for humans walking a tree with their eyes; it is also the thing that quietly prevents the most valuable kind of insight, which is the one that connects two domains you would never have filed in the same place. A clustering algorithm and a pricing strategy might share the same underlying pattern, and a folder tree will keep them strangers forever. Let the pages sit flat, let the connections form through how you write, and the graph starts doing work the hierarchy could not.

The second idea is that confidence belongs in the metadata. Most note systems treat every note as equal: a half-formed hunch looks identical to a settled conviction once it is filed. That is not a knowledge base, it is noise with a search bar. A personal OS that distinguishes a hypothesis from a current best guess from a hard-won conclusion lets both you and your AI consumer act differently on each — citing the confident pages, hedging on the partial ones, refusing to present the speculative ones as fact. The same distinction also forces you to be honest with yourself about what you actually know.
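
One way to make that distinction machine-readable is a short header at the top of each page. The sketch below is illustrative only: the header format, the three level names (hypothesis, best_guess, conclusion), and the function names are assumptions chosen for this example, not a description of any particular tool.

```python
from dataclasses import dataclass

# Hypothetical labels: the essay names the distinction, not these exact strings.
CONFIDENCE_LEVELS = ("hypothesis", "best_guess", "conclusion")


@dataclass
class Page:
    title: str
    confidence: str
    body: str


def parse_page(text: str) -> Page:
    """Parse a page made of 'key: value' header lines, a blank line, then the body."""
    header, _, body = text.partition("\n\n")
    meta = {}
    for line in header.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip().lower()] = value.strip()
    # Assumption: a page with no label is treated as the weakest level.
    confidence = meta.get("confidence", "hypothesis")
    if confidence not in CONFIDENCE_LEVELS:
        raise ValueError(f"unknown confidence level: {confidence!r}")
    return Page(meta.get("title", "untitled"), confidence, body.strip())


def usage_policy(page: Page) -> str:
    """Map a confidence level to how an AI consumer should treat the page."""
    return {
        "conclusion": "cite",
        "best_guess": "hedge",
        "hypothesis": "do_not_present_as_fact",
    }[page.confidence]
```

An agent's instructions can then refer to the policy string rather than asking the model to guess how settled a page is.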

The third idea is that hard-won mistakes deserve their own space. The most expensive knowledge you own is the scars — the things you got wrong, the corner you painted yourself into, the assumption that quietly cost you a quarter. Most of that information evaporates into casual summaries because there is no obvious place to put it. Giving it a dedicated home, separate from the polished narrative of what you learned, is the difference between knowledge that compounds and knowledge that gets re-learned every two years. This is the part Andy Matuschak's evergreen notes tradition gets right about atomic, concept-oriented writing (Matuschak, 2020), and the part most corporate documentation gets wrong by burying every painful lesson under a tidy executive summary.

The fourth idea is that the operating manual should fit in a small model's head. Whatever instructions tell an AI agent how to navigate your knowledge should be short enough that even a cheap local model can hold the whole thing at once. That sounds like a tooling detail; it is actually a bet about where AI is going. Most consumption of your knowledge over the next decade will not be done by frontier models at frontier prices. It will be done by small models running locally, repeatedly, for almost nothing. A personal OS designed for that world is built to compound for years. One designed only for the largest model available today will need to be rebuilt the moment the price of intelligence drops by another order of magnitude — which, on current trends, will be soon.
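
A rough way to keep yourself honest about that constraint is to check the manual against a budget. This is a minimal sketch under loud assumptions: the four-characters-per-token figure is a crude heuristic rather than a real tokenizer, and the 10% share of the context window is an arbitrary illustrative threshold.

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate assuming about four characters per token (a heuristic, not a tokenizer)."""
    return max(1, len(text) // 4)


def fits_budget(manual: str, context_window: int, share: float = 0.1) -> bool:
    """Return True if the manual uses at most `share` of the model's context window.

    `share` is a hypothetical threshold; pick whatever leaves room for the pages themselves.
    """
    return estimate_tokens(manual) <= int(context_window * share)
```

For a real check, swap the heuristic for the tokenizer of the model you actually run.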

Why Grep Beats Vectors (At This Scale)

The reflex of 2026, when you say the word "knowledge base," is to reach for embeddings and a vector database. Chunk the documents, embed the chunks, store the vectors, retrieve by cosine similarity, feed top-k into the prompt. It is the default architecture, and for a personal OS at the scale of a few hundred or a few thousand markdown files, it is the wrong architecture.

The cleanest evidence is in the IR literature. The BEIR benchmark, which tested nine state-of-the-art retrieval models across seventeen diverse datasets in a zero-shot evaluation, found that BM25 — the old, boring keyword-based retrieval algorithm from the 1990s — was a stubbornly hard baseline to beat, and that more computationally efficient dense models could substantially underperform it depending on the task and domain (Thakur et al., 2021). Dense embeddings shine when you need semantic transfer across domains and a corpus large enough to amortize the indexing cost. Sparse and lexical methods win when terminology is precise and the corpus is narrow. Personal knowledge is both.
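
For a sense of how little machinery the lexical baseline needs, here is a minimal BM25 scorer. It is a sketch, not the implementation evaluated in BEIR: the whitespace tokenizer is deliberately naive, the parameter defaults (k1 = 1.5, b = 0.75) are common illustrative values, and the IDF term uses the non-negative "+1 inside the log" variant.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in a non-empty list against the query with BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: the number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length normalization: long documents need more hits to score as high.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

There is no index to build and nothing to keep in sync: rewriting a page changes its score the next time the function runs.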

Karpathy's pattern skips retrieval altogether. You point the agent at the directory; the agent reads the index, traverses the links, and pulls the pages it needs into context. That works because two things have changed at the same time: context windows are now generous enough to hold a meaningful slice of a personal knowledge base, and the documents themselves are small, hand-edited, and densely linked — exactly the kind of corpus where graph traversal beats fuzzy similarity search. There is no retrieval layer to misbehave, no chunking strategy to tune, no embedding model to keep in sync when you rewrite a page. The substrate is the index.
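
The traversal itself can be sketched in a few lines. The example below assumes pages are held in a dictionary keyed by name and linked with [[Page Name]] style wiki links; both are assumptions made for illustration, as is the starting page name "index". In practice the agent performs this walk with its own file-reading tools.

```python
import re
from collections import deque

# Assumption: links look like [[Page Name]], optionally followed by |alias or #section.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")


def collect_context(pages, start="index", max_pages=20):
    """Breadth-first walk over links from `start`; returns page names in visit order.

    `pages` maps a page name to its text. `max_pages` caps how much is pulled into context.
    """
    seen = {start}
    queue = deque([start])
    order = []
    while queue and len(order) < max_pages:
        name = queue.popleft()
        if name not in pages:
            continue  # dangling link: nothing to read
        order.append(name)
        for target in WIKI_LINK.findall(pages[name]):
            target = target.strip()
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order
```

Pages that nothing links to never reach the context, which is a useful signal that a note has not yet been connected to the rest of the graph.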

The broader point is that infrastructure complexity should match knowledge scale. Vector databases are a great answer to a problem most individuals do not yet have. At the scale of one person's expertise, a well-structured flat directory plus grep plus a model that can follow links is faster, cheaper, more debuggable, and produces fewer surprises. The same logic flips at the scale of a million unstructured corporate documents — at that scale, embeddings start to earn their keep. The mistake is treating the corporate-scale answer as the universal answer.
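
At this scale the "grep" half of that stack is a few lines of standard library code. The sketch below assumes a flat directory of UTF-8 .md files; the function name and return shape are hypothetical.

```python
import re
from pathlib import Path


def grep_notes(root, pattern, ignore_case=True):
    """Yield (file name, line number, line) for matching lines in *.md files directly under `root`."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    # Flat directory by assumption: no recursion into subfolders.
    for path in sorted(Path(root).glob("*.md")):
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                yield path.name, number, line.strip()
```

Every hit names a file and a line, so a surprising result can be traced by opening the file, which is the debuggability argument in concrete form.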

The Value Inversion

If AI is the executor and the personal OS is the substrate, the obvious question is what that does to the value of the person sitting between them. The answer is the inversion that makes this moment interesting.

Execution becomes cheap. The work AI is good at — summarizing, drafting, reformatting, cross-referencing, the long tail of motion that fills most knowledge-work calendars — used to cost expensive hours. It now costs the price of a few tokens. What does not get cheaper is the upstream work: deciding what is worth learning in the first place, recognizing the load-bearing ideas inside a source, noticing when two domains share the same pattern, distinguishing a genuine scar from a one-off accident, and holding the conviction long enough to act on it. None of that is in the model. All of it lives in the person and in the structure they have built around their own thinking.

The economic consequence of that is what I explored in "The Billion-Dollar Blind Spot: Why One-Person Companies Will Eat the SMB Market." The one-person company thesis is sometimes read as a story about scrappy generalists doing a little of everything badly. It is not. It is a story about specialists with the right substrate (a deep, well-structured personal OS in their domain, paired with AI that can execute against it) doing what used to require a team. The leverage is not in the AI alone. It is in the combination of human conviction and machine execution, and the substrate is the interface that makes the combination possible.

The honest version of this is that the substrate only works if you have actually thought. AI amplifies what is in your OS; it cannot generate the conviction, the scars, or the taste. Feed it shallow notes and it will produce confident shallow output, faster than you ever could. The same principle I wrote about in "You Can't Vibe Code Past Your Own Engineering Judgment" applies here: the tool's leverage tracks the skill it is leveraging. If the underlying thinking is not there, AI does not fix that; it surfaces it, at scale, in public. The personal OS forces the upstream work because it has nowhere to hide an unfinished thought.

How This Scales to Teams

The pattern compounds when you scale it. If every member of a team maintains a personal OS in their domain — opinionated, kept current, machine-consumable — then the team's collective knowledge graph becomes the organization's real operating system, regardless of what the wiki says. Decisions get cited. Tacit knowledge gets externalized. New hires onboard against actual thinking rather than against the polite, sanitized version of thinking that lives in corporate documentation.

The cost of not doing this is what the organizational research has been measuring for years. A 2023 review in The Learning Organization synthesized the empirical literature on knowledge loss from employee turnover and reached a number worth holding onto: roughly 42% of the expertise an employee uses in their role is known only to them, is never written down, and cannot be filled in by a replacement (Čabrilo & Ndou, 2023). The review's central finding is that the loss of tacit knowledge (the kind that lives in heads and habits) does more damage to organizations than the loss of explicit knowledge, which is at least theoretically recoverable from documentation. Every departure is a quiet rewrite of the team's actual capability, and most organizations have no mechanism for slowing it.

The personal-OS-per-person pattern is the mitigation, and it works precisely because of what makes traditional wikis fail. Wikis fail because they are written for nobody in particular, maintained by nobody in particular, and reviewed by nobody at all. Personal OSes work because they are written for the person who owns them, with AI as the bookkeeping co-maintainer, and that selfish authorship is exactly what keeps them honest and current. When a colleague leaves, their OS does not become an archive nobody reads. It becomes a graph the rest of the team can ingest and cite. The knowledge stays.

This is also the missing piece in most of the current conversation about how to design AI-era teams. I argued in "We're Designing AI-Era Teams Without a Blueprint" that the dominant failure mode is treating AI adoption as a procurement decision rather than a team-design problem. The personal OS pattern is one of the few concrete answers to "what does the team-design version actually look like": not a tool to roll out, but a practice that turns individual expertise into compounding institutional capability. As the team grows, the knowledge grows with it, instead of leaking out the door every time someone moves on.

The Substrate Is the Strategy

Most strategic conversations about AI are still about the model layer: which one, how much, when to switch. That is the easy half of the question, and increasingly the part that is being commoditized in public. The harder half — the half nobody else can do for you — is what you put underneath it. What you have spent years thinking carefully about. What you have learned the hard way. What you are willing to commit to paper in a form that compounds rather than scatters.

AI handed everyone the same execution layer overnight. What it cannot hand anyone is the substrate. That part you have to build yourself, in whatever domain you actually know — and the people, and the teams, that take this seriously now will compound for years while the rest are still shopping for tools. The future commodity in knowledge work is not the model. It is the person who built something worth feeding it.

References

Čabrilo, S., & Ndou, V. (2023). Knowledge loss induced by organizational member turnover: a review of empirical literature, synthesis and future research directions (Part I). The Learning Organization. https://www.emerald.com/insight/content/doi/10.1108/tlo-09-2022-0107/full/html

Karpathy, A. (2026). LLM Wiki. https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f

Matuschak, A. (2020). Evergreen notes. https://notes.andymatuschak.org/Evergreen_notes

Newsweek. (2025). Yann LeCun, one of the 'Godfathers of AI,' says LLMs are on their way out. https://www.newsweek.com/nw-ai/ai-impact-interview-yann-lecun-llm-limitations-analysis-2054255

Thakur, N., Reimers, N., Rücklé, A., Srivastava, A., & Gurevych, I. (2021). BEIR: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models. Thirty-Fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2). https://arxiv.org/abs/2104.08663