AI Will Help You Drift With Confidence

The most expensive thing you can do with AI is hand it the wrong problem. It will not tell you. It will help you drift with confidence.

This is not a failure of the model. The model is doing exactly what it is supposed to do: taking your input and producing the most coherent, useful, well-structured output it can. The output will look right. The reasoning will sound logical. The prose will be clean, the code will compile, the spec will be thorough. You will not know you are heading the wrong direction until you are far enough in that the cost of turning back is real. That is the dangerous kind of failure — not the kind that announces itself, but the kind that compounds quietly until it is expensive.

The question is not whether AI is capable. It is whether you know enough about the problem to give it the right direction in the first place. That is a different skill entirely, and it is one that does not come from using AI more.

What Drift Actually Looks Like

Drift is not a sudden wrong turn. It is a slow erosion of intent that happens one plausible step at a time. You start with a problem that is slightly underspecified. The model interprets the ambiguity in a reasonable way. You build on that interpretation. The next question follows from the first answer. By the third or fourth exchange, you are working inside a frame that is coherent but subtly wrong, and nothing in the output flags this because coherence and correctness are not the same thing.

Research on context drift in multi-turn LLM interactions characterizes this as a compounding process: “memory limitations and information loss over conversation turns,” combined with “conflicting instructions that pull the model away from original goals” (Dongre et al., 2025). The paper’s central metaphor, drift as erosion rather than collapse, is exactly right. The model does not suddenly break. It continues producing high-quality output inside a frame that has shifted. If you are not tracking the frame, you will not notice the shift until the work is done.

What makes this hard to catch is that the output quality never degrades. A poorly framed question does not produce obviously bad answers. It produces internally consistent answers to the wrong question. The spec is thorough. The analysis is rigorous. The code works. The issue is not quality at the sentence or function level; it is direction at the problem level. And direction is the one thing the model cannot evaluate for you.

Why Prompting Does Not Fix This

The natural response to any AI failure mode is to look for a prompting fix. Be more specific. Add constraints. Use chain-of-thought. Structure the request better. These are genuine improvements and they are worth learning. They do not address the upstream problem.

Prompt engineering is a downstream skill. It optimizes execution inside a problem frame. It cannot fix a badly formed question because it operates after the question has been defined. If you are asking the wrong question with exceptional clarity, better prompting will produce a cleaner, more detailed answer to the wrong question. The precision makes the drift worse, not better, because you get further faster.

An empirical study of ChatGPT on problem formulation tasks in mission engineering supports this asymmetry. Even when models are given detailed prompts and explicitly instructed to maintain the correct level of abstraction, they still “produce solution-specific outputs that are inappropriate for problem formulation” and exhibit “great variability among parallel threads” (Ofsa & Topcu, 2025). The authors conclude that LLMs require “significant expert oversight” precisely because their lack of domain understanding limits their ability to recognize when the problem frame itself is wrong. The model cannot hold the abstraction level you need if it does not understand why that level matters. That understanding comes from domain knowledge, not from better instructions.

This is the prompting trap. You can spend hours refining a prompt chain that is optimizing in the wrong direction and every iteration will look like progress. The output gets sharper, more detailed, more confidently structured. The only check on this is knowing the domain well enough to recognize when the frame has drifted — and that check has to come from you.

Domain Knowledge Is the Upstream Check

You cannot catch drift in a domain you do not understand. The output looks right because you do not know what wrong looks like. Without a mental model of the territory, there is no reference point for evaluating direction. AI’s confidence — which is a feature, not a bug — becomes a liability when the person on the other side cannot calibrate it.

Cognitive science has studied how experts and novices differ in problem-solving for decades, and the finding that keeps reappearing is not about what experts know but about how they represent problems. Experts build hierarchically organized knowledge structures that let them recognize meaningful patterns rapidly, integrate surface features with deeper causal relationships, and — critically — identify when a problem is being framed at the wrong level. A domain expert reads an AI output and immediately notices when the framing is off because they have a prior model of what correct framing looks like. A novice has no such reference and evaluates the output on the only dimension available: does it look coherent?

This connects to the argument in “You Can’t Prompt Your Way Out of Ignorance”: domain knowledge is not a nice-to-have for AI use; it is the grounding layer that makes evaluation possible. Without it, you are navigating with a tool that points confidently in whatever direction you started walking. The tool is not wrong. You are the one who set the direction.

The same dynamic shows up in “AI Doesn’t Read Code. It Reads Patterns.” A model extending a codebase has no way to distinguish a good architectural decision from a bad one encoded in the existing patterns. It continues whatever it finds. Problem formulation is the same mechanism one level up: the model continues whatever frame you give it, with no ability to evaluate whether the frame is sound.

The Skill That Does Not Automate

Problem formulation is the ten minutes before you open the chat window. It is the discipline of defining what you are actually trying to solve before you ask anything. It requires enough domain knowledge to know what the problem is not, which constraints are real versus assumed, and which question will actually produce useful work when answered.

Microsoft Research’s 2025 survey on AI and critical thinking found that as AI takes on more cognitive tasks, knowledge workers identify two skills as increasingly critical: quality control of AI output (50%) and critical thinking, defined as the ability to analyze information objectively and make reasoned judgments (46%) (Lee et al., 2025). The broader New Future of Work Report frames this as a design challenge: AI should “scaffold reasoning” rather than substitute for it, supporting workers in deciding how to decide rather than collapsing that step (Microsoft Research, 2025). The point is that problem formulation is precisely the step that should not be delegated, and current AI tools make it very easy to skip.

The irony is that the better the tools get, the easier it becomes to skip this step. When AI produces fluent, confident, well-structured output from an underspecified input, there is little friction to slow you down. There is no error message. There is no blank page. There is just output that looks like progress. The discipline of formulating the problem well has to come from the person using the tool, and it has to come before the session starts.

“You Can’t Vibe Code Past Your Own Engineering Judgment” makes the same argument in a different register: the judgment layer does not automate; it gets amplified. A sharp problem definition produces dramatically better AI-assisted work than a vague one, and the ability to produce that sharp definition comes from experience, domain knowledge, and the willingness to think before prompting. None of those come from using AI more.

What This Changes About How You Work

The practical implication is not to use AI less. It is to slow down upstream, not downstream. The return on investment is concentrated in the problem statement, not the prompt chain. A few minutes spent asking whether you are solving the right problem is worth more than an hour of iterative prompting on a wrong one, because the wrong one will keep producing coherent, convincing output that feels like progress until it does not.

This means investing in domain knowledge as an input to AI use, not just as a general professional asset. The more deeply you understand a domain, the better your problem framing will be, and the more value you will extract from the tools. “The Age of the Personal OS” makes the case for structuring your expertise so AI can act on it; the upstream piece is building that expertise in the first place, specifically because it is the check that AI cannot supply.

It also means being more deliberate about what you are trying to answer before you ask anything. Not as a ritual but as an investment. Write the problem down before opening a chat. Define what a correct answer would look like and what a wrong-but-plausible one might look like. Name the constraints that are real and the ones you are assuming. This is not elaborate. It takes minutes. But it sets the frame — and the frame is the one thing that determines everything that follows.
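For readers who prefer a template to a blank page, the checklist above can be sketched as a small Python structure. This is only one possible shape, not a prescribed method: the class name ProblemStatement, its field names, and the example values are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProblemStatement:
    """Hypothetical pre-session checklist; every field name is illustrative."""

    problem: str = ""
    correct_answer_looks_like: str = ""
    plausible_wrong_answer: str = ""
    real_constraints: list[str] = field(default_factory=list)
    assumed_constraints: list[str] = field(default_factory=list)

    def gaps(self) -> list[str]:
        """Return the names of the fields that are still empty."""
        return [name for name, value in vars(self).items() if not value]

    def render(self) -> str:
        """Format the statement as plain text to keep beside the session."""
        lines = [
            f"Problem: {self.problem}",
            f"A correct answer looks like: {self.correct_answer_looks_like}",
            f"A wrong-but-plausible answer looks like: {self.plausible_wrong_answer}",
            "Real constraints: " + "; ".join(self.real_constraints),
            "Assumed constraints: " + "; ".join(self.assumed_constraints),
        ]
        return "\n".join(lines)


# Made-up example: a draft that is not yet ready for a chat session.
draft = ProblemStatement(
    problem="Checkout latency regressed after the last release",
    correct_answer_looks_like="A named cause that a profile can confirm",
    real_constraints=["No schema changes this quarter"],
)
print(draft.gaps())
```

The useful part is the gaps() check, not the class itself. An empty "wrong-but-plausible answer" field is a sign that you may not yet know the territory well enough to notice drift when it happens.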

The Direction Is on You

AI does not make you faster at solving problems. It makes you faster at solving whatever problem you give it. Those are not the same thing.

The tools are good enough now that the limiting factor is rarely the model. It is the quality of the question. And the quality of the question is a function of domain knowledge, judgment, and the discipline to think before you prompt. These are human skills. They do not improve with more AI use. They improve with more domain investment, more time spent understanding what you are actually building, and more willingness to sit with a vague problem until it becomes a sharp one.

The drift is convincing. The output will look right. The only check on it is you knowing the territory well enough to recognize when you have left it.

References

Dongre, P., et al. (2025). Drift No More? Context Equilibria in Multi-Turn LLM Interactions. arXiv preprint arXiv:2510.07777. https://arxiv.org/abs/2510.07777

Lee, H.-P., et al. (2025). The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects From a Survey of Knowledge Workers. Microsoft Research. https://www.microsoft.com/en-us/research/wp-content/uploads/2025/01/lee_2025_ai_critical_thinking_survey.pdf

Microsoft Research. (2025). New Future of Work Report 2025. Microsoft Research. https://www.microsoft.com/en-us/research/publication/new-future-of-work-report-2025/

Ofsa, M., & Topcu, T. G. (2025). An Empirical Exploration of ChatGPT’s Ability to Support Problem Formulation Tasks for Mission Engineering and a Documentation of its Performance Variability. arXiv preprint arXiv:2502.03511. https://arxiv.org/abs/2502.03511