The Answer Without the Question

A case study in how AI can encourage unearned confidence when the limits of an analysis are not made clear.

It read like an explanation. Headings, categories, causes, recommendations. The sort of document that settles people down when it lands in a Slack thread. Nothing obviously wrong. Nothing that invited argument. And yet it carried the faint unease that comes when something has been resolved too neatly.

The Shape of the Answer

The context was performance. A regression had appeared and an analysis was requested. What came back was an AI-assisted review of recent work, grouped into themes. Image loading. Bundles. Lazy loading. Server-side rendering. Design system changes. Each section made a reasonable case. Together they formed a narrative that sounded, at first pass, like understanding.

It was clear. It was professional. It was also flawed in a way that was easy to miss and difficult to ignore once seen.

The problem was not that the document was wrong. The problem was that it answered a question nobody could see.

Nowhere did it say what had actually been asked. Nowhere did it show when the metric moved or by how much. Nowhere did it explain why these changes mattered more than the many others committed in the same period. The structure suggested diagnosis, but the substance was closer to categorised speculation.

What the System Was Asked to Do

AI is very good at producing outputs that look like the end of thinking. It is less reliable at doing the thinking itself, at least in the sense that engineers usually mean it. What it excels at is organising language. If you give it commit messages, it will group them. If you give it themes, it will narrate them. If you ask for causes, it will infer intent and present it back as mechanism.

This is not deceit. It is compliance.

Consider a plausible prompt behind the document. Something like: summarise recent performance-related commits that might have made performance worse. That is a reasonable request when time is short and you want a quick sense of what has changed.

The system will do exactly that. It will look for commits that appear performance related. It will cluster them by topic. It will note churn and reversions. It will produce a narrative that reads as analysis.

What it will not do is examine everything else.

What Was Excluded by Design

It will not consider feature work that quietly added an extra call on the critical path. It will not consider refactors that altered render order. It will not consider new content that increased payload size. These changes are less visible in code and often more likely to affect real metrics. They may not live in the codebase at all, but in configuration or content management systems.

By constraining the input to performance-related work, the analysis embeds a false premise. It assumes regressions are usually caused by people touching performance. In practice, performance work often follows a regression. It clusters after the fact. That makes it visible and easy to blame.
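
To make the exclusion concrete, here is a hypothetical reconstruction of the scoping step, written as a small Node script in TypeScript. The keyword list and time window are invented rather than taken from the real analysis; whatever the actual values were, anything that never mentions performance is invisible to everything downstream.

```typescript
// Hypothetical reconstruction of the scoping step. The keywords and the
// thirty day window are illustrative, not taken from the original analysis.
import { execSync } from "node:child_process";

const KEYWORDS = ["performance", "bundle", "lazy", "image", "hydration", "ssr"];

const commits = execSync('git log --since="30 days ago" --pretty=format:"%h|%s"')
  .toString()
  .split("\n")
  .filter(Boolean)
  .map((line) => {
    const [hash, ...rest] = line.split("|");
    return { hash, subject: rest.join("|") };
  });

// The false premise lives in this filter. The regression may well sit in the
// commits it throws away, not in the ones it keeps.
const candidates = commits.filter((commit) =>
  KEYWORDS.some((keyword) => commit.subject.toLowerCase().includes(keyword))
);

console.log(`${candidates.length} of ${commits.length} commits survive the filter`);
```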

The document then treats activity as signal. Many commits in one area become evidence of instability in that area. Density is mistaken for causality. Without a comparison set, the narrative cannot distinguish reaction from cause.

Experienced engineers recognise this immediately. Real regressions tend to have an awkward shape. They come from changes that looked harmless at the time and did not announce risk. The AI smooths that awkwardness away and replaces it with coherence.

What Runtime Evidence Already Shows

There is another omission, and it is more practical than philosophical. Modern browsers are very good at telling us where time goes. Chrome’s performance tools do not whisper. They highlight long tasks, blocked rendering, delayed network requests, and hydration costs on a timeline that is difficult to ignore.

When performance regresses, the slow work is almost always visible. You can draw a box around it. You can point to the moment it ran and see what it blocked. You can tell whether the cost was network, main thread, or rendering. The fix may still be non-trivial, but the guesswork should already be constrained.
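
None of this requires specialist tooling. A few lines in the page are enough to surface the evidence such a document never cites. The sketch below, assuming a Chromium-based browser that exposes these entry types, records long main-thread tasks and the largest contentful paint with timestamps that can be laid against the regression.

```typescript
// Log long main-thread tasks and the largest contentful paint so a regression
// can be tied to a moment on the timeline. Assumes a Chromium-based browser.
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    if (entry.entryType === "longtask") {
      // Main-thread work that blocked for more than 50ms.
      console.log(`long task: ${Math.round(entry.duration)}ms at ${Math.round(entry.startTime)}ms`);
    } else if (entry.entryType === "largest-contentful-paint") {
      console.log(`LCP candidate at ${Math.round(entry.startTime)}ms`);
    }
  }
});

// buffered: true replays entries recorded before the observer was created.
observer.observe({ type: "longtask", buffered: true });
observer.observe({ type: "largest-contentful-paint", buffered: true });
```

It diagnoses nothing on its own, but it replaces narrative with a timeline.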

Analyses that begin with commit history rather than runtime evidence reverse this order. They start with narrative and work backwards, instead of starting with a flame chart and asking what could plausibly have caused it. Without that grounding, even a careful explanation remains a hypothesis.

If you cannot point to the slow bit, you are not analysing yet. You are guessing.

Safe Conclusions and Their Cost

The recommendations at the end complete the picture. Audit images. Review lazy loading. Improve monitoring. Sensible advice, but generic enough to apply almost anywhere. Their safety is the clue. They signal completeness without specificity.

There is a quieter cost. By listing many plausible contributors, the document makes the regression feel like an emergent property of complexity. No clear mechanism. No accountable change. This can be comforting, but it also slows learning.

None of this argues against using AI. Used carefully, it is effective at widening the search space and summarising large volumes of material. The problem arises when its output is allowed to stand in for understanding.

Why the Prompt Matters

Without the prompt, the analysis cannot be evaluated. The reader does not know what the system was asked to do, what data it was allowed to see, or what was excluded by design. The answer is judged without access to the question that shaped it.

In any other analytical discipline this would be unacceptable. A model without its features. A forecast without assumptions. An experiment without method. The prompt defines the boundaries of thought as much as the data does.

Showing the prompt does not make the analysis correct. It restores context. It tells the reader whether they are looking at hypothesis generation, summarisation, or an attempted explanation. It makes clear where the machine stopped and where human judgement was expected to begin.

A Working Rule

There is a simple practice that would remove much of this ambiguity. When AI is used to produce analytical work, the prompt should travel with the output as required context. The prompt defines what was considered, what was prioritised, and what was ignored. Without it, confidence is mistaken for method.
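
What that might look like in practice is deliberately mundane. The sketch below is one possible shape for such a record, in TypeScript only because types keep the fields honest; the names and values are illustrative rather than any standard, and the prompt echoes the hypothetical one from earlier.

```typescript
// A hypothetical provenance record. Field names are illustrative, not a
// standard; the point is that the question travels with the answer.
interface AnalysisProvenance {
  prompt: string;               // the exact prompt the model was given
  inputs: string[];             // what it was allowed to see
  exclusions: string[];         // what was out of scope by design
  intendedUse: "hypothesis generation" | "summary" | "explanation";
  humanReviewRequired: boolean; // where machine output ends and judgement begins
}

const provenance: AnalysisProvenance = {
  prompt:
    "Summarise recent performance-related commits that might have made performance worse.",
  inputs: ["commit messages from the last thirty days"],
  exclusions: ["runtime traces", "feature and content changes", "configuration and CMS"],
  intendedUse: "hypothesis generation",
  humanReviewRequired: true,
};
```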

This is not transparency as virtue. It is professional hygiene. Documents like this influence decisions, priorities, and narratives of cause and effect. A polished analysis that cannot be audited for scope or intent can mislead precisely because it appears complete.

AI will continue to improve at producing plausible explanations. That is expected. The responsibility on senior teams is to ensure those explanations remain grounded in a visible question, runtime evidence, and an explicit handoff to human judgement.

If the work is analytical, show the prompt. It is a small constraint, but it keeps explanation from outrunning understanding.