The last pursuit I watched a team lose had a 47-page response, three weeks of partner time, and a war room of people who had each read the 10-K twice.
They lost because nobody in the room could answer the question the CIO actually wanted answered, which was not the question in the RFP.
This happens constantly. It happens to firms that are much better at this than the one I was watching. It happens in rooms full of people who are, individually, excellent at their jobs. I have spent my career inside these rooms. Founder of a consulting firm. Corporate strategy inside a large enterprise, deciding which firm to hire. Now running pursuits at a global systems integrator across financial services, healthcare, and telecom.
The pattern is the same every time. The team is prepared. The team is not pointed.
The observation
Here is what I think most people underrate: the distance between informed and pointed is not a research gap. It is a judgment gap.
Informed means you have read the 10-K, talked to the references, mapped the org chart, pulled the competitive landscape, and written a clean summary of what the client said in their last three earnings calls. Most teams are informed. The tools of the last decade — CRM, research platforms, enablement libraries, and now generative AI — have made it easier than ever to be informed.
Pointed means you have formed a view. You believe the client’s stated problem is not their real problem. You know which of three competing frames will actually land with the economic buyer. You can tell the COO something she does not already know about her own business, and you can do it in the first thirty seconds.
Informed is a precondition for pointed. It is not the same thing. And nothing in the modern pursuit tech stack helps a team cross the gap.
This is why pursuits are lost. Not because the research was thin. Because the team never decided what they believed.
Why the tools made it worse
I want to be specific here, because the abstract version of this argument has been made before and it is boring.
CRM systems catalog interactions. They do not form views. Salesforce can tell you that you have had eight conversations with this account. It cannot tell you whether those conversations pointed somewhere.
Enablement platforms — Highspot, Seismic, Showpad — organize content so sellers can find the right deck faster. Faster access to the same generic deck is not an upgrade when the actual problem is that the deck should be different for this pursuit.
Call intelligence tools — Gong, Chorus — analyze what was said after the call. Valuable. But by the time the call has happened, the team has already walked into the room with whatever angle they had. The damage is done in preparation, not in review.
Generative AI assistants — the current crop, including some I respect — can summarize a 10-K in ninety seconds. The problem is that every team with access to ChatGPT now has the same summary. The summary is no longer the differentiator. If anything, it is the new floor. Teams are arriving at meetings with slightly better research and exactly the same lack of a point of view.
The tool landscape has made everyone better at the first 80% of pursuit work, which was already the easy part, and no better at the 20% that decides outcomes.
What I believe is missing
The missing layer is judgment. Specifically, a layer that does what a senior partner does in their head during a pursuit: reads the signals and forms a thesis about what is actually happening, pressure-tests the thesis before the team walks into the room, and translates the thesis into a defensible position with a sequence, a posture, and an honest read on where it will be attacked.
This is not an unsolved intellectual problem. The frameworks already exist. Miller Heiman laid out the buying-influence roles and response modes in the 1980s. Barbara Minto laid out the Pyramid Principle — answer first, then support it — in the 1970s. Both are still the sharpest tools a consultant has. They just never made it into software.
That is the bet Field Thesis is making. Not that AI will invent a new theory of enterprise selling. That AI can finally operationalize the theory that already works, and do it in a form that accumulates across a portfolio rather than resetting every deal. Call it pursuit intelligence — a category that didn’t quite exist before, because the work behind it lived in senior partners’ heads.
What Field Thesis actually is
Three capabilities, running on your own infrastructure.
Query — a senior strategy partner that answers in the form of a memo.
You ask the question the way you would ask your sharpest colleague. Field Thesis leads with the answer, cites the framework your firm already built for a similar problem, reads the macro posture against the client’s capex trajectory, pressure-tests the view against the obvious counter-argument, and closes with two or three hypotheses specific enough that one conversation could validate or refute them. Not a search result. Not a data dump. A memo.
War Room — the live intelligence surface for a pursuit.
The moment a pursuit opens, the War Room assembles itself. Filings, macro data, labor data, competitive read, and the language patterns that reveal how management actually talks about the business. A signal engine reads the earnings calls and tells you what you are walking into: a company that is moving, stuck, complacent, or overreaching. The Buying Influence Map surfaces who owns what, which roles are gaps, and which hypotheses remain open. Nothing resets between meetings.
Simulate — practice the conversation before you walk into it.
The simulated counterpart is grounded in the target company’s own 10-K, capex figures, and peer moves. The pushback is specific. When the simulated executive (Dr. Vasquez, in this example) tells you she put $340M into AI infrastructure last quarter and asks why the answer is more process and not more platform, she is quoting her own earnings call. Every message the team sends is scored as it goes: confidence, value framing, specificity, responsiveness, jargon, balance, differentiation. The skill that usually has no feedback mechanism finally has one.
The common thread: each capability turns a thing senior operators do in their heads into a surface the whole team can work from.
A pursuit, worked through
A regional health insurer asked for help accelerating their AI roadmap. The stated problem was modernization velocity. The CIO had the budget, had the mandate, and had three models ready for production that had been blocked for four months.
The team I was working with had done the reading. They had the transcripts. They had the interviews. They had the competitive landscape. They did not have a point of view.
What Field Thesis surfaced, in the thirty seconds it took to ask the right question, was that every signal in the data pointed at the same thing. This was not a platform problem. Payer peers were adopting federated governance models. The CIO was stuck inside an operating model that was not going to scale no matter what platform she bought. Capital was cheap — no macro reason to wait. The hesitation was internal.
The thesis that came back was specific: treat this as a governance problem. The stall is not a platform gap. It is a decision-rights gap the platform is exposing.
The team ran the simulation against a COO grounded in the company’s own filings. The hardest pushback — “this sounds like another quarter of stalling” — surfaced there, not in the room. The team rebuilt the first five minutes of the meeting around it.
Two weeks later, the client did not buy a platform. They bought a governance redesign, with the platform as a consequence. The sequence mattered. The posture mattered. The one-liner — “you do not have a technology gap, you have a governance gap the technology is exposing” — landed because it was tested.
Field Thesis does not replace judgment. It gives judgment somewhere to compound.
The engineering, in detail
Query answers with a partner’s memo, not a search result. Leads with the answer, cites the frameworks your organization already built by name, weaves in live macro data, pressure-tests against the obvious counter-argument, closes with two or three testable hypotheses. When I read a memo back from Field Thesis next to one I wrote five years ago on the same problem, the structure of the thinking is recognizable.
The signal engine is substantive. 165 named signal patterns across seven categories. Twenty boilerplate filters so SEC safe-harbor language doesn’t score as real hedging. Industry packs in production across the major sectors — financial services tuned to capital-ratio and operational-resilience language, utilities to rate-case and load-growth disclosures, healthcare to CMS exposure, and the rest of the majors covered with their own cohort-validated tuning. The engine maps management’s own words to Miller Heiman response modes with confidence levels. It is not sentiment analysis. Sentiment analysis tells you a filing is “positive,” which is useless. Field Thesis tells you a company is in Growth mode with 76% confidence and shows you the commitment, hedging, defensive, and technology scores that got it there.
The coaching engine runs locally, so feedback arrives without a network round trip. 175 terms across five categories, seven coaching dimensions, positive reinforcement when the message is clean and specific correction when it is not. The Balance dimension triggers a reminder when a message passes 150 words, because this is a conversation, not a presentation, and nobody else tells senior sellers that.
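The Balance rule is simple enough to sketch. The following is a minimal illustration, not the product's implementation: the names `check_balance` and `BalanceResult` are hypothetical, the reminder wording is invented, and counting whitespace-separated words is an assumption about how "words" are measured. Only the 150-word threshold comes from the text.

```python
from dataclasses import dataclass
from typing import Optional

BALANCE_WORD_LIMIT = 150  # threshold stated in the text


@dataclass
class BalanceResult:
    word_count: int
    reminder: Optional[str]  # None when the message is within the limit


def check_balance(message: str, limit: int = BALANCE_WORD_LIMIT) -> BalanceResult:
    """Count whitespace-separated words and remind once the message passes the limit."""
    word_count = len(message.split())
    if word_count > limit:
        # Hypothetical reminder text; the real wording is not given in the source.
        reminder = (
            f"{word_count} words: this is a conversation, not a presentation. "
            "Shorten the message and hand the turn back."
        )
        return BalanceResult(word_count, reminder)
    return BalanceResult(word_count, None)
```

A check like this needs no model call, which is consistent with running it locally on every message.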
War rooms accumulate. Every note, simulation debrief, hypothesis tested, new EDGAR filing — it all stays and feeds the next interaction. The tenth conversation about a client is informed by the first nine. When intelligence is deleted, it archives rather than disappears.
Every governed claim is anchored to an exact substring of the source paragraph. The contract is non-negotiable: a model that recovers more signals but copies plausibly correct content from its training data fails the contract regardless of recall. We tested Claude Haiku and Sonnet as fallback extractors in May 2026; both produced cleaner-looking output but hallucinated supporting phrases from training-data world knowledge. They were rejected. The local Qwen3 extractor abstains where it cannot ground, which is the correct failure mode for an evidence layer.
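The substring contract can be stated in a few lines. This is a sketch under assumptions, not the actual extractor pipeline: `Claim`, `is_anchored`, and `enforce_contract` are hypothetical names, and the check is deliberately literal (case-sensitive, no normalization), so a paraphrase fails even when it is true.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Claim:
    text: str      # the statement shown to the team
    evidence: str  # the quoted span that is supposed to support it


def is_anchored(claim: Claim, source_paragraph: str) -> bool:
    """True only if the evidence is a non-empty, exact substring of the source."""
    return bool(claim.evidence) and claim.evidence in source_paragraph


def enforce_contract(
    claims: Iterable[Claim], source_paragraph: str
) -> Tuple[List[Claim], List[Claim]]:
    """Split claims into (kept, rejected). Rejected claims are dropped, not repaired."""
    kept: List[Claim] = []
    rejected: List[Claim] = []
    for claim in claims:
        (kept if is_anchored(claim, source_paragraph) else rejected).append(claim)
    return kept, rejected
```

Under a check like this, an extractor that abstains returns fewer claims, while one that embellishes has its claims rejected outright. That is the trade the paragraph above describes.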
Each claim is assigned one of five bands — confirmed, strongly indicated, suggested, discovery signal, or suppressed. Each band carries an allowed-language permission. A claim in the suggested band cannot use a verb like “is shifting” — that verb is reserved for confirmed evidence. A claim in the discovery band cannot be wrapped in confident-sounding scaffolding to disguise it. The Claim Auditor enforces this before any output reaches the team. Overclaim and underuse are both caught — a claim that uses a verb stronger than its evidence supports is blocked, and a confirmed claim buried under hedged language is flagged just as hard.
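A rough sketch of how such an audit could work follows. Several things here are assumptions rather than facts about the Claim Auditor: the bands are treated as strictly ordered in the sequence listed above, suppressed claims are treated as never emitted, and the phrase and hedge lists are invented for illustration. Only the five band names and the rule that "is shifting" requires confirmed evidence come from the text.

```python
import re
from enum import IntEnum
from typing import List


class Band(IntEnum):
    # Assumed ordering, weakest to strongest.
    SUPPRESSED = 0
    DISCOVERY_SIGNAL = 1
    SUGGESTED = 2
    STRONGLY_INDICATED = 3
    CONFIRMED = 4


# Minimum band each phrase requires. Only "is shifting" comes from the text.
MIN_BAND_FOR_PHRASE = {
    "is shifting": Band.CONFIRMED,
    "points to": Band.STRONGLY_INDICATED,  # hypothetical entry
}
HEDGES = ("may be", "might", "possibly")  # hypothetical hedge list


def _contains(text: str, phrase: str) -> bool:
    """Whole-phrase, case-insensitive match."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE) is not None


def audit(text: str, band: Band) -> List[str]:
    """Return findings; an empty list means the claim may be emitted as written."""
    if band is Band.SUPPRESSED:
        return ["blocked: suppressed claims are not emitted"]
    findings: List[str] = []
    for phrase, required in MIN_BAND_FOR_PHRASE.items():
        if _contains(text, phrase) and band < required:
            findings.append(f"blocked: overclaim, '{phrase}' requires {required.name}")
    if band is Band.CONFIRMED:
        for hedge in HEDGES:
            if _contains(text, hedge):
                findings.append(f"flagged: underuse, confirmed claim hedged with '{hedge}'")
    return findings
```

Both directions run through the same function, so an overclaim and an over-hedged confirmed claim surface in the same list of findings.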
When the evidence is too thin for a defensible position, the system says so. It produces a discovery card instead of a thesis, or a constrained diff instead of a full one. Thin trajectories are honest output, not failures.
What is hard
Where it falls short.
The quality of Query’s output is bounded by the quality of the organization’s library. Field Thesis retrieves the framework your firm already built for a similar problem and explains how it applies to the new one. If that framework exists and was written with clear thinking, the output is exceptional. If the library is thin, or the relevant framework exists but was documented casually, the synthesis is still useful but it loses the “this is our firm’s voice” quality that is the whole point. The first few weeks at a new firm are an ingestion problem, not a synthesis problem.
The simulation is only as grounded as the data that feeds it. The simulated executives know the target company’s 10-K language, capex numbers, and peer moves because all of it is injected from the war room. When a pursuit is well-mapped (stakeholders named, roles assigned, response modes set, hypotheses written), the counterpart does not let generic talk pass and pushes back with specifics. When the war room is thin, the simulation defaults toward the persona’s generic posture. The tool rewards the teams that do the intelligence work beforehand.
The coaching is stronger on the mechanical dimensions than on nuance. Confidence, value framing, specificity, balance — these are pattern-detectable and the lexicon scores them well. Responsiveness (did you answer their question) and company differentiation are harder because they require reading the conversation structure, not just the words. This is an active area of work.
Persona memory across sessions is a new capability and we are still learning how it should shape the experience. The memory is structured data, scoped to pursuit and persona, inspectable and revertible — a CIO’s memory does not bleed into a CFO’s. In the right conditions, the second session with a persona holds the consultant accountable for commitments they made in the first, and probes the gaps that went unaddressed. In the wrong conditions, the memory is too literal and the persona feels like they have a longer prior relationship than they should. We are tuning this.
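To make "scoped, inspectable and revertible" concrete, here is a minimal sketch. It assumes an append-only log keyed by pursuit and persona, which is one plausible design and not necessarily the one in the product; `PersonaMemory` and its method names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (pursuit_id, persona_id)


class PersonaMemory:
    """Append-only memory, isolated per (pursuit, persona) pair."""

    def __init__(self) -> None:
        self._log: Dict[Key, List[str]] = defaultdict(list)

    def remember(self, pursuit_id: str, persona_id: str, entry: str) -> int:
        """Store one entry and return the version number after the write."""
        entries = self._log[(pursuit_id, persona_id)]
        entries.append(entry)
        return len(entries)

    def inspect(self, pursuit_id: str, persona_id: str) -> List[str]:
        """Return a copy, so callers can read memory without mutating it."""
        return list(self._log.get((pursuit_id, persona_id), []))

    def revert(self, pursuit_id: str, persona_id: str, version: int) -> None:
        """Drop everything written after the given version."""
        entries = self._log.get((pursuit_id, persona_id), [])
        if not 0 <= version <= len(entries):
            raise ValueError(f"unknown version {version}")
        del entries[version:]
```

Because the key includes the persona, what the CIO persona remembers is never visible when the CFO persona's memory is read. Reverting to an earlier version is one way to correct a memory that has become too literal.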
The current operating envelope is strongest for public-company SEC disclosure analysis, where it is supported by validated industry packs and the substring contract. Private companies, non-SEC filings, and non-US disclosures are explicitly outside the current envelope and known as such; they are on the roadmap, not silent failure modes.
Underneath all of this is a principle that took us a while to commit to: false negatives are acceptable when the alternative is ungrounded positives. Field Thesis’s commercial value comes from defensibility, not coverage. The system abstains rather than producing claims it can’t anchor — even when abstention costs us recall on patterns we’d like to catch.
What compounds and what does not.
The accumulative loop — Query to War Room to Simulate to debrief to next Query — is the core claim. The first pursuit through the system is a good research tool. The fifth or sixth is where the compounding starts to show: signal trends over quarters, personas remembering prior exchanges, hypotheses that carry across meetings. If the thing you are measuring is “how much better is my next pursuit because of the last ten,” you have to run ten pursuits to know. That is a real ask.
The fit is better for organizations that already have institutional knowledge worth indexing. Not just volume — structure. Firms that wrote their frameworks down are a strong fit. Firms whose best thinking lives in individual partners’ heads can still use it for live intelligence and simulation, but they get less of the compounding effect because there is less to compound.
Beyond pursuits
The argument so far has been about enterprise pursuits — the consulting and transformation work I’ve spent my career inside. But the structural problem isn’t specific to consulting. It shows up wherever a team builds a defensible thesis on a company and has to survive scrutiny.
Hedge fund research teams have it. They build positions on coverage names, defend them in IC, and walk into management calls where every claim gets pushed back on. The workflow Simulate was built for is literally their workflow.
PE deal teams have it. IC memos that LPs will read, prior-cycle theses to defend against, management meetings where the wrong sentence costs months. War Room holds the thesis; a background daemon keeps it current between conversations.
Activist research teams have the sharpest version. Campaigns have to survive SEC scrutiny and adversarial board engagement. Every claim has to be source-locked because the alternative is a retraction.
Equity research analysts are the cleanest fit. Calibration bands and sourcing standards are already a professional requirement; Field Thesis just enforces what the CFA code already says you should be doing.
The discipline doesn’t change because the deal type changes. The room changes; the engineering doesn’t.
Who this is for
Managing Partners at consulting firms who have watched smart teams lose pursuits they should have won, and know the loss was in preparation, not execution.
Global systems integrators (GSIs) whose account teams are prepared but not pointed, and whose largest pursuits depend on the judgment of four or five people who cannot be everywhere.
Chief Revenue Officers at enterprise software and services companies whose commercial judgment keeps concentrating in the same handful of senior people, and who know that is a liability, not a strength.
Heads of Research at PE firms, hedge funds, activist shops, and equity research houses. People who already work with calibration bands, sourcing standards, and “no opinion” disciplines as a professional norm. The discipline lands without translation.
If that is you, and you have read this far, I would like to talk.