“Give me a lever long enough and a fulcrum on which to place it, and I shall move the world.” – Archimedes
AI is a force-multiplier. Without a solid spec, it just multiplies chaos.
The False Velocity of “Vibe Code”
The potential of generative AI in software engineering is staggering. With a simple prompt, an LLM can produce a thousand lines of code in seconds.
But this velocity is misleading. We’re getting “vibe code”: code that looks right at a glance and matches the style of the repo but has no structural integrity. This is a familiar feeling for anyone who remembers the outsourcing boom: we’re just accumulating technical debt at a much faster rate.
Our first attempts to manage this have been weak.
- Engineered Prompts: These are just more specific instructions for code that is often still foundationally flawed.
- AI Agents: We gave the AI tools and terminal access, mistaking capability for judgment. This just amplifies the core issue: an agent with powerful tools but no grounding is a faster way to propagate errors.
We have a powerful lever, but we’re waving it in the air. The solution isn’t a better lever; it’s building a fulcrum.
The Fulcrum is Spec-Driven Development
The solution is to establish a grounding point. That grounding point is Spec-Driven Development (SDD).
The immediate objection is, “This is just Big Design Up Front (BDUF). This is Waterfall.”
This is incorrect. Waterfall failed because its specs were monolithic, rigid, and dead documents. They were obsolete on arrival.
SDD is the opposite. When implemented correctly, it is an agile discipline. It relies on a two-part system that is common in tools like spec-kit:
- The spec-kit constitution.md: This is the stable “ground.” It’s a durable, high-level document that defines the non-negotiables: core architectural principles, security requirements, domain logic, and API standards. It’s forged from experience and changes infrequently.
- The agile spec-kit “spec”: This is the “fulcrum” for a specific task. It’s a small, clear, executable contract written just-in-time, like a user story. But unlike a story, it’s a verifiable, machine-readable input. It inherits its rules from the constitution.md and details a precise piece of work.
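To make the inheritance relationship concrete, here is a minimal sketch in Python. It is not spec-kit's actual file format or API; the `Constitution` and `Spec` classes, their fields, and the example rules are all hypothetical, chosen only to show a small task-level spec layering its own criteria on top of stable, shared rules.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Constitution:
    """Stable, rarely changed rules (hypothetical stand-in for constitution.md)."""
    principles: tuple[str, ...]


@dataclass
class Spec:
    """A small, just-in-time contract for one piece of work (hypothetical)."""
    title: str
    constitution: Constitution
    acceptance_criteria: list[str] = field(default_factory=list)

    def effective_rules(self) -> list[str]:
        # Inherited principles come first, followed by task-specific criteria.
        return list(self.constitution.principles) + self.acceptance_criteria

    def problems(self) -> list[str]:
        """Return reasons this spec is not yet ready to hand to an AI tool."""
        issues = []
        if not self.title.strip():
            issues.append("spec has no title")
        if not self.acceptance_criteria:
            issues.append("spec has no acceptance criteria")
        return issues


# Illustrative values only; a real constitution would be far richer.
constitution = Constitution(
    principles=(
        "Document every API with OpenAPI",
        "Write tests before implementation",
    )
)

spec = Spec(
    title="Add password reset endpoint",
    constitution=constitution,
    acceptance_criteria=["Reset tokens expire after 15 minutes"],
)

print(spec.effective_rules())
print(spec.problems())
```

The constitution is frozen because it is meant to change rarely, while each spec is a cheap, disposable object that borrows those rules rather than restating them.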
This structure is what makes the system work. And it reveals a profound new advantage.
In human-led development, we constantly face a trade-off between quality and speed. If you force a human developer into a highly opinionated environment (always use TDD, document every API, never violate architectural principles, write architectural decision records, code for Day 2 operations, and be cost-conscious from Day 1), they will chafe. It’s restrictive. It’s slow. So we make a trade-off: we intentionally take on tech debt and documentation debt to meet a deadline or get a feature out the door to test.
That trade-off is now obsolete.
An LLM has no ego or feelings. It doesn’t chafe. You can be highly prescriptive and opinionated in the constitution.md. You can mandate that the LLM document all APIs using the OpenAPI Specification, practice test-driven development by writing tests first, follow the testing pyramid, consider multi-stakeholder lenses from the start, and so on. The LLM will not push back. It will simply execute.
Industry analysis suggests that AI-SDD is positioned specifically for “enterprise and brownfield projects” with “compliance constraints” where retrofitting standards later is too expensive. By baking these rules into the spec, we create an automated risk management artifact.1
An LLM can build the “right way” just as fast as the “fast way.” This is why forging a constitution.md that captures the deep, hard-won wisdom of your expert team is the most critical part of this new process.
The AI (the lever) pivots on the agile spec (the fulcrum), which is placed firmly on the constitution (the ground). This creates an immediate feedback loop.
With spec-kit, the AI parses the Spec, generates a Plan, breaks that plan into Tasks, and then Implements. If the AI generates an illogical Plan, the developer knows instantly that the Spec was ambiguous or incomplete.
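The feedback loop between Spec and Plan can be sketched as follows. This is an illustration under stated assumptions, not spec-kit's implementation: `run_sdd_loop` and the three `toy_*` functions are hypothetical stand-ins for the AI's plan generation, the developer's review, and the developer's revision of the spec. The Tasks and Implement stages are left out to keep the sketch short.

```python
from typing import Callable


def run_sdd_loop(
    spec: str,
    generate_plan: Callable[[str], list[str]],
    plan_is_sound: Callable[[list[str]], bool],
    revise_spec: Callable[[str, list[str]], str],
    max_rounds: int = 3,
) -> tuple[str, list[str], int]:
    """Iterate on the spec until the generated plan passes review."""
    for round_number in range(1, max_rounds + 1):
        plan = generate_plan(spec)
        if plan_is_sound(plan):
            return spec, plan, round_number
        # An unsound plan is treated as a signal that the spec is ambiguous.
        spec = revise_spec(spec, plan)
    raise RuntimeError("plan still unsound; the spec needs more work")


def toy_generate_plan(spec: str) -> list[str]:
    # Stand-in for the AI: one plan step per non-empty spec line.
    return [f"Implement: {line.strip()}" for line in spec.splitlines() if line.strip()]


def toy_plan_is_sound(plan: list[str]) -> bool:
    # Stand-in for developer review: reject empty plans and unresolved placeholders.
    return bool(plan) and not any("TBD" in step for step in plan)


def toy_revise_spec(spec: str, plan: list[str]) -> str:
    # Stand-in for the developer tightening the ambiguous requirement.
    return spec.replace("TBD", "15 minutes")


draft = "Reset tokens expire after TBD\nSend reset link by email"
final_spec, plan, rounds = run_sdd_loop(
    draft, toy_generate_plan, toy_plan_is_sound, toy_revise_spec
)
print(rounds)
print(plan)
```

The point of the sketch is the direction of the fix: when the plan looks wrong, the loop revises the spec rather than patching the plan.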
This isn’t a “fire-and-forget” prompt. It’s a verifiable, iterative process.
Forging the Fulcrum: The Expert Pod
This system fails if the constitution.md is ambiguous or incomplete. This isn’t a task for a junior developer; it’s a task that requires deep experience to get right.
Forging these documents requires a small, senior, cross-functional team. Call it an “expert pod” or a “spec team.” This isn’t a new full-time org; it’s a group of your best people focused on a specific goal, bringing three distinct, critical perspectives:
- An Architect (the “Orchestrator”): This is your T-shaped software expert, the person with the broad and deep “battle scars” from decades of software development. They understand systems, patterns, anti-patterns, and what not to do. They bring the software engineering wisdom and are also responsible for understanding the AI tooling, like spec-kit, to translate the team’s intent.
- A Technical SME (Subject Matter Expert): This is the veteran with the domain wisdom who “knows where the bodies are buried.” This person has the deep, specific knowledge of the legacy system, the non-obvious business rules, and the existing architectural traps.
- A Product Manager: This is the voice of the customer and the business. They are accountable for the “why,” ensuring the right thing is being built and that the value is real.
This pod’s entire job is to distill their collective, hard-won experience (the Architect’s software wisdom, the SME’s domain wisdom, and the PM’s customer wisdom) into the constitution.md. This is how you ground the AI in the reality of your business.
Where This Works, and The Hurdles
This isn’t a theory. In a proof-of-concept capacity, I have begun building complex, enterprise-grade solutions using this method. The results in both development velocity and code quality are compelling.
The best targets for this approach are high-friction projects:
- Greenfield Projects: You establish a “perfect” Constitution from day one. This prevents technical debt before it even exists.
- Lift-and-Shift / Digital Transformations: This is another ideal use case. These projects are notorious for consuming fortunes just to “get to parity.” This approach makes that leap cheap and fast, letting teams get to building new features much sooner.
There are obvious scaling challenges. This expert pod is a bottleneck. The solution is likely a federated model: a central team curates the global Constitution and trains other senior pods to write their own agile specs.
The “brownfield” problem of a 20-year-old monolith is harder. The answer there is likely a multi-step process: use AI to reverse-engineer a draft spec from the existing code, then have the expert pod refine it.
This isn’t just documentation; it’s a high-yield bug hunt. Microsoft Research applied a similar spec-driven verification approach to their Confidential Consortium Framework (CCF) and discovered “six subtle bugs” and a “serious liveness bug” that standard testing missed.2 Writing the spec is the verification.
The First Step: Prove It
A new age of software development is here. The pieces are falling into place.
The AI is the lever. But the key isn’t the AI. The key is the team that writes the spec that guides the AI.
This is not a fantasy. It is a practical, disciplined methodology. The first step is not a company-wide re-org or boiling the ocean.
The first step is a small, strategic test.
Find a single item in your backlog that is notoriously high-cost. Maybe it’s a high-value feature everyone wants but that is too expensive to build, or a high-cost, low-value piece of tech debt everyone is dreading.
Assemble your expert pod. Dedicate them to one or two sprints, two to four weeks.
Why a single, constrained task? Because empirical data shows that while general adoption of model-driven methods yields minor gains (20-30%), applying them to “tight and narrow” domains can yield productivity increases up to a “factor of 17”.3
Task them with a single mission: to forge a constitution.md and an agile spec, and flip the script. Turn that high-cost item into a low-cost, high-value delivery. The potential payoff is gold: an expensive feature delivered for a fraction of the cost.
Then, stand back and measure the result.
