The Model Is Only 10%: The Real Lesson of the New SDLC

TL;DR

A recent Google whitepaper emphasizes that in AI-assisted software development, the AI model itself is only 10% of the system. The real focus should be on harness design and context engineering, which drive behavior and cost efficiency.

A Google whitepaper released in May 2026 states that the AI model accounts for only about 10% of the behavior of AI-driven systems. Most of the influence, roughly 90%, comes from the harness (the prompts, tools, and configurations) and from context engineering. This shifts the focus from model development to system design and configuration, and changes how organizations should approach AI integration.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, argues that the importance of the AI model is overstated. The experiments it cites show that changing the harness (prompts, rules, and tools) can dramatically improve system performance with the same model. For example, one team moved a coding agent from outside the Top 30 to the Top 5 on Terminal Bench 2.0 by changing only the harness.

Furthermore, the paper argues that the cost of AI-assisted development is driven primarily by token consumption. Vibe coding (quick prompts and minimal review) appears cheap but carries high long-term operating costs: tokens burned in fix-it loops, maintenance of tangled generated code, and security remediation. The paper puts the resulting cost at 3–10× more per feature once the curves cross. In contrast, disciplined engineering with well-designed harnesses and context management costs more upfront but is cheaper and more reliable over time.
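
The crossover between the two cost profiles can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names and every cost figure are assumptions chosen for the example, not numbers from the whitepaper. It models each approach as an upfront cost plus a per-feature cost and finds the feature count at which the upfront investment pays for itself.

```python
import math


def total_cost(upfront: float, per_feature: float, features: int) -> float:
    """Total spend after shipping `features` features."""
    return upfront + per_feature * features


def break_even_features(upfront_a: float, per_feature_a: float,
                        upfront_b: float, per_feature_b: float) -> int:
    """Smallest feature count at which approach B costs no more than approach A.

    Assumes B has the higher upfront cost and the lower per-feature cost.
    """
    if per_feature_b >= per_feature_a:
        raise ValueError("approach B never catches up: its per-feature cost is not lower")
    return max(0, math.ceil((upfront_b - upfront_a) / (per_feature_a - per_feature_b)))


# Hypothetical figures: vibe coding has no upfront cost but costs 5x more per feature.
VIBE = {"upfront": 0.0, "per_feature": 50.0}
AGENTIC = {"upfront": 400.0, "per_feature": 10.0}

crossover = break_even_features(VIBE["upfront"], VIBE["per_feature"],
                                AGENTIC["upfront"], AGENTIC["per_feature"])
```

With these assumed numbers the two approaches cost the same at 10 features, and every feature after that is cheaper under the disciplined approach. The real crossover depends entirely on a team's own token, review, and rework costs.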

At a glance
When: published May 2026
The development: The whitepaper by Addy Osmani, Shubham Saboo, and Sokratis Kartakis argues that the primary driver of AI system performance is not the model but the surrounding harness and context management.
Infographic: The Model Is Only 10%: The New SDLC With Vibe Coding. AI Dispatch, Field Notes. Google; Osmani, Saboo & Kartakis; May 2026.

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified

Vibe Coding: casual prompts; "does it seem to work?" checks; disposable code; high risk.
Structured AI-Assisted: detailed prompts plus constraints; manual testing; features in real codebases.
Agentic Engineering: formal specs; automated tests, evals, and CI gates; production scale; low risk.

Tests verify the deterministic; evals verify the rest. Without both, it's vibe coding, however clever the prompt.
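
The split between tests and evals can be sketched as a single gate that requires both. This is a minimal illustration under stated assumptions: `slugify`, `keyword_score`, the 0.8 threshold, and the stub generators are all hypothetical, and the keyword rubric is deliberately crude compared with a real eval suite.

```python
from typing import Callable, List, Tuple

EvalCase = Tuple[str, List[str]]  # (prompt, keywords a good answer should mention)


def slugify(title: str) -> str:
    """Deterministic code under test (a stand-in for generated code)."""
    return "-".join(title.lower().split())


def run_tests() -> bool:
    """Tests: exact, repeatable checks on deterministic behavior."""
    return slugify("Hello World") == "hello-world" and slugify("  A  B ") == "a-b"


def keyword_score(answer: str, required: List[str]) -> float:
    """Eval scorer: fraction of required keywords found in the answer."""
    hits = sum(1 for word in required if word in answer.lower())
    return hits / len(required)


def run_evals(generate: Callable[[str], str], cases: List[EvalCase],
              threshold: float = 0.8) -> bool:
    """Evals: score non-deterministic output and compare the mean to a threshold."""
    scores = [keyword_score(generate(prompt), required) for prompt, required in cases]
    return sum(scores) / len(scores) >= threshold


def ci_gate(generate: Callable[[str], str], cases: List[EvalCase]) -> bool:
    """A change ships only if both the tests and the evals pass."""
    return run_tests() and run_evals(generate, cases)


CASES: List[EvalCase] = [("How should agent output be verified?", ["tests", "evals"])]


def careful_generator(prompt: str) -> str:
    return "Run automated tests and evals before merging."


def vibe_generator(prompt: str) -> str:
    return "It seems to work."
```

Here `ci_gate(careful_generator, CASES)` passes while `ci_gate(vibe_generator, CASES)` fails, even though the deterministic tests pass in both cases. The evals are what catch the weak answer.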
The idea worth building your strategy around

Agent = Model + Harness

Model: ~10%
Harness: ~90% (prompts, tools, context, hooks, sandboxes, observability). This is your surface area, not the provider's.

Outside the Top 30 to the Top 5 on Terminal Bench 2.0 by changing only the harness, with the same model.

"Most agent failures, examined honestly, are configuration failures": a missing tool, a vague rule, a noisy context.
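
The "Agent = Model + Harness" idea can be shown with a toy example in which the model is held fixed and only the harness changes. Everything here is a hypothetical sketch: `Harness`, `run_agent`, `toy_model`, and the `CALL` convention are invented for illustration and do not correspond to any real framework's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for an LLM call: takes the assembled prompt, returns text.
Model = Callable[[str], str]


@dataclass
class Harness:
    """Everything around the model: system prompt, rules, and tools."""
    system_prompt: str
    rules: List[str] = field(default_factory=list)
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def build_prompt(self, task: str) -> str:
        lines = [self.system_prompt]
        lines += [f"RULE: {rule}" for rule in self.rules]
        lines += [f"TOOL: {name}" for name in sorted(self.tools)]
        lines.append(f"TASK: {task}")
        return "\n".join(lines)


def run_agent(model: Model, harness: Harness, task: str) -> str:
    """The model sees only what the harness assembles and can use only its tools."""
    reply = model(harness.build_prompt(task))
    if reply.startswith("CALL "):
        tool_name, _, argument = reply[len("CALL "):].partition(" ")
        tool = harness.tools.get(tool_name)
        if tool is None:
            return f"error: tool '{tool_name}' is not configured"
        return tool(argument)
    return reply


def toy_model(prompt: str) -> str:
    """Deterministic toy model: it verifies its work only if a test tool is advertised."""
    if "TOOL: run_tests" in prompt:
        return "CALL run_tests src/"
    return "I think the change works."


bare = Harness(system_prompt="You are a coding agent.")
equipped = Harness(
    system_prompt="You are a coding agent.",
    rules=["Verify every change before reporting success."],
    tools={"run_tests": lambda path: f"ran tests in {path}: all passed"},
)
```

Calling `run_agent(toy_model, bare, "fix the bug")` returns an unverified guess, while the same model under `equipped` runs the test tool and reports its result. The missing tool in the first case is the kind of configuration failure the quote describes.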
The economics: it's a token-cost problem (CapEx vs OpEx)

Vibe Coding: low CapEx, high OpEx. Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.

Agentic Engineering: high CapEx, low OpEx. Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success, plus intelligent model routing that sends the easy work to cheap models.
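
Model routing, the second lever named above, can be sketched as a small rule-based dispatcher. This is a deliberately simple illustration, not a method from the whitepaper: the tier names, the prices, the keyword list, and the token threshold are all assumed values, and a production router would more likely use a classifier or past success rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical model identifier
    cost_per_1k_tokens: float  # hypothetical price, in arbitrary currency units


CHEAP = ModelTier("small-model", 0.10)
STRONG = ModelTier("large-model", 1.00)

# Assumed signals that a task needs the stronger model.
HARD_KEYWORDS = ("refactor", "architecture", "security", "migration")


def route(task: str, context_tokens: int, token_threshold: int = 4000) -> ModelTier:
    """Send large-context or risky tasks to the strong model, the rest to the cheap one."""
    text = task.lower()
    if context_tokens > token_threshold or any(word in text for word in HARD_KEYWORDS):
        return STRONG
    return CHEAP


def estimated_cost(task: str, context_tokens: int) -> float:
    """Rough input-token cost of one call under the routing rule above."""
    tier = route(task, context_tokens)
    return tier.cost_per_1k_tokens * context_tokens / 1000
```

Under these assumptions, a task such as "rename a variable" with 500 tokens of context goes to the cheap tier, while "refactor the auth module" goes to the strong tier regardless of context size. Tightening the context also moves borderline tasks under the token threshold, which is why the two levers reinforce each other.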
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.

Impact of Harness and Context on AI System Effectiveness

This shift in understanding means that organizations should prioritize system configuration, prompt engineering, and context management over solely focusing on acquiring the latest AI models. The durable advantage in AI deployment lies in system design and configuration, which can yield significant performance gains and cost savings.

For leaders, this underscores the importance of investing in system architecture, tooling, and training around context engineering, rather than just model procurement or upgrades. It also suggests that long-term AI strategy should include developing expertise in harness design and context management.

Background on AI Development and System Design Trends

Until early 2026, the common narrative emphasized the model’s capabilities as the primary driver of AI performance. The rise of AI coding agents saw organizations investing heavily in model upgrades and training datasets. However, recent experiments and the new whitepaper challenge this view, showing that system configuration and context management are more influential than previously acknowledged.

This perspective aligns with broader trends in software engineering, where system architecture and automation layers often determine success more than raw technology alone. The whitepaper’s findings suggest a paradigm shift in how AI systems should be built and maintained.

“The AI model accounts for only about 10% of the behavior; the rest is in how you harness and engineer context.”

— Addy Osmani

Unclear Aspects of System Optimization and Application

While experiments demonstrate the outsized influence of harness and context, the precise methodologies for optimal configuration vary by application and are still being developed. It remains unclear how these principles translate across different domains or with future model advancements. Additionally, the long-term impact of these shifts on AI development workflows and organizational structures is still emerging.

Future Directions in AI System Design and Strategy

Organizations are expected to re-evaluate their AI deployment strategies, investing more in harness development and context engineering skills. Further research and case studies will likely explore best practices for system configuration, automation, and cost management. Industry leaders may also develop new tools and frameworks to simplify harness design and optimize token economy management.

Monitoring these developments will be crucial as the AI landscape continues to evolve, with an increasing emphasis on system architecture and configuration over raw model power.

Key Questions

Why is the model only 10% of the system behavior?

Experiments and research show that the surrounding system—prompts, tools, rules, and context management—has a far greater influence on AI output than the model itself, which is only about 10% of the overall behavior.

How does this change AI development strategies?

It shifts focus from solely improving or acquiring models to designing robust harnesses and effective context management, which are more cost-efficient and impactful over time.

What are the economic implications of this shift?

While vibe coding appears cheap initially, it leads to higher long-term costs due to token inefficiency and maintenance. Disciplined system design reduces these costs and improves reliability.

What skills should AI teams prioritize now?

Teams should develop expertise in harness engineering, prompt design, context management, and system architecture to maximize AI effectiveness and cost efficiency.

Will this affect future AI model development?

Yes, the focus may shift from solely developing larger models to optimizing how models are integrated, configured, and controlled within systems.

Source: ThorstenMeyerAI.com
