The Truth About AI Coding Agents: What 6,000+ Real Developer Sessions Reveal

Apr 29, 2026 ai-coding developer-tools software-security machine-learning code-generation vibe-coding semgrep vulnerability-analysis
We've all heard the promise: AI coding agents will transform development. Write less code. Ship faster. Let the machines handle the boilerplate.

But nobody was actually measuring what developers really do with these agents—until now.

A new dataset called SWE-chat tracked 6,000+ real coding sessions from developers using AI agents in production. The findings are fascinating and uncomfortable, and they challenge much of what we think we know about human-AI collaboration in software development.

The Rise of "Vibe Coding"—And Why It Worries Security Experts

The dataset reveals three distinct modes of human-AI collaboration:

  • Human-only mode (22.7%): AI assists with explanations, but humans write the code
  • Collaborative mode (36.5%): Shared authorship—back-and-forth refinement
  • Vibe coding (40.8%): The agent writes nearly everything; humans just approve

That last one is growing fast. "Vibe coding" sessions have doubled in just three months, making it the most common collaboration pattern.
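If you want to tag sessions like this in your own telemetry, one rough Python heuristic is to bucket each session by the share of lines the agent authored. The thresholds below are illustrative guesses, not the definitions the SWE-chat researchers actually use:

    def collaboration_mode(ai_lines: int, human_lines: int) -> str:
        """Bucket a session by who wrote the code.

        Illustrative thresholds only; SWE-chat's actual labeling
        method may be entirely different.
        """
        total = ai_lines + human_lines
        if total == 0:
            return "human-only"  # explanation-only sessions: no code written
        ai_share = ai_lines / total
        if ai_share < 0.2:
            return "human-only"      # AI assists, but the human writes the code
        if ai_share > 0.8:
            return "vibe-coding"     # the agent writes nearly everything
        return "collaborative"       # shared authorship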

Here's the catch: vibe-coded commits introduce 9x more security vulnerabilities than human-only code.

Let that sink in. When developers hand the keyboard entirely to AI, they're not just shipping code faster; they're shipping code with nine times as many security flaws. Every mode introduces more vulnerabilities than it fixes, but vibe coding is particularly problematic.

The Uncomfortable Truth: Most AI Code Gets Thrown Away

If developers love AI agents so much, why does 55.7% of agent-generated code never make it into production?

The data shows developers constantly pushing back: 44% of agent interactions involve a user interruption or rejection. Meanwhile, agents rarely push back or ask clarifying questions of their own; they do so in just 1.4% of turns.

It's like watching a conversation where one person keeps talking past the other.

The pattern is clear: developers use AI agents to explore possibilities and iterate quickly, not as a hands-off coding solution. They're treating these tools like turbocharged rubber ducks, throwing away most suggestions while refining the promising ones.

What Developers Actually Want From AI

Here's a surprising finding: the #1 reason developers prompt AI agents isn't to write code—it's to understand code.

19% of prompts ask the agent to explain existing code, outpacing requests for code generation. Developers are using AI as a reverse-engineering tool, a documentation generator, a way to onboard faster into unfamiliar codebases.

Yet we've been marketing these tools as "write less code" when developers are really saying "help me understand better."

The Expert Nitpicker Problem

47% of vibe-coding users are what the researchers call "expert nitpickers"—developers who stay actively involved, scrutinizing every AI suggestion, often correcting minor details.

This is actually inefficient. If you're going to review and edit every line anyway, why use vibe coding at all? The data suggests many developers would be better served by collaborative mode, where the cost-efficiency is higher and the security risk is lower.

The expert nitpickers are getting diminishing returns. They're spending cognitive effort reviewing AI-generated code that they could've written themselves—they're just... slower about it.

Real Examples of Failure

The dataset includes actual failed sessions. One example: a developer asked an agent to fix sluggish animations in an iOS app. The agent kept modifying the wrong parameter—the individual card stagger instead of the container animation.

After multiple corrections, the session ended without resolution. No commits. The agent couldn't understand spatial context or prioritize which optimization would actually solve the problem.

Another session shows an expert nitpicker engaged in relentless micro-corrections: "don't create a separate function," "inline the UUID call," "rename this constant." The developer ends up acting more like a code reviewer than a programmer.

What This Means for Your Team

If you're evaluating AI coding agents for your team, here's what the data suggests:

Use agents for understanding, not writing: They're better at explaining code than generating it. Better at documentation than creation.

Stick with collaborative mode: The 36.5% of sessions that involve back-and-forth refinement hit the sweet spot for security, efficiency, and developer satisfaction. Vibe coding sounds appealing but introduces unacceptable risk.

Plan for review overhead: If you do use AI-generated code, budget time for security review. Run Semgrep, Snyk, or similar tools on AI-committed code. The 9x vulnerability increase isn't theoretical—it's happening in production right now.
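As a concrete example, here's a minimal pre-merge gate in Python that shells out to Semgrep and fails the build if anything is flagged. It's a sketch under a few assumptions: the semgrep CLI is installed and on PATH, and its --config auto and --json flags behave as documented (the JSON output schema can vary between versions):

    import json
    import subprocess
    import sys

    def scan_with_semgrep(paths: list[str]) -> int:
        """Run Semgrep over the given paths and return the number of findings.

        Assumes the `semgrep` CLI is installed and on PATH. `--config auto`
        pulls community rules; `--json` makes the output machine-readable.
        """
        proc = subprocess.run(
            ["semgrep", "--config", "auto", "--json", *paths],
            capture_output=True,
            text=True,
        )
        report = json.loads(proc.stdout)
        for result in report.get("results", []):
            rule = result.get("check_id", "<unknown rule>")
            path = result.get("path", "<unknown file>")
            line = result.get("start", {}).get("line", "?")
            print(f"{path}:{line}: {rule}")
        return len(report.get("results", []))

    if __name__ == "__main__":
        findings = scan_with_semgrep(sys.argv[1:] or ["."])
        sys.exit(1 if findings else 0)  # fail the build when anything is flagged

Wire something like this into CI for branches with agent-authored commits, and the review overhead becomes a gate instead of an afterthought.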

Measure what actually ships: Like the developers in this dataset, you'll probably throw away 55% of AI suggestions. That's not failure—that's the tool working as intended. Judge agents by their hit rate, not their output volume.
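One crude way to measure that hit rate in Python: compare the lines an agent proposed against the lines that actually shipped. Everything here is illustrative; neither the function nor the line-level comparison comes from the SWE-chat schema:

    def suggestion_hit_rate(suggested: str, shipped: str) -> float:
        """Fraction of agent-suggested lines that survive into shipped code.

        Deliberately naive: a line-level comparison ignores moves, renames,
        and reformatting, so treat the result as a trend, not ground truth.
        """
        suggested_lines = {ln.strip() for ln in suggested.splitlines() if ln.strip()}
        shipped_lines = {ln.strip() for ln in shipped.splitlines() if ln.strip()}
        if not suggested_lines:
            return 0.0
        return len(suggested_lines & shipped_lines) / len(suggested_lines)

    # If the agent proposed four lines and two survive review, the hit rate is 0.5.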

The Bigger Picture

What makes SWE-chat valuable isn't just the numbers—it's the honesty. This dataset captures real developers in real workflows, making real decisions about AI code.

It shows that the narrative around AI agents is overoptimistic. We're not watching machines write code unsupervised. We're watching developers use machines as interactive thinking tools, discarding most suggestions, staying deeply involved in every decision.

The agents are powerful. But they're not magic. And the developers who'll thrive in this era aren't the ones surrendering to vibe coding—they're the ones treating AI as a collaborative partner, maintaining skepticism, and staying actively engaged.

The data proves it.


Want to dive deeper? The full SWE-chat dataset is publicly available, and if you're building tools on top of AI agents, the interaction patterns revealed here should inform your product roadmap. Understanding how developers actually use these tools beats guessing every time.
