The First AI-Coded DeFi Exploit: How Claude Opus 4.6 Generated a $1.78M Oracle Bug

bitcoinindex.net · 6 min read
Here’s something I genuinely didn’t expect to write in 2026: an AI model just co-authored its first million-dollar DeFi exploit.

On February 15, Anthropic’s Claude Opus 4.6 helped write code for the Moonwell lending protocol that mispriced cbETH at $1.12 instead of its actual market value of around $2,200. The result? $1.78 million in protocol bad debt and $2.68 million in total user losses affecting 181 borrowers.

This isn’t a theoretical risk anymore. This is what happens when AI-generated code meets production financial infrastructure.

What actually happened

At 6:01 PM UTC on February 15, Moonwell deployed governance proposal MIP-X43, which enabled Chainlink OEV (Oracle Extractable Value) wrapper contracts on Base. The code contained an oracle configuration bug that failed to multiply the cbETH/ETH exchange rate by the ETH/USD price feed.

The intended calculation:
cbETH price in USD = (cbETH/ETH ratio) × (ETH/USD price)
cbETH price = 1.12 × ~$2,000 = ~$2,240

What the code actually did:
cbETH price in USD = cbETH/ETH ratio
cbETH price = $1.12

That’s a 99.9% undervaluation. Trading bots immediately recognized the opportunity and began liquidating positions, repaying ~$1 of debt to seize cbETH worth ~$2,200. Within minutes, 1,096 cbETH (worth ~$2.44 million at market price) was liquidated.
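The two formulas above can be sketched in a few lines of Python. Function names and values are illustrative, not Moonwell's actual Solidity:

```python
# A sketch of the oracle math described above. Names and values are
# illustrative, not Moonwell's actual code.

def cbeth_usd_correct(cbeth_eth_ratio: float, eth_usd: float) -> float:
    """Intended: compose the cbETH/ETH exchange rate with the ETH/USD feed."""
    return cbeth_eth_ratio * eth_usd

def cbeth_usd_buggy(cbeth_eth_ratio: float, eth_usd: float) -> float:
    """The bug: the ETH/USD multiplication is missing, so the raw exchange
    rate (~1.12) is reported as if it were a dollar price."""
    return cbeth_eth_ratio

ratio, eth_usd = 1.12, 2000.0
market_value = cbeth_usd_correct(ratio, eth_usd)  # ~$2,240
oracle_value = cbeth_usd_buggy(ratio, eth_usd)    # $1.12

# Why bots pounced: collateral is valued at the oracle price, so repaying
# ~$1.12 of debt seizes one cbETH worth ~$2,240 at market, a ~2,000x return.
profit_multiple = market_value / oracle_value
```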

Moonwell’s risk manager @anthiasxyz reacted quickly, reducing cbETH borrow and supply caps to 0.01. But the damage was done. The oracle correction required a 5-day governance timelock, and liquidations continued through February 18.

The AI coding angle

Here’s where it gets interesting. The vulnerable code was submitted in pull request #578 by Moonwell core contributor “anajuliabit” and explicitly co-authored by Claude Opus 4.6 according to GitHub commit records.

Smart contract auditor “pashov” broke the news on X on February 17:

“🚨Claude Opus 4.6 wrote vulnerable code, leading to a smart contract exploit with $1.78M loss. cbETH asset’s price was set to $1.12 instead of ~$2,200. The PRs of the project show commits were co-authored by Claude – Is this the first hack of vibe-coded Solidity code?”

The post included screenshots showing Claude Opus 4.6 listed as co-author in the GitHub commit history.

“Vibe coding” is developer slang for a workflow where you rely heavily on generative AI to rapidly generate code based on prompts, often accepting outputs with minimal line-by-line verification. The code is syntactically perfect and logically plausible at a glance. It compiles. It passes basic tests. So you trust it.

That’s the trap.

The multi-layer defense failure

According to NeuralTrust’s analysis, the vulnerable code was:

  1. AI-generated by Claude Opus 4.6
  2. Reviewed by human developers
  3. Processed by GitHub Copilot
  4. Scanned by OpenZeppelin Code Inspector

None of these layers caught the missing multiplication step.

This is a classic “Swiss Cheese” failure in security parlance. Each defense mechanism has gaps, but they usually don’t align. When they do, bad things happen. In this case, they all failed simultaneously.

Other users on X highlighted the contributor’s GitHub profile showing over 1,000 commits in the past week. That’s an extremely high work rate that likely reflects heavy reliance on AI assistance.

Testing would have caught this

Trading Strategy’s Mikko Ohtamaa pointed out something crucial:

“Regardless of whether the code is written by an AI or by a human, these kinds of errors are caught in an automated integration test suite… there was no test case for price sanity.”

He’s right. Claude can even write these tests itself. But in this case, no test existed to verify that the oracle price matched expected market values.

The failure wasn’t the AI writing bad code. The failure was deploying code to production without verifying it actually worked as intended.
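A price-sanity check of the kind Ohtamaa describes can be small. This sketch compares the oracle’s output against an independent reference price and fails if they diverge beyond a tolerance; the function name and 10% threshold are my own illustrative choices:

```python
# Minimal price-sanity check: fail loudly if the oracle deviates too far
# from an independent reference (e.g. an exchange API or a second feed).
# The 10% tolerance is an illustrative choice, not a Moonwell parameter.

def assert_price_sane(oracle_price: float, reference_price: float,
                      max_deviation: float = 0.10) -> None:
    """Raise AssertionError if the oracle strays beyond max_deviation."""
    deviation = abs(oracle_price - reference_price) / reference_price
    if deviation > max_deviation:
        raise AssertionError(
            f"oracle {oracle_price} vs reference {reference_price}: "
            f"{deviation:.1%} deviation exceeds {max_deviation:.0%}"
        )

assert_price_sane(2240.0, 2200.0)   # passes: ~1.8% deviation
# assert_price_sane(1.12, 2200.0)   # would raise: ~99.9% deviation
```

A check like this, run in CI against the deployed oracle configuration, would have rejected a cbETH price of $1.12 immediately.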

Moonwell’s oracle problem

This is Moonwell’s third oracle malfunction in six months, resulting in over $7 million in total bad debt:

  1. October 10, 2025: A pricing discrepancy during a flash crash led to $12M in liquidations and $1.7M in bad debt
  2. November 4, 2025: The $129M Balancer hack caused knock-on effects on the wrsETH/ETH oracle, creating $3.7M in bad debt
  3. February 15, 2026: The AI-coded oracle misconfiguration created $1.78M in bad debt

According to DeFi commentary account “Yieldsandmore,” the same attacker likely exploited incidents #1 and #2, “clearly constantly scanning Moonwell for extractable value.”

Moonwell’s TVL has dropped from $380 million in August 2025 to around $90 million today. Three oracle failures in six months are clearly eroding user trust.

Who’s responsible when AI writes the bug?

When AI writes a bug that causes a $1.78M loss, who’s responsible?

  • The AI vendor (Anthropic)? Claude is a tool. Anthropic provides no warranties for code it generates.
  • The developer who used Claude? They merged and deployed the code.
  • The human reviewers? They approved the pull request.
  • The automated scanners? OpenZeppelin Code Inspector didn’t flag it.
  • The protocol DAO? Governance approved MIP-X43.

The consensus in the security community is clear: ultimate responsibility rests with the team that merged and deployed the code. As SlowMist noted, “Blaming the neural network alone is incorrect.”

But this incident raises uncomfortable questions about human oversight when AI-generated code “looks right.” We suffer from automation bias: assuming AI + automated scanner approval = secure code.

Moonwell declined to comment when asked by Decrypt about the AI coding angle.

The AI vulnerability landscape

Claude Opus 4.6 was released on February 5, 2026. The Moonwell exploit occurred just 10 days later. This was a brand-new model, and teams may have been eager to use cutting-edge capabilities without fully understanding limitations.

Recent research paints a sobering picture:

Tenzai Study (December 2025): Tested identical prompts through 5 AI coding platforms (Claude Code, Cursor, Windsurf, Replit, Devin):

  • 69 vulnerabilities generated across 15 test applications
  • 6 critical vulnerabilities
  • AI avoids generic mistakes (SQL injection, XSS) but fails at authorization and business logic
  • Zero detected by traditional static analysis scanners

Sonar Data (January 2026):

  • 96% of developers distrust AI-generated code
  • Yet only 48% consistently verify before committing

That gap between distrust and verification is the problem. We know AI is fallible, yet half of us commit without rigorous review.

Can AI fix what AI broke?

Here’s the paradox: the same AI tooling that introduces vulnerabilities can also detect them.

Research from Anthropic and OpenAI shows AI agents can now execute end-to-end exploits on most known vulnerable smart contracts. Exploit capability is doubling roughly every 1.3 months. The average cost of an AI-powered exploit attempt is $1.22 per contract.

North Korea is using AI to scale hacking operations and automate parts of the exploit process.

On the defensive side, a Cecuro study released on February 20 evaluated 90 real-world smart contracts exploited between October 2024 and early 2026, representing $228 million in verified losses:

  • Specialized AI security agent: 92% detection rate, flagged vulnerabilities tied to $96.8M in exploit value
  • Baseline GPT-5.1 coding agent: only 34% detection, $7.5M coverage

The difference comes down to domain-specific methodology, structured review phases, and DeFi-focused security heuristics. General-purpose AI fails at security. Specialized AI succeeds.

Several contracts in the dataset had undergone professional audits before being exploited.

Cecuro open-sourced the benchmark dataset and evaluation framework on GitHub, but has not released its full security agent due to concerns it could be repurposed for offensive use. That’s precedent-setting: should AI security tools be treated as dual-use technology?

The compensation plan

Moonwell has proposed a comprehensive recovery plan:

  • Immediate payment: $310K from Apollo Treasury (pro-rata to all 181 borrowers)
  • 12-month stream: Remaining $2.37M via future protocol fees and OEV revenue
  • Distribution: Via Sablier time-based streaming
  • Governance consolidation: Wind down Moonwell on Moonriver, deprecate MFAM token, consolidate into single WELL-based governance
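A pro-rata split of the kind described in the immediate-payment step is simple to sketch. The borrower addresses and loss amounts below are invented for illustration; the real distribution runs through Sablier streams:

```python
# Sketch of a pro-rata distribution: a fixed pool split among borrowers in
# proportion to each one's loss. Addresses and amounts are hypothetical.

def pro_rata(pool: float, losses: dict[str, float]) -> dict[str, float]:
    """Split `pool` across claimants in proportion to their losses."""
    total = sum(losses.values())
    return {addr: pool * loss / total for addr, loss in losses.items()}

losses = {"0xaaa": 50_000.0, "0xbbb": 30_000.0, "0xccc": 20_000.0}
payouts = pro_rata(310_000.0, losses)
print(payouts)  # {'0xaaa': 155000.0, '0xbbb': 93000.0, '0xccc': 62000.0}
```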

The governance vote (MIP-B57) ran from February 18-21, 2026.

What this means for DeFi

AI coding tools are now good enough to write production DeFi code and bad enough to introduce million-dollar bugs.

The same technology powering offensive code generation is also powering defensive audits. Who wins this race? Right now, offense is scaling faster than defense.

The solution isn’t to ban AI from writing smart contracts. The solution is to recognize that AI-generated code requires the same rigorous testing and verification as human-written code. Maybe more.

Test for price sanity. Test for edge cases. Test for the things that should never happen but might anyway.

Because when you’re handling millions of dollars in user funds, “it compiled and looked right” isn’t good enough.

Sources: Oracle Error Leaves DeFi Lender Moonwell With $1.8 Million in Bad Debt, DeFi, meet Claude: Moonwell’s ‘vibe-coded’ oracle in $1.8M blowup, Ether briefly priced at $1 after glitch on DeFi app, MIP-X43 cbETH Oracle Incident Summary, The $1.78M Moonwell Incident and the Future of Agentic Security, Specialized AI detects 92% of real-world DeFi exploits, AI Coding Tools Systematically Ship Security Flaws, Moonwell’s AI-coded oracle glitch misprices cbETH at $1, Moonwell recovery plan proposes $2.68M cbETH compensation, Moonwell Contracts v2, Pull Request #578, Introducing Claude Opus 4.6. Data/status as of February 27, 2026.