Vitalik makes the case for AI-assisted formal verification as crypto's security answer

On May 18, Vitalik Buterin published a lengthy blog post arguing that AI-assisted formal verification may be the most consequential shift in blockchain security in years. The post announces no product launch and no protocol upgrade; it is a structural argument about who wins the arms race between attacker and defender as AI gets better at finding bugs.

The post, titled "A shallow dive into formal verification," opens with a direct problem statement: bugs in code are scary, bugs in code controlling immutable on-chain assets are scarier, and bugs in code wrapped in zero-knowledge proofs are scarier still, because when something goes wrong "we have no idea what went wrong." He frames AI-powered exploit discovery — tools like Anthropic's Claude Mythos, which he cites by name — as turning that threat from theoretical to imminent.

His counter-argument is not defensive. "AI gives you the ability to write large volumes of code at the cost of accuracy, and formal verification gives you back… accuracy," he writes. The combination, done right, could "output extremely efficient code, and be far more secure than the way programming has been done before." He notes that developer Yoichi Hirai calls it the "final form of software development."

What formal verification actually is

Formal verification means writing machine-checkable mathematical proofs that software behaves as specified — not testing it, but proving it. The technique is nearly 60 years old; the obstacle has always been that writing those proofs by hand was too slow and tedious to scale.

Buterin's argument is that AI changes this calculus. The post walks through Lean, a programming language used to write and mechanically verify mathematical proofs, and shows a working example: a proof that every third Fibonacci number is even, expressed in Lean code that a computer can check automatically. The point isn't the math — it's that the same approach can be applied to software: "If you formally verify end-to-end, then you are proving not just that some description of the protocol is secure in theory, but that the specific piece of code that the user runs is secure in practice."
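For readers who have not seen Lean, here is a minimal sketch of what such a proof can look like. It is not the code from Buterin's post. The names `fib`, `fib_add_three`, and `fib_three_mul_even` are illustrative choices, and the sketch assumes a recent Lean 4 toolchain in which the `omega` tactic is available without extra libraries.

```lean
-- Fibonacci numbers: fib 0 = 0, fib 1 = 1, fib (n + 2) = fib n + fib (n + 1).
def fib : Nat → Nat
  | 0 => 0
  | 1 => 1
  | n + 2 => fib n + fib (n + 1)

-- Stepping three places ahead: fib (n + 3) = fib n + 2 * fib (n + 1).
theorem fib_add_three (n : Nat) : fib (n + 3) = fib n + 2 * fib (n + 1) := by
  -- Both facts hold by unfolding the definition of fib.
  have h1 : fib (n + 3) = fib (n + 1) + fib (n + 2) := rfl
  have h2 : fib (n + 2) = fib n + fib (n + 1) := rfl
  -- The rest is linear arithmetic over the terms fib n, fib (n + 1), ...
  omega

-- Every third Fibonacci number is even.
theorem fib_three_mul_even (k : Nat) : fib (3 * k) % 2 = 0 := by
  induction k with
  | zero => simp [fib]
  | succ k ih =>
    have h : 3 * (k + 1) = 3 * k + 3 := by omega
    rw [h, fib_add_three]
    -- Goal: (fib (3 * k) + 2 * fib (3 * k + 1)) % 2 = 0, given ih.
    omega
```

The induction step rests on a single identity: moving three places ahead adds an even number to the previous term, so evenness carries over from `fib (3 * k)` to `fib (3 * (k + 1))`. Once the file compiles, the checker has confirmed every step, and a reader only needs to trust the theorem statement and the definition of `fib`.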

From a user's standpoint, Buterin argues, that collapses the trust requirement. Instead of auditing an entire codebase, you check the statements proven about it.

Where it matters most

Buterin names specific targets. EVM bytecode and RISC-V assembly are already the subject of active projects: evm-asm is building a formally verified EVM implementation written directly in RISC-V assembly — the same ISA that ZK-EVM provers use when they compile Ethereum clients. He also points to ZK-EVM implementations, STARK proof systems (via the Arklib project), consensus algorithms (a work-in-progress formally verified consensus implementation in Lean), post-quantum cryptographic primitives such as quantum-resistant signatures, and communication protocols including Signal and TLS.

A recurring theme in the post is that formal verification is particularly suited to systems where "the goal is much simpler than the implementation" — places where the security property is clean and formalizable even when the code is complex. A STARK, for example, has a simple security property: if a proof verifies for program P, input x, and output y, then either the hash algorithm is broken or P(x) = y. That kind of tight specification is where proofs are tractable.
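The STARK property described above is short enough to write as a single formal statement. The following Lean sketch is an assumption-laden illustration, not code from ArkLib or from the post: `StarkSound`, `run`, `verify`, and `HashBroken` are hypothetical placeholders standing in for a real prover and verifier.

```lean
-- Hypothetical placeholders, not taken from any real library:
--   run        : the intended semantics of a program on an input
--   verify     : the proof verifier
--   HashBroken : the proposition "the hash function has been broken"
def StarkSound {Program Input Output Proof : Type}
    (run : Program → Input → Output)
    (verify : Program → Input → Output → Proof → Bool)
    (HashBroken : Prop) : Prop :=
  ∀ (P : Program) (x : Input) (y : Output) (pf : Proof),
    -- If the verifier accepts, then either the hash is broken or y = P(x).
    verify P x y pf = true → HashBroken ∨ run P x = y
```

The statement fits in a few lines, while a verifier implementation that satisfies it can run to thousands. That gap between a small goal and a large implementation is the condition the post identifies as favorable for formal verification.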

He also highlights a less-discussed advantage: proofs are end-to-end. "Often, the nastiest bugs are interaction bugs, that sit at the edge of two sub-systems that are considered separately," he writes. Automated rule-checking can reason across the entire system where humans cannot.

The limits he names directly

Buterin's post does not overpromise. He lists four failure modes explicitly. First, proving the wrong thing: "It's very easy to forget to prove the claims that are actually important." Second, unmodeled hardware: even a formally verified software stack can fail if the underlying hardware behaves unexpectedly. Third, side-channel attacks, which sit outside the formal model entirely. Fourth, uncovered modules — verifying one part of a system still leaves the rest exposed, and "even the Lean implementation itself can have bugs."
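The first failure mode, proving the wrong thing, is easy to reproduce in miniature. The Lean sketch below is an illustration under stated assumptions rather than an example from the post: `badSort` and `Sorted` are hypothetical names, and the "specification" deliberately omits the requirement that the output contain the same elements as the input.

```lean
-- A deliberately wrong "sort" that throws its input away.
def badSort (_xs : List Nat) : List Nat := []

-- Incomplete specification: the output is in non-decreasing order.
def Sorted : List Nat → Prop
  | [] => True
  | [_] => True
  | a :: b :: rest => a ≤ b ∧ Sorted (b :: rest)

-- The proof checks, because the empty list is trivially sorted.
theorem badSort_sorted (xs : List Nat) : Sorted (badSort xs) := by
  simp [badSort, Sorted]

-- The property that was never stated: the output should keep the input's
-- elements. Even the weaker claim about length already fails.
example : (badSort [1, 2, 3]).length ≠ [1, 2, 3].length := by
  decide
```

The checker accepts `badSort_sorted` without complaint. The gap is in the specification, which no proof tool can detect on its own; a human still has to decide which claims matter.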

He calls formal verification "not a panacea," and frames it as an accelerant of a security trend already in motion — alongside type systems, memory-safe languages, sandboxing, and better software architecture — rather than a replacement for all of them.

The bigger framing: verified cores

The most consequential argument in the post is structural. Buterin pushes back against voices in the security community who suggest AI-driven exploit discovery will make open-source software or decentralized systems impossible to defend. He calls that "a bleak future for cybersecurity" that would undermine the entire cypherpunk premise that defenders can hold an asymmetric advantage online.

His alternative: future systems built around small, formally verified "security cores," with AI handling the bulk of code generation elsewhere. "When it comes to the secure core, we don't let the buggy code multiply," he writes. "We act aggressively to keep the size of the secure core small, and indeed even shrink it further."

The trust, in this model, moves from the full codebase to the proof — and the proof is checkable by machine.

What formal verification doesn't address in 2026's exploit record

The industry context Buterin writes into is not abstract. The Kelp DAO bridge attack in April drained $292 million after attackers poisoned internal RPCs used by LayerZero — an infrastructure-layer exploit targeting cross-chain messaging logic, not a smart contract bytecode bug. Drift Protocol suffered a $285 million attack through similar cross-chain infrastructure.

Bridge exploits of this class sit partly outside what formal verification currently handles well. The formal security property for a cross-chain bridge involves assumptions about external chains, relayer behavior, and message authenticity — properties that are harder to specify cleanly than "this STARK proof is valid" or "this consensus algorithm satisfies BFT conditions." Buterin acknowledges that proofs depend on the assumptions baked into their specifications; a verified bridge that trusts a compromised relayer is still vulnerable.

That doesn't invalidate his thesis — it narrows it. The formal verification work underway in the Ethereum community is concentrated on the lowest-level, most precisely specifiable components: the EVM, ZK proofs, consensus, and cryptographic primitives. Those are the layers where a verified core would be smallest and most tractable. Whether the approach scales up to the messy cross-chain trust assumptions at the top of the stack remains an open question.

Primary source: Vitalik Buterin, "A shallow dive into formal verification," vitalik.eth.limo, May 18, 2026. All quotes taken directly from the post.
