AIxCC: AI Cyber Challenge | Ep 89

Sep 22, 2025

Voices

Michael Brown, AIxCC Team Trail of Bits
Andrew Carney, AIxCC program manager, DARPA and ARPA-H
Kathleen Fisher, office director, DARPA Information Innovation Office
Taesoo Kim, AIxCC Team Atlanta
Tyler Nighswander, AIxCC Team Theori
Jennifer Roberts, office director, ARPA-H Resilient Systems Office
Stephen Winchell, DARPA director
Host: Tom Shortridge, Public Affairs

Identifying and patching vulnerabilities at speed and scale

The AI Cyber Challenge, AIxCC, marks a pivotal inflection point for cyber defense.

Numerous attacks in recent years have illuminated the ability for malicious cyber actors to exploit vulnerable software that runs everything from financial systems and public utilities to the health care ecosystem. AIxCC competitors successfully demonstrated the ability of novel autonomous systems using AI to secure the open-source software that underlies critical infrastructure, with winners revealed at DEF CON 33.

Hear from DARPA and ARPA-H on the game-changing results the competitors achieved. And although the competition is over, the challenge continues. All seven finalist teams’ CRSs have been made available as open-source software under a license approved by the Open Source Initiative.

Hear from teams on their experiences through the competition and how they are moving their systems to the real world.

Transcript

Intro Voices

Coming to DARPA is like grabbing the nose cone of a rocket and holding on for dear life.

DARPA is a place where if you don't invent the internet, you only get a “B.”

A DARPA program manager quite literally invents tomorrow.

Coming to work every day and being humbled by that.

DARPA is not one person or one place. It's a collection of people that are excited about moving technology forward.

Tom Shortridge
Hello and welcome to Voices from DARPA. I'm your host, Tom Shortridge. On this episode, we're diving into DARPA’s AI Cyber Challenge, AIxCC, which recently revealed the winners of its final competition at DEF CON 33.

Andrew Carney
In third place: congratulations to Team Theori.

In second place with a prize of $3 million: congratulations to Team Trail of Bits.

And in first place, with a prize of $4 million: congratulations to Atlanta.

Tom Shortridge
That’s all for this episode of Voices from DARPA.

I'm just kidding. Congratulations to the winners, but there's a lot more to the story.

If you want to start from the beginning, you can go back and listen to episode 73, where we first discussed the AI Cyber Challenge before any of the competitions had taken place. In the two years since, a lot has happened.

To set the playing field, let's hear from DARPA and ARPA-H AIxCC Program Manager Andrew Carney.

Andrew Carney
To achieve technology truly indistinguishable from magic, we need infrastructure and software that is extremely robust, extremely performant, and extremely resilient. But in today's increasingly software-driven world, we heavily rely on projects developed and maintained by passionate, capable volunteers and donations, and this fuels our critical infrastructure. It underpins our daily lives. At the same time, that critical infrastructure is under constant attack by nation states and ‘unsophisticated’ cyber actors.

How do we go from where we are today to what we can imagine, the sort of tech future and all the benefits of realizing that?

Tom Shortridge
DARPA Director Stephen Winchell describes the current state of the foundational tech stack and a fundamental problem that AIxCC is working to solve.

Stephen Winchell
We're living in a world right now that has ancient digital scaffolding that's holding everything up. A lot of the code bases, a lot of the languages, a lot of the ways we do business and everything we've built on top of it incurred huge technical debt over the years. And the reality is, it is a problem that is beyond human scale, and it's a critical problem that we need to solve right now.

Andrew Carney
And so in response to this growing need for better automated vulnerability discovery and remediation, DARPA launched the AI Cyber Challenge and has collaborated with ARPA-H to change the world.

Stephen Winchell
This is one of the major efforts that we have to actually try to solve some of those problems in a way that's scalable, in a way that leverages a lot of our strengths, American AI companies, ingenuity here in the United States, and also just really, really awesome people. This is a huge opportunity for the community to have great new tools and for all of us to work together to secure that digital scaffolding and make all of our lives better, both for every American citizen as well as for the men and women in uniform who are out there on the front line.

So these are critical technologies for all of us, and automating them and scaling them is the future.

Tom Shortridge
So that's the problem space and the big picture on AIxCC. But what does the competition actually entail?

Andrew Carney
So this is a public competition to develop autonomous systems that can find real vulnerabilities and patch them effectively in source code. The vulnerabilities themselves are realistic. We've taken real open source software and developed synthetic forks and developed realistic vulnerabilities and inserted them into those synthetic code bases. So millions of lines of real code with novel synthetic vulnerabilities that have never been seen by any person or LLM before.

So discovery is important, but patching is where we really change the game. So we wanted to reward teams and systems that could patch as effectively as possible. We see a lot of efforts on automated vulnerability discovery and those are great. I love that. That's been most of my career. But what we need is to fill this gap in patch development, patch generation in an effective, timely manner because the scale of the problem is sort of beyond what we can deal with as humans. We have a lot of things to address before we get to that tech ‘magic.’

Tom Shortridge
DARPA tackles challenges right at the edge of possibility, but AIxCC marks a pivotal inflection point in cybersecurity, proving that we can automatically find and patch vulnerabilities in code at speed and scale. Doctor Kathleen Fisher, director of DARPA's Information Innovation Office, I2O, weighs in.

Kathleen Fisher
So I think the competition has changed the future of cybersecurity. We fundamentally changed our understanding of what's possible. Two years ago, the people who work in this space, when they started the competition, turns out they were pretty skeptical. They thought it was going to be the kind of the cyber reasoning, the traditional program analysis, kinds of tools that were going to win the day.

But over the course of the competition, they realized that the AI systems actually had a ton to add to the competition. So how are they going to approach these kinds of tools in the future? Fundamentally different. The artifacts that they built turned out to be really, really good at finding and fixing bugs, which we saw in the results of the competition.

Tom Shortridge
Let's put the competition into perspective. The playing field, if you will, was 54 million lines of code into which the AIxCC team inserted 70 novel synthetic vulnerabilities to be discovered by the competitors. Of those 70, teams discovered 54 and patched 43. But that's not all.

Kathleen Fisher
They found 18 zero days, and they patched 11 of them. And they did that – like, it took 45 minutes on average to find, to patch a vulnerability. That’s just, like, game changing times. And it took like $152 per successful task. That's just super, super cheap compared to the time that it would take a person to do that. So that will enable us to find and fix bugs in critical software.

Just game changing, the different rates and costs. So that's the technical miracle. So DARPA’s taken that technical miracle off the table.
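For reference, the counts quoted above (70 synthetic vulnerabilities, of which 54 were found and 43 patched, plus 18 zero days found and 11 patched) can be turned into rates with a short calculation. This is only an illustrative sketch: the variable and function names are assumptions made for this example and are not part of any AIxCC tooling or scoring method.

```python
# Counts as quoted in the episode; everything derived below is a plain ratio.
synthetic_total = 70      # synthetic vulnerabilities inserted by the organizers
synthetic_found = 54      # of those, discovered by the teams
synthetic_patched = 43    # of those, patched by the teams
zero_days_found = 18      # previously unknown real-world bugs found
zero_days_patched = 11    # of those, patched


def rate(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Hypothetical summary keys, chosen only for readability.
summary = {
    "synthetic_discovery_rate": rate(synthetic_found, synthetic_total),
    "synthetic_patch_rate": rate(synthetic_patched, synthetic_total),
    "patched_share_of_found": rate(synthetic_patched, synthetic_found),
    "zero_day_patch_rate": rate(zero_days_patched, zero_days_found),
}

for name, value in summary.items():
    print(f"{name}: {value}%")
```

On these numbers, roughly 77% of the synthetic vulnerabilities were found and about 61% were patched, and about 61% of the discovered zero days were patched.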

Tom Shortridge
Now even though we at DARPA love our metrics, we know that there can be more to learn than just what's visible in the data.

Andrew Carney
One of the most exciting things about this competition is how much the teams have left on the table, that we weren't able to necessarily capture the amazing things the teams can do. The scoreboard is an accurate and useful reflection of their performance in the competition, but it's not an accurate representation of their potential or even capability today. All of the teams have done something incredible and have a unique capacity to find and patch vulnerabilities.

Tom Shortridge
To this point, we've talked about the implications of AIxCC in very broad strokes: critical infrastructure and cybersecurity. For a more concrete example, let's turn to our friends at the Advanced Research Projects Agency for Health. ARPA-H partnered with DARPA on AIxCC in March of 2024, expanding the competition's prize pool and providing additional insight and connections to our nation's health care infrastructure.

Doctor Jennifer Roberts, director of ARPA-H’s Resilient Systems Office, puts the potential impact from AIxCC in real world perspective.

Jennifer Roberts
What our office focuses on is how to make it so that the health care ecosystem works well, regardless of whether it's a steady state or there's an unexpected disruption, like a cyberattack on a hospital system, which is a huge challenge area, especially considering a third of our nation's hospitals are at risk of going bankrupt. So a cyberattack could be the thing between them and closing their doors, which clearly has drastic implications for patient care.

So health infrastructure is increasingly targeted by cyberattacks, and that has negative implications both for patient care and for patient privacy. And even if a cyberattack only causes delays of patient care for a few minutes, that can have drastic consequences. We have been so excited to be part of the AI Cyber Challenge, in order to make it so that we can draw more attention and more progress toward the unique challenges in the health care sector.

Because the diversity of medical devices, the number of different types of devices in a hospital that need to work 24/7 just makes this a much harder problem. Patient lives are at risk, as is privacy of patient records. So the off-the-shelf tools are not cutting it. Which is why we are so excited about the amazing results, because this can be game changing in health care and move us toward a reality where ransomware attacks across hospitals become a thing of the past.

Tom Shortridge
So the AIxCC competition is complete, but that means the real work is just getting started.

Andrew Carney
Having the competition results made public is super exciting. It lets us start the next phase of this adventure, where we get to take the technology that the teams have developed and actually have it introduced into the real world. We've proven that it can find and patch vulnerabilities at scale in realistic code, and find zero days and patch them. And now our success is sort of giving us this opportunity to help continue that deployment into real code bases and secure our critical infrastructure at large.

Kathleen Fisher
So one piece of that is all of the teams, to get their money, have to open source their technology.

Tom Shortridge
Since the end of the competition, all seven of the finalist teams, not just the winners, have open sourced their cyber reasoning systems. You can find those at archive.aicyberchallenge.com along with competition infrastructure, challenges, and documentation. We'll have a link in the show notes.

Kathleen Fisher
People out in the world can just go download and start using this and finding and patching their software today. And I should say that the technology that we're seeing, the game changing results, that's the worst it's ever going to be. It's just going to get better from there.

Tom Shortridge
We've heard an awful lot from DARPA and ARPA-H. Now let's turn the mics over to the real stars of the show: the competitor teams. First, from Team Theori, who placed third, Tyler Nighswander.

Tyler Nighswander
Our team is a lot of people who play different capture the flag. We've got in first place in the competition eight times. Most of the times we don't get first. We have gotten second. I was also part of the team that won the Cyber Grand Challenge in 2016. So we have a lot of experience with all these things, and this really has been a game changing thing, in my opinion. This was not just another okay, yeah, we will do another competition. This was, you know, a massive change and a massive shift in what I think is possible for the future.

When we started playing, we were pretty confident that we'd be able to solve challenge problems. Right. If someone inserts a challenge problem and expects it to be solved, we can probably handle that.

I think the most exciting and surprising thing to us was when we ran some of these challenge problems and realized we don't just find the challenge problems, we're finding real security bugs that are relevant today that the upstream people don't know about. Right? And that was really exciting because at the end of the day, the goal isn't... I mean, yes, we want to win the competition, but the real goal is to have security relevant findings that we can use in the real world.

Tom Shortridge
From second-place finisher, Team Trail of Bits, here's Michael Brown.

Michael Brown
This is an area that I've worked on for probably the last seven years. The intersection of AI/ML and applying that to solve security problems. AI/ML has long promised to give us the scale needed to tip the scales back in favor of defenders, who have to defend everywhere all the time, versus attackers, who only have to be right in one place at one time.

And I think today that promise was finally delivered on. We’ve finally shown that we can find real world vulnerabilities, we can patch them, and we can do this at scale, and we can do it at a price that is reasonable for virtually any organization to invest in. We really designed our cyber reasoning system, which we call Buttercup, to be used by anybody.

At Trail of Bits, we're big believers in open source and we're big believers that in security, we need a rising tide that lifts all ships. So we think virtually anybody can use the code, and we've actually backed that up. So in addition to open sourcing the versions of our cyber reasoning system that competed in both the semifinal and final challenges, we also open sourced a version of Buttercup that runs on a laptop, so literally anybody can use it. And that's available right now.

Tom Shortridge
For first place Team Atlanta, victory wasn't assured. In fact, as team lead Dr. Taesoo Kim notes, the team was working hard right up until moments before the submission deadline.

Taesoo Kim
We put so much energy into this competition for the last two years, and then we were worried that our system might fail. This competition is all about automation, but one single line of the CRS might destroy the competition. We worried so much in every stage of the competition. Six hours before the deadline, we found a very significant vulnerability after submitting, and we had to wake up every single patching case at 5 AM to fix those type of vulnerabilities.

Although this is a single line, it plays a critical role in our system – might our system fail?

Tom Shortridge
And obviously those last minute fixes worked.

Taesoo Kim
This is a great moment. You can think of it this way: in the future, the AI agent, every single one of you guys now have a security expert next to you. Every decision you're going to make, you can advise. You can get some feedback from what you're doing and even proactively involve in your daily based decisions in your lifetime.

I think I see the possibility right now about, software developers now have agents, like we developed right now, can leverage those systems in their daily life. I think this is not the future - because of AIxCC, we truly believe this is happening right now to drastically improve the quality of their software, their writing, and the possibility to fix ahead of attackers.

Tom Shortridge
For a deeper dive into each of the finalist teams, you can visit aicyberchallenge.com, where in addition to the open source cyber reasoning systems and competition data we already mentioned, you can watch each team's presentation at DEF CON on challenges and lessons learned through the AIxCC journey. Again, links in the show notes. Now, while the AIxCC competition is complete, the challenge continues. Here's DARPA Director Stephen Winchell.

Stephen Winchell
How do we work with these small teams to commercialize their product and actually get it out onto different critical infrastructure software systems and start patching them at scale? We’ve invested an extra $1.4 million to make sure the teams have a lot of incentives in prizes to actually go out and solve some of these problems and start working them, and we'll be incrementing $10,000 at a time, up to $200,000 for each of the finalist teams as they go out into critical infrastructure software and start actually rolling out solutions.

Tom Shortridge
And Dr. Kathleen Fisher.

Kathleen Fisher
We see in the newspaper constant reports of national security vulnerabilities being put at risk. Things like Salt Typhoon and Volt Typhoon and other attacks. And we have this sense of learned helplessness that there's just nothing we can do about it. That's the way software is. But AIxCC points the way to a brighter future, where software does what it's supposed to do and nothing else.

AIxCC is pointing to a way where we can have a world where we can have software that we can rely on. Of course, this is just the first step. We need to actually materialize that future, and it's up to everyone to help us go from the world where we are right now, to realize the vision that AIxCC is pointing us to.

Tom Shortridge
And to close this one out, here's Andrew Carney with his final comments from the DEF CON stage.

Andrew Carney
The world changes today. The systems that were developed by every finalist, they've shown that they can fix software extremely quickly. Real software in a scalable cost effective way. And these tools are yours to use right now. There's no excuse not to use this flavor of automation, and it'll only get better from here. This is the new floor. So if you're a developer, a maintainer, or an end user, we want to hear from you, especially if you work in critical infrastructure.

Reach out to us at aixcc@darpa.mil, and let us help you leverage this new technology to change the world.

Tom Shortridge
That's all for this episode of Voices from DARPA. For more information on the AI Cyber Challenge, visit aicyberchallenge.com and darpa.mil. Check the show notes for links. As always, thanks for listening.

Office

Information Innovation Office
