Google Gives AI Kudos For Finding Its First Real 0-Day Security Threat

Google AI servers
Usually, it takes a human to uncover vulnerabilities in a computer system, but before long, computers may be doing it on their own. Google's DeepMind and Project Zero divisions have teamed up to create a new kind of large language model (LLM) that can identify security flaws, and the model has spotted its first critical threat in the open-source SQLite database engine.

The Google team has been working on this project since last year, when they created the first version of the vulnerability sniffer known as Naptime. Now, that model has evolved into Big Sleep, which Google says is designed to detect exploit variants that have slipped past humans and shown up in the wild.

Generative artificial intelligence is often shoehorned in where it doesn't belong, but Google says this task is ideal for an LLM. This type of AI is essentially fueled by randomness, which leads to errors known as hallucinations. Open-ended vulnerability research might benefit from LLMs in some fashion, but giving the model a firm starting point is a better application for this technology. By feeding the model details of previously fixed vulnerabilities, researchers give it that starting point from which it can hunt down variants of those bugs.

As security researchers have found, when there's one bug, there are probably other similar ones. Security teams often use a technique called fuzzing to find them. Pumping invalid or random data into a program can cause a crash or memory leak, which an attacker could exploit to gain access to a system. However, the success of Big Sleep suggests fuzzing is not catching all the variants.
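For readers unfamiliar with the technique, here is a minimal fuzzing harness sketch using LLVM's libFuzzer. The parse_record() function is a hypothetical stand-in for the code under test, not anything from SQLite or Big Sleep, and it contains a deliberately unchecked copy so the fuzzer has something to find.

```c
// Minimal libFuzzer harness sketch: the fuzzer repeatedly calls the entry
// point below with mutated byte buffers; crashes or sanitizer reports
// flag a bug. Build with: clang -g -fsanitize=fuzzer,address fuzz.c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

// Hypothetical code under test (not SQLite). The unchecked memcpy means
// any input longer than 16 bytes overflows buf and trips AddressSanitizer.
static int parse_record(const uint8_t *data, size_t size) {
    char buf[16];
    if (size == 0) return 0;
    memcpy(buf, data, size);  // deliberately missing a bounds check
    return buf[0];
}

// libFuzzer entry point: return 0 to tell the fuzzer the input was handled.
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    parse_record(data, size);
    return 0;
}
```

In practice, teams run harnesses like this for hours or days against real parsing code; the bugs that survive that kind of pounding are exactly the ones Big Sleep is meant to catch.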

The architecture of Google's Naptime LLM, which is now known as Big Sleep.

Big Sleep identified the vulnerability in a pre-release version of SQLite: a stack buffer underflow that a knowledgeable attacker could have exploited. Google reported the issue to the SQLite maintainers in October, and they devised a fix the same day, before the flaw ever reached an official release. That's a win for the good guys.
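To make the bug class concrete, here is a contrived stack buffer underflow in C. It is not the actual SQLite code; record_value() and its negative index are illustrative assumptions, showing how an unchecked index can write below the start of a stack buffer.

```c
// Contrived illustration of a stack buffer underflow: an out-of-bounds
// write below the start of a stack array. This is NOT the SQLite flaw,
// just a minimal sketch of the bug class.
#include <stdio.h>

static void record_value(int index, char value) {
    char buf[8] = {0};
    // Bug: index is never validated, so a negative value (for example,
    // a sentinel like -1 that a parser forgets to special-case) writes
    // below buf, corrupting adjacent stack memory.
    buf[index] = value;
    printf("stored %c at %d\n", value, index);
}

int main(void) {
    record_value(-4, 'A');  // out-of-bounds write below the buffer
    return 0;
}
```

Compiling with -fsanitize=address and running it should produce a stack-buffer-underflow report, which is how issues like this typically surface once someone knows where to look.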

The team expresses genuine excitement at having found this vulnerability. The ability of an LLM to identify a notable flaw in a popular, well-fuzzed project is a major milestone in AI and security research. The team hopes that future versions of Big Sleep will help find exploitable crashes and conduct root-cause analysis, and that it may eventually uncover entirely new vulnerabilities rather than just variants of known ones. The Google team intends to continue sharing this work as it develops.