OpenAI And Broadcom Reveal Jalapeño AI Chip To Bring The Heat To NVIDIA
Running ChatGPT at planetary scale on rented hardware is expensive, and most of that hardware traces back to one supplier. Leaning on NVIDIA's GPUs has worked well enough to this point, but it leaves OpenAI paying a premium on every query and jockeying with the entire industry for supply. A custom-designed chip helps solve both problems.
OpenAI built an unusually large compute element and paired it with high-bandwidth memory to maximize compute, minimize data movement and keep its cores fed. TSMC reportedly manufactures the chip, Broadcom handles the silicon implementation and networking, and Celestica builds the server racks.
The development timeline is a jaw-dropper. A high-performance chip manufactured on a leading-edge node usually takes two to three years from concept to tape-out. Jalapeño supposedly got there in nine months. OpenAI leaned on its own models to help grind through layout and optimization, and engineering samples are apparently already humming in the lab, including one running the GPT-5.3-Codex-Spark model.
Leaked financials suggest OpenAI pulled in roughly $13.07 billion in revenue last year against an operating loss near $20.9 billion. Recently published OpenAI research found its most active internal users now run more than 60 hours of Codex agent work in a single day, exactly the token-hungry workload Jalapeño is built to serve cheaply. Broadcom CEO Hock Tan has said early samples cut inference costs by about half versus a typical AI GPU, and he told one source that surging demand could let the partnership "do better" than his earlier deployment forecast. OpenAI hardware chief Richard Ho added further flavor to the new chip, calling Jalapeño "a very general purpose device" tuned for language models.
Every performance figure so far is self-reported, with a full technical report promised later this year. OpenAI still depends heavily on NVIDIA GPUs for training, and any chip is only as good as the supply chain feeding it. Jalapeño is the first rung of a planned multi-generation ladder: the next version is reportedly due in 2028, with yearly updates after that.
