
Three years ago, while working on chip design at Intel, Luminal co-founder Joe Fioti had a realization: even the most advanced hardware is limited if the software that powers it is difficult for developers to use.
“You can make the best hardware on earth, but if it’s hard for developers to use, they’re just not going to use it,” he told reporters.
That insight led to the creation of Luminal, a startup focused on solving GPU software bottlenecks. The company has now raised $5.3 million in seed funding, led by Felicis Ventures, with participation from angels including Paul Graham, Guillermo Rauch, and Ben Porterfield. Luminal was part of Y Combinator’s Summer 2025 batch.
Optimizing the Software Layer Between AI Models and GPUs
Luminal sells compute infrastructure, similar to new-generation cloud companies like CoreWeave and Lambda Labs. But rather than competing on raw hardware capacity, the startup differentiates itself by squeezing more performance out of existing GPUs through deep software optimization.
The company focuses on the compiler — the crucial translation layer between developer-written code and GPU instructions. This is the same software stack that Fioti struggled with while building chips at Intel.
Today, the industry standard is Nvidia’s CUDA, a powerful but largely proprietary system that has played a huge role in Nvidia’s dominance. With global GPU shortages and surging model workloads, Luminal believes there is enormous opportunity in building a more efficient, more flexible alternative for the rest of the compute stack.
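To make the compiler layer concrete: one of the classic optimizations such a compiler performs is kernel fusion, where several small operations are merged into a single pass over the data, eliminating intermediate memory traffic. The sketch below is purely illustrative — it is not Luminal’s API or CUDA code — and uses plain Python lists to stand in for GPU tensors:

```python
# Illustrative sketch (hypothetical, not Luminal's actual compiler):
# why fusing operations matters. Each "unfused kernel" materializes
# an intermediate buffer; the fused version does one pass over memory.

def unfused(xs):
    # Naive execution: three separate "kernels", two intermediates.
    a = [x * 2 for x in xs]        # kernel 1: scale
    b = [x + 1 for x in a]         # kernel 2: shift
    return [max(x, 0) for x in b]  # kernel 3: ReLU

def fused(xs):
    # "Compiled" execution: one fused kernel, no intermediates.
    return [max(x * 2 + 1, 0) for x in xs]

data = [-2.0, 0.5, 3.0]
assert fused(data) == unfused(data)  # same result, less memory traffic
```

On a GPU, where reads and writes to device memory often dominate runtime, automating transformations like this across arbitrary model architectures is the compiler’s core job.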
Riding the Wave of Inference Optimization
Luminal is part of a fast-emerging category of startups focused on inference optimization — making AI models run faster and cheaper. Companies like Baseten and Together AI have helped pioneer the space, while newer teams such as Tensormesh and Clarifai are experimenting with specialized performance techniques.
But Luminal faces competition from the optimization teams inside major AI labs and cloud hyperscalers, who tune for a narrower set of models and hardware. Luminal, by contrast, must optimize for whatever architecture its customers bring.
Fioti isn’t worried.
“It’ll always be possible to spend six months hand-tuning a model for a specific piece of hardware and outperform a compiler,” he said. “But our bet is that for everything short of that, a flexible, all-purpose optimization layer is incredibly valuable — and the demand is only growing.”
A Team Built for Systems-Level Problem Solving
Fioti co-founded Luminal with Jake Stevens (ex-Apple) and Matthew Gunton (ex-Amazon). Together, they bring engineering experience across large-scale compute, hardware design, and high-performance software — all essential for rethinking how AI workloads hit the metal.