
Inferact, an AI infrastructure startup founded by the core contributors behind the open-source inference project vLLM, has raised $150M in seed funding at an $800M valuation. The company is focused on making large-scale AI inference faster, cheaper, and easier to deploy.
The round was led by Andreessen Horowitz and Lightspeed, with participation from Databricks’ venture arm, the UC Berkeley Chancellor’s Fund, and additional investors.
Built by the team behind vLLM
Inferact was founded by Simon Mo, Woosuk Kwon, Kaichao You, and Roger Wang, the researchers and engineers behind vLLM, an open-source library widely used for large language model inference.
vLLM, short for virtual large language model, is maintained by a global open-source community and is already deployed by major technology companies including Meta and Google. The project focuses on optimising inference, the stage where trained AI models generate outputs in production environments.
Solving the inference bottleneck
As AI models grow larger and are used by more applications simultaneously, inference has become a major constraint. Running models for longer sessions, processing more tokens, and serving thousands of concurrent users place heavy demands on memory efficiency and hardware utilisation.
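To see why memory in particular becomes the constraint, a rough back-of-envelope calculation helps. The sketch below assumes a hypothetical 7B-class model (32 layers, 32 attention heads, head dimension 128, fp16); the figures are illustrative assumptions, not numbers published by Inferact or vLLM.

```python
# Back-of-envelope KV-cache sizing for an assumed 7B-class model.
LAYERS = 32
HEADS = 32
HEAD_DIM = 128
BYTES_PER_VALUE = 2  # fp16

def kv_cache_bytes(tokens: int) -> int:
    # Each token stores one key and one value vector per head, per layer.
    return 2 * LAYERS * HEADS * HEAD_DIM * BYTES_PER_VALUE * tokens

print(f"per token: {kv_cache_bytes(1) / 1024:.0f} KiB")                      # ~512 KiB
print(f"one 4k-token session: {kv_cache_bytes(4096) / 2**30:.1f} GiB")       # ~2 GiB
print(f"100 concurrent 4k sessions: {100 * kv_cache_bytes(4096) / 2**30:.0f} GiB")  # ~200 GiB
```

Even under these modest assumptions, serving a few hundred long-running sessions consumes far more memory for the cache than for the model weights themselves, which is why cache management dominates serving efficiency.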
vLLM addresses these challenges by optimising how models manage memory and compute. Its PagedAttention technique stores the key-value cache in small fixed-size blocks rather than one contiguous buffer per request, reducing the memory wasted to fragmentation, while additional methods such as quantisation and parallel token generation help reduce latency and infrastructure costs.
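For context on how engineering teams consume vLLM today, its offline Python interface looks roughly like the following (a minimal sketch based on vLLM's public API; the model name and sampling settings are placeholder examples only).

```python
from vllm import LLM, SamplingParams

# Load any Hugging Face model that vLLM supports; "facebook/opt-125m" is
# just a small example model, not a recommendation.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)

prompts = [
    "Explain what a KV cache is in one sentence.",
    "Name three bottlenecks in LLM serving.",
]

# vLLM batches these requests and manages the KV cache with PagedAttention
# internally; callers only see prompts in and completions out.
outputs = llm.generate(prompts, params)
for out in outputs:
    print(out.outputs[0].text)
```

The appeal for adopters is that the batching, memory paging, and scheduling all happen behind this interface, which is the layer Inferact's commercial platform intends to build on.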
From open source to commercial infrastructure
Inferact has outlined two parallel goals. The first is to continue supporting and scaling the vLLM open-source project by funding development and expanding support for new model architectures, hardware platforms, and large multi-node deployments.
The second is to build a commercial inference platform on top of this foundation. The company plans to develop what it describes as a universal inference layer, offering production-grade features such as serverless deployment, observability, troubleshooting, and disaster recovery, likely delivered via Kubernetes-based infrastructure.
Rather than competing directly with cloud providers or model hosts, Inferact says it intends to work alongside existing platforms to simplify AI serving for engineering teams.
Making AI serving simpler at scale
Inferact’s long-term ambition is to remove the need for large, specialised infrastructure teams to deploy and operate AI models at scale. By abstracting away complexity in inference, the company aims to make production AI systems easier to run as models continue to grow in size and capability.
With a large seed round and deep roots in one of the most widely adopted inference projects, Inferact is positioning itself as a core infrastructure player in the next phase of AI deployment.