AI data infrastructure startup Datacurve has raised $15 million in Series A funding to redefine how AI companies collect and refine high-quality training data — taking direct aim at established players like Scale AI.
The round was led by Mark Goldberg at Chemistry, with participation from employees at DeepMind, Vercel, Anthropic, and OpenAI, alongside notable angel investors. It follows a $2.7 million seed round that included backing from Balaji Srinivasan, former CTO of Coinbase.
A “bounty hunter” system for the world’s best engineers
Founded by Serena Ge and Charley Lee, Datacurve operates a “bounty hunter” platform that incentivizes skilled software engineers to contribute high-value datasets for AI model training. Contributors earn rewards for completing complex coding or data-generation tasks, and the startup has already distributed over $1 million in bounties.
But Ge says that financial incentives are only part of the story.
“We treat Datacurve as a consumer product, not a data-labeling operation,” Ge explained. “Our focus is on creating an engaging and frictionless experience that attracts top talent. The best contributors join because they enjoy the challenge — not just the payout.”
This design philosophy has helped Datacurve build an active, high-skill community that produces nuanced, high-quality datasets essential for today’s reinforcement learning (RL) and fine-tuning processes.
The post-training data opportunity
AI companies increasingly rely on sophisticated post-training data — datasets built from strategic interactions, edge-case testing, and domain-specific environments that can’t be scraped or simulated.
“Early models were trained on simple, static datasets,” Ge said. “But modern AI systems need dynamic, high-quality environments to learn from. We’re creating the infrastructure to make that possible — an ecosystem that attracts and retains the most capable domain experts.”
While Datacurve’s current focus is software engineering, the company sees future potential across industries such as finance, marketing, and medicine, where specialized expertise is crucial for producing context-aware data.
Competing in a maturing AI data market
As AI model development becomes increasingly commoditized, investors see data quality as the next major competitive frontier. Scale AI, once dominant in this space, has faced growing competition since co-founder Alexandr Wang’s move to Meta.
Mark Goldberg of Chemistry says that’s where Datacurve’s model stands out:
“Datacurve is building the next-generation data network — one that combines human intelligence, gamified contribution, and AI-assisted validation. It’s the natural evolution of data infrastructure for frontier models.”
About Datacurve
Datacurve builds high-quality, human-verified datasets for AI training using a “bounty hunter” system that engages top engineers worldwide. By blending incentives, gamified participation, and expert-driven validation, Datacurve helps AI labs collect complex, post-training data that enhances reasoning, reliability, and real-world performance.