AI has officially figured out how to train itself without human help

CoreWeave says real-world data could reshape how AI agents improve

AI has officially figured out how to train itself without human help, at least according to the latest announcement from AI cloud company CoreWeave.

The company has launched a new set of agentic AI capabilities aimed at enterprise customers, as the race to build more reliable autonomous systems continues to accelerate.

AI agents are seen by many in the industry as the next major step beyond chatbots, because they are designed to complete tasks, use tools, make decisions, and work through multi-stage problems with less direct input from users.

However, that creates an obvious problem for companies hoping to use them in real businesses.

Testing an AI assistant that answers one question is difficult enough, but testing an agent that can act across a workflow is far harder.

CoreWeave calls the feedback system its route to superintelligence (CoreWeave)

The issue is that traditional development relies on long offline evaluations before agents are exposed to real users, and those tests cannot cover every situation the agents might face.

CoreWeave says its new platform is designed to change that by connecting training and inference in a single closed feedback loop, allowing AI agents to learn from real-world activity and improve as they operate. This is the basis of the ‘train itself’ element of the company’s pitch.

Instead of treating training and live use as two separate stages, CoreWeave says its system brings reinforcement learning, production inference, agent monitoring, and autonomous improvement together.

The company describes this as progress towards the ‘superintelligence loop’, where feedback from real-world use can be turned into improvements much faster than traditional testing cycles allow.
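To make the idea of a closed feedback loop concrete, here is a minimal toy sketch in Python. It is not CoreWeave's API or method: the `ToyAgent` class, the strategy names, and the success rates are all hypothetical, and a simple epsilon-greedy update stands in for the far more involved reinforcement learning the company describes. It only illustrates the general pattern of acting, recording the outcome, and learning from it in the same loop.

```python
import random


class ToyAgent:
    """Hypothetical agent that picks a strategy and tracks its average reward."""

    def __init__(self, strategies, epsilon=0.1, seed=0):
        self.rng = random.Random(seed)  # seeded so runs are repeatable
        self.epsilon = epsilon
        self.counts = {s: 0 for s in strategies}
        self.values = {s: 0.0 for s in strategies}

    def act(self):
        # "Inference": usually pick the best-known strategy, sometimes explore.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def learn(self, strategy, reward):
        # "Training": update the running mean reward for the chosen strategy.
        self.counts[strategy] += 1
        n = self.counts[strategy]
        self.values[strategy] += (reward - self.values[strategy]) / n


def run_closed_loop(agent, success_rates, steps=2000):
    """Act, observe a simulated real-world outcome, log it, and learn at once."""
    traces = []
    for _ in range(steps):
        strategy = agent.act()
        succeeded = agent.rng.random() < success_rates[strategy]
        reward = 1.0 if succeeded else 0.0
        traces.append((strategy, reward))  # "monitoring": keep a trace
        agent.learn(strategy, reward)      # feedback goes straight back in
    return traces


# Made-up success rates for two hypothetical strategies.
success_rates = {"plan_a": 0.3, "plan_b": 0.8}
agent = ToyAgent(success_rates, seed=0)
traces = run_closed_loop(agent, success_rates)
print(agent.values)
```

In this toy setting the agent ends up favouring the strategy that works more often, without any separate offline evaluation stage. Real agent training involves far more than this, but the shape of the loop is the same.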

As laid out by the Financial Times, the launch brings together four main components: CoreWeave Serverless RL, CoreWeave Inference, W&B Weave, and W&B Skills with an MCP server.

Serverless RL is designed to let companies post-train large language models for multi-turn agent tasks without managing their own infrastructure. CoreWeave claims the service can reduce costs by up to 40% and accelerate training by approximately 1.4x compared with local H100 GPU environments.
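As a rough illustration of what those figures would mean in practice, the short Python sketch below applies them to a made-up baseline. The baseline cost and duration are hypothetical, the function name is ours, and the 40% figure is treated as a best case because CoreWeave states it as "up to".

```python
def apply_claimed_gains(baseline_cost, baseline_hours,
                        cost_reduction=0.40, speedup=1.4):
    """Apply the claimed best-case cost cut and approximate speed-up."""
    best_case_cost = baseline_cost * (1 - cost_reduction)
    estimated_hours = baseline_hours / speedup
    return best_case_cost, estimated_hours


# Hypothetical baseline: a $1,000 training run that takes 14 hours.
cost, hours = apply_claimed_gains(1000.0, 14.0)
print(f"best-case cost: ${cost:.2f}, estimated time: {hours:.2f} hours")
```

On those assumed numbers, the run would cost about $600 at best and finish in roughly 10 hours. Actual results would depend on the workload and on how the company measured its comparison.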

CoreWeave's separate GPU clusters are said to speed up AI learning cycles (CoreWeave)

CoreWeave Inference handles production deployment, while W&B Weave acts as the observability layer, helping teams monitor agent behaviour and spot failure modes. W&B Skills and the MCP server are then intended to help coding agents operate more like AI researchers and agent builders.

Chen Goldberg, Executive Vice President of Product and Engineering at CoreWeave, said: "The pace of AI has outrun the way teams build for it. Today's tradeoff: dev cycles that can't keep up, or shipping agents and discovering failure modes in production."

"Enterprises that put agents in production first and let them continuously improve from real-world experience aren't just building more reliable AI, they're accelerating the path to superintelligence."

Nick Patience, Vice President & Practice Lead, AI Platforms at Futurum, added: "Most enterprises are stuck in a cycle of building and testing agents before they ever reach real users, and that cycle is becoming too slow and too expensive to sustain."

Featured Image Credit: Ekaterina Goncharova / Getty