SAN FRANCISCO, Nov. 20, 2025 (GLOBE NEWSWIRE) -- Crusoe, the industry’s first vertically integrated AI infrastructure provider, today announced the general availability of Crusoe Managed Inference, a new service designed to run leading model inference on Crusoe Cloud with ultra-low latency, breakthrough time-to-first-token (TTFT) speed, and resilient scaling. The service is optimized for the most demanding inference workloads, including large-context and long-form text generation. AI developers can use Crusoe Managed Inference to rapidly deploy and automatically scale production-ready models, instantly enabling new capabilities like AI agents and complex task automation.

The new service is powered by Crusoe's proprietary inference engine, the only inference engine with MemoryAlloy technology, a cluster-wide KV cache that eliminates duplicate prefills by allowing GPUs to fetch prefix caches from local and remote nodes instantly. Crusoe MemoryAlloy is a proprietary cluster-native memory fabric that enables persistent sessions, contextual continuity, and seamless scaling across an entire cluster. This results in faster and more cost-effective inference for AI developers.
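
The idea of a cluster-wide prefix cache can be illustrated with a small sketch. This is not Crusoe's implementation: the names `Node`, `Cluster`, `block_keys`, and `BLOCK` are hypothetical, and the toy version tracks only which token prefixes are cached, not the KV tensors themselves. It assumes a request first checks its own node, then other nodes, and prefills only the tokens that no node has already processed.

```python
from dataclasses import dataclass, field

BLOCK = 4  # tokens per cache block (illustrative value, not from the announcement)


def block_keys(tokens):
    """Return one key per full block; each key covers the whole prefix up to that block."""
    return [tuple(tokens[:end]) for end in range(BLOCK, len(tokens) + 1, BLOCK)]


@dataclass
class Node:
    """Hypothetical GPU node holding the set of prefix keys it has cached."""
    name: str
    cache: set = field(default_factory=set)


class Cluster:
    """Toy cluster-wide prefix cache: look up locally first, then on remote nodes."""

    def __init__(self, nodes):
        self.nodes = nodes

    def longest_cached_prefix(self, local, tokens):
        """Return (cached_token_count, source) for the longest matching prefix.

        source is 'local', 'remote', or None and describes where the
        longest matching block was found.
        """
        best = (0, None)
        for key in block_keys(tokens):
            if key in local.cache:
                best = (len(key), "local")
            elif any(key in n.cache for n in self.nodes if n is not local):
                best = (len(key), "remote")
            else:
                break  # the first missing block ends the reusable prefix
        return best

    def serve(self, local, tokens):
        """Return (cached, to_prefill, source) and record the new prefixes locally."""
        cached, source = self.longest_cached_prefix(local, tokens)
        to_prefill = len(tokens) - cached
        local.cache.update(block_keys(tokens))
        return cached, to_prefill, source


a, b = Node("a"), Node("b")
cluster = Cluster([a, b])
prompt = list(range(10))

first = cluster.serve(a, prompt)                # cold cache: everything is prefilled
second = cluster.serve(b, prompt + [99, 100])   # node b reuses blocks cached on node a
third = cluster.serve(b, prompt)                # node b now hits its own cache
```

In this sketch the second request lands on a different node yet still skips the prefill for the shared prefix, which is the duplicate work a cluster-wide cache is meant to avoid. A production system would also have to move the cached KV data between nodes and evict stale entries, which the sketch leaves out.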

“Developers today are forced to choose between blazing fast inference speed, throughput, and manageable infrastructure costs – a trade-off that throttles innovation,” said Erwan Menard, SVP of Product, Crusoe. “With Crusoe Managed Inference, we are not just hosting models; we are solving the most complex parts of the inference stack for AI developers. Crusoe MemoryAlloy, our inference engine’s cluster-native memory fabric, allows us to deliver unmatched time-to-first-token and throughput, accelerating our customers’ ability to deliver complex, large-scale AI applications cost-effectively.”

Crusoe Managed Inference is designed for AI developers who need to move from model to production without managing complex infrastructure. The service delivers quantifiable performance gains that directly impact user experience, as well as flexible pricing:

Breakthrough speed: Achieve up to 9.9x faster time-to-first-token* with our inference engine featuring MemoryAlloy, a cluster-wide KV cache with intelligent routing that eliminates duplicate prefills.

Introducing the Crusoe Intelligence Foundry

Crusoe Managed Inference is accessible through the new Crusoe Intelligence Foundry, a unified hub designed to provide AI developers with a fast path to production. The foundry accelerates model discovery and experimentation, allowing users to generate API keys in minutes.

Key features include:

Leading open-source models: Run the world's top open-source models including Kimi-K2, Llama 3.3 70B Instruct, Gemma 3 12B, Gpt-oss-120b, Qwen3 235B A22B Instruct 2507, DeepSeek V3 0324, and DeepSeek-R1 0528; plus experiment with unique models available exclusively on Crusoe Cloud from cutting-edge labs like Decart.

Trusted by Customers

“Our mission at Wonderful is to enable enterprises to transform their operating model with AI agents that actually work in production. The challenge is always doing that at scale without compromising speed - something which MemoryAlloy tackles. Its cluster-wide KV cache capability uniquely addresses the biggest bottlenecks in large-scale inference,” said Roey Lalazar, co-founder and CTO at Wonderful.ai. “This is the kind of foundational technology that will enable our customers to build and deploy far more powerful and responsive AI agents with confidence.”

“Yutori's Scouts are always-on AI agents that monitor the web; they are powered by in-house models for autonomously navigating websites on a browser. Optimizing for throughput and price is critical for our product experience,” said Dhruv Batra, Co-founder and Chief Scientist at Yutori. “We're excited to explore the performance benefits that Crusoe's Inference Engine provides, and are looking forward to serving our models through the service.”

“The demands of clinical deployment in healthcare are unforgiving – we need to process complex records instantly. Crusoe Managed Inference helps us meet that challenge,” said Grant Jensen, Co-Founder & CEO at Oaklet. “It provides a reliable path to production at a pace we haven’t seen on other platforms. This allows us to focus entirely on refining our EHR system, utilizing Crusoe’s breakthrough speed to support clinicians in real time.”

AI Developers Can Get Started Today

Crusoe Managed Inference is now available. AI developers can sign up for Crusoe Intelligence Foundry to get started with a library of leading models.

About Crusoe

As the AI factory company, Crusoe is on a mission to accelerate the abundance of energy and intelligence. The company provides a reliable, scalable, cost-effective, and environmentally friendly solution for AI infrastructure. By harnessing large-scale clean energy, building AI-optimized data centers, and delivering an AI cloud platform, Crusoe empowers its customers to build the future faster.

Media Contact

Stephanie Schlegel

Offleash for Crusoe

Crusoe@offleashpr.com