DONOSTIA, Spain, April 28, 2026 (GLOBE NEWSWIRE) -- Multiverse Computing, the leader in AI model compression, today announced the release of the LittleLamb open-source model family, three new ultra-compact models now available for free on Hugging Face. Designed for real-world AI deployment in a smaller footprint, the models are LittleLamb 0.3B, a general-purpose model; LittleLamb 0.3B Tool-Calling, a specialized variant optimized for tool use and agentic workflows; and LittleLamb 0.3B Mobile, a deployment-focused variant built for on-device and mobile applications. Built from Qwen3-0.6B and compressed using Multiverse’s CompactifAI technology, each model reduces the base architecture by approximately 50%, enabling efficient deployment across edge, mobile, and offline environments.
All three models share the same ~0.3B footprint, bilingual English and Spanish support, and dual inference modes, giving developers the flexibility to balance deeper reasoning against lower-latency responses depending on their needs. Thinking mode enables chain-of-thought-style reasoning for complex tasks such as math, science, and multi-step problem solving, while non-thinking mode prioritizes speed for efficient, general-purpose dialogue. Both LittleLamb 0.3B and LittleLamb 0.3B Tool-Calling outperform the original Qwen3-0.6B model, as well as other models in the Gemma 270M class, on Humanity's Last Exam (HLE) testing. This reflects Multiverse's focus on providing compressed models that are competitive both against their base architectures and within the broader small-model category. The new compressed models also improve system throughput, latency, output speed, and time to first token (TTFT). LittleLamb 0.3B Mobile also improves accuracy on Mobile Action tasks compared to the Gemma 270M class.
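In practice, thinking-mode completions from Qwen3-derived models typically wrap the reasoning trace in `<think>...</think>` delimiters ahead of the final answer. As a minimal sketch of how a developer might separate the two, assuming that Qwen3-style delimiter convention carries over to LittleLamb (an assumption, not documented behavior):

```python
import re

def split_thinking(response: str) -> tuple[str, str]:
    """Separate a <think>...</think> reasoning trace from the final answer.

    Assumes the Qwen3-style delimiter convention; returns (reasoning, answer).
    """
    match = re.search(r"<think>(.*?)</think>", response, flags=re.DOTALL)
    if match is None:
        # Non-thinking mode: the model emits only the answer, no trace.
        return "", response.strip()
    reasoning = match.group(1).strip()
    answer = response[match.end():].strip()
    return reasoning, answer

# Example: a (mock) thinking-mode completion
raw = "<think>2 apples + 3 apples = 5 apples.</think>The answer is 5."
reasoning, answer = split_thinking(raw)
print(answer)  # -> The answer is 5.
```

In non-thinking mode the same helper simply passes the response through, so downstream code can treat both modes uniformly.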

“The launch of LittleLamb continues our mission to make efficient AI available across every deployment environment without losing the flexibility and accuracy developers need,” said Enrique Lizaso Olmos, CEO of Multiverse Computing. “With CompactifAI, we've demonstrated that compression doesn't require sacrificing intelligence or capability. This model family shows that compact models can do far more than lightweight chat, and can run in environments where traditional models are simply too large or too dependent on cloud infrastructure.”
The three distinct LittleLamb models are designed to give developers multiple deployment options:
- LittleLamb 0.3B is a general-purpose bilingual model for conversational AI, reasoning, virtual assistants, general Q&A, and other on-device or edge use cases where no external tool integration is required.
- LittleLamb 0.3B Tool-Calling adds fine-tuning for native tool use, function calling, structured JSON outputs, and multi-step agentic workflows, enabling reliable interaction with APIs, browsers, code execution environments, and other tool-integrated systems. It is purpose-built for developers integrating AI into automation pipelines as a compact reasoning and action layer that can be embedded within larger agent workflows at the edge.
- LittleLamb 0.3B Mobile is a deployment-focused variant of the same compressed architecture, specifically packaged for efficient inference on mobile and edge hardware. It targets on-device assistants, offline-capable apps, and any environment where latency, memory, and battery budgets are tight. This model will also be available through the CompactifAI App for use without internet connectivity or cloud dependency in the coming weeks.
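To illustrate the kind of interaction the Tool-Calling variant targets, a developer typically describes a function to the model as a JSON schema and parses the model's structured JSON output back into a local call. The schema shape and field names below follow the widely used OpenAI-style function-calling convention and are assumptions for illustration, not documented LittleLamb behavior:

```python
import json

# Hypothetical tool definition in the common OpenAI-style schema; the exact
# format LittleLamb 0.3B Tool-Calling expects is not specified here.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def parse_tool_call(model_output: str) -> tuple[str, dict]:
    """Parse a structured JSON tool call emitted by the model.

    Expects output shaped like:
    {"name": "get_weather", "arguments": {"city": "Donostia"}}
    """
    call = json.loads(model_output)
    return call["name"], call["arguments"]

# Example: dispatch a (mock) model output to application code
name, args = parse_tool_call(
    '{"name": "get_weather", "arguments": {"city": "Donostia"}}'
)
print(name, args["city"])  # -> get_weather Donostia
```

The application then executes the named function with the parsed arguments and feeds the result back to the model, which is the loop an agentic pipeline runs repeatedly.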
Multiverse's proprietary AI compression technology, CompactifAI, applies quantum-inspired tensor-network mathematics to reduce model size by up to 95% with only a 2–3% precision loss, a significant departure from the industry norm of 20–30% accuracy loss at comparable compression rates. This enables models to operate in lighter, more deployable form factors across mobile, edge, and offline environments.
With this model family launch, Multiverse deepens its push into edge-native AI, expanding the company's growing portfolio of compressed open-source models designed to make advanced AI more deployable across a wider range of surfaces and systems. For developers building on edge devices, in mobile environments, or in workflows where privacy, latency, or compute constraints matter, the challenge is no longer access to AI models in theory, but access to models that are practical to run. The LittleLamb family represents a meaningful new capability class designed to meet that demand.
For developers and professionals interested in testing Multiverse Computing's models, all releases are available now on Hugging Face at https://huggingface.co/MultiverseComputingCAI:
- LittleLamb 0.3B: https://huggingface.co/MultiverseComputingCAI/littlelamb
- LittleLamb 0.3B Tool-Calling: https://huggingface.co/MultiverseComputingCAI/littlelamb-toolcalling
- LittleLamb 0.3B Mobile: https://huggingface.co/MultiverseComputingCAI/littlelamb-mobile
For additional technical details on the LittleLamb model family, visit https://multiversecomputing.com/resources/introducing-the-littlelamb-0-3b-model-family. Technical documentation, benchmarks, and integration guides also accompany each release on the company’s Hugging Face page, https://huggingface.co/MultiverseComputingCAI. To learn more about compressed AI models, visit multiversecomputing.com.
About Multiverse Computing
Multiverse Computing is the leader in AI model compression. The company’s deep expertise in quantum software led to the development of CompactifAI, a revolutionary compressor that reduces computing requirements and unleashes new use cases for AI across industries. Headquartered in Donostia, Spain, with offices in the United States, Canada, and across Europe, Multiverse serves more than 100 global customers, including Iberdrola, Bosch, and the Bank of Canada. For more information, visit www.multiversecomputing.com.
Media Contact
LaunchSquad for Multiverse Computing
multiverse@launchsquad.com
A photo accompanying this announcement is available at https://www.globenewswire.com/NewsRoom/AttachmentNg/fc136fd5-fe46-4532-bd2e-2afce2a935a8