When Enterprise-Grade AI Goes Open Source: What Poolside's Laguna XS.2 Means for Developers
There's something refreshing about an AI model that admits what it was built for.
Most model releases come with benchmark tables, flashy performance claims, and promises of AGI just around the corner. Poolside AI's new Laguna XS.2 does something different: it shows up with the unvarnished backstory of a team that spent years building AI systems for governments and defense contractors—organizations where "move fast and break things" gets you a security audit instead of a product launch.
Now they've released what they learned, and it's available to everyone under Apache 2.0.
The Pedigree Matters More Than the Benchmarks
Before Laguna XS.2, Poolside operated in the shadows. Air-gapped deployments. On-premise infrastructure. Clearance levels most developers never think about. This wasn't glamorous work—it was the kind of engineering where reliability isn't a feature, it's a survival requirement.
Releasing Laguna XS.2 publicly almost feels incidental to their mission. But that's exactly why it's interesting. This model wasn't optimized to win leaderboard positions. It was built to handle genuinely difficult problems in environments where failure isn't acceptable.
That's a fundamentally different design philosophy from the one the current AI arms race produces.
What You're Actually Getting
At 33B total parameters with only 3B active per token, Laguna XS.2 is actually runnable. These aren't theoretical hardware requirements: you can deploy it on a Mac with 36GB of RAM. With day-one support in Ollama, vLLM, and Transformers, it's practical open-source infrastructure rather than a research artifact.
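A back-of-envelope check on that hardware claim (pure arithmetic, counting weights only and ignoring KV cache, activations, and runtime overhead, which add several more GB):

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GB."""
    return n_params * bits_per_param / 8 / 1e9

TOTAL_PARAMS = 33e9  # Laguna XS.2's total parameter count

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(TOTAL_PARAMS, bits):.1f} GB")
```

At 4-bit quantization the weights alone land around 16.5 GB, which is why a 36GB Mac is a realistic target once cache and overhead are layered on top. (The 3B active parameters per token are what keep inference fast; all 33B still need to sit in memory.)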
The architectural decisions reflect real-world constraints:
Efficient attention patterns: 30 of 40 layers use sliding window attention with per-head gating. This keeps KV cache requirements low, which means faster inference without sacrificing quality on longer contexts.
Native reasoning capabilities: Built-in support for interleaved thinking between tool calls. You can enable or disable it per request depending on the task—not every problem needs chain-of-thought reasoning, but when it does, it's there.
128K context window: That's enough space for substantial codebases, documentation, and reasoning chains without token exhaustion.
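To see why the sliding-window split matters for memory, here's a sketch of cached token-slots across layers. Only the 30-of-40 layer split comes from the release; the 4,096-token window size and the simplification of counting token-slots rather than bytes are assumptions for illustration.

```python
def kv_tokens(context_len: int, n_layers: int = 40, n_swa: int = 30, window: int = 4096) -> int:
    """Total cached token-slots across layers: sliding-window layers cap at `window`."""
    full_layers = n_layers - n_swa
    return full_layers * context_len + n_swa * min(context_len, window)

ctx = 128_000
baseline = 40 * ctx      # hypothetical: every layer uses full attention
hybrid = kv_tokens(ctx)  # 10 full-attention + 30 sliding-window layers
print(f"KV cache at 128K context: {hybrid / baseline:.1%} of the full-attention size")
```

Under these assumptions, the hybrid layout caches roughly a quarter of what an all-full-attention stack would at 128K context, which is where the faster long-context inference comes from.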
The Coding-First Philosophy
Here's where Poolside's perspective diverges from the mainstream: they believe coding is the core capability through which agents accomplish everything else.
An agent that can write and execute code can compose actions independently. It can build its own tools. It can interact with systems in ways that pure function-calling never allows. This isn't an accident in Laguna XS.2's design—it's the entire thesis.
If you're building AI systems that need to solve problems autonomously, this philosophy maps cleanly to real agent architectures. The model was trained for trajectories spanning hundreds of tool calls, not the short-horizon tasks most benchmarks measure.
Reality Check: The Benchmarks
Laguna XS.2 doesn't dominate every benchmark. Qwen 3.6-35B outperforms it on standard SWE-bench scores. Claude Haiku 4.5 leads on SWE-bench Verified.
But here's what matters: XS.2 holds competitive ground on multilingual coding tasks and performs respectably on longer-horizon problems. The benchmarks themselves, conducted on Poolside's own agent harness, measure short-trajectory tasks. That's genuinely not where this model's advantages show up.
The real test is long-horizon agentic performance—exactly the kind of work that happens in production systems but doesn't fit neatly into standard benchmark suites.
Getting Started Today
Via API: Poolside offers free API access for a limited time. This is the fastest way to evaluate both Laguna XS.2 (the open model) and Laguna M.1 (their 225B-parameter closed model) on your actual workload.
Local deployment: Ollama handles XS.2 out of the box. Transformers and vLLM both have day-one compatibility for more customized setups.
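As a sketch of the local options (the model tags below are assumptions — check the Ollama library and Poolside's Hugging Face organization for the published names):

```shell
# Pull and chat through Ollama (model tag assumed)
ollama run laguna-xs2 "Refactor this function to be tail-recursive."

# Or expose an OpenAI-compatible endpoint with vLLM (model id assumed)
vllm serve poolside/laguna-xs2 --max-model-len 131072
```

The vLLM route gives you an OpenAI-compatible HTTP server on localhost, which makes it easy to point existing agent tooling at the model without code changes.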
Agent tooling: Poolside released pool, a lightweight terminal agent designed to work with their Agent Client Protocol. If you're building agentic systems, this gives you a reference implementation.
Why This Moment Matters
The AI industry has bifurcated. On one side: consumer-facing models optimized for leaderboards and viral demos. On the other: government and enterprise systems that prioritize reliability over benchmark points.
Laguna XS.2 is what happens when you let the second group talk publicly. It's not trying to be the smartest model. It's trying to be the most reliable model for tasks that matter when failure costs real money.
For developers building production systems—especially those involving code generation, autonomous agents, or integration with external tools—that's the more useful philosophy.
The weights are on HuggingFace. The code is open source. The documentation is solid. If you've been waiting for an enterprise-grade coding model that you can actually run and modify, this is the release worth taking seriously.