Meet Swival: The AI Coding Agent That Works With Your Models, Not Against Them

May 06, 2026

The dream of having an AI pair programmer sounds amazing until you hit reality: API costs stack up, context windows feel claustrophobic, and you're perpetually locked into whatever proprietary platform you're using.

Swival flips the script.

Instead of forcing you into a one-size-fits-all ecosystem, this open-source coding agent adapts to your infrastructure, your models, and your constraints. Whether you're running local LLMs on modest hardware or leveraging enterprise-grade models through APIs, Swival works with what you've got.

Why This Matters for Developers

The gap between "AI can code" and "AI can code reliably on my machine with my constraints" is enormous. Many AI coding tools assume large context windows and high-end GPUs. Swival was purpose-built for the real world: tight context budgets, modest local hardware, and models that need careful handling to produce good output.

Think about it. A junior developer today might be running llama.cpp on a MacBook Air with 16GB of RAM. An indie startup might want to use OpenRouter to avoid vendor lock-in. A security-conscious team might need all secrets encrypted before they leave its infrastructure. Swival handles all of these scenarios without requiring you to rewrite your workflow.

The Feature Set That Actually Matters

Context Management Done Right

Many AI agents bloat their context windows by dumping everything into the prompt. Swival uses graduated compaction to keep conversations clean and focused. The agent also maintains persistent state across sessions, so it remembers what you've been working on without you re-feeding it a 10,000-token history every time you ask a follow-up question.
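To make the idea of graduated compaction concrete, here is a minimal Python sketch. It is not Swival's implementation: the Message shape, the four-characters-per-token estimate, and the three stages are all assumptions chosen for illustration. The point is only the shape of the technique: apply the cheapest reduction first and escalate while the conversation is still over budget.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Message:
    role: str      # "system", "user", "assistant", or "tool"
    content: str


def estimate_tokens(messages):
    """Rough heuristic (an assumption): about four characters per token."""
    return sum(len(m.content) // 4 + 1 for m in messages)


def truncate_tool_output(messages, keep_recent=2, limit=200):
    """Stage 1: shorten long tool outputs, except in the most recent messages."""
    cutoff = len(messages) - keep_recent
    out = []
    for i, m in enumerate(messages):
        if m.role == "tool" and i < cutoff and len(m.content) > limit:
            m = replace(m, content=m.content[:limit] + " [truncated]")
        out.append(m)
    return out


def drop_old_tool_output(messages, keep_recent=2):
    """Stage 2: replace older tool outputs with a short stub."""
    cutoff = len(messages) - keep_recent
    return [
        replace(m, content="[tool output dropped]")
        if m.role == "tool" and i < cutoff else m
        for i, m in enumerate(messages)
    ]


def keep_head_and_tail(messages, keep_recent=2):
    """Stage 3: keep only a leading system prompt and the most recent messages."""
    head = [m for m in messages[:1] if m.role == "system"]
    if len(messages) <= len(head) + keep_recent:
        return list(messages)
    note = Message("system", "[earlier turns omitted]")
    return head + [note] + list(messages[-keep_recent:])


def compact(messages, budget):
    """Apply stages in order, stopping as soon as the estimate fits the budget."""
    stages = (truncate_tool_output, drop_old_tool_output, keep_head_and_tail)
    for stage in stages:
        if estimate_tokens(messages) <= budget:
            break
        messages = stage(messages)
    return messages
```

A conversation that already fits the budget passes through untouched. One with a single oversized tool result is usually fixed by the first stage alone, so recent turns and the system prompt survive intact.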

Your Models, Your Choice

Swival auto-discovers local models running on LM Studio or llama.cpp, but it also integrates seamlessly with:

Hugging Face
OpenRouter
Google Gemini
ChatGPT (via OAuth with your existing subscription)
AWS Bedrock
Any OpenAI-compatible server (Ollama, vLLM, etc.)

This flexibility means you're never locked into one provider's pricing or availability. Switch models? Update a command-line flag. Simple as that.

Security by Default

With --encrypt-secrets enabled, API keys, credentials, and other sensitive data are encrypted before they leave your machine. The model never sees the real values, only secure references. Decryption happens locally when the response comes back, so tools still work normally. This is how security-conscious teams actually want their AI coding tools to behave.
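The round trip described above (hide the value on the way out, restore it locally on the way back) can be sketched in a few lines of Python. This is an illustration only and deliberately swaps in a simpler technique: a local lookup table of random placeholders instead of real encryption. The SecretVault class, the placeholder format, and the SECRET_PATTERN regex are hypothetical and say nothing about how Swival detects or encodes secrets.

```python
import re
import secrets

# Hypothetical pattern for illustration: tokens shaped like "sk-..." or "hf_...".
SECRET_PATTERN = re.compile(r"\b(?:sk-|hf_)[A-Za-z0-9]{8,}\b")


class SecretVault:
    """Maps real secrets to opaque placeholders; the mapping stays in this process."""

    def __init__(self):
        self._by_placeholder = {}
        self._by_secret = {}

    def redact(self, text):
        """Replace every secret-looking token with a stable placeholder."""
        def swap(match):
            secret = match.group(0)
            if secret not in self._by_secret:
                placeholder = f"<<SECRET:{secrets.token_hex(8)}>>"
                self._by_secret[secret] = placeholder
                self._by_placeholder[placeholder] = secret
            return self._by_secret[secret]

        return SECRET_PATTERN.sub(swap, text)

    def restore(self, text):
        """Put the real values back into text returned by the model."""
        for placeholder, secret in self._by_placeholder.items():
            text = text.replace(placeholder, secret)
        return text
```

Because the same secret always maps to the same placeholder, the model can still refer to "that token" consistently across a conversation, and a tool call it emits can be restored to the real value before it runs locally.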

Learning That Persists

The agent uses BM25-based retrieval to pull relevant context from past sessions. Teach it something with /learn and it remembers across conversations without bloating your current prompt. It's like having an assistant who actually learns from your codebase over time.
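For readers unfamiliar with BM25, here is a small self-contained Python sketch of the ranking function applied to a list of remembered notes. It shows the general technique, not Swival's code: the BM25Index class, the tokenizer, and the k1 and b defaults are assumptions, and the IDF uses the common non-negative variant, ln((N - n + 0.5) / (n + 0.5) + 1).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into simple word tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


class BM25Index:
    def __init__(self, documents, k1=1.5, b=0.75):
        self.documents = list(documents)
        self.k1 = k1
        self.b = b
        self.term_counts = [Counter(tokenize(d)) for d in self.documents]
        self.lengths = [sum(c.values()) for c in self.term_counts]
        self.avg_length = (
            sum(self.lengths) / len(self.lengths) if self.lengths else 0.0
        )
        # Number of documents that contain each term at least once.
        self.doc_freq = Counter()
        for counts in self.term_counts:
            self.doc_freq.update(counts.keys())

    def idf(self, term):
        n = self.doc_freq.get(term, 0)
        total = len(self.documents)
        return math.log((total - n + 0.5) / (n + 0.5) + 1.0)

    def score(self, query, index):
        counts = self.term_counts[index]
        avg = self.avg_length or 1.0  # avoid dividing by zero on empty notes
        norm = self.k1 * (1 - self.b + self.b * self.lengths[index] / avg)
        total = 0.0
        for term in tokenize(query):
            tf = counts.get(term, 0)
            if tf:
                total += self.idf(term) * tf * (self.k1 + 1) / (tf + norm)
        return total

    def search(self, query, top_k=3):
        """Return up to top_k (score, document) pairs with a positive score."""
        scored = [(self.score(query, i), doc) for i, doc in enumerate(self.documents)]
        hits = [pair for pair in scored if pair[0] > 0]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return hits[:top_k]
```

Only the top few matches would be placed in the prompt, which is why this kind of memory does not grow the context with every note the agent has ever stored.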

Review Loops and Benchmarking

Swival includes configurable review loops with "LLM-as-a-judge" verification. JSON reports capture timing, tool usage, and context events, making it easy to compare different models, settings, and configurations on real coding tasks. Want to benchmark whether Qwen 3 Coder or GLM-5 is faster for your workflow? Run the same task through Swival with both and get detailed metrics.
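A model comparison like this boils down to running one task under several configurations and recording what happened. The Python sketch below shows that loop in isolation. It does not read Swival's JSON reports, whose field names are not given here. The benchmark function, the run_task callable, and the row keys are hypothetical.

```python
import time


def benchmark(task, configs, run_task, clock=time.perf_counter):
    """Run one task under each configuration and record wall-clock time.

    run_task is any callable accepting the task plus the config's keyword
    arguments and returning the agent's answer (an assumption of this sketch).
    """
    rows = []
    for config in configs:
        start = clock()
        answer = run_task(task, **config)
        elapsed = clock() - start
        rows.append({
            **config,
            "seconds": round(elapsed, 3),
            "answer_chars": len(str(answer)),
        })
    # Fastest configuration first.
    return sorted(rows, key=lambda row: row["seconds"])
```

In real use, run_task could be a thin wrapper around the swival.run call from the Python example in this article, assuming it accepts provider and model keywords as shown there. Wall-clock time alone says nothing about answer quality, so pair it with a review step before declaring a winner.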

Security Audits That Actually Work

Run /audit and Swival scans your codebase for provable security bugs. Here's the clever part: findings are verified by isolated agents running in separate worktrees. Each issue must be reproduced independently before it reaches the report. This cuts down on false positives: you get reproduced bugs with actual patches, not speculation.

Getting Started in Minutes

The setup is refreshingly simple. Here's a quick example that assumes LM Studio is already running with a model loaded, so Swival can auto-discover it:

# 1. Install Swival
uv tool install swival

# 2. Run a task
swival "Simplify the error handling in src/api.py"

That's it. No configuration hell, no environment variables to juggle (unless you want to customize things).

Prefer local inference with llama.cpp? Just point Swival at it:

swival --provider llamacpp "Refactor this authentication module"

Want to use Hugging Face models? Export your token and specify the model:

export HF_TOKEN=hf_...
swival --provider huggingface --model zai-org/GLM-5.1 "Add proper error handling"

For back-and-forth work, just run swival with no arguments and you get an interactive session where the agent remembers the full conversation.

Beyond Single-Model Workflows

Swival isn't just a CLI tool. You can embed agents directly in your own Python code:

import swival

answer = swival.run(
    "What files handle authentication?",
    provider="openrouter",
    model="z-ai/glm-5",
)

For complex multi-turn sessions, the Session class gives you full control over conversation state and iteration.

There's also an A2A (Agent-to-Agent) server mode. Run swival --serve and your agent becomes an HTTP endpoint that other agents can call. This opens up possibilities for building agent networks and orchestrating complex coding tasks across multiple specialized agents.

The Extensibility Angle

Swival is small, hackable, and framework-free. It is pure Python, which makes it easy to read, modify, and extend. You can add custom skills via SKILL.md files, integrate MCP (Model Context Protocol) servers, or compose agents together. There are no proprietary abstractions and no lock-in, just straightforward code you can understand and adapt.

Benchmarking and Evaluation

Swival ships with Calibra, a companion tool for benchmarking. Compare models, settings, skills, and MCP servers on real coding tasks. This is invaluable if you're trying to figure out which model-infrastructure combo gives you the best quality-to-cost ratio for your specific workflow.

The Bottom Line

AI coding agents have become table stakes for modern development. But they don't have to come with vendor lock-in, privacy compromises, or unrealistic hardware requirements. Swival proves you can build a genuinely useful coding agent that respects your constraints, trusts your judgment, and plays nicely with whatever infrastructure you've already chosen.

Whether you're a solo developer optimizing for cost, a startup protecting sensitive data, or a team that values flexibility and control, Swival deserves a spot in your toolkit. It's free, open-source, and ready to go. The question isn't whether you can afford to try it—it's whether you can afford not to.
