Swama
Machine-learning runtime
Swama is a pure‑Swift machine‑learning runtime that runs locally on macOS 15.0+ with Apple Silicon. It leverages Apple’s MLX framework to provide fast inference for large language models and vision‑language models, exposing standard OpenAI‑compatible endpoints for chat, embeddings, audio transcription and experimental speech synthesis. The runtime includes a menu‑bar application, a full command‑line interface, and a modular library (SwamaKit) that handles model downloading, caching, versioning and streaming responses.
The tool is aimed at developers and power users who need on‑device AI without relying on cloud services. It supports multimodal input, allowing text and images to be processed together, and integrates Whisper for local audio transcription. Model management is automated: aliases such as “qwen3” or “llama3.2” trigger an automatic download from the Hugging Face Hub, and the model is stored in a local cache for reuse.
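Because the server speaks the OpenAI wire format, any generic OpenAI-style client can talk to it. The sketch below builds and posts a standard chat-completions request in Python; the base URL and port are hypothetical placeholders (check Swama’s documentation for the actual listen address), and “qwen3” is the model alias mentioned above.

```python
import json
import urllib.request

# Hypothetical default; Swama's actual host/port may differ.
BASE_URL = "http://localhost:8080"

def build_chat_payload(model, prompt, stream=False):
    """Build an OpenAI-style /v1/chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

def post_chat(payload, base_url=BASE_URL):
    """POST the payload to the OpenAI-compatible chat endpoint."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

payload = build_chat_payload("qwen3", "Summarize MLX in one sentence.")
```

Calling post_chat(payload) with the server running returns the familiar OpenAI response object, so existing SDKs and tooling work unchanged by pointing their base URL at the local server.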
Swama’s architecture separates core logic, CLI utilities and the macOS graphical app, enabling flexible integration into scripts, terminal workflows, or a native UI. Installation is available via Homebrew, pre‑built DMG, or source compilation with Xcode 16 and Swift 6.2. The runtime is released under the MIT license.
Similar apps
Window & Desktop Management
LlamaBarn
Menu bar app for running local LLMs

AI Coding Agents
Osaurus
LLM server built on MLX

System Monitoring & Maintenance
Stability Matrix
Package manager and inference UI for Stable Diffusion

AI Chat & Voice Agents
HuggingChat
Chat client for models on HuggingFace

AI Coding Agents
LM Studio
Discover, download, and run local LLMs

Terminals & CLI
Awal Terminal
AI-native terminal emulator with multi-provider profiles and voice input.