A private AI research agent running Gemma entirely in your browser via WebGPU. Tool calling, persistent memory, and web search — all on-device.
Only your search queries touch the network (and only when you choose to). All reasoning stays on your device.
Models are cached after first download — future visits load instantly.
Requirements: Chrome 113+, Edge 113+, or Firefox 130+ with WebGPU.
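If you are embedding LocalMind or debugging a blank page, a quick capability check avoids trying to load the model on an unsupported browser. A minimal sketch using the standard navigator.gpu API; the warning message is only an example:

```ts
// Feature-detect WebGPU before attempting to load Gemma.
async function hasWebGPU(): Promise<boolean> {
  const gpu = (navigator as any).gpu;      // typed properly with @webgpu/types in a real build
  if (!gpu) return false;                  // API missing entirely
  const adapter = await gpu.requestAdapter();
  return adapter !== null;                 // API present but no usable adapter
}

hasWebGPU().then((ok) => {
  if (!ok) console.warn("WebGPU unavailable: use Chrome/Edge 113+ or a WebGPU-enabled Firefox.");
});
```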
Gemma 4 can call SAM (Segment Anything Model) to segment objects in attached images. Choose your SAM model in Settings — loaded on first use.
Translation works directly — Gemma 4 speaks 140+ languages natively, no tool needed.
Gemma 3 1B works as a simple chatbot — no agent tools.
Documents are chunked, embedded, and stored as searchable knowledge. A summary is generated on upload.
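A rough sketch of that ingestion flow. The chunk size, helper names, and storage backend are illustrative assumptions, not LocalMind's actual internals:

```ts
// Illustrative ingestion flow: split the document, embed each chunk, persist it.
type Chunk = { title: string; text: string; vector: number[] };

function chunkText(text: string, size = 1000): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) chunks.push(text.slice(i, i + size));
  return chunks;
}

async function ingestDocument(
  title: string,
  text: string,
  embed: (s: string) => Promise<number[]>,   // on-device embedding model
  store: (c: Chunk) => Promise<void>,        // e.g. an IndexedDB-backed vector store
): Promise<void> {
  for (const piece of chunkText(text)) {
    const vector = await embed(piece);
    await store({ title, text: piece, vector });
  }
}
```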
Settings → tick Expose window.localmind. Same-tab only — cross-origin scripts cannot reach it. The object is frozen and non-writable; disable the toggle to detach.
The exposed object carries a loaded flag and returns chat.completion objects for non-streaming calls and chat.completion.chunk objects when streaming. Every API call is logged in memory (last 50): click the • API chip in the toolbar or open Settings → View activity log. Each entry shows the method, prompt length, tokens generated, duration, and outcome (ok / err / busy).
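A hedged sketch of calling the exposed object from the same tab. The chat.completions.create method name is an assumption inferred from the chat.completion and chat.completion.chunk objects above; demo.html shows the real surface:

```ts
// Sketch only: method names are assumptions; the loaded flag and object shapes come from the docs above.
const lm = (window as any).localmind;        // present only while the Settings toggle is on

async function ask(prompt: string): Promise<void> {
  if (!lm?.loaded) return;                   // wait until the model has finished loading

  // Non-streaming call: resolves to a chat.completion object.
  const res = await lm.chat.completions.create({
    messages: [{ role: "user", content: prompt }],
  });
  console.log(res.choices[0].message.content);

  // Streaming call: async-iterates chat.completion.chunk objects.
  const stream = await lm.chat.completions.create({
    messages: [{ role: "user", content: prompt }],
    stream: true,
  });
  for await (const chunk of stream) {
    console.log(chunk.choices[0].delta?.content ?? "");
  }
}
```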
Open demo.html in the same folder. It iframes LocalMind, auto-flips the toggle, waits for the model, and runs both a non-streaming and a streaming completion against iframe.contentWindow.localmind.
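For reference, the host-page pattern looks roughly like this. The polling interval and method name are assumptions, and the iframe must be same-origin, since cross-origin scripts cannot reach the object:

```ts
// Roughly what demo.html does: embed LocalMind, wait for the model, then run a completion.
const frame = document.createElement("iframe");
frame.src = "/index.html";                   // assumption: LocalMind served from the same origin
document.body.appendChild(frame);

frame.addEventListener("load", () => {
  const poll = setInterval(async () => {
    const lm = (frame.contentWindow as any)?.localmind;
    if (!lm?.loaded) return;                 // keep polling until the model is ready
    clearInterval(poll);
    const res = await lm.chat.completions.create({
      messages: [{ role: "user", content: "Hello from the host page" }],
    });
    console.log(res.choices[0].message.content);
  }, 500);
});
```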
Gemma runs entirely on your device. Nothing is sent to any server.
Switch to a Gemma 4 model for tool calling, memory, and multimodal input.
Web search requires an API key (Tavily, Brave) or a self-hosted SearXNG instance.
Custom model files go under onnx/. Multimodal custom models are not yet supported. Added models appear in the model selector.