A private AI research agent running Gemma entirely in your browser via WebGPU. Tool calling, persistent memory, and web search — all on-device.
Only your search queries touch the network (and only when you choose to). All reasoning stays on your device.
Models are cached after first download — future visits load instantly.
Requirements: Chrome 113+, Edge 113+, or Firefox 130+ with WebGPU.
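If you are embedding LocalMind or debugging a blank page, a quick capability check avoids trying to load the model on an unsupported browser. A minimal sketch using the standard navigator.gpu API; the warning message is only an example:

```ts
// Feature-detect WebGPU before attempting to load Gemma.
async function hasWebGPU(): Promise<boolean> {
  const gpu = (navigator as any).gpu;      // typed properly with @webgpu/types in a real build
  if (!gpu) return false;                  // API missing entirely
  const adapter = await gpu.requestAdapter();
  return adapter !== null;                 // API present but no usable adapter
}

hasWebGPU().then((ok) => {
  if (!ok) console.warn("WebGPU unavailable: use Chrome/Edge 113+ or a WebGPU-enabled Firefox.");
});
```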
Gemma 4 can call SAM (Segment Anything Model) to segment objects in attached images. Choose your SAM model in Settings — loaded on first use.
Translation works directly — Gemma 4 speaks 140+ languages natively, no tool needed.
Gemma 3 1B works as a simple chatbot — no agent tools.
Documents are chunked, embedded, and stored as searchable knowledge. A summary is generated on upload.
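A rough sketch of that ingestion flow. The chunk size, helper names, and storage backend are illustrative assumptions, not LocalMind's actual internals:

```ts
// Illustrative ingestion flow: split the document, embed each chunk, persist it.
type Chunk = { title: string; text: string; vector: number[] };

function chunkText(text: string, size = 1000): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) chunks.push(text.slice(i, i + size));
  return chunks;
}

async function ingestDocument(
  title: string,
  text: string,
  embed: (s: string) => Promise<number[]>,   // on-device embedding model
  store: (c: Chunk) => Promise<void>,        // e.g. an IndexedDB-backed vector store
): Promise<void> {
  for (const piece of chunkText(text)) {
    const vector = await embed(piece);
    await store({ title, text: piece, vector });
  }
}
```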
Settings → tick Expose window.localmind. Same-tab only — cross-origin scripts cannot reach it. The object is frozen and non-writable; disable the toggle to detach.
The exposed object carries a loaded flag and returns chat.completion objects for non-streaming calls and chat.completion.chunk objects when streaming. Every API call is logged in memory (last 50): click the • API chip in the toolbar or open Settings → View activity log. Each entry shows the method, prompt length, tokens generated, duration, and outcome (ok / err / busy).
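A hedged sketch of calling the exposed object from the same tab. The chat.completions.create method name is an assumption inferred from the chat.completion and chat.completion.chunk objects above; demo.html shows the real surface:

```ts
// Sketch only: method names are assumptions; the loaded flag and object shapes come from the docs above.
const lm = (window as any).localmind;        // present only while the Settings toggle is on

async function ask(prompt: string): Promise<void> {
  if (!lm?.loaded) return;                   // wait until the model has finished loading

  // Non-streaming call: resolves to a chat.completion object.
  const res = await lm.chat.completions.create({
    messages: [{ role: "user", content: prompt }],
  });
  console.log(res.choices[0].message.content);

  // Streaming call: async-iterates chat.completion.chunk objects.
  const stream = await lm.chat.completions.create({
    messages: [{ role: "user", content: prompt }],
    stream: true,
  });
  for await (const chunk of stream) {
    console.log(chunk.choices[0].delta?.content ?? "");
  }
}
```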
Open demo.html in the same folder. It iframes LocalMind, auto-flips the toggle, waits for the model, and runs both a non-streaming and a streaming completion against iframe.contentWindow.localmind.
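For reference, the host-page pattern looks roughly like this. The polling interval and method name are assumptions, and the iframe must be same-origin, since cross-origin scripts cannot reach the object:

```ts
// Roughly what demo.html does: embed LocalMind, wait for the model, then run a completion.
const frame = document.createElement("iframe");
frame.src = "/index.html";                   // assumption: LocalMind served from the same origin
document.body.appendChild(frame);

frame.addEventListener("load", () => {
  const poll = setInterval(async () => {
    const lm = (frame.contentWindow as any)?.localmind;
    if (!lm?.loaded) return;                 // keep polling until the model is ready
    clearInterval(poll);
    const res = await lm.chat.completions.create({
      messages: [{ role: "user", content: "Hello from the host page" }],
    });
    console.log(res.choices[0].message.content);
  }, 500);
});
```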
Gemma runs entirely on your device. Nothing is sent to any server.
Switch to a Gemma 4 model for tool calling, memory, and multimodal input.
Web search requires an API key (Tavily, Brave) or a self-hosted SearXNG instance.
Custom model files go under onnx/. Multimodal custom models are not yet supported. Added models appear in the model selector.