Local LLM - powered by Gemma

Moose thinks on your machine.

Every brief, insight, and draft comes from a real Gemma model running locally - unlimited, private, and free. No prompt ever leaves your desk. Bring your own key when you want a frontier model in the loop.

Runs offline. Works on Windows & macOS.

Hi, Moose · Chat Gemma 4B · local
draft me a tight FAQ answer on "is local AI private"
on it. running this through Gemma right here - nothing's going to the cloud.
Running on your machine
$0 cost
0 tokens sent out
100% on device
$0 per query, always
The local model has no per-token bill. Chat, brief, and draft as much as you like.
Unlimited chats, with memory
Moose keeps your context and projects in mind across every conversation.
Nothing leaves your desk
Prompts, context, and drafts stay on disk - unless you connect a key or a CMS yourself.
Pick your size

A model that fits the laptop you already have.

Gemma comes in four sizes. Start small on an older machine, or load the big one on a workstation for the deepest analysis. The model downloads once, then runs fully offline - and you can switch any time.

Downloads in the background, runs offline forever after.
Uses your GPU when it can, your CPU when it can't.
Switch sizes whenever - the cost stays exactly the same.
Choose your model size
Light: Gemma 1B · 8 GB RAM · ~0.8 GB
Balanced: Gemma 4B · 16 GB RAM · ~3 GB
Pro: Gemma 12B · 32 GB RAM · ~8 GB
Max: Gemma 27B · 64 GB RAM or GPU · ~17 GB
Gemma 4B · ≈1.2s a reply

The everyday default. Strong briefs and insights in about a second. Most people stay here.

Best for: Everyday work
Needs: 16 GB RAM
Private by default

Your work stays on your disk. Not on someone's server.

Most AI tools ship your prompts, your data, and your drafts to a cloud you don't control. Hi, Moose flips that: the model lives next to your files, so the default is privacy, not a setting you have to find.

Stays on your machine
Every prompt you type to the local model
Your context, brand voice, and memory
Drafts, briefs, and your library
Search Console data you connect
Leaves only if you say so

Data leaves your machine in only two cases: when you plug in your own API key for a frontier model, or when you connect a CMS to publish. Both are opt-in, and Moose tells you plainly each time.

A frontier prompt - only with a key you added
A publish - only to a CMS you connected
Bring your own key

Want a frontier model in the loop? Plug in your key.

When a job calls for GPT, Claude, or Gemini, add your own API key and Moose routes that one task to it. You pay your provider directly, at their price, with no markup from us. The local model handles everything else for free.

No markup, ever. Your key, your provider, their rate.
Per-task routing. Use frontier power only where it earns its cost.
Keys stay local. Stored on your machine, never on ours.
Connected models
Gemma 4B · local · Always on · free · private · Default
GPT · your key · Billed by OpenAI · Connected
Claude · your key · Billed by Anthropic · Connected
Gemini · your key · Billed by Google · Add key

Keep the smarts on your side of the wire.

Free to download, unlimited to use, private by default. Bring your own keys only when you want to.

Download for Mac
Download for Windows