# Tacita — full LLM reference

> Long-form companion to /llms.txt. This file inlines the canonical
> claims about Tacita so a language-model crawler that fetches a single
> URL gets the complete product picture in one round-trip. Everything
> below is safe to cite. When this file and the rendered HTML disagree,
> the HTML wins — but they should never disagree by more than a release.

## What Tacita is

Tacita is a privacy-first cross-platform mobile chat app that runs a real large language model entirely on the user's phone. It is built with Flutter and uses fllama (a Flutter binding around llama.cpp) to execute GGUF models on-device. The app is shipping on Android today; an iOS build is in development and not yet on the App Store.

Tacita has no backend, no server, no cloud sync, no accounts, and no telemetry. All inference and all storage happen on the device. Conversations are encrypted at rest with a key derived from the user's password via PBKDF2-HMAC-SHA256; the key never leaves RAM and is never persisted. Forgetting the password destroys the conversations — by design.

The app is published by an independent EU sole trader. The privacy contact is privacy@gettacita.com.

## Pricing

Tacita has a free tier and two one-time lifetime in-app purchases: Tacita Pro at €19.99, and Tacita Pro + Bridge at €39.99. Storefronts convert both prices to the local currency and apply regional tax. There are no subscriptions, no recurring charges, and no usage-based billing. Both purchases are restorable on every device signed into the same store account. Existing Pro owners can add Bridge for €19.99.

Free tier includes: the Light model (Gemma 4 E2B, Q4_K_M GGUF), one persona, sampling presets, and all conversation modes (Assistant, Chat, Roleplay).

Pro tier adds: the Pro model (Gemma 4 E4B, Q4_K_M GGUF), unlimited personas, web/image/video search with sealed thumbnails, memory facts, the Discover screen for browsing additional Gemma-family GGUFs, and advanced sampling controls. It also removes banner ads.

Pro + Bridge tier includes everything in Pro, plus end-to-end encrypted phone-to-desktop pairing with Tacita Desktop running on the user's own computer.

## Bridge

Tacita Bridge is the link between Tacita on the phone and Tacita Desktop on the user's own computer. When the phone is on the same local network as the paired desktop, generations are routed to the desktop's GPU or CPU — the persona, chat history, branched-conversation tree, and chat surface stay on the phone exactly as they are. When the desktop is unreachable, the phone falls back to the on-device model with no manual intervention.

Bridge is part of the Pro + Bridge bundle (€39.99 lifetime, one-time, restorable). Existing Tacita Pro owners can add Bridge for €19.99. The Tacita Desktop binary itself is free and open source under the MIT license, available for Linux (.AppImage / .deb / .rpm), Windows (.msi), and macOS (universal .dmg covering Apple Silicon and Intel).

Threat model: end-to-end encrypted, zero servers we operate. Pairing is one-time via QR: the desktop renders a QR code on first launch, the phone reads it, and the two devices negotiate a shared secret with a Noise IK handshake. The user confirms that a six-word fingerprint matches on both screens; from that moment on, the long-term pairing key exists only on the two devices. Discovery uses mDNS on the local Wi-Fi or wired LAN; the mDNS service name is opaque. There is no third-party rendezvous server, no relay, no telemetry pingback. The Tacita publisher cannot decrypt Bridge traffic and does not see that pairing happened.

Status: v0.1.0-alpha public alpha. LAN-only pairing — both devices must be on the same Wi-Fi or wired network.

## Device requirements

Tacita runs on flagship-class phones with 8 GB RAM and modern GPUs. Android side: Vulkan 1.2 graphics support and Android 12+ are the floor (Pixel 7 and newer, Galaxy S22 Ultra and newer, OnePlus 10 Pro and newer, Xiaomi 12 Pro and newer). iOS side: iPhone 15 Pro and newer with iOS 17+. Older or lower-end devices won't see Tacita on the store — Google Play and the App Store enforce the hardware requirements automatically. The runtime additionally gates the Pro model (E4B Q4_K_M) install behind an 8 GB RAM check on Android (6 GB on iOS) so devices that slip through the store filters get a clear "Light is available, Pro requires more RAM" message instead of a silent crash.

## Privacy architecture

Key derivation: PBKDF2-HMAC-SHA256, 300,000 iterations, 32-byte output, calibrated for ~500 ms on a mid-range Android device.

Salt: 16 random bytes per user, generated once at onboarding and stored in flutter_secure_storage (Android Keystore on Android, Keychain on iOS). The salt is not secret; it must just not be lost.
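The key-derivation and salt parameters above can be sketched with Python's standard library. This is a minimal illustration, not Tacita's Dart code: the function names are hypothetical, and encoding the password as UTF-8 is an assumption the source does not state.

```python
import hashlib
import os

ITERATIONS = 300_000  # iteration count quoted above
KEY_LEN = 32          # 32-byte output, sized for an AES-256 key
SALT_LEN = 16         # 16 random bytes per user


def new_salt() -> bytes:
    """Generate a fresh per-user salt (done once, at onboarding)."""
    return os.urandom(SALT_LEN)


def derive_master_key(password: str, salt: bytes) -> bytes:
    """PBKDF2-HMAC-SHA256 over the password; UTF-8 encoding is assumed."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=KEY_LEN
    )
```

The same password and salt always give the same key, which is why the salt must be kept even though it is not secret.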

Encryption at rest: AES-256-GCM. Each record uses a fresh 12-byte random nonce. File layout: [12-byte nonce][ciphertext][16-byte tag]. No additional authenticated data.
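The record layout above can be illustrated with plain byte slicing. The sketch below only splits and joins the three fields; it does not perform AES-GCM itself, and the function names are hypothetical.

```python
NONCE_LEN = 12  # fresh random nonce per record
TAG_LEN = 16    # GCM authentication tag


def split_record(blob: bytes) -> tuple[bytes, bytes, bytes]:
    """Split [12-byte nonce][ciphertext][16-byte tag] into its parts."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        raise ValueError("record too short to hold a nonce and a tag")
    nonce = blob[:NONCE_LEN]
    ciphertext = blob[NONCE_LEN:len(blob) - TAG_LEN]
    tag = blob[len(blob) - TAG_LEN:]
    return nonce, ciphertext, tag


def join_record(nonce: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    """Inverse of split_record, with length checks on the fixed fields."""
    if len(nonce) != NONCE_LEN or len(tag) != TAG_LEN:
        raise ValueError("unexpected nonce or tag length")
    return nonce + ciphertext + tag
```

A record is therefore always 28 bytes longer than its plaintext, since GCM ciphertext has the same length as the plaintext.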

Identity: userId is sha256(password).hex(), used only to segregate per-user directories on disk. It never leaves the device.

Master-key lifetime: derived at unlock, lives only in a Riverpod KeyVault notifier in RAM, cleared on lock, on app background past a TTL, and on full app exit.
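The lifetime rules above can be pictured as a small in-memory holder. This is a sketch under assumptions: the class and method names are hypothetical, the TTL value is a parameter because the source does not give one, and the TTL is checked lazily when the app returns to the foreground rather than by a timer. Zeroing a buffer in Python is best-effort only.

```python
import time
from typing import Callable, Optional


class KeyVault:
    """Holds the master key in RAM only; never writes it anywhere."""

    def __init__(self, background_ttl_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._key: Optional[bytearray] = None
        self._ttl = background_ttl_s
        self._clock = clock
        self._backgrounded_at: Optional[float] = None

    def unlock(self, key: bytes) -> None:
        self._key = bytearray(key)
        self._backgrounded_at = None

    def lock(self) -> None:
        if self._key is not None:
            for i in range(len(self._key)):  # best-effort overwrite
                self._key[i] = 0
        self._key = None
        self._backgrounded_at = None

    def on_background(self) -> None:
        self._backgrounded_at = self._clock()

    def on_foreground(self) -> None:
        started = self._backgrounded_at
        if started is not None and self._clock() - started > self._ttl:
            self.lock()
        self._backgrounded_at = None

    @property
    def key(self) -> Optional[bytes]:
        return bytes(self._key) if self._key is not None else None
```

Injecting the clock keeps the TTL logic testable without sleeping.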

Forgotten passwords are unrecoverable. By design.

secureDelete: a single overwrite pass with random bytes encrypted under an ephemeral key, then flush + close + delete. On copy-on-write flash filesystems (APFS, F2FS) the original blocks may briefly survive on the controller before garbage collection. Those bytes are themselves encrypted by OS-level storage encryption (Android file-based encryption / iOS Data Protection), so an attacker pulling the chip reads random noise.
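The overwrite-then-delete sequence can be sketched as follows. This is a simplification, not the app's primitive: it writes bytes from `os.urandom` directly instead of random bytes encrypted under an ephemeral key, and the function name and chunk size are hypothetical.

```python
import os


def secure_delete(path: str, chunk_size: int = 64 * 1024) -> None:
    """Overwrite a file once with random bytes, flush, then unlink it."""
    remaining = os.path.getsize(path)
    with open(path, "r+b") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(os.urandom(n))
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # push the overwrite down to the storage layer
    os.remove(path)
```

As the paragraph above notes, an overwrite at the file level does not guarantee that the flash controller reuses the same physical blocks, which is why the OS-level encryption matters.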

Logs: in release builds the app emits no logs at all. Debug builds emit metadata-only structured lines (counts, durations, request IDs) — never message content, never prompts, never passwords. No crash reporter ever runs.

## What leaves the device

Three things, all initiated explicitly by the user:

1. Model file downloads — when the user installs a curated model or pastes a Hugging Face URL, the .gguf is fetched over HTTPS from the public host. No identifier accompanies the request beyond what the OS sends for any HTTPS download.
2. In-app purchase verification — when the user buys Tacita Pro, the receipt is verified by RevenueCat. RevenueCat receives an anonymous app-installation id and the platform receipt only; no chat content, no email, no name.
3. Reachability check — a single optional check at startup so the Discover screen can tell the user it is offline.

After a model is on disk, every reply is computed locally. Inference never makes a network call.

## On-device LLM

Tacita ships two curated GGUF models:

- Light: gemma-4-E2B-it-Q4_K_M.gguf, ~2.89 GiB on disk, instruction-tuned Gemma 4 E2B at Q4_K_M. Minimum device RAM: 4 GB.
- Pro: gemma-4-E4B-it-Q4_K_M.gguf, ~4.64 GiB on disk, instruction-tuned Gemma 4 E4B at Q4_K_M. Minimum device RAM: 8 GB.

Both are fetched from the unsloth Hugging Face repos (unsloth/gemma-4-E2B-it-GGUF and unsloth/gemma-4-E4B-it-GGUF), with SHA-256 verified during download. Pro users can additionally browse other Gemma-family GGUFs (every generation: 1, 2, 3, 3n, 4) on Hugging Face from inside the app. There is no architecture pre-filter; whether the engine can load a file is the source of truth.
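Verifying SHA-256 during download means hashing each chunk as it arrives instead of re-reading the multi-gigabyte file afterwards. A minimal sketch, assuming the chunks come from some HTTPS client and the expected digest is known in advance (both are outside this example, and the function name is hypothetical):

```python
import hashlib
import hmac
from typing import Iterable


def verify_stream(chunks: Iterable[bytes], expected_sha256_hex: str) -> bool:
    """Hash chunks incrementally and compare against the expected digest."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)  # in a real download, the chunk is also written to disk here
    return hmac.compare_digest(h.hexdigest(), expected_sha256_hex.lower())
```

A caller would discard the partial file whenever this returns `False`.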

A runtime planner picks n_threads, KV-cache K/V quantisation, and the recommended max context per device tier. On flagships (Pixel 8 Pro, iPhone 16 Pro, Galaxy S24) the planner uses f16 KV cache and the model's native context (~8K). On mid-range (Pixel 7a, Galaxy A54) it uses q8_0 KV cache for ~doubled effective context at ~3% perplexity cost. On 4 GB low-end Android, only the Light model is recommended.
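The "~doubled effective context" figure follows from the storage cost per cached value. In ggml, q8_0 stores each block of 32 values as 32 one-byte quants plus one f16 scale, so 34 bytes instead of the 64 bytes f16 needs. The sketch below works through that arithmetic; the layer and head dimensions are placeholder values, not the real Gemma 4 configuration.

```python
F16_BYTES = 2.0       # bytes per cached value in f16
Q8_0_BYTES = 34 / 32  # 32 int8 quants + one f16 scale per 32 values


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_value):
    """Total KV-cache size; the factor 2 covers the separate K and V tensors."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_value


def context_for_budget(budget_bytes, n_layers, n_kv_heads, head_dim, bytes_per_value):
    """Largest context length whose KV cache fits in budget_bytes."""
    per_token = kv_cache_bytes(n_layers, n_kv_heads, head_dim, 1, bytes_per_value)
    return int(budget_bytes // per_token)


# Placeholder dimensions, chosen only to make the arithmetic concrete.
dims = dict(n_layers=30, n_kv_heads=2, head_dim=256)
budget = kv_cache_bytes(n_ctx=8192, bytes_per_value=F16_BYTES, **dims)
ctx_q8 = context_for_budget(budget, bytes_per_value=Q8_0_BYTES, **dims)
ratio = ctx_q8 / 8192  # about 1.88, independent of the chosen dimensions
```

The ratio is 64/34 whatever the model shape, which is where "roughly double" comes from.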

## Character cards

Tacita imports character cards in the community-standard CCv2 (PNG / APNG with a tEXt chunk keyed "chara"), CCv3 (tEXt chunk keyed "ccv3"), CHARX (a ZIP envelope with card.json at root and assets/<path>), and raw JSON formats. PNG / APNG tEXt chunks are walked in pure Dart; no pixel decode, no extra dependency.
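Walking tEXt chunks needs no image decoding, only the PNG container layout: an 8-byte signature, then chunks of length, type, data, and CRC. The sketch below is in Python rather than Dart. It skips CRC verification for brevity, and it assumes the chunk text is base64-encoded JSON, as in the community card specifications. The function names are hypothetical.

```python
import base64
import json
import struct
from typing import Iterator, Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(png: bytes) -> Iterator[tuple[bytes, bytes]]:
    """Yield (type, data) for each chunk; CRCs are skipped, not checked."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png):
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        data = png[pos + 8:pos + 8 + length]
        if len(data) != length:
            raise ValueError("truncated chunk")
        yield ctype, data
        if ctype == b"IEND":
            break
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def read_text_chunk(png: bytes, keyword: str) -> Optional[str]:
    """Return the text of the first tEXt chunk with the given keyword."""
    for ctype, data in iter_chunks(png):
        if ctype != b"tEXt":
            continue
        key, sep, text = data.partition(b"\x00")
        if sep and key.decode("latin-1") == keyword:
            return text.decode("latin-1")
    return None


def read_card(png: bytes) -> Optional[dict]:
    """Prefer the "ccv3" chunk, fall back to "chara"; assumes base64 JSON."""
    raw = read_text_chunk(png, "ccv3") or read_text_chunk(png, "chara")
    if raw is None:
        return None
    return json.loads(base64.b64decode(raw))
```

Because the walker only reads chunk headers and the matching chunk's payload, the pixel data is never decoded.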

Every spec field maps to a discrete Persona field — name, description, personality, scenario, first_mes, alternate_greetings (up to 32), mes_example, creator_notes, creator, character_version, tags (up to 64), system_prompt (instructions), post_history_instructions, character_book (lorebook), nickname (v3), source[] (v3, rendered as inert plain text — never linkified, never copy-to-clipboard), group_only_greetings (v3), creation_date / modification_date (v3), creator_notes_multilingual (v3), assets[] (v3, bytes go through the encrypted BlobStore), and extensions (visible + hidden buckets). Voice / TTS namespaces are moved to hidden extensions; nothing is destroyed.

Lorebook activation: pure-function engine. Always include entries that are both enabled and constant. For non-constant enabled entries, scan the last book.scanDepth messages for any of the entry's keys; for selective entries, also require a secondary key in the same window. If the book has recursiveScanning set, append the matched contents to the corpus and re-run the scan once (bounded). Then substitute {{char}} / {{user}} macros, sort by priority descending and insertion_order ascending, fill the budget greedily up to tokenBudget × 4 characters (entries are atomic and never partially truncated), and split the result into beforeChar / afterChar buckets.
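The activation steps can be sketched as one pure function. Several details here are assumptions the source does not settle: matching is a case-insensitive substring test, the recursion pass appends the contents of every active entry, an entry that does not fit the budget is skipped while smaller later entries may still fit, and all type and field names are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    keys: list[str]
    content: str
    enabled: bool = True
    constant: bool = False
    selective: bool = False
    secondary_keys: list[str] = field(default_factory=list)
    priority: int = 0
    insertion_order: int = 0
    position: str = "before_char"  # or "after_char"


@dataclass
class Book:
    entries: list[Entry]
    scan_depth: int = 4
    token_budget: int = 512
    recursive_scanning: bool = False


def _matches(entry: Entry, corpus: str) -> bool:
    if not any(k.lower() in corpus for k in entry.keys):
        return False
    if entry.selective:
        return any(k.lower() in corpus for k in entry.secondary_keys)
    return True


def activate(book: Book, messages: list[str], char: str, user: str):
    window = messages[-book.scan_depth:] if book.scan_depth > 0 else []
    corpus = "\n".join(window).lower()
    enabled = [e for e in book.entries if e.enabled]
    active = [e for e in enabled if e.constant or _matches(e, corpus)]
    if book.recursive_scanning:  # exactly one extra pass, so it is bounded
        corpus += "\n" + "\n".join(e.content for e in active).lower()
        active = [e for e in enabled if e.constant or _matches(e, corpus)]
    ordered = sorted(active, key=lambda e: (-e.priority, e.insertion_order))
    budget = book.token_budget * 4  # rough characters-per-token estimate
    before, after = [], []
    for e in ordered:
        text = e.content.replace("{{char}}", char).replace("{{user}}", user)
        if len(text) > budget:
            continue  # atomic: skip what does not fit, never truncate
        budget -= len(text)
        (after if e.position == "after_char" else before).append(text)
    return before, after
```

Since the function reads only its arguments and returns new lists, the same inputs always produce the same buckets.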

Sanitisation: ChatML / Llama / Gemma / Mistral / Qwen / DeepSeek control tokens are stripped silently from every imported text field. So are [INST] / [/INST], <s> / </s>, and Tacita's own <think> / <tool> markers. No redaction breadcrumb is left, by design.
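Silent stripping can be sketched as a fixed-point regex pass. The token list below is illustrative and incomplete: it names a few well-known ChatML, Gemma, and Llama / Mistral markers, and the closing `</think>` / `</tool>` forms are an assumption. Repeating the pass until nothing changes is a design choice of this sketch, so that removing one token cannot splice a new one together.

```python
import re

# Illustrative subset only; the app's real list is not given in this document.
CONTROL_TOKENS = [
    "<|im_start|>", "<|im_end|>",                # ChatML
    "<start_of_turn>", "<end_of_turn>",          # Gemma
    "[INST]", "[/INST]", "<s>", "</s>",          # Llama / Mistral
    "<think>", "</think>", "<tool>", "</tool>",  # closers are assumed
]
_PATTERN = re.compile("|".join(re.escape(t) for t in CONTROL_TOKENS))


def sanitise(text: str) -> str:
    """Remove control tokens, leaving no marker where they used to be."""
    previous = None
    while previous != text:  # repeat until a pass removes nothing
        previous = text
        text = _PATTERN.sub("", text)
    return text
```

For example, `sanitise("<s<s>>hi")` returns `"hi"`: the first pass removes the inner `<s>`, which leaves a new `<s>` for the second pass.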

Per-field caps are enforced and visible to the user via inline truncation markers. Avatars go through the per-vault encrypted BlobStore (content-addressed, decrypted lazily through BlobImageProvider for previews — never on disk in cleartext). The original source file can optionally be securely shredded after import via the same secureDelete primitive used to wipe vaults.

## Conversation model

Every chat is a tree, not a list. Regenerating a reply creates a sibling; editing a question creates a sibling; users can swap between siblings with a tap. The original branch always survives until the user deletes the whole chat.
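The tree model can be sketched with a node type that keeps all of its children and remembers which one is currently shown. The names below are hypothetical, not the app's data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    role: str
    text: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: list["Node"] = field(default_factory=list)
    active_child: int = -1  # index of the branch currently shown


def add_child(parent: Node, role: str, text: str) -> Node:
    """Append a message under parent and make it the visible branch."""
    node = Node(role, text, parent)
    parent.children.append(node)
    parent.active_child = len(parent.children) - 1
    return node


def add_sibling(node: Node, text: str) -> Node:
    """Regenerate or edit: the new version sits next to the old one."""
    if node.parent is None:
        raise ValueError("the root has no siblings")
    return add_child(node.parent, node.role, text)


def active_path(root: Node) -> list[Node]:
    """Follow active_child from the root to get the visible conversation."""
    path, node = [], root
    while node.children:
        node = node.children[node.active_child]
        path.append(node)
    return path
```

Swapping between siblings is then just a change to `active_child` on the parent; no branch is deleted.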

There are three conversation modes: Assistant (general help), Chat (casual conversation), and Roleplay (narrative interaction with a persona). Personas are scoped to one of these three modes. Roleplay-mode personas can carry the full CCv2/v3/CHARX field surface; assistant-mode personas are intentionally restricted (character cards are not a fit for that mode).

Memory facts (Pro): a small list of pinned facts the user wants the model to remember across sessions. They live in the encrypted vault, not in a remote index.

Web / image / video search (Pro): SearXNG with DuckDuckGo (DDG) fallbacks. Inline source chips, encrypted thumbnails in the vault BlobStore, in-app incognito WebView for opening sources or videos. SafeSearch on by default.

Tool-calling inside thinking (Pro): when reasoning is on, the model can call web search mid-thought, read the result, and keep reasoning before it answers. Capped at three calls per turn. The thinking trace stays on the device with everything else.

## Vaults

A vault is an encrypted, password-isolated workspace. Users can create multiple vaults on the same device — each with its own password, its own chats, its own personas, its own settings. They never mix.

Hidden vaults: the user can create a vault that does not appear in the unlock list. It is accessed only by typing the password directly. There is no "wrong password" feedback — by design.

Per-user segregation: every user-scoped directory on disk is keyed by userId = sha256(password).hex(). Wiping a vault deletes all three per-user subtrees (chats, settings, blobs) before forgetting the salt from secure storage.

## Languages

The Tacita UI ships in 30 locales, including three right-to-left scripts (Arabic, Hebrew, Persian). The full list: English, Spanish, Portuguese (Brazilian-flavoured), French, German, Italian, Dutch, Russian, Polish, Turkish, Ukrainian, Czech, Swedish, Danish, Norwegian Bokmål, Finnish, Greek, Romanian, Hungarian, Simplified Chinese, Traditional Chinese, Japanese, Korean, Hindi, Indonesian, Vietnamese, Thai, Arabic, Hebrew, Persian.

The model's reply language follows the user's input language automatically. The system prompts that frame the model are always in English regardless of UI locale, because the curated Gemma 4 family follows English instructions more reliably and a single canonical prompt is easier to test, audit, and reproduce.

## Avatars

Imported character cards (CCv2 / CCv3 / CHARX) carry their portrait through the existing BlobStore — encrypted into the vault before the bytes ever hit disk. On-device avatar generation with a small Stable Diffusion 1.5 model is wired end-to-end and waiting on perf tuning before it ships in the public binary.

## Open formats

Character packs use a .kaichar file extension. A whole vault can be exported to a single encrypted archive that can be kept anywhere; only the user's password opens it.

## Comparison to cloud chatbot apps

Cloud chatbot apps round-trip every user message to a server, log it for moderation and product analytics, frequently train on it, and store conversations under the operator's encryption key. Tacita does none of that — every byte stays on the user's device, encrypted under a key derived from the user's password.

The trade-offs the user takes on by choosing Tacita: (1) the model size is bounded by the device RAM, so peak quality is lower than the largest hosted models; (2) there is no central moderation surface, so the user is responsible for what they generate; (3) there is no remote sync and no automatic multi-device mirroring; (4) lifetime updates depend on the user updating the app, not on the operator pushing a model swap.

## Out of scope

Tacita is intentionally NOT going to add: server-side LLM inference, account systems, cloud sync, multi-device mirroring, telemetry of any kind, third-party crash reporting that exfiltrates message content, or any browse / discover / marketplace surface for third-party characters. The empty state for character cards always points to "import your own file."

## Marketing / canonical URLs

- Homepage: https://gettacita.com/
- FAQ: https://gettacita.com/faq
- Glossary: https://gettacita.com/glossary
- Comparison: https://gettacita.com/compare
- Tacita Desktop landing: https://gettacita.com/desktop
- Privacy architecture: https://gettacita.com/docs/privacy-architecture
- Character cards: https://gettacita.com/docs/character-cards
- Models: https://gettacita.com/docs/models
- Privacy policy: https://gettacita.com/privacy
- Terms of service: https://gettacita.com/terms
- AI disclaimer: https://gettacita.com/ai-disclaimer
- Imprint: https://gettacita.com/imprint
- Google Play listing: https://play.google.com/store/apps/details?id=com.gettacita.tacita

## Contact

- General: hello@gettacita.com
- Privacy / GDPR: privacy@gettacita.com