Tacita glossary

Definitions of the terms that show up across Tacita's documentation, settings, and conversations. Each entry is short, citation-shaped, and links to its own anchor.

The glossary is grouped by topic rather than alphabetically. If you arrived here from a chat or a settings screen looking for one term, use your browser's find-in-page or jump straight to the on-device LLM, CCv2, vault, or AES-256-GCM entries.

On-device LLM

A large language model that runs locally on the user's phone instead of on a remote server. Tacita uses fllama (a Flutter binding around llama.cpp) to execute quantised GGUF models on Android, with iOS support in development. After the model file is downloaded once, every reply is computed without a network call.

See also: GGUF, llama.cpp, fllama.

GGUF

GGUF (GPT-Generated Unified Format) is the file format used by llama.cpp to store quantised language-model weights. It bundles the tokenizer, the architecture metadata, and the quantised tensors into a single file the inference runtime can mmap. Tacita ships pre-quantised Q4_K_M GGUFs of the Gemma 4 family.

See also: llama.cpp, Quantization, Gemma 4.

llama.cpp

An open-source C/C++ inference engine for transformer language models, optimised for CPU and embedded GPU execution. It is the runtime Tacita relies on (via fllama) to evaluate Gemma 4 GGUFs on Android phones at acceptable latency.

See also: GGUF, fllama.

fllama

A Flutter package that wraps llama.cpp's inference loop and exposes it to Dart. Tacita uses fllama to load GGUF models, run streaming generation, and parse Gemma 4's thought channel.

See also: On-device LLM, llama.cpp.

Gemma 4

Google's fourth-generation Gemma family of open-weight, instruction-tuned language models, distributed as GGUFs by community packagers like Unsloth. Tacita ships the Gemma 4 E2B (Light) and E4B (Pro) variants and lets Pro users browse other Gemma-family GGUFs from Hugging Face.

See also: GGUF, Quantization.

Quantization

A technique that reduces model weight precision (typically from 16-bit floats to 4- or 8-bit integers) so the model fits in less RAM and runs faster, at a small cost in quality. Tacita uses Q4_K_M quantisation for the curated models — the K_M variant balances size and quality well for mobile inference.

See also: GGUF, KV cache.

KV cache

The key/value cache that an LLM keeps for every token already in its context. Tacita's runtime planner picks f16 KV on flagship devices and q8_0 KV on mid-range devices. Because q8_0 stores each cached value in about half the bytes of f16, it roughly doubles the context that fits in the same memory for ~3% perplexity loss, which is the right trade on memory-constrained phones.
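
The memory trade-off can be checked with a little arithmetic. The sketch below is illustrative only: the model shape and the 256 MiB budget are made-up values, not Gemma 4's real dimensions or Tacita's planner settings. The only fixed numbers are the storage costs: 2 bytes per f16 value, and 34 bytes per block of 32 values for llama.cpp's q8_0 layout.

```python
# Rough KV-cache sizing. The model shape and budget are hypothetical.
F16_BYTES = 2.0        # one 16-bit float per cached value
Q8_0_BYTES = 34 / 32   # q8_0 block: 32 int8 values plus one 2-byte scale


def kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_element):
    """Bytes needed to cache the keys and values of a single token."""
    # The factor 2 covers the K tensor and the V tensor.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_element


def max_context(budget_bytes, per_token_bytes):
    """Largest token count whose cache fits inside the budget."""
    return int(budget_bytes // per_token_bytes)


shape = dict(n_layers=30, n_kv_heads=4, head_dim=256)  # hypothetical model
budget = 256 * 1024 * 1024                             # hypothetical 256 MiB

f16_tokens = max_context(budget, kv_bytes_per_token(**shape, bytes_per_element=F16_BYTES))
q8_tokens = max_context(budget, kv_bytes_per_token(**shape, bytes_per_element=Q8_0_BYTES))
print(f16_tokens, q8_tokens)
```

With these assumed numbers the q8_0 cache holds a little under twice as many tokens as f16, because the per-block scale adds a small overhead on top of the 8-bit values.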

See also: Context window, Runtime planner.

Context window

The number of tokens the model can attend to at once. Tacita's runtime planner sizes the context per device tier — a Pixel 8 Pro lands at ~8K (the model's native max), a 4 GB low-end Android lands at ~1–2K with the Light model.

See also: KV cache, Runtime planner.

Runtime planner

The component in Tacita that picks n_threads, KV-cache K/V quantisation, and recommended max context based on the RAM the device actually has. It is the reason a single binary runs acceptably on a mid-range phone and at full quality on a flagship without manual tuning.
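
A planner of this kind can be pictured as a small decision function. The sketch below is not Tacita's planner: `RuntimePlan`, `plan_runtime`, the RAM thresholds, the thread rule, and the choice of q8_0 on the lowest tier are all assumptions made for illustration. Only the broad shape (f16 and ~8K on a flagship, q8_0 on mid-range, ~1–2K on a 4 GB device) follows the entries on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimePlan:
    n_threads: int
    kv_type: str       # "f16" or "q8_0"
    max_context: int   # recommended context size in tokens


def plan_runtime(ram_gb: float, cpu_cores: int) -> RuntimePlan:
    """Pick inference settings from device RAM. All thresholds are hypothetical."""
    # Assumption: leave two cores for the UI, cap at six, never go below one.
    n_threads = max(1, min(cpu_cores - 2, 6))
    if ram_gb >= 12:   # flagship tier (assumed cut-off)
        return RuntimePlan(n_threads, "f16", 8192)
    if ram_gb >= 6:    # mid-range tier (assumed cut-off)
        return RuntimePlan(n_threads, "q8_0", 4096)
    return RuntimePlan(n_threads, "q8_0", 2048)  # low-end tier (assumed values)


print(plan_runtime(ram_gb=12, cpu_cores=9))
print(plan_runtime(ram_gb=4, cpu_cores=4))
```

Keeping the plan in one immutable value makes it easy to log the chosen settings next to any performance report.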

See also: Context window, KV cache.

CCv2 (Character Card v2)

The community-standard format for distributing character definitions inside a PNG or APNG image, encoded as a base64 JSON payload in a tEXt chunk keyed "chara". Tacita imports CCv2 cards natively and maps every documented field to a discrete persona field — name, description, personality, scenario, first message, alternate greetings, message examples, system prompt, and the character book (lorebook).

See also: CCv3 (Character Card v3), CHARX, Lorebook (character_book).

CCv3 (Character Card v3)

The successor format to CCv2, identified by a tEXt chunk keyed "ccv3". CCv3 adds a nickname, a sources list, group-only greetings, creation and modification dates, multilingual creator notes, and an assets list. Tacita prefers CCv3 when both chunks are present in the same file and preserves every documented field.
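
The chunk lookup described in the CCv2 and CCv3 entries can be sketched with the standard library alone. This is a minimal reader, not Tacita's importer: the function names are hypothetical, CRCs are not verified, and APNG-specific chunks are simply skipped. `png_chunk` exists only to build a sample file for the demonstration.

```python
import base64
import json
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(ctype: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC (demo helper)."""
    crc = zlib.crc32(ctype + data)
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


def read_text_chunks(png: bytes) -> dict:
    """Return {keyword: raw text bytes} for every tEXt chunk in a PNG."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png):
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        data = png[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # tEXt data is "keyword NUL text".
            keyword, _, text = data.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text
        if ctype == b"IEND":
            break
        pos += 8 + length + 4  # chunk header + data + CRC
    return chunks


def load_card(png: bytes) -> dict:
    """Decode the base64 JSON payload, preferring "ccv3" over "chara"."""
    chunks = read_text_chunks(png)
    for key in ("ccv3", "chara"):
        if key in chunks:
            return json.loads(base64.b64decode(chunks[key]))
    raise ValueError("no character card chunk found")


payload = base64.b64encode(json.dumps({"data": {"name": "Ada"}}).encode())
sample = PNG_SIGNATURE + png_chunk(b"tEXt", b"chara\x00" + payload) + png_chunk(b"IEND", b"")
print(load_card(sample)["data"]["name"])
```

Checking "ccv3" before "chara" mirrors the stated preference for CCv3 when a file carries both chunks.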

See also: CCv2 (Character Card v2), CHARX, Alternate greetings.

CHARX

A ZIP-based character-card envelope with card.json at the root and an assets/ directory for binary attachments. Tacita imports CHARX, content-addresses every asset into the per-vault encrypted blob store, and preserves the original asset URIs so a future export can rebuild the layout.
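
As a rough illustration of that import path, the sketch below reads card.json and indexes every file under assets/ by its SHA-256 digest while remembering the original path. `import_charx` is a hypothetical name, and the encryption step is left out entirely, so this shows only the unpacking and content addressing.

```python
import hashlib
import io
import json
import zipfile


def import_charx(archive: bytes):
    """Return (card, assets); assets maps SHA-256 hex digest -> (original path, bytes)."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        card = json.loads(zf.read("card.json"))
        assets = {}
        for name in zf.namelist():
            # Skip directory entries; keep only files under assets/.
            if name.startswith("assets/") and not name.endswith("/"):
                data = zf.read(name)
                assets[hashlib.sha256(data).hexdigest()] = (name, data)
    return card, assets


# Build a tiny archive in memory to exercise the function.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("card.json", json.dumps({"data": {"name": "Ada"}}))
    zf.writestr("assets/avatar.png", b"fake image bytes")

card, assets = import_charx(buffer.getvalue())
print(card["data"]["name"], len(assets))
```

Keying assets by digest means two cards that embed the same image share one stored copy, while the saved path keeps enough information to rebuild the archive later.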

See also: CCv3 (Character Card v3), BlobStore.

Lorebook (character_book)

A list of conditional knowledge entries attached to a character card. Each entry has keys, a content body, and metadata like priority and insertion order. Tacita's lorebook engine scans the last N messages, picks matching entries (with optional secondary-key gating and one bounded recursive pass), substitutes macros, sorts by priority, budget-fills, and splits into pre-character and post-character buckets.
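
The selection pipeline above can be sketched as follows. This is a simplified model, not Tacita's engine: `LoreEntry`, `select_entries`, the substring matching, the word-count token estimate, and the default scan depth and budget are all assumptions, and macro substitution is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class LoreEntry:
    keys: list
    content: str
    priority: int = 0
    secondary_keys: list = field(default_factory=list)
    position: str = "before_char"  # or "after_char"


def _matches(entry, text):
    primary = any(k.lower() in text for k in entry.keys)
    # Secondary keys, when present, gate the match.
    secondary = (not entry.secondary_keys
                 or any(k.lower() in text for k in entry.secondary_keys))
    return primary and secondary


def select_entries(entries, messages, scan_depth=4, token_budget=200):
    """Return (pre_character, post_character) lists of entry contents."""
    window = " ".join(messages[-scan_depth:]).lower()
    picked = [e for e in entries if _matches(e, window)]

    # One bounded recursive pass: text of picked entries may trigger others, once.
    widened = window + " " + " ".join(e.content for e in picked).lower()
    picked += [e for e in entries if e not in picked and _matches(e, widened)]

    picked.sort(key=lambda e: e.priority, reverse=True)

    before, after, used = [], [], 0
    for entry in picked:
        cost = len(entry.content.split())  # crude stand-in for a tokenizer
        if used + cost > token_budget:
            continue                        # skip whatever no longer fits
        used += cost
        (before if entry.position == "before_char" else after).append(entry.content)
    return before, after


book = [
    LoreEntry(["dragon"], "Dragons guard the northern pass.", priority=5),
    LoreEntry(["northern pass"], "The northern pass is snowed in.",
              priority=1, position="after_char"),
    LoreEntry(["castle"], "The castle is abandoned."),
]
print(select_entries(book, ["Hello", "Tell me about the dragon"]))
```

In the usage above the second entry is never mentioned by the user; it is pulled in by the recursive pass because the first entry's content names the northern pass.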

See also: CCv2 (Character Card v2), CCv3 (Character Card v3), Macros ({{char}}, {{user}}, …).

Alternate greetings

A list of opening messages a character can use to begin a chat (up to 32 per card in Tacita). When you start a new chat with a card that has more than one greeting, Tacita shows a picker; the chosen greeting is macro-resolved before it lands as the first assistant message.

See also: Lorebook (character_book), Macros ({{char}}, {{user}}, …).

Macros ({{char}}, {{user}}, …)

Inline placeholders inside character-card text fields. Tacita resolves {{char}} / {{Char}} / {{CHAR}} / {{char_name}} to the persona's name, {{user}} variants to the user's display name, and supports extended macros like {{random:a,b}}, {{roll:NdM±K}}, {{date}}, {{time}}, and {{idle_duration}}. Stable-prefix slots use frozen values so the cached prompt prefix stays byte-stable across turns.
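
A minimal resolver for a few of these macros might look like the sketch below. It is an approximation under stated assumptions: `resolve_macros` is a hypothetical name, matching is case-insensitive (broader than the variants listed above), unknown macros are left untouched, and {{date}}, {{time}}, and {{idle_duration}} are not handled.

```python
import random
import re

_MACRO = re.compile(r"\{\{([^{}]+)\}\}")
_ROLL = re.compile(r"(\d+)d(\d+)([+-]\d+)?$")


def resolve_macros(text, char_name, user_name, rng=None):
    """Replace a subset of card macros in text; rng allows reproducible rolls."""
    rng = rng or random.Random()

    def replace(match):
        body = match.group(1).strip()
        lowered = body.lower()
        if lowered in ("char", "char_name"):
            return char_name
        if lowered == "user":
            return user_name
        if lowered.startswith("random:"):
            options = body[len("random:"):].split(",")
            return rng.choice(options).strip()
        if lowered.startswith("roll:"):
            roll = _ROLL.match(lowered[len("roll:"):].strip())
            if roll and int(roll.group(2)) >= 1:
                count, sides = int(roll.group(1)), int(roll.group(2))
                bonus = int(roll.group(3) or 0)
                return str(sum(rng.randint(1, sides) for _ in range(count)) + bonus)
        return match.group(0)  # unknown or malformed macro: leave as written

    return _MACRO.sub(replace, text)


print(resolve_macros("{{char}} greets {{user}} and rolls {{roll:2d6+1}}.", "Ada", "Sam"))
```

Passing a seeded `random.Random` is one way to get the frozen values that a byte-stable prompt prefix needs: resolve once, store the result, and reuse it on later turns.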

See also: Alternate greetings, System prompt.

mes_example

An optional field on a character card that contains example dialogue, typically separated by <START> markers. Tacita preserves the field, splits it on <START> after macro resolution, and emits each non-empty block as a separate <example> tag in the system prompt. Under context pressure the oldest whole block is dropped first — never partial-truncated.
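
The split-and-drop behaviour can be sketched in a few lines. The function names and the word-count token estimate are assumptions, the input is taken to be already macro-resolved, and "oldest" is interpreted as "earliest in the field".

```python
def split_examples(mes_example: str) -> list:
    """Split on <START> and keep only non-empty blocks."""
    blocks = (block.strip() for block in mes_example.split("<START>"))
    return [block for block in blocks if block]


def render_examples(blocks, token_budget, count_tokens=lambda s: len(s.split())):
    """Emit one <example> tag per block, dropping whole blocks from the front."""
    kept = list(blocks)
    while kept and sum(count_tokens(b) for b in kept) > token_budget:
        kept.pop(0)  # drop the oldest block whole; never truncate inside one
    return "\n".join(f"<example>\n{block}\n</example>" for block in kept)


raw = "<START>\nSam: Hi\nAda: Hello there\n<START>\n\n<START>\nSam: Bye\nAda: Farewell"
blocks = split_examples(raw)
print(render_examples(blocks, token_budget=100))
print(render_examples(blocks, token_budget=4))
```

With the generous budget both blocks are emitted; with the tight one only the second block survives, intact.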

See also: CCv2 (Character Card v2), System prompt.

Persona

Tacita's authored character entity. Personas are scoped to one of three conversation modes — Assistant, Chat, or Roleplay — and carry every CCv2 / CCv3 field as a discrete property. A persona is bound to a chat at creation and can be swapped mid-conversation if the tone needs to shift.

See also: Conversation mode, System prompt.

Conversation mode

Tacita supports three conversation modes: Assistant (general help), Chat (casual conversation), and Roleplay (narrative interaction with a persona). Each mode has its own system-prompt framing. Imported character cards default to Roleplay; assistant-mode personas are intentionally restricted because character cards are not a fit for that mode.

See also: Persona, System prompt.

System prompt

The assembled instruction block prepended to every model turn. Tacita's prompt assembler stitches together the mode framing, pre-character lorebook entries, persona description, personality, scenario, instructions, memory facts, message examples, post-character lorebook entries, and finally the chat history. Post-history instructions land in a separate slot after the history.
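
The ordering described above can be written down as a fixed slot list. The slot names, `assemble_prompt`, and the returned list shape are hypothetical; only the order comes from this entry.

```python
# Hypothetical slot names, in the order the entry describes.
SLOT_ORDER = [
    "mode_framing", "lore_before", "description", "personality", "scenario",
    "instructions", "memory_facts", "examples", "lore_after",
]


def assemble_prompt(slots: dict, history: list, post_history: str = "") -> list:
    """Return [system block, *history, optional post-history instructions]."""
    # Empty or missing slots are skipped so they leave no blank gaps.
    system = "\n\n".join(slots[name] for name in SLOT_ORDER if slots.get(name))
    parts = [system, *history]
    if post_history:
        parts.append(post_history)  # separate slot after the history
    return parts


parts = assemble_prompt(
    {"mode_framing": "You are in Roleplay mode.",
     "description": "Ada is a cartographer.",
     "personality": "",
     "memory_facts": "The user's name is Sam."},
    history=["Sam: Hi"],
    post_history="Stay in character.",
)
print(parts[0])
```

A fixed order also helps prompt caching: slots that rarely change sit at the front, so the prefix stays stable from turn to turn.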

See also: Macros ({{char}}, {{user}}, …), Lorebook (character_book).

Vault

An encrypted, password-isolated workspace on the device. Each vault has its own password, its own chats, its own personas, and its own settings. Multiple vaults on the same phone are fully isolated; users can also create secret vaults that never appear in the unlock screen list.

See also: PBKDF2-HMAC-SHA256, AES-256-GCM, secureDelete.

PBKDF2-HMAC-SHA256

The key-derivation function Tacita uses to turn a user's password into the AES-256 master key. Tacita runs 300,000 iterations with a 16-byte per-vault salt, calibrated for ~500 ms on a mid-range Android device. The salt is persisted in the platform secure store; the derived key lives only in RAM.
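
Python's standard library exposes the same primitive, so the parameters above can be tried directly. This is a sketch of the derivation only: UTF-8 password encoding is an assumption, and secure-store persistence of the salt is not shown.

```python
import hashlib
import os

ITERATIONS = 300_000
SALT_BYTES = 16
KEY_BYTES = 32  # 256-bit key for AES-256


def new_salt() -> bytes:
    """Generate a fresh random per-vault salt."""
    return os.urandom(SALT_BYTES)


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive the master key with PBKDF2-HMAC-SHA256."""
    # Assumption: the password is encoded as UTF-8 before derivation.
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=KEY_BYTES
    )


salt = new_salt()
key = derive_key("correct horse battery staple", salt)
print(len(key))
```

The same password and salt always yield the same key, which is why only the salt needs to be stored; a different salt gives an unrelated key even for an identical password.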

See also: AES-256-GCM, Vault.

AES-256-GCM

The authenticated symmetric cipher Tacita uses to encrypt every persisted file (chats, personas, settings, blobs). Each record uses a fresh 12-byte random nonce and a 16-byte authentication tag; layout is [12B nonce][ciphertext][16B tag]. AAD is intentionally empty.
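
The record layout can be illustrated without any cipher code. The sketch below only frames and unframes a record; it performs no encryption or authentication (Python's standard library has no AES-GCM), and the function names are hypothetical.

```python
NONCE_BYTES = 12
TAG_BYTES = 16


def pack_record(nonce: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    """Lay out a record as [12B nonce][ciphertext][16B tag]."""
    if len(nonce) != NONCE_BYTES or len(tag) != TAG_BYTES:
        raise ValueError("unexpected nonce or tag length")
    return nonce + ciphertext + tag


def unpack_record(record: bytes):
    """Split a record back into (nonce, ciphertext, tag)."""
    if len(record) < NONCE_BYTES + TAG_BYTES:
        raise ValueError("record too short")
    nonce = record[:NONCE_BYTES]
    ciphertext = record[NONCE_BYTES:len(record) - TAG_BYTES]
    tag = record[len(record) - TAG_BYTES:]
    return nonce, ciphertext, tag


record = pack_record(bytes(range(12)), b"opaque ciphertext", bytes(16))
print(len(record), unpack_record(record)[1])
```

Because the nonce and tag have fixed sizes, the ciphertext length is simply the record length minus 28 bytes, so no length prefix is needed.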

See also: PBKDF2-HMAC-SHA256, secureDelete.

secureDelete

Tacita's deletion primitive. It makes one overwrite pass with random bytes encrypted under an ephemeral, never-stored key, then flushes, closes, and deletes the file. On copy-on-write filesystems the original blocks may briefly survive on the controller; those bytes are doubly encrypted (by Tacita and by the OS-level full-disk encryption) and read as random noise to a forensic recovery tool.
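
The overwrite, flush, and delete sequence looks roughly like the sketch below. It is a simplification: the function name and chunk size are assumptions, and plain random bytes stand in for the "random bytes encrypted under an ephemeral key" step, which would need a cipher library.

```python
import os


def secure_delete(path: str, chunk_size: int = 64 * 1024) -> None:
    """Overwrite a file once with random bytes, force it to storage, then unlink it."""
    remaining = os.path.getsize(path)
    with open(path, "r+b") as handle:
        while remaining > 0:
            step = min(chunk_size, remaining)
            handle.write(os.urandom(step))  # stand-in for encrypted random bytes
            remaining -= step
        handle.flush()
        os.fsync(handle.fileno())           # push the overwrite past OS buffers
    os.remove(path)
```

Writing in chunks keeps memory use flat for large files, and the `fsync` call matters because deleting a file whose overwrite is still buffered could leave the old bytes on storage.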

See also: AES-256-GCM, Vault.

BlobStore

Tacita's per-vault encrypted, content-addressed store for binary content: image and video thumbnails fetched from search, character avatars, and CHARX assets. Bytes are keyed by SHA-256, encrypted with the vault key, and only ever rendered back through BlobImageProvider; cleartext bytes never touch the disk.

See also: AES-256-GCM, Vault.

Memory facts

A small list of pinned facts the user wants the model to remember across sessions inside a vault. Memory facts are stored encrypted and auto-injected into the system prompt of every chat in the vault. They are a Pro feature and are capped to a small fixed number to keep the prompt budget predictable.

See also: System prompt, Vault.

mmproj

The multimodal projector file paired with certain vision-capable Gemma 4 GGUFs. Tacita uses mmproj both on-device and over Bridge. When the active model ships an mmproj, the camera button in the composer lets you drop in an image, and the model can read it while staying inside your vault.

See also: GGUF, Gemma 4.

Stable Diffusion (1.5)

An open-weight text-to-image diffusion model. Tacita's persona-avatar generation flow runs a small Stable Diffusion 1.5 model entirely on-device, with the resulting image encrypted into the vault before it ever hits disk. The surface is wired end-to-end and waiting on perf tuning before it ships in the public binary.

See also: BlobStore.

UMP / ATT

Google's User Messaging Platform and Apple's App Tracking Transparency frameworks for surfacing the consent dialog tied to ad personalisation. Tacita implements both because banner ads run on the free tier; once the user buys Tacita Pro, banners are removed and the consent flow is no longer relevant.

Looking for the longer story behind these terms? See the privacy architecture page, the character cards page, and the models page. Direct answers to common questions live on the FAQ.