P3AK Documentation
P3AK is the AI data foundation every organization needs. Not an AI wrapper — the data layer underneath every AI wrapper. Three products, one platform. Get your data right once and every AI tool you use gets smarter. Forever.
vault: One encrypted .vault file holds your entire organization's knowledge. 35 formats. 98% Top-1 accuracy in hybrid search on a 153-query benchmark. Portable. Model-agnostic.
room: Five-tributary knowledge organization for any business. Version-controlled. Gap analysis. Exports to .mdr for vault ingestion.
harness: AI reasoning with permanent memory. CREST protocol. Works with any model. The operating system for how your organization thinks.
Quick Start
From zero to a searchable encrypted knowledge base in under 5 minutes.
1 — Install
    # Install the CLI via Cargo
    $ cargo install p3ak-vault

    # Verify
    $ p3ak-vault --version
    p3ak-vault 0.1.0
2 — Create a vault
    $ p3ak-vault create --path company.vault --passphrase "$VAULT_KEY"
    {"ok":true,"encrypted":true,"path":"company.vault"}
3 — Ingest documents
    # Single file (35 formats supported)
    $ p3ak-vault ingest --path company.vault --file term-sheet.pdf --room legal

    # Entire directory
    $ p3ak-vault ingest --path company.vault --dir ./documents --room legal
    {"added":14,"skipped":0,"formats":["pdf","docx","md","mdr"]}
4 — Search
    $ p3ak-vault search --path company.vault --query "renewal terms" --mode hybrid
    [{
      "score": 1.41,
      "filename": "services-agreement.mdr",
      "room": "legal",
      "snippet": "The initial term is 12 months, auto-renewing..."
    }]
Set P3AK_VAULT_PASSPHRASE as an environment variable to avoid passing --passphrase on every command.
How It Fits Together
P3AK is three independent products connected by the .mdr format and the vault API.
    P3AK harness (orchestration: CREST protocol, Pi/Claude)
      ↓ vault_search / vault_write via Pi extension
      ↓ room REST API (gaps, documents, analysis)
    ────────────────────────────────────────────────
    P3AK room (application: Next.js 14, 5 tributaries)
      ↓ POST /api/companies/[slug]/vault-push
      ↓ GET /api/companies/[slug]/export?format=mdr
    ────────────────────────────────────────────────
    P3AK vault (infrastructure: Rust, single .vault file)
      p3ak-vault ingest --file doc.mdr --room legal
      p3ak-vault search --query "..." --mode hybrid
      p3ak-vault serve --port 8080
    ────────────────────────────────────────────────
    .mdr format (bridge: room creates, vault ingests, harness reads)
vault — Overview
A single encrypted binary file that stores, indexes, and retrieves your entire knowledge base. No database. No cloud. No SaaS. Yours.
| Property | Value |
|---|---|
| Language | Rust 1.77+ |
| License | MIT |
| Encryption | AES-256-GCM · Argon2id KDF |
| Search | BM25 (Tantivy) + ZVec TF-IDF + PageIndex · hybrid |
| Accuracy | 98% Top-1 on 153-query benchmark |
| Formats | 35 file types |
| Tests | 272 unit · 54 integration · 12 accuracy |
| Install | cargo install p3ak-vault |
Installation
From crates.io (recommended)
$ cargo install p3ak-vault
From source
    $ git clone https://github.com/siliconbayou/p3ak-vault
    $ cd p3ak-vault
    $ cargo build --release
    $ cp target/release/p3ak-vault /usr/local/bin/
Environment variables
| Variable | Description |
|---|---|
| P3AK_VAULT_PASSPHRASE | Encryption passphrase (avoids --passphrase flag) |
| ANTHROPIC_API_KEY | Enables LLM classification features |
| P3AK_VAULT_BIN | Path to binary (used by room vault-push) |
CLI Commands
create
p3ak-vault create --path <PATH> [--passphrase <P>]
Creates a new empty vault. With --passphrase, the vault is AES-256-GCM encrypted. Without it, the vault is stored unencrypted and a warning is issued.
ingest
    # Single file
    p3ak-vault ingest --path <VAULT> --file <FILE> [--room <R>] [--upsert]

    # Directory (recursive)
    p3ak-vault ingest --path <VAULT> --dir <DIR> [--room <R>]
Ingests a file or directory. Content is normalized to markdown, SHA-256 deduped, indexed, and appended to the vault. Supports 35 file formats. Returns {"action":"added"|"skipped"|"updated"}.
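The dedup step can be pictured with a short sketch. It assumes deduplication keys on the SHA-256 digest of the normalized markdown text. The `ingest` function and the in-memory `seen` set are hypothetical stand-ins, not the CLI's internals or its on-disk format.

```python
import hashlib


def content_digest(markdown_text: str) -> str:
    """Return the SHA-256 hex digest of normalized markdown content."""
    return hashlib.sha256(markdown_text.encode("utf-8")).hexdigest()


def ingest(seen: set, markdown_text: str) -> dict:
    """Hypothetical ingest step: skip content whose digest is already known."""
    digest = content_digest(markdown_text)
    if digest in seen:
        return {"action": "skipped"}
    seen.add(digest)
    return {"action": "added"}


seen_digests = set()
first = ingest(seen_digests, "# Term Sheet\nPre-money valuation: $8M.")
second = ingest(seen_digests, "# Term Sheet\nPre-money valuation: $8M.")
```

Ingesting identical content twice yields "added" and then "skipped". How --upsert decides on "updated" is not described here, so the sketch leaves it out.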
search
p3ak-vault search --path <VAULT> --query <Q> [--limit <N>] [--mode flat|pageindex|hybrid] [--room <R>]
Searches the vault using the specified mode. Returns a JSON array sorted by relevance score. Use --room to scope the search to a specific tributary.
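Because the output is plain JSON, post-processing needs only the standard library. The sketch below parses the sample result from the Quick Start and keeps hits above a cutoff. The `top_hits` helper and its `min_score` threshold are illustrative assumptions, since the score scale is not documented here.

```python
import json

# Sample output copied from the Quick Start search example.
raw_output = """[{"score": 1.41, "filename": "services-agreement.mdr",
  "room": "legal", "snippet": "The initial term is 12 months, auto-renewing..."}]"""


def top_hits(raw: str, min_score: float = 0.0) -> list:
    """Parse search output and keep hits at or above min_score, best first."""
    hits = json.loads(raw)
    kept = [h for h in hits if h["score"] >= min_score]
    return sorted(kept, key=lambda h: h["score"], reverse=True)


hits = top_hits(raw_output, min_score=1.0)
for hit in hits:
    print(hit["room"], hit["filename"], hit["score"])
```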
serve
p3ak-vault serve --path <VAULT> [--port 8080] [--bind 127.0.0.1]
Starts a synchronous HTTP server exposing the vault over REST. Binds to 127.0.0.1 by default. See REST API for available endpoints.
watch
p3ak-vault watch --path <VAULT> --dir <DIR> [--room <R>]
Watches a directory for file changes and automatically ingests new or modified files. Ctrl-C to stop.
read
p3ak-vault read --path <VAULT> --type goals|docs|wal
Reads structured sections of the vault. docs lists all ingested documents. wal shows the hash-linked write-ahead log. goals returns stored goal entries.
write
    p3ak-vault write --path <VAULT> --type goal|doc|plan|review --payload '{"title":"..."}'

    # Or pipe JSON from stdin
    echo '{"title":"Q1 goals"}' | p3ak-vault write --path vault --type goal --payload -
canary-check
p3ak-vault canary-check --path <VAULT> [--threshold 0.8]
Runs the embedded canary query set and measures retrieval accuracy. Exit code 0 = passing, 2 = recall below threshold, 1 = fatal error.
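A script that gates a job on canary-check could map the documented exit codes to readable statuses. This is a minimal sketch: `describe_exit` and `run_canary` are hypothetical helper names, and `run_canary` assumes the p3ak-vault binary is on PATH.

```python
import subprocess

# Exit codes as documented for canary-check.
CANARY_STATUS = {0: "passing", 1: "fatal error", 2: "recall below threshold"}


def describe_exit(code: int) -> str:
    """Translate a canary-check exit code into a readable status."""
    return CANARY_STATUS.get(code, f"unexpected exit code {code}")


def run_canary(vault_path: str, threshold: float = 0.8) -> str:
    """Run canary-check (needs p3ak-vault on PATH) and describe the result."""
    proc = subprocess.run(
        ["p3ak-vault", "canary-check", "--path", vault_path,
         "--threshold", str(threshold)],
        capture_output=True,
        text=True,
    )
    return describe_exit(proc.returncode)
```

Calling `run_canary("company.vault")` returns one of the three documented statuses. Any other exit code is reported as unexpected rather than guessed at.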
sync
p3ak-vault sync --path <VAULT>
Flushes the WAL, rebuilds the index, and compacts the vault file. Run after bulk ingests.
export
p3ak-vault export --path <VAULT> [--format json|md] [--out <FILE>]
accuracy-test
p3ak-vault accuracy-test --path <VAULT> --ground-truth ground-truth.json [--mode hybrid]
Runs a structured accuracy benchmark against a ground-truth JSON file. See testdata/fixtures/ground-truth.json for format.
Search Modes
P3AK vault supports three search modes. Hybrid is recommended for production.
| Mode | Engine | Best For | Accuracy |
|---|---|---|---|
| flat | BM25 (Tantivy) | Keyword search, exact term matching | ~85% Top-1 |
| pageindex | PageIndex tree | Hierarchical documents, long-form content | ~90% Top-1 |
| hybrid | BM25 + ZVec TF-IDF combined | General purpose (recommended) | 98% Top-1 |
Hybrid mode runs BM25 and ZVec TF-IDF in parallel, then combines scores with a weighted merge. BM25 handles exact term recall; ZVec captures semantic similarity via term-frequency vectors built at ingest time. No external embeddings API required.
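The weighted merge can be sketched as a linear combination of per-document scores. The weights (0.6 and 0.4), the `weighted_merge` name, and the sample scores are all assumptions for illustration. The actual weights and any score normalization used by the vault are not stated in this document, and the sketch does not model the parallel execution.

```python
def weighted_merge(bm25: dict, zvec: dict, w_bm25: float = 0.6, w_zvec: float = 0.4) -> list:
    """Combine two {document: score} maps into one ranking, best first.

    A document missing from one engine contributes 0.0 for that engine.
    """
    docs = set(bm25) | set(zvec)
    combined = {
        doc: w_bm25 * bm25.get(doc, 0.0) + w_zvec * zvec.get(doc, 0.0)
        for doc in docs
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Made-up scores for three documents.
bm25_scores = {"services-agreement.mdr": 2.0, "nda.mdr": 1.0}
zvec_scores = {"services-agreement.mdr": 0.5, "msa.mdr": 1.0}
ranking = weighted_merge(bm25_scores, zvec_scores)
```

A document that both engines score well ranks above one that only a single engine found, which is the behavior a hybrid mode is meant to provide.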
File Formats (35)
P3AK vault normalizes all formats to markdown before indexing. No external tools required for Tier 1 and 2 formats.
REST API
Start the API server with p3ak-vault serve --path vault.vault --port 8080. All endpoints return JSON.
| Method | Path | Description |
|---|---|---|
| POST | /ingest | Ingest a document. Body: {"path":"...","room":"..."} |
| POST | /search | Search. Body: {"query":"...","mode":"hybrid","limit":10} |
| GET | /docs | List all ingested documents |
| POST | /write | Write a structured entry (goal/doc/plan/review) |
| GET | /wal | Read the hash-linked write-ahead log |
| POST | /canary-check | Run canary accuracy check |
| GET | /health | Health check; returns {"ok":true} |
The REST API binds to 127.0.0.1 by default. Do not expose it to the public internet without adding authentication. Use --bind 0.0.0.0 only in trusted environments.
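A client for the /search endpoint can be built with the standard library alone. The sketch below only constructs the request. The `build_search_request` helper and the Content-Type header are assumptions, while the body fields and the default address come from the table and the serve defaults above.

```python
import json
import urllib.request


def build_search_request(query: str, mode: str = "hybrid", limit: int = 10,
                         base_url: str = "http://127.0.0.1:8080") -> urllib.request.Request:
    """Build a POST /search request with the documented JSON body."""
    body = json.dumps({"query": query, "mode": mode, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed, not documented
        method="POST",
    )


request = build_search_request("renewal terms", limit=5)

# To send it against a running `p3ak-vault serve` instance:
#   with urllib.request.urlopen(request) as response:
#       hits = json.load(response)
```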
Python SDK
Installation
$ pip install p3ak-vault # subprocess wrapper (zero deps)
Usage
    from p3ak_vault import VaultClient

    client = VaultClient("company.vault", passphrase="your-key")

    # Ingest
    client.ingest("term-sheet.pdf", room="legal")

    # Search
    results = client.search("renewal terms", mode="hybrid", limit=5)
    for r in results:
        print(r["filename"], r["score"], r["snippet"])
Configuration
P3AK vault reads configuration from ~/.p3ak/config.toml.
    # Default vault path
    default_vault = "~/vaults/main.vault"

    # Default search mode
    search_mode = "hybrid"

    # LLM classification (optional)
    anthropic_model = "claude-3-haiku-20240307"
room — Overview
AI-native knowledge organization for any business. Five tributaries structure your company's intelligence, version-control every document, track what's missing, and export everything as portable .mdr files for vault ingestion. Built for companies that need their data organized — whether for investors, acquirers, partners, or themselves.
| Property | Value |
|---|---|
| Framework | Next.js 14 (App Router) |
| Database | PostgreSQL + Drizzle ORM |
| Version control | isomorphic-git (per-company git repo) |
| Auth | Clerk (optional) |
| AI | Anthropic Claude via Vercel AI SDK |
5 Tributaries
Every organization's knowledge is organized into five tributaries — the five areas that matter most, whether you're running day-to-day operations, preparing for investment, or just need everything in one place.
| # | Tributary | Contents |
|---|---|---|
| 01 | Legal | Articles, operating agreement, cap table, IP assignments, contracts |
| 02 | Financial | P&L, balance sheet, projections, tax returns, burn rate |
| 03 | Operations | Org chart, employee agreements, insurance, SOC2, DR plan |
| 04 | GTM | Sales playbook, pipeline, customer contracts, marketing |
| 05 | Tech | Architecture docs, security reports, API docs, roadmap |
API Reference
All routes are under /api/companies/[slug]/.
| Method | Route | Description |
|---|---|---|
| GET | /files | List all files in the data room |
| POST | /files | Upload a file to a tributary |
| GET | /export | Export data room as ZIP |
| GET | /export?format=mdr | Export all files as .mdr ZIP |
| GET | /export?format=mdr&file=path | Export single file as .mdr |
| POST | /vault-push | Push all documents to a P3AK vault |
| GET | /sync | Sync status and git history |
| POST | /process | Trigger AI processing pipeline |
.mdr Export
Export any document as a .mdr file — the P3AK portable document format.
    # Export single file as .mdr
    GET /api/companies/acme/export?format=mdr&file=01-Legal/operating-agreement.md

    # Export all files as .mdr ZIP
    GET /api/companies/acme/export?format=mdr
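The two export URLs differ only in the optional `file` parameter, so a small helper can produce both. In this sketch `export_url` is a hypothetical name and the base URL `http://localhost:3000` is an assumption (a typical local Next.js address). Substitute the address of your own room deployment.

```python
from urllib.parse import urlencode


def export_url(slug: str, file: str = None,
               base_url: str = "http://localhost:3000") -> str:
    """Build the .mdr export URL for a company, optionally for one file."""
    params = {"format": "mdr"}
    if file is not None:
        params["file"] = file
    # Keep "/" unescaped so the path reads as in the examples above.
    query = urlencode(params, safe="/")
    return f"{base_url}/api/companies/{slug}/export?{query}"


all_files = export_url("acme")
single_file = export_url("acme", file="01-Legal/operating-agreement.md")
```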
Vault Bridge
Push all data room documents directly into a P3AK vault with one API call.
    POST /api/companies/acme/vault-push
    {
      "vaultPath": "/Users/you/vaults/acme.vault",
      "passphrase": "your-vault-key",
      "room": "legal",   // optional: scopes to one tributary
      "dryRun": false
    }

    // Response
    {
      "pushed": 14,
      "skipped": 2,
      "errors": 0,
      "documents": [{ "file": "operating-agreement.md", "action": "added" }, ...]
    }
vault-push serializes each text file as a .mdr document, writes them to a temp directory, then calls p3ak-vault ingest for each one. The vault binary is resolved via P3AK_VAULT_BIN env or common install paths. Temp files are cleaned up after each push.
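The per-file ingest calls described above can be sketched as follows. This is Python for illustration only, since room itself is a Next.js application. The helper names, the temp-file path, and the fallback to a bare `p3ak-vault` on PATH (standing in for "common install paths") are assumptions.

```python
import os


def resolve_vault_bin(env: dict = None) -> str:
    """Prefer P3AK_VAULT_BIN; otherwise fall back to p3ak-vault on PATH (assumed)."""
    env = os.environ if env is None else env
    return env.get("P3AK_VAULT_BIN", "p3ak-vault")


def ingest_commands(vault_path: str, mdr_files: list, room: str = None,
                    env: dict = None) -> list:
    """Build one `p3ak-vault ingest` argv per serialized .mdr file."""
    binary = resolve_vault_bin(env)
    commands = []
    for mdr_file in mdr_files:
        argv = [binary, "ingest", "--path", vault_path, "--file", mdr_file]
        if room:
            argv += ["--room", room]
        commands.append(argv)
    return commands


commands = ingest_commands(
    "/Users/you/vaults/acme.vault",
    ["/tmp/push/operating-agreement.mdr"],  # hypothetical temp file
    room="legal",
    env={"P3AK_VAULT_BIN": "/usr/local/bin/p3ak-vault"},
)
```

Each argv list could then be passed to `subprocess.run`, with the passphrase supplied through P3AK_VAULT_PASSPHRASE rather than on the command line.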
harness — Overview
The reasoning layer. Your AI co-pilot with permanent memory — using the CREST protocol for systematic reasoning, vault as long-term memory, and room as the document source. Works with any model. The operating system for how your organization thinks, plans, and executes.
Pi is a local CLI agent. P3AK harness runs inside Pi, which means every session has access to the vault, the CREST skills, and your full tool stack. Think of Pi as the brain and P3AK as the nervous system.
CREST Protocol
CREST is a five-phase systematic reasoning cycle for turning intentions into executed strategy.
| Phase | Skill | Output |
|---|---|---|
| Clarify | /skill:crest-clarify | SMART goal + identity anchor + vault write |
| Risks | /skill:crest-risks | Pre-mortem + WOOP analysis + ranked obstacles |
| Establish | /skill:crest-establish | 9×9 open-window grid + daily habit design |
| Sprints | /skill:crest-sprints | Quarters → sprints → daily wins roadmap |
| Tune | /skill:crest-tune | Review triggers + vault promotion criteria |
    # Open Pi in p3ak-harness directory, then:
    /skill:crest-clarify
    # → Pi asks for your intention, creates SMART goal, writes to vault

    /skill:crest-risks
    # → Pi reads the goal from vault, maps obstacles, writes analysis

    /skill:crest-sprints
    # → Pi creates quarterly/sprint roadmap based on goal + obstacles
Domain Agents
The CAIO operates across six domains, each with its own vault.
| Domain | Vault | Scope |
|---|---|---|
| org-brain | vault/org-brain.vault | Company-wide strategy, goals, decisions |
| finance | vault/finance.vault | Financial models, reports, projections |
| legal | vault/legal.vault | Contracts, agreements, compliance |
| marketing | vault/marketing.vault | Campaigns, positioning, content |
| operations | vault/operations.vault | SOPs, hiring, team processes |
| tech | vault/tech.vault | Architecture, roadmap, engineering decisions |
Pi Skills
Skills are registered in .pi/skills/ and auto-discovered by Pi. Each skill is a directory with a SKILL.md file.
    crest-clarify/SKILL.md
    crest-risks/SKILL.md
    crest-establish/SKILL.md
    crest-sprints/SKILL.md
    crest-tune/SKILL.md
Session Start Protocol
Every Pi session opened in p3ak-harness runs this four-step protocol automatically.
| Step | Action | Command |
|---|---|---|
| 1 | Canary-check the org-brain vault | p3ak-vault canary-check |
| 2 | Read current goals | p3ak-vault read --type goals |
| 3 | Read the state bus | cat state/state_bus.json |
| 4 | Report status to user | CAIO brief |
.mdr Format
The P3AK Document format. A portable, human-readable file containing your document's content, version history, and access-tier layers. Created by room, ingested by vault, queried by harness. Readable in any text editor.
Structure
    +++mdr
    format_version: 1
    doc_id: "acme-series-a-term-sheet"
    title: "Series A Term Sheet"
    created: "2025-11-01T00:00:00Z"
    created_by: "p3ak-room"
    current_layer: internal
    current_version: 3
    tributaries: ["legal"]
    tags: ["term-sheet", "series-a", "legal"]
    layers:
      - id: public
      - id: internal
        restricted: true
      - id: legal
        privileged: true
    +++

    @@@ layer:internal version:3 author:alice ts:2025-11-01T00:00:00Z @@@
    # Series A Term Sheet

    Pre-money valuation: $8M. Investment: $2M. Auto-conversion at Series B...

    @@@ layer:internal version:2 author:alice ts:2025-10-15T00:00:00Z @@@
    # Series A Term Sheet (Draft 2)
    ...
Header Fields
| Field | Type | Description |
|---|---|---|
| format_version | integer | Always 1 for v1 spec |
| doc_id | string | Stable URL-safe identifier (slug-company-filename) |
| title | string | Human-readable document title |
| current_layer | string | Which layer to serve by default |
| current_version | integer | Version number of the current layer content |
| tributaries | array | Which data room tributaries this doc belongs to |
| tags | array | Free-form classification tags |
| layers | array | Layer definitions (id, restricted, privileged) |
The complete .mdr format specification is at spec/mdr-format-v1.md in the p3ak parent repo.
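Because the format is plain text, a rough reader fits in a few lines. The sketch below is not the reference parser. It assumes the header sits between a `+++mdr` line and a closing `+++` line, that each version block opens with an `@@@ key:value ... @@@` marker line, and it reads only flat top-level header fields as raw strings (nested entries such as `layers` are skipped). The `parse_mdr` name is hypothetical.

```python
import re

MARKER = re.compile(r"^@@@\s+(.*?)\s+@@@$")


def parse_mdr(text: str):
    """Split .mdr text into (header dict, list of version blocks)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "+++mdr":
        raise ValueError("missing +++mdr header opener")
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "+++"), None)
    if end is None:
        raise ValueError("missing closing +++")

    header = {}
    for line in lines[1:end]:
        # Flat "key: value" pairs only; indented or list lines are skipped.
        if line and not line.startswith((" ", "-")) and ":" in line:
            key, _, value = line.partition(":")
            if value.strip():
                header[key.strip()] = value.strip().strip('"')

    blocks = []
    for line in lines[end + 1:]:
        match = MARKER.match(line.strip())
        if match:
            meta = dict(pair.split(":", 1) for pair in match.group(1).split())
            blocks.append({"meta": meta, "body": []})
        elif blocks:
            blocks[-1]["body"].append(line)
    for block in blocks:
        block["body"] = "\n".join(block["body"]).strip()
    return header, blocks


sample = """+++mdr
format_version: 1
doc_id: "acme-series-a-term-sheet"
current_version: 3
+++
@@@ layer:internal version:3 author:alice ts:2025-11-01T00:00:00Z @@@
# Series A Term Sheet
@@@ layer:internal version:2 author:alice ts:2025-10-15T00:00:00Z @@@
# Series A Term Sheet (Draft 2)
"""
header, blocks = parse_mdr(sample)
```

For anything beyond inspection, follow the full specification rather than this sketch.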
Security
Vault Encryption
| Component | Algorithm | Details |
|---|---|---|
| Cipher | AES-256-GCM | Authenticated encryption, 256-bit key |
| KDF | Argon2id | Memory-hard, tuned for brute-force resistance |
| Nonce | 96-bit random | Unique per write operation |
| MAC | GCM tag | Tamper detection on every read |
| Key zeroing | zeroize crate | Keys wiped from memory after use |
Audit Log (WAL)
Every read and write is recorded in a hash-linked Write-Ahead Log. Each entry contains a SHA-256 hash of the previous entry, making the log tamper-evident. Use p3ak-vault read --type wal to inspect.
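The tamper-evidence property follows from the hash linking, which a toy chain makes concrete. This sketch is not the vault's WAL format. The entry fields, the JSON serialization, and the all-zero genesis hash are assumptions chosen for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed previous-hash placeholder for the first entry


def entry_hash(entry: dict) -> str:
    """SHA-256 over a canonical JSON serialization of the entry."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(log: list, op: str, target: str) -> None:
    """Append an entry that records the hash of the entry before it."""
    prev = entry_hash(log[-1]) if log else GENESIS
    log.append({"op": op, "target": target, "prev_hash": prev})


def verify(log: list) -> bool:
    """Recompute every link; an edited earlier entry breaks the next link."""
    expected = GENESIS
    for entry in log:
        if entry["prev_hash"] != expected:
            return False
        expected = entry_hash(entry)
    return True


log = []
append_entry(log, "write", "term-sheet.pdf")
append_entry(log, "read", "term-sheet.pdf")
append_entry(log, "write", "nda.pdf")
intact = verify(log)

log[0]["target"] = "tampered.pdf"  # modify history after the fact
tampered_ok = verify(log)
```

Here `intact` is True and `tampered_ok` is False. In a scheme like this, changing the final entry is only detectable if the latest hash is also recorded somewhere outside the log.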
Privacy Model
- No telemetry. No phone-home. Zero network requests from the vault binary.
- REST API binds to 127.0.0.1 by default and is never exposed externally without explicit config.
- The .vault file is a single portable binary you own entirely: move it, back it up, delete it. No SaaS, no cloud, no lock-in.
- Per-room isolation: documents tagged with --room are searchable independently. Cross-room queries require omitting --room.
Always use --passphrase or P3AK_VAULT_PASSPHRASE in production. An unencrypted vault stores all content in plaintext MessagePack.