A self-hosted, containerized PDF reader with an offline-capable AI assistant. All data lives on your machine in plain files you can read, edit, and download at any time.
Everything lives under `data/`; zip it and move it at any time.

```
Browser
  │
  └──▶ nginx :80
        ├──▶ /api/* ──▶ backend (Rust/Axum) :3000
        │                ├── Session middleware (HMAC-signed cookies)
        │                ├── File storage (data/)
        │                ├── PDF text extraction (pdftotext)
        │                └── AI proxy (→ user-configured endpoint)
        │
        └──▶ /* ──▶ frontend (SvelteKit/Node) :3000
                     ├── PDF rendering (pdfjs-dist, worker bundled)
                     ├── Svelte stores (auth, pdf state)
                     └── Tailwind CSS (no CDN, fully local)
```
The three services run as Docker containers. In development, Vite proxies /api to the Rust server so you only need one origin.
```
NimbusPDF/
├── docker-compose.yml            # Orchestrates nginx + backend + frontend
├── nginx.conf                    # Reverse proxy: /api → backend, / → frontend
├── .env.example                  # Copy to .env and fill in secrets
├── .gitignore
├── data/                         # Runtime data (gitignored, created on first run)
│
├── backend/                      # Rust/Axum API server
│   ├── Cargo.toml
│   ├── Cargo.lock
│   ├── Dockerfile
│   ├── config/                   # Mounted read-only into the container
│   │   ├── default.toml          # Server, session, AI, storage settings
│   │   ├── ai_system_prompt.md           # Default chat system prompt
│   │   ├── ai_system_prompt.summary.md   # Summary quick-action prompt
│   │   └── ai_system_prompt.keypoints.md # Key points quick-action prompt
│   └── src/
│       ├── main.rs               # Server bootstrap, AppState, middleware wiring
│       ├── config.rs             # Typed config (TOML + NIMBUS__ env vars)
│       ├── session.rs            # HMAC-signed cookies, disk sessions, Axum middleware
│       ├── pdf_text.rs           # pdftotext subprocess wrapper (text extraction)
│       ├── gdrive.rs             # Google Drive OAuth2 client
│       ├── ai/mod.rs             # AiProxy (OpenAI-compat), load_prompt()
│       ├── auth/mod.rs           # (re-exports; types live in session.rs)
│       ├── storage/
│       │   ├── mod.rs            # Principal enum, module re-exports
│       │   └── local.rs          # LocalStorage: all file I/O
│       └── routes/
│           ├── mod.rs            # Router assembly (no state consumed here)
│           ├── pdfs.rs           # Upload, list, serve (range requests), delete
│           ├── ai.rs             # Chat, summary, keypoints, history, AI config
│           ├── highlights.rs     # GET/PUT highlights.json per document
│           ├── notes.rs          # GET/PUT notes.json (per-page, full or single)
│           ├── memory.rs         # Long-term memory markdown + AI append
│           ├── categories.rs     # Knowledge graph (nodes + edges) CRUD
│           └── auth.rs           # OIDC login/callback/logout, Google Drive OAuth, /me
│
└── frontend/                     # SvelteKit app
    ├── package.json
    ├── svelte.config.js          # adapter-node + path aliases ($api, $components, $stores)
    ├── vite.config.js            # viteStaticCopy (PDF.js worker), /api proxy for dev
    ├── postcss.config.js
    ├── tailwind.config.js
    ├── Dockerfile
    └── src/
        ├── app.html              # HTML shell
        ├── app.css               # Tailwind directives + CSS variables
        ├── lib/
        │   ├── api/
        │   │   └── client.js     # Thin fetch wrapper (get/post/put/delete/upload)
        │   ├── stores/
        │   │   ├── auth.js       # Auth state (user, loading); calls GET /api/auth/me
        │   │   └── pdf.js        # Viewer state (zoom, page, highlights, notes)
        │   └── components/
        │       ├── PDFViewer.svelte       # pdfjs-dist renderer, highlight overlay, color picker
        │       ├── AISidebar.svelte       # Chat UI, history load, Summary/Key Points buttons
        │       ├── Notes.svelte           # Per-page textarea with debounced autosave
        │       ├── MemoryEditor.svelte    # Split markdown editor + marked preview
        │       └── CategoryManager.svelte # Document→category assignment, graph CRUD
        └── routes/
            ├── +layout.svelte             # Auth init on mount
            ├── +page.svelte               # Document library + upload
            ├── viewer/[docId]/
            │   └── +page.svelte           # Viewer + AI sidebar + Notes drawer
            ├── settings/
            │   └── +page.svelte           # AI endpoint config + Google Drive connect
            ├── memory/
            │   └── +page.svelte           # Long-term memory editor (auth required)
            └── categories/
                └── +page.svelte           # Category management (auth required)
```
Prerequisites: Docker and Docker Compose.
# 1. Clone and enter the repo
git clone <repo-url> NimbusPDF && cd NimbusPDF
# 2. Create your environment file
cp .env.example .env
# Edit .env: at minimum, set SESSION_SECRET to a random string
# 3. Build and start
docker compose up --build
# 4. Open http://localhost
The app is fully functional without filling in any OIDC or AI credentials. You can add a PDF and configure your AI endpoint from the Settings page.
cd backend
# First run: downloads and compiles all crates (~2 min)
cargo build
# Run the dev server (reads config from ./config/, data from ./data/)
cargo run
# Run all tests
cargo test
# Run a single test by name
cargo test <test_name>
# Type-check without linking (fast)
cargo check
The backend listens on http://localhost:3000 by default.
Required system tool: poppler-utils must be installed for PDF text extraction.
# macOS
brew install poppler
# Debian/Ubuntu
sudo apt-get install poppler-utils
cd frontend
npm install
# Dev server with hot reload: proxies /api to localhost:3000
npm run dev # http://localhost:5173
# Type-check
npm run check
# Lint
npm run lint
# Production build (output to build/)
npm run build
Open two terminals:
# Terminal 1
cd backend && cargo run
# Terminal 2
cd frontend && npm run dev
Navigate to http://localhost:5173. All /api calls are proxied to the Rust server.
`backend/config/default.toml`:

| Key | Default | Description |
|---|---|---|
| `server.host` | `0.0.0.0` | Bind address |
| `server.port` | `3000` | Listen port |
| `server.data_dir` | `./data` | Root for all user data |
| `server.config_dir` | `./config` | Directory containing prompt files and this TOML |
| `server.max_upload_bytes` | `104857600` | Maximum PDF upload size in bytes (100 MB); must match nginx.conf `client_max_body_size` |
| `session.cookie_name` | `nimbus_session` | Cookie name |
| `session.anonymous_ttl` | `86400` | Anonymous session lifetime in seconds (24 h) |
| `ai.system_prompt_file` | `ai_system_prompt.md` | Chat prompt template filename |
| `ai.summary_prompt_file` | `ai_system_prompt.summary.md` | Summary prompt filename |
| `ai.keypoints_prompt_file` | `ai_system_prompt.keypoints.md` | Key points prompt filename |
| `ai.max_context_tokens` | `4096` | Max words of PDF text sent to the AI |
| `auth.require_auth` | `false` | Set `true` to block unauthenticated access |
Any key can be overridden with an environment variable using the NIMBUS__ prefix with __ as separator:
NIMBUS__SERVER__PORT=8080
NIMBUS__AUTH__REQUIRE_AUTH=true
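The key-to-variable mapping (uppercase, dots become double underscores, `NIMBUS__` prefix) can be sketched in shell; the helper below is illustrative, not part of the codebase:

```shell
#!/bin/sh
# Derive the NIMBUS__ env var name for a dotted TOML key:
# uppercase everything, then turn each dot into a double underscore.
toml_key="server.max_upload_bytes"
env_var="NIMBUS__$(echo "$toml_key" | tr '[:lower:]' '[:upper:]' | sed 's/\./__/g')"
echo "$env_var"   # NIMBUS__SERVER__MAX_UPLOAD_BYTES
```

Note that underscores already inside a key name (`max_upload_bytes`) stay single; only the dots that separate TOML tables become doubled.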
`.env` file:

| Variable | Required | Description |
|---|---|---|
| `SESSION_SECRET` | Yes | Random string used to sign session cookies (min 32 chars) |
| `OIDC_ISSUER_URL` | No | OIDC provider base URL; auth is disabled when unset |
| `OIDC_CLIENT_ID` | If OIDC | Client ID from your OIDC provider |
| `OIDC_CLIENT_SECRET` | If OIDC | Client secret |
| `OIDC_REDIRECT_URI` | If OIDC | Must match a URI registered with your provider |
| `GOOGLE_CLIENT_ID` | No | For Google Drive sync |
| `GOOGLE_CLIENT_SECRET` | No | For Google Drive sync |
| `GDRIVE_REDIRECT_URI` | If Drive | Must be registered in Google Cloud Console |
The three prompt files in `backend/config/` are loaded at request time, so you can edit them without restarting the server. Each uses a single placeholder:

`{document_context}`: replaced with extracted PDF text at request time
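The substitution itself is a plain string replacement; a shell sketch (the real templating lives in `load_prompt()` in `backend/src/ai/mod.rs`, and its exact mechanics are an assumption here):

```shell
#!/bin/sh
# Illustrative only: replace the documented {document_context}
# placeholder in a prompt template with extracted PDF text.
template='Summarize the following document:
{document_context}'
doc_text='Extracted PDF text goes here.'
prompt=$(printf '%s\n' "$template" | sed "s/{document_context}/$doc_text/")
echo "$prompt"
```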
The Docker images contain every dependency. Once built, they need no internet access.
docker compose build
docker save \
nimbuspdf-backend \
nimbuspdf-frontend \
nginx:alpine \
| gzip > nimbuspdf-images.tar.gz
# Copy nimbuspdf-images.tar.gz and the project directory to the target machine
# Load images
docker load < nimbuspdf-images.tar.gz
# Run (no build needed)
docker compose up
| Concern | How it's handled |
|---|---|
| PDF.js worker | Copied into the SvelteKit build output by vite-plugin-static-copy at build time |
| Rust crates | All compiled into a static binary at build time |
| Node modules | npm ci --omit=dev run inside the builder stage; build/ is copied into runtime image |
| SSL certificates | ca-certificates installed in the backend runtime image (needed for outbound AI/OIDC calls) |
| PDF text extraction | poppler-utils installed in the backend runtime image |
| Fonts / CSS | Tailwind generates all CSS at build time; no Google Fonts or CDN links anywhere |
| AI endpoint | Configured by the user; points to their own Ollama/LM Studio/other instance |
Note: The AI endpoint you configure in Settings must be reachable from the backend container. For a local Ollama instance, use `http://host.docker.internal:11434/v1/chat/completions` on Docker Desktop (Mac/Windows) or the host's LAN IP on Linux.
Authentication is optional. Without it, the app creates an anonymous session (cookie-backed, TTL configurable) and all features except long-term memory and categories are available.
To enable OIDC:

1. Register the redirect URI `http://<your-host>/api/auth/callback` with your provider
2. Add the `OIDC_*` vars to `.env`

The login flow uses PKCE + a CSRF state parameter, so no client secret is technically required for public clients; set `OIDC_CLIENT_SECRET=` (empty) in that case.
Sessions are stored as JSON files in data/sessions/<id>.json. They are HMAC-SHA256 signed via SESSION_SECRET. Anonymous sessions expire after session.anonymous_ttl seconds; authenticated sessions have no TTL until logout.
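The signing primitive can be demonstrated with openssl. This is only a sketch of HMAC-SHA256 over a session id; the actual cookie and signature layout are defined in `backend/src/session.rs` and are an assumption here:

```shell
#!/bin/sh
# Hypothetical illustration of HMAC-SHA256 cookie signing.
SESSION_SECRET="change-me-to-a-long-random-string-32ch"
session_id="3f2a9c1e"
# -r prints "<hex-digest> *stdin"; keep only the digest.
sig=$(printf '%s' "$session_id" | openssl dgst -sha256 -hmac "$SESSION_SECRET" -r | cut -d' ' -f1)
echo "${session_id}.${sig}"
```

Because the signature depends on `SESSION_SECRET`, changing the secret invalidates every existing cookie, which is why it must stay consistent across restarts.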
From the Settings page (or PUT /api/ai/config), configure:
| Field | Example |
|---|---|
| Endpoint URL | http://localhost:11434/v1/chat/completions |
| Model | llama3, mistral, gpt-4o, etc. |
| API Key | Leave blank for local models; required for OpenAI/Anthropic |
Any OpenAI-compatible endpoint works. Tested with:

- Ollama: `http://host.docker.internal:11434/v1/chat/completions`
- LM Studio: `http://host.docker.internal:1234/v1/chat/completions`
- OpenAI: `https://api.openai.com/v1/chat/completions`

The API key is stored in `data/[users|anonymous/sessions]/<id>/settings/ai_config.toml`. It is not encrypted at rest in the current version; use filesystem-level encryption for sensitive deployments.
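The stored file is plain TOML. A sketch of its contents, assuming only the field names documented in the data layout below (`endpoint_url`, `model`, `api_key`); the values are examples:

```toml
# data/users/<subject>/settings/ai_config.toml (illustrative values)
endpoint_url = "http://host.docker.internal:11434/v1/chat/completions"
model = "llama3"
api_key = ""   # blank for local models
```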
All data lives under data/ and is human-readable:
```
data/
├── sessions/                         # Server-side session files (all users)
│   └── <uuid>.json
│
├── anonymous/
│   └── sessions/
│       └── <session-id>/
│           └── pdfs/
│               └── <doc-id>/
│                   ├── original.pdf
│                   ├── highlights.json   # Array of highlight objects
│                   ├── notes.json        # { "1": { content, updated_at }, ... }
│                   ├── chat_history.json # Array of { role, content, timestamp }
│                   └── metadata.json     # id, filename, page_count, uploaded_at
│
└── users/
    └── <oidc-subject>/
        ├── pdfs/
        │   └── <doc-id>/             # Same structure as anonymous above
        ├── memory/
        │   └── long_term_memory.md   # User-editable markdown
        ├── categories/
        │   └── graph.json            # { nodes: [...], edges: [...] }
        └── settings/
            ├── ai_config.toml        # endpoint_url, model, api_key
            └── gdrive_token.json     # OAuth2 token (when Drive is connected)
```
To back up or migrate a user: copy their folder under data/users/<subject>/.
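Since it's all plain files, a backup is just a `tar` of the user's folder. The sketch below builds a scratch copy of the layout so the commands are runnable as-is; in a real deployment you would point at the live `data/` directory:

```shell
#!/bin/sh
set -e
# Back up one user's folder (layout from the tree above).
work=$(mktemp -d)
subject="example-subject"   # stand-in for a real OIDC subject
mkdir -p "$work/data/users/$subject/settings"
echo 'model = "llama3"' > "$work/data/users/$subject/settings/ai_config.toml"
# -C keeps archive paths relative to data/users, so restoring is a
# plain extract into data/users/ on the target machine.
tar -czf "$work/${subject}-backup.tar.gz" -C "$work/data/users" "$subject"
tar -tzf "$work/${subject}-backup.tar.gz"
```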
# Check logs
docker compose logs backend
# Common causes:
# - SESSION_SECRET not set in .env
# - config/default.toml missing (must be present in backend/config/)
# - Port 3000 already in use on host
The upload limit is set in two places and both must agree:
# 1. backend/config/default.toml
server.max_upload_bytes = 104857600 # bytes
# 2. nginx.conf
client_max_body_size 100m;
Increase both values, then rebuild: docker compose up --build.
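The two settings express the same limit in different units (bytes vs. nginx's `m` suffix), which is easy to get wrong when editing one side. A quick sanity check:

```shell
#!/bin/sh
# Verify the byte limit in default.toml equals nginx's megabyte limit.
toml_bytes=104857600     # server.max_upload_bytes
nginx_mb=100             # client_max_body_size 100m
[ "$toml_bytes" -eq $((nginx_mb * 1024 * 1024)) ] && echo "limits agree"
```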
# Verify poppler-utils is installed in the container
docker compose exec backend pdftotext -v
# Check the data directory is writable
docker compose exec backend ls -la /app/data
# Look at the backend log for the actual error
docker compose logs -f backend
# Test pdftotext directly
docker compose exec backend pdftotext /app/data/users/<id>/pdfs/<doc>/original.pdf -
# If it outputs nothing: the PDF is image-based (scanned); pdftotext
# only works on text-layer PDFs. OCR support is a planned future feature.
The user-specific ai_config.toml is missing. Go to Settings and fill in the endpoint URL and model name.
# From inside the backend container, test the endpoint directly
docker compose exec backend \
curl -s -X POST http://host.docker.internal:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"llama3","messages":[{"role":"user","content":"hi"}],"stream":false}'
# Common causes:
# - Wrong endpoint URL (check Settings)
# - Ollama/LM Studio not running or not accessible from the container
# - On Linux Docker: use host LAN IP instead of host.docker.internal
# Check the discovery endpoint is reachable from the backend container
docker compose exec backend \
curl -s "$OIDC_ISSUER_URL/.well-known/openid-configuration" | head -20
# Verify OIDC_REDIRECT_URI matches exactly what is registered with the provider
# (including http vs https and trailing slash)
Check that:

- `SESSION_SECRET` is set and consistent across restarts (not regenerated)
- the browser receives the `nimbus_session` cookie (dev tools → Network)
- the cookie is `HttpOnly; SameSite=Lax`, so it won't appear in `document.cookie`

# Check token file exists and is valid JSON
docker compose exec backend \
cat /app/data/users/<subject>/settings/gdrive_token.json
# Token refresh logs appear at INFO level
docker compose logs backend | grep -i gdrive
cd frontend
# Clear caches
rm -rf node_modules .svelte-kit build
npm install
npm run build
# If PDF.js worker is missing from build output:
ls build/pdf.worker.min.js
# Should exist: produced by vite-plugin-static-copy
# If missing: verify vite.config.js has the viteStaticCopy plugin
# Verify tailwind.config.js content paths cover all Svelte files
cat frontend/tailwind.config.js
# Rebuild
cd frontend && npm run build
cd backend
# Fast type check
# Fast type check
cargo check
# Linter
cargo clippy
# Full build
cargo build --release
| Service | Internal port | External (docker compose) |
|---|---|---|
| nginx | 80 | 80 (access point) |
| backend | 3000 | not exposed directly |
| frontend | 3000 | not exposed directly |
In development (no Docker):
| Service | Port |
|---|---|
| Rust backend | 3000 |
| Vite dev server | 5173 |
This project is licensed under the Polyform Noncommercial License 1.0.0. Contributions are welcome, but commercial use or resale of this software is strictly prohibited.