A desktop app for audio recording, transcription, and annotation. Record voice, capture screenshots and clipboard content simultaneously, get word-level transcripts with timestamps, and annotate them with rich metadata — all processed locally, no cloud APIs required.
Built with Tauri 2 (Rust backend) + Svelte 5 (TypeScript frontend).
session.json, transcript.json, annotations.json, plus all asset files

| Layer | Technology |
|---|---|
| Desktop framework | Tauri 2 |
| Frontend | Svelte 5 + SvelteKit 2 |
| Language | TypeScript (frontend), Rust (backend) |
| Transcription | whisper-rs with Metal acceleration |
| Audio I/O | cpal + hound |
| Audio resampling | rubato |
| Database | SQLite via sqlx + tauri-plugin-sql |
| File watching | notify |
| Build tool | Vite 8 |
libwebkit2gtk, libgtk-3, libappindicator3, etc.

# Install frontend dependencies
npm install
# Start dev mode (launches Vite dev server + Tauri window)
npm run tauri dev
The first build will compile all Rust dependencies, which takes a few minutes. Subsequent builds are incremental.
# Build frontend
npm run build
# Bundle desktop app (outputs to src-tauri/target/release/bundle/)
npm run tauri build
Output formats by platform:
- macOS: .dmg and .app
- Windows: .msi and .exe
- Linux: .AppImage and .deb

ez/
├── src/ # Svelte frontend
│ ├── lib/
│ │ ├── components/ # UI components (AudioPlayer, TranscriptView, etc.)
│ │ ├── db.ts # SQLite schema and initialization
│ │ ├── audio.ts # Recording logic + device enumeration
│ │ ├── transcript.ts # Transcript data access
│ │ ├── annotations.ts # Annotation CRUD
│ │ ├── sessions.ts # Session management
│ │ ├── captures.ts # Clipboard + watcher capture handling
│ │ └── export.ts # Session export
│ └── routes/
│ ├── +page.svelte # Home (session list + recording controls)
│ └── session/[id]/ # Session detail view (transcript + annotations)
├── src-tauri/
│ ├── src/
│ │ ├── lib.rs # Tauri app setup, command registration, URI schemes
│ │ ├── audio.rs # Audio recording, device management
│ │ ├── transcribe.rs # Whisper inference
│ │ ├── model.rs # Model download and validation
│ │ ├── sessions.rs # Session backend commands
│ │ ├── annotations.rs # Annotation backend commands
│ │ ├── watcher.rs # File system watcher
│ │ ├── resample.rs # Audio resampling
│ │ └── export.rs # Export command
│ ├── capabilities/
│ │ └── default.json # Tauri permission grants
│ ├── Cargo.toml
│ └── tauri.conf.json
├── package.json
└── svelte.config.js
All app data is stored locally in the platform-specific app data directory:
| Platform | Path |
|---|---|
| macOS | ~/Library/Application Support/com.ez.app/ |
| Windows | %APPDATA%\ez\ |
| Linux | ~/.config/ez/ |
Contents:
- ez.db: SQLite database (sessions, transcripts, words, annotations)
- recordings/: WAV audio files
- captures/: Clipboard/screenshot captures
- models/: Downloaded Whisper and VAD model weights

| Table | Purpose |
|---|---|
| sessions | Recording sessions with metadata (duration, device, sample rate) |
| transcripts | One per session; stores aggregated word count and detected language |
| words | Individual words with start_ms / end_ms timestamps |
| annotations | Rich annotations tied to word index ranges; supports soft delete |
| pending_captures | Staging area for clipboard/watcher assets before user annotation |
| settings | Key-value store (watch folder path, last open session) |
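
To illustrate how the words and annotations tables relate, here is a minimal TypeScript sketch. It assumes words are sorted by start_ms and do not overlap. Apart from start_ms and end_ms, every field and function name (text, start_word, end_word, note, deleted_at, wordIndexAt, annotatedText, activeAnnotations) is hypothetical and may not match the actual schema or code.

```typescript
// Hypothetical row shapes; only start_ms / end_ms come from the schema table.
interface Word {
  text: string;
  start_ms: number;
  end_ms: number;
}

interface Annotation {
  start_word: number; // index of the first covered word (inclusive)
  end_word: number; // index of the last covered word (inclusive)
  note: string;
  deleted_at: number | null; // soft delete: non-null means hidden, not removed
}

// Binary search for the word spoken at a playback position.
// Returns -1 when the position falls in a gap between words.
function wordIndexAt(words: Word[], positionMs: number): number {
  let lo = 0;
  let hi = words.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const w = words[mid];
    if (positionMs < w.start_ms) hi = mid - 1;
    else if (positionMs >= w.end_ms) lo = mid + 1;
    else return mid;
  }
  return -1;
}

// Join the text of the words covered by an annotation's index range.
function annotatedText(words: Word[], a: Annotation): string {
  return words
    .slice(a.start_word, a.end_word + 1)
    .map((w) => w.text)
    .join(" ");
}

// Soft-deleted annotations stay in storage but are filtered out of views.
function activeAnnotations(all: Annotation[]): Annotation[] {
  return all.filter((a) => a.deleted_at === null);
}

const words: Word[] = [
  { text: "record", start_ms: 0, end_ms: 400 },
  { text: "your", start_ms: 450, end_ms: 600 },
  { text: "voice", start_ms: 600, end_ms: 1000 },
];
```

Storing annotations as word index ranges rather than raw timestamps keeps them attached to the transcript text, while the per-word timestamps still allow seeking the audio to the start of any annotated range.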
URI schemes (audio://, asset://) with path canonicalization to prevent directory traversal

On first launch, macOS will request:
These are granted via standard macOS permission dialogs and can be revoked in System Settings > Privacy & Security.
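
The directory traversal protection mentioned for the audio:// and asset:// schemes can be sketched as a containment check: normalize the requested path, then confirm it still lies under the allowed base directory. The TypeScript below is an illustration only, not the app's implementation. The names normalizeSegments and resolveInside and the example base directory are hypothetical. It normalizes paths lexically and returns POSIX-style paths; a Rust backend would more likely rely on std::fs::canonicalize, which also resolves symlinks.

```typescript
// Collapse "." and ".." segments lexically, without touching the file system.
function normalizeSegments(path: string): string[] {
  const out: string[] = [];
  for (const seg of path.split(/[\\/]+/)) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") out.pop(); // at the root this is a no-op
    else out.push(seg);
  }
  return out;
}

// Resolve a request path relative to baseDir.
// Returns the normalized path, or null if it would escape baseDir.
function resolveInside(baseDir: string, requested: string): string | null {
  let decoded: string;
  try {
    // Decode first so "%2e%2e" cannot smuggle ".." past the check.
    decoded = decodeURIComponent(requested);
  } catch {
    return null; // malformed percent-encoding
  }
  const base = normalizeSegments(baseDir);
  const full = normalizeSegments(baseDir + "/" + decoded);
  const inside =
    full.length > base.length && base.every((seg, i) => full[i] === seg);
  return inside ? "/" + full.join("/") : null;
}
```

For example, with a hypothetical base of /data/recordings, a request for a.wav resolves to /data/recordings/a.wav, while ../../etc/passwd and its percent-encoded form %2e%2e/%2e%2e/etc/passwd are both rejected.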