Martin

Privacy-first meeting transcriber for Linux — captures audio, transcribes offline with Whisper, stores in SQLite. Built with Tauri 2, Svelte 5, and Rust.

martin

Privacy-first meeting transcriber and dictation tool for Linux. Records microphone and system audio, transcribes locally with Whisper, stores results in SQLite. No cloud, no internet — your audio never leaves your machine.

Two modes: Record meetings with dual audio capture (mic + system), or Dictate with real-time speech-to-text that shows words as you speak.

Screenshots

Recorder	Recording in progress

Pending recordings	Finalizing transcription

History	Summary

Features

Recording mode

Dual audio capture — records microphone + system audio (browser, Zoom, Meet) via PipeWire
Pending recordings — recordings are tracked in the database, survive app restarts, and can be transcribed or deleted from a list
Non-blocking stop — stopping long recordings runs mixing in the background, UI stays responsive

Dictation mode

Real-time transcription: speak and see text appear live as you talk
Mic-only capture — uses microphone directly, no system audio needed
Three-state indicator — Listening / Processing / Paused, so you know what the app is doing
VU meter — visual confirmation that the microphone is picking up your voice
Provisional vs stable text — recently transcribed text shown in gray italic (may still be refined), confirmed text in normal style
Automatic paragraphs — long pauses (>2.5s) become paragraph breaks in the output
Voice formatting commands — say "novo parágrafo", "nova linha", "ponto final", "vírgula", "ponto de interrogação", "ponto de exclamação", "abre aspas", "fecha aspas" to insert formatting
Smart capitalization — sentences are capitalized automatically and punctuation spacing is normalized
Silence-aware: skips Whisper passes during silence to save CPU
Continuous auto-save — partial transcript is persisted every 5 seconds; nothing is lost if the app crashes
Audio preserved — raw mic audio is saved as a WAV alongside the transcript so you can reprocess later if needed

General

Auto-download model — Whisper model downloads automatically on first use, with progress bar
Local transcription — Whisper runs on your machine, no internet needed
Bilingual UI — Portuguese and English, follows system locale (also used for transcription language)
Transcription history — browse, view, copy, and delete past transcriptions
AI summary — summarize transcriptions with key points via Claude Code CLI, with copy support
Privacy first — audio files deleted after transcription, data stays in local SQLite

Requirements

Linux with PipeWire (Ubuntu 22.10+, Fedora 36+, etc.)
pw-record and wpctl (included with PipeWire)
Rust 1.70+
Node.js 18+

System dependencies

# Ubuntu/Debian
sudo apt install -y libwebkit2gtk-4.1-dev libgtk-3-dev libasound2-dev libpulse-dev \
  build-essential libssl-dev libayatana-appindicator3-dev librsvg2-dev pipewire

Install

From .deb (recommended)

Download the latest .deb from GitHub Releases and install:

sudo apt install ./martin_0.1.0_amd64.deb

On first use, the app automatically downloads the Whisper model (~466MB). An internet connection is required only for this one-time download.

From source

git clone https://github.com/vagnerzampieri/martin.git
cd martin
npm install
cargo install tauri-cli --version "^2"
cargo tauri build

The binary will be in src-tauri/target/release/martin.

Usage

Recording a meeting

Open martin and select the Record tab
Click Start Recording / Iniciar Gravação
Have your meeting — both your mic and system audio are captured
Click Stop Recording / Parar Gravação — recording appears in the pending list
Click Transcribe / Transcrever on the pending item — wait for local transcription
Done. Text is saved, audio is deleted.

You can record multiple times before transcribing — each recording is tracked separately. Close and reopen the app; your pending recordings are still there.

Dictating text

Select the Dictation / Ditado tab
Click Start Dictation / Iniciar Ditado
Speak — text appears in real time as you talk
Click Stop Dictation / Parar Ditado — the full text is saved to history

Voice commands (Portuguese)

Speak these phrases mid-dictation to insert punctuation or formatting. Matching is case-insensitive and accent-insensitive — novo paragrafo works the same as novo parágrafo, and Whisper dropping a diacritic does not break the command.

Say (PT)	Produces
`novo parágrafo` or `ponto parágrafo`	new paragraph (`\n\n`)
`nova linha`	newline (`\n`)
`ponto final`	`.`
`vírgula`	`,`
`ponto de interrogação`	`?`
`ponto de exclamação`	`!`
`abre aspas`	`"` (opening)
`fecha aspas`	`"` (closing)
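
The matching rule above can be illustrated with a small, dependency-free sketch. This is not the project's `postprocess.rs`; the names `fold`, `Command`, and `apply_voice_commands` are hypothetical, and the sketch assumes precomposed (NFC) input and commands that arrive as whole, whitespace-separated words without attached punctuation.

```rust
/// Lowercase a word and strip the Portuguese diacritics used in the commands.
/// Assumes precomposed characters (NFC); combining marks are not handled.
fn fold(word: &str) -> String {
    word.chars()
        .flat_map(|c| c.to_lowercase())
        .map(|c| match c {
            'á' | 'à' | 'â' | 'ã' => 'a',
            'é' | 'ê' => 'e',
            'í' => 'i',
            'ó' | 'ô' | 'õ' => 'o',
            'ú' | 'ü' => 'u',
            'ç' => 'c',
            other => other,
        })
        .collect()
}

/// One spoken phrase (already folded) and the text it produces.
struct Command {
    words: &'static [&'static str],
    output: &'static str,
    glue_left: bool,  // attach to the previous text without a space
    glue_right: bool, // no space before the following word
}

const COMMANDS: &[Command] = &[
    Command { words: &["ponto", "de", "interrogacao"], output: "?", glue_left: true, glue_right: false },
    Command { words: &["ponto", "de", "exclamacao"], output: "!", glue_left: true, glue_right: false },
    Command { words: &["novo", "paragrafo"], output: "\n\n", glue_left: true, glue_right: true },
    Command { words: &["ponto", "paragrafo"], output: "\n\n", glue_left: true, glue_right: true },
    Command { words: &["nova", "linha"], output: "\n", glue_left: true, glue_right: true },
    Command { words: &["ponto", "final"], output: ".", glue_left: true, glue_right: false },
    Command { words: &["abre", "aspas"], output: "\"", glue_left: false, glue_right: true },
    Command { words: &["fecha", "aspas"], output: "\"", glue_left: true, glue_right: false },
    Command { words: &["virgula"], output: ",", glue_left: true, glue_right: false },
];

/// Replace spoken commands with their output; all other words pass through unchanged.
fn apply_voice_commands(input: &str) -> String {
    let words: Vec<&str> = input.split_whitespace().collect();
    let folded: Vec<String> = words.iter().map(|w| fold(w)).collect();
    let mut out = String::new();
    let mut glue_next = true; // no leading space before the first token
    let mut i = 0;
    while i < words.len() {
        // Compare folded words so case and accents do not matter.
        let hit = COMMANDS.iter().find(|c| {
            i + c.words.len() <= words.len()
                && c.words.iter().zip(&folded[i..]).all(|(a, b)| *a == b.as_str())
        });
        match hit {
            Some(c) => {
                if !c.glue_left && !glue_next {
                    out.push(' ');
                }
                out.push_str(c.output);
                glue_next = c.glue_right;
                i += c.words.len();
            }
            None => {
                if !glue_next {
                    out.push(' ');
                }
                out.push_str(words[i]);
                glue_next = false;
                i += 1;
            }
        }
    }
    out
}
```

For example, `apply_voice_commands("primeira linha nova linha segunda")` yields the two lines `primeira linha` and `segunda`. Folding only the comparison key, never the emitted words, is what keeps accents intact in the ordinary text.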

The dictation also reacts to silence:

A pause of ~2 seconds flips the UI state to Pausa detectada ("pause detected"). It has no other effect: speak again and dictation resumes
A pause of ~2.5 seconds commits the current text and starts a new paragraph automatically, which is useful when you stop to think between thoughts
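
The two silence reactions can be modelled as a small state tracker. The sketch below is an illustration only: `SilenceTracker`, `PauseThresholds`, and `SilenceAction` are hypothetical names, and both thresholds are supplied by the caller rather than taken from the app.

```rust
use std::time::Duration;

/// What the dictation loop should do after analysing one chunk.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SilenceAction {
    KeepListening,
    ShowPaused,
    CommitParagraph,
}

/// Hypothetical configuration: the shorter threshold only changes the UI
/// state, the longer one commits the text and starts a new paragraph.
struct PauseThresholds {
    show_paused: Duration,
    commit: Duration,
}

/// Tracks how long the input has been continuously silent.
struct SilenceTracker {
    silent_for: Duration,
    committed: bool,
}

impl SilenceTracker {
    fn new() -> Self {
        Self { silent_for: Duration::ZERO, committed: false }
    }

    /// Feed one chunk of length `chunk`; `is_silent` comes from a level check.
    fn observe(&mut self, chunk: Duration, is_silent: bool, t: &PauseThresholds) -> SilenceAction {
        if !is_silent {
            // Speech resets everything, so the session simply resumes.
            self.silent_for = Duration::ZERO;
            self.committed = false;
            return SilenceAction::KeepListening;
        }
        self.silent_for += chunk;
        if self.silent_for >= t.commit && !self.committed {
            // Commit once per pause, however long the silence lasts.
            self.committed = true;
            SilenceAction::CommitParagraph
        } else if self.silent_for >= t.show_paused {
            SilenceAction::ShowPaused
        } else {
            SilenceAction::KeepListening
        }
    }
}
```

The `committed` flag is a design choice in this sketch: without it, one very long silence would produce a new paragraph break every time the commit threshold elapsed.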

How audio capture works

Recording mode

Records two audio sources simultaneously:

Microphone — captured via cpal (ALSA backend)
System audio — captured via pw-record targeting the default PipeWire sink

When you stop recording, both WAV files are mixed into a single file. If PipeWire is not available or the system audio is corrupt, martin falls back to microphone-only recording. The mixed file is saved as a pending recording and can be transcribed later.
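
As a rough illustration of the mixing and fallback step, the sketch below sums two sample buffers and falls back to the microphone alone when no usable system audio exists. It is not the code in `mix.rs`: `mix_pcm` and `mix_or_fallback` are hypothetical names, WAV reading and writing are omitted, and both inputs are assumed to be mono 16-bit PCM at the same sample rate.

```rust
/// Mix two mono 16-bit PCM buffers sample by sample.
/// The shorter buffer is treated as silence once it runs out, and the sum is
/// clamped so loud passages saturate instead of wrapping around.
fn mix_pcm(mic: &[i16], system: &[i16]) -> Vec<i16> {
    let len = mic.len().max(system.len());
    (0..len)
        .map(|i| {
            let a = mic.get(i).copied().unwrap_or(0) as i32;
            let b = system.get(i).copied().unwrap_or(0) as i32;
            (a + b).clamp(i16::MIN as i32, i16::MAX as i32) as i16
        })
        .collect()
}

/// Use the mixed signal when system audio is present and non-empty,
/// otherwise return the microphone samples untouched.
fn mix_or_fallback(mic: Vec<i16>, system: Option<Vec<i16>>) -> Vec<i16> {
    match system {
        Some(sys) if !sys.is_empty() => mix_pcm(&mic, &sys),
        _ => mic,
    }
}
```

Summing in `i32` before clamping avoids integer overflow when both sources are near full scale. A real mixer might also attenuate each source to reduce clipping, which this sketch does not attempt.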

Dictation mode

Captures microphone audio via cpal and transcribes in real time:

Audio streams into a shared buffer at the device's native sample rate
A second thread emits the audio level (dictation://level) and session state (dictation://state, one of listening/processing/paused) for UI feedback
Every ~500ms, the transcription loop drains new samples, measures RMS, and skips the Whisper pass if the chunk is silent
When there is enough new audio, the full accumulated buffer is converted to mono 16kHz and sent to Whisper for re-transcription (this is what keeps the output accurate — Whisper self-corrects with more context)
The output passes through a text-normalization pipeline (voice command substitution → punctuation spacing → whitespace collapse → sentence capitalization) before being emitted
A pause longer than ~2.5s commits the current text as a segment and inserts a paragraph break (\n\n) before the next segment
When the buffer exceeds 120 seconds, the current segment is committed (reusing the last transcription) and a new buffer starts
The mic audio is written to a WAV file in real time, so the audio is available for reprocessing after the session ends
The partial transcription row is created on Start Dictation and updated every ~5s during the session; on Stop Dictation the same row is finalized
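
The audio side of the loop above (RMS gate, then conversion to mono 16kHz) can be sketched with plain functions. This is an illustration under assumptions, not the code in `vad.rs` or `whisper.rs`: the function names are hypothetical, samples are assumed to be interleaved `f32` in the range -1.0 to 1.0, the silence threshold is left to the caller, and linear interpolation stands in for whatever resampler the app actually uses.

```rust
/// Root-mean-square level of a chunk; 0.0 for an empty chunk.
fn rms(samples: &[f32]) -> f32 {
    if samples.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = samples.iter().map(|s| s * s).sum();
    (sum_sq / samples.len() as f32).sqrt()
}

/// A chunk counts as silent when its RMS is below `threshold`
/// (the threshold value is an assumption the caller must tune).
fn is_silent(samples: &[f32], threshold: f32) -> bool {
    rms(samples) < threshold
}

/// Average interleaved channels into one mono channel.
fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    assert!(channels > 0, "channel count must be positive");
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Resample mono audio by linear interpolation between neighbouring samples.
/// No low-pass filter is applied, so this is a simplification.
fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    if input.is_empty() || from_hz == to_hz {
        return input.to_vec();
    }
    let out_len = (input.len() as u64 * to_hz as u64 / from_hz as u64) as usize;
    let step = from_hz as f64 / to_hz as f64;
    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let idx = pos as usize;
            let frac = (pos - idx as f64) as f32;
            let a = input[idx];
            let b = input[(idx + 1).min(input.len() - 1)];
            a + (b - a) * frac
        })
        .collect()
}
```

A caller would typically check `is_silent` on the newly drained chunk first and run `downmix_to_mono` followed by `resample_linear(&mono, device_rate, 16_000)` on the accumulated buffer only when the chunk is not silent.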

Whisper Models

Model	Size	Quality	Speed
tiny	75MB	Usable	Fast
base	142MB	Good	Fast
small	466MB	Very good	Moderate
medium	1.5GB	Excellent	Slow

Download with: ./scripts/download-model.sh <model>

Development

cargo tauri dev          # Full app dev mode with hot reload
cargo test               # Run Rust tests
npm run check            # Svelte/TypeScript type checking
cargo fmt                # Format Rust code
cargo clippy             # Lint Rust code

Architecture

src-tauri/src/
├── lib.rs              # Tauri commands, app state
├── audio/
│   ├── capture.rs      # Mic (cpal) + system audio (pw-record)
│   ├── mix.rs          # WAV mixing (mic + system → single file)
│   └── wav_writer.rs   # Thread-safe WAV writer (used by recorder and dictation)
├── db/
│   └── store.rs        # SQLite CRUD for transcriptions + pending recordings
├── dictation.rs        # Real-time mic capture + transcription loop + level/state emitter
├── model.rs            # Auto-download Whisper model with progress events
├── postprocess.rs      # Pure text normalization: voice commands, spacing, capitalization
├── summarize.rs        # Claude CLI integration for AI summaries
├── vad.rs              # Pure RMS-based silence detection helpers
└── transcribe/
    ├── whisper.rs      # Whisper transcription + WAV loading + resampling
    └── job.rs          # Finalize worker + cancel/progress orchestration

src/
├── lib/
│   ├── i18n.js         # Locale detection + translations (pt/en)
│   ├── format.js       # Shared date/duration formatting
│   ├── appBusy.js      # Cross-tab "app busy" store
│   ├── Recorder.svelte # Recording controls + pending recordings list
│   ├── Dictation.svelte # Real-time dictation with state/level/provisional UI
│   ├── VuMeter.svelte  # Audio-level meter component
│   ├── FinalizingProgress.svelte # Progress overlay for finalize phase
│   ├── ModelDownload.svelte # Model download progress overlay
│   ├── History.svelte  # Transcription list
│   └── TranscriptionView.svelte  # View + copy + summarize
└── routes/
    └── +page.svelte    # Main page (three tabs: Record / Dictation / History)
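
Because `postprocess.rs` is described as pure text normalization, the later stages of that pipeline (punctuation spacing, whitespace collapse, sentence capitalization) lend themselves to a single pure function. The sketch below is a hypothetical illustration, not the project's implementation: `normalize` is an assumed name, it works on one segment at a time, and it treats every whitespace character, including newlines, as a plain space.

```rust
/// Normalize one transcript segment:
/// - collapse runs of whitespace into a single space,
/// - drop whitespace before punctuation and ensure a space after it,
/// - capitalize the first letter of the segment and of each sentence.
/// Digits glued to a preceding mark (as in "2.5") are left untouched.
fn normalize(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut pending_space = false;  // whitespace seen since the last emitted char
    let mut after_punct = false;    // last emitted char was punctuation
    let mut capitalize_next = true; // next letter starts a sentence
    for c in input.chars() {
        if c.is_whitespace() {
            pending_space = true;
            continue;
        }
        if matches!(c, '.' | ',' | '?' | '!' | ';' | ':') {
            // Punctuation attaches directly to the preceding text.
            out.push(c);
            pending_space = false;
            after_punct = true;
            if matches!(c, '.' | '?' | '!') {
                capitalize_next = true;
            }
            continue;
        }
        let glued_digit = after_punct && !pending_space && c.is_ascii_digit();
        if !out.is_empty() && (pending_space || after_punct) && !glued_digit {
            out.push(' ');
        }
        if capitalize_next && c.is_alphabetic() {
            out.extend(c.to_uppercase());
        } else {
            out.push(c);
        }
        if c.is_alphanumeric() {
            capitalize_next = false;
        }
        pending_space = false;
        after_punct = false;
    }
    out
}
```

Keeping the function free of I/O and shared state makes it easy to unit test with `cargo test`. Known gaps in this sketch include closing quotes that follow punctuation and paragraph breaks, both of which would need handling outside it.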

License

GPLv3
