BVoice

Push-to-talk dictation for Linux. Hold a key, speak, and release: the transcription is typed at your cursor. 100% offline, powered by whisper.cpp. Built with Rust, Tauri, and Svelte.

Local push-to-talk speech-to-text desktop app.
Hold a key, speak, release — the transcription is typed at your cursor.
Runs 100% offline using whisper.cpp.

Install

Grab the latest build from the Releases page.

Debian / Ubuntu (.deb):

sudo apt install ./BVoice_0.1.0_amd64.deb

Fedora / RHEL / openSUSE (.rpm):

sudo dnf install ./BVoice-0.1.0-1.x86_64.rpm

After installation, BVoice appears in your application menu. On first launch, the selected Whisper model (75–466 MB, depending on the model) downloads to `~/.local/share/bvoice/models/`.

Features

  • Push-to-talk trigger (default: Right-Alt), rebindable from the settings window
  • Local transcription with whisper.cpp (tiny.en / base.en / small.en)
  • Silero VAD trims silence before transcription
  • FFT-based resampling (rubato) for high-quality 48 kHz → 16 kHz conversion
  • Beam search (configurable size, default 5) or greedy decoding
  • Live-applied settings for threshold, input device, and model swap — no restart
  • Tray icon reflects state (idle / recording / transcribing) using the branded icon
  • Single-instance enforcement; optional autostart on login
  • Types output directly at the cursor — never touches your clipboard

Usage

  1. Launch BVoice — a tray icon appears (no window by default).
  2. Click the tray icon → Settings to configure model, hotkey, input device, beam size, and autostart.
  3. Focus any text field (editor, terminal, browser, …).
  4. Hold the trigger key for at least the arm threshold (default 1 s), then speak and release.
  5. The transcription is typed at the cursor.
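The arm-on-hold behavior in step 4 can be pictured as a small state machine. The following is a minimal sketch using only the Rust standard library. The names `PushToTalk`, `State`, and `Action` are hypothetical and are not taken from BVoice's source; the real implementation may be structured differently.

```rust
use std::time::{Duration, Instant};

/// Hypothetical states for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Idle,
    /// Key is down, but the arm threshold has not elapsed yet.
    Holding { since: Instant },
    /// Threshold elapsed while the key was held; audio is being captured.
    Recording,
}

/// What the caller should do after an event (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    None,
    StartRecording,
    StopAndTranscribe,
}

pub struct PushToTalk {
    state: State,
    arm_threshold: Duration,
}

impl PushToTalk {
    pub fn new(arm_threshold: Duration) -> Self {
        PushToTalk { state: State::Idle, arm_threshold }
    }

    pub fn state(&self) -> State {
        self.state
    }

    /// Key went down at `now`. Repeated presses while already
    /// holding or recording (keyboard auto-repeat) are ignored.
    pub fn on_press(&mut self, now: Instant) {
        if self.state == State::Idle {
            self.state = State::Holding { since: now };
        }
    }

    /// Called periodically; arms recording once the key has been
    /// held for at least the threshold.
    pub fn on_tick(&mut self, now: Instant) -> Action {
        if let State::Holding { since } = self.state {
            if now.duration_since(since) >= self.arm_threshold {
                self.state = State::Recording;
                return Action::StartRecording;
            }
        }
        Action::None
    }

    /// Key released. A release before arming is a no-op, so a quick
    /// tap of the trigger key never starts a transcription.
    pub fn on_release(&mut self) -> Action {
        let action = match self.state {
            State::Recording => Action::StopAndTranscribe,
            _ => Action::None,
        };
        self.state = State::Idle;
        action
    }
}
```

With a 1000 ms threshold, a press followed by a release after 500 ms yields `Action::None`, while holding past 1000 ms yields `StartRecording` on the next tick and `StopAndTranscribe` on release.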

Platform support

  • Linux / X11 — primary target, tested on Ubuntu GNOME
  • Wayland — not supported (global hotkeys and synthesized typing require compositor-specific portals)
  • macOS / Windows — not yet ported

Configuration

Settings persist at ~/.config/bvoice/config.toml:

Key               Type         Default  Description
model             string       base.en  Whisper model (tiny.en, base.en, small.en)
arm_threshold_ms  u64          1000     Hold duration before recording arms
input_device      string|null  null     Input device name; null = system default
hotkey            string       AltGr    Trigger key (rebindable from the settings window)
beam_size         u32          5        Beam search size; 1 = greedy

All fields are editable from the Settings window and persist on Save.
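As a rough illustration of how the keys and defaults above map onto Rust types, here is a standard-library-only sketch. The `Config` struct, its `validate` method, and the rule that a `beam_size` of 0 is rejected are assumptions made for this example; they are not taken from BVoice's source, and file parsing is left out entirely.

```rust
/// Illustrative mirror of the documented config keys.
/// The real struct in BVoice may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub model: String,
    pub arm_threshold_ms: u64,
    /// `None` stands for "use the system default input device".
    pub input_device: Option<String>,
    pub hotkey: String,
    pub beam_size: u32,
}

impl Default for Config {
    /// Defaults as listed in the configuration table.
    fn default() -> Self {
        Config {
            model: "base.en".to_string(),
            arm_threshold_ms: 1000,
            input_device: None,
            hotkey: "AltGr".to_string(),
            beam_size: 5,
        }
    }
}

impl Config {
    /// Hypothetical sanity check: the model must be one of the three
    /// documented names, and beam_size must be at least 1 (assumption).
    pub fn validate(&self) -> Result<(), String> {
        const MODELS: [&str; 3] = ["tiny.en", "base.en", "small.en"];
        if !MODELS.contains(&self.model.as_str()) {
            return Err(format!("unknown model: {}", self.model));
        }
        if self.beam_size == 0 {
            return Err("beam_size must be at least 1".to_string());
        }
        Ok(())
    }

    /// Per the table, a beam size of 1 means greedy decoding.
    pub fn is_greedy(&self) -> bool {
        self.beam_size == 1
    }
}
```

Modeling `input_device` as `Option<String>` keeps the "null = system default" rule explicit in the type rather than relying on an empty string.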

Build from source

Prerequisites

  • Rust (stable) via rustup
  • Node.js 20+ and npm
  • Tauri CLI: cargo install tauri-cli --version '^2.0' --locked
  • Linux system packages (Ubuntu/Debian):
    sudo apt install \
      libwebkit2gtk-4.1-dev libsoup-3.0-dev libayatana-appindicator3-dev \
      libasound2-dev libxdo-dev libclang-dev libssl-dev libstdc++-12-dev \
      pkg-config build-essential
    

Run / build

npm install
npm run tauri dev          # development
npm run tauri build        # release bundles (.deb + .rpm)

Architecture

hotkey (rdev, X11 XRecord)  ─▶ state machine (arm-on-1s)
                                   │
                             armed ▼
                             audio::start   (cpal, dedicated thread)
                                   │
                          released ▼
                             audio::stop    (mono + rubato 16 kHz)
                                   │
                                   ▼
                             vad::trim_silence  (Silero VAD)
                                   │
                                   ▼
                             transcribe::transcribe  (whisper-rs, beam search)
                                   │
                                   ▼
                             inject::paste  (enigo — types at cursor)
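To make the `audio::stop` stage more concrete, the sketch below shows the two transformations it names: downmixing interleaved samples to mono and reducing 48 kHz to 16 kHz. This is not BVoice's code. The function names are hypothetical, and the 3:1 averaging decimator is a deliberately crude stand-in for the FFT-based rubato resampler the app uses; it only illustrates the shape of the data flow.

```rust
/// Average each interleaved frame down to a single channel.
/// A trailing partial frame, if any, is dropped.
pub fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    assert!(channels > 0, "channel count must be non-zero");
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Crude 48 kHz -> 16 kHz conversion: each output sample is the mean
/// of three consecutive input samples (a box filter followed by 3:1
/// decimation). Stand-in only; a proper resampler filters far better.
pub fn decimate_by_3(mono_48k: &[f32]) -> Vec<f32> {
    mono_48k
        .chunks_exact(3)
        .map(|group| (group[0] + group[1] + group[2]) / 3.0)
        .collect()
}
```

One second of stereo capture at 48 kHz (96,000 interleaved samples) becomes 48,000 mono samples after `downmix_to_mono` and 16,000 samples after `decimate_by_3`, which matches the 16 kHz rate the diagram shows being passed on to the VAD and transcription stages.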

License

MIT — see LICENSE.
