TheWallflower

A self-hosted, privacy-focused NVR (Network Video Recorder) with real-time Speech-to-Text and high-accuracy Face Recognition. It monitors RTSP camera streams in real time and transcribes their audio locally, built on Whisper, Svelte 5, and FastAPI.

⚠️ Warning: This project is under active, experimental development. Features change often, and the main branch is frequently broken. We prioritize moving fast and testing new AI integrations over stability at this stage. Use at your own risk.

Current Status: v0.3.0

TheWallflower has evolved to a "Split-Pipeline" architecture: heavy video work is offloaded to go2rtc, while the Python backend focuses on AI processing (Whisper + InsightFace).

Features

  • Low-Latency WebRTC - Primary viewing via WebRTC for <100ms latency.
  • 24/7 Continuous Recording - Zero-transcode (direct copy) segmented MP4 recording for maximum efficiency.
  • Advanced Face Recognition - Robust identity management using multi-embedding averages and local pretraining (/data/faces/known/{name}/).
  • Real-time Speech-to-Text - Transcription of RTSP audio streams powered by WhisperLive with aggressive anti-hallucination filtering.
  • Audio Pre-filtering - RMS energy gating and Silero VAD (Voice Activity Detection) ensure Whisper only processes actual speech (a short sketch of this gate follows the list below).
  • Event Snapshots - High-definition full-frame captures for every face detection event.
  • UI-First Configuration - No YAML files needed. Add and manage cameras through the modern Svelte 5 web interface.
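
The audio pre-filter works in two stages: a cheap RMS energy gate discards near-silent chunks, and only chunks where Silero VAD detects speech are forwarded to WhisperLive. The Python sketch below illustrates the idea; the sample rate, chunk handling, and threshold value are illustrative assumptions, not the settings used in backend/app/worker.py.

# Illustrative pre-filter: energy gate first, then Silero VAD.
# Assumes 16 kHz mono float32 PCM chunks and the `silero-vad` pip package;
# the threshold below is a placeholder, not TheWallflower's actual setting.
import numpy as np
import torch
from silero_vad import load_silero_vad, get_speech_timestamps

SAMPLE_RATE = 16_000
RMS_THRESHOLD = 0.01          # assumed value; tune per camera/microphone

vad_model = load_silero_vad()

def is_speech(chunk: np.ndarray) -> bool:
    """True only if the chunk is loud enough AND the VAD finds speech."""
    # 1. Cheap RMS energy gate: drop near-silent audio before running the model.
    rms = float(np.sqrt(np.mean(np.square(chunk))))
    if rms < RMS_THRESHOLD:
        return False
    # 2. Silero VAD: only chunks with detected speech continue on to Whisper.
    speech = get_speech_timestamps(
        torch.from_numpy(chunk), vad_model, sampling_rate=SAMPLE_RATE
    )
    return len(speech) > 0

Running the energy gate first matters because it is far cheaper than the VAD model, so silent audio never touches the neural network, let alone Whisper.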

Tech Stack

Component      Technology
Frontend       Svelte 5 (Runes) + TailwindCSS v4
Backend        FastAPI + SQLModel + Alembic
Video Engine   go2rtc (Embedded)
Speech AI      WhisperLive + Faster-Whisper + Silero VAD
Vision AI      InsightFace (buffalo_l) + ONNX Runtime
Database       SQLite (WAL Mode)
Container      Docker (Multi-stage build)

Quick Start

Prerequisites

  • Docker and Docker Compose
  • Intel GPU (recommended for OpenVINO) or NVIDIA GPU

Running with Docker Compose

# Clone the repository
git clone https://github.com/Jellman86/TheWallFlower.git
cd TheWallFlower

# Copy and customize environment (essential for WebRTC)
cp .env.example .env
# Edit .env and set WEBRTC_ADVERTISED_IP to your server's local IP

# Start the services
docker compose up -d

The web UI will be available at http://localhost:8953

GPU Acceleration

TheWallflower supports hardware acceleration for AI tasks:

  • Intel iGPU: Set WHISPER_IMAGE to the openvino variant in .env.
  • NVIDIA GPU: Set WHISPER_IMAGE to the gpu variant and uncomment the deploy section in docker-compose.yml.

Architecture

┌────────────────────────────────────────────────────────────────────────┐
│                        TheWallflower Container                          │
│                                                                         │
│  ┌────────────────┐      ┌─────────────────┐      ┌─────────────────┐  │
│  │    FastAPI     │◄────►│     go2rtc      │◄────►│   RTSP Camera   │  │
│  │    Backend     │      │  (Video Engine) │      │                 │  │
│  │    :8953       │      │  :8954/8955/8956│      │                 │  │
│  └───────┬────────┘      └─────────────────┘      └─────────────────┘  │
│          │                                                              │
│          │ Audio Worker Pipeline:                                       │
│          │  FFmpeg ──► Bandpass ──► Energy Gate ──► Silero VAD          │
│          │                                                              │
│          ▼                                                              │
│  ┌────────────────┐                                                     │
│  │  WhisperLive   │ ◄── Only verified speech chunks reach here          │
│  │   (External)   │                                                     │
│  │    :9090       │                                                     │
│  └────────────────┘                                                     │
│                                                                         │
│          │ Face Worker Pipeline:                                        │
│          │  Fetch Frame ──► InsightFace ──► Identify ──► DB Event       │
│          │                                                              │
│          │ Recording Worker:                                            │
│          │  FFmpeg (Copy) ──► Segmented MP4s ──► /data/recordings       │
│          │                                                              │
└──────────────────────────────────────────────────────────────────────────┘
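
For the Recording Worker, "zero-transcode" means FFmpeg copies the camera's encoded stream into fixed-length MP4 segments without re-encoding anything. The snippet below is a minimal sketch of that idea using FFmpeg's segment muxer; the RTSP URL, segment length, and filename pattern are illustrative assumptions, not the project's actual worker code.

# Sketch of a zero-transcode, segmented recorder (not the project's exact code).
import subprocess

def start_recording(rtsp_url: str, out_dir: str, segment_seconds: int = 60) -> subprocess.Popen:
    """Copy the RTSP stream into fixed-length MP4 segments without re-encoding."""
    cmd = [
        "ffmpeg",
        "-rtsp_transport", "tcp",            # TCP is more robust than UDP for RTSP
        "-i", rtsp_url,
        "-c", "copy",                        # direct stream copy: no transcoding
        "-f", "segment",                     # split output into fixed-length files
        "-segment_time", str(segment_seconds),
        "-reset_timestamps", "1",            # each segment starts at t=0
        "-strftime", "1",                    # expand date/time in the filename
        f"{out_dir}/%Y%m%d-%H%M%S.mp4",
    ]
    return subprocess.Popen(cmd)

Because nothing is re-encoded, the CPU cost per camera stays close to zero, which is what makes 24/7 recording cheap enough to run alongside the AI workers.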

Face Recognition Pretraining

To skip the "Unknown" phase, you can pretrain the system with existing photos (a code sketch follows these steps):

  1. Create a folder: /data/faces/known/John_Smith/
  2. Drop .jpg or .png photos of John into that folder.
  3. Restart the container.
  4. TheWallflower will automatically detect faces, generate embeddings, and register the identity.
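
Conceptually, that restart runs a pass like the sketch below: every photo in each named folder is embedded with InsightFace, and the embeddings are averaged into a single reference vector per identity (the "multi-embedding averages" mentioned under Features). The folder layout and the buffalo_l model pack come from this README; everything else (thresholds, storage) is an assumption, not TheWallflower's exact implementation.

# Rough sketch of the pretraining pass; not the project's actual code.
from pathlib import Path
import cv2
import numpy as np
from insightface.app import FaceAnalysis

app = FaceAnalysis(name="buffalo_l")         # model pack listed in the Tech Stack
app.prepare(ctx_id=0, det_size=(640, 640))   # ctx_id=0: first GPU, or CPU fallback

known: dict[str, np.ndarray] = {}
for person_dir in Path("/data/faces/known").iterdir():
    if not person_dir.is_dir():
        continue
    embeddings = []
    for img_path in sorted(person_dir.iterdir()):
        if img_path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        img = cv2.imread(str(img_path))
        if img is None:
            continue
        faces = app.get(img)                 # detect + embed in one call
        if faces:
            embeddings.append(faces[0].normed_embedding)
    if embeddings:
        # Multi-embedding average: one robust reference vector per identity.
        known[person_dir.name] = np.mean(embeddings, axis=0)

At detection time, a new face's embedding can then be compared against these averaged vectors (for example by cosine similarity) to decide between a known name and "Unknown".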

Project Structure

TheWallflower/
├── backend/
│   ├── app/
│   │   ├── main.py           # FastAPI application & SSE
│   │   ├── stream_manager.py # Worker lifecycle
│   │   ├── worker.py         # Audio extraction & VAD
│   │   ├── workers/          # Background tasks (Face, Recording)
│   │   └── services/         # Business logic (Detection, Recording)
│   └── migrations/           # Alembic DB migrations
├── frontend/
│   ├── src/
│   │   └── lib/
│   │   │   ├── components/   # WebRTCPlayer, FaceCard, RecordingsPanel
│   │   │   └── services/     # API client (api.js)
│   └── public/
├── docker-compose.yml
├── Dockerfile
└── docker-entrypoint.sh

License

MIT License - see LICENSE for details.
