# whisper-asr-webapp

A web app for automatic speech recognition using OpenAI's Whisper model, running locally.
## Quickstart with Docker

```shell
docker run --rm -it -p 8000:8000 -v whisper_models:/root/.cache/whisper ghcr.io/fluxcapacitor2/whisper-asr-webapp:main
```
The frontend is built with Svelte and compiles to static HTML, CSS, and JS.
The backend is built with FastAPI. The main endpoint, `/transcribe`, pipes an uploaded file into ffmpeg and then into Whisper. Once transcription is complete, the result is returned as a JSON payload.
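As a rough illustration of calling the endpoint described above, here is a minimal client-side sketch using only the Python standard library. The form field name (`file`) and the exact request shape are assumptions, not taken from the backend's code; check the FastAPI routes for the actual parameters.

```python
# Sketch: upload an audio file to the /transcribe endpoint with stdlib only.
# The "file" field name is an assumption; verify it against the backend.
import io
import uuid
import urllib.request

def build_multipart(field_name: str, filename: str, payload: bytes, content_type: str):
    """Assemble a multipart/form-data body by hand."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    body.write(f"Content-Type: {content_type}\r\n\r\n".encode())
    body.write(payload)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return body.getvalue(), boundary

data, boundary = build_multipart("file", "clip.wav", b"\x00" * 16, "audio/wav")
req = urllib.request.Request(
    "http://localhost:8000/transcribe",
    data=data,
    headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
)
# Requires the container from the quickstart to be running:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())  # JSON transcription payload
```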
In a containerized environment, the static assets from the frontend build are served by the same FastAPI (Uvicorn) server that handles transcription.
To run the published image in the background instead, use detached mode:

```shell
docker run -d -p 8000:8000 -v whisper_models:/root/.cache/whisper ghcr.io/fluxcapacitor2/whisper-asr-webapp:main
```
The easiest way to get started is with Docker. You can use the premade `run.sh` shell script or run the following commands from the root of the project:
```shell
docker build . -t fluxcapacitor2/whisper-asr-webapp:local-dev
docker run -p 8000:8000 -v whisper_models:/root/.cache/whisper --rm -it fluxcapacitor2/whisper-asr-webapp:local-dev
```
This will build and run a Docker container that hosts both the frontend and backend on port 8000. Navigate to http://localhost:8000 in a web browser to start using the app.
**Note:** After any code change, you will need to rebuild and restart the Docker container. Thanks to Docker's layer caching, rebuilds should still be reasonably fast.