Audio transcription service with Whisper.cpp: recording, streaming, chunked, and file-based transcription
  • Python 70.4%
  • HTML 23.1%
  • Shell 3.7%
  • Dockerfile 2.1%
  • Jinja 0.7%
2026-04-18 19:08:40 +00:00
.devcontainer removed comma 2026-04-08 07:07:17 +00:00
.forgejo/workflows feat: add Forgejo CI pipeline and batch MR workflow 2026-04-18 10:18:14 +02:00
.vscode Updated template 2026-04-08 07:06:48 +00:00
scripts fix: setup.sh line endings 2026-04-17 19:38:14 +02:00
src Merge pull request 'Batch: Fixes #8' (#10) from batch/8 into master 2026-04-18 19:08:40 +00:00
tests fix: address review feedback 2026-04-18 21:05:39 +02:00
ui Batch: Fixes #4, #5 - timestamp-based chunk folders and chunks UI panel 2026-04-18 17:09:05 +02:00
.gitignore fix: move WORKDIR before version.json creation in Dockerfile 2026-04-18 11:55:58 +02:00
.python-version Updated template 2026-04-08 07:06:48 +00:00
AGENTS.md fix: address review feedback 2026-04-18 21:05:39 +02:00
build.sh fix: move WORKDIR before version.json creation in Dockerfile 2026-04-18 11:55:58 +02:00
docker-compose.yml fix: auto-detect git branch/commit at Docker build time — zero manual setup 2026-04-18 14:38:34 +02:00
Dockerfile upgrade: switch from medium to large-v3-turbo whisper model 2026-04-18 17:40:00 +02:00
pyproject.toml feat: serve UI at root with StaticFiles 2026-04-17 19:57:20 +02:00
README.md 🎙️ Add web recording UI 2026-04-16 20:42:53 +02:00
settings.yaml Merge pull request 'feat: add StructureService for LLM-powered transcript structuring' (#9) from feat/transcript-structuring into master 2026-04-18 19:08:30 +00:00

whisper-transcribe

Audio transcription service built on Whisper.cpp: local-first, with chunked recording support.

Quick Start

1. Interactive Setup

# Clone the repo
git clone https://git.fuksa.de/Tasch/whisper-transcribe.git
cd whisper-transcribe

# Run setup (downloads whisper.cpp + model + configures everything)
bash scripts/setup.sh

2. Start Server

uv run uvicorn src.main:app --reload

3. Or use Docker

docker compose up --build

CLI Tool

# Interactive setup
python scripts/cli.py setup

# Download a model
python scripts/cli.py download base

# Start server
python scripts/cli.py serve --port 8000

Web UI

A simple recording UI is available at ui/index.html.

# Open ui/index.html directly in your browser (or serve it from any static file server)

Features:

  • One-click recording
  • Visual timer
  • Sends chunks automatically to server
  • Shows transcription when done

API Endpoints

Method  Endpoint                            Description
GET     /api/v1/health                      Health check
POST    /api/v1/record/start                Start recording session
POST    /api/v1/record/chunk/{session_id}   Add audio chunk
POST    /api/v1/record/stop/{session_id}    Stop + transcribe
POST    /api/v1/transcribe                  Direct file upload
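
The recording endpoints above suggest a start, chunk, stop flow. The following is a minimal client sketch using only the Python standard library, not the project's own client. Several details are assumptions that the README does not confirm: the server address http://localhost:8000, a `session_id` field in the JSON response of `record/start`, and a multipart form field named `file` for the audio chunk. Check the FastAPI routes under src/api/ before relying on any of them.

```python
import json
import urllib.request
import uuid
from typing import Optional

BASE_URL = "http://localhost:8000"  # assumption: default uvicorn address
API_PREFIX = "/api/v1"              # taken from the endpoint table


def endpoint(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL, the API prefix, and an endpoint path."""
    return f"{base_url.rstrip('/')}{API_PREFIX}/{path.lstrip('/')}"


def encode_multipart(field: str, filename: str, payload: bytes) -> tuple:
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def post(url: str, body: bytes = b"", content_type: Optional[str] = None) -> dict:
    """Send a POST request and decode the JSON response."""
    headers = {"Content-Type": content_type} if content_type else {}
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def record_and_transcribe(chunks: list) -> dict:
    """Run the start -> chunk -> stop flow against a running server."""
    session = post(endpoint("record/start"))
    session_id = session["session_id"]  # assumption: response field name
    for index, chunk in enumerate(chunks):
        # assumption: the chunk is uploaded as a form field called "file"
        body, content_type = encode_multipart("file", f"chunk-{index}.wav", chunk)
        post(endpoint(f"record/chunk/{session_id}"), body, content_type)
    return post(endpoint(f"record/stop/{session_id}"))
```

With the server running, `record_and_transcribe([chunk_a, chunk_b])` would upload two chunks of audio bytes and return whatever JSON the stop endpoint responds with.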

Configuration

Edit settings.yaml:

whisper:
  model: base              # tiny, base, small, medium, large
  language: null           # null = auto-detect; or set "de", "en", etc.
  model_path: models/ggml-base.bin

recording:
  chunk_duration_sec: 30   # Chunk size for recording sessions

routing:
  to_tasch:
    enabled: true          # Send transcripts to Tasch
  to_repo:
    enabled: false
    repo_url: "https://git.fuksa.de/..."
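
To illustrate how these settings could be consumed, here is a small sketch that turns an already-parsed settings dictionary into typed objects and works out how many chunks a recording would produce. The class and function names (`WhisperSettings`, `RecordingSettings`, `parse_settings`, `chunk_count`) are hypothetical and not taken from the project; only the keys and default values come from the sample above.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WhisperSettings:
    # Defaults mirror the sample settings.yaml
    model: str = "base"
    language: Optional[str] = None  # None (YAML null) means auto-detect
    model_path: str = "models/ggml-base.bin"


@dataclass(frozen=True)
class RecordingSettings:
    chunk_duration_sec: int = 30


def parse_settings(raw: dict) -> tuple:
    """Build typed settings from a parsed settings dict (hypothetical helper)."""
    whisper = WhisperSettings(**raw.get("whisper", {}))
    recording = RecordingSettings(**raw.get("recording", {}))
    if recording.chunk_duration_sec <= 0:
        raise ValueError("recording.chunk_duration_sec must be positive")
    return whisper, recording


def chunk_count(duration_sec: float, recording: RecordingSettings) -> int:
    """Number of chunks needed to cover a recording of the given length."""
    return math.ceil(duration_sec / recording.chunk_duration_sec)
```

The input dictionary would typically come from a YAML parser, for example PyYAML's `yaml.safe_load`. With the default of 30 seconds per chunk, a 95-second recording is split into 4 chunks, the last one shorter than the rest.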

Development

# Install dependencies
uv sync

# Run tests
uv run pytest -m "not integration"

# Lint
uv run ruff check src/ tests/

Architecture

src/
├── audio/           # Recording sessions, file handling
├── transcription/   # Whisper.cpp wrapper
├── routing/         # Output dispatcher
├── api/             # FastAPI routes
└── main.py          # App entry point
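
The routing/ package is described only as an output dispatcher. As a rough illustration of that idea, and not the project's actual code, the sketch below forwards a finished transcript to every route whose `enabled` flag is set in the `routing` section of settings.yaml. The class name `OutputDispatcher` and the handler signature are hypothetical.

```python
from typing import Callable, Dict, List

# A handler receives the transcript text; what it does with it is up to the route.
Handler = Callable[[str], None]


class OutputDispatcher:
    """Hypothetical dispatcher: sends a transcript to every enabled route."""

    def __init__(self, routing: Dict[str, dict], handlers: Dict[str, Handler]) -> None:
        self._routing = routing    # the "routing" section of the settings
        self._handlers = handlers  # route name -> callable

    def enabled_routes(self) -> List[str]:
        """Names of routes whose config has enabled: true."""
        return [
            name
            for name, options in self._routing.items()
            if options.get("enabled", False)
        ]

    def dispatch(self, transcript: str) -> List[str]:
        """Call each enabled handler and return the routes that were served."""
        delivered = []
        for name in self.enabled_routes():
            handler = self._handlers.get(name)
            if handler is None:
                continue  # enabled in the config, but no handler registered
            handler(transcript)
            delivered.append(name)
        return delivered
```

With the sample configuration, where `to_tasch` is enabled and `to_repo` is disabled, `dispatch("hello")` would call only the `to_tasch` handler and return `["to_tasch"]`.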