Audio transcription service with Whisper.cpp - recording, streaming, chunked, and file-based transcription

whisper-transcribe

Audio transcription service built on Whisper.cpp: local-first, with support for chunked recording.

Quick Start

1. Clone the repo

git clone https://git.fuksa.de/Tasch/whisper-transcribe.git
cd whisper-transcribe

2. Start Server

uv run uvicorn src.main:app --reload

3. Or use Docker

docker compose up --build

Web UI

A simple recording UI is available at ui/index.html.

# No separate web server is required:
# open ui/index.html directly in your browser

Features:

  • One-click recording
  • Visual timer
  • Sends chunks to the server automatically
  • Shows transcription when done
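
To illustrate what "sending chunks automatically" means in terms of timing, here is a minimal sketch that splits a recording of a given length into fixed-size chunks. The function name chunk_bounds is hypothetical, and the 30-second default mirrors chunk_duration_sec from settings.yaml. Whether the UI uses exactly this value is an assumption.

```python
from typing import Iterator, Tuple


def chunk_bounds(total_sec: float, chunk_sec: float = 30.0) -> Iterator[Tuple[float, float]]:
    """Yield (start, end) offsets in seconds for consecutive chunks.

    Hypothetical helper: the last chunk is shorter when the recording
    length is not a multiple of chunk_sec.
    """
    if chunk_sec <= 0:
        raise ValueError("chunk_sec must be positive")
    start = 0.0
    while start < total_sec:
        end = min(start + chunk_sec, total_sec)
        yield (start, end)
        start = end


if __name__ == "__main__":
    # A 75-second recording yields two full chunks and one partial chunk.
    for start, end in chunk_bounds(75):
        print(f"chunk {start:>5.1f}s .. {end:>5.1f}s")
```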

API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/v1/health | Health check |
| POST | /api/v1/record/start | Start recording session |
| POST | /api/v1/record/chunk/{session_id} | Add audio chunk |
| POST | /api/v1/record/stop/{session_id} | Stop + transcribe |
| POST | /api/v1/transcribe | Direct file upload |
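
The recording endpoints are meant to be called in sequence: start once, send a chunk any number of times, then stop. The sketch below only builds that sequence of (method, URL) pairs and sends nothing. The helper name session_requests is hypothetical, and it assumes the session ID is obtained from the start call; request and response bodies are not documented here, so they are left out.

```python
from typing import List, Tuple
from urllib.parse import quote

API_PREFIX = "/api/v1"


def session_requests(base_url: str, session_id: str, n_chunks: int) -> List[Tuple[str, str]]:
    """Return the (method, url) calls for one recording session.

    Hypothetical helper for illustration: it does not perform any HTTP
    requests, and it assumes session_id was returned by /record/start.
    """
    base = base_url.rstrip("/") + API_PREFIX
    sid = quote(session_id, safe="")  # keep the ID safe inside a URL path
    calls = [("POST", f"{base}/record/start")]
    calls += [("POST", f"{base}/record/chunk/{sid}")] * n_chunks
    calls.append(("POST", f"{base}/record/stop/{sid}"))
    return calls


if __name__ == "__main__":
    for method, url in session_requests("http://localhost:8000", "abc123", 2):
        print(method, url)
```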

Configuration

Edit settings.yaml:

whisper:
  model: base              # tiny, base, small, medium, large
  language: null           # auto-detect, or: "de", "en", etc.
  model_path: models/ggml-base.bin

recording:
  chunk_duration_sec: 30   # Chunk size for recording sessions

routing:
  to_tasch:
    enabled: true          # Send transcripts to Tasch
  to_repo:
    enabled: false
    repo_url: "https://git.fuksa.de/..."

# Git Notes Export — send structured transcripts to a GitHub repo
git_notes:
  enabled: false
  # GitHub repository in format "owner/repo"
  repo: "YourUsername/your-repo"
  # Branch to commit to
  branch: "main"
  # Environment variable name containing the GitHub PAT
  token_env: "GIT_NOTES_TOKEN"
  # Path template: {date} = recording date (YYYY-MM-DD), {heading} = first H1/H2/H3 from MD
  path_template: "{date}/{heading}.md"
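
As a rough illustration of how the git_notes block could be validated once settings.yaml has been parsed into a dictionary, consider the sketch below. The class GitNotesConfig and the function parse_git_notes are hypothetical names, not the service's actual code; the defaults simply mirror the values shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GitNotesConfig:
    """Hypothetical typed view of the git_notes section."""
    enabled: bool = False
    repo: str = ""
    branch: str = "main"
    token_env: str = "GIT_NOTES_TOKEN"
    path_template: str = "{date}/{heading}.md"


def parse_git_notes(settings: dict) -> GitNotesConfig:
    """Build a GitNotesConfig from already-parsed settings.

    Unknown keys raise TypeError. When the export is enabled, repo must
    have the documented "owner/repo" form.
    """
    raw = settings.get("git_notes") or {}
    cfg = GitNotesConfig(**raw)
    if cfg.enabled:
        owner, sep, name = cfg.repo.partition("/")
        if not (owner and sep and name) or "/" in name:
            raise ValueError(f'repo must look like "owner/repo", got {cfg.repo!r}')
    return cfg


if __name__ == "__main__":
    print(parse_git_notes({"git_notes": {"enabled": True, "repo": "YourUsername/your-repo"}}))
```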

Structured Transcript Export

After transcribing and structuring recordings, you can:

  • 💾 Download — download the .md file to your machine
  • 📋 Copy — copy Markdown to clipboard
  • 📤 Send to Repo — commit directly to a GitHub repository

Setup: Send to Repo

1. Create a GitHub Personal Access Token (PAT)

  1. Go to https://github.com/settings/tokens → Generate new token (classic)
  2. Name: e.g. whisper-transcribe-export
  3. Scope: repo (full control of private repositories)
  4. Expiration: choose as desired
  5. Generate token → copy it now (you won't see it again)

2. Configure the container

Create a .env file in the project root (never commit this!):

# .env — DO NOT COMMIT THIS FILE!
GIT_NOTES_TOKEN=ghp_your_token_here

.env is already in .gitignore.

3. Enable in settings.yaml

git_notes:
  enabled: true
  repo: "YourUsername/your-private-repo"
  branch: "main"
  token_env: "GIT_NOTES_TOKEN"
  path_template: "{date}/{heading}.md"

4. Enable the Docker environment variable

In docker-compose.yml, uncomment the line:

environment:
  - MODEL_NAME=base
  - GIT_NOTES_TOKEN=${GIT_NOTES_TOKEN}  # ← uncomment this
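
Inside the container, the token has to be looked up through the variable name configured in token_env. The sketch below shows one way such a lookup could fail early with a clear message; resolve_token is a hypothetical helper, not the service's actual function. Treating an empty value as missing is a deliberate choice here, since Docker Compose substitutes an empty string when the host variable is unset.

```python
import os
from typing import Mapping, Optional


def resolve_token(token_env: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the token stored in the environment variable named token_env.

    Hypothetical helper: environ defaults to os.environ and can be
    replaced with a plain dict in tests.
    """
    env = os.environ if environ is None else environ
    token = env.get(token_env, "").strip()
    if not token:
        raise RuntimeError(f"Environment variable {token_env} is not set or is empty")
    return token


if __name__ == "__main__":
    # Placeholder value only; never hard-code a real token.
    print(resolve_token("GIT_NOTES_TOKEN", {"GIT_NOTES_TOKEN": "ghp_your_token_here"}))
```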

5. Build and start

docker compose up --build

Path Template

Files are saved at the path produced by path_template, which supports these variables:

| Variable | Description | Example |
|----------|-------------|---------|
| {date} | Recording date | 2026-04-19 |
| {heading} | First H1/H2/H3 from the MD | Kapitel 3: Klassifikation |
| {folder_name} | Raw folder name | 20260419_143022 |

Examples for path_template: "{date}/{heading}.md":

  • 2026-04-19/Kapitel 3 Klassifikation.md
  • 2026-04-20/Maschinelles Lernen.md

Invalid filename characters (< > : " / \ | ? *) are automatically removed from the heading.
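
The template expansion described above can be sketched as follows. The function names first_heading, sanitize, and render_path are hypothetical and only illustrate the documented behaviour: take the first H1/H2/H3 heading, strip the invalid filename characters, and fill in the template. The service's actual implementation may differ.

```python
import re
from typing import Optional

# The characters listed above as invalid in filenames: < > : " / \ | ? *
INVALID_CHARS = re.compile(r'[<>:"/\\|?*]')
# A Markdown ATX heading of level 1 to 3, e.g. "## Title".
HEADING = re.compile(r"^#{1,3}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def first_heading(markdown: str) -> Optional[str]:
    """Return the text of the first H1/H2/H3 heading, or None if there is none."""
    match = HEADING.search(markdown)
    return match.group(1) if match else None


def sanitize(name: str) -> str:
    """Remove invalid filename characters and trim surrounding whitespace."""
    return INVALID_CHARS.sub("", name).strip()


def render_path(template: str, date: str, heading: str, folder_name: str) -> str:
    """Fill the template; only the heading is sanitized, so "/" in the
    template itself still acts as a directory separator."""
    return template.format(date=date, heading=sanitize(heading), folder_name=folder_name)


if __name__ == "__main__":
    md = "Intro text\n\n## Kapitel 3: Klassifikation\n\nBody"
    heading = first_heading(md) or "untitled"
    print(render_path("{date}/{heading}.md", "2026-04-19", heading, "20260419_143022"))
```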


Home Server Deployment

The service runs as a Docker container on your home server.

Prerequisites

  • Docker & Docker Compose installed on the server
  • SSH access to the server
  • A domain/subdomain pointing to your server (optional but recommended)

Setup

# 1. SSH into your server
ssh user@your-server

# 2. Clone the repo
git clone https://git.fuksa.de/Tasch/whisper-transcribe.git
cd whisper-transcribe

# 3. Create .env with your secrets
nano .env
# Add:
#   GIT_NOTES_TOKEN=ghp_your_github_token

# 4. Edit settings.yaml — enable git_notes and configure as needed
nano settings.yaml

# 5. Edit docker-compose.yml — uncomment the GIT_NOTES_TOKEN line

# 6. Build and start
docker compose up --build -d

# 7. Check logs
docker compose logs -f
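
Besides watching the logs, you can poll the health endpoint until the container is ready. The following is a minimal sketch using only the standard library. The names http_probe and wait_until_healthy are hypothetical, and it assumes that /api/v1/health answers with HTTP 200 once the service is up.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET request to url answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, timeouts, refused connections
        return False


def wait_until_healthy(
    url: str,
    attempts: int = 10,
    delay_sec: float = 2.0,
    probe: Callable[[str], bool] = http_probe,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe url up to `attempts` times, pausing delay_sec between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay_sec)
    return False


if __name__ == "__main__":
    # Replace your-server with the real host name before running.
    ok = wait_until_healthy("http://your-server:8000/api/v1/health", attempts=3)
    print("healthy" if ok else "not reachable")
```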

Updating

cd whisper-transcribe
git pull
docker compose up --build -d

Accessing the UI

The Web UI is served by the FastAPI backend:

| Service | URL |
|---------|-----|
| API + Web UI | http://your-server:8000/api/v1/ |
| Web UI directly | http://your-server:8000/ |

For external access, configure a reverse proxy (e.g. nginx, Caddy) with HTTPS.

GPU Acceleration (optional)

If your server has an NVIDIA GPU, uncomment the deploy section in docker-compose.yml to accelerate Whisper inference.


Development

# Install dependencies
uv sync

# Run tests
uv run pytest -m "not integration"

# Lint
uv run ruff check src/ tests/

Architecture

src/
├── audio/           # Recording sessions, file handling
├── transcription/   # Whisper.cpp wrapper
├── routing/         # Output dispatcher
├── api/             # FastAPI routes
└── main.py          # App entry point
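
To give a feel for what the output dispatcher in routing/ might do with the routing section of settings.yaml, here is a small sketch. The function dispatch and the handler mapping are hypothetical and not taken from the codebase. It assumes each routing target has an enabled flag, as in the configuration example.

```python
from typing import Callable, Dict, List


def dispatch(
    transcript: str,
    routing_cfg: dict,
    handlers: Dict[str, Callable[[str], None]],
) -> List[str]:
    """Send transcript to every enabled target that has a handler.

    Hypothetical sketch: returns the names of the targets that received
    the transcript, in configuration order.
    """
    delivered = []
    for name, options in routing_cfg.items():
        if not (options or {}).get("enabled", False):
            continue  # target is switched off in settings.yaml
        handler = handlers.get(name)
        if handler is None:
            continue  # no handler registered for this target
        handler(transcript)
        delivered.append(name)
    return delivered


if __name__ == "__main__":
    routing = {"to_tasch": {"enabled": True}, "to_repo": {"enabled": False}}
    handlers = {
        "to_tasch": lambda text: print("to_tasch:", text),
        "to_repo": lambda text: print("to_repo:", text),
    }
    print(dispatch("Hello world", routing, handlers))
```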