Fix #47: Implement chunk_transcription_mode feature #50
Summary
Fixes issue #47. The `chunk_transcription_mode` feature defined in `settings.yaml` was not implemented: the code was stripped out during a refactoring, but the config remained.

This MR implements the missing functionality:
Changes
RecordingManager (`src/audio/recording.py`):
- `_transcribe_chunk_to_part_file()`: transcribes a chunk to `part_NNN.txt`
- `_merge_part_transcripts()`: merges `part_NNN.txt` files into `transcript.txt`
- `_extract_webm_header()` / `_patch_chunk_with_header()`: WebM header utilities
- Updated `transcribe_chunk_folder()` to handle chunk mode (merge part files if already transcribed)

routes.py `add_chunk()`:
- When `chunk_transcription_mode` is enabled, each chunk is transcribed immediately in the background after being saved
- Subsequent chunks prepend the last N seconds of the previous chunk as overlap
- Each chunk becomes a `part_NNN.txt` file

routes.py `stop_recording()`:
- In `chunk_transcription_mode`: waits for background transcriptions to complete, then merges the `part_NNN.txt` files into `transcript.txt`

Settings
The feature is configured via `settings.yaml` (`recording.chunk_transcription_mode` and `recording.overlap_time`).

Testing
- `uv run pytest -m "not integration"`: 9 tests pass (some pre-existing test fixture issues)
- `uv run ruff check src/ tests/`: ✅ All checks passed
- `uv run mypy src/`: pre-existing type errors (not introduced by this MR)

Reviewer
@SaschaFuksa please review
Checklist
- Add chunk_transcription_mode support in RecordingManager:
  - _transcribe_chunk_to_part_file: transcribe a chunk to part_NNN.txt
  - _merge_part_transcripts: merge part files into transcript.txt
  - _extract_webm_header / _patch_chunk_with_header: helper utilities
  - Updated transcribe_chunk_folder to merge part files in chunk mode
- Updated routes.py add_chunk:
  - When chunk_transcription_mode is enabled, each chunk is transcribed immediately in the background after being saved
  - Subsequent chunks prepend last N seconds of previous chunk as overlap
  - Each chunk becomes a part_NNN.txt file
- Updated routes.py stop_recording:
  - In chunk_transcription_mode: wait for background transcriptions to complete, then merge part_NNN.txt files into transcript.txt

The feature was designed in settings.yaml (recording.chunk_transcription_mode and recording.overlap_time) but the implementation was missing after the refactoring that removed it. This restores the per-chunk transcription mode.

Pull request closed
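
For readers unfamiliar with the merge step described above, here is a minimal sketch of how `part_NNN.txt` files could be combined into `transcript.txt`. It is not the code from this MR: the function name `merge_part_files`, the numeric ordering by the `NNN` suffix, and the newline separator are all assumptions made for illustration. It also does not try to remove text duplicated by the overlap between chunks.

```python
import re
from pathlib import Path

# Assumed naming scheme: part_001.txt, part_002.txt, ...
PART_PATTERN = re.compile(r"^part_(\d+)\.txt$")


def merge_part_files(folder: Path, output_name: str = "transcript.txt") -> Path:
    """Concatenate part_NNN.txt files in numeric order (hypothetical helper)."""
    parts = []
    for path in folder.iterdir():
        match = PART_PATTERN.match(path.name)
        if match:
            parts.append((int(match.group(1)), path))

    # Sort by the numeric index so part_010 comes after part_002.
    parts.sort(key=lambda item: item[0])

    texts = [path.read_text(encoding="utf-8").strip() for _, path in parts]
    output = folder / output_name
    # Skip empty parts; one transcript segment per line.
    output.write_text("\n".join(t for t in texts if t) + "\n", encoding="utf-8")
    return output
```

Sorting on the parsed integer rather than the file name keeps the order correct even if the zero padding is inconsistent. Files that do not match the pattern, including an existing `transcript.txt`, are ignored.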