Fix #36: New Chunk Transcription Mode #37

Merged
SaschaFuksa merged 4 commits from batch/36 into master 2026-04-21 14:09:04 +00:00
Owner

Fixes #36

Implements Chunk Transcription Mode with configurable overlap_time. Each chunk is transcribed immediately after recording, with overlap from the previous chunk to preserve context. Part transcripts are merged into transcript.txt after all chunks are done. full_audio.wav is skipped in this mode.
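The merge step described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `merge_part_transcripts` and its parameters are hypothetical, and it assumes the part files use the zero-padded `part_NNN.txt` naming so that a plain string sort yields recording order. It does not try to remove text that is repeated because of the overlap between chunks.

```python
from pathlib import Path


def merge_part_transcripts(session_dir: Path, output_name: str = "transcript.txt") -> Path:
    """Concatenate part_*.txt files (sorted by name) into a single transcript file."""
    # Zero-padded names (part_000.txt, part_001.txt, ...) sort correctly as strings.
    part_files = sorted(session_dir.glob("part_*.txt"))
    texts = [p.read_text(encoding="utf-8").strip() for p in part_files]
    # Skip empty parts so silent chunks do not leave blank lines behind.
    merged = "\n".join(t for t in texts if t)
    output_path = session_dir / output_name
    output_path.write_text(merged, encoding="utf-8")
    return output_path
```

Calling `merge_part_transcripts(Path("some/session/dir"))` writes `transcript.txt` next to the part files and returns its path.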

feat: implement Chunk Transcription Mode (fixes #36)
Some checks are pending
CI / lint (pull_request) Waiting to run
CI / test (pull_request) Waiting to run
0e8bc71074
- Add chunk_transcription_mode (bool, default False) and overlap_time
  (int seconds, default 5) to settings.yaml and Pydantic AppSettings model
- RecordingSession gains chunk_transcription_mode, overlap_time,
  part_transcripts, prev_chunk_wav fields
- RecordingManager gains set_whisper_service() for background transcription
- When chunk_transcription_mode=True: after each chunk is written, it is
  immediately converted to WAV and transcribed via ThreadPoolExecutor
- Each chunk after the first prepends the last overlap_time seconds of the
  previous chunk's WAV audio (for context preservation)
- Transcript parts saved as part_000.txt, part_001.txt, etc.
- On stop: merge_chunks waits for all transcriptions and concatenates
  part_*.txt into transcript.txt; full_audio.wav is not created
- Normal mode (chunk_transcription_mode=False) unchanged
- New src/config/models.py with Pydantic settings models
- Updated /record/start and /record/stop routes to use new config
- Fix unused variable lint errors (transcript, transcript_dispatch)
- Fix test fixture: chunks_base_dir -> chunks_base
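The overlap handling from the commit message (prepending the last overlap_time seconds of the previous chunk's WAV) might look roughly like the sketch below. It uses only the standard-library `wave` module; the function name `prepend_overlap` and its signature are assumptions made for illustration, and the actual implementation in this PR may differ (for example, it may go through an external converter instead).

```python
import wave
from pathlib import Path


def prepend_overlap(prev_wav: Path, chunk_wav: Path, out_wav: Path, overlap_time: int = 5) -> None:
    """Write out_wav as: last overlap_time seconds of prev_wav, then all of chunk_wav."""
    with wave.open(str(prev_wav), "rb") as prev:
        params = prev.getparams()
        # Never ask for more frames than the previous chunk actually has.
        tail_frames = min(prev.getnframes(), overlap_time * prev.getframerate())
        prev.setpos(prev.getnframes() - tail_frames)
        tail = prev.readframes(tail_frames)

    with wave.open(str(chunk_wav), "rb") as cur:
        # Raw concatenation is only valid if channels, sample width and rate match.
        if cur.getparams()[:3] != params[:3]:
            raise ValueError("WAV formats differ; resample before concatenating")
        body = cur.readframes(cur.getnframes())

    with wave.open(str(out_wav), "wb") as out:
        out.setnchannels(params.nchannels)
        out.setsampwidth(params.sampwidth)
        out.setframerate(params.framerate)
        out.writeframes(tail + body)
```

With overlap_time=5, a 30-second chunk would be transcribed as up to 35 seconds of audio, so the transcript of each part after the first can repeat a few words from the end of the previous part.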
Tasch force-pushed batch/36 from 0e8bc71074 to 5c05a98063 2026-04-21 07:20:30 +00:00
@ -134,2 +145,4 @@
raise HTTPException(status_code=400, detail="Cannot add chunk to inactive session")
# In chunk_transcription_mode, trigger immediate transcription of this chunk
if chunk_mode and result.chunk_count > 0:
Owner

VERY BAD! Too long, bad code quality! Refactor it and make it more granular: USE CLASSES AND FUNCTIONS! DON'T BE A BAD/LAZY DEV!
SaschaFuksa marked this conversation as resolved
@ -173,2 +271,2 @@
transcript_path.write_text(transcription_text)
logger.debug(f"Saved transcript to {transcript_path}")
if chunk_mode:
Owner

MAYBE USE CLASSES!? EVER HEARD OF CLASSES!? if chunk_mode: do ChunkTranscription ... or do AudioFileTranscription ...

SaschaFuksa marked this conversation as resolved
@ -413,0 +435,4 @@
return None
def _transcribe_chunk_to_part_file(
self,
Owner

Ever heard of clean code!? I guess not! Good functions are fine-grained and easy to test!
SaschaFuksa marked this conversation as resolved
Refactor: extract TranscriptionMode classes and AudioFileMerger
0b18bd6c0f
- Added TranscriptionMode protocol with ChunkTranscriptionMode,
  AudioFileTranscriptionMode, ChunkFolderTranscriptionMode subclasses
- AudioFileMerger class handles WebM header patching, chunk decoding,
  and WAV concatenation (cleaner separation of concerns)
- Broke up long methods in recording.py into smaller testable functions
- Added helper functions: run_transcription, save_transcript_and_dispatch
- Routes now use proper class-based transcription modes instead of
  inline if/else chains
- _get_structured extracted as standalone function to avoid duplication
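The class-based structure described in this commit could be sketched along these lines. Only the names TranscriptionMode, ChunkTranscriptionMode, AudioFileTranscriptionMode and run_transcription come from the PR; the method name `transcribe`, the `Transcriber` callable standing in for the Whisper service, the `part_index` argument, and the plain-string return type (the real helper returns a TranscriptionResult) are assumptions for illustration.

```python
from pathlib import Path
from typing import Callable, Optional, Protocol

# Hypothetical stand-in for the Whisper service: any callable mapping a path to text.
Transcriber = Callable[[Path], str]


class TranscriptionMode(Protocol):
    """Structural interface; the method name `transcribe` is an assumption."""

    def transcribe(self, audio_path: Path) -> Optional[str]: ...


class AudioFileTranscriptionMode:
    """Transcribes one complete audio file."""

    def __init__(self, transcriber: Transcriber) -> None:
        self._transcriber = transcriber

    def transcribe(self, audio_path: Path) -> Optional[str]:
        return self._transcriber(audio_path)


class ChunkTranscriptionMode:
    """Transcribes a single chunk and stores the text as a part file next to it."""

    def __init__(self, transcriber: Transcriber, part_index: int) -> None:
        self._transcriber = transcriber
        self._part_index = part_index

    def transcribe(self, audio_path: Path) -> Optional[str]:
        text = self._transcriber(audio_path)
        part_path = audio_path.parent / f"part_{self._part_index:03d}.txt"
        part_path.write_text(text, encoding="utf-8")
        return text


def run_transcription(audio_path: Path, mode: TranscriptionMode) -> Optional[str]:
    """Depends only on the protocol, so the caller needs no if/else on the mode."""
    if not audio_path.exists():
        return None
    return mode.transcribe(audio_path)
```

A route would then pick the mode object once (for example based on chunk_transcription_mode) and hand it to run_transcription, which keeps each mode small enough to test in isolation with a fake transcriber.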
@ -56,2 +55,4 @@
# Request/Response models
# =============================================================================
class StartRecordingResponse(BaseModel):
Owner

How about adding a models.py and declaring the data classes there?
SaschaFuksa marked this conversation as resolved
@ -243,4 +84,1 @@
# --- Chunks management endpoints ---
class ChunkFoldersResponse(BaseModel):
Owner

How about adding a models.py and declaring the data classes there?
SaschaFuksa marked this conversation as resolved
@ -281,33 +119,289 @@ class SendToRepoResponse(BaseModel):
error: Optional[str] = None
class OutputConfig(BaseModel):
Owner

How about adding a models.py and declaring the data classes there?
SaschaFuksa marked this conversation as resolved
@ -308,0 +207,4 @@
)
class ChunkFolderTranscriptionMode:
Owner

This should be the same as AudioFileTranscriptionMode, because the audio file (WAV) already exists? I see only two modes, so please explain why we need this one.
SaschaFuksa marked this conversation as resolved
@ -308,0 +243,4 @@
# Transcription helpers
# =============================================================================
def run_transcription(audio_path: Path, mode: TranscriptionMode) -> Optional[TranscriptionResult]:
Owner

Please add a transcript_helper.py and move the helper functions into it.
SaschaFuksa marked this conversation as resolved
@ -515,0 +614,4 @@
return safe[:100].strip() or "structured"
def _extract_headings_for_filename(md_content: str) -> tuple[str | None, str | None, str | None]:
Owner

Should be part of file_handler?

SaschaFuksa marked this conversation as resolved
@ -515,0 +644,4 @@
return h1, h2, h3
def _safe_filename_part(text: str) -> str:
Owner

Should be part of file_handler?

SaschaFuksa marked this conversation as resolved
@ -515,0 +655,4 @@
h1: str | None,
h2: str | None,
h3: str | None,
) -> str:
Owner

Should be part of file_handler?

SaschaFuksa marked this conversation as resolved
@ -515,0 +667,4 @@
return "-".join(parts)
def _build_structured_filename(folder_name: str, md_content: str) -> str:
Owner

Should be part of file_handler?

SaschaFuksa marked this conversation as resolved
Refactor: extract models, file_handler and transcript_helper from routes.py
95e8a601ad
- Move Pydantic data models (StartRecordingResponse, ChunkFoldersResponse, etc.)
  to src/api/models.py
- Move filename/heading extraction helpers to src/api/file_handler.py
- Move TranscriptionMode classes and helpers to src/api/transcript_helper.py
- Update routes.py to import from new modules
- Add docstring to ChunkFolderTranscriptionMode explaining its purpose
  vs AudioFileTranscriptionMode (persistence + session_id context)
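For orientation, the filename and heading helpers moved to src/api/file_handler.py might work roughly as sketched below. The diff excerpts only show their signatures and a few return statements, so everything here is an assumption: the public names (without the leading underscore), the regular expressions, and the final `<folder>_<headings>.md` filename pattern are hypothetical. The sketch also does not skip `#` lines inside fenced code blocks.

```python
import re
from typing import Optional


def extract_headings(md_content: str) -> tuple[Optional[str], Optional[str], Optional[str]]:
    """Return the first level-1, level-2 and level-3 ATX heading texts, if present."""
    found: dict[int, str] = {}
    for line in md_content.splitlines():
        match = re.match(r"^(#{1,3})\s+(.+?)\s*$", line)
        if match and len(match.group(1)) not in found:
            found[len(match.group(1))] = match.group(2)
    return found.get(1), found.get(2), found.get(3)


def safe_filename_part(text: str) -> str:
    """Replace runs of characters outside [A-Za-z0-9_-] with a single underscore."""
    return re.sub(r"[^A-Za-z0-9_-]+", "_", text).strip("_")


def build_structured_filename(folder_name: str, md_content: str) -> str:
    """Join the sanitized headings with '-' and fall back to 'structured' if none exist."""
    parts = [safe_filename_part(h) for h in extract_headings(md_content) if h]
    parts = [p for p in parts if p]
    stem = "-".join(parts) or "structured"
    return f"{folder_name}_{stem}.md"
```

Keeping these as pure string functions in their own module means they can be unit-tested without starting the API or touching the filesystem.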
Tasch removed their assignment 2026-04-21 13:08:56 +00:00
Tasch left a comment
Author
Owner

Ready for re-review

Merge remote-tracking branch 'origin/master' into batch/36
Some checks failed
CI / lint (pull_request) Has been cancelled
CI / test (pull_request) Has been cancelled
8f8a2a571e
SaschaFuksa deleted branch batch/36 2026-04-21 14:09:04 +00:00
Reference
Tasch/whisper-transcribe!37