Fix #35: Add hhmmss-based unique filename for structured.md send-to-repo #38

Tasch · 2026-04-21T05:27:06Z

Tasch commented

2026-04-21 05:27:06 +00:00

Fix #35: New naming conventions for *.md file

When sending to a repository, we now include the hhmmss timestamp in the filename to prevent overwrites when the same book title appears in multiple audio sessions.

Changes

extract_headings_for_filename(): Extracts H1/H2/H3, respecting the start-of-document rules (H1 only at the document start, H2 only at the document start or after H1, H3 only at the document start or after H2)
build_filename_from_headings(): Builds the filename as hhmmss-h1[-h2[-h3]].md
_build_structured_filename(): Combines the time component of folder_name (hhmmss from yyyymmdd_hhmmss) with the headings
send_structured_to_repo(): Now uses _build_structured_filename() instead of the plain extract_title_from_markdown()
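
For reviewers who want the heading rules spelled out, here is a minimal sketch of how they could be applied. It is not the code in this PR: the name `extract_headings_sketch`, the regex, and the choices to skip blank lines and to stop at the first body line are all assumptions made for illustration.

```python
import re

# Matches "# Title", "## Title" or "### Title" (levels 1-3 only).
HEADING_RE = re.compile(r"^(#{1,3})\s+(.*\S)\s*$")


def extract_headings_sketch(lines):
    """Return (h1, h2, h3) from the leading heading block of a document.

    A heading is accepted only if nothing came before it (document start)
    or if the previous accepted heading is exactly one level above it.
    """
    found = {1: None, 2: None, 3: None}
    last_level = 0  # 0 means "nothing seen yet", i.e. document start
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines do not break the heading chain
        match = HEADING_RE.match(line)
        if match is None:
            break  # assumption: the first body line ends the heading block
        level = len(match.group(1))
        if last_level not in (0, level - 1):
            break  # heading is not at doc start and not directly below its parent
        found[level] = match.group(2)
        last_level = level
    return found[1], found[2], found[3]


doc = [
    "# Praxiseinstieg Machine Learning",
    "",
    "## Kapitel 3: Klassifikation",
    "",
    "### Absatz: Klassifikatoren mit mehreren Kategorien",
]
print(extract_headings_sketch(doc))
```

A document that opens with an H2 followed by an H3 yields `(None, "…", "…")` under this reading, while an H3 directly after an H1 is ignored.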

Example

Folder: 20260418_162032
MD content starts with:

# Praxiseinstieg Machine Learning

## Kapitel 3: Klassifikation

### Absatz: Klassifikatoren mit mehreren Kategorien

Old filename: Praxiseinstieg Machine Learning.md (would overwrite)
New filename: 162032-Praxiseinstieg Machine Learning-Kapitel 3 Klassifikation-Absatz Klassifikatoren mit mehreren Kategorien.md
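
The example above can be reproduced with a short sketch. The helper names `safe_part` and `build_structured_stem` are hypothetical stand-ins, and the set of stripped characters is an assumption; the PR only shows that the colon disappears from the headings. The sketch returns the name without extension and appends `.md` at the call site.

```python
import re


def safe_part(text: str) -> str:
    """Hypothetical stand-in for _safe_filename_part().

    Assumption: characters that are invalid in common filesystems are
    dropped and runs of whitespace are collapsed to a single space.
    """
    cleaned = re.sub(r'[\\/:*?"<>|]', "", text)
    return re.sub(r"\s+", " ", cleaned).strip()


def build_structured_stem(folder_name: str, headings) -> str:
    """Join the hhmmss part of a yyyymmdd_hhmmss folder name with the headings."""
    hhmmss = folder_name.split("_", 1)[1]
    parts = [hhmmss] + [safe_part(h) for h in headings if h]
    return "-".join(parts)


stem = build_structured_stem(
    "20260418_162032",
    (
        "Praxiseinstieg Machine Learning",
        "Kapitel 3: Klassifikation",
        "Absatz: Klassifikatoren mit mehreren Kategorien",
    ),
)
print(stem + ".md")
```

Headings that are `None` are skipped, so a document with only an H1 produces `hhmmss-h1`.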

Closes #35
Reviewer: @SaschaFuksa


Tasch added 1 commit

2026-04-21 05:27:06 +00:00

Fix #35 : Add hhmmss-based filename for structured.md send-to-repo

CI / lint (pull_request) Waiting to run
CI / test (pull_request) Waiting to run

9900e3bf0d

- Add extract_headings_for_filename() respecting start-of-document rules
  for H1, H2, H3 (only use heading if valid at document start per issue #35)
- Add build_filename_from_headings() to construct hhmmss-h1-h2-h3.md
- Add _safe_filename_part() for safe filename characters
- Add _build_structured_filename() to build full filename from folder_name
  time component and md headings
- Update send_structured_to_repo() to use new filename builder instead
  of simple extract_title_from_markdown

This ensures every send-to-repo operation creates a unique filename
even when the same book title appears in multiple sessions.

Tasch added 1 commit

2026-04-21 05:29:56 +00:00

Fix #35 : Remove duplicate .md extension in structured filename

CI / lint (pull_request) Waiting to run
CI / test (pull_request) Waiting to run

5e3d9afe2b

The build_filename_from_headings function was adding .md, but the
path_template already includes .md, causing double extension.
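
To illustrate the double extension, here is a small sketch. The template string and the function `render_path` are hypothetical, since the real `path_template` is not shown in this PR. The commit fixes the problem by no longer appending `.md` in the builder; the `removesuffix` guard below is only a defensive alternative for illustration.

```python
# Hypothetical template; assumed to already end in ".md" as the commit describes.
PATH_TEMPLATE = "notes/{filename}.md"


def render_path(filename: str) -> str:
    """Fill the template, tolerating a filename that already ends in .md."""
    stem = filename.removesuffix(".md")  # str.removesuffix needs Python 3.9+
    return PATH_TEMPLATE.format(filename=stem)


# What happened before the fix: the builder's ".md" met the template's ".md".
broken = PATH_TEMPLATE.format(filename="162032-Book.md")
print(broken)
print(render_path("162032-Book.md"))
```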

SaschaFuksa requested changes

2026-04-21 11:21:12 +00:00

src/api/routes.py Outdated

					
@@ -31,2 +31,4 @@
    llm_config = app_config.get("llm", {})
    git_notes_config = app_config.get("git_notes", {})
    chunk_transcription_config = app_config.get("chunk_transcription", {})
    chunk_transcription_enabled_default = chunk_transcription_config.get("enabled", False)

THIS IS NOT PART OF THIS PR!? REMOVE IT!

src/api/routes.py Outdated

					
@@ -58,6 +61,7 @@ class StartRecordingResponse(BaseModel):
    session_id: str
    message: str
    chunk_duration_sec: int
    chunk_transcription_enabled: bool = False

NOT PART OF THIS PR, REMOVE!

src/api/routes.py Outdated

					
@@ -65,6 +69,7 @@ class StopRecordingResponse(BaseModel):
    status: str
    chunks_count: int
    transcription: Optional[str] = None
    chunk_transcription_mode: bool = False

NOT PART OF THIS PR, REMOVE!

src/api/routes.py

					
@@ -299,0 +338,4 @@
    h3: str | None = None
    line_idx = 0
    for line in lines:

Very bad method design! Too long, very bad! Don't write bad code! And `for line in lines:` WILL go through the whole document! The requirement is to look only at the START of the document (first ~7 lines).

Tasch referenced this pull request from a commit

2026-04-21 11:52:56 +00:00

Fix #35: Remove chunk_transcription feature (not part of this PR) and simplify extract_headings_for_filename

Tasch added 1 commit

2026-04-21 11:52:56 +00:00

Fix #35 : Remove chunk_transcription feature (not part of this PR) and simplify extract_headings_for_filename

CI / lint (pull_request) Has been cancelled
CI / test (pull_request) Has been cancelled

a2093dcde5

SaschaFuksa review feedback:
- Remove chunk_transcription config loading and response fields (not part of PR #38)
- Remove chunk_transcription and overlap_time_sec params from start_recording
- Simplify extract_headings_for_filename to scan only first 7 lines instead of whole doc
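
One way to restrict the scan to the start of the document, sketched under assumptions: the helper name `head_lines` and the constant `MAX_HEADING_LINES` are hypothetical, and the value 7 is taken from the review comment. Reading through `io.StringIO` with `itertools.islice` stops after the limit instead of splitting the whole text into lines first.

```python
import io
from itertools import islice

MAX_HEADING_LINES = 7  # from the review comment ("first ~7 lines")


def head_lines(markdown: str, limit: int = MAX_HEADING_LINES) -> list[str]:
    """Return at most `limit` leading lines of the document."""
    # Iterating a StringIO yields one line at a time; islice stops early,
    # so the rest of the document is never split into lines.
    return [line.rstrip("\r\n") for line in islice(io.StringIO(markdown), limit)]


long_doc = "# Title\n\n## Chapter\n" + "body text\n" * 1000
print(head_lines(long_doc))
```

The heading extraction can then iterate over this short list, which keeps the function small and bounds its work regardless of document length.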

Tasch requested review from SaschaFuksa

2026-04-21 11:56:17 +00:00

SaschaFuksa merged commit 03a6927144 into master

2026-04-21 11:59:51 +00:00

SaschaFuksa deleted branch batch/35

2026-04-21 11:59:52 +00:00

SaschaFuksa referenced this pull request from a commit

2026-04-21 11:59:54 +00:00

Merge pull request 'Fix #35: Add hhmmss-based unique filename for structured.md send-to-repo' (#38) from batch/35 into master
