fix: ensure last audio chunk is sent before stopping MediaRecorder #3

Merged
SaschaFuksa merged 17 commits from fix/last-chunk-not-sent into master 2026-04-18 13:46:09 +00:00

Fixes #N/A - The last audio chunk was not being sent to the server before the MediaRecorder stopped, resulting in incomplete transcriptions.

Problem

When stopRecording() was called, the ondataavailable handler was attached AFTER requestData() had already been called. If the browser fired ondataavailable synchronously, the handler missed the final chunk.

Solution

  • Set ondataavailable BEFORE calling requestData()
  • Track pendingSends and wait for final chunk send to complete
  • 3-second safety timeout to prevent hanging
fix: ensure last audio chunk is sent before stopping MediaRecorder
Some checks are pending
CI / lint (pull_request) Waiting to run
CI / test (pull_request) Waiting to run
2b41804c91
MediaRecorder.ondataavailable handler must be set BEFORE calling
requestData() to catch synchronously-fired events. Add pendingSends
tracking to wait for the final chunk's send to complete before
calling stop(). Add 3-second safety timeout to prevent hanging.
debug: add version footer, chunk send logging, and live chunk counter
4a292938c2
- Show git version/timestamp in footer
- Log chunk sends to console with size and server chunk count
- Display live chunk counter during recording
- Reset debug state on new recording session
fix: collect final chunk before stop + ignore empty chunks on server
79f694e4e7
Client (ui/index.html):
- collectFinalChunk(): set ondataavailable BEFORE requestData()
- Handler catches BOTH requestData() flush AND stop() final fire
- Console logs at every step for debugging
- Version footer showing whisper availability

Server (recording.py):
- add_chunk(): ignore zero-byte audio data
- Prevents empty .webm files from corrupting the WAV merge

Fixes: the second chunk was empty because requestData() fires ondataavailable
with an empty blob when no new data is buffered, and the old code did not handle this.
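A minimal sketch of the server-side guard described above, assuming a simple session object that writes each chunk to its own .webm file. The class name `RecordingSession`, the `chunk_N.webm` file naming, and the return value are hypothetical; only the behaviour "ignore zero-byte audio data in add_chunk()" comes from the commit message.

```python
import tempfile
from pathlib import Path


class RecordingSession:
    """Hypothetical stand-in for the session object in recording.py."""

    def __init__(self, directory: Path) -> None:
        self.directory = directory
        self.chunk_paths: list[Path] = []

    def add_chunk(self, data: bytes) -> int:
        """Store one audio chunk and return how many chunks are kept so far."""
        if not data:
            # Zero-byte payload: skip it so no empty .webm file reaches the merge.
            return len(self.chunk_paths)
        path = self.directory / f"chunk_{len(self.chunk_paths)}.webm"
        path.write_bytes(data)
        self.chunk_paths.append(path)
        return len(self.chunk_paths)


with tempfile.TemporaryDirectory() as tmp:
    session = RecordingSession(Path(tmp))
    session.add_chunk(b"\x1a\x45\xdf\xa3 audio bytes")
    session.add_chunk(b"")  # the empty blob is dropped
    print(len(session.chunk_paths))  # prints 1
```

Dropping the empty chunk at ingestion keeps the chunk numbering on disk contiguous, so the merge step never has to special-case empty files.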
fix: wait for all final chunks before calling mediaRecorder.stop()
43a74dc245
CRITICAL FIX: ondataavailable handler must stay in place until ALL final
chunks are sent. Previously the handler was restored before calling stop(),
which caused the final ondataavailable (fired by stop() itself) to go to
the OLD handler — the audio was never sent.

New stopMediaRecorder() flow:
1. Set ondataavailable to intercept final chunks
2. Call requestData() to flush buffer → fires ondataavailable
3. Each intercepted chunk is await-sent to server
4. tryStop() only calls mediaRecorder.stop() when pendingSends == 0
5. Only then does onstop fire → stopSession() → server processes

This ensures the audio captured between last timeslice and stop button
press is actually sent to the server.
fix: let browser handle final chunk collection via onstop callback
3280716896
PLAN: User presses Stop → requestData() + stop() → browser fires
ondataavailable with final audio → onstop fires AFTER all dataavailable
events → stopSession() is called in onstop callback.

This relies on the browser guaranteeing onstop fires after all
ondataavailable events for the recorder are processed.
fix: wait 100ms after requestData() before stop() to capture final audio
14802e5536
The key insight from MDN: stop() is preceded by dataavailable, but
dataavailable fires asynchronously. Added 100ms delay after
requestData() before calling stop() to let the event handler run.
Also added comprehensive [ONDATA], [CHUNK], [SEND] logging.
feat: add version info to /health endpoint and build-time git data
81aeba323e
Add GIT_BRANCH/GIT_COMMIT build args to Dockerfile, write
version.json at build time. Update /health endpoint to serve
version info (branch + commit). UI already fetches and displays it.
fix: wait for all pendingSends in onstop before calling stopSession
477d6477e2
The 400 on chunk 3 happened because stopSession() was called before
the final chunk finished sending. Now onstop waits for all pending
sends to complete before calling stopSession() API.
fix: move WORKDIR before version.json creation in Dockerfile
ef9d8fc79e
Also add build.sh wrapper that auto-detects git branch/commit
and passes them to docker-compose build.
fix: remove requestData() call to prevent empty 3rd chunk
7b09f3b966
The previous code called requestData() before stop() to capture the
buffered audio between the last timeslice and the stop button click.
However, requestData() creates a NEW blob and the browser continues
recording into it. When stop() was called ~150ms later, ondataavailable
fired with only milliseconds of audio from the new blob = EMPTY chunk.

The fix: stop() itself delivers ALL data accumulated since the last
ondataavailable event. By calling only stop() (no requestData()), we
get exactly 2 chunks: 30s (timeslice) + 10s (final buffer) = 40s.

Also adds version.sh that regenerates version.json at container start
using the mounted .git, for accurate branch/commit display in footer.
fix: generate version.json in Python at startup instead of version.sh
4b292b790e
The version.sh approach required mounting .git and risked cache issues
on Docker Desktop. Instead, version.json is now generated directly in
Python (main.py) on app startup using git rev-parse if .git is available.

This is simpler and more portable - no entrypoint script needed.
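A rough sketch of what generating version.json at startup could look like, assuming the file holds a `branch` and a `commit` field and falls back to "unknown" when no .git directory is present. The function names, the JSON keys, and the fallback value are assumptions; the two `git rev-parse` invocations are standard git commands.

```python
import json
import subprocess
import tempfile
from pathlib import Path
from typing import Optional


def _git(repo: Path, *args: str) -> Optional[str]:
    """Run a git command in `repo`; return its trimmed output or None on failure."""
    try:
        result = subprocess.run(
            ["git", "-C", str(repo), *args],
            capture_output=True, text=True, check=True, timeout=5,
        )
    except (OSError, subprocess.SubprocessError):
        # git is missing, timed out, or exited non-zero.
        return None
    return result.stdout.strip() or None


def write_version_json(repo: Path, target: Path) -> dict:
    """Write branch/commit info to `target` (hypothetical layout) and return it."""
    info = {"branch": "unknown", "commit": "unknown"}
    if (repo / ".git").exists():
        info["branch"] = _git(repo, "rev-parse", "--abbrev-ref", "HEAD") or "unknown"
        info["commit"] = _git(repo, "rev-parse", "--short", "HEAD") or "unknown"
    target.write_text(json.dumps(info))
    return info


with tempfile.TemporaryDirectory() as tmp:
    # No .git in a fresh temp directory, so both fields stay "unknown".
    print(write_version_json(Path(tmp), Path(tmp) / "version.json"))
```

Catching `OSError` as well as `subprocess.SubprocessError` means a container without git installed still starts, just without version details.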
fix: guard ondataavailable to ignore spurious post-stop events + better merge debugging
728614e68b
Client (ui/index.html):
- Add stopWasCalled guard to ignore ondataavailable events that fire
  AFTER stop() was called. Chrome can fire multiple ondataavailable
  events from stop(), causing spurious tiny chunks (CHUNK 3 with 3KB).
- Fix chunk counter: use ++chunkNum BEFORE sendChunk so the log
  correctly reflects the actual chunk number sent.
- stopWasCalled flag is set BEFORE calling mediaRecorder.stop().

Server (src/audio/recording.py):
- Add debug logging: print each chunk file path + size before merge.
- Log the concat_list.txt content before ffmpeg runs.
- Log each chunk transcode result in fallback path.
- Log final merged WAV size.
- This helps diagnose why merged WAV might be empty/silent.

Build (Dockerfile + docker-compose.yml):
- Add generate-version.sh to write version.json at build time.
- GIT_BRANCH and GIT_COMMIT passed as docker-compose build args.
- version.json created during image build, not at runtime.
- main.py _ensure_version_json only touches version.json if
  /app/.git is deliberately mounted (dev mode).

Main (src/main.py):
- Simplify _ensure_version_json: only regenerate if /app/.git mounted.
- Otherwise keep build-time version.json as-is.
fix: auto-detect git branch/commit at Docker build time — zero manual setup
1f277d9cad
The .git directory is included in the Docker build context (no .dockerignore
excluding it). The Dockerfile now reads git info directly during build:

  git rev-parse --abbrev-ref HEAD  → branch
  git rev-parse --short HEAD       → commit

version.json is written once at build time and baked into the image.
No build args, no .env file, no generate-version.sh script needed.
The user just runs 'docker-compose up --build' and it works automatically.

Also removes the now-unnecessary generate-version.sh and .env file.
fix: use ffmpeg filter_complex concat instead of concat demuxer
77b971d1e3
The concat demuxer fails to properly merge WebM/Opus chunks because
each chunk starts at timestamp 0 instead of continuing from the previous
chunk. filter_complex concat handles this by decoding all inputs and
re-encoding them with continuous timestamps.

Also add per-chunk duration logging so we can verify each chunk
has actual audio content.
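For illustration, the filter_complex concat call could be assembled roughly like this. The function name and the file names are hypothetical, and the real recording.py may pass additional output options; the `concat=n=…:v=0:a=1` filter syntax is standard ffmpeg.

```python
from pathlib import Path
from typing import Sequence


def build_filter_concat_command(chunks: Sequence[Path], output: Path) -> list[str]:
    """Build an ffmpeg command that joins the audio of all chunks via the concat filter."""
    if not chunks:
        raise ValueError("at least one chunk is required")
    cmd = ["ffmpeg", "-y"]
    for chunk in chunks:
        cmd += ["-i", str(chunk)]
    # One [i:a] label per input, then concat with n inputs, 0 video and 1 audio stream.
    inputs = "".join(f"[{i}:a]" for i in range(len(chunks)))
    graph = f"{inputs}concat=n={len(chunks)}:v=0:a=1[out]"
    cmd += ["-filter_complex", graph, "-map", "[out]", str(output)]
    return cmd


print(" ".join(build_filter_concat_command(
    [Path("chunk_0.webm"), Path("chunk_1.webm")], Path("merged.wav"))))
```

Because the filter works on decoded audio, the per-chunk timestamps no longer matter, at the cost of decoding every input.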
debug: log first 4 bytes of each chunk to verify WebM header
16fbac6328
fix: robust merge - decode each chunk individually before concat
f9653e34fc
Previous approach used filter_complex concat which can fail if chunks
have WebM timestamp issues. New approach:
1. Decode each chunk individually to PCM WAV (16kHz mono)
2. Log detailed errors if a chunk fails to decode
3. Try alternative decode flags if first attempt fails
4. Concatenate decoded WAVs with concat demuxer
5. If concat fails, fall back to first decoded chunk

Also added detailed per-chunk decode status logging so we can see
exactly which chunk fails and why.
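The numbered approach above can be sketched as follows, under several assumptions: the helper names, the `decoded_N.wav` and `merged.wav` file names, and the injectable `run` callback are hypothetical, and step 3 (alternative decode flags) is left out because the commit message does not say which flags are tried. The ffmpeg options used (`-ar`, `-ac`, `-c:a pcm_s16le`, `-f concat -safe 0`, `-c copy`) are standard.

```python
import subprocess
from pathlib import Path
from typing import Callable, Optional, Sequence


def decode_command(chunk: Path, wav: Path) -> list[str]:
    """ffmpeg command decoding one chunk to 16 kHz mono PCM WAV."""
    return ["ffmpeg", "-y", "-i", str(chunk),
            "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", str(wav)]


def concat_list_text(wavs: Sequence[Path]) -> str:
    """Content of the list file read by the concat demuxer."""
    return "".join(f"file '{wav.as_posix()}'\n" for wav in wavs)


def concat_command(list_file: Path, output: Path) -> list[str]:
    """ffmpeg command joining the decoded WAVs without re-encoding."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", str(list_file), "-c", "copy", str(output)]


def run_ffmpeg(cmd: list[str]) -> bool:
    """Run a command and report success; a missing ffmpeg binary counts as failure."""
    try:
        return subprocess.run(cmd, capture_output=True).returncode == 0
    except OSError:
        return False


def merge_chunks(chunks: Sequence[Path], workdir: Path,
                 run: Callable[[list[str]], bool] = run_ffmpeg) -> Optional[Path]:
    """Decode each chunk, concatenate the results, fall back to the first decoded chunk."""
    decoded = []
    for index, chunk in enumerate(chunks):
        wav = workdir / f"decoded_{index}.wav"
        if run(decode_command(chunk, wav)):
            decoded.append(wav)
        else:
            print(f"chunk {index} failed to decode: {chunk}")
    if not decoded:
        return None
    list_file = workdir / "concat_list.txt"
    list_file.write_text(concat_list_text(decoded))
    merged = workdir / "merged.wav"
    if run(concat_command(list_file, merged)):
        return merged
    return decoded[0]  # step 5: concat failed, keep at least the first chunk
```

Passing `run` as a parameter keeps the ordering and fallback logic testable without ffmpeg installed; in normal use the default `run_ffmpeg` is enough.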
fix: prepend WebM header to subsequent chunks that are missing it
8cd331ab34
Chrome-based browsers include the WebM EBML header (1a45dfa3) only in the first
chunk. Chunks after the first timeslice are continuation data without a container
header, so they cannot be decoded on their own.

Example from debug log:
- chunk_0 first4=1a45dfa3 (valid WebM header)  
- chunk_1 first4=43c38172 (INVALID - raw Opus, no WebM container)

The fix:
1. Extract the WebM header from chunk_0 (first 4096 bytes contain EBML + Segment + Tracks)
2. For each subsequent chunk that doesn't start with 1a45dfa3, prepend the header
3. Then decode the patched chunk to PCM WAV

Also adds _patch_chunk_with_header() and _extract_webm_header() helper methods
and improved debug logging with per-chunk first4 bytes.
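A simplified sketch of the two helpers, written here as plain functions rather than methods. The 4096-byte header size and the 1a45dfa3 magic come from the commit message; the exact signatures and the error handling are assumptions. A fixed-size prefix is a heuristic: it may cut off in the middle of an element or include the start of the audio data, depending on how large the real header is.

```python
EBML_MAGIC = bytes.fromhex("1a45dfa3")  # first four bytes of a WebM/Matroska file
HEADER_BYTES = 4096                     # prefix size quoted in the commit message


def extract_webm_header(first_chunk: bytes, size: int = HEADER_BYTES) -> bytes:
    """Return the leading bytes of chunk_0, which hold EBML + Segment + Tracks."""
    if not first_chunk.startswith(EBML_MAGIC):
        raise ValueError("first chunk does not start with the EBML magic")
    return first_chunk[:size]


def patch_chunk_with_header(chunk: bytes, header: bytes) -> bytes:
    """Prepend the saved header unless the chunk already starts with one."""
    if chunk.startswith(EBML_MAGIC):
        return chunk
    return header + chunk


# Usage with toy data: chunk_1 mimics the 43c38172 prefix seen in the debug log.
chunk_0 = EBML_MAGIC + b"header-and-audio"
chunk_1 = bytes.fromhex("43c38172") + b"more-audio"
header = extract_webm_header(chunk_0, size=10)
print(patch_chunk_with_header(chunk_1, header)[:4].hex())  # prints 1a45dfa3
```

Checking the magic before patching makes the helper safe to apply to every chunk, including chunk_0.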
SaschaFuksa deleted branch fix/last-chunk-not-sent 2026-04-18 13:46:09 +00:00
Reference
Tasch/whisper-transcribe!3