Full-board opening segment. Captions or transcription should attach here.
Vacaville Parks and Recreation Commission
No embedded subtitle stream was detected in the MP4. The app should fall back to local audio transcription and store timestamped text spans.
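The fallback decision above can be sketched as a check over ffprobe's stream listing. This is a minimal illustration, not the app's actual code: the helper name `needs_transcription_fallback` is hypothetical, and it assumes the JSON produced by `ffprobe -v error -show_streams -of json input.mp4`.

```python
import json

def needs_transcription_fallback(ffprobe_json: str) -> bool:
    """True when the ffprobe stream listing contains no subtitle stream.

    Expects the JSON emitted by:
      ffprobe -v error -show_streams -of json input.mp4
    (helper name and fallback policy are an illustrative assumption)
    """
    streams = json.loads(ffprobe_json).get("streams", [])
    return not any(s.get("codec_type") == "subtitle" for s in streams)

# An MP4 with only video and audio streams triggers local transcription.
probe = '{"streams": [{"codec_type": "video"}, {"codec_type": "audio"}]}'
print(needs_transcription_fallback(probe))  # True
```

Keeping the decision as a pure function over the probe output makes it easy to test without media files on hand.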
Still board
Caption and transcript spans
Member close-up candidate. Link transcript spans to this still cluster.
Public speaker candidate. One representative still should be selected per turn.
Presentation section. OCR and transcript text should be searchable together.
Board returns from the presentation to open discussion.
Member discussion segment with repeated close-up view.
Board member statement candidate. Diarization can split speaker turns.
Public speaker or staff turn candidate near the end of the meeting window.
Files behind the broadcast desk
frames/5s/from_00-00-00_for_02-30-00/index.jsonl: 1,800 timestamped rows
reports/contact_sheet_5s_meeting_labeled.jpg: labeled contact sheet at a 5-second cadence across the meeting window
captures/capture_targets.json: full-board, member, and speaker-turn capture targets
transcripts/pass_v1.json: embedded captions first, with local transcription as fallback
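The row count is consistent with the window: 02:30:00 is 9,000 seconds, and 9,000 / 5 = 1,800 stills. A minimal reader for the JSON Lines index might look like the sketch below; the field names (`ts`, `path`) are an assumption about the index schema, not confirmed by these notes.

```python
import io
import json

def load_index_rows(fp):
    """Yield one dict per non-empty line of a JSON Lines frames index.

    Field names in the sample (ts, path) are assumed, not confirmed.
    """
    for line in fp:
        line = line.strip()
        if line:
            yield json.loads(line)

sample = io.StringIO(
    '{"ts": 0.0, "path": "frame_000000.jpg"}\n'
    '{"ts": 5.0, "path": "frame_000005.jpg"}\n'
)
rows = list(load_index_rows(sample))
print(len(rows))  # 2
```

Streaming line by line keeps memory flat even if the index grows well past 1,800 rows.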
Caption-to-still strategy
Use embedded captions when a subtitle stream exists.
If captions are absent, extract 16 kHz mono audio and run local speech-to-text.
Store transcript rows as start/end timestamp spans.
Associate each still with nearby transcript spans by timestamp.
Use transcript boundaries to tighten speaker-turn captures.
Keep human corrections separate from model output.
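The association step in the list above can be sketched as an interval-overlap query. The helper name `spans_near` and the 2.5-second padding (half the 5-second still cadence) are illustrative assumptions, not the app's actual parameters.

```python
from dataclasses import dataclass

@dataclass
class Span:
    start: float  # seconds from meeting start
    end: float
    text: str

def spans_near(still_ts: float, spans: list[Span], pad: float = 2.5) -> list[Span]:
    """Return transcript spans overlapping a window around a still's timestamp.

    pad=2.5 assumes a 5-second still cadence; both the name and the
    default are illustrative, not confirmed by the notes.
    """
    lo, hi = still_ts - pad, still_ts + pad
    return [s for s in spans if s.start < hi and s.end > lo]

spans = [Span(0.0, 2.0, "call to order"), Span(4.0, 9.0, "roll call")]
print([s.text for s in spans_near(5.0, spans)])  # ['roll call']
```

The same overlap test, run against diarized speaker turns instead of raw spans, is what lets transcript boundaries tighten the speaker-turn captures.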
Publish curated proxies, captions, transcripts, and promoted stills to S3 after local review.
Keep raw frames and experimental intermediates local unless they become durable product assets.
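The publish/keep-local split above can be expressed as a path filter run before any S3 sync. The glob patterns below are an assumption inferred from the artifact list, not a confirmed publishing manifest.

```python
from fnmatch import fnmatch

# Durable product assets eligible for S3 after local review.
# These globs are illustrative assumptions, not a confirmed manifest.
PUBLISH_PATTERNS = [
    "proxies/*.mp4",
    "captions/*.vtt",
    "transcripts/*.json",
    "stills/promoted/*.jpg",
]

def publishable(relpath: str) -> bool:
    """True if a reviewed local file should be synced to S3."""
    return any(fnmatch(relpath, pattern) for pattern in PUBLISH_PATTERNS)

print(publishable("transcripts/pass_v1.json"))  # True
print(publishable("frames/5s/raw/f_0001.jpg"))  # False (raw frames stay local)
```

Centralizing the patterns in one list keeps the local/published boundary auditable as the project grows.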