Generate ground-truth diarization data using Voice Activity Detection (VAD).
This script processes audio files using pyannote's VAD pipeline to create ground-truth diarization data in RTTM format. For each input audio file, it:
- Applies Voice Activity Detection to identify speech segments
- Labels the segments with the audio file's name
- Saves individual RTTM files for each input
- Contributes its segments to a combined RTTM file that mixes all inputs
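
The labeling and RTTM steps above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the speech segments are hard-coded stand-ins for what pyannote's VAD pipeline would return, and the helper names (`label_for`, `to_rttm_lines`), the channel value `1`, the `combined` file ID, and the onset-sorted merge are all assumptions made for the example.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def label_for(audio_path: str) -> str:
    """Hypothetical helper: derive a label from the audio file's name."""
    return Path(audio_path).stem


def to_rttm_lines(segments: Iterable[Segment], file_id: str, speaker: str) -> List[str]:
    """Format speech segments as RTTM SPEAKER records.

    Field order: type, file ID, channel, onset, duration,
    <NA>, <NA>, speaker name, <NA>, <NA>.
    """
    lines = []
    for start, end in segments:
        duration = end - start
        lines.append(
            f"SPEAKER {file_id} 1 {start:.3f} {duration:.3f} <NA> <NA> {speaker} <NA> <NA>"
        )
    return lines


# Stand-in for VAD output (assumption): in the real script these
# segments would come from pyannote's VAD pipeline.
vad_output: Dict[str, List[Segment]] = {
    "audio/alice.wav": [(0.0, 1.5), (2.0, 3.25)],
    "audio/bob.wav": [(0.5, 2.0)],
}

# One RTTM per input: the file's name serves as both file ID and speaker label.
per_file = {
    label_for(path): to_rttm_lines(segs, label_for(path), label_for(path))
    for path, segs in vad_output.items()
}

# Combined RTTM (assumed layout): a single file ID, one speaker per input,
# with records sorted by onset time.
tagged = sorted(
    (start, end, label_for(path))
    for path, segs in vad_output.items()
    for start, end in segs
)
combined = [
    line
    for start, end, speaker in tagged
    for line in to_rttm_lines([(start, end)], "combined", speaker)
]
```

Each list of lines can then be written out with, for example, `Path("alice.rttm").write_text("\n".join(per_file["alice"]) + "\n")`. Keeping the formatting separate from the VAD call makes the RTTM layout easy to check without loading a model.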