feat(transcribe): add speaker diarization support #188
Summary
pip install agent-cli[diarization]
New CLI Options
--diarize/--no-diarize
--diarize-format: inline (default) or json
--hf-token
--min-speakers
--max-speakers
Output Formats
Inline (default):
JSON:
{
  "segments": [
    {"speaker": "SPEAKER_00", "start": 0.0, "end": 2.5, "text": "Hello, how are you?"}
  ]
}
Usage Examples
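Consuming the JSON output from Python could look roughly like the sketch below. It assumes the output has already been captured as a string in the JSON format shown above; the helper name `talk_time_by_speaker` is hypothetical and not part of this PR.

```python
import json
from collections import defaultdict

# Sample payload copied from the JSON output format in this description.
raw = (
    '{"segments": [{"speaker": "SPEAKER_00", "start": 0.0, '
    '"end": 2.5, "text": "Hello, how are you?"}]}'
)


def talk_time_by_speaker(payload: str) -> dict[str, float]:
    """Sum segment durations (end - start, in seconds) per speaker label.

    Hypothetical helper for illustration only; it relies solely on the
    "segments", "speaker", "start" and "end" keys shown in the sample.
    """
    totals: dict[str, float] = defaultdict(float)
    for segment in json.loads(payload)["segments"]:
        totals[segment["speaker"]] += segment["end"] - segment["start"]
    return dict(totals)


print(talk_time_by_speaker(raw))  # {'SPEAKER_00': 2.5}
```

Only the keys visible in the sample are used, so the sketch should keep working if further fields are added to each segment later.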
Test plan
DiarizedSegment dataclass
align_transcript_with_speakers function
format_diarized_output (inline and JSON)
SpeakerDiarizer class with mocked pyannote