
Looks like this model doesn't do realtime diarization. What model should I use if I want that? So far I've only seen paid models do diarization well. I've heard about Nvidia NeMo but haven't tried it, and I don't even know where to try it out.


Not sure if it's "realtime", but the recently released VibeVoice-ASR from Microsoft does do diarization: https://huggingface.co/microsoft/VibeVoice-ASR



