
ASR systems already use language models during decoding, though mostly not large decoder-only LLMs. That said, incorporating LLMs into ASR is currently at the center of a lot of research, e.g. pairing a speech encoder such as wav2vec 2.0 or the Whisper encoder with a Q-Former-style adapter and a LoRA adapter on an LLM fine-tuned for ASR. A rough structural sketch of that recipe is below.
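To make the wiring concrete, here is a minimal sketch in plain PyTorch (no pretrained weights) of the general pattern: a frozen speech encoder, a Q-Former-style adapter whose learned query tokens cross-attend to the encoder's frame features, and a decoder-only LLM that consumes the resulting "audio tokens" as prefix embeddings. All module names, dimensions, and the stand-in encoder/LLM are illustrative assumptions; in practice the encoder would be wav2vec 2.0 or Whisper's encoder, the LLM a pretrained model with LoRA adapters (e.g. via the peft library), and the adapter the main newly trained component.

    import torch
    import torch.nn as nn

    class QFormerAdapter(nn.Module):
        """Learned query tokens cross-attend to speech-encoder frames,
        compressing a variable-length frame sequence into a fixed number
        of 'audio tokens' in the LLM's embedding space."""
        def __init__(self, n_queries=32, enc_dim=768, llm_dim=2048, n_heads=8):
            super().__init__()
            self.queries = nn.Parameter(torch.randn(n_queries, enc_dim) * 0.02)
            self.cross_attn = nn.MultiheadAttention(enc_dim, n_heads, batch_first=True)
            self.proj = nn.Linear(enc_dim, llm_dim)  # map into the LLM embedding space

        def forward(self, speech_feats):              # (B, T_frames, enc_dim)
            q = self.queries.unsqueeze(0).expand(speech_feats.size(0), -1, -1)
            attn_out, _ = self.cross_attn(q, speech_feats, speech_feats)
            return self.proj(attn_out)                # (B, n_queries, llm_dim)

    class SpeechLLMForASR(nn.Module):
        def __init__(self, speech_encoder, llm, adapter):
            super().__init__()
            self.speech_encoder = speech_encoder      # frozen wav2vec 2.0 / Whisper encoder
            self.llm = llm                            # decoder-only LLM (LoRA-adapted)
            self.adapter = adapter                    # trained from scratch
            for p in self.speech_encoder.parameters():
                p.requires_grad = False

        def forward(self, audio, text_embeds):
            with torch.no_grad():
                speech_feats = self.speech_encoder(audio)   # (B, T_frames, enc_dim)
            audio_tokens = self.adapter(speech_feats)       # (B, n_queries, llm_dim)
            # Prepend the audio tokens to the text-prompt embeddings and let the
            # LLM predict the transcript with its usual next-token objective.
            inputs = torch.cat([audio_tokens, text_embeds], dim=1)
            return self.llm(inputs)

During training only the adapter and the LoRA parameters are updated; the speech encoder and the base LLM weights stay frozen, which is what keeps this approach cheap relative to full fine-tuning.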


Really interested in this! Do you know of some good reading in this area?



