Good points, but I think we underestimate how much situational context humans us...

lutorm · on May 3, 2010

Context is huge for human interpretation. If you've ever had someone address you in a different language than you were expecting, you know what I mean. It's almost like you can imagine the search just going deeper and deeper without finding anything that makes sense until it swaps in the other language and goes: Ah, you said "good morning"! :-)

eru · on May 4, 2010

Especially embarrassing when somebody addresses you in your native language, and you expected something different.

mstoehr · on May 3, 2010

It is true that humans use situational context. In cases where understanding an utterance depends on complex semantics, a computer will fail even more badly, because it gets neither the semantics nor the speech signal right.

On the topic of dialog, this is arguably the area where speech recognition has made gains over the last nine years. Prior to 2001 there were not many usable dialog systems, and now (depending on your definition of "usable") there are many usable dialog systems deployed in call centers around the world.

Most call center dialog systems have a rudimentary mechanism that asks people to repeat things when the system doesn't understand, although callers tend to get very angry if it asks more than once.

jerf · on May 3, 2010

Nobody would use a system that interrogated you on every fifth word. That would actually be a step worse than silent failure on every fifth word.

fauigerzigerk · on May 3, 2010

It shouldn't interrupt you once every 5 words of course. What it should try to do is to create a model of what you meant to say. At some point, if the system is unsure, it should ask you to confirm or correct what it has understood so far.
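
To make that concrete, here is a rough sketch of the "ask only when unsure" idea. Everything in it is an assumption for illustration: the `Word` type, the per-word `confidence` score, the `confirmation_prompt` function, and the threshold values are all hypothetical, not taken from any real recognizer. It waits until a few words have accumulated and only then asks for confirmation if the average confidence is low.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    text: str
    confidence: float  # hypothetical recognizer score in [0, 1]


def confirmation_prompt(
    words: List[Word],
    threshold: float = 0.6,  # assumed cutoff, would need tuning
    min_words: int = 5,      # never interrupt before this many words
) -> Optional[str]:
    """Return a confirmation question if the utterance so far looks
    too uncertain, otherwise None (keep listening silently)."""
    if len(words) < min_words:
        return None
    average = sum(w.confidence for w in words) / len(words)
    if average >= threshold:
        return None
    heard = " ".join(w.text for w in words)
    return f'Did you say "{heard}"?'
```

A real system would presumably score whole hypotheses rather than averaging word scores, but the shape is the same: accumulate, estimate uncertainty, and interrupt only when it crosses a threshold.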