Speech recognition researchers often describe the solution to speech recognition as a combination of three knowledge sources: acoustics, lexicons, and language models. We have introduced a fourth knowledge source: context. By context we mean all the side information that surrounds a voice query, such as user identity, geographic data, temporal information, dialog state, previous queries, and display text. In this talk I’ll give a high-level overview of our current efforts to leverage this information and improve the quality of Google’s voice search.
Dr. Pedro J. Moreno leads the Language Modeling group within Google's speech team. Pedro's team is in charge of the infrastructure, engineering, and research needed to deploy and maintain multilingual speech recognition services worldwide. In addition, the group conducts research and development in contextual speech recognition. He joined Google 12 years ago after working as a research scientist at HP Labs. He completed his Ph.D. studies at Carnegie Mellon University under the direction of Prof. Richard Stern. Before that, he completed an Electrical Engineering degree at Universidad Politecnica de Madrid, Spain.