[2017-12-15]


Title: Speech Research at Google Aimed at Ubiquitous Speech Recognition
Date: 2017-12-15 02:20pm-03:30pm
Location: R103, CSIE
Speaker: Dr. Michiel Bacchiani, Google
Hosted by: Prof. Lin-shan Lee


Recent years have seen large-scale adoption of speech recognition by the public, in particular around mobile devices. Google, with its Android operating system, has integrated speech recognition as a key input modality. The huge volume of speech that our systems process each day shows how popular speech processing has become. This talk will briefly describe some of the history and highlight some of the technical challenges we faced in getting to this point.


More recently, home far-field devices, as popularized by Amazon Echo, have resulted in a major research emphasis on speech processing in such conditions. This talk will describe the Google research effort that underpinned the Google @Home intelligent speaker product. It will describe how our neural network technology is capable of processing multi-channel data and implicitly learning how to localize and beamform the incoming signal. We show three distinct approaches based on factorization or adaptive processing.


The third section of this talk will describe some of our newest work focusing on end-to-end processing using attention-based models. Although these models are still far from a viable product, their lack of independence assumptions and lack of data preparation requirements make them an attractive subject of research. The talk will show the types of models we have investigated so far and how well they perform on some of our key tasks, and will touch on research topics we are investing in to see if we can bring these models to fruition in our products.





Michiel Bacchiani has been an active speech researcher for over 20 years. Although he has worked in various areas of speech, his main focus has been on acoustic modeling for automatic speech recognition. He currently manages the acoustic modeling team of the speech group at Google. His team is responsible for developing novel algorithms and training infrastructure for the acoustic models for all speech recognition applications backing Google services.

Before joining Google, Michiel Bacchiani worked as a member of technical staff at IBM Research, as a technical staff member at AT&T Labs Research and as a research associate at Advanced Telecommunications Research labs in Kyoto, Japan.

Michiel Bacchiani is currently the chair of the IEEE Spoken Language Technical Committee and has previously served as an elected member of that committee for several terms. He is a subject editor and board member of the Speech Communication journal. He has served as an organizing committee member or technical chair for numerous workshops such as ASRU and SLT, and as an area chair for several large international conferences such as ICASSP and Interspeech. An active participant in the academic community, he has authored more than 50 scientific publications.

Michiel Bacchiani received the "ingenieur" (ir.) degree from the Technical University of Eindhoven, The Netherlands, and the Ph.D. degree from Boston University, both in electrical engineering.


