Speech Recognition: What’s Left? Dr Michael Picheny 12 November 2019
Duration: 59 mins 29 secs
About this item
Description: | This talk examines speech recognition issues, comparing and contrasting them with what is known about human perception. Recent advances in Deep Learning suggest that Word Error Rates comparable to those of human listeners are now achievable. The talk specifically highlights issues with accented speech, noisy speech, different speaking styles, multilingual speech recognition and more. Demonstrations in comparison with human perception show that significant work in speech recognition research is still required from the community. |
---|
Created: | 2019-11-25 13:43 |
---|---|
Collection: | Information Engineering Distinguished Lecture Series |
Publisher: | University of Cambridge |
Copyright: | Dr Michael Picheny |
Language: | eng (English) |
Abstract: | Recent results on the SWITCHBOARD corpus suggest that, thanks to advances in Deep Learning, speech recognition systems now achieve Word Error Rates comparable to those of human listeners. Does this mean the speech recognition problem is solved and the community can move on to a different set of problems? In this talk, we examine speech recognition issues that still plague the community and compare and contrast them with what is known about human perception. We specifically highlight issues in accented speech, noisy/reverberant speech, speaking style, rapid adaptation to new domains, and multilingual speech recognition. We try to demonstrate that, compared with human perception, there is still much room for improvement, so significant work in speech recognition research is still required from the community. |
---|
Available Formats
Format | Quality | Bitrate | Size |
---|---|---|---|
MPEG-4 Video | 1280x720 | 2.99 Mbits/sec | 1.30 GB |
MPEG-4 Video | 640x360 | 1.93 Mbits/sec | 864.45 MB |
WebM | 1280x720 | 2.35 Mbits/sec | 1.02 GB |
WebM | 640x360 | 406.25 kbits/sec | 177.04 MB |
iPod Video | 480x270 | 520.17 kbits/sec | 226.63 MB |
MP3 | 44100 Hz | 249.75 kbits/sec | 108.93 MB |