"These speech developments build on decades of research, and achieving speech recognition comparable to that of humans is a complex task. At IBM, we are dedicated to creating the technology that will one day match the complexity of how the human ear, voice and brain interact", stated Michael Karasick, IBM Vice President, Cognitive Computing. "This progress will have important implications for how man and machine collaborate in the future, making the interactions more natural and productive. We believe it is only a matter of time before we achieve parity on speech recognition with humans."
The success of speech recognition technology is measured against human parity, an error rate on par with that of two humans speaking. Previously, human parity was considered a 5.9 percent word error rate; IBM partnered with Appen, a speech and technology service provider, to reassess the industry benchmark and determined that human parity is lower than what anyone has yet achieved: 5.1 percent.
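For readers unfamiliar with the metric, word error rate is conventionally computed as the number of word substitutions, deletions and insertions needed to turn a system's transcript into the reference transcript, divided by the number of words in the reference. The following is a minimal illustrative sketch of that calculation, not the scoring procedure used by IBM or Appen; the function name `word_error_rate` and the simple whitespace tokenization are assumptions made for the example, and real evaluations typically apply additional text normalization before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: tokenizes on whitespace and does no text
    normalization (casing, punctuation, contractions).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition, a transcript with one wrong word out of six reference words scores about 16.7 percent, and a 5.1 percent rate corresponds to roughly one error in every 20 words.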
In the face of other industry claims, this research, conducted in partnership with Appen, shows that finding a standard measurement for human parity across the industry is more complex than it seems. As IBM continues to develop and improve this technology, its researchers will hold themselves to the highest standards of accuracy in measuring it, so that the findings are truly valuable.
"In spite of impressive advances in recent years, reaching human-level performance in AI tasks such as speech recognition or object recognition remains a scientific challenge. Indeed, standard benchmarks do not always reveal the variations and complexities of real data", stated Yoshua Bengio, leader of University of Montreal's Institute for Learning Algorithms. "IBM continues to make significant strides in advancing speech recognition by applying neural networks and deep learning into acoustic and language models."
"The ability to recognize speech as well as humans do is a continuing challenge, since human speech, especially during spontaneous conversation, is extremely complex", stated Julia Hirschberg, a professor and Chair at the Department of Computer Science at Columbia University. "IBM's recent achievements in speech recognition are quite impressive, as is IBM's dedication to better understand how we measure the success of speech technology and industry benchmarks."
Today's achievement builds upon IBM's recent advancements in language and speech technology, gained from IBM's decades of experience researching, developing and investing in AI technology. These research developments are critical to advancing the development and adoption of cognitive computing around the globe. As IBM continues to strengthen and improve its speech and language technology, these updates will be embedded in the cognitive capabilities it offers via the Watson Developer Cloud.