a group of researchers from Microsoft Artificial Intelligence and
Research and puts its accuracy on par with professional human
transcribers who have advantages like the ability to listen to a recording several times.
Both studies transcribed recordings from the Switchboard corpus,
a collection of about 2,400 telephone conversations that have been used
by researchers to test speech recognition systems since the early
1990s. The new study was performed by researchers at Microsoft AI and
Research, who aimed to match the accuracy of human transcribers able to
listen to a recording several times, draw on its conversational context
and work with other transcribers.
Overall, researchers in the latest study reduced the error rate by
about 12 percent compared with last year’s results by improving the
neural net-based acoustic and language models of Microsoft’s speech
recognition system. Notably, they also enabled the speech recognizer to
use entire conversations, which let it adapt its transcriptions to
context and predict which words or phrases were likely to come next, the
way humans do when talking to one another.
Microsoft’s speech recognition system is used in services like
Cortana, Presentation Translator and Microsoft Cognitive Services.