Nuance Advances Text-to-Speech Technology through Deep Learning

Deep learning, combined with traditional knowledge-based systems, elevates the quality of text-to-speech technologies to near-human speech


BURLINGTON, Mass., Feb. 14, 2018 (GLOBE NEWSWIRE) -- Nuance Communications, Inc. (NASDAQ:NUAN) today announced that it has advanced its text-to-speech (TTS) technology with deep neural networks (DNN) to deliver a new standard of quality, reducing errors by 40 percent compared to previous speech synthesis techniques.

Nuance’s Vocalizer suite of TTS solutions, which includes Vocalizer Embedded for embedded platforms, Vocalizer Server for cloud applications and the Vocalizer Studio development tool, combines advancements in deep learning with knowledge-based developments to produce speech output that is nearly indistinguishable from human speech. The result enriches user experiences across automotive, enterprise, healthcare, IoT and smart home offerings and makes interaction between people and machines more intuitive and conversational. The application of artificial intelligence (AI) techniques gives Vocalizer the ability to quickly learn new words, phrases and pronunciations and to communicate with more expressivity and personality across more than 50 languages.

Nuance’s approach to using deep neural networks for speech synthesis works as follows. First, the networks learn the relationship between written text and the corresponding voice characteristics from Nuance’s vast speech data. Then, the system applies this knowledge to the words and phrases in unseen text. In addition to learning the relationship between the orthographic representation of words and the acoustic output, Nuance’s deep neural nets use the context of each utterance to ensure that words are spoken in the expressive manner appropriate for the application, with the proper pattern of stress and intonation. For example, street names and driving directions are clearly articulated and intelligible, whereas dialogs with a virtual assistant sound more fluent and dynamic.

“The advancements we have made through the application of DNN allow our text-to-speech technology to deliver high-quality, more expressive speech output, enabling more natural interactions between man and machine,” said Christophe Couvreur, vice president and general manager of TTS at Nuance. “We’re able to create highly tailored and computationally efficient solutions adapted to our customers’ unique needs, their application domains and the voice persona they want to realize.”

Key applications of Nuance Vocalizer include:

  • Automotive in-dashboard systems and virtual assistants
  • Robotics and autonomous virtual agents
  • Digital television and set-top boxes
  • Omni-channel customer engagement services

Nuance’s enhanced text-to-speech solutions are available for the cloud today and will be made available for embedded devices this year. For more information about Vocalizer, including voice samples, visit https://www.nuance.com/mobile/mobile-solutions/vocalizer-expressive.html.

About Nuance Communications, Inc.
Nuance Communications, Inc. (NASDAQ:NUAN) is a leading provider of voice and language solutions for businesses and consumers around the world. Its technologies, applications and services make the user experience more compelling by transforming the way people interact with devices and systems. Every day, millions of users and thousands of businesses experience Nuance’s proven applications. For more information, please visit www.nuance.com.

Trademark reference: Nuance and the Nuance logo are registered trademarks or trademarks of Nuance Communications, Inc. or its affiliates in the United States and/or other countries. All other trademarks referenced herein are the property of their respective owners.

Contact Information

For Press
Kate Hickman
Nuance Communications, Inc.
Tel: 781-565-4627
Email: kathryn.hickman@nuance.com