
This is how AI helps deaf people speak and understand

The World Health Organization estimates that by the year 2055, there will be 900 million people with hearing loss. But thanks to technology, and artificial intelligence (AI) in particular, solutions are being developed that can break down barriers and make life a little easier for people with this type of disability. Euphonia and Parrotron are two of the initiatives that aim to partially solve these problems, and one of their creators is a Russian-born Google engineer named Dimitri Kanevsky.

Dimitri Kanevsky began his career at Google working on speech recognition algorithms for YouTube. Before joining Google, he was a member of the research staff in the Department of Speech and Language Algorithms at IBM's Watson Research Center. Previously, he worked at various centers for higher mathematics, such as the Weizmann Institute of Science, the Max Planck Institute in Germany, and the Institute for Advanced Study at Princeton.

He currently holds 274 patents in the United States. He was born in Russia to parents with normal hearing, but has been deaf from a young age. He learned to speak English as a teenager, using Russian phonetic transliterations of English words to work out their pronunciation.

Seeing and hearing Dimitri speak, aided by one of the tools that let him communicate with other people thanks to his Euphonia project, is absolutely spectacular. I had the opportunity to see him live in Zurich a few months ago, and watching him use his app left me speechless. With Live Transcribe, which is available in more than 70 languages and dialects, Dimitri’s voice was converted to subtitles in real time using just his phone’s microphone. The same solution allowed two-way conversations through a keyboard, and could be connected to external microphones to improve transcription accuracy.

But Euphonia is not alone. Parrotron is another project, built with artificial intelligence techniques, that supports verbal communication for people with speech disabilities. For the millions of people who live with speech impairments caused by physical or neurological conditions, trying to communicate with others can be difficult and frustrating. While there have been many recent advances in automatic speech recognition technologies, these interfaces can be inaccessible to people with speech impairments. Additionally, applications that rely on speech recognition as input for text-to-speech synthesis may exhibit word substitution, deletion, and insertion errors. Critically, in today’s technology environment, limited access to voice interfaces, such as digital assistants that rely on direct voice understanding, means being excluded from cutting-edge tools and experiences, widening the gap between what people with and without speech impairments can access.

Parrotron consists of a single end-to-end deep neural network trained to convert the speech of a speaker with atypical speech patterns directly into fluent synthesized speech, without an intermediate step of generating text, bypassing speech recognition entirely. The project focuses on speech alone: it looks at the problem only from the point of view of speech signals, without visual cues such as lip movements. In this way, Parrotron can help people with a variety of atypical speech patterns, including those with ALS, deafness, and muscular dystrophy, to be better understood both in person-to-person interactions and by automatic speech recognition systems.

To demonstrate the validity of the Parrotron project, its creators worked with Dimitri Kanevsky, who recorded 15 hours of speech that were used to adapt the base model to the specific nuances of his speech. The resulting Parrotron system helped him be better understood by both people and automatic speech recognition systems, reducing the word error rate from 89% to 32%.
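For readers curious about what a word error rate actually measures, here is a minimal sketch in Python. It uses the standard definition (word substitutions, deletions, and insertions divided by the number of words in the reference transcript); it is not the evaluation code used for Parrotron, and the function names and example sentences are hypothetical, chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / words in reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference
    # words processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original error rate that was removed."""
    return (before - after) / before


# Made-up example: one substituted word and one dropped word
# against a six-word reference.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")

# The figures quoted in the article: 89% before, 32% after.
article_reduction = relative_reduction(0.89, 0.32)
```

In the made-up example, two errors against six reference words give a word error rate of about 33%. Applied to the figures above, going from 89% to 32% is a relative reduction of roughly 64%.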

From this platform, I can only thank Google for showing me that this type of project exists, and for allowing me to meet scientists like Dimitri Kanevsky, who lives with a severe disability yet develops solutions so that people like him can communicate.
