MIDS Capstone Project Summer 2022

MyVoice: Machine Translation for American Sign Language

MyVoice is an American Sign Language (ASL) translation model that automatically generates English captions for short signed passages.

Our Mission

We want to empower every individual who is deaf or hard of hearing to communicate and express their thoughts effectively, irrespective of their ability to speak or hear.

The Problem

During the recent pandemic, the deaf and hard of hearing (DHH) often found themselves excluded. Information broadcast on television was not always made available in sign language. And at schools around the world, remote learning alternatives failed to meet the needs of children who use sign language, leaving them feeling isolated, excluded, and frustrated. Even without a crisis, the hearing impaired face daily issues of isolation and miscommunication. Just like the rest of us, the DHH have a right to quality education, healthcare, and an environment that maximizes their potential. This work is motivated by the goal of furthering research in continuous sign language translation, which can bring deaf people a step closer to the realities of day-to-day communication.

In the last couple of decades, researchers in the computer vision field have focused more on Sign Language Recognition (SLR) than on Sign Language Translation (SLT). SLR seeks to recognize a sequence of signs but neglects the underlying grammatical and linguistic structures of sign language, which differ from those of spoken language. The goal of SLT, on the other hand, is to generate spoken-language translations from sign language videos, taking into account the differences in word order and grammar.

Impact

According to the World Health Organization, over 5% of the world's population (430 million people) have a "disabling" hearing loss, and that number is expected to increase to 700 million by 2050.

The National Center for Health Statistics estimates that 28 million Americans have some degree of hearing loss. About 2 million of these 28 million people are classified as deaf (they cannot hear everyday sounds or speech even with a hearing aid). American Sign Language (ASL) is the natural language of around 500,000 deaf people in the US and Canada, and it is used in 20 other countries.

Translation models for ASL have the potential to change the way Deaf and Hard-of-Hearing (DHH) people communicate their thoughts to others in virtual meetings or conferences.

How it Works

The MyVoice model receives a video of a person performing American Sign Language and generates captions.

Future applicability: a virtual meeting scenario. A person communicates in American Sign Language in front of a camera. The camera captures video frames during the online call, the trained machine translation model converts the ASL into text in real time, and the other participants read the translation.
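The pipeline described above (video frames in, caption text out) can be sketched as a minimal toy example. Everything below is hypothetical: the class name `SLTPipeline`, the pooled-feature "backbone," and the nearest-index "decoder" are stand-ins, since the actual MyVoice architecture is not detailed here. A real SLT model would use a visual backbone over frames and a sequence-to-sequence decoder that reorders output into spoken-language grammar.

```python
# Hypothetical sketch of a video-to-caption SLT pipeline (not the real MyVoice model).
from dataclasses import dataclass
from typing import List

Frame = List[float]  # stand-in for one video frame's pixel values


@dataclass
class SLTPipeline:
    """Video frames -> per-frame features -> decoded caption text."""
    vocab: List[str]

    def extract_features(self, frames: List[Frame]) -> List[float]:
        # Stand-in for a CNN backbone: pool each frame to one number.
        return [sum(f) / len(f) for f in frames]

    def decode(self, features: List[float]) -> str:
        # Stand-in for a seq2seq decoder: map each pooled feature to a
        # vocabulary index. A real decoder attends over the whole feature
        # sequence and emits words in spoken-language order.
        words = []
        for x in features:
            idx = min(int(x * len(self.vocab)), len(self.vocab) - 1)
            words.append(self.vocab[idx])
        return " ".join(words)

    def translate(self, frames: List[Frame]) -> str:
        return self.decode(self.extract_features(frames))


pipeline = SLTPipeline(vocab=["hello", "my", "name"])
caption = pipeline.translate([[0.1, 0.1], [0.5, 0.5], [0.9, 0.9]])
print(caption)  # -> "hello my name"
```

In the virtual-meeting scenario, `translate` would be called on a sliding window of recent frames so captions appear to other participants with low latency.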

Last updated:

March 11, 2024