Researcher Developing System to Crowdsource Transcribing Speech
Making Computers Smarter, and Helping Deaf People Too
By Julie Rehmeyer, Scientific American 9/24/2013
A friend of mine is very hard of hearing — not quite deaf enough to fully belong to the deaf community, but sufficiently deaf that participating in a conversation is terribly hard work for her. She does her best to put together what she can hear with what she can lip read and what she can extrapolate, and then she asks her conversational partners to repeat themselves as often as she can bear. I was shocked to hear just how exhausting and isolating it is for her.
One of the young researchers here is developing a solution that could make a big difference for people like her, as well as the fully deaf — and even for journalists. In particular, Walter Lasecki of the University of Rochester (together with his advisor Jeffrey Bigham) is creating a system to transcribe conversations in real time, with no advance planning, for a fraction of the cost of a skilled human transcriber.
Lasecki’s basic idea is to crowdsource the problem, using Amazon’s Mechanical Turk (or another service) to get six or seven people to simultaneously transcribe bits of the conversation. His software then stitches the partial transcriptions together, using their overlaps, into a single coherent, accurate transcript. Ordinarily, real-time transcription requires a highly skilled transcriber, who might charge $150 to $300 an hour; Lasecki’s system harnesses the ability of ordinary folks.
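To make the stitching idea concrete, here is a minimal sketch in Python. It is not Lasecki's actual software, whose method is not described here. It assumes, for illustration only, that the partial transcriptions arrive in time order and that each one begins with the same words that end the transcript so far. The function name stitch_fragments is hypothetical.

```python
def stitch_fragments(fragments):
    """Merge time-ordered partial transcripts by their word overlaps.

    Illustrative assumption: each fragment starts with the same words
    that end the transcript built so far, typed without errors.
    """
    merged = []
    for fragment in fragments:
        words = fragment.split()
        overlap = 0
        # Look for the longest run of words that both ends the merged
        # transcript and starts the new fragment (case-insensitive).
        for k in range(min(len(merged), len(words)), 0, -1):
            tail = [w.lower() for w in merged[-k:]]
            head = [w.lower() for w in words[:k]]
            if tail == head:
                overlap = k
                break
        # Keep only the words the transcript does not already contain.
        merged.extend(words[overlap:])
    return " ".join(merged)


if __name__ == "__main__":
    parts = [
        "the quick brown fox",
        "brown fox jumps over",
        "jumps over the lazy dog",
    ]
    print(stitch_fragments(parts))
```

A real system would also have to cope with typos, words that workers miss, and fragments that arrive out of order, none of which this sketch attempts.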
Lasecki pointed out that one of the big advantages of his system is that it eliminates scheduling hassles. Universities are required to provide “reasonable accommodation” for students with disabilities, which includes providing sign language interpreters. But usually, there are only a very few interpreters available, so if a student needs assistance at the last minute and hasn’t scheduled it at least 24 to 48 hours in advance, he may well be out of luck. But Lasecki’s system is always just a cell phone app away, and he aims for it to cost no more than $50/hour.
Furthermore, for someone like my friend, who is hard of hearing but not quite deaf, sign language interpretation can be confusing and difficult. American Sign Language is truly a language, with its own syntax and grammar, not simply a transcription of English into motions. So catching some of the English and simultaneously watching the sign language interpretation requires a strange bifurcation of one’s mind.
That has also set a demanding task for Lasecki’s transcription system: it has to work very, very quickly. If the system produces a transcription with a 15-second delay, a hard-of-hearing person who catches every other word can’t hold what they’ve heard in mind long enough to use the transcription to fill in the gaps. They’ll be forced to ignore what they hear entirely and rely fully on the transcription, and then they won’t be able to put the words together with the speaker’s facial expressions and gestures. So he has aimed to have his system produce a transcription within five seconds. He’s currently at just under four.
See the rest of the story at: http://tinyurl.com/k7d736x
Distributed 2013 by Northern Virginia Resource Center for Deaf and Hard of Hearing Persons (NVRC), 3951 Pender Drive, Suite 130, Fairfax, VA 22030; www.nvrc.org; 703-352-9055 V, 703-352-9056 TTY, 703-352-9058 Fax. Items in this newsletter are provided for information purposes only; NVRC does not endorse products or services. You do not need permission to share this information, but please be sure to credit NVRC. This news service is free of charge, but donations are greatly appreciated.