The Robust Automatic Transcription of Speech (RATS) program was developed by the Defense Advanced Research Projects Agency (DARPA).
The project quickly transcribes and translates voice communications for troops and analysts in the field.
The Pentagon could have the project operational this year if an additional $2.4 million is made available; $13 million has been spent so far.
DARPA has been involved in voice-recognition research since 1971. That work has spawned developments such as the voice-recognition software behind Apple's Siri system on the iPhone. The RATS program can filter out background sounds and quickly identify the spoken dialects it finds.
The additional funds are intended to take the project out of the laboratory and into the field. The final two phases, now under way, aim to make the system operational for the Air Force as early as this year, and for government agencies and others by 2017.
The program involves four key areas, DARPA documents show:
• Speech activity detection. What is speech and what is background noise?
• Key word spotting. Analysts can be directed to seek out certain words, such as in the 2007 movie The Bourne Ultimatum, in which NSA computers tracked the word “Blackbriar,” which identified a secret government program.
• Language identification. Once the software has separated speech from noise, what language is being spoken? DARPA researchers are focusing on Arabic, Farsi, Pashto and Dari, which are used in the Middle East, Afghanistan and Iran.
• Speaker identification. The system can use voice-pattern analysis and other technology to determine who is speaking, which can be particularly useful if intelligence analysts and troops are looking for certain targets.
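The first of these areas, speech activity detection, can be illustrated with a minimal energy-threshold sketch. This is a deliberate simplification, not DARPA's actual method, which must cope with far noisier radio channels; the function name, frame size and threshold ratio below are all illustrative assumptions.

```python
import numpy as np

def detect_speech(signal, frame_len=400, energy_ratio=4.0):
    """Flag frames whose short-time energy well exceeds an
    estimated noise floor (a toy voice-activity detector)."""
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    energies = (frames.astype(float) ** 2).mean(axis=1)
    # Assume the quietest tenth of frames contain only background noise.
    noise_floor = np.percentile(energies, 10)
    return energies > energy_ratio * noise_floor
```

For example, a synthetic recording of faint background hiss with a loud burst in the middle would have only the frames inside the burst flagged as speech. Real systems replace the raw energy test with trained classifiers over spectral features, but the frame-by-frame decision structure is the same.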
Much of the work on the RATS system has been done in laboratories in quiet and controlled environments. Problems, DARPA documents show, arise when researchers try to collect voice signals in environments with a lot of background noise and competing radio signals.
Troops and intelligence analysts trying to monitor multiple signals need to be able to separate one signal from another and then focus on the actual words being spoken without the interference of static or background noise. That’s challenging in controlled environments but even more so in places like Afghanistan or Iraq.
Once the technology can clean out the background noise and static, it must rapidly translate the spoken words, which are often in regional dialects and accents that are hard for outsiders to understand.
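One classical way to "clean out" steady background noise of the kind described above is spectral subtraction: estimate the noise spectrum from a speech-free stretch of the signal and subtract it, frame by frame, from the noisy recording. The sketch below is a textbook simplification (no frame overlap or smoothing), not the RATS pipeline; the function name and frame size are illustrative assumptions.

```python
import numpy as np

def spectral_subtract(noisy, noise_sample, frame_len=512):
    """Reduce stationary background noise by subtracting an estimated
    noise magnitude spectrum from each frame, keeping the noisy phase."""
    noise_mag = np.abs(np.fft.rfft(noise_sample[:frame_len]))
    n = len(noisy) // frame_len
    out = np.zeros(n * frame_len)
    for i in range(n):
        frame = noisy[i * frame_len:(i + 1) * frame_len]
        spec = np.fft.rfft(frame)
        # Subtract the noise estimate, flooring magnitudes at zero.
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
        out[i * frame_len:(i + 1) * frame_len] = np.fft.irfft(
            mag * np.exp(1j * np.angle(spec)), frame_len)
    return out
```

Applied to a recording that is mostly noise, the output energy drops noticeably; applied to noisy speech, the speech peaks survive while the steady hiss is attenuated. Competing radio signals and non-stationary interference, the hard cases the DARPA documents describe, defeat this simple scheme and require the more sophisticated separation the program was funded to develop.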