MITRE's Computational Linguists Support Government Human Language Technology Needs
September 2011
Topics: Communication, Artificial Intelligence
Imagine this scenario: An elderly Afghan man drives up to a military checkpoint in a small town outside Kabul. He stops his vehicle and waits. A soldier approaches and begins asking him questions. Even though the older man does not speak English and the soldier does not speak Pashto or Dari, they are able to understand each other. The soldier speaks into his Android phone in English ("Where are you from? Where are you going?") and instantly the soldier's questions are translated into Pashto or Dari. The Afghan civilian then speaks into the phone in his native language and the soldier can instantly hear the man's reply in English.
The soldier is using a two-way speech translation/human language technology that MITRE researched and evaluated for the Defense Advanced Research Projects Agency (DARPA). "In Afghanistan, it can be used at military checkpoints, hospitals, and infrastructure projects, almost anywhere Americans need basic communication with the local population," says Sherri Condon, a principal engineer in the Information Discovery and Understanding department. "It is far from perfect, but if conversations become complicated, they can call in a translator. Unfortunately, there are only so many translators available at any one time."
Today, MITRE is working on dozens of human language technology (HLT) projects like this DARPA speech translation initiative. MITRE experts support numerous military and civilian government agencies with language challenges related to national security, defense, healthcare, and even international money laundering.
70 Experts in One Company
The recent explosion in the amount of data made possible by the information age has created new challenges for numerous government agencies. "Today we have chats, tweets, SMS text, foreign language SMS messaging, and blogging," says Dan Loehr, head of MITRE's Information Discovery and Understanding department. "We have to understand not only English slang and nicknames but foreign ones, too. For instance, what if the intelligence community wants to know what dissidents are blogging about in Iran? In that scenario, they need to understand not only various Iranian dialects, but also blogging language and colloquialisms."
Nearly 70 computational linguists and human language technology specialists at MITRE currently work on the government's language technology needs. "That's one of the largest concentrations of computational linguists in the country," says Loehr. "For the government, MITRE is a key place to go for computational linguistic support; when we work in this area, we fulfill our role as the manager of FFRDCs [federally funded research and development centers]."
MITRE's work extends across the military and intelligence community, the Securities and Exchange Commission, the Department of State, the Transportation Security Administration, numerous healthcare organizations, and many others. "We help them with situation and safety reports, records, and name identification," says Loehr. "Language needs are everywhere."
"I Bombed in the Theater"
MITRE supports the foreign language office of a major intelligence organization, which directs human language technology resources to other intelligence organizations. "One of our primary goals is to improve the processing of foreign language material, regardless of format," says Betsi McGrath, chief engineer in the Information and Knowledge Management department. "Through our work with this sponsor, we have a good understanding of the community's requirements for HLT."
"Our work encompasses everything from foreign language education and training to analyzing data on the capabilities and needs of the community," says Richard Lutz, a program manager in the Information Outreach and Management department. "Because of the breadth and depth of MITRE expertise, we in the CIIS [Center for Integrated Intelligence Systems, part of MITRE's defense and intelligence FFRDC] are able to see a broader view, and work at a strategic level. We've been key in defining terms as well as in providing guidance on how to identify and evaluate priorities for the community."
He adds: "When the office was first created, the government focused its efforts on foreign language training and education: maintaining and enhancing the language skills of the workforce. While these are essential elements, they don't paint the whole picture. We played a significant role in arguing for the central role that HLT now plays in supporting the overall mission. We helped put all the various pieces of the puzzle together."
"Our customers know we have a lot of knowledge on HLT and trust us when we make product or resource recommendations," says McGrath. "We also advise when and where humans need to stay in the loop for quality. When a computer reads 'I bombed in the theater last night,' it really can't know the intent. It's always going to be about people, processes, and technologies."
At the same time, many government organizations are still learning that they need language support. "Every agency has a language problem: they need to search, find names, understand text messages, and find patterns," says Catherine Ball, area lead in the Information Discovery and Understanding department. "One of our challenges is helping our sponsors see that they have language needs."
Dictionaries, Glossaries, and Indexes
Many of the tools and resources developed by MITRE can be used by multiple organizations. "We've created resources such as foreign language dictionaries, glossaries, and indexes that help a sponsor with their foreign language needs," says Ball. "We know what's been created and how it can be used. The government agencies can share with one another. For instance, the MITRE-developed Arabic Document Processing Evaluation Corpus, which aids in product evaluations, was developed for one agency and has now been used by others."
Since there are so many computational linguists spread out across MITRE, they make a point to stay connected. "MITRE has a culture of sharing; that's part of our culture and our mission," says Ball. "We have an HLT list [a listserv], Languapedia [a website for the MITRE language technology community to share resources], and a catalog of product evaluations."
MITRE's Foreign Language Coordination Group regularly sponsors technical exchange meetings and workshops for internal experts and sponsors. "Recently we held a training workshop on the need for more clearable computational linguists," says McGrath. "There's always more to do."
by Nadine Monaco