
December 1999, Volume 3, Number 4

 


Spoken Language Interfaces Gaining Acceptance As Technology Matures
by Margot Peet

Photo © 1999 The Johns Hopkins University Applied Physics Laboratory. All rights reserved. Used with permission.

January 12, 1997, was the birthday of HAL, the arguably sentient computer in Stanley Kubrick's film 2001: A Space Odyssey. Although 1997 has passed, we are still a long way from developing computers that can produce and understand the communication and emotions that are at the core of what it means to be human. However, the 1990s produced remarkable growth in the ability of software to capture human speech and turn it into an effective means of interacting with certain applications. By constraining the nature of the interaction, we can combine current-generation commercial speech recognition and synthesis technology with research-grade technology in natural language understanding to make speech an effective modality for human-computer interaction.

Several technologies, varying in maturity, combine to aid in understanding speech. To recognize speech, the speech recognizer takes an acoustic waveform as input and outputs a string of text. There is no understanding, just recognition. The speech is constrained, either because the product requires prior training on the user's voice or because the input must follow a prescribed grammar. Conversely, speech synthesizers take a string of text as input and output a speech waveform. Viable commercial products based on these fairly mature technologies are available. Commercial speech recognition software is used every day to interact with databases over the telephone (for example, in financial transactions with firms such as Fidelity Investments and Charles Schwab) and to provide input to word processors. Synthesized voices providing information over the phone are commonplace. But the ability to fully understand the nuances of everyday speech still lies beyond the realm of commercial technology. To "understand" speech, an interface also needs a component that parses the natural language and a dialog manager that tracks the conversational interaction.
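
To make that division of labor concrete, the sketch below shows how these components might fit together in code. It is a minimal illustration in Python: the interfaces and the toy rule-based stand-ins for the recognizer, parser, dialog manager, and synthesizer are invented for this example and do not represent any vendor's actual API.

    # Minimal sketch of the pipeline described above: recognition, parsing,
    # dialog management, and synthesis. All names here are illustrative
    # assumptions, not a real product's interface.

    def recognize(waveform):
        """Speech recognizer stand-in: acoustic waveform in, text string out.
        A real recognizer is a commercial product; here we pretend the
        'waveform' already carries its transcript."""
        return waveform["transcript"]

    def parse(text):
        """Natural language understanding stand-in: map the recognized text
        onto a structured command frame."""
        words = text.split()
        if [w.lower() for w in words[:3]] == ["center", "map", "on"]:
            return {"action": "center_map", "place": " ".join(words[3:])}
        return {"action": "unknown", "text": text}

    class DialogManager:
        """Tracks the conversational interaction and chooses a reply."""
        def __init__(self):
            self.history = []

        def respond(self, frame):
            self.history.append(frame)
            if frame["action"] == "center_map":
                return f"Centering the map on {frame['place']}."
            return "I did not understand that request."

    def synthesize(text):
        """Speech synthesizer stand-in: text string in, speech waveform out."""
        return {"audio_for": text}

    if __name__ == "__main__":
        dialog = DialogManager()
        utterance = {"transcript": "center map on Norfolk"}
        reply = synthesize(dialog.respond(parse(recognize(utterance))))
        print(reply)  # {'audio_for': 'Centering the map on Norfolk.'}

The point of the structure, rather than the toy logic, is that the recognition and synthesis stages can come from mature commercial products while the parsing and dialog stages remain research-grade components.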

Zones of Interaction

The Defense Advanced Research Projects Agency (DARPA) has fostered much of the research and development of spoken language interfaces, and MITRE plays a key role in some of the projects (see Communicator article). However, MITRE has also been at the nexus of a number of efforts that transition these same spoken language interfaces from the research milieu into the hands (or voices) of Army and Navy sponsors, thereby promoting a new paradigm for user interactions with command and control systems.

Although the sponsors are diverse, their applications for language interfaces share common characteristics: they involve command and control interaction with maps and databases. Traditional graphical user interfaces (GUIs) to these applications call for menu-based interactions that are time-consuming, and many of the GUIs currently require highly trained technicians to operate. By substituting or supplementing menus with spoken language interfaces, operators can access information in a more efficient and natural manner. Furthermore, most of the spoken input demanded by map and database interaction follows a specialized command and control syntax. This syntax conforms to that of the finite-state grammars used in most commercial speech recognition systems. A finite-state grammar defines a path the user can take through the application and specifies what words are acceptable at each state. Hence we have a perfect example of the constraints of current-generation speech recognition technology (finite-state grammars) meeting the needs of the human operator. But finding the kind of functionality that profits from natural language interaction is only the first step in transitioning spoken language technology into operational environments.
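
For illustration, the toy grammar below encodes a pair of map commands as a finite-state table in Python: each state lists the words acceptable at that point and the state each word leads to. The vocabulary and commands are invented for this sketch and are far smaller than a real command and control grammar would be.

    # A toy finite-state grammar along the lines described above.
    # Each state maps an acceptable word to the next state.
    GRAMMAR = {
        "START":     {"center": "VERB_OBJ", "show": "SHOW_WHAT"},
        "VERB_OBJ":  {"map": "MAP_PREP"},
        "MAP_PREP":  {"on": "PLACE"},
        "SHOW_WHAT": {"friendly": "UNITS", "hostile": "UNITS"},
        "UNITS":     {"units": "NEAR"},
        "NEAR":      {"near": "PLACE"},
        "PLACE":     {"boston": "END", "norfolk": "END"},
        "END":       {},
    }

    def accepts(words):
        """Return True if the word sequence traces a path from START to END."""
        state = "START"
        for word in words:
            state = GRAMMAR[state].get(word.lower())
            if state is None:
                return False  # word not acceptable in the current state
        return state == "END"

    print(accepts("center map on Boston".split()))             # True
    print(accepts("show hostile units near Norfolk".split()))  # True
    print(accepts("what is the weather".split()))              # False

A grammar-based recognizer constrains its search to word sequences the grammar accepts, which is exactly why the regular, command-style syntax of map and database interaction suits current-generation commercial products.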

Take a moment to read the case studies and see how MITRE's innovative melding of commercial and research-grade technologies is creating new opportunities for our customers. While HAL-like computers are not in the offing, human-computer interaction research continues to ease access to available data.


For more information, please contact Margot Peet using the employee directory.

