Spoken Language Interfaces Gaining Acceptance As Technology Matures
January 12, 1997, was the birthday of HAL, the arguably sentient computer in Stanley Kubrick's film 2001: A Space Odyssey. Although 1997 has passed, we are still a long way from developing computers that can produce and understand the communication and emotions that are at the core of what it means to be human. However, the 1990s produced remarkable growth in the ability of software to capture human speech and turn it into an effective means of interacting with certain applications. By constraining the nature of the interaction, we can combine current-generation commercial speech recognition and synthesis technology with research-grade technology in natural language understanding to make speech an effective modality for human-computer interaction.
Several technologies, varying in maturity, combine to aid in understanding speech. To recognize speech, speech recognizer software takes the acoustic waveform as input and outputs a string of text. There is no understanding, just recognition. The speech is constrained, either because the product requires prior training on the user's voice or because the input must conform to a grammar. Conversely, speech synthesizers take a string of text as input and output a speech waveform. Viable commercial products are available based on these fairly mature technologies. Commercial speech recognition software is used every day to interact with databases over telephones (for example, for financial transactions with firms such as Fidelity Investments and Charles Schwab) and to provide input to word processors. Synthesized voices providing information over the phone are commonplace.
But the ability to fully understand the nuances of everyday speech still lies beyond the realm of commercial technology. To "understand" speech, interfaces also require components to parse the natural language and a dialog manager to track the conversational interaction.
The Defense Advanced Research Projects Agency (DARPA) has fostered much of the research and development of spoken language interfaces, and MITRE plays a key role in some of the projects (see Communicator article). MITRE has also been at the nexus of a number of efforts that transition these same spoken language interfaces from the research milieu into the hands (or voices) of Army and Navy sponsors, thereby promoting a new paradigm for user interactions with command and control systems.
Although the sponsors are diverse, their applications for language interfaces have common characteristics. They involve command and control interaction with maps and databases. Traditional graphical user interfaces (GUIs) to these applications call for menu-based interactions that are time-consuming. Currently, highly trained technicians must operate many of the GUIs. By substituting or supplementing menus with spoken language interfaces, operators can access information in a more efficient and natural manner.
Furthermore, most of the spoken input demanded by map and database interaction requires a specialized command and control syntax. This syntax conforms to that of the finite-state grammars used in most commercial speech recognition systems. A finite-state grammar defines a path the user can take through the application and specifies what words are acceptable at each state. Hence we have a perfect example of the constraints of current-generation speech recognition technology (finite-state grammars) meeting the needs of the human operator.
But finding the kind of functionality that profits from natural language interaction is only the first step in transitioning spoken language technology into operational environments. Take a moment to read the case studies and see how MITRE's innovative melding of commercial and research-grade technologies is creating new opportunities for our customers.
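As a rough illustration of the finite-state grammars described above, the following Python sketch models a tiny command vocabulary for a map display. The states, words, and function names are hypothetical and chosen only for this example; they are not taken from any MITRE system or commercial recognizer. Each state lists the words acceptable at that point, and an utterance is accepted only if it follows a complete path through the grammar.

```python
# Hypothetical finite-state grammar for a small map-display command set.
# Each state maps an acceptable word to the state that follows it.
GRAMMAR = {
    "START": {"show": "SHOW", "zoom": "ZOOM"},
    "SHOW": {"friendly": "SIDE", "hostile": "SIDE"},
    "SIDE": {"units": "END", "airfields": "END"},
    "ZOOM": {"in": "END", "out": "END"},
}

# States in which a command may legally stop.
ACCEPTING = {"END"}


def allowed_words(state):
    """Return the words the grammar accepts in the given state."""
    return sorted(GRAMMAR.get(state, {}))


def accepts(utterance):
    """Return True if the utterance follows a complete path through the grammar."""
    state = "START"
    for word in utterance.lower().split():
        next_state = GRAMMAR.get(state, {}).get(word)
        if next_state is None:
            # The word is not acceptable at this state, so reject the command.
            return False
        state = next_state
    return state in ACCEPTING


if __name__ == "__main__":
    for command in ["show hostile units", "zoom in", "show units"]:
        print(command, "->", accepts(command))
```

Because only a handful of words are valid at any state, a recognizer built on such a grammar has far fewer alternatives to distinguish than it would with unconstrained speech, which is the kind of constraint the article describes.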
While HAL-like computers are not in the offing, human-computer interaction research continues to ease access to available data. For more information, please contact Margot Peet using the employee directory.