The MITRE Digest


Chasing HAL: In Pursuit of a Human-Computer Interface

November 2001

Consider for a moment two people in casual conversation. Each brings to the conversation a vast, enormously versatile and powerful ability: speech! Steven Pinker, director of MIT's Center for Cognitive Neuroscience, estimates that each of the speakers possesses the ability to produce a hundred million trillion sentences. If that isn't astounding enough, each speaker is easily able to understand all of the other's conversational offerings.

No machine of any kind in history comes even close to performing such a feat. With computers rapidly becoming bigger, faster, and better, and the daily news brimming with exploits of nearly unimaginable computer prowess, the natural tendency is to think that talking language machines must be just around the corner. Not so fast! Nature's great prize is not so easily won.

As science continues to wrestle with the question of exactly how the brain understands language, Human Language Technology tries to coax conversation between machines and humans into being. According to Speech Technology Magazine, after a 15-year adolescence, "understanding" between machines and humans is now a definite "when," and not an "if." But the "when" seems grudgingly slow in coming, and the advent of sentient computers remains in the distant future.

For eight years now, Lynette Hirschman, chief scientist of MITRE's Human Language Technology research group, together with her colleagues, has sought to marry human language capabilities to the ever-growing power and sophistication of computers.

Erasing a small corner of a large, crammed whiteboard in her office, she drew a matrix illustrating the many subsets of this sprawling field of research. "Human language technology makes it possible for computers to recognize human speech and to understand what it means—it makes it possible for machines to use speech to communicate with humans," she explained, "that is, to have a successful conversation or interaction via language."

Making Smart Machines

A watershed moment in Human Language Technology would be the achievement of a true conversational interface: a go-between device enabling a human and a computer to speak, to be understood, and to reply in conversation as comfortable and natural sounding as the human's native tongue. To date, algorithms have been created for parsing sentences, and large databases of vocabularies with pronunciations have been developed to match words. Yet for Hirschman and her colleagues, a machine able to emulate true language interaction is still far off.

Far off, of course, can be a relative term, especially to researchers like Victor Zue, head of MIT's Spoken Language Systems Group, who recently delivered a presentation at MITRE on next-generation speech-based interfaces. Zue is upbeat about the positive, near-term potential of what he calls true "intelligent agents," also known as smart interfaces. For example, MIT's Oxygen project, scheduled for rollout in stages over the next 10 years, is, according to Zue, a true multi-domain, perceptual interface that combines both speech and vision: a computer with ears, eyes, and vocal cords, serving up a vast digital storehouse of varied and expert knowledge and capable of conversing with the smoothness of human or near-human speech. To wit, a distant forebear of HAL, the omnipotent computer in the film 2001: A Space Odyssey.

During an onstage demo by telephone hookup to MIT's airline booking interface Pegasus (a much less advanced machine than the proposed Oxygen), Zue successfully booked a roundtrip flight from Boston to San Francisco. However, when he asked the computer for the cheapest fare, he got a reply of $3,300.

Explained associate professor and speech interface researcher Roni Rosenfeld of Carnegie Mellon University, "No suitable universal interaction paradigm has yet been proposed for humans to effectively, efficiently, and effortlessly communicate by voice with machines. Natural language systems require a lengthy development phase which is data and labor intensive, with heavy involvement by experts who meticulously craft the vocabulary, grammar, and semantics for a specific field of knowledge."

Terrence Deacon, author of The Symbolic Species: The Co-evolution of Language and the Brain, is blunt about the huge gaps in our understanding of language. "We know how to use a word to mean something," he wrote, "and to refer to something. We know how to coin new words and to assign meanings to them. We know how to create codes and artificial languages. Yet we do not know how we know how to do this, nor what we are doing when we do. Or rather, we know on the surface, but we do not know what mental processes underlie these activities, much less what neural processes are involved." Enabling a computer to duplicate some of these capabilities would be a great leap forward for interface technology.

MITRE's DARPA Communicator

MITRE's DARPA Communicator program, funded by the Defense Advanced Research Projects Agency, has worked to provide human-to-machine interaction via speech and represents forward progress in the evolution of interface technology. This summer, Communicator will roll out of the laboratory for field tests. Communicator will integrate databases from multiple fields of knowledge, such as weather information, airline travel, car and hotel rentals, and calendar access, as well as e-mail and voice mail access. The user and Communicator will interact conversationally to access and exchange this information, and Communicator will have the human-like abilities to signal non-understanding or to interrupt to clarify information.
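To make the idea of a multi-domain interface that can signal non-understanding a little more concrete, here is a minimal sketch in Python. It is not how Communicator works; the domain names, keyword lists, and function names are all hypothetical, and a real system would rely on speech recognition and far richer language understanding than keyword matching.

```python
import re

# Hypothetical keyword sets for a few of the knowledge domains named in the
# article. These lists are illustrative assumptions, not Communicator data.
DOMAIN_KEYWORDS = {
    "weather": {"weather", "forecast", "rain", "temperature"},
    "air_travel": {"flight", "fly", "fare", "airline"},
    "hotel": {"hotel", "room"},
    "email": {"email", "e-mail", "mail", "inbox"},
}


def route(utterance):
    """Return the best-matching domain name, or None if nothing matches."""
    words = set(re.findall(r"[a-z\-]+", utterance.lower()))
    scores = {name: len(words & keywords)
              for name, keywords in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # A score of zero means no keyword from any domain was heard.
    return best if scores[best] > 0 else None


def respond(utterance):
    """Route the request, or signal non-understanding and ask again."""
    domain = route(utterance)
    if domain is None:
        return "Sorry, I did not understand. Could you rephrase that?"
    return f"Routing your request to the {domain} service."


if __name__ == "__main__":
    print(respond("What is the weather forecast for Boston?"))
    print(respond("Tell me a joke"))
```

The point of the sketch is the fallback branch: rather than guessing, the system says it did not understand and hands the turn back to the user.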

The success of DARPA Communicator could well hasten fuller development of true, "natural" dialog interaction with computers. A conversational interface such as DARPA Communicator, once operational, brings up another question: could the computer be taught to learn on its own? And, if so, can it be tested to see what it has learned? "First," answered Hirschman, "we must teach it how to read. After it has read something, say, a chemistry book, we then need to be able to question it on what it knows about chemistry." Even more brain-like and more economical for the system, she continued, is that the entire chemistry book need not be learned in order to pull out some facts about specific areas. All the machine would really need to learn are the highlights, the essentials, just as humans do. The division's Reading Comprehension program is actively researching the process, but is starting slowly, using elementary school third- and fourth-grade reading material and tests.

When it comes to an interface, one size does not fit all. Although some interfaces are commercially available and some are, according to Hirschman and Zue, quite good, they operate in narrow, specialized fields of knowledge, such as weather reporting or airline bookings. Machine voice response, or dialogue modeling, is still somewhat stilted and not yet as smooth or natural sounding as a true conversational interface should be. But not everything needs to be big to succeed. Hirschman and her group also custom engineer small-scale, highly specialized interfaces for computer-mediated communication.

Specialty Interfaces

A computer that understands rules of grammar and syntax would be useful not only for understanding speech, but also for reading large amounts of text and extracting information. Another MITRE project, DARPA TIDES, scans running text from newswire feeds and newsletters for references to outbreaks of infectious diseases. DARPA TIDES uses Alembic, a trainable language processing system and rule sequence processor for high-speed, intelligent text extraction. Conceptual Browsing automatically organizes the information extracted from large collections of text.
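As a rough illustration of what pulling outbreak references out of running text can look like, here is a small rule-based sketch in Python. It is not Alembic and does not reflect how DARPA TIDES is built; the disease list, the pattern, and the function name are assumptions made for this example, and a trainable system would learn such rules rather than have them hand-written.

```python
import re

# Illustrative disease names only; a real system would use a far larger,
# curated vocabulary.
DISEASES = ("cholera", "ebola", "measles", "dengue", "anthrax")

# Match phrases such as "outbreak of cholera in Sierra Leone". The trigger
# words and disease names are matched case-insensitively; the optional place
# name must be one or more capitalized words.
OUTBREAK = re.compile(
    r"(?i:\b(?:outbreak|epidemic|cases)\s+of\s+)"
    r"(?P<disease>(?i:" + "|".join(DISEASES) + r"))\b"
    r"(?:\s+(?:in|near)\s+(?P<place>[A-Z][a-z]+(?:\s[A-Z][a-z]+)*))?"
)


def extract_outbreaks(text):
    """Return one record per outbreak mention found in the text."""
    return [
        {"disease": m.group("disease").lower(), "place": m.group("place")}
        for m in OUTBREAK.finditer(text)
    ]


if __name__ == "__main__":
    sample = (
        "Health officials confirmed an outbreak of cholera in Sierra Leone "
        "on Monday. Separately, new cases of Measles were reported."
    )
    for record in extract_outbreaks(sample):
        print(record)
```

Hand-written patterns like this one are brittle, which is part of why a trainable language processing system is attractive for high-volume newswire feeds.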

Hirschman's lab is also currently researching a speech-to-speech translation interface for what are called "low density languages," that is, rare languages, many without easily accessible grammars or dictionaries, such as Tetun, spoken in East Timor, which occupies the eastern half of the island of Timor in Southeast Asia. Military or humanitarian intervention in East Timor, as the violence there in 1999 made necessary, would make knowledge of Tetun a very important asset.

Hirschman speculated on still another important specialty interface regarding the recently completed Human Genome Project. "Now the real work begins," she said, referring to the gargantuan database of genome information needing to be interpreted and parsed into genes and their associated proteins.

Inderjeet Mani, the principal investigator for MITRE's Conceptual Browsing program, estimates that humanity's yearly output of information is more than an exabyte, or 1,000,000,000,000,000,000 bytes (1 quintillion bytes). The astounding complexity of speech, together with this immensity of information, presents both a daunting challenge and an enormous opportunity for the field of Human Language Technology.

A Single Hunt with Many Eyes

Hirschman believes that breakthroughs in Human Language Technology will not come in one fell swoop. Rather, progress will be realized through a network of research communities over time. Certainly, as the Internet had Bell Labs and Xerox PARC as high-focus research facilities, places like MITRE, MIT, and Carnegie Mellon provide the same for interface research. But, she emphasized, an interconnected worldwide community is absolutely necessary to finally unravel the secrets of speech.

An amazing aspect of the pursuit is the diversity of disciplines necessary to keep up the chase. It takes linguists, psycholinguists, psychologists, mathematicians, electrical engineers, acoustical engineers, physicists, computer scientists, information technologists, cognitive scientists, and neurobiologists, working separately and together, just to keep pace. As Hirschman, with a smile of understatement, succinctly puts it, "These are exciting times." Indeed.

—By Tom Green

Page last updated: August 23, 2001
