
Human Language Technology

March 2001

Chasing HAL - The Pursuit of a Human-Computer Interface


"Man became man not by the tool but by the Word. It is not walking upright and using a stick to dig for food or strike a blow that makes a human being, it is speech."

—Nadine Gordimer, The Essential Gesture

Consider for a moment two people in casual conversation. Each brings to the conversation a vast, enormously versatile and powerful ability: speech! Steven Pinker, director of MIT's Center for Cognitive Neuroscience, estimates that each speaker possesses the ability to produce a hundred million trillion sentences. If that isn't astounding enough, each speaker is easily able to understand all of the other's conversational offerings.
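To get a sense of where a number that large comes from, consider the combinatorics of word choice. The short calculation below is a rough illustration under assumed figures (ten plausible words per position, twenty-word sentences), not Pinker's own derivation; it simply shows how quickly the possibilities multiply to a hundred million trillion.

    # Rough illustration of combinatorial explosion in sentence production.
    # The figures (10 word choices per position, 20-word sentences) are
    # assumptions for this sketch, not Pinker's published derivation.
    choices_per_position = 10
    sentence_length = 20

    possible_sentences = choices_per_position ** sentence_length
    print(f"{possible_sentences:,}")  # 100,000,000,000,000,000,000 -- a hundred million trillion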

No machine of any kind in history comes even close to performing such a feat. With computers rapidly becoming bigger, faster, and better, and the daily newspapers brimming with exploits of nearly unimaginable computer prowess, the natural tendency is to think that talking language machines must be just around the corner. Not so fast! Nature's great prize is not so easily won.

As science continues to wrestle with the question of exactly how the brain understands language, Human Language Technology tries to coax conversation between machines and humans into being. According to Speech Technology Magazine, after a 15-year adolescence, "understanding" between machines and humans is now a definite "when" and not an "if." But the "when" seems grudgingly slow in coming, and the advent of sentient computers remains far off.

For eight years now Lynette Hirschman, chief scientist of MITRE's Human Language Technology research group, together with her colleagues, has sought to marry human language capabilities to the ever-growing power and sophistication of computers.

Erasing a small corner of a large, crammed whiteboard in her office, she drew a matrix illustrating the many subsets of this sprawling field of research. "Human language technologies are those innovations that make it possible to have computers recognize human speech, to understand the meaning of human speech, to use speech to communicate with humans," she explained, "that is, to have a successful conversation or interaction via language."

Making Smart Machines

Dave: Open the pod bay doors, HAL.
HAL: I'm sorry Dave, I'm afraid I can't do that.

—2001: A Space Odyssey

A watershed moment in Human Language Technology would be achieving a true conversational interface—a go-between device enabling a human and a computer to speak, to be understood, and to reply in conversation as comfortable and natural sounding as the human's native tongue. To date, algorithms have been created for parsing sentences, and large databases of vocabularies have been developed to match words. Yet for Hirschman and her colleagues, a machine able to emulate true language interaction is still far off.
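To make the parsing step concrete, the sketch below shows a recursive-descent recognizer over a toy grammar in Python. The grammar, lexicon, and example sentences are invented for illustration; they stand in for the much larger grammars and vocabulary databases real systems use.

    # Minimal illustration of sentence parsing: a toy grammar and a
    # recursive-descent recognizer. The grammar and vocabulary are invented
    # for this sketch; real systems use far richer grammars and lexicons.
    GRAMMAR = {
        "S":  [["NP", "VP"]],
        "NP": [["Det", "Noun"]],
        "VP": [["Verb", "NP"]],
    }
    LEXICON = {
        "Det":  {"the", "a"},
        "Noun": {"computer", "sentence", "user"},
        "Verb": {"parses", "understands"},
    }

    def parse(symbol, words, pos):
        """Return the set of positions reachable after matching `symbol` from `pos`."""
        if symbol in LEXICON:                      # terminal category
            if pos < len(words) and words[pos] in LEXICON[symbol]:
                return {pos + 1}
            return set()
        results = set()
        for expansion in GRAMMAR[symbol]:          # try each rule for this symbol
            positions = {pos}
            for part in expansion:
                positions = {q for p in positions for q in parse(part, words, p)}
            results |= positions
        return results

    def accepts(sentence):
        words = sentence.lower().split()
        return len(words) in parse("S", words, 0)  # true if the whole sentence parses

    print(accepts("the computer parses a sentence"))   # True
    print(accepts("parses the computer sentence a"))   # False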

Far off, of course, can be a relative term, especially to researchers like Victor Zue, head of MIT's Spoken Language Systems Group, who recently delivered a presentation at MITRE on next-generation speech-based interfaces. Zue is upbeat about the positive, near-term potential of what he calls true "intelligent agents," also known as smart interfaces. For example, MIT's Oxygen project, scheduled for rollout in stages over the next 10 years, is, according to Zue, a true multi-domain, perceptual interface that combines both speech and vision—a computer with ears, eyes, and vocal cords serving up a vast, digital storehouse of varied and expert knowledge, and capable of conversing with the smoothness of human or near-human speech. To wit, a distant forebear of HAL, the omnipotent computer in the film "2001: A Space Odyssey."

Still, during an onstage demo by telephone hookup to MIT's airline booking interface Pegasus (a much less advanced system than the proposed Oxygen), Zue successfully booked a roundtrip flight from Boston to San Francisco. However, when he asked the computer for the cheapest fare, he got a reply of $3,300.00.

Explained associate professor and speech interface researcher Roni Rosenfeld of Carnegie Mellon University: "No suitable universal interaction paradigm has yet been proposed for humans to effectively, efficiently and effortlessly communicate by voice with machines. Natural language systems require a lengthy development phase which is data and labor intensive, with heavy involvement by experts who meticulously craft the vocabulary, grammar, and semantics for a specific field of knowledge."

Hollywood and Reality

HAL, Gort, Data, and Robby the Robot aside, infusing computers with an awareness of human language and the ability to freely communicate with humans is one of the supreme challenges and one of the most arduous tasks in all of modern computing. Gordon Bell, of computer architecture fame, once confided that he looked at the problems of human language technology, thought them inordinately difficult, and moved on to work in other areas. So what is it that scared Bell off, and why is it that if computers have gotten so much faster and cheaper and more powerful, they have not become any better at understanding what we want them to do?

For sure, human beings have had a giant head start on computers, when, in the late Pleistocene—about 40,000 years ago—humans began to turn gestures and grunts into language. The human brain, during over thirty millennia of orality, contrived and produced all of what we today call human language. And all of it is complex; not a jot of a simple language is anywhere to be found. From Pygmy villages, to Eskimo igloos, to the backstreets of Beijing, to the halls of Oxford, brain-to-speech interaction of language—the human interface—is nearly as fantastically complex as the brain itself.

Terrence Deacon, author of The Symbolic Species: The Co-evolution of Language and the Brain, is blunt about the huge gaps in our understanding of language. "We know how to use a word to mean something," he writes, "and to refer to something. We know how to coin new words and to assign meanings to them. We know how to create codes and artificial languages. Yet we do not know how we know how to do this, nor what we are doing when we do. Or rather, we know on the surface, but we do not know what mental processes underlie these activities, much less what neural processes are involved." Enabling a computer to duplicate some of these capabilities would be a great leap forward for interface technology.

MITRE's DARPA Communicator

MITRE's DARPA Communicator program, funded by the Defense Advanced Research Projects Agency, has worked to provide human-to-machine interaction via speech and represents a solid step forward in the evolution of interface technology. This summer it will roll out of the laboratory for field tests. Communicator will integrate databases from multiple fields of knowledge, such as weather information, airline travel, car and hotel rentals, and calendar, e-mail, and voice mail access. The user and Communicator will interact conversationally to access and exchange this information, including the human-like abilities to signal non-understanding or to interrupt the other to clarify information.
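Conversational access to several databases at once is often organized as frame-based slot filling: the system keeps a frame of the details it still needs, prompts the user for missing ones, and confirms when the frame is complete. The sketch below illustrates that general pattern with invented slot names and prompts; it is not a description of the Communicator design itself.

    # Generic sketch of frame-based slot filling for a travel dialogue.
    # Slot names, prompts, and the keyword "understanding" are invented
    # for illustration; this is not the DARPA Communicator architecture.
    FRAME = {"origin": None, "destination": None, "date": None}
    PROMPTS = {
        "origin": "Where are you flying from?",
        "destination": "Where would you like to go?",
        "date": "What day do you want to travel?",
    }

    def update_frame(frame, user_utterance):
        """Naive keyword 'understanding': fill any slot the user names explicitly."""
        text = user_utterance.lower()
        for slot in frame:
            marker = slot + " "                     # e.g. "destination denver"
            if marker in text:
                value = text.split(marker, 1)[1].split()
                if value:
                    frame[slot] = value[0]

    def next_system_turn(frame):
        """Ask about the first unfilled slot, or confirm once the frame is complete."""
        for slot, value in frame.items():
            if value is None:
                return PROMPTS[slot]
        return "Booking a flight from {origin} to {destination} on {date}. Correct?".format(**frame)

    # A short simulated exchange:
    for utterance in ["origin boston", "destination denver please", "date tuesday"]:
        print("System:", next_system_turn(FRAME))
        print("User:  ", utterance)
        update_frame(FRAME, utterance)
    print("System:", next_system_turn(FRAME))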

The success of DARPA Communicator could well hasten fuller development of true, "natural" dialog interaction with computers. A conversational interface such as DARPA Communicator, once operational, brings up another question: could the computer be taught to learn on its own? And, if so, can it be tested to see what it has learned? "First," answered Hirschman, "we must teach it how to read. After it has read something, say, a chemistry book, we then need to be able to question it on what it knows about chemistry." Even more brain-like and more economical for the system, she continued, the entire chemistry book need not be learned to pull out some facts about specific areas. All the machine would really need to learn are the highlights—the essentials—just as humans learn. The department's Reading Comprehension program is actively researching the process, starting small with elementary school third- and fourth-grade reading passages and tests.
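A common, very simple baseline for this style of reading-comprehension test is to answer a question with whichever passage sentence shares the most content words with it. The sketch below illustrates that baseline with an invented passage; it does not describe MITRE's Reading Comprehension system.

    # Generic reading-comprehension baseline: answer a question by returning
    # the passage sentence with the greatest word overlap with the question.
    # (Illustrative only; not a description of MITRE's system.)
    import re

    STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "what", "which", "do", "does"}

    def tokens(text):
        return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

    def answer(question, passage):
        sentences = re.split(r"(?<=[.!?])\s+", passage)
        q = tokens(question)
        return max(sentences, key=lambda s: len(q & tokens(s)))

    passage = ("Water is made of hydrogen and oxygen. "
               "Salt dissolves readily in water. "
               "Oil and water do not mix.")
    print(answer("What is water made of?", passage))
    # -> "Water is made of hydrogen and oxygen."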

When it comes to an interface, one size does not fit all. Although some interfaces are commercially available and some are, according to Hirschman and Zue, quite good, they operate in narrow, specialized fields of knowledge, for instance, weather reporting or airline bookings. Machine voice response, or dialogue modeling, is still somewhat stilted, not as smooth or natural sounding as a true conversational interface would demand. But not everything needs to be big to succeed. Hirschman and her group also custom engineer small-scale, highly specialized interfaces for computer-mediated communication.

Small-Scale Interfaces

A computer that understands rules of grammar and syntax would be useful not only for understanding speech, but also for reading large amounts of text and extracting information. Another MITRE project, DARPA TIDES, scans running text from newswire feeds and newsletters for references to outbreaks of infectious diseases, using Alembic, a trainable language processing system and rule-sequence processor for high-speed, intelligent text extraction. MITRE's Conceptual Browsing program, in turn, automatically organizes the information extracted from large collections of text.
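The flavor of rule-driven extraction can be conveyed in a few lines of pattern matching: scan each sentence for a disease name appearing alongside outbreak vocabulary, and pull out a nearby place name. The disease list and patterns below are invented for this sketch and bear no relation to Alembic's actual rule sequences.

    # Toy illustration of rule-based extraction from newswire text: flag
    # sentences that pair a disease name with outbreak vocabulary and pull
    # out a location if one is named. Patterns are invented for this sketch
    # and are unrelated to Alembic's actual rule sequences.
    import re

    DISEASES = ["cholera", "ebola", "influenza", "dengue"]
    OUTBREAK_WORDS = r"(outbreak|epidemic|cases|infections)"

    def extract_outbreaks(text):
        findings = []
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            for disease in DISEASES:
                if re.search(disease, sentence, re.I) and re.search(OUTBREAK_WORDS, sentence, re.I):
                    place = re.search(r"\bin ([A-Z][a-z]+)", sentence)
                    findings.append({"disease": disease,
                                     "location": place.group(1) if place else None,
                                     "sentence": sentence.strip()})
        return findings

    wire = ("Health officials reported an outbreak of cholera in Kampala on Tuesday. "
            "Meanwhile, grain prices rose sharply across the region.")
    print(extract_outbreaks(wire))
    # -> [{'disease': 'cholera', 'location': 'Kampala', 'sentence': '...'}]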

Hirschman's lab is also currently researching a speech-to-speech translation interface for what are called "low density languages," that is, rare languages, many without easily accessible grammars or dictionaries, such as Tetun, spoken in East Timor, which occupies the eastern half of the island of Timor in Southeast Asia. Military or humanitarian intervention in East Timor, such as the violence there in 1999 made necessary, would make knowledge of Tetun a very important asset.

Hirschman speculated on still another important specialty interface regarding the recently completed Human Genome Project. "Now the real work begins," she said, referring to the gargantuan database of genome information needing to be interpreted and parsed into genes and their associated proteins.

Inderjeet Mani, the principal investigator for MITRE's Conceptual Browsing program, estimates that humanity's yearly output of information is more than an exabyte, or 1,000,000,000,000,000,000 (one quintillion) bytes. The astounding complexity of speech together with this immensity of information presents both a daunting challenge and an enormous opportunity for the field of Human Language Technology.

A Single Hunt with Many Eyes

Hirschman believes that breakthroughs in Human Language Technology will not come in one fell swoop. Rather, progress will be realized through a network of research communities over time. Certainly, just as the Internet had Bell Labs and Xerox PARC as high-focus research facilities, places like MITRE, MIT, and Carnegie Mellon provide the same for interface research. But, she emphasizes, an interconnected worldwide community is absolutely necessary to finally unravel the secrets of speech.

An amazing aspect of the pursuit is the diversity of disciplines necessary to keep up the chase. It takes linguists, psycholinguists, psychologists, mathematicians, electrical engineers, acoustical engineers, physicists, computer scientists, information technologists, cognitive scientists, and neurobiologists, working separately and together, just to keep pace. As Hirschman, with a smile of understatement, succinctly puts it: "These are exciting times." Indeed.

 
