The MITRE Digest


Building a Better Web


October 2002

The MITRE Corporation wants to boost the World Wide Web to the next level. Along with colleagues from around the globe, our scientists envision a vastly enhanced Web where digital brokers look for information, follow hyperlinks, interact with other intelligent agents, and perform tasks, all automatically. The goal is not just improved searches, but higher productivity, perhaps even greater personal and public safety.

Currently, the Web is a marvelous entity for finding everything from crankshafts to computer software to breaking news. But hitting the "Search Now!" button often yields as much junk as useful information, and every result must be evaluated. This conspicuous inability of Web technology to perform much heavy lifting has spurred Tim Berners-Lee, the acknowledged father of the Web, to conceptualize a "better" Web. He calls it the Semantic Web—an evolution of the existing system.

MITRE's Frank Manola, our senior Semantic Web researcher, points out why the differences between the "old" and "new" Web are important. "The Web's original design provides information to people, not software," he says. Currently, most Web resources don't include descriptions of their meanings or capabilities that software can understand.

"What we have now is an enormous distributed database with hyperlinks that require humans to do the reading and link following," Manola says. As a result, software can't provide relevant information or services with precision. For example, if you're searching for a "Mustang," do you expect to get information about cars or horses or WWII fighter aircraft? Does your search engine know the difference?

"There's too much data to expect efficient processing without software help; imagine not having Google, for instance," Manola says. "The next step is to make it machine interoperable so a software engine can make distinctions about the data."

Letting the Computers Help

"Strictly speaking, the Semantic Web is about the structured use of terms and formal definitions that are machine-interpretable. The definitions fall into application domains, such as medicine, military, and so on. These include formal ontologies [machine-interpretable descriptions of concepts] that define some aspects of the terms, such as 'is a' relationships, even if a complete definition isn't machine-interpretable," Manola says. ("Is a" relationships include "a minivan is a car" or "a helicopter is an aircraft.")

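The "is a" relationships Manola mentions can be illustrated with a minimal sketch. This is not MITRE's implementation or any real ontology language; the `IS_A` table and the `is_a` function are hypothetical names, and the "vehicle" term is an assumption added only to show that such links can be chained.

```python
# Hypothetical mini-ontology: each term maps to its direct parent ("is a").
# "vehicle" is an assumed extra term, not taken from the article.
IS_A = {
    "minivan": "car",
    "car": "vehicle",
    "helicopter": "aircraft",
    "aircraft": "vehicle",
}


def is_a(term, category):
    """Return True if `term` is `category` or reaches it by following is-a links."""
    seen = set()
    while term is not None and term not in seen:
        if term == category:
            return True
        seen.add(term)          # guard against accidental cycles in the table
        term = IS_A.get(term)   # move up one level, or None at the top
    return False


print(is_a("minivan", "car"))      # True
print(is_a("minivan", "vehicle"))  # True
print(is_a("helicopter", "car"))   # False
```

Even this partial definition lets software answer a question a keyword match cannot: a minivan counts as a car although the word "car" never appears in "minivan."
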
The Semantic Web doesn't entail full natural-language processing; humans will still be responsible for decoding masses of arbitrary text, and (with increasing amounts of software assistance) for using terms from the ontologies in describing information. However, the implications for improved, computer-aided communication—among people, businesses, academia, and especially the government—are huge. As a cutting-edge developer of information technology solutions for numerous government customers, MITRE began exploring Semantic Web research not long after Berners-Lee proposed it and has been involved in related research areas such as ontologies even longer. MITRE's internal research program supports these efforts.

Manola sees the Semantic Web and MITRE as a natural fit. "We have the right resources: people with expertise in databases, ontology and knowledge representation, data interoperability, and XML [Extensible Markup Language] technology, and many of the key application areas," he says. "Data interoperability is a big problem, especially for the government. Different parts of the government encounter the inability of computer systems to interact efficiently sooner rather than later, especially in an emergency. For example, consider the difficulties of interchanging data with local, state, and federal authorities for homeland security. Our inability to exchange data is costly, and sometimes deadly."

A (Very) Short Course in the Semantic Web

Right now, Web pages are created using a markup language called HTML. HTML defines machine-readable "tags" that tell the browser how to display text, graphics, and other material (for example, <i>highlight</i> indicates that "highlight" should be italicized). The World Wide Web Consortium (www.w3c.org) defines the standard HTML tags; thus people throughout the world achieve similar results when designing pages.

But because HTML tags are fixed, and designed for display, it is difficult to expand the Web's capabilities in innovative ways. So instead of HTML, the foundation of the Semantic Web rests on XML.

Using XML tags, information can be labeled in almost any fashion, using terms related to the application (such as <car>Mustang</car>). But tags alone don't convey much meaning to a computer. To extend XML's capabilities, the Semantic Web also requires a language called the Resource Description Framework (RDF). RDF provides a common structure for expressing the information so that it can be exchanged among applications without loss of meaning. This means the information may be made available to applications other than those for which it was originally created—a giant step forward in achieving true system interoperability.
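
As a rough sketch of the idea above, the following Python reads the XML label from the text and restates it as subject-predicate-object statements, the general shape RDF uses. It is an illustration under assumptions, not the RDF syntax itself: the `http://example.org/` identifiers, the `isA` and `name` predicates, and the `to_triples` function are all hypothetical.

```python
import xml.etree.ElementTree as ET

# The XML label from the text: the tag name says what kind of thing this is.
element = ET.fromstring("<car>Mustang</car>")

# Hypothetical namespace; RDF identifies things with URIs, but these are made up.
EX = "http://example.org/"


def to_triples(elem):
    """Turn one labeled element into (subject, predicate, object) statements."""
    subject = EX + "item/" + elem.text
    return [
        (subject, EX + "isA", EX + elem.tag),   # what kind of thing it is
        (subject, EX + "name", elem.text),      # what it is called
    ]


triples = to_triples(element)
for subject, predicate, obj in triples:
    print(subject, predicate, obj)

# A different application can now ask a question without knowing the XML layout.
cars = [s for (s, p, o) in triples if p == EX + "isA" and o == EX + "car"]
print(cars)
```

The point of the common structure is in the last two lines: a program that never saw the original document can still select every item stated to be a car.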

MITRE's role in this is substantial. For instance, as a key member of the W3C's RDF Core Working Group, Manola is one of two co-editors on the consortium's RDF Primer, currently in development. As an official RDF guidebook from W3C, the primer will likely become a fundamental source for teaching the basics of RDF. Several other MITRE scientists are involved in other crucial, related areas, such as Leo Obrst's work on Web-ontology languages, that will eventually help turn theory into reality.

To put the Semantic Web through its paces, Manola has been developing various test cases. One is a disaster-relief scenario—something of great importance to government agencies. With the new technology, a computer search agent could crawl through hundreds of Web sites looking for supplies such as tents. Given the appropriate term definitions and a set of rules for guidance, the agent could locate tents meeting specific criteria—seeking answers based on our intentions, not just specific words. For instance, the agent could find tents that match our requirements regardless of whether dimensions are listed in width and height, person capacity, or some other fashion.

With the right rule set, the search agent could also refer to published ratings guides about tents, ranking them in order of perceived quality. And of course, the agent wouldn't find only tents; it could find "portable shelters" as well (using a definition that a tent is a portable shelter). What the agent wouldn't do is return information outside of the category of relief supplies that use the word "tent" (such as "tent caterpillar"). When was the last time your search engine did that?
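
The tent scenario can be sketched in a few lines of Python. Everything here is assumed for illustration: the sample listings, the `IS_A` term definitions, the use of floor width and length as the listed dimensions, and the rule of roughly 2 square meters per person are hypothetical and do not come from Manola's test cases.

```python
# Hypothetical term definitions: direct "is a" links between categories.
IS_A = {
    "tent": "portable shelter",
    "portable shelter": "relief supply",
    "tent caterpillar": "insect",
}

# Hypothetical rule: floor area needed per person, used when a listing
# gives dimensions instead of a person capacity.
SQ_M_PER_PERSON = 2.0

# Made-up listings, described in different ways by different sites.
LISTINGS = [
    {"name": "Dome tent", "category": "tent", "capacity": 4},
    {"name": "Field shelter", "category": "portable shelter",
     "width_m": 4.0, "length_m": 5.0},
    {"name": "Tent caterpillar guide", "category": "tent caterpillar"},
    {"name": "Pup tent", "category": "tent", "width_m": 1.5, "length_m": 2.0},
]


def is_a(term, category):
    """Follow is-a links upward from `term` until `category` is found or links run out."""
    seen = set()
    while term is not None and term not in seen:
        if term == category:
            return True
        seen.add(term)
        term = IS_A.get(term)
    return False


def capacity(listing):
    """Person capacity, taken directly or estimated from floor dimensions."""
    if "capacity" in listing:
        return listing["capacity"]
    if "width_m" in listing and "length_m" in listing:
        return int(listing["width_m"] * listing["length_m"] / SQ_M_PER_PERSON)
    return 0


def find_shelters(listings, min_people):
    """Names of portable shelters that hold at least `min_people`."""
    return [
        item["name"]
        for item in listings
        if is_a(item["category"], "portable shelter")
        and capacity(item) >= min_people
    ]


print(find_shelters(LISTINGS, 4))  # ['Dome tent', 'Field shelter']
```

The query matches on meaning rather than wording: the field shelter is found although it is not called a tent, the caterpillar guide is skipped although its name contains "tent," and the two ways of describing size are reduced to one comparable number.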

Can I Use It Anyway?

The above description is an oversimplification, of course. Just as sophisticated Web sites use all sorts of programming scripts to enliven plain HTML pages, the Semantic Web requires sophisticated technology to work to its fullest capacity.

But as Manola likes to say, "As an end user all I want is a pizza—I don't want to know how to make mozzarella!" In other words, you will no more need to be a programmer or database manager to use this technology than you need to be a Web designer to use Yahoo. Eventually, there will be easy tools for creating Semantic Web information (some are being developed now), just as there are for the current Web.

Manola foresees great advances in software to automate and simplify the process of generating those all-important tags. Search engines must evolve too, he notes, as well as tools for classifying information. The better the organization of the data, the better the results.

Most important, there will be enormous issues revolving around data exchange and the documenting of agreements so the Semantic Web can reach its full potential, as well as trust issues (such as, is the data from a reliable source?). "There's a lot of social work to do—software can't read minds," he says. "It's a slow process. The costs will be large, but so will the benefits."

Evolution Begets Revolution

What's the time frame for all this? According to Manola, some segments of the marketplace have already taken the plunge. For instance, media giant Reuters provides XML-coded material (in some cases including RDF with references to structured ontologies) to make its news feeds more useful to its customers.

"One of the biggest difficulties is that application domains may come into conflict with each other [because of similarities in words and their meanings]. Defining these formal ontologies is not going to happen at once. So far, this has started with specific application domains that really need it, such as medicine and some media. To some people, it is worth doing now and they're doing it. For them, there are market forces that make it worthwhile," adds Manola.

He likens the evolution of the Web to the early years of the Internet. "Academia and the military were using the Internet heavily long before businesses did. The federal government is really interested in this right now, for instance."

The W3C's stated long-term goal for the Semantic Web is "to develop a software environment that permits each user to make the best use of the resources available on the Web." When this goal is attained, Manola and his colleagues visualize a future in which the immense stores of knowledge on this evolved Web would better assist us at work, school, and home.

"The Semantic Web would make maximum use of the information resources we have," he says. "People won't be out of the loop, however. In fact, this will enable a better synergy between software and people."

—by Alison Stern-Dunyak


Page last updated: February 16, 2004

Copyright © 1997-2013, The MITRE Corporation. All rights reserved.
MITRE is a registered trademark of The MITRE Corporation.
Material on this site may be copied and distributed with permission only.
