MITRE Publications: The Edge

Creating Standards for Multiway Data Sharing

Elizabeth Harding, Leo Obrst, and Arnon Rosenthal

Sharing data across multiple systems within an organization, such as the Department of Defense, or among organizations, such as those of the Intelligence Community, is essential to making fast, informed decisions. This is particularly true in times of crisis. But what happens when you're trying to exchange vital information across dissimilar systems, each with its own data format? Often, humans have to painstakingly do the translations. With shared standards, however, systems can talk to each other, machine to machine.

In this article, we describe two MITRE-supported efforts to do just that. Our work to help our sponsors create small, tractable data standards could provide other organizations with ideas on how to solve their data-sharing issues.

The first project involves creating a new capability for the Air Force: the automated passing of key sets of information across multiple machines. MITRE developed the concept and prototype, known as Cursor on Target (CoT), in 2003 to meet an urgent need. Michael Butler and his team did this by narrowing down the content area to the most critical elements and creating a small standard that can be extended for use across the enterprise. Certain Air Force systems can now exchange key information about the tactical environment (targets, troops, etc.) without the need for human translation. Another MITRE team then documented and modeled the CoT approach so that it could be used by other groups.

In the second project, we worked on the Intelligence Community (IC) Metadata Standard for Publications, which is designed to standardize publication metadata (e.g., document structure, author, creation date, security level, topic) across a large community. Standardized metadata enables consistent retrieval of meaningful and relevant information.

Machine-to-Machine Messages

The CoT prototype gave the Air Force the ability to rapidly exchange information on strike missions across multiple systems, enhancing the reaction speed and accuracy of these missions. CoT focuses on "what" (is it a target to be hit or a survivor to be rescued?), "where" (guidance coordinates), and "when" (timeline).
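To make the what/where/when idea concrete, here is a minimal Python sketch of a small machine-readable message that one system builds and another parses. The element and attribute names (`event`, `what`, `where`, `when`, `kind`, `start`, `stale`) and the sample values are assumptions chosen for illustration; they are not taken from the actual CoT schema.

```python
import xml.etree.ElementTree as ET


def build_message(what, lat, lon, start, stale):
    """Build a small what/where/when message (hypothetical layout, not the real CoT schema)."""
    event = ET.Element("event")
    ET.SubElement(event, "what", {"kind": what})
    ET.SubElement(event, "where", {"lat": f"{lat:.5f}", "lon": f"{lon:.5f}"})
    ET.SubElement(event, "when", {"start": start, "stale": stale})
    return ET.tostring(event, encoding="unicode")


def parse_message(xml_text):
    """Recover the what/where/when fields on the receiving system."""
    event = ET.fromstring(xml_text)
    where = event.find("where")
    when = event.find("when")
    return {
        "what": event.find("what").get("kind"),
        "lat": float(where.get("lat")),
        "lon": float(where.get("lon")),
        "start": when.get("start"),
        "stale": when.get("stale"),
    }


# The sender serializes; the receiver parses without any human re-entry.
message = build_message("survivor", 33.3, 44.4,
                        "2004-06-01T12:00:00Z", "2004-06-01T12:05:00Z")
received = parse_message(message)
```

Keeping the message this small is the point of the sketch: every participating system only has to agree on a handful of fields.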

With CoT, key information is shared machine to machine, whereas in the past it could be shared only via voice transmission, human transcription, and manual data reentry. This prototype was enthusiastically received by Air Force leaders, has been tested extensively, and is being used in Iraq today. Butler's team is working with other groups within the military services to apply the CoT approach to their needs.

Our MITRE team (part of the Air Force Data Interoperability Group) saw CoT as an opportunity to apply and test our work on standards and metadata expressed in information models. We were looking for ways to reduce system design time through simple ontologies and Community of Interest (COI) processes. We wanted to build on the work of Butler's team, extending and enhancing it through models that would make it easier for other groups to understand and adopt. Our goal was to demonstrate a scalable approach to solving data interoperability problems: understand, analyze, and represent the meaning of the data; model the results of the analysis; and build one of many possible physical representations. We call this work the "Cursor on Target-Extender" (CoT-X) initiatives.

The first CoT-X (CoT-X1) provides additional representations in a "visualizable" Unified Modeling Language (UML) and a common Extensible Markup Language (XML) representation based on the UML. Modeling puts everything in English so that people can quickly understand the approach and apply it to other projects. UML representations are key to the CoT-X initiatives because they set the stage for future interoperability by using formal models, which is one alternative to using standard data. Ultimately, negotiation between systems will be based on models, and systems will self-describe using models.

Before developing our information model, we identified stakeholders and selected information analysts/systems engineers to play the role of "information modelers," people who understand data and how to analyze it and represent it for flexible, efficient processing.

To get members of the participating organizations involved in creating CoT-X1, we formed a COI made up of subject matter experts. Our team looked at position information in seven systems (including one message set and the CoT standard). Our information modelers met with 17 experts to discuss, describe, and validate their particular system's position representation. We asked them open-ended questions about the information they needed and shared, and we dug through volumes of data documentation.

The information modelers then analyzed the results of these sessions and the document review and described each system representation as a UML model. They looked for the intersections in all the information to determine what information is common and unambiguous. We then constructed a consolidated CoT-X1 model to which each of the systems can map.
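The search for intersections can be sketched in a few lines of Python. The system names and element names below are invented for illustration. The sketch also assumes the hard part is already done: that analysts have reconciled names so that two elements with the same name really mean the same thing, which in the project required interviews and document review rather than code.

```python
# Hypothetical position elements per system; all names are invented for illustration.
system_elements = {
    "system_a": {"latitude", "longitude", "altitude", "time"},
    "system_b": {"latitude", "longitude", "time", "speed"},
    "system_c": {"latitude", "longitude", "time", "altitude"},
}


def common_elements(systems):
    """Return the elements present in every system's representation."""
    element_sets = list(systems.values())
    return set.intersection(*element_sets) if element_sets else set()


def elements_outside_core(systems, core):
    """List, per system, the elements that the common core does not cover."""
    return {name: sorted(elems - core) for name, elems in systems.items()}


core = common_elements(system_elements)
leftovers = elements_outside_core(system_elements, core)
```

The leftover elements are the candidates a modeling team would then review: some are added to the consolidated model, others stay system-specific.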

For the CoT-X1 initiative, we reused the position data from CoT and added 10 elements to elaborate on time, source, and accuracy in the consolidated CoT-X1 model. The analysis, modeling, and consolidation steps took approximately 10 weeks (400 hours) of the information modelers' time. The resulting 40 hours per new element was considered quite acceptable, especially since we kept demand on subject matter experts low. We then manually created an XML schema from the CoT-X UML. For the second initiative (CoT-X2), we aimed to reduce the modelers' time by up to 20 percent. We realized little improvement, however, because of the ramp-up of a new analysis team. We expect time improvement in the future, but it's important to note that it takes a lot of analysis to reduce data to its core elements.

The interoperability benefits of CoT have extended beyond the immediately affected programs. The CoT and CoT-X UML models and schemas are both registered in the DOD Metadata Registry (formerly called the XML Registry) in the Aerospace Operations namespace, where they are available to anyone in the Department of Defense. In addition, a number of MITRE-supported programs are using the CoT-X UML models to articulate requirements for ground moving target indicators and to communicate with contractors.

One key to the CoT-X initiative's success is that it involved small communities supported by information analysts. The representation of semantics in UML (as simple ontologies) captured important aspects of meaning, including nonhierarchical relationships.

MITRE often spearheads the development of data standards that ease information exchange among diverse systems.

Intelligence Community Data Sharing

Much of the data consumed and produced in the IC is in the form of text documents—e.g., reports on economic and political conditions and military capabilities.

The IC needs improved capabilities to share and reuse documents across the community, as well as to search and to apply security, archiving, and other applications to these documents. The best way for the agencies to achieve these goals is through shared metadata. That is, information providers must attach descriptive tags to appropriate chunks of their documents (e.g., where the information came from, what the content is about, classification level, etc.) and provide tools that will apply and exploit the tags. To do this, organizations must standardize the tags that producers use.
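As a rough illustration of attaching descriptive tags to chunks of a document and then exploiting them, the Python sketch below filters sections by their metadata. The tag and attribute names (`report`, `section`, `source`, `topic`, `classification`) and their values are hypothetical and are not drawn from any IC standard.

```python
import xml.etree.ElementTree as ET

# A toy document whose sections carry descriptive tags (all names hypothetical).
DOCUMENT = """
<report>
  <section source="field-office" topic="economy" classification="unclassified">
    Text about economic conditions.
  </section>
  <section source="analyst-desk" topic="military" classification="restricted">
    Text about military capabilities.
  </section>
</report>
"""


def sections_matching(xml_text, **wanted):
    """Return the sections whose attributes equal every requested value."""
    root = ET.fromstring(xml_text)
    return [
        section
        for section in root.iter("section")
        if all(section.get(key) == value for key, value in wanted.items())
    ]


economy = sections_matching(DOCUMENT, topic="economy")
open_sections = sections_matching(DOCUMENT, classification="unclassified")
```

A search or security tool can only work this way across agencies if every producer uses the same attribute names and values, which is the case for standardizing the tags.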

Tag standardization is increasing within the IC but is still spotty. In general, different agencies use different standards. Consequently, documents are difficult to share across agencies, and frequently within agencies, even when they are well tagged, because the recipient system may use different tags. Often documents are not thoroughly or uniformly tagged, and no full set of tools is available for tagging or for interpreting tags from one system in another.

To solve these problems, the IC is developing the IC Metadata Standard for Publications (IC-MSP), which is aimed at describing bibliographic information as well as generic document components. It's an ongoing effort. Such a standard will enable the creation of tools to exploit tags provided by other agencies, encourage more complete tagging, and increase the incentive to create new tools. It will also facilitate the interchange and reuse of published IC products and their components.

IC-MSP is an implementation of XML intended for document-style intelligence products posted on Intelink (the classified intelligence network shared by the IC) and other domain servers. IC-MSP currently consists of more than 200 XML elements or attribute definitions, as well as additional elements that are common to most publication hierarchies, including the IC Information Security Marking standard's security attributes. (More standards will be added in the future.)

This IC document markup effort is a large initiative that currently involves hundreds of elements (basic document properties) for all new IC products and many agencies. The Intelligence Community CIO funds the IC-MSP effort, and agencies have agreed to mandate its use where appropriate.

To make this happen, the IC Metadata Working Group (ICMWG) was created in 2000. It includes representatives from major intelligence agencies and partners (e.g., State, Justice, Energy, the military commands and services, and some commercial companies). MITRE representatives sit in on these meetings to observe and advise.

MITRE has provided a variety of assistance to the IC, including creating the requirements for a sound metadata architecture and infrastructure, developing cases and scenarios for exploiting metadata, and giving strategic advice on emerging Semantic Web standards. We recently completed a sponsor-requested review of the IC-MSP standard, and we also helped the ICMWG respond to Congressional inquiries on XML and metadata issues. We continue to work on describing more IC-MSP semantics to support wider sharing and exploitation.

To enable the IC-MSP, many IC agencies have begun pilots to apply tags or generate them from other formats that were already being captured. Intelink is developing applications that exploit these tags to accomplish information sharing and interoperability across the community and to make search and discovery more effective.

Conclusion

These two efforts in establishing standards for data sharing have been successful for several reasons. The scope of the subject matter was manageable, and the users were interested in finding a solution. They agreed that the standard did not need to cover all information possible, just the most critical information to be exchanged. Also, both chose widely known languages for expressing their standards: UML (aimed at software developers) and XML (an industry standard for data interchange and document management).

Successful standardization efforts have a realistic understanding of both the benefits and the limits of data standards. Because no single standard will describe all systems in a very large enterprise with many autonomous participants, organizations must develop effective two-pronged strategies. First, they should minimize diversity by developing data standards within focused communities of interest, as described in this article. Second, they must develop tools and processes to help system builders mediate across multiple standards and both conforming and non-conforming systems. MITRE is currently developing such strategies for several of our sponsors.
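One simple form of mediation across standards is a mapping from each community's tags to a shared vocabulary. The Python sketch below shows that idea only; the vocabularies, field names, and sample record are hypothetical, and real mediation tools would also have to handle differences in meaning, units, and structure.

```python
# Hypothetical tag vocabularies for two communities; none of these names come from a real standard.
A_TO_SHARED = {"author": "creator", "written": "created", "secLevel": "classification"}
B_TO_SHARED = {"byline": "creator", "date": "created", "marking": "classification"}


def to_shared(record, mapping):
    """Translate a record into the shared vocabulary, keeping unmapped fields separate."""
    shared, unmapped = {}, {}
    for key, value in record.items():
        if key in mapping:
            shared[mapping[key]] = value
        else:
            unmapped[key] = value
    return shared, unmapped


def from_shared(shared, mapping):
    """Translate a shared-vocabulary record into one community's local tags."""
    reverse = {shared_name: local for local, shared_name in mapping.items()}
    return {reverse[key]: value for key, value in shared.items() if key in reverse}


record_a = {"author": "J. Doe", "written": "2004-08-05",
            "secLevel": "unclassified", "desk": "7"}
shared, unmapped = to_shared(record_a, A_TO_SHARED)
record_b = from_shared(shared, B_TO_SHARED)
```

Routing through a shared vocabulary means each community maintains one mapping instead of one per partner. The `unmapped` result makes visible what a non-conforming field would lose in translation.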

Information Interoperability Issue

Summer 2004
Vol. 8, No. 1



Introduction

Arnon Rosenthal and Len Seligman


A Framework for Information Interoperability

Len Seligman and Arnon Rosenthal


How Do We Build Information Systems That Support Network-Centric Warfare?

Scott Renner


Network Representations Support Powerful Data Analysis

Sarah Piekut, Lowell Rosen, and Daniel Venese


The Semantic Web: A Path to Large-Scale Interoperability

Frank Manola, Mary Pulvermacher, and Leo Obrst


Mapping Among Independently Developed Aviation Information Systems Increases Interoperability

Catherine Bolczak, Len Seligman, Nels Broste, Ron Schwarz, and Shawne Lampert


Using Data Warehousing to Integrate Multiple Sources of Data

Victor Pérez-Núñez, Robert Jurgens, Larry Hughes, and Ali Obaidi


Creating Standards for Multiway Data Sharing

Elizabeth Harding, Leo Obrst, and Arnon Rosenthal


Formatted Messaging Modernization Exploits XML Technologies

Robert W. Miller, Mary Ann Malloy, and Ed Masek



For more information, please contact Elizabeth Harding, Leo Obrst, Larry Hughes or Arnon Rosenthal using the employee directory.


Page last updated: August 5, 2004

Copyright © 1997-2013, The MITRE Corporation. All rights reserved.
MITRE is a registered trademark of The MITRE Corporation.
Material on this site may be copied and distributed with permission only.
