Information Management
Information Management investigates databases, distributed databases,
data mining, and legacy databases.
Building the Semantic Web
Bedford and Washington
Problem
As the amount of information on the World Wide Web continues to grow,
the value of automated tools capable of finding, filtering, and combining
information in response to specific user requirements greatly increases.
The largest barrier preventing more automated use of Web resources is
that the semantics (meaning) of these resources is generally unavailable
to automated agents.
Objectives
The objective of this project is to develop technical foundations for
a “Semantic Web,” in which programs such as agents, search
engines, or service brokers can identify and use World Wide Web resources
(both information and services) based on machine-readable representations
of their semantics.
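As a rough illustration of what "machine-readable representations" could look like, the minimal sketch below describes Web resources as subject-predicate-object triples and lets a program find resources by their declared type. The URLs, the `ex:` vocabulary, and the `match` helper are assumptions made up for this example; they are not part of the project or of any standard library.

```python
# Hypothetical resource descriptions as (subject, predicate, object) triples.
TRIPLES = [
    ("http://example.org/forecast", "rdf:type", "ex:WeatherService"),
    ("http://example.org/forecast", "ex:covers", "ex:Boston"),
    ("http://example.org/news", "rdf:type", "ex:NewsFeed"),
]


def match(pattern, triples=TRIPLES):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        triple
        for triple in triples
        if all(want is None or want == have for want, have in zip(pattern, triple))
    ]


# An agent looks up every resource that declares itself a weather service.
services = [subject for subject, _, _ in match((None, "rdf:type", "ex:WeatherService"))]
```

The point of the sketch is only that the agent selects a resource from its declared meaning rather than from keywords in its text.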
Activities
We are investigating language concepts for representing and processing
semantic information that scale to the Web environment, and application
areas that include eBusiness and disaster relief. We are participating
in the World Wide Web Consortium’s Semantic Web Activity, engaging
in joint research with MIT’s Context Interchange (COIN) project,
and cooperating with researchers in DARPA's DAML (DARPA Agent Markup Language)
program.
Impacts
This research addresses a key area of current Web technology development,
impacting numerous MITRE programs dependent on Web technologies such as
XML, as well as wider eBusiness and other communities addressing issues
of large-scale interoperability. The research also provides technology
transfer opportunities with a wide range of academic and industry R&D
activities and standards groups.
Data Integration as an Industrial Process
Bedford and Washington
Problem
Data integration requires too much human time and skill. We need to industrialize,
to create narrow-skill steps, each of which produces reusable knowledge
rather than opaque code. To move from (easily evaded) mandates to natural
incentives, we will explore “describe and generate” tools
to make even the first connection easier. The approach should be incremental,
driven by real interoperability needs, not special initiatives.
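One way to picture a "describe and generate" tool is sketched below: the integrator writes a declarative description of how source fields map to target fields, and the translation code is generated from that description. This is a minimal sketch under stated assumptions; the field names, the `MAPPING` format, and `generate_translator` are hypothetical and do not reflect any particular prototype.

```python
# Hypothetical declarative mapping: target field -> (source field, converter).
MAPPING = {
    "airport_code": ("apt", str.upper),
    "runway_length_m": ("rwy_len_ft", lambda feet: round(float(feet) * 0.3048, 1)),
}


def generate_translator(mapping):
    """Build a record translator from a declarative mapping description."""

    def translate(record):
        return {
            target: convert(record[source])
            for target, (source, convert) in mapping.items()
        }

    return translate


translate = generate_translator(MAPPING)
result = translate({"apt": "bos", "rwy_len_ft": "10000"})
```

The mapping itself is the reusable knowledge: it can be reviewed, versioned, and reused for other targets, whereas hand-written conversion code would hide the same decisions.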
Objectives
Our goals are to refine the industrial approach and to move industry,
the research community, and sponsors toward that vision. Specifically,
we will extend (very scalable) profile-driven integration techniques to
be compatible with commercial multidatabase query tools, develop metrics
to help project planners compare data integration techniques and judge
tools' utility, and evaluate emerging describe-and-generate data integration
research prototypes in real projects.
Activities
We will conduct experiments using research prototypes (e.g., IBM Research's
Clio) with aviation, brain mapping, and tax administration data; conduct
a survey of data integration practitioners to determine where the costs
are the greatest; and adapt metrics to improve project planning. Subsequently,
we will refine the metrics and perform further experimentation. Throughout,
we will publish results and transition them to MITRE and sponsor projects.
Impacts
We will reframe a critical technology to reflect rarely addressed
organizational realities. We will influence emerging industrial tools
and researchers' agendas, and provide metrics where none previously existed.
MITRE's sponsors will be aided in moving from doomed giant data integration
initiatives to incremental progress.
Database Curation and Access for Bioinformatics
Bedford and Washington
Problem
Biological databases store information on proteins, genes, and their functions.
The biomedical literature describes the experiments behind the database
entries. Many databases lag behind the literature because they require
biologists to transfer the information from articles to database entries.
Biologists need interactive tools to help in the timely and consistent
transfer of information from the literature into the databases (the “curation”
process).
Objectives
This project will develop interactive techniques for the curation of biological
databases. These techniques will allow database curators to maintain currency
and consistency of these databases in the face of the exponential growth
of research in genomics and proteomics. To provide the curation tools,
we will develop information mining methods for free text and structured
data, specifically geared to the biology domain.
Activities
In the first year we will determine requirements for biological database
curation and mine existing databases for training and test data. We will
also develop an initial curation system prototype and conduct initial
technology and user-centered evaluations. In the second year we will refine
our interactive curation system and further evaluate it. Finally, we will
explore a Question-and-Answer front end.
Impacts
This work will benefit the biology community by defining methods
and evaluations for automating database curation. We estimate that hundreds
to thousands of biology databases exist, and the number is growing rapidly.
Investment in this area leverages MITRE's expertise in text data mining
and databases, allowing MITRE to become a significant player in bioinformatics.
Intelligence Sharing Across Boundaries
Eric Hughes, Principal Investigator
Bedford and Washington
JBI Agent Based Architecture
Mary K. Pulvermacher, Principal Investigator
Bedford and Washington
Problem
Achieving flexible and scalable information management across the DoD
enterprise will require a World Wide Web that is not only machine readable,
but also machine understandable. The term “Semantic Web” describes
a vision of a Web that is built upon XML and other Web technologies. The
question is how to apply these emerging technologies to help achieve C2
Enterprise interoperability.
Objectives
Our project applies emerging Semantic Web technologies to solve a real
problem in the intelligence domain by using an ontology and agents to
perform ontologically driven queries over image product libraries (IPLs)
on behalf of the analyst. The project also examines how to bridge Semantic
Web technologies with an agent-based Joint Battlespace Infosphere (JBI).
Activities
We will build an IPL search agent that uses an IPL ontology and analyst
heuristics to determine which of the over 450 IPL repositories to query
for a needed product. This should allow queries to be performed more quickly
and effectively. We will also examine how to integrate this agent into
MITRE’s eJBI prototype.
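The repository-selection idea can be sketched roughly as follows: expand the analyst's concept through the ontology, then rank repositories by how many of the expanded concepts they cover. Everything here is an assumption for illustration, including the toy ontology, the repository names, and the coverage-count heuristic; the actual IPL ontology and analyst heuristics are not described in this summary.

```python
# Hypothetical toy ontology: each concept maps to its narrower concepts.
NARROWER = {
    "vehicle": ["aircraft", "ship"],
    "aircraft": ["helicopter"],
}

# Hypothetical repository descriptions: name -> concepts its products cover.
REPOSITORIES = {
    "ipl-a": {"helicopter", "ship"},
    "ipl-b": {"terrain"},
    "ipl-c": {"aircraft"},
}


def expand(concept):
    """Return the concept plus everything narrower than it in the ontology."""
    found = {concept}
    for child in NARROWER.get(concept, []):
        found |= expand(child)
    return found


def select_repositories(concept):
    """Rank repositories by how many of the expanded concepts they cover."""
    wanted = expand(concept)
    scores = {name: len(wanted & covered) for name, covered in REPOSITORIES.items()}
    relevant = [name for name, score in scores.items() if score > 0]
    # Highest coverage first; ties are broken alphabetically for stable output.
    return sorted(relevant, key=lambda name: (-scores[name], name))


selected = select_repositories("aircraft")
```

With hundreds of repositories, a ranking of this kind would let the agent query only the few most likely sources instead of all of them.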
Impacts
A key impact will be to develop informed opinions on how Semantic Web,
JBI, and software agent technologies could be applied to our customers’
domains. Moreover, by taking a pragmatic approach to their application,
we may help solve a real problem, for real people, in a real program.
Our findings could also potentially affect emerging international standards.
Neuroinformatics
Washington
Problem
The neuroscience community is accumulating a vast amount of human brain
mapping data that does not reach its full scientific potential because
it is generally confined to the originating lab. While data may exist
that a researcher could use to explore a hypothesis, the investigator
may be unaware of it or lack access to it.
Objectives
The overall goals of this research, conducted in conjunction with an external
NIH grant, are to design, prototype, and evaluate an information infrastructure
to help realize the full potential of a growing store of human brain mapping
data. In this initial undertaking, we focus on a system that enables the
analysis, exploration, and dissemination of structural magnetic resonance
imaging (MRI) data.
Activities
We have made significant progress toward development of a digital library,
including schema integration, a data sharing policy space, and Web-based
tools for exploring the data. The project focus is on medical image exploitation:
designing query-by-example functionality, enhancing querying using data
mining, and developing data quality metrics intrinsic to the neuroimagery.
We are also continuing to acquire MRI data from our collaborators.
Impacts
This project provides an important public service to the neuroscience
research and clinical communities. But the problems facing these communities
are not unique; they are isomorphic to those facing many of MITRE’s
traditional sponsors who must manage and exploit large quantities of imagery.
We expect our research to readily transition to our Treasury Department,
DoD, and USGC sponsors.
Policy Management for the Enterprise JBI/C2E Web
Bedford and Washington
Problem
Digital information is rapidly becoming integrated into all aspects of
military activities. Operations are becoming increasingly fast-paced and
diverse. To provide commanders with the knowledge required to make decisions
in this environment, a greatly enhanced C2 concept for intelligence gathering,
dissemination, and visualization is needed, based on revolutionary new
information age concepts and technologies.
Objectives
The C2 Enterprise (C2E) architecture represents a way to realize the Joint
Battlespace Infosphere (JBI) vision through a loosely coupled information
environment using commercial standard Web protocols. Our primary objective
is to evaluate and integrate policy-based techniques for managing information
dissemination that can influence the flow of information between publishers
and subscribers in this loosely coupled environment.
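A minimal sketch of policy-based dissemination is given below, assuming a broker that delivers a message to a subscriber only when the topic matches and every registered policy allows it. The `Subscriber`, `Message`, and `Broker` classes and the clearance rule are hypothetical illustrations, not the C2E or JBI interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Subscriber:
    name: str
    clearance: int
    topics: Set[str]


@dataclass
class Message:
    topic: str
    classification: int
    body: str


# A policy decides whether one subscriber may receive one message.
Policy = Callable[[Subscriber, Message], bool]


def clearance_policy(subscriber: Subscriber, message: Message) -> bool:
    """Hypothetical rule: clearance must meet the message classification."""
    return subscriber.clearance >= message.classification


@dataclass
class Broker:
    policies: List[Policy] = field(default_factory=list)
    subscribers: List[Subscriber] = field(default_factory=list)

    def publish(self, message: Message) -> List[str]:
        """Return the names of subscribers allowed to receive the message."""
        return [
            s.name
            for s in self.subscribers
            if message.topic in s.topics
            and all(policy(s, message) for policy in self.policies)
        ]


broker = Broker(
    policies=[clearance_policy],
    subscribers=[
        Subscriber("ops", clearance=2, topics={"weather"}),
        Subscriber("guest", clearance=0, topics={"weather"}),
    ],
)
recipients = broker.publish(Message("weather", classification=1, body="storm warning"))
```

Because policies are plain functions held by the broker, they can be added or replaced without changing publishers or subscribers, which is the property a loosely coupled environment needs.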
Activities
This continuing research in the JBI will focus on the selection and integration
of policy-based management services, effectively bridging the gap between
those services and information brokering while seeking to influence industry
to provide these capabilities. These core services will be evaluated and
prototyped for integration into C2E infrastructure/common integrated infrastructure
enterprise services.
Impacts
The expected outcome will enable the migration of C2 mission applications
to a truly adaptable Web Services architecture that supports policy-based
monitoring and control. A fundamental contribution of this work will be
the transition of key concepts and the best implementation components
to existing and emerging C2 programs, including the Multi-sensor Command
and Control Constellation (MC2C) contractor base.