Intelligent Information Processing
Intelligent Information Processing investigates technologies, tools,
and processes that support the discovery, processing, exploitation, and
dissemination of information, tools, and knowledge. Intelligent agents
are covered in this area.
Audio Hot Spotting
Qian Hu, Principal Investigator
Bedford and Washington
Problem
Large volumes of recordings require rapid retrieval of segments potentially
relevant to a given query (audio hot spotting). Because of high automatic
speech recognition (ASR) word error rates and the loss of important audio
information in speech transcription, spoken document retrieval systems
that simply combine ASR with information retrieval (IR) do not meet this
need in real applications.
Objectives
We propose to research and develop audio-specific retrieval algorithms
in critical domains by (1) exploiting multiple types of acoustic information
from the audio signals; (2) exploring several adaptive techniques to improve
existing ASR performance; and (3) fusing component technologies such as
ASR, language/speaker identification, audio feature extraction, and information
retrieval.
Activities
We will research algorithms and techniques to extend and improve ASR and
audio feature extraction and to develop audio-based query algorithms making
use of the multiple types of audio information. We will research and develop
fusion algorithms to build an audio hot spotting system based on the extended
ASR, audio feature extraction, language/speaker identification, and the
new audio query language.
Impacts
Our research in audio hot spotting algorithms and prototype development
will address the needs of MITRE sponsors with warehouses of recordings
waiting for efficient retrieval. It will extend MITRE's information retrieval
capability from text to include audio. The expertise gained through the
research will equip MITRE to better advise industry developers and our
sponsors on audio information retrieval topics and evaluation standards.
Automated Information Discovery and Retrieval from Asian Language Sources
Ray LeBlanc, Principal Investigator
Bedford and Washington
Problem
While several commercial capabilities exist to address particular facets
of machine translation (MT) needs, emphasis has been placed on European
languages. Furthermore, none of the existing commercial off-the-shelf (COTS)
products is particularly well suited to the military environment. Translating
Asian languages into English is much more difficult than translating European
languages and has presented the MT community with significant challenges.
Objectives
This project will develop a capability to perform Chinese and Korean cross-language
information retrieval, information discovery (ID), data mining (DM), and
knowledge management (KM) in support of open source intelligence analysis.
The project will develop a prototype capability that can support in-field
experimentation with a broad spectrum of users.
Activities
We will provide an automated capability to translate electronic textual
information between Chinese and English, and between Korean and English.
We will characterize and subsequently retrieve information, based on
user-specified profiles, from Chinese- and Korean-language sources by means
of a prototype analytic tool. A dictionary management capability will
allow users to build, import/export, and aggregate custom dictionaries.
Impacts
This project has the potential for improving the efficiency and effectiveness
of intelligence organizations currently impacted by foreign language translation
issues. It is expected to provide the beneficiaries with needed interim
capabilities and validation of the most fertile areas for the future application
of government funds.
Distributed Resource Brokering in Complex Network Environments
Paul Silvey, Principal Investigator
Bedford and Washington
Problem
Many challenging problems facing MITRE's sponsors require the coordinated
use of large numbers of distributed IT resources. Hard computational problems
such as data mining become tractable when thousands of computers are simultaneously
brought to bear. Likewise, global information management architectures,
like those envisioned for a Joint Battlespace Infosphere, require distributed
infrastructures to be sufficiently scalable, fault-tolerant, and timely.
Objectives
We are investigating the performance of various proposed techniques for
distributed resource discovery in peer-to-peer (P2P) networks by modeling
and simulating them in realistic complex network environments. By studying
many approaches in many situations, we aim to discover fundamental principles
that designers of distributed IT systems can use to achieve desired levels
of performance in particular real-world environments.
Activities
To prepare for experimentation, we are developing models of resource brokers
that capture essential aspects of their collective performance in a society
of brokers, such as index size and granularity, breadth of resource and
topic coverage, supply and demand loading, and referral network topology.
We are simultaneously developing network environment models ranging
from simple and static to complex and dynamic.
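
As a rough illustration of the kind of experiment described above, the sketch below simulates resource discovery over a small referral network. Everything in it is an assumption made for illustration, not the project's model: the names `build_network` and `discover`, the ring-plus-random-links topology, the fixed index size per broker, and the flooding search with a hop limit (`ttl`).

```python
import random
from collections import deque


def build_network(n_brokers, n_links, n_resources, index_size, seed=0):
    """Build a hypothetical broker society: referral links plus local indexes."""
    rng = random.Random(seed)
    neighbors = {b: set() for b in range(n_brokers)}
    # A ring keeps the referral network connected.
    for b in range(n_brokers):
        nxt = (b + 1) % n_brokers
        neighbors[b].add(nxt)
        neighbors[nxt].add(b)
    # Extra random referral links shorten paths between brokers.
    for _ in range(n_links):
        a, b = rng.sample(range(n_brokers), 2)
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Each broker indexes a random subset of the resource identifiers.
    index = {
        b: set(rng.sample(range(n_resources), index_size))
        for b in range(n_brokers)
    }
    return neighbors, index


def discover(neighbors, index, start, resource, ttl):
    """Flood a query up to `ttl` referral hops; return (found, messages_sent)."""
    seen = {start}
    queue = deque([(start, 0)])
    messages = 0
    while queue:
        broker, hops = queue.popleft()
        if resource in index[broker]:
            return True, messages
        if hops == ttl:
            continue  # hop limit reached: do not refer the query further
        for nb in sorted(neighbors[broker]):
            if nb not in seen:
                seen.add(nb)
                messages += 1
                queue.append((nb, hops + 1))
    return False, messages
```

Sweeping `ttl`, `index_size`, and `n_links` over many random queries would give a crude picture of the trade-off between message cost and discovery success in this toy setting.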
Impacts
Our sponsors will be affected by the changes that network-centric, distributed,
and P2P technologies are bringing to resource discovery and management
problems. Dynamic information management applications in intelligence, logistics,
weather reporting, and command and control all face scalability and timeliness
challenges that centralized content indexing approaches will not completely
solve. This research will help improve our understanding of key issues.
Foundations for Next Generation Information Access
Bedford and Washington
Problem
Computerized support for information gathering is fragmented across multiple
research communities, and integration is difficult due to the lack of
an underlying formalism that cuts across the different technologies. Statistical
techniques for individual components have been developed in isolation
and without a common theoretical foundation. As a result, we are left with
a number of reasonably effective, semi-principled, incompatible techniques.
Objectives
The principal objective is the development of statistical foundations
for information access. A successful foundation will comprise rigorous
characterizations of the issues of modeling and estimation, together with
principled methodologies for adapting to new languages, genres, information
domains, auxiliary knowledge sources, and tasks.
Activities
We will develop simulations that model the stochastic generation of latent
document features, observable document features, the determination of
document relevance, and the distribution of query characteristics. We
will perform exploratory data analysis on available research corpora to
verify our models. A central focus will be on research into the importance
of variance reduction and the potential benefits of various bias-variance
strategies.
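
For reference, the trade-off mentioned above is usually stated through the standard decomposition of mean squared error. The symbols are introduced here for illustration only: \hat{\theta} stands for any estimator of an unknown quantity \theta.

```latex
\[
\mathrm{E}\big[(\hat{\theta} - \theta)^2\big]
  = \big(\mathrm{E}[\hat{\theta}] - \theta\big)^2 + \mathrm{Var}(\hat{\theta})
\]
```

The first term is the squared bias and the second is the variance, so a strategy that accepts a small bias can still lower total error if it reduces variance by more.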
Impacts
This research is directly relevant to existing MITRE projects. The results
will allow MITRE to develop information access systems incorporating new
sources of evidence and to tailor information systems to meet specific
military and intelligence needs. MITRE will then be strategically positioned
to set the direction of research into, and development of, next-generation
information access technology.
Robot Platoon Command and Control
Washington
Problem
Reliable autonomous soldier robot teams will not be possible for many
years. However, an intermediate level of autonomy, where a commander gives
high-level commands (e.g., “Go to the top of Hill 203”), is
achievable in the near future. This supervisory control requires only
occasional intervention by a commander during a mission.
Objectives
This proposal asserts that one human is adequate for directing a small
team of robots. Validating the assertion will require us to demonstrate
a working team system where robots exhibit some automated reasoning (route
planning, navigation) and cooperative behavior, while attending to human
guidance. We will use reconnaissance tasks in urban terrain as our testbed.
Activities
We will extend behavior-based robotics approaches to include the memory
and communication required for human participation in the team. Our principal
demonstration task will be to produce a team entry for the RoboCup-Rescue
annual competition. We will also investigate the utility of platform mobility
for reconnaissance-directed sensor networks.
Impacts
MITRE’s capability in robotics will be of considerable importance
to our customers in the near future. This proposal builds on MITRE’s
current expertise in command and control and artificial intelligence.
Robot platoon command and control defines a niche that is a natural extension
of this expertise.
Social Information Retrieval
Washington
Problem
Our research centers on developing new technology for tracking Internet-based
networked organizations, and using those results to identify potential
vulnerabilities and threats. Current information retrieval technology
does not directly address the problem of detecting activist networks,
assessing behavior, and tracking their evolution; new technology is needed
to detect networks based on their structure and context.
Objectives
The main objective is to develop technology for a worldwide monitoring
system used to detect the emergence of new groups (e.g., activists) and
track the evolution of existing organizations based on their online activity.
The focus will be on assessing an organization's behavior and its vulnerabilities.
Activities
We are exploring the confluence of information retrieval for collecting
distributed information, social network analysis for determining network
structure and characteristics, and dynamical systems modeling for determining
network function or behavior. Work includes the development of advanced
smart crawler collection tools that will use adaptive and cooperative
searching techniques to provide efficient and high-coverage collection
from the Web or other network search environments.
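
One simple way to picture the social network analysis step is to rank nodes by how badly the network fragments when each one is removed. The sketch below is an illustrative assumption, not the project's method: the function names, the toy link graph, and the choice of "largest remaining component" as the criticality score are all hypothetical.

```python
def build_adjacency(edges):
    """Turn an undirected edge list into an adjacency mapping."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def largest_component(adj, removed=frozenset()):
    """Size of the largest connected component, ignoring `removed` nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best


def rank_critical_nodes(adj):
    """Rank nodes so that those whose removal fragments the network most come first."""
    scores = {n: largest_component(adj, {n}) for n in adj}
    return sorted(adj, key=lambda n: (scores[n], n))


# Hypothetical link graph: two tight clusters joined through one hub site.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "hub"),
         ("hub", "d"), ("d", "e"), ("e", "f"), ("d", "f")]
adj = build_adjacency(edges)
print(rank_critical_nodes(adj)[:3])  # ['hub', 'c', 'd']
```

Removing `hub` splits this toy network into two components of three nodes each, which is why it ranks first; measures such as betweenness centrality would be natural alternatives.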
Impacts
This research will provide new tools for detecting emergent networked
organizations in the open Web and enterprise environments, and will provide
a basis for modeling their behavior and identifying critical nodes for
assessing vulnerabilities and network robustness. Our initial work has
already had an impact on several sponsor mission areas.
Web Ontology Language (OWL)
DARPA Office: IXO
DARPA PM: Mr. Murray A. Burke
Bedford and Washington
Problem
Interoperability is difficult when we use different terms; for example,
“guarantee” versus “warranty.” Will a machine
be able to dynamically realize that these terms mean the same thing? How
do we define the semantics of a vocabulary in a way that will enable machines
to dynamically realize that two terms are the same or are related (and
how they are related)? Solving this problem will go a long way toward
achieving the vision of the Semantic Web.
Objectives
For several years DARPA has been working on an XML-based language to enhance
interoperability by defining the semantics of vocabulary. An outcome of
their work is the Web Ontology Language (OWL). MITRE has been asked to
develop a PowerPoint-based tutorial that explains in a clear fashion the
OWL and OWL Lite ontology languages.
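
To make the "guarantee" versus "warranty" question concrete, the sketch below shows one way a program could conclude that two terms are equivalent from an OWL declaration. The ontology fragment, the class names, and the helper functions are hypothetical and written for illustration; only the `owl:equivalentClass` construct and the RDF and OWL namespace URIs come from the OWL specification. A real system would use a full OWL reasoner, not this minimal grouping of declared equivalences.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical ontology fragment declaring that two classes mean the same thing.
ONTOLOGY = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}">
  <owl:Class rdf:about="#Guarantee">
    <owl:equivalentClass rdf:resource="#Warranty"/>
  </owl:Class>
  <owl:Class rdf:about="#Warranty"/>
  <owl:Class rdf:about="#Invoice"/>
</rdf:RDF>
"""


def equivalence_groups(rdf_xml):
    """Map each named class to the set of classes declared equivalent to it."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    root = ET.fromstring(rdf_xml.strip())
    for cls in root.iter(f"{{{OWL}}}Class"):
        name = cls.get(f"{{{RDF}}}about")
        if name is None:
            continue
        find(name)
        for eq in cls.findall(f"{{{OWL}}}equivalentClass"):
            target = eq.get(f"{{{RDF}}}resource")
            if target:
                # Merging the groups makes equivalence symmetric and transitive.
                parent[find(name)] = find(target)

    groups = {}
    for name in list(parent):
        groups.setdefault(find(name), set()).add(name)
    return {name: groups[find(name)] for name in parent}


def same_meaning(groups, a, b):
    """True if the ontology places terms `a` and `b` in the same group."""
    return b in groups.get(a, {a})


groups = equivalence_groups(ONTOLOGY)
print(same_meaning(groups, "#Guarantee", "#Warranty"))  # True
print(same_meaning(groups, "#Guarantee", "#Invoice"))   # False
```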
Activities
A PowerPoint tutorial will be developed, including complete, validated
examples. The culmination of the work will be a tutorial that is presented
to DARPA.
Impacts
The resulting tutorial will be made available Internet-wide. It will enable
the Internet community to quickly attain expertise in this technology.
The consequence will be increased skill levels on ontologies, and (hopefully)
increased interoperability. From a corporate perspective this will be
great public relations for both MITRE and DARPA.