Projects Featured in Unraveling the Network: Enhancing Analysis for Modern Threats:

Closing the Semantic Gap

Content Extraction and Duplicate Analysis and Recognition (CEDAR)

Integrated Ops-Intel Team Sensemaking for Non-Traditional Warfare

Live Hotspotting of VoIP

Machine Translation for Foreign Language Science and Technology Analysis

Modeling Phase Change Behavior

Rapid Trusted Video Stream Dissemination

Scientometric Analyses for Science & Technology Intelligence (S&TI)

Social Network Services for Intelligence Community Professionals

Spatio-Temporal Information Extraction and Reasoning from Natural Language

Unraveling the Network: Enhancing Analysis for Modern Threats

Closing the Semantic Gap

Marc Vilain, Principal Investigator

Problems:
The explosive adoption of language-enabled analytic tools speaks to their ability to infer what people mean from what people say. But much language-enabled analysis has reached a semantic gap: progress on key tasks is stymied because these methods can only approximate what humans mean. This is especially true for the critical unsolved problem of identifying events and their ramifications.

Objectives:
We will create a computational database of word meanings that will bridge the semantic gap and enable analytic tools to better model human language. This database will be compiled from dictionary definitions that cover the full range of the English language. Further, we will marshal algorithmic methods that apply the inter-relationships of word meanings to identifying events and their ramifications.

Activities:
In order to create this lexical database, we will: infer hierarchies of word meanings from their dictionary definitions; establish those non-hierarchical word relations that further capture essential meanings; and derive a repertoire of primitive semantic elements from which meanings are composed. We will also define evaluation measures to better assess progress and determine goodness of fit to actual analytic tasks.
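
As an illustration of inferring a hierarchy from definitions, the sketch below guesses each word's parent (its "genus" term) from a toy dictionary. The heuristic, the marker list, the toy entries, and the function names are all assumptions made for this example; they are not the project's method, and real dictionary definitions would need proper parsing.

```python
# Toy genus-term heuristic: the parent of a word is taken to be the last
# word of its definition before the first "marker" word (that, which, ...).
# This is a hypothetical simplification, not the project's algorithm.
STOP_MARKERS = ("that", "which", "of", "with", "used", "having")

def genus_term(definition, stop_markers=STOP_MARKERS):
    """Guess the genus (parent) word of a definition, or None if empty."""
    words = definition.lower().replace(",", " ").split()
    head = []
    for word in words:
        if word in stop_markers:
            break
        head.append(word)
    return head[-1] if head else None

def build_hierarchy(dictionary):
    """Map each headword to its guessed parent word."""
    return {word: genus_term(text) for word, text in dictionary.items()}

def ancestors(word, parent_of):
    """Follow parent links upward, stopping at unknown words or cycles."""
    chain = []
    seen = {word}
    current = parent_of.get(word)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent_of.get(current)
    return chain

# Hypothetical toy entries, written so the heuristic applies cleanly.
toy_dictionary = {
    "spaniel": "a small dog with long ears",
    "dog": "a domesticated mammal that barks",
    "mammal": "a warm-blooded animal that nurses its young",
}
parent_of = build_hierarchy(toy_dictionary)
print(ancestors("spaniel", parent_of))
```

On this toy input the chain for "spaniel" climbs through "dog" and "mammal" to "animal", where it stops because "animal" has no entry.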

Impact:
This work will sharpen analytic capabilities in many areas. In particular, being able to reliably identify and compare events is necessary to next-generation capabilities in Indications and Warnings. It will enable: the ability to catalogue what happens to entities of interest; the ability to filter redundant information about the same event; and the ability to detect inconsistent versions of events.

Approved for Public Release: 06-0209

Presentation [PDF]


Content Extraction and Duplicate Analysis and Recognition (CEDAR)

Susan Lubar, Principal Investigator

Problems:
Duplicate documents cause reduced productivity and incomplete analysis by inflating datasets without adding new information. On the Web, researchers estimate that about 30 percent of documents are duplicates. Because of the presence of extraneous content such as advertisements, navigation bars, lists of URLs, and formatting information, there is currently no accurate method to identify documents that contain duplicate text.

Objectives:
Our goal is to develop an effective system for detecting duplicate documents. We will create a process to identify a document's "core content": the parts of the pages that are of interest to analysts. On the basis of the core content, these documents will then be analyzed to determine the presence of duplicates.

Activities:
We will create a dataset containing hand-tagged duplicate pages to use as a gold standard for system evaluation. We will then define criteria to be varied as system input to determine what should be considered a duplicate document, develop a system for identifying core content, and detect duplicates. We will evaluate system performance by measuring precision and recall.
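
A minimal sketch of the evaluation loop described above, assuming core content has already been extracted. Word-shingle Jaccard similarity stands in for the duplicate detector; it is a common technique chosen for illustration, not necessarily the one CEDAR uses. The threshold, document texts, and function names are hypothetical.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def find_duplicates(core_texts, threshold=0.8):
    """Return the set of (id1, id2) pairs whose core content meets the threshold."""
    ids = sorted(core_texts)
    sets = {doc_id: shingles(core_texts[doc_id]) for doc_id in ids}
    pairs = set()
    for n, first in enumerate(ids):
        for second in ids[n + 1:]:
            if jaccard(sets[first], sets[second]) >= threshold:
                pairs.add((first, second))
    return pairs

def precision_recall(predicted, gold):
    """Precision and recall of predicted pairs against hand-tagged gold pairs."""
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical core content for three pages; "a" and "b" differ only in punctuation.
docs = {
    "a": "Officials confirmed the bridge will reopen to traffic on Monday morning.",
    "b": "Officials confirmed the bridge will reopen to traffic on Monday morning",
    "c": "The city council voted to delay the new stadium budget until next year.",
}
gold = {("a", "b")}  # hand-tagged duplicate pairs (the gold standard)
predicted = find_duplicates(docs, threshold=0.8)
print(precision_recall(predicted, gold))
```

Varying `threshold` is one way to realize the "criteria to be varied as system input": a lower value trades precision for recall.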

Impact:
Our research will enable our sponsors to significantly improve their ability to analyze information from document collections. Detecting duplicates will improve data quality, allowing analysts to achieve more accurate analytical results in a shorter timeframe.

Approved for Public Release: 06-1423

Presentation [PDF]


Integrated Ops-Intel Team Sensemaking for Non-Traditional Warfare

Frank Stech, Principal Investigator

Problems:
Intelligence groups have had notorious failures, for example, 9/11's "failure of imagination" and the WMD "groupthink." Intelligence sharing, the complexity of problems, and operational needs mandate more analysis by groups comprising both intelligence and operational personnel. Intelligence reform and design of operational-intelligence teams are largely uninformed by the cognitive sociology of high-reliability organizations or organizational disasters.

Objectives:
We will apply High Reliability Theory and Normal Accident Theory to develop Cognitive Team Analysis (CTA) to identify factors associated with team sensemaking success and failures. We will develop CTA assessment measures for team sensemaking to leverage existing frameworks and assessment methods, develop field measures for team success, and adapt change management practices to re-design sensemaking environments and enterprises.

Activities:
We will define success measures for team sensemaking, determine sensemaking success factors, derive observational measures, assess both "captive" and "wild" teams, derive and apply change management steps to real teams, and assess sensemaking success. We will publish our results on "Sensemaking Metrics and Methods for Operations-Intelligence Teams," "Predicting Operations-Intelligence Success and Failure with Sensemaking Metrics," and "Changing Operations-Intelligence Failure to Success with Sensemaking Redesigns."

Impact:
This research develops a sensemaking engineering capability that links social-cognitive theory to MITRE intellectual capital in user interface design, cognitive engineering, team collaboration, and team decision-making. The framework and tools support re-design of sensemaking environments and are applicable to high-uncertainty environments of interest to many MITRE clients, e.g., operations and intelligence, air traffic control, medical surveillance, financial surveillance, and forensics.

Approved for Public Release: 06-1134


Live Hotspotting of VoIP

Qian Hu, Principal Investigator

Problems:
Internet Protocol (IP)-based telephony is a significant source of time-sensitive information. Exploiting it requires a system that can rapidly filter and prioritize multiple live VoIP streams. Current systems depend on post-processing of captured streams and use audio information alone. They also require humans to monitor live audio signals for keywords or significant phrases. This significantly delays detection and dissemination of time-sensitive information.

Objectives:
This project will explore the capability to automatically identify areas of interest from multiple live VoIP streams and provide near real-time alerts and warnings for analysis and decision support.

Activities:
The project will research and experiment with live processing capability using multiple speech recognition engines on multiple VoIP audio streams. We will experiment with integrating Audio Hot Spotting functionality on analysis platforms. We will research algorithms to fuse pertinent information from VoIP, text, and IP traffic to provide more comprehensive information.

Impact:
The capability to automatically find areas of interest from multiple IP streams in near real-time will help humans rapidly filter and prioritize multiple live VoIP streams. It will provide analysts and decision makers with critical and time-sensitive information from VoIP.

Approved for Public Release: 07-1491


Machine Translation for Foreign Language Science and Technology Analysis

John Burger, Principal Investigator

Problems:
Foreign language science and technology analysis requires several valuable sets of skills, making it a high-value target for any force-multiplier technology. The most commonly used technology is machine translation, but it is unclear whether recent dramatic improvements in newswire translation will apply equally well to other genres and domains. Anecdotal feedback indicates that MT in S&T settings is, at best, an 80 percent solution.

Objectives:
We will characterize MT's strengths and weaknesses in the S&T linguistic domain through a rigorous, categorical feasibility study. We will identify dimensions and thresholds required in translation improvement, likely integration points into the S&T analysis workflow, and any serendipitously positive interactions between MT and S&T. We will then experiment with several ways to improve MT in scientific domains.

Activities:
After acquiring parallel foreign-language/English text in several scientific domains and languages, we will train the standard open-source MT toolkits on this material and evaluate the results. Guided by ongoing discussions with analysts and translators, we will perform a series of experiments to improve this baseline MT using S&T-specific language models and translation models, and by leveraging syntax and document structure.

Impact:
This work can directly impact S&T analysts through greater access to source material, improving at least coarse understanding of foreign research and thus providing better document triage and more efficient assessment. Ancillary benefits could include improved cross-linguistic search, using automatically acquired bilingual dictionaries, and improved human translation, using structured translation lexicons. There is wider relevance to any application areas where better access to foreign scientific material is desired.

Approved for Public Release: 07-1490

Presentation [PDF]


Modeling Phase Change Behavior

Lashon Booker, Principal Investigator

Problems:
We hypothesize that the social group is an external representation of a subset of human behavior that serves to simplify human decision-making by reliance on group influences. This project aims to better understand the dynamics of groups such as "leaderless resistance groups," which are not organizations as much as ideologies that depend on external communications such as the Internet.

Objectives:
We will test a framework for modeling social group formation, recruitment, adaptation of belief upon recruitment, group competition, and group utilization of communications technology to further group objectives. Testing will start with a potentially simple domain such as the formation of an "invisible college" in scientific publication patterns and expand to resistance group recruitment.

Activities:
The initial domain will be similar to group recruitment, but much simpler in terms of data collection and extraction. Data on group formation and use of electronic communications media for recruitment purposes will be acquired from other sources. The modeling framework will seek to replicate various known aspects of recruitment. Modeling results may give insight into intervention strategies.

Impact:
U.S. agencies are showing increased interest in modeling of complex systems, taking a more quantitative approach to social and behavioral research. There is potential to move beyond entity-relationship models for data representation. In the war on terrorism, the target of intelligence has changed in ways that make entity-relationship models less applicable.

Approved for Public Release: 05-1219

Presentation [PDF]


Rapid Trusted Video Stream Dissemination

Mark Workman, Principal Investigator

Presentation [PDF]


Scientometric Analyses for Science & Technology Intelligence (S&TI)

Frank Stech, Principal Investigator

Problems:
The foundations of science and technology (S&T) threats and weaponization are in open-source S&T literature. Scientometric methods support and extend the monitoring of these S&T domains. Scientometric methods are neither well-known nor frequently used analytically by the Intelligence Community, despite their demonstrated utility for understanding and assessing science and technology.

Objectives:
Our objectives are to: predict trajectories of key S&T undertaken by nations of interest; analyze and data mine open source scientific publications and patents to monitor and measure science career patterns, S&T trajectories, and evolution of key S&T capabilities; and identify indicators suggesting denial and deception of key S&T activities and trajectories.

Activities:
In the first phase, we will focus on establishing our "Scientometrics Lab" capability, validating tools and methods, investigating science career trends, and identifying key S&T trends. In the second phase, we will transition methods from "tame" domains to S&T interest areas to investigate scientific organizations, future trends, and applications. Finally, we will investigate S&T trends at various levels at the frontiers of science.

Impact:
This research extends MITRE expertise and capabilities in scientometrics, IC exploitation of open source literature, and counter-deception. It impacts IC S&TI methodologies, adds new data and methods, augments multi-layer analysis methodology, and combines scientometrics with other methodologies, such as Analysis of Competing Hypotheses. Other MITRE customers will benefit as well, including those in health sciences, civil aviation systems, and homeland security.

Approved for Public Release: 08-0104


Social Network Services for Intelligence Community Professionals

Tom Bartee, Principal Investigator

Problems:
One of the historical problem areas within the IC is the segregation of individuals into narrow communities based on agencies and organizations within agencies. Analysts often focus on information specific to a small number of intelligence sources and have somewhat limited collaboration with analysts working in similar geographic areas and on related issues from other parts of the community.

Objectives:
This project will develop prototype software and methods to help IC analysts identify and manage potential contacts in other organizations that may be of interest to them. Using collaborative filtering techniques often used in recommender systems, we will discover implicit social networks and alert analysts in real-time to the existence of other analysts working in similar problem areas.

Activities:
We will install and configure a OneCommunity Social Networking platform on our customer networks using an available open source tool and perform observational field studies. In parallel with platform configuration and user training, we will develop the collaborative filtering plug-in for generating contact recommendations. We will incorporate this component into the installed platform and perform additional field studies.
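
The contact-recommendation idea can be sketched as user-to-user collaborative filtering: analysts whose activity profiles are most similar are suggested to one another. The activity data, the cosine-similarity choice, the score cutoff, and the function names are assumptions for illustration; the plug-in's actual design is not described here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend_contacts(activity, analyst, top_n=2, min_score=0.1):
    """Rank other analysts by how similar their activity is to `analyst`'s."""
    target = activity[analyst]
    scored = [
        (other, cosine(target, vector))
        for other, vector in activity.items()
        if other != analyst
    ]
    scored = [(other, score) for other, score in scored if score >= min_score]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]

# Hypothetical activity counts, e.g. documents tagged or read per subject.
activity = {
    "analyst_a": {"region_x": 5, "topic_energy": 3},
    "analyst_b": {"region_x": 4, "topic_energy": 2, "topic_trade": 1},
    "analyst_c": {"topic_health": 6},
    "analyst_d": {"region_x": 1, "topic_trade": 5},
}
recommendations = recommend_contacts(activity, "analyst_a")
print([name for name, _ in recommendations])
```

Here `analyst_b` ranks first for `analyst_a` because both concentrate on the same region and topic, while `analyst_c` shares nothing and is filtered out by `min_score`.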

Impact:
An application that connects users working on similar areas and issues in real time has the potential to significantly improve the efficiency and effectiveness of analysts. Additionally, this application may help analysts gain a greater awareness of other elements of the community and establish standing relationships with other individuals and groups within the community.

Approved for Public Release: 08-0003

Presentation [PDF]


Spatio-Temporal Information Extraction and Reasoning from Natural Language

Inderjeet Mani, Principal Investigator

Problems:
In many customer problems, a large fraction of important events are given vague relative spatial and temporal characterizations that are ignored by today's systems. These systems reason very little, if at all, about space and time, requiring considerable interpolation and extrapolation by the analyst. Finally, current approaches make it difficult to incorporate end-user preferences.

Objectives:
We will develop information extraction and reasoning algorithms within a machine learning framework that will address the spatio-temporal location of events in text. We will develop the SpatialML annotation scheme to produce annotated corpora for training systems to integrate learning with reasoning, use active learning to ask user questions about particular examples, and measure system performance on a particular application.

Activities:
In Year 1, we will develop the annotation framework, including a spatial expression normalizer evaluated in an operational setting, and integrate reasoning and learning. In Year 2, we will test active learning with analysts, develop a location labeler, and initiate a challenge task for the community. In Year 3, we will transition the system and port it to Mandarin Chinese.
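
Active learning, as planned for Year 2, can be illustrated with uncertainty sampling: the system asks the analyst about the examples it is least sure of. This is one standard strategy, shown under the assumption that a classifier already assigns each candidate phrase a probability of being a spatial expression; the phrases, probabilities, and function name are hypothetical.

```python
def select_queries(predictions, budget=2):
    """Pick the examples whose predicted probability is closest to 0.5.

    `predictions` is a list of (example, probability) pairs; ties are
    broken alphabetically so the selection is deterministic.
    """
    ranked = sorted(predictions, key=lambda item: (abs(item[1] - 0.5), item[0]))
    return [example for example, _ in ranked[:budget]]

# Hypothetical model outputs: probability that each phrase is a spatial expression.
predictions = [
    ("near the river", 0.52),
    ("in Boston", 0.97),
    ("north of the border", 0.40),
    ("yesterday", 0.05),
]
queries = select_queries(predictions)
print(queries)
```

The two phrases selected are the ones nearest the decision boundary; confident cases such as "in Boston" are left unlabeled, which is how the approach limits the number of questions put to the user.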

Impact:
This retargetable approach will allow for common solutions across many domains. The key problems of spatial and temporal representation from language will be solved. Particular problems in integration of learning and reasoning will be solved for spatio-temporal domains. Richer structured spatio-temporal data will become available for data mining and visualization.

Approved for Public Release: 06-1442

Presentation [PDF]


Last Updated: 05/05/2008

Copyright © 1997-2013, The MITRE Corporation. All rights reserved.
MITRE is a registered trademark of The MITRE Corporation.
Material on this site may be copied and distributed with permission only.
