Data Mining with Semantic Features Represented as Vectors of Semantic Clusters

July 2012

Merwyn Taylor, The MITRE Corporation

ABSTRACT

Data mining with taxonomies merged with categorical data has been studied in the past, but the work has often been limited to small taxonomies. Taxonomies are used to aggregate categorical data so that patterns induced from the data can be expressed at higher levels of conceptual generality. Semantic similarity and relatedness measures can be used to aggregate categorical values for cluster-based data mining algorithms. Many aggregation techniques rely solely on hierarchical relationships to aggregate categorical values. While computationally attractive, these approaches have conceptual limitations that can lead to spurious data mining results. Alternatively, categorical data can be aggregated using hierarchical relationships together with other semantic relationships expressed in ontologies and conceptual graphs, which requires graph-based similarity/relatedness measures. Scaling these techniques to large ontologies can be computationally expensive because there is a wider search space for expressing patterns. An alternative representation of semantic data is presented that has attractive computational properties when applied to data mining. Semantic data is represented as vectors of cluster memberships. The representation supports the use of cosine similarity measures to improve the run-time performance of data mining with ontologies. The method is illustrated via examples of KMeans clustering and association rule mining.
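
The representation described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the cluster names, the membership values, and the helper names (MEMBERSHIP, cosine_similarity, most_similar) are all hypothetical, chosen only to show how categorical values encoded as cluster-membership vectors might be compared with cosine similarity instead of a graph search.

```python
import math

# Hypothetical cluster-membership vectors (an assumption for illustration).
# Each position is one semantic cluster, in the order [animal, pet, vehicle];
# each entry is an assumed degree of membership in that cluster.
MEMBERSHIP = {
    "dog": [1.0, 1.0, 0.0],
    "wolf": [1.0, 0.0, 0.0],
    "truck": [0.0, 0.0, 1.0],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def most_similar(value, membership=MEMBERSHIP):
    """Return the other categorical value whose vector is closest to `value`."""
    target = membership[value]
    candidates = [name for name in membership if name != value]
    return max(candidates, key=lambda name: cosine_similarity(target, membership[name]))


if __name__ == "__main__":
    for name in ("wolf", "truck"):
        score = cosine_similarity(MEMBERSHIP["dog"], MEMBERSHIP[name])
        print(f"dog vs {name}: {score:.3f}")
```

Under these assumed memberships, "dog" and "wolf" share the animal cluster and so score above zero, while "dog" and "truck" share no cluster and score zero. Each comparison costs time linear in the number of clusters, which is the kind of property that could make such vectors convenient for clustering algorithms like KMeans.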

Additional Search Keywords

Data mining, ontologies, taxonomies, semantics, vectors, semantic similarity, semantic vectors

 

Page last updated: September 25, 2012
