Novelty Discovery with Heterogeneous Features
July 2010
Ken Samuel, The MITRE Corporation
Peter Mork, The MITRE Corporation
Adriane Chapman, The MITRE Corporation
David Moore, The MITRE Corporation
Irina Vayndiner, The MITRE Corporation
Erik Sax, The MITRE Corporation
ABSTRACT
This paper presents experiments with Cross-Feature Analysis, a machine learning method for novelty discovery that readily accommodates heterogeneous features. Our domain is database security, where the goal is to detect both attacks similar to those seen in the past and completely novel attacks that have not yet been seen. The training data consists of database logs that contain no attacks, so supervised machine learning methods cannot be applied directly, and unsupervised machine learning methods are unsatisfactory because we have a variety of feature types, including numerical, categorical, and set-valued features. Cross-Feature Analysis, however, transforms our novelty discovery problem into multiple supervised machine learning problems, building one submodel for each feature by treating that feature as the class. New instances are then analyzed by the submodels to determine whether they are consistent (legitimate) or anomalous (suspicious). In our experiments we found that, by setting a limit on the number of submodels that may reject an instance, our system can distinguish legitimate instances from attacks with perfect (100%) recall of real attacks and a specificity of 99.9% on legitimate instances for one data set, and with a recall of 97.2% and a specificity of 99.9% on another data set.
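
The idea of building one submodel per feature and limiting the number of rejections can be illustrated with a minimal sketch. It is not the implementation used in the paper. It assumes purely categorical features and uses a simple lookup table as each submodel, where a real system would use a trained classifier or regressor suited to each feature type. The function names, the `limit` parameter, and the sample log rows are all hypothetical.

```python
from collections import Counter, defaultdict


def train_submodels(rows):
    """Build one submodel per feature, treating that feature as the class.

    Assumption: each submodel is a lookup table from the values of the
    remaining features to the class values seen with them in training.
    """
    n_features = len(rows[0])
    submodels = []
    for target in range(n_features):
        table = defaultdict(Counter)
        for row in rows:
            context = row[:target] + row[target + 1:]
            table[context][row[target]] += 1
        submodels.append(dict(table))
    return submodels


def count_rejections(submodels, row):
    """Count the submodels whose prediction is inconsistent with the row."""
    rejections = 0
    for target, table in enumerate(submodels):
        context = row[:target] + row[target + 1:]
        seen = table.get(context)
        # Assumption: an unseen context, or an unseen class value for a
        # known context, counts as a rejection by this submodel.
        if seen is None or row[target] not in seen:
            rejections += 1
    return rejections


def is_suspicious(submodels, row, limit):
    """Flag the row when more than `limit` submodels reject it."""
    return count_rejections(submodels, row) > limit


# Hypothetical attack-free training log: (user, table, action).
legit_log = [
    ("alice", "orders", "select"),
    ("alice", "orders", "insert"),
    ("bob", "payroll", "select"),
    ("bob", "payroll", "update"),
]
submodels = train_submodels(legit_log)
```

Under these assumptions, `is_suspicious(submodels, ("alice", "payroll", "update"), limit=1)` flags the row, because no submodel finds it consistent with the training log, while any row that appears in the training log is accepted. Raising `limit` trades recall for specificity.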
