About Us Our Work Employment News & Events
MITRE Remote Access for MITRE Staff and Partners Site Map
Our Work

Follow Us:

Visit MITRE on Facebook
Visit MITRE on Twitter
Visit MITRE on Linkedin
Visit MITRE on YouTube
View MITRE's RSS Feeds
View MITRE's Mobile Apps
Home > Our Work > Technical Papers >

Leveraging Machine Readable Dictionaries in Discriminative Sequence Models

June 2006

Ben Wellner, The MITRE Corporation
Marc Vilain, The MITRE Corporation

ABSTRACT

Many natural language processing tasks make use of a lexicon—typically the words collected from some annotated training data along with their associated properties. We demonstrate here the utility of corpora-independent lexicons derived from machine readable dictionaries. Lexical information is encoded in the form of features in a Conditional Random Field tagger providing improved performance in cases where: i) limited training data is made available ii) the data is case-less and iii) the test data genre or domain is different than that of the training data. We show substantial error reductions, especially on unknown words, for the tasks of part-of-speech tagging and shallow parsing, achieving up to 20% error reduction on Penn TreeBank part-of-speech tagging and up to a 15.7% error reduction for shallow parsing using the CoNLL 2000 data. Our results here point towards a simple, but effective methodology for increasing the adaptability of text processing systems by training models with annotated data in one genre augmented with general lexical information or lexical information pertinent to the target genre (or domain).

View/Download Document

Additional Search Keywords

N/A

 

Page last updated: July 19, 2006   |   Top of page

Homeland Security Center Center for Enterprise Modernization Command, Control, Communications and Intelligence Center Center for Advanced Aviation System Development

 
 
 

Solutions That Make a Difference.®
Copyright © 1997-2013, The MITRE Corporation. All rights reserved.
MITRE is a registered trademark of The MITRE Corporation.
Material on this site may be copied and distributed with permission only.

IDG's Computerworld Names MITRE a "Best Place to Work in IT" for Eighth Straight Year The Boston Globe Ranks MITRE Number 6 Top Place to Work Fast Company Names MITRE One of the "World's 50 Most Innovative Companies"
 

Privacy Policy | Contact Us