Mining a Large-Scale Term-Concept
Network from Wikipedia
October 2006
Andrew Gregorowicz, The MITRE Corporation
Mark A. Kramer, The MITRE Corporation
ABSTRACT
Social tagging and information retrieval are challenged by the fact
that the same item or idea can be expressed by different terms or words.
To counteract the problem of variable terminology, researchers have proposed
concept-based information retrieval. To date, however, most concept
spaces have been either manually produced taxonomies or special-purpose ontologies,
too small for classifying arbitrary resources. To create a large set
of concepts, and to facilitate term-to-concept mapping, we mine a network
of concepts and terms from Wikipedia. Our algorithm yields a robust,
extensible term-concept network for tagging and information retrieval,
containing over 2,000,000 concepts with mappings to over 3,000,000 unique
terms.

Additional Search Keywords
Information retrieval, concept search, Wikipedia, text mining