
Advanced Image Retrieval for Neuroinformatics

Monica Carley-Spencer

How do you search datasets for images that are hard to describe in words? For example, how would you quickly search a database of brain scans to look for examples of unusually shaped corpus callosa in the brains of female schizophrenics? MITRE is applying Content-Based Image Retrieval (CBIR), which includes "query-by-example," to the problem. Our goal is to give neuroscientists a more precise search mechanism, which will cut down on the number of images they have to look through to find what they need.

With the query-by-example function, you can request all images that are visually similar to your example. The system compares features intrinsic to the images (i.e., image content) rather than textual or other descriptors, which is how a standard database query works (for example, "find images in the dataset that were collected at location x on January 1"). Query-by-example functionality can provide researchers and image analysts with a powerful tool for finding images that have hard-to-describe details, discovering patterns across images, and testing hypotheses.

For example, suppose a clinical researcher notices that a female schizophrenic patient's magnetic resonance image (MRI) shows an unusual bend in the corpus callosum, the band of fibers connecting the hemispheres. It could be a structural defect correlated with the disease, but data from other subjects, both schizophrenics and matched controls, would be needed to test this hypothesis. Before undertaking an expensive and time-consuming data collection effort, the researcher could take advantage of data already collected and shared by other labs. Using query by example, the researcher would present an image of the patient's brain with the region of interest indicated, and the system would rapidly retrieve from a database any subjects with a similar bend in the corpus callosum. Then, inspection of the metadata for the retrieved subjects, including gender and disease state, would reveal the percentage of those subjects who are also female and schizophrenic. A percentage significantly higher than the percentage of female schizophrenics among all scans in the database would indicate that the hypothesis is worth pursuing and would justify the expense of additional data collection.
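The comparison of percentages described above can be sketched in a few lines of Python. This is only an illustration: all counts are hypothetical, and the one-sided binomial tail is an assumed choice of test (the article does not name one; because retrieved scans are drawn from a finite database, a hypergeometric test would be a closer model).

```python
from math import comb

def binomial_tail(k, n, p):
    """Probability of k or more successes in n trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical counts, for illustration only (not from the article).
database_size = 1000        # scans in the shared database
female_schizophrenic = 100  # of those, scans from female schizophrenic subjects
retrieved = 20              # scans returned by the query-by-example search
retrieved_matching = 8      # of those, scans from female schizophrenic subjects

base_rate = female_schizophrenic / database_size
observed_rate = retrieved_matching / retrieved

# Chance of seeing at least this many matches if the bend were unrelated
# to gender and disease state (binomial approximation).
p_value = binomial_tail(retrieved_matching, retrieved, base_rate)

print(f"base rate {base_rate:.0%}, observed {observed_rate:.0%}, p = {p_value:.4g}")
```

A small p-value here would only suggest that the hypothesis deserves a properly designed study; it would not confirm it.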

This work is part of the larger MITRE Neuroinformatics Project, which is funded by both MITRE and the National Institute of Mental Health through the Human Brain Project. The Neuroinformatics Project addresses the needs of neuroscientists, and in particular the brain mapping community, for better data management, secure data sharing, and advanced exploration tools. The CBIR tool we are building fits into a larger open source system, called NeuroServ, that MITRE is developing to meet these needs.

These systems and tools, including CBIR, also have broader applicability for bioinformatics databases of many types and are very much needed, as recognized by the National Library of Medicine in a 2002 report*:

"The ever-increasing volume of medical images, the economic impracticality of manually indexing these images, and the inadequacy of human language alone to describe image contents that are visually recognizable and medically significant, such as shape and geometry, color, texture of objects within images, all provide impetus for research and development toward practical Content-Based Image Retrieval systems that could become a standard offering of the medical library of the future."

Research Challenges

One of the most challenging parts of developing CBIR is extracting the salient features of images that can be used to characterize edges, textures, and contours of interest. Many of the early CBIR systems were developed to query over archived photographs and video, with histograms of pixel colors/intensities as the features used for comparison. While this kind of feature alone may work well with images that vary significantly in content, such as different scenes, it is insufficient for MRIs and other medical imagery. The histograms would all be too similar to provide discrimination.
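To illustrate why histograms alone discriminate poorly, here is a minimal sketch of an intensity-histogram comparison. The bin count, the toy pixel lists, and the use of histogram intersection as the similarity measure are assumptions made for illustration; the article does not say which measure the early systems used.

```python
def intensity_histogram(pixels, bins=8, max_value=255):
    """Normalized histogram of grayscale intensities (flat list of ints)."""
    counts = [0] * bins
    for value in pixels:
        index = min(value * bins // (max_value + 1), bins - 1)
        counts[index] += 1
    total = len(pixels)
    return [count / total for count in counts]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Two toy "images" containing the same pixel values in a different arrangement.
image_a = [0, 0, 255, 255, 128, 128, 64, 64]
image_b = list(reversed(image_a))

similarity = histogram_intersection(
    intensity_histogram(image_a), intensity_histogram(image_b)
)
print(similarity)
```

Because a histogram discards spatial arrangement, the two toy images score as identical even though their layouts differ, which is the same weakness that makes histograms of brain MRIs hard to tell apart.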

Courtesy of the Laboratory of Neuro Imaging, UCLA

It is difficult to search for images that are hard to describe in words. How do you distinguish among thousands of brain scans in databases across the country? MITRE is developing a precise search mechanism for neuroscientists.


For several years we have been working with members of the Human Brain Project, including the International Consortium for Brain Mapping, who provide us with MRIs of the brain and their expertise. Using these MRIs, we've identified and developed a preliminary set of features for quantifying similarity among images. The features can be broadly categorized as (1) region- and contour-based shape features to characterize segmented structures and (2) statistical metrics computed over images as a whole or over subregions of images.

Region-based features include invariant statistical moments that allow for comparison of structures independent of orientation. Contour-based features include geometric measures of local curvature and spectral-based features that quantify smoothness. The second category, statistical metrics, includes metrics originally developed to assess quality in image coding (where an image is compared before compression and after decompression); these can be just as applicable to measuring similarity among different images for CBIR. Included in this category are perceptually motivated metrics that attempt to measure image content by heavily weighting those features that the human visual system tends to use in identification. One such metric is the structural similarity metric, which captures underlying structure, independent of variations in contrast and illumination across multiple images of the same objects.
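As a rough sketch of a structural similarity computation, the following assumes the metric meant is the widely used SSIM index and evaluates it once over the whole image rather than over sliding windows. The constants 0.01 and 0.03 are the conventional defaults for that index, and the pixel lists are made up; none of this is taken from the article.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM between two equal-length flat lists of intensities."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

    # Small constants that stabilize the ratio when means or variances are near zero.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2

    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

original = [10, 50, 90, 130, 170, 210]
brighter = [value + 20 for value in original]   # same structure, shifted intensity
scrambled = list(reversed(original))            # same values, different structure

print(global_ssim(original, brighter), global_ssim(original, scrambled))
```

In this toy case the uniformly brightened copy scores much closer to the original than the rearranged copy does, which is the behavior the paragraph above describes.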

Using both synthetic and recorded MRIs provided by our collaborators, we are testing these features and metrics for their efficacy. In the case of matching the contour of a brain structure (e.g., the hippocampus or the corpus callosum), features that are invariant to translation and rotation are proving to be the most useful because of the variability in orientation of anatomical regions within the brain across subjects. Even when a set of images is aligned in x,y,z space, the relative position of a region can vary significantly from subject to subject. For measuring similarity over whole images or arbitrary regions (i.e., not a segmented brain structure), our preliminary results successfully demonstrate the robustness of the selected set of similarity metrics to common distortions in MRIs, including intensity nonuniformity due to radio frequency effects and additive noise.
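A small example may make translation and rotation invariance concrete. The sketch below computes the first Hu moment invariant of a binary region mask; choosing this particular moment, and the toy L-shaped mask, are assumptions for illustration, since the article does not list the specific moments in its feature set. On a pixel grid the invariance is exact only for translations and 90-degree rotations, which is what the example checks.

```python
def first_hu_moment(mask):
    """phi_1 = eta_20 + eta_02 for a binary mask given as a list of rows."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    area = len(points)                       # zeroth moment of a binary region
    centroid_row = sum(r for r, _ in points) / area
    centroid_col = sum(c for _, c in points) / area
    mu20 = sum((c - centroid_col) ** 2 for _, c in points)
    mu02 = sum((r - centroid_row) ** 2 for r, _ in points)
    # Dividing second-order central moments by area^2 normalizes for scale.
    return (mu20 + mu02) / area ** 2

def rotate90(mask):
    """Rotate the mask 90 degrees clockwise."""
    return [list(row) for row in zip(*mask[::-1])]

def translate(mask, down, right):
    """Shift the region by padding with empty rows and columns."""
    width = len(mask[0]) + right
    return [[0] * width for _ in range(down)] + [[0] * right + list(row) for row in mask]

l_shape = [
    [1, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
block = [
    [1, 1, 1],
    [1, 1, 1],
]
moved = translate(rotate90(l_shape), down=3, right=5)

print(first_hu_moment(l_shape), first_hu_moment(moved), first_hu_moment(block))
```

The rotated and shifted L-shape yields the same value as the original, while the differently shaped block does not, so the feature responds to shape rather than to position or orientation.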

Another challenge in developing CBIR for neuroimagery, in particular, is the lack of "ground truth." Before features are extracted, images that are to be compared must first be registered to the same image. For example, in satellite imagery everything can be registered to what is literally ground truth—the earth. MRIs of the human brain have no absolute map to guide the registration, so typically a single brain MRI that is considered to be a good representative, or an average image, is used as the template.

Testing the Prototype

We have created a prototype of our tool, with query-by-example and image quality screening capabilities, which our collaborators will test in the near future. Performance of a CBIR system is typically evaluated in terms of two statistics: precision and recall. Precision is the percentage of retrieved images that are actually similar, and recall is the percentage of similar images in the entire dataset that are retrieved. Determining precision and recall is tricky because computing them with certainty would require descriptive information about the images that is unavailable; if it were available, a CBIR query would be unnecessary. Feedback from the user is therefore important in assessing and refining the query results.
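The two statistics defined above can be computed as follows when a set of truly similar images is known, for instance from user feedback on a labeled test set. The image identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) as fractions for one query.

    retrieved: ids returned by the query
    relevant:  ids judged to be truly similar to the example image
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical ids: four scans retrieved, three scans judged similar overall.
precision, recall = precision_recall(
    retrieved=["scan_a", "scan_b", "scan_c", "scan_d"],
    relevant=["scan_a", "scan_b", "scan_e"],
)
print(f"precision {precision:.0%}, recall {recall:.0%}")
```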

Refinement can be an iterative process in which the constraints are modified and the query re-issued. Alternatively, particularly if the set of retrieved images is very large, refinement can simply filter the initial set, keeping fewer images based on the user's response to a sample of them.

Finally, query efficiency and scalability must be considered. High-dimensional feature sets that work well for searching a database of tens or hundreds of images may be impractical when the number of images increases by orders of magnitude. The problem can be mitigated both by feature dimensionality reduction and with indexing algorithms that obviate searching every single image.
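One very simple way to avoid comparing the query against every image is to index the images by a single reduced key and prune with it. The sketch below is an assumption-laden toy, not the NeuroServ design: it uses the first feature component as the key (a real system might derive the key from a proper dimensionality reduction such as principal component analysis and use a tree-based index), and the class name, image ids, and feature vectors are hypothetical. It relies on the fact that two vectors within Euclidean distance r of each other must also differ by at most r in any single component.

```python
from bisect import bisect_left, bisect_right
from math import dist

class SortedKeyIndex:
    """Range search that prunes candidates using one sorted feature component."""

    def __init__(self, features):
        # features: dict mapping image id -> feature vector (tuple of floats)
        self.features = features
        self.entries = sorted((vec[0], image_id) for image_id, vec in features.items())
        self.keys = [key for key, _ in self.entries]

    def within(self, query, radius):
        """Return (ids within radius of query, number of full comparisons made)."""
        lo = bisect_left(self.keys, query[0] - radius)
        hi = bisect_right(self.keys, query[0] + radius)
        candidates = [image_id for _, image_id in self.entries[lo:hi]]
        # Only the surviving candidates get a full-dimensional distance check.
        matches = [i for i in candidates if dist(self.features[i], query) <= radius]
        return matches, len(candidates)

index = SortedKeyIndex({
    "scan_a": (0.10, 0.20, 0.30),
    "scan_b": (0.15, 0.25, 0.30),
    "scan_c": (0.90, 0.10, 0.40),
    "scan_d": (0.12, 0.90, 0.90),
})
matches, compared = index.within(query=(0.10, 0.20, 0.30), radius=0.1)
print(sorted(matches), compared)
```

With only four images the saving is trivial, but the same pruning step skips a growing share of full comparisons as the database grows, provided the key spreads the images out.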

Beyond Medical Imagery

While we're applying image retrieval research to neuroimagery, our ideas could easily be adapted to other purposes, such as searching databases of surveillance images and video. An example would be an analyst who needs to retrieve an old image containing a familiar scene but does not remember any of the specifics of the image, such as where the image was collected, only that it is similar to a recent image. The analyst could use the recently viewed image as the target image and search for similar images. Or, the analyst might simply want to find any images that are similar, without knowing if any already exist in the database, to see if a particular scene follows some pattern of activity established by other intelligence information.

The need to locate an image based on appearance (scene, shape, etc.), discover a pattern by correlating features that exist across a number of images, or even test a hypothesis by correlating metadata with image features is clearly not limited to users of biomedical image databases.


*Note: From "Content-Based Image Retrieval of Biomedical Images, A Report to the Board of Scientific Counselors," September 26-27, 2002, Communications Engineering Branch, Lister Hill National Center for Biomedical Communications, U.S. National Library of Medicine.

 

 

For more information, please contact Monica Carley-Spencer using the employee directory.


Page last updated: May 24, 2005
