Seeing Artificial Intelligence Breakthroughs with Computer Vision

February 2017
Mikel Rodriguez

Mikel Rodriguez left Paris in 2012 for MITRE after years of academic research in computer vision, a field of artificial intelligence. On his first day, he sat down to the task of teaching computers to recognize objects in aerial videos and photographs. Unlike the small number of studio photos he had worked with in his academic research, these images showed objects from different angles, at different sizes, and against different backgrounds.

"They were messy, like real life," Rodriguez says. "And there were thousands of them. I was so overwhelmed, I felt like getting back on the plane heading back to Paris. But my passion lies in working national security issues with real-world data—so I stayed."

Four years after he joined MITRE, Rodriguez now heads the seven-member Computer Vision Group. Along the way, he learned that instead of being overwhelmed by the sheer quantity, he should "embrace the volume of images."

He and his team have made major contributions in computer vision for law enforcement and national security, including a 3-D virtual reality system inspired by the search for information after the Boston Marathon bombing. What's more, his group passed a milestone in 2016—they developed a computer system that can recognize objects better than the average human being.

Fighting Crime and Protecting National Security

One of his first projects was Content-based Retrieval and Access (COBRA). As the principal investigator, he proposed the idea to MITRE's internal research program and then led the team developing it. COBRA automatically recognizes objects and events of interest within millions of frames of aerial surveillance video, such as a car approaching a secure facility late at night. After analyzing the 3-D structure of the scene, COBRA stitches frames together into a brief video summary. "For me it was a big learning experience on how we can deliver this transformational impact for the government," Rodriguez says.

Rodriguez's next major project was Holodeck. Inspired by the challenges law enforcement faced after the Boston Marathon bombing in searching through thousands of frames of cellphone, security, and ATM video footage, he proposed the idea and led the team. With Holodeck, a user puts on virtual reality headgear, steps into an immersive world, say a city street in Boston, and looks around. The user then sees small video camera icons, each positioned where a video was actually shot. When the user reaches out and touches an icon, the video plays in a window within the virtual reality world.

He explains that COBRA and Holodeck don't replace human involvement, but rather provide a first filter. COBRA has already transitioned to a government agency, and MITRE is currently making Holodeck available to our government sponsors.

An AI First—Computers Can Now Recognize Objects Better than the Average Person

Early in his career, Rodriguez realized just how challenging it would be to teach computers to recognize objects as well as humans do. Yet that's exactly the milestone his team passed in 2016 against a popular benchmark set by ImageNet, with the computer outperforming the average person at recognizing all kinds of objects. Rodriguez points out that people still read emotions better. But against the benchmark, computers were better at fine-grained classification of vehicles, animals, and other categories. Whereas an average person might recognize a picture or video of a car as "a sedan" or "a Honda," the computer would instantly identify it as a "2013 Honda Accord DX."

"I've been at this 15 years, and I once thought I might never see such breakthroughs in my career—but they are coming along fast now."

He sees opportunities to tackle myriad problems that concern our sponsors in areas such as aviation, ground transportation, national security, and healthcare.

Rodriguez takes pride in expanding computer vision’s potential for MITRE's sponsors. "What's fantastic about being at MITRE is that our sponsors know we're not trying to sell products. We have the freedom to say, 'Here's what we're really good at. And here's where we need to do more work.' It's very attractive to not always have to think about the bottom line. Many of my colleagues in industry have to think: 'How do I sell the next ad using computer vision?' But at MITRE, we do deeply creative technical work that helps protect national security."

For more information about Mikel Rodriguez and computer vision, see the stories in The Boston Globe and on WCVB-Boston.

—by Bill Eidson