A 3D-SLAM Dunk in Urban Situation Analysis

June 2008
Military operations in city streets and building interiors are among the most dangerous activities for soldiers. It's easy for the enemy to hide behind windows high above the street or behind furnishings inside a building. Naturally, ground patrols would prefer to know what's around the corner, or what's inside a building and how it's laid out, before putting themselves in danger. Research at MITRE aims to improve this type of situation awareness by adding 3D sensors to unmanned ground vehicles (UGVs) and hovering unmanned aerial vehicles (UAVs).

Soldiers today can use camera-equipped UGVs and hovering UAVs to see video before they move down a street or enter a building, but it can be hard to fully exploit this video. As the UGV or UAV moves ahead, the video shows only what's currently in view, making it difficult for operators to judge the relative positions of objects that are not seen together. In addition, the video shows only the scene in front of the cameras, so it's difficult to tell how far a robot has moved down a street or hallway, or how many turns it has made. This lack of situation awareness, a clear sense of where the robot is and what it sees, limits the operational impact of unmanned vehicles.

A further problem with these video sensors is the "soda straw" effect: a narrow field of view that lacks the peripheral vision needed to give a good sense of location and tends to cause disorientation in cluttered environments. Wide field-of-view cameras, in contrast, produce severely distorted images when viewed on the operator's small screen.

UAVs that fly at high altitudes can compensate for the soda straw effect by using "video mosaicing," a technique in which two-dimensional (2D) frames of video are stitched together to make a map of the scene, providing the kind of situation awareness soldiers need. However, this 2D image-stitching technique doesn't work at very low altitudes or on the ground because of motion parallax, which makes objects close to the camera appear to move faster than objects farther away.

3D Virtual Mapping on the Move

What's needed is a way to provide the same situation awareness that 2D video mosaics give operators of higher-altitude UAVs, but one that works in the cluttered 3D environments of small UGVs and hovering, low-altitude UAVs. MITRE is researching a method that uses 3D sensors on unmanned vehicles to provide warfighters with real-time 3D views of potentially hostile spaces. It's called 3D Simultaneous Localization and Mapping, or 3D-SLAM.
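To make the parallax point concrete, here is a small back-of-the-envelope sketch. It is not from the MITRE project; the focal length, camera motion, and depths are assumed values, chosen only to show how differently near and far objects shift in the image when a pinhole camera moves sideways.

```python
# Minimal sketch (not project code): why motion parallax breaks 2D mosaicing
# at low altitude. For a pinhole camera with focal length f (in pixels) that
# translates sideways by t meters, a point at depth Z shifts in the image by
# roughly f * t / Z pixels, so nearby objects sweep across the frame much
# faster than distant ones and no single 2D shift can align two frames.

f_pixels = 500.0   # assumed focal length, in pixels
t_meters = 0.5     # assumed sideways camera motion between frames

for depth_m in (2.0, 10.0, 500.0):          # doorway, end of hallway, high-altitude ground
    shift = f_pixels * t_meters / depth_m   # apparent image motion, in pixels
    print(f"object at {depth_m:6.1f} m shifts {shift:6.1f} pixels")

# Output: 125 pixels at 2 m, 25 pixels at 10 m, 0.5 pixels at 500 m. From a
# high-flying UAV all depths are similar, so one 2D transform stitches the
# frames; at ground level the shifts disagree by hundreds of pixels.
```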
"The 'localization' part is how the robot determines its position in the environment, such as a hallway," says Scott Robbins, senior systems engineer and the project's principal investigator. "The 'mapping' part is the process the robot uses to figure out the shape of that hallway and present it to operators." For 3D-SLAM to work properly, the mathematical processing for the localization and the mapping must happen simultaneously. The result is a 3D representation of the robot's environment, which is "built" while the unmanned vehicle is on the move. You can look at the scene from different angles and virtually fly through it. For example, even though the video is taken from a moving robot at ground level, you can view the fly-through at any height, or even directly overhead. "You can understand the scene much better than from a conventional 2D ground-level video that you normally get from a robot," says Robbins. "You get a unified 3D video map that's useful for navigation, reconnaissance, and planning. Looking at the video of a room, for example, you can see all the objects in the room and where they are in relation to each other. You can look at a room's interior from any direction or height and more easily see the location of a threat—either a person or an object." Sensors That See in 2-1/2 Dimensions The MITRE team uses stereo vision and flash LIDAR (light detection and ranging) sensors, both of which are kinds of 2.5D (two-and-a-half dimension) sensors. Both sensor types return an image that gives, in addition to a typical color picture, a distance or "range" value for each pixel. This "range image" is called 2.5D because it can't show you what's behind something in a given volume, only the distance to the first surface. "We use SLAM to integrate these range images into a 3D model of the world as we move through an environment seeing things from the front, sides, and back," explains Robbins. The team is researching methods to use multiple sensors simultaneously for the localization and mapping problems. "Each sensor type has specific strengths and weaknesses," says Robbins. "Stereo can have longer range, but can't sense distance to a uniformly colored surface. Flash LIDAR can sense such surfaces, but its range is much shorter. By playing one mode's strengths against another's weakness, we'll be able to handle a larger variety of environments with better reliability." The 3D sensors are small and sufficiently low in weight and power to fit into small UGVs and small hovering UAVs. Part of the 3D problem is that a vehicle, especially a UAV, can move in six degrees of freedom at once—up-down, left-right, front-back, yawing left and right, rolling clockwise and counterclockwise, and pitching up and down. "Determining how you move in three dimensions at once is a much larger problem than for two dimensions," says Robbins. Finding Your Way 3D SLAM works by aligning and joining successive overlapping 3D sensor data to determine the change in position. "Each successive 3D picture is aligned with the previous pictures by a process called visual odometry," says Brigit Schroeder, senior systems engineer and the project's co-investigator. "You determine which features from image to image are stable, such as hard corners and edges. You match those points across images and a best fit is made using the last known position. The coordinates of this fit provide feedback for an updated position, and new sensor data is integrated into the evolving map of the environment."
A significant issue for UGVs and hovering UAVs is the availability of a constant Global Positioning System (GPS) signal, which may drop out or be blocked or deflected by structures, terrain, or weather. When a UAV loses GPS, it also loses its moving map, its current position, and its ability to find its way home, so it's important that the vehicle can still determine where it is and where it's going. Furthermore, "GPS only tells you where you're located," explains Robbins. "It doesn't tell you where you're pointed."

Current alternatives to GPS for small UGVs and UAVs, such as inertial sensors, are too inaccurate for positioning beyond fairly short distances. Wheel odometry for UGVs is even worse because of wheel slippage. "Instead, we use the onboard 3D sensors and the environment itself to navigate," notes Robbins. "Visual odometry uses the change in what the sensors see to determine the change in position. We're not reliant on GPS, inertial navigation, or wheel odometers." With the vehicle's position determined by the 3D sensors, the sensor data can then be integrated into the 3D map.

Mapping in Volumes

To build the 3D maps, Robbins' team uses an approach called volumetric mapping, in which the environment is represented by volume pixels (called "voxels") rather than the surface-based 3D geometry typical of modern video games. "We break up the world into a 3D grid, with each grid element, or voxel, represented as empty or filled," says Eric Hasan, senior systems engineer and the project's graphics lead. "Filled voxels contain color information. Merging in a new set of 3D sensor data is then faster and more efficient, because overlapping data can be quickly combined into a single voxel." A much-simplified sketch of this kind of voxel map appears at the end of this article.

The volumetric approach also lets the team use an optimized data structure called an oct-tree. "The oct-tree representation of our volumetric data gives further efficiencies by allowing us to use variably sized voxels. That lets us represent empty space, which is most of the volume of any given scene, with a few very large voxels, while using smaller, more detailed voxels for surfaces. This saves a great deal of memory. We've demonstrated volumetric modeling of an entire atrium using only a few hundred megabytes," explains Hasan.

The visualization system must also run fast, because it needs to display the scene to unmanned-vehicle operators while merging newly collected 3D data at the same time. The volumetric display system can currently map a 2,500-square-foot room at 1-cubic-centimeter resolution at 5 to 10 frames per second.

The Map Ahead

The SLAM team has demonstrated its localization and mapping system on a research UGV. "We still have a great deal of work to do, but our initial results are very promising," says Robbins. "We're still mapping only over fairly short distances, and we need to apply more sophisticated localization algorithms that exploit our multiple sensors. Our goal over the next year is to implement these algorithms and outfit a UAV and a military UGV with this technology to demonstrate its effectiveness outside the laboratory in a mission-oriented context."

by David A. Van Cleave
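The sketch below is a much-simplified illustration of the volumetric mapping idea Hasan describes, not MITRE's implementation. A flat hash grid stands in for the oct-tree, and the class and method names are invented for the example; it only shows the core idea that each voxel is empty or filled, that filled voxels carry color, and that overlapping sensor data collapses into a single cell.

```python
import numpy as np

VOXEL_SIZE = 0.01  # 1-cubic-centimeter resolution, as quoted in the article

class VoxelMap:
    """Toy volumetric map: a dictionary of occupied voxels keyed by grid index."""

    def __init__(self, voxel_size=VOXEL_SIZE):
        self.voxel_size = voxel_size
        self.cells = {}  # (i, j, k) -> [summed RGB color, sample count]

    def integrate(self, points, colors):
        """Merge a batch of 3D points (Nx3, meters) with their RGB colors (Nx3).
        Points that fall in the same voxel are averaged into one filled cell."""
        indices = np.floor(points / self.voxel_size).astype(int)
        for idx, color in zip(map(tuple, indices), colors):
            cell = self.cells.setdefault(idx, [np.zeros(3), 0])
            cell[0] += color   # accumulate color
            cell[1] += 1       # count samples landing in this voxel

    def occupied(self):
        """Yield (voxel center, average color) for every filled voxel."""
        for (i, j, k), (rgb_sum, n) in self.cells.items():
            center = (np.array([i, j, k]) + 0.5) * self.voxel_size
            yield center, rgb_sum / n
```

A real system would replace the dictionary with an oct-tree so that large empty regions cost almost nothing to store and voxel size can vary with the level of surface detail, which is the memory saving Hasan describes.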