Spatial Partitioning

Traditional Approaches

The most common technique, by far, for organizing large quantities of 3D data has been to create object model hierarchies, including the ubiquitous scene graph. In this and related techniques, all parts of the environment are divided into objects. By grouping objects that are closer together, a hierarchy can be created and easily traversed, allowing a large area to be represented in high detail without much overhead.
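
As a rough illustration, a minimal scene-graph node might look like the sketch below (the names and structure are hypothetical, not any particular engine's API):

    from dataclasses import dataclass, field

    @dataclass
    class SceneNode:
        """A minimal scene-graph node: a bounding volume plus children.
        Real engines add transforms, meshes, textures, and so on."""
        center: tuple                        # (x, y, z) of bounding sphere
        radius: float                        # encloses node and its children
        children: list = field(default_factory=list)

    def visible_nodes(node, in_view):
        """Traverse the hierarchy, skipping whole subtrees that fail the
        view test -- the cheap culling that makes scene graphs attractive."""
        if not in_view(node.center, node.radius):
            return
        yield node
        for child in node.children:
            yield from visible_nodes(child, in_view)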

This approach, however, does not work well when different, overlapping datasets span multiple objects. For example, it is difficult to combine imagery covering a large area with the many small objects derived from sensor data.

More generally, no common technique is conducive to quickly organizing and traversing data based on the data sources and sensors as they exist naturally. This is the main reason lengthy, unintuitive processes have been created to transform geospatial data into runtime 3D databases.

A Simple Example

Let’s start with a simple, common example using three pieces of data for a small area on the terrain: a Digital Elevation Model (DEM) of the area in a global projection coordinate system, a high-resolution orthoimage from an aerial photograph in a local projection to reduce distortion, and a recent frame from a traffic camera.

Figure 1

Traditional approaches first divide the terrain into a grid and then resample the elevation and imagery data to fit that common grid. This process must be completed prior to visualization and can take significant time as the amount of data increases. It is also inefficient. For example, what resolution should be used for the tile highlighted in red? If it is to capture all the detail in the traffic camera image, it would have to contain several million pixels. But keeping tiles at that resolution for other parts of this area would waste the limited resources of the graphics engine.
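
To make the mismatch concrete, here is a back-of-the-envelope sketch; the camera and tile dimensions are assumed purely for illustration:

    # All numbers below are assumptions for illustration only.
    frame_w, frame_h = 1920, 1080        # traffic-camera frame, ~2.1 MP
    footprint_w_m = 20.0                 # ground width the frame covers
    gsd_m = footprint_w_m / frame_w      # ~0.01 m ground sample distance

    tile_m = 20.0                        # side length of the red tile
    tile_px = (tile_m / gsd_m) ** 2      # pixels needed to keep that detail
    print(f"{tile_px / 1e6:.1f} million pixels")   # -> 3.7 million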

Worse yet, this tiling process has to be recomputed every time the scene is updated with new imagery. This makes it difficult to manually add updated imagery, let alone use live video feeds.

The problem becomes even worse when we incorporate 3D models. Object model hierarchies, such as scene graphs, require that each object contain all the information needed to draw itself. This means every model contains its own textures. As we add updated imagery to the scene, not only do we have to reprocess all the overlapping terrain tiles, we also have to update the overlapping textures in every model on the terrain.

Spatial Cognition’s Technology

We solve these problems by using a fundamentally different approach that we call Metadata-based Spatial Partitioning. This proprietary approach is a game changer.

The key to solving many of the problems described above is to let each dataset remain in its native form and composite it into the scene without first merging everything into a single spatial model at ingestion. This requires the ability to perform many different types of operations on very large amounts of data.

Fortunately, advances in GPU architectures (and programming standards for these devices) now allow complex manipulation of these large datasets at rapid speeds. And as the resolution and speed of sensors continue to increase, metadata grow ever smaller relative to the data they describe. These two factors allow us to separate metadata processing from data processing, and thus quickly organize three-dimensional space based on the data we have available. We then send the larger data elements directly to massively parallel devices (such as GPUs) for real-time analysis and/or visualization.
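
As a sketch of that split, the metadata for each source might be no more than a footprint, a resolution, and a timestamp (the field names here are assumptions, not our actual schema):

    from dataclasses import dataclass

    @dataclass
    class SourceMeta:
        """Tiny metadata record for one data source -- tens of bytes,
        no matter how large the payload it describes."""
        source_id: str
        bounds: tuple      # (xmin, ymin, xmax, ymax) footprint, world coords
        resolution: float  # ground sample distance in meters
        timestamp: float   # acquisition time

    # The CPU organizes space from these records alone; the megabytes of
    # pixels and points are streamed straight to the GPU untouched.
    dem   = SourceMeta("dem",    (0, 0, 1000, 1000),   10.0, 0.0)
    ortho = SourceMeta("ortho",  (200, 200, 600, 600),  0.5, 0.0)
    cam   = SourceMeta("camera", (390, 390, 410, 402),  0.01, 5.0)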

To extend the simplified example above, let’s say we have the same three sources. Instead of pre-processing the entire dataset into tiles, we use the metadata to figure out how these pieces interact and partition the terrain into corresponding patches. Each patch knows what data are used to create it and what patches are around it. Because we don’t have to actually look at the pixels within each image, this process is very fast and can be done in real-time with massive amounts of data.
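
Building on the SourceMeta records above, a highly simplified version of that metadata-only partitioning step could look like the following (the real structure is proprietary; this only illustrates that bounds are compared, never pixels):

    from itertools import pairwise  # Python 3.10+

    def partition(sources):
        """Split the plane into rectangular patches over which the set of
        covering sources is constant, using footprint metadata alone."""
        xs = sorted({v for s in sources for v in (s.bounds[0], s.bounds[2])})
        ys = sorted({v for s in sources for v in (s.bounds[1], s.bounds[3])})
        patches = {}
        for x0, x1 in pairwise(xs):
            for y0, y1 in pairwise(ys):
                cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
                covering = tuple(
                    s.source_id for s in sources
                    if s.bounds[0] <= cx <= s.bounds[2]
                    and s.bounds[1] <= cy <= s.bounds[3])
                if covering:
                    patches[(x0, y0, x1, y1)] = covering
        return patches

Run on the three sources above, the patches under the camera footprint come back tagged ("dem", "ortho", "camera"), while the outermost patches carry only ("dem",) -- each patch knows exactly which data create it.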

Figure 2

The native data sources, along with our partitioning information, are sent directly to the GPU. The highly parallel work of reprojecting, combining elevation and imagery, and stitching together a seamless terrain is all done in real-time, every frame, on the GPU at interactive speeds. Because the source data are unaltered, the scene can be viewed at the full detail of the raw data as we zoom in.

This technique extends into three dimensions and to more complex data sources. Spatial partitioning quickly separates space into regions, each defined by the individual data elements (e.g., point slices and/or perspective imagery frames) it contains.

As data are updated or live sensors move, the spatial partitioning structure can quickly determine what regions are affected by the change and update those parts of the structure. This allows us to add many dynamic data sources while maintaining a high update rate with low latency. Sources can include imagery, video, elevation, 3D geometry from models, LIDAR, and others.
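
Continuing the sketch above, confining an update to the affected regions can be as simple as a rectangle-overlap test against the moved sensor's old and new footprints (again, an illustration rather than our actual code):

    def overlaps(a, b):
        """Axis-aligned overlap test on (xmin, ymin, xmax, ymax) rects."""
        return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

    def affected_patches(patches, old_bounds, new_bounds):
        """Only patches touching the sensor's old or new footprint need
        re-partitioning; everything else carries over to the next frame."""
        return [rect for rect in patches
                if overlaps(rect, old_bounds) or overlaps(rect, new_bounds)]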

Because of the real-time spatial partitioning step, the GPU knows exactly what data to use when drawing any surface, including the geometry, reflectivity (color), and material properties. We also know what the adjacent regions are, allowing us to stitch different regions together seamlessly. Points are rendered in a similar fashion, allowing massive amounts of LIDAR data to be processed very quickly in parallel.

Unlike scene graph techniques, this means different objects can be drawn using several pieces of source data (and vice versa – one piece of source data can cover several objects). For example, this allows a street-level camera image (or aerial image) to be used on every building in its field of view without recreating all of the individual textures on each of the buildings. It also allows us to integrate 3D point clouds with all available imagery in real-time. And because this process happens in real-time, native video-decoding hardware can be used to allow video feeds to be projected onto the scene with almost no computational overhead.
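
The math underlying that camera-to-building mapping is standard projective texturing; here is a minimal NumPy sketch, where K is the camera intrinsics matrix and R, t its pose (not Spatial Cognition's actual shader code):

    import numpy as np

    def project(world_pt, K, R, t):
        """Pinhole projection of a world point into pixel coordinates."""
        p = K @ (R @ np.asarray(world_pt, dtype=float) + t)
        if p[2] <= 0:              # behind the camera: not visible
            return None
        return p[:2] / p[2]        # (u, v) in the camera frame

    # For each visible point on a building facade, the renderer evaluates
    # this projection and samples the live frame at (u, v), instead of
    # baking a separate texture into every building model.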

