The figure shows the proposed system architecture: IDMP takes the depth sensor's data and pose as input. These inputs are used to generate the local Frustum Field, which determines the implicit semantics of the scene and is then used to fuse the new observation into the global GPDF. Our RMP policy queries distance and gradient information from the fused global GPDF to generate accelerations, which are passed to a controller for execution on the robot.
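To illustrate how a reactive policy can turn distance and gradient queries into accelerations, here is a minimal repulsive-policy sketch. This is an assumption for illustration only, not the paper's actual RMP formulation: the function name, gain parameters `alpha` and `beta`, and the inverse-square repulsion are all hypothetical.

```python
import numpy as np

def repulsive_acceleration(dist, grad, vel, alpha=1.0, beta=0.5, eps=1e-6):
    """Sketch of a distance-field-driven repulsive policy (illustrative only).

    dist: queried distance to the nearest surface (from the global GPDF)
    grad: queried distance-field gradient at the robot's position
    vel:  current robot velocity, used for damping
    """
    grad = np.asarray(grad, dtype=float)
    # Normalise the gradient so it is a unit direction away from the surface.
    direction = grad / (np.linalg.norm(grad) + eps)
    # Repulsion grows as the distance shrinks; damping discourages fast approach.
    return alpha / (dist**2 + eps) * direction - beta * np.asarray(vel, dtype=float)
```

A controller would integrate such accelerations (typically combined with a goal-attractor term) into velocity commands for the robot.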
A key aspect of the IDMP framework is that it uses a Frustum Field to fuse observations and identify dynamic regions locally before passing the information to the Fused Field, which contains the global information.
The following figures show the internal update process of IDMP. The background displays the distance field within the sensor's field of view, generated by the frustum GPDF. While the fused GPDF is trained on all points from the internal global map, the frustum GPDF uses only the latest observations and therefore captures changes in the scene. By querying the frustum GPDF at the fused GPDF's training points, we can directly retrieve implicit semantics based on distance metrics.
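The query step above can be sketched as follows. As a stand-in for the frustum GPDF, this sketch approximates the distance field with a brute-force nearest-neighbour query against the latest observation; the actual method regresses a Gaussian-process distance field, so the function below is an assumption for illustration.

```python
import numpy as np

def query_distance_field(latest_points, query_points):
    """Approximate distance-field query (illustrative stand-in for a GPDF).

    latest_points: (M, 3) array of points from the latest observation
    query_points:  (N, 3) array of fused-GPDF training points to evaluate

    Returns the Euclidean distance from each query point to its closest
    observed point, mimicking a query of the frustum distance field.
    """
    diff = query_points[:, None, :] - latest_points[None, :, :]
    return np.linalg.norm(diff, axis=-1).min(axis=1)
```

Querying the frustum field at the fused field's training points yields, for each stored point, a distance that indicates how well it agrees with the latest observation.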
Training points in the fused GPDF are classified as static if their queried distance in the frustum GPDF is below the sensor noise threshold, indicating that the object has not moved. They are classified as dynamic when this distance exceeds the threshold, indicating that the object has moved. For the final case, we query the newly observed sensor points against the fused GPDF: points with distances greater than a threshold are classified as new and are fused into the global GPDF.
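The classification rules above can be sketched directly. The threshold values and function names here are hypothetical placeholders; the paper's excerpt only specifies that the static/dynamic split uses the sensor noise threshold and that new points are detected with a separate distance threshold.

```python
import numpy as np

# Hypothetical threshold values (metres); the actual values depend on the sensor.
NOISE_THRESH = 0.02  # assumed sensor noise threshold for static vs dynamic
NEW_THRESH = 0.05    # assumed threshold for declaring a sensor point "new"

def classify_fused_points(frustum_distances):
    """Label fused-GPDF training points from their queried frustum-GPDF distance.

    Below the noise threshold -> "static" (object has not moved);
    above it                  -> "dynamic" (object has moved).
    """
    d = np.asarray(frustum_distances)
    return np.where(d < NOISE_THRESH, "static", "dynamic")

def find_new_points(fused_distances):
    """Mark sensor points as new when their distance in the fused GPDF
    exceeds the threshold; these are then fused into the global GPDF."""
    return np.asarray(fused_distances) > NEW_THRESH
```

In the full pipeline, static points are kept, dynamic points trigger updates to the affected regions, and new points are fused into the global field.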