I. Introduction
Optical images are a valuable source of information for scene classification (semantic labeling) and object detection. Such data, however, cannot effectively differentiate objects composed of the same material (i.e., objects with the same spectral characteristics). For example, roofs and roads made of the same material exhibit the same spectral characteristics, which makes discriminating between these categories a laborious task using optical data alone. Conversely, elevation data [e.g., LiDAR and digital surface models (DSMs)] provide rich height information but cannot differentiate between objects with the same elevation that are made of different materials (e.g., roofs with the same elevation made of concrete or asphalt).