3D Building Reconstruction from Monocular Remote Sensing Images

International Conference on Computer Vision (ICCV), 2021
Weijia Li (1,2,*), Lingxuan Meng (2,3,*), Jinwang Wang (2,4), Conghui He (2), Gui-Song Xia (4), Dahua Lin (1,5)
(1) The Chinese University of Hong Kong; (2) SenseTime Research; (3) University of Electronic Science and Technology of China; (4) Wuhan University; (5) Shanghai AI Laboratory

3D building reconstruction from monocular remote sensing imagery is an important research problem and an economical approach to large-scale city modeling compared with reconstruction from LiDAR data or multi-view imagery. However, several challenges, such as the partial invisibility of building footprints and facades, severe shadow effects, and the extreme variance of building height over large areas, have restricted existing monocular image-based building reconstruction studies to certain application scenes, i.e., modeling simple low-rise buildings from near-nadir images. In this study, we propose a novel 3D building reconstruction method for monocular remote sensing images that tackles the above difficulties, providing an appealing solution for more complicated scenarios. We design a multi-task building reconstruction network, named MTBR-Net, to learn the geometric properties of oblique images, the key components of a 3D building model, and their relations via four semantic-related and three offset-related tasks. The network outputs are further integrated by a prior-knowledge-based 3D model optimization method to produce the final 3D building models. Results on a public 3D reconstruction dataset and a newly released dataset demonstrate that our proposed method improves height estimation performance by over 40% and the segmentation F1-score by 2% to 4% compared with the current state of the art.
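
To illustrate why offsets are informative in oblique images, the sketch below shows one simple geometric reading: a roof outline appears displaced from its footprint, and the length of that displacement grows with building height. This is not the MTBR-Net implementation. It assumes a parallel projection with a single per-image scale factor, and the function names and the `metres_per_offset_pixel` parameter are hypothetical.

```python
import math


def footprint_from_roof(roof_polygon, offset):
    """Translate roof vertices by a roof-to-footprint offset (dx, dy) in pixels.

    Assumption: under a parallel projection the footprint has the same shape
    as the roof and is only shifted in the image plane.
    """
    dx, dy = offset
    return [(x + dx, y + dy) for x, y in roof_polygon]


def height_from_offset(offset, metres_per_offset_pixel):
    """Estimate building height from the offset length.

    Assumption: height is proportional to the offset magnitude, with one
    scale factor per image (hypothetical parameter).
    """
    dx, dy = offset
    return math.hypot(dx, dy) * metres_per_offset_pixel


# Hypothetical roof outline (pixel coordinates) and predicted offset.
roof = [(10.0, 10.0), (30.0, 10.0), (30.0, 25.0), (10.0, 25.0)]
offset = (3.0, 4.0)

footprint = footprint_from_roof(roof, offset)
height = height_from_offset(offset, metres_per_offset_pixel=2.0)
```

Under these assumptions, a predicted roof mask and a single offset vector are enough to place the footprint and estimate a height, which suggests why combining semantic and offset predictions can yield a full 3D building model.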