Abstract:
Neural Volume Rendering (NVR) has advanced explosively since the advent of the Neural Radiance Field (NeRF), a technique for novel view synthesis of complex scenes based on a finite set of input views. Existing ray casting-based NVR approaches process rays concurrently to leverage parallelism but fail to consider the impact of this on cache locality, which ultimately undermines the efficiency of the corresponding dedicated hardware accelerator designs. We further observe that there is a spatial correspondence between features and voxels in NVR that can be exploited by processing in voxel order rather than ray order. This paper introduces a novel approach that reorders the execution of rays so that rays with similar memory access patterns are processed in parallel, thereby enhancing cache locality. Building on this, we also propose an efficient backend architecture and a corresponding memory subsystem that enable accurate data prefetching to hide off-chip memory latency. To validate the proposed architecture, we implement our design in Verilog HDL and evaluate its performance by post-synthesis simulation with real scene data. The evaluation results demonstrate that our design markedly improves the efficiency of NVR processing, achieving a 1.62× speedup over the state-of-the-art NVR accelerator while requiring 5.12× less silicon area and 32.79× less power.
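The voxel-order idea described above can be illustrated with a small software sketch. This is not the paper's hardware scheduler; it is a minimal example, assuming a uniform voxel grid and fixed sample depths, of how ray samples could be bucketed by the voxel they fall in so that samples sharing the same features are handled together. The names `voxel_of`, `reorder_samples_by_voxel`, and `voxel_size` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from math import floor


def voxel_of(point, voxel_size):
    """Return the integer voxel coordinates that contain a 3D point."""
    return tuple(floor(c / voxel_size) for c in point)


def reorder_samples_by_voxel(rays, depths, voxel_size=1.0):
    """Group (ray_id, depth) samples by voxel instead of by ray.

    rays:   list of (origin, direction) pairs, each a 3-tuple of numbers.
    depths: sample distances taken along every ray (an assumption of
            this sketch; real renderers choose samples adaptively).
    Returns a list of (voxel, samples) pairs sorted by voxel index, so
    all samples that touch one voxel are adjacent in the schedule.
    """
    buckets = defaultdict(list)
    for ray_id, (origin, direction) in enumerate(rays):
        for t in depths:
            point = tuple(o + t * d for o, d in zip(origin, direction))
            buckets[voxel_of(point, voxel_size)].append((ray_id, t))
    return sorted(buckets.items())


# Hypothetical example: two neighbouring rays and one distant ray.
rays = [
    ((0.1, 0.1, 0.0), (0, 0, 1)),
    ((0.2, 0.3, 0.0), (0, 0, 1)),
    ((1.5, 0.1, 0.0), (0, 0, 1)),
]
schedule = reorder_samples_by_voxel(rays, depths=[0.5, 1.5])
for voxel, samples in schedule:
    print(voxel, samples)
```

In this toy schedule the two neighbouring rays land in the same voxels and are processed back to back, which is the kind of locality a cache or prefetcher can exploit; a ray-order schedule would instead interleave accesses to different voxels.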
Published in: IEEE Transactions on Circuits and Systems for Video Technology (Volume: 34, Issue: 11, November 2024)