Time multiplexing (TM) and spatial neighborhood (SN) are two mainstream structured light techniques widely used for depth sensing. The former is well known for its high accuracy, the latter for its low latency. In this paper, we explore a new paradigm of scalable depth sensing that integrates the advantages of both the TM and SN methods. Our contribution is twofold. First, we design a set of hybrid structured light patterns composed of phase-shifted fringes and pseudo-random speckles. Under the illumination of these hybrid patterns, depth can be reliably reconstructed either from a few consecutive frames with the TM principle for static scenes, or from a single frame with the SN principle for dynamic scenes. Second, we propose a scene-adaptive depth sensing framework in which a globally or region-wise optimal depth map is generated through motion detection. To validate the proposed scalable paradigm, we develop a real-time (20 fps) depth sensing system. Experimental results demonstrate that our method strikes a balance between accuracy and speed that previous depth sensing methods have rarely achieved.