1. Introduction
Marker-based optical motion capture (mocap) systems record 2D infrared images of light reflected or emitted by a set of markers placed at key locations on the surface of a subject's body. The systems then recover the precise positions of the markers as a sequence of sparse, unordered points or short tracklets. Powered by years of commercial development, these systems offer high temporal and spatial accuracy. Richly varied mocap data from such systems is widely used to train machine learning methods for action recognition, motion synthesis, human motion modeling, pose estimation, and related tasks. Despite this, the largest existing mocap dataset, AMASS [28], contains only about 45 hours of mocap, far less than the video datasets used in the field.