We present a common framework for studying perception and performance in both human-technology interaction and music. The framework represents the cognitive challenges faced by both musicians and human operators of technological systems. In the perceptual realm, both must infer distal or covert states from proximally available information (e.g., inferring the emotional meaning of a musical composition; inferring the functional meaning of interface displays). In the realm of action, both must select proximally available actions, or means, to achieve distal ends or goals (e.g., a conductor using hand movements to direct an orchestra; an operator using interface controls to tune an industrial process or respond to a fault). The framework represents these proximal-distal relations and enables quantitative measurement of the degree to which performers adapt to them. We illustrate the framework with a review of music and human-technology interaction research and with our own study of the coordination between a professional conductor's hand movements and a concertmaster's bowing actions in the opening of Beethoven's Fifth Symphony. By providing a common theoretical framework for both music and engineering, we hope to improve the prospects that research on group musical performance will inspire novel, robust design models for modern sociotechnical systems.