We present a method for word recognition with input speech transduced simultaneously by several sensors in high levels of broadband acoustic background noise. In prior work on single-input multisensor systems, limited success in machine recognition was achieved by linearly combining multiple sensor signals to yield a robust estimate of the speech signal in the presence of noise. In this paper, we demonstrate that improved recognition results are obtained by using all available sensor signals jointly as a vector, which preserves information from all sensors, as input to the decision process. We report on multisensor configurations using close-talking pressure-gradient microphones and accelerometers placed at the throat and nose of the speaker. The recognition error rates obtained by using the joint output vector are 45% lower than the error rates obtained with the best constituent sensor in the multisensor system; single-input multisensor systems, on the other hand, produce error rates that are about equal to the error rates obtained with the best constituent sensor.
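The contrast between the two input strategies can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the toy feature frames, the weights, and the error rates are all hypothetical, chosen only to show that a linear combination collapses the sensors into one frame of the original dimension, whereas the joint vector keeps every sensor's values, and to show how a relative error reduction such as the reported 45% is computed.

```python
from typing import List, Sequence

Frame = List[float]


def combine_linear(frames: Sequence[Frame], weights: Sequence[float]) -> Frame:
    """Single-input style: weighted sum of per-sensor frames into one frame."""
    if not frames or len(frames) != len(weights):
        raise ValueError("need one weight per sensor frame")
    dim = len(frames[0])
    if any(len(f) != dim for f in frames):
        raise ValueError("all sensor frames must have the same dimension")
    return [sum(w * f[i] for w, f in zip(weights, frames)) for i in range(dim)]


def combine_joint(frames: Sequence[Frame]) -> Frame:
    """Joint style: concatenate per-sensor frames so no sensor is averaged away."""
    return [value for frame in frames for value in frame]


def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Fractional reduction of new_error relative to baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical 2-dimensional feature frames for one time step.
microphone = [1.0, 2.0]
throat_accelerometer = [3.0, 4.0]
nose_accelerometer = [5.0, 6.0]
sensors = [microphone, throat_accelerometer, nose_accelerometer]

linear_frame = combine_linear(sensors, weights=[0.5, 0.25, 0.25])
joint_frame = combine_joint(sensors)

print(len(linear_frame))  # dimension stays at 2
print(len(joint_frame))   # dimension grows to 6

# Made-up error rates, used only to show the arithmetic behind "45% lower".
print(round(relative_error_reduction(20.0, 11.0), 2))
```

In this sketch the recognizer that consumes `joint_frame` sees a higher-dimensional input, so its models must be trained on the concatenated vectors; the linear combination keeps the input dimension fixed but discards which sensor contributed which value.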