I. Introduction
Sound localization is essential in several fields, including video conferencing and surveillance, and it is a fundamental capability for robots that interact with humans [1]. Most approaches to localizing a sound source rely on steerable beamforming or on time delay estimation, in which the delay is computed from two or more signals picked up by different microphones.
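As a rough illustration of time delay estimation between two microphone signals, the following minimal sketch locates the peak of their cross-correlation. It is not the method developed in this paper: the function name `estimate_delay`, the sampling rate `fs`, and the synthetic test signals are assumptions introduced here for illustration only, and plain cross-correlation is only one of several possible estimators.

```python
import numpy as np


def estimate_delay(x1, x2, fs):
    """Estimate the delay of x2 relative to x1, in seconds.

    Hypothetical helper: uses the peak of the plain cross-correlation.
    A positive result means x2 lags behind x1.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # corr[k] = sum_n x2[n + k] * x1[n], for k = -(len(x1) - 1) .. len(x2) - 1
    corr = np.correlate(x2, x1, mode="full")
    lags = np.arange(-(len(x1) - 1), len(x2))
    return lags[np.argmax(corr)] / fs


# Synthetic example (assumed values): white noise reaching the second
# microphone 5 samples later than the first, sampled at 16 kHz.
fs = 16000
true_delay_samples = 5
rng = np.random.default_rng(0)
source = rng.standard_normal(1024)
mic1 = source
mic2 = np.concatenate([np.zeros(true_delay_samples), source[:-true_delay_samples]])

delay_seconds = estimate_delay(mic1, mic2, fs)
```

In this synthetic case the estimated delay corresponds to the 5-sample shift applied to the second signal. With real recordings, noise and reverberation typically make the correlation peak less distinct than in this idealized example.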