I. Introduction
Due to population aging [1], the demand for domestic robots has been rising in recent years, and they are expected to play a significant role in daily life in the future. Domestic robots usually carry multiple sensors for multimodal sensing. Vision is the main source of human perception of the world: the human visual system is highly developed, and humans obtain about 80% of external information through vision. To perceive their surroundings as humans do, domestic robots often carry vision sensors such as cameras, which provide rich scene information but also introduce information redundancy.