Motivated by the human ability to lip-read, the visual modality is considered a source of information for speech recognition systems. Lip-reading is the perception of speech based purely on observing the talker's lip movements. The major difficulty in a lip-reading system is the extraction of visual speech descriptors: this task requires automatic localization and tracking of the labial gestures. In this paper we present a new automatic approach for localizing the lips and their points of interest (POI) on a speaker's face, based on both the color information of the mouth and a geometric model of the lips. This hybrid solution makes our method more tolerant to noise and artifacts in the image. Experiments show that our lip POI localization approach is promising for lip-reading: the presented results show that our system recognizes 94.64% of French visemes.