I. Introduction
Underwater passive acoustic monitoring (PAM) has become a widely used method for studying the behavior and distribution of marine mammals, especially dolphins, because it allows continuous monitoring and is highly adaptable [1], [2]. Dolphins have complex, highly developed communication systems that reflect the complexity of their social relationships [3]. Their communication signals, known as whistles, exhibit highly variable amplitude modulation (AM) and frequency modulation (FM) [4]. Because of the physical limitations of the dolphin vocal apparatus, whistle signals have a highly structured and sparse representation [5]. Whistles also vary significantly between and within populations, and specific whistle categories can convey specific information about the identity of individual species [6]. Moreover, whistles with different amplitude-frequency modulation (AM-FM) modes correspond to different behavioral information [7]. PAM therefore provides valuable insights into dolphin habitat locations, social interactions, and population densities while minimizing the negative impacts of human activities [8], [9]. Given these advantages, PAM has become a critical tool for marine biodiversity and ecological conservation efforts.