1. INTRODUCTION

Owing to their highly developed brain structure and their unique form of communication, which consists of series of stereotyped click patterns (termed "codas"), sperm whales (SW) have been identified as a species with an advanced non-human language. Automatic detection of SW in underwater acoustic recordings is an important task for a variety of applications, such as collision avoidance management [1], biomimicking communication for military applications [2], and assessment of acoustic noise pollution levels [3]. Recently, as part of a project called CETI, an interdisciplinary group of researchers has begun collecting SW acoustic data [4]. Relying on deep-learning tools and on massive amounts of acoustic recordings collected over 5 years, the CETI team will attempt to decode the SW’s communication. Besides communication, SW use clicks for echolocation. This mechanism enables underwater navigation as well as foraging at depths exceeding 1000 m, where SW spend most of their time [5]. Hence, long-range passive acoustic monitoring (PAM) of SW is required.