Convolutional Neural Networks for Risso’s Dolphins Identification

Photo-identification is one of the best practices for estimating the abundance of cetaceans and, as such, it can help to obtain the biological information necessary for decision-making and actions to preserve the marine environment and its biodiversity. The Risso’s dolphin is one of the least-known cetacean species on a global scale, and the distinctive scars on its dorsal fin have proved extremely useful for photo-identifying single individuals. The main novelty of this paper is the development of a new method based on deep learning, called Neural Network Pool (NNPool), specifically devoted to the photo-identification of Risso’s dolphins. This new method also includes the unique function of recognizing unknown vs known dolphins in large data sets with no interaction by the user. Moreover, the new version of the DolFin catalogue, collecting Risso’s dolphin data and photos acquired between 2013 and 2018 in the Northern Ionian Sea (Central-eastern Mediterranean Sea), is presented and used here to carry out the experiments. Results have been validated using a further data set, containing new images of Risso’s dolphins from the Northern Ionian Sea and the Azores, acquired in 2019. The performance of NNPool appears satisfactory and increases with the number of images available, thus highlighting the importance of building large-scale data sets for the application at hand.


I. INTRODUCTION
Top predators such as marine mammals help maintain functionality and resilience of the ecosystem, while actions that contribute to their conservation can generally be beneficial to marine biodiversity [1]. Conducting photo-identification (photo-ID) studies based on the recognition of single individuals through specific markers on their body can help to evaluate the abundance of cetaceans, providing relevant biological information usable for marine environment protection.
The Risso's dolphin Grampus griseus (Cuvier, 1812) is one of the least-known cetacean species on a global scale, with the Mediterranean subpopulation ranked as Data Deficient by the IUCN Red List [2]. Photo-ID studies are a key component in bridging the gap in our understanding of this species. It is, in fact, their appearance that makes Risso's dolphins particularly suitable for this kind of research. Commonly, adult Risso's dolphins, solid grey at birth, display extensive white scarring on their bodies (see Figure 1). These scars, most of which are presumably caused by intraspecific interactions [3], can appear as scratches, stains, or circular marks, and in some animals can cover most of the body surface. As a result, the unique markings on the dorsal fin can be successfully analyzed to identify single individuals. The state of the art for the automated photo-identification of Risso's dolphins is the algorithm SPIR (Smart Photo-Identification of Risso's dolphin) [4], [5], where the Scale Invariant Feature Transform (SIFT) [6] is applied to detect key-points over the scars on the dorsal fin. The purpose of this photo-ID tool is to match the dolphin captured in the query image with the most similar dolphin, in terms of SIFT features, in a catalogue of known and labeled Risso's dolphins. Because SPIR acts as a best-matching algorithm, it always provides an answer in terms of probability, even if the dolphin in the query image is unknown, that is, it has never been sighted before and its photos are therefore not included in the reference catalogue.
The associate editor coordinating the review of this manuscript and approving it for publication was Zahid Akhtar.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
If photographs of new fins become available, for example from a new survey, a more powerful photo-identification requires that the unknown class be considered among the possible identities for the query dolphin. How could unknown dolphins be automatically detected? In general, machine learning algorithms, such as Support Vector Machine [7], [8], Random Forest [9], [10], Adaboost [11] and RUSBoost [12]-[14], provide strategies able to classify examples never seen before, taking into consideration all desired classes, and in particular the unknown one. With this aim, in our previous paper [14] we provided an initial study on this matter, showing the performance of RUSBoost when recognizing unknown Risso's dolphins against one of the previously catalogued and known individuals. This algorithm was selected for its ability to manage class imbalance, an essential property considering that the number of images available for each Risso's dolphin, on which the model is trained, is small due to the low sighting frequency of this species [15]. The strategy proposed in [14] can be summarized in three steps: fin segmentation (fin mask creation), feature extraction inside the mask, and RUSBoost classification. The dorsal fin captured in the image was represented by a Super-SIFT, the vector containing the first three principal components, obtained by Principal Component Analysis (PCA) [16], of the SIFT features computed inside the fin mask. SIFT features were chosen on the basis of our previous paper [5], where we demonstrated that the SIFT-based approach is more robust for the specific application of Risso's dolphin photo-ID. The results of RUSBoost photo-identification on a small data set were quite promising and highlighted the need to apply similar strategies to larger data sets. Encouraged by these results, we pursued the study further, with several new contributions discussed in this paper.
First of all, the DolFin catalogue, previously published in [4], has been updated by collecting Risso's dolphin data and photos acquired up to 2018 in the Northern Ionian Sea (Central-eastern Mediterranean Sea) by our research team.
The updated version of this catalogue, presented here, was used to carry out the experiments. Subsequently, the study proposed in [14] was applied to the DolFin catalogue as follows. A number m of RUSBoost classifiers were trained to identify unknown dolphins against each one of the m known individuals selected in the DolFin catalogue. The selection of the m known individuals, out of the total number n of known dolphins in the DolFin catalogue, was done automatically, based on the accuracy of the fin segmentation, since SIFT features are extracted inside the computed fin mask and are informative for that individual. Ideally, the fin mask is correctly computed when it contains all fin pixels and no sea pixels; in this way, SIFT will only be computed over the dolphin's fin, and no features will be extracted over the sea. Hence, the number m of selected known individuals is related to the number of dolphins whose fin image has been well segmented by the algorithm. This selection of a subset of individuals aims to improve the performance of RUSBoost; in fact, applying RUSBoost to all the n known individuals worsens its performance. A new contribution of this paper is RUSPool, a strategy focusing on the identification of unknown vs known Risso's dolphins, where the known dolphins are all the m individuals previously chosen in the catalogue. In this case, the aim is not to predict the exact identity of the known individual. RUSPool consists of a smart merge, by means of a tailored filter, of the m RUSBoost classifiers, trained as previously described.
Finally, the main novelty of this paper is the application of Convolutional Neural Network (CNN) [17] models to the Risso's dolphin photo-identification pipeline. In particular, the aim of this paper is to investigate the advantages and drawbacks of the modern CNN, compared to RUSBoost, in the analysis of unbalanced data sets. CNN was selected because it is widely used in the literature with excellent performance in many applications [18]-[25] and, most importantly, because it does not require an explicit computation of external features, unlike RUSBoost, which requires SIFT extraction. It thus overcomes the previously discussed problems related to fin segmentation and feature selection. Hence no selection of dolphins is made in the CNN analysis, which is applied to all the n known dolphins listed in the DolFin catalogue: a total of n CNNs were trained, each to recognize the unknown dolphins against one of the n known individuals.
Lastly, a new methodology, called Neural Network Pool (NNPool), is presented here for the specific task of recognizing unknown vs known Risso's dolphins, similarly to what has been done with RUSPool. NNPool consists of a pool of the n CNNs, trained as described. Its output is simply a majority vote over the outputs of the n CNNs. Both NNPool and RUSPool can be employed to automatically process extensive amounts of data, without user interaction, in order to identify unknown dolphins. NNPool is based on n photo-identified individuals, a number greater than the number m of individuals used in RUSPool.
However, the selection of the subset of m dolphins for RUSPool is internal to the methodology and should be considered a part of it. For this reason, the comparison between the RUSPool and NNPool-based strategies is valid, since both start from the n dolphins listed in DolFin. Finally, the performance of NNPool was compared to that of RUSPool, and the experimental results were also validated using a further data set, collecting Risso's dolphin images acquired in 2019 in the Northern Ionian Sea and the Azores.
In summary, the main contributions of the paper are: the development of NNPool, a new methodology devoted to the automated photo-identification of unknown vs known Risso's dolphins; a comparison of the performance of CNN against the RUSBoost algorithm for imbalanced data classification; the publication of the updated version of DolFin, a freely accessible catalogue collecting photos and data of Risso's dolphins in the Gulf of Taranto.

A. SURVEY AREA AND DATA COLLECTION
Sighting data of G. griseus were collected from July 2013 to August 2018 during vessel-based surveys conducted on board a 12-m catamaran, investigating an area of about 960 km² in the Gulf of Taranto (Northern Ionian Sea, Central-eastern Mediterranean Sea). Date, time of day, sea and weather conditions, geographic coordinates, group size (number of specimens), and depth (m) were recorded. In addition, images for photo-ID were taken using a Nikon D3300 camera with a Nikon AF-P Nikkor 70-300 mm f/4.5-6.3G ED lens. We found 93 different Risso's dolphins, photo-identified using the algorithm SPIR [4], [5], and their photographs are freely accessible using the new release of the DolFin platform (http://dolfin.ba.issia.cnr.it/). Unfortunately, most of them were sighted only once, thus very few photos are available and their quality is not always good. Hence, in the present study, only fin images with dimensions equal to or greater than 200 × 200 pixels were used. Moreover, dolphins were selected according to the following criteria: 1) each individual must have been sighted in two or more different daily surveys; 2) at least 15 images must be available for one side of the dolphin's fin. The first condition guarantees a minimum of variability in the illumination conditions when acquiring images; the second ensures the minimum number of examples necessary to train the classifier. In this way, n = 28 individuals were selected in the DolFin catalogue and used for this study. The left and right sides of the dolphin's fin are considered and analysed independently, as if belonging to separate individuals.
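The selection rules above can be sketched in a few lines of Python. This is only an illustration: the `FinImage` record and its fields are hypothetical, and counting survey days over the usable images only (those passing the 200 × 200 size check) is an assumption, since the text does not say in which order the criteria are applied.

```python
from dataclasses import dataclass

MIN_SIDE_PX = 200      # from the text: images of at least 200 x 200 pixels
MIN_SURVEY_DAYS = 2    # from the text: sighted in two or more daily surveys
MIN_IMAGES = 15        # from the text: at least 15 images for one fin side


@dataclass
class FinImage:
    """Hypothetical record describing one photo of one fin side."""
    width: int
    height: int
    survey_day: str  # e.g. "2016-07-14"


def is_large_enough(img):
    """Size criterion: both dimensions must reach the minimum."""
    return img.width >= MIN_SIDE_PX and img.height >= MIN_SIDE_PX


def is_selected(images):
    """Apply the two selection criteria to the images of one fin side.

    Assumption: survey days are counted over usable images only.
    """
    usable = [img for img in images if is_large_enough(img)]
    survey_days = {img.survey_day for img in usable}
    return len(survey_days) >= MIN_SURVEY_DAYS and len(usable) >= MIN_IMAGES
```

Each fin side would be passed to `is_selected` separately, in line with the choice of treating left and right sides as separate individuals.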
To build the data set D_R used for training the RUSBoost classifiers, we introduced a tailored filter f_R: the fin area, computed using the mask, must cover less than 70% of the total image. As shown in Figure 2a, when the fin mask is correctly computed, i.e. the mask contains all the fin pixels and no sea pixels, the mask area will be lower than a threshold, empirically fixed at 70% of the total number of pixels. This is a crucial point: since the SIFT features, representative of each individual, are extracted from within the fin mask, the fin must be correctly segmented and exclude as many sea pixels as possible. In this manner, a subset of m = 23 dolphins was selected out of the n = 28 chosen for this study. The data set D_R contains 433 images belonging to the m = 23 dolphins, whose names and percentages of available images are reported in Table 3.
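A minimal sketch of the filter f_R follows, assuming the fin mask is available as a binary matrix (1 for fin pixels, 0 elsewhere). The function names and the list-of-lists representation are illustrative choices, not part of the original pipeline.

```python
MAX_MASK_FRACTION = 0.70  # threshold from the text, empirically fixed


def mask_fraction(mask):
    """Fraction of image pixels covered by the fin mask (values 0 or 1)."""
    total_pixels = sum(len(row) for row in mask)
    fin_pixels = sum(sum(row) for row in mask)
    return fin_pixels / total_pixels


def passes_filter(mask, max_fraction=MAX_MASK_FRACTION):
    """Keep the image only if the mask covers less than max_fraction."""
    return mask_fraction(mask) < max_fraction
```

A mask that floods the whole frame, which typically means sea pixels were included, is rejected, while a compact mask is kept.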
To build the data set D_NN used to train the Convolutional Neural Networks, all the images were resized to 300 × 400 pixels, because a CNN requires images of equal size.
Note that the filter f_R used to build D_R is not necessary here because, in this case, the input data for the CNN is the full image and no mask is computed over the fin. Finally, D_NN consists of 582 images of n = 28 different dolphins, whose names and percentages of available images are reported in Table 6.
Lastly, a validation data set D_v was used to validate the experimental results. It contains 300 images of Risso's dolphin fins, detailed as follows:
• 150 collected during daily surveys in 2019 in the Gulf of Taranto, using the previously described protocol and instruments, of which 40 belong to some of the 23 known dolphins and 110 to unknown dolphins;
• 150 collected in the Azores. This data set was obtained off Pico island, covering approximately 540 km², between May and September 2019. Risso's dolphins were first located from a land-based lookout (38.4078 N and 28.1880 W) using 25 × 80 binoculars (Steiner observer) [26], and encountered during ocean-based surveys using a 5.8-m-long zodiac equipped with a 50 HP outboard engine. In the present study, all these photos belong to the unknown class.
The unknown label is assigned to dolphins never seen during previous surveys and therefore not yet catalogued in previous photo-ID studies. Conversely, the known class comprises dolphins already listed in the reference catalogue, i.e. photo-identified in previous studies.

B. RUSBOOST METHODOLOGY
In this section we describe the methodology, previously presented in [14], based on the RUSBoost classifier and developed with the aim of identifying the name of the dolphin captured in a new photo. The methodology can be summarized as shown in Figure 3. The PRE-PROCESSING phase consists of the automated segmentation of the fin in the image, through 1) image binarization using Otsu's method [27], 2) application of morphological operators (opening, thinning and filling), 3) extraction of the contour from the B/W image and mask creation [4], [5]. In the FEATURE EXTRACTION step, the SIFT features are extracted in each fin image [5]; PCA is then performed and the first 3 principal components are selected to build the 3-dimensional super-SIFT descriptor for each image. The CLASSIFICATION step consists of two different processes and is based on the RUSBoost classifier, a boosting-based sampling algorithm designed to handle class imbalance [12], [13]. It combines Random Under-Sampling (RUS) and Adaboost [11]. RUS is a technique that randomly removes examples from the majority class until the desired balance is achieved. Firstly, a one-vs-all RUSBoost algorithm is trained to classify the identity of one selected dolphin (known, already photo-identified and labeled in the data set D_R) against "all" the other individuals.
RUSPool consists of the smart mixing of the M_i RUSBoost classifiers, with i = 1, 2, ..., 23, previously trained. Considering that each M_i classifier is made of n_CV trained models, the pool of trained RUSBoost classifiers is made of 23 × n_CV models. When a new image is presented as input to RUSPool, the super-SIFT descriptor is computed over the image, and this novel specimen is given as input to the pool of classifiers.
Firstly, among the n_CV models trained for each M_i classifier, only those with |score(1) − score(0)| > 0.11 are taken into account, where score(1) and score(0) are respectively the posterior probabilities that the selected specimen belongs to the known (1) or unknown (0) class, and the cut-off of 0.11 has been empirically chosen. Then, the RUSPool output is a vector R like the one shown in Table 1, where each element r_i ∈ R is the number of positive predictions of the M_i classifier. A filter (Figure 5) has been developed over the vector R to obtain the prediction of the identity of the dolphin in the new image, choosing between the known and unknown labels. The strongest model is the M_i classifier with the highest r_i value, denoted r_strong. A contour of r_strong is defined as the set of values greater than r_strong − I; if no other classifier exists with r_i contained in this contour (meaning that no other M_i classifier has r_i > r_strong − I), the specimen is classified as known, otherwise the specimen is classified as unknown.
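The RUSPool decision rule can be illustrated with the following sketch. Two points are assumptions rather than statements from the text: a "positive prediction" is taken to be a retained model with score(1) > score(0), and the contour width I, whose value is not given here, is exposed as the hypothetical parameter `contour_width`.

```python
SCORE_CUTOFF = 0.11  # empirical cut-off reported in the text


def count_positive_predictions(model_scores, cutoff=SCORE_CUTOFF):
    """Compute r_i for one M_i classifier.

    model_scores is a list of (score_known, score_unknown) pairs, one per
    CV model. Assumption: a retained model votes "known" when
    score_known > score_unknown.
    """
    count = 0
    for score_known, score_unknown in model_scores:
        confident = abs(score_known - score_unknown) > cutoff
        if confident and score_known > score_unknown:
            count += 1
    return count


def ruspool_label(pool_scores, contour_width):
    """Label a specimen as "known" or "unknown" from the vector R.

    contour_width plays the role of I (hypothetical value, chosen by caller).
    """
    r = [count_positive_predictions(scores) for scores in pool_scores]
    strongest = max(range(len(r)), key=lambda i: r[i])
    r_strong = r[strongest]
    rivals = [r_i for i, r_i in enumerate(r)
              if i != strongest and r_i > r_strong - contour_width]
    return "unknown" if rivals else "known"
```

Under this reading, a specimen is labeled known only when a single classifier clearly dominates the pool; a tie between two classifiers always yields unknown.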

D. CONVOLUTIONAL NEURAL NETWORK
Deep Learning is a type of machine learning in which a model learns how to perform classification tasks directly from images. This is advantageous because feature extraction is internal to the learning step, whilst RUSBoost requires an explicit computation of Super-SIFT features. In this paper we use one of the most popular algorithms for Deep Learning, the Convolutional Neural Network (CNN) [17]. This algorithm can have dozens or hundreds of layers, each able to detect different features of an image. Like other neural networks, a CNN is composed of an input layer, an output layer, and many hidden layers in between. These layers perform operations that alter the data with the intent of learning features specific to the data. The most common layers are:
• Convolutional (Conv), which puts the input images through a set of convolutional filters, each activating a certain feature from the images.
• Rectified linear unit (ReLu), which allows faster and more effective training by mapping negative values to zero and maintaining positive values, thus adding non-linearity.
• Pooling (MaxPool), which simplifies the output by performing nonlinear down-sampling, reducing the number of parameters that the network needs to learn.
• Fully Connected (FC), which multiplies the input by a weight tensor and adds a bias vector. All neurones are connected together and, typically, these kinds of layers are used to actually classify the features extracted by previous layers.
• SoftMax, which transforms a set of values into probabilities associated with the classes.
By combining these types of layers, we obtain the architecture of the CNN we have built. In order to recognize the fin image, we apply the combination of layers Conv-ReLu-MaxPool three times, each time doubling the number of learned filters in the convolutional layers, from 8 to 16 to 32 (see Figure 6).
After learning features, the architecture of a CNN shifts to classification, which is composed of two (FC + ReLu) layers, used to reduce the dimension to a k-dimensional vector, where k = 2 is the number of classes (known vs unknown) to predict. This vector contains the probabilities that the input image belongs to each class. The final (FC3 + SoftMax + Classification) layers of the CNN use a SoftMax function to provide the classification output [28].
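To make the layer definitions above concrete, the sketch below implements the ReLu, MaxPool and SoftMax operations on plain Python lists. It is not the network used in the paper: the 2 × 2 pooling window with stride 2 is an assumption, and the convolutional and fully connected layers are omitted for brevity.

```python
import math


def relu(values):
    """Map negative values to zero and keep positive values unchanged."""
    return [max(0.0, v) for v in values]


def max_pool_2x2(feature_map):
    """Non-overlapping 2 x 2 max pooling (window size is an assumption)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[r][c], feature_map[r][c + 1],
             feature_map[r + 1][c], feature_map[r + 1][c + 1])
         for c in range(0, cols - 1, 2)]
        for r in range(0, rows - 1, 2)
    ]


def softmax(values):
    """Turn k raw scores into k probabilities that sum to one."""
    shift = max(values)  # subtracting the maximum avoids overflow
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Number of filters in the three Conv-ReLu-MaxPool blocks, doubling each time.
filters_per_block = [8 * 2 ** block for block in range(3)]
```

With k = 2, `softmax` returns the pair of probabilities for the known and unknown classes, and `filters_per_block` reproduces the 8, 16, 32 progression described in the text.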
In order to better justify the architectural choice, it is worth comparing the number of learnable parameters involved in the training of the CNN proposed in this paper with the corresponding number for one of the best-known CNN classifiers in the literature. Our CNN requires 7,248,274 parameters to be learned, whereas AlexNet, for example, requires the update of 60,965,225 parameters. The number of images required to train our classifier while avoiding overfitting is therefore lower than that required by bigger networks. The simplicity of our architecture is also justified by the task that the network solves, i.e. the classification of individual dolphins by analysing dorsal fin images. In fact, even if these images can differ from one another, the intrinsic variability of our data set is lower than that of the data sets used to train, for example, AlexNet, which aims to assign a generic input image to one of 1,000 different classes. For these reasons, the number of images collected in our experiments, properly augmented, is sufficient to train a simple but effective CNN built from scratch.
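As a rough illustration of how such parameter counts arise, the sketch below applies the usual counting formulas for convolutional and fully connected layers. The 3 × 3 kernels and the 3-channel (RGB) input are hypothetical, since the kernel size and the sizes of the FC layers are not stated here; only the filter counts (8, 16, 32) and the two totals come from the text.

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus one bias per filter of a convolutional layer."""
    return (kernel_h * kernel_w * in_channels + 1) * out_channels


def fc_params(n_inputs, n_outputs):
    """Weights plus one bias per output unit of a fully connected layer."""
    return (n_inputs + 1) * n_outputs


# Hypothetical setting: 3 x 3 kernels and a 3-channel (RGB) input image.
in_channels = 3
conv_counts = []
for out_channels in [8, 16, 32]:
    conv_counts.append(conv_params(3, 3, in_channels, out_channels))
    in_channels = out_channels

OUR_CNN_PARAMS = 7_248_274    # reported in the text
ALEXNET_PARAMS = 60_965_225   # reported in the text
ratio = ALEXNET_PARAMS / OUR_CNN_PARAMS
```

Under these assumptions the three convolutional layers would hold only a few thousand parameters, which suggests that most of the reported total sits in the fully connected layers. The ratio of the two reported totals is about 8.4.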
A CNN model is then trained for each of the n = 28 dolphins in the D_NN data set with a one-vs-all technique and a CV strategy. Since overfitting is an issue that must be avoided, we executed an additional validation experiment to test the performance of NNPool on a data set never used for training purposes. The results confirm the capability of the model to perform the classification task without a substantial drop in performance.
Class imbalance is managed with a downsampling strategy. Given a dolphin d_i among the n = 28 dolphins in D_NN, let n_i be the number of images available for d_i. Then, the unknown class is composed of m_i images, where m_i = 27 × κ* with κ* = min{κ ∈ N | 27 × κ ≥ n_i}, i.e. m_i is the smallest multiple of 27 greater than or equal to n_i. This way, κ* images for each of the remaining 27 individuals are taken into account. The training set is composed of (n_i + m_i) photos. Subsequently, an oversampling technique based on image augmentation is used to increase the number of images in the training set. Two geometric transformations have been applied to every image: random rotation of ±45 degrees (over both the y and x axes) and translation of ±20 pixels (over both axes, too).
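The size of the down-sampled unknown class follows directly from the definition of κ*. The short sketch below computes κ* and m_i for a given n_i; the function name is illustrative.

```python
N_OTHER_DOLPHINS = 27  # the remaining individuals when n = 28


def unknown_class_size(n_i, n_others=N_OTHER_DOLPHINS):
    """Return (kappa_star, m_i) for a dolphin with n_i images.

    kappa_star is the smallest positive integer with n_others * kappa >= n_i,
    i.e. the ceiling of n_i / n_others.
    """
    if n_i < 1:
        raise ValueError("n_i must be a positive number of images")
    kappa_star = -(-n_i // n_others)  # integer ceiling division
    return kappa_star, n_others * kappa_star
```

For instance, a dolphin with the minimum of 15 images gives κ* = 1 and m_i = 27, so the training set holds 15 + 27 = 42 photos before augmentation, while a dolphin with 28 images gives κ* = 2 and m_i = 54.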

E. NNPOOL
Here we propose a new strategy, named NNPool, whose aim is to recognize known vs unknown dolphins without explicitly identifying the name of the known and previously labeled dolphin. Similarly to RUSPool, NNPool consists of the mixing of the CNN_i networks, with i = 1, 2, ..., 28, where each CNN_i is made of n_CV trained models. NNPool is thus made of 28 × n_CV models.
Each time we want to predict the label of a new photo, it is resized to the dimension required by the input layer of all the CNNs, which is 300 × 400 pixels. Subsequently, the photo is used as input to NNPool, giving a vector P as output (as shown in Table 2). If exactly one element p_i ∈ P is greater than 51%, the new photo is labeled as known; otherwise it is labeled as unknown.
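The final NNPool decision can be sketched as follows. The code starts from the vector P already expressed as fractions in [0, 1]; how each p_i is aggregated from the n_CV models of CNN_i is not detailed here and is left out of the sketch.

```python
KNOWN_THRESHOLD = 0.51  # the 51% threshold from the text, as a fraction


def nnpool_label(p, threshold=KNOWN_THRESHOLD):
    """Label a photo from the vector P of per-network outputs.

    The photo is "known" only when exactly one entry exceeds the threshold.
    """
    votes = sum(1 for p_i in p if p_i > threshold)
    return "known" if votes == 1 else "unknown"
```

Note that a photo claimed by two or more networks is treated as unknown, just like a photo claimed by none.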

III. EXPERIMENTS AND RESULTS
All data were analyzed using Matlab (MathWorks, Natick, MA), and the experiments described in this paper were conducted on an HP z840 Workstation equipped with an Intel Xeon E5-2699 v3 CPU @ 2.30 GHz, 256 GB RAM and an Nvidia Quadro K5200 graphics card. In the first step of this study, a one-vs-all RUSBoost model is built to recognize a selected dolphin against all the others in the data set D_R. A CV technique is used to evaluate the performance of the classifiers. In each round of the CV, the data set D_R is divided into training and test sets, as shown in Figure 7. The training set contains about 2/3 of the data, and the remaining examples are collected in the test set [29]-[31]. The RUSBoost classifier is trained on the first set and its performance is evaluated on the test set; results are averaged over the n_CV = 100 rounds. The number of CV rounds was empirically set. The imbalance of the data varies from 9.5% to 3.5% (see Table 3), reaching critical values for the classification task; this motivates the selection of RUSBoost among other machine learning algorithms. The number of cycles of RUSBoost was empirically set as NLearnCycles = 60 for all the classifiers. Moreover, image quality surely affects the performance of the algorithm and therefore should be taken into account. Several methods to evaluate image quality have been discussed in the literature [32], [33]. In this paper the Perception based Image Quality Evaluator (PIQE) score [34] is used to evaluate the image quality and is computed for all the images used to build each M_i classifier. These are no-reference image quality scores, with values in the range [0, 100], inversely correlated with the perceptual quality of an image. The quality scale of the image based on its PIQE score is reported in Table 4. A low PIQE score indicates high perceptual quality and a high score indicates low perceptual quality. The accuracies, specificities and sensitivities of RUSBoost classification over the 23 dolphins are shown in Table 3, where the PIQE median value and Median absolute deviation (Mad) of the images are reported in the last two columns. The classifier of ERARD_R is built on the largest number of images; although these have fair quality, as shown by the PIQE median and Mad, the specificity is good (82%). Similarly, CUPIDO_R, represented by 5.3% of the images, of fair quality, reached the highest accuracy and specificity, equal to 93%. A sensitivity of 90% was reached in the recognition of EMME_R, which is represented by a lower number of images (4.2% of the data set size), all of good quality. When the percentage of available images decreases (below 5.5%) and the image quality is fair or poor, the performance decreases as well, with low sensitivity values. ELE_R deserves a remark: it shows an excellent quality score but a sensitivity of only 65%, explainable by its 3.5% share of the data set.
FIGURE 8. Data from Table 6, computed on the data set D_NN. Blue (red) data refer to sensitivities of specimens with Percentage lower (greater) than 3.5% (Table 6). The cyan (magenta) line represents the linear trend of the blue (red) data.
Hence the performances of the RUSBoost classifiers clearly depend on the number of images available for the training of the model, as well as on the quality of the images.
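The repeated hold-out evaluation described above (about 2/3 of the data for training, the rest for testing, averaged over n_CV rounds) can be sketched as below. The function names and the fixed seed are illustrative, and the sketch draws plain random splits; whether the original splits were stratified by class is not stated.

```python
import random


def repeated_holdout(n_samples, n_rounds, train_fraction=2 / 3, seed=0):
    """Return n_rounds random (train_indices, test_indices) splits."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    n_train = round(n_samples * train_fraction)
    splits = []
    for _ in range(n_rounds):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        splits.append((indices[:n_train], indices[n_train:]))
    return splits


def mean(values):
    """Average a per-round metric over all CV rounds."""
    return sum(values) / len(values)


# Example with the sizes reported for D_R: 433 images, n_CV = 100 rounds.
splits = repeated_holdout(n_samples=433, n_rounds=100)
```

A classifier would be trained on each `train_indices` list and scored on the matching `test_indices`, with `mean` applied to the resulting per-round accuracies, sensitivities and specificities.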
Subsequently, to test the RUSPool performance, the images collected in the data set D_v were used. The experimental results, shown in Table 5, highlight that RUSPool is able to recognize the unknown dolphins with accuracy = 78%, sensitivity = 58% and specificity = 81%.
The CNN has been trained using the Stochastic Gradient Descent with Momentum method (momentum set to 0.9), with a mini-batch size of 0.25 × (training set size), 60 epochs (with shuffling at every epoch) and an initial learning rate of 0.00001. The loss function used was the cross-entropy loss. The performance of the CNN was evaluated using a CV strategy, with n_CV = 10. In each round of the CV, the data set D_NN is divided into training and test sets, where the training set contains about 2/3 of the data, and the remaining examples are collected in the test set. The classifier is trained on the first set and its performance is evaluated on the test set; results are averaged over the n_CV rounds and illustrated in Table 6, showing that the overall behaviour of the CNN outperforms that of RUSBoost, even with generally lower quality scores. Figure 8 shows that when few images are available for the specimen (Percentage ≤ 3.5%), the CNN sensitivity decreases with the image quality (cyan line in the figure). Instead, when the number of images for the specimen increases (Percentage > 3.5%), good sensitivity values are achieved even if fair or poor quality images are used (magenta line in the figure). In general, the performance of the CNNs is very good both in the case of many low quality images and in the case of a few high quality images.
TABLE 3. The performances of RUSBoost (NLearnCycles = 60) in classifying the twenty-three dolphins (listed in Name) are shown in terms of accuracy, sensitivity and specificity computed using a cross-validation strategy. The second column shows the percentage (%) of images of the data set available for each dolphin. Median, Mad (Median absolute deviation) and quality scale of the PIQE (Perception based Image Quality Evaluator) scores are shown in the last three columns.
TABLE 6. The performances of CNNs in classifying the twenty-six dolphins (listed in Name) are shown in terms of accuracy, sensitivity and specificity computed using a cross-validation strategy. The second column shows the percentage (%) of images of the data set available for each dolphin. Median, Mad (Median absolute deviation) and quality scale of the PIQE (Perception based Image Quality Evaluator) scores are shown in the last three columns. * refers to dolphins which were not analyzed by RUSBoost.
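For readers unfamiliar with the optimizer mentioned above, the following sketch shows one common formulation of the Stochastic Gradient Descent with Momentum update for a single scalar weight, using the momentum (0.9) and initial learning rate (0.00001) reported in the text. The toy loss f(w) = w² is purely illustrative and has no relation to the actual network.

```python
LEARNING_RATE = 1e-5  # initial learning rate reported in the text
MOMENTUM = 0.9        # momentum reported in the text


def sgdm_step(weight, velocity, gradient,
              learning_rate=LEARNING_RATE, momentum=MOMENTUM):
    """One SGD-with-momentum update; returns (new_weight, new_velocity)."""
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity


def toy_gradient(weight):
    """Gradient of the illustrative loss f(w) = w**2."""
    return 2.0 * weight


# Two updates starting from w = 1 with zero initial velocity.
w, v = 1.0, 0.0
for _ in range(2):
    w, v = sgdm_step(w, v, toy_gradient(w))
```

The velocity term accumulates past gradients, so the second step is larger than the first even though the gradient has barely changed.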
The performance of NNPool, shown in Table 5, was tested on the data set D_v, obtaining accuracy = 87%, sensitivity = 70% and specificity = 90%, higher than the values obtained with RUSPool.
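The three reported metrics are linked: accuracy is the average of sensitivity and specificity weighted by class size. The sketch below checks the reported figures under one assumption, namely that sensitivity is measured on the 40 known images of D_v and specificity on the 260 unknown ones (110 from the Gulf of Taranto plus 150 from the Azores). This assignment is not stated explicitly, but it reproduces the reported accuracies after rounding.

```python
N_KNOWN = 40      # known-dolphin images in D_v (from the text)
N_UNKNOWN = 260   # 110 + 150 unknown-dolphin images in D_v (from the text)


def accuracy_from_rates(sensitivity, specificity,
                        n_positive=N_KNOWN, n_negative=N_UNKNOWN):
    """Accuracy implied by per-class rates and class sizes.

    Assumption: the positive class is "known", the negative is "unknown".
    """
    correct = sensitivity * n_positive + specificity * n_negative
    return correct / (n_positive + n_negative)


ruspool_accuracy = accuracy_from_rates(0.58, 0.81)
nnpool_accuracy = accuracy_from_rates(0.70, 0.90)
```

Because the unknown class is more than six times larger than the known class in D_v, specificity dominates the accuracy figure for both methods.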
A final remark concerns the time required by the two analyzed classifiers. With this hardware configuration, the mean time for RUSPool training was 2288.53 seconds, whilst NNPool required 1816.84 seconds. As far as classification time is concerned, i.e. the time required for a single unknown image to be classified by RUSPool or NNPool, the values are 150 seconds for RUSPool and 25 seconds for NNPool.

IV. CONCLUSION AND FUTURE WORKS
The performance of the proposed CNN-based algorithm appears, overall, satisfactory, and the system automatically processes large amounts of data with no interaction by the user. In particular, the performance of NNPool surpasses that of RUSPool on the validation set as well. Its ability to identify unknown Risso's dolphins, namely those dolphins never encountered during previous surveys, will open new frontiers for photo-identification studies, in particular when paired with a photo-ID algorithm such as SPIR [4]. An important advantage of NNPool is that it analyzes the full fin images, so its performance is independent of the accuracy of the fin segmentation. Segmentation is not a trivial task for the application at hand, keeping in mind that the photos analyzed in this study show wildlife against a background of sea, sun, or waves. Finally, it is also worth considering that the time required for classifying an unknown image with NNPool is significantly lower than that of the other method evaluated in this work, which is a practical advantage in terms of computing time.
Moreover, our experimental results show that the evaluation metrics increase with the number of images available, highlighting the need to gather larger amounts of dolphin images. Thus, a strong effort is required to conduct large-scale studies on Risso's dolphin photo-identification. Hence a further future goal will be to extend the study to larger data sets, which will be obtained by acquiring data during new surveys in our area of study, as well as by sharing data with other research groups that are offering their collaboration. The DolFin portal, previously published by our research team in [4], and whose updated version has been presented here, will facilitate the work in this regard.
ROCCO CACCIOPPOLI was born in Bari, Italy, in 1995. Following the internship at the National Research Council of Italy, Bari, he will complete a bachelor's degree in computer science at the University of Bari Aldo Moro, Italy, in 2020, defending a thesis on the study and development of deep learning for the photo-identification of marine mammals.
EMANUELE SELLER was born in Bari, Italy, in 1995. He received the bachelor's degree in computer science and technologies for software production from the University of Bari Aldo Moro, Italy, in 2019. He is currently attending a first level master's degree at the University of Verona, Italy. During his internship (started in April 2019) at the National Research Council of Italy, Bari, he worked on the development of an intelligent system in the field of ecological informatics, for the automatic-photo identification of dolphins. His research interests include computer vision and automatic identification systems.
STEFANO BELLOMO was born in Bari, Italy, in 1988. He received the Master of Science degree in environmental biology from the University of Bari Aldo Moro, Italy, in 2019. He has been collaborating with the Jonian Dolphin Conservation in Taranto since 2013 as a marine mammal observer and is currently a member of its scientific committee. He acts as an International Education Project Coordinator. He has coauthored five articles published in international conferences. His research interests are mainly related to ethology and photo-identification. He has been a member of the Italian Society of Marine Biology since February 2020.
Recently, she became a Researcher in ecology at the Department of Biology, University of Bari Aldo Moro, Italy. She has expertise in the implementation and analysis of biological and environmental data in GIS environments. She has coauthored more than 30 scientific publications in congress proceedings and in national and international ISI journals. Her main research interests focus on the application of biological, statistical, and mathematical models to marine ecology and to population dynamics, with a focus on cetaceans.
She co-founded the Nova Atlantis Foundation in 2002, aiming to study and protect cetacean species in the Azores. She has been studying Risso's dolphins for over 20 years, focusing on their social ecology and behaviour. At present, she is a Field Director of the Nova Atlantis Foundation and has built up the largest database and photo-identification catalogue of Risso's dolphins (n = 1250 individuals) in the world. In 2018, she was invited to write the chapter on Risso's dolphins for the Encyclopaedia of Marine Mammals. She has coauthored more than 50 international conference communications and articles.
Dr. Hartman is the Chair and the Co-Founder of the International Grampus Workgroup, a collective of 40 scientists aiming to collaborate and share data on Risso's dolphins. In 2018, she produced the film Scars, Politics in the Big Blue, which won several awards for best drone and wildlife footage at international film festivals.
CARMELO FANIZZA was born in Taranto, Italy, in 1976. He received the bachelor's degree in mariculture sciences from the University of Bari Aldo Moro, Italy, in 2001.
His main expertise is in scientific research on cetaceans, along with cetacean-sighting expeditions aimed at directly involving students, tourists and citizens. He founded the Jonian Dolphin Conservation in 2009, where he currently acts as the President, and participated in the realization of the Euro-Mediterranean Center of the Sea and Cetaceans, Ketos, inaugurated in Taranto, Italy, in 2019. Ketos hosts a museum, an area dedicated to tourist services, a sector specifically set aside for start-ups and social entrepreneurship, and a VR library overlooking the sea. He has coauthored more than 20 articles published in international conferences and journals.
Dr. Fanizza and the Jonian Dolphin Conservation were recognized as one of the 21 Italian Excellence organizations at Expo Milan, in 2015.
GIOVANNI DIMAURO (Member, IEEE) was born in Taranto, Italy, in 1964. He received the Laurea degree in computer science from the University of Bari Aldo Moro, Italy, in 1987. He is currently an Associate Professor with the Computer Science Department, University of Bari Aldo Moro. He teaches computer programming and multimedia systems. He has authored more than 150 articles, which have been published in scientific journals, proceedings, and books. He is the author of two patents, of which the latter pertains to the field of noninvasive anemia estimation. His research interests include e-health, multimedia systems, and pattern recognition with applications in medicine, such as new diagnosis technology for anemia and Parkinson's disease.
ROBERTO CARLUCCI was born in Taranto, Italy, in 1970. He received the Ph.D. degree in environmental science.
He is currently an Assistant Professor of ecology with the Department of Biology, University of Bari Aldo Moro, Italy. He has coauthored 105 scientific publications in congress proceedings, national and international ISI journals. He is a reviewer for Italian and international ISI journals. His main research interests focus on the application of biological, statistical and mathematical models to marine ecology, population dynamics, and fishery stock assessment.
Prof. Carlucci attended the Working Group on Fishery Stock Assessment of Demersal Species at the FAO General Fisheries Commission for the Mediterranean as the person responsible for GSA19 (Western Ionian Sea). He serves as the scientific lead of the Jonian Dolphin Conservation onlus. He is a member of the Commission Internationale pour l'Exploration Scientifique de la Méditerranée Task Force on Sharks and Rays.