A Computer Vision Based Behavioral Study and Fish Counting in a Controlled Environment

Aquaculture is a fast-growing food-production sector that accounts for almost 50% of the world’s fish used for consumption. Aquaculture refers to the cultivation of fish in cages. The fish in the cage must be fed at regular intervals, and knowing the number of fish helps in estimating the amount of feed to be put in the cage. The behavior of fish in a caged environment reflects their health. In the absence of a conducive environment, fish are stressed, which results in frantic movement. This frantic behavior can be identified using recent advancements in image and video processing. In this study, we focus on frantic behavior detection, fish detection, and counting. For this, a RAS with Tilapia fish has been set up, and videos of the fish have been captured. Detection and counting have been achieved using the YOLOv5 model. The model achieved a Precision, Recall, and F-measure of 81%. The results are compared with the ground truth, which indicates that the model has been successful in counting the fish. The frantic movement of the fish has been detected by developing an optical flow model. The results are encouraging and can be used for frantic behavior detection.


I. INTRODUCTION
India is the second-largest producer of aquaculture fish and the third-largest fish-producing country globally. Fisheries and aquaculture are essential for poverty alleviation. Fish accounts for 17% of the animal protein intake [1] globally and remains one of the most traded food commodities worldwide. By 2050, around 125-210 million tons of fish are expected to be required to meet the annual per capita requirement of 15-20 kg [2]. Aquaculture refers to the cultivation of aquatic organisms in a controlled environment. Aquaculture has been growing at a rapid pace and is part of global food security and sustainable development. It has gained immense importance in recent times as fish consumption is increasing by 6% annually. The demand has doubled since the beginning of the year 2000 and will double again if the current growth rate holds. Aquatic animals, especially fish, are a renewable resource that people can feed on indefinitely if managed sustainably.
The associate editor coordinating the review of this manuscript and approving it for publication was Senthil Kumar.
Cage aquaculture involves growing fish in their natural environment while being enclosed in a cage that allows the free flow of water. The fish, whether in a cage, a pond, or a Recirculatory Aquaculture System (RAS), must be fed at regular intervals. Since 60% of the operating expenses are used for feeding [3], it is important to optimize the use of fish feed to avoid feed wastage and contamination of the water inside the cage. Counting the fish is also helpful in determining their mortality rate. It is challenging to keep track of fish and their behavior manually inside a cage, as several behavioral changes can be seen in cultivated fish [4], and manual tracking costs a lot of time and money. Monitoring of fish in aquaculture systems [5] is essential to ensure that the fish are not stressed and have suitable surroundings in which to grow.
The fish tend to move faster when stressed and this movement is referred to as frantic movement. A frantic movement pattern observed in the absence of the feed can be due to stress. The stress in fish can be detected through their movement in the water. Physical, chemical, and perceived stressors can evoke a non-specific response in fish [6]. Bad water quality also affects the behavior of the fish. Monitoring or predicting the water quality parameters is very important in behavioral study [7], [8]. Stress can affect the fish's reproduction, growth, and resistance [9]. Early detection can help in better growth and development of the fish. The advancement in technology gives us hope to conduct such experiments more straightforwardly.
The experiment in this work has been performed on Tilapia fish. These are edible freshwater fish with a high nutritional value, found in ponds and shallow waters. They are commonly consumed worldwide and are the most farmed fish species in over 120 countries and territories across the globe [10]. These fish are affordable and provide a great amount of nutritional value, as they are a good source of omega-3 fatty acids [11], which make up good fat. These are also beneficial for cancer prevention and good heart health.
RAS [12], [13], [14] has been used to cultivate these fish. In RAS, the water is recycled and reused after mechanical and biological filtration. New water is added to the system to make up for splash-out, evaporation, and any other process that leads to water loss from the system. RAS has several advantages, such as better fish quality, close monitoring, and reduced risk from climatic factors, disease, and parasite impacts.
The contributions of this paper are:
• Fish detection and counting
• An algorithm for frantic movement pattern detection
The paper is organized as follows: Section I gives the Introduction to the research work. Section II shows the Related Works. The Experimental Setup, Dataset Preparation, and Proposed Methodology are given in Section III, Section IV, and Section V, respectively. Section VI shows the results obtained from this research. Finally, the paper is concluded in Section VII.

II. RELATED WORKS
Shreesha et al. [15] have performed a behavioral analysis of Sillago sihama fish in an aquarium setup. The analysis covers various types of anomalous behavior, such as swimming at the surface, frantic movement, and no movement, each of which has been mathematically computed. Wang et al. [16] have used a convolutional neural network, YOLOv2, to detect and classify fish images. They have used this model to detect the fish and classify them into one of seven classes, such as shark, dolphinfish, and yellowfin tuna. Lumauag et al. [17] have proposed a method to count and track fish using blob analysis and Euclidean filtering. The authors claim that the model exhibits an average counting accuracy of 91% and an average detection accuracy of 94%. Hossain et al. [18] have worked on detecting underwater moving objects as fish, identifying the species of those fish, and tracking the detected fish to avoid multiple counting. A GMM-based background subtraction method has been used to detect objects, and the Kalman filter has been used for object tracking. Pyramid Histogram of Visual Words (PHOW) is used as the feature vector and SVM as the classifier. Kong et al. [19] have done a comparative study of fish detection wherein they have modified the YOLOv5 algorithm to achieve better performance. The authors have used three models, namely CenterNet, YOLOv4, and YOLOv5, for Golden Crucian Carp detection. They have shown that the YOLOv5 model performed better than the other models.
Mohamed et al. [20] have tracked the movement of fish using an optical flow algorithm alone and a combination of YOLO and optical flow. They have concluded that the results obtained from YOLO + optical flow were much better than those from optical flow alone, as YOLO + optical flow was able to track 4/4 fish whereas optical flow alone could track only 1/4 fish.
Sung et al. [21] have used the YOLO algorithm and compared it with a HOG classifier-based algorithm. They have shown that the former performs better than the latter, as it can successfully process noisy, dimly lit, and hazy underwater images.
Naseer and Baro [22], [23] have proposed a refinement technique based on spatio-temporal analysis to improve the mAP for the detection of Nephrops burrows. The proposed refinement technique is said to work for generic detection models. From this literature review, it is evident that YOLO models give accurate detection results; hence, YOLOv5 has been implemented in this work for the detection of fish.
To the best of our knowledge, there has been no work on frantic behavior detection. Therefore, in this work, an algorithm has been developed to detect frantic behavior patterns in video sequences.

III. EXPERIMENTAL SETUP
A tank-based RAS (Recirculatory Aquaculture System) [24] of dimensions 8 feet × 6 feet × 4 feet, as shown in Fig. 1, has been set up in the Institute premises for experimental purposes and stocked with Tilapia fish. The details of the experimental setup and dataset are shown in Table 1. The tank has been constructed using cement. A few portions of the pond are covered with glass for better visibility from the sides. The bottom of the system is covered with a sand bed. An aerator is used to mix air with water. The water in the RAS is recirculated through the biofilter. This filter serves as a home for bacteria that break down fish waste to keep the environment safe and non-toxic. The water quality in the pond is closely monitored using the water quality sensors listed in Table 2 so that the fish are provided with a conducive ecosystem.

IV. DATASET PREPARATION
An underwater camera is used for capturing the underwater video. Two videos are captured and converted into frames. The annotations have been done using Roboflow and act as the ground truth for detection. For the purpose of frantic movement detection, multiple short video sequences are manually cut from the captured videos and classified, by observation, as normal or frantic based on the movement of the fish.
VOLUME 10, 2022
Some challenges faced during the process of data collection are as follows: The farthest point of the pond was not visible in a frame because of the length of the pond. Since there is no lighting inside the pond, the captured video was somewhat dark. Since an underwater camera was used to capture the videos, the motion of the camera had to be completely controlled.

V. PROPOSED METHODOLOGY
The video data is manually collected from the RAS using an underwater camera. The captured videos are then annotated and used for training the detection model. Fish counting is then performed on the detected images. A masked image is created from the detections. For frantic behavior detection, the fish are tracked across the frames. The displacement of the fish in successive frames is calculated and stored. A threshold is obtained for each video sequence from these displacements, and the thresholds are averaged to find a global threshold τ. Once the threshold τ is defined, the frames with displacement values greater than τ are considered to be frantic. The block diagram for the work is shown in Fig. 2.
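The counting step described above can be illustrated with a minimal sketch. It assumes that each detection is available as a tuple of bounding-box corners plus a confidence score, and that only detections above a confidence cutoff are counted. The tuple layout, the function name `count_fish`, and the cutoff of 0.5 are illustrative assumptions and are not taken from the experiments in this work.

```python
from typing import List, Tuple

# Hypothetical detection record: (x1, y1, x2, y2, confidence).
Detection = Tuple[float, float, float, float, float]


def count_fish(detections: List[Detection], conf_threshold: float = 0.5) -> int:
    """Count the detections whose confidence meets the assumed cutoff."""
    return sum(1 for det in detections if det[4] >= conf_threshold)


# Made-up detections for a single frame, for illustration only.
frame_detections = [
    (10.0, 20.0, 60.0, 50.0, 0.91),
    (70.0, 15.0, 120.0, 48.0, 0.67),
    (30.0, 80.0, 75.0, 110.0, 0.32),  # below the cutoff, so not counted
]

print(count_fish(frame_detections))
```

The per-frame count obtained this way is what would be compared against the annotated ground truth.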

A. FISH TRACKING AND DETECTION
YOLO [25], which stands for 'You Only Look Once', is an object detection algorithm that uses a Convolutional Neural Network (CNN) [26]. This algorithm has gained a lot of popularity due to its high speed, accuracy, and learning capabilities. The algorithm detects objects through a single forward propagation. The YOLOv5 network architecture, shown in Fig. 3, consists of three parts, namely the Backbone, Neck, and Head. CSPDarknet [27] makes up the Backbone and is used for feature extraction, PANet [28] makes up the Neck and is essential for feature fusion, and the YOLO layer makes up the Head, which produces the detection output.
The YOLOv5 model has been trained for the detection of fish. It has been trained using Roboflow.
A total of 508 images have been used. The ratio of train, validation, and test is set to 70%, 15%, and 15%, respectively. The images are resized to 416 × 416.

Algorithm 1 Fish Detection and Counting
Step 1: Collect the video of fish using an underwater camera
Step 2: Convert the video into frames
Step 3: Annotate the fish in each of the frames
Step 4: Split the frame set into train, validation, and test sets
Step 5: Train the YOLOv5 model using the train set
Step 6: Use the trained model to detect and count the fish in the test set

Algorithm 2 Frantic Behaviour Detection
Step 1: Collect the video of fish using an underwater camera
Step 2: Convert the video into shorter video sequences
Step 3: Classify the video sequences obtained in Step 2 as Normal or Frantic based on observation
Step 4: Apply optical flow to the Normal sequences
Step 5: Obtain the movement of fish in consecutive frames
Step 6: Find the threshold using the Otsu method for each Normal video sequence
Step 7: Average the thresholds obtained in Step 6 to obtain a global threshold τ
Step 8: Classify the video sequences as frantic or not using the global threshold
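The 70%/15%/15% split of the 508 annotated images (Step 4 of Algorithm 1) can be sketched as follows. This is an illustration only: the split in this work was prepared with Roboflow, and the file names, the fixed random seed, and the rounding rule used here are assumptions. The exact number of images per subset is not reported, so the sizes produced by this sketch should not be read as the sizes actually used.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(
    items: Sequence[str],
    train: float = 0.70,
    val: float = 0.15,
    seed: int = 0,
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle the items and split them into train, validation, and test lists.

    The test set receives whatever remains after the train and validation
    portions are taken, so the three parts always cover every item once.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # assumed seed, for repeatability
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


# Hypothetical frame names standing in for the 508 annotated images.
frames = [f"frame_{i:04d}.jpg" for i in range(508)]
train_set, val_set, test_set = split_dataset(frames)
print(len(train_set), len(val_set), len(test_set))
```

Shuffling before splitting avoids placing consecutive, nearly identical video frames in the same subset purely by position, although frames from the same video can still end up in different subsets.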

B. OPTICAL FLOW
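Optical flow [30] is a motion estimation technique and is usually applied to images having a small time step between them. The velocity for points within the images is calculated. This work captures the displacement of the fish detected by the bounding boxes. Equations (1)-(4) are used for optical flow computation [31].
Since the movement of the fish between consecutive frames is small, the intensity I of a point is assumed to remain constant as it moves by (δx, δy) over the time step δt:
\[ I(x, y, t) = I(x + \delta x,\; y + \delta y,\; t + \delta t) \tag{1} \]
Expanding the right-hand side as a Taylor series,
\[ I(x + \delta x,\; y + \delta y,\; t + \delta t) = I(x, y, t) + \frac{\partial I}{\partial x}\,\delta x + \frac{\partial I}{\partial y}\,\delta y + \frac{\partial I}{\partial t}\,\delta t + \text{higher order terms} \tag{2} \]
Truncating the higher order terms,
\[ \frac{\partial I}{\partial x}\,\delta x + \frac{\partial I}{\partial y}\,\delta y + \frac{\partial I}{\partial t}\,\delta t = 0 \tag{3} \]
Dividing by δt, with u = δx/δt and v = δy/δt,
\[ \frac{\partial I}{\partial x}\,u + \frac{\partial I}{\partial y}\,v + \frac{\partial I}{\partial t} = 0 \tag{4} \]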
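Truncating the higher order terms leads to the standard optical flow constraint I_x u + I_y v + I_t = 0, which is a single equation in the two unknowns (u, v) at each pixel. The sketch below shows one common way of resolving this, a Lucas-Kanade style least-squares solution over a small window of pixels. The choice of Lucas-Kanade is an assumption made for illustration, since the specific solver is not stated here, and the gradient values are synthetic.

```python
import math
from typing import List, Optional, Tuple


def solve_flow_window(
    ix: List[float], iy: List[float], it: List[float]
) -> Optional[Tuple[float, float]]:
    """Least-squares solution of I_x*u + I_y*v + I_t = 0 over one window.

    ix, iy, it hold the spatial and temporal intensity gradients of the
    pixels in the window. Returns (u, v), or None if the 2x2 normal
    equations are singular (for example, in a textureless window).
    """
    sxx = sum(a * a for a in ix)
    syy = sum(b * b for b in iy)
    sxy = sum(a * b for a, b in zip(ix, iy))
    sxt = sum(a * c for a, c in zip(ix, it))
    syt = sum(b * c for b, c in zip(iy, it))
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None
    # Solve [[sxx, sxy], [sxy, syy]] @ [u, v] = [-sxt, -syt] by Cramer's rule.
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Synthetic gradients generated from a known flow, for illustration only.
true_u, true_v = 0.5, -0.25
grad_x = [1.0, 0.0, 2.0, 1.0]
grad_y = [0.0, 1.0, 1.0, -1.0]
grad_t = [-(gx * true_u + gy * true_v) for gx, gy in zip(grad_x, grad_y)]

flow = solve_flow_window(grad_x, grad_y, grad_t)
if flow is not None:
    displacement = math.hypot(flow[0], flow[1])  # magnitude per frame step
    print(flow, displacement)
```

The displacement magnitude computed in the last step is the kind of per-frame value that a threshold can then be applied to.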

C. THRESHOLDING
To classify the fish movement as normal or frantic, a threshold is determined. The Otsu method [32], which is used for automatic image thresholding, is applied for this purpose. It selects the threshold that maximizes the between-class variance.
The threshold values for all the video sequences are obtained and averaged to give the global threshold, τ.
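The thresholding step can be sketched as follows: an Otsu threshold is computed for the displacement values of each video sequence by maximizing the between-class variance over a histogram, and the per-sequence thresholds are averaged to obtain τ. This is a minimal sketch under stated assumptions: the number of histogram bins (64), the function names, and the displacement values are all illustrative and are not taken from the experiments.

```python
from typing import List, Sequence


def otsu_threshold(values: Sequence[float], bins: int = 64) -> float:
    """Return the threshold that maximizes the between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # all values identical, nothing to separate
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_var, best_t = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += centers[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2  # unnormalized variance
        if var_between > best_var:
            best_var, best_t = var_between, lo + (i + 1) * width
    return best_t


def global_threshold(sequences: List[Sequence[float]]) -> float:
    """Average the per-sequence Otsu thresholds to obtain tau."""
    thresholds = [otsu_threshold(seq) for seq in sequences]
    return sum(thresholds) / len(thresholds)


# Made-up displacement values for two sequences, for illustration only.
sequences = [
    [0.010, 0.012, 0.015, 0.018, 0.020, 0.090, 0.095, 0.100, 0.105, 0.110],
    [0.020, 0.022, 0.025, 0.030, 0.080, 0.085, 0.090],
]
tau = global_threshold(sequences)
print(tau)
```

Averaging the per-sequence thresholds gives a single value that can be reused on unseen sequences, at the cost of ignoring differences in scale between individual sequences.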

VI. RESULTS
A. FISH TRACKING AND DETECTION
The YOLOv5 model has been implemented for fish detection and counting. Based on the precision, recall, and F-measure, the results obtained for fish detection can be considered good.
The model is able to detect the fish with reasonable accuracy. In order to measure the performance of the model, mean Average Precision (mAP), Precision (5), Recall (6), and F-measure (7) have been calculated.
Precision is calculated as the number of true positives (TP) divided by the total number of true positives and false positives (FP).
\[ \text{Precision} = \frac{TP}{TP + FP} \tag{5} \]
Recall is calculated as the number of true positives divided by the total number of true positives and false negatives (FN).
\[ \text{Recall} = \frac{TP}{TP + FN} \tag{6} \]
F-measure presents a single score that evaluates both the concerns of precision and recall, as shown in (7).
\[ \text{F-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{7} \]
Training graphs are shown in Fig. 4. Sample ground truth and model predictions are shown in Fig. 5. Fig. 6 shows a case where the model has failed to detect overlapping fish. The evaluation metrics are shown in Table 3.
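The three metrics defined above can be computed directly from the counts of true positives, false positives, and false negatives. The sketch below assumes the usual definition of the F-measure as the harmonic mean of precision and recall. The counts in the example are made up for illustration and are not the counts behind the reported results.

```python
from typing import Tuple


def detection_metrics(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) from detection counts.

    Each metric falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


# Hypothetical counts: 8 fish found correctly, 2 false alarms, 2 fish missed.
precision, recall, f_measure = detection_metrics(tp=8, fp=2, fn=2)
print(precision, recall, f_measure)
```

When precision and recall are equal, the F-measure takes the same value, which is consistent with a single figure being reported for all three metrics.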

B. OPTICAL FLOW
The video sequences are manually classified as normal or frantic based on the displacement of the fish. Since no prior research has been conducted on the frantic movement pattern of fish, this has been done manually to establish a ground truth. 10

C. THRESHOLDING
The Otsu method has been applied to all the CSV files obtained in the previous step. The obtained thresholds are then averaged to find a global threshold τ for the frantic movement classification, as shown in Table 4. Here, the value of τ is found to be 0.0455.
Optical flow is then applied to the frantic video sequences with the threshold set to τ. The frames having displacement values greater than τ are classified as frantic frames, and those with displacement values lower than τ are classified as normal frames. The percentage of frantic frames is then calculated using (8). The percentages of frantic frames in each of the video sequences are shown in Table 5.
In this research work, fish detection and counting and frantic behavior detection have been performed. Decent results were obtained from the YOLOv5 model in terms of accuracy, precision, recall, and F-measure for fish detection and counting.
Early detection of frantic movement in fish enables fish farmers to take precautionary measures promptly. The percentages of frantic frames in a video sequence gives a

\[ \text{Percentage of frantic frames} = \frac{\text{Number of frantic frames in the sequence}}{\text{Total number of frames in the sequence}} \times 100 \tag{8} \]
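Equation (8) can be evaluated with a few lines of code once a displacement value is available for every frame. The sketch below uses the reported global threshold τ = 0.0455. The per-frame displacement values and the function name are made up for illustration.

```python
from typing import Sequence


def frantic_percentage(displacements: Sequence[float], tau: float) -> float:
    """Percentage of frames whose displacement exceeds tau, as in (8)."""
    if not displacements:
        raise ValueError("the sequence must contain at least one frame")
    frantic_frames = sum(1 for d in displacements if d > tau)
    return 100.0 * frantic_frames / len(displacements)


TAU = 0.0455  # global threshold reported in this work

# Hypothetical per-frame displacement values for one short sequence.
sequence = [0.01, 0.05, 0.06, 0.02]
print(frantic_percentage(sequence, TAU))
```

A frame whose displacement is exactly equal to τ is treated as normal in this sketch, since only values greater than τ are counted as frantic.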

VII. CONCLUSION
Aquaculture has gained immense importance in recent times, as it produces a product for the marketplace and creates employment opportunities. It is a source of food as well as an extra source of income. This work has focused on the detection and counting of fish and on frantic movement detection in an aquaculture setup. The early detection of frantic behaviour patterns is essential to ensure that the fish in the aquaculture system are provided with a conducive environment for growth. In this work, fish detection and counting have been carried out using the YOLOv5 model. The model has managed to detect and count a good percentage of the fish in the frame. The detected fish have been tracked using optical flow, and the Otsu thresholding method has been used to find the global threshold. The threshold has then been used to find the percentage of frantic frames in the video sequence and thus classify the video sequences as normal or frantic.
KRITHIKA M. PAI is currently pursuing the bachelor's degree with the Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India. She has worked on various industry research projects and has published her work in international conference proceedings. She has also received an award for poster presentation.
He holds seven patents to his credit and has published more than 130 papers in national and international journals/conference proceedings. He has published two books and guided eight Ph.D.'s and 80 master's theses. His research interests include data analytics, cloud computing, the IoT, computer networks, mobile computing, scalable video coding, and robot motion planning. He is a Life Member of ISTE and of Systems Society of India.