Autonomous Detection and Deterrence of Pigeons on Buildings by Drones

Pigeons may transmit diseases to humans and cause damage to buildings, monuments, and other infrastructure. Several control strategies have therefore been developed, but they have been found to be either ineffective or harmful to animals, and they often depend on human operation. This study proposes a system capable of autonomously detecting and deterring pigeons on building roofs using a drone. The presence and position of pigeons were detected in real time by a neural network using images taken by a video camera located on the roof. Moreover, a drone was utilized to deter the animals. Field experiments were conducted in a real-world urban setting to assess the proposed system by comparing the number of animals and their stay durations over five days with the drone against a 21-day trial without the drone. During the five days of experiments, the drone was automatically deployed 55 times and significantly reduced the number of birds and their stay durations without causing any harm to them. In conclusion, this study demonstrates the effectiveness of this system in deterring birds, and the approach can be seen as a fully autonomous alternative to existing methods.


I. INTRODUCTION
The damage caused by birds includes direct and indirect effects on agricultural crops, livestock, and infrastructure, which have serious economic and human implications [1], as shown in Fig. 1. Specifically, feral pigeons (Columba livia domestica), also known as city doves, city pigeons, or street pigeons, are considered nuisances and pests because of their large amounts of excrement, which piles up on property and serves as a reservoir and vector of diseases. Among all birds, this persistent and invasive species is considered the most serious pest bird in terms of economic loss in the United States, with an annual damage estimate of $1.1 billion [2]. In addition, the acidic droppings of pigeons deteriorate and damage objects such as cars, valuable buildings, and cultural artifacts [3]. Pigeons also tend to gather in locations that make human intervention difficult, dangerous, or expensive (see Fig. 2).
The associate editor coordinating the review of this manuscript and approving it for publication was Khoa Luu.
Besides their economic impact, feral pigeons pose health risks through contact with their droppings [4]. Several studies have indicated that these droppings can be reservoirs of zoonotic pathogens, such as Chlamydia psittaci (causative agent of human psittacosis) and Salmonella [5]-[7]. Among all pigeon species, feral pigeons are responsible for the highest infection rates of zoonotic agents [5]. Further, the recent SARS-CoV-2 virus is believed to have reached humans via animals [8]-[10]. The aforementioned problems are further compounded by the steady increase in the feral pigeon population during the second half of the last century [11], [12].
There are several strategies to deter birds, including auditory deterrents, visual deterrents, physical barriers, and natural predation [13]. Auditory deterrents have been used to deter birds from crops. Although this strategy is applicable in farmland, it can cause disturbances when employed in urban settings because the hearing range of pigeons largely overlaps with that of humans. Pigeons can detect sounds at frequencies from as low as 0.05 Hz up to 11,000 Hz, whereas the human range is 20-20,000 Hz [14], [15]. On the other hand, visual deterrents, such as decoys, moving lights, and reflective items, are often deemed ineffective because pigeons can rapidly habituate to visual disturbance [16]. Meanwhile, physical barriers, including spikes, wires, nets, or gel repellents, are widely used against feral pigeons in urban environments due to their efficacy [17]. However, these barriers have high initial costs, degrade over time if not maintained [16], protect only the treated areas [18], and are harmful to animals. Finally, natural predation is one of the most effective methods for deterring birds, owing to its long-term efficacy without disturbing humans [19]. For example, falcons and raptors can be used to fend off and intimidate feral pigeons effectively [20]. However, this method can be harmful to birds and requires a falconer, and falconers are difficult to find and expensive to hire.
FIGURE (caption, beginning missing). This excrement can remain on buildings for long periods, causing damage. To remove it, expensive and dangerous cleaning must be performed (right). An automatic pigeon deterring system can provide a remedy.
The effectiveness of natural predators has inspired the use of artificial predators, such as drones. The use of drones to deter birds is not novel, with early studies having been reported more than a decade ago [21], [22]. Although drones are becoming more publicly available, related approaches rely heavily on human operators, which makes them expensive to deploy on a larger scale.
This study proposes an autonomous drone-based system capable of automatically detecting and deterring pigeons in urban environments without being harmful to birds (see Fig. 1). The proposed system detects pigeons using a specifically trained neural network that assesses images from a camera positioned on a vantage point in the environment to estimate the presence and positions of birds in the area in which the drone is deployed to fly. We assessed the efficacy of the system on the roof of the EPFL SwissTech Convention Center, which was reported to have large amounts of pigeon droppings that normally require continuous cleaning to prevent damage on the building (see Fig. 2). We started by analyzing the behavior of pigeons on the roof for 21 days without the interference of the drone. Then, we autonomously deployed the drone-based system over a period of five days. In both cases, we measured the time pigeons stayed on the roof. The results indicate that pigeons leave significantly earlier when the proposed system is in place.

II. RELATED WORK
To the best of our knowledge, Grimm et al. [23] were the first to document the application of a fixed-wing unmanned aerial vehicle (UAV) in a vineyard as pest control to protect crops. The drone took off and landed autonomously, but the flight path was predefined and independent of the presence of birds. Meanwhile, Vas et al. [24] tested the effects of drone color, speed, and flight angle on the behavioral responses of mallards, wild flamingos, and common greenshanks, which provided valuable insights, although the study was not aimed at deterring birds (specifically, pigeons).
The interactions of a drone and birds were analyzed by Wang et al. [25], who also developed a system for deterring birds in vineyards. In their most recent study, they performed manual flights to compare the efficacy of their system to other pest control strategies (netting and visual tactics) [17]. Moreover, they proposed solutions for autonomous bird detection and trajectory planning in their earlier studies [26], [27]. However, these modules have never been combined and deployed in real-world scenarios, and the deterring systems rely on loud bird distress calls, which are not applicable in urban environments.
In contrast, Paranjape et al. [28] proposed a method of guiding flocks of birds away from a protected area (e.g., an airport) to prevent them from landing. However, birds were detected by humans, and only manual flights were conducted to collect data. Recently, the problems and limitations associated with using UAVs to deter birds have gained attention in industry as well. Indeed, several commercial products (e.g., flapping-wing UAVs and multicopters) produced by companies such as The Drone Bird Company,² Bird-X,³ and Bask Aerospace⁴ are said to be effective drone-based bird deterrents. However, there are no available scientific evaluations of their performance. Furthermore, all these proposed solutions rely on human operators to detect birds and steer the UAV.
Overall, none of the aforementioned scholars proposed a fully autonomous system, which would require bird detection, position estimation, and drone deployment to the corresponding position, while also considering pigeons in urban environments.

A. SYSTEM DESIGN
The system consists of three hardware modules: a camera, a ground station, and a drone (Fig. 3 and Supplementary Video S1). Specifically, the ground station commands the camera to scan the environment and receives images. A neural object detector that was trained using pigeon images identifies the bounding boxes in an image in which pigeons are present. Then, the position of each bounding box in the two-dimensional image space is translated into three-dimensional global navigation satellite system (GNSS) coordinates. The drone is then instructed to take off and fly over the identified coordinates (i.e., the detected pigeons) before returning to its home base. Although this process does not require human intervention, national regulations still require an operator to authorize each automatic takeoff because of the possible presence of people on the ground.
The following subsections explain the constituent modules in more detail.
1) Camera: Multiple sensor modalities could be used to perceive the environment. Recently, the acoustic detection of birds was proposed [29]. However, this method was deemed unreliable given the possible high levels of environmental noise in urban scenes. Moreover, light detection and ranging (LiDAR) data have been used for object detection with high success rates according to Lang et al. [30]. In addition, camera-based approaches have been proposed as alternatives, which have similar performance to LiDAR-based approaches and apparent advantages in terms of hardware [31]. Therefore, our proposed system favors a solution that leverages computer vision.
² www.thedronebird.com
³ www.bird-x.com/bird-products/drones
⁴ www.baskaerospace.com.au/aerodrone/avian-scout
Currently, several UAVs use onboard cameras. However, state-of-the-art computer vision algorithms require dedicated hardware with adequate computational power, limiting the detection of target objects that are far away and therefore small. To address this limitation, the drone could actively move through the environment to search for pigeons. However, the drone's energy budget is crucial for the long-term success of this strategy, so flight time should be kept as short as possible. An autonomous drone recharging station would only mitigate this problem and would not allow drones to detect pigeons during recharging. A straightforward solution to this issue is the use of a camera system that is set up on the ground and scans the environment.
A simple monocular camera faces constraints similar to those of an onboard camera (i.e., small object detection and scale ambiguity). Installing multiple cameras would address this issue, but it is expensive, requires more complex installation, and involves handling an increased amount of data. Pan-tilt-zoom (PTZ) cameras can be a good compromise because their flexibility in orientation and zoom allows them to cover vast areas. Thus, the proposed system relies on a weather-resistant PTZ camera that is mounted at a fixed position in the environment and can oversee its full surroundings using a 360° pan. The combination of a 12× optical zoom and 4 MP resolution enables detailed representation of the environment.
2) Pigeon Detection: Next, pigeons were detected within these camera images. Some scholars have proposed detecting birds using traditional computer vision methods, such as analyzing the pixel changes between two consecutive images using features from accelerated segment test (FAST) [26] or detecting moving objects via background subtraction [32]. In recent years, learning-based methods have also been applied. Hong et al. [33] compared different object detectors for detecting birds from an aerial view and concluded that a Faster region-based convolutional neural network (R-CNN) [34] was the most accurate approach. Meanwhile, Bhusal et al. [35] proposed a detector model with pre-trained weights and only fine-tuned it with specifically collected images to counteract overfitting. In this study, we combined these two ideas. As the Faster R-CNN is still one of the best object detectors, we employed it to detect pigeons in the images generated by the camera [36]. Faster R-CNN is a two-stage object detector in which the first stage (a CNN) extracts features from the image and proposes image regions where objects are likely located. Thereafter, a second-stage neural network predicts the bounding boxes (object class and coordinates) in the proposed regions. Since the accuracy of object detectors correlates with the accuracy of the backbone (i.e., the first stage of the Faster R-CNN model) on the ImageNet ILSVRC 2012 classification dataset [36], we started from a Faster R-CNN that uses Inception-ResNet v2 as its backbone [37]. Inception-ResNet v2 is a 164-layer CNN that builds on the Inception family and incorporates residual connections to improve performance (Top-1 error of 19.9% on ILSVRC 2012).
Faster R-CNN is composed of two networks: a Region Proposal Network (RPN) and an Object Detection Network (ODN). The two networks share a common backbone, a convolutional neural network (CNN), which in this work is the Inception-ResNet v2 architecture [37], pretrained on ImageNet. The RPN predicts the regions where an object is likely to be (namely, the anchors) as a set of rectangular object proposals, each with an objectness score, which measures membership in the set of object classes versus background. For that, the RPN uses a small CNN on top of the feature maps of the shared backbone: an n × n convolutional layer (with n = 3) followed by two 1 × 1 convolutional layers, one for region regression and one for classification. The RPN is trained with a binary classification loss and a smooth-L1 loss for regression. The second stage is the Fast R-CNN model used for object detection, which takes the selected regions as input, through RoI pooling, to predict the bounding boxes and the final object class. Note that the training procedure and meta parameters follow those used in [36]. Table 1 summarizes the main meta parameters of the model.
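As an illustration of the two RPN loss terms named above, the following is a minimal sketch of a smooth-L1 regression loss and a binary cross-entropy objectness loss. It uses the standard textbook definitions; the transition point `beta`, the clipping constant `eps`, and the function names are assumptions made for this example and are not taken from the training configuration in [36].

```python
import math


def smooth_l1(pred, target, beta=1.0):
    """Mean smooth-L1 loss over paired box-regression values.

    Quadratic for small errors (|diff| < beta), linear for large ones.
    beta = 1.0 is an assumed default, not a value from the paper.
    """
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        if diff < beta:
            total += 0.5 * diff * diff / beta
        else:
            total += diff - 0.5 * beta
    return total / len(pred)


def binary_cross_entropy(prob, label, eps=1e-12):
    """Objectness loss for one anchor: label is 1 (object) or 0 (background)."""
    p = min(max(prob, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


if __name__ == "__main__":
    print(smooth_l1([0.5, 2.0], [0.0, 0.0]))
    print(binary_cross_entropy(0.9, 1))
```

The quadratic region keeps gradients small near the target, while the linear region limits the influence of badly mislocalized proposals.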
The resulting model is one of the best-performing object detectors on the MS-COCO dataset [38] (mean average precision, mAP, of 38.7) among those available in the TensorFlow Model Zoo (for the TensorFlow Object Detection API, the reader is referred to [36]). The model was pre-trained on the MS-COCO dataset [38] and fine-tuned with specifically collected images of pigeons in an urban environment. The camera was mounted on the roof of a building and collected over 30 h of video footage following a predefined routine of PTZ commands. We reduced the amount of data through random sampling and subsequent inspection of the images. Moreover, we allowed the model to distinguish between two different classes (i.e., pigeons and other) rather than having only one class. We labeled the false positive results, which mainly consisted of metal pieces on the roof, as other in the dataset. This process reduced the number of false positives in the pigeon class, improving the overall performance. Then, we balanced the number of images per class by performing image augmentation (e.g., random affine transforms, color-channel swaps, and noise addition) on the other class, which resulted in 2,539 images that were equally divided for final training and testing. Then, 10% of the training set was isolated for validation. As a baseline, we downloaded the model from the TensorFlow Model Zoo and tested it on our pigeon dataset without fine-tuning. The class "bird" of the Faster R-CNN model reached an AP of 0.38%. Therefore, we fine-tuned the model on the collected training set. The training procedure on the pigeon dataset lasted 40,000 steps and used two classes (pigeons, other), a fixed shape resizer with a target of 600 × 1024, a first-stage feature stride of 16 × 16, no dropout, a batch size of 12, and a learning rate of 2e-05. During runtime, the detector returned a bounding box for each detected object in an image. Fig. 4 shows the results of two examples.
3) Position Estimation: We converted the bounding box generated by the detector into GPS coordinates to send the drone closer to the pigeons. Recovering depth is an ill-posed problem for a monocular camera setup [39]-[41]. Early solutions leveraged visual cues in the image, such as texture or occlusions, whereas more recent alternatives rely on machine learning, which has recently yielded performance improvements (see [42]). However, these methods have two major limitations. First, both provide dense depth maps, which have higher computational costs than object-specific depths (some solutions to this issue have been proposed recently by learning the depth specifically from bounding boxes [43]). Second, extensive training on diverse datasets is needed for generalization to different environments [42], and most approaches have difficulty generalizing over different camera models [44]. The results of Vas et al. [24] suggest that drones deter pigeons at distances of several meters, which relaxes the required position estimation accuracy, as supported by our field experiments with a manually flown drone. We adopted a simpler approach by leveraging only the dimensions and position of the bounding boxes and assumed that the pigeon height is known and constant (see Fig. 5). Given the bounding box height as a percentage of the image height and a pinhole camera model, the distance between the camera and pigeon can then be calculated as
\[ p_z = \frac{f \, h_p}{h_{bb} \, h_s}, \]
where $p_z$ is the metric distance of the pigeon along the optical axis of the camera, $f$ is the focal length of the camera at the current zoom level, $h_{bb}$ is the relative bounding box height with reference to the image height, $h_p$ is the assumed metric pigeon height, and $h_s$ is the metric sensor height.
Similarly, given the bounding box position as a percentage of the image dimensions, the vertical and horizontal offsets of the pigeon with reference to the camera can be calculated using
\[ p_x = \frac{d_x \, w_s \, p_z}{f}, \qquad p_y = \frac{d_y \, h_s \, p_z}{f}, \]
where $p_x$ is the metric horizontal offset of the pigeon with respect to the rotated camera, $p_y$ is the metric vertical offset of the pigeon with reference to the rotated camera, $d_x$ and $d_y$ are the offsets of the bounding box with respect to the image center along the x- and y-axes, respectively, and $w_s$ denotes the metric sensor width, the horizontal counterpart of the sensor height $h_s$. Finally, the pigeon position can be converted from the image coordinates to the GPS coordinates using the known pose of the camera. We tested our system in a controlled environment where we knew the ground-truth position of a pigeon decoy through a motion capture system. These experiments enabled us to design a re-zooming procedure that improves the pigeon position estimate: after re-zooming, the bounding boxes in the video frame cover a larger part of the image, reducing the noise in the position estimation. This process is repeated until the bounding box exceeds a specific size or the zoom level reaches its maximum.
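The pinhole relations and the re-zooming loop described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the sensor width `sensor_width_m`, the `CameraState` container, the `detect_h_bb` callback, the size threshold `min_h_bb`, and all numeric camera values in the usage example are hypothetical. Only the assumed pigeon height of 0.25 m comes from the text (Section V-B).

```python
from dataclasses import dataclass


@dataclass
class CameraState:
    focal_length_m: float   # f at the current zoom level
    sensor_height_m: float  # h_s
    sensor_width_m: float   # assumed horizontal counterpart of h_s


def estimate_offsets(cam, h_bb, d_x, d_y, pigeon_height_m=0.25):
    """Return (p_x, p_y, p_z) in meters in the rotated camera frame.

    h_bb: bounding box height as a fraction of the image height.
    d_x, d_y: box-center offsets from the image center, as fractions
              of the image width and height.
    """
    if h_bb <= 0:
        raise ValueError("bounding box height must be positive")
    # Similar triangles: (h_bb * h_s) / f = h_p / p_z
    p_z = cam.focal_length_m * pigeon_height_m / (h_bb * cam.sensor_height_m)
    p_x = d_x * cam.sensor_width_m * p_z / cam.focal_length_m
    p_y = d_y * cam.sensor_height_m * p_z / cam.focal_length_m
    return p_x, p_y, p_z


def rezoom(detect_h_bb, zoom_levels, min_h_bb=0.2):
    """Increase zoom until the box is large enough or zoom is exhausted.

    detect_h_bb: hypothetical callback mapping a zoom level to the
                 relative bounding box height observed at that zoom.
    zoom_levels: non-empty, increasing sequence of zoom levels to try.
    """
    for zoom in zoom_levels:
        h_bb = detect_h_bb(zoom)
        if h_bb >= min_h_bb:
            break
    return zoom, h_bb


if __name__ == "__main__":
    cam = CameraState(focal_length_m=0.05, sensor_height_m=0.005,
                      sensor_width_m=0.008)
    print(estimate_offsets(cam, h_bb=0.1, d_x=0.2, d_y=-0.1))
```

Because the estimate divides by the relative box height, small boxes amplify pixel-level noise, which is why enlarging the box through re-zooming helps.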

4) Drone:
The GPS coordinates from the position estimation serve as GPS targets for the drone. We used an off-the-shelf Parrot Anafi drone that enables autonomous flights based on GPS targets. Anafi has a low sound level (65.5 dB at 1 m distance [45]) and relatively low weight (320 g), making it suitable for urban environment applications because nearby humans are less likely to be disturbed. Moreover, the development of algorithms to control the drone was sped up using the simulation software based on Gazebo provided by the vendor. We built a one-to-one simulation environment that enabled us to test the entire pipeline (pigeon detection, pigeon position estimation, and drone deployment) in the simulation.

B. FLOCK STAY TIME
This section explains the method used to evaluate the ability of the proposed system to chase away pigeons from the environment. If the system is successful, the pigeons leave earlier than they initially intended. Thus, the time that pigeons stay on the roof, denoted as the stay time, is a valid metric for evaluating the impact of the proposed solution.
We leverage the architecture proposed in Section III-A to analyze the behavior of pigeons. In our system, the camera records a video of the environment, and the detector is used to detect pigeons. Assigning the time each individual pigeon stays in an environment would be a time-intensive task. Although research in this direction is topical (see, e.g., [46]), we decided to follow a simpler approach and monitor the entire flock instead of each individual pigeon. Accordingly, we only considered the flock stay time to be the relevant information because (1) pigeons are gregarious especially during roosting [47] and (2) individual trackers do not perform well on large crowds. Indeed, the detection quality significantly affects the tracking, which drastically deteriorates on heavily occluded objects (see [48] for a survey on deep-learning-based multi-object tracking systems). In general, occlusions can be handled if they do not last long and if the tracked object is far from others. In the current environment, the frequent switching of IDs would make an accurate counting impossible due to the extreme proximity and numerous overlaps between different pigeons.
The method of calculating the flock stay time is described below and shown in Fig. 6. We define a flock as a group with more than a threshold $f_{th}$ of pigeons present and detected at the same time on the roof. Then, we consider a certain flock to be different from another if there is a period between the two groups of pigeons. Moreover, we filtered the pigeon count using a moving average on the number of pigeons with a time window of size $t_w$, as the count may suffer from some noise (e.g., pigeons occluding each other occasionally). Each time the moving average crosses $f_{th}$ from below, we consider that a flock has arrived on the roof. Accordingly, we assume that a flock leaves the roof each time the moving average crosses the threshold from above. The time between arrival and departure of the flock is the time the flock stays in the environment (i.e., the flock stay time).
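The threshold-crossing procedure described above can be sketched as follows. This is a minimal illustration, not the evaluation code used in the study: the choice of a trailing (rather than centered) moving average, the handling of a flock still present at the end of the recording (it is dropped), and the function name are assumptions. The example values for the threshold follow Section V-B, while the window is shortened for readability.

```python
def flock_stay_times(timestamps, counts, f_th, window):
    """Return a list of (arrival, departure) timestamps for each flock.

    timestamps: sample times in seconds, one per frame.
    counts:     detected pigeons per frame.
    f_th:       flock threshold on the filtered count.
    window:     moving-average length in samples (assumed trailing window).
    """
    stays = []
    arrival = None
    prev_above = False
    for i in range(len(counts)):
        start = max(0, i - window + 1)
        avg = sum(counts[start:i + 1]) / (i + 1 - start)
        above = avg > f_th
        if above and not prev_above:
            arrival = timestamps[i]                # upward crossing
        elif not above and prev_above:
            stays.append((arrival, timestamps[i]))  # downward crossing
            arrival = None
        prev_above = above
    return stays


if __name__ == "__main__":
    # Frames sampled every 2 s (0.5 fps); two-sample window for brevity.
    ts = [2 * k for k in range(10)]
    counts = [0, 0, 4, 4, 4, 4, 0, 0, 0, 0]
    for arrival, departure in flock_stay_times(ts, counts, f_th=2.5, window=2):
        print(arrival, departure, departure - arrival)
```

At 0.5 fps, a window of 20 s corresponds to `window=10` samples. Note that the filter delays both crossings by roughly the same amount, so the stay time is less affected than the individual arrival and departure times.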

IV. EXPERIMENTAL SETUP
Based on the metric introduced in Section III-B, we assessed the pigeon behavior with two experiments: one in which we observed the natural behavior of pigeons without the deterring system in place (this experiment served as a baseline) and another in which we observed the behavior of pigeons in the presence of the drone. We refer to the former as the without drone experiment and the latter as the with drone experiment. Both experiments covered a multitude of flocks and stay times.
We hypothesized that the interference of the drone would force flocks (of any size) to leave earlier than they preferred, resulting in a significant reduction of the flock stay time compared to the without drone experiment.

A. TEST ENVIRONMENT
The applicability of the system was evaluated on the roof of the SwissTech Convention Center, a building located in an urban area in Switzerland in which pigeons are spotted almost every day. Fig. 7 shows a satellite image of the building where the system was tested.

B. PIGEON HEATMAP
The horizontal ridge in the middle of the roof splits it into two halves: an upper half that is inclined toward the north and a lower half that is inclined towards the south. During initial observations, pigeons were reported to stay mostly in the southern half of the roof. Therefore, we focused on this part of the roof in our experiments.
Based on this decision, we performed a preliminary assessment to obtain a quantitative estimate of the pigeon activity in the observed environment. Towards this goal, we moved the camera according to a scanning routine of the southern part of the roof (see Fig. 7, orange cone). The pigeon detector detected pigeons in the images and estimated their positions using the pigeon position estimator for each PTZ value of the camera. Each of these positions is plotted as a light blue circle on the satellite image of the building in Fig. 7. The resulting heatmap enables the assessment of the pigeon distribution on the roof. We found a significant accumulation in the southeast corner of the roof. From this preliminary assessment, we elicited another assumption that simplified the evaluation of the proposed deterring system: we let the camera be fixed in one orientation to observe only the section of the roof that was the most affected by the pigeons (see Fig. 7, green cone). In addition, this assumption enables our system to be used as often as possible (i.e., we pointed the camera towards an area of the roof that often contains pigeons).

C. SYSTEM VALIDATION
We analyzed pigeon behavior in two different scenarios to assess the efficacy of the system for deterring pigeons: a natural scenario with no drone interference (the without drone experiment) and a scenario influenced by a deployed drone (the with drone experiment). As mentioned in Section III-B, we used the camera-detector combination to ensure automatic and repeatable evaluation of the proposed metrics.
For the without drone experiment, the video stream was recorded and stored, while the pigeon detector was run offline to count the pigeons over time and calculate the metrics described in Section III-B. In the with drone experiment, by contrast, the pigeon detector was run online and the pigeon deterring system was in place (see Fig. 3). The approximate positions of the drone takeoff location and human operator (required by national regulations) are shown in Fig. 8. The drone was deployed if the pigeon count surpassed the threshold $f_{th}$. It flew to the target position, hovered there for a certain amount of time to drive the pigeons away, and then returned to its starting position. Flights were not executed when weather conditions were too adverse.

A. PIGEON DETECTION
Precision-recall curves were used to evaluate the pigeon detector. As is common for object detectors, the chosen metric was the mean average precision (mAP) at an intersection over union (IoU) threshold of 0.5, as used in the Pascal VOC Challenge [49], [50]. Fig. 8 shows two plots, one for each class known to the detector, namely, pigeons and other (see Section III-A2). The average precision was 59.92% for the pigeon class and 79.77% for the other class. Consequently, the overall mAP was 69.84%. Fig. 4 shows two representative examples of the detector in action: although most pigeons are detected well, the detector sometimes struggles with occlusions. In the top part of Fig. 4, multiple partially occluded pigeons are enclosed in one bounding box. Meanwhile, in the lower part of the figure, pigeons standing very close together in the middle of the flock are not recognized.
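To make the metric concrete, the following sketch computes the IoU of two boxes and a single-class average precision at an IoU threshold of 0.5. It is a simplified illustration, not the evaluation code of the Pascal VOC devkit or of this study: it handles one image at a time, matches each detection greedily to the best still-unmatched ground-truth box, and integrates the monotone precision envelope over recall. The box format `(x_min, y_min, x_max, y_max)` and the function names are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_thr=0.5):
    """AP for one class; detections is a list of (score, box)."""
    if not ground_truth:
        return 0.0
    matched = set()
    tp_flags = []
    for _, box in sorted(detections, key=lambda d: -d[0]):
        best_j, best_iou = None, 0.0
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue
            value = iou(box, gt)
            if value > best_iou:
                best_j, best_iou = j, value
        if best_j is not None and best_iou >= iou_thr:
            matched.add(best_j)
            tp_flags.append(True)
        else:
            tp_flags.append(False)
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / k)
        recalls.append(tp / len(ground_truth))
    # Make precision non-increasing in recall (precision envelope).
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
    dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)),
            (0.7, (20, 20, 30, 31))]
    print(average_precision(dets, gt))
```

The mAP is then the mean of the per-class AP values, here over the pigeon and other classes.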

B. FLOCK STAY TIME
We extracted frames from the recorded video streams at 0.5 fps for both experiments (without drone and with drone) to calculate the stay times (see Section III-B). We chose this relatively low rate to constrain the amount of data to process without affecting the validity of the results (pigeons are not expected to change their behavior within sub-second intervals). Furthermore, in both cases, we chose a flock threshold $f_{th} = 2.5$ (i.e., we considered a group of pigeons a flock if there were more than two pigeons) and a window length $t_w = 20$ s for the moving average.
For the without drone experiment, the video data were continuously recorded for 21 days from October to November 2020 from 7 am to 6 pm. This recording resulted in approximately 840,000 detections and 2,327 flock stay times. Meanwhile, for the with drone experiment, the proposed system was deployed for five days from December 2020 to January 2021. The pigeon positions were estimated based on an assumed pigeon height of 0.25 m. To deter the pigeons, the drone flew to the position of the pigeons at a speed of 5 m/s and hovered at that position before returning to its starting position. The hovering times varied between 5 and 15 s. All flights were executed during restricted time intervals between 9 am and 11 am and between 1 pm and 5 pm on the same day. These restrictions were imposed on the flight time because the building was located in an urban area. We included all flock stay times measured while the drone and operator were on the roof (185 in total). Specifically, the drone was deployed 55 times (day 1: 8 times, day 2: 20 times, day 3: 8 times, day 4: 15 times, and day 5: 4 times) for these 185 flock stay times.
The results of the two experiments (without drone and with drone) were compared using survival analysis. This term comprises various statistical procedures with the common goal of analyzing the time until an event occurs [51]. Examples of events are death in biological organisms or failure in engineering systems. In our case, we chose the flock departure as the event; that is, we analyzed the time until the flocks left. Flock stay times below 10 s were filtered out, as they were regarded as noise. The survival curves (see Fig. 9) were estimated using the Kaplan-Meier estimator. These curves plot the probability Ŝ(t) that a flock is still roosting at any given time t. A log-rank hypothesis test suggested that the drone significantly impacted the stay time (p < 0.001). These tools are suitable for unequal sample sizes [51], which makes them valid metrics for the conducted experiments. Note that, contrary to what the term survival analysis suggests, no pigeons were killed or harmed in the experiments.
Table 2 lists the runtime measurements for the different parts of the proposed system. More precisely, the proposed pipeline was divided into five categories, the first four of which are included under the time until takeoff. For each category, we present the mean and standard deviation. These data were measured during the with drone experiment (i.e., 55 full cycles). Overall, the pipeline has an average total duration of 115.96 s to complete a full cycle.
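The survival analysis used in this subsection can be sketched with the standard library alone. The following is a minimal illustration of the Kaplan-Meier estimator and a two-sample log-rank test, not the code used to produce Fig. 9. The source does not state which software was used. The log-rank function here assumes that every flock departure is observed (no censoring), and all durations in the usage example are made up for illustration.

```python
import math


def kaplan_meier(durations, observed=None):
    """Return [(t, S(t))] at each distinct event time.

    observed[i] is False for a censored duration; all True by default.
    """
    if observed is None:
        observed = [True] * len(durations)
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = removed = 0
        while i < len(data) and data[i][0] == t:
            events += data[i][1]
            removed += 1
            i += 1
        if events:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def log_rank(d_a, d_b):
    """Log-rank chi-square statistic and p-value (1 degree of freedom).

    Assumes fully observed durations in both groups (no censoring).
    """
    o_minus_e = 0.0
    variance = 0.0
    for t in sorted(set(d_a) | set(d_b)):
        n_a = sum(1 for d in d_a if d >= t)   # at risk in group A
        n_b = sum(1 for d in d_b if d >= t)   # at risk in group B
        e_a = sum(1 for d in d_a if d == t)   # events in group A
        e_b = sum(1 for d in d_b if d == t)
        n, e = n_a + n_b, e_a + e_b
        o_minus_e += e_a - e * n_a / n
        if n > 1:
            variance += e * (n_a / n) * (n_b / n) * (n - e) / (n - 1)
    chi2 = o_minus_e ** 2 / variance if variance > 0 else 0.0
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    without_drone = [15, 40, 120, 600, 3000]   # illustrative values only
    with_drone = [12, 20, 35, 60, 90]
    print(kaplan_meier(with_drone))
    print(log_rank(with_drone, without_drone))
```

Neither procedure requires equal group sizes, which matches the setting here (2,327 versus 185 flock stay times before filtering).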

VI. DISCUSSION
The survival curves in Fig. 9 show the distribution of the flock stay times for both experiments. The maximum flock stay time is considerably larger in the without drone experiment (8848 s) than in the with drone experiment (290 s), indicating that pigeons tend to roost for up to multiple hours if the deterring system is not in place. Moreover, the survival probability starting at 30 s decreases more rapidly in the with drone experiment than in the without drone experiment. This difference is corroborated by the log-rank analysis, given that the underlying null hypothesis of identical survival curves [51] is rejected with a small p-value. This result suggests that, overall, pigeons leave earlier when the proposed deterring system is in place, supporting the efficacy of the system.
Regarding the lower end of the stay time spectrum, it is important to note that the runtime of the system plays a crucial role. Table 2 lists the duration of each phase of the proposed system. The average time until the drone took off was 68.11 s, whereas the minimum was 29.84 s. Therefore, with the current state of the system, flock stay times shorter than 29.84 s cannot be addressed, as shown in Fig. 9. However, we believe that with specific optimizations, our system could address shorter stay times. First, no user confirmation will be requested in fully autonomous experiments, reducing the minimum time until takeoff by an average of 21.75 s (see Table 2). Moreover, further studies are required to develop the pigeon detection process. In addition, false negatives, as indicated in Fig. 4, could partially affect the calculation of the flock stay time because of the erroneously reduced flock sizes. In the worst case scenario, a flock could mistakenly drop below the flock size threshold, preventing the drone from being deployed. Therefore, the detector can be re-trained with more clearly labeled images of such cases to provide better detection in crowded regions. Moreover, more recent detector models can be considered in future studies.
Although not quantified as specific results in Section V, several interesting observations regarding the interactions of pigeons and the drone were made during the experiments. First, the distance at which pigeons perceive the drone as a threat is highly variable and may be related to the number of pigeons. Whereas larger flocks were often scared simply by the takeoff (which happened at a distance of 40-60 m from the pigeons), smaller groups of birds often let the drone come as close as a few meters. Second, the duration for which the drone stays in the target region is an important tuning parameter. Some pigeons attempted to return almost immediately but were repelled by the hovering drone.
As a next step, additional experiments should be conducted to enable a more in-depth evaluation of the system, including testing in other urban environments with a moving camera and considering the long-term behavior of pigeons. Although the deterring effect shown in Section V addresses short-term behaviors only, the drone might also have long-term effects on the pigeons (e.g., a flight could affect their behavior on the same day or the succeeding ones). Such effects must be analyzed over time to check if they are consistent or if pigeons become habituated to drones, as is the case for other well-known deterring systems. Moreover, because of group behavior, the effects may be more prominent if only larger flocks are intimidated by the drones. In this case, the efficiency of the system could be enhanced by simply increasing the flock size threshold, reducing the number of flights while maintaining the same performance. However, to examine such assumptions, it would be helpful to collaborate with zoologists. In addition, the amount of data gathered for training the object detector should be increased to cover more edge cases (e.g., pigeons occluded by humans working on the roof) and specifically degraded visual environment scenarios (e.g., rain, fog, hail, snow). Finally, even though the number of flight tests is in line with the related works (Section II), it is only sufficient for a proof of concept. More experiments are needed to make our solution more robust and possibly scale it to a real-world continuous deployment.

VII. CONCLUSION
Urbanization drastically changes the environments and behaviors of animals, causing long-term effects on these animals that are yet to be understood fully. Among them, pigeons have largely adapted to living in urban areas. However, a qualitative analysis of their behavior has not yet been fully carried out. UAVs are also emerging in urban areas, and their increasing presence will likely produce further changes in animal behavior. Our study is the first to demonstrate the effects of UAVs on the behavior of pigeons in urban areas.
To the best of our knowledge, this is the first drone-based system to deter pigeons fully autonomously. Our approach could reduce damage to buildings and decrease the transmission of diseases spread by pigeons. In addition, the gathered data could be used to understand the complex relationship between pigeons and drones.
Future work will include more sophisticated studies on pigeon behavior to evaluate the long-term effectiveness of our solution, including determining whether pigeons become habituated to our system. From this perspective, we believe that our drone-based system has the potential to deter birds because it can be actively reprogrammed to prevent habituation. Another interesting addition to this work could be a systematic evaluation of the system performance in degraded visual environment scenarios (e.g., fog, rain, hail, snow). It is also important to point out that our drone did not have any collision avoidance feature because the platform offered limited possibilities for customization. A possible solution to this problem would be to use a more customizable platform [52]. Finally, it would be interesting to investigate whether the efficiency of the system could be substantially improved by leveraging knowledge about the behaviors and interactions of pigeons.
FABRIZIO SCHIANO received the bachelor's and master's degrees (cum laude) in automation engineering from the University of Napoli Federico II, Italy, in 2010 and 2013, respectively, and the Ph.D. degree in robotics from the University of Rennes
DARIO FLOREANO (Senior Member, IEEE) received the M.A. degree in vision, the M.S. degree in neural computation, and the Ph.D. degree in robotics. Since 2010, he has been the Founding Director at the Swiss National Center of Competence in Robotics, a research program that brings together more than 20 labs across Switzerland. He has held research positions at Sony Computer Science Laboratory, Caltech/JPL, and Harvard University. He is currently the Director with the Laboratory of Intelligent Systems, EPFL. He made pioneering contributions to the fields of evolutionary robotics, aerial robotics, and soft robotics. He served in numerous advisory boards and committees, including the Future and Emerging Technologies Division of the European Commission, World Economic Forum Agenda Council, International Society of Artificial Life, International Neural Network Society, and editorial committees of several scientific journals. In addition, he helped start two drone companies (senseFly.com and Flyability.com) and a non-for-profit portal on robotics and AI (RoboHub.org). His main research interests include robotics and artificial intelligence (AI) at the convergence of biology and engineering.