Deep Learning Assisted Fixed Wireless Access Network Coverage Planning

Wireless network coverage planning is crucial for mobile network operators and fixed wireless network providers to estimate the performance of their networks and plan future antenna mast deployments. To generate accurate coverage maps for target buildings, traditional wireless coverage planning tools either require manual input of Customer-Premises Equipment (CPE) antenna locations or need to compute the received signal strength from nearby Access Points (APs) to all geolocations in the area of interest, which consumes computational resources unnecessarily. In this paper we propose a Deep Learning (DL) based universal enhancement to wireless coverage planning tools which automatically extracts potential CPE antenna locations from aerial images of the target buildings. We evaluate the performance of the pixel-level object detection provided by a Mask Region-based Convolutional Neural Network (Mask R-CNN) trained on an image dataset of suburban and rural residential properties across North Yorkshire, UK. We also demonstrate a complete task flow that generates informative building coverage reports by combining the DL based building detection with the WISDM industrial wireless coverage planning system.


I. INTRODUCTION
Terrestrial telecommunications networks are now widely used and serve the vast majority of the world's population. One important factor which significantly affects the end-user experience is the last-mile technology [1], which is, in most cases, the speed bottleneck of the communication network and results in high Capital Expenditure (CAPEX) and Operating Expenses (OPEX) for the network operators. The traditional last-mile solution is broadband over copper wire (e.g., ADSL, ADSL+ and VDSL which is also known as FTTC) [2], which relies on the existing landline telephone networks and has limitations on speed. Currently the common last-mile technologies are fiber to the cabinet (FTTC) [3] and fiber to the premises (FTTP) [3], which use optical fiber to deliver Internet services to customers' premises directly. FTTP offers much higher bandwidth and speed than traditional copper wire. However, FTTP requires investment in new infrastructure, which network operators may not consider cost-effective in less populated rural areas. Even in countries with advanced infrastructure, fiber connections are not available to all premises. For example, according to Connected Nations 2020 [4] published by Ofcom, 96% of homes in the UK have access to fast broadband but only 18% have full fiber connections.
The last-mile technology which fills the gaps in the fiber networks is Fixed Wireless Access (FWA) [5]. FWA uses both dedicated fixed networks and shared mobile networks (e.g., 4G and 5G networks) to deliver Internet services to end users with lower infrastructure costs. Ofcom has estimated that 95% of UK homes have access to an FWA service from at least one of the Mobile Network Operators (MNOs). With the rollout of 5G [6], current 5G FWA [7] devices can achieve 150 Mbps [4] on 5G New Radio (5G-NR) bands below 7 GHz. The future deployment of 5G-NR in the Millimeter Wave (mmWave) spectrum [8] will allow FWA to achieve multi-Gbps speeds [9], which are comparable to superfast fiber networks.
Coverage planning is critical for MNOs and FWA providers to balance costs and profits before building new masts, and to estimate the Quality of Service (QoS) of the existing wireless access points (APs). It is common to use terrain and surface elevation data together with different propagation models to build a point-to-point wireless signal path profile and to estimate the received signal strength [10]. The wireless coverage can then be estimated by repeatedly computing the path profile for different combinations of APs and customer locations. There are several commercially available coverage planning tools such as Google Network Planner [11], CHIRplus_TC [12] (from LStelcom), PROGIRA plan [13], Network planning [14] (from Cellular Expert), cnHeat [15] (from Cambium Networks), and CellNetwork [16] (from CelPlan). From the demonstrations of these tools, we find that to generate the coverage map they commonly either require manual input of the Customer-Premises Equipment (CPE) antenna locations or compute the path profiles from the APs to all Geographical Information System (GIS) [17] ''pixels'' within the area of interest. The size of the pixels usually varies depending on the availability of the data; for example, the LIDAR Composite DSM 2017 [18] provides the terrain elevation model of England with spatial resolutions (pixel sizes) between 25 cm and 2 meters. The former method requires human involvement, which in most cases becomes the bottleneck of the system. The latter method consumes a significant amount of computational resources, particularly at high spatial resolutions, and the coverage map needs to be generated again whenever the APs change. In this paper, we present the idea of a Deep Learning (DL) [19] based universal enhancement to these planning tools that uses computing resources more effectively without requiring human involvement.

A. MOTIVATIONS
Wireless signal propagation is sensitive to the locations and elevations of the AP and CPE antennas as well as the obstacles along the signal path [20]. It becomes even more sensitive as the signal moves to higher frequencies with narrower antenna beams. The signal propagation environment can change drastically after moving the CPE antenna just a couple of meters. For example, moving the antenna from one side of a pitched roof to the other could make the difference between a line-of-sight (LoS) and a non-line-of-sight (NLoS) path, which largely determines the QoS.
To accurately capture the signal coverage on customer premises, a coverage planning tool should be able to identify the potential locations for CPE antennas where acceptable received signal strength is anticipated. As far as we are aware, this is yet to be addressed in coverage planning tools. In practice, considering the safety and time constraints of the field engineers who install the CPEs, roof edges are common locations for mounting the antennas, because they are fast and easy to access without the risk of damaging the tiles while still providing elevation. The other common antenna mounting point is the top of a chimney, which is usually the highest point of a property. However, relatively modern properties often do not have a chimney. Inspired by these observations, we propose using DL based computer vision methods to identify property roof edges as potential antenna mounting points and to feed the locations to coverage planning tools to generate detailed signal path profiles.

B. RELATED WORK
Artificial Intelligence (AI) has advanced rapidly in the past few years and, as a major subset of the AI technologies, DL is already widely used in many real-life applications such as object classification and detection, speech recognition and language translation [21]. The capability of Deep Neural Networks (DNNs) [22] and Convolutional Neural Networks (CNNs) [23] has resulted in many powerful tools for image processing tasks which would be difficult to complete with traditional computer vision methods. The performance of object detection has been improving rapidly with the development of Region-based CNN (R-CNN) [24]. The later subbranches Faster R-CNN [25] and Mask R-CNN [26] have brought semantic segmentation and instance segmentation to object detection tasks for images and videos [27]. Readily available large image datasets such as PASCAL VOC [28], Microsoft COCO [29], and ImageNet [30] make it possible to train DNNs to identify most commonly seen objects. Pretrained DNN models such as AlexNet [31], ResNet [32] and GoogLeNet [33] allow application developers to use them as backbones and to adapt them quickly to different types of object detection and classification tasks. For example, pretrained GoogLeNets and ResNets are used as the backbones to automatically diagnose skin lesions to prevent the further development of cutaneous cancer due to melanoma [34], [35]. UNet [36] is used as the basis of a rapid diagnostic tool to identify COVID-19 from chest CT images [37]. A pretrained ResNet-101 backbone is used to count potato and lettuce plants in Unmanned Aerial Vehicle (UAV) imagery [38]. Faster R-CNN with the same ResNet-101 backbone is also used to identify working industrial chimneys in remote sensing images [39].
Variants of CNNs are also widely used for building extraction and segmentation from aerial and satellite images. The authors of [40] have proposed four different CNN architectures to map buildings across the continental United States using the 1-meter resolution aerial images from the National Agriculture Imagery Program. In [41] the authors have prepared a high-resolution dataset with buildings labeled across a 450 km² area in New Zealand and proposed several Fully Convolutional Network (FCN) models to extract building footprints. Building boundary regularization is proposed in [42] to refine the building footprint predictions from Mask R-CNN. Faster Edge Region CNN (FER-CNN) is proposed in [43] to improve the building detection results, particularly for buildings with irregular shapes. The work in [44] has presented a bounding box rotation method for Mask R-CNN to improve the precision of building extraction from Google Earth images. The work in [45] has applied Framed Field Learning to UNet architectures to tackle the issue of footprint predictions for buildings with irregular shapes, particularly for buildings with holes inside the footprints. The reviewed works are summarized in Table 1.

C. CONTRIBUTIONS
In this paper we apply state-of-the-art DL based computer vision methods to help wireless network coverage planning tools identify potential CPE antenna mounting points, thereby producing accurate and detailed coverage estimation reports. The main contributions of our work are threefold:
1) Our work tackles common issues of current wireless coverage planning tools, including the requirement for manual input of CPE antenna locations and the inefficient use of computational resources when generating areal coverage maps, which also makes the tools slow to adapt to network changes. The DL based computer vision methods automatically locate potential CPE antenna mounting points and significantly reduce the number of path profile computations required to generate accurate coverage estimation maps.
2) Our work is made generally compatible with wireless network coverage planning tools which are capable of computing point-to-point path profiles. The only input our work requires is either the address or the latitude and longitude coordinates of the target building which needs a coverage report. The map tools we use to acquire static aerial images of the target building are freely accessible to the public.
3) Our work is open-ended. Later in the paper we show the effectiveness of the DL based building detection after training with a relatively small dataset. By replacing or extending the dataset with images of buildings in different styles from the target regions, the building detection can be adapted for use worldwide. We have made our work accessible so that readers can use our trained building detection directly or train it with customized datasets according to their applications.
The rest of the paper is organized as follows. Section II introduces the coverage planning tool, the DL based approach and the image dataset we prepare for the proposed coverage planning task. Section III presents the details of training the DNN and the results of building detection and wireless coverage planning. Section IV discusses the applicability and scalability of the proposed work for wider applications, as well as potential improvements to the results. Section V concludes the paper and proposes future research directions.

II. METHODS AND MATERIALS
This section describes the methods and materials we use to identify potential CPE antenna mounting points for a property.

A. WIRELESS COVERAGE PLANNING TOOL
In this paper we use WISDM, developed by Wireless Coverage Ltd [46], to generate point-to-point path profiles to evaluate the propagation between the APs and the potential CPE antenna mounting points. WISDM is an industrial coverage modelling system which is able to visualize the network coverage in real time, supporting the spectrum from 2 GHz to 120 GHz. Fig. 1 shows an example of the coverage computed by WISDM for a test AP (the red push pin marker in the center) we deploy at a site near the Hammerton railway station, about 15 km away from the York city center, UK. For consistency and simplicity, all test APs we use in this paper operate at a center frequency of 5 GHz with a 20 MHz bandwidth. The Effective Isotropic Radiated Power (EIRP) is 36 dBm and all AP and CPE antennas are isotropic. The height of the APs is set to 20 m above the ground, which is generally acceptable in the test suburban and rural areas in North Yorkshire with low-rise buildings. The height of the CPEs is set to 8 m above the ground or the surface elevation, whichever is greater. In practice the parameters of the APs and CPEs need to be configured according to the local spectrum regulations and the licenses held by the network operators for accurate results. Different antenna patterns can be used as input to WISDM to estimate the coverage more accurately. In Fig. 1 the small red and green dots are the properties obtained from the Ordnance Survey AddressBase [47] service, covering all addresses within a 10 km radius centered on the AP. The green dots indicate properties that have clear LoS and a received signal power above the minimum required level, and the red dots indicate properties where the signal power is below a specified target or there is no clear LoS. Fig. 2 shows a zoomed map of an area near the AP in Fig. 1 so that more details can be viewed.
The small red and green dots are the latitude and longitude coordinates of the centroid points of the addressed properties returned from Ordnance Survey AddressBase. WISDM computes the path profile between the AP and the property centroid points based on the Bluesky digital terrain model (DTM) [48] (5 m resolution) and the Bluesky digital surface (also known as clutter) model (DSM) [48] (2 m resolution, covering elevated objects such as buildings and trees).
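As a rough illustration of what a single point-to-point estimate involves, the sketch below computes the received power for the test configuration above (36 dBm EIRP, 5 GHz, isotropic CPE antenna) under free-space propagation. Free-space path loss is an assumption made here for illustration only; the propagation models applied by WISDM are not detailed in this paper, and the function names and the -80 dBm threshold are hypothetical.

```python
import math


def free_space_path_loss_db(distance_km: float, freq_mhz: float) -> float:
    """Free-space path loss in dB (distance in km, frequency in MHz)."""
    if distance_km <= 0 or freq_mhz <= 0:
        raise ValueError("distance and frequency must be positive")
    return 20 * math.log10(distance_km) + 20 * math.log10(freq_mhz) + 32.44


def received_power_dbm(eirp_dbm: float, distance_km: float, freq_mhz: float,
                       rx_gain_dbi: float = 0.0) -> float:
    """Received power for a given EIRP; 0 dBi models an isotropic CPE antenna."""
    return eirp_dbm + rx_gain_dbi - free_space_path_loss_db(distance_km, freq_mhz)


# Hypothetical minimum required level, not a value taken from the paper.
MIN_REQUIRED_DBM = -80.0

if __name__ == "__main__":
    for distance_km in (0.4, 2.0, 10.0):
        power = received_power_dbm(36.0, distance_km, 5000.0)
        status = "pass" if power >= MIN_REQUIRED_DBM else "fail"
        print(f"{distance_km:5.1f} km: {power:7.2f} dBm ({status})")
```

A full path profile additionally accounts for terrain and clutter along the path, which free-space loss ignores.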
The left-hand side of the path profile is the AP and the right-hand side is the CPE. In the path profile in Fig. 2 we can see an elevated obstacle near the CPE violating the Fresnel zone clearance [49] and blocking the RF LoS path. From the map we can see that this obstacle is likely to be the building about 20 m to the east of the target property (there is a small green dot just under the character ''0'' of the 0.4 km tick, overlaid by the path profile graph). In Fig. 3 the CPE antenna is moved a couple of meters to the south edge of the same building (blue push pin marker), and the path profile indicates that the RF propagation is free from obstacles, so this property can be covered by the AP.

B. OVERALL TASK FLOW
Fig. 4 presents the overall task flow, from the customer input to a detailed coverage report indicating potential CPE antenna mounting locations. The customer here could be a network operator who is planning to set up a new mast or a property owner who is looking to purchase a fixed wireless CPE. The information required from the customer is the address or the latitude and longitude coordinates of the target property. WISDM is able to find the latitude and longitude coordinates of the property if the address is given. The latitude and longitude coordinates are used as input to the Microsoft Bing Maps API [50] to download a 600 × 600 pixel static map at zoom level 20 (roughly a 90 × 90 m area with building-level details). Static maps from other sources should be applicable as long as the resolution is similar. We select Microsoft Bing Maps because the roofs of the buildings are adjusted to match the building footprints. Other static maps (e.g., Google Maps [51]) could have some aerial images taken from tilted angles (rather than zero nadir angle), in which case looking up the elevation data at the latitude and longitude coordinates of the building edges could return unexpected results (e.g., a roof edge with a street-level elevation).
Mask R-CNN is used to extract the target building from the static map, and Canny Edge Detection [52] is applied to the predicted mask to extract the edge pixels. A subset of the edge pixels is selected and converted to latitude and longitude coordinates using the Web Mercator projection [53]. WISDM then uses the latitude and longitude coordinates to generate point-to-point path profiles to evaluate the coverage from nearby APs to the target property. Under the Web Mercator projection, equations (1) and (2) convert latitude and longitude coordinates to global pixel coordinates, where x and y are the global pixel coordinates, z is the zoom level (20 in this paper), and λ and ϕ are the longitude and latitude. Equations (3) and (4) perform the reverse computation, recovering the latitude and longitude coordinates from x and y.
The final outputs are binary path profile results (pass or failure) from all APs within a 10 km range of the target property in JSON format [54], and indicative map images visually showing the path profile results for all APs. For example, in Fig. 4 the map image at the end of the task flow indicates that there is no coverage (red dots) from the selected AP at the southwest side of the property, while the other sides have good coverage (green dots). Alternatively, more markers can be used to indicate different levels of received signal power, thereby making the result images more informative. We leave this for readers to elaborate according to their requirements.
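For reference, the Web Mercator conversions referred to as equations (1) to (4) are commonly written as follows. This is the standard form of the projection, assuming 256 × 256 pixel map tiles and λ and ϕ expressed in radians; the numbering is matched to the references in the text as an assumption.

```latex
\begin{align}
x &= \frac{256}{2\pi}\, 2^{z} \left(\lambda + \pi\right) \tag{1}\\
y &= \frac{256}{2\pi}\, 2^{z} \left(\pi - \ln\!\left[\tan\!\left(\frac{\pi}{4} + \frac{\phi}{2}\right)\right]\right) \tag{2}\\
\lambda &= \frac{2\pi x}{256 \cdot 2^{z}} - \pi \tag{3}\\
\phi &= 2\arctan\!\left[\exp\!\left(\pi - \frac{2\pi y}{256 \cdot 2^{z}}\right)\right] - \frac{\pi}{2} \tag{4}
\end{align}
```

Equations (3) and (4) follow from (1) and (2) by solving for λ and ϕ, so a round trip through both pairs returns the original coordinates.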

C. MASK R-CNN
Instance segmentation is required to identify the target building in an aerial image, particularly when the target building is visually connected to other nearby buildings (or other irrelevant structures such as sheds and storage units). To differentiate object instances of the same class, Mask R-CNN is developed from Faster R-CNN, which is designed for bounding-box object detection. Faster R-CNN implements a two-stage object detection [25]: it first produces candidate bounding boxes using the Region Proposal Network (RPN), then extracts features from the bounding boxes and conducts object classification. Mask R-CNN [26] shares the same two-stage framework but outputs an additional binary mask for each Region of Interest (RoI) at the second stage (Fig. 5). RoIAlign [26] is introduced to remove the RoI misalignment due to the quantization errors caused by RoIPool [55], thereby achieving pixel-level accuracy.
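The difference between the two RoI operations can be illustrated on a toy feature map. The sketch below is a simplification, not the Mask R-CNN implementation: `quantized_sample` snaps a fractional coordinate to an integer cell (the kind of quantization RoIPool introduces), while `bilinear_sample` interpolates at the exact coordinate, which is the idea behind RoIAlign. Both function names and the feature map values are hypothetical.

```python
import math
from typing import Sequence

FeatureMap = Sequence[Sequence[float]]


def quantized_sample(fmap: FeatureMap, x: float, y: float) -> float:
    """Read the cell obtained by flooring the coordinate (quantization)."""
    height, width = len(fmap), len(fmap[0])
    col = min(max(int(math.floor(x)), 0), width - 1)
    row = min(max(int(math.floor(y)), 0), height - 1)
    return fmap[row][col]


def bilinear_sample(fmap: FeatureMap, x: float, y: float) -> float:
    """Interpolate between the four cells surrounding the exact coordinate."""
    height, width = len(fmap), len(fmap[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    dx, dy = x - x0, y - y0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


if __name__ == "__main__":
    toy_map = [[0.0, 10.0],
               [20.0, 30.0]]
    # The same fractional location gives different values under each scheme.
    print(quantized_sample(toy_map, 0.5, 0.5))
    print(bilinear_sample(toy_map, 0.5, 0.5))
```

Because the interpolated value changes smoothly with the coordinate, small shifts of an RoI are reflected in the sampled features instead of being rounded away.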

D. IMAGE DATASET
We have prepared a dataset of 200 aerial images to fine-tune pretrained Mask R-CNN models to extract building edges. All images are 600 × 600 pixels, obtained using the Google Maps API [51] at zoom level 20 to cover the details of buildings. The center latitude and longitude coordinates are always within the target building which requires a wireless coverage report. The dataset covers aerial images of suburban and rural residential buildings in North Yorkshire taken from both nadir and tilted angles, with various shapes and roof tile types. Fig. 6 shows some samples from the dataset, with buildings labeled using the open-source annotation tool LabelMe [56]. Within the dataset, 150 images are used as the training set and the remaining 50 images are used as the evaluation set.
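Polygon labels of this kind are typically turned into per-instance training targets. The sketch below derives a bounding box and a pixel area from a polygon annotation, assuming the usual LabelMe JSON layout (a `shapes` list whose entries carry `label`, `points` and `shape_type`). The label name `building`, the helper names and the example polygon are hypothetical and not taken from the dataset.

```python
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]


def polygon_bbox(points: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned bounding box (xmin, ymin, xmax, ymax) of a polygon."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def polygon_area(points: Sequence[Point]) -> float:
    """Polygon area in square pixels using the shoelace formula."""
    twice_area = 0.0
    count = len(points)
    for i in range(count):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % count]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def building_targets(annotation: Dict, label: str = "building") -> List[Dict]:
    """Collect a bounding box and area for each polygon carrying the given label."""
    targets = []
    for shape in annotation.get("shapes", []):
        if shape.get("label") != label:
            continue
        if shape.get("shape_type", "polygon") != "polygon":
            continue
        points = shape["points"]
        targets.append({"bbox": polygon_bbox(points), "area": polygon_area(points)})
    return targets


if __name__ == "__main__":
    # Hypothetical annotation for one 600 x 600 image.
    example = {
        "imageWidth": 600,
        "imageHeight": 600,
        "shapes": [{
            "label": "building",
            "shape_type": "polygon",
            "points": [[100, 120], [300, 120], [300, 280], [100, 280]],
        }],
    }
    print(building_targets(example))
```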

III. IMPLEMENTATION DETAILS AND RESULTS
This section presents the implementation details of the tasks in Fig. 4 and compares the Mask R-CNN detection performance using different pretrained backbone networks.

A. IMPLEMENTATION DETAILS
The tasks in Fig. 4 are implemented in Python 3.8, with Mask R-CNN training and inference using the open-source machine learning library PyTorch [57]. We have fine-tuned several Mask R-CNNs with different backbones, including ResNet34, ResNet50 and ResNet101 backbones pretrained on ImageNet, a DeepLabv3-ResNet101 backbone pretrained on a subset of COCO, and a ResNet50 backbone pretrained on COCO. The optimizer is Stochastic Gradient Descent (SGD) with the initial learning rate set to 0.005, momentum set to 0.9 and weight decay set to 0.0005. The learning rate decays every 3 epochs with gamma set to 0.1. During inference we limit the number of detections per image to 5 (5 output masks per image), and the first mask that covers the center pixel is selected as the detection of the target building. All the tasks are implemented on a PC with a GeForce RTX 2070 GPU with 8 GB of memory.
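Two of the settings above can be sketched without the training framework: the step decay of the learning rate and the rule for picking the target building among the output masks. This is an illustrative sketch under stated assumptions, not the authors' code. Masks are assumed to be binary 2D arrays listed in the order the detector returns them, the center of a 600 × 600 image is taken as pixel (300, 300), and the function names are hypothetical.

```python
from typing import Optional, Sequence

Mask = Sequence[Sequence[int]]


def step_decay_lr(epoch: int, initial_lr: float = 0.005,
                  step_size: int = 3, gamma: float = 0.1) -> float:
    """Learning rate after step decay: multiplied by gamma every step_size epochs."""
    return initial_lr * gamma ** (epoch // step_size)


def select_target_mask(masks: Sequence[Mask], center_x: int = 300,
                       center_y: int = 300) -> Optional[int]:
    """Index of the first mask that covers the center pixel, or None if none does."""
    for index, mask in enumerate(masks):
        if mask[center_y][center_x]:
            return index
    return None


if __name__ == "__main__":
    for epoch in range(0, 9, 3):
        print(epoch, step_decay_lr(epoch))

    # Tiny 3 x 3 masks stand in for 600 x 600 predictions; the center is (1, 1).
    neighbour = [[1, 1, 0], [0, 0, 0], [0, 0, 0]]
    target = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
    print(select_target_mask([neighbour, target], center_x=1, center_y=1))
```

Returning `None` when no mask covers the center lets the caller decide how to handle images where the target building is not detected.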

B. TARGET BUILDING DETECTION RESULTS
Standard COCO metrics are used to evaluate the detection performance: Average Precision (AP), which is the mean of the APs at 10 Intersection over Union (IoU) thresholds from 0.5 to 0.95; AP_50 and AP_75, which are the APs at the 0.5 and 0.75 IoU thresholds; AP_M, which is the AP for medium objects with area greater than 32² pixels but less than 96² pixels; and AP_L, which is the AP for large objects with area greater than 96² pixels. The image dataset we prepared for training does not include any small objects with area less than 32² pixels, so AP_S is not applicable. AP^m and AP^bb denote the AP of masks and bounding boxes respectively. Table 2 and Table 3 show the metrics of bounding boxes and masks respectively. From the results we can see that increasing the depth of the backbone improves the detection performance, but the dataset on which the backbone is pretrained has a significant impact as well. The best performer is ResNet50 pretrained on COCO, which has the best performance metrics across the board. Fig. 7 shows some visual results of the fine-tuned Mask R-CNN with the ResNet50 backbone pretrained on COCO. This Mask R-CNN is able to separate the target buildings (the buildings in the center of each image), including those with irregular shapes, and their edges from other buildings and structures nearby. The inference time of the Mask R-CNN with the ResNet50 backbone is about 210 ms on the GeForce RTX 2070 GPU, so rapid operation can be expected.
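The building blocks of these metrics can be sketched in a few lines: the IoU of two boxes, the 10 IoU thresholds, and the object size categories. The sketch below covers only these ingredients, with hypothetical function names; computing AP itself additionally requires ranking detections by score and integrating the precision-recall curve over the whole evaluation set, which is not shown.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

# The 10 IoU thresholds 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS: List[float] = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def passed_thresholds(iou: float) -> List[float]:
    """Thresholds at which a detection with this IoU counts as a match."""
    return [t for t in IOU_THRESHOLDS if iou >= t]


def size_category(area_px: float) -> str:
    """Object size category by area in square pixels."""
    if area_px < 32 ** 2:
        return "small"
    if area_px < 96 ** 2:
        return "medium"
    return "large"


if __name__ == "__main__":
    predicted = (0.0, 0.0, 10.0, 10.0)
    labeled = (5.0, 0.0, 15.0, 10.0)
    iou = box_iou(predicted, labeled)
    print(iou, passed_thresholds(iou), size_category(32000.0))
```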

C. COVERAGE PLANNING RESULTS
In the previous subsection the edge pixels of the target building are obtained from the Mask R-CNN detection. Using equations (3) and (4) we then convert a subset of the edge pixels to latitude and longitude coordinates and feed them to WISDM (or any other coverage planning tool) to generate point-to-point path profiles. For demonstration we randomly select 50 edge pixels for path profiling, which is normally sufficient to generate an informative coverage map for a private residential property.
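The sampling and conversion step can be sketched as follows, using the standard Web Mercator formulas. The sketch assumes 256-pixel map tiles and a 600 × 600 static image centered on the requested coordinate, so that an image pixel is offset from the center by (px - 300, py - 300) in global pixel coordinates. The function names and the example coordinate are hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

TILE_SIZE = 256  # assumption: standard Web Mercator tile size in pixels


def latlon_to_global_pixel(lat_deg: float, lon_deg: float, zoom: int) -> Tuple[float, float]:
    """Latitude and longitude in degrees to global pixel coordinates."""
    scale = TILE_SIZE * 2 ** zoom / (2 * math.pi)
    lam, phi = math.radians(lon_deg), math.radians(lat_deg)
    x = scale * (lam + math.pi)
    y = scale * (math.pi - math.log(math.tan(math.pi / 4 + phi / 2)))
    return x, y


def global_pixel_to_latlon(x: float, y: float, zoom: int) -> Tuple[float, float]:
    """Global pixel coordinates back to (latitude, longitude) in degrees."""
    scale = TILE_SIZE * 2 ** zoom / (2 * math.pi)
    lam = x / scale - math.pi
    phi = 2 * math.atan(math.exp(math.pi - y / scale)) - math.pi / 2
    return math.degrees(phi), math.degrees(lam)


def edge_pixels_to_latlon(edge_pixels: Sequence[Tuple[int, int]],
                          center_lat: float, center_lon: float,
                          zoom: int = 20, image_size: int = 600,
                          n_samples: int = 50,
                          seed: Optional[int] = None) -> List[Tuple[float, float]]:
    """Randomly sample edge pixels (px, py) and convert them to (lat, lon)."""
    rng = random.Random(seed)
    pixels = list(edge_pixels)
    chosen = rng.sample(pixels, min(n_samples, len(pixels)))
    center_x, center_y = latlon_to_global_pixel(center_lat, center_lon, zoom)
    half = image_size / 2
    return [global_pixel_to_latlon(center_x + (px - half), center_y + (py - half), zoom)
            for px, py in chosen]


if __name__ == "__main__":
    # Hypothetical square outline around the image center.
    outline = [(250 + i, 250) for i in range(100)] + [(250 + i, 350) for i in range(100)]
    for lat, lon in edge_pixels_to_latlon(outline, 53.96, -1.08, n_samples=5, seed=0):
        print(f"{lat:.6f}, {lon:.6f}")
```

Passing a fixed `seed` makes the selected points reproducible, which is convenient when comparing coverage maps after changes to the APs.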
To demonstrate the coverage planning results, in WISDM we deployed three APs (Access Points; note that italic AP denotes Average Precision) several kilometers northwest of the York city center, North Yorkshire, UK (Fig. 8). The parameters of the APs and CPEs are described earlier in subsection II.A. In Fig. 8 WISDM labels the three APs as TestSite 2, TestSite 3 and TestSite 4; for convenience we rename them AP1, AP2 and AP3. We select 6 properties in this area to generate coverage maps for the property edges. For each property, we generate one coverage map for each AP and one coverage map combining the results from all APs. Fig. 8 also shows example path profiles from selected APs to particular property edge points on the maps. For example, the closest AP to property 1 is AP3, which has the best coverage of all APs. The path profile shows a potentially large terrain obstacle (a small ''hill'') in the middle of the path; however, thanks to the height of the AP and CPE, the LoS path of the signal stays above that hill. The closest AP to property 2 is AP1, which however does not show good coverage. The path profile shows the signal from AP1 to the south corner of the property, and there are two tall obstacles (potentially trees) near the CPE blocking the LoS path. The closest AP to property 3 is AP3, which does not provide any coverage. The path profile shows the signal from AP3 to the north side of the property, and there are two obstacles blocking the LoS path, which could be the buildings to the north of that property (visible in the coverage map). The closest AP to property 4 is AP2, which does not have full coverage at the west side of the property. From the path profile we can see a tall obstacle very close to the CPE, which is the tree to the west side of the property (visible in the coverage map). AP3 has barely any coverage of property 5, and the path profile shows that the west corner of the property is just able to achieve a LoS path. Property 6 has no coverage at all from AP2, and the path profile shows several obstacles blocking the LoS path from AP2 to the west side of the property.
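The pass or fail outcomes described above come down to whether any surface point along the path intrudes into the space the signal needs. A minimal sketch of such a check is given below, assuming a straight LoS line between the antennas, no Earth curvature correction, and a clearance of 60% of the first Fresnel zone radius, which is a common rule of thumb and not a value stated in this paper. How WISDM evaluates its path profiles is not detailed here, and the function names and example profiles are hypothetical.

```python
import math
from typing import Sequence, Tuple

SPEED_OF_LIGHT_M_S = 299_792_458.0


def first_fresnel_radius_m(d1_m: float, d2_m: float, freq_hz: float) -> float:
    """Radius of the first Fresnel zone at distances d1 and d2 from the two ends."""
    wavelength = SPEED_OF_LIGHT_M_S / freq_hz
    return math.sqrt(wavelength * d1_m * d2_m / (d1_m + d2_m))


def path_is_clear(profile: Sequence[Tuple[float, float]],
                  ap_height_m: float = 20.0, cpe_height_m: float = 8.0,
                  freq_hz: float = 5e9, clearance_fraction: float = 0.6) -> bool:
    """Check a path profile for obstruction.

    profile holds (distance_from_ap_m, surface_elevation_m) samples; the first
    sample is the AP site and the last sample is the CPE site.
    """
    total = profile[-1][0]
    ap_elevation = profile[0][1] + ap_height_m
    cpe_elevation = profile[-1][1] + cpe_height_m
    for distance, surface in profile[1:-1]:
        # Height of the straight line between the two antennas at this distance.
        los_height = ap_elevation + (cpe_elevation - ap_elevation) * distance / total
        required = clearance_fraction * first_fresnel_radius_m(distance, total - distance, freq_hz)
        if surface > los_height - required:
            return False
    return True


if __name__ == "__main__":
    flat = [(0.0, 10.0), (500.0, 10.0), (1000.0, 10.0)]
    hill = [(0.0, 10.0), (500.0, 30.0), (1000.0, 10.0)]
    print(path_is_clear(flat), path_is_clear(hill))
```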

IV. DISCUSSIONS
The image dataset we prepared to fine-tune the Mask R-CNN is relatively small and covers private residential properties in rural and suburban North Yorkshire, UK. To make the Mask R-CNN detection applicable to aerial images taken worldwide, the dataset needs to be expanded to include samples from the regions of interest to the customer. This contributes not only to the variety of the target buildings but also to the variety of the background. For example, our dataset mainly includes backgrounds with ground vehicles and green vegetation, which may affect the detection results when the input images have backgrounds with boats and water. Properties with different roof structures such as open terraces, solar panel installations, swimming pools and common areas can also be included to improve the detection accuracy. To allow the Mask R-CNN to detect buildings other than private residential properties more confidently, the dataset needs to include other building types such as warehouses, farms, theaters, schools and churches, again depending on the customer's requirements. Aerial images with different resolutions and zoom levels can be included to cover target buildings of different sizes and styles. We have made our code available so that readers can directly use the Mask R-CNN trained on our dataset for building detection applications or train new Mask R-CNNs with customized datasets. Table 2 and Table 3 show that the accuracy of the detection can be affected by the depth of the backbone and the dataset the backbone is pretrained on. If computational resources allow, we recommend pretraining deep backbones on different large datasets for the best accuracy.

V. CONCLUSION
In this paper we propose a DL based method to augment the test points in coverage planning tools to improve their utility and aid CPE installation. Conventionally, coverage planning tools either require manual input of CPE locations to generate path profiles or ''spam'' path profile computations from APs to all GIS ''pixels''. Our idea improves the capability of the coverage planning tools to accurately estimate the coverage in great detail while maintaining efficient use of computational resources. The pixel-level building detection from Mask R-CNN is able to greatly reduce the number of path profile computations required to generate coverage maps without human involvement (there is no need to compute path profiles for locations where mounting the CPE antenna is difficult or impossible), thereby allowing the coverage maps to be rapidly regenerated when there are changes to the APs. We have demonstrated the performance of the building detection and the entire task flow: the only input required to produce the coverage maps is either the address or the latitude and longitude coordinates of the target building. We have made our work readily available for readers to adapt to their applications. In the future we will expand the aerial image dataset to allow the Mask R-CNN to capture more features of a variety of buildings worldwide, thereby making the proposed approach applicable to users in other countries.

APPENDIX
This appendix provides a list of acronyms (and their full terms) used in the paper.
College Dublin, Dublin, Ireland. Since then, he has been working at different academic and industrial positions in the Republic of Ireland and U.K. He has published more than 50 peer-reviewed book chapters, journals, and conference papers. His current research interests include design, analysis, and optimization of wireless communications networks, the application of machine learning in wireless networks, airborne networks, wireless network virtualization, blockchain, the Internet of Things, cognitive radio networks, and small cell and self-organizing networks. He is a member of the Editorial Board of IEEE ACCESS, Frontiers in Blockchain, and Wireless Networks (Springer). He is a fellow of U.K. Higher Education Academy, the Networks Working Group Co-Chair, and a Management Committee Member of COST Action 15104 (IRACON).
DAVID GRACE (Senior Member, IEEE) received the Ph.D. degree from the University of York, in 1999, with the subject of his thesis being 'Distributed Dynamic Channel Assignment for the Wireless Environment.' Since 1994, he has been a member of the Department of Electronic Engineering, University of York, where he is currently a Professor (research) and the Head of Communication Technologies Research Group. He is also the Co-Director of the York-Zhejiang Lab on Cognitive Radio and Green Communications and a Guest Professor at Zhejiang University. His current research interests include aerial platform-based communications, cognitive green radio, particularly applying distributed artificial intelligence to resource and topology management to improve overall energy efficiency, 5G system architectures, dynamic spectrum access, and interference management. He is a Lead Investigator on H2020 MCSA 5G-AURA and H2020 MCSA SPOTLIGHT. He was one of the lead investigators on FP7 ABSOLUTE and focused on extending LTE-A for emergency/temporary events through application of cognitive techniques. He was a Technical Lead on the 14-partner FP6 CAPANINA Project that dealt with broadband communications from high altitude platforms. In 2000, he jointly founded SkyLARC Technologies Ltd., and was one of its directors. He is an author of over 220 papers and the author/editor of two books.
DAVID BURNS has worked for over thirty years in the technology industry. In 2010, he was the Founder of Boundless. He was the Founder of Wireless Coverage Ltd., which provides ground-breaking wireless planning and modeling software for WISPs and 5G stakeholders. He is currently the Chairman of U.K. Wireless Internet Service Providers Association which represents the industry nationally.