Application of Convolutional Neural Networks With Object-Based Image Analysis for Land Cover and Land Use Mapping in Coastal Areas: A Case Study in Ain Témouchent, Algeria

Land use and land cover (LULC) information is a fundamental component of environmental research relating to urban planning, agricultural sustainability, and natural hazard assessment. Remote sensing technology has demonstrated a powerful capacity for LULC modeling, supported by a growing number and variety of sensors. Here, a convolutional neural network (CNN) deep learning model was developed in combination with object-based image analysis (OBIA) to map LULC in the Ain Témouchent coastal area, western Algeria, using Sentinel-2 and Pléiades imagery. First, the CNN model was constructed from convolution, hidden, and max pooling layers, and the parameters of the CNN architecture were optimized. Then, based on the high-level features extracted by the CNN, OBIA was applied to classify the segmented objects and detect the LULC features. Furthermore, machine learning methods, including random forest and support vector machines, were tested for comparison. The proposed method achieved a high overall accuracy (93.5%) using Pléiades imagery, revealing significant improvements compared to the other machine learning techniques. Accordingly, it was concluded that the proposed method is useful for LULC detection and can be applied at larger scales in coastal areas. The derived maps can also inform regional and national-level decision making.

monitoring and management. Indeed, LULC data support several environmental applications, including urban planning, agricultural sustainability, and natural hazard assessments in coastal areas. Further, frequently updated LULC information at fine spatial scales is necessary for achieving various sustainable development goals [1]. In particular, coastal areas are important for their strategic geographic location and natural ecosystems. Accordingly, LULC data in coastal cities are increasingly useful for monitoring human interference, such as increasing agricultural encroachment and urban expansion correlated with demographic growth. Over the past few decades, greater consideration has been given to remote sensing imagery applications for LULC detection [1]. Several satellites (e.g., Landsat, Sentinel, and SPOT) have been launched to monitor urban development, forests, agriculture, and natural hazards [2]. Moreover, high and very high spatial resolution (VHSR) remote sensing images are increasingly being used in LULC mapping analyses based on classification concepts using machine learning methods [3]. In recent decades, machine learning methods have been applied to remote sensing LULC classification tasks [4], [5], in both pixel-based and object-based image analysis (OBIA) methods [6], [7], particularly random forest (RF) [8]-[10], support vector machine (SVM) [11], [12], and artificial neural networks [13], [14]. As one of the most critical elements of image classification, the OBIA method is capable of identifying interspersed geographic features and objects [15]. Under OBIA, objects are extracted via segmentation processes considering spectral, textural, and contextual information of similar pixels [16]. Recently, OBIA has been extensively applied to remote sensing assessments of LULC mapping, especially in coastal areas [17], [18]. Li et al. [19] investigated the performance of remote sensing data and machine learning methods when assessing anthropogenic LULC expansion in the coastal zone of Liaoning province, China. There, OBIA was used to perform LULC classification on Landsat TM/ETM+/OLI images from 1990 to 2014, and showed the potential to monitor anthropogenic LULC changes over the analysis period (as indicated by its good overall accuracy; OA). Consequently, even in coastal zones with low elevation, OBIA has been adopted for the accurate detection of LULC. Nandam and Patel [20] employed a hybrid method based on an SVM and spectral features to map LULC in the city of Surat, situated on the western coast of Gujarat, India, using Landsat 5-TM, 7-ETM+, and 8-OLI/TIRS series imagery. In addition to the SVM learning algorithm chosen to perform the classification, a set of spectral indices was extracted from the satellite images to improve classification accuracy, including the normalized difference vegetation index and the modified normalized difference water index (MNDWI). The SVM classifier was also compared to RF to identify the more effective algorithm for LULC classification in the coastal area; although both algorithms were statistically significant, accuracy assessments showed that the SVM classifier was superior. Furthermore, the results of some spectral indices with SVM (e.g., MNDWI) were validated across other testing sites (OA value ≤92%), showing that the proposed approach can be successfully implemented for LULC mapping of coastal urban plains. However, despite the success of OBIA at accurately addressing LULC, the method remains relatively limited owing to several classification uncertainties related to irregular objects obtained by segmentation [21].
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Additionally, OBIA accuracy can be compromised when a large variety of LULC types is present, especially in urban areas [22], resulting in inadequate feature extraction. Furthermore, OBIA based on machine learning classifiers using hand-designed features, or on a binary classifier, typically does not consider deep-level feature extraction [23]. Deep learning models, as a subset of machine learning methods, are designed to resolve various image processing tasks [24], and their integration in remote sensing brings increased adaptability in object representation, with high levels of feature extraction from imagery data. Deep learning can increase the quantity of information extracted, thereby improving classification results for particular LULC tasks [25], [26]. Among deep learning algorithms, convolutional neural networks (CNNs) [27], [28] have been widely used in many classification tasks, particularly in LULC modeling and change analyses [26], [29].
CNNs employ stacked convolution kernels to learn spectral and spatial information, thus improving identification of high level abstract features. Nevertheless, conventional CNN methods are characterized by a large number of layers, incurring large computational costs [30]. In addition, CNN classification methods are often performed at the pixel-level; thus, the extracted features can be confused due to the mixed spatial distribution of LULC types and spectral mixing [31]. Alternatively, OBIA methods utilize homogenous multi-pixel sets to classify objects; thus, it could be optimal to integrate CNN models with OBIA when performing the classification of segmented objects. This advanced method has been tested under various LULC mapping applications [32], coastal LULC change monitoring [33], and cropland classifications [34]. Furthermore, this integrated method has been shown to be capable of powerfully extracting high level image features, effectively defining LULC type boundaries, and increasing classification accuracy.
Here, the primary objective of the study was to map LULC in the Ain Témouchent coastal area of western Algeria using an OBIA-CNN method. The main remote sensing data employed were Pléiades VHSR images (2 m resolution) and Sentinel-2A data (10 m). The final LULC classifications were assessed in terms of OA. The final maps will be a useful tool for supporting regional and national decision making in and around the study area. Additionally, the following subobjectives were considered.
1) Employing a simple CNN model with the fewest possible layers, integrated in eCognition software, to limit computational demand.
2) Optimizing CNN hyperparameters to improve classification accuracy.
3) Comparing the proposed method with machine learning methods (RF and SVM algorithms).
4) Evaluating the contribution of each dataset to the final LULC classification accuracy.

A. Study Area
The area of interest is situated in northwest Algeria, at the crossroads between three major cities: Oran, Sidi Bel Abbès, and Tlemcen (see Fig. 1). The area includes the Mediterranean coastal region of Ain Témouchent and the city center; moreover, it incorporates the Sennane watershed, which crosses the urban city (total area of 84 km²), with its control point downstream of the city. The city is surrounded by mountainous areas with an average altitude of ∼500 m. Influenced by the Mediterranean climate, the Ain Témouchent region is characterized by a warm summer and temperate winter. Additionally, the winds from the northwest and southeast bring little moisture to the area, as they cross the Moroccan reliefs from the south. The study area is characterized by heterogeneous LULC due to its confinement to a narrow valley, while being surrounded by vineyards and agriculture on a highly fertile basaltic soil. The dominant LULC categories in the region are built-up areas, forests, and agricultural land, with the latter two located primarily in the rural area. The region is characterized by its high agricultural production and vineyard activity, accounting for 25% of national production. These agricultural activities are accompanied not only by the service sector, in line with the region's geographical location, but by significant university community growth as well. The vegetation cover consists of forest massifs, but has been replaced by mountain farms in several places. The city is distinguished by its layout and French-style architecture. Today, the city continues to experience substantial urban development to the detriment of farmland and vineyards. Ain Témouchent is also characterized by rapid population growth. The estimated population of the city in 2014 was ∼97,812, with a growth rate of +1.38% yr⁻¹ from 1987 to 1998, and +2.52% from 1998 to 2008 according to the National Statistics Office.
Accordingly, the diversity of LULC categories in the study area provides the opportunity to evaluate the ability of the proposed method to extract LULC objects. Furthermore, the town of Ain Témouchent is highly exposed to flood risks [32], which requires up-to-date LULC information, particularly in flood-prone areas. Hence, the final LULC maps will be exploited as part of the strategy of local and national authorities to plan flood-prone zones and combat illegal construction where the flood risk is high.

B. Remote Sensing Data and Preprocessing
Two different data sources were acquired to produce LULC maps. First, Pléiades VHSR imagery is derived from a dual optical satellite constellation, launched in 2011 (Pléiades 1A) and 2012 (Pléiades 1B), designed for earth observation and available to order. Pléiades 1A and Pléiades 1B were launched on a Soyuz ST from Europe's spaceport in Kourou, French Guiana, on December 17, 2011 and December 2, 2012, respectively. Two Pléiades datasets were required to cover the entire study area, and were acquired from the 1A platform on October 17, 2020. All images were obtained under cloud-free conditions (0% cloud cover), and the data included panchromatic images and four multispectral channels (red, green, blue, and near-infrared, NIR), at a spatial resolution of 2 m. The acquisition properties of Pléiades are given in Table I.
The second image was derived from Sentinel-2A data, acquired freely through the Sentinel hub ([Online]. Available: https://scihub.copernicus.eu/). The Sentinel-2 image was acquired on the same date (October 17, 2020), allowing for a comparison of the results obtained from both images. These images have 13 spectral bands. The high spatial, spectral, and temporal resolution of the Sentinel satellites is appropriate for LULC monitoring programs, given their high revisit frequency (ten days for a single Sentinel-2 satellite and five days for the combined constellation). Table II gives additional acquisition properties of Sentinel-2 imagery.
Pléiades images were mosaicked to obtain a single image, whereas Sentinel-2A data were already geometrically corrected at the time of acquisition. Sentinel-2 bands with a spatial resolution other than 10 m were resampled to 10 m using the Sentinel Application Platform (SNAP v.5.0), as the classification process requires input images of the same size. An overview of the Pléiades and Sentinel-2A imagery in the Ain Témouchent study area is shown in Fig. 2.
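As a rough illustration of bringing a coarser band onto the 10 m grid, the sketch below uses nearest-neighbour upsampling with NumPy. The interpolation method actually used in SNAP is not stated in the text, so nearest-neighbour is an assumption, and the function name `upsample_nearest` and the sample values are hypothetical.

```python
import numpy as np

def upsample_nearest(band: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour upsampling: each pixel becomes a factor x factor block."""
    return np.repeat(np.repeat(band, factor, axis=0), factor, axis=1)

# Hypothetical 20 m band (2 x 2 pixels) brought onto the 10 m grid (4 x 4 pixels).
band_20m = np.array([[0.10, 0.20],
                     [0.30, 0.40]])
band_10m = upsample_nearest(band_20m, factor=2)
```

Under the same assumption, a 60 m band would use `factor=6`, so that every band ends up with the same pixel grid as the native 10 m bands.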

C. Methods
The workflow of the proposed experiments consisted of the following steps.
1) Training and test sample generation and spectral feature extraction.
2) LULC classification via the proposed integrated method of CNN deep modeling with OBIA.
3) LULC classification using pixel-based and OBIA methods (RF and SVM were the machine learning classifiers employed).
4) Accuracy assessment of LULC maps.
Fig. 3 shows the flowchart of the developed methods.

1) Sample Generation for Training and Validation
Classification Process: LULC categories were identified using visual analysis and interpretation of the Pléiades VHSR image, producing ten predominant classes: forests, cultivated land, greenhouses, built-up areas, barren land, fallow land, uncultivated land, roads, stadiums, and water. Since the spatial resolution of the Sentinel-2A image is lower than that of the Pléiades data, six corresponding LULC categories were identified: water, cultivated land, uncultivated land, barren land, built-up area, and forests.
The generated samples were split into two categories: training and test samples. Two vector datasets were produced using QGIS (Quantum Geographic Information System, version 3.16). The training vector was applied during the classification process, whereas the test vector was used in both accuracy assessments.
2) Classification Process:
a) Classification of the CNN deep model integrated with the OBIA approach:
1) CNN architecture: A CNN is a deep learning model designed for image classification and inspired by the architecture of biological multilayer neural networks, which allows for the construction of high-level semantic features from low-level features [31], [35]. A representative CNN architecture consists of sequential layers (e.g., convolutional, pooling, and fully connected layers) and interconnected output layers using nonlinear operations [23]. Two important characteristics are considered in any CNN architecture: local connectivity, which simplifies the CNN by limiting the number of connected neurons, and shared weights, which reduce the number of model parameters by using the same connection weights for different neurons in a given layer [30]. Through the convolutional layers, the CNN model extracts features based on multiple convolutional operations on an input image, transforming a local receptive field of the input data into a pixel of the next layer. Furthermore, the pooling layer, which merges similar features into one, is important in any CNN model because it reduces feature map dimensions [36], [37]. Average and max pooling are the most commonly applied pooling layers in CNNs. Additionally, each CNN layer is produced by small sample patches of a certain size scanned across the input image to capture different feature characteristics.
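To make the roles of the convolution and pooling layers concrete, the following minimal NumPy sketch applies one shared 3 × 3 kernel across a small single-band image and then max-pools the resulting feature map. It is an illustration only, not the eCognition implementation; the function names, the averaging kernel, and the 6 × 6 test image are hypothetical, and no padding with stride 1 is assumed.

```python
import numpy as np

def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide one shared kernel over the image (no padding, stride 1).

    As in most CNN libraries, the kernel is not flipped (cross-correlation).
    Each output pixel depends only on a local receptive field of the input.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map: np.ndarray, size: int = 2) -> np.ndarray:
    """Keep only the maximum response in each non-overlapping size x size block."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)  # hypothetical single-band patch
kernel = np.ones((3, 3)) / 9.0                    # same weights at every position
features = conv2d_valid(image, kernel)            # 6 x 6 -> 4 x 4 feature map
pooled = max_pool(features, size=2)               # 4 x 4 -> 2 x 2
```

The same nine kernel weights are reused at every position (shared weights), and pooling halves each spatial dimension while keeping the strongest local response.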
In CNN model design, it is essential to find an architecture capable of meeting the research needs. Because the CNN process for deriving output layers is constructed across several stages producing a set of feature maps [38], training a CNN model allows the optimal combination of model parameters to be identified. Thus, the optimization of CNN hyperparameters (e.g., sample patch size, hidden layers, and learning rate) is an essential step for obtaining a well-performing model.
Here, the CNN architecture was created in Trimble eCognition Developer v. 10. The main layers implemented in the CNN structural design were the hidden, convolution, pooling, and fully connected layers, and the process consisted of three main steps: creation of sample patches, generation and training of the model, and model application. eCognition is advantageous for its ability to integrate CNN classifications with OBIA. The detailed CNN architectures using Pléiades and Sentinel-2A input data are shown in Figs. 4 and 5, respectively. Further, two optimal CNN models were adopted, one each for the Pléiades and Sentinel-2A input imagery, in accordance with previous studies [31], [35], [39].

3) Labeled sample patch generation for the CNN deep model:
Labeled sample patches were generated from the entire input image. Sample patch size is considered one of the most critical parameters in an optimal CNN architecture [40]; thus, different sizes were considered for both images: 8 × 8, 10 × 10, 16 × 16, 20 × 20, 32 × 32, and 64 × 64 pixels. Through a cross validation method, a sample patch size of 16 × 16 was selected for Pléiades data, and 32 × 32 for Sentinel-2A data. Further, sample count and image bands are required parameters that should be reviewed; thus, all spectral bands were used, and a set of 10 000 labeled sample patches was generated for each model in both images.
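The patch generation step can be sketched as follows. This is a simplified stand-in for the eCognition procedure, under the assumption that each patch takes the label of its center pixel; the function `extract_patches`, the random image, and the random label raster are hypothetical, and only 100 patches are drawn here instead of the 10 000 used in the study.

```python
import numpy as np

def extract_patches(image, labels, patch_size, n_samples, seed=0):
    """Draw random square patches from a (bands, H, W) image.

    Assumption: each patch is labeled with the class of its center pixel,
    taken from a (H, W) class raster.
    """
    rng = np.random.default_rng(seed)
    _, height, width = image.shape
    # Top-left corners chosen so that every patch lies fully inside the image.
    rows = rng.integers(0, height - patch_size + 1, size=n_samples)
    cols = rng.integers(0, width - patch_size + 1, size=n_samples)
    half = patch_size // 2
    patches = np.stack([image[:, r:r + patch_size, c:c + patch_size]
                        for r, c in zip(rows, cols)])
    patch_labels = np.array([labels[r + half, c + half]
                             for r, c in zip(rows, cols)])
    return patches, patch_labels

# Hypothetical 4-band image and 10-class label raster.
image = np.random.default_rng(1).random((4, 64, 64))
labels = np.random.default_rng(2).integers(0, 10, size=(64, 64))
patches, patch_labels = extract_patches(image, labels, patch_size=16, n_samples=100)
```

Changing `patch_size` to each of the tested sizes and retraining is one way the cross validation over patch sizes described above could be organized.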

4) Creating and training the CNN deep model:
Based on the integrated CNN creation algorithms in eCognition, the model was derived using all spectral bands as input and generated the LULC classes as output. The number of hidden layers, feature maps, kernel sizes, and max pooling layers are user-defined parameters; thus, for Pléiades data, two hidden layers were built for the CNN model after cross validation and assessment of the CNN output accuracy. Max pooling with an even window size was applied in this study, with the goal of decreasing the number of units by preserving only the maximum response of multiple units in the hidden layer [41]. Similarly, after cross validation, a convolution was implemented for each layer, with a kernel size of 3 × 3 for the first hidden layer and 5 × 5 for the second. In the case of the Sentinel-2A image, however, only one hidden layer was created, with a convolution layer (kernel size of 7 × 7) and max pooling, based on the CNN accuracy results.
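The effect of these kernel and pooling choices on feature map size can be traced with simple arithmetic. The sketch below assumes unpadded convolutions with stride 1 and a 2 × 2 pooling window after every hidden layer; the text only states that the pooling size was even, so these settings and the helper names are assumptions.

```python
def conv_out(size: int, kernel: int, stride: int = 1) -> int:
    """Output width of an unpadded convolution."""
    return (size - kernel) // stride + 1

def pool_out(size: int, pool: int) -> int:
    """Output width of non-overlapping max pooling."""
    return size // pool

def trace_sizes(patch: int, layers):
    """Return the feature map width after each convolution and pooling step.

    layers: list of (kernel_size, pool_size) tuples, one per hidden layer.
    """
    sizes = [patch]
    for kernel, pool in layers:
        sizes.append(conv_out(sizes[-1], kernel))
        sizes.append(pool_out(sizes[-1], pool))
    return sizes

# Pleiades: 16 x 16 patch, 3 x 3 then 5 x 5 kernels (2 x 2 pooling assumed).
pleiades_sizes = trace_sizes(16, [(3, 2), (5, 2)])
# Sentinel-2A: 32 x 32 patch, one 7 x 7 kernel (2 x 2 pooling assumed).
sentinel_sizes = trace_sizes(32, [(7, 2)])
```

Under these assumptions the Pléiades patch shrinks as 16, 14, 7, 3, 1 and the Sentinel-2A patch as 32, 26, 13, which shows how quickly a small patch is consumed by even a two-layer network.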
Next, the CNN model was trained using the labeled sample patches and parameter configurations, and the model weights were adjusted using backpropagation. Parameter adjustment is important in this step. The learning rate controls the step size of each training iteration; thus, inappropriate rates can lead to slow convergence or to divergence [31]. Accordingly, values of 0.0006, 0.0009, 0.001, 0.005, and 0.01 were tested: lower values slow the learning process and may leave it in local minima or at suboptimal weights, whereas higher values speed up learning at an increased risk of missing the optimal minimum [41]. Ultimately, the accuracy results indicated that a rate of 0.0006 gave the most suitable amount of weight adjustment during stochastic gradient descent optimization. Training steps and samples were set to 5000 and 50, respectively, for both input datasets.
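The general trade-off between small and large learning rates can be seen on a toy problem. The sketch below runs plain gradient descent on f(w) = w², which is not the CNN loss and does not reproduce the study's finding that 0.0006 gave the best accuracy; the rate of 1.1 is a deliberately excessive, hypothetical value added to show divergence.

```python
def gradient_descent(lr: float, steps: int, w0: float = 1.0) -> float:
    """Minimize f(w) = w**2 from w0; the gradient is 2 * w."""
    w = w0
    for _ in range(steps):
        w -= lr * 2.0 * w
    return w

slow = gradient_descent(lr=0.0006, steps=50)    # small steps: still far from 0
faster = gradient_descent(lr=0.01, steps=50)    # larger steps: closer to 0
diverged = gradient_descent(lr=1.1, steps=50)   # too large: overshoots and grows
```

After the same number of steps, the smallest rate is furthest from the minimum among the stable runs, while the excessive rate moves away from it at every iteration.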

5) Application of the CNN deep model:
Finally, fully connected layers (heatmaps) were generated after applying the created CNN model, with each heatmap layer corresponding to one LULC category. Each heatmap holds values for the category predicted by the CNN: a value close to 1 indicates a higher likelihood of the category, while a value near 0 indicates a lower likelihood. For the Pléiades image, ten heatmap layers were produced, equivalent to the ten identified land cover categories, whereas six heatmap layers were produced for the Sentinel-2A image.
6) OBIA classification: As the CNN model operated at the pixel level, the classification of the integrated model with OBIA consisted of applying the latter approach to classify the entire input image at the object level. Here, the heatmaps were used as the input features for OBIA. The Sentinel-2A and Pléiades data were transformed into segmented images through a multiresolution segmentation algorithm [42]. Multiresolution models are region-growing models that iteratively merge pixels into objects while maintaining the homogeneity conditions defined by the user [34]. Based on trial and error, different scale parameter values were tested on both sets of input data to obtain the highest possible classification accuracy. Through cross validation, scale parameters of 15 and 5 were selected for the Pléiades and Sentinel-2A images, respectively. The other homogeneity criteria (shape and compactness) were set to their default values of 0.1 and 0.5, respectively.
b) Methods based on machine learning algorithms for comparison: Two machine learning algorithms were chosen for comparison with the proposed method: RF and SVM. These algorithms have been frequently applied in remote sensing analyses, are recognized for their powerful features, and are often considered the default techniques for LULC modeling [11], [12], [43].
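Returning to the object-level step of the integrated method, in which CNN heatmaps serve as input features for the segmented objects, one simple way to label objects is sketched below. The aggregation rule (mean heatmap value per object, then the class with the highest mean) is an assumption for illustration, since the exact eCognition rule set is not given in the text; the function `classify_objects` and the toy arrays are hypothetical.

```python
import numpy as np

def classify_objects(heatmaps: np.ndarray, segments: np.ndarray) -> dict:
    """Assign each segmented object the class with the highest mean heatmap value.

    heatmaps: (n_classes, H, W) array of per-class likelihoods in [0, 1].
    segments: (H, W) integer array of object ids from a prior segmentation.
    """
    result = {}
    for obj_id in np.unique(segments):
        mask = segments == obj_id
        mean_scores = heatmaps[:, mask].mean(axis=1)  # one mean per class
        result[int(obj_id)] = int(np.argmax(mean_scores))
    return result

# Toy example: two objects (left and right halves), two classes.
segments = np.array([[1, 1, 2, 2],
                     [1, 1, 2, 2]])
heatmaps = np.array([
    [[0.9, 0.8, 0.2, 0.1],   # class 0 likelihood
     [0.7, 0.9, 0.3, 0.2]],
    [[0.1, 0.2, 0.8, 0.9],   # class 1 likelihood
     [0.3, 0.1, 0.7, 0.8]],
])
object_classes = classify_objects(heatmaps, segments)
```

Because every pixel in an object receives the same label, isolated pixel-level errors in the heatmaps are smoothed out within each object.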
RF [38] is a powerful machine learning algorithm with excellent LULC mapping capabilities using different source data [39]. RF is a nonparametric model that builds multiple decision trees and assigns each input the most popular class among the votes of the trees. In LULC classification, the RF classifier has been shown to be consistent and relatively efficient, requiring few user-defined parameters and producing an OA that is often comparable to or better than that of other algorithms (e.g., conventional decision trees and maximum likelihood) [44]. For training the RF classifier, two important parameters must be assigned: the maximum number of trees (Ntree), and the number of features considered at each split (Mtry). Together, these two parameters have a high impact on classification performance [45], [46].
Alternatively, the SVM is a nonparametric algorithm for classification and regression analyses [39]. It is often used in LULC mapping tasks, as it is a discriminant classifier that minimizes classification error by identifying a hyperplane that separates the data into predefined classes. In instances where the data are not linearly separable, SVM uses a kernel function that projects the data into a higher-dimensional space [47]. Several kernel functions are used in SVM models: the Gaussian radial basis function (RBF), in addition to polynomial, linear, and sigmoid functions. Here, an RBF kernel was applied for SVM classification. The C and γ parameters are the two fundamental components controlling the performance of SVM when the RBF is used as the kernel function [48], [49]. The parameter C controls the magnitude of the penalty for misclassified training samples and plays an important role in the accuracy and generalization ability of the algorithm [12]. The γ parameter controls the kernel width; in RBF-based SVM classification, its effect is similar to that of C, because when a high value is assigned, the model is overfitted and generalizes poorly [49].
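The role of γ as a kernel width can be illustrated with the standard RBF kernel definition, K(x, y) = exp(−γ‖x − y‖²). The sketch below is a plain-Python illustration rather than the classifier used in the study; the two feature vectors are hypothetical.

```python
import math

def rbf_kernel(x, y, gamma: float) -> float:
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

x = (0.2, 0.4)   # hypothetical two-band feature vectors
y = (0.5, 0.8)   # squared distance to x is 0.25

narrow = rbf_kernel(x, y, gamma=5.0)   # high gamma: similarity decays quickly
wide = rbf_kernel(x, y, gamma=0.5)     # low gamma: similarity decays slowly
```

With a high γ, each training sample influences only its immediate neighborhood, which is why large values tend toward overfitting; with a low γ, the influence extends further and the decision boundary becomes smoother.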
A large set of values was tested to optimize the parameters of the two algorithms, with the aim of creating the most efficient classification model. For the RF algorithm, values of Ntree = 50, 100, 200, 300, 400, and 500 were tested while keeping Mtry at its default value. Then, with Ntree fixed at its best value, Mtry values between 2 and 30 (2, 5, 10, 15, 20, 25, and 30) were tested. The same process was followed for the SVM algorithm, with values ranging from 1 to 20 for C (C = 1, 2, 4, 5, 8, 10, 15, 20) and from 0.5 to 5 for γ (γ = 0.5, 1, 2, 3, 4, 5). The hyperparameter values derived from the optimization process using cross validation are given in Table III. Note that the pixel-based method was performed in Orfeo Toolbox, and OBIA was performed in eCognition.
7) Accuracy Assessment: Accuracy assessments aim to validate results and confirm the stability of each classifier applied in the proposed methodology. The obtained classification accuracies were assessed using OA, user accuracy (UA), producer accuracy (PA), and the kappa index (K) derived from a confusion matrix, as these are the most common metrics used for evaluating LULC classification accuracy [50]. OA represents the overall performance of the applied method, calculated as the ratio of the total number of correctly classified pixels to the total number of reference pixels across all categories. PA was calculated by dividing the number of correctly classified pixels in each LULC class by the total number of reference pixels of that class (the corresponding row or column total of the confusion matrix, depending on its orientation), whereas UA represents the probability that a pixel assigned to a given class actually belongs to that class [29].
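The four accuracy metrics can be computed from a confusion matrix as in the minimal sketch below. The orientation (rows as reference classes, columns as predicted classes) is an assumption, since the text does not state it, and the 2 × 2 matrix is a made-up example rather than data from the study.

```python
import numpy as np

def accuracy_metrics(confusion):
    """Return OA, per-class PA, per-class UA, and kappa.

    Assumed orientation: confusion[i, j] counts pixels whose reference class
    is i (rows) and whose predicted class is j (columns).
    """
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    correct = np.diag(cm)
    oa = correct.sum() / total
    pa = correct / cm.sum(axis=1)   # correct / reference total per class
    ua = correct / cm.sum(axis=0)   # correct / predicted total per class
    # Agreement expected by chance, from the row and column marginals.
    expected = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / total ** 2
    kappa = (oa - expected) / (1.0 - expected)
    return oa, pa, ua, kappa

# Hypothetical two-class confusion matrix (100 test pixels).
confusion = [[45, 5],
             [10, 40]]
oa, pa, ua, kappa = accuracy_metrics(confusion)
```

For this example matrix, OA is 0.85 and kappa is 0.70, which illustrates why kappa is lower than OA: it discounts the agreement expected by chance.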

III. RESULTS
Here, the results of accuracy assessments for all performed methods are presented, in addition to final land cover maps for each method from three subset regions within the study area. To improve visual analyses, these three classification subsets were extracted, and included the Ain Témouchent center as an urban area, in addition to the coastal area.

A. Statistical Accuracy Assessment
1) Pléiades Data: Table IV gives the accuracy assessment results for the applied methods. The proposed OBIA-based CNN method yielded an OA of 93.5%, and a kappa of 0.91 for the 10 LULC categories in the study area. The OA achieved by the proposed CNN method thereby exhibited significant improvement compared to the other tested methods. In addition, RF-OBIA and SVM-OBIA achieved OAs of 91.8% and 88.2%, as well as kappas of 0.91 and 0.84, respectively; whereas pixel-based RF and SVM achieved OAs of 84.8% and 72.9%, as well as kappas of 0.83 and 0.70, respectively. Based on the CNN deep model, water and roads had the highest UAs (99.3% and 97.7%, respectively). Further, the majority of classes had UA values >93%, including greenhouses, built-up areas, uncultivated lands, and barren lands. Cultivated lands were the poorest class in terms of UA (average ∼80.8%), owing to their overlapping pixel reflectance values and confusion with other classes; however, each were detected with UA
The OA achieved with methods based on RF/SVM algorithms was generally satisfactory, with the results demonstrating that OBIA RF and SVM algorithms outperformed pixel-based RF and SVM, producing a 7% difference in OA (84.8%-91.8%). The best result among the OBIA methods was achieved with RF, which had an OA of 91.8%, and kappa of 0.85. Alternatively, SVM reached an OA of 88.2%, and kappa of 0.84. Further, the same trends were observed when comparing pixel-based algorithms, where RF achieved an OA of 84.8% and kappa of 0.83, while SVM achieved an OA of 72.9% and kappa of 0.70.
The LULC classes most efficiently detected by RF were water and stadiums, with UAs of 99.6% and 99.3%, respectively (see Table IV). In contrast, cultivated and barren land were the most poorly classified, with UAs of 68.03% and 67.4%, respectively. In pixel-based RF, water and barren land were well classified, with UAs of 100% and 95.9%, respectively; however, built-up areas and forest were the least accurate in terms of UA (67.9% and 77.5%, respectively). Furthermore, confusion remained between built-up areas and roads, in addition to forest and cultivated lands, due to pixel reflectance.
2) Sentinel-2A Data: The results obtained for Sentinel-2A data varied across methods (see Table V). The proposed CNN deep model with OBIA and the other tested methods based on RF/SVM achieved satisfactory results, with OAs ranging from 77.4% to 91.0%. Furthermore, OBIA based on RF produced superior results, with an OA of 91% compared to 83.4% for the CNN model. This can likely be explained by the effects of spatial resolution in the classification process. Moreover, in both methods, water and built-up areas were well classified, with UAs of 100% and 99.9% for the CNN-based OBIA, and 98.9% and 98.0% for RF-OBIA, notably similar to that of forests, which were also well classified under this method (98.9%). For the pixel-based method, water and built-up areas were well detected, with UAs of 98.1% and 73.0% for RF. For both classifiers, forests were the least accurately classified, with UA < 35% for RF and < 47% for SVM. With respect to machine learning methods, the results achieved with RF were superior to those with SVM for both OBIA and pixel-based methods. RF-OBIA and SVM-OBIA achieved 91% and 72%, respectively, while RF-Pixel and SVM-Pixel achieved 80.1% and 77.4%; thus, RF outperformed SVM regardless of the method used.
The achieved results for both data types were compared. In terms of OA, Pléiades data provided better results than Sentinel-2A under the tested methods, including the proposed OBIA CNN method, with a difference of 2.5%. Similarly, the OBIA methods based on RF and SVM achieved better results with Pléiades data, with differences of 0.8% and 16.2%, respectively. For the pixel-based method, Pléiades data outperformed Sentinel-2A by 4.7% for RF, and 4.5% for SVM. The obtained OA and kappa among all methods are given in Table VI.

B. LULC Mapping From Pléiades and Sentinel-2 Data
Figs. 6-8 present the classification results of the methods for the Pléiades image. Through visual examination of the land cover maps, the 10 LULC categories were detected by all methods, though a number of differences were observed. Overall, OBIA-based CNN was the most suitable method for detection and delineation of LULC categorical boundaries. In particular, built-up areas, cultivated land, roads, and stadiums were well delineated. Regarding the other methods, some confusion was observed in the derived maps between roads and built-up areas, as well as between forest and cultivated land.
Notably, pixel-based RF and SVM presented the worst classification. Built-up areas were well detected in CNN compared to all other methods, as confirmed in the coastal area (see Fig. 8), where CNN accurately delineated the port from barren lands (i.e., beaches). In contrast, the coastal buildings were misclassified with both the OBIA and pixel-based analysis methods. Similar results were observed for cultivated land (see Figs. 7 and 8), as CNN had the capacity to distinguish between cultivated lands and forests, while the majority of agricultural areas were also well delineated. Comparing the classifiers for each machine learning method, a slight difference in classification was detected between RF and SVM. In general, there was confusion in distinguishing between roads and buildings in final LULC  maps. In the SVM-pixel map, forests were misclassified, being confused with cultivated and fallow lands.
Figs. 9-11 present the classification results of the different methods for the Sentinel-2A image. The six categories were detected by all methods, though there were a number of notable differences in the final LULC maps. For the proposed CNN based on OBIA, buildings, uncultivated land, and cultivated lands were well defined. CNN classification displayed similar results to RF-OBIA (see Fig. 9). Similarly, uncultivated land was well identified, albeit with limited confusion. In SVM learning methods, confusion between cultivated land and forests, as well as between uncultivated land and built-up areas, was observed, indicating poor classification (see Fig. 11). Roads, included in built-up areas, were also well delineated by the majority of methods. Comparisons of the final maps derived from Sentinel-2A data and Pléiades data showed that LULC maps based on the latter were of higher quality in terms of delineating each LULC category, due to the effects of enhanced spatial resolution during the classification process.

IV. DISCUSSION
Although the application of machine learning methods in LULC mapping, in particular OBIA-based classification, has achieved good results in several studies [15], [16], these methods suffer from misclassifications, due in part to the heterogeneity of LULC classes and the similarity between their spectral signatures. Accordingly, a technique that extracts higher-level features is required. In this regard, CNN techniques have developed rapidly in recent years and demonstrated a high capability for LULC mapping. Several analyses based on CNN models have addressed LULC detection, especially in coastal areas [51], [52]. The experimental results of these studies demonstrated a high potential for LULC detection, with classification accuracies above 90%. In spite of the high performance of traditional CNN models in LULC classification, their analyses are conducted at the pixel level, which can result in misclassifications due to the spatial distribution of classes, and a large number of CNN layers is needed to perform the classification. CNN-based OBIA methods can address these limitations by classifying images via segmented objects, with features generated automatically through high-level extraction by a CNN model.
In this article, a CNN deep learning model combined with an OBIA method was used to extract LULC features in Ain Témouchent, Algeria. The proposed methodology integrated CNN feature extraction with OBIA classification. It was applied to two distinct sources of remote sensing imagery, Sentinel-2A and Pléiades data, acquired on the same day in October 2020. In addition to the deep CNN method integrated with OBIA, two further methods (OBIA and pixel-based analysis) based on machine learning algorithms (RF and SVM) were tested on both datasets for comparison with the proposed CNN-based method. Furthermore, an optimized CNN model and OBIA were used to improve classification accuracy and produce LULC maps of higher interpretive quality. For the Pléiades image, two primary layers (convolution and max pooling layers) were adopted as the CNN architecture, with a 16 × 16 input sample patch size. The CNN parameters (e.g., sample patch size, hidden layers, and learning rate) were optimized based on cross-validation to obtain the final architecture with optimal accuracy. The same process was applied to Sentinel-2A imagery, for which a single hidden layer with convolution and max pooling layers was incorporated, along with an input sample patch size of 32 × 32. Notably, the CNN parameters, especially sample patch size, significantly affected classification accuracy. For Sentinel-2A, one hidden layer was sufficient to achieve a good result, whereas for Pléiades data, the optimal OA was obtained with two hidden layers. Likewise, consistent with prior experience, the sample patch size assigned to the classification process exerted a significant influence: a number of the tested sizes produced inaccurate classifications, while others produced the optimal LULC maps with respect to OA. Fig. 12 shows the results of the tested patch size values from both datasets used to generate the CNN model.
The graphs evaluated the influence of patch size on OA. For the Pléiades data, both large (64 × 64) and small (8 × 8) patch sizes produced inaccurate classifications, and similar patterns were observed for Sentinel-2A data. Moreover, generating a CNN model with a large patch size requires more computational power, hardware, and time. Overall, the work here highlighted the benefits of a simple CNN architecture compared to other studies that have used multiple layers and large patch sizes [28], [51], [57]. For example, Zhao et al. [53] assessed the effects of CNN architecture depth on deep extraction learning, training CNN models at depths of 1-5 to evaluate the corresponding impacts. Their results confirmed that the deeper CNN architectures produced the highest classification accuracies (up to 95%); however, generating these models requires significantly more time and computational power. By comparison, similarly accurate results (OA ≥ 93%) were obtained here via a simple CNN architecture with two hidden layers. Moreover, our results are consistent with those of Ghorbanzadeh et al. [35], who also found that the size of input sample patches for CNN models could significantly affect classification. Here, through an optimization method, the size of the optimal sample patch was set to 20 × 20 in order to perform the CNN-based classification combined with OBIA, with the findings confirming that, in addition to CNN capacity, OBIA (through multiresolution segmentation) also improved the classification, ultimately improving extraction.
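The trade-off between patch size, network depth, and cost can be illustrated with simple arithmetic. The following minimal Python sketch assumes 3 × 3 valid convolutions, 2 × 2 max pooling, four input bands, and 32 filters; none of these values are reported in this article, and the function names are hypothetical. Under these assumptions, it shows how the spatial size of the feature map shrinks with each convolution and pooling block, and how the number of multiplications in the first convolution layer grows with patch size.

```python
def feature_map_size(patch, n_blocks, kernel=3, pool=2):
    """Spatial size left after n_blocks of (valid convolution -> max pooling).

    kernel and pool are assumed values, not settings reported in the article.
    Returns 0 when the patch is too small to pass through all blocks.
    """
    size = patch
    for _ in range(n_blocks):
        size = size - kernel + 1      # valid convolution, stride 1
        if size < pool:
            return 0                  # nothing left to pool
        size = size // pool           # non-overlapping max pooling
    return size


def conv_multiplications(patch, kernel=3, in_channels=4, filters=32):
    """Multiplications in the first convolution layer for a single patch.

    in_channels and filters are assumed values chosen for illustration.
    """
    out = patch - kernel + 1
    return out * out * kernel * kernel * in_channels * filters


# Compare the patch sizes discussed in the text for a two-block network.
for patch in (8, 16, 32, 64):
    print(patch, feature_map_size(patch, 2), conv_multiplications(patch))
```

Under these assumed settings, an 8 × 8 patch cannot pass through two convolution and pooling blocks, while the cost of the first layer grows roughly quadratically with patch size. This is consistent with, but does not by itself explain, the behavior observed for very small and very large patches.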
Furthermore, although CNN architecture selection and OBIA input features can improve classification, it is essential to consider the influence of the scale parameter on the segmentation process. As shown in Fig. 13, OA values were affected by the scale parameter (as also seen in [52] and [53]). Here, the shape and compactness parameters were set to default values of 0.1 and 0.5, respectively. The scale parameter is key because it controls the size of the segmented objects, thereby adjusting the desired level of detail; consequently, tuning this parameter is an essential step in obtaining optimal classification results. Here, 15 was the optimal value of the scale parameter for the CNN-based OBIA method and the other machine learning methods for both input images.
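To make the role of these three parameters concrete, the sketch below follows the commonly described formulation of multiresolution segmentation, in which the cost of merging two adjacent objects is a weighted sum of the increases in color and shape heterogeneity, and merging stops once this cost exceeds a threshold set by the scale parameter. The squared-scale threshold, the function names, and the input values are assumptions made for illustration; the segmentation software used in this study may differ in detail.

```python
def fusion_cost(d_color, d_compact, d_smooth, shape=0.1, compactness=0.5):
    """Weighted heterogeneity increase caused by merging two adjacent objects.

    shape weights shape against color (color weight = 1 - shape);
    compactness weights compactness against smoothness within the shape term.
    """
    d_shape = compactness * d_compact + (1.0 - compactness) * d_smooth
    return (1.0 - shape) * d_color + shape * d_shape


def may_merge(cost, scale=15):
    """Assumed stop rule: merge only while the cost stays below scale squared."""
    return cost < scale ** 2


# Hypothetical heterogeneity increases for one candidate merge.
cost = fusion_cost(d_color=200.0, d_compact=100.0, d_smooth=300.0)
print(cost, may_merge(cost, scale=15), may_merge(cost, scale=10))
```

With a shape value of 0.1, color accounts for 90% of the cost, so spectral similarity dominates the merge decision. Raising the scale parameter raises the threshold, which allows more merges and therefore larger objects.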
Hence, the CNN combined with OBIA improved OA by 1.7% over that achieved by RF-OBIA when assessing the Pléiades VHR image. The results here thus demonstrated the effectiveness of CNN as a classifier and its potential to identify the boundaries of LULC categories. Consequently, CNN is very useful for LULC classification, in particular over large-scale environments. For the Sentinel-2A image, CNN and RF performed with OBIA were closely matched, while the RF classifier obtained the best results, with an OA more than 7% higher. Furthermore, for the machine learning methods, the classification results affirmed that OBIA outperformed the pixel-based analyses for both datasets, as has been seen in earlier LULC studies [54], [55]. When comparing machine learning classifiers (RF and SVM), both achieved good results, with OAs > 70%.
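For reference, OA and the kappa coefficient used to compare the methods can both be derived from a confusion matrix, as can per-class accuracies. The Python sketch below assumes that rows hold reference classes and columns hold predicted classes; the three-class matrix is hypothetical and is not taken from this study.

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Agreement corrected for the agreement expected by chance."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (overall_accuracy(matrix) - expected) / (1.0 - expected)


def per_class_accuracy(matrix):
    """Producer's (per reference row) and user's (per predicted column) accuracy."""
    n = len(matrix)
    producers = [matrix[i][i] / sum(matrix[i]) for i in range(n)]
    users = [matrix[j][j] / sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return producers, users


# Hypothetical confusion matrix: rows = reference, columns = predicted.
toy = [
    [45, 5, 0],
    [4, 40, 6],
    [1, 5, 44],
]
print(overall_accuracy(toy), cohen_kappa(toy), per_class_accuracy(toy))
```

Because kappa subtracts the agreement expected by chance, it is lower than OA for the same matrix, which is why the two statistics are usually reported together.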
Considering the LULC maps (see Figs. 6-11), the proposed method produced the most accurate LULC features in the study area, where nearly all LULC classes were well distinguished. Furthermore, for LULC classification based on the Pléiades image (see Figs. 6-8), the proposed method allowed for the detection of all desired LULC classes: water, cultivated land, greenhouses, built-up area, fallow, uncultivated land, roads, barren land, forests, and stadiums. Moreover, classification boundaries were well delineated, with buildings being particularly well distinguished from roads. Similarly, despite the similarity in pixel reflectance between forests and cultivated lands, both classes were well extracted. For Sentinel-2A image analyses (see Figs. 9-11), the LULC categories (water, cultivated land, uncultivated lands, built-up areas, barren lands, and forests) were also well identified using the proposed and machine learning methods. Notably, the LULC maps derived from the proposed method and the RF-based OBIA algorithm were the most similar to each other. Thus, given the various misclassifications produced by the other methods, CNN-based OBIA was shown to be the most suitable for LULC class detection. Regarding spatial resolution, LULC maps derived from Pléiades had a higher level of spatial detail.
Despite the superior accuracy of CNNs combined with OBIA when compared to machine learning methods alone, the latter methods, especially those based on the RF classifier, were competitive and achieved successful results in LULC classification. As reported in several previous studies [56], [57], RF performed better than SVM regardless of the satellite data used. For the Pléiades data, RF outperformed SVM by 11.9% with the pixel-based method, and by 3.6% using OBIA. Similarly, for Sentinel-2A data, the RF classification produced an improvement in OA of > 2.7% for the pixel-based analysis and > 10% for OBIA. RF parameters also affected the training of the classifier; in particular, optimizing the total number of trees can enhance the classification results. Using cross-validation, a wide range of tree numbers (50-500) was tested here, and OA was evaluated for each value. Fig. 14 illustrates the impact of the number of RF trees on the OA of the OBIA and pixel-based classifications, confirming the strong influence of this hyperparameter on classification accuracy.
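The tree-number search described above can be sketched as a k-fold cross-validated grid search. In the Python sketch below, the fold count, the step of 50 trees, and all function names are assumptions, and `toy_train_and_score` is a made-up placeholder standing in for training an RF classifier and computing its OA on the validation fold; it does not reproduce the curves in Fig. 14.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_oa(n_trees, n_samples, k, train_and_score):
    """Average the OA returned by train_and_score over k train/validation splits."""
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        scores.append(train_and_score(n_trees, training, validation))
    return sum(scores) / k


def select_tree_number(candidates, n_samples, k, train_and_score):
    """Return the candidate with the highest mean OA, plus all mean OAs."""
    mean_oa = {n: cross_validated_oa(n, n_samples, k, train_and_score)
               for n in candidates}
    best = max(mean_oa, key=mean_oa.get)
    return best, mean_oa


def toy_train_and_score(n_trees, training, validation):
    """Hypothetical stand-in for fitting an RF and scoring it; not real data."""
    return 0.9 - 20.0 / n_trees   # made-up curve that rises toward a plateau


candidates = range(50, 501, 50)   # assumed step of 50 within the tested range
best, scores = select_tree_number(candidates, n_samples=200, k=5,
                                  train_and_score=toy_train_and_score)
print(best, round(scores[best], 3))
```

In practice, `toy_train_and_score` would be replaced by a function that fits the classifier on the training indices and returns the OA measured on the validation indices, so that each candidate value is judged on samples it was not trained on.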
As part of automated LULC mapping methods, this article demonstrates the potential of CNN for LULC classification based on object segmentation of high and very high-resolution data. However, deep learning models require many high-quality data samples for algorithm optimization. LULC classes are better identified by using semantics rather than images alone, which is reflected in the accuracy scores and qualitative analysis. A key requirement for CNNs is the availability of large training datasets, which allow the model to be trained successfully. Consequently, performance was investigated from several perspectives: the overall performance of the LULC classifications was examined, per-class accuracies were discussed, and qualitative analysis clarified how semantics can be used as a source of information in LULC classification. In particular, the LULC classes related to artificial structures, such as the built-up class, had higher classification accuracy. Overall, the proposed deep learning method successfully discriminates and classifies very similar classes based on spectral cues, and generates highly accurate LULC maps. Despite the superiority of the proposed model, we find that our deep learning model typically requires more training samples than traditional machine learning methods.

V. CONCLUSION
This study assessed and mapped LULC in the Ain Témouchent coastal area of western Algeria. A CNN deep learning model developed in combination with OBIA was applied, and machine learning methods based on RF and SVM classifiers were tested for comparison. The methods were applied to two different remote sensing data types, Pléiades VHSR and Sentinel-2A high spatial resolution data, with the aim of testing the contribution and potential of each dataset in the extraction of LULC features. The parameters of the CNN architecture, in particular the size of sample patches and the CNN layers (including hidden, convolution, and max pooling layers), were optimized to produce the optimal model architecture and enhance classification accuracy. The proposed CNN deep model integrated with OBIA showed significant improvements in LULC mapping compared to other machine learning classifiers, achieving an OA and kappa of 93.5% and 0.91 for Pléiades data, respectively, and 83.4% and 0.80 for Sentinel-2A data. In addition, despite the capability of CNN models in high-level LULC extraction, the OBIA method should be improved by optimizing the segmentation parameters. Notably, the scale parameter in multiresolution segmentation is key to controlling the size of the segmented objects, and should be optimized to improve OBIA classification.
Furthermore, the results of the machine learning methods confirmed that OBIA outperformed pixel-based analysis, and that RF was more stable than SVM for both datasets. In addition, given the effect of spatial resolution, the proposed CNN method performed better with Pléiades data, which yielded significant improvements in the LULC maps regardless of the tested method. The method also offers higher accuracies and can be applied over larger scales with different remote sensing data sources.
The results here revealed that it is possible to map LULC in coastal areas using machine learning algorithms applied to data with different spatial resolutions. However, the final LULC maps differ, and the level of LULC classes detected depended upon the chosen resolution. Despite the lower resolution of Sentinel-2 data, usable maps were still produced. Hence, for more detailed analyses that require fine-scale LULC details, using VHSR products is recommended in heterogeneous coastal areas. The final maps produced here can serve as a database for other applications (e.g., assessments of flooding vulnerability, which require detailed LULC information during the modeling process), and can be considered a helpful tool in supporting regional and national-level decision making concerning LULC in and around the Ain Témouchent coastal area.