Retargeting Low-Resolution Aerial Imagery by Distribution-Preserving Perceptual Feature Selection

This work presents a novel low-resolution (LR) aerial image retargeting pipeline, whose key is active perception learning coupled with a distribution-preserving feature selector. We focus on engineering perceptual and descriptive visual representations for optimally shrinking different regions inside each LR aerial photo. In particular, by mimicking how humans sequentially perceive different salient regions, an active learning paradigm is deployed to divide an LR aerial image into a succinct set of attractive regions and the remaining non-attractive regions. Theoretically, the deployed active learning paradigm ensures that the selected attractive regions can maximally reconstruct the target LR aerial image, thereby accurately capturing human gaze allocation. Subsequently, a semi-supervised distribution-preserving feature selector (DPFS) is proposed to acquire high quality features from the selected attractive regions. Noticeably, DPFS only requires a small proportion of the LR aerial images to be labeled, and the labeled/unlabeled sample distribution is optimally preserved during feature selection (FS). The acquired high quality features are finally used to learn a Gaussian mixture model (GMM) for retargeting. Extensive empirical results have shown the superiority of the proposed algorithm.


I. INTRODUCTION
Due to the significant progress in space science, engineering, and long-distance communication, a considerable number of earth-observing satellites have been launched recently. Generally, we can conveniently categorize these satellites into two types: high-altitude and low-altitude satellites. In practice, high-altitude satellites cover a remarkably larger region than low-altitude ones. Accurately discovering the semantics of low-resolution (LR) aerial photographs is becoming a significant component of many intelligent systems. In real-world remote sensing systems, retargeting LR aerial images by comprehensively discovering their numerous semantic regions is an indispensable technique. As an example, we can optimally display complicated street maps on an LR display for smart navigation: a retargeting algorithm can succinctly display the planned path, giving drivers a pleasant navigation experience. More importantly, this can decrease road accidents, since drivers will be more concentrated. Moreover, by designing an LR aerial image retargeting model for an iPhone or Apple Watch, refugees can more effectively escape from different disasters, e.g., bushfires, hurricanes, and earthquakes. (The associate editor coordinating the review of this manuscript and approving it for publication was Yizhang Jiang.)
FIGURE 1. The flowchart of our designed retargeting of aerial images with LR. Given a collection of labeled/unlabeled aerial photos, we first map the internal patches onto a manifold. Then, an active learning model extracts the attractive regions from an LR aerial image, from which the GSP is constructed; the deep GSP features are calculated simultaneously. Next, the proposed feature selector obtains sufficiently representative and low-redundancy features from the original deep GSP features. The selected features are finally fed into a GMM for LR aerial image retargeting.

In computer graphics, researchers believe that uniform scaling is practically a sub-optimal choice for LR aerial image retargeting, owing to the rich number of ground objects dispersedly scattered in each LR aerial image. Meanwhile, cropping is unsuitable for retargeting LR aerial images since some visually/semantically important regions are usually abandoned. In order to optimally display scenic pictures at different resolutions, content-aware retargeting
was introduced, preserving the salient regions while suppressing the non-salient ones following human visual perception. In practice, however, content-aware retargeting fails to encode LR aerial imagery owing to the following challenges:
• Practically, there exist many attractive ground objects or object parts in an LR aerial image, as exemplified in Fig. 1. To discover the semantic labels of each LR aerial image, a biologically-inspired algorithm is required to simulate how humans perceive the visually prominent regions. Practically, building a deep learning algorithm that jointly obtains the visually prominent regions and refines their visual representation is difficult. Some possible challenges are: 1) computing the path along which human beings sequentially allocate their gazes onto the attractive image patches (such as the gaze shifting path (GSP) presented in Fig. 1), 2) avoiding the inherently noisy/contaminated labels in massive-scale training samples, and 3) semantically encoding image-level labels into the various image patches inside each LR aerial photo;
• Different from aerial images with high resolutions, LR aerial images usually have low visual quality, since they are easily influenced by external factors such as uncontrollable weather. This yields only a few annotated low-resolution (LR) aerial images, combined with a rich set of annotated high-resolution (HR) ones.
In our context, the target is a feature selector trained using partially-annotated LR aerial images. Multiple difficulties have to be solved, including uncovering the sophisticated relationships between LR and HR aerial images in the high-order manifold space.
Herein, a new distribution-preserving feature selector (DPFS) is proposed, which utilizes the human gaze behavior actively learned from HR aerial images to improve the retargeting of LR aerial images. An overview of the pipeline is presented in Fig. 1. We leverage massive-scale HR and LR aerial images, part of which are unlabeled. The aerial image regions are first mapped to the feature space. Then, to simulate how humans understand different aerial photos, an active learning algorithm is leveraged to decompose each aerial image into a few visually attractive patches and the unattractive background patches. Simultaneously, the gaze shifting path (GSP) and its deep visual features are calculated. To acquire a subset of high quality GSP features across aerial images with high/low resolutions, we propose the so-called DPFS to select discriminative features, wherein only a few samples are labeled. Theoretically, DPFS can maximally maintain the distribution of multi-resolution aerial images in the underlying feature space. Lastly, we feed the selected features into a Gaussian mixture model (GMM) for the retargeting task. A comparative study with several generic image retargeting algorithms has demonstrated the competitiveness of the designed pipeline. Besides, another comparison with multiple visual categorization models has shown the high discrimination of the engineered GSP feature.
The novelties of this work are: a) an active learning algorithm that sequentially generates the GSP from an LR aerial image and jointly calculates the visual representation of each GSP; b) the DPFS, which selects multiple high quality features across HR and LR aerial photos while maximally maintaining the sample distribution during feature selection; and c) a probabilistic model for LR aerial photo retargeting.

II. PREVIOUS WORK RELATED TO OURS
Many computational visual models have been proposed for analyzing aerial photos. Researchers [2] designed a multi-layer deep learner for detecting foreground visually salient objects. In [1], researchers formulated a focal-loss-based hierarchical model to accurately localize various cars within LR and HR aerial photographs. In [36], the authors designed a geographic object detection model that handles HR aerial images by intelligently extracting intersections and roads. In [11], the authors proposed to combine feature engineering and soft-label calculation to form an effective visual detector for modeling aerial images. Importantly, compared to the aforementioned techniques, our aerial image recognition method is biologically inspired and well reflects the human visual perceptual process. In summary, the above region-level image models well exploit representative regions of multiple sizes in each LR aerial photo. However, they still have the following shortcomings: 1) these methods are usually designed for a specific image set, wherein some pre-specified domain knowledge is incorporated; thereby, it might be difficult to adapt them to an unknown image set; 2) ideally we want a perception-guided region-level image model, where visually/semantically salient regions are discovered for LR aerial photo representation, but the above models cannot explicitly discover these salient regions; and 3) the aforementioned models cannot select high quality features in a principled way; meanwhile, the geometry structure among samples is not explicitly encoded during feature engineering.
In computer graphics and the image processing community, plenty of image/video retargeting algorithms have been proposed. Herein, we briefly introduce some representative retargeting models. The authors of [15] considered photo retargeting as a seam discovery problem solved by dynamic programming; on this basis, they calculated a gradient energy map for measuring the importance of different pixels. Researchers [18] designed an objective function that minimizes the energy cost to upgrade the first retargeting algorithm. Aiming at a significant update of seam carving, the authors of [16] carefully neglected the identical textural/color patterns inside different segmented superpixels. The authors of [22] designed a retargeting pipeline that progressively fuses unimportant regions to obtain unobservable image distortions; this operation propagates as the image shrinks. In [27], the authors designed a novel retargeting technique which rapidly generates thumbnails by analyzing the original imagery. In [20], the authors proposed an effective image retargeting workflow based on parameterizing saliency-guided visual meshes. Researchers [21] formulated an image patch-guided retargeting model that maintains the shapes and geometry of the visually salient regions. The authors of [24] first built a hidden space using multiple operators, based on which a novel retargeting technique was proposed by calculating the optimal operator path. The authors of [23] designed a new visual retargeting algorithm by progressively fusing the results of photo cropping and warping; herein, the temporally duplicated regions are identified by the cropping, while the motion features are maintained by the warping technique. In [19], the authors proposed a retargeting pipeline that progressively derives the scaling parameters and then updates the retargeted photo accordingly; the level of deformation is guided by the calculated visual saliency map. In [25], the authors proposed to retarget images in the
aforementioned operator space. Besides, the authors of [26] proposed to incorporate human visual perception into retargeting, wherein an extensive comparative study on RetargetMe [17] was presented.

III. THE APPROACH

A. ACTIVE LEARNING HUMAN GAZE ALLOCATION
Practically, for each LR aerial photograph, there exists a rich set of image patches that are insufficiently descriptive of its semantic categories. These image patches are generally the unattractive background ones, onto which humans will not allocate their gazes. To build an effective LR aerial photo categorization model, we adopt an active learning algorithm to acquire multiple semantically representative image patches inside each aerial photo.
In theory, we expect a carefully designed machine learning technique to optimally capture the hidden sample distribution. Based on the fact that spatially neighboring image patches are semantically correlated, we can linearly represent each patch using its adjacent ones. Herein, we compute the reconstruction parameters as

R = arg min_R \sum_{i=1}^{N} || x_i − \sum_{j ∈ A(x_i)} R_ij x_j ||^2, (1)

where {x_1, x_2, ..., x_N} denotes the N image patches' visual features, R_ij quantifies how important the j-th image patch is for rebuilding its spatially adjacent i-th one, N counts the image patches within an aerial image, and A(x_i) is the set of neighbors of the i-th image patch.
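The per-patch least squares of Eq. (1) can be sketched directly. The snippet below is a minimal illustration, assuming patch features are row vectors and the neighbor sets A(x_i) are given as index lists; it is not the authors' implementation.

```python
import numpy as np

def reconstruction_weights(X, neighbors):
    """Least-squares weights R[i, j] rebuilding patch i from its
    spatial neighbors, as in Eq. (1).
    X: (N, d) array of patch features; neighbors: list of index
    arrays, neighbors[i] = A(x_i)."""
    N = X.shape[0]
    R = np.zeros((N, N))
    for i, nbrs in enumerate(neighbors):
        G = X[nbrs]                              # (k, d) neighbor features
        # solve G.T @ w = x_i in the least-squares sense
        w, *_ = np.linalg.lstsq(G.T, X[i], rcond=None)
        R[i, nbrs] = w                           # only neighbors get weight
    return R
```

With more neighbors than feature dimensions, the minimum-norm solution rebuilds each patch exactly from its neighbors.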
In order to calculate the visual descriptiveness of the selected image patches, we formulate a reconstruction algorithm based on the above parameters. Herein, we use an error to score the selected image patches. Denoting {a_1, a_2, ..., a_N} as the rebuilt image patches, we obtain them by minimizing the following objective function:

ϵ(A) = \sum_{k=1}^{K} || a_{s_k} − x_{s_k} ||^2 + μ \sum_{i=1}^{N} || a_i − \sum_{j ∈ A(x_i)} R_ij a_j ||^2, (2)

where μ weights the importance of the regularizer, K counts the selected attractive image patches, and {s_1, s_2, ..., s_K} denotes the indices of the selected image patches. Noticeably, the first term calculates the cost of fixing the selected attractive image patches, while the second term enforces the rebuilt image patches to follow the same reconstruction relationships as the input samples. Let ϒ be a diagonal matrix whose i-th diagonal entry is one if i ∈ {s_1, s_2, ..., s_K} and zero otherwise. On this basis, we can update the objective function into

ϵ(A) = tr((A − X)^T ϒ (A − X)) + μ tr(A^T D A), (3)

where D = (I − R)^T (I − R) and the rows of A and X stack the rebuilt and input patch features, respectively. Aiming at optimizing (3), we set ϵ(A)'s gradient to zero, based on which we obtain

ϒ(A − X) + μ D A = 0. (4)

In this way, the rebuilt image patches are calculated as

A = (ϒ + μD)^{-1} ϒ X. (5)

By leveraging the rebuilt image patches, we can measure the reconstruction error using

ϵ = || X − (ϒ + μD)^{-1} ϒ X ||_F^2, (6)

where ||·||_F represents the Frobenius norm of a matrix. Owing to its combinatorial nature, minimizing (6) over all K-subsets is computationally intractable. In order to accelerate this, we propose a sequential scheme. Herein, we represent the image patches already selected inside an aerial image as {b_{s_1}, ..., b_{s_{K′}}}, denote by ϒ_{K′} the corresponding N × N diagonal indicator matrix, and by I_i the N × N matrix whose (i, i)-th entry is one while all the other entries are zero. Thereby, the s_{K′+1}-th image patch is calculated by optimizing

s_{K′+1} = arg min_i || X − (ϒ_{K′} + I_i + μD)^{-1} (ϒ_{K′} + I_i) X ||_F^2. (7)

Practically, we notice that D in (7) is sparse; to speed up the matrix inversion, we employ the well-known Sherman-Morrison formula [35] and
receive:

(J + I_i)^{-1} = J^{-1} − (J^{-1})_{*i} (J^{-1})_{i*} / (1 + (J^{-1})_{ii}), (8)

where J = ϒ_{K′} + μD, and (J^{-1})_{*i} and (J^{-1})_{i*} respectively represent the i-th column and the i-th row of J^{-1}. Consequently, objective function (7) is updated into (9) by expanding the inverse with (8). Setting K = D B B^T D, the optimization (7) is finally upgraded into (10). Using (10), we can sequentially compute the K attractive image patches in each aerial image. In practice, we sequentially link them to form a GSP that reflects how humans sequentially perceive each aerial image (as exemplified in Fig. 1). For each GSP, we extract a 128-dimensional CNN (convolutional neural network) feature [14] from each image patch, based on which a 128K-dimensional feature vector is constructed to describe each path that reflects human visual perception.
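For illustration, the sequential selection of Eqs. (6)-(7) can be sketched greedily. The snippet below uses a direct matrix solve in place of the Sherman-Morrison speedup of Eq. (8) for clarity; the weight matrix R, the trade-off mu, and the toy sizes are all assumptions.

```python
import numpy as np

def select_attractive_patches(X, R, K, mu=0.1):
    """Greedy minimization of the reconstruction error of Eq. (6):
    err(S) = ||X - (Ups + mu*D)^{-1} Ups X||_F^2, where Ups is the
    diagonal indicator of the selected patch indices S."""
    N = X.shape[0]
    D = (np.eye(N) - R).T @ (np.eye(N) - R)   # D = (I-R)^T (I-R)
    selected = []
    for _ in range(K):
        best_i, best_err = None, np.inf
        for i in range(N):
            if i in selected:
                continue
            ups = np.zeros(N)
            ups[selected + [i]] = 1.0          # candidate indicator
            Ups = np.diag(ups)
            A = np.linalg.solve(Ups + mu * D, Ups @ X)
            err = np.linalg.norm(X - A) ** 2
            if err < best_err:
                best_i, best_err = i, err
        selected.append(best_i)
    return selected
```

The Sherman-Morrison update of Eq. (8) avoids re-inverting Ups + mu*D for every candidate i, which the direct solve above does naively.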

B. DISTRIBUTION-PRESERVING FEATURE SELECTION (DPFS)
Herein, it is natural to hypothesize that the LR aerial photos are entirely unlabeled, while all the HR aerial images are labeled. We adopt the feature matrix X and label matrix L defined in the notation accompanying Fig. 2; when the u-th aerial image is unlabeled, l_u is set to a zero vector. Denoting Q ∈ R^{D×C} as the mapping matrix of our feature selector, a standard FS is formulated by optimizing the following regularized empirical error:

arg min_Q ℓ(X^T Q, L) + λR(Q), (11)

where ℓ is a loss function and R represents the regularizer.
As shown in Fig. 2, the similarity graph is denoted by E, wherein each element E_ij measures the difference between h_i and h_j. In our work, we set E_ij = 1 if h_i and h_j are deemed neighbors of each other, and E_ij = 0 otherwise. Also, matrix F is defined as a diagonal one, wherein each diagonal element is computed as F_ii = Σ_j E_ij. Thereafter, we can calculate the corresponding Laplacian matrix as T = F − E.
In order to successfully mine all the training aerial images, the prediction label matrix is defined as P ∈ R^{N×C} with respect to the entire training data, following transductive learning theory. Herein, p_i ∈ R^C represents the calculated label of sample x_i. Moreover, we make P best fit both the ground-truth labels and the aforementioned affinity graph. Formally, we calculate P using the following optimization task:

arg min_P tr(P^T TP) + tr((P − L)^T V(P − L)), (12)

where matrix V is diagonal.
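Setting the gradient of Eq. (12) to zero gives TP + V(P − L) = 0, i.e., the closed form P = (T + V)^{-1} V L. A minimal sketch of this solve follows; the diagonal weights v_labeled and v_unlabeled are illustrative choices, not values from the paper.

```python
import numpy as np

def propagate_labels(E, L, labeled_mask, v_labeled=1e6, v_unlabeled=1.0):
    """Closed-form solution of Eq. (12): P = (T + V)^{-1} V L,
    with T = F - E the graph Laplacian and V diagonal, weighting
    labeled samples heavily so P stays close to the ground truth."""
    F = np.diag(E.sum(axis=1))            # degree matrix
    T = F - E                             # graph Laplacian
    v = np.where(labeled_mask, v_labeled, v_unlabeled)
    V = np.diag(v)
    return np.linalg.solve(T + V, V @ L)
```

On a three-node chain with the two endpoints labeled with different classes, the middle node receives an evenly split soft label, as expected from the Laplacian smoothing term.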
Herein, ||X^T Q − P||_F quantifies the fitting cost and R(Q) is a regularization term that steers Q toward the optimal feature selector; σ ∈ [0, 1] and τ ∈ [0, 1] denote the importance of the corresponding terms, respectively. To obtain sufficient sparsity, at the price of non-convexity, we apply the l_{2,p}-norm (p ∈ (0, 1]) to the regularization term R(Q) of our feature selector DPFS. Therefore, we can reformulate the objective as

arg min_{P,Q} tr(P^T TP) + tr((P − L)^T V(P − L)) + σ ||X^T Q − P||_F^2 + τ ||Q||_{2,p}^p, (14)

where ||Q||_{2,p}^p = Σ_i ||q^i||_2^p and q^i denotes the i-th row of Q. Herein, p is fixed to 1/2. Details of the above optimization are presented in the following. We first set the derivative of (14) w.r.t. P to zero, and thereby obtain:

P = (T + V + σI)^{-1}(VL + σ X^T Q). (15)

After some derivations, substituting (15) back into (14) reorganizes the objective function into a problem over Q only, denoted as (16). Setting (16)'s derivative w.r.t. Q to zero, we obtain

(σ XX^T + τG) Q = σ XP, (17)

where G is a diagonal matrix with G_ii = p / (2 ||q^i||_2^{2−p}). Since G depends on Q, (15) and (17) are solved alternately. Based on the above derivations, the solution of (16) is summarized in Algorithm 1.

C. GMM-BASED LR AERIAL IMAGE RETARGETING
In practice, perceiving LR aerial images is a subjective task, since different individuals may hold different views on the same LR aerial photo. In order to alleviate this issue, our LR aerial photo retargeting incorporates the visual perception experiences learned from many well-trained photographers into the test LR aerial photo. In detail, we deploy a Gaussian mixture model (GMM) to model the selected GSP features during training, i.e.,

prob(ν|ϒ) = \sum_{i=1}^{M} f_i N(ν | α_i, Σ_i), (18)

where f_i represents the importance of the i-th component, ν is the GSP feature, M counts the mixture components, and α_i and Σ_i denote the mean and covariance of the i-th Gaussian in the learned GMM. In our implementation, we utilize the Euclidean distance to quantify the similarity between pairwise selected GSP features. Notably, humans perceive the retargeted LR aerial photos similarly to the massive-scale training ones. Given an unknown LR aerial photo, we first calculate its GSP and then select the refined features. Afterward, the significance of each grid is calculated accordingly. In the LR aerial photo shrinking stage, to avoid the distortions that a triangle mesh may generate along the triangle orientations, we leverage a grid-based shrinking strategy. In our implementation, we evenly decompose the test LR aerial photo into a set of equally-sized grids. Afterward, the significance η_h(g) of each horizontal grid g is calculated from the GMM probability prob(ν|ϒ) of the corresponding GSP feature, where prob represents the GMM trained by the adopted EM-guided optimization. It is worth emphasizing that our LR aerial photo shrinking operation is carried out from left to right, as elaborated in Fig. 3. In each shrinking stage, an intermediate retargeted LR aerial photo is generated. As illustrated in (18), ν represents the selected GSP feature engineered from the current test LR aerial photo in the shrinking stage, and the GMM probability prob(ν|ϒ) is calculated based on (18). After obtaining the horizontal significance of each grid, we conduct a normalization step, that is,

η̄_h(g_i) = η_h(g_i) / \sum_j η_h(g_j). (19)

We assume that the size of the retargeted LR aerial photo is X × Y. The horizontal size of the i-th grid is then squeezed to [X · η̄_h(g_i)], where [·] rounds a real number; the vertical size is handled analogously with the normalized vertical significance η̄_v(g_i).
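As a minimal sketch of Eqs. (18)-(19), the mixture density can be evaluated directly and the normalized significances turned into per-grid widths. The GMM parameters below are stand-ins for illustration, not parameters learned from GSP features.

```python
import numpy as np

def gmm_prob(nu, weights, means, covs):
    """Mixture density of Eq. (18):
    prob(nu) = sum_i f_i * N(nu | alpha_i, Sigma_i)."""
    total = 0.0
    for f, a, S in zip(weights, means, covs):
        d = len(a)
        diff = nu - a
        coeff = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(S))
        total += f * coeff * np.exp(-0.5 * diff @ np.linalg.solve(S, diff))
    return total

def grid_widths(grid_scores, target_width):
    """Eq. (19)-style normalization followed by width allocation:
    the i-th grid is squeezed to round(target_width * eta_h(g_i))."""
    eta = np.asarray(grid_scores, dtype=float)
    eta = eta / eta.sum()                 # normalized significance
    return np.rint(target_width * eta).astype(int)
```

Grids whose GSP features score higher under the GMM thus keep a proportionally larger share of the retargeted width, which is how the left-to-right shrinking preserves attractive regions.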

IV. EXPERIMENTAL RESULTS

Comparative Categorization and Retargeting Performance:
First of all, we compare our method with seven deep visual classification algorithms [5], [6], [7], [8], [9], [10], [12] that optimally encode the domain experiences of multiple categories of aerial photos. The experimental million-scale aerial images are from [3]. Herein, the codes of [5], [6], [9], and [10] are publicly available; we conduct the comparative study with their inherent settings unchanged. For [7], [8], and [12], we re-implemented the algorithms ourselves, since the codes are not provided, and tuned them until their performances approximated those reported in the original publications.
Meanwhile, our algorithm is also compared with multiple generic recognition models. Moreover, since LR aerial image classification can be considered a sub-problem of scene categorization, we further conduct a comparative study between our method and three recently published scene classification models [4], [28], [30]. For the self-implemented recognition algorithms, the empirical settings are summarized as follows. For [7], we utilize ResDep-128 [13] as the backbone, which is further updated into a multi-label variant; apart from the fully-connected layer (whose number of units is fixed to 19), the remaining deep layers follow the above ResDep-128 [34]. We also deploy ResNet-108 [13] as a backbone, with the learning rate and decay respectively fixed to 0.001 and 0.05, and compute the loss of the entire network using the mean squared error. For [4], the well-known object bank [32] is adopted based on the carefully selected 18 LR aerial image classes. Herein, we used the average-pooling scheme, utilized the liblinear solver for the linear classifier, and applied 10-fold cross validation.
For the aforementioned 18 baseline visual recognition algorithms, we test each algorithm multiple times. Accordingly, we present the average accuracies in Fig. 2, and the corresponding standard errors are reported simultaneously. We notice the high competitiveness of our method. It is noticeable that our per-class standard errors are much smaller than those of the competitors, which shows the high stability of our algorithm. Overall, we make the following observations.
In this experiment, we compare our designed DPFS with a set of feature selectors for aerial photo classification: information-theoretic feature selection (ITFS) [37], CNN feature reduction (CNNFR) [38], feature selection for land cover classification (FSLC) [39], PCA feature reduction (PCAFR) [40], and CNN-based dimensionality reduction (CNNDR) [41]. We present the comparative average categorization accuracies in Table 3. As shown, our method performs the best. This is because only DPFS can optimally exploit the samples' underlying relationships on the manifold on which the high-dimensional deep features might be distributed.
Comparative Computational Cost: Practically, computational time at both the training and testing stages is an important measure of the effectiveness of a visual classification technique. As shown by the comparative time costs in Table 2, at the training stage, two categorization algorithms perform better than our method. The reason is that the architectures of [31] and [33] are remarkably simple and efficient. Simultaneously, it is observable that the per-class performances of [31] and [33] are nearly 4.1% worse than our method's. Meanwhile, during the evaluation stage, our proposed algorithm runs much faster than its counterparts. Since the training step is conducted in an off-line mode, a low time cost during testing is comparably more useful.
In retrospect, our LR aerial image classification framework includes three important components: 1) the deep low-rank model for GSP generation, 2) our proposed DPFS, and 3) a kernelized classifier for label calculation.

Comparison of Retargeting Performance: In this experiment, our proposed GMM-based retargeting is evaluated against multiple well-known generic image retargeting algorithms. More specifically, we compare with seam carving (SC) and its improved variant (ISC) [15], optimized scale-and-stretch (OSS) [19], as well as saliency-guided mesh parametrization (SMP) [20]. In the first place, Fig. 4 presents a few comparative LR aerial photo retargeting results produced by the above algorithms. The results demonstrate that our proposed method retargets LR aerial images in a more aesthetically pleasing way. It is observable that the semantically significant targets in the middle are optimally kept with unnoticeable squeezing. Additionally, our algorithm retargets LR aerial photographs with the least human-perceivable distortions.
Thereafter, we organized an extensive user study to compare with the aforementioned retargeting techniques. We recruited 40 master/PhD students from our Information Systems College. Each volunteer was invited to aesthetically compare two settings: a) both the retargeted and the original LR aerial photographs, and b) only the retargeted aerial photo. We simply follow the empirical settings in [17]. Herein, we calculate the popular agreement coefficient to measure the volunteers' views on the LR aerial photographs generated by different retargeting techniques. In this comparison, a low agreement score indicates that it is difficult to decide which photo is more aesthetically pleasing. The obtained agreement coefficients over the entire set of experimented LR aerial photos are illustrated in Fig. 5. We notice that, in Fig. 5(a), the agreement coefficient drops significantly when no reference LR aerial photos are presented. Notably, the 40 volunteers reached strong agreement on visual factors such as ''face/people'' and ''symmetry''. The potential reason is that the human face is semantically meaningful in human visual perception, and symmetry is a key aesthetics-related attribute. In Fig. 5(b), we observe that the 40 volunteers generally believe that our algorithm performs overwhelmingly better than its competitors on the attributes ''face/people'' and ''texture''. Meanwhile, our method slightly outperforms its counterparts on the other attributes.

A. STEP-BY-STEP PERFORMANCE VALIDATION
This experiment evaluates each component in our LR aerial image categorization pipeline. First of all, we test the adopted active learning algorithm: we abandon this component and use K randomly selected image patches instead (scenario S11), or select the central K image patches inside each aerial image (scenario S12). As shown in the second column of Table 4, both scenarios caused sharp performance drops. This reflects the importance of capturing human visual perception in LR aerial image representation.
Next, to show how important it is to preserve the sample distribution during feature selection, we compare our DPFS with four well-known feature selection algorithms [42], [43], [44] in the literature, denoted by S21, S22, S23, and S24, respectively. We notice that none of these four compared feature selectors can explicitly preserve the sample distribution. As the results in Table 4 show, removing the sample-distribution component from feature selection causes at least a 4% performance drop. This clearly shows the indispensability of our designed feature selector.
Finally, we test the effectiveness of our kernelized GSP representation for describing an aerial image, wherein the following settings are utilized. First, an aggregation-guided multi-layer CNN collects the calculated labels from all the image patches inside each aerial image; these labels are accumulated to calculate the ultimate aerial image label (S31). Second, our deployed linear kernel is replaced by the polynomial kernel (S32) and the radial basis function kernel (S33), respectively. We report the categorization accuracy changes in Table 4, and we notice that the first scheme hurts aerial image categorization significantly.

B. CATEGORIZATION BY ADJUSTING PARAMETERS
We have two adjustable parameters in our designed categorization pipeline. The first is the number K of actively selected attractive image patches; the second is the number of selected features. Herein, we test the LR aerial image classification performance by varying these two parameters.
For the first parameter, we tune K from one to ten progressively, while the other parameters are kept at their defaults. Herein, the default values are decided using 5-fold cross validation on an aerial image set containing 12000 samples. As the curve in Fig. 8 shows, increasing K raises the LR aerial image categorization accuracy continuously to a high level; thereafter, the accuracy declines slowly.
In the next experiment, we test the visual categorization accuracy by adjusting the number of selected features. As reported at the bottom of Fig. 8, the categorization accuracy increases significantly when the number of selected features grows from one to five, and then remains stable as this number increases further. In practice, aiming at a fast and powerful LR aerial image recognition framework, we set this parameter to five.
Finally, the proportion of labeled aerial photos is adjusted from 10% to 100% with a step of 10%. Noticeably, the labeled aerial photos are randomly selected. We repeat the experiment 20 times and report the average categorization precisions. As shown in Fig. 10, the designed method handles aerial photo categorization well when no less than 40% of the aerial photos are labeled. That is, our method can tolerate up to 60% unlabeled LR aerial photos. To the best of our knowledge, such an attribute is very useful for real-world LR aerial photo categorization. Besides, we present the LR aerial image retargeting results obtained by tuning K and the number of selected perceptual features. As shown in Figs. 6 and 7, the results are consistent with those presented in Fig. 8. That is, the optimal retargeting results are observed when K > 5 and when more than five features are selected, respectively.

V. CONCLUSION
Successfully retargeting LR aerial images is an important task in intelligent systems. This work introduces a new LR aerial image retargeting system, wherein deep GSP-based visual representations are calculated and subsequently refined using HR aerial photos. The proposed pipeline contains three parts: 1) an active learning paradigm calculating GSPs from multi-resolution aerial photographs, 2) a new DPFS for effectively obtaining qualified features, and 3) a probabilistic retargeting model using GMM. Extensive experimental results have shown our method's effectiveness.

FIGURE 2. The flowchart of classifying LR aerial images by leveraging the proposed DPFS, which optimally preserves the geometry structure among samples during feature selection.
We represent X = [x_1, ..., x_N] ∈ R^{D×N} as the matrix of features, that is, each training LR or HR aerial image is represented by a D-dimensional feature vector. Herein, N represents the number of training samples. Meanwhile, we represent L = [l_1, ..., l_M, l_{M+1}, ..., l_N]^T ∈ {0, 1}^{N×C} as the matrix containing the labels of the entire set of multi-resolution aerial images at the training stage, where C denotes the number of semantic classes. In this context, l_uv represents the v-th label of l_u (1 ≤ v ≤ C): l_uv = 1 if the u-th aerial image comes from the v-th category, and l_uv = 0 otherwise.
Each diagonal element of V is set to a large constant if the corresponding sample is labeled, and to 1 otherwise. Noticeably, V enforces the predicted category label matrix P to be maximally consistent with the ground-truth one L. Based on (12), the graph Laplacian semi-supervised FS can be mathematically represented as:

arg min_{P,Q} tr(P^T TP) + tr((P − L)^T V(P − L)) + σ ||X^T Q − P||_F^2 + τR(Q). (13)

FIGURE 3. An illustration of the grid-shrinking-based LR aerial photo retargeting.
In the model learning stage, each component's time cost is as follows: 10h44m (component 1), 3h22m (component 2), and 6h58m (component 3). During model evaluation, each component's time cost is: 232ms (component 1), 317ms (component 2), and 68ms (component 3). Notably, for component 1, most of the time is spent during model training. For real-world AI systems, we can take full advantage of Nvidia GPUs; in this way, component 1 can be accelerated by 10× through program parallelization.

FIGURE 5. Statistics of our designed user study toward the five retargeting algorithms.
In the first ablation scenario, the active learning component is replaced by K randomly selected image patches (S11); in the second, it is replaced by the central K image patches inside each aerial image (S12), since humans tend to fixate on the central regions of a picture. The corresponding results are given in the second column of Table 4.

FIGURE 7. Retargeted LR aerial images by tuning the number of selected features.

FIGURE 8. Categorization accuracy by changing K (top) and the selected feature number (bottom).

FIGURE 9. The objective function value by varying the iteration number.

FIGURE 10. Categorization accuracy by varying the proportion of labeled aerial photos.

TABLE 2. Computational time of the compared recognition algorithms (the best results are bolded).

TABLE 3. Comparative average categorization accuracies among the six FS algorithms.

TABLE 4. Performance enhancement and decrement by adjusting each module.