Clustering of Cities Based on Their Smart Performances: A Comparative Approach of Fuzzy C-Means, K-Means, and K-Medoids

The smart city is recognized as a potential approach to addressing serious urban issues such as traffic, pollution, energy use, and waste management. It is therefore vital to evaluate how smart cities are in order to put these approaches into practice. Numerous reports are produced to offer guidance on these matters, one of which is the Smart City Index (SCI). The Institute for Management Development (IMD) and the Singapore University of Technology and Design (SUTD) collaborate on the SCI every year. The report evaluates how residents view the structures and technological applications that are available in their cities. Although the report offers a thorough examination of the cities, the city clusters should be more sensitive and should not be created using strict clustering techniques. To address this problem, the clustering algorithms K-Medoids, Fuzzy C-Means, and K-Means are used; the fuzzy approach in particular is more robust to vagueness and retains more information than hard clustering approaches. The main goal of this study is to categorize cities in a scientific manner (using clustering algorithms) based on SCI data and to show how the chosen approaches deal with the associated problems. The primary innovation of the present study is the use of clustering techniques in reports where indexes are used. The results indicate that grouping the cities on the basis of their smart indicators would not be as effective as using the three clustering techniques suggested in this paper. These results add to the analysis of the dynamic capacities of smart cities and highlight the sustainability of these tactics.


I. INTRODUCTION
In 2021, 56% of people lived in cities, and by 2050, 68% of the population is expected to be urban according to the World Bank's World Development Indicators (WDI) database [1], [2]. There are relatively few nations where rural shares are anticipated to surpass urban shares by 2050. Although urbanization has positive impacts on society (welfare, economic growth, better education, increased healthcare services, etc.) [3], [4], rapid urbanization has been perceived in many nations as a threat to urban life, bringing problems such as a lack of energy, contamination of the environment, traffic congestion, social inequality, lack of public services, and land loss [5], [6], [7]. To overcome these negative effects, cities need to be organized and managed more efficiently and effectively [8].
The associate editor coordinating the review of this manuscript and approving it for publication was Mauro Tucci.
The "smart city" idea has been developed as a new technology-driven mechanism in the search for answers to these issues. Although "smart city" is a very popular term that has been heard and used for years, it has no clear, specific definition. According to Almirall et al. [9], transportation, civic entrepreneurship, democratic transparency, sustainable energy, and service delivery are all components of the concept of "Smart Cities", which covers the majority of the functions performed by local governments. Information and communication technology (ICT) is used as a transformative tool to make different regions "smart," which is the one area of commonality among the definitions. In other words, a smart city is a new trend in urban planning that integrates resources to deliver better urban services based on ICT use. These solutions have a high potential for success while also being viable and environmentally benign [10]. Following the definition proposed by the European Commission, a smart city is a place where traditional networks and services are made more efficient with the use of digital solutions for the benefit of its inhabitants and business [11]. According to the smart cities report issued by the World Bank, it is a city that leverages the latest in technology and connectivity to make better decisions and achieve the urban aspirations of its residents [12]. As can be noticed, both definitions refer to digitalization for a better life for the city's inhabitants.
The development of sustainable smart cities faces a range of multifaceted challenges. These encompass financial limitations that impede technology deployment and infrastructure creation, coupled with scarcities in essential resources such as technology, equipment, skilled personnel, and financial backing. Furthermore, resident unawareness of smart city services, compounded by social acceptability issues towards unfamiliar technologies, can hinder successful adoption. Institutions grapple with capacity shortages, hindering the effective management and maintenance of smart city technologies. Moreover, the absence of operational frameworks, standardization disputes, and gaps in strategic planning and governance coordination can thwart cohesive development efforts. These challenges can be addressed through strategies such as fostering partnerships to increase budgets, launching comprehensive public awareness campaigns, facilitating stakeholder consensus-building for unified standards, integrated strategic planning to ensure phased development, boosting capacity through targeted training, establishing streamlined governance structures, implementing standardized system protocols, promoting collaborative content creation, and integrating sustainability principles into all facets of smart city initiatives. By embracing these solutions, cities can navigate these complexities and cultivate more efficient, inclusive, and enduring smart city environments [13].
The issue of delineating and quantifying smart city performance has always been a primary concern for policy developers and city planners, and numerous efforts, including indexes, have been presented (e.g., [14], [15], [16], [17], [18], [19]). These indices provide the data required to characterize a city's or nation's smart situation in support of public policies and of regional, national, and international initiatives and strategies. One of these reports, the SCI, is published annually through the combined efforts of the IMD and the SUTD [11]. The SCI is the pioneer global index that rates metropolitan regions on the basis of the opinions of their citizens, who determine how "smart" a city is. It assesses how residents see the scope and results of urban intelligence initiatives. Cities are ranked according to the SCI on the basis of technological, economic, and public assessments of how "smart" they are. Three reports were published between 2019 and 2021. They were carried out in 118 metropolises in five major dimensions: health and safety, activities, governance, opportunities, and mobility [20]. When the SCI is established, countries are grouped together to find collaborative solutions to the problems. Indexes of this nature serve as helpful resources for overcoming challenges: they provide insights into areas that require improvement and enable concerted efforts to address deficiencies.
Finding the differences and similarities among cities is fundamental in determining which cities are suitable or unsuitable for creating and putting into effect a policy for smartness. Although the aforementioned SCI report provides an extensive analysis to assess the cities, the city clusters ought to be more delicate and should not be determined using strict clustering techniques. If cities are not clustered correctly, it will be difficult to create road maps for smart urbanization. Clustering analysis, or simply clustering, is a multivariate statistical technique that helps to break up big data sets into subgroups based on similarities and examines various correlations and patterns [21]. The aim of this approach is to partition the data set so that, as far as possible, similar items are assigned to the same cluster and dissimilar items are assigned to separate clusters.
In our study, we employed a soft computing technique (Fuzzy C-Means) in addition to hard computing techniques (K-Means and K-Medoids). This paper therefore aims to frame how cities can be clustered according to their smartness indicators. Its primary goal is to categorize cities using a more scientific approach based on the SCI, and to demonstrate the effectiveness of the selected methods in addressing such issues. In contrast to our study, which also uses soft computing (Fuzzy C-Means), the SCI relies on hard computing techniques only. As the clustering of groups (cities) influences the efforts of smart city policy makers, the research questions considered in this paper are: (i) is it possible to apply soft and hard computing approaches to classify such indexes, (ii) which cities fall into which clusters, and on the basis of which similarities, (iii) which clustering method provides a better result, and finally (iv) what differences will be identified in comparison to the existing clusters presented by the SCI report?
The structure of the study is as follows. A synopsis of the pertinent literature is provided in Section II. The clustering approaches used are discussed in Section III. The SCI report is presented in Section IV, along with a clustering of the cities included in the SCI report using the suggested methods; the findings are also displayed and discussed there. Conclusions and limitations of this strategy are discussed in Section V.

II. LITERATURE REVIEW
This section reviews the literature at two levels: (i) reviews that analyze smart cities in terms of data analytics, and (ii) current studies that use the K-Medoids, Fuzzy C-Means, and K-Means clustering algorithms.
Moustaka et al. [22] conducted a systematic review of Smart Cities (SC) by focusing on data harvesting (DH) and data mining (DM) processes. According to their survey findings, "IoT" and "smart mobility" are the themes that researchers are most interested in, although "crowd-sensing" and "smart living" are also hot topics. In addition to the conventional DM approaches, the methods most frequently utilized for the harvesting and mining of urban data are statistics, fuzzy logic, machine learning, visualization, neural networks, web mining, and text mining. Transportation, security/emergency, health, and environmental services are the most popular applications of urban data in smart cities, per the research's findings. Finally, their research findings revealed that while smart applications collect and depict open and massive data created by sensors or users, applications that evaluate human behavior for multiple objectives are favored. Soomro et al. [23] reviewed the SC domain in reference to big data analytics. Moreover, Souza et al. [24] analyzed the DM and machine learning methods used in the SC domain and conducted a network analysis using journal co-citations, author co-citations, and keyword co-occurrence to identify potential future research trends. According to their evaluation results, predictive analytics, which is mostly employed in the fields of smart mobility and the smart environment, is the most frequently used technique. Ageed et al. [25] investigated the use of big data to create more innovative societies. They also examined the advantages and disadvantages of applying big data systems in smart cities. In the study [26], the authors conducted an extensive analysis to determine the primary DM methods applied in SC. Their results revealed that classification algorithms such as decision trees, random forest, support vector machine, Bayesian network, K-nearest neighbors, neural network, deep learning, and reinforcement learning are frequently used in smart cities. In addition, the most frequent words and trending research topics are IoT and big data. Sarker [27] investigated the contributions of SC data science to decision-making in SC systems and services. To this end, ten application domains were identified, such as public safety, transportation, education, and healthcare, and the systems and services in these domains were analyzed in detail.
The clustering methods K-Means, Fuzzy C-Means, and K-Medoids have often been employed in a variety of applications. For example, K-Means, K-Medoids, and Fuzzy C-Means algorithms were used for nuclear medicine image segmentation [28]. A density Fuzzy C-Means algorithm was suggested for segmenting brain MR images [29]. For the purpose of detecting brain tumors, the watershed algorithm was integrated with the Fuzzy K-Means and Fuzzy C-Means algorithms [30]. Fuzzy C-Means, K-Medoids, and K-Means clustering algorithms were also utilized for cancer prediction and detection [31], [32], [33], [34], [35]. The aforementioned algorithms were also used to cluster the spread of the COVID-19 pandemic, one of the most serious public health problems that humanity has ever experienced, and to determine the current state of nations [36], [37], [38].
Several studies have also conducted comparative analyses of these three clustering algorithms. Park and Jun developed a novel K-Medoids clustering approach that was simpler and faster than existing methods. Their outcomes demonstrate that the suggested method achieves similar or better clustering quality than existing methods while being significantly faster [39]. Furthermore, Jipkate and Gohokar compared the Fuzzy C-Means and K-Means clustering algorithms and found that Fuzzy C-Means outperformed K-Means in terms of accuracy and speed [40]. Surya Narayana and Vasumathi proposed a new attribute similarity-based K-Medoids clustering technique that achieved better clustering results than the traditional K-Medoids algorithm [41]. In their study, Kurniawan et al. also compared K-Means and Fuzzy C-Means with another clustering algorithm, linkage, on the NASA active fire dataset and found that Fuzzy C-Means produced better clustering results [42]. On the other hand, Zhou and Yang investigated the effect of cluster size distribution on clustering and compared the K-Means and Fuzzy C-Means algorithms [43].
K-Medoids, Fuzzy C-Means, and K-Means are among the clustering algorithms that are frequently utilized in the context of smart cities [44]. K-Means is a very popular clustering technique in data-driven smart cities because it scales to very large data sets [26], [27]. Its computational speed and low complexity are further advantages. The K-Medoids algorithm can be utilized to analyze citizen preferences, behavioral activities, or usage in order to identify anomalies or cyber threats in SC data [27], [45]. The Fuzzy C-Means approach is more realistic in that it can provide the probability of belonging to a cluster [46]. Clustering methods have also been used in intelligent transportation systems for better traffic management [47], [48], [49], [50], [51], wireless networks [52], [53], detection of congestion [54], climate change [21], energy [55], [56], etc.
Considering the literature review presented above, the three most basic contributions of this paper to the relevant literature are: (i) This paper presents a methodological framework for clustering cities on the basis of their smartness indicators. (ii) The study introduces a scientifically robust approach that employs the Smart City Index to categorize cities. (iii) The effectiveness of the selected methods in addressing smart city challenges is empirically demonstrated.

III. METHODOLOGY
In smart cities, ICT is utilized to boost operational efficiency, inform the public, and enhance the standard of public services and citizen welfare. A smart city uses cutting-edge technology and data analysis to improve citizens' quality of life, streamline municipal processes, and foster economic development. While countries and cities try to make their communities smart, they need to determine the policies they will implement accordingly. Some international organizations have developed indexes and reports to determine which policy should be applied or the level of smartness of each community. One of these reports, and perhaps the most important one, is the SCI, according to which cities are grouped by a number of criteria. However, hard computing approaches such as simple arithmetic operations are mostly used in such reports. Therefore, in some of the results obtained, grey areas/cities (the places that are left out) may not be grouped correctly.
Clustering is a fundamental task in machine learning and data analysis. Fuzzy C-Means, K-Medoids, and K-Means are three popular clustering algorithms for datasets with many features, and they are widely used due to their effectiveness and ease of implementation. Each algorithm has strengths and weaknesses, and the choice of algorithm depends on the specific characteristics of the data and the requirements of the application. This paper aims to apply and compare the Fuzzy C-Means, K-Means, and K-Medoids clustering methods to determine the best method for grouping cities based on the smart city indicators mentioned in the SCI report. The SCI report assesses individuals' perceptions of two crucial aspects: the urban environment and its technological integration. It employs five core categories (health and safety, mobility, recreational options, opportunities, and governance) to assess each major facet. The cities are categorized into four groups (1-2-3-4) based on the UN Human Development Index (HDI) score of their economy. To perform the clustering, we used the entire dataset and then compared the outcomes with the present index. In addition, the Wilcoxon Signed Rank statistical test is employed to determine the effectiveness of the different clustering approaches.
134448 VOLUME 11, 2023 Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
The flowchart of the applied methodology is presented in Figure 1.
Fuzzy C-Means is a clustering technique that assigns weights to each member of the set, while K-Means is a simple unsupervised learning algorithm that addresses the well-known clustering problem. The K-Means procedure involves classifying a given dataset into a predetermined number of clusters (assumed to be k clusters). On the other hand, K-Medoids is a clustering technique that can handle objects with high values that may deviate from their distribution [57].
The Fuzzy C-Means clustering algorithm is a useful tool for clustering datasets due to several advantages it offers. Firstly, unlike traditional clustering algorithms that assign data points to only one cluster, Fuzzy C-Means allows data points to belong to multiple clusters. This feature is particularly beneficial when the data suggests overlapping clusters or when multiple interpretations of the same data are possible. Secondly, Fuzzy C-Means is easy to implement and does not require extensive computational resources. This makes it suitable for large datasets where other algorithms may struggle due to limited computing power. Moreover, Fuzzy C-Means is relatively fast and efficient, making it an excellent choice for time-sensitive projects. Furthermore, the algorithm generates a membership value for each data point, indicating the degree to which it belongs to each cluster, which proves useful for various types of analysis. Additionally, Fuzzy C-Means demonstrates robustness against noise and outliers in the data. It can effectively handle non-linearly shaped clusters, thereby establishing itself as a versatile tool for clustering datasets [58], [59].
K-Means clustering is a popular unsupervised machine learning algorithm known for its speed, efficiency, and simplicity. One of its main strengths is its ability to handle large datasets quickly and efficiently. This algorithm is highly flexible and can be applied to various types of data, including numerical, categorical, and binary data. K-Means clustering is also remarkably easy to implement, making it accessible to data analysts and researchers. Its simplicity makes it an excellent starting point for clustering tasks and provides insights into the structure and patterns within a dataset. Overall, the K-Means clustering algorithm is a valuable tool for data analysis, particularly for tasks that require fast and efficient processing of large datasets [60].
The K-Medoids clustering algorithm is a variation of K-Means clustering that offers several advantages over its counterpart. K-Medoids is less sensitive to noise and outliers in the dataset, making it a better choice for datasets with high variability. Additionally, it is capable of handling categorical data, unlike K-Means, which requires numerical data. Another advantage of K-Medoids is its low time complexity, making it more efficient for large datasets. The use of medoids as cluster representatives, instead of the mean value used in K-Means, makes K-Medoids clusters easier to interpret since medoids correspond to actual data points within the cluster. Overall, K-Medoids is a valuable clustering algorithm that offers advantages in terms of interpretability, efficiency, and its ability to handle noisy and categorical data [39], [41].
This section presents the standard Fuzzy C-Means, K-Means, and K-Medoids clustering algorithms, respectively.

A. FUZZY C-MEANS CLUSTERING
Hard clustering divides data into discrete clusters, assigning every data element to exactly one cluster. In fuzzy clustering, also known as soft clustering, data elements can belong to several clusters, and each element has a set of membership levels associated with it. These levels indicate the strength of the association between a data element and a specific cluster. The membership levels are established first, after which data elements are assigned to one or more clusters [40]. Fuzzy C-Means minimizes the objective function given in Eq. (1) [42].
\[ J_m = \sum_{i=1}^{D} \sum_{j=1}^{N} \mu_{ij}^{m} \, \lVert x_i - c_j \rVert^{2} \tag{1} \]
Here $x_i$ is the ith data point, $c_j$ is the center of the jth cluster, and $\mu_{ij}$ is the degree of membership of $x_i$ in the jth cluster; D is the number of data points, N is the number of clusters, and m is the fuzzy partition matrix exponent that determines the degree of fuzzy overlap, with m > 1. The steps of the Fuzzy C-Means clustering algorithm are as follows:
Step 1. Set N, m, and ε.
Step 2. Randomly initialize the fuzzy partition matrix $\mu^{(0)}$.
Step 3. Calculate the cluster centers by using Eq. (2).
\[ c_j = \frac{\sum_{i=1}^{D} \mu_{ij}^{m} \, x_i}{\sum_{i=1}^{D} \mu_{ij}^{m}} \tag{2} \]
Step 4. Update the membership values by using Eq. (3).
\[ \mu_{ij} = \frac{1}{\sum_{k=1}^{N} \left( \frac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}} \tag{3} \]
Step 5. Determine $J_m$, the objective function.
Step 6. Continue to perform steps 3-5 until $J_m$ improves by less than a predetermined minimum threshold (ε) or until a predetermined maximum number of iterations is completed.
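The steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `fuzzy_c_means`, the default values of `m`, `eps`, and `max_iter`, and the toy points are all hypothetical, and the update formulas are the standard Fuzzy C-Means center and membership updates.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_c_means(points, n_clusters, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Return (centers, memberships); memberships[i][j] is the degree of point i in cluster j."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Step 2: random fuzzy partition matrix whose rows sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    prev_j = float("inf")
    centers = []
    for _ in range(max_iter):
        # Step 3: cluster centers as membership-weighted means.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(len(points))]
            w_sum = sum(w)
            centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / w_sum for d in range(dim)]
            )
        # Step 4: membership update; (d^2)^(-1/(m-1)) equals d^(-2/(m-1)).
        for i, p in enumerate(points):
            inv = [max(sq_dist(p, c), 1e-12) ** (-1.0 / (m - 1.0)) for c in centers]
            total = sum(inv)
            u[i] = [v / total for v in inv]
        # Step 5: objective function J_m.
        j_m = sum(
            u[i][j] ** m * sq_dist(p, centers[j])
            for i, p in enumerate(points)
            for j in range(n_clusters)
        )
        # Step 6: stop when J_m improves by less than the threshold.
        if abs(prev_j - j_m) < eps:
            break
        prev_j = j_m
    return centers, u


# Hypothetical toy data: two well-separated groups of three points each.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_centers, toy_u = fuzzy_c_means(toy_points, n_clusters=2)
```

Each row of the returned membership matrix sums to one, so a point lying between two groups receives comparable degrees in both instead of being forced into a single cluster.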
Fuzzy C-Means is a clustering algorithm that is used to partition data into clusters and has several strengths that make it an attractive option for certain types of data. Some of the main strengths of FCM are [58], [59]:
i. Instead of only allowing data points to be a part of one cluster, Fuzzy C-Means allows them to belong to multiple clusters, which can better reflect the true nature of the data.
ii. It is relatively simple and easy to implement, and does not require a lot of computational resources.
iii. It is relatively fast and efficient, especially for large datasets.
iv. It produces a membership value for each data point, indicating the degree to which it belongs to each cluster, which can be useful for certain types of analysis.
v. It is robust to noise and outliers in the data.
vi. It can handle non-linearly shaped clusters.
Overall, Fuzzy C-Means is a useful clustering algorithm that can be applied to a variety of data types and is especially useful for data that does not clearly fit into a single cluster.
For example, Andrejovska analyzed data from 2012 to determine the insolvency of EU countries using several different methodological approaches, including variants of agglomerative hierarchical cluster analysis, the outputs of K-Means and K-Medoids, and Fuzzy C-Means [61].Moreover, Anggoro et al. clustered the cities/districts in eastern Indonesia through the Fuzzy C-Means algorithm according to ICT vulnerability [62].The use of the Fuzzy C-Means clustering method to analyze and understand the epidemiological situation of COVID-19 in European countries, and to track changes in this situation over time is discussed in [63].

B. K-MEANS CLUSTERING
K-Means clustering, often known as hard C-Means clustering, is an unsupervised, non-hierarchical partitioning technique for data analysis. K-Means clustering divides data points into a predetermined number of clusters. The approach groups the data based on the locations of, and the distances between, the input data points [64]. K-Means is mostly used to maximize inter-cluster distance and minimize intra-cluster distance. With $c_i$ denoting the centroid of the ith cluster and K the number of clusters, it computes the objective function using Eq. (4) [43].
\[ J = \sum_{i=1}^{K} \sum_{x_j \in V_i} \lVert x_j - c_i \rVert^{2} \tag{4} \]
where $V_i$ is the ith cluster and $x_j$ is the jth data object. The steps of the K-Means clustering algorithm are listed below [40]:
Step 1. Choose the number K of desired clusters.
Step 2. Create K initial clusters and determine each cluster's centroid.
Step 3. Assign each point to the cluster whose centroid is nearest to it.
Step 4. Recalculate the new K centroids.
Step 5. Repeat steps 3 and 4 until no point changes its cluster assignment or until the centroids no longer change.
Some of the strengths of the K-Means clustering algorithm are as follows [60]:
i. It is fast and efficient in terms of computational time.
ii. It is highly flexible and can be used for a variety of data types.
iii. It is simple and easy to implement.
iv. It can handle large data sets.
Latifah et al. [65] used the K-Means and Fuzzy C-Means algorithms to cluster regencies/cities in Central Java Province on the basis of the Human Development Index (HDI). The HDI is an indicator used by the United Nations Development Programme (UNDP) to measure and compare the development levels of different countries; it assesses human development based on three key dimensions: life expectancy (health), education (access to knowledge), and per capita income (standard of living). The application and comparison of the K-Medoids, K-Means, and Fuzzy C-Means algorithms for clustering provinces according to stunting, a widely known form of malnutrition suffered by toddlers, are presented in [66].
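The K-Means steps listed in this subsection can be sketched as follows. This is only an illustrative sketch, not the authors' code: the function name `k_means`, the choice of initial centroids (K randomly sampled data points), and the toy data are assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Return (centroids, labels, sse) for a plain Lloyd-style K-Means run."""
    rng = random.Random(seed)
    # Steps 1-2: K is given; start from K distinct data points as centroids (an assumption).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Step 3: assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        # Step 5: stop once no point changes its cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster became empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    # Sum of squared distances of the points to their assigned centroids.
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse


# Hypothetical toy data: two well-separated groups of three points each.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_centroids, toy_labels, toy_sse = k_means(toy_points, k=2)
```

Unlike the fuzzy variant, each point ends up with exactly one label, which is why K-Means is referred to as a hard clustering method in this paper.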

C. K-MEDOIDS CLUSTERING
K-Medoids clustering is a method of partitioning a set of data points into k clusters, where each cluster is represented by a medoid [67].A medoid is a data point within a cluster that is representative of the other points in the cluster.The medoids are chosen such that the sum of the distances between each data point and its nearest medoid is minimized.
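The definition above (medoids chosen to minimize the summed distance of each point to its nearest medoid) can be illustrated with a small swap-based search in the spirit of the PAM procedure. The sketch below is an assumption-laden illustration rather than the variant used in the study: the function names, the random initialization, the Euclidean distance, and the toy data are hypothetical choices.

```python
import random


def euclid(a, b):
    """Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid (medoids are indices)."""
    return sum(min(euclid(p, points[m]) for m in medoids) for p in points)


def k_medoids(points, k, seed=0):
    """Return (medoid_indices, labels, cost) using a greedy swap search."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)  # random initial medoids
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                # Try replacing one medoid by a non-medoid point.
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(points, trial)
                if cost < best - 1e-12:  # accept only strict improvements
                    medoids, best, improved = trial, cost, True
    labels = [
        min(range(k), key=lambda j: euclid(p, points[medoids[j]])) for p in points
    ]
    return medoids, labels, best


# Hypothetical toy data: two well-separated groups of three points each.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_medoids, toy_labels, toy_cost = k_medoids(toy_points, k=2)
```

Because every medoid is an index into the input list, each cluster representative is an actual observation (for instance, a real city) rather than an averaged point.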
K-Medoids clustering is similar to K-Means clustering, which uses means as the cluster centers. However, K-Medoids is more robust to noise and outliers than K-Means, and is generally more suitable for dealing with non-linearly distributed data [39]. Some strengths of K-Medoids clustering include [39], [41]:
i. It is less sensitive to noise and outliers.
ii. It can handle categorical data.
iii. K-Medoids has low time complexity, making it more efficient for large datasets.
iv. The use of medoids as cluster representatives makes K-Medoids clusters easier to interpret.
Özdemir and Kaya used both the K-Medoids and Fuzzy C-Means algorithms to classify member countries of the Organisation for Economic Co-operation and Development (OECD) on the basis of their CO2 emissions from fossil fuel consumption [21]. In [68], these methods are used to cluster the districts of Riau Province, a province marked by social inequality and a low welfare index.

IV. SMART CITY INDEX
The data are obtained from the Smart City Index report, which results from an annual collaboration between the Institute for Management Development (IMD) and the Singapore University of Technology and Design (SUTD). The IMD-SUTD SCI assesses residents' perceptions of the available structures and technology applications in their respective cities. The SCI ranks cities globally by collecting the perceptions of 120 residents from each city. The IMD-SUTD SCI rates people's impressions of two key factors: the city's physical layout and its use of technology. Five primary categories (health and safety, mobility, activities, opportunities, and governance) are used to evaluate each main title. The factor ratings are determined as shown in Table 1.

TABLE 2. Comparison for group 1 (structures).
TABLE 10. Comparison of clustering methods in terms of their efficiency.
An illustration of the index result for Abu Dhabi is shown in Figure 2. The city's place among all other cities, determined according to the rating and its components, is referred to as its "smart city rating." On the basis of the economy's UN Human Development Index (HDI) score, the cities are placed in four groups (1-2-3-4). Abu Dhabi falls into "Group 3" here. Cities are awarded a "rating scale" within each group (AAA to D). The rating of Abu Dhabi is "B". Additionally, its factor ratings for "structures" and "technologies" are "BB" and "B", respectively. The rating scale groups are presented in Figure 2. The SCI's first edition, which was published in 2019, examines 102 cities across the globe by incorporating the opinions of 120 residents in each location. The SCI's second [...]
[...] Table 2 is provided here; the rest of them (Tables 3 to 9) are available as supplementary material. In addition, the tables display the class membership values for the Fuzzy C-Means algorithm. Accordingly, a city is said to belong to the class for which it has the largest membership value. For example, the largest value for Amsterdam in Table 2 is 0.4273, so Amsterdam's rating scale is "A". Table 2 and Supp Tables 3 to 5 in the supplementary material show the results for the "structure" main title, and Supp Tables 6 to 9 in the supplementary material show the results for the "technologies" main title. Table 2 and Supp Table 6 show the results for cities in the highest HDI quartile. Supp Table 3 and Supp Table 7 display the results for the second HDI quartile. Supp Tables 4 and 8 show the results for the third HDI quartile and finally, Supp Table 5 and [...]
According to the results from both the SCI and the three algorithms, Lausanne, Oslo, and Zurich have the highest scores under the "structure" title, as indicated in Table 2. Supp Table 3 in the Supplementary Material reveals that Singapore has the highest score among the cities under the "technology" title, based on the results of the index and algorithms.
In this section, we present the analysis of the clustering methods on the basis of their effectiveness in clustering data. Specifically, we evaluate the minimum square error (MSE), the number of cities with similar rating scales, and the number of cities with dissimilar rating scales, as compared to the SCI report. The MSE is a measure of the overall quality of the clustering, where a smaller value indicates a better fit of the clustering model to the data. The number of cities with similar rating scales indicates the ability of the clustering method to group cities with similar characteristics together. On the other hand, the number of cities with dissimilar rating scales indicates the clustering method's ability to differentiate between cities with different characteristics. By evaluating the clustering efficiency using these three metrics, we can gain a more comprehensive understanding of the strengths and limitations of each method. This information will be invaluable in selecting the most appropriate clustering algorithm for a particular dataset and research question.
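The two count-based metrics described above can be illustrated with a short sketch. It covers only the comparison of rating scales with the SCI report; the function name `rating_agreement`, the city names, and the ratings are hypothetical placeholders and are not taken from the study's tables.

```python
def rating_agreement(sci_ratings, cluster_ratings):
    """Compare ratings from a clustering method with the SCI ratings.

    Both arguments map a city name to a rating label. Returns the number of
    cities with the same rating, the number with a different rating, and the
    list of cities that differ.
    """
    differing = [
        city
        for city, rating in sci_ratings.items()
        if cluster_ratings.get(city) != rating
    ]
    n_same = len(sci_ratings) - len(differing)
    return n_same, len(differing), differing


# Hypothetical ratings for four placeholder cities.
sci = {"City 1": "AAA", "City 2": "AA", "City 3": "A", "City 4": "AA"}
clustered = {"City 1": "AAA", "City 2": "A", "City 3": "A", "City 4": "AA"}
n_same, n_diff, diff_cities = rating_agreement(sci, clustered)
```

A method whose ratings coincide with the index for more cities has a higher first count and a lower second count.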
The comparative results of the clustering methods are given in Table 10. According to the results, the Fuzzy C-Means algorithm outperformed K-Means and K-Medoids on SCI data for groups 1 and 2 under the "structure" title, as it produced the results that were closest to the SCI results. In contrast, both K-Means and K-Medoids performed less favorably. On the other hand, the Fuzzy C-Means and K-Means clustering algorithms showed similar performance for groups 3 and 4 under the "structure" title. Under the "technology" title, K-Means generally had the highest performance among the tested algorithms. However, for group 3, both Fuzzy C-Means and K-Medoids outperformed K-Means.
The Wilcoxon signed-rank test is used to determine which clustering approach is more effective. We propose two hypotheses for this purpose. The hypotheses are as follows:
H0: The Fuzzy C-Means and K-Means algorithms do not produce results that are more similar to the index than those of the K-Medoids algorithm.
Ha: The Fuzzy C-Means and K-Means algorithms produce results that are more similar to the index than those of the K-Medoids algorithm.
While Tables 11 and 12 show the Wilcoxon signed-rank test results of K-Medoids compared to Fuzzy C-Means and K-Means, respectively, summarized results are given in Table 13.
According to the Wilcoxon signed-rank test, if the test value is less than the critical value, the null hypothesis (H0) is rejected (Table 13). Rejecting H0 means that there is enough evidence at the 5% level of significance to support the claim that the Fuzzy C-Means and K-Means algorithms produce results that are more similar to the index than those of the K-Medoids algorithm.
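The test can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the computation behind Tables 11 to 13: the paired samples are hypothetical counts of cities whose rating differs from the SCI, one pair per group and title. Instead of looking up a critical value, the sketch computes an exact one-sided p-value by enumerating all sign assignments, which leads to the same decision for small samples.

```python
from itertools import product


def signed_ranks(x, y):
    """Return (w_plus, w_minus, ranks) for paired samples.

    Zero differences are dropped; tied absolute differences share
    their average rank.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, ranks


def exact_one_sided_p(w_minus, ranks):
    """P(W- <= observed W-) under H0, over all 2^n sign assignments."""
    hits = 0
    total = 0
    for signs in product((0, 1), repeat=len(ranks)):
        total += 1
        if sum(r for r, s in zip(ranks, signs) if s) <= w_minus:
            hits += 1
    return hits / total


# Hypothetical mismatch counts for 8 group/title combinations.
k_medoids = [12, 15, 9, 14, 11, 16, 10, 13]
fuzzy_c_means = [7, 9, 8, 6, 12, 5, 4, 3]

w_plus, w_minus, ranks = signed_ranks(k_medoids, fuzzy_c_means)
p_value = exact_one_sided_p(w_minus, ranks)
reject_h0 = p_value < 0.05
print(w_plus, w_minus, p_value, reject_h0)
```

A small sum of negative ranks means K-Medoids rarely beats the other method, which is the evidence in favor of Ha. Enumeration is only practical for small n; larger samples would normally use a normal approximation or a statistics library.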
The status of countries is determined by averaging the fuzzy membership values of the cities within each country.A country's rating scale is determined according to the scale with the highest membership value among the calculated averages.Figure 3 illustrates the rankings of countries under the ''structure'' title.Accordingly, the countries that receive the highest rating (AAA) in descending order of membership value are as follows: Norway, Singapore, Switzerland, Finland, and Denmark.Figure 4 shows the rankings of countries under the ''technology'' title.According to this, the countries with the highest rating of membership value are Hong Kong, New Zealand, and Singapore.
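The country-level aggregation described above can be sketched as follows: membership values are averaged over the cities of each country, and the country receives the scale with the highest averaged value. The function name, the three-scale label set, the city-to-country mapping, and all membership values are hypothetical.

```python
from collections import defaultdict


def country_ratings(memberships, city_to_country, scales):
    """Average fuzzy memberships per country and pick the top scale.

    Returns {country: (scale, averaged membership of that scale)}.
    """
    grouped = defaultdict(list)
    for city, row in memberships.items():
        grouped[city_to_country[city]].append(row)

    result = {}
    for country, rows in grouped.items():
        averages = [sum(column) / len(rows) for column in zip(*rows)]
        best = max(range(len(averages)), key=averages.__getitem__)
        result[country] = (scales[best], averages[best])
    return result


# Hypothetical inputs (not taken from the paper).
scales = ["AAA", "AA", "A"]
memberships = {
    "Zurich": [0.6, 0.3, 0.1],
    "Lausanne": [0.5, 0.4, 0.1],
    "Oslo": [0.7, 0.2, 0.1],
    "Lisbon": [0.2, 0.3, 0.5],
}
city_to_country = {
    "Zurich": "Switzerland",
    "Lausanne": "Switzerland",
    "Oslo": "Norway",
    "Lisbon": "Portugal",
}

ratings = country_ratings(memberships, city_to_country, scales)

# Countries holding the top scale, in descending order of membership value.
top = sorted(
    (c for c, (scale, _) in ratings.items() if scale == "AAA"),
    key=lambda c: ratings[c][1],
    reverse=True,
)
print(ratings, top)
```

Sorting the countries that share a scale by their averaged membership gives an ordering of the kind reported for the AAA countries.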
Two images were created to enhance the understanding of the Fuzzy C-Means algorithm results for the ''technology'' and ''structure'' titles. Their primary objective is to make it easy to identify which city belongs to which rating. In other words, these images show which city is included in which scale, as well as the corresponding membership degree. For example, according to Figure 5, both Hong Kong and Singapore are part of the AAA scale with a membership degree of approximately 0.4, whereas Amsterdam is included in the AAA scale with a higher membership degree of approximately 0.6. As a result, Amsterdam is ranked higher than Hong Kong and Singapore in the context of smart cities. Figure 5 depicts the clusters of smart cities as represented by their fuzzy membership values according to the ''technology'' title. While Hong Kong, Singapore, and Amsterdam are rated as AAA, Rio de Janeiro, Nairobi, Beijing, and Lagos are rated as D in terms of the ''technology'' title. The clusters of smart cities according to the ''structure'' title are illustrated in Figure 6 through their fuzzy membership values. While Singapore is rated as AAA, Rio de Janeiro, Nairobi, and Lagos are rated as D for both the ''technology'' and ''structure'' titles.

V. CONCLUSION
The demand for cities to offer improved public services that affect people's daily lives is rising. Although cities make up only 3% of the earth's land, they are the backbone of our economy, use 75% of the world's energy, and generate 80% of the greenhouse gas emissions produced worldwide [67]. The so-called smart cities have made efficient use of resources and improved living conditions via the application of technology. The 2021 SCI, published by the IMD in partnership with SUTD, includes significant insights on how urban inhabitants will handle new duties. The report groups the cities into different clusters to interpret smart indicators better and more effectively. Although the study offers a thorough examination of the cities for evaluation, the city clusters should be more sensitive and not produced using strict clustering techniques. This paper proposes the K-Means, Fuzzy C-Means, and K-Medoids clustering algorithms, which are much more informative and flexible than any hard clustering methods, and computes the final score using the same method as the index. The effectiveness of the different clustering methods in grouping data is evaluated. Our analysis focuses on three metrics: mean squared error (MSE), the number of cities with similar rating scales, and the number of cities with dissimilar rating scales, in comparison with the SCI report. A lower MSE is generally considered advantageous. The MSE measures the average squared difference between predicted values and actual values in a dataset. A lower MSE indicates that the predicted values are closer to the actual values, implying higher accuracy in the model's predictions. Within this framework, K-Means demonstrates the best performance, with an MSE of 0.033 for the ''technology'' category, group 4.
Our results indicate that, for groups 1 and 2 in the ''structure'' title, the Fuzzy C-Means algorithm demonstrated superior performance compared to K-Means and K-Medoids. This is evidenced by the Fuzzy C-Means algorithm producing results that closely match those of the SCI report, while K-Means and K-Medoids had less favorable performance. In contrast, for groups 3 and 4 in the ''structure'' title, both Fuzzy C-Means and K-Means algorithms showed similar performance. In the ''technology'' title, K-Means generally demonstrated the highest performance among the tested algorithms. However, for group 3, both Fuzzy C-Means and K-Medoids outperformed K-Means. These findings suggest that the selection of the most appropriate clustering algorithm may depend on the specific research question and dataset characteristics. In summary, it can be inferred that the Fuzzy C-Means and K-Means clustering algorithms demonstrated comparable performance. The Fuzzy C-Means and K-Means algorithms produce results that are more similar to the index than those of the K-Medoids algorithm. While conducting the study, some limitations have been encountered. The first of these is that the study is limited to only 118 cities, while the other is that some of the data in the report (SCI) used as a source are obtained through a survey.
The following open research questions should be examined in further studies: (i) embedding other temporal and spatial indicators should strengthen the assessment process, (ii) the weights of the indicators should be calculated using multi-criteria decision-making approaches, and finally (iii) different clustering algorithms should be tested in similar studies. For future studies, other clustering techniques such as the K-rms algorithm, the Fast Markov clustering algorithm, or meta-heuristic data clustering algorithms could be employed and compared with the existing work.

FIGURE 1. The flowchart of the applied methodology.

FIGURE 2. An example of the index result for Abu Dhabi.

FIGURE 3. Smart cities index in terms of countries (according to structure).
edition evaluated 109 cities by gathering the opinions of 120 locals in each city. The 118 cities included in the previous edition's ranking had the same number of citizens. The final score was calculated using SCI index utilizing survey responses from the previous three years, with a weight of 3:2:1 for 2021, 2020, and 2019. The final score is calculated

FIGURE 4. Smart cities index in terms of countries (according to technology).

FIGURE 5. Clusters of smart cities (according to technology).

FIGURE 6. Clusters of smart cities (according to structure).
9 present the results for the lowest HDI quartile. The symbols (↑, ↓, and •) represent the change (increased, decreased, and stable) in ranking compared to the current report (index result).

TABLE 1. Factor ratings employed in the IMD-SUTD SCI.