Extracting Place Functionality From Crowdsourced Textual Data Using Semantic Space Modeling

Place has gained significant attention in geographic information science. Users describe places in a huge amount of user-generated textual content. This research introduces a novel approach to extract place functionality from crowdsourced textual data shared in the form of online reviews. To achieve this goal, salient features are modeled as directions in a domain-specific semantic space. We propose an unsupervised method that only requires a Bag-of-Words (BoW) of place reviews and utilizes Natural Language Processing (NLP) methods. Finally, a probabilistic multi-label functionality is predicted for each place using the semantic space constructed from the salient feature directions, and the functionality with the maximum probability is defined as the main functionality of the place. The functionality of ‘Hotels’ is determined with an average accuracy of 88.52%, while the efficiency of extracting ‘Attractions’, ‘FoodPlaces’, and ‘Shoppings’ functionalities is 65.66%, 64.99%, and 12.70%, respectively. The proposed method can help users find places that afford a specific functionality and can improve decisions in urban planning.


I. INTRODUCTION
Crowdsourcing and volunteered geographic information (VGI) are valuable for understanding urban areas [1], [2], especially when they are linked with sense of place [3]. Purves & Edwardes [4] applied VGI to begin to describe place, which is a key but neglected theme in Geographic Information Science (GIScience) [5], [6]. Computational research in this field seeks to build models which are closer to everyday understandings of the world [7], moving away from top-down models of a geographic phenomenon, towards more human-centered models [8], [9]. Since crowdsourcing textual data such as social media posts and online reviews originate from a particular individual's perspective [10], [11], they enable us to capture the qualities of subjective experiences [12] and to gain insight into various aspects of places [13], such as their functionalities. Knowing the urban function of places is essential for effective city planning and management, a clear understanding of the urban dynamics, and promoting livable, healthy, and sustainable cities [14].
The associate editor coordinating the review of this manuscript and approving it for publication was Arianna Dulizia.
The prevalent approach for responding to place-related inquiries involves using placenames, places enriched with semantics such as Gazetteers [3], or Points of Interest (POIs) [15]. However, POI categories are typically defined by municipal authorities, or their quality depends on the platforms from which the data are extracted. The number and scale of POI categories can also vary [16], while only one specific category is defined for each POI. Hence, these data may not accurately reflect the diverse functionalities of places as perceived by the public and do not contain information on how people interact with them in urban spaces [17].
When we are planning a trip to a new city, we need to rely solely on online reviews and descriptions of places without explicit information about their functionalities.
Therefore, a method is required to assist in making decisions about which places are likely to offer the experiences we are looking for and match our interests. The findings can also have implications for urban planning by providing insights into how people perceive and engage with different places. Google Maps reviews encompass a wider variety of place types, while Tripadvisor's travel focus offers more specific information. Tripadvisor's categorized data can be used as reference points to evaluate the unsupervised framework, but Google Maps lacks such categorization. An unsupervised system for extracting place functionality from textual data is motivated by its scalability, adaptability, ability to reduce bias, suitability for dynamic data, cost-effectiveness, and potential for discovering new insights [18]. Unsupervised systems do not rely on pre-defined categories, making them suitable for extracting functionalities from a wide range of places. This approach aligns with the nature of user-generated content and allows for more exploratory analysis of the place functionalities.
To address these issues, the principal aim of this paper is to identify the possible functionalities of a place from user-generated content in the form of online reviews, as a step toward the everyday perception of places. An emerging question is to what extent place functionalities can be extracted from online reviews. Hence, a bottom-up approach is introduced to identify the multi-label functionalities of a place that are most relevant and meaningful to users, along with their respective probabilities. The contributions of the proposed method are as follows:
• The objective of this paper differs from the conventional approach in the literature, as it does not aim to extract unknown urban functional zones [19], [20], [21], [22], [23] but extracts the functionality of places by analyzing their reviews. Any knowledge from other resources is ignored.
• The approach is entirely unsupervised without relying on predefined categories or subjective opinions.
• In a domain-specific semantic space (e.g., user reviews in Tripadvisor), salient features are modeled as directions where each direction shows a functionality utilizing a data-driven (bottom-up) method.
• Unlike previous methods that only consider one category, we predict the probabilities of potential functionalities for each place and define the functionality with the highest probability as the main place functionality.
It should be noted that the effectiveness of the study is limited by the ambiguity of natural language, which is a challenge in the analysis of textual data using NLP [24]. The structure of the manuscript is as follows. Section II presents an overview of the previous research conducted on the extraction of place functionality. The data, study area, and basic concepts used in this work are introduced in Section III, followed by an explanation of modeling salient features used for the extraction of place functionality. The finding results are discussed

II. RELATED WORK
Considering citizens as sensors in urban areas, Goodchild [1] coined the term VGI to describe geographical content that is contributed voluntarily by users. Human activities and the functionalities afforded by places are considered integral and useful concepts for describing, formalizing, and distinguishing places in various studies [25], [26], [27], [28]. For example, Papadakis et al. proposed a theoretical, empirical, and probabilistic pattern to find places that satisfy shopping functionality [28]. However, the patterns are generated semi-automatically from knowledge obtained from expert sources or narratives. Examples of such resources include widely accepted depictions of locations found in dictionaries or encyclopedias, specialized reports like urban design manuals and standards, expert surveys, or predefined categories [17].
Recently, the functionality of places has been inferred using map-based data [29], [30], remote sensing data [31], [32], [33], [34], traffic flows such as taxi, bus, or subway trajectory data and bicycle rental records, and socially sensed human activity data like check-ins and POIs [23], [24]. For example, Crooks et al. presented the opportunities of crowdsourced and VGI data (social media, trajectory, and OpenStreetMap) to obtain new insights into form and function in urban spaces [37]. Deng et al. identified urban residential building functions by combining multi-source data related to the shape of buildings, distances to main roads, and remotely sensed images [14], but textual data is not used in their hierarchical data mining approach. In addition, the data should have the potential for extracting place functionalities. For example, although Flickr tags are textual place-based data, they are often location-based tags and do not provide information about people's activities in those locations [38].
Natural Language Processing methods are applied in various studies to analyze user-generated textual content and extract meaningful information related to places [39], [40], [41], [42], [43]. For example, the Latent Dirichlet Allocation (LDA) method is used to extract place types [44], [45], analyze spatiotemporal interaction of places [46], calculate the similarity of places [12], [47], and represent items in location recommender systems [48]. Discovering place functionality through analyzing textual data is ignored in previous research. SEDDaL, a novel Social-based Event Detection, Description, and Linkage framework, takes diverse social media data as input and generates semantically linked events with spatial, temporal, and semantic connections. This pioneering study offers a comprehensive model for describing semantic-aware events but does not extract functionalities of places [49].
It should be noted that place functionality and land use are related but distinct concepts in the context of urban planning and geographic analysis. Place functionality is more closely tied to the human perception and experience of a location. Land use is often regulated by zoning ordinances and land use planning policies implemented by local governments and refers to the purpose for which a particular area of land is utilized. While land use can influence the types of functionalities that exist in a given area, place functionality goes beyond land use by considering the experiential and social aspects of a place. It focuses on the activities and services that make a place meaningful to its users [50]. This paper focuses on extracting the functionality of specific places (such as hotels and shops) by analyzing the content of their reviews. These data offer insights into the function of urban spaces as defined by the people who actively engage in them. This innovative bottom-up approach complements traditional urban studies and provides a novel perspective for examining urban functionalities.
This paper is inspired by [51], which considered a semantic space of movies and found directions corresponding to properties such as 'Scary' or 'Romantic'. Feature directions represent specific attributes of objects within a semantic space. Think of these directions as axes in a space, each corresponding to a unique property or feature that we want to capture. Various scholars have noted that these semantic spaces frequently model salient features of the considered domain as directions [51], [52], [53]. However, none of them have used this method to extract the functionality of places. As also shown in the results, one place may have multiple functionalities. This is especially the case for more complex places such as "Attractions" in this paper. Hence, we have utilized an unsupervised approach to find the possible functionalities of a place from its reviews. To this end, we construct a semantic space using a vector space representation of documents (i.e., places) and define meaningful directions according to salient features. Each place functionality is assigned to a specific direction in this semantic space, and the probabilities of each functionality are calculated for each place.

III. METHODOLOGY
In this section, the basic concepts of our methodology are introduced to define salient features for extracting meaningful directions that represent place functionalities. Our approach is fully unsupervised and contains six steps. The overall workflow and the different steps of our method are illustrated in Figure 1.

A. DATASET AND STUDY AREA
Online reviews provide insights closer to everyday understanding of the world by reflecting the opinions and experiences of individuals who have visited and interacted with a place. These subjective, context-specific, user-generated data can be relevant for urban planning decisions and are easily accessible for a vast range of places, making them a convenient and cost-effective data source. Tripadvisor is a widely used platform that collects user reviews from residents and tourists. It offers detailed, rich, descriptive content, including ratings, and includes a large volume of reviews about places, which aids in extracting place functionalities. In contrast, other sources of textual data such as Twitter, Flickr, and Instagram primarily reflect place names or locations and do not provide information about place functionality. Above all, Tripadvisor's specialized travel focus offers more targeted information. Google Maps, in contrast, covers a broader range of place types. Tripadvisor categorizes places, providing a structured framework for understanding functionalities and enabling performance evaluation against known categories as ground truth labels, while Google Maps lacks categorization.
In the first step of our method, we collected various places and their English reviews by applying web scraping using Python libraries. We selected New York City (NYC) as our study area, which contains well-known places. These data were written in various timeframes but were available on the Tripadvisor website in October 2020. For each place, the place type and a maximum of 1000 reviews were collected randomly. The number of reviews per place was likely driven by practical considerations and allows for a reasonable sample size for analysis while also accommodating the variation in the number of reviews across different places on Tripadvisor. By setting a maximum limit, we could ensure that we have enough data to derive meaningful insights while avoiding potential bias from an overly skewed distribution of reviews. This approach prevents the BoW representation from being overly sparse for places with only a few reviews, while still providing enough data to capture the diversity of opinions. Furthermore, this random selection approach also helps to avoid potential biases that could arise if only the most popular or highly reviewed places were included in the analysis. Tripadvisor consists of five place types, namely FoodPlaces, Attractions, Hotels, Vacation Rentals, and Shops.

B. PREPROCESSING DATA
To prepare the data, places outside of the study area, places without geographic coordinates or types, and duplicates were removed. Since the main functionalities of Hotels and Vacation Rentals are very similar, we combined these two types and considered them as Hotels. The number of places for each functionality and the number of their reviews are given in Table 1. The NLTK library is used to preprocess the users' reviews. First, the reviews are changed to lowercase. After tokenization, all punctuation and stop words are eliminated. Subsequently, all tokens are reduced to their base forms by applying stemming and lemmatization. Then, only the nouns, verbs, and adjectives of each review are kept, utilizing WordNet POS tagging. Finally, a BoW is generated for all reviews of each place.
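The first preprocessing steps can be illustrated with a minimal, dependency-free sketch. It is not the authors' NLTK pipeline: the tiny stop-word set and the function names `preprocess_review` and `place_document` are assumptions made for illustration, and stemming, lemmatization, and POS filtering are left out because they depend on NLTK resources.

```python
import re

# Hypothetical, deliberately tiny stop-word list; the paper relies on NLTK.
STOP_WORDS = {"the", "a", "an", "and", "was", "is", "to", "of", "in", "it", "we"}

def preprocess_review(text):
    """Lowercase a review, tokenize it, and drop punctuation and stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def place_document(reviews):
    """Concatenate the cleaned tokens of all reviews written about one place."""
    document = []
    for review in reviews:
        document.extend(preprocess_review(review))
    return document

reviews = ["The room was clean, and the staff was friendly!",
           "Clean room; great breakfast."]
document = place_document(reviews)
```

Treating all reviews of a place as one token list mirrors the paper's choice of one document per place.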

C. CONCEPTS
1) BAG-OF-WORDS (BOW)
BoW is a common approach to document representation and vectorization. Using a predefined set of words, the BoW encoding of a document consists of the frequencies of these words, under the unrealistic assumption that each word occurs independently of all others. In other words, it is the sum of the one-hot encoding vectors of the words in the document. BoW generates a feature vector whose size equals the number of words in the vocabulary. Each feature is a word (term), and the feature's value is a term weight. The term weight can be a binary value, a term frequency (TF) value, or a term frequency-inverse document frequency (TF-IDF) value [54]. Therefore, the BoW output will be a sparse matrix when working with a vast amount of training data. For a given document, only unigrams are used to make an unordered list of words, neglecting bigrams, trigrams, grammar, syntax, POS tags, semantics, and position. These feature vectors can be used for any machine learning task [55]. Figure 2 shows how to calculate the BoW vectors. It should be noted that we consider all reviews for a place as one document and calculate the BoW for the corresponding place document.
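As an illustration of the description above, the following sketch builds term-frequency BoW vectors over a shared vocabulary. The helper names `build_vocabulary` and `bow_vector` and the toy documents are assumptions, not part of the original work.

```python
def build_vocabulary(documents):
    """Sorted list of all distinct tokens across the tokenized documents."""
    return sorted({token for doc in documents for token in doc})

def bow_vector(doc, vocabulary):
    """Term-frequency vector: the sum of the one-hot vectors of the tokens."""
    index = {term: i for i, term in enumerate(vocabulary)}
    vector = [0] * len(vocabulary)
    for token in doc:
        vector[index[token]] += 1
    return vector

# Each "document" stands for all reviews of one place, already tokenized.
docs = [["room", "clean", "room", "staff"], ["pizza", "staff", "pizza"]]
vocab = build_vocabulary(docs)
vectors = [bow_vector(d, vocab) for d in docs]
```

Most entries are zero even in this toy case, which is the sparsity the paragraph refers to.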

2) LATENT DIRICHLET ALLOCATION (LDA) TOPIC MODELING
Latent Dirichlet Allocation (LDA) is a popular topic modeling technique that is used to discover the hidden topic structure in a corpus of documents [56]. The main aim of LDA is to automatically recognize topics that are present in a document corpus and to allocate each document to one or more of these topics based on the words used in the document [57]. The model assumes that there are k topics in the document collection, and each topic is a probability distribution over the terms in the corpus. The algorithm works by first randomly assigning each term in the corpus to a topic, and then iteratively improving the assignment by computing the probability that a term is associated with a particular topic given the current topic assignments for all other terms. Two Dirichlet priors govern the topic distributions and the term distributions, respectively.
129220 VOLUME 11, 2023
The probability distribution function includes a k-dimensional random variable $\theta_d$ sampled from a Dirichlet distribution and the likelihood of observing words $w_{dm}$ given topic assignments $z_{dm}$. LDA is useful for a variety of applications, including NLP, text classification, information retrieval, and recommender systems, and has become a popular tool for analyzing large text datasets [57], [58].
In this paper, LDA is used to extract the most important words representing each functionality in terms of topics. While there are other advanced methods for topic modeling (e.g., BERTopic, LSA, PLDA, GSDMM, and PAM), we have chosen LDA due to its widespread use, interpretability, more nuanced representation of the underlying topics, and established performance in similar studies [57], [58]. We compared LDA and BERTopic for topic modeling. LDA outperformed BERTopic in several ways. First, LDA allowed us to specify the number of topics, while BERTopic determined it automatically. Second, LDA produced more interpretable topics. This was crucial for our study's interpretability. Third, LDA exhibited good topic coherence, making topics semantically related and coherent. Finally, LDA was computationally efficient and faster than BERTopic, which makes it suitable for our large dataset.

3) TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)
TF-IDF is a widely used weighting index that determines the significance of a term in a document or corpus of documents while also accounting for its presence in other documents [55]. It consists of two numerical statistics, namely TF and IDF. The TF element measures the frequency of a term within a document, while the IDF element measures how unique a term is across the entire corpus of documents. By increasing the score of a word as it appears more frequently in the document and decreasing it as it appears in more documents across the corpus, TF-IDF provides a way to assess the relevance of a term within a specific document. TF-IDF is commonly applied in various information retrieval and text analysis tasks, including but not limited to document classification, clustering, and ranking. Given a corpus C with a set D of documents $\{D_1, \ldots, D_N\}$, TF-IDF assigns a score to a word w in document d by combining these two statistics. As discussed in the description of BoW, we used TF-IDF values as the term weight index to quantize our documents. We chose TF-IDF as the weighting index due to its simplicity, interpretability, computational efficiency, and robustness to document length compared to PPMI (Positive Pointwise Mutual Information).
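The TF-IDF equation itself does not appear in this text. One common form, given here as an assumption rather than as the authors' exact definition, is:

```latex
\mathrm{tfidf}(w, d) = \mathrm{tf}(w, d) \cdot \log \frac{N}{\mathrm{df}(w)}
```

Here $\mathrm{tf}(w, d)$ is the number of occurrences of w in d, N is the number of documents in D, and $\mathrm{df}(w)$ is the number of documents that contain w. Software libraries often use smoothed or normalized variants of this expression.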

4) MULTIDIMENSIONAL SCALING (MDS)
The measurement of similarity and dissimilarity is fundamental in assessing the proximity of objects. Typically, non-negative measures are used [59]. Multidimensional Scaling (MDS) is a mathematical and statistical approach utilized to quantify similarity judgments and visually represent the level of similarity between cases in a dataset [59], [60]. MDS involves converting pairwise distances between objects or individuals in a set into a configuration of points in an abstract N-dimensional Euclidean space, where objects with high similarity are clustered together and those with low similarity are placed farther apart [60]. The dissimilarity between objects $o_i$ and $o_j$ is calculated from the normalized angular difference between their vectors. We applied MDS to compute pairwise similarity between our objects (i.e., places) in a lower dimensional space. MDS is employed because of its compatibility with the research objectives and its ability to provide meaningful insights from the textual data by visualizing the complex and nonlinear relationships between places based on the similarity of their functionality. This helps to preserve the original similarity measures between data points and to find a configuration in a lower-dimensional space where the pairwise distances closely resemble the original pairwise dissimilarities, whereas PCA is a simple linear dimensionality reduction algorithm and cannot calculate similarities.
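The expression for the normalized angular difference is not reproduced in this text. A common definition, assumed here, scales the angle between two vectors by 2/π, so that orthogonal vectors have dissimilarity 1. The function name `angular_dissimilarity` is hypothetical.

```python
import math

def angular_dissimilarity(u, v):
    """Normalized angle between u and v: (2 / pi) * arccos(cosine similarity)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = dot / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point drift before arccos.
    cosine = max(-1.0, min(1.0, cosine))
    return (2.0 / math.pi) * math.acos(cosine)
```

For non-negative vectors such as TF-IDF weights, this value lies between 0 (same direction) and 1 (no shared terms).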

D. CONSTRUCTING THE SEMANTIC SPACE
Sparse representations can pose challenges in text analysis, as they may lead to skewed feature distributions and hinder effective modeling and inference [61]. Embeddings or vector space representations are crucial parts of various areas of NLP [52]. Embeddings transform high-dimensional and sparsely populated vectors into a lower-dimensional space, preserving the meaning and contextual relationships between the elements. In this paper, our attention is on semantic spaces that are specific to a particular domain, which refers to vector space representations of objects belonging to that domain (in our case, the domain is Tripadvisor reviews). In this space, salient features are represented as directions. For example, imagine you're using a movie recommendation system, and you want to find movies that are "similar to this one, but scarier." If the concept of "scariness" is encoded as a feature direction in the semantic space of movies, the system can easily fulfill your request by moving along the "scariness" direction [62], [63]. Feature directions are also useful in semantic search systems. Suppose you want to search for "popular holiday destinations in Europe." Traditional knowledge bases might not directly encode popularity, but in a semantic space, a "popularity" feature direction can help interpret your query and retrieve relevant results [64]. They can also be used in interpretable classifiers, where rules are learned based on these directions. For instance, classifiers can determine which feature directions are most relevant for making decisions. In this research, the domain is user reviews of places on Tripadvisor [65]. Hence, we aim to extract place functionalities from the place reviews by modeling salient features as directions in a semantic space. To achieve our goal, we apply the following process to define the directions of the most salient features:

1) REPRESENTING PLACE FUNCTIONALITY AS FEATURE DIRECTION
In this section, we aim to represent salient features as directions where each direction points to a particular place functionality. First, we perform part-of-speech (POS) tagging on the BoW representations of the places obtained from the preprocessing step. All nouns, verbs, and adjectives that occur frequently enough are considered to identify potential feature labels. To determine the significance of the words within the documents (place reviews), we quantize the documents by applying TF-IDF to each word. Then, a vector is formed for each document, which includes the index scores of all terms in the corpus. The similarity of the vectors is inferred from the angles between them by computing cosine similarity.
Since these vectors are usually very sparse and contain different numbers of words for each document, they are not suitable to be used directly in constructing the semantic space. Hence, these vectors are passed to the MDS technique to map them into a lower dimensional similarity space with a constant number of dimensions. Each document in this semantic space is a point derived from MDS, where more similar points are located closer together [59]. Although dimensions are predefined in traditional semantic spaces, the interpretation of the dimensions generated by MDS is not straightforward [60]. The number of dimensions required for an appropriate semantic space is not clear, and hence it is a trial-and-error process. The best number of dimensions is determined by maximizing the classification accuracy in the filtering stage.
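The mapping into a fixed-dimensional space can be sketched as follows. The paper does not state which MDS variant was used, so this example assumes classical (Torgerson) MDS implemented with NumPy; the function name `classical_mds` and the toy points are illustrative only.

```python
import numpy as np

def classical_mds(dissimilarity, n_dims):
    """Embed points in n_dims dimensions from a symmetric dissimilarity matrix."""
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    # Double-center the squared dissimilarities to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # Keep the n_dims largest eigenpairs; clip tiny negative eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    scales = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scales

# Toy check: the corners of a unit square are recovered up to rotation.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
embedding = classical_mds(dist, n_dims=2)
recovered = np.linalg.norm(embedding[:, None, :] - embedding[None, :, :], axis=-1)
```

In the setting of the paper, the input would be the matrix of pairwise dissimilarities between place documents, and `n_dims` would be one of the tested dimensionalities.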
On the other hand, to define each salient feature, we need a cluster of words that represents the feature well. To achieve this, we first extract the most common words for each functionality by applying the LDA topic modeling method. The corresponding words of each topic create a vector. To avoid manually assigning each topic to a particular functionality, the words most related to each functionality are also extracted from a dictionary, which creates another vector for each functionality. Then, the cosine similarity is computed between each pair of vectors. The greater the cosine similarity of two vectors, the more relevant the extracted topic is to that functionality.
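The matching of topics to functionalities can be illustrated with a small sketch. The topic weights, dictionary entries, and function names below are hypothetical; only the use of cosine similarity comes from the text.

```python
import math

def cosine_similarity(weights_a, weights_b):
    """Cosine similarity between two sparse word-weight dictionaries."""
    shared = set(weights_a) & set(weights_b)
    dot = sum(weights_a[w] * weights_b[w] for w in shared)
    norm_a = math.sqrt(sum(x * x for x in weights_a.values()))
    norm_b = math.sqrt(sum(x * x for x in weights_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def assign_functionality(topic_words, dictionary_words):
    """Return the functionality whose dictionary vector is closest to the topic."""
    return max(dictionary_words,
               key=lambda name: cosine_similarity(topic_words, dictionary_words[name]))

# Hypothetical LDA topic (word -> weight) and dictionary entries (word -> 1.0).
topic = {"room": 0.05, "staff": 0.03, "bed": 0.02}
dictionary = {
    "Hotels": {"room": 1.0, "bed": 1.0, "lobby": 1.0},
    "FoodPlaces": {"food": 1.0, "menu": 1.0, "staff": 1.0},
}
label = assign_functionality(topic, dictionary)
```

With these toy values the topic shares more weight with the Hotels entry, so it is labeled accordingly.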
Afterward, a binary Logistic Regression classifier is trained on this similarity space to define a hyperplane that distinguishes between points (i.e., places) that include a given word and those that do not. The direction pointing towards the attribute modeled by the word can be found by identifying the vector that is perpendicular to the hyperplane. The farther a specific document lies from the hyperplane in the positive direction, the more prominent the word is in the document. Then, we find the salient features and their related directions. Finally, place functionalities are distinguished using these directions. A schematic view of generating these feature directions is presented in Figure 3. Each color represents a particular cluster of places that the hyperplane separates. The perpendicular vector, which is shown by the green arrow, represents the direction of the salient feature, which is the place functionality in this study.
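A minimal sketch of deriving a feature direction from a linear classifier is given below. The paper uses Scikit-learn's logistic regression; to stay self-contained, this example substitutes a plain gradient-descent fit in NumPy, and the function names, learning rate, step count, and toy points are assumptions.

```python
import numpy as np

def fit_logistic_direction(points, labels, lr=0.5, steps=2000):
    """Fit a linear logistic model; return the unit normal and scaled offset."""
    x = np.asarray(points, dtype=float)
    y = np.asarray(labels, dtype=float)
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted probabilities
        w -= lr * (x.T @ (p - y)) / len(y)      # gradient of the log-loss
        b -= lr * np.mean(p - y)
    norm = np.linalg.norm(w)
    return w / norm, b / norm

def signed_distance(points, direction, offset):
    """Signed distance of each point to the separating hyperplane."""
    return np.asarray(points, dtype=float) @ direction + offset

# Toy 2-D space: label 1 means the reviews of the place contain the word.
places = [[2.0, 0.1], [1.5, -0.2], [-1.8, 0.3], [-2.2, -0.1]]
has_word = [1, 1, 0, 0]
direction, offset = fit_logistic_direction(places, has_word)
distances = signed_distance(places, direction, offset)
```

With Scikit-learn, the same normal vector can be read from the fitted classifier's `coef_` attribute and the offset from `intercept_`.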

2) FILTERING FEATURE DIRECTIONS
In this step, we determine how likely it is that a word represents a feature that is important to users when describing places. Hence, we assess the quality of the candidate directions using the accuracy of the logistic regression classifier and only consider the words that result in promising accuracy. Even if the hyperplane is only built for words that are accurately classified, it is still possible to have many vectors in the semantic space that have similar directions. When it comes to similarity, vectors with related meanings tend to point in similar directions. Clustering the vectors can help create more reliable and significant directions within the constructed space. Each cluster will represent a distinct feature direction. The clustering process serves three purposes: first, it ensures that the feature directions are distinct enough; second, it makes the features more interpretable, as a cluster of terms provides a more detailed description than a single term; and third, it addresses the issue of sparsity when relating features to the BoW representation.
We use the N best-scoring candidate feature directions as input for the clustering algorithm, with N as a hyperparameter. Instead of K-means, we adopted the approach proposed in [51], which we found to yield slightly better results. The key concept in their method is to choose cluster centers that rank high among the candidate feature directions and strive for orthogonality to each other. Comprehensive details are described in [51]. To achieve a final direction, we use the weighted average of the distances between each place and all the words of a topic. Scores obtained from LDA are used as weights. Afterward, negative distances between all points (i.e., places) in this semantic space and each hyperplane are removed, and the functionality of the corresponding places is considered "unclassified". The farther a place document is from the hyperplane in the positive direction, the more prominent the functionality is in the place document. Finally, the probabilities of each functionality are calculated for each place using the distances. The main functionality of the place is determined as the direction with the maximum distance.
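The last step can be sketched as follows. The text does not specify how distances are converted to probabilities, so this example assumes that the positive distances are simply normalized to sum to one; the function names and the toy distances are hypothetical.

```python
def functionality_probabilities(distances):
    """Turn signed distances to each functionality hyperplane into probabilities.

    Negative distances are discarded and the remaining ones are normalized to
    sum to one (an assumed normalization). Returns None if none is positive.
    """
    positive = {name: d for name, d in distances.items() if d > 0}
    total = sum(positive.values())
    if total == 0:
        return None  # the place is treated as "unclassified"
    return {name: d / total for name, d in positive.items()}

def main_functionality(distances):
    """Label with the largest probability, or 'unclassified'."""
    probs = functionality_probabilities(distances)
    if probs is None:
        return "unclassified"
    return max(probs, key=probs.get)

# Hypothetical distances of one place to the four functionality hyperplanes.
place = {"Attractions": 0.6, "FoodPlaces": 0.2, "Hotels": -0.4, "Shoppings": 0.2}
probs = functionality_probabilities(place)
```

Because the normalization is monotonic, the label with the highest probability is also the one with the largest distance, which matches the rule stated above.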

IV. RESULTS AND DISCUSSION
In this section, the results of our method are evaluated using different measures. Afterward, we discuss our findings in detail. We investigate the meaningfulness of the discovered features by examining their similarity to Tripadvisor categories. The Gensim library is used to apply LDA, and the logistic regression classifier is implemented using the Scikit-learn library. First, we applied LDA using both BoW and TF-IDF scores to extract the most common words for each topic. Then, perplexity and coherence scores, which are two evaluation metrics for topic modeling methods, are calculated. Perplexity is a statistical measure that evaluates the predictive power of a probability model on unseen data. It is typically calculated as the normalized log-likelihood of a held-out test set. Higher likelihood implies a better model. However, perplexity and human judgment often exhibit a low correlation. Hence, the coherence score is used to assess the quality of the learned topics by measuring how interpretable the topics are to humans, in other words, how similar the words of a topic are to each other. A higher coherence score indicates that the topics are more coherent and meaningful, and therefore the number of topics chosen is likely appropriate. The score is used for deciding the required number of topics in the model. The values are given in Table 2. Therefore, the topics extracted from LDA using BoW are chosen due to higher perplexity and coherence score, and more relevance to meaningful functionalities. In addition, the words most related to each functionality are also extracted from the dictionary. The cosine similarity is computed between each pair of vectors obtained from LDA and the dictionary. The greater the cosine similarity of the two vectors, the more relevant the extracted topic is to that functionality. Then, a functionality is assigned to each topic. Table 3 presents the 10 highest-scoring words of each topic extracted from LDA using BoW, together with the assigned functionality.
As shown, the words related to Attractions, FoodPlaces, and Hotels are better representatives of their corresponding functionalities than the words for Shoppings. For each topic, the words occurring in that topic and their relative weights are explored. The distribution of the words over each functionality is shown in Figure 4. The choice of a small number of topics (k = 4) for LDA aligns with the objective of identifying the most important words associated with each functionality, rather than aiming for an exhaustive topic modeling analysis. Therefore, we were able to focus on the most salient and distinct topics related to functionality. This allowed for a more straightforward interpretation of the results, as each topic would represent a specific functionality. Furthermore, while we did not explicitly mention the other values tested, we determined that four topics provided the best results. Our method uses three different types of vectors. The first type is the TF-IDF weighted BoW vectors. The dimension of each vector equals the length of the BoW, which is the number of words collected from all reviews about that place, and it can vary from one place to another. MDS is applied to map the sparse index (TF-IDF weighted BoW) vector of each document into a lower and fixed dimensional similarity space of documents. In our experiment, five different dimensions (D = 5, 10, 15, 20, and 50) are tested and the results are compared. The accuracy of the classification in the following stage is used as the criterion to determine the best number of dimensions. The second type of vectors is obtained by applying LDA to extract the most common words for each topic. The most common words for each functionality extracted from the dictionary make up the third type of vectors.
In the next step, a logistic regression classifier is applied to the semantic space to separate topics from each other based on the location of the documents containing or lacking the desired words. Figure 5 shows the accuracy of the classifier in five dimensions of MDS and for the four functionalities. We use this accuracy to filter feature directions: only words with classification accuracy higher than 50% are selected for further analysis. The results demonstrate that the accuracy of the classifier for the Hotels functionality is about 80%, while it is less than 60% for the other three functionalities in almost all cases. In addition, a 15-dimensional space appears to lead to more accurate classifications; increasing the number of dimensions from 15 to 20 and 50 does not significantly improve the results while raising the computational cost. The direction towards the features that model the desired functionalities is indicated by the vector perpendicular to the hyperplane resulting from the classification. The farther a document lies from the hyperplane in the positive direction, the higher the score of that functionality assigned to the document. Then, for each functionality, the weighted average of distances is computed using the word scores obtained from LDA. Places for which all distances are negative are considered "unclassified". The distances of each document (i.e., place) to the four hyperplanes and the probabilities of each functionality are given in Table 4. The letters T and P refer to the Tripadvisor and predicted functionalities, respectively. The characters A, F, H, and S are the first letters of Attractions, FoodPlaces, Hotels, and Shoppings, respectively. Finally, the functionality with the largest distance or highest probability is considered the main functionality of the place. The proposed approach is fully unsupervised, and the Tripadvisor categories are not used to train our method; we compare our results with them only to indicate how differently people perceive and experience urban areas from predefined categories. Therefore, we compare the main functionality with the place categories of Tripadvisor. Only classified places, which afford a certain functionality, are compared; unclassified places are not considered. The accuracy of the proposed method in various dimensions is illustrated in Figure 6.
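The scoring step described above can be sketched as follows. The hyperplane normals, biases, and LDA word weights below are hypothetical; in the described pipeline they would come from the fitted logistic regression classifiers and the LDA model. The sketch also assumes that the "unclassified" rule is applied to the weighted averages, which is one possible reading of the text, and it omits the conversion of distances to probabilities because that step is not specified here.

```python
import math

def signed_distance(point, normal, bias):
    """Signed distance of `point` from the hyperplane normal . x + bias = 0."""
    norm = math.sqrt(sum(n * n for n in normal))
    return (sum(n * p for n, p in zip(normal, point)) + bias) / norm

def functionality_scores(point, directions):
    """Weighted average of signed distances for each functionality.

    `directions` maps a functionality name to a list of
    (normal, bias, lda_weight) tuples, one per selected word.
    """
    scores = {}
    for name, words in directions.items():
        total_weight = sum(weight for _, _, weight in words)
        weighted = sum(
            weight * signed_distance(point, normal, bias)
            for normal, bias, weight in words
        )
        scores[name] = weighted / total_weight
    return scores

def main_functionality(scores):
    """Largest score wins; a place with only negative scores is unclassified."""
    if all(score < 0 for score in scores.values()):
        return "unclassified"
    return max(scores, key=scores.get)

# Hypothetical hyperplanes in a 2-dimensional space (illustrative only).
directions = {
    "Hotels": [([1.0, 0.0], 0.0, 0.6), ([0.8, 0.6], -0.5, 0.4)],
    "FoodPlaces": [([0.0, 1.0], -1.0, 1.0)],
}

near_hotels = functionality_scores([2.0, 0.5], directions)
far_away = functionality_scores([-3.0, -1.0], directions)
print(main_functionality(near_hotels), main_functionality(far_away))
```

The first place lies on the positive side of both Hotels hyperplanes and is labeled Hotels, while the second lies on the negative side of every hyperplane and remains unclassified.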
Different NLP methods and various machine learning algorithms are applied in [66] to extract place functionality from the whole review and from only the action verbs in the descriptions. Utilizing BoW and a logistic regression classifier on the whole review was identified as the best method. Although the accuracy was high, it is a supervised approach and cannot be extended to datasets without labeled data. In another study, Ager et al. tried to predict place types from three different datasets (Geonames, Foursquare, and OpenCYC) using an unsupervised approach, but they could not reach an accuracy higher than 48% [54]. In contrast, our approach achieves accuracies of 59.83%, 66.41%, 67.58%, and 68.48% for dimensions 5, 10, 15, and 20, respectively. In addition, the results indicate that increasing the number of dimensions from 5 to 15 leads to higher classification accuracy. However, there is no significant improvement in the classification accuracy when the number of dimensions is increased to 20 or 50. The classification results are also evaluated by computing the confusion matrices for the four functionalities in different dimensions of MDS. The confusion matrices are shown in Figure 7, where the rows indicate the true categories in Tripadvisor, the columns show the predicted main functionality, and the values give the percentage of predictions.
As demonstrated by the confusion matrices, the classification results are highly accurate for three of the functionalities analyzed, namely Attractions, FoodPlaces, and Hotels. In all dimensions, the Hotels functionality has the highest prediction accuracy, but the results for the Shoppings functionality are unsatisfactory, with a notable proportion of misclassification errors in which places with the Shoppings functionality are erroneously predicted as Attractions. The primary factor contributing to this issue is the inadequate representation of the Shoppings functionality by the words obtained during the topic modeling stage. This, in turn, is due to the lack of appropriate and relevant words in the data to capture the Shoppings functionality, which leads to the observed poor predictions. In addition, the precision, recall, and f1-score of the different functionalities in five dimensions are calculated from the confusion matrices. These values are provided in Table 5. The results demonstrate that the FoodPlaces functionality can be distinguished well, since the average precision, recall, and f1-score are 0.99, 0.64, and 0.78, respectively. Furthermore, places that afford the Hotels functionality are determined with an average precision, recall, and f1-score of 0.65, 0.64, and 0.75, respectively. For Attractions, the corresponding values are 0.19, 0.60, and 0.28. On the other hand, the method cannot reliably distinguish places related to Shoppings from those that are not labeled as Shoppings, since the average precision, recall, and f1-score are 0.02, 0.06, and 0.03, respectively. In addition, the predictive performance is not improved by increasing the number of dimensions to 20 or 50.
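For reference, per-class precision, recall, and f1-score can be derived from a confusion matrix as in the minimal sketch below. The counts are hypothetical and are not the values behind Table 5; rows are true Tripadvisor categories and columns are predicted functionalities, matching the layout described for Figure 7. The function name `per_class_metrics` is likewise an assumption.

```python
def per_class_metrics(confusion, labels):
    """Per-class (precision, recall, f1) from a square confusion matrix.

    confusion[i][j] is the number of places whose true label is labels[i]
    and whose predicted label is labels[j].
    """
    metrics = {}
    for k, label in enumerate(labels):
        true_positive = confusion[k][k]
        predicted_as_k = sum(row[k] for row in confusion)  # column sum
        actually_k = sum(confusion[k])                     # row sum
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics

labels = ["Attractions", "FoodPlaces", "Hotels", "Shoppings"]

# Hypothetical counts (illustrative only), dominated by FoodPlaces.
confusion = [
    [6, 3, 1, 0],    # true Attractions
    [20, 70, 8, 2],  # true FoodPlaces
    [1, 0, 9, 0],    # true Hotels
    [6, 2, 1, 1],    # true Shoppings
]

metrics = per_class_metrics(confusion, labels)
for label, (precision, recall, f1) in metrics.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

The toy matrix shows one way a small class can combine moderate recall with low precision: when a large class such as FoodPlaces sends even a modest share of its places to Attractions, those false positives can outnumber the true Attractions.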
Our further analysis considers only the semantic space with 15 dimensions, since the results do not improve significantly when the number of dimensions is increased further. Figure 8 shows the distribution of place functionalities based on the Tripadvisor categories and the predicted place functionalities obtained by applying the proposed method with D=15. The four functionalities are illustrated with four colored blocks, where the size of each block corresponds to the total number of places belonging to that functionality. A significant dissimilarity is evident between the two distributions. The proportion of FoodPlaces decreases notably, by 21 percentage points, from 72% to 51%. Conversely, the proportions of Shoppings, Hotels, and Attractions increase by 2.4, 6.5, and 12.1 percentage points, respectively. This shift implies that roughly 30% of FoodPlaces are predicted to emphasize one of the three other functionalities, with Attractions being the most prominent among them. In addition, we compared the distribution of the functionalities predicted by our method with the Tripadvisor categories in the MDS space with D=15. The central pie chart in Figure 9 shows the distribution of Tripadvisor categories, while the surrounding bars represent, for each functionality, the distribution of place functionalities predicted by the proposed method. Each color denotes a specific functionality: blue, orange, green, and yellow for Attractions, FoodPlaces, Hotels, and Shoppings, respectively. Correct predictions are shown in solid colors, while wrong predictions are shown in dashed colors. The distributions indicate that the Hotels functionality has the highest share of matching predictions at 88.52%, while the Shoppings functionality has the lowest at 12.70%. Moreover, the results show that a substantial proportion of Attractions (25.30%) are predicted as FoodPlaces, while 20.03% of FoodPlaces are considered Attractions. Furthermore, 10.66% of Hotels are predicted as Attractions. These predictions highlight the challenge of accounting for human perception and experience when distinguishing between certain place functionalities.
The differences in the predicted functionalities of places may be attributed to several factors. For example, some restaurants, coffee shops, bars, or hotels may be well-known attractions and, as such, are frequently recommended to visit. Additionally, user opinions regarding the food served in hotels may contribute to the classification of these places as FoodPlaces. For each functionality, we examine the most frequent misprediction together with samples of the user reviews that led to it, shown in Table 6. For instance, 20.03% of FoodPlaces on Tripadvisor are considered Attractions, while in fact users have visited these places mostly for entertainment. Hence, words such as visit, see, amazing, and year, which are more closely related to the topic of Attractions, are used in the descriptions of the first and second places. Similarly, 10.66% of Hotels are predicted as Attractions since the words related to the topic of Attractions are repeated more often than the words related to the topic of Hotels. Furthermore, shopping is often considered a popular pastime, and shopping centers may be recommended as places to visit and have fun. This may explain why a relatively high percentage of Shoppings (63.49%) was predicted as Attractions. Nonetheless, the major cause of these errors may be the lack of suitable words extracted for the Shoppings functionality during the topic extraction phase using LDA. Further improvements in the word extraction process may help enhance the overall accuracy of the proposed method. The results demonstrate the potential of leveraging online reviews to extract place functionality using semantic space modeling, a novel approach that can provide valuable insights and enhance our understanding of how people perceive and utilize urban spaces.

V. CONCLUSION
Extracting place functionality from crowdsourced textual data is complicated by the fact that many functionalities are not stated directly in user descriptions. This paper presents an unsupervised method for extracting the functionalities of a place, which requires only a bag-of-words of place reviews in a particular domain to construct a semantic space. The feature directions are generated by extracting the most salient features and are then filtered by the accuracy of the logistic regression classifier. Finally, the functionalities are predicted based on the distance of each document from the hyperplane associated with each functionality in the obtained semantic space. The results revealed variations in performance across the predicted functionalities. The Hotels functionality had the highest share of matching predictions at 88.52%, whereas Shoppings had the lowest at 12.70%. The differences in the predicted functionalities of places may be attributed to various factors, such as some restaurants, hotels, or shopping centers also being popular tourist attractions. Furthermore, the lack of suitable words extracted for certain functionalities during the LDA topic extraction phase could contribute to these errors. The proposed method helps users and urban planners understand the functionality of places from online reviews, without explicit information about their main functionalities, in order to make decisions. In addition, the proposed method can be utilized in other fields, such as improving the accuracy of location-based recommendations, enhancing the representation of places in GIScience, and ranking places with regard to the desired functionalities. For future work, we suggest integrating the geographic space into the semantic space to discover the influence of nearby places on functionality. While this research focuses on online reviews, integrating other data sources can provide a more comprehensive understanding of place functionalities. For example, incorporating data from social media platforms, geotagged photos, or user-generated content from different sources can offer richer insights into the functionalities of places. Considering the temporal aspect of reviews or incorporating user profiles and demographics may also enhance the accuracy of the semantic space.

FIGURE 1. The overall workflow and different steps of the proposed method.

FIGURE 2. Bag-of-words representation using TF weighting index.

FIGURE 3. A schematic view of representing place functionality as feature directions.

FIGURE 4. The score of the 10 most frequent terms per topic using LDA.

FIGURE 5. The accuracy of the logistic regression classifier in different dimensions of MDS.

FIGURE 6. The accuracy of extracting place functionalities in different dimensions.

FIGURE 7. The confusion matrices of the four functionalities in different dimensions.

FIGURE 8. The distribution of place functionalities, top: Tripadvisor, bottom: the proposed method with D=15.

FIGURE 9. The distribution of place functionalities using the proposed method with D=15.

TABLE 1. The number of each place functionality and their reviews.

TABLE 2. The perplexity and coherence score of LDA models.

TABLE 3. The most frequent terms per topic using LDA.

TABLE 4. The distances and probabilities of place functionalities.

TABLE 5.

TABLE 6. Samples of user reviews that have resulted in the highest percentage of misprediction.