A Hybrid Action-Related K-Nearest Neighbour (HAR-KNN) Approach for Recommendation Systems

Recommendation systems (RS) are widely used in many areas and generate product recommendations during active user interaction on E-Commerce sites. The tremendous growth in users and products in recent years has raised some key challenges. Many online sites present numerous choices to the user at once, which is strenuous. Moreover, finding the active user or the right product is an important task in RS. Existing works recommend a product by considering user inclination and socio-demographic behaviour. In this paper, we propose a Hybrid Action-Related K-Nearest Neighbour similarity (HAR-KNN) recommender that draws on the simplicity of hybrid filtering to enrich the user behaviour matrix through the formation of a vector of features. It classifies the features using race classifiers from both quality and quantity aspects. The proposed approach also addresses problems of previous methods in order to evaluate user preference on products efficiently and to balance feature analysis. The K-NN classification method is applied online and in real time to find user behaviour data corresponding to a specific user group, capturing the similarity between many users and the target user across a huge amount of data. The proposed approach is evaluated using Mean Absolute Error (MAE), Mean Square Error (MSE) and Root Mean Squared Error (RMSE), with lowest errors of 0.7165, 0.7201 and 0.7322 respectively. The predictive measures Precision (P), Recall (R) and F1 reach values of 0.8501, 0.2201 and 0.3507 respectively.


I. INTRODUCTION
ONLINE commercial centres depend on their advertisements for revenue, while businesses have a commercial interest in ranking higher in suggestions to draw in more users [1]. Traditional retail can present only popular products, whereas online retail can present a wide variety of products. Because of the enormous amount of information available across the web, it is difficult for users to judge whether the items presented by recommender frameworks are accurate [2]. A recommender system extracts the user's interests from a related dataset and provides quality recommendations. Recommender systems are a significant part of E-Commerce; they use machine learning [3] and data mining techniques [4] to filter unseen information and predict whether the user would like a particular item. An intelligent system is a special type of recommender system that exploits historical user ratings on relevant data mined through a data mining process [5]. To build such a framework, methodologies like content-based (CB) [6], collaborative (CF) [7], and hybrid filtering (HF) [8] techniques are required. Furthermore, learners differ in many qualities, such as knowledge, learning style and patterns of sequential learning, and CF or CB alone is not sufficient to capture these differences [9]. For accurate recommendations, additional information needs to be incorporated into the recommendation process along with the ratings. Some RS frameworks also utilize SPM for prediction [10]. However, recommendations for E-Commerce resources differ from those in other domains, since learners have different characteristics such as knowledge level and learning styles, which influence user preferences as well as the personalization of learner recommendations.
The associate editor coordinating the review of this manuscript and approving it for publication was Cesar Vargas-Rosales.
The authors of [11] proposed a hybrid CF and SPM framework for recommending products. However, that research does not consider extra learner attributes in the recommendation process.
We propose HAR-KNN, which integrates the K-NN algorithm with a user behaviour matrix, motivated by the following problems in existing works. Each user rates only a small subset of the available products, so most cells in the rating matrix are empty. Finding similarity among closely related users and items is therefore challenging. Since the set of items is generated for a single user only, it is user-specific, and it can be difficult to comprehend the taste of users. This leads to the problem of sparsity. The proposed HAR-KNN method combines collaborative and content-based features to categorize user goals and analyzes the attributes of many users to form a larger target, so that recommendations can be predicted accurately. Hybrid filtering also helps to avoid the scalability problems that exist in CB and CF.
The motivations behind the design goals are as follows:
• With the growth in the number of users, the system needs more resources to process information and to give recommendations. We therefore propose recommendation tasks that categorize users' goals in order to make accurate predictions using the Hybrid Action-Related K-NN Similarity (HAR-KNN) recommendation system.
• Using the K-NN algorithm, the vector of features generates behaviour records for each user and samples similar purchase intentions, analyzing the attributes of many users to form a larger target and predict recommendations accurately.
• Different permutations and combinations of the above approach were applied to improve recommendation accuracy and generate recommendations that better match user preference.
• We demonstrate the effectiveness of the proposed framework and the importance of multi-attribute behaviour data from a real E-Commerce site.
The rest of the paper is organized as follows. Section I gives a general introduction to RS-based filtering techniques. Section II presents recent works and reviews related to configuration-based RS. Section III presents the proposed recommendation scheme and describes the phases of the recommendation service process. Section IV presents the experimental evaluation in terms of the performance measures, together with a comparison against existing methods. Section V presents a case study discussing the extracted dataset. Lastly, Section VI presents the overall conclusion of the work.

II. RELATED WORK
This section reviews existing work on recommender systems, including recommendation techniques and attentional models in RS.
Lv et al. [14] established a similarity model whose component module includes information-sharing units among various modalities, yielding a recommendation model based on multimodal information sources named the Multimodal Interest-Related Item Similarity model (Multimodal IRIS). Combining visual representation and textual information makes it possible to consider the effects of different modalities at the same time. This multimodal feature can model the interest relevance between the different historical items and target items. In addition, the use of multimodal data and the differing interest relevance of historical items improve the recommendation list.
Pereira et al. [15] proposed a new personalized RS approach for the product line configuration process. The reusable features of product lines are intended to serve as a platform across multiple products. Large feature models, however, lead to a dimensionality problem. The approach therefore employs 6 recommender algorithms in the product line configuration context. With this approach, a small set of relevant features supports decision-makers in configuring the product. 3 of the 6 recommender algorithms identify the relevant features. Hence, the approach may assist users in understanding their feature preferences. Experimental results on two real-world datasets show that the approach achieves better efficiency and quality and selects efficient features to support decision-makers.
Hwangbo et al. [16] proposed the K-RecSys framework, which extends the collaborative filtering recommendation framework by reflecting domain attributes. The framework merges online product click data and offline product sales data, weighted to reflect the items according to customer preferences. Customers prefer to spend less effort when the recommendations are accurate, and hence tend to buy an item when its data matches what they already like. To verify the performance, K-RecSys was compared with an existing collaborative filtering system in a real operating environment. Despite these results, the experiments are limited and do not account for online and offline failure time.
Bandyopadhyay et al. [17] presented a recommendation approach for satisfying customer needs using Artificial Neural Networks (ANN). In this approach, the buying patterns of students are considered, and similar types of products are recommended to other student groups. Satisfying the needs of every customer is the main concern of the approach. Time-series data are fitted using a neural network, which makes it possible to estimate the prediction error. Using historical data, prediction and forecasting readily generate product predictions. The data are collected from different online reviews and pre-processed to remove unwanted data. The model is then trained using a feed-forward architecture. Finally, the model is evaluated on training and testing data in terms of MSE. The results show that the predicted output closely matches the target on various datasets.
VOLUME 8, 2020
Chen and Wang [18] presented a structure-based RS, named Structural Balance Theory-based Recommendation (SBT-Rec), for identifying suitable items for the target user. Given the sparsity of the enormous volume of rating data in E-Commerce, similar items may be absent from the site page. To handle such recommendations, SBT-Rec incorporates both user-based collaborative filtering (CF) and item-based CF in the performance measures of item recommendation. However, the personalized requirements of different users cannot be reconciled at a given time.
Ji et al. [19] proposed a hybrid recommendation model based on customer ratings, reviews and social data. The model comprises feature generation, review transformation, model training, community detection, feature blending, and prediction and evaluation. The experimental model is used to identify and analyze review content and to determine social communities, in combination with convolutional recommendation models such as network-based models. The results show that recommendation accuracy can be improved on the basis of review texts and social networks.
Yang et al. [20] proposed a stepwise detection strategy to spot anomalous ratings, combining collaborative recommendation methods to analyze similarity and feature extraction between products. A set of samples is gathered from user profiles to construct a rating matrix. It is assumed that the items can be captured by comprehensively analyzing the distributions of both mean and prediction errors of items and users. For large-scale data, the proposed model detects the rating distribution, rating intention and time series analysis.
Previous research on personalized RS considers only a few attributes of structured and unstructured data, which are not informative enough to produce a recommendation list. The RS architecture is constrained by the multi-attribute information that forms the input matrices governing the service process. The service process has to access real e-commerce data including not only products but also multi-attribute behaviour data in order to generate predicted ratings. Rating quality improves dramatically when multi-attribute information is integrated. Existing methods leave gaps in meeting these needs of the recommendation service process.

III. HYBRID ACTION-RELATED K-NN SIMILARITY RECOMMENDER SYSTEM
In this section, the proposed hybrid recommendation process takes a collection of behaviour-related data as input.
The hybrid technique is applied to the dataset: it takes the gathered information as input and processes the explicit feedback to produce organized data. The data describing user attributes are then integrated into user behaviour rankings for various products. All records are extracted from the original database, where the information is valid, and the large volume of significant data is pre-processed. The hybrid approach draws on the simplicity of finding users whose interests are similar to those of the active user. The behaviour pattern extracted from behaviour prediction and inclination is then collected through hybrid filtering to choose target products as recommendations. The approach likewise selects the K nearest neighbours of the active user and assigns them to a feature characterization that strengthens the association between users and their favoured products by means of a classification method. This process provides recommendations that follow changing user behaviour. It constructs pattern information for all products from behavioural knowledge about the distribution of the data. The classification takes the closest neighbours to determine the feature vectors of neighbours and products. Thus, similarity is measured between the many users and the target user. The outcomes can be combined with data on user activities and purchase history. The proposed recommendation model is shown in Fig 1.

A. RELATION FOR RECOMMENDER SYSTEMS
In this system, the recommendation model for the behaviour scenario defines the browsing user and the recommender. We describe the relation in the information system under the following assumptions.
Let $U = \{u_1, u_2, \ldots, u_i\}$ be the set of users and $P = \{p_1, p_2, \ldots, p_i\}$ be the set of products. The user and product information are stored in the database for retrieval. The relation between $U$ and $P$ can be represented as $R \subseteq U \times P$.

B. USERS RECOMMENDER BEHAVIOUR
User recommender behaviour is an interaction between the logged-in user and the RS. When a user first logs onto the system, one or more products are suggested. The user either chooses a product or ends the connection. Naturally, the user's behaviour log records some behaviours that occur more often with lower interest and others that generally occur with higher interest. If the user only supplies clickstream or other interaction data for a product, the data are called behaviour-based data and are denoted by a 3-tuple $(X, L, M)$, where $X$ is the sequence of user actions ordered by time, $L$ is a purchased product ID and $M$ is the purchased product. Table 1 shows the user's behaviour matrix mined from the user's behaviour file.
Let $U_k$ label the user behaviour and let $n_{i,k}$ denote the average number of $U_k$ for user $i$. The user behavioural matrix is shown in Table 1 and given by equation (1), where $n_{U_k}$ represents the average number of $U_k$ and $W_k$ describes the weight of $U_k$.
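As an illustration of how such a behaviour matrix might be assembled, the following Python sketch counts actions per user and product and collapses them into an implicit rating. The weighted-sum form, the action names and the weight values are assumptions made for this example only; they are not taken from equation (1) or Table 1.

```python
from collections import defaultdict

# Hypothetical weights W_k per behaviour type U_k (not given in the source).
ACTION_WEIGHTS = {"view": 1.0, "add_to_cart": 2.0, "purchase": 3.0}


def behaviour_matrix(events):
    """Count each action type per (user, product) pair.

    events is an iterable of (user_id, product_id, action) tuples.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for user, product, action in events:
        counts[(user, product)][action] += 1
    return counts


def implicit_rating(action_counts, weights=ACTION_WEIGHTS):
    """Weighted sum of action counts; an assumed stand-in for equation (1)."""
    return sum(weights.get(action, 0.0) * n for action, n in action_counts.items())


# Hypothetical behaviour log.
events = [
    ("u1", "p1", "view"),
    ("u1", "p1", "view"),
    ("u1", "p1", "purchase"),
    ("u2", "p1", "view"),
]
matrix = behaviour_matrix(events)
ratings = {pair: implicit_rating(c) for pair, c in matrix.items()}
```

In this toy log, two views and one purchase by u1 outweigh the single view by u2, which is the kind of ordering an implicit rating is meant to capture.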

C. RECOMMENDATION SERVICE PROCESS
In this process, there are three main phases: Data Collection, Data Pre-processing and Recommendation Engine.

1) DATA COLLECTION
In this component, the behaviour-based data of users are obtained from the Amazon dataset [14]. The principal records in the collected dataset are the user's behavioural file and the product list file. The user's behavioural file contains personal background such as clickstream, dwell time and the modules browsed by the user. The data types collected from the E-Commerce site are: user ID, product name (electronics, books, dresses, etc.), session ID, purchases, number of clicks, product ID, category and time. The data are collected repeatedly, scheduled according to user requests.

2) DATA PRE-PROCESSING
In this component, a pre-processing module removes unnecessary and redundant data from the log files. The unwanted data include elapsed time since the last visit, invalid values, repeated similar products, and repeated tags. Conventionally, a rating matrix is used, in which the ratings represent users' feedback. In our behaviour database, we use the user behaviour matrix for different behaviour types to serve as each user's implicit ratings of different products. The sparsity ($S$) of this matrix is measured from $N_{\text{non-zero}}$, the number of non-zero values, $N$, the needed value of the recommendation system, and $N_{\text{total}}$, the total count of values in the matrix.
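A common way to quantify how empty such a matrix is takes the fraction of zero cells, S = 1 - N_non-zero / N_total. The Python sketch below assumes that form, which does not use the quantity N mentioned above, and applies it to a small hypothetical matrix.

```python
def sparsity(matrix):
    """Fraction of zero cells, assuming S = 1 - N_nonzero / N_total."""
    n_total = sum(len(row) for row in matrix)
    if n_total == 0:
        raise ValueError("matrix has no cells")
    n_nonzero = sum(1 for row in matrix for value in row if value != 0)
    return 1.0 - n_nonzero / n_total


# Hypothetical 3 users x 4 products implicit-rating matrix.
example = [
    [5, 0, 0, 0],
    [0, 0, 3, 0],
    [0, 1, 0, 0],
]
s = sparsity(example)  # 3 of 12 cells are filled
```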

3) RECOMMENDATION ENGINE
This section presents the approach used to build the RS from user behaviour relations, combining hybrid filtering and the K-NN algorithm. K-NN identifies similar users by calculating the cosine value. Each neighbour recommends a certain number of products. The category describes the products purchased by neighbours in different categories. Next, a feature vector is formed, which is used to compute the Euclidean distance and find the nearest neighbours of the user. Finally, the feature vectors are classified to estimate the similarity between the input instance and its k nearest instances using a weighted K-NN algorithm. The HAR-KNN recommendation approach consists of the following modules.

Module 1 (User Behaviour Based Recommendation):
After registration, a user behaviour form is created for each user. The form is used when there are no similar products and the user logs in for the first time, in which case the user is treated as a specific user.

Module 2 (Formation of the Vector of Features to Each Neighbour and Products):
Here, the active user has purchased products from a list of categories, for example several products in electronics, dresses and so on. The neighbour vector of features is formed as follows:
• The first 3 columns hold the number of products that each neighbour has purchased in each category.
• The fourth column records views (1 for views, 0 for non-views and 0.5 when it is not reasonable to expect to identify new viewers).
• The fifth column is the average number of clicks, and the last column indicates whether a neighbour has purchased products in categories different from the user's (1 if true, 0 if false).
• The purpose of the last column is to give priority to neighbours who have purchased in different categories.
After building all the neighbour feature vectors and normalizing the columns to the range 0 to 1, we calculate the Euclidean distance between each of them and the user feature vector to find the closest neighbours to the user. The formation of the vector of features for each neighbour and product is shown in Fig 2 (a) and (b). Once the closest neighbours are defined, it is important to check whether the IDs of the products they purchased are new to the active user. For every product associated with the neighbours, we build a vector of features with three columns (see Figure 2(b)):
• The first column takes the value 1 if the product brand is between the user's max price and min price, and 0 if not. In addition, we check whether the individual neighbour has given an overall ranking of 5, which is the maximum value of an overall ranking. The 5 most common behaviour data used in all reviews of the product are checked against the list of user behaviour data considered within the separate category for reviews with rankings of 4 and 5.
• The second column holds the overall ranking given by the individual neighbour, a value between 1 and 5.
• The third column is a number between 0 and 5.
• Lastly, after normalizing all columns to the range 0 to 1, we compute the Euclidean distance between each product vector of features and the vector of features P1. The framework recommends to the user the products with the lowest Euclidean distance values.
The Euclidean distance ($D$) between two data points $x$ and $y$ is
$$D(x, y) = \sqrt{\sum_{t=1}^{d} (x_t - y_t)^2}$$
where $x$ and $y$ denote the input and feature instances and $d$ is the number of columns in the vectors.
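The neighbour-selection steps of this module (build the vectors, normalize each column to the range 0 to 1, rank by Euclidean distance) can be sketched in Python as follows. The column layout follows the description above; the three category names, the sample values and the function names are hypothetical.

```python
import math


def min_max_normalise(vectors):
    """Scale every column of a list of equal-length vectors into [0, 1]."""
    columns = list(zip(*vectors))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0
            for value, lo, hi in zip(vector, lows, highs)
        ]
        for vector in vectors
    ]


def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def nearest_neighbours(user_vector, neighbour_vectors, k):
    """Return the ids of the k neighbours closest to the user after scaling."""
    ids = list(neighbour_vectors)
    scaled = min_max_normalise([user_vector] + [neighbour_vectors[i] for i in ids])
    user_scaled, rest = scaled[0], scaled[1:]
    ranked = sorted(zip(ids, rest), key=lambda pair: euclidean(user_scaled, pair[1]))
    return [neighbour_id for neighbour_id, _ in ranked[:k]]


# Columns: [electronics, books, dresses, viewed, avg_clicks, other_categories]
user = [4, 1, 0, 1, 6.0, 0]
neighbours = {
    "n1": [5, 1, 0, 1, 5.0, 1],
    "n2": [0, 6, 2, 0, 1.0, 0],
    "n3": [3, 0, 1, 1, 7.0, 1],
}
closest = nearest_neighbours(user, neighbours, k=2)
```

Normalizing the user vector together with the neighbour vectors keeps all columns on one scale, so that a wide-ranged column such as average clicks does not dominate the distance.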

Module 3 (Race Classifier):
After forming the vector of features, we classify the features using a race classifier. The classifier is built on an improved Gaussian-based K-NN algorithm. It can be viewed as a data analysis technique for producing models that extract features from the dataset. The classification estimates the class value from the known feature variables. For each feature, the algorithm counts the input instances that belong to each class label, and the class with the maximum number of instances is the classification output. All K instances encountered contribute equally. As in Eqn. (1), the model considers the set of $n$ training tuples $T_i = \{(X_1, L_1), (X_2, L_2), \ldots, (X_n, L_n)\}$. The estimated class variable for a given tuple is obtained through a function $f()$ that takes the arguments $\arg()$ of the parameters, which are commonly set by the user $U$. For training instances, classification performance is improved by employing a similarity method based on a weighted K-NN algorithm, in which $c$ denotes the class label of each feature and a class represents a set of products that have common characteristics. $C(x)$ denotes the predicted class for a given training instance, $m$ represents the number of classes in the data and $p(x, k)$ is the set of $k$ nearest neighbours of $x$. The probability of the $j$-th class in the neighbourhood, $p\big(C(j)_{p(x,k)}\big)$, and the re-defined weighted K-NN algorithm are defined on this basis. In this process, the mean value of the extracted data is computed and passed to the K-NN classifier. The classifier computes the Euclidean distance between the feature vector and all samples in the database, with K set to 10, and sorts the distances in ascending order. It then selects the closest distances, computes a Gaussian value from the vector based on each distance, and computes the score of each label individually.
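The final steps described above (sort the distances, keep the K closest, turn each distance into a Gaussian value and accumulate a score per label) can be illustrated with the following Python sketch. The kernel form exp(-d^2 / (2 sigma^2)), the bandwidth parameter sigma and the sample data are assumptions; the weighting actually used in HAR-KNN may differ.

```python
import math
from collections import defaultdict


def gaussian_weight(distance, sigma=1.0):
    """Gaussian kernel: closer neighbours receive weights nearer to 1."""
    return math.exp(-(distance ** 2) / (2 * sigma ** 2))


def weighted_knn_predict(query, training, k=10, sigma=1.0):
    """Predict a label from (feature_vector, label) pairs by weighted vote."""
    # Distances to all samples, sorted in ascending order.
    distances = sorted(
        (math.dist(query, features), label) for features, label in training
    )
    # Accumulate a Gaussian-weighted score per label over the k closest samples.
    scores = defaultdict(float)
    for distance, label in distances[:k]:
        scores[label] += gaussian_weight(distance, sigma)
    return max(scores, key=scores.get)


# Hypothetical normalized feature vectors with category labels.
training = [
    ([0.1, 0.2], "electronics"),
    ([0.2, 0.1], "electronics"),
    ([0.9, 0.8], "books"),
    ([0.8, 0.9], "books"),
    ([0.85, 0.85], "books"),
]
predicted = weighted_knn_predict([0.15, 0.15], training, k=3, sigma=0.5)
```

Unlike an unweighted vote, in which all K instances contribute equally, the Gaussian weights let one very close neighbour outvote several distant ones.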
The above module computes the mean value of the similarities, and if this value is higher than 60%, the algorithm recommends the user's products to the target user. Here, the user behaviour information is integrated with the hybrid filtering model. After the vectors of features for each product and neighbour are created, the features that have equivalent rankings are classified together to strengthen shared inclination. The similarity is computed as in (10), where $S_{target-j}$ and $S_{i-j}$ denote the ratings of the target user and of user $u_i$ on product item $P_{item}$, $sim(U_{target}, U_i)$ denotes the similarity between $U_{target}$ and user $u_i$, $I$ denotes the common product items, and $\bar{S}_{target}$ and $\bar{S}_i$ represent the average rating scores over all products. The classification is based on the similarity values between the target user and the existing users. The recommended level of product $p_i$ for user $U$ is then computed, where $\varphi_{u_i, X, k}$ denotes the normalization factor for the parameters $u_i$, $X$, $k$.
The Complexity Analysis of Algorithm: In our recommendation algorithm, the users and products belonging to the recommender are represented as $R \subseteq U \times P$. The similarity model used in the existing approach has a time complexity of $O(n)$, where $n$ represents the number of users. The time complexity of the HAR-KNN algorithm is determined by the similarity model that is used to find each user's nearest neighbours. Based on this algorithm, factors which affect the

IV. EXPERIMENTAL EVALUATION
We used the Amazon product dataset [14], which consists of 18,501 product reviews (Item, Item ID, brand name, category) and the metadata information of the purchases with subfields such as action type (view, add to cart, change quantity, purchase, remove), bytes, of purchase, pct_purchase, of view, Response number.
The information such as Client IP, Server IP, Session ID, Rbytes, Rstat, Timestamp, and URL address are also referenced. The dataset information is collected from the Web Service of Amazon that joins with the administrative services.

A. EVALUATION METRICS
Performance is evaluated using metrics such as MAE, MSE, RMSE, Precision, Recall and F1-measure.
MAE: This metric computes the average absolute error between the actual ratings given by the user and the predicted ratings. It is one of the most widely used accuracy metrics for assessing recommendations and the most robust to outliers. MAE ranges from 0 to infinity; a lower value indicates better accuracy, and infinity corresponds to the maximum error in the predicted rating.

MSE:
It is a quality measure that computes the average squared error between the true rating and the predicted rating. A value closer to zero indicates better quality of the framework.
RMSE: It is the square root of MSE, i.e., the root mean square difference between the true and predicted ratings. Taking the square root of MSE expresses RMSE in the same units as the ratings. A lower RMSE value indicates better prediction accuracy of the RS.
Algorithm (fragment):
User behaviour form ($U_k$) ← $U_i$ // $U_i$ → number of users
Sort similarities ← Usersim($P_i$, $U_{target_n}$) // $U_{target_n}$ → number of target users
}
For (User U: Start the iteration for similarity) {
$R_u$ ← Recommendation Engine has user behaviour features ($P_n$, $V_n$, $C_A$, $P_n$)
MAE, MSE, and RMSE are estimated as follows:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |P_i - A_i| \quad (12)$$
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (P_i - A_i)^2 \quad (13)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (P_i - A_i)^2} \quad (14)$$
where $P$ represents the user's predicted rating, $A$ denotes the user's actual rating and $n$ denotes the total number of products in the recommendation list. Lower values of MAE, MSE and RMSE indicate better prediction accuracy.
Precision (P): It is defined as the fraction of the recommended items that are relevant to the target user. Higher values indicate better performance. The precision is defined as
$$P = \frac{n_{rs}}{n_s} \quad (15)$$
where $n_s$ denotes the number of recommended items for the target user and $n_{rs}$ denotes the number of relevant products that appeared in the recommended list for the target user.

Recall (R):
It is the proportion of the products favoured by the target user that appear among the recommended items. As with precision, higher values indicate better performance. The recall is defined as
$$R = \frac{n_{rs}}{n_r} \quad (16)$$
where $n_r$ represents the number of products favoured by the target user.
F1 measure: It is a function of recall and precision, defined as
$$F1 = \frac{2 \cdot P \cdot R}{P + R} \quad (17)$$
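For reference, the six measures of this subsection can be computed with a few lines of Python. This is a minimal sketch using the standard definitions of the metrics; the function names and the sample ratings and item lists are hypothetical.

```python
import math


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def mse(predicted, actual):
    """Mean squared error between predicted and actual ratings."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error, in the same units as the ratings."""
    return math.sqrt(mse(predicted, actual))


def precision_recall_f1(recommended, relevant):
    """Precision, recall and F1 for one target user.

    recommended: list of recommended items (n_s = its length).
    relevant: collection of items favoured by the user (n_r = its size).
    """
    hits = len(set(recommended) & set(relevant))  # n_rs
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical ratings and recommendation list.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]
errors = (mae(predicted, actual), mse(predicted, actual), rmse(predicted, actual))

p, r, f1 = precision_recall_f1(
    recommended=["p1", "p2", "p3", "p4"],
    relevant={"p1", "p3", "p7", "p8", "p9"},
)
```

A short recommendation list measured against a larger set of favoured products naturally yields high precision but low recall, since recall is bounded by the list length divided by $n_r$.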

B. EXPERIMENTAL RESULT AND COMPARISON
The experiments are performed on an Amazon dataset [14]. HAR-KNN is compared with the following methods:

1) COLLABORATIVE FILTERING (CF-BASED)
In this methodology, recommendations are determined by a collaborative method that uses the activities and ratings of the user and the community. It tends to identify users with similar interests in order to predict similar items. In addition, item-based CF (IBCF), IBCF using time-aware similarity computation (TRIBCF), coverage degree based IBCF (CDIBCF) using coverage-based rating prediction, time weight IBCF (TWIBCF) and time-related correlation degree IBCF (TCIBCF) are used for comparison with the proposed recommendation framework.

2) CONTENT-BASED (CB) FILTERING
This type of approach uses attributes such as user feedback and actions to recommend items similar to those the user has liked before.
Fig 3 presents the MAE of HAR-KNN compared with the existing techniques on the Amazon dataset. As the number of neighbours increases, the MAE first decreases and then increases again. TCIBCF and TWIBCF give similar predictions, and both improve on CDIBCF and IBCF. The lower the MAE value, the better the prediction. As seen, CF and CBF improve prediction accuracy, whereas HAR-KNN achieves a lower error than the other methods. From these results, we can assert that the proposed system predicts more accurately than the other traditional and earlier approaches.
For RMSE, the error value initially decreases as the number of neighbours grows and then increases. The lower the RMSE measure, the better the prediction. Our proposed HAR-KNN approach therefore achieves a lower error value and better accuracy than the existing recommender systems.
Fig 5 illustrates the MSE of the proposed RS on the Amazon dataset. Here, HAR-KNN achieves better prediction, with a lower deviation value than the earlier models. These results indicate improved prediction accuracy, as the computed measures are better than those of the existing methods.
Fig 6 shows the precision achieved on the Amazon dataset. The precision of the existing methods drops as the neighbourhood size increases. Moreover, IBCF, TRIBCF and TWIBCF perform alike when compared with CF, CBF, and HAR-KNN, while HAR-KNN attains higher precision values than the existing approaches.
Fig 7 shows the recall achieved on the Amazon dataset [14]. TWIBCF and TRIBCF are close to the traditional IBCF method, whereas the recall of TCIBCF, CDIBCF, CF, and CBF increases faster than that of the IBCF methods. Furthermore, HAR-KNN improves considerably as the neighbourhood increases. Hence, HAR-KNN and TCIBCF improve recall compared to traditional methods.
increases than that of existing TCIBCF, CDIBCF, TRIBCF, TWIBCF, CF, CBF, and IBCF methods.
Table 2 reports the simulation results for MAE, RMSE and MSE, which show that HAR-KNN has a lower error rate than the existing methods as the neighbourhood size increases from 10 to 50. Bold marks the lowest error value at each neighbourhood size. Table 3 reports the simulation results for Precision, Recall and F1, which show that HAR-KNN attains higher values than the existing methods. Bold marks the highest values as the size of the recommendation list varies from 2 to 12.
Under single-attribute criteria, the collected data lack an overall rating, which may give an inaccurate view of the true similarity among user preferences. Under multi-attribute criteria, the user behaviour data capture richer information, which allows connections between users and products to be discovered. We use the user behaviour matrix together with the HAR-KNN approach to evaluate and test all approaches against existing recommendation methods.
Our simulation uses different performance metrics to show how much value an RS adds. RMSE, MSE and MAE measure the average difference between the true value and the value predicted for the target user. Most real e-commerce settings use a small dataset to run the RS and check how well it matches the correct answers in the data. A top-N recommendation list of products for users is then constructed and assessed using precision, recall and F1. Across all these metrics, our experimental results show the importance of multi-attribute behaviour data, which is very useful for predicting a list of recommended products for a target user.

V. CASE STUDY
To solve the issues of data sparsity and loss of neighbour activity, the HAR-KNN recommender system was developed to employ the user needs of suitable products. The recent study shows that RS is the main part of an E-Commerce site and is valuable for increasing buyers and for constructing user loyalty.
An Amazon dataset [14] was used as a case study to capture the domain characteristics of an E-Commerce RS and to examine the feasibility of applying the proposed HAR-KNN approach to a real-time application. The dataset incorporates a large number of users and products.
In our experiment, we randomly select products and use the user behaviour matrix, rating orders, and the classification of feature ordering conditions to construct the training and testing phases. To check the performance of the RS, we divide the dataset into 80% for training and the remaining 20% for testing.
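The 80/20 division described above can be sketched as a random permutation of row indices. This is only an illustration: the function name `split_train_test`, the fixed seed, and the row count of 1000 are assumptions, and the paper does not state how its split was implemented.

```python
import numpy as np

def split_train_test(n_rows, train_fraction=0.8, seed=0):
    """Return (train_idx, test_idx) drawn from a random permutation of row indices."""
    rng = np.random.default_rng(seed)          # fixed seed so the split is repeatable
    order = rng.permutation(n_rows)            # shuffle the row indices
    cut = int(round(train_fraction * n_rows))  # number of rows kept for training
    return order[:cut], order[cut:]

# Hypothetical dataset with 1000 rows of the user behaviour matrix
train_idx, test_idx = split_train_test(1000)
```

The two index arrays are disjoint and together cover every row, so no record is used for both training and verification.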
We set the maximum length of the document to 300. To achieve better accuracy, several values of the number of nearest neighbours (K) are considered, ranging from 10 to 50. Generally, higher values of K reduce the effect of noise on classification. With cross-validation, the value of K with the lowest classification error is selected. The distance between a new data instance and the training data is measured by the Euclidean distance. The procedure is repeated for various subsets of the data and values of K until a set number of values has been tried. The error rate is then evaluated using MAE, MSE, and RMSE.
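A minimal sketch of this selection loop is given below, under several assumptions that are not taken from the paper: ratings are predicted as the mean rating of the K nearest training rows, K is swept over 10, 20, 30, 40 and 50, five folds are used, MAE is the selection criterion, and the data are synthetic. The helper names `knn_predict`, `error_metrics` and `sweep_k` are hypothetical. RMSE is computed in the standard way, as the square root of MSE.

```python
import numpy as np

def knn_predict(train_x, train_y, query_x, k):
    """Predict each query rating as the mean rating of its k nearest training rows."""
    # Pairwise Euclidean distances, shape (n_query, n_train)
    diffs = query_x[:, None, :] - train_x[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    nearest = np.argsort(dists, axis=1)[:, :k]   # indices of the k closest rows
    return train_y[nearest].mean(axis=1)

def error_metrics(y_true, y_pred):
    """Return MAE, MSE and RMSE (RMSE is the square root of MSE)."""
    err = y_true - y_pred
    mse = float((err ** 2).mean())
    return {"MAE": float(np.abs(err).mean()), "MSE": mse, "RMSE": float(np.sqrt(mse))}

def sweep_k(x, y, k_values=(10, 20, 30, 40, 50), n_folds=5, seed=0):
    """Cross-validate each K and return (best K, mean MAE per K)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    results = {}
    for k in k_values:
        fold_mae = []
        for i, held_out in enumerate(folds):
            # Train on every fold except the held-out one
            train = np.concatenate([f for j, f in enumerate(folds) if j != i])
            pred = knn_predict(x[train], y[train], x[held_out], k)
            fold_mae.append(error_metrics(y[held_out], pred)["MAE"])
        results[k] = float(np.mean(fold_mae))
    best_k = min(results, key=results.get)       # K with the lowest mean MAE
    return best_k, results

# Synthetic stand-in for behaviour features (e.g. views, clicks, purchases)
rng = np.random.default_rng(42)
features = rng.random((300, 3))
ratings = 1 + 4 * features.mean(axis=1) + rng.normal(0, 0.3, 300)
best_k, mae_by_k = sweep_k(features, ratings)
```

On real data, `features` would be rows of the user behaviour matrix and `ratings` the observed ratings; which K is chosen depends entirely on the data.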
In the simulation results, the evaluation metrics indicate how predictions for the target item change as the size of the neighbourhood increases. Compared with the relevant approaches, CF and CB filtering perform poorly with a small number of nearest neighbours. Therefore, the predicted rating is computed from 10 to 50 nearest neighbours. In this range, the selected neighbourhood products of the target product are more consistent. Hence, the neighbourhood selected by HAR-KNN reaches the target products in a way similar to the traditional methods. Moreover, precision, recall, and F1 are computed over the relevant products, with the number of recommendations set to [2, 4, 6, 8, 10, 12]. Overall, HAR-KNN provides a better RS with better predictive accuracy than the existing methods on the Amazon dataset.
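The top-N measures can be illustrated with the standard definitions of precision, recall and F1 over a ranked recommendation list. The sketch below is not the authors' evaluation code: the function name `precision_recall_f1_at_n`, the product identifiers and the set of relevant products are made up for illustration, and only the list sizes 2 to 12 come from the text.

```python
def precision_recall_f1_at_n(ranked_items, relevant_items, n):
    """Standard precision, recall and F1 for the first n recommended items."""
    top_n = ranked_items[:n]
    hits = sum(1 for item in top_n if item in relevant_items)
    precision = hits / n if n else 0.0
    recall = hits / len(relevant_items) if relevant_items else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical ranked recommendations for one target user
ranked = ["p7", "p2", "p9", "p4", "p1", "p5", "p3", "p8", "p6", "p0", "p11", "p10"]
# Hypothetical products the user actually found relevant
relevant = {"p7", "p9", "p1", "p3", "p10"}

# Recommendation list sizes used in the case study
scores = {n: precision_recall_f1_at_n(ranked, relevant, n) for n in (2, 4, 6, 8, 10, 12)}
```

In a full evaluation these values would be averaged over all test users. Recall can only stay level or rise as the list grows, while precision falls once the extra slots are filled with non-relevant products.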

VI. CONCLUSION
In this paper, the proposed HAR-KNN model is based on purchase histories and user behaviour data such as the number of views, clicks and purchased products. The proposed method constructs the user behaviour matrix and pre-processes the purchase frequency of products to make predictions for users using the vector of features for each neighbour and product.
Once constructed, the user behaviour matrix is used to identify other users and neighbours with similar tastes. Similarity is determined by calculating the Euclidean distance, and the algorithms used in the framework are refined to estimate the similarity level more accurately. Moreover, the HAR-KNN model is validated on the Amazon dataset and compared with IBCF, TRIBCF, TCIBCF, CDIBCF, TWIBCF, CB and CF in terms of MAE, RMSE, MSE, Recall, Precision, and F1-measure values. The experimental results show that HAR-KNN achieves higher predictive accuracy with lower error than the traditional methods.
With the present data, the user behaviour matrix depends upon the preferences expressed through user behaviour. Many applications use two dimensions (i.e. users and products). This may not be sufficient, as customer preferences may vary widely with different factors. In the future, the concept of multi-dimensionality needs to be incorporated into RS.
HOANG VIET LONG received the Ph.D. degree in computer science from the Hanoi University of Science and Technology, in 2011, where he defended his thesis in the fuzzy and soft computing field. He has been an Associate Professor of information technology since 2017. Recently, his interests have included cybersecurity, machine learning, bitcoin, and blockchain, and he has published more than 40 articles in ISI-covered journals.
DAVID TANIAR received the bachelor's, master's, and Ph.D. degrees, all in computer science, specializing in databases. He is currently an Associate Professor with Monash University, Australia. He has published over 200 journal articles. His research areas are big data processing, data warehousing, and mobile and spatial query processing. He has published a book on high-performance parallel database processing (Wiley, 2008). He is a regular keynote speaker at international conferences, delivering lectures and speeches on big data. He is a founding Editor-in-Chief of the International Journal of Web and Grid Services and the International Journal of Data Warehousing and Mining.
ISHAANI PRIYADARSHINI received the bachelor's degree in computer science engineering and the master's degree in information security from the Kalinga Institute of Industrial Technology, India, and the master's degree in cybersecurity from the University of Delaware. She is currently pursuing the Ph.D. degree with the University of Delaware, USA. She has authored several book chapters for reputed publishers and several publications in SCIE-indexed journals. As a certified reviewer, she conducts peer review of research articles for prestigious IEEE, Elsevier, and Springer journals and is a member of the Editorial Board of the International Journal of Information Security and Privacy (IJISP). Her areas of research include cybersecurity, artificial intelligence, and HCI. VOLUME 8, 2020