Contexts Enhance Accuracy: On Modeling Context Aware Deep Factorization Machine for Web API QoS Prediction

Service-oriented computing (SOC) promises a world of loosely connected, cooperating services from which agile Web applications can be conveniently built in heterogeneous environments. The Web application programming interface (API), as an emerging technique, attracts more and more enterprises and organizations to publish their deep computing functionalities and big data on the Internet. Web APIs have become the backbone of SOC development and have given rise to a prosperous Web API economy. However, the number of available Web APIs on the Internet is massive and constantly growing, which causes the Web API overload problem. Quality of service (QoS) is an indicator that differentiates the quality of Web APIs well and has been widely applied to high-quality Web API selection. Because testing QoS for massive numbers of Web APIs is resource-consuming, and QoS performance depends on contextual information such as network and location, accurate QoS prediction has become crucial for personalized Web API recommendation and high-quality Web application construction. To address this issue, this paper presents a context aware deep factorization machine model (CADFM for short) for accurate Web API QoS prediction. Specifically, we first carry out a detailed data analysis on a real-world QoS dataset and discover a positive relationship between QoS and contextual information, which motivates us to incorporate beneficial contexts to enhance QoS prediction accuracy. Then, we treat QoS prediction as a regression problem and propose the CADFM framework, which integrates contextual information via an embedding technique. In particular, we adopt a factorization machine (FM) and a multi-layer perceptron (MLP) for high-order and nonlinear interaction modeling, so as to learn the complex interactions between users and Web APIs accurately. Finally, experimental results on a real-world QoS dataset demonstrate that CADFM outperforms both classic and state-of-the-art baselines, generating the most accurate QoS predictions and increasing the revenue of Web API recommendation.


I. INTRODUCTION
Service-oriented computing (SOC) is now a widely accepted design paradigm for agile application development over the Internet in the form of service invocation or service composition, aiming to make applications flexible and quick to respond to business requirements [1], [2]. Since SOC was first proposed, it has spread widely among enterprises and organizations worldwide. The principle of SOC significantly changes the form, development paradigm, and operation of current Internet-based software [3]. Specifically, the form of software has evolved from source code into a mixture of source code and third-party services; the development paradigm has moved from application integration within an organization to function integration and data sharing across heterogeneous, cross-border systems connected by services; and the operation of software has been transformed from isolated systems to open collaborative systems composed of services. Thus, SOC has a profound impact on the delivery, development, and operation of current Internet-based software.
The associate editor coordinating the review of this manuscript and approving it for publication was Fabrizio Messina.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In the early days, Web services were the main technology for realizing SOC. Web services are exposed using standard network protocols, such as the simple object access protocol (SOAP), and published in a way that allows developers to find them quickly and reuse them to assemble new Web applications [4], [5]. Currently, the Web application programming interface (API), as the new technology of SOC, has gradually replaced traditional SOAP-based Web services because it is lightweight and easy to access and compose [6], [7]. Web APIs conveniently expose deep computing functionalities and big data, hitherto hidden inside enterprises and organizations, to partners and interested users, which promotes the formation and prosperity of the Web API economy [8]. For instance, volunteers on GitHub use Web APIs to provide coronavirus information, Alibaba uses Web APIs to support its payment process, Google uses Web APIs to provide map services, and Microsoft uses Web APIs to bring artificial intelligence services such as face recognition and speech recognition into daily life, all of which make Web APIs increasingly popular.
Nowadays, driven by the API economic model, more and more enterprises and organizations publish their deep computing functionalities and big data on the Internet as Web APIs, forming a symbiotic, win-win network. As of July 2020, ProgrammableWeb, the world's largest Web API platform, had published 23,194 Web APIs in 490 categories, and the number of Web APIs has grown by 30% annually over the past four years [9]. The APIs.guru and RapidAPI platforms are also constantly enriching and improving their Web API resources. Such rich Web APIs give users more choices, but they also make choosing difficult because users are drowning in a sea of Web APIs, which further hinders the development of Web APIs. Although ProgrammableWeb, APIs.guru, and RapidAPI provide search engines for Web API selection, they simply filter by keywords, which cannot efficiently meet users' needs for accurate, high-quality Web API selection. Furthermore, recent works exploit functional requirement matching to enable finer-grained Web API recommendation [10], [11], but how ordinary users can quickly identify and select suitable Web APIs from candidates with similar functionality is still an unsolved problem.
To support Web API selection beyond functionality, quality of service (QoS) has been introduced for high-quality Web API selection and personalized Web API recommendation [12], [13]. QoS is defined as a set of non-functional attributes of a Web API; each attribute, such as response time, throughput, or cost, has a value that represents the quality of the Web API in a certain aspect [14]. Since QoS can differentiate the performance of Web APIs with similar functionality well, it has become one of the decisive factors when users choose Web APIs. However, in practice, QoS prediction is necessary for the following reasons: 1) Data-sparsity. The number of Web APIs on the Internet is massive and growing, while a single user usually uses only a limited number of them, so the interactive QoS data of users and Web APIs is very sparse and the QoS values of most Web APIs remain unknown; 2) Resource-consumption. Users usually use keywords to filter out functionally irrelevant Web APIs and obtain a candidate list, but testing the QoS performance of all candidate Web APIs before selecting and invoking them takes a lot of time and resources; 3) Cost-consumption. Many Web APIs offer only a limited number of free calls, and calls beyond that number are charged; 4) Context-sensitivity. QoS performance is quite sensitive to unpredictable external contexts, such as varying network conditions and invocation location. In view of these facts, accurate and personalized Web API QoS prediction becomes a realistic and challenging issue.
(1) Data-sparsity limits the prediction accuracy and general applicability of collaborative filtering (CF) methods. CF methods make QoS predictions based on the following assumption: users/Web APIs that have observed/performed similar QoS in the past are likely to observe/perform similar QoS in the future, so unknown QoS values can be predicted by a weighted sum of the known QoS values of similar neighbors. Although existing CF works improve QoS prediction accuracy by considering the significance of similarity [15], the reliability of neighbors [16], and the influence of the QoS data range [17], [18], CF uses only the historical QoS of some similar neighbors for prediction. The accuracy of CF-based methods is therefore quite sensitive to the available QoS data, since similarity exaggeration exists in the data-sparsity case [18]. Additionally, CF may not work at all when the available QoS data is so sparse that users have no co-invoked Web APIs. These deficiencies limit the prediction performance and general applicability of CF for accurate Web API QoS prediction.
(2) Systematic modeling of higher-order and non-linear interactions between users and Web APIs is neglected. Unlike CF, which makes QoS predictions directly from the QoS values of similar neighbors, a line of matrix factorization (MF) based research learns the latent feature vectors of users and Web APIs from the raw QoS interaction data [19] and generates QoS predictions by linearly combining these latent feature vectors with an inner product operation, as in NIMF [20], Colbar [21], and HMF [22]. Moreover, the factorization machine (FM) embeds each feature into a vector representation and uses the inner product of pairwise feature vectors as their interaction [23]-[25]. Although MF and FM yield strong performance by capturing the latent features of users and Web APIs, nonlinear feature interactions are still ignored. He et al. [26] pointed out that it is difficult to capture the complex structure of user and Web API interactions with linear models such as MF and FM. Hence, it is crucial to model both higher-order and non-linear feature interactions systematically in order to improve the accuracy of Web API QoS prediction.
(3) A qualitative analysis of the relationship between QoS and contextual information is missing, and the modeling of contextual information is unsatisfactory. As mentioned earlier, QoS is context-sensitive. Many prior efforts have shown the importance of modeling context in QoS prediction [21], [22], [25]. However, the relationship between QoS and contextual information has not been clearly established, and contextual information is involved in the modeling process only indirectly, through context-aware neighbors. Hence, contextual information is not involved in the subsequent model prediction process, which further limits the model's ability to perceive contexts.
To tackle the aforementioned problems in real-world Web API QoS prediction, this paper proposes the Context Aware Deep Factorization Machine (CADFM for short), a generalized model for accurate Web API QoS prediction. Specifically, we first investigate the potential relationship between explicit QoS data and implicit contexts through a detailed data analysis on a real-world QoS dataset and discover a positive relationship between them, which motivates us to incorporate beneficial contextual information for accurate QoS prediction. Then, contextual information from both the user and Web API sides is modeled in CADFM through feature embedding, which provides a promising way to alleviate the data-sparsity issue. Furthermore, by adopting FM and a multi-layer perceptron (MLP) within a well-designed end-to-end model, CADFM is capable of capturing higher-order and nonlinear feature interactions effectively. Finally, extensive experiments on a real-world QoS dataset answer our research questions.
In summary, the main contributions of this work are: (1) We discover a positive relationship between QoS data and contextual information through a qualitative analysis of a real-world QoS dataset and demonstrate that users and Web APIs sharing the same context, such as country or network condition, tend to have more similar QoS performance, and vice versa. This provides reliable evidence for enhancing the accuracy of QoS prediction in the data-sparsity case by exploiting contextual information.
(2) We build a context aware deep factorization machine and incorporate two types of contexts from both the user and service sides through feature embedding for accurate Web API QoS prediction. In particular, CADFM emphasizes the higher-order and non-linear feature interactions of users and Web APIs by systematically combining the results of a factorization machine and a multi-layer perceptron.
(3) We compare our CADFM model with state-of-the-art methods and demonstrate its superiority through extensive experiments on a real-world QoS dataset, showing that our approach is able to leverage contextual information more effectively and capture the complex interactions of users and Web APIs more accurately.
The rest of this paper is organized as follows. Section II reviews existing QoS prediction works. Section III formulates the research question and presents the analysis of QoS data and contextual information. Section IV introduces the proposed QoS prediction model, including its framework and modeling process. Section V presents the evaluation of CADFM and analyzes the experimental results in detail. Section VII concludes the paper and points out planned future work.

II. RELATED WORKS
In recent years, many approaches have been proposed to improve the performance of QoS prediction, and Fig. 1 shows the evolution of QoS prediction methods. In this section, we review typical works in this domain and their related problems.

A. COLLABORATIVE FILTERING BASED QoS PREDICTION
Generally, CF-based methods abstract Web API QoS prediction as a matrix completion problem, as shown in Fig. 1. Specifically, users contribute the observed QoS of Web APIs to the universal description, discovery and integration (UDDI) system [27], so that m users and n Web APIs naturally form an m × n QoS matrix $R_{m \times n}$. As discussed before, the number of available Web APIs on the Internet is large and constantly growing, and an active user usually uses only a limited number of them, so many entries in R have no QoS value. The task of QoS prediction is to complete the unknown entries in R as accurately as possible.
To complete the unknown entries in the user-Web API QoS matrix R, CF-based methods usually follow three steps: 1) Similarity Calculation. Similarity, which measures the strength of the relationship between users or Web APIs, is generally calculated from the known QoS values; typical similarity models include the Pearson correlation coefficient, cosine similarity, and the Jaccard coefficient. 2) Neighborhood Selection. This step obtains similar neighbors based on the calculated similarities; the most widely adopted technique is the Top-K algorithm. 3) Collaborative Prediction. The final QoS prediction is generated as a weighted sum of the QoS values of the selected neighbors, as shown in Eq. (1).
$$\hat{r}_{u,a} = \bar{r}_u + \frac{\sum_{v \in Nei(u)} sim(u, v)\,(r_{v,a} - \bar{r}_v)}{\sum_{v \in Nei(u)} sim(u, v)} \tag{1}$$
where $\hat{r}_{u,a}$ is the predicted QoS value of user u on Web API a, sim(u, v) is the similarity between users u and v, Nei(u) is the neighbor set of user u, and $\bar{r}_u$ and $\bar{r}_v$ are the average QoS values of users u and v on their invoked Web APIs.
Since CF is easy to implement and its recommendations are easily accepted by users, many CF-based methods have been proposed to complete the QoS matrix as accurately as possible. Shao et al. were the first to apply CF to QoS prediction, differentiating positive and negative similarity [28]. Zheng et al. proposed a hybrid method, WSRec, which makes predictions by combining user-based and item-based CF [15] and considers the significance of similarity to alleviate the similarity exaggeration caused by data-sparsity. Wang et al. proposed a distance-based Top-K selection strategy that uses latitude and longitude coordinates to select more similar neighbors [16]. Li et al. improved QoS prediction accuracy by considering the probability distribution of QoS data [17], and our research group improved the accuracy of CF by considering the data range characteristic of QoS data [18]. These prior efforts focus on designing more meaningful similarity measures, choosing more reliable neighbors, and considering the characteristics of QoS data to enhance the performance of CF-based methods. However, because CF essentially uses only the data of some similar neighbors for prediction, its accuracy relies heavily on the available QoS data. Facing the unavoidable data-sparsity problem, the prediction accuracy of CF is still far from ideal, and CF may not work at all when QoS data is so sparse that users share no Web APIs.
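The three CF steps described in this subsection can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any cited method: it assumes PCC as the similarity measure, keeps only positively similar neighbors, and uses the common mean-centered weighted sum for the final prediction. The function names (`pcc`, `predict_cf`) and the toy QoS values are hypothetical.

```python
from math import sqrt


def pcc(ru, rv):
    """Pearson correlation over the Web APIs that both users have invoked."""
    common = [a for a in ru if a in rv]
    if len(common) < 2:
        return 0.0
    mu_u = sum(ru[a] for a in common) / len(common)
    mu_v = sum(rv[a] for a in common) / len(common)
    num = sum((ru[a] - mu_u) * (rv[a] - mu_v) for a in common)
    den = sqrt(sum((ru[a] - mu_u) ** 2 for a in common)) * \
        sqrt(sum((rv[a] - mu_v) ** 2 for a in common))
    return num / den if den else 0.0


def mean(values):
    return sum(values) / len(values)


def predict_cf(qos, u, api, k=2):
    """Predict the QoS of `api` for user `u` from its top-k similar users."""
    # Step 1: similarity to every other user that has invoked `api`.
    sims = [(pcc(qos[u], qos[v]), v) for v in qos if v != u and api in qos[v]]
    # Step 2: Top-K neighbors (assumption: only positive similarities count).
    neighbors = sorted((sv for sv in sims if sv[0] > 0), reverse=True)[:k]
    if not neighbors:
        return mean(list(qos[u].values()))  # fall back to the user's average
    # Step 3: mean-centered weighted sum of the neighbors' QoS values.
    num = sum(s * (qos[v][api] - mean(list(qos[v].values()))) for s, v in neighbors)
    den = sum(s for s, _ in neighbors)
    return mean(list(qos[u].values())) + num / den


# Toy response times (seconds); None of these values come from WS-DREAM.
qos = {
    "u1": {"a1": 1.0, "a2": 2.0, "a3": 3.0},
    "u2": {"a1": 1.1, "a2": 2.1, "a3": 3.1, "a4": 4.1},
    "u3": {"a1": 3.0, "a2": 2.0, "a3": 1.0, "a4": 0.5},
}
prediction = predict_cf(qos, "u1", "a4")
print(round(prediction, 3))
```

In this toy data, u3 is negatively correlated with u1 and is therefore dropped, so the prediction for u1 on a4 is driven by u2 alone and comes out at about 3.5. With sparser data the `common` list shrinks, which is exactly where the similarity estimate becomes unreliable.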

B. MATRIX FACTORIZATION BASED QoS PREDICTION
Unlike CF, which tries to complete the user-Web API QoS matrix directly from the wisdom of the neighborhood, researchers introduced MF to project the high-dimensional sparse QoS matrix into two low-dimensional dense matrices [19], so as to discover the latent features shared between users and Web APIs. Missing QoS values can then be calculated as the inner product of the learned user and Web API latent feature vectors. In other words, the original QoS matrix R can be approximated by the user and Web API latent feature matrices, as shown in Eq. (2).
$$R \approx \hat{R} = P \cdot Q^{T}, \quad \hat{r}_{u,a} = \langle p_u, q_a \rangle = \sum_{k=1}^{f} p_{u,k} \cdot q^{T}_{a,k}, \quad \text{where } \hat{r}_{u,a} \in \hat{R},\ p_u \in P,\ q_a \in Q \tag{2}$$
where P and Q are the user and Web API latent feature matrices, respectively, $p_u$ and $q_a$ are the user and Web API latent feature vectors, respectively, and f is the number of latent features.
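As a rough illustration of Eq. (2), the sketch below learns P and Q from a handful of observed QoS records with plain stochastic gradient descent and then predicts an unobserved entry by an inner product. The squared-error objective with L2 regularization, the hyperparameters, and all names (`train_mf`, `sse`, `records`) are assumptions made for this example, not the training procedure of any cited MF method.

```python
import random


def dot(p, q):
    """Inner product <p_u, q_a> of two latent feature vectors."""
    return sum(pk * qk for pk, qk in zip(p, q))


def train_mf(records, n_users, n_apis, f=4, lr=0.01, reg=0.02, epochs=500, seed=0):
    """Learn latent matrices P (users x f) and Q (APIs x f) by SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(f)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(f)] for _ in range(n_apis)]
    for _ in range(epochs):
        for u, a, r in records:
            err = r - dot(P[u], Q[a])
            for k in range(f):
                pu, qa = P[u][k], Q[a][k]
                P[u][k] += lr * (err * qa - reg * pu)
                Q[a][k] += lr * (err * pu - reg * qa)
    return P, Q


def sse(records, P, Q):
    """Sum of squared errors on the observed entries."""
    return sum((r - dot(P[u], Q[a])) ** 2 for u, a, r in records)


# Observed (user, api, response time) triples; toy values for illustration.
records = [(0, 0, 0.4), (0, 1, 1.2), (1, 0, 0.5), (1, 2, 2.0), (2, 1, 1.0), (2, 2, 1.8)]
P0, Q0 = train_mf(records, 3, 3, epochs=0)   # untrained baseline
P, Q = train_mf(records, 3, 3)
print(sse(records, P0, Q0), sse(records, P, Q))
print(dot(P[0], Q[2]))                        # prediction for an unobserved entry
```

The prediction step is purely linear in the latent features, which is the limitation the paper later addresses by adding a non-linear component.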
Owing to the good scalability of the MF model, researchers have extended MF with many mature techniques, such as feature fusion [20] and regularization [21], [29], to improve its QoS prediction accuracy. Zheng et al. systematically integrated carefully selected similar neighbors into MF through feature fusion to model users' features accurately, and their experimental results demonstrate that incorporating the neighborhood can improve the performance of MF [20]. Yin et al. adopted regularization to model the impact of a user's geographical neighborhood [21]. Ryu et al. used location information to improve similarity based on geographical distance and added two well-designed regularization terms, for users and services, to the objective function of MF [29]. The experiments in [21] and [29] both verify the effectiveness of regularization terms in improving QoS prediction accuracy. Despite their great success in alleviating the data-sparsity issue, we argue that it is difficult for MF models to capture the inherent higher-order and non-linear interaction structure of users and Web APIs, so good prediction performance cannot be guaranteed.

C. DEEP LEARNING BASED QoS PREDICTION
Recently, FM has attracted much attention as a way to improve QoS prediction accuracy [23]-[25]. FM converts QoS prediction into a regression problem, embeds each feature into a vector representation, and uses the inner product of pairwise feature representations as their interaction. Wu et al. first proposed EFMPred using FM, which captures the implicit relationship between users and services by embedding user IDs and service IDs [23]. Tang et al. obtained users' similar neighbors and nearby neighbors and then embedded these neighbors into the classical FM [24]. Yang et al. proposed a location-based FM model for QoS prediction: they use longitude and latitude information to select the near neighbors of users and services, and these neighbors, together with the user and service, are fed into the FM model to make predictions [25]. Note that the pairwise interaction in FM is implemented by a linear inner product of the learned features, so complex and nonlinear interactions are still ignored.
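The pairwise interaction described above can be made concrete with a small sketch of the standard second-order FM scoring function, in which a bias, first-order weights, and inner products of feature vectors are summed. It is assumed here that the cited FM-based methods follow this standard form; the names `fm_predict` and `fm_predict_naive` and all parameter values are hypothetical. The first function uses the usual linear-time rewriting of the pairwise term, and the second computes the same value by explicit enumeration of feature pairs.

```python
def fm_predict(x, w0, w, V):
    """Second-order FM score: w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j.

    The pairwise term is computed per latent dimension k as
    0.5 * ((sum_i v_ik x_i)^2 - sum_i (v_ik x_i)^2).
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for k in range(len(V[0])):
        s = sum(V[i][k] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][k] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_predict_naive(x, w0, w, V):
    """Same score, enumerating every feature pair explicitly."""
    d = len(x)
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = sum(
        sum(V[i][k] * V[j][k] for k in range(len(V[0]))) * x[i] * x[j]
        for i in range(d) for j in range(i + 1, d)
    )
    return w0 + linear + pairwise


# One-hot input: positions 0-1 encode the user, positions 2-3 the Web API.
x = [1, 0, 1, 0]
w0 = 0.1
w = [0.2, 0.0, 0.3, 0.0]
V = [[1.0, 2.0], [0.5, 0.5], [3.0, 4.0], [0.1, 0.2]]  # one vector per feature
print(fm_predict(x, w0, w, V), fm_predict_naive(x, w0, w, V))
```

With one-hot user and API features, the only active pair is (user, API), so the pairwise term reduces to the inner product of the two embedding vectors, which makes the linearity of the interaction easy to see.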
To model nonlinear feature interactions, typical works use auto-encoders [30], [31] and the multi-layer perceptron (MLP) [32]. White et al. adopted a stacked auto-encoder to generate richer representations that lead to better results [30]. Yin et al. used an auto-encoder improved by a substitution strategy to obtain the nonlinear latent features of users and services, and missing QoS values are then generated by traditional MF methods [31]. Since an auto-encoder generally uses a neural network with one hidden layer to learn the embedding features, Zhang et al. adopted the MLP to model the nonlinear characteristics of embedding features [32]; they also embedded the similar neighborhood in the MLP to further improve prediction accuracy [33]. Wu et al. proposed a deep neural network for QoS prediction with contextual information [34], in which a deep neural network is appended in series to the end of FM. Note that this network structure is similar to the work in [35], whereas our work uses a left-right network structure to model the high-order and non-linear feature interactions. Previous results strongly verify that nonlinear structure learning is important for enhancing QoS prediction accuracy. Hence, it is crucial to model both higher-order and non-linear feature interactions systematically in order to improve the accuracy of Web API QoS prediction.

D. CONTEXT AWARE QoS PREDICTION
To alleviate the data-sparsity issue, contextual information is generally used as complementary information to improve QoS prediction accuracy. For example, Wang et al. adopted a distance-based enhanced Top-K selection strategy that uses latitude and longitude to select similar neighbors [16]. Saeed et al. designed a clustering algorithm based on geographical location to improve the quality of the selected similar neighbors, and their results show that geographical neighbors can effectively improve QoS prediction accuracy [36]. Wu et al. proposed a context-aware MF framework by integrating the context-aware neighborhood [37]. Our group also proposed a bottom-up neighbor discovery algorithm based on a context-aware tree structure [38], in which MF is extended by integrating the context-aware neighborhood. Yang et al. fed the selected neighbors directly into an FM model to make predictions [25], demonstrating that integrating context information can indeed enhance the performance of CF-, MF-, and FM-based methods. It is worth noting that existing works perceive contextual information only indirectly, through context-aware neighbors, and the contextual information is not used in the subsequent QoS prediction process. Therefore, the context perception ability of existing methods is limited, which in turn limits further improvement of their prediction accuracy.
Table 1 presents the characteristics of existing QoS prediction methods compared with CADFM. Different from previous works, this paper proposes a context aware deep factorization machine, which not only captures high-order feature combinations using FM but also learns non-linear interactions using MLP. Moreover, a general context embedding framework is designed for integrating beneficial contexts. Experimental results demonstrate the effectiveness of the proposed model, and CADFM obtains much better results in very sparse data scenarios.

III. PROBLEM FORMULATION AND DATA ANALYSIS
In this section, we first formally define the task of Web API QoS prediction. Then we conduct a detailed analysis of real-world QoS data to verify our assumption that users/Web APIs tend to perceive/deliver similar QoS under the same context, which lays a solid foundation for enhancing Web API QoS prediction accuracy by incorporating contextual information.

A. PROBLEM FORMULATION
Generally, Web API providers claim attractive QoS in order to promote their own Web APIs. Given a specific Web API a, its QoS can be expressed as r(a), where a denotes the specific Web API. Such an r(a) model depends on Web API-specific parameters, such as the design and implementation of the Web API, the server status, and the pricing strategy of the provider.
Because the values of static QoS attributes, such as price and security, are relatively stable, the traditional r(a) model is adequate for measuring them. However, r(a) is not suitable for measuring the user-perceived dynamic QoS of Web APIs, such as throughput and response time, because different users may perceive quite different QoS as a result of their different network environments. Hence, previous efforts [15], [20], [23] adopt the model r(u, a) to formulate Web API QoS evaluation, where u and a denote the specific user and Web API, respectively; r(u, a) depends on both user u and Web API a.
Although prior efforts have demonstrated that the r(u, a) model can evaluate user-perceived QoS well, it uses only the explicit user-Web API QoS data to make predictions and neglects implicit, beneficial contextual information. As mentioned before, QoS is context-sensitive: contextual information, such as location and network environment, is likely to influence QoS performance. In this paper, we argue that implicit contextual information can characterize the interactions of users and Web APIs more accurately. Thus, we extend r(u, a) and propose a context aware QoS model r(u, a, c), where c denotes the contexts under which the interaction of the user and the Web API takes place. The r(u, a, c) model indicates that user-perceived QoS depends on user u, Web API a, and context c.
In addition to the limited QoS records of users and Web APIs, the contextual information of users and Web APIs is integrated into the r(u, a, c) model. On the one hand, contexts serve as side information in r(u, a, c), making the QoS data less sparse and thus the predictions more accurate. On the other hand, embedding contextual information in a deep factorization machine enables us to learn the complex interactions of users and Web APIs as accurately as possible. Therefore, with the r(u, a, c) model, the goal of our task is to predict the unobserved QoS values by leveraging the observed user-Web API QoS records and contextual information.
Before introducing the details of the proposed CADFM, the following question must be carefully examined: why is it promising to improve Web API QoS prediction accuracy by integrating contextual information?

B. DATA ANALYSIS
Ideally, we would use large-scale real-world Web API QoS data. Unfortunately, the lack of a real-world Web API QoS dataset impedes the study of Web API QoS prediction methods. However, Web APIs are based on the lightweight RESTful architecture, and REST is an architectural pattern for creating Web APIs that use HTTP as the communication method. In essence, Web services and Web APIs are both HTTP calls invoked by users, so we adopt the widely used real-world QoS data in WS-DREAM [39] to validate the effectiveness of our model. WS-DREAM includes 1,974,675 real-world QoS evaluation results from 339 distributed users on 5,825 Web API services. In this paper, we adopt the widely used response time (RT) QoS dataset for our analysis and later experiments. WS-DREAM provides explicit RT values, but the implicit contextual information of users and Web APIs is missing. Hence, we invoke the ip2location API from www.ip2location.com/ to identify the country and autonomous system (AS) contexts of users and Web APIs by IP address. The statistics of the RT experimental dataset are presented in Table 2.
To answer why contextual information can be expected to enhance Web API QoS prediction performance, we first focus our analysis on users having the same context. Intuitively, users in the same location/AS tend to perceive similar QoS for different Web APIs, and Web APIs in the same location/AS tend to deliver similar QoS to different users. Here we adopt the Pearson correlation coefficient (PCC) to measure the QoS similarity between users or Web APIs. Formally, the PCC of users u and v can be calculated using Eq. (6).
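$$sim(u, v) = \frac{\sum_{a \in A_u \cap A_v} (r_{u,a} - \bar{r}_u)(r_{v,a} - \bar{r}_v)}{\sqrt{\sum_{a \in A_u \cap A_v} (r_{u,a} - \bar{r}_u)^2}\,\sqrt{\sum_{a \in A_u \cap A_v} (r_{v,a} - \bar{r}_v)^2}} \tag{6}$$
where $A_u$ and $A_v$ are the sets of Web APIs that users u and v have accessed, respectively, $r_{u,a}$ is the QoS value of Web API a observed by user u, and $\bar{r}_u$ and $\bar{r}_v$ are the average QoS values of users u and v on the co-invoked Web APIs, respectively. Similarly, the PCC of Web APIs a and b can be calculated using Eq. (7).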
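$$sim(a, b) = \frac{\sum_{u \in U_a \cap U_b} (r_{u,a} - \bar{r}_a)(r_{u,b} - \bar{r}_b)}{\sqrt{\sum_{u \in U_a \cap U_b} (r_{u,a} - \bar{r}_a)^2}\,\sqrt{\sum_{u \in U_a \cap U_b} (r_{u,b} - \bar{r}_b)^2}} \tag{7}$$
where $U_a$ and $U_b$ are the sets of users who have accessed Web APIs a and b, respectively, and $\bar{r}_a$ and $\bar{r}_b$ are the average QoS values observed by the users who have accessed Web APIs a and b, respectively.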
To study the relationship between QoS data and contextual information, we give the following definitions. Let c = {User Country (UCountry), User AS (UAS), Web API Country (ACountry), Web API AS (AAS)} be the contexts studied in this paper. Let $U_{in}$ and $U_{out}$ be the sets of users that have the same context as the target user and a different context from the target user, respectively. Fig. 2 presents a toy example that explains $U_{in}$ and $U_{out}$. With these two sets, ratio(u, $U_x$) is introduced to measure the average proportion of users whose PCC value is larger than a specified threshold, as shown in Eq. (8), where $U_x$ can be the user set $U_{in}$ or $U_{out}$, UC is the set of user country contexts, $PCC_{U_x}$ denotes the users in $U_x$ whose similarity with the target user is larger than the specified threshold, α denotes the PCC threshold, and |·| denotes the size of a set.
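The ratio measure can be illustrated with a short sketch. Because only the verbal definition is available here, the code below is one plausible reading rather than the exact form of Eq. (8): for each target user it computes the share of users in $U_{in}$ (or $U_{out}$) whose PCC with the target exceeds α, and then averages these shares over all target users. The function name `ratio`, the precomputed similarity table `sim`, and the toy contexts are hypothetical.

```python
def ratio(sim, context, alpha, same=True):
    """Average share of users in U_in (same=True) or U_out (same=False)
    whose PCC with the target user is larger than alpha.

    sim     : dict mapping frozenset({u, v}) -> PCC between users u and v
    context : dict mapping user -> context value (e.g. country or AS)
    """
    shares = []
    for u in context:
        # U_in / U_out for target user u, depending on `same`.
        group = [v for v in context
                 if v != u and (context[v] == context[u]) == same]
        if not group:
            continue  # no users to compare against for this target
        hits = sum(1 for v in group if sim[frozenset((u, v))] > alpha)
        shares.append(hits / len(group))
    return sum(shares) / len(shares) if shares else 0.0


# Toy user-country contexts and pairwise PCC values (not from WS-DREAM).
context = {"u1": "CN", "u2": "CN", "u3": "JP", "u4": "JP"}
sim = {
    frozenset(("u1", "u2")): 0.9, frozenset(("u3", "u4")): 0.8,
    frozenset(("u1", "u3")): 0.2, frozenset(("u1", "u4")): 0.7,
    frozenset(("u2", "u3")): 0.1, frozenset(("u2", "u4")): 0.3,
}
print(ratio(sim, context, alpha=0.5, same=True))   # users in the same country
print(ratio(sim, context, alpha=0.5, same=False))  # users in different countries
```

Sweeping `alpha` over a range of thresholds and plotting both curves would give a comparison of the same kind as the one the paper reports for country and AS contexts.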
With Eq. (8), we plot the distribution of users located in the same country, using ratio(u, $U_{in}$), and the distribution of users located in different countries, using ratio(u, $U_{out}$), under a varying PCC threshold, as shown in Fig. 3(a). Likewise, we compare the distributions of users located in the same and in different ASs under a varying PCC threshold to study the relationship between QoS and the user AS context, as shown in Fig. 3(b).
From Fig. 3, we can see that the proportion of users decreases as the PCC threshold increases, showing that the QoS performance of the majority of users differs greatly. This poses a great challenge for accurate Web API QoS prediction. Fortunately, under the same PCC constraint, the proportion of users in the same context is higher than the proportion of users in different contexts. Thus, we can draw the following observation.
Observation 1: There is a positive correlation between QoS data and users' contextual information (including UC and UAS): users having the same contextual information tend to perceive similar QoS, and vice versa.
Similar to the analysis of QoS data and user-side context, we can easily adapt Eq. (8) to Web API data and obtain the proportion of Web APIs with the same context versus a different context under the PCC constraint. The comparison results from the Web API perspective are shown in Fig. 4, where we observe trends similar to those in Fig. 3 and draw Observation 2.

Observation 2: There is a positive correlation between QoS data and Web APIs' contextual information (including AC and AAS): Web APIs having the same contextual information tend to deliver similar QoS, and vice versa.
Observation 1 and Observation 2 together confirm that QoS is context-sensitive, which motivates us to use beneficial contextual information to model the interaction of users and Web APIs accurately and thus make QoS predictions in the data-sparsity case. Our follow-up experiments in Section V-D verify the effectiveness of these four contexts and their combinations in improving the accuracy of QoS prediction. Next, we introduce how to embed these contexts and describe the design of our context aware deep factorization machine model in detail.

IV. FRAMEWORK AND METHODOLOGY
In this section, we describe the overall CADFM framework and the details of our context aware deep factorization machine for Web API QoS prediction.

A. FRAMEWORK
Fig. 5 presents an overview of the CADFM framework for Web API QoS prediction. CADFM is composed of three components: 1) Input preprocessing, which embeds the original feature input into dense embedding vectors that are shared by FM and MLP. 2) Model building, which includes two parts, FM and MLP: FM models the high-order interactions among input features, and MLP captures the non-linear interactions between users and Web APIs. 3) Predicting and learning, which performs regression prediction by fusing the results of FM and MLP; CADFM is trained end to end.
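The three components can be pictured with a forward-pass-only sketch in plain Python. It is a simplified illustration under several assumptions that the text does not specify: the FM part uses the standard second-order form, the MLP has a single ReLU hidden layer, and the two outputs are fused by simple addition. Training is omitted, and the class name `TinyCADFM`, the layer sizes, and the random initialization are hypothetical.

```python
import random


class TinyCADFM:
    """Forward pass only: shared embeddings feed an FM part and an MLP part."""

    def __init__(self, n_features, n_fields, k=4, hidden=8, seed=0):
        rng = random.Random(seed)

        def init():
            return rng.gauss(0.0, 0.1)

        self.w0 = 0.0                                      # global bias
        self.w = [init() for _ in range(n_features)]       # first-order weights
        # One dense k-dimensional embedding per feature index, shared by FM and MLP.
        self.E = [[init() for _ in range(k)] for _ in range(n_features)]
        self.W1 = [[init() for _ in range(n_fields * k)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.W2 = [init() for _ in range(hidden)]
        self.b2 = 0.0

    def fm(self, idx):
        """FM part: bias + first-order terms + pairwise embedding inner products."""
        emb = [self.E[i] for i in idx]
        first = sum(self.w[i] for i in idx)
        second = 0.0
        for d in range(len(emb[0])):
            s = sum(e[d] for e in emb)
            s_sq = sum(e[d] ** 2 for e in emb)
            second += 0.5 * (s * s - s_sq)
        return self.w0 + first + second

    def mlp(self, idx):
        """MLP part: concatenated embeddings -> ReLU hidden layer -> scalar."""
        x = [value for i in idx for value in self.E[i]]
        h = [max(0.0, sum(wj * xj for wj, xj in zip(row, x)) + b)
             for row, b in zip(self.W1, self.b1)]
        return sum(wj * hj for wj, hj in zip(self.W2, h)) + self.b2

    def predict(self, idx):
        """Fuse both parts (assumption: plain summation)."""
        return self.fm(idx) + self.mlp(idx)


# One instance with six fields: UID, AID, UCountry, UAS, ACountry, AAS.
model = TinyCADFM(n_features=6, n_fields=6)
instance = [0, 1, 2, 3, 4, 5]   # feature indices after input preprocessing
y_hat = model.predict(instance)
print(y_hat)
```

The design point this sketch highlights is the shared embedding table `E`: both parts read the same vectors, so context features influence the prediction directly rather than only through neighbor selection.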

B. METHODOLOGY
1) INPUT PREPROCESSING
In this paper, we treat Web API QoS prediction as a multiple-input, single-output regression problem, defined as $x \rightarrow r$, where $x = (x_1, x_2, \ldots, x_d)$ is the input feature set and $r$ is the corresponding target QoS value that serves as the supervision signal for model learning. In a practical Web API QoS prediction task, the input features are often categorical rather than continuous. Typical input values for Web API QoS prediction are presented in Fig.6 (a). Such categorical input values (1, 1, China, AS17, Japan, AS70) cannot be used directly, because many machine learning models, including our CADFM, work on a numeric or real-valued encoding of the input features.
To generate usable numeric values for the input features, we first map each categorical value of an input feature in the training dataset to a unique number using a pre-defined feature-index map. Algorithm 1 traverses all training instances and generates this feature-index map, which gives each feature a unique number. By applying Algorithm 1, we can conveniently transform the categorical training data into feature-index-based training data. For example, Fig.6 (b) is the feature-index map generated by Algorithm 1, and Fig.6 (c) is the transformed training data in numeric format. It is worth noting that this transformation can be done offline without affecting the performance of the QoS prediction model.
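The mapping step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's Algorithm 1, whose exact steps are not reproduced here. The names `build_feature_index`, `transform`, and `FIELDS` are hypothetical, and the second training row is made-up toy data. The sketch assumes that each (field, value) pair receives the next unused integer in order of first appearance.

```python
def build_feature_index(instances, fields):
    """Assign a unique integer to every (field, value) pair seen in training."""
    index = {}
    for row in instances:
        for field, value in zip(fields, row):
            key = (field, value)
            if key not in index:
                index[key] = len(index)  # next unused number
    return index


def transform(instances, fields, index):
    """Replace every categorical value by its feature index."""
    return [[index[(f, v)] for f, v in zip(fields, row)] for row in instances]


# Hypothetical field names; the first row mirrors the example values in the text.
FIELDS = ("UID", "AID", "UCountry", "UAS", "ACountry", "AAS")
train = [
    (1, 1, "China", "AS17", "Japan", "AS70"),
    (1, 2, "China", "AS17", "USA", "AS20"),  # toy row, not from the dataset
]

index = build_feature_index(train, FIELDS)
encoded = transform(train, FIELDS, index)
print(encoded[0])  # [0, 1, 2, 3, 4, 5]
```

Keying the map on the (field, value) pair rather than on the value alone keeps user ID 1 and Web API ID 1 distinct, which matches the requirement that every feature receives a unique number.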
We introduce the feature-index map because it allows us to easily transform input features into the data structures accepted by the deep learning model. For example, one-hot encoding is a widely used technique that encodes each input as a vector of a specified dimension that is zero everywhere except for a single one at the position of its index. Since every feature index is unique and one-hot encoding creates additional features based on the number of unique values in the categorical feature, the mapped feature set x = (UID = 0, AID = 1, UCountry = 2, UAS = 3, ACountry = 4, AAS = 5) can be easily represented as x = (100000..., 010000..., 001000..., 000100..., 000010..., 000001...), which are acceptable inputs for further modeling.
A one-hot encoded vector is high-dimensional and sparse [40]. High-dimensional data often makes the number of parameters very large, increases the computational complexity, and easily leads to over-fitting. Moreover, sparse data can easily cause gradients to vanish, so that parameter learning fails to complete effectively. Therefore, in this paper, we adopt the embedding technique instead of raw one-hot encoding to realize the numerical vectorization of input features.
Embedding is a way to transform discrete variables into continuous vectors, which makes it easier to do machine learning on large inputs such as sparse vectors representing categorical variables. Ideally, an embedding captures some of the semantics of the input by placing semantically similar inputs close together in the embedding space. After embedding, the vectorized data become more suitable for training in a deep neural network, because there is a one-to-one mapping between each input feature and its embedding vector. This mapping is updated continually during back propagation, so it becomes relatively mature after several epochs. Here, we use Eq.(9) to transform the one-hot encoded representation into a dense real-valued representation:
$$e_i = E^{\top} x_i, \quad i = 1, 2, \ldots, d \tag{9}$$
where $d$ denotes the number of input features, $f$ denotes the size of the embedding factor, $x_i$ is the i-th input feature represented by one-hot encoding, and $E \in \mathbb{R}^{d \times f}$ is the embedding table. By applying Eq.(9), the embedding set $(e_1, e_2, \ldots, e_d)$ is used as the shared input of FM and MLP, and a mature and accurate $E$ is expected to be learned for QoS prediction through the high-order interactions in FM and the non-linear interaction in MLP.
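To make the link between one-hot encoding and embedding concrete, the following minimal sketch checks that multiplying a one-hot vector by an embedding table is the same as reading one row of that table. The table size, the random initialization, and the helper names `one_hot` and `matvec_T` are assumptions made for illustration only; they are not taken from the paper.

```python
import random


def one_hot(i, n):
    """Length-n vector that is 1.0 at position i and 0.0 elsewhere."""
    return [1.0 if k == i else 0.0 for k in range(n)]


def matvec_T(E, x):
    """Compute E^T x for a table E with len(x) rows and f columns."""
    f = len(E[0])
    return [sum(E[r][c] * x[r] for r in range(len(E))) for c in range(f)]


random.seed(0)
n, f = 6, 4  # toy sizes: 6 feature indices, embedding size 4
E = [[random.gauss(0.0, 0.01) for _ in range(f)] for _ in range(n)]

idx = 2                                  # e.g. the index assigned to UCountry
e_dense = matvec_T(E, one_hot(idx, n))   # explicit one-hot product
e_lookup = E[idx]                        # plain row lookup
```

In practice only the row lookup is performed, so the sparse one-hot vector never has to be materialized, and the gradient of the loss touches only the rows of the table that were actually looked up.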

2) MODEL BUILDING
a: MODELING HIGH-ORDER INTERACTIONS WITH FM
FM is a general-purpose supervised learning model for regression tasks [41]. Since our Web API QoS prediction is a multi-input, single-output regression task, the goal is to estimate the QoS value $r$ as accurately as possible from the feature set $x = (x_1, x_2, \ldots, x_d)$, expressed as follows:
$$\hat{r}(x) = \sum_{i=1}^{d} w_i x_i \tag{10}$$
where $x_i$ and $w_i$ denote the i-th feature and its corresponding weight.
The linear combination of features and their weights in Eq.(10) can only capture the first-order linear relationship of features, and this linear regression model does not consider the combinatorial interaction between features. However, in a real user-Web API interaction scenario, the features $x_1, x_2, \ldots, x_d$ are not independent of each other, and there may be latent interactions among them. Specifically, users located in the same country tend to perceive similar QoS of Web APIs, as observed in Fig.3 (a), and Web APIs in the same network AS tend to deliver similar QoS to different users, as observed in Fig.4 (b). Therefore, in this paper we consider the influence of each single feature and also capture the interactions between features, adopting the 2-way FM shown in Eq.(11):
$$\hat{r}_{FM}(x) = w_0 + \sum_{i=1}^{d} w_i x_i + \sum_{i=1}^{d} \sum_{j=i+1}^{d} w_{i,j}\, x_i x_j \tag{11}$$
where $w_0$ is the global bias and $w_{i,j}$ models the interaction between features $x_i$ and $x_j$, which is defined in Eq.(12):
$$w_{i,j} = \sum_{k=1}^{f} e_{i,k}\, e_{j,k} \tag{12}$$
where $f$ denotes the number of latent factors, and $e_{i,k}$ and $e_{j,k}$ denote the k-th latent factor of features $i$ and $j$ respectively. Modeling even higher-order interactions might yield better results, but it would certainly increase the complexity of the model, and overly complex models can hardly be used in practical applications. The factorization model defined by Eq.(11) is therefore capable of effectively capturing both the characteristics of each single feature and the combinational interactions between features. Moreover, the interaction between features is calculated by the inner product of the embedding vectors $e_i$ and $e_j$ instead of the original features. In this way, the FM model can better uncover the beneficial combinational interactions and guarantee high-quality parameter estimates of these interactions under data sparsity, which eventually improves the accuracy of Web API QoS prediction.
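As a concrete illustration of the 2-way FM, the sketch below scores one instance whose active features are given by their indices (so every active $x_i$ equals 1, as with one-hot inputs). The function names and toy numbers are hypothetical. The second function uses the well-known identity that the sum of pairwise inner products equals half of (square of the sum minus sum of the squares) per latent factor, which avoids the double loop over feature pairs; it is included here as an assumption about how such a model is typically implemented, not as the authors' code.

```python
def fm_predict(active, w0, w, E):
    """2-way FM: bias + linear terms + inner products over all feature pairs."""
    linear = sum(w[i] for i in active)
    pairwise = 0.0
    for a in range(len(active)):
        for b in range(a + 1, len(active)):
            e_i, e_j = E[active[a]], E[active[b]]
            pairwise += sum(p * q for p, q in zip(e_i, e_j))
    return w0 + linear + pairwise


def fm_predict_fast(active, w0, w, E):
    """Same score, computed per latent factor in time linear in the features."""
    linear = sum(w[i] for i in active)
    pairwise = 0.0
    for k in range(len(E[0])):
        s = sum(E[i][k] for i in active)        # sum of k-th factors
        sq = sum(E[i][k] ** 2 for i in active)  # sum of squared k-th factors
        pairwise += 0.5 * (s * s - sq)
    return w0 + linear + pairwise


# Toy embedding table with three features and two latent factors.
E = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
w = [0.1, 0.1, 0.1]
slow = fm_predict([0, 1, 2], 0.5, w, E)
fast = fm_predict_fast([0, 1, 2], 0.5, w, E)
```

For the toy table the three pairwise inner products are 2, 1, and -1, so both functions return approximately 0.5 + 0.3 + 2 = 2.8.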

b: MODELING NON-LINEAR INTERACTION WITH MLP
One of the biggest advantages of FM is that it models the pair-wise interactions of features, but FM is inherently a linear model and may struggle with complex interaction processes. He et al. pointed out that linear models such as FM and MF have difficulty capturing the complex structure of user and Web API interaction [26]. In order to model the complex and nonlinear interaction between user and Web API, we adopt the multi-layer nonlinear model MLP to mine the deeper interaction between user and Web API. Formally, the multi-layer perceptron is a feed-forward neural network that takes the concatenated vector output by the embedding layer as its input; that is, the input to the first layer of the multi-layer perceptron is:
$$e^{(0)} = \mathrm{con}(e_1, e_2, \ldots, e_d) \tag{13}$$
where $e_i$ is the embedding feature vector, and con is the concatenation operation that joins these embedding feature vectors into the input vector required by the MLP. After $e^{(0)}$ is fed into the fully connected MLP, information is passed between the layers of the MLP by:
$$e^{(l)} = \sigma\left(W_l\, e^{(l-1)} + b_l\right) \tag{14}$$
where $l$ denotes the layer index in the MLP, $W_l$ and $b_l$ are the weight matrix and bias of layer $l$ respectively, and $\sigma(\cdot)$ is an activation function that is used to learn the complex nonlinear relationship between user and Web API. Here we adopt ReLU as the activation function instead of sigmoid or tanh, because ReLU is more efficient for gradient descent and back propagation and helps avoid exploding and vanishing gradients [42]. Since our task is the regression prediction of QoS, the output layer uses the single-value mapping shown in Eq.(15) to obtain the predicted Web API QoS based on MLP interaction learning:
$$\hat{r}_{MLP} = W_o\, e^{(l)} + b_o \tag{15}$$
where $W_o$ and $b_o$ are the weight matrix and bias of the output layer respectively, and $e^{(l)}$ is the output vector of the final MLP layer $l$.
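The forward pass just described (concatenate the embeddings, apply fully connected ReLU layers, then map to a single value) can be sketched as follows. Layer sizes, weights, and the function names `relu`, `affine`, and `mlp_predict` are hypothetical choices for illustration; the paper's actual layer configuration is not assumed here.

```python
def relu(v):
    """Element-wise ReLU."""
    return [max(0.0, x) for x in v]


def affine(W, b, v):
    """Compute W v + b for a weight matrix W given as a list of rows."""
    return [sum(w_k * x_k for w_k, x_k in zip(row, v)) + b_k
            for row, b_k in zip(W, b)]


def mlp_predict(embeddings, hidden_layers, W_o, b_o):
    """Concatenate embeddings, apply ReLU layers, then a linear output unit."""
    e = [x for vec in embeddings for x in vec]  # con(e_1, ..., e_d)
    for W, b in hidden_layers:
        e = relu(affine(W, b, e))
    return sum(w_k * x_k for w_k, x_k in zip(W_o, e)) + b_o


# Toy input: two embedding vectors of size 2, one hidden layer with 3 units.
embeddings = [[1.0, -1.0], [2.0, 0.0]]
hidden = [([[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 1.0]], [0.0, 0.0, -1.0])]
r_mlp = mlp_predict(embeddings, hidden, [0.5, 2.0, 1.0], 0.25)
```

With these toy weights the hidden pre-activations are (1, -1, 1), ReLU zeroes the negative entry, and the output unit returns 0.5 + 0 + 1 + 0.25 = 1.75.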

3) PREDICTING AND LEARNING
a: PREDICTING BY FUSING THE RESULTS OF FM AND MLP
Since a user invoking a Web API is a complex interaction process, it is necessary and important to model both high-order and nonlinear interactions. Considering the advantages of FM in high-order feature interaction and of MLP in nonlinear interaction, we fuse the results of FM and MLP, as shown in Eq.(16), so as to improve the QoS prediction accuracy:
$$\hat{r}_{i,j} = \hat{r}_{FM} + \hat{r}_{MLP} \tag{16}$$
The fusion in Eq.(16) is reasonable because FM and MLP share the same embedding inputs and both try to approach the target value. Moreover, the fused model can be trained through the end-to-end learning introduced in the following section.

b: MODEL LEARNING
To derive the parameters in CADFM, we cast the solution as an optimization problem and account for over-fitting in the optimization process. Accordingly, our task is to minimize the objective function, consisting of a loss function and a regularization term, shown in Eq.(17):
$$\mathcal{L} = \sum_{(i,j) \in R^{+}} \left(\hat{r}_{i,j} - r_{i,j}\right)^2 + \lambda \lVert \Theta \rVert^2 \tag{17}$$
where $R^{+}$ denotes the set of training instances, $\hat{r}_{i,j}$ is the QoS value predicted using Eq.(16), and $r_{i,j}$ is the real target QoS value. $\Theta = \{E, w_i, W_l, b_l\}$ denotes all trainable parameters of our CADFM model. The regularization coefficient $\lambda$ controls the strength of the L2 regularization. In Eq.(17), we choose the squared loss between the predicted value and the real value as the objective because the predicted QoS in the regression is a real value. Regularization terms on the parameters are added to alleviate over-fitting and enhance the generalization ability of the CADFM model.
Our CADFM model is trained in an end-to-end manner with the widely used stochastic gradient descent optimizer, which approaches a local optimum of the objective function. Accordingly, each parameter $\rho$ in $\Theta = \{E, w_i, W_l, b_l\}$ can be updated using the following rule:
$$\rho_{i+1} = \rho_i - \gamma \nabla \rho_i$$
where $i$ denotes the iteration epoch, $\gamma$ is the learning rate that controls how much we adjust the weights of our network with respect to the loss gradient, and $\nabla \rho_i$ is the gradient of parameter $\rho$. The overall workflow of CADFM is shown in Algorithm 2.
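The update rule can be illustrated on a deliberately tiny problem. The sketch below is not Algorithm 2: it fits a single-parameter model $\hat{r} = \theta x$ with a squared loss and an L2 penalty of the form $\lambda \theta^2$, which is an assumed form chosen to match the description of the objective. The names `sgd_step` and `fit` and all numeric values are hypothetical.

```python
def sgd_step(params, grads, lr, lam):
    """One gradient step; the L2 penalty lam * p**2 adds 2 * lam * p to each gradient."""
    return [p - lr * (g + 2.0 * lam * p) for p, g in zip(params, grads)]


def fit(x, r, lr, lam, epochs):
    """Fit r_hat = theta * x to a single (x, r) pair with squared loss."""
    theta = [0.0]
    for _ in range(epochs):
        # d/d(theta) of (theta * x - r)^2
        grad = [2.0 * (theta[0] * x - r) * x]
        theta = sgd_step(theta, grad, lr, lam)
    return theta[0]


theta_plain = fit(x=2.0, r=3.0, lr=0.05, lam=0.0, epochs=100)  # no regularization
theta_reg = fit(x=2.0, r=3.0, lr=0.05, lam=1.0, epochs=100)    # with L2 penalty
```

Without regularization the parameter converges to r / x = 1.5. With the penalty it settles at the smaller value 12 / (8 + 2) = 1.2, which shows how the regularization coefficient pulls parameters toward zero.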
The computation cost of CADFM model learning is dominated by lines 5 to 12 in Algorithm 2. Line 7 implements the embedding of the feature input, and its computation complexity is O(d * f), where d is the number of input features and f is the embedding size. Line 8 fuses the results of FM and MLP, with complexity O(1). Lines 9 and 10 calculate the loss and update the parameters with the input features. Since lines 7 to 10 are executed sequentially, the overall computation complexity for learning CADFM is O(|I| × |R| × (d * f + 1 + 2f)), which

V. EXPERIMENTS
In this section, we first present the experiment setup in detail, including the experimental dataset, accuracy metrics, and parameter settings. Then, we carry out a series of experiments to evaluate the performance of our proposed CADFM using a real-world QoS dataset. These experiments are dedicated to answering the following research questions (RQs):
RQ1: How does CADFM perform compared with the state-of-the-art QoS prediction methods?
RQ2: Can embedding contextual information improve accuracy?
RQ3: Does more contextual information mean higher accuracy?
RQ4: Does more available QoS data mean higher accuracy?
RQ5: How does the learning rate of CADFM affect the prediction accuracy?
RQ6: How does the embedding size of CADFM affect the prediction accuracy and efficiency?

A. EXPERIMENT SETUP
1) EXPERIMENTAL DATASET
To evaluate the performance of our CADFM model, we adopted QoS dataset#1 in WS-DREAM [39], which contains 1,974,675 response-time records of 5,825 services invoked by 339 distributed users via the PlanetLab platform. We obtain the missing contextual information of users and Web APIs by invoking the IP2Location Web API. The statistics of our experimental dataset are presented in Table 2.
As discussed in the Introduction, the number of available Web APIs on the Internet is large, and a single user usually invokes only a small number of them; hence the data available for training our model is limited and the user-Web API interaction matrix is sparse. To make our experiments consistent with practical Web API application scenarios, we apply a hold-out A/B (A% + B% = 100%) scheme on the user-Web API interaction matrix to generate the training and testing datasets. Specifically, matrix density (MD) = A% indicates that we randomly sample A% of the records as the training dataset to train our model and use the remaining B% as the testing dataset to evaluate its performance. Accordingly, we vary the matrix density from 4% to 16% with a step size of 4% to generate our training datasets.
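The hold-out scheme can be sketched as a random split of the known records at a given matrix density. This is a minimal illustration under assumptions: the function name `split_by_density`, the fixed seed, and the toy records are hypothetical, and the paper does not state how the random sampling was implemented.

```python
import random


def split_by_density(records, density, seed=0):
    """Randomly keep a `density` fraction for training; the rest is for testing."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * density)
    return shuffled[:cut], shuffled[cut:]


# Toy (user, api, response_time) records standing in for the QoS matrix entries.
records = [(u, a, 0.1 * (u + a)) for u in range(20) for a in range(50)]
train, test = split_by_density(records, density=0.04)
```

With 1,000 toy records and MD = 4%, the split yields 40 training records and 960 testing records. On the real dataset the same density corresponds to roughly 4% of the 339 × 5,825 = 1,974,675 records.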

2) ACCURACY METRICS
Since our work focuses on enhancing Web API QoS prediction accuracy, the closeness between the predicted value and the real value is our key concern. Hence, we use mean absolute error (MAE) and root mean squared error (RMSE) [43] as the metrics to evaluate the accuracy of our approach in comparison with other methods. MAE is defined as:
$$\mathrm{MAE} = \frac{1}{|R^{-}|} \sum_{(u,a) \in R^{-}} \left| r_{u,a} - \hat{r}_{u,a} \right|$$
where $R^{-}$ denotes the set of testing records, and $\hat{r}_{u,a}$ and $r_{u,a}$ denote the predicted QoS value and the real QoS value of Web API $a$ invoked by user $u$. RMSE is defined as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{|R^{-}|} \sum_{(u,a) \in R^{-}} \left( r_{u,a} - \hat{r}_{u,a} \right)^2}$$
According to these two definitions, MAE and RMSE both vary in $[0, +\infty)$, and smaller values indicate higher prediction accuracy. Since errors are squared before they are averaged, RMSE gives a relatively high weight to large errors and is therefore a useful metric for identifying undesirable large errors.
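Both metrics are straightforward to compute. The short sketch below uses made-up values to show how a single large error affects RMSE more than MAE; the function names and the toy numbers are illustrative assumptions.

```python
import math


def mae(actual, predicted):
    """Mean absolute error over paired real and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error over paired real and predicted values."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Toy response times in seconds; the third prediction is off by 2.0.
actual = [1.0, 2.0, 4.0]
predicted = [1.5, 2.0, 2.0]
toy_mae = mae(actual, predicted)
toy_rmse = rmse(actual, predicted)
```

Here the absolute errors are 0.5, 0, and 2.0, giving an MAE of about 0.83, while the RMSE is about 1.19 because the single large error dominates once it is squared.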

3) PARAMETER SETTINGS
The parameters involved in our CADFM are shown in Table 3, and their optimal values are used in the following experiments. Meanwhile, to make a fair comparison, we use the same parameter settings for the compared methods as for our CADFM.

B. PERFORMANCE COMPARISON (RQ1)
We take thirteen existing classic and state-of-the-art methods as baselines to evaluate our CADFM, which are as follows:
(1) G-Mean is a benchmark mean model, which uses the global average of testing data as the prediction; thus all predictions are the same, regardless of the user and the Web API.
(2) U-Mean is an improved mean model, which generates a prediction by averaging the QoS data of the given user; thus for a given user the predictions are the same, regardless of the Web API.
(3) A-Mean is also an improved mean model, which generates a prediction by averaging the QoS data of the given Web API; thus for a given Web API the predictions are the same, regardless of the user.
(4) U-CF is a benchmark of classic CF, which makes predictions by a weighted sum of the known QoS values of the user's similar neighbors with Eq.(1). In this paper, we adopt the PCC similarity widely used in CF methods to select neighbors and make collaborative predictions [28].
(5) A-CF is a Web API based CF, which makes predictions by a weighted sum of the known QoS values of the Web API's similar neighbors [17].
(6) WSRec is the state-of-the-art CF method, which fuses the results of U-CF and A-CF [15], [16].
(7) MF is a benchmark MF model, which uses a matrix factorization technique to factorize the user-Web API QoS matrix for prediction with Eq. (2) [19].
(8) BiasMF is a variant of the MF method, which considers the bias influence of users and Web APIs [19].
(9) WRAMF is the state-of-the-art CF method, which tackles the wide-range influence in QoS data and uses an active function mapping explicitly for prediction [18].
(10) FM learns feature interactions by factorizing them into the inner product of two vectors. FM has been applied successfully to the Web API QoS prediction task [23], [24].
(11) AFM is a model in which an attention part is added to FM [44].
(12) NFM is an optimized version of FM, in which a deep neural network is appended in series to the end of FM for prediction [34], [35].
(13) MLP is a deep model, in which embedding features are fed into the model to learn the nonlinear interaction between user and Web API and generate predictions [32].
The results of the CADFM model and the compared baselines with varying training datasets are reported in Table 4, and we have the following observations:
(1) Our CADFM approach consistently outperforms the existing baselines in terms of both MAE and RMSE, which signifies the impact of contextual information, high-order interaction, and nonlinear interaction in achieving better QoS prediction.
(2) AFM and NFM perform better than FM, which demonstrates the effectiveness of the attention mechanism and non-linear interaction in improving QoS prediction accuracy. However, compared with the other DL models, CADFM consistently obtains the lowest prediction errors under different matrix densities, which verifies the effectiveness of modeling high-order interaction and nonlinear interaction jointly in an end-to-end way.
(3) For each prediction model, we observe that prediction accuracy increases as the available training dataset grows, because more training data enables predictive models to capture the interaction between users and Web APIs more accurately. This suggests that builders of Web API recommender systems should encourage users to share their observed QoS data for more accurate QoS prediction.
(4) Compared with MLP, our CADFM achieves improvements under all training datasets. CADFM has a 16.68% improvement on MAE and an 8.2% improvement on RMSE when the training data is only 4%, which demonstrates that our CADFM works well under data sparsity. In Section V.E, we investigate the effectiveness of our CADFM in even sparser scenarios.
(5) Last but not least, comparing the Mean, CF, MF, and DL based prediction models, the Mean based methods have the worst accuracy because they rely on the overall average characteristics of QoS data without considering the personalization of users, which motivates researchers to design personalized and accurate prediction models. CF is better than the Mean based methods, but in most cases it is not as good as the MF based methods. This is because CF uses only part of the QoS data for prediction, while MF uses the overall QoS data to learn a prediction model. DL based models, especially those considering the influence of nonlinear interaction, tend to achieve better performance, showing the development direction of QoS prediction methods. The above results also verify the trend of Web API QoS prediction techniques that we summarized in the literature review, as shown in Fig.1.

C. BENEFIT OF CONTEXTUAL INFORMATION (RQ2)
As mentioned before, we argue that contextual information benefits Web API QoS prediction accuracy. Hence, it is necessary to check whether embedding contextual information in a prediction model improves prediction accuracy. Since our CADFM is a deep learning based model, we compare CADFM with four other DL models, FM, NFM, AFM, and MLP, after incorporating the contextual information into them; the resulting models are named FM-C, NFM-C, AFM-C, and MLP-C. The contextual information c includes {UCountry, UAS, ACountry, AAS}, the same as in our CADFM model. Fig.7 shows the impact of incorporating contextual information on accuracy. We observe that FM-C, AFM-C, NFM-C, and MLP-C obtain significantly lower MAE and RMSE than FM, AFM, NFM, and MLP respectively under different matrix densities, illustrating that leveraging the implicit contextual information besides the explicit QoS data indeed improves QoS prediction accuracy greatly. Although the accuracy of the FM, AFM, NFM, and MLP models is enhanced by embedding contexts, our CADFM consistently obtains the best results because it jointly models the high-order interaction and the nonlinear interaction between users and Web APIs.

D. CONTEXTUAL INFORMATION COMPARISON (RQ3)
The contextual information in our CADFM includes {UCountry, UAS, ACountry, AAS}, owing to the positive relationship between QoS and contexts discovered in Observations 1 and 2. To investigate thoroughly how these contexts affect the performance of CADFM, we define the country context as UCountry + ACountry and the AS context as UAS + AAS, and then evaluate the prediction accuracy of CADFM with different contextual information. Fig.8 (a) and Fig.8 (b) present the MAE and RMSE of CADFM with different contexts under different matrix densities. CADFM-None, which does not incorporate any contextual information, obtains the largest prediction error, while CADFM with all four types of contextual information obtains the lowest prediction error under all configurations, which justifies the expressiveness and effectiveness of context embedding in CADFM. Moreover, compared with CADFM-UCountry and CADFM-ACountry, CADFM-Country obtains much better performance in terms of MAE and RMSE. Similar results are observed with the AS context, showing that both user-side contexts and Web API-side contexts help improve predictive accuracy. Accordingly, the individual contexts and their combinations all provide distinct improvements in Web API QoS prediction performance. The results of this experiment suggest finding more beneficial contexts of users and Web APIs to further improve the accuracy of the CADFM model, which is a promising direction.

E. IMPACT OF THE AVAILABLE QoS DATA (RQ4)
To study the effect of data sparsity on the prediction models, and to answer whether more available training data means higher prediction accuracy, we plot the MAE and RMSE results of different prediction models while varying the matrix density from 1% to 20% with a step of 1%.
Firstly, we observe that all prediction models achieve better prediction accuracy as the matrix density increases, which is consistent with the results observed in Table 4. Secondly, CADFM always obtains substantially lower MAE and RMSE when the matrix is very sparse, such as MD = 1%; hence we conclude that our CADFM model can address the data-sparsity problem effectively by embedding the contextual information. Thirdly, the MAE and RMSE values of all prediction models decrease significantly as the matrix density increases, but when the matrix density exceeds 7%, the decreasing trend of the compared models becomes less obvious, while our CADFM model still keeps a significant decreasing trend. This shows, on the one hand, that the performance of a predictive model can be improved by collecting more data when the QoS data is very sparse. On the other hand, it indicates that our CADFM model is robust under different dataset densities.

F. IMPACT OF THE LEARNING RATE (RQ5)
The learning rate is a hyper-parameter of CADFM that controls the step size by which the model parameters are adjusted at each iteration while moving toward a minimum of the loss function. The lower the learning rate, the more slowly we travel along the downward slope; the larger the learning rate, the more likely we are to miss the locally optimal solution. Hence, it is important to select the learning rate carefully for our CADFM model. To find the optimal learning rate γ, we vary γ from 0.005 to 0.05. We observe that increasing the learning rate γ improves the performance when γ varies from 0.005 to 0.035; this is because a learning rate that is too small can cause the process to get stuck, so that CADFM cannot approach the local minimum. However, as γ varies from 0.035 to 0.05, the prediction error increases sharply; a possible explanation is that a larger learning rate causes the CADFM model to converge quickly to a suboptimal solution. Since our CADFM model achieves the best performance when γ is set to 0.035, the default value of γ in our other experiments is 0.035.

G. IMPACT OF THE EMBEDDING SIZE (RQ6)
Since CADFM is an embedding-based model, the embedding size is another crucial hyper-parameter. Intuitively, increasing the embedding size can make our model more expressive, because more latent factors are used to express the interaction of users and Web APIs. On the other hand, a large embedding size f increases the training time of our model, according to the theoretical complexity analysis in Section IV.B. This subsection investigates the impact of the embedding size on prediction accuracy and training time by varying the embedding size f from 5 to 100 with a step of 5.
From Fig.11 (a) and Fig.11 (b), we can see that the prediction accuracy of CADFM increases significantly as the embedding size increases, and that the prediction accuracy fluctuates once the embedding size exceeds a threshold. This is because when the embedding size is small, the performance of our CADFM is limited by the embedding size, and the prediction accuracy can be improved by using more expressive latent factors. However, with a larger value of f, the CADFM model begins to over-fit, causing the fluctuation in MAE and RMSE. Additionally, in Fig.11(c), we observe that the training time is linear with respect to the number of embedding latent factors, which is consistent with the complexity analysis in Section IV.B. Therefore, we choose f = 70 as the optimal embedding size in our CADFM model.

VI. CONCLUSION AND FUTURE WORK
In this paper, we address Web API QoS prediction as a regression problem and propose a context aware deep factorization machine model to enhance QoS prediction accuracy. Specifically, we conducted a detailed data analysis on a real-world QoS dataset and discovered the positive relationship between QoS data and contextual information, verifying our hypothesis that incorporating beneficial contexts, including country and network autonomous system from both the user and the Web API sides, is a promising way to improve QoS predictive accuracy. Then, we developed a context aware deep factorization machine model, in which contextual information is fed into the deep factorization machine through the embedding technique. Moreover, our proposed model adopts a factorization machine to model the high-order interactions and uses a multi-layer perceptron to model the non-linear interaction, so as to capture the original complex interaction between user and Web API accurately. Finally, we conducted extensive experiments to evaluate the performance of our CADFM model using real-world QoS data. Experimental results show that CADFM achieves much better prediction accuracy than its counterparts, especially when the QoS data is very sparse. Hence, it is a promising model for industrial applications to make accurate QoS predictions as well as to increase revenue from QoS aware Web API recommendations.
In the near future, we plan to explore more beneficial contextual information to improve the performance of our model. We are also collecting more real-world Web API data to validate the robustness of our proposed model. Moreover, QoS aware Web API recommendation focuses on high quality Web API selection, and our previous experiments have found that user-developed Web applications tend to invoke multiple different types of Web APIs, so designing a recommendation system that improves the diversity of Web API recommendation results is also a direction worth studying.