A Course Teacher Recommendation Method Based on an Improved Weighted Bipartite Graph and Slope One

The quality of course teaching is directly related to education quality. Many scholars have attempted to identify the associations between course-teaching quality and teachers’ characteristics, such as educational background, degree, professional title, age, years of teaching, job burnout, and academic research. However, because these characteristics are mostly evolvable, research findings are inconsistent. Therefore, we attempted to identify the association among teaching style, which reflects teachers’ stable psychological quality, Technological Pedagogical Content Knowledge (TPACK), and teaching quality. To this end, we first collected data from three different disciplines at a university using the teaching quality, TPACK, and course difficulty questionnaires that we constructed, together with the TSTI scale proposed by Grigorenko and Sternberg. We constructed three matrices with different sparsities as experimental datasets using teachers with the teaching style and TPACK attributes, courses with the course difficulty attribute, and teaching quality. We then constructed a weighted bipartite graph with the teachers and courses in the matrix as nodes and the teaching quality divided by course difficulty as the weights of the edges. We proposed an improved Slope One algorithm based on a weighted bipartite graph to scientifically predict teachers’ teaching quality in untaught courses. Finally, we constructed a TOP-N recommendation model for course teachers that combined teaching style and TPACK features to achieve accurate recommendations for course teachers. The experiments show that our proposed solution is feasible and that the algorithmic model is effective. Therefore, we developed a scientific method to improve the quality of university course teaching.


I. INTRODUCTION
The quality of teaching in university courses determines the quality of the education. We noticed an easily observed but long-unexplained phenomenon in university teaching: the teaching quality of different teachers of the same course differs, and that of different courses with the same teacher also differs. In other words, the quality of teaching in a course depends on whether the teacher and the course are well matched. As course characteristics are relatively fixed, we can conclude that a key factor affecting teaching quality exists among teachers' characteristics.
The associate editor coordinating the review of this manuscript and approving it for publication was Pasquale De Meo.
Which teacher characteristics are associated with teaching quality? Starting from different research perspectives, many scholars [1], [2], [3], [4], [5], [6], [7], [8], [9], [10] have selected one or several characteristics from teachers' educational background, degree, professional title, gender, age, years of teaching, job burnout, academic research, and teaching evaluation, analyzed their association with teaching quality, and drawn corresponding conclusions. Some studies have shown that these characteristics correlate with teaching quality, whereas others have shown that they do not. Careful observation also shows that the teacher characteristics mentioned above, except gender, which is a constant characteristic, are variable or evolvable. Therefore, it is not easy to prove a stable correlation between these variables and teaching quality. However, some studies have found that evolving teacher characteristics do not influence teachers' teaching styles [11]. The teaching style is an individualized and stable approach developed by teachers over time, reflecting their consistent and stable psychological teaching qualities [12]. Therefore, this relatively stable teaching style can be used to establish an implicit correlation with teaching quality. In addition, in 1986, Shulman [13] concluded from a long-term study of good teachers that it was not the mastery of a particular pedagogy that made them good but rather how they taught their students to quickly understand and master knowledge. Thus, Shulman argued that teachers' professionalism is primarily reflected in their ability to transfer knowledge. Shulman identified an intersection between "Content Knowledge (CK)" and "Pedagogical Knowledge (PK)," that is, Pedagogical Content Knowledge (PCK). PCK is the translation of teachers' CK into a representational form that students can easily understand. Thus, an implicit link exists between PCK and teaching quality.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
With the advent of information technology in education, teachers need to be proficient not only in the traditional CK and PK of their disciplines but also in Information Technology Knowledge (TK), and they need to consider how to integrate technology into their discipline to teach better. In this context, the American scholars Koehler and Mishra [14] proposed a new framework based on Shulman's PCK framework that integrates teachers' knowledge of using technology for effective teaching into the structure of teachers' professional knowledge. This framework is called Technological Pedagogical Content Knowledge (TPACK). TPACK is the product of the dynamic integration of CK, TK, and PK; it is highly integrated and complex knowledge about how teachers use technology to teach in specific contexts. TPACK contributes to high-quality teaching and learning.
This suggests that teachers' teaching styles and TPACK are correlated with the teaching quality of a course. Suppose that we can measure teachers' teaching style and TPACK, course difficulty, and course-teaching quality using scientific and effective methods. A course-teacher recommendation model can then be constructed using appropriate machine-learning techniques. By using this recommendation method to find suitable teachers for courses, we can find a new way to improve the overall quality of course teaching in universities. The question is then how to find the right teacher for a course based on the teacher's teaching style and TPACK.
The successful application of recommendation systems in different fields provides a reference for recommending suitable teachers for courses. In recent years, many educational recommendation systems have been proposed, which mainly focus on the following aspects.
The first is course recommendation. For example, Zhu et al. [15] proposed a hybrid recommendation model that integrates network structure features with graph neural networks and user interaction activities with tensor decomposition, which helps students successfully select the required courses from many course resources. Xu et al. [16] proposed an algorithm combining knowledge graphs and collaborative filtering to effectively recommend courses for learners. Zou [17] designed a recommendation technology based on artificial neural networks (ANN), which provided a method for college students to choose innovation and entrepreneurship education courses. Nguyen et al. [18] applied various techniques based on data mining and learning analytics to predict students' learning outcomes in the next semester and developed a recommendation system to help students choose appropriate courses. Gao et al. [19] proposed a personalized course recommendation model based on a convolutional neural network combined with negative sequence pattern mining. Banbhrani et al. [20] used the Taylor-chimp optimization algorithm of stochastic multimodal deep learning to recommend courses. Hao et al. [21] proposed a meta-relational course recommendation model to effectively recommend courses to students with different needs. Zhu [22] proposed an online course recommendation system based on a two-tier attention mechanism to address the lack of accurate recommendations and course selection on online teaching platforms.
The second is teaching resource recommendation. For example, Zhang [23] proposed a method for recommending online sports course resources based on collaborative filtering technology. Min [24] used highly automated data mining (DM) technology to predict users' upcoming actions and recommend specific course resources. Diao et al. [25] proposed a personalized learning resource recommendation framework based on course ontology and learners' cognitive ability. Hui et al. [26] used genetic algorithms in a student-based collaborative filtering algorithm to optimize the interest function, accurately recommend learning resources to students, and meet their learning needs. Zhu et al. [27] proposed a learning object recommendation model based on heterogeneous learning behavior and knowledge graphs.
The third covers other recommendations in education. For example, Chen et al. [28] proposed a simple and effective solution for building a practical teacher recommendation system for one-to-one online courses. Huang et al. [29] proposed a simple but effective method for recommending high-quality and varied student exercises. Chang et al. [30] proposed a "keyword cloud" learning interest/difficulty reminder system based on learners' video-viewing logs and subtitles to promote self-directed learning. Wang [31] proposed a recommendation method based on emotional factors, which considers learners' emotional and psychological factors according to their learning content and accurately reflects their preferences.
Based on an analysis of the existing literature on educational recommender systems, Guruge et al. [32], Urdaneta-Ponte et al. [33], and Jeghal et al. [34] summarized the main approaches currently used: collaborative, content-based, knowledge-based, data mining (DM), hybrid, statistical, and conversational. Among these, hybrid approaches are becoming increasingly popular, followed by DM techniques.
The aforementioned studies, in which scholars have researched appropriate recommendation systems for problems in the education field, have significantly contributed to the promotion of education. However, there are limited research results on systems for recommending suitable teachers for courses [35].
For this reason, a study of methods for recommending suitable teachers for courses is of some value. Therefore, we first scientifically captured teachers' TPACK, course difficulty, and course-teaching quality by creating questionnaires with high reliability and validity. We used the Thinking Styles in Teaching Inventory (TSTI) scale proposed by Grigorenko and Sternberg [36] to capture teachers' teaching styles scientifically. Subsequently, the PersonalRank [37] algorithm based on a weighted bipartite graph was improved to incorporate course-teaching quality and course difficulty into the algorithm. A Slope One [38] algorithm based on a weighted bipartite graph was established to scientifically predict teachers' teaching quality in untaught courses. Finally, a recommendation model incorporating teachers' teaching styles and TPACK was established to effectively implement TOP-N recommendations for course teachers.
Specifically, the main contributions of this paper are as follows:
• We constructed TPACK, course-teaching quality, and course difficulty scale questionnaires. After fully verifying the reliability and validity of the scale questionnaires, we used these scales to collect data on TPACK, course-teaching quality, and the course difficulty of Literature, Engineering, and Pedagogy teachers at a university. We also used the TSTI scale proposed by Grigorenko and Sternberg to collect data on teachers' teaching styles.
• We compared and analyzed the teaching styles of teachers in different disciplines and the characteristics of the mean values of each dimension of TPACK. After standardizing the collected data to eliminate the effect of differing measurement scales, we constructed three matrices with different sparsities as experimental datasets, with teaching style and TPACK as teacher attributes, course difficulty as the course attribute, and teaching quality as the values.
• We proposed a Slope One algorithm based on a weighted bipartite graph. First, we constructed a weighted bipartite graph with teachers and courses in the matrix as two sets of nodes and teaching quality divided by course difficulty as the weights of the edges. In this way, we explicitly related teaching quality to course difficulty. We then used an improved random walk-based PersonalRank algorithm to predict teachers' teaching quality in those untaught courses for which a prediction was possible, thereby reducing the sparsity of the dataset. Finally, we used the Slope One algorithm to further predict teachers' teaching quality in the untaught courses that were still missing in the experimental dataset, which could effectively solve the cold-start problem.
• Considering that teaching style and TPACK correlate with teaching quality, we constructed a new TOP-N recommendation model for course teachers that reasonably combines teaching styles and TPACK. The effectiveness of the proposed algorithm and model is verified through several comparative experiments. In this way, we found a new way to scientifically solve the course teacher recommendation problem.
The remainder of this paper is organized as follows: Section II discusses related work on the Slope One algorithm and the bipartite graph model. Section III describes data collection and preprocessing. Section IV describes the Slope One algorithm based on weighted bipartite graphs. Section V presents and evaluates the experimental results. Finally, Section VI concludes the study and identifies new research directions.

II. RELATED WORKS
This section reviews related work on the Slope One algorithm and the bipartite graph model.

A. SLOPE ONE ALGORITHM
Traditional recommendation systems can be divided into three categories [39]: Content-Based Recommendation (CB) [40], Collaborative Filtering Recommendation (CF) [41], and Hybrid Recommendation (HFR). The CB algorithm uses the target user's interaction history to construct a set of recommended items that are highly correlated with that history. The CF algorithm uses the similarity between different users (or different items) to filter the user-item interaction information and recommend items of interest to the target user. HFR integrates different recommendation techniques into one recommendation system to avoid the defects of relying on a single technique. In traditional recommendation systems, similarity measures include the Euclidean distance, cosine similarity, and Pearson correlation coefficient. Commonly used modeling methods include Matrix Factorization (MF) [42] and Probabilistic Matrix Factorization (PMF) [43]. Traditional recommender systems are simple and easy to operate and can quickly model user-item interaction information. However, they suffer from data sparsity and cold-start problems, cannot handle recommendations with complex relationships, and lack interpretability.
The Slope One algorithm is a CF recommendation algorithm proposed by Lemire et al. [38], which has the advantages of simplicity, efficiency, and high recommendation accuracy, and alleviates the cold-start problem. The Slope One algorithm can be viewed as a linear regression model y = x + b, which uses the rating data of all users to make predictions. The basic idea is to build a rating set R that contains the ratings of all users on items in the training set. We define $u_i$ and $u_j$ as the ratings of user u on items i and j, respectively, in rating set R. $C_{ji}(R)$ denotes the set of users who rated both items i and j in R. The average rating difference $dev_{ji}$ between items j and i is calculated as shown in (1).
The $dev_{ji}$ obtained from (1) is used to predict the rating of user u for item j, pred(u, j), as expressed in (2).
In Equation (2), S(u, j) represents the subset of the items rated by user u that have at least one rating in common with item j, $u_i$ represents user u's rating for item i, and $dev_{ji}$ represents the average difference in ratings between items j and i. Because the Slope One algorithm does not consider similarity relationships between users or items, it runs quickly and can be used for real-time recommendation. For the same reason, however, its recommendation accuracy leaves room for improvement.
Several methods have been proposed to address this issue. Yannam et al. [44] proposed a fusion of clustering prediction and the Slope One algorithm to generate a cluster recommendation model that improves recommendation quality. Ying et al. [45] proposed a Slope One-hybrid CF recommendation framework that integrates FCM clustering. The framework first uses the Slope One algorithm, which integrates FCM clustering, to predict the ratings of items in the matrix that users have not rated. The CF recommendation algorithm then implements recommendations to improve the prediction and recommendation accuracy. Hu et al. [46] proposed a Slope One algorithm with attraction similarity that integrates attraction rating similarity and attraction. Song et al. [47] proposed a Slope One recommendation algorithm that incorporates user clustering and scoring preferences, using an improved K-means++ algorithm to classify users into several categories. The Slope One algorithm then incorporates user-scoring preferences to predict item scores, thereby improving recommendations. Saeed et al. [48] proposed a weighted Slope One-based algorithm that introduces virtual prediction items into a relatively sparse rating database; it solves the data sparsity problem without additional information, improves the recommendation accuracy, and is scalable. Sun et al. [49] proposed a CF algorithm that combines an uncertain CF model with the activity of neighboring items. The model dynamically selects the neighbors of each item based on item similarity and activity, which improves recommendation quality and prediction accuracy.
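As a minimal sketch of the two Slope One steps described above, assuming the usual form in which (1) averages the rating differences over the co-rating users in $C_{ji}(R)$ and (2) averages $dev_{ji} + u_i$ over the items in S(u, j), the prediction could be written as follows. The function names, the nested-dictionary layout, and the toy ratings are illustrative assumptions rather than part of the original description.

```python
from collections import defaultdict


def slope_one_deviations(ratings):
    """Return dev[j][i], the mean of (u_j - u_i) over users who rated both j and i.

    `ratings` maps user -> {item: rating}; this layout is an assumption.
    """
    diff_sum = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for j, r_j in user_ratings.items():
            for i, r_i in user_ratings.items():
                if i != j:
                    diff_sum[j][i] += r_j - r_i
                    count[j][i] += 1
    return {
        j: {i: diff_sum[j][i] / count[j][i] for i in diff_sum[j]}
        for j in diff_sum
    }


def predict(ratings, dev, user, item):
    """Return pred(u, j): the mean of dev[j][i] + u_i over items i in S(u, j).

    Returns None when the user has no rated item that shares a rater with `item`.
    """
    user_ratings = ratings[user]
    shared = [i for i in user_ratings if i != item and i in dev.get(item, {})]
    if not shared:
        return None
    return sum(dev[item][i] + user_ratings[i] for i in shared) / len(shared)


# Toy data (hypothetical): three users, three items.
toy_ratings = {
    "u1": {"a": 5, "b": 3, "c": 2},
    "u2": {"a": 3, "b": 4},
    "u3": {"b": 2, "c": 5},
}
toy_dev = slope_one_deviations(toy_ratings)
print(predict(toy_ratings, toy_dev, "u2", "c"))  # 2.5
```

Here dev for (c, a) is -3 from the single co-rating user u1, and dev for (c, b) is the mean of -1 and 3, so the prediction for u2 on item c is ((-3 + 3) + (1 + 4)) / 2 = 2.5. No similarity computation is involved, which is why the method is fast.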
Conventional recommendation algorithms perform poorly when datasets are highly sparse. CF based on the cosine similarity function must identify users who like the same items, calculate interest similarity accordingly, and generate a recommendation list. However, insufficient data makes it difficult for this algorithm to find similar users, which is one of the main problems faced by recommendation systems [50].

B. BIPARTITE GRAPH MODEL
Web analysis algorithms are an important means of addressing this problem on large-scale sparse data. For example, the PageRank algorithm [51] calculates the importance of web nodes. However, PageRank does not distinguish between the types of nodes; therefore, it is difficult to improve recommendation performance using the PageRank algorithm directly. PersonalRank, an improved algorithm based on PageRank, is a bipartite graph algorithm. A bipartite graph is a graph model that consists of two sets of nodes with different properties, in which nodes within the same set are not connected to each other. A bipartite graph can be defined as a network structure G = <U, I, E>, where U represents the set of users, I represents the set of items, and E represents the edges of the bipartite graph model. A schematic representation is shown in Fig. 1.
The bipartite graph algorithm first divides the network into an item-user structure with no direct connections among items or among users. A global similarity is then calculated, which differs from the local similarity of the cosine function and can better handle the data sparsity problem.
The primary concept of the PersonalRank algorithm is a random walk with restart governed by a probability α that is set before the algorithm runs. Suppose that a recommendation is to be made for user A. The PersonalRank algorithm starts at user node A. On arriving at a node, it continues the walk with probability α, or stops with probability (1-α) and restarts the walk from node A. If it continues, the algorithm selects the next node uniformly at random from the nodes pointed to by the current node. After many random walks, the probability of each node being visited converges to a stable value. This stable value represents the relevance of user node A to the item nodes (including the item nodes that are not connected to A by edges).
The item nodes connected by the red dashed lines in Fig. 1 are nodes predicted to be relevant to user A. Finally, the weights of the nodes in the recommendation list represent their access probabilities.
The iterative formula of the PersonalRank algorithm is shown in (3).
where PR(j) is the access probability of node j, out(i) is the out-degree of node i, α is the probability of continuing the random walk, in(i) is the in-degree of node i, and u represents the target user. The PersonalRank algorithm enables personalized recommendations on sparse datasets and has been widely used in different fields. For example, Hu et al. [52] proposed a hybrid recommendation algorithm based on the Latent Factor Model (LFM) and the PersonalRank algorithm to improve the accuracy of TOP-N recommendations on sparse social network datasets. Tian et al. [53] proposed a weighted PersonalRank algorithm to recommend activities of interest to a specific volunteer on a sparse dataset. Yang et al. [54] proposed a keyword-based scholar recommendation framework that constructs a bipartite graph by extracting keywords from abstracts and uses a bipartite graph-based PersonalRank algorithm to rank scholars on a sparse dataset. It effectively recommends scholars who match users' interests to help them advance their research. Bai et al. [55] proposed a recommendation algorithm based on a bipartite graph and PersonalRank that abstracts the relationship between customers with loyalty attributes and products into a network topology. They verified that customer loyalty can improve product recommendation accuracy in random walks between nodes in a network. To realize multimedia recommendation, Li et al. [56] proposed adding user labels to the model to address the cold-start and data sparsity problems of CF recommendation, and used the random walk-based PersonalRank algorithm to calculate the weight coefficients of user labels. The probabilistic graph multimedia recommendation algorithm is then improved by dimensionality reduction and clustering. This significantly improves the accuracy and recall of the multimedia recommendations.
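The random walk with restart described above can be sketched as a simple power iteration on an unweighted bipartite graph. This is an illustrative version under stated assumptions: the adjacency-list layout, the node names, the setting alpha = 0.85, and the fixed iteration count are all hypothetical choices, not values reported in the text.

```python
def personal_rank(graph, root, alpha=0.85, iterations=100):
    """Approximate the visit probability of every node for walks restarting at `root`.

    `graph` maps each node to the list of its neighbours; every undirected
    edge of the bipartite graph is stored in both directions.
    """
    rank = {node: 0.0 for node in graph}
    rank[root] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, neighbours in graph.items():
            if not neighbours:
                continue
            # Continue the walk with probability alpha, choosing a neighbour uniformly.
            share = alpha * rank[node] / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        # With probability (1 - alpha) the walk restarts from the root node.
        nxt[root] += 1.0 - alpha
        rank = nxt
    return rank


def recommend(graph, rank, user, items, n=2):
    """Return the n highest-ranked items that `user` is not already linked to."""
    seen = set(graph[user])
    candidates = [item for item in items if item not in seen]
    return sorted(candidates, key=lambda item: rank[item], reverse=True)[:n]


# Toy bipartite graph (hypothetical): users A, B, C and items a, b, c, d.
toy_graph = {
    "A": ["a", "c"], "B": ["a", "b"], "C": ["b", "d"],
    "a": ["A", "B"], "b": ["B", "C"], "c": ["A"], "d": ["C"],
}
toy_rank = personal_rank(toy_graph, "A")
print(recommend(toy_graph, toy_rank, "A", ["a", "b", "c", "d"]))
```

In this toy graph, item b is reachable from A through the shared item a and user B, whereas item d is only reachable through a longer path, so b receives the higher visit probability even though neither item is directly connected to A.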

III. DATA ACQUISITION AND VALIDITY VERIFICATION
This section introduces the methods for obtaining, validating, and preprocessing data on teachers' teaching styles, TPACK, teaching quality, and course difficulty.

A. ACQUISITION OF TEACHING STYLE DATA
There are many classifications of teaching style. American psychologist Sternberg, who has studied teaching styles for many years, proposed an innovative theory of cognitive styles: the mental self-government theory [57], [58].
According to this theory, Grigorenko and Sternberg divided teaching styles into seven categories along the dimension of cognitive style: Legislative (teachers with this style like to create and propose rules, teach in their own way, and like and encourage students to solve problems creatively), Executive (teachers with this style like to follow established rules and procedures to solve problems and like to teach according to preplanned activities), Judicial (teachers with this style like to judge and evaluate facts, procedures, and rules and like to analyze or evaluate tasks during teaching activities), Global (teachers with this style like to face global, abstract problems and prefer general, conceptual teaching tasks), Local (teachers with this style like detailed, concrete teaching tasks and are able to think deeply when completing their work), Liberal (teachers with this style prefer to go beyond the existing rules and procedures and do not like repeating the same tasks), and Conservative (teachers with this style prefer familiar tasks, teaching situations, and traditional teaching methods).
The main reasons for using Sternberg's classification system for teachers' teaching styles in this study are as follows: (1) Sternberg's classification theory is relatively complete and is one of the most widely used and recognized among researchers. (2) Sternberg provided a set of evaluation scales called the TSTI to measure teachers' teaching styles in teaching situations.
To demonstrate whether teaching style was associated with teaching quality, we selected teachers from three different disciplines (Literature, Engineering, and Pedagogy) at a university as the target of data collection. We used the TSTI to collect 45 valid test data points through a web-based questionnaire. There were 10 teachers in Literature, 19 in Engineering, and 16 in Pedagogy.
The TSTI scale has seven subscales of seven items each, for a total of 49 items, and each item is rated on seven levels (on a scale of 1 to 7, from very unsuitable to very suitable). The items are presented in mixed order and evaluate the seven teaching styles: Legislative, Executive, Judicial, Global, Local, Liberal, and Conservative. These styles are divided into three dimensions: functional, level, and tendency.
After statistical analysis of the questionnaire results, the Cronbach's α coefficients of the Literature, Engineering, and Pedagogy scales were 0.89, 0.92, and 0.81, respectively, indicating that the measurement results were highly reliable.
To eliminate the dimensional influence of the variables, we used the zero-mean normalization method shown in Equation (4) to standardize the collected teaching style data. We then obtained a standardized teaching-style dataset.
In (4), $ts_{i,j}$ is the standardized value of the jth (1 ≤ j ≤ 7) teaching style of teacher i (1 ≤ i ≤ N, where N is the number of teachers in the dataset), µ is the mean value of all teaching styles, and δ is the standard deviation, which is given by (5).
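A minimal sketch of the zero-mean normalization step, assuming the usual z-score form in which each raw value has the mean subtracted and is divided by the standard deviation. Two choices here are assumptions: the mean and standard deviation are computed per teaching-style column (the text could also be read as pooling all values), and the population standard deviation (divisor N) is used. The function name and the toy scores are hypothetical.

```python
from statistics import mean, pstdev


def zero_mean_normalize(scores):
    """Standardize each column of a teacher-by-style score table.

    `scores` is a list of rows, one row per teacher and one column per
    teaching style. Columns with zero spread are mapped to 0.0.
    """
    columns = list(zip(*scores))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) for col in columns]  # assumption: divisor N, not N - 1
    return [
        [(x - mu) / sigma if sigma else 0.0
         for x, mu, sigma in zip(row, mus, sigmas)]
        for row in scores
    ]


# Toy raw scores (hypothetical): three teachers, two styles on different scales.
toy_scores = [[1, 10], [2, 20], [3, 30]]
toy_standardized = zero_mean_normalize(toy_scores)
```

After this step every column has mean 0 and standard deviation 1, so styles measured on different numeric ranges contribute on a comparable scale.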

1) ACQUISITION OF TPACK DATA
This study used the TPACK structural framework proposed by Mishra and Koehler as the theoretical basis for measuring teachers' TPACK levels in various disciplines through a questionnaire survey. The questionnaire was based on the preservice teachers' TPACK measurement tool jointly developed by Schmidt and Baran [59] at Iowa State University. This scale expresses the same question in four ways, for mathematics, social studies, science, and literacy teachers. Based on this scale, we made appropriate adaptations to measure teachers' TPACK in Literature, Engineering, and Pedagogy.
The revised questionnaire included a total of 31 questions across the seven dimensions of TPACK: six questions for TK, three for CK, five for PK, five for PCK, four for Technological Pedagogical Knowledge (TPK), three for Technological Content Knowledge (TCK), and five for TPACK. A five-point Likert scale was used to answer these questions. Each question offered the options "strongly disagree," "disagree," "uncertain," "agree," and "strongly agree." For data analysis, these five options were scored from 1 to 5, from low to high.
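The scoring scheme above can be sketched as follows. The question counts per dimension come from the text; everything else is an assumption made for illustration: that the 31 coded responses arrive grouped by dimension in the order listed, and that a dimension score is the mean of its item codes (the text does not say whether means or sums were used).

```python
from statistics import mean

# Question counts per dimension as listed in the text (6+3+5+5+4+3+5 = 31).
DIMENSION_SIZES = {
    "TK": 6, "CK": 3, "PK": 5, "PCK": 5, "TPK": 4, "TCK": 3, "TPACK": 5,
}


def dimension_scores(responses):
    """Average the 1-5 item codes within each TPACK dimension.

    Assumes `responses` holds the 31 item codes grouped by dimension in the
    order of DIMENSION_SIZES; the real questionnaire ordering may differ.
    """
    if len(responses) != sum(DIMENSION_SIZES.values()):
        raise ValueError("expected 31 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be coded 1 to 5")
    scores, start = {}, 0
    for name, size in DIMENSION_SIZES.items():
        scores[name] = mean(responses[start:start + size])
        start += size
    return scores


# Toy respondent (hypothetical): constant answers within each dimension.
toy_responses = [5] * 6 + [4] * 3 + [3] * 5 + [2] * 5 + [1] * 4 + [3] * 3 + [5] * 5
toy_dimension_scores = dimension_scores(toy_responses)
```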
Reliability and validity tests were conducted before the formal questionnaire survey to ensure the scientific validity and reliability of the questionnaire. For this purpose, questionnaires for the three disciplines were distributed to teachers in the corresponding disciplines for trial testing. A valid sample of 21 responses was returned for the Engineering questionnaire, 19 for the Literature questionnaire, and 22 for the Pedagogy questionnaire. First, we used structural validity to measure the consistency between the questionnaire and TPACK theory, that is, whether the designed questionnaire could measure teachers' TPACK levels. A factor analysis of the three TPACK scales was conducted using SPSSPRO to verify the correspondence between the factors and questions, explore the internal logical structure between the questions, and assess the structural validity of the questionnaire. The results are shown in Table 1.
As shown in Table 1, the overall KMO values for the three scales were 0.802, 0.799, and 0.804, and the KMO values for each component ranged from 0.759 to 0.825, all of which were greater than 0.6. The p-values in the Bartlett test of sphericity were all 0.000, which is much less than 0.05. These results indicate that the variables are correlated and that the data are suitable for Exploratory Factor Analysis (EFA).
Next, we used the factor analysis function in the dimension reduction analysis of SPSSPRO to determine each question's factor loading and communality in the expected dimension. The results are presented in Table 2.
We can see from Table 2 that the range of communality for each question in the questionnaire is 0.594-0.826, which is greater than 0.45, and the range of the factor loading coefficient is 0.617-0.833, which is greater than 0.5. The results indicate good correspondence between the questions and dimensions, which aligns with professional expectations.
Finally, reliability analysis of the questionnaire was conducted. Reliability reflects the internal consistency of a scale. In this study, Cronbach's α coefficient in the reliability analysis was used to verify consistency among the questions in the questionnaire. The results are presented in Table 3.
As shown in Table 3, Cronbach's α coefficients for each dimension in the three scales ranged from 0.745 to 0.897, all of which were greater than the commonly accepted threshold of 0.7, indicating good scoring consistency among the questions within each dimension. Cronbach's α coefficients for the three scales were 0.964, 0.907, and 0.904, respectively, all of which were greater than 0.9, indicating good internal consistency of the scales.
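For reference, Cronbach's α for k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the respondents' total scores. The sketch below computes it from a respondent-by-item table; the function name and the toy responses are hypothetical, and the reported coefficients in this study were obtained with SPSSPRO, not with this code.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Compute Cronbach's alpha for a table with one row per respondent
    and one column per questionnaire item (at least two of each).
    """
    k = len(item_scores[0])
    item_variances = [variance(col) for col in zip(*item_scores)]
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy responses (hypothetical): three respondents, two items.
toy_items = [[1, 2], [2, 4], [3, 3]]
toy_alpha = cronbach_alpha(toy_items)
```

In the toy table each item has variance 1 and the total scores (3, 6, 6) have variance 3, so α = 2 × (1 - 2/3) = 2/3, which would fall below the 0.7 threshold used above.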
In summary, the questionnaire used in this study has good reliability and structural validity.
Once the scale passed the reliability and validity tests, we formally administered the questionnaire in a targeted manner to 123 teachers from the three disciplines and obtained valid TPACK dimensional data. The z-score normalization method was then used to standardize the data and obtain a standardized TPACK dataset.

B. ACQUISITION OF COURSE TEACHING QUALITY DATA
1) CONSTRUCTION OF THE TEACHING QUALITY SCALE
We developed a teaching quality scale based on a standardized scale development procedure to construct a scientific and reasonable evaluation system for teachers' teaching quality. The specific process is as follows.
i) Constructing initial scale items
To summarize the content of teachers' teaching quality, we used a semi-structured interview method to construct initial measurement questions, drawing on existing publications on the connotations and influencing factors of teaching quality. We selected 40 students from different grades in three disciplines to conduct interviews on teachers' teaching quality. By organizing and classifying the interview data, we outlined the connotations of teachers' teaching quality and finally determined an initial scale containing 20 items.
ii) Testing the content validity of measurement items
The initial scale must be tested for content validity to ensure that the measurement items are consistent with the conceptual content. We first invited three experts and five students in the field of pedagogy to evaluate the extent to which the 20 initial measurement items matched the conceptualization of teachers' teaching quality. After discussions between the experts and students, consensus was reached to retain 16 question items in six dimensions. Opinions were then sought from some students and teachers (20 in total) to provide comments and suggestions on the measurement items and to ensure that the statements were clear and concise. After combining the opinions of students and teachers, we repeatedly deliberated and modified each question item of the questionnaire and settled on 12 measurement items in five dimensions (as shown in Table 4). Before distributing the official questionnaire, a pretest was conducted. A total of 150 questionnaires were distributed, 120 of which were returned. The data showed that the questionnaire reflected the content to be measured well and was suitable as a testing tool for the formal study.
iii) Exploratory factor analysis (EFA) As initial measurement items are often inconsistent with the conceptual content, EFA was used to eliminate such items, streamline the scale, and determine its dimensions. The 120 valid questionnaires recovered from the pretest were subjected to KMO and Bartlett's sphericity tests using SPSSPRO to determine whether the requirements for conducting EFA were met. The test results are presented in Table 5. The KMO value of the scale was 0.952 > 0.6, which indicated a correlation between the question variables and met the requirements of factor analysis. Bartlett's sphericity test yielded a p-value of 0.000, which is less than 0.01 and therefore significant, indicating that the scale could be subjected to EFA.
Thus, we conducted an EFA on this scale and the results are presented in Table 6.
In the EFA process, after applying principal component analysis and orthogonal rotation, we found that the cumulative variance explained reached 95.225%, indicating that it is reasonable to group the 12 question items into five dimensions. The communality of every question item was greater than 0.9, which indicates a strong correlation between the question items and dimensions and that the dimensions can effectively extract information. The maximum loading values of all the question items on the corresponding dimensions were greater than 0.5. There were no cross-loaded items, indicating that the correspondence between the question items and the dimensions was reasonable.
iv) Confirmatory factor analysis (CFA) EFA is only a preliminary exploration of the degree of streamlining and the structural dimensions of the scale items, and the resulting scale structure may be unstable. The overall goodness of fit of the factor structure therefore needs to be validated using CFA. We used SPSSPRO for CFA, and the number of questionnaires in this sample was 349. The results of this analysis are listed in Table 7.
As shown in Table 7, Cronbach's α coefficient of the scale was 0.987, which was greater than 0.9, indicating that the scale had high reliability. In addition, the Construct Reliability (CR) of all five dimensions was greater than 0.7, indicating good construct reliability. The average variance extracted (AVE) was greater than 0.5, indicating high convergent validity of the data. The standardized loading coefficients for all question items were greater than 0.5, and the p-values were 0.000, which is less than 0.01, indicating that the measured variables met dimensionality requirements. The square roots of the AVE values in the analysis results were greater than the correlation coefficients of each variable with the other variables, indicating a high discriminant validity among the dimensions.
In summary, all the scale indicators were highly significant in the factor loadings of their respective measurement items. The model fit of the data satisfied these criteria. Therefore, it was inferred that the scale had high reliability and validity, ensuring the applicability of the Teacher Teaching Quality Scale to teaching evaluation. This scale can be considered to be applicable to student evaluations of teaching quality.

2) COLLECTION AND PROCESSING OF TEACHING QUALITY DATA
Once the scale had passed the reliability and validity tests, we added four category-defining questions to the front of the scale (discipline, major, course name, and teacher's name) to identify which course and which teacher each questionnaire referred to. After this revision, we formally distributed an online questionnaire to students from each of the three disciplines. A total of 14,476 valid questionnaires were collected, including questionnaires from different students for the same course taught by the same teacher. Thus, we first used Equation (6) to obtain the mean teaching quality of each teacher in each course.
where ctq denotes the course teaching quality, and m is the number of teaching quality questionnaires administered to the same teacher for the same course. The 14,476 original questionnaires were collated and summarized using equation (6) to obtain 349 valid teaching quality data for 123 teachers in three disciplines.
The z-score normalization method was then used to implement the data standardization process to obtain a standardized course teacher teaching quality dataset.
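The aggregation and standardization steps described above can be illustrated with a minimal Python sketch. It assumes that Equation (6) is a plain arithmetic mean over the m questionnaires of one teacher-course pair and that the z-score uses the population standard deviation; the input layout and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def aggregate_teaching_quality(responses):
    """Average questionnaire scores per (teacher, course) pair.

    `responses` is an iterable of (teacher, course, score) tuples, one per
    returned questionnaire (hypothetical layout).
    """
    grouped = defaultdict(list)
    for teacher, course, score in responses:
        grouped[(teacher, course)].append(score)
    return {pair: mean(scores) for pair, scores in grouped.items()}


def z_score(values):
    """Standardize a dict of values to zero mean and unit variance.

    Assumption: population standard deviation; the text does not say
    whether the population or sample form was used.
    """
    vals = list(values.values())
    mu = mean(vals)
    sigma = pstdev(vals)
    if sigma == 0:
        return {key: 0.0 for key in values}
    return {key: (val - mu) / sigma for key, val in values.items()}


# Toy data: two students rated teacher t1 in course c1.
responses = [
    ("t1", "c1", 4.0),
    ("t1", "c1", 5.0),
    ("t2", "c1", 3.0),
    ("t2", "c2", 2.0),
]
ctq = aggregate_teaching_quality(responses)
ctq_std = z_score(ctq)
```

In this toy input the four questionnaires collapse into three teacher-course entries, mirroring how the 14,476 questionnaires were reduced to 349 entries.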

C. ACQUISITION OF COURSE DIFFICULTY DATA
Course difficulty is an important issue in current educational theory and practice. However, measuring it is an abstract, complex, and challenging task. Thus far, there has been no consensus on the definition of course difficulty, and there are differences in the understanding and construction of course difficulty models. From the models of course difficulty given by Shi Ningzhong, Bao Jiansheng, Zhong Kouzhuang, and Guo Min [60], [61], [62], [63], it can be seen that each model measures the static difficulty of the course, which is, in essence, the difficulty of knowledge. However, learner perception is not negligible in course teaching, and its omission is a deficiency of these models. Moreover, the analytical workload of applying such models to the many courses offered at a university is prohibitive.
Thus, in this study, we used questionnaires to obtain the difficulty level of courses in three different disciplines; that is, questionnaires were distributed to teachers and students by discipline. The questionnaire for each major included all courses in that major. Each course was rated on a five-point Likert scale as "very easy," "easy," "average," "hard," or "very hard," corresponding to scores of 1-5. We then used (7) to calculate the difficulty coefficients of the courses.
where cdf_i denotes the difficulty coefficient of course i, and n and m are the numbers of teacher and student questionnaires, respectively. λ (0 < λ < 1) is the weight of the teacher questionnaires in the course difficulty. cd_{i,k} denotes the difficulty given by the kth teacher of course i, and cd_{i,r} denotes the difficulty given by the rth student of course i. We began with an inventory of the courses in each of the three disciplines and found 271 courses: 95 professional courses in Literature, 82 in Engineering, and 94 in Pedagogy. We then distributed targeted online questionnaires and collected 581 questionnaires from Literature, 437 from Engineering, and 628 from Pedagogy. We took λ = 0.6 and applied (7) to obtain the difficulty coefficient of each course. Finally, zero-mean normalization was used to standardize the collected difficulty coefficients.
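As a worked illustration of the difficulty coefficient, the following Python sketch assumes that (7) is a λ-weighted combination of the mean teacher rating and the mean student rating, with weight 1 − λ on the student side. This form is inferred from the symbol definitions above and is an assumption; the function name is hypothetical.

```python
def difficulty_coefficient(teacher_scores, student_scores, lam=0.6):
    """Combine teacher and student Likert ratings (1-5) for one course.

    Assumed form: lam * mean(teacher ratings) + (1 - lam) * mean(student ratings).
    """
    if not 0 < lam < 1:
        raise ValueError("lam must lie strictly between 0 and 1")
    if not teacher_scores or not student_scores:
        raise ValueError("both rating lists must be non-empty")
    teacher_mean = sum(teacher_scores) / len(teacher_scores)
    student_mean = sum(student_scores) / len(student_scores)
    return lam * teacher_mean + (1 - lam) * student_mean


# Two teachers rate the course 4 and 5; four students rate it 3, 3, 4, 2.
# Teacher mean is 4.5 and student mean is 3.0, so the result is close to
# 0.6 * 4.5 + 0.4 * 3.0 = 3.9.
cdf = difficulty_coefficient([4, 5], [3, 3, 4, 2])
```

With λ = 0.6, teacher judgments weigh more than student perceptions, but both contribute to the coefficient.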

IV. COURSE TEACHER RECOMMENDATION ALGORITHM
This section presents a course teacher recommendation algorithm based on weighted bipartite graphs and Slope One, which combines teacher and course features.

A. OVERVIEW
We built a teacher-course matrix based on the collected data to implement course teacher recommendations. The non-zero values in the matrix represent teachers' teaching quality. Owing to workload limits, each teacher can teach only a few of the many professional courses. This inevitably results in a matrix with many zero values, that is, a highly sparse dataset. Numerous studies have demonstrated that it is difficult to achieve desirable results by using a single recommendation algorithm on highly sparse datasets. In addition, both teachers and courses have associated feature data. If these features are not reasonably incorporated into the recommendation algorithm, it is difficult to guarantee the reliability of its recommendation results.
To this end, we used a cascade approach to design a hybrid recommendation algorithm that combines teacher and course characteristics. First, we used the dataset to construct an improved weighted bipartite graph model that combines course-teaching quality with course difficulty. Based on this, we predicted the teaching quality of some untaught courses by improving the PersonalRank algorithm to reduce the sparsity of the dataset. We then used the Slope One algorithm to predict the teaching quality of all courses for each teacher, which addresses the cold-start problem of newly introduced teachers or newly offered courses. Next, we ranked the predicted values of teaching quality and selected the top N teachers with high teaching quality as the candidate teachers. Finally, we constructed a recommendation model for course teachers that combines teaching styles and TPACK characteristics, and used the model to re-rank the top N teachers to obtain the final TOP-N recommendations.
FIGURE 2. Improved weighted bipartite graph model. The black edges represent the associations between teachers and courses. w_tc represents the weight between teacher t and course c, ts and Tpack are the teacher's teaching style and TPACK attributes, respectively, and cdf is the course difficulty coefficient attribute.

B. IMPROVED WEIGHTED BIPARTITE GRAPH MODEL
Because the edges of an ordinary bipartite graph are only zero or one, no weights are considered, and the weights of the teacher and course nodes are distributed equally. This is not conducive to course teacher recommendation, and high recommendation accuracy is difficult to achieve. Therefore, we designed a weighted bipartite graph model for course teacher recommendations. The model uses teachers with teaching style and TPACK features as dataset T, and courses with difficulty coefficient features as dataset C. An edge is added between each teacher and each course that the teacher has taught, and all edges form the edge set E. The weight w_tc of an edge is the product of the teaching quality and the inverse of the course difficulty coefficient, that is, w_tc = tcq/cdf. An example of this model is shown in Fig. 2.
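The graph construction can be sketched in Python as follows. The adjacency-dictionary layout, the node tags "T" and "C", and the function name are illustrative assumptions. The example uses positive, unstandardized values; how zero or negative standardized values are handled is not specified in the text.

```python
def build_weighted_bipartite_graph(quality, difficulty):
    """Build an undirected teacher-course graph with weights tcq / cdf.

    quality:    {(teacher, course): teaching quality tcq}
    difficulty: {course: difficulty coefficient cdf}
    Nodes are tagged ("T", name) or ("C", name) so that a teacher and a
    course sharing a name cannot collide (a choice made for this sketch).
    """
    graph = {}
    for (teacher, course), tcq in quality.items():
        cdf = difficulty[course]
        if cdf == 0:
            raise ValueError(f"difficulty of {course!r} is zero")
        weight = tcq / cdf
        graph.setdefault(("T", teacher), {})[("C", course)] = weight
        graph.setdefault(("C", course), {})[("T", teacher)] = weight
    return graph


quality = {("t1", "c1"): 4.5, ("t1", "c2"): 3.0, ("t2", "c1"): 4.0}
difficulty = {"c1": 3.0, "c2": 2.0}
graph = build_weighted_bipartite_graph(quality, difficulty)
```

Dividing by the difficulty coefficient means that the same teaching quality yields a smaller edge weight for a course with a larger coefficient.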
The corresponding iterations in the PersonalRank algorithm can be changed from (3) to (8) after weighting bipartite graphs.
where PR(c) is the access probability of course c, out(t) is the out-degree of teacher t, α is the probability of continuing random walking, in(t) is the in-degree of teacher t and u represents the target teacher.
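A weighted random walk with restart of this kind can be sketched in Python as shown below. Because the exact form of (8) is not reproduced here, the sketch assumes that the uniform 1/out-degree transition of standard PersonalRank is replaced by the edge weight divided by the total outgoing weight of the node; the function name and the toy graph are hypothetical, and all weights are assumed to be positive.

```python
def weighted_personal_rank(graph, root, alpha=0.7, max_iter=1000, tol=1e-10):
    """Random walk with restart on a weighted graph.

    graph: {node: {neighbor: positive weight}}, with every neighbor also a key.
    root:  the target node (the target teacher u in the text).
    alpha: probability of continuing the walk; 1 - alpha restarts at root.
    """
    rank = {node: 0.0 for node in graph}
    rank[root] = 1.0
    for _ in range(max_iter):
        nxt = {node: 0.0 for node in graph}
        for node, neighbors in graph.items():
            total = sum(neighbors.values())
            if total <= 0:
                continue  # isolated node: nothing to distribute
            for neighbor, weight in neighbors.items():
                nxt[neighbor] += alpha * rank[node] * weight / total
        nxt[root] += 1.0 - alpha
        delta = max(abs(nxt[node] - rank[node]) for node in graph)
        rank = nxt
        if delta < tol:
            break
    return rank


# Toy graph: teachers t1, t2 and courses c1, c2, c3 with symmetric weights.
graph = {
    "t1": {"c1": 2.0, "c2": 1.0},
    "t2": {"c1": 1.0, "c3": 1.0},
    "c1": {"t1": 2.0, "t2": 1.0},
    "c2": {"t1": 1.0},
    "c3": {"t2": 1.0},
}
pr = weighted_personal_rank(graph, root="t1")
```

In the toy graph, the heavier edge from t1 to c1 gives c1 a larger access probability than c2, which is the effect the weighting is meant to produce.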

C. SLOPE ONE RECOMMENDATION ALGORITHM BASED ON A WEIGHTED BIPARTITE GRAPH
When predicting the teaching quality of all teachers for a course, the Slope One algorithm treats all differences in teaching quality equally and does not consider the inherently different recommendation degrees between teachers and courses, which leads to low prediction accuracy and diversity. For example, when predicting the quality of Teacher t's teaching in course c, the set of courses taught by Teacher t is identified first. The course quality differences calculated using (1) are then used to predict the quality of Teacher t's teaching in course c, and finally the average is taken. However, this treatment does not consider the effects of the different recommendation levels between teachers and courses. For example, in course-teaching behavior, when most teachers teach course c1 along with course c2, course c2 has a high probability of being recommended. If very few teachers choose to teach course c3 along with course c1, then course c3 has a low probability of being recommended. Based on the above analysis, when we predict course-teaching quality using the Slope One algorithm, we first calculate the access probability between teachers and courses using (8); that is, we obtain the PR between teachers and courses. Then, we incorporate the recommendation degree PR into the Slope One prediction equation to improve the prediction accuracy and diversity of the Slope One algorithm. The improved Slope One teaching quality prediction is shown in (9).
In (9), S(t, c) = {i | i ∈ C(t), i ≠ c, C_tc(R) > 0}. C(t) represents the set of courses taught by teacher t. S(t, c) represents the set of courses taught by teacher t, other than c, that have been taught together with course c at least once. dev_ci denotes the difference in teaching quality between courses c and i. PR(t) is the degree of recommendation between course c and teacher t obtained using the weighted bipartite graph.
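The idea of mixing a recommendation degree into Slope One can be illustrated with the Python sketch below. Equation (9) is not reproduced here, so the weighting scheme is an assumption rather than the authors' exact formula: each Slope One term dev_ci + r_ti is weighted by a PR value associated with course i, and the result is normalized by the sum of the weights. With no PR values, the function reduces to standard (unweighted) Slope One. All names are hypothetical.

```python
def deviation(ratings, c, i):
    """Mean of r[c] - r[i] over teachers who taught both courses, else None."""
    diffs = [r[c] - r[i] for r in ratings.values() if c in r and i in r]
    return sum(diffs) / len(diffs) if diffs else None


def predict(ratings, teacher, course, pr=None):
    """Predict the teaching quality of `teacher` in an untaught `course`.

    ratings: {teacher: {course: teaching quality}}
    pr:      optional {course: PR value}; assumed weighting, see lead-in.
    """
    numerator = 0.0
    denominator = 0.0
    for i, r_ti in ratings[teacher].items():
        if i == course:
            continue
        dev = deviation(ratings, course, i)
        if dev is None:
            continue  # course i was never taught together with `course`
        weight = 1.0 if pr is None else pr.get(i, 0.0)
        numerator += weight * (dev + r_ti)
        denominator += weight
    return numerator / denominator if denominator > 0 else None


ratings = {
    "t1": {"c1": 4.0, "c2": 3.0},
    "t2": {"c1": 5.0, "c2": 3.0, "c3": 4.0},
    "t3": {"c2": 4.0, "c3": 5.0},
}
plain = predict(ratings, "t1", "c3")
weighted = predict(ratings, "t1", "c3", pr={"c1": 0.25, "c2": 0.75})
```

In this toy example the two Slope One terms are 3 (via c1) and 4 (via c2), so the plain average is 3.5, while giving c2 three times the weight of c1 moves the prediction to 3.75.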

D. IMPROVED TEACHERS' RECOMMENDATION MODEL
We can predict the teaching quality between teachers and courses using (9) and then select the top N teachers with high teaching quality by course to achieve the recommendation of course teachers. However, such processing does not consider teachers' characteristics, and the recommendation matching is poor. For example, among the top N teachers, teacher t1, with the highest course-teaching quality ranking, and teacher t2, with the second highest ranking, had very different values regarding teaching style and TPACK characteristics. In contrast, teachers t1 and t3, who ranked 3rd, were very similar in terms of the values of each dimension. Therefore, we should prioritize teacher t3 over teacher t2 to reflect the true match. To this end, we constructed an improved recommendation model for teaching teachers based on the current TOP-N recommendations. The model selects the teacher with the highest quality among the TOP-N teachers. Among the teaching styles and TPACK characteristics of this teacher, the dimension with the largest value for each was identified as the basis for comparison with the corresponding dimension values for the two characteristics of other teachers. Finally, the original recommendation list is rearranged according to the principle that the values of the two dimensions are most similar. A new TOP-N recommendation with a high degree of matching is obtained. The improved recommendation model is represented by Equation (10).
In (10), dif(t), t ∈ {i | 1 ≤ i ≤ N, i ≠ u}, is the sum of the absolute values of the differences between the maximum dimensional values of teacher u on teaching style and TPACK and the values of teacher t on the corresponding dimensions. This can be expressed by (11).
In (11), MaxTs = max(u→Ts_i), i ∈ {Legislative, Executive, Judicial, Global, Local, Liberal, Conservative}, is the dimension with the maximum value among the teaching style dimensions of teacher u. MaxTpack = max(u→Tpack_j), j ∈ {TK, CK, PK, PCK, TPK, TCK, TPACK}, is the dimension with the maximum value among the TPACK dimensions of teacher u.
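The re-ranking step can be sketched in Python as follows. The sketch follows the verbal description: the top-ranked teacher u is kept first, the dominant teaching-style and TPACK dimensions of u are identified, and the remaining candidates are sorted by ascending dif(t). The data layout, the function name, and the toy values (which use only two dimensions per feature for brevity) are hypothetical.

```python
def rerank_top_n(candidates, style, tpack):
    """Re-rank candidates by similarity to the top teacher's dominant dimensions.

    candidates: teacher ids sorted by predicted teaching quality, best first.
    style:      {teacher: {style dimension: value}}
    tpack:      {teacher: {TPACK dimension: value}}
    """
    if not candidates:
        return []
    u = candidates[0]
    max_ts = max(style[u], key=style[u].get)     # dominant style dimension of u
    max_tpack = max(tpack[u], key=tpack[u].get)  # dominant TPACK dimension of u

    def dif(t):
        return (abs(style[u][max_ts] - style[t][max_ts])
                + abs(tpack[u][max_tpack] - tpack[t][max_tpack]))

    # sorted() is stable, so ties keep the original quality ordering.
    return [u] + sorted(candidates[1:], key=dif)


style = {
    "t1": {"Legislative": 30, "Executive": 20},
    "t2": {"Legislative": 15, "Executive": 28},
    "t3": {"Legislative": 29, "Executive": 21},
}
tpack = {
    "t1": {"TK": 12, "CK": 14},
    "t2": {"TK": 13, "CK": 8},
    "t3": {"TK": 11, "CK": 13},
}
reranked = rerank_top_n(["t1", "t2", "t3"], style, tpack)
```

Here dif(t2) = 15 + 6 = 21 and dif(t3) = 1 + 1 = 2, so t3 moves ahead of t2, matching the t1, t2, t3 example in the text.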

V. EXPERIMENTS AND RESULTS ANALYSIS
This section presents the experimental dataset and evaluation metrics. Then, the important parameters of the algorithm are discussed. Finally, to validate the effectiveness of the proposed algorithm, we conducted an experiment to compare teaching-teacher recommendations using different algorithms and the proposed algorithm on three disciplinary datasets.

A. EXPERIMENTAL DATA SETS
Experimental data were obtained from teachers and courses in the three disciplines at the university. The acquisition and preprocessing methods introduced in Section III were first applied to obtain the teachers' teaching styles and TPACK data, course difficulty coefficients, and course teaching quality data for the three disciplines. Subsequently, three teacher-course matrices (LDs, EDs, and PDs) were constructed as experimental datasets, one for each discipline, using teaching style and TPACK as attributes of teachers, difficulty coefficients as attributes of courses, and teaching quality of courses as values. All three datasets were sparse matrices, and according to their sparsity, we labeled them as high, medium, and low sparsity. The statistics are presented in Table 7.
A simple data analysis was conducted. We compared the mean values of the seven teaching styles of teachers in the three disciplines to obtain the results shown in Fig. 3. From Fig. 3, we can see significant differences in teachers' teaching styles across disciplines. In the functional dimension, Pedagogy teachers have higher Legislative and Judicial styles, whereas Engineering teachers have a higher Executive style. On the level dimension, Engineering teachers had a higher Global style than teachers in the other two disciplines, and teachers in the three disciplines did not differ significantly in their Local style. Regarding the tendency dimension, Pedagogy teachers had a strong Liberal style and Engineering teachers had a slightly more Conservative style than teachers in the other two disciplines.
In addition, we calculated the variances of the means of the seven teaching styles across Literature, Engineering, and Pedagogy teachers as 5.62, 4.83, 7.14, 17.70, 0.13, 4.90, and 6.46, respectively. These variances indicate that the means of the teaching styles differed across disciplines, particularly in the Global style.
We then compared the mean values of TPACK for the three discipline teachers, and the results are shown in Fig. 4.
From Fig. 4, we can see that the mean size of each dimension varies because the number of question items in each dimension of TPACK is different. For example, the TP dimension has six questions with a score range of 6-30, while the CK dimension has only three questions with a score range of 3-15. We cannot simply make a cross-sectional comparison in terms of the mean size, which is meaningless.
Nevertheless, we can compare the disciplines within each dimension of TPACK. The comparison revealed that the values of each TPACK dimension for teachers of the three disciplines were interleaved, with no discipline consistently highest, reflecting the differences in the TPACK of teachers of different disciplines, in line with the actual situation. In addition, we can see from the variance of each dimension that teachers of the three disciplines did not differ significantly in TCK. However, the difference in TK was significant.

B. EVALUATION METRICS
Precision and Recall were used to evaluate the hit rate of the algorithm to verify the accuracy of the proposed method for top N teachers' recommendations.
In the above two equations, Test is the test data set, R(t, c) is the teachers recommended for course c, and Q(t, c) is the teachers who teach course c.
In addition, we used the RMSE to measure the accuracy of the recommendation algorithm. The performance of the recommendation system is better when the RMSE is lower.
where |Test| denotes the size of the test set, $tcq_{tc}$ the actual course-teaching quality, and $\widehat{tcq}_{tc}$ the predicted course-teaching quality.
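The three metrics can be sketched in Python as follows. The sketch assumes the usual TOP-N definitions: Precision is the number of hits divided by the number of recommended teachers, Recall is the number of hits divided by the number of teachers who actually teach the course, both summed over the test courses, and RMSE is the square root of the mean squared prediction error. The function names and data layout are hypothetical.

```python
from math import sqrt


def precision_recall(recommended, actual):
    """Top-N precision and recall summed over all test courses.

    recommended: {course: set of recommended teachers}  (R in the text)
    actual:      {course: set of teachers who teach it} (Q in the text)
    """
    hits = n_recommended = n_actual = 0
    for course, truth in actual.items():
        recs = recommended.get(course, set())
        hits += len(recs & truth)
        n_recommended += len(recs)
        n_actual += len(truth)
    precision = hits / n_recommended if n_recommended else 0.0
    recall = hits / n_actual if n_actual else 0.0
    return precision, recall


def rmse(pairs):
    """Root mean squared error over (actual, predicted) quality pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one pair is required")
    return sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))


recommended = {"c1": {"t1", "t2"}, "c2": {"t3", "t4"}}
actual = {"c1": {"t1"}, "c2": {"t3", "t5"}}
precision, recall = precision_recall(recommended, actual)  # 2 hits of 4 and of 3
error = rmse([(4.0, 3.0), (2.0, 4.0)])  # sqrt((1 + 4) / 2)
```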

C. ANALYSIS OF KEY PARAMETER
For the Slope One recommendation algorithm based on a weighted bipartite graph, the key parameter is the random walk probability α. This hyperparameter is the hopping probability coefficient, also known as the damping factor, and is a computational control variable in the algorithm.
The value of α determines the PR values, that is, the probabilities of visiting the other nodes from the starting node, after the algorithm converges. This affects the predicted teaching-quality match and the accuracy of the recommendations. Therefore, we first determined an appropriate α for the three datasets to obtain better recommendation results.
Our experiments used a five-fold cross-validation. First, the three datasets were divided into five mutually exclusive subsets of similar size. Each subset maintained the data distribution as consistently as possible. That is, it was obtained through the stratified sampling of the three datasets. Subsequently, a concatenated set of four subsets was used as the training set and the remaining subset was used as the test set.
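The fold construction can be sketched in Python as shown below. This is a simplified illustration: it shuffles the observed teacher-course entries and splits them into five near-equal, mutually exclusive folds, but it omits stratification because the strata used in the experiments are not specified in the text. The function name, seed, and toy entries are hypothetical.

```python
import random


def k_fold_splits(entries, k=5, seed=0):
    """Yield (train, test) pairs for k-fold cross-validation.

    entries: observed matrix entries, e.g. (teacher, course, quality) tuples.
    Simplification: random folds without stratification.
    """
    items = list(entries)
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i in range(k):
        test = folds[i]
        train = [e for j, fold in enumerate(folds) if j != i for e in fold]
        yield train, test


entries = list(range(23))  # stand-ins for observed teacher-course entries
splits = list(k_fold_splits(entries))
```

Each entry appears in exactly one test fold, and each training set is the concatenation of the other four folds.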
In our experiments, we first fixed the maximum number of iterations of the PersonalRank algorithm for the bipartite graph to 1000. Then, we gradually increased the parameter α from 0.05 to 0.95 in steps of 0.05. Finally, we selected one course from each dataset and used the improved algorithm to recommend five teachers for it. The changes in RMSE are shown in Fig. 5.
From Fig. 5, we can see that the RMSE of the algorithm for the TOP-5 recommendations on the three datasets with different sparsities varies with α. At α = 0.7, the RMSE for the three datasets was the smallest and the recommendation accuracy was high. Therefore, we fixed the random walk probability parameter α to 0.7 in the subsequent experiments. In addition, we found that the sparsity of the datasets has a considerable effect on the recommendation results.

D. COMPARATIVE EXPERIMENTS AND RESULTS ANALYSIS
1) COMPARISON EXPERIMENT I
In this experiment, we used the Slope One (SO), Bipartite Graph PersonalRank (BG), Weighted Bipartite Graph PersonalRank (WBG), and Weighted Bipartite Graph-based Slope One (SOWBG) algorithms to compare their TOP-N recommendations on three datasets. In the experiments, the number of iterations in the BG, WBG, and SOWBG algorithms was set to 1000, and the random walk probability α was 0.7. We observed changes in Precision, Recall, and RMSE when N was increased from 2 to 7 in steps of 1. The results are presented in Figs. 6-8.
It is evident from Fig. 6 that the RMSEs of the four algorithms are significantly different, owing to the different sparsities of the three datasets. In addition, among the four algorithms, the RMSE of the proposed SOWBG algorithm on the three datasets was significantly smaller than those of the other three algorithms, with high recommendation accuracy. The RMSE of the weighted WBG algorithm was slightly smaller than that of the BG algorithm. The RMSE of the SO algorithm was the largest, indicating that using only the mean method to predict the teaching quality did not yield the expected results. Overall, the RMSE of the four recommendation algorithms increased as the number of recommendations (N) increased, indicating that the RMSE was significantly correlated with the number of recommendations (N). These differences arise because the SO algorithm relies only on teaching behavior, suffers from cold-start and sparsity problems, and does not correlate well with the characteristics of teachers and courses. The BG algorithm relies only on node degree to achieve resource diffusion and cannot effectively recommend new teachers and courses. The WBG algorithm is an improvement of the BG algorithm. Although our improved WBG algorithm is associated with teaching quality and course difficulty, it still does not correlate well with the teaching style and TPACK, leading to lower recommendation accuracy, thus making its RMSE inferior to that of the SOWBG algorithm.
As shown in Figs. 7 and 8, both the Precision and Recall of the proposed SOWBG algorithm were higher than those of the other three algorithms for three datasets with different sparsities. This indicates that the proposed SOWBG algorithm outperformed any single algorithm in the SOWBG cascade algorithm.

2) COMPARISON EXPERIMENT II
CF algorithms are recommendation algorithms based on user behavior data, including neighborhood-based algorithms, latent factor models, and graph-based random walking algorithms. Typical neighborhood-based algorithms include user-based CF algorithms (User-CF) and item-based CF methods (Item-CF) [64]. In this study, the user is the teacher and the item is the course. A typical latent factor model is the LFM [65]. A typical graph-based random walking algorithm is PersonalRank, which was incorporated into the proposed algorithm.
In this experiment, we compared the changes in RMSE, Precision, and Recall of the User-CF, Item-CF, LFM, and SOWBG algorithms on the three datasets when the number of recommendations N was increased from 2 to 10 in steps of 1.
In the experiments, we fixed the learning rate ϕ = 0.006, regularization parameter λ = 0.002, and number of latent factors f = 80 for the LFM algorithm. We fixed the random walk probability α = 0.7 in the SOWBG algorithm. The experimental results are presented in Figs. 9-11.
As shown in Fig. 9, the SOWBG algorithm outperformed the typical CF algorithm in terms of RMSE for the three datasets with different sparsities. Among them, the User-CF and Item-CF performances were the worst on sparse datasets; for example, the RMSE for the EDs dataset TOP-5 recommendations was 59.26% and 54.85% higher than that of SOWBG, respectively. This is because the User-CF and Item-CF algorithms become inaccurate in calculating the similarity between users or items on sparse datasets and cannot find the correct set of nearest neighbors for the target user or item, which leads to a higher RMSE of the recommendation results. LFM is a matrix-solving method for predicting the teaching quality of untaught courses for teachers, and has some advantages in handling sparse datasets and cold starts. However, the RMSE for the EDs dataset TOP-5 recommendations is 10.29% higher than that for SOWBG. This is because the LFM algorithm is only associated with teaching quality in the teacher-course sparse matrix that we provide and fails to effectively associate with the teacher's teaching style, TPACK, and course difficulty, making the predicted teaching quality of the teacher's untaught courses lower or unrealistic, thus leading to a higher RMSE of the recommendation results. This shows that the RMSE of the proposed SOWBG algorithm was better than that of the typical CF algorithm.
Figs. 10 and 11 show that the SOWBG algorithm has better Precision and Recall on different sparse datasets than the other typical CF algorithms. For example, the Precision and Recall of the SOWBG algorithm are 126.6% and 110.4% higher than those of the User-CF and Item-CF algorithms, respectively, and 10.8% higher than those of the LFM algorithm on the TOP-5 recommendations of dataset EDs. The reason for these results is that User-CF and Item-CF are inaccurate in calculating the similarity between users or items in sparse datasets. For the LFM algorithm, Precision and Recall are still lower than for the SOWBG algorithm because the LFM algorithm does not consider the teacher's teaching style and TPACK or course difficulty characteristics.
The comparison experiments above show that our proposed SOWBG algorithm is better than both the typical CF algorithms and any single algorithm cascaded in SOWBG in terms of recommendation accuracy and precision. Therefore, the SOWBG algorithm can implement course teacher recommendations with the expected performance. This provides a new approach for scientifically recommending university course teachers so as to improve overall teaching quality.

VI. CONCLUSION
Teacher characteristics, such as educational background, degree, professional title, age, teaching age, job burnout, and academic research, affect course-teaching quality. Existing studies cannot effectively determine whether a significant association exists between these characteristics and teaching quality. However, by reviewing a large body of the literature, we found that teachers' teaching styles and TPACK were relatively stable and correlated with several teacher characteristics. Thus, we believe that it is feasible and more operational to examine the association between teacher-teaching styles and TPACK characteristics, in terms of their association with teaching quality.
To test our hypothesis and accomplish the challenging task of recommending the right teachers for university courses to improve teaching quality, we carried out the following work.
• We constructed Teacher TPACK, Course Teaching Quality, and Course Difficulty Questionnaire scales. After validating the reliability and validity of these scales, we used them and the Teaching Style Evaluation Inventory proposed by Grigorenko and Sternberg to collect relevant data from the three discipline majors using several online questionnaires.
• We constructed an experimental dataset of teacher-course sparse matrices for the three disciplines, using teaching style and TPACK as teacher characteristics, course difficulty as course characteristics, and course-teaching quality as the association between teachers and courses.
• We proposed a weighted bipartite graph-based Slope One algorithm to implement the TOP-N recommendations of teachers for courses.
• We also compared the proposed algorithm with the classical CF algorithm to verify whether the proposed algorithm had better accuracy and precision.
The above treatments verified the correctness of our proposed hypotheses with the following three characteristics.
• The Teacher TPACK, course difficulty, and course-teaching quality scale questionnaires were scientifically developed, and the scale questionnaires were reliable and valid. Using these scales and TSTI, we collected data on teachers, courses, and teaching quality in different disciplines. These data were normalized to remove the effects of the scales. This study provides a reference model for data collection and quantification in education- and teaching-related research.
• Using teaching style and TPACK as teacher characteristics, course difficulty coefficients as course characteristics, and course-teaching quality as correlation values, an experimental dataset of teacher-course sparse matrices was constructed, which reflected the correlations among the four: teacher, course, teaching quality, and course difficulty. This provides a reference method for effectively implementing the correlation between education- and teaching-related features.
• A Slope One algorithm based on a weighted bipartite graph was proposed. The algorithm uses a cascading approach to construct a weighted teacher-course bipartite graph that effectively alleviates the sparsity of the dataset by predicting the teaching quality of most teachers' untaught courses using an improved PersonalRank algorithm based on random walking. The Slope One algorithm was then used to further predict the teaching quality for any missing courses in the matrix, which also solved the cold-start problem. Finally, a comprehensive recommendation method was constructed by combining teachers' teaching styles and TPACK features to achieve teachers' TOP-N recommendations for courses. This study provides a reference for solving problems in education and teaching.
Although this method can scientifically and effectively solve teacher recommendation problems and improve teaching quality, it has certain limitations.
• The workload of using many questionnaire scales to collect teachers' teaching styles and TPACK characteristics is very heavy, and there is a phenomenon of incomplete data collection. The workload of quantifying these data and verifying the reasonableness of the data was also heavy.
• The recommendation accuracy and precision of the algorithm depend on the sparsity and comprehensiveness of the dataset. Datasets with different sparsities affect recommendation results and require a more comprehensive dataset as a guarantee.
Therefore, future extensions of this study will focus on the following:
• Thinking about methods to simplify the feature data collection and quantification process;
• Designing a new method based on a recent GNN [66] to more effectively realize teaching quality prediction and teacher recommendation under sparse datasets.