Adaptive Learning Support System Based on Automatic Recommendation of Personalized Review Materials

In this study, we propose an integrated system to support learners' reviews. In the proposed system, the review dashboard is used to recommend review contents that are adaptive to the individual learner's level of understanding and to present other information that is useful for review. The pages of the digital learning materials that are estimated to be insufficiently understood by each learner and the webpages related to those pages are recommended. As a method for estimating such pages, we consider extracting the pages related to the questions that were answered incorrectly. We examined the accuracy of matching each question with the pages of the learning materials. We also conducted an experiment to verify the usefulness of the system and its effect on learning using a review dashboard. In the experiment, the evaluation of the review dashboard indicated that at least half of the participants found it useful for most types of feedback. In addition, the rate of change in quiz scores was significantly higher in the group using the review dashboard, which indicates that using the review dashboard has the effect of improving learning.


I. INTRODUCTION
WITH the development of information technology in recent years, information and communication technology (ICT) has been introduced in various fields. The application of information technology to education and learning is called technology-enhanced learning (TEL), and it is changing the field of education. An advantage of introducing ICT into the field of learning is that it not only provides convenience to learners and teachers but also helps collect learning data, in contrast to learning in the conventional offline environment.
Recently, attention has been focused on research in the field of learning analytics (LA) [1]. According to the SoLAR, an organization that holds international conferences on LA, LA is defined as "the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments" [2]. The process of LA consists of the accumulation of teaching and learning data, analysis of data, feedback of analysis results to learners and teachers, and evaluation/improvement of the effect of feedback. This cycle enables us to understand the methods of learning and their results and to support learning by providing feedback according to the obtained information. There are various research topics in LA, such as analysis of patterns of learning behavior [3], [4], discovery of at-risk students [5], [6], prediction of learning performance [7], [8], lectures based on learning logs and acquired knowledge [9], learning materials [10], and knowledge and calculation problems [11]. These studies show that improving educational methods on the basis of the results of analysis using learning data is an important process for improving learning.
One of the research areas in LA is the study of systems that recommend learning content to learners to support the learning process in which learners can achieve their learning goals [12], [13]. As mentioned above, with the rapid development of ICT and the consequent increase in the number of online learning materials, online learning and its platforms have become indispensable for learning various subjects. As in the case of conventional face-to-face learning, it is necessary to incorporate some mechanism into the online learning process to evaluate the learner's learning progress. For example, in online lectures at universities, quizzes are often included at the end of each learning topic. These assessments help learners confirm what they have learned and their understanding of the topic content after the lecture time [14].
One way to compensate for the lack of knowledge revealed by these assessments is to recommend the appropriate contents for review, i.e., studying again what the learner has already studied. However, many recommender systems focus on recommending the next learning activity to be done, and they rarely focus on reviews. In the actual learning process, it is difficult to acquire specific knowledge completely in one learning session [15]. The psychological literature shows that forgetting is strongly influenced by the temporal distribution of study time, and that temporally spaced learning leads to more robust and durable learning than massed learning [16]. In [17], personalized reviews based on a psychological theory of memory have been shown to improve knowledge retention. Thus, it is important to review the material each time, even if it has already been learned. In conventional face-to-face learning with a large number of learners for one teacher, the same materials for review have to be distributed uniformly, considering the burden on the teacher. In contrast, with online platforms, it has become possible to automatically generate and recommend individual review contents suitable for each learner based on the collected learning data, such as quiz results.
This work involved human subjects or animals in its research. Approval of all ethical and experimental procedures and protocols was granted by Experimental Ethics Committee of the Faculty of Information Science and Electrical Engineering, Kyushu University, under Application No. H30-13 and H30-13-1.
Digital Object Identifier 10.1109/TLT.2022.3225206
In this study, we aim to support effective review and improve learning performance by developing a new system that recommends individual review contents based on learning data. The proposed system also aims to serve as a comprehensive review dashboard by presenting other information that is useful for review. The system is intended for use in university courses, where each lecture is given using slides as digital learning materials and is followed by a quiz. Based on the learning logs of the digital learning materials and the results of quizzes given in the lectures to check the students' understanding, the system provides individual review information for the students. Specifically, the system presents the following contents, which are useful for review, to a learner: 1) summary of the results of quizzes; 2) summary of the browsing time for learning materials; 3) recommendation of the pages of digital learning materials that are estimated to be poorly understood by the learner and webpages related to those pages. Therefore, we address the following research question: "Does a learner who reviewed using the information presented by the proposed system perform better on assessments after the review than a learner who reviewed without such information?" The two groups of students are compared on the basis of their final test scores after they have carried out the review procedure with and without the system. The main contribution of this study is to show, through this experiment, that the proposed system is effective in improving learning performance.
A more detailed question is whether the recommended learning contents are appropriate for the learners. This point is effectively evaluated by questionnaires given to users. As one of the elements for recommending review contents, we develop a method to estimate the pages of learning materials related to quizzes. In order to develop a method with good performance from a quantitative perspective, the ground truth data generated by the instructor is used to evaluate the accuracy of the candidate methods.
The rest of this article is organized as follows. In Section II, related research on recommender systems in the field of education is described. Section III gives an overview of the proposed system and its structure. Next, in Section IV, a method for recommending the review contents is shown. Section V describes the details of the functions of the review dashboard developed for recommending review contents. In Section VI, the results of the experiments of the proposed recommender system in a university environment are presented. In Section VII, we discuss the results of the experiments and present the limitations of our study as well as topics for future research. Finally, Section VIII concludes this article.
The following contents have been added to the previous version: a detailed explanation of the concept, a comprehensive survey of the related literature, more candidate methods for recommending review materials, comparison of their performance, and evaluation of the system through empirical experiments. In the experiment, we explained to the subjects and obtained their agreement for the use of their data in the research.

II. RELATED WORK
With the development of digital learning environments, recommender systems for e-learning have become popular in the field of education [19]. The purpose of this type of system is to suggest the most efficient and effective learning content to achieve learning objectives among learning resources [12]. Furthermore, adaptive navigation according to real-time needs of learners must be realized. Therefore, some recommender systems have been proposed that make appropriate recommendations by analyzing learners' behavior and using achievement tests to understand learners' tendencies [20], [21].
There are various types of methods for recommendation. According to [22], recommendation methods can be classified into content-based [23], [24], collaborative filtering [25], [26], knowledge-based [27], and hybrid recommender systems that combine multiple methods [28], [29]. In the early stages of this field, the focus was on linking learning resources using term-based similarity [30], [31], which has since been replaced by modern text processing approaches, such as topic modelling and concept extraction [32], [33].
Systems for recommending online learning contents were shown in [10], [14], [26], [34], and [35]. Yang et al. [35] proposed a system that uses text information attached to video materials to recommend other similar video materials. Thaker et al. [14] proposed a system that updates the students' state of knowledge based on the results of quizzes to adaptively recommend text materials. Furthermore, there are several systems that recommend related webpages [36], [37], [38]. Liang et al. [36] proposed a method to recommend articles in Wikipedia related to each chapter of open-source online textbooks. Nakayama et al. [37], [38] proposed a method to extract important words from digital textbooks using term frequency-inverse document frequency (TF-IDF) and recommend webpages related to each page of digital textbooks. Chris et al. [39] proposed a method called deep knowledge tracing (DKT), which models the learner's knowledge state by applying a recurrent neural network [40] to the answer history of exercises. Based on the degree of understanding of learners in various learning topics calculated in the DKT model, the system proposed in [41] recommends appropriate exercises. There are also systems that recommend courses to take [42], [43], [44], and systems that recommend academic papers [45].
Among the various recommended contents, we focus on recommender systems for learning materials in e-learning courses. Zaiane [46] developed an agent that recommends shortcuts of course materials based on learners' access history through the use of web mining technology. Bauman and Tuzhilin [47] proposed an approach to recommending learning materials based on knowledge gaps. The method classifies learners into expert, intermediate, and unknown, based on the learner's previous success rate, and recommends appropriate material from the library. Hsieh et al. [48] proposed a method that constructs an appropriate learning path by using fuzzy logic and selects optimal learning materials based on the learner's misconceptions discovered in a prior quiz. Drachsler et al. [12] conducted a detailed survey of other recommender systems in TEL. The recommender systems were classified into seven clusters according to method and recommendation target. Jeevamol and Renumol [49] proposed an ontology-based content recommendation system to achieve personalized learning content recommendation based on learner preferences and goals. Takii et al. [50] constructed a system that recommends picture books containing words that a learner should learn next by utilizing a knowledge map and browsing logs in an e-book system.
Most recommender systems focus on recommending the learning activities to be addressed next, and there are few cases in which they recommend again contents that have already been studied. Some studies on recommending materials in remedial learning systems, such as [14] and [48], suggest adaptive methods for learners, but they prepare a pool of new learning materials from which to choose rather than supporting a review of materials that have already been used. In practice, however, it is difficult for students to acquire specific knowledge completely in a single learning session [15]; thus, they must review the materials even if they have already learned them. In this study, we focus on learners' review and aim to support learning in a way that compensates for the lack of knowledge of students by recommending adaptive review contents according to their individual understanding. Specifically, we propose a system that recommends to learners the pages of the digital learning materials that are estimated to be poorly understood by them and webpages related to those pages. We have not found any research on a recommender system that, as in this study, estimates the parts of the material that have been studied once but are not well understood and encourages the user to study them again for review. However, as mentioned above, it is necessary to review the material that was not understood at the first learning session. Therefore, it is worthwhile to investigate the effectiveness of a system that presents the parts of the material to be reviewed.
In addition, although slide-based learning materials are not uncommon in recent online learning environments, there has been little research on page-by-page recommendations for slide-based learning materials. Using the method proposed in this article, page-by-page recommendations can be realized for such slide-based learning materials.

III. SYSTEM OVERVIEW
The learning support system proposed here consists of a learning management system, Moodle, an e-book system, BookRoll [51], a learning record store (LRS) that stores log data of these two systems, and a review dashboard that presents review information, which is the focus of this study. Fig. 1 shows the overall structure of the proposed system.
Moodle is one of the most widely used learning platforms worldwide. In this study, Moodle customized for research at our university was used as one of the components of the system. In the proposed system, Moodle also serves as a hub for the entire system, with links to other learning tools.
BookRoll [51] provides not only page transitions but also standard learning support functions, such as markers, notes, bookmarks, and search. Students can also register their understanding of each page by selecting "understand" or "not understand." Fig. 2 shows the screen of BookRoll. Activities on the digital learning materials in BookRoll are recorded in the database as learning logs in real time. The learning logs include user ID, time stamp, teaching material ID, page number, and operation. Furthermore, BookRoll automatically extracts textual information from the registered learning materials. In the distribution of learning materials by Moodle, the record of accessing the uploaded file is saved, but the logs of detailed page transitions and actions are not stored. Therefore, BookRoll contributes to a more detailed analysis of learning activities.
The log data of Moodle and BookRoll are collected by the LRS. User IDs in the two systems are uniquely identifiable, allowing information to be used interchangeably.
In this study, we integrated the review dashboard into the digital learning environment that has been operated so far to provide support for review, which had not been considered previously. The review dashboard is implemented as a web application, and the learner has access via Moodle. The overall flow of using the proposed system is as follows. 1) Learners use learning materials in BookRoll to study some topic. 2) After they finish reading the learning materials, students take a quiz to check their understanding of the learning contents of the topic. 3) The system analyzes the log data obtained from the abovementioned activities stored in the LRS. 4) Review information is presented on the review dashboard for learners. The review dashboard presents the following three types of personalized feedback for review: 1) summary of the results of the quiz; 2) summary of learning time (i.e., browsing time for learning materials in BookRoll); 3) recommendation of materials for review. As mentioned above, when using the proposed system, it is assumed that students take quizzes after reading the material to check their understanding of the contents of the topic, and therefore, support for reflecting on the quiz results is a basic and necessary feedback function. By reviewing the study time for each page, the user can see which pages were read for a longer or shorter time, and use this information to select pages that need a review. In addition, by comparing with the learning time of other learners, the user can gauge the importance of pages that many learners read for a long time, which can help the user select pages for review. Recommendation of materials for review is the main function of the review dashboard. This function is aimed at improving learners' understanding of the parts of the material that they have already studied but do not understand well.
The method of automatically recommending materials for review is explained in Section IV. Details of the interface and functions of the review dashboard are provided in Section V.

IV. AUTOMATIC RECOMMENDATION OF REVIEW MATERIALS
In this section, we propose a method for estimating the pages where each learner lacks understanding and a method for obtaining the webpages related to the pages, which are employed by our system to recommend review materials.

A. Overview
When reviewing learning materials, starting reading from the beginning of the material again is not only time-consuming but may also lead to a loss of motivation due to the large amount of learning. Therefore, our system estimates the pages in the learning material that are considered to be lacking in each learner's understanding and recommends them as pages to be reviewed. By presenting only specific pages as points to be reviewed, learners can review them efficiently.
In our system, we define pages with insufficient understanding as follows: 1) pages related to the questions the learner got wrong on the quiz; 2) pages where the learner clicked the "not understand" button on BookRoll; 3) pages that many other learners found difficult to understand, i.e., the "not understand" button was clicked by many learners. Presenting only the pages of the prepared learning materials may also be insufficient for learners. From this viewpoint, the system also recommends the URLs of webpages related to the recommended pages to the learners at the same time, so that they can supplement the learning contents.

B. Extraction of Pages Related to Quizzes
To identify pages related to the quiz questions in which students made mistakes, it is necessary to link the related pages of the learning material to each question in the quiz. For example, if a quiz consists of ten questions on each topic, the related pages need to be linked to each of the ten questions. However, it is not desirable for teachers to register the related pages every time a quiz is created because it would be a burden on teachers. In this study, we propose a method for automatically matching relevant pages to each question in a quiz. For this purpose, we take the following steps: 1) transforming the text of each page in the learning material to a vector using Doc2Vec [49]; 2) transforming the text of each question in the quiz to a vector using Doc2Vec; 3) estimating related pages based on the similarity of vectors. Doc2Vec [49] is a technique for transforming documents of arbitrary length to vectors of fixed length, which can be utilized to obtain distributed representations of documents. Because the method does not depend on a specific task, it has been applied to many cases, such as document classification and spam filtering. Two algorithms exist for learning vectors to realize Doc2Vec: the distributed memory model of paragraph vectors (PV-DM) and the distributed bag of words version of paragraph vector (PV-DBOW). PV-DM considers the order of words in a document, whereas PV-DBOW does not consider the context of words in a document. For details, refer to [49]. The proposed system deploys the PV-DBOW model, which showed superior performance in the tasks of this study in the experiments described in Section VI. We used the text information of each question and of the learning materials to match the pages of materials that are related to the content of each question in the quiz. For a question in the quiz, consider the case of estimating the relevant pages of learning materials consisting of N pages for some natural number N.
We note that our BookRoll system automatically extracted text information from the materials in advance.
Using the 300-D model trained by Doc2Vec for Japanese documents, the texts of N pages in the materials are transformed into N vectors in 300-D space, $v_{p_i}$ $(1 \le i \le N)$. Similarly, a 300-D vector $v_q$ can be obtained from the text of the question.
The cosine similarity between vectors $v_{p_i}$ and $v_q$, denoted by $\mathrm{sim}(v_{p_i}, v_q)$, is calculated using the following formula:

$$\mathrm{sim}(v_{p_i}, v_q) = \frac{v_{p_i} \cdot v_q}{\lVert v_{p_i} \rVert \, \lVert v_q \rVert}$$

The proposed method calculates the cosine similarity between each pair $v_{p_i}$ $(1 \le i \le N)$ and $v_q$ and ranks the pages of the learning material in the order of their values. Then, the top-ranked pages are considered to be the pages related to the question.
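The ranking step can be illustrated with a minimal sketch. It assumes that the page vectors and the question vector have already been obtained from a trained Doc2Vec model; the function names `cosine_similarity` and `rank_pages`, the tie-breaking rule, and the toy 3-D vectors are illustrative assumptions rather than part of the system described here.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_pages(page_vectors, question_vector, top_k=4):
    """Return the 1-based page numbers of the top_k pages most similar to the question.

    Ties are broken by the smaller page number (an assumption of this sketch).
    """
    scored = [
        (cosine_similarity(vec, question_vector), page_no)
        for page_no, vec in enumerate(page_vectors, start=1)
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page_no for _, page_no in scored[:top_k]]


# Toy 3-D vectors standing in for the 300-D Doc2Vec vectors.
toy_pages = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_question = [0.0, 1.0, 0.1]
print(rank_pages(toy_pages, toy_question, top_k=2))
```

In this toy case, page 3 points almost in the same direction as the question vector and is ranked first, followed by page 2.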

C. Extraction of Pages Based on Learners' Responses
BookRoll, our e-book system, has two buttons named "understand" and "not understand" that allow learners to indicate their level of understanding for the contents of each page in learning materials. Students can register their understanding of each page by clicking the button, and this information is stored in the LRS as a learning log. In the proposed system, we simply use this log to present the pages for which the "not understand" button was clicked as a target for review. In addition, the top four pages for which the number of clicks on the "not understand" button was the highest by learners using the same material are also presented. The purpose of this function is to show pages in the target materials where many learners are likely to stumble, so that a learner can pay attention to points that they may not have noticed by themselves.
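A minimal sketch of this log-based extraction is given below. The record layout (dictionaries with `user_id`, `material_id`, `page`, and `operation` keys) and the operation label `"NOT_UNDERSTAND"` are hypothetical stand-ins for the actual BookRoll log format, and ties in the click count are broken by page number as an assumption of the sketch.

```python
from collections import Counter

NOT_UNDERSTAND = "NOT_UNDERSTAND"  # hypothetical label for the "not understand" click


def pages_marked_by_learner(logs, user_id, material_id):
    """Pages of one material on which the given learner clicked "not understand"."""
    return sorted({
        row["page"]
        for row in logs
        if row["user_id"] == user_id
        and row["material_id"] == material_id
        and row["operation"] == NOT_UNDERSTAND
    })


def most_marked_pages(logs, material_id, top_n=4):
    """The top_n pages with the most "not understand" clicks over all learners."""
    counts = Counter(
        row["page"]
        for row in logs
        if row["material_id"] == material_id and row["operation"] == NOT_UNDERSTAND
    )
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [page for page, _ in ranked[:top_n]]


sample_logs = [
    {"user_id": "u1", "material_id": "m1", "page": 3, "operation": NOT_UNDERSTAND},
    {"user_id": "u1", "material_id": "m1", "page": 7, "operation": NOT_UNDERSTAND},
    {"user_id": "u2", "material_id": "m1", "page": 7, "operation": NOT_UNDERSTAND},
    {"user_id": "u2", "material_id": "m1", "page": 2, "operation": "NEXT"},
    {"user_id": "u3", "material_id": "m2", "page": 7, "operation": NOT_UNDERSTAND},
]
print(pages_marked_by_learner(sample_logs, "u1", "m1"))
print(most_marked_pages(sample_logs, "m1", top_n=4))
```

The sketch counts raw clicks; whether a later "understand" click on the same page should cancel an earlier "not understand" click is left open here.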

D. Recommendation of Related Webpages
In Sections IV-B and IV-C, two methods for identifying pages in a learning material that are not well understood are proposed. However, the effects obtained by applying these techniques may be insufficient if the content of the original materials is insufficient. To avoid such problems, we recommend supplementary materials that are not included in the original materials, so that learners can supplement their understanding of the learning contents and learn more broadly. Therefore, the proposed system also recommends hyperlinks to webpages related to the contents of the target pages as supplementary learning materials.
Numerous studies have focused on creating hyperlinks to external webpages for entire textbooks or for each chapter of a textbook, such as the study on automatically creating hyperlinks between textbooks and Wikipedia [36]. Nakayama et al. [37], [38] proposed a method to recommend webpages related to the contents of each page of a digital textbook. Because our system also targets the same digital learning materials, we used this method to recommend webpages related to the contents of each page.
The following three steps were employed to determine the recommended webpages associated with each page: 1) extracting words from digital materials; 2) calculating the importance of extracted words; 3) determining webpages to recommend based on important words. We ranked the words in the text of the learning materials based on their importance, and used the top n words as a query to search for webpages related to the contents of the page. First, the morphological analysis tool MeCab [52] was applied to extract nouns from the text of the learning material. Then, the importance of each extracted word was calculated by applying the TF-IDF method [53]. Finally, based on the importance of the words, the recommended information was determined.
In the proposed system, we applied this method to recommend five URLs of webpages obtained from queries by the top five important words on each page of a learning material.
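The word-ranking step can be sketched as follows. The sketch assumes that noun extraction has already been performed, so that each page is given as a list of nouns, and it uses one common TF-IDF variant (relative term frequency multiplied by the logarithm of the inverse document frequency); the exact weighting used in [37], [38] may differ. The function names and the toy English nouns are illustrative only, and the web search itself is not shown.

```python
import math
from collections import Counter


def tfidf_scores(pages):
    """Return one {word: TF-IDF score} dictionary per page.

    pages: list of pages, each page a list of already-extracted nouns.
    """
    n_pages = len(pages)
    document_frequency = Counter()
    for words in pages:
        document_frequency.update(set(words))
    scores = []
    for words in pages:
        counts = Counter(words)
        total = len(words)
        scores.append({
            word: (count / total) * math.log(n_pages / document_frequency[word])
            for word, count in counts.items()
        })
    return scores


def top_words(page_scores, n=5):
    """The n highest-scoring words of one page (ties broken alphabetically)."""
    ranked = sorted(page_scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:n]]


toy_pages = [
    ["firewall", "packet", "firewall", "network"],
    ["encryption", "key", "network"],
    ["password", "key", "password", "hash"],
]
all_scores = tfidf_scores(toy_pages)
query = " ".join(top_words(all_scores[0], n=2))
print(query)
```

The resulting string would then be passed as a query to a web search, and the returned URLs presented next to the page.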

V. REVIEW DASHBOARD
In this section, we show the interface of the review dashboard and the functions it provides for review.

A. Overview
In this study, we developed a review dashboard as a tool for presenting review information to learners. The review dashboard is implemented as a web application, and learners can access it from Moodle. When a user accesses the review dashboard, the system recognizes a user ID that is consistent across Moodle and BookRoll. Then, the user can select the topic to be reviewed. The topics displayed as options are only those for which the user has viewed the learning materials in BookRoll and has taken the quiz in Moodle. The review dashboard presents the following three types of personalized feedback for review: 1) summary of the results of the quiz; 2) summary of the learning time (i.e., browsing time for learning materials in BookRoll); 3) recommendation of materials for review.

B. Summary of the Quiz Results

C. Summary of Learning Time
Fig. 4 shows a screen displaying a summary of the browsing time for learning materials in BookRoll. In the bar graph of the screen, the horizontal axis represents the page number, and the vertical axis represents the browsing time (minutes), so that the user can see how long each page was read. When viewing this graph, a user who finds a page that was not read much may want to know what type of page it was. For this purpose, by placing the mouse cursor on each bar of the bar graph, a thumbnail of the page image is displayed, and the contents can be checked, as shown in Fig. 4.
The average browsing time of other users is also displayed so that users can compare it with their own. For example, a page with a long average browsing time for other users may be relatively important or difficult to understand. By checking such pages, learners may obtain information that they did not notice the first time they studied the material, but which becomes apparent when they read it carefully during review. In traditional face-to-face learning, this kind of awareness can be obtained through conversations with other learners who are taking the same course. However, this information is not usually available in online self-learning; therefore, it is considered useful.
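How the per-page browsing time might be derived from the learning logs is sketched below. The rule used here is an assumption of the sketch and not a description of the actual system: the time between two consecutive events of the same learner is attributed to the page of the earlier event, and long gaps are capped at a hypothetical `max_gap_seconds` to discount idle time. The event layout `(user_id, timestamp_seconds, page)` is also hypothetical.

```python
from collections import defaultdict


def browsing_minutes_per_page(events, max_gap_seconds=600):
    """Estimate minutes spent on each page by each learner for one material.

    events: iterable of (user_id, timestamp_seconds, page) tuples.
    Returns {user_id: {page: minutes}}.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, page in events:
        by_user[user_id].append((timestamp, page))

    totals = {}
    for user_id, rows in by_user.items():
        rows.sort()
        per_page = defaultdict(float)
        # The gap until the next event is credited to the page of the earlier event.
        for (timestamp, page), (next_timestamp, _) in zip(rows, rows[1:]):
            gap = min(next_timestamp - timestamp, max_gap_seconds)
            per_page[page] += gap / 60.0
        totals[user_id] = dict(per_page)
    return totals


def average_of_others(per_user, user_id):
    """Average minutes per page over all learners except user_id."""
    others = [u for u in per_user if u != user_id]
    if not others:
        return {}
    sums = defaultdict(float)
    for other in others:
        for page, minutes in per_user[other].items():
            sums[page] += minutes
    # Learners who never opened a page contribute zero minutes to its average.
    return {page: total / len(others) for page, total in sums.items()}


toy_events = [
    ("a", 0, 1), ("a", 120, 2), ("a", 300, 1),
    ("b", 0, 1), ("b", 60, 2), ("b", 240, 3),
    ("c", 0, 1), ("c", 180, 2),
]
per_user = browsing_minutes_per_page(toy_events)
print(per_user["a"])
print(average_of_others(per_user, "a"))
```

The two dictionaries correspond to the learner's own bars and the comparison values of other learners in a graph such as the one in Fig. 4.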

D. Recommendation of Materials for Review
As described in Section IV, the review materials recommended by the system are as follows: 1) the top four pages related to questions for which the learner submitted an incorrect answer in the quiz, 2) the pages where the learner clicked the "not understand" button, and 3) the top four pages for which the "not understand" button was clicked the highest number of times by learners who used the same material. In addition, the system recommends links to webpages related to the target pages in the learning material.
In the user interface of the review dashboard, as shown in Fig. 5, each reason for the recommendation is divided into an accordion menu style so that the user can easily understand why each page is recommended. Fig. 6 shows the screen when the accordion is opened, and the review material is viewed. In Fig. 6, the user made a mistake in "Question 2" of the quiz, and the related pages of "Question 2" were recommended for review. The left-hand side of the screen shows the recommended pages, and the right-hand side shows the links to the related webpages. A user can also check the contents of "Problem 2" on the same screen.
Because teaching materials usually consist of a series of related pages, we designed the system so that learners can move back and forth between the pages responsively by clicking on the gray triangle below the target page.
This also supports zooming in and out of page images, as shown in Fig. 7. Even when zooming in, page transition is possible so that learners can check the contents before and after the recommended page without any problems.

VI. EXPERIMENTAL RESULTS
In this section, we present the results of evaluation experiments of the proposed recommendation methods and the review dashboard used in a university environment.

A. Outline of the Experiments
Evaluation experiments for the recommender system in education were conducted to evaluate the extent to which the system is adapted to the specified requirements. The object of evaluation for that purpose was classified into the following three large categories [54]: 1) technical indicators of the algorithm; 2) user satisfaction and favorability; 3) effectiveness in learning. In the proposed system, to recommend learning materials for review, we extracted pages in the materials that were considered lacking in each learner's understanding. For this purpose, we considered pages related to the question in which the learner made a mistake. Therefore, to satisfy the requirements of the proposed system, we verified the accuracy of matching between the quiz questions and the pages of the learning materials by the experiment in Section VI-B.
In addition, the recommended items for review materials and the quality of other feedback information were evaluated using a questionnaire for the users in Section VI-C. In this evaluation experiment, we compared the effects of using the system continuously for approximately one month between two groups: one group who used the review dashboard and the other group who did not. We further verified the usefulness of the proposed system by conducting a detailed questionnaire survey of the groups that used the review dashboard.

B. Verification of Accuracy for Matching of Quiz and Pages of Materials
In this section, we describe an experiment to verify the accuracy of the automatic matching of related pages to each question in a quiz using the method described in Section IV-B.
As the dataset for this experiment, we used four learning materials and 57 questions of the corresponding materials that were used in the first-year course "cyber security for enterprise" at Kyushu University. Table I tabulates the titles of the four learning materials, the number of pages, and the number of questions in the corresponding quiz. Each question is a multiple-choice type question with three to five choices.
For each question in this dataset, we constructed a similarity ranking of the corresponding pages of the learning materials. The similarity ranking of a question q corresponding to a learning material A is the sequence of pages such as "page 10, page 11,..., page 1" in descending order of ranking. As ground truth data, we asked the course instructors to prepare pages that were considered necessary to answer each question in the quiz, for example, answering question q requires understanding page X. Using the abovementioned data, we examined whether the automatic matching of the pages required to answer the question fits with the ground truth data for similarity rank. For example, consider the similarity ranking "page 10, page 11,..., page 1" of a question q, and suppose that page 11 is required to answer q in the ground truth data. In this case, the system is able to output the required page within two pages of the top ranking.
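The evaluation measure can be sketched as follows. The sketch assumes that a question counts as matched at cutoff k if at least one of its ground-truth pages appears within the top k pages of its similarity ranking; whether all required pages must appear is a detail not fixed by this sketch. The function names and toy data are illustrative.

```python
def rank_of_first_required(ranking, required_pages):
    """1-based position of the first ground-truth page in the ranking, or None."""
    for position, page in enumerate(ranking, start=1):
        if page in required_pages:
            return position
    return None


def hit_rate_at_k(rankings, ground_truth, k):
    """Fraction of questions with a required page within the top k ranked pages.

    rankings: {question_id: [page, ...]} in descending order of similarity.
    ground_truth: {question_id: set of required pages}.
    """
    hits = 0
    for question_id, ranking in rankings.items():
        position = rank_of_first_required(ranking, ground_truth[question_id])
        if position is not None and position <= k:
            hits += 1
    return hits / len(rankings)


toy_rankings = {"q1": [10, 11, 9, 1], "q2": [3, 1, 2, 4]}
toy_truth = {"q1": {11}, "q2": {4}}
print(rank_of_first_required(toy_rankings["q1"], toy_truth["q1"]))
print(hit_rate_at_k(toy_rankings, toy_truth, k=2))
```

Here question q1 mirrors the example in the text: page 11 is required and appears second in the ranking, so it is matched within the top two pages.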
As described in Section IV-B, there are two algorithms to realize Doc2Vec, which is used to transform documents to vectors and calculate the similarity. The following two methods are also considered for comparison.
1) Word2Vec and TF-IDF method: Each word in the text of a page is transformed into a vector by Word2Vec, and the TF-IDF-weighted average of these word vectors is taken as the vector representing the page.
2) BERT method: We use a BERT model [55], a relatively new natural language processing model, pretrained on Japanese Wikipedia. In this study, we considered the text on each page of the material as a single document and transformed the document into a vector using the pretrained model.
In Fig. 8, each line shows, for one method, the rate y of the 57 questions for which the system was able to output the required pages within x pages of the top of the ranking. From the results, we can see that the best method is the PV-DBOW model, which is one of the proposed methods using Doc2Vec. The PV-DBOW model of Doc2Vec extracts about 68% of the pages within the four pages presented in the review dashboard, which is better than the 52% of the second-best method. Therefore, we adopt the PV-DBOW model, which shows high performance in this experiment, for the review dashboard.
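To make the Word2Vec and TF-IDF comparison method more concrete, the following sketch builds a page vector as a TF-IDF-weighted average of word vectors and ranks pages by cosine similarity to a question vector. It is an illustration under stated assumptions, not the implementation used in the study: the two-dimensional word vectors are hypothetical stand-ins for trained Word2Vec vectors, the smoothed IDF formula is one common variant chosen for the example, and all function names are made up here.

```python
import math
from collections import Counter
from typing import Dict, List

Vector = List[float]


def tfidf_weights(pages: List[List[str]]) -> List[Dict[str, float]]:
    """Per-page TF-IDF weights, using a smoothed IDF (an assumed variant)."""
    n_pages = len(pages)
    doc_freq = Counter(word for page in pages for word in set(page))
    weights = []
    for page in pages:
        counts = Counter(page)
        weights.append({
            word: (count / len(page))
            * (math.log((1 + n_pages) / (1 + doc_freq[word])) + 1.0)
            for word, count in counts.items()
        })
    return weights


def page_vector(weights: Dict[str, float],
                word_vectors: Dict[str, Vector]) -> Vector:
    """Weighted average of the word vectors; unknown words are skipped."""
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word, w in weights.items():
        if word not in word_vectors:
            continue
        for i, x in enumerate(word_vectors[word]):
            total[i] += w * x
        weight_sum += w
    return [x / weight_sum for x in total] if weight_sum > 0 else total


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def rank_pages(question_vec: Vector, page_vecs: List[Vector]) -> List[int]:
    """1-based page numbers in descending order of cosine similarity."""
    sims = [(cosine(question_vec, v), i + 1) for i, v in enumerate(page_vecs)]
    return [page for _, page in sorted(sims, key=lambda s: -s[0])]


# Hypothetical word vectors and page texts (already tokenized).
word_vectors = {
    "password": [1.0, 0.0], "hash": [0.9, 0.1],
    "firewall": [0.0, 1.0], "network": [0.1, 0.9],
}
pages = [["password", "hash", "password"], ["firewall", "network"]]
page_vecs = [page_vector(w, word_vectors) for w in tfidf_weights(pages)]
question_vec = page_vector({"password": 1.0}, word_vectors)
print(rank_pages(question_vec, page_vecs))
```

In this toy setting, the page about passwords is ranked above the page about firewalls for a question that mentions only "password".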

C. Evaluation Experiments on Effectiveness and Usefulness
1) Procedure of Experiments: In this experiment, we asked the participants to use the system continuously for about one month to learn six materials, by following this procedure: 1) read the designated material for at least 30 min; 2) take the corresponding quiz consisting of six questions; 3) three or four days later, review the contents of the previous material before reading the next material. Table II lists the titles of the learning materials used in the experiment. The contents of the six learning materials are related to the fundamentals of information science, and they are also used in actual lectures at the university.
In addition, to measure the effect of using the proposed system, we conducted a controlled experiment by dividing the participants into two groups: one group that used the review dashboard and the other group that did not. Specifically, 28 first-year and second-year students at Kyushu University were divided into two groups of 14 students each. Participants in one group were asked to use the review dashboard at least once during their review, whereas participants in the other group were asked to review without being shown the review dashboard. To avoid bias between the groups in terms of prior knowledge of information science, the students were asked to solve 25 questions on information science beforehand, and they were divided so that the average scores of the two groups were approximately equal.
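The text does not state how the participants were assigned so that the average pretest scores were approximately equal. One simple way to obtain such a split is sketched below, purely as an assumption for illustration: participants are sorted by pretest score and assigned in an A-B-B-A pattern. The function name `split_balanced` and the scores are hypothetical.

```python
from typing import Dict, List, Tuple


def split_balanced(scores: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Split participants into two groups with similar mean pretest scores.

    Assumed heuristic (not taken from the study): sort by score in
    descending order and assign in repeating A-B-B-A order.
    """
    ordered = sorted(scores, key=lambda s: (-scores[s], s))
    group_a: List[str] = []
    group_b: List[str] = []
    for i, student in enumerate(ordered):
        # Positions 0 and 3 of every block of four go to group A.
        if i % 4 in (0, 3):
            group_a.append(student)
        else:
            group_b.append(student)
    return group_a, group_b


# Hypothetical pretest scores (number of correct answers out of 25).
scores = {"s1": 20, "s2": 18, "s3": 17, "s4": 15}
group_a, group_b = split_balanced(scores)
print(group_a, group_b)
```

With a number of participants that is a multiple of four, such as the 28 in this experiment, this pattern yields two groups of equal size.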
After completing the learning cycle for the six materials, the participants reviewed the six materials for 60 min and then took a final test consisting of 36 questions covering the contents of all the materials. Participants in the group that had been using the review dashboard continued to use it during this review. By comparing the results of the final test, we measured the effect of using the proposed system. Table III summarizes information on the evaluation experiments.
After the test, the participants answered a questionnaire. In particular, the group that used the review dashboard answered a detailed questionnaire about the usefulness of the review dashboard. The following is a summary of the questionnaire administered to the group of 14 students who used the review dashboard:
(i) overall evaluation of the review dashboard: 5 evaluation factors, 27 items in total;
(ii) questions about continued use of the review dashboard after the experiment;
(iii) evaluation of each content of feedback in the review dashboard;
(iv) evaluation of the recommended contents for review (specific pages of the materials and webpages).
In (i) of the questionnaire, an overall evaluation of the review dashboard was conducted. This evaluation was divided into five evaluation factors: visual appeal, usability, degree of understanding, perceived usefulness, and behavioral change. Table IV shows the details of the five evaluation factors and the number of items for each.
These 27 items were rated on a five-point scale ranging from 1 to 5 (1: Disagree, 2: Slightly disagree, 3: Neither, 4: Slightly agree, and 5: Agree). Table V tabulates the specific contents and the means and standard deviations of the responses of the 14 respondents for these 27 items.
2) Results of the Questionnaire: In this section, we discuss the results of the questionnaire in terms of the five evaluation factors. Table VI tabulates the means and standard deviations for the values of the items corresponding to the five evaluation factors.
The overall evaluation results for the five factors were slightly higher than the midpoint of the scale, but as can be seen from the standard deviations, the result for each factor tended to be dispersed among the users. In particular, the variance was rather large for the behavioral change factor, which covers increasing motivation to learn, making plans for learning, and managing learning activities using the review dashboard. In contrast, visual appeal and degree of understanding received relatively high ratings. From the above and the results of individual items, the following can be said: 1) the feedback information in the review dashboard is relatively easy to understand; 2) some users found the review dashboard useful, whereas others did not feel it was necessary; 3) the review dashboard has a positive effect on the learning motivation of some users; 4) to improve the overall usability, the user interface needs to be improved.
In (ii), the users were asked to rate, on a five-point scale, whether they would be willing to use the review dashboard after the experiment in a learning style where they take a quiz after learning from digital learning materials. The results of (ii) are shown in Fig. 9. The combined rating of "Agree" and "Slightly agree" is 57%. This means that more than half of the learners found the review dashboard useful in their learning process.
Next, in (iii), the usefulness of each of the following feedback contents of the review dashboard, described in Section V, was rated: 1) summary of the quiz results; 2) summary of learning time (i.e., browsing time for learning materials); 3) recommended materials for review. In addition to the factors of perceived usefulness and degree of understanding, conformity was proposed in [56] as a perspective for evaluating the feedback content on learning dashboards. Conformity is an evaluation factor that indicates the degree of conformity between the information presented by the system and the perception of the learner's own activities. Because this factor is not applicable to the evaluation of our review dashboard, only the other two factors were included in the questionnaire for each of the three feedback contents; that is, there were six items in (iii). Table VII tabulates the items and results of (iii). In terms of the degree of understanding, none of the feedback contents was difficult to understand. However, there was a large difference between the types of feedback in terms of usefulness. The mean for feedback on "quiz results" was 4.21, which was very high, whereas the means for "learning time" and "review materials" were 3.21 and 3.36, respectively, which were only slightly higher than the midpoint of the scale, with large standard deviations.
The questionnaire (iv) consists of six items, Q1-6. In (iv), we asked the users about the recommended review materials to evaluate the usefulness of their contents. In the proposed system, the following contents were recommended as review materials: 1) the top four pages related to questions for which the learner submitted an incorrect answer in the quiz; 2) the pages where the learner clicked the "not understand" button; 3) the top four pages on which the "not understand" button was clicked most often by learners who used the same material; 4) the webpages related to the recommended pages in the learning material. Table VIII tabulates the evaluation items in (iv) for the abovementioned recommendations.
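The first three types of recommended content listed above can be assembled from a question-to-page ranking and a log of "not understand" clicks, as in the following sketch. The data layout, the function names `top_clicked_pages` and `recommend`, and the choice to count click events rather than distinct learners are assumptions for illustration and are not taken from the system itself; the recommendation of related webpages is omitted.

```python
from collections import Counter
from typing import Dict, List, Tuple


def top_clicked_pages(click_log: List[Tuple[str, int]], n: int = 4) -> List[int]:
    """Pages with the most "not understand" clicks across learners.

    click_log holds (learner_id, page) events. Ties are broken by page
    number so that the result is deterministic (an assumed rule).
    """
    counts = Counter(page for _, page in click_log)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [page for page, _ in ordered[:n]]


def recommend(wrong_questions: List[str],
              rankings: Dict[str, List[int]],
              own_clicks: List[int],
              click_log: List[Tuple[str, int]],
              n: int = 4) -> Dict[str, object]:
    """Collect the page recommendations for one learner and one material."""
    return {
        # 1) top-n related pages for each incorrectly answered question
        "incorrect_answers": {q: rankings[q][:n] for q in wrong_questions},
        # 2) pages on which this learner clicked "not understand"
        "own_not_understand": sorted(set(own_clicks)),
        # 3) pages most often marked "not understand" by learners overall
        "others_not_understand": top_clicked_pages(click_log, n),
    }


# Hypothetical data for one learning material.
rankings = {"q1": [10, 11, 3, 1, 7], "q2": [2, 5, 7, 4, 9]}
click_log = [("u1", 3), ("u2", 3), ("u3", 5), ("u1", 8),
             ("u2", 8), ("u3", 8), ("u4", 1), ("u4", 2)]
print(recommend(["q2"], rankings, [5, 5, 2], click_log))
```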
The results for Q1 and Q2 are shown in Fig. 10. The result of Q1 shows that the presentation of the pages related to questions that users answered incorrectly was useful for most users, with a total of 64% of the respondents choosing "Useful" or "Slightly useful." The results of Q2 showed that 66% of the responses were "Mostly appropriate" or "Somewhat appropriate." Together with the results of the verification of matching accuracy between the questions of the quiz and the pages of materials in Section VI-B, the evaluation by the actual users of the system indicates that the matching is appropriate.
The results for Q3 and Q4 are shown in Fig. 11. For Q3, on the recommendation of the pages on which the "not understand" button was clicked, the number of evaluators decreased by 43% because this recommendation was not displayed to users who had never clicked the button. According to the results of Q3, among the users who used the recommended information, slightly more found it useful than did not. For Q4, on the recommendation of "pages that other learners had found difficult," the number of users who found the information useful was also slightly higher.
In Q5 and Q6, we asked questions regarding the webpages recommended by the system; the results are shown in Fig. 12. The results of Q5 show that 64% of the users accessed the recommended webpages. The results of Q6 indicate that 45% of the users who accessed the recommended webpages answered that the webpages were slightly helpful for their learning.
In the questionnaire for the participants who used the review dashboard, we also asked, as a free-description item, for what types of courses they thought the review dashboard would be useful, and we received the following responses: 1) online and on-demand courses, where the system can be used effectively because learner behavior can be tracked; 2) courses that require learners to acquire a lot of new knowledge; 3) courses that require some memorization of knowledge. In the same way, we asked the users what they wanted to improve on the review dashboard and what other information they wanted. After excluding the comments on the user interface, the following comments were obtained.
1) I need explanations for the quiz.
2) It would be easier to understand if there were markers on the parts of the recommended pages related to the quiz.
Preparing explanations for the questions is a heavy burden for teachers. Therefore, these comments suggest the need to compensate by developing a highly accurate method of guiding learners not only to related pages, but also to other webpages and materials that can lead to explanations of the quiz.
3) Analysis of Effectiveness in Learning: Recall the procedure of the experiment described in Section VI-C1. If the use of the review dashboard has any effect on learning efficiency or knowledge acquisition, a significant difference is expected between the two groups in the scores of the final test. Therefore, we analyzed the effect of using the proposed system by comparing, between the two groups, the change from the rate of correct answers in the quizzes given after the learning cycles of the six materials to that in the final test given after the completion of all cycles.
The average rates of correct answers for the quiz and the final test for the 14 students in each of the two groups are shown in Fig. 13. For the quizzes after studying only the material, the correct answer rates of the two groups were exactly the same at 57.9%. In contrast, in the final test, the correct answer rates were 66.9% for the group that used the review dashboard and 52.8% for the group that did not use the system, which shows that the group that used the review dashboard had a higher rate of correct answers. Fig. 14 shows the distribution of the combination of the rates of correct answers in the quiz and the rate of correct answers in the final test for each of the 28 participants. The blue dots represent the participants who used the review dashboard, and the orange dots represent the other participants. The graph shows that the correct answer rate of participants in the review dashboard usage group tended to improve in the final test.
We also analyzed the rate of change in the percentage of correct answers from the quiz to the final test. The rate of change is the ratio of the percentage of correct answers in the final test to that in the quiz; for example, it is 1.5 when the percentage of correct answers is 50% in the quiz and 75% in the final test. When considering whether learners have acquired knowledge by reviewing, it is reasonable to use the rate of change, which compares the percentage of correct answers in the quiz before reviewing with that in the final test after reviewing. Fig. 15 shows the distribution of the rate of change of the percentage of correct answers. The blue dots in the graph indicate the group that used the review dashboard, and the orange dots indicate the group that did not use the dashboard. The graph shows that the rate of change tends to be higher in the group that used the review dashboard.
In addition, a t-test was used to determine whether there was a significant difference between the mean rates of change of the two groups. Table IX summarizes the means and unbiased variances of the rate of change of the percentage of correct answers. The mean rate of change for the group that used the review dashboard was 1.16, whereas that for the group that did not use the dashboard was 0.90. Based on the results given in Table IX, the t-test yielded p = 0.00236 < 0.01, rejecting the null hypothesis that there is no difference in the mean rate of change between the two groups at the 1% level of significance. This result indicates that there is a significant difference between the mean rates of change of the two groups.
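The rate of change and the comparison of group means can be sketched as follows. The text does not specify which variant of the t-test was used, so the sketch assumes Welch's unequal-variance t-test and computes only the test statistic and the degrees of freedom; obtaining a p-value additionally requires the t distribution, which is left out here. The function names and the per-participant values are hypothetical and are not the experimental data.

```python
import math
from statistics import mean, variance
from typing import List, Tuple


def rate_of_change(quiz_rate: float, final_rate: float) -> float:
    """Ratio of the correct-answer rate in the final test to that in the quiz."""
    return final_rate / quiz_rate


def welch_t(a: List[float], b: List[float]) -> Tuple[float, float]:
    """Welch's t statistic and degrees of freedom for two samples.

    statistics.variance returns the unbiased (n - 1) sample variance.
    """
    var_a, var_b = variance(a), variance(b)
    n_a, n_b = len(a), len(b)
    se2 = var_a / n_a + var_b / n_b
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((var_a / n_a) ** 2 / (n_a - 1)
                     + (var_b / n_b) ** 2 / (n_b - 1))
    return t, df


# Example from the text: 50% in the quiz and 75% in the final test.
print(rate_of_change(0.50, 0.75))

# Hypothetical rates of change for a few participants in each group.
used_dashboard = [1.0, 1.2, 1.4]
no_dashboard = [0.8, 0.9, 1.0]
print(welch_t(used_dashboard, no_dashboard))
```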

A. General Discussion
In this study, we developed a review dashboard system that recommends individual review contents based on data. Whereas most recommender systems focus on recommending the learning contents to be addressed, our system recommends the pages of the material that have been studied once but are not well understood and encourages reviewing of the material.
The results of the experiments imply that the group that used the system showed higher learning performance than the group that did not use the system. In fact, in the group that used the system, the average rate of correct answers on the final test, taken after using the system, was higher than that on the quizzes taken before using the system. This change was not seen in the group that did not use the system. Furthermore, the rate of change in the percentage of correct answers from the quiz to the final test was significantly higher in the group using the review dashboard, which indicates the effect of reviewing using the developed dashboard.
The questionnaire evaluation of the review dashboard showed that at least half of the participants found the system useful, although the evaluation values often varied among participants. Therefore, these results suggest that the feedback of the review information by the proposed method is useful for improving the learning effect in the digital learning environment used in this research.
For the function that recommends the pages of learning materials related to quizzes, the results of the experiment support the claim that the recommendation is appropriate for the learners to the extent that it is practical.
Using the ground truth data generated by the instructor, we evaluated the accuracy of the candidate methods and showed that the PV-DBOW model of Doc2Vec can extract about 68% of the pages within the four pages presented in the review dashboard. The pages recommended as relevant to the questions that the learners got wrong were evaluated as appropriate by the participants in the questionnaire Q2. In addition, for each type of the recommended learning contents, more than half of the learners found the content useful in their evaluation of the questionnaires Q1, Q3, Q4, and Q6. These results indicate that the proposed method could recommend pages with practical accuracy.
As a supplement, from a technical perspective, we discuss why the PV-DBOW model was chosen for the method that recommends the pages of learning materials. One of the reasons why the PV-DBOW model is more accurate than the BERT method is that the digital learning materials used in this study are in the form of slides, and therefore, the extracted text does not consist of well-formed sentences, unlike ordinary documents. In such a case, contextual information may carry little meaning or may even have a negative effect. Therefore, it is considered that the PV-DBOW model may have been able to convert texts to vectors more appropriately in this task than the BERT and PV-DM models, which take the order of words into account.

B. Limitation and Future Direction
There are some limitations to our study that may need attention in future research.
First, we have not verified whether the user interface and the combination of feedback information presented in this system are optimal. We received some comments in the questionnaire that the system was slightly difficult to use; for example, it took some time to display feedback information. Because the usability of the system directly affects how useful it is for learning, improving the usability may further improve learning efficiency. As for the feedback information, adding useful information, such as the percentage of correct answers for each question in the quiz, by utilizing the knowledge obtained through this study has the potential to be beneficial to learning.
Next, the proposed system assumes that a quiz is conducted after learning a material. This causes the problem that the scope of application is limited by this style. One of the challenging issues is to extend the applicable targets of the proposed system. It would be possible to extend the range of applications if the pages to be reviewed can be identified only from the information obtained when reading the learning material in BookRoll. To achieve this goal, it is necessary to develop a method that can analyze the level of understanding of each individual from the learning logs of page transitions and other functions, such as markers or memos.
In the current system, webpages are recommended for supplementing learners' understanding of the learning contents and for learning more broadly. However, in the experiment, the usage rate was not high, and the usefulness of the recommended webpages was limited. For future improvement, it is necessary to refine the recommended webpages to be more suitable for the purpose and also to consider recommending other open learning contents that are useful for review from various perspectives.
TABLE IX: MEANS AND UNBIASED VARIANCES OF THE RATE OF CHANGE OF THE PERCENTAGE OF CORRECT ANSWERS OF THE TWO GROUPS, WHERE GROUP A USED THE SYSTEM AND GROUP B DID NOT USE THE SYSTEM
Finally, the evaluation experiment consisted of six learning cycles and one assessment with 28 learners. More continuous use of the system and multiple assessments may allow for longer term observation of changes in learning performance. In addition, as more learners use the system, it will be possible to conduct detailed analysis, such as the relationship between the attributes of the learners and the use of the system.

VIII. CONCLUSION
In this study, we proposed an integrated system to support learners' reviews. In the proposed system, the review dashboard is used to recommend review contents that are adaptive to the individual learner's level of understanding and to present other information that is useful for review. The pages of the digital learning materials that are estimated to be poorly understood by each learner and the webpages related to those pages are recommended as review contents.
As a method for estimating pages with insufficient understanding, we considered extracting the pages related to the questions that were answered incorrectly. For this purpose, the accuracy of matching each question with the pages of the learning materials must be guaranteed, so we examined this accuracy. The results show that our method can extract the appropriate related pages for about 68% of the questions within the four pages presented in the review dashboard. In addition, the results of the user questionnaire show that the pages related to the questions on which the users made mistakes were considered to be appropriate.
We also conducted an experiment to verify the usefulness of the system and its effect on learning using a review dashboard. In this experiment, the overall evaluation of the review dashboard or the evaluation of each type of feedback indicated that at least half of the participants found it useful for most types of feedback. In addition, the rate of change in quiz scores was significantly higher in the group using the review dashboard, which indicates that reviewing using the review dashboard can improve learning. From these results, it can be said that the review support system proposed in this study functioned effectively in the experiment.