Concept–Effect Relationship Weighting Based on Frequency of Concept’s Co-Occurrence for Developing Personalized Remedial Learning Path

A personalized learning path is advantageous for students who need remedial assistance because they struggle to master concepts and perform poorly on exams. Personalized learning paths produced from conceptual maps have proven highly effective. Nevertheless, when personalized learning paths are established using conceptual maps, the manual assignment of weights that express the degree of relationship between test items and concepts is time-consuming and ineffective. To overcome the problems of manual weighting, this study proposes an auto-generated concept relationship weighting method based on how often primary concepts and supporting concepts appear together in test items. Moreover, we build a system that generates a personalized remedial learning path for students using the proposed weighting method and a conceptual map. To evaluate the effectiveness of the proposed approach, we carried out an experiment with high school students. The experimental results show that the generated personalized remedial learning path can significantly enhance the learning achievement of underachieving students. In addition, the new auto-generated weighting method eliminated the need for experts to assign weights and resolved conflicting weight values.


I. INTRODUCTION
With the advancement of computer technology, many schools have adopted e-learning, which offers a practical approach to enhancing learning outcomes. E-learning has proven successful in teaching various subjects, such as science, mathematics, social sciences, the humanities, and language [1], [2], [3]. Personalized learning support in an e-learning environment increases students' engagement and learning efficacy [4], [5]. Furthermore, integrating personalized learning support features and adopting the mastery learning approach in an e-learning environment has been shown to effectively improve students' learning performance [6].
The associate editor coordinating the review of this manuscript and approving it for publication was Dominik Strzalka.
To enhance personalized learning support within an e-learning environment, several researchers have developed systems that diagnose learning difficulties, determine the appropriate learning path, and produce relevant additional learning materials tailored to each student's needs. Initially, Hwang et al. [7] proposed an approach to providing appropriate learning material for students by individually examining and analyzing their affective and cognitive statuses using fuzzy inference. The fuzzy inference approach was employed to establish an adaptive learning system, which was found to enhance students' learning achievements and support underachieving students in completing their assignments. Later, Gao et al. [8] proposed a method for generating a learning path based on modeling domain knowledge and learning resources using a knowledge graph. They applied the TextRank algorithm to extract keywords from courseware and construct the knowledge graph, and employed the Ant Colony Optimization algorithm to generate the learning path. The results showed a high degree of similarity between the generated learning paths and those produced by experts. Subsequently, Subiyantoro et al. [9] proposed a method for establishing a learning path by considering a learning object ontology and accounting for students' cognitive abilities. The learning path was generated with the Hybrid Particle Swarm Optimization (HPSO) algorithm. Experimental results using the HPSO algorithm demonstrated the effectiveness of the proposed method in recommending appropriate learning paths for several groups of students with varying cognitive abilities.
Besides the studies mentioned earlier, other research has investigated different approaches for establishing appropriate learning paths for students. One such approach involves the use of conceptual maps that show relationships between subject concepts. The Concept-Effect Relationship (CER) model, initially introduced by Hwang [10], is a specific type of conceptual map. The CER approach recognizes that, when encountering a new concept, students may need prior knowledge of related concepts and an understanding of the relationships between them. In a CER model, all related subject concepts are hierarchically organized and interconnected, signifying the learning sequence. If a student has difficulty mastering a concept during the learning process, it may be because of a lack of understanding of the prerequisite concepts. The CER model effectively identifies students' learning difficulties [11]. Furthermore, the CER model enables the provision of personalized feedback to students and assists teachers in identifying misconceptions [12]. Consequently, the use of the CER model in establishing learning paths has attracted significant research attention [2], [13].
When generating a learning path using the CER model, it is important to identify and define the concepts and their relationships beforehand. In addition, weight values, conventionally assigned by domain experts, specify the degree of relationship between concepts within a test item. These weight values are then used to calculate the error ratio of students when they answer test items related to each concept. This expert-driven approach has several disadvantages. First, conflicts could arise among experts regarding the weight values assigned to concepts in test items. Second, when dealing with a large question bank containing numerous test items, the process becomes time-consuming, and experts can experience fatigue, leading to lower-quality weight values. Third, the introduction of new test items into the question bank requires experts to allocate weight values to the relationships between the new items and concepts, resulting in additional time consumption. Hence, an alternative approach, such as an auto-generated method, is required to ensure the assignment of high-quality weight values between test items and concepts. This auto-generated approach would allow for the calculation of students' error ratios for each concept, enabling the identification of concepts that necessitate further learning.
This research began with the development of a framework called the Personalized Scaffolding Adaptive Learning Management System [14]. In subsequent preliminary research, the study investigated the presence of hidden concepts within test items, which can provide insights into the relationships between various concepts [15]. We employed a classification method to categorize the multiple concepts associated with each test item [16]. Lastly, the frequency of co-occurrence among concepts was used to develop a novel weighting approach for generating personalized remedial learning paths (PRLPs).
Instead of relying solely on experts, the weight value of items can be determined from the occurrence of concepts within a group of test items in a question bank. Computational methods can assign weights to each test item by focusing on the characteristics of a question that involves multiple concepts in its solution, considering the simultaneous occurrence of concepts. This study proposes an improved way to use conceptual maps: it computes the weights and the degree of relationship between test items and concepts from how often concepts appear together. Notably, the co-occurrence of concepts serves as the differentiating factor, which designates them as primary or supporting concepts within the test items. This research aims to address the following research questions: 1) What processes are involved in using the frequency of concept co-occurrences to generate a PRLP? 2) What are the benefits of implementing the frequency of concept co-occurrence in the development of a PRLP?
The experimental results suggest that the proposed methodology can contribute to the development of science and computer technology, particularly in employing Natural Language Processing (NLP) solutions for processing and understanding text data. Our research addresses a typical problem in the prevailing process by replacing the role of a single expert or multiple experts, and their protracted deliberations, with a machine learning algorithm that identifies weights for generating the CER used as a reference for designing student learning paths, particularly learning paths for remedial students. One issue we have found in previous studies on CER generation is that the analysis conducted by domain experts tends to be rigid and subjective, leading to debates when determining weights. Another issue concerns the experts' endurance and the time consumed when they have to analyze a large number of test items in a question bank that can be expanded at any time; the question bank used in our research contained more than 500 items. The consequence is inconsistent concept weighting, which makes it difficult to generate a CER as a reference for designing learning paths for remedial students.
The rest of the paper is organized as follows: Section II provides an overview of the literature on personalized learning paths using the CER. Section III provides a concise explanation of the weighted CER approach using the frequency of concept co-occurrences to establish a PRLP. Section IV presents the experimental results obtained and analyzes the findings. Section V highlights several research discussions, and finally, Section VI concludes by summarizing the main conclusions drawn from the study and offering suggestions for future research directions.

II. RELATED WORKS
Personalized learning path recommendation is an active area of research. It aims to help students attain their learning objectives by recommending learning paths in the form of a sequence of learning objects that are appropriate to them. Through technological advances, personalized and adaptive learning path research is able to overcome the drawbacks of a "one-size-fits-all" approach [17]. In learning path recommendation, the sequence of learning objects is essential for students. Several studies have introduced a graph-based system in which nodes represent the learning content and directed connections depict the relationships between nodes. Research started with a theory-based graph that can produce a one-way learning path by connecting all the necessary learning objects through their relationships [18].
In comparison, other studies have introduced a learning path recommendation method utilizing predetermined learning scenarios because students require various learning paths in different scenarios [19]. Research on learning path recommendation has also built on knowledge graph construction by proposing a learning path generation method that requires the start and end nodes to be determined through knowledge maps. This research recommends that a learning path be established by considering the domain knowledge and cognitive structure of students [20]. Knowledge graphs are used to construct learning paths because they can avoid ambiguity in describing learning content. Shi et al. [21] developed a learning path recommendation model that uses a multidimensional knowledge graph framework to generate and recommend customized learning paths based on students' target learning objectives. Furthermore, their study uses the graph as the basis for generating a path that meets the characteristics and requirements of the user. Another adaptive recommendation system proposes predicting suitable educational paths for college preparatory years.
Researchers further developed graph-based learning path research by considering test items to find relationships between concepts, known as CER. In that study, experts give a weight to each test item based on the closeness of the test item's relationship with the existing concept in order to build the CER [11]. The CER weighting process has been further developed by other researchers [12], [22], who proposed a new procedure that integrates several experts' opinions on the concept-test item relationship, based on majority density, to enhance the quality of the CER, which is then used for personalized feedback. This technique is practical for reducing inconsistencies in the weighting criteria of multiple experts and enhances the overall learning diagnostic procedure for developing learning path systems. The weighting process is fundamental, and it poses a serious problem, because the results of this process are decisive in learning path systems for providing learning recommendations and analyzing learning problems tailored to student needs. Considering that the weight values act as inputs into learning path systems, poor-quality weight values result in poor-quality feedback. Therefore, researchers are currently developing further studies on CER weighting. By leveraging the decisions made by several domain experts, Wanichsan et al. [12] have managed to provide a solution to the weaknesses of previous research using a rule-based approach.
In some cases, if there is a conflict in the weight of a test item, experts may need to be asked to reconsider their assessment, even if the majority decision has been used. This affects the quality of weighting, which consequently affects the process of building a learning path. Therefore, we must establish a method to determine the weight of test items that can solve the problem of weighting conflicts. Such a method is necessary because of the function that the weight serves for each test item: quality weighting is needed to produce a quality learning path. Quality weighting is a challenge, especially when building a learning path for remedial students. Remedial students are those who need special assistance [23], [24]. They require learning paths designed to prepare them for retesting, and these learning paths should be able to diagnose which concepts they need to relearn.
In previous research on CER, one or several experts were needed to determine the weight value between the concept and test items [11], [12], [22], with several potential problems including inconsistency of weight values, time-consuming conflict resolution among experts, heavy reliance on domain experts, and difficulty scaling up the number of concepts and test items. In this case, the opportunity to develop and deploy systems that can learn, recognize patterns, and make decisions with minimal human intervention is exciting [25]. The extensive studies on CER indicate the need to establish a method of weighting the interrelationships between concepts in a test item that avoids potential conflicts and produces quality weighting. This method will help build a learning path for remedial students to achieve learning mastery. By considering the frequency of the concept's appearance on each test item in the question bank, it can be assumed that the more often a concept appears, the more important it is for students to master. This idea produced a new method that assigns a numerical weight to each test item in the question bank to measure the significance of the relationship between concepts according to the frequency of co-occurrence of these concepts in a set of test items.

III. DEVELOPMENT OF PERSONALIZED REMEDIAL LEARNING PATHS METHOD
The process required to establish a PRLP differs from the process of building a learning path in general. A PRLP is dedicated to helping remedial students achieve mastery. The use of CER in building a personalized learning path is advantageous in the sense that it can explicitly show the relationships between concepts. Moreover, the characteristics of CER are significant in this research for developing a mechanism that can trace back any concepts that remedial students have not yet mastered. The preliminary related study on the learning path for remedial students reported that the test items most likely contain some "hidden" concepts. Identifying those hidden concepts in the test items may facilitate the establishment of a remedial learning path that uses the co-occurrence of such concepts.
The present research identifies the "hidden" concepts in a test item as supporting concepts, whereas the explicit concept in the test item is called the primary concept. Both primary and supporting concepts in the test item are central to the co-occurrence approach for auto-generating concept weights. The use of the concept's co-occurrence in the test items to build a remedial learning path is described in the following paragraphs.
To establish a remedial learning path, one must have the weighted-directed graph of concepts and a remedial student's test results, in which each test item is mapped to its concepts. To obtain the weight of each node in that graph, the concept-rank algorithm is used, based on the frequency of occurrences of the concepts obtained from the labeled test items. These labeled test items, taken from the question bank, consist of the question data and the test item's primary and supporting concepts, which are afterward organized into a concept's co-occurrence matrix and stored in a database.
Later, a set of formative test items for each student is produced from the multilabel question bank. For each student who fails the test, personal profiling is performed on the sets of test items to generate the personalized concept matrix. The personal profile is then mapped onto the concept's weighted-directed graph, together with the test items the student failed to answer. A personalized test item relationship table (P-TIRT) establishes a reference for calculating the personal error ratio. Then, by mapping the error ratio to the concept index, which is produced using the concept-rank algorithm, a PRLP is generated that helps the remedial student learn the concepts according to their needs. Each of these processes is explained in detail in the subsections below. As an outline, the method is shown in the flowchart in Figure 1.

A. GENERATING CONCEPT EFFECT RELATIONSHIP BASED ON THE FREQUENCY OF THE CONCEPT'S CO-OCCURRENCE
Based on the frequency of the concept's co-occurrence, a CER is generated through several stages. The output produced at this stage is a CER in the form of a weighted-directed graph. The weighting of nodes and edges is based on how frequently the concepts are found across all the test items in the question database. The following subsections describe the series of processes.

1) CONSTRUCTING FREQUENCY OF CONCEPT'S CO-OCCURRENCE MATRIX
During a face-to-face learning session in the class, teachers usually employ a concept map to assist them in delivering the materials in the order the concept map prescribes. When students have to work on test items in a formative test, they often encounter difficulties because the concepts appear to be separated from one another. Understanding more than one concept is often required to complete a test item. In science education, the concepts contained in test items can exhibit relationships among them, which makes them an interesting subject to explore. In this research, we develop ideas according to the appearance of concepts in a test item to build a PRLP. The node's weight, which represents the concept's weight, can be determined by calculating the concept's appearance frequency or the concept's co-occurrence frequency on all test items in the question bank. The edge's weight, which represents the weight of the relationship between concepts, is calculated by normalizing the concept's appearance frequency, adjusted to the index generated for each node, so that a concept's weighted-directed graph can be formed.
To generate a concept's weighted-directed graph that refers to the concept's co-occurrence, the concepts in a test item must be considered. In a test item, there is always one primary concept, and the problem can contain supporting concepts that must be mastered by students who solve the test item. Exploring the co-occurrence of these concepts yields the concept's co-occurrence matrix. Adapting the Transformer BERT architecture, the concept's co-occurrence matrix is auto-generated from the test items and then used for weighting the nodes and edges. When students are given a test item that targets a primary concept, they also need knowledge of other concepts, called supporting concepts.
The following examples of test items, along with their primary concepts and supporting concepts, illustrate the discussion in the previous paragraph:
1) Two blocks are tied together using a rope and are hung vertically from a frictionless and massless pulley. The gravitational acceleration is 10 m/s². If the mass of block A = 1 kg and block B = 4 kg, how much is the acceleration of that system (in m/s²)? (primary concept: acceleration; supporting concepts: Newton's second law of motion, action-reaction, tension, and weight)
2) Blocks A and B possess a mass of 2 kg and 6 kg, respectively. Block A is on top of the frictionless Block B, and Block B is on the frictionless horizontal surface. What is the tension in the rope for each block (in N) when Block B is slowly pulled by a force of 18 N? (primary concept: tension; supporting concepts: Newton's second law of motion, net force, acceleration, and weight)
This research was conducted for a formative test in the Physics subject with Dynamics as the main topic, consisting of 14 concepts, namely, force (C1), equilibrium (C2), net force (C3), Newton's first law of motion (C4), mass and inertia (C5), acceleration (C6), gravity (C7), weight (C8), Newton's second law of motion (C9), normal force (C10), friction (C11), tension force (C12), action-reaction (C13), and Newton's third law of motion (C14). The examples of test items indicate that a primary concept in one test item may become a supporting concept in another test item. Table 1 shows the co-occurrence matrix of all 14 concepts within the test items.

TABLE 1. Co-occurrence matrix between supporting concepts (rows) and primary concepts (columns).
Table 1 shows that, among the labeled test items in the question bank with C4 as the primary concept, one test item requires C1, two test items contain C3, two test items contain C5, and one test item contains C9 as supporting concepts. Figure 2 displays the relationship between nodes in the concept's co-occurrence matrix.
As a simple illustration from Table 1, the concept "Newton's first law of motion" (C4) is a primary concept with four supporting concepts, one of which, "force" (C1), is the supporting concept of one test item. In turn, "force" (C1) is the supporting concept of 15 test items whose primary concept is "net force" (C3), and "net force" (C3) is itself a supporting concept of the primary concept "mass and inertia" (C5). The concept's co-occurrence frequency as well as the relationships between the concepts can be presented as a graph. Figure 3 displays an example of a graph that depicts the relationships between the concepts "Newton's first law of motion" (C4), "mass and inertia" (C5), "net force" (C3), "force" (C1), and "Newton's second law of motion" (C9). The co-occurrence matrix counts the concept's co-occurrence frequency across the labeled test items in a question bank. This matrix is required to calculate the weights of both nodes and edges, the essential components of the CER.
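The counting step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the test items have already been labeled with a primary concept and a list of supporting concepts (the paper obtains these labels with a BERT-based classifier, which is not reproduced here), and the names `labeled_items` and `build_cooccurrence` are hypothetical. The two items mirror the two example questions quoted earlier, using the C1-C14 numbering.

```python
from collections import Counter

# Each labeled test item: (primary concept, supporting concepts).
# These two entries mirror the two example items quoted in the text:
# item 1 has primary "acceleration" (C6), item 2 has primary "tension" (C12).
labeled_items = [
    ("C6", ["C9", "C13", "C12", "C8"]),
    ("C12", ["C9", "C3", "C6", "C8"]),
]


def build_cooccurrence(items):
    """Count co-occurrences keyed by (supporting concept, primary concept).

    The key order follows Table 1: supporting concepts are rows,
    primary concepts are columns.
    """
    counts = Counter()
    for primary, supporting in items:
        # A supporting concept is counted once per test item.
        for concept in set(supporting):
            counts[(concept, primary)] += 1
    return counts


cooccurrence = build_cooccurrence(labeled_items)
print(cooccurrence[("C9", "C6")])   # C9 supports one item whose primary is C6
print(cooccurrence[("C6", "C12")])  # C6 is primary in item 1, supporting in item 2
```

A `Counter` is used so that pairs that never co-occur read as zero without being stored, which matches the sparse shape of a 14 x 14 matrix built from a few hundred items.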

2) CONSTRUCTING A CONCEPT'S INDEX
The learning process involves a specific order of acquiring concepts, in which advanced concepts require a prior understanding of fundamental concepts. This establishes a concept's learning order, with higher-ranked concepts being more foundational and the relationships between concepts strictly one-way. A weighted-directed graph is constructed using the method introduced by Siahaan et al. [26] to identify the learning order and one-way relationships. Consider a weighted-directed graph G = (V, E) with vertices V and edges E, where each vertex V_i represents a concept i. Let c_ij denote the frequency of occurrence of concept j in test items with the primary concept i, and let N represent the total number of concepts. Equation (1) calculates the weight w_ij of a directed edge from vertex V_i to vertex V_j.
Subsequently, the score S(V_i) is needed to rank the concepts. This method is based on the PageRank algorithm proposed by Brin and Page [27], and we set the value of d to 0.85 as per their recommendation. The score associated with the vertex i at iteration t is calculated using equation (2).
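Equation (2) is not reproduced in this text. Assuming it follows the weighted form of PageRank used by TextRank, which is consistent with the symbols defined around it, the update would read as follows. This is a reconstruction under that assumption, not the authors' verified formula.

```latex
S^{(t)}(V_i) = (1 - d) + d \sum_{V_j \in \mathrm{in}(V_i)}
  \frac{w_{ji}}{\sum_{V_k \in \mathrm{out}(V_j)} w_{jk}} \, S^{(t-1)}(V_j)
```

Under this form, a concept accumulates score from every primary concept that relies on it, in proportion to the share of that primary concept's outgoing edge weight.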
In equation (2), in(V_i) is the set of predecessor vertices of the vertex V_i, out(V_i) is the set of successor vertices of the vertex V_i, and d is a damping factor that represents the probability of jumping to a random vertex in the graph from a given vertex. The calculation continues until the error rate ε is smaller than 0.001. At iteration t, we calculate the error rate using equation (3). Table 2 presents the results of the score calculation for each concept, in which the score index indicates the concept's order based on its frequency of appearance in test items.
TABLE 2. Results of the score index calculation.
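The iterative scoring described above can be sketched as a weighted PageRank loop. Several points are assumptions rather than the paper's definitions: equations (1) to (3) are not available here, so the sketch takes raw co-occurrence counts as edge weights, uses the TextRank-style weighted update, and measures the error as the largest absolute score change between iterations. The function name `concept_rank` is hypothetical. The counts for C4 and for C3 to C1 come from the discussion of Table 1; the C5 to C3 count is invented for illustration.

```python
def concept_rank(edges, d=0.85, tol=1e-3, max_iter=100):
    """Score concepts with a weighted PageRank-style iteration.

    edges[i][j] is the weight of the directed edge from primary
    concept i to supporting concept j (assumed: raw co-occurrence count).
    """
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    # Total outgoing weight per node, used to normalize contributions.
    out_sum = {v: sum(edges.get(v, {}).values()) for v in nodes}
    scores = {v: 1.0 for v in nodes}
    for _ in range(max_iter):
        new_scores = {}
        for i in nodes:
            incoming = sum(
                edges[j][i] / out_sum[j] * scores[j]
                for j in nodes
                if i in edges.get(j, {}) and out_sum[j] > 0
            )
            new_scores[i] = (1 - d) + d * incoming
        # Assumed error measure: largest absolute change in any score.
        error = max(abs(new_scores[v] - scores[v]) for v in nodes)
        scores = new_scores
        if error < tol:
            break
    return scores


edges = {
    "C4": {"C1": 1, "C3": 2, "C5": 2, "C9": 1},  # counts quoted from Table 1
    "C3": {"C1": 15},                            # count quoted from Table 1
    "C5": {"C3": 1},                             # hypothetical count
}
scores = concept_rank(edges)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because edges point from a primary concept to its supporting concepts, score flows toward the supporting concepts, so the most foundational concepts end up with the highest score index.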

3) GENERATING CER BASED ON THE FREQUENCY OF THE CONCEPT'S CO-OCCURRENCE
When constructing a concept's index, the co-occurrence relationship from lower-ranked supporting concepts to higher-ranked primary concepts is disregarded, and the corresponding co-occurrence values in the matrix are set to 0. Next, we normalize the co-occurrence matrix values between 0 and 1 by dividing each value by the sum of its respective row. As shown in Table 3, this normalization process establishes a one-way relationship between the concepts, and the resulting normalized values indicate the degree of relationship between them. The concept's hierarchy changes based on the concept's score index, which results in an adjusted hierarchy.
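The zeroing and row normalization can be sketched as follows. This is an illustration under stated assumptions: rows are supporting concepts and columns are primary concepts, as in Table 1; a relation is dropped when the supporting concept has a lower score than the primary concept; and ties are kept. The function name `one_way_normalize`, the score values, and the reverse C3 row entry for C1 are all hypothetical.

```python
def one_way_normalize(cooc, score):
    """Zero out relations that run against the ranking, then row-normalize.

    cooc[s][p] is the count of supporting concept s (row) in test items
    whose primary concept is p (column).
    """
    normalized = {}
    for s, row in cooc.items():
        # Disregard relations from a lower-ranked supporting concept
        # to a higher-ranked primary concept.
        kept = {p: (c if score[s] >= score[p] else 0) for p, c in row.items()}
        total = sum(kept.values())
        normalized[s] = {p: (c / total if total else 0.0) for p, c in kept.items()}
    return normalized


# Hypothetical score index (higher means more foundational).
score = {"C1": 0.47, "C3": 0.36, "C5": 0.19, "C9": 0.17, "C4": 0.15}

cooc = {
    "C1": {"C4": 1, "C3": 15},           # counts quoted from Table 1
    "C3": {"C4": 2, "C5": 1, "C1": 3},   # C5 and C1 entries are hypothetical
}
degrees = one_way_normalize(cooc, score)
print(degrees["C3"])
```

In this example the C3 row entry for primary concept C1 is removed because C1 outranks C3, so the remaining two entries of that row share the full weight of 1.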

B. PERSONALIZED PROFILING OF REMEDIAL STUDENTS
In an online formative test, students may receive different test item sets with different supporting concepts. Producing the test items individually allows for personalized sets. Therefore, to identify the specific concepts that students struggle with in achieving mastery, a personalized concept profiling process is required. Figure 6 illustrates that even if students achieve the same score on a set of test items with the same primary concept, the supporting concepts vary. In Figure 6, as an example, Student A and Student B both obtained a score of 40, which indicates that they did not achieve mastery learning. Nevertheless, their concept distribution matrices vary, which highlights the unique areas of difficulty for each student. This personalized concept profiling process is important for developing a customized learning path that caters to the diverse needs of each individual student.

C. CONSTRUCTING PERSONALIZED REMEDIAL LEARNING PATH
The final step involves using the CER, the weighted-directed graph, and the results of profiling remedial students to construct a personal test item relationship table. This table serves as a reference for calculating the personal error ratio. The section below discusses the process of constructing a PRLP.

1) CONSTRUCTING PERSONALIZED TEST ITEM RELATIONSHIP TABLE AND ERROR RATIO
The following steps are taken to calculate the weight values that indicate the degree of relationships between test items and concepts: (1) Identify the primary concept and supporting concepts in each test item. (2) Define the one-way relations and degree of relationships between the concepts. (3) Calculate the weight W(V_i) of a concept i to a test item a using equations (4)-(8). In these equations, W(V_i)_a^main is the weight of the primary concept of test item a, W(V_i)_a^supp is the weight of a supporting concept of test item a, n_a is the total number of concepts in test item a, and n_max is the highest number of concepts in a test item within the question bank.
[Equations (4)-(8): not recoverable from the source text; only the trailing terms "+ w_pre − w_suc" of equations (7) and (8) survive.]
Figure 7 illustrates the results obtained by applying equations (4)-(8). These equations help map the test items that students have failed to master in the formative test to the established weighted-directed graph, resulting in a P-TIRT. The P-TIRT serves as a reference for calculating the personalized error ratio, which provides insights into the specific areas where students are struggling based on their performance in the test items.

2) GENERATING PERSONALIZED REMEDIAL LEARNING PATH
Once the weight values that indicate the degree of relationship between test items and concepts have been identified using the proposed auto-generated weighting method, these values are stored in a Test Item-Concept Relationship Table (TIRT). In the calculation of a student's error ratio for a concept C_i, the TIRT(Q_a, C_i) values that represent the degree of relationship between the test item Q_a and the concept C_i are used. The error ratio (ER) of a student for a concept C_i is obtained by dividing the sum of TIRT(Q_a, C_i) values for all incorrectly answered test items by the sum of TIRT(Q_a, C_i) values of all the test items. If the resulting error ratio satisfies ER(C_i) ≤ θ, where θ is a predetermined threshold, then the student is considered to have mastered the concept C_i. Conversely, if ER(C_i) > θ, then the student is considered to have failed to master the concept C_i. Based on the concepts that the student failed to master and the established weighted-directed graph model, a suitable learning path is generated for the student, in which the sequence of concepts to be learned is recommended, as shown in Figure 8.
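The error ratio and the thresholding step can be sketched as follows. The TIRT values, the score index, and the threshold θ = 0.5 are invented for illustration, since the paper's actual values are not given in this section. Ordering the failed concepts from the highest to the lowest score index (most foundational first) is likewise an assumption about how the path is sequenced. The names `error_ratios` and `remedial_path` are hypothetical.

```python
def error_ratios(tirt, wrong_items):
    """Compute ER(C_i) for every concept that appears in the TIRT.

    tirt[item][concept] is the degree of relationship between a test
    item and a concept; wrong_items is the set of failed items.
    """
    total, wrong = {}, {}
    for item, weights in tirt.items():
        for concept, w in weights.items():
            total[concept] = total.get(concept, 0.0) + w
            if item in wrong_items:
                wrong[concept] = wrong.get(concept, 0.0) + w
    return {c: wrong.get(c, 0.0) / t for c, t in total.items() if t > 0}


def remedial_path(er, score, theta):
    """List unmastered concepts (ER > theta), most foundational first."""
    failed = [c for c, ratio in er.items() if ratio > theta]
    return sorted(failed, key=lambda c: score[c], reverse=True)


# Hypothetical TIRT for three test items.
tirt = {
    "Q1": {"C6": 0.5, "C9": 0.3, "C8": 0.2},
    "Q2": {"C12": 0.6, "C9": 0.2, "C6": 0.2},
    "Q3": {"C8": 1.0},
}
# Hypothetical score index (higher means more foundational).
score = {"C8": 0.5, "C9": 0.4, "C6": 0.3, "C12": 0.2}

er = error_ratios(tirt, wrong_items={"Q1"})
path = remedial_path(er, score, theta=0.5)
print(er)
print(path)
```

A student who answers only Q1 incorrectly is flagged for C6 and C9 but not for C8, because most of the weight attached to C8 sits in an item the student answered correctly.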

IV. EXPERIMENTAL RESULTS AND ANALYSIS
In the experiment to validate the proposed approach, we collaborated with three experienced high school physics teachers who have over 15 years of teaching experience. The experiment focused on the topic of the dynamics of translational motion based on the Indonesian high school curriculum. This experiment aimed to gather empirical data that demonstrated how the concept's co-occurrence approach resolves conflicting cases and the level of importance assigned by experts to concept weights while also achieving significant improvements in the learning outcomes of remedial students.
The experimental results show that students were able to improve their learning outcomes by focusing on the concepts they had not fully mastered, using the PRLP as their guide. Despite the variation in the test items used in the remedial tests in comparison to the formative test, the experimental group showed a higher percentage of remedial students who achieved passing grades when compared to the control group. This indicates that the implementation of the PRLP positively impacted the students' performance and assisted them in achieving higher scores on their assessments.
One interesting finding is that even the lowest score among the remedial students in the experimental group was above 20, and some students achieved scores as high as 95. Subsection D of this section provides the details and comprehensive data. The multistage process relies on the frequency of co-occurrence of concepts to generate a CER. This process generates a weighted-directed graph that represents the relationships between concepts. We determine the weights assigned to nodes and edges in the graph based on the frequency of concept occurrences in the test items from the entire question database.

A. EXPERIMENT SCENARIO
The participants of this experiment were tenth-grade students from a high school in Indonesia who were studying the topic of the dynamics of translational motion in their physics class. We divided them into two groups, namely, the experimental group and the control group. The average age of the students was 15 years, and there were a total of 121 participants. The only difference between the groups was that the experimental group received personalized learning recommendations in the form of personalized learning paths, whereas the control group did not receive any recommendations.
All participants were required to complete an online formative test within a time limit of 30 minutes.The test consisted of 20 items, including short-answer and multiplechoice questions, covering the 14 concepts related to the topic of the dynamics of translational motion in physics.The objectives of the formative test were to assess the students' mastery level of the taught concepts and to determine whether they passed the physics subject.Additionally, the test results ensured that the experimental and control groups had similar levels of prerequisite knowledge and identified any learning challenges they may have encountered.We set the minimum passing grade for both tests at 70.
Students who scored below the minimum passing grade in both the experimental and control groups were required to take an online remedial test.The remedial test consisted of 20 newly generated test items in a format similar to the original formative test.This remedial test aimed to evaluate the effectiveness of the personalized learning paths generated using the proposed method.Students had 7 days to independently relearn and prepare for the test.During this preparation period, remedial students had unrestricted access to learning resources and practice test items provided to them.The domain experts played a role in ensuring the reliability and validity of the test items.Figure 9 illustrates the experimental scenario of this research.

B. FORMATIVE TEST RESULTS ANALYSIS
The objective of the formative test was to assess the students' mastery level after completing the physics class. Table 4 presents the score statistics of the formative test for both the experimental and control groups. In the control group, out of 60 students, only nine passed the test. The average formative test score for the control group was 46.08, with a standard deviation of 18.69. The average score of the students who did not pass the formative test in the control group was 40.88, with a standard deviation of 14.75. In the experimental group, out of 61 students, only 10 passed the test. The average formative test score for the experimental group was 46.32, with a standard deviation of 17.60. Among the students who did not pass the formative test in the experimental group, the average score was 40.98, with a standard deviation of 14.07.
Before carrying out the formative test, the students were randomly assigned to the experimental and control groups. To ensure that there were no statistically significant differences in the students' prerequisite knowledge between the two groups, a t-test and an F-test were carried out to compare the population means and variances, respectively, using the formative test results. The results of the F-test showed that the variances of the test scores in the experimental and control groups did not differ significantly, with a p-value of 0.32 (>0.05). Moreover, the results of the t-test indicated that there was no significant difference in the mean scores between the two groups, with a p-value of 0.48 (>0.05). These findings suggest that students in both groups had similar levels of prerequisite knowledge, which allows for a fair evaluation of the proposed method.
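The group comparison can be retraced from the summary statistics reported above. The following is a minimal sketch, assuming a pooled-variance two-sample t-test and a variance-ratio F-test (the text does not state which variants were used); the function names are illustrative. It computes only the test statistics and their degrees of freedom; the p-values would follow from the corresponding t and F distributions.

```python
import math


def pooled_t_statistic(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sample t statistic with pooled variance, plus its degrees of freedom."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


def variance_ratio_f(sd_a, n_a, sd_b, n_b):
    """F statistic (larger variance over smaller) with (numerator, denominator) df."""
    if sd_a >= sd_b:
        return sd_a ** 2 / sd_b ** 2, (n_a - 1, n_b - 1)
    return sd_b ** 2 / sd_a ** 2, (n_b - 1, n_a - 1)


# Formative test summary statistics as reported in the text.
control = {"mean": 46.08, "sd": 18.69, "n": 60}
experimental = {"mean": 46.32, "sd": 17.60, "n": 61}

t_stat, t_df = pooled_t_statistic(
    experimental["mean"], experimental["sd"], experimental["n"],
    control["mean"], control["sd"], control["n"],
)
f_stat, f_df = variance_ratio_f(
    control["sd"], control["n"], experimental["sd"], experimental["n"]
)

print(f"t = {t_stat:.3f} with {t_df} degrees of freedom")
print(f"F = {f_stat:.3f} with {f_df} degrees of freedom")
```

Both statistics are small (a t statistic close to 0 and an F ratio close to 1), which is consistent with the conclusion that the two groups started from comparable levels.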

C. REMEDIAL TEST RESULTS ANALYSIS
After the formative test, the students from both the experimental and control groups who did not pass were required to take the remedial test. Table 5 provides the remedial test score statistics for both groups. A total of 51 students from the control group and 51 students from the experimental group participated in the remedial test. The mean score for the control group on the remedial test was 46.08, with a standard deviation of 13.84. In contrast, the experimental group achieved a higher mean score of 65.27, with a standard deviation of 11.59. The significant difference in mean scores between the experimental and control groups suggests that the proposed method had a positive impact on the students' performance, as reflected in their remedial test scores.
The results shown in Table 6 highlight the impact of the PRLP on the students' learning achievement. In the experimental group, the mean test score significantly increased from 40.98 in the formative test to 65.27 in the remedial test. Conversely, the control group experienced a smaller increase, with the mean test score rising from 40.88 in the formative test to 44.29 in the remedial test. These findings show that the personalized learning paths produced through the proposed concept co-occurrence approach were highly beneficial for the students in enhancing their learning outcomes. The significant increase in the mean test score for the experimental group shows that the PRLP recommendations helped the students focus on and master the concepts they had previously struggled with, which led to higher achievement levels.

D. LEARNING ACHIEVEMENT
To further analyze the learning achievement of the underachieving students who did not pass the formative test, the remedial students in each group were divided into two clusters based on their knowledge level. This division allowed for a more focused analysis of the students' performance and the effectiveness of the intervention. The first cluster consists of students with test scores of <55, whereas the second cluster consists of students with test scores of ≥55. Table 7 presents a comparison of the statistics for the remedial test scores between the experimental and control groups within these two knowledge clusters. The data in Table 7 show that among students with a remedial score of <55, the experimental group had a higher mean score (46.44) and a higher lowest score (33) than the control group, which had a mean score of 35.77 and a lowest score of 10. Furthermore, among students with a remedial score of ≥55, the experimental group had a higher mean score (69.31) and a higher highest score (95) than the control group, which had a mean score of 57.92 and a highest score of 70. These findings suggest that implementing the PRLP helped improve the learning outcomes of underachieving students. This observation highlights the importance of implementing a PRLP to facilitate optimal preparation for the remedial test, underscoring the necessity of such a guidance system. Moreover, Figure 10 reveals that all students in the experimental group were able to improve their scores on the remedial test. The results further support the effectiveness of the PRLP, which was developed using the proposed autogenerated weighting by concept co-occurrence method, in providing students with learning guidance and improving their academic performance.
Additionally, we analyzed the learning achievement of underachieving students from the perspective of five distinct knowledge levels. Table 8 shows a comparison of the mean scores for the formative and remedial tests among underachieving students in both the control and experimental groups across these five knowledge levels. In Table 8, the number of students in the experimental group scoring <40 decreased from 26 (50.98%) in the formative test to 1 (1.96%) in the remedial test. Conversely, the number of students in the control group scoring <40 decreased from 25 (49.01%) in the formative test to 19 (37.25%) in the remedial test. The share of students with a score of <40 thus fell by 49.02 percentage points in the experimental group but by only 11.76 percentage points in the control group. Table 8 also provides an interesting observation about the students in both groups with scores of ≥60.5. In the experimental group, the number of students achieving a score of ≥60.5 increased from 3 (5.88%) in the formative test to 35 (68.62%) in the remedial test, an increase of 62.74 percentage points. By contrast, in the control group, the number of students with a score of ≥60.5 increased from 2 (3.92%) in the formative test to 3 (5.88%) in the remedial test, a slight increase of 1.96 percentage points. Notably, in the remedial test, two students in the experimental group achieved exceptionally high scores of >80.5, whereas none of the students in the control group reached such a level. This finding further reinforces the efficacy of the PRLP developed using the proposed autogenerated weighting method in substantially enhancing students' learning achievement.
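The shifts quoted above can be recomputed from the reported head counts. The following is a small sketch, assuming 51 remedial students per group as stated in the remedial test analysis; the dictionary layout and names are illustrative. Differences are expressed in percentage points, and values may differ from the quoted ones in the second decimal place depending on whether the percentages are rounded or truncated.

```python
REMEDIAL_GROUP_SIZE = 51  # remedial students per group, as reported in the text

# (formative count, remedial count) per group and score band, as reported.
reported_counts = {
    ("experimental", "<40"): (26, 1),
    ("control", "<40"): (25, 19),
    ("experimental", ">=60.5"): (3, 35),
    ("control", ">=60.5"): (2, 3),
}


def percentage_point_shift(before, after, total):
    """Change in the share of students, in percentage points."""
    return 100 * (after - before) / total


shifts = {
    key: percentage_point_shift(before, after, REMEDIAL_GROUP_SIZE)
    for key, (before, after) in reported_counts.items()
}

for (group, band), shift in shifts.items():
    print(f"{group:12s} {band:7s} {shift:+.2f} percentage points")
```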
Based on the experimental results, we conclude that using a PRLP as a learning guide significantly improves students' learning achievement. Moreover, the proposed autogenerated weighting method addresses the limitations of manually assigning weight values to indicate the relationship between test items and concepts. It significantly reduces the time required to establish weight values and eliminates the need to resolve weight conflicts among experts. Additionally, the experimental results provide evidence of the effectiveness of the proposed concept co-occurrence approach in substantially enhancing the learning achievement of underachieving students.

A. COMPARISON WITH PREVIOUS STUDIES
Remedial students must study several concepts, and it is important to determine the weight of each concept. Knowing these weights makes it possible to determine which concepts are more important to learn as a basis for mastering other concepts; our research therefore emphasizes concept weights and their effect on the relationships between concepts. We collected the results of various studies on CER methodology to compare which method was most appropriate, including the research by Hwang (2021), as described thoroughly in Section II. We found a typical problem in the previous research: the process of identifying each related concept in the test items and determining the concept weights was conducted manually by domain experts, whose judgments tend to be rigid and subjective, so debates often occur between experts in determining weights. The result is inconsistent concept weighting, which makes it difficult to build a CER as a reference for designing learning paths for remedial students. Our research replaces, or at least dramatically reduces, the role of the expert by employing a deep learning architecture to analyze concept co-occurrence in the test items in the question bank. Hence, compared with approaches that depend heavily on experts, our proposed method is automated and less time-consuming while remaining consistent and reliable; it could also be applied to other subjects that use natural language in learning and emphasize the importance of understanding the relationships between concepts to achieve mastery.

B. IMPLICATIONS TO ASSESSMENT-BASED LEARNING METHODOLOGY
Generally, the objectives of assessment-based learning methodologies are to ensure that students acquire knowledge and skills and demonstrate their understanding and competence through various assessment formats. One of the assessment formats for learning a subject is a formative test, which contains a set of test items taken from a database of test items or a question bank. A student is encouraged to analyze and evaluate information and to make connections by exploring the relationships between concepts within a single subject or between different subjects, since students with a conceptual understanding of the subject would be more knowledgeable than those who only know theoretical facts and procedures. Hence, the concept is essential in education. In this research, both individual and adaptive PRLPs are constructed by mapping the test items that students have failed to master in the formative test to the established CER or weighted-directed graph, resulting in a P-TIRT. The P-TIRT serves as a reference for calculating the personalized error ratio, which provides insights into the specific areas where an individual student struggles based on the student's performance in the formative test. In this way, the most suitable remedial learning path for students to fulfill their learning needs can be established, and the recommended sequence of concepts to learn would improve learning outcomes.
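The mapping from failed test items to a learning path can be illustrated with a short sketch. It rests on assumptions that are not taken from this section: the P-TIRT is represented as a dictionary from test item to concept weights, the error ratio of a concept is the weight of the failed items related to it divided by the weight of all items related to it (a common formulation in concept-effect approaches), and the path lists concepts above a hypothetical threshold in descending order of error ratio. All names and sample values are illustrative.

```python
def personal_error_ratios(tirt, failed_items):
    """Per-concept error ratio: failed item weight divided by total item weight."""
    total_weight, failed_weight = {}, {}
    for item, concept_weights in tirt.items():
        for concept, weight in concept_weights.items():
            total_weight[concept] = total_weight.get(concept, 0.0) + weight
            if item in failed_items:
                failed_weight[concept] = failed_weight.get(concept, 0.0) + weight
    return {
        concept: failed_weight.get(concept, 0.0) / total
        for concept, total in total_weight.items()
        if total > 0
    }


def remedial_path(error_ratios, threshold):
    """Concepts at or above the threshold, ordered from highest error ratio down."""
    weak = [(c, r) for c, r in error_ratios.items() if r >= threshold]
    return [c for c, _ in sorted(weak, key=lambda pair: (-pair[1], pair[0]))]


# Hypothetical item-to-concept weights and one student's failed items.
tirt = {
    "Q1": {"force": 1.0},
    "Q2": {"force": 0.5, "friction": 0.5},
    "Q3": {"friction": 1.0},
    "Q4": {"acceleration": 1.0},
}
failed_items = {"Q2", "Q3"}

ratios = personal_error_ratios(tirt, failed_items)
path = remedial_path(ratios, threshold=0.3)
print(ratios)
print(path)
```

In this toy profile the student failed every item involving friction, so friction comes first in the path, followed by force; acceleration is left out because no related item was failed.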

C. LIMITATIONS AND CHALLENGES
Our research has shown that the proposed method is more adaptive and practical in generating the learning path for remedial students. However, we acknowledge that there is still room for improvement due to the limitations we encountered during our research, including:
• The proposed method works best in subjects that require a relationship between concepts, for example, science and math.
• The applied NLP solutions may require continuous adjustment to process text written in a local native language, because the vocabulary used in composing a test item affects the result of the multilabel classification of test items.
• New sets of random test items from the question bank were still assigned manually in the exercise sessions. Automating this step would be more efficient in an assessment-based learning environment.
Several challenges that encourage us to elaborate further in the future are as follows:
• When a student has difficulty mastering a concept during the learning process, the cause may not merely be a lack of understanding of the prerequisite concepts. There are also external factors that may be considered physical conditions (e.g., learning environment, resources and materials, social factors, health and wellbeing, assessment and feedback, teaching methods, and student characteristics), which undeniably make learning a nonlinear process in terms of learning durability and assessment results. Therefore, utilizing such factors to predict probabilities of success or failure in enhancing learning outcomes as a complement to the provided personalized learning path would be beneficial. Unfortunately, it is difficult for methods based on physical or data-driven models to fully characterize this nonlinear process, and existing methods that hybridize physical and data-driven models suffer from ambiguous hybridization, so the vast majority of existing methods for predicting learning durability and assessment results lack accuracy and robustness [28]. These conditions motivate the use of deep learning models, such as convolutional neural networks (CNNs), graph neural networks (GNNs), long short-term memory (LSTM) networks, bidirectional LSTMs, and hybrid neural networks, which can automatically learn complex feature representations of the system from large amounts of data to better capture the state evolution patterns of the system at different time steps [29].
• Currently, establishing the concepts still depends on the teaching activities in the classroom. It would be beneficial if the concepts were generated automatically from the learning material in the student's textbook, so they would adapt to textbook revisions.
• The application of the proposed method should be expanded to other topics in physics and to subjects from other disciplines that provide supporting concepts, such as math, chemistry, electronics, and others.

D. FUTURE RESEARCH DIRECTIONS
Given the limitations and challenges identified in our research, the following suggestions serve as our contributions to the ongoing discourse in the field and pave the way for future research:
• Regarding implementation in other related topics in physics or in entirely different subjects, the method is applicable if the other subject also emphasizes the importance of understanding concepts and the relationships between concepts to achieve mastery, such as concepts in biology, chemistry, and math, of all students within a course and generates a personalized remedial learning path for remedial students. The module would also be equipped with an automatic quiz generation module that selects questions from the question bank based on the personal remedial learning path. Conversely, regarding scalability to other subjects, it would be more valuable if our proposed method could build a CER from multiple concepts across subjects. For example, we noticed that the concept of trigonometry in math is closely related to comprehending the concepts of dynamics in physics, so it would be more effective if our proposed method were able to analyze the test items in the same question bank and then generate the CER for the concepts both in math and in physics, either separately or integrated. Other prior knowledge from other disciplines that would be useful to explore to mitigate knowledge barriers is semantic skill, either grammatical or lexical, which is needed to grasp the relationships between words, understand context, and recognize the subtle nuances in the meaning of a concept in the textbook or a test item, particularly for a word from a local language that has several different definitions. Regarding customization to different subjects, we believe that it depends on how significantly it will affect the implementation of the proposed method in the subject. As in our research, when we need to measure the effectiveness of an established PRLP, we can customize the passing grade in the phase of the exercise sessions and remedial test.
• Generating a weighted-directed graph of the relationships between concepts integrates expert knowledge and system data. However, expert knowledge is difficult to use directly [30], and the fusion method of expert knowledge and data is not appropriate and affects the control effect [31]. As previously mentioned, one of the desired results achieved in our research is eliminating, or at least minimizing, expert intervention in generating the CER for remedial students and replacing it with a deep learning algorithm that autogenerates the CER based on the frequency of concept co-occurrence. Through this automation, we also aim to eliminate the consensus problem, which involves getting all nodes in the system, representing concepts, to agree on a single value or decision. In this case, there is an opportunity to explore the implementation of Fault Tolerance Consensus (FTC), a mechanism designed for distributed systems that ensures that the system can continue to function correctly and reach consensus among nodes even in the presence of faulty nodes, including node crashes.

VI. CONCLUSION AND SUGGESTIONS FOR FUTURE WORK
This study proposes an autogenerated weighting method that employs the frequency of concept co-occurrence in a test item bank to determine the degree of relationship between test items and subject concepts. The frequency matrix is constructed according to the occurrence of concepts in the test items, and a concept ranking algorithm is applied to determine the most frequently appearing concepts. By obtaining the order of concept indices, the weights of concept relationships are calculated through normalization based on the frequency of occurrence. This process results in the creation of a CER, which is a weighted and directed graph used to map personalized profiles for remedial students. The CER serves as a reference to determine a personal error ratio, which is then applied to define the sequence of the PRLP. The proposed approach successfully addresses the issue of conflicts arising from assigning different weight values by experts, which results in reliable weight values. By employing the concept co-occurrence-based CER method in the development of a PRLP, a CER model is obtained. This model incorporates adaptive weight values on both the nodes and the relationships between nodes, which accommodates changes in the composition of the test items within the question bank. The learning path sequence is no longer fixed based on an initial learning sequence but rather adaptive, prioritizing concepts that occur most frequently in the test items. Each remedial student's learning path is customized based on their profiling results. This personalized approach effectively helps students enhance their learning achievement in the remedial test. The learning paths are designed by determining concepts that students have yet to master and providing them with the necessary foundational knowledge. As a result, students acquire the necessary knowledge to achieve better results, even when faced with different sets of test items.
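The ranking and normalization steps summarized above can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure of this study: node weights are taken as each concept's share of all concept occurrences, and the weight of an edge from a supporting concept to a primary concept is the co-occurrence count divided by the total count for that primary concept. The function name, data layout, and sample counts are hypothetical.

```python
def build_cer(concept_freq, cooc):
    """Rank concepts by frequency and normalize counts into node and edge weights.

    concept_freq: {concept: number of test items in which it occurs}
    cooc: {primary: {supporting: number of items in which both occur}}
    """
    total = sum(concept_freq.values())
    # Most frequent concepts first; ties are broken alphabetically.
    ranking = sorted(concept_freq, key=lambda c: (-concept_freq[c], c))
    node_weight = {c: concept_freq[c] / total for c in ranking}

    edge_weight = {}
    for primary, row in cooc.items():
        row_total = sum(row.values())
        if row_total == 0:
            continue  # no supporting concept was ever observed for this primary
        for supporting, count in row.items():
            if count > 0:
                # Directed edge: supporting concept -> primary concept.
                edge_weight[(supporting, primary)] = count / row_total
    return ranking, node_weight, edge_weight


# Hypothetical counts for three concepts.
concept_freq = {"force": 6, "friction": 3, "acceleration": 1}
cooc = {
    "friction": {"force": 3, "acceleration": 1},
    "acceleration": {"force": 1},
}

ranking, node_weight, edge_weight = build_cer(concept_freq, cooc)
print(ranking)
print(node_weight)
print(edge_weight)
```

Because every weight is derived from counts in the question bank, rerunning the function after the bank changes yields updated weights, which mirrors the adaptive behavior described above.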
During the research process, we manually compiled the concept co-occurrence frequency matrix, which involved main concepts and supporting concepts in multilabel classification. However, we recommend automating this process using machine learning modeling in future work to enhance its efficiency and accuracy. Although our research applied BERT (Bidirectional Encoder Representations from Transformers), a machine learning model that falls under the category of natural language processing (NLP) models, the output of BERT is still processed manually to construct the co-occurrence matrix. We are therefore optimistic that a simple modification to the algorithm or code, such as inserting summation instructions, can directly convert the keywords produced by BERT, which is pre-trained on large amounts of textual data, into the co-occurrence matrix in an automated sequential process, further improving the feasibility and reliability of the method.
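The summation step suggested above could look like the following sketch. It assumes that the classifier output has already been reduced to one primary concept and a list of supporting concepts per test item; this record layout is hypothetical and does not reflect the output format of BERT or of any specific library.

```python
from collections import Counter


def cooccurrence_counts(labeled_items):
    """Sum concept occurrences and (supporting, primary) co-occurrences over items."""
    concept_freq = Counter()
    cooc = Counter()
    for item in labeled_items:
        primary = item["primary"]
        # A concept is never counted as supporting itself.
        supporting = set(item["supporting"]) - {primary}
        for concept in {primary} | supporting:
            concept_freq[concept] += 1
        for concept in supporting:
            cooc[(concept, primary)] += 1
    return concept_freq, cooc


# Hypothetical labels for three test items.
labeled_items = [
    {"primary": "friction", "supporting": ["force"]},
    {"primary": "friction", "supporting": ["force", "acceleration"]},
    {"primary": "acceleration", "supporting": ["force"]},
]

concept_freq, cooc = cooccurrence_counts(labeled_items)
print(concept_freq)
print(cooc)
```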
The application of PRLPs can also be extended to develop assessment-based learning methods by integrating them with the test item database. This would make the proposed method more widely applicable and not limited to remedial learning scenarios, since the method works by analyzing the relationships between concepts and presenting them as a weighted-directed graph. To build a concept map for designing learning paths for students in the learning process, it is possible to switch the approach of CER utilization to become antemortem, so that it helps students comprehend prior knowledge in a learning session and prepare themselves before the formative test. The critical success factor of this altered approach is the existence of a question bank, such as the one we prepared for our research, which contains a large number of labeled test items and is ready to be integrated with the proposed method to achieve the objectives of the assessment-based learning methodology: that students acquire knowledge and skills and demonstrate their understanding and competence through various assessment formats. Furthermore, future research should aim to determine correlations between physics concepts and prerequisite knowledge in other subjects that may hinder the acquisition of mastery. For example, understanding mathematical concepts such as logarithms and graphs of sine, cosine, and tangent is crucial for comprehending certain physics concepts. Therefore, obtaining the concept relationships of prior knowledge between different disciplines will keep the proposed method relevant and applicable over an extended period, and it will help address knowledge barriers between disciplines, hence providing a better guide to designing PRLPs.

FIGURE 1. Process of Developing PRLP based on Frequency of Concept's Co-occurrence.

FIGURE 2. Concept's co-occurrence relationships as a matrix representation.

Figure 4 shows the graph representation of the changes observed in the concept's co-occurrence matrix. The arrows in the graph depict the relationship in which a concept serves as a supporting concept for the corresponding intended concept (node).

FIGURE 6. Illustration of personalized concept matrix profiling.

FIGURE 7. Illustration of mapping personalized test item relationship table to acquire personal error ratio.

FIGURE 8. Illustration of generating personal remedial learning path.

Figure 10 presents the formative and remedial test scores of each underachieving student in both the control and experimental groups. In Figure 10, several students in the control group obtained a lower score on the remedial test.

FIGURE 10. Comparison between formative (a) and remedial tests (b) of the control and experimental groups.

TABLE 3. New relationship matrix between supporting concepts and primary concepts.

TABLE 4. Formative test score statistics.

TABLE 5. Remedial test score statistics.

TABLE 6. Comparison of the underachieving students' formative and remedial test scores.

TABLE 7. Statistical comparison of the remedial scores between the two groups in two different knowledge levels.

TABLE 8. Comparison of formative and remedial test scores of the control and experimental group in five different knowledge levels.