Disclosing Personal Names in Screen Names Predicts Better Final Achievement Levels in Massive Open Online Courses

The anonymity of the Internet was once considered a factor that encouraged learners to engage in online learning. However, academic studies on anonymity have found that its effect on learning is context-dependent or mixed. In this research, we focused on massive open online course (MOOC) learners’ preference for personal name disclosure in their screen names as a predictor of their final achievement levels (FALs) at the end of a course with 2606 active learners. We conducted two studies, one to examine the associations between these two variables and one to demonstrate how such associations can be utilized in MOOC FAL prediction. We found that MOOC learners who included personal names in their MOOC screen names significantly outperformed other learners in their FALs (p < 0.001). We also found that including screen name preference as a feature improved the accuracy of FAL prediction based on natural language processing and machine learning techniques. The error rate was reduced to 4.03% by a random forest algorithm with an appropriate feature combination: the personal name disclosure indicator (PNDI), quiz scores, number of replies, and exam scores. The results are potentially useful for the development of early interventions that provide different types of help to students who prefer to disclose personal names and those who do not. The practical effects of these interventions will be examined in the future. Whether the course difficulty level or course type affects the associations between personal name disclosure and FAL will also be examined.


I. INTRODUCTION
Massive open online courses (MOOCs) are large-enrollment, Internet-based classes that provide an alternative to on-campus college courses. A typical MOOC lasts 8-15 weeks, with new material provided weekly and time-sensitive assignments due throughout the course. Communication among classmates and instructors occurs through messages posted online using a structure similar to that of social media. Although this format can be beneficial for many students, MOOCs typically have a low completion rate, suggesting that some students may need support to finish the course [1]. If a learner's eventual need for support can be predicted at an early stage, the MOOC provider and course instructors can have more time to create personalized support strategies before problems become obvious [2].
The associate editor coordinating the review of this manuscript and approving it for publication was Alberto Cano.
To date, research on early prediction in MOOCs has focused on predicting dropout [3], [4]. Reference [5] classified two sets of MOOC dropout predictors: demographic variables, such as gender, age, or educational background [6], [7], and course-related behaviors (''user events''), such as watching an assigned video or participating in an online discussion [8], [9]. In the current study, rather than focusing on dropout, we focused on predicting final achievement in the course, which is important since being active throughout the course (suggesting no risk of dropout) but still failing the course may frustrate MOOC learners. Stakeholders in the MOOC industry may wonder if, besides the indicative Internet behaviors that have been examined in previous studies on MOOC dropout, there are any other Internet behaviors that can help predict final achievement level (FAL) earlier.
The current research investigated a familiar Internet behavior that has not previously received attention in MOOC dropout studies: the learner's choice to use a screen name that discloses something about the learner's identity (e.g., Jane77) rather than a screen name that conceals his or her identity (e.g., mooc1328). The field of education has provided evidence that students' and teachers' online anonymity preferences are associated with higher-quality feedback given to peers [10]. Anonymity on the Internet has also been shown to encourage learning engagement [11], [12] and to facilitate better learning performance [13]. However, as summarized by [14], contrary to expectations, disclosure has not consistently been found to be greater in online contexts than offline contexts. For example, [15] found that non-anonymous forums seem to generate higher-quality second language learning than anonymous forums.
Currently, the predominant MOOC platforms follow a one-size-fits-all approach. Technical support for goal-oriented and self-regulated learning has been proposed to facilitate personalized learning in MOOCs, e.g., [16]. However, it seems that the MOOC teacher is not considered in such facilitation. Although MOOCs reduce teacher labor, they do not necessarily prevent the teacher from needing to interact with MOOC learners. Compared with the Western definition of a ''good teacher'' being ''professional'' in terms of course preparation, teaching skills, and assessment fairness, the Chinese definition places more emphasis on the teacher being caring and wanting to get to know her students as individuals outside the classroom [17]. In fact, Jaspers classified ''education of a whole man'' as one of three factors that make up a university (the other two are professional training and research) and stated that ''by isolating them [these three factors], the spirit of the university perishes'' [18]. In addition, both the Western and Eastern world are familiar with the story of how scientist Michael Faraday met Sir Humphry Davy: Faraday sent Davy a 300-page book based on the notes he had taken during Davy's lectures. In the modern world, how can a teacher identify a MOOC learner whose learning motivation is likely stronger or weaker than that of others, especially when learners' public information is concealed as much as possible, with only the screen name exposed?
Our research is one of the first studies to suggest that the disclosure of personal names in screen names predicts better FALs in MOOCs. Specifically, we studied whether students' choices to disclose personal names in their screen names were associated with better final scores in a MOOC after we took into account their participation in class discussions. The findings may lead to a new strategy for MOOC instructors to reallocate their energy: paying more attention to MOOC learners who are active but whose screen names are anonymous.
One point should be made about the terms used in the current study. Researchers have distinguished ''real name'' screen names from ''anonymous'' screen names in other studies. This distinction is appropriate when automatically generated screen names fall under one category or the other or when students can express a preference for one of two well-defined options. However, it is common in MOOCs for students to have the option to passively accept an automatically generated screen name (one that may or may not include the student's real name) or to actively change the screen name (to one that may or may not include the student's real name). Therefore, instead of the ''real name'' vs. ''anonymous'' distinction, we categorized chosen screen names into two types: those that included a personal name of any kind and those that did not. For example, a self-disclosing screen name could be one that included a real name (jane77), a nickname (janie4), or any other personal name (mulan8888).

II. LITERATURE REVIEW
A. SOCIAL PSYCHOLOGICAL THEORIES OF SELF-DISCLOSURE
The social-psychological literature on self-disclosure provides a useful conceptual framework for the current research: including a personal name in a screen name can be seen as a kind of self-disclosure. Reference [14] systematically reviewed several key self-disclosure theories in computer-mediated communications (CMC). Among these theories, one is directly relevant to screen name preference: hyperpersonal CMC theory [19]. This theory posits that in the context of the Internet's general anonymity, users experience a sense of control when they can manipulate their self-presentation (e.g., their screen names and profile photos). This sense of control encourages self-disclosure.
However, if we agree that disclosing a personal name can be seen as a kind of self-disclosure, then hyperpersonal CMC theory may imply a contradictory proposition, i.e., that anonymity encourages personal name disclosure. This contradiction can be resolved by arguing that the personal name disclosed is probably not the user's real name. Although this argument partially holds true in reality, a more persuasive resolution could be that, even though the Internet's anonymity might encourage disclosure overall, users have choices about how much to disclose. Screen names are one way that individuals can remain anonymous or self-disclose on the Internet.
Given that self-disclosure is context-specific [14], would disclosing a personal name in a learning context imply something other than the expectation of social networking, since the number of in-depth peer interactions in MOOCs has been found to be quite low [20]?

B. EDUCATIONAL RESEARCH ON ANONYMOUS FEEDBACK
Research on self-disclosure concerning screen names is more common in educational research than psychological research. In education, the research focus has been on screen names used in online peer evaluations. The results of these studies have been mixed. On the one hand, students with anonymous screen names are approximately five times more likely to provide substantively critical feedback to peers than identifiable students [21], and teachers with anonymous screen names have been shown to provide significantly more cognitive feedback on peers' microteaching performance than identifiable teachers [10]. On the other hand, [10] also found that identifiable teachers offered more affective feedback (i.e., support, disapproval) and more reflective comments than those with anonymous screen names.
VOLUME 9, 2021
Importantly, learners' preferences for disclosure or nondisclosure in screen names may vary by learner role and task. Reference [22] investigated learners' preferences for screen names characterized by a real identity, an anonymous identity, and a created identity. When constructing questions in a course, 47.5% of university freshmen preferred to use nicknames, followed by anonymous names (30%) and real names (12.5%), with nicknames being seen as ''more interesting and playful.'' However, when assessing peers' constructed questions, 37.5% of the students preferred anonymous names, 35% chose nicknames, and 12.5% used real names. Those who chose anonymous names ''did not want uneasy feelings or tensions to occur as a response to the evaluative ratings and comments.''
To our knowledge, there has been only one study to date that was directly related to our study on screen name preference and FALs. Reference [23] investigated the direct link between secondary school learners' screen name preferences and their performance goals in a game-based learning context in which students' scores were posted next to their screen names online. Students who had non-anonymous screen names reported higher performance goals in the competition than students who had anonymous screen names. However, this experimental setting was different in several ways from a MOOC, which is typically a noncompetitive environment. In such an environment, each student works autonomously, students do not know their peers' performance, and students are allowed to quit. Therefore, being self-driven is very important for MOOC learners to progress through the course [24]. Whether personal name disclosure still plays the role of such a self-driven force in a noncompetitive learning environment needs to be examined.
Additionally, [23] assigned middle school students into different groups (a real name group, an anonymous group, and a control group) rather than allowing them to freely choose a group. Therefore, it is worth investigating whether there is an association between screen name preference and FAL in the MOOC context.

C. COURSE-RELATED BEHAVIORS AS PREDICTORS OF MOOC FALS
There is little research on course-related behaviors as predictors of learning outcomes, including FALs and course assessment scores (see [25] as one of the few exceptions). The closest literature is the large body of research on behavioral predictors of MOOC dropout (for reviews, see [3], [4]). Behavioral features refer to clickstreams (e.g., navigating pages) and text input by learners [26].
Behavioral variables appear to vary in their effectiveness as predictors of MOOC dropout. For example, [27] found that the usage of emotive vocabulary (e.g., ''happy'' or ''I wasn't able to . . . '') did not significantly predict dropout. Reference [28] found that video viewing patterns and device information (e.g., use of a Chrome browser on a Windows PC) were effective dropout predictors. In addition, assignment performance [29], [30], quiz completion, and final exam completion [31], [32], [33] were found to be effective dropout predictors. In the current study, quiz scores and the frequencies of different types of participation in online discussions were included as predictors of the final course score.
In summary, social-psychological and educational research supports the association between anonymity and a certain kind of behavior. Therefore, we can hypothesize that some association between screen name preference and MOOC FALs may exist. However, neither the social-psychological nor educational research literature on self-disclosure provides a clear prediction for this association since their conclusions are either context-dependent or mixed. Although some selected course-related behaviors can predict MOOC FALs, research on the association between screen name preference and FAL is still important because with such an association (if it exists), the observed values of course-related predictors do not have to be accumulated to a certain degree over time to make an accurate prediction, which extends the warning time. The MOOC provider and course instructors can thus have more time to either create or adjust personalized support strategies before problems emerge [2] or can take advantage of diverse characteristics in MOOC learners by using different but efficient teaching strategies [34], [35].

III. RESEARCH QUESTIONS
• RQ1. Is the preference for disclosing personal names in screen names associated with better MOOC FALs when other course-related behaviors are controlled?
• RQ2. Which set of variables (screen name preference or course-related behaviors) provides the best model for predicting MOOC FALs?
These two questions were investigated in the following two studies. Study 1 investigated the ''what'' problem, i.e., determining what the phenomenon was. In Study 1, the MOOC course data were used to test whether students who did and did not disclose personal names in their screen names differed in their FALs when other course-related behaviors were controlled. Study 2 answered the ''how'' problem, i.e., exploiting the phenomenon to predict learners' FALs. All the studies were approved by the Institutional Review Board (IRB) at the university with which the first author was affiliated.

IV. STUDY 1: WHAT IS THIS PHENOMENON?
The purpose of this study was to examine whether learners' choices to disclose personal names in screen names were associated with their MOOC FALs (RQ1).

1) PARTICIPANTS
The sample included 2606 active learners in a MOOC (Introduction to Psychology) that ran from September 20, 2017, to January 13, 2018. An active learner was defined as a learner with at least one record for the measures described below. Over 9000 learners enrolled during the first two hours when the course started. This MOOC was hosted by one of the most popular Chinese MOOC platforms (icourse163.org). This platform has hosted over 1300 Chinese MOOCs since 2017.

2) MEASURES
• Formative features: participation in online discussions. Participation in online discussions was measured by the number of comments, the number of replies, the number of new discussion threads, and the number of participation events in online discussions (the sum of the first three numbers). These components were called the ''formative features.''
• Summative features: assessments throughout the course. Assessments throughout the course included homework scores, discussion scores, quiz scores, and exam scores. These components were called the ''summative features.'' In addition, the exam and quiz scores were standardized test scores, whereas the homework and discussion scores were rated by the course teacher using open-ended questions.
• FALs. The course teacher developed the final score as a weighted sum of the scores for the formative and summative features. Based on the final score, students were classified as showing one of three FALs: failed, qualified, or excellent.

3) PROTECTION OF PRIVACY
We followed the ''separation schema'' approach to protect privacy [36]-[38]. This is a method of storing sensitive data (e.g., actual names and screen names) and nonsensitive data (e.g., course-related behaviors) separately with different storage providers. The method was implemented in three steps.
(1) De-identification: a research assistant, Alice (a pseudonym), flagged personal name disclosure with a Boolean variable called the personal name disclosure indicator (PNDI; 1 = a personal name was disclosed in the screen name, 0 = no personal name was disclosed in the screen name). She also saved a list of matched screen names with PNDI flags and self-created identity index numbers (IINs). She then created a new data file by deleting the screen names and retaining the IINs and PNDIs. Alice was not allowed to access the new data file, which was held by Bob, or the list of matched screen names and IINs, which was held by Carol, and Bob and Carol could not access each other's information. The new data file (Bob's file) was used in Study 1 and Study 2. The list of screen names (only the screen name column in Carol's list) was used in Study 2 as an independent data set to examine the feasibility of automatic personal name recognition.
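The de-identification step can be pictured with a minimal sketch. It assumes a toy record layout and sequential IINs; the class name, field names, and function name are illustrative and are not taken from the study's actual tooling.

```python
from dataclasses import dataclass


@dataclass
class RawRecord:
    """Hypothetical raw record, visible only during de-identification."""
    screen_name: str
    pndi: int          # 1 = personal name disclosed, 0 = not disclosed
    n_replies: int
    quiz_score: float


def separate(records):
    """Split raw records into a sensitive list and a de-identified data file.

    The two outputs share only the IIN, so neither holder can link
    behaviors to screen names without the other holder's data.
    """
    sensitive, deidentified = [], []
    for iin, rec in enumerate(records, start=1):  # sequential IINs (assumption)
        sensitive.append(
            {"iin": iin, "screen_name": rec.screen_name, "pndi": rec.pndi}
        )
        deidentified.append(
            {"iin": iin, "pndi": rec.pndi,
             "n_replies": rec.n_replies, "quiz_score": rec.quiz_score}
        )
    return sensitive, deidentified


# Made-up example records, using the screen name examples from the Introduction.
raw = [RawRecord("jane77", 1, 4, 88.0), RawRecord("mooc1328", 0, 1, 52.5)]
carol_list, bob_file = separate(raw)
```

In this sketch, `bob_file` carries no screen names, and `carol_list` carries no course-related behaviors.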

4) ASSIGNMENT TO GROUPS BASED ON PERSONAL NAME DISCLOSURE
The 2606 learners were classified into two groups. If PNDI = 1, then they were assigned to Group D (personal name disclosed; n = 357). If PNDI = 0, then they were assigned to Group S (personal name sealed; n = 2249). Table 1 shows the number of learners in each group who were in the failed, qualified and excellent FAL categories. Learners whose FALs were qualified or excellent earned certifications. In addition, these certifications were labeled either ''qualified'' or ''excellent.'' Table 1 shows that 1203 active learners received certifications.

5) PLANNED ANALYSES
The $\chi^2$ test was used to examine whether there was a significant difference in FALs between Groups D and S. Analysis of covariance (ANCOVA) was used to examine whether the PNDI was associated with final scores after the variance explained by covariates was removed. The covariates were the formative and summative features.

B. RESULTS
The difference in the FAL categories between Groups D and S was significant at the level of $1.62 \times 10^{-6}$ ($p < 0.001$), with $\chi^2 = 26.66 > \chi^2_{0.00001}((2-1) \times (3-1)) = 23.03$. The average final scores of Groups D and S were 44.39 (SD = 27.67) and 35.53 (SD = 28.51), respectively. One-way analysis of variance (ANOVA) indicated that the difference in the average final scores between groups was significant ($F = 29.98 > F_{0.05}(1, 2601) = 3.84$, $p = 5 \times 10^{-8} < 0.001$). An intuitive way to understand this result is that in the sample of 2606 active learners, 57.7% of Group D learners had qualified or excellent FALs, whereas only 44.3% of those in Group S did. Moreover, the proportion of learners with excellent FALs in Group D was approximately 1.47 times larger than that in Group S.
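The reported chi-square figures can be checked with a short calculation. For 2 degrees of freedom, the upper-tail probability of the chi-square distribution has the closed form exp(-x/2), so the standard library suffices. This is a sketch for checking the reported numbers, not the analysis code used in the study; the function names are illustrative.

```python
import math


def chi2_sf_df2(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 2 df."""
    return math.exp(-x / 2.0)


def chi2_critical_df2(alpha):
    """Critical value c with P(X > c) = alpha for 2 df (inverse of the above)."""
    return -2.0 * math.log(alpha)


df = (2 - 1) * (3 - 1)                  # two groups x three FAL categories
p_value = chi2_sf_df2(26.66)            # significance level of the observed statistic
critical = chi2_critical_df2(0.00001)   # threshold quoted in the text
print(df, p_value, round(critical, 2))
```

Under these assumptions, the computed significance level agrees with the reported $1.62 \times 10^{-6}$ to within rounding, and the critical value rounds to 23.03.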
ANCOVA showed that the PNDI was significantly associated with final scores after the removal of the variance introduced by the covariates, namely, the formative features, p < 0.05 (except for the discussion score, p = 0.0556) and the summative features, p < 0.05. The only significant interaction between the group assignment and a covariate was between the PNDI and the number of replies (p = 0.0041). To investigate this interaction, a new variable was created (many replies vs. few replies) using the median number of replies in each group as the cut-off. The median number was 3 for both Groups D and S. The post hoc test showed that the final scores of Groups D and S differed significantly at the few replies level (p < 0.05) but not at the many replies level (p = 0.317). That is, Group D had a significantly higher average final score than Group S among learners with few replies, whereas the two groups did not differ among learners with many replies. In addition, the association between the number of replies and final scores was significant (p < 0.001) in both Groups D and S.

V. STUDY 2: HOW TO EXPLOIT THIS PHENOMENON?
The purpose of this study was to explore the extent to which screen name preference, relative to other course-related behaviors, predicted FAL (RQ2).

1) PARTICIPANTS
The participants were the same group of 2606 MOOC learners described in Study 1. Two sets of information generated in Study 1 were used in this study: (1) the new data file that included the IINs and flags for individuals who included a personal name in their screen names and (2) the list of screen names (to examine the performance of automatic name recognition).

2) FAL PREDICTION FRAMEWORK
A framework was designed to explore the extent to which screen name preference, relative to other course-related behaviors, predicted FAL, as illustrated in Fig. 1. This FAL prediction framework consisted of three components: name recognition, feature selection, and prediction approaches. The framework was both an experimental flowchart and an implementation blueprint.
These three components are introduced separately below.
a: NAME RECOGNITION
Name recognition was performed manually, as the PNDI was assigned in Study 1. Personal name recognition was also performed using a well-developed natural language processing technology called named entity recognition (NER). There are at least 24 languages, including Chinese, for which NER can be implemented automatically [39], [40]. In this study, the prediction was carried out based on the PNDI tagged in Study 1. However, as a possible technical option for practice, the performance of automatic NER is also reported in the corresponding result subsection. Automatic Chinese personal name recognition programs are currently quite mature (e.g., [41], [42]). In this study, an automatic name recognition algorithm [43] was adopted to identify Chinese personal names in screen names. This algorithm has been integrated into a well-known Chinese lexical analysis system (CLAS), which is a widely used natural language processing (NLP) and information retrieval (IR) software package developed by the Institute of Computing Technology, Beijing, China: NLPIR-ICTCLAS.

b: FEATURE SELECTION
Study 1 described five summative features: homework score, quiz score, exam score, final score and FAL. Because the FAL was determined directly by the final score, we did not include the final score as a feature to predict FALs. As previously mentioned, since the exam and quiz scores were standardized test scores, whereas the homework score and discussion score were rated subjectively, the former two objective summative records were selected as default features.
To minimize the number of selected features and to balance the number of formative and summative features, the maximum number of formative features was set at two. To investigate the effectiveness of screen name preference in FAL prediction, the PNDI (see Study 1) had to be one of the two formative features. There were two alternatives for the second formative feature: the number of participation events and the number of replies. As described in Study 1, the number of participation events was the sum of the number of comments, replies, and new discussion threads initiated. The number of participation events was chosen since it was probably a better alternative to any single addend. However, the number of replies was reserved as a second alternative since it was the only covariate that had interacted significantly with the PNDI to predict final scores. Fig. 1 illustrates the feature combinations in a circuit diagram. Switches ($S_1$ and $S_2$) were used to build a path connecting the blue disks A and B. There were four status combinations of $S_1$ and $S_2$. According to the status of $S_1$, the feature combinations could be divided into two pairs. One pair shared at least three features: exam score, quiz score and the number of participation events in online discussions (termed ''EQP''). The other pair also shared at least three features: exam score, quiz score and the number of replies (termed ''EQR''). Within each pair, there was a three-feature combination (e.g., EQP or EQR) and a four-feature combination (e.g., EQP+PNDI or EQR+PNDI).
To identify the best feature combination that minimized the prediction error ratio (the ratio of the number of error cases to the number of all cases), either $S_1$ or $S_2$ was controlled to connect A and B. Prediction approaches, e.g., a random forest algorithm, were then applied to predict FALs.
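The four feature combinations and the error ratio can be enumerated in a few lines. The following is a sketch only. It assumes that one switch selects the second formative feature (participation events vs. replies) and that the other toggles the PNDI; the column names and function names are illustrative.

```python
from itertools import product

# Default summative features (exam and quiz scores) plus one of two
# alternative formative features; the column names are assumptions.
DEFAULT_FEATURES = ["exam_score", "quiz_score"]
SECOND_FORMATIVE = {"P": "n_participation_events", "R": "n_replies"}


def feature_combinations():
    """Return the four combinations: EQP, EQP+PNDI, EQR, EQR+PNDI."""
    combos = {}
    for s1, s2 in product(("P", "R"), (False, True)):
        name = "EQ" + s1 + ("+PNDI" if s2 else "")
        features = DEFAULT_FEATURES + [SECOND_FORMATIVE[s1]]
        if s2:                      # second switch adds the PNDI (assumption)
            features.append("pndi")
        combos[name] = features
    return combos


def error_ratio(predicted, actual):
    """Number of error cases divided by the number of all cases."""
    errors = sum(p != a for p, a in zip(predicted, actual))
    return errors / len(actual)


combos = feature_combinations()
for name, features in combos.items():
    print(name, features)
```

Each combination would then be passed to a classifier, and the combination with the lowest `error_ratio` would be retained.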

c: PREDICTION APPROACHES
Random forest (RF) is a prediction approach that is an ensemble of many classification trees [44].
A single classification tree is traditionally built by splitting samples into branches according to the splitting rules based on all features of the samples. This recursive splitting process ends when further splitting does not add value to the predictions.
RF improves the single classification tree schema by assembling multiple simpler trees whose splitting rules only cover a random subset of sample features. Specifically, to classify a limited input feature vector (by limited, we mean that the features were limited to those selected from the feature selection component in Fig. 1), the feature vector (denoted $v$ in Fig. 1) is put through each tree, i.e., $t_1, \ldots, t_T$, in the forest. Each tree gives an FAL categorical level (denoted $c$) (i.e., failed, qualified, or excellent), as a vote. As shown in Fig. 1, the forest chooses the FAL categorical level by a simple majority rule: the most ''votes'' overall indicate the predicted outcome.
As an ideal example, suppose there are 10 trees in a forest. Among these 10 trees, five trees vote an ''excellent'' FAL for a new sample, while three trees vote a ''qualified'' FAL for the sample, and two trees vote a ''failed'' FAL. Then, the RF will conclude that the FAL of this new sample is predicted to be ''excellent''. A more practical example is that a tree $t$ can ''vote'' a probability distribution of FALs for a sample $v$, denoted $P_t(c \mid v)$, e.g., 0.8 for ''excellent'', 0.15 for ''qualified'', and 0.05 for ''failed''. The ''forest'' then adds up $P_t(c \mid v)$ over all trees to make a final decision (denoted $P(c \mid v)$) by the same simple majority rule. Along with the final decision, the prediction accuracy and RF map (i.e., what the forest looks like) are produced.
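Both voting schemes described above can be written out in a few lines. This is a sketch under simple assumptions: the per-tree probability distributions are made up for illustration (only the first matches the example in the text), and averaging is used in place of a plain sum, which does not change the winning level.

```python
from collections import Counter

LEVELS = ("failed", "qualified", "excellent")


def hard_vote(votes):
    """Simple majority rule over one categorical vote per tree."""
    return Counter(votes).most_common(1)[0][0]


def soft_vote(distributions):
    """Average the per-tree distributions P_t(c|v) and pick the largest."""
    totals = {level: 0.0 for level in LEVELS}
    for dist in distributions:
        for level in LEVELS:
            totals[level] += dist.get(level, 0.0)
    n_trees = len(distributions)
    averaged = {level: totals[level] / n_trees for level in LEVELS}
    return max(averaged, key=averaged.get), averaged


# The 10-tree example from the text: 5 excellent, 3 qualified, 2 failed.
votes = ["excellent"] * 5 + ["qualified"] * 3 + ["failed"] * 2
hard_result = hard_vote(votes)

# Hypothetical per-tree probability outputs for a single sample v.
tree_outputs = [
    {"excellent": 0.8, "qualified": 0.15, "failed": 0.05},
    {"excellent": 0.3, "qualified": 0.6, "failed": 0.1},
    {"excellent": 0.5, "qualified": 0.3, "failed": 0.2},
]
soft_result, averaged = soft_vote(tree_outputs)
print(hard_result, soft_result)
```

Ties are not handled specially here; a production implementation would need an explicit tie-breaking rule.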
When the training set for the current tree is drawn by sampling with replacement, approximately one-third of the cases are excluded from the sample. These out-of-bag (OOB) data are used to obtain a running unbiased estimate of the classification error as trees are added to the forest. Therefore, there is no need for cross-validation or a separate test set to obtain an unbiased estimate of the test set error. Reference [5] found that RF outperformed other algorithms (i.e., decision tree, support vector machine, naive Bayes, feed-forward neural network, logistic regression, linear discriminant analysis, and self-organized map) in predicting MOOC FALs. This study was able to determine whether the PNDI could further improve the performance of RF.
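The ''approximately one-third'' figure follows from sampling with replacement: the chance that a given case is never drawn in $n$ draws is $(1 - 1/n)^n$, which approaches $1/e \approx 0.368$. The short simulation below illustrates this for a sample of 2606 cases and 50 trees. It is a sketch with an arbitrary random seed, not part of the study's pipeline.

```python
import random


def oob_fraction(n_cases, rng):
    """Fraction of cases left out of one bootstrap sample of size n_cases."""
    drawn = {rng.randrange(n_cases) for _ in range(n_cases)}
    return 1.0 - len(drawn) / n_cases


rng = random.Random(0)                     # arbitrary seed for repeatability
fractions = [oob_fraction(2606, rng) for _ in range(50)]   # one per tree
mean_oob = sum(fractions) / len(fractions)
print(round(mean_oob, 3))
```

Each tree's out-of-bag cases can then serve as its own held-out set when the running error estimate is computed.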
Multinomial logistic regression (MLR) is an alternative prediction approach. In short, MLR is essentially a logistic regression for a multi-class problem. It is the most frequently used algorithm in dropout prediction [3]. Since it is much more commonly used than RF, the details of MLR are omitted in Fig. 1. In MLR, the logit link function is given in (1), where subscript ''O'' denotes the ''number of participation events'' or, alternatively, the ''number of replies.''
The RF outcomes were compared with the MLR outcomes.
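For orientation, a baseline-category logit model of the kind referred to as (1) can be sketched as follows, with ''excellent'' as the reference level. The feature symbols $x_E$, $x_Q$, $x_O$, and $x_{\text{PNDI}}$, the intercepts $f_0$ and $q_0$, and the assignment of indices 1 to 3 to the exam score, quiz score, and discussion feature are assumptions made for illustration; the text states only that $f_4$ and $q_4$ are the PNDI coefficients.

```latex
\begin{aligned}
\ln\frac{P_{\text{failed}}}{P_{\text{excellent}}}
  &= f_0 + f_1 x_E + f_2 x_Q + f_3 x_O + f_4 x_{\text{PNDI}},\\
\ln\frac{P_{\text{qualified}}}{P_{\text{excellent}}}
  &= q_0 + q_1 x_E + q_2 x_Q + q_3 x_O + q_4 x_{\text{PNDI}}.
\end{aligned}
```

Under this form, a coefficient with a large absolute value in the second line indicates a feature that strongly separates qualified from excellent learners.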

B. RESULTS
All 2606 screen names were used to test the performance of the automatic Chinese personal name recognition program. The results are presented in Table 2. The accuracy, precision, and recall ratios were 0.9037, 0.6132, and 0.8039, respectively. The corresponding F1 (the harmonic mean of the precision and recall ratios) was 0.6957, which exceeded the F1 scores (ranging from 0.3295 to 0.4890) of the other algorithms tested by [42].
Below, we present descriptive information about the number of participation events (in online discussions). One-way ANOVA indicated a significant difference in the number of participation events between Groups D and S ($F = 25.57 > F_{0.05}(1, 2604) = 3.85$, $p < 0.001$). The average numbers of participation events were 8.81 (SD = 8.43) in Group D and 6.51 (SD = 7.94) in Group S. Of the 2606 learners in the sample, 99.0% (2577) participated fewer than 31 times (including those with 0 participation events). The maximum number of participation events was 106. For learners with fewer than 31 participation events, the proportion of learners with failed FALs at each number of participation events was computed separately for Group D and Group S. These proportions are plotted in Fig. 2.
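The F1 value reported for the name recognition program can be reproduced from the stated precision and recall. This is a minimal arithmetic check; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall as reported for the automatic name recognition program.
f1 = f1_score(0.6132, 0.8039)
print(round(f1, 4))
```

The large gap between precision and recall means that the program finds most disclosed names but also flags a fair number of screen names that contain no personal name.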
The effectiveness of the PNDI as a predictor could be illustrated by applying a simple prediction rule: if a learner disclosed a personal name in his or her screen name (PNDI = 1), then he or she was predicted to pass the course (i.e., have a qualified or excellent FAL); otherwise, he or she was predicted to fail. Under such a rule, the proportion of failed learners happened to equal the prediction error ratio. Fig. 2 shows that when the number of participation events was 0, the prediction error ratio for Group D was 11.78 percentage points lower than that for Group S, though both were very high (85.92% for Group S and 74.14% for Group D). Moreover, the prediction error ratio of Group D reached zero four times, at (12, 0), (16, 0), (19, 0), and (20, 0), all at fewer participation events than the first zero point of Group S. In other words, for learners who participated in online discussions 12, 16, 19, or 20 times, the FAL was correctly predicted 100% of the time solely because of personal name disclosure. In contrast, Fig. 2 shows that for Group S to reach the same 100% accuracy ratio as Group D, learners had to participate more than 20 times (the lowest was 21 times). In fact, Fig. 2 shows that 13 of the 30 points for Group D were such zero points, which means we could predict that 57 learners out of 353 (16.2%, more than 1/7) who disclosed a personal name and participated in discussions fewer than 31 times would obtain a passing FAL (qualified or excellent) with 100% accuracy by using the PNDI as the sole predictor.

1) PREDICTING FAL WITH RANDOM FOREST (RF)
The error ratio curves of the RF with different feature combinations are shown in Fig. 3. The maximum number of grown trees was set to 50. There were three reasons for this setting. The first reason was that 50 was the most frequently recommended minimum number in Table 2 in [45]. In the table, the experimental results covered 6 different datasets, and the numbers of class labels were very small (not more than 7 in five of the datasets); this was an appropriate reference since the number of class labels in our context was 3. The second reason was that 50 was a good estimation of the ensemble size at the turning point of a theoretical error rate curve in Fig. 1 in [46]. The third reason was that 50 was also used as a threshold after which the accuracy curves became smooth in three real world datasets in Fig. 2 in [47]; the number of class labels was not more than 5 in these three datasets. Furthermore, according to Table 3 in [48], 50 fell within a critical interval of the number of trees, i.e., (32, 64), in which multiple performance metrics reached stable statuses in 29 real world datasets. As shown in Fig. 3, the four error curves converged when the number of grown trees reached 50. At 50 grown trees, after the PNDI was added, the error ratios of the feature combinations were reduced from 0.0495 (EQP) and 0.0445 (EQR) to 0.0457 (EQP+PNDI) and 0.0403 (EQR+PNDI), respectively. Although the reduction seemed minor in absolute terms (0.0038 for EQP and 0.0042 for EQR), it amounted to nearly 10% of the original error ratio (7.7% for EQP and 9.4% for EQR). Among these results, EQR+PNDI was the best feature combination for FAL prediction by RF.
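The relative reductions quoted above follow directly from the reported error ratios. The check below uses only the numbers given in the text; the variable and function names are illustrative.

```python
def relative_reduction(before, after):
    """Share of the original error ratio removed by adding the PNDI."""
    return (before - after) / before


# (error ratio without PNDI, error ratio with PNDI) at 50 grown trees.
error_ratios = {"EQP": (0.0495, 0.0457), "EQR": (0.0445, 0.0403)}
reductions = {
    name: relative_reduction(before, after)
    for name, (before, after) in error_ratios.items()
}
for name, value in reductions.items():
    print(name, f"{value:.1%}")
```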
To evaluate the role of the PNDI in FAL prediction, the predictor importance estimate function built into RF was adopted. Predictors with many discrete values may have more chances to appear in trees than those with limited discrete values. To compare the importance of predictors on a fair basis, it was necessary to exclude the influence introduced by the number of discrete values that a predictor might have. Therefore, the predictors were made binary based on the median threshold, except for the PNDI since it was already a Boolean variable. The predictor importance estimates after this revision are shown in Fig. 4, but the corresponding new error ratio increased to 0.2499. Fig. 4 shows that the PNDI was estimated to be the second most important predictor after quiz scores, though its value was much smaller than that of quiz scores. Compared with the PNDI, exam scores and the number of replies had negligible predictive importance. Fig. 4 also shows that in an all-binary RF, the contribution of the quizzes was higher than that of exam scores or number of replies. This result implies that if a prediction error ratio of nearly a quarter (0.2499) was considered acceptable, then a rough prediction of a learner's FAL could simply depend on his or her quiz scores and PNDI.
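The median-based binarization step can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually used: the text does not say whether values equal to the median were mapped to 0 or 1 (the sketch maps them to 0), and the feature names and toy records are hypothetical.

```python
from statistics import median
from typing import Dict, List


def binarize_at_median(values: List[float]) -> List[int]:
    """Map each value to 1 if it lies strictly above the median, else 0.

    Assumption: ties with the median are mapped to 0.
    """
    threshold = median(values)
    return [1 if value > threshold else 0 for value in values]


# Hypothetical learner records (not the study's data).
features: Dict[str, List[float]] = {
    "quiz_score": [55.0, 72.0, 90.0, 64.0, 81.0],
    "exam_score": [40.0, 65.0, 88.0, 70.0, 59.0],
    "num_replies": [0, 3, 12, 1, 5],
    "pndi": [0, 1, 1, 0, 1],
}

# The PNDI is already Boolean, so it is passed through unchanged.
binary_features = {
    name: list(values) if name == "pndi" else binarize_at_median(values)
    for name, values in features.items()
}
print(binary_features)
```

After this step every predictor offers a single possible split point, so no predictor gains importance merely by having more distinct values.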

2) PREDICTING FAL WITH MULTINOMIAL LOGISTIC REGRESSION (MLR)
The error ratios of MLR were 0.0568 and 0.0430 for EQP+PNDI and EQR+PNDI, respectively. Therefore, EQR+PNDI was the best four-feature combination for FAL prediction by MLR. The corresponding parameters in (1) are listed in Table 3. Table 3 shows that the absolute values of the coefficients of the PNDI ($f_4$) for $\ln(P_{\mathrm{failed}}/P_{\mathrm{excellent}})$ were much smaller than those of the other features in the same feature combination. Table 3 also shows that the absolute values of the coefficients of the PNDI ($q_4$) for $\ln(P_{\mathrm{qualified}}/P_{\mathrm{excellent}})$ were the largest in the same feature combination. These two findings imply that the PNDI had a strong ability to distinguish excellent learners from qualified learners. In contrast, this ability was weaker when the differences between failed and excellent learners in summative and formative records were already significant.
There were mixed results concerning the goodness of fit of the logistic regression model. When the MLR was changed to a failed/not-failed binary logistic regression, Nagelkerke's $R^2$ [49], as a measure of explained variance, was equal to 1 for the feature sets regardless of whether they included the PNDI (as calculated by a verified custom-made Matlab script; see footnote 4). Notably, 1 is a reasonable value for Nagelkerke's $R^2$ (which would suggest that the PNDI did not contribute additional explained variance), but Nagelkerke's $R^2$ itself is not a widely accepted equivalent of $R^2$ for a nonlinear model. On the other hand, the built-in ''dev'' (deviance) output parameters in the official Matlab MLR function ''mnrfit'' with the input parameters EQR and EQR+PNDI were 590.7223 and 587.1782, respectively, indicating that EQR+PNDI reduced the deviance of the fit (corresponding to the sum of the squared deviance residuals). This result suggests that EQR+PNDI showed better goodness of fit than EQR alone.
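The size of the deviance reduction can be put in context with a likelihood-ratio comparison of the two nested models. The sketch below is not an analysis reported in this research; it is a hedged illustration that assumes a nominal three-class model, so that adding the PNDI contributes two extra coefficients ($f_4$ and $q_4$) and the deviance difference is referred to a chi-square distribution with 2 degrees of freedom. For 2 degrees of freedom the chi-square survival function has the closed form $e^{-x/2}$, so no statistics library is needed.

```python
import math

# Deviance values reported in the text for the two fitted MLR models.
DEV_EQR = 590.7223
DEV_EQR_PNDI = 587.1782

# Assumption: with 3 FAL classes there are 2 logit equations, so adding one
# predictor (the PNDI) adds 2 coefficients. This count is not stated in the text.
EXTRA_PARAMETERS = 2


def chi2_sf_two_df(x: float) -> float:
    """Survival function of the chi-square distribution with 2 degrees of freedom."""
    if x < 0:
        raise ValueError("x must be non-negative")
    return math.exp(-x / 2.0)


deviance_drop = DEV_EQR - DEV_EQR_PNDI
p_value = chi2_sf_two_df(deviance_drop)
print(f"deviance drop = {deviance_drop:.4f} on {EXTRA_PARAMETERS} df, p = {p_value:.3f}")
```

Under these assumptions, the comparison offers one conventional way to judge whether a deviance reduction of this size exceeds what an uninformative extra predictor would be expected to produce.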

3) SUMMARY OF THE ERROR RATIOS IN RF AND MLR
The error ratios in RF and MLR, with different feature combinations, are summarized in Table 4. The table shows that in RF, due to the PNDI, the EQP error ratio decreased by 7.7%, and the EQR error ratio decreased by 9.4% (both in relative terms). By contrast, in MLR, due to the PNDI, the EQP error ratio increased by 3.5%, and the EQR error ratio increased by 0.94%.
Overall, RF with EQR+PNDI had the lowest prediction error ratio (see Table 4). The effectiveness of the PNDI as a predictor of FAL may depend on specific machine learning techniques; for example, the PNDI may increase the prediction error ratio when MLR is the machine learning technique, but the increase might be negligible.

VI. DISCUSSION
In this section, the relevance of this research to related works and the limitations of the research are discussed.

A. DISCUSSION OF THE STUDY 1 RESULTS
Although RQ1 was similar to the research question addressed in [23], whose participants were secondary school students, the findings were inconsistent across the two studies.
Reference [23] found that the personal name disclosure group reported significantly higher achievement goals than the nondisclosure group in a peer competition-based science learning game but reported no significant difference in test scores between groups.
In our study of a non-game-based, competition-free online MOOC learning environment, Group D reported a significantly higher FAL than Group S. These contrasting results highlight the possibility that the effect of personal name disclosure is context-specific (a peer competition-based science learning game vs. a non-game-based, competition-free MOOC; secondary school students vs. mostly college students).
We speculated that the learning duration would be one reasonable explanation for these contrasting results. In [23], the learning game was played for a short duration, and the participants ''were only allowed to play each level once for a few minutes''. In [15], which observed that non-anonymous forums had a comparative advantage over anonymous forums for learners with high levels of introjected regulation, ''each participant spent around 125 minutes providing the data, including. . . , as well as the discussions (experimental group)''. Notably, the 125 minutes comprised three 40-minute lessons in the morning. Compared with this duration, the duration of ''a few minutes'' in [23] was rather short. Even so, [23] found that the personal name disclosure group reported significantly higher achievement goals than the nondisclosure group in a peer competition-based science learning game. Therefore, we could surmise that if the learning game designed in [23] lasted longer, perhaps with more rounds at each level (rather than just one round), so that non-anonymous learners would have opportunities to win back a round when the last round was lost by accident, a significantly higher average test score would likely be observed among non-anonymous learners. Since a MOOC course usually lasts for a few weeks, our speculation, i.e., that learning duration matters when the personal name is disclosed, seems to hold. However, this speculation needs to be further examined through empirical studies, for example, following the approach taken in [23] to invite non-anonymous and anonymous learners to participate in a multi-round, game-based learning environment.
In addition, despite the similar research questions, the two studies ( [23] and the present study) had enough methodological differences to warrant caution in drawing conclusions.
One notable feature of [15] was its experimental setting: a high school in an eastern U.S. state. Therefore, culture is not very likely to be the major factor that influences personal name disclosure. However, the experimental setting in [15] was not exactly the same as ours. Hence, it is worth exploring the same research questions in other MOOC platforms outside of China, for example, EdX.
Study 1 also found that at the few replies level, Group D had a significantly higher average final score than Group S. This observation supports one of the strengths of our work: through the examination of screen name preference, the observed values of course-related predictors do not have to be accumulated to a certain degree over time to make an accurate prediction, which extends the warning time.

B. DISCUSSION OF THE STUDY 2 RESULTS
First, the predictor importance estimate showed that the PNDI was the second-best predictor of FAL after quiz scores in a fair comparison of all-binary predictors. This result again suggests that it is worth considering screen name preference in predicting FALs.
Second, the best feature combination for predicting FAL with RF was the PNDI in addition to exam scores, quiz scores and number of replies. Since incompletion of quizzes and exams or low scores on exams and quizzes were already flags of obvious problems, the number of replies was also an effective predictor in addition to the PNDI in the early learning stages of the MOOC. Although the number of replies variable was not estimated to have predictive importance, as shown in Fig. 4, it was not necessarily useless in FAL prediction, as Study 1 showed that this variable had a significant interaction effect with the PNDI. Therefore, it is more accurate to say that compared with the predictive importance of the PNDI, the predictive importance of the number of replies was very minor.
Reference [25] indicated that the number of clicks in the course forum of an e-campus learning system had high importance in predicting low-engagement students. Since high engagement in [25] was defined as either excellent final results or the presence of both qualified assessment scores and above-average virtual learning environment activities, low engagement in [25] was likely equivalent to a failed FAL in our research. Reference [25] found that fewer clicks in the course forum predicted lower FALs. However, [25] did not verify that more clicks predicted higher FALs since ''learners could sometimes appear to be busy but failed to complete any learning tasks''. Meanwhile, [25] confirmed that some learners spent little time in the virtual learning environment but achieved high scores on their course assessments. Adding to their findings, we not only found that Group D tended to have a significantly higher average final score than Group S at the few replies level, as mentioned in Study 1, but also found that the combination of the number of replies and the PNDI could deliver a better prediction than the number of replies alone.
In addition, replying to questions is a kind of knowledge sharing behavior. The number of times a learner shared knowledge in a virtual community was found to be positively correlated with the accumulation of ''social capital'' [50]. Reference [51] interviewed 12 learners in a MOOC. Among these 12 learners, seven earned a certification from the MOOC. Among these seven learners, one felt no need to add this certification to her resume, but the other six either planned to include or had already included this certification in their resumes as proof of professional development or had even printed it out, framed it, and hung it on the wall. Furthermore, two learners who failed to earn the certification reported that they would probably do similar things when they earned certifications from their next MOOCs, showing their respect for the value of MOOCs. Although 12 interviewees constituted a small sample, both [50] and [51] showed that reputation was a strong potential motivator of the behavior of MOOC learners who earned a certification. However, we also found a discussion on Reddit (see footnote 5) that presented an opposite opinion, i.e., that one might not take MOOC credits very seriously, especially when one's personal academic degree is good enough. Therefore, would the PNDI be an effective variable to predict learners' expectations of MOOC learning or their motivations for MOOC learning? The answer to this question could help us infer the reason behind the phenomenon identified in Study 1. Although we have begun to explore this topic, addressing it fully was beyond the scope of the present research.
Recently, [52] examined early engagement and utilized machine learning to predict student outcomes at Bangor University. By defining a new descriptive statistic for student attendance at timetabled sessions and applying modern machine learning tools and techniques, the researchers were able to predict the student outcomes at the end of a full academic year as early as the third week (of the fall semester) with approximately 97% accuracy. Their work is enlightening. Although there are far fewer timetabled sessions in MOOCs than in offline campus courses, it would be very interesting to examine the association between screen name preference and MOOC homework/quiz response time, which is the time gap between the actual submission of a MOOC homework/quiz and the deadline. This response time can be used as a measure of learning enthusiasm, which will be examined in the future.
Finally, as described in Study 2, Group D did not have to accumulate as many participation events as Group S for a 100% accurate prediction to be made. This again validated the worth of using the PNDI, i.e., the ability to extend the warning time.
One note should be emphasized, even though it was previously mentioned in the introduction section: the personal name disclosed in a screen name was not necessarily the real name of the learner; it could be a pseudonym. However, unlike the serial-number-style screen name given by the system, such pseudonyms were chosen or even created by the learner. Therefore, the pseudonym was ''a reflection, at least partially, of the true self'' [53]. It is doubtful that a learner would have tried to impersonate another learner to earn a MOOC certification. First, there was much less external (e.g., economic) incentive than in an online game context (e.g., paying others to train an avatar to earn expensive virtual equipment), especially since MOOC certifications have not yet been widely accepted in job markets. Second, the MOOC itself was quite cognitively demanding and time-consuming. Therefore, the pseudonym itself did not influence our findings. In fact, an academic handbook on names and naming [54] categorized screen names (i.e., usernames) into nine common types: standard personal name, appearance, personality, occupation/hobby/activities, origins/nationality/residence, renowned persons or characters, nature, artifacts, and abstract phenomena. Therefore, considering the pseudonym situation, the PNDI is more like a preference indicator that exhibits the tendency of a learner to disclose his or her real name, or more accurately, the extent to which a learner discloses his or her ''true self''. In addition, it would be very interesting to investigate the impact of other types of screen names, e.g., personality screen names, on FALs.
Compared with the potential aforementioned concern regarding impersonation, a more serious concern was ''fake learners'', as reported in [55]. In [55], fake learners referred to users who abused the system to receive certificates with less effort, for example, by copying using multiple accounts (CUMA) or unauthorized collaboration (UC). Regarding CUMA, a learner tried to arrive at correct answers not by thinking intensively but simply by registering several accounts to have the maximum number of allowable attempts and then feeding the correct answers into a master account that would receive the credit. Hereafter, we use the term ''redundant account'' to indicate a ''fake learner''. For UC, several learners collaborated to determine an answer even when collaboration was not allowed, e.g., during quizzes or homework. UC can occur not only in online learning but also in offline learning, while CUMA can be used only in online learning. Although there are no universally accepted criteria on learning behavior, master accounts must be accurately distinguished from redundant accounts; otherwise, learning misconduct in a MOOC through the abuse of the online format could undermine the statistical significance of research observations if such behavior occurred frequently (the prevalence was 15.1% in [55]).
However, there was no such issue in our research. In our MOOC context, the redundant account strategy was not feasible since the course did not contain ''legal attempts'' at all. In our research, the course followed a somewhat traditional scoring approach: the system collected homework or quizzes submitted by students until a deadline, and after the deadline, it returned all scored homework or quizzes at the same time, showing the correct answers and explanations. In other words, the correct answers in the MOOC in our research became available only after all submissions had been scored at the same time. Therefore, there was no time window for redundant accounts to exploit. To offer a second chance for unprepared learners, the course in our research adopted a different policy with a purpose similar to a ''legal attempt'' but prevented abuse through the creation of a ''redundant account'': two distinct quizzes of the same reliability were offered at two set times, and only the higher score was recorded if a learner took both quizzes. Regarding UC, there was no low-cost measure to efficiently prevent it. However, even if UC occurred in our research, it would not be critical enough to undermine the statistical significance of our findings since most of the MOOC learners were strangers to each other. Strangers might easily collaborate on learning, but the honor code and moral pressure might make it very difficult for them to directly ask for answers to copy and paste.
A limitation of our research is the course that we chose. First, psychology can be seen as overlapping with the natural sciences and social sciences, and MOOC learners in different sciences may develop different personalities. Whether different courses affect the associations between personal name disclosure and FAL is unknown. Second, ''Introduction to Psychology'' is a course for beginners, and beginners do not often pose difficult questions. A more complicated context for a self-disclosure study would be a course in which difficult or even critical-thinking questions are often posed. Whether the difficulty level of courses affects the associations between personal name disclosure and FALs is also unknown. We will investigate these two issues in the future.

VII. IMPLICATIONS
Before the COVID-19 pandemic, MOOCs were only considered to be supplementary to traditional college courses. However, since COVID-19-like epidemics are predicted to occur more often in the future [56], MOOCs may become a major learning alternative during times of crisis [57], [58].
Before the COVID-19 pandemic, the majority of MOOC learners were found to be middle-aged adults (24-35 years old) in the U.S. [59] and young adults (17-26 years old) in China [60]. Due to the COVID-19 pandemic, the proportion of college students as MOOC learners may increase for social-distancing reasons, both in the U.S. and China. Unlike middle-aged MOOC learners who selectively aim to complete short, useful knowledge modules for career plans rather than entire MOOCs, many college students have used MOOCs to obtain academic credits during the COVID-19 pandemic. These college students not only expect to finish MOOCs but also expect to achieve good FALs. Therefore, the earlier a learner's eventual need for support is predicted, the more time is available to try different personalized strategies or teaching strategies before problems emerge, which leverages the advantage of diverse characteristics in MOOC learners.
A sufficiently long warning period is also important for MOOC learners who are active but eventually fail. In fact, according to our research, these MOOC learners can be identified very early since the majority of them (89.2%, see Table 1) do not disclose personal names. Specifically, there are three implications for MOOC stakeholders, as follows.
First, for MOOC learners who expect to achieve good FALs, we suggest that they consider disclosing their personal names in their screen names to generate additional motivation for self-regulated learning. Disclosing a personal name does not necessarily reveal one's full identity when no further personal information is disclosed.
Second, if an anonymous learner gradually decreases engagement in MOOC learning, then their failure prediction can be made much earlier than if their engagement frequency suddenly drops significantly. If this learner claims that he or she expects to earn a MOOC certification or academic credits at the beginning of the MOOC, then one possible motivating alternative for him or her would be receiving an e-mail at the moment when his or her engagement frequency starts to decrease asking whether he or she needs any special help. If the learner responds that it is difficult to catch up due to the course's pace, then an alternative version could be recommended. If the learner responds that he or she is often disturbed by something else or feels helpless, then we may inquire if he or she is interested in disclosing a personal name. If we can take all means to make a certification earner feel that he or she is taken care of, then he or she may have more chances to adhere to his or her primary goal [24]. Furthermore, all these suggested actions can be easily selected when the primary criterion is simple: whether the learner's screen name discloses a personal name.
Disclosing a personal name not only makes one push oneself harder in the MOOC learning context but also provides an unexpected benefit. Recently, [61] found that support-providers tended to show a higher level of politeness and person-centeredness (putting themselves in the target's situation, i.e., standing in the target's shoes) to support-seekers who disclosed personal names than to those who did not. In other words, when MOOC learners disclose their personal names in online forums, they may receive support messages of higher quality than those who do not.
Last, our findings are also helpful for teachers. If we think outside of the box and rethink our research based on Jaspers's philosophy (''Education is maieutic, i.e., it helps to bring the student's latent ideas into clear consciousness'' [62]), we may compare our finding (i.e., that learners disclosing personal names in their screen names tend to have higher FALs) to a symptom of childbirth. When a busy MOOC teacher prepares to begin interacting with students, how can he or she select a learner to begin a conversation with in the limited time available? Because learners' information is hidden from the teacher as much as possible due to privacy rules, learners' screen names remain the only source of public information that teachers have access to, besides the information that is available in learners' ''cold'' learning behavior records. Fortunately, our research results indicated that students who disclose personal names might respond to teachers' questions better than those who do not (please note the aforementioned discussion on scores as an exception). Therefore, if a teacher wants to maintain a social atmosphere, he or she might initially pose questions to students who disclose personal names (who are more likely to respond) and subsequently encourage more follow-up comments and questions from students who have not disclosed personal names.

VIII. CONCLUSION
In this research, we focused on MOOC learners' preference for personal name disclosure in their screen names as a predictor of their FALs at the end of the course. We conducted two studies, one to examine the associations between these two variables and one to demonstrate how to utilize such associations in MOOC FAL prediction.
We found that MOOC learners who included personal names in their MOOC screen names significantly outperformed other learners in their FALs. We also found that screen name preference improved FAL prediction accuracy utilizing natural language processing and proper machine learning technologies. This research indicates that classifying MOOC learners according to their screen name preference at the beginning of a MOOC may afford more time to provide personalized support strategies, which is one way to leverage the diverse characteristics of MOOC learners. The results of this study are potentially useful in developing an early intervention to provide different types of help to students who prefer to disclose personal names and those who do not. The practical effects of these interventions will be examined in the future. Furthermore, whether the course difficulty level or type of course affects the associations between personal name disclosure and FAL will also be examined in the future.