Could the Use of AI in Higher Education Hinder Students With Disabilities? A Scoping Review

Literature reviews on artificial intelligence (AI) have focused on the different applications of AI in higher education, the AI techniques used, and the benefits and risks of using AI. One of the greatest potentials of AI is to personalize higher education to the needs of students and offer timely feedback. This could benefit students with disabilities tremendously if their needs are also considered in the development of new AI educational technologies (EdTech). However, current reviews have failed to address the perspective of students with disabilities, which prompts ethical concerns. For instance, AI could treat people with disabilities as outliers in the data and end up discriminating against them. For that reason, this scoping review raises the following two questions: To what extent are ethical concerns considered in articles presenting AI applications assessing students (with disabilities) in higher education? What are the potential risks of using AI that assesses students with disabilities in higher education? This scoping review highlights the lack of ethical reflection on AI technologies and an absence of discussion and inclusion of people with disabilities. Moreover, it identifies eight risks associated with the use of AI EdTech for students with disabilities. The review concludes with suggestions on how to mitigate these potential risks. Specifically, it advocates for increased attention to ethics within the field, the involvement of people with disabilities in research and development, and careful adoption of AI EdTech in higher education.


I. INTRODUCTION
According to Russell and Norvig [1], the study of artificial intelligence (AI) refers to "the study and construction of agents that do the right thing" (p. 22). What the "right thing" means depends on how the objective is defined (ibid.). In the context of higher education, several reviews have identified different applications of AI [2], [3], [4], [5], [6], [7], [8]. Zawacki-Richter et al. [7] distinguish four uses of AI: 1) profiling and prediction, 2) assessment and evaluation, 3) adaptive systems and personalisation, and 4) intelligent tutoring systems. Overall, the use of AI for tertiary education is still experimental [4], and its adoption is at an early stage [9]. This is also reflected in the technologies used, as traditional AI techniques such as logistic regression or naive Bayes classifiers are more widespread than more advanced methods such as neural networks and genetic algorithms [6]. (The associate editor coordinating the review of this manuscript and approving it for publication was Meriel Huggard.)
Existing reviews also explore the benefits and risks of using AI in higher education. On the one hand, Ouyang et al. [6] argue that AI enables administrative staff and lecturers to make informed decisions based on predictions of student performance, learning status, or satisfaction. AI also provides students with learning recommendations, and it improves academic performance as well as online engagement and participation. These arguments are also supported by the conclusions drawn from the literature review by L. Chen et al. [10]. Similarly, Zawacki-Richter et al. [7] point out the possibility of facilitating admission decisions with predictions that let university employees focus on more complex cases. Moreover, the review by González-Calatayud et al. [3] suggests that students perform better with AI tutors than without.
On the other hand, the risks and ethical concerns of using AI are manifold. Some researchers stress the lack of pedagogical perspective, as the literature focuses on the technical aspects of developing AI applications [3], [6], [7], [11]. Zawacki-Richter et al. [7] emphasize the fact that technology is not a panacea, as well as the importance of supporting learners in a pedagogic sense. Furthermore, there are concerns about the lack of transparency of AI-based decisions [3], [6]. Some researchers also warn against biases in AI due to a lack of data diversity [6], [7], [12], [13]. There are also concerns about the privacy of students and security issues [7], [8], [12], [14]. Staff may also not be trained to use AI appropriately [3], [14]. Finally, students and higher education staff alike may not trust that AI-based proctoring systems make correct assessments [14].
Higher education institutions are seeking to make education more accessible, and AI could help achieve this goal. In 2006, the signatories of the United Nations Convention on the Rights of Persons with Disabilities pledged to guarantee that people with disabilities have access to tertiary education. From a socio-medical perspective, a disability is the result of the "interaction between health conditions [...] and a range of environmental and personal factors" [15]. This definition makes clear that focusing solely on the health conditions of individuals would be insufficient to improve accessibility. There is an increasing recognition that there is no one-size-fits-all approach to education. The Universal Design for Learning (UDL) approach recognizes that every learner has a unique way of understanding information, of being motivated, and of expressing their knowledge [16]. Therefore, higher education should be more flexible and tailored to the diverse needs and abilities of students. AI provides the personalization and adaptability that could enable people with disabilities to study at the tertiary level. For instance, students with impairments may benefit from a flexible course schedule with adapted learning paces, as attending courses requires a lot of effort and concentration [17]. However, these benefits can only be achieved if the needs of people with disabilities are considered in the development and adoption of AI EdTech. As Heiman et al. [18] note, failing to account for accessibility needs during technology development can lead to increased costs and lengthier processes for people with disabilities.
Unfortunately, existing literature reviews fail to provide a deeper analysis of the impact AI could have on students with disabilities. Very few papers mention people with impairments. While González-Calatayud et al. [3] briefly mention one study in their review that considered students with disabilities for an adaptive learning platform with self-assessment, they provide no further analysis. Nigam et al. [14] caution that AI-based proctoring systems, i.e. systems that monitor students during examinations to prevent cheating, can increase students' anxiety. Nevertheless, they do not consider how this may affect students with chronic anxiety or other mental health conditions. A recent systematic literature review found that knowledge about inclusiveness is limited and remains fairly new in learning analytics, as its authors could not find articles dealing with the topic published before 2016 [19]. The present study seeks to understand how the discussion of ethics and people with disabilities has been integrated into the recent literature. Specifically, it focuses on research articles presenting AI educational technologies (AI EdTech) that assess students to form or inform decisions in higher education. Assessment is defined broadly as "the process of gathering information and intervening in that information using some criteria in order to form a judgment" [20, p. 4]. For instance, an AI could assess a student's skills, interests, and preferences to recommend suitable universities [21].
Moreover, the field of AI EdTech needs to pay greater attention to ethics [2]. More specifically, discrimination is a serious concern in the use of AI [22]. According to Heinrichs [22], four aspects characterise discrimination: 1) individuals are treated differently than others, 2) the treatment is, or at least is believed to be, disadvantageous to those individuals, 3) the difference in treatment can be explained by their belonging to a specific group, and 4) the treatment is not in accordance with an established weighting of ethical concerns. With this moralized definition, Heinrichs [22] stresses that to judge a decision as discriminatory, one needs to consider how it violates ethical considerations towards a group of individuals. According to Morris [23], there are seven ethical concerns for AI and accessibility: inclusivity, bias, privacy, error, expectation setting, unfeasible simulated data, and social acceptability. Firstly, inclusivity raises questions about the effectiveness of AI technologies "for diverse user populations" [23, p. 35]. For instance, text correction is less likely to work for people with dyslexia [24]. Secondly, bias is a source of harm in machine learning and can lead to discriminatory treatment of different groups of people [25]. This poses a problem when it affects the right to education, which is guaranteed by the United Nations and by national legislation in countries like Switzerland. There are allocative harms, meaning that specific groups are systematically excluded from opportunities or resources that other groups receive, e.g.
a group of people is given lower chances of being admitted to a university (ibid.). There are also representation harms, meaning that some groups are represented negatively or lack a positive representation (ibid.). For instance, natural language processing (NLP) models associate mental illness with negative terms like gun violence, homelessness, and drug addiction [26]. Furthermore, as Trewin [27] explains, it is challenging to ensure algorithmic fairness for people with disabilities, since disabilities are diverse, can be multiple, and are usually treated as outliers in the data used to train algorithms. Thirdly, people with disabilities are more exposed to privacy issues because their disability can act as an identifying factor in an anonymised data set, or the AI may deduce a disability from the data, leading to involuntary disclosure [23]. Fourthly, "error" refers to the fact that "many people with disabilities need to trust and rely on the output of an AI system without the ability to verify the output" (ibid., p. 36). Fifthly, expectation-setting issues arise when the capacity of AI is marketed as greater than what it can actually deliver, leading to false promises for people with disabilities who rely on these technologies (ibid.). Sixthly, simulated data are difficult to create for people with disabilities and often unrealistic (ibid.). Consequently, AI evaluations need to involve people with actual impairments (ibid.). Finally, "social acceptability" revolves around the question of whether a technology is better accepted because of the disability status of the user (ibid.). For example, Google Glass may not be accepted by the general public due to privacy concerns, but its use to help people with visual impairments may receive more acceptance [28]. These seven ethical concerns identified by Morris [23] are crucial to understanding how AI use in higher education can potentially discriminate against students with disabilities. The European Commission [29] recommends involving
individuals from vulnerable groups, such as people with disabilities, and investigating potential harm to ensure the development of trustworthy AI. Yet, there is not enough research on AI fairness that considers the perspective of people with disabilities [25], [27].
Taking the current literature into account, this study explores two main research questions. The first question investigates the extent to which ethical concerns are addressed in articles that present AI applications in higher education, both in general and more specifically in relation to students with disabilities. Research Question 1: To what extent are ethical concerns considered in articles presenting AI applications assessing students (with disabilities) in higher education? This requires answering two sub-questions: a) What are the ethical concerns mentioned in articles presenting AI applications assessing students in higher education? b) To what extent are students with disabilities mentioned?
The second main research question of this study focuses on assessing the risks of discrimination posed by the AI applications presented in the literature. Special attention is paid to the input variables used in AI applications, since data are often the source of AI biases [25]. By examining the input variables, it is possible to identify potential measurement bias, i.e. bias that arises from measuring a concept with proxy data [30]. Furthermore, not all decisions are equally critical; there are low- and high-impact decisions [31]. High-impact decisions, such as hiring someone, can have a significant impact on individuals' lives and deprive them of their right to equal opportunities (ibid.). Low-impact decisions, such as a platform recommending a product to buy, are less consequential [32]. This distinction relates to the allocative harms of biases. For that reason, the type of decision taken by an AI in each article was extracted. Additionally, the degree of user control influences how algorithmic bias affects users, as they can be empowered to reject or refuse the decision of the AI [32]. Hence, this review examines who makes the decisions in AI applications.
Research Question 2: What are the potential discrimination risks in using AI that assesses students with disabilities in higher education? a) What are the input data of the AI EdTech? b) What type of decision is taken? c) Who is involved in the decision-making process and how?
Early results of this study were presented in a short paper at the European Workshop on Algorithmic Fairness 2023 [123]. This study contains the final results with a detailed discussion, as well as a description of the method. This paper is structured as follows: first, the methodology of this scoping review, based on the PRISMA-ScR method, is described. Second, the results of the review are presented. In particular, this section reports on how many articles discussed ethical concerns and the perspective of students with disabilities. The types of decisions made by AI, how humans are involved in the decision-making process, and the types of variables used for the AI are also reported in this section. Third, the discussion section highlights the lack of ethical considerations and outlines eight discrimination risks for students with disabilities. Finally, the conclusion summarizes the answers to the research questions, discusses the limitations of the study, and provides suggestions for future research.

II. METHOD: PRISMA-ScR
To review the literature systematically, the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) method was followed. PRISMA provides guidelines to guarantee the transparency, completeness, and replicability of the review to the fullest possible extent [33]. The method includes an extension for scoping reviews, as they ask broader questions than systematic literature reviews and thus not all reporting items are relevant to them [34]. This scoping review examines whether ethical concerns related to students with disabilities are considered during the development and use of AI in higher education. Then, looking at the descriptions of AI in the academic literature, this study assesses the potential risks these AI applications could pose for students with disabilities. In sum, this scoping review seeks to highlight gaps in the literature regarding ethical considerations for students with disabilities in AI-based educational technologies.

A. PROTOCOL AND SCREENING REPORT
A protocol was written before the start of the review and updated throughout the research process by the primary author.The screening report gathers the articles that were screened in this review and indicates the exclusion criteria for articles excluded in the second and third rounds.The protocol and the screening report are available on Zenodo.

B. ELIGIBILITY CRITERIA
Articles were selected following a set of inclusion and exclusion criteria that were updated during the search and selection procedure (Table 1). Eligibility criteria focus on two aspects of the articles: content and form. Content criteria include "context", "focus of the article", "type of technology", and "type of application". Formal criteria comprise "quality indicators", "language", "time period", and "availability".
The review included only studies in the higher education context, as students with disabilities face different barriers in higher education than in secondary education. For instance, while they still have the same needs as in secondary education, students with disabilities must self-advocate to receive accommodations in higher education [35]. Furthermore, to understand how current AI technologies could impact students with disabilities, this study required articles that focused on a comprehensive AI-based application. Publications that lacked a clear use case were deemed inadequate for assessing the potential future impact. Similarly, only AI-based systems that assessed students and took or informed decisions affecting them were included.
Rule-based systems were excluded from the review, since the focus was on the more modern approach with machine learning techniques. Short contributions (e.g. poster sessions, extended abstracts), non-original work (conference reviews, literature reviews), editorials, and books were excluded because they either did not provide enough details about an AI-based system or did not present a new one. Moreover, only articles written in English were selected, given the large number of publications on this topic and the fact that most scientific research on AI is published in English. Further, articles published before 2018 were not included because the AI field is developing rapidly, making newer technologies more relevant to investigate. More recent articles are also more likely to present improved AI-based systems, making older publications less relevant. Additionally, a recent systematic review found that the topic of inclusiveness and students with disabilities is fairly new in the literature on learning analytics, as its authors did not find any articles earlier than 2016 on the subject [19]. It can therefore be expected that recent articles account for this issue, as it has been raised within the research community. Articles that could not be accessed freely with the licenses of the authors' two universities were excluded; the two universities provide access to most journals, so purchase costs could not be justified.

C. INFORMATION SOURCES, SEARCH, SELECTION OF SOURCES OF EVIDENCE, FLOW CHART
Prior to searching for articles, the first author of the study performed an exploratory search of the literature in Web of Science and looked for existing systematic literature reviews. This preliminary phase helped define a search strategy and identify keywords. The final search syntax was the following (adaptations of the syntax due to database specifics are explained below); the three groups are coupled with the Boolean operator AND:
• Technology: "artificial intelligence" OR "machine learning" OR "deep learning" OR "natural language processing"
• Context: "higher education" OR "tertiary education" OR "undergraduate education" OR "graduated education"
• Type of application: "assessment" OR "recommendation" OR "adaptive learning" OR "monitoring" OR "admission" OR "education technology" OR "tutor*" OR "proctor*"
Then, the search and selection of articles was divided into three phases: 1) selection of articles in databases, 2) snowballing, and 3) additional review based on a reviewer's feedback. The process is presented in the flow diagram (Fig. 1).
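To make the structure of the syntax concrete, the three keyword groups can be assembled into a single Boolean string before being pasted into a database's advanced-search field. The following is a minimal illustrative sketch, not part of the review protocol; the helper name and grouping function are our own.

```python
# Sketch: assemble the Boolean search string from the three keyword groups.
# OR joins terms within a group; AND joins the three groups.

technology = ["artificial intelligence", "machine learning",
              "deep learning", "natural language processing"]
context = ["higher education", "tertiary education",
           "undergraduate education", "graduated education"]
application = ["assessment", "recommendation", "adaptive learning",
               "monitoring", "admission", "education technology",
               "tutor*", "proctor*"]

def or_group(terms):
    """Quote each term, join with OR, and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

query = " AND ".join(or_group(g) for g in [technology, context, application])
print(query)
```

In practice, each database accepts a slightly different dialect of this string, which is why the syntax had to be adapted per database, as described below.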
In the first selection phase, a total of 1,594 records were identified from four databases: Web of Science (1,089), Scopus (393), arXiv (12), and Google Scholar (100). For Web of Science, the parameters were the same as the general search syntax, except that the selected document types were "article", "proceeding paper", "book chapter", and "data paper". The search was performed from July 1-6, 2022. For Scopus, the keywords from the search syntax were used, restricted to the years 2018 to 2022, English only, and the document types article, conference paper, and conference review. The search was performed from July 22-26, 2022. For arXiv, the search syntax was reduced because the first search retrieved only three results. The reduced syntax was ("artificial intelligence" OR "machine learning" OR "deep learning") AND ("higher education" OR "tertiary education" OR "undergraduate education" OR "graduated education") AND ("assessment" OR "recommendation" OR "adaptive learning" OR "proctor"). The search was performed on July 20, 2022. For Google Scholar, the search was done with the defined search syntax and from 2018 onwards. Only the first 100 results ordered by relevance were screened, because Google Scholar performs a full-text search that yields far more potential results and offers no further filtering. Google Scholar was still included in the process to check that articles deemed relevant by the search engine had not been missed. The search was done on September 9, 2022.
In the first screening round, the first author reviewed the titles and abstracts of the 1,594 articles. Records were excluded when the titles and abstracts did not contain the keywords of the search syntax or when the keywords were used with a different meaning. For instance, deep learning can refer to an AI technique as well as a pedagogical concept of learning. Based on this, 1,252 records were excluded, and 79 duplicates were removed.
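The automatic part of this first-round rule can be pictured as a simple keyword filter over titles and abstracts. The sketch below is illustrative only (the record structure and the abridged keyword list are assumptions, not the review's actual tooling), and it deliberately shows why manual judgement remains necessary:

```python
# Illustrative subset of the search keywords, lowercased for matching.
KEYWORDS = ["artificial intelligence", "machine learning", "deep learning",
            "natural language processing", "higher education", "assessment",
            "proctor", "tutor"]

def passes_first_screen(record):
    """Keep a record only if its title or abstract mentions a keyword."""
    text = (record["title"] + " " + record["abstract"]).lower()
    return any(kw in text for kw in KEYWORDS)

# Hypothetical records: the second one passes the keyword filter even though
# "deep learning" is used in its pedagogical sense, so a human reviewer
# must still exclude it manually, as described above.
records = [
    {"title": "Deep learning for proctoring exams", "abstract": "..."},
    {"title": "Deep learning as a pedagogy", "abstract": "Study habits"},
    {"title": "Campus architecture trends", "abstract": "Building design"},
]
kept = [r for r in records if passes_first_screen(r)]
print(len(kept))
```

Only the third record is excluded automatically; the ambiguous second record illustrates the keyword-meaning problem noted in the text.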
Then, the first author read the introductions and conclusions to further reduce the selection based on the eligibility criteria. This second screening round was necessary because the eligibility criteria were revised after the first screening, as the scope was initially too broad. At this stage, the availability of the full text was also checked.
In the third round of screening, the first and fourth authors independently reviewed 105 articles in full. Subsequently, they met to discuss potential disagreements and find a consensus. At the end of this stage, 44 articles were selected for analysis.
In the second selection phase, the first author followed the backward snowballing technique to select relevant articles. Furthermore, to compensate for the exclusion of papers due to a lack of details (EX8), the reviewer searched for publications that cited the excluded article and/or that were authored by its first, second, or last authors. The same eligibility criteria used in the first phase of the review were applied. Thirteen articles were identified with the snowballing technique, and two more recent articles were included based on the EX8 follow-up search.
During the analysis of the 59 articles, the first author noticed that two pairs of publications presented the same AI application without significant changes to its design. To avoid inflating the results with multiple articles for one AI application, the less recent or less detailed articles were excluded (EX12).
After receiving feedback from a specialised academic journal, an additional literature review was performed to ensure that articles from important conferences and journals in the field of AI educational technologies were not missed. The following conferences and journals were reviewed: the AIED conference, the International Journal of Artificial Intelligence in Education, the Educational Data Mining (EDM) conference and journal, the Learning Analytics and Knowledge (LAK) conference, and the Journal of Learning Analytics. The first author screened the articles following the same criteria as in the initial search. For the EDM conference, retrieving articles with a keyword query was impossible, which explains why it was not included in the first review; therefore, all articles from 2018 to 2022 were retrieved manually. In total, 16 new articles were selected for analysis. Two articles were selected from the LAK conference that had not been captured by the query in the original databases. The other selected articles were published later than July 2022, the end date of the first review. One article was an extended version of an already selected article; the shorter article was therefore excluded following EX12. In total, 72 articles were analysed. The newly selected articles did not change the review's conclusions significantly. The only noticeable difference compared to the original selection was a higher frequency of consideration for privacy and transparency issues.

D. DATA CHARTING PROCESS AND DATA ITEMS
To extract data from the selected articles, a coding table (Table 2) was created and discussed among the authors of the study. Data extraction was performed by one reviewer using MaxQDA. In cases of uncertainty, a second reviewer was consulted to check the data extraction.

III. RESULTS

A. OVERVIEW OF THE ARTICLES
The 72 selected articles were sorted into four categories, adapted from the work of Zawacki-Richter et al. [7]: 1) Profiling and predictions, 2) Assessment and evaluation, 3) Intelligent tutoring systems, and 4) Recommenders. Fig. 2 summarizes the distribution of the types of application in the sample of articles.
An interactive dashboard was created to let readers filter the different articles that were coded for this analysis. This dashboard also fosters accountability for the results reported in the following sections.

B. MENTIONS OF ETHICAL CONSIDERATIONS AND STUDENTS WITH DISABILITIES
Fig. 3 shows the distribution of ethical concerns mentioned in the 72 analysed articles. A little less than half of the articles did not mention any ethical aspects. Among the 39 articles that mentioned ethical concerns, the majority considered privacy issues. Typically, authors indicated that data were anonymized or that privacy was guaranteed in accordance with regulations. Less frequently, authors explained the importance of guaranteeing privacy in the design of their applications. For instance, Robal et al. [36] argued that their system for detecting attention loss in online learning was privacy-aware, as data were stored on local machines. Similarly, Jia et al. [37] specifically investigated the problem that algorithms may generate private content. Others specified that they did not include data such as gender due to privacy concerns (e.g., [38]).
The second-most commonly mentioned ethical concern was transparency and explainability. Several authors particularly emphasized the need to make predictions explainable (e.g., [39], [40], [41], [42], [43]). For instance, Ortigosa et al. [42] chose an algorithmic model that enabled them to explain which features contributed to a student being at risk of dropping out.
Finally, 14 articles mentioned considerations to mitigate biases. For instance, some excluded data such as gender variables (e.g., [44]) or other demographic variables [45], while others checked that the dataset was not ill-balanced for variables known to create biases (e.g., [46]), or stated that models need to be re-trained periodically to reflect changes in the student population (e.g., [47]).
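As an illustration of the kind of balance check reported in these articles, one might compare group proportions in the training data before fitting a model. This is a minimal sketch with made-up data and an arbitrary threshold, not the procedure used in any of the reviewed studies:

```python
from collections import Counter

def check_balance(records, attribute, min_share=0.2):
    """Return group shares for an attribute and flag the dataset as
    balanced only if every group reaches a minimum share.
    The 20% threshold is chosen purely for illustration."""
    counts = Counter(r[attribute] for r in records)
    total = sum(counts.values())
    shares = {group: n / total for group, n in counts.items()}
    return shares, all(s >= min_share for s in shares.values())

# Hypothetical student records: 45/50/5 split across three groups.
students = [{"gender": "f"}] * 45 + [{"gender": "m"}] * 50 + [{"gender": "x"}] * 5
shares, balanced = check_balance(students, "gender")
print(shares, balanced)  # the 5% group falls below the 20% threshold
```

Notably, such a simple check also hints at the outlier problem raised by Trewin [27]: small groups, such as students with a particular disability, will almost always fall below any fixed balance threshold.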
Only four of the 72 articles mentioned students with disabilities [40], [48], [49], [50]. Among these four, Renzella et al. [48] and Nagy and Molontay [40] were the only ones to clearly address or discuss accessibility considerations. Renzella et al. [48] highlighted the importance of educators reaching out and engaging with students before introducing tutoring systems that may not be compatible with speech-generation technologies or that may increase anxiety by requiring students to record themselves. Nagy and Molontay [40] explicitly chose a colourblind-friendly colour palette for their application.
Xia [50] included disability as a variable to predict a final score and to identify learning behaviour profiles that need attention. Nevertheless, the author did not adequately define the variable and referred to students without disabilities as "learners with normal intelligence and good health" [50, p. 9]. The author also suggested that learners with disabilities require a different type of intervention to support online learning than students without disabilities. To make predictions about the probability of a student dropping out of their studies, Tsai et al. [49] included disability status in the variable "disadvantaged students", which also comprised students from low-income households and students receiving financial support. Furthermore, the authors suggested engaging with and carefully listening to at-risk students and emphasized the importance of not leaving anyone behind.

C. POTENTIAL RISKS
This review extracted information on decision type, decision-maker, and input data, which informs the discussion of the potential risks of using AI in higher education for students with disabilities.

1) DECISION TYPE
This subsection describes the types of decisions that the applications affect. For each kind of application, the decision types most likely to affect students' studies are discussed in greater detail.
Among the 72 articles, 18 presented an application to identify at-risk students. The output could be a probability of failing [49], [54], [58], [60], [63] or a categorical variable indicating whether the student was at risk [52], [55], [59], [61], [20]. Additionally, several researchers combined categorical variables with probabilities to present their outputs or added contextual information [42], [46], [47], [51], [56], [57], [59], [62]. This categorization of students is not trivial, as supplementary information to understand the predictions was seldom provided. Sometimes, researchers discussed the importance of input features for predictions, but they did not include this information formally in the output, i.e. the decision-makers did not have access to it [47], [61]. Van Petegem et al. [62] explicitly stated that teachers could use the feature weights to better understand why a student was predicted to pass or fail a class. Comparatively, the solutions by Hussain et al. [59], Ortigosa et al. [42], Figueiredo and García-Peñalvo [57], and Eagle et al. [56] provided teachers with contextual information such as a dashboard visualising the online activities of students in the course or profile descriptions of students.
Six articles presented an AI-based system that predicted student performance before admission. Three of the presented applications had a direct effect on admission selection: one provided a short-list of candidates for a graduate program based on their CVs [69], another modified the weights of admission criteria to select students with the characteristics that predicted higher performance at university [70], and the third predicted student enrolment to inform scholarship allocations [45]. The other three articles sought to inform admission decisions and strategic planning by highlighting skill or knowledge gaps and strengths [40], [71], [72]. Importantly, Nagy and Molontay [40] pointed out the self-fulfilling prophecy issue that arises when students are predicted to fail in a program. As a result, the authors advised against focusing on the prediction and encouraged using the predictors to understand how to increase students' chances of graduating instead.

c: RECOMMENDERS
Eight articles introduced recommenders for learning materials [38], [89], [90], courses [91], [92], topics to review [39], forum posts [93], or learning paths [94]. This type of decision has a rather low impact, as it simply provides a list of courses or materials that could be of interest to students. In comparison, five articles utilised AI to recommend higher education institutions or programmes to students [21], [64], [65], [66], [67]. These recommendations can influence students' decisions to apply to a specific programme or university, thereby affecting their future jobs and salaries. These applications provide a list of recommendations; some also added a percentage of success to each recommendation [65], [67].

2) DECISION-MAKER
This sub-section outlines how various stakeholders (students, lecturers, faculty staff) were involved in the AI-assisted decision-making process. Note that their involvement does not imply final decision-making authority; rather, they receive information from AI outputs to inform decisions. Table 3 gathers the number of articles that involved each stakeholder.
Across all applications, students and lecturers were equally frequently involved in the decisions informed by AI. However, the frequency diverged among application types. In profiling and prediction, the decision-makers were typically the lecturers or faculty staff, who received information from the AI system and then decided how to intervene or grade. For instance, for at-risk predictions, the most commonly proposed intervention was for the lecturer to send an email to the identified at-risk students, as seen in the works by Olivé et al. [61], Ciolacu et al. [52], and Le et al. [60]. Students were usually informed about AI outputs but rarely had a more active role in the process. For instance, Andrews-Todd et al. [73], Datta et al. [95], and Wang and Chen [82] presented applications that informed students about AI-based assessment without giving them further control over their evaluation.
Comparatively, some applications let students decide whether to use the application. For instance, Ciolacu et al. [52] highlighted that students voluntarily agreed to receive at-risk predictions. Hellings and Haelermans [58] made students the sole decision-makers by providing them with a learning analytics dashboard featuring their predicted grade. In [49], the students were not informed about the prediction, and therefore they were not technically involved. That AI application provided a list of students at high risk of academic failure to academic counsellors, who then informed teachers. However, academic counsellors and teachers were encouraged not to label students, to engage in a dialogue with them, to listen to them, and to let them express concerns before deciding on an intervention. Additionally, Robal et al. [36] stood out by developing an attention assessment tool that allowed students to self-monitor and that they could choose not to use. In comparison, the three other attention assessment applications [83], [84], [85] aimed to inform lecturers or faculty staff about students' attention so that they could re-engage them.
Furthermore, students were more often involved in recommenders and intelligent tutoring systems. When it comes to recommenders, students can decide whether to follow the recommendations. However, only Hur et al. [39] and Morsomme and Alferez [92] accompanied their recommendations with an explanation to aid student comprehension. Intelligent tutoring systems generally aimed at enabling students' independent learning by providing feedback and recommendations. For instance, both Kim et al. [99] and Vytasek et al. [105] presented an app that gives feedback on the writing quality of students' essays before submission. Similarly, the conversational agent presented by Rossi et al. [96] suggested learning materials to enhance students' comprehension; if these were unsatisfactory, the student could ask a tutor for help. The language tutor by Schlippe and Sawatzki [97] not only graded and gave feedback to students but also utilised peer review to support them, thus granting students some control over correct answers. Still, in several articles, lecturers could also oversee the students' progress [44], [95], [96], [101].
The involvement of AI was generally limited to providing additional information to lecturers, faculty staff, or students. For example, Rodriguez-Ruiz et al. [80] emphasized that their tool was not for automatic grading but complementary to the lecturers' assessment. Some also included human oversight to address inaccuracy or potential AI errors. For instance, Renzella et al. [48] presented an AI that verified students' identity before an online exam and emphasized the importance of involving tutors in reviewing the alerts to avoid any negative impact on students. Similarly, Sunaryono et al. [88] involved students in an AI-based attendance system by sending a confirmation that their attendance had been successfully recorded. However, there were a few instances where the AI took on a greater decision-making role. For instance, in [69], the faculty staff delegated the selection of candidates to the AI and only received a list of ''good'' candidates. Furthermore, two articles presented a knowledge and skill assessment tool that worked completely automatically and adjusted itself to students [75], [81].

3) INPUT DATA
This sub-section describes the data types used to train the AI models. Table 4 summarizes the distribution of input data types in each application category. University and professional qualification data and interaction log data were the most widely utilised across all applications. University and professional qualification data included information traditionally collected by universities, such as grades, the type of courses passed or failed, or prior school or university. This data type was particularly employed to identify at-risk students and to recommend learning materials, courses, or programmes. For example, Tsai et al. [49] used academic performance, student loan applications, the number of absences from school, and the number of alerted subjects in the first and second semesters to identify students at risk of dropping out.
Digitalization and the development of online learning enable universities to leverage more information on students' learning behaviour. The interaction logs encompassed variables related to the activities and features of an online learning platform, such as the number of views of an online activity or a forum, or the time spent learning on a platform, but also more active forms of action, such as the number of attempted quizzes, submissions, forum posts, or emails sent and received. This information was mostly employed in profiling and prediction applications. Furthermore, sociodemographic information (e.g., age, gender, hobbies, parents' occupation, financial aid) was often used together with interaction log data or university and professional qualification data to predict students' academic performance or interest in a programme. For instance, Obeid et al. [21] asked students to enter personal information such as gender, country of residence, preferred language, favourite hobby, role model in the family, parents' work domain, favourite/least favourite subjects, and previous school in order to recommend suitable universities.
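The interaction-log variables described above boil down to a simple feature-extraction step: counting event types per student in the platform's raw logs. A minimal sketch, with entirely hypothetical log entries and event names:

```python
# Hedged sketch: turn raw (student, event) platform logs into
# per-student interaction-log feature vectors.
from collections import Counter

# Hypothetical raw log entries: (student_id, event_type)
logs = [
    ("s1", "view"), ("s1", "view"), ("s1", "forum_post"),
    ("s2", "view"), ("s2", "quiz_submit"),
    ("s1", "quiz_submit"), ("s2", "view"), ("s2", "view"),
]

def interaction_features(logs):
    counts = Counter((student, event) for student, event in logs)
    students = {s for s, _ in logs}
    events = ["view", "quiz_submit", "forum_post"]
    return {s: [counts[(s, e)] for e in events] for s in students}

features = interaction_features(logs)
# s1 -> [2, 1, 1], s2 -> [3, 1, 0]
print(features)
```

The sketch also makes the later accessibility concern concrete: every count here presupposes that the student could actually perform the logged action on the platform.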
The use of video data was largely to assess students' attention, employing facial and eye recognition techniques, but also to assess students' comprehension, as seen in the work by Holmes et al. [75] and Shobana and Kumar [81].

A. TO WHAT EXTENT ARE ETHICAL CONCERNS CONSIDERED IN ARTICLES PRESENTING AI APPLICATIONS ASSESSING STUDENTS IN HIGHER EDUCATION?
In the literature presenting AI applications assessing students in higher education, ethical considerations are not systematic and remain superficial. This finding is in line with that of Zawacki-Richter et al. [7], who highlighted the lack of reflection on the challenges and risks of AI educational technologies and a lack of pedagogical reflection. If known ethical concerns are not addressed in the literature, AI EdTech risks impacting students negatively. Privacy is the most widely known and addressed ethical concern in the literature, which is probably due to the importance of data protection laws in recent years. However, privacy concerns are not consistently and deeply considered in the literature. This concurs with the findings of Khalil et al. [19] in the literature on learning analytics. Transparency and explainability are mentioned in only a quarter of the 72 selected articles. Similarly, bias is considered in only about 20% of the articles. Greater attention to these ethical aspects is essential because bias can lead to erroneous classification or even exclusion from a system. Furthermore, transparency can manage expectations and hold AI systems accountable for their outcomes. Explainable AI can also help users understand AI outputs, facilitating decision-making.
This scoping review also highlights the missing perspective of students with disabilities. AI EdTech holds the promise of personalizing higher education, but if designed without considering students with disabilities, it may end up serving only traditional students rather than the students universities are trying to support [106]. Moreover, there were no AI evaluations measuring the impact on students with disabilities. Thus, research presenting AI prototypes that assess students fails to account for the potential impact on students with disabilities. This is all the more problematic as students with disabilities are often underrepresented due to the difficulty of disclosing one's disability [107], and underrepresentation may lead to misclassification [108].
These findings suggest that while ethics and inclusion are discussed in the general field of AI EdTech (see e.g. [19], [108], [109], [110]), reflections on these topics do not seem to be integrated into the work of researchers who develop new applications. Creating inclusive and responsible technologies from the start is most beneficial, because it takes more time, effort, and money to remediate inaccessible technologies [18]. To address this mismatch, Selwyn [110] suggested including ethics as a mandatory subject in the curricula of data scientists and of those in charge of procurement decisions in higher education institutions. Researchers are also invited to report their efforts to address ethical concerns as well as the limitations of their AI-based applications regarding people with disabilities.

B. WHAT ARE THE POTENTIAL RISKS FOR PEOPLE WITH DISABILITIES TO USE AI IN HIGHER EDUCATION?
In total, eight potential risks of using AI in higher education for students with disabilities were identified:
• Risks associated with the choice of data:
1. Interaction log data
2. Background information: hobbies and financial aid
3. Text data
• Risks associated with the decision type:
4. Simplistic classification
5. Recommendations that do not consider accessibility needs
• Risks associated with the involvement of stakeholders:
6. Monitoring students' faces
7. AI as a decision-maker
8. Low student involvement
Each potential risk is discussed in the following sub-sections.

1) RISKS ASSOCIATED WITH THE CHOICE OF DATA
First, many researchers sought to take advantage of the availability of interaction log data. The advantage of this type of data is that it captures what students are doing rather than who they are. However, even this type of data is not free from potential bias. The accessibility of online platforms may influence the interaction logs of students with disabilities [106]. Interaction logs often included the number of times activities were viewed or the time spent on the platform. Barriers on online platforms, such as an illogical structure, improper labelling, low colour contrast, a lack of alternative text for visual information, or a lack of keyboard navigation support, may hinder or even completely prevent students with disabilities from using the platform. This may be particularly relevant for students with a visual impairment who use screen readers or magnification software and for students with a mobility impairment who navigate online with their keyboards or other alternative systems. Moreover, attention deficit hyperactivity disorder (ADHD) could potentially influence the number of interactions with online platforms, as students with ADHD might need to pause and review materials more often. It is, however, not clear whether or how students with ADHD might have different interaction logs than students without ADHD, as no research on this topic could be found.

Second, the use of background information such as hobbies or financial aid can potentially discriminate against students with disabilities. Many hobbies are not accessible to people with disabilities. For instance, Steinhardt et al. [111] identified several barriers to children's participation in hobbies, such as the children's own preferences and competences, their social skills, the lack of adapted activities (especially in rural areas), or negative attitudes among peers. These barriers can affect people with any type of disability, even if they are likely to be affected differently. Financial aid may also be associated with disability status. Students with disabilities often have lower incomes [112]. During the COVID-19 pandemic, students with disabilities in the United States were more likely to experience greater financial hardship [113]. Many countries also provide financial aid to students with disabilities to facilitate access to higher education because of the higher costs of living with a disability [114]. Hobbies and financial aid were used in applications for admission (recommendations and performance predictions) and at-risk predictions. Especially in the case of admission, researchers must check that these variables do not hinder the opportunities of students with disabilities. This is crucial because incorrect recommendations could create a feedback loop that perpetuates a situation where the same groups of students are consistently funnelled into certain universities.
Third, text data were often used to assess students' knowledge. Natural language processing techniques may be unable to analyse or retrieve information from texts written by students with cognitive or intellectual disabilities who, for instance, may misspell words due to dyslexia [24]. Consequently, students with disabilities may end up excluded from systems that automatically analyse texts, and feedback may be more prone to errors. Moreover, Guo et al. [24] argued that ''conversational agents may not work well for people with cognitive and/or intellectual disabilities, resulting in poor user experience'' (p. 4). They advised training the AI with diverse data to overcome this issue. However, considering that the perspective of students with disabilities is largely ignored in the literature, conversational agents are unlikely to be adapted to their needs. Guo et al. [24] also suggested providing different modes of communication such as sign language, pictures, or icons, but this possibility was never presented in the selected articles.

2) RISKS ASSOCIATED WITH THE TYPE OF DECISION
Fourth, risk predictions often relied on numerical and categorical classifications, which are simplistic and lack information on why students are at risk. Students with disabilities may require specific interventions. For instance, if the problem is that the platform is not accessible to a screen-reader user, then the intervention should be to make the platform accessible to that student. Likewise, if a student with dyslexia struggles to demonstrate their knowledge through written exams, the exam format should be modified. This undermines the claim that AI personalizes higher education. One solution is to provide results that explain why a student is at risk and what could be improved. For instance, Van Petegem et al. [62] specifically chose an algorithmic model that allowed for interpretability of the at-risk predictions. Additionally, intervention systems can also be centred on talking and listening to students to understand them better, as was the case in [42] and [49]. Another solution is to investigate whether some modules have accessibility issues by comparing dropout rates and feedback between disabled and non-disabled students, as suggested by Cooper et al. [109]. Still, although their method sounds promising, it raises privacy concerns, as students may not be willing to disclose their disability to inform predictions of their performance.
Furthermore, students with disabilities may systematically end up in the at-risk category. A study on the OULAD dataset, which gathered student data from a virtual learning environment, found that students with disabilities were underrepresented, which in some cases led to a higher probability of being classified as at risk [108]. One may argue that it is not problematic for students to be put into the ''at risk'' category, as these students then receive more attention from their lecturers. However, students with disabilities are often stigmatized and expected to perform poorly [35], [115], [116].
Thus, these categorisations may reinforce existing biases. Conversely, it is also possible that they could ease the problem of self-advocacy in higher education, where students with disabilities must disclose their disability and ask for specific accommodations to study. In particular, many students with invisible disabilities such as ADHD, chronic fatigue, or autism find disclosure emotionally difficult [35], [115], [116]. Consequently, more research is needed to understand whether students with disabilities are likely to be classified as at risk. Additionally, researchers should seek the opinions of students with disabilities on these systems. These students may be willing to accept an erroneous classification as ''at-risk'' if it results in teachers becoming more proactive in addressing their specific needs.
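The mechanism behind this risk can be illustrated with a fully synthetic sketch (the numbers below are invented and not drawn from [108] or any dataset): a classifier trained mostly on non-disabled students learns that few page views signal failure, and then flags a passing screen-reader user whose logged views are low only because of platform barriers.

```python
# Hedged, entirely synthetic illustration: weekly page views, where
# label True = failed. The training data is dominated by non-disabled
# students for whom low views genuinely signalled failure.
train = [(40, False), (35, False), (50, False), (45, False),
         (5, True), (8, True), (3, True), (6, True)]

def nearest_centroid_at_risk(views, data):
    """Toy nearest-centroid classifier: flag a student as at risk if
    their view count is closer to the failing group's mean than to
    the passing group's mean."""
    fail_mean = sum(v for v, failed in data if failed) / sum(f for _, f in data)
    pass_mean = sum(v for v, failed in data if not failed) / sum(not f for _, f in data)
    return abs(views - fail_mean) < abs(views - pass_mean)

# A screen-reader user on a poorly accessible platform: only 7 logged
# views per week, despite actually passing the course.
print(nearest_centroid_at_risk(7, train))  # True: misclassified as at risk
```

The misclassification arises not from the student's ability but from a feature whose meaning differs across groups, which is why comparing interaction patterns across disabled and non-disabled students, as discussed above, matters.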
Fifth, the recommendation systems used for school applications and learning materials may fall short of promoting inclusivity and providing personalized recommendations as advertised. Despite their ability to suggest schools or courses that align with a student's academic performance, these systems lack personalized information on accessibility. To enhance inclusivity, recommendation systems could provide details on the availability of disability offices and overall university accessibility. Similarly, for course and learning path recommendations, it would be helpful to thoroughly consider the accessibility of learning resources. Nonetheless, further research is needed to identify the recommendation features that students with disabilities would find useful.

3) RISKS ASSOCIATED WITH THE INVOLVEMENT OF STAKEHOLDERS IN THE DECISION-MAKING
Sixth, attention assessment raises concerns about continuous surveillance [14]. This holds true for all students, but especially for students with anxiety, who may feel under pressure, which raises barriers to succeeding in higher education. Students with ADHD may also be systematically flagged and thus penalized. In that case, it is critical to let students decide whether they accept being monitored, as in the case presented by Robal et al. [36]. Moreover, facial recognition may not work for students with unusual facial features or students with visual impairments, e.g., if they wear sunglasses [24]. Furthermore, Raji et al. [117] warned against AI systems that cannot function because it is conceptually impossible to make inferences from certain inputs; the authors gave the example of AI using physical appearance to infer personality traits [117]. While emotion recognition was not really employed in the literature, two cases [75], [81] applied face and eye recognition to infer students' real knowledge. It is, however, questionable whether there is a well-founded causal link between someone's facial appearance and their knowledge.
Seventh, at the moment, AI EdTech is designed to support and aid lecturers or faculty staff rather than replace them. However, it is conceivable that future developments in AI could delegate more decision-making responsibilities to technologies. For example, predictions regarding students who may be at risk are currently only meant to inform humans, who ultimately decide on the necessary interventions. Nonetheless, there are indications that automation may be possible in the future, such as through the sending of automated emails. Efforts to develop autonomous systems are also evident in the creation of intelligent tutoring systems. These systems offer students the opportunity to learn at their own pace online, which is particularly beneficial for students with disabilities since it helps to eliminate barriers [17]. Nonetheless, as the involvement of humans is reduced, students with disabilities may not have the possibility to request course adaptations. Furthermore, they usually cannot provide feedback to AI EdTech, i.e., the system cannot learn from them [107]. Higher education institutions need to maintain a certain level of flexibility to make reasonable accommodations. Still, it should be remembered that even when humans are involved, AI EdTech needs to be regularly reviewed to avoid automation bias, i.e., the human tendency to believe AI outputs [118]. Considering the lack of staff training [3], [14], inclusion training is crucial to ensure the rights of students with disabilities pursuing higher education.
Finally, the involvement of students was in most cases limited to receiving information on their chances of success and learning progress. While this may be because lecturers and university staff are the traditional decision-makers in higher education, empowering students with information and decision-making power could be most beneficial. AI can be employed to assist students rather than control them. For instance, Robal et al. [36] presented a tool for students to detect their attention loss during video lectures, which they could choose to disable. While this has not been tested, their tool could serve as an assistive technology for students with ADHD. An assistive technology is defined as ''any item, piece of equipment, or product system, whether acquired commercially off the shelf, modified, or customized, that is used to increase, maintain, or improve functional capabilities of a [person] with a disability'' [119]. Technologies that may be considered optional or a luxury for students without disabilities can prove highly beneficial for students with disabilities. A prime example is the use of writing assistance tools like Grammarly, which can enable students with learning disabilities to write creatively. In the literature, only Kim et al. [99] presented a similar tool to improve argumentation quality. Focusing on developing AI-based educational technology that assists students, rather than just instructors, may be a crucial factor in advancing inclusion in higher education.

V. CONCLUSION
There is a clear lack of ethical consideration for students with disabilities in articles presenting AI EdTech that assesses students in higher education. A little less than half of the 72 selected articles did not address ethics at all, and those that did focused mainly on privacy, transparency/explainability, or bias. Furthermore, the perspective of people with disabilities is largely missing.
This scoping review identified eight discrimination risks associated with the use of AI EdTech for students with disabilities, particularly emphasizing the potential for bias and exclusion. This raises concerns about the adoption of AI EdTech in higher education, as it may hinder efforts towards greater accessibility and inclusion in this sector. It is important to note that the identified risks do not imply that AI EdTech should not be used in higher education to inform or take decisions affecting students. Nevertheless, it is crucial that AI EdTech is developed with accessibility and ethical concerns in mind to ensure that it benefits everyone. Additionally, human oversight and active efforts to incorporate the perspectives of students with disabilities are necessary to address these ethical concerns and promote greater accessibility and inclusivity.

A. THEORETICAL IMPLICATIONS
This study contributed to the field of AI ethics by identifying risks that are specific to the use of AI in higher education for students with disabilities. This publication aimed to spark a discussion within the research community about considering the diversity of students and their perspectives. While this article focused on students with disabilities, the risks may also impact other groups, such as students from a lower socio-economic background, of different genders, or of diverse ethnicities. For instance, the input variable ''hobby'' can also correlate with gender and socio-economic background and thus raise issues similar to those for disability. Accounting for these risks is therefore crucial for more than students with disabilities. Future theoretical research could adopt an intersectional approach. For instance, it could investigate how disability status relates to gender and race in higher education and how this affects degree completion.

B. EMPIRICAL IMPLICATIONS
Empirical research is needed to test to what extent the identified risks impact students with disabilities. Several research questions can be raised, for instance whether and how the interaction log data of students with disabilities (e.g., ADHD, visual impairment, and mobility impairment) differ from those of students without disabilities, and to what extent this affects predictions. Future research should also investigate the effectiveness of solutions to mitigate the risks. Qualitative research could gather the opinions of students with disabilities to understand their preferences and risk tolerance if AI reduces barriers in higher education. Students with disabilities could also provide valuable information on features that could ease their access to higher education degrees.

C. PRACTICAL IMPLICATIONS
This review is a call for researchers developing AI EdTech to integrate ethical considerations into their research. These can be included when justifying the choice of technological design or at the end of articles when discussing future research and potential implementations. Doing so not only signals that risks have been taken into consideration, but also encourages the community to develop responsible AI EdTech.
Developers and public procurers are invited to keep in mind the commitment of higher education institutions to greater inclusion before acquiring or buying AI EdTech. EdTech that addresses and mitigates ethical concerns is to be favoured. For instance, systems using text data should be trained on texts written by students with cognitive and intellectual disabilities, and conversational agents need to provide different modes of communication [24]. Moreover, practitioners need to keep in mind that technology is not always the solution to mitigate risks. Education is a complex phenomenon, which means an effective intervention is not necessarily the most efficient one in terms of resource management [120]. To make the most of technology, public institutions need to adopt a holistic vision that draws on both human and technological strengths [120].
Additionally, students with and without disabilities may benefit more from AI EdTech that assists them rather than controls or monitors them. Developers and public procurers are therefore encouraged to pursue this goal. The release of ChatGPT has opened many opportunities to develop personalised assistive AI EdTech.

D. LIMITATIONS
This scoping review has several limitations. First, for most of the review process, there was only one reviewer. To reduce selection bias, a second reviewer was involved in the third screening of the articles, and the two reviewers discussed issues until consensus was found. In the other steps, the second reviewer was consulted whenever the main reviewer had doubts. Second, the review only includes articles written in English. This language restriction was established due to the quantity of existing papers and the fact that most scientific research on AI is published in English. Third, some researchers published their progress on an AI-based system across several papers, so a discussion of ethics or further tests may appear in a paper that was not included. However, this limitation is mitigated by the fact that papers with few details were excluded from the analysis and that newer, more detailed research was sought; during the screening, this did not seem to be a recurring issue. Fourth, the analysis could not systematically differentiate between types of disabilities due to the lack of empirical analysis on the subject and the diversity of disabilities. Still, this study emphasized as much as possible the types of disabilities that could be affected by the risks raised in the analysis. Finally, this study focused on articles presenting AI applications that assess students and inform or take decisions affecting them. Yet AI EdTech can also be used to assist students without analysing them. For instance, automatic captions can help students with disabilities such as hearing impairments, dyslexia, or ADHD follow and take notes in a lecture [121]. Similarly, ChatGPT can be used to summarise or simplify texts, thereby facilitating communication and learning for people with disabilities such as dyslexia or speech impairments [122]. While this type of use was out of the scope of this study, readers should bear in mind that AI can be employed as assistive technology to promote inclusion.

FIGURE 1. PRISMA flow diagram adapted from [33]. * = The total number of excluded reports does not equal the sum of all exclusion reasons because some articles had more than one reason to be excluded.

FIGURE 3. Distribution of ethical concerns mentioned in the 72 selected articles.

TABLE 1. Eligibility criteria of the scoping review.

TABLE 2. Coding table of the review. Distribution of application types in the 72 selected articles.

TABLE 3. Involvement of AI, student, lecturer, and faculty in the 72 articles. Cells in darker blue indicate the human stakeholder that was most involved in the respective application types.

TABLE 4. Distribution of input data types in AI-based applications. Cells highlighted in darker blue indicate the data type that was mostly used for the respective application types.