A scoping review of the “at-risk” student literature in higher education

Institutions’ inclination to fulfil the mandate of producing quality graduates is overwhelming. The insistent call for institutions to understand their students is about creating equitable opportunities for diverse student bodies. However, “at-risk” students are ubiquitous. This article reports a scoping review of literature published locally and internationally that sought to understand “at-risk” students in higher education. The study examined the aims, participants, variables, data analytics tools, and methods used when the topic of “at-risk” students is studied. Broadly, we sought the bigger picture of what matters, where, when, why, and how. The Population, Concept, and Context (PCC) framework was used to demarcate appropriate literature for the concept and context of “at-risk” students. The JBI protocol was chosen for selecting relevant literature published between 2010 and 2022, searched from the EBSCOhost and ScienceDirect databases. A search tool was developed using the litsearchr R package, and screening was guided by the PRISMA framework. Although 1961 articles were obtained after applying the search criteria, only 84 articles satisfied the stipulated inclusion criteria. Although Africa is lagging, research on “at-risk” students is growing rapidly in America, Europe, and Asia. Notably, relevant articles use academic data to understand students at risk of dropping out or failing in the first year. Statistical and machine learning methods were often preferred. Most factors that determined whether a student is at risk of failing or dropping out were found to be highly correlated knowledge. Also, being “at-risk” connoted one’s geographical context, ethnicity, gender, and academic good time management,


Introduction
Research on "at-risk" students requires a holistic understanding of the key term, "at-risk" student. Furthermore, it is necessary to conduct an in-depth scoping review of this knowledge domain. This is because the term "at-risk" student should be contextualized and grounded in the factors that distinguish such students from the rest. That understanding may allow institutions to better prepare for new student cohorts in terms of both the necessary resources and infrastructure. At the same time, that understanding may also provide clearer criteria for inclusion and exclusion of the literature that may propel further research in this knowledge domain. Additionally, a broad understanding of the term "at-risk" student may provide hints on appropriate search strategies for related literature, as well as elucidating apt mechanisms for screening suitable studies for the scoping review. In this context, a scoping review is a synthesis of research that aims to map the literature on a topic by identifying the population of articles, the key concepts, and the context of the knowledge domain [1]. Correspondingly, scoping reviews expose evident gaps in the knowledge domain while pinpointing the common characteristics of the evidence, towards informing practice and policymaking.

bioRxiv preprint doi: https://doi.org/10.1101/2022.07.06.499019; this version posted July 7, 2022. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Our understanding of an "at-risk" student is one who would likely drop out, stop out, burn out, or fail to complete a study programme [3] in higher education. In this context, a dropout is a student who permanently quits studies without attaining the intended qualification [9]. On the other hand, a stop-out is a student who temporarily discontinues studies with the hope of re-registering at a later stage [9]. By contrast, burning out is a situation where a student responds to chronic stress through emotional and physical exhaustion characterized by low productivity [14]. Finally, failing is a situation where a student persists through a study programme without achieving the performance required to pass [9]. Understanding the broad literature that characterizes "at-risk" students may inspire focused research on student success.

The higher education literature continues to emphasize early intervention as the preeminent way to save "at-risk" students. Evidence supports the premise that identifying an "at-risk" student early simplifies the identification of the barriers which the student needs to overcome [2]. In fact, the implementation of individualized support programmes increases the probability of student success, especially when the causal factors for being at risk are correctly identified in time. Moreover, the use of individualized support programmes such as student counselling or peer tutoring allows the sharing of "at-risk" students' specific risk information, which can facilitate proper and timely intervention [39] at a lower cost [40]. As a result, when attempting to understand "at-risk" students, the focus should be on getting to know the student before attempting to solve the underlying challenges. Although direct intervention programmes dominate the list of remedies for being at risk, some literature regards indirect interventions as equally important, such as the proper sequencing of courses and the logical arrangement of the content covered in the courses that put students at risk [

Given the propensity to boost student success rates, and the potential benefits of proactive identification of "at-risk" students, most institutions are shifting focus to students' data for insights. It is our hope that reframing and expanding the concept of an "at-risk" student from data, and gaining a better understanding of the underlying scope of work in this knowledge domain, would create equitable opportunities for students while also advancing institutional roles in effectively addressing the elements that put students at risk. This scoping review synthesizes research evidence within the "at-risk" student knowledge domain with the goal of mapping the broad concepts to the likely interventions, emphasizing variability in the stated aims, research design strategies, the population of participants, methodological standards, and the reported findings. An especially important aim is to fully understand the data upon which the evidence provided is based.

Three objectives summarize this scoping review, presented in the following sequence: (a) to identify articles that present prevalent categories of "at-risk" students in the higher education context; (b) to investigate the prevalent aims, data analytics tools, common participants, variables, and methods used when the topic of "at-risk" students is studied; and (c) to analyze the articles that meet the inclusion criteria to obtain a broader picture of what matters, where, when, why, and how the problem of the "at-risk" student has been tackled in the past. Achieving these objectives may give insights to guide further studies aimed at bringing about change and social justice in higher education.

Research questions
Three questions are asked in line with the objectives: (a) Which articles tackled the "at-risk" student problem in the higher education context? (b) What were the aims, data analytics tools, participants, variables, and methods used in tackling the problem? (c) What mattered, where, when, why, and how was the "at-risk" student problem addressed? Answers to these questions may provide insight for further research to guide practice and propel data-driven institutional planning.

The rest of the article proceeds as follows. The next section describes how the PCC framework fits into this study. The PCC framework guides the selection of the population of articles that befit the concept and context of the study. The methods we followed in completing the study are presented thereafter, emphasizing the inclusion and exclusion criteria, search strategy, screening procedure, and how the summaries were drawn. The results, which report the distribution of articles, follow, before the conclusion highlights the contributions and directions for further studies.

The PCC framework
This scoping review categorized articles on the "at-risk" student in higher education. An appropriate search strategy for articles published on this topic was proposed. In this case, we adopted an a priori model known as the PCC (Population, Concept, and Context) [...] open population of articles. It would imply that all articles that mention the concept of an "at-risk" student may be included.

However, the inclusion criteria define the boundary of articles that fit into the desired population, concept, and context of the study. Precisely, the key concept remained the "at-risk" student. This is a broad concept that could cover any kind of article that mentions the term "at-risk" student. However, the PCC framework was used to contextualize the concept of "at-risk" students through a clearly defined search strategy that stipulated how the relevant articles were selected and screened, bearing in mind the higher education setting. Also, the concept of an "at-risk" student has been left open regarding the sources of evidence, which may come from anywhere, including articles where students may be at risk of dropping out, stopping out, burning out, or failing. This scoping review demarcated the concept of an "at-risk" student to comprise dropouts, stop-outs, burn-outs, and failing students from the higher education perspective. The methods section details the population and the type of evidence considered in characterizing the concept and context of this study. Results were reported using figures and charts that depict the distribution of articles categorized by year, region, aim, participants, methods, data analytics tools, and findings.
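The categorization of articles by year, region, aim, participants, methods, and data analytics tools can be illustrated with a small tallying sketch. This is only an illustration under stated assumptions: the `ChartedArticle` record, the `distribution` helper, and the three sample rows are hypothetical and are not drawn from the reviewed articles.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ChartedArticle:
    """One row of a hypothetical data-charting sheet (field names are assumptions)."""
    year: int
    region: str
    aim: str
    participants: str
    method: str
    tool: str


def distribution(articles, field):
    """Count how many articles share each distinct value of one charted field."""
    return Counter(getattr(article, field) for article in articles)


# Made-up records for illustration only; not taken from the review.
SAMPLE = [
    ChartedArticle(2015, "Europe", "predict dropout", "first-year", "quantitative", "statistics"),
    ChartedArticle(2019, "Asia", "predict failure", "first-year", "quantitative", "machine learning"),
    ChartedArticle(2019, "America", "identify factors", "all students", "survey", "statistics"),
]

by_year = distribution(SAMPLE, "year")
by_tool = distribution(SAMPLE, "tool")
```

Each `Counter` can then be passed to any plotting routine to produce a distribution chart per category.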

[...] deep inner type of the study was not of interest. Therefore, review articles, conceptual papers, theoretical articles, as well as empirical quantitative and qualitative studies all qualified. An iterative approach which allowed repeated refinement of the inclusion and exclusion criteria was adopted. Thus, articles went through several screening rounds before the final list of relevant literature was generated. Disputed articles were resolved by consensus after round-robin reviews by the research team members. Sometimes, detailed manual scrutiny of the full texts of the articles was used as a last resort.

[...] administered [10]. Such bias would undermine reproducibility because it would be hard to recall the procedure followed in each selection of a comprehensive set of concepts. The following search query was used to mine the relevant articles.

The validity of this search query was verified with the help of an experienced librarian. Content experts in the field of student success were also consulted to triangulate the search strategy and to enhance rigour and reliability. Here, content experts were a valuable resource for finding literature that was hard to identify through other means. The second step was the actual search process, where the search query was executed following the directions from content experts. The final step focused on scrutinizing the list that passed the inclusion criteria for any outstanding patterns.
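To illustrate how a Boolean search string can be assembled from groups of related concepts, the following Python sketch joins synonyms with OR and concept groups with AND. It is not the query used in this study and not the litsearchr implementation; the `build_query` helper and every term in `GROUPS` are assumptions chosen for illustration.

```python
def build_query(concept_groups):
    """Join synonyms with OR inside each group and join the groups with AND."""
    clauses = []
    for terms in concept_groups:
        quoted = [f'"{term}"' for term in terms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Hypothetical concept groups: one for the concept, one for the context.
GROUPS = [
    ["at-risk student", "dropout", "stop-out", "burnout", "academic failure"],
    ["higher education", "university"],
]

query = build_query(GROUPS)
print(query)
```

Keeping the groups as data makes each revision of the query easy to record, which supports the reproducibility concern raised above.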

Screening of included articles
The standard procedure to verify scientific material is manual screening. Generally, such screening can be split into several steps, including screening articles by title, screening by abstract, or screening by reading the full text. The revtools [16] R package, which supports evidence synthesis, was used for the first round of screening. This tool de-duplicates bibliographic data using titles and abstracts. It also visualizes articles using topic models, allowing articles [...]
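As a rough illustration of title-based de-duplication, the Python sketch below keeps the first of any titles whose normalized forms are nearly identical. It is not the revtools algorithm; the normalization rules, the 0.95 similarity threshold, and the sample titles are assumptions made for this example.

```python
import re
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase the title, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def deduplicate(titles, threshold=0.95):
    """Keep the first title of every group of near-identical normalized titles."""
    kept = []
    kept_normalized = []
    for title in titles:
        norm = normalize(title)
        is_duplicate = any(
            SequenceMatcher(None, norm, seen).ratio() >= threshold
            for seen in kept_normalized
        )
        if not is_duplicate:
            kept.append(title)
            kept_normalized.append(norm)
    return kept


# Hypothetical titles; the second differs from the first only in case and punctuation.
TITLES = [
    "Predicting student dropout in higher education",
    "Predicting Student Dropout in Higher Education.",
    "Burnout among first-year university students",
]

unique = deduplicate(TITLES)
```

Automated matching of this kind can miss duplicates whose titles differ more substantially, so a manual check during full-text review remains useful.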

A standardized data extraction template that followed the PRISMA-ScR format was created as part of the data charting process. The key results reported and analyzed in this scoping review were the population of articles that met the inclusion criteria for the concept of the "at-risk" student in the context of failing, dropping out, stopping out, or burning out in higher education, together with the details of those articles in terms of the year of publication, country, aim, participants, methodology, intervention, and findings. We mainly looked at the characteristics of these articles to establish likely knowledge gaps to explore further. We also sought the bigger picture of what matters, where, when, why, and how literature [...]

The larger project, from which this study ensued, is registered at the National Teaching Advancement Programme as an institutional project. It is hoped that the results from the project will instigate change and social justice in higher education and inform further research on good practices towards data-driven institutional planning and decision-making.

Search Results
Figure 1 shows the PRISMA-ScR flow diagram that summarizes the articles considered, included, and excluded. The PRISMA-ScR flow addresses research question (a): which articles tackled the "at-risk" student problem in the higher education context. Precisely, 1918 articles that were extracted from the ScienceDirect and EBSCOhost databases using the proposed search [...]

However, the full texts for 53 of the 220 articles could not be retrieved, reducing the number of articles to 167. These 167 articles were subjected to additional manual screening to check whether their content was in line with the concept of "at-risk" students. Another 27 articles were discarded as their participants were not part of the higher education domain. Eleven articles were removed because they focused on nursing students in non-degree-offering colleges. Six non-English articles were also removed. A further 13 articles were dropped because they focused on other contexts, such as the risk of quitting or stopping medication or some other programmes not related to education. The full-text reviews excluded another seven articles identified as duplicates that were missed by the revtools automated tool.

Figure 2 is a snapshot of the popular terms used to characterize "at-risk" students, with terms such as dropout, poor performance, at-risk, and success standing out. This observation is in line with the views that emerge from topic modelling of the dominant variables used to identify the top "trending topics" on the "at-risk" student.

The research findings reported in this section sought to determine the aims, data analytics tools, participants, variables, and methods that were employed in the "at-risk" student literature. The aim of most articles was to determine the factors/variables that cause a student to be at risk. Table 1 categorizes these top trending topics and factors/variables in the "at-risk" student literature.
The summaries show that the included articles varied widely in the terminology used to describe these dominant variables. For example, the category "Grades" included factors such as final exam grades, exam scores, major test marks, marks in formative tests, predicted grades, and prior grades. The category "Academic" included factors like academic record, academic motivation, academic support, academic success, academic performance, academic background, and academic integration. The literature indicates higher occurrences of the terms Grades (53.8%), Academic (28.6%), Gender (18.7%), GPA (13.2%), Age (12.1%), Data (11%), Course (9.9%), Race/ethnicity (9.9%), Study (7.7%), Support (7.7%), Time (7.7%), Semester (6.6%), Scores (5.5%), Education (5.5%), and Parent (5.5%). This observation is consistent with marks being a common factor for "at-risk" students [17,18].
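One way such occurrence percentages could be derived is sketched below in Python: raw factor names are mapped to a category, and each article is counted at most once per category. The counting rule is an assumption, as are the `category_shares` helper and the four toy articles; only the factor-to-category pairings in `CATEGORY_OF` follow the examples given above.

```python
# Factor-to-category pairings taken from the examples in the text.
CATEGORY_OF = {
    "final exam grades": "Grades",
    "exam scores": "Grades",
    "prior grades": "Grades",
    "academic record": "Academic",
    "academic motivation": "Academic",
}


def category_shares(article_factors):
    """Percentage of articles that mention each category at least once."""
    counts = {}
    for factors in article_factors:
        # A set ensures an article contributes once per category.
        categories = {CATEGORY_OF[f] for f in factors if f in CATEGORY_OF}
        for category in categories:
            counts[category] = counts.get(category, 0) + 1
    total = len(article_factors)
    return {c: round(100 * n / total, 1) for c, n in counts.items()}


# Hypothetical articles, each listed with the factors it reports.
ARTICLES = [
    ["final exam grades", "academic record"],
    ["exam scores", "prior grades"],
    ["academic motivation"],
    ["exam scores"],
]

shares = category_shares(ARTICLES)
```

Because an article can fall into several categories, shares computed this way need not sum to 100%.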

Most articles included in this study emphasized data analysis to seek institutional advancement towards student retention and success at undergraduate level. There is barely any literature on the "at-risk" student in postgraduate studies, and that alone is a gap to explore further. First-year students are the common target group of participants unless all students in the context were considered (see Figure 4d). This may be because cohorts of first-year students often comprise the highest number of "at-risk" students. Another reason may be that the transition from high school to university is commonly perceived as radical, which renders first-year students more in need of support than senior students.

Equally, although a good share of the literature focused on building predictive models to identify "at-risk" students, comparative studies to evaluate which model gives plausible outcomes are few (see Figure 4b). This may be because this knowledge domain is still in its infancy and such comparative studies may be forthcoming. Nevertheless, data-driven methods are still preferred because of the insights drawn from several data analytics tools. Studies that focused on surveys, case studies, experimental and cross-sectional research are also quite visible in the literature (see Figure 4c). However, advanced data analytics models are preferred for simplifying the way meaning can be drawn from the data collected from the many information systems institutions often subscribe to, and this will likely remain the trend in most related future studies. Such data analytics tools are summarized into four broad categories as shown in Figure 5.

Some articles employed more than one data analytics tool. Statistical methods were preferred most and were used for [...]

Several findings emanated from the scoping review regarding "at-risk" students. Generally, it is repeatedly suggested that students will likely drop out if their secondary school knowledge was low or their motivation to study was low [42]. That [...]

Interventions close to individualized attention, including peer tutoring and one-on-one counselling, are seen as more effective. Early warning systems that reduce the burden of counselling, systems that enhance metacognitive awareness, self-awareness, and self-regulation, and tracing of student logs on learning management systems may also simplify early prediction of "at-risk" students. Most compelling is the need for institutions to identify courses that are hard to pass and evaluate the question papers to determine their levels of difficulty. Lecturers should also implement student motivation strategies, including timely feedback on assignments. Interventions that focus on the psychosocial well-being and emotional intelligence of students are also recommended. Machine learning approaches such as AutoML can be adopted to formulate optimal student performance prediction models that use pre-start data. More interpretable models that provide educators with course feedback on student status are also recommended. Creating caring, supportive, and welcoming environments within the university is critical to creating a sense of belonging.

Gaps to explore
The topic of the "at-risk" student is receiving close attention. However, attention to the different arms of the concept of an "at-risk" student is not evenly spread. Emphasis is tilted towards interventions against dropping out or failing. Little is visible regarding students at risk of stopping out or burning out, and that is an apparent avenue for further studies in this body of knowledge. Similarly, most articles dwelt on the concept of an "at-risk" student in the context of dropping out or failing in American, European, or Asian institutions. Studies on this concept from the perspective of African institutions are rare. Research to compare the results yielded with the context of African institutions is worthwhile. Such studies may take us closer to a generalized understanding of an "at-risk" student beyond undergraduate levels. Moreover, the literature suggests that students' internal states are also predictors of performance. Data about students' prior experiences, social interactions, relationships, and extracurricular activities are thus needed to further inform the understanding sought. A gap remains around investigating the use of non-academic data to define students' journeys [44]. Lastly, little is said about the evaluation of the proposed interventions. Not much is known about the effectiveness of the interventions, and that alone is a gap worth addressing.

Unfolding the population of articles that characterize "at-risk" students guided the aims, participants, methods, variables, interventions, and data analytics tools one can adopt in related studies. Three contributions stand out:

• The scoping review set forth an understanding of the population of studies, concepts, and context of the "at-risk" student. Institutions of higher learning can build on this understanding to get to know their own diverse student bodies.
• The scoping review elucidated various applications of different data analytics tools in understanding "at-risk" students. Tailored studies which suit particular scenarios may ensue.
• Although the focus of this scoping review was on understanding the "at-risk" student in the higher education space, the results presented create a baseline context from which a broader understanding of students, in general, may emerge.

A few challenges were observed from this scoping review:

• Although scoping reviews comprehensively synthesize evidence, dealing with a broad range of literature may blur important methodological steps, which makes it difficult to establish boundaries.
• A good scoping process requires more time and resources, which are often difficult to predict at the start of the research.
• Crafting an appropriately inclusive search query that would reduce the number of screening iterations is hard.
• Manually assessing the validity of some of the articles to be included when disputes arise is even harder.

Four directions for future work are envisioned:

• Investigations to corroborate the "at-risk" student knowledge domain in the African context are overdue.
• This scoping review could be enriched by extending the context of the study to accommodate other use cases.
• Further research that analyzes trace data is needed to better understand the broader spectrum of the enrolled student [...]
• It is worth checking the extensibility of the concept of "at-risk" students to include demographic and institutional aspects.