Learner analytics of student programmers: The use of innovative technologies to better understand the learning behaviours of student programmers

Student thesis: Doctoral Thesis, Doctor of Philosophy


The challenges of effective teaching in mass education environments are well documented, with high attrition and failure rates a recurring theme. These problems are especially prevalent in third-level computing courses, particularly in programming-based modules. Because of the size of these cohorts, struggling students are usually identified only at a point when meaningful intervention is too late. This thesis uses novel technologies to provide insights into, and add to the existing research on, learner behaviour in large-scale programming-related modules. The aim is to identify new Learning Analytics that could be used as early warning indicators of struggling students. These areas have thus far generally either not been studied or been inaccessible or difficult to measure. Accordingly, this thesis presents a series of investigative studies into key student engagement points during a typical programming module for postgraduate conversion students, including consideration of 1) seat position tracking during programming lectures, 2) Video Lecture Capture (LC) viewing behaviours and 3) student heart rate monitoring during lectures.

The first study tracked students’ seating positions in lectures to investigate preferred seating position and whether it is related to final grade performance. Unlike most previous studies in this area, it did not control the students’ seating arrangements, enabling an unrestricted study of the effects of lecture theatre seating choices on assessment performance. It found a correlation between sitting closer to the front and higher grades and assessment scores: scores declined the further students sat from the front. Students tended to sit in the same area out of habit throughout the module, and seating position was a potential early predictor of module score.
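
As an illustration of the kind of relationship described above, the following minimal sketch computes a Pearson correlation coefficient between seat row and module score. It is not the thesis's own analysis: the row numbers and scores are invented purely for illustration, and the function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# ASSUMPTION: invented example data, not taken from the thesis.
# Row 1 is the front of the lecture theatre; scores are percentages.
seat_rows = [1, 2, 3, 5, 6, 8, 9, 10]
module_scores = [78, 74, 70, 66, 61, 58, 52, 49]

r = pearson_r(seat_rows, module_scores)
print(f"Pearson r between seat row and score: {r:.3f}")
```

With row numbers increasing away from the front, a negative coefficient corresponds to the pattern reported here, in which scores fall as distance from the front grows.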

The LC video analytics study sought to assess the impact on learning of recording lectures in a programming module. The study considered behavioural trends of the students towards lecture attendance and watching recorded lectures, such as when recordings were viewed and the frequency, duration and repetition of views. It included an in-depth study of video viewing behaviours, including dropout and pause with replay, and related all of these factors to academic impact, focusing especially on academic attainment. It found some significant measurable factors that could be correlated with student attainment.

The measurement of cognitive activity using physiological means such as heart rate activity is a well-established but mostly clinical research practice. The majority of previous studies have concluded that elevated heart rate occurs when an individual is cognitively engaged. This strand of the investigation presents the design and results of a study of students’ heart rate activity during programming lectures. It benchmarks student heart rate patterns during lectures and finds a significant correlation between elevated heart rates and higher module scores.

The thesis then brings together the significant findings of each investigation to provide a variety of analyses on the combined dataset, specifically a statistical prediction using Linear Regression and Machine Learning (ML) classification modelling. The Machine Learning study provides models to classify students as Passing or At Risk, using common classifier algorithms including Naïve Bayes, Decision Trees, Support Vector Machines, Artificial Neural Networks and k-Nearest Neighbour (KNN). The purpose of the ML strand of the study is to create models that could identify students who are likely to pass and those who may be at risk of failing the module. It finds that, overall, MLP and Naïve Bayes classifiers and, to a lesser extent, SMO classifiers appear well suited to academic performance prediction using electronic attribute-based data.
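
To make the classification task concrete, the sketch below shows a minimal k-Nearest Neighbour classifier that labels a student as Passing or At Risk. It is an illustrative assumption rather than the thesis's models: the three features (normalised seat row, fraction of recorded lectures viewed, normalised heart rate elevation), the training values and the function name `knn_predict` are all hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_tuple, label) pairs; distance is Euclidean.
    """
    if k < 1 or k > len(train):
        raise ValueError("k must be between 1 and the number of training points")
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# ASSUMPTION: invented training data, not taken from the thesis.
# Features, each scaled to 0..1 (all hypothetical):
#   (seat row relative to the back, fraction of recordings viewed,
#    heart rate elevation during lectures)
training_data = [
    ((0.1, 0.9, 0.8), "Passing"),
    ((0.2, 0.8, 0.7), "Passing"),
    ((0.3, 0.7, 0.9), "Passing"),
    ((0.8, 0.2, 0.3), "At Risk"),
    ((0.9, 0.1, 0.2), "At Risk"),
    ((0.7, 0.3, 0.1), "At Risk"),
]

front_row_student = (0.15, 0.85, 0.75)
back_row_student = (0.85, 0.15, 0.20)

print(knn_predict(training_data, front_row_student))
print(knn_predict(training_data, back_row_student))
```

An odd value of k avoids tied votes when there are only two classes. A real study would also need held-out evaluation data and feature scaling chosen from the measured attributes.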
In summary, the work presented throughout this thesis identifies learning behaviours that could be used to predict and classify students in terms of their potential course performances in programming modules. The models presented are based on quantifiable factors identified from the three separate investigations and could be used to form the basis of an early warning detection system aimed at identifying struggling students in a much more timely way than traditional measurements or activities.
Date of Award: Dec 2021
Original language: English
Awarding Institution
  • Queen's University Belfast
Supervisors: Philip Hanna & Desmond Greer


  • Machine learning
  • learner analytics
  • programming
  • teaching
