Utilising feature selection and regression analysis to enhance predictive modelling for automotive applications

  • Tom Matthews

Student thesis: Doctoral Thesis, Doctor of Philosophy

Abstract

The public transport sector is an enabler for achieving global sustainable development goals. Because buses operating at full passenger capacity produce fewer carbon emissions per person than cars, a modal shift from cars to buses helps achieve these goals. Policy levers, such as increasing subsidies for public transport, have been used to increase uptake of public transport in preference to private car usage. Climate change mitigation strategies also involve technology-oriented solutions. This has led to the emergence of new green bus technologies such as hybrid, electric, and hydrogen buses. With increasing numbers of these vehicles in operation, a growing amount of telematics data is available which can be used to improve understanding of their real-time performance in operation. However, interpretation of this data remains a challenge. There is currently no consistent recommended method for reducing telematics data, and the approaches reported in the literature are inconsistent. Consequently, the work in this PhD aims to explore and critique data analysis techniques, such as feature selection and regression analysis, to develop robust relationships between variables reported through telematics data and target variables of interest for automotive applications. A number of objectives were identified to achieve this aim. First, a novel feature selection algorithm was developed for telematics data which can be used to robustly identify the critical subset of variables for predicting a target variable of interest. The developed algorithm utilises multiple feature selection methods, which gives greater confidence that the chosen subset of variables is correct. Secondly, the feature selection algorithm was used to assess multiple regression/prediction methods in terms of accuracy and the minimum number of model inputs needed. Minimising the number of model inputs is important as it improves model interpretability and reduces the chance of overfitting. Finally, a number of case studies were used to demonstrate the application of the feature selection algorithm.
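The abstract does not state which feature selection methods the thesis algorithm combines, so the following is only a minimal sketch of the general idea of using several methods and keeping the variables they agree on. It assumes three simple rankers (absolute Pearson correlation, absolute Spearman correlation, and absolute standardised least-squares coefficients) and a voting rule; the function names, the `top_k` and `min_votes` parameters, the variable names, and the synthetic data are all hypothetical and are not taken from the thesis.

```python
import numpy as np

def rank_scores(X, y):
    """Return one relevance score per column from three simple methods."""
    n = len(y)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    pearson = np.abs(Xs.T @ ys) / n

    # Spearman correlation: Pearson correlation computed on the ranks
    # (continuous data assumed, so ties are not handled).
    Xr = X.argsort(axis=0).argsort(axis=0).astype(float)
    yr = y.argsort().argsort().astype(float)
    Xr = (Xr - Xr.mean(axis=0)) / Xr.std(axis=0)
    yr = (yr - yr.mean()) / yr.std()
    spearman = np.abs(Xr.T @ yr) / n

    # Standardised multiple-regression coefficients.
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return {"pearson": pearson, "spearman": spearman, "ols": np.abs(beta)}

def consensus_select(X, y, names, top_k=3, min_votes=2):
    """Each method votes for its top_k columns; keep columns with enough votes."""
    votes = np.zeros(X.shape[1], dtype=int)
    for score in rank_scores(X, y).values():
        votes[np.argsort(score)[-top_k:]] += 1
    selected = [n for n, v in zip(names, votes) if v >= min_votes]
    return selected, dict(zip(names, votes.tolist()))

# Synthetic stand-in for telematics data: only the first three columns
# actually drive the target. Column names are invented for illustration.
rng = np.random.default_rng(0)
names = ["vehicle_speed", "accelerator_pedal", "road_gradient",
         "cabin_temperature", "door_state", "wiper_state"]
X = rng.normal(size=(500, 6))
y = 3.0 * X[:, 0] + 2.5 * X[:, 1] + 2.0 * X[:, 2] + rng.normal(scale=0.5, size=500)

selected, votes = consensus_select(X, y, names)
print(selected)
print(votes)
```

Requiring agreement between methods is one way to gain the "greater confidence" the abstract refers to: a variable that only one ranker favours is treated as less trustworthy than one that every ranker places near the top.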

For the first case study, data was collected from twelve Wrightbus Streetdeck HEV 96V buses, which were monitored operating throughout the UK from June 2018 to October 2019. The feature selection algorithm developed in this work was then used to identify the critical subsets of variables for predicting fuel consumption and engine output torque. These critical subsets then informed a predictive model which identified whether fuel consumption or engine output torque was operating outside allowable control limits. In this case study, the control limits were changed depending on the values of the relevant variables. This led to increased sensitivity to actual faults as well as robustness against false alarms. A second case study investigated the potential of utilising a similar approach for vehicles that do not yet have telematics data. The feature selection algorithm was used to identify the critical subset of variables necessary for predicting the energy consumption of a battery electric bus and a fuel cell electric bus. The algorithm revealed that different vehicles need different subsets of variables to predict the same target variable. An altered form of the algorithm could also evaluate different prediction or regression models to determine which are most suitable for the vehicle in question. In one case, the adapted feature selection algorithm was able to select a regression/prediction model which had an R² above 0.9 while using only three variables. Consequently, the feature selection algorithm allowed accurate prediction models to be selected using only a small number of inputs, which reduced model complexity. This ensures that accurate predictions of the operational energy demand of battery electric and fuel cell buses can be delivered to operators in a computationally inexpensive manner, without the need to run vehicles on the bus route. This is essential to ensure that operators can make informed decisions when transitioning their bus fleets to zero emission technologies.
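Two ideas from the case studies can be sketched in a few lines: choosing the smallest set of inputs that reaches a target R², and control limits that move with the operating conditions instead of being fixed. The abstract does not describe how the thesis implements either, so the sketch below assumes ordinary least squares, greedy forward selection, and limits placed at the model prediction plus or minus three residual standard deviations. The helper names, the threshold, and the synthetic data are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with intercept; returns coefficients, residual std, R^2."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, resid.std(), 1.0 - resid.var() / y.var()

def smallest_subset(X, y, r2_target=0.9):
    """Greedy forward selection: add the column that raises R^2 the most,
    stopping as soon as the target is reached."""
    chosen, r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining and r2 < r2_target:
        r2, best = max((fit_ols(X[:, chosen + [j]], y)[2], j) for j in remaining)
        chosen.append(best)
        remaining.remove(best)
    return chosen, float(r2)

def adaptive_limits(X_ref, y_ref, X_new, k=3.0):
    """Control limits centred on the model prediction for each new sample."""
    coef, sigma, _ = fit_ols(X_ref, y_ref)
    centre = np.column_stack([np.ones(len(X_new)), X_new]) @ coef
    return centre - k * sigma, centre + k * sigma

# Synthetic "healthy" reference data: three of five inputs drive the target.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 5))
y = 4.0 * X[:, 0] + 3.0 * X[:, 1] + 2.0 * X[:, 2] + rng.normal(scale=1.0, size=400)

chosen, r2 = smallest_subset(X, y)
print(chosen, round(r2, 3))

# Two new samples at the same operating point: one normal, one faulty.
X_new = np.zeros((2, 5))
X_new[:, 0] = 1.0
X_new[:, 1] = 0.5
y_new = np.array([5.5, 11.0])

lo, hi = adaptive_limits(X[:, chosen], y, X_new[:, chosen])
adaptive_alarm = (y_new < lo) | (y_new > hi)

# Fixed limits around the overall mean, for comparison.
fixed_alarm = np.abs(y_new - y.mean()) > 3.0 * y.std()
print(adaptive_alarm, fixed_alarm)
```

In this synthetic setting the faulty sample lies well inside the wide fixed band, because the target varies a great deal across operating conditions, but falls outside the narrower band centred on the prediction. This is the sense in which limits that depend on the relevant variables can be more sensitive to faults without raising more false alarms.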

In summary, this thesis introduces a novel feature selection algorithm which has been used to aid a number of different activities in the automotive sector. The algorithm is not limited to the examples shown in this thesis but can be used for any activity that relies on identifying the critical subset of variables for predicting a key target variable. This includes developing driving cycles, creating vehicle models, and fault diagnostics.

Thesis is embargoed until 31 July 2029.
Date of Award: Jul 2024
Original language: English
Awarding Institution
  • Queen's University Belfast
Sponsors: Northern Ireland Department for the Economy & Bamford Bus Company Ltd. Trading as Wrightbus
Supervisor: Juliana Early (Supervisor) & Geoff Cunningham (Supervisor)

Keywords

  • Sustainable public transport
  • Machine learning
  • Vehicle modelling
  • Regression analysis
  • Driving cycles
  • Health monitoring
  • Vehicle telematics
  • Feature selection
  • Zero emission transport
  • Energy consumption prediction
  • Fuel consumption prediction
  • Battery electric bus
  • Fuel cell electric bus
  • Mild hybrid bus
  • Big data
  • Data analysis
