Abstract
Overfitting is a classical statistical issue that occurs when a model fits a particular observed data sample too closely, potentially limiting its generalizability. While Data Envelopment Analysis (DEA) is a powerful non-parametric method for assessing the relative efficiency of decision-making units (DMUs), its reliance on the minimal extrapolation principle can lead to concerns about overfitting, particularly when the goal extends beyond evaluating the specific DMUs in the sample to making broader inferences. In this paper, we propose an adaptation of Stochastic Gradient Boosting to estimate production possibility sets that mitigate overfitting while satisfying shape constraints such as convexity and free disposability. Our approach is not intended to replace DEA but to complement it, offering an additional tool for scenarios where generalization is important. Through simulation experiments, we demonstrate that the proposed method performs well compared to DEA, especially in high-dimensional settings. Furthermore, the new machine learning-based technique is compared to the Corrected Concave Non-parametric Least Squares (C2NLS), showing competitive performance. We also illustrate how the usual efficiency measures in DEA can be implemented under our approach. Finally, we provide an empirical example based on data from the Program for International Student Assessment (PISA) to demonstrate the applicability of the new method.
Original language | English |
---|---|
Journal | European Journal of Operational Research |
Early online date | 28 Nov 2024 |
DOIs | |
Publication status | Early online date - 28 Nov 2024 |
Keywords
- Artificial Intelligence
- Stochastic Gradient Boosting
- Stochastic
- Machine Learning
ASJC Scopus subject areas
- Artificial Intelligence
- Management Science and Operations Research