TY - JOUR
T1 - Enhanced abnormal data detection hybrid strategy based on heuristic and stochastic approaches for efficient patients rehabilitation
AU - Khan, Murad Ali
AU - Iqbal, Naeem
AU - Jamil, Harun
AU - Qayyum, Faiza
AU - Jong, Jong-Hyun
AU - Khan, Salabat
AU - Kim, Jae-Chul
AU - Kim, Do-Hyeun
PY - 2024/5
Y1 - 2024/5
N2 - Over the last few years, substantial research has been devoted to developing efficient abnormality detection techniques that account for efficiency, accuracy, high-dimensional data, and distributed environments. Researchers increasingly deal with “abnormalities” in clinical patient data to derive relevant clinical knowledge for making informed decisions. However, data collection for clinically relevant research is often guided by patient conditions and administrative or clinical requirements rather than a regular schedule. Therefore, clinical data is frequently obtained in an unreliable form, characterized by outliers and inconsistencies, incomplete information, and an unstructured format that varies with patient types and data structures. In this study, an enhanced hybrid abnormal data detection (AD) strategy is developed based on heuristic and stochastic methods to cope with abnormalities in the clinical data of patients. The proposed hybrid strategy employs optimal k-means clustering as a heuristic method to cluster the clinical data according to the patients’ routine exercise characteristics. Next, an interquartile range-based stochastic approach is employed as a statistical method to detect and eliminate abnormal data points, so that only reliable and effective data is provided to medical practitioners. The main objective of this article is to help healthcare and research practitioners handle massive, high-dimensional, inconsistent, and incomplete clinical patient data by detecting and discarding anomalous data points so that only useful information remains. Furthermore, the AutoML paradigm is employed to develop an optimal regression model for analyzing the impact of the proposed hybrid strategy for abnormal pattern detection. In addition, different statistical error estimation measures are used to evaluate the empirical effectiveness of the proposed hybrid strategy using AutoML. The experimental results show a noteworthy improvement in the R² score for predicting healthcare indicators compared with existing state-of-the-art regression models. Our optimal regression model achieved R² scores of 0.9855 and 0.9850 for predicting the Borg RPE and TUG, respectively. Similarly, it achieved a low prediction error for both functional health indicators, with MAPE values of 6.57% and 5.19% for Borg RPE and TUG prediction. These results indicate that AutoML performance improves, and exceeds that of traditional regression models, when the proposed hybrid abnormal detection model is applied to patient rehabilitation data to handle anomalous data.
AB - Over the last few years, substantial research has been devoted to developing efficient abnormality detection techniques that account for efficiency, accuracy, high-dimensional data, and distributed environments. Researchers increasingly deal with “abnormalities” in clinical patient data to derive relevant clinical knowledge for making informed decisions. However, data collection for clinically relevant research is often guided by patient conditions and administrative or clinical requirements rather than a regular schedule. Therefore, clinical data is frequently obtained in an unreliable form, characterized by outliers and inconsistencies, incomplete information, and an unstructured format that varies with patient types and data structures. In this study, an enhanced hybrid abnormal data detection (AD) strategy is developed based on heuristic and stochastic methods to cope with abnormalities in the clinical data of patients. The proposed hybrid strategy employs optimal k-means clustering as a heuristic method to cluster the clinical data according to the patients’ routine exercise characteristics. Next, an interquartile range-based stochastic approach is employed as a statistical method to detect and eliminate abnormal data points, so that only reliable and effective data is provided to medical practitioners. The main objective of this article is to help healthcare and research practitioners handle massive, high-dimensional, inconsistent, and incomplete clinical patient data by detecting and discarding anomalous data points so that only useful information remains. Furthermore, the AutoML paradigm is employed to develop an optimal regression model for analyzing the impact of the proposed hybrid strategy for abnormal pattern detection. In addition, different statistical error estimation measures are used to evaluate the empirical effectiveness of the proposed hybrid strategy using AutoML. The experimental results show a noteworthy improvement in the R² score for predicting healthcare indicators compared with existing state-of-the-art regression models. Our optimal regression model achieved R² scores of 0.9855 and 0.9850 for predicting the Borg RPE and TUG, respectively. Similarly, it achieved a low prediction error for both functional health indicators, with MAPE values of 6.57% and 5.19% for Borg RPE and TUG prediction. These results indicate that AutoML performance improves, and exceeds that of traditional regression models, when the proposed hybrid abnormal detection model is applied to patient rehabilitation data to handle anomalous data.
KW - Abnormalities
KW - Clinical data
KW - Machine learning
KW - Automated learning
KW - Regression
KW - Patients rehabilitation
U2 - 10.1016/j.future.2023.11.036
DO - 10.1016/j.future.2023.11.036
M3 - Article
SN - 0167-739X
VL - 154
SP - 101
EP - 122
JO - Future Generation Computer Systems
JF - Future Generation Computer Systems
ER -