Detecting Need-Attention Patients using Machine Learning

Theng Jia Law - Multimedia University, 63000 Cyberjaya, Selangor, Malaysia
Choo-Yee Ting - Multimedia University, 63000 Cyberjaya, Selangor, Malaysia
Helmi Zakariah - HAYAT Technologies Sdn. Bhd., 59200 Kuala Lumpur, Malaysia

In healthcare, detecting patients who need immediate attention is difficult. Identifying the critical variables is a particular challenge because variable selection typically requires human intervention. Consequently, patients who need immediate attention often experience prolonged waiting times. Researchers have investigated various approaches to identifying these patients, one of which is leveraging Artificial Intelligence (AI). However, identifying the optimal feature set and predictive model remains complex. This study therefore aims to (i) identify the critical features and (ii) develop and evaluate predictive models for detecting patients who need attention. The dataset, collected from a healthcare company, contains 67 variables and 51,102 records. It consists of patient information and questionnaire responses from each participant registered in the Selangor Saring Program. After the data were treated, the features important for detecting patients who need attention were identified. Multiple classifiers were developed, chosen for their simplicity. The models were evaluated before and after hyperparameter tuning on accuracy, precision, recall, F1-score, Geometric Mean, and Area Under the Curve. The findings showed that the Stacking Classifier produced the highest accuracy (69.9%) on the blood dataset, whereas Extreme Gradient Boosting achieved the highest accuracy (81.7%) on the urine dataset. This work can be extended by incorporating Points of Interest and geographical data near patients’ residences and by studying other ensemble models to improve the detection of patients who need attention.


Keywords: Need-Attention Patient Detection; Artificial Intelligence; Machine Learning
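
The six evaluation measures named in the abstract can be illustrated with a small sketch. This is not the authors' code: the toy labels and scores, the 0.5 decision threshold, and all function names are hypothetical. The abstract does not define Geometric Mean, so the sketch assumes the definition common in imbalanced classification, the square root of sensitivity times specificity.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = needs attention."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, and G-mean from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)       # sensitivity
    specificity = safe_div(tn, tn + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        # Assumed definition: sqrt(sensitivity * specificity).
        "g_mean": math.sqrt(recall * specificity),
    }


def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as half a win (pairwise formulation)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and model scores for eight patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = roc_auc(y_true, y_score)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting G-mean and AUC alongside accuracy matters when the need-attention class is rare, because accuracy alone can look good for a model that rarely flags anyone. The pairwise AUC above is quadratic in the number of records, so a dataset of the size described would normally be scored with a library routine instead.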
