Comparative Analysis of Imputation Methods for Enhancing Predictive Accuracy in Data Models
DOI: http://dx.doi.org/10.62527/joiv.8.3.1666
References
M. S. Gangadhar, K. V. S. Sai, S. H. S. Kumar, K. A. Kumar, M. Kavitha, and S. S. Aravinth, “Machine Learning and Deep Learning Techniques on Accurate Risk Prediction of Coronary Heart Disease,” in 2023 7th International Conference on Computing Methodologies and Communication (ICCMC), IEEE, Feb. 2023, pp. 227–232, doi: 10.1109/ICCMC56507.2023.10083756.
X. Kong, W. Zhou, G. Shen, W. Zhang, N. Liu, and Y. Yang, “Dynamic graph convolutional recurrent imputation network for spatiotemporal traffic missing data,” Knowledge-Based Systems, vol. 261, p. 110188, 2023, doi: 10.1016/j.knosys.2022.110188.
E. Getzen, L. Ungar, D. Mowery, X. Jiang, and Q. Long, “Mining for equitable health: Assessing the impact of missing data in electronic health records,” J Biomed Inform, vol. 139, p. 104269, Mar. 2023, doi: 10.1016/J.JBI.2022.104269.
K. Psychogyios, L. Ilias, C. Ntanos, and D. Askounis, “Missing Value Imputation Methods for Electronic Health Records,” IEEE Access, vol. 11, pp. 21562–21574, 2023, doi: 10.1109/ACCESS.2023.3251919.
B. Agbo, H. Al-Aqrabi, T. Alsboui, M. Hussain, and R. Hill, “Imputation of Missing Clinical Covariates for Downstream Classification Problems,” IEEE Access, vol. 11, pp. 102935–102943, 2023, doi: 10.1109/ACCESS.2023.3317775.
P. Buczak, J. J. Chen, and M. Pauly, “Analyzing the Effect of Imputation on Classification Performance under MCAR and MAR Missing Mechanisms,” Entropy, vol. 25, no. 3, p. 521, Mar. 2023, doi: 10.3390/E25030521.
G. Shen, W. Zhou, W. Zhang, N. Liu, Z. Liu, and X. Kong, “Bidirectional spatial–temporal traffic data imputation via graph attention recurrent neural network,” Neurocomputing, vol. 531, pp. 151–162, Apr. 2023, doi: 10.1016/J.NEUCOM.2023.02.017.
L. Li, Y. Wang, H. Wang, S. Hu, and T. Wei, “An Efficient Architecture for Imputing Distributed Data Sets of IoT Networks,” IEEE Internet Things J, vol. 10, no. 17, pp. 15100–15114, Sep. 2023, doi: 10.1109/JIOT.2023.3264609.
G. Batista and M.-C. Monard, “A Study of K-Nearest Neighbour as an Imputation Method,” in Hybrid Intelligent Systems, ser. Frontiers in Artificial Intelligence and Applications, Jan. 2002, pp. 251–260.
S. Zhang, “Nearest neighbor selection for iteratively kNN imputation,” Journal of Systems and Software, vol. 85, no. 11, pp. 2541–2552, Nov. 2012, doi: 10.1016/J.JSS.2012.05.073.
Y. He and D. Pi, “Improving KNN Method Based on Reduced Relational Grade for Microarray Missing Values Imputation,” IAENG Int J Comput Sci, vol. 43, no. 3, pp. 356–362, 2016.
J.-H. Hsu, C.-H. Wu, W.-K. Wang, H.-Y. Su, E. C.-L. Lin, and P. S. Chen, “Digital Phenotyping-Based Bipolar Disorder Assessment Using Multiple Correlation Data Imputation and Lasso-MLP,” IEEE Trans Affect Comput, pp. 1–14, 2023, doi: 10.1109/TAFFC.2023.3299607.
I. D. Irawati, A. B. Suksmono, and I. J. M. Edward, “An Interpolation Comparative Analysis for Missing Internet Traffic Data,” in Proceedings of the 3rd International Conference on Electronics, Communications and Control Engineering, 2020, pp. 26–30, doi: 10.1145/3396730.3396740.
D. J. Stekhoven and P. Bühlmann, “MissForest—non-parametric missing value imputation for mixed-type data,” Bioinformatics, vol. 28, no. 1, pp. 112–118, Jan. 2012, doi: 10.1093/bioinformatics/btr597.
A. K. Waljee et al., “Comparison of imputation methods for missing laboratory data in medicine,” BMJ Open, vol. 3, no. 8, p. e002847, Aug. 2013, doi: 10.1136/bmjopen-2013-002847.
J. You, J. L. Ellis, S. Adams, M. Sahar, M. Jacobs, and D. Tulpan, “Comparison of imputation methods for missing production data of dairy cattle,” animal, p. 100921, Jul. 2023, doi: 10.1016/j.animal.2023.100921.
B. Gong, Z. Xu, C. Lin, and D. Wu, “Heterogeneous Traffic Flow Detection Using CAV-Based Sensor With I-GAIN,” IEEE Access, vol. 11, pp. 32616–32627, 2023, doi: 10.1109/ACCESS.2023.3263720.
G. Vink, L. E. Frank, J. Pannekoek, and S. van Buuren, “Predictive mean matching imputation of semicontinuous variables,” Stat Neerl, vol. 68, no. 1, pp. 61–90, Feb. 2014, doi: 10.1111/stan.12023.
J. Du and L. Zhou, “Improving financial data quality using ontologies,” Decis Support Syst, vol. 54, no. 1, pp. 76–86, Dec. 2012, doi: 10.1016/j.dss.2012.04.016.
N. F. Idris, M. A. Ismail, M. I. M. Jaya, A. O. Ibrahim, A. W. Abulfaraj, and F. Binzagr, “Stacking with Recursive Feature Elimination-Isolation Forest for classification of diabetes mellitus,” PLoS ONE, vol. 19, no. 5, p. e0302595, 2024, doi: 10.1371/journal.pone.0302595.