Early Prediction of 30-Day ICU Re-admissions Using Natural Language Processing and Machine Learning
Biomedical Statistics and Informatics
Volume 4, Issue 3, September 2019, Pages: 22-26
Received: Oct. 10, 2019;
Accepted: Nov. 18, 2019;
Published: Nov. 22, 2019
Zhiheng Li, High School Division, Northeast Yucai Foreign Language School, Shenyang, China
Xinyue Xing, High School Division, Northeast Yucai Foreign Language School, Shenyang, China
Bingzhang Lu, Junior High Division, Northeast Yucai School, Shenyang, China
Ying Zhao, Department of Engineering Science and Applied Math, Northwestern University, Evanston, USA
Zhixiang Li, Department of Biomedical Engineering, Shenyang Pharmaceutical University, Shenyang, China
Intensive Care Unit (ICU) readmission is associated with longer hospitalization, mortality, and other adverse outcomes. Early recognition of patients at risk of ICU readmission can help prevent deterioration and lower treatment costs. With the abundance of Electronic Health Records (EHR), it has become popular to build clinical decision tools by applying machine learning techniques to large-scale healthcare data. To this end, we designed data-driven predictive models to estimate the risk of ICU readmission. The discharge summary of each hospital admission was represented using natural language processing algorithms. The Unified Medical Language System (UMLS) was further used to standardize inconsistent terminology across discharge summaries. Five machine learning classifiers, including naïve Bayes, support vector machine, logistic regression, and gradient boosting decision tree, and two feature representations, Bag-of-Words and Bag-of-CUIs, were combined to construct predictive configurations. The best configuration yielded a competitive AUC of 0.748. High-contribution words and medical terms were further examined to confirm that they were clinically meaningful. A comparison of the two feature representations is also discussed. Our work suggests that natural language processing can automatically extract meaningful information from discharge summaries and warn clinicians of unplanned 30-day readmission upon discharge.
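As a rough illustration of the kind of pipeline described above, the sketch below builds a Bag-of-Words representation of free-text notes, fits a multinomial naïve Bayes classifier (one of the classifier families named in the abstract), and scores it with AUC. It is a minimal sketch under stated assumptions, not the authors' implementation: the notes, labels, and all function and class names are invented for illustration, and the UMLS/Bag-of-CUIs step is omitted because it requires an external concept-mapping tool.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text):
    """Map a note to its word counts, ignoring word order."""
    return Counter(tokenize(text))


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.vocab = set()
        self.counts = {0: Counter(), 1: Counter()}
        self.n_docs = Counter(labels)
        for doc, y in zip(docs, labels):
            bow = bag_of_words(doc)
            self.counts[y].update(bow)
            self.vocab.update(bow)
        self.totals = {y: sum(c.values()) for y, c in self.counts.items()}
        return self

    def score(self, doc):
        """Log-odds of class 1 (readmitted) versus class 0."""
        v = len(self.vocab)
        s = math.log(self.n_docs[1] / self.n_docs[0])
        for word, n in bag_of_words(doc).items():
            if word not in self.vocab:
                continue  # skip words never seen in training
            p1 = (self.counts[1][word] + 1) / (self.totals[1] + v)
            p0 = (self.counts[0][word] + 1) / (self.totals[0] + v)
            s += n * math.log(p1 / p0)
        return s


def auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy notes (NOT real discharge summaries); 1 = readmitted within 30 days.
train_notes = [
    "sepsis with respiratory failure and renal failure",
    "chronic heart failure with pleural effusion",
    "elective knee surgery stable recovery",
    "routine observation stable discharged home",
]
train_labels = [1, 1, 0, 0]
test_notes = ["respiratory failure and effusion", "stable after elective surgery"]
test_labels = [1, 0]

model = NaiveBayes().fit(train_notes, train_labels)
scores = [model.score(note) for note in test_notes]
test_auc = auc(scores, test_labels)
print(f"toy AUC: {test_auc:.3f}")
```

The toy data are far too small for the resulting AUC to mean anything, and it is not comparable to the 0.748 reported above. A Bag-of-CUIs variant would, under the same assumptions, replace `tokenize` with a step that maps each note to UMLS concept identifiers before counting.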