by
QINYI HU, Bachelor of Arts
APPROVED:
WENYAW CHAN, PhD
HULIN WU, PhD
JOSEPH B. MCCORMICK, MD
Copyright
by
Qinyi Hu, Bachelor of Arts, Master of Science
2022
PREDICTION OF HOSPITAL READMISSION IN HEART FAILURE PATIENTS: A
DATA-DRIVEN ANALYSIS
by
QINYI HU
BA, Wake Forest University, 2020
Presented to the Faculty of The University of Texas
School of Public Health
in Partial Fulfillment
of the Requirements
for the Degree of
MASTER OF SCIENCE
THE UNIVERSITY OF TEXAS
SCHOOL OF PUBLIC HEALTH
Houston, Texas
December 2022
ACKNOWLEDGEMENTS
I am grateful to Dr. Hulin Wu, Dr. Wenyaw Chan, and Dr. Joseph B. McCormick for their guidance. I thank Wei Tao for her contribution to the identification of type 2 diabetes mellitus, and Junwei Lu for his contribution to cleaning the hypertension medication data in
this study.
PREDICTION OF HOSPITAL READMISSION IN HEART FAILURE PATIENTS: A
DATA-DRIVEN ANALYSIS
Qinyi Hu, BA, MS
The University of Texas
School of Public Health, 2022
Thesis/Dissertation Chair: Hulin Wu, PhD
Background and aims: The high rate of readmission after heart failure (HF) hospitalization
hinders patients’ recovery and increases their financial burden. Therefore, it is important for
clinicians and researchers to identify risk factors for heart failure hospitalization. We explored the relationship of HF readmission with HF type, age, gender, race, type 2 diabetes
mellitus (T2DM), hypertension medications, and vital signs.
Methods: The data source was the electronic health records in the Cerner
Health Facts® database, a comprehensive dataset of de-identified patient
information that contains healthcare records for over 63 million patients from 85 health systems with 750
hospitals and healthcare facilities in the United States from 2000 to 2018. Patients with at least one International Classification of Diseases, Ninth Revision (ICD-9) diagnosis code for HF, at least one HF medication, and at least one hospitalization record were identified as the study cohort. Age, ethnicity, heart failure type, type 2 diabetes mellitus, hypertension medication intake, and vital signs
were considered as potential risk factors. Missing data were imputed using the MICE package.
Purposeful variable selection was used to select variables for the prediction model. Stepwise
selection by Akaike information criterion (AIC) and lasso regression were performed
for comparison with purposeful variable selection.
Results: In total, 135,253 inpatients were included, of whom 96,627 (71.4%) were
readmitted for HF and 38,626 (28.6%) were not.
Age, gender, race, HF type, intake of ACE inhibitors, beta blockers, diuretics,
calcium channel blockers, angiotensin receptor blockers, and antiadrenergic
inhibitors, mean systolic blood pressure, body mass index (BMI), and height were predictors of HF readmission in a logistic regression model. The area under the receiver
operating characteristic (ROC) curve was 0.539, indicating poor discriminatory
performance.
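As a reading aid for the AUC value reported above, the following minimal Python sketch computes AUC as the proportion of readmitted/non-readmitted patient pairs in which the readmitted patient receives the higher predicted risk, with ties counted as one half. It is an illustration only and not the R analysis used in this thesis; the function name and the toy scores are hypothetical. Under this reading, 0.5 corresponds to chance-level ranking, so a value of 0.539 is only slightly better than chance.

```python
def auc_from_scores(labels, scores):
    """Return AUC as the probability that a randomly chosen positive case
    (label 1) has a higher score than a randomly chosen negative case
    (label 0). Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = readmitted, 0 = not readmitted.
labels = [1, 1, 1, 0, 0, 0]
perfect = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # every positive outranks every negative
uninformative = [0.5] * 6                   # identical risk for everyone

print(auc_from_scores(labels, perfect))        # 1.0
print(auc_from_scores(labels, uninformative))  # 0.5
```

The pairwise loop is quadratic in the number of patients, which is acceptable for illustration; for a cohort of the size used here a rank-based formula or a library routine would be the practical choice.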
Conclusion: Heart failure readmission is associated with patients’ age, gender, race, heart failure type, systolic blood pressure, body mass index (BMI), height, and intake of
hypertension medications, including ACE inhibitors, beta blockers, diuretics, calcium
channel blockers, angiotensin receptor blockers, and antiadrenergic inhibitors. Further
improvements are needed to enhance the predictive ability of the model.
TABLE OF CONTENTS
Specific Aims
Methods
Data Source and Study Population
Data Cohort Selection
Variable Definition
Missing Data Report
Summary Statistics
Model and Variable Selection, with Assumptions Check
Human Subjects Considerations
Results
Discussion
Conclusion
Appendices
References
LIST OF TABLES
Table 1: Baseline Characteristics and Demographic Factors for Study Cohort
Table 2: Summary Statistics of HF Patients’ Readmission Status Versus Eight Groups of Hypertension Medication
Table 3: Summary Statistics of HF Patients’ Readmission Status Versus Seven Vital Signs
Table 4: Summary Table of Final Model (ranked from the largest to the smallest odds ratio)
LIST OF FIGURES
Figure 1: Flow Chart of Cohort Selection
Figure 2: Histogram and Pattern of Missing Data
Figure 3: Log-odds (readmission | age) vs median points of age in 4 groups
Figure 4: Log-odds (readmission | mean systolic bp) vs median points of mean systolic bp in 4 groups
Figure 5: Log-odds (readmission | mean BMI) vs median points of mean BMI in 4 groups
Figure 6: Log-odds (readmission | mean height) vs median points of mean height in 4 groups
Figure 7: ROC Curve and AUC Value
LIST OF APPENDICES
Appendix A: Five-number Summary and Histogram of Age
Appendix B: Five-number Summary and Histogram of Mean Systolic Blood Pressure
Appendix C: Five-number Summary and Histogram of Mean BMI
Appendix D: Five-number Summary and Histogram of Mean Height
Appendix E: Step 1 of Purposeful Variable Selection: Separate Univariate Logistic Regression Analysis for Each Covariate (Likelihood Ratio Test)
Appendix F: Step 1 of Purposeful Variable Selection: Chi-square Test Result and Expected Tables of Contingency Tables
Appendix G: Step 2 of Purposeful Variable Selection: Wald Test Results of Full Model
Appendix H: Step 2 of Purposeful Variable Selection: Wald Test Results of Reduced Model
Appendix I: Step 2 of Purposeful Variable Selection: Likelihood Ratio Test Result of Full Model
Appendix J: Step 2 of Purposeful Variable Selection: Likelihood Ratio Test Result of Reduced Model
Appendix K: Step 2 of Purposeful Variable Selection: Partial Likelihood Ratio Test Result
Appendix L: Step 3 of Purposeful Variable Selection: Potential Confounders Check
Appendix M: Step 4 of Purposeful Variable Selection: R Result of Preliminary Main Effect Model
Appendix N: R Result of Stepwise Selection
Appendix O: Result of Lasso Selection
Appendix P: R Result of Final Model
BACKGROUND
Introduction and Public Health Significance
Approximately 5.7 million American adults are living with heart failure (HF), and its
prevalence is projected to increase by 46% from 2012 to 2030, with more
than 8 million adults living with the chronic condition (Ziaeian and Fonarow, 2016).
According to the CDC, heart failure cost the nation an estimated
$30.7 billion in 2012. This total includes the cost of health care services, medicines to treat heart failure, and missed days of work (Benjamin EJ, et al., 2019). Heart failure represents a
global public health threat.
High readmission rates after heart failure hospitalization impede patients’
recovery, strain medical resources, and increase financial costs. Because heart failure is a chronic condition, heart
failure hospitalization is followed by high readmission and mortality rates. More than 25
percent of patients hospitalized for heart failure are readmitted to the hospital within 30
days of discharge. According to 2012 research, heart failure is the leading cause of readmission in the
Medicare fee-for-service patient population. Moreover, the Affordable Care Act instituted a financial penalty on hospitals for excessive readmissions, capped at 3% of a hospital’s total Medicare payments for 2015 and beyond
(Ziaeian and Fonarow, 2016), which substantially increases the financial
pressure on hospitals. Therefore, it is important for clinicians and researchers to identify risk factors for
heart failure hospitalization in order to reduce hospital readmission among heart failure
patients.
Literature Review
Our study focuses on risk factors that influence readmission in heart failure patients.
Because different types of HF require different medications, and therapies proven to
work for systolic heart failure do not necessarily work for diastolic heart failure, it is necessary to categorize and compare patients with different types of heart failure. Diastolic heart failure, in which the left ventricle stiffens, differs from systolic heart failure, in which the left ventricle becomes weak and flabby. Both diastolic and systolic heart failure are forms of left-sided
heart failure. Heart failure can also affect the right side of the
heart. As researchers search for the best treatments for heart failure, controlling blood
pressure remains a key strategy. Heart-protecting drugs can also help reduce symptoms.
Age, race, high blood pressure, and diabetes have been shown to be risk factors for
diastolic heart failure. Previous studies indicate that HF of all types disproportionately
affects the elderly population. According to a 2021 meta-analysis, HF patients had an average age of 76.3 years. Even assuming that the incidence for a specific age, sex, or ethnicity remains stable, heart failure prevalence is projected to rise steadily over the next 20 years,
mainly in association with population aging (Lan T., et al., 2021).
Besides the influence of aging, diabetes has also been shown to be one of the main
causes of heart failure. Diabetic patients carry a four- to five-fold increased risk of heart
failure. At an early stage, diabetic cardiomyopathy is manifested by diastolic heart failure
Reproduced with permission of copyright owner. Further reproduction prohibited without permission.