Machine Learning to Predict the Risk of Incident Heart Failure Hospitalization Among Patients With Diabetes: The WATCH-DM Risk Score

OBJECTIVE

To develop and validate a novel, machine learning–derived model to predict the risk of heart failure (HF) among patients with type 2 diabetes mellitus (T2DM).

RESEARCH DESIGN AND METHODS

Using data from 8,756 patients free at baseline of HF, with <10% missing data, and enrolled in the Action to Control Cardiovascular Risk in Diabetes (ACCORD) trial, we used random survival forest (RSF) methods, a nonparametric decision tree machine learning approach, to identify predictors of incident HF. The RSF model was externally validated in a cohort of individuals with T2DM using the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT).

RESULTS

Over a median follow-up of 4.9 years, 319 patients (3.6%) developed incident HF. The RSF models demonstrated better discrimination than the best performing Cox-based method (C-index 0.77 [95% CI 0.75–0.80] vs. 0.73 [0.70–0.76] respectively) and had acceptable calibration (Hosmer-Lemeshow statistic 2 = 9.63, P = 0.29) in the internal validation data set. From the identified predictors, an integer-based risk score for 5-year HF incidence was created: the WATCH-DM (Weight [BMI], Age, hyperTension, Creatinine, HDL-C, Diabetes control [fasting plasma glucose], QRS Duration, MI, and CABG) risk score. Each 1-unit increment in the risk score was associated with a 24% higher relative risk of HF within 5 years. The cumulative 5-year incidence of HF increased in a graded fashion from 1.1% in quintile 1 (WATCH-DM score ≤7) to 17.4% in quintile 5 (WATCH-DM score ≥14). In the external validation cohort, the RSF-based risk prediction model and the WATCH-DM risk score performed well with good discrimination (C-index = 0.74 and 0.70, respectively), acceptable calibration (P ≥0.20 for both), and broad risk stratification (5-year HF risk range from 2.5 to 18.7% across quintiles 1–5).

CONCLUSIONS

We developed and validated a novel, machine learning–derived risk score that integrates readily available clinical, laboratory, and electrocardiographic variables to predict the risk of HF among outpatients with T2DM.

Bekijk het originele bericht