TSRC, Tob. Sci. Res. Conf., 2019, 73, abstr. 030

Identification of predictive clinical biomarkers for developing chronic obstructive pulmonary disease using real world evidence data

LIU GANG M.; MAKENA P.; HONG Kyung Soo; SCOTT E.; PRASAD G.L.
RAI Services Company, Winston-Salem, NC, USA

Identifying predictive biomarkers and quantifying individual risk for developing smoking-associated diseases such as Chronic Obstructive Pulmonary Disease (COPD) aid in evaluating and predicting the health effects of tobacco products. This study aimed to identify predictive biomarker(s) for COPD in U.S. smokers by leveraging a Real World Evidence (RWE) approach. We performed a retrospective analysis of smokers’ electronic health records dated prior to their COPD diagnosis, drawn from the Explorys database available from IBM Watson Health. Electronic health records from 181,250 smokers with COPD and 2.2 million smokers without COPD were analyzed at the subject level for 75 selected health measures and 900 clinical features derived from the selected biomarkers. A computational model built on the RWE data predicted development of COPD with 76% precision (true positive rate) and an Area Under the Receiver Operating Characteristic Curve of 0.801 for the subject-level outcome. A set of 32 biomarkers (e.g., coagulation tissue factor, cholesterol, erythrocytes) and 96 clinical features (different ways a given biomarker is reported or analyzed) were identified as having predictive power in modeling development of COPD. Taken collectively, the clinical biomarkers with the highest predictive power were platelets, cholesterol in HDL, coagulation tissue factor, age and leukocytes. These findings from RWE data help in building an individual risk scoring model to estimate the likelihood of smokers developing COPD.
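
As a rough illustration of how subject-level metrics like those reported above can be computed, the sketch below evaluates a toy set of predictions. The labels, scores, 0.5 decision threshold, and function names are all hypothetical and are not taken from the study; the abstract does not describe the model or its evaluation code. In standard usage, precision is TP / (TP + FP), while the true positive rate (recall) is TP / (TP + FN), so the sketch computes both alongside the area under the ROC curve.

```python
# Minimal sketch with synthetic data (NOT study data); names are illustrative.

def precision_and_tpr(y_true, y_pred):
    """Return (precision, true positive rate) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, tpr


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as half), i.e. the Mann-Whitney formulation."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative subject")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical subjects: 1 = later diagnosed with COPD, 0 = not diagnosed.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical model risk scores for the same subjects.
scores = [0.9, 0.7, 0.4, 0.8, 0.6, 0.2, 0.1, 0.05]
# Assumed decision threshold of 0.5 (an arbitrary choice for this sketch).
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, tpr = precision_and_tpr(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.3f} TPR={tpr:.3f} AUC={auc:.3f}")
```

On this toy data the two threshold-based metrics differ (precision 0.500, TPR 0.667), while the AUC (0.800) does not depend on the threshold at all.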