RT Journal Article SR Electronic T1 Application of a Novel Machine Learning Method to Big Data Infers a Relationship Between Asthma and the Development of Neoplasia JF bioRxiv FD Cold Spring Harbor Laboratory SP 439117 DO 10.1101/439117 A1 Abbas Shojaee A1 Jose L. Gomez A1 Xiaochen Wang A1 Naftali Kaminski A1 Jonathan M. Siner A1 Seyedtaghi Takyar A1 Hongyu Zhao A1 Geoffrey Chupp YR 2020 UL http://biorxiv.org/content/early/2020/06/25/439117.abstract AB Background A relationship between asthma and the risk of having cancer has been identified in several studies. However, these studies have used different methodologies and have been primarily cross-sectional in nature, and their results have been contradictory. Population-level analyses are required to determine if a relationship truly exists. Methods We developed a novel machine learning tool to infer associations, Causal Inference using the Composition of Transactions (CICT). Two all-payer claims datasets of over two hundred million hospitalization encounters from the US-based Healthcare Cost and Utilization Project (HCUP) were used for discovery and validation. Associations between asthma and neoplasms were discovered in data from the State of Florida. Validation was conducted on eight cohorts of patients with asthma, and seven subtypes of asthma and COPD, using datasets from the State of California. Control groups were matched by gender, age, race, and history of tobacco use. Odds ratio analysis with Bonferroni-Holm correction measured the association of asthma and COPD with 26 different benign and malignant neoplasms. ICD-9-CM codes were used to identify exposures and outcomes. Findings CICT identified 17 associations between asthma and the risk of neoplasia in the discovery dataset. 
In the validation studies, 208 case-control analyses were conducted between subtypes of asthma (N=999,370, male=33%, age=50) and COPD (N=715,971, male=50%, age=69) and the corresponding matched control groups (N=8,400,004, male=42%, age=47). Allergic asthma was associated with benign neoplasms of the meninges and of the salivary, pituitary, parathyroid, and thyroid glands (OR: 1.52 to 2.52), and with malignant neoplasms of the breast, intrahepatic biliary system, and hematopoietic and lymphatic systems (OR: 1.45 to 2.05). COPD was associated with malignant neoplasms in the lung, bladder, and hematopoietic systems. Interpretation The combined use of machine learning methods for knowledge discovery and epidemiological methods shows that allergic asthma is associated with the development of neoplasia, including in glandular organs, ductal tissues, and hematopoietic systems. Our findings also differentiate the pattern of neoplasms between allergic asthma and obstructive asthma. This suggests that inflammatory pathways that are active in asthma also contribute to neoplastic transformation in specific organ systems such as secretory organs. Funding None. At a Glance Commentary Over the past three decades, studies have suggested that asthma could increase the risk of developing cancer, but a consensus has not been reached. The debate persists because the current evidence has been derived using cross-sectional statistical designs, limited datasets, and small cohorts, and has produced conflicting results. In addition, the mechanism by which allergic airway inflammation contributes to neoplastic transformation is postulated but not proven. Here, we present the largest study to date on this association in patients with asthma or COPD. A knowledge discovery method was used for hypothesis generation that, when combined with epidemiological reasoning tools, identified associations between airway disease and neoplasia. 
The results reveal novel relationships between allergic asthma and benign glandular tumors and confirm the well-known connection between COPD and lung cancer. Further, we identified a novel association of both COPD and asthma with hematological malignancies. These findings reconcile contradictory results from other studies and show more specifically that the types of neoplasms associated with asthma differ from those associated with COPD, which suggests mechanistic plausibility. Competing Interest Statement The authors have declared no competing interest.