Zoe Guan, a Research Scholar in the Department of Epidemiology and Biostatistics at Memorial Sloan Kettering Cancer Center and faculty candidate in the Division of Biostatistics and the Masonic Institute for the Developing Brain, will present:
“Predicting Cancer Risk from Genetic Information”
Abstract: Most cancer types have a substantial heritable component. Family history and/or DNA sequencing data can be used to identify individuals at high risk of cancer, which is key to reducing morbidity and mortality through targeted interventions. In this talk, I will present three projects on statistical and machine learning methods for leveraging these types of data to predict cancer risk. The first two projects are on predicting breast cancer risk using family history data. BRCAPRO is a family history-based breast cancer risk prediction model that is widely used in clinical practice. A major limitation of this model is that it does not consider non-genetic risk factors. The first project focuses on applying ensemble learning and survival analysis methods to integrate non-genetic risk factors into BRCAPRO. Another limitation of BRCAPRO and other family history-based models is that they rely on strong assumptions and a priori knowledge about the genetic architecture of breast cancer. Motivated by the increasing availability of large-scale health data resources and the striking performance of deep learning in other prediction problems, the second project focuses on using large collections of pedigrees to develop deep learning models for breast cancer that are data-driven and therefore do not require a priori knowledge. Finally, the third project focuses on using germline whole-exome sequencing data to predict cancer risk. Rare variants have long been hypothesized as a major source of unexplained heritability, and in recent years, large-scale next-generation sequencing projects have made it possible to study these variants much more comprehensively. In this project, we used data from the UK Biobank to build risk prediction models for common cancer types that aggregate rare variants based on their genomic contexts.
All are welcome.