Program Official

Principal Investigator

Ju Sun
Awardee Organization

University Of Minnesota
United States

Fiscal Year
2023
Activity Code
R01
Early Stage Investigator Grants (ESI)
Not Eligible
Project End Date

SCH: A New Computational Framework for Learning from Imbalanced Biomedical Data

Advances in cancer prevention, diagnosis, and treatment have dramatically improved long-term survival among people diagnosed with breast cancer (BC). However, this success has been tempered by a parallel increase in the incidence of chronic conditions among BC survivors, in particular cardiovascular disease (CVD), due at least in part to cardiotoxic treatment regimens. Current evidence-based guidelines for preventing and controlling CVD in BC survivors are broad, and clear guidance for assessing individualized risk of cardiovascular events is lacking. Existing CVD risk prediction models focus on the general population and rely on only a limited number of variables. The adoption and integration of electronic health record (EHR) systems have provided a wealth of information about individual characteristics at the point of care, including unstructured clinical narratives, imaging data, and structured clinical variables. However, real-world EHR data are highly imbalanced, both in the fraction of patients with CVD outcomes and in the non-uniform distribution of time from BC diagnosis to CVD development. Our overarching goal is to develop solid computational and theoretical foundations for learning from imbalanced real-world data, with an emphasis on BC-CVD outcome risk prediction. Specifically, we will develop a computational framework for imbalanced classification and imbalanced regression tasks in CVD risk prediction among BC survivors using multimodal EHR data. The successful implementation of this project would lay a computational foundation for imbalanced learning and could provide more accurate tools for predicting BC-CVD outcomes.

Publications

  • Zhou H, Li M, Xiao Y, Yang H, Zhang R. LLM Instruction-Example Adaptive Prompting (LEAP) Framework for Clinical Relation Extraction. medRxiv: the preprint server for health sciences. 2023 Dec 17. PMID: 38168203
  • Yang H, Li M, Zhou H, Xiao Y, Fang Q, Zhang R. One LLM is not Enough: Harnessing the Power of Ensemble Learning for Medical Question Answering. medRxiv: the preprint server for health sciences. 2023 Dec 24. PMID: 38196648