Principal Investigator

Chia-Lin Wei
Awardee Organization

Jackson Laboratory
United States

Fiscal Year
2020
Activity Code
R33
Early Stage Investigator Grants (ESI)
Not Applicable
Project End Date

Advancing Ultra Long-read Sequencing and Chromatin Interaction Analyses for Chromosomal and Extrachromosomal Structural Variation Characterization in Cancer

Structural variants (SVs) such as deletions, insertions, inversions, duplications, and translocations in cancer genomes can promote tumor progression by perturbing gene structures and expression. Additionally, extrachromosomal DNA (ecDNA), an extreme form of SV found in a wide range of cancer types, is a reservoir of oncogene amplification and contributes to the genetic heterogeneity and evolution of tumors. Thus, a complete understanding of the structure and distribution of SVs and ecDNAs in tumors would shed light on their roles in tumor progression. However, the ability to detect and characterize SVs and ecDNAs at the molecular level has been limited by existing short-read sequencing approaches: large and complex SVs thwart efforts to detect them and correctly define their structures, and the multi-copy, heterogeneous nature of ecDNAs undermines determination of their primary structures. While ecDNAs can be observed by DAPI staining of metaphase tumor cells, determining their sequence content has typically relied on fluorescence in situ hybridization (FISH) to probe for candidate oncogenes. To support an unbiased and comprehensive molecular approach to the study of SVs, this project will develop and validate emerging genomic technologies that will make the detection and characterization of complex SVs and ecDNAs standard practice in cancer genomics. In Aim 1, the read lengths of the nanopore single-molecule sequencing platform will be further extended by improving genomic DNA quality and optimizing library preparation reactions, with the goal of attaining N50 read lengths of 75-100 kb. Reads of this length are expected to span many SVs and thereby reveal their molecular structures and phasing information more effectively.
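The N50 read-length target in Aim 1 can be made concrete with a short calculation. The following is a minimal sketch, not part of the project's software: the function name `n50` and the example read lengths are illustrative assumptions. It uses the common definition of N50 as the length L such that reads of length L or longer contain at least half of all sequenced bases.

```python
def n50(read_lengths):
    """Return the N50 of a collection of read lengths (in bases).

    N50 is the length L such that reads of length >= L together
    contain at least half of the total sequenced bases.
    """
    if not read_lengths:
        raise ValueError("need at least one read length")
    ordered = sorted(read_lengths, reverse=True)  # longest reads first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical read lengths in bases, for illustration only.
example_reads = [100_000, 80_000, 60_000, 40_000, 20_000]
print(n50(example_reads))
```

Because N50 weights reads by the bases they contribute, a small number of very long reads can raise it substantially even when most reads are short, which is why it is a natural metric for an ultra-long-read protocol.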
In parallel, the recently developed SV-detection pipeline, Picky, will be optimized to recognize the molecular signatures of complex SVs and ecDNAs, allowing their accurate and sensitive detection in long-read sequencing data with precision and recall above 0.8. The active transcription of ecDNAs suggests that they are associated with RNA polymerase II transcription complexes, making them suitable for unsupervised detection by the chromatin interaction assay ChIA-PET. In Aim 2, this method will be employed to map ecDNAs via their association with RNA polymerase II and to reveal transcriptionally relevant interactions between ecDNAs and the chromosomes. Computational methods will be developed to specifically detect ecDNA-amplified sequences and their associated oncogenes in ChIA-PET data. Additionally, ecDNAs uncovered by ChIA-PET will be targeted by the CRISPR/dCas9-based targeted capture method to physically isolate ecDNA molecules for long-read sequencing and structural characterization. Aim 3 will build on the developed methods to generate a platform for unbiased and unsupervised characterization of SVs and ecDNAs in glioblastoma neurosphere cultures and in xenograft tumor models of glioblastoma, breast, and lung cancer. Taken together, this project will develop methods and tools that will empower the cancer research community to confidently and comprehensively detect SVs and ecDNAs in cancer genomes.
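The precision and recall targets stated above for SV detection can be illustrated with a small benchmarking calculation. This is a sketch under stated assumptions and does not describe how Picky is evaluated: the tuple layout `(chromosome, position, sv_type)`, the 500 bp breakpoint tolerance, the function name `precision_recall`, and the example calls are all hypothetical.

```python
def precision_recall(calls, truth, tolerance=500):
    """Compare SV calls to a truth set.

    Each SV is a (chromosome, position, sv_type) tuple. A call counts
    as a true positive if an unmatched truth SV has the same chromosome
    and type and lies within `tolerance` bases of the called position.
    Each truth SV can be matched at most once.
    """
    unmatched = list(truth)
    true_pos = 0
    for chrom, pos, sv_type in calls:
        for candidate in unmatched:
            same_site = candidate[0] == chrom and candidate[2] == sv_type
            if same_site and abs(candidate[1] - pos) <= tolerance:
                unmatched.remove(candidate)  # consume the matched truth SV
                true_pos += 1
                break
    precision = true_pos / len(calls) if calls else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical calls and truth set, for illustration only.
calls = [("chr1", 1000, "DEL"), ("chr2", 5000, "INV"), ("chr3", 200, "DUP")]
truth = [("chr1", 1100, "DEL"), ("chr2", 9000, "INV")]
precision, recall = precision_recall(calls, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the fraction of reported SVs that are real, and recall is the fraction of real SVs that are reported; a target above 0.8 for both means that neither false calls nor missed SVs exceed roughly one in five.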