Program Official
Principal Investigator
Han Liang
Awardee Organization
University of Texas MD Anderson Cancer Center
United States
Fiscal Year
2024
Activity Code
U24
Early Stage Investigator Grants (ESI)
Not Applicable
Project End Date
NIH RePORTER
For more information, see NIH RePORTER Project 5U24CA264128-03
The Cancer Proteome Atlas: an Integrated Bioinformatics Resource for Functional Cancer Proteomic Data
Reverse-phase protein arrays (RPPAs) offer a powerful functional proteomic approach to investigate molecular mechanisms and response to therapy in cancer. MD Anderson Cancer Center is a leader in the implementation of this antibody-based technology, which can assess many protein markers across large numbers of samples in a cost-effective, sensitive, and high-throughput manner. The platform currently assesses ~500 protein markers, covering all major signaling pathways and most drug targets. Its utility was demonstrated through its selection as the platform for proteomic characterization of ~8,000 patient samples through TCGA and >1,000 cell lines through CCLE, and its designation as one of two NCI Genome Characterization Centers in 2015. It is an approved Cancer Therapy Evaluation Platform site for sample characterization, leading to the implementation of multiple effective clinical trials.

With ITCR support, we have developed a major bioinformatics resource dedicated to the analysis, visualization, and dissemination of RPPA data, The Cancer Proteome Atlas (TCPA), which has a community of >80,000 users worldwide. The current objective is to improve data quality control, to enhance the existing analytic capabilities, and to expand the scope of TCPA by adding new functionalities and datasets. We have formed working relationships to link TCPA with other widely used bioinformatics resources. As an experienced, multidisciplinary team, we will pursue four specific aims:

Aim #1. Develop a user-friendly, all-in-one software pipeline for processing RPPA data. We will improve the quality control and batch-effect adjustment steps of RPPA data processing, enhance the performance of the pipeline and the interactivity of the results, and provide a user-friendly, general software package to the scientific community.

Aim #2. Expand and enhance our existing web platforms for the analysis of RPPA data. We will extend the scope of RPPA data, incorporate other types of molecular data, especially proteomic data, and enhance the analytic and visualization capabilities.

Aim #3. Build a user-friendly, interactive web platform for the analysis of cancer RPPA data from xenograft, PDX, and animal models. We will collect and compile RPPA data of >10,000 such samples and develop related visualization and analytic modules.

Aim #4. Promote TCPA and active interaction with the user community. We will enhance the RPPA data repository and promote it as a standard reference database; provide documentation, hands-on workshops, and bug fixes; and build web APIs for interaction with other tools.

The expected outcome is a dedicated, comprehensive bioinformatics resource that fully integrates RPPA data generation, analysis, dissemination, and user feedback, allowing for fluent exploration and analysis of high-quality proteomic data in rich contexts. The project is important because it will greatly enhance the quality and reproducibility of RPPA data from important consortium projects; substantially reduce barriers in mining complex functional proteomic data; serve as a hub for integrating high-quality RPPA-based proteomics data into other widely used bioinformatics resources; and directly facilitate the development of protein markers for precision cancer medicine.