HPViewer: sensitive and specific genotyping of human papillomavirus in metagenomic DNA.

Author(s): Hao Y, Yang L, Galvao Neto A, Amin MR, Kelly D, Brown SM, Branski RC, Pei Z

Journal: Bioinformatics

Date: 2018 Jun 15

Major Program(s) or Research Group(s): PLCO

PubMed ID: 29377990

PMC ID: PMC6658710

Abstract: Motivation: Shotgun DNA sequencing provides sensitive detection of all 182 HPV types in tissue and body fluid. However, existing computational methods either produce false positives, misidentifying HPV types because of sequences shared among HPV, human and prokaryotes, or produce false negatives, because they identify HPV from assembled contigs, which requires a large abundance of HPV reads. Results: We designed HPViewer with two custom HPV reference databases, one masking simple repeats and the other masking homologous sequences, and one homology distance matrix to hybridize these two databases. It identifies HPV directly from short DNA reads rather than from assembled contigs. Using 100 100 simulated samples, we showed that HPViewer was robust for samples containing either high or low numbers of HPV reads. Using 12 shotgun sequencing samples from respiratory papillomatosis, HPViewer was equal to VirusTAP and Vipie and better than HPVDetector with respect to specificity, and was the most sensitive method in the detection of HPV types 6 and 11. We demonstrated that contig-based approaches had disadvantages in detecting HPV. In 1573 sets of metagenomic data from 18 human body sites, HPViewer identified 104 types of HPV in a body-site-associated pattern and 89 types of HPV co-occurring in one sample with other types of HPV. We demonstrated that HPViewer was sensitive and specific for HPV detection in metagenomic data. Availability and implementation: HPViewer can be accessed at Supplementary information: Supplementary data are available at Bioinformatics online.