
STARRPeaker: uniform processing and accurate identification of STARR-seq active regions

Bibliographic Details
Main authors: Lee, Donghoon; Shi, Manman; Moran, Jennifer; Wall, Martha; Zhang, Jing; Liu, Jason; Fitzgerald, Dominic; Kyono, Yasuhiro; Ma, Lijia; White, Kevin P.; Gerstein, Mark
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7722316/
https://www.ncbi.nlm.nih.gov/pubmed/33292397
http://dx.doi.org/10.1186/s13059-020-02194-x
Description
Summary: STARR-seq technology has employed progressively more complex genomic libraries and increased sequencing depths. An issue with the increased complexity and depth is that the coverage in STARR-seq experiments is non-uniform, overdispersed, and often confounded by sequencing biases, such as GC content. Furthermore, STARR-seq readout is confounded by RNA secondary structure and thermodynamic stability. To address these potential confounders, we developed a negative binomial regression framework for uniformly processing STARR-seq data, called STARRPeaker. Moreover, to aid our effort, we generated whole-genome STARR-seq data from the HepG2 and K562 human cell lines and applied STARRPeaker to comprehensively and unbiasedly call enhancers in them. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-020-02194-x.
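
The summary names a negative binomial regression framework but gives no formula, so the following is only an illustrative sketch of the general idea, not STARRPeaker's actual code or model. It assumes a log link, a mean-dispersion parameterization (variance = mu + alpha * mu^2), and entirely made-up coefficients; the function names, the covariate set (input coverage, GC content, RNA folding energy), and the example bin values are all hypothetical. Given an expected count predicted from the covariates, it scores how surprising an observed count is with an upper-tail probability.

```python
import math


def nb_log_pmf(k, mu, alpha):
    """Log-probability of count k under a negative binomial with
    mean mu and dispersion alpha (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha          # "size" parameter
    p = r / (r + mu)         # success probability
    return (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
            + r * math.log(p) + k * math.log1p(-p))


def nb_upper_tail(k, mu, alpha):
    """P(Y >= k): chance of seeing at least k reads under the model."""
    below = sum(math.exp(nb_log_pmf(j, mu, alpha)) for j in range(k))
    return max(0.0, 1.0 - below)


def expected_count(input_cov, gc, folding_energy, coef):
    """Expected output count for one bin, using a log link.
    The covariates and coefficient names are assumptions for illustration."""
    eta = (coef["intercept"]
           + coef["log_input"] * math.log(input_cov + 1)
           + coef["gc"] * gc
           + coef["folding_energy"] * folding_energy)
    return math.exp(eta)


# Hypothetical coefficients; a real analysis would estimate these
# (and the dispersion) from the data rather than hard-code them.
COEF = {"intercept": 0.5, "log_input": 0.8, "gc": 0.6, "folding_energy": -0.01}
ALPHA = 0.4

if __name__ == "__main__":
    # One hypothetical genomic bin: input coverage, GC fraction,
    # folding free energy (kcal/mol), and observed STARR-seq output count.
    mu = expected_count(input_cov=30, gc=0.45, folding_energy=-120.0, coef=COEF)
    p_value = nb_upper_tail(k=250, mu=mu, alpha=ALPHA)
    print(f"expected count = {mu:.1f}, upper-tail p = {p_value:.3g}")
```

Bins whose observed count is far above the covariate-adjusted expectation get small upper-tail probabilities; in a genome-wide setting such values would still need multiple-testing correction before being interpreted.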