
Prioritising positively selected variants in whole-genome sequencing data using FineMAV

BACKGROUND: In population genomics, polymorphisms that are highly differentiated between geographically separated populations are often suggestive of Darwinian positive selection. Genomic scans have highlighted several such regions in African and non-African populations, but only a handful of these...


Bibliographic Details
Main Authors: Wahyudi, Fadilla; Aghakhanian, Farhang; Rahman, Sadequr; Teo, Yik-Ying; Szpak, Michał; Dhaliwal, Jasbir; Ayub, Qasim
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8684245/
https://www.ncbi.nlm.nih.gov/pubmed/34922440
http://dx.doi.org/10.1186/s12859-021-04506-9
_version_ 1784617579888771072
author Wahyudi, Fadilla
Aghakhanian, Farhang
Rahman, Sadequr
Teo, Yik-Ying
Szpak, Michał
Dhaliwal, Jasbir
Ayub, Qasim
author_sort Wahyudi, Fadilla
collection PubMed
description BACKGROUND: In population genomics, polymorphisms that are highly differentiated between geographically separated populations are often suggestive of Darwinian positive selection. Genomic scans have highlighted several such regions in African and non-African populations, but only a handful of these have functional data that clearly identify the candidate variants driving the selection process. Fine-Mapping of Adaptive Variation (FineMAV) was developed to address this in a high-throughput manner using population-based whole-genome sequences generated by the 1000 Genomes Project. It pinpoints positively selected genetic variants in sequencing data by prioritizing high-frequency, population-specific, functional derived alleles. RESULTS: We developed stand-alone software that implements the FineMAV statistic. To visualise the FineMAV scores graphically, it outputs the statistics as bigWig files, a common file format supported by many genome browsers. It is available with both a command-line and a graphical user interface. The software was tested by replicating the FineMAV scores obtained using 1000 Genomes Project African, European, East Asian and South Asian populations, and was subsequently applied to whole-genome sequencing datasets from Singapore and China to highlight population-specific variants that can then be modelled. The software tool is publicly available at https://github.com/fadilla-wahyudi/finemav. CONCLUSIONS: The software tool described here determines genome-wide FineMAV scores from low- or high-coverage whole-genome sequencing datasets. These scores can be used to prioritize a list of population-specific, highly differentiated candidate variants for in vitro or in vivo functional screens. The tool displays the scores in human genome browsers for easy visualisation, annotation and comparison between different genomic regions in worldwide human populations.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04506-9.
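The abstract says the statistic favours derived alleles that are at once high-frequency, population-specific and functional. The sketch below shows one plausible way to combine those three properties into a per-population score. It is an illustration only: the multiplicative form, the `penalty` exponent, the population labels and the function names are assumptions made for this example and are not taken from this record; the exact definition should be checked against the FineMAV publication and software.

```python
from typing import Dict


def derived_allele_purity(daf: Dict[str, float], penalty: float = 3.5) -> float:
    """Population-specificity of a derived allele (hypothetical helper).

    Returns 1.0 when the allele is seen in a single population and falls
    towards 0 as it is shared more evenly. `penalty` is an assumed exponent
    controlling how strongly sharing is penalised.
    """
    total = sum(daf.values())
    if total == 0:
        return 0.0  # allele absent everywhere: nothing to prioritise
    return sum((freq / total) ** penalty for freq in daf.values())


def finemav_like_scores(
    daf: Dict[str, float], functional_score: float, penalty: float = 3.5
) -> Dict[str, float]:
    """Assumed scoring: purity x derived allele frequency x functional score."""
    purity = derived_allele_purity(daf, penalty)
    return {pop: purity * freq * functional_score for pop, freq in daf.items()}


if __name__ == "__main__":
    # Illustrative frequencies only, not real data.
    private_variant = {"AFR": 0.0, "EUR": 0.9, "EAS": 0.0}
    shared_variant = {"AFR": 0.3, "EUR": 0.3, "EAS": 0.3}
    print(finemav_like_scores(private_variant, functional_score=20.0))
    print(finemav_like_scores(shared_variant, functional_score=20.0))
```

Under these assumptions a common variant private to one population scores far higher than one shared evenly across populations, which matches the prioritisation described in the abstract.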
format Online
Article
Text
id pubmed-8684245
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8684245 2021-12-20 Prioritising positively selected variants in whole-genome sequencing data using FineMAV Wahyudi, Fadilla; Aghakhanian, Farhang; Rahman, Sadequr; Teo, Yik-Ying; Szpak, Michał; Dhaliwal, Jasbir; Ayub, Qasim BMC Bioinformatics Software BioMed Central 2021-12-18 /pmc/articles/PMC8684245/ /pubmed/34922440 http://dx.doi.org/10.1186/s12859-021-04506-9 Text en
© The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Prioritising positively selected variants in whole-genome sequencing data using FineMAV
title_sort prioritising positively selected variants in whole-genome sequencing data using finemav
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8684245/
https://www.ncbi.nlm.nih.gov/pubmed/34922440
http://dx.doi.org/10.1186/s12859-021-04506-9