Prioritising positively selected variants in whole-genome sequencing data using FineMAV
BACKGROUND: In population genomics, polymorphisms that are highly differentiated between geographically separated populations are often suggestive of Darwinian positive selection. Genomic scans have highlighted several such regions in African and non-African populations, but only a handful of these...
Main authors: | Wahyudi, Fadilla; Aghakhanian, Farhang; Rahman, Sadequr; Teo, Yik-Ying; Szpak, Michał; Dhaliwal, Jasbir; Ayub, Qasim |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8684245/ https://www.ncbi.nlm.nih.gov/pubmed/34922440 http://dx.doi.org/10.1186/s12859-021-04506-9 |
_version_ | 1784617579888771072 |
---|---|
author | Wahyudi, Fadilla Aghakhanian, Farhang Rahman, Sadequr Teo, Yik-Ying Szpak, Michał Dhaliwal, Jasbir Ayub, Qasim |
author_sort | Wahyudi, Fadilla |
collection | PubMed |
description | BACKGROUND: In population genomics, polymorphisms that are highly differentiated between geographically separated populations are often suggestive of Darwinian positive selection. Genomic scans have highlighted several such regions in African and non-African populations, but only a handful of these have functional data that clearly link candidate variants to the selection process. Fine-Mapping of Adaptive Variation (FineMAV) was developed to address this in a high-throughput manner using population-based whole-genome sequences generated by the 1000 Genomes Project. It pinpoints positively selected genetic variants in sequencing data by prioritizing high-frequency, population-specific and functional derived alleles. RESULTS: We developed stand-alone software that implements the FineMAV statistic. To visualise the FineMAV scores graphically, it outputs the statistics as bigWig files, a common file format supported by many genome browsers. It is available with both a command-line and a graphical user interface. The software was tested by replicating the FineMAV scores obtained using 1000 Genomes Project African, European, East Asian and South Asian populations, and was subsequently applied to whole-genome sequencing datasets from Singapore and China to highlight population-specific variants that can be subsequently modelled. The software tool is publicly available at https://github.com/fadilla-wahyudi/finemav. CONCLUSIONS: The software tool described here determines genome-wide FineMAV scores, using low- or high-coverage whole-genome sequencing datasets, that can be used to prioritize a list of population-specific, highly differentiated candidate variants for in vitro or in vivo functional screens. The tool displays these scores on human genome browsers for easy visualisation, annotation and comparison between different genomic regions in worldwide human populations. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04506-9. |
format | Online Article Text |
id | pubmed-8684245 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8684245 2021-12-20 Prioritising positively selected variants in whole-genome sequencing data using FineMAV Wahyudi, Fadilla Aghakhanian, Farhang Rahman, Sadequr Teo, Yik-Ying Szpak, Michał Dhaliwal, Jasbir Ayub, Qasim BMC Bioinformatics Software BioMed Central 2021-12-18 /pmc/articles/PMC8684245/ /pubmed/34922440 http://dx.doi.org/10.1186/s12859-021-04506-9 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Prioritising positively selected variants in whole-genome sequencing data using FineMAV |
title_sort | prioritising positively selected variants in whole-genome sequencing data using finemav |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8684245/ https://www.ncbi.nlm.nih.gov/pubmed/34922440 http://dx.doi.org/10.1186/s12859-021-04506-9 |