The Dfam database of repetitive DNA families
Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Dfam is an open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). The ini...
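Main Authors: | Hubley, Robert, Finn, Robert D., Clements, Jody, Eddy, Sean R., Jones, Thomas A., Bao, Weidong, Smit, Arian F.A., Wheeler, Travis J. |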
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2016 |
Subjects: | Database Issue |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4702899/ https://www.ncbi.nlm.nih.gov/pubmed/26612867 http://dx.doi.org/10.1093/nar/gkv1272 |
_version_ | 1782408675749527552 |
---|---|
author | Hubley, Robert Finn, Robert D. Clements, Jody Eddy, Sean R. Jones, Thomas A. Bao, Weidong Smit, Arian F.A. Wheeler, Travis J. |
author_facet | Hubley, Robert Finn, Robert D. Clements, Jody Eddy, Sean R. Jones, Thomas A. Bao, Weidong Smit, Arian F.A. Wheeler, Travis J. |
author_sort | Hubley, Robert |
collection | PubMed |
description | Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Dfam is an open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). The initial release of Dfam, featured in the 2013 NAR Database Issue, contained 1143 families of repetitive elements found in humans, and was used to produce more than 100 Mb of additional annotation of TE-derived regions in the human genome, with improved speed. Here, we describe recent advances, most notably expansion to 4150 total families including a comprehensive set of known repeat families from four new organisms (mouse, zebrafish, fly and nematode). We describe improvements to coverage, and to our methods for identifying and reducing false annotation. We also describe updates to the website interface. The Dfam website has moved to http://dfam.org. Seed alignments, profile HMMs, hit lists and other underlying data are available for download. |
format | Online Article Text |
id | pubmed-4702899 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4702899 2016-01-07 The Dfam database of repetitive DNA families Hubley, Robert Finn, Robert D. Clements, Jody Eddy, Sean R. Jones, Thomas A. Bao, Weidong Smit, Arian F.A. Wheeler, Travis J. Nucleic Acids Res Database Issue Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Dfam is an open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). The initial release of Dfam, featured in the 2013 NAR Database Issue, contained 1143 families of repetitive elements found in humans, and was used to produce more than 100 Mb of additional annotation of TE-derived regions in the human genome, with improved speed. Here, we describe recent advances, most notably expansion to 4150 total families including a comprehensive set of known repeat families from four new organisms (mouse, zebrafish, fly and nematode). We describe improvements to coverage, and to our methods for identifying and reducing false annotation. We also describe updates to the website interface. The Dfam website has moved to http://dfam.org. Seed alignments, profile HMMs, hit lists and other underlying data are available for download. Oxford University Press 2016-01-04 2015-11-26 /pmc/articles/PMC4702899/ /pubmed/26612867 http://dx.doi.org/10.1093/nar/gkv1272 Text en © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Database Issue Hubley, Robert Finn, Robert D. Clements, Jody Eddy, Sean R. Jones, Thomas A. Bao, Weidong Smit, Arian F.A. Wheeler, Travis J. The Dfam database of repetitive DNA families |
title | The Dfam database of repetitive DNA families |
title_full | The Dfam database of repetitive DNA families |
title_fullStr | The Dfam database of repetitive DNA families |
title_full_unstemmed | The Dfam database of repetitive DNA families |
title_short | The Dfam database of repetitive DNA families |
title_sort | dfam database of repetitive dna families |
topic | Database Issue |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4702899/ https://www.ncbi.nlm.nih.gov/pubmed/26612867 http://dx.doi.org/10.1093/nar/gkv1272 |