Gene families as soft cliques with backbones: Amborella contrasted with other flowering plants


Bibliographic Details
Main authors: Zheng, Chunfang; Kononenko, Alexey; Leebens-Mack, Jim; Lyons, Eric; Sankoff, David
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4240082/
https://www.ncbi.nlm.nih.gov/pubmed/25572777
http://dx.doi.org/10.1186/1471-2164-15-S6-S8
Description
Summary: BACKGROUND: Chaining is a major problem in constructing gene families. RESULTS: We define a new kind of cluster on graphs with strong and weak edges: soft cliques with backbones (SCWiB). This differs from other definitions in how it controls the "chaining effect", by ensuring clusters satisfy a tolerant edge density criterion that takes into account cluster size. We implement algorithms for decomposing a graph of similarities into SCWiBs. We compare examples of output from SCWiB and the Markov Cluster Algorithm (MCL), and also compare some curated Arabidopsis thaliana gene families with the results of automatic clustering. We apply our method to 44 published angiosperm genomes with annotation, and discover that Amborella trichopoda is distinct from all the others in having substantially and systematically smaller proportions of moderate- and large-size gene families. CONCLUSIONS: We offer several possible evolutionary explanations for this result.
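
The "chaining effect" named in the summary can be illustrated with a small sketch. It is not the SCWiB algorithm and does not reproduce the paper's density criterion, which this record does not state. It only shows, on a made-up chain of five hypothetical genes, why grouping by connectivity alone can yield a family whose members are mostly not similar to one another. The gene names, the helper functions, and the plain edge-density measure are all assumptions chosen for illustration.

```python
def connected_components(nodes, edges):
    """Group nodes that are linked by any path (single-linkage style)."""
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def edge_density(cluster, edges):
    """Fraction of possible pairs in the cluster that are joined by an edge."""
    possible = len(cluster) * (len(cluster) - 1) // 2
    if possible == 0:
        return 1.0
    present = sum(1 for u, v in edges if u in cluster and v in cluster)
    return present / possible


# Hypothetical data: each gene is similar only to its neighbour in the chain.
genes = ["g1", "g2", "g3", "g4", "g5"]
chain = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g4", "g5")]

components = connected_components(genes, chain)
density = edge_density(components[0], chain)

print(len(components), "cluster of size", len(components[0]))
print("edge density:", density)
```

Here connectivity alone puts all five genes in one family even though only 4 of the 10 possible pairs are similar. A density requirement would reject or split such a cluster; how tolerant that requirement is, and how it scales with cluster size, is what the paper's definition specifies.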