Sunday, May 26, 2013

Trees of whole genomes, but locally

With the increasing use of next-generation sequencing methods, whole genomes of multiple species, as well as of closely related strains or populations, are being generated. Sophisticated models that draw on our current knowledge of population dynamics and molecular processes are used extensively to interpret these data.

The general idea is to distinguish the population histories of different parts of the genome. The phylogenies of different regions might differ because of various functional constraints. Hence, a single consensus tree built for the entire genome will average over these local effects. To overcome this limitation, many methods are being proposed to identify local phylogenies across the genome.
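
To make the averaging problem concrete, here is a minimal Python sketch. It is only a toy and not how any particular tool works: the sample names, the 0/1 allele strings, the fixed window size, and the helper functions are all made up for illustration. It computes pairwise distances between samples in each window and reports which pair is closest there.

```python
from itertools import combinations

def pairwise_distances(genotypes, start, end):
    """Fraction of differing sites for each sample pair in columns [start, end)."""
    names = sorted(genotypes)
    dist = {}
    for a, b in combinations(names, 2):
        seq_a = genotypes[a][start:end]
        seq_b = genotypes[b][start:end]
        diffs = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
        dist[(a, b)] = diffs / (end - start)
    return dist

def windowed_closest_pairs(genotypes, window):
    """For each fixed-size window, return (start, end, closest sample pair)."""
    n_sites = len(next(iter(genotypes.values())))
    result = []
    for start in range(0, n_sites, window):
        end = min(start + window, n_sites)
        dist = pairwise_distances(genotypes, start, end)
        result.append((start, end, min(dist, key=dist.get)))
    return result

# Hypothetical toy data: three samples, 16 biallelic sites coded 0/1.
# A groups with B in the first half and with C in the second half.
TOY = {
    "A": "0000000011110000",
    "B": "0000000000001111",
    "C": "1111111111110000",
}

if __name__ == "__main__":
    for start, end, pair in windowed_closest_pairs(TOY, window=8):
        print(f"sites {start}-{end}: closest pair {pair}")
    print("genome-wide:", pairwise_distances(TOY, 0, 16))
```

In this toy data, A is identical to B in the first window and identical to C in the second, yet the genome-wide distances A-B and A-C come out equal, so a single whole-genome summary cannot tell the two local relationships apart. Real methods choose region boundaries from the data rather than using fixed windows.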

The program Saguaro, from the Science for Life Laboratory and the Broad Institute, can use SNP data in VCF format or whole-genome assembly alignments in MAF format to partition the genome based on local phylogenetic information. The method and its use on a few example datasets are described in "Unsupervised genome-wide recognition of local relationship patterns".
