Inferring population structure and relationship using minimal independent evolutionary markers in Y-chromosome: a hybrid approach of recursive feature selection for hierarchical clustering.

Srivastava AK, Chopra R, Ali S, Aggarwal S, Vig L, Bamezai RN - Nucleic Acids Res. (2014)

Bottom Line: An analysis of 105 world-wide populations reflected that 15 independent variations/markers were optimal in defining population structure parameters, such as FST, molecular variance and correlation-based relationship. A subsequent addition of randomly selected markers had a negligible effect (close to zero, i.e. 1 × 10⁻³) on these parameters. The study proves efficient in tracing complex population structures and deriving relationships among world-wide populations in a cost-effective and expedient manner.

Affiliation: National Centre of Applied Human Genetics, School of Life Sciences, Jawaharlal Nehru University, New Delhi 110067, India.

Figure 1: Representative example of the hierarchy of mutation events in evolution (as would occur, for example, in the Y-chromosome) in the human population. ‘A’ represents the most recent common ancestor, whose genetic background carries mutation e1. On the e1 background, three independent mutation events follow and give rise to three different clades, ‘B’, ‘C’ and ‘D’. Variations that later originate in lower nodes would in turn represent the ancestors of their respective clades.
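The hierarchy in the caption can be pictured as a small parent map, in which a lineage at a lower node carries every mutation on its path back to ‘A’. The sketch below is only an illustration: the nodes ‘A’ to ‘D’ and mutation e1 come from the caption, while the mutation names e2 to e5 and the sub-clade B1 are hypothetical labels added for the example.

```python
# Each node maps to (parent node, mutation that defines the node).
# 'A'..'D' and 'e1' follow the figure caption; 'e2'..'e5' and 'B1' are
# hypothetical names introduced only for this illustration.
TREE = {
    "A": (None, "e1"),
    "B": ("A", "e2"),
    "C": ("A", "e3"),
    "D": ("A", "e4"),
    "B1": ("B", "e5"),
}


def carried_mutations(node):
    """Return the mutations carried by a lineage at `node`, root first."""
    path = []
    while node is not None:
        parent, mutation = TREE[node]
        path.append(mutation)
        node = parent
    return path[::-1]


lineage_b1 = carried_mutations("B1")
lineage_c = carried_mutations("C")
print(lineage_b1)
print(lineage_c)
```

Because a B1 lineage necessarily carries the mutations of B and A as well, a marker at a lower node is not independent of the markers above it, whereas the markers that define sibling clades arise from separate events.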

Mentions: In the current study, we used a correlation coefficient-based supervised feature selection method embedded with agglomerative hierarchical clustering, based on prior knowledge of Y-chromosomal phylogeny. To validate our novel approach, we chose a model study based on real datasets of male-specific Y-chromosomal (MSY) variations generated in the present and earlier studies. As per the neutral theory of molecular evolution (7) and Kimura's step-wise mutation model (19), a major source of allelic diffusion in finite populations is the fixation of neutral mutations by genetic drift, i.e. mutations occur in steps, each defined by the state of variation in the preceding generation. The same applies to Y-chromosome phylogeny, i.e. each haplogroup (a combination of the same or different haplotypes) is the outcome of one or more mutation events and later stabilizes under different evolutionary forces, such as migration, genetic drift, selection and admixture, in a population or geographical region. Therefore, lower nodes in the hierarchy appear in the background of already existing higher ones (Figure 1). Given this, only the few evolutionary markers that are most ancestral in their respective clades can be considered independent; the rest are sequentially derived after the fixation and selection of the ancestral ones (Figure 1).
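The idea of keeping only independent markers and then clustering populations on them can be sketched roughly as follows. This is a simplified stand-in, not the authors' recursive procedure: it uses a greedy Pearson-correlation filter over markers listed ancestral-first, followed by average-linkage agglomerative clustering on 1 - correlation between populations. The marker names, population names, frequencies and the 0.9 threshold are all invented for illustration.

```python
from itertools import combinations
from math import sqrt

# Toy marker frequencies per population (invented values, not study data).
# 'M1a' is meant to mimic a marker derived on the background of 'M1'.
FREQS = {
    "pop1": {"M1": 0.8, "M1a": 0.40, "M2": 0.1, "M3": 0.1},
    "pop2": {"M1": 0.7, "M1a": 0.35, "M2": 0.2, "M3": 0.1},
    "pop3": {"M1": 0.1, "M1a": 0.05, "M2": 0.8, "M3": 0.1},
    "pop4": {"M1": 0.2, "M1a": 0.10, "M2": 0.1, "M3": 0.7},
    "pop5": {"M1": 0.1, "M1a": 0.05, "M2": 0.2, "M3": 0.7},
}


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return 0.0 if vx == 0 or vy == 0 else cov / sqrt(vx * vy)


def select_markers(freqs, order, threshold=0.9):
    """Greedy filter: keep a marker only if its frequency profile is not
    strongly correlated with any marker kept so far (threshold is an assumption)."""
    column = lambda m: [row[m] for row in freqs.values()]
    kept = []
    for marker in order:
        if all(abs(pearson(column(marker), column(k))) < threshold for k in kept):
            kept.append(marker)
    return kept


def cluster(freqs, markers):
    """Average-linkage agglomerative clustering of populations using
    1 - Pearson correlation as distance; returns the list of merges in order."""
    profile = {p: [row[m] for m in markers] for p, row in freqs.items()}

    def dist(a, b):
        pairs = [(i, j) for i in a for j in b]
        return sum(1 - pearson(profile[i], profile[j]) for i, j in pairs) / len(pairs)

    clusters = [(p,) for p in profile]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda ab: dist(*ab))
        merges.append((a, b))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


selected = select_markers(FREQS, ["M1", "M1a", "M2", "M3"])
merges = cluster(FREQS, selected)
print(selected)
print(merges)
```

In this toy input the derived marker tracks its ancestral marker across populations, so the filter treats it as redundant, and the clustering then runs on the reduced marker set. A real analysis would work from observed haplogroup frequencies and would need a principled choice of threshold and linkage.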

