Polymorphism Analysis Reveals Reduced Negative Selection and Elevated Rate of Insertions and Deletions in Intrinsically Disordered Protein Regions.
Bottom Line: We also confirm previous findings that nonframeshifting indels are much more abundant in disordered regions relative to structured regions. We find that the rate of nonframeshifting indel polymorphism in intrinsically disordered regions resembles that of noncoding DNA and pseudogenes, and that large indels segregate in disordered regions in the human population. Our survey of polymorphism confirms patterns of evolution in disordered regions inferred based on longer evolutionary comparisons.
Affiliation: Department of Cell & Systems Biology, University of Toronto, Ontario, Canada.
Mentions: The dramatic increase in rate of nonframeshifting indel polymorphism suggests that most of the large protein coding indels segregating in the human population will be found in disordered protein regions. In figure 7, we show examples of large indels segregating at high frequency in two important human proteins, IRF5 (Fan et al. 2010) and GRIN3B (Niemann et al. 2008). In the case of IRF5, an insertion seems to have appeared in the human–chimp ancestor and reached a frequency of 54.6% in the overall 1000 Genomes population. This region is not of low complexity, but repeating codons could be increasing the region’s propensity for indels. Different length indels of similar sequence in orangutan, marmoset, and squirrel monkey support this idea. Interestingly, the orangutan genome appears to contain a similar, albeit independent, insertion in this region. In the case of GRIN3B, the deletion likely represents the derived state, and removes nine amino acids in around 16% of the 1000 Genomes population. These examples also illustrate the difficulty in properly aligning rapidly evolving disordered regions over long evolutionary distances. Fig. 7.
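As a minimal sketch of what "nonframeshifting" means here: an indel whose length is a multiple of three nucleotides adds or removes whole codons without shifting the reading frame, so a deletion of nine amino acids corresponds to 27 bp. The helper `classify_indel` and the example alleles below are hypothetical illustrations, not the authors' pipeline and not real GRIN3B sequence.

```python
def classify_indel(ref: str, alt: str) -> dict:
    """Classify an indel from VCF-style REF/ALT alleles (hypothetical helper)."""
    length_change = len(alt) - len(ref)
    if length_change == 0:
        raise ValueError("REF and ALT have equal length; not an indel")
    size = abs(length_change)
    # An indel keeps the reading frame only if whole codons are gained or lost.
    in_frame = size % 3 == 0
    return {
        "kind": "insertion" if length_change > 0 else "deletion",
        "size_bp": size,
        "nonframeshifting": in_frame,
        "codons_changed": size // 3 if in_frame else None,
    }


# Made-up alleles: an anchor base followed by nine placeholder codons.
example = classify_indel("A" + "ACG" * 9, "A")
print(example)
```

Under these assumptions, the example is reported as a 27 bp nonframeshifting deletion spanning nine codons.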