Trends in genome dynamics among major orders of insects revealed through variations in protein families.

Rappoport N, Linial M - BMC Genomics (2015)

Bottom Line: A comprehensive analysis based on statistical considerations identified the families that were significantly expanded or reduced in any of the studied organisms. We found that many species-specific families are associated with receptor signaling, stress-related functions and proteases. We propose that the expansion of TNAP families in Hymenoptera potentially contributes to the accelerated genome dynamics that characterize the wasp and ants.


Affiliation: School of Computer Science and Engineering, The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University, Jerusalem, Israel. nadavrap@cs.huji.ac.il.

ABSTRACT

Background: Insects belong to a class that accounts for the majority of animals on earth. With over one million identified species, insects display a huge diversity and occupy extreme environments. At present, there are dozens of fully sequenced insect genomes that cover a range of habitats, social behaviors and morphologies. In view of such a diverse collection of genomes, revealing evolutionary trends and charting functional relationships of proteins remain challenging.

Results: We analyzed the relatedness of 17 complete insect proteomes, including louse, bee, beetle, ants, flies and mosquitoes, together with an out-group from the crustaceans. The analyzed proteomes mostly represented the orders Hymenoptera and Diptera. The 287,405 protein sequences from the 18 proteomes were automatically clustered into 20,933 families, including 799 singletons. A comprehensive analysis based on statistical considerations identified the families that were significantly expanded or reduced in any of the studied organisms. Among all the tested species, ants are characterized by an exceptionally high rate of family gain and loss. Assigning annotations to hundreds of species-specific families revealed the functional diversity among species and between the major clades (Diptera and Hymenoptera). We found that many species-specific families are associated with receptor signaling, stress-related functions and proteases. The highest variability among insects is associated with transposition and nucleic acids processes (collectively coined TNAP). Specifically, the wasp and ants have an order of magnitude more TNAP families and proteins than species that belong to Diptera (mosquitoes and flies).

Conclusions: An unsupervised clustering methodology combined with a comparative functional analysis unveiled proteomic signatures in the major clades of winged insects. We propose that the expansion of TNAP families in Hymenoptera potentially contributes to the accelerated genome dynamics that characterize the wasp and ants.


Fig. 5: Analysis of Root superfamilies (SF). (a) Number of proteins in each of the 18 species for a Root SF with 399 proteins annotated “Fibrinogen-beta and gamma chains, C-terminal globular domain”. The maximal number of proteins is associated with Diptera, and specifically with the 4 mosquitoes. (b) 114 Root SFs that have a size of >200 proteins from Hymenoptera (H) and Diptera (D). Considering only proteins from Diptera and Hymenoptera, the baseline probability for Hymenoptera proteins is 0.61 (dashed line, see Methods). A confidence threshold based on the binomial distribution at P-value <10e-5 is shown as dashed bent lines. The high-level functionalities for expanded and contracted Root SFs are color-coded. TNAP, transposition and nucleic acids processes; H, Hymenoptera; D, Diptera. The Root SF annotated Fibrinogen that is analyzed in (a) is marked by an arrowhead.
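The bent threshold lines described in the caption can be illustrated with a short sketch. It assumes that, for a family of a given size, the number of Hymenoptera proteins is compared against a binomial distribution with the baseline probability 0.61, and that the lines mark the counts at which a one-sided tail probability drops below the threshold. The function names, the one-sided construction, the family sizes and the reading of the threshold as 1e-5 are assumptions made for illustration, not the procedure given in the paper's Methods.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Exact probabilities P(X = k), k = 0..n, for X ~ Binomial(n, p)."""
    q = 1 - p
    return [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]


def confidence_band(n, p, alpha):
    """Return (k_low, k_high) for a family of n proteins.

    k_low  is the largest k with P(X <= k) < alpha,
    k_high is the smallest k with P(X >= k) < alpha.
    Either value is None when no count reaches the threshold.
    """
    pmf = binomial_pmf(n, p)

    k_low, cumulative = None, Fraction(0)
    for k in range(n + 1):            # accumulate the lower tail
        cumulative += pmf[k]
        if cumulative >= alpha:
            break
        k_low = k

    k_high, cumulative = None, Fraction(0)
    for k in range(n, -1, -1):        # accumulate the upper tail
        cumulative += pmf[k]
        if cumulative >= alpha:
            break
        k_high = k

    return k_low, k_high


BASELINE = Fraction(61, 100)   # baseline Hymenoptera probability from the caption
ALPHA = Fraction(1, 10**5)     # ASSUMED reading of the caption's threshold

# Hypothetical family sizes, chosen only to show how the band narrows.
for size in (200, 400, 800):
    low, high = confidence_band(size, BASELINE, ALPHA)
    print(size, round(low / size, 3), round(high / size, 3))
```

Expressed as fractions of the family size, the two bounds move toward 0.61 as the family grows, which is why the threshold appears as bent rather than straight lines.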

Mentions: Figure 5a shows the protein partition among the 18 species for a Root SF annotated “Fibrinogen-beta and gamma chains, C-terminal globular domain” (399 proteins). This Root SF is of very high quality (99 % selectivity, 95 % specificity and includes 87 unannotated proteins). We noted a 4:1 ratio in favor of the proteins belonging to Diptera as compared to Hymenoptera (P-value <1.0E-56, Fig. 5a).
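To make the scale of such a P-value concrete, the following sketch computes an exact one-sided binomial tail for a family in which Hymenoptera proteins are strongly under-represented relative to the 0.61 baseline. The counts used (300 Diptera and 75 Hymenoptera proteins) are hypothetical, chosen only to reproduce a 4:1 ratio; the excerpt does not report the per-order counts, and the exact test used by the authors may differ.

```python
from fractions import Fraction
from math import comb


def lower_tail(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p), returned as a float."""
    q = 1 - p
    total = sum(comb(n, i) * p**i * q**(n - i) for i in range(k + 1))
    return float(total)


# Baseline probability of a Hymenoptera protein, as stated in the Fig. 5 caption.
BASELINE_HYMENOPTERA = Fraction(61, 100)

# HYPOTHETICAL counts (not from the paper): a 4:1 Diptera-to-Hymenoptera split.
diptera, hymenoptera = 300, 75

p_value = lower_tail(hymenoptera, diptera + hymenoptera, BASELINE_HYMENOPTERA)
print(f"one-sided P-value: {p_value:.2e}")
```

Exact rational arithmetic is used so that the very small tail probability is summed without rounding error and is converted to a float only at the end.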

