Why genes overlap in viruses.

Chirico N, Vianelli A, Belshaw R - Proc. Biol. Sci. (2010)

Bottom Line: We conclude that gene overlap is unlikely to have evolved as a way of compressing the genome in response to the harmful effect of mutation because RNA viruses, despite having generally higher mutation rates, have less gene overlap on average than DNA viruses of comparable genome length. Our interpretation is that a physical constraint on genome length by the capsid has led to gene overlap evolving as a mechanism for producing more proteins from the same genome length. We consider that these patterns cannot be explained by other factors, namely the possible roles of overlap in transcription regulation, generating more divergent proteins and the relationship between gene length and genome length.


Affiliation: Department of Structural and Functional Biology, University of Insubria, Via JH Dunant 3, 21100 Varese, Italy.

ABSTRACT
The genomes of most virus species have overlapping genes--two or more proteins coded for by the same nucleotide sequence. Several explanations have been proposed for the evolution of this phenomenon, and we test these by comparing the amount of gene overlap in all known virus species. We conclude that gene overlap is unlikely to have evolved as a way of compressing the genome in response to the harmful effect of mutation because RNA viruses, despite having generally higher mutation rates, have less gene overlap on average than DNA viruses of comparable genome length. However, we do find a negative relationship between overlap proportion and genome length among viruses with icosahedral capsids, but not among those with other capsid types that we consider easier to enlarge in size. Our interpretation is that a physical constraint on genome length by the capsid has led to gene overlap evolving as a mechanism for producing more proteins from the same genome length. We consider that these patterns cannot be explained by other factors, namely the possible roles of overlap in transcription regulation, generating more divergent proteins and the relationship between gene length and genome length.


Figure 2. Relationship between overlap proportion (the proportion of the genome that is within an overlap) and total genome length for RNA virus families, both expressed as natural logarithms. Points are means for the following taxa, all of which have at least one well-curated genome and some gene overlap. Open circles are families with icosahedral capsids; closed circles have flexible capsids; crosses are families with indeterminate capsid forms. Linear regression, r² = 0.24, p = 0.003. (1) Arenaviridae (n = 2); (2) Arteriviridae (n = 3); (3) Astroviridae (n = 4); (4) Birnaviridae (n = 3); (5) Bornaviridae (n = 1); (6) Bromoviridae (n = 9); (7) Caliciviridae (n = 9); (8) Caulimoviridae (n = 7); (9) Closteroviridae (n = 7); (10) Comoviridae (n = 7); (11) Coronaviridae (n = 9); (12) Cystoviridae (n = 3); (13) Flaviviridae (n = 26); (14) Flexiviridae (n = 22); (15) Hepadnaviridae (n = 1); (16) Hordeivirus (n = 1); (17) Leviviridae (n = 6); (18) Luteoviridae (n = 8); (19) Nodaviridae (n = 2); (20) Orthomyxoviridae (n = 1); (21) Paramyxoviridae (n = 3); (22) Pecluvirus (n = 1); (23) Potyviridae (n = 60); (24) Reoviridae (n = 16); (25) Retroviridae (n = 11); (26) Schizochytrium single-stranded RNA virus (n = 1); (27) Sclerophthora macrospora virus A (n = 1); (28) Sequiviridae (n = 2); (29) Sobemovirus (n = 9); (30) Tobamovirus (n = 6); (31) Tobravirus (n = 1); (32) Togaviridae (n = 9); (33) Tombusviridae (n = 9); (34) Totiviridae (n = 3); (35) Tymoviridae (n = 5); (36) Umbravirus (n = 2).
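As an illustration of the two quantities in this caption, the following minimal Python sketch computes an overlap proportion from gene coordinates and fits a least-squares line to the natural logarithms of genome length and overlap proportion. It is not the authors' code. The function names, the 1-based inclusive coordinate convention, and the rule that a position counts once when two or more genes cover it are all assumptions made for illustration; the paper may count overlaps differently.

```python
import math


def overlap_proportion(genes, genome_length):
    """Fraction of genome positions covered by two or more genes.

    genes: list of (start, end) pairs, assumed 1-based and inclusive.
    genome_length: total genome length in nucleotides.
    """
    # Turn each gene into a "coverage starts" and a "coverage stops" event.
    events = []
    for start, end in genes:
        events.append((start, 1))      # coverage begins at `start`
        events.append((end + 1, -1))   # coverage ends after `end`
    events.sort()

    overlapped = 0   # positions covered by at least two genes
    depth = 0        # number of genes covering the current stretch
    prev_pos = None
    for pos, delta in events:
        if prev_pos is not None and depth >= 2:
            overlapped += pos - prev_pos
        depth += delta
        prev_pos = pos
    return overlapped / genome_length


def log_log_fit(genome_lengths, overlap_props):
    """Least-squares fit of ln(overlap proportion) on ln(genome length).

    Returns (slope, intercept, r_squared). Every value must be positive,
    so taxa with zero overlap cannot be placed on a log-log plot.
    """
    xs = [math.log(v) for v in genome_lengths]
    ys = [math.log(v) for v in overlap_props]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Made-up coordinates: two genes sharing positions 81..100 of a 200 nt genome.
toy_proportion = overlap_proportion([(1, 100), (81, 150)], 200)

# Made-up, perfectly inverse data, used only to exercise the fit.
toy_slope, toy_intercept, toy_r2 = log_log_fit(
    [1000, 2000, 4000], [0.2, 0.1, 0.05]
)
```

A negative slope in such a fit corresponds to the pattern described in the caption: longer genomes devote a smaller share of their length to overlaps. The log transform requires a positive overlap proportion, which is consistent with the caption restricting the plot to taxa with some gene overlap.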

Mentions: We find that 75 per cent of the approximately 2000 known virus species have at least some gene overlap. The negative relationship between overlap proportion and genome length in RNA viruses (figure 2; linear regression, r² = 0.23, p = 0.006) reported previously (Belshaw et al. 2007) also exists among DNA viruses (figure 3; linear regression, r² = 0.38, p = 0.002). This relationship is found within all the constituent virus groups, e.g. ssDNA and dsDNA viruses (electronic supplementary material, figures S2 and S3), and within the two types of overlap: internal overlaps, where one gene is completely overlapped by a larger second, and terminal overlaps, where two genes overlap for part of their lengths, one upstream and one downstream (electronic supplementary material, figures S4 and S5).
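The distinction between internal and terminal overlaps can be made concrete with a short Python sketch. It is an illustrative reading of the definitions in the passage, not the authors' procedure. The function name `classify_overlap`, the 1-based inclusive coordinates, and the treatment of two genes with identical coordinates as an internal overlap are assumptions.

```python
def classify_overlap(gene_a, gene_b):
    """Classify how two genes overlap.

    Each gene is a (start, end) pair, assumed 1-based and inclusive.
    Returns "internal" if one gene lies entirely within the other,
    "terminal" if they share only part of their lengths,
    and None if they do not overlap at all.
    """
    # Order the genes so that `first` starts no later than `second`.
    first, second = sorted([gene_a, gene_b])
    first_start, first_end = first
    second_start, second_end = second

    if second_start > first_end:
        return None  # the downstream gene starts after the upstream one ends

    # Containment: the second gene ends inside the first, or both start at
    # the same position (sorting then guarantees the first is the shorter).
    if second_end <= first_end or first_start == second_start:
        return "internal"

    # Otherwise the upstream gene's end overlaps the downstream gene's start.
    return "terminal"


# Made-up coordinates for illustration only.
examples = {
    "internal": classify_overlap((1, 300), (100, 200)),
    "terminal": classify_overlap((1, 200), (150, 400)),
    "none": classify_overlap((1, 100), (200, 300)),
}
```

The argument order does not matter, because the genes are sorted by start position before comparison.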

