Genome landscapes and bacteriophage codon usage.

Lucks JB, Nelson DR, Kudla GR, Plotkin JB - PLoS Comput. Biol. (2008)

Bottom Line: We find that 33 phage genomes exhibit highly non-random patterns in their GC3-content, use of host-preferred codons, or both. We show that the head and tail proteins of these phages exhibit significant bias towards host-preferred codons, relative to the non-structural phage proteins. Our results support the hypothesis of translational selection on viral genes for host-preferred codons, over a broad range of bacteriophages.

Affiliation: FAS Center for Systems Biology, Harvard University, Cambridge, Massachusetts, USA.

ABSTRACT
Across all kingdoms of biological life, protein-coding genes exhibit unequal usage of synonymous codons. Although alternative theories abound, translational selection has been accepted as an important mechanism that shapes the patterns of codon usage in prokaryotes and simple eukaryotes. Here we analyze patterns of codon usage across 74 diverse bacteriophages that infect E. coli, P. aeruginosa, and L. lactis as their primary host. We use the concept of a "genome landscape," which helps reveal non-trivial, long-range patterns in codon usage across a genome. We develop a series of randomization tests that allow us to interrogate the significance of one aspect of codon usage, such as GC content, while controlling for another aspect, such as adaptation to host-preferred codons. We find that 33 phage genomes exhibit highly non-random patterns in their GC3-content, use of host-preferred codons, or both. We show that the head and tail proteins of these phages exhibit significant bias towards host-preferred codons, relative to the non-structural phage proteins. Our results support the hypothesis of translational selection on viral genes for host-preferred codons, over a broad range of bacteriophages.

Figure 1 (pcbi-1000001-g001): GC3 and CAI landscapes for lambda phage. Landscapes of the GC3 (left) and CAI (right) measures of codon usage in lambda phage. Only coding sequences are considered, which when concatenated together are 40,773 bp long (see Table 2). The GC3 landscape is the mean-centered cumulative sum of the GC3 content of codons, where a codon ending in G or C scores 1 and a codon ending in A or T scores 0. The CAI landscape is the mean-centered cumulative sum of the log w-value for each codon. In each landscape, a region with an uphill slope corresponds to higher-than-average GC3 or CAI. The horizontal purple band represents the expected amount of variation in a random walk of GC3 or AT3 choices, given by Equation 2. Both landscapes exhibit features far outside the purple bands, indicating that the patterns of codon usage are highly non-random. Gene boundaries are represented by the bars in the histograms below each landscape. The height of each bar indicates the GC3 or CAI value of the corresponding gene.
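The landscape construction described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the toy sequence, and the w-values in `toy_w` are all made up for the example (real w-values would come from the host's codon usage), and the purple random-walk band from Equation 2 is not reproduced here.

```python
import math
from itertools import accumulate


def codons(cds):
    """Split a coding sequence into complete codons, dropping any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def landscape(scores):
    """Mean-centered cumulative sum of per-codon scores."""
    mean = sum(scores) / len(scores)
    return list(accumulate(s - mean for s in scores))


def gc3_scores(cds):
    """Score 1 for codons ending in G or C, 0 for codons ending in A or T."""
    return [1.0 if codon[2] in "GC" else 0.0 for codon in codons(cds)]


def log_w_scores(cds, w):
    """Log of the relative adaptiveness (w-value) of each codon."""
    return [math.log(w[codon]) for codon in codons(cds)]


# Toy coding sequence (hypothetical, not from lambda phage).
toy_cds = "ATGGCCGCGAAATTTGAA"

# Hypothetical w-values chosen only to make the example run.
toy_w = {"ATG": 1.0, "GCC": 0.5, "GCG": 1.0, "AAA": 1.0, "TTT": 0.5, "GAA": 1.0}

gc3_landscape = landscape(gc3_scores(toy_cds))
cai_landscape = landscape(log_w_scores(toy_cds, toy_w))

print(gc3_landscape)  # rises over the GC3-rich first half, then falls back to zero
print(cai_landscape)
```

Because each score is mean-centered before summing, every landscape returns to zero at its last codon; an uphill stretch marks a run of codons scoring above the genome-wide mean.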

Mentions: Traditional visualizations of GC3 content involve moving window averages of %GC3 over the genome [32]. To compare these techniques with the landscape approach, we focus on the E. coli phage lambda as an illustrative example. Figure 1A shows the lambda phage GC3 landscape above its associated “GC3 histogram”. The histogram shows the GC3 content of each gene, and the width of each histogram bar reflects the length of the corresponding gene. Thus, the gene-by-gene histograms mimic a sliding window average view of nucleotide content across the genome, but focus on the contributions of individual genes to these sequence properties. Figure 1A reveals a striking pattern of lambda phage codon usage: the genome is apparently divided into two halves that differ significantly in GC3 content [33],[34]. The large region of uphill slope on the left half of the GC3 landscape reflects the fact that the majority of the genes in this region contain an excess of codons that end in G or C. This trend is also reflected in the GC3 histogram bars, which are higher than average in the left half of the genome (Figure 1).
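The gene-by-gene "GC3 histogram" described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the gene names and sequences are invented, bar positions and widths are measured in codons along the concatenated coding sequence, and `gene_histogram` is a hypothetical helper rather than anything taken from the paper.

```python
def gc3_fraction(cds):
    """Fraction of complete codons in a coding sequence that end in G or C."""
    thirds = cds[2::3]  # third position of every complete codon
    return sum(base in "GC" for base in thirds) / len(thirds)


def gene_histogram(genes):
    """Return one bar per gene: (name, start codon, width in codons, GC3 fraction).

    Genes are laid end to end in the order given, so each bar's width
    reflects the length of its gene.
    """
    bars = []
    start = 0
    for name, cds in genes:
        width = len(cds) // 3
        bars.append((name, start, width, gc3_fraction(cds)))
        start += width
    return bars


# Hypothetical two-gene "genome": one GC3-rich gene followed by one GC3-poor gene.
toy_genes = [
    ("geneA", "ATGGCCGCGGGC"),
    ("geneB", "ATGAAATTTGAA"),
]

bars = gene_histogram(toy_genes)

# Genome-wide GC3, weighted by gene length, for comparison with each bar.
total_codons = sum(width for _, _, width, _ in bars)
mean_gc3 = sum(width * gc3 for _, _, width, gc3 in bars) / total_codons

for name, start, width, gc3 in bars:
    side = "above" if gc3 > mean_gc3 else "below"
    print(f"{name}: codons {start}-{start + width - 1}, GC3 = {gc3:.2f} ({side} average)")
```

A gene whose bar sits above the genome-wide mean contributes an uphill segment to the GC3 landscape, which is why a run of GC3-rich genes shows up as one long uphill region.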

