In silico analysis of 3'-end-processing signals in Aspergillus oryzae using expressed sequence tags and genomic sequencing data.

Tanaka M, Sakai Y, Yamada O, Shintani T, Gomi K - DNA Res. (2011)

Bottom Line: The average 3' UTR length in A. oryzae was 241 nt, which is greater than that in yeast but similar to that in plants. The most frequently found hexanucleotide in this A-rich region is AAUGAA, although this sequence accounts for only 6% of all transcripts. Although these putative 3'-end-processing signals are similar to those in yeast and plants, some notable differences exist between them.


Affiliation: Laboratory of Bioindustrial Genomics, Department of Bioindustrial Informatics and Genomics, Graduate School of Agricultural Science, Tohoku University, 1-1 Tsutsumidori-Amamiyamachi, Aoba-ku, Sendai 981-8555, Japan.

ABSTRACT
To investigate 3'-end-processing signals in Aspergillus oryzae, we created a nucleotide sequence data set of the 3'-untranslated region (3' UTR) plus the 100 nucleotides (nt) downstream of the poly(A) site using A. oryzae expressed sequence tags and genomic sequencing data. This data set comprised 1065 sequences derived from 1042 unique genes. The average 3' UTR length in A. oryzae was 241 nt, which is greater than that in yeast but similar to that in plants. The 3' UTR and the 100 nt sequence downstream of the poly(A) site are notably U-rich, while the region located 15-30 nt upstream of the poly(A) site is markedly A-rich. The most frequently found hexanucleotide in this A-rich region is AAUGAA, although this sequence accounts for only 6% of all transcripts. These data suggested that A. oryzae has no highly conserved sequence element equivalent to AAUAAA, the mammalian polyadenylation signal. We found that the putative 3'-end-processing signals in A. oryzae, while less well conserved than those in mammals, comprise four sequence elements: the furthest upstream U-rich element, an A-rich sequence, the cleavage site, and a downstream U-rich element flanking the cleavage site. Although these putative 3'-end-processing signals are similar to those in yeast and plants, some notable differences exist between them.
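The positional base bias described in the abstract (an A-rich stretch 15-30 nt upstream of the poly(A) site) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `window_base_fraction` and the input layout (each string is an upstream sequence whose last character sits at position −1, immediately before the poly(A) site) are assumptions made for illustration.

```python
def window_base_fraction(upstream_seqs, start, end, base="A"):
    """Fraction of `base` among all nucleotides at positions start..end.

    Positions are negative and relative to the poly(A) site, so the last
    character of each sequence is position -1 (assumed input layout).
    Sequences too short to reach the window contribute only the part
    of the window they cover. Returns 0.0 if nothing covers the window.
    """
    hits = 0
    total = 0
    for seq in upstream_seqs:
        n = len(seq)
        lo = max(n + start, 0)      # index of position `start`, clamped
        hi = max(n + end + 1, 0)    # one past the index of position `end`
        window = seq[lo:hi]
        hits += window.count(base)
        total += len(window)
    return hits / total if total else 0.0


# Toy sequence (not real data): 20 U, then 16 A, then 14 U.
toy = ["U" * 20 + "A" * 16 + "U" * 14]
a_rich = window_base_fraction(toy, -30, -15, base="A")
flank = window_base_fraction(toy, -14, -1, base="A")
```

Calling the function once per position (with `start == end`) would give a per-position composition profile of the kind one would plot to see where the A-rich and U-rich regions lie.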


Representative hexanucleotide signals in the poly(A) signal region (from −40 to −1 nt).

Mentions: To identify 3′-end-processing elements, we searched for the tetramer to heptamer nucleotide sequences that appeared most frequently in each signal element region (Table 1; the top 50 list is available in Supplementary Table S4). In region II, which corresponds to the region containing the polyadenylation signal in mammals, no significantly conserved hexanucleotide sequence was observed, as is also the case in yeast and plants. The top-ranked hexanucleotide in region II was AAUGAA. The top two pentanucleotides (AAUGA and AUGAA) were partial sequences of AAUGAA, and all of the top three heptanucleotides contained the AAUGAA sequence (Table 1). In addition, a standard score (Z-score), which measures how many standard deviations the observed occurrence of a hexanucleotide lies from its expected occurrence, was calculated under zeroth- and first-order Markov chain models and showed that AAUGAA was the most over-represented hexanucleotide sequence in region II (Table 2). These results suggested that AAUGAA is the predominant hexanucleotide sequence in region II, although it accounted for only 6% of all transcripts (64 of 1043). In contrast, according to the order of Z-scores, the AAUAAA sequence was not the major hexanucleotide sequence in region II, although it ranked third in the list of hexanucleotides. This was also demonstrated by plotting the distribution of hexanucleotide sequences, including AAUGAA and AAUAAA, in the region ranging from −40 to −1 nt (Fig. 4). The AAUGAA sequence is a single-nucleotide variant of AAUAAA, but no study has reported that AAUGAA is the most effective A-rich sequence for the 3′-end-processing element in any eukaryote. Interestingly, point mutation of AAUAAA to AAUGAA significantly reduces polyadenylation efficiency in in vitro 3′-end-processing reactions using nuclear extracts from Xenopus and mammalian cells [18,40]. Thus, the 3′-end-processing machinery in A. oryzae may be somewhat different from that in higher eukaryotes.
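The over-representation test described above can be sketched as follows. The text does not give the exact formula used, so this is only one common formulation, stated under assumptions: background probabilities are estimated from the input sequences themselves, the expected count is the number of k-mer positions times the model probability, and the standard deviation uses a simple binomial approximation that ignores overlap between neighbouring k-mers. The function names are hypothetical, and the input sequences are assumed to be already cropped to the region of interest.

```python
from collections import Counter
from math import prod, sqrt


def count_kmers(seqs, k):
    """Count every overlapping k-mer across all sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def kmer_probability(word, mono, di=None):
    """Probability of `word` under a zeroth-order model (base frequencies
    only) or, if dinucleotide counts are given, a first-order Markov chain."""
    total = sum(mono.values())
    if di is None:
        return prod(mono[b] / total for b in word)
    p = mono[word[0]] / total
    for a, b in zip(word, word[1:]):
        # Transition probability P(b | a) estimated from dinucleotide counts.
        row = sum(c for pair, c in di.items() if pair[0] == a)
        p *= di[a + b] / row if row else 0.0
    return p


def z_scores(seqs, k=6, order=0):
    """Z-score of each observed k-mer: (observed - expected) / sd,
    with sd from a binomial approximation (an assumption of this sketch)."""
    observed = count_kmers(seqs, k)
    mono = count_kmers(seqs, 1)
    di = count_kmers(seqs, 2) if order == 1 else None
    n = sum(observed.values())  # total number of k-mer positions
    scores = {}
    for word, obs in observed.items():
        p = kmer_probability(word, mono, di)
        sd = sqrt(n * p * (1 - p))
        if sd > 0:
            scores[word] = (obs - n * p) / sd
    return scores


# Toy input (not real data): AAUGAA embedded once in each sequence.
toy = ["UUAAUGAAUU", "CCAAUGAACC", "GGAAUGAAGG"]
z0 = z_scores(toy, k=6, order=0)
z1 = z_scores(toy, k=6, order=1)
```

Sorting `z0.items()` or `z1.items()` by score in descending order gives a ranking analogous to the Z-score ordering discussed in the text. The first-order model is the stricter of the two because it already accounts for dinucleotide bias, so a hexanucleotide that ranks highly under both models is less likely to be an artefact of base composition alone.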

