Identification of novel exons and transcribed regions by chimpanzee transcriptome sequencing.

Wetterbom A, Ameur A, Feuk L, Gyllensten U, Cavelier L - Genome Biol. (2010)

Bottom Line: Using stringent criteria for transcription, we identify 12,843 expressed genes, with a majority being found in both tissues. A putative novel multi-exon gene of the ATP-cassette transporter family does not appear to be functional in human since one exon is absent from the human genome. Our results extend the chimpanzee gene catalogue with a large number of novel exons and 3' UTRs and thus support the view that mammalian gene annotations are not yet complete.

Affiliation: Department of Genetics and Pathology, Rudbeck laboratory, Uppsala University, SE-751 85 Uppsala, Sweden.

ABSTRACT

Background: We profile the chimpanzee transcriptome by using deep sequencing of cDNA from brain and liver, aiming to quantify expression of known genes and to identify novel transcribed regions.

Results: Using stringent criteria for transcription, we identify 12,843 expressed genes, with a majority being found in both tissues. We further identify 9,826 novel transcribed regions that do not overlap annotated exons, mRNAs or ESTs. Over 80% of the novel transcribed regions map within or in the vicinity of known genes, and by combining sequencing data with de novo splice predictions we predict several of the novel transcribed regions to be new exons or 3' UTRs. For approximately 350 novel transcribed regions, the corresponding DNA sequence is absent in the human reference genome. The presence of novel transcribed regions in five genes and in one intergenic region is further validated with RT-PCR. Finally, we describe and experimentally validate a putative novel multi-exon gene that belongs to the ATP-cassette transporter gene family. This gene does not appear to be functional in human since one exon is absent from the human genome. In addition to novel exons and UTRs, novel transcribed regions may also stem from different types of noncoding transcripts. We note that expressed repeats and introns from unspliced mRNAs are especially common in our data.

Conclusions: Our results extend the chimpanzee gene catalogue with a large number of novel exons and 3' UTRs and thus support the view that mammalian gene annotations are not yet complete.

Figure 1: Workflow for the bioinformatics analyses. Sequence reads were mapped to the reference genome (panTro2), a coverage signal was calculated across the genome and a threshold for expression was established. The threshold was initially used to determine expression of RefSeq genes and later for de novo detection of transcribed regions (TRs). TRs with no previous annotations were considered to be novel and further characterized. De novo prediction of splice junctions was performed to join novel TRs with each other and with existing gene models.
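
The workflow in Figure 1 can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the half-open interval convention, the `min_length` parameter and the toy threshold are all assumptions made for illustration. It computes a per-base coverage signal from mapped read intervals, calls TRs as runs of positions at or above a threshold, and keeps as novel those TRs that overlap no annotated interval.

```python
from typing import List, Tuple

# Hypothetical convention: half-open [start, end) coordinates on one chromosome.
Interval = Tuple[int, int]


def coverage_signal(reads: List[Interval], length: int) -> List[int]:
    """Per-base read depth, computed with a difference array."""
    diff = [0] * (length + 1)
    for start, end in reads:
        # Clip each read to the chromosome bounds before counting it.
        diff[min(max(start, 0), length)] += 1
        diff[min(max(end, 0), length)] -= 1
    depth, running = [], 0
    for delta in diff[:length]:
        running += delta
        depth.append(running)
    return depth


def call_transcribed_regions(depth: List[int], threshold: int,
                             min_length: int = 1) -> List[Interval]:
    """Maximal runs of positions whose depth is >= threshold."""
    regions, start = [], None
    for pos, d in enumerate(depth):
        if d >= threshold:
            if start is None:
                start = pos
        elif start is not None:
            if pos - start >= min_length:
                regions.append((start, pos))
            start = None
    # Close a run that extends to the end of the chromosome.
    if start is not None and len(depth) - start >= min_length:
        regions.append((start, len(depth)))
    return regions


def novel_regions(regions: List[Interval],
                  annotations: List[Interval]) -> List[Interval]:
    """Keep regions that overlap no annotated interval (exon, mRNA or EST)."""
    return [
        (r_start, r_end)
        for r_start, r_end in regions
        if not any(r_start < a_end and a_start < r_end
                   for a_start, a_end in annotations)
    ]


if __name__ == "__main__":
    reads = [(2, 8), (4, 10), (5, 9), (20, 25)]  # toy data, not from the study
    depth = coverage_signal(reads, length=30)
    trs = call_transcribed_regions(depth, threshold=2)
    print(trs)
    print(novel_regions(trs, annotations=[(0, 3)]))
```

The quadratic overlap check is adequate for a toy example; a genome-scale analysis would normally sort the intervals or use an interval tree.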

Mentions: Samples from frontal cortex and liver tissue were obtained from two young chimpanzees, one male and one female. We generated one cDNA library per tissue and individual and sequenced the fragments using the SOLiD platform. For the female chimpanzee, both 35-bp and 50-bp reads were generated (samples denoted brainF 35 bp, brainF 50 bp, liverF 35 bp and liverF 50 bp) whereas for the male only 35-bp reads were sequenced (samples denoted brainM 35 bp and liverM 35 bp). The sequencing reactions generated between 38 and 170 million reads, of which more than 40% mapped uniquely to the chimpanzee reference genome (panTro2; Table 1) when allowing for up to three mismatches for the 35-bp reads and up to four mismatches for the 50-bp reads. The subsequent analyses were performed to characterize the transcriptome repertoire, both by quantifying the expression level of known genes and by identifying novel transcripts (see outline in Figure 1).
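
The mismatch limits described above can be sketched as a small read filter. The text does not spell out how unique mapping was determined, so the rule used here (a read counts as uniquely mapped when exactly one candidate alignment falls within the mismatch limit for its length) is an assumption, as are the function names and the toy input.

```python
from typing import Dict, List, Tuple

# Mismatch limits stated in the text: 3 for 35-bp reads, 4 for 50-bp reads.
MAX_MISMATCHES: Dict[int, int] = {35: 3, 50: 4}


def is_uniquely_mapped(read_length: int, hit_mismatches: List[int]) -> bool:
    """Assumed rule: exactly one candidate alignment is within the limit.

    hit_mismatches holds the mismatch count of every candidate alignment
    reported for one read.
    """
    limit = MAX_MISMATCHES[read_length]
    accepted = [m for m in hit_mismatches if m <= limit]
    return len(accepted) == 1


def unique_fraction(reads: List[Tuple[int, List[int]]]) -> float:
    """Fraction of reads that pass the uniqueness rule above."""
    if not reads:
        return 0.0
    unique = sum(is_uniquely_mapped(length, hits) for length, hits in reads)
    return unique / len(reads)


if __name__ == "__main__":
    # Toy reads as (read length, mismatch counts of candidate alignments).
    toy = [(35, [1]), (35, [0, 2]), (50, [5]), (50, [4])]
    print(unique_fraction(toy))
```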

