Identifying and characterising key alternative splicing events in Drosophila development.

Lees JG, Ranea JA, Orengo CA - BMC Genomics (2015)

Bottom Line: We have identified a subset of protein isoforms which appear to have high functional significance, particularly in regulation. The methods and analyses we present here represent important first steps in the development of tools to address the near-complete lack of isoform-specific function annotation. In turn, the tools allow us to characterise the regulatory functions of alternative splicing in more detail.

Affiliation: Institute of Structural and Molecular Biology, Division of Biosciences, University College London, Gower Street, London, WC1E 6BT, UK. ucbcjle@live.ucl.ac.uk.

ABSTRACT

Background: In complex metazoans, a given gene frequently codes for multiple protein isoforms through processes such as alternative splicing. Large-scale functional annotation of these isoforms is a key challenge for functional genomics. This annotation gap is widening as technologies such as RNA-Seq identify large numbers of multi-transcript genes. Furthermore, attempts to characterise the functions of splicing in an organism are complicated by the difficulty of distinguishing functional isoforms from those produced by splicing errors or transcriptional noise. Tools to help prioritise candidate isoforms for testing are largely absent.

Results: In this study we implement a Time-course Switch (TS) score that ranks isoforms by their likelihood of producing additional functions, based on their developmental expression profiles as reported by modENCODE. The TS score allows us to better investigate the functional roles of the different isoforms expressed from multi-transcript genes. From this analysis, we find that isoforms with high TS scores have sequence feature changes consistent with more deterministic splicing and functional changes, and tend to gain domains or whole exons which could carry additional functions. Furthermore, these functions appear to be particularly important for essential regulatory roles, establishing functional isoform switching as key for regulatory processes. Based on the TS score, we develop a Transcript Annotations Pipeline for Alternative Splicing (TAPAS) that identifies functional neighbourhoods of potentially interesting isoforms.

Conclusions: We have identified a subset of protein isoforms which appear to have high functional significance, particularly in regulation. This has been made possible through the development of novel methods that make use of transcript expression profiles. The methods and analyses we present here represent important first steps in the development of tools to address the near-complete lack of isoform-specific function annotation. In turn, the tools allow us to characterise the regulatory functions of alternative splicing in more detail.


Fig. 3: Domain switching example. Example of a domain (functional) gain on switching from primary to secondary isoforms of a High-TS gene. A structural representative of the gained domain family is displayed (CATH database ID: 3.30.1370.50). Expression profiles of the transcripts are shown (see Fig. 1b for explanation of axes).

Mentions: Because of their short sequences, ELMs cannot be assigned with high confidence. Domains, however, can be assigned with much higher confidence, and the mapping of protein domain family to function is well established. Hence, we can compare the domain contents of the primary and secondary isoforms to predict functional changes. To obtain high coverage of domain assignments for the protein sequences encoded by the isoforms, we used the extensive domain annotations in Gene3D [29], a resource that integrates domain assignments from CATH [17], Pfam-A (and Pfam-B) [30] and Superfamily [31]. We found that High-TS genes were significantly more likely to have an additional domain family in their secondary protein isoform that is not present in the primary isoform (Fisher’s exact test, p-value < 0.01). For example, in the gene Protein kinase C δ, the maximally expressed isoform switches at the end of embryogenesis and at the WPP stage to a shorter protein isoform that nevertheless contains an extra domain family (CATH superfamily: 3.30.60.20, C1 domain) (Fig. 3). This domain is thought to be important for regulating the kinase through ligand binding [32]. Unlike previous studies, which considered all minor isoforms, analyses using our FunFam pipeline (described in Methods) did not detect any significantly enriched functions associated with the domains being gained (after correcting for multiple testing using the Benjamini-Hochberg method).
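
The domain-gain comparison described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the one-sided form of Fisher's exact test, and all counts and labels in the usage section are assumptions made for illustration. The only value taken from the text is the C1 domain superfamily ID (3.30.60.20). The sketch shows three steps: finding domain families gained by a secondary isoform, testing whether gains are enriched in one gene group with a 2x2 Fisher's exact test, and adjusting a list of p-values with the Benjamini-Hochberg procedure.

```python
from math import comb


def gained_domains(primary, secondary):
    """Domain families present in the secondary isoform but not in the primary."""
    return set(secondary) - set(primary)


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with fixed margins,
    i.e. the probability of seeing at least `a` gains in the first group.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical isoform annotations; "FAMILY_A" is a placeholder label.
    primary = {"FAMILY_A"}
    secondary = {"FAMILY_A", "3.30.60.20"}
    print(gained_domains(primary, secondary))

    # Hypothetical counts: rows are High-TS vs other genes,
    # columns are genes with vs without a gained domain family.
    print(fisher_one_sided(30, 70, 15, 85))

    # Hypothetical raw p-values from several enrichment tests.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Representing each isoform as a set of domain-family identifiers keeps the gain check to a single set difference, and the pure-Python Fisher test avoids any dependency beyond the standard library; a statistics package would be the more usual choice for real data.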

