CoSREM: a graph mining algorithm for the discovery of combinatorial splicing regulatory elements.

Badr E, Heath LS - BMC Bioinformatics (2015)

Bottom Line: Our model does not assume a fixed length of SREs and incorporates experimental evidence as well to increase accuracy. We show that our results intersect with previous results, including some that are experimental. Our approach opens new directions to study SREs and the roles that AS may play in diseases and tissue specificity.

Affiliation: Department of Computer Science, Virginia Tech, Blacksburg, Virginia, USA.

ABSTRACT

Background: Alternative splicing (AS) is a post-transcriptional regulatory mechanism for gene expression regulation. Splicing decisions are affected by the combinatorial behavior of different splicing factors that bind to multiple binding sites in exons and introns. These binding sites are called splicing regulatory elements (SREs). Here we develop CoSREM (Combinatorial SRE Miner), a graph mining algorithm to discover combinatorial SREs in human exons. Our model does not assume a fixed length of SREs and incorporates experimental evidence as well to increase accuracy. CoSREM is able to identify sets of SREs and is not limited to SRE pairs as are current approaches.

Results: We identified 37 SRE sets that include both enhancer and silencer elements. We show that our results intersect with previous results, including some that are experimental. We also show that the SRE set GGGAGG and GAGGAC identified by CoSREM may play a role in exon skipping events in several tumor samples. We applied CoSREM to RNA-Seq data for multiple tissues to identify combinatorial SREs which may be responsible for exon inclusion or exclusion across tissues.

Conclusion: The new algorithm can identify different combinations of splicing enhancers and silencers without assuming a predefined size or limiting the algorithm to find only pairs of SREs. Our approach opens new directions to study SREs and the roles that AS may play in diseases and tissue specificity.


Fig. 10: The number of generated MCSs and MCS sets using different values of α and θ

Mentions: Another aspect of CoSREM's flexibility is that the thresholds α and θ are user-defined. We tried several values for both. As illustrated in Fig. 10, as α increases, the number of potential SREs decreases, while the number of MCS collections first increases and then decreases. This behavior can be explained as follows: α is the minimum number of exons in which an SRE must reside, so as α increases, fewer SREs satisfy the constraint and longer k-mer SREs are eliminated. However, because we set the θ threshold to a relatively small number (θ=100), some of these longer k-mers are recovered as co-occurring groups, which is why the number of combinatorial SREs initially increases. Eventually, as the number of resulting SREs keeps decreasing, the number of resulting MCS collections decreases as well. We chose α=1000 to start with a reasonable number of common exons between 6-mers. Another reason is running time, as shown in Fig. 11. The θ threshold eliminates only the groups with smaller exon sets, which is why we chose a relatively small θ: it retains all results for further filtering. We also ran CoSREM with α=500, which resulted in 11 combinatorial SRE groups. These groups were a subset of our previous results with α=1000.
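The roles of the two thresholds can be illustrated with a small sketch. This is not the CoSREM implementation: the function names `filter_by_alpha` and `pairs_by_theta` are hypothetical, the exon identifiers are toy data, and for brevity the sketch forms only pairs of k-mers, whereas CoSREM is not limited to pairs. It assumes each candidate k-mer is already mapped to the set of exons that contain it.

```python
from itertools import combinations


def filter_by_alpha(kmer_exons, alpha):
    """Keep only k-mers that occur in at least `alpha` exons."""
    return {k: exons for k, exons in kmer_exons.items() if len(exons) >= alpha}


def pairs_by_theta(kmer_exons, theta):
    """Return k-mer pairs whose shared exon set has at least `theta` exons."""
    groups = {}
    for a, b in combinations(sorted(kmer_exons), 2):
        shared = kmer_exons[a] & kmer_exons[b]
        if len(shared) >= theta:
            groups[(a, b)] = shared
    return groups


# Toy data (assumption): exon IDs are invented for illustration only.
kmer_exons = {
    "GGGAGG": {1, 2, 3, 4},
    "GAGGAC": {2, 3, 4, 5},
    "TTTTTT": {1},
}

kept = filter_by_alpha(kmer_exons, alpha=3)   # drops the k-mer seen in one exon
groups = pairs_by_theta(kept, theta=3)        # keeps pairs sharing >= 3 exons

print(sorted(kept))
print(groups)
```

In this toy setting, raising `alpha` shrinks the pool of candidate k-mers, while `theta` only prunes groups whose shared exon set is small, which mirrors why a small θ leaves more groups available for later filtering.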

