Visual ModuleOrganizer: a graphical interface for the detection and comparative analysis of repeat DNA modules.

Tempel S, Talla E - Mob DNA (2014)



Affiliation: Aix-Marseille Université, CNRS, LCB, UMR 7283, 13009 Marseille, France.

ABSTRACT

Background: DNA repeats, such as transposable elements, minisatellites and palindromic sequences, are abundant in genomic sequences and have been shown to play significant functional roles in the evolution of host genomes. In a previous study, we introduced the concept of a repeat DNA module, a flexible motif that occurs at least twice in a set of sequences. This concept was implemented in ModuleOrganizer, a tool that detects repeat modules in a set of sequences. However, the original implementation remains difficult to use on larger sequences.
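To make the "at least two occurrences" criterion concrete, the following is a minimal sketch that indexes exact k-mers across a set of sequences and keeps those found at least twice. It is a deliberate simplification: ModuleOrganizer modules are flexible motifs, whereas this example only finds exact repeats. The function name `find_exact_repeats` and the toy sequences are assumptions made for illustration and do not come from the tool.

```python
from collections import defaultdict


def find_exact_repeats(sequences, k):
    """Map each k-mer that occurs at least twice to its (sequence name, offset) hits.

    Hypothetical helper: exact matching only, unlike the flexible
    modules detected by ModuleOrganizer.
    """
    hits = defaultdict(list)
    for name, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            hits[seq[i:i + k]].append((name, i))
    # Keep only k-mers with two or more occurrences in the whole set.
    return {kmer: locs for kmer, locs in hits.items() if len(locs) >= 2}


# Invented toy input, not data from the study.
toy = {"seq1": "ACGTACGTTT", "seq2": "GGACGTAA"}
repeats = find_exact_repeats(toy, 5)
print(repeats)
```

Here `k` plays a role loosely comparable to a minimal module size: smaller values report more, shorter repeats.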

Results: Here we present Visual ModuleOrganizer, a Java graphical interface that provides access to a new, optimized version of the ModuleOrganizer tool. This version was recoded in C++ with compressed suffix tree data structures. As a result, memory usage falls at least 120-fold on average and computation time falls at least fourfold during module detection in large sequences. The Visual ModuleOrganizer interface lets users easily choose ModuleOrganizer parameters and display the results graphically. Moreover, Visual ModuleOrganizer dynamically handles graphical results through four main parameters: gene annotations, overlap of modules with known annotations, presence of the module in a minimal number of sequences, and minimal module length. As a case study, the analysis of FoldBack4 sequences demonstrated that our tools can be extended to comparative and evolutionary analyses of any repeat sequence elements in a set of genomic sequences. With the increasing number of sequences available in public databases, it is now possible to perform comparative analyses of repeated DNA modules in a graphical, user-friendly manner within a reasonable time.
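As a rough illustration of why suffix-based indexes suit repeat detection, the sketch below builds a plain suffix array and reads the longest repeated substring from the common prefixes of neighbouring suffixes. This is not the compressed suffix tree used in the C++ version of ModuleOrganizer: the construction here is naive and memory-hungry, and the function name `longest_repeated_substring` is an assumption for this example.

```python
def longest_repeated_substring(text):
    """Return the longest substring occurring at least twice (overlaps allowed).

    Naive suffix array: sorting suffixes places repeated substrings next to
    each other, so only adjacent suffixes need to be compared.
    """
    suffixes = sorted(range(len(text)), key=lambda i: text[i:])
    best = ""
    for a, b in zip(suffixes, suffixes[1:]):
        # Length of the common prefix of two neighbouring suffixes.
        n = 0
        while a + n < len(text) and b + n < len(text) and text[a + n] == text[b + n]:
            n += 1
        if n > len(best):
            best = text[a:a + n]
    return best


print(longest_repeated_substring("ACGTTACGTA"))
```

A compressed suffix tree answers the same kind of query while storing far less than one explicit suffix per position, which is consistent with the memory savings reported above.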

Availability: Visual ModuleOrganizer interface and the new version of the ModuleOrganizer tool are freely available at: http://lcb.cnrs-mrs.fr/spip.php?rubrique313.




Figure 3: Identification and comparative analysis of repeat DNA modules in FoldBack4 sequences using Visual ModuleOrganizer. With ten FoldBack4 sequences, a MinSizeModule of 25 bp, and the ‘Palindromic modules’ and ‘Truncated modules’ options selected, the ModuleOrganizer algorithm detects 23 modules. Graphical displays of the results: (A) default graphical options, (B) ‘Draw Modules present in at least M Sequences’ slider set to 9 and (C) ‘Draw Modules by Size’ slider set to 58 bp.
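The two sliders in panels (B) and (C) amount to filtering the detected modules by sequence count and by length. The sketch below shows one way such a filter could look. The `Module` record, its field names, and the toy values are hypothetical and do not reflect the interface's actual data model or the modules shown in the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Hypothetical record for one detected module."""
    module_id: int
    length: int            # module size in bp
    sequences: frozenset   # names of the sequences that contain the module


def visible_modules(modules, min_sequences=1, min_length=0):
    """Keep modules present in at least min_sequences sequences and at least min_length bp long."""
    return [m for m in modules
            if len(m.sequences) >= min_sequences and m.length >= min_length]


# Invented toy modules, not the 23 modules of the case study.
toy_modules = [
    Module(1, 60, frozenset({"s1", "s2", "s3"})),
    Module(2, 30, frozenset({"s1", "s2"})),
    Module(3, 80, frozenset({"s3"})),
]

print([m.module_id for m in visible_modules(toy_modules, min_sequences=2)])
print([m.module_id for m in visible_modules(toy_modules, min_length=58)])
```

Applying both thresholds at once corresponds to moving both sliders; only modules that pass both tests remain drawn.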

Mentions: For this study, 10 FB4 sequence elements ranging from 627 to 2266 bp were chosen. These elements are generally highly variable in their internal sequence, including numerous insertions, deletions, and repetitions, but all copies share consensus palindromic extremities because these are necessary for transposition [28]. With MinSizeModule set to 25 bp and the ‘Palindromic modules’ and ‘Truncated modules’ options selected, the ModuleOrganizer algorithm discovered 23 modules (Figure 3A). Palindromic structures of the FB4 sequences are described by modules 1-5, which should correspond to the Terminal Inverted Repeats (TIRs). Internal sequences are mainly composed of modules 8-10, which are repeated in tandem and resemble minisatellites. Such repeats are often present in the internal sequence of non-autonomous transposable elements [1,28]. According to the module composition, the UPGMA-based tree clusters the FB4 sequences into 4 distinct groups: Group1 = FB4_3, FB4_8, and FB4_4; Group2 = FB4_1, FB4_9, and FB4_5; Group3 = FB4_10 and FB4_11; Group4 = FB4_2 and FB4_7, allowing inter- and intra-group comparison of the detected modules. Indeed, the reverse occurrences of modules 3 and 4 were deleted in FB4_2 and FB4_7 (Group4), and reverse modules 2-5 were absent in FB4_10 and FB4_11 (Group3). These findings suggest that partial deletions of these palindromic structures would impair the transposition of these FB4 sequences.
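Clustering sequences by module composition can be sketched as UPGMA (average-linkage clustering) over a distance between module sets. The example below assumes a Jaccard distance on presence/absence profiles; the source does not state which distance the tool uses, so that choice, the function names, and the toy profiles are all assumptions for illustration.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of module ids (assumed distance)."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def upgma(profiles):
    """Cluster {name: set of module ids} and return the tree as nested tuples."""
    d = {}
    for a, b in combinations(profiles, 2):
        d[a, b] = d[b, a] = jaccard_distance(profiles[a], profiles[b])
    trees = {name: 1 for name in profiles}  # cluster -> number of leaves
    while len(trees) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(trees, 2), key=lambda pair: d[pair])
        na, nb = trees.pop(a), trees.pop(b)
        merged = (a, b)
        for c in trees:
            # UPGMA update: size-weighted average of the two merged clusters.
            avg = (na * d[a, c] + nb * d[b, c]) / (na + nb)
            d[merged, c] = d[c, merged] = avg
        trees[merged] = na + nb
    return next(iter(trees))


# Invented profiles, not the FB4 module composition.
toy_profiles = {"A": {1, 2, 3}, "B": {1, 2, 3, 4}, "C": {7, 8}, "D": {7, 8, 9}}
print(upgma(toy_profiles))
```

Sequences that share most modules are merged first, so a group that has lost the same modules tends to end up in the same subtree. For larger inputs, SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"` applies the same average-linkage criterion.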

