PAVE: program for assembling and viewing ESTs.

Soderlund C, Johnson E, Bomhoff M, Descour A - BMC Genomics (2009)

Bottom Line: The new 454 technology has the benefit of high-throughput expression profiling, but introduces time and space problems for assembling large contigs. A Java viewer program is provided for display and analysis of the results. The assembly software, data management software, Java viewer and user's guide are freely available.


Affiliation: BIO5 Institute, University of Arizona, Tucson, AZ 85721, USA. cari@agcol.arizona.edu

ABSTRACT

Background: New sequencing technologies are rapidly emerging. Many laboratories are simultaneously working with traditional Sanger ESTs and experimenting with ESTs generated by 454 Life Sciences sequencers. Though Sanger ESTs have been used to generate contigs for many years, no program takes full advantage of the 5' and 3' mate-pair information; hence, many tentative transcripts are assembled into two separate contigs. The new 454 technology has the benefit of high-throughput expression profiling, but introduces time and space problems for assembling large contigs.

Results: The PAVE (Program for Assembling and Viewing ESTs) assembler takes advantage of the 5' and 3' mate-pair information by requiring that the mate-pairs be assembled into the same contig and joined by n's if the two sub-contigs do not overlap. It handles the depth of 454 data sets by "burying" similar ESTs during assembly, which retains the expression level information while circumventing time and space problems. PAVE uses MegaBLAST for the clustering step and CAP3 for assembly; however, it assembles incrementally to enforce the mate-pair constraint, bury ESTs, and reduce incorrect joins and splits. The PAVE data management system uses a MySQL database to store multiple libraries of ESTs along with their metadata; the management system allows multiple assemblies with variations on libraries and parameters. Analysis routines provide standard annotation for the contigs, including a measure of differentially expressed genes across the libraries. A Java viewer program is provided for display and analysis of the results. Our results clearly show the benefit of using the PAVE assembler to explicitly use mate-pair information and bury ESTs for large contigs.

Conclusion: The PAVE assembler provides a software package for assembling Sanger and/or 454 ESTs. The assembly software, data management software, Java viewer and user's guide are freely available.
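
The two ideas in the Results, burying similar ESTs and joining non-overlapping mate-pair sub-contigs with n's, can be illustrated with a minimal Python sketch. This is not PAVE's implementation: the names `Est`, `bury_contained` and `join_mate_contigs` are hypothetical, "similar" is simplified here to exact substring containment (PAVE itself relies on MegaBLAST and CAP3), and the gap length of 10 is an arbitrary assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Est:
    """Hypothetical EST record: a name, a sequence, and the names buried in it."""
    name: str
    seq: str
    buried: list = field(default_factory=list)

    @property
    def expression_count(self) -> int:
        # The parent itself plus every EST buried in it, so the
        # expression level survives even though buried ESTs are set aside.
        return 1 + len(self.buried)


def bury_contained(ests):
    """Keep one parent per group; bury ESTs fully contained in a longer parent.

    Assumption: containment is an exact substring match, which is far
    stricter than the similarity criteria a real assembler would use.
    """
    parents = []
    # Longest first, so a potential parent is always seen before its children.
    for est in sorted(ests, key=lambda e: len(e.seq), reverse=True):
        parent = next((p for p in parents if est.seq in p.seq), None)
        if parent is None:
            parents.append(est)
        else:
            parent.buried.append(est.name)
    return parents


def join_mate_contigs(five_prime: str, three_prime: str, gap: int = 10) -> str:
    """Join two non-overlapping sub-contigs with a run of n's (gap size assumed)."""
    return five_prime + "n" * gap + three_prime


ests = [
    Est("a", "ACGTACGTAA"),
    Est("b", "CGTACG"),
    Est("c", "TTTT"),
    Est("d", "ACGTACGTAA"),
]
parents = bury_contained(ests)
for p in parents:
    # Mirrors the viewer convention of showing the buried count in parentheses.
    print(f"{p.name} ({len(p.buried)})")

joined = join_mate_contigs("ACGT", "TTGA", gap=3)
print(joined)
```

Only the parents would be passed on to the assembly step in this sketch, which is where the time and space savings would come from; the buried names stay attached to their parent so counts per library could still be computed.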

Figure 3: The jPAVE interface. Within the contig display, the numbers in parentheses are the number of ESTs buried in the corresponding EST. From the stand-alone version of jPAVE shown here, a set of ESTs can be selected, and CAP3 or Phrap can be executed on the ESTs. Also, ESTs from multiple contigs can be selected and assembled. The 'Contig Pairs' link lists all pairs of contigs that are similar; selecting a pair shows the nucleotide and amino acid alignment.

Mentions: The jPAVE program can be run either as a standalone program or as a web applet. The initial window shows all PAVE projects, any number of which can be selected for viewing (e.g. for comparing assemblies). As shown in Figure 3, jPAVE uses a BioMart [32] style query to make complex queries on the annotation easy. Individual contigs can be displayed graphically or as base-pair sequences. By default, the buried ESTs are not displayed, which can save considerable time when displaying the contig; the number of buried ESTs is indicated next to each parent EST. The alignment of two CCSs can be viewed by nucleotide and amino acid similarity. In the standalone version, ESTs can be selected to assemble with CAP3 or Phrap using user-specified parameters.

