Identification of novel fusion genes in lung cancer using breakpoint assembly of transcriptome sequencing data.

Fernandez-Cuesta L, Sun R, Menon R, George J, Lorenz S, Meza-Zepeda LA, Peifer M, Plenker D, Heuckmann JM, Leenders F, Zander T, Dahmen I, Koker M, Schöttle J, Ullrich RT, Altmüller J, Becker C, Nürnberg P, Seidel H, Böhm D, Göke F, Ansén S, Russell PA, Wright GM, Wainer Z, Solomon B, Petersen I, Clement JH, Sänger J, Brustugun OT, Helland Å, Solberg S, Lund-Iversen M, Buettner R, Wolf J, Brambilla E, Vingron M, Perner S, Haas SA, Thomas RK - Genome Biol. (2015)

Bottom Line: Genomic translocation events frequently underlie cancer development through generation of gene fusions with oncogenic properties. Identification of such fusion transcripts by transcriptome sequencing might help to discover new potential therapeutic targets. We apply TRUP to RNA-seq data of different tumor types, and find it to be more sensitive than alternative tools in detecting chimeric transcripts, such as secondary rearrangements in EML4-ALK-positive lung tumors, or recurrent inactivating rearrangements affecting RASSF8.


ABSTRACT
Genomic translocation events frequently underlie cancer development through generation of gene fusions with oncogenic properties. Identification of such fusion transcripts by transcriptome sequencing might help to discover new potential therapeutic targets. We developed TRUP (Tumor-specimen suited RNA-seq Unified Pipeline) (https://github.com/ruping/TRUP), a computational approach that combines split-read and read-pair analysis with de novo assembly for the identification of chimeric transcripts in cancer specimens. We apply TRUP to RNA-seq data of different tumor types, and find it to be more sensitive than alternative tools in detecting chimeric transcripts, such as secondary rearrangements in EML4-ALK-positive lung tumors, or recurrent inactivating rearrangements affecting RASSF8.


Figure 1: Overview of the TRUP pipeline. The schematic diagram in the left panel shows the four major processing steps applied in TRUP. The cartoon in the right panel illustrates an example of detecting a fusion event. White and black boxes indicate reads mapped to gene A and to gene B, respectively. In the first step, TRUP aligns the read pairs to the genome, allowing discovery of chimeric alignments (read pairs p2 and p7 in the cartoon) and partial alignments (p1, p3, p6, and p8). To ensure sensitive detection of candidate regions containing potential breakpoints, relaxed criteria are adopted to call breakpoints from chimeric/partial alignments, as well as from entirely aligned discordant pairs (p4 and p5). Subsequently, to reach high accuracy, de novo assembly is performed on each candidate region using the read pairs anchored in that region. Lastly, bona fide breakpoints relative to the genome are identified from the assembled sequences. A fusion candidate is called if it attracts a sufficient number of supporting reads. While the mapping and assembly steps use state-of-the-art algorithms, the breakpoint searching and fusion calling steps are novel (Materials and methods).
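The three evidence classes named in the caption (chimeric, partial, and fully aligned but discordant pairs) can be illustrated with a small sketch. This is not TRUP's code: the `ReadAlignment` record, the 90% aligned-fraction cutoff, and the `max_insert` distance are assumptions chosen for illustration, and a real RNA-seq pipeline would also have to account for introns when judging the distance between mates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadAlignment:
    """Hypothetical, simplified alignment record for one read of a pair."""
    chrom: str
    start: int
    read_len: int
    aligned_len: int                   # bases aligned at the primary location
    split_chrom: Optional[str] = None  # where the remainder aligns, if anywhere
    split_start: Optional[int] = None


def classify_read(aln: ReadAlignment, min_full_fraction: float = 0.9) -> str:
    """Label a single read as 'chimeric', 'partial', or 'full'."""
    if aln.split_chrom is not None:
        return "chimeric"   # the read is split between two locations
    if aln.aligned_len < min_full_fraction * aln.read_len:
        return "partial"    # only part of the read aligns anywhere
    return "full"


def classify_pair(r1: ReadAlignment, r2: ReadAlignment,
                  max_insert: int = 1000) -> str:
    """Label a read pair by its strongest type of breakpoint evidence."""
    kinds = {classify_read(r1), classify_read(r2)}
    if "chimeric" in kinds:
        return "chimeric"
    if "partial" in kinds:
        return "partial"
    # Both mates align entirely; they are discordant if they map to
    # different chromosomes or lie farther apart than assumed plausible.
    if r1.chrom != r2.chrom or abs(r1.start - r2.start) > max_insert:
        return "discordant"
    return "concordant"
```

In this sketch, any pair not labeled `concordant` would count as a hint of a candidate breakpoint region, mirroring the relaxed calling criteria described in the caption.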

Mentions: To detect fusion transcripts from paired-end (PE) RNA-seq data, we need to identify the fusion point from the sequencing read alignments. Discordant mappings of mate pairs, which include chimeric as well as partial alignments of an individual read, are reported by GSNAP [11] or STAR [12]. To guarantee high sensitivity, TRUP collects all candidate regions containing potential breakpoints suggested by those abnormal alignments. Additionally, for each candidate region, de novo assembly is performed using the de Bruijn graph assembler Velvet [13] and Oases, a modified version of Velvet that employs additional filters to optimize the merging of multiple assemblies, specifically of transcriptome sequencing data [14]. The aim is to construct possible contigs from each region by leveraging dependency among reads. After sensitive split-read mapping and specific de novo assembly, fusion candidates are filtered and ranked based on repeat content and the number of reads supporting the fusion points (Figure 1; Materials and Methods).
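The final filtering and ranking step can be sketched as follows. The passage states only that candidates are filtered and ranked by repeat content and by the number of supporting reads; the `FusionCandidate` record, the two thresholds, and the tie-breaking order below are assumptions made for illustration and are not taken from TRUP.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FusionCandidate:
    """Hypothetical summary of one candidate fusion point."""
    gene_a: str
    gene_b: str
    supporting_reads: int    # reads spanning or encompassing the fusion point
    repeat_fraction: float   # fraction of flanking sequence in repeats, 0..1


def rank_candidates(candidates: List[FusionCandidate],
                    min_reads: int = 3,
                    max_repeat_fraction: float = 0.5) -> List[FusionCandidate]:
    """Drop weakly supported or repeat-rich candidates, then rank the rest.

    Both thresholds are illustrative defaults, not values from the paper.
    """
    kept = [
        c for c in candidates
        if c.supporting_reads >= min_reads
        and c.repeat_fraction <= max_repeat_fraction
    ]
    # More supporting reads first; lower repeat content breaks ties.
    return sorted(kept, key=lambda c: (-c.supporting_reads, c.repeat_fraction))


if __name__ == "__main__":
    # Made-up candidates with placeholder gene names.
    demo = [
        FusionCandidate("GENE_A", "GENE_B", supporting_reads=12, repeat_fraction=0.05),
        FusionCandidate("GENE_C", "GENE_D", supporting_reads=2, repeat_fraction=0.00),
        FusionCandidate("GENE_E", "GENE_F", supporting_reads=12, repeat_fraction=0.30),
        FusionCandidate("GENE_G", "GENE_H", supporting_reads=20, repeat_fraction=0.90),
    ]
    for c in rank_candidates(demo):
        print(c.gene_a, c.gene_b, c.supporting_reads, c.repeat_fraction)
```

Sorting on a tuple keeps the ranking deterministic when two candidates have the same read support, which is one simple way to combine the two criteria the text names.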

