cFinder: definition and quantification of multiple haplotypes in a mixed sample.

Niklas N, Hafenscher J, Barna A, Wiesinger K, Pröll J, Dreiseitl S, Preuner-Stix S, Valent P, Lion T, Gabriel C - BMC Res Notes (2015)

Bottom Line: BCR-ABL1 samples containing multiple clones were used for testing; cFinder identified all previously found clones together with their abundance and even refined some results. cFinder reports the connections of variants (haplotypes) with their read count and relative occurrence (percentage). To our knowledge, this is the first software that enables researchers without extensive bioinformatic support to designate multiple haplotypes and determine how they make up a sample.


Affiliation: Red Cross Transfusion Service for Upper Austria, Krankenhausstraße 7, 4017, Linz, Austria. norbert.niklas@o.roteskreuz.at.

ABSTRACT

Background: Next-generation sequencing allows for determining the genetic composition of a mixed sample. For instance, when performing resistance testing for BCR-ABL1 it is necessary to identify clones and define compound mutations; together with an exact quantification, this may provide additional information for diagnosis and therapy decisions. This applies not only to oncology but also to the characterization of viral, bacterial or fungal infections. Retrieving multiple haplotypes (more than two) and their proportions from data with conventional software is difficult and cumbersome and demands multiple manual steps.

Results: We therefore developed a tool called cFinder that automatically detects haplotypes and accurately quantifies them within one sample. BCR-ABL1 samples containing multiple clones were used for testing; cFinder identified all previously found clones together with their abundance and even refined some results. Additionally, reads with multiple haplotypes were simulated using GemSIM; the detected proportions were very close to linear with respect to the simulated ones (R² = 0.96). Our aim is not to deduce haploblocks statistically, but to characterize the composition of one sample precisely. cFinder reports the connections of variants (haplotypes) with their read count and relative occurrence (percentage). Download is available at http://sourceforge.net/projects/cfinder/.
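The kind of report described above (haplotype, read count, percentage) can be illustrated with a minimal Python sketch. This is not cFinder's algorithm. It assumes each read has already been reduced to the set of variants it carries; the function name `quantify_haplotypes` and the variant labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def quantify_haplotypes(read_variants):
    """Group reads by their exact variant combination and report, for each
    combination, the read count and its percentage of all reads."""
    # frozenset makes the variant combination hashable and order-independent
    counts = Counter(frozenset(variants) for variants in read_variants)
    total = sum(counts.values())
    report = []
    for haplotype, n in counts.most_common():
        report.append((tuple(sorted(haplotype)), n, 100.0 * n / total))
    return report


# Illustrative input only: one set of variant labels per read.
# An empty set stands for a read without any variant.
reads = [
    {"T315I"}, {"T315I"}, {"T315I", "E255K"}, {"T315I", "E255K"},
    set(), set(), set(), set(),
]

for variants, n, pct in quantify_haplotypes(reads):
    label = "+".join(variants) if variants else "no variant"
    print(f"{label}\t{n}\t{pct:.1f}%")
```

Grouping on the exact variant combination keeps compound mutations distinct from the same variants occurring on separate reads, which is the distinction the abstract emphasizes. A real tool would additionally have to handle reads that cover only some of the variant positions and to filter sequencing errors.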

Conclusions: cFinder implements an efficient algorithm that can be run on a low-performance desktop computer. Furthermore, it considers paired-end information (if available) and is generally open to any current next-generation sequencing technology and alignment strategy. To our knowledge, this is the first software that enables researchers without extensive bioinformatic support to designate multiple haplotypes and determine how they make up a sample.


Fig. 4: Scatterplot of simulated reads. The frequencies of clones were plotted against the actual detected frequencies. One dot represents one clone; perfect matches are on the red line.

Mentions: In addition, we used GemSIM to simulate 100 k reads (l = 600) for a short (924 bp) fragment of E. coli using the profile for 454 data [10]. In total, the 26 test cases comprised 133 different haplotypes (min. 3, max. 8 haplotypes, each with up to 9 variants, avg. 4.2 variants). We included multiple clones (haplotypes) that were created by the GemSIM software (GemHaps.py) along with sequencing errors. After alignment with CLC GWB, cFinder detected all haplotypes. Using the absolute number of reads detected, the estimated percentages correlated with the simulated percentages with R² = 0.96, which is nearly linear. The deviation can be related to the high number of errors included in the 454 profile. Figure 4 displays a scatterplot visualizing the simulated and detected percentages.
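The comparison of simulated and detected percentages can be sketched as follows. The text does not state how R² was computed; this sketch assumes the squared Pearson correlation coefficient. The function name `r_squared` is hypothetical, and the numbers in the usage example are invented for illustration, not taken from the study.

```python
def r_squared(simulated, detected):
    """Squared Pearson correlation between simulated and detected values."""
    if len(simulated) != len(detected) or len(simulated) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(simulated)
    mean_x = sum(simulated) / n
    mean_y = sum(detected) / n
    # Sums of cross products and of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(simulated, detected))
    sxx = sum((x - mean_x) ** 2 for x in simulated)
    syy = sum((y - mean_y) ** 2 for y in detected)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


# Invented example: percentages of four haplotypes in one test case
simulated_pct = [40.0, 30.0, 20.0, 10.0]
detected_pct = [38.5, 31.0, 21.5, 9.0]
print(f"R^2 = {r_squared(simulated_pct, detected_pct):.3f}")
```

A value of 1 means the detected percentages lie exactly on a straight line with respect to the simulated ones; note that this measures linearity, so a systematic offset or scaling would not lower it.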

