Streamlined Genome Sequence Compression using Distributed Source Coding.

Wang S, Jiang X, Chen F, Cui L, Cheng S - Cancer Inform (2014)

Bottom Line: Existing techniques that require heavy client (encoder side) cannot be applied. To tackle this challenge, we carefully examined distributed source coding theory and developed a customized reference-based genome compression protocol to meet the low-complexity need at the client side. Our experimental results showed promising performance of the proposed method when compared with the state-of-the-art algorithm (GRS).


Affiliation: Division of Biomedical Informatics, University of California, San Diego, La Jolla, CA, USA.

ABSTRACT
We aim to develop a streamlined genome sequence compression algorithm to support alternative miniaturized sequencing devices, which have limited communication, storage, and computation power. Existing techniques that require a heavy client (encoder side) cannot be applied. To tackle this challenge, we carefully examined distributed source coding theory and developed a customized reference-based genome compression protocol to meet the low-complexity need at the client side. Based on the variation between source and reference, our protocol adaptively picks either syndrome coding or hash coding to compress subsequences of variable code length. Our experimental results showed promising performance of the proposed method when compared with the state-of-the-art algorithm (GRS).
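
As a rough illustration of the two coding primitives named in the abstract, the sketch below shows a toy hash check and a toy syndrome code. It is not the authors' protocol: the 7-bit binary blocks, the Hamming(7,4) code, the truncated SHA-256 hash, and the function names `syndrome`, `decode_with_syndrome`, and `block_hash` are all assumptions chosen for brevity, and the rule the paper uses to select between the two modes is not reproduced here.

```python
import hashlib


def syndrome(bits):
    """Hamming(7,4) syndrome of a 7-bit block, returned as an integer 0..7.

    With parity-check columns equal to the binary form of positions 1..7,
    the syndrome is the XOR of the 1-based positions of the set bits.
    """
    s = 0
    for pos, bit in enumerate(bits, start=1):
        if bit:
            s ^= pos
    return s


def decode_with_syndrome(source_syndrome, reference_bits):
    """Recover the source block from its syndrome plus the decoder's reference.

    Assumes (toy setting) that source and reference differ in at most one bit.
    """
    diff = source_syndrome ^ syndrome(reference_bits)
    recovered = list(reference_bits)
    if diff:                      # non-zero: 1-based position of the differing bit
        recovered[diff - 1] ^= 1
    return recovered


def block_hash(bits):
    """Short hash of a block (assumption: first 8 hex digits of SHA-256)."""
    return hashlib.sha256(bytes(bits)).hexdigest()[:8]


# Encoder side holds the source; decoder side holds only the reference.
source = [1, 0, 1, 1, 0, 0, 1]
reference = [1, 0, 1, 0, 0, 0, 1]   # differs from source at position 4

# Hash coding: a matching hash would tell the decoder its reference block is already correct.
reference_matches = block_hash(reference) == block_hash(source)

# Syndrome coding: 3 bits from the encoder suffice to correct the reference block.
recovered = decode_with_syndrome(syndrome(source), reference)
assert recovered == source
```

In this toy setting the encoder never sees the reference: it sends either a short hash or a 3-bit syndrome, and all of the comparison work happens at the decoder, which is the low-complexity-client property the abstract emphasizes.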


Figure 1: Workflow of genome compression based on DSC.

The block diagram of the proposed genome compression framework is depicted in Figure 1. Suppose that two correlated DNA sequences (i.e., the source and reference sequences) are available at the encoder and decoder, respectively; the variations between the two sequences are modeled by insertion, deletion, and substitution. The alphabet of the DNA sequences studied is confined to the set {“A”, “C”, “G”, “T”, “N”}, where “N” denotes an unknown base due to low sequencing quality. Figure 2 shows the logical flow of the proposed framework, which we will discuss in detail.
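
To make the variation model concrete, the following minimal sketch counts the fewest insertions, deletions, and substitutions that turn a reference sequence into a source sequence over the five-symbol alphabet. It uses the standard Levenshtein dynamic program purely as an illustration; the names `ALPHABET`, `validate`, and `edit_distance` are hypothetical, and nothing here is taken from the paper's own implementation.

```python
ALPHABET = set("ACGTN")  # "N" marks an unknown base


def validate(seq):
    """Raise ValueError if seq contains a symbol outside the DNA alphabet."""
    bad = set(seq) - ALPHABET
    if bad:
        raise ValueError(f"unexpected symbols: {sorted(bad)}")
    return seq


def edit_distance(source, reference):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn `reference` into `source`."""
    validate(source)
    validate(reference)
    # prev[j]: cost of turning the empty reference prefix into source[:j]
    prev = list(range(len(source) + 1))
    for i, r in enumerate(reference, start=1):
        curr = [i]  # turning reference[:i] into the empty string takes i deletions
        for j, s in enumerate(source, start=1):
            curr.append(min(
                prev[j] + 1,             # delete r from the reference
                curr[j - 1] + 1,         # insert s into the reference
                prev[j - 1] + (r != s),  # substitute r with s (free if equal)
            ))
        prev = curr
    return prev[-1]
```

For example, `edit_distance("ACGNT", "ACGT")` counts a single insertion of the unknown base, and `edit_distance("ACGT", "AGGT")` counts a single substitution.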

