A Single-Array-Based Method for Detecting Copy Number Variants Using Affymetrix High Density SNP Arrays and its Application to Breast Cancer.

Li M, Wen Y, Fu W - Cancer Inform (2015)

Bottom Line: This method requires no between-array normalization, and thus maintains data integrity and independence of samples among individual subjects. In addition to our efforts to apply new statistical technology to raw fluorescence values, the HMM has been applied to the standardized copy number abundance in order to reduce experimental noise. Through simulations, we show our refined method is able to infer copy number variants accurately.


Affiliation: Division of Biostatistics, Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, AR, USA.

ABSTRACT
Cumulative evidence has shown that structural variations, due to insertions, deletions, and inversions of DNA, may contribute considerably to the development of complex human diseases, such as breast cancer. High-throughput genotyping technologies, such as Affymetrix high density single-nucleotide polymorphism (SNP) arrays, have produced large amounts of genetic data for genome-wide SNP genotype calling and copy number estimation. Meanwhile, there is a great need for accurate and efficient statistical methods to detect copy number variants. In this article, we introduce a hidden-Markov-model (HMM)-based method, referred to as the PICR-CNV, for copy number inference. The proposed method first estimates copy number abundance for each single SNP on a single array based on the raw fluorescence values, and then standardizes the estimated copy number abundance to achieve equal footing among multiple arrays. This method requires no between-array normalization, and thus, maintains data integrity and independence of samples among individual subjects. In addition to our efforts to apply new statistical technology to raw fluorescence values, the HMM has been applied to the standardized copy number abundance in order to reduce experimental noise. Through simulations, we show our refined method is able to infer copy number variants accurately. Application of the proposed method to a breast cancer dataset helps to identify genomic regions significantly associated with the disease.
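The two-stage idea in the abstract (standardize each array's abundance estimates on their own, then smooth them with an HMM) can be illustrated with a minimal sketch. This is not the PICR-CNV implementation: the median/MAD standardization, the three Gaussian states (loss, normal, gain), the shared standard deviation, and the `stay` transition probability are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from statistics import median


def standardize(values):
    """Center and scale one array's abundance estimates using only that array.

    Assumption: median and MAD are used as robust location and scale;
    the paper's actual standardization may differ.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = 1.4826 * mad if mad > 0 else 1.0  # 1.4826 makes MAD comparable to an SD
    return [(v - med) / scale for v in values]


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi(obs, means=(-3.0, 0.0, 3.0), sd=1.0, stay=0.99):
    """Most likely state path for a Gaussian-emission HMM.

    States are indexed by position in `means` (here 0 = loss, 1 = normal,
    2 = gain). All parameter values are illustrative assumptions.
    """
    n_states = len(means)
    switch = (1.0 - stay) / (n_states - 1)
    log_trans = [
        [math.log(stay if i == j else switch) for j in range(n_states)]
        for i in range(n_states)
    ]
    # Uniform initial distribution over states.
    score = [math.log(1.0 / n_states) + log_normal_pdf(obs[0], m, sd) for m in means]
    back = []
    for x in obs[1:]:
        prev = score
        pointers, score = [], []
        for j in range(n_states):
            best = max(range(n_states), key=lambda i: prev[i] + log_trans[i][j])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][j] + log_normal_pdf(x, means[j], sd))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy standardized signal: a 10-SNP segment with reduced abundance.
toy_signal = [0.0] * 20 + [-3.0] * 10 + [0.0] * 20
toy_calls = viterbi(toy_signal)
```

Because `standardize` uses only the values from one array, no information is shared between samples, which is the property the abstract emphasizes. The high `stay` probability is what suppresses isolated noisy SNPs: a state change is only called when several consecutive SNPs support it.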


Distribution of the size of identified CNVs based on BRCA GWAS data.

Oligonucleotide microarrays annotate each SNP using a set of 24 probes, each a 25-mer nucleic acid sequence that is photolithographically synthesized and immobilized on the array. The target sequences are labeled with a 3′-fluorescent dye before hybridization to the array, and their abundance is typically measured by the fluorescence intensity on the array after hybridization [31–33]. In a 500K SNP array, six quartets are used to interrogate a single dimorphic SNP site, whose two possible alleles are commonly denoted A and B. Each quartet consists of four types of probes, each 25 base pairs in length. For each allele, a probe is designed either to match the target sequence perfectly (PM) or to mismatch it (MM) at a particular nucleotide site, giving four probe types: perfect match A, mismatch A, perfect match B, and mismatch B, denoted PA, MA, PB, and MB, respectively. The probe sets are also designed to hybridize with either sense strands (s = 1) or antisense strands (s = –1). The quartets differ in the shift k of the nucleotide on the probe sequence relative to the center nucleotide of the probe (k = 0 at position 13 of the 25 base pairs); k may take the values –4, –3, –2, –1, 0, 1, 2, 3, 4 (see Fig. 1A of Matsuzaki et al. [34] for a detailed illustration).
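The probe layout described above can be written down as a small data structure, which makes the counting explicit: six quartets times four probe types gives the 24 probes per SNP. The sketch below is illustrative only. The class and function names are hypothetical, the particular strand and shift combinations in `example_quartets` are made up (the actual combinations vary by SNP), and reading the shift as "position 13 + k" is an assumed sign convention.

```python
from dataclasses import dataclass

PROBE_TYPES = ("PA", "MA", "PB", "MB")  # perfect match / mismatch for alleles A and B
PROBE_LENGTH = 25                        # each probe is a 25-mer
CENTER_POSITION = 13                     # k = 0 corresponds to position 13
ALLOWED_SHIFTS = range(-4, 5)            # k in {-4, ..., 4}


@dataclass(frozen=True)
class Probe:
    probe_type: str  # one of PROBE_TYPES
    strand: int      # s = 1 (sense) or s = -1 (antisense)
    shift: int       # k, offset from the center nucleotide

    @property
    def allele(self):
        return self.probe_type[1]

    @property
    def is_perfect_match(self):
        return self.probe_type[0] == "P"

    @property
    def snp_position(self):
        # Assumed convention: a positive k moves toward position 25.
        return CENTER_POSITION + self.shift


def build_probe_set(quartets):
    """Expand a list of (strand, shift) quartets into individual probes."""
    probes = []
    for strand, shift in quartets:
        if strand not in (1, -1):
            raise ValueError(f"strand must be 1 or -1, got {strand}")
        if shift not in ALLOWED_SHIFTS:
            raise ValueError(f"shift must be in -4..4, got {shift}")
        probes.extend(Probe(t, strand, shift) for t in PROBE_TYPES)
    return probes


# Hypothetical design for one SNP: six quartets with made-up strands and shifts.
example_quartets = [(1, -4), (1, 0), (1, 3), (-1, -2), (-1, 0), (-1, 4)]
probes = build_probe_set(example_quartets)
```

Keeping strand and shift on every probe record means that intensities can later be grouped by allele, by match type, or by quartet without re-deriving the design.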

