Exome sequencing of a colorectal cancer family reveals shared mutation pattern and predisposition circuitry along tumor pathways.

Suleiman SH, Koko ME, Nasir WH, Elfateh O, Elgizouli UK, Abdallah MO, Alfarouk KO, Hussain A, Faisal S, Ibrahim FM, Romano M, Sultan A, Banks L, Newport M, Baralle F, Elhassan AM, Mohamed HS, Ibrahim ME - Front Genet (2015)

Bottom Line: Network analysis identified multiple hub genes with high centrality. A likely explanation for such a mutation pattern is DNA/RNA editing, suggested here by a nucleotide transition-to-transversion ratio that departed significantly from expected values (p-value 5e-6). NFKB1 also showed significant centrality along with ELAVL1, raising the suspicion of a viral etiology given the known interaction between oncogenic viruses and these proteins.


Affiliation: Faculty of Medicine, University of Khartoum, Khartoum, Sudan.

ABSTRACT
The molecular basis of cancer and of its multiple phenotypes is not yet fully understood. Next-generation sequencing promises new insight into the role of genetic interactions in shaping the complexity of cancer. Aiming to outline the differences in mutation patterns between familial colorectal cancer cases and controls, we analyzed whole exomes of cancer tissues and control samples from an extended colorectal cancer pedigree, providing one of the first exome sequencing data sets of cancer in an African population, against a background of large effective population size that typically carries an excess of variants. Tumors showed an hMSH2 loss-of-function SNV consistent with Lynch syndrome. Sets of genes harboring insertions-deletions in tumor tissues revealed, however, significant GO enrichment, a feature that was not seen in control samples, suggesting that ordered insertions-deletions are central to tumorigenesis in this type of cancer. Network analysis identified multiple hub genes with high centrality. ELAVL1/HuR showed remarkable centrality, interacting especially with genes harboring non-synonymous SNVs, thus reinforcing the proposition of targeted mutagenesis in cancer pathways. A likely explanation for such a mutation pattern is DNA/RNA editing, suggested here by a nucleotide transition-to-transversion ratio that departed significantly from expected values (p-value 5e-6). NFKB1 also showed significant centrality along with ELAVL1, raising the suspicion of a viral etiology given the known interaction between oncogenic viruses and these proteins.
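As a rough illustration of how a transition-to-transversion (Ti/Tv) ratio can be compared with an expected value, the sketch below counts transitions (A<->G, C<->T) and transversions in a list of SNVs and applies an exact two-sided binomial test. The abstract does not say which test produced the reported p-value, so the binomial test, the expected ratio of 3.0, and the toy SNV list are all assumptions made for illustration only.

```python
from math import comb

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def titv_counts(snvs):
    """Return (transitions, transversions) for a list of (ref, alt) pairs."""
    ti = sum(1 for ref, alt in snvs if (ref, alt) in TRANSITIONS)
    tv = len(snvs) - ti
    return ti, tv

def binomial_two_sided_p(k, n, p):
    """Exact two-sided binomial p-value: total probability of all
    outcomes that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(x for x in pmf if x <= threshold))

# Toy SNV list (hypothetical, not data from the study).
snvs = [("A", "G")] * 6 + [("C", "T")] * 4 + [("A", "C")] * 5 + [("G", "T")] * 5

ti, tv = titv_counts(snvs)
ratio = ti / tv

# ASSUMPTION: an expected Ti/Tv of 3.0, i.e. 75% of SNVs are transitions.
expected_ratio = 3.0
expected_ti_fraction = expected_ratio / (1 + expected_ratio)

p_value = binomial_two_sided_p(ti, ti + tv, expected_ti_fraction)
print(f"Ti={ti} Tv={tv} Ti/Tv={ratio:.2f} p={p_value:.4f}")
```

With real data the SNV list would come from the called variants, and the expected ratio would need to be justified for the capture design and population in question.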



Figure 3: Gene set-based exploratory analysis. (Top) Pattern of affected gene sharing between samples is summarized as intersections of SNVs/INDELs gene sets (Venn diagrams). Gene sets for INDELs included genes harboring splice site or exonic insertions–deletions. Gene sets for SNVs included genes harboring splice site or exonic (stop gain, stop loss, or non-synonymous) changes. (Middle) Non-Metric Multidimensional Scaling of distance matrix between samples based on INDELs and SNVs gene sets (set of genes affected by exonic INDELs and SNVs, respectively). The matrix was constructed using affected genes as variables (columns) and sampled individuals as rows. A gene is assigned a score of one if it showed an exonic or splice site change, or a zero otherwise. Non-metric multidimensional scaling is shown in four dimensions (maximum number of dimensions = N-1, where N is the number of rows). The stress value is <0.1 (approximates zero). Tumor samples are shown in red. Control samples are shown in blue. The clustering of tumor samples is evident in the first dimension especially for INDELs. (Bottom) Hierarchical clustering of samples using the same distance matrix described above is depicted and reflects separation of cases from controls. Clustering distance is larger between tumors and controls (rows) for INDELs compared to SNVs.
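The matrix construction described in the caption can be sketched as follows: each sample is a row, each affected gene is a column, and a cell is 1 if the gene carries an exonic or splice site change in that sample and 0 otherwise. The caption does not name the distance measure, so the Jaccard distance used here is an assumption, and the sample labels and gene names are hypothetical placeholders rather than data from the study.

```python
from itertools import combinations

# Hypothetical gene sets for two tumors (T1, T2) and three controls (C1-C3).
gene_sets = {
    "T1": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "T2": {"GENE_A", "GENE_B", "GENE_C", "GENE_E"},
    "C1": {"GENE_F", "GENE_G"},
    "C2": {"GENE_F", "GENE_H"},
    "C3": {"GENE_A", "GENE_G", "GENE_H"},
}

def binary_matrix(gene_sets):
    """Build the samples-by-genes 0/1 matrix (rows keyed by sample)."""
    genes = sorted(set().union(*gene_sets.values()))
    samples = sorted(gene_sets)
    rows = {s: [1 if g in gene_sets[s] else 0 for g in genes] for s in samples}
    return samples, genes, rows

def jaccard_distance(row_a, row_b):
    """1 - |shared genes| / |genes affected in either sample|."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return 1.0 - both / either if either else 0.0

def distance_matrix(samples, rows):
    """Symmetric pairwise distances, keyed by (sample, sample) tuples."""
    dist = {}
    for a, b in combinations(samples, 2):
        d = jaccard_distance(rows[a], rows[b])
        dist[(a, b)] = dist[(b, a)] = d
    return dist

samples, genes, rows = binary_matrix(gene_sets)
dist = distance_matrix(samples, rows)
for a, b in combinations(samples, 2):
    print(f"{a}-{b}: {dist[(a, b)]:.3f}")
```

A distance matrix of this kind is the input to both the multidimensional scaling and the hierarchical clustering panels; a different binary distance would change the numbers but not the overall workflow.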

Mentions: Tumor samples carried more variants than control samples (Table 1 and Figure 2). We used unsupervised learning to look for hidden structure in unlabeled data (blinding the phenotypes) with a gene-based approach, in which genes affected by exonic and/or splice site SNPs/INDELs served as variables (details in methods summary). We applied non-metric multidimensional scaling in four dimensions to look for a pattern in the genes affected by exonic variants across the five samples (Figure 3). For INDELs, the first dimension completely separated cases from controls, with tight sub-clustering of tumor and control samples; the second dimension perfectly separated the two cases, while the third and fourth dimensions reflected the variability between the controls (Figure 3). That cases and controls were completely separated without the phenotypes being included in the clustering matrix suggests that the observed pattern of gene hits in INDELs deviates from simple randomness, although with only five samples this remains a preliminary observation. Hierarchical clustering of the distance matrix gave comparable results. For SNVs, clustering of controls was less evident (Figure 3). The third dimension likely reflected IBD sharing of alleles, producing a separation of samples comparable to the pedigree. In the first and second dimensions, P84 behaved similarly to the cancer samples, though in an intermediate position, and deviated from the expected clustering with the two other controls. We can argue from this preliminary analysis that a gene-based signature of exonic INDEL hits is a reasonable possibility in tumorigenesis. However, a distinct pattern in each tumor is clearly seen as well.
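The hierarchical clustering step mentioned above can be illustrated with a minimal agglomerative procedure run on a precomputed distance matrix. The linkage method is not stated in the text, so average linkage is an assumption here, and the five sample labels and all distances are invented toy values, not results from the study.

```python
def average_linkage(dist, labels, n_clusters=2):
    """Merge the two closest clusters (mean pairwise distance) until
    only n_clusters remain. dist is keyed by frozenset({a, b})."""
    clusters = [frozenset([label]) for label in labels]

    def cluster_distance(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters; ties resolve by index order.
        _, i, j = min(
            (cluster_distance(a, b), i, j)
            for i, a in enumerate(clusters)
            for j, b in enumerate(clusters)
            if i < j
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

# Hypothetical distances: two tumors (T1, T2) and three controls (C1-C3).
labels = ["T1", "T2", "C1", "C2", "C3"]
pairwise = {
    ("T1", "T2"): 0.40,
    ("C1", "C2"): 0.50,
    ("C1", "C3"): 0.60,
    ("C2", "C3"): 0.55,
}
# Assume every tumor-control pair is far apart in this toy example.
for t in ("T1", "T2"):
    for c in ("C1", "C2", "C3"):
        pairwise[(t, c)] = 0.90

dist = {frozenset(pair): d for pair, d in pairwise.items()}
clusters = average_linkage(dist, labels, n_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

In this toy setup the two remaining clusters coincide with tumors and controls, mirroring the kind of separation described for INDELs; with real data the outcome depends on the distance measure and linkage chosen.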

