Re-examination of chimp protein kinases suggests "novel architectures" are gene prediction artifacts.

Robison K - BMC Genomics (2010)

Bottom Line: Anamika et al concluded from a sequence alignment analysis that several chimpanzee kinases have unusual domain arrangements. None of the proposed novel chimpanzee kinase architectures are supported by experimental evidence. Guidelines to prevent such erroneous conclusions in similar papers are proposed.


ABSTRACT

Background: Anamika et al recently published in this journal a sequence alignment analysis of protein kinases encoded by the chimpanzee genome in comparison to those in the human genome. From this analysis they concluded that several chimpanzee kinases have unusual domain arrangements.

Results: Re-examination of these kinases reveals that the claimed novel arrangements cannot withstand scrutiny; each is either not novel or represents over-analysis of low-confidence, computer-generated gene models. Additional sequence evidence available at the time of the paper's submission either directly contradicts the gene models or suggests alternate gene models. These alternate models would minimize or eliminate the observed differences between human and chimp kinases.

Conclusion: None of the proposed novel chimpanzee kinase architectures are supported by experimental evidence. Guidelines to prevent such erroneous conclusions in similar papers are proposed.


Figure 1: Alignment of ENSPTRP00000001150 with PLK3 from human (RefSeq NP_004064.2), exon structure from human, and chimp Whole Genome Shotgun reads from the NCBI trace archive corresponding to segments homologous to human PLK3 but missing from ENSPTRP00000001150. Asterisks mark remaining amino acid changes between human and chimp PLK3 if all of the additional information is incorporated. Exons implied by chimp whole genome shotgun traces (NCBI Trace Archive entries 231320434, 240048640 and 236037896) are also shown. The kinase domain of human PLK3 is highlighted in yellow, with the key ATP binding region underlined.

In one case, Anamika et al claim to identify a chimp kinase (ENSPTRP00000001150) whose closest kinase domain relative in human has 31% identity, a distance which is nearly unimaginable given the great similarity between chimps and humans. Furthermore, this particular kinase is claimed to have greatest similarity to casein kinase 1 but to possess a polo box, a domain involved in the specific recognition of phosphorylated peptides. Polo boxes have been found only in polo-like kinases [4], so finding a polo box on a kinase in a different subfamily (such as casein kinase 1) would be a very remarkable finding. However, a BLAST search of the chimp ORF against the RefSeq human protein database reveals the best human match to be Polo-like kinase 3 (PLK3), with >90% sequence identity overall (Figure 1) and 100% identity. Furthermore, careful searching of the chimpanzee whole-genome shotgun sequence reveals reads consistent with most of the pieces missing relative to human PLK3 (Figure 1), with the exception of a 3' portion of exon 3. Supplementation with this additional data yields a chimp gene model with 100% identity in the protein kinase domain (positions 62 to 314 as annotated in UniProt). The restored sequence also contains the ATP-binding site, as annotated in UniProt (positions 68-76); hence the chimp gene model used by Anamika et al is either incomplete or non-functional as a kinase, because this site is essential to protein kinase function. While it cannot be conclusively demonstrated that these reads should be incorporated into the chimp gene model, their presence in the raw sequence data suggests that a finished assembly would probably contain the missing exonic regions. It is also worth noting that the missing pieces each correspond to one or more contiguous exons; in other words, the differences between the chimp model and the human protein are entirely explainable by the gene prediction program skipping exons.
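The identity and domain-coverage reasoning above can be illustrated with a minimal Python sketch. It is not the procedure used in the paper: the function names `percent_identity` and `region_coverage` are hypothetical, and the two aligned strings are toy sequences, not PLK3. The sketch only shows how a gene model with skipped exons can score lower overall while its aligned residues remain identical, and how one might check whether an annotated region (such as an ATP-binding site) is present in a model at all.

```python
def percent_identity(aligned_a: str, aligned_b: str, count_gaps: bool = True) -> float:
    """Percent identity between two equal-length aligned sequences ('-' marks a gap).

    With count_gaps=True, columns where only one sequence has a gap count as
    mismatches; with count_gaps=False, they are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information
        if a == "-" or b == "-":
            if count_gaps:
                compared += 1
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def region_coverage(aligned_ref: str, aligned_query: str, start: int, end: int) -> float:
    """Fraction of reference positions start..end (1-based, inclusive)
    that are aligned to a query residue rather than to a gap."""
    ref_pos = 0
    covered = 0
    for r, q in zip(aligned_ref, aligned_query):
        if r == "-":
            continue  # insertion in the query; does not advance the reference
        ref_pos += 1
        if start <= ref_pos <= end and q != "-":
            covered += 1
    return covered / (end - start + 1)


# Toy alignment (illustrative only): the query lacks reference positions 4-6.
TOY_REF = "MKTAYIAKQR"
TOY_QUERY = "MKT---AKQR"

if __name__ == "__main__":
    print(percent_identity(TOY_REF, TOY_QUERY))                    # gaps penalized
    print(percent_identity(TOY_REF, TOY_QUERY, count_gaps=False))  # aligned residues only
    print(region_coverage(TOY_REF, TOY_QUERY, 4, 6))               # annotated region missing
```

For a real comparison, the aligned strings would come from an alignment of the chimp model against human PLK3, and the region bounds from the UniProt annotation cited above.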
One interesting possibility raised by these chimp fragments is that chimp PLK3 has deleted a short region of exon 3. This is supported by two reads in the NCBI Trace Archive. However, given the sparseness of the data, it could also be the case that the remainder of exon 3 is present in the chimp genome but as yet unsequenced, or that both of these reads contain artifacts preventing the detection of the missing portion. Furthermore, the chimp protein contains an insertion (GGDLPSVEEVEPAPP) relative to both human and macaque proteins. Otherwise, it is striking that the potential chimp exons have precisely the same amino acid boundaries as the known human gene structure. In any case, the simplest conclusion is that ENSPTRP00000001150 is chimp PLK3, and its possession of a polo box is therefore unsurprising.
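The exon-skipping argument can be made concrete with a short Python sketch, offered as an illustration rather than as the author's method. The helper `spans_whole_exons` and the exon coordinates in `TOY_EXONS` are hypothetical (they are not the real PLK3 exon structure). The idea is to test whether a segment missing from a gene model coincides exactly with one or more contiguous exons, which is the signature expected when a gene prediction program skips exons; a gap ending inside an exon would instead suggest a partial deletion or incomplete sequence.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive protein coordinates


def spans_whole_exons(missing: Interval, exons: List[Interval]) -> Optional[List[int]]:
    """Return the indices of the exons that exactly tile `missing`, or None.

    Assumes `exons` is sorted, non-overlapping, and adjacent (each exon
    starts right after the previous one ends).
    """
    start, end = missing
    # Exons lying entirely inside the missing interval.
    hit = [i for i, (s, e) in enumerate(exons) if s >= start and e <= end]
    if not hit:
        return None
    # The gap must begin and end exactly on exon boundaries.
    if exons[hit[0]][0] != start or exons[hit[-1]][1] != end:
        return None
    return hit


# Hypothetical exon layout in protein coordinates (illustrative only).
TOY_EXONS: List[Interval] = [(1, 40), (41, 95), (96, 150), (151, 200)]

if __name__ == "__main__":
    # Gap matching the second and third exons exactly: consistent with exon skipping.
    print(spans_whole_exons((41, 150), TOY_EXONS))
    # Gap ending mid-exon: not explained by whole-exon skipping alone.
    print(spans_whole_exons((41, 120), TOY_EXONS))
```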

