Chromosome level assembly of the hybrid Trypanosoma cruzi genome.

Weatherly DB, Boehlke C, Tarleton RL - BMC Genomics (2009)

Bottom Line: The approach was substantiated through the use of Southern blot analysis to confirm the mapping of BAC clones using as probes the genes they are predicted to contain, and each chromosome construction was visually validated to ensure sufficient evidence was present to support the organization. Now assembled, these chromosomes bring T. cruzi to the same level of organization as its kinetoplastid relatives and have been used as the basis for the T. cruzi genome in TriTrypDB, a trypanosome database of EuPathDB. In addition, they will provide the foundation for analyses such as reverse genetics, where the location of genes and their alleles and/or paralogues is necessary, and comparative genome hybridization analyses (CGH), where a chromosome-level view of the genome is ideal.

Affiliation: Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, GA, USA. dbrentw@uga.edu

ABSTRACT

Background: In contrast to the essentially fully assembled genome sequences of the kinetoplastid pathogens Leishmania major and Trypanosoma brucei, the assembly of the Trypanosoma cruzi genome has been hindered by its repetitive nature and the fact that the reference strain (CL Brener) is a hybrid of two distinct lineages. In this work, the majority of the contigs and scaffolds were assembled into pairs of homologous chromosomes based on predicted parental haplotype, inference from TriTryp synteny maps, and the use of end sequences from T. cruzi BAC libraries.

Results: Ultimately, 41 pairs of chromosomes were assembled using this approach, a number in agreement with the predicted number of T. cruzi chromosomes based upon pulsed-field gel analysis, with over 90% (21133 of 23216) of the genes annotated in the genome represented. The approach was substantiated through the use of Southern blot analysis to confirm the mapping of BAC clones using as probes the genes they are predicted to contain, and each chromosome construction was visually validated to ensure sufficient evidence was present to support the organization. While many members of large gene families are incorporated into the chromosome assemblies, the majority of genes excluded from the chromosomes belong to gene families, as these genes are frequently impossible to position accurately.

Conclusion: Now assembled, these chromosomes bring T. cruzi to the same level of organization as its kinetoplastid relatives and have been used as the basis for the T. cruzi genome in TriTrypDB, a trypanosome database of EuPathDB. In addition, they will provide the foundation for analyses such as reverse genetics, where the location of genes and their alleles and/or paralogues is necessary, and comparative genome hybridization analyses (CGH), where a chromosome-level view of the genome is ideal.

Figure 1: The 41 model chromosomes of T. cruzi. Because the CL Brener reference strain is a hybrid of the "non-Esmeraldo-like" and "Esmeraldo-like" lineages, each chromosome is comprised of 2 homologous chromosomes. These model chromosomes represent the consensus view of both haplotypes. Gene family members are depicted as non-blue colors; of note is the number of clusters of gene family members on the chromosomes, as well as in the artificially assembled contigs that were not assignable to individual chromosomes.

Mentions: Figure 1 shows all 41 model chromosomes ordered by size, and Table 1 shows the breakdown of the genes/contigs/scaffolds included. Over 90% (21133 of 23216) of the genes annotated in the published genome are represented [3]. Tick marks under each chromosome show the location of gaps, indicating that the appropriate sequence for these regions was either not identified or that the sequence is not included in the released genome sequence [3]. The size of the chromosomes varies substantially (~78 kb to ~2.4 Mb), also in agreement with previous physical data [17-22,24]. These sizes are likely underestimates: not all contigs/scaffolds have been included, because many of the small contigs could not be confidently placed. These contigs contain primarily members of the trans-sialidase, MASP, RHS, GP63, mucin, and DGF-1 families, which collectively make up approximately 23% of the annotated genes in the T. cruzi genome. In particular, the majority of the "unclosed" BAC clones (those for which the 5' and 3' ends were not oriented properly on the same chromosome) indicate that these sequences belong at the ends of the chromosome assemblies (telomeric and sub-telomeric regions). A total of 56 contigs containing telomeric repeats were identified. Of these, 39 were mapped to 22 of the chromosomes and the remaining 17 to the unassignable scaffolds and contigs. Of the 22 chromosomes containing telomeric repeats, 5 have these features on both ends (TcChr11, 22, 25, 27, and 35). Unassignable scaffolds and contigs are depicted as arbitrarily arranged assemblies in Figure 1.
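
The counts quoted above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch, not part of the authors' pipeline: the variable names are hypothetical, and the only inputs are the figures stated in the text (21133 of 23216 genes, 56 telomeric contigs split 39/17, and 22 of 41 model chromosomes carrying telomeric repeats).

```python
# Figures quoted in the text; all names below are illustrative, not the authors'.
GENES_ON_CHROMOSOMES = 21133
GENES_ANNOTATED = 23216
TELOMERIC_CONTIGS_TOTAL = 56
TELOMERIC_CONTIGS_ON_CHROMOSOMES = 39
TELOMERIC_CONTIGS_UNASSIGNED = 17
MODEL_CHROMOSOMES = 41
CHROMOSOMES_WITH_TELOMERIC_REPEATS = 22


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Share of annotated genes placed on the model chromosomes ("over 90%").
gene_coverage = percent(GENES_ON_CHROMOSOMES, GENES_ANNOTATED)

# Annotated genes left in the unassignable scaffolds and contigs.
genes_excluded = GENES_ANNOTATED - GENES_ON_CHROMOSOMES

# The mapped and unmapped telomeric contigs should add up to the stated total.
telomeric_counts_consistent = (
    TELOMERIC_CONTIGS_ON_CHROMOSOMES + TELOMERIC_CONTIGS_UNASSIGNED
    == TELOMERIC_CONTIGS_TOTAL
)

# Share of model chromosomes with at least one telomeric-repeat contig.
telomere_chromosome_share = percent(
    CHROMOSOMES_WITH_TELOMERIC_REPEATS, MODEL_CHROMOSOMES
)

print(f"Gene coverage: {gene_coverage:.1f}%")
print(f"Genes excluded: {genes_excluded}")
print(f"Telomeric contig counts consistent: {telomeric_counts_consistent}")
print(f"Chromosomes with telomeric repeats: {telomere_chromosome_share:.1f}%")
```

The check confirms that the stated gene fraction exceeds 90% and that the telomeric contig counts are internally consistent; it says nothing about how the contigs were placed.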

