Chromosome level assembly of the hybrid Trypanosoma cruzi genome.

Weatherly DB, Boehlke C, Tarleton RL - BMC Genomics (2009)

Bottom Line: The approach was substantiated through the use of Southern blot analysis to confirm the mapping of BAC clones using as probes the genes they are predicted to contain, and each chromosome construction was visually validated to ensure sufficient evidence was present to support the organization. Now assembled, these chromosomes bring T. cruzi to the same level of organization as its kinetoplastid relatives and have been used as the basis for the T. cruzi genome in TriTrypDB, a trypanosome database of EuPathDB. In addition, they will provide the foundation for analyses such as reverse genetics, where the location of genes and their alleles and/or paralogues is necessary, and comparative genome hybridization (CGH) analyses, where a chromosome-level view of the genome is ideal.


Affiliation: Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, GA, USA. dbrentw@uga.edu

ABSTRACT

Background: In contrast to the essentially fully assembled genome sequences of the kinetoplastid pathogens Leishmania major and Trypanosoma brucei, the assembly of the Trypanosoma cruzi genome has been hindered by its repetitive nature and the fact that the reference strain (CL Brener) is a hybrid of two distinct lineages. In this work, the majority of the contigs and scaffolds were assembled into pairs of homologous chromosomes based on predicted parental haplotype, inference from TriTryp synteny maps, and the use of end sequences from T. cruzi BAC libraries.

Results: Ultimately, 41 pairs of chromosomes were assembled using this approach, a number in agreement with the number of T. cruzi chromosomes predicted from pulsed-field gel analysis, with over 90% (21133 of 23216) of the genes annotated in the genome represented. The approach was substantiated through the use of Southern blot analysis to confirm the mapping of BAC clones using as probes the genes they are predicted to contain, and each chromosome construction was visually validated to ensure sufficient evidence was present to support the organization. While many members of large gene families are incorporated into the chromosome assemblies, the majority of genes excluded from the chromosomes belong to gene families, as these genes are frequently impossible to position accurately.
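The "over 90%" figure in the Results can be checked directly from the two gene counts quoted there. The short sketch below only redoes that arithmetic; the variable names are illustrative and do not come from the paper.

```python
# Gene counts quoted in the Results paragraph of the abstract.
genes_on_chromosomes = 21133
genes_annotated = 23216

# Fraction of annotated genes placed on the assembled chromosomes,
# and the number of genes left out of the chromosome assemblies.
fraction_placed = genes_on_chromosomes / genes_annotated
genes_unplaced = genes_annotated - genes_on_chromosomes

print(f"placed on chromosomes: {fraction_placed:.1%}")
print(f"not placed: {genes_unplaced}")
```

By the abstract's own account, most of the unplaced genes belong to large gene families.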

Conclusion: Now assembled, these chromosomes bring T. cruzi to the same level of organization as its kinetoplastid relatives and have been used as the basis for the T. cruzi genome in TriTrypDB, a trypanosome database of EuPathDB. In addition, they will provide the foundation for analyses such as reverse genetics, where the location of genes and their alleles and/or paralogues is necessary, and comparative genome hybridization (CGH) analyses, where a chromosome-level view of the genome is ideal.



Figure 5: Cluster of large gene family members in the middle of TcChr33. The region from ~0.2 Mb to 0.5 Mb contains mostly gene family members (non-blue "Gene Features"), while on either side of the region are "core" regions with either hypothetical genes or those with an assigned putative function (light and dark blue "Gene Features"). The large number of spanning BAC clones linking the core regions and the cluster of gene family members substantiates the organization. However, it should be noted that these homologous chromosomes are likely of different sizes. The BAC clones on the "S" chromosome that span the 120 kb gap in the gene-family-rich region (connected by dashed lines) are too long for the BAC libraries as shown (TARBAC: blue, avg. length 75 kb; EPIFOS: green, avg. length 35 kb). Alignment of the homologous chromosomes is a visual aid to maintain allelic synteny only.
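The caption's remark that some BAC clones are "too long" for their library can be illustrated with a simple span check. The following is a minimal sketch, not the authors' procedure: the clone names, coordinates, and the 1.5x tolerance are hypothetical, and only the two average insert lengths come from the caption.

```python
from dataclasses import dataclass

# Average insert lengths in kb, as quoted in the Figure 5 caption.
LIBRARY_AVG_KB = {"TARBAC": 75.0, "EPIFOS": 35.0}


@dataclass
class BacPlacement:
    """Mapped positions (kb) of the two end sequences of one BAC clone."""
    name: str          # hypothetical clone identifier
    library: str       # "TARBAC" or "EPIFOS"
    left_end_kb: float
    right_end_kb: float

    @property
    def span_kb(self) -> float:
        return abs(self.right_end_kb - self.left_end_kb)


def is_overlong(bac: BacPlacement, tolerance: float = 1.5) -> bool:
    """Flag a clone whose mapped span exceeds tolerance x the library average.

    The tolerance is an assumption made for this sketch.
    """
    return bac.span_kb > tolerance * LIBRARY_AVG_KB[bac.library]


# Hypothetical placements; coordinates are invented for illustration.
placements = [
    BacPlacement("bac_A", "TARBAC", 150.0, 220.0),  # 70 kb span
    BacPlacement("bac_B", "TARBAC", 190.0, 380.0),  # 190 kb span, crosses an alignment gap
    BacPlacement("bac_C", "EPIFOS", 400.0, 438.0),  # 38 kb span
]

overlong = [bac.name for bac in placements if is_overlong(bac)]
print(overlong)
```

A clone flagged this way does not necessarily indicate a misassembly. In Figure 5 the excess length arises because the homologous chromosomes are drawn aligned, with a gap inserted to preserve allelic synteny.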

Mentions: One of the unique aspects of the T. cruzi genome among most sequenced genomes is that it contains at least 22 separate gene families with 20 to >1400 members [3]. Because of the sequence similarity between members of these large gene families, it was not possible to unambiguously map many of the contigs rich with gene family members, leaving them to be included among the "artificial" assemblies in Figure 1. However, of the gene-family-rich contigs that are included in the assembled chromosomes, many cluster at chromosome ends while others are located in the middle of chromosomes (13 clusters of 10 or more members and 6 clusters of ≥20 members that are flanked by regions containing >10 core genes; Figures 1, 5). Figure 5 also demonstrates that size variation of chromosomes in T. cruzi is due not only to variation at chromosome ends but also variation in the composition of chromosome-internal sites rich in gene family members.
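One way to picture the chromosome-internal clusters described above is as runs of gene family members flanked on both sides by core genes. The sketch below assumes a strict reading, contiguous runs of at least 10 family members with more than 10 core genes on each side; the text does not spell out the exact criteria used, so the function name, labels, and thresholds are illustrative only.

```python
from itertools import groupby


def internal_clusters(labels, min_members=10, min_flank=11):
    """Return (start_index, length) for each internal gene family cluster.

    `labels` is the ordered list of genes along one chromosome, each marked
    "core" or "family". A cluster counts only if it is a contiguous run of at
    least `min_members` family genes with at least `min_flank` core genes
    immediately on both sides (11 means "more than 10"). Runs touching a
    chromosome end are skipped.
    """
    # Run-length encode the chromosome: [("core", 12), ("family", 15), ...]
    runs = [(key, len(list(group))) for key, group in groupby(labels)]
    clusters = []
    start = 0
    for i, (key, length) in enumerate(runs):
        is_internal = 0 < i < len(runs) - 1
        if key == "family" and length >= min_members and is_internal:
            left_len = runs[i - 1][1]
            right_len = runs[i + 1][1]
            if left_len >= min_flank and right_len >= min_flank:
                clusters.append((start, length))
        start += length
    return clusters


# Hypothetical gene order: an internal cluster of 15, then a terminal one of 25.
chromosome = ["core"] * 12 + ["family"] * 15 + ["core"] * 11 + ["family"] * 25
print(internal_clusters(chromosome))
```

In this toy example only the 15-member run qualifies; the 25-member run sits at the chromosome end and is therefore excluded from the internal clusters.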

