Comprehensive bioinformatics analysis of Mycoplasma pneumoniae genomes to investigate underlying population structure and type-specific determinants


ABSTRACT

Mycoplasma pneumoniae is a significant cause of respiratory illness worldwide. Despite a minimal and highly conserved genome, genetic diversity within the species may impact disease. We performed whole genome sequencing (WGS) analysis of 107 M. pneumoniae isolates, including 67 newly sequenced using the Pacific BioSciences RS II and/or Illumina MiSeq sequencing platforms. Comparative genomic analysis of 107 genomes revealed >3,000 single nucleotide polymorphisms (SNPs) in total, including 520 type-specific SNPs. Population structure analysis supported the existence of six distinct subgroups, three within each type. We developed a predictive model to classify an isolate based on whole genome SNPs called against the reference genome into the identified subtypes, obviating the need for genome assembly. This study is the most comprehensive WGS analysis for M. pneumoniae to date, underscoring the power of combining complementary sequencing technologies to overcome difficult-to-sequence regions and highlighting potential differential genomic signatures in M. pneumoniae.


Fig 3 (pone.0174701.g003): Clusters of M. pneumoniae isolates sharing unique SNPs. (A) Number of shared unique SNPs in isolate clusters ranging from 1–106 isolates relative to reference genome FH. Only the group of isolates sharing the largest number of SNPs is shown. (B) Number of shared SNPs in each subtype relative to type 2 reference FH identified among all genomes (black bars) or closed genomes only (grey bars). *No closed genomes were available for Type 1N.

We identified a total of 3,206 SNPs present in at least one isolate. Intra-type examination revealed 889 SNPs present in at least one type 1 isolate relative to the type 1 reference and 942 SNPs in one or more type 2 isolates relative to the type 2 reference genome. However, these SNPs were not consistent among all isolates within the type designations. Comparing all 107 isolates, 520 SNPs were identified as consensus alleles in all isolates within one type group as compared to all isolates of the other type (Fig 3, S8 Table). Of these, 470 (90.4%) were located in coding regions. These 520 SNPs clearly separated the isolates into groups corresponding to the known P1 type determined by laboratory methods. Interestingly, the primary gene encoding P1 (mpn141) was not included in the core genome identified using all 107 isolates, although it was present in the core based on closed genomes only. Thus, the separation observed in our analysis must result from genomic variation outside of this gene. Other subgroups identified in the phylogenetic analysis varied from the FH reference genome by 70 (type 2a), 59 (type 1Ref), 56 (type 1N), or 7 (type 2v) SNPs that are unique to that subtype (Fig 3B). When comparing closed genomes only, the number of SNPs between the large type 1 and 2 groups was 744, presumably due to the inclusion of a larger number of core proteins; subtype-specific SNPs were similar using either core dataset (Fig 3B).
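The type-specific criterion described above (one consensus allele in every isolate of one type, and a different consensus allele in every isolate of the other type) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `type_specific_snps`, the input layout (one list of allele calls per isolate, aligned by SNP site), and the toy isolates are all hypothetical, chosen only to make the selection rule concrete. The last lines simply recompute the reported coding-region percentage from the two counts given in the text.

```python
from typing import Dict, List


def type_specific_snps(alleles: Dict[str, List[str]],
                       types: Dict[str, str]) -> List[int]:
    """Return indices of sites that are fixed within each type but differ between types.

    alleles: isolate name -> allele call at each SNP site (hypothetical layout).
    types:   isolate name -> type label; exactly two labels are expected.
    """
    type_names = sorted(set(types.values()))
    if len(type_names) != 2:
        raise ValueError("expected exactly two type labels")

    n_sites = len(next(iter(alleles.values())))
    hits = []
    for site in range(n_sites):
        # Collect the set of alleles observed within each type at this site.
        seen = {name: set() for name in type_names}
        for isolate, calls in alleles.items():
            seen[types[isolate]].add(calls[site])
        first, second = (seen[name] for name in type_names)
        # Type-specific: a single allele per type, and the two alleles differ.
        if len(first) == 1 and len(second) == 1 and first != second:
            hits.append(site)
    return hits


# Toy data (illustrative only, not taken from the study).
alleles = {
    "iso1": ["A", "C", "G", "T"],
    "iso2": ["A", "C", "A", "T"],
    "iso3": ["G", "C", "G", "C"],
    "iso4": ["G", "C", "G", "C"],
}
types = {"iso1": "1", "iso2": "1", "iso3": "2", "iso4": "2"}

specific = type_specific_snps(alleles, types)
print(specific)  # [0, 3]

# Share of type-specific SNPs in coding regions, from the counts in the text.
coding_pct = round(100 * 470 / 520, 1)
print(coding_pct)  # 90.4
```

In the toy data, site 1 is excluded because both types carry the same allele, and site 2 is excluded because type 1 is not fixed for a single allele; only sites 0 and 3 satisfy the rule.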

