Small RNA-directed epigenetic natural variation in Arabidopsis thaliana.

Zhai J, Liu J, Liu B, Li P, Meyers BC, Chen X, Cao X - PLoS Genet. (2008)

Bottom Line: Here, we report that, in the model plant Arabidopsis thaliana, a cluster of approximately 24 nt siRNAs found at high levels in the ecotype Landsberg erecta (Ler) could direct DNA methylation and heterochromatinization at a hAT element adjacent to the promoter of FLOWERING LOCUS C (FLC), a major repressor of flowering, whereas the same hAT element in ecotype Columbia (Col) with almost identical DNA sequence, generates a set of low abundance siRNAs that do not direct these activities. We have called this hAT element MPF for Methylated region near Promoter of FLC, although de novo methylation triggered by an inverted repeat transgene at this region in Col does not alter its FLC expression. A genome-wide comparison of Ler and Col small RNAs identified at least 68 loci matched by a significant level of approximately 24 nt siRNAs present specifically in Ler but not Col, where nearly half of the loci are related to repeat or TE sequences.


Affiliation: State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China.

ABSTRACT
Progress in epigenetics has revealed mechanisms that can heritably regulate gene function independent of genetic alterations. Nevertheless, little is known about the role of epigenetics in evolution. This is due in part to scant data on epigenetic variation among natural populations. In plants, small interfering RNA (siRNA) is involved in both the initiation and maintenance of gene silencing by directing DNA methylation and/or histone methylation. Here, we report that, in the model plant Arabidopsis thaliana, a cluster of approximately 24 nt siRNAs found at high levels in the ecotype Landsberg erecta (Ler) could direct DNA methylation and heterochromatinization at a hAT element adjacent to the promoter of FLOWERING LOCUS C (FLC), a major repressor of flowering, whereas the same hAT element in ecotype Columbia (Col) with almost identical DNA sequence, generates a set of low abundance siRNAs that do not direct these activities. We have called this hAT element MPF for Methylated region near Promoter of FLC, although de novo methylation triggered by an inverted repeat transgene at this region in Col does not alter its FLC expression. DNA methylation of the Ler allele MPF is dependent on genes in known silencing pathways, and such methylation is transmissible to Col by genetic crosses, although with varying degrees of penetrance. A genome-wide comparison of Ler and Col small RNAs identified at least 68 loci matched by a significant level of approximately 24 nt siRNAs present specifically in Ler but not Col, where nearly half of the loci are related to repeat or TE sequences. Methylation analysis revealed that 88% of the examined loci (37 out of 42) were specifically methylated in Ler but not Col, suggesting that small RNA can direct epigenetic differences between two closely related Arabidopsis ecotypes.

pgen-1000056-g006: Illustration of the Strategy for Identifying Loci Matched by Significant Level of ∼24 nt siRNA Specifically in Ler using Chromosome 3 as an Example. Unique small RNAs of ≥23 nt obtained by 454 sequencing from Col and Ler were mapped to the genome, and the perfect matches were counted per 100 bp. With this information, a filter was used to identify loci with no fewer than three hits within 300 bp in Ler versus no hits within 1500 bp for the same region in Col.

The identification of MPF-siRNAs in Ler- but not Col-derived small RNA data made us wonder whether other loci are differentially and specifically matched by ∼24 nt siRNAs in these ecotypes. Because the MPSS small RNA sequencing data are not readily comparable with the 454 data (due to length differences in the sequencing reads), the small RNA datasets we used for the genome-wide identification are all 454 sequencing data, derived from two recent studies: 247,318 unique small RNA sequences from Col [16] and 25,981 unique small RNA sequences from Ler [15]. Also, to balance the enrichment of longer siRNAs in the sequencing results of the AGO4-precipitated pool from Ler [15], we selected for further analysis only the siRNA reads of length no less than 23 nt; hence most of the miRNAs and short sRNAs are discarded from both the Col and Ler datasets. Since only the Col genome sequence is complete and the number of sequenced Col-derived siRNAs is much greater than that of Ler, in this study we analyzed only the regions matched by clusters of siRNAs present specifically in Ler, to exclude the interference of genetic alteration and also for higher reliability (please see Materials and Methods for details about the bioinformatic analysis). The unique siRNA sequences of 23 nt or longer from Col and Ler were each mapped to the genome, and hits were counted in windows of 100 bp. Although the majority of the ∼24 nt small RNA clusters are conserved between Col and Ler (data not shown), after combining the overlapping regions, 68 unique loci were identified (including the MPF, locus #57; Table S1). These all shared the characteristic that they were matched by at least three distinct siRNAs within 300 bp in Ler but had no hits in 1500 bp around the same region in Col (see Figure 6 for an example).
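The filtering strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline (which is described in their Materials and Methods): the function names, the `(chromosome, start, length)` input format with 0-based starts, and the choice to centre the 300 bp Ler region inside the 1500 bp Col exclusion zone are all hypothetical. Only the thresholds (23 nt, 100 bp, 300 bp, 1500 bp, three hits) come from the text.

```python
from collections import defaultdict

WINDOW = 100       # bp per counting window (from the text)
LER_SPAN = 300     # span in which Ler must have >= 3 hits (from the text)
COL_SPAN = 1500    # span in which Col must have no hits (from the text)
MIN_LER_HITS = 3   # minimum distinct siRNA hits in Ler (from the text)
MIN_LEN = 23       # minimum read length in nt (from the text)


def bin_hits(hits):
    """Count perfect-match hits of unique reads per 100 bp window.

    `hits` is an iterable of (chrom, start, length); this input format
    is an assumption made for the sketch.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for chrom, start, length in hits:
        if length >= MIN_LEN:
            counts[chrom][start // WINDOW] += 1
    return counts


def ler_specific_loci(ler_hits, col_hits):
    """Return merged (chrom, start, end) regions passing the filter."""
    ler, col = bin_hits(ler_hits), bin_hits(col_hits)
    n_ler = LER_SPAN // WINDOW                       # 3 windows
    # Assumption: the 300 bp region is centred in the 1500 bp zone,
    # giving 600 bp (6 windows) of flank on each side.
    flank = (COL_SPAN - LER_SPAN) // (2 * WINDOW)
    loci = []
    for chrom, bins in ler.items():
        col_bins = col.get(chrom, {})
        for b in sorted(bins):
            ler_count = sum(bins.get(b + i, 0) for i in range(n_ler))
            if ler_count < MIN_LER_HITS:
                continue
            zone = range(b - flank, b + n_ler + flank)
            if any(col_bins.get(j, 0) for j in zone):
                continue
            start, end = b * WINDOW, (b + n_ler) * WINDOW
            # Combine overlapping candidate regions on the same chromosome.
            if loci and loci[-1][0] == chrom and start < loci[-1][2]:
                loci[-1] = (chrom, loci[-1][1], max(loci[-1][2], end))
            else:
                loci.append((chrom, start, end))
    return loci


# Toy data (invented for illustration, not taken from the paper).
example_ler = [("Chr3", 1000, 24), ("Chr3", 1050, 24), ("Chr3", 1210, 23),
               ("Chr3", 5000, 24), ("Chr3", 5020, 24), ("Chr3", 5100, 24),
               ("Chr3", 9000, 21)]   # the 21 nt read is discarded
example_col = [("Chr3", 5500, 24)]   # falls inside the second locus's zone
print(ler_specific_loci(example_ler, example_col))
```

In the toy data, the cluster near position 1000 passes, while the cluster near position 5000 is rejected because a Col read maps within its 1500 bp zone. Scanning only windows that start at a Ler-occupied bin is sufficient here, since any 300 bp window with three hits can be shifted to start at its first occupied bin without losing hits.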
Most of these loci are MPF-like, in that the siRNA matches are restricted to a small region (Figure S6), and their distribution in the genome is quite dispersed (Figure S7). Twenty-two loci are within known genes, and the other 46 are in intergenic regions (Table S2). A search of methylation data in Col (http://signal.salk.edu/cgi-bin/methylome) [25] demonstrated that all of these loci except locus #60 (located in a highly methylated region longer than several hundred kb, Table S1) were clearly lacking methylation; in addition, 28 loci contain repeat-associated sequences with one end beginning close to or within the small RNA matching region, and 15 loci had matching MPSS small RNA tags [12] (Table S1). We also searched the website of DNA methylation information on the fourth chromosome in both the Ler and Col backgrounds (http://chromatin.cshl.edu/cgi-bin/gbrowse/epivariation/) [2]. Of the 13 loci (#44–56) we identified on the fourth chromosome, six show methylation signals in their data: five loci (#46, 49, 52, 54, 55) are specifically methylated in Ler, as expected; one locus (#53) is methylated in both ecotypes but with a much higher methylation signal in Ler than in Col. Overall, our results are well supported by the two independent studies on epigenomics and epigenetic natural variation [2],[25].

