Contribution of type W human endogenous retroviruses to the human genome: characterization of HERV-W proviral insertions and processed pseudogenes

View Article: PubMed Central - PubMed

ABSTRACT

Background: Human endogenous retroviruses (HERVs) are ancient sequences that integrated into germ line cells and have been transmitted vertically to offspring; they constitute about 8 % of our genome. Over time, HERVs accumulated mutations that compromised their coding capacity. A prominent exception is the HERV-W locus at 7q21.2, which produces a functional Env protein (Syncytin-1) co-opted for placental syncytiotrophoblast formation. While the expression of HERV-W sequences has been investigated for its correlation with disease, an exhaustive description of the group's composition and characteristics is still not available, and current information on the HERV-W group derives from studies published a few years ago that relied on the rough human genome assemblies available at that time. This hampers comparison and correlation with current human genome assemblies.

Results: In the present work we identified and described in detail the distribution and genetic composition of 213 HERV-W elements. The bioinformatics analysis led to the characterization of several previously unreported features and provided a phylogenetic classification into two main subgroups with different ages and structural characteristics. New findings on the genomic context of HERV-W insertions and on their co-localization with sequences putatively involved in disease development are also reported.

Conclusions: The present work is a detailed overview of the HERV-W contribution to the human genome and provides a robust genetic background useful for clarifying the role of HERV-W in pathologies with poorly understood etiology. To our knowledge, it represents the most complete and exhaustive HERV-W dataset to date.

Electronic supplementary material: The online version of this article (doi:10.1186/s12977-016-0301-x) contains supplementary material, which is available to authorized users.



Fig. 3: Comparison between the HERV-W RepBase consensus LTR17-HERV17-LTR17 (black) and the consensus generated from the proviral dataset (grey). Nucleotide identity between the two consensus sequences is represented by the colored upper bar (green: 100 % identity; greeny-brown: between 100 and 30 % identity; red: identity <30 %), while single-nucleotide differences of the new consensus with respect to LTR17-HERV17-LTR17 are represented by black lines. The localization of the retroviral LTRs and genes is shown below.

Mentions: In addition to these major mutations, the analyses highlighted a larger number of minor insertions/deletions and single-nucleotide substitutions that, overall, make it possible to identify each HERV-W sequence uniquely. The majority of these variations appear to be randomly distributed among the sequences, as expected from the normal random genomic substitution rate, while a number of them are shared by the great majority of the sequences and characterize their structure with respect to the reference. This analysis also allowed us to better define a new HERV-W consensus, generated from our proviral dataset, which we compared graphically with the LTR17-HERV17-LTR17 consensus (Fig. 3). Interestingly, the LTR structures of the new HERV-W consensus showed recurrent mutations that define two subgroups of sequences; these mutations were used, in combination with the phylogenetic analysis, as key positions for subgroup definition.
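
To make the idea of a dataset-derived consensus and its identity to a reference more concrete, the following is a minimal Python sketch. It is not the authors' pipeline: the excerpt does not name the software or the consensus rule they used. The majority-rule consensus, the function names (`majority_consensus`, `percent_identity`, `identity_class`), the toy sequences, and the handling of the exact 30 % boundary are all assumptions made for illustration. Only the three color classes are taken from the Fig. 3 legend.

```python
from collections import Counter


def majority_consensus(aligned):
    """Majority-rule consensus of pre-aligned, equal-length sequences.

    Assumption: a simple per-column majority vote; the source does not
    state which consensus rule was actually applied.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*aligned):
        # Pick the most frequent symbol in this alignment column
        # (a gap can win if it is the most frequent symbol).
        consensus.append(Counter(column).most_common(1)[0][0])
    return "".join(consensus)


def percent_identity(a, b, gap="-"):
    """Percent of aligned columns in which both sequences agree.

    Columns that are gaps in both sequences are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_class(pct):
    """Map an identity value to the color classes named in the Fig. 3 legend.

    Assumption: exactly 30 % is placed in the intermediate class.
    """
    if pct == 100.0:
        return "green"
    if pct >= 30.0:
        return "greeny-brown"
    return "red"


# Hypothetical toy alignment standing in for the proviral dataset.
toy_proviruses = ["ACGTTGCA", "ACGTAGCA", "ACGTAGCC", "ACG-AGCA"]
# Hypothetical stand-in for the reference consensus.
toy_reference = "ACGTTGCA"

new_consensus = majority_consensus(toy_proviruses)
identity = percent_identity(new_consensus, toy_reference)
print(new_consensus, identity, identity_class(identity))
```

In practice the identity would be computed over sliding windows along the alignment rather than over the whole sequence, which is what a per-position identity bar such as the one in Fig. 3 displays.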

