Basics of G-quadruplex structures

A quick overview of the (nucleo)bases of G-quadruplex nucleic acids.

Eric Largy (https://ericlarg4.github.io/index.html; ARNA, https://arna.cnrs.fr/), Valérie Gabelica (https://gabelicagroup.wixsite.com/biophyms; ARNA, https://arna.cnrs.fr/), Jean-Louis Mergny
2020-11-08

1. Definitions

In the following section, minimal definitions of terms specific to the field are proposed to ensure a better consistency in describing G-quadruplex structures across publications.

1.1. G-quadruplex

G-quadruplex (abbrev.: G4): Secondary structure formed by the association of a minimum of two G-quartets by \(\pi\)-\(\pi\) stacking and coordination to a cation (Figure 1, left). The term G-quadruplex encompasses a variety of conformations differing by their number of G-quartets, loop length and geometry, relative orientation of the G-tract strands, molecularity, and other non-canonical features (e.g. mixed quartets, extended quartets such as hexads or heptads, and additional features including triplets, base pairs or bulges). G-quadruplexes may be formed from nucleic acids with either natural (DNA, RNA) or artificial (e.g. PNA, LNA) backbones.

Tetraplex: Alternative name for G-quadruplex. Its use is not recommended, to avoid confusion with other uses of the word (e.g. tetraplex PCR assay). Tetraplex should not be used to imply a tetra-stranded stoichiometry; the preferred wording is tetramolecular G-quadruplex.

Figure 1: The G-quadruplex and i-motif cores. Both are formed by stacking of different building blocks, respectively the G-quartet and the hemi-protonated cytosine base pair for the i-motif. The phosphate backbone is shown as a line. Guanines and cytosines are depicted by an orange or blue circle, respectively, and a cuboid

Z-G4: Left-handed G-quadruplex, named by analogy with Z-DNA left-handed double helices (13).

i-motifs (or i-DNA) are structures formed at slightly acidic pH by cytosine-rich nucleic acids, characterized by C•CH+ base pairs arranged in two parallel duplexes associated head-to-tail via base-pair intercalation (Figure 1, right). Although these double duplexes can be considered as a type of quadruplex structure, they will not be discussed further herein.

1.2. Quartets and derivatives

G-quartet: Quasi co-planar association of four guanines linked by a network of eight hydrogen bonds on both the Watson-Crick and Hoogsteen faces (i.e. N2H·N7 and N1H·O6) (Figure 2A). G-quartets are also frequently called quartets or tetrads, although these terms are less specific and should preferably be avoided.

Figure 2: Chemical structures of the G-quartet (A), G·C·G·C (B (10)), G·G·G·T (C (9)) and G·G·G·A (D (11)) mixed quartets, G·G·G G-triplet (E (8)), and A·G·A mixed triplet (F (8)).

G-tract: Part of a nucleic sequence consisting of a minimum of two consecutive guanines, which may be involved in the formation of G-quartets (Figure 3). G-tracts involved in G-quartet formation can be interrupted by bulges or snapbacks, and linked to another G-tract by a loop (see section 1.3). A minimum of four G-tracts are necessary to form a G-quadruplex, whether they are contained in a single strand or in up to four different strands.
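As a minimal sketch of this definition, candidate G-tracts can be located with a regular expression that matches runs of at least two consecutive guanines. The function name `find_g_tracts` and the example sequence are illustrative assumptions; the search only lists candidate tracts and says nothing about which guanines actually take part in G-quartets.

```python
import re

def find_g_tracts(sequence, min_length=2):
    """Return (start, tract) pairs for runs of >= min_length consecutive guanines.

    Positions are 0-based and counted from the 5' end.
    """
    pattern = re.compile(r"G{%d,}" % min_length)
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]

# Illustrative sequence (assumption): four GGG runs separated by TTA.
seq = "AGGGTTAGGGTTAGGGTTAGGG"
tracts = find_g_tracts(seq)
print(tracts)

# At least four G-tracts are needed for a single strand to fold on its own.
print(len(tracts) >= 4)
```

A sequence with fewer than four candidate tracts could still contribute to a multimolecular G-quadruplex, so the final check is only relevant to single strands.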

Stem: Consecutively stacked guanines involved in G-quartet formation. There are four stems per core.

Mixed quartet: Quartet of bases not entirely formed by guanines, e.g. G·C·G·C, G·G·G·T and G·G·G·A (Figure 2B–D).

Triplet (or triad): By analogy with the quartet, quasi co-planar association of three bases. It can be composed of guanines exclusively (G-triplet; Figure 2E) or mixed (Figure 2F).

Base pair: Association of two nucleotides by hydrogen bonding of the bases; not limited to Watson-Crick pairing. Although not typically associated with G-quadruplexes, base pairs can be formed by loop and flanking sequence nucleotides, and are sometimes stacked upon the core, thereby contributing to G4 stabilization.

Core: Structural ensemble of consecutively stacked quartets and their inner cations (Figure 1). Sometimes called subunits in the context of stacked or interlocked multimers (see section 1.8).

Inner cation: In the particular context of G-quadruplexes, cation bound by the O6 of G-quartet guanines (or related structures, vide supra) (Figures 1 and 2A). Depending on a number of factors including the cation ionic radius and lone pair attraction, cation-cation repulsion, and possible G-quartet distortion, it may be located anywhere from within a G-quartet’s plane to – more commonly – midway between two G-quartets, as typically observed for potassium (12).

Figure 3: G-quadruplex features beyond the core (A), loop geometries (B), and snapback (C). The phosphate backbone is shown as a line and guanines are depicted by an orange circle and a cuboid in either syn (dark gray) or anti (light gray) conformation.

1.3. Beyond the G4 core: loops, bulges, and flanking sequences

Loop: Part of the nucleic acid sequence linking G-tracts involved in G-quartet formation (Figure 3B). Loops can adopt different geometries (see section 1.4). Loop nucleotides may be base-paired.

Bulge: Part of a sequence interrupting a G-tract involved in G-quartet formation, i.e. nucleotides linking two guanines implicated in two consecutively stacked G-quartets (Figure 3A).

Flanking sequence: Nucleotides from the 5’- and 3’-termini of a G-quadruplex positioned before the first or after the last G-quartet forming guanine, respectively (Figure 3A). Flanking sequence nucleotides may be base-paired. Although not part of the core, flanking sequences can have a large impact on the topology and multimerization of G-quadruplexes (6). Flanking sequences are naturally present in the genomic context.
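The three definitions above can be sketched as a naive sequence parser. It assumes that every run of at least two guanines is a G-tract engaged in G-quartets, with no bulges or snapbacks, which is a strong simplification. The function name `split_g4_sequence`, its dictionary keys, and the example sequence are hypothetical.

```python
import re

def split_g4_sequence(sequence, min_tract=2):
    """Split a sequence into 5' flank, G-tracts, loops and 3' flank.

    Naive assumption: every run of >= min_tract guanines is a G-tract that
    takes part in G-quartets (no bulges, no snapbacks).
    """
    seq = sequence.upper()
    matches = list(re.finditer(r"G{%d,}" % min_tract, seq))
    if not matches:
        return {"flank_5": seq, "g_tracts": [], "loops": [], "flank_3": ""}
    return {
        # Nucleotides before the first G-tract.
        "flank_5": seq[:matches[0].start()],
        "g_tracts": [m.group() for m in matches],
        # Nucleotides between each pair of consecutive G-tracts.
        "loops": [seq[a.end():b.start()] for a, b in zip(matches, matches[1:])],
        # Nucleotides after the last G-tract.
        "flank_3": seq[matches[-1].end():],
    }

# Illustrative sequence (assumption), written 5' to 3'.
parts = split_g4_sequence("TAGGGTTAGGGTTAGGGTTAGGGTT")
print(parts)
```

With this parser, a bulged tract such as GGTGG would be reported as two G-tracts joined by a one-nucleotide loop, so the output has to be checked against structural data before assigning loops and bulges.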

1.4. Loop types

Propeller (or chain reversal): Loop spanning across G-quartets to connect guanines from two different quartets so that the two linked G-tracts are adjacent and share the same 5’-3’ polarity (parallel) (Figure 3B). Propeller loops typically contain 1–2 nucleotides.

V-shaped: Variant of the propeller loop that does not contain any nucleotide, so that the two G-tracts are directly connected.

Lateral or edgewise: Loop connecting two adjacent G-tracts of opposite 5’-3’ polarity (antiparallel) (Figure 3B). Lateral loops typically contain at least 2–3 nucleotides depending on the groove width.

Diagonal: Loop linking diagonally opposed G-tracts of opposite 5’-3’ polarity (antiparallel) (Figure 3B). Diagonal loops typically contain at least 3 nucleotides.

Snapback: Loop that connects an external quartet to an internal quartet, instead of linking two external quartets as more commonly observed. One stem of consecutively stacked guanines is therefore composed of two distinct G-tracts (Figure 3C) (2), and the phosphate backbone of that stem is not continuous.

Both lateral and diagonal loops result in a strand polarity inversion occurring in antiparallel and hybrid topologies, whereas propeller loops preserve the same orientation and are thus found only in parallel and hybrid topologies (see section 1.6).

1.5. Grooves

Grooves are defined as the space framed by two adjacent phosphate backbones. The canonical B-DNA duplex displays two grooves, referred to as minor and major groove because of their different widths (5.7 and 11.7 Å, respectively, between a selected backbone phosphate i and the i+3 phosphate on the opposing strand). In G-quadruplex structures, however, three groove widths are distinguished: narrow (8.9 Å), medium (10.2 Å), and wide (12.2 Å).

Considering that each guanine is either in a syn or anti conformation (see section 1.7.1), there are \(2^4 = 16\) possible combinations per G-quartet. These will in turn yield 8 possible groove-width combinations for a given core, interdependent with the loop geometries. There is no proper terminology for these groove-width combinations yet. For monomolecular G-quadruplexes without snapbacks, the groove width can be inferred directly from the order of loop types (1).
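The count of syn/anti combinations per G-quartet can be checked by brute-force enumeration. This is only an illustrative sketch: it lists the combinations and groups them by number of syn guanines, and it does not attempt to derive the groove widths. The variable names are arbitrary.

```python
from collections import Counter
from itertools import product

CONFORMATIONS = ("syn", "anti")

# One entry per guanine of the quartet, in a fixed order around the quartet.
combinations = list(product(CONFORMATIONS, repeat=4))
print(len(combinations))  # 2**4 = 16

# Group the combinations by their number of syn guanines.
by_syn_count = Counter(combo.count("syn") for combo in combinations)
for n_syn in sorted(by_syn_count):
    print(n_syn, "syn:", by_syn_count[n_syn])
```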

1.6. Topologies

A classical way to describe G-quadruplex structures is to report their relative strand orientation following the 5’ to 3’ phosphate backbone polarity (Figure 4). Three topologies are often distinguished based on strand orientation. Based on the groove width combination, two types of antiparallel structures can be distinguished.

Parallel: All strands are oriented in the same direction, with often all loops displaying a propeller geometry (Figure 4; counterexample: Figure 3C). It is the typical topology of tetramolecular G4s. All guanines share the same glycosidic bond angle (usually anti).

Antiparallel (or 2+2 antiparallel): Two strands are oriented in one direction, and two strands in the other (Figure 4). There are in fact not one but two distinct antiparallel topologies: the opposite-direction strands can be either laterally (“basket” type) or diagonally (“chair” type) opposed, yielding two distinct G-quartet arrangements and groove types (medium-narrow-medium-wide for the “basket”, narrow-wide-narrow-wide for the “chair”). Antiparallel strand polarities lead to systematic inversions of glycosidic bond angles vs. the previous G-tract.

Hybrid: Three strands are oriented in one direction, and the remaining strand in the other. Often referred to as “3+1” by analogy with the above-mentioned 2+2 antiparallel. The hybrid-1 and hybrid-2 forms are often distinguished in the human telomeric-related literature, but – contrary to the antiparallel fold – the G-quartet arrangements are equivalent upon a 90° rotation. Guanosines’ glycosidic bond angles are maintained after a propeller loop or inverted after a lateral or diagonal loop vs. the previous G-tract.
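The relation between loop types and strand orientation can be sketched in a few lines for a monomolecular G-quadruplex with three loops. A propeller loop keeps the 5’-3’ polarity of the previous G-tract, whereas a lateral or diagonal loop inverts it. The function names and the loop labels are assumptions made for illustration, and snapbacks, bulges and V-shaped loops are deliberately ignored.

```python
def strand_directions(loops):
    """Return the orientation (+1 or -1) of each G-tract; the first is +1."""
    directions = [1]
    for loop in loops:
        if loop == "propeller":
            # Propeller loops preserve the strand polarity.
            directions.append(directions[-1])
        elif loop in ("lateral", "diagonal"):
            # Lateral and diagonal loops invert the strand polarity.
            directions.append(-directions[-1])
        else:
            raise ValueError(f"unknown loop type: {loop!r}")
    return directions

def classify_topology(loops):
    """Classify a three-loop, four-tract G4 as parallel, hybrid or antiparallel."""
    directions = strand_directions(loops)
    if len(directions) != 4:
        raise ValueError("expected three loops connecting four G-tracts")
    majority = max(directions.count(1), directions.count(-1))
    return {4: "parallel", 3: "hybrid", 2: "antiparallel"}[majority]

print(classify_topology(["propeller", "propeller", "propeller"]))
print(classify_topology(["lateral", "diagonal", "lateral"]))
print(classify_topology(["propeller", "lateral", "lateral"]))
```

This sketch only counts strand orientations; it does not distinguish the two antiparallel arrangements, nor the hybrid-1 and hybrid-2 forms.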

Figure 4: Schemes of G-quadruplex reflecting the major topologies (bottom) and corresponding top view of the upper G-quartets (top). The phosphate backbone is shown as a line and guanines are depicted by an orange circle and a rectangle or cuboid in either syn (dark gray) or anti (light gray) conformation.

Note that this classification is a simplification that does not reflect the complexity of topology subsets observed in vitro, which may ultimately lead to inaccurate and/or misleading reports.

Alternatively, it is possible to define 26 distinct (monomolecular) topologies based on glycosidic bond angles, using the combination of loops as a descriptor (3). This approach allows the groove widths to be determined from the loop geometries. Although these topologies are possible on paper, not all have been observed so far. On the other hand, the Z-G4 (left-handed topology) was not foreseen.

Polymorphic: A sequence is defined as polymorphic if it can adopt several topologies. A salient example of a polymorphic motif is the human telomeric motif, known to adopt at least 6 different structures depending on flanking sequences and experimental conditions. In conditions wherein topologies of different types (parallel, antiparallel, or hybrid) coexist, the topology is sometimes referred to as “mixed” topology.

1.7. Guanine stacking

1.7.1. Glycosidic bond angles

Guanines can adopt a syn or anti conformation relative to the glycosidic bond (Figure 5A). Schematically, this corresponds to the base and sugar being on the “same side” or “opposite side” of the glycosidic bond, respectively. The torsion angle \(\chi\), defined by the O4′-C1′-N9-C4 atoms (for purines), can be used to determine the conformation of guanines (-90° to 90°: syn; 90° to 180° and -90° to -180°: anti) (Figure 5B-D).
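The following sketch computes a signed torsion angle from four atomic positions and applies the syn/anti ranges given above. The function names are hypothetical, the coordinates are synthetic test points rather than atoms from a real structure, and assigning the exact ±90° boundaries to syn is an arbitrary choice.

```python
import math

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four 3D points.

    For chi, the points would be the O4', C1', N9 and C4 atoms, in that order.
    """
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def glycosidic_conformation(chi_deg):
    """Return 'syn' for chi within [-90, 90] degrees, 'anti' otherwise."""
    chi = (chi_deg + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
    return "syn" if -90.0 <= chi <= 90.0 else "anti"

# Synthetic points (assumption): an eclipsed and a fully extended arrangement.
cis = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(glycosidic_conformation(cis), glycosidic_conformation(trans))
```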

Figure 5: Syn and anti glycosidic bond angles of guanosine (A), and stacking between two successive guanines in anti-anti (B; 2O3M (4)), syn-anti (C; 2JPZ (5)) and anti-syn (D; 1JPQ (7)) conformations. The 5’-guanosine is on top (n) and the 3’ at the bottom (n+1).

1.7.2. G-quartet polarity

H-bonding can be used to define a G-quartet polarity, following the donor-to-acceptor direction (i.e. N2-H to N7 and N1-H to O6) (Figure 6A,B). This leads to two types of G-quartet stacking, wherein the G-quartets share the same polarity (e.g. parallel topology) or not (Figure 6C,D).

Figure 6: G-quartet polarity defined by hydrogen bonding donor-to-acceptor direction (A, B), top (C) and side (D) views of 2-G-quartet stacking with identical or opposite polarities. The 5’-end is located on the top quartet and the point of view is in the 5’ to 3’ direction (top to bottom). Guanines are depicted by an orange circle and a rectangle or cuboid.

1.7.3. Guanines and G-quartets stacking

Consecutively stacked guanines can adopt one of three possible steps (anti-anti, syn-anti and anti-syn; syn-syn being rarely observed), resulting in distinct stacking geometries (partial 5/6-ring, 5-ring and partial 6-ring, respectively) differing notably by the extent and twist of base overlap (Figure 6C–D) (14). In the particular case of Z-G4, the twist is left-handed rather than right-handed (Figure 7) (15).
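The correspondence between steps and stacking geometries can be written as a small lookup table. In this sketch, a stem is given as a list of glycosidic conformations read from 5’ to 3’, which is assumed to be the order in which the step names are written. The names `STACKING_BY_STEP` and `stacking_steps` are hypothetical, and the rarely observed syn-syn step is left without an assigned geometry.

```python
# Stacking geometry for each (5' guanine, 3' guanine) step, as listed in the text.
STACKING_BY_STEP = {
    ("anti", "anti"): "partial 5/6-ring",
    ("syn", "anti"): "5-ring",
    ("anti", "syn"): "partial 6-ring",
}

def stacking_steps(stem):
    """Return (step, geometry) pairs for consecutive guanines of one stem.

    `stem` lists glycosidic conformations from the 5' to the 3' guanine.
    Steps absent from the table (e.g. syn-syn) are reported as unspecified.
    """
    steps = []
    for upper, lower in zip(stem, stem[1:]):
        geometry = STACKING_BY_STEP.get((upper, lower), "not specified here")
        steps.append((f"{upper}-{lower}", geometry))
    return steps

# Illustrative three-guanine stem (assumption).
print(stacking_steps(["syn", "anti", "anti"]))
```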

Figure 7: Top view of the stacking between two successive G-quartets in a right-handed ([TG4T]4, PDB: 244D (16)) and a left-handed (d((TGG)4T2G(TGG)3TGT2), PDB: 2MS9 (17)) G-quadruplex (top). Guanines are colored in light blue (top G-quartet; 5’-end) or green (bottom quartet; 3’-end) and by heteroatom. The glycophosphate backbone has been replaced by methyl groups for clarity. All guanines are in the anti conformation, yielding partial 5/6-ring stacks (bottom; the 5’-guanosine is on top (n) and the 3’ at the bottom (n+1)), but the twist has opposite polarity in the left- and right-handed constructs.

1.8. Molecularity

The terms dimer, trimer, tetramer, and bimolecular, trimolecular, tetramolecular are often interchangeably used in the G-quadruplex literature to describe structural features that are not equivalent, which may lead to inaccuracies and confusions. Herein we propose definitions of these terms in the context of G-quadruplexes.

Intramolecular G-quadruplex: G-quadruplex formed by a single strand, containing a minimum of four G-tracts (Figure 4) which can be interrupted by bulges. Intramolecular G-quadruplexes normally contain three loops.

Multimolecular (di-, tri-, tetramolecular) G-quadruplex: Refers to G-quadruplex formed by association of more than a single strand (Figure 8). This term exclusively refers to the molecularity of the structure but does not inform about the nature of the interaction between the strands, i.e. H-bonding (intertwined), stacking (stacked), or both (interlocked).

Intertwined multimeric G-quadruplexes: Association of 2–4 nucleic acid strands through G-quartet-forming hydrogen bonding to form a single G-quadruplex core (Figure 8). The monomeric strands may contain fewer than the four G-tracts necessary to form a G-quadruplex intramolecularly. All G-quartets must be broken to separate the strands.

Stacked multimeric (di-, tri-, tetramer) G-quadruplexes: Association of several G-quadruplex cores through stacking (Figure 8). The cores can exist as monomers (typically in equilibrium with the multimer form(s)) that can be separated without breaking any G-quartet. Conceptually, each core could itself be an intertwined multimer.

Interlocked G-quadruplex: Association of two G-quadruplex cores through both stacking (as in stacked multimeric G4s), and G-quartet-forming hydrogen bonding (as in intertwined G4s) of one strand with another, typically at an interfacing G-quartet (Figure 8). The cores cannot be separated without disrupting these G-quartets, but the strands may still be able to form an intramolecular G-quadruplex.

Interface: G-quartets from a G-quadruplex core being directly stacked to a G-quartet from another core in stacked or interlocked multimers (Figure 8). Interfaces may be characterized by the relative polarities of the strands (i.e. 5’-5’, 3’-3’, 5’-3’) and distinct G-quartets stacking (partial 6-ring, 6-ring, 5/6-ring, 5-ring) (18).
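The distinction between intertwined, stacked and interlocked assemblies reduces to two yes/no questions, which the sketch below encodes. The function name and its two boolean parameters are hypothetical labels for the criteria given in the definitions above.

```python
def association_type(intermolecular_quartets, core_stacking):
    """Name the mode of association in a multimolecular G-quadruplex.

    intermolecular_quartets: True if at least one G-quartet is formed by
        hydrogen bonding between guanines of different strands.
    core_stacking: True if distinct G-quadruplex cores stack on each other.
    """
    if intermolecular_quartets and core_stacking:
        return "interlocked"
    if intermolecular_quartets:
        return "intertwined"
    if core_stacking:
        return "stacked"
    return None  # no association between strands

print(association_type(True, False))
print(association_type(False, True))
print(association_type(True, True))
```

The returned label describes how the strands are held together; the molecularity (di-, tri-, tetramolecular) is a separate piece of information.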

Figure 8: Schemes of three bimolecular G-quadruplexes displaying distinct dimerization patterns.

1.9. Drawing G-quadruplexes

Simple schemes, such as those found in Figures 3, 4 and 8, are often needed to depict G-quadruplex structures. Despite their undeniable usefulness and ubiquity in the literature, these schemes are simplifications of real structures. Notably, the groove widths, helicity, and proper guanine relative positions are not rendered. Twists can be illustrated by “top-view” figures (as in Figure 6) or directly from high-resolution structural data when available (Figures 5 and 7). Furthermore, the guanine orientation (syn or anti), when known, should be specified, for instance through the use of two distinct colors and/or symbols.

Moreover, there are often discrepancies between the structural knowledge gathered in a study and the amount of structural detail in schemes from the associated publication. This is often the case when authors rely on low-resolution data, often circular dichroism, from which strand topologies are determined. There are often several structures possible for a given topology (even for parallel structures; see Figure 3C), sometimes in equilibrium, and thus by presenting a single scheme authors may mislead the readers. It is therefore advisable to draw schemes that reflect exclusively what is proven and/or can be inferred from the sequence, and to clearly state their limitations.

References

1. Karsisiotis,A.I., O’Kane,C. and Webba da Silva,M. (2013) DNA quadruplex folding formalism – A tutorial on quadruplex topologies. Methods, 64, 28–35.
2. Phan,A.T., Kuryavyi,V., Burge,S., Neidle,S. and Patel,D.J. (2007) Structure of an Unprecedented G-Quadruplex Scaffold in the Human c-kit Promoter. Journal of the American Chemical Society, 129, 4386–4392.
3. Karsisiotis,A.I., O’Kane,C. and Webba da Silva,M. (2013) DNA quadruplex folding formalism – A tutorial on quadruplex topologies. Methods, 64, 28–35.
4. Phan,A.T., Kuryavyi,V., Burge,S., Neidle,S. and Patel,D.J. (2007) Structure of an Unprecedented G-Quadruplex Scaffold in the Human c-kit Promoter. Journal of the American Chemical Society, 129, 4386–4392.
5. Dai,J., Carver,M., Punchihewa,C., Jones,R.A. and Yang,D. (2007) Structure of the Hybrid-2 type intramolecular human telomeric G-quadruplex in K+ solution: insights into structure polymorphism of the human telomeric sequence. Nucleic Acids Research, 35, 4927–4940.
6. Largy,E., Marchand,A., Amrane,S., Gabelica,V. and Mergny,J.-L. (2016) Quadruplex Turncoats: Cation-Dependent Folding and Stability of Quadruplex-DNA Double Switches. Journal of the American Chemical Society, 138, 2780–2792.
7. Haider,S., Parkinson,G.N. and Neidle,S. (2002) Crystal Structure of the Potassium Form of an Oxytricha nova G-quadruplex. Journal of Molecular Biology, 320, 189–200.
8. Zhang,Z., Dai,J., Veliath,E., Jones,R.A. and Yang,D. (2009) Structure of a two-G-tetrad intramolecular G-quadruplex formed by a variant human telomeric sequence in K+ solution: insights into the interconversion of human telomeric G-quadruplex structures. Nucleic Acids Research, 38, 1009–1021.
9. Webba da Silva,M. (2003) Association of DNA Quadruplexes through G:C:G:C Tetrads. Solution Structure of d(GCGGTGGAT). Biochemistry, 42, 14356–14365.
10. Lim,K.W., Alberti,P., Guédin,A., Lacroix,L., Riou,J.-F., Royle,N.J., Mergny,J.-L. and Phan,A.T. (2009) Sequence variant (CTAGGG)n in the human telomere favors a G-quadruplex structure containing a G·C·G·C tetrad. Nucleic Acids Research, 37, 6239–6248.
11. Webba da Silva,M. (2003) Association of DNA Quadruplexes through G:C:G:C Tetrads. Solution Structure of d(GCGGTGGAT). Biochemistry, 42, 14356–14365.
12. Sigel,A., Sigel,H. and Sigel,R.K.O. eds. (2016) The alkali metal ions: Their role for life. Springer International Publishing.
13. Chung,W.J., Heddi,B., Schmitt,E., Lim,K.W., Mechulam,Y. and Phan,A.T. (2015) Structure of a left-handed DNA G-quadruplex. Proceedings of the National Academy of Sciences, 112, 2729–2733.
14. Lech,C.J., Heddi,B. and Phan,A.T. (2012) Guanine base stacking in G-quadruplex nucleic acids. Nucleic Acids Research, 41, 2034–2046.
15. Chung,W.J., Heddi,B., Schmitt,E., Lim,K.W., Mechulam,Y. and Phan,A.T. (2015) Structure of a left-handed DNA G-quadruplex. Proceedings of the National Academy of Sciences, 112, 2729–2733.
16. Laughlan,G., Murchie,A., Norman,D., Moore,M., Moody,P., Lilley,D. and Luisi,B. (1994) The high-resolution crystal structure of a parallel-stranded guanine tetraplex. Science, 265, 520–524.
17. Chung,W.J., Heddi,B., Schmitt,E., Lim,K.W., Mechulam,Y. and Phan,A.T. (2015) Structure of a left-handed DNA G-quadruplex. Proceedings of the National Academy of Sciences, 112, 2729–2733.
18. Lech,C.J., Heddi,B. and Phan,A.T. (2012) Guanine base stacking in G-quadruplex nucleic acids. Nucleic Acids Research, 41, 2034–2046.

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC-SA 4.0. Source code is available at https://github.com/EricLarG4/EricLarG4.github.io, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".