NUCLEIC ACID STRUCTURE

 

I. Nucleic acid polymers are made of mononucleotide monomers (Fig. 2.10):

A. NUCLEOSIDE = NITROGENOUS BASE (PURINE OR PYRIMIDINE) + SUGAR

B. NUCLEOTIDE = BASE (N-CONTAINING) + SUGAR + 1 or more phosphates

 

NUCLEOTIDE TYPE        BASES*        SUGAR          PHOSPHATE GROUPS
RIBONUCLEOTIDE:        A, C, G, U    RIBOSE         USUALLY 3' OR 5'
DEOXYRIBONUCLEOTIDE:   A, C, G, T    DEOXYRIBOSE    USUALLY 3' OR 5'

*A = ADENINE, G = GUANINE, C = CYTOSINE, U = URACIL, T = THYMINE

A and G are PURINE bases

C, U, and T are PYRIMIDINE bases

(Note: you will not be expected to memorize nucleotide structures. You do need to know the components of nucleotides, the letter designations for the bases, which bases are purines and which are pyrimidines, and which bases pair with which (see below), and you need to understand the difference between 5’ and 3’ ends. If shown a chemical depiction of a nucleotide, you should be able to label the base and the sugar, and to distinguish deoxyribose from ribose.)

II. BASE PAIRING IN DOUBLE-STRANDED NUCLEIC ACIDS, Figs. 2.12 and 3.7

A. G pairs with C; A pairs with T (or U)

B. Most cellular RNA is SINGLE-STRANDED; Most DNA is DOUBLE STRANDED

(What does this imply about the ratio of purine to pyrimidine bases in double stranded DNA? Work this out for yourself.)

 

III. ESSENTIAL FEATURES OF DNA STRUCTURE, Fig. 3.7

A. COMPLEMENTARITY. Like a form and its mold, the structure of one strand determines that of its complementary strand. In other words, the DNA molecule carries the same information in two forms (like a photographic print and its negative). Although the two strands are different in structure, one can be used to synthesize the other. Also, one strand’s sequence can be used to identify that of its complement.

B. HELICAL STRUCTURE. The most common form of DNA, the B FORM, is a RIGHT-HANDED DOUBLE HELIX. The paired bases are "stacked" on the inside of the double helix with one base pair every 3.4 Angstroms (0.34 nm). There are slightly more than 10 base pairs (bp) per full turn of the helix. The negatively charged phosphate (phosphodiester) backbones are on the outside. The double helix has a MAJOR (wide) and a MINOR (narrow) GROOVE. It is about 2 nm (20 Angstroms) in diameter. The two phosphate backbones repel one another electrostatically, but the two strands are held in association at normal temperatures by hydrogen bonding between the bases.

C. ANTIPARALLEL NATURE. Each DNA strand is formed by a series of phosphodiester bonds linking the 5' carbon of one nucleotide's sugar to the 3' carbon of the next. Each strand therefore has an inherent directionality. Conventionally, a DNA (or RNA) strand is described as going from its free 5' end to its free 3' end (and sequences are written in 5' to 3' order). In any double stranded nucleic acid, the two strands are ANTIPARALLEL; i.e., they run in opposite directions. Therefore the 5' end of one strand is generally across from the 3' end of its complement and vice versa.
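The three features above (complementarity, helical geometry, antiparallel strands) can be sketched in a few lines of Python. This is only an illustration: the function names are made up for these notes, the rise of 0.34 nm per base pair is taken from the text, and the helix is rounded to exactly 10 bp per turn even though the real figure is slightly higher.

```python
# Watson-Crick pairing rules for DNA (A with T, G with C).
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

RISE_PER_BP_NM = 0.34  # rise per base pair in B-form DNA (3.4 Angstroms)
BP_PER_TURN = 10       # assumption: rounded; the notes say "slightly more than 10"


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5' to 3'.

    The input is read 5' to 3'. Because the two strands are antiparallel,
    the complement must be reversed to be written 5' to 3' as well.
    """
    return "".join(DNA_COMPLEMENT[base] for base in reversed(strand.upper()))


def helix_length_nm(num_bp: int) -> float:
    """Approximate length of a B-form duplex of num_bp base pairs, in nm."""
    return num_bp * RISE_PER_BP_NM


def helical_turns(num_bp: int) -> float:
    """Approximate number of full helical turns in num_bp base pairs."""
    return num_bp / BP_PER_TURN


if __name__ == "__main__":
    top = "ATGGCA"  # written 5' to 3'
    print(reverse_complement(top))  # TGCCAT (also written 5' to 3')
    print(helix_length_nm(1000), "nm, about", helical_turns(1000), "turns")
```

Applying reverse_complement twice returns the original sequence, which is the "print and negative" idea in III.A: either strand is enough to reconstruct the other.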

 

How Do We Know?

1. That genetic (inherited) information is encoded in the sequence of DNA?

After the rediscovery of Mendel's laws, genetic experiments with Drosophila by T.H. Morgan and others proved the "chromosomal theory of inheritance", i.e., that genes are located at specific positions on chromosomes in the eukaryotic nucleus (discussed later). Biochemical analysis of chromosomes showed that they were mostly protein and DNA. The first indication that DNA was the information carrier came from Avery and colleagues who demonstrated (in 1944) that the "transforming agent" that genetically altered Pneumococcus bacteria was made up of DNA, purified away from protein (Fig. 3.6). Many scientists remained unconvinced, in part because it was unclear that bacterial transformation involved the same type of heredity as seen in eukaryotes. A second, biochemical line of evidence was provided in the early 50's by Alfred Hershey (an MSU grad) and Martha Chase. They used radioactive isotope labeling (newly available after WW II) and bacterial viruses (bacteriophage) to show that when these viruses infected cells, they injected their DNA but not their protein into the cell, so there could be no transfer of information via protein from parent bacteriophage to offspring. (This was still rather different from eukaryotic sexual inheritance, but the agreement of these two very different types of experiment made a persuasive case for DNA. Shortly thereafter, the Watson-Crick structure of DNA convinced almost everyone.)

2. That the Watson-Crick model of DNA structure is correct?

Watson and Crick put together their model (1953) based primarily on the following evidence:

a. X-ray "fiber diffraction" patterns suggested a right-handed helical structure that repeated itself in some fashion every 3.4 and every 34 Å (Angstrom, 10^-10 m). These came from several labs, but some of the best were from Rosalind Franklin and Maurice Wilkins.

b. Biochemical analysis, mostly from the lab of Erwin Chargaff, indicated that in most DNA the amount of guanine equaled that of cytosine (G=C) and adenine equaled thymine (A=T).

It's important to note that the X-ray patterns from DNA fibers were much less precise than the patterns produced, now or even then, by crystals of a single pure protein, so the exact locations of atoms could only be inferred from crude models. However, Watson and Crick accurately estimated all the critical properties of DNA noted above: two strands held together by hydrogen bonding between bases; phosphates on the outside (suggested in general terms by others and confirmed by Watson-Crick model building); complementarity (using Chargaff's results), implying that once separated, each of the two strands can serve as the template on which to build a new complementary strand; a right-handed helix with base pairs perpendicular to the helix axis every 3.4 Å and about 10 base pairs per helical turn (using X-ray results); and the antiparallel nature of the two strands (again from model building, once the base-pairing rules were figured out).

The Watson-Crick DNA structure is an unusual example of a major scientific discovery, and not just because of the incredible impact it had. First, note that Watson and Crick did almost none of the experimental work themselves; they were simply the first to put together all the data in the right way. However, the model building technique that Watson and Crick employed is still widely used by structural biologists; of course, now they do it on high speed computers. Second, although the Watson-Crick structure was rapidly accepted by almost everyone, it can't be said to have been confirmed beyond a reasonable doubt for nearly 30 years, until small, chemically defined and completely pure DNAs could be crystallized for exact X-ray analysis. Note, for example, that Alex Rich and colleagues showed in 1979 that certain unusual sequence arrangements of DNA can adopt a left-handed double helical form called Z-DNA, and that non-Watson-Crick structures were still being seriously proposed for normal DNA well into the 1970s.

 

 

IV. RNA STRUCTURE

A. Although most RNAs are single stranded, an RNA strand can fold back on itself to form secondary structure (just as a protein folds to form secondary and tertiary structure). RNA secondary structure consists of stable stems and loops that arise from INTRAMOLECULAR BASE PAIRING ("HAIRPIN" LOOPS; e.g., see Fig. 7.1 for tRNA).

 

B. There are many different key types of RNA in the cell (just as there are many types of proteins). RNAs are predominantly used to build important structures and in information transfer processes. Some major RNAs are named according to how "big" they are as measured by their S value (sedimentation coefficient: how fast they spin to the bottom of a centrifuge tube). Most types of RNA are present in both prokaryotes and eukaryotes, but the eukaryotic versions are often larger than their prokaryotic homologues. (See table below.)

 

MAJOR RNAs (adapted from Singer and Berg, "Genes and Genomes", Univ. Sci. Books)

RNA TYPE                              # of different kinds in cells   Approx. length (nt)   Distribution*   Major Role
Transfer RNA (tRNA)                   80-100                          75-90                 P,E             Translation adapter
5S Ribosomal RNA (rRNA)               1-2                             120                   P,E             Ribosome structure
5.8S rRNA                             1                               155                   E               Ribosome structure
16S rRNA                              1                               1600                  P               Ribosome structure
23S rRNA                              1                               3200                  P               Ribosome structure
18S rRNA                              1                               1900                  E               Ribosome structure
28S rRNA                              1                               5000                  E               Ribosome structure
Messenger RNA (mRNA)                  thousands                       variable              P,E             Used by ribosome as directions for protein synthesis
Heterogeneous nuclear RNA (hnRNA)     thousands                       variable              E               Copied from DNA; processed to mRNA
Small nuclear RNA (snRNA)             tens                            58-220                E               Used in hnRNA processing (spliceosome)

*P = in prokaryotes; E = in eukaryotes

(Come back and review this table at the end of chapters 6 and 7 to see where these RNAs function in gene expression.)
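To make the idea of intramolecular base pairing in IV.A concrete, here is a rough Python sketch that searches a single RNA strand for a stretch that could pair with a later stretch of the same strand, leaving a loop in between. It is a toy under stated assumptions: the function names, the default stem length of 4 and the minimum loop of 3 nucleotides are all chosen for illustration, only A-U and G-C pairs are considered (G-U wobble pairs are ignored), and nothing is said about whether a given hairpin would actually be stable.

```python
# Watson-Crick pairing rules for RNA (A with U, G with C).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the sequence that would pair with seq, written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def find_hairpins(rna: str, stem_len: int = 4, min_loop: int = 3):
    """Find candidate hairpin stems in a single RNA strand.

    Returns a list of (i, j) pairs: the stem_len nucleotides starting at
    index i can base pair with the stem_len nucleotides starting at index j,
    with at least min_loop unpaired nucleotides between the two halves.
    """
    rna = rna.upper()
    n = len(rna)
    hits = []
    for i in range(n - 2 * stem_len - min_loop + 1):
        # The downstream half must be the reverse complement of the upstream
        # half, because the strand folds back on itself (antiparallel pairing).
        target = reverse_complement_rna(rna[i:i + stem_len])
        for j in range(i + stem_len + min_loop, n - stem_len + 1):
            if rna[j:j + stem_len] == target:
                hits.append((i, j))
    return hits


if __name__ == "__main__":
    # GGGC ... GCCC can pair, leaving the four A's as the loop.
    print(find_hairpins("GGGCAAAAGCCC"))  # [(0, 8)]
```

Real structure-prediction programs weigh many competing stems by their stability; this sketch only shows why a single strand containing two reverse-complementary stretches can fold into a hairpin.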

 

V. INFORMATION TRANSFER IN THE CELL FOLLOWS THE CENTRAL "DOGMA":

DNA ⇒ DNA ⇒ mRNA ⇒ PROTEIN

 

GENE EXPRESSION: The flow of information from genotype to phenotype.

(An overview of where we’re going.)

A. In order to live and replicate, the cell or organism must generate many different biological structures and catalyze thousands of different biochemical reactions. Almost all of this depends on the generation of specific PROTEINS which form the structures or, acting as enzymes, catalyze the generation of other structures and the other biochemical reactions required for life. One example is glucose ⇔ glucose-6-phosphate, the first reaction in glycolysis, which is essential to the utilization of glucose for energy. This is catalyzed by the enzyme hexokinase.

B. The function of any protein is completely dependent on its 3-DIMENSIONAL STRUCTURE. This 3-dimensional structure is comprised of elements of primary, secondary, tertiary, and, in the case of multisubunit proteins, quaternary structure. The structure is the "information" needed for the enzyme to function. Hexokinase is a single subunit protein.

C. Essentially all aspects of the 3-dimensional structure of a protein are the result of its PRIMARY STRUCTURE (or AMINO ACID SEQUENCE). The way a protein folds into its secondary and tertiary structure, whether or not it associates with other proteins (quaternary structure), and whether or not it binds other molecules or is modified by the cell all depend on the primary structure. Hexokinase, for example, contains a specific sequence of 920 amino acids.

D. The amino acid sequence of a protein is determined by the process of TRANSLATION, in which the correct amino acid is chosen for each position according to the sequence of CODONS (3 nucleotide units) in the MESSENGER RNA (mRNA) which codes for that specific protein. Codons and amino acids correspond according to the GENETIC CODE, in which each of 64 possible codons is read as one of the 20 amino acids used (or as a stop codon). The coding sequence of hexokinase mRNA consists of a specific linear array of 920 codons or 2760 nucleotides (920 x 3) of mRNA.

E. The sequence of nucleotides in mRNA is copied using complementarity from the sequence of nucleotides in the template strand of DNA which encodes that specific gene. This is the process of TRANSCRIPTION, and it is accomplished by the enzyme(s) RNA POLYMERASE, along with a variety of associated protein factors.

F. All the genes of the organism are arrayed, along with other sequences, on one or more specific long DNA molecules that make up CHROMOSOMES. Each gene is located at a specific place on that DNA/chromosome. (For example, the gene for the major form of human hexokinase is at a position called q22 on human chromosome 10.)

G. Each gene is duplicated by the process of DNA REPLICATION in which two (nearly) exact copies of each chromosomal DNA are made, followed by segregation of one copy to each of the daughter cells. The two strands of DNA are separated and new duplexes are synthesized using the complementarity principle. Each daughter cell receives a complete copy of all the genetic information present in the parental cell.
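Steps D and E above (transcription by complementarity, then translation codon by codon) can be sketched in Python. Treat this as an illustration only: the function names and the short example sequence are invented for these notes, and the codon table is deliberately partial, listing just a handful of entries from the standard genetic code.

```python
# Template DNA base -> complementary RNA base (RNA uses U where DNA uses T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Partial codon table: only a few of the 64 codons of the standard code.
CODON_TABLE = {
    "AUG": "Met", "GCU": "Ala", "AAA": "Lys", "UGG": "Trp",
    "UUU": "Phe", "GGC": "Gly", "UAA": "Stop",
}

NUM_POSSIBLE_CODONS = 4 ** 3  # 4 bases at each of 3 positions = 64


def transcribe(template_5to3: str) -> str:
    """Return the mRNA (5' to 3') copied from a template strand (5' to 3').

    The mRNA is complementary and antiparallel to the template, so the
    template is read in reverse while each base is complemented.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3.upper()))


def translate(mrna: str) -> list:
    """Read an mRNA three nucleotides at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


def coding_nucleotides(num_codons: int) -> int:
    """Number of mRNA nucleotides needed to spell out num_codons codons."""
    return num_codons * 3


if __name__ == "__main__":
    template = "TTACCATTTAGCCAT"   # hypothetical template strand, 5' to 3'
    mrna = transcribe(template)
    print(mrna)                    # AUGGCUAAAUGGUAA
    print(translate(mrna))         # ['Met', 'Ala', 'Lys', 'Trp']
    print(coding_nucleotides(920)) # 2760, the hexokinase figure in D
```

The same complementarity rule, with T in place of U, underlies DNA replication in step G: each separated strand serves as the template for a new partner.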