Biosynthesis and catabolism of purine and pyrimidine nucleotides. Determination of the end products of their metabolism.

Molecular mechanism of DNA replication. Transcription. Biosynthesis of proteins on ribosomes. Stages and mechanisms of translation, regulation of translation. Antibiotic inhibitors of transcription and translation.

One of the important specialized pathways of a number of amino acids is the synthesis of purine and pyrimidine nucleotides. These nucleotides are important for a number of reasons. Several of them, not just ATP, serve as sources of energy that drive cellular reactions. ATP is the most commonly used source, but GTP is used in protein synthesis as well as in a few other reactions. UTP is the source of energy for activating glucose and galactose. CTP is an energy source in lipid metabolism. AMP is part of the structure of some coenzymes, such as NAD and coenzyme A. And, of course, the nucleotides are part of nucleic acids. Neither the bases nor the nucleotides are required dietary components. We can both synthesize them de novo and salvage and reuse those we already have.

Nitrogen Bases

There are two kinds of nitrogen-containing bases: purines and pyrimidines. Purines consist of a six-membered and a five-membered nitrogen-containing ring, fused together. Pyrimidines have only a six-membered nitrogen-containing ring. There are 4 purines and 4 pyrimidines that are of concern to us.


                    Adenine = 6-amino purine

                    Guanine = 2-amino-6-oxy purine

                    Hypoxanthine = 6-oxy purine

                    Xanthine = 2,6-dioxy purine

Adenine and guanine are found in both DNA and RNA. Hypoxanthine and xanthine are not incorporated into the nucleic acids as they are being synthesized but are important intermediates in the synthesis and degradation of the purine nucleotides.




                    Uracil = 2,4-dioxy pyrimidine

                    Thymine = 2,4-dioxy-5-methyl pyrimidine

                    Cytosine = 2-oxy-4-amino pyrimidine

                    Orotic acid = 2,4-dioxy-6-carboxy pyrimidine

Cytosine is found in both DNA and RNA. Uracil is found only in RNA. Thymine is normally found in DNA. Sometimes tRNA will contain some thymine as well as uracil.



If a sugar, either ribose or 2-deoxyribose, is added to a nitrogen base, the resulting compound is called a nucleoside. Carbon 1 of the sugar is attached to nitrogen 9 of a purine base or to nitrogen 1 of a pyrimidine base. The names of purine nucleosides end in -osine and the names of pyrimidine nucleosides end in -idine. The convention is to number the ring atoms of the base normally and to use 1', 2', etc. to distinguish the ring atoms of the sugar. Unless otherwise specified, the sugar is assumed to be ribose. To indicate that the sugar is 2'-deoxyribose, a d- is placed before the name.



                    Inosine - the base in inosine is hypoxanthine





Adding one or more phosphates to the sugar portion of a nucleoside results in a nucleotide. Generally, the phosphate is in ester linkage to carbon 5' of the sugar. If more than one phosphate is present, they are generally in acid anhydride linkages to each other. If such is the case, no position designation in the name is required. If the phosphate is in any other position, however, the position must be designated. For example, 3'-5' cAMP indicates that a phosphate is in ester linkage to both the 3' and 5' hydroxyl groups of an adenosine molecule and forms a cyclic structure. 2'-GMP would indicate that a phosphate is in ester linkage to the 2' hydroxyl group of a guanosine. Some representative names are:

                    AMP = adenosine monophosphate = adenylic acid

                    CDP = cytidine diphosphate

                    dGTP = deoxyguanosine triphosphate

                    dTTP = deoxythymidine triphosphate (more commonly designated TTP)

                    cAMP = 3'-5' cyclic adenosine monophosphate
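The naming rules above are regular enough to be written down as a small lookup. The following is only a minimal sketch: the function name `expand` and the two tables are hypothetical helpers introduced for illustration, and the sketch deliberately handles only the simple 5'-phosphate abbreviations (it rejects forms such as cAMP that need a position designation).

```python
import re

# One-letter symbols for the nucleosides named in the text.
NUCLEOSIDES = {
    "A": "adenosine", "G": "guanosine", "C": "cytidine", "U": "uridine",
    "T": "thymidine", "I": "inosine", "X": "xanthosine",
}
PHOSPHATES = {"MP": "monophosphate", "DP": "diphosphate", "TP": "triphosphate"}


def expand(abbrev: str) -> str:
    """Expand an abbreviation such as 'dGTP' into its full name."""
    match = re.fullmatch(r"(d?)([AGCUTIX])(MP|DP|TP)", abbrev)
    if match is None:
        raise ValueError(f"not a simple 5'-nucleotide abbreviation: {abbrev!r}")
    deoxy, base, phosphate = match.groups()
    prefix = "deoxy" if deoxy else ""  # a leading d marks 2'-deoxyribose
    return f"{prefix}{NUCLEOSIDES[base]} {PHOSPHATES[phosphate]}"


print(expand("CDP"))
print(expand("dGTP"))
```

Because the 5' position is the default, the sketch never prints a position number, which mirrors the rule that no designation is needed for 5'-phosphates.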






[Figure: structures of uracil, hypoxanthine, guanosine, thymidine, and a nucleoside 5'-monophosphate]





Adenosine 5'-monophosphate (adenylate, AMP)

Guanosine 5'-monophosphate (guanylate, GMP)

Thymidine 5'-monophosphate (thymidylate, TMP)

Cytidine 5'-monophosphate (cytidylate, CMP)

Uridine 5'-monophosphate (uridylate, UMP)

Inosine 5'-monophosphate (inosinate, IMP)

Xanthosine 5'-monophosphate (xanthylate, XMP)



Nucleotides are joined together by 3'-5' phosphodiester bonds to form polynucleotides. Polymerization of ribonucleotides will produce an RNA while polymerization of deoxyribonucleotides leads to DNA.

Hydrolysis of Polynucleotides

Most, but not all, nucleic acids in the cell are associated with protein. Dietary nucleoprotein is degraded by pancreatic enzymes and tissue nucleoprotein by lysosomal enzymes. After dissociation of the protein and nucleic acid, the protein is metabolized like any other protein.

The nucleic acids are hydrolyzed randomly by nucleases to yield a mixture of polynucleotides. These are further cleaved by phosphodiesterases (exonucleases) to a mixture of the mononucleotides. The specificity of the pancreatic phosphodiesterases gives the 3'-nucleotides and that of the lysosomal phosphodiesterases gives the biologically important 5'-nucleotides.



The nucleotides are hydrolyzed by nucleotidases to give the nucleosides and Pi. This is probably the end product in the intestine with the nucleosides being the primary form absorbed. In at least some tissues, the nucleosides undergo phosphorolysis with nucleoside phosphorylases to yield the base and ribose 1-P (or deoxyribose 1-P). Since R 1-P and R 5-P are in equilibrium, the sugar phosphate can either be reincorporated into nucleotides or metabolized via the Hexose Monophosphate Pathway. The purine and pyrimidine bases released are either degraded or salvaged for reincorporation into nucleotides.


There is significant turnover of all kinds of RNA as well as of the nucleotide pool. DNA does not turn over, but portions of the molecule are excised as part of a repair process.



Purine and pyrimidine bases which are not degraded are recycled - i.e. reincorporated into nucleotides. This recycling, however, is not sufficient to meet total body requirements and so some de novo synthesis is essential. There are definite tissue differences in the ability to carry out de novo synthesis. De novo synthesis of purines is most active in liver. Non-hepatic tissues generally have limited or even no de novo synthesis. Pyrimidine synthesis occurs in a variety of tissues. For purines, especially, non-hepatic tissues rely heavily on preformed bases - those salvaged from their own intracellular turnover supplemented by bases synthesized in the liver and delivered to tissues via the blood.

"Salvage" of purines is reasonable in most cells because xanthine oxidase, the key enzyme in taking the purines all of the way to uric acid, is significantly active only in liver and intestine. The bases generated by turnover in non-hepatic tissues are not readily degraded to uric acid in those tissues and, therefore, are available for salvage. The liver probably does less salvage but is very active in de novo synthesis - not so much for itself but to help supply the peripheral tissues.

De novo synthesis of both purine and pyrimidine nucleotides occurs from readily available components.


Phosphoribosyl pyrophosphate (PRPP) is important in both, and in these pathways the structure of ribose is retained in the product nucleotide, in contrast to its fate in the tryptophan and histidine biosynthetic pathways discussed earlier. An amino acid is an important precursor in each type of pathway: glycine for purines and aspartate for pyrimidines. Glutamine again is the most important source of amino groups in five different steps in the de novo pathways. Aspartate is also used as the source of an amino group in the purine pathways, in two steps. Two other features deserve mention. First, there is evidence, especially in the de novo purine pathway, that the enzymes are present as large, multienzyme complexes in the cell, a recurring theme in our discussion of metabolism. Second, the cellular pools of nucleotides (other than ATP) are quite small, perhaps 1% or less of the amounts required to synthesize the cell's DNA.

Therefore, cells must continue to synthesize nucleotides during nucleic acid synthesis, and in some cases nucleotide synthesis may limit the rates of DNA replication and transcription. Because of the importance of these processes in dividing cells, agents that inhibit nucleotide synthesis have become particularly important to modern medicine. We examine here the biosynthetic pathways of purine and pyrimidine nucleotides and their regulation, the formation of the deoxynucleotides, and the degradation of purines and pyrimidines to uric acid and urea. We end with a discussion of chemotherapeutic agents that affect nucleotide synthesis.


De Novo Synthesis of Purine Nucleotides


The two parent purine nucleotides of nucleic acids are adenosine 5'-monophosphate (AMP; adenylate) and guanosine 5'-monophosphate (GMP; guanylate), containing the purine bases adenine and guanine. Figure shows the origin of the carbon and nitrogen atoms of the purine ring system, as determined by John Buchanan using isotopic tracer experiments in birds. The detailed pathway of purine biosynthesis was worked out primarily by Buchanan and G. Robert Greenberg in the 1950s.


In the first committed step of the pathway, an amino group donated by glutamine is attached at C-1 of PRPP.




The resulting 5-phosphoribosylamine is highly unstable, with a half-life of 30 seconds at pH 7.5. The purine ring is subsequently built up on this structure. The pathway described here is identical in all organisms, with the exception of one step that differs in higher eukaryotes as noted below.




The second step is the addition of three atoms from glycine (step 2). An ATP is consumed to activate the glycine carboxyl group (in the form of an acyl phosphate) for this condensation reaction:



The added glycine amino group is then formylated by N10-formyltetrahydrofolate (step 3):



A nitrogen is contributed by glutamine (step 4), before dehydration and ring closure yield the five-membered imidazole ring of the purine nucleus, as 5-aminoimidazole ribonucleotide (AIR; step 5).



At this point, three of the six atoms needed for the second ring in the purine structure are in place. To complete the process, a carboxyl group is first added (step 6). This carboxylation is unusual in that it does not require biotin, but instead uses the bicarbonate generally present in aqueous solutions. A rearrangement transfers the carboxylate from the exocyclic amino group to position 4 of the imidazole ring (step 7).



Steps 6 and 7 are found only in bacteria and fungi. In higher eukaryotes, including humans, the 5-aminoimidazole ribonucleotide product of step 5 is carboxylated directly to carboxyaminoimidazole ribonucleotide in one step instead of two (step 6a). The enzyme catalyzing this reaction is AIR carboxylase.



Aspartate now donates its amino group in two steps (steps 8 and 9): formation of an amide bond, followed by elimination of the carbon skeleton of aspartate (as fumarate).



Recall that aspartate plays an analogous role in two steps of the urea cycle. The final carbon is contributed by N10-formyltetrahydrofolate (step 10), and a second ring closure takes place to yield the second fused ring of the purine nucleus (step 11).



The first intermediate with a complete purine ring is inosinate (IMP).
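The step-by-step description above can be condensed into a small table of which precursor contributes atoms at each step. This is a sketch for bookkeeping only: the `Step` record and the function names are hypothetical, the step labels follow the numbering used in the text, and the carboxyl source in step 6a is left generic because the text does not specify it.

```python
from collections import Counter
from typing import NamedTuple


class Step(NamedTuple):
    label: str  # step number as used in the text
    donor: str  # precursor contributing atoms ("" if the step adds none)
    event: str  # what happens in the step


COMMON_START = [
    Step("1", "glutamine", "amino group attached at C-1 of PRPP"),
    Step("2", "glycine", "three atoms added; ATP activates the carboxyl group"),
    Step("3", "N10-formyltetrahydrofolate", "glycine amino group formylated"),
    Step("4", "glutamine", "nitrogen added"),
    Step("5", "", "dehydration and ring closure give AIR"),
]
BACTERIA_AND_FUNGI = [
    Step("6", "bicarbonate", "carboxyl group added to the exocyclic amino group"),
    Step("7", "", "carboxylate rearranged to position 4 of the imidazole ring"),
]
HIGHER_EUKARYOTES = [
    Step("6a", "carboxyl group", "AIR carboxylated directly by AIR carboxylase"),
]
COMMON_END = [
    Step("8", "aspartate", "amide bond formed with aspartate"),
    Step("9", "", "carbon skeleton of aspartate eliminated as fumarate"),
    Step("10", "N10-formyltetrahydrofolate", "final carbon added"),
    Step("11", "", "second ring closure gives IMP"),
]


def route(higher_eukaryote: bool) -> list:
    """Return the ordered steps from PRPP to IMP for the chosen organism type."""
    middle = HIGHER_EUKARYOTES if higher_eukaryote else BACTERIA_AND_FUNGI
    return COMMON_START + middle + COMMON_END


def donor_counts(steps) -> Counter:
    """Count how many steps each precursor contributes to."""
    return Counter(step.donor for step in steps if step.donor)


for donor, n in donor_counts(route(higher_eukaryote=True)).items():
    print(f"{donor}: {n} step(s)")
```

Tallying the donors this way makes it easy to see that glutamine and N10-formyltetrahydrofolate are each used twice on the way to IMP, while glycine and aspartate are each used once.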

As in the tryptophan and histidine biosynthetic pathways, the enzymes of IMP synthesis appear to be organized as large, multienzyme complexes in the cell. Once again, evidence comes from the existence of single polypeptides with several functions, some catalyzing nonsequential steps in the pathway. In eukaryotic cells ranging from yeast to fruit flies to chickens, steps 1, 3, and 5 are catalyzed by a multifunctional protein. An additional multifunctional protein catalyzes steps 10 and 11. In humans, a multifunctional enzyme combines the activities of AIR carboxylase and SAICAR synthetase (steps 6a and 8).


In bacteria, these activities are found on separate proteins, but a large noncovalent complex may exist in these cells. The channeling of reaction intermediates from one enzyme to the next permitted by these complexes is probably especially important for unstable intermediates such as 5-phosphoribosylamine.
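To get a feel for why channeling matters for an intermediate with the 30-second half-life quoted earlier, the fraction of 5-phosphoribosylamine that survives a given delay can be estimated. The sketch below assumes simple first-order decay, which the notes do not state explicitly; the function name is hypothetical.

```python
def fraction_remaining(t_seconds: float, half_life_seconds: float = 30.0) -> float:
    """Fraction of an unstable intermediate left after t_seconds.

    Assumes first-order (exponential) decay; the default half-life is the
    30 s quoted for 5-phosphoribosylamine at pH 7.5.
    """
    if t_seconds < 0:
        raise ValueError("time must be non-negative")
    return 0.5 ** (t_seconds / half_life_seconds)


for t in (0, 30, 60, 120):
    print(f"after {t:>3} s: {fraction_remaining(t):.4f} of the intermediate remains")
```

Under this assumption only about 6% of the intermediate would remain after two minutes of free diffusion, which is consistent with the idea that handing it directly to the next enzyme is advantageous.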

Conversion of inosinate to adenylate requires the insertion of an amino group derived from aspartate; this takes place in two reactions similar to those used to introduce N-1 of the purine ring (steps 8 and 9). A crucial difference is that GTP rather than ATP is the source of the high-energy phosphate in synthesizing adenylosuccinate.




Guanylate is formed by the NAD+-requiring oxidation of inosinate at C-2, followed by addition of an amino group derived from glutamine. ATP is cleaved to AMP and PPi in the final step.




Three major feedback mechanisms cooperate in regulating the overall rate of de novo purine nucleotide synthesis and the relative rates of formation of the two end products, adenylate and guanylate. The first mechanism is exerted on the first reaction that is unique to purine synthesis: transfer of an amino group to PRPP to form 5-phosphoribosylamine. This reaction is catalyzed by the allosteric enzyme glutamine-PRPP amidotransferase, which is inhibited by the end products IMP, AMP, and GMP. AMP and GMP act synergistically in this concerted inhibition. Thus, whenever either AMP or GMP accumulates to excess, the first step in its biosynthesis from PRPP is partially inhibited.

In the second control mechanism, exerted at a later stage, an excess of GMP in the cell inhibits formation of xanthylate from inosinate by IMP dehydrogenase, without affecting the formation of AMP. Conversely, an accumulation of adenylate inhibits formation of adenylosuccinate by adenylosuccinate synthetase, without affecting the biosynthesis of GMP. In the third mechanism, GTP is required in the conversion of IMP to AMP (step 1), whereas ATP is required for conversion of IMP to GMP (step 4), a reciprocal arrangement that tends to balance the synthesis of the two ribonucleotides.

The final control mechanism is the inhibition of PRPP synthesis by the allosteric regulation of ribose phosphate pyrophosphokinase. This enzyme is inhibited by ADP and GDP, in addition to metabolites from other pathways of which PRPP is a starting point.
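The feedback relationships described above can be summarized as a lookup from each regulated enzyme to the nucleotides that inhibit it. This is only a qualitative sketch (no kinetics, no synergy between AMP and GMP); the dictionary and function names are hypothetical, and only inhibitors named in the text are listed.

```python
# Enzyme -> end products that inhibit it, as listed in the text.
INHIBITORS = {
    "glutamine-PRPP amidotransferase": {"IMP", "AMP", "GMP"},
    "IMP dehydrogenase": {"GMP"},
    "adenylosuccinate synthetase": {"AMP"},
    "ribose phosphate pyrophosphokinase": {"ADP", "GDP"},
}


def inhibited_enzymes(accumulated) -> list:
    """Return the enzymes inhibited when the given nucleotides accumulate."""
    accumulated = set(accumulated)
    return sorted(
        enzyme for enzyme, inhibitors in INHIBITORS.items()
        if inhibitors & accumulated
    )


print(inhibited_enzymes({"GMP"}))
print(inhibited_enzymes({"AMP"}))
```

Running it for GMP alone shows the pattern described in the text: the committed step and the GMP branch are restrained while the AMP branch (adenylosuccinate synthetase) is left untouched.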



Purine Catabolism

The end product of purine catabolism in man is uric acid. Uric acid is formed primarily in the liver and excreted by the kidney into the urine.



Nucleotides to Bases

Guanine nucleotides are hydrolyzed to the nucleoside guanosine which undergoes phosphorolysis to guanine and ribose 1-P. Man's intracellular nucleotidases are not very active toward AMP, however. Rather, AMP is deaminated by the enzyme adenylate (AMP) deaminase to IMP. In the catabolism of purine nucleotides, IMP is further degraded by hydrolysis with nucleotidase to inosine and then phosphorolysis to hypoxanthine.

Adenosine does occur but usually arises from S-adenosylmethionine during the course of transmethylation reactions. Adenosine is deaminated to inosine by an adenosine deaminase. Deficiencies in either adenosine deaminase or in the purine nucleoside phosphorylase lead to two different immunodeficiency diseases by mechanisms that are not clearly understood. With adenosine deaminase deficiency, both T-cell and B-cell immunity are affected. The phosphorylase deficiency affects the T cells but B cells are normal. In September 1990, a 4-year-old girl was treated for adenosine deaminase deficiency by genetically engineering her cells to incorporate the gene. The treatment, so far, seems to be successful.

Whether or not methylated purines are catabolized depends upon the location of the methyl group. If the methyl is on an -NH2, it is removed along with the -NH2 and the core is metabolized in the usual fashion. If the methyl is on a ring nitrogen, the compound is excreted unchanged in the urine.

Bases to Uric Acid

Both adenine and guanine nucleotides converge at the common intermediate xanthine. Hypoxanthine, representing the original adenine, is oxidized to xanthine by the enzyme xanthine oxidase. Guanine is deaminated, with the amino group released as ammonia, to xanthine. If this process is occurring in tissues other than liver, most of the ammonia will be transported to the liver as glutamine for ultimate excretion as urea.

Xanthine, like hypoxanthine, is oxidized by oxygen and xanthine oxidase with the production of hydrogen peroxide. In man, the urate is excreted and the hydrogen peroxide is degraded by catalase. Xanthine oxidase is present in significant concentration only in liver and intestine. The pathway to the nucleosides, possibly to the free bases, is present in many tissues.
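The convergence of the adenine and guanine routes on xanthine can be traced with a small lookup table. This is a sketch only: the table and function names are hypothetical, each compound is given a single successor as described in the text, and the enzyme that deaminates guanine is left unnamed because the text does not name it.

```python
# compound -> (next compound, enzyme), following the degradative steps in the text.
CATABOLIC_STEP = {
    "AMP": ("IMP", "AMP deaminase"),
    "IMP": ("inosine", "nucleotidase"),
    "adenosine": ("inosine", "adenosine deaminase"),
    "inosine": ("hypoxanthine", "purine nucleoside phosphorylase"),
    "hypoxanthine": ("xanthine", "xanthine oxidase"),
    "GMP": ("guanosine", "nucleotidase"),
    "guanosine": ("guanine", "purine nucleoside phosphorylase"),
    "guanine": ("xanthine", "deaminase (not named in the text)"),
    "xanthine": ("uric acid", "xanthine oxidase"),
}


def route_to_urate(start: str) -> list:
    """List the compounds passed through on the way from start to uric acid."""
    path = [start]
    while path[-1] != "uric acid":
        next_compound, _enzyme = CATABOLIC_STEP[path[-1]]
        path.append(next_compound)
    return path


def xanthine_oxidase_steps(start: str) -> int:
    """Count how many steps on the route use xanthine oxidase."""
    path = route_to_urate(start)
    return sum(CATABOLIC_STEP[c][1] == "xanthine oxidase" for c in path[:-1])


print(" -> ".join(route_to_urate("AMP")))
print(" -> ".join(route_to_urate("GMP")))
```

The adenine route passes through xanthine oxidase twice (hypoxanthine to xanthine, then xanthine to uric acid), whereas the guanine route uses it once, since guanine enters at xanthine.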




- Uric acid is the end product of purine metabolism.

- Hyperuricaemia is associated with a tendency to form crystals of monosodium urate causing:

- Clinical gout (due to the deposition of monosodium urate crystals in the cartilage, synovium and synovial fluid of joints),

- Renal calculi

- Tophi (accretions of sodium urate in soft tissues)

- Acute urate nephropathy (due to sudden increases in urate production leading to widespread crystallisation in the renal tubules).


Gout and Hyperuricemia


Both undissociated uric acid and the monosodium salt (primary form in blood) are only sparingly soluble. The limited solubility is not ordinarily a problem in urine unless the urine is very acidic or has high [Ca2+]. [Urate salts coprecipitate with calcium salts and can form stones in kidney or bladder.] A very high concentration of urate in the blood leads to a fairly common group of diseases referred to as gout. The incidence of gout in this country is about 3/1000.

Gout is a group of pathological conditions associated with markedly elevated levels of urate in the blood (3-7 mg/dl normal). Hyperuricemia is not always symptomatic, but, in certain individuals, something triggers the deposition of sodium urate crystals in joints and tissues. In addition to the extreme pain accompanying acute attacks, repeated attacks lead to destruction of tissues and severe arthritic-like malformations. The term gout should be restricted to hyperuricemia with the presence of these tophaceous deposits.
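As a worked example of the reference range quoted above, the sketch below converts a serum urate value from mg/dl to micromoles per litre and compares it with the 3-7 mg/dl range. The molar mass of uric acid (about 168.11 g/mol for C5H4N4O3) is not given in these notes and is supplied here as an assumption; the function names are hypothetical, and the output is for illustration only, not clinical guidance.

```python
URIC_ACID_MOLAR_MASS_G_PER_MOL = 168.11  # C5H4N4O3; value not from the notes
NORMAL_RANGE_MG_PER_DL = (3.0, 7.0)      # range quoted in the text


def mg_per_dl_to_umol_per_l(mg_per_dl: float) -> float:
    """Convert a uric acid concentration from mg/dl to micromol/l."""
    mg_per_l = mg_per_dl * 10.0                            # 1 l = 10 dl
    mmol_per_l = mg_per_l / URIC_ACID_MOLAR_MASS_G_PER_MOL
    return mmol_per_l * 1000.0


def classify(mg_per_dl: float) -> str:
    """Compare a value with the range quoted in the text."""
    low, high = NORMAL_RANGE_MG_PER_DL
    if mg_per_dl < low:
        return "below the quoted range"
    if mg_per_dl > high:
        return "above the quoted range (hyperuricemia)"
    return "within the quoted range"


for value in (5.0, 7.0, 9.5):
    print(f"{value} mg/dl = {mg_per_dl_to_umol_per_l(value):.0f} umol/l, {classify(value)}")
```

Note that, as the text stresses, a value above the range indicates hyperuricemia but does not by itself establish gout.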

Urate in the blood could accumulate through overproduction of uric acid, underexcretion, or both. In gout caused by overproduction of uric acid, the defects are in the control mechanisms governing the production not of uric acid itself but of the nucleotide precursors. The only major control of urate production that we know so far is the availability of substrates (nucleotides, nucleosides or free bases).

One approach to the treatment of gout is the drug allopurinol, an isomer of hypoxanthine.

Allopurinol is a substrate for xanthine oxidase, but the product binds so tightly that the enzyme is now unable to oxidize its normal substrate. Uric acid production is diminished and xanthine and hypoxanthine levels in the blood rise. These are more soluble than urate and are less likely to deposit as crystals in the joints. Another approach is to stimulate the secretion of urate in the urine.


- Gout is a group of metabolic diseases associated with hyperuricaemia and deposition of crystals of monosodium urate in tissues.

- Prevalence: 3/1000, males affected more than females (8-10:1).

- Presentation usually occurs in males over 30 years of age and females after the menopause.


A. Hypoxanthine-guanine phosphoribosyl transferase (HGPRT) deficiency (Lesch-Nyhan syndrome):

- The Lesch-Nyhan syndrome is an X-linked recessive disorder, due to severe deficiency of HGPRT.

- It is characterised by hyperuricaemia, mental deficiency, spasticity, choreoathetosis and self-mutilation.

- Hyperuricaemia is due to decreased activity of the salvage pathway causing decreased purine reutilization and increased uric acid synthesis. Relatively low levels of nucleotides result in decreased inhibition of de novo synthesis, resulting in further overload of the non-functioning salvage pathway and increased uric acid production.

B. Glucose 6-phosphatase deficiency (Glycogen storage disease type I / von Gierke's disease):

- Deficiency of glucose 6-phosphatase (final enzyme in glycogenolysis pathway) results in accumulation of glycogen, and hypoglycemia.

- Increased metabolism of glucose 6-phosphate through glycolysis results in lactic acidosis.

- Increased metabolism of glucose 6-phosphate through pentose phosphate pathway increases formation of ribose 5-phosphate and NADPH.

- Ribose 5-phosphate is a substrate for increased de novo purine nucleotide synthesis, which is subsequently degraded to uric acid resulting in hyperuricaemia.

- NADPH is a coenzyme in triglyceride synthesis, and overproduction results in hypertriglyceridaemia.

- Hyperuricaemia is aggravated by increased lactic acid which inhibits renal excretion of uric acid.


A. Physiological/environmental factors

B. Primary hyperuricaemia

Increased production:

- Idiopathic

- Glucose-6-phosphatase deficiency (von Gierke's disease)

- HGPRT deficiency (Lesch-Nyhan syndrome)

Reduced excretion:

- Idiopathic

C. Secondary hyperuricaemia

Increased production:

- Increased nucleic acid turnover:

- Myeloproliferative disease, eg polycythemia vera

- Lymphoma, leukemia

- Multiple myeloma

- Cytotoxic therapy of malignancies

- Psoriasis

- Disordered ATP metabolism:

- Alcohol (increased ATP turnover)

- Tissue hypoxia

- Excessive dietary purine intake

Reduced excretion:

- Decreased glomerular filtration:

- Renal failure

- Decreased secretion (competition with urate for tubular secretion):

- Lactic acidosis: alcohol, exercise

- Ketoacidosis: alcohol, diabetes, starvation

- Drugs: low-dose salicylate

- Increased reabsorption:

- Hypovolemia, eg diuretics.


Lesch-Nyhan Syndrome



Lesch-Nyhan syndrome (LNS) is a rare genetic disorder characterized by an overproduction of uric acid, neurological disability, and behavioral problems. The symptoms of LNS typically appear between ages 3 and 6 months; the presence of orange-colored crystal-like deposits (orange sand) in the child's diaper is usually the first symptom to appear in those affected with the syndrome.

LNS is caused by a mutation in the HPRT gene on the X chromosome, resulting in a deficiency of the enzyme hypoxanthine-guanine phosphoribosyltransferase (HPRT). HPRT is involved in the recycling of purines. When the body is unable to recycle these purines, there is a dramatic overproduction of uric acid, which then leads to hyperuricemia. Hyperuricemia can result in gouty arthritis, tophi (lumpy deposits of uric acid crystals just under the skin) and kidney stones. LNS has been reported to occur in 1 out of every 100,000 live births. It is estimated that there are only several hundred individuals with the disorder in the United States. LNS has been found equally among all races and ethnic groups; however, because it is an X-linked disorder, nearly all cases are male. LNS can either be inherited or it can occur as a spontaneous (or new) mutation.
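The reason an X-linked recessive disorder appears almost exclusively in males can be seen by enumerating the possible offspring of a carrier mother and an unaffected father. The sketch below is a plain Punnett-square enumeration; the function name and the allele symbols ("X" for the normal allele, "x" for a mutant HPRT allele) are notation introduced here for illustration.

```python
from collections import Counter
from itertools import product


def offspring_fractions(mother, father):
    """Enumerate the four equally likely egg x sperm combinations.

    mother: her two X alleles, e.g. ("X", "x") for a carrier.
    father: his X allele followed by "Y", e.g. ("X", "Y").
    """
    counts = Counter()
    for egg, sperm in product(mother, father):
        if sperm == "Y":
            # Sons have a single X, so one mutant allele is enough.
            counts["affected son" if egg == "x" else "unaffected son"] += 1
        elif egg == "x" and sperm == "x":
            counts["affected daughter"] += 1
        elif "x" in (egg, sperm):
            counts["carrier daughter"] += 1
        else:
            counts["unaffected daughter"] += 1
    total = sum(counts.values())
    return {outcome: n / total for outcome, n in counts.items()}


for outcome, fraction in offspring_fractions(("X", "x"), ("X", "Y")).items():
    print(f"{outcome}: {fraction:.0%}")
```

With a carrier mother and an unaffected father, half of the sons are expected to be affected, while daughters are at most carriers.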

LNS was first described by Michael Lesch, M.D. and William Nyhan, M.D., Ph.D. in 1964 when they reported two affected brothers. The enzymatic defect was discovered by Seegmiller and colleagues in 1967. Finally, the gene responsible for LNS was cloned and sequenced by Friedmann and colleagues in 1985.


Features and Characteristics

The following characteristics have been identified in individuals with LNS:

                    Hyperuricemia (overproduction of uric acid)

                    Urate crystal formation (orange, crystal-like deposits found in the urine, caused by the overproduction of uric acid)

                    Mental retardation (typically in the moderate range)

                    Aggressive and impulsive behaviors (always including self-injurious behaviors)

                    Choreoathetosis (involuntary writhing movements of the arms and legs and purposeless repetitive movements)


                    Dystonia (involuntary spasms and muscle contractions)

                    Ballismus (violent flinging movements of the limbs)

                    Muscle weakness (hypotonia)

                    Speech impairment

                    Hyperreflexia (exaggeration of reflexes)

                    Kidney stones

                    Blood in the urine

                    Pain and swelling in the joints

                    Difficulty swallowing and eating


                    Impaired kidney function




The overproduction of uric acid is often evident in urine studies and uric acid levels in the blood are typically elevated. However, there are many different causes of hyperuricemia (other than LNS) and some patients with LNS actually have serum uric acid levels that fall into the normal range. Therefore, the detection of hyperuricemia in the blood or urine does not provide reliable diagnostic information. Instead, a definitive diagnosis of LNS can be obtained by measuring HPRT enzyme activity in the blood or tissue, or by identifying a mutation in the HPRT gene.


The treatment for LNS is symptomatic:

Hyperuricemia - As mentioned earlier, if hyperuricemia is left untreated, it can result in the production of kidney stones with renal failure, gouty arthritis, and tophi deposits. Therefore, a medication called allopurinol is used to control the overproduction of uric acid, which reduces the risk of developing the symptoms described above.


De Novo Synthesis of Pyrimidine Nucleotides

Because pyrimidines are simpler molecules than purines, their synthesis is also simpler, but it still starts from readily available components. Glutamine's amide nitrogen and carbon dioxide provide atoms 2 and 3 of the pyrimidine ring. They do so, however, after first being converted to carbamoyl phosphate. The other four atoms of the ring are supplied by aspartate. As is true with purine nucleotides, the sugar phosphate portion of the molecule is supplied by PRPP.

Pyrimidine synthesis begins with carbamoyl phosphate synthesized in the cytosol of those tissues capable of making pyrimidines (highest in spleen, thymus, GI tract and testes). This uses a different enzyme than the one involved in urea synthesis. Carbamoyl phosphate synthetase II (CPS II) prefers glutamine to free ammonia and has no requirement for N-acetylglutamate.



Formation of Orotic Acid

Carbamoyl phosphate condenses with aspartate in the presence of aspartate transcarbamylase to yield N-carbamylaspartate which is then converted to dihydroorotate.



In man, CPSII, asp-transcarbamylase, and dihydroorotase activities are part of a multifunctional protein.

Oxidation of the ring by a complex, poorly understood enzyme produces the free pyrimidine, orotic acid. This enzyme is located on the outer face of the inner mitochondrial membrane, in contrast to the other enzymes which are cytosolic. Note the contrast with purine synthesis in which a nucleotide is formed first while pyrimidines are first synthesized as the free base.



Formation of the Nucleotides

Orotic acid is converted to its nucleotide with PRPP. OMP is then converted sequentially - not in a branched pathway - to the other pyrimidine nucleotides.




Decarboxylation of OMP gives UMP. O-PRT and OMP decarboxylase are also a multifunctional protein. After conversion of UMP to the triphosphate, the amide of glutamine is added, at the expense of ATP, to yield CTP.
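The sequential route from carbamoyl phosphate to UMP, together with the enzyme groupings mentioned above, can be written as a short table. This is a sketch for orientation: the names `STEPS`, `MULTIFUNCTIONAL_PROTEINS` and the two functions are hypothetical, and the ring-oxidizing enzyme is left unnamed because the text does not name it.

```python
# (product, enzyme, location), in pathway order, as described in the text.
STEPS = [
    ("carbamoyl phosphate", "CPS II", "cytosol"),
    ("N-carbamylaspartate", "aspartate transcarbamylase", "cytosol"),
    ("dihydroorotate", "dihydroorotase", "cytosol"),
    ("orotic acid", "ring-oxidizing enzyme", "inner mitochondrial membrane"),
    ("OMP", "O-PRT", "cytosol"),
    ("UMP", "OMP decarboxylase", "cytosol"),
]

# Activities the text says are carried on one multifunctional protein in man.
MULTIFUNCTIONAL_PROTEINS = [
    {"CPS II", "aspartate transcarbamylase", "dihydroorotase"},
    {"O-PRT", "OMP decarboxylase"},
]


def same_protein(enzyme_a: str, enzyme_b: str) -> bool:
    """True if both activities belong to the same multifunctional protein."""
    return any(
        enzyme_a in protein and enzyme_b in protein
        for protein in MULTIFUNCTIONAL_PROTEINS
    )


def products_made_outside_cytosol() -> list:
    """Products of steps whose enzyme is not cytosolic."""
    return [product for product, _enzyme, location in STEPS if location != "cytosol"]


print(products_made_outside_cytosol())
print(same_protein("O-PRT", "OMP decarboxylase"))
```

The table makes the one non-cytosolic step stand out: only the oxidation that produces orotic acid takes place at the inner mitochondrial membrane.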


Orotic aciduria refers to an excessive excretion of orotic acid in urine. It causes a characteristic form of anemia and may be associated with mental and physical retardation.

In addition to the characteristic excessive orotic acid in the urine, patients typically have megaloblastic anemia which cannot be cured by administration of vitamin B12 or folic acid.

It also can cause inhibition of RNA and DNA synthesis and failure to thrive. This can lead to mental and physical retardation.

Its hereditary form, an autosomal recessive disorder, can be caused by a deficiency in the enzyme UMPS, a bifunctional protein that includes the enzyme activities of orotate phosphoribosyltransferase and orotidine 5'-phosphate decarboxylase.

It can also arise secondary to blockage of the urea cycle, particularly in ornithine transcarbamylase deficiency (or OTC deficiency). You can distinguish this increase in orotic acid secondary to OTC deficiency from hereditary orotic aciduria (seen above) by looking at blood ammonia levels and the BUN. In OTC deficiency, because the urea cycle backs up, you will see hyperammonemia and a decreased BUN.
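The distinction drawn above can be expressed as a simple decision rule. The sketch below is a teaching aid only, not a diagnostic tool: the function name and the returned phrases are hypothetical, and the rule uses nothing beyond the three findings mentioned in the text (orotic aciduria, blood ammonia, BUN).

```python
def likely_cause(orotic_aciduria: bool, hyperammonemia: bool, low_bun: bool) -> str:
    """Apply the rule of thumb from the text to three laboratory findings."""
    if not orotic_aciduria:
        return "no orotic aciduria"
    if hyperammonemia and low_bun:
        # The urea cycle is backed up, as in OTC deficiency.
        return "secondary to a urea cycle block (e.g. OTC deficiency)"
    if not hyperammonemia and not low_bun:
        return "consistent with hereditary orotic aciduria (UMPS deficiency)"
    return "indeterminate from these findings alone"


print(likely_cause(True, True, True))
print(likely_cause(True, False, False))
```

Mixed findings are reported as indeterminate rather than forced into one of the two categories, since the text gives no rule for them.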

Administration of cytidine monophosphate and uridine monophosphate reduces urinary orotic acid and the anemia.

Administration of uridine, which is converted to UMP, will bypass the metabolic block and provide the body with a source of pyrimidine.

Pyrimidine Catabolism

In contrast to purines, pyrimidines undergo ring cleavage and the usual end products of catabolism are beta-amino acids plus ammonia and carbon dioxide. Pyrimidines from nucleic acids or the energy pool are acted upon by nucleotidases and pyrimidine nucleoside phosphorylase to yield the free bases. The 4-amino group of both cytosine and 5-methyl cytosine is released as ammonia.

Formation of Deoxyribonucleotides

De novo synthesis and most of the salvage pathways involve the ribonucleotides. (Exception is the small amount of salvage of thymine indicated above.) Deoxyribonucleotides for DNA synthesis are formed from the ribonucleotide diphosphates (in mammals and E. coli).

A base diphosphate (BDP) is reduced at the 2' position of the ribose portion using the protein thioredoxin and the enzyme nucleoside diphosphate reductase. Thioredoxin has two sulfhydryl groups which are oxidized to a disulfide bond during the process. In order to restore the thioredoxin to its reduced form so that it can be reused, thioredoxin reductase and NADPH are required.

This system is very tightly controlled by a variety of allosteric effectors. dATP is a general inhibitor for all substrates and ATP an activator. Each substrate then has a specific positive effector (a BTP or dBTP). The result is a maintenance of an appropriate balance of the deoxynucleotides for DNA synthesis.

Synthesis of dTMP

DNA synthesis also requires dTMP (dTTP). This is not synthesized in the de novo pathway and salvage is not adequate to maintain the necessary amount. dTMP is generated from dUMP using the folate-dependent one-carbon pool.

Since the nucleoside diphosphate reductase is not very active toward UDP, CDP is reduced to dCDP which is converted to dCMP. This is then deaminated to form dUMP. In the presence of 5,10-Methylene tetrahydrofolate and the enzyme thymidylate synthetase, the carbon group is both transferred to the pyrimidine ring and further reduced to a methyl group. The other product is dihydrofolate which is subsequently reduced to the tetrahydrofolate by dihydrofolate reductase.

Chemotherapeutic Agents

Thymidylate synthetase is particularly sensitive to availability of the folate one-carbon pool. Some of the cancer chemotherapeutic agents interfere with this process as well as with the steps in purine nucleotide synthesis involving the pool.

Cancer chemotherapeutic agents like methotrexate (4-amino, 10-methyl folic acid) and aminopterin (4-amino, folic acid) are structural analogs of folic acid and inhibit dihydrofolate reductase. This interferes with maintenance of the folate pool and thus of de novo synthesis of purine nucleotides and of dTMP synthesis. Such agents are highly toxic and administered under careful control.
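Why blocking dihydrofolate reductase halts dTMP synthesis can be illustrated with a toy counting model of the folate pool. This is a deliberately crude sketch under stated assumptions: molecules are counted as integers, each dTMP made converts one tetrahydrofolate-derived cofactor to dihydrofolate, and recharging tetrahydrofolate with a one-carbon unit is assumed never to be limiting (the notes do not discuss that step). The function and parameter names are hypothetical.

```python
def dtmp_made(dump_available: int, folate_pool: int, dhfr_active: bool) -> int:
    """Count dTMP molecules made from dUMP given a fixed folate pool.

    dhfr_active=False mimics complete inhibition of dihydrofolate
    reductase, e.g. by methotrexate.
    """
    thf = folate_pool  # tetrahydrofolate available to carry the one-carbon unit
    dhf = 0            # dihydrofolate produced by thymidylate synthetase
    made = 0
    for _ in range(dump_available):
        if dhfr_active and dhf > 0:
            thf += dhf  # dihydrofolate reductase regenerates tetrahydrofolate
            dhf = 0
        if thf == 0:
            break       # the whole pool is trapped as dihydrofolate
        thf -= 1
        dhf += 1
        made += 1
    return made


print(dtmp_made(dump_available=100, folate_pool=5, dhfr_active=True))
print(dtmp_made(dump_available=100, folate_pool=5, dhfr_active=False))
```

With the reductase active the small pool is recycled indefinitely; with it inhibited, synthesis stops once every folate molecule has been converted to dihydrofolate, which is the point made in the paragraph above.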



The Chemical Nature of DNA

The polymeric structure of DNA may be described in terms of monomeric units of increasing complexity. In the top shaded box of the following illustration, the three relatively simple components mentioned earlier are shown. Below that, on the left, formulas for phosphoric acid and a nucleoside are drawn. Condensation polymerization of these leads to the DNA formulation outlined above. Finally, a 5'-monophosphate ester, called a nucleotide, may be drawn as a single monomer unit, shown in the shaded box to the right. Since a monophosphate ester of this kind is a strong acid (pKa of 1.0), it will be fully ionized at the usual physiological pH (ca. 7.4).

Isomeric 3'-monophosphate nucleotides are also known, and both isomers are found in cells. They may be obtained by selective hydrolysis of DNA through the action of nuclease enzymes. Anhydride-like di- and tri-phosphate nucleotides have been identified as important energy carriers in biochemical reactions, the most common being ATP (adenosine 5'-triphosphate).




Names of DNA Base Derivatives



















First, the remaining P-OH function is quite acidic and is completely ionized in biological systems.

Second, the polymer chain is structurally directed. One end (5') is different from the other (3').

Third, although this appears to be a relatively simple polymer, the possible permutations of the four nucleosides in the chain become very large as the chain lengthens.

Fourth, the DNA polymer is much larger than originally believed. Molecular weights for the DNA from multicellular organisms are commonly 10^9 or greater.

Information is stored or encoded in the DNA polymer by the pattern in which the four nucleotides are arranged. To access this information the pattern must be "read" in a linear fashion, just as a bar code is read at a supermarket checkout. Because living organisms are extremely complex, a correspondingly large amount of information related to this complexity must be stored in the DNA. Consequently, the DNA itself must be very large, as noted above. Even the single DNA molecule from an E. coli bacterium is found to have roughly a million nucleotide units in a polymer strand, and would reach a millimeter in length if stretched out. The nuclei of multicellular organisms incorporate chromosomes, which are composed of DNA combined with nuclear proteins called histones. The fruit fly has 8 chromosomes, humans have 46 and dogs 78 (note that the amount of DNA in a cell's nucleus does not correlate with the number of chromosomes). The DNA from the smallest human chromosome is over ten times larger than E. coli DNA, and it has been estimated that the total DNA in a human cell would extend to 2 meters in length if unraveled. Since the nucleus is only about 5μm in diameter, the chromosomal DNA must be packed tightly to fit in that small volume.

In addition to its role as a stable informational library, chromosomal DNA must be structured or organized in such a way that the chemical machinery of the cell will have easy access to that information, in order to make important molecules such as polypeptides. Furthermore, accurate copies of the DNA code must be created as cells divide, with the replicated DNA molecules passed on to subsequent cell generations, as well as to progeny of the organism. The nature of this DNA organization, or secondary structure, will be discussed in a later section.

The high molecular weight nucleic acid, DNA, is found chiefly in the nuclei of complex cells, known as eucaryotic cells, or in the nucleoid regions of procaryotic cells, such as bacteria. It is often associated with proteins that help to pack it in a usable fashion. In contrast, a lower molecular weight, but much more abundant nucleic acid, RNA, is distributed throughout the cell, most commonly in small numerous organelles called ribosomes. Three kinds of RNA are identified, the largest subgroup (85 to 90%) being ribosomal RNA, rRNA, the major component of ribosomes, together with proteins. The size of rRNA molecules varies, but is generally less than a thousandth the size of DNA. The other forms of RNA are messenger RNA, mRNA, and transfer RNA, tRNA. Both have a more transient existence and are smaller than rRNA.

All these RNA's have similar constitutions, and differ from DNA in two important respects. As shown in the following diagram, the sugar component of RNA is ribose, and the pyrimidine base uracil replaces the thymine base of DNA. The RNA's play a vital role in the transfer of information (transcription) from the DNA library to the protein factories called ribosomes, and in the interpretation of that information (translation) for the synthesis of specific polypeptides.

The Secondary Structure of DNA

In the early 1950's the primary structure of DNA was well established, but a firm understanding of its secondary structure was lacking. Indeed, the situation was similar to that occupied by the proteins a decade earlier, before the alpha helix and pleated sheet structures were proposed by Linus Pauling. Many researchers grappled with this problem, and it was generally conceded that the molar equivalences of base pairs (A & T and C & G) discovered by Chargaff would be an important factor. Rosalind Franklin, working at King's College, London, obtained X-ray diffraction evidence that suggested a long helical structure of uniform thickness. Francis Crick and James Watson, at Cambridge University, considered hydrogen bonded base pairing interactions, and arrived at a double stranded helical model that satisfied most of the known facts, and has been confirmed by subsequent findings.

Base Pairing

Careful examination of the purine and pyrimidine base components of the nucleotides reveals that three of them could exist as hydroxy pyrimidine or purine tautomers, having an aromatic heterocyclic ring. Despite the added stabilization of an aromatic ring, these compounds prefer to adopt amide-like structures. These options are shown in the following diagram, with the more stable tautomer drawn in blue.

A simple model for this tautomerism is provided by 2-hydroxypyridine. As shown on the left below, a compound having this structure might be expected to have phenol-like characteristics, such as an acidic hydroxyl group. However, the boiling point of the actual substance is 100º C greater than phenol and its acidity is 100 times less than expected (pKa = 11.7). These differences agree with the 2-pyridone tautomer, the stable form of the zwitterionic internal salt. Note that this tautomerism reverses the hydrogen bonding behavior of the nitrogen and oxygen functions (the N-H group of the pyridone becomes a hydrogen bond donor and the carbonyl oxygen an acceptor).

Once they had identified the favored base tautomers in the nucleosides, Watson and Crick were able to propose a complementary pairing, via hydrogen bonding, of guanosine (G) with cytidine (C) and adenosine (A) with thymidine (T). This pairing, which is shown in the following diagram, explained Chargaff's findings beautifully, and led them to suggest a double helix structure for DNA. Before viewing this double helix structure itself, it is instructive to examine the base pairing interactions in greater detail. The G-C association involves three hydrogen bonds (colored pink), and is therefore stronger than the two-hydrogen bond association of A-T. These base pairings might appear to be arbitrary, but other possibilities suffer destabilizing steric or electronic interactions.
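The difference in pairing strength can be made concrete with a small counting exercise. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes nothing beyond the two facts stated above (three hydrogen bonds per G-C pair, two per A-T pair).

```python
# Hydrogen bonds contributed by each base when it sits in a Watson-Crick pair:
# G-C pairs form three hydrogen bonds, A-T pairs form two (as stated in the text).
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def count_hydrogen_bonds(strand):
    """Total inter-strand hydrogen bonds in a duplex, given one of its strands."""
    return sum(H_BONDS[base] for base in strand.upper())


def gc_fraction(strand):
    """Fraction of positions in the strand occupied by G or C."""
    strand = strand.upper()
    return (strand.count("G") + strand.count("C")) / len(strand)


if __name__ == "__main__":
    for seq in ("ATATAT", "ATGCAT", "GCGCGC"):
        print(seq, count_hydrogen_bonds(seq), round(gc_fraction(seq), 2))
```

For strands of equal length, a higher G-C fraction gives more hydrogen bonds in total, which is one reason G-C rich duplexes are harder to separate.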

A simple mnemonic device for remembering which bases are paired comes from the line construction of the capital letters used to identify the bases. A and T are made up of intersecting straight lines. In contrast, C and G are largely composed of curved lines. The RNA base uracil corresponds to thymine, since U follows T in the alphabet.

The Double Helix Structure for DNA

After many trials and modifications, Watson and Crick conceived an ingenious double helix model for the secondary structure of DNA. Two strands of DNA were aligned anti-parallel to each other, i.e. with opposite 3' and 5' ends, as shown in part a of the following diagram. Complementary primary nucleotide structures for each strand allowed inter-strand hydrogen bonding between each pair of bases. These complementary strands are colored red and green in the diagram. Coiling these coupled strands then leads to a double helix structure, shown as cross-linked ribbons in part b of the diagram. The double helix is further stabilized by hydrophobic attractions and pi-stacking of the bases. A space-filling molecular model of a short segment is displayed in part c on the right.
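Because the two strands are complementary and anti-parallel, the sequence of one strand fixes the sequence of the other. A minimal Python sketch of this relationship follows; the function name is hypothetical, and both strands are written in the conventional 5' to 3' direction.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand_5_to_3):
    """Return the partner strand, also written 5' to 3'.

    Each base is replaced by its partner, and the result is reversed
    because the partner strand runs in the opposite direction.
    """
    paired = [COMPLEMENT[base] for base in strand_5_to_3.upper()]
    return "".join(reversed(paired))


if __name__ == "__main__":
    print(complementary_strand("ATGCCG"))  # CGGCAT
```

Applying the function twice returns the original sequence, which reflects the fact that each strand is a complete description of the other.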

The helix shown here has ten base pairs per turn, and rises 3.4 Å per base pair (34 Å in each turn). This right-handed helix is the favored conformation in aqueous systems, and has been termed the B-helix. As the DNA strands wind around each other, they leave gaps between each set of phosphate backbones. Two alternating grooves result, a wide and deep major groove (ca. 22 Å wide), and a shallow and narrow minor groove (ca. 12 Å wide).
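The helix dimensions allow a rough estimate of how long a stretched-out DNA molecule would be. The sketch below is an illustration only; it assumes the standard B-helix values of ten base pairs per turn and a rise of 3.4 Å per base pair, and the function names are hypothetical.

```python
# Assumed B-helix geometry: 10 base pairs per turn, 3.4 angstrom rise per base pair.
BP_PER_TURN = 10
RISE_PER_BP_ANGSTROM = 3.4
ANGSTROM_IN_METERS = 1e-10


def helix_turns(n_base_pairs):
    """Number of helical turns in a duplex of the given length."""
    return n_base_pairs / BP_PER_TURN


def contour_length_m(n_base_pairs):
    """End-to-end length in meters of a fully extended B-helix."""
    return n_base_pairs * RISE_PER_BP_ANGSTROM * ANGSTROM_IN_METERS


if __name__ == "__main__":
    n = 1_000_000  # a polymer of one million base pairs
    print(f"{helix_turns(n):.0f} turns, {contour_length_m(n) * 1e3:.2f} mm")
```

For a million base pairs this gives a few tenths of a millimeter, the same order of magnitude as the millimeter-scale estimate quoted earlier for bacterial DNA.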


Space-Filling Molecular Model

Other molecules, including polypeptides, may insert into these grooves, and in so doing perturb the chemistry of DNA. Other helical structures of DNA have also been observed, and are designated by letters (e.g. A and Z).

Deoxyribonucleic acid (DNA) consists of covalently linked chains of deoxyribonucleotides, and ribonucleic acid (RNA) consists of chains of ribonucleotides. DNA and RNA share a number of chemical and physical properties because in both of them the successive nucleotide units are covalently linked in identical fashion by phosphodiester bridges formed between the 5'-hydroxyl group of one nucleotide and the 3'-hydroxyl group of the next. Thus the backbone of both DNA and RNA consists of alternating phosphate and pentose groups, in which phosphodiester bridges provide the covalent continuity. The purine and pyrimidine bases of the nucleotide units are not present in the backbone structure but constitute distinctive side chains, just as the R groups of amino acid residues are the distinctive side chains of polypeptides.


The three major types of ribonucleic acid in cells are called messenger RNA (mRNA), ribosomal RNA (rRNA), and transfer RNA (tRNA). Although all three types occur as single polyribonucleotide strands, each type has a characteristic range of molecular weight and sedimentation coefficient. Moreover, each of the three major kinds of RNA occurs in multiple molecular forms. Ribosomal RNA of any given biological species exists in three or more major forms, transfer RNA in as many as 60 forms, and messenger RNA in hundreds and perhaps thousands of distinctive forms. Most cells contain 2 to 8 times as much RNA as DNA.

Messenger RNA

Messenger RNA contains only the four major bases. It is synthesized in the nucleus during the process of transcription, in which the sequence of bases in one strand of the chromosomal DNA is enzymatically transcribed in the form of a single strand of mRNA; some mRNA is also made in the mitochondria. The sequence of bases of the mRNA strand so formed is complementary to that of the DNA strand being transcribed. Each of the thousands of different proteins synthesized by the cell is coded by a specific mRNA or segment of an mRNA molecule.

Transfer RNAs

Transfer RNAs are relatively small molecules that act as carriers of specific individual amino acids during protein synthesis on the ribosomes. Each of the 20 amino acids found in proteins has at least one corresponding tRNA, and some have multiple tRNAs.

Ribosomal RNA

Ribosomal RNA (rRNA) constitutes up to 65 percent of the mass of ribosomes. The rRNAs make up a large fraction of total cellular RNA; they form the structural framework of the ribosome and catalyze peptide bond formation. A few of the bases in rRNAs are methylated.


Analysis of Structural Similarities and Differences between DNA and RNA

1. Background

We know that living organisms have the ability to reproduce and to pass many of their characteristics on to their offspring. From this we may infer that all organisms have genetic substances and an associated chemistry that enable inheritance to occur. It is instructive to consider the essential requirements such genetic materials must fulfill.

Biologically useful information, especially instructions for protein synthesis, must be incorporated in the material.

The inherited information must be stable (unchanged) over the lifetime of the organism if accurate copies are to be conveyed to the offspring. Infrequent changes may take place (see mutability).

A method of faithfully replicating the information encoded in the material, and transmitting this copy to the offspring must exist.

Despite the inherent stability noted above, the material must be capable of incorporating stable structural change, and passing this change on to succeeding generations.

Since this genetic substance has been identified as the nucleic acids DNA and RNA, it is instructive to examine the manner in which these polymers satisfy the above requirements.

2. Information Storage

The complexity of life suggests that even simple organisms will require very large inheritance libraries. Although the four nucleotides that make up DNA might appear to be too simple for this task, the enormous size of the polymer and the permutations of the monomers within the chain meet the challenge easily. After all, the words and graphics in this document are all presented to the computer as combinations of only two characters, zeros and ones (the binary number system). DNA has four letters in its alphabet (A, C, G & T), so the number of words that can be formed increases exponentially with the number of letters per word. Thus, there are 4^2 or 16 two-letter words, and 4^3 or 64 three-letter words.
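The exponential growth in the number of possible words is easy to tabulate. The short Python sketch below is an illustration with a hypothetical helper name; it compares the four-letter DNA alphabet with the two-character binary alphabet mentioned above.

```python
def word_count(alphabet_size, word_length):
    """Number of distinct words of a given length over a given alphabet."""
    return alphabet_size ** word_length


if __name__ == "__main__":
    print("length  DNA words  binary words")
    for length in range(1, 7):
        print(f"{length:6d}  {word_count(4, length):9d}  {word_count(2, length):12d}")
```

Since 4 = 2 x 2, a DNA word of n letters can encode exactly as many alternatives as a binary word of 2n digits; each base carries the equivalent of two binary digits.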

Assuring the stability of information encoded by the DNA alphabet presents a serious challenge. If the letters of this alphabet are to be strung together in a specific way on the polymer chain, chemical reactions for attaching (and removing) them must be available. Simple carboxylic ester or amide links might appear suitable for this purpose (note step-growth polymerization), but these are used in lipids and polypeptides, so a separate enzymatic machinery would be needed to keep the information processing operations apart from other molecular transformations. The overall stability of such covalent links presents a more serious problem. Under physiological conditions (aqueous, pH near 7.4 & 27 to 37º C) esters are slowly hydrolyzed. Amides are more stable, but even a hydrolytic cleavage of one bond per hour would be devastating to a polymer having tens of thousands to millions of such links. Furthermore, short difunctional linking groups, such as carbonates, oxalates and malonates, show enhanced reactivity, and their parent acids are unstable or toxic.

1. DNA Replication

In their 1953 announcement of a double helix structure for DNA, Watson and Crick stated, "It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material." The essence of this suggestion is that, if separated, each strand of the molecule might act as a template on which a new complementary strand might be assembled, leading finally to two identical DNA molecules. Indeed, replication does take place in this fashion when cells divide, but the events leading up to the actual synthesis of complementary DNA strands are sufficiently complex that they will not be described in any detail.

As depicted in the following drawing, the DNA of a cell is tightly packed into chromosomes. First, the DNA is wrapped around small proteins called histones (colored pink below). These bead-like structures are then further organized and folded into chromatin aggregates that make up the chromosomes. An overall packing efficiency of 7,000 or more is thus achieved. Clearly a sequence of unfolding events must take place before the information encoded in the DNA can be used or replicated.


Because of the strategic importance of DNA, all living organisms must possess the following features: (1) rapid and accurate DNA synthesis and (2) genetic stability provided by effective DNA repair mechanisms.


The prokaryotic replication process contains several basic steps, each of which requires certain enzyme activities:

1. DNA uncoiling. As their name implies, the helicases are ATP-requiring enzymes that catalyze the unwinding of duplex DNA.

2. Primer synthesis. The formation of short RNA segments called primers, which are required for the initiation of DNA replication, is catalyzed by primase.

3. DNA synthesis. The synthesis of a complementary DNA strand by creating phosphodiester linkages between nucleotides base paired to a template strand is catalyzed by large multienzyme complexes referred to as the DNA polymerases. DNA polymerase III (pol III), the major DNA-synthesizing enzyme, is composed of at least ten different subunits. DNA polymerase I (pol I) is a DNA repair enzyme. (Pol I is also believed to play a role in the timely removal of RNA primer.) The function of DNA polymerase II (pol II) is not understood. In addition to a 5'→3' polymerizing activity, all three enzymes possess a 3'→5' exonuclease activity. (An exonuclease is an enzyme that removes nucleotides from an end of a polynucleotide strand.) Pol I also possesses a 5'→3' exonuclease activity.

4. Joining of DNA fragments. Discontinuous DNA synthesis (described below) requires an enzyme, referred to as a ligase, that joins the newly synthesized segments.

5. Supercoiling control. The tangling of DNA strands, which can prevent further unwinding of the double helix, is prevented by the DNA topoisomerases. Tangling is a very real possibility, since the double helix unwinds rapidly (as many as 50 revolutions per second during bacterial DNA replication). Topoisomerases are enzymes that alter the linking number of closed duplex DNA molecules. The terms "topoisomerase" and "topoisomers" (circular DNA molecules that differ only in their linking numbers) are derived from "topology," a form of mathematics that investigates the properties of geometric structures that do not change with bending or stretching.


When DNA replication was first observed experimentally (with the aid of electron microscopy and autoradiography), investigators were confronted with a paradox. The bidirectional synthesis of DNA as it appeared in their research seemed to indicate that continuous synthesis occurs in the 5'→3' direction on one strand and in the 3'→5' direction on the other strand. (Recall that the DNA double helix has an antiparallel configuration.) However, all the enzymes that catalyze DNA synthesis do so in the 5'→3' direction only. It was later determined that only one strand, referred to as the leading strand, is continuously synthesized in the 5'→3' direction. The other strand, referred to as the lagging strand, is also synthesized in the 5'→3' direction but in a series of small pieces. (Reiji Okazaki and his colleagues provided the experimental evidence for discontinuous DNA synthesis.) Subsequently, these pieces (now called Okazaki fragments) are covalently linked together by DNA ligase. (In prokaryotes such as E. coli, Okazaki fragments possess between 1000 and 2000 nucleotides.)

The replication fork moves forward as the helicases, assisted by topoisomerases (especially DNA gyrase), unwind the helix. The single strands are kept separate by the binding of numerous copies of single-stranded DNA binding protein (SSB). (SSB may also protect vulnerable ssDNA segments from attack by various nucleases.) Before pol III can initiate DNA synthesis, however, an RNA primer must be present. On the leading strand, where DNA synthesis is continuous, primer formation occurs only once per replication fork. In contrast, the discontinuous synthesis on the lagging strand requires primer synthesis for each of the Okazaki fragments. A multienzyme complex containing primase and several other proteins, called the primosome, travels along the lagging strand. At intervals the primosome stops and reverses direction while it synthesizes a short RNA primer. Subsequently, pol III synthesizes DNA beginning at the 3' end of the primer. After most of the lagging strand synthesis is complete, the RNA primers are removed and replaced by DNA segments synthesized by pol I. Finally, the Okazaki fragments are joined by DNA ligase.

It appears that the synthesis of the leading and lagging strands is coupled. The tandem operation of two pol III complexes requires that one strand (the lagging strand) be looped around the replisome. (The term "replisome" is used to describe the set of enzymes and other molecules required for DNA synthesis at a single replication fork.) When the pol III complex that copies the lagging strand completes an Okazaki fragment, it releases the duplex DNA. Once it does so, the primosome moves in and synthesizes another RNA primer.

Despite the complexity of DNA replication, as well as its rate (as high as 1000 base pairs per second per replication fork), this process is amazingly accurate (approximately one error per 10^9 to 10^10 base pairs per generation).

Replication ends when the replication forks collide on the other side of the circular chromosome. The subsequent separation of the two daughter DNA molecules is not understood, although a type II topoisomerase is believed to be involved.

Unique features of eukaryotic DNA synthesis

1. Timing of replication. In contrast to rapidly growing bacterial cells, in which replication occurs throughout most of the cell division cycle, eukaryotic replication is limited to a specific time period referred to as the S phase. It is now known that eukaryotic cells produce certain proteins that regulate phase transitions within the cell cycle.

2. Replication rate. DNA replication is significantly slower in eukaryotes than in prokaryotes. The eukaryotic rate is approximately 50 nucleotides per second per replication fork. (Recall that the rate in prokaryotes is ten or more times higher.) This discrepancy is presumably due, in part, to the complex structure of chromatin.

3. Replicons. Despite the relative slowness of DNA synthesis, the replication process is relatively brief, considering the large sizes of eukaryotic genomes. For example, on the basis of the replication rate mentioned above, the replication of an average eukaryotic chromosome (approximately 150 million base pairs) should take over a month to complete. Instead, this process usually requires several hours. Eukaryotes have compressed the replication of their large genomes into a short time period with the use of multiple replicons.

4. Okazaki fragments. At between 100 and 200 nucleotides in length, the Okazaki fragments of eukaryotes are significantly shorter than those that occur in prokaryotes.
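The "over a month" estimate in item 3 can be checked with simple arithmetic. The Python sketch below uses the figures quoted above (150 million base pairs, 50 nucleotides per second per fork); the function names are hypothetical, the eight-hour target is an assumed stand-in for "several hours", and all forks are assumed to run simultaneously for the whole period, which is a simplification.

```python
import math

SECONDS_PER_DAY = 86_400


def replication_time_s(chromosome_bp, rate_nt_per_s, n_forks=1):
    """Seconds needed to copy a chromosome with n_forks working in parallel."""
    return chromosome_bp / (rate_nt_per_s * n_forks)


def forks_needed(chromosome_bp, rate_nt_per_s, target_seconds):
    """Smallest number of simultaneous forks that meets the target time."""
    return math.ceil(chromosome_bp / (rate_nt_per_s * target_seconds))


if __name__ == "__main__":
    chromosome_bp = 150e6   # average eukaryotic chromosome (from the text)
    rate = 50               # nucleotides per second per fork (from the text)
    single_fork_days = replication_time_s(chromosome_bp, rate) / SECONDS_PER_DAY
    print(f"single fork: {single_fork_days:.1f} days")
    # Assumption: "several hours" is taken here as 8 hours.
    print("forks for 8 h:", forks_needed(chromosome_bp, rate, 8 * 3600))
```

A single fork would need roughly five weeks, whereas on the order of a hundred forks working in parallel bring the time down to hours, which is the role played by multiple replicons.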


Once the double stranded DNA is exposed, a group of enzymes act to accomplish its replication. These are described briefly here:

Topoisomerase: This enzyme initiates unwinding of the double helix by cutting one of the strands.

Helicase: This enzyme assists the unwinding. Note that many hydrogen bonds must be broken if the strands are to be separated.

SSB: A single-strand binding protein stabilizes the separated strands, and prevents them from recombining, so that the polymerization chemistry can function on the individual strands.

DNA Polymerase: This family of enzymes links together nucleotide triphosphate monomers as they hydrogen bond to complementary bases. These enzymes also check for errors (roughly ten per billion), and make corrections.

Ligase: Small unattached DNA segments on a strand are united by this enzyme.

The DNA polymerization process that builds the complementary strands in replication, could in principle take place in two ways. Referring to the general equation above, R1 could represent the next nucleotide unit to be attached to the growing DNA strand, with R2 being this strand. Alternatively, these assignments could be reversed. In practice, the former proves to be the best arrangement. Since triphosphates are very reactive, the lifetime of such derivatives in an aqueous environment is relatively short. However, such derivatives of the individual nucleosides are repeatedly synthesized by the cell for a variety of purposes, providing a steady supply of these reagents. In contrast, the growing DNA segment must maintain its functionality over the entire replication process, and can not afford to be changed by a spontaneous hydrolysis event. As a result, these chemical properties are best accommodated by a polymerization process that proceeds at the 3'-end of the growing strand by 5'-phosphorylation involving a nucleotide triphosphate.


The polymerization mechanism described here is constant. It always extends the developing DNA segment toward the 3'-end (i.e. when a nucleotide triphosphate attaches to the free 3'-hydroxyl group of the strand, a new 3'-hydroxyl is generated). There is sometimes confusion on this point, because the original DNA strand that serves as a template is read from the 3'-end toward the 5'-end, and authors may not be completely clear as to which terminology is used.

Because of the directional demand of the polymerization, one of the DNA strands is easily replicated in a continuous fashion, whereas the other strand can only be replicated in short segmental pieces. This is illustrated in the following diagram. Separation of a portion of the double helix takes place at a site called the replication fork. As replication of the separate strands occurs, the replication fork moves away (to the left in the diagram), unwinding additional lengths of DNA. Since the fork in the diagram is moving toward the 5'-end of the red-colored strand, replication of this strand may take place in a continuous fashion (building the new green strand in a 5' to 3' direction). This continuously formed new strand is called the leading strand. In contrast, the replication fork moves toward the 3'-end of the original green strand, preventing continuous polymerization of a complementary new red strand. Short segments of complementary DNA, called Okazaki fragments, are produced, and these are linked together later by the enzyme ligase. This new DNA strand is called the lagging strand.

When you consider that a human cell has roughly 10^9 base pairs in its DNA, and may divide into identical daughter cells in 14 to 24 hours, the efficiency of DNA replication must be extraordinary. The procedure described above will replicate about 50 nucleotides per second, so there must be many thousand such replication sites in action during cell division. A given length of double stranded DNA may undergo strand unwinding at numerous sites in response to promoter actions. The unraveled "bubble" of single stranded DNA has two replication forks, so assembly of new complementary strands may proceed in two directions. The polymerizations associated with several such bubbles fuse together to achieve full replication of the entire DNA double helix.

2. Repair of DNA Damage and Replication Errors

One of the benefits of the double stranded DNA structure is that it lends itself to repair, when structural damage or replication errors occur. Several kinds of chemical change may cause damage to DNA:

                    Spontaneous hydrolysis of a nucleoside removes the heterocyclic base component.

                    Spontaneous hydrolysis of cytosine changes it to a uracil.

                    Various toxic metabolites may oxidize or methylate heterocyclic base components.

                    Ultraviolet light may dimerize adjacent cytosine or thymine bases.

All these transformations disrupt base pairing at the site of the change, and this produces a structural deformation in the double helix. Inspection-repair enzymes detect such deformations, and use the undamaged nucleotide at that site as a template for replacing the damaged unit. These repairs reduce errors in DNA structure from about one in ten million to one per trillion.

RNA and Protein Synthesis

The genetic information stored in DNA molecules is used as a blueprint for making proteins. Why proteins? Because these macromolecules have diverse primary, secondary and tertiary structures that equip them to carry out the numerous functions necessary to maintain a living organism. As noted in the protein chapter, these functions include:

                    Structural integrity (hair, horn, eye lenses etc.).

                    Molecular recognition and signaling (antibodies and hormones).

                    Catalysis of reactions (enzymes).

                    Molecular transport (hemoglobin transports oxygen).

                    Movement (pumps and motors).

The critical importance of proteins in life processes is demonstrated by numerous genetic diseases, in which small modifications in primary structure produce debilitating and often disastrous consequences. Such genetic diseases include Tay-Sachs, phenylketonuria (PKU), sickle cell anemia, achondroplasia, and Parkinson disease. The unavoidable conclusion is that proteins are of central importance in living cells, and that proteins must therefore be continuously prepared with high structural fidelity by appropriate cellular chemistry.

Early geneticists identified genes as hereditary units that determined the appearance and / or function of an organism (i.e. its phenotype). We now define genes as sequences of DNA that occupy specific locations on a chromosome. The original proposal that each gene controlled the formation of a single enzyme has since been modified as: one gene = one polypeptide. The intriguing question of how the information encoded in DNA is converted to the actual construction of a specific polypeptide has been the subject of numerous studies, which have created the modern field of Molecular Biology.

1. The Central Dogma and Transcription

Francis Crick proposed that information flows from DNA to RNA in a process called transcription, and is then used to synthesize polypeptides by a process called translation. Transcription takes place in a manner similar to DNA replication. A characteristic sequence of nucleotides marks the beginning of a gene on the DNA strand, and this region binds to a promoter protein that initiates RNA synthesis. The double stranded structure unwinds at the promoter site, and one of the strands serves as a template for RNA formation, as depicted in the following diagram. The RNA molecule thus formed is single stranded, and serves to carry information from DNA to the protein synthesis machinery called ribosomes. These RNA molecules are therefore called messenger-RNA (mRNA).
To summarize: a gene is a stretch of DNA that contains a pattern for the amino acid sequence of a protein. In order to actually make this protein, the relevant DNA segment is first copied into messenger-RNA. The cell then synthesizes the protein, using the mRNA as a template.

An important distinction must be made here. One of the DNA strands in the double helix holds the genetic information used for protein synthesis. This is called the sense strand, or information strand (colored red above). The complementary strand that binds to the sense strand is called the anti-sense strand (colored green), and it serves as a template for generating a mRNA molecule that delivers a copy of the sense strand information to a ribosome. The promoter protein binds to a specific nucleotide sequence that identifies the sense strand, relative to the anti-sense strand. RNA synthesis is then initiated in the 5' to 3' direction, as nucleotide triphosphates bind to complementary bases on the template strand, and are joined by phosphate diester linkages. An animation of this process for DNA replication was presented earlier. A characteristic "stop sequence" of nucleotides terminates the RNA synthesis. The messenger molecule (colored orange above) is released into the cytoplasm to find a ribosome, and the DNA then rewinds to its double helix structure.
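The relationship between the sense strand, the anti-sense (template) strand and the mRNA can be illustrated with a short Python sketch. The sequences and function names here are invented for illustration; the only rules assumed are complementary base pairing and the replacement of thymine by uracil in RNA.

```python
# RNA base that pairs with each DNA template base (uracil pairs with adenine).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the mRNA (5' to 3') from the template strand read 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def mrna_from_sense(sense_5_to_3):
    """The mRNA carries the sense-strand sequence with U in place of T."""
    return sense_5_to_3.upper().replace("T", "U")


if __name__ == "__main__":
    sense = "ATGGCC"      # hypothetical sense strand, 5' to 3'
    template = "TACCGG"   # its anti-sense partner, written 3' to 5'
    print(transcribe(template))    # AUGGCC
    print(mrna_from_sense(sense))  # AUGGCC
```

Both routes give the same mRNA, which shows why the messenger is described as a copy of the sense strand even though it is assembled on the anti-sense strand.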

In eucaryotic cells the initially transcribed mRNA molecule is usually modified and shortened by an "editing" process that removes irrelevant material. The DNA of such organisms is often thousands of times larger and more complex than that composing the single chromosome of a procaryotic bacterial cell. This difference is due in part to repetitive nucleotide sequences (ca. 25% in the human genome). Furthermore, over 95% of human DNA is found in intervening sequences that separate genes and parts of genes. The informational DNA segments that make up genes are called exons, and the noncoding segments are called introns. Before the mRNA molecule leaves the nucleus, the nonsense bases that make up the introns are cut out, and the informationally useful exons are joined together in a step known as RNA splicing. In this fashion shorter mRNA molecules carrying the blueprint for a specific protein are sent on their way to the ribosome factories.
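The effect of splicing on a transcript can be sketched in a few lines of Python. This is a toy model only: the sequence and exon coordinates are invented, the function name is hypothetical, and the real process is carried out by cellular machinery that recognizes the intron boundaries rather than by fixed positions.

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a transcript, discarding everything else.

    `exons` is a list of (start, end) positions, 0-based with the end
    excluded, in the style of Python slices.
    """
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


if __name__ == "__main__":
    # Hypothetical transcript: exon 1, a 12-base intron, exon 2.
    pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
    exons = [(0, 6), (18, 24)]
    mature = splice(pre_mrna, exons)
    print(mature)                      # AUGGCCUUUUAA
    print(len(pre_mrna), len(mature))  # 24 12
```

The mature message is shorter than the initial transcript, and only the exon sequences remain, in their original order.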

The Central Dogma of molecular biology, which at first was formulated as a simple linear progression of information from DNA to RNA to Protein, is summarized in the following illustration. The replication process on the left consists of passing information from a parent DNA molecule to daughter molecules. The middle transcription process copies this information to a mRNA molecule. Finally, this information is used by the chemical machinery of the ribosome to make polypeptides.






As more has been learned about these relationships, the central dogma has been refined to the representation displayed on the right. The dark blue arrows show the general, well demonstrated, information transfers noted above. It is now known that an RNA-dependent DNA polymerase enzyme, known as a reverse transcriptase, is able to transcribe a single-stranded RNA sequence into double-stranded DNA (magenta arrow). Such enzymes are found in all cells and are an essential component of retroviruses (e.g. HIV), which require RNA replication of their genomes (green arrow). Direct translation of DNA information into protein synthesis (orange arrow) has not yet been observed in a living organism. Finally, proteins appear to be an informational dead end, and do not provide a structural blueprint for either RNA or DNA.

In the following section the last fundamental relationship, that of structural information translation from mRNA to protein, will be described.

2. Translation

Translation is a more complex process than transcription. This would, of course, be expected. After all, the coded messages produced by the German Enigma machine could be copied easily, but required a considerable decoding effort before they could be read with understanding. In a similar sense, DNA replication is simply a complementary base pairing exercise, but the translation of the four letter (bases) alphabet code of RNA to the twenty letter (amino acids) alphabet of protein literature is far from trivial. Clearly, there could not be a direct one-to-one correlation of bases to amino acids, so the nucleotide letters must form short words or codons that define specific amino acids. Many questions pertaining to this genetic code were posed in the late 1950's:

How many RNA nucleotide bases designate a specific amino acid?
If separate groups of nucleotides, called codons, serve this purpose, at least three are needed. There are 4³ = 64 different nucleotide triplets, compared with 4² = 16 possible pairs.
Are the codons linked separately or do they overlap?
Sequentially joined triplet codons will result in a nucleotide chain three times longer than the protein it describes. If overlapping codons are used then fewer total nucleotides would be required.
If triplet segments of mRNA designate specific amino acids in the protein, how are the codons identified?
For the sequence ~CUAGGU~ are the codons CUA & GGU or ~C, UAG & GU~ or ~CU, AGG & U~?
Are all the codon words the same size?
In Morse code the most widely used letters are shorter than less common letters. Perhaps nature employs a similar scheme.
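The framing question raised above can be made concrete with a short sketch. The Python snippet below splits the example sequence into complete triplets for each of the three possible starting offsets. The function name `split_into_codons` is illustrative only and does not come from the source.

```python
def split_into_codons(seq, offset):
    """Return the complete triplets of seq when reading starts at offset (0, 1 or 2)."""
    # Incomplete fragments at either end of the sequence are ignored.
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


# The three possible framings of the example sequence from the text.
for offset in range(3):
    print(offset, split_into_codons("CUAGGU", offset))
# 0 ['CUA', 'GGU']
# 1 ['UAG']
# 2 ['AGG']
```

Only one of the three framings recovers the intended message, which is why a defined starting point is needed.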

Physicists and mathematicians, as well as chemists and microbiologists all contributed to unravelling the genetic code. Although earlier proposals assumed efficient relationships that correlated the nucleotide codons uniquely with the twenty fundamental amino acids, it is now apparent that there is considerable redundancy in the code as it now operates. Furthermore, the code consists exclusively of non-overlapping triplet codons. Clever experiments provided some of the earliest breaks in deciphering the genetic code. Marshall Nirenberg found that RNA from many different organisms could initiate specific protein synthesis when combined with broken E. coli cells (the enzymes remain active). A synthetic polyuridine RNA induced synthesis of poly-phenylalanine, so the UUU codon designated phenylalanine. Likewise an alternating ~CACA~ RNA led to synthesis of a ~His-Thr-His-Thr~ polypeptide.
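The logic of these synthetic-RNA experiments can be sketched in a few lines of Python. The lookup table below holds only the three codon assignments mentioned in this paragraph, and the helper name `read_triplets` is hypothetical. The sketch shows why an alternating CACA polymer gives an alternating His-Thr product whichever base the reading starts from.

```python
# Only the three codon assignments named in the text are included here.
CODONS = {"UUU": "Phe", "CAC": "His", "ACA": "Thr"}


def read_triplets(rna, start=0):
    """Read rna as non-overlapping triplets beginning at index start."""
    triplets = [rna[i:i + 3] for i in range(start, len(rna) - 2, 3)]
    return "-".join(CODONS[t] for t in triplets)


print(read_triplets("U" * 12))           # Phe-Phe-Phe-Phe
print(read_triplets("CA" * 6))           # His-Thr-His-Thr
print(read_triplets("CA" * 6, start=1))  # Thr-His-Thr
```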


RNA Codons for Protein Synthesis

The following table presents the present day interpretation of the genetic code. Note that this is the RNA alphabet, and an equivalent DNA codon table would have all the U nucleotides replaced by T. Methionine and tryptophan are uniquely represented by a single codon. At the other extreme, leucine is represented by six codons. The average redundancy for the twenty amino acids is about three. Also, there are three stop codons that terminate polypeptide synthesis.

The translation process is fundamentally straightforward. The mRNA strand bearing the transcribed code for synthesis of a protein interacts with relatively small RNA molecules (about 70 nucleotides) to which individual amino acids have been attached by an ester bond at the 3'-end.






These transfer RNAs (tRNAs) have distinctive three-dimensional structures consisting of loops of single-stranded RNA connected by double stranded segments. This cloverleaf secondary structure is further wrapped into an "L-shaped" assembly, having the amino acid at the end of one arm, and a characteristic anti-codon region at the other end. The anti-codon consists of a nucleotide triplet that is the complement of the amino acid's codon(s). Models of two such tRNA molecules are shown to the right. When read from the top to the bottom, the anti-codons depicted here should complement a codon in the previous table.
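The codon/anti-codon relationship can be illustrated with a minimal Python sketch. It assumes strict Watson-Crick pairing and ignores wobble pairing and modified bases, both of which occur in real tRNAs. Because the two strands pair in antiparallel fashion, the anti-codon written 5' to 3' is the reverse complement of the codon. The function name `anticodon` is illustrative.

```python
# Watson-Crick pairing partners in RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon):
    """Return the anti-codon (5'->3') pairing with an mRNA codon (5'->3').

    Simplifying assumption: strict Watson-Crick pairing, no wobble.
    """
    return "".join(COMPLEMENT[base] for base in reversed(codon))


print(anticodon("AUG"))  # CAU
print(anticodon("UUU"))  # AAA
```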

A cell's protein synthesis takes place in organelles called ribosomes. Ribosomes are complex structures made up of two distinct and separable subunits (one about twice the size of the other). Each subunit is composed of one or two RNA molecules (60-70%) associated with 20 to 40 small proteins (30-40%). The ribosome accepts a mRNA molecule, binding initially to a characteristic nucleotide sequence at the 5'-end.

This unique binding assures that polypeptide synthesis starts at the right codon. A tRNA molecule with the appropriate anti-codon then attaches at the starting point and this is followed by a series of adjacent tRNA attachments, peptide bond formation and shifts of the ribosome along the mRNA chain to expose new codons to the ribosomal chemistry.

The genetic code

It became apparent during the early phase of the investigation of protein synthesis that translation is fundamentally different from the transcription process that precedes it. During transcription the "language" of DNA sequences is converted to the closely related dialect of RNA sequences. During protein synthesis, however, a nucleic acid base sequence is converted to a clearly different language (i.e., an amino acid sequence), hence the use of the term "translation." Because mRNA and amino acid molecules have no natural affinity for each other, it became obvious to researchers (e.g., Francis Crick) that a series of adaptor molecules are required to mediate the translation process. This role was eventually assigned to tRNA molecules.

The genetic code can be described as a coding dictionary that specifies a meaning for each specific base sequence. Once the importance of the genetic code was recognized, investigators began to speculate about its dimensions. Because only four different bases (G, C, A, and U) occur in mRNA and 20 amino acids must be specified, it appeared obvious that more than one base coded for each amino acid. A sequence of two bases would specify only a total of 16 amino acids (i.e., 4² = 16). However, a three-base sequence provides more than sufficient base combinations for translation to occur (i.e., 4³ = 64).
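The counting argument can be checked directly. The following Python sketch finds the shortest fixed codon length that can distinguish 20 amino acids with a four-letter alphabet. The function name `min_codon_length` is an illustrative choice, not a term from the source.

```python
def min_codon_length(n_bases, n_amino_acids):
    """Smallest n for which n_bases ** n >= n_amino_acids."""
    n = 1
    while n_bases ** n < n_amino_acids:
        n += 1
    return n


for length in (1, 2, 3):
    print(length, 4 ** length)       # 1 4, then 2 16, then 3 64

print(min_codon_length(4, 20))       # 3
```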

The first major breakthrough in assigning mRNA triplet base sequences (later referred to as codons) came in 1961, when Marshall Nirenberg performed a series of experiments using an artificial test system containing an extract of Escherichia coli fortified with nucleotides, amino acids, ATP, and GTP. He showed that poly U (a synthetic polynucleotide whose base components consist only of uracil) directed the synthesis of polyphenylalanine. Assuming that codons consist of a three-base sequence, Nirenberg surmised that UUU codes for the amino acid phenylalanine. Subsequently, the experiment was repeated using poly A and poly C. Because polylysine and polyproline products resulted from these tests, the codons AAA and CCC were assigned to lysine and proline, respectively.


The codon assignments for the 64 possible trinucleotide sequences are presented in the codon table. Of these, 61 code for amino acids. The remaining three codons (UAA, UAG, and UGA) are stop (polypeptide chain terminating) signals. AUG, the codon for methionine, also serves as a start signal (sometimes referred to as the initiating codon).

As a result of a variety of investigations, the genetic code is now believed to possess the following properties:

1. Degenerate. Any coding system in which several signals have the same meaning is said to be degenerate. The genetic code is partially degenerate because most amino acids are coded for by several codons. For example, leucine is coded for by six different codons (UUA, UUG, CUU, CUC, CUA, and CUG). In fact, methionine (AUG) and tryptophan (UGG) are the only amino acids that are coded for by a single codon.

2. Specific. Each codon is a signal for a specific amino acid. The majority of codons that code for the same amino acid possess similar sequences. For example, in four of the six serine codons (UCU, UCC, UCA, and UCG) the first and second bases are identical. It would appear that this feature of the genetic code serves to minimize the danger of point mutations (DNA sequence changes involving a single base pair).

3. Nonoverlapping and without punctuation. The mRNA coding sequence is "read" by a ribosome starting from the initiating codon (AUG) as a continuous sequence taken three bases at a time until a stop codon is reached. A set of contiguous triplet codons in an mRNA is called a reading frame. The term open reading frame is used to describe a series of triplet base sequences in mRNA that do not contain a stop codon.

4. Universal. With a few minor exceptions the genetic code is universal. In other words, examinations of the translation process in the species that have been investigated have revealed that the coding signals for amino acids are always the same.
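The properties listed above can be explored with a small Python sketch of the standard codon table. The compact 64-letter string uses one-letter amino acid symbols with `*` for stop codons, ordered with the bases U, C, A, G in each position. The `translate` helper is a deliberate simplification: it starts at the first AUG it finds and reads non-overlapping triplets until a stop codon, ignoring the ribosome-binding signals that select the true start codon in a cell.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degeneracy: how many codons specify each amino acid (or stop).
degeneracy = Counter(CODON_TABLE.values())


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":          # stop codon: UAA, UAG or UGA
            break
        peptide.append(aa)
    return "".join(peptide)


print(degeneracy["L"], degeneracy["M"], degeneracy["W"], degeneracy["*"])  # 6 1 1 3
print(translate("GGAUGUUUCCCAAAUAAGG"))                                    # MFPK
```

The 61 sense codons shared among 20 amino acids give an average of about three codons per amino acid.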

Recognition of amino acids

The attachment of amino acids to tRNAs, a process that is considered to be the first step in protein synthesis, is catalyzed by a group of enzymes called the aminoacyl-tRNA synthetases. The precision with which these enzymes esterify each specific amino acid to the correct tRNA is now believed to be so important for accurate translation that their functioning has been referred to collectively as the second genetic code.

1. Activation. The synthetase first catalyzes the formation of aminoacyl-AMP. This reaction, which serves to activate the amino acid through the formation of a high-energy mixed anhydride bond is driven to completion through the subsequent hydrolysis of its other product, pyrophosphate. (An anhydride is a molecule containing two carbonyl groups linked through an oxygen atom).


2. tRNA linkage. A specific tRNA, also bound in the active site of the synthetase, becomes attached to the aminoacyl group through an ester linkage. Although the aminoacyl ester linkage to the tRNA is lower in energy than the mixed anhydride of aminoacyl AMP, it still possesses sufficient energy to drive peptide bond formation.

The sum of the reactions catalyzed by the aminoacyl-tRNA synthetases is as follows:

Amino acid + ATP + tRNA → aminoacyl-tRNA + AMP + PPi

Because the product PPi (pyrophosphate) is immediately hydrolyzed with a large loss of free energy, tRNA charging is an irreversible process. Because AMP is a product of this reaction, the metabolic price for the linkage of each amino acid to its tRNA is the equivalent of two molecules of ATP.

Protein Synthesis

The translation of a genetic message into the primary sequence of a polypeptide can be divided into three phases: initiation, elongation, and termination.

Translation is relatively rapid in prokaryotes. For example, an E. coli ribosome can incorporate as many as 20 amino acids per second. (The eukaryotic rate, at about 50 residues per minute, is significantly slower.) Prokaryotic ribosomes are composed of a 50S large subunit and a 30S small subunit.
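To put these rates in perspective, the sketch below estimates the time a single ribosome needs to complete a polypeptide. The 300-residue chain length is an arbitrary illustrative value, and the calculation assumes a constant elongation rate with no time spent on initiation or termination.

```python
PROKARYOTE_RATE = 20.0        # residues per second (E. coli figure from the text)
EUKARYOTE_RATE = 50.0 / 60.0  # residues per second (50 residues per minute)


def synthesis_time_seconds(n_residues, rate_per_second):
    """Elongation time for one polypeptide at a constant rate."""
    return n_residues / rate_per_second


n = 300  # hypothetical chain length chosen for illustration
print(f"prokaryote: {synthesis_time_seconds(n, PROKARYOTE_RATE):.0f} s")  # 15 s
print(f"eukaryote:  {synthesis_time_seconds(n, EUKARYOTE_RATE):.0f} s")   # 360 s
```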


1. Initiation. Translation begins with the formation of an initiation complex. In prokaryotes this process requires three initiation factors (IFs). IF-3 has previously bound to the 30S subunit, thereby preventing it from binding prematurely to the 50S subunit. As an mRNA binds to the 30S subunit, it is guided into a precise location, so that the initiation codon AUG is correctly positioned. Each gene on a polycistronic mRNA possesses its own initiation codon. The translation of each gene appears to occur independently, that is, translation of the first gene may or may not be followed by the translation of subsequent genes.

2. Elongation. It is during the elongation phase that the polypeptide is actually synthesized according to the specifications of the genetic message. Elongation, the phase in which amino acids are incorporated into a polypeptide chain, consists of three steps: (1) positioning of an aminoacyl-tRNA in the A site, (2) peptide bond formation, and (3) translocation.

For translation to continue, the mRNA must move, or "translocate," so that a new codon-anticodon interaction can occur. Translocation requires the binding of another GTP-binding protein referred to as EF-G. GTP hydrolysis provides the energy required for the ribosomal conformational change that is apparently involved in the movement of the peptidyl-tRNA (the tRNA bearing the growing peptide chain) from the A site to the P site. The unoccupied A site then binds an appropriate aminoacyl-tRNA to the new A site codon. After the subsequent release of EF-G the ribosome is ready for the next elongation cycle. Elongation continues until a stop codon enters the A site.

3. Termination. The termination phase begins when a termination codon (UAA, UAG, or UGA) enters the A site. Three release factors (RF-1, RF-2, and RF-3) are involved in termination. The codons UAA and UAG are recognized by RF-1, whereas UAA and UGA are recognized by RF-2.

This recognition process, which involves GTP hydrolysis, results in the following alterations in ribosome function. The peptidyl transferase, which is transiently transformed into an esterase, hydrolyzes the bond linking the completed polypeptide chain and the P site tRNA. Following the polypeptide's release from the ribosome, the mRNA and tRNA also dissociate. Termination ends with the dissociation of the ribosome into its constituent subunits.
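The stop codon specificities of the two codon-reading release factors can be written as a small lookup. This Python sketch encodes only the assignments stated in the text, and the names used are illustrative.

```python
# Stop codons recognized by each release factor, as stated in the text.
RELEASE_FACTOR_CODONS = {
    "RF-1": {"UAA", "UAG"},
    "RF-2": {"UAA", "UGA"},
}


def recognizing_factors(codon):
    """Return the release factors that recognize the given codon."""
    return sorted(
        factor
        for factor, codons in RELEASE_FACTOR_CODONS.items()
        if codon in codons
    )


print(recognizing_factors("UAA"))  # ['RF-1', 'RF-2']
print(recognizing_factors("UGA"))  # ['RF-2']
print(recognizing_factors("AUG"))  # []
```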

In addition to the ribosomal subunits, mRNA and aminoacyl-tRNAs, translation requires an energy source (GTP) and a wide variety of protein factors. These factors perform several types of roles. Some have catalytic functions; others stabilize specific structures that form during translation. Translation factors are classified according to the phase of the translation process that they affect, that is, initiation, elongation, or termination. The major differences between prokaryotic and eukaryotic translation appear to be due largely to the identity and functioning of these protein factors.

3. Post-translational Modification

Once a peptide or protein has been synthesized and released from the ribosome it often undergoes further chemical transformation. This post-translational modification may involve the attachment of other moieties such as acyl groups, alkyl groups, phosphates, sulfates, lipids and carbohydrates. Functional changes such as dehydration, amidation, hydrolysis and oxidation (e.g. disulfide bond formation) are also common. In this manner the limited array of twenty amino acids designated by the codons may be expanded in a variety of ways to enable proper functioning of the resulting protein. Since these post-translational reactions are generally catalyzed by enzymes, it may be said: "Virtually every molecule in a cell is made by the ribosome or by enzymes made by the ribosome."

Modifications, like phosphorylation and citrullination, are part of common mechanisms for controlling the behavior of a protein. As shown on the left below, citrullination is the post-translational modification of the amino acid arginine into the amino acid citrulline. Arginine is positively charged at a neutral pH, whereas citrulline is uncharged, so this change increases the hydrophobicity of a protein. Phosphorylation of serine, threonine or tyrosine residues renders them more hydrophilic, but such changes are usually transient, serving to regulate the biological activity of the protein. Other important functional changes include iodination of tyrosine residues in the peptide thyroglobulin by action of the enzyme thyroperoxidase. The monoiodotyrosine and diiodotyrosine formed in this manner are then linked to form the thyroid hormones T3 and T4, shown below.



Amino acids may be enzymatically removed from the amino end of the protein. Because the "start" codon on mRNA codes for the amino acid methionine, this amino acid is usually removed from the resulting protein during post-translational modification. Peptide chains may also be cut in the middle to form shorter strands. Thus, insulin is initially synthesized as a 105 residue preprotein. The 24-amino acid signal peptide is removed, yielding a proinsulin peptide. This folds and forms disulfide bonds between cysteines 7 and 67 and between 19 and 80. Such dimeric cysteines, joined by a disulfide bond, are named cystine. A protease then cleaves the peptide at arg31 and arg60, with loss of the 32-60 sequence (chain C). Removal of arg31 yields mature insulin, with the A and B chains held together by disulfide bonds and a third cystine moiety in chain A. The following cartoon illustrates this chain of events.
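The residue bookkeeping in the insulin example can be verified with a short Python sketch. It uses only the numbers quoted above, with residues numbered from the first residue of proinsulin. The chain boundaries (B chain as residues 1-30, A chain as residue 61 onward) are inferred from the stated cleavage sites rather than quoted from the source.

```python
PREPROTEIN_LENGTH = 105
SIGNAL_PEPTIDE_LENGTH = 24

# Proinsulin residues, numbered from 1 after removal of the signal peptide.
proinsulin = list(range(1, PREPROTEIN_LENGTH - SIGNAL_PEPTIDE_LENGTH + 1))

c_chain = [r for r in proinsulin if 32 <= r <= 60]  # excised C chain
b_chain = [r for r in proinsulin if r <= 30]        # arg31 is also removed
a_chain = [r for r in proinsulin if r >= 61]

print(len(proinsulin), len(c_chain), len(b_chain), len(a_chain))  # 81 29 30 21

# The two disulfide bonds given in the text each link the B chain to the A chain.
for cys_1, cys_2 in [(7, 67), (19, 80)]:
    print(cys_1 in b_chain, cys_2 in a_chain)  # True True
```

Under these assumptions mature insulin contains 30 + 21 = 51 residues.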



The regulation of genes, as measured by their transcription rates, is the result of a complex hierarchy of control elements that act to coordinate the cell's metabolic activities. Some genes, referred to as constitutive or housekeeping genes, are routinely transcribed because they code for gene products (e.g., glucose-metabolizing enzymes, ribosomal proteins, and histones) that are required for cell function. In addition, in the differentiated cells of multicellular organisms, certain specialized proteins are produced that cannot be detected elsewhere (e.g., hemoglobin in red blood cells). Genes that are expressed only under certain circumstances are referred to as inducible. For example, the enzymes that are required for lactose metabolism in E. coli are synthesized only when lactose is actually present and glucose, the bacterium's preferred energy source, is absent.

Most of the mechanisms that are used by living cells to regulate gene expression involve DNA-protein interactions. At first glance, the seemingly repetitious and regular structure of B-DNA appears to make it an unlikely partner for the sophisticated binding with myriad different proteins that obviously must occur in gene regulation. However, DNA is somewhat deformable, and certain sequences can be curved or bent. In addition, it is now recognized that the edges of the base pairs within the major groove (and to a lesser extent the minor groove) of the double helix can participate in sequence-specific binding to proteins. Numerous contacts (often about 20 or so) involving hydrophobic interactions, hydrogen bonds, and ionic bonds between amino acids and nucleotide bases result in highly specific DNA-protein binding.

The three-dimensional structures of a number of DNA regulatory proteins that have been determined have surprisingly similar features. In addition to usually possessing twofold axes of symmetry, most of these molecules can be separated into families on the basis of the following structural domains: (1) helix-turn-helix, (2) helix-loop-helix, (3) leucine zipper, (4) zinc finger, and (5) beta-sheets. It should be noted that DNA-binding proteins, many of which are transcription factors, often form dimers. For example, a variety of transcription factors with leucine zipper motifs form dimers as their leucine-containing α-helices interdigitate. Because each type of protein possesses its own unique binding specificity, the capacity of these and many other transcription factors to combine to form homodimers (two identical monomers) and heterodimers (two different monomers) results in a large number of unique gene regulatory agents.

Considering the obvious complexity of function observed in living organisms, it is not surprising that the regulation of gene expression has proven to be both remarkably complex and difficult to investigate. For many reasons, knowledge concerning prokaryotic gene expression is significantly more advanced than that of eukaryotes. Prokaryotic gene expression was originally investigated, in part, as a model for the study of the more complicated gene function of mammals. Although it is now recognized that the two genome types are vastly different in many respects, the prokaryotic work has provided many valuable insights into the basic mechanisms of gene expression. In general, prokaryotic gene expression involves the interaction of specific proteins (sometimes referred to as regulators) with DNA in the immediate vicinity of a transcription start site. Such interactions may have either a positive effect (i.e., transcription is initiated or increased) or a negative effect (i.e., transcription is blocked). In an interesting variation the inhibition of a negative regulator (called a repressor) results in the activation of affected genes. (The inhibition of a repressor gene is referred to as derepression.) Eukaryotic gene expression also uses these mechanisms as well as several others, including gene rearrangement and amplification and various types of complex transcriptional, RNA processing, and translational controls. In addition, the spatial separation of transcription and translation that is inherent in eukaryotic cells provides another opportunity for regulation: RNA transport control. Finally, eukaryotes (as well as prokaryotes) also regulate cell function through the modulation of proteins through various types of covalent modification.

The discussion of prokaryotic gene expression focuses on the lac operon. The lac operon of E. coli, originally investigated by Francois Jacob and Jacques Monod in the 1950s, remains one of the best-understood models of gene regulation. Despite a daunting lack of knowledge concerning eukaryotic gene expression, a significant number of the pieces in this marvelous puzzle have been revealed.

At the genetic level, the control of inducible genes is often effected by collections of structural and regulatory genes called operons. Investigations of operons, especially the lac operon, have provided substantial insight into how gene expression can be altered by environmental conditions. Similarly, investigations of viral infections of prokaryotes have furnished relatively unobstructed views of certain genetic mechanisms. The infection of E. coli by bacteriophage has been especially instructive.

The Lac Operon. The lac operon consists of a control element and structural genes that code for the enzymes of lactose metabolism. The control element contains the promoter site, which overlaps the operator site. (In prokaryotes the operator is a DNA sequence involved in the regulation of adjacent genes that binds to a repressor protein.)

The promoter site also contains the CAP site. The structural genes Z, Y, and A specify the primary structure of β-galactosidase, lactose permease, and thiogalactoside transacetylase, respectively. β-Galactosidase catalyzes the hydrolysis of lactose, which yields the monosaccharides galactose and glucose, whereas lactose permease promotes lactose transport into the cell. Because lactose metabolism proceeds normally in the absence of thiogalactoside transacetylase, its role is unclear. A repressor gene i, directly adjacent to the lac operon, codes for the lac repressor protein, a tetramer that binds to the operator site with high affinity. (There are about ten copies of lac repressor per cell.) The binding of the lac repressor to the operator prevents the functional binding of RNA polymerase to the promoter.

In the absence of its inducer (allolactose) the lac operon remains repressed because of the binding of lac repressor to the operator. When lactose becomes available, a few molecules are converted to allolactose by β-galactosidase. Allolactose then binds to the repressor, causing a change in its conformation that promotes dissociation from the operator. Once the inactive repressor diffuses away from the operator, the transcription of the structural genes is initiated. The lac operon remains active until the lactose supply is consumed. The repressor subsequently reverts to its active form and rebinds to the operator.

Glucose is the preferred carbon and energy source for E. coli. In the event that the organism is exposed to both glucose and lactose, the glucose is metabolized first. Syntheses of the lac operon enzymes are induced only after the glucose is no longer available. (This makes sense because glucose is more commonly available and has a central role in cellular metabolism. Why expend the energy to synthesize the enzymes required for the metabolism of other sugars if glucose is also available?) The delay in activating the lac operon is mediated by a catabolite gene activator protein (CAP). CAP is an allosteric homodimer that binds to the chromosome at a site directly in front of the lac promoter when glucose is absent. CAP can act as an indicator of glucose concentration because of its capacity to bind to cAMP. (For reasons that are not yet clear, the cell's cAMP concentration is inversely related to glucose concentration.) The binding of cAMP to CAP, a process that occurs only when glucose is absent and cAMP levels are high, causes a conformational change that allows the protein to bind to the lac promoter. CAP binding promotes transcription by increasing the affinity of RNA polymerase for the lac promoter. In other words, CAP exerts a positive or activating control on lactose metabolism.
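The combined negative (repressor) and positive (CAP) control of the lac operon can be summarized as a simple truth table. The Python sketch below is a qualitative model only. The three output labels, and in particular the "low" level when both sugars are present, are simplifying assumptions and not values from the text.

```python
def lac_operon_state(glucose_present, lactose_present):
    """Qualitative transcription level of the lac structural genes."""
    # Without lactose there is no allolactose, so the repressor stays on the operator.
    repressor_bound = not lactose_present
    # Low glucose means high cAMP, so the CAP-cAMP complex binds near the promoter.
    cap_bound = not glucose_present

    if repressor_bound:
        return "off"
    return "high" if cap_bound else "low"


for glucose in (True, False):
    for lactose in (True, False):
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} ->",
              lac_operon_state(glucose, lactose))
```

Only the combination of lactose present and glucose absent gives full induction, in agreement with the description above.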

Protein synthesis is an extraordinarily complex process in which genetic information encoded in the nucleic acids is "translated" into the 20 amino acid "alphabet" of polypeptides.


Translational Control Mechanisms

Protein synthesis is an exceptionally expensive process. With a cost of four high-energy phosphate bonds per peptide bond (i.e., two bonds expended during tRNA charging and one each during A site-tRNA binding and translocation) it is perhaps not surprising that enormous quantities of energy are involved.
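The energy bookkeeping can be made explicit with a small Python sketch. It simply multiplies the per-residue cost itemized above by a chain length. The 300-residue example is a hypothetical value, and the additional costs of initiation and termination are ignored.

```python
# High-energy phosphate bonds spent per amino acid incorporated (from the text).
COST_PER_RESIDUE = {
    "tRNA charging (ATP to AMP plus pyrophosphate)": 2,
    "A site aminoacyl-tRNA binding (GTP)": 1,
    "translocation (GTP)": 1,
}


def elongation_cost(n_residues):
    """Approximate phosphate bonds consumed to elongate a chain of n_residues."""
    return n_residues * sum(COST_PER_RESIDUE.values())


print(sum(COST_PER_RESIDUE.values()))  # 4
print(elongation_cost(300))            # 1200
```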

Although the speed and accuracy of translation require a high energy input, the cost would be even higher without metabolic control mechanisms. It is these mechanisms that allow prokaryotic cells to compete with each other for limited nutritional resources.

Eukaryotic translation control mechanisms are proving to be exceptionally complex, substantially more so than those observed in prokaryotes. In prokaryotes such as E. coli, most of the control of protein synthesis occurs at the level of transcription. This circumstance makes sense for several reasons. First, transcription and translation are directly coupled; that is, translation is initiated shortly after transcription begins. Second, the lifetime of prokaryotic mRNA is usually relatively short. With half-lives of between 1 and 3 minutes, the types of mRNAs produced in a cell can be quickly altered as environmental conditions change.
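The consequence of such short half-lives can be illustrated numerically. The sketch below assumes simple first-order (exponential) decay of an mRNA population once its transcription stops. This is a modeling assumption introduced here, not a statement from the source.

```python
def fraction_remaining(minutes, half_life_minutes):
    """Fraction of an mRNA population left after the given time,
    assuming first-order decay and no new transcription."""
    return 0.5 ** (minutes / half_life_minutes)


# Six minutes after transcription of a gene is switched off:
for half_life in (1, 3):
    print(half_life, fraction_remaining(6, half_life))
# 1 0.015625
# 3 0.25
```

Within a few minutes most of the old message is gone, so a change at the level of transcription is quickly reflected in the proteins being made.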

Despite the preeminence of transcriptional control mechanisms, there are variations in the rates of prokaryotic mRNA translation.

An interesting example of negative translational regulation in prokaryotes is provided by ribosomal protein synthesis. There are approximately 55 proteins in prokaryotic ribosomes. These molecules are coded for by genes occurring in 20 operons. Efficient bacterial growth requires that their synthesis be coordinately regulated among the operons as well as with rRNA synthesis. For example, in the PL11 operon, which contains the genes for the ribosomal proteins L1 and L11, excessive amounts of L1 (i.e., more L1 molecules than can bind available 23S rRNA) trigger an inhibition of PL11 mRNA translation. Apparently, L1 can bind to either 23S rRNA or PL11 mRNA. In the absence of 23S rRNA, L1 inhibits the translation of its own operon by binding to the 5' end of PL11 mRNA.