The first isolation of what we now refer to as DNA was accomplished by Johann Friedrich Miescher circa 1870. He reported finding a weakly acidic substance of unknown function in the nuclei of human white blood cells, and named this material "nuclein". A few years later, Miescher separated nuclein into protein and nucleic acid components. In the 1920's nucleic acids were found to be major components of chromosomes, small gene-carrying bodies in the nuclei of complex cells. Elemental analysis of nucleic acids showed the presence of phosphorus, in addition to the usual C, H, N & O. Unlike proteins, nucleic acids contained no sulfur. Complete hydrolysis of chromosomal nucleic acids gave inorganic phosphate, 2-deoxyribose (a previously unknown sugar) and four different heterocyclic bases (shown in the following diagram). To reflect the unusual sugar component, chromosomal nucleic acids are called deoxyribonucleic acids, abbreviated DNA. Analogous nucleic acids in which the sugar component is ribose are termed ribonucleic acids, abbreviated RNA. The acidic character of the nucleic acids was attributed to the phosphoric acid moiety.

The two monocyclic bases shown here are classified as pyrimidines, and the two bicyclic bases are purines. Each has at least one N-H site at which an organic substituent may be attached. They are all polyfunctional bases, and may exist in tautomeric forms.

Base-catalyzed hydrolysis of DNA gave four nucleoside products, which proved to be N-glycosides of 2'-deoxyribose combined with the heterocyclic amines. In the structures of these nucleosides, the base components are colored green, and the sugar is black. As noted in the 2'-deoxycytidine structure, the numbering of the sugar carbons makes use of primed numbers to distinguish them from the heterocyclic base sites. The corresponding N-glycosides of the common sugar ribose are the building blocks of RNA, and are named adenosine, cytidine, guanosine and uridine (a thymidine analog missing the methyl group).
From this evidence, nucleic acids may be formulated as alternating copolymers of phosphoric acid (P) and nucleosides (N), as shown:

~ P – N – P – N'– P – N''– P – N'''– P – N ~

At first the four nucleosides, distinguished by prime marks in this crude formula, were assumed to be present in equal amounts, resulting in a uniform structure, such as that of starch. However, a compound of this kind, presumably common to all organisms, was considered too simple to hold the hereditary information known to reside in the chromosomes. This view was challenged in 1944, when Oswald Avery and colleagues demonstrated that bacterial DNA was likely the genetic agent that carried information from one organism to another in a process called "transformation". Avery concluded that "nucleic acids must be regarded as possessing biological specificity, the chemical basis of which is as yet undetermined." Despite this finding, many scientists continued to believe that chromosomal proteins, which differ across species, between individuals, and even within a given organism, were the locus of an organism's genetic information. It should be noted that single-celled organisms like bacteria do not have a well-defined nucleus. Instead, their single chromosome is associated with specific proteins in a region called a "nucleoid". Nevertheless, the DNA from bacteria has the same composition and general structure as that from multicellular organisms, including human beings.

Views about the role of DNA in inheritance changed in the late 1940's and early 1950's.

By conducting a careful analysis of DNA from many sources, Erwin Chargaff found its composition to be species specific. In addition, he found that the amount of adenine (A) always equaled the amount of thymine (T), and the amount of guanine (G) always equaled the amount of cytosine (C), regardless of the DNA source.

As set forth in the following table, the ratio of (A+T) to (C+G) varied from 2.70 to 0.35. The last two organisms are bacteria.
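Chargaff's ratios are simple to compute from a measured base composition. The following is a minimal sketch in Python; the function name and the mole-percent values are illustrative assumptions, not data from Chargaff's table.

```python
def chargaff_ratios(mol_percent):
    """Return the A/T, G/C and (A+T)/(G+C) ratios for a base composition.

    mol_percent maps each base letter to its mole percent in the DNA sample.
    """
    a, t, g, c = (mol_percent[base] for base in "ATGC")
    return {
        "A/T": a / t,
        "G/C": g / c,
        "(A+T)/(G+C)": (a + t) / (g + c),
    }


# Illustrative composition only (assumed round numbers, not measured values).
sample = {"A": 30.0, "T": 30.0, "G": 20.0, "C": 20.0}
ratios = chargaff_ratios(sample)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

For any DNA obeying Chargaff's equivalences the first two ratios come out close to 1, while the third is the species-specific quantity.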

Building Blocks of Nucleic Acids

Nucleic acids are biopolymers composed of monomer units called nucleotides, which are therefore the building blocks of all nucleic acids. Each nucleotide has three components, bonded together in a specific manner to form the complete unit. These components are as follows.

1. Nitrogen-containing "base" 

Two types of nitrogenous base are present in nucleotides, pyrimidines (one ring) and purines (two rings), which differ considerably in structure.

(a) Purine

Two purine bases are commonly found in nucleic acids, adenine (A) and guanine (G). Both purine bases are bonded to the sugar through N9 of the base and C1' of the sugar.


(b) Pyrimidine

There are three common pyrimidine bases: cytosine (C), found in both DNA and RNA; thymine (T), found in DNA; and uracil (U), found in RNA.

2. Five carbon sugar

Two pentose sugars are present in nucleic acids, ribose and deoxyribose. DNA contains β-D-2-deoxyribose, while RNA contains β-D-ribose. The two sugars differ only in the presence of one oxygen atom at the C2 position: ribose carries a hydroxyl group there, whereas deoxyribose carries only hydrogen.


· In any nucleotide, the combination of two of these components, a base and a sugar, is known as a nucleoside.

· Bonding between a nucleoside and a phosphoric acid molecule results in the formation of a nucleotide.

· In nucleosides, the 1-position of a pyrimidine or the 9-position of a purine is bonded to C1 of the sugar molecule through a β-linkage, also known as an N-glycosidic linkage.

General Structure of the Nucleotides

The monomeric units of DNA are called deoxyribonucleotides; those of RNA are ribonucleotides. Each nucleotide contains three characteristic components: (1) a nitrogenous heterocyclic base, which is a derivative of either pyrimidine or purine; (2) a pentose; and (3) a molecule of phosphoric acid.

Four different deoxyribonucleotides serve as the major components of DNAs; they differ from each other only in their nitrogenous base components, after which they are named.



The four bases characteristic of the deoxyribonucleotide units of DNA are the purine derivatives adenine and guanine and the pyrimidine derivatives cytosine and thymine. Similarly, four different ribonucleotides are the major components of RNAs; they contain the purine bases adenine and guanine and the pyrimidine bases cytosine and uracil. Thus thymine is characteristically present in DNA but not usually in RNA, whereas uracil is normally present in RNA but only rarely in DNA.


The other difference in composition between these two kinds of nucleic acids is that deoxyribonucleotides contain 2-deoxy-D-ribose as their pentose component, whereas ribonucleotides contain D-ribose. The pentose is joined to the base by a glycosyl bond between carbon atom 1 of the pentose and nitrogen atom 9 of purine bases or nitrogen atom 1 of pyrimidine bases. The phosphate group of nucleotides is in ester linkage with carbon atom 5 of the pentose.


When the phosphate group of a nucleotide is removed by hydrolysis, the structure remaining is called a nucleoside.

With five different bases, nucleosides of five base types occur across DNA and RNA: four deoxyribonucleosides in DNA and four ribonucleosides in RNA.




Figure: Structure of a nucleoside, shown for deoxythymidine (thymidine).

The Chemical Nature of DNA

The polymeric structure of DNA may be described in terms of monomeric units of increasing complexity. In the top shaded box of the following illustration, the three relatively simple components mentioned earlier are shown. Below that on the left, formulas for phosphoric acid and a nucleoside are drawn. Condensation polymerization of these leads to the DNA formulation outlined above. Finally, a 5'-monophosphate ester, called a nucleotide, may be drawn as a single monomer unit, shown in the shaded box to the right. Since a monophosphate ester of this kind is a strong acid (pKa of 1.0), it will be fully ionized at the usual physiological pH (ca. 7.4). Names for these DNA components are given in the table.
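The claim of full ionization can be checked with the Henderson-Hasselbalch relation. This is a minimal sketch that treats the phosphate monoester as a simple monoprotic acid with the pKa of 1.0 quoted above; the function name is an illustrative assumption.

```python
def fraction_ionized(pka: float, ph: float) -> float:
    """Fraction of a monoprotic acid present as its conjugate base at a given pH.

    From Henderson-Hasselbalch: [A-]/[HA] = 10**(pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa of 1.0 and physiological pH of about 7.4, as quoted in the text.
f = fraction_ionized(pka=1.0, ph=7.4)
print(f"fraction ionized at pH 7.4: {f:.7f}")
```

With a pKa more than six units below the pH, the un-ionized fraction is well under one part in a million.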


Isomeric 3'-monophosphate nucleotides are also known, and both isomers are found in cells. They may be obtained by selective hydrolysis of DNA through the action of nuclease enzymes. Anhydride-like di- and tri-phosphate nucleotides have been identified as important energy carriers in biochemical reactions, the most common being ATP (adenosine 5'-triphosphate).

Names of DNA Base Derivatives

First, the remaining P-OH function is quite acidic and is completely ionized in biological systems.

Second, the polymer chain is structurally directed. One end (5') is different from the other (3').

Third, although this appears to be a relatively simple polymer, the possible permutations of the four nucleosides in the chain become very large as the chain lengthens.

Fourth, the DNA polymer is much larger than originally believed. Molecular weights for the DNA from multicellular organisms are commonly 10^9 or greater.
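The third point can be made concrete: with four possible nucleosides at each position, a chain of n units has 4^n possible sequences. A minimal sketch (the function name is an illustrative assumption):

```python
def sequence_count(n: int, alphabet_size: int = 4) -> int:
    """Number of distinct linear sequences of length n over the given alphabet."""
    return alphabet_size ** n


for n in (1, 2, 10, 100):
    count = sequence_count(n)
    print(f"n = {n:>3}: {len(str(count))} digits")
```

Even a chain of only 100 units allows more sequences than could ever be enumerated, which is why so simple a polymer can carry so much information.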

Information is stored or encoded in the DNA polymer by the pattern in which the four nucleotides are arranged. To access this information the pattern must be "read" in a linear fashion, just as a bar code is read at a supermarket checkout. Because living organisms are extremely complex, a correspondingly large amount of information related to this complexity must be stored in the DNA. Consequently, the DNA itself must be very large, as noted above. Even the single DNA molecule from an E. coli bacterium is found to have several million nucleotide units in a polymer strand, and would exceed a millimeter in length if stretched out. The nuclei of multicellular organisms incorporate chromosomes, which are composed of DNA combined with nuclear proteins called histones. The fruit fly has 8 chromosomes, humans have 46 and dogs 78 (note that the amount of DNA in a cell's nucleus does not correlate with the number of chromosomes). The DNA from the smallest human chromosome is over ten times larger than E. coli DNA, and it has been estimated that the total DNA in a human cell would extend to 2 meters in length if unraveled. Since the nucleus is only about 5 μm in diameter, the chromosomal DNA must be packed tightly to fit in that small volume.

In addition to its role as a stable informational library, chromosomal DNA must be structured or organized in such a way that the chemical machinery of the cell will have easy access to that information, in order to make important molecules such as polypeptides. Furthermore, accurate copies of the DNA code must be created as cells divide, with the replicated DNA molecules passed on to subsequent cell generations, as well as to progeny of the organism. The nature of this DNA organization, or secondary structure, will be discussed in a later section.

The high molecular weight nucleic acid, DNA, is found chiefly in the nuclei of complex cells, known as eukaryotic cells, or in the nucleoid regions of prokaryotic cells, such as bacteria. It is often associated with proteins that help to pack it in a usable fashion. In contrast, a lower molecular weight, but much more abundant nucleic acid, RNA, is distributed throughout the cell, most commonly in numerous small organelles called ribosomes. Three kinds of RNA are identified, the largest subgroup (85 to 90%) being ribosomal RNA, rRNA, the major component of ribosomes, together with proteins. The size of rRNA molecules varies, but is generally less than a thousandth the size of DNA. The other forms of RNA are messenger RNA, mRNA, and transfer RNA, tRNA. Both have a more transient existence and are smaller than rRNA.

All these RNA's have similar constitutions, and differ from DNA in two important respects. As shown in the following diagram, the sugar component of RNA is ribose, and the pyrimidine base uracil replaces the thymine base of DNA. The RNA's play a vital role in the transfer of information (transcription) from the DNA library to the protein factories called ribosomes, and in the interpretation of that information (translation) for the synthesis of specific polypeptides. These functions will be described later.

4. The Secondary Structure of DNA

In the early 1950's the primary structure of DNA was well established, but a firm understanding of its secondary structure was lacking. Indeed, the situation was similar to that occupied by the proteins a decade earlier, before the alpha helix and pleated sheet structures were proposed by Linus Pauling. Many researchers grappled with this problem, and it was generally conceded that the molar equivalences of base pairs (A & T and C & G) discovered by Chargaff would be an important factor. Rosalind Franklin, working at King's College, London, obtained X-ray diffraction evidence that suggested a long helical structure of uniform thickness. Francis Crick and James Watson, at Cambridge University, considered hydrogen bonded base pairing interactions, and arrived at a double stranded helical model that satisfied most of the known facts, and has been confirmed by subsequent findings.

Base Pairing

Careful examination of the purine and pyrimidine base components of the nucleotides reveals that three of them could exist as hydroxy pyrimidine or purine tautomers, having an aromatic heterocyclic ring. Despite the added stabilization of an aromatic ring, these compounds prefer to adopt amide-like structures. These options are shown in the following diagram, with the more stable tautomer drawn in blue.

A simple model for this tautomerism is provided by 2-hydroxypyridine. As shown on the left below, a compound having this structure might be expected to have phenol-like characteristics, such as an acidic hydroxyl group. However, the boiling point of the actual substance is 100 °C greater than that of phenol, and its acidity is 100 times less than expected (pKa = 11.7). These differences agree with the 2-pyridone tautomer, the stable form of the zwitterionic internal salt. Note that this tautomerism reverses the hydrogen bonding behavior of the nitrogen and oxygen functions (the N-H group of the pyridone becomes a hydrogen bond donor and the carbonyl oxygen an acceptor).

Additional evidence for the pyridone tautomer consists of infrared and carbon NMR absorptions associated with and characteristic of the amide group. The data for 2-pyridone is given on the left. Similar data for the N-methyl derivative, which cannot tautomerize to a pyridine derivative, is presented on the right.

Once they had identified the favored base tautomers in the nucleosides, Watson and Crick were able to propose a complementary pairing, via hydrogen bonding, of guanosine (G) with cytidine (C) and adenosine (A) with thymidine (T). This pairing, which is shown in the following diagram, explained Chargaff's findings beautifully, and led them to suggest a double helix structure for DNA. Before viewing this double helix structure itself, it is instructive to examine the base pairing interactions in greater detail. The G-C association involves three hydrogen bonds (colored pink), and is therefore stronger than the two-hydrogen bond association of A-T. These base pairings might appear to be arbitrary, but other possibilities suffer destabilizing steric or electronic interactions.
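The pairing rules lend themselves to a small lookup table. The sketch below pairs each base of a strand with its partner and totals the inter-strand hydrogen bonds (two per A-T pair, three per G-C pair); the names PAIR, H_BONDS and the example sequence are illustrative assumptions.

```python
# Watson-Crick partners and the hydrogen bonds each pair contributes.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement(strand: str) -> str:
    """Base-by-base partner of each position (same left-to-right order)."""
    return "".join(PAIR[base] for base in strand)


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand)


example = "ATGC"  # hypothetical four-base segment
print(complement(example), hydrogen_bonds(example))
```

Because G-C pairs contribute three bonds each, G-C rich segments of equal length are held together by more hydrogen bonds than A-T rich ones.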

A simple mnemonic device for remembering which bases are paired comes from the line construction of the capital letters used to identify the bases. A and T are made up of intersecting straight lines. In contrast, C and G are largely composed of curved lines. The RNA base uracil corresponds to thymine, since U follows T in the alphabet.

The Double Helix Structure for DNA

After many trials and modifications, Watson and Crick conceived an ingenious double helix model for the secondary structure of DNA. Two strands of DNA were aligned anti-parallel to each other, i.e. with opposite 3' and 5' ends, as shown in part a of the following diagram. Complementary primary nucleotide structures for each strand allowed inter-strand hydrogen bonding between each pair of bases. These complementary strands are colored red and green in the diagram. Coiling these coupled strands then leads to a double helix structure, shown as cross-linked ribbons in part b of the diagram. The double helix is further stabilized by hydrophobic attractions and pi-stacking of the bases. A space-filling molecular model of a short segment is displayed in part c on the right.
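Because the two strands run anti-parallel, the partner of a strand written 5' to 3' is obtained by complementing every base and then reversing the order, so that the result is also written 5' to 3'. A minimal sketch, with an illustrative function name and example sequence:

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ATGC", "TACG")


def reverse_complement(strand_5to3: str) -> str:
    """Return the partner strand, also written in the 5' -> 3' direction."""
    return strand_5to3.translate(_COMPLEMENT)[::-1]


top = "ATGCC"  # hypothetical segment, 5' -> 3'
bottom = reverse_complement(top)
print(f"5'-{top}-3'")
print(f"3'-{bottom[::-1]}-5'")
```

Applying the function twice returns the original strand, which reflects the fact that each strand fully specifies the other.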

Space-Filling Molecular Model

The helix shown here has ten base pairs per turn, and rises 3.4 Å per base pair, or about 34 Å in each turn. This right-handed helix is the favored conformation in aqueous systems, and has been termed the B-helix. As the DNA strands wind around each other, they leave gaps between each set of phosphate backbones. Two alternating grooves result, a wide and deep major groove (ca. 22 Å wide), and a shallow and narrow minor groove (ca. 12 Å wide). Other molecules, including polypeptides, may insert into these grooves, and in so doing perturb the chemistry of DNA. Other helical structures of DNA have also been observed, and are designated by letters (e.g. A and Z).
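As a rough sketch of the helix geometry, the snippet below converts a base-pair count into helical turns and contour length. It assumes the commonly quoted B-helix figures of ten base pairs per turn and a rise of about 3.4 Å per base pair (about 34 Å per turn); the constant and function names are illustrative.

```python
RISE_PER_BP_ANGSTROM = 3.4  # assumed axial rise per base pair in the B-helix
BP_PER_TURN = 10            # base pairs per helical turn
ANGSTROM_PER_MM = 1e7


def helix_turns(n_bp: int) -> float:
    """Number of helical turns spanned by n_bp base pairs."""
    return n_bp / BP_PER_TURN


def helix_length_angstrom(n_bp: int) -> float:
    """Contour length of a fully extended B-helix of n_bp base pairs."""
    return n_bp * RISE_PER_BP_ANGSTROM


n = 1_000_000  # a hypothetical million-base-pair molecule
length_mm = helix_length_angstrom(n) / ANGSTROM_PER_MM
print(f"{n} bp: {helix_turns(n):.0f} turns, {length_mm:.2f} mm")
```

On these assumptions a million base pairs extend to about a third of a millimeter, so millimeter-scale lengths correspond to a few million base pairs.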

Deoxyribonucleic acid (DNA) consists of covalently linked chains of deoxyribonucleotides, and ribonucleic acid (RNA) consists of chains of ribonucleotides. DNA and RNA share a number of chemical and physical properties because in both of them the successive nucleotide units are covalently linked in identical fashion by phosphodiester bridges formed between the 5'-hydroxyl group of one nucleotide and the 3'-hydroxyl group of the next. Thus the backbone of both DNA and RNA consists of alternating phosphate and pentose groups, in which phosphodiester bridges provide the covalent continuity. The purine and pyrimidine bases of the nucleotide units are not present in the backbone structure but constitute distinctive side chains, just as the R groups of amino acid residues are the distinctive side chains of polypeptides.

DNA molecules from different cells and viruses vary in the ratio of the four major types of nucleotide monomers, in their nucleotide sequence, and in their molecular weight. Besides the four major bases (adenine, guanine, thymine, and cytosine) found in all DNAs, small amounts of methylated derivatives of these bases are present in some DNA molecules, particularly those from viruses.

The DNAs isolated from different organisms and viruses normally have two strands in complementary double-helical arrangement. In most cells the DNA molecules are so large that they are not easily isolated in intact form. In diploid eukaryotic cells nearly all the DNA molecules are present in the cell nucleus, where they are combined in ionic linkage with basic proteins called histones. In addition to the nuclear DNA, diploid eukaryotic cells also contain very small amounts of DNA in the mitochondria; it differs in its base composition and molecular weight from nuclear DNA.


The three major types of ribonucleic acid in cells are called messenger RNA (mRNA), ribosomal RNA (rRNA), and transfer RNA (tRNA). Although all three types occur as single polyribonucleotide strands, each type has a characteristic range of molecular weight and sedimentation coefficient. Moreover, each of the three major kinds of RNA occurs in multiple molecular forms. Ribosomal RNA of any given biological species exists in three or more major forms, transfer RNA in as many as 60 forms, and messenger RNA in hundreds and perhaps thousands of distinctive forms. Most cells contain 2 to 8 times as much RNA as DNA.

Messenger RNA

Messenger RNA contains only the four major bases. It is synthesized in the nucleus during the process of transcription, in which the sequence of bases in one strand of the chromosomal DNA is enzymatically transcribed in the form of a single strand of mRNA; some mRNA is also made in the mitochondria. The sequence of bases of the mRNA strand so formed is complementary to that of the DNA strand being transcribed. Each of the thousands of different proteins synthesized by the cell is coded by a specific mRNA or segment of an mRNA molecule.
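The complementarity between the transcribed DNA strand and its mRNA can be sketched as a base-by-base mapping in which uracil takes the place of thymine. The function name and the template sequence below are illustrative assumptions; the template is read 3' to 5' so that the mRNA comes out written 5' to 3'.

```python
# DNA template base -> complementary RNA base (U pairs with A in place of T).
_DNA_TO_RNA = str.maketrans("ATGC", "UACG")


def transcribe(template_3to5: str) -> str:
    """Return the mRNA (5' -> 3') complementary to a template read 3' -> 5'."""
    return template_3to5.translate(_DNA_TO_RNA)


template = "TACGGA"  # hypothetical template segment, written 3' -> 5'
print(transcribe(template))
```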

Transfer RNAs

Transfer RNAs are relatively small molecules that act as carriers of specific individual amino acids during protein synthesis on the ribosomes. Each of the 20 amino acids found in proteins has at least one corresponding tRNA, and some have multiple tRNAs.

Ribosomal RNA

Ribosomal RNA (rRNA) constitutes up to 65 percent of the mass of ribosomes. Although rRNAs make up a large fraction of total cellular RNA, their function in ribosomes is not yet clear. A few of the bases in rRNAs are methylated.


1. DNA Replication

In their 1953 announcement of a double helix structure for DNA, Watson and Crick stated, "It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material." The essence of this suggestion is that, if separated, each strand of the molecule might act as a template on which a new complementary strand might be assembled, leading finally to two identical DNA molecules. Indeed, replication does take place in this fashion when cells divide, but the events leading up to the actual synthesis of complementary DNA strands are sufficiently complex that they will not be described in any detail.

As depicted in the following drawing, the DNA of a cell is tightly packed into chromosomes. First, the DNA is wrapped around small proteins called histones (colored pink below). These bead-like structures are then further organized and folded into chromatin aggregates that make up the chromosomes. An overall packing efficiency of 7,000 or more is thus achieved. Clearly a sequence of unfolding events must take place before the information encoded in the DNA can be used or replicated.


Because of the strategic importance of DNA, all living organisms must possess the following features: (1) rapid and accurate DNA synthesis and (2) genetic stability provided by effective DNA repair mechanisms.


Paradoxically, the long-term survival of species also depends on genetic variations that allow them to adapt to changing environmental conditions. In most species these variations are caused predominantly by genetic recombination, although mutation also plays a role. It should be noted that prokaryotic nucleic acid metabolism is more completely understood than that of eukaryotes. Because of their minimal growth requirements, short generation times, and relatively simple genetic makeup, prokaryotes (especially Escherichia coli) have proven to be excellent research tools for investigations of genetic mechanisms. In contrast, multicellular eukaryotes possess several properties that hinder genetic investigations.

The most formidable of these are long generation times (often months or years) and extraordinary difficulties in identifying gene products (e.g., enzymes or structural components). (A common tactic in genetic research is the induction of mutations, followed by observing changes in or the absence of a specific gene product.) Unfortunately, because of their considerable complexity, very few gene products have been identified in higher organisms by using this method. Various recombinant DNA techniques are being used to circumvent this obstacle.

DNA replication must occur before every cell division. The basic mechanism by which DNA copies are produced is similar in all living organisms. After the two strands separate, each subsequently serves as a template for the synthesis of a complementary strand. (In other words, each of the two new DNA molecules contains one old strand and one new strand.) This process, referred to as semiconservative replication, was first demonstrated in an elegant experiment reported in 1958 by Matthew Meselson and Franklin Stahl. In this classic work, Meselson and Stahl took advantage of the different density properties of DNA labeled with the heavy nitrogen isotope 15N (normal nitrogen is 14N). After E. coli cells were grown for 14 generations in growth media whose nitrogen source consisted only of 15NH4Cl, the 15N-containing cells were transferred to growth media containing the normal 14N isotope. At the end of both one and two cell divisions, samples were removed. The DNA in each of these samples was isolated and analyzed by CsCl density gradient centrifugation. Because pure 15N-DNA and 14N-DNA produce characteristic bands in centrifuged CsCl tubes, this analytical method discriminates between DNA molecules containing large amounts of the two nitrogen isotopes. When the DNA isolated from 15N-containing cells grown in 14N-medium for precisely one generation was centrifuged, only one band was observed. Because this band occurred halfway between where 15N-DNA and 14N-DNA bands would normally appear, it seemed reasonable to assume that the new DNA was a hybrid molecule, that is, containing one 15N-strand and one 14N-strand. (Fully conservative replication would instead produce two bands, one heavy and one light.) After two cell divisions, extracted DNA was resolved into two discrete bands containing equal amounts of 14N,14N-DNA and hybrid 14N,15N-DNA, a result that was also consistent with the semiconservative model of DNA synthesis.
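The band pattern predicted by the semiconservative model can be reproduced with a toy simulation. In this minimal sketch each duplex is a pair of strand labels, every replication round keeps each old strand and pairs it with a newly made 14N strand, and the bands are tallied by isotope content. All names are illustrative assumptions, and the model ignores everything except strand labeling.

```python
from collections import Counter


def replicate(duplexes, new_label="14N"):
    """Semiconservative round: each old strand templates one new strand."""
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, new_label))
        daughters.append((strand_b, new_label))
    return daughters


def band_fractions(duplexes):
    """Fraction of duplexes in each density class, keyed like '14N/15N'."""
    counts = Counter("/".join(sorted(duplex)) for duplex in duplexes)
    return {band: n / len(duplexes) for band, n in counts.items()}


heavy = [("15N", "15N")]   # fully labeled DNA after growth in 15N medium
gen1 = replicate(heavy)    # one division in 14N medium
gen2 = replicate(gen1)     # two divisions in 14N medium
print(band_fractions(gen1))
print(band_fractions(gen2))
```

The first generation gives a single hybrid band, and the second gives equal amounts of hybrid and fully light DNA, with no fully heavy duplexes remaining.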

In the years since the Meselson and Stahl experiment, many of the details of DNA replication have been discovered.

DNA Synthesis in Prokaryotes

DNA replication in E. coli has proven to be a complex process that involves a wide variety of proteins, referred to collectively as the replisome.


The prokaryotic replication process contains several basic steps, each of which requires certain enzyme activities:

1. DNA uncoiling. As their name implies, the helicases are ATP-requiring enzymes that catalyze the unwinding of duplex DNA.

2. Primer synthesis. The formation of short RNA segments called primers, which are required for the initiation of DNA replication, is catalyzed by primase.

3. DNA synthesis. The synthesis of a complementary DNA strand by creating phosphodiester linkages between nucleotides base paired to a template strand is catalyzed by large multienzyme complexes referred to as the DNA polymerases. DNA polymerase III (pol III), the major DNA-synthesizing enzyme, is composed of at least ten different subunits. DNA polymerase I (pol I) is a DNA repair enzyme. (Pol I is also believed to play a role in the timely removal of RNA primer.) The function of DNA polymerase II (pol II) is not understood. In addition to a 5'→3' polymerizing activity, all three enzymes possess a 3'→5' exonuclease activity. (An exonuclease is an enzyme that removes nucleotides from an end of a polynucleotide strand.) Pol I also possesses a 5'→3' exonuclease activity.

4. Joining of DNA fragments. Discontinuous DNA synthesis (described below) requires an enzyme, referred to as a ligase, that joins the newly synthesized segments.

5. Supercoiling control. The tangling of DNA strands, which can prevent further unwinding of the double helix, is prevented by the DNA topoisomerases. Tangling is a very real possibility, since the double helix unwinds rapidly (as many as 50 revolutions per second during bacterial DNA replication). Topoisomerases are enzymes that alter the linking number of closed duplex DNA molecules. The terms "topoisomerase" and "topoisomers" (circular DNA molecules that differ only in their linking numbers) are derived from "topology," a form of mathematics that investigates the properties of geometric structures that do not change with bending or stretching.


When DNA replication was first observed experimentally (with the aid of electron microscopy and autoradiography), investigators were confronted with a paradox. The bidirectional synthesis of DNA as it appeared in their research seemed to indicate that continuous synthesis occurs in the 5'→3' direction on one strand and in the 3'→5' direction on the other strand. (Recall that the DNA double helix has an antiparallel configuration.) However, all the enzymes that catalyze DNA synthesis do so in the 5'→3' direction only. It was later determined that only one strand, referred to as the leading strand, is continuously synthesized in the 5'→3' direction. The other strand, referred to as the lagging strand, is also synthesized in the 5'→3' direction but in a series of small pieces. (Reiji Okazaki and his colleagues provided the experimental evidence for discontinuous DNA synthesis.) Subsequently, these pieces (now called Okazaki fragments) are covalently linked together by DNA ligase. (In prokaryotes such as E. coli, Okazaki fragments possess between 1000 and 2000 nucleotides.)

The initiation of replication is a complex process involving several enzymes, as well as other proteins. The initiation of replication begins when approximately 20 copies of DnaA protein (50 kD) bind to four specific sites. During the binding of DnaA, which requires ATP and a histonelike protein (HU), this portion of the bacterial chromosome forms into a nucleosomelike structure. This coiling causes a small segment of the double helix to open sufficiently that DnaB (a 300-kD helicase) complexed with DnaC (29 kD) can enter.

The replication fork moves forward as the helicases, assisted by topoisomerases (especially DNA gyrase), unwind the helix. The single strands are kept separate by the binding of numerous copies of single-stranded DNA binding protein (SSB). (SSB may also protect vulnerable ssDNA segments from attack by various nucleases.) Before pol III can initiate DNA synthesis, however, an RNA primer must be present. On the leading strand, where DNA synthesis is continuous, primer formation occurs only once per replication fork. In contrast, the discontinuous synthesis on the lagging strand requires primer synthesis for each of the Okazaki fragments. A multienzyme complex containing primase and several other proteins, called the primosome, travels along the lagging strand. At intervals the primosome stops and reverses direction while it synthesizes a short RNA primer. Subsequently, pol III synthesizes DNA beginning at the 3' end of the primer. After most of the lagging strand synthesis is complete, the RNA primers are removed and replaced by DNA segments synthesized by pol I. Finally, the Okazaki fragments are joined by DNA ligase.

It appears that the synthesis of the leading and lagging strands is coupled. The tandem operation of two pol III complexes requires that one strand (the lagging strand) be looped around the replisome. (The term "replisome" is used to describe the set of enzymes and other molecules required for DNA synthesis at a single replication fork.) When the pol III complex that copies the lagging strand completes an Okazaki fragment, it releases the duplex DNA. Once it does so, the primosome moves in and synthesizes another RNA primer.

Despite the complexity of DNA replication, as well as its rate (as high as 1000 base pairs per second per replication fork), this process is amazingly accurate (approximately one error per 10^9 to 10^10 base pairs per generation).
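These figures can be turned into rough estimates of how long one round of replication takes and how many errors it introduces. The sketch below uses the rate and error frequency quoted above; the genome size of 4.6 million base pairs for E. coli and the use of two forks moving in opposite directions are assumptions, and the function names are illustrative.

```python
def replication_time_s(genome_bp: int, rate_bp_per_s: float, forks: int = 2) -> float:
    """Seconds needed to copy the genome with the given number of forks."""
    return genome_bp / (rate_bp_per_s * forks)


def expected_errors(genome_bp: int, errors_per_bp: float) -> float:
    """Expected number of uncorrected errors in one copy of the genome."""
    return genome_bp * errors_per_bp


GENOME_BP = 4_600_000  # assumed E. coli genome size, not stated in the text

minutes = replication_time_s(GENOME_BP, rate_bp_per_s=1000) / 60
errors = expected_errors(GENOME_BP, errors_per_bp=1e-9)
print(f"about {minutes:.0f} minutes, about {errors:.4f} errors per replication")
```

On these assumptions the chromosome is copied in well under an hour, with far less than one error expected per round of replication.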

Replication ends when the replication forks collide on the other side of the circular chromosome. The subsequent separation of the two daughter DNA molecules is not understood, although a type II topoisomerase is believed to be involved.

Unique features of eukaryotic DNA synthesis

1. Timing of replication. In contrast to rapidly growing bacterial cells, in which replication occurs throughout most of the cell division cycle, eukaryotic replication is limited to a specific time period referred to as the S phase. It is now known that eukaryotic cells produce certain proteins that regulate phase transitions within the cell cycle.

2. Replication rate. DNA replication is significantly slower in eukaryotes than in prokaryotes. The eukaryotic rate is approximately 50 nucleotides per second per replication fork. (Recall that the rate in prokaryotes can be as much as twenty times higher.) This discrepancy is presumably due, in part, to the complex structure of chromatin.

3. Replicons. Despite the relative slowness of DNA synthesis, the replication process is relatively brief, considering the large sizes of eukaryotic genomes. For example, on the basis of the replication rate mentioned above, the replication of an average eukaryotic chromosome (approximately 150 million base pairs) by a single replication fork should take over a month to complete. Instead, this process usually requires several hours. Eukaryotes have compressed the replication of their large genomes into a short time period with the use of multiple replicons.


4. Okazaki fragments. At between 100 and 200 nucleotides in length, the Okazaki fragments of eukaryotes are significantly shorter than those that occur in prokaryotes.
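
The arithmetic behind the replicon argument can be checked with a short sketch. The chromosome size and fork rate are the figures quoted above; the assumptions that every origin fires at the same moment, that origins are evenly spaced, and the particular origin counts tried are illustrative only and are not taken from the text.

```python
# Back-of-the-envelope replication times using the figures quoted in the text.
CHROMOSOME_BP = 150_000_000   # average eukaryotic chromosome, base pairs
FORK_RATE = 50                # nucleotides per second per replication fork
SECONDS_PER_DAY = 86_400


def replication_time_seconds(length_bp, fork_rate, n_origins=1):
    """Time to copy length_bp if n_origins evenly spaced origins fire
    simultaneously and each launches two forks (an idealized assumption)."""
    forks = 2 * n_origins
    return length_bp / (fork_rate * forks)


# A single fork working alone, as in the "over a month" estimate.
single_fork_days = CHROMOSOME_BP / FORK_RATE / SECONDS_PER_DAY
print(f"one fork: {single_fork_days:.1f} days")

# Hypothetical origin counts, to show how quickly the time drops.
for n in (1, 100, 1000):
    hours = replication_time_seconds(CHROMOSOME_BP, FORK_RATE, n) / 3600
    print(f"{n:>5} origins: {hours:.2f} hours")
```

Under these idealized assumptions, on the order of a hundred origins is already enough to bring the replication time down from weeks to a few hours.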


Once the double stranded DNA is exposed, a group of enzymes acts to accomplish its replication. These are described briefly here:

Topoisomerase: This enzyme initiates unwinding of the double helix by cutting one of the strands.

Helicase: This enzyme assists the unwinding. Note that many hydrogen bonds must be broken if the strands are to be separated.

SSB: A single-strand binding protein stabilizes the separated strands, and prevents them from recombining, so that the polymerization chemistry can function on the individual strands.

DNA Polymerase: This family of enzymes links together nucleotide triphosphate monomers as they hydrogen bond to complementary bases. These enzymes also check for errors (roughly ten per billion), and make corrections.

Ligase: Small unattached DNA segments on a strand are united by this enzyme.

Polymerization of nucleotides takes place by the phosphorylation reaction described by the following equation.

Di- and triphosphate esters have anhydride-like structures and are consequently reactive phosphorylating reagents, just as carboxylic anhydrides are acylating reagents. Since the pyrophosphate anion is a better leaving group than phosphate, triphosphates are more powerful phosphorylating agents than are diphosphates. The corresponding 5'-derivatives of adenosine are ADP and ATP, and similar derivatives exist for the other three common nucleosides.

The DNA polymerization process that builds the complementary strands in replication could in principle take place in two ways. Referring to the general equation above, R1 could represent the next nucleotide unit to be attached to the growing DNA strand, with R2 being this strand. Alternatively, these assignments could be reversed. In practice, the former proves to be the best arrangement. Since triphosphates are very reactive, the lifetime of such derivatives in an aqueous environment is relatively short. However, such derivatives of the individual nucleosides are repeatedly synthesized by the cell for a variety of purposes, providing a steady supply of these reagents. In contrast, the growing DNA segment must maintain its functionality over the entire replication process, and cannot afford to be changed by a spontaneous hydrolysis event. As a result, these chemical properties are best accommodated by a polymerization process that proceeds at the 3'-end of the growing strand by 5'-phosphorylation involving a nucleotide triphosphate.

The polymerization mechanism described here is constant. It always extends the developing DNA segment toward the 3'-end (i.e. when a nucleotide triphosphate attaches to the free 3'-hydroxyl group of the strand, a new 3'-hydroxyl is generated). There is sometimes confusion on this point, because the original DNA strand that serves as a template is read from the 3'-end toward the 5'-end, and authors may not be completely clear as to which terminology is used.

Because of the directional demand of the polymerization, one of the DNA strands is easily replicated in a continuous fashion, whereas the other strand can only be replicated in short segmental pieces. This is illustrated in the following diagram. Separation of a portion of the double helix takes place at a site called the replication fork. As replication of the separate strands occurs, the replication fork moves away (to the left in the diagram), unwinding additional lengths of DNA. Since the fork in the diagram is moving toward the 5'-end of the red-colored strand, replication of this strand may take place in a continuous fashion (building the new green strand in a 5' to 3' direction). This continuously formed new strand is called the leading strand. In contrast, the replication fork moves toward the 3'-end of the original green strand, preventing continuous polymerization of a complementary new red strand. Short segments of complementary DNA, called Okazaki fragments, are produced, and these are linked together later by the enzyme ligase. This new DNA strand is called the lagging strand.

When you consider that a human cell has roughly 10^9 base pairs in its DNA, and may divide into identical daughter cells in 14 to 24 hours, the efficiency of DNA replication must be extraordinary. The procedure described above will replicate about 50 nucleotides per second, so there must be many thousand such replication sites in action during cell division. A given length of double stranded DNA may undergo strand unwinding at numerous sites in response to promoter actions. The unraveled "bubble" of single stranded DNA has two replication forks, so assembly of new complementary strands may proceed in two directions. The polymerizations associated with several such bubbles fuse together to achieve full replication of the entire DNA double helix.

2. Repair of DNA Damage and Replication Errors

One of the benefits of the double stranded DNA structure is that it lends itself to repair, when structural damage or replication errors occur. Several kinds of chemical change may cause damage to DNA:

· Spontaneous hydrolysis of a nucleoside removes the heterocyclic base component.

· Spontaneous hydrolysis of cytosine changes it to a uracil.

· Various toxic metabolites may oxidize or methylate heterocyclic base components.

· Ultraviolet light may dimerize adjacent cytosine or thymine bases.

All these transformations disrupt base pairing at the site of the change, and this produces a structural deformation in the double helix. Inspection-repair enzymes detect such deformations, and use the undamaged nucleotide on the complementary strand at that site as a template for replacing the damaged unit. These repairs reduce errors in DNA structure from about one in ten million to one per trillion.

RNA and Protein Synthesis

The genetic information stored in DNA molecules is used as a blueprint for making proteins. Why proteins? Because these macromolecules have diverse primary, secondary and tertiary structures that equip them to carry out the numerous functions necessary to maintain a living organism. As noted in the protein chapter, these functions include:

· Structural integrity (hair, horn, eye lenses etc.).

· Molecular recognition and signaling (antibodies and hormones).

· Catalysis of reactions (enzymes).

· Molecular transport (hemoglobin transports oxygen).

· Movement (pumps and motors).

The critical importance of proteins in life processes is demonstrated by numerous genetic diseases, in which small modifications in primary structure produce debilitating and often disastrous consequences. Such genetic diseases include Tay-Sachs, phenylketonuria (PKU), sickle cell anemia, achondroplasia, and Parkinson disease. The unavoidable conclusion is that proteins are of central importance in living cells, and that proteins must therefore be continuously prepared with high structural fidelity by appropriate cellular chemistry.

Early geneticists identified genes as hereditary units that determined the appearance and/or function of an organism (i.e. its phenotype). We now define genes as sequences of DNA that occupy specific locations on a chromosome. The original proposal that each gene controlled the formation of a single enzyme has since been modified to: one gene = one polypeptide. The intriguing question of how the information encoded in DNA is converted to the actual construction of a specific polypeptide has been the subject of numerous studies, which have created the modern field of Molecular Biology.

1. The Central Dogma and Transcription

Francis Crick proposed that information flows from DNA to RNA in a process called transcription, and is then used to synthesize polypeptides by a process called translation. Transcription takes place in a manner similar to DNA replication. A characteristic sequence of nucleotides marks the beginning of a gene on the DNA strand, and this region binds to a promoter protein that initiates RNA synthesis. The double stranded structure unwinds at the promoter site, and one of the strands serves as a template for RNA formation, as depicted in the following diagram. The RNA molecule thus formed is single stranded, and serves to carry information from DNA to the protein synthesis machinery called ribosomes. These RNA molecules are therefore called messenger-RNA (mRNA).
To summarize: a gene is a stretch of DNA that contains a pattern for the amino acid sequence of a protein. In order to actually make this protein, the relevant DNA segment is first copied into messenger-RNA. The cell then synthesizes the protein, using the mRNA as a template.

An important distinction must be made here. One of the DNA strands in the double helix holds the genetic information used for protein synthesis. This is called the sense strand, or information strand (colored red above). The complementary strand that binds to the sense strand is called the anti-sense strand (colored green), and it serves as a template for generating a mRNA molecule that delivers a copy of the sense strand information to a ribosome. The promoter protein binds to a specific nucleotide sequence that identifies the sense strand, relative to the anti-sense strand. RNA synthesis then proceeds in the 5' to 3' direction, as nucleotide triphosphates bind to complementary bases on the template strand, and are joined by phosphate diester linkages. An animation of this process for DNA replication was presented earlier. A characteristic "stop sequence" of nucleotides terminates the RNA synthesis. The messenger molecule (colored orange above) is released into the cytoplasm to find a ribosome, and the DNA then rewinds to its double helix structure.
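
The relationship between the sense strand, the anti-sense (template) strand and the mRNA can be sketched in a few lines. The short sequence used here is a made-up example, and the function names are hypothetical; the sketch only illustrates complementary base pairing, not the enzymology.

```python
# Complementary pairing rules (DNA/DNA and DNA template -> RNA).
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_from_sense(sense_5to3):
    """Return the anti-sense (template) strand, written 3' -> 5' so that
    each base sits directly under its partner on the sense strand."""
    return "".join(DNA_COMPLEMENT[base] for base in sense_5to3)


def transcribe(template_3to5):
    """Build the mRNA 5' -> 3' by pairing ribonucleotides against a
    template strand that is read 3' -> 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


sense = "ATGGCCTAA"                      # hypothetical sense strand, 5' -> 3'
template = template_from_sense(sense)    # anti-sense strand, 3' -> 5'
mrna = transcribe(template)              # messenger RNA, 5' -> 3'

print("sense    5'", sense, "3'")
print("template 3'", template, "5'")
print("mRNA     5'", mrna, "3'")
```

The mRNA carries the same sequence as the sense strand, with U in place of T, which is why the sense strand is also called the information strand.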


In eukaryotic cells the initially transcribed mRNA molecule is usually modified and shortened by an "editing" process that removes irrelevant material. The DNA of such organisms is often thousands of times larger and more complex than that composing the single chromosome of a prokaryotic bacterial cell. This difference is due in part to repetitive nucleotide sequences (ca. 25% in the human genome). Furthermore, over 95% of human DNA is found in intervening sequences that separate genes and parts of genes. The informational DNA segments that make up genes are called exons, and the noncoding segments are called introns. Before the mRNA molecule leaves the nucleus, the noncoding bases that make up the introns are cut out, and the informationally useful exons are joined together in a step known as RNA splicing. In this fashion shorter mRNA molecules carrying the blueprint for a specific protein are sent on their way to the ribosome factories.
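
As a rough illustration of splicing, the sketch below removes marked intron ranges from a primary transcript and joins what is left. The transcript and the intron coordinates are invented for the example, and the function name is hypothetical; real splice sites are recognized by the cell's splicing machinery, not supplied as index pairs.

```python
def splice(pre_mrna, introns):
    """Join the exons of pre_mrna.

    introns is a list of (start, end) index pairs, half-open, sorted and
    non-overlapping; everything outside those ranges is treated as exon.
    """
    exons = []
    position = 0
    for start, end in introns:
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Hypothetical primary transcript: exon, intron, exon, intron, exon.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUAAA" + "GUGAGUCCCAG" + "UGGUAA"
introns = [(6, 18), (24, 35)]

mature = splice(pre_mrna, introns)
print(len(pre_mrna), "->", len(mature), "nucleotides")
print(mature)
```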

The Central Dogma of molecular biology, which at first was formulated as a simple linear progression of information from DNA to RNA to Protein, is summarized in the following illustration. The replication process on the left consists of passing information from a parent DNA molecule to daughter molecules. The middle transcription process copies this information to a mRNA molecule. Finally, this information is used by the chemical machinery of the ribosome to make polypeptides.

As more has been learned about these relationships, the central dogma has been refined to the representation displayed on the right. The dark blue arrows show the general, well demonstrated, information transfers noted above. It is now known that an RNA-dependent DNA polymerase enzyme, known as a reverse transcriptase, is able to transcribe a single-stranded RNA sequence into double-stranded DNA (magenta arrow). Such enzymes are found in all cells and are an essential component of retroviruses (e.g. HIV), which require RNA replication of their genomes (green arrow). Direct translation of DNA information into protein synthesis (orange arrow) has not yet been observed in a living organism. Finally, proteins appear to be an informational dead end, and do not provide a structural blueprint for either RNA or DNA.

In the following section the last fundamental relationship, that of structural information translation from mRNA to protein, will be described.

2. Translation

Translation is a more complex process than transcription. This would, of course, be expected. After all, the coded messages produced by the German Enigma machine could be copied easily, but required a considerable decoding effort before they could be read with understanding. In a similar sense, DNA replication is simply a complementary base pairing exercise, but the translation of the four letter (bases) alphabet code of RNA to the twenty letter (amino acids) alphabet of protein literature is far from trivial. Clearly, there could not be a direct one-to-one correlation of bases to amino acids, so the nucleotide letters must form short words or codons that define specific amino acids. Many questions pertaining to this genetic code were posed in the late 1950's:

• How many RNA nucleotide bases designate a specific amino acid?
If separate groups of nucleotides, called codons, serve this purpose, at least three are needed. There are 4^3 = 64 different nucleotide triplets, compared with 4^2 = 16 possible pairs.
• Are the codons linked separately or do they overlap?
Sequentially joined triplet codons will result in a nucleotide chain three times longer than the protein it describes. If overlapping codons are used then fewer total nucleotides would be required.
• If triplet segments of mRNA designate specific amino acids in the protein, how are the codons identified?
For the sequence ~CUAGGU~ are the codons CUA & GGU or ~C, UAG & GU~ or ~CU, AGG & U~?
• Are all the codon words the same size?
In Morse code the most widely used letters are shorter than less common letters. Perhaps nature employs a similar scheme.
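
The counting argument and the framing question above can be made concrete with a small sketch. It enumerates the words of each length over the four RNA bases and splits the example sequence CUAGGU at each of the three possible offsets; the helper names are hypothetical.

```python
from itertools import product

BASES = "ACGU"


def n_words(length):
    """Number of distinct nucleotide words of the given length."""
    return len(list(product(BASES, repeat=length)))


def split_frames(sequence, size=3):
    """Return the complete codons found at each of the possible offsets."""
    frames = {}
    for offset in range(size):
        frames[offset] = [
            sequence[i:i + size]
            for i in range(offset, len(sequence) - size + 1, size)
        ]
    return frames


for length in (1, 2, 3):
    print(length, "base(s):", n_words(length), "words")

for offset, codons in split_frames("CUAGGU").items():
    print("offset", offset, codons)
```

Only words of three or more bases give at least twenty combinations, and the same six-base sequence yields different triplets depending on where reading begins, which is why a defined starting point is needed.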

Physicists and mathematicians, as well as chemists and microbiologists, all contributed to unravelling the genetic code. Although earlier proposals assumed efficient relationships that correlated the nucleotide codons uniquely with the twenty fundamental amino acids, it is now apparent that there is considerable redundancy in the code as it now operates. Furthermore, the code consists exclusively of non-overlapping triplet codons. Clever experiments provided some of the earliest breaks in deciphering the genetic code. Marshall Nirenberg found that RNA from many different organisms could initiate specific protein synthesis when combined with broken E. coli cells (the enzymes remain active). A synthetic polyuridine RNA induced synthesis of poly-phenylalanine, so the UUU codon designated phenylalanine. Likewise an alternating ~CACA~ RNA led to synthesis of a ~His-Thr-His-Thr~ polypeptide.

The following table presents the present day interpretation of the genetic code. Note that this is the RNA alphabet, and an equivalent DNA codon table would have all the U nucleotides replaced by T. Methionine and tryptophan are uniquely represented by a single codon. At the other extreme, leucine is represented by six codons. The average redundancy for the twenty amino acids is about three. Also, there are three stop codons that terminate polypeptide synthesis.


RNA Codons for Protein Synthesis
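
A quick tally of the standard genetic code shows how many codons each amino acid has. The sketch below builds the 64-entry table from the compact one-letter layout commonly used for the standard code (bases in the order U, C, A, G, with * marking a stop codon); the variable names are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons per amino acid, ignoring the stop signals.
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

for amino_acid, count in sorted(degeneracy.items(), key=lambda kv: kv[1]):
    print(amino_acid, count)
print("stop codons:", stop_codons)
print("average codons per amino acid:", sum(degeneracy.values()) / len(degeneracy))
```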

The translation process is fundamentally straightforward. The mRNA strand bearing the transcribed code for synthesis of a protein interacts with relatively small RNA molecules (about 70 nucleotides long) to which individual amino acids have been attached by an ester bond at the 3'-end.

These transfer RNAs (tRNAs) have distinctive three-dimensional structures consisting of loops of single-stranded RNA connected by double stranded segments. This cloverleaf secondary structure is further wrapped into an "L-shaped" assembly, having the amino acid at the end of one arm, and a characteristic anti-codon region at the other end. The anti-codon consists of a nucleotide triplet that is the complement of the amino acid's codon(s). Models of two such tRNA molecules are shown to the right. When read from the top to the bottom, the anti-codons depicted here should complement a codon in the previous table.


 A cell's protein synthesis takes place in organelles called ribosomes. Ribosomes are complex structures made up of two distinct and separable subunits (one about twice the size of the other). Each subunit is composed of one or two RNA molecules (60-70%) associated with 20 to 40 small proteins (30-40%). The ribosome accepts a mRNA molecule, binding initially to a characteristic nucleotide sequence at the 5'-end (colored light blue in the following diagram). This unique binding assures that polypeptide synthesis starts at the right codon. A tRNA molecule with the appropriate anti-codon then attaches at the starting point and this is followed by a series of adjacent tRNA attachments, peptide bond formation and shifts of the ribosome along the mRNA chain to expose new codons to the ribosomal chemistry.



The genetic code

It became apparent during the early phase of the investigation of protein synthesis that translation is fundamentally different from the transcription process that precedes it. During transcription the "language" of DNA sequences is converted to the closely related dialect of RNA sequences. During protein synthesis, however, a nucleic acid base sequence is converted to a clearly different language (i.e., an amino acid sequence), hence the use of the term "translation." Because mRNA and amino acid molecules have no natural affinity for each other, it became obvious to researchers (e.g., Francis Crick) that a series of adaptor molecules is required to mediate the translation process. This role was eventually assigned to tRNA molecules.




 Before the identification of adaptor molecules became feasible, however, a more important problem had to be solved: the deciphering of the genetic code.

The genetic code can be described as a coding dictionary that specifies a meaning for each specific base sequence. Once the importance of the genetic code was recognized, investigators began to speculate about its dimensions. Because only four different bases (G, C, A, and U) occur in mRNA and 20 amino acids must be specified, it appeared obvious that more than one base coded for each amino acid. A sequence of two bases would specify only a total of 16 amino acids (i.e., 4^2 = 16). However, a three-base sequence provides more than sufficient base combinations for translation to occur (i.e., 4^3 = 64).

The first major breakthrough in assigning mRNA triplet base sequences (later referred to as codons) came in 1961, when Marshall Nirenberg performed a series of experiments using an artificial test system containing an extract of Escherichia coli fortified with nucleotides, amino acids, ATP, and GTP. He showed that poly U (a synthetic polynucleotide whose base components consist only of uracil) directed the synthesis of polyphenylalanine. Assuming that codons consist of a three-base sequence, Nirenberg surmised that UUU codes for the amino acid phenylalanine. Subsequently, the experiment was repeated using poly A and poly C. Because polylysine and polyproline products resulted from these tests, the codons AAA and CCC were assigned to lysine and proline, respectively.

Most of the remaining codon assignments were determined with the aid of synthetic polynucleotides with repeating sequences. Such molecules were constructed by enzymatically amplifying short chemically synthesized sequences. The resulting polypeptides, which contained repeating peptide segments, were then analyzed. The information obtained from this technique, devised by Har Gobind Khorana, was later supplemented with a strategy used by Nirenberg. In this latter technique the capacity of specific trinucleotides to promote tRNA binding to ribosomes was measured.

The codon assignments for the 64 possible trinucleotide sequences are presented in the codon table. Of these, 61 code for amino acids. The remaining three codons (UAA, UAG, and UGA) are stop (polypeptide chain terminating) signals. AUG, the codon for methionine, also serves as a start signal (sometimes referred to as the initiating codon).

As a result of a variety of investigations, the genetic code is now believed to possess the following properties:

1. Degenerate. Any coding system in which several signals have the same meaning is said to be degenerate. The genetic code is partially degenerate because most amino acids are coded for by several codons. For example, leucine is coded for by six different codons (UUA, UUG, CUU, CUC, CUA, and CUG). In fact, methionine (AUG) and tryptophan (UGG) are the only amino acids that are coded for by a single codon.

2. Specific. Each codon is a signal for a specific amino acid. The majority of codons that code for the same amino acid possess similar sequences. For example, in four of the six serine codons (UCU, UCC, UCA, and UCG) the first and second bases are identical. It would appear that this feature of the genetic code serves to minimize the danger of point mutations (DNA sequence changes involving a single base pair).

3. Nonoverlapping and without punctuation. The mRNA coding sequence is "read" by a ribosome starting from the initiating codon (AUG) as a continuous sequence taken three bases at a time until a stop codon is reached. A set of contiguous triplet codons in an mRNA is called a reading frame. The term open reading frame is used to describe a series of triplet base sequences in mRNA that does not contain a stop codon.

4. Universal. With a few minor exceptions the genetic code is universal. In other words, examinations of the translation process in the species that have been investigated have revealed that the coding signals for amino acids are always the same.
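
The properties listed above can be illustrated with a minimal reading-frame sketch: it looks for the first AUG, then reads non-overlapping triplets until a stop codon appears. The table is the standard code written in the usual compact layout; the function name and the test message are hypothetical, and the sketch ignores everything a real ribosome does to locate the correct start site.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code; "*" marks the three stop codons.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG, three bases at a time, until a stop
    codon or the end of the message. Returns one-letter amino acid codes."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                      # no initiating codon found
    residues = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break                      # stop codon: release the chain
        residues.append(amino_acid)
    return "".join(residues)


# Hypothetical message: GG | AUG UUU AAA CCC UGG UAA | GG
message = "GGAUGUUUAAACCCUGGUAA" + "GG"
print(translate(message))
```

The example message reuses the UUU, AAA and CCC codons from the poly U, poly A and poly C experiments described earlier, so the output can be compared directly with those assignments.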


Codon-Anticodon Interactions

tRNA molecules are the "adapters" that are required for the translation of the genetic message. Each type of tRNA binds a specific amino acid (at the 3' terminus) and possesses a three-base sequence called the anticodon. It is the base pairing between the anticodon of the tRNA and an mRNA codon that is responsible for the actual translation of the genetic information of structural genes. It should be noted that codon-anticodon pairings are antiparallel. However, both sequences are given in the 5' → 3' direction. For example, the codon UGC binds to the anticodon GCA.
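
Because the pairing is antiparallel while both sequences are written 5' to 3', the anticodon is the reverse complement of the codon. A minimal sketch, using standard Watson-Crick pairs only (wobble pairing is ignored here) and a hypothetical function name:

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the perfectly complementary anticodon, written 5' -> 3'."""
    return "".join(RNA_PAIR[base] for base in reversed(codon))


print(anticodon_for("UGC"))   # the example from the text
print(anticodon_for("AUG"))   # the initiating codon
```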

Once the genetic code was broken, researchers anticipated the identification of 61 different types of tRNAs in living cells. Instead, they discovered that cells often operate with substantially fewer tRNAs than expected. Most cells possess about 50 tRNAs, although lower numbers have been observed. Further investigation of various tRNAs also revealed that the anticodon in some molecules contains uncommon nucleotides, such as inosinate (I), which typically occurs at the third anticodon position. (In eukaryotes, A in the third anticodon position is deaminated to form I.) As tRNAs were investigated, it became increasingly clear that some molecules recognize several codons. Crick proposed a rational explanation for this phenomenon, which he referred to as the wobble hypothesis.


The wobble hypothesis, which allows for multiple codon-anticodon interactions by individual tRNAs, is based principally on the following observations:

1. The first two base pairings in a codon-anticodon interaction confer most of the specificity required during translation. Recall that most redundant codons specifying a certain amino acid possess identical nucleotides in the first two positions. These interactions are standard base pairings.

2. The interactions between the third codon and anticodon nucleotides are less stringent. In fact, nontraditional base pairs often occur. For example, tRNAs containing G in the 5' (or "wobble") position of the anticodon can pair with two different codons (i.e., G can interact with either C or U). The same is true for U, which can interact with A or G. When I is in the wobble position of an anticodon, a tRNA can base pair with three different codons, since I can interact with U or A or C.
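
The two observations above can be turned into a small lookup that lists the codons a given anticodon could read. The pairings for G, U and I in the wobble position are those stated in the text; the entries for C and A (standard pairing only) are an added assumption, and the function name is hypothetical.

```python
# Standard pairing for the first two codon positions.
STANDARD_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Bases the 5' (wobble) anticodon nucleotide can pair with at the
# third codon position. G, U and I follow the text; C and A are assumed
# to pair only with their standard partners.
WOBBLE_PAIR = {"G": "CU", "U": "AG", "I": "UAC", "C": "G", "A": "U"}


def codons_read(anticodon):
    """List the codons (5' -> 3') that an anticodon (5' -> 3') can read.

    Pairing is antiparallel: the anticodon's 3' base pairs with the first
    codon base, and its 5' base is the wobble position.
    """
    wobble, middle, last = anticodon
    prefix = STANDARD_PAIR[last] + STANDARD_PAIR[middle]
    return [prefix + third for third in WOBBLE_PAIR[wobble]]


print(codons_read("GCA"))   # G in the wobble position
print(codons_read("IGC"))   # inosine in the wobble position
```

With G in the wobble position the anticodon GCA reads both UGC and UGU, and an anticodon beginning with I reads three codons, which is how fewer than 61 tRNAs can cover all 61 sense codons.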

A careful examination of the genetic code and the "wobble rules" indicates that a minimum of 31 tRNAs are required for the translation of all 61 codons. An additional tRNA that is required for initiating protein synthesis brings the total to 32 tRNAs.


Recognition of amino acids

Although the accuracy of translation (approximately one error per 10^4 amino acids incorporated) is lower than those of DNA replication and transcription, it is remarkably higher than one would expect of such a complex process. The principal reasons for the accuracy with which amino acids are incorporated into polypeptides include codon-anticodon base pairing and the mechanism by which amino acids are attached to their cognate tRNAs. The attachment of amino acids to tRNAs, a process that is considered to be the first step in protein synthesis, is catalyzed by a group of enzymes called the aminoacyl-tRNA synthetases. The precision with which these enzymes esterify each specific amino acid to the correct tRNA is now believed to be so important for accurate translation that their functioning has been referred to collectively as the second genetic code.
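
To get a feel for what an error rate of about one in 10^4 means, the sketch below estimates the fraction of polypeptide chains that contain no misincorporated residue. It assumes each residue is an independent event with the same error probability, and the chain lengths are arbitrary illustrative values; neither assumption comes from the text.

```python
ERROR_RATE = 1e-4   # approximately one error per 10^4 amino acids (from the text)


def error_free_fraction(n_residues, error_rate=ERROR_RATE):
    """Probability that a chain of n_residues contains no translation
    error, assuming independent errors at a constant rate."""
    return (1.0 - error_rate) ** n_residues


for length in (100, 300, 1000):   # hypothetical chain lengths
    print(f"{length:>5} residues: {error_free_fraction(length):.1%} error-free")
```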

In most organisms there is at least one aminoacyl-tRNA synthetase for each of the 20 amino acids. (Note that each enzyme links its specific amino acid to any appropriate tRNA. This is an important point, since in most cells many amino acids have several cognate tRNAs each.) The process in which an amino acid is linked to the 3' terminus of the correct tRNA consists of two sequential reactions, both of which occur within the active site of the synthetase:

1. Activation. The synthetase first catalyzes the formation of aminoacyl-AMP. This reaction, which serves to activate the amino acid through the formation of a high-energy mixed anhydride bond is driven to completion through the subsequent hydrolysis of its other product, pyrophosphate. (An anhydride is a molecule containing two carbonyl groups linked through an oxygen atom).


2. tRNA linkage. A specific tRNA, also bound in the active site of the synthetase, becomes attached to the aminoacyl group through an ester linkage. Although the aminoacyl ester linkage to the tRNA is lower in energy than the mixed anhydride of aminoacyl AMP, it still possesses sufficient energy to drive peptide bond formation.

The sum of the reactions catalyzed by the aminoacyl-tRNA synthetases is as follows:

             Amino acid + ATP + tRNA → aminoacyl-tRNA + AMP + PPi

Because the product PPi (pyrophosphate) is immediately hydrolyzed with a large loss of free energy, tRNA charging is an irreversible process. Because AMP is a product of this reaction, the metabolic price for the linkage of each amino acid to its tRNA is the equivalent of two molecules of ATP.

The aminoacyl-tRNA synthetases are a diverse group of enzymes that vary in molecular weight, primary sequence, and number of subunits. Despite this diversity, each enzyme efficiently produces a specific aminoacyl-tRNA product in a relatively error-free manner. The specificity with which each of the synthetases binds the correct amino acid and its cognate tRNA is crucial for the fidelity of the translation process.

Aminoacyl-tRNA synthetases must also recognize and bind the correct tRNA molecules. For some enzymes (e.g., glutaminyl-tRNA synthetase), anticodon structure is an important feature of the recognition process. However, several enzymes appear to recognize other tRNA structural elements in addition to or instead of the anticodon.


Protein Synthesis

The translation of a genetic message into the primary sequence of a polypeptide can be divided into three phases: initiation, elongation, and termination.


Translation is relatively rapid in prokaryotes. For example, an E. coli ribosome can incorporate as many as 20 amino acids per second. (The eukaryotic rate, at about 50 residues per minute, is significantly slower.) Prokaryotic ribosomes are composed of a 50S large subunit and a 30S small subunit.

1. Initiation. Translation begins with the formation of an initiation complex. In prokaryotes this process requires three initiation factors (IFs). IF-3 has previously bound to the 30S subunit, thereby preventing it from binding prematurely to the 50S subunit. As an mRNA binds to the 30S subunit, it is guided into a precise location, so that the initiation codon AUG is correctly positioned. Each gene on a polycistronic mRNA possesses its own initiation codon. The translation of each gene appears to occur independently, that is, translation of the first gene may or may not be followed by the translation of subsequent genes.

In the next step in initiation, IF-2 (a GTP-binding protein with a bound GTP) binds to the 30S subunit, where it promotes the binding of the initiating tRNA to the initiation codon in the P site. (There are two sites on the complete ribosome for codon-anticodon interaction: the P (peptidyl) site and the A (aminoacyl) site.) The initiating tRNA in prokaryotes is N-formyl-methionine-tRNA. The initiation phase ends as the 50S subunit binds the 30S subunit. Simultaneously, IF-2 and IF-3 are released. The role of IF-1 is unclear.

Most of the major differences between the prokaryotic and eukaryotic versions of protein synthesis occur during the initiation phase. There are at least nine eukaryotic initiating factors (eIFs), several of which possess numerous subunits. Eukaryotic initiation begins when the small 40S ribosomal subunit binds to a complex composed of eIF-2 (a GTP-binding protein), GTP, and an initiating species of methionyl-tRNA. The small (40S) subunit is prevented from binding to the large (60S) subunit during this phase of initiation because it is associated with eIF-3, a multisubunit protein.

2. Elongation. It is during the elongation phase that the polypeptide is actually synthesized according to the specifications of the genetic message. Elongation, the phase in which amino acids are incorporated into a polypeptide chain, consists of three steps: (1) positioning of an aminoacyl-tRNA in the A site, (2) peptide bond formation, and (3) translocation.

The prokaryotic elongation process begins when an aminoacyl-tRNA, specified by the next codon, binds to the A site. Before it can be positioned in the A site, the aminoacyl-tRNA must first bind EF-Tu-GTP. The elongation factor EF-Tu is a GTP-binding protein involved in the positioning of aminoacyl-tRNA molecules in the A site. After the aminoacyl-tRNA is positioned, the GTP bound to EF-Tu is hydrolyzed to GDP and Pi (inorganic phosphate). GTP hydrolysis results in the release of EF-Tu from the ribosome. Subsequently, a second elongation factor, referred to as EF-Ts, promotes EF-Tu regeneration by displacing its GDP moiety. EF-Ts is then itself displaced by an incoming GTP molecule.

After the positioning of the second aminoacyl-tRNA in the A site, the formation of a peptide bond is catalyzed by peptidyl transferase. The energy required to drive this reaction is provided by the high-energy ester bond linking the P site amino acid to its tRNA. (During the first elongation cycle, this amino acid is formylmethionine.) As was described previously, the now uncharged tRNA occupying the P site leaves the ribosome.

For translation to continue, the mRNA must move, or "translocate," so that a new codon-anticodon interaction can occur. Translocation requires the binding of another GTP-binding protein referred to as EF-G. GTP hydrolysis provides the energy required for the ribosomal conformational change that is apparently involved in the movement of the peptidyl-tRNA (the tRNA bearing the growing peptide chain) from the A site to the P site. The unoccupied A site then binds an appropriate aminoacyl-tRNA to the new A site codon. After the subsequent release of EF-G the ribosome is ready for the next elongation cycle. Elongation continues until a stop codon enters the A site.

Termination. The termination phase begins when a termination codon (UAA, UAG, or UGA) enters the A site. Three release factors (RF-1, RF-2, and RF-3) are involved in termination. The codons UAA and UAG are recognized by RF-1, whereas UAA and UGA are recognized by RF-2.

This recognition process, which involves GTP hydrolysis, results in the following alterations in ribosome function. The peptidyl transferase, which is transiently transformed into an esterase, hydrolyzes the bond linking the completed polypeptide chain and the P site tRNA. Following the polypeptide's release from the ribosome, the mRNA and tRNA also dissociate. Termination ends with the dissociation of the ribosome into its constituent subunits.
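The reading of successive codons until a termination codon enters the A site can be sketched in a few lines of Python. This is only an illustration, not a model of the ribosome: the function name `translate`, the example sequence, and the deliberately partial codon table are assumptions made here, and the sketch simply treats the first AUG as the initiation codon.

```python
# Partial codon table (assumption: only the codons used in the example are
# listed; the complete genetic code has 64 entries).
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "AAA": "Lys",
}
# The three termination codons named in the text.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna):
    """Read codons from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no initiation codon found
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break  # termination: the chain is released
        # Codons missing from the partial table are reported as "Xaa".
        peptide.append(CODON_TABLE.get(codon, "Xaa"))
    return peptide


print(translate("GGAUGUUUGGCGCUAAAUAAGG"))
```

In the example the reading frame is set by the AUG at the third position, and the UAA codon ends the chain after five residues.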

In addition to the ribosomal subunits, mRNA and aminoacyl-tRNAs, translation requires an energy source (GTP) and a wide variety of protein factors. These factors perform several types of roles. Some have catalytic functions; others stabilize specific structures that form during translation. Translation factors are classified according to the phase of the translation process that they affect, that is, initiation, elongation, or termination. The major differences between prokaryotic and eukaryotic translation appear to be due largely to the identity and functioning of these protein factors.

3. Post-translational Modification

Once a peptide or protein has been synthesized and released from the ribosome it often undergoes further chemical transformation. This post-translational modification may involve the attachment of other moieties such as acyl groups, alkyl groups, phosphates, sulfates, lipids and carbohydrates. Functional changes such as dehydration, amidation, hydrolysis and oxidation (e.g. disulfide bond formation) are also common. In this manner the limited array of twenty amino acids designated by the codons may be expanded in a variety of ways to enable proper functioning of the resulting protein. Since these post-translational reactions are generally catalyzed by enzymes, it may be said: "Virtually every molecule in a cell is made by the ribosome or by enzymes made by the ribosome."

Modifications, like phosphorylation and citrullination, are part of common mechanisms for controlling the behavior of a protein. As shown on the left below, citrullination is the post-translational modification of the amino acid arginine into the amino acid citrulline. Arginine is positively charged at a neutral pH, whereas citrulline is uncharged, so this change increases the hydrophobicity of a protein. Phosphorylation of serine, threonine or tyrosine residues renders them more hydrophilic, but such changes are usually transient, serving to regulate the biological activity of the protein. Other important functional changes include iodination of tyrosine residues in the peptide thyroglobulin by action of the enzyme thyroperoxidase. The monoiodotyrosine and diiodotyrosine formed in this manner are then linked to form the thyroid hormones T3 and T4, shown below.

Amino acids may be enzymatically removed from the amino end of the protein. Because the "start" codon on mRNA codes for the amino acid methionine, this amino acid is usually removed from the resulting protein during post-translational modification. Peptide chains may also be cut in the middle to form shorter strands. Thus, insulin is initially synthesized as a 105 residue preprotein. The 24-amino acid signal peptide is removed, yielding a proinsulin peptide. This folds and forms disulfide bonds between cysteines 7 and 67 and between 19 and 80. Such dimeric cysteines, joined by a disulfide bond, are named cystine. A protease then cleaves the peptide at arg31 and arg60, with loss of the 32-60 sequence (chain C). Removal of arg31 yields mature insulin, with the A and B chains held together by disulfide bonds and a third cystine moiety in chain A. The following cartoon illustrates this chain of events.

Nisin is a polypeptide (34 amino acids) made by the bacterium Lactococcus lactis. Nisin kills Gram-positive bacteria by binding to their membranes and targeting lipid II, an essential precursor of cell wall synthesis. Such antimicrobial peptides are a growing family of compounds which have received the name lantibiotics due to the presence of lanthionine, a nonproteinogenic amino acid with the chemical formula HO2C-CH(NH2)-CH2-S-CH2-CH(NH2)-CO2H. Lanthionine is composed of two alanine residues that are crosslinked on their β-carbon atoms by a thioether linkage (i.e. it is the monosulfide analog of the disulfide cystine). Lantibiotics are unique in that they are ribosomally synthesized as prepeptides, followed by post-translational processing of a number of amino acids (e.g. serine, threonine and cysteine) into dehydro residues and thioether crossbridges. Nisin is the only bacteriocin that is accepted as a food preservative. Several nisin subtypes that differ in amino acid composition and biological activity are known. A typical structure is drawn below, and a Jmol model will be presented by clicking on the diagram.


The bacterial cell wall is a cross-linked glycan polymer that surrounds bacterial cells, dictates their cell shape, and prevents them from breaking due to environmental changes in osmotic pressure. This wall consists mainly of peptidoglycan or murein, a three-dimensional polymer of sugars and amino acids located on the exterior of the cytoplasmic membrane.


The monomer unit, which is composed of two amino sugars, N-acetylglucosamine (NAG) and N-acetylmuramic acid (NAM), is shown. Transglycosidase enzymes join these units by glycoside bonds, and they are further interlinked to each other via peptide cross-links between the pentapeptide moieties that are attached to the NAM residues. Peptidoglycan subunits are assembled on the cytoplasmic side of the bacterial membrane on a polyisoprenoid anchor. Lipid II, a membrane-anchored cell-wall precursor that is essential for bacterial cell-wall biosynthesis, is one of the key components in the synthesis of peptidoglycan. Peptidoglycan synthesis via polymerization of Lipid II is illustrated in the following diagram. Cross-linking of the peptide side chains is then effected by transpeptidase enzymes. A model of Lipid II complexed with nisin may be examined as part of the previous Jmol display.

In order for bacteria to divide by binary fission and increase their size following division, links in the peptidoglycan must be broken, new peptidoglycan monomers must be inserted, and the peptide cross links must be resealed. Transglycosidase enzymes catalyze the formation of glycosidic bonds between the NAM and NAG of the peptidoglycan monomers and the NAG and NAM of the existing peptidoglycan. Finally, transpeptidase enzymes reform the peptide cross-links between the rows and layers of peptidoglycan making the wall strong. Many antibiotic drugs, including penicillin, target the chemistry of cell wall formation. The effectiveness of choosing Lipid II for an antibacterial strategy is highlighted by the fact that it is the target for at least four different classes of antibiotic, including the clinically important glycopeptide antibiotic vancomycin. The growing problem of bacterial resistance to many current drugs, including vancomycin, has led to increasing interest in the therapeutic potential of other classes of compound that target Lipid II. Lantibiotics such as nisin are part of this interest.


Analysis of Structural Similarities and Differences between DNA and RNA

1. Background

We know that living organisms have the ability to reproduce and to pass many of their characteristics on to their offspring. From this we may infer that all organisms have genetic substances and an associated chemistry that enable inheritance to occur. It is instructive to consider the essential requirements such genetic materials must fulfill.


Biologically useful information, especially instructions for protein synthesis, must be incorporated in the material.

The inherited information must be stable (unchanged) over the lifetime of the organism if accurate copies are to be conveyed to the offspring. Infrequent changes may take place (see mutability).

A method of faithfully replicating the information encoded in the material, and transmitting this copy to the offspring must exist.

Despite the inherent stability noted above, the material must be capable of incorporating stable structural change, and passing this change on to succeeding generations.

Since this genetic substance has been identified as the nucleic acids DNA and RNA, it is instructive to examine the manner in which these polymers satisfy the above requirements.

2. Information Storage

The complexity of life suggests that even simple organisms will require very large inheritance libraries. Although the four nucleotides that make up DNA might appear to be too simple for this task, the enormous size of the polymer and the permutations of the monomers within the chain meet the challenge easily. After all, the words and graphics in this document are all presented to the computer as combinations of only two characters, zeros and ones (the binary number system). DNA has four letters in its alphabet (A, C, G & T), so the number of words that can be formed increases exponentially with the number of letters per word. Thus, there are 4^2 or 16 two-letter words, and 4^3 or 64 three-letter words.
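The exponential growth in the number of possible words is easy to verify by direct enumeration. The short Python sketch below uses illustrative function names (`count_words`, `list_words`) that are not part of the original discussion; it counts the words of a given length and, for short lengths, lists them.

```python
from itertools import product

BASES = "ACGT"  # the four-letter DNA alphabet


def count_words(length):
    """Number of distinct words of the given length: 4 ** length."""
    return len(BASES) ** length


def list_words(length):
    """Enumerate every word of the given length (practical only for short words)."""
    return ["".join(letters) for letters in product(BASES, repeat=length)]


for n in (1, 2, 3, 10):
    print(n, count_words(n))

print(list_words(2))
```

The sixteen two-letter words are printed in full; at ten letters per word the count already exceeds one million.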

Assuring the stability of information encoded by the DNA alphabet presents a serious challenge. If the letters of this alphabet are to be strung together in a specific way on the polymer chain, chemical reactions for attaching (and removing) them must be available. Simple carboxylic ester or amide links might appear suitable for this purpose (note step-growth polymerization), but these are used in lipids and polypeptides, so a separate enzymatic machinery would be needed to keep the information processing operations apart from other molecular transformations. The overall stability of such covalent links presents a more serious problem. Under physiological conditions (aqueous, pH near 7.4 & 27 to 37º C) esters are slowly hydrolyzed. Amides are more stable, but even a hydrolytic cleavage of one bond per hour would be devastating to a polymer having tens of thousands to millions of such links. Furthermore, short difunctional linking groups, such as carbonates, oxalates and malonates show enhanced reactivity, and their parent acids are unstable or toxic.

Table: Ester Hydrolysis at 35º C and pH 7. Columns: Ester, Rate of Hydrolysis, Relative Rate. Rows: Ethyl Acetate, Trimethyl Phosphate, Dimethyl Phosphate.

Phosphate is a ubiquitous inorganic nutrient. Mono, di and triesters of the corresponding acid (phosphoric acid) are all known. Because of their acidity (pKa ≈ 2), the mono and diesters are negatively charged at physiological pH, rendering them less susceptible to nucleophilic attack. The influence of negative charge on the rate of nucleophilic hydrolysis of some representative esters is shown in the accompanying table. Clearly, a polymer in which monomer units are joined by negatively charged phosphate diester links should be substantially more stable than one composed of carboxylate ester bonds. The negative charge found on all biological phosphate derivatives serves other purposes as well.

The phosphate diester links that join the nucleotide units of DNA are formed by phosphorylation reactions involving nucleoside triphosphate reagents. These reagents are the phosphoric acid analogs of carboxylic acid anhydrides, a functional group that would not survive the aqueous environment of a cell. The high density of negative charge on the triphosphate function not only solubilizes the organic moiety to which it is attached, but also reduces the rate at which it is hydrolyzed.

Living cells must conserve and employ their chemical reagents within a volume defined and enclosed by a membrane barrier. These lipid bilayer membranes have hydrophobic interiors, which resist the passage of ions. Indeed, special trans-membrane structures called ion channels exist so that controlled ion transport across a membrane may take place. Small neutral organic molecules, such as adenosine, cytidine and guanosine, may pass through lipid membranes, albeit at a reduced rate, but their mono, di and triphosphate derivatives are more tightly sequestered in the cell.
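The statement that phosphate mono and diesters (pKa ≈ 2) are negatively charged at physiological pH can be checked with the Henderson-Hasselbalch relationship. The sketch below is a rough estimate only: it treats the ester as a simple monoprotic acid, and the function name `fraction_deprotonated` is an illustrative choice.

```python
def fraction_deprotonated(pka, ph):
    """Henderson-Hasselbalch: fraction of an acid present as its anion."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


# A phosphate diester (pKa about 2, as quoted in the text) at pH 7.4.
print(fraction_deprotonated(2.0, 7.4))

# For comparison, at a pH equal to the pKa exactly half is ionized.
print(fraction_deprotonated(2.0, 2.0))
```

With a gap of more than five pH units between the pKa and the surrounding medium, essentially every phosphate diester group carries a negative charge.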

3. Why is 2'-Deoxyribose the Sugar Moiety in DNA?

Common perhydroxylated sugars, such as glucose and ribose, are formed in nature as products of the reductive condensation of carbon dioxide we call photosynthesis. The formation of deoxysugars requires additional biological reduction steps, so it is reasonable to speculate why DNA makes use of the less common 2'-deoxyribose, when ribose itself serves well for RNA. At least two problems associated with the extra hydroxyl group in ribose may be noted. First, the additional bulk and hydrogen bonding character of the 2'-OH interfere with a uniform double helix structure, preventing the efficient packing of such a molecule in the chromosome. Second, RNA undergoes spontaneous hydrolytic cleavage about one hundred times faster than DNA. This is believed due to intramolecular attack of the 2'-hydroxyl function on the neighboring phosphate diester, yielding a 2',3'-cyclic phosphate. If stability over the lifetime of an organism is an essential characteristic of a gene, then nature's selection of 2'-deoxyribose for DNA makes sense. The following diagram illustrates the intramolecular cleavage reaction in a strand of RNA.

Structural stability is not a serious challenge for RNA. The transcribed information carried by mRNA must be secure for only a few hours, as it is transported to a ribosome. Once in the ribosome it is surrounded by structural and enzymatic segments that immediately incorporate its codons for protein synthesis. The tRNA molecules that carry amino acids to the ribosome are similarly short lived, and are in fact continuously recycled by the cellular chemistry.

4. The Thymine vs. Uracil Issue

Structural formulas for the three pyrimidine bases, cytosine, thymine and uracil are shown. The carbon atoms that are part of these compounds may be categorized as follows. All of these compounds are apparently put together from a three-carbon malonate-like precursor (blue colored bonds) and a single high oxidation state carbon species (colored red). Such biosynthetic intermediates are well established. Thymine is unique in having an additional carbon, the green methyl group. Biosynthesis of this compound must involve additional steps, thus adding constructional complexity to the DNA molecules in which it replaces uracil.

The reason for the substitution of thymine for uracil in DNA may be associated with the repair mechanisms by which the cell corrects damage to its DNA. One source of error in the code is the slow hydrolysis of heterocyclic enamines, such as cytosine and guanine, to their corresponding lactams. This changes the structure of the base, and disrupts base pairing in a manner that can be identified and then repaired. However, the hydrolysis product from cytosine is uracil, and this mismatched species must somehow be distinguished from the uracil-like base that belongs in the DNA. The extra methyl group serves this role nicely.

Gene Expression

Ultimately, the internal order that is the most essential property of living organisms requires the precise and timely regulation of gene expression. It is, after all, the capacity to switch genes on and off that enables cells to respond efficiently to a changing environment. In multicellular organisms, complex programmed patterns of gene expression are responsible for cell differentiation as well as intercellular cooperation.

The regulation of genes, as measured by their transcription rates, is the result of a complex hierarchy of control elements that act to coordinate the cell's metabolic activities. Some genes, referred to as constitutive or housekeeping genes, are routinely transcribed because they code for gene products (e.g., glucose-metabolizing enzymes, ribosomal proteins, and histones) that are required for cell function. In addition, in the differentiated cells of multicellular organisms, certain specialized proteins are produced that cannot be detected elsewhere (e.g., hemoglobin in red blood cells). Genes that are expressed only under certain circumstances are referred to as inducible. For example, the enzymes that are required for lactose metabolism in E. coli are synthesized only when lactose is actually present and glucose, the bacterium's preferred energy source, is absent.

Most of the mechanisms that are used by living cells to regulate gene expression involve DNA-protein interactions. At first glance, the seemingly repetitious and regular structure of B-DNA appears to make it an unlikely partner for the sophisticated binding with myriad different proteins that obviously must occur in gene regulation. However, DNA is somewhat deformable, and certain sequences can be curved or bent. In addition, it is now recognized that the edges of the base pairs within the major groove (and to a lesser extent the minor groove) of the double helix can participate in sequence-specific binding to proteins. Numerous contacts (often about 20 or so) involving hydrophobic interactions, hydrogen bonds, and ionic bonds between amino acids and nucleotide bases result in highly specific DNA-protein binding.

The three-dimensional structures of a number of DNA regulatory proteins that have been determined have surprisingly similar features. In addition to usually possessing twofold axes of symmetry, most of these molecules can be separated into families on the basis of the following structural domains: (1) helix-turn-helix, (2) helix-loop-helix, (3) leucine zipper, (4) zinc finger, and (5) β-sheets. It should be noted that DNA-binding proteins, many of which are transcription factors, often form dimers. For example, a variety of transcription factors with leucine zipper motifs form dimers as their leucine-containing α-helices interdigitate. Because each type of protein possesses its own unique binding specificity, the capacity of these and many other transcription factors to combine to form homodimers (two identical monomers) and heterodimers (two different monomers) results in a large number of unique gene regulatory agents.
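The way dimerization multiplies the number of regulatory agents can be illustrated with a simple count. The sketch below makes the simplifying assumption, which real transcription factor families do not fully satisfy, that any monomer can pair with any other; the monomer labels and function names are hypothetical.

```python
from itertools import combinations_with_replacement


def dimer_counts(n_monomers):
    """Return (homodimers, heterodimers) possible from n distinct monomers."""
    homo = n_monomers
    hetero = n_monomers * (n_monomers - 1) // 2
    return homo, hetero


def list_dimers(monomers):
    """List every unordered pair, including pairs of identical monomers."""
    return list(combinations_with_replacement(monomers, 2))


print(dimer_counts(4))
print(list_dimers(["A", "B", "C", "D"]))
```

Under this assumption four monomers give ten distinct dimers, and the total grows roughly as the square of the number of monomers.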

Considering the obvious complexity of function observed in living organisms, it is not surprising that the regulation of gene expression has proven to be both remarkably complex and difficult to investigate. For many reasons, knowledge concerning prokaryotic gene expression is significantly more advanced than that of eukaryotes. Prokaryotic gene expression was originally investigated, in part, as a model for the study of the more complicated gene function of mammals. Although it is now recognized that the two genome types are vastly different in many respects, the prokaryotic work has provided many valuable insights into the basic mechanisms of gene expression. In general, prokaryotic gene expression involves the interaction of specific proteins (sometimes referred to as regulators) with DNA in the immediate vicinity of a transcription start site. Such interactions may have either a positive effect (i.e., transcription is initiated or increased) or a negative effect (i.e., transcription is blocked). In an interesting variation the inhibition of a negative regulator (called a repressor) results in the activation of affected genes. (The inhibition of a repressor gene is referred to as derepression.) Eukaryotic gene expression also uses these mechanisms as well as several others, including gene rearrangement and amplification and various types of complex transcriptional, RNA processing, and translational controls. In addition, the spatial separation of transcription and translation that is inherent in eukaryotic cells provides another opportunity for regulation: RNA transport control. Finally, eukaryotes (as well as prokaryotes) also regulate cell function through the modulation of proteins through various types of covalent modification.

The discussion of prokaryotic gene expression focuses on the lac operon. The lac operon of E. coli, originally investigated by François Jacob and Jacques Monod in the 1950s, remains one of the best-understood models of gene regulation. Despite a daunting lack of knowledge concerning eukaryotic gene expression, a significant number of the pieces in this marvelous puzzle have been revealed.

The highly regulated metabolism of prokaryotes such as E. coli allows these organisms to respond rapidly to changing environmental conditions in a manner that promotes growth and survival. The timely synthesis of enzymes and other gene products only when needed prevents the waste of energy and nutritional resources. At the genetic level, the control of inducible genes is often effected by collections of structural and regulatory genes called operons. Investigations of operons, especially the lac operon, have provided substantial insight into how gene expression can be altered by environmental conditions. Similarly, investigations of viral infections of prokaryotes have furnished relatively unobstructed views of certain genetic mechanisms. The infection of E. coli by bacteriophage λ has been especially instructive.


The Lac Operon. The lac operon consists of a control element and structural genes that code for the enzymes of lactose metabolism. The control element contains the promoter site, which overlaps the operator site. (In prokaryotes the operator is a DNA sequence involved in the regulation of adjacent genes that binds to a repressor protein.) The promoter site also contains the CAP site. The structural genes Z, Y, and A specify the primary structure of β-galactosidase, lactose permease, and thiogalactoside transacetylase, respectively. β-Galactosidase catalyzes the hydrolysis of lactose, which yields the monosaccharides galactose and glucose, whereas lactose permease promotes lactose transport into the cell. Because lactose metabolism proceeds normally in the absence of thiogalactoside transacetylase, its role is unclear. A repressor gene i, directly adjacent to the lac operon, codes for the lac repressor protein, a tetramer that binds to the operator site with high affinity. (There are about ten copies of lac repressor per cell.) The binding of the lac repressor to the operator prevents the functional binding of RNA polymerase to the promoter.

In the absence of its inducer (allolactose) the lac operon remains repressed because of the binding of lac repressor to the operator. When lactose becomes available, a few molecules are converted to allolactose by β-galactosidase. Allolactose then binds to the repressor, causing a change in its conformation that promotes dissociation from the operator. Once the inactive repressor diffuses away from the operator, the transcription of the structural genes is initiated. The lac operon remains active until the lactose supply is consumed. The repressor subsequently reverts to its active form and rebinds to the operator.

Glucose is the preferred carbon and energy source for E. coli. In the event that the organism is exposed to both glucose and lactose, the glucose is metabolized first. Syntheses of the lac operon enzymes are induced only after the glucose is no longer available. (This makes sense because glucose is more commonly available and has a central role in cellular metabolism. Why expend the energy to synthesize the enzymes required for the metabolism of other sugars if glucose is also available?) The delay in activating the lac operon is mediated by a catabolite gene activator protein (CAP). CAP is an allosteric homodimer that binds to the chromosome at a site directly in front of the lac promoter when glucose is absent. CAP can act as an indicator of glucose concentration because of its capacity to bind to cAMP. (For reasons that are not yet clear, the cell's cAMP concentration is inversely related to glucose concentration.) The binding of cAMP to CAP, a process that occurs only when glucose is absent and cAMP levels are high, causes a conformational change that allows the protein to bind to the lac promoter. CAP binding promotes transcription by increasing the affinity of RNA polymerase for the lac promoter. In other words, CAP exerts a positive or activating control on lactose metabolism.
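The combined negative (repressor) and positive (CAP) controls described above amount to a simple logical rule, which can be written as a truth table. The following Python sketch is a deliberate oversimplification: it treats each control as an on/off switch and ignores basal transcription, and the function name `lac_operon_transcribed` is an illustrative assumption.

```python
def lac_operon_transcribed(lactose_present, glucose_present):
    """Boolean sketch of lac operon control as described in the text."""
    # Without lactose there is no allolactose, so the repressor stays on the operator.
    repressor_on_operator = not lactose_present
    # cAMP is high only when glucose is absent; cAMP-CAP then binds at the promoter.
    cap_on_promoter = not glucose_present
    # Transcription needs the operator free and CAP bound.
    return (not repressor_on_operator) and cap_on_promoter


for lactose in (False, True):
    for glucose in (False, True):
        print("lactose:", lactose, "glucose:", glucose,
              "transcribed:", lac_operon_transcribed(lactose, glucose))
```

Only the combination of lactose present and glucose absent switches the operon on, which matches the observation that glucose is metabolized first.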

Protein synthesis is an extraordinarily complex process in which genetic information encoded in the nucleic acids is "translated" into the 20 amino acid "alphabet" of polypeptides.



Posttranslational Modification

Regardless of the species, immediately after translation, some polypeptides fold into their final form without further modifications. Frequently, however, newly synthesized polypeptides are modified. These alterations, referred to as posttranslational modifications, can be considered to be the fourth phase of translation. They include the removal of portions of the polypeptide by proteases, the addition of a variety of groups to the side chains of certain amino acid residues, and the insertion of cofactors. Often, individual polypeptides then combine to form polymeric proteins. Posttranslational modifications appear to serve two general purposes: (1) preparation of a polypeptide to serve its specific function and (2) direction of a polypeptide to a specific location, a process referred to as targeting. Targeting is an especially complex process in eukaryotes because proteins must be directed to a variety of different destinations. In addition to cytoplasm and the plasma membrane (the principal destinations in prokaryotes), eukaryotic proteins may be destined for delivery to a variety of organelles (e.g., mitochondria, chloroplasts, lysosomes, peroxisomes).

Most nascent polypeptides undergo one or more types of covalent modifications. These alterations, which may occur either during ongoing polypeptide synthesis or afterwards, consist of reactions that modify the side chains of specific amino acid residues or involve the breaking of specific bonds. In general, posttranslational modifications prepare each molecule for its functional role and/or for folding into its native (i.e., biologically active) conformation. Examples of prominent posttranslational changes include the following:

1. Proteolytic cleavage. Typical examples of proteolytic cleavage include the removal of the N-terminal methionine residue and signal sequences, and the conversion of inactive precursors to their active counterparts. Recall, for example, that certain enzymes, referred to as proenzymes or zymogens, are transformed into their active forms by cleavage of specific peptide bonds. Inactive polypeptide precursors are called proproteins. The proteolytic processing of insulin provides a well-researched example of the conversion of a nonenzyme protein into its active form.


2. Glycosylation. Although a wide variety of eukaryotic proteins are glycosylated, the functional purpose of the carbohydrate moieties is not always obvious. In general, secreted proteins contain complex oligosaccharide species, while ER membrane proteins possess high mannose species.

3. Hydroxylation. Hydroxylation of the amino acids proline and lysine is required for the structural integrity of the connective tissue proteins collagen and elastin. 4-Hydroxyproline is also found in acetylcholinesterase (the enzyme that degrades the neurotransmitter acetylcholine) and complement (a complex series of serum proteins involved in the immune response). Ascorbic acid (vitamin C) is required for the hydroxylation of proline and lysine residues in collagen. When dietary intake is inadequate, scurvy results. The symptoms of scurvy (e.g., blood vessel fragility and poor wound healing) are a consequence of weak collagen fiber structure.

4. Phosphorylation. The roles of protein phosphorylation in various examples of metabolic control and signal transduction are well known. Protein phosphorylation may also play a critical (and interrelated) role in protein-protein interactions. For example, the autophosphorylation of tyrosine residues in PDGF receptors apparently results in the subsequent binding of certain cytoplasmic signaling molecules.

5. Lipophilic modifications. The covalent attachment of lipid moieties to proteins improves membrane binding capacity and/or certain protein-protein interactions. Among the most common lipophilic modifications is acylation (the attachment of fatty acids). Although the fatty acid myristate (14:0) is relatively rare in eukaryotic cells, myristoylation is one of the most common forms of acylation.

6. Methylation. Protein methylation serves several purposes in eukaryotes. The methylation of altered aspartate residues by a specific type of methyltransferase promotes either the repair or the degradation of damaged proteins. Other methyltransferases catalyze reactions that alter the cellular roles of certain proteins.

7. Disulfide bond formation. Disulfide bonds are generally found only in secretory proteins (e.g., insulin) and certain membrane proteins. (Recall that "disulfide bridges" are strong bonds that confer considerable structural stability on the molecules that contain them.) Cytoplasmic proteins generally do not possess disulfide bonds because of the presence of various reducing agents in cytoplasm (e.g., glutathione and thioredoxin).



Despite the vast complexities of eukaryotic cell structure and function, each newly synthesized polypeptide is normally directed to its proper destination. Considering that translation takes place in the cytoplasm (except for certain molecules that are produced within mitochondria and plastids) and that a wide variety of polypeptides must be directed to various locations, it is not surprising that the mechanisms by which cellular proteins are "targeted" are complex. Although this process is not yet completely understood, there appear to be two principal mechanisms by which polypeptides are directed to their correct locations: transcript localization and signal peptides.

It is generally recognized that cells often have asymmetric protein distributions within the cytoplasm. It is now believed that cytoplasmic protein gradients are created by transcript localization, that is, the binding of specific mRNA to receptors in certain cytoplasmic locations.

Polypeptides that are destined for secretion or for use in the plasma membrane or any of the membranous organelles must be specifically targeted to their proper location. Several types of these proteins possess sorting signals that are referred to as signal peptides. Each signal peptide sequence promotes the insertion of the polypeptide that contains it into an appropriate membrane.


Translational Control Mechanisms

Protein synthesis is an exceptionally expensive process. With a cost of four high-energy phosphate bonds per peptide bond (i.e., two bonds expended during tRNA charging and one each during A site-tRNA binding and translocation) it is perhaps not surprising that enormous quantities of energy are involved.
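The bookkeeping behind this figure can be made concrete with a short calculation. The following Python sketch simply multiplies the per-bond cost itemized above by the number of peptide bonds in a chain. The function name `elongation_cost` and the 300-residue example are illustrative assumptions, and the sketch deliberately ignores any additional costs of initiation, termination, and proofreading, so it should be read as a rough lower estimate rather than a complete energy budget.

```python
# Phosphate-bond cost per peptide bond, itemized as in the text.
CHARGING_BONDS = 2        # expended during tRNA charging
A_SITE_BINDING_BONDS = 1  # expended during A site-tRNA binding
TRANSLOCATION_BONDS = 1   # expended during translocation

BONDS_PER_PEPTIDE_BOND = (
    CHARGING_BONDS + A_SITE_BINDING_BONDS + TRANSLOCATION_BONDS
)


def elongation_cost(n_residues: int) -> int:
    """Estimate the high-energy phosphate bonds spent on peptide bond formation.

    Assumption (not from the text): only the per-peptide-bond cost is counted;
    initiation, termination, and proofreading costs are left out.
    """
    if n_residues < 1:
        raise ValueError("a polypeptide needs at least one residue")
    peptide_bonds = n_residues - 1  # a chain of n residues has n - 1 peptide bonds
    return BONDS_PER_PEPTIDE_BOND * peptide_bonds


if __name__ == "__main__":
    # Hypothetical 300-residue polypeptide: 299 peptide bonds at 4 bonds each.
    print(elongation_cost(300))  # 1196
```

Even for a single modest-sized polypeptide the total exceeds a thousand phosphate bonds, which illustrates why a cell that synthesizes many thousands of protein molecules commits a large share of its energy to translation.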

Although the speed and accuracy of translation require a high energy input, the cost would be even higher without metabolic control mechanisms. It is these mechanisms that allow prokaryotic cells to compete with each other for limited nutritional resources.

Eukaryotic translational control mechanisms are proving to be exceptionally complex, substantially more so than those observed in prokaryotes. In prokaryotes such as E. coli, most of the control of protein synthesis occurs at the level of transcription. This circumstance makes sense for several reasons. First, transcription and translation are directly coupled; that is, translation is initiated shortly after transcription begins. Second, the lifetime of prokaryotic mRNA is usually relatively short. With half-lives of between 1 and 3 minutes, the types of mRNAs produced in a cell can be quickly altered as environmental conditions change.
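To see how quickly a short half-life clears an mRNA population once its transcription stops, consider the following Python sketch. It assumes simple first-order (exponential) decay, which is a modeling assumption and not something stated above; the function name `fraction_remaining` and the 6-minute time point are likewise chosen only for illustration.

```python
def fraction_remaining(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of an mRNA population left after elapsed_min minutes.

    Assumes first-order decay with no new transcription (an assumption
    made for this sketch): the population halves once every half-life.
    """
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    if elapsed_min < 0:
        raise ValueError("elapsed time cannot be negative")
    return 0.5 ** (elapsed_min / half_life_min)


if __name__ == "__main__":
    # The two ends of the 1 to 3 minute half-life range quoted in the text.
    for half_life in (1.0, 3.0):
        left = fraction_remaining(6.0, half_life)
        print(f"half-life {half_life} min: {left:.4f} remaining after 6 min")
```

Under this assumption, six minutes after transcription of a gene is shut off only about 2 percent of its mRNA remains when the half-life is 1 minute, and 25 percent when it is 3 minutes, so a change in transcription is soon reflected in the set of proteins being synthesized.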

Despite the preeminence of transcriptional control mechanisms, there are variations in the rates of prokaryotic mRNA translation.

An interesting example of negative translational regulation in prokaryotes is provided by ribosomal protein synthesis. There are approximately 55 proteins in prokaryotic ribosomes. These molecules are coded for by genes occurring in 20 operons. Efficient bacterial growth requires that their synthesis be coordinately regulated among the operons as well as with rRNA synthesis. For example, in the PL11 operon, which contains the genes for the ribosomal proteins L1 and L11, excessive amounts of L1 (i.e., more L1 molecules than can bind available 23S rRNA) trigger an inhibition of PL11 mRNA translation. Apparently, L1 can bind to either 23S rRNA or PL11 mRNA. In the absence of 23S rRNA, L1 inhibits the translation of its own operon by binding to the 5' end of PL11 mRNA.

Deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) are chainlike macromolecules that function in the storage and transfer of genetic information. They are major components of all cells, together making up from 5 to 15 percent of their dry weight. Nucleic acids are also present in viruses, infectious nucleic acid-protein complexes capable of directing their own replication in specific host cells. Although nucleic acids are so named because DNA was first isolated from cell nuclei, both DNA and RNA also occur in other parts of cells.

Just as the amino acids are the building blocks, or monomeric units, of polypeptides, the nucleotides are the monomeric units of nucleic acids. Just as one type of protein molecule is distinguished from another by the sequence of the characteristic side chains or R groups of the amino acid monomers, each type of nucleic acid is distinguished by the