INVESTIGATION OF DNA REPLICATION AND RNA TRANSCRIPTION.
PROTEIN BIOSYNTHESIS IN RIBOSOMES.
The first isolation of what we now refer to as DNA was accomplished by Johann Friedrich Miescher circa 1870. He reported finding a weakly acidic substance of
unknown function in the nuclei of human white blood cells, and named this
material "nuclein". A few years later, Miescher separated nuclein
into protein and nucleic acid components. In the 1920's nucleic acids were
found to be major components of chromosomes, small gene-carrying bodies in the
nuclei of complex cells. Elemental analysis of nucleic acids showed the presence
of phosphorus, in addition to the usual C, H, N & O. Unlike proteins,
nucleic acids contained no sulfur. Complete hydrolysis of chromosomal nucleic
acids gave inorganic phosphate, 2-deoxyribose (a previously unknown sugar) and
four different heterocyclic bases (shown in the following diagram). To reflect
the unusual sugar component, chromosomal nucleic acids are called
deoxyribonucleic acids, abbreviated DNA. Analogous nucleic acids in which the
sugar component is ribose are
termed ribonucleic acids, abbreviated RNA. The acidic character of the nucleic
acids was attributed to the phosphoric acid moiety.
The two monocyclic bases shown here are
classified as pyrimidines,
and the two bicyclic bases are purines.
Each has at least one N-H site at which an organic substituent may be attached.
They are all polyfunctional bases, and may exist in tautomeric forms.
Base-catalyzed hydrolysis of DNA gave four nucleoside products, which proved to be
N-glycosides of 2'-deoxyribose combined with the heterocyclic amines.
The four nucleosides are 2'-deoxyadenosine, 2'-deoxycytidine, 2'-deoxyguanosine
and thymidine. As the name 2'-deoxycytidine illustrates, the
numbering of the sugar carbons makes use of primed numbers to distinguish them
from the heterocyclic base sites. The corresponding N-glycosides of the common
sugar ribose are the building blocks of RNA, and are named adenosine, cytidine,
guanosine and uridine (a thymidine analog missing the methyl group).
From this evidence, nucleic acids may be formulated as alternating copolymers
of phosphoric acid (P) and nucleosides (N), as shown:
~ P – N – P – N' – P – N'' – P – N''' – P – N ~
At first the four nucleosides, distinguished by prime marks in this crude
formula, were assumed to be present in equal amounts, resulting in a uniform
structure, such as that of starch. However, a compound of this kind, presumably
common to all organisms, was considered too simple to hold the hereditary
information known to reside in the chromosomes. This view was challenged in
1944, when Oswald Avery and colleagues demonstrated that bacterial DNA was
likely the genetic agent that carried information from one organism to another
in a process called "transformation". Avery concluded that "nucleic acids must be regarded as
possessing biological specificity, the chemical basis of which is as yet
undetermined." Despite
this finding, many scientists continued to believe that chromosomal proteins,
which differ across species, between individuals, and even within a given
organism, were the locus of an organism's genetic information. It should be
noted that single celled organisms like bacteria do not have a well-defined
nucleus. Instead, their single chromosome is associated with specific proteins in
a region called a "nucleoid". Nevertheless, the DNA from bacteria has
the same composition and general structure as that from multicellular
organisms, including human beings.
Views about the role of DNA in inheritance changed in the late 1940's and
early 1950's. By conducting a careful analysis of DNA from many sources, Erwin Chargaff found
its composition to be species specific. In addition, he found that the amount
of adenine (A) always equaled the amount of thymine (T), and the amount of
guanine (G) always equaled the amount of cytosine (C), regardless of the DNA
source.
As set forth in the following table, the ratio of (A+T) to (C+G) varied
from 2.70 to 0.35. The last two organisms are bacteria.
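Chargaff's equivalences can be checked with a short calculation. The following is a minimal sketch, assuming the DNA is a fully base-paired duplex so that the second strand can be derived from the first; the function names and the ten-base sequence are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Watson-Crick partners, used to build the second strand of a duplex.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def duplex_composition(strand: str) -> Counter:
    """Count bases over both strands of a duplex, given one strand."""
    strand = strand.upper()
    partner = "".join(COMPLEMENT[base] for base in strand)
    return Counter(strand + partner)


def at_gc_ratio(counts: Counter) -> float:
    """Return the (A+T)/(G+C) ratio for a table of base counts."""
    return (counts["A"] + counts["T"]) / (counts["G"] + counts["C"])


# Hypothetical sequence, not taken from any organism.
counts = duplex_composition("ATTAGCATTA")
print(counts["A"] == counts["T"], counts["G"] == counts["C"])
print(at_gc_ratio(counts))
```

In this sketch the A = T and G = C equalities hold for any input sequence, while the (A+T)/(G+C) ratio depends on the sequence chosen, which mirrors the species-specific variation Chargaff reported.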
Building Blocks of Nucleic Acids
Nucleic acids are biopolymers composed of monomer units called nucleotides, which
are therefore the building blocks of all nucleic acids. Each nucleotide has three
components, bonded together in a specific manner to form the complete unit.
These components are as follows.
1. Nitrogen-containing "base"
Two types of nitrogenous base are present in nucleotides: pyrimidines (one ring) and purines (two rings),
which differ markedly in structure.
(a) Purine
Two purine bases are commonly found in nucleic acids, adenine (A) and guanine (G).
Both purine bases are bonded to the sugar through N9 of the base and C1' of the sugar.
(b) Pyrimidine
Three pyrimidine bases are commonly found: thymine (T) and cytosine (C) in DNA, and
cytosine (C) and uracil (U) in RNA. Pyrimidine bases are bonded to the sugar
through N1 of the base and C1' of the sugar.
2. Five-carbon sugar
Two pentose sugars are present in nucleic acids: ribose and deoxyribose. DNA contains β-D-2-deoxyribose, while
RNA contains β-D-ribose. The two sugars differ only in the
presence of one oxygen atom (a hydroxyl group) at the C2' position of ribose.
3. Phosphate group
A phosphoric acid residue is attached in ester linkage to C5' of the sugar.
· In any nucleotide, the combination of
a base and a sugar is known as a nucleoside.
· Bonding between a nucleoside and
a phosphoric acid molecule forms a nucleotide.
· In nucleosides, the 1-position of a
pyrimidine or the 9-position of a purine is bonded to C1' of the sugar molecule through a
β-linkage, also known as an N-glycosidic linkage.
General Structure of the Nucleotides
The monomeric
units of DNA are called deoxyribonucleotides; those of RNA are ribonucleotides.
Each nucleotide contains three characteristic components: (1) a nitrogenous
heterocyclic base, which is a derivative of either pyrimidine or purine; (2) a
pentose; and (3) a molecule of phosphoric acid.
Four different
deoxyribonucleotides serve as the major components of DNAs; they differ from
each other only in their nitrogenous base components, after which they are
named.
http://www.youtube.com/watch?v=PHOjrY3zYdM
The four bases characteristic of the
deoxyribonucleotide units of DNA are the purine derivatives adenine and guanine and the pyrimidine
derivatives cytosine and thymine.
Similarly, four different ribonucleotides are the major components of RNAs;
they contain the purine bases adenine and guanine and the pyrimidine
bases cytosine and uracil. Thus thymine is characteristically present in DNA but not usually in
RNA, whereas uracil is normally present in RNA but only rarely in DNA.
http://www.youtube.com/watch?v=hW9EmUN-wsc&feature=related
The other difference in composition
between these two kinds of nucleic acids is that deoxyribonucleotides contain
2-deoxy-D-ribose as their pentose component, whereas ribonucleotides
contain D-ribose. The pentose is joined to the base
by a glycosyl bond between carbon atom 1' of the pentose and nitrogen atom 9 of
purine bases or nitrogen atom 1 of pyrimidine bases. The phosphate group of
nucleotides is in ester linkage with carbon atom 5' of the pentose.
When the phosphate group of a
nucleotide is removed by hydrolysis, the structure remaining is called a
nucleoside.
The five bases give rise to eight common nucleosides, four in DNA and four in RNA.
Abbreviation | Base | Nucleoside | Nucleic Acid
A | Adenine | Deoxyadenosine | DNA
A | Adenine | Adenosine | RNA
G | Guanine | Deoxyguanosine | DNA
G | Guanine | Guanosine | RNA
C | Cytosine | Deoxycytidine | DNA
C | Cytosine | Cytidine | RNA
T | Thymine | Deoxythymidine (thymidine) | DNA
U | Uracil | Uridine | RNA
The Chemical Nature of DNA
The polymeric structure of DNA may be
described in terms of monomeric units of increasing complexity. In the top
shaded box of the following illustration, the three relatively simple
components mentioned earlier are shown. Below that on the left, formulas for
phosphoric acid and a nucleoside are drawn. Condensation polymerization of these leads to the DNA formulation outlined
above. Finally, a 5'-monophosphate ester, called a nucleotide, may be drawn as a single monomer unit,
shown in the shaded box to the right. Since a monophosphate ester of this kind
is a strong acid (pKa of 1.0), it will be fully ionized at the usual
physiological pH (ca. 7.4). Names for these DNA components are given in the table.
Isomeric 3'-monophosphate nucleotides
are also known, and both isomers are found in cells. They may be obtained by selective
hydrolysis of DNA through the action of nuclease enzymes. Anhydride-like di-
and tri-phosphate nucleotides have been identified as important energy carriers
in biochemical reactions, the most common being ATP (adenosine
5'-triphosphate).
Names of DNA Base Derivatives
Base | Nucleoside | 5'-Nucleotide
Adenine | 2'-Deoxyadenosine | 2'-Deoxyadenosine-5'-monophosphate
Cytosine | 2'-Deoxycytidine | 2'-Deoxycytidine-5'-monophosphate
Guanine | 2'-Deoxyguanosine | 2'-Deoxyguanosine-5'-monophosphate
Thymine | 2'-Deoxythymidine | 2'-Deoxythymidine-5'-monophosphate
Several features of the DNA formulation above are worth noting.
First, the remaining P-OH function is quite
acidic and is completely ionized in biological systems.
Second, the polymer chain is structurally
directed. One end (5') is different from the other (3').
Third, although this appears to be a
relatively simple polymer, the possible permutations of the four nucleosides in
the chain become very large as the chain lengthens.
Fourth, the DNA polymer is much larger than
originally believed. Molecular weights for the DNA from multicellular organisms
are commonly 10^9 or greater.
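The growth in the number of possible sequences can be made concrete with a small calculation. This is a minimal sketch, assuming a linear chain in which each position is filled independently by one of four nucleosides; the function name is hypothetical.

```python
def sequence_count(chain_length: int) -> int:
    """Number of distinct linear sequences of the given length
    that can be written with four different nucleosides."""
    return 4 ** chain_length


# Illustrative chain lengths only.
for n in (1, 5, 10, 100):
    print(n, sequence_count(n))
```

Even a chain of 100 units allows a number of sequences with 61 decimal digits, which is why a chemically simple four-letter polymer can carry a large amount of information.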
Information is stored or encoded in the DNA
polymer by the pattern in which the four nucleotides are arranged. To access
this information the pattern must be "read" in a linear fashion, just
as a bar code is read at a supermarket checkout. Because living organisms are
extremely complex, a correspondingly large amount of information related to
this complexity must be stored in the DNA. Consequently, the DNA itself must be
very large, as noted above. Even the single DNA molecule from an E. coli
bacterium is found to have several million nucleotide units in each polymer
strand, and would reach more than a millimeter in length if stretched out. The nuclei of
multicellular organisms incorporate chromosomes, which are composed of DNA
combined with nuclear proteins called histones. The fruit fly has 8
chromosomes, humans have 46 and dogs 78 (note that the amount of DNA in a
cell's nucleus does not correlate with the number of chromosomes). The DNA from
the smallest human chromosome is over ten times larger than E. coli DNA, and it
has been estimated that the total DNA in a human cell would extend to 2 meters in length if unraveled. Since the nucleus is only about
5μm in diameter, the chromosomal DNA must be packed tightly to fit in that
small volume.
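The lengths quoted above can be estimated from the number of base pairs. The following is a rough sketch, assuming B-form DNA with a rise of 0.34 nm (3.4 Å) per base pair; the E. coli chromosome size of 4.6 million base pairs is an assumed value used for illustration, and the function names are hypothetical.

```python
RISE_PER_BP_M = 0.34e-9  # assumed B-form rise per base pair, in meters


def contour_length_m(base_pairs: float) -> float:
    """Stretched-out length of a DNA duplex, in meters."""
    return base_pairs * RISE_PER_BP_M


def length_ratio(dna_length_m: float, container_diameter_m: float) -> float:
    """How many times longer the DNA is than the container is wide."""
    return dna_length_m / container_diameter_m


# Assumed E. coli chromosome size (not a value from the text).
print(contour_length_m(4.6e6) * 1e3, "mm")

# 2 m of human DNA in a nucleus about 5 micrometers across.
print(round(length_ratio(2.0, 5e-6)))
```

Under these assumptions the bacterial chromosome comes out at roughly 1.6 mm, and the human DNA is about 400,000 times longer than the nucleus is wide, which illustrates how tightly it must be packed.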
In addition to its role as a stable informational library, chromosomal DNA
must be structured or organized in such a way that the chemical machinery of
the cell will have easy access to that information, in order to make important
molecules such as polypeptides. Furthermore, accurate copies of the DNA code
must be created as cells divide, with the replicated DNA molecules passed on to
subsequent cell generations, as well as to progeny of the organism. The nature
of this DNA organization, or secondary structure, will be discussed in a later
section.
The high molecular weight nucleic acid, DNA,
is found chiefly in the nuclei of complex cells, known as eukaryotic cells, or
in the nucleoid regions of prokaryotic
cells, such as bacteria. It is often associated with proteins
that help to pack it in a usable fashion. In contrast, a lower molecular
weight, but much more abundant nucleic acid, RNA, is
distributed throughout the cell, most commonly in numerous small organelles called ribosomes.
Three kinds of RNA are identified, the largest subgroup (85 to 90%) being
ribosomal RNA (rRNA),
the major component of ribosomes, together with proteins. The size of rRNA
molecules varies, but is generally less than a thousandth the size of DNA. The
other forms of RNA are messenger RNA (mRNA)
and transfer RNA (tRNA).
Both have a more transient existence and are smaller than rRNA.
All these RNA's have similar constitutions,
and differ from DNA in two important respects. As shown in the following
diagram, the sugar component of RNA is ribose, and the pyrimidine base uracil
replaces the thymine base of DNA. The RNA's play a vital role in the transfer
of information (transcription) from the DNA library to the protein factories
called ribosomes, and in the interpretation of that information (translation)
for the synthesis of specific polypeptides. These functions will be described
later.
4. The Secondary Structure of DNA
In the early 1950's the primary structure of DNA
was well established, but a firm understanding of its secondary structure was
lacking. Indeed, the situation was similar to that occupied by the proteins a
decade earlier, before the alpha helix and pleated sheet structures were
proposed by Linus Pauling. Many researchers grappled
with this problem, and it was generally conceded that the molar equivalences of
base pairs (A & T and C & G) discovered by Chargaff would be an
important factor. Rosalind Franklin, working at King's
College, London, obtained X-ray diffraction evidence that suggested a long
helical structure of uniform thickness. Francis Crick and James Watson, at Cambridge
University, considered hydrogen bonded base pairing interactions, and arrived
at a double stranded helical model that satisfied most of the known facts, and
has been confirmed by subsequent findings.
Base Pairing
Careful examination of the purine and
pyrimidine base components of the nucleotides reveals that three of them could
exist as hydroxy pyrimidine or purine tautomers, having an aromatic
heterocyclic ring. Despite the added stabilization of an aromatic ring, these
compounds prefer to adopt amide-like structures. These options are shown in the
following diagram, with the more stable tautomer drawn in blue.
A simple model for this tautomerism is provided by 2-hydroxypyridine. A compound
having this structure might be expected to have
phenol-like characteristics, such as an acidic hydroxyl group. However, the boiling point of the actual substance is 100 ºC
greater than that of phenol, and its acidity is 100 times less than expected (pKa =
11.7). These differences agree with the 2-pyridone tautomer, the stable form of
the zwitterionic internal salt. Note that this tautomerism
reverses the hydrogen bonding behavior of the nitrogen and oxygen functions
(the N-H group of the pyridone becomes a hydrogen bond donor and the carbonyl
oxygen an acceptor).
Additional evidence for the pyridone tautomer consists of infrared and carbon NMR
absorptions associated with and characteristic of the
amide group. Similar data are found for the
N-methyl derivative, which cannot tautomerize to a pyridine derivative.
Once they had identified the favored base
tautomers in the nucleosides, Watson and Crick were able to propose a
complementary pairing, via hydrogen bonding, of guanosine (G) with cytidine (C)
and adenosine (A) with thymidine (T). This pairing, which is shown in the
following diagram, explained Chargaff's findings beautifully, and led them to
suggest a double helix structure for DNA. Before viewing this double helix
structure itself, it is instructive to examine the base pairing interactions in
greater detail. The G-C association involves three hydrogen bonds (colored
pink), and is therefore stronger than the two-hydrogen bond association of A-T.
These base pairings might appear to be arbitrary, but other possibilities
suffer destabilizing steric or electronic interactions.
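The difference in strength between the two pairings can be tallied for a short duplex. This is a minimal sketch, assuming only the standard Watson-Crick pairs (two hydrogen bonds for an A-T pair, three for a G-C pair) and two strands written so that position i of one faces position i of the other; the function name is hypothetical.

```python
# Hydrogen bonds per Watson-Crick base pair.
H_BONDS = {("A", "T"): 2, ("T", "A"): 2, ("G", "C"): 3, ("C", "G"): 3}


def duplex_hydrogen_bonds(strand: str, partner: str) -> int:
    """Total hydrogen bonds between two aligned, fully paired strands."""
    if len(strand) != len(partner):
        raise ValueError("strands must have the same length")
    total = 0
    for a, b in zip(strand.upper(), partner.upper()):
        if (a, b) not in H_BONDS:
            raise ValueError(f"{a}-{b} is not a Watson-Crick pair")
        total += H_BONDS[(a, b)]
    return total


print(duplex_hydrogen_bonds("GGCC", "CCGG"))
print(duplex_hydrogen_bonds("AATT", "TTAA"))
```

For duplexes of equal length, the G-C rich example holds more hydrogen bonds than the A-T rich one. The count ignores base stacking and other contributions, so it is only a crude indicator of duplex stability.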
A simple mnemonic device for remembering which bases are paired comes from
the line construction of the capital letters used to identify the bases. A and
T are made up of intersecting straight lines. In contrast, C and G are largely
composed of curved lines. The RNA base uracil corresponds to thymine, since U
follows T in the alphabet.
The Double Helix Structure for DNA
After many trials and modifications, Watson and Crick conceived an
ingenious double helix model for the secondary structure of DNA. Two strands of
DNA were aligned anti-parallel to each other, i.e. with opposite 3' and 5' ends,
as shown in part a of the following diagram. Complementary
primary nucleotide structures for each strand allowed inter-strand hydrogen
bonding between each pair of bases. These complementary strands are colored red
and green in the diagram. Coiling these coupled strands then leads to a double
helix structure, shown as cross-linked ribbons in part b of
the diagram. The double helix is further stabilized by hydrophobic attractions
and pi-stacking of the bases. A space-filling molecular model of a short
segment is displayed in part c on the right.
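The anti-parallel, complementary relationship between the two strands can be expressed as a simple string operation. The sketch below assumes both strands are written in the conventional 5' to 3' direction, so the partner strand is the complement read in reverse; the function name and example sequences are hypothetical.

```python
# Translation table mapping each base to its pairing partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand_5_to_3: str) -> str:
    """Return the partner strand, also written 5' to 3'."""
    paired = strand_5_to_3.upper().translate(COMPLEMENT)
    # Reversing accounts for the opposite (anti-parallel) direction.
    return paired[::-1]


print(reverse_complement("ATGC"))
```

Applying the function twice returns the original sequence, which reflects the fact that each strand fully determines the other.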
Space-Filling Molecular Model
The helix shown here has ten base pairs per turn, and rises 3.4 Å per base pair (34 Å in
each turn). This right-handed helix is the favored conformation in aqueous
systems, and has been termed the B-helix. As
the DNA strands wind around each other, they leave gaps between each set of
phosphate backbones. Two alternating grooves result, a wide and deep major groove (ca. 22 Å wide), and a shallow and
narrow minor groove (ca. 12 Å wide). Other molecules,
including polypeptides, may insert into these grooves, and in so doing perturb
the chemistry of DNA. Other helical structures of DNA have also been observed, and
are designated by letters (e.g. A and Z).
Deoxyribonucleic acid (DNA) consists of covalently linked chains of
deoxyribonucleotides, and ribonucleic acid (RNA) consists of chains of
ribonucleotides. DNA and RNA share a number of chemical and physical properties
because in both of them the successive nucleotide units are covalently linked
in identical fashion by phosphodiester bridges formed between the 5'-hydroxyl
group of one nucleotide and the 3'-hydroxyl group of the next. Thus the backbone
of both DNA and RNA consists of alternating phosphate and pentose groups, in
which phosphodiester bridges provide the covalent continuity. The purine and
pyrimidine bases of the nucleotide units are not present in the backbone
structure but constitute distinctive side chains, just as the R groups of amino
acid residues are the distinctive side chains of polypeptides.
DNA molecules from different cells and
viruses vary in the ratio of the four major types of nucleotide monomers, in
their nucleotide sequence, and in their molecular weight. Besides the four
major bases (adenine, guanine, thymine, and cytosine) found in all DNAs, small
amounts of methylated derivatives of these bases are present in some DNA
molecules, particularly those from viruses.
http://www.youtube.com/watch?v=qy8dk5iS1f0
The DNAs isolated from different organisms
and viruses normally have two strands in complementary double-helical
arrangement. In most cells the DNA molecules are so large that they are not
easily isolated in intact form. In diploid eukaryotic cells nearly all the DNA
molecules are present in the cell nucleus, where they are combined in ionic
linkage with basic proteins called histones. In addition to the nuclear DNA,
diploid eukaryotic cells also contain very small amounts of DNA in the
mitochondria; it differs in its base composition and molecular weight from
nuclear DNA.
RNA
The three major types of ribonucleic acid in cells
are called messenger RNA (mRNA), ribosomal RNA (rRNA), and transfer RNA (tRNA).
Although all three types occur as single polyribonucleotide strands, each type
has a characteristic range of molecular weight and sedimentation coefficient.
Moreover, each of the three major kinds of RNA occurs in multiple molecular
forms. Ribosomal RNA of any given biological species exists in three or more
major forms, transfer RNA in as many as 60 forms, and messenger RNA in hundreds
and perhaps thousands of distinctive forms. Most cells contain 2 to 8 times as
much RNA as DNA.
Messenger RNA
Messenger RNA contains only the four major
bases. It is synthesized in the nucleus during the process of transcription,
in which the sequence of bases in one strand of the chromosomal DNA is
enzymatically transcribed in the form of a single strand of mRNA; some mRNA is
also made in the mitochondria. The sequence of bases of the mRNA strand so
formed is complementary to that of the DNA strand being transcribed. Each of
the thousands of different proteins synthesized by the cell is coded by a
specific mRNA or segment of an mRNA molecule.
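The complementary relationship between the transcribed DNA strand and its mRNA can be sketched as a base-by-base mapping. This is an illustration only, assuming the DNA template is written 3' to 5' so that the mRNA reads 5' to 3' in the same left-to-right order, with uracil taking the place of thymine; the function name and sequence are hypothetical.

```python
# Each template DNA base and the RNA base that pairs with it.
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5' to 3') complementary to a DNA template
    strand that is written 3' to 5'."""
    return template_3_to_5.upper().translate(DNA_TO_RNA)


print(transcribe("TACGGT"))
```

The sketch covers only the base-pairing logic; it says nothing about the enzymes, start and stop signals, or processing steps involved in real transcription.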
Transfer RNAs
Transfer RNAs are relatively small molecules
that act as carriers of specific individual amino acids during protein synthesis
on the ribosomes. Each of the 20 amino acids found in proteins has at least one
corresponding tRNA, and some have multiple tRNAs.
Ribosomal RNA
Ribosomal RNA (rRNA) constitutes up to 65
percent of the mass of ribosomes. rRNAs make up a large fraction of
total cellular RNA; they form the structural core of the ribosome and catalyze
peptide bond formation during protein synthesis. A few of the
bases in rRNAs are methylated.
1. DNA Replication
In their 1953 announcement of a double helix
structure for DNA, Watson and Crick stated, "It
has not escaped our notice that the specific pairing we have postulated
immediately suggests a possible copying mechanism for the genetic
material." The essence of this suggestion is that, if
separated, each strand of the molecule might act as a template on which a new
complementary strand might be assembled, leading finally to two identical DNA
molecules. Indeed, replication does take place in this fashion when cells
divide, but the events leading up to the actual synthesis of complementary DNA
strands are sufficiently complex that they will not be described in any detail.
As depicted in the following drawing, the DNA
of a cell is tightly packed into chromosomes. First, the DNA is wrapped around
small proteins called histones (colored pink below). These bead-like structures
are then further organized and folded into chromatin aggregates that make up
the chromosomes. An overall packing efficiency of 7,000 or more is thus
achieved. Clearly a sequence of unfolding events must take place before the
information encoded in the DNA can be used or replicated.
Because of the
strategic importance of DNA, all living organisms must possess the following features: (1) rapid and
accurate DNA synthesis and (2) genetic stability provided by effective DNA repair mechanisms.
http://www.youtube.com/watch?v=teV62zrm2P0
Paradoxically, the
long-term survival of species also depends on genetic variations that allow them
to adapt to changing environmental conditions. In most species these variations
are caused predominantly by genetic recombination, although mutation also plays a role. It should be noted that prokaryotic
nucleic acid metabolism is more completely understood than that of eukaryotes. Because
of their minimal growth requirements, short generation times, and relatively
simple genetic makeup, prokaryotes (especially Escherichia coli) have
proven to be excellent research tools for investigations of genetic mechanisms.
In contrast, multicellular eukaryotes possess several properties that hinder
genetic investigations.
The most
formidable of these are long generation times (often months or years) and
extraordinary difficulties in identifying gene products (e.g., enzymes or
structural components). (A common tactic in genetic research is the induction
of mutations, followed by observing changes in or the absence of a specific
gene product.) Unfortunately, because of their considerable complexity, very
few gene products have been identified in higher organisms by using this
method. Various recombinant DNA techniques are being used to circumvent this
obstacle.
DNA replication
must occur before every cell division. The basic mechanism by which DNA copies
are produced is similar in all living organisms. After the two strands
separate, each subsequently serves as a template for the synthesis of a
complementary strand. (In other words, each of the two new DNA molecules
contains one old strand and one new strand.) This process, referred to as semiconservative replication, was first demonstrated
in an elegant experiment reported in 1958 by Matthew Meselson and Franklin
Stahl. In this classic work, Meselson and Stahl took advantage of the different
density properties of DNA labeled with the heavy nitrogen isotope 15N
(normal nitrogen is 14N). After E. coli cells were grown for 14 generations in growth media
whose nitrogen source consisted only of 15NH4Cl, the 15N-containing cells were transferred to growth
media containing the normal 14N-isotope. At the end of both one and two cell divisions,
samples were removed. The DNA in each of these samples was isolated and analyzed by CsCl
density gradient centrifugation. Because pure 15N-DNA and 14N-DNA produce characteristic bands in
centrifuged CsCl tubes, this analytical method discriminates between DNA
molecules containing large amounts of the two nitrogen isotopes. When the DNA
isolated from 15N-containing cells grown in 14N-medium for precisely one generation was centrifuged, only one band was
observed. Because this band occurred halfway between where 15N-DNA and 14N-DNA bands would normally appear,
it seemed reasonable to assume that the new DNA was a hybrid molecule, that is, containing one 15N-strand and one 14N-strand. (Conservative replication would instead have produced two
bands, one heavy and one light.) After two cell divisions, extracted DNA was resolved into two discrete
bands containing equal amounts of 14N,14N-DNA and hybrid 14N,15N-DNA, a result that was
also consistent with the semiconservative model of DNA synthesis.
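The band pattern expected from semiconservative replication can be reproduced with a small bookkeeping model. This is a minimal sketch, assuming each duplex is represented only by the isotope labels of its two strands and that every duplex replicates exactly once per generation in 14N medium; the function names are hypothetical.

```python
from collections import Counter


def replicate(duplexes, new_label="14N"):
    """Semiconservative replication: each old strand is kept and
    paired with one newly made strand carrying new_label."""
    daughters = []
    for old_a, old_b in duplexes:
        daughters.append((old_a, new_label))
        daughters.append((old_b, new_label))
    return daughters


def band_fractions(duplexes):
    """Fraction of duplexes in each density class (band)."""
    counts = Counter("/".join(sorted(d)) for d in duplexes)
    total = len(duplexes)
    return {band: n / total for band, n in counts.items()}


# Start from fully heavy DNA, then grow in light medium.
population = [("15N", "15N")]
for generation in (1, 2):
    population = replicate(population)
    print(generation, band_fractions(population))
```

In this model the first generation contains only hybrid 14N/15N duplexes, and the second contains equal amounts of hybrid and fully light duplexes, matching the single intermediate band followed by two bands described above.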
In the years since
the Meselson and Stahl experiment, many of the details of DNA replication have
been discovered.
DNA Synthesis in Prokaryotes. DNA replication in E. coli has proven to be a complex process that
involves a wide variety of proteins, referred to collectively as the replisome.
http://www.youtube.com/watch?v=-mtLXpgjHL0&feature=related
The
prokaryotic replication process
contains several basic steps, each of which requires certain
enzyme activities:
1. DNA uncoiling. As
their name implies, the helicases are
ATP-requiring enzymes that catalyze the unwinding of duplex DNA.
2. Primer synthesis. The
formation of short RNA segments called primers, which are required for
the initiation of DNA replication, is catalyzed by primase.
3. DNA synthesis. The
synthesis of a complementary DNA strand by creating phosphodiester linkages between
nucleotides base paired to a template strand is catalyzed by large multienzyme
complexes referred to as the DNA polymerases. DNA
polymerase III (pol III), the major DNA-synthesizing enzyme,
is composed of at least ten different subunits. DNA
polymerase I (pol I) is a DNA repair enzyme. (Pol I is also believed to play a role in the timely
removal of RNA primer.) The function of DNA
polymerase II (pol II) is not
understood. In addition to a 5'→3' polymerizing activity, all three enzymes possess a
3'→5' exonuclease activity. (An exonuclease is an
enzyme that removes nucleotides from an end of a polynucleotide strand.) Pol I
also possesses a 5'→3' exonuclease activity.
4. Joining of
DNA fragments. Discontinuous DNA synthesis (described below)
requires an enzyme, referred to as a ligase, that joins the newly synthesized segments.
5. Supercoiling
control. The tangling of DNA strands, which can prevent
further unwinding of the double helix, is prevented by the DNA topoisomerases. Tangling
is a very real possibility, since the double helix unwinds rapidly (as many as
50 revolutions per second during bacterial DNA replication). Topoisomerases are
enzymes that alter the linking number of closed duplex DNA molecules. The
terms "topoisomerase" and "topoisomers" (circular DNA
molecules that differ only in their linking numbers) are derived from
"topology," a form of mathematics that investigates the properties
of geometric structures that do not change with
bending or stretching.
http://www.youtube.com/watch?v=qn-JW-M89fo&feature=related
When DNA replication was first
observed experimentally (with the aid of electron microscopy and
autoradiography), investigators were confronted with a paradox. The
bidirectional synthesis of DNA as it appeared in their research seemed to
indicate that continuous synthesis occurs in the 5'→3'
direction on one strand and in the 3'→5'
direction on the other strand. (Recall that the DNA double helix has an
antiparallel configuration.) However, all the enzymes that catalyze DNA synthesis
do so in the 5'→3' direction only. It was later determined that only
one strand, referred to as the leading strand, is continuously synthesized in the 5'→3'
direction. The other strand, referred to as the lagging strand, is also synthesized in the 5'→3' direction but in a series of small pieces.
(Reiji Okazaki and his colleagues provided the experimental evidence for
discontinuous DNA synthesis.) Subsequently, these pieces (now called
Okazaki fragments) are covalently joined by DNA ligase.
The initiation of replication is a complex process
involving several enzymes, as well as other proteins. Initiation
begins when approximately 20 copies of DnaA protein (50 kD)
bind to four specific sites. During the binding of DnaA, which requires ATP and a histone-like
protein (HU), this portion of the bacterial chromosome forms into a
nucleosome-like structure. This coiling causes a
small segment of the double helix to open sufficiently that DnaB (a
300-kD helicase) complexed with DnaC (29 kD) can enter.
The replication
fork moves forward as the helicases, assisted by topoisomerases (especially DNA
gyrase), unwind the helix. The single strands are kept separate by the binding of
numerous copies of single-stranded DNA binding protein (SSB). (SSB may also
protect vulnerable ssDNA segments from attack by various nucleases.) Before pol
III can initiate DNA synthesis, however, an RNA primer must be present. On the
leading strand, where DNA synthesis is continuous, primer formation occurs
only once per replication fork. In contrast, the discontinuous synthesis on the
lagging strand requires primer synthesis for each of the short DNA pieces (Okazaki fragments).
It appears that the synthesis of the leading and
lagging strands is coupled. The tandem operation of two pol III complexes requires that one strand (the lagging strand) be looped around the replisome.
(The term "replisome" is used to describe the set of enzymes
and other molecules required for DNA synthesis at a single replication fork.)
When the pol III complex that copies the lagging strand completes an
Despite the complexity of DNA
replication, as well as its rate (as high as 1000 base pairs per second per
replication fork), this process is amazingly accurate (approximately one error per 10^9 to
10^10 base pairs per generation).
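The quoted accuracy can be translated into an expected number of errors per round of replication. This is a rough sketch, assuming errors occur independently at a constant rate per base pair; the E. coli chromosome size of 4.6 million base pairs is an assumed value used for illustration, and the function name is hypothetical.

```python
def expected_errors(genome_bp: float, errors_per_bp: float) -> float:
    """Expected number of copying errors in one round of replication."""
    return genome_bp * errors_per_bp


GENOME_BP = 4.6e6  # assumed E. coli chromosome size, not from the text

# The two ends of the error-rate range quoted above.
for rate in (1e-9, 1e-10):
    print(rate, expected_errors(GENOME_BP, rate))
```

Under these assumptions fewer than one error in a hundred chromosome copies would be expected, even at the less accurate end of the range.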
Replication ends when the replication
forks collide on the other side of the circular chromosome. The
subsequent separation of the two daughter DNA molecules is not understood,
although a type II topoisomerase is believed to be involved.
Unique features of eukaryotic DNA synthesis
1. Timing of replication. In contrast
to rapidly growing bacterial cells, in which replication occurs throughout most
of the cell division cycle, eukaryotic replication is limited to a specific
time period referred to as the S phase. It is now known that eukaryotic cells produce
certain proteins that regulate phase transitions within the cell cycle.
2. Replication rate. DNA replication is significantly slower in eukaryotes than
in prokaryotes. The eukaryotic rate is approximately 50 nucleotides
per second per replication fork. (Recall that the rate in prokaryotes is about twenty times higher.) This discrepancy is presumably due, in part, to the
complex structure of chromatin.
3. Replicons. Despite the relative slowness of DNA synthesis, the replication process
is relatively brief, considering the large sizes of eukaryotic genomes. For
example, on the basis of the replication rate mentioned above,
the replication of an average eukaryotic chromosome (approximately 150 million
base pairs) by a single replication fork should take over a month to complete. Instead, this process usually
requires several hours. Eukaryotes have compressed the replication of their
large genomes into a short time period with the use of multiple replicons.
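The arithmetic behind the "over a month" estimate can be checked with a short sketch. The chromosome size and fork rate are the figures quoted above; the 8-hour S phase is a hypothetical value chosen only for illustration, and the function names are not from the original text.

```python
# Figures quoted in the text
CHROMOSOME_BP = 150_000_000   # average eukaryotic chromosome, in base pairs
FORK_RATE = 50                # nucleotides per second per replication fork

# Hypothetical S-phase duration, assumed only for this illustration
S_PHASE_HOURS = 8


def single_fork_days(length_bp, rate):
    """Days needed if a single fork copied the whole chromosome."""
    return length_bp / rate / 86_400  # 86,400 seconds per day


def forks_needed(length_bp, rate, hours):
    """Smallest number of simultaneously active forks that finish in `hours`."""
    bp_per_fork = rate * hours * 3600
    return -(-length_bp // bp_per_fork)  # ceiling division on integers


print(round(single_fork_days(CHROMOSOME_BP, FORK_RATE), 1))    # about 34.7 days
print(forks_needed(CHROMOSOME_BP, FORK_RATE, S_PHASE_HOURS))   # 105 forks
```

Under these assumptions roughly a hundred forks, that is about fifty bidirectional replicons, would have to be active at once on one such chromosome.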
4.
Once the double stranded DNA is exposed, a
group of enzymes acts to accomplish its replication. These are described briefly
here:
Topoisomerase: This enzyme initiates unwinding of the double helix by cutting one of the
strands.
Helicase: This enzyme assists
the unwinding. Note that many hydrogen bonds must be broken if the strands are
to be separated.
SSB: A single-strand
binding protein stabilizes the separated strands, and prevents them from
recombining, so that the polymerization chemistry can function on the
individual strands.
DNA Polymerase: This family of enzymes links together nucleotide
triphosphate monomers as they hydrogen bond to complementary bases. These
enzymes also check for errors (roughly ten per billion), and make corrections.
Ligase: Small unattached DNA segments on a strand are united by this
enzyme.
Polymerization of nucleotides takes place by
the phosphorylation reaction described by the following equation.
Di- and triphosphate esters have
anhydride-like structures and are consequently reactive phosphorylating
reagents, just as carboxylic anhydrides are acylating
reagents. Since the pyrophosphate anion is a better leaving group
than phosphate, triphosphates are more powerful phosphorylating agents than are
diphosphates. The 5'-derivatives of adenosine (AMP, ADP and ATP) are familiar
examples, and similar derivatives exist for the other
three common nucleosides. The DNA polymerization process that builds the
complementary strands in replication could in principle take place in two
ways. Referring to the general equation above, R1 could represent the next
nucleotide unit to be attached to the growing DNA strand, with R2 being this
strand. Alternatively, these assignments could be reversed. In practice, the
former proves to be the best arrangement. Since triphosphates are very
reactive, the lifetime of such derivatives in an aqueous environment is relatively
short. However, such derivatives of the individual nucleosides are repeatedly
synthesized by the cell for a variety of purposes, providing a steady supply of
these reagents. In contrast, the growing DNA segment must maintain its
functionality over the entire replication process, and cannot afford to be
changed by a spontaneous hydrolysis event. As a result, these chemical
properties are best accommodated by a polymerization process that proceeds at
the 3'-end of the growing strand by 5'-phosphorylation involving a nucleotide
triphosphate.
The polymerization mechanism described here
is constant. It always extends the developing DNA segment toward the 3'-end
(i.e. when a nucleotide triphosphate attaches to the free 3'-hydroxyl group of
the strand, a new 3'-hydroxyl is generated). There is sometimes confusion on
this point, because the original DNA strand that serves as a template is read
from the 3'-end toward the 5'-end, and authors may not be completely clear as
to which terminology is used.
Because of the directional demand of the
polymerization, one of the DNA strands is easily replicated in a continuous
fashion, whereas the other strand can only be replicated in short segmental
pieces. This is illustrated in the following diagram. Separation of a portion
of the double helix takes place at a site called the replication fork. As
replication of the separate strands occurs, the replication fork moves away (to
the left in the diagram), unwinding additional lengths of DNA. Since the fork
in the diagram is moving toward the 5'-end of the red-colored strand,
replication of this strand may take place in a continuous fashion (building the
new green strand in a 5' to 3' direction). This continuously formed new strand
is called the leading strand. In contrast, the replication fork moves toward the
3'-end of the original green strand, preventing continuous polymerization of a
complementary new red strand. Short segments of complementary DNA, called
Okazaki fragments, are produced, and these are linked together later by the
enzyme ligase. This new DNA strand is called the lagging strand.
When you consider that a human cell has roughly 10^9 base pairs in its DNA,
and may divide into identical daughter cells in 14 to 24 hours, the efficiency
of DNA replication must be extraordinary. The procedure described above will
replicate about 50 nucleotides per second, so there must be many thousand such
replication sites in action during cell division. A given length of double
stranded DNA may undergo strand unwinding at numerous sites in response to
promoter actions. The unraveled "bubble" of single stranded DNA has
two replication forks, so assembly of new complementary strands may proceed in
two directions. The polymerizations associated with several such bubbles fuse
together to achieve full replication of the entire DNA double helix.
2. Repair of DNA Damage and Replication Errors
One of the benefits of the double stranded DNA structure is that it lends itself to
repair when structural damage or replication errors occur. Several kinds of
chemical change may cause damage to DNA:
· Spontaneous hydrolysis of a nucleoside
removes the heterocyclic base component.
· Spontaneous hydrolysis of cytosine
changes it to a uracil.
· Various toxic metabolites may oxidize or
methylate heterocyclic base components.
· Ultraviolet light may dimerize adjacent
cytosine or thymine bases.
All these transformations disrupt base pairing at the site of the change, and this
produces a structural deformation in the double helix. Inspection-repair enzymes
detect such deformations, and use the undamaged nucleotide on the complementary
strand at that site as a template for replacing the damaged unit. These repairs
reduce errors in DNA
structure from about one in ten million to one per trillion.
RNA and Protein Synthesis
The genetic information stored in DNA molecules is used as a blueprint for making
proteins. Why proteins? Because these macromolecules have diverse primary,
secondary and tertiary structures that equip them to carry out the numerous
functions necessary to maintain a living organism. As noted in the protein chapter, these functions include:
· Structural integrity (hair, horn, eye
lenses etc.).
· Molecular recognition and signaling
(antibodies and hormones).
· Catalysis of reactions (enzymes).
· Molecular transport (hemoglobin
transports oxygen).
· Movement (pumps and motors).
The critical importance of proteins in life
processes is demonstrated by numerous genetic diseases, in which small
modifications in primary structure produce debilitating and often disastrous
consequences. Such genetic diseases include Tay-Sachs, phenylketonuria (PKU),
sickle cell anemia, achondroplasia, and Parkinson's disease. The unavoidable
conclusion is that proteins are of central importance in living cells, and that
proteins must therefore be continuously prepared with high structural fidelity
by appropriate cellular chemistry.
Early geneticists identified genes as hereditary units that determined the appearance and/or
function of an organism (i.e. its phenotype). We now define genes as
sequences of DNA that occupy specific locations on a chromosome. The original
proposal that each gene controlled the formation of a single enzyme has since
been modified as: one gene = one polypeptide. The intriguing question of how the
information encoded in DNA is converted to the actual construction of a
specific polypeptide has been the subject of numerous studies, which have
created the modern field of Molecular Biology.
1. The Central Dogma and Transcription
Francis Crick proposed that information flows
from DNA to RNA in a process called transcription,
and is then used to synthesize polypeptides by a process called translation.
Transcription takes place in a manner similar to DNA replication. A
characteristic sequence of nucleotides, the promoter, marks the beginning of a gene on the DNA
strand, and this region binds RNA polymerase, the enzyme that carries out RNA
synthesis. The double stranded structure unwinds at the promoter site, and one
of the strands serves as a template for RNA formation, as depicted in the
following diagram. The RNA molecule thus formed is single stranded, and serves
to carry information from DNA to the protein synthesis machinery called
ribosomes. These RNA molecules are therefore called messenger-RNA
(mRNA).
To summarize: a gene is a stretch of DNA that contains a pattern for the amino
acid sequence of a protein. In order to actually make this protein, the
relevant DNA segment is first copied into messenger-RNA. The cell then
synthesizes the protein, using the mRNA as a template.
An important distinction must be made here. One of the DNA strands in the
double helix holds the genetic information used for protein synthesis. This is
called the sense strand, or information strand (colored red above). The complementary
strand that binds to the sense strand is called the anti-sense strand (colored green), and it serves as a
template for generating a mRNA molecule that delivers a copy of the sense
strand information to a ribosome. RNA polymerase binds to a specific
nucleotide sequence (the promoter) that identifies the sense strand, relative to the
anti-sense strand. RNA synthesis then proceeds in the 5' to 3' direction, as
nucleotide triphosphates bind to complementary bases on the template strand,
and are joined by phosphate diester linkages. An animation of this process for
DNA replication was presented earlier. A characteristic
"stop sequence" of nucleotides terminates the RNA synthesis. The
messenger molecule (colored orange above) is released into the cytoplasm to
find a ribosome, and the DNA then rewinds to its double helix structure.
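The relationship between the two DNA strands and the mRNA can be sketched in a few lines. This is only an illustration under simplifying assumptions: the twelve-base gene is hypothetical, promoters and stop sequences are ignored, and the function names are invented for the example.

```python
# Base pairing used when RNA is built on a DNA template
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Return the mRNA (written 5'->3') made on an anti-sense template
    that is supplied in the 3'->5' direction, i.e. in the order it is read."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def sense_as_rna(sense_5_to_3):
    """The mRNA carries the sense-strand message with U in place of T."""
    return sense_5_to_3.replace("T", "U")


sense = "ATGGCATTCTAA"      # hypothetical sense strand, 5'->3'
template = "TACCGTAAGATT"   # its complement (anti-sense strand), 3'->5'

mrna = transcribe(template)
print(mrna)                          # AUGGCAUUCUAA
print(mrna == sense_as_rna(sense))   # True
```

The second check makes the point of the paragraph above: the template is the anti-sense strand, but the message that reaches the ribosome is a copy of the sense strand.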
In eukaryotic cells the initially transcribed
mRNA molecule is usually modified and shortened by an "editing"
process that removes irrelevant material. The DNA of such organisms is often
thousands of times larger and more complex than that composing the single
chromosome of a prokaryotic bacterial cell. This difference is due in part to
repetitive nucleotide sequences (ca. 25% in the human genome). Furthermore,
over 95% of human DNA is found in intervening sequences that separate genes and
parts of genes. The informational DNA segments that make up genes are called exons, and the noncoding
segments are called introns.
Before the mRNA molecule leaves the nucleus, the noncoding bases that make up
the introns are cut out, and the informationally useful exons are joined
together in a step known as RNA splicing. In this fashion shorter mRNA molecules carrying the
blueprint for a specific protein are sent on their way to the ribosome
factories.
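Splicing can be pictured as cutting a string at known boundaries and joining the pieces that remain. The sketch below assumes the exon positions are already known; the sequence and coordinates are hypothetical, and nothing here models how the cell actually locates intron boundaries.

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a primary transcript.

    `exons` is a list of (start, end) index pairs, end exclusive,
    given in 5'->3' order.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical transcript: exon 1 + intron + exon 2
exon1, intron, exon2 = "AUGGCC", "GUAAGUUUUCAG", "UUCUAA"
pre_mrna = exon1 + intron + exon2
exon_positions = [(0, 6), (18, 24)]

print(splice(pre_mrna, exon_positions))   # AUGGCCUUCUAA
```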
The Central Dogma of
molecular biology, which at first was formulated as a simple linear progression
of information from DNA to RNA to Protein, is summarized in the following
illustration. The replication process on the left consists of passing
information from a parent DNA molecule to daughter molecules. The middle transcription
process copies this information to a mRNA molecule. Finally, this information
is used by the chemical machinery of the ribosome to make polypeptides.
As more has been learned about these relationships, the central dogma has been
refined to the representation displayed on the right. The dark blue arrows show
the general, well demonstrated, information transfers noted above. It is now
known that an RNA-dependent DNA polymerase enzyme, known as a reverse
transcriptase, is able to transcribe a single-stranded RNA sequence into
double-stranded DNA (magenta arrow). Such enzymes are found in all cells and
are an essential component of retroviruses (e.g. HIV), which require RNA
replication of their genomes (green arrow). Direct translation of DNA information
into protein synthesis (orange arrow) has not yet been observed in a living
organism. Finally, proteins appear to be an informational dead end, and do not
provide a structural blueprint for either RNA or DNA.
In the following section the last fundamental relationship, that of structural
information translation from mRNA to protein, will be described.
2. Translation
Translation is a more complex process than transcription.
This would, of course, be expected. After all, the coded messages produced by
the German Enigma machine could be copied easily, but required a considerable
decoding effort before they could be read with understanding. In a similar
sense, DNA replication is simply a complementary base pairing exercise, but the
translation of the four letter (bases) alphabet code of RNA to the twenty
letter (amino acids) alphabet of protein literature is far from trivial.
Clearly, there could not be a direct one-to-one correlation of bases to amino
acids, so the nucleotide letters must form short words or codons that define specific amino acids. Many questions
pertaining to this genetic code were posed in the late 1950's:
• How many RNA nucleotide bases designate a specific amino acid?
If separate groups of nucleotides, called codons, serve this purpose, at least
three are needed. There are 4^3 = 64 different nucleotide triplets, compared
with 4^2 = 16 possible pairs.
• Are the codons linked separately or do they overlap?
Sequentially joined triplet codons will result in a nucleotide chain three
times longer than the protein it describes. If overlapping codons are used then
fewer total nucleotides would be required.
• If triplet segments of mRNA designate specific amino acids in the protein, how are the
codons identified?
For the sequence ~CUAGGU~ are the codons CUA & GGU or ~C, UAG & GU~ or
~CU, AGG & U~?
• Are all the codon words the same size?
In Morse code the most widely used letters are shorter than less common
letters. Perhaps nature employs a similar scheme.
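Two of the questions above lend themselves to a quick check: how many words a given codon length provides, and how the choice of starting point changes the way ~CUAGGU~ is divided. The helper names in this sketch are illustrative only.

```python
def codon_count(word_length, alphabet_size=4):
    """Number of distinct words of a given length over the base alphabet."""
    return alphabet_size ** word_length


def reading_frames(sequence, size=3):
    """Complete codons obtained by starting at offset 0, 1 and 2."""
    frames = []
    for offset in range(size):
        codons = [sequence[i:i + size]
                  for i in range(offset, len(sequence) - size + 1, size)]
        frames.append(codons)
    return frames


print(codon_count(2), codon_count(3))   # 16 64
for frame in reading_frames("CUAGGU"):
    print(frame)
# ['CUA', 'GGU']
# ['UAG']
# ['AGG']
```

Only triplets reach the twenty words needed, and the three frames reproduce the three possible readings listed in the text.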
Physicists and mathematicians, as well as
chemists and microbiologists all contributed to unravelling the genetic code.
Although earlier proposals assumed efficient relationships that correlated the
nucleotide codons uniquely with the twenty fundamental amino acids, it is now
apparent that there is considerable redundancy in the code as it now operates.
Furthermore, the code consists exclusively of non-overlapping triplet codons.
Clever experiments provided some of the earliest breaks in deciphering the
genetic code. Marshall Nirenberg found that RNA from many different organisms
could initiate specific protein synthesis when combined with broken E. coli
cells (the enzymes remain active). A synthetic polyuridine RNA induced
synthesis of poly-phenylalanine, so the UUU codon designated phenylalanine.
Likewise an alternating ~CACA~ RNA led to synthesis of a ~His-Thr-His-Thr~
polypeptide.
The following table presents the present day
interpretation of the genetic code. Note that this is the RNA alphabet, and an
equivalent DNA codon table would have all the U nucleotides replaced by T. Methionine and tryptophan
are uniquely represented by a single codon. At the other extreme, leucine is
represented by six codons. The average redundancy for the twenty amino acids
is about three. Also, there are three stop codons that
terminate polypeptide synthesis.
RNA Codons for Protein Synthesis
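The redundancy figures quoted above can be verified by building the standard codon table programmatically. This sketch uses one-letter amino acid symbols and "*" for stop codons; the 64-letter string lists the assignments with the bases taken in the order U, C, A, G at each position. The variable names are choices made for this example.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

CODON_TABLE = {"".join(triplet): aa
               for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(stop_codons)                               # ['UAA', 'UAG', 'UGA']
print(degeneracy["L"], degeneracy["M"], degeneracy["W"])   # 6 1 1
print(sum(degeneracy.values()) / len(degeneracy))          # 3.05
```

Sixty-one sense codons shared among twenty amino acids give the average redundancy of about three mentioned in the text.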
The translation process is fundamentally straightforward. The mRNA strand bearing the
transcribed code for synthesis of a protein interacts with relatively small RNA
molecules (about 70 nucleotides) to which individual amino acids have been
attached by an ester bond at the 3'-end.
These transfer RNAs (tRNA)
have distinctive three-dimensional structures consisting of loops of
single-stranded RNA connected by double stranded segments. This cloverleaf
secondary structure is further wrapped into an "L-shaped" assembly,
having the amino acid at the end of one arm, and a characteristic anti-codon region at the other end. The anti-codon
consists of a nucleotide triplet that is the complement of the amino acid's
codon(s). Models of two such tRNA molecules are shown to the right. When read
from the top to the bottom, the anti-codons depicted here should complement a
codon in the previous table.
A cell's protein synthesis takes place in organelles called ribosomes.
Ribosomes are complex structures made up of two distinct and separable subunits
(one about twice the size of the other). Each subunit is composed of one or two
RNA molecules (60-70%) associated with 20 to 40 small proteins (30-40%). The
ribosome accepts a mRNA molecule, binding initially to a characteristic
nucleotide sequence at the 5'-end (colored light blue in the following
diagram). This unique binding assures that polypeptide synthesis starts at the
right codon. A tRNA molecule with the appropriate anti-codon then attaches at
the starting point and this is followed by a series of adjacent tRNA
attachments, peptide bond formation and shifts of the ribosome along the mRNA
chain to expose new codons to the ribosomal chemistry.
The genetic code
It became apparent during the early
phase of the investigation of protein synthesis that translation is
fundamentally different from the transcription process that precedes it. During transcription the "language"
of DNA sequences is converted to the closely related dialect of RNA sequences.
During protein synthesis, however, a nucleic acid base sequence is converted
to a clearly different language (i.e., an amino acid sequence), hence the use
of the term "translation." Because mRNA and amino acid
molecules have no natural affinity for each other, it became obvious to
researchers (e.g., Francis Crick) that a series of adaptor molecules are
required to mediate the translation process. This role was eventually assigned
to tRNA molecules.
Before the identification of adaptor molecules became
feasible, however, a more important problem had to be solved: the deciphering
of the genetic code.
The genetic code can be
described as a coding dictionary that specifies a meaning for each specific
base sequence. Once the importance of the genetic code was recognized,
investigators began to speculate about its dimensions. Because only four different
bases (G, C, A, and U) occur in mRNA and 20 amino acids must be specified, it
appeared obvious that more than one base coded for each amino acid. A sequence
of two bases would specify only a total of 16 amino acids (i.e., 4^2 = 16). However, a three-base sequence provides more
than sufficient base combinations for translation to occur (i.e., 4^3 = 64).
The first major
breakthrough in assigning mRNA triplet base sequences (later referred to as codons) came in 1961, when Marshall Nirenberg and Heinrich Matthaei performed a series of experiments using an artificial
test system containing an extract of Escherichia coli fortified with nucleotides, amino acids, ATP, and GTP.
They showed that poly U (a synthetic polynucleotide whose base components
consist only of uracil) directed the synthesis of polyphenylalanine. Assuming
that codons consist of a three-base sequence, they surmised that UUU
codes for the amino acid phenylalanine. Subsequently, they repeated their
experiment using poly A and poly C. Because polylysine and polyproline products
resulted from these tests, the codons AAA and CCC were assigned to lysine and
proline, respectively.
Most of the remaining codon assignments were
determined with the aid of synthetic polynucleotides with repeating sequences.
Such molecules were constructed by enzymatically amplifying short chemically
synthesized sequences. The resulting polypeptides, which contained repeating
peptide segments, were then analyzed. The information obtained from this
technique, devised by Har Gobind Khorana, was later supplemented with a
strategy used by Nirenberg. In this latter technique the capacity of specific
trinucleotides to promote tRNA binding to ribosomes was measured.
The codon assignments for the 64 possible trinucleotide
sequences are presented in the codon table. Of these, 61 code for amino acids. The
remaining three codons (UAA, UAG, and UGA) are stop (polypeptide chain
terminating) signals. AUG, the codon for methionine, also serves as a start
signal (sometimes referred to as the initiating codon).
As a result of a variety of
investigations, the genetic code is now believed to possess the following
properties:
1. Degenerate. Any coding system in
which several signals have the same meaning is said to be degenerate. The genetic
code is partially degenerate because most amino acids are coded for by several
codons. For example, leucine is coded for by six different codons (UUA, UUG,
CUU, CUC, CUA, and CUG). In fact, methionine (AUG) and tryptophan (UGG) are the
only amino acids that are coded for by a single codon.
2. Specific. Each codon is a signal for a specific amino acid. The majority of
codons that code for the same amino acid possess similar sequences. For
example, in four of the six serine codons (UCU, UCC, UCA, and UCG) the first
and second bases are identical. It would appear that this feature of the
genetic code serves to minimize the damage done by point mutations (DNA sequence
changes involving a single base pair).
3. Nonoverlapping and without punctuation. The mRNA coding
sequence is "read" by a ribosome starting from the initiating codon
(AUG) as a continuous sequence taken three bases at a time until a stop codon
is reached. A set of contiguous triplet codons in an mRNA is called a reading
frame. The term open reading frame is used to describe a series of triplet base
sequences in mRNA that does not contain a stop codon.
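The reading rule in property 3 can be written out as a short routine. The sketch below is a deliberate simplification: it starts at the first AUG it finds, whereas a real ribosome also relies on the surrounding sequence to choose its start site. The example message is hypothetical, and the table uses one-letter amino acid symbols with "*" for stop.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(triplet): aa
               for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Read non-overlapping triplets from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                      # no initiating codon found
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # UAA, UAG or UGA ends the chain
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("GGAUGGCAUUCUAAGG"))   # MAF  (Met-Ala-Phe)
```

Shifting the start by one base would change every codon that follows, which is why the position of the initiating codon fixes the reading frame.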
4. Universal. With a few minor exceptions the genetic code is universal. In other
words, examinations of the translation process in the species that have been
investigated have revealed that the coding signals for amino acids are always
the same.
Codon-Anticodon Interactions
tRNA molecules are the "adapters" that
are required for the translation of the genetic message. Each type of tRNA
binds a specific amino acid (at the 3' terminus) and possesses a three-base
sequence called the anticodon. It is the base pairing between the anticodon of
the tRNA and an mRNA codon that is responsible for the actual translation of
the genetic information of structural genes. It should be noted that
codon-anticodon pairings are antiparallel. However, both sequences are given in
the 5' → 3'
direction. For example, the codon UGC binds to the anticodon GCA.
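Because the pairing is antiparallel while both sequences are written 5' → 3', the anticodon is the reverse complement of the codon. A minimal sketch, considering standard Watson-Crick pairs only and using an invented function name:

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the anticodon (5'->3') that pairs with a codon (5'->3').

    The codon is reversed first because the two strands run in
    opposite directions, then each base is complemented.
    """
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon))


print(anticodon_for("UGC"))   # GCA, the example given in the text
print(anticodon_for("AUG"))   # CAU
```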
Once the genetic code was
broken, researchers anticipated the identification of 61 different types of tRNAs in
living cells. Instead, they discovered that cells often operate with
substantially fewer tRNAs than expected. Most cells possess about 50 tRNAs, although lower
numbers have been observed. Further investigation of various tRNAs also
revealed that the anticodon in some molecules contains uncommon nucleotides,
such as inosinate (I), which typically occurs at the wobble position (the 5' base of the anticodon, which pairs with the third codon base).
(In eukaryotes, A in this position is deaminated to form I.)
As tRNAs were investigated, it became increasingly clear that some molecules
recognize several codons. Crick proposed a rational explanation for this
phenomenon, which he referred to as the wobble hypothesis.
The wobble
hypothesis, which allows for multiple codon-anticodon interactions by
individual tRNAs, is based principally on the following
observations:
1. The first two base pairings in a codon-anticodon
interaction confer most of the specificity required during translation.
Recall that most redundant codons specifying a certain amino acid possess
identical nucleotides in the first two positions. These interactions are
standard base pairings.
2. The
interactions between the third codon and anticodon nucleotides are less
stringent. In fact, nontraditional base pairs often occur. For example, tRNAs containing G in the 5' (or "wobble") position
of the anticodon can pair with two different codons (i.e., G can interact with
either C or U). The same is true for U, which can interact with A or G. When I
is in the wobble position of an anticodon, a tRNA can base pair with three
different codons, since I can interact with U or A or C.
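The wobble rules stated above can be turned into a small lookup that lists the codons a given anticodon can read. The entries for G, U and I come from the text; the entries for C and A (standard pairing only) are an assumption added so that every base has a rule. Function and variable names are illustrative.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Third-position codon bases accepted by each 5' (wobble) anticodon base.
# G, U and I follow the text; C and A are assumed to pair conventionally.
WOBBLE_PARTNERS = {"G": "CU", "U": "AG", "I": "UAC", "C": "G", "A": "U"}


def codons_read(anticodon):
    """List the codons (5'->3') recognized by an anticodon (5'->3')."""
    wobble_base, middle, last = anticodon
    first_codon_base = RNA_COMPLEMENT[last]     # pairs with anticodon 3' base
    second_codon_base = RNA_COMPLEMENT[middle]
    return [first_codon_base + second_codon_base + third
            for third in WOBBLE_PARTNERS[wobble_base]]


print(codons_read("GCA"))   # ['UGC', 'UGU'], both cysteine codons
print(codons_read("IGC"))   # ['GCU', 'GCA', 'GCC'], all alanine codons
```

In both examples the codons reached through wobble pairing specify the same amino acid, which is why a cell can manage with fewer than 61 tRNAs.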
A careful
examination of the genetic code and the "wobble rules" indicates that
a minimum of 31 tRNAs are required for the translation of all 61
codons. An additional tRNA that is required for initiating protein synthesis
brings the total to 32 tRNAs.
Recognition of amino acids
Although the accuracy of translation
(approximately one error per 10^4 amino acids incorporated)
is lower than that of DNA replication and transcription, it is remarkably
higher than one would expect of such a complex process. The principal reasons
for the accuracy with which amino acids are incorporated into polypeptides
include codon-anticodon base pairing and the mechanism by which amino acids
are attached to their cognate tRNAs. The attachment of amino acids to tRNAs, a process that is considered to be the first step in
protein synthesis, is catalyzed by a group of enzymes called the aminoacyl-tRNA synthetases. The precision with which these enzymes esterify each
specific amino acid to the correct tRNA is now believed to be so important for
accurate translation that their functioning has been referred to collectively
as the second genetic code.
In most organisms there is at least one aminoacyl-tRNA
synthetase for each of the 20 amino acids. (Note that each enzyme links its
specific amino acid to any appropriate tRNA. This is an important point, since
in most cells many amino acids have several cognate tRNAs each.)
The process in which an amino acid is linked to the 3' terminus of the correct
tRNA consists of two sequential reactions, both of which occur within the
active site of the synthetase:
1. Activation. The synthetase first catalyzes the formation of
aminoacyl-AMP. This reaction, which serves to activate the amino acid through
the formation of a high-energy mixed anhydride bond is driven to completion
through the subsequent hydrolysis of its other product,
pyrophosphate. (An anhydride is a molecule containing two acyl groups
linked through an oxygen atom; in a mixed anhydride such as aminoacyl-AMP, one of them is a phosphoryl group.)
Aminoacyl adenylate (aminoacyl-AMP)
2. tRNA linkage. A
specific tRNA, also bound in the active site of the synthetase, becomes
attached to the aminoacyl group through an ester linkage. Although the
aminoacyl ester linkage to the tRNA is lower in energy than the mixed anhydride
of aminoacyl AMP, it still possesses sufficient energy to drive peptide bond
formation.
The sum of the reactions catalyzed by the
aminoacyl-tRNA synthetases is as follows:
Amino acid + ATP + tRNA → aminoacyl-tRNA + AMP + PPi
Because the product PPi (pyrophosphate) is immediately hydrolyzed with
a large loss of free energy, tRNA charging is an irreversible process. Because
AMP is a product of this reaction, the metabolic price for the linkage of each
amino acid to its tRNA is the equivalent of two molecules of ATP.
The aminoacyl-tRNA synthetases are a diverse group of
enzymes that vary in molecular weight, primary sequence, and number of
subunits. Despite this diversity, each enzyme efficiently produces a specific
aminoacyl-tRNA product in a relatively error-free manner. The specificity with
which each of the synthetases binds the correct amino acid and its cognate
tRNA is crucial for the fidelity of the translation process.
Aminoacyl-tRNA synthetases must also recognize and
bind the correct tRNA molecules. For some enzymes (e.g., glutaminyl-tRNA
synthetase), anticodon structure is an important feature of the recognition
process. However, several enzymes appear to recognize other tRNA structural
elements in addition to or instead of the anticodon.
Protein Synthesis
The translation of
a genetic message into the primary sequence of a polypeptide can be divided
into three phases: initiation, elongation, and termination.
Translation is
relatively rapid in prokaryotes. For example, an E. coli ribosome can incorporate
as many as 20 amino acids per second. (The eukaryotic rate, at about 50
residues per minute, is significantly slower.) Prokaryotic ribosomes are
composed of a 50S large subunit and a 30S small subunit.
1. Initiation. Translation begins with the formation of an initiation complex. In prokaryotes this process requires three initiation
factors (IFs). IF-3 has
previously bound to the 30S subunit, thereby preventing it from binding
prematurely to the 50S subunit. As an mRNA binds to the 30S subunit, it is
guided into a precise location, so that the initiation codon AUG is correctly positioned. Each gene on a polycistronic
mRNA possesses its own initiation codon. The translation of each gene appears
to occur independently, that is, translation of the first gene may or may not
be followed by the translation of subsequent genes.
In the next step in initiation, IF-2 (a
GTP-binding protein with a bound GTP) binds to the 30S subunit, where it
promotes the binding of the initiating tRNA to the initiation codon in the P
site. (There are two sites on the complete ribosome for codon-anticodon
interaction: the P (peptidyl) site and the A (aminoacyl) site.) The initiating tRNA
in prokaryotes is N-formyl-methionine-tRNA. The initiation phase ends as the
50S subunit binds the 30S subunit. Simultaneously, IF-2 and IF-3 are released.
The role of IF-1 is unclear.
Most of the major differences between the prokaryotic
and eukaryotic versions of protein synthesis occur during the initiation phase.
There are at least nine eukaryotic initiating factors (eIFs), several of which
possess numerous subunits. Eukaryotic initiation begins when the small 40S
ribosomal subunit binds to a complex composed of eIF-2 (a GTP-binding protein),
GTP, and an initiating species of methionyl-tRNA. The small (40S) subunit is
prevented from binding to the large (60S) subunit during this phase of
initiation because it is associated with eIF-3, a multisubunit protein.
2. Elongation. It is during the elongation phase that the polypeptide
is actually synthesized according to the specifications of the genetic message.
Elongation consists of three steps: (1) positioning
of an aminoacyl-tRNA in the A site,
(2) peptide bond formation, and (3) translocation.
The prokaryotic elongation process begins when an
aminoacyl-tRNA, specified by the next codon, binds to the A site. Before it can
be positioned in the A site, the aminoacyl-tRNA must first bind EF-Tu-GTP. The
elongation factor EF-Tu is a GTP-binding protein involved in the positioning of
aminoacyl-tRNA molecules in the A site. After the aminoacyl-tRNA is positioned,
the GTP bound to EF-Tu is hydrolyzed to GDP and Pi. GTP hydrolysis results in
the release of EF-Tu from the ribosome. Subsequently, a second elongation
factor, referred to as EF-Ts, promotes EF-Tu regeneration by displacing its
GDP moiety. EF-Ts is then itself displaced by an incoming GTP molecule.
After the positioning of the second aminoacyl-tRNA in
the A site, the formation of a peptide bond is catalyzed by peptidyl
transferase. The energy required to drive this reaction is provided by the
high-energy ester bond linking the P site amino acid to its tRNA. (During the
first elongation cycle, this amino acid is formylmethionine.) As was described
previously, the now uncharged tRNA occupying the P site leaves the ribosome.
For translation to
continue, the mRNA must move, or "translocate," so that a new
codon-anticodon interaction can occur. Translocation requires the binding of
another GTP-binding protein referred to as EF-G. GTP hydrolysis provides the energy
required for the ribosomal conformational change that is apparently involved in
the movement of the peptidyl-tRNA (the tRNA bearing the growing peptide chain)
from the A site to the P site. The unoccupied A site then binds an appropriate
aminoacyl-tRNA to the new A site codon. After the subsequent release of EF-G
the ribosome is ready for the next elongation cycle. Elongation continues until
a stop codon enters the A site.
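The elongation cycle described above can be sketched as a simple loop. This is only an illustration: the function name `elongate`, the example mRNA, and the small codon table (a subset of the standard genetic code) are assumptions introduced here, and the elongation factors and GTP hydrolysis are not modeled.

```python
# Illustrative subset of the standard genetic code (only the codons used below).
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "AAA": "Lys", "GCU": "Ala",
}
STOP_CODONS = {"UAA", "UAG", "UGA"}

def elongate(mrna, start=0):
    """Read codons from `start` until a stop codon enters the A site."""
    peptide = []
    pos = start
    while pos + 3 <= len(mrna):
        codon = mrna[pos:pos + 3]            # codon currently in the A site
        if codon in STOP_CODONS:
            break                            # elongation ends, termination begins
        peptide.append(CODON_TABLE[codon])   # aminoacyl-tRNA positioned, peptide bond formed
        pos += 3                             # translocation moves the mRNA by one codon
    return peptide

print("-".join(elongate("AUGUUUGGCAAAGCUUAA")))
```

Each pass through the loop corresponds to one elongation cycle: A site positioning, peptide bond formation, and translocation.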
Termination. The termination phase begins when a termination codon
(UAA, UAG, or UGA) enters the A site. Three releasing factors (RF-1, RF-2, and
RF-3) are involved in termination. The codons UAA and UAG are recognized by
RF-1, whereas UAA and UGA are recognized by RF-2.
This recognition process, which involves GTP
hydrolysis, results in the following alterations in ribosome function. The
peptidyl transferase, which is transiently transformed into an esterase,
hydrolyzes the bond linking the completed polypeptide chain and the P site
tRNA. Following the polypeptide's release from the ribosome, the mRNA and tRNA
also dissociate. Termination ends with the dissociation of the ribosome into
its constituent subunits.
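The codon specificity of the release factors stated above can be written as a small lookup. This is a minimal sketch: the dictionary and function names are assumptions made for illustration, and RF-3 is omitted because the text assigns it no codon.

```python
# Stop codons recognized by each release factor, as given in the text.
RELEASE_FACTOR_CODONS = {
    "RF-1": {"UAA", "UAG"},
    "RF-2": {"UAA", "UGA"},
}

def release_factors_for(codon):
    """Return the release factors that recognize `codon` (empty list for sense codons)."""
    return sorted(rf for rf, codons in RELEASE_FACTOR_CODONS.items() if codon in codons)

for stop in ("UAA", "UAG", "UGA"):
    print(stop, release_factors_for(stop))
```

Note that UAA is recognized by both factors, whereas UAG and UGA are each recognized by only one.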
In addition to the
ribosomal subunits, mRNA and aminoacyl-tRNAs, translation requires an energy
source (GTP) and a wide variety of protein factors. These factors perform several types of roles. Some
have catalytic functions; others stabilize specific structures that form during
translation. Translation factors are classified according to the phase of the
translation process that they affect, that is, initiation, elongation, or
termination. The major differences between prokaryotic and eukaryotic
translation appear to be due largely to the identity and functioning of these
protein factors.
3. Post-translational Modification
Once a peptide or protein has been
synthesized and released from the ribosome it often undergoes further chemical
transformation. This post-translational
modification may involve the attachment of other moieties
such as acyl groups, alkyl groups, phosphates, sulfates, lipids and
carbohydrates. Functional changes such as dehydration, amidation, hydrolysis
and oxidation (e.g. disulfide bond formation) are also common. In this manner
the limited array of twenty amino acids designated by the codons may be
expanded in a variety of ways to enable proper functioning of the resulting
protein. Since these post-translational reactions are generally catalyzed by
enzymes, it may be said: "Virtually every molecule in a cell is made by
the ribosome or by enzymes made by the ribosome."
Modifications, like phosphorylation and
citrullination, are part of common mechanisms for controlling the behavior of a
protein. As shown on the left below, citrullination is the post-translational
modification of the amino acid arginine into the amino acid citrulline.
Arginine is positively charged at a neutral pH, whereas citrulline is
uncharged, so this change increases the hydrophobicity of a protein. Phosphorylation
of serine, threonine or tyrosine residues renders them more hydrophilic, but
such changes are usually transient, serving to regulate the biological activity
of the protein. Other important functional changes include iodination of
tyrosine residues in the peptide thyroglobulin by action of the enzyme
thyroperoxidase. The monoiodotyrosine and diiodotyrosine formed in this manner
are then linked to form the thyroid hormones T3 and T4, shown below.
Amino acids may be enzymatically removed from the amino end of the protein.
Because the "start" codon on mRNA codes for the amino acid
methionine, this amino acid is usually removed from the resulting protein
during post-translational modification. Peptide chains may also be cut in the
middle to form shorter strands. Thus, insulin is
initially synthesized as a 105 residue preprotein. The 24-amino acid signal
peptide is removed, yielding a proinsulin peptide. This folds and forms
disulfide bonds between cysteines 7 and 67 and between 19 and 80. Such dimeric
cysteines, joined by a disulfide bond, are named cystine. A protease then cleaves
the peptide at arg31 and arg60, with loss of the 32-60 sequence (chain C).
Removal of arg31 yields mature insulin, with the A and B chains held together
by disulfide bonds and a third cystine moiety in chain A. The following cartoon
illustrates this chain of events.
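The residue bookkeeping in the insulin example can be checked with a short calculation. This sketch assumes that the residue numbers quoted above (7, 19, 31, 60, 67, 80) refer to proinsulin, numbered from 1 after the signal peptide is removed, and that the N-terminal fragment is chain B; the function name is hypothetical.

```python
def process_insulin(preprotein_length=105, signal_length=24):
    """Return the residue numbers (proinsulin numbering) of chains B, C and A."""
    proinsulin_length = preprotein_length - signal_length   # signal peptide removed
    residues = range(1, proinsulin_length + 1)
    chain_b = [r for r in residues if r <= 30]        # arg31 is removed after cleavage
    chain_c = [r for r in residues if 32 <= r <= 60]  # excised 32-60 sequence
    chain_a = [r for r in residues if r >= 61]
    return chain_b, chain_c, chain_a

chain_b, chain_c, chain_a = process_insulin()
print(len(chain_b), len(chain_c), len(chain_a))
# Under this numbering, each interchain disulfide joins chain B to chain A.
print(7 in chain_b and 67 in chain_a, 19 in chain_b and 80 in chain_a)
```

With these assumptions the proinsulin peptide has 81 residues, and the mature chains contain 30 and 21 residues.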
Nisin is a polypeptide (34 amino acids) made by the bacterium Lactococcus lactis.
Nisin kills gram positive bacteria by binding to their membranes and targeting
lipid II, an essential precursor of cell wall synthesis. Such antimicrobial
peptides are a growing family of compounds which have received the name
lantibiotics due to the presence of lanthionine, a nonproteinogenic amino
acid with the chemical formula HO2C-CH(NH2)-CH2-S-CH2-CH(NH2)-CO2H. Lanthionine
is composed of two alanine residues that are crosslinked on their β-carbon
atoms by a thioether linkage (i.e. it is the monosulfide analog of the
disulfide cystine). Lantibiotics are unique in that they are ribosomally
synthesized as prepeptides, followed by post-translational processing of a
number of amino acids (e.g. serine, threonine and cysteine) into dehydro
residues and thioether crossbridges. Nisin is the only bacteriocin that is
accepted as a food preservative. Several nisin subtypes that differ in amino
acid composition and biological activity are known. A typical structure is
drawn below, and a Jmol model will be presented by clicking on the diagram.
The
bacterial cell wall is a cross-linked glycan polymer that surrounds bacterial
cells, dictates their cell shape, and prevents them from breaking due to
environmental changes in osmotic pressure. This wall consists mainly of
peptidoglycan or murein, a three-dimensional polymer of sugars and amino acids
located on the exterior of the cytoplasmic membrane.
The monomer units are composed of two amino sugars, N-acetylglucosamine
(NAG) and N-acetylmuramic acid (NAM), as shown.
Transglycosidase enzymes join these units by glycoside bonds, and they are
further interlinked to each other via peptide cross-links between the
pentapeptide moieties that are attached to the NAM residues. Peptidoglycan
subunits are assembled on the cytoplasmic side of the bacterial membrane from a
polyisoprenoid anchor. Lipid II, a membrane-anchored cell-wall precursor that
is essential for bacterial cell-wall biosynthesis, is one of the key components
in the synthesis of peptidoglycan. Peptidoglycan synthesis via polymerization
of Lipid II is illustrated in the following diagram. Cross-linking of the
peptide side chains is then effected by transpeptidase enzymes. A model of
Lipid II complexed with nisin may be examined as part of the previous Jmol
display.
In order for bacteria to divide by binary fission and increase their size
following division, links in the peptidoglycan must be broken, new
peptidoglycan monomers must be inserted, and the peptide cross links must be
resealed. Transglycosidase enzymes catalyze the formation of glycosidic bonds between
the NAM and NAG of the peptidoglycan monomers and the NAG and NAM of the
existing peptidoglycan. Finally, transpeptidase enzymes re-form the peptide
cross-links between the rows and layers of peptidoglycan making the wall
strong. Many antibiotic drugs, including penicillin, target the chemistry of
cell wall formation. The effectiveness of choosing Lipid II for an
antibacterial strategy is highlighted by the fact that it is the target for at
least four different classes of antibiotic, including the clinically important
glycopeptide antibiotic vancomycin. The growing problem of bacterial resistance
to many current drugs, including vancomycin, has led to increasing interest in
the therapeutic potential of other classes of compound that target Lipid II. Lantibiotics
such as nisin are part of this interest.
Analysis of Structural Similarities and Differences between DNA and RNA
1. Background
We know that living organisms have the
ability to reproduce and to pass many of their characteristics on to their offspring.
From this we may infer that all organisms have genetic substances and an
associated chemistry that enable inheritance to occur. It is instructive to
consider the essential requirements such genetic materials must fulfill.
Information |
Since this genetic substance has been
identified as the nucleic acids DNA and RNA, it is instructive to examine the
manner in which these polymers satisfy the above requirements.
2. Information Storage
The complexity of life suggests that even
simple organisms will require very large inheritance libraries. Although the four
nucleotides that make up DNA might appear to be too simple for this task,
the enormous size of the polymer and the permutations of the monomers within
the chain meet the challenge easily. After all, the words and graphics in this
document are all presented to the computer as combinations of only two
characters, zeros and ones (the binary number system). DNA has four letters in
its alphabet (A, C, G & T), so the number of words that can be formed
increases exponentially with the number of letters per word. Thus, there are 4² = 16
two-letter words, and 4³ = 64 three-letter words.
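The word counts quoted above follow from the rule that an alphabet of k letters gives k^n words of length n. A minimal sketch that counts and enumerates them (the function names are illustrative only):

```python
from itertools import product

ALPHABET = "ACGT"

def count_words(length, alphabet=ALPHABET):
    """Number of distinct words of the given length: k ** n."""
    return len(alphabet) ** length

def enumerate_words(length, alphabet=ALPHABET):
    """Explicitly list every word of the given length."""
    return ["".join(letters) for letters in product(alphabet, repeat=length)]

for n in (1, 2, 3):
    print(n, count_words(n), len(enumerate_words(n)))
```

The same function with the alphabet "01" gives the corresponding counts for the binary system mentioned in the text.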
Assuring the stability of information encoded
by the DNA alphabet presents a serious challenge. If the letters of this
alphabet are to be strung together in a specific way on the polymer chain,
chemical reactions for attaching (and removing) them must be available. Simple
carboxylic ester or amide links might appear suitable for this purpose (note step-growth polymerization), but these are
used in lipids and polypeptides, so a separate enzymatic machinery would be
needed to keep the information processing operations apart from other molecular
transformations. The
overall stability of such covalent links presents a more serious problem. Under
physiological conditions (aqueous, pH near 7.4 & 27 to 37º C) esters
are slowly hydrolyzed. Amides are more stable, but even a hydrolytic cleavage
of one bond per hour would be devastating to a polymer having tens of thousands
to millions such links. Furthermore, short difunctional linking groups, such as
carbonates, oxalates and malonates show enhanced reactivity, and their parent
acids are unstable or toxic.
Ester Hydrolysis at 35º C and pH 7
Ester | Rate of Hydrolysis | Relative Rate
Ethyl Acetate | 1.0 × 10^-2 | 5 × 10^6
Trimethyl Phosphate | 3.4 × 10^-4 | 2 × 10^5
Dimethyl Phosphate | 2.0 × 10^-9 | 1.0
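The relative rates in this table can be reproduced by dividing each hydrolysis rate by that of dimethyl phosphate. The sketch below assumes only the three tabulated values; the units of the rates are not given in the table and are left unspecified.

```python
# Hydrolysis rates from the table (units as in the source, not stated there).
RATES = {
    "Ethyl Acetate": 1.0e-2,
    "Trimethyl Phosphate": 3.4e-4,
    "Dimethyl Phosphate": 2.0e-9,
}

def relative_rates(rates, reference="Dimethyl Phosphate"):
    """Express every rate as a multiple of the reference ester's rate."""
    ref = rates[reference]
    return {name: rate / ref for name, rate in rates.items()}

for name, rel in relative_rates(RATES).items():
    print(f"{name}: {rel:.1e}")
```

The computed ratio for trimethyl phosphate is about 1.7 × 10^5, which the table rounds to 2 × 10^5.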
Phosphate is a ubiquitous inorganic nutrient. Mono-, di- and triesters of
the corresponding acid (phosphoric acid) are all known. Because of their
acidity (pKa ≈ 2), the mono and diesters are negatively charged at
physiological pH, rendering them less susceptible to nucleophilic attack. The
influence of negative charge on the rate of nucleophilic hydrolysis of some
representative esters is shown in the table on the right. Clearly, a polymer in
which monomer units are joined by negatively charged phosphate diester links
should be substantially more stable than one composed of carboxylate ester
bonds. The negative charge found on all biological phosphate derivatives serves
other purposes as well.
• The phosphate diester links that join the nucleotide units of DNA are formed by
phosphorylation reactions involving nucleoside triphosphate
reagents. These reagents are the phosphoric acid analogs of carboxylic acid
anhydrides, a functional group that would not survive the aqueous environment
of a cell. The high density of negative charge on the triphosphate function not
only solubilizes the organic moiety to which it is attached, but also reduces
the rate at which it is hydrolyzed.
• Living cells must conserve and employ their chemical reagents within a volume defined and
enclosed by a membrane barrier. These lipid bilayer membranes have hydrophobic interiors, which resist
the passage of ions. Indeed, special trans-membrane structures called ion channels exist so that controlled ion transport
across a membrane may take place. Small neutral organic molecules, such as
adenosine, cytidine and guanosine, may pass through lipid membranes, albeit at
a reduced rate, but their mono, di and triphosphate derivatives are more
tightly sequestered in the cell.
3. Why is 2'-Deoxyribose the Sugar Moiety in DNA?
Common perhydroxylated sugars, such as glucose and ribose, are formed in
nature as products of the reductive condensation of carbon dioxide we call photosynthesis.
The formation of deoxysugars requires additional biological reduction steps, so it is
reasonable to speculate why DNA makes use of the less common 2'-deoxyribose, when
ribose itself serves well for RNA. At least two problems associated with the
extra hydroxyl group in ribose may be noted. First, the additional bulk and
hydrogen bonding character of the 2'-OH interfere with a uniform double helix
structure, preventing the efficient packing of such a molecule in the
chromosome. Second, RNA undergoes spontaneous hydrolytic cleavage about one
hundred times faster than DNA. This is believed to be due to intramolecular attack of
the 2'-hydroxyl function on the neighboring phosphate diester, yielding a
2',3'-cyclic phosphate. If stability over the lifetime of an organism is an
essential characteristic of a gene, then nature's selection of 2'-deoxyribose
for DNA makes sense. The following diagram illustrates the intramolecular cleavage
reaction in a strand of RNA.
Structural stability is not a serious challenge for RNA. The transcribed information
carried by mRNA must be secure for only a few hours, as it is transported to a
ribosome. Once in the ribosome it is surrounded by structural and enzymatic
segments that immediately incorporate its codons for protein synthesis. The
tRNA molecules that carry amino acids to the ribosome are similarly short
lived, and are in fact continuously recycled by the cellular chemistry.
4. The Thymine vs. Uracil Issue
Structural formulas for the three pyrimidine bases, cytosine, thymine and
uracil are shown. The carbon atoms that are part of these compounds may be
categorized as follows. All of these compounds are apparently put together from
a three-carbon malonate-like precursor (blue colored bonds) and a single high
oxidation state carbon species (colored red). Such biosynthetic intermediates
are well established. Thymine is unique in having an additional carbon, the
green methyl group. Biosynthesis of this compound must involve additional
steps, thus adding constructional complexity to the DNA molecules in which it
replaces uracil.
The reason for the substitution of thymine for uracil in DNA may be
associated with the repair mechanisms by which the cell corrects damage to its
DNA. One source of error in the code is the slow hydrolysis of heterocyclic
enamines, such as cytosine and guanine, to their corresponding lactams. This
changes the structure of the base, and disrupts base pairing in a manner that
can be identified and then repaired. However, the hydrolysis product from
cytosine is uracil, and this mismatched species must somehow be distinguished
from the uracil-like base that belongs in the DNA. The extra methyl group
serves this role nicely.
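The repair argument above can be illustrated with a toy model. Strands are represented as plain strings, and the function names and example sequence are hypothetical; the point is only that, because DNA normally contains thymine rather than uracil, any U found in a DNA strand can be flagged as damage.

```python
def deaminate_cytosine(strand, position):
    """Return a copy of `strand` in which the C at `position` has become U."""
    if strand[position] != "C":
        raise ValueError("no cytosine at this position")
    return strand[:position] + "U" + strand[position + 1:]

def find_damage(dna_strand):
    """In thymine-based DNA, every U is foreign and marks a site needing repair."""
    return [i for i, base in enumerate(dna_strand) if base == "U"]

original = "ATGCCGTA"
damaged = deaminate_cytosine(original, 3)
print(damaged, find_damage(damaged))
```

If DNA used uracil in place of thymine, the same scan would also flag every legitimate base, so the damage could not be singled out this way.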
Gene Expression
Ultimately, the internal order that is the most
essential property of living organisms requires the precise and timely
regulation of gene expression. It is, after all, the capacity to switch genes on
and off that enables cells to respond efficiently to a changing environment. In
multicellular organisms, complex programmed patterns of gene expression are
responsible for cell differentiation as well as intercellular cooperation.
The regulation of
genes, as measured by their transcription rates, is the result of a complex
hierarchy of control elements that act to coordinate the cell's metabolic
activities. Some genes, referred to as constitutive or housekeeping genes, are routinely transcribed because they code
for gene products (e.g., glucose-metabolizing enzymes, ribosomal proteins, and
histones) that are required for cell function. In addition, in the
differentiated cells of multicellular organisms, certain specialized proteins
are produced that cannot be detected elsewhere (e.g., hemoglobin in red blood
cells). Genes that are expressed only under certain circumstances are
referred to as inducible. For example, the enzymes that are required for lactose metabolism in E.
coli are synthesized only when lactose is actually present and glucose, the
bacterium's preferred energy source, is absent.
Most of the mechanisms that are used
by living cells to regulate gene expression involve DNA-protein interactions. At
first glance, the seemingly repetitious and regular structure of B-DNA appears
to make it an unlikely partner for the sophisticated binding with myriad
different proteins that obviously must occur in gene regulation. However, DNA
is somewhat deformable, and certain sequences can be curved or bent. In
addition, it is now recognized that the edges of the base pairs within the
major groove (and to a lesser extent the minor groove) of the double helix can
participate in sequence-specific binding to proteins. Numerous contacts (often about 20 or so)
involving hydrophobic interactions, hydrogen bonds, and ionic bonds between
amino acids and nucleotide bases result in highly specific DNA-protein binding.
The three-dimensional structures of a
number of DNA regulatory proteins that have been determined have surprisingly
similar features. In addition to usually possessing twofold axes of symmetry,
most of these molecules can be separated into families on the basis of the
following structural domains: (1) helix-turn-helix, (2) helix-loop-helix, (3)
leucine zipper, (4) zinc finger, and (5) beta-sheets. It should be noted that
DNA-binding proteins, many of which are transcription factors, often form
dimers. For example, a variety of transcription factors with leucine zipper
motifs form dimers as their leucine-containing α-helices interdigitate. Because
each type of protein possesses its own unique binding specificity, the capacity
of these and many other transcription factors to combine to form homodimers (two
identical monomers) and heterodimers (two different monomers) results in a large number of unique gene
regulatory agents.
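The combinatorial effect of dimerization can be made concrete: n distinct monomers give n homodimers and n(n-1)/2 heterodimers. The sketch below uses hypothetical monomer names and assumes, purely for counting purposes, that every pair is able to dimerize, which need not hold for real transcription factors.

```python
from itertools import combinations_with_replacement

def dimer_combinations(monomers):
    """All unordered pairs of monomers, including pairs of identical monomers."""
    return list(combinations_with_replacement(monomers, 2))

monomers = ["A", "B", "C"]          # hypothetical transcription factor monomers
dimers = dimer_combinations(monomers)
homodimers = [d for d in dimers if d[0] == d[1]]
heterodimers = [d for d in dimers if d[0] != d[1]]
print(len(homodimers), len(heterodimers), len(dimers))
```

Under this assumption, ten monomers would yield 55 possible dimers, which shows how a modest number of proteins can provide many distinct regulatory agents.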
Considering the obvious complexity of
function observed in living organisms, it is not surprising that the regulation
of gene expression has proven to be both remarkably complex and difficult to
investigate. For many reasons, knowledge concerning prokaryotic gene
expression is significantly more advanced than that of eukaryotes. Prokaryotic
gene expression was originally investigated, in part, as a model for the study
of the more complicated gene function of mammals. Although it is now recognized
that the two genome types are vastly different in many respects, the
prokaryotic work has provided many valuable insights into the basic mechanisms
of gene expression. In general, prokaryotic gene expression involves the
interaction of specific proteins (sometimes referred to as regulators) with DNA in the immediate vicinity of a transcription start site. Such interactions may have either a positive effect
(i.e., transcription is initiated or increased) or a negative effect (i.e.,
transcription is blocked). In
an interesting variation, the inhibition of a negative regulator (called a repressor) results
in the activation of affected genes. (The inhibition of a repressor gene is referred to as derepression.) Eukaryotic gene expression also uses these
mechanisms as well as several others, including gene rearrangement and
amplification and various types of complex transcriptional, RNA processing, and
translational controls. In addition, the spatial separation of transcription
and translation that is inherent in eukaryotic cells provides another
opportunity for regulation: RNA transport control. Finally, eukaryotes (as well
as prokaryotes) also regulate cell function by modulating proteins
through various types of covalent modification.
The discussion of prokaryotic gene
expression focuses on the lac operon. The lac
operon of E. coli, originally investigated by Francois Jacob and Jacques Monod in the 1950s, remains one of the best-understood models of gene
regulation. Despite a daunting lack of knowledge concerning eukaryotic gene
expression, a significant number of the pieces in this marvelous puzzle have
been revealed.
The highly regulated metabolism of prokaryotes such as E.
coli allows these organisms to respond rapidly to changing environmental conditions
in a manner that promotes growth and survival. The timely synthesis of enzymes
and other gene products only when needed prevents the waste of energy and
nutritional resources. At the genetic level, the control of inducible genes is
often effected by collections of structural and regulatory genes called operons. Investigations
of operons, especially the lac operon, have provided substantial insight into how gene expression can be altered by
environmental conditions. Similarly, investigations of viral infections of
prokaryotes have furnished relatively unobstructed views of certain genetic
mechanisms. The infection of E. coli by bacteriophage has been especially instructive.
The Lac Operon. The lac operon consists of a control element
and structural genes that
code for the enzymes of lactose metabolism. The control element contains the promoter
site, which overlaps the operator site. (In prokaryotes the operator is a DNA sequence involved in the regulation of
adjacent genes that binds to a repressor protein.) The promoter site also
contains the CAP site. The structural genes Z, Y,
and A specify the primary structure of β-galactosidase, lactose permease, and
thiogalactoside transacetylase, respectively. β-Galactosidase
catalyzes the hydrolysis of lactose, which yields the monosaccharides galactose
and glucose, whereas lactose permease promotes lactose transport into the cell. Because
lactose metabolism proceeds normally in the absence of thiogalactoside
transacetylase, its role is unclear. A repressor gene i, directly adjacent to
the lac operon, codes for the lac repressor protein, a tetramer that binds to
the operator site with high affinity. (There are about ten copies of lac
repressor per cell.) The binding of the lac repressor to the operator prevents
the functional binding of RNA polymerase to the promoter.
In the absence of its inducer (allolactose) the lac
operon remains repressed because of the binding of lac repressor to the
operator. When lactose becomes available, a few molecules are converted to
allolactose by β-galactosidase.
Allolactose then binds to the repressor, causing a change in its conformation
that promotes dissociation from the operator. Once the inactive repressor
diffuses away from the operator, the transcription of the structural genes is
initiated. The lac operon remains active until the lactose supply is consumed.
The repressor subsequently reverts to its active form and rebinds to the
operator.
Glucose is the preferred carbon and
energy source for E. coli. In the event that the organism is exposed to both glucose and lactose, the
glucose is metabolized first. Synthesis of the lac operon enzymes is induced
only after the glucose is no longer available. (This makes sense because
glucose is more commonly available and has a central role in cellular
metabolism. Why expend the energy to synthesize the enzymes required for the
metabolism of other sugars if glucose is also available?) The delay in
activating the lac operon is mediated by a catabolite gene
activator protein (CAP). CAP is an allosteric homodimer that binds to the chromosome at a
site directly in front of the lac promoter when glucose is absent. CAP can act
as an indicator of glucose concentration because of its capacity to bind to
cAMP. (For reasons that are not yet clear, the cell's cAMP concentration is
inversely related to glucose concentration.) The binding of cAMP to CAP, a process that occurs
only when glucose is absent and cAMP levels are high, causes a conformational
change that allows the protein to bind to the lac promoter. CAP binding promotes transcription by increasing the affinity of RNA
polymerase for the lac promoter. In other words, CAP exerts a positive or
activating control on lactose metabolism.
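The two controls described above, negative control by the lac repressor and positive control by CAP, can be summarized as a truth table. The following is a deliberately simplified Boolean sketch: the function and key names are assumptions, and real expression levels are graded rather than on/off.

```python
def lac_operon(lactose_present, glucose_present):
    """Boolean model of lac operon control as described in the text."""
    # Allolactose (formed from lactose) releases the repressor from the operator.
    repressor_on_operator = not lactose_present
    # cAMP is high only when glucose is absent, allowing CAP to bind near the promoter.
    cap_bound = not glucose_present
    induced = (not repressor_on_operator) and cap_bound
    return {
        "repressor_on_operator": repressor_on_operator,
        "cap_bound": cap_bound,
        "induced": induced,
    }

for lactose in (False, True):
    for glucose in (False, True):
        state = lac_operon(lactose, glucose)
        print(f"lactose={lactose} glucose={glucose} induced={state['induced']}")
```

Only the combination of lactose present and glucose absent gives induction, in agreement with the earlier statement that the enzymes of lactose metabolism are synthesized only under that condition.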
Protein synthesis is an
extraordinarily complex process in which genetic information encoded in the
nucleic acids is "translated" into the 20 amino acid "alphabet"
of polypeptides.
http://www.youtube.com/watch?v=1PSwhTGFMxs&feature=related
http://www.youtube.com/watch?v=5bLEDd-PSTQ
http://www.youtube.com/watch?v=-zb6r1MMTkc&feature=related
Posttranslational Modification
Regardless of the species, immediately
after translation, some polypeptides fold into their final form without further
modifications. Frequently, however, newly synthesized polypeptides are
modified. These alterations, referred to as posttranslational
modifications, can be considered to be the fourth
phase of translation. They include the removal of portions of the
polypeptide by proteases, the addition of a variety of groups to the side chains of certain amino acid
residues, and the insertion of
cofactors. Often, individual polypeptides then
combine to form polymeric proteins. Posttranslational modifications appear to
serve two general purposes: (1) preparation of a polypeptide to serve its
specific function and (2) direction of a polypeptide to a specific location, a
process referred to as targeting. Targeting is
an especially complex process in eukaryotes because proteins must be directed
to a variety of different destinations. In addition to cytoplasm and the plasma
membrane (the principal destinations in prokaryotes), eukaryotic proteins may
be destined for delivery to a variety of organelles (e.g., mitochondria,
chloroplasts, lysosomes, peroxisomes).
Most nascent polypeptides undergo one or more types of
covalent modifications. These
alterations, which may occur either during ongoing polypeptide synthesis or
afterwards, consist of reactions that modify the side chains of specific amino
acid residues or involve the breaking of specific bonds. In general, posttranslational modifications prepare each molecule for
its functional role and/or for folding into its native (i.e., biologically
active) conformation. Examples of prominent posttranslational changes include
the following:
1. Proteolytic cleavage. Typical examples of proteolytic cleavage include the
removal of the N-terminal methionine residue, signal sequences, and the
conversion of inactive precursors to their active counterparts. Recall,
for example, that certain enzymes, referred to as proenzymes or zymogens, are
transformed into their active forms by cleavage of specific peptide
bonds. Inactive polypeptide precursors are called proproteins. The proteolytic
processing of insulin provides a well-researched example of the conversion of a
nonenzyme protein into its active form.
2. Glycosylation. Although a wide variety of eukaryotic proteins are
glycosylated, the functional purpose of the carbohydrate moieties is not
always obvious. In general, secreted proteins contain complex oligosaccharide
species, while ER membrane proteins possess high mannose species.
3. Hydroxylation. Hydroxylation of the amino acids proline and lysine is required
for the structural integrity of the connective tissue proteins collagen and
elastin. Additionally, 4-hydroxyproline is found in acetylcholinesterase
(the enzyme that degrades the neurotransmitter acetylcholine) and complement
(a complex series of serum proteins involved in the immune response). Ascorbic
acid (vitamin C) is required for the hydroxylation of proline and lysine
residues in collagen. When
dietary intake is inadequate, scurvy results. The
symptoms of scurvy (e.g., blood vessel fragility and poor wound healing) are a
consequence of weak collagen fiber structure.
4. Phosphorylation. The roles of protein phosphorylation in various examples
of metabolic control and signal transduction are well known. Protein phosphorylation
may also play a critical (and interrelated) role in protein-protein
interactions. For example, the autophosphorylation of
tyrosine residues in PDGF receptors apparently results in the subsequent
binding of certain cytoplasmic signaling molecules.
5. Lipophilic modifications. The covalent attachment of lipid moieties to
proteins improves membrane binding capacity and/or certain protein-protein
interactions. Among the most common lipophilic modifications is acylation (the
attachment of fatty acids). Although the fatty acid myristate (14:0) is relatively
rare in eukaryotic cells, myristoylation is one of the most common forms of
acylation.
6. Methylation. Protein methylation serves several purposes in
eukaryotes. The methylation of altered aspartate residues by a specific type of
methyltransferase promotes either the repair or the degradation of damaged
proteins. Other methyltransferases catalyze reactions that alter the cellular
roles of certain proteins.
7. Disulfide bond formation. Disulfide bonds are generally found only in
secretory proteins (e.g., insulin) and
certain membrane proteins. (Recall that "disulfide bridges" are
strong bonds that confer considerable structural stability on the molecules
that contain them.) Cytoplasmic proteins generally do not possess disulfide
bonds because of the presence of various reducing agents in cytoplasm (e.g.,
glutathione and thioredoxin).
Targeting
Despite the vast complexities of eukaryotic
cell structure and function, each newly synthesized polypeptide is normally
directed to its proper destination. Considering that translation takes place in
the cytoplasm (except for certain molecules that are produced within
mitochondria and plastids) and that a wide variety of polypeptides must be
directed to various locations, it is not surprising that the mechanisms by
which cellular proteins are "targeted" are complex. Although this
process is not yet completely understood, there appear to be two principal
mechanisms by which polypeptides are directed to their correct locations: transcript
localization and signal peptides.
It is generally recognized that cells
often have asymmetric protein distributions within the cytoplasm. It is now
believed that cytoplasmic protein gradients are created by transcript
localization, that is, the binding of specific mRNA to receptors in certain
cytoplasmic locations.
Polypeptides that are destined for
secretion or for use in the plasma membrane or any of the membranous organelles
must be specifically targeted to their proper location. Several types of these
proteins possess sorting signals that are referred to as signal
peptides. Each signal peptide sequence promotes
the insertion of the polypeptide that contains it into an appropriate membrane.
Translational Control Mechanisms
Protein synthesis is an exceptionally expensive process. With a cost of four
high-energy phosphate bonds per peptide bond (i.e., two bonds expended during
tRNA charging and one each during A site-tRNA binding and translocation) it is
perhaps not surprising that enormous quantities of energy are involved.
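The energy accounting above can be turned into a rough estimate. This sketch uses only the figures given in the text (two bonds for tRNA charging, one for A site binding, one for translocation) and ignores initiation and termination; the function name and the example chain length are assumptions.

```python
# High-energy phosphate bonds spent per peptide bond, as itemized in the text.
BONDS_TRNA_CHARGING = 2
BONDS_A_SITE_BINDING = 1
BONDS_TRANSLOCATION = 1
BONDS_PER_PEPTIDE_BOND = (
    BONDS_TRNA_CHARGING + BONDS_A_SITE_BINDING + BONDS_TRANSLOCATION
)

def elongation_cost(peptide_bonds):
    """Phosphate bonds consumed to form the given number of peptide bonds."""
    return peptide_bonds * BONDS_PER_PEPTIDE_BOND

# A hypothetical 300-residue polypeptide contains 299 peptide bonds.
print(elongation_cost(299))
```

Even for a modest protein the total runs to more than a thousand high-energy bonds per molecule synthesized.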
Although the speed and accuracy of translation require a high energy input, the cost
would be even higher without metabolic control mechanisms. It is these
mechanisms that allow prokaryotic cells to compete with each other for limited
nutritional resources.
Eukaryotic translation control
mechanisms are proving to be exceptionally complex, substantially more so than those
observed in prokaryotes. In prokaryotes such as E. coli, most
of the control of protein synthesis occurs at the level of transcription. This
circumstance makes sense for several reasons. First, transcription and
translation are directly coupled; that is, translation is initiated shortly
after transcription begins. Second, the lifetime of prokaryotic mRNA is usually
relatively short. With half-lives of between 1 and 3 minutes, the types of
mRNAs produced in a cell can be quickly altered as environmental conditions
change.
Despite the preeminence of transcriptional control
mechanisms, there are variations in the rates of prokaryotic mRNA translation.
An interesting example of negative translational
regulation in prokaryotes is provided by ribosomal protein synthesis. There
are approximately 55 proteins in prokaryotic ribosomes. These molecules are
coded for by genes occurring in 20 operons. Efficient bacterial growth
requires that their synthesis be coordinately regulated among the operons as
well as with rRNA synthesis. For example, in the PL11 operon, which contains the genes for the ribosomal proteins L1 and
L11, excessive amounts of L1 (i.e., more L1 molecules than can bind available
23S rRNA) trigger an inhibition of PL11 mRNA translation. Apparently, L1 can bind to either 23S rRNA or PL11 mRNA. In the absence of 23S rRNA, L1 inhibits the translation of its own
operon by binding to the 5' end of PL11 mRNA.
Deoxyribonucleic acid (DNA) and
ribonucleic acid (RNA) are chainlike macromolecules that function in the
storage and transfer of genetic information. They are
major components of all cells, together making up from 5 to 15 percent of their
dry weight. Nucleic acids are also present in viruses, infectious nucleic
acid-protein complexes capable of directing their own replication in specific
host cells. Although nucleic acids are so named because DNA was first isolated
from cell nuclei, both DNA and RNA also occur in other parts of cells.
Just as the amino acids are the
building blocks, or monomeric units, of polypeptides, the nucleotides are the
monomeric units of nucleic acids. Just as one type of protein molecule is
distinguished from another by the sequence of the characteristic side chains or
R groups of the amino acid monomers, each type of nucleic acid is distinguished
by the