Genetics

Randall K. Holmes

Michael G. Jobling

General Concepts

Genetic Information in Microbes

Genetic information in bacteria and many viruses is encoded in DNA, but some viruses use RNA. Replication of the genome is essential for inheritance of genetically determined traits. Gene expression usually involves transcription of DNA into messenger RNA (mRNA) and translation of mRNA into protein.

Genome Organization

The bacterial chromosome is a circular molecule of DNA that functions as a self-replicating genetic element (replicon). Extrachromosomal genetic elements such as plasmids and bacteriophages are nonessential replicons which often determine resistance to antimicrobial agents, production of virulence factors, or other functions. The chromosome replicates semiconservatively; each DNA strand serves as template for synthesis of its complementary strand.

Mutation and Selection

The complete set of genetic determinants of an organism constitutes its genotype, and the observable characteristics constitute its phenotype. Mutations are heritable changes in genotype that can occur spontaneously or be induced by chemical or physical treatments. Organisms selected as reference strains are called wild type, and their progeny with mutations are called mutants. Selective media distinguish between wild type and mutant strains based on growth; differential media distinguish between them based on other phenotypic properties.

Exchange of Genetic Information

Genetic exchanges among bacteria occur by several mechanisms. In transformation, the recipient bacterium takes up extracellular donor DNA. In transduction, donor DNA packaged in a bacteriophage infects the recipient bacterium. In conjugation, the donor bacterium transfers DNA to the recipient by mating. Recombination is the rearrangement of donor and recipient genomes to form new, hybrid genomes. Transposons are mobile DNA segments that move from place to place within or between genomes.

Recombinant DNA and Gene Cloning

Gene cloning is the incorporation of a foreign gene into a vector to produce a recombinant DNA molecule that replicates and expresses the foreign gene in a recipient cell. Cloned genes are detected by the phenotypes they determine or by specific nucleotide sequences that they contain. Recombinant DNA and gene cloning are essential tools for research in molecular microbiology and medicine. They have many medical applications, including development of new vaccines, biologics, diagnostic tests, and therapeutic methods.

Regulation of Gene Expression

Expression of genes in microbes is often regulated by intracellular or environmental conditions. Regulation can affect any step in gene expression, including transcription initiation or termination, translation, or activity of gene products. An operon is a set of genes that is transcribed as a single unit and expressed coordinately. Specific regulation induces or represses a particular gene or operon. Global regulation affects a set of operons, which constitute a regulon. All operons in the regulon are coordinately controlled by the same regulatory mechanism.

INTRODUCTION

Genetic Information in Microbes

The genetic material of bacteria and plasmids is DNA. Bacterial viruses (bacteriophages or phages) have DNA or RNA as genetic material. The two essential functions of genetic material are replication and expression. Genetic material must replicate accurately so that progeny inherit all of the specific genetic determinants (the genotype) of the parental organism. Expression of specific genetic material under a particular set of growth conditions determines the observable characteristics (phenotype) of the organism. Bacteria have few structural or developmental features that can be observed easily, but they have a vast array of biochemical capabilities and patterns of susceptibility to antimicrobial agents or bacteriophages. These latter characteristics are often selected as the inherited traits to be analyzed in studies of bacterial genetics.

Nucleic Acid Structure

Nucleic acids are large polymers consisting of repeating nucleotide units (Fig. 5-1). Each nucleotide contains one phosphate group, one pentose or deoxypentose sugar, and one purine or pyrimidine base. In DNA the sugar is D-2-deoxyribose; in RNA the sugar is D-ribose. In DNA the purine bases are adenine (A) and guanine (G), and the pyrimidine bases are thymine (T) and cytosine (C). In RNA, uracil (U) replaces thymine. Chemically modified purine and pyrimidine bases are found in some bacteria and bacteriophages. The repeating structure of polynucleotides involves alternating sugar and phosphate residues, with phosphodiester bonds linking the 3'-hydroxyl group of one nucleotide sugar to the 5'-hydroxyl group of the adjacent nucleotide sugar. These asymmetric phosphodiester linkages define the polarity of the polynucleotide chain. A purine or pyrimidine base is linked at the 1'-carbon atom of each sugar residue and projects from the repeating sugar-phosphate backbone. Double-stranded DNA is helical, and the two strands in the helix are antiparallel. The double helix is stabilized by hydrogen bonds between purine and pyrimidine bases on the opposite strands. At each position, A on one strand pairs by two hydrogen bonds with T on the opposite strand, or G pairs by three hydrogen bonds with C. The two strands of double-helical DNA are, therefore, complementary. Because of complementarity, double-stranded DNA contains equimolar amounts of purines (A + G) and pyrimidines (T + C), with A equal to T and G equal to C, but the mole fraction of G + C in DNA varies widely among different bacteria. Information in nucleic acids is encoded by the ordered sequence of nucleotides along the polynucleotide chain, and in double-stranded DNA the sequence of each strand determines what the sequence of the complementary strand must be. The extent of sequence homology between DNAs from different microorganisms is the most stringent criterion for determining how closely they are related.

FIGURE 5-1 Double helical structure of DNA. The diagram shows the structure of DNA represented as a helical ladder. The backbone of each polynucleotide strand (represented as a ribbon) consists of alternating phosphate and deoxyribose residues linked by phosphodiester bonds, and the strands have opposite polarities (arrows). The purine or pyrimidine base of each nucleotide on one strand projects toward the complementary base of the corresponding nucleotide from the other strand and is linked to it by hydrogen bonds. The double helix has a diameter of 2 nm. Each full turn of the double helix contains 10 nucleotide pairs and is 3.4 nm in length.
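The pairing rules described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the sequence is hypothetical, and the function names are chosen for this example. The sketch derives the complementary strand of a given strand (reversed, because the strands are antiparallel) and confirms that, in the duplex, A equals T and G equals C while the G + C fraction depends on the sequence.

```python
# Watson-Crick pairing rules from the text: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand_5_to_3: str) -> str:
    """Return the complementary strand, also written 5' to 3'.

    The two strands are antiparallel, so the complement is read in reverse.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand_5_to_3))


def duplex_base_counts(strand: str) -> dict:
    """Count each base over both strands of the duplex formed by `strand`."""
    both = strand + complementary_strand(strand)
    return {base: both.count(base) for base in "ATGC"}


def gc_fraction(strand: str) -> float:
    """Mole fraction of G + C (identical for either strand of the duplex)."""
    return (strand.count("G") + strand.count("C")) / len(strand)


strand = "ATGGCGTACGTTAGC"  # hypothetical sequence, for illustration only
partner = complementary_strand(strand)
counts = duplex_base_counts(strand)

print(partner)
print(counts)
print(round(gc_fraction(strand), 3))
```

Because each strand fully determines its partner, applying `complementary_strand` twice returns the original sequence.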

DNA Replication

During replication of the bacterial genome, each strand in double-helical DNA serves as a template for synthesis of a new complementary strand. Each daughter double-stranded DNA molecule thus contains one old polynucleotide strand and one newly synthesized strand. This type of DNA replication is called semiconservative. Replication of chromosomal DNA in bacteria starts at a specific chromosomal site called the origin and proceeds bidirectionally until the process is completed (Fig. 5-2). When bacteria divide by binary fission after completing DNA replication, the replicated chromosomes are partitioned into each of the daughter cells. The origin regions specifically and transiently associate with the cell membrane after DNA replication has been initiated, leading to a model whereby membrane attachment directs separation of daughter chromosomes (the replicon model). These characteristics of DNA replication during bacterial growth fulfill the requirements of the genetic material to be reproduced accurately and to be inherited by each daughter cell at the time of cell division.

FIGURE 5-2 Autoradiograph of intact replicating chromosome of E coli. Bacteria were radioactively labeled with tritiated thymidine for approximately two generations and were lysed gently. Bacterial DNA was then examined by autoradiography. Inset shows the replicating bacterial chromosome in diagrammatic form. The chromosome is circular, and two forks (X and Y) are present in the replicating structure. The segments of chromosome represented by double lines had completed two replications in the presence of tritiated thymidine, whereas segments represented by a solid line and a dotted line had replicated only once in the presence of tritiated thymidine. The density of grains in the autoradiogram was twice as great in the segments of chromosome that had completed two cycles of replication in the presence of tritiated thymidine. Bar, 100 µm. From Cairns, J.P.: Cold Spring Harbor Symposia on Quantitative Biology 28:44, 1963.

Gene Expression

Genetic information encoded in DNA is expressed by synthesis of specific RNAs and proteins, and information flows from DNA to RNA to protein. The DNA-directed synthesis of RNA is called transcription. Because the strands of double-helical DNA are antiparallel and complementary, only one of the two DNA strands can serve as template for synthesis of a specific mRNA molecule. Messenger RNAs (mRNAs) transmit information from DNA, and each mRNA in bacteria functions as the template for synthesis of one or more specific proteins. The process by which the nucleotide sequence of an mRNA molecule determines the primary amino acid sequence of a protein is called translation. Ribosomes, complexes of ribosomal RNAs (rRNAs) and several ribosomal proteins, translate each mRNA into the corresponding polypeptide sequence with the aid of transfer RNAs (tRNAs), aminoacyl-tRNA synthetases, initiation factors, and elongation factors. All of these components of the apparatus for protein synthesis function in the production of many different proteins. A gene is a DNA sequence that encodes a protein, rRNA, or tRNA molecule (gene product).
The genetic code determines how the nucleotides in mRNA specify the amino acids in a polypeptide. Because there are only 4 different nucleotides in mRNA (containing U, A, C and G), single nucleotides do not contain enough information to specify uniquely all 20 of the amino acids. In dinucleotides 16 (4 x 4) arrangements of the four nucleotides are possible, and in trinucleotides 64 (4 x 4 x 4) arrangements are possible. Thus, a minimum of three nucleotides is required to provide at least one unique sequence corresponding to each of the 20 amino acids. The "universal" genetic code employed by most organisms (Table 1) is a triplet code in which 61 of the 64 possible trinucleotides (codons) encode specific amino acids, and any of the three remaining codons (UAG, UAA or UGA) results in termination of translation. The chain-terminating codons are also called nonsense codons because they do not specify any amino acids. The genetic code is described as degenerate, because several codons may be used for a single amino acid, and as nonoverlapping, because adjacent codons do not share any common nucleotides. Exceptions to the "universal" code include the use of UGA as a tryptophan codon in some species of Mycoplasma and in mitochondrial DNA, and a few additional codon differences in mitochondrial DNAs from yeasts, Drosophila, and mammals. Translation of mRNA is usually initiated at an AUG codon for methionine, and adjacent codons are translated sequentially as the mRNA is read in the 5' to 3' direction. The corresponding polypeptide chain is assembled beginning at its amino terminus and proceeding toward its carboxy terminus. The sequence of amino acids in the polypeptide is, therefore, colinear with the sequence of nucleotides in the mRNA and the corresponding gene. Specific enzymatic reactions involved in DNA, RNA, and protein synthesis are beyond the scope of this chapter.
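The counting argument and the reading rules above can be checked with a small Python sketch. It is a minimal illustration, not a model of the translation apparatus: the codon table is the standard ("universal") code written in the conventional U, C, A, G order with one-letter amino acid symbols, `*` marks a chain-terminating codon, and the mRNA sequence and function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of distinct "words" of length 1, 2, and 3 over a 4-letter alphabet.
arrangements = {n: len(BASES) ** n for n in (1, 2, 3)}  # 4, 16, 64

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
amino_acids = set(CODON_TABLE.values()) - {"*"}


def translate(mrna: str) -> str:
    """Translate 5' to 3' from the first AUG until a chain-terminating codon.

    Codons are read sequentially and do not overlap.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


message = "GGAUGGCUUUCUGGUAAGC"  # hypothetical mRNA: AUG GCU UUC UGG UAA
print(arrangements, len(sense_codons), stop_codons, len(amino_acids))
print(translate(message))
```

The polypeptide is built from its amino terminus (the methionine encoded by AUG) toward its carboxy terminus, which is the colinearity described above.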

Expression of genetic determinants in bacteria involves the unidirectional flow of information from DNA to RNA to protein. In bacteriophages, either DNA or RNA can serve as genetic material. During infection of bacteria by RNA bacteriophages, RNA molecules serve as templates for RNA replication and as mRNAs. Studies with the retrovirus group of animal viruses reveal that DNA molecules can be synthesized from RNA templates by enzymes designated as RNA-dependent DNA polymerases (reverse transcriptases). This reversal of the usual direction for flow of genetic information, from RNA to DNA instead of from DNA to RNA, is an important mechanism for enabling information from retroviruses to be encoded in DNA and to become incorporated into the genomes of animal cells.

Genome Organization

DNA molecules that replicate as discrete genetic units in bacteria are called replicons. In some Escherichia coli strains, the chromosome is the only replicon present in the cell. Other bacterial strains have additional replicons, such as plasmids and bacteriophages.

Chromosomal DNA

Bacterial genomes vary in size from about 0.4 x 10^9 to 8.6 x 10^9 daltons (Da), some of the smallest being obligate parasites (Mycoplasma) and the largest belonging to bacteria capable of complex differentiation such as Myxococcus. The amount of DNA in the genome determines the maximum amount of information that it can encode. Most bacteria have a haploid genome, a single chromosome consisting of a circular, double-stranded DNA molecule. However, linear chromosomes have been found in the spirochete Borrelia and in Gram-positive Streptomyces spp., and one linear and one circular chromosome are present in the Gram-negative bacterium Agrobacterium tumefaciens. The single chromosome of the common intestinal bacterium E coli is 3 x 10^9 Da (4,500 kilobase pairs [kbp]) in size, accounting for about 2 to 3 percent of the dry weight of the cell. The E coli genome is only about 0.1% as large as the human genome, but it is sufficient to code for several thousand polypeptides of average size (40 kDa or 360 amino acids).
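The figures quoted for E coli can be reproduced with a rough calculation, sketched here in Python. The average mass of 660 Da per base pair is an assumption commonly used for such estimates and is not taken from this chapter; the calculation also ignores noncoding and regulatory DNA, so the result is an upper bound rather than a gene count.

```python
# Assumption (not from the chapter): average mass of one DNA base pair.
AVG_BASE_PAIR_MASS_DA = 660.0

# Values quoted in the text.
genome_mass_da = 3e9            # E coli chromosome
polypeptide_mass_da = 40_000    # "average" polypeptide
residues_per_polypeptide = 360  # amino acids in that polypeptide

genome_bp = genome_mass_da / AVG_BASE_PAIR_MASS_DA
genome_kbp = genome_bp / 1_000

# Three nucleotide pairs (one codon) are needed per amino acid.
bp_per_polypeptide = 3 * residues_per_polypeptide
max_polypeptides = genome_bp / bp_per_polypeptide

# Consistency check on the "average size": mass per amino acid residue.
residue_mass_da = polypeptide_mass_da / residues_per_polypeptide

print(f"genome size: about {genome_kbp:,.0f} kbp")
print(f"coding capacity: about {max_polypeptides:,.0f} polypeptides")
print(f"average residue mass: about {residue_mass_da:.0f} Da")
```

Under these assumptions the estimate lands close to the 4,500 kbp given in the text and yields a coding capacity of several thousand average-sized polypeptides.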

The chromosome of E coli has a contour length of approximately 1.35 mm, several hundred times longer than the bacterial cell, but the DNA is supercoiled and tightly packaged in the bacterial nucleoid. The time required for replication of the entire chromosome is about 40 minutes, which is approximately twice the shortest division time for this bacterium. DNA replication must be initiated as often as the cells divide, so in rapidly growing bacteria a new round of chromosomal replication begins before an earlier round is completed. At rapid growth rates there may be four chromosomes replicating to form eight at the time of cell division, which is coupled with completion of a round of chromosomal replication. Thus, the chromosome in rapidly growing bacteria is replicating at more than one point. The replication of chromosomal DNA in bacteria is complex and involves many different proteins.

Plasmids

Plasmids are replicons that are maintained as discrete, extrachromosomal genetic elements in bacteria. They are usually much smaller than the bacterial chromosome, varying from less than 5 to more than several hundred kbp, though plasmids as large as 2 Mbp occur in some bacteria. Plasmids usually encode traits that are not essential for bacterial viability, and replicate independently of the chromosome. Most plasmids are supercoiled, circular, double-stranded DNA molecules, but linear plasmids have also been demonstrated in Borrelia and Streptomyces. Closely related or identical plasmids demonstrate incompatibility; they cannot be stably maintained in the same bacterial host. Classification of plasmids is based on incompatibility or on use of specific DNA probes in hybridization tests to identify nucleotide sequences that are characteristic of specific plasmid replicons. Some hybrid plasmids contain more than one replicon.

Conjugative plasmids code for functions that promote transfer of the plasmid from the donor bacterium to other recipient bacteria, but nonconjugative plasmids do not. Conjugative plasmids that also promote transfer of the bacterial chromosome from the donor bacterium to other recipient bacteria are called fertility plasmids, and are discussed below. The average number of molecules of a given plasmid per bacterial chromosome is called its copy number. Large plasmids (>40 kbp) are often conjugative, have small copy numbers (1 to several per chromosome), code for all functions required for their replication, and partition themselves among daughter cells during cell division in a manner similar to the bacterial chromosome. Plasmids smaller than 7.5 kbp usually are nonconjugative, have high copy numbers (typically 10-20 per chromosome), rely on their bacterial host to provide some functions required for replication, and are distributed randomly between daughter cells at division.
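A short calculation suggests why random distribution is workable for small, high-copy-number plasmids but would be risky for low-copy-number plasmids. The Python sketch below rests on a simplifying assumption that is not stated in the text: each plasmid copy present at division goes to either daughter cell independently with probability 1/2. The function name is hypothetical.

```python
def plasmid_free_daughter_probability(copies_at_division: int) -> float:
    """Probability that one of the two daughter cells receives no plasmid.

    Assumes every copy is assigned to a daughter independently with
    probability 1/2 (a simplification made for this sketch). All copies
    then end up in the same daughter with probability 2 * (1/2)**n.
    """
    if copies_at_division < 1:
        raise ValueError("at least one plasmid copy is required")
    return 2 * 0.5 ** copies_at_division


for copies in (1, 2, 10, 20):
    p = plasmid_free_daughter_probability(copies)
    print(f"{copies:>2} copies at division: P(plasmid-free daughter) = {p:.2e}")
```

With only one or two copies, purely random distribution would frequently produce a plasmid-free daughter, which is consistent with the observation that low-copy-number plasmids partition themselves in a manner similar to the chromosome.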

Many plasmids control medically important properties of pathogenic bacteria, including resistance to one or several antibiotics, production of toxins, and synthesis of cell surface structures required for adherence or colonization. Plasmids that determine resistance to antibiotics are often called R plasmids (or R factors). Representative toxins encoded by plasmids include heat-labile and heat-stable enterotoxins of E coli, exfoliative toxin of Staphylococcus aureus, and tetanus toxin of Clostridium tetani. Some plasmids are cryptic and have no recognizable effects on the bacterial cells that harbor them. Comparing plasmid profiles is a useful method for assessing possible relatedness of individual clinical isolates of a particular bacterial species for epidemiological studies. The role of plasmids in the evolution of resistance to antibiotics is discussed below.

Bacteriophages

Bacteriophages (bacterial viruses, phages) are infectious agents that replicate as obligate intracellular parasites in bacteria. Extracellular phage particles are metabolically inert and consist principally of proteins plus nucleic acid (DNA or RNA, but not both). The proteins of the phage particle form a protective shell (capsid) surrounding the tightly packaged nucleic acid genome. Phage genomes vary in size from approximately 2 to 200 kilobases per strand of nucleic acid and consist of double-stranded DNA, single-stranded DNA, or RNA. Phage genomes, like plasmids, encode functions required for replication in bacteria, but unlike plasmids they also encode capsid proteins and nonstructural proteins required for phage assembly. Several morphologically distinct types of phage have been described, including polyhedral, filamentous, and complex. Complex phages have polyhedral heads to which tails and sometimes other appendages (tail plates, tail fibers, etc.) are attached.

A single cycle of phage growth is shown in Fig. 5-3. Infection is initiated by adsorption of phage to specific receptors on the surface of susceptible host bacteria. The capsids remain at the cell surface, and the DNA or RNA genomes enter the target cells (penetration). Because infectivity of genomic DNA or RNA is much less than that of mature virus, there is a time immediately after infection called the eclipse period during which intracellular infectious phage cannot be detected. The infecting phage RNA or DNA is replicated to produce many new copies of the phage genome, and phage-specific proteins are produced. For most phages assembly of progeny occurs in the cytoplasm, and release of the progeny occurs by cell lysis. In contrast, filamentous phages are formed at the cell envelope and released without killing the host cells. The eclipse period ends when intracellular infectious progeny appear. The latent period is the interval from infection until extracellular progeny appear, and the rise period is the interval from the end of the latent period until all phage are extracellular. The average number of phage particles produced by each infected cell, called the burst size, is characteristic for each virus and often ranges between 50 and several hundred. For discussions of structure, multiplication, and classification of animal viruses, see Chapters 41 and 42.

FIGURE 5-3 One-step growth of bacteriophage. A culture of susceptible bacteria is synchronously infected with bacteriophage added at time 0 at low multiplicity of infection. Unadsorbed phage is inactivated shortly thereafter by addition of anti-phage antiserum, and the culture is then diluted to prevent further activity of the antiserum. Samples are taken at intervals for phage assays. Total phage (intracellular plus extracellular) is determined by testing the sample after treating it to disrupt infected bacteria, and extracellular phage is determined by testing supernatant after removal of bacteria by centrifugation or ultrafiltration. Phage titers are expressed as the ratio of phage per infected bacterial cell.

Phages are classified into two major groups: virulent and temperate. Growth of virulent phages in susceptible bacteria destroys the host cells. Infection of susceptible bacteria by temperate phages can have either of two outcomes: lytic growth or lysogeny. Lytic growth of temperate and virulent bacteriophages is similar, leading to production of phage progeny and death of the host bacteria. Lysogeny is a specific type of latent viral infection in which the phage genome replicates as a prophage in the bacterial cell. In most lysogenic bacteria the genes required for lytic phage development are not expressed, and production of infectious phage does not occur. Furthermore, the lysogenic cells are immune to superinfection by the virus which they harbor as a prophage. The physical state of the prophage is not identical for all temperate viruses. For example, the prophage of bacteriophage λ in E coli is integrated into the bacterial chromosome at a specific site and replicates as part of the bacterial chromosome, whereas the prophage of bacteriophage P1 in E coli replicates as an extrachromosomal plasmid.

Lytic phage growth occurs spontaneously in a small fraction of lysogenic cells, and a few extracellular phages are present in cultures of lysogenic bacteria. For some lysogenic bacteria, synchronous induction of lytic phage development occurs in the entire population of lysogenic bacteria when they are treated with agents that damage DNA, such as ultraviolet light or mitomycin C. The loss of prophage from a lysogenic bacterium, converting it to the nonlysogenic state and restoring susceptibility to infection by the phage that was originally present as prophage, is called curing.

Some temperate phages contain genes for bacterial characteristics that are unrelated to lytic phage development or the lysogenic state, and expression of such genes is called phage conversion (or lysogenic conversion). Examples of phage conversion that are important for microbial virulence include production of diphtheria toxin by Corynebacterium diphtheriae, erythrogenic toxin by Streptococcus pyogenes (group A β-hemolytic streptococci), botulinum toxin by Clostridium botulinum, and Shiga-like toxins by E coli. In each of these examples the gene which encodes the bacterial toxin is present in a temperate phage genome. The specificity of O antigens in Salmonella can also be controlled by phage conversion. Phage typing is the testing of strains of a particular bacterial species for susceptibility to specific bacteriophages. The patterns of susceptibility to the set of typing phages provide information about the possible relatedness of individual clinical isolates. Such information is particularly useful for epidemiological investigations.

Mutation and Selection

Mutations are heritable changes in the genome. Spontaneous mutations in individual bacteria are rare. Some mutations cause changes in phenotypic characteristics; the occurrence of such mutations can be inferred from the effects they produce. In microbial genetics specific reference organisms are designated as wild-type strains, and descendants that have mutations in their genomes are called mutants. Thus, mutants are characterized by the inherited differences between them and their ancestral wild-type strains. Variant forms of a specific genetic determinant are called alleles. Genotypic symbols are lower case, italicized abbreviations that specify individual genes, with a (+) superscript indicating the wild-type allele. Phenotypic symbols are capitalized and not italicized, to distinguish them from genotypic symbols. For example, the genotypic symbol for the ability to produce β-galactosidase, required to ferment lactose, is lacZ+, and mutants that cannot produce β-galactosidase are lacZ. The lactose-fermenting phenotype is designated Lac+, and inability to ferment lactose is Lac-.

Detection of Mutant Phenotypes

Selective and differential media are helpful for isolating bacterial mutants. Some selective media permit particular mutants to grow, but do not allow the wild-type strains to grow. Rare mutants can be isolated by using such selective media. Differential media permit wild-type and mutant bacteria to grow and form colonies that differ in appearance. Detection of rare mutants on differential media is limited by the total number of colonies that can be observed. Consider a wild-type strain of E coli that is susceptible to the antibiotic streptomycin (phenotype Str^s) and can utilize lactose as the sole source of carbon (phenotype Lac+). Spontaneously occurring Str^r mutants are rare and are usually found at frequencies of less than one per 10^9 bacteria in cultures of wild-type E coli. Nevertheless, Str^r mutants can be isolated easily by using selective media containing streptomycin, because the wild-type Str^s bacteria are killed. Isolation of lactose-negative (phenotype Lac-) mutants of E coli poses a different problem. On minimal media with lactose as the sole source of carbon, Lac+ wild-type strains will grow, but Lac- mutants cannot grow. On differential media such as MacConkey-lactose agar or eosin-methylene blue-lactose agar, Lac+ wild-type and Lac- mutant strains of E coli can be distinguished by their color, but spontaneous Lac- mutants are too rare to be isolated easily. Selective media for Lac- mutants of E coli can be made by incorporating chemical analogs of lactose that are converted into toxic metabolites by Lac+ bacteria but not by Lac- mutants. The Lac- mutants can then grow on such media, but the Lac+ wild-type bacteria are killed.

Mutations that inactivate essential genes in haploid organisms are usually lethal, but such potentially lethal mutations can often be studied if their expression is controlled by manipulation of experimental conditions. For example, a mutation that increases the thermolability of an essential gene product may prevent bacterial growth at 42°C, although the mutant bacterium can still grow at 25°C. Conversely, cold-sensitive mutants express the mutant phenotype at low temperature, but not at high temperature. Temperature-sensitive and cold-sensitive mutations are examples of conditional mutations, as are suppressible mutations described later in this chapter. A conditional lethal phenotype indicates that the mutant gene is essential for viability.

Spontaneous and Induced Mutations

The mutation rate in bacteria is determined by the accuracy of DNA replication, the occurrence of damage to DNA, and the effectiveness of mechanisms for repair of damaged DNA.

For a particular bacterial strain under defined growth conditions, the mutation rate for any specific gene is constant and is expressed as the probability of mutation per cell division. In a population of bacteria grown from a small inoculum, the proportion of mutants usually increases progressively as the size of the bacterial population increases.

Mutations in bacteria can occur spontaneously and independently of the experimental methods used to detect them. This principle was first demonstrated by the fluctuation test (Fig. 5-4). The numbers of phage-resistant mutants of E coli in replicate cultures grown from small inocula were measured and compared with those in multiple samples taken from a single culture. If mutations to phage resistance occurred only after exposure to phage, the variability in numbers of mutants between cultures should be similar under both sets of conditions. In contrast, if phage-resistant mutants occurred spontaneously before exposure of the bacteria to phage, the numbers of mutants should be more variable in the independently grown cultures, because differences in the size of the bacterial population when the first mutant appeared would contribute to the observed variability. The data indicated that the mutations to phage resistance in E coli occurred spontaneously with constant probability per cell division.

FIGURE 5-4 The fluctuation test. Differences in numbers of colonies of phage-resistant mutants in replicate samples from single subculture were small and reflected only expected fluctuations due to sampling errors. In contrast, numbers of phage-resistant colonies in samples from individual subcultures were more variable and reflected both sampling errors and the independent origins of mutants in individual subcultures. Sizes of clonal populations of mutants in each culture reflected numbers of generations of growth between times that mutations occurred and time of sampling.
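The logic of the fluctuation test can be explored with a small simulation, sketched below in Python. It is an illustration under stated assumptions, not a reconstruction of the original experiment: every culture starts from a single phage-sensitive cell, all cells divide synchronously, and the number of generations, the mutation probability, and the function names are hypothetical. Under the spontaneous hypothesis, each newly formed cell mutates with a constant probability during growth and mutants breed true. Under the alternative hypothesis, resistance arises only on exposure to phage, with the per-cell probability chosen so that both hypotheses give the same average number of mutants. The ratio of variance to mean of the mutant counts is then compared.

```python
import random
from statistics import mean, variance


def grow_culture(generations: int, mutation_prob: float, rng: random.Random) -> int:
    """Spontaneous hypothesis: mutations arise during growth, before selection."""
    sensitive, resistant = 1, 0
    for _ in range(generations):
        daughters = 2 * sensitive
        # Each newly formed sensitive cell mutates with constant probability.
        new_mutants = sum(rng.random() < mutation_prob for _ in range(daughters))
        sensitive = daughters - new_mutants
        # Existing mutants double; new mutants join the resistant clone pool.
        resistant = 2 * resistant + new_mutants
    return resistant


def expose_culture(population: int, resistance_prob: float, rng: random.Random) -> int:
    """Alternative hypothesis: resistance arises only on exposure to phage."""
    return sum(rng.random() < resistance_prob for _ in range(population))


def variance_to_mean(counts) -> float:
    return variance(counts) / mean(counts)


def fluctuation_test(cultures=100, generations=12, mutation_prob=1e-3, seed=1):
    """Return the variance-to-mean ratio under each hypothesis."""
    rng = random.Random(seed)
    spontaneous = [grow_culture(generations, mutation_prob, rng) for _ in range(cultures)]
    population = 2 ** generations
    # Match the average mutant count so only the variability differs.
    matched_prob = mean(spontaneous) / population
    induced = [expose_culture(population, matched_prob, rng) for _ in range(cultures)]
    return variance_to_mean(spontaneous), variance_to_mean(induced)


spontaneous_ratio, induced_ratio = fluctuation_test()
print(f"spontaneous hypothesis: variance/mean = {spontaneous_ratio:.1f}")
print(f"exposure-induced hypothesis: variance/mean = {induced_ratio:.1f}")
```

When resistance arises only on exposure, the counts scatter about as much as sampling error alone would predict, so the ratio stays near 1. When mutations arise during growth, a mutation that happens early founds a large clone, and the ratio becomes much larger, which is the pattern observed in independently grown cultures.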

Replica plating confirmed that mutations in bacteria can occur spontaneously, without exposure of bacteria to selective agents (Fig. 5-5). For replica plating, a flat, sterile, velveteen surface is used to pick up an inoculum from the surface of an agar master plate and transfer samples to other agar plates. In this manner, samples of the bacterial population from the master plate are transferred to the replica plates without distorting their spatial arrangement. If the replica plates contain selective medium and the master plates do not, the positions of selected mutant colonies on the replica plates can be noted, and bacteria that were not exposed to the selective conditions can be isolated from the same positions on the master plate. Mutants of E coli resistant to bacteriophage T1 or to streptomycin have been isolated in this way, without exposing the wild-type bacteria to the bacteriophage or the antibiotic.

FIGURE 5-5 Detecting preexisting bacterial mutants by replica plating. Master plate was heavily inoculated with sample from pure cultures of phage-susceptible bacterium. After incubation, bacteria from master plate were transferred by replica plating to duplicate agar plates impregnated with bacteriophage. Phage-susceptible bacteria were killed by the bacteriophage. Colonies of phage-resistant bacteria appeared at identical positions on duplicate plates, indicating that phage-resistant bacteria had been transferred to each replica plate from the corresponding locations on master plate. Bacterial inocula selected from appropriate locations on master plate contained a higher proportion of phage-resistant mutants than original bacterial culture. By repeating these procedures several times, it was possible to isolate pure cultures of phage-resistant bacterial mutants that had never been exposed to bacteriophage.

Both environmental and genetic factors affect mutation rates. Exposure of bacteria to mutagenic agents causes mutation rates to increase, sometimes by several orders of magnitude. Many chemical and physical agents, including X-rays and ultraviolet light, have mutagenic activity. Chemicals that are carcinogenic for animals are often mutagenic for bacteria, or can be converted by animal tissues to metabolites that are mutagenic for bacteria. Standardized tests for mutagenicity in bacteria are used as screening procedures to identify environmental agents that may be carcinogenic in humans. Mutator genes in bacteria cause an increase in spontaneous mutation rates for a wide variety of other genes. Expression of these genes, induced by DNA damage (see SOS response later), enables the repair of DNA lesions that would otherwise be lethal, but by an error-prone mechanism that increases the rate of mutation. The overall mutation rate (the probability that a mutation will occur somewhere in the bacterial genome per cell division) is relatively constant for a variety of organisms with genomes of different sizes and appears to be a significant factor in determining the fitness of a bacterial strain for survival in nature. Most mutations are deleterious, and the risk of adverse mutations for individual bacteria must be balanced against the positive value of mutability as a mechanism for adaptation of bacterial populations to changing environmental conditions.

Molecular Basis of Mutations

Mutations are classified on the basis of structural changes that occur in DNA (Table 2). Some mutations are localized within short segments of DNA (for example, nucleotide substitutions, microdeletions, and microinsertions). Other mutations involve large regions of DNA and include deletions, insertions, or rearrangements of segments of DNA.

When a nucleotide substitution occurs in a region of DNA that codes for a polypeptide, one of the three nucleotides within a single codon of a corresponding mRNA molecule will be changed. Silent mutations cause no change in polypeptide structure or function, because one codon in mRNA is changed to another for the same amino acid. Other substitutions cause one amino acid to be replaced by another at the specific position within the polypeptide corresponding to the altered codon. Mutations that result in replacement of one amino acid for another within a polypeptide chain are called missense mutations. The effects of amino acid replacements on the function of a polypeptide gene product vary and depend on the location and the identity of the amino acid replacement. Mutant polypeptides containing amino acid replacements usually share antigenic determinants with the wild-type polypeptide and often have some residual biologic activity. Mutations that result in replacement of an amino acid codon with a termination codon are called nonsense mutations. This results in production of an amino-terminal fragment of the normal polypeptide when the mutant mRNA is translated. Nonsense mutations often result in complete loss of activity of the gene product.
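The classification of single-nucleotide substitutions described above can be illustrated with a minimal Python sketch. The function name classify_substitution is hypothetical, the codon table is the standard genetic code written for mRNA, and the sketch assumes that the wild-type codon is a sense codon and that exactly one nucleotide differs.

```python
from itertools import product

# Standard genetic code for mRNA codons; "*" marks a termination codon.
BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_substitution(wild_codon, mutant_codon):
    """Classify a single-nucleotide substitution within one codon."""
    differences = sum(a != b for a, b in zip(wild_codon, mutant_codon))
    if len(wild_codon) != 3 or len(mutant_codon) != 3 or differences != 1:
        raise ValueError("expected two codons differing at exactly one position")
    wild_aa = CODON_TABLE[wild_codon]
    mutant_aa = CODON_TABLE[mutant_codon]
    if wild_aa == "*":
        raise ValueError("this sketch assumes a wild-type sense codon")
    if mutant_aa == wild_aa:
        return "silent"      # same amino acid, different codon
    if mutant_aa == "*":
        return "nonsense"    # amino acid codon replaced by a termination codon
    return "missense"        # one amino acid replaced by another


print(classify_substitution("GAA", "GAG"))  # glutamate to glutamate
print(classify_substitution("GAA", "GUA"))  # glutamate to valine
print(classify_substitution("UAC", "UAA"))  # tyrosine to termination
```

The sketch reports only the class of the mutation. It says nothing about the functional effect of a missense change, which depends on the location and identity of the replacement.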

Because of the triplet nature of the genetic code, the consequences of mutations caused by insertions or deletions of small numbers of nucleotides (microinsertions, microdeletions) depend on both the number and sequence of nucleotides involved. Deletion or addition of multiples of three nucleotide pairs does not affect the reading frame, but causes deletion or addition of appropriate numbers of amino acids at one site within the polypeptide. If a new chain-terminating codon is introduced, premature chain termination occurs within the polypeptide. In contrast, addition or deletion of other numbers of nucleotide pairs alters the reading frame for the entire segment of mRNA from the mutation to the distal end of the gene. Therefore, frameshift mutations are likely to cause drastic changes in the structure and activity of polypeptide gene products, and they are often classified as nonsense mutations.
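The effect of small deletions on the reading frame can be shown with a short Python sketch. The mRNA sequence and the helper name codons are hypothetical and serve only to illustrate the argument above.

```python
def codons(mrna):
    """Split an mRNA string into complete triplets, reading from position 0."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical coding sequence: AUG GCU GAA UUC UAA
wild_type = "AUGGCUGAAUUCUAA"

# Deleting one nucleotide after the first codon shifts every downstream codon.
frameshift = wild_type[:3] + wild_type[4:]

# Deleting three nucleotides removes one codon and leaves the frame intact.
in_frame = wild_type[:3] + wild_type[6:]

print(codons(wild_type))   # original reading frame
print(codons(frameshift))  # all codons distal to the deletion are altered
print(codons(in_frame))    # one codon missing, the rest unchanged
```

In this example the single-nucleotide deletion also removes the original termination codon from the reading frame, whereas the three-nucleotide deletion simply omits one amino acid.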

Complementation Tests

To determine if mutations are located in the same gene or different genes, complementation tests are performed with partially diploid bacterial strains (Fig. 5-6). Two copies of the region of the bacterial chromosome harboring a mutation are present in the same bacterium, with each copy containing a different mutation (mutations are in the trans arrangement). A wild-type phenotype indicates that the mutations are in different genes. This phenomenon is called complementation. If a mutant phenotype is observed, a control experiment should be performed with the mutations in the cis arrangement to exclude the possibility that the wild-type alleles cannot be expressed normally in a partially diploid bacterial strain. Complementation tests were originally called "cis-trans" tests, and the term cistron is sometimes used as a synonym for gene. Complementation tests can be performed and interpreted even if the specific biochemical functions of the gene products are unknown.

FIGURE 5-6 Complementation is a method to test for functional gene products. Two mutants with similar phenotypes (inability to convert substrate X to product Z) were isolated. Mutations in these strains are designated a and b, respectively, and the wild-type alleles are a+ and b+. Partially diploid heterozygous strains were tested to determine if mutations a and b were in the same structural gene (cistron) and inactivated the same gene products. A) If a and b are in the same structural gene (e.g., encoding the enzyme that converts X to Y), neither the a+b nor the ab+ allele codes for an active enzyme, substrate X cannot be utilized, the mutant phenotype is expressed, and no complementation occurs. B) If a and b are in different cistrons (e.g., encoding the enzymes that convert X to Y and Y to Z), the a+ and b+ alleles encode active enzymes, substrate X is converted to product Z, the wild-type phenotype is expressed, and complementation occurs.

As an example, consider using a complementation test to characterize two independently derived Lac- mutants of E coli. The biochemical pathway for utilization of lactose requires β-galactoside permease (genotypic symbol lacY) to transport lactose into the bacterial cell and β-galactosidase (genotypic symbol lacZ) to convert lactose into D-glucose and D-galactose. Mutants that lack β-galactoside permease or β-galactosidase cannot utilize lactose for growth. If the mutations in both Lac- mutants inactivated the same protein (e.g., β-galactosidase), then a partial diploid strain containing the lacZ genes from both mutants in the trans arrangement would be unable to utilize lactose. In contrast, if the genotypes of the two mutants were lacZ+ lacY and lacZ lacY+, the partially diploid bacterium would produce active β-galactosidase from the lacZ+ determinant and active β-galactoside permease from the lacY+ determinant. Complementation would occur, and the partially diploid strain would utilize lactose.
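The logic of the lactose example can be sketched in Python. The function name complements and the representation of each chromosomal copy as a set of functional genes are hypothetical. The sketch assumes recessive loss-of-function mutations and ignores intragenic complementation.

```python
def complements(copy_1, copy_2, required_genes):
    """Return True if a partial diploid carrying both copies is wild type.

    Each copy is the set of genes that are functional (wild-type allele)
    on that copy of the chromosomal region.
    """
    functional = set(copy_1) | set(copy_2)
    return set(required_genes) <= functional


LAC_PATHWAY = {"lacZ", "lacY"}  # beta-galactosidase and permease

# Mutant 1 is lacZ lacY+, mutant 2 is lacZ+ lacY: different genes are affected.
print(complements({"lacY"}, {"lacZ"}, LAC_PATHWAY))

# Both mutants are lacZ lacY+: the same gene is inactivated in each copy.
print(complements({"lacY"}, {"lacY"}, LAC_PATHWAY))
```

The test needs no knowledge of what the gene products do. The sets record only whether each gene supplies an active product.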

Reversion and Suppression

Mutations that convert the phenotype from wild-type to mutant are called forward mutations, and mutations that change the phenotype from mutant back to wild-type are called reverse mutations (reversions). Bacterial strains that contain reverse mutations are called revertants. Analysis of mutations that cause phenotypic reversion yields useful information. Reverse mutations that restore the exact nucleotide sequence of the wild-type DNA are true reversions. True revertants are identical to wild-type strains both genotypically and phenotypically. Reverse mutations that do not restore the exact nucleotide sequence of the wild-type DNA are called suppressor mutations (suppressors). Some revertants that harbor suppressor mutations are phenotypically indistinguishable from wild-type strains. Other revertants, called pseudorevertants, can be distinguished phenotypically from wild-type strains, for example, by subtle differences in the characteristics of an enzymatic activity that has been regained (such as specific activity, substrate specificity, kinetic constants, or susceptibility to thermal or chemical inactivation). Recognition of pseudorevertant phenotypes suggests the presence of suppressor mutations.

Suppressor mutations can be intragenic or extragenic. Intragenic suppressors are located in the same gene as the forward mutations that they suppress. The possible locations and nature of intragenic suppressors are determined by the original forward mutation and by the relationships between the primary structure of the gene product and its biologic activity. Extragenic suppressors are located in different genes from mutations whose effects they suppress. The ability of extragenic suppressors to suppress a variety of independent mutations can be tested. Some extragenic suppressors are specific for particular genes, some are specific for particular codons, and some have other specificity patterns. Extragenic suppressors that reverse the phenotypic effects of chain-terminating codons have been well characterized and found to alter the structure of specific tRNAs. A particular suppressor tRNA can permit a specific chain-terminating codon to be translated, resulting in incorporation of a specific amino acid into the nascent polypeptide at the position corresponding to the chain-terminating codon. In a bacterium that has a chain-terminating mutation and an appropriate extragenic suppressor, translation of the mRNA containing the mutant codon can therefore result in formation of a full-length polypeptide. The biologic activity of the full-length polypeptide formed as a consequence of suppression depends both on the amount of protein made and on the functional consequences of the specific amino acid replacement determined by the suppressor tRNA.
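Suppression of a chain-terminating codon by a suppressor tRNA can be illustrated with a minimal Python sketch. The mRNA sequences, the function name translate, and the suppressor that inserts tyrosine at UAG are hypothetical. The sketch also assumes that suppression is completely efficient, although the amount of full-length protein made in a real cell varies.

```python
from itertools import product

# Standard genetic code for mRNA codons; "*" marks a termination codon.
BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna, suppressor=None):
    """Translate from position 0 until an unsuppressed termination codon.

    suppressor optionally maps a termination codon to the amino acid
    inserted by a suppressor tRNA, for example {"UAG": "Y"}.
    """
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            if suppressor and codon in suppressor:
                amino_acid = suppressor[codon]  # termination codon is read through
            else:
                break                           # normal chain termination
        peptide.append(amino_acid)
    return "".join(peptide)


wild_type = "AUGCAGGAAUUCUAA"  # Met-Gln-Glu-Phe, then UAA
nonsense = "AUGUAGGAAUUCUAA"   # CAG changed to UAG

print(translate(wild_type))
print(translate(nonsense))
print(translate(nonsense, suppressor={"UAG": "Y"}))
```

The suppressed product is full length but carries tyrosine where the wild-type polypeptide has glutamine, so its activity depends on how well that replacement is tolerated.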

Exchange of Genetic Information

The biologic significance of sexuality in microorganisms is to increase the probability that rare, independent mutations will occur together in a single microbe and be subjected to natural selection. Genetic interactions between microbes enable their genomes to evolve much more rapidly than by mutation alone. Representative phenomena of medical importance that involve exchanges of genetic information or genomic rearrangements include the rapid emergence and dissemination of antibiotic resistance plasmids, flagellar phase variation in Salmonella, and antigenic variation of surface antigens in Neisseria and Borrelia.

Sexual processes in bacteria involve transfer of genetic information from a donor to a recipient and result either in substitution of donor alleles for recipient alleles or addition of donor genetic elements to the recipient genome. Transformation, transduction, and conjugation are sexual processes that use different mechanisms to introduce donor DNA into recipient bacteria (Fig. 5-7). Because donor DNA cannot persist in the recipient bacterium unless it is part of a replicon, recombination between donor and recipient genomes is often required to produce stable, hybrid progeny. Recombination is most likely to occur when the donor and recipient bacteria are from the same or closely related species.

FIGURE 5-7 Exchange of genetic information in bacteria. Transformation, transduction, and conjugation differ in means for introducing DNA from donor cell into recipient cell. A) In transformation, fragments of DNA released from donor bacteria are taken up by competent recipient bacteria. B) In transduction, abnormal bacteriophage particles containing DNA from donor bacteria inject their DNA into recipient bacteria. C) Conjugation occurs by formation of cytoplasmic connections between donor and recipient bacteria, with direct transfer of newly synthesized donor DNA into the recipient cells. In all three cases, recombination between donor and recipient DNA molecules is required for formation of stable recombinant genomes. Bacterial genome is represented diagrammatically as a circular element in bacterial cells. Donor and recipient DNA are indicated by fine lines and heavy lines, respectively. In each recombinant genome, the a+ allele from donor strain has replaced the a allele from recipient strain, and the b+ allele is derived from recipient strain.

For a recombinant to be detected, its phenotype must be different from both parental phenotypes. Growth or cell division may be required before the recombinant phenotype is expressed. Delay in expression of a recombinant phenotype until a haploid recombinant genome has segregated is called segregation lag, and delay until synthesis of products encoded by donor genes has occurred is called phenotypic lag. Testing for linkage (nonrandom reassortment of parental alleles in recombinant progeny) is possible when the parental bacteria have different alleles for several genes. The donor allele of an unselected gene is more likely to be present in recombinants if it is linked to the selected donor gene than if it is not linked to the selected donor gene. Quantitative analysis of linkage permits construction of genetic maps. The genome of E coli is circular (Fig. 5-8), as determined both by genetic linkage and direct biochemical analysis of chromosomal DNA, and the genetic map is colinear with the physical map of the chromosomal DNA. Genetic and physical mapping are also used to analyze extrachromosomal replicons such as bacteriophages and plasmids.
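Quantitative analysis of linkage starts from a simple count: among recombinants selected for one donor gene, what fraction also carry the donor allele of an unselected gene? The Python sketch below illustrates this calculation with invented data. The gene names, allele labels, and function name are all hypothetical.

```python
def coinheritance_frequency(recombinants, unselected_gene, donor_allele):
    """Fraction of selected recombinants carrying the unselected donor allele."""
    if not recombinants:
        raise ValueError("no recombinants were scored")
    carrying = sum(1 for genotype in recombinants
                   if genotype[unselected_gene] == donor_allele)
    return carrying / len(recombinants)


# Invented genotypes of four recombinants, all selected for donor allele a+.
scored = [
    {"b": "b+", "c": "c-"},
    {"b": "b+", "c": "c-"},
    {"b": "b+", "c": "c+"},
    {"b": "b-", "c": "c-"},
]

print(coinheritance_frequency(scored, "b", "b+"))  # b+ is often coinherited with a+
print(coinheritance_frequency(scored, "c", "c+"))  # c+ is rarely coinherited with a+
```

A higher frequency for gene b than for gene c would suggest that b lies closer to the selected gene, although real mapping requires far larger samples.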

FIGURE 5-8 Circular genetic map of E coli. Positions of representative genes are indicated on inner circle. Distances between genes are calibrated in minutes, based on times required for transfer during conjugation. Position of threonine (thr) locus is arbitrarily designated as 0 minutes, and other assignments are relative to thr. On next circle, symbols and arrowheads identify specific Hfr donor strains of E coli and their characteristics. For each Hfr strain the point of arrowhead is the origin for chromosomal transfer; oriented transfer of chromosome during conjugation proceeds from point of arrowhead, followed immediately by base of the arrowhead, and so on. F' plasmids are identified by numbers, and the fragment of the E coli chromosome present in each F' plasmid is represented by an arc corresponding to a specific segment of the circular genetic map. See text for definitions of Hfr and F' donor strains and for description of the conjugal mating system in E coli. From Bachman, B.M., Low, K.B. Microbiol Rev, 1980;44:31.

Many bacteria have restriction-modification (RM) systems, consisting of modifying enzymes that methylate adenine or cytosine residues at specific sequences in their own DNA and corresponding restriction endonucleases that cleave foreign DNA which does not carry the specific modification at the same target sequences. Some restriction enzymes will only cleave DNA that has been methylated at specific sequences. These restriction systems, which may have evolved to protect bacteria against invasion by phages or plasmids, are an important barrier to genetic exchanges between different bacterial strains or species. Recent evidence suggests that plasmid-borne RM systems may be a way for the plasmid to ensure its carriage in a host strain, since cells that lose the plasmid (and the corresponding protective methylase gene) are killed by the action of the more stable restriction enzyme, which attacks the newly replicated but unmodified chromosomal DNA.
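The protective effect of modification can be sketched in Python. The example uses the recognition sequence GAATTC, which is cleaved by the restriction endonuclease EcoRI between G and A. The DNA sequence, the function name restrict, and the representation of modification as a set of protected site positions are hypothetical simplifications of a linear molecule with only one strand shown.

```python
def restrict(dna, site="GAATTC", cut_offset=1, modified_positions=frozenset()):
    """Cleave linear DNA at every unmodified occurrence of the site.

    modified_positions holds the start indexes of sites protected by
    methylation; those sites are left intact.
    """
    fragments = []
    start = 0
    position = dna.find(site)
    while position != -1:
        if position not in modified_positions:
            cut = position + cut_offset
            fragments.append(dna[start:cut])
            start = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])
    return fragments


foreign_dna = "TTGAATTCAAGGGAATTCCC"  # two sites, starting at indexes 2 and 12

print(restrict(foreign_dna))                              # unmodified DNA is cut
print(restrict(foreign_dna, modified_positions={2, 12}))  # modified DNA survives
```

Loss of a plasmid-borne methylase corresponds, in this sketch, to emptying modified_positions while the restriction enzyme remains active.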

Transformation

In transformation, pieces of DNA released from donor bacteria are taken up directly from the extracellular environment by recipient bacteria. Recombination occurs between single molecules of transforming DNA and the chromosomes of recipient bacteria. To be active in transformation, DNA molecules must be at least 500 nucleotides in length, and transforming activity is destroyed rapidly by treating DNA with deoxyribonuclease. Molecules of transforming DNA correspond to very small fragments of the bacterial chromosome. Cotransformation of genes is unlikely, therefore, unless they are so closely linked that they can be encoded on a single DNA fragment. Transformation was discovered in Streptococcus pneumoniae and occurs in other bacterial genera including Haemophilus, Neisseria, Bacillus, and Staphylococcus. The ability of bacteria to take up extracellular DNA and to become transformed, called competence, varies with the physiologic state of the bacteria. Many bacteria that are not usually competent can be made to take up DNA by laboratory manipulations, such as calcium shock or exposure to a high-voltage electrical pulse (electroporation). In some bacteria (including Haemophilus and Neisseria) DNA uptake depends on the presence of specific oligonucleotide sequences in the transforming DNA, but in others (including Streptococcus pneumoniae) DNA uptake is not sequence-specific. Competent bacteria may also take up intact bacteriophage DNA (transfection) or plasmid DNA, which can then replicate as extrachromosomal genetic elements in the recipient bacteria. In contrast, a piece of chromosomal DNA from a donor bacterium usually cannot replicate in the recipient bacterium unless it becomes part of a replicon by recombination. Historically, characterization of "transforming principle" from S pneumoniae provided the first direct evidence that DNA is genetic material.

Transduction

In transduction, bacteriophages function as vectors to introduce DNA from donor bacteria into recipient bacteria by infection. For some phages, called generalized transducing phages, a small fraction of the virions produced during lytic growth are aberrant and contain a random fragment of the bacterial genome instead of phage DNA. Each individual transducing phage carries a different set of closely linked genes, representing a small segment of the bacterial genome. Transduction mediated by populations of such phages is called generalized transduction, because each part of the bacterial genome has approximately the same probability of being transferred from donor to recipient bacteria. When a generalized transducing phage infects a recipient cell, expression of the transferred donor genes occurs. Abortive transduction refers to the transient expression of one or more donor genes without formation of recombinant progeny, whereas complete transduction is characterized by production of stable recombinants that inherit donor genes and retain the ability to express them. In abortive transduction the donor DNA fragment does not replicate, and among the progeny of the original transductant only one bacterium contains the donor DNA fragment. In all other progeny the donor gene products become progressively diluted after each generation of bacterial growth until the donor phenotype can no longer be expressed. On selective medium upon which only bacteria with the donor phenotype can grow, abortive transductants produce minute colonies that can be distinguished easily from colonies of stable transductants. The frequency of abortive transduction is typically one to two orders of magnitude greater than the frequency of complete transduction, indicating that most cells infected by generalized transducing phages do not produce recombinant progeny.

Specialized transduction differs from generalized transduction in several ways. It is mediated only by specific temperate phages, and only a few specific donor genes can be transferred to recipient bacteria. Specialized transducing phages are formed only when lysogenic donor bacteria enter the lytic cycle and release phage progeny. The specialized transducing phages are rare recombinants which lack part of the normal phage genome and contain part of the bacterial chromosome located adjacent to the prophage attachment site. Many specialized transducing phages are defective and cannot complete the lytic cycle of phage growth in infected cells unless helper phages are present to provide missing phage functions. Specialized transduction results from lysogenization of the recipient bacterium by the specialized transducing phage and expression of the donor genes. Phage conversion and specialized transduction have many similarities, but the origin of the converting genes in temperate converting phages is unknown.

Conjugation

In conjugation, direct contact between the donor and recipient bacteria leads to establishment of a cytoplasmic bridge between them and transfer of part or all of the donor genome to the recipient. Donor ability is determined by specific conjugative plasmids called fertility plasmids or sex plasmids.

The F plasmid (also called F factor) of E coli is the prototype for fertility plasmids in Gram-negative bacteria. Strains of E coli with an extrachromosomal F plasmid are called F+ and function as donors, whereas strains that lack the F plasmid are F- and behave as recipients. The conjugative functions of the F plasmid are specified by a cluster of at least 25 transfer (tra) genes which determine expression of F pili, synthesis and transfer of DNA during mating, interference with the ability of F+ bacteria to serve as recipients, and other functions. Each F+ bacterium has 1 to 3 F pili that bind to a specific outer membrane protein (the ompA gene product) on recipient bacteria to initiate mating. An intercellular cytoplasmic bridge is formed, and one strand of the F plasmid DNA is transferred from donor to recipient, beginning at a unique origin and progressing in the 5' to 3' direction. The transferred strand is converted to circular double-stranded F plasmid DNA in the recipient bacterium, and a new strand is synthesized in the donor to replace the transferred strand. Both of the exconjugant bacteria are F+, and the F plasmid can therefore spread by infection among genetically compatible populations of bacteria. In addition to the role of the F pili in conjugation, they also function as receptors for donor-specific (male-specific) phages.

The F plasmid in E coli can exist as an extrachromosomal genetic element or be integrated into the bacterial chromosome (Fig. 5-9). Because the F plasmid and the bacterial chromosome are both circular DNA molecules, reciprocal recombination between them produces a larger DNA circle consisting of F plasmid DNA inserted linearly into the chromosome. E coli contains multiple copies of several different genetic elements called insertion sequences (see section on transposons for more detail), at various locations in its chromosome and in the F plasmid. Homologous recombination between insertion sequences in the chromosome and the F plasmid leads to preferential integration of the F plasmid at chromosomal sites where insertion sequences are located. The chromosomal sites where insertion sequences are found vary, however, among strains of E coli.

FIGURE 5-9 Role of F plasmid in determining donor and recipient states of E coli. The F plasmid is representative of specific conjugative plasmids that control donor ability in E coli. F- strains lack the F plasmid and are genetic recipients. F+ strains harbor the F plasmid as a cytoplasmic element, express F pili, and are genetic donors. The F plasmid can become integrated into bacterial chromosome at various locations to produce Hfr (high-frequency recombination) donor strains. Abnormal excision of F plasmid can result in formation of F' plasmids that contain segments of bacterial chromosome and the corresponding bacterial genes. The arrowhead in F plasmid defines origin for transfer of DNA during conjugation. F plasmid and chromosomal DNA are indicated by heavy and fine lines, respectively. For additional data concerning the genomes of Hfr and F' strains, see Fig. 5-8.

An E coli strain with an integrated F plasmid retains its ability to function as a donor in conjugal matings. Because donor strains with integrated F factors can transfer chromosomal genes to recipients with high efficiency, they are called Hfr (high-frequency recombination) strains. Transfer of single-stranded DNA from an Hfr donor to a recipient begins from the origin within the F plasmid and proceeds as described above, except that the transferred DNA is the hybrid replicon consisting of F plasmid integrated into the bacterial chromosome. Transfer of this entire replicon, including the bacterial chromosome, requires approximately 100 minutes. The identity of the first chromosomal gene to be transferred and the polarity of chromosomal transfer are determined by the site of integration of the F plasmid and its orientation with respect to the bacterial chromosome. Because the mating bacteria usually separate spontaneously before the entire chromosome is transferred, conjugation typically transfers only a fragment of the donor chromosome into the recipient. The probability that a donor gene will enter the recipient bacterium during conjugation decreases, therefore, as its distance from the F origin (and therefore the time of its transfer) increases. Mating cells can also be broken apart experimentally by subjecting them to strong shearing forces in a mechanical blender; this is called interrupted mating. Formation of recombinant progeny requires recombination between the transferred donor DNA and the genome of the recipient bacterium. Analysis of progeny from matings that are interrupted after different intervals demonstrates which chromosomal genes are transferred first by particular donor strains, the sequential times of entry for genes that are transferred subsequently, and the progressively lower probability that genes transferred later will appear in recombinant progeny. The circularity of the genetic map of E coli was originally deduced from the overlapping, circularly permuted groups of linked genes that were transferred early by individual donor strains in which the F factor was integrated at different chromosomal locations.
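The relationship between the site and orientation of F integration and the order of gene entry can be sketched in Python. The map is treated as a 100-minute circle with thr at 0 minutes. The other gene names, their positions, the two Hfr origins, and the function name entry_times are hypothetical, and the sketch ignores the length of the integrated F plasmid itself.

```python
MAP_LENGTH = 100.0  # minutes needed to transfer the entire chromosome


def entry_times(gene_positions, origin, clockwise=True):
    """Return genes ordered by time of entry (minutes) for one Hfr donor."""
    times = {}
    for gene, position in gene_positions.items():
        if clockwise:
            times[gene] = (position - origin) % MAP_LENGTH
        else:
            times[gene] = (origin - position) % MAP_LENGTH
    return dict(sorted(times.items(), key=lambda item: item[1]))


# Hypothetical map positions in minutes; only thr at 0 follows the convention.
positions = {"thr": 0.0, "geneA": 20.0, "geneB": 50.0, "geneC": 80.0}

hfr_1 = entry_times(positions, origin=10.0, clockwise=True)
hfr_2 = entry_times(positions, origin=60.0, clockwise=False)

print(list(hfr_1))  # order of entry for the first hypothetical donor
print(list(hfr_2))  # order of entry for the second hypothetical donor
```

The two donors transfer the same genes in circularly permuted orders, one of them reversed, which is the kind of observation from which a circular map can be inferred.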

In matings between F+ and F- bacteria, only the F plasmid is transferred with high efficiency to recipients. Chromosomal genes are transferred with very low efficiency, and it is the spontaneous Hfr mutants in F+ populations that mediate transfer of donor chromosomal genes. In matings between Hfr and F- strains, the segment of the F plasmid containing the tra region is transferred last, after the entire bacterial chromosome has been transferred. Most recombinants from matings between Hfr and F- cells fail to inherit the entire set of F plasmid genes and are phenotypically F-. In matings between F+ and F- strains, the F plasmid spreads rapidly throughout the bacterial population, and most recombinants are F+.

Integrated F plasmids in Hfr strains can sometimes be excised from the bacterial chromosome. If excision precisely reverses the integration process, F+ cells are produced. On rare occasions, however, excision occurs by recombinations involving insertion sequences or other genes on the bacterial chromosome that are located at some distance from the original integration site. In such cases segments of the bacterial chromosome can become incorporated into hybrid F plasmids that are called F' plasmids (see Fig. 5-9). By similar processes, segments of the bacterial chromosome can sometimes become incorporated into R plasmids to produce hybrid R' plasmids. Conjugative R' plasmids can function as fertility plasmids because they can integrate into the bacterial chromosome by homologous recombination and mediate transfer of chromosomal genes during matings with recipient bacteria. F' plasmids, R' plasmids, specialized transducing phages, and recombinant plasmids or phages constructed by gene cloning (described below) are hybrid replicons that can include segments of the bacterial chromosome. Therefore, any of these genetic elements can be used to construct the partially diploid bacterial strains that are required for complementation tests and other purposes.

Conjugation also occurs in Gram-positive bacteria. Gram-positive donor bacteria produce adhesins that cause them to aggregate with recipient cells, but sex pili are not involved. In some Streptococcus species, recipient bacteria produce extracellular sex pheromones that cause the donor phenotype to be expressed by bacteria that harbor an appropriate conjugative plasmid, and the conjugative plasmid prevents the donor cells from producing the corresponding pheromone.

Recombination

Recombination involves breakage and joining of parental DNA molecules to form hybrid, recombinant molecules. Several distinct kinds of recombination have been identified that depend on different features of the participating genomes and require the activities of different gene products. Specific enzymes that act on DNA (for example, exonucleases, endonucleases, polymerases, ligases) participate in recombination. Detailed discussion of the biochemical events in recombination is beyond the scope of this chapter.

Generalized recombination involves donor and recipient DNA molecules that have homologous nucleotide sequences. Reciprocal exchanges can occur between any homologous donor and recipient sites. In E coli, the product of the recA gene is essential for generalized recombination, but other gene products also participate.

Site-specific recombination involves reciprocal exchanges only between specific sites in donor and recipient DNA molecules. The recA gene product is not required for site-specific recombination. Integration of the temperate bacteriophage λ into the chromosome of E coli is a well-studied example of site-specific recombination (Fig. 5-10). The specific attachment (att) sites on the E coli chromosome and λ phage DNA have a common core sequence of 15 nucleotides, within which reciprocal recombination occurs, flanked by adjacent sequences that are not homologous in the phage and bacterial genomes. In phage λ the product of the int gene (integrase) is required for the site-specific integration event in lysogenization; the products of the int and xis (excisionase) genes are both needed for the complementary site-specific excision event that occurs during induction of lytic phage development in lysogenic cells.
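Integration at the att sites can be illustrated with a simplified Python sketch in which each genome is a list of markers and each att site is a single token. The abbreviated marker lists, their orientation, and the function name integrate are assumptions made for illustration. The nonhomologous sequences that flank the real att core are not modeled.

```python
def integrate(chromosome, virion, att="att"):
    """Insert a phage genome into the chromosome by recombination at att.

    The linear virion genome is first circularized by joining its ends,
    then opened at its own att site and inserted at the chromosomal att.
    """
    i = virion.index(att)
    # Circularize and reopen at att: markers after att now come first.
    prophage = virion[i + 1:] + virion[:i]
    j = chromosome.index(att)
    return chromosome[:j] + [att] + prophage + [att] + chromosome[j + 1:]


chromosome = ["gal", "att", "bio"]
virion = ["A", "J", "att", "int", "cI", "R"]  # abbreviated marker order

lysogen = integrate(chromosome, virion)
print(lysogen)
```

The prophage markers appear in a circular permutation of the virion order, with the former ends of the virion DNA now joined internally and an att site at each junction with the chromosome.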

Illegitimate recombination is the term used to describe nonhomologous, aberrant recombination events such as those involved in formation of specialized transducing phages. The mechanisms of illegitimate recombination are unknown.

FIGURE 5-10 Integration and excision of bacteriophage λ are examples of site-specific recombination. λ DNA is shown by thin lines and chromosomal DNA by thick lines. Attachment (att) sites are closed boxes for the bacterial chromosome and open boxes for the λ chromosome. The gal and bio operons, which determine utilization of galactose and biosynthesis of biotin, are located adjacent to the bacterial attachment site. In an infected E coli the λ DNA becomes circular by joining ends m and m', and site-specific recombination between phage and bacterial att sites results in insertion of the λ genome into the bacterial chromosome. The arrangement of the prophage DNA (m and m' located internally) is, therefore, a circular permutation of λ virion DNA (m and m' located terminally).

Transposons

Transposons are segments of DNA that can move from one site in a DNA molecule to other target sites in the same or a different DNA molecule. The process is called transposition and occurs by a mechanism that is independent of generalized recombination. Transposons are important genetic elements because they cause mutations, mediate genomic rearrangements, function as portable regions of genetic homology, and acquire new genes and contribute to their dissemination within bacterial populations. Insertion of a transposon often interrupts the linear sequence of a gene and inactivates it. Transposons have a major role in causing deletions, duplications, and inversions of DNA segments as well as fusions between replicons. Transposons are not self-replicating genetic elements, however, and they must integrate into other replicons to be maintained stably in bacterial genomes.

Most transposons share a number of common features. Each transposon encodes the functions necessary for its transposition, including a transposase enzyme that interacts with specific sequences at the ends of the transposon. During transposition a short sequence of target DNA is duplicated, and the transposon is inserted between the directly repeated target sequences. The length of this short duplication varies, but is characteristic for each transposon. The duplication is presumed to involve asymmetric cleavage of DNA at the target site, followed by synthesis of new complementary strands corresponding to the region between the cleavage sites. Some transposons insert into almost any target sequence, whereas others have relatively stringent target specificity. Two types of transposition are recognized. Excision of the transposon from a donor site followed by its insertion into a target site is called nonreplicative transposition. If the transposon at a donor site is replicated and a copy is inserted into the target site, however, the process is called replicative transposition. The process of replicative transposition can involve formation of a cointegrate, a single circular DNA molecule consisting of two replicons joined with copies of the transposon in an alternating sequence. Resolution of the cointegrate into its component replicons is often accomplished by a transposon-encoded resolvase that catalyzes site-specific recombination between the transposons. Generalized recombination between homologous transposons can also lead to the formation or resolution of cointegrates. Transposition differs from site-specific recombination by duplicating a segment of the target sequence and by using a variety of different target sequences for a single donor sequence.
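The duplication of target DNA that accompanies insertion can be sketched in Python. The target sequence, the placeholder "[Tn]" for the transposon, the duplication length of 5, and the function name insert_transposon are hypothetical. The sketch shows only the outcome of staggered cleavage and repair synthesis, not the enzymatic steps.

```python
def insert_transposon(target, transposon, position, duplication_length):
    """Insert a transposon so that it is flanked by direct repeats of the target site."""
    site = target[position:position + duplication_length]
    if len(site) < duplication_length:
        raise ValueError("target site extends past the end of the sequence")
    upstream = target[:position]
    downstream = target[position + duplication_length:]
    # Staggered cuts leave one copy of the site on each side of the transposon;
    # repair synthesis fills in the complementary strands.
    return upstream + site + transposon + site + downstream


target = "TTTGACCTAGGG"
product_dna = insert_transposon(target, "[Tn]", position=3, duplication_length=5)
print(product_dna)
```

The sequence GACCT, present once in the target, appears as a direct repeat on both sides of the inserted element.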

Most transposons in bacteria can be separated into three major classes (Fig. 5-11). Insertion sequences and related composite transposons comprise the first class. Insertion sequences are simplest in structure and encode only the functions needed for transposition. The known insertion sequences vary in length from approximately 780 to 1500 nucleotide pairs, have short (15-25 base pair) inverted repeats at their ends, and are not closely related to each other. The DNA between the inverted terminal repeats contains one (or rarely two) transposase genes and does not encode a resolvase. Complex transposons vary in length from about 2,000 to more than 40,000 nucleotide pairs and contain insertion sequences (or closely related sequences) at each end, usually as inverted repeats. The entire complex element can transpose as a unit. The DNA between the terminal insertion sequences of complex transposons encodes multiple functions that are not essential for transposition. In medically important bacteria, genes that determine production of adherence antigens, toxins, or other virulence factors, or specify resistance to one or more antibiotics, are often located in complex transposons. Well-known examples of complex transposons are Tn5 and Tn10, which determine resistance to kanamycin and tetracycline, respectively. The complex transposons probably evolve by transposition of homologous insertion sequences to nearby sites within a DNA molecule.

FIGURE 5-11 Features of representative transposons (heavy lines) integrated into the bacterial chromosome (fine lines). Transposons are important genetic elements because they cause mutations, mediate genomic rearrangements, function as portable regions of genetic homology, and acquire new genes and contribute to their dissemination within bacterial populations. 1A) IS1 insertion sequence (786 base pairs) has transposase gene flanked by inverted terminal repeats (hatched bars with arrows above them). The IS1 element is flanked by copies of target site (open arrows) with same orientation. 1B) Composite transposon Tn5 (5816 base pairs) consists of kanamycin resistance determinant flanked by inverted copies of IS50 insertion element. 2) Transposon TnA (4957 base pairs) contains ampicillin resistance determinant, transposase and resolvase genes between terminal inverted repeat sequences (hatched bars with arrows above them), flanked by direct repeats of target site (open arrows). 3) Phage Mu (37 kilobase pairs) encodes transposase that catalyzes recombination between the ends of Mu DNA and target DNA. Direct repeats of the target site (open arrows) flank the integrated Mu genome. Mu virion DNA is longer than Mu prophage and contains chromosomal sequences at both ends, reflecting the process by which prophage Mu is excised and packaged.

The second class of transposons consists of the highly homologous TnA family. These transposons have longer (35 to 40 base pair) terminal inverted repeats than the complex transposons described above, but they lack terminal insertion sequences. All members of the family encode both transposase and resolvase functions. Well known examples from the TnA transposon family include the ampicillin resistance transposon Tn3 and Tn1000 (the gamma-delta transposon) found in the F plasmid. The TnA family has an important place in the history of medical microbiology. The development of high-level resistance to ampicillin in Haemophilus influenzae and Neisseria gonorrhoeae during the 1970s, which severely limited the usefulness of ampicillin for treatment of gonorrhea and Haemophilus infections in areas where such strains became prevalent, was caused by dissemination of ampicillin resistance determinants from TnA transposons in plasmids of the Enterobacteriaceae to plasmids in Haemophilus and Neisseria.

The third class of transposons consists of bacteriophage Mu and related temperate phages. The entire phage genome functions as a transposon, and replication of the phage DNA during vegetative growth occurs by replicative transposition. Prophage integration can occur at many different sites in the bacterial chromosome and often causes mutations. For that reason Mu and related phages are sometimes called mutator phages.

A fourth class of transposons, discovered in Gram-positive bacteria and represented by Tn916, consists of conjugative transposons that are completely different from the transposons described above. The conjugative transposon does not generate a duplication of the target sequence into which it inserts, and in Gram-positive bacteria the host strain carrying the transposon can act as a conjugal donor. Recipient bacteria need not be closely related to the donor bacterium. The transposon is excised from the chromosome of the donor and transmitted by conjugation to the recipient, where it integrates randomly into the chromosome. Tn916 encodes tetracycline resistance, but other larger conjugative transposons may encode additional antibiotic resistances. Conjugative transposons appear to be a major cause of the spread of antibiotic resistance in Gram-positive bacteria.

Some roles of transposons in bacterial evolution are illustrated by considering enteric Gram-negative bacteria and the structure of their plasmids. Bacteria collected during the pre-antibiotic era contained many plasmids, but they usually lacked resistance determinants. Many of the R plasmids from current clinical isolates belong to the same incompatibility groups as plasmids found previously, but they also determine resistance to multiple antibiotics. The close relationships between their replicons provide strong evidence that many current R plasmids evolved from the older plasmids by acquisition of resistance determinants. Some of the multiple antibiotic resistant plasmids have individual transposons with several resistance determinants, others have multiple resistance transposons located at separate sites, and still others contain complex hybrid resistance transposons formed by integration of one transposon into another. The stepwise acquisition of resistance determinants can lead, in some cases, to the formation of composite transposons that encode multiple resistance determinants. Therapeutic use of antibiotics and their incorporation into animal feeds provide selective advantages for bacteria with R plasmids, whereas conjugation, transformation and transfection provide means for dissemination of R plasmids within and between bacterial species. After a plasmid carrying a transposon is introduced into a new bacterial host, the transposon and its determinants can jump into the chromosome or indigenous plasmids of the new host. Therefore, stability of the mobilizing plasmid in a new bacterial host is not essential for persistence of genetic determinants located on a transposon.

Recombinant DNA and Gene Cloning

Many methods are available to make hybrid DNA molecules in vitro (recombinant DNA) and to characterize them. Such methods include isolating specific genes in hybrid replicons, determining their nucleotide sequences, and creating mutations at designated locations (site-directed mutagenesis). A clone is a population of organisms or molecules derived by asexual reproduction from a single ancestor. Gene cloning is the process of incorporating foreign genes into hybrid DNA replicons. Cloned genes can be expressed in appropriate host cells, and the phenotypes that they determine can be analyzed. Some key concepts underlying representative methods are summarized here.

The first step in gene cloning is to make fragments of the donor DNA by mechanical or enzymatic methods. Certain restriction endonucleases, designated as class II, are particularly useful for preparing defined fragments of DNA molecules. They cleave both strands of double-stranded DNA molecules at specific, palindromic sequences (restriction sites) that usually vary from four to eight nucleotides in length, and the resulting DNA fragments are called restriction fragments. Some restriction endonucleases cleave at coincident sites to create blunt-ended DNA fragments, and others cut at staggered positions to create DNA fragments with short, self-complementary, single-stranded 5' or 3' ends (see Table 3). The random probability that n adjacent nucleotides in a DNA strand will correspond to a specific restriction site is approximately 1/4^n (1 divided by 4 raised to the power n). Sites for enzymes that recognize unique 4, 6, or 8 nucleotide targets are likely to occur about once in every 256, 4096, or 65,536 nucleotides, respectively. By choosing appropriate restriction enzymes, specific DNA molecules, including bacterial chromosomes, plasmids, and phage genomes, can be digested into sets of restriction fragments that have appropriate sizes for specific applications.
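The arithmetic behind these frequencies can be checked with a short sketch. It assumes, as the estimate itself does, that the four nucleotides are equally frequent and independent of their neighbors. Real genomes deviate from this, so the numbers are rough expectations only. The function names are hypothetical.

```python
def expected_spacing(site_length: int) -> int:
    """Average number of nucleotides per occurrence of one specific site of the
    given length, assuming four equally frequent, independent bases."""
    return 4 ** site_length


def expected_sites(molecule_length: int, site_length: int) -> float:
    """Rough expected number of sites in a molecule of the given length."""
    return molecule_length / expected_spacing(site_length)


for n in (4, 6, 8):
    print(f"{n}-nucleotide site: about once every {expected_spacing(n)} nucleotides")
```

Dividing the length of a molecule by the expected spacing, as expected_sites does, gives a first guess at how many fragments an enzyme will produce, which is one way to reason about choosing an enzyme for a desired fragment size.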

A restriction map identifies the positions of target sites for specific restriction endonucleases in a DNA molecule. Restriction maps are available for many cloned DNA fragments, plasmids and phage genomes, as well as for the entire chromosome of E coli and several other bacteria.
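For a sequenced molecule, a restriction map can be computed directly by searching for each recognition sequence. The following is a minimal sketch under stated assumptions: the molecule is linear, only the strand shown is searched (sufficient for palindromic sites), and the toy sequence and the function name map_sites are hypothetical. The recognition sequences listed for HindIII, EcoRI, and BamHI are the standard ones.

```python
# Recognition sequences (5' to 3') for three commonly used class II enzymes.
ENZYMES = {"HindIII": "AAGCTT", "EcoRI": "GAATTC", "BamHI": "GGATCC"}


def map_sites(sequence: str, enzymes: dict[str, str]) -> dict[str, list[int]]:
    """Return the 1-based start position of every recognition site, per enzyme."""
    sequence = sequence.upper()
    positions: dict[str, list[int]] = {}
    for name, site in enzymes.items():
        hits = []
        start = sequence.find(site)
        while start != -1:
            hits.append(start + 1)                   # report 1-based coordinates
            start = sequence.find(site, start + 1)   # continue after this hit
        positions[name] = hits
    return positions


toy_dna = "CCGAATTCTTAAGCTTGGGGATCCAAAAGCTTCC"   # hypothetical sequence
restriction_map = map_sites(toy_dna, ENZYMES)
for enzyme, sites in restriction_map.items():
    print(enzyme, sites)
```

A map for a circular plasmid would additionally need to search across the junction between the last and first nucleotides, which this sketch does not do.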

The second step in gene cloning is to create hybrid replicons consisting of donor DNA fragments and a cloning vector (Fig. 5-12). Cloning vectors are small plasmid or phage replicons that have one or more restriction sites into which foreign DNA can be inserted. Hybrid replicons are produced by using DNA ligase to join the restricted vector DNA with donor DNA fragments that have compatible ends, or, alternatively, synthetic oligonucleotides are used as linkers to create compatibility between donor and vector DNA molecules with different ends. Ligating a vector to a heterogeneous set of DNA fragments from a donor genome is called shotgun cloning, and the collection of recombinant DNA molecules that contains the various fragments is called a genomic library. If a specific DNA fragment is available, it can be incorporated into a recombinant replicon by direct cloning into an appropriate vector chosen from the wide variety of vectors available. Plasmid and phage vectors are used mainly to clone small inserts usually less than 10 kbp. Examples of more special purpose vectors include cosmids, which are plasmid vectors that can be packaged into phage capsids (lambda cosmids accept inserts up to 30-40 kbp), and phagemids, which are plasmid-phage hybrid replicons that can exist either as plasmids or as single-stranded DNA phages under different experimental conditions. Phage P1 cosmids can accept inserts up to 100 kbp, and still larger DNA molecules can be cloned in yeast artificial chromosomes (YACs) which can stably maintain inserts up to and exceeding 1 Mbp in size. Other specialized vectors detect promoters, transcription termination signals, or other regulatory elements within foreign DNA inserts or, conversely, provide promoters from which transcription of cloned genes can be initiated.

FIGURE 5-12 Diagrammatic representation of gene cloning experiment. Plasmid cloning vector pBR322 is 4.36 kilobase pairs in size, has genes for resistance to ampicillin (ampr) and to tetracycline (tetr), and has only one HindIII restriction site that is located within the tetr locus. HindIII is used to treat samples of DNA from plasmid pBR322 and from a donor organism with a gene, designated a+, to be cloned. The donor can be a prokaryotic or a eukaryotic organism. If HindIII restriction sites are located adjacent to a+ in donor DNA, but do not occur within a+, a restriction fragment carrying intact a+ marker can be generated from donor DNA. Hybrid plasmids can be formed by random association and ligation of the HindIII-treated donor and vector DNA fragments. Although pBR322 is tetr, hybrid plasmids will be tets because the donor DNA fragments are inserted at the HindIII restriction site within the tetr locus. After transformation of amps recipient bacteria that also lack a+, transformants with hybrid plasmids can be selected by their ampr tets phenotypes. Strains in which the a+ gene is present can then be identified by expression of a+ or by testing for the polynucleotide sequence corresponding to a+. The pBR322 plasmid contains other unique restriction sites that can also be used for cloning (e.g., PstI in ampr and BamHI in tetr). Many other cloning vectors and restriction endonucleases have also been used for gene cloning experiments.

The final steps in gene cloning are to introduce hybrid replicons into appropriate recipient cells and test them for expression of donor genes of interest. Prokaryotic cells (including bacteria) or eukaryotic cells (including yeast, animal or plant cells) can be used as recipients, but they differ with respect to their permissiveness for specific replicons, the transcriptional signals that they recognize, and the post-translational modifications of protein structure that they can accomplish. Recombinant DNA molecules produced in vitro can be introduced directly into recipient cells by transformation or transfection. In addition, clones in cosmid or phage vectors can be packaged into phage coats and introduced into susceptible recipient cells by transduction. By using specialized vectors (shuttle vectors) that can replicate in multiple cell types, genes from any organism can be cloned and manipulated in a convenient bacterial system and subsequently reintroduced into cells of the original organism for analysis in their natural environment.

Many methods are available to identify bacteria that contain recombinant DNA molecules. Most cloning vectors have genes for traits that can be positively selected, such as resistance to antibiotics. Furthermore, it is often possible to introduce foreign DNA into the cloning vector at a site that inactivates a nonessential, but easily recognizable, vector function. If both of these conditions are fulfilled, bacteria that contain recombinant molecules can be selected and distinguished easily from bacteria that contain only the vector. Bacteria in a genomic library that contain a particular cloned gene can be identified by using biochemical or immunologic methods to test for the desired gene product. Alternatively, the cloned gene of interest can be detected directly by using nucleic acid hybridization methods, provided that a specific DNA or RNA probe is available. Because insertion of foreign DNA into a cloning vector at an appropriate site does not inactivate its ability to replicate in appropriate recipient cells, hybrid replicons of interest can be amplified by replication, and the recombinant DNA molecules or their gene products can be purified and studied. The ability to purify specific DNA molecules made it feasible to develop enzymatic and chemical methods for determining their nucleotide sequences, and current methods for introducing mutations at defined sites in cloned genes are based on knowing their restriction maps or nucleotide sequences.
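The two-condition screen described above (a positively selectable marker plus a vector function that is inactivated by the insert) amounts to a simple decision rule. The sketch below assumes a vector of the kind shown in Fig. 5-12, with an intact ampicillin resistance gene and an insertion site within the tetracycline resistance locus. The function name and category labels are hypothetical.

```python
def classify_colony(grows_on_ampicillin: bool, grows_on_tetracycline: bool) -> str:
    """Classify a bacterial colony from its growth on two antibiotic media.

    Assumes the insert interrupts the tetracycline resistance locus and that
    the recipient strain is sensitive to both antibiotics.
    """
    if not grows_on_ampicillin:
        return "no plasmid"                 # the selectable marker is absent
    if grows_on_tetracycline:
        return "vector without insert"      # tetracycline resistance locus intact
    return "candidate recombinant"          # ampicillin resistant, tetracycline sensitive


for amp, tet in [(False, False), (True, True), (True, False)]:
    print(amp, tet, "->", classify_colony(amp, tet))
```

Colonies in the last category are only candidates: whether they carry the particular gene of interest still has to be established by testing for the gene product or by hybridization with a specific probe.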

Recombinant DNA methods make it feasible to clone specific DNA fragments from any source into vectors that can be studied in well-characterized bacteria, in eukaryotic cells, or in vitro. Applications of DNA cloning are expanding rapidly in all fields of biology and medicine. In medical genetics such applications range from the prenatal diagnosis of inherited human diseases to the characterization of oncogenes and their roles in carcinogenesis. Pharmaceutical applications include large-scale production from cloned human genes of biologic products with therapeutic value, such as polypeptide hormones, interleukins, and enzymes. Applications in public health and laboratory medicine include development of vaccines to prevent specific infections and probes to diagnose specific infections by nucleic acid hybridization or polymerase chain reaction (PCR). The latter process uses oligonucleotide primers and DNA polymerase to amplify specific target DNA sequences during multiple cycles of synthesis in vitro, making it possible to detect rare target DNA sequences in clinical specimens with great sensitivity.
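The sensitivity of PCR follows from the arithmetic of repeated doubling. The sketch below is an idealized model, not a description of any particular protocol. The efficiency parameter (the fraction of templates copied in each cycle) is an assumption introduced here for illustration, and real reactions eventually plateau as reagents are consumed.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of target molecules after a number of PCR cycles.

    efficiency = 1.0 means every template is copied in every cycle (perfect
    doubling); lower values model incomplete copying.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


for cycles in (10, 20, 30):
    print(cycles, "cycles:", f"{pcr_copies(1, cycles):.0f}", "copies from one template")
```

Under perfect doubling, a single target molecule yields about a thousand copies after 10 cycles and about a billion after 30, which is why rare sequences in a clinical specimen become detectable.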

Regulation of Gene Expression

The phenotypic properties of bacteria are determined by their genotypes and growth conditions. For bacteria in pure culture, changes in growth conditions often result in predictable physiological adaptations in all members of the population. Typically, essential gene products are made in amounts that permit fastest growth in the given environment, and products required under special circumstances are made only when they are needed.

Physiological adaptations are often associated with changes in metabolic activities. The flow of metabolites through particular biochemical pathways can be controlled both by regulating the synthesis of specific enzymes and by altering the activities of existing enzymes. Mechanisms that regulate expression of genes by affecting synthesis of specific gene products are discussed here.

Specific regulation involves a gene or group of genes involved in a particular metabolic process. Induction and repression enable bacteria to regulate production of specific gene products in response to appropriate signals. Generally catabolic enzymes are induced when the substrate for the pathway is present in the growth medium, and biosynthetic enzymes are repressed by the product of the pathway. Enzymes that participate in a single biochemical pathway often occupy adjacent positions on the bacterial chromosome and are coordinately induced or repressed. They form an operon, a group of contiguous genes that is transcribed as a single unit and translated to produce the corresponding gene products. Organization into an operon is an important strategy for coordinately regulating the expression of genes in bacteria. Operons that can be induced or repressed are controlled by binding of specific regulatory proteins to particular nucleotide sequences that function as regulatory sites within the operon. Comparison of the amino acid sequences of many of these different regulatory proteins showed that they could be grouped together into families of regulators (e.g. the lysR family of proteins) that may have evolved from common ancestral genes. Members of the lysR family include regulators of such diverse phenomena as lysine, cysteine and methionine metabolism in E coli and iron repression in V cholerae.

Global regulation simultaneously alters expression of a group of genes and operons, collectively called a regulon, that are controlled by the same regulatory signal. Global regulation determines responses of bacteria to basic nutrients such as carbon, nitrogen or phosphate, reactions to stresses such as DNA damage or heat shock, and synthesis by pathogens of specific virulence factors during growth in their host animals.

The amount of a specific protein in a bacterial cell can vary from none to many thousands of molecules. This wide range is often determined by the combined action of several regulatory mechanisms that affect expression of the corresponding structural gene. Regulation is achieved by determining how often a gene is transcribed into functional mRNA, how efficiently the mRNA is translated into protein, how rapidly the mRNA is degraded, how rapidly the protein product turns over, and whether the activity of the protein product can be altered by allosteric effects or covalent modifications.

mRNAs as Transcriptional Units

Gene expression begins with DNA-dependent RNA polymerase (RNA polymerase) catalyzing the transcription of specific mRNA from one strand of a DNA template. Binding of RNA polymerase to DNA occurs at specific sites called promoters, and transcription begins adjacent to the promoter. Strong promoters can interact efficiently with RNA polymerase and initiate transcription at a high rate; weak promoters initiate transcription at slow rates. In either case, mRNA is synthesized from its 5' end toward its 3' end at an approximately constant rate until the RNA polymerase recognizes another specific site called a terminator. RNA polymerase then dissociates from the template, and transcription of the mRNA is completed.

Individual mRNA molecules may code for one or more polypeptides. Transcription of an operon produces a polycistronic mRNA that codes for several polypeptides. Translation of polycistronic mRNAs leads to coordinate synthesis of the encoded polypeptides, but each polypeptide is synthesized as a separate molecule. A specific ribosome binding site is located just upstream from the start of each coding sequence on the mRNA molecule.

Messenger RNAs in bacteria are degraded rapidly, with an average half-life of several minutes, in contrast to tRNAs and rRNAs, which are much more stable. Although mRNAs represent about half of the newly synthesized RNA, they represent only a small fraction of the total RNA. The short half-life of mRNAs has important consequences for gene expression. If the synthesis of a specific mRNA is prevented, production of the corresponding polypeptides declines rapidly.
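The consequence of a short half-life can be made concrete with a simple exponential decay calculation. The sketch assumes first-order decay and uses a 2-minute half-life purely as an illustrative value. The function name is hypothetical.

```python
def fraction_remaining(minutes: float, half_life_minutes: float) -> float:
    """Fraction of an mRNA pool left after synthesis stops, assuming
    first-order (exponential) decay with the given half-life."""
    if half_life_minutes <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (minutes / half_life_minutes)


HALF_LIFE = 2.0   # minutes; illustrative value only
for t in (0, 2, 4, 10):
    print(f"{t:>2} min after transcription stops: "
          f"{fraction_remaining(t, HALF_LIFE):.3f} of the mRNA remains")
```

With these assumptions only about 3% of the message remains after five half-lives, so translation of the corresponding polypeptides falls off within minutes of shutting down transcription.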

Control of gene expression occurs by regulating one or more of the steps in the pathway from the DNA template to the active gene product. Simultaneous regulation at several levels permits greater control over gene expression than would be possible with a single regulatory mechanism. The most common way to regulate gene expression in bacteria is to control the production of specific mRNAs. Since the rate of elongation of an RNA molecule is approximately constant, the major factors that control mRNA synthesis are the rate of initiation and the probability that a full length transcript will be produced.

Regulation of Transcription Initiation

Some mRNAs in bacteria are synthesized at constant rates, resulting in constitutive production of the encoded polypeptides. The amounts of specific mRNAs and polypeptides produced from different constitutive genes vary greatly, however, and often reflect differences in strength of the promoters for those genes.

Transcription of many operons is regulated in response to changing environmental conditions. The promoters determine the maximum rate of transcription initiation for such operons, but regulatory proteins participate in controlling transcription. Nucleotide sequences in operons to which specific regulatory proteins bind are called regulatory sites or operators. Operators and promoters are located close together within operons and may have overlapping DNA sequences. The binding of regulatory proteins to operators can either increase (positive regulation) or decrease (negative regulation) the frequency of transcription initiation. Proteins that function as negative regulators are usually called repressors. Because regulatory proteins can diffuse through the cytoplasm, the structural genes for regulatory proteins do not have to be linked to the target operons.

The ability to sense the presence or absence of specific compounds and change the rates of synthesis of appropriate gene products is central to the control of gene expression. Regulatory proteins offer one solution to this problem of stimulus-response coupling. Many regulatory proteins are bifunctional and bind not only to appropriate operators but also to specific effectors, which are small molecules such as particular sugars, amino acids, and other metabolites. Furthermore, regulatory proteins are allosteric, meaning that they can exist in different conformations which exhibit different binding affinities for their cognate operators and effectors. A sufficient concentration of effector favors formation of the regulatory protein-effector complex, which has either high or low affinity for the operator in any specific case. In negatively regulated systems the effector functions as a corepressor if the regulatory protein-effector complex is the active repressor, and the effector functions as an inducer and causes derepression if the free regulatory protein is the active repressor. Conversely, in positively regulated systems, the effector stimulates expression of the operon if the regulatory protein-effector complex is the positive regulator, and the effector inhibits expression of the operon if the free regulatory protein is the positive regulator.
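The four combinations described in this paragraph can be summarized as a small truth table. The sketch below treats regulation as all-or-none and the effector as either absent or saturating, which is a deliberate simplification. The function name and the string labels are hypothetical.

```python
def operon_is_on(mode: str, active_form: str, effector_present: bool) -> bool:
    """Return True if the operon is expressed in an all-or-none model.

    mode        -- "negative" (regulator is a repressor) or "positive" (activator)
    active_form -- "complex" if the regulator-effector complex binds the operator,
                   "free" if the regulator without effector binds it
    """
    if mode not in ("negative", "positive"):
        raise ValueError("mode must be 'negative' or 'positive'")
    if active_form not in ("complex", "free"):
        raise ValueError("active_form must be 'complex' or 'free'")
    # Is the operator-binding form of the regulator available?
    regulator_active = effector_present if active_form == "complex" else not effector_present
    # An active repressor turns the operon off; an active activator turns it on.
    return not regulator_active if mode == "negative" else regulator_active


for mode in ("negative", "positive"):
    for form in ("complex", "free"):
        for effector in (False, True):
            state = "on" if operon_is_on(mode, form, effector) else "off"
            print(f"{mode:8} {form:7} effector={effector!s:5} -> {state}")
```

In this table the negative/complex row corresponds to a corepressor and the negative/free row to an inducer, matching the terminology used in the text.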

The lactose (lac) operon of E coli is an example of an inducible, negatively regulated operon (Fig. 5-13). The lacI gene codes for a repressor that binds to the lac operator and prevents transcription from the lac promoter. The structural gene for this repressor is separate from the lac operon, and the repressor is synthesized constitutively at a low rate. When inducer binds to the lac repressor, the complex cannot bind to the operator and cannot prevent binding of the RNA polymerase to the promoter. If other conditions are favorable, the lac operon is expressed, resulting in synthesis of β-galactosidase, β-galactoside permease and β-galactoside transacetylase. The lac operon can be induced by lactose or by structurally related compounds such as isopropyl-β-D-thiogalactoside (IPTG). IPTG is called a gratuitous inducer because it induces the lac operon, but is not a substrate for β-galactosidase. Negative regulation also occurs in many biosynthetic operons in E coli. In such operons a product of the biosynthetic pathway functions as the effector for the negative regulatory system.

FIGURE 5-13 Regulation of lac operon in E coli. Structural genes lacZ, lacY, and lacA code for β-galactosidase, β-galactoside permease, and β-galactoside transacetylase, respectively. The physiologic role of lacA is unknown. The lac repressor is product of lacI gene in separate regulatory operon. Transcription of mRNA encoding lacZ, lacY, and lacA is negatively regulated, and binding of lac repressor to operator lacO prevents initiation of transcription at promoter lacP. Inducer binds to lac repressor and inactivates it. Catabolite activator protein (CAP) forms a complex with cyclic AMP, and binding of the complex to a site immediately adjacent to the lac promoter stimulates transcription of the lac operon by RNA polymerase. An expanded diagram of the lac operator-promoter region shows the binding sites for CAP, RNA polymerase, and lac repressor.

The arabinose (ara) operon in E coli is both positively and negatively regulated. In the presence of arabinose the regulatory protein stimulates transcription of the ara operon. In the absence of arabinose, however, the regulatory protein represses the ara operon.

Operons are often controlled by more than one mechanism. When E coli is grown in a medium containing glucose and an alternative carbon source such as lactose or arabinose, induction of the lac or ara operon and utilization of the lactose or arabinose are delayed until the glucose has been consumed. This phenomenon is called diauxic growth. The failure to induce the lac or ara operon in the presence of glucose is an example of catabolite repression. The lac and ara operons are positively regulated by cyclic-3',5'-adenosine monophosphate (cAMP) and the catabolite gene activator (CAP) protein (the product of the crp gene). The cAMP-CAP complex interacts with CAP binding sites in the regulatory regions of some operons, including the lac and ara operons, and stimulates transcription from the corresponding promoters. The level of intracellular cAMP in E coli is high during growth in the absence of glucose, and low during growth in the presence of glucose. Catabolite repression is due, therefore, to lack of activation of cAMP-dependent operons when the bacteria are grown in the presence of glucose or certain other rapidly metabolizable carbon sources.
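The combined negative and positive control of the lac operon can be sketched as a two-input decision. This is a qualitative, all-or-none model based only on the relationships stated above: glucose lowers cAMP, the cAMP-CAP complex is needed for activation, and inducer inactivates the repressor. The function name is hypothetical.

```python
def lac_operon_induced(inducer_present: bool, glucose_present: bool) -> bool:
    """Qualitative model of lac operon induction in E coli."""
    repressor_on_operator = not inducer_present   # inducer inactivates the repressor
    camp_high = not glucose_present               # cAMP is low when glucose is present
    camp_cap_bound = camp_high                    # cAMP-CAP complex forms only with cAMP
    return (not repressor_on_operator) and camp_cap_bound


# Diauxic growth on glucose plus lactose: lactose is present throughout,
# but the operon is induced only after the glucose has been consumed.
for phase, glucose in [("glucose still available", True), ("glucose exhausted", False)]:
    print(phase, "->", "induced" if lac_operon_induced(True, glucose) else "not induced")
```

The same two-input logic applies to other cAMP-dependent operons, with the operon-specific regulator in place of the lac repressor.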

Regulation of Transcription Termination

Attenuation is a mechanism for regulating operons by terminating transcription of mRNA prematurely. Attenuation is common in biosynthetic operons, including the trp, histidine (his), threonine (thr), isoleucine-valine (ilv), and phenylalanine (phe) operons. The trp operon in E coli is controlled both by repression and attenuation. In the presence of excess tryptophan, initiation of transcription from the trp promoter is repressed. In addition, however, those transcripts that are initiated from the trp promoter are usually terminated before any of the structural genes of the trp operon are transcribed. The concentration of intracellular tryptophan required to maintain repression exceeds that needed for attenuation. Such dual control enables the cell to fine tune the expression of the trp operon in response to decreasing concentrations of tryptophan.

The secondary structure of mRNA has an important role in the mechanism of attenuation. All mRNAs have a leader sequence between the transcriptional start site and the beginning of the coding sequence for the first structural gene. For amino acid biosynthetic operons that are subject to attenuation, the mRNA leader sequence has two distinctive features. It encodes a short peptide containing the amino acid produced by the regulated pathway, and it can form alternative, mutually incompatible, double-stranded RNA structures that participate in regulatory events. For example, the peptide encoded by the trp mRNA leader sequence contains two adjacent tryptophan residues, and the peptide encoded by the his mRNA leader sequence has a series of seven consecutive histidine residues. Fig. 5-14 shows the trp operon and illustrates alternative secondary structures in the leader sequence of trp mRNA. There are three possible secondary structures for this region, called the pause site (segments 1+2), the anti-terminator (segments 2+3), and the attenuator (segments 3+4). Segment 1 of the pause site overlaps with the coding region for the trpL peptide. Which secondary structures are formed depends on efficiency of translation of the trpL peptide. When segments 1 and 2 are transcribed, they immediately anneal and cause the RNA polymerase to pause temporarily. Subsequent initiation of translation of the trpL peptide disrupts the pause site and allows RNA polymerase to continue transcription. If tryptophan is present, transcription of segments 3 and 4 and formation of the attenuator structure occurs while the ribosome is blocking segment 2, causing the RNA polymerase to terminate transcription. If tryptophan is deficient, however, tryptophanyl-tRNA is also deficient, and the ribosome stalls at the tryptophan codons in segment 1. This allows segment 2 to anneal with newly synthesized segment 3 to form the anti-terminator, thereby making segment 3 unavailable to anneal with segment 4. Formation of the attenuator is therefore prevented, and the RNA polymerase transcribes the entire trp operon. In this manner depletion of tryptophan (actually the supply of tryptophanyl-tRNA) is coupled to regulation of transcription of the biosynthetic operon for tryptophan.
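The sequence of events in attenuation can be followed as a chain of cause and effect. The sketch below encodes only the steps stated in the text, in an all-or-none form. The function name and the dictionary keys are hypothetical, and the real process is probabilistic rather than a strict switch.

```python
def trp_leader_outcome(tryptophanyl_trna_abundant: bool) -> dict:
    """Follow the trp leader transcript from ribosome behavior to the
    fate of transcription, in a simplified all-or-none model."""
    if tryptophanyl_trna_abundant:
        # The ribosome reads through the tryptophan codons in segment 1 and
        # blocks segment 2, so segment 3 is free to pair with segment 4.
        return {
            "ribosome": "blocks segment 2",
            "structure": "attenuator (segments 3+4)",
            "structural_genes_transcribed": False,
        }
    # The ribosome stalls at the tryptophan codons in segment 1, leaving
    # segment 2 free to pair with segment 3 as soon as it is synthesized.
    return {
        "ribosome": "stalled in segment 1",
        "structure": "anti-terminator (segments 2+3)",
        "structural_genes_transcribed": True,
    }


for abundant in (True, False):
    print("tryptophanyl-tRNA abundant:", abundant, "->", trp_leader_outcome(abundant))
```

The input is the supply of tryptophanyl-tRNA rather than free tryptophan, reflecting the point that the ribosome senses the charged tRNA.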

FIGURE 5-14 Regulation of trp operon in E coli. The organization of the trp operon is shown at the top of the figure. The five structural genes trpE, trpD, trpC, trpB, and trpA encode enzymes that catalyze terminal sequence of reactions in tryptophan formation. Transcription initiation is controlled at the promoter-operator (p-o) locus, and signals within the 162 nucleotide trp mRNA leader sequence control termination of transcription by attenuation. The leader sequence of trp mRNA is expanded to show locations of the trpL coding sequence, the complementary segments 1, 2, 3, and 4, and their possible alternative secondary structures which function as pause site, anti-terminator, or attenuator (see text).

In E coli transcription and translation are functionally coupled. Nonsense mutations that cause premature termination of translation often cause decreased transcription of more distal genes in the same operon. This phenomenon is called polarity. Ribosomes usually initiate translation of a growing mRNA molecule prior to completion of transcription, and such translation masks sites that would otherwise cause the RNA polymerase to terminate transcription. Premature termination of translation by a nonsense codon dissociates the ribosomes from the mRNA and enables RNA polymerase to interact with the unmasked transcription termination sites.

In some biological systems, including phage lambda, antitermination is used as a positive regulatory mechanism to control gene expression. Immediately after infection of E coli by lambda, RNA polymerase binds to two promoters in lambda DNA and initiates divergent primary transcripts which terminate at specific sites on the lambda genome. A protein encoded by one of the primary transcripts interacts with RNA polymerase and enables it to continue transcription through the primary termination sites, thereby expressing a second set of lambda genes. One of the products encoded by a secondary transcript blocks termination of another mRNA and activates expression of a third set of genes. Antitermination has a key role, therefore, in controlling the cascade of gene expression during lytic growth of phage lambda. Antitermination is also involved in the regulation of E coli rRNA operons.

Regulation of Translation

The ribosome binding site on mRNA is complementary to a sequence at the 3' end of 16S rRNA. Interaction between these sequences facilitates formation of the initiation complex for protein synthesis. Both the extent of complementarity to 16S rRNA and the spacing of the ribosome binding site from the initiation codon affect the efficiency of translation initiation. Codon usage in mRNA also influences translation efficiency. Messenger RNAs for proteins that are required in large amounts tend to use codons that are translated by the most abundant species of tRNA, and the converse is also true.
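
The two properties of a ribosome binding site mentioned above, its complementarity to 16S rRNA and its spacing from the initiation codon, can be illustrated with a short Python sketch. The sequence `ANTI_SD` is the commonly cited 3' tail of E coli 16S rRNA and is used here as an assumption; the function names, the 20-nucleotide search window, and the example mRNA are hypothetical and chosen only for illustration.

```python
# Assumed 3' tail of E coli 16S rRNA, written 5'->3' (illustrative).
ANTI_SD = "ACCUCCUUA"

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the Watson-Crick reverse complement of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def best_sd_match(mrna, start_codon_index, window=20, min_len=3):
    """Find the longest stretch upstream of the start codon that is
    perfectly complementary to part of ANTI_SD.

    Returns (matched_sequence, spacing), where spacing is the number of
    nucleotides between the match and the start codon, or None if no
    match of at least min_len nucleotides exists.
    """
    ideal = reverse_complement(ANTI_SD)  # perfectly complementary site
    lower = max(0, start_codon_index - window)
    upstream = mrna[lower:start_codon_index]
    # Try the longest candidate lengths first.
    for length in range(len(ideal), min_len - 1, -1):
        for i in range(len(upstream) - length + 1):
            piece = upstream[i:i + length]
            if piece in ideal:
                spacing = len(upstream) - (i + length)
                return piece, spacing
    return None


# Hypothetical mRNA fragment with a start codon (AUG) at index 15.
mrna = "GCAGGAGGUAACAAUAUGGCU"
print(best_sd_match(mrna, mrna.index("AUG")))
```

Only perfect Watson-Crick pairing is scored here; a more realistic treatment would also allow G-U pairs and weigh the match by its free energy.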

Translational control is important for regulation of synthesis of ribosomal proteins. Production of ribosomes involves a high metabolic cost for bacteria, and at high growth rates ribosomes can constitute nearly one-half of the cell weight. Most ribosomal proteins and rRNAs are found assembled into ribosomes, and the pool of free ribosomal subunits is very small. The genes for ribosomal proteins are organized into several operons. Certain of the free ribosomal proteins directly inhibit the translation of the polycistronic mRNAs that encode them, thereby ensuring that synthesis of ribosomal proteins is balanced with the requirement for their utilization.
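
The feedback described above, in which free ribosomal proteins inhibit translation of their own mRNAs, can be sketched as a toy Python simulation. All parameter values, the hyperbolic form of the inhibition, and the function name `simulate_free_pool` are assumptions made for illustration; they are not measurements or a model taken from the literature.

```python
def simulate_free_pool(rrna_supply, max_synthesis=10.0, k_inhibit=1.0,
                       feedback=True, steps=200):
    """Return the free ribosomal protein pool after a number of steps.

    Toy model with arbitrary units: in each step protein is synthesized,
    then assembled with newly made rRNA. With feedback, synthesis falls
    as the free (unassembled) protein pool grows.
    """
    free_protein = 0.0
    for _ in range(steps):
        if feedback:
            # Free protein inhibits translation of its own mRNA.
            synthesis = max_synthesis / (1.0 + free_protein / k_inhibit)
        else:
            synthesis = max_synthesis
        free_protein += synthesis
        # Assembly into ribosomes is limited by the rRNA made this step.
        assembled = min(free_protein, rrna_supply)
        free_protein -= assembled
    return free_protein


print(simulate_free_pool(rrna_supply=2.0, feedback=True))
print(simulate_free_pool(rrna_supply=2.0, feedback=False))
```

With feedback the free pool settles at the level where synthesis equals the rate of assembly, whereas without feedback the unused protein accumulates steadily.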

Regulons and Signal Transducing Proteins

A regulon is a group of genes or operons controlled by a common regulator. There are several advantages to placing different operons under control by the same regulator. It enables the sensing of a single stimulus to be coupled to expression of a large number of genes that may be needed for an appropriate response, and it eliminates the requirement for the coordinately regulated genes to be linked on the bacterial chromosome. The stimulus to which the regulon responds can be an intracellular component or an environmental signal. Individual operons may also be subject to regulation by several different mechanisms and expressed under conditions that differ from those affecting the whole regulon.

More than 40 different regulons have been identified in E coli. Specific examples of regulons that respond to intracellular components include the cAMP-CAP regulon described previously and the regulons controlled by the stringent response and the SOS response. When ribosomes encounter uncharged tRNA molecules during protein synthesis, the stringent response is activated and results in prompt cessation of rRNA synthesis. A novel nucleotide called guanosine-3'-diphosphate-5'-diphosphate (ppGpp) accumulates during amino acid starvation. The ppGpp produced by idling ribosomes appears to be a mediator of the stringent response, but the precise mechanism causing inhibition of rRNA synthesis is unknown. The SOS response is associated with damage to DNA and involves induction of more than 20 genes involved in several DNA repair pathways. The product of the recA gene detects inhibition of DNA synthesis and initiates events leading to proteolytic cleavage and inactivation of the repressor for the SOS pathway, which is encoded by the lexA gene.

Some regulons are induced by specific environmental stimuli, such as nutrient limitation or osmotic stress. Often operons from more than one regulon may be induced, and the term stimulon has been used to describe the set of genes so induced. Typically, bacteria sense such environmental conditions by two-component systems. The first component is a membrane-spanning protein with extracellular and intracellular domains. Its extracellular domain detects the environmental stimulus, and its cytoplasmic domain, the transmitter module, transmits the signal. The second component is a bifunctional cytoplasmic protein. It has a receiver domain that interacts with the transmitter module of the first component, as well as an effector domain that controls expression of the corresponding regulon. The transmitter and receiver modules of the two-component regulatory systems from a wide variety of regulons are genetically related and share amino acid sequence homology. The signal-detecting and effector domains of the proteins from different regulons vary, however, and determine the signal that is detected and the operons that are activated or repressed in response to that signal.
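
The modular design described above, with variable signal-detecting and effector domains joined by conserved transmitter and receiver modules, can be sketched in Python as follows. The class names, the environment dictionary, the threshold, and the operon labels are all hypothetical and serve only to illustrate the architecture.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Sensor:
    """Membrane-spanning first component (illustrative)."""
    # Variable signal-detecting domain: decides whether a stimulus is present.
    detects: Callable[[Dict[str, float]], bool]

    def transmit(self, environment):
        # Conserved transmitter module: relays the detection result.
        return self.detects(environment)


@dataclass
class ResponseRegulator:
    """Cytoplasmic second component (illustrative)."""
    # Variable effector domain: operon name -> "activate" or "repress".
    effects: Dict[str, str]

    def respond(self, signal_received):
        # Conserved receiver module: acts only when a signal arrives.
        return dict(self.effects) if signal_received else {}


def two_component_response(sensor, regulator, environment):
    """Couple a sensor to a regulator for a given environment."""
    return regulator.respond(sensor.transmit(environment))


# Hypothetical osmotic-stress system with made-up operon names.
osmotic_sensor = Sensor(detects=lambda env: env.get("osmolarity", 0.0) > 0.5)
regulator = ResponseRegulator(effects={"operon_X": "activate",
                                       "operon_Y": "repress"})
print(two_component_response(osmotic_sensor, regulator, {"osmolarity": 0.9}))
```

Because `transmit` and `respond` are the same for every system, swapping the `detects` function or the `effects` table is enough to model a regulon that responds to a different signal or controls different operons.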

Global regulation has an important role in the physiology of pathogenic bacteria. For example, Vibrio cholerae and Bordetella pertussis express many of their virulence determinants under the control of signal transducing systems that are related to the two-component systems described above. The expression of proteins needed for the invasive phenotype is controlled by temperature in Shigella. Yersinia enterocolitica senses both the environmental temperature and the concentration of calcium ions and couples these signals to the expression of genes and cellular location of the gene products that are appropriate for an intracellular or extracellular environment. In host tissues the concentration of free iron is extremely low, and most pathogenic bacteria have high-affinity iron transport systems that are induced under low-iron conditions. The synthesis of diphtheria toxin by C diphtheriae, Shiga toxin by Shigella dysenteriae, exotoxin A by Pseudomonas aeruginosa, and other specific proteins in many pathogenic bacteria is induced under conditions of iron-limited growth. These examples illustrate how environmental factors can regulate the expression of virulence genes in pathogenic bacteria.
