Introduction to biochemistry. Amino acid composition, structure, physicochemical properties, classification and functions of simple and complex proteins
Biochemistry can be defined as the science concerned with the chemical basis of life (Gk bios “life”). The cell is the structural unit of living systems. Thus, biochemistry can also be described as the science concerned with the chemical constituents of living cells and with the reactions and processes they undergo. By this definition, biochemistry encompasses large areas of cell biology, of molecular biology, and of molecular genetics.
The Aim of Biochemistry Is to Describe & Explain, in Molecular Terms, All Chemical Processes of Living Cells
The major objective of biochemistry is the complete understanding, at the molecular level, of all of the chemical processes associated with living cells. To achieve this objective, biochemists have sought to isolate the numerous molecules found in cells, determine their structures, and analyze how they function.
A Knowledge of Biochemistry Is Essential to All Life Sciences
The biochemistry of the nucleic acids lies at the heart of genetics; in turn, the use of genetic approaches has been critical for elucidating many areas of biochemistry.
Physiology, the study of body function, overlaps with biochemistry almost completely. Immunology employs numerous biochemical techniques, and many immunologic approaches have found wide use by biochemists. Pharmacology and pharmacy rest on a sound knowledge of biochemistry and physiology; in particular, most drugs are metabolized by enzyme-catalyzed reactions.
Poisons act on biochemical reactions or processes; this is the subject matter of toxicology. Biochemical approaches are being used increasingly to study basic aspects of pathology (the study of disease), such as inflammation, cell injury, and cancer. Many workers in microbiology, zoology, and botany employ biochemical approaches almost exclusively. These relationships are not surprising, because life as we know it depends on biochemical reactions and processes. In fact, the old barriers among the life sciences are breaking down, and biochemistry is increasingly becoming their common language.
The World Health Organization (WHO) defines health as a state of “complete physical, mental and social well-being and not merely the absence of disease or infirmity.” From a strictly biochemical viewpoint, health may be considered that situation in which all of the many thousands of intra- and extracellular reactions that occur in the body are proceeding at rates commensurate with the organism’s maximal survival in the physiologic state. However, this is an extremely reductionist view, and it should be apparent that caring for the health of patients requires a wide knowledge not only of biologic principles but also of psychologic and social principles.
Protein is an important nutrient that builds muscles and bones and provides energy. Protein can help with weight control because it helps you feel full and satisfied from your meals.
The healthiest proteins are the leanest. This means that they have the least fat and calories. The best protein choices are fish or shellfish, skinless chicken or turkey, low-fat or fat-free dairy (skim milk, low-fat cheese), and egg whites or egg substitute. The best red meats are the leanest cuts (loin and tenderloin). Other healthy options are beans, legumes (lentils and peanut butter), and soy foods such as tofu or soymilk.
Protein is an important part of every diet and is found in many different foods. Lean protein, the best kind, can be found in fish, skinless chicken and turkey, pork tenderloin and certain cuts of beef, like the top round. Low-fat dairy products like milk, yogurt, ricotta and other cheeses supply both protein and calcium.
· Protein is crucial for tissue repair, building and preserving muscle, and making important enzymes and hormones.
· Lean meats and dairy contribute valuable minerals like calcium, iron, selenium and zinc. These are not only essential for building bones, and forming and maintaining nerve function, but also for fighting cancer, forming blood cells and keeping immune systems robust.
Structure and Function
The word protein was first coined in 1838 to emphasize the importance of this class of molecules. The word is derived from the Greek word proteios which means "of the first rank".
This chapter will provide a brief background into the structure of proteins and how this structure can determine the function and activity of proteins. It is not intended to substitute for the more detailed information provided in a biochemistry or cell biology course.
Proteins are the major components of living organisms and perform a wide range of essential functions in cells. While DNA is the information molecule, it is proteins that do the work of all cells - microbial, plant, animal. Proteins regulate metabolic activity, catalyze biochemical reactions and maintain structural integrity of cells and organisms. Proteins can be classified in a variety of ways, including their biological function (Table 2.1).
Table 2.1 Classification of Proteins According to biological function.
Enzymes- Catalyze biological reactions
Transport and Storage
Regulatory Function within cells
How does one group of molecules perform such a diverse set of functions? The answer is found in the wide variety of possible structures for proteins.
In the English language, there are an enormous number of words with varied meaning that can be formed using only 26 letters as building blocks. A similar situation exists for proteins where an incredible variety of proteins can be formed using 20 different building blocks called amino acids. Each of these amino acid building blocks has a different chemical structure and different properties.
Each protein has a unique amino acid sequence that is genetically determined by the order of nucleotide bases in the DNA, the genetic code. Since each protein has different numbers and kinds of the twenty available amino acids, each protein has a unique chemical composition and structure. For example, two proteins may each have 37 amino acids, but if the sequence of the amino acids is different, then the proteins will be different. How many different proteins can be formed from the twenty different amino acids? Consider a protein containing 100 amino acids linked into one chain. Since each of the 100 positions of this chain could be filled with any one of the 20 amino acids, there are 20^100 (about 1.3 × 10^130) possible sequences, more than enough to account for the 90-100 million different proteins that may be found in higher organisms.
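The counting argument above can be checked directly. The following Python sketch computes how many distinct chains of a given length can be built when every position may hold any of the 20 amino acids. The function name count_sequences is illustrative and does not come from the text.

```python
def count_sequences(length, alphabet_size=20):
    """Number of distinct linear chains of `length` residues when each
    position can hold any one of `alphabet_size` amino acids."""
    return alphabet_size ** length

# A chain of 100 residues: 20 choices at each of 100 positions.
n = count_sequences(100)
print(len(str(n)))   # 131 decimal digits
print(f"{n:.2e}")    # 1.27e+130
```

Even a dipeptide already has 20 × 20 = 400 possible sequences, which shows how quickly the number grows with chain length.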
A change in just one amino acid can change the structure and function of a protein. For example, sickle cell anemia is a disease that results from an altered structure of the protein hemoglobin, resulting from a change of the sixth amino acid from glutamic acid to valine. (This is the result of a single base pair change at the DNA level.) This single amino acid change is enough to change the conformation of hemoglobin so that this protein clumps at lower oxygen concentrations and causes the characteristic sickle shaped red blood cells of the disease.
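The effect of a single base change can be illustrated with a small translation routine. This is only a sketch: the codon table is the standard genetic code, but the DNA fragment below is an assumed, illustrative stretch chosen to encode Val-His-Leu-Thr-Pro-Glu-Glu, and the exact codons of the real gene are not taken from the text. The function name translate is hypothetical.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a coding-strand DNA string, codon by codon,
    into one-letter amino acid codes ('*' marks a stop codon)."""
    usable = len(dna) - len(dna) % 3          # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# Assumed fragment: GTG CAT CTG ACT CCT GAG GAG
normal = "GTGCATCTGACTCCTGAGGAG"
# Change one base in the sixth codon: GAG (glutamic acid) -> GTG (valine).
sickle = normal[:16] + "T" + normal[17:]

print(translate(normal))  # VHLTPEE
print(translate(sickle))  # VHLTPVE
```

The two DNA strings differ at exactly one position, yet the sixth residue changes from glutamic acid (E) to valine (V).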
The unique structure and chemical composition of each protein is important for its function; it is also important for separating proteins in a protein purification strategy. Each of these differences in properties can be used as a basis for the separation methods that are used to purify proteins. Because these differences in protein properties originate from differences in the chemical structure of the amino acids that make up the protein, we need to explore the structure of amino acids and their contribution to protein properties in more detail.
Chemical Composition of Proteins: (Protein Structure)
Amino acid structure:
Amino acids are composed of carbon, hydrogen, oxygen, and nitrogen. Two amino acids, cysteine and methionine, also contain sulfur. The generic form of an amino acid is shown in Figure 2.1. Atoms of these elements are arranged into 20 kinds of amino acids that are commonly found in proteins. All proteins in all species, from bacteria to humans, are constructed from the same set of twenty amino acids. All amino acids have an amino group (NH2) and a carboxyl group (COOH) bonded to the same carbon atom, known as the alpha carbon. Amino acids differ in the side chain or R group that is bonded to the alpha carbon (Figure 2.2). Glycine, the simplest amino acid, has a single hydrogen atom as its R group; alanine has a methyl (-CH3) group.
The chemical composition of the unique R groups is responsible for the important characteristics of amino acids such as chemical reactivity, ionic charge and relative hydrophobicity. In Figure 2.2, the amino acids are grouped according to their polarity and charge. They are divided into four categories, those with polar uncharged R groups, those with apolar (nonpolar) R groups, acidic (charged) and basic (charged) groups.
The polar amino acids are soluble in water because their R groups can form hydrogen bonds with water. For example, serine, threonine and tyrosine all have hydroxyl groups (OH). Amino acids that carry a net negative charge at neutral pH contain a second carboxyl group. These are the acidic amino acids, aspartic acid and glutamic acid, also called aspartate and glutamate, respectively. The basic amino acids have R groups with a net positive charge at pH 7.0. These include lysine, arginine and histidine. There are eight amino acids with nonpolar R groups. As a group, these amino acids are less soluble in water than the polar amino acids. If a protein has a greater percentage of nonpolar R groups, the protein will be more hydrophobic (water hating) in character.
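The four categories can be written down as a simple lookup. The sketch below assumes one common textbook assignment of the 20 amino acids (one-letter codes) that is consistent with the counts given here, including eight nonpolar residues. Other textbooks place a few residues, such as glycine or cysteine, differently. The names GROUPS, classify and nonpolar_fraction are illustrative.

```python
# Assumed grouping by R group; assignments vary slightly between textbooks.
GROUPS = {
    "nonpolar": set("AVLIPFWM"),        # eight residues
    "polar uncharged": set("GSTCYNQ"),
    "acidic": set("DE"),                # aspartate, glutamate
    "basic": set("KRH"),                # lysine, arginine, histidine
}

def classify(residue):
    """Return the R-group category of a one-letter amino acid code."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")

def nonpolar_fraction(sequence):
    """Fraction of residues with nonpolar R groups, a crude
    indicator of how hydrophobic a protein is."""
    return sum(r in GROUPS["nonpolar"] for r in sequence) / len(sequence)

print(classify("S"))                # polar uncharged
print(nonpolar_fraction("AVLKDE"))  # 0.5
```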
A protein is formed by amino acid subunits linked together in a chain. The bond between two amino acids is formed by the removal of an H2O molecule from two different amino acids, forming a dipeptide (Figure 2.3). The bond between two amino acids is called a peptide bond, and the chain of amino acids is called a peptide (20 amino acids or fewer) or a polypeptide.
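Because one water molecule is lost for every peptide bond formed, the mass of a peptide is less than the sum of the masses of its free amino acids. A minimal sketch of that bookkeeping follows. The rounded molecular masses are supplied here for illustration and are not taken from the text, and the function name peptide_mass is hypothetical.

```python
WATER = 18.015  # approximate molecular mass of H2O, g/mol

# Approximate molecular masses of the free amino acids, g/mol.
FREE_AMINO_ACID_MASS = {"Gly": 75.07, "Ala": 89.09, "Ser": 105.09}

def peptide_mass(residues):
    """Mass of a linear peptide: the sum of the free amino acid masses
    minus one water for each of the (n - 1) peptide bonds."""
    if not residues:
        return 0.0
    total = sum(FREE_AMINO_ACID_MASS[r] for r in residues)
    return total - WATER * (len(residues) - 1)

print(round(peptide_mass(["Gly", "Ala"]), 1))  # 146.1
```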
Each protein consists of one or more unique polypeptide chains. Most proteins do not remain as linear sequences of amino acids; rather, the polypeptide chain undergoes a folding process. The process of protein folding is driven by thermodynamic considerations. This means that each protein folds into a configuration that is the most stable for its particular chemical structure and its particular environment. The final shape will vary but the majority of proteins assume a globular configuration. Many proteins such as myoglobin consist of a single polypeptide chain; others contain two or more chains. For example, hemoglobin is made up of two chains of one type (amino acid sequence) and two of another type.
Although the primary amino acid sequence determines how the protein folds, this process is not completely understood. Although certain amino acid sequences can be identified as more likely to form a particular conformation, it is still not possible to completely predict how a protein will fold based on its amino acid sequence alone, and this is an active area of biochemical research.
The final folded 3-D arrangement of the protein is referred to as its conformation. In order to maintain their function, proteins must maintain this conformation. To describe this complex conformation, scientists describe four levels of organization: primary, secondary, tertiary, and quaternary (Figure 2.4). The overall conformation of a protein is the combination of its primary, secondary, tertiary and quaternary elements.
Four levels of Organization of Protein Structure:
· Primary structure refers to the linear sequence of amino acids that make up the polypeptide chain. This sequence is determined by the genetic code, the sequence of nucleotide bases in the DNA. The bond between two amino acids is a peptide bond. This bond is formed by the removal of an H2O molecule from two different amino acids, forming a dipeptide. The sequence of amino acids determines the positioning of the different R groups relative to each other. This positioning therefore determines the way that the protein folds and the final structure of the molecule.
· Secondary structure refers to the formation of a regular pattern of twists or kinks of the polypeptide chain. The regularity is due to hydrogen bonds forming between the atoms of the amino acid backbone of the polypeptide chain. The two most common types of secondary structure are called the alpha helix and the β-pleated sheet (Figure 2.4).
· Tertiary structure refers to the three dimensional globular structure formed by bending and twisting of the polypeptide chain. This process often means that the linear sequence of amino acids is folded into a compact globular structure. The folding of the polypeptide chain is stabilized by multiple weak, noncovalent interactions. These interactions include:
o Hydrogen bonds that form when a hydrogen atom is shared by two other atoms.
o Electrostatic interactions that occur between charged amino acid side chains. Electrostatic interactions are attractions between positive and negative sites on macromolecules.
Covalent bonds may also contribute to tertiary structure. The amino acid cysteine has an SH group as part of its R group, and therefore a disulfide bond (S-S) can form with an adjacent cysteine. For example, insulin has two polypeptide chains that are joined by two disulfide bonds.
· Quaternary structure refers to the fact that some proteins contain more than one polypeptide chain, adding an additional level of structural organization: the association of the polypeptide chains. Each polypeptide chain in the protein is called a subunit. The subunits can be the same polypeptide chain or different ones. For example, the enzyme β-galactosidase is a tetramer, meaning that it is composed of four subunits, and, in this case, the subunits are identical - each polypeptide chain has the same sequence of amino acids. Hemoglobin, the oxygen-carrying protein in the blood, is also a tetramer, but it is composed of two polypeptide chains of one type (141 amino acids) and two of a different type (146 amino acids). In chemical shorthand, this is referred to as α2β2. For some proteins, quaternary structure is required for full activity (function) of the protein.
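The subunit bookkeeping for hemoglobin can be sketched in a few lines of Python. The chain lengths are the ones given above. The labels "alpha" and "beta" and the function names are illustrative choices.

```python
from collections import Counter

# Hemoglobin as described above: two chains of 141 residues, two of 146.
HEMOGLOBIN = [("alpha", 141), ("alpha", 141), ("beta", 146), ("beta", 146)]

def subunit_formula(subunits):
    """Shorthand such as 'alpha2beta2', built by counting each chain type."""
    counts = Counter(name for name, _ in subunits)
    return "".join(f"{name}{n}" for name, n in sorted(counts.items()))

def total_residues(subunits):
    """Total number of amino acid residues across all chains."""
    return sum(length for _, length in subunits)

print(subunit_formula(HEMOGLOBIN))  # alpha2beta2
print(total_residues(HEMOGLOBIN))   # 574
```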
The wide variety of 3-dimensional protein structures corresponds to the diversity of functions proteins fulfill.
Proteins fold in three dimensions. Protein structure is organized hierarchically from so-called primary structure to quaternary structure. Higher-level structures are motifs and domains.
Above all, the wide variety of conformations is due to the huge number of different sequences of amino acid residues. The primary structure is the linear sequence of amino acid residues in the polypeptide chain. It is held together by covalent peptide bonds, which are made during protein biosynthesis (translation). The two ends of the polypeptide chain are referred to as the carboxyl terminus (C-terminus) and the amino terminus (N-terminus), based on the nature of the free group at each end. Counting of residues always starts at the N-terminal end (NH2 group), which is the end where the amino group is not involved in a peptide bond. The primary structure of a protein is determined by the gene corresponding to the protein. A specific sequence of nucleotides in DNA is transcribed into mRNA, which is read by the ribosome in a process called translation. The sequence of a protein is unique to that protein, and defines the structure and function of the protein. The sequence of a protein can be determined by methods such as Edman degradation or tandem mass spectrometry. Often, however, it is read directly from the sequence of the gene using the genetic code. We know that there are over 10,000 proteins in our body, which are composed of different arrangements of 20 types of amino acid residues (the term "amino acid residue" is strictly preferred because a water molecule is lost when each peptide bond is formed, so a protein is made up of amino acid residues rather than intact amino acids). Post-translational modifications such as disulfide bond formation, phosphorylation and glycosylation are usually also considered a part of the primary structure, and cannot be read from the gene.
Secondary structure is a local, regularly occurring structure in proteins and is mainly formed through hydrogen bonds between backbone atoms. So-called random coils, loops or turns don't have a stable secondary structure. There are two types of stable secondary structures: alpha helices and beta sheets (see Figure 3 and Figure 4). Alpha helices and beta sheets are preferably located at the core of the protein, whereas loops prefer to reside in outer regions.
Secondary structure refers to highly regular local sub-structures. Two main types of secondary structure, the alpha helix and the beta strand or beta sheets, were suggested in 1951 by Linus Pauling and coworkers. These secondary structures are defined by patterns of hydrogen bonds between the main-chain peptide groups. They have a regular geometry, being constrained to specific values of the dihedral angles ψ and φ on the Ramachandran plot. Both the alpha helix and the beta-sheet represent a way of saturating all the hydrogen bond donors and acceptors in the peptide backbone. Some parts of the protein are ordered but do not form any regular structures. They should not be confused with random coil, an unfolded polypeptide chain lacking any fixed three-dimensional structure. Several sequential secondary structures may form a "supersecondary unit".
Figure 3: An alpha helix. The backbone forms a helix, and the side chains stick out. An ideal alpha helix has 3.6 residues per complete turn. Hydrogen bonds form between the backbone carbonyl (C=O) group of amino acid n and the amino (N-H) group of amino acid n+4. The mean phi angle is -62 degrees and the mean psi angle is -41 degrees.
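The n to n+4 hydrogen bonding pattern described for the alpha helix can be enumerated with a short sketch. Residues are numbered from 1 at the N-terminus. The function name helix_hbonds is illustrative.

```python
def helix_hbonds(n_residues, offset=4):
    """List the (acceptor, donor) residue pairs in an ideal alpha helix:
    the backbone C=O of residue i bonds to the backbone N-H of residue
    i + offset. Residues are numbered from 1."""
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]

print(helix_hbonds(8))  # [(1, 5), (2, 6), (3, 7), (4, 8)]
```

A helical stretch shorter than five residues cannot form any of these bonds, so the function returns an empty list for it.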
Figure 4: An antiparallel beta sheet. Beta sheets are created when atoms of beta strands are hydrogen bonded. Beta sheets may consist of parallel strands, antiparallel strands or a mixture of parallel and antiparallel strands.
Tertiary structure describes the packing of alpha-helices, beta-sheets and random coils with respect to each other on the level of one whole polypeptide chain. Figure 5 shows the tertiary structure of Chain B of Protein Kinase C Interacting Protein.
Tertiary structure refers to three-dimensional structure of a single protein molecule. The alpha-helices and beta-sheets are folded into a compact globule. The folding is driven by the non-specific hydrophobic interactions (the burial of hydrophobic residues from water), but the structure is stable only when the parts of a protein domain are locked into place by specific tertiary interactions, such as salt bridges, hydrogen bonds, and the tight packing of side chains and disulfide bonds. The disulfide bonds are extremely rare in cytosolic proteins, since the cytosol is generally a reducing environment.
Figure 5: Chain B of Protein Kinase C Interacting Protein. Helices are shown as ribbons and the extended strands of beta sheets as broad arrows. (The figure was obtained using RasMol and the PDB file with PDB ID 1AV5, stored at the PDB, the Brookhaven Protein Data Bank.)
Quaternary structure exists only if more than one polypeptide chain is present in a complex protein. Quaternary structure then describes the spatial organization of the chains. Figure 6 shows both Chain A and Chain B of Protein Kinase C Interacting Protein forming the quaternary structure.
Quaternary structure is the three-dimensional structure of a multi-subunit protein and how the subunits fit together. In this context, the quaternary structure is stabilized by the same non-covalent interactions and disulfide bonds as the tertiary structure. Complexes of two or more polypeptides (i.e. multiple subunits) are called multimers. Specifically it would be called a dimer if it contains two subunits, a trimer if it contains three subunits, and a tetramer if it contains four subunits. The subunits are frequently related to one another by symmetry operations, such as a 2-fold axis in a dimer. Multimers made up of identical subunits are referred to with a prefix of "homo-" (e.g. a homotetramer) and those made up of different subunits are referred to with a prefix of "hetero-" (e.g. a heterotetramer, such as the two alpha and two beta chains of hemoglobin).
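The naming rules above are mechanical enough to sketch in code. This example assumes that subunits are given as a list of chain labels. The name name_complex and the "monomer" case for a single chain are additions for illustration.

```python
# Size names for small complexes; "monomer" is added here for completeness.
SIZE_NAMES = {1: "monomer", 2: "dimer", 3: "trimer", 4: "tetramer"}

def name_complex(subunits):
    """Name a complex from its list of chain labels, e.g.
    ['alpha', 'alpha', 'beta', 'beta'] -> 'heterotetramer'."""
    size = SIZE_NAMES.get(len(subunits))
    if size is None:
        raise ValueError("only complexes of 1 to 4 subunits are named here")
    if len(subunits) == 1:
        return size
    prefix = "homo" if len(set(subunits)) == 1 else "hetero"
    return prefix + size

print(name_complex(["alpha", "alpha", "beta", "beta"]))  # heterotetramer
print(name_complex(["A", "A"]))                          # homodimer
```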
Figure 6: Quaternary structure of Protein Kinase C Interacting Protein. (The figure was obtained using RasMol and the PDB file with PDB ID 1AV5, stored at the PDB, the Brookhaven Protein Data Bank.)
The primary structure of proteins
Drawing the amino acids
In chemistry, if you were to draw the structure of a general 2-amino acid, you would probably draw it like this:
However, for drawing the structures of proteins, we usually twist it so that the "R" group sticks out at the side. It is much easier to see what is happening if you do that.
That means that the two simplest amino acids, glycine and alanine, would be shown as:
Peptides and polypeptides
Glycine and alanine can combine together with the elimination of a molecule of water to produce a dipeptide. It is possible for this to happen in one of two different ways - so you might get two different dipeptides.
In each case, the linkage shown in blue in the structure of the dipeptide is known as a peptide link. In chemistry, this would also be known as an amide link, but since we are now in the realms of biochemistry and biology, we'll use their terms.
If you joined three amino acids together, you would get a tripeptide. If you joined lots and lots together (as in a protein chain), you get a polypeptide.
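The reason two different dipeptides are possible is that the order of the residues matters: the chain has a distinct N-terminal end and C-terminal end. A small sketch, assuming each amino acid is used exactly once and writing the N-terminal residue first, lists the possibilities. The function name peptides_from is illustrative.

```python
from itertools import permutations

def peptides_from(amino_acids):
    """All linear peptides that use each given amino acid exactly once,
    written with the N-terminal residue first."""
    return ["-".join(order) for order in permutations(amino_acids)]

print(peptides_from(["Gly", "Ala"]))              # ['Gly-Ala', 'Ala-Gly']
print(len(peptides_from(["Gly", "Ala", "Ser"])))  # 6
```

Three different amino acids already give six distinct tripeptides.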
A protein chain will have somewhere in the range of 50 to 2000 amino acid residues. You have to use this term because strictly speaking a peptide chain isn't made up of amino acids. When the amino acids combine together, a water molecule is lost. The peptide chain is made up from what is left after the water is lost - in other words, is made up of amino acid residues.
By convention, when you are drawing peptide chains, the -NH2 group which hasn't been converted into a peptide link is written at the left-hand end. The unchanged -COOH group is written at the right-hand end.
The end of the peptide chain with the -NH2 group is known as the N-terminal, and the end with the -COOH group is the C-terminal.
A protein chain (with the N-terminal on the left) will therefore look like this:
The "R" groups come from the 20 amino acids which occur in proteins. The peptide chain is known as the backbone, and the "R" groups are known as side chains.
Note: In the case where the "R" group comes from the amino acid proline, the pattern is broken. In this case, the hydrogen on the nitrogen nearest the "R" group is missing, and the "R" group loops around and is attached to that nitrogen as well as to the carbon atom in the chain.
I mention this for the sake of completeness - not because you would be expected to know about it in chemistry at this introductory level.
The primary structure of proteins
Now there's a problem! The term "primary structure" is used in two different ways.
At its simplest, the term is used to describe the order of the amino acids joined together to make the protein. In other words, if you replaced the "R" groups in the last diagram by real groups you would have the primary structure of a particular protein.
This primary structure is usually shown using abbreviations for the amino acid residues. These abbreviations commonly consist of three letters or one letter.
Using three letter abbreviations, a bit of a protein chain might be represented by, for example:
If you look carefully, you will spot the abbreviations for glycine (Gly) and alanine (Ala).
If you followed the protein chain all the way to its left-hand end, you would find an amino acid residue with an unattached -NH2 group. The N-terminal is always written on the left of a diagram for a protein's primary structure - whether you draw it in full or use these abbreviations.
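Converting between the one-letter and three-letter abbreviations is a simple table lookup. The sketch below uses the standard abbreviations for the 20 amino acids and writes the N-terminal residue on the left, following the convention just described. The example sequence is made up for illustration.

```python
# Standard one-letter to three-letter amino acid abbreviations.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

def to_three_letter(sequence):
    """Write a one-letter sequence in three-letter form,
    N-terminal residue on the left."""
    return "-".join(THREE_LETTER[residue] for residue in sequence.upper())

print(to_three_letter("GAVS"))  # Gly-Ala-Val-Ser
```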
The wider definition of primary structure includes all the features of a protein which are a result of covalent bonds. Obviously, all the peptide links are made of covalent bonds, so that isn't a problem.
But there is an additional feature in proteins which is also covalently bound. It involves the amino acid cysteine.
If two cysteine side chains end up next to each other because of folding in the peptide chain, they can react to form a sulphur bridge. This is another covalent link and so some people count it as a part of the primary structure of the protein.
Because of the way sulphur bridges affect the way the protein folds, other people count this as a part of the tertiary structure (see below). This is obviously a potential source of confusion!
Important: You need to know where your particular examiners are going to include sulphur bridges - as a part of the primary structure or as a part of the tertiary structure. You need to check your current syllabus and past papers.
The secondary structure of proteins
Within the long protein chains there are regions in which the chains are organised into regular structures known as alpha-helices (alpha-helixes) and beta-pleated sheets. These are the secondary structures in proteins.
These secondary structures are held together by hydrogen bonds. These form as shown in the diagram between one of the lone pairs on an oxygen atom and the hydrogen attached to a nitrogen atom:
Important: If you are unsure about hydrogen bonding or about what this diagram means, review hydrogen bonding before you go on. What follows is difficult enough to visualise anyway without having to worry about what hydrogen bonds are as well!
You must also find out exactly how much detail you need to know about this next bit. It may well be that all you need is to have heard of an alpha-helix and know that it is held together by hydrogen bonds between the C=O and N-H groups. Once again, you need to check your syllabus and past papers - particularly mark schemes for the past papers.
Hydrogen bonds can also form between side groups, not only between groups in the backbone of the chain. These side-group hydrogen bonds help to hold the tertiary structure together rather than the secondary structure.
Lots of amino acids contain groups in the side chains which have a hydrogen atom attached to either an oxygen or a nitrogen atom. This is a classic situation where hydrogen bonding can occur.
For example, the amino acid serine contains an -OH group in the side chain. You could have a hydrogen bond set up between two serine residues in different parts of a folded chain.
You could easily imagine similar hydrogen bonding involving -OH groups, or -COOH groups, or -CONH2 groups, or -NH2 groups in various combinations - although you would have to be careful to remember that a -COOH group and an -NH2 group would form a zwitterion and produce stronger ionic bonding instead of hydrogen bonds.
In an alpha-helix, the protein chain is coiled like a loosely-coiled spring. The "alpha" means that if you look down the length of the spring, the coiling is happening in a clockwise direction as it goes away from you.
Note: If your visual imagination is as hopeless as mine, the only way to really understand this is to get a bit of wire and coil it into a spring shape. The lead on your computer mouse is fine for doing this!
The next diagram shows how the alpha-helix is held together by hydrogen bonds. This is a very simplified diagram, missing out lots of atoms. We'll talk it through in some detail after you have had a look at it.
What's wrong with the diagram? Two things:
First of all, only the atoms on the parts of the coils facing you are shown. If you try to show all the atoms, the whole thing gets so complicated that it is virtually impossible to understand what is going on.
Secondly, I have made no attempt whatsoever to get the bond angles right. I have deliberately drawn all of the bonds in the backbone of the chain as if they lie along the spiral. In truth they stick out all over the place. Again, if you draw it properly it is virtually impossible to see the spiral.
So, what do you need to notice?
Notice that all the "R" groups are sticking out sideways from the main helix.
Notice the regular arrangement of the hydrogen bonds. All the N-H groups are pointing upwards, and all the C=O groups pointing downwards. Each of them is involved in a hydrogen bond.
And finally, although you can't see it from this incomplete diagram, each complete turn of the spiral has 3.6 (approximately) amino acid residues in it.
If you had a whole number of amino acid residues per turn, each group would have an identical group underneath it on the turn below. Hydrogen bonding can't happen under those circumstances.
Each turn has 3 complete amino acid residues and two atoms from the next one. That means that each turn is offset from the ones above and below, such that the N-H and C=O groups are brought into line with each other.
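The offset between successive turns follows directly from the figure of 3.6 residues per turn: each residue is rotated 360 / 3.6 = 100 degrees around the helix axis relative to the previous one. The sketch below computes where each residue sits around the axis. The function name and the choice of numbering residues from 0 are illustrative.

```python
def angular_position(index, residues_per_turn=3.6):
    """Angle in degrees (0 to 360) of residue `index` around the helix
    axis, measured from residue 0."""
    step = 360.0 / residues_per_turn       # 100 degrees per residue
    return round(index * step, 6) % 360    # rounding removes float noise

for i in range(5):
    print(i, angular_position(i))  # angles: 0.0, 100.0, 200.0, 300.0, 40.0
```

Residue 4 sits at 40 degrees rather than directly below residue 0, and a residue lines up exactly with residue 0 again only at residue 18, after five complete turns.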
In a beta-pleated sheet, the chains are folded so that they lie alongside each other. The next diagram shows what is known as an "anti-parallel" sheet. All that means is that next-door chains are heading in opposite directions. Given the way this particular folding happens, that would seem to be inevitable.
It isn't, in fact, inevitable! It is possible to have some much more complicated folding so that next-door chains are actually heading in the same direction. We are getting well beyond the demands
The folded chains are again held together by hydrogen bonds involving exactly the same groups as in the alpha-helix.
Note: There is no reason why these sheets have to be made from four bits of folded chain alongside each other as shown in this diagram. That was an arbitrary choice which produced a diagram which fitted nicely on the screen!
The tertiary structure of proteins
What is tertiary structure?
The tertiary structure of a protein is a description of the way the whole chain (including the secondary structures) folds itself into its final 3-dimensional shape. This is often simplified into models like the following one for the enzyme dihydrofolate reductase. Enzymes are, of course, based on proteins.
Note: This diagram was obtained from the RCSB Protein Data Bank. If you want to find more information about dihydrofolate reductase, their reference number for it is 7DFR.
There is nothing particularly special about this enzyme in terms of structure. I chose it because it contained only a single protein chain and had examples of both types of secondary structure in it.
The model shows the alpha-helices in the secondary structure as coils of "ribbon". The beta-pleated sheets are shown as flat bits of ribbon ending in an arrow head. The bits of the protein chain which are just random coils and loops are shown as bits of "string".
The colour coding in the model helps you to track your way around the structure - going through the spectrum from dark blue to end up at red.
You will also notice that this particular model has two other molecules locked into it (shown as ordinary molecular models). These are the two molecules whose reaction this enzyme catalyses.
What holds a protein into its tertiary structure?
The tertiary structure of a protein is held together by interactions between the side chains - the "R" groups. There are several ways this can happen.
Ionic interactions
Some amino acids (such as aspartic acid and glutamic acid) contain an extra -COOH group. Some amino acids (such as lysine) contain an extra -NH2 group.
You can get a transfer of a hydrogen ion from the -COOH to the -NH2 group to form zwitterions just as in simple amino acids.
You could obviously get an ionic bond between the negative and the positive group if the chains folded in such a way that they were close to each other.
van der Waals dispersion forces
Several amino acids have quite large hydrocarbon groups in their side chains. A few examples are shown below. Temporary fluctuating dipoles in one of these groups could induce opposite dipoles in another group on a nearby folded chain.
The dispersion forces set up would be enough to hold the folded structure together.
Many proteins contain only amino acids and no other chemical groups, and they are called simple proteins. Other proteins, however, yield on hydrolysis some other chemical component in addition to amino acids; these are called conjugated proteins. The non-amino acid part of a conjugated protein is usually called its prosthetic group. Many prosthetic groups are derived from vitamins. Conjugated proteins are classified on the basis of the chemical nature of their prosthetic groups.
Hemoglobin contains an iron-containing prosthetic group, heme. The heme group carries oxygen: the oxygen molecule binds to the iron ion (Fe2+) found in the heme group.
Glycoproteins are generally the largest and most abundant group of conjugated proteins. They range from glycoproteins in cell surface membranes that constitute the glycocalyx, to important antibodies produced by leukocytes.
Some proteins combine with other kinds of molecules such as carbohydrates, lipids, iron and other metals, or nucleic acids, to form glycoproteins, lipoproteins, hemoproteins, metalloproteins, and nucleoproteins respectively. The presence of these other biomolecules affects the protein properties. For example, a protein that is conjugated to carbohydrate, called a glycoprotein, would be more hydrophilic in character while a protein conjugated to a lipid would be more hydrophobic in character.
Protein Properties and Separation
Proteins are typically characterized by their size (molecular weight) and shape, amino acid composition and sequence, isoelectric point (pI), hydrophobicity, and biological affinity. Differences in these properties can be used as the basis for separation methods in a purification strategy (Chapter 4). The chemical composition of the unique R groups is responsible for the important characteristics of amino acids: chemical reactivity, ionic charge and relative hydrophobicity. Therefore protein properties relate back to the number and type of amino acids that make up the protein.
The size of a protein is usually expressed as its molecular weight (mass), although occasionally the length or diameter of a protein is given in angstroms. The molecular weight of a protein is usually given in units called daltons; numerically it equals the mass in grams of one mole of the protein. One dalton is approximately the mass of one proton or neutron. The molecular weight can be estimated by a number of different methods including electrophoresis, gel filtration, and more recently by mass spectrometry. The molecular weight of proteins varies over a wide range. For example, insulin is 5,700 daltons while snail hemocyanin is 6,700,000 daltons. The average molecular weight of a protein is between 40,000 and 50,000 daltons. Molecular weights are commonly reported in kilodaltons (kD), a unit of mass equal to 1000 daltons. Most proteins have a mass between 10 and 100 kD. A small protein consists of about 50 amino acids while larger proteins may contain 3,000 amino acids or more. One of the larger amino acid chains is myosin, found in muscles, which has 1,750 amino acids.
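The link between chain length and mass can be sketched with a quick calculation. This is only a rough illustration: the average residue mass of about 110 daltons is a common rule of thumb assumed here, not a figure given in the text, and the function name is hypothetical.

```python
# Rough molecular-weight estimate from the number of amino acids in a chain.
# ASSUMPTION: an average residue mass of ~110 Da (rule of thumb, not from the text).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kd(n_residues: int) -> float:
    """Return the approximate mass in kilodaltons for a chain of n_residues."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# Chain lengths mentioned in the paragraph above.
for n in (50, 1750, 3000):
    print(f"{n} residues: roughly {estimate_mass_kd(n):.1f} kD")
```

Real proteins deviate from such an estimate because their amino acid compositions differ, so measured values from electrophoresis, gel filtration or mass spectrometry remain the reference.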
Separation methods that are based on size and shape include gel filtration chromatography (size exclusion chromatography) and polyacrylamide gel electrophoresis.
Amino Acid Composition and Sequence
The amino acid composition is the percentage of the constituent amino acids in a particular protein while the sequence is the order in which the amino acids are arranged.
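The difference between composition and sequence can be illustrated with a short sketch. The peptide below is a hypothetical example written in one-letter amino acid codes, and the function name is an assumption made for this illustration.

```python
from collections import Counter


def amino_acid_composition(sequence: str) -> dict:
    """Return the percentage of each residue (one-letter code) in a sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in sorted(counts.items())}


# HYPOTHETICAL 10-residue peptide, used only for illustration.
peptide = "GAGAKKDEGA"
composition = amino_acid_composition(peptide)
for aa, pct in composition.items():
    print(f"{aa}: {pct:.1f}%")

# The reversed peptide has a different sequence but the same composition.
print(amino_acid_composition(peptide[::-1]) == composition)
```

Two proteins can therefore share a composition and still differ in sequence, which is why both properties are reported.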
Each protein has an
amino group at one end and a carboxyl group at the other end as well as
numerous amino acid side chains, some of which are charged. Therefore each
protein carries a net charge. The net protein charge is strongly influenced by
the pH of the solution. To explain this phenomenon, consider the hypothetical
protein in Figure 2.5. At pH 6.8, this protein has an equal number of positive
and negative charges and so there is no net charge on the protein. As the pH
drops, more H+ ions are available in the solution. These hydrogen ions bind to
negative sites on the amino acids. Therefore, as the pH drops, the protein as a
whole becomes positively charged. Conversely, at a basic pH, the protein
becomes negatively charged. pH 6.8 is called the pI,
or isoelectric point, for this protein; that is, the pH at which there are an
equal number of positive and negative charges. Different proteins have
different numbers of each of the amino acid side chains and therefore have
different isoelectric points. So, in a buffer solution at a particular pH, some
proteins will be positively charged, some proteins will be negatively charged
and some will have no charge.
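The way net charge changes with pH can be sketched numerically. The example below is a minimal illustration that assumes each ionizable group behaves independently and follows the Henderson-Hasselbalch relation; the pKa values are approximate textbook figures chosen for the sketch, and the peptide is hypothetical.

```python
# ASSUMPTION: approximate pKa values; in a folded protein the real values
# shift with the local environment of each group.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}              # basic side chains
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}     # acidic side chains
N_TERM_PKA, C_TERM_PKA = 9.0, 2.0                            # chain ends


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of a polypeptide at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # protonated amino end
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # ionized carboxyl end
    for aa in sequence.upper():
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(sequence: str, lo: float = 0.0, hi: float = 14.0,
                      tol: float = 1e-4) -> float:
    """Find the pH of zero net charge by bisection (charge falls as pH rises)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(sequence, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


peptide = "GAGAKKDEGA"  # HYPOTHETICAL sequence
for ph in (3.0, 7.0, 11.0):
    print(f"pH {ph}: net charge {net_charge(peptide, ph):+.2f}")
print(f"estimated pI: {isoelectric_point(peptide):.2f}")
```

Below the estimated pI the computed charge is positive and above it the charge is negative, matching the behaviour described for the hypothetical protein.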
Separation techniques that are based on charge include ion exchange chromatography, isoelectric focusing and chromatofocusing.
Literally, hydrophobic means fear of water. In aqueous solutions, proteins tend to fold so that their hydrophobic regions are packed next to each other in the interior, away from the polar water molecules of the solution. Polar groups on the amino acids are called hydrophilic (water loving) because they will form hydrogen bonds with water molecules. The number, type and distribution of nonpolar amino acid residues within the protein determine its hydrophobic character.
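One way to put a number on hydrophobic character is to average a per-residue hydropathy score over the sequence. The sketch below assumes the Kyte-Doolittle hydropathy scale, which is one published scale among several and is not specified in the text; the test sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values by one-letter code.
# ASSUMPTION: this particular scale is chosen only for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence: str) -> float:
    """Average hydropathy of a sequence; positive means more hydrophobic."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# HYPOTHETICAL peptides: one rich in nonpolar residues, one rich in charged ones.
print(f"AILVFM: {mean_hydropathy('AILVFM'):+.2f}")
print(f"DEKRNQ: {mean_hydropathy('DEKRNQ'):+.2f}")
```

An average alone ignores how the nonpolar residues are distributed along the chain, which the text notes also matters.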
A separation method that is based on the hydrophobic character of proteins is hydrophobic interaction chromatography.
As the name implies, solubility is the amount of a solute that can be dissolved in a solvent. The 3-D structure of a protein affects its solubility properties. Cytoplasmic proteins have mostly hydrophilic (polar) amino acids on their surface and are therefore water soluble, with more hydrophobic groups located on the interior of the protein, sheltered from the aqueous environment. In contrast, proteins that reside in the lipid environment of the cell membrane have mostly hydrophobic amino acids (non polar) on their exterior surface and are not readily soluble in aqueous solutions.
Each protein has a distinct and characteristic solubility in a defined environment, and any changes to those conditions (buffer or solvent type, pH, ionic strength, temperature, etc.) can cause proteins to lose the property of solubility and precipitate out of solution. The environment can be manipulated to bring about a separation of proteins; for example, the ionic strength of the solution can be increased or decreased, which will change the solubility of some proteins.
Biological Affinity (Function):
Proteins often interact with other molecules in vivo in a specific way; in other words, they have a biological affinity for that molecule. These molecular counterparts, termed ligands, can be used as “bait” to “fish” out the target protein that you want to purify. For example, one such molecular pair is insulin and the insulin receptor. If you want to purify (or catch) the insulin receptor, you could couple many insulin molecules to a solid support and then run an extract (containing the receptor) over that column. The receptor would be “caught” by the insulin bait. These specific interactions are often exploited in protein purification procedures. Affinity chromatography is a very common method for purifying recombinant proteins (proteins produced by genetic engineering). Several histidine residues can be engineered at the end of a polypeptide chain. Since repeated histidines have an affinity for metals, a column of the metal can be used as bait to “catch” the recombinant protein.
Table 2.2: Methods Used for Protein Separation and Analysis
Method: Protein Property Exploited
Ammonium sulfate precipitation: Solubility
Gel filtration (gel permeation): Size or molecular weight
Denaturing gel (SDS-PAGE): Mass (molecular weight)
Isoelectric focusing: pI or charge
Two-dimensional gel electrophoresis: Molecular weight and pI (charge)
Working with proteins
How proteins lose their structure and function.
Although DNA can be isolated and amplified from thousand year old mummies, most proteins are more fragile biomolecules. Therefore, laboratory reagents and storage solutions must provide suitable conditions so that the normal structure and function of the protein is maintained. To understand how the structure of proteins is protected in laboratory solutions, it is necessary to understand how that structure can be destroyed.
· Proteins can denature, or unfold, so that their three-dimensional structure is altered but their primary structure remains intact (Figure 2.7). Many of the interactions that stabilize the 3-D conformation of the protein are relatively weak and are sensitive to various environmental factors including high temperature, low or high pH and high ionic strength. Proteins vary greatly in the degree of their sensitivity to these factors. Sometimes proteins can be renatured, but often the denaturation is irreversible.
· Proteins can also be broken apart by enzymes called proteases, which digest the covalent peptide bonds between amino acids that are responsible for the primary structure. This process is called proteolysis and is irreversible. Cells contain proteases that are found in lysosomes, membrane-bound organelles inside the cell. When cells are disrupted, lysosomes break and release these proteases, which can damage the other proteins in the cell. In the laboratory, it is therefore necessary to minimize the activities of cellular proteases to protect proteins from proteolysis. Methods used to minimize proteolysis include working at lower temperatures and adding protease inhibitors.
· Sulfur groups on cysteines may undergo oxidation to form disulfide bonds that are not normally present. Extra disulfide bonds can form when proteins are removed from their normal environment. Reducing agents such as dithiothreitol or β-mercaptoethanol are often added to prevent undesirable disulfide bond formation.
· Proteins readily adsorb (stick to) surfaces, thereby reducing their available activity. To prevent significant loss, do not store dilute solutions of proteins for prolonged periods of time. Always dilute them right before use.
The composition of the extraction buffer is important for maintaining structure and function of the target protein. To prevent denaturation, the buffering pH is based on the pH stability range of the protein. Other components such as ionic strength, divalent cations (Ca++ and Mg++), or reducing agents (dithiothreitol or β-mercaptoethanol) may be needed to maintain activity. In making the extract, cells are lysed and proteases (enzymes that degrade proteins) are released from their intracellular compartments. To prevent proteases from digesting the target protein, two strategies are commonly followed: 1) The extract is kept cold. The activity of proteolytic enzymes is greatly reduced by cold temperatures. For this reason, the protein purification process is often conducted in cold rooms. At the very least, an effort is made to keep the extract at 4 °C. 2) Protease inhibitors are sometimes added to the mixture to prevent degradation by proteases. The drawback to this strategy is that the inhibitors must eventually be removed, along with other contaminant proteins.
Denaturation of proteins involves the disruption and possible destruction of both the secondary and tertiary structures. Since denaturation reactions are not strong enough to break the peptide bonds, the primary structure (sequence of amino acids) remains the same after a denaturation process. Denaturation disrupts the normal alpha-helix and beta sheets in a protein and uncoils it into a random shape.
Denaturation occurs because the bonding interactions responsible for the secondary structure (hydrogen bonds to amides) and tertiary structure are disrupted. In tertiary structure there are four types of bonding interactions between "side chains": hydrogen bonding, salt bridges, disulfide bonds, and non-polar hydrophobic interactions, any of which may be disrupted. Therefore, a variety of reagents and conditions can cause denaturation. The most common observation in the denaturation process is the precipitation or coagulation of the protein.
The natural or native structures of proteins may be altered, and their biological activity changed or destroyed, by treatment that does not disrupt the primary structure. This denaturation is often done deliberately in the course of separating and purifying proteins. For example, many soluble globular proteins precipitate if the pH of the solution is set at the pI of the protein. Also, addition of trichloroacetic acid or the bis-amide urea (NH2CONH2) is commonly used to effect protein precipitation. Following denaturation, some proteins will return to their native structures under proper conditions; but extreme conditions, such as strong heating, usually cause irreversible change.
Some treatments known to denature proteins are listed in the following table.
Denaturing Action: Mechanism of Operation
Heat: hydrogen bonds are broken by increased translational and vibrational energy.
Ultraviolet Radiation: similar to heat.
Strong Acids or Bases: salt formation; disruption of hydrogen bonds.
Urea Solution: competition for hydrogen bonds.
Some Organic Solvents: change in dielectric constant and hydration of ionic groups.
Agitation: shearing of hydrogen bonds.
Separation and identification of amino acids are operations that must be performed frequently by biochemists. The 20 amino acids present in proteins have similar structures. However, each amino acid is unique in polarity and ionic characteristics. In this experiment, we will use a combination of ion exchange chromatography and paper chromatography to separate and identify the components of an unknown amino acid mixture.
Twenty amino acids are the fundamental building blocks of proteins. Amide bond linkages between α-amino acids construct all proteins found in nature. The amino acids isolated from protein material all have common structural characteristics.
The distinctive physical, chemical and biological properties associated with an amino acid are the result of the R group. There are 20 major amino acids that differ in their R-group. The R-group can be hydrophobic or polar, aromatic or aliphatic, charged or uncharged. The different R-groups are responsible for amino acids having different polarities, solubilities and chromatographic behavior (see below).
The structure and biological function of a protein depend on its amino acid composition. It is a matter of basic importance to understand practical methods used for the separation and identification of the 20 common amino acids.
Amino acids are amphoteric because they contain both an acidic group and a basic group. The COOH group is acidic with a pKa value of 1.7-2.4. Thus at pH values below this, the group exists as COOH while at higher pH values, the group exists as COO-. The NH2 group is basic with a pKa of 9-10.5, so below this it exists as NH3+ while above this pH it exists as NH2. At neutral pH values, both groups are ionized and the amino acid exists in a dipolar form with no net charge. This form is called a zwitterion. The pH at which all the amino acid molecules are in this form is the isoionic point (pI) of the amino acid, where (for amino acids with non-ionizable side chains) pI = (pKa of COOH + pKa of NH3+) / 2.
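For an amino acid with a non-ionizable side chain, the pI is commonly taken as the mean of the two pKa values, and the ionization state of each group at a given pH follows from the Henderson-Hasselbalch relation. The short sketch below works this through for glycine, using pKa values of 2.34 and 9.60; the function names are hypothetical.

```python
def simple_pi(pka_carboxyl: float, pka_amino: float) -> float:
    """pI of an amino acid whose side chain does not ionize."""
    return (pka_carboxyl + pka_amino) / 2.0


def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of a group that has lost its proton at the given pH."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


GLY_PKA_COOH, GLY_PKA_NH3 = 2.34, 9.60  # glycine

pi_gly = simple_pi(GLY_PKA_COOH, GLY_PKA_NH3)
print(f"glycine pI: {pi_gly:.2f}")

# At neutral pH the carboxyl group is almost fully COO- and the amino
# group is almost fully NH3+, i.e. the zwitterion dominates.
print(f"COO- fraction at pH 7: {fraction_deprotonated(7.0, GLY_PKA_COOH):.4f}")
print(f"NH3+ fraction at pH 7: {1.0 - fraction_deprotonated(7.0, GLY_PKA_NH3):.4f}")
```

Amino acids with ionizable side chains need a third pKa and are not covered by this simple average.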
Paper chromatography of amino acids
Paper chromatography can separate different amino acids based on their varying solubilities in two different solvents. In this method, a sample of an amino acid (or mixture of amino acids) is applied as a small spot near one edge of a piece of chromatography paper. The edge of the paper is then placed in a shallow layer of solvent mixture in a chromatography tank.
The solvent mixture contains several components, one of which is usually water and another of which is a more non-polar solvent. As the solvent mixture moves up the paper by capillary action, the water in the mixture binds to the hydrophilic paper (cellulose) and creates a liquid stationary phase of many small water droplets. The non-polar solvent continues to move up the paper forming a liquid mobile phase. Since amino acids have different R-groups, they also have different degrees of solubility in water vs. the non-polar solvent. An amino acid with a polar R-group will be more soluble in water than in the non-polar solvent, so it will dissolve more in the stationary water phase and will move up the paper only slightly. An amino acid with a hydrophobic R-group will be more soluble in the mobile non-polar solvent than in water, so it will continue to move up the paper. Different amino acids will move different distances up the paper depending upon their relative solubilities in the two solvents, allowing for separation of amino acid mixtures.
The movement of amino acids can be defined by a quantity known as Rf value, which measures the movement of an amino acid compared to the movement of the solvent. At the start of the chromatography, the amino acid is spotted at what is called the origin. The chromatography is then performed, and the procedure is stopped before the solvent runs all the way up the paper. The level to which the solvent has risen is called the solvent front. The Rf value of an amino acid is the ratio of the distance traveled by the amino acid from the origin to the distance traveled by the solvent from the origin.
Since the Rf value for an amino acid is constant for a given chromatography system, an unknown amino acid can be identified by comparing its Rf value to those of known amino acids.
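The Rf calculation and the comparison with standards can be sketched as follows. All distances below are hypothetical measurements invented for the illustration, and the matching tolerance of 0.05 is an arbitrary assumption rather than an accepted cutoff.

```python
def rf_value(spot_distance: float, solvent_distance: float) -> float:
    """Ratio of spot travel to solvent-front travel, both measured from the origin."""
    if solvent_distance <= 0 or not 0 <= spot_distance <= solvent_distance:
        raise ValueError("distances must satisfy 0 <= spot <= solvent front")
    return spot_distance / solvent_distance


def identify(unknown_rf: float, standards: dict, tolerance: float = 0.05):
    """Return the standard with the closest Rf, or None if none is within tolerance."""
    name, rf = min(standards.items(), key=lambda item: abs(item[1] - unknown_rf))
    return name if abs(rf - unknown_rf) <= tolerance else None


# HYPOTHETICAL measurements in cm on one chromatogram (solvent front at 10.0 cm).
standards = {
    "glycine": rf_value(2.6, 10.0),
    "alanine": rf_value(3.8, 10.0),
    "leucine": rf_value(7.3, 10.0),
}
unknown_rf = rf_value(3.7, 10.0)
print(f"unknown Rf = {unknown_rf:.2f}, closest standard: {identify(unknown_rf, standards)}")
```

Because Rf values depend on the solvent system and paper, the standards should be run on the same chromatogram as the unknown.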
Certain technical aspects are important when performing paper chromatography. First, it is necessary to keep the applied amino acid spot very small. The spot tends to spread out as it moves up the paper, so starting with a big spot will produce a large smear by the end of the procedure, making it difficult to measure an accurate Rf value. Second, the chromatogram paper must be kept very clean. Fingerprints or other types of contamination will interfere with the chromatography and give poor results. Finally, since amino acids are colorless, something must be done to detect the amino acids at the completion of the chromatography. One of the simplest methods for this involves spraying the paper with ninhydrin. When heated, ninhydrin reacts with amino acids to produce a blue-purple color (yellow in the case of proline), making the amino acid spots visible for analysis.
In this experiment, paper chromatography will be performed using an unknown amino acid along with known standards. Through a comparison of Rf values, the unknown amino acid will be identified.
Spectrophotometry is widely used in biochemistry. Many biochemical compounds absorb light in the ultraviolet (200-400 nm), visible (400-700 nm), or near infrared (700-900 nm) regions of the spectrum. Even if a particular compound does not absorb light itself, it can often be reacted with another compound to produce a light-absorbing substance. Thus spectrophotometry allows for the qualitative and quantitative determination of biochemical compounds. In addition, such techniques are often simple, fast, and clean. Because of their sensitivity, these methods are frequently employed by biochemists.
When white light is passed through a solution containing a colored compound, certain wavelengths of light are absorbed. Which wavelengths (energies) of light are absorbed depends upon the chemical structure of the compound. The absorption of a particular wavelength of light indicates the absorption of photons possessing particular energies, and the absorption of these photons increases various types of molecular energy (electronic, rotational, vibrational, etc.) of the compound. Those wavelengths of light that are not absorbed by the compound are reflected or transmitted, and are responsible for the appearance of the compound. Since different types of compounds have characteristic wavelengths at which they absorb light, it is possible to measure the absorbance of a substance at many different wavelengths to obtain its absorption spectrum. A compound can often be qualitatively identified in this manner.
The preceding discussion applies to both inorganic and biochemical spectrophotometry. However, in biochemistry, only a few important compounds are highly colored and so can be studied directly. Many biochemical molecules absorb UV light, but the amount of absorption is often too small for an accurate analysis if one is dealing with a limited amount of the compound to be analyzed. To circumvent this difficulty, various reactions have been developed in which a particular type of biochemical compound is converted into a highly colored substance. In performing such quantitative determinations, a series of solutions of the compound (or a similar one) are made, the concentrations of which are known. Under defined conditions, the compound in these solutions is reacted with an excess of the color-forming reagents. The absorbances of the solutions are measured, and a standard Beer's law plot showing the variation of absorbance with concentration can be drawn. In addition, a blank is prepared which contains all of the color-forming reagents, but none of the compound being assayed. The absorbance of the blank serves as a control. Then, the color-forming reaction can be performed with the sample where the concentration of the compound is unknown, and a quantitative determination can be made.
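The standard-curve procedure can be sketched in a few lines. The readings below are hypothetical, and fitting a straight line through the origin after subtracting the blank is one simple choice assumed here; it is only appropriate within the range where Beer's law holds.

```python
def fit_through_origin(concs, absorbances):
    """Least-squares slope for A = slope * c (line forced through the origin)."""
    return sum(c * a for c, a in zip(concs, absorbances)) / sum(c * c for c in concs)


# HYPOTHETICAL standards: known concentrations (mg/mL) and raw absorbances.
standard_concs = [0.2, 0.4, 0.6, 0.8, 1.0]
standard_abs = [0.15, 0.25, 0.35, 0.45, 0.55]
blank_abs = 0.05  # reagents only, none of the compound being assayed

# Subtract the blank so that only colour due to the compound is fitted.
corrected = [a - blank_abs for a in standard_abs]
slope = fit_through_origin(standard_concs, corrected)


def concentration(sample_abs: float) -> float:
    """Convert a raw sample absorbance into a concentration via the standard curve."""
    return (sample_abs - blank_abs) / slope


unknown = concentration(0.30)
print(f"slope = {slope:.3f} absorbance units per mg/mL")
print(f"unknown sample: {unknown:.2f} mg/mL")
```

A sample whose absorbance falls outside the range of the standards should be diluted or concentrated and measured again rather than extrapolated.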
Proteins in particular are biochemical compounds that must often be measured. Proteins absorb UV light at 280 nm due to the presence of aromatic amino acids, allowing for a direct determination of protein. Most pure protein solutions containing 1 mg/mL of protein have an absorbance of about 1.0 when the light path is 1 cm.
Protein Molecular Weight Determination
The purpose of this experiment is to determine the molecular weight of a protein using gel filtration and SDS-gel electrophoresis.
I. Gel Filtration
Gel filtration is a chromatographic technique that separates different molecules on the basis of size. It is commonly used during protein purification to remove unwanted proteins from the protein being purified. It can also be used to determine the molecular weight of a protein.
In gel filtration, a dextran, polyacrylamide, or agarose gel is suspended in buffer and packed in a glass or plastic column. The sample to be analyzed is applied to the top of the column and is allowed to run down into the gel. A continuous supply of buffer is then provided at the top of the column, and, as the buffer runs through the column, the components in the sample are carried down the gel and separated. The buffer is collected at the bottom of the column in fractions of constant volume (e.g., 1.0 mL), and all the fractions are analyzed for the presence of the various components in the sample. The separation of the components is caused by cross-linking in the gel which creates pores. Small molecules can penetrate the pores and so are slowed down and retained as they pass down the column. Large molecules cannot penetrate the pores and so run down the column quickly. Gels with different degrees of cross-linking (and therefore different sized pores) are commercially available to separate molecules in different molecular weight ranges. In this experiment, Sephadex G-75 will be used. This gel is a dextran capable of separating proteins with molecular weights between 3000 and 70,000.
For a Sephadex column, the total
volume, Vt, is equal to the sum of the volume of the
gel matrix, the volume inside the gel matrix, and the volume outside the
matrix. The total volume is also, in most
cases, equal to the amount of the buffer required to run a substance through
the column (also known as eluting a substance) when the substance is small
enough to completely penetrate the pores of the gel. Such a substance is
said to be completely included by the gel. For Sephadex G-75, compounds
with molecular weights less than 3000 are completely included. The volume
outside the gel matrix is known as the void volume, Vo. This is the
volume required to elute a substance so large that it cannot penetrate the
pores at all. Such a substance is said to be completely excluded by the
gel. For Sephadex G-75, proteins with molecular weights greater than
70,000 are completely excluded. Compounds with intermediate molecular
sizes that can partially penetrate the pores elute between the void volume and
the total volume, and are said to be partially included by the gel. The
volume of buffer required to elute any given substance is known as the elution
volume, Ve, of the compound. Thus on Sephadex G-
During protein purification, a mixture of many proteins can be subjected to gel filtration, and all proteins that have molecular weights different from the one being purified can be separated out. Thus gel filtration is a powerful technique for purifying a protein. Gel filtration can also be used to determine the molecular weight of a protein. To do this, several proteins with known molecular weights are run on the column and their elution volumes determined. If the elution volumes are then plotted against the log molecular weight of the corresponding proteins, a straight line is obtained for the separation range of the gel being used. If the elution volume of a protein of unknown molecular weight is then found, it can be compared to the calibration curve and the molecular weight determined.
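The calibration step can be sketched as a straight-line fit of elution volume against log molecular weight, followed by reading the unknown off that line. The standards and volumes below are hypothetical numbers made up for the illustration, and the helper names are assumptions.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# HYPOTHETICAL standards: (molecular weight in daltons, elution volume in mL),
# all chosen to lie inside the separation range of the gel.
standards = [(5000, 80.0), (10000, 70.0), (20000, 60.0), (40000, 50.0)]

log_mw = [math.log10(mw) for mw, _ in standards]
elution = [ve for _, ve in standards]
slope, intercept = fit_line(log_mw, elution)


def molecular_weight(elution_volume: float) -> float:
    """Read a molecular weight off the calibration line."""
    return 10 ** ((elution_volume - intercept) / slope)


print(f"Ve = {slope:.1f} * log10(MW) + {intercept:.1f}")
print(f"protein eluting at 65 mL: about {molecular_weight(65.0):,.0f} daltons")
```

The line is only valid between the completely included and completely excluded limits of the gel, so an unknown that elutes at the void volume or the total volume cannot be sized this way.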
Gel filtration has many advantages as a biochemical technique. It is relatively simple to perform, and the mild conditions used tend to prevent denaturation of proteins, unlike some other techniques. The protein that runs off the column can be collected and used for further analysis, so no protein is consumed in gel filtration. However, there are also disadvantages. The column must be carefully prepared to obtain optimal separation. Any cracks or discontinuities in the column will interfere. The size of the sample and the rate of buffer flow must be strictly controlled. If a column is run several times, each run must be done under exactly the same conditions in order to compare the different runs. Finally, some substances stick to Sephadex and do not elute properly.
The second method used to find the molecular weight of a protein will be SDS-gel electrophoresis. When a charged protein is placed in an electric field, it will migrate toward the oppositely charged region, and this is the basis of electrophoresis. In most electrophoresis methods, the molecules being analyzed are placed on a solid support and then allowed to migrate. For proteins, a polyacrylamide gel support is commonly used. The proteins are applied to the gel, and the gel is contained in an electrophoresis cell, which in turn is connected to a power supply which creates a positive electrode and a negative electrode in the cell. Buffer is used to complete the circuit in the cell between the gel and the electrode wires. The buffer in the cell and contained in the gel is important, since its pH determines the charge on the protein molecules.
Usually the determining factor in the separation of the molecules is their charge. The more highly charged the molecule, the faster and farther it will move during electrophoresis. With proteins, however, a second effect is seen, namely the size of the protein. As a protein moves through the gel, it must overcome frictional forces which oppose its movement. The larger the protein, the greater the frictional force. Thus in most gels, the exact rate of movement of a particular protein depends on both its charge and its size.
One type of electrophoresis is SDS-gel electrophoresis. In this method, the proteins to be separated are denatured (usually in urea) and then mixed with the detergent SDS (sodium dodecyl sulfate). SDS binds along the length of the protein, obscuring the protein’s own charges and giving all proteins the same negative charge per unit length. Thus charge is essentially removed as a factor in the separation and size alone becomes important. All proteins will move toward the positive electrode, but large proteins will move more slowly than small proteins. The distance moved decreases approximately linearly with the log of the molecular weight. It is therefore possible to run several proteins of known molecular weight in an SDS-gel electrophoresis procedure, measure their migration distances, and construct a calibration curve. The distance moved by a protein of unknown molecular weight can be compared to the standards and its size determined.
Some proteins are colored and can be seen directly on a gel, but most are colorless. To visualize most proteins, a staining procedure is needed. Coomassie blue is a general protein stain, causing the protein to become visible as blue bands within the gel. Silver stain can detect very small amounts of proteins, causing them to turn brown-black.
The compounds we call proteins exhibit a broad range of physical and biological properties. Two general categories of simple proteins are commonly recognized.
Fibrous proteins. As the name implies, these substances have fiber-like structures, and serve as the chief structural material in various tissues. Corresponding to this structural function, they are relatively insoluble in water and unaffected by moderate changes in temperature and pH. Subgroups within this category include the keratins, collagens and elastins.
Globular proteins. Members of this class serve regulatory, maintenance and catalytic roles in living organisms.
Fibrous proteins such as keratins, collagens and elastins are robust, relatively insoluble, quaternary structured proteins that play important roles in the physical structure of organisms. Secondary structures such as the α-helix and β-sheet take on a dominant role in the architecture and aggregation of keratins. In addition to the intra- and intermolecular hydrogen bonds of these structures, keratins have large amounts of the sulfur-containing amino acid Cys, resulting in disulfide bridges that confer additional strength and rigidity. The more flexible and elastic keratins of hair have fewer interchain disulfide bridges than the keratins in mammalian fingernails, hooves and claws. Keratins have a high proportion of the smallest amino acid, Gly, as well as the next smallest, Ala. In the case of β-sheets, Gly allows sterically-unhindered hydrogen bonding between the amino and carboxyl groups of peptide bonds on adjacent protein chains, facilitating their close alignment and strong binding. Fibrous keratin chains then twist around each other to form helical filaments.
Elastin, the connective tissue protein, also has a high percentage of both glycine and alanine. An insoluble rubber-like protein, elastin confers elasticity on tissues and organs. Elastin is a macromolecular polymer formed from tropoelastin, its soluble precursor. The secondary structure is roughly 30% β-sheets, 20% α-helices and 50% unordered. The elastic properties of natural elastin are attributed to polypentapeptide sequences (Val-Pro-Gly-Val-Gly) in a cross-linked network of randomly coiled chains. Water is believed to act as a "plasticizer", assisting elasticity.
Collagen is a major component of the extracellular matrix that supports most tissues and gives cells structure. It has great tensile strength, and is the main component of fascia, cartilage, ligaments, tendons, bone and skin. Collagen contains more Gly (33%) and proline derivatives (20 to 24%) than do other proteins, but very little Cys. The primary structure of collagen has a frequent repetitive pattern, Gly-Pro-X (where X is a hydroxyl bearing Pro or Lys). This kind of regular repetition and high glycine content is found in only a few other fibrous proteins, such as silk fibroin (75-80% Gly and Ala + 10% Ser). Collagen chains are approximately 1000 units long, and assume an extended left-handed helical conformation due to the influence of proline rings. Three such chains are wound about each other with a right-handed twist forming a rope-like superhelical quaternary structure, stabilized by interchain hydrogen bonding.
Globular proteins are more soluble in aqueous solutions, and are generally more sensitive to temperature and pH change than are their fibrous counterparts; furthermore, they do not have the high glycine content or the repetitious sequences of the fibrous proteins. Globular proteins incorporate a variety of amino acids, many with large side chains and reactive functional groups. The interactions of these substituents, both polar and nonpolar, often cause the protein to fold into the spherical conformations that give this class its name. In contrast to the structural function played by the fibrous proteins, the globular proteins are chemically reactive, serving as enzymes (catalysts), transport agents and regulatory messengers.
Although globular proteins are generally sensitive to denaturation (structural unfolding), some can be remarkably stable. One example is the small enzyme ribonuclease A, which serves to digest RNA in our food by cleaving the ribose phosphate bond. One procedure for purifying it involves treatment with a hot sulfuric acid solution, which denatures and partially decomposes most proteins other than ribonuclease A. This stability reflects the fact that this enzyme functions in the inhospitable environment of the digestive tract. Ribonuclease A was the first enzyme synthesized by R. Bruce Merrifield, demonstrating that biological molecules are simply chemical entities that may be constructed artificially.
Chromatographic methods are applicable not only to separation, identification, and quantitative analysis of amino acid mixtures but also of peptides, proteins, nucleotides, nucleic acids, lipids, and carbohydrates.
Partition Chromatography. When a solute is allowed to distribute itself between equal volumes of two immiscible liquids, the ratio of the concentrations of the solute in the two phases is called the partition coefficient. Amino acids can be partitioned in this manner between two liquid phases, e.g., the pairs phenol-water or n-butanol-water. Each amino acid has a distinctive partition coefficient for any given pair of immiscible solvents.
Partition chromatography is the chromatographic separation of mixtures essentially by the countercurrent-partition principle. The separation is achieved in a huge number of separate partition steps, which take place on microscopic granules of a hydrated insoluble inert substance, such as starch or silica gel, packed in a column about 10 to
The total number of partition steps in the column is so great that the different amino acids in the mixture move down the column at different rates as the moving liquid phase flows through it. The liquid appearing at the bottom of the column, called the eluate, is caught in small fractions with an automatic fraction collector and analyzed by means of the quantitative ninhydrin reaction.
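How many successive partition steps pull two solutes apart can be sketched numerically. The sketch below is an idealized countercurrent-distribution model, not a description of a real column. It assumes equal volumes of the two phases in every step, complete equilibration at each step, and a constant partition coefficient K, taken here as the mobile-phase concentration divided by the stationary-phase concentration. The function names and the two K values are illustrative assumptions.

```python
from math import comb


def countercurrent_distribution(K, n_transfers):
    """Fraction of a solute found in each tube after n_transfers steps.

    Assumes equal phase volumes, so at every equilibration the fraction
    p = K / (K + 1) is carried forward with the mobile phase and the
    fraction q = 1 - p stays behind in the stationary phase.
    The result is a binomial distribution over tube numbers 0..n_transfers.
    """
    p = K / (K + 1.0)
    q = 1.0 - p
    return [comb(n_transfers, r) * p**r * q**(n_transfers - r)
            for r in range(n_transfers + 1)]


def peak_tube(fractions):
    """Index of the tube that holds the largest share of the solute."""
    return max(range(len(fractions)), key=fractions.__getitem__)


# Two hypothetical amino acids with different partition coefficients.
slow = countercurrent_distribution(0.5, 30)
fast = countercurrent_distribution(2.0, 30)
print(peak_tube(slow), peak_tube(fast))
```

In this model the solute with K = 0.5 peaks at tube 10 and the solute with K = 2.0 at tube 20 after 30 transfers. The distance between the peaks grows in proportion to the number of steps, while the band widths grow only with its square root, which is why a very large number of partition steps gives a clean separation.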
Precisely the same principle is involved in filter-paper chromatography of amino acids. The cellulose of the filter-paper is hydrated. As a solvent containing an amino acid mixture ascends in the vertically held paper by capillary action (or descends, in descending chromatography), many microscopic distributions of the amino acids occur between the flowing phase and the stationary water phase bound to the paper fibers. At the end of the process, the different amino acids have moved different distances from the origin. The paper is dried, sprayed with ninhydrin solution, and heated in order to locate the amino acids. In the important refinement of two-dimensional paper chromatography, the mixture of amino acids is chromatographed in one direction; then the paper is dried and subjected to chromatography with a different solvent system in a direction at right angles to the first. A two-dimensional map of the different amino acids results.
Ion-Exchange Chromatography. The partition principle has been further refined in ion-exchange chromatography. In this method solute molecules are sorted out by the differences in their acid-base behavior. A column is filled with granules of synthetic resins: cation exchangers and anion exchangers. Amino acids are usually separated on cation exchange columns.
Amino acids can also be separated by thin-layer chromatography, a refinement of partition chromatography.
Molecular-Exclusion Chromatography. One of the most useful and powerful tools for separating proteins from each other on the basis of size is molecular-exclusion chromatography, also known as gel-filtration or molecular-sieve chromatography. It differs from ion-exchange chromatography, which separates solutes on the basis of their electric charge and acid-base properties. In molecular-exclusion chromatography the mixture of proteins is allowed to flow by gravity down a column packed with beads of an inert, highly hydrated polymeric material that has previously been washed and equilibrated with the buffer alone. Common column materials are Sephadex, the commercial name of a polysaccharide derivative; Bio-Gel, a commercial polyacrylamide derivative; and agarose, another polysaccharide. All of these can be prepared with different degrees of internal porosity. In the column, proteins of different molecular size penetrate into the internal pores of the beads to different degrees and thus travel down the column at different rates. Very large protein molecules cannot enter the pores of the beads; they are said to be excluded and thus remain in the excluded volume of the column, defined as the volume of the aqueous phase outside the beads. On the other hand, very small proteins can enter the pores of the beads freely. Small proteins are retarded by the column while large proteins pass through rapidly, since they cannot enter the hydrated polymer particles. Proteins of intermediate size will be excluded from the beads to a degree that depends on their size. From measurements of the protein concentration in small fractions of the eluate an elution curve can be constructed.
Molecular-exclusion chromatography can also be used to separate mixtures of other kinds of macromolecules, as well as very large biostructures, e.g., viruses, ribosomes, cell nuclei, or even bacteria, simply by using beads or gels with different degrees of internal porosity. The resolving power of molecular-exclusion chromatography is so great that this simple method is now widely used as a way of determining the molecular weight of proteins.
Selective Adsorption. Proteins can be adsorbed to, and selectively eluted from, columns of finely divided, relatively inert materials with a very large surface area in relation to particle size. They include nonpolar substances, e.g., charcoal, and polar substances, e.g., silica gel or alumina. The precise nature of the forces binding the protein to such adsorbents is not known, but presumably van der Waals and hydrophobic interactions prevail with nonpolar adsorbents, whereas ionic attractions and/or hydrogen bonding are the main forces with polar adsorbents.
Affinity Chromatography. Some proteins can be isolated from a very complex mixture and brought to a high degree of purification, often in a single step, by affinity chromatography. This method is based on a biological property of some proteins, namely, their capacity for specific, noncovalent binding of another molecule, called the ligand. For example, some enzymes bind their specific coenzymes very tightly through noncovalent forces. In order to separate such an enzyme from other proteins by affinity chromatography, its specific coenzyme is covalently attached, by means of an appropriate chemical reaction, to a functional group on the surface of large hydrated particles of a porous column material, e.g., the polysaccharide agarose, which otherwise allows protein molecules to pass freely. When a mixture of proteins containing the enzyme to be isolated is added to such a column, the enzyme molecule, which is capable of binding tightly and specifically to the immobilized ligand molecule, adheres to the ligand-derivatized agarose particles, whereas all the other proteins, which lack a specific binding site for that particular ligand molecule, will pass through. This method thus depends on the biological affinity of the protein for its characteristic ligand. The protein specifically bound to the column particles in this manner can then be eluted, often with a solution of the free ligand molecule.
Diagnostic significance of blood and urine chromatographic analysis. Hypo- and hyperaminoacidemia, hypo- and hyperaminoaciduria.
Measuring amino acid levels in the body is important for studying protein metabolism. Under normal conditions blood plasma contains approximately 21.2 mmol/l of amino acids. Hyperaminoacidemia is an increase in the amino acid level of blood plasma. Possible causes include liver disease, diabetes mellitus, acute and chronic kidney failure, and congenital enzymopathies.
Hypoaminoacidemia is observed during protein starvation, fever, kidney disease, and hyperfunction of the adrenal cortex.
Acid-Base Properties of Peptides. Since none of the α-carboxyl groups and none of the α-amino groups that are combined in peptide linkages can ionize in the pH zone 0 to 14, the acid-base behavior of peptides is determined by the free α-amino group of the N-terminal residue, the free α-carboxyl group of the carboxy-terminal (abbreviated C-terminal) residue, and those R groups of the residues in intermediate positions which can ionize. In long polypeptide chains the ionizing R groups necessarily greatly outnumber the terminal ionizing groups.
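The contribution of each ionizable group to the charge of a peptide can be estimated with the Henderson-Hasselbalch relation. The following is a minimal sketch under simplifying assumptions: every group titrates independently, and the pKa values passed in are illustrative placeholders rather than measured values for any particular peptide. The function names are hypothetical.

```python
def net_charge(pH, acidic_pKas, basic_pKas):
    """Approximate net charge of a peptide at a given pH.

    acidic_pKas: groups that are neutral when protonated and carry -1
                 when deprotonated (C-terminal carboxyl, Asp, Glu, ...).
    basic_pKas:  groups that carry +1 when protonated and are neutral
                 when deprotonated (N-terminal amino, Lys, Arg, His, ...).
    """
    negative = sum(-1.0 / (1.0 + 10 ** (pKa - pH)) for pKa in acidic_pKas)
    positive = sum(1.0 / (1.0 + 10 ** (pH - pKa)) for pKa in basic_pKas)
    return negative + positive


def isoelectric_point(acidic_pKas, basic_pKas, tol=1e-6):
    """pH at which the net charge is zero, found by bisection.

    Works because the net charge falls monotonically as pH rises.
    """
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(mid, acidic_pKas, basic_pKas) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical peptide with no ionizable R groups: only the two termini count.
# The pKa values 3.5 and 8.0 are assumed for illustration.
pI = isoelectric_point(acidic_pKas=[3.5], basic_pKas=[8.0])
print(round(pI, 2))
```

For a peptide with only the two terminal groups, the isoelectric point is simply the mean of the two pKa values (5.75 here). Adding ionizable R groups to the lists shifts it, which reflects the statement that in long chains the R groups dominate the acid-base behavior.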
Optical Properties of Peptides. If partial hydrolysis of a protein is carried out under sufficiently mild conditions, the peptides formed are optically active, since they contain only L-amino acid residues. In relatively short peptides, the total observed optical activity is approximately an additive function of the optical activities of the component amino acid residues. However, the optical activity of long polypeptide chains of proteins in their native conformation is much less than additive, a fact of great significance with regard to the secondary and tertiary structure of proteins.
Chemical Properties of Peptides. The free N-terminal amino groups of peptides undergo the same kinds of chemical reactions as those given by the α-amino groups of free amino acids, such as acylation and carbamoylation. The N-terminal amino acid residue of peptides also reacts quantitatively with ninhydrin to form colored derivatives; the ninhydrin reaction is widely used for detection and quantitative estimation of peptides in electrophoretic and chromatographic procedures. Similarly, the C-terminal carboxyl group of a peptide may be esterified or reduced. Moreover, the various R groups of the different amino acid residues found in peptides usually yield the same characteristic reactions as free amino acids.
One widely employed color reaction of peptides and proteins that is not given by free amino acids is the biuret reaction. Treatment of a peptide or protein with Cu2+ and alkali yields a purple Cu2+-peptide complex, which can be measured quantitatively in a spectrophotometer.
The molecular weight of proteins and its determination.
The molecular weights of proteins range from about 5000, which is the lower limit, to 1 million or more.
Many proteins having molecular weights above 40000 contain two or more polypeptide chains. The individual polypeptide chains of most proteins of known structure contain from 100 to 300 amino acid residues. However, some proteins have much longer chains, such as serum albumin (approximately 550 residues) and myosin (approximately 1800 residues).
Determination of the Molecular Weight from Osmotic-Pressure Measurements
When a semipermeable membrane separates a solution of a protein from pure water, the water moves across the membrane into the compartment containing the solute, a process called osmosis. The molecular weight of a protein can be determined from measurements of the osmotic pressure of a solution of a known concentration of protein.
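For a dilute, ideal solution the relation between osmotic pressure and molecular weight is the van't Hoff equation, π = (c/M)RT. The sketch below applies it directly; the function names and the sample numbers are assumptions chosen for illustration. Real protein solutions deviate from ideality, so in practice measurements at several concentrations are extrapolated to zero concentration.

```python
R = 8.314462618  # gas constant, J/(mol*K)


def osmotic_pressure(molecular_weight, conc_g_per_l, temp_k):
    """Osmotic pressure in Pa from the van't Hoff equation (ideal, dilute)."""
    molar_conc_per_m3 = conc_g_per_l / molecular_weight * 1000.0  # mol/m^3
    return molar_conc_per_m3 * R * temp_k


def molecular_weight_from_osmotic_pressure(pressure_pa, conc_g_per_l, temp_k):
    """Molecular weight in g/mol from a measured osmotic pressure in Pa."""
    return conc_g_per_l * 1000.0 * R * temp_k / pressure_pa


# Hypothetical measurement: 10 g/l of protein at 298.15 K gives 360 Pa.
mw = molecular_weight_from_osmotic_pressure(360.0, 10.0, 298.15)
print(round(mw))
```

Because the pressure is inversely proportional to the molecular weight, large proteins give very small osmotic pressures, which is one practical limit of the method.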
Determination of Molecular Weight by Sedimentation Analysis
The ultracentrifuge can yield centrifugal fields exceeding 250 000 times the force of gravity. Such a high centrifugal field causes protein molecules to sediment, overcoming the opposing force of diffusion, which normally keeps them evenly dispersed in solution. If the centrifugal force exerted on protein molecules in a solution greatly exceeds the opposing diffusion force, the molecules will sediment. The rate of sedimentation is observed by optical measurements and depends on the molecular weight of the protein.
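One common way to turn a sedimentation rate into a molecular weight is the Svedberg equation, M = RTs / (D(1 - v̄ρ)), which the text above does not state explicitly. The sketch below assumes that the sedimentation coefficient s, the diffusion coefficient D, the partial specific volume v̄ of the protein and the solvent density ρ have all been measured separately. The numbers in the example are hypothetical.

```python
R = 8.314462618  # gas constant, J/(mol*K)


def svedberg_molecular_weight(s_seconds, d_m2_per_s, partial_specific_volume,
                              solvent_density, temp_k):
    """Molecular weight in g/mol from the Svedberg equation.

    s_seconds:                sedimentation coefficient, in seconds
    d_m2_per_s:               diffusion coefficient, in m^2/s
    partial_specific_volume:  of the protein, in m^3/kg
    solvent_density:          in kg/m^3
    """
    buoyancy = 1.0 - partial_specific_volume * solvent_density
    kg_per_mol = R * temp_k * s_seconds / (d_m2_per_s * buoyancy)
    return kg_per_mol * 1000.0


# Hypothetical protein: s = 4.5e-13 s, D = 6.0e-11 m^2/s,
# partial specific volume 0.73 cm^3/g, water at 20 degrees C.
mw = svedberg_molecular_weight(4.5e-13, 6.0e-11, 0.73e-3, 998.0, 293.15)
print(round(mw))
```

The buoyancy term (1 - v̄ρ) shows why the solvent density matters: a particle sediments only if it is denser than the medium.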
Determining Molecular Weight by Light Scattering
When a beam of light is passed through a protein solution in a darkened room, the path of the beam can be seen because the light is scattered by the protein molecules. This is called the Tyndall effect. From the wavelength of the incident radiation, the intensity of the scattered light, the refractive index of the solvent and solute, and the concentration of the solute, the molecular weight of the protein can be calculated.
Determining Molecular Weight by Molecular-Exclusion Chromatography
Protein mixtures can be sorted out on the basis of molecular weight by molecular-exclusion chromatography. This simple method, which requires no complex equipment, can yield accurate determinations of the molecular weight of a protein. Molecular-exclusion columns measure not the true molecular weight of an unknown protein but its Stokes radius, which is most simply defined as the radius of a perfect unhydrated sphere having the same rate of passage through the column as the unknown protein in question. If the unknown and marker proteins are spherical, the method yields the molecular weight directly.
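In practice the column is calibrated with marker proteins of known molecular weight, and the unknown is read off the calibration line. A minimal sketch follows, assuming that the logarithm of the molecular weight falls linearly with elution volume over the working range of the column. The marker values are invented for illustration and do not correspond to real proteins, and the function names are hypothetical.

```python
import math


def fit_calibration(markers):
    """Least-squares line log10(MW) = intercept + slope * elution_volume.

    markers: list of (elution_volume_ml, molecular_weight) pairs.
    Returns (slope, intercept).
    """
    xs = [ve for ve, _ in markers]
    ys = [math.log10(mw) for _, mw in markers]
    n = len(markers)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_molecular_weight(elution_volume_ml, slope, intercept):
    """Molecular weight predicted by the calibration line."""
    return 10 ** (intercept + slope * elution_volume_ml)


# Invented marker data: larger proteins elute earlier.
markers = [(50.0, 100000.0), (75.0, 31623.0), (100.0, 10000.0)]
slope, intercept = fit_calibration(markers)
unknown_mw = estimate_molecular_weight(60.0, slope, intercept)
print(round(unknown_mw))
```

The negative slope expresses the behavior described above: large proteins are excluded from the beads and leave the column first. The estimate is only as good as the assumption that unknown and markers have similar shapes.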
Protein Solubility. Factors Determining Solubility.
Proteins in solution show profound changes in solubility as a function of (1) pH, (2) ionic strength, (3) the dielectric properties of the solvent (hydrated shell), and (4) temperature.
The solubility of most globular proteins is profoundly influenced by the pH of the system because the net electric charge of a protein molecule depends on pH. When the protein molecule has no net electric charge, there is no electrostatic repulsion between neighboring protein molecules and they tend to coalesce and precipitate. When all the protein molecules have a net charge of the same sign, they repel each other, preventing coalescence of single molecules into insoluble aggregates.
The electric charge of proteins, and hence their hydration shell and solubility, also depends on the ionic composition of the medium, since proteins can bind certain anions and/or cations.
Methods of protein precipitation.
There are two methods of protein precipitation: reversible (salting-out) and irreversible (denaturation).
Reversible coagulation of proteins. Salting-in and Salting-out of Proteins.
Reversible coagulation of proteins is precipitation without loss of native structure. When suitable conditions are restored (for example, by adding solvent), the proteins can be dissolved again.
Neutral salts have pronounced effects on the solubility of globular proteins. In low concentration, salts increase the solubility of many proteins, a phenomenon called salting-in. Salts of divalent ions, such as MgCl2, are far more effective at salting-in than salts of monovalent ions, such as NaCl and KCl. The ability of neutral salts to influence the solubility of proteins is a function of their ionic strength, a measure of both the concentration and the number of electric charges on the cations and anions contributed by the salt. Salting-in effects are caused by changes in the tendency of dissociable R groups on the protein to ionize.
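Ionic strength is conventionally calculated as I = ½ Σ c·z², summed over all ions, where c is the molar concentration of an ion and z its charge. The short sketch below, with an assumed function name and example concentrations, shows why a salt of a divalent ion contributes more ionic strength than a salt of monovalent ions at the same molarity. It assumes complete dissociation of the salt.

```python
def ionic_strength(ions):
    """Ionic strength in mol/l.

    ions: list of (molar_concentration, charge) pairs, one per ion species.
    """
    return 0.5 * sum(conc * charge ** 2 for conc, charge in ions)


# 0.1 M NaCl dissociates into 0.1 M Na+ and 0.1 M Cl-.
nacl = ionic_strength([(0.1, +1), (0.1, -1)])

# 0.1 M MgCl2 dissociates into 0.1 M Mg2+ and 0.2 M Cl-.
mgcl2 = ionic_strength([(0.1, +2), (0.2, -1)])

print(nacl, mgcl2)
```

At equal molarity MgCl2 gives three times the ionic strength of NaCl (0.3 against 0.1 mol/l), because the charge enters as its square.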
On the other hand, as the ionic strength is increased further, the solubility of a protein begins to decrease. At sufficiently high ionic strength a protein may be almost completely precipitated from solution, an effect called salting-out. The physicochemical basis of salting-out is rather complex; one factor is that the high concentration of salt may remove water of hydration from the protein molecules, thus reducing their solubility, but other factors are also involved. Proteins precipitated by salting-out retain their native conformation and can be dissolved again, usually without denaturation. Ammonium sulfate is preferred for salting out proteins because it is so soluble in water that very high ionic strengths can be attained.
Separation, Purification and Characterization of Proteins
Each type of cell may contain thousands of different proteins. The isolation in pure form of a given protein from a given cell or tissue may appear to be a difficult task, particularly since any given protein may exist in only a very low concentration in the cell, along with thousands of others.
Separation Procedures Based on Molecular Size.
Dialysis and Ultrafiltration. Globular proteins in solution can easily be separated from low-molecular-weight solutes by dialysis, which utilizes a semipermeable membrane to retain protein molecules and allow small solute molecules and water to pass through.
Another way of separating proteins from small molecules is by ultrafiltration, in which pressure or centrifugal force is used to filter the aqueous medium and small solute molecules through a semipermeable membrane, which retains the protein molecules. Cellophane and other synthetic materials are commonly used as the membrane in such procedures.
Density-Gradient (Zonal) Centrifugation. Because proteins in solution tend to sediment at high centrifugal fields, thus overcoming the opposing tendency of diffusion, it is possible to separate mixtures of proteins by centrifugal methods.
Molecular-Exclusion Chromatography. One of the most useful and powerful tools for separating proteins from each other on the basis of size is molecular-exclusion chromatography, also known as gel-filtration. In molecular-exclusion chromatography the mixture of proteins, dissolved in a suitable buffer, is allowed to flow by gravity down a column packed with beads of an inert, highly hydrated polymeric material. A common column material is Sephadex, the commercial name of a polysaccharide derivative, which can be prepared with different degrees of internal porosity. In the column, proteins of different molecular size penetrate into the internal pores of the beads to different degrees and thus travel down the column at different rates. Very large protein molecules cannot enter the pores of the beads, whereas very small proteins can enter them freely. Small proteins are retarded by the column while large proteins pass through rapidly, since they cannot enter the polymer particles. Proteins of intermediate size will be excluded from the beads to a degree that depends on their size. From measurements of the protein concentration in small fractions of the eluate an elution curve can be constructed.
Separation Procedures Based on Solubility Differences.
Isoelectric Precipitation. The solubility of most globular proteins is profoundly influenced by the pH of the system. Since different proteins have different isoelectric pH values, because their content of amino acids with ionizable R groups differs, they can often be separated from each other by isoelectric precipitation. When the pH of a protein mixture is adjusted to the isoelectric pH of one of its components, much or all of that component will precipitate, leaving behind in solution proteins with isoelectric pH values above or below that pH. The precipitated isoelectric protein remains in its native conformation and can be redissolved in a medium having an appropriate pH and salt concentration.
Salting-out of Proteins. A protein may be almost completely precipitated from solution by adding neutral salts to it. This effect is called salting-out. The physicochemical basis of salting-out is rather complex; one factor is that the high concentration of salt may remove water of hydration from the protein molecules, thus reducing their solubility.
Solvent Fractionation. The addition of water-miscible neutral organic solvents, particularly ethanol or acetone, decreases the solubility of most globular proteins in water to such an extent that they precipitate out of solution. Quantitative study of this effect shows that protein solubility at a fixed pH and ionic strength is a function of the dielectric constant of the medium. Since ethanol has a lower dielectric constant than water, its addition to an aqueous protein solution increases the attractive force between opposite charges, thus decreasing the degree of ionization of the R groups of the protein. As a result, the protein molecules tend to aggregate and precipitate. Mixtures of proteins can be separated on the basis of quantitative differences in their solubility in cold ethanol-water or acetone-water mixtures. A disadvantage of this method is that since such solvents can denature proteins at higher temperatures, the temperature must be kept rather low.
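The effect of the dielectric constant on the attraction between opposite charges follows from Coulomb's law, F = q1·q2 / (4π·ε0·εr·r²). The sketch below compares the force between two unit charges in water and in ethanol. The dielectric constants are approximate room-temperature values taken as assumptions, the separation distance is arbitrary, and treating the solvent as a uniform dielectric at molecular distances is a rough simplification.

```python
import math

EPSILON_0 = 8.8541878128e-12        # vacuum permittivity, F/m
ELEMENTARY_CHARGE = 1.602176634e-19  # C

# Approximate relative permittivities near room temperature (assumed values).
EPS_WATER = 78.5
EPS_ETHANOL = 24.3


def coulomb_force(q1, q2, distance_m, relative_permittivity):
    """Force in newtons between two point charges in a uniform dielectric.

    A negative result means attraction (charges of opposite sign).
    """
    return q1 * q2 / (4.0 * math.pi * EPSILON_0 * relative_permittivity
                      * distance_m ** 2)


distance = 0.5e-9  # arbitrary separation of 0.5 nm
f_water = coulomb_force(ELEMENTARY_CHARGE, -ELEMENTARY_CHARGE, distance, EPS_WATER)
f_ethanol = coulomb_force(ELEMENTARY_CHARGE, -ELEMENTARY_CHARGE, distance, EPS_ETHANOL)
print(f_ethanol / f_water)
```

With these assumed constants the attraction in pure ethanol is roughly three times stronger than in water at the same distance. Ethanol-water mixtures fall in between, so each addition of ethanol strengthens the attraction between oppositely charged groups and favors aggregation.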
Effect of Temperature on Solubility of Proteins.
limited range, from about 0 to about
Separation Procedures Based on Electric Charge.
Electrophoretic Methods. Electrophoresis can separate a protein mixture on the basis of both electric charge and molecular size. For this purpose, special paper or gels of potato starch or polyacrylamide are commonly used. By this technique the protein components of blood plasma can be resolved into 15 or more bands.
Ion-Exchange Chromatography. Columns of ion-exchange resins are successfully applied to the separation of protein mixtures. The most commonly used materials for chromatography of proteins are synthetically prepared derivatives of cellulose. Protein mixtures are resolved and the individual components successively eluted from DEAE-cellulose columns by passing a series of buffers of decreasing pH or a series of salt solutions of increasing ionic strength, which have the effect of decreasing the binding of anionic proteins. The protein concentration in the eluate, which is collected in small fractions, is estimated optically by its capacity to absorb light in the ultraviolet region.
Separation of Proteins by Selective Adsorption.
Proteins can be adsorbed to, and selectively eluted from, columns of finely divided, relatively inert materials with a very large surface area in relation to particle size. They include nonpolar substances, e.g., charcoal, and polar substances, e.g., silica gel or alumina. The precise nature of the forces binding the protein to such adsorbents is not known, but presumably van der Waals and hydrophobic interactions prevail with nonpolar adsorbents, whereas ionic attractions and/or hydrogen bonding are the main forces with polar adsorbents.
Separations Based on Ligand Specificity: Affinity Chromatography.
This method is based on a biological property of some proteins, namely, their capacity for specific, noncovalent binding of another molecule, called the ligand. For example, some enzymes bind their specific coenzymes very tightly through noncovalent forces. In order to separate such an enzyme from other proteins by affinity chromatography, its specific coenzyme is covalently attached, by means of an appropriate chemical reaction, to a functional group on the surface of large hydrated particles of a porous column material, which otherwise allows protein molecules to pass freely. When a mixture of proteins containing the enzyme to be isolated is added to such a column, the enzyme molecule, which is capable of binding tightly and specifically to the immobilized ligand molecule, adheres to the ligand-derivatized agarose particles, whereas all the other proteins, which lack a specific binding site for that particular ligand molecule, will pass through.
QUALITATIVE REACTIONS ON THE PROTEINS AND AMINO ACIDS
Biuret test. The protein is warmed gently with a 10 % solution of sodium hydroxide and then a drop of very dilute copper sulphate solution is added. The formation of a reddish-violet colour indicates the presence of the peptide link, -CO-NH-. The test is given by all proteins, peptones and peptides. Its name is derived from the fact that the test is also positive for the compound biuret, H2N-CO-NH-CO-NH2, obtained from urea by heating.
It should be noted that dipeptides do not give the biuret test, while all other polypeptides do so. Hence biuret test is important to know whether hydrolysis of proteins is complete or not. If the biuret test is negative, hydrolysis is complete, at least to the dipeptide stage.
Xanthoproteic test. On treatment with concentrated nitric acid, certain proteins give a yellow colour. This is the same yellow colour that forms on the skin when it comes in contact with concentrated nitric acid. The test is given only by proteins containing at least one aromatic amino acid, such as tryptophan, phenylalanine, and tyrosine, which are nitrated during treatment with concentrated nitric acid.
Millon's test. When Millon's reagent (a solution of mercuric and mercurous nitrates in nitric acid containing a little nitrous acid) is added to a protein and the solution is heated, a red precipitate or colour forms. The test is given by proteins containing tyrosine; the hydroxyphenyl group of tyrosine is the structure responsible for it. Non-protein material containing a phenolic group also gives the test.
Foll reaction. This reaction reveals the sulfur-containing amino acids (cysteine, cystine). Treatment of the sulfur-containing amino acids with a lead salt and alkali yields a black precipitate.
Adamkiewicz reaction. This reaction detects the amino acid tryptophan, which contains an indole ring. The addition of concentrated acetic and sulfuric acids to a solution of tryptophan results in the formation of a red-violet ring at the boundary between the two liquids.
Ninhydrin test. The ninhydrin colour reaction is the most commonly used test for the detection of amino acids. This is an extremely delicate test, to which proteins, their hydrolytic products, and α-amino acids react. Although the test is positive for all free amino groups in amino acids, peptides, or proteins, it is much weaker for peptides or proteins because not as many free groups are available as in amino acids. For certain amino acids the test is positive in dilutions as high as 1 part in 100,000 parts of water.
When ninhydrin is added to a protein solution and the mixture is heated to boiling, a blue to violet colour appears on cooling. The colour is due to the formation of a complex compound.
The test is also given by ammonia, ammonium salts, and certain amines. Ninhydrin is also used as a reagent for the quantitative determination of free carboxyl groups in solutions of amino acids.
Nitroprusside test. Proteins containing free -SH groups (of cysteine) give a reddish colour with sodium nitroprusside in ammoniacal solution.
Proteins are polypeptides that contain more than 50 amino acid units. The dividing line between a polypeptide and a protein is arbitrary. The important point is that proteins are polymers containing a large number of amino acid units linked by peptide bonds. Polypeptides are shorter chains of amino acids. Some proteins have molecular masses in the millions. Some proteins also contain more than one polypeptide chain.
To aid us in describing protein structure, we will consider four levels of substructure: primary, secondary, tertiary, and quaternary. Even though we consider these structure levels one by one, remember that it is the combination of all four levels of structure that controls protein function.