PHYSICAL-CHEMICAL PROPERTIES OF PROTEINS; METHODS OF THEIR
DETERMINATION, PRECIPITATION REACTIONS
Protein is an important nutrient that builds muscles and bones and
provides energy. Protein can help with weight control because it helps you feel
full and satisfied from your meals.
The healthiest proteins are the leanest. This means that they have the
least fat and calories. The best protein choices are fish or shellfish,
skinless chicken or turkey, low-fat or fat-free dairy (skim milk, low-fat cheese),
and egg whites or egg substitute. The best red meats are the leanest cuts (loin
and tenderloin). Other healthy options are beans, legumes (lentils and peanut
butter), and soy foods such as tofu or soymilk.
Protein is an important part of every diet and is found in many
different foods. Lean protein, the best kind, can be found in fish, skinless
chicken and turkey, pork tenderloin and certain cuts of beef, like the top
round. Low-fat dairy products like milk, yogurt, ricotta and other cheeses
supply both protein and calcium.
• Protein is crucial for tissue
repair, building and preserving muscle, and making important enzymes and
hormones.
• Lean meats and dairy contribute
valuable minerals like calcium, iron, selenium and zinc. These are not only
essential for building bones, and forming and maintaining nerve function, but
also for fighting cancer, forming blood cells and keeping immune systems
robust.
The word protein was first coined in 1838 to emphasize the importance of
this class of molecules. The word is derived from the Greek word proteios which
means "of the first rank".
This chapter will provide a brief background into the structure of
proteins and how this structure can determine the function and activity of
proteins. It is not intended to substitute for the more detailed information
provided in a biochemistry or cell biology course.
Proteins are the major components of living organisms and perform a wide
range of essential functions in cells. While DNA is the information molecule,
it is proteins that do the work of all cells - microbial, plant, animal.
Proteins regulate metabolic activity, catalyze biochemical reactions and
maintain structural integrity of cells and organisms. Proteins can be
classified in a variety of ways, including their biological function (Table
2.1).
How does one group of molecules perform such a diverse set of functions?
The answer is found in the wide variety of possible structures for proteins.
In the English language, there are an enormous number of words with
varied meaning that can be formed using only 26 letters as building blocks. A
similar situation exists for proteins where an incredible variety of proteins
can be formed using 20 different building blocks called amino acids. Each of
these amino acid building blocks has a different chemical structure and
different properties.
Each protein has a unique amino acid sequence that is genetically
determined by the order of nucleotide bases in the DNA, the genetic code. Since
each protein has different numbers and kinds of the twenty available amino
acids, each protein has a unique chemical composition and structure. For
example, two proteins may each have 37 amino acids but if the sequence of the
amino acids is different, then the protein will be different. How many different
proteins can be formed from the twenty different amino acids? Consider a
protein containing 100 amino acids linked into one chain. Since each
of the 100 positions of this chain could be filled with any one of the 20 amino
acids, there are 20^100 (roughly 10^130) possible sequences, more than enough to
account for the 90-100 million different proteins that may be found in higher organisms.
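The counting argument above (20 choices at each of 100 positions, that is, 20 raised to the power 100) can be checked with a minimal sketch. The function name `possible_sequences` is hypothetical and used only for illustration.

```python
# Count the distinct sequences for a chain of a given length, assuming every
# position can hold any one of the 20 standard amino acids.
AMINO_ACID_KINDS = 20


def possible_sequences(chain_length: int) -> int:
    """Return 20 ** chain_length as an exact integer."""
    if chain_length < 0:
        raise ValueError("chain_length must not be negative")
    return AMINO_ACID_KINDS ** chain_length


count = possible_sequences(100)
# The number is too long to read comfortably, so report how many digits it has.
print(len(str(count)))  # 131
```

A number with 131 digits is vastly larger than the 90-100 million proteins mentioned above, so sequence variety is never the limiting factor.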
A change in just one amino acid can change the structure and function of
a protein. For example, sickle cell anemia is a disease that results from an
altered structure of the protein hemoglobin, resulting from a change of the
sixth amino acid of the beta chain from glutamic acid to valine. (This is the result of a single
base pair change at the DNA level.) This single amino acid change is enough to
change the conformation of hemoglobin so that this protein clumps at lower
oxygen concentrations and causes the characteristic sickle shaped red blood
cells of the disease.
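The single substitution described above can be sketched as a simple string operation. The eight-residue fragment below is the start of the mature human beta-globin chain in one-letter codes; it is brought in here for illustration and is not given in this chapter, and the helper name `substitute` is hypothetical.

```python
def substitute(sequence: str, position: int, new_residue: str) -> str:
    """Return a copy of `sequence` with the residue at a 1-based position replaced."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    index = position - 1  # convert the 1-based position to a 0-based index
    return sequence[:index] + new_residue + sequence[index + 1:]


# Illustrative fragment: position 6 is glutamic acid (E).
NORMAL_BETA_START = "VHLTPEEK"

# Replace glutamic acid (E) at position 6 with valine (V).
sickle_beta_start = substitute(NORMAL_BETA_START, 6, "V")
print(sickle_beta_start)  # VHLTPVEK
```

Only one letter differs between the two strings, yet in the real protein that difference is enough to change how hemoglobin behaves at low oxygen concentrations.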
The unique structure and chemical composition of each protein is important
for its function; it is also important for separating proteins in a protein
purification strategy. Each of these differences in properties can be used as a
basis for the separation methods that are used to purify proteins. Because
these differences in protein properties originate from differences in the
chemical structure of the amino acids that make up the protein, we need to
explore the structure of amino acids and their contribution to protein
properties in more detail.
Amino acids are composed of carbon, hydrogen, oxygen, and nitrogen. Two
amino acids, cysteine and methionine, also contain sulfur. The generic form of
an amino acid is shown in Figure 2.1. Atoms of these elements are arranged into
20 kinds of amino acids that are commonly found in proteins. All proteins in
all species, from bacteria to humans, are constructed from the same set of
twenty amino acids. All amino acids have an amino group (NH2) and a carboxyl
group (COOH) bonded to the same carbon atom, known as the alpha carbon. Amino acids
differ in the side chain or R group that is bonded to the alpha carbon. (Figure
2.2) Glycine, the simplest amino acid, has a single hydrogen atom as its R group;
alanine has a methyl (-CH3) group.
The chemical composition of the
unique R groups is responsible for the important characteristics of amino acids
such as chemical reactivity, ionic charge and relative hydrophobicity. In
Figure 2.2, the amino acids are grouped according to their polarity and charge.
They are divided into four categories, those with polar uncharged R groups,
those with apolar (nonpolar) R groups, acidic (charged) and basic (charged)
groups.
The polar amino acids are soluble
in water because their R groups can form hydrogen bonds with water. For
example, serine, threonine and tyrosine all have hydroxyl groups (OH). Amino
acids that carry a net negative charge at neutral pH contain a second carboxyl
group. These are the acidic amino acids, aspartic acid and glutamic acid, also
called aspartate and glutamate, respectively. The basic amino acids have R
groups with a net positive charge at pH 7.0. These include lysine, arginine and
histidine. There are eight amino acids with nonpolar R groups. As a group,
these amino acids are less soluble in water than the polar amino acids. If a protein
has a greater percentage of nonpolar R groups, the protein will be more
hydrophobic (water hating) in character.
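As a rough illustration of the last point, the share of nonpolar residues in a sequence can be computed directly. The grouping below (eight nonpolar, seven polar uncharged, two acidic, three basic, written in one-letter codes) is one common textbook classification and is an assumption here, since Figure 2.2 is not reproduced; the function name `nonpolar_fraction` is hypothetical.

```python
# One common four-way grouping of the 20 amino acids (an assumed classification).
NONPOLAR = set("AVLIPFWM")        # Ala, Val, Leu, Ile, Pro, Phe, Trp, Met
POLAR_UNCHARGED = set("GSTCYNQ")  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
ACIDIC = set("DE")                # Asp, Glu (negative at neutral pH)
BASIC = set("KRH")                # Lys, Arg, His (positive at neutral pH)


def nonpolar_fraction(sequence: str) -> float:
    """Return the fraction of residues in `sequence` that have nonpolar R groups."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    nonpolar_count = sum(1 for residue in residues if residue in NONPOLAR)
    return nonpolar_count / len(residues)


print(nonpolar_fraction("AVLK"))  # 0.75
```

A higher fraction suggests a more hydrophobic protein, although where the nonpolar residues sit in the folded structure matters as well.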
A protein is formed by amino acid
subunits linked together in a chain. The bond between two amino acids is formed
by the removal of an H2O molecule from two different amino acids, forming a
dipeptide. (Figure 2.3) The bond between two amino acids is called a peptide
bond and the chain of amino acids is called a peptide (20 amino acids or
fewer) or a polypeptide.
Each protein consists of one or more unique polypeptide chains. Most
proteins do not remain as linear sequences of amino acids; rather, the
polypeptide chain undergoes a folding process. The process of protein folding
is driven by thermodynamic considerations. This means that each protein folds
into a configuration that is the most stable for its particular chemical
structure and its particular environment. The final shape will vary but the
majority of proteins assume a globular configuration. Many proteins such as
myoglobin consist of a single polypeptide chain; others contain two or more
chains. For example, hemoglobin is made up of two chains of one type (amino
acid sequence) and two of another type.
Although the primary amino acid sequence determines how the protein
folds, this process is not completely understood. Although certain amino acid
sequences can be identified as more likely to form a particular conformation,
it is still not possible to completely predict how a protein will fold based on
its amino acid sequence alone, and this is an active area of biochemical
research.
The final folded 3-D arrangement of the protein is referred to as its
conformation. In order to maintain their function, proteins must maintain this
conformation. To describe this complex conformation, scientists describe four
levels of organization: primary, secondary, tertiary, and quaternary (Figure
2.4). The overall conformation of a protein is the combination of its primary,
secondary, tertiary and quaternary elements.
Four levels of Organization of Protein Structure:
• Primary Structure refers to the
linear sequence of amino acids that make up the polypeptide chain. This
sequence is determined by the genetic code, the sequence of nucleotide bases in
the DNA. The bond between two amino acids is a peptide bond. This bond is
formed by the removal of an H2O molecule from two different amino acids, forming
a dipeptide. The sequence of amino acids determines the positioning of the
different R groups relative to each other. This positioning therefore determines
the way that the protein folds and the final structure of the molecule.
• The secondary structure of
protein molecules refers to the formation of a regular pattern of twists or
kinks of the polypeptide chain. The regularity is due to hydrogen bonds forming
between the atoms of the amino acid backbone of the polypeptide chain. The two
most common types of secondary structure are called the alpha helix and the β
pleated sheet. (Figure 2.4)
• Tertiary structure refers to the
three dimensional globular structure formed by bending and twisting of the
polypeptide chain. This process often means that the linear sequence of amino
acids is folded into a compact globular structure. The folding of the
polypeptide chain is stabilized by multiple weak, noncovalent interactions.
These interactions include:
o Hydrogen bonds that form
when a hydrogen atom is shared by two other atoms.
o Electrostatic
interactions that occur between charged amino acid side chains. Electrostatic
interactions are attractions between positive and negative sites on
macromolecules.
o Hydrophobic interactions:
During folding of the polypeptide chain, amino acids with polar (water
soluble) side chains are often found on the surface of the molecule while amino acids
with nonpolar (water insoluble) side chains are buried in the interior. This
means that the folded protein is soluble in water or aqueous solutions.
Covalent bonds may also contribute to tertiary structure. The amino
acid cysteine has an SH group as part of its R group and therefore a
disulfide bond (S-S) can form with a nearby cysteine. For example, insulin
has two polypeptide chains that are joined by two disulfide bonds.
• Quaternary structure refers to
the fact that some proteins contain more than one polypeptide chain, adding an
additional level of structural organization: the association of the polypeptide
chains. Each polypeptide chain in the protein is called a subunit. The subunits
can be the same polypeptide chain or different ones. For example, the enzyme
β-galactosidase is a tetramer, meaning that it is composed of four
subunits, and, in this case, the subunits are identical - each polypeptide
chain has the same sequence of amino acids. Hemoglobin, the oxygen carrying
protein in the blood, is also a tetramer but it is composed of two polypeptide
chains of one type (141 amino acids) and two of a different type (146 amino
acids). In chemical shorthand, this is referred to as α2β2. For some
proteins, quaternary structure is required for full activity (function) of the
protein.
Some proteins combine with other
kinds of molecules such as carbohydrates, lipids, iron and other metals, or
nucleic acids, to form glycoproteins, lipoproteins, hemoproteins,
metalloproteins, and nucleoproteins respectively. The presence of these other
biomolecules affects the protein properties. For example, a protein that is
conjugated to carbohydrate, called a glycoprotein, would be more hydrophilic in
character while a protein conjugated to a lipid would be more hydrophobic in
character.
Proteins are typically characterized by their size (molecular weight)
and shape, amino acid composition and sequence, isoelectric point (pI),
hydrophobicity, and biological affinity. Differences in these properties can be
used as the basis for separation methods in a purification strategy (Chapter
4). The chemical composition of the unique R groups is responsible for the
important characteristics of amino acids, such as chemical reactivity, ionic charge and
relative hydrophobicity. Therefore protein properties relate back to the number and
type of amino acids that make up the protein.
Size of proteins is usually measured in molecular weight (mass) although
occasionally the length or diameter of a protein is given in Angstroms. The
molecular weight of a protein is the mass of one molecule, usually
expressed in units called daltons (numerically equal to the mass of one mole in
grams). One dalton is approximately the mass of one proton
or neutron. The molecular weight can be estimated by a number of different
methods including electrophoresis, gel filtration, and more recently by mass
spectrometry. The molecular weight of proteins varies over a wide range. For
example, insulin is 5,700 daltons while snail hemocyanin is 6,700,000 daltons.
The average molecular weight of a protein is between 40,000 and 50,000 daltons.
Molecular weights are commonly reported in kilodaltons (kD), a unit of mass
equal to 1000 daltons. Most proteins have a mass between 10 and 100 kD. A small
protein consists of about 50 amino acids while larger proteins may contain 3,000
amino acids or more. One of the larger amino acid chains is myosin, found in
muscles, which has 1,750 amino acids.
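The link between chain length and mass can be sketched with a common rule of thumb: an average amino acid residue weighs roughly 110 daltons. That average is an assumption brought in for illustration, not a figure from this chapter, and the function name `estimate_kd` is hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons (assumed value).
AVERAGE_RESIDUE_MASS_DA = 110.0
DALTONS_PER_KD = 1000.0


def estimate_kd(residue_count: int) -> float:
    """Estimate a protein's mass in kilodaltons from its number of residues."""
    if residue_count <= 0:
        raise ValueError("residue_count must be positive")
    return residue_count * AVERAGE_RESIDUE_MASS_DA / DALTONS_PER_KD


print(estimate_kd(50))    # 5.5, a small protein of about 50 residues
print(estimate_kd(1750))  # 192.5, the 1,750-residue chain mentioned above
```

The estimate for 50 residues is close to the 5,700 daltons quoted for insulin, which suggests the rule of thumb is adequate for a first guess; an exact mass requires the actual sequence.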
Separation methods that are based on size and shape include gel
filtration chromatography (size exclusion chromatography) and polyacrylamide
gel electrophoresis.
The amino acid composition is the percentage of the constituent amino
acids in a particular protein while the sequence is the order in which the
amino acids are arranged.
Each protein has an amino group at one end and a carboxyl group at the
other end as well as numerous amino acid side chains, some of which are
charged. Therefore each protein carries a net charge. The net protein charge is
strongly influenced by the pH of the solution. To explain this phenomenon,
consider the hypothetical protein in Figure 2.5. At pH 6.8, this protein has an
equal number of positive and negative charges and so there is no net charge on
the protein. As the pH drops, more H+ ions are available in the solution. These
hydrogen ions bind to negative sites on the amino acids. Therefore, as the pH
drops, the protein as a whole becomes positively charged. Conversely, at a
basic pH, the protein becomes negatively charged. pH 6.8 is called the pI, or
isoelectric point, for this protein; that is, the pH at which there are an
equal number of positive and negative charges. Different proteins have
different numbers of each of the amino acid side chains and therefore have
different isoelectric points. So, in a buffer solution at a particular pH, some
proteins will be positively charged, some proteins will be negatively charged
and some will have no charge.
Separation techniques that are based on charge include ion exchange
chromatography, isoelectric focusing and chromatofocusing.
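The dependence of net charge on pH can be sketched with the Henderson-Hasselbalch relationship applied to each ionizable group. The pKa values below are approximate textbook figures and are assumptions (real values shift with the local environment inside a folded protein); the function names and the example sequence are hypothetical.

```python
# Approximate pKa values for ionizable groups (assumed, typical textbook figures).
N_TERMINUS_PKA = 9.0
C_TERMINUS_PKA = 3.1
BASIC_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}              # positive when protonated
ACIDIC_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}    # negative when deprotonated


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of a polypeptide (one-letter codes) at a given pH."""
    # Fraction of the amino terminus carrying +1.
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERMINUS_PKA))
    # Fraction of the carboxyl terminus carrying -1.
    charge -= 1.0 / (1.0 + 10 ** (C_TERMINUS_PKA - ph))
    for residue in sequence.upper():
        if residue in BASIC_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - BASIC_PKA[residue]))
        elif residue in ACIDIC_PKA:
            charge -= 1.0 / (1.0 + 10 ** (ACIDIC_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """Find the pH where the net charge is zero by bisection between pH 0 and 14."""
    low, high = 0.0, 14.0
    # Net charge falls steadily as pH rises, so bisection converges on the pI.
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if net_charge(sequence, middle) > 0:
            low = middle   # still positive: the pI lies at a higher pH
        else:
            high = middle  # negative or zero: the pI lies at a lower pH
    return (low + high) / 2.0


example = "ADEKRH"  # made-up sequence, used only to exercise the functions
pi = isoelectric_point(example)
print(f"estimated pI: {pi:.2f}")
print(f"charge at pH 2: {net_charge(example, 2.0):+.2f}")
print(f"charge at pH 12: {net_charge(example, 12.0):+.2f}")
```

Below the estimated pI the sketch reports a positive charge and above it a negative charge, which is the behaviour that charge-based separations rely on.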
Hydrophobicity:
Literally, hydrophobic means fear of water. In aqueous solutions,
proteins tend to fold so that areas of the protein with hydrophobic regions are
located in internal surfaces next to each other and away from the polar water
molecules of the solution. Polar groups on the amino acid are called
hydrophilic (water loving) because they will form hydrogen bonds with water
molecules. The number, type and distribution of nonpolar amino acid residues
within the protein determines its hydrophobic character. (Chart of hydrophobicity
or hydropathy)
A separation method that is based on the hydrophobic character of
proteins is hydrophobic interaction chromatography.
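One way to put a number on hydrophobic character is to average a per-residue hydropathy score over the whole sequence. The sketch below uses the Kyte-Doolittle hydropathy scale, which is a specific scale chosen here for illustration and not necessarily the chart intended above; the function name `mean_hydropathy` is hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def mean_hydropathy(sequence: str) -> float:
    """Return the average hydropathy of a sequence given in one-letter codes."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    unknown = set(residues) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unrecognized residue codes: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[residue] for residue in residues) / len(residues)


print(mean_hydropathy("IVLF") > 0)  # True, a stretch of nonpolar residues
print(mean_hydropathy("DEKR") < 0)  # True, a stretch of charged residues
```

A single average hides the distribution of residues along the chain, so in practice such scores are often computed over a sliding window as well.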
Solubility:
As the name implies, solubility is the amount of a solute that can be
dissolved in a solvent. The 3-D structure of a protein affects its solubility
properties. Cytoplasmic proteins have mostly hydrophilic (polar) amino acids on
their surface and are therefore water soluble, with more hydrophobic groups
located on the interior of the protein, sheltered from the aqueous environment.
In contrast, proteins that reside in the lipid environment of the cell membrane
have mostly hydrophobic amino acids (non polar) on their exterior surface and
are not readily soluble in aqueous solutions.
Each protein has a distinct and characteristic solubility in a defined
environment and any changes to those conditions (buffer or solvent type, pH,
ionic strength, temperature, etc.) can cause proteins to lose the property of
solubility and precipitate out of solution. The environment can be manipulated
to bring about a separation of proteins; for example, the ionic strength of the
solution can be increased or decreased, which will change the solubility of
some proteins.
Biological Affinity (Function):
Proteins often interact with other molecules in vivo in a specific way;
in other words, they have a biological affinity for that molecule. These
molecular counterparts, termed ligands, can be used as “bait” to “fish” out the
target protein that you want to purify. For example, one such molecular pair is
insulin and the insulin receptor. If you want to purify (or catch) the insulin
receptor, you could couple many insulin molecules to a solid support and then
run an extract (containing the receptor) over that column. The receptor would be
“caught” by the insulin bait. These specific interactions are often exploited
in protein purification procedures. Affinity chromatography is a very common
method for purifying recombinant proteins (proteins produced by genetic
engineering). Several histidine residues can be engineered at the end of a
polypeptide chain. Since repeated histidines have an affinity for metals, a
column of the metal can be used as bait to “catch” the recombinant protein.
Although DNA can be isolated and
amplified from thousand year old mummies, most proteins are more fragile
biomolecules. Therefore, laboratory reagents and storage solutions must provide
suitable conditions so that the normal structure and function of the protein is
maintained. To understand how the structure of proteins is protected in
laboratory solutions, it is necessary to understand how that structure can be
destroyed.
• Proteins can denature, or unfold
so that their three dimensional structure is altered but their primary
structure remains intact (Figure 2.7). Many of the interactions that stabilize
the 3-D conformation of the protein are relatively weak and are sensitive to
various environmental factors including high temperature, low or high pH and
high ionic strength. Proteins vary greatly in the degree of their sensitivity to
these factors. Sometimes proteins can be renatured but often the denaturation
is irreversible.
• Proteins can also be broken apart
by enzymes, called proteases, that digest the covalent peptide bonds between
amino acids that are responsible for the primary structure. This process is
called proteolysis and is irreversible. Cells contain proteases that are found
in lysosomes, membrane bound organelles inside the cell. When cells are
disrupted, lysosomes break and release these proteases, which can damage the
other proteins in the cell. In the laboratory, it is therefore necessary to
minimize the activities of cellular proteases to protect proteins from
proteolysis. Methods used to minimize proteolysis include working at lower
temperatures (4°C), and adding chemicals that inhibit protease activity.
• Sulfur groups on cysteines may
undergo oxidation to form disulfide bonds that are not normally present. Extra
disulfide bonds can form when proteins are removed from their normal environment.
Reducing agents such as dithiothreitol or β-mercaptoethanol are often
added to prevent undesirable disulfide bond formation.
• Proteins readily adsorb (stick
to) surfaces, thereby reducing their available activity. To prevent significant
loss, do not store dilute solutions of proteins for prolonged periods of time.
Always dilute them right before use.
The composition of the extraction buffer is important for maintaining
structure and function of the target protein. To prevent denaturation, the
buffering pH is based on the pH stability range of the protein. Other
components such as ionic strength, divalent cations (Ca2+ and Mg2+), or
reducing agents (dithiothreitol or β-mercaptoethanol) may be needed to
maintain activity. In making the extract, cells are lysed and proteases
(enzymes that degrade proteins) are released from their intracellular
compartments. To prevent proteases from digesting the target protein, two
strategies are commonly followed: 1) The extract is kept cold. The activity of
proteolytic enzymes is greatly reduced by cold temperatures. For this reason,
the protein purification process is often conducted in cold rooms. At the very
least, an effort is made to keep the extract at 4°C. 2) Protease inhibitors are
sometimes added to the mixture to prevent degradation by proteases. The
drawback to this strategy is that the inhibitors must eventually be removed,
along with other contaminant proteins.
Peptides and polypeptides
Glycine and alanine can combine together with the elimination of a molecule
of water to produce a dipeptide. It is possible for this to happen in one of
two different ways - so you might get two different dipeptides.
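The two possible dipeptides can be listed by taking the two amino acids in either order, written from the N-terminal end to the C-terminal end. This is only a sketch of the counting; the function name `dipeptides` is hypothetical.

```python
from itertools import permutations


def dipeptides(amino_acids: list[str]) -> list[str]:
    """List every ordered pair of two different amino acids as a dipeptide name."""
    # Order matters: the first name is the N-terminal residue.
    return ["-".join(pair) for pair in permutations(amino_acids, 2)]


print(dipeptides(["Gly", "Ala"]))  # ['Gly-Ala', 'Ala-Gly']
```

With three different amino acids the same function returns six dipeptides, which shows how quickly the number of possibilities grows.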
In each case, the linkage that joins the two residues (-CO-NH-) is known as
a peptide link. In chemistry, this would also be known as an amide
link, but since we are now in the realms of biochemistry and biology, we'll use
their terms.
If you joined three amino acids together, you would get a tripeptide. If
you joined lots and lots together (as in a protein chain), you get a
polypeptide.
A protein chain will have somewhere in the range of 50 to 2000 amino
acid residues. You have to use this term because strictly speaking a peptide
chain isn't made up of amino acids. When the amino acids combine together, a
water molecule is lost. The peptide chain is made up from what is left after
the water is lost - in other words, is made up of amino acid residues.
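The loss of water can be seen in a simple mass calculation: the peptide weighs less than the free amino acids added together, by one water molecule per peptide link. The molar masses below are approximate values assumed for illustration, and the function name `peptide_mass` is hypothetical.

```python
# Approximate molar masses in g/mol (assumed values, rounded).
FREE_AMINO_ACID_MASS = {"Gly": 75.07, "Ala": 89.09}
WATER_MASS = 18.015


def peptide_mass(residues: list[str]) -> float:
    """Mass of a linear peptide: free amino acid masses minus one water per link."""
    if not residues:
        raise ValueError("a peptide needs at least one residue")
    total = sum(FREE_AMINO_ACID_MASS[name] for name in residues)
    peptide_links = len(residues) - 1  # n residues are joined by n - 1 links
    return total - peptide_links * WATER_MASS


free_total = FREE_AMINO_ACID_MASS["Gly"] + FREE_AMINO_ACID_MASS["Ala"]
dipeptide = peptide_mass(["Gly", "Ala"])
print(f"free amino acids: {free_total:.2f}, dipeptide: {dipeptide:.2f}")
```

The difference between the two printed figures is the mass of the one water molecule lost, which is why the units in the chain are called residues.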
By convention, when you are drawing peptide chains, the -NH2 group which
hasn't been converted into a peptide link is written at the left-hand end. The
unchanged -COOH group is written at the right-hand end.
The end of the peptide chain with the -NH2 group is known as the
N-terminal, and the end with the -COOH group is the C-terminal.
A protein chain (with the N-terminal on the left) will therefore look
like this:
H2N-CH(R)-CO-NH-CH(R)-CO-NH-CH(R)-CO- ... -NH-CH(R)-COOH
The "R" groups come from the 20 amino acids which occur in
proteins. The peptide chain is known as the backbone, and the "R"
groups are known as side chains.
Note: In the case where the "R" group
comes from the amino acid proline, the pattern is broken. In this case, the
hydrogen on the nitrogen nearest the "R" group is missing, and the
"R" group loops around and is attached to that nitrogen as well as to
the carbon atom in the chain.
I mention this for the sake of completeness - not because you would be
expected to know about it in chemistry at this introductory level.
Now there's a problem! The term "primary structure" is used in
two different ways. At its simplest, the term is used to describe the order of
the amino acids joined together to make the protein. In other words, if you
replaced the "R" groups in the last diagram by real groups you would
have the primary structure of a particular protein.
This primary structure is usually shown using abbreviations for the
amino acid residues. These abbreviations commonly consist of three letters or
one letter.
Using three letter abbreviations, a bit of a protein chain might be
represented by, for example:
If you look carefully, you will
spot the abbreviations for glycine (Gly) and alanine (Ala) amongst the others.
If you followed the protein chain all the way to its left-hand end, you
would find an amino acid residue with an unattached -NH2 group. The N-terminal
is always written on the left of a diagram for a protein's primary structure -
whether you draw it in full or use these abbreviations.
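Converting between the two kinds of abbreviation is a simple lookup. The sketch below turns a sequence of one-letter codes into three-letter codes, keeping the N-terminal on the left; the function name `to_three_letter` is hypothetical.

```python
# Standard one-letter and three-letter abbreviations for the 20 amino acids.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(sequence: str) -> str:
    """Rewrite a one-letter sequence as hyphen-joined three-letter codes."""
    # The order is preserved, so the N-terminal residue stays on the left.
    return "-".join(THREE_LETTER[code] for code in sequence.upper())


print(to_three_letter("GA"))  # Gly-Ala
```

An unrecognized letter raises a `KeyError`, which is a reasonable signal that the input is not a plain amino acid sequence.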
The wider definition of primary
structure includes all the features of a protein which are a result of covalent
bonds. Obviously, all the peptide links are made of covalent bonds, so that
isn't a problem.
But there is an additional feature in proteins which is also covalently
bound. It involves the amino acid cysteine.
If two cysteine side chains end
up next to each other because of folding in the peptide chain, they can react
to form a sulphur bridge. This is another covalent link and so some people
count it as a part of the primary structure of the protein.
Because of the way sulphur bridges affect the way the protein folds,
other people count this as a part of the tertiary structure (see below). This
is obviously a potential source of confusion!
Within the long protein chains there are regions in which the chains are
organised into regular structures known as alpha-helices (alpha-helixes) and
beta-pleated sheets. These are the secondary structures in proteins.
These secondary structures are held together by hydrogen bonds. These
form as shown in the diagram between one of the lone pairs on an oxygen atom
and the hydrogen attached to a nitrogen atom:
You must also find out exactly how much detail you need to know about
this next bit. It may well be that all you need is to have heard of an
alpha-helix and know that it is held together by hydrogen bonds between the C=O
and N-H groups. Once again, you need to check your syllabus and past papers -
particularly mark schemes for the past papers.
Denaturation occurs because the bonding interactions responsible for the
secondary structure (hydrogen bonds to amides) and tertiary structure are
disrupted. In tertiary structure there are four types of bonding interactions
between side chains that may be disrupted: hydrogen bonding, salt bridges,
disulfide bonds, and non-polar hydrophobic interactions.
Therefore, a variety of reagents and conditions can cause
denaturation. The most common observation in the denaturation process is the
precipitation or coagulation of the protein.