MECHANISMS OF LEARNING
Learning is a relatively permanent change in an organism’s behavior due to experience. In associative learning, we learn to associate two stimuli (as in classical conditioning) or a response and its consequences (as in operant conditioning). In observational learning, we learn by watching others’ experiences and examples. Learned associations also feed our habitual behaviors. As we repeat behaviors in a given context—the sleeping posture we associate with bed, our walking routes on campus, our eating popcorn in a movie theater—the behaviors become associated with the contexts. Our next experience of the context then automatically triggers the habitual response. Such associations can make it hard to kick a smoking habit; when back in the smoking context, the urge to light up can be powerful. By linking two events that occur close together, both the sea slug and the seals exhibit associative learning. The sea slug associates the squirt with an impending shock; the seal associates slapping and barking with a herring treat. Each animal has learned something important to its survival: predicting the immediate future. Conditioning is the process of learning associations. In classical conditioning, we learn to associate two stimuli and thus to anticipate events. We learn that a flash of lightning signals an impending crack of thunder, so when lightning flashes nearby, we start to brace ourselves. In operant conditioning, we learn to associate a response (our behavior) and its consequence and thus to repeat acts followed by good results and avoid acts followed by bad results.
Conditioning is not the only form of learning. Through observational learning, we learn from others’ experiences. Chimpanzees, too, may learn behaviors merely by watching others perform them. If one sees another solve a puzzle and gain a food reward, the observer may perform the trick more quickly.
By conditioning and by observation we humans learn and adapt to our environments. We learn to expect and prepare for significant events such as food or pain (classical conditioning). We also learn to repeat acts that bring good results and to avoid acts that bring bad results (operant conditioning). By watching others we learn new behaviors (observational learning). And through language, we also learn things we have neither experienced nor observed.
Classical conditioning is a type of learning in which an organism comes to associate stimuli. Pavlov’s work on classical conditioning laid the foundation for behaviorism, the view that psychology should be an objective science that studies behavior without reference to mental processes. Although learning by association had been discussed for centuries, it remained for Ivan Pavlov to capture the phenomenon in his classic experiments on conditioning.
In classical conditioning, an unconditioned response (UR) is an event that occurs naturally (such as salivation) in response to some stimulus. An unconditioned stimulus (US) is something that naturally and automatically (without learning) triggers the unlearned response (as food in the mouth triggers salivation). A conditioned stimulus (CS) is a previously irrelevant stimulus (such as a bell) that, through learning, comes to be associated with some unlearned response (salivating). A conditioned response (CR) is the learned response (salivating) to the originally irrelevant but now conditioned stimulus.
In classical conditioning, acquisition is associating a CS with the US. Acquisition occurs most readily when a CS is presented just before (ideally, about a half-second before) a US, preparing the organism for the upcoming event. This finding supports the view that classical conditioning is biologically adaptive. Extinction is diminished responding when the CS no longer signals an impending US. Spontaneous recovery is the appearance of a formerly extinguished response, following a rest period. Generalization is the tendency to respond to stimuli that are similar to a CS. Discrimination is the learned ability to distinguish between a CS and other irrelevant stimuli.
The behaviorists’ optimism that in any species, any response can be conditioned to any stimulus has been tempered. Conditioning principles, we now know, are cognitively and biologically constrained. In classical conditioning, animals learn when to expect a US, and they may be aware of the link between stimuli and responses. Moreover, because of biological predispositions, learning some associations is easier than learning others. Learning is adaptive: Each species learns behaviors that aid its survival.
Pavlov taught us that significant psychological phenomena can be studied objectively, and that classical conditioning is a basic form of learning that applies to all species. Later research modified this finding somewhat by showing that in many species cognition and biological predispositions place some limits on conditioning.
Classical conditioning techniques are used in treatment programs for those recovering from cocaine and other drug abuse and to condition more appropriate responses in therapy for emotional disorders. The body’s immune system also appears to respond to classical conditioning.
Pavlov was driven by a lifelong passion for research. After setting aside his initial plan to follow his father into the Russian Orthodox priesthood, Pavlov received a medical degree at age 33 and spent the next two decades studying the digestive system. This work earned him Russia’s first Nobel prize in 1904. But it was his novel experiments on learning, to which he devoted the last three decades of his life, that earned this feisty scientist his place in history.
Pavlov’s new direction came when his creative mind seized on an incidental observation. Without fail, putting food in a dog’s mouth caused the animal to salivate. Moreover, the dog began salivating not only to the taste of the food, but also to the mere sight of the food, or the food dish, or the person delivering the food, or even the sound of that person’s approaching footsteps. At first, Pavlov considered these “psychic secretions” an annoyance—until he realized they pointed to a simple but important form of learning.
Pavlov and his assistants tried to imagine what the dog was thinking and feeling as it drooled in anticipation of the food. This only led them into fruitless debates. So, to explore the phenomenon more objectively, they experimented. To eliminate other possible influences, they isolated the dog in a small room, secured it in a harness, and attached a device to divert its saliva to a measuring instrument. From the next room, they presented food—first by sliding in a food bowl, later by blowing meat powder into the dog’s mouth at a precise moment. They then paired various neutral events—something the dog could see or hear but didn’t associate with food—with food in the dog’s mouth. If a sight or sound regularly signaled the arrival of food, would the dog learn the link? If so, would it begin salivating in anticipation of the food?
The answers proved to be yes and yes. Just before placing food in the dog’s mouth to produce salivation, Pavlov sounded a tone. After several pairings of tone and food, the dog, anticipating the meat powder, began salivating to the tone alone. In later experiments, a buzzer, a light, a touch on the leg, even the sight of a circle set off the drooling. (This procedure works with people, too. When hungry young Londoners viewed abstract figures before smelling peanut butter or vanilla, their brains soon were responding in anticipation to the abstract images alone.)
Because salivation in response to food in the mouth was unlearned, Pavlov called it an unconditioned response (UR). Food in the mouth automatically, unconditionally, triggers a dog’s salivary reflex. Thus, Pavlov called the food stimulus an unconditioned stimulus (US).
Pavlov repeatedly presented a neutral stimulus (such as a tone) just before an unconditioned stimulus (US, food) that triggered an unconditioned response (UR, salivation). After several repetitions, the tone alone (now the conditioned stimulus, CS) triggered a conditioned response (CR, salivation). Further experiments on acquisition revealed that classical conditioning was usually greatest when the CS was presented just before the US, thus preparing the organism for what was coming. Other experiments explored the phenomena of acquisition, extinction, spontaneous recovery, generalization, and discrimination.
Salivation in response to the tone was conditional upon the dog’s learning the association between the tone and the food. Today we call this learned response the conditioned response (CR). The previously neutral (in this context) tone stimulus that now triggered the conditional salivation we call the conditioned stimulus (CS). Distinguishing these two kinds of stimuli and responses is easy: Conditioned = learned; unconditioned = unlearned.
Pavlov’s work laid a foundation for John B. Watson’s emerging belief that psychology, to be an objective science, should study only overt behavior, without considering unobservable mental activity. Watson called this position behaviorism.
Extending Pavlov’s Understanding.
The behaviorists’ optimism that learning principles would generalize from one response to another and from one species to another has been tempered. Conditioning principles, we now know, are cognitively influenced and biologically constrained. In classical conditioning, animals learn when to “expect” an unconditioned stimulus. Moreover, animals are biologically predisposed to learn associations between, say, a peculiar taste and a drink that will make them sick, which they will then avoid. They don’t, however, learn to avoid a sickening drink announced by a noise.
To understand the acquisition, or initial learning, of the stimulus-response relationship, Pavlov and his associates had to confront the question of timing: How much time should elapse between presenting the neutral stimulus (the tone, the light, the touch) and the unconditioned stimulus? In most cases, not much—half a second usually works well.
What do you suppose would happen if the food (US) appeared before the tone (CS) rather than after? Would conditioning occur? Not likely. With but a few exceptions, conditioning doesn’t happen when the CS follows the US. Remember, classical conditioning is biologically adaptive because it helps humans and other animals prepare for good or bad events. To Pavlov’s dogs, the tone (CS) signaled an important biological event—the arrival of food (US). To deer in the forest, the snapping of a twig (CS) may signal a predator’s approach (US). If the good or bad event had already occurred, the CS would not likely signal anything significant.
Extinction and Spontaneous Recovery
After conditioning, what happens if the CS occurs repeatedly without the US? Will the CS continue to elicit the CR? Pavlov discovered that when he sounded the tone again and again without presenting food, the dogs salivated less and less. Their declining salivation illustrates extinction, the diminished responding that occurs when the CS (tone) no longer signals an impending US (food).
Pavlov found, however, that if he allowed several hours to elapse before sounding the tone again, the salivation to the tone would reappear spontaneously. This spontaneous recovery—the reappearance of a (weakened) CR after a pause—suggested to Pavlov that extinction was suppressing the CR rather than eliminating it.
After breaking up with his fire-breathing heartthrob, Tirrell also experienced extinction and spontaneous recovery. He recalls that “the smell of onion breath (CS), no longer paired with the kissing (US), lost its ability to shiver my timbers. Occasionally, though, after not sensing the aroma for a long while, smelling onion breath awakens a small version of the emotional response I once felt.”
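The rise and fall of a conditioned response described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the update rule (strength moves a fixed fraction toward 1.0 on paired trials and toward 0.0 on CS-alone trials), the function name run_trials, the rate of 0.3, and the 40 percent recovery figure are all hypothetical choices, not values reported by Pavlov or given in the text.

```python
def run_trials(strength, n_trials, us_present, rate=0.3):
    """Return the CR strength after each trial, starting from `strength`.

    Assumed rule (not from the text): on each trial the strength moves a
    fixed fraction `rate` of the way toward 1.0 when the US follows the CS,
    and toward 0.0 when the CS appears alone.
    """
    target = 1.0 if us_present else 0.0
    history = []
    for _ in range(n_trials):
        strength += rate * (target - strength)
        history.append(strength)
    return history


# Acquisition: tone (CS) repeatedly paired with food (US).
acquisition = run_trials(0.0, 10, us_present=True)

# Extinction: tone sounded again and again without food.
extinction = run_trials(acquisition[-1], 10, us_present=False)

# Spontaneous recovery after a rest. The text says only that a weakened CR
# reappears; the 40% figure is an arbitrary illustrative assumption.
RECOVERY_FRACTION = 0.4
recovered = RECOVERY_FRACTION * acquisition[-1]

print("after acquisition:", round(acquisition[-1], 3))
print("after extinction: ", round(extinction[-1], 3))
print("after rest:       ", round(recovered, 3))
```

In this sketch the recovered response is stronger than the extinguished one but weaker than the fully acquired one, which is the qualitative pattern the text describes.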
Pavlov and his students noticed that a dog conditioned to the sound of one tone also responded somewhat to the sound of a different tone that had never been paired with food. Likewise, a dog conditioned to salivate when rubbed would also drool a bit when scratched or when touched on a different body part. This tendency to respond to stimuli similar to the CS is called generalization. Generalization can be adaptive, as when toddlers taught to fear moving cars also become afraid of moving trucks and motorcycles. So automatic is generalization that one Argentine writer who underwent torture still recoils with fear when he sees black shoes—his first glimpse of his torturers as they approached his cell.
Generalization of anxiety reactions has been demonstrated in laboratory studies comparing abused with nonabused children. Shown an angry face on a computer screen, abused children’s brain-wave responses are dramatically stronger and longer lasting. Because of generalization, stimuli similar to naturally disgusting or appealing objects will, by association, evoke some disgust or liking. Normally desirable foods, such as fudge, are unappealing when shaped to resemble dog feces. Adults with childlike facial features (round face, large forehead, small chin, large eyes) are perceived as having childlike warmth, submissiveness, and naiveté. In both cases, people’s emotional reactions to one stimulus generalize to similar stimuli.
Pavlov’s dogs also learned to respond to the sound of a particular tone and not to other tones. Discrimination is the learned ability to distinguish between a conditioned stimulus (which predicts the US) and other irrelevant stimuli. Being able to recognize differences is adaptive. Slightly different stimuli can be followed by vastly different consequences. Confronted by a pit bull, your heart may race; confronted by a golden retriever, it probably will not.
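Generalization and discrimination can be pictured as a response curve that peaks at the trained stimulus and falls off as test stimuli become less similar. The sketch below assumes a bell-shaped (Gaussian) curve over tone frequency; the function name cr_strength, the 1000 Hz training tone, and the width values are hypothetical and chosen only for illustration. Discrimination training is represented, again as an assumption, by a narrower curve.

```python
import math


def cr_strength(test_hz, trained_hz=1000.0, width_hz=200.0):
    """Assumed Gaussian generalization gradient.

    Returns 1.0 at the trained tone and falls toward 0 as the test tone
    becomes less similar. `width_hz` controls how broadly the response
    generalizes (an illustrative parameter, not a measured one).
    """
    return math.exp(-((test_hz - trained_hz) ** 2) / (2 * width_hz ** 2))


# Generalization: nearby tones still evoke some response.
for hz in (1000, 1100, 1400, 2000):
    print(hz, "Hz ->", round(cr_strength(hz), 3))

# Discrimination: after training to respond to one tone and not others,
# the assumed curve is narrower, so a nearby tone evokes less response.
print("1100 Hz, broad curve: ", round(cr_strength(1100, width_hz=200.0), 3))
print("1100 Hz, narrow curve:", round(cr_strength(1100, width_hz=50.0), 3))
```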
Pavlov taught us that principles of learning apply across species, that significant psychological phenomena can be studied objectively, and that conditioning principles have important practical applications.
Through higher-order conditioning, a new neutral stimulus can become a new conditioned stimulus. All that’s required is for it to become associated with a previously conditioned stimulus. If a tone regularly signals food and produces salivation, then a light that becomes associated with the tone may also begin to trigger salivation. Although this higher-order conditioning (also called second-order conditioning) tends to be weaker than first-stage conditioning, it influences our everyday lives.
Imagine that something makes us very afraid (perhaps a pit bull dog associated with a previous dog bite). If something else, such as the sound of a barking dog, brings to mind that pit bull, the bark alone may make us feel a little afraid.
Associations can influence attitudes. As Andy Field showed British children novel cartoon characters alongside either ice cream (Yum!) or Brussels sprouts (Yuk!), the children came to like best the ice-cream–associated characters. Michael Olson and Russell Fazio classically conditioned adults’ attitudes, using little-known Pokémon characters. The participants, playing the role of a security guard monitoring a video screen, viewed a stream of words, images, and Pokémon characters. Their task, they were told, was to respond to one target Pokémon character by pressing a button. Unnoticed by the participants, when two other Pokémon characters appeared on the screen, one was consistently associated with various positive words and images (such as awesome or a hot fudge sundae); the other appeared with negative words and images (such as awful or a cockroach). Without any conscious memory for the pairings, the participants formed more gut-level positive attitudes for the characters associated with the positive stimuli. Follow-up studies indicate that conditioned likes and dislikes are even stronger when people notice and are aware of the associations they have learned.
Classical conditioning and operant conditioning are both forms of associative learning, yet their difference is straightforward:
• Classical conditioning forms associations between stimuli (a CS and the US it signals). It also involves respondent behavior—actions that are automatic responses to a stimulus (such as salivating in response to meat powder and later in response to a tone).
• In operant conditioning, organisms associate their own actions with consequences. Actions followed by reinforcers increase; those followed by punishers decrease. Behavior that operates on the environment to produce rewarding or punishing stimuli is called operant behavior.
In operant conditioning, an organism learns associations between its own behavior and resulting events; this form of conditioning involves operant behavior (behavior that operates on the environment, producing consequences). In classical conditioning, the organism forms associations between stimuli (events it does not control); this form of conditioning involves respondent behavior (automatic responses to some stimulus).
Expanding on Edward Thorndike’s law of effect, B. F. Skinner and others found that the behavior of rats or pigeons placed in an operant chamber (Skinner box) can be shaped by using reinforcers to guide closer and closer approximations of the desired behavior. Through operant conditioning, organisms learn to produce behaviors that are followed by reinforcing stimuli and to suppress behaviors that are followed by punishing stimuli.
Positive reinforcement adds something desirable to increase the frequency of a behavior. Negative reinforcement removes something undesirable to increase the frequency of a behavior. Primary reinforcers (such as receiving food when hungry or having nausea end during an illness) are innately satisfying—no learning is required. Conditioned (or secondary) reinforcers (such as cash) are satisfying because we have learned to associate them with more basic rewards (such as the food or medicine we buy with them). Immediate reinforcers (such as unprotected sex) offer immediate payback; delayed reinforcers (such as a weekly paycheck) require the ability to delay gratification.
In continuous reinforcement (reinforcing desired responses every time they occur), learning is rapid, but so is extinction if rewards cease. In partial (intermittent) reinforcement, initial learning is slower, but the behavior is much more resistant to extinction. Fixed-ratio schedules offer rewards after a set number of responses; variable-ratio schedules, after an unpredictable number. Fixed-interval schedules offer rewards after set time periods; variable-interval schedules, after unpredictable time periods.
Punishment attempts to decrease the frequency of a behavior (a child’s disobedience) by administering an undesirable consequence (such as spanking) or withdrawing something desirable (such as taking away a favorite toy). Undesirable side effects can include suppressing rather than changing unwanted behaviors, teaching aggression, creating fear, encouraging discrimination (so that the undesirable behavior appears when the punisher is not present), and fostering depression and feelings of helplessness.
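The definitions of reinforcement and punishment above reduce to two questions: is something added or removed, and is that something desirable or undesirable? A minimal sketch of that two-by-two lookup follows; the names Change, Stimulus, and classify are hypothetical labels introduced here for illustration.

```python
from enum import Enum


class Change(Enum):
    ADD = "add"
    REMOVE = "remove"


class Stimulus(Enum):
    DESIRABLE = "desirable"
    UNDESIRABLE = "undesirable"


def classify(change, stimulus):
    """Name the consequence and its expected effect on behavior frequency."""
    if change is Change.ADD and stimulus is Stimulus.DESIRABLE:
        return ("positive reinforcement", "increases")
    if change is Change.REMOVE and stimulus is Stimulus.UNDESIRABLE:
        return ("negative reinforcement", "increases")
    # The two remaining cases are both punishment: administering something
    # undesirable (spanking) or withdrawing something desirable (a toy).
    return ("punishment", "decreases")


for change in Change:
    for stimulus in Stimulus:
        name, effect = classify(change, stimulus)
        print(f"{change.value} {stimulus.value}: {name} ({effect} behavior)")
```

Note that negative reinforcement lands in the "increases" column: it strengthens behavior by removing something unpleasant and is not a form of punishment.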
Skinner underestimated the limits that cognitive and biological constraints place on conditioning. Research on cognitive mapping and latent learning demonstrates the importance of cognitive processes in learning. Excessive rewards can undermine intrinsic motivation. Training that attempts to override biological constraints will probably not endure because the animals will revert to their predisposed patterns.
In school, teachers can use shaping techniques to guide students’ behaviors, and they can use interactive software and Web sites to provide immediate feedback. In sports, coaches can build players’ skills and self-confidence by rewarding small improvements. At work, managers can boost productivity and morale by rewarding well-defined and achievable behaviors. At home, parents can reward behaviors they consider desirable, but not those that are undesirable. We can shape our own behaviors by stating our goals, monitoring the frequency of desired behaviors, reinforcing desired behaviors, and cutting back on incentives as behaviors become habitual.
B. F. Skinner (1904–1990) was a college English major and an aspiring writer who, seeking a new direction, entered graduate school in psychology. He went on to become modern behaviorism’s most influential and controversial figure. Skinner’s work elaborated what psychologist Edward L. Thorndike called the law of effect: Rewarded behavior is likely to recur. Using Thorndike’s law of effect as a starting point, Skinner developed a behavioral technology that revealed principles of behavior control. These principles also enabled him to teach pigeons such unpigeonlike behaviors as walking in a figure 8, playing Ping-Pong, and keeping a missile on course by pecking at a screen target.
For his pioneering studies, Skinner designed an operant chamber, popularly known as a Skinner box. The box has a bar or key that an animal presses or pecks to release a reward of food or water, and a device that records these responses. Operant conditioning experiments have done far more than teach us how to pull habits out of a rat. They have explored the precise conditions that foster efficient and enduring learning.
In his experiments, Skinner used shaping, a procedure in which reinforcers, such as food, gradually guide an animal’s actions toward a desired behavior. Imagine that you wanted to condition a hungry rat to press a bar. First, you would watch how the animal naturally behaves, so that you could build on its existing behaviors. You might give the rat a food reward each time it approaches the bar. Once the rat is approaching regularly, you would require it to move closer before rewarding it, then closer still.
Finally, you would require it to touch the bar before you gave it the food. With this method of successive approximations, you reward responses that are ever-closer to the final desired behavior, and you ignore all other responses. By making rewards contingent on desired behaviors, researchers and animal trainers gradually shape complex behaviors.
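The trainer's side of the successive-approximations procedure can be sketched as a loop that rewards any response meeting the current criterion and then tightens the criterion. The sketch makes several assumptions not found in the text: responses are summarized as the rat's distance from the bar in centimeters (0 meaning a touch), the criterion tightens by a fixed amount after two rewarded responses in a row, and the observed distances are invented sample data.

```python
def shape(distances_cm, criterion_cm=40.0, tighten_by_cm=10.0, streak_needed=2):
    """Return a log of (distance, criterion in force, rewarded) per response.

    A response is rewarded when the rat is within the current criterion
    distance of the bar; all other responses are ignored. After
    `streak_needed` rewarded responses in a row, the criterion tightens,
    so only closer approximations earn food.
    """
    streak = 0
    log = []
    for distance in distances_cm:
        rewarded = distance <= criterion_cm
        log.append((distance, criterion_cm, rewarded))
        streak = streak + 1 if rewarded else 0
        if streak == streak_needed and criterion_cm > 0:
            criterion_cm = max(0.0, criterion_cm - tighten_by_cm)
            streak = 0
    return log


# Hypothetical observations: the rat drifts closer until it touches the bar.
observed = [60, 35, 38, 36, 25, 28, 30, 15, 18, 8, 5, 0, 0]
for distance, criterion, rewarded in shape(observed):
    outcome = "food" if rewarded else "ignored"
    print(f"distance {distance:>2} cm, criterion {criterion:>4} cm: {outcome}")
```

A response that would have earned food early in training (36 cm under a 40 cm criterion) is ignored once the criterion has tightened to 30 cm, which is how the reward contingency pulls behavior toward the final target.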
Skinner showed that when placed in an operant chamber, rats or pigeons can be shaped to display successively closer approximations of a desired behavior. Researchers have also studied the effects of primary and secondary reinforcers, and of immediate and delayed reinforcers. Partial reinforcement schedules (fixed-ratio, variable-ratio, fixed-interval, and variable-interval) produce slower acquisition of the target behavior than does continuous reinforcement, but they also create more resistance to extinction. Punishment is most effective when it is strong, immediate, and consistent. However, it can have undesirable side effects.
Shaping can also help us understand what nonverbal organisms perceive. Can a dog distinguish red and green? Can a baby hear the difference between lower- and higher-pitched tones? If we can shape them to respond to one stimulus and not to another, then we know they can perceive the difference. Such experiments have even shown that some animals can form concepts. If an experimenter reinforces a pigeon for pecking after seeing a human face, but not after seeing other images, the pigeon learns to recognize human faces. In this experiment, a face is a discriminative stimulus; like a green traffic light, it signals that a response will be reinforced. After being trained to discriminate among flowers, people, cars, and chairs, pigeons can usually identify the category in which a new pictured object belongs. They have even been trained to discriminate between Bach’s music and Stravinsky’s.
Extending Skinner’s Understanding.
Skinner’s emphasis on external control of behavior made him both influential and controversial. Many psychologists criticized Skinner (as they did Pavlov) for underestimating the importance of cognitive and biological constraints. For example, research on latent learning and motivation, both intrinsic and extrinsic, further indicates the importance of cognition in learning.
Skinner and his collaborators compared four schedules of partial reinforcement. Some are rigidly fixed, some unpredictably variable.
Fixed-ratio schedules reinforce behavior after a set number of responses. Just as coffee shops reward us with a free drink after every 10 purchased, laboratory animals may be reinforced on a fixed ratio of, say, one reinforcer for every 30 responses. Once conditioned, the animal will pause only briefly after a reinforcer and will then return to a high rate of responding.
Variable-ratio schedules provide reinforcers after an unpredictable number of responses. This is what slot-machine players and fly-casting anglers experience—unpredictable reinforcement—and what makes gambling and fly fishing so hard to extinguish even when both are getting nothing for something. Like the fixed-ratio schedule, the variable-ratio schedule produces high rates of responding, because reinforcers increase as the number of responses increases.
Fixed-interval schedules reinforce the first response after a fixed time period. Like people checking more frequently for the mail as the delivery time approaches, or checking to see if the Jell-O has set, pigeons on a fixed-interval schedule peck a key more frequently as the anticipated time for reward draws near, producing a choppy stop-start pattern rather than a steady rate of response.
Variable-interval schedules reinforce the first response after varying time intervals. Like the “You’ve got mail” that finally rewards persistence in rechecking for e-mail, variable-interval schedules tend to produce slow, steady responding. This makes sense, because there is no knowing when the waiting will be over.
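The four schedules differ only in the rule that decides whether a given response earns a reinforcer. The sketch below writes each rule as a small function; the function names, the use of a per-response probability for the variable-ratio schedule, and the exponentially distributed waits for the variable-interval schedule are illustrative assumptions, not details from Skinner's procedures.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond(_time):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses averaging mean_n.

    Assumption: each response pays off independently with probability 1/mean_n.
    """

    def respond(_time):
        return rng.random() < 1.0 / mean_n

    return respond


def fixed_interval(seconds):
    """Reinforce the first response made once `seconds` have elapsed."""
    available_at = seconds

    def respond(time):
        nonlocal available_at
        if time >= available_at:
            available_at = time + seconds
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """Reinforce the first response after a varying wait averaging mean_seconds.

    Assumption: waits are drawn from an exponential distribution.
    """
    available_at = rng.expovariate(1.0 / mean_seconds)

    def respond(time):
        nonlocal available_at
        if time >= available_at:
            available_at = time + rng.expovariate(1.0 / mean_seconds)
            return True
        return False

    return respond


def count_rewards(schedule, n_seconds):
    """Count reinforcers earned when responding once per second."""
    return sum(schedule(t) for t in range(n_seconds))


rng = random.Random(0)  # seeded so repeated runs give the same sequence
print("fixed ratio 30:      ", count_rewards(fixed_ratio(30), 300))
print("variable ratio 30:   ", count_rewards(variable_ratio(30, rng), 300))
print("fixed interval 60 s: ", count_rewards(fixed_interval(60), 300))
print("variable interval 60:", count_rewards(variable_interval(60, rng), 300))
```

On the ratio schedules, responding faster earns reinforcers faster, consistent with the high response rates described above. On the interval schedules, extra responses before the interval ends earn nothing.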
Animal behaviors differ, yet Skinner contended that the reinforcement principles of operant conditioning are universal. It matters little, he said, what response, what reinforcer, or what species you use. The effect of a given reinforcement schedule is pretty much the same: “Pigeon, rat, monkey, which is which? It doesn’t matter. Behavior shows astonishingly similar properties.”
Skinner’s idea that operant principles should be used to influence people was extremely controversial. Critics felt he ignored personal freedoms and sought to control people. Today, his techniques are applied in schools, sports, workplaces, and homes. Shaping behavior by reinforcing successes is effective.
Learning by Observation.
Another important type of learning, especially among humans, is what Albert Bandura and others call observational learning, in which we observe and imitate others. Mirror neurons, located in the brain’s frontal lobes, provide a neural basis for observational learning. They fire when we perform certain actions (such as responding to pain or moving our mouth to form words), or when we observe someone else performing those actions. In experiments, children tend to imitate what a model both does and says, whether the behavior is social or antisocial. Such experiments have stimulated research on social modeling in the home, within peer groups, and in the media. Children are especially likely to imitate those they perceive to be like them, successful, or admirable.
Mirror Neurons in the Brain.
Having earlier observed the same weird result when the monkey watched humans or other monkeys move peanuts to their mouths, the flabbergasted researchers, led by Giacomo Rizzolatti, eventually surmised that they had stumbled onto a previously unknown type of neuron: mirror neurons, whose activity provides a neural basis for imitation and observational learning. When a monkey grasps, holds, or tears something, these neurons fire. And they likewise fire when the monkey observes another doing so. When one monkey sees, these neurons mirror what another monkey does.
Imitation shapes even very young humans’ behavior. Shortly after birth, a baby may imitate an adult who sticks out his tongue. By 8 to 16 months, infants imitate various novel gestures. By age 12 months, they begin looking where an adult is looking. And by age 14 months, children imitate acts modeled on TV. Children see, children do. PET scans of different brain areas reveal that humans, like monkeys, have a mirror neuron system that supports empathy and imitation. As we observe another’s action, our brain generates an inner simulation, enabling us to experience the other’s experience within ourselves. Mirror neurons help give rise to children’s empathy and to their ability to infer another’s mental state, an ability known as theory of mind. People with autism display reduced imitative yawning and mirror neuron activity—“broken mirrors,” some have said.
Albert Bandura is the pioneering researcher of observational learning. A preschool child works on a drawing. An adult in another part of the room is building with Tinkertoys. As the child watches, the adult gets up and for nearly 10 minutes pounds, kicks, and throws around the room a large inflated Bobo doll, yelling, “Sock him in the nose. . . . Hit him down. . . . Kick him.”
The child is then taken to another room filled with appealing toys. Soon the experimenter returns and tells the child she has decided to save these good toys “for the other children.” She takes the now-frustrated child to a third adjacent room containing a few toys, including a Bobo doll. Left alone, what does the child do?
Compared with children not exposed to the adult model, those who viewed the model’s actions were much more likely to lash out at the doll. Apparently, observing the aggressive outburst lowered their inhibitions. But something more was also at work, for the children imitated the very acts they had observed and used the very words they had heard.
Applications of Observational Learning.
Children tend to imitate what a model does and says, whether the behavior being modeled is prosocial (positive, constructive, and helpful) or antisocial. If a model’s actions and words are inconsistent, children may imitate the hypocrisy they observe.
The big news from Bandura’s studies is that we look and we learn. Models in one’s family or neighborhood, or on TV, may have effects, good or bad. Many business organizations effectively use behavior modeling to train communications, sales, and customer service skills (Taylor et al., 2005). Trainees gain skills faster when they not only are told the needed skills but also are able to observe the skills being modeled effectively by experienced workers (or actors simulating them).
The good news is that prosocial (positive, helpful) models can have prosocial effects. To encourage children to read, read to them and surround them with books and people who read. To increase the odds that your children will practice your religion, worship and attend religious activities with them. People who exemplify nonviolent, helpful behavior can prompt similar behavior in others.
The bad news is that observational learning may have antisocial effects. This helps us understand why abusive parents might have aggressive children, and why many men who beat their wives had wife-battering fathers (Stith et al., 2000). Critics note that being aggressive could be passed along by parents’ genes.
The violence-viewing effect seems to stem from at least two factors. One is imitation. As we noted earlier, children as young as 14 months will imitate acts they observe on TV. As they watch, their mirror neurons simulate the behavior, and after this inner rehearsal they become more likely to act it out. Prolonged exposure to violence also desensitizes viewers; they become more indifferent to it when later viewing a brawl, whether on TV or in real life.