Chapter 4
Basic Processes of Learning
Outline
Introduction
The Behavioral Perspective on Learning
The Cognitive Perspective on Learning
The Ecological Perspective on Learning
Introduction
Learning: A Working Definition
A relatively permanent change in behavior, thoughts or feelings as a result of experience. It allows us to:
Use past experience to predict the future
Adapt to a rapidly changing environment.
Exert control over our environment.
Examples:
Which of these represent learning (drawn from Rocklin, 1987)?
an infant who stops thumb-sucking?
children being able to use language?
a patient who has a lobotomy and no longer manifests psychotic behavior?
a zinnia plant being pinched back and then growing more dense foliage and flowers?
Zeb throwing his cigarettes away after 30 years of smoking two packs a day?
A computer that uses its first 100 tabulations to influence its choice of opening moves in a chess game?
How do we adapt to the world? Through learning.
Learning may be similar for lower animals and humans. If we can understand learning we can understand much about behavior.
The Basic Issues (that we'll cover anyway):
Does learning not involve thinking? (that's what early researchers thought)
Are the behaviors of ourselves and other animals governed by similar laws of learning?
Can lower animals learn?
Do we learn in the same way as animals? (If yes, and if we can understand these laws, the potential for improving life is immeasurable.)
What governs most of our behavior?
Unconscious motives & innate drives?
Rational, logical problem solving?
Learned habits? What's rewarded?
How about aggression: Is it learned or innate?
How about helplessness? Learned, innate? An excuse for laziness?
How about depression? Learned, innate? A way to manipulate others?
How do we learn fears, guilt, pleasures?
Do we learn more when the reward is greater?
The greater the reward, the more we enjoy the behavior, right?
Who is the acknowledged founder of behaviorism?
ANSWER: John B. Watson
(Cartoons by Mark Parisi. Used by special permission. For many more, visit his site.)
The Behavioral Perspective on Learning: Acquiring New Responses to and for Stimuli
Learning by Association (simplest form of learning)
Ivan Pavlov, a turn-of-the-century Russian physiologist, noticed how readily his research dogs adapted to laboratory conditions. He set out to understand learning and made important discoveries.
Classical or Pavlovian Conditioning
The connectionist explanation: we learn to associate ideas or events because they occur close together. Think of the adaptive value of learning when one event predicts another.
Pavlov and his experiments: Digestion in dogs
Pavlov example: Dogs reflexively salivate to food in mouth.
Food is an unconditioned stimulus (UCS) to the salivation response.
Salivation is the unconditioned response (UCR) to the food (you don't need to learn to salivate to food; it's an automatic response).
Pavlov wondered whether you could take these natural associations between certain stimuli and responses and use them to produce true learning.
He noticed that his dogs would start to salivate before they were even given any food, e.g., when the lab assistant simply opened the door to the room in which dogs were housed (just like you might start to "salivate" when the clock chimes 6 p.m.)
He asked: What is going on here? Developed experiment to test some ideas.
Took a neutral stimulus = a stimulus that can catch the learner's attention but does not elicit the UCR (e.g., a bell)
At baseline, the situation looks like:
Neutral Stimulus (Rung Bell) ----> No salivation
Unconditioned Stimulus (Food; UCS)--> Salivation(UCR)
During the conditioning (learning) trials:
Repeatedly rang bell just before presenting food
Neutral Stimulus (Rung Bell) + UCS (Food) ---> Salivation (UCR)
After conditioning (learning):
Repeatedly rang bell without showing dog any food & he finds:
Conditioned Stimulus (Rung Bell; CS)----->Salivation (Conditioned Response; CR)
The initially neutral stimulus (bell) became a learned (conditioned) elicitor of the response
conditioned stimulus (CS) = a neutral stimulus which you have learned to give the response to.
conditioned response (CR) = similar to the UCR but is made to the CS because of learning
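One way to picture the conditioning trials is as a gradual build-up of the link between bell and food. The sketch below uses a Rescorla-Wagner-style update rule, which is not part of these notes and is brought in only as an illustrative assumption. The function name, the learning rate of 0.3, and the maximum strength of 1.0 are all hypothetical choices, not measured values.

```python
def acquisition_curve(n_trials, learning_rate=0.3, max_strength=1.0):
    """Return the bell's associative strength after each bell+food pairing.

    Assumption: Rescorla-Wagner-style update, V <- V + rate * (max - V).
    The parameter values are illustrative, not measured.
    """
    strength = 0.0  # baseline: the bell is still a neutral stimulus
    history = []
    for _ in range(n_trials):
        # Each pairing closes a fixed fraction of the remaining gap.
        strength += learning_rate * (max_strength - strength)
        history.append(strength)
    return history


curve = acquisition_curve(10)
for trial, v in enumerate(curve, start=1):
    print(f"trial {trial:2d}: strength of bell -> salivation = {v:.2f}")
```

Under these assumptions the early pairings produce the biggest gains and later pairings add less and less. The sketch covers only the conditioning trials; it says nothing about extinction.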
Examples of classical conditioning in humans
Many phobias:
fear of dental work
fear of dogs
fear of snakes
waking up right before your alarm goes off
feeling nauseous or dizzy upon entering a hospital
many ad campaigns use classical conditioning
How Tamara learned to:
never eat raisins again
hate the movie Gulliver's Travels (as a kid)
Little Albert's fears (Watson & Rayner)
How classical conditioning applies to our fear in movie "Jaws"
Important Phenomena associated with Classical Conditioning
Extinction
(similar to common sense idea of "forgetting")
After the conditioning phase, the CR gets weaker and weaker when the CS is NOT accompanied by the UCS (food). It gets weaker NOT because the organism no longer remembers the UCS-CS connection. It gets weaker because CR is somehow inhibited.
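How does Pavlov know this? Spontaneous recovery
After extinction trials and a rest period, the dog will start to salivate again in response to the CS, even without further pairings. In addition, a single new pairing of UCS and CS after extinction can fully restore the response.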
Is there a best time relation between CS and UCS? Complicated. Just remember:
Law of Contiguity: The closer the two are in time (with neutral stimulus preceding presentation of UCS), the stronger the conditioning (generally speaking)
Generalization
In generalization, the CR will be 'given' to similar stimuli. Examples:
salivation and tones vs. Bach
flowers and fear
Discrimination
In discrimination, the CR is 'given' to some stimuli, but not others. Examples:
salivation and tones vs. Bach
flowers and fear
Remember about Classical Conditioning
Learning takes place fairly involuntarily. It is a more passive kind of learning. What is learned is an association between two stimuli (the neutral stimulus and unconditioned stimulus).
IMPORTANT:
Know why Pavlov would call classical conditioning S-S learning
Know why Watson would call this S-R learning
Be able to explain how Rescorla's study supported the S-S interpretation
Operant Conditioning
The area of Operant Conditioning owes its heritage to Thorndike's Law of Effect
The fundamental principle behind operant conditioning is that responses leading to "pleasant" effects are more likely to be repeated in the future than those leading to "discomforting" ones.
Idea of Operant Conditioning
You operate on the environment, and your behavior has effects. The effects or consequences of your voluntary behavior determine which behaviors or responses you learn, retain, or forget (Skinner).
This principle doesn't necessarily apply to reflexes and emotions (like classical conditioning does).
Does it apply to heart rate, blood pressure, love, anger, fear? Only if you can voluntarily regulate them.
Differences with classical conditioning
In classical conditioning the most important components are:
stimuli eliciting ---> responses
"Meat" ---> "dog salivating (involuntary)"
new stimuli (acting as substitutes)
"bell" eliciting ---> salivation
In operant conditioning the important components are:
behaviors ---> leading to ---> consequences
boy gives girl a flower ---> girl takes flower, ---> kisses boy and says "thank you"
The consequences influence your future behavior.
cue ---> organism responds ---> leading to consequence
What types of consequences are important?
If operant conditioning (learning) is controlled by the consequences, then do different consequences lead to different results? Yes!
Reinforcers
Reinforcers by definition are "consequences" that strengthen behavior.
cue----> organism responds ---> has reinforcing consequence
(a) boy gives flower---> girl takes flower ---> girl kisses boy
Giving the flower is reinforced by the girl kissing the boy.
(b) new toy ---> child throws tantrum ---> gets toy
The child throwing tantrum is reinforced by the child getting the toy.
(c) homework ---> you study ---> good grade
Studying is reinforced by you getting good grade.
These are all examples of positive reinforcement, in which a consequence that is pleasant for the organism strengthens the behavior that it follows. Note that here a pleasant stimulus comes after the response.
Punishments
Punishments by definition refer to consequences that decrease the behavior they follow
cue----> organism responds ---> has punishing consequence
(a) boy gives flower---> girl takes flower ---> girl slaps boy
Giving flower is punished by the girl slapping the boy.
(b) new toy ---> child throws tantrum---> doesn't get toy
The child throwing a tantrum is punished by not getting the toy.
(c) homework ---> you study ---> worst grade ever
Studying is punished by you getting a bad grade.
Punishment is not always good to use
There are many negative side effects of punishment, such as anger, sadness and avoidance.
Can you think of other side effects?
It is important to remember that what is reinforcing for one person might not be for another. So don't fall into the trap of assuming that you know what is reinforcing for a person. The important thing is to find out what is reinforcing for the person whose behavior you are trying to change. Then, use this (not your own preconceived notions) to change that person's behavior. The same idea applies to punishment. What might be a punishment for you could be a reinforcer for another person, or irrelevant to them.
Perverted example: The flower guy might be a masochist who likes being slapped. In this case, will his flower giving be reinforced or punished by the girl slapping him?
A more normal example: Kids are fighting. Mom sends older brother to his room. Mom's intent is to punish the older brother. But, is this necessarily what she is doing?
Negative Reinforcers
All of the above is relatively easy. Now comes the part that is confusing to some students:
Remember reinforcers necessarily strengthen behavior. Some reinforcers follow the behavior they are meant to strengthen. These are positive reinforcers. They are called positive NOT because they feel good. Don't confuse positive with a value judgment. They are called positive in this literature because they are consequences that are ADDED TO (+) the behavior they are meant to reinforce -- they FOLLOW that behavior.
Negative reinforcement, by definition, is the strengthening of a behavior because its enactment REMOVES an annoying stimulus for the organism.
Some reinforcers operate because they remove or withdraw an aversive stimulus for the organism.
Examples:
Seat belt buzzer ---> you put seat belt on ---> turns off alarm.
Kid's tantrum ---> Mom gives in ---> turns off kid's tantrum
Husband beats wife --> Wife says "I love you" --> husband stops beating
Dad nags to mow lawn --> You mow lawn --> Dad stops nagging
In all of these examples: Doing the behavior (seat belt on, Mom giving in, wife professing love, you mowing lawn) REMOVES the aversive stimulus of (buzzer, tantrum, beating, nagging). Unlike positive reinforcement, the aversive stimulus typically precedes the behavior that it is meant to reinforce. By turning off the stimulus, you are reinforced. It is called negative reinforcement because doing the behavior SUBTRACTS (-) away the aversive stimulus. It is also known as ESCAPE conditioning.
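The distinctions above can be summarized as a small decision rule. The following is only a sketch: the function name and the string labels "added", "removed", "stronger" and "weaker" are hypothetical conveniences, and the rule uses just the three terms defined in these notes.

```python
def classify_consequence(stimulus_change, behavior_change):
    """Label a consequence using the terms in these notes.

    stimulus_change: "added" or "removed" (what happens after the response)
    behavior_change: "stronger" or "weaker" (what happens to the behavior)
    Both label sets are hypothetical names chosen for this sketch.
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")
    if behavior_change == "weaker":
        # Punishment is defined by its effect: the behavior decreases.
        return "punishment"
    if behavior_change == "stronger":
        if stimulus_change == "added":
            return "positive reinforcement"
        return "negative reinforcement"
    raise ValueError("behavior_change must be 'stronger' or 'weaker'")


# Girl kisses boy, flower giving increases.
print(classify_consequence("added", "stronger"))
# Seat belt buzzer turns off, buckling up increases.
print(classify_consequence("removed", "stronger"))
# Girl slaps boy, flower giving decreases.
print(classify_consequence("added", "weaker"))
```

Note that the label depends on what actually happens to the behavior, not on what the person delivering the consequence intended.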
Important Phenomena associated with Operant Conditioning
Extinction
Extinction occurs when the operant behavior is weakened because it is no longer followed by a reinforcer.
cue ---> organism responds ---> nothing
boy gives girl a flower ---> girl takes flower ---> girl does nothing
new toy ---> child throws tantrum ---> nothing
Eventually, boy will stop giving flowers & kid will stop throwing tantrums
Generalization
Generalization is when you are teaching an organism to display the behavior in response to cues that are similar, but not identical, to those used in the initial learning trials. We'll see this in the movie Harry.
Discrimination
Discrimination is when you are teaching an organism to display the behavior in response to certain cues, and not others. That is, the organism should respond only when the discriminative stimulus appears. We'll also see this in the movie Harry.
Other examples of discrimination:
Stop signs (vs. yield signs)
Three knocks at door (vs. fewer or more)
Mom's good mood (vs. bad mood)
Shaping
Shaping occurs when successively closer approximations to the ultimately desired behavior are reinforced (until the behavior you really want is established).
Sometimes the behavior we want the organism to learn doesn't automatically appear for us to reinforce. Instead, we need to use principles of reinforcement to get the organism "there" by progressively reinforcing responses that "look" like the response we want.
Examples of Shaping:
Potty training
Animal training
Examples in the movie Harry.
Primary vs. Secondary Reinforcers
Primary reinforcers: Person doesn't need to learn that consequence is reinforcing. These consequences are innately reinforcing (like food, water, sex).
Secondary reinforcers: Person first needs to learn that stimulus has reinforcing value (e.g., money, compliments). See Gray on tokens.
Often times, we'll start out using primary reinforcers and then switch to secondary. Why switch? Primary are more "expensive" and more difficult to come up with when needed!
Maximizing learning
With either reinforcers or punishers, they must be:
contingent (consistent)
immediate (e.g., with punishment, it must be swift and certain)
right magnitude
Schedules of reinforcement
Question: Is it important to consistently reinforce, punish or extinguish behavior? Or is it just as effective to occasionally do so?
Continuous reinforcement
This is simply when you're reinforced every time you respond. For example:
A boy gives girl a flower ---> the girl takes flower ---> kisses boy and says "thank you" EVERY SINGLE TIME.
The consequences of continuous reinforcement are:
very rapid learning
rapid responding
very rapid extinction (forgetting)
Partial Reinforcement "Schedules": Examples of Ratio Schedules
Partial reinforcement is when you're reinforced only some of the times you respond. With ratio schedules, you're reinforced after "a number of the desired behaviors."
The consequences of partial reinforcement differ depending upon the exact schedule used. There are two primary types of ratio schedules: fixed and variable.
Fixed ratio partial reinforcement
In fixed ratio partial reinforcement, the number of responses until reinforcement is delivered is absolutely constant. You know exactly how often you have to show the desired behavior before you get reinforced. For example:
Every 5th car I sell, I get a bonus.
Every 10th customer I sign up with MCI, I get paid.
Every 20th time I bug my Mom for clothes, she buys them.
The consequences of fixed ratio partial reinforcement are:
slow learning,
variable responding
slow extinction
Variable ratio partial reinforcement
In variable ratio partial reinforcement, the number of responses until reinforcement varies randomly, very much like a slot machine. You don't have a clue how many responses you need to emit in order to receive the desired consequence.
What if you didn't know how many cars you had to sell, customers you had to sign, flowers you had to give to get the desired effect? What would your behavior look like? For example:
A boy gives a girl a flower ---> the girl takes the flower ---> kisses boy and says "thank you".
BUT the next time...
A boy gives a girl a flower ---> the girl takes the flower ---> the girl does nothing.
Pop quizzes are another good example.
The consequences of variable ratio partial reinforcement are:
slower learning,
very fast responding,
very slow extinction
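The difference between continuous, fixed ratio, and variable ratio reinforcement can be sketched by listing which responses get reinforced. This is a hedged illustration only. The function names are hypothetical, and the variable ratio rule (drawing each required count uniformly from 1 to 2 * mean_ratio - 1) is an assumption made so that the counts average out to the stated ratio.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Return the response numbers that get reinforced on a fixed ratio schedule."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


def variable_ratio(n_responses, mean_ratio, rng):
    """Return the reinforced response numbers on a variable ratio schedule.

    Assumption: the number of responses needed for each reinforcer is drawn
    uniformly from 1 .. 2*mean_ratio - 1, so it averages mean_ratio.
    """
    reinforced = []
    next_reinforced = rng.randint(1, 2 * mean_ratio - 1)
    for i in range(1, n_responses + 1):
        if i == next_reinforced:
            reinforced.append(i)
            next_reinforced = i + rng.randint(1, 2 * mean_ratio - 1)
    return reinforced


rng = random.Random(0)  # fixed seed so the sketch is repeatable
print("continuous (every response):", fixed_ratio(10, 1))
print("fixed ratio 5:", fixed_ratio(20, 5))
print("variable ratio, mean 5:", variable_ratio(20, 5, rng))
```

On the fixed schedule the reinforced responses are perfectly predictable (every 5th car sold). On the variable schedule the gaps differ from one reinforcer to the next, like the slot machine.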
Partial Reinforcement "Schedules": Examples of Interval Schedules
With interval schedules you're reinforced only after a certain interval of time. The consequences of this type of partial reinforcement differ depending upon exact schedule used. There are two primary types of interval schedules: fixed and variable.
Fixed interval partial reinforcement
With fixed interval partial reinforcement, you get reinforced for the first response you make after a fixed period of time has elapsed SINCE the last reinforcement. For example:
Getting paid every two weeks
Having an exam every four weeks
The consequences of fixed interval partial reinforcement are that you emit the response right before you know you'll be reinforced and that's it (think of studying).
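A rough sketch can show why responding bunches up right before the reinforcer on a fixed interval schedule. It assumes the usual textbook rule that the first response made once the interval has passed since the last reinforcer is the one that pays off. The function name, the 28-day interval, and the two study patterns are hypothetical.

```python
def fixed_interval(response_times, interval):
    """Return the response times that get reinforced on a fixed interval schedule.

    Assumption: the first response made once `interval` time units have passed
    since the last reinforcer (or since time 0) is the one reinforced.
    """
    reinforced = []
    last_reinforcer = 0
    for t in sorted(response_times):
        if t - last_reinforcer >= interval:
            reinforced.append(t)
            last_reinforcer = t
    return reinforced


# Hypothetical study days over 8 weeks, with a payoff available every 28 days.
steady = list(range(1, 57))   # studies every single day
crammer = [27, 28, 55, 56]    # studies only right before each exam
print("steady student reinforced on days:", fixed_interval(steady, 28))
print("crammer reinforced on days:", fixed_interval(crammer, 28))
```

Both hypothetical students collect the same reinforcers, but the crammer does so with far fewer responses.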
Variable interval partial reinforcement
With variable interval partial reinforcement, you get reinforced "on average" every x time intervals for making a response. For example:
I say you'll be tested on average every three weeks during a 16-week period. You don't know exactly when the tests will happen. You could be tested:
Week 2, Week 7, Week 11, Week 13 and Week 14
This averages out to being tested roughly every three weeks. But you don't know exactly when the test is going to take place. Since you can't exactly predict the test dates, it keeps you on your toes (with textbook in hand!). This is better to use than a fixed interval of exactly every 3 weeks. If you knew you'd be tested every third week, you'd only study RIGHT BEFORE each test.
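The "roughly every three weeks" figure can be checked with a little arithmetic. This sketch treats the five listed weeks as the test dates. The variable names are hypothetical, and counting the first gap from week 0 is an assumption.

```python
test_weeks = [2, 7, 11, 13, 14]  # the example weeks listed in the notes
term_length = 16                 # weeks in the period

# Gap before each test, counting the first gap from the start of the term (week 0).
gaps = [b - a for a, b in zip([0] + test_weeks[:-1], test_weeks)]
mean_gap = sum(gaps) / len(gaps)
weeks_per_test = term_length / len(test_weeks)

print("gaps between tests (weeks):", gaps)
print("mean gap (weeks):", mean_gap)
print("term length / number of tests:", weeks_per_test)
```

Either way of averaging lands close to three weeks, even though the individual gaps range from one week to five.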
Ivan Pavlov
Pavlov was awarded the Nobel Prize in 1904 for research in what area?
ANSWER: the reflexes involved in digestion
Pavlov, dogs, and a lot of saliva.
The fact that Little Albert, after being conditioned to fear white rats, also feared cotton balls, stuffed white animals and little pieces of white fur is an example of what?
ANSWER: generalization
What kind of animals did Edward Lee Thorndike train to escape from puzzle boxes?
ANSWER: cats
Edward Lee Thorndike
What do we call the use of operant conditioning to control physiologically based problems such as high blood pressure and migraine headaches?
ANSWER: biofeedback training
A secondary reinforcer, like money, which can be saved and exchanged later for another reinforcer is called what?
ANSWER: token
The Cognitive Perspective on Learning: Acquiring Information About the World.
Watson's and Skinner's Radical Behavioral Perspective
Remember they saw "mind" as a "black box." Thinking/cognition, to them, is irrelevant to accounting for how we learn. All we need to know from Watson's or Skinner's perspective are the environmental contingencies, from which we can predict/control your behavior.
More "Liberal" Views of Learning
S (stimulus) - O (organism) - R (response)
The organism interprets (perceives, anticipates, "thinks" about) the stimulus before any response is made. This interpretation affects what they learn/how they behave.
Razran's, Volkova's, Herrnstein's experiments nicely illustrate this. Study these!
Think (oops, I used that word again) of Herrnstein's experiment with concept formation in Shakespearean pigeons. They obviously had to analyze the slides in order to learn when to emit the response and when not to.
"To peck or not to peck, that is the question!"
The cognitive view of learning often seems so obvious to us that it's difficult to fathom learning without cognition!
The cognitive approach is represented well in an area called "animal cognition."
Back to Pavlov and Classical (S-S) Conditioning
What is the cognitive perspective on classical conditioning?
The organism has learned an expectancy. For example, the baby learned to expect a "bee" (the unconditioned stimulus) upon seeing flowers (the conditioned stimulus).
flowers ---> bee expectation ---> crying
music ---> shark expectation ----> fear
Back to Operant (S-R) Conditioning
First remember what S-R theory said (in much simplified form):
Basic Ideas - Example
There is a STIMULUS to behavior - Cute guy at party grabs your attention
You give a RESPONSE - You flirt with him
There is a CONSEQUENCE of your response - He asks you out for a date (probably: positive reinforcement for you)
CONSEQUENCE strengthens your flirting behavior - You flirt again....and on and on until you get married
So, the consequence (reinforcer) stamps in the link between stimulus (cute guy) and your response (flirting)
Now consider Tolman's means-end idea (more cognitive explanation):
Basic Idea - Example
There is a STIMULUS to behavior - Cute guy at party grabs your attention
You THINK of what to do - "I could flirt with him and he'd like that...and ask me out." (You know that flirting is a MEANS to getting asked out, the END.)
You're still THINKING - "But, do I really want to go out with him?"
You DECIDE what to do - "I guess I'll flirt."
You give a RESPONSE - You flirt with him
There is a CONSEQUENCE - He asks you out....
etc.
There was a lot of resistance to this kind of idea. It seemed to imply so much conscious AWARENESS and deliberation about our behavior. But, Tolman (and others) didn't mean that we (or the rat or the pigeon) sat there consciously deliberating. We do, however, have knowledge of which behaviors will lead to which consequences; we also "decide" whether to engage in behaviors depending upon whether we want that particular consequence at that particular moment in time!
Other evidence for the cognitive explanation:
You should know:
What is a negative contrast effect?
What is a positive contrast effect?
How are these explained from cognitive perspective?
Do they occur in all organisms? Why? Why not?
How do organisms learn cognitive maps?
What actually is involved in place learning if it's NOT reward?
Overjustification effect
Strict S-R interpretation: larger reward should increase rate of behavior, right?
Wrong: sometimes extrinsic rewards actually decrease rate of behavior
Why? Lepper, Greene, & Nisbett Study
Do rewards necessarily increase the probability of behavior?
Nursery school children
Expected reward - Knew they'd receive reward if they drew well with magic markers.
Unexpected reward - Did not expect reward, but did receive it.
No reward - Just played with magic markers. Didn't expect & didn't receive reward.
The findings of the Lepper, Greene, & Nisbett study
Expected reward condition led to a decrease in the behavior during post-test. Expected reward decreases intrinsic interest value of the task. Expected reward "makes" you feel externally rather than internally controlled.
Observational Learning
Back to Tolman and Latent Learning
Tolman's place learning research showed that animals learn (= acquire a new behavior) even if they aren't rewarded for it. Rewards don't seem to necessarily/always affect learning. Instead, rewards sometimes affect whether we show the behavior we've learned.
How does this explain Tolman's research on place learning? (Know this)
Is all voluntary behavior learned this way? Do we have to respond to learn?
This led to another Basic Idea: We can learn from observing the behavior of others.
If this is true, you do not have to actually experience consequences or respond in order to learn. For example, children will do not only what they are reinforced for. They will also do what they see you do, what they see on TV, and what your friends, their friends, their teachers, etc., do. This is known as Modeling or Observational Learning.
Bandura's Bobo doll experiments with preschool kids
Rank Order the following conditions, using 1-3
(1) = When given the chance to display aggression, these kids would aggress the most
(2) = When given the chance to display aggression, these kids would aggress the next most
(3) = When given the chance to display aggression, these kids would aggress the least
After seeing an adult model punished for aggressing?
After seeing an adult model reinforced for aggressing?
After seeing an adult model aggress (without being reinforced or punished)?
Factors Influencing Observational Learning?
Attention? Retention? Motor Reproduction? Reinforcement/Incentive?
Which of these is involved in actual learning/acquisition?
Which of these is involved in whether what you've learned is displayed?
The Ecological Perspective on Learning: Filling in the Blanks in Species-Typical Behavior
Ecological Perspectives
Ecological perspectives study species-specific learning. These learning mechanisms evolved to meet survival-related purposes
Example from Food-Aversion Learning
Food-aversion learning used to be interpreted in terms of classical conditioning, but the delay between eating the food and becoming ill is too long for classical conditioning to explain it. Garcia interpreted food-aversion learning in terms of the function that food aversions evolved to serve for organisms.
Examples:
Food-Preference Learning
Biases in Learning Fears
Imprinting