Chapter 4
Basic Processes of Learning

Outline

Introduction
The Behavioral Perspective on Learning
The Cognitive Perspective on Learning
The Ecological Perspective on Learning



Introduction

Learning: A Working Definition

A relatively permanent change in behavior, thoughts or feelings as a result of experience. It allows us to:

Use past experience to predict the future.

Adapt to a rapidly changing environment.

Exert control over our environment.

Examples:

Which of these represent learning (drawn from Rocklin, 1987)?

an infant who stops thumb-sucking?

children being able to use language?

a patient who has a lobotomy and no longer manifests psychotic behavior?

a zinnia plant being pinched back and then growing more dense foliage and flowers?

Zeb throwing his cigarettes away after 30 years of smoking two packs a day?

A computer that uses its first 100 tabulations to influence its choice of opening moves in a chess game?


How do we adapt to the world? Through learning?

Learning may be similar for lower animals and humans. If we can understand learning we can understand much about behavior.


The Basic Issues (that we'll cover anyway):

Does learning not involve thinking? (that's what early researchers thought)

Are the behaviors of ourselves and other animals governed by similar laws of learning?

Can lower animals learn?

Do we learn in the same way as animals? (If yes, and if we can understand these laws, the potential for improving life is immeasurable.)


What governs most of our behavior?

Unconscious motives & innate drives?

Rational, logical problem solving?

Learned habits? What's rewarded?

How about aggression: Is it learned or innate?

How about helplessness? Learned, innate? An excuse for laziness?

How about depression? Learned, innate? A way to manipulate others?

How do we learn fears, guilt, pleasures?

Do we learn more when the reward is greater?

The greater the reward, the more we enjoy the behavior, right?


Who is the acknowledged founder of behaviorism?

ANSWER: John B. Watson


(Cartoons by Mark Parisi. Used by special permission. For many more, visit his site.)

The Behavioral Perspective on Learning: Acquiring New Responses to and for Stimuli

Learning by Association (simplest form of learning)

Ivan Pavlov, a turn-of-the-century Russian physiologist, confronted with the ability of his research dogs to adapt to laboratory conditions, set out to understand learning and made important discoveries.


Classical or Pavlovian Conditioning

The connectionist explanation: we learn to associate ideas or events because they occur close together. Think of the adaptive value of learning when one event predicts another.


Pavlov and his experiments: Digestion in dogs

Pavlov example: Dogs reflexively salivate to food in mouth.

Food is an unconditioned stimulus (UCS) to the salivation response.

Salivation is the unconditioned response (UCR) to the food (you don't need to learn to salivate to food; it's an automatic response).

Pavlov wondered whether you could take these natural associations between certain stimuli and responses and use them to produce true learning.

He noticed that his dogs would start to salivate before they were even given any food, e.g., when the lab assistant simply opened the door to the room in which dogs were housed (just like you might start to "salivate" when the clock chimes 6 p.m.)

He asked: What is going on here? Developed experiment to test some ideas.

Took a neutral stimulus = a stimulus that can catch the learner's attention but does not elicit the UCR (e.g., a bell)

At baseline, the situation looks like:

Neutral Stimulus (Rung Bell) ----> No salivation

Unconditioned Stimulus (Food; UCS)--> Salivation(UCR)

During the conditioning (learning) trials:

Repeatedly rang bell just before presenting food

Neutral Stimulus (Rung Bell) + UCS (Food) ---> Salivation (UCR)

After conditioning (learning):

Repeatedly rang bell without showing dog any food & he finds:

Conditioned Stimulus (Rung Bell; CS)----->Salivation (Conditioned Response; CR)

The initially neutral stimulus (bell) became a learned (conditioned) elicitor of the response

conditioned stimulus (CS) = a neutral stimulus which you have learned to give the response to.

conditioned response (CR) = similar to UCR but is made to CS because of learning
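
The three phases above (baseline, conditioning trials, after conditioning) can be sketched in a few lines of Python. This is only an illustration and not Pavlov's own analysis: the update rule (each bell-food pairing closes a fixed fraction of the gap to a maximum strength), the function names `condition` and `salivates`, the learning rate of 0.3 and the response threshold of 0.5 are all assumptions made for the sketch.

```python
def condition(strength, pairings, rate=0.3):
    """Return the bell-food associative strength after repeated pairings.

    Hypothetical rule: each pairing closes a fixed fraction (rate) of
    the gap between the current strength and the maximum of 1.0.
    """
    for _ in range(pairings):
        strength += rate * (1.0 - strength)
    return strength


def salivates(strength, threshold=0.5):
    """The bell alone elicits salivation (CR) once strength passes threshold."""
    return strength > threshold


baseline = 0.0                     # neutral stimulus: the bell means nothing yet
trained = condition(baseline, 5)   # five trials of bell just before food

print(salivates(baseline))  # bell alone, before conditioning
print(salivates(trained))   # bell alone, after conditioning
```

Under these assumed numbers the bell does not elicit salivation at baseline but does after five pairings, which mirrors the change from neutral stimulus to CS.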

Examples of classical conditioning in humans

Many phobias:

fear of dental work

fear of dogs

fear of snakes

• waking up right before your alarm goes off

• feeling nauseous or dizzy upon entering a hospital

• many ad campaigns use classical conditioning

• How Tamara learned to:

never eat raisins again

hate the movie Gulliver's Travels (as a kid)

• Little Albert's fears (Watson & Rayner)


How classical conditioning applies to our fear in the movie "Jaws"


Important Phenomena associated with Classical Conditioning

Extinction

(similar to common sense idea of "forgetting")

After the conditioning phase, the CR gets weaker and weaker when the CS is NOT accompanied by the UCS (food). It gets weaker NOT because the organism no longer remembers the UCS-CS connection. It gets weaker because CR is somehow inhibited.


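How does Pavlov know this? Spontaneous recovery

After extinction trials, the passage of time alone can partially restore the CR to the CS. In addition, the dog will start to salivate again at full strength in response to the CS after only one new pairing of UCS and CS. Neither would happen if the association had simply been erased.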

Is there a best time relation between CS and UCS? Complicated. Just remember:

Law of Contiguity: The closer the two are in time (with neutral stimulus preceding presentation of UCS), the stronger the conditioning (generally speaking)


Generalization

In generalization, the CR will be 'given' to similar stimuli. Examples:

salivation and tones vs. Bach

flowers and fear


Discrimination

In discrimination, the CR is 'given' to some stimuli, but not others. Examples:

salivation and tones vs. Bach

flowers and fear

Remember about Classical Conditioning

Learning takes place fairly involuntarily. It is a more passive kind of learning. What is learned is an association between two stimuli (the neutral stimulus and unconditioned stimulus).

IMPORTANT:

Know why Pavlov would call classical conditioning S-S learning

Know why Watson would call this S-R learning

Be able to explain how Rescorla's study supported the S-S interpretation

Operant Conditioning


The area of Operant Conditioning owes its heritage to Thorndike's Law of Effect

The fundamental principle behind operant conditioning is that responses leading to "pleasant" effects are more likely to be repeated in future than those leading to "discomforting" ones.


Idea of Operant Conditioning

You operate on the environment and your behavior has effects. The effects or consequences of your voluntary behavior determine which behaviors or responses you learn, retain, or forget (Skinner).

This principle doesn't necessarily apply to reflexes and emotions (like classical conditioning does).

Does it apply to heart rate, blood pressure, love, anger, fear? Only if you can voluntarily regulate them.


Differences with classical conditioning

In classical conditioning the most important components are:

stimuli eliciting ---> responses
"Meat" ---> "dog salivating (involuntary)"

new stimuli (acting as substitutes)
"bell" eliciting ---> salivation

In operant conditioning the important components are:

behaviors ---> leading to ---> consequences

boy gives girl a flower ---> girl takes flower, ---> kisses boy and says "thank you"

The consequences influence your future behavior.

cue ---> organism responds ---> leading to consequence


What types of consequences are important?

If operant conditioning (learning) is controlled by the consequences, then do different consequences lead to different results? Yes!

Reinforcers

Reinforcers by definition are "consequences" that strengthen behavior.

cue----> organism responds ---> has reinforcing consequence

(a) boy gives flower---> girl takes flower ---> girl kisses boy

Giving the flower is reinforced by the girl kissing the boy.


(b) new toy ---> child throws tantrum ---> gets toy

The child throwing tantrum is reinforced by the child getting the toy.


(c) homework ---> you study ---> good grade

Studying is reinforced by you getting good grade.

These are all examples of positive reinforcement, in which a consequence that is pleasant for the organism strengthens the behavior that it follows. Note that here a pleasant stimulus comes after the response.


Punishments

Punishments by definition refer to consequences that decrease the behavior they follow

cue----> organism responds ---> has punishing consequence

(a) boy gives flower---> girl takes flower ---> girl slaps boy

Giving flower is punished by the girl slapping the boy.

(b) new toy ---> child throws tantrum---> doesn't get toy

The child throwing a tantrum is punished by not getting the toy.


(c) homework ---> you study ---> worst grade ever

Studying is punished by you getting bad grade.


Punishment is not always good to use

There are many negative side effects of punishment, such as anger, sadness and avoidance.

Can you think of other side effects?

It is important to remember that what is reinforcing for one person might not be for another. So don't fall into the trap of assuming that you know what is reinforcing for a person. The important thing is to find out what is reinforcing for the person whose behavior you are trying to change. Then, use this (not your own preconceived notions) to change that person's behavior. The same idea applies to punishment. What might be a punishment for you could be a reinforcer for another person, or irrelevant to them.

Perverted example: The flower guy might be a masochist who likes being slapped. In this case, will his flower giving be reinforced or punished by the girl slapping him?

A more normal example: Kids are fighting. Mom sends older brother to his room. Mom's intent is to punish the older brother. But, is this necessarily what she is doing?


Negative Reinforcers

All of the above is relatively easy. Now comes the part that is confusing to some students:

Remember reinforcers necessarily strengthen behavior. Some reinforcers follow the behavior they are meant to strengthen. These are positive reinforcers. They are called positive NOT because they feel good. Don't confuse positive with a value judgment. They are called positive in this literature because they are consequences that are ADDED TO (+) the behavior they are meant to reinforce -- they FOLLOW that behavior.

Negative reinforcement, by definition, occurs when a behavior is strengthened because performing it REMOVES an annoying stimulus for the organism.

Some reinforcers operate because they remove or withdraw an aversive stimulus for the organism.

Examples:

Seat belt buzzer ---> you put seat belt on ---> turns off alarm.

Kid's tantrum ---> Mom gives in ---> turns off kid's tantrum

Husband beats wife --> Wife says "I love you" --> husband stops beating

Dad nags to mow lawn --> You mow lawn --> Dad stops nagging

In all of these examples: Doing the behavior (seat belt on, Mom giving in, wife professing love, you mowing lawn) REMOVES the aversive stimulus (buzzer, tantrum, beating, nagging). Unlike positive reinforcement, the aversive stimulus typically precedes the behavior that it is meant to reinforce. By turning off the stimulus, you are reinforced. It is called negative reinforcement because doing the behavior SUBTRACTS (-) away the aversive stimulus. It is also known as ESCAPE conditioning.
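
The definitions in this section reduce to two questions: was a stimulus added or removed, and did the behavior get stronger or weaker? A minimal sketch of that decision rule follows. The function name `classify` and its string arguments are hypothetical labels chosen for illustration, not standard terminology from the literature.

```python
def classify(stimulus_change, behavior_effect):
    """Label a consequence using the definitions in these notes.

    stimulus_change: "added" or "removed" (what happens when the behavior occurs)
    behavior_effect: "strengthens" or "weakens" (what happens to the behavior)
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")
    if behavior_effect not in ("strengthens", "weakens"):
        raise ValueError("behavior_effect must be 'strengthens' or 'weakens'")
    if behavior_effect == "weakens":
        return "punishment"  # defined by its effect, whatever the trainer intended
    if stimulus_change == "added":
        return "positive reinforcement"
    return "negative reinforcement"


# Girl kisses boy after he gives a flower, and he gives more flowers.
print(classify("added", "strengthens"))    # → positive reinforcement
# Buckling the seat belt turns off the buzzer, and you buckle up more often.
print(classify("removed", "strengthens"))  # → negative reinforcement
# Girl slaps boy, and he stops giving flowers.
print(classify("added", "weakens"))        # → punishment
```

Note that the label depends on the observed effect on behavior, not on how the consequence looks from outside: a slap that increases flower giving would come out as positive reinforcement.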


Important Phenomena associated with Operant Conditioning

Extinction

Extinction occurs when the operant behavior is weakened because it is no longer followed by a reinforcer.

cue ---> organism responds ---> nothing

boy gives girl a flower ---> girl takes flower ---> girl does nothing

new toy ---> child throws tantrum ---> nothing

Eventually, boy will stop giving flowers & kid will stop throwing tantrums


Generalization

Generalization is when you are teaching an organism to display the behavior in response to cues that are similar, but not identical, to those used in the initial learning trials. We'll see this in the movie Harry.


Discrimination

Discrimination is when you are teaching an organism to display the behavior in response to certain cues, and not others. That is, the organism should respond only when the discriminative stimulus appears. We'll also see this in the movie Harry.

Other examples of discrimination:

Stop signs (vs. yield signs)

Three knocks at door (vs. fewer or more)

Mom's good mood (vs. bad mood)


Shaping

Shaping occurs when successively closer approximations to the ultimately desired behavior are reinforced (until the behavior you really want is established).

Sometimes the behavior we want the organism to learn doesn't automatically appear for us to reinforce. Instead, we need to use principles of reinforcement to get the organism "there" by progressively reinforcing responses that "look" like the response we want.

Examples of Shaping:

Potty training

Animal training

Examples in the movie Harry.
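
Shaping can be pictured as a trainer who keeps raising the bar. The toy simulation below is a sketch under invented assumptions: behavior is reduced to a single number, the organism's responses vary randomly around its current habit, and a reinforced response becomes the new habit. The function `train` and all of its parameters are hypothetical.

```python
import random


def train(target=10.0, step=1.0, trials=2000, shaping=True, seed=0):
    """Return the organism's habitual response level after training."""
    rng = random.Random(seed)
    habit = 0.0  # the response the organism currently tends to emit
    # With shaping, start by reinforcing a small first approximation;
    # without it, reinforce only the finished behavior.
    criterion = step if shaping else target
    for _ in range(trials):
        response = habit + rng.uniform(-1.5, 1.5)  # natural variability
        if response >= criterion:
            habit = response  # the reinforced response becomes the new habit
            if shaping:
                # Demand a slightly closer approximation next time.
                criterion = min(target, habit + step)
        if habit >= target:
            break
    return habit


print(train(shaping=True) >= 10.0)  # did shaping reach the target behavior?
print(train(shaping=False))         # waiting for the full behavior to appear
```

Without shaping, the simulated organism never happens to emit the target response, so there is never anything to reinforce and its habit stays at 0.0. That is the situation the paragraph above describes.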

Primary vs. Secondary Reinforcers

Primary reinforcers: Person doesn't need to learn that consequence is reinforcing. These consequences are innately reinforcing (like food, water, sex).

Secondary reinforcers: Person first needs to learn that stimulus has reinforcing value (e.g., money, compliments). See Gray on tokens.

Oftentimes, we'll start out using primary reinforcers and then switch to secondary. Why switch? Primary are more "expensive" and more difficult to come up with when needed!


Maximizing learning

With either reinforcers or punishers, they must be:

contingent (consistent)

immediate (e.g., with punishment, it must be swift and certain)

right magnitude

Schedules of reinforcement

Question: Is it important to consistently reinforce, punish or extinguish behavior? Or is it just as effective to occasionally do so?


Continuous reinforcement

This is simply when you're reinforced every time you respond. For example:

A boy gives girl a flower ---> the girl takes flower ---> kisses boy and says "thank you" EVERY SINGLE TIME.

The consequences of continuous reinforcement are:

very rapid learning

rapid responding

very rapid extinction (forgetting)


Partial Reinforcement "Schedules": Examples of Ratio Schedules

Partial reinforcement is when you're reinforced after "a number of the desired behaviors."

The consequences of partial reinforcement differ depending upon the exact schedule used. There are two primary types of ratio schedules: fixed and variable.


Fixed ratio partial reinforcement

In fixed ratio partial reinforcement, the number of responses until reinforcement is delivered is absolutely constant. You know exactly how often you have to show the desired behavior before you get reinforced. For example:

Every 5th car I sell, I get a bonus.

Every 10th customer I sign up with MCI, I get paid.

Every 20th time I bug my Mom for clothes, she buys them.

The consequences of fixed ratio partial reinforcement are:

slow learning,

variable responding

slow extinction


Variable ratio partial reinforcement

In variable ratio partial reinforcement, the number of responses until reinforcement varies randomly, very much like a slot machine. You don't have a clue how many responses you need to emit in order to receive the desired consequence.

What if you didn't know how many cars you had to sell, customers you had to sign, flowers you had to give to get the desired effect? What would your behavior look like? For example:

A boy gives a girl a flower ---> the girl takes the flower ---> kisses boy and says "thank you".

BUT the next time...

A boy gives a girl a flower ---> the girl takes the flower ---> the girl does nothing.

Pop quizzes are another good example.

The consequences of variable ratio partial reinforcement are:

slower learning,

very fast responding,

very slow extinction
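
The difference between the two ratio schedules can be made concrete with a short sketch. The helper names `fixed_ratio` and `variable_ratio` are hypothetical, and the variable schedule is simplified to a fixed payoff probability on every response (an assumption; it is one simple way to get an unpredictable count with a known average).

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response (e.g. a bonus on every 5th car sold)."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, seed=0):
    """Reinforce after an unpredictable number of responses averaging mean_n."""
    rng = random.Random(seed)

    def respond():
        # Each response pays off with probability 1/mean_n, like a slot machine.
        return rng.random() < 1.0 / mean_n

    return respond


fr5 = fixed_ratio(5)
print([fr5() for _ in range(10)])
# → [False, False, False, False, True, False, False, False, False, True]

vr5 = variable_ratio(5)
rewards = sum(vr5() for _ in range(10_000))
print(rewards / 10_000)  # near 0.2 overall, but no single payoff is predictable
```

On the fixed schedule you can count your way to the next reinforcer; on the variable schedule the very next response might always be the one that pays, which fits the fast responding and slow extinction listed above.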


Partial Reinforcement "Schedules": Examples of Interval Schedules

With interval schedules you're reinforced only after a certain interval of time. The consequences of this type of partial reinforcement differ depending upon exact schedule used. There are two primary types of interval schedules: fixed and variable.

Fixed interval partial reinforcement

With fixed interval partial reinforcement, you get reinforced after a certain period of time has elapsed SINCE the last time you emitted the desired response. For example:

Getting paid every two weeks

Having an exam every four weeks

The consequences of fixed interval partial reinforcement are that you emit the response right before you know you'll be reinforced and that's it (think of studying).

Variable interval partial reinforcement

With variable interval partial reinforcement, you get reinforced "on average" every x time intervals for making a response. For example:

I say you'll be tested on average every three weeks during a 16-week period. You don't know exactly when the test will happen. You could be tested:

Week 2, Week 7, Week 11, Week 13 and Week 14

This averages out to being tested roughly every three weeks. But you don't know exactly when the test is going to take place. Since you can't exactly predict the test dates, it keeps you on your toes (with textbook in hand!). This is better to use than a fixed interval of exactly every 3 weeks. If you knew you'd be tested every third week, you'd only study RIGHT BEFORE each test.
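
The arithmetic behind "roughly every three weeks" is easy to check. The sketch below compares the test weeks given above with a fixed schedule; the fixed schedule's weeks (3, 6, 9, 12, 15) and the helper name `gaps` are assumptions added for the comparison.

```python
def gaps(test_weeks, start=0):
    """Weeks elapsed between consecutive tests, counting from `start`."""
    previous = [start] + list(test_weeks[:-1])
    return [week - prev for week, prev in zip(test_weeks, previous)]


fixed = [3, 6, 9, 12, 15]      # fixed interval: a test every third week
variable = [2, 7, 11, 13, 14]  # variable interval: the weeks listed above

for name, weeks in (("fixed", fixed), ("variable", variable)):
    g = gaps(weeks)
    print(name, g, sum(g) / len(g))
# fixed [3, 3, 3, 3, 3] 3.0
# variable [2, 5, 4, 2, 1] 2.8
```

Both schedules average about three weeks between tests, but only the fixed one lets you predict the next test from the last one. The variable gaps range from one week to five, which is what keeps you studying steadily.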


Ivan Pavlov

Pavlov was awarded the Nobel Prize in 1904 for research in what area?

ANSWER: the reflexes involved in digestion

Pavlov, dogs, and a lot of saliva.

The fact that Little Albert, after being conditioned to fear white rats, also feared cotton balls, stuffed white animals and little pieces of white fur is an example of what?

ANSWER: generalization

What kind of animals did Edward Lee Thorndike train to escape from puzzle boxes?

ANSWER: cats


Edward Lee Thorndike


What do we call the use of operant conditioning to control physiologically based problems such as high blood pressure and migraine headaches?

ANSWER: biofeedback training

A secondary reinforcer, like money, which can be saved and exchanged later for another reinforcer is called what?

ANSWER: token


The Cognitive Perspective on Learning: Acquiring Information About the World.

Watson's or Skinner's Radical Behavioral Perspective

Remember they saw "mind" as a "black box." Thinking/cognition, to them, is irrelevant to accounting for how we learn. All we need to know from W's or S's perspective are the environmental contingencies, from which we can predict/control your behavior.


More "Liberal" Views of Learning

S (stimulus) - O (organism) - R (response)

The organism interprets (perceives, anticipates, "thinks" about) the stimulus before any response is made. This interpretation affects what they learn/how they behave.

Razran's, Volkova's, Herrnstein's experiments nicely illustrate this. Study these!

Think (oops, I used that word again) of Herrnstein's experiment with concept formation in Shakespearean pigeons. They obviously had to analyze the slides in order to learn when to emit the response and when not to.

"To peck or not to peck, that is the question!"

The cognitive view of learning often seems so obvious to us that it's difficult to fathom learning without cognition!

The cognitive approach is represented well in an area called "animal cognition."


Back to Pavlov and Classical (S-S) Conditioning

What is the cognitive perspective on classical conditioning?

The organism has learned an expectancy. For example, the baby learned to expect a "bee" (the unconditioned stimulus) upon seeing flowers (the conditioned stimuli)

flowers ---> bee expectation ---> crying

music ---> shark expectation ----> fear

Back to Operant (S-R) Conditioning

First remember what S-R theory said (in much simplified form):

Basic Idea - Example

There is a STIMULUS to behavior - Cute guy at party grabs your attention

You give a RESPONSE - You flirt with him

There is a CONSEQUENCE of your response - He asks you out for a date (probably: positive reinforcement for you)

CONSEQUENCE strengthens your flirting behavior - You flirt again....and on and on until you get married

So, the consequence (reinforcer) stamps in the link between stimulus (cute guy) and your response (flirting)

Now consider Tolman's means-end idea (more cognitive explanation):

Basic Idea - Example

There is a STIMULUS to behavior - Cute guy at party grabs your attention

You THINK of what to do - I could flirt with him and he'd like that...and ask me out. (You know that flirting is a MEANS to getting asked out, the END.)

You're still THINKING - But, do I really want to go out with him?

You DECIDE what to do - I guess I'll flirt

You give a RESPONSE - You flirt with him

There is a CONSEQUENCE - He asks you out....

etc.

There was a lot of resistance to this kind of idea. It seemed to imply so much conscious AWARENESS and deliberation about our behavior. But, Tolman (and others) didn't mean that we (or the rat or the pigeon) sat there consciously deliberating. We do, however, have knowledge of which behaviors will lead to which consequences; we also "decide" whether to engage in behaviors depending upon whether we want that particular consequence at that particular moment in time!

Other evidence for the cognitive explanation:

You should know:

What is a negative contrast effect?

What is a positive contrast effect?

How are these explained from cognitive perspective?

Do they occur in all organisms? Why? Why not?

How do organisms learn cognitive maps?

What actually is involved in place learning if it's NOT reward?

Overjustification effect

Strict S-R interpretation: larger reward should increase rate of behavior, right?

Wrong: sometimes extrinsic rewards actually decrease rate of behavior

Why? Lepper, Greene, & Nisbett Study


Do rewards necessarily increase the probability of behavior?

Nursery school children

Expected reward - Knew they'd receive reward if they drew well with magic markers.

Unexpected reward - Did not expect reward, but did receive it.

No reward - Just played with magic markers. Didn't expect & didn't receive reward.


The findings of the Lepper, Greene, & Nisbett study

Expected reward condition led to a decrease in the behavior during post-test. Expected reward decreases intrinsic interest value of the task. Expected reward "makes" you feel externally rather than internally controlled.

Observational Learning

Back to Tolman and Latent Learning

Tolman's place learning research showed that animals learn (= acquire a new behavior) even if they aren't rewarded for it. Rewards don't seem to necessarily/always affect learning. Instead, rewards sometimes affect whether we show the behavior we've learned.

How does this explain Tolman's research on place learning? (Know this)

Is all voluntary behavior learned this way? Do we have to respond to learn?

This led to another Basic Idea: We can learn from observing the behavior of others.

If this is true, you do not have to actually experience consequences or respond in order to learn. For example, children do not do only what they are reinforced for. They also do what they see you do, what they see on TV, and what your friends, their friends, their teachers, and others do. This is known as Modeling or Observational Learning.

Bandura's Bobo doll experiments with preschool kids

Rank Order the following conditions, using 1-3

(1) = When given the chance to display aggression, these kids would aggress the most

(2) = When given the chance to display aggression, these kids would aggress the next most

(3) = When given the chance to display aggression, these kids would aggress the least

After seeing an adult model punished for aggressing?

After seeing an adult model reinforced for aggressing?

After seeing an adult model aggress without being reinforced or punished?

Factors Influencing Observational Learning?

Attention? Retention? Motor Reproduction? Reinforcement/Incentive?

Which of these is involved in actual learning/acquisition?

Which of these is involved in whether what you've learned is displayed?

The Ecological Perspective on Learning: Filling in the Blanks in Species-Typical Behavior

Ecological Perspectives

Ecological perspectives study species-specific learning. These learning mechanisms evolved to meet survival-related purposes.

Example from Food-Aversion Learning

Food aversions used to be interpreted in terms of classical conditioning, but the delay between eating and illness is too long for classical conditioning to really explain them. Garcia interpreted food-aversion learning in terms of the function that food aversions evolved to serve for organisms.

Examples:

Food-Preference Learning

Biases in Learning Fears

Imprinting