Chapter 10 Summary

Two-factor theory addressed the question of what reinforces an avoidance response. In modern terms, the theory proposes that (a) organisms associate stimuli in the environment with an aversive O, which allows those stimuli to evoke fear; and (b) the avoidance response is reinforced when it eliminates or escapes those warning stimuli and therefore causes fear reduction. Two-factor theory emphasizes the interaction between stimulus learning (Pavlovian fear-conditioning) and response learning (operant/instrumental reinforcement through fear reduction).
Two-factor theory was challenged by the fact that avoidance learning can occur if the response simply reduces the rate of aversive stimulation, without any explicit warning stimuli, and by the fact that the strength of avoidance behavior is not correlated with overt levels of fear. These challenges were addressed by noting that temporal cues that predict O can become conditioned fear stimuli and that “fear” is best defined as a central state or expectancy rather than a peripheral response (see Chapter 9).
Two-factor theory’s emphasis on reinforcement by fear reduction ran into difficulty when it was discovered that escaping the warning signal is not important when the animal can avoid by performing a natural behavior that has presumably evolved to avoid predation—a so-called species-specific defense reaction (SSDR).
SSDR theory emphasized the organism’s evolutionary history. Avoidance learning was thought to occur rapidly if the required response resembled a natural defensive behavior. If not, learning was thought to depend more on feedback (or perhaps on reinforcement by the inhibition of fear that feedback cues provide).
The field’s approach to avoidance behavior has become more ethological (it considers the function of natural defensive behavior), more Pavlovian (natural SSDRs appear to be respondents guided by learning about environmental cues rather than operants reinforced by their consequences), and more cognitive in the sense that what is learned now appears to be separate from what is shown in behavior.
Exposure to uncontrollable aversive events can interfere with subsequent escape or avoidance learning (the “learned helplessness effect”). Although exposure to uncontrollable aversive events has many effects, one is that organisms may learn that their behavior is independent of O. If an aversive O is uncontrollable, it also has especially pernicious effects (e.g., it can cause more fear conditioning and may lead to stomach ulceration).
In appetitive learning (in which organisms learn to respond to earn positive Os like food), different behaviors are also learned at unequal rates, and natural behaviors that are elicited by Ss that predict Os can intrude. Pavlovian learning is always occurring in operant learning situations, and its impact is difficult to ignore.
S-O and R-O learning often work in concert. For example, when an operant behavior is punished, the organism may stop responding either because it associates the response with the aversive consequence (R-O) or because it associates nearby stimuli with that consequence (S-O) and withdraws from them (negative sign tracking). Conversely, in reward learning, the organism may respond either because it associates the response with reward or because it associates nearby stimuli with reward and approaches them (positive sign tracking).
Animals learn several things in instrumental/operant situations: They associate their behavior with its consequences (R-O), they associate stimuli in the environment with those consequences (S-O), they may learn that stimuli in the environment signal the current relationship between the behavior and its consequences (occasion setting or S-[R-O]), and they may learn a simple association between the environmental cues and the response (S-R).
The reinforcer devaluation effect provides clear evidence of R-O learning: If a rat is taught to press a lever for sucrose and then sucrose is separately associated with illness, it will press the lever less—even though the response has never been paired with sucrose after its association with illness. Thus, the rat learns that lever pressing leads to sucrose and then responds according to how much sucrose is liked or valued. Incentive learning plays a crucial role in the assignment of value.
Organisms can also learn what specific O follows an S. S-O learning influences instrumental action by motivating the response (see Chapter 9) and by allowing S to evoke behavior directly (including positive or negative sign tracking). The laws of S-O learning were discussed in Chapters 3 through 5. Learning about more “complex” stimuli, such as polymorphous categories and temporal and spatial cues, was reviewed in Chapter 8; it appears to follow similar rules.
S can also “set the occasion” for an R-O relationship (S-[R-O] learning) in a way that is distinct from simple S-O learning. For example, organisms can learn that S signals that they must now perform R to obtain an O that is otherwise freely available.
Organisms may also learn to perform an R reflexively, without regard to its consequences—out of habit (S-R learning). S-R learning may become especially important after many repetitions of the instrumental action and may function to keep working memory free and available for other activities.
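The contrast between R-O learning and S-R learning in the reinforcer devaluation effect can be made concrete with a toy calculation. The sketch below is only an illustration, not a model taken from this chapter: the additive rule, the function name `response_tendency`, and every numerical value are assumptions chosen for clarity. It treats the goal-directed part of responding as the R-O association multiplied by the current value of O, and the habitual part as an S-R term that ignores the value of O.

```python
def response_tendency(r_o, outcome_value, s_r=0.0):
    """Toy rule (an assumption, not an established model).

    r_o           -- strength of the response-outcome association (R-O)
    outcome_value -- current value of O (1.0 = valued, 0.0 = devalued)
    s_r           -- strength of the stimulus-response habit (S-R)
    """
    goal_directed = r_o * outcome_value  # sensitive to the value of O
    habit = s_r                          # insensitive to the value of O
    return goal_directed + habit


# Hypothetical rat early in training: responding rests on R-O learning only.
early_before = response_tendency(r_o=0.8, outcome_value=1.0)
early_after = response_tendency(r_o=0.8, outcome_value=0.0)

# Hypothetical rat after many repetitions: an S-R habit has also formed.
late_before = response_tendency(r_o=0.8, outcome_value=1.0, s_r=0.5)
late_after = response_tendency(r_o=0.8, outcome_value=0.0, s_r=0.5)

print("early training:", early_before, "->", early_after)
print("extended training:", late_before, "->", late_after)
```

Under these assumptions, devaluing sucrose removes the goal-directed term entirely, so the early-training response tendency falls to zero, whereas the habit term survives devaluation after extended training. This mirrors the idea that S-R responding proceeds without regard to its consequences.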