Chapter 7 Summary

Early thinkers had different ideas about what was going on in instrumental learning. Thorndike emphasized reinforcement: Satisfaction was supposed to stamp in an S-R association. Guthrie claimed that reinforcement was not necessary for learning; S and R were associated if they merely occurred together in time. Tolman argued that learning was mostly about stimuli (S-S) and that reinforcers were important for motivating performance even though they were not necessary for learning. The early theorists identified at least three possible functions of reinforcers: They might stamp in behavior (Thorndike), they might provide another stimulus (Guthrie), and they might motivate (Tolman).
Skinner’s “atheoretical” approach emphasized the strengthening effects of both primary and conditioned reinforcers and also emphasized stimulus control, the concept that operant behaviors occur in the presence of stimuli that set the occasion for them. Skinner invented the operant experiment, which is a method that examines how “voluntary” behavior relates to its payoff—the animal is free to repeat the operant response as often as it chooses.
Schedules of reinforcement provide a way to study how behavior relates to payoff. Ratio schedules require a certain number of responses for each reinforcer, and there is a direct relationship between behavior rate and payoff rate. Interval schedules reinforce the first response after a specified interval of time has elapsed. In this case, there is a less direct relationship between behavior rate and payoff rate. Different schedules generate their own patterns of responding.
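The difference between ratio and interval schedules can be made concrete with a small simulation. This is a minimal sketch, not a model from the chapter: it assumes a perfectly steady responder who emits one response every `irt_s` seconds, and the function names, the 60-minute session, and the schedule values are all illustrative assumptions.

```python
def fixed_ratio_reinforcers(irt_s, ratio, session_s=3600):
    """Reinforcers earned on a fixed ratio schedule.

    irt_s: seconds between successive responses (assumed constant).
    ratio: number of responses required per reinforcer.
    """
    responses = session_s // irt_s
    return responses // ratio


def fixed_interval_reinforcers(irt_s, interval_s, session_s=3600):
    """Reinforcers earned on a fixed interval schedule.

    The first response at or after `interval_s` seconds since the last
    reinforcer is reinforced; earlier responses earn nothing.
    """
    reinforcers = 0
    available_at = interval_s
    for t in range(irt_s, session_s + 1, irt_s):
        if t >= available_at:
            reinforcers += 1
            available_at = t + interval_s  # the interval restarts
    return reinforcers


if __name__ == "__main__":
    for irt in (6, 3):  # 10 vs. 20 responses per minute
        print(
            f"one response every {irt} s:",
            "FR 10 ->", fixed_ratio_reinforcers(irt, 10),
            "| FI 60 s ->", fixed_interval_reinforcers(irt, 60),
        )
```

Under these assumptions, doubling the response rate doubles the payoff on the ratio schedule but leaves the payoff on the interval schedule unchanged, which is one way to see why the relationship is "less direct" in the interval case.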
Choice is studied in concurrent schedules, where two behaviors are available and are reinforced according to their own schedules of reinforcement. Choice conforms to the matching law, which states that the percentage of behavior allocated to one alternative will match the percentage of reinforcers earned there. Even the simplest operant experiment involves this kind of choice because the organism must choose between lever pressing and all other available behaviors, which are reinforced according to their own schedules. According to the quantitative law of effect—an extension of the matching law—the rate of a behavior always depends both on its own reinforcement rate and on the reinforcement rate of other behaviors.
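Both laws can be written as one-line formulas. The sketch below uses the usual textbook forms, B1/(B1 + B2) = R1/(R1 + R2) for matching and B = K × R/(R + R_O) for the quantitative law of effect. The function and parameter names are assumptions chosen for illustration, and the numbers in the example are made up.

```python
def matching_share(r1, r2):
    """Matching law: predicted proportion of behavior on alternative 1.

    r1, r2: reinforcers earned on alternatives 1 and 2.
    """
    return r1 / (r1 + r2)


def quantitative_law_of_effect(r, r_other, k):
    """Predicted rate of a behavior (quantitative law of effect).

    r: reinforcement rate for the behavior itself.
    r_other: reinforcement rate for all other available behaviors.
    k: the maximum possible rate of the behavior.
    """
    return k * r / (r + r_other)


if __name__ == "__main__":
    # 40 reinforcers on the left key and 20 on the right:
    # about two-thirds of responses should go to the left.
    print(matching_share(40, 20))

    # Same reinforcement rate for lever pressing, but richer alternatives:
    print(quantitative_law_of_effect(60, 20, k=100))
    print(quantitative_law_of_effect(60, 60, k=100))
```

The second pair of calls illustrates the point made in the text: the predicted rate of a behavior falls when other behaviors are reinforced more, even though its own reinforcement rate has not changed.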
We often have to choose between behaviors that produce large, delayed rewards and behaviors that yield smaller but more immediate rewards. We are said to exercise self-control when we choose the large, delayed reward, but we are seen as impulsive when we go for the smaller, immediate reward. Choice here is lawful: A reinforcer’s value depends on both its size and how soon it will arrive. Self-control can be encouraged by committing to the delayed reward earlier (precommitment) and by several other strategies.
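One common way to formalize this is a hyperbolic discounting rule, V = A/(1 + kD), where A is the reward's size and D is its delay. The sketch below assumes that rule. The discounting parameter `k`, the reward sizes, and the delays are illustrative values, not data, and the function names are hypothetical.

```python
def discounted_value(amount, delay, k=1.0):
    """Hyperbolic discounting: value falls as delay grows.

    amount: size of the reward.
    delay: time until the reward arrives.
    k: assumed sensitivity to delay (larger k = steeper discounting).
    """
    return amount / (1.0 + k * delay)


def preferred(small, large, k=1.0):
    """Return which (amount, delay) option has the higher current value."""
    v_small = discounted_value(*small, k=k)
    v_large = discounted_value(*large, k=k)
    return "large" if v_large > v_small else "small"


if __name__ == "__main__":
    small = (2, 1)   # 2 units of reward after 1 time unit
    large = (6, 6)   # 6 units of reward after 6 time units

    # Choosing when the small reward is close at hand.
    print(preferred(small, large))

    # Choosing 10 time units earlier: both delays grow by the same amount.
    early_small = (2, 1 + 10)
    early_large = (6, 6 + 10)
    print(preferred(early_small, early_large))
```

With these numbers the smaller reward wins when it is imminent, but the larger reward wins when the same choice is made well in advance. That reversal is why precommitment can work under this rule.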
Reinforcers are not all the same. According to economic principles, reinforcers may substitute for one another (e.g., Coke for Pepsi, and vice versa), they may be independent of one another (Pepsi and books), or they may complement one another (chips and salsa). A complete understanding of choice will need to take into account the relationship between the different reinforcers.
The Premack reinforcement principle states that access to one behavior will reinforce another behavior if the first behavior is preferred in a baseline preference test. The principle has very wide applicability, and it rescues the Skinnerian definition of a reinforcer (“any consequence of a behavior that increases the probability of that behavior”) from its circularity. Premack’s punishment principle states that access to a behavior will punish another behavior if the first behavior is less preferred.
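The two Premack principles amount to a comparison of baseline preferences, which the following sketch spells out. The function name, the behaviors, and the baseline times are hypothetical, and "preference" is assumed to be measured as time spent on each behavior during a free-access test.

```python
def premack_prediction(baseline, contingent, instrumental):
    """Predict the effect of making `contingent` depend on `instrumental`.

    baseline: dict mapping each behavior to time spent on it in a
              free-access preference test (assumed measure of preference).
    contingent: the behavior that follows as a consequence.
    instrumental: the behavior that produces the consequence.
    """
    if baseline[contingent] > baseline[instrumental]:
        return "reinforce"   # consequence is more preferred
    if baseline[contingent] < baseline[instrumental]:
        return "punish"      # consequence is less preferred
    return "no prediction"   # equal preference: the principles are silent


if __name__ == "__main__":
    # Hypothetical baseline: seconds spent on each behavior.
    baseline = {"drinking": 300, "wheel running": 60}

    # Running produces a chance to drink.
    print(premack_prediction(baseline, "drinking", "wheel running"))

    # Drinking is followed by running.
    print(premack_prediction(baseline, "wheel running", "drinking"))
```

Because the prediction rests on a preference test conducted separately from the conditioning test, the definition of a reinforcer no longer depends on the very effect it is supposed to explain.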
According to behavior regulation theory, animals have a preferred level of every behavior that they engage in. When a behavior is blocked or prevented so that it falls below its preferred baseline level, access to it becomes reinforcing. Behavior regulation theory replaces the Premack principle as a way of identifying potential reinforcers.
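The deprivation idea can be sketched as a single comparison between the level of a behavior an animal is currently allowed and its preferred baseline level. The function name and all of the numbers below are hypothetical; this is a simplified illustration of the rule as stated in the text, not a full quantitative model.

```python
def is_potential_reinforcer(baseline_level, allowed_level):
    """True if the behavior has been pushed below its preferred level.

    baseline_level: amount of the behavior when access is unrestricted.
    allowed_level: amount the current arrangement permits.
    """
    return allowed_level < baseline_level


if __name__ == "__main__":
    # Hypothetical baselines: seconds per session with free access.
    baseline = {"drinking": 300, "wheel running": 60}

    # Running restricted to 10 s: it is deprived, so access to it
    # should be reinforcing even though drinking is the preferred behavior.
    print(is_potential_reinforcer(baseline["wheel running"], allowed_level=10))

    # Drinking left at its baseline level: no deprivation, no prediction
    # that access to it will reinforce.
    print(is_potential_reinforcer(baseline["drinking"], allowed_level=300))
```

The rule looks only at deprivation relative to baseline, not at which behavior is preferred overall. That is how it can identify reinforcers in cases where a simple preference ranking would not.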
Reinforcers may operate like natural selection. According to this idea, they may select certain behaviors largely by saving them from elimination (extinction). As is true in evolution, great subtlety and complexity can emerge over time from repeated application of principles of variation and selection.