Extracts from the article "Skinner - Operant Conditioning" by Saul McLeod, dated 2018, with commentary and minor changes to fit this article's theme and subject.
Caveat/Note: Most of what follows is either direct quotation or extracts modified to fit this presentation, which is meant for study and to inspire further research, analysis, synthesis and application in the dojo by both sensei and senpai.
“Operant conditioning is a method of learning that occurs through rewards and punishments for behavior. Through operant conditioning, an individual makes an association between a particular behavior and a consequence (Skinner, 1938).”
Thorndike's Law of Effect: “According to this principle, behavior that is followed by pleasant consequences is likely to be repeated, and behavior followed by unpleasant consequences is less likely to be repeated.”
https://www.simplypsychology.org/edward-thorndike.html
Skinner's Law of Effect: “Law of Effect - Reinforcement. Behavior which is reinforced tends to be repeated (i.e., strengthened); behavior which is not reinforced tends to die out or be extinguished (i.e., weakened).”
Skinner identified three types of responses that can follow behavior:
- Neutral operants: Responses from the environment that neither increase nor decrease the probability of a behavior being repeated.
- Reinforcers: Responses from the environment that increase the probability of a behavior being repeated. Reinforcers can be either positive or negative.
- Punishers: Responses from the environment that decrease the likelihood of a behavior being repeated. Punishment weakens behavior.
Note: Behavior is affected by reinforcers and punishers.
Positive Reinforcement: Positive reinforcement strengthens a behavior by providing a consequence an individual finds rewarding.
Negative Reinforcement: The removal of an unpleasant reinforcer can also strengthen behavior. This is known as negative reinforcement because it is the removal of an adverse stimulus which is ‘rewarding’ to the animal or person. Negative reinforcement strengthens behavior because it stops or removes an unpleasant experience.
Learned Responses: Escape Learning and Avoidance Learning.
Punishment (weakens behavior)
Punishment is the opposite of reinforcement since it is designed to weaken or eliminate a response. It is an aversive event that decreases the behavior that it follows.
Punishment can work either by directly applying an unpleasant stimulus or by removing a potentially rewarding stimulus.
Problems using Punishment:
- Punished behavior is not forgotten, it's suppressed - behavior returns when punishment is no longer present.
- Causes increased aggression - shows that aggression is a way to cope with problems.
- Creates fear that can generalize to undesirable behaviors, e.g., fear of school.
- Does not necessarily guide toward desired behavior - reinforcement tells you what to do, punishment only tells you what not to do.
Different patterns (or schedules) of reinforcement have different effects on the speed of learning and extinction. Two measures describe those effects:
- The Response Rate - The rate at which the practitioner performs the behavior (i.e., how hard the practitioner works).
- The Extinction Rate - The rate at which the behavior dies out (i.e., how soon one gives up).
The type of reinforcement which produces the slowest rate of extinction (i.e., people will go on repeating the behavior for the longest time without reinforcement) is variable-ratio reinforcement. The type of reinforcement which has the quickest rate of extinction is continuous reinforcement.
- Continuous Reinforcement: Behavior is positively reinforced every time it occurs.
- Fixed Ratio Reinforcement: Behavior is reinforced only after the behavior occurs a specified number of times.
- Fixed Interval Reinforcement: One reinforcement is given after a fixed time interval providing at least one correct response has been made.
- Variable Ratio Reinforcement: Behavior is reinforced after an unpredictable number of times. Response rate is FAST; Extinction rate is SLOW (very hard to extinguish because of unpredictability).
- Variable Interval Reinforcement: Providing one correct response has been made, reinforcement is given after an unpredictable amount of time has passed. Response rate is FAST; Extinction rate is SLOW.
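The five schedules listed above can be sketched in a few lines of Python. This is only an illustration and not something from the source article: the function names, the one-response-per-second timeline and the way the "unpredictable" counts and waits are drawn from a random number generator are all assumptions made for this sketch. A "response" here stands for one correct repetition of a technique and a "reward" for praise from sensei.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response. fixed_ratio(1) is continuous reinforcement."""
    count = 0

    def respond(now):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses (1 to 2*mean_n - 1)."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)

    def respond(now):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """Reinforce the first response made once `seconds` have passed since the last reward."""
    last = 0.0

    def respond(now):
        nonlocal last
        if now - last >= seconds:
            last = now
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but the required wait changes unpredictably after each reward."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)

    def respond(now):
        nonlocal last, wait
        if now - last >= wait:
            last = now
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


def count_rewards(schedule, times):
    """Count how many responses, made at the given times, were reinforced."""
    return sum(schedule(t) for t in times)


# Hypothetical session: one correct repetition per second for 20 seconds.
times = range(1, 21)
print(count_rewards(fixed_ratio(1), times))      # continuous: 20 rewards
print(count_rewards(fixed_ratio(5), times))      # every 5th repetition: 4 rewards
print(count_rewards(fixed_interval(4), times))   # every 4 seconds: 5 rewards
print(count_rewards(variable_ratio(5, random.Random(0)), times))
print(count_rewards(variable_interval(4, random.Random(0)), times))
```

The sketch only shows when a reward is delivered under each schedule. It does not model the response and extinction rates described above, which are observations about the learner rather than properties of the schedule itself.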
Behavior modification is a set of therapies / techniques based on operant conditioning (Skinner, 1938, 1953). The main principle comprises changing environmental events that are related to a person's behavior. For example, the reinforcement of desired behaviors and ignoring or punishing undesired ones.
Always reinforcing desired behavior, for example, is basically bribery.
Types of Positive Reinforcement: Primary reinforcement is when a reward strengthens a behavior by itself. Secondary reinforcement is when something strengthens a behavior because it leads to a primary reinforcer.
Behavior Shaping: The principles of operant conditioning can be used to produce extremely complex behavior if rewards and punishments are delivered in such a way as to encourage an organism to move closer and closer to the desired behavior each time.
Educational Applications (Dojo Collective/Group Training): A simple way to shape behavior is to provide feedback on learner performance, e.g., compliments, approval, encouragement, and affirmation. A variable-ratio schedule produces the highest response rate for students learning a new task, whereby initially reinforcement (e.g., praise) occurs at frequent intervals, and as the performance improves reinforcement occurs less frequently, until eventually only exceptional outcomes are reinforced.
- Praise them for every attempt (regardless of whether their answer is correct).
- Gradually the teacher will only praise when their answer is correct, and over time only exceptional answers will be praised.
- Unwanted behaviors, such as dominating class discussion, can be extinguished through being ignored by the teacher.
- Knowledge of success is also important as it motivates future learning.
- It is important to vary the type of reinforcement given so that the behavior is maintained.
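The gradual thinning of praise described in the first two points above can be written as a small decision rule. This is a minimal sketch under stated assumptions: the three stage names ("early", "middle", "late") and the function name are hypothetical labels chosen for this illustration, and the source does not say when a student should move from one stage to the next.

```python
def should_praise(stage, correct, exceptional):
    """Decide whether the teacher praises an attempt at a given stage of learning.

    stage: "early", "middle" or "late" (hypothetical labels for this sketch).
    correct: True if the attempt was performed correctly.
    exceptional: True if the attempt was exceptionally good.
    """
    if stage == "early":
        # Praise every attempt, whether or not it is correct.
        return True
    if stage == "middle":
        # Praise only correct attempts.
        return correct
    if stage == "late":
        # Praise only exceptional (and therefore correct) attempts.
        return correct and exceptional
    raise ValueError("unknown stage: " + repr(stage))


# A correct but ordinary attempt is praised less and less as learning progresses.
for stage in ("early", "middle", "late"):
    print(stage, should_praise(stage, correct=True, exceptional=False))
```

In a dojo setting the move between stages would be the judgment of the sensei, which is why it is left outside the function.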
The major influence on human behavior is learning from our environment. Operant conditioning fails to take into account the role of inherited and cognitive factors in learning, and thus is an incomplete explanation of the learning process in humans and animals. From my own analysis of the operant conditioning (OC) process, I concluded that a combination of OC and Social Learning Theory can provide a different, better and more complete form of teaching in an educational environment like the dojo.
Social Learning Theory (Bandura, 1977) suggests that humans can learn automatically through observation rather than through personal experience.
Social Learning Theory (Bandura, 1977)
Individuals that are observed are called models. In society and in the dojo environment, humans are surrounded by many influential models, such as parents within the family, friends within their peer group and sensei/teachers at school or in the dojo. These models provide examples of behavior to observe and imitate, e.g., masculine and feminine, pro and anti-social, etc.
Students/Practitioners will pay attention to some of these people (models) and encode their behavior. At a later time they may imitate (i.e., copy) the behavior they have observed. A mainstay of dojo conditioning is the student's focus and concentration, the ability to analyze and synthesize, and then the ability to imitate the lesson from observation. The sensei/teacher in turn observes the student in order to choose the type of conditioning that best encodes the concepts into the student's mind and body.
Factors to note in this process, "First, the student/practitioner is more likely to attend to and imitate those people it perceives as similar to itself. Consequently, it is more likely to imitate behavior modeled by like-minded people or by those who are noted as authority figures in the process being conditioned."
Then another important process, "Second, the people around the student/practitioner will respond to the behavior it imitates with either reinforcement or punishment. If a student/practitioner imitates a model’s behavior and the consequences are rewarding, the student/practitioner is likely to continue performing the behavior."
"Reinforcement can be external or internal and can be positive or negative. If a student/practitioner wants approval from peers, this approval is an external reinforcement, but feeling happy about being approved of is an internal reinforcement. A student/practitioner will behave in a way which it believes will earn approval because it desires approval."
"Positive (or negative) reinforcement will have little impact if the reinforcement offered externally does not match with an individual's needs. Reinforcement can be positive or negative, but the important factor is that it will usually lead to a change in a person's behavior."
"Third, the student/practitioner will also take into account what happens to other people when deciding whether or not to copy someone’s actions. A person learns by observing the consequences of another person’s (i.e., a model’s) behavior, e.g., a younger student/practitioner observing an older student/practitioner being rewarded for a particular behavior is more likely to repeat that behavior themselves. This is known as vicarious reinforcement."
"The motivation to identify with a particular model is that they have a quality which the individual would like to possess."
"Identification occurs with another person (the model) and involves taking on (or adopting) observed behaviors, values, beliefs and attitudes of the person with whom you are identifying. Identification is different to imitation as it may involve a number of behaviors being adopted, whereas imitation usually involves copying a single behavior."
This presentation/article makes use of a creative copy of the entire article linked below in the bibliography source. Any errors that appear due to this process and effort are mine. As to the original article, the errors and/or omissions are that author’s.
Bibliography: https://www.simplypsychology.org/bandura.html
For reference and sources and professionals go here: