Clicker Training Principles 3

Written by Hiddenhorse on 23/03/2010 – 11:44 am

How to do it

When we clicker train an animal we are using two types of learning. Don’t worry too much about the names, but it is important to understand the principles. The first type of learning is called classical conditioning. You will have seen its effect many times. For example, when you go to your horse in the morning and the horse hears the rattle of the gate or the clink of feed buckets, you will perhaps hear a whinny or snicker, and you will see the horse’s behaviour change as it anticipates the arrival of the morning feed.

An even better example might be if you own a dog and the dog sees you go to the cupboard and get the lead (leash). The dog will respond with a change in behaviour as it anticipates setting out on its walk. What is happening in both cases is that some event is predicting another event, and the animal’s behaviour changes in anticipation. The rattle of the gate, the sound of feed being prepared, the appearance of the dog lead and the time of day are all predictors of a coming event.

This phenomenon was first described by a Russian scientist called Ivan Pavlov; you may have heard the phrase ‘Pavlov’s dogs’. He noticed this idea of predicting a coming event when he went to feed his dogs and observed that they would dribble (salivate) in anticipation of the arrival of food. He realised that the predictors the dogs were responding to were random or general events occurring accidentally in the dogs’ environment, and he wondered whether he would see the same effect if he introduced a deliberate event, such as ringing a bell. He did. Once the sound of the bell had been linked to feeding as a predictor, he found that simply ringing the bell by itself was enough to start the dogs salivating. This simple pairing of ideas became known as classical conditioning.

The second type of learning we are dealing with in Positive Reinforcement Training (PRT), or clicker training, is called operant conditioning. By the way, the word ‘conditioning’ is just a scientific way of saying learning. The ‘operant’ part means that the subject actually takes some action, operating on its surroundings, in order to try to make something happen. So if your dog gets excited when you get the dog lead, that is classical conditioning, but if it goes and gets the lead and brings it to you in order to initiate a walk, that is operant conditioning. We use both types of learning in clicker training/PRT.

We begin with classical conditioning, by linking two ideas together: a click (the predictor) and a reward. Usually the reward is food, as this is something that the horse wants and is willing to work for. Some people like to use a ‘reward’ such as a voice reward (Good Boy!) or a touch or a scratch, but to me this is not effective at all, for reasons you will see in a moment, and it brings us to a very important point. Before we go any further, we must make something very clear: the difference between a treat and a reward.

Drum this into your head!

Rewards are NOT treats and treats are NOT rewards.

This is the main stumbling block for anyone who starts using PRT, and it is the reason people try clicker training and drop it again. I never treat my horses. This is not because I am some hard, unfeeling person but because treats are NOT designed to reward the horse! They are designed to reward the human, and like most things that are designed to reward the human they are basically anthropomorphic in nature. People give treats to horses because they want to make themselves feel good. Now it may be that horses find the treats pleasurable, but that is irrelevant. When we act like this it is because we want the horse to think we are a nice person, and if we believe our horse thinks we are a nice person then we get good feelings. This is putting human ideas into a horse’s head (anthropomorphism), and horses don’t think like this. When people think of treats they usually think of food, but treating horses also extends, rather bizarrely, to physical possessions: for example, we might ‘buy our horse’ a new grooming kit or a new rug or a new food bucket. It should be fairly obvious that animals don’t understand concepts like possession and ownership. What is happening, of course, is that the human is trying to buy the affection of the horse and thus buy themselves good feelings.

The Treat-aholics Test

Here is a little test for all you treat-aholics out there. I might say to you, ‘Carry on and give your horse all the treats that you normally do, in fact give more if you want to, but here is the rule: instead of dishing them out randomly throughout the day, I want you to just add them to the normal bucket feed’. Most people think about this for a moment and then say, ‘What’s the point of that?’ What they mean is, ‘Where is the reward in that?’ What this rule does is take away your pleasure in giving treats and show them for what they are: just extra food that the horse doesn’t really want or need.

This is how it applies to clicker training. People get confused and overwhelmed, and give up, when they have turned the training into an experience based on giving the human good feelings rather than rewarding the horse. They don’t realise that their horse no longer understands what they are asking for. In this situation people tend to get confused and forget what it was they wanted the horse to do, because they are effectively clicking themselves!

Rewards

I have a very narrow and precise definition of a reward:

A reward is a very specific event in a horse’s life, intended to encourage a repeated behaviour.

That’s it. So while I say I never treat my horses, I do reward the behaviour I want, and I reward it often. The way I do this is to offer my horses choices.

Choices

All the training I do is based on my making decisions, presenting my horses with choices based on those decisions, and then simply rewarding the right choice.

When I present a choice to the horse there are many different ‘wrong’ answers the horse could give me. This doesn’t matter! The wrong choice has no consequences. To the horse the wrong choice is simply a chance to learn what gets a reward and what doesn’t, so the horse naturally tries again. Eventually this process of refinement will produce the right choice and CLICK! The right behaviour is rewarded.

This is the absolute opposite of conventional training, which is based on correction: the instructor waits for the horse to do something wrong so that the behaviour can be corrected. In other words, the wrong choice always has a consequence, and a negative one. This is not only very, very inefficient but emotionally very coercive, and it will produce negative reactions such as flight, fight or compliance. This is systems thinking in action, where the student is forced to adapt to the system. And why do humans think like this? Because that is exactly how they were trained when they were in the education system, and that kind of thinking comes ultimately from the military.

If your training is focused on rewarding the right thing instead of correcting (punishing) the wrong thing, you are telling the horse something very profound at an emotional level: that your decisions lead to good things, which in turn lead to good emotions. Each time you reward the right thing you are teaching your horse to trust your emotions and therefore your decisions. This process rapidly builds that bridge between the predator and the prey animal, the bridge of trust.

The Magic of Operant Conditioning

I said at the beginning that PRT involves something called operant conditioning, where a subject tries to make something happen by performing a deliberate action. Another term for this might be ‘problem solving’. With clicker training this is where the magic really begins: you will find that your horse will actively start participating in the process and will actually start offering you behaviours to see if they elicit a reward! This is when your student really takes off and starts learning independently, a moment that most teachers dream about.

Be careful though. When this happens, recognise it for what it is and stay focused. It is all too easy to start clicking random behaviours in order to develop new ones, a process often described as shaping. But there is one big problem here: the horse is actually training you! Beware, this can be a tricky side effect of the clicker, in that it rewards not only the student but the teacher as well. My advice, when this starts happening, is to treat all these promising behaviours as ‘wrong answers’ and keep concentrating on the path you were already on. Keep on rewarding what you do want and don’t get distracted by what you don’t. The horse will not forget that it offered a particular behaviour even if it didn’t get rewarded that time. There will come a time when that behaviour will be exactly what you want, but not now.

In the next section, ‘Clicker Training 4’, we will look at some specific things we can do to rapidly build that bridge of trust. In the meantime, remember:

It’s not about training, it’s about TRUST.

Posted in Clicker Training, Training