As we have emphasized throughout the book, one of the main objectives of experimental design should be to reduce to the absolute minimum any suffering by experimental subjects. Often this will be achieved by using as few subjects as possible, whilst still ensuring that you have reasonable chances of detecting any biologically significant effect, and in the absence of any other extenuating circumstances this is a good rule of thumb to follow. However, sometimes with a little careful thought, we can do better than this.
Suppose that we wish to carry out an experiment to investigate the effects of some experimental treatment on mice. Our experiment would obviously include two groups to which mice would be allocated at random: an experimental group that would receive the treatment, and a control group that would be kept in exactly the same way, except that they would not receive the treatment. First, let’s assume that we are limited in the number of mice we can use to, say, 20. How should we divide our mice between groups? To carry out the most powerful experiment, the best way to allocate our mice would be to have 10 mice in each group. You will remember from the book that an experiment with equal numbers of individuals in each treatment group is referred to as a balanced experiment, or balanced design, and in general, for a given total number of subjects, balanced designs are more powerful than unbalanced designs. Thus, if we wish to maximize our probability of seeing an effect in our experiment, we should balance it.
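To see why balance helps, here is a minimal sketch under assumptions that are ours rather than the book's: the two groups have the same standard deviation, and the groups are compared through the difference in their means. In that case the standard error of the difference is the common standard deviation multiplied by sqrt(1/n_treated + 1/n_control), and a smaller multiplier means a more powerful experiment. The function name `se_factor` is purely illustrative.

```python
import math

def se_factor(n_treated: int, n_control: int) -> float:
    """Multiplier on the common SD that gives the standard error of the
    difference between two group means: sqrt(1/n_treated + 1/n_control)."""
    return math.sqrt(1 / n_treated + 1 / n_control)

TOTAL_MICE = 20  # the limit used in the text

# Every way of splitting the mice with at least two animals per group.
splits = [(n, TOTAL_MICE - n) for n in range(2, TOTAL_MICE - 1)]
best_split = min(splits, key=lambda s: se_factor(*s))

for n_treated, n_control in splits:
    print(f"{n_treated:2d} treated, {n_control:2d} control: "
          f"SE factor = {se_factor(n_treated, n_control):.3f}")
print("Smallest SE factor:", best_split)
```

Under these assumptions the smallest multiplier occurs at the even split, and it grows as the split becomes more lopsided.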
However, imagine that the treatment that we are applying to our mice is stressful, and will cause some suffering to them. With a balanced design, 10 of the mice will experience this unpleasant treatment. It would be better ethically if we could move some of these mice from the experimental group into the control group, and reduce the number of mice that suffer the unpleasant treatment. If we carried out our experiment with five experimental mice and 15 control mice, only five mice would experience the unpleasant treatment. The downside of this is that our experiment has become unbalanced, and so will be less powerful than the original balanced design. If the drop in power is not substantial, we might accept that the change is worthwhile to reduce the number of mice that suffer. However, if the drop in power is too great, it may make the whole experiment a waste of time, as we would have no realistic chance of detecting a difference between the groups. In that case fewer mice will have suffered, but they will all have suffered for nothing. Is there any way that we can both reduce the number of mice that suffer and maintain statistical power? The answer is that we can, if we are able to use more mice in the experiment overall.
As you may remember from Chapter 6, one of the determinants of the power of an experiment is the total sample size of the experiment. We can increase the power of our unbalanced experiment by increasing the total sample size. Practically, this would be achieved by increasing the number of mice in the control group. Thus, we might find that an experiment with five experimental mice and 21 control mice has the same power as our original experiment with 10 experimental and 10 control mice (see Figure S7.2.3). The experiment has the downside of using more mice in total (26 compared to 20), but the benefit that fewer of them have to experience the unpleasant experimental treatment. Presumably, even the control mice will experience some stress from being handled and kept in the lab, so the question arises: is increasing the number of mice that experience the mild stress of simply being in the experiment outweighed by the reduction in the number that experience the very stressful treatment? This will depend on, among other things, how stressful the treatment is and how stressful simply being in the lab is. This is an issue for you as a life scientist: it’s not something that anyone else can decide for you (although that doesn’t mean that you shouldn’t seek advice!).
Figure S7.2.3 Three possible combinations of group sizes for an experiment comparing a control group of mice to a group of mice subject to some experimental manipulation. In most circumstances you should have equal numbers of animals in the two groups, since this balanced design (a) maximizes statistical power. If you have ethical or practical reasons to want to minimize the number of individuals subjected to the manipulation, then you could move individuals from the treatment group to the control group but keep the overall number of subjects used the same (see (b) for an example). This will lower the statistical power of the test compared to testing 10 versus 10, but will be more powerful than testing 5 versus 5, and this loss of power may be a price worth paying given the reasons that you had for reducing the number subjected to the manipulation. Finally, if you wanted to reduce the number of manipulated animals but keep the same power as the balanced experiment, this can be done by moving some animals from the manipulation group to the control group and supplementing them with even more control animals (c), such that more animals are used in the experiment overall, but fewer are subject to the manipulation, without compromising statistical power.
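The comparison between these designs can be sketched numerically. The example below is only an approximation under assumptions that are ours: a two-sided test at the 5% level, a normal approximation to the test statistic, equal standard deviations in the two groups, and a hypothetical standardized effect size (`EFFECT_SIZE`, the true difference in means divided by the common standard deviation). The function `approx_power` is an illustrative helper, not the method used in Chapter 6. How many extra control animals design (c) needs is sensitive to these assumptions, so the output need not reproduce the illustrative figure of 21.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(n_treated, n_control, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    Uses a normal approximation with equal SDs in both groups;
    effect_size is the true difference in means divided by that SD.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size / sqrt(1 / n_treated + 1 / n_control)
    # Probability that the test statistic lands in either rejection region.
    return z.cdf(shift - crit) + z.cdf(-shift - crit)

EFFECT_SIZE = 1.2  # hypothetical value, chosen only for illustration

designs = {
    "(a) 10 treated, 10 control": (10, 10),
    "(b) 5 treated, 15 control": (5, 15),
    "(c) 5 treated, 21 control": (5, 21),
    "5 treated, 5 control": (5, 5),
}
for label, (n_t, n_c) in designs.items():
    print(f"{label}: power ~ {approx_power(n_t, n_c, EFFECT_SIZE):.2f}")
```

Changing `EFFECT_SIZE`, or replacing the normal approximation with the power calculation appropriate to your planned test, will change the numbers, which is why the calculation should be repeated for your own experiment.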
The details of how to design effective unbalanced designs are beyond the scope of this brief introduction, but the paper by Ruxton (1998) included in the Bibliography of the book provides further discussion. In principle, it is simply a matter of determining the power of the different possible experimental designs in the same way as was done in Chapter 6. We mention it here in the hope that by making you aware of the possibility, you will be encouraged to investigate further should the need arise. Of course, the same logic can also be used in contexts other than ethical concerns. If the chemicals used in the treatment are very expensive, or the treatment is time consuming, it may pay to carry out a larger unbalanced experiment with fewer individuals in the expensive or time-consuming group.
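The cost argument can be illustrated in the same spirit. The sketch below assumes, hypothetically, a fixed budget and a known cost per animal in each group, and searches for the group sizes that minimize 1/n_treated + 1/n_control, the quantity that governs the variance of the difference in means when both groups have the same standard deviation. The function `best_allocation` and the costs used are our own illustrative choices.

```python
def best_allocation(budget, cost_treated, cost_control, min_per_group=2):
    """Return the (n_treated, n_control) pair within budget that minimizes
    1/n_treated + 1/n_control, the variance factor of the difference in means.
    Returns None if the budget cannot cover min_per_group animals per group."""
    best = None
    n_treated = min_per_group
    while n_treated * cost_treated + min_per_group * cost_control <= budget:
        remaining = budget - n_treated * cost_treated
        n_control = int(remaining // cost_control)  # spend what is left on controls
        variance_factor = 1 / n_treated + 1 / n_control
        if best is None or variance_factor < best[0]:
            best = (variance_factor, n_treated, n_control)
        n_treated += 1
    return None if best is None else best[1:]

# Hypothetical costs: a treated animal costs four times as much as a control.
print(best_allocation(budget=100, cost_treated=4, cost_control=1))
```

When the two costs are equal the search returns a balanced design. As the treated animals become relatively more expensive, it shifts towards fewer treated animals and more controls.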
Sometimes clinical trials feature a control group that is smaller than the treatment group. This occurs when the control is a well-established treatment and the trial is deliberately weighted towards giving more subjects the novel treatment. Having more subjects in the ‘new treatment’ group allows the outcomes of this treatment to be evaluated more fully, and it is argued that a smaller control group is sufficient because there is already an extensive body of knowledge on the outcomes of the established treatment. We would counsel against such an approach, since it implicitly assumes that the performance of the new treatment can be compared against historical evidence on the effectiveness of the existing treatment (albeit with a small concurrent control group to act as a safety net). We have warned about the dangers of historical controls in Chapter 2 of the book and we do not recommend reliance on them when a good concurrent control could be used.
You should almost always try to use balanced designs, but keep in mind that it may sometimes be worth paying the cost of an unbalanced design to reduce the numbers in a particular treatment group.