Supplementary section 4.3: Some less frequently encountered sampling methods

You might come across the term snowball sampling. This is used to recruit individuals into a sample when suitable individuals are difficult to find. Imagine that we wanted to interview people engaged in a particular form of illegal drug use. If we can find a single individual, then as well as interviewing them we can ask whether they can put us in touch with anyone else who would be suitable for our study, and we keep doing this with everyone we interview. In this way we build up our sample from our initial individual (or small number of individuals) through social contacts between members of the study population. The name comes from an analogy with how a snowball grows as it rolls down a hill and more and more snow accumulates on it. The problem with this technique is that the sample is prone to being biased towards well-connected individuals: you are less likely to sample individuals in the population who are not well known to other people.
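The referral process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the contact network, the names in it and the function snowball_sample are all hypothetical, and we assume that every person reached agrees to be interviewed and names all of their contacts.

```python
from collections import deque


def snowball_sample(contacts, seeds, max_size):
    """Recruit up to max_size people by following referrals outward from the seeds.

    contacts maps each person to the people they could put us in touch with
    (a hypothetical structure, used here only for illustration).
    """
    sample = []
    seen = set(seeds)          # everyone already recruited or referred
    queue = deque(seeds)       # people waiting to be interviewed
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)  # interview this person
        for referral in contacts.get(person, []):
            if referral not in seen:
                seen.add(referral)
                queue.append(referral)
    return sample


# Hypothetical contact network. Nobody names "F", so "F" can never be reached.
contacts = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
    "F": [],
}

print(snowball_sample(contacts, seeds=["A"], max_size=10))
```

In this toy network the individual labelled "F" belongs to the study population but is known to nobody else, so no chain of referrals starting from "A" ever reaches them. That is the bias towards well-connected individuals in miniature.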

Another type of sampling that you might hear of is systematic random sampling. As we mentioned in the book, one significant challenge to implementing simple random sampling is that you must be able to list all the individuals that could potentially be in the sample (the population of interest). Sometimes this is impossible. Imagine we wanted to survey users of a hospital Accident and Emergency department. A list of users is not available beforehand. Where individuals are ordered in some way (here, by time of arrival), systematic random sampling involves selecting a starting individual randomly and then selecting individuals at regular intervals thereafter. So for our Accident and Emergency survey, we would select a time of day randomly to start our survey and interview the first person to use the department after that point; we would then seek to recruit every fifth person (say) who used the department. You should always consider whether there is any regular pattern in the ordering of individuals that could coincide with your sampling interval. This seems very unlikely in this case, but if you felt there was any danger, then rather than taking every fifth person it would not be too much work to roll two dice after each person surveyed to determine which person (the nth after the last) you will next attempt to survey.
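As a rough sketch of the two selection rules just described, the Python below picks a random starting individual and then either takes every fifth arrival or uses dice to set the gap. The function names, the list of arrivals and the seed are hypothetical. We also assume that n is the total shown on the two dice, which is one reasonable reading of the dice rule rather than the only one.

```python
import random


def systematic_sample(arrivals, interval, rng):
    """Pick a random starting individual, then every interval-th one after that."""
    if not arrivals:
        return []
    start = rng.randrange(len(arrivals))  # random starting point in the ordering
    return arrivals[start::interval]


def dice_gap_sample(arrivals, rng):
    """Variant: the gap to the next individual is the total of two dice (2 to 12).

    Using the total of the two dice as n is an assumption made for this sketch.
    """
    sample = []
    if not arrivals:
        return sample
    i = rng.randrange(len(arrivals))
    while i < len(arrivals):
        sample.append(arrivals[i])
        i += rng.randint(1, 6) + rng.randint(1, 6)  # roll two dice
    return sample


# Hypothetical day of 200 arrivals, numbered in order of arrival.
arrivals = list(range(1, 201))
rng = random.Random(42)  # seeded only so that the sketch is repeatable

print(systematic_sample(arrivals, interval=5, rng=rng))
print(dice_gap_sample(arrivals, rng=rng))
```

With the fixed interval, any pattern in arrivals that repeats every five people would line up with the sample every time. The dice version removes that risk because the gaps vary unpredictably, at the cost of a less predictable sample size.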

Yet another type of sampling that you might encounter is proportional quota sampling. This generally occurs when collecting data from humans in situations where we sample consecutively and expect few of the people we approach to be willing to take part. Imagine we want to sample shoppers in a large supermarket to explore sex differences in shopping habits. Taking part involves standing in the supermarket with us for ten minutes while we ask questions and fill out a questionnaire, so most people approached will probably decline. The key thing about proportional quota sampling is that the numbers of independent data points in each category are pre-determined. For example, we decide beforehand that we will keep approaching people until we have responses from 70 males and 70 females. This means that once we have reached our quota of females (say), we approach only men until we have collected data on 70 men too. We should always be careful here not to introduce ‘time of sampling’ as a confounder. Imagine that we start sampling shoppers at 10am and reach our quota of women by 12.30pm but need until 1.30pm to reach our quota of men. Now we might have a problem if ‘morning’ shoppers (regardless of sex) tend to be different from ‘lunchtime’ shoppers, because all our women were sampled in the morning, whereas men were sampled over the morning and lunchtime. This is not a catastrophe; it just suggests that if we think time of sampling might be relevant, we should record it for every individual and include it as a covariate in our statistical analyses.
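The quota rule can be sketched in Python as follows. Everything here is illustrative: the function quota_sample, the list of approaches and the tiny quotas of two per category (rather than 70) are hypothetical, chosen so that the logic is easy to follow by eye. Note that the time of sampling is stored with every response so that it is available later as a covariate.

```python
def quota_sample(approaches, quotas):
    """Collect responses in order of approach until every quota is filled.

    approaches: iterable of (time, category, agreed) tuples, in the order in
    which people were approached (a hypothetical format for this sketch).
    quotas: mapping from category to the number of responses wanted.
    """
    remaining = dict(quotas)
    sample = []
    for time, category, agreed in approaches:
        if all(n == 0 for n in remaining.values()):
            break      # every quota is filled, so stop sampling
        if remaining.get(category, 0) == 0:
            continue   # this category's quota is full, so do not recruit
        if agreed:
            # Record time of sampling for use as a covariate later.
            sample.append({"time": time, "category": category})
            remaining[category] -= 1
    return sample


# Hypothetical sequence of approaches with quotas of 2 per category.
approaches = [
    ("10:00", "female", True),
    ("10:10", "male", False),   # declined to take part
    ("10:20", "female", True),  # female quota is now full
    ("10:30", "female", True),  # willing, but not recruited
    ("11:00", "male", True),
    ("12:45", "male", True),    # male quota filled at lunchtime
    ("13:00", "female", True),  # never reached: sampling has stopped
]

for response in quota_sample(approaches, {"female": 2, "male": 2}):
    print(response)
```

Even in this small example both women are sampled before 10.30am, whereas the second man is not recruited until 12.45pm, which is the pattern that makes time of sampling a potential confounder.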
