In our guide to Understanding impact, we explore how to use your theory of change to build a measurement and evaluation framework. In this closer look we uncover how to select a representative sample for your research.
Our recommended approach to evaluation is that outcomes and impact data should mainly be collected through samples of individuals. Good sampling is a real opportunity to reduce the cost and burden of data collection through getting high quality information from smaller groups of people.
Charities should find that their efforts are much more fruitful if they concentrate on getting small amounts of good quality data from a small representative group, rather than large amounts of poor-quality data from lots of people. Unfortunately, we’ve found that sampling is woefully underused and under-appreciated in the charity sector.
What is sampling?
Asking all your service users and partners to be involved in your evaluation may be costly, time consuming and probably not even feasible. The next best approach is to speak to a representative group of them. This is known as a sample.
You might decide there is some information you need from everyone you have worked with (such as user and engagement data) and some information you only need from a sample (such as feedback on particular activities). This is a good approach. Data on everyone gives you confidence in the scale of achievement while sample data tells you more about how these were achieved.
Sampling is applied to both quantitative and qualitative research, although statistical calculations of margin of error only apply when you are conducting quantitative research (we talk more about this later).
If you do decide to sample, you will still want to be confident that the findings are representative and valid. An unrepresentative sample will not deliver useful conclusions, regardless of whether you asked the right questions or conducted robust analysis. Regardless of who you are researching, it helps your credibility to sample correctly and to do everything you can to minimise bias.
When thinking about sampling it’s useful to understand four terms.
Population: This is everyone eligible for the research. In service evaluations this is everyone who has used the service (or possibly everyone who is referred to the service). You can also think about this as the population you want to generalise about. For example, “50% of our service users take up volunteering opportunities”.
Sample frame: This is everyone who you could feasibly conduct research with. Often it will be the same as the population, but there are circumstances when it will differ. For example, if some service users move away and are not contactable then they are in your population but not your sample frame.
Selected sample: This is everyone you will try to conduct research with. So, if your sample frame has many hundreds of people in it, your sample will be a selection from it.
Achieved sample: This is everyone you actually conduct the research with. It’s different to the selected sample because there will be people who you can’t get in touch with or who refuse to take part.
Sampling is therefore a multi-step process. The challenge is that in moving from each step to the next, there is the possibility of introducing error or bias. A biased sample is one that does not reflect the views or experiences of the whole population and provides misleading or wrong findings. For instance:
- Even if you are able to identify the population, you may not have access to all of them to include in the research, and those you do have access to might be different.
- Even if you do have access, you may not draw the sample correctly or accurately.
- Even if you do draw it correctly, they may not all choose to take part in your research.
The aim of a sampling process is to minimise or eliminate these risks so that the achieved sample is as representative of the population as possible.
To minimise differences which can introduce bias between your population and your sample frame, it is important to keep good records of everyone you work with. This includes their contact details, mobile numbers and even appropriate details of family members who might be able to help you reach them in future.
Minimising bias between the sample frame and the selected sample is the main business of sampling. The point at which people move from the selected sample to the achieved sample is also a likely time for bias to develop. This is because research can be affected by low response and attrition, where users are hard to reach, unresponsive or withdraw part way through.
Below are the main approaches you can use to minimise bias:
Simple random sampling
This is the best approach because it gives everyone an equal chance of selection, thereby minimising bias. To do simple random sampling you need to list everyone in your sample frame and then make a random selection, like a raffle.
The best way to draw a random sample is to use the RAND function in Excel. This allocates a random number to everyone in your sample frame. You then sort the list by the RAND column, which reorders it at random, and take from the top the number of people you need for your sample. RAND recalculates whenever the sheet changes, so paste the random numbers as values first if you want to keep a record of the draw.
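The same steps can be sketched outside a spreadsheet. The following is a minimal Python illustration, not a prescribed method: it gives each person a random number, sorts by it, and takes the top of the list. The user IDs, the sample size of 30, the seed and the function name are all assumptions made for the example.

```python
import random


def simple_random_sample(sample_frame, sample_size, seed=None):
    """Draw sample_size people at random, without replacement."""
    rng = random.Random(seed)
    # Pair each person with a random number, as RAND does in a spreadsheet.
    keyed = [(rng.random(), person) for person in sample_frame]
    # Sorting by the random number reorders the list at random.
    keyed.sort(key=lambda pair: pair[0])
    # Take the required number of people from the top of the list.
    return [person for _, person in keyed[:sample_size]]


# Hypothetical sample frame of 200 service users.
sample_frame = [f"user_{i:03d}" for i in range(1, 201)]
selected = simple_random_sample(sample_frame, 30, seed=42)
print(len(selected), "people selected")
```

Fixing the seed makes the draw repeatable, which can help if you later need to show how the sample was selected.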
Stratified random sampling
A drawback to a simple random approach is that if you are interested in subgroups, particularly minorities, you may not select enough people from these subgroups. Stratified random sampling addresses this. It is similar to random sampling but is preceded by dividing your sample frame into different subgroups (or strata) and then using the simple random approach within each group.
This is a helpful approach if you want to explore differences in views or experiences by subgroup. For example, if your service user population was 90% men and 10% women, a pure random sample would tend to reproduce this profile and might include only a handful of women. If you stratify the sample and select randomly from amongst men and from amongst women, you will achieve a more equal balance that helps you compare the two groups.
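As a rough sketch of the stratified approach, the Python below splits a sample frame into strata and then draws a simple random sample within each one. The records, the `gender` field, the 90/10 split and the choice of 15 people per stratum are hypothetical values chosen to mirror the example above.

```python
import random
from collections import defaultdict


def stratified_sample(sample_frame, stratum_of, per_stratum, seed=None):
    """Draw up to per_stratum people at random from each stratum."""
    rng = random.Random(seed)
    # Step 1: divide the sample frame into strata.
    strata = defaultdict(list)
    for person in sample_frame:
        strata[stratum_of(person)].append(person)
    # Step 2: simple random sample (without replacement) within each stratum.
    selected = {}
    for name, members in strata.items():
        k = min(per_stratum, len(members))
        selected[name] = rng.sample(members, k)
    return selected


# Hypothetical frame: 200 service users, 10% women and 90% men.
frame = [
    {"id": i, "gender": "woman" if i % 10 == 0 else "man"}
    for i in range(1, 201)
]
picked = stratified_sample(frame, lambda p: p["gender"], per_stratum=15, seed=7)
for group, people in picked.items():
    print(group, len(people))
```

Selecting the same number from each stratum deliberately over-represents the smaller group. That helps comparison between groups, but any overall figure would then need to be weighted back to the population profile, as discussed later in this guide.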
Having described the best approach to selecting from your sample frame, we now describe the worst, which is convenience sampling. As the term implies, it simply involves using anyone who is available and willing to take part. It could mean just catching any users as they come into your service, leaving questionnaires out in a waiting room or approaching stakeholders or partners you know well.
The advantage over random sampling is that it is easier; you don’t have to work hard to get in touch with people and increase the response rate. The disadvantage is that those who are easier to engage in research are also probably more engaged with the service, so likely to give a more favourable view, or more likely to be biased in other ways.
Quota sampling is a way to make convenience sampling more robust. It involves taking what you know about your population and designing a convenience sample to match. Say for example you have a good understanding of the profile of all your service users (be this age, gender, ethnicity, level of engagement with the service or something else), you would then use this knowledge to select service users to match that profile. You might then know that in wanting to speak to thirty service users, you want at least five to be female, another five to be from minority ethnic groups and five to have been out of education for the last year. If someone refuses to take part in the research, you would need to replace them with someone who has the same characteristics so that the overall profile remains the same.
The challenge is that getting your proportions right can be difficult, as it isn’t always easy to find up-to-date information on the population. Biases may also be introduced because, even though some individuals fit the sampling criteria, they might differ from the population in some other way, such as their willingness to engage with the research or the service.
The final main type of sampling is purposive sampling which, as the name suggests, is used when you want to focus your research on a particular group. For example, you may be interested in reaching people who failed to complete a programme to find out why. “Snowballing” is a type of purposive sampling used to engage hard to reach groups. Snowballing involves people who have already taken part in the research using their social networks to find other people who could participate.
Reporting your response rate
An advantage of both random and stratified approaches is that they enable you to calculate a response rate. This is a simple calculation of the proportion of people you selected for the research who then actually took part. It is one of the main ways to assess the quality of research because a low response rate greatly increases the risk of bias. If you can quote a response rate it will add greatly to the credibility of your findings.
A good response rate is generally over 50%, while 70-80% is exceptional because the majority of people have been interviewed and the risk of bias is reduced. But the overall response rate is not the only measure of success. You also need to consider differential response: whether some groups responded more than others. For example, a 50% response rate in the general population looks good, but if only men took part and no women did then the sample is clearly biased. The section below on checking for bias looks at this issue further.
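To make the calculation concrete, here is a small Python sketch that works out the overall response rate and the rate for each group. The data are invented to reproduce the example above (a 50% overall rate made up entirely of men), and the field and function names are assumptions.

```python
from collections import Counter


def response_rates(selected, achieved_ids, group_key):
    """Return (overall response rate, response rate per group)."""
    achieved_ids = set(achieved_ids)
    # How many people were selected in each group.
    selected_counts = Counter(p[group_key] for p in selected)
    # How many of those selected actually took part, per group.
    achieved_counts = Counter(
        p[group_key] for p in selected if p["id"] in achieved_ids
    )
    overall = sum(achieved_counts.values()) / len(selected)
    by_group = {g: achieved_counts[g] / n for g, n in selected_counts.items()}
    return overall, by_group


# Hypothetical selected sample: 20 men (ids 1-20) and 20 women (ids 21-40).
selected_sample = [
    {"id": i, "gender": "man" if i <= 20 else "woman"} for i in range(1, 41)
]
# Hypothetical achieved sample: every man responded, no women did.
overall, by_group = response_rates(selected_sample, range(1, 21), "gender")
print(f"Overall: {overall:.0%}")
for group, rate in by_group.items():
    print(f"{group}: {rate:.0%}")
```

Reporting the per-group rates alongside the overall figure shows readers whether a respectable headline rate is hiding a differential response.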
Checking and correcting for bias
No matter how good your sampling approach is, you should always do what you can to check for bias. Remember, the aim is to ensure your sample is as representative of your target population as possible, so you should take whatever you know about the population and compare it to the sample. For example, if you know that half of your service users are men and half are women, then you will want the same proportion in your sample. Similarly, if you know that half of your service users worked with you for three months and half for six months then this should also be reflected.
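One simple way to run this check is to compare the share of each group in the achieved sample with its known share of the population. The Python below is an illustrative sketch only; the 50/50 population profile comes from the example above, while the sample of 30 men and 10 women is hypothetical.

```python
from collections import Counter


def profile_gaps(population_profile, sample_values):
    """Sample share minus population share for each group.

    A positive gap means the group is over-represented in the sample.
    """
    counts = Counter(sample_values)
    total = len(sample_values)
    return {
        group: counts[group] / total - share
        for group, share in population_profile.items()
    }


population_profile = {"men": 0.5, "women": 0.5}
# Hypothetical achieved sample: 30 men and 10 women.
achieved = ["men"] * 30 + ["women"] * 10
gaps = profile_gaps(population_profile, achieved)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.0%}")
```

The same comparison can be repeated for any characteristic you hold for the whole population, such as length of engagement with the service.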
The more you go through the process of checking your sample against the population (and report this), the more robust your research will appear.
If you do identify a bias, you may decide to correct this by doing more interviews with groups who are under-represented. An alternative approach, applicable to quantitative studies only, is to weight the data. This is a statistical adjustment of the data that corrects biases. At its simplest, it’s quite easy to do. For instance, if the split between men and women in your population is 50/50, but in your sample it is 75/25, then you could correct this by giving men in the sample a weight of 0.67 (50 divided by 75) and women a weight of 2 (50 divided by 25) to bring it back to 50/50. The drawbacks are that it technically reduces statistical confidence in the findings, and that you will need more advanced software like SPSS or Stata to do it. Weighting by more than two variables starts to get very complicated.
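For the simplest one-variable case, the arithmetic can be sketched in a few lines of Python. The weights follow the 50/50 versus 75/25 example above; the satisfaction counts are invented purely to show the effect of weighting, and the function names are assumptions rather than any package's API.

```python
def group_weights(population_share, sample_share):
    """Weight for each group = population share / sample share."""
    return {g: population_share[g] / sample_share[g] for g in population_share}


def weighted_proportion(counts, weights):
    """counts maps group -> (number answering yes, number of respondents)."""
    weighted_yes = sum(weights[g] * yes for g, (yes, _) in counts.items())
    weighted_n = sum(weights[g] * n for g, (_, n) in counts.items())
    return weighted_yes / weighted_n


weights = group_weights(
    population_share={"men": 0.5, "women": 0.5},
    sample_share={"men": 0.75, "women": 0.25},
)
print({g: round(w, 2) for g, w in weights.items()})

# Hypothetical results: 60 of 75 men and 10 of 25 women say they are satisfied.
counts = {"men": (60, 75), "women": (10, 25)}
unweighted = sum(yes for yes, _ in counts.values()) / sum(
    n for _, n in counts.values()
)
weighted = weighted_proportion(counts, weights)
print(f"Unweighted: {unweighted:.0%}, weighted: {weighted:.0%}")
```

In this made-up case the unweighted figure overstates satisfaction because men, who are over-represented, are also the more satisfied group. Weighting by several variables at once is a harder problem and is not covered by this sketch.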
Sample size and confidence levels
An appropriate sample size differs depending on whether you are conducting qualitative or quantitative research.
Sampling in quantitative research
In quantitative research, the larger the sample the more confident you can be that your findings are not down to chance but reflect the overall population. This is expressed statistically by confidence levels or margins of error, which help you understand how confident you can be that a finding reflects what you would have found if everyone in the population had been interviewed. Confidence levels are based on complex maths that you don’t need to understand fluently, as computers do the calculations. However, it is useful to have a basic knowledge of how it works.
So, by way of illustration, if you found that 50% of a random sample said that they were satisfied with a service, this figure would only be an estimate because you have not interviewed everyone but only a sample of the service users. Confidence intervals then tell you how accurate that estimate is. A large confidence interval would be +/- 10 points, i.e. the real figure is between 40% and 60% (quite a big range). A small confidence interval would be +/- 2 points, i.e. between 48% and 52%, so you have greater certainty about what the real figure would be if you interviewed everyone.
The confidence interval is determined by two related factors: the size of the sample and the proportion of the population interviewed. So if you interview 10 people out of a population of 100 service users, the confidence interval would be about +/- 30 points (huge), but if you interview 80 out of 100 then the confidence interval would be about +/- 5 points (much more precise).
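The figures above can be approximated with a standard textbook formula for the margin of error of a proportion, adjusted for a small population. The sketch below is one common way to do the calculation, not necessarily the method behind the figures in this guide. It assumes a 95% confidence level (z = 1.96), a simple random sample, and an observed proportion of 50%, none of which is stated in the text.

```python
import math


def margin_of_error(sample_size, population_size, proportion=0.5, z=1.96):
    """Approximate margin of error, in percentage points.

    Assumes a simple random sample and a 95% confidence level by default.
    """
    # Standard error of a proportion for the given sample size.
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    # Finite population correction: the margin shrinks as the sample
    # approaches the size of the whole population.
    correction = math.sqrt(
        (population_size - sample_size) / (population_size - 1)
    )
    return z * standard_error * correction * 100


for n in (10, 80):
    print(f"{n} of 100 interviewed: +/- {margin_of_error(n, 100):.1f} points")
```

Under these assumptions the results come out close to the +/- 30 and +/- 5 points quoted above. The approximation is unreliable for very small samples, which is one reason to treat them with caution.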
Finally, it should also be noted that very small samples are always treated with caution. Generally, any quantitative sample below about 30 should only be seen as indicative and not suitable for statistical analysis.
Sampling in qualitative research
Statistical calculations are not applicable to qualitative research, which typically entails smaller sample sizes. Rather, qualitative researchers talk about getting to a ‘saturation point’. This occurs when repeated interviews result in the same themes and findings. This can mean you need to conduct as few as 10 or 15 interviews before you have enough findings to meet your objectives.
Whilst qualitative research is not attempting to be representative, it should still ensure that a good mix of the research population is included in the study. Any tendency towards being selective may mean that the sample becomes biased towards those with certain experiences, such as those more likely to engage with the service. Selecting randomly and eliminating bias is therefore equally applicable to qualitative research as to quantitative.
Research should not rely on speaking to those who are easiest to engage, as this produces biased results. Rather, if any sampling is involved, it should be based on a systematic approach to selecting people and careful comparison of the sample against the population you want to generalise about.
Sample size is important, but not as important as ensuring the representativeness of the sample. This applies as much to qualitative research as it does to quantitative.
Finally, you should explicitly address the issue of bias when reporting your findings. Being upfront about concerns or limitations, and how confident you are in your findings, is an important part of being credible.