
The evolution of evaluation

By Guest contributor 29 April 2014

Louise Stimpson is Monitoring and Evaluation Manager at Marie Curie Cancer Care. She will be speaking about using measurement, monitoring and evaluation to improve internal services at How to measure outcomes: practical tips & tools on 4 June.

Asked to speak at NPC and Third Sector’s upcoming conference on measuring, evaluating and evidencing your outcomes, I’ve been reflecting on the origins of evaluation.

There are many methodological similarities between research and evaluation, and the two terms are often used interchangeably. But it’s important to remember the difference. Evaluation is specific to a particular service or initiative, and focuses on asking whether goals are being accomplished or whether improvements can be made. Research, on the other hand, is designed to provide results that go beyond an individual project or service and can be generalised to other populations, conditions or times.

So how did evaluation become independent from research and randomised controlled trials (RCTs)?

Here’s my three-minute theory.

Let us start with a hypothesis. Assume the average British person walks 150 miles a year, lives for around 80 years and consumes 900 gallons of wine in their lifetime. That is 12,000 miles on 900 gallons, which means we Brits run on a somewhat inefficient 13 miles to the gallon.
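For anyone who wants to check the sum, here is a minimal sketch of the arithmetic. The three figures are the illustrative assumptions from the paragraph above, not real statistics, and the function name is made up for this example.

```python
# Illustrative figures taken from the thought experiment, not real statistics.
MILES_WALKED_PER_YEAR = 150
LIFESPAN_YEARS = 80
LIFETIME_WINE_GALLONS = 900


def miles_per_gallon(miles_per_year, years, gallons):
    """Lifetime miles walked divided by lifetime gallons consumed."""
    return (miles_per_year * years) / gallons


lifetime_miles = MILES_WALKED_PER_YEAR * LIFESPAN_YEARS
mpg = miles_per_gallon(MILES_WALKED_PER_YEAR, LIFESPAN_YEARS, LIFETIME_WINE_GALLONS)

# Prints: 12000 miles / 900 gallons = 13.3 miles to the gallon
print(f"{lifetime_miles} miles / {LIFETIME_WINE_GALLONS} gallons = {mpg:.1f} miles to the gallon")
```

The unrounded answer is about 13.3, which the text rounds down to 13.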

Yes, I can see that there are obvious flaws in this theory (for starters, it doesn’t account for those people who do not drink yet are still able to walk from point A to point B), but bear with me.

Now, let us also say that we would like to pilot a new intervention to ‘increase our fuel efficiency’ by supporting people to walk more frequently or further.

To work out a person’s actual miles to the gallon in order to measure the effectiveness of our new intervention, we would need to consider many factors about the individuals who are consuming wine and walking around. If we were trying to control this study, we would need to think about other fuel sources consumed, and the age, weight and metabolism of these consumers. Controlling for each of these factors would be neither easy nor practical. We would need to do research.

My theory on the evolution of evaluation is that practitioners, funders and commissioners are in search of a lean approach to research. A ‘just enough’, pragmatic approach that balances the integrity and robustness of research data and collection methods against the practical reality that we have limited control over the environment in which we work and limited resources available.

An important consideration for evaluation here is that it aims to meet the needs of the audience we seek to inform.

To evaluate whether our intervention has an effect on our fuel efficiency we would need to consider what we can actually measure rather than control, and understand the constraints within which we are measuring. Perhaps, for example, we would measure the fuel efficiency of a sample of people who take part in the intervention and then retrospectively match them to a sample of people who did not (based on age and previous patterns of drinking and walking). This would give the audience just enough information to illustrate the effect of the intervention in a statistically robust way (albeit with its limitations).
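As a rough illustration of what that retrospective matching could look like, the sketch below pairs each participant with the most similar non-participant and averages the difference in their outcomes. Everything in it is an assumption made for the example: the `Person` fields, the hand-picked scaling in `distance`, the greedy one-to-one matching, and the handful of invented records. A real evaluation would use a larger sample, a more careful matching method and a significance test before claiming anything.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Person:
    age: int
    prior_miles_per_year: float    # walking pattern before the pilot
    prior_gallons_per_year: float  # drinking pattern before the pilot
    mpg_after: float               # outcome: fuel efficiency after the pilot


def distance(a, b):
    """Crude similarity score; the divisors are arbitrary scaling choices."""
    return (abs(a.age - b.age) / 10
            + abs(a.prior_miles_per_year - b.prior_miles_per_year) / 50
            + abs(a.prior_gallons_per_year - b.prior_gallons_per_year) / 5)


def match_pairs(participants, non_participants):
    """Greedy nearest-neighbour matching; each comparison person is used once."""
    available = list(non_participants)
    pairs = []
    for person in participants:
        if not available:
            break
        best = min(available, key=lambda candidate: distance(person, candidate))
        available.remove(best)
        pairs.append((person, best))
    return pairs


def estimated_effect(pairs):
    """Average outcome difference between participants and their matches."""
    return mean(p.mpg_after - c.mpg_after for p, c in pairs)


# Invented records, purely to show the mechanics.
participants = [Person(34, 160, 11, 15.0), Person(61, 120, 12, 11.0)]
non_participants = [
    Person(59, 125, 12, 10.0),
    Person(36, 150, 11, 13.5),
    Person(45, 200, 5, 30.0),
]

pairs = match_pairs(participants, non_participants)
effect = estimated_effect(pairs)
print(f"Matched pairs: {len(pairs)}, estimated effect: {effect:+.2f} miles to the gallon")
```

The third non-participant is left unmatched because nobody in the intervention group resembles them, which is the kind of limitation the audience would need to be told about.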

I think that evaluation has taken huge strides towards independence over the last decade, and we should continue in this vein to establish our own methods of helping people to measure their outcomes.