In putting together our reviews of the evidence on local economic growth policies, we sorted through thousands of research papers, government evaluations and think tank reports. However, not all of this material is equally useful, and it cannot always help us to answer definitively which policies do and don’t work. This guide to scoring the evidence explains why we place more weight on some evidence than on other evidence, and what makes different evaluation methods more or less robust and thus more or less useful to our understanding of policy effectiveness.
This scoring guide will help policymakers and researchers by:
- Providing a guide to different evaluation methodologies, helping those about to undertake an evaluation to choose the best one (read our eight-point guide on how to evaluate for more guidance on this)
- Helping to assess an evaluation after its completion
- Providing a scoring handbook for anyone wanting to assess the robustness of an evaluation
- Providing documentation for anyone wanting to know more about how we rank studies for robustness in our policy reviews.
Our assessment is based on scoring papers on the Maryland Scientific Methods Scale (SMS), which ranks policy evaluations from 1 (least robust) to 5 (most robust) according to the robustness of the method used and the quality of its implementation. Robustness, as judged by the Maryland SMS, is the extent to which the method deals with the selection biases inherent in policy evaluations. This guide examines a wide variety of commonly employed methods and explains how we place them on the Maryland SMS. It then looks at a number of examples of policy evaluations for each method, scoring them on the quality of their implementation.
This guide is no substitute for better technical training and expert advice. But it should help those with some knowledge of evaluation techniques to better understand recent advances and the way that we treat these in our evidence reviews. It’s also important to note that the ranking of individual studies is not an exact science and often involves a degree of judgement. Statistical testing can only take us so far in assessing the suitability of a given method and the quality of its application in practice. When assessing an individual study (including the specific examples that we discuss here) there is always scope for some disagreement on the exact ranking.
The examples that we use are drawn from a wide range of studies. Not all of them focus specifically on local economic growth, but in all cases they demonstrate approaches that could feasibly be taken in future evaluations. You can find examples of evaluations for a number of policy areas we have reviewed on our resources page and via individual policies.
Please note that this document was updated in June 2016 to reflect the way our methodology was refined during the review-writing process. You can find out more about this on our blog.