RESULTS-BASED FINANCING IN EDUCATION: FINANCING RESULTS TO STRENGTHEN SYSTEMS

RESULTS-BASED FINANCING IN EDUCATION
Learning from What Works

RESULTS IN EDUCATION FOR ALL CHILDREN (REACH)
Education Global Practice
World Bank
1818 H Street, NW / Washington DC, 20433 / USA
worldbank.org/reach / reach@worldbank.org

Acknowledgements

This report was written by Jessica D. Lee and Octavio Medina. We are grateful to Omar Arias, Practice Manager, and Samer Al-Samarrai, Senior Economist and Results in Education for All Children (REACH) program manager, for their guidance and support. Helpful comments and feedback were also provided by Mariam Adil, Rabia Ali, Juan Baron, Deon Filmer, Andrea Guedes, Alaka Holla, Peter A. Holland, Sarah Holzapfel, Alice Kunz, and Shwetlena Sabarwal. We also want to thank all of the staff from various development agencies who took the time to speak with us to share their knowledge.

The REACH trust fund is supported by the governments of Germany, Norway, and the United States.

The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments that they represent.
Contents

Acronyms
Executive Summary
    Incentives to Teachers, Students and Families, and Schools
    Incentives to Governments
Introduction
    Which Definition of RBF is Being Used?
    Why Focus on RBF in Education?
    Why Examine Four Different Levels?
    Methodology
    Methodology to Collect Operational and Tacit Knowledge
    Survey Design
Results-based Financing in Education: Donor Portfolios
    ADB (Asian Development Bank)
    AFDB (African Development Bank)
    DFID (Department for International Development)
    GPE (Global Partnership for Education)
    IDB (Inter-American Development Bank)
    NORAD (Norwegian Aid Agency)
    SIDA (Swedish International Development Cooperation Agency)
Part I – When Do Incentives Work?
    Design Issues
    Using RBF with Teachers
        The Theory Behind Using RBF with Teachers
        Does Using RBF with Teachers Improve Outcomes?
        Design Issues: What to Incentivize (Outputs or Outcomes?)
        Design Issues: Who to Incentivize?
            Individual or Group-based
        Design Issues: Which Metrics to Use
            What Metric to Use: Level, Piece Rate, or Rank
        Design Issues: Behavioral Responses
        Design Issues: Sustainability
        Design Issues: Long Term Effects
        Design Issues: Gaming and Cheating
        Conclusion
    Using RBF with Students and Families
        The Theory behind Using RBF with Students and Families
        Does Using RBF with Students and Families Improve Outcomes?
        Design Issues: The Role of Conditionality
        Design Issues: The Role of Information and Labeling
        Design Issues: What to Incentivize
        Design Issues: Who Should Be Incentivized?
        Conclusion
    Using RBF with Schools
        The Theory behind Using RBF with Schools
        Does Using RBF with Schools Improve Outcomes?
        Design Issues: Competitive Distribution of Resources
        Design Issues: Equity
        Design Issues: Household Response
        Design Issues: Long Term Effects
        Conclusion
    Combining RBF Interventions to Overcome Constraints
        Targeting Different Levels
        Combining RBF and Institutional Capacity Building
    Summary
Part II – RBF and Governments: Making RBF More Effective
    Implementation
    Planning
    Design
    Choosing RBF: Commitment, Cautions, Cost, Context
        Commitment
            Recipient Feels “Pressured” into Accepting RBF
            Recipient is Keen to Pursue RBF and Has the Necessary Political Will
        Costs and Benefits (Advantages and Disadvantages of RBF Over Traditional Aid)
            Cost-effectiveness of RBF
        Cautions
        Context
            Country Systems
            Capacity
            Conflict, Fragility, and Violence
    Design Priorities
        Cascading Incentives
        Selecting and Pricing Indicators
            DLI Analysis: The Basics
            DLIs and Results Chains: Few DLIs Focus on Outcomes
            Examples of Results Chains: Bangladesh, Lebanon, and Tanzania
            DLIs and Short Project Timelines
            Pricing DLIs: Three Hypotheses and a Heuristic
            DLIs: Scalability and Disbursement Models
            Zero/Global DLIs
    Adaptive Implementation
        Monitoring and Information Systems
            Purpose of Education Management Information Systems (EMIS)
            Building Systems for RBF
            Level of Complexity Needed for Systems
            Monitoring Options
        Verification
        Gaming and Cheating
        Implementation Quality
        Failure to Achieve Targets
    Sustainability
Concluding Remarks
References
Acronyms

African Development Bank (AFDB)
Asian Development Bank (ADB)
Conditional Cash Transfers (CCT)
Department for International Development (DFID)
Disbursement-Linked Indicators (DLI)
Education Management Information System (EMIS)
Financial Management (FM)
Fragile and Conflict/Violence affected (FCV)
Girls Education Challenge (GEC)
Global Partnership for Education (GPE)
Health Results and Innovation Trust Fund (HRITF)
Information Technology (IT)
Inter-American Development Bank (IADB)
International Anti-Corruption Resource Center (IACRC)
Investment Project Financing (IPF)
Management Information System (MIS)
Management System for Education Quality (SIGCE)
Non-Governmental Organization (NGO)
Norwegian Aid Agency (NORAD)
Program for Results (PforR)
Programa de Asignación Familiar (PRAF)
Project Appraisal Document (PAD)
Public Financial Management (PFM)
Quality Assurance System (QAS)
Results in Education for All Children (REACH)
Results-Based Financing (RBF)
Sistema de Administração Financeira do Estado (SISTAFE)
Standard Deviation (SD)
Swedish International Development Cooperation Agency (SIDA)
The Vaccine Alliance (GAVI)
United States of America (US)

Executive Summary

Results-based financing (RBF) has gained popularity in the international development community because of its potential to make education spending more effective and efficient. In the education sector, RBF has been primarily applied to four levels: teachers; students and families; schools; and governments. The results overall have been mixed, with some notable successes and some disappointing experiences.

This report explores when and how RBF can help achieve better impacts in education. While there is no rigorous evidence available to suggest that RBF on its own is better at producing learning outcomes relative to other development financing modalities, there is a significant amount of research that shows RBF can have positive effects by incentivizing specific stakeholders in the education system.
In addition, there is operational evidence available on how RBF can be designed and implemented with country partners more effectively. It is important for practitioners and policymakers to learn from this evidence as the RBF portfolio in education grows across development agencies.

Incentives to Teachers, Students and Families, and Schools

Though the research is neither comprehensive nor definitive, there is substantial evidence to suggest the following:

1. RBF and teachers: Teacher incentives can, but do not always, improve teacher attendance and student learning. The design of the incentive scheme and the context matter. The effects are larger and more positive in developing country contexts.

2. RBF and students and families: Student and family incentives (such as conditional cash transfers, or CCTs) have a good track record of reducing school dropout and increasing school attendance, though the evidence for their effects on student learning is more mixed. Conditional transfers to students tied to their own learning are a promising area of future research.

3. RBF and schools: The evidence base on the effectiveness of performance-based grants is still quite limited. For now, it seems that in some cases they can work, especially when grants are combined with other interventions such as capacity building (for example, for principals and school committees) or when money is spent on inputs that affect learning outcomes.

4. Synergies: There is growing evidence that combining different RBF interventions within the same program can generate results that go beyond the sum of any two interventions alone. Though the research is limited, this suggests that RBF that tackles several bottlenecks at once can have larger effects.

Incentives to Governments

There is much less robust research available on results-based financing arrangements between donors and country governments. Though there is some evidence from other sectors such as health, few of these programs have been rigorously evaluated in education. However, there is a large base of operational knowledge across multiple agencies, which points to several key criteria for more effective RBF:

1. Choosing RBF as the appropriate financing modality requires careful consideration of political commitment and an understanding of the risks involved, the costs, and the country context (for example, capacity and country systems).

2. RBF project design should prioritize the cascading of incentives and should select and price indicators with an objective or methodology in mind. Possible objectives include cost effectiveness, increasing the chances of achieving other indicators, or reducing the risk of nonpayment.

3. RBF project implementation should consider the purpose of monitoring and information systems, invest upfront in verification, and be adaptive and flexible in order to address realities on the ground.

Introduction

It is not clear that the majority of development financing has been either effective or sustainable, and many stakeholders in international development are keen to change this. In recent years, results-based financing (RBF) has been championed as a way to increase both the efficiency and effectiveness of aid. While there are those who defend it and others who decry it, almost everyone wants to know: does it work? For various reasons that will be explored in this report, this is not a yes or no question. However, it is possible to investigate when and how RBF can improve results in education.

This report will outline the theories, design considerations, implementation issues, and impact of RBF in the education sector and examine how RBF has worked, or not worked, when used with teachers, students and families, schools, and governments.
Specifically, this report will examine three types of evidence: (i) particular types of RBF interventions, such as teacher incentives, on which many studies have been done; (ii) the operational knowledge of development agency staff who have designed and implemented RBF projects in education; and (iii) documentation from RBF projects from different stages of the project cycle. Many of the lessons can be generalized across sectors, though we will draw our examples primarily from the education and health sectors.

Which Definition of RBF is Being Used?

The world of results-based financing is populated by an alphabet soup of acronyms. For the purposes of this report, results-based financing (RBF) is an umbrella term referring to any program or intervention that provides rewards after the credible verification of an achieved result. These rewards can be monetary or non-monetary and can be partial (such as a bonus on top of a salary) or whole (such as the cost of training a teacher under output-based aid).[1]

There are differing opinions on what actually constitutes results-based financing, with much of the debate centered on what constitutes a “result.” In this report, results are defined broadly. They can be outputs (such as the implementation of a new teacher training system), intermediate outcomes, final outcomes (such as learning) or, more likely, a mixture. Importantly, the dividing line between inputs and outputs can depend on which particular bottleneck the RBF is being used to resolve and on the objectives of each specific project.[2]

Why Focus on RBF in Education?

There has been less research, either qualitative or quantitative, done on RBF in education than in other social sectors, such as health. There are various reasons for this, including the fact that there have been more long-standing examples of RBF in the health sector, notably through the World Bank-managed Health Results and Innovation Trust Fund (HRITF) and the public-private partnership GAVI (the Vaccine Alliance), both of which have specific goals and target populations. In the original GAVI scheme, bonuses were paid out for every additional child over the baseline who received vaccinations.[3] Many of the early HRITF interventions were aimed at increasing the use of maternal and neonatal care services, which translated into indicators such as immunizations, clinic visits, and the delivery of babies in a health facility.[4] These types of indicators are inherently quantitative and thus more easily measured than outcomes such as learning.

The World Bank’s 2018 World Development Report states that, even when learning is the explicit goal, achieving that goal can be difficult because tasks within the education system are often carried out in a fragmented way by many different actors, which dissipates accountability.[5] This can make it challenging to accurately identify the binding constraints in any particular country’s education system, much less know which constraints can be overcome by incentives. Moreover, education indicators are often not inherently quantitative, particularly when related to quality. For example, simply training teachers does not necessarily lead to better learning outcomes, nor does increasing enrollment rates. The education system and theories of change within the system are complex and contain multiple actors whose actions must be aligned in order for learning to occur.[6] Thus, it is important to assess the promises and pitfalls of RBF to help different stakeholders in the education sector understand what it can and cannot accomplish.

[1] World Bank (2017)
[2] World Bank (2017)
[3] Pearson et al (2010)
[4] World Bank (2015)
[5] World Bank (2017c)
[6] World Bank (2017c)

Why Examine Four Different Levels?

This report focuses on four different levels: teachers, students and families, schools, and governments. Why look at different levels of RBF interventions? Simply put, the education sector in every country is not a monolithic entity, but a system with many moving parts.[7] The role of RBF interventions varies based on the level of the intervention, and so do the relevant actors involved in the process. As such, while there are things in common, the guidance and design issues can be quite different. This analysis at different levels can provide more nuance than treating RBF interventions in education as a homogeneous whole.

For example, at the teacher level, a constraint may be whether a teacher shows up or has the content and pedagogical knowledge to teach. At the student and family level, a constraint is getting the child to attend school (and then learn). At the school level, managing inputs effectively becomes crucial, whereas at the government level policy and incentive design may take precedence. The problems and variables are related, but the constraints and possible solutions vary.

Of course, some variables are common to all levels. An actor or an institution’s performance depends on variables like motivation, inputs, and skill sets. But these can take different forms.

Table 1: Levels, Roles and Constraints in Education Systems

LEVEL | ROLE | SAMPLE CONSTRAINTS
Teachers | Service delivery agents | Showing up, effective teaching
Students and Families | Users of the service | Attending school, learning
Schools | Managers of front-line service delivery | Leading school staff, managing inputs
Governments | Designers and managers of the system | Designing policy and incentives, allocating resources

Methodology

Extensive literature already exists that describes the theoretical underpinnings of RBF.[8] This report will not restate this discussion but rather will present the evidence of when and how RBF can improve educational outcomes. The target audience for this report is stakeholders who are interested in using RBF to unlock binding constraints in the education system to improve learning. A limitation of this report is that we were not able to directly solicit the views of country clients, but we advocate for more analytical work to be undertaken to reflect the perspectives of local actors and stakeholders, as they are clearly an important constituency in RBF. There is work underway as part of the REACH trust fund to gather more country-level information about RBF.

For the section on When Incentives Work, we reviewed academic research available on the following levels: RBF and teachers; RBF and schools; and RBF and students and families. The scope of the review included articles published since 2000.[9] We restricted the search to experimental or quasi-experimental impact evaluations in developing countries, though relevant or seminal research from developed countries is cited where appropriate. The primary search tool was Google Scholar. A number of other studies that had not been identified were added later. These were (i) references given in articles found during the initial search, (ii) newly published articles that were not available during the initial drafting of the document, and (iii) other articles pointed out by colleagues and reviewers. A total of 41 impact evaluations were included. Additionally, 8 reviews of the existing evidence were included in the search.[10]

After the initial search, papers were kept or discarded based on whether the intervention evaluated fell under the three RBF levels. Papers were then classified by theme and coded for similarities and differences, including a note on effect sizes (when possible). A conceptual framework for each topic was outlined, and gaps in the literature were identified. The goal of this literature review was to provide greater context for the findings from the REACH grants, and to underscore how they contribute to the evidence base. Strictly operational evidence plays a smaller role in the When Incentives Work section because impact evaluation evidence was available.

Unfortunately, there is less academic evidence available on the impacts of RBF and Governments, which drove the decision to focus this section on operational evidence. Nonetheless, quasi-experimental and experimental evaluations were added where available. Finally, both sections draw from the theoretical literature on the relevant topics.

[7] World Bank (2017)
[8] See, for example, Clist (2016), Clist and Verschoor (2014), and Birdsall and Savedoff (2012).
[9] Though most of the RBF literature was published in the 2010s, some of the impact evaluations of conditional cash transfers date back to the early 2000s and even the late 1990s. We chose to only include articles published after 2000 to narrow down the scope.
[10] Murnane and Ganimian 2014, Bard et al 2014, Evans and Popova 2016, Glewwe and Muralidharan 2016, McEwan 2015, Molina-Millan et al 2016, Fiszbein and Schady 2009, Results for Development 2016, Snilstveit et al 2015, Imberman 2015.

Methodology to Collect Operational and Tacit Knowledge

The operational information that we use in the report comes from our desk review of documentation of projects in the education sector that have RBF components and were either cited by survey takers as good examples to review or stood out as flagship projects, such as Big Results Now in Tanzania. We also conducted a survey of 46 staff from development agencies who design, implement, and evaluate RBF interventions and programs in the education sector to elicit their opinions on RBF in education, keeping their responses anonymous to encourage candor. We also carried out follow-up interviews with 19 of these staff to supplement the findings of the survey in more detail. These interview responses were also anonymous. We developed the survey questions based on the experiences from REACH, which generates evidence and knowledge on RBF in education.

What is REACH?
Results in Education for All Children (REACH) is a program housed at the World Bank that supports efforts to improve education, especially for the most vulnerable populations, by helping country systems focus on results. It was established in 2015 and currently funds 33 RBF activities in education in 23 countries around the world. REACH also provides technical support and advice on RBF in education to World Bank teams and other development partners. The main purpose of the program is to contribute to the evidence base around RBF in education.

Table 2: Sources of Information

SOURCE | NUMBER | AGENCIES / TOPICS
Impact evaluations (quasi-experimental and experimental evidence) | Total: 42 papers [11] | RBF with Teachers: 16; RBF with Students and Families: 13; RBF with Schools: 6
Meta-analyses | 8 reviews of evidence and meta-analyses | Several topics: 5; RBF with Teachers: 1; RBF with Students and Families: 2
Project documentation | Total: 20 documents (a mixture of project design documents, status reports, completion reports, and evaluations) | ADB, DFID, GPE, IDB, WBG; RBF and Governments: 20
19-question survey | 46 respondents | ADB, SIDA, DFID, GPE, WBG, independent evaluation firm
Follow-up interviews | 15 interviewees | ADB, SIDA, DFID, GPE, IDB, WBG

[11] Some covered several areas.

Survey Design

To capture practitioners’ perceptions and insights, we created a 19-question survey that aimed to define the attitudes and behavior of staff in development agencies. The survey was sent to roughly 200 individuals who work in international development in the education sector, though not necessarily with any experience with RBF.

The agencies that were chosen to participate in the survey were those that have funded and/or implemented operations in education that have used RBF. Many of these agencies have had experience with RBF through projects for which the World Bank has been the main implementing agency.
We created distribution lists for the survey and emphasized that anyone could opt in to take the survey. There were 46 respondents from five development agencies (the Asian Development Bank (ADB), the Swedish International Development Cooperation Agency (SIDA), the UK Department for International Development (DFID), the Global Partnership for Education (GPE), and the World Bank Group (WBG)) and one independent consulting company that has done evaluations of RBF programs for the DFID. We subsequently conducted 15 follow-up, semi-structured interviews with staff from six development agencies: the ADB, SIDA, DFID, GPE, the Inter-American Development Bank (IDB), and the WBG.

To ensure that the survey responses were focused on the education sector and were practitioner-oriented, the respondents were asked whether they had designed, implemented, and/or evaluated RBF activities in education. The respondents could choose from multiple answers: 34 indicated they had designed RBF activities, 28 had implemented them, and 15 had evaluated them. Two respondents indicated that they were starting an RBF project, another two wrote that their work was not related to RBF (so it is unclear why they took the survey), and a final pair wrote that they oversaw RBF activities as part of their portfolio. The survey tool did not allow for matching responses, so it is unclear whether those who said they designed activities also implemented them or evaluated them. However, the data still showed that most of the respondents had expertise in the design and implementation of RBF activities in education.

The survey data are purely qualitative and may be skewed given that respondents self-selected into taking the survey; in other words, those who completed it may have felt more strongly than average about RBF for positive or negative reasons. In addition, the survey only captured the viewpoints of staff of development agencies, not of officials in national-level ministries and administrative units or other recipients and implementers of RBF.

Wherever possible, we have bolstered this qualitative information with supporting evidence from other sectors (the most comparable social sector being health) and with other published studies.

RESULTS-BASED FINANCING IN EDUCATION
Donor Portfolios

RBF is a relatively recent phenomenon. Although the first interventions appeared in the 1990s, they were slow to take off. Beyond several pilot initiatives, most of the growth in RBF projects has happened over the past decade. The Bank launched its Program for Results (PforR)12 instrument in 2012 (though it had been using investment project financing with disbursement-linked indicators even earlier), and the Asian Development Bank (ADB) launched its six-year pilot of an RBF instrument in 2013. More recently, the African Development Bank has proposed creating a lending instrument based on investment project financing (IPF)13 and disbursement-linked indicators (DLIs).14 Nonetheless, the World Bank remains the primary funder of RBF initiatives in education, both as a direct lender and through implementing agreements with other donors such as the GPE.

Figure 1 presents a summary of the current results-based programs of some of the largest aid donors. Note that the World Bank Group also manages trust funds and other lending mechanisms financed by third party donors. Since these are hard to track down, results are shown by ultimate lender of the funds allocated.

Figure 1: Education RBF Portfolio of Major Donors (2014–2018)

12 PforR or Program for Results is one of the three World Bank financing instruments. What sets it apart from other instruments is its focus on results.
It uses a country’s own institutions and processes and disburses funds against the achievement of a series of agreed-upon results.
13 IPF provides a loan or credit/grant financing to governments for activities that create the physical/social infrastructure necessary to reduce poverty and create sustainable development.
14 Disbursement-linked indicators (DLIs) provide the government with incentives to achieve key program milestones and improve performance.

ADB (Asian Development Bank)
Since July 2014, the ADB has committed a total of US$4.28 billion in RBF projects. Of these, US$1.11 billion (or 27.9 percent of the total) is committed in the education sector. Only the energy sector has a higher amount of RBF lending from the ADB (with US$1.5 billion). The ADB was one of the pioneers in results-based financing, introducing it in 2013 for a six-year trial period. The most recent evaluation of the pilot (published in November 2017) suggested that the pilot had had generally positive results. Despite some delays because of a lack of familiarity among the implementers with the design and execution of these projects, RBF projects have been rolled out effectively. Key stakeholders (in both governments and agencies) have endorsed the lending instrument, and as a result, demand for RBF is expected to grow.

AFDB (African Development Bank)
The African Development Bank approved its results-based financing instrument in November 2017. As yet, there are no projects that are using RBF.

DFID (Department for International Development)
The UK’s DFID has been active in using results-based financing for almost a decade. Since 2011 there has been a strong focus on the value for money component of RBF, to the point where it is often associated with DFID.15 There have been three main education projects based on RBF. The first was a 2011 pilot in Rwanda that rewarded schools for the number of their students who completed education. The second was a 2012 project in Ethiopia that rewarded the government for the number of students who took and passed a graduation exam. The lion’s share of the DFID’s RBF budget is spent on the Girls’ Education Challenge (GEC), a global RBF initiative to fund projects aimed at increasing girls’ access to education. The budget for the GEC alone is around US$454.4 million (not shown in the graph because it was approved before 2014).

GPE (Global Partnership for Education)
The Global Partnership for Education (GPE) adopted a new results-based funding model in 2014. Up to 70 percent of GPE grants are disbursable following the adoption by the government of an education sector plan and a commitment to increase education spending and to improve data collection and analysis. This part is not strictly contingent on results. However, the remaining 30 percent of the GPE grant is paid only if targets are met. Disbursement is contingent on improvements in three dimensions: equity, efficiency, and learning outcomes. As of February 2018, the total GPE grant portfolio was around US$1.8 billion, of which US$130 million (or 7 percent) was part of the variable tranche and thus results-based.

IDB (Inter-American Development Bank)
The IDB pioneered a results-based instrument called performance-driven loans (PDL) in 2003. Several loans were approved, but the results were mixed.16 As a result, the instrument was discontinued in 2009. Among the reasons for this was the fact that there was little demand because of the strict requirements related to disbursement.17 Also, the verification of outcomes caused disbursement delays because outcomes had to be matched to specific expenditures. In 2017, a new results-based program (Programa basado en resultados or PBR) was piloted.
This instrument corrected the perceived problems of PDL, in that it disburses against outcomes instead of expenditures and allows partial disbursements against partial outcomes, among other changes. As of 2018, one education project is being implemented under this instrument for a total of US$30 million.

15 ICAI (2018)
16 R4D (2016)
17 IDB (2014)

NORAD (Norwegian Aid Agency)
The Norwegian Aid Agency (NORAD) has participated in results-based financing initiatives in three key areas: health, climate and forestry, and clean energy.18 Norway has committed around NOK 2.1 billion to the World Bank-managed Health Results and Innovation Trust Fund (HRITF) and another NOK 1.1 billion to Gavi (the Vaccine Alliance). Norway has also channeled NOK 6.4 billion bilaterally (to Brazil and Guyana) through the Norwegian International Climate and Forest Initiative. However, there are no significant investments in education RBF as of today.

SIDA (Swedish International Development Cooperation Agency)
The Swedish International Development Cooperation Agency (SIDA) has been active in RBF for a few years in a number of areas. However, it does not publish centralized portfolio data, so it is hard to gauge the amount of funds committed to RBF.

18 NORAD (2015)

Part I – When Do Incentives Work?

Results-based financing is based on the idea that incentives can help individuals and agencies in the education sector to work towards improving learning outcomes for all. Results-based financing can take many different forms. Teacher incentives and performance grants focus on service delivery agents (teachers) and organizations (schools), while interventions such as cash transfers focus on the users and recipients of the service (students and families). These topics were selected because they have been the most commonly researched topics when it comes to RBF and education, and because they showcase several key features of RBF at sub-national levels.
In this section, we explore each type of incentive using the existing, global evidence base and discuss the key factors to consider when planning an RBF intervention. Key findings from REACH-funded grants will be used as case studies that add to the academic literature and illustrate the practical challenges of designing and implementing an RBF intervention. The choice of the design issues to discuss is driven by the available evidence. There are many design choices that have unfortunately not been studied yet, and this limits the analysis. Some of these possibilities for further research are mentioned later on.

The following table shows some of the overarching design issues that are relevant to all interventions. As we move through each section, we will highlight those particularly relevant to the level in question (teacher, students and families, and schools).

Design Issues

Table 3: Design Issues to Consider
The incentive scheme | How will the scheme itself work? What metrics will be chosen as indicators? Does complexity matter? Do conditionalities matter?
Who to incentivize? | What actor should be incentivized? Sometimes the relevant actor is obvious (such as with teacher incentives), but in many cases there are several options available.
What to incentivize? | Should one incentivize final outcomes (such as learning) or intermediate outcomes and outputs (such as teacher attendance or student enrollment)? Should one incentivize at the individual level (for example, individual teachers), or at the group level (schools)?
Behavioral responses | What unforeseen behavioral responses could appear? Is this likely to change the perceived beliefs, preferences or identities of the agents involved? Is this likely to cause gaming and cheating?
Sustainability | Will the effects of the intervention last? Is the intervention financially sustainable?
Complementarities | Should one combine RBF interventions with other interventions? Are the effects additive?
Using RBF with Teachers

Teacher incentives are schemes that reward teachers for their performance. The rewards are usually cash, but sometimes they can be in-kind (such as a bag of rice) or intangible (for example, a certificate of recognition). Incentive schemes can be designed in many different ways. For example, incentives can be individual or group-based, and they can be linked to the attendance or performance either of the students or of the teachers themselves. The rationale behind these interventions is that a conditional reward will lead to increased teacher effort, which will lead to improved student outcomes.

The Theory Behind Using RBF with Teachers

The theory behind paying teachers for their performance is based on personnel economics and compensation theory. Under a contract that pays a fixed salary, agents have no incentive to supply effort19 since compensation is not contingent on an output. However, linking payment to some sort of output or outcome (such as student results or teacher attendance) will theoretically induce teachers to supply more effort and therefore increase or improve that output or outcome.20

This improvement in results can happen through several channels. One of them is simply higher teacher attendance, which is especially relevant in developing countries where teacher absenteeism levels are often quite high.21 High rates of teacher absenteeism obviously hinder student learning. Therefore, schemes that incentivize teachers’ effort and lead to better attendance can lead to better results simply by increasing the number of hours of teaching. Of course, teachers can improve their attendance while the amount of instruction time stays the same (for example, if teachers allocate their time to administrative tasks or are present in schools but not in the classroom). However, there are also other ways for results to be improved. The incentives might induce increased effort from those teachers who already show up by making them more motivated (for example, if teachers felt that they were not being valued for their contributions) or by making teachers fear dismissal. Both of these impulses could induce teachers to spend more time on teaching, to make the content of their teaching more effective, and in general to engage in other strategies to improve student learning.22

19 In the simplest model, beyond the minimum effort threshold under which they will be fired.
20 For example, see Lazear (2003).
21 Duflo et al (2012)
22 Glewwe and Muralidharan (2016)

Does Using RBF with Teachers Improve Outcomes?

The evidence on teacher incentives as a whole is mixed. A vast review conducted in 2016 found that teacher incentives do not qualify as one of the education interventions that consistently improve student outcomes.23 Some teacher incentive schemes seem to improve student performance, even substantially,24 while others have no effect.25

In general, the results for developed countries are the most disappointing.26 A theoretical reason is that salaries in those countries are already relatively high and, therefore, higher incentives would be required to get a significant behavioral response. However, some incentive schemes with modest bonuses have managed to elicit large responses, so the relative size of the incentive may not be the main factor behind these differences between developed and developing countries.27 It is also worth mentioning that increasing teacher salaries unconditionally does not lead to better student outcomes whatsoever.28

The results for interventions in developing countries are somewhat more positive, and the effects are larger. A recent meta-analysis found the effect size of teacher incentives on student learning in developing contexts was around 0.08 SD for math and 0.00 SD for language.29 In general there is a wide range of results, with some interventions reporting large effects30 and some reporting smaller or even negligible effects. Nonetheless, the evidence base is still limited by the small number of interventions that have been rigorously evaluated.

The best way to reconcile these divergent findings for now is to recognize that design and context matter a lot. Table 4 lists a series of crucial issues that affect the design of teacher incentive interventions.

23 Evans and Popova (2016)
24 See, for example Muralidharan and Sundararaman (2011) in India or Lavy (2009) in Israel.
25 Or even a negative effect; see Fryer (2013) for an intervention in New York City public schools.
26 Imberman (2015)
27 Murnane and Ganimian (2014)
28 De Ree et al (2015)
29 Snilstveit et al (2015)
30 For example, Muralidharan and Sundararaman (2011) reported a 0.28 SD improvement in math and a 0.16 SD in language.

Table 4: Factors that Can Affect Teacher Incentives
FACTOR | EFFECT
Size | The size of the incentives does not seem to have a large impact on the size of the effect.
What to incentivize | Incentivizing attendance can improve attendance if monitoring and accountability mechanisms are in place, but has mixed effects on learning outcomes. Incentivizing learning outcomes has mixed effects on learning outcomes. Some interventions have yielded substantive improvements, but others have not. Context and design seem to be key.
Who to incentivize: Individual or group-based | Both individual and group-based incentives have been shown to have positive effects, though the latter tend to be smaller.
Which metrics to use: level, gains, and percentiles | Unclear. Some report “pay-per-percentile”31 is more effective at raising student scores across the distribution than other simpler schemes (like levels or gains), while others report that it has similar results to paying for learning gains or level attained. There may be a trade-off between design complexity and ease of use.
Behavioral responses: loss aversion | There is some evidence from the US that loss aversion (receiving a bonus upfront and losing it if learning outcomes do not improve) can induce higher levels of effort, but more evidence is needed.
Behavioral responses: high stakes and uncertainty aversion | Evidence from other fields suggests that high stakes incentives (for example, high enough to induce significant volatility in income) can decrease performance and make agents more risk-averse.
Behavioral responses: information | Evidence from other fields suggests that information in the form of labeling or framing can have an effect on how agents understand an incentive. For instance, framing an experiment as a “Community Game” can make agents more cooperative.32
Gaming and cheating | Incentives can induce agents to cheat by altering the results. This can create the illusion of improvements in learning outcomes that disappears when assessed using a different instrument or test.
Sustainability | Teacher incentives can be designed to be cost-neutral, but some agents will be net losers and will be likely to oppose the scheme.
Long-term effects | Besides changing the way current teachers behave, incentives can also change the applicant pool for future teachers. There is limited evidence that introducing incentives could attract more qualified teachers, at least based on their grades.
Source: Authors’ summary

31 “Pay-per-percentile” rewards a group of students’ ranking vis à vis a comparable group of students, whereas “learning gains” rewards increases in test scores, and “levels” rewards reaching a certain threshold (for example, a passing test score).
32 Gneezy et al (2011)

Design Issues: What to Incentivize (Outputs or Outcomes?)

A design question that comes up quite frequently is what to incentivize in a teacher incentive scheme. The overall objective is to improve learning outcomes, but as previously mentioned, this can be achieved through different channels and mechanisms. Traditionally, two types of incentives have been used: those that reward effort (for example, teacher attendance) and those that reward outcomes, which were just discussed (for example, student results).33 In RBF terms, the latter option involves rewarding results that are further along the results chain (an outcome), whereas the former would be rewarding an intermediate outcome.

In the case of teacher incentives, some evidence suggests that rewarding attendance can increase teacher attendance. Teacher absenteeism can be very common in many countries, and this directly undermines student learning. In an intervention in Rajasthan, India, researchers found that paying teachers for their daily attendance reduced absenteeism by 21 percentage points and increased test scores (0.17 SD).34 To make sure teachers were actually present in the classroom, schools were given cameras with tamper-free timestamps. Every day, students were instructed to take a picture of the teacher both at the beginning and at the end of the school day.

However, attendance incentives do not seem to have a significant effect on student learning outcomes (and occasionally even on attendance itself) as several other interventions have failed to show any positive effects. A crucial factor seems to be the accountability mechanism. One of the successful programs that has been evaluated35 used cameras, whereas other programs delegated accountability to either the school (for example, the principal) or the community. This can be an attractive alternative because cameras are expensive and can still be tampered with. However, delegating accountability to schools can be ineffective since there is a risk of collusion between teachers and other stakeholders. An intervention in Kenya gave principals bonuses if they reported teacher absenteeism,36 but the program had no effect on attendance or student learning (in fact, the principals reported enough missing teachers to get the bonus but no more).

A different program in Uganda tested a scheme to crowdsource attendance reporting from principals and parents and concluded that giving bonuses to principals but not to parents led to somewhat improved teacher attendance and higher reporting of absent teachers.37 However, both principals and parents systematically under-reported absences, which once again suggests that delegating attendance reporting to schools may be cheaper but somewhat ineffective.

33 Murnane and Ganimian (2014)
34 Duflo et al (2012)
35 Duflo et al (2012)
36 Cited in Murnane and Ganimian (2014)
37 Cilliers et al (2013)

Design Issues: Who to Incentivize? Individual or Group-based

Another design question is whether incentives should be individual or group-based. In general, the evidence seems to show that both individual and group incentives can work, though the latter tend to have smaller effects.38 In other sectors, group incentives have also been shown to work.39 Theoretically, individual incentives should work better because they connect a person’s effort with a reward whereas group effort is beyond the individual’s sole control, though arguably individual student learning is also beyond the teacher’s control. However, there is some evidence that group incentives may still work. For example, a study in the Indian state of Andhra Pradesh found that rewarding teachers for group-level improvements led to improvements in student learning similar to those seen for individual incentives. However, after two years of implementation, the individual-incentive group had shown greater improvements.40

38 Murnane & Ganimian (2014), Glewwe & Muralidharan (2016), Lavy (2002), Lavy (2009)
39 Garbers and Konradt (2014)
40 Muralidharan & Sundararaman (2011) and Muralidharan (2012)

Design Issues: Which Metrics to Use

What Metric to Use: Level, Piece Rate, or Rank

Finally, in order to incentivize student outcomes, the question arises of what specific metric to choose. There are broadly three options: by levels, piece rate, or rank. An incentive scheme based on levels will reward teachers by the number of students hitting a certain target (or level). For example, an intervention can provide a bonus to teachers based on the fraction of students who pass an exam. While these systems are easy to understand, they can also provide some perverse incentives.41 For example, if the system rewards teachers for the number of students who pass a test, then they are given no incentive to help students at the bottom end of the distribution (because it is unlikely that they will pass). Instead, teachers may feel that it makes more sense for them to focus on students around the middle of the distribution. However, this can be mitigated by including several different thresholds in the design of the incentive. Piece-rate systems, by contrast, reward teachers for incremental improvements rather than for reaching a single threshold (for example, the total increase in the test scores of students in a class). Finally, rank-based incentives provide bonuses based on the teachers’ ranking vis à vis the rest of the universe (for example, the other teachers in the district).42 A key aspect of a rank-based system is that it is cost-predictable because one teacher’s percentile gain is another teacher’s percentile loss. This may be attractive to administrators because it makes it easier to gauge the fiscal impact of a program.

A recent study proposed a hybrid method named pay for percentile.43 This method rewards teachers for their students’ ranking position in comparison with an equivalent group of students defined in advance. Therefore, it combines the features of a piece-rate system and of rank-based incentives. It is similar to piece-rate systems in that an improvement is “worth” the same at every point of the distribution and is similar to rank-based systems in that improvements are measured by percentiles (relative position) instead of scores.

An evaluation of an intervention in China found that a pay-for-percentile scheme did indeed work better than two similar schemes that rewarded teachers based on a class average of the level reached by students or of their gains, as measured by a test.44 Despite the greater complexity of the intervention, teachers understood it and reacted accordingly. While in the levels and gains treatment groups the teachers mostly focused on the students whom they thought could improve the most, in the pay-for-percentile scheme they increased the coverage and intensity of their instruction for the class as a whole.

To look at whether these effects have been replicated elsewhere, REACH funded an evaluation in Tanzania that involved two interventions, one that used levels and one that used pay for percentile.45 The first intervention rewarded teachers based on how many students attained a certain level or score in a test. However, to avoid the perverse incentives just discussed, the intervention created different levels across the distribution. This partly solved the problems of simple models with just one or two levels (where teachers would have no incentive to help the students that fall far away from the threshold). Contrary to the intervention in China,46 the evaluation of these interventions found that the levels system worked just as well as the pay-for-percentile system. Therefore, further research is needed to tease out whether pay-for-percentile really is a better way of eliciting improvements in teacher performance and student outcomes. A key issue is whether there is a complexity-efficiency trade-off.

It is also worth noting that changes in test scores can be prone to volatility caused by student cohort characteristics or one-time shocks.47 This means that incentive schemes can be based on noisy metrics that do not accurately reflect reality (and effort). This is especially problematic for smaller schools, since the chances of random variation are much higher in those schools because of their small number of students. Test scores themselves, rather than changes in test scores, are less noisy.48

41 Murnane and Ganimian (2014)
42 Imberman (2015)
43 Barlevy and Neal (2012)
44 Loyalka et al (2016)
45 Mbiti et al (2018b)
46 Loyalka et al (2016)
47 Barrera-Osorio and Ganimian (2016)
48 Murnane & Ganimian (2014)

Design Issues: Behavioral Responses

Another critical design issue is the behavioral response of the agents involved. For example, an intervention in Chicago public schools included two different treatments: a teacher bonus based on student performance in an exam at the end of the year, and a lump sum of money given to teachers in advance that would be taken away if student outcomes did not improve.49 Interestingly, the latter treatment had a positive effect on student learning, but the former did not. The authors of the evaluation of this intervention suggested that this result may have been due to loss aversion, the human tendency to weigh losses more heavily than equivalent gains.50

Information can also alter a person’s behavioral response. Some evidence suggests that the size of the incentive on its own is not crucial, but that it can interact positively or negatively with other design features.51 Teacher incentives, both monetary and non-monetary, convey a lot of information beyond the incentive scheme itself. This information can have a very important impact on the behavioral response of teachers, an issue that has been researched extensively in the behavioral economics literature.52 For example, the introduction of an incentive scheme can make the teachers think that they are not trusted by whoever is in charge of the incentives (perhaps the Ministry of Education), leading to a drop in their morale.

Thus, information can have an effect through what it reveals about the designer/implementer and what it reveals about the situation itself. An example of the former is when teachers lack motivation because they believe that being offered an economic incentive for effort means that the administrators do not trust them. An example of the latter is when teachers teach to the test because they see the offer of incentives as treating teaching as a market exchange rather than a vocation. There are endless examples of cases where the interpretation of a specific design can lead to very different outcomes, which highlights the need to think about the realities of the implementation of a specific program beyond its theoretical design. It is not just the design of an incentive that matters, but the way this is conveyed to the agents themselves. This shows that clearly communicating to relevant stakeholders how an incentive scheme will work and the ideas behind it is incredibly important.

49 Fryer et al (2012)
50 Fryer et al (2012)
51 Loyalka et al (2016)
52 See Bowles and Polanía Reyes (2012)
Design Issues: Sustainability

An additional factor to keep in mind is the sustainability of teacher incentives. The financial sustainability of the incentive program largely depends on the nature of the program itself. It is possible to design interventions that are relatively cost-neutral, or at least cost-predictable.53 For example, if part of teachers’ annual salary increases is instead redesigned as variable pay, the only additional cost will be the administration of the program (for example, the grading of any test used to evaluate students’ learning outcomes). However, the political economy of such an intervention is likely to be much more problematic as it inevitably involves winners and losers. Instead, a program that adds a variable pay component to any planned annual salary increases will be more popular but also significantly more expensive.

Chile stands out as an example of how countries have addressed these challenges. Instead of rolling out a teacher incentive system outright, it created a voluntary scheme first and engaged in extensive consultations and negotiations with teacher unions. This gave teachers a chance to get to know the system and choose whether or not to adjust their classroom practices. The government also implemented a series of reforms (such as steady salary increases) that reassured teachers that any losers from the new system would be compensated.54

Design Issues: Long-term Effects

Another question is what the long-term effects elicited by the program will be. There are two channels through which teacher incentives may change teacher behavior: (i) changing the behavior of current teachers, which will happen in the short term, and (ii) changing the population of people who apply to be (and then become) teachers, which is likely to become more important in the long term.

Unfortunately, there is not much evidence on the effect of teacher incentives on recruitment. But theoretically they could alter the pool of applicants, perhaps by attracting more motivated candidates (who would do better under a pay-for-performance scheme) and discouraging less motivated candidates. There is experimental evidence that shows that the way a position is advertised can have drastic effects on the people who are recruited. In Zambia, an intervention tested the effect of two advertisements for the same community worker job. The first ad highlighted career progression as a key component of the job, whereas the second highlighted community service. The career progression ad attracted candidates who did not seem different based on observable characteristics but then went on to perform far better than the other group.55 Additionally, evidence from the US suggests that teacher incentives could actually attract more qualified candidates (at least judging from grades and test scores).56 REACH is currently funding research to examine how pay-for-performance teacher contracts in Rwanda affect selection and recruitment.

53 Muralidharan and Sundararaman (2011)
54 World Bank (2017c)
55 Ashraf et al (2014)
56 World Bank (2018e)

Design Issues: Gaming and Cheating

When money (or some other good) is tied to a performance indicator, there will inevitably be perverse incentives for the agent involved. The result is cheating and gaming. For example, teachers could tell their students what to answer when they take high-stakes exams since the results will be used to evaluate the performance of the teacher. Teaching to the test is another (much less flagrant) way of gaming the system. In the developing world,
25 there are a few examples of interventions fact, reducing cheating to a minimum comes that have led to these kinds of behavior.57 For at a cost and is not always desirable since example, an evaluation in Kenya showed that supervision and monitoring costs money incentives improved student scores according too. Therefore, there are trade-offs involved to the tests that would be used to determine the between the cost of cheating and the cost of rewards. However, when they were tested with supervision, some of which will be discussed a different test that measured the same content, later in the report. there was no improvement. This suggests teachers may focus on teaching to the test (or Conclusion drilling) rather than on activities that improve One of the key objectives of using RBF is to students’ content knowledge and skills.58 focus on outcomes down the results chain, Alternatively, teachers may reallocate their such as intermediate and final outcomes, rather time away from subjects that are not linked to than on inputs as has often been the case with incentives (such as art) to subjects that are. traditional financing. However, the available evidence thus far on teacher incentives is It is difficult but possible to prevent gaming mixed. What has emerged is that interventions when designing teacher incentives. Generally, that use teacher incentives can be successful it can be done by including several verifiable given the right conditions, but they can also metrics that aim to reduce perverse behavior. have negligible or negative effects. However, the cost involved in adding more indicators is ending up with a scheme that When designing an intervention, it is is over-designed and too hard to understand important to keep in mind the issues discussed that confuses teachers about what they in this section such as the structure (and should prioritize. 
There is some evidence complexity) of the incentive scheme, the that when several variables are incentivized, behavioral response of all the agents involved, agents choose to invest the most effort in and the possibility of gaming and cheating. achieving those that are easiest to attain. For Perfecting the incentives at just one level example, an intervention in Mexico created will be of little use if the incentives at the several incentive schemes that included both other levels are not aligned too. Therefore, an participation in final exams and student incentive scheme that aligns the incentives of outcomes as metrics (to prevent teachers and all agents involved in the education sector will principals removing low-performing kids probably work better than a scheme that just from taking the test).59 This led to increases in influences teachers. participation (the easier metric to pursue) but Many gaps remain in the literature. The most no improvements in student test scores. relevant are how different incentive structures It is also worth noting that, while ideally (pay-for-percentile or levels, for example) gaming and cheating should be minimized, operate in different contexts and the fiscal they do not necessarily invalidate an impact and long-term effects (both for students intervention.60 It is possible both for some and for teachers) of teacher performance agents to cheat and for there to be a general incentives. improvement in the targeted outcome. In 57 Kremer et al (2010) 58 Kremer et al (2010) 59 Ganimian and Murnane (2014) 60 Murnane and Ganimian (2014) Part I – When Do Incentives Work? 26 Using RBF with Students education. An alternative would be to conduct an information campaign to directly inform and Families families about these benefits. One of the In RBF, incentives usually involve rewarding an REACH grants, which we will discuss later on, individual in exchange for a certain behavior or targets precisely this channel. A final reason action. 
The most popular example of this sort for providing incentives to students and their of mechanism are conditional cash transfers, families has to do with behavioral factors such which have been implemented for over 25 as present bias and discounting. Sometimes years, often quite successfully. In this section, households would like to invest more in education we discuss how to use RBF with students and but end up allocating resources to meet more their families to encourage improved learning urgent needs. In this situation, mechanisms that outcomes. could prevent this, such as conditional incentives, can play an important role. The Theory behind Using RBF with Students and Families Does Using RBF with Students One of the main reasons for providing and Families Improve Outcomes? incentives to students and their families is that The evidence base regarding the provision households have been shown to underinvest of incentives to students and their families in education.61 The literature points to several is quite large, though mainly focused on factors that can lead to this underinvestment. conditional cash transfers (CCTs). The first The first is the fact that households and CCT was PROGRESA, a Mexican program students may want to invest in education but launched in the late 1990s that distributed may not have the money to do so. In these transfers to families in exchange for ensuring cases, a subsidy would relax that constraint.62 that their children attended school and for A second reason is that households may be taking them to clinics for preventative health unwilling to invest in education because they checks. The program was also randomized, are not aware of the returns to education. which has made it relatively easy to evaluate In this case, directly providing a payment since its inception. 
Research has shown that, in return for some action (such as school for children aged between 6 and 14 years old in attendance) would signal that schooling is 2003, PROGRESA raised the number of years important and worthwhile, thus increasing of schooling by 0.25 (about three months) for households’ awareness of the returns to boys and 0.32 for girls (about four months) after 61 Underinvesting, for example, in the sense that households have children who end up getting far fewer years of schooling than what would be optimal based on returns to education in the country in question. 62 Glewwe and Muralidharan (2016) Part I – When Do Incentives Work? 27 5.5 years of exposure to the program.63 Since year period by 0.535.68 Other studies have also then, the number of CCT programs around found that CCTs have continued to increase the world has increased tremendously.64 In enrollment in the year after the cash transfer education, these programs generally provide was received. A cash transfer in Colombia has cash transfers to family members (which increased attendance by 2.9 to 3.2 percentage member is the final recipient varies depending points, while re-enrollment the following on the program’s design) in exchange for some year increased by 1.1 to 5 percentage points. 
behavioral change by the household, usually This program also includes a treatment enrolling their children in school or increasing option where the household receives part of their rate of school attendance.65 the transfer the year after the child has re- enrolled, and this has yielded better results Overall, the literature suggests that CCTs than the standard CCT, but both are positive decrease school dropouts and increase school and significant.69 For Brazil’s Bolsa Família, attendance and completion for children.66 some studies have found that the transfers A recent meta-analysis found that their decreased dropout by 3 percentage points and effect on attendance has been 0.13 SD (-0.12 increased enrollment by the same amount.70 In SD for dropouts) and 0.12 for completion. Honduras, the story is the same where PRAF, a Furthermore, evaluations of programs in conditional cash transfer, increased enrollment Brazil, Honduras, Malawi, Colombia, and rates by 8 percentage points and decreased several other countries have all suggested that the probability of child labor by 3 percentage they have had a positive impact in variables points.71 such as re-enrollment, transition to the next education level, labor outcomes, and even One of the benefits of having a wide range of health status.67 interventions to examine is that there is some evidence of the different factors that can affect For example, in Malawi, a conditional transfer the impact of a CCT. Table 5 shows several.72 increased the number of terms during which Some of these will be discussed in more detail girls were enrolled in school during a two- further below. 63 Behrman et al (2009) cited in Glewwe and Muralidharan (2016) 64 For a review, see Fiszbein and Schady (2009). 65 Glewwe and Muralidharan (2016) 66 Glewwe and Muralidharan (2016) and Snilstveit et al (2015) 67 For further information, see Murnane and Ganimian (2014) or Glewwe and Muralidharan (2016). 
For the impact on health status, a good example is Gertler (2004). For Latin American CCTs, Molina-Millan (2016) is a recent survey. 68 Baird et al (2011) cited in Glewwe and Muralidharan (2016) 69 Barrera-Osorio et al (2011) 70 Glewwe and Kassouf 2012 cited in Glewwe and Muralidharan (2016) 71 Galiani and McEwan (2013) 72 Most of these were identified by Murnane and Ganimian (2014) Part I – When Do Incentives Work? 28 Table 5: Factors that Affect Conditional Cash Transfers FACTOR EFFECT Conditionality Both conditional and unconditional transfers have effects of similar magnitude in general, but some unconditional interventions have smaller effects. Information and labeling Information treatments (for example, providing an attendance report card with the transfer) can have positive effects that complement the transfer itself. Some evidence suggests that labeling cash transfers as school transfers can also improve outcomes, perhaps by increasing the salience or perceived importance of education. Some information treatments on their own can also have a positive effect, but this is usually smaller. What to incentivize Conditioning transfers on attendance raises attendance, retention and graduation. There is some positive evidence on rewarding students to improve learning outcomes (see cell below). Who to incentivize There seem to be no large differences between fathers and mothers (with some exceptions). Giving kids part of the transfer may increase the magnitude of the effect. Some evidence suggests that rewarding students for effort (input) and grades (outcome) can improve learning outcomes, though more evidence is needed. Rewarding goal-setting does not seem to improve learning outcomes. Other factors (from Murnane & Ganimian 2014) Share of students enrolled The lower the initial share of students enrolled, the higher the effect on enrollment. Size of transfer Larger transfers do not always cause larger effects; diminishing returns. 
Timing of transfer Delaying part of the payment and making it conditional on next grade enrollment increases retention. There is limited evidence that more frequent student rewards improve learning outcomes more than one large reward (see “Who to incentivize” section). Age and grade of recipient CCTs more effective in transitions from primary to secondary and from lower secondary to higher secondary. Poverty level The poorer the beneficiaries, the larger the impact. Source: Data from Murnane & Ganimian (2014) with some modifications and additions from the papers cited in the report Part I – When Do Incentives Work? 29 The evidence in terms of attendance is quite In Malawi, however, the unconditional transfer clear, but it seems that these incentives do not led to only 43 percent of the reduction in generally improve student learning outcomes. dropouts as the conditional transfer, which The overall effect in a recent meta-analysis was suggests there are more issues involved than measured as 0.01 SD for a composite language/ just financial constraints.77 This was also seen math score, which is indistinguishable from in other interventions. zero.73 Of course, there are some exceptions. The overall verdict so far is that both In the case of Malawi, test scores improved conditional and unconditional cash transfers under the cash transfer. English test scores can have important positive effects on improved by 0.13 SD, and math scores by 0.16 attendance. However, it seems that conditional SD.74 In Nicaragua, there were significant transfers have a greater effect.78 Additionally, gains in the math (0.17 SD) and language depending on the context, the effects of (0.23 SD) test scores for young men exposed an unconditional cash transfer can vary. to the program.75 However, most other As mentioned previously, in Burkina Faso, recent evaluations have shown no significant marginal children were better off with a changes. 
For example, an evaluation funded by transfer that required parents to take them to REACH in Mozambique found that the gains school. Therefore, it is possible that subgroups in attendance in all treatment groups did not are affected in different ways by conditional translate into improved student learning. This and unconditional transfers, though as of now makes sense intuitively because CCTs increase there is not enough evidence to tell how or why. the attendance of the most vulnerable students, who generally have lower grades. Therefore, Design Issues: The Role of Information by virtue of the changes in composition of the and Labeling student body, it could be expected that the As mentioned above in the teacher incentives average grade would decrease. section, an intervention conveys much Design Issues: The Role of Conditionality more than money or goods. It also provides information to the recipients. For example, it There are other questions that remain to be may signal what is considered important by answered, mostly related to the behavioral the public or it may show that the returns to mechanisms by which CCTs operate. Some education are higher than the family initially evidence suggests that conditionality is not thought. These changes in beliefs can alter necessary, for instance. An evaluation of an a family’s behavior and their subsequent intervention in Burkina Faso that included both investment in education. conditional and unconditional cash transfers found that both led to similar increases in Indeed, several interventions have shown enrollment.76 However, the authors also noted that just providing information on its own that unconditional cash transfers were worse at can have a positive effect on behavior. 
In increasing the enrollment of what they called the REACH grant in Mozambique that was “marginal children,” In other words, children mentioned above, one of the main objectives who were not prioritized by their parents in was to evaluate the impact of information. terms of school attendance, such as girls or The authors of the evaluation found that the younger siblings. information content of a conditional transfer 73 Snilstveit et al (2015) 74 Baird et al (2011) 75 Barham et al (2013) 76 Akresh et al (2013) 77 Baird et al (2011) 78 Baird et al (2014) Part I – When Do Incentives Work? 30 can have a substantial effect on school students in Zanzibar that tested the impact of attendance independently of the transfer itself. student goal-setting on their performance and In the authors’ experiment, the estimated effect whether this impact differed when reinforced of the information treatment (report cards with extrinsic incentives (in other words, only) on attendance was as large as 54 percent non-financial recognition awards for meeting of the child incentive effect and 75 percent of self-set goals). It was found that goal-setting the effect of the parent incentive, which was increased students’ effort, especially for impressive given that it cost a fraction of the those who exhibited low to medium ability transfer. at the baseline and for those to aspired to higher education. 81 However, this increased Finally, labeling a transfer can also have an effort did not translate into improved student effect. An intervention in Morocco tested the performance within the short time period of effects of an unconditional transfer labeled the study (eight months). The team also found as a “school transfer” compared to a similar that extrinsic incentives did not enhance the unlabeled conditional cash transfer.79 The effectiveness of goal-setting for students. 
labeled unconditional transfer decreased dropout by 76%, and the effects were similar Finally, there is the possibility of incentivizing for the conditional transfer. When looking at results further down the results chain, such re-entry rates the following year, the labeled as learning outcomes themselves. There is transfer actually performed better than the a growing literature that looks at whether CCT. paying students for improvements in test scores can work, and the evidence from developing This suggests that providing information contexts is promising. For example, an can have significant effects on outcomes, evaluation of an intervention in Nepal found both when attached to transfers and when that rewarding eighth-grade students for their provided alone. Therefore, in situations where average performance increased test scores by governments or agencies have financial 0.09 SD. 82 A different intervention in Benin constraints, providing information could be a tried out three different incentive schemes: strong next best option for increasing school the first paid individual students for reaching attendance and perhaps improving other a specific performance level on tests, whereas indicators. the other schemes incentivized groups of Design Issues: What to Incentivize four students to perform better on tests. All incentive schemes led to improvements in test In principle, there is no reason why cash scores, ranging from 0.27 to 0.34 SD. 83 transfers cannot incentivize other kinds of behavior. Evidence has shown that increasing Other evaluations have sought to compare school enrollment does not always translate input and output incentives as a way to tease into improved learning, so perhaps it makes out what specific factors cause improvements sense to focus on other kinds of behavior that in student learning. An intervention in India could plausibly improve student outcomes. 
80 looked at two different incentive schemes One of these kinds of behavior is goal setting. operating within a math computer-assisted In Zanzibar (Tanzania), REACH helped to learning platform. The first incentive scheme fund a field experiment among secondary (the input scheme) rewarded students for every 79 Benhassine et al (2015) 80 Pritchett (2001) 81 Islam, A., S. Kwon, E. Masoon, N. Prakash, & S. Sabarwal (2017) 82 Murnane and Ganimian (2014) 83 Blimpo (2014) Part I – When Do Incentives Work? 31 learning module that they completed (including mentioned showed similar results for kids quizzes), while the second scheme rewarded regardless of whether the recipient was their students for their scores in a test administered mother or their father. 85 The authors suggest at the end of the scheme (the output scheme). this may be because father immediately The students were rewarded with points that appropriated the transfer, and indeed the data they could use to purchase real goods in a shows that a majority of mother recipients were virtual store. The output incentive scheme led accompanied when they went to pick up the to a 0.27 SD increase in test scores compared money, whereas most fathers picked it up alone. to the control group, whereas the input scheme What emerges from the literature is that caused a whopping 0.54 SD improvement. outcomes depend crucially on the rules that It is tempting to argue that this input incentive govern intra-household decision-making. And worked better because it rewarded key factors this is prone to variation by context. Giving of the learning production function rather transfers to mothers may make a difference in than leaving it up to students to decide how situations where they have bargaining power in best to improve their outcomes. However, that the household, but not otherwise. An alternate conclusion would be premature. 
The input view is that giving mothers money may scheme rewarded students periodically as increase their bargaining power vis à vis their they mastered each subject, which may have husbands. raised the profile of the incentive scheme and Recently, it has become clear that children reduced present bias. Since the reward offered play a role in the decision-making process by the outcome incentive was concentrated at too. 86 After all, they have preferences that are the end of the intervention, students may have fairly distinct from their parents. A few recent discounted the reward more heavily. If this interventions have explicitly targeted children was the case, then the difference in favor of as the recipients of transfers, sometimes the input scheme would be due to behavioral with interesting results. Some studies have responses rather than to the distinction found that giving money to parents and toys between outcomes and inputs. to children had a similar effect in terms of Design Issues: Who Should Be Incentivized? improving student learning. 87 There is also a question of who should receive The Mozambique grant added to the evidence the incentive. In the case of CCTs for school by directly comparing the provision of similar attendance, this is almost always the student’s incentives to parents and children in terms family, but within the family, each member has of their respective effects on attendance. All different views, beliefs, and preferences. The female students in senior primary grades had differences between spouses is particularly attendance report cards that were given to their relevant. There is strong evidence that gender parents at the end of each week. In the first plays a role in a range of issues related to group, if the girl’s attendance was at 90 percent financial decisions. Women are more likely or more, she received vouchers that could be to pay back microfinance loans and to invest used to purchase certain school materials. 
For household resources in health and education. 84 the girls in the second group, the monetary However, when targeting parents for CCTs, it amount of the voucher was given to the parents, is unclear whether gender has a strong effect. with the option of purchasing the same school An intervention in Morocco we previously materials that were made available to the first 84 Ashraf (2009) 85 Benhassine et al (2015) 86 Dauphin et al (2011) 87 Berry 2015. Part I – When Do Incentives Work? 32 group. In the third group, girls simply received expected to go to school. The role played by the attendance report card with no incentives the use of information in interventions (such as attached. There was also a fourth control the returns on education or the simple labeling group. The evaluation of the program found of a transfer) also needs further elucidation. that the incentive given to the children was Occasionally, it seems that labeling a transfer at least as effective as the incentive given to as an “education transfer” is enough to promote their parents. In fact, the effect was 38 percent attendance even without any conditionality. higher for the children’s incentive group, but the intervention lacked enough statistical power to establish significance. Using RBF with Schools In conclusion, the effects of an incentive can be RBF with schools usually takes the form of different depending on who receives it. Women school grants. For the sake of the report, we recipients (often mothers or grandmothers) take grants to be public funds transferred have been shown to make higher investments to schools to cover operational (and other) in family, health, and education than male expenses, over which schools have some recipients. Providing transfers to the students discretion. This discretion over the allocation themselves is not yet a widespread practice, but of resources is a key feature that sets grants the examples so far have been encouraging. 
apart from regular school financing (in the form of earmarked transfers to pay for teacher Conclusion salaries, for example). There are several types of There is much available evidence on the effects school grants, including those that do not have of CCTs, and overall the results are positive. strings attached (unconditional) and those that Giving incentives to students and their do (conditional). Conditional grants include families in the form of transfers can increase performance-based grants, which are a type of intermediate outcomes like school attendance RBF. We will be focusing on these throughout and enrollment rates but can also increase final this section. It belongs to the kind of financing outcomes like graduation rates. The effects of policy that incentivizes the front-line of the CCTs on student learning are less impressive. education results chain (usually schools, which While some interventions have shown promise, are the direct providers of education services). generally CCTs have not been found to have a positive effect on learning as measured by test The Theory behind Using RBF with Schools scores. It remains unclear whether transfers The main idea behind school grants is that need to be conditional. The effects of providing many schools both know how to and would conditional and unconditional transfers are like to improve student learning but often similar, though they tend to be larger for lack the resources or motivation necessary to conditional programs. do so. For those who lack the resources, an There are still many research gaps to fill increase in financing through grants could regarding context-specific effects, such as the help them to implement the improvement role played by social norms in the households’ plans that they deem appropriate and that response to CCTs. For example, in Burkina would eventually improve learning outcomes. 
Faso, an evaluation found that unconditional The argument in favor of providing these transfers led to the exclusion of marginal grants is that school leaders have more children, whereas conditional cash transfers knowledge about the deficiencies of their did not. One possible explanation for this is school than planners and officials in any that the conditional transfer explicitly broke line ministry, so they will spend the money the norm that marginal children were not more effectively. However, for those school Part I – When Do Incentives Work? 33 leaders who lack motivation, a conditional school completion, -0.02 SD on dropout zero, grant program could induce them to improve though both results were not statistically their management practices by offering their significant.91 Effects on enrolment and teacher schools more resources contingent on the attendance were negligible. school’s performance. Indeed, research has Nonetheless, there is plenty of variation shown that school management practices vary amongst the results.Finally, one evaluation widely and that good management practices looked at the interesting question of whether are associated with better learning outcomes. 88 it makes a difference when schools receive Since in principle, good management practices unannounced grants as opposed to grants can be adopted by lower-performing schools, that are expected.92 Using data from Andhra this could plausibly lead to improved student Pradesh in India and from Zambia, the learning. authors found that unannounced grants led Does Using RBF with Schools to improvements in student learning but Improve Outcomes? announced grants did not. In the case of India, this amounted to improvements of 0.08 and There is only limited evidence on the effects of 0.09 SD in language and mathematics test school grants on improving learning outcomes. scores respectively (for a grant of US$3 per One reason for this is that they are rarely student). 
In Zambia, the government assigned stand-alone policies. Many grant programs block grants that also cost a little under US$3 are the result of the abolition of school fees per student. The evaluation found that test when schools are compensated for this lost scores in language and mathematics both revenue. 89 Other grant programs are bundled increased by 0.10 SD. The authors suggest that together with wider school-based management the reason for this was that households offset interventions, which often include training the anticipated grants by reducing their own for principals and other staff, or the creation spending on education. of school committees. This makes it hard to disentangle the effect of each component. For example, a one-time school grant combined with an intervention to improve According to the available evidence, the school management in the Gambia led to a 21 overall effect of grants on learning outcomes percent drop in student absenteeism and a 23 is mixed.90 A recent meta-analysis found percent drop in teacher absenteeism but no that the pooled effect of these grants on a improvement in learning outcomes.93 However, composite language/math score was -0.01 the group of schools that only received the (and not statistically significant), with a range one-time grant saw no improvements in any from -0.34 to 0.15. For other outcomes, such as category. A grant and training program for enrollment and participation rates, the results parent associations in Mexico led to reductions are somewhat more positive. For instance, a in grade repetition and grade failure of 4 to 5 recent review found an effect of 0.05 SD in percentage points.94 88 See for example Bloom et al 2015. 89 Al-Samarrai et al (2017) 90 Note that this is based on the pooling of school-based management interventions. An overwhelming majority of these include school grants, but they often also include capacity building. 
91 Snilstveit et al (2015)
92 Das et al (2013)
93 Blimpo et al (2015)
94 Gertler et al (2012)

Table 6: Factors that Can Affect Performance-Based School Grants

FACTOR: EFFECT
Competitive distribution of resources: It is unclear whether using a competitive allocation system for school grants makes a difference, since research is lacking. An intervention in Senegal led to improved student outcomes, but results in Indonesia were mixed.
Equity issues: Equity considerations will arise when including a competitive system for allocating resources. High-performing schools may be more likely to receive the grant, thus increasing inequality. This can be mitigated by creating different competitions based on the socioeconomic backgrounds of the district, for example.
Household response: There is some evidence that households may reduce their own educational spending if they anticipate an increase in school grants, which could limit the effectiveness of the intervention. Some alternatives could include financing inputs that are harder to substitute for, or providing larger grants.
Long-term effects: No evidence on this so far, but other interventions in health suggest that short-term grants can help permanently overcome organizational constraints (and thus improve outcomes).
Source: Authors’ summary

Design Issues: Competitive Distribution of Resources

A key design issue is whether to allocate resources competitively or not. For instance, some grant programs may distribute extra funding to schools that meet a series of requirements, such as improvements in student learning or the submission of an improvement plan. The evidence so far is mixed on whether this is effective, though the number of studies is very limited.

In Senegal, a competitive grant program had positive effects on student learning, especially for schools that spent the funds on human resources rather than school materials.95 Schools could apply for funding for specific projects of their choosing, and the amount of the grant was a sizable US$3,190, or around 7 percent of each school’s total annual budget (including teacher salaries). The Ministry of Education’s guidelines specified that schools’ grant applications had to be focused on pedagogical improvements and to be prepared by a committee of local officials, parents, and teachers. (This is reminiscent of what was found in Indonesia (see below) where increasing links between schools and local officials led to improved student learning.) The evaluation of the Senegal program found that the grants increased student test scores by 0.09 SD after two years.

The REACH intervention in Indonesia used an RBF-based reform of the entire system to try to evaluate the effects of a new performance-based school grant. The intervention created a bonus grant for the top 25 percent best-performing schools in the system, with schools competing against other schools in their district, which reduced equity concerns since their socioeconomic backgrounds were similar.96 The bonus grant was equal to 20 percent of the fixed grant, which is a sizable amount. The fixed grant was US$4.5 per student for primary schools and US$8.2 for junior secondary schools.

The results, however, were mixed. The test scores of junior schools improved, but those of primary schools fell (though this effect was temporary). These changes in test scores occurred before the new system itself was implemented, which suggests that the improvements happened as a result of the incentives rather than of the purchase of new materials or changes in practices that were paid for with the additional resources. In other words, it seems that schools worked to improve their results in order to become eligible for the new performance-based grant. However, after the program was implemented, the new funding had little effect on student outcomes.

Overall, the jury is still out on whether adding conditionality (for example, restricting eligibility to top performing schools or distributing funds based on a series of performance indicators) to a grant increases its effectiveness. It seems that both context and design can be critical for the effectiveness of school grants. This approach worked in Senegal (perhaps because the grants were spent on human capital rather than materials), but it had mixed effects in Indonesia. Perhaps the effectiveness is driven by other design variables.

Design Issues: Equity

An additional issue to consider regarding performance-based grants is equity. If grants are conditional on performance, the risk is that those schools that are already doing quite well will receive even more money. Higher-performing schools tend to have more affluent students since higher income is heavily correlated with good educational outcomes. This is an issue worth mitigating because otherwise financing can become regressive. In developed countries, a classic example is the No Child Left Behind reforms in the United States.97 Since accountability was based on proficiency scores, it disproportionately penalized schools in low-income areas.

In the Indonesia grant program financed by REACH, measures were included to mitigate these equity concerns. For example, schools competed against other schools in their own districts rather than nationally. Also, one of the metrics used to calculate which were the top performing schools was absolute change in performance, which in effect benefited poorer performing schools. Since schools from the same district generally have similar socioeconomic profiles, this reduced the amount of inequity in the final allocation. However, it was not enough to eliminate inequity, and higher performing schools were still more likely on average to receive the performance grant. They were also more likely to have improved scores than the schools at the bottom of the distribution.

Ideally, a performance-based grant intervention should not only incentivize the entire distribution of schools to improve their learning outcomes but also close the gap between the best and worst performing schools. However, there is no guarantee that this will happen. In the Indonesia REACH grant, there were heterogeneous effects. Among primary schools, equity increased because the worst performing schools improved by more than the better performing schools. However, among junior schools the opposite was the case. Another way to mitigate inequality between schools would be to modify the allocation formula. For example, the formula could specify that only schools in the bottom socioeconomic quartile are eligible to participate. Alternatively, the weighting of each component in the allocation formula could be adapted to benefit disadvantaged schools.98

Design Issues: Household response

Another design issue mentioned by the literature is how households react to the creation of a school grant program. One evaluation looked at whether it makes a difference when schools receive unannounced grants as opposed to grants that are expected, and found that it does.99 When households expect the grant, they offset it by reducing their own spending on education. If the grant is unexpected, they do not reduce their spending. Using data from Andhra Pradesh in India and from Zambia, the authors found that unannounced grants led to improvements in student learning but announced grants did not. In the case of India, this amounted to improvements of 0.08 and 0.09 SD in language and mathematics test scores respectively (for a grant of US$3 per student). In Zambia, the government assigned block grants that also cost a little under US$3 per student. The evaluation found that test scores in language and mathematics both increased by 0.10 SD.

Of course, this does not mean policymakers should design unanticipated grants. That would be impossible. However, the authors suggest two alternatives. The first is to provide a larger grant (since the grants provided under both programs were small) that makes it impossible for households to offset. The second is to focus on providing resources for inputs that are harder to substitute.100 The authors mention extra teachers or infrastructure, but funds for capacity building or other combined interventions would also qualify. There is a growing amount of evidence (which we will review in the following section) on the benefits of targeting several constraints at the same time.

Design Issues: Long-term effects

As mentioned, performance-based school grants are often designed to incentivize improvements in management and allocation of school inputs. Could they have a long-term effect on outcomes?

Unfortunately, there is not much available research in education, but in healthcare RBF there is some evidence that short-term transfers to frontline agencies can lead to long-term changes in behavior. For example, the temporary subsidies to health clinics under Plan Nacer in Argentina led to permanent increases in the provision of prenatal care and healthcare packages.101 The implication is that the clinics did not provide these services before the program not because of the early fixed costs of adopting them but rather because of their perception that they would yield low returns. In other words, the incentives helped to overcome organizational inertia to reach a new and better equilibrium in healthcare delivery.

In the Argentine case, subsidies to clinics helped to overcome an organizational coordination problem. In other RBF areas such as teacher incentives or cash transfers, constraints to improve outcomes are not organizational, so the effect could be more limited. But in schools, organizational constraints can be a limiting factor. In that case grants could, perhaps by helping the school move to a better management equilibrium, lead to long-term effects.

Conclusion

Unfortunately, the evidence base on the effectiveness of performance grants is still quite limited. As the number of interventions increases, it might be possible to tease out more factors that determine whether a program is successful or not. For now, it seems that in some cases they can work (as in Senegal where the school performance grant led to improved outcomes), but in others there have been more mixed results (like the REACH-funded evaluation in Indonesia, which saw improvements in junior secondary school learning outcomes but no improvements in primary schools). Often additional money is a necessary but not sufficient condition for schools to improve learning outcomes, especially if households adjust their own spending in response. A promising research agenda is the combination of school grants and other RBF interventions, which will be discussed in the next section.

Combining RBF Interventions to Overcome Constraints

Targeting different levels

One of the lessons that we learned from the evidence base is that interventions tend to work better when they are combined. An emerging literature seems to suggest that targeting RBF interventions at different levels may have strong synergies.

For example, a pay-for-performance scheme in rural Uganda raised attendance rates and improved student learning outcomes but only when complementary inputs were also provided, in this case textbooks.102 Student attendance rose by around 0.56 to 0.60 SD two years after its creation, but gains were driven by schools that had access to textbooks. The evaluation shows that most teachers in the intervention increased their levels of effort, but that effort was only transformed into improved learning when the textbooks were available. In these schools, students had higher scores (0.11 SD) on the exam questions that were related to the topics and content covered in the textbooks. Similarly, an intervention in Mexico that targeted several RBF levels found that incentives that only affected teachers were not as effective as incentives that covered teachers, principals, and students.103 This shows the importance of making sure that the design of the intervention aligns the incentives of all agents involved. Finally, a recent intervention in Tanzania tested a program that provided a cash grant for schools, a pay-for-performance scheme for teachers, or both together.104 The cash grant and pay-for-performance schemes had no significant effect separately but combined they increased test scores by an average of 0.12 SD.

Combining RBF and institutional capacity building

RBF interventions can also be combined with capacity building. Some evidence suggests this can lead to good results. For instance, stand-alone grants seem to be less effective than grants combined with other interventions.105 This may be the case because schools are not aware of the best ways to improve their learning outcomes. There is some evidence from the management literature that firms and bureaucracies do not adopt good management practices automatically but can benefit from them when they are exposed to them.106 Therefore, providing financial resources is not enough to overcome resistance to adopting new practices. Often, interventions that provide complementary inputs (such as training, follow-up visits, or other capacity building) are required.

The evaluation of the Indonesia grant found some of the same issues. Schools were found to invest their grant money in hiring fewer contract teachers and spending more on inputs that are not correlated with improved outcomes, such as school infrastructure. In Tanzania and Kenya, many principals had little knowledge of the specifics of the grants that their schools received, with 60 percent of Tanzanian principals not knowing how much money they were eligible to receive and 35 percent of their Kenyan counterparts not knowing the size of the grant for non-teaching expenses.107 This suggests that there is space to include guidance for principals and/or other capacity-building in such programs.108

For example, another intervention in Indonesia evaluated different combinations of grants, training, community participation, and elections to school boards.109 While the grants on their own and the grants plus training had no effect, the grants plus community links (in this case, involving the village council in the planning meetings of the school committee) led to improvements in learning outcomes. It increased test scores in language by 0.17 SD.110

This is illustrative of the importance of complementarities between types of interventions and the sensitivity of interventions to small changes. A recent study looked at two versions of a literacy program in Northern Uganda.111 One was the original program, and the second kept most of the features of the original program but with small changes that reduced the costs by 60 percent to make it scalable (such as removing some of the expensive materials). Whereas the original program had a very sizeable impact (0.64 SD in reading and 0.45 SD in writing, some of the largest reported in the literature), the low-cost program had no significant impact on reading and a large negative impact on writing (-0.3 SD). Further analysis suggested that these differences were due to large complementarities between inputs in the original program, such as teacher quality and materials.

The lesson to be drawn with regard to the design of RBF interventions is that careful thought must be given to every step in the process from financing to results. How are inputs going to interact with each other? Will the funding provided be enough for schools/teachers/students and families to accomplish what is required to improve learning outcomes? And if the answer is no, what other activities can be included that could enhance the effects of the RBF intervention?

95 Carneiro et al (2016)
96 Al-Samarrai et al (2017)
97 Kim and Sunderman (2005)
98 Al-Samarrai et al (2017)
99 Das et al (2013)
100 Das et al (2013)
101 Celhay et al (2015)
102 Gilligan et al (2018)
103 Behrman et al (2015) cited in Murnane and Ganimian (2014)
104 Mbiti et al (2018)
105 Das et al (2013)
106 Bloom et al (2013)
107 Mbiti (2016)
108 Al-Samarrai et al (2017)
109 Pradhan et al (2014)
110 A fourth treatment that included grants plus community links plus elections to the school board improved test scores even more: 0.23 SD for language.
111 Kerwin and Thornton (2018)

Summary

While the research on RBF and teachers, RBF and students and families, and RBF and schools is not comprehensive, there is substantial evidence to suggest the following conclusions:

1. Teacher incentives can but do not always improve teacher attendance and student learning. The design of the incentive scheme and the context matter.
The effects are larger and more positive in developing country contexts.
2. Student and family incentives (such as CCTs) can reduce school dropout and increase school attendance, though the evidence for their effects on student learning is more mixed. Conditional transfers to students tied to their own learning are a promising area of future research.
3. The evidence on performance-based grants is still quite limited. For now, it seems that in some cases they can work, especially when grants are combined with other interventions such as capacity building (for example, to principals and school committees) or when money is spent on inputs that affect learning outcomes.
4. There is growing evidence that combining different RBF interventions within the same program can generate better results than using any one intervention alone.

Part II – RBF and Governments: Making RBF More Effective

When it comes to RBF and governments, few standardized studies are available, which makes it difficult to make definitive or comprehensive statements about how RBF can be more effective in development projects. However, as the number of RBF projects grows in the education sector, there is more operational experience from which to learn. This discussion is structured around the project cycle (see Figure 2 below) because the lessons differ for each of the stages of the cycle. The information in this part of the report is taken directly from our qualitative survey data that reflect the experiences of development agency staff in the field, along with examples from project documentation and academic research. In this section, we aim to highlight the practical experiences of using RBF with governments, and the information that we present mostly relates to results-based financing agreements between a donor and a country client.
Figure 2: A Typical Project Cycle (stage labels: Planning; Upstream dialogue; Implementation; Design)

Figure 3: How RBF and Governments Work

This figure shows how the relationship between RBF and a national government typically functions. The funder is the donor, and the partner is the country client. The donor and client must mutually agree on the expected results of the intervention, the indicators that will be used to measure those results, and what values the indicators must reach. The client then must work to achieve the indicators. Once these results have been verified, the donor then disburses funds to the client.
Source: DFID (2014)

There are very few education projects at the national level that have closed and been independently evaluated, and thus little rigorous evidence exists of the effectiveness of RBF relative to other development financing. For example, there is some suggestive evidence that RBF may be more effective than other financing modalities in health, but more research is needed.112 Looking at eight RBF projects across three sectors, a recent paper claims that there is no evidence that RBF projects lead to fundamentally more innovation or autonomy,113 though those factors may not be the primary benefits of RBF. In education, the evidence is even more limited, and researchers have noted a lack of documentation of practical, real-life experiences with using RBF in comparison with other social sectors such as health.114

In general, it is difficult to ascertain the direct impact of RBF in comparison to other financing given that it is rarely used in isolation.115 Another challenge is that there is usually no counterfactual situation with which to compare it. There are no known experiments where researchers have compared a situation with RBF to a situation without RBF as it would be difficult to create conditions under which the two situations would be comparable. It is also just as challenging to know whether countries would have funded similar activities to achieve results using less money. Furthermore, a typical project that uses RBF may target a broad set of indicators, some of which are tied to financing and some of which are not, and this can make it difficult to ascertain the overall effects of RBF within the larger project.116 Despite these challenges, some lessons and best practices have emerged related to how and when RBF can work. In order to promote the use of diverse and flexible approaches to solving complex education development problems, donors and clients require a range of financing options, with RBF being one potential choice.

Choosing RBF: Commitment, Cautions, Cost, Context

Table 7 below shows the four considerations that must be borne in mind when selecting which type of RBF to use in any given project: commitment, cautions (risks), cost, and context.

Table 7: The Four Cs

CONSIDERATIONS TO BEAR IN MIND WHEN CHOOSING RBF
Commitment
Cautions
Cost
Context (country systems; capacity; conflict, fragility and violence)

Commitment

At the project planning stage, the most important thing to consider when choosing RBF is whether there is mutual agreement between both parties to use it. While this applies to other financing modalities as well, it is often discounted. For RBF in particular, it is an important signal of political commitment, especially given that there has been criticism of RBF as a new form of conditionality and as a tool that donors use to ensure that the recipient’s incentives are aligned with theirs.117

Our survey responses also point to the importance of political commitment. One respondent wrote, “I think we can work around the financing, weak systems, weak capacity. I mean, it’s not ideal. But it can be accounted for. We can’t, however, work around the lack of political will.” In fact, strong political commitment was the most commonly mentioned factor needed for RBF to be successful (see Figure 4).

Figure 4: What Conditions Are Necessary for RBF to be Successful?

While political commitment and ownership by policymakers in the recipient country are critical to the success of all development projects, it is also important to acknowledge the inherent power imbalance that exists between donors and recipients. Oftentimes, countries are not in a position to refuse funding, and an added complexity with results-based financing is that country governments may not fully understand how the modality works (this will be further discussed in a later section).

Ultimately, both parties have specific interests. These can range from the seemingly innocuous to the more problematic. One argument is that donors use RBF to try to make aid more efficient and to get the results they want, without ensuring a mutual agreement with the recipient government on which results will be linked to financing, without putting any effort into explaining how RBF works, or providing the recipient country with the support needed to actually achieve those results. On the recipient’s side, the fear of not receiving funds may lead policymakers to choose easy-to-achieve targets in order to ensure that they receive the payments.

Regardless, RBF has the best chance at success when both parties are equally committed to it and understand the risks involved. Here are two examples of when the interests of the parties are and are not aligned.

Recipient Feels “Pressured” into Accepting RBF

Based on survey feedback, there has been some indication that RBF is being heavily championed by a number of development agencies. Roughly 25 percent of respondents indicated that their agency’s position on RBF was positive, with some even saying it was “hyper-positive” or “positive, perhaps excessively so.” One respondent gave an example of a middle-income country in the Middle East and North Africa region where RBF was chosen as the financing modality before the project’s objectives and activities were fully identified. The project manager from the Ministry of Education indicated that, while the government wanted to achieve results, it would have been better if they could have introduced some of the necessary reforms to strengthen their own country systems prior to implementing RBF. This project is still ongoing, so it is unclear whether RBF will be a success or not, though the government’s commitment is now there. There may also be other country context issues irrespective of RBF that will generate political instability, which may alter the country’s ability to achieve some of the indicators.

However, while there is some qualitative evidence that development partners are choosing RBF as a financing instrument without much in-depth consultation with country governments, our survey results show that this does not seem to be the overarching pattern as is illustrated in the next example.

Recipient is Keen to Pursue RBF and Has the Necessary Political Will

In 2008 a new government in Pakistan came into power with a strong commitment to the World Bank-financed Sindh Education Program and with the desire for the World Bank’s assistance in further refining the program’s focus on results. Given that Pakistan was in the process of decentralizing responsibilities in order to improve public service delivery, the government also wanted to institutionalize results-based budgeting. For this to work, they needed to introduce RBF at different levels and agreed to a series of disbursement-linked indicators that were meant to reinforce the priority areas of the program. RBF was effective in this instance because of the sector-wide approach that was taken, which required “strong political commitment and ownership (which) is critical for… addressing governance constraints to effective service delivery.”118 In addition, the use of RBF complemented the support being given by other development partners. At that time, the main donor was the European Commission, which has some elements of RBF in its budget support model. The success of the first Sindh Education project led to a second iteration, which also used RBF.

The results-based design of the original project “likely helped in orienting and focusing the Sindh government’s efforts on agreed program implementation progress and performance targets. In particular, the disbursement-linked indicators (DLIs) likely helped to promote and protect the continuity of politically difficult, governance-oriented reforms undertaken by the Sindh government.”119

One of the most critical aspects of RBF is the need to communicate to the recipient upfront how RBF works, whether the recipient is a national agency, a sub-national agency, and/or a direct service provider. Any group of individuals who will be affected by RBF should be made very aware of how the RBF scheme will work. This lesson proved doubly true in two DFID-sponsored RBF schemes, the Girls Education Challenge and a pilot RBF intervention in Ethiopia where RBF was found not to have any discernible effect.120 The evaluation team noted that the project had not been effective in disseminating information about RBF to the regions where, even after two years, few officials, including head teachers, had heard of the pilot.121

Many of the existing studies of RBF examine the relationship between the donor and the recipient (the government of the recipient country, usually the Ministry of Finance), but in reality, governments who are genuinely interested in results can also use it internally at all sub-national levels.122 For example, there is research underway in Morocco, Sudan, and the Dominican Republic that is looking at how performance contracts between national and sub-national government levels might improve education quality.

Costs and Benefits (Advantages and Disadvantages of RBF Over Traditional Aid)

The research that outlines the potential advantages of RBF generally makes two arguments in its favor: (i) it demonstrates the impact of aid money and (ii) it is more effective than other forms of aid.123 One study in particular has argued that RBF can help to maximize the alignment of interests between donors and client countries, among other things, which can make aid more efficient.124

While these theories may be true in some form, over half of respondents (57 percent) to our survey indicated that RBF helped the recipients to “achieve results that were previously not achieved through other financing modalities.” This sentiment was confirmed by another survey question that asked respondents what the biggest benefit of RBF was over the financing of inputs. The overwhelming majority, 96 percent, indicated that RBF produced a “sharper focus on results.” The second biggest benefit that they identified (64 percent) was that RBF “relies on and/or strengthens country systems.” However, while the survey takers were able to see the promise of RBF, they also acknowledged that it often is not used very effectively (43 percent) and requires more implementation support to ensure good results. The challenges related to implementing RBF are discussed in more detail in a later section.

112 Grittner (2013)
113 Clist (2018)
114 R4D (2016)
115 UNESCO (2018)
116 Grittner (2013)
117 Clist (2016)
118 World Bank (2009)
119 World Bank (2013)
120 Coffey (2016) and Cambridge Education (2015)
121 Cambridge Education (2015) and Coffey (2016)
122 Birdsall and Savedoff (2012) and Clist and Dercon (2014)
123 Clist (2018)
124 Clist and Verschoor (2014)

Figure 5: What Are the Biggest Benefits of RBF over Financing Inputs to Achieve Results?
Although RBF has become a mainstream financing modality and can help donors to encourage a greater focus on results, some interviewees felt that clients themselves may also be keen to do the same. One survey taker wrote, “I first was on the implementation side, based within the Ministry of Education in Ghana, and now I’m on the donor side (World Bank) for the same project, which is an RBF, secondary education project. I found that, due to the emphasis on results, my government colleagues took much more ownership over the project than some of the other donor-funded projects in Ghana.”

While there are strong proponents of RBF, there are also detractors, those who believe that identifying good indicators is difficult, that it undermines country ownership (“new conditionality”) and that it does not produce good value for money.125

Cost-effectiveness of RBF

Although RBF can be used to get both donors and recipients to focus more carefully on results, do these results come at a higher price than using other forms of aid? To date, there is no consensus on whether RBF has a clear cost advantage over traditional financing.126 In this report, cost-effectiveness is defined as the ability of RBF to produce results more efficiently and effectively than traditional financing. Measuring the cost-effectiveness of RBF is still problematic because of the limited existing evidence base. While there have been a number of RBF interventions in education, they are rarely implemented alongside other modalities, making it hard to compare RBF’s relative value for money. Also, there are no comprehensive cost-effectiveness frameworks for RBF in education that would enable these comparisons. In other sectors, such as health, toolkits have been developed that provide guidance on how to evaluate the cost-effectiveness of RBF in comparison with other aid modalities.127

125 UNESCO (2018)
126 Paul et al (2018)
127 See for example Shepard et al (2015)
The Center for Global Development has argued that choosing a lending instrument like PforR does not create any financing additionality because countries can still avail themselves of other traditional lending instruments.128 The IDB looked at whether RBF (donor to country client) was more effective than traditional aid in the health sector in El Salvador and found that RBF led to higher growth in many of the indicators measured. In municipalities receiving RBF, preventive visits increased by 42 percent compared to a 20.9 percent increase in traditional aid visits. There was a more modest difference in increases in outpatient visits (6.7 percent in RBF villages versus 4.2 percent in traditional aid villages).129 However, these improvements seemed to be due to an expansion of infrastructure and increased medical staff rather than to divestment from other areas.

In education, the idea that RBF should achieve greater “value for money” comes across particularly strongly in the evaluations done of some of DFID’s early investments using RBF. An evaluation of the use of RBF in the Rwandan education sector found that increases in completion rates (for primary, lower secondary, and upper secondary education) during the implementation of the RBF program could not be attributed to the program itself and instead were the result of other factors.130 However, the evaluation did show that investing in the Rwandan education system was sound and good value for money regardless of which aid modality was used because the benefits of increasing access, retention and completion clearly outweigh the costs.131 It also suggested that the value for money of the RBF program would have been greater than that of other financing modalities. Therefore, it seems that implementing RBF would not have come at a higher price than implementing other financing modalities, at least theoretically. The case of Ethiopia is somewhat similar. An impact evaluation found that the program had negligible effects on the outcome (in this case the number of students taking and passing an exam).132 Therefore, it is not possible to say whether the program was good value for money or not. Nonetheless, the evaluation found it to be a relatively low-cost alternative since it had low transaction costs and did not disburse funds if there were no results.

Despite the possibility that RBF may increase the cost-effectiveness of education financing, only 17.8 percent of respondents from the agencies in our survey (including DFID) indicated that cost-effectiveness was one of the biggest benefits of RBF. This may be because those responding to the survey were not directly responsible for discussing the modality with the country government or that the loan amount had already been determined or agreed upon and RBF had simply been chosen as the way for the funds to flow. Alternatively, many of those surveyed may not have taken cost-effectiveness into consideration as a reason to choose or not choose RBF. It should be noted that many of the respondents had not seen a project through to completion so may not have assessed its final costs.

There is some evidence outside of the education sector that the initial costs of acclimating a country to an RBF approach may be higher than those needed for traditional financing. In Ethiopia, the first World Bank-funded PforR operation was incredibly difficult and costly to prepare and was not advantageous to the Bank on cost grounds. The evaluation of the project recommended that the World Bank invest sufficient resources upfront in future to ensure that teams have the capacity to explain RBF (in this instance, the PforR instrument) well enough so that the country can make an informed decision about the financing modality.133 A more recent study has found evidence that RBF is more expensive and not necessarily more efficient than traditional financing, mostly due to the costs of supervision and independent verification. For example, in a World Bank-supported health project in Benin, for each US$1 paid to providers, half (US$0.50) was spent on verification.134

128 Gelb et al (2016)
129 Bernal et al (2018)
130 Upper Quartile (2014 and 2015) and Cambridge Education (2015)
131 Upper Quartile (2015)
132 Cambridge Education (2015)

Cautions

Another reason why countries (and organizations) are skittish about using results-based financing is that they have to assume more risk.135 If they fail to achieve the required results, they will not receive any money.

recipient does not achieve the required results. Other write-in responses alluded to similar actions, including “watering down” indicators or extending the project. The failure to achieve targets will be further discussed in the implementation section.

The liquidity constraints faced by country governments and other incentivized actors can be mitigated through such flexible measures as providing advances to cover initial expenses and/or staggering the indicators, in other words, setting achievable targets at first and building up to more difficult ones as the project progresses. This was done in the Sindh education project mentioned earlier and will be further explored later in the section on indicators. Recent practitioner experience reflected in informal World Bank guidance has shown that it can be helpful to use “zero
Also, DLIs” — those that are easier to achieve and some countries may not have enough upfront thus to earn funds — as a way to speed up financing available to cover the costs of implementation as well as to “boost morale and achieving results. These are legitimate fears. momentum.”137 An evaluation of the Girls Education Challenge (GEC), a DFID-funded initiative that seeks to In addition to non-payment and liquidity improve learning amongst the poorest girls, concerns, there is also political risk involved found that many organizations could not in using RBF, particularly the risk that RBF bear those risks.136 Although the GEC is not can be more difficult to control when there technically an RBF and government scheme, it are many political actors involved. This can is used here for illustrative purposes because complicate accountability and coordination, it still involves donor funding, except that the for example, if the agency receiving the money donor funds are directly channeled to service is not the agency in charge of achieving the providers. results. In some cases, the agency in charge of achieving the results (usually a line ministry) According to the results of our survey, over half might wonder why they should prioritize (54.3 percent) of respondents indicated that those results since they are not the ultimate when the expected results were not met in the recipients of the funds (usually the Ministry of projects that they managed, they did indeed Finance). One respondent to our survey wrote withhold funds. 
This is a politically difficult about a social protection project in Nepal choice to make, which may also explain why where there was some question over whether the remaining half of respondents indicated RBF should be used as the team leader had that they scale back the indicators when the doubts that the Ministry of Finance (the funds 133 IEG (2016) 134 Paul et al (2018) 135 UNESCO (2018) and Results for Development (2016) 136 Bond (2017) 137 Sabarwal et al (2016) Part II – RBF and Governments 49 recipient) would be capable of holding the line recommend using RBF as the only financing ministry (Ministry of Education) accountable modality for the GEC. for achieving the results. In the end, RBF was If a country does not have the ability to still chosen as the financing modality because mitigate risk or if the project guidelines other government actors were also involved, and procedures do not allow for sufficient but the team was cognizant that “successful mitigation, then it may be necessary to rethink implementation depended on how pro-actively the use of RBF in that particular country. [they] engaged these other institutional actors beyond the immediate counterpart agency. The Context jury is still out as to whether this will work as RBF, like all financing modalities, needs to intended.” be designed to fit the specific context within With these examples in mind, both types of the recipient country, and there is no “one risks should be assessed in the planning stage size fits all” design. However, some contexts of a project. While traditional input-based may be more conducive to the successful financing can involve similar risks to RBF, the implementation of RBF than others. The key stakes are often higher for recipients because of criteria that are likely to lead to its successful the threat that they will not receive the funds if implementation are: (i) the pre-existing they do not achieve their targets. 
involvement of the development agency in the sector or in a government program and Another risk that underscores the importance (ii) the pre-existence of strong financial of explaining the details of RBF upfront is management systems and EMIS in the country when countries or organizations unwillingly in question.139 Other important factors that or unwittingly agree to RBF without fully may influence whether RBF is the appropriate understanding what they are signing up modality are the capacity of the state and for. A good example of this is the GEC. In a whether the country is affected by conflict, process evaluation, the evaluators indicated fragility, and violence. that, although the project team mentioned RBF upfront in all of its guidance to service Country Systems providers, their understanding of it evolved According to our survey data, 67.4 percent of over time, and DFID was unable to issue respondents believed that the second most specific guidance about it, due to the lack of necessary condition for RBF to work was the consistent definitions within the organization. existence of strong country systems, the first This meant that, while many applicants signed (as has already been noted) being political will on to the concept, they did not really know (80.4 percent). Two of the key country systems what it would entail. This caused confusion needed for RBF to be implemented successfully and frustration among applicants as well as are financial management (FM) systems and delays in the implementation of the GEC.138 an education management information system The evaluators subsequently found that it (EMIS). 
Financial management systems would have been more useful to develop are necessary to ensure that funds are well guidance on RBF and begin disseminating it as managed, while a functioning EMIS is needed early as possible so that applicants could make for all monitoring and evaluation efforts, which informed decisions about whether they wanted are the key feature of RBF operations. to apply to be a part of the GEC given the RBF An example of an intervention that benefited requirements. Overall, the evaluation did not from all of these prerequisites being met is the 138 Coffey (2016) 139 ADB (2017) Part II – RBF and Governments 50 Asian Development Bank’s Additional Skills Capacity Acquisition Program in Kerala, India. This RBF can work in a variety of contexts, but the program is a post-basic education results-based capacity of the implementing country is key. financing operation that was designed as such Capacity, in this report, is defined as the ability because all of the necessary country systems of a country government to implement and were in place. The RBF approach was chosen so monitor RBF activities or at least to have the that the implementing agency would have the desire to build its capacity in order to be able to freedom to make any required changes in real- do those two things. time, subject to the proviso that the ultimate It is important to dispel the implicit results were met.140 assumption that low capacity means lower The project used the government’s financial income. This is not always the case, given that management systems for its budgeting, there are some low-income countries that have accounting, reporting, monitoring, and the capacity to implement and monitor RBF, auditing arrangements. 
According to the such as Tanzania and Rwanda.142 National Institute of Public Finance and Policy, For example, Rwanda is often listed as India scored very highly on the public financial a forerunner in the use of results-based management dimension of “comprehensiveness financing, especially in the health sector. and transparency.”141 Their experience with the modality began In addition, the country’s existing management in the early 2000s when donor support for information system (MIS) was able to provide reconstruction after the 1994 genocide waned program managers with critical information for and health facilities were once again reliant program planning such as gender, inclusiveness on user fees to keep going. In addition, health (specifically of socially and economically workers were poorly compensated and thus not marginalized or differently abled students), keen to work in the public sector. To refocus geographical spread of students, and sector efforts on increasing use and coverage of health training. The MIS was already set up to track services and improving their quality, results- output and outcome indicators, including the based financing activities were initiated, disbursement-linked indicators. The MIS also which were designed to increase the use of facilitated evidence-based planning and could services by incentivizing health providers. flag potential problems with the program early on. The initial schemes were very successful and were eventually scaled up and tested in Although all of the requisite country systems other provinces. 
For example, some research were functioning and available to be used in has suggested that it helped to increase the Kerala project, the project documentation measles immunization by 11 percentage also noted, “While country systems (especially points (compared to only a 1 percentage point data systems) are critical, RBF can be designed increase in the non-RBF areas).143 The quality to strengthen those systems and… can provide of service delivery also improved, with RBF advances and strengthen technical capacity.” areas scoring 73 percent in a composite quality This emphasizes the importance of using score versus 47 percent for non-RBF areas. monitoring and evaluation as a feedback Ultimately, despite Rwanda’s history of conflict mechanism, which will be further detailed in and violence, the government has remained the implementation section. 140 ADB (2014) 141 ADB (2014) 142 See examples in Andrews et al (2017). 143 Rusa et al (2009) Part II – RBF and Governments 51 committed to using results-based financing.144 subnational levels.150 This type of government As previously discussed, over the past decade it commitment resembles what was mentioned has piloted initiatives in the education sector, earlier in the section on Choosing RBF. though with less impressive results.145 Another dimension of the capacity question is According to the evaluations of the Rwandan that capacity is inextricably linked to country RBF activities, they succeeded for several context and project design. One respondent reasons. First, they were built on three existing to our survey wrote, “I’ve worked on [RBF donor-funded pilots, which facilitated the projects] in relatively high capacity contexts (a scaling up process. Lessons were learned reformist state in Brazil) and very low capacity from the pilots on key issues such as the contexts (Nepal). 
The country I am currently need for robust information systems or assigned to is an extremely fragile state with the best way to manage and distribute the ongoing conflict, a fragmented state, and funds, and these were put into practice in the endemic corruption. I wouldn’t categorically scaled-up nationwide program. This echoes deny the applicability of RBF based on broad the conclusions reached in other studies categorization of ‘contexts’ but I do believe that of RBF programs about the importance in each place it really is important to think of experimentation and context in their through why an incentive-based approach is implementation.146 Second, they benefited superior to traditional input financing, what from broad political leadership and political are the institutional (or fiscal) pre-conditions will at the highest level in Rwanda, which for this to work, and whether enough of such evidence suggests is key for any reforms pre-conditions are in place.” to be successful.147 Third, the Rwandan These examples demonstrate that it is useful government had already demonstrated that it for the designers of RBF projects to identify had the capacity to responsibly manage funds upfront: (i) the kind of capacity that will be and monitor indicators, which led them to required to implement the project and (ii) promote institutional capacity, particularly for whether this capacity (both political and service provider contract management, into technical) exists in the intervention. 
The the public system rather than rely on donor type of capacity required varies and can accountability.148 The Rwandan case shows be straightforward, as in Rwanda, or more how even very low-income and fragile countries complex, as in pay-for-performance schemes can successfully implement RBF (Rwanda has for teachers.151 Similarly, schemes that will a per capita GDP of around US$750).149 Also completely revamp the existing system will part of Rwanda’s context is the government’s require more political leadership than others development of the Imihigo system, a “Home that use existing structures. As to whether Grown Solution”, which is based on contracts capacity exists to implement the specific between national and 144 Rusa et al (2009) and Rusa and Fritsche (2007) 145 Upper Quartile (2015) 146 See Andrews et al (2013). 147 Andrews (2013) and Andrews et al (2017) 148 Rusa et al (2009) 149 World Bank (2018) 150 Klingebiel et al (2016) 151 As suggested by Barleavy and Neal (2012). Part II – RBF and Governments 52 intervention, in Rwanda, RBF was built onto “strengthened systems” such as improving existing programs that had already been data management, revising the curriculum, proved to be effective. In other words, as long putting foundational policies in place, and as there are pockets of effectiveness where increasing government capacity for planning the required capacity exists (or willingness by and implementation. One survey respondent the government or actors to invest in creating wrote, “I was skeptical that Lebanon would be them), it seems that RBF can be implemented able to successfully implement RBF, to bring successfully.152 more refugees into the systems. But it turns out I was wrong.” Based on the most recent Conflict, Fragility, and Violence status report, there has been an increase in the From our survey data, it is clear that the enrollment rates of both Lebanese and non- vast majority of respondents (79 percent) felt Lebanese students. 
that RBF could be introduced in fragile and conflict/violence affected (FCV) areas. Even if The Dutch NGO Cordaid has been this is the case, RBF in FCV contexts generally implementing RBF projects in FCV requires more customization. In SIDA’s internal environments since 2001, particularly in the guidelines, there is a section dedicated solely to health sector. Cordaid introduced RBF to the “special design considerations in fragile states,” health sector in Sub-Saharan Africa and has a which rightly indicates that, thus far, the few activities underway in the education sector. experience is limited and not very conclusive.153 Cordaid believes that RBF works particularly However, the guidelines note that there has well in FCV contexts because it allows for been more experience in the health sector with more flexibility in funding allocations for local operating RBF programs in FCV contexts, and health facilities to decide what is needed based the general approach has been to pilot RBF in on their neighborhoods’ needs (this flexibility a particular region or province and then scale will be discussed further in the implementation up based on the success of the pilot. This was section).154 RBF can also be used to target precisely the modus operandi in the successful specific populations. For instance, in the health intervention in Rwanda that was Democratic Republic of Congo, the number discussed in the previous section. of safe deliveries of babies rose to 97 percent in RBF facilities compared to non-RBF In these settings, RBF might be better facilities.155 deployed to incentivize system-building than outcomes at first, as these systems will make Haiti is an example of a fragile, low-income it possible to set intermediate and outcome- country where the preconditions for the level indicators in the future. For example, effective implementation of RBF were in Lebanon, which is dealing with a Syrian not in place. 
In particular, Haiti has been refugee crisis, RBF is being used to move from struck by numerous external shocks that a crisis situation to a more sustainable one have exacerbated the country’s fragile by incentivizing the government to prioritize state, including a devastating earthquake education quality for both Lebanese children in 2010, a powerful hurricane in 2016, and and Syrian children. The project has nine long political transitions. These shocks have indicators that must be met for funding to greatly diminished public sector capacity, be disbursed, four of which are focused on particularly in the education sector. Haiti has 152 The idea of pockets of effectiveness has been discussed extensively in the literature, often under different names. See, for example, Andrews et al (2017) and Marsh et al (2004). 153 Olander and Högberg (2016). 154 See, for example, Results for Development (2016) or World Bank (2017) . 155 Cordaid (2017) Part II – RBF and Governments 53 a unique education system in that the majority strengthen its capacity to implement RBF will of providers are private (approximately 80 be further explored below in the section on percent of students in the system attend non- monitoring and evaluation. public schools) and therefore are not within the jurisdiction of the Ministry of Education. To support access to education for the poorest Design Priorities children, the Government of Haiti has funded several tuition waiver programs where fees While there is a general consensus in the are directly paid to non-public schools on development community that how a project the condition that they meet a series of is designed is critical for its successful requirements related to education quality. implementation, it is not always clear which Despite the good design of these programs, design elements are the most important, the government did not have a reliable set especially in RBF projects. 
In our survey, of indicators or monitoring systems in place practitioners identified the two biggest to verify whether schools met the stated challenges in project design as choosing conditions. Thus, the government participated indicators (67 percent) and verifying results (61 in an exercise to develop such systems in an percent). These will be discussed in detail in this effort to create a stronger link between data section, along with other design considerations (indicators) and incentives. Haiti’s development that can lead to more effective RBF. of a quality assurance system (QAS) to Figure 6: What is the Biggest Challenge in Designing RBF Activities? Part II – RBF and Governments 54 Cascading Incentives held responsible for achieving results as well In education, as in in health, there have been as the actors within the country. This idea many RBF schemes that have targeted frontline of mutual responsibility for results is often providers, notably teachers and health workers. overlooked in other theories of RBF where Interestingly, a very large majority of our in-country agents are expected to innovate on survey respondents (90.9 percent) indicated that their own.156 it was most important to incentivize national- These responses show the importance of level actors such as policymakers. In education, thinking through the results chain and how one of the challenges is that the main national- incentives can cascade down to every level. level actor is usually the Ministry of Finance, There is some operational evidence that which often does not communicate to the incentives should be targeted to the responsible Ministry of Education about the incentive administrative levels where the action being scheme or does not cascade the incentive, i.e. 
incentivized is taking place, but this is not keeps all of the disbursements at the Ministry always the case.157 In many projects where of Finance level, or does not incentivize the donors provide funding to incentivize country Ministry of Education to achieve results that governments, other actors at lower levels are they are responsible for. The mechanics of how often bypassed or overlooked. This may not be this relationship plays out is further explored intentional, but this is often the problem — that in Box 1 on choosing financing mechanisms. donors have not thought about which actors Other than national actors, 61.4 percent of play a role at what stage of the results chain. the survey respondents indicated that front- line providers (for example, teachers) were the In the DFID-funded Ethiopia Secondary most important people to incentivize, followed Education pilot, the national government was by schools (75 percent) and then meso-level incentivized to increase the number of exam officials (for example, district education sitters, boys and girls, over the course of three officers) (84.1 percent). These responses may be years. The national government in turn passed skewed due to the targeted survey population down part of the incentive to the regional (since staff at development agencies primarily level, and within some regions, schools were interact with their counterparts at the central responsible for spending the RBF funds, and government level), but in the written responses some were also directly incentivized. 
Even to this question, respondents addressed this though there was no short-term impact, there by indicating that teams “usually do not think was some evidence that strategic thinking and through the potential cascading effects (or lack prioritization improved at both the regional and thereof) of incentives within the government school levels, though the evaluators indicated structure.” In one example, a respondent that their team did not visit enough schools mentioned a pipeline project that had not in Year 3 to make a strong argument that the addressed the disconnect between how donor school-level changes were widespread.158 financing would flow to which part of the This example shows that the intervention’s government and who within the government designers assumed that regions and schools structure needed to be incentivized to would be influential in increasing the number achieve the desired outcomes. Other survey of exam sitters, though it is unclear if that takers indicated that, even though the donor assumption is indeed true. The trickling down community and development agencies are of incentives did create some other types of obviously not incentivized, they should also be 156 Birdsall and Barder (2006) 157 Sabarwal et al (2016) 158 Cambridge Education (2015) Part II – RBF and Governments 55 positive results, such as modifying the formula stakeholders such as parents and families and/ to reward only the most successful schools, or even students themselves would have made supporting all schools to increase the numbers a bigger difference since presumably they of both sitters and passers at the regional may have had more control over whether they level, and supporting underperforming school showed up for the exam. 
While there may have to increase the numbers of both sitters and been other unintended consequences of that passers at the regional level.159 Given that one design, there is some evidence that those types flaw of the pilot was that most head teachers of incentives can work to increase effort and did not even know about the pilot intervention, attendance. While there is also some evidence it is unclear if providing incentives to other that school-level incentives can produce similar 159 Cambridge Education (2015) Box 1: Choosing Financing Mechanisms for Cascading Incentives Regardless of what result or stakeholder is incentivized and how much an indicator costs, there also needs to be a way to transfer funds. This is particularly important in RBF because the incentives need to be able to get to the right actors. Unfortunately, this aspect is not always carefully thought through during the design phase, which leads to problems in implementation. For example, in a higher education project in India, there was no mechanism for the central government to transfer funds directly to participating institutions, and thus the funds were first sent to state treasuries (World Bank, 2017b). The completion report noted that fund releases from the state treasuries to institutions took an inordinately long time — over 100 days in many cases and even 300 days in a couple of cases. Unfortunately, the central government did not have the ability to control or sanction those states that did not disburse funds, and delays in disbursement greatly affected the ability of certain institutions to comply with project milestones and diluted the effectiveness of RBF. In a West African country, there were problems with funds flow that led to the cancellation of DLIs. The Ministry of Finance was not keen to disburse money to the Ministry of Education, which was responsible for achieving the DLIs. 
This was due to a number of factors, including an economic crisis in the country and the poor relationship between the Ministries of Finance and Education. This is not a unique situation. Especially when it comes to traditional aid relationships, in many countries the Ministry of Education is not the strongest line ministry, yet they are responsible for selecting the DLIs while the Ministry of Finance is the agency that receives the disbursements. This type of issue has been managed through “results-based budgeting”, which has been used in various projects. For example, the designers of the Jamaica Early Childhood Development Project included DLIs that required the Ministry of Education to prove that there was an adequate budget for achieving other DLIs for the same fiscal year and that the execution rate under the budget lines for DLIs exceeded 70 percent (World Bank, 2008). Many of the “budget” related DLIs in the World Bank DLI analysis (to be described in further detail in the next section) are tied to timely execution and/or adequate resource flows. Part II – RBF and Governments 56 results, perhaps combining the two incentives to certain levels. Ultimately, the best incentive would have created more of an impact, along scheme for service providers, meso-level with better dissemination of information about (district/province) stakeholders, or national- the RBF scheme. Without understanding level stakeholders will not work if other agents’ the full Ethiopian context, it is difficult to interests are not aligned with that incentive, as know, but this example still shows that when was seen in the Burkina Faso. designing a project, teams should question their assumptions and think about how Selecting and Pricing Indicators incentives will trickle down and to whom and In RBF, just as with traditional development how those incentives might work together. 
financing projects, it is not always easy to know which indicators are the “right” ones, but indicators in RBF projects carry more weight because achieving them prompts the disbursement of funds. RBF indicators must strike a balance between cost, effort, feasibility, and ambition.162 For more insight into the selection and pricing of indicators, we take a detailed look at designing disbursement-linked indicators (DLIs), which are the indicators that must be achieved for funds to be disbursed in RBF projects.

162 World Bank (2017)

Some survey respondents argued that targeting the incentives to the wrong actors can have negative effects. If they are targeted to high levels (for example, the government), they have little hope of cascading, but if they are targeted to actors too low down the chain, the risk of perverse behavior grows, as in the case of teacher performance pay schemes (which we discussed in the previous section on teacher incentives). In a health RBF intervention in Burkina Faso, researchers found that the designers failed to target the incentives to certain groups of medical support personnel and health management committees even though they were working closely with the health workers who were being incentivized. This contributed to those actors’ perceptions that RBF was just another form of regular development aid rather than something that could generate more systemic results.160

There is no conclusive evidence that incentives work best at one particular level of the education system. However, based on operational experience, it seems to be more important to think about whether the incentives given to the different actors involved in the delivery of the service or programs are aligned than about who receives explicit incentives.161 This requires designers to have a clear theory of change in mind, as well as to be aware of the political economy of the sector and of funds flow issues as, in some instances, there may not be a practical way to transfer funds

160 Ridde et al (2018)
161 As mentioned, for example, in Olander and Högberg (2016)

DLI Analysis: The Basics

We started by analyzing the disbursement-linked indicators (DLIs) that have been used in World Bank education projects. We classified indicators into four types (input, process, intermediate, outcome) to ensure more differentiation during the analysis. In Table 7, there are examples of each of these categories. For the purposes of this report, the primary difference between a process indicator and an intermediate indicator is that a process indicator generally reflects that an action or policy has taken place, but nothing additional has happened.

We focused on the World Bank’s portfolio given the Bank’s large share of RBF projects in the education sector, as well as the fact that it often acts as the implementing agency for other donors. The analysis covers 352 DLIs from 51 projects (investment project financing using DLIs and PforRs) from 2008 through June 2018. This is around 6 percent of the Bank’s total education portfolio, which included 843 projects during that time period. Almost 94 percent of the funding in these projects was results-based, even when the financing instrument was not a PforR. We categorized the DLIs by topic (Table 8) and by their position in the results chain (Table 9).

Table 8: Common Disbursement-linked Indicator Topics
TOPIC | EXAMPLE
Inputs | Budget, infrastructure, textbooks
Data/Systems | EMIS, school census data, annual reports
Assessment | Administering exams, examination commissions
Teachers and Teacher Training | Teacher management, teacher accountability, teacher deployment, teacher training and evaluation
Enrollment, Completion, Retention | Number of students, increase in students, number of students completing grade or training
Quality Assurance | Increased capacity, accreditation, performance benchmarks, readiness criteria
School-based Management | School grants, school management committees
Curriculum | Curriculum standards, curriculum framework
Learning Outcomes | Test scores, employment
Policies/Frameworks | Reforms, council or agency established/operational
Other | Scholarships, industry relationships, skills training

Table 9: Examples of Indicators
POSITION IN RESULTS CHAIN | EXAMPLE
Input | Textbooks have been procured and delivered to targeted schools.
Process | An effective and relevant curriculum is in place.
Intermediate | The required number of pilot school inspections has been completed with reports published on the Ministry of Education’s website.
Outcome | The recipient has demonstrated an improvement in student learning outcomes.

Before delving into other questions, here are some basic details on the DLIs analyzed. The average number of DLIs per project was 6.9, with the median number being 6. However, there was some variation, with the bottom quartile of projects having 5 or fewer DLIs while the upper quartile had 9 or more. The values of DLIs are also quite variable. The mean DLI was worth around US$18.8 million. However, the bottom quartile of DLIs cost US$12 million or less each and the top quartile US$29.5 million or more each. Some of the DLIs were worth huge sums of money, especially in PforR projects. The highest-valued DLI was worth US$341.5 million and belonged to the Nigeria Basic Education Project.

Regarding time trends, it is too early to tell, since our sample is limited to 51 projects. The only result worth highlighting is the growing popularity of results-based financing, as evidenced by the increase in the number of DLIs over the past few years.

Figure 7: Number of DLIs by Year

DLIs and Results Chains: Few Focus on Outcomes

Our survey results showed that most practitioners believe that incentivizing a mix of results is the most effective design, and our DLI analysis results support this. According to the DLI analysis, most indicators are set at the intermediate or process level, with very few DLIs at the input or outcome level. Overall, as seen in Figure 8, the majority (75 percent) of DLIs are focused on intermediate outcomes, meaning that they require an improvement or strengthening of something, such as teacher training. Very few DLIs focus on inputs (5 percent), while even fewer are set at the outcome level (4 percent).

Figure 8: DLIs by Position in Results Chain in World Bank Education Projects

While the survey respondents indicated that inputs were worth financing (see Figure 10), our DLI analysis revealed that inputs were very rarely used as DLIs in World Bank education projects. Even indicators related to traditional inputs such as textbooks tended to focus on quality aspects, such as timelier delivery of textbooks. Similarly, DLIs related to school construction were generally more about ensuring quality improvements than about building more schools.

In a recent World Bank evaluation of the PforR instrument (the most widely used financing instrument that directly ties financing to indicators within the World Bank), roughly 48 percent of disbursement-linked indicators across all sectors were defined as results such as capacity building and institutional development, which do not qualify as final outcomes.163

This is most likely due to pragmatism on the part of project designers. In fact, the ADB recommends that teams focus on institutional strengthening when selecting DLIs to capitalize on the potential of RBF to strengthen systems and institutions.164 In the responses to our survey, there were mixed opinions, with a few respondents echoing, “RBF is too often used for processes and inputs; because these are seen as ‘easy’ and so can provide a flow of funds. But fund flows should be smoothed with other mechanisms. Processes can be important if they really represent a change in the way a system is operating.” However, the majority of survey takers indicated that context was incredibly important and that “good projects finance different steps in the result chain, not only final outcomes.” The importance of incentivizing throughout the results chain will be illustrated in later examples.

In contrast, some studies have argued that outcome-based indicators are associated with better results since they are agnostic on the activities required to produce those outcomes.165 Arguably, this allows agents to find the best way to achieve the outcomes. However, as previously discussed, outcome indicators are also those over which agents have the least control. In education interventions, an example of an outcome indicator is literacy or student learning as measured by test scores. There are many factors that can affect these indicators, thus undermining the link between the effort of the implementing agent and the results that it has to show for it. As mentioned in the previous section on teacher incentives, test scores are prone to random variation due to shocks, cohort characteristics, or school size.166 This is why many survey respondents believed that it is crucial to extend RBF further up the results chain, such as intermediate outcomes and even process indicators and inputs.

163 IEG (2016)
164 ADB (2016)
165 Holzapfel and Janus (2015)
166 Murnane and Ganimian (2014)

A good way forward may be to focus not just on the position of an indicator in the results chain but also on how much control the agent has over achieving it. This is a principle that can also apply to the rest of the results chain since some types of indicators are more controllable than others. Therefore, it is important to think seriously about what determines changes in any indicator from year to year. For example, the enrollment rate is a common indicator in many education projects, but it is contingent both on the number of students enrolled and on how the denominator (the universe of school-age students) is measured.

Another lesson that we learned from our DLI analysis was the importance of valuing indicators based on their leverage (or ability to unlock processes and progress) rather than on their value for money. It is worth mentioning that DLIs are often tied to larger systemic change. For example, one objective of an ongoing basic education project in the Dominican Republic is to increase the country’s capacity to recruit and train primary and secondary school teachers. Attached to this objective are a series of specific DLIs, such as the development and dissemination of professional standards for secondary school teachers.

Figure 9: DLIs by Position in World Bank Education Projects (legend: Input, Process, Intermediate, Outcome)
Source: Authors’ analysis of DLIs in World Bank education sector projects

Figure 10: Types of DLIs Worth Incentivizing
Source: Authors’ survey of opinions on RBF of development agency staff working in education

Given how difficult it can be to identify a clear formula that leads to strong learning outcomes, which DLIs are best to select often depends on understanding the results chain. As one survey respondent wrote, “If a major constraint to equity in a particular country is the near-permanent delay in the production and delivery of mother-tongue-instructional materials for ethnic communities, tying disbursement to this input could be powerful. New or revised processes can be sensitive and challenging to design or implement so tying disbursement to process indicators could be key in a situation like this.”

Examples of Results Chains: Bangladesh, Lebanon, and Tanzania

Ultimately, in RBF projects, the results chain is of paramount importance, and it would be helpful for teams to ensure that they have all the necessary context-relevant information about the results framework.168 Here we present some examples of results chains and DLI progression in World Bank RBF projects in Tanzania, Lebanon, and Bangladesh. They illustrate how and where disbursement-linked indicators can be inserted along the chain on the assumption that incentives can potentially unlock identified bottlenecks.
Deciding where an indicator fits in the results chain is itself hard because it depends on perspective: an input in one project may very well be an output in another.167

167 ADB (2017), Clist and Verschoor (2014), and Holzapfel and Janus (2015)
168 Sabarwal et al (2016)

Figure 11: Results Chain Example from Tanzania
Column headings in the original figure: INTERMEDIATE LEVER, INPUT/ACTIVITY, FINAL OUTCOME, HIGHER ORDER OUTCOME. Each entry below pairs an input or activity with the indicator shown next to it in the figure.

Intermediate lever: Strengthen Performance - Transparency
• Official School Ranking: School ranking released
• National 3R Assessment: No. of schools participating in the 3R assessment
Final outcome: Identification of lagging schools, students, and teachers [for better planning and focused attention]

Intermediate lever: Motivate through incentives
• School Incentive Grants (SIG), performance based: No. of schools receiving performance-based incentive rewards (DLI)
• Non-financial performance incentives for teachers: Teacher awards announced yearly to high-performing teachers
Final outcome: Increased teacher effort measured through classroom presence (PDO)

Intermediate lever: Improve Teacher Conditions
• Clear backlog of Teacher claims: No. of outstanding claims older than three months

Intermediate lever: Provide Support where required
• School improvement toolkit: No. of schools that receive the toolkit
• 3R teacher training program: No. of teachers trained. Final outcome: Improved teacher proficiency 3R subjects (PDO)
• Student-Teacher Enrichment Program (STEP): No. of schools participating in STEP. Final outcome: Improved student performance in 3R assessment (PDO) + (DLI)
• Timely Delivery of Adequate Capitation Grants: Percentage of schools receiving capitation grants on time (DLI). Final outcome: Improved textbook student ratio

Higher order outcome: Improved student learning outcomes in PSLE & CSEE examinations

Source: Big Results Now in Education Program, Project Appraisal Document, World Bank 2014

Figure 12: Results Chain Example from Lebanon
Column headings in the original figure: INTERMEDIATE LEVER, INPUT/ACTIVITY, FINAL OUTCOME, HIGHER ORDER OUTCOME.

Intermediate lever: Equitable Access
Inputs/activities:
• School construction and rehabilitation
• Psychosocial program for students to help with re-integration into formal education
Indicators:
• No. of school-aged children (3–18) enrolled in formal education (DLI)
• No. of children and youth whose registration fees for public formal education and ALP are partially or fully subsidized
• No. of public schools newly built or expanded to meet quality standards specified in GoL’s Decree 9091
Final outcome: Increase in the proportion of school aged Lebanese and non-Lebanese children (3-18) enrolled in formal education (PDO)

Intermediate lever: Enhanced Quality
Inputs/activities:
• Training for educators
• School grants
• M&E of teacher quality, learning outcomes, and learning environments
Indicators:
• % of children and youth aged 3–15 above the corresponding graduation age who have completed a Cycle
• Proportion of students transitioning grades (DLI)
Final outcome: Increase in the proportion of students passing their grades, and transitioning to the next grade (PDO) + (DLI)

Intermediate lever: Strengthened Systems
Inputs/activities:
• EMIS
• Capacity building
• Revised curricula
Indicators:
• Unified framework for data management, data collection protocols, and compliance systems endorsed and operational (DLI)
• CERD adequately capacitated and equipped to develop interactive content and e-platform
• Curriculum revised to improve quality of learning
Final outcome: Timely and robust data available for evidence informed policy-making and planning (PDO)

Higher order outcome: Improved access, quality and a stronger education system to respond to the refugee crisis

Source: Adapted from Reaching All
Children with Education in Lebanon Support Project, Project Appraisal Document, World Bank 2016

In Tanzania, the government had already participated in an intensive retreat to identify the primary results that it wanted to achieve as part of its education reform initiative. Thus, the team focused on designing DLIs that would: (i) link incentives to key points within the program results chain; (ii) focus incentives as closely as possible on the key actors accountable for their attainment; (iii) be simple and manageable in terms of their number and framing; and (iv) have a high likelihood of being achieved within the specified timeframe and within the control of the government.169

In Lebanon, the team sat down with their government counterparts to discuss the key areas in the education system that required immediate improvement. These areas became the key levers, or pillars, of the project. As a result of these discussions, the DLIs were structured to follow a logical progression down the results chain with specific reference to the Syrian refugee crisis, because of which many Syrian children were out of the formal education system. For example, the first DLI was the incorporation of all children into the education system. The second was for all of these children to complete the school year and transition into the next year throughout the grades. The remaining DLIs bolstered other key pillars in recognition that enrollment, completion, and retention are not sufficient achievements on their own.170

In the Bangladesh Primary Education Project, DLIs were introduced during the third iteration of the project. In this case, the government already knew how to manage donor funding, and many of the reforms under the second project were being rolled into the third.171 The first DLI was the introduction of a five-year action plan to improve the Grade 5 completion exam, and subsequent DLIs focused on revising the exam and piloting it with incremental increases in the number of competency-based exam items. Ultimately, the goal was not only to improve the exam but also to ensure that the results were analyzed and disseminated in a timely manner.

169 World Bank (2014b)
170 World Bank (2016)
171 World Bank (2011)

Table 10: Sample DLIs from the Bangladesh Third Primary Education Development Project

DLI: 3. Grade 5 Completion Exam: Improving the quality of primary completion exam and the regular measurement of learning

BASELINE: Grade 5 completion exam implemented for all primary school students in 2009. Content focused on testing students’ memory more than ability to use subject knowledge

YEAR 0 (May–June 2011): A 5-year Action plan for improvements in Grade 5 Completion Exam developed by NAPE and approved by MOPME including revising test items to gradually transform exam into competency-based test. New test items developed by NAPE on selected competencies and piloted with accompanying guidelines for pilot test administration and training of test administrators

YEAR 1 (April/May 2012): Revised 2011 Grade 5 Completion Exam, based on action plan and pilot results implemented, incl. guidelines developed for markers and training of markers. Analysis of 2011 Grade 5 completion exam results and content completed by DPE and NAPE and results disseminated

YEAR 2 (April/May 2013): Action plan implemented with at least 10% of items competency-based introduced in the 2012 Grade 5 exam and an additional 15% of competency-based items piloted. Analysis of 2012 Grade 5 completion exam results and content completed by DPE and NAPE and results disseminated

YEAR 3 (April/May 2014): Action plan implemented with at least 25% of items competency-based introduced in the 2013 Grade 5 exam and an additional 25% of competency-based items piloted. Analysis of 2013 Grade 5 completion exam results and content completed by DPE and NAPE and results disseminated

PROTOCOL
Definition: The Grade 5 Action Plan specifies the number of new competency-based items to be introduced each year, with the aim of achieving a fully competency-based exam by end-2016. Analysis of results includes: (i) analysis of pass rates by gender, subjects, Upazilas conducted by DPE; and (ii) analysis by NAPE of marking and scoring of a sample of answered scripts in selected Upazilas.
Source: Action plan as approved by DG, NAPE and MOPME; sample of test items and questionnaire of grade 5 exam; test analysis reports by DPE and NAPE.

Source: World Bank (2011b)

All three of these examples illustrate the theories of change within the projects and how incentives were used to promote the achievement of higher order goals.

In our DLI analysis, we also looked at the thematic focus of DLIs to see if there were any patterns in the types of DLIs that teams preferred. While the DLIs covered a range of topics, the most common ones were: (i) teachers and teacher training; (ii) quality assurance; (iii) enrollment, retention, and completion rates; (iv) inputs such as textbooks or budgets; and (v) policies/frameworks. There was nothing to suggest that these were unusual, and it is possible that non-RBF projects might also favor the same types of indicators.

Figure 13: DLIs by Topic

As previously mentioned, the types of achievable results will heavily rely on the country context and, in particular, how much control an agent has over the result. Although there is no clear way to select specific DLIs that will guarantee transformation, other researchers and agencies have put together some criteria to help practitioners to select DLIs and to assess the quality of those indicators (see Table 11 and Table 12).

Table 11: SIDA’s Checklist for Choosing Indicators in RBF
• Indicators relate to the desired outcome
• Indicators are neither too many nor too few
• Indicators are based on what is already there and needed anyway
• Indicators are precisely defined with clear protocols
• The measurability and periodicity of the indicators are fully verified
• Baselines are determined and verified
• Any targets are sensibly set
• Timeliness of data informs planning, budget, and disbursement schedules
• Flexibility is built into the agreement
Source: SIDA’s internal guidelines for RBF

Table 12: Criteria for Assessing the Quality of Disbursement-Linked Indicators
CRITERION | KEY QUESTION | CONSIDERATIONS FOR RESULTS-BASED APPROACHES
1) Focus on results | Do indicators ensure a focus on results? | • The indicators can measure results (outputs and outcomes) or processes (inputs and activities)
2) Control | Can results be influenced by and plausibly associated with the intervention? | • The extent to which incentivised actors have control over achieving the intended results • The extent to which results can be attributed to the intervention • The institutional setting of incentivised actors
3) Financial incentives | Can intended effects be maximised? | • The extent to which financial amounts reflect ‘value for money’, policy leverage, risk or other considerations • Whether disbursement is scaled in proportion to performance or conditional on achieving a threshold level
4) Measurability and verifiability | Are indicators reliable, consistent over time and independently verified? | • The relationship between the indicator and the underlying objective of the programme • The data quality and source (administrative data or survey data) • The way verification is organised (independent or not)
5) Unintended consequences | Can unintended effects be minimised? | • The extent to which indicators allow gaming (active manipulation of the indicators) • The extent to which indicators lead to distortions (indirect consequences of overemphasising or neglecting policy choices)
Source: Holzapfel and Janus 2015

DLIs and Short Project Timelines

Another important consideration when selecting indicators is the timeframe within which indicators are expected to be achieved. Nowadays, many development projects are scheduled to last only four to five years (for example, according to the GPE guidelines, grants with the variable part are to last three to four years), and teams must be realistic about the type of result that can be achieved within that timeframe. If the project is to be implemented in a country that is struggling to meet its basic education goals, then setting outcome targets may only be setting things up for failure. The results of our survey and interviews also point to this. One respondent wrote, “In a recent grant to Cambodia … the implementation timeline was too short for RBF to be used meaningfully.” DFID-funded evaluations of projects in both Rwanda and Ethiopia noted that the projects’ life cycles were probably too short to generate any long-term changes. One possible solution might be to apply RBF either after traditional financing projects, as in the Bangladesh example, or as a way to incentivize results further down the chain directly after the completion of an existing RBF project.

Pricing DLIs: Three Hypotheses and a Heuristic

There does not seem to be any consensus on how to price DLIs, though researchers have suggested several criteria that could be used, including (i) value for money; (ii) leverage effects; and (iii) additional risks for partners.172

Using the value for money criterion involves pricing a DLI in proportion to the benefit or “value” of the activities required to attain it. As discussed above, using the leverage criterion relates to whether a DLI can unlock results further down the results chain or is complementary to other DLIs. Using the criterion of spreading out risks involves distributing the disbursements amongst several DLIs to avoid an all-or-nothing situation. For example, a project could combine some DLIs that are easier to attain with some that are harder.

There is some support for these hypotheses. Based on our interviews with practitioners and our DLI portfolio analysis, we found that the most common practice for pricing a DLI in World Bank RBF projects was to use a simple heuristic: divide the total amount set aside for DLI disbursements by the number of DLIs. Thus, if a project had 10 DLIs and the financing allotted to DLIs was US$100 million, then teams would roughly price each DLI at US$10 million. Figure 14 shows the dispersion of DLI values within each project, measured as a percentage of the total funds. In around 18 percent of projects, all of the DLIs were priced equally, and therefore the dispersion is 0 percent.173 Around 50 percent of projects have an average dispersion of 5 percent or less.

Figure 14: DLI Value Dispersion by Project

172 Holzapfel and Janus (2015)
173 In other words, the standard deviation across DLIs within the project was 0.

In general, we found that most of the variation in DLI pricing came from differences between projects rather than within projects. In other words, different DLIs within the same project tended to be closer in price than similar DLIs in different projects. This pattern suggests that teams are indeed trying to spread out the risk and minimize the potential negative impact of countries failing to achieve the DLIs by ensuring that all DLIs are worth roughly the same in terms of disbursements.

Another pricing mechanism that came up during our analysis, survey, and interviews was to “make the most important thing the most expensive.” This adds some evidence to the “value for money” and “leverage effects” hypotheses. In RBF, the actual expenditures of an activity can be delinked from disbursements and the activity can then be priced for its perceived worth. While a policy reform may cost nothing, it may be a major improvement in terms of enabling results further down the results chain and, thus, could be priced accordingly.

Overall, this evidence suggests that project teams do not price DLIs in proportion to their real cost since, otherwise, there would be more variation within projects and less across similar projects. Instead, they seem to focus on how to spread out disbursement risks and on the perceived value of DLIs as a way to produce results further down the results chain.

As we have shown, DLI pricing in World Bank RBF education projects does not follow any particular formula, nor is it clear how much a DLI should cost to create “sufficient” incentive. In the Sri Lanka Education Sector Development Program (see Table 13), which is set to close in 2019, there are nine disbursement-linked results areas (DLRs), which cover a total of 37 indicators. The financing for each DLR is either US$10 million or US$32 million, but no explicitly articulated rationale exists for these allocations. Presumably, if the donor wanted to signal the importance of one indicator over another, there would be more variation between the allocations. Instead, disbursements are relatively evenly spaced out per year, with lower disbursements in the final two years. This makes sense given that most of the legwork to achieve final indicators would have been done in the previous years.

In addition, the program budget shows a total of US$200 million allocated to DLIs, which is a sizable amount and is probably enough to incentivize action. However, in an early childhood development project in Jamaica, each DLI was only worth US$180,000, yet the government was still incentivized to achieve the targets.174

174 World Bank (2008)

Table 13: Disbursement Table of DLIs for Sri Lanka Education Sector Development Program
# | DLR | BANK FINANCING ALLOCATED TO DLR (US$ M) | YEAR 1 (US$ M) | YEAR 2 (US$ M) | YEAR 3 (US$ M) | YEAR 4 (US$ M) | YEAR 5 (US$ M)
1 | Pass rates for GCE O level examinations increased | 10 | | | | 5 | 5
2 | Pass rates for GCE A level examinations increased | 10 | | | | 5 | 5
3 | Pathways from school to TVET developed: Technology Stream commenced and implemented at GCE A levels | 32 | 8 | 8 | 8 | 4 | 4
4 | Secondary schools upgraded to offer all subject streams | 32 | 8 | 8 | 8 | 4 | 4
5 | Enrollment in GCE A levels Science Stream increased | 10 | | | | 5 | 5
6 | Enrollment in GCE A levels Commerce stream increased | 10 | | | | 5 | 5
7 | Principals and deputy principals trained | 32 | 8 | 8 | 8 | 4 | 4
8 | Institutional capacity at MOE and provincial levels strengthened | 32 | 8 | 8 | 8 | 4 | 4
9 | Improved transparent and efficient procurement | 32 | 8 | 8 | 8 | 4 | 4
TOTAL | | 200 | 40 | 40 | 40 | 40 | 40
A/L = advanced level, DLR = disbursement-linked results, GCE = General Certificate of Education, MOE = Ministry of Education, TVET = Technical and Vocational Education and Training.
Source: ADB (2013)

Box 2: Overpriced Indicators in Uganda

In Uganda, a health intervention led by the Dutch NGO Cordaid used RBF to incentivize the use of maternal and neonatal care services. The program faced budget constraints after only two years of implementation. As a result, the most expensive subsidy for outpatient consultation services was cut in half. Initially the facilities complained about the reduction, but attendance rates did not go down. Cordaid originally hypothesized that facilities would simply start charging higher user fees to cover the gap, but upon further investigation, it was discovered that the initial incentive had dramatically improved the quality of public facilities so that they could now compete with private not-for-profit health centers. This improvement in quality was able to generate some price competition. Perhaps the most interesting conclusion from this example is that it showed Cordaid that the outpatient consultation services indicator had likely been originally “overpriced” and that indicator pricing, in general, warrants more thought.

Source: RBF Health (2017)

In general, there are no distinct trends for pricing DLIs based on themes. Our analysis showed the average price of a DLI based on its position in the results chain. As can be seen in Table 14 below, process DLIs have a lower value than intermediate outcomes, and intermediate outcomes have a lower value than final outcomes. Intuitively it makes sense that final outcomes would be the costliest indicators since they are the most difficult to achieve, but it is unclear why process indicators are, on average, worth less than input-related ones. More analysis of individual DLIs in the process category is needed to understand how teams are pricing them in relation to input-related DLIs, although, based on our overall DLI analysis, there may be no distinguishable methodology being applied.
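To make the pricing discussion concrete, the equal-split heuristic and the dispersion measure described in this section can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' actual calculation: the function names are hypothetical, and using the population standard deviation of each DLI's share of total funds is an assumption about how dispersion was measured. The allocations are taken from the Sri Lanka table (Table 13) purely as sample data.

```python
from statistics import pstdev


def equal_split_price(total_dli_funds, n_dlis):
    """Equal-split heuristic: total funds set aside for DLIs divided by the number of DLIs."""
    if n_dlis <= 0:
        raise ValueError("n_dlis must be positive")
    return total_dli_funds / n_dlis


def dispersion_pct(dli_values):
    """Spread of DLI prices within one project, in percent of total DLI funds.

    Assumption: dispersion is the population standard deviation of each
    DLI's share of the total; the report does not spell out the formula.
    """
    total = sum(dli_values)
    if total <= 0:
        raise ValueError("total DLI funds must be positive")
    shares = [100 * value / total for value in dli_values]
    return pstdev(shares)


# The report's worked example: 10 DLIs sharing US$100 million.
price = equal_split_price(100.0, 10)
equally_priced_project = [price] * 10

# Allocations per DLR (US$ million) from Table 13, used here as sample data.
SRI_LANKA_DLR_ALLOCATIONS = [10, 10, 32, 32, 10, 10, 32, 32, 32]

print(f"Equal-split price per DLI: US${price:.1f} million")
print(f"Dispersion, equally priced project: {dispersion_pct(equally_priced_project):.1f}%")
print(f"Dispersion, Sri Lanka allocations: {dispersion_pct(SRI_LANKA_DLR_ALLOCATIONS):.1f}%")
```

Under these assumptions, a project that prices every DLI equally has a dispersion of zero, which matches the footnoted definition, while the two-tier Sri Lanka allocations come out at roughly 5.5 percent.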
Table 14: Cost of Indicators Based on Position in the Results Chain
POSITION | AVERAGE COST (US$ M)
Process | 12.7
Input | 26.3
Intermediate | 18.5
Outcome | 39.5

While these examples give an idea of how teams have priced DLIs, they are by no means based on any science. In reality, project teams often make a judgement when it comes to costing, which is usually done jointly with the government and other relevant development stakeholders. To provide teams with more guidance, additional analysis could be conducted to identify which types of indicators disburse more easily, based on their theme and position in the results chain.

DLIs: Scalability and Disbursement Models

One of the advantages of RBF as a funding mechanism is its flexibility. Making DLI disbursements scalable can be a useful way to mitigate the risks faced by the borrowing governments, especially those facing liquidity constraints. Many projects have scalable DLIs, meaning that even if the borrower only partially achieves the DLI, it can request the disbursement of an agreed proportion of the total value. In some projects, DLIs are also not time-bound. We found that a little over half of the DLIs in our DLI analysis were scalable (55 percent). For roughly 15 percent, it was unclear whether the DLI was scalable or not. In Table 15 below, there are examples of the most commonly used disbursement models in RBF education projects.

Table 15: Advantages and Disadvantages of Disbursement Models
DISBURSEMENT TYPE | FEATURE | ADVANTAGE | DISADVANTAGE
Target based | Allocating a fixed amount per DLI and disbursing once annual targets are met. | Easily understood, can be scalable. | Difficult to set because it is hard to know how much progress will be made on a yearly basis. Often ends up penalizing borrower for lack of progress instead of rewarding the country for making good progress.
Baseline based | Rewarding progress over the current baseline (can be a rolling baseline). | Easily understood, can raise the bar every year as long as there is progress. | Provides same incentive regardless of degree of progress, e.g. any increase over the baseline will trigger full disbursement.
Progress by unit | Rewards progress proportionally on basis of agreed price/reward per unit. | Most flexible approach, most common method used in RBF projects. | Pricing can be difficult to ascertain, may need to have a cap on maximum amount that can be disbursed.
Source: SIDA’s internal guidelines

Zero/Global DLIs

Previously, “zero DLIs” were described as an option to help governments (and other entities) to manage liquidity risks by creating indicators that are achievable and thus can bring in financing that can fund efforts to achieve other, more difficult indicators. While not formally part of any guidance or research, this is a common practice used by project teams to introduce RBF into contexts where it has never been used before in an effort to acclimate them to it.

Another practice often adopted by project teams consists of “global DLIs,” which are DLIs that must be met for the project to continue. In other words, if they are not achieved, then nothing else can happen. While liquidity constraints and the scalability of DLIs are important to consider, in some projects these are insufficient criteria to ensure meaningful progress. In short, flexibility can only overcome so much.

The Sindh Education Project included a global DLI that was conditional on the implementation of merit-based recruitment of public school teachers. This became a necessary condition for all disbursements; in other words, there would be no disbursements if the DLI was not met. The rationale for that particular approach was that the DLI was considered by both the Government of Sindh and the World Bank as central to increasing education access, improving education quality, and improving sector governance and as an indicator whose achievement could be strongly assisted by Bank financing. This particular DLI also entailed a high implementation risk and thus was worth turning into a global DLI.175

In conclusion, as shown, there are many ways to select and price indicators. They can be negotiated between the donor and the government/recipient, they can be set by international benchmarking, they can be aligned with national objectives (as part of a development plan, for instance), they can fit into a cost-effectiveness framework, or they can even be decided by a panel of experts.176 It is clear that there is no consensus on which method is most effective. Thus, more research into optimal indicator selection and pricing practices would be useful to guide practitioners on the types of indicators that are likely to lead to the best outcomes and how much those are worth.

175 IEG (2013)
176 Cruz-Aguayo and Martínez (2016)

Adaptive Implementation

Implementation is a less understood and less researched part of RBF projects. This is probably because most RBF education projects are still being implemented so it is not yet possible to analyze what lessons can be learned. In addition, the nature of implementation is very context-specific and requires more robust qualitative research.

The main point of this section is to highlight the need for adaptive and flexible implementation. In RBF, the way in which an indicator is achieved is very open and can require careful management and supervision. While this is where innovation and autonomy on the client side can occur, in practice, this is usually not the case. RBF projects in education often require technical assistance from donors for the desired results to be achieved.

Monitoring and Information Systems

Monitoring and information systems are critical for results-based financing, which is based on the ability to accurately monitor and verify indicators. Thus, these systems must be in place before countries can put RBF into practice. An education management information system (EMIS) can serve multiple purposes, but the primary goal of teams is to use EMIS data as a way to assess the strengths and weaknesses of the education system.

In most instances, monitoring and information systems are needed to ensure the disbursement of funds, but RBF can also be used to establish or improve existing monitoring and information systems.

Purpose of Education Management Information Systems (EMIS)

Given that RBF requires strong monitoring and information systems to ensure that indicators are accurately tracked, EMIS can be used as a disciplinary tool; in other words, if governments do not meet agreed targets, then they face punitive measures and donor funds are not disbursed. Conversely, governments face positive measures when indicators are met. This is a somewhat overly dichotomous depiction of the process, but it illustrates one of the fundamental concepts behind RBF, that the threat of not getting the funds encourages governments to make all possible efforts to achieve indicator targets.

In reality, most practitioners think that the best way to use an EMIS in an RBF project is to see it as a feedback mechanism. Rather than using a punitive lens with regard to the non-achievement of targets, most of the practitioners that we interviewed thought that it was more important to know why the targets were not being met.

scratch, the intervention started by engaging in a dialogue with the government on the following question: what results did they want to see in the education system, and what information and monitoring systems would be required? Based on this discussion with government
An interviewee from the counterparts and education sector ADB said that the “EMIS and the verification stakeholders, the program put together a process should always be used to strengthen results framework that led to the development the system.” of a quality assurance system (QAS). This For example, in a health project in India funded is now guiding the government’s efforts to by the ADB, the team chose two outcome- strengthen the national statistics agency and level indicators, one which was chosen even the EMIS.177 This is a good example of how though data from the country’s systems were RBF can be framed when information systems not reliable and there were no funds available and capacity are lacking. It is not always to conduct a survey. There was a regular about jumping to the finish line and financing household survey, but it was fielded only every outcomes but rather about thinking about what three to four years. Given these constraints, needs to be in place before an RBF intervention the team proposed that the EMIS should can happen.178 be based at the health center level to give Level of Complexity Needed for Systems the centers a strong incentive to collect and Another design issue is the level of complexity report data. While these data might not be required of an information and monitoring 100 percent accurate, the ADB team wanted system. There are many examples in the to see a general upward trend rather than development literature of complex, expensive exact percentile increases. While this may not information systems that were funded and have been a very scientific approach, it was initiated by donors but were not successful.179 A practical, and it gave the government a strong classic example is SISTAFE — the PFM system motivation to strengthen the EMIS, which was that was developed in Mozambique. Despite a positive result in and of itself, as well as a using state-of-the-art software and receiving general improvement in health services. 
millions in donor funding, it did not take off.180 Building Systems for RBF It was slow to be adopted by line ministries The higher a country’s income level, the more and, to date, is still not universally used. likely it is to have EMIS in place. However, There are many reasons for this sort of some low-income countries are also able dynamic. First, sometimes technology is to develop an effective EMIS. REACH prioritized over purpose. IT systems and recently awarded a grant to Haiti to enable software are much easier to develop and replace the development of an EMIS and, therefore, than organizational culture and practices, create the pre-conditions for an RBF project so it can be tempting to build an EMIS from in the future. Rather than start with an scratch but then to find that no one uses it. RBF intervention and scramble to create Second, the donor’s needs can sometime be information and monitoring systems from 177 Barón and Adelman (2018) 178 World Bank (2017) 179 Andrews et al (2017) 180 Andrews (2013) Part II – RBF and Governments 75 prioritized over those of the country. Because The Colombia program started by holding information systems are crucial for the focus groups to discover the concerns and effective implementation of RBF, there is the problems of agents at different government danger that the donor will invest in designing levels and to have them state their vision for the a country’s EMIS with the sole purpose of national information system. Only then did the serving a program or intervention. This just team start work on the technical design of the ensures that it will never be used again once indicators and software. Although the system the program is over. is still being piloted, it is an example of how to undertake an EMIS reform by engaging all of In some contexts, the challenge is to build the actors involved. Recent evidence suggest a system from scratch because nothing else that this broad engagement is one of the exists. 
In these contexts, it is worth it to build determinants of whether a reform succeeds or only simple and easy to maintain systems. not (rather than a lone political champion).183 For example, the REACH-funded Haiti intervention aimed to develop a functional Monitoring Options EMIS to solve the country’s problems, not There are various ways in which data can be just to serve donors or programs. In Niger, collected and reported in order to monitor the Ministry of Education has limited data the achievement of indicators. Generally, sources and relies mainly on school censuses if government information systems and and statistical yearbooks,181 and likely would administrative data are available, then this not require a more complex, high-tech system can be the cheapest option.184 However, this at present. can be problematic if the data are unreliable, inaccurate, or just not updated frequently However, if the country in question has plenty enough. If the agency in charge of the data of reliable data, strong capacity to interpret benefits from the RBF intervention, there is data, and a functioning EMIS, then the also a risk of gaming and cheating. challenges are different. For example, REACH recently funded a grant to fund an intervention The alternative is to create a parallel in Colombia to redesign the country’s EMIS structure to gather the information required around the objective of education quality, thus to implement the RBF intervention. While creating SIGCE (the Management System for this may be more reliable than working Education Quality). 
The team avoided pitfalls with existing government systems, it can such as creating a whole new system with the be a missed opportunity to strengthen sole purpose of enforcing accountability since state capacity, and it goes against the aid that would have imposed more administrative effectiveness agenda.185 Our interviews also costs on different government agents with revealed practitioners’ frustration that, even no clear benefit. Instead, the objective was to though most education projects include an design a useful and user-friendly tool to enable EMIS as a component, there is very rarely a stakeholders to understand the weaknesses functioning EMIS in-country after the projects and strengths of the education system and to have closed. One interviewee echoed the idea guide policymaking.182 Essentially, the new that a complicated and high-tech EMIS may system was meant to be a management tool for be unnecessary and difficult to maintain, education stakeholders to use proactively. particularly at the school or health center level. 181 Majgaard et al (2018) 182 Cerdán-Infantes (2018) 183 Andrews et al (2017) and Andrews (2013) 184 Holzapfel and Janus (2015) 185 Holzapfel and Janus (2015) Part II – RBF and Governments 76 Instead, it might be more beneficial to simply verifiable data that both sides can agree on, use the existing manual systems and improve this theoretical relationship is broken. The on them by including more data fields to be agent has no way to convey to the principal collected or by strengthening the validity and which results have been attained, and the reliability of the data collection process. principal has no way to know whether any of the information that it receives is reliable. An interesting innovation in terms of data collecting and monitoring is the use of open Once donors and country governments can source techniques. 
Open source data collection agree on a system for collecting and reporting was pioneered by Cordaid in several RBF the data pertaining to the intervention, the projects in Africa.186 For example, in one of most important question is who will be their health interventions, health centers are responsible for carrying out the verification. requested to draft their business plans and then This can be done by central governments or enter them in the online platform that is open line ministries, local service delivery agencies, for everyone to access. Health centers then or external firms or NGOs.190 Whoever carries report their performance data in a monthly or it out, the most common option is to use EMIS quarterly fashion, and this is also published data, if available. This was the approach used on the platform. Since the data are entered by the Ethiopia intervention piloted by the by the centers themselves in a decentralized DFID, which was discussed earlier.191 manner (through computers, smartphones and Using international organizations to carry out tablets), this makes the process easier. Once the the verification is usually the costliest option data are entered, they are validated externally, since it is done through direct supervision. for example by the donor team. The local Conflict of interest can be a problem in these community can also verify in the system that situations since the donor can sometimes be the services are being provided. If the results under pressure to disburse the funds because add up, funds may then be disbursed.187 It is disbursements are often a metric of success. 
important to note that the system still relies on Our interviews with practitioners indicated monthly verification visits to ensure that the that using this type of verification can send data are not being misreported.188 a signal to the government that the donor Verification distrusts them, though sometimes, it is the government itself that requests that type of Even with well-designed monitoring and verification. In any case, using international information systems in place, RBF, by its organizations to carry out the verification is very nature, depends on the verification of unsustainable over the long run because it results. Without credible systems that can does not build local capacity nor does it help to evaluate whether a target or result was attained improve and sustain national systems. Having (whether run by the government or otherwise), the government carry out the verification can RBF will not work. One of the reasons that is be cheaper and can build state capacity, but often cited for the use of RBF is that it is a way once again there is an even greater conflict of to align the interests of the principal (donor) interest since it would have an incentive to say and the agent (recipient),189 but if there are no that the results have been achieved even if this 186 Cordaid (2014) and Results for Development (2016) 187 Cordaid (2014) 188 Lurton (2018) 189 Results for Development (2016) 190 Results for Development (2016) 191 Results for Development (2016) and Cambridge Education (2015) Part II – RBF and Governments 77 Box 3: Perverse Incentives, Gaming, and Corruption During the Implementation of RBF Projects It is not controversial to say that all types of development financing, RBF or not, are susceptible to perverse incentives, gaming, and corruption. The question is whether there are more instances of these kinds of behavior in RBF projects than in traditionally financed projects. 
The International Anti-Corruption Resource Center (IACRC) has estimated that private sector corruption in traditional development financing has cost developing countries over US$500 billion (IACRC, 2018). In response, the IACRC has issued guidance on how to spot the biggest red flags in development projects, including bid rigging, collusive bidding, and fraud. These are mostly related to procurement and tend to be in the form of inflated invoices or outright fraudulent transactions as was the case in the infamous Cashgate scandal in Malawi (The Economist, 2014). Broadly speaking, procurement is often not a part of RBF projects, or rather, procurement does not follow donor agency rules (another reason why country systems are needed for RBF to work). Thus, there has not been any definitive evidence of such corruption in RBF projects, but our survey interviewees talked about how these projects can create unanticipated perverse incentives to game data. In Cameroon, one development agency staff member noticed that, when the verification team made a school visit and asked the headteacher for a document, the headteacher said that she had to look for it in her office and came back 20 minutes later with a brand new, fake paper. This is one of the reasons why teams often insist on independent, third-party verification. Another perverse incentive that RBF can introduce is to move the focus of actors from the long term to the short term (Bond, 2017). This was the case in the GEC, where small NGOs ended up prioritizing specific targets rather than working to develop sustainable, long-term gains. is not the case. The third option is to use an that there will be a limited number of actors independent verification agent such as a local available for verification (for example, from NGO or an audit firm. While this can ensure a large research university) to provide both impartiality, it can also lead to higher costs, technical guidance to the government and both in time and money. 
In fragile and conflict independent verification. Since verification states, there are additional issues. First, there agents need to be independent, which usually may not be any reliable local organizations means that they will not be eligible to help able to carry out the verification, and second, to implement other project activities. For as a recent policy note on RBF experiences example, in Nepal, civil service organizations in Niger has suggested, the procurement were upset that the University of Kathmandu processes to hire verification firms can be very was going to be the independent verification lengthy, which would delay implementation agent for a GPE grant because that meant and verification.192 Conflicts of interest can that the University of Kathmandu could not also exist in these situations, though these provide technical guidance on the program. will manifest differently than in the case of Per the World Bank DLI analysis, 202 DLIs out governments. In countries that lack sufficient of 352 (about 57 percent) required third party capacity to collect and analyze data, it is likely verification, while roughly 28 percent relied on 192 Majgaard et al (2018) Part II – RBF and Governments 78 national institutions for verification, e.g. the of its bonus payments on verification.193 State Examination Board or General Auditor. Therefore, context and, more importantly, the The remaining DLIs were to be verified by the verification mechanism matter. World Bank, which generally involves World How effective is traditional verification that Bank teams checking nationally produced data relies on human supervision (usually by the (e.g. Ministry of Education documents or status donor, a third party, or the government)? reports). 
Unfortunately, this sort of verification can In practice, many teams will opt for third often fail to identify misreporting, even if party verification initially as a way to build the percentage of data supervised/checked capacity (usually by pairing the firm with a is large.194 Thus, unless it is widespread, local agency) and then transfer responsibility misreporting could go unnoticed. This for the verification process to national agencies is a troubling finding since many RBF subsequently. This means a high initial interventions rely on these spot checks. investment, as described previously, but for A recent verification system that has been a worthwhile pay-off. For example, in Sierra garnering some interest is automated (or Leone, the team used an independent, third- algorithmic) monitoring in which machine party firm for verification in year 1, which learning and algorithmic techniques are used was very expensive and still involved some to pinpoint possible problems in the data. The quality issues, but are now using government hope is that using this approach would reduce systems in year 2, which has required revising verification costs, for example, because the and strengthening protocols (such as rotating algorithm will come up with a list of schools, enumerators every quarter, sending multiple clinics, or other fund recipients that may be enumerators to the same school per quarter misreporting data. (sample based), and having district officers make spot checks at schools). This tiered Researchers tried this method out on data from approach has also been adopted by a project in a health RBF intervention in Zambia.195 They Nepal, where the disbursement of funds was tested several machine learning algorithms conditional on an independent audit of the to see whether any of them could successfully EMIS, thus giving the government an incentive predict which clinics were over-reporting to have the system evaluated and to make data. 
For this, they used a sample of reported the necessary improvements so that future and verified data from 140 clinics. What they verification would be more reliable. found was that some algorithms196 were quite successful at identifying which clinics are There is no comprehensive overview of the likely to be misreporting data. An additional relative costs of verification mechanisms, but benefit is that these algorithms should get there is some evidence that these costs vary a increasingly accurate as more data are entered. lot. A recent study has calculated verification They can also be used to increase the accuracy costs as a fraction of total program costs for and reliability of existing administrative health interventions, with a 2011 intervention datasets. in Burundi having verification costs equal to 1 percent of the project’s total cost, whereas Another methodology using a series of Plan Nacer having spent around 10 percent of algorithmic simulations has been proposed its maximum bonus payments on verification. as a way to pinpoint potential data issues A program in Benin paid as much as 50 percent and has been tested using data from a 193 Grover et al (2018) 194 Grover et al (2018) 195 Grover et al (2018) 196 Specifically, the random forest algorithm, see paper for more details. Part II – RBF and Governments 79 health intervention in Benin.197 The results learning. However, when the students took of this experiment showed that using this a complementary test that measured the methodology could bring verification costs same skills, the improvement disappeared. down from 30 percent to around 20 percent. It seemed that the teachers had focused on Thus, automated methods are a promising teaching kids how to take the government possibility for bringing down the costs of exam rather than on developing their skills and verification in RBF projects. 
However, the improving the teaching content as activities evidence so far is mostly based on simulations such as homework and teacher attendance had using existing data. remained unchanged, but the number of exam preparation sessions had increased.199 Gaming and Cheating There may be several ways to get around this. Even the most well-designed information and One way, as previously discussed, might be to monitoring system can be prone to gaming distribute the funding upon the achievement and cheating. Any indicator that is used to of a range of indicators rather than just one distribute funding de facto becomes a high or two. This would relieve the pressure on the stakes indicator. recipient as they would receive funding even Several factors are likely to increase the if they only achieved some but not all of the likelihood of strategic or gaming behavior indicators. One example of an intervention happening in education interventions: (i) that used several indicators was the Big the higher the stakes are; (ii) the longer the Results Now program in Tanzania. The only intervention has been operating; and (iii) final outcome indicator for education (linked the less able the agents or recipients are to to disbursements) was an improvement in influence the actual outcome.198 These make students’ reading, writing, and arithmetic intuitive sense. If an indicator determines skills, but it only represented 13 percent of whether or not agents receive a large sum of the total budget.200 Also, to minimize gaming, money, they are more likely to report having the final outcome was measured using a achieved it regardless of whether or not this is new test of reading, writing, and arithmetic actually the case. Similarly, the more helpless implemented in only a random sample of agents feel in terms of controlling the results, schools. The program’s designers argued that the more likely they are to game it. 
If agents this would decrease the risk of gaming because have no control over an outcome that is being it would not give teachers and schools a chance used as a metric, this defeats the purpose of to prepare students for the test in advance. RBF. However, this did not prevent gaming in other areas of the program. Since increasing passing One way to evaluate whether the results have rates was one of the objectives of Big Results really been achieved is to use complementary Now, many schools prevented struggling indicators. For example, an intervention students from taking the exam in order to in Kenya rewarded teachers for improving improve their metrics.201 students’ test scores in the government exam, and these subsequently were seen to There is not a lot of concrete evidence of have improved. So it seemed as though the perverse incentives when it comes to RBF program had led to improvements in student activities in the health sector, though that 197 Lurton (2018) 198 Murnane and Ganimian (2014) 199 Kremer et al (2010) cited in Murnane and Ganimian (2014) 200 Janus and Kejzer (2015) 201 Cilliers et al (2018) Part II – RBF and Governments 80 might be due to the fact that the monitoring should achieved in a meaningful way, e.g. not of perverse effects is usually not built into the just counting the number of teachers who are design of these projects.202 More analysis has trained but ensuring that they apply what they been done on the negative effects of conditional learned in the classroom. cash transfers (CCTs), which include The easy solution might seem to be to simply cherry picking or cream skimming where design indicators better or to create stringent, governments and/or organizations target those quality-oriented protocols, but in the end, populations that are more likely to achieve the the indicator still must be measurable (and targets, which can increase inequity. In our thus, quantifiable). 
Also, it is important to survey interviews, practitioners mentioned state that verification cannot, and does not, that the most common form of gaming was dictate every action that a relevant actor takes the tendency for clients to make no effort to to achieve that indicator. In this regard, the achieve non-remunerated indicators. examples used in this section are meant to In general, there is no strong evidence of an show the realities of implementation through overwhelming number of negative unintended the lens of verification, which is what many consequences associated with RBF, though practitioners spoke about when asked in teams may not be establishing the monitoring our interviews about how they ensured mechanisms needed to catch such undesirable quality implementation, perhaps because behavior. it is more documented than other facets of implementation. In our interviews with practitioners, some seemed to accept the inevitability of gaming but For example, one of the very first Program for emphasized that teams should first observe the Results at the World Bank was the Bridges perverse behavior in action before re-designing Improvement and Maintenance Program in parts of the project to minimize them. Nepal. To date, this is one of two PforRs at the World Bank that are now closed and is Implementation Quality the only one for which an implementation In RBF projects, there is a lot of emphasis on completion report is publicly available. In the design. Many researchers and practitioners completion report, the team wrote about the alike will insist that the design of the incentives challenges that they faced while implementing will make or break the project. However, even the program, namely, that quality was difficult if a project is well-designed, implementation to achieve. 
One of the DLIs was “new bridges is never easy to predict, yet it is exactly how built or improved on,” which was measured indicators are achieved that really matters for by the length in meters of the bridges that long-term impact. Donors often seek to control had been built or improved. The completion how an indicator is achieved by putting in place report indicated that poor construction quality verification protocols, which are steps that was a constant issue and that the National must be checked to validate that an indicator Planning Commission had recommended that has been achieved. disbursements should not be made for four new bridges because of their poor quality. The One of the issues with verification protocols World Bank team then had to work closely with in RBF projects is that they can become the Department of Roads (DoR) to improve mechanical, quantitative exercises, which the quality of construction quality because the defeats the purpose of ensuring quality DoR staff did not have the requisite capacity implementation. Quality implementation is to carry out the quality assurance.203 To do difficult to explicitly define, but indicators 202 Grittner (2013) 203 World Bank (2018b) Part II – RBF and Governments 81 this, the team had to add a technical assistance quality and allowing for some flexibility during component to the project. implementation. These types of quality issues are not unique to Generally speaking, protocols that are infrastructure and are often even more difficult developed jointly by the project team and the to address in the education sector. 
In Ghana, a project called for the distribution of iBoxes, devices loaded with video lessons, interactive applications, and video tests as well as over 3,000 additional open source resources. One of the project’s DLIs was to ensure that teachers received training in how to use the iBox and that the iBoxes were functional. While this DLI was independently verified, the team discovered that the government had not taken into account the quality of the results or their sustainability. In other words, in many instances, the iBoxes are not being used in classrooms, even though the teachers have completed the training.

An example that illustrates the difficulty of balancing stringency and quality control in protocol design comes from a project in Uganda financed with GPE funds. The project team very carefully thought through the protocols to be included and wanted not only to train the teachers but also to ensure that there was evidence that the teachers had applied some of what they learned in their training. They came up with a four-step verification protocol that required proof that: (i) a training schedule existed; (ii) teachers had been given initial training; (iii) teachers had participated in supervision/support meetings with a coach; and (iv) teachers had received training materials. As the verification process was underway, the team learned that it was very difficult to track whether or not teachers had met with their coaches because the coaches often forgot to sign a log indicating that they had met with the teachers.

In the end, the team saw this as a lesson learned and an opportunity to stress the importance of documentation with teachers and their coaches, and was able to continue monitoring all four steps while making disbursements when only three were met. There needed to be a balance between ensuring [...] client can improve the way in which results are achieved as well as provide the necessary flexibility to make adjustments when needed, as has been necessary in several projects (in Moldova and the Dominican Republic, for example). In addition, many teams have incorporated both top-down and bottom-up accountability measures. In some performance-based school grant schemes, schools must submit to the provincial level improvement plans and/or evidence that they have used the money for eligible items, while the community must also vouch that the school received the funds. For example, under the Indonesia performance grant program funded by REACH, schools were required to spend the grant money either on teacher training or on learning equipment.

In general, high quality implementation requires technical assistance. Every one of our interviewees stressed the importance of working with their counterparts to help them to achieve the indicators. For example, the ADB has built a “review and corrective action” mechanism into its Sri Lanka Education Sector Development Program, which enabled the team to make changes to the design whenever they were faced with challenges without having to halt their pilot programs. This feedback loop of identifying problems and then correcting the course of the program helped the team to work more effectively with implementing agencies to achieve better results. This is known as adaptive implementation.

Another survey respondent discussed a similar situation in Lebanon. “In Lebanon, we have invested in parallel technical assistance support to the Ministry of Education to ensure that comprehensive planning takes place that focuses both on how to achieve the DLIs but also how to strengthen the system as a whole.”

Experience in the health sector has shown the importance of adaptive implementation, with adaptation and change during a project’s implementation being “the norm rather than the exception” in the sector.204 RBF can enable flexible course correction since it involves the setting of targets but does not dictate the way in which to achieve them.

Investing in technical assistance often adds to the cost of an RBF activity, but it is often necessary for successful and meaningful implementation, especially in the many countries that are still familiarizing themselves with RBF.

As mentioned above, adaptive implementation means building flexibility into the design of RBF projects to enable teams to correct course when necessary and, therefore, it must be context-specific. While there are fewer studies on implementation in the education sector than in the health sector, the literature in health has confirmed the importance of customizing project design (what is called “artisanal RBF”205) and of disseminating information about RBF. For example, in an intervention in Nigeria, it was found that part of the variation in performance within a single RBF scheme was due to implementation factors such as how much the implementing agents understood about RBF or how well front-line agents communicated with different government levels and vice versa.206

In a recent study on RBF in the health sector in fragile and conflict states, a team of researchers found that most successful interventions had had to adapt their original plans and come up with local innovations in order to improve delivery.207 A program in South Kivu (in the Democratic Republic of Congo) included mechanisms such as non-performance-based payments to jumpstart operations (bonus de demarrage) and harnessing the support of the community for the rehabilitation of health facilities, whereas in Adamawa State (Nigeria), regular monthly meetings were set up between donors, local authorities, and implementers to improve coordination and avoid conflicts. These were ad hoc solutions to problems that were not discovered until after implementation had started.

With RBF projects, there is usually no prescribed implementation plan in place, and it is difficult to predict what challenges are likely to be encountered. As in the Nigeria example above, sometimes working groups have been set up to come up with good practices and solutions to implementation problems.208 This can also be done at the central level. Some of these solutions may be idiosyncratic and unique to a particular national context. For example, Rwanda introduced community verification of DLIs as a tool, while Burundi tested out the complementarities of reducing user fees and implementing RBF at the same time in healthcare delivery.209 The key is to understand that RBF is not a static modality or blueprint but a process of continued improvement.210

204 Ridde et al (2018)
205 World Bank (2017)
206 Ma-Nitu et al (2018)
207 Bertone et al (2018)
208 Ma-Nitu et al (2018)
209 Ma-Nitu et al (2018)
210 Ma-Nitu et al (2018), Bertone et al (2018)

Failure to Achieve Targets

There is not much evidence on how often recipients fail to achieve their targets and what the repercussions are. One of the critiques of RBF is that disbursements will happen regardless of results because it is too politically difficult to withhold funds. In many cases, development agency teams simply change the indicators (60 percent of survey respondents indicated that was what they had done). In fairness, even non-RBF projects end up changing indicators, and teams argue that they revise indicators to reflect changing priorities and realities on the ground and not based on their whims.

There is some concrete evidence that teams have taken measures to either reduce or withhold funds when they felt that things were not going well (54.3 percent of respondents indicated that they have done just that). In the Burundi Common Education Fund (2011-2015), the Bank project team reduced transfers to some schools and meso-level institutions if their internal controls and accounting had been subpar in the previous year.

Out of the 51 results-based projects in the World Bank portfolio, only six have closed, and only four have completion reports. Of the four, two projects in Jamaica did not disburse the full loan amount because the government had miscalculated the total loan amount that it required. The completion reports for the other two projects, in Bangladesh and Pakistan, indicated that the total commitment amount was not disbursed due to unmet DLIs. In the Bangladesh Primary Education Project, the team revised the indicators several times and ended up canceling several unmet DLIs (US$8.3 million out of the total US$100 million was canceled).211 In the Pakistan Tertiary Education Support Project, US$77.82 million of the total US$222.18 million was canceled as a result of the non-achievement of disbursement-linked indicators. In the completion report, the independent auditor indicated that, although the team had lowered the DLI targets using a “thin” justification, the new targets helped to move the project forward, and the attainment of the outputs suggested that there was substantial improvement in learning conditions.212

During our interviews, some practitioners reported that some canceled funds had simply been reallocated to a non-RBF portion of the project. This was not the case in Bangladesh or Pakistan, and there is no substantial evidence to corroborate that observation, but further analysis could be done to investigate how often funds are genuinely canceled as opposed to redistributed.

For projects that have not yet closed, disbursement flexibility is still a key feature, with one respondent writing, “we are withholding funds for now as many of our DLIs have a roll-over option.”

Undoubtedly, all stakeholders want to have successful projects, regardless of RBF, and are likely to do what is in their power to ensure a positive outcome. There is a fine line between incentivizing and rewarding the effort that goes into achieving results and signaling to country governments that there are no real consequences if indicators are not met. If the latter occurs, then the promise of RBF is diminished. As emphasized previously, if information about RBF is widely disseminated upfront, then the bad news of withholding funds based on non-performance should not be unexpected and should thus be somewhat more palatable.

211 World Bank (2018c)
212 World Bank (2018d)

Sustainability

An area where questions remain, and where there is little research available, is the sustainability of RBF activities. Generally, the sustainability of results seems promising for areas such as conditional cash transfers. For example, children who experienced PROGRESA/Oportunidades in Mexico have been shown to have better education and labor market outcomes years after the program (though for other transfers the effects have faded out).213/214 However, for interventions such as teacher incentives or performance grants, the evidence seems to be lacking, to a great extent because most are not designed to measure their long-term effects.

In other areas such as health, there have been several examples that suggest that incentives can have a lasting effect, particularly a randomized controlled trial in Argentina conducted with Plan Nacer and the previously mentioned Cordaid intervention in Uganda. Plan Nacer is a well-known program that provides health insurance to otherwise uninsured pregnant women and children. In a field experiment, the program randomly paid a 200 percent premium to treatment clinics that initiated prenatal care, and a study of this intervention found that the clinics that were paid the premium had a 34 percent higher rate of initiation of services than those that were not paid the premium.215 The study found that these higher levels of initiation of prenatal care persisted for at least 18 months, and in some cases 24 months, after the incentives ended. This evidence suggests that temporary incentives can produce longer-term behavioral changes without creating unsustainable financing expectations. Plan Nacer is unusual in that it is a long-running government-sponsored program that was originally funded by the World Bank and the central government, with the provincial governments taking on responsibility for 30 percent of the total costs later on.216 Also, the central government was already financially solvent, which is not the case for all countries.217

213 Behrman et al (2009)
214 Parker and Vogl (2018)
215 Celhay et al (2015)
216 Center for Global Development (2018)
217 Berman (2015)

Concluding Remarks

Currently, there is more robust evidence available on how RBF can have an impact on education at the level of specific groups and individuals (such as teachers) and less evidence available on its effectiveness at the national or programmatic level.
However, in some ways, aligning donors and clients with the aim of prioritizing the achievement of results, whether they are inputs or processes or outcomes, is a legitimate goal in and of itself. Unsurprisingly, RBF cannot serve as a substitute for a strong theory of change, nor can it compensate for improperly identifying the types of binding constraints in an education system that can be unlocked by incentives.

In particular, international development will forever be anchored in country context, and until there are more evaluations of RBF and governments in education, it will be difficult to make broader statements about its effectiveness. The evidence base continues to grow as new research comes out (for example, REACH is funding three country-level assessments of RBF as well as a round of proposals that focus on incentives at the district/regional/provincial level).

To improve the odds of lasting change, design and implementation matter a great deal, and RBF often requires even more planning than a traditionally financed project. For RBF to work well, stakeholders must genuinely think through the results chain, and not fall into the habit of identifying disparate activities that could be financed. RBF forces donors and countries alike to think about how those different activities interact in the education system, and which behaviors might respond to incentives.

To this end, evidence thus far shows:

1. RBF and teachers: Teacher incentives can, but do not always, improve teacher attendance and student learning. The design of the incentive scheme and the context matter. The effects are larger and more positive in developing country contexts.
2. RBF and students and families: Student and family incentives (such as CCTs) have a good track record of reducing school dropout and increasing school attendance, though the evidence for their effects on student learning is more mixed. Conditional transfers to students tied to their own learning are a promising area of future research.
3. RBF and schools: The evidence base on the effectiveness of performance-based grants is still quite limited. For now, it seems that in some cases they can work, especially when grants are combined with other interventions such as capacity building (for example, for principals and school committees) or when money is spent on inputs that affect learning outcomes.
4. Synergies: There is growing evidence that combining different RBF interventions within the same program can generate results that go beyond the sum of any two interventions alone. Though the research is limited, this suggests that RBF that tackles several bottlenecks at once can have larger effects.

For RBF and Governments, several key criteria are critical for more effective RBF:

1. Choosing RBF as the appropriate financing modality requires careful consideration of political commitment, an understanding of the risks and costs involved, and attention to country context (for example, capacity and country systems).
2. RBF project design should prioritize the cascading of incentives and should select and price indicators with an objective or methodology in mind. Some of these include cost effectiveness, increasing the chances of achieving other indicators, or reducing the risk of nonpayment.
3. RBF project implementation should consider the purpose of monitoring and information systems, invest upfront in verification, and be adaptive and flexible in order to address realities on the ground.

Ultimately, there is proof that RBF can have a positive impact on learning conditions and, in rare instances, on increasing learning itself. This makes it a powerful financing modality for policymakers around the world to consider using in their education sectors.

References

ADB (2013). Education Sector Development Program (2013-2017). Disbursement Linked Results and Indicators.

Ashraf, N., Bandiera, O., & Lee, S. S. (2014). Do-gooders and go-getters: career incentives,
selection, and performance in public service delivery. STICERD-Economic Organisation and Public Policy Discussion Papers Series, 54.

ADB (2014). Supporting Kerala’s Additional Skill Acquisition Program in Post-Basic Education. Program Fiduciary Systems Assessment.

ADB (2016). Midterm Review of Results-Based Lending for Programs.

ADB (2017). Results-Based Lending at the Asian Development Bank: An Early Assessment.

Akresh, R., De Walque, D., & Kazianga, H. (2013). Cash transfers and child schooling: evidence from a randomized evaluation of the role of conditionality. The World Bank.

Al-Samarrai, S., Shrestha, U., Hasan, A., Nakajima, N., Santoso, S., & Wijoyo, W. H. A. (2018). Introducing a performance-based component into Jakarta’s school grants: What do we know about its impact after three years? Economics of Education Review, 67, 110-136.

Andrews, M. (2013). The limits of institutional reform in development: Changing rules for realistic solutions. Cambridge University Press.

Andrews, M., Pritchett, L., & Woolcock, M. (2013). Escaping capability traps through problem driven iterative adaptation (PDIA). World Development, 51, 234-244.

Andrews, M., Woolcock, M., & Pritchett, L. (2017). Building state capability: Evidence, analysis, action. Oxford University Press.

Ashraf, N. (2009). Spousal control and intra-household decision making: An experimental study in the Philippines. American Economic Review, 99(4), 1245-77.

Baird, S., McIntosh, C., & Özler, B. (2011). Cash or condition? Evidence from a cash transfer experiment. The Quarterly Journal of Economics, 126(4), 1709-1753.

Baird, S., Ferreira, F. H., Özler, B., & Woolcock, M. (2014). Conditional, unconditional and everything in between: a systematic review of the effects of cash transfer programmes on schooling outcomes. Journal of Development Effectiveness, 6(1), 1-43.

Barlevy, G. & Neal, D. (2012). Pay for percentile. American Economic Review, 102(5), 1805-31.

Barón, J. & Adelman, M. (2018). HAITI: Can Pre-Conditions for RBF be Established in Fragile States? REACH Policy Note.

Barrera-Osorio, F. & Raju, D. (2015). Teacher performance pay: Experimental evidence from Pakistan.

Barrera-Osorio, F. & Ganimian, A.J. (2016). The barking dog that bites: Test score volatility and school rankings in Punjab, Pakistan. International Journal of Educational Development, 49, 31-54.

Barrera-Osorio, F., Bertrand, M., Linden, L. L., & Perez-Calle, F. (2011). Improving the design of conditional transfer programs: Evidence from a randomized education experiment in Colombia. American Economic Journal: Applied Economics, 3(2), 167-95.

BBC (2018). Guinea country profile. https://www.bbc.com/news/world-africa-13442051. Accessed July 2018.

Barham, T., Macours, K., & Maluccio, J.A. (2013). More schooling and more learning? Effects of a three-year conditional cash transfer program in Nicaragua after 10 years (No. IDB-WP-432). IDB Working Paper Series.

Behrman, J.R., Parker, S.W., & Todd, P.E. (2009). Schooling impacts of conditional cash transfers on young children: Evidence from Mexico. Economic Development and Cultural Change, 57(3), 439-477.

Behrman, J.R., Parker, S.W., Todd, P.E., & Wolpin, K.I. (2015). Aligning learning incentives of students and teachers: results from a social experiment in Mexican high schools. Journal of Political Economy, 123(2), 325-364.

Berman, D. (2015). Argentina: can short term incentives change long term behavior?

Bernal, P., Celhay, P., & Martinez, S. (2018). Is results-based aid more effective than conventional aid? Evidence from the health sector in El Salvador (No. IDB-WP-859). IDB Working Paper Series.

Berry, J. (2015). Child Control in Education Decisions: An Evaluation of Targeted Incentives to Learn in India. Journal of Human Resources, 50(4), 1051-1080.

Bertone, M. P., Jacobs, E., Toonen, J., Akwataghibe, N., & Witter, S. (2018). Performance-based financing in three humanitarian settings: principles and pragmatism. Conflict and Health, 12(1), 28.

Birdsall, N. & Barder, O. (2006). Payments for Progress: A Hands-Off Approach to Foreign Aid (No. 102).

Birdsall, N. & Savedoff, W.D. (2012). Cash on delivery: a new approach to foreign aid. CGD Books.

Blimpo, M. P. (2014). Team incentives for education in developing countries: A randomized field experiment in Benin. American Economic Journal: Applied Economics, 6(4), 90-109.

Blimpo, M.P., Evans, D.K., & Lahire, N. (2015). Parental human capital and effective school management: evidence from The Gambia. The World Bank.

Bloom, N., Eifert, B., Mahajan, A., McKenzie, D., & Roberts, J. (2013). Does management matter? Evidence from India. The Quarterly Journal of Economics, 128(1), 1-51.

Bloom, N., Lemos, R., Sadun, R., & Van Reenen, J. (2015). Does management matter in schools? The Economic Journal, 125(584), 647-674.

Bond (2017). Does ‘skin in the game’ improve the level of play? https://www.bond.org.uk/resources/does-skin-in-the-game-improve-the-level-of-play

Bowles, S. & Polania-Reyes, S. (2012). Economic incentives and social preferences: substitutes or complements? Journal of Economic Literature, 50(2), 368-425.

Cambridge Education (2015). Evaluation of the Pilot Project of Results-Based Aid in the Education Sector in Ethiopia.

Carneiro, P.M., Koussihouèdé, O., Lahire, N., & Meghir, C. (2016). School grants and education quality: experimental evidence from Senegal.

Celhay, P., Gertler, P., Giovagnoli, P., & Vermeersch, C. (2015). Long-run effects of temporary incentives on medical care productivity. The World Bank.

Center for Global Development (2018). Argentina’s Plan Nacer. Paying for Provincial Performance in Health. http://millionssaved.cgdev.org/case-studies/argentinas-plan-nacer

Cerdán-Infantes, P. (2018). Developing an Effective Management and Information System for Education Quality in Colombia.

Cilliers, J., Kasirye, I., Leaver, C., Serneels, P., & Zeitlin, A. (2013). Improving teacher attendance using a locally managed monitoring scheme: Evidence from Ugandan Primary Schools. Rapid response paper for International Growth Centre.

Cilliers, J., Mbiti, I., & Zeitlin, A. (2018). Accountability and school performance: Evidence from Big Results Now in Tanzania. RISE Conference.

Clist, P. (2016). Payment by results in development aid: All that glitters is not gold. The World Bank Research Observer, 31(2), 290-319.

Clist, P. (2018). Payment by results in international development: Evidence from the first decade. Development Policy Review.

Clist, P. & Dercon, S. (2014). 12 Principles for Payment By Results (PbR) In International Development. DFID Research for Development. London: DFID.

Clist, P. & Verschoor, A. (2014). The conceptual basis of payment by results. London: DFID.

Coffey (2016). Girls Education Challenge Process Review Report.

Cordaid (2014). Open Development Movement. Co-Creation Leads to Transformation. Cordaid Position Paper.

Cordaid (2017). Strengthening Health Systems through RBF.

Cruz-Aguayo, Y. & Martinez, S. (2016). Setting Targets for Results Based Financing Programs: A Simple Cost Benefit Framework. Inter-American Development Bank.

Das, J., Dercon, S., Habyarimana, J., Krishnan, P., Muralidharan, K., & Sundararaman, V. (2013). School Inputs, Household Substitution, and Test Scores. American Economic Journal: Applied Economics, 29-57.

Dauphin, A., Lahga, E., Fortin, B., & Lacroix, G. (2011). Are Children Decision-Makers within the Household? The Economic Journal, 121(553), 871-903.

De Ree, J., Muralidharan, K., Pradhan, M., & Rogers, H. (2015). Double for nothing? Experimental evidence on the impact of an unconditional teacher salary increase on student performance in Indonesia (No. w21806). National Bureau of Economic Research.

DFID (2014). DFID’s Approach to Payment by Results.

Duflo, E., Hanna, R., & Ryan, S.P. (2012). Incentives work: Getting teachers to come to school. American Economic Review, 102(4), 1241-78.

Evans, D. K. & Popova, A. (2016). What really works to improve learning in developing countries? An analysis of divergent findings in systematic reviews. The World Bank Research Observer, 31(2), 242-270.

Fiszbein, A. & Schady, N.R. (2009). Conditional cash transfers: reducing present and future poverty. The World Bank.

Fryer, R. G. (2013). Teacher incentives and student achievement: Evidence from New York City public schools. Journal of Labor Economics, 31(2), 373-407.

Fryer Jr, R.G., Levitt, S.D., List, J., & Sadoff, S. (2012). Enhancing the efficacy of teacher incentives through loss aversion: A field experiment (No. w18237). National Bureau of Economic Research.

Galiani, S. & McEwan, P.J. (2013). The heterogeneous impact of conditional cash transfers. Journal of Public Economics, 103, 85-96.

Garbers, Y. & Konradt, U. (2014). The effect of financial incentives on performance: A quantitative review of individual and team-based financial incentives. Journal of Occupational and Organizational Psychology, 87(1), 102-137.

Gelb, A., Diofasi, A., & Postel, H. (2016). Program for results: the first 35 operations. Center for Global Development Working Paper 430.

Gertler, P. (2004). Do conditional cash transfers improve child health? Evidence from PROGRESA’s control randomized experiment. American Economic Review, 94(2), 336-341.

Gertler, P.J., Patrinos, H.A., & Rubio-Codina, M. (2012). Empowering parents to improve education: Evidence from rural Mexico. Journal of Development Economics, 1(99), 68-79.

Gilligan, D.O., Karachiwalla, N., Kasirye, I., Lucas, A., & Neal, D.A. (2018). Educator Incentives and Educational Triage in Rural Primary Schools.

Glewwe, P., Ilias, N., & Kremer, M. (2010). Teacher incentives. American Economic Journal: Applied Economics, 2(3), 205-27.

Glewwe, P. & Kassouf, A.L. (2012). The impact of the Bolsa Escola/Familia conditional cash transfer program on enrollment, dropout rates and grade promotion in Brazil. Journal of Development Economics, 97(2), 505-517.

Glewwe, P. & Muralidharan, K. (2016). Improving education outcomes in developing countries: Evidence, knowledge gaps, and policy implications. In Handbook of the Economics of Education (Vol. 5, pp. 653-743). Elsevier.

Gneezy, U., Meier, S., & Rey-Biel, P. (2011). When and why incentives (don’t) work to modify behavior. Journal of Economic Perspectives, 25(4), 191-210.

Grittner, A.M. (2013). Results-based financing: evidence from performance-based financing in the health sector.

Grover, D., Bauhoff, S., & Friedman, J. (2018). Using Supervised Learning to Select Audit Targets in Performance-Based Financing in Health: An Example from Zambia (No. 481).

Holzapfel, S. & Janus, H. (2015). Improving Education Outcomes by Linking Payments to Results-An Assessment of Disbursement-Linked Indicators in Five Results-Based Approaches.

IACRC (2018). International Anti-Corruption Resource Center. https://iacrc.org/.

ICAI (2018). DFID’s approach to value for money in programme and portfolio management: A performance review.

IEG (2013). ICR Review of the Pakistan Sindh Education Sector Project.

IEG (2016). Program-for-Results. An Early-Stage Assessment of the Process and Effects of a New Lending Instrument.

Imberman, S.A. (2015). How effective are financial incentives for teachers? IZA World of Labor.

Islam, A., Kwon, S., Masoon, E., Prakash, N., & Sabarwal, S. (2017). Information before Incentives: Goals, Expectations, and Effort among Zanzibari secondary students.

Janus, H. & Keijzer, N. (2015). Big results now? Emerging lessons from results-based aid in Tanzania.

Kerwin, J. & Thornton, R.L. (2018). Making the Grade: The Sensitivity of Education Program Effectiveness to Input Choices and Outcome Measures.

Kim, J.S. & Sunderman, G.L. (2005). Measuring academic proficiency under the No Child Left Behind Act: Implications for educational equity. Educational Researcher, 34(8), 3-13.

Klingebiel, S., Gonsior, V., Jakobs, F., & Nikitka, M. (2016). Public Sector Performance and Development Cooperation in Rwanda. https://www.researchgate.net/publication/311256916_Imihigo_and_Development_Cooperation_What_Kind_of_Relationship.

Kremer, M., Glewwe, P., & Ilias, N. (2010). Teacher incentives. American Economic Journal: Applied Economics, 2(3).

Lavy, V. (2002). Evaluating the effect of teachers’ group performance incentives on pupil achievement. Journal of Political Economy, 110(6), 1286-1317.

Lavy, V. (2009). Performance pay and teachers’ effort, productivity, and grading ethics. American Economic Review, 99(5), 1979-2011.

Lazear, E.P. (2003). Teacher incentives. Swedish Economic Policy Review, 10(2), 179-214.

Loyalka, P.K., Sylvia, S., Liu, C., Chu, J., & Shi, Y. (2016). Pay by Design: Teacher Performance Pay Design and the Distribution of Student Achievement.

Lurton, G. (2018). First approach to algorithmic RBF supervision. https://grlurton.github.io/orbf_data_validation/Analysis.html

Ma-Nitu, S. M., Tembey, L., Bigirimana, E., Dossouvi, C. Y., Basenya, O., Mago, E., Salongo, P.M., Zongo, A., & Verinumbe, F. (2018). Towards constructive rethinking of PBF: perspectives of implementers in sub-Saharan Africa. BMJ Global Health, 3(5), e001036.

Majgaard, K., Ouedraogo, A., Mallberg, M., Andriamarofara, H., & Mulet, P. (2018). Early lessons around results-based aid for education in Niger. REACH Policy Note.

Marsh, D.R., Schroeder, D.G., Dearden, K.A., Sternin, J., & Sternin, M. (2004). The power of positive deviance. BMJ, 329(7475), 1177-1179.

Mbiti, I.M. (2016). The need for accountability in education in developing countries. Journal of Economic Perspectives, 30(3), 109-32.

Mbiti, I., Muralidharan, K., Romero, M., Schipper, Y., Manda, C., & Rajani, R. (2018). Inputs, Incentives, and Complementarities in Education: Experimental Evidence from Tanzania (No. w24876). National Bureau of Economic Research.

Mbiti, I., Romero, M., & Schipper, Y. (2018b). Left Behind by Optimal Design: The Challenge of Designing Effective Teacher Performance Pay Programs. Technical report, working paper.

McEwan, P.J. (2015). Improving learning in primary schools of developing countries: A meta-analysis of randomized experiments. Review of Educational Research, 85(3), 353-394.

Molina-Millan, T., Barham, T., Macours, K., Maluccio, J.A., & Stampini, M. (2016). Long-term impacts of conditional cash transfers in Latin America: Review of the evidence. Inter-American Development Bank.

Muralidharan, K. (2012). Long-Term Effects of Teacher Performance Pay: Experimental Evidence from India. Society for Research on Educational Effectiveness.

Muralidharan, K. & Sundararaman, V. (2011). Teacher performance pay: Experimental evidence from India. Journal of Political Economy, 119(1), 39-77.

Murnane, R.J. & Ganimian, A.J. (2014). Improving educational outcomes in developing countries: Lessons from rigorous evaluations. Cambridge, MA: National Bureau of Economic Research.

Olander, S. & Högberg, A.S. (2016). Practical Guidance for Results-Based Financing Approaches. SIDA.

Parker, S.W. & Vogl, T.S. (2018). Long-term effects of cash transfers: Looking to the next generation. https://voxeu.org/article/long-term-effects-cash-transfers.

Paul, E., Albert, L., Bisala, B. N. S., Bodson, O., Bonnet, E., Bossyns, P., ... & Gyselinck, K. (2018). Performance-based financing in low-income and middle-income countries: isn’t it time for a rethink? BMJ Global Health, 3(1), e000664.

Pearson, M., Johnson, M., & Ellison, R. (2010). Review of major Results Based Aid (RBA) and Results Based Financing (RBF) schemes. Final report. DfID Human Development Resource Centre.

Pritchett, L. (2001). Where has all the education gone? The World Bank Economic Review, 15(3), 367-391.

Results for Development (2016). Paying for Performance: An Analysis of Output-Based Aid in Education.

RBF Health (2017). Pricing and The Use of Data in RBF: Towards a Higher Return on Investment. https://www.rbfhealth.org/blog/pricing-and-use-data-rbf-towards-higher-return-investment

Ridde, V., Yaogo, M., Zongo, S., Somé, P.A., & Turcotte-Tremblay, A.M. (2018). Twelve months of implementation of health care performance-based financing in Burkina Faso: A qualitative multiple case study. The International Journal of Health Planning and Management, 33(1), e153-e167.

Rusa, L. & Fritsche, G. (2007). Rwanda: performance-based financing in health. Emerging Good Practice in Managing for Development Results: Sourcebook, 55-60.

Rusa, L., Schneidman, M., Fritsche, G., & Musango, L. (2009). Rwanda: Performance-based financing in the public sector. Performance incentives for global health: potential and pitfalls.

Sabarwal, S., Joshi, A., & Blackmon, W. (2016). A Review of the Multi-Donor Results-Based Financing mechanism used for Tanzania’s Big Results Now in Education Program.

Shepard, D., Zeng, W., & Nguyen, H.T.H. (2015). Cost-Effectiveness Analysis of Results-Based Financing Programs.

Snilstveit, B., Stevenson, J., Phillips, D., Vojtkova, M., Gallagher, E., Schmidt, T., Jobse, H., Geelen, M., Pastorello, M.G., & Eyers, J. (2015). Interventions for improving learning outcomes and access to education in low- and middle-income countries: a systematic review. 3ie.

The Economist (2014). The $32m heist. https://www.economist.com/baobab/2014/02/27/the-32m-heist

UNESCO (2018). Walk before you run: the challenges of results-based payments in aid to education. GEMR Policy Paper 33.

Upper Quartile (2014). Evaluation of Results Based Aid in Rwandan Education - 2013 Evaluation Report.

Upper Quartile (2015). Evaluation of Results Based Aid in Rwandan Education - Year Two.

World Bank (2001). Project Appraisal Document for the Guinea - Education for All Program (Phase I) Project.

World Bank (2008). Project Appraisal Document for the Jamaica Early Childhood Development Project.

World Bank (2009). Project Appraisal Document for the Pakistan Sindh Education Sector Project (SEP).

World Bank (2011). Implementation Completion and Results Report on the Bangladesh Primary Education Development Project II.

World Bank (2011b). Project Appraisal Document for the Bangladesh Third Primary Education Development Project.

World Bank (2013). Project Appraisal Document for the Pakistan Second Sindh Education Sector Project (SEP II).

World Bank (2014). Project Appraisal Document for the Tanzania Big Results Now in Education Program.

World Bank (2015). Achieving Results for Women’s and Children’s Health: 2015 Progress Report.

World Bank (2016). Project Appraisal Document for a Lebanon Support to Reaching All Children with Education Program for Results.

World Bank (2017). Results-Based Financing in Education: Financing Results to Strengthen Systems.

World Bank (2017b). Implementation Completion and Results Report on the India Technical Engineering Educational Quality Improvement Project II.

World Bank (2017c). World Development Report 2018: Learning to Realize Education’s Promise.

World Bank (2018). World Bank Open Data. https://data.worldbank.org/

World Bank (2018b). Implementation Completion and Results Report on the Nepal Bridges Improvement and Maintenance Program.

World Bank (2018c). Implementation Completion and Results Report on the Bangladesh Primary Education Development Program III.

World Bank (2018d). Implementation Completion Report Review for the Pakistan Tertiary Education Support Project.

World Bank (2018e). How to attract and motivate passionate public service providers. http://blogs.worldbank.org/impactevaluations/how-attract-and-motivate-passionate-public-service-providers

The Results in Education for All Children (REACH) Trust Fund supports and disseminates research on the impact of results-based financing on education outcomes. The goal is to collect and build empirical evidence and operational lessons learned to help governments and development organizations design and implement the most appropriate results-based financing mechanisms for improved learning outcomes. For more information about who we are and what we do, go to worldbank.org/reach.