89278 v1




The Andhra Pradesh
    Randomized
 Evaluation Study

      Summary
Table of Contents

Section 1: Motivation
Section 2: The Partnership
Section 3: The Design of the Research Program
Section 4: Results and Findings
Section 5: Summary and Conclusions
References




This summary report presents the overall results from a research program of nearly ten years entitled
the Andhra Pradesh Randomized Evaluation Study (APRESt). APRESt consists of two major strands of
research aimed at improving student learning outcomes: one on the relative effectiveness of schooling
inputs and teacher incentives, and one on the impacts of school choice1. The research program
analyzes the impacts of these policy options through a series of large-scale randomized
evaluations. The rest of the summary report is organized as follows: Sections 1 and 2 focus on the
genesis of the research program, the partnership built for the program, and the roles played by each
partner. Section 3 describes the project interventions, specifically the experimental design, and Section 4
summarizes the overall results. Section 5 presents broad conclusions and recommendations. The report
is meant to be read as an overview or executive summary of the results obtained from this research
program. The appendix includes all technical papers produced to date.




1 These results are still being finalized and are not presented in this summary.

                                         Section 1: Motivation

The responsibility for the delivery of education services in India is shared between the Center and the
States. Successive governments, both at the Center and in state capitals, have made efforts to improve
access to, equity in and the quality of basic education. The last decade has witnessed the implementation
of large government programs, far-reaching policy developments, and the development, piloting and
implementation of innovative solutions. These include, inter alia:

              • The Sarva Shiksha Abhiyaan (SSA) program, a centrally sponsored scheme, supports a
                partnership between federal and state-level actors and was launched as a nation-wide
                effort to ensure that all children of the appropriate age were enrolled in basic schooling.
                This program has rapidly expanded the schooling system in the country and has led to
                enormous increases in student enrolment, particularly in the poorest states of the country.
              • In 2009-10, the Government of India passed an extremely important piece of legislation
                popularly known as the Right to Education Act (RTE). The RTE aims to make free and
                compulsory education a fundamental right for all children in the country.
              • In 2001, in response to public interest litigation, the Supreme Court ordered the states to
                provide cooked or prepared midday meals in every government and government-assisted
                school. This is credited with attracting some of the poorest children into schools across
                the country. To meet this mandate, the central government, in collaboration with the state
                governments, launched another centrally sponsored scheme known as the Midday Meal
                Scheme (MDMS).
              • States also have the flexibility to introduce and implement innovative schemes. Perhaps
                the most notable program in recent years is Tamil Nadu’s Activity Based Learning
                (ABL) Program. This aims to alter the manner in which teachers engage with students in
                the classroom, so that children learn at their own pace and by engaging in activities
                rather than in the more traditional lecture mode.

Though inputs to school education have increased significantly over the last fifteen years, there are
growing concerns that these are not necessarily being translated into outcomes, and particularly into
learning outcomes.

Several recent assessments illustrate this difficulty in translating increased inputs into meaningful
outcomes. In a recent round of the Program for International Student Assessment (PISA), India ranked 73rd
out of 74 nations, with only Kyrgyzstan performing more poorly. The PISA results are even more
sobering when one considers that the two states which participated, Himachal Pradesh and Tamil Nadu,
are considered national front runners in education.

Beyond these international assessments, numerous assessments within India reiterate this
message on learning outcomes. For example, Pratham’s 2012 Annual Status of Education Report
(ASER) finds very poor levels of learning in reading and mathematics. In reading, ASER finds that over
half the children in Grade 5 were unable to read a Grade 2 text and more than 60% of children in Grade 3
were unable to read a Grade 1 text. Similarly, in a test of basic numeracy skills, almost half of Grade 5
students were unable to solve a two-digit subtraction problem and three-quarters were unable to perform
division. Though there may be concerns about the precision of the ASER assessments, the tools
themselves are simple enough to point towards a very serious problem in the quality of the schooling that
children receive in typical schools across the country.




                                       Section 2: The Partnership

The art and science of delivery refers to an approach in which the creative potential within, and the
sense of partnership across, individuals, societies, governments and other stakeholders is harnessed and
combined with rigorous efforts to understand what does and does not work in ensuring results: results
for those who need them the most, delivered at scale, effective in outcomes, efficient in comparison to
other alternatives, and sustainable. The implementation of APRESt is a strong example of developing
and demonstrating the art and science behind the delivery of essential schooling services.

APRESt emerged from a common desire across key stakeholders to improve learning outcomes in
government schools and the recognition that teacher motivation and effort was an issue that needed to be
studied more deeply. There were several reasons behind this decision. Firstly, two influential reports
identified teacher motivation as an issue across India: (i) a World Bank-Harvard survey on teacher and
health provider absence and (ii) very important work done earlier and popularly referred to as the PROBE
Report. Secondly, since nearly 90 percent of primary education spending went towards teacher salaries, it
was evident that this was an area that needed greater attention and scrutiny. While it is broadly
understood that achieving primary education goals requires a set of inputs, it is not clear which inputs,
in what quantities and in what combination, should be brought together in a classroom setting to produce
the desired results. If the vast majority of resources go to teacher salaries, very little is left for
child-level learning materials such as notebooks, exercise books, and writing materials, which has serious
implications for the allocation and use of public resources. Thirdly, since the Indian government was
planning to continue raising primary education budgets to meet schooling targets, there was a serious
need to understand the most efficient way to spend these scarce public resources. Finally, there was
broad recognition that salaries paid to government employees, including civil service teachers, were
largely disconnected from any measure of performance, and that there might be potential gains from
experimenting with interventions that aligned the incentives of teachers with those of the government in
terms of improving learning outcomes. At the same time, there was broad recognition that there might be
policy options, as yet unexplored and untested in the Indian context, which could raise learning levels
in classrooms.

The issues above brought together policy makers, teachers and teacher union members, education experts,
experts in student assessment, academics and development practitioners. There were four main parties to
this partnership: the Government of the State of Andhra Pradesh (GOAP), the Azim Premji Foundation
(APF), UKAID and the World Bank (the Bank). The Government of India, through its Ministries of
Human Resource Development (MHRD) and Finance, also supported the program. A working group was
formed consisting of representatives of the above organizations.

The Government of Andhra Pradesh: It was agreed that the GOAP would play the lead role in terms of
oversight and in creating an authorizing environment under which APRESt would be implemented. This
decision was particularly important given concerns about the legitimacy of the work, but also because the
GOAP was a key stakeholder with a keen interest in the issue of teacher motivation. The GOAP partnered
with the Bank to ensure that this would be a learning exercise and helped finalize key design elements.
Of central importance, the GOAP agreed to randomize treatments across schools, thereby enabling an
experimental design to be put into place. This experimental design ensured that the research findings are
of the highest rigor, which gives all stakeholders great confidence in the validity of the results. The
GOAP went a step further and also provided financial and in-kind support for this research program,
entering into a Memorandum of Understanding with the APF through which approximately USD 500,000 was
allocated for this research. Furthermore, the GOAP supported the intervention by allocating contract
teachers to a select set of 100 schools determined through a randomization algorithm. Finally, the GOAP
ensured that transfers into and out of the schools identified for the program would be frozen for a
period of two years.

The Azim Premji Foundation: The APF is the largest education-oriented NGO in India and is one of the
pioneers of NGO-Government collaboration. Many NGOs in the education sector in India operate in a
parallel space to that occupied by the Government and see themselves as substitutes for the government.
The APF’s approach has always been to acknowledge the important role that government plays in the
delivery of education services across the country and the need to partner with the government to improve
opportunities for the 75-80% of children in India’s school system who rely on access to, and quality in,
government schools. The APF does this through many different interventions, including school-level
incentives through the learning guarantee program, digital content for schools, student and teacher
assessments, and support for teacher training. More recently, the APF has set up a university in Bangalore
to support the training of teachers and research in the education sphere. For all these reasons, the APF
was an obvious choice to help implement or support any program on the ground aimed at boosting teacher
motivation and performance in Andhra Pradesh. The Foundation had credibility with the government and,
more importantly, credibility with the teachers and teacher unions. Given the nature of the program, this
was an extremely important consideration.

United Kingdom Agency for International Development (formerly DFID): While the roles played by
GOAP and APF were central to the success of this program, the role of UKAID was nothing short of
critical. Taxpayer money from the United Kingdom financed most of this intervention, which placed a
serious responsibility on the team to ensure the delivery of high-quality analytical products. UKAID
financing supported all aspects of the program, from design and implementation to analysis and report
writing. In addition, the program has supported dissemination events. UKAID financing was also
instrumental in helping leverage program financing from other sources.

The World Bank: The role of the Bank in this activity has been a central one, in many ways the glue
that held everything together: from the concept stage, to bringing this diverse set of actors around to a
common platform, to technical support and administrative and institutional oversight. The World Bank used
its convening power to ensure institutional continuity and provided technical guidance at every stage.




                          Section 3: The Design of the Research Program

The working group adopted three key guiding principles in establishing the research parameters:

        (i)     the study should push both academic and policy frontiers to the extent possible;
        (ii)    interventions should be based on evidence and not ideology, and no intervention should
                be ruled out before a thorough discussion of the pros and cons; and
        (iii)   to the extent possible the evaluation should be conducted over several years to ensure
                credibility and sustainability of the results.

Given the above key guiding principles and based on the factors that led to establishment of this research
program, the working group agreed to experiment with teacher incentives in an effort to motivate their
effective participation in the schooling system.

The use of incentives as a way of improving performance in the schooling system was not a new concept
in India. Teacher recognition programs have been in place for many years in the country and every year a
handful of teachers are recognized publicly either in their own districts or at state or national functions.
In addition, organizations such as the APF have run their own programs in partnership with state
governments for many years. For example, the APF has implemented a program known as the Learning
Guarantee Program (LGP), which provided cash incentives to government schools that self-selected into
the program and achieved specified learning levels. Both the in-kind incentive programs of state
governments and the LGP are typically conducted as tournaments, where the best performing teacher (or
set of teachers) and the best performing school (or set of schools) receive the incentives. While the
measures of performance under the LGP were very clear and explicit (see Barnhardt 2007 for further
details), in most cases the rules applied by the government to determine the best teacher in the district
or state have been vague. During focus group interviews with teachers, three clear issues emerged
regarding the available government-led teacher recognition schemes: (i) selection criteria were not clear,
(ii) teachers often felt that subjective considerations played a role in determining the “best teacher”
awardee, and (iii) while in-kind recognitions were indeed valuable, the teachers themselves repeatedly
stated that they “would not be opposed to receiving cash bonuses for performance”. Their main concern
was that explicit, clear, and objective rules of the game were needed, and that an honest broker and an
objective means of assessing teacher performance had to be in place for such a scheme to work.

As individual in-kind incentive programs and school-level incentive programs had been used in the past
in the Indian context, and based on focus group interviews with teachers, the working group decided to
focus on cash incentives to teachers as one of the interventions. As this was, to our knowledge, the first
time that such a program would be put in place for government school teachers in India, the team
believed it was important to ensure that the findings could be generalized across states, and that there
should be a meaningful comparison of such a policy option against the more typical investments made in
government schools in India, which include, inter alia: (i) infrastructure inputs, (ii) teacher and
teaching inputs, and (iii) teaching and learning materials for students.

Therefore, the working group agreed to evaluate the relative returns to additional spending on typical
schooling inputs against a policy of trying to improve student outcomes by directly incentivizing
teachers through the potential to receive cash bonuses for improved class performance.

The Inputs:

The above three input choices were considered. The working group quickly ruled out improvements to
infrastructure for two main reasons. Firstly, we were not sure that the necessary improvements to
infrastructure could be completed in the short period of the study2. Secondly, and perhaps more
importantly, the team was concerned that any returns to infrastructure spending in terms of improved
learning outcomes may not be witnessed during the study period. So, the working group agreed to focus
on teacher and teaching inputs and student teaching and learning material.

To determine the specific nature of the inputs, the working group asked how an additional rupee available
for primary schooling could best be spent: for example, on an additional full-time or contract teacher, or
on teacher training programs. Interviews with teacher groups, administrators and other stakeholders
suggested that teachers typically received both pre-service and in-service training and that, at least
anecdotally, this training seemed to have little sustained impact on the classroom processes or teacher
behavior necessary to improve learning outcomes. So, the working group considered experimenting by
looking directly at the returns to having an additional full-time teacher versus an additional contract
teacher in the classroom. A direct comparison of the relative effectiveness of civil service teachers and
contract teachers would have helped provide evidence on issues of enormous policy relevance: entry
requirements, terms and conditions of employment, tenure and career path. However, the GOAP noted
administrative and bureaucratic constraints in allocating regular teachers across a randomly selected set
of schools, although it noted that it would be relatively easy to assign contract teachers to a randomly
selected set of schools3. The first input was therefore an additional contract teacher. The second input
consisted of a block grant to schools to support the purchase of child-specific teaching and learning
materials. Given expenditure levels and patterns of spending, and the fact that over 90 percent of public
expenditures went to cover teacher salaries and pension liabilities, the working group felt that any
additional spending on child-specific educational inputs (such as notebooks, exercise books,
writing/coloring materials, etc.) would presumably have the greatest impact4.

The Incentives

As stated above, based on feedback from focus group discussions with teachers, and on the fact that
incentive schemes were regularly used in the Indian context to try to motivate teacher performance, the
working group decided to work on teacher incentives, but with a twist: to focus on individual and group
monetary incentives for teachers, conditional upon improved student performance on independently
conducted standardized assessments. The use of incentives for teachers is really at the forefront of policy
debate and discussions in many countries across the world today. APRESt bears relevance for teacher
policies in numerous developing countries, but also contributes to a wider debate on teacher incentives in
education policy even in countries that are further along the development path.

2 At the time the program was initiated, it was expected to continue for a two-year period.
3 The procedures followed in hiring contract teachers are provided in detail in the paper entitled “ “ in
the Annex.
4 The procedures employed under the research program to monitor how block grants were spent and the
nature of the items procured are provided in detail in the paper entitled “ “ in the Annex.

Even though the working group agreed to experiment with financial incentives, this was not without debate.
The group felt that restructuring teacher pay to incorporate an incentive conditional upon improved
student performance was a reasonable idea, but members differed strongly, on both ideological and
technical grounds, on the nature of the design and the types of incentives. Firstly, there is a widespread
belief that in professions such as teaching and medicine, intrinsic motivation is perhaps a stronger
driver of performance and effort, and that the use of external incentives could undermine this intrinsic
motivation. Secondly, the GOAP was concerned that individual and group incentives could lead to conflicts
within a given school between teachers who receive bonus payments and those who do not. However, the
key guiding principles adopted earlier helped the working group to navigate such debates. The working
group noted ex ante that the relative effectiveness of group and individual incentives may depend on the
size of the school: smaller schools are better able to monitor individual effort and ensure that teachers
are not free-riding, so group incentives may be more effective there, whereas in larger schools, where
monitoring becomes more difficult and costly, individual incentives may prove to be more effective. The
group agreed that it was worth experimenting with both group and individual incentives.

Finally, the working group focused on the nature of the experiment. Government programs are typically
universal in nature, or are targeted towards particular segments of the population (e.g., the poorest,
people belonging to a particular caste or ethnicity, or other sub-groups), or are based on self-selection
into programs (as is the case in a large number of anti-poverty programs). Each of these approaches makes
evaluation difficult, as it is often hard to construct a statistically valid comparison group that is
identical in all respects to program participants except that its members did not participate in the
program of interest. Therefore, to ensure that any program of this nature could be evaluated and causal
impacts determined, the technical team proposed a randomized control trial (RCT). Long used in clinical
trials for new drugs, RCTs have now made inroads into the evaluation of social programs. Matching schools
on observable characteristics ex ante and then randomly assigning some to treatment group(s) and others
to a control group allows for a rigorous way of learning about the effectiveness of the program(s). Since
the program is randomly assigned to a subset of the potential recipients, the remaining potential
recipients provide a perfect control group, as there is no selection bias in who received the treatment
and who did not. Also, other extraneous factors should affect both groups similarly on average, and their
effect can be netted out by calculating a “difference in differences” estimate. The methodology of
randomized evaluation is considered the “gold standard” in evaluating the causal impact of programs5.
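
As a rough illustration of the “difference in differences” logic described above, the sketch below
subtracts the change in the control group from the change in the treatment group, so that a shock common
to both groups cancels out. The function name did_estimate and all scores are hypothetical and chosen only
for illustration; they are not APRESt data or results.

```python
from statistics import mean


def did_estimate(treat_pre, treat_post, control_pre, control_post):
    """Difference-in-differences of group means.

    Returns (change in treatment group) minus (change in control group).
    """
    treat_change = mean(treat_post) - mean(treat_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treat_change - control_change


# Hypothetical school-level mean test scores (NOT APRESt data).
treat_pre = [40.0, 42.0, 38.0]
treat_post = [50.0, 52.0, 48.0]
control_pre = [41.0, 39.0, 40.0]
control_post = [46.0, 44.0, 45.0]

# Both groups improve, but only the extra gain in the treatment group
# is attributed to the program.
effect = did_estimate(treat_pre, treat_post, control_pre, control_post)
print(effect)
```

In this made-up example both groups start at the same average, the control group gains 5 points and the
treatment group gains 10, so the estimate attributes 5 points to the program.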




5 While random allocation of the treatment ensures internal validity, generalizing the results beyond the
study sample tends to be difficult. This may be true in this case as well. However, there are some
mitigating factors and design innovations that we have included. Firstly, the indicators of interest
(teacher absence, teaching activity, etc.) are very similar to all-India averages. Secondly, during
sampling we made an effort to ensure that the five districts from which the schools are chosen cover all
three socio-cultural regions of the state.

Program Design

A schematic of the research design is provided below. Each individual orange cell represents a unique
“treatment”. The yellow cell represents the control group6. The effectiveness of each treatment
can be compared with the control group and with the other treatments. Random assignment to a subset of
the universe of possible recipients provides for a perfect control group, as there is no selection bias in
who received the treatment and who did not. Furthermore, if other factors affected all groups during the
course of the program, these extraneous factors are likely to have affected the groups similarly, and thus
it is possible to net out their effects by using a “difference in differences” estimate.

                           INCENTIVES (Conditional on Improvement in Student Learning)
INPUTS (Unconditional)     NONE                           GROUP MONETARY    INDIVIDUAL MONETARY
NONE                       CONTROL GROUP (100 Schools)    100 Schools       100 Schools
EXTRA TEACHER              100 Schools
EXTRA BLOCK GRANT          100 Schools
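
The assignment of schools to the five cells of the schematic can be sketched as follows. This is a minimal
illustration, not the procedure actually used in APRESt: the arm labels, the function name assign_arms, the
use of districts as strata, and the even spread of 500 schools over five districts are all assumptions made
for the example. The source says only that schools were matched on observable characteristics and then
randomly assigned, with 100 schools per cell.

```python
import random
from collections import Counter

# Hypothetical labels for the five cells of the schematic.
ARMS = [
    "control",
    "group_incentive",
    "individual_incentive",
    "extra_teacher",
    "block_grant",
]


def assign_arms(school_ids, strata, seed=0):
    """Randomly assign schools to arms within each stratum.

    school_ids: list of school identifiers.
    strata: dict mapping school id -> stratum label (assumed here to be a district).
    Returns a dict mapping school id -> arm.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    by_stratum = {}
    for sid in school_ids:
        by_stratum.setdefault(strata[sid], []).append(sid)

    assignment = {}
    for label in sorted(by_stratum):
        members = by_stratum[label]
        rng.shuffle(members)
        # Deal the shuffled schools round-robin so arms are balanced within the stratum.
        for i, sid in enumerate(members):
            assignment[sid] = ARMS[i % len(ARMS)]
    return assignment


# Hypothetical sample: 500 schools spread evenly over 5 districts.
schools = list(range(500))
districts = {sid: sid % 5 for sid in schools}
assignment = assign_arms(schools, districts)
counts = Counter(assignment.values())
print(counts)
```

Randomizing within strata rather than across the whole sample keeps each arm balanced on the stratifying
characteristic by construction, which is one common way to implement the idea of matching on observables
before assignment.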



APRESt experiments were chosen purposefully to compare the relative effectiveness of input-based and
incentive-based policies for improving education quality. Traditionally, Indian education policy has taken
an input-focused approach. Incentive-based policies are a relatively new phenomenon, but are increasingly
popular in many parts of the world. The table below summarizes the motivation behind each experiment
and briefly describes each.

Experiment        Motivation                                          Description
Diagnostic        • One reason learning levels may be low is that     • Existing teachers provided with detailed
Feedback            teachers don’t know how to help students            feedback on students and subject to
                  • Can better information help?                        low-stakes monitoring
Block Grants      • Significant amounts of money committed under      • Schools provided cash grants for student
                    RTE                                                 inputs
                  • What is the effectiveness of such spending?
Contract          • Use of contract teachers is widespread, but       • Schools provided with additional teacher
Teachers            highly controversial                                (on contract)
                  • Are contract teachers effective?
Performance       • Teacher salaries are the largest component of     • Teachers eligible for bonuses based on
Pay ×2              education spending, but a poor predictor of         improved student performance (either in
                    outcomes                                            own class or whole school)
                  • Can linking pay to performance improve
                    outcomes?


6 Control schools also received feedback on student performance, guidance to teachers, and the additional
monitoring of classrooms that these activities entailed. Our experimental evidence extends to the impact of
a program that provided low-stakes diagnostic tests, monitoring of classroom processes, and feedback to
teachers on the performance of their students. While we find that teachers in treatment schools exerted
more effort when observed in the classroom, we also find that students in these schools did no better on
independently administered tests than students in comparison schools that did not receive the program.
This suggests that though teachers in the program schools worked harder while being observed, the feedback
and monitoring had no impact on student learning outcomes. Our study therefore suggests that enhanced
monitoring alone, with neither punitive action for poor performance nor rewards for good performance, is
unlikely to lead to improved learning in government primary school classrooms at the moment.

                                      Section 4: Results and Findings


In this section we group the results into three parts based on the inputs and incentives studied in
this experiment. The first part looks at the role of contract teachers and their impact on learning
outcomes. The second part looks more closely at the role of schooling inputs in the form of block grants
for teaching and learning materials. The final part reviews the evidence on both individual and group
incentives for teachers linked to gains in student test scores.

Contract Teachers Treatment

While there are many disputed claims in education research, most would agree that a good teacher can
make all the difference to student learning outcomes in the classroom7. However, the question of what
makes a good teacher is highly debated. A growing body of education research acknowledges the
importance of teacher quality for student achievement, but very little is known about which measurable
characteristics of teachers truly influence classroom outcomes. Studies have investigated the impact of
licensing, licensure test scores, higher qualifications (such as bachelor’s, master’s or doctoral
degrees), and experience8. The teacher absence study found a perverse effect: teachers who were
better paid or more senior (since pay and seniority are highly correlated in the Indian government school
context) and more educated gained greater satisfaction from being absent.

The typical rural government primary school is small, with an average enrollment of about 80 to 100
students. Multi-grade teaching is the norm, and schools typically have 2 to 3 teachers covering grades
one through five. A teacher tends to teach all the subjects for one grade and typically teaches in more than
one grade. Civil service or regular teachers (RTs) are state employees, have tenure till official retirement
age, and have a pensionable job with benefits. These teachers are typically selected through a teacher
selection commission or similar means and tend to be more educated (than contract teachers) and have
formal teacher training degrees. Contract teachers (CTs), on the other hand, are hired at the school level,
do not have tenure, have contracts that are renewed annually conditional upon performance, and are not
considered state employees. They are typically less qualified than RTs, having completed high school or a
first degree, and most do not have formal teacher training. They also do not have any benefits. The
average RT salary plus benefits was approximately Rs. 10,000 in 2006, or about five times the
compensation package received by CTs in the state.

The experiment in APRESt mimics the process by which contract teachers are typically hired in the state.
Schools apply to the local district education administration and seek permission for hiring a contract



7
  Rigorous studies, mostly from the United States, suggest that having a good teacher for 3-5 years is enough to
eliminate the achievement gaps between white, black and Hispanic students. For more on similar results see Rivkin,
Hanushek, and Kain (2005) and Kane and Staiger (2008).
8
  On licensing and license test scores please refer to Goldhaber and Brewer (2000), Angrist and Guryan (2003),
Buddin and Zamarro (2008), and Aaronson et al (2007); on qualifications please refer to Clotfelter et al. (2006,
2007), Rockoff (2004) and Rockoff and Staiger (2010) and for experience please refer to Clotfelter et al (2006),
Rivkin, Hanushek, and Kain (2005) and Ladd (2008)

teacher on the basis of student enrollment strength at the start of the year9. Contract teacher contracts in
AP typically run for a period of 10 months, after which they are renewed if there is a continued
or perceived need. In practice, however, once a position has been sanctioned it has typically been
continued from year to year10. Under APRESt, schools were first selected through a
randomization process and then informed through a letter from the district administration that they
were authorized to hire an additional contract teacher for the year. This hiring came after all the transfers
and all the requests for contract teachers by these schools had been completed for that year. Therefore,
the additional contract teacher truly represents an addition to the school’s teaching staff and not simply
the filling of an existing vacancy or assistance to cope with unexpectedly high enrollment and hence
school level pupil teacher ratios (PTRs) above the norm. Once allocations to schools were made, the decision
on how and where to use the additional teacher was left strictly to the schools. The state administration also
played a crucial role in protecting the schools for the first two years of the research program by ensuring that
teachers were not transferred in or out of the sample schools. Four-fifths of the schools initiated the hiring
procedures for contract teachers within a week of receiving the authorization and based their hiring on
qualifications, experience, and distance from school.

From our sample of RTs and CTs we find that, compared to CTs, RTs are:
       (i)    far more likely to be male (66% to 28%)
       (ii)   on average older by about 15 years
       (iii)  almost twice as likely to have a college degree (85% to 47%)
       (iv)   more than 7 times as likely to have a formal training or education certificate (99% to 13%)
       (v)    about one and a half times as likely to have received training in the last year (92% to 60%)
       (vi)   far less likely to be from the village in which the school is located (9% to 82%)
       (vii)  traveling about 14 times as far to reach school (12 km to 0.84 km)
       (viii) earning about 9 times as much (INR 9000 per month to INR
              1000 per month).
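
As a quick check on the comparisons in this list, the short sketch below recomputes the RT to CT ratio for each pair of reported figures. The dictionary keys and the function name are illustrative labels chosen here, not terms from the study; only the numbers come from the list.

```python
# Reported sample characteristics, as (regular teachers, contract teachers).
# The figures are taken from the list above; the labels are illustrative.
reported = {
    "male_pct": (66, 28),
    "college_degree_pct": (85, 47),
    "teaching_certificate_pct": (99, 13),
    "trained_last_year_pct": (92, 60),
    "from_school_village_pct": (9, 82),
    "distance_to_school_km": (12, 0.84),
    "monthly_salary_inr": (9000, 1000),
}


def rt_to_ct_ratio(rt_value, ct_value):
    """Return how many times larger the RT figure is than the CT figure."""
    return rt_value / ct_value


# Ratios rounded to one decimal place for reading alongside the text.
ratios = {
    name: round(rt_to_ct_ratio(rt, ct), 1) for name, (rt, ct) in reported.items()
}

for name, ratio in ratios.items():
    print(f"{name}: RT/CT = {ratio}")
```

A ratio below one, as for the share of teachers living in the school's village, means the characteristic is more common among CTs than among RTs.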

Results

The Contract Teacher treatment finds the following:

(i)      At the end of two years of the program, students in schools with an extra contract teacher perform
significantly better than children in control schools by 0.15 and 0.13 standard deviations in Math and
Telugu respectively.

(ii)     The program has differential effects on different groups of students. First, we find that students in
schools that had poorer infrastructure and were more remote seemed to benefit more from having an
additional contract teacher. Secondly, we find that the impact of having an additional contract teacher is
highest in Grade 1, averaging increases of 0.23 and 0.25 standard deviations in Math and Telugu scores, with
the treatment effect declining for higher grades.

(iii)   We find that on key metrics, regular teachers and contract teachers behave differently. RTs are
more likely to be absent from school compared to CTs (27% to 16%), and conditional on being in school
CTs are more likely to be engaged in teaching activities compared to RTs (49% to 43%). Similar results

9
  However, contract teachers can also be hired to fill in vacancies that may arise due to retirement or death or
transfers.
10
   Govinda and Yazali (2004) find that almost all para teacher contracts were being renewed year on year.

have been replicated across other studies and therefore suggest that CTs do behave differently than RTs
and in a manner more beneficial to enhancing student outcomes.

(iv)    By comparing absence rates and teaching activity rates for RTs in schools with and without an
extra contract teacher, we find that program schools have higher rates of regular teacher absence
(28% versus 25%) and lower rates of teaching activity by regular teachers (41% versus 45%). This suggests that
the placement of a CT might induce RTs to shirk even more. Therefore, our estimate of the impact of an
additional contract teacher is actually an underestimate of the CT's own contribution, since it is the composite
of the effect of the additional contract teacher minus the negative effect on RT performance caused by the
induction of a CT into the school.

Policy Directions

There is a tendency for all stakeholders to focus on observable teacher characteristics as a means
of ensuring a high quality teacher in the classroom. Using these metrics, the RTE has put into effect
procedures to eliminate contract teachers in the country and ensure that all teaching staff in government
run schools are full time, civil service teachers. As noted earlier, there is little rigorous evidence to
support such a policy. Indeed, as shown through this study and others, these characteristics do not
necessarily correlate with success in the classroom. The RTE will therefore have significant fiscal
implications for India and, if existing evidence holds, may not lead to the desired improvements in
education quality or in learning outcomes11. Since the policy decision on contract teachers in the RTE
has already been taken, these findings will have no direct implication for it. However, given the findings of
this study, it is perhaps important to revisit the policy on contract teachers in light of the rigorous evidence
that does exist. That evidence finds that contract teachers are associated with less absenteeism (and thus higher
effort) than civil service teachers and with better student performance and, given that contract teachers cost a
fraction of what regular teachers cost the public exchequer, that they are a more cost-effective option (see Atherton
and Kingdon (2010), Goyal and Pandey (2009), and Muralidharan and Sundararaman (2011)). A more
recent meta-analysis of the effectiveness of contract teachers compared to regular, civil service
teachers by Kingdon et al (2013) suggests that contract teachers are indeed as effective as, if not more
effective than, regular civil service teachers in improving student learning outcomes, though the authors caution
against trying to generalize the results.

Block Grant Treatment

Considerable attention has been paid to the importance of schooling inputs in ensuring educational
outcomes. Given the overwhelming share of salary expenditures in government schools in India, the
experiment aimed to determine returns to a student specific block grant12 to schools to obtain teaching-
learning materials for students. The program was run over two years. The per student block grant of

11
   Policy makers in India are not the only ones to advocate for the elimination of contract teachers. Well intentioned
researchers globally have advocated for contract teacher elimination on the grounds that it is unethical to have
different quality teachers for different students, that the salary differentials between contract teachers and regular
teachers lead to fragmented and demoralized staff, etc. See ILO 2012,
12
   Schools already receive block grants. They receive an annual grant of about INR 2000 for
discretionary expenditures and about INR 500 per teacher for developing teaching learning material (TLM).
However, given that there are typically about three teachers per rural school, we are comparing approximately INR
300,000 to each school for teacher salaries with INR 2500 plus the cost of supplying textbooks, uniforms and midday
meals to each school as other inputs.

about INR 125 per year translated, given an average school size of about 80 students, to about INR
10,000 per school13. Grant money was typically used for procuring writing materials (40%), charts for
classrooms (25%), workbooks and exercise books (20%), and durables such as bags and plates
and cups for the midday meal program (about 10%). The patterns of expenditure remain quite similar across
years14.

One would therefore expect the trajectory of learning outcomes in Year 2 to be similar to what is observed at
the end of Year 1, but we see that this is not the case. It can also be seen from the nature of the
expenditure that the items procured could have been provided by parents rather than by the institution.

The Results

The study finds that, at the end of Year 1, students in schools that received a block grant scored 0.09 SD and
0.08 SD higher than students in comparison schools in math and language respectively. These differences were
significant. At the end of Year 2, students in schools that received block grants scored 0.04 SD and 0.07 SD
higher in math and language than students in comparison schools, and these differences are not
significant. Although the initial objective of the study was merely to study the impact on student achievement
of school block grants aimed at teaching learning material, we found a result that mirrors earlier work by
Das et al in Zambia.
Household spending on education in program schools is significantly lower in Year 2 than in Year 1,
suggesting that households responded to the program by changing their spending patterns.
That is, in Year 1, when the grant was unanticipated, households were unable to or did not
offset as much as they did once the grant had become anticipated in Year 2. The study shows
that for each dollar of government spending in the form of a grant that households expect, there is
a reduction of almost 85 cents in household spending on education.
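
As a rough illustration of this crowd-out arithmetic, the sketch below applies the reported offset to the per student grant of about INR 125 and the average school size of about 80 students. Treating "almost 85 cents" as exactly 0.85 and applying the per-dollar rate to rupee amounts are simplifying assumptions made here, and the function name is hypothetical.

```python
def household_offset(anticipated_grant, offset_rate=0.85):
    """Split an anticipated grant into the household cutback and the net gain.

    offset_rate is the fall in household education spending per unit of
    anticipated grant. The text reports "almost 85 cents" per dollar, so
    0.85 is an approximation used for illustration only.
    """
    cutback = anticipated_grant * offset_rate
    net_increase = anticipated_grant - cutback
    return cutback, net_increase


grant_per_student = 125    # INR per student per year, as reported
students_per_school = 80   # average school size, as reported

# Per school grant implied by the two reported figures.
school_grant = grant_per_student * students_per_school

# Per student split once the grant is anticipated (Year 2).
cutback, net_increase = household_offset(grant_per_student)

print(f"Grant per school: INR {school_grant}")
print(f"Household cutback per student: INR {cutback:.2f}")
print(f"Net increase in spending per student: INR {net_increase:.2f}")
```

Under these assumptions only about 15 percent of an anticipated grant shows up as additional education spending on the child, which is consistent with the smaller and insignificant Year 2 effects.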

Policy Directions

The fact that households will re-optimize their budgets in response to public spending is something that
policymakers and researchers have known for a long time. However, surprisingly, the vast literature on
the impacts of schooling inputs on educational outcomes and on the estimation of education production
functions seems to have overlooked such a response on the part of households. This suggests that policy
makers may want to consider financing those inputs that households cannot easily substitute away from,
for example, teaching inputs or school level infrastructure. At this point we are not advocating that
governments stop financing those inputs that households can substitute away from by re-optimizing their
household budgets.

Performance Pay

The performance pay intervention was perhaps the most controversial and revolutionary treatment put
into effect under this program. It was the first time in India that members of the bureaucracy (here


13
     The average spending for all four programs per school were calibrated to be equal.
14
     For a more detailed description of the provision of the block grant, please refer to Das et al (2012).

defined as government teachers) were paid and rewarded for their performance15. Teacher pay for
performance is not a new concept globally. While schooling inputs and resources available to schools
have increased tremendously in recent years, this has not translated into improved student learning
outcomes. Policy makers and researchers have therefore shown an interest in incentivizing teachers on
the basis of a direct measure of the performance of their students. When APRESt was initiated, rigorous
empirical evidence in this area was limited. However, in the intervening 10 years this has become one of
the most researched areas of education policy, with numerous papers published in just the last three
years16. Our results contribute to this growing body of literature and help to better understand how
institutions and bureaucracies can be managed and made more efficient.

For any system that rewards on the basis of merit or performance several prerequisites are needed. These
include inter alia: a clear set of goals, standards of performance on the basis of these goals that can be
objectively and accurately measured, and finally, rewards provided on the basis of whether these
standards were met. Alternatively, it is possible to construct a system where failure to meet established
standards or norms is punished. The latter concept of punishing those who fail to meet certain basic
norms does not exist in India and the evidence suggests that there is really no risk to an individual teacher
for failure to perform or discharge their duties as a teacher adequately. Even basic measures of
performance, such as teacher attendance, are not regularly and adequately monitored and measured, and
rewarded or punished, let alone more outcome oriented aspects such as classroom performance and
student learning outcomes. Teacher compensation is thus based solely on entry into service or seniority,
unless the teacher breaks away from the core tasks of teaching and moves into administration and
management.

Based on the comments of teachers during focus group interviews, subjective measures of performance
often result in poor performers being rewarded. While teacher incentives seem like a logical way to
align teacher behavior with the state’s expectations, some concerns need to be
tackled in the design of incentive programs. These include: (i) incentives aligned to improved student
test scores could lead to teachers “teaching to the test”, rather than supporting a more wholesome
development of the student; (ii) teachers could cheat outright on tests to improve student
performance and hence their incentive payments; and (iii) teachers could behave perversely by
ensuring that the strongest students in class are kept out of the baseline assessment and the weakest
are kept out of class during the end line assessment, before incentive payments are calculated. The
manner in which these concerns were addressed is described below.

Teaching to the Test: The program looked at this issue in two ways. Firstly, given that learning levels in
rural government schools were very low, it was believed that teaching to the test may actually be an
improvement in classrooms. Secondly, emphasis was placed on various aspects of test development. For
example, the assessments were designed to measure performance on mechanical and conceptual

15
   There is a very cumbersome procedure for monitoring and measuring the performance of individual members of
the bureaucracy. For the senior most officers, typically those from the premier civil service, the Performance
Appraisal Report, is largely based on a results framework drawn up at the end of the year, with a rather subjective
assessment by his or her immediate supervisor and reviewed by an even more senior member of the bureaucracy.
There is no objective measure or criteria for performance.
16
   Glewwe, Ilias, and Kremer (2010), Martin (2010), Neal (2011), Behrman et al (2012), Contreras (2012), Fryer et
al (2012), Springer et al (2012), Fryer (2013), and Goodman and Turner (2013).

questions17. Improvements on both types of questions allow us to assess whether improved scores are
purely due to teaching-to-the-test strategies adopted by teachers or are due to overall improved
transactions in the classroom. While the program incentivized performance in Math and Telugu,
student performance was also assessed in Science and Social Studies during program implementation. Again,
performance improvements in non-incentivized subjects would suggest broad-based learning under the
program.

Cheating in the Test: With programs like No Child Left Behind and Race to the Top enacted in the
USA, policymakers, teachers, administrators and others have expressed concern that incentive
structures and high stakes testing result in perverse behavior on the part of teachers. In some cases,
this perverse behavior amounts to outright cheating by teachers (Jacob and Levitt (2003))18.
To ensure test score reliability, the assessments were carried out directly by the APF, the main
implementing partner of the APRESt project. The APF was chosen for several reasons, most
importantly because it has a brand identity associated with promoting excellence in
education and is an NGO respected by the teaching community. Furthermore, since it had no direct
stake in either improved results or payouts to teachers, it was seen as the honest broker
that teachers had been asking for early on in the study.

Perverse Incentives Created: To minimize incentives for perverse behavior, we tied the incentives to
average improvements in child learning at the class/school level. If a student drops out of the
program after taking the baseline, that student is assigned a very low score, so teachers have a
built-in incentive to keep dropouts to a minimum. Strengthening management information systems, a key
aim of the government, would also help ensure that teacher rewards are based on improvements
demonstrated by a majority of the children and not on wide margins engineered by teachers.

Concern                           Mitigation
Teaching to the test              Test designed in such a way that one could not do well without deeper knowledge /
                                  understanding. (Plus, given extremely low levels of learning, even test-taking itself
                                  is an important skill.)
Threshold effects19               Minimized by making bonus a function of average improvement of all students.
Neglecting weaker children        Incentives tied to changes from the baseline performance. Drop-outs assigned low
                                  scores, not ignored.
Cheating, paper leaks etc.        Assessment and grading of tests done by an independent 3rd party.

Incentive Design
We employed two types of incentives in the design: (i) monetary incentives paid to groups
of teachers in schools (GI) and (ii) incentives paid on an individual basis to teachers in schools (II). The
details of the incentive design are provided in the formal paper; here we simply state the nature
of the incentive scheme. Teachers in GI and II schools were offered bonus payments on the basis of the
17
   Mechanical questions are those that children are either familiar with or that directly test a
concept. Conceptual questions are those that are either unfamiliar in how they are presented or
test the child’s comprehension of a concept more indirectly.
18
   Recent cheating on standardized assessments in Atlanta further underlines concerns with high stakes testing and
incentive programs based on the results of such assessments. However, these issues are not insurmountable and a
reading of the Atlanta case suggests that even rudimentary checks and balances were not in place.
19
   Focusing only on students near an expected target or cut-off, and neglecting children far below and far above the
cut-off or threshold.

average improvements in the Math and Language scores of students taught by them, subject to clearing a
minimum threshold of 5 percent20. All teachers in a GI school that received a bonus shared the bonus
amount. In II schools, each teacher’s bonus payment was determined by the performance of his or her own class.
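
The incentive rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the payment rate per percentage point (RATE_PER_POINT) is a hypothetical value not given in this summary, drop-outs are assigned an end line score of zero as a stand-in for the "very low score" mentioned earlier, and the GI rule is modeled as every teacher receiving the bonus implied by the school-wide average gain. All scores and teacher names are invented.

```python
from statistics import mean

THRESHOLD = 5.0        # minimum average gain, from the text (removed in Year 2)
RATE_PER_POINT = 500   # hypothetical INR per percentage point above the threshold


def bonus(avg_gain_pct, threshold=THRESHOLD, rate=RATE_PER_POINT):
    """Bonus for an average gain; zero when the gain does not exceed the threshold."""
    return max(0.0, avg_gain_pct - threshold) * rate


def average_gain(baseline, endline):
    """Average gain over all students who took the baseline.

    A missing end line score (None) is replaced by zero rather than dropped,
    so a drop-out lowers the average (assumption made for this sketch).
    """
    gains = [(e if e is not None else 0.0) - b for b, e in zip(baseline, endline)]
    return mean(gains)


def individual_bonuses(classes):
    """II: each teacher's bonus depends on the gain of his or her own class."""
    return {t: bonus(average_gain(b, e)) for t, (b, e) in classes.items()}


def group_bonuses(classes):
    """GI: every teacher gets the bonus implied by the school-wide average gain."""
    all_base = [s for b, _ in classes.values() for s in b]
    all_end = [s for _, e in classes.values() for s in e]
    school_bonus = bonus(average_gain(all_base, all_end))
    return {t: school_bonus for t in classes}


# Invented scores (percent correct) for a two-teacher school.
classes = {
    "teacher_a": ([30, 40, 50], [45, 55, 62]),    # gains of 15, 15 and 12
    "teacher_b": ([35, 45, 10], [43, 52, None]),  # third student dropped out
}

print("II:", individual_bonuses(classes))
print("GI:", group_bonuses(classes))
```

In this invented example the drop-out pulls teacher_b's class average below the threshold, which is the mechanism by which assigning low scores to drop-outs discourages teachers from letting weak students leave before the end line test.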

The Results

From implementing the teacher performance pay experiment we find the following:

       (i)     The study unambiguously demonstrates that GI and II led to improvements in student
               outcomes both over the initial two years of the program, and over the entire five years of the
               program. At the end of the first two years, students in incentive schools performed
               significantly better than their counterparts in comparison schools by 0.27 SD and 0.17 SD in
               Math and Telugu respectively. Over the entire five year span of the program, the magnitude
               of the effect was much larger for children in incentive schools with these students scoring
               0.54 SD and 0.34 SD higher in Math and Telugu respectively than their counterparts in
               comparison schools.

       (ii)    We do not find evidence that the incentive scheme resulted in any of the concerns identified
               earlier. We carry out robustness checks for teaching to test by decomposing the treatment
               effects by Repeat and Non-Repeat Questions, by Multiple-Choice and Free-Response
               Questions, and by Mechanical versus Conceptual Questions. In all cases, we find that
               children in incentive schools perform better than children in comparison schools suggesting
               broad based gains in incentive schools. Finally, we find that students in incentive schools
               also perform significantly better than children in comparison schools on non-incentive
               subjects – Science and Social Studies. At the end of two years, incentive school students
               score 0.11 SD and 0.18 SD higher than students in comparison schools on Science and Social
               Studies, while over five years this gap widens to 0.52 SD and 0.30 SD respectively.

       (iii)   Ex ante we were unable to predict which of the two incentive types would perform better.
               Students in II schools perform better than their counterparts in GI schools at every point
               over the five years of the program. Children in GI schools outperform their
               counterparts in comparison schools, though not significantly in every year. More
               importantly, there is no significant difference between GI and comparison schools over the
               five years. This result has now been replicated in several other studies and may begin to
               suggest that individual incentives might need to be reviewed carefully when thinking about
               teacher compensation policies.

       (iv)    APRESt was initiated in an effort to boost teacher motivation and minimize teacher absence.
               At the end of two and five years of the program, we find no difference in attendance (or
               conversely absence rates) of students and teachers across incentive and comparison schools.
               Furthermore, we also do not find any significant difference across measures of classroom
               performance, such as, blackboard usage, encouraging classroom participation, etc. between
               incentive schools and comparison schools. We note that the superior performance in
               incentive schools is explained in part by enhanced teaching activity conditional upon
               presence in schools. In another study on the use of incentives, Duflo et al (2010) study the
               impact of incentives on attendance of teachers. They find that a simple piece rate scheme for
               attendance decreases teacher absenteeism in treatment schools by 21 percentage points

20
     Accounting for summer effects. This minimum 5 % threshold was removed in Year 2.

            compared to teachers in comparison schools, and increases student performance by 0.17 SD.
            Therefore, any restructuring of the compensation package for teachers might need to
            involve incentives on two different margins - one based on attendance and the other on
            learning outcomes. Furthermore, teachers across incentive and comparison schools
            were asked unprompted questions on what they did differently, after the end of the school year
            and before they knew whether or not they had received incentive payments for that year.
            Teachers in incentive schools mentioned providing more practice examples, assigning more
            homework, and staying on in class beyond school hours to provide special assistance to weaker
            students. These seem to be credible claims when these self-reported measures are correlated
            against the performance of their students, particularly given that less than half the teachers in
            incentive schools provide these reasons.

    (v)     A direct comparison of the incentive programs and the treatments using unconditional inputs finds
            that the II schools spent about the same money as the schools with the input programs, though
            the effect on student scores in the II schools was three times that in schools with unconditional
            inputs. Though the effect sizes in GI schools were smaller than in II schools, the program was
            still equally cost effective given the smaller bonus payouts. The formal paper also looks at
            two other ways of assessing cost-effectiveness.

Policy Directions

Several issues emerge from this work on performance pay. First, teachers are looking for ways by which
an objective measure of their performance can be assessed and they can be rewarded accordingly.
Secondly, and perhaps more importantly, the good and effective teachers know who they are, though the
system as it exists at present is not able to recognize well performing teachers. In a separate paper on how
teachers responded to the program, we find that the extent of teachers’ stated ex-ante support for the
program is positively correlated with their ex-post performance as measured by estimates of value
addition. This may suggest that well designed performance pay systems will allow teachers to sort
themselves and high performers to be recognized, rewarded and retained in the system. Even if
performance is not rewarded in the manner highlighted in this particular experiment,
there is no question that measures of performance management need to be introduced into the system.
This can be done in several ways. Perhaps the most important thing for the government to do would be to
introduce student assessments on a regular basis and make learning outcomes a key system indicator,
which at present they are not.




                               Section 5: Summary and Conclusions

APRESt represents a serious political and technical effort on the part of the GOAP and its partners to
understand and analyze factors that could contribute to improved learning outcomes in classrooms in
government schools across the state. Through a sustained period of approximately ten years, the program
helped push both policy frontiers and academic boundaries in an effort to fully understand what works
and what does not work in improving learning outcomes in rural primary schools. The program
pushed policy boundaries by its choice of methodology, in which a state government in India allowed a
program to be randomized, thereby allowing rigorous, causal findings to be obtained. The program
also pushed policy boundaries by experimenting with a set of policy options that are typically rejected
even by well-meaning stakeholders purely on ideological grounds – such as performance pay and
vouchers to support school choice. Finally, the program pushed policy boundaries by bringing in
administrative, technical and academic teams around one table to try and address a question of global
importance. While the findings from this study may not necessarily translate into other environments in
India or even across the entire state of Andhra Pradesh, by experimenting on a large scale with rural
government run institutions and by using mechanisms that closely mirror existing administrative set ups
the program has set the stage for a larger scale roll out or at the very least a more intensive pilot. The
program has also seriously pushed academic boundaries. The working group adopted an approach that
aimed to ensure that this program would be of the highest academic quality. The best measure of
academic quality is publication of the findings in peer reviewed journals. To date, four papers have already
been published from APRESt, with several other publications in the pipeline.

There are several takeaways from the research findings:

Evidence Based Policies: While much has been written about evidence based policies, rarely do you find
governments adopting policies after a careful review of the evidence. Large scale prospective
randomized evaluations allow governments to develop test beds for further policy research and
development and then closely link the results of these evaluations to program implementation. As the
education sector in India grows and matures, and the needs deepen, the government will need to find
creative ways to achieve desired results and finance such programs. Development of cost-effective
policies and strategies would be essential and the room for ideology driven policy development will need
to shrink. Andhra Pradesh has taken the first step in this direction through the implementation of
APRESt.

Enhanced Monitoring: As the paper on the impacts of diagnostic feedback to teachers demonstrates,
monitoring alone is unlikely to result in improved classroom transactions or improved learning outcomes across
schools. Enhanced monitoring will have to be combined with punitive measures or rewards to ensure that
desired outcomes are met. Given the political clout of teachers unions, and the understandable reluctance
of democratically elected leaders to tangle with these unions, combining monitoring of learning outcomes
and teacher effort, under a framework of reward and professional standards might be the best way to
achieve improved learning outcomes in government schools in the country.

Focusing on Early Grades: There is a need to emphasize learning in early grades. In particular, children
should demonstrate age-appropriate reading and numeracy skills. Reading skills in particular are likely to
open up avenues for self-learning in later years. This implies that norms such as pupil-teacher ratios
or children per classroom might need to be revised to allocate greater resources to early grades.

Learning Assessments: There is a need to strengthen systems for assessing student learning. While
student assessments should definitely not be the sole basis for measuring the performance of an
educational system, they should play an important part. If children are simply pushed through the system
with no emphasis on learning, this could have disastrous consequences both for the individual (when
trying to join a discriminating labor market) and for overall economic growth. Since India has moved to
an era of low-stakes tests and automatic promotion, there is a need to strengthen the monitoring and
measurement of learning outcomes to ensure that learning levels rise from the current low levels.

The Incentives Worked: Incentives work. While external factors are known to crowd out intrinsic
motivation, we believe this is a question of both how issues are framed and how the incentives are
designed. In the current setting, intrinsic motivation is crowded out by a system that fails to recognize
and appropriately reward effort and success. By putting in place a reward mechanism that is truly
merit-based, objective and understandable, it is possible to ensure that external rewards not only do not
crowd out intrinsic motivation but reinforce it. We are not sure at this point whether the program would
demonstrate the same results if run as a tournament rather than as the contractual system established
under APRESt. However, irrespective of the nature of the incentives framework in which a teacher is
placed, it is important to monitor and measure the performance of frontline providers. This is the only
way the system can ensure that the best performers are attracted, invested in and retained, while
poor performers self-select an exit strategy. Unfortunately, the current system at best paints high
performers and low performers with the same brush and at worst actually rewards low performers.
This inability to distinguish between them is highly demotivating.

Contract Teachers: This is perhaps the most controversial finding of the research program.
Well-intentioned and well-meaning stakeholders in the education system often find themselves at opposite
ends of the spectrum when it comes to how teacher quality is defined.
Rigorous research findings now clearly demonstrate that observable teacher characteristics have little to
do with how well or how poorly their students perform. That is, licensed teachers do not necessarily do
better than unlicensed ones, teachers with higher qualifications do not necessarily do better than teachers
with a lower level of educational attainment, the performance of teachers tends to plateau within a
relatively short period of time, and, in the context of APRESt, poorly qualified and trained contract
teachers from this study (and others) seem to do at least as well as highly qualified, tenured civil
service teachers in raising student learning outcomes. However, the RTE as enacted calls for a ban on
contract or para teachers and for their replacement with full-time, tenured civil service teachers. The
results of this and other studies might suggest a review of this part of the RTE, particularly given the
enormous fiscal implications of this provision.




                              References

Please see attached papers.



