---
title: "Outsourcing Education: Experimental Evidence from Liberia"
authors_and_venue: "with Justin Sandefur and Wayne Sandholtz - American Economic Review 110(2), 2020"
venue: "American Economic Review"
year: 2020
doi: "10.1257/aer.20181478"
abstract: "In 2016, the Liberian government delegated management of 93 randomly selected public schools to private providers. Providers received US$50 per pupil, on top of US$50 per pupil annual expenditure in control schools. After one academic year, students in outsourced schools scored 0.18 standard deviations higher in English and mathematics. We do not find heterogeneity in learning gains or enrollment by student characteristics, but there is significant heterogeneity across providers. While outsourcing appears to be a cost-effective way to use new resources to improve test scores, some providers engaged in unforeseen and potentially harmful behavior, complicating any assessment of welfare gains."
pdf_url: "https://mauricio-romero.com/pdfs/papers/aer.20181478.pdf"
canonical_url: "https://mauricio-romero.com/pdfs/papers/aer.20181478.pdf.md"
source: research.qmd
note: >-
  Machine-readable Markdown version of the paper, generated for LLMs.
---

# Outsourcing Education: Experimental Evidence from Liberia<sup>†</sup>

By Mauricio Romero, Justin Sandefur, and Wayne Aaron Sandholtz\*

In 2016, the Liberian government delegated management of 93 randomly selected public schools to private providers. Providers received US\$50 per pupil, on top of US\$50 per pupil annual expenditure in control schools. After one academic year, students in outsourced schools scored 0.18 $\sigma$  higher in English and mathematics. We do not find heterogeneity in learning gains or enrollment by student characteristics, but there is significant heterogeneity across providers. While outsourcing appears to be a cost-effective way to use new resources to improve test scores, some providers engaged in unforeseen and potentially harmful behavior, complicating any assessment of welfare gains. (JEL H41, I21, I28, O15)

Governments often enter into public-private partnerships as a means to raise capital or to leverage the efficiency of the private sector (World Bank 2015b). But contracts are inevitably incomplete, and thus contracting out the provision of public services to private providers will have theoretically ambiguous impacts on service quality (Hart, Shleifer, and Vishny 1997; Holmström and Milgrom 1991). While private contractors may face stronger incentives for cost efficiency than civil servants, they may also cut costs through actions that are contractually permissible but not in the public interest.

<span id="page-0-1"></span>\*Romero: Centro de Investigación Económica, ITAM (email: mtromero@itam.mx); Sandefur: Center for Global Development (email: jsandefur@cgdev.org); Sandholtz: Department of Economics, UC San Diego (email: wasandholtz@ucsd.edu). Esther Duflo was the coeditor for this article. Romero acknowledges financial support from the Asociación Mexicana de Cultura. Sandefur acknowledges financial support from the Research on Improving Systems of Education (RISE) program. Sandholtz acknowledges financial support from the Institute for Humane Studies. We are grateful to the Minister of Education, George K. Werner, Deputy Minister Romelle Horton, Binta Massaquoi, Nisha Makan, and the Partnership Schools for Liberia (PSL) team, as well as Susannah Hares, Robin Horn, and Joe Collins from Ark EPG for their commitment throughout this project to ensuring a rigorous and transparent evaluation of the PSL program. Thanks to Arja Dayal, Dackermue Dolo, and their team at Innovations for Poverty Action who led the data collection. Avi Ahuja, Miguel Jimenez, Dev Patel, and Benjamin Tan provided excellent research assistance. We're grateful to Michael Kremer, Karthik Muralidharan, and Pauline Rose who provided detailed comments on the government report of the independent evaluation of the PSL program. The design and analysis benefited from comments and suggestions from Maria Atuesta, Prashant Bharadwaj, Jeffrey Clemens, Joe Collins, Mitch Downey, Susannah Hares, Robin Horn, Isaac Mbiti, Gordon McCord, Craig McIntosh, Karthik Muralidharan, Owen Ozier, Olga Romero, Santiago Saavedra, Diego Vera-Cossio, and seminar participants at the Center for Global Development and UC San Diego. A randomized controlled trials registry entry is available at https://www.socialscienceregistry.org/trials/1501 as well as the pre-analysis plan. IRB approval was received from IPA (protocol 14227) and the University of Liberia (protocol 17-04-39) prior to any data collection. 
UCSD IRB approval (protocol 161605S) was received after the first round of data collection but before any other activities were undertaken. The evaluation was supported by the UBS Optimus Foundation and Aestus Trust. The views expressed here are ours, and not those of the Ministry of Education of Liberia or our funders. All errors are our own.

<span id="page-0-0"></span> $^{\dagger}$ Go to https://doi.org/10.1257/aer.20181478 to visit the article page for additional materials and author disclosure statements.

In this paper we study the Partnership Schools for Liberia (PSL) program, which delegated management of 93 public schools (3.4 percent of all public primary schools, serving 8.6 percent of students enrolled in public primary or preschool) to 8 different private organizations. Providers received an additional US\$50 per pupil as part of the program, on top of the yearly US\$50 per-pupil expenditure in control schools, and some providers independently raised and spent far more. PSL schools also negotiated successfully for more government teachers: they had an average of 1 teacher per grade, compared to 0.78 teachers per grade in traditional public schools. In exchange, providers were responsible for the daily management of the schools. These schools were to remain free and nonselective (i.e., providers were not allowed to charge fees or screen students based on ability or other characteristics). PSL school buildings remained under the ownership of the government. Teachers in PSL schools were civil servants, drawn from the existing pool of government teachers.

We study the impact of this program by randomly assigning existing public schools to be managed by a private provider. We paired schools (based on infrastructure and geography), then assigned pairs to providers, and subsequently randomly assigned treatment within each matched pair. Thus, we are able to estimate both the average impact of the PSL program as well as treatment effects across providers. Since treatment assignment may change the student composition across schools, we sampled students from pretreatment enrollment records. We associate each student with their "original" school, regardless of what school (if any) they attend in later years. The combination of random assignment of treatment at the school level with sampling from a fixed and comparable pool of students allows us to provide clean estimates of the program's intention-to-treat (ITT) effect on test scores, uncontaminated by selection effects.
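The within-pair assignment and the "original school" comparison described above can be illustrated with a minimal sketch. This is not the authors' code: the school names, the seed, and the helper functions `assign_within_pairs` and `itt_difference` are hypothetical, and the simple mean of within-pair differences stands in for whatever regression specification the paper actually uses.

```python
import random


def assign_within_pairs(pairs, seed=0):
    """Randomly pick exactly one school in each matched pair for treatment."""
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    assignment = {}
    for school_a, school_b in pairs:
        treated = rng.choice((school_a, school_b))
        control = school_b if treated == school_a else school_a
        assignment[treated] = "treatment"
        assignment[control] = "control"
    return assignment


def itt_difference(mean_score, pairs, assignment):
    """Average within-pair gap in mean scores, treated minus control.

    mean_score maps each ORIGINAL school (the school a sampled student was
    enrolled in before treatment) to the mean score of its sampled students,
    wherever those students are enrolled now.
    """
    diffs = []
    for school_a, school_b in pairs:
        if assignment[school_a] == "treatment":
            treated, control = school_a, school_b
        else:
            treated, control = school_b, school_a
        diffs.append(mean_score[treated] - mean_score[control])
    return sum(diffs) / len(diffs)


# Hypothetical matched pairs (pairing itself is taken as given here).
pairs = [("school_1", "school_2"), ("school_3", "school_4"), ("school_5", "school_6")]
assignment = assign_within_pairs(pairs, seed=42)
```

Because students are attached to their pretreatment school, any movement of students between schools after assignment changes where they sit but not which arm they are counted in.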

The ITT effect on student test scores after one year of the program is  $0.18\sigma$  for English and  $0.18\sigma$  for mathematics. These gains do not reflect teaching to the test, as they are also seen in new questions administered only at the end of the school year and in questions with a new format. Taking into account that some providers refused to work in some schools randomly assigned to them and some students moved schools, the treatment effect on the treated (ToT) after one year of the program is  $0.21\sigma$  for English test scores and  $0.22\sigma$  for mathematics. We find no evidence of heterogeneity by students' socioeconomic status, gender, or grade, suggesting that efficiency gains need not come at the expense of equity concerns. There is also no evidence that providers engaged in student selection: the probability of remaining in a treatment school is unrelated to age, gender, household wealth, or disability.
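As a rough check on how the two estimates relate, the ITT can be rescaled by the difference in PSL attendance between assignment groups reported in footnote 1 (81 percent versus 0 percent). This assumes a simple Wald-type ratio with no covariates; the paper's ToT comes from an instrumental-variables estimate, so the calculation below is only an approximation:

$$
\text{ToT} \approx \frac{\text{ITT}}{\Pr(\text{in PSL} \mid \text{assigned to treatment}) - \Pr(\text{in PSL} \mid \text{assigned to control})} = \frac{0.18\sigma}{0.81 - 0} \approx 0.22\sigma
$$

This reproduces the reported mathematics estimate. The reported English estimate of $0.21\sigma$ differs slightly, plausibly because the ITT figures quoted in the text are rounded.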

These gains in test scores reflect a combination of additional inputs and improved management. As a lower bound, the program spent an additional US\$50 per pupil, which was the government's budget target for PSL and the transfer made to operators. While some operators spent more than this, others reported spending near this amount. When the cost of additional teachers is included, the cost rises to approximately US\$70 per student, and when the actual cost reported by providers for the first year is included, the average increases to US\$238 (see Section IA for details). The program also increased management quality, as proxied by teacher time on task. Teachers in PSL schools were 50 percent more likely to be in school during a spot check (a 20 percentage point increase, from a base of 40 percent) and 43 percent more likely to be engaged in instruction during class time (a 15 percentage point increase, from a base of 35 percent). Teacher attendance and time on task improved for incumbent teachers, which we interpret as evidence of better management.

<span id="page-1-0"></span><sup>1</sup>Consistent with the design of the experiment, we focus on the ITT effect. The ToT is estimated using the assigned treatment as an instrument for whether the student is in fact enrolled in a PSL school during the 2016-2017 academic year. The percentage of students originally assigned to treatment schools who are actually in treatment schools at the end of the 2016-2017 school year is 81 percent. The percentage of students assigned to control schools who are in treatment schools at the end of the 2016-2017 school year is 0 percent.

Since each provider was assigned schools in a matched-pair design, we are able to estimate (internally valid) treatment effects for each provider. While the assignment of treatment within matched pairs was random, the assignment of pairs to providers was not, resulting in nonrandom differences in schools and locations across providers. Therefore, the raw treatment effects for each individual provider are internally valid but they are not comparable without further assumptions (see Section III for more details). In the online Appendix, we also present treatment effects adjusting for baseline differences and "shrinking" the estimates using a Bayesian hierarchical model, with qualitatively similar results. While the highest-performing providers generated increases in learning of over  $0.36\sigma$ , the lowest-performing providers had no impact on learning. The group of highest-performing providers includes both the highest spender and some of the lowest-cost organizations. These results suggest that higher spending by itself is neither necessary nor sufficient for improving learning outcomes.<sup>2</sup>

Turning to whether PSL is a good use of scarce funds, we make two comparisons: a comparative cost-effectiveness calculation comparing PSL to business-as-usual expansion of Liberia's public school system, and a cost-benefit calculation based on the net present value of the Mincerian earnings returns to the education provided by PSL. Both calculations require strong assumptions (Dhaliwal et al. 2014), which we discuss in Section IV. While some providers incurred larger costs in the first year, assuming all providers will eventually reach the budget target of US\$50 per pupil implies that the program can increase test scores for treated students by  $0.44\sigma$  per US\$100 spent. We estimate this yields a positive net present value for the program investment after considering the income gains associated with schooling, and is more cost-effective than additional spending under business-as-usual.
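The headline cost-effectiveness figure appears to follow from simple arithmetic: the mathematics ToT of $0.22\sigma$ divided by the US\$50 budget target and scaled to US\$100. The sketch below reproduces that number and, purely for illustration, applies the same effect size to the two higher per-pupil costs reported earlier (US\$70 and US\$238). Holding the effect fixed across cost scenarios is an assumption of this sketch, not a claim made in the paper, and the function name is hypothetical.

```python
def sd_gain_per_100_usd(effect_sd, cost_per_pupil_usd):
    """Test-score gain (in standard deviations) per US$100 spent per pupil."""
    if cost_per_pupil_usd <= 0:
        raise ValueError("cost per pupil must be positive")
    return 100.0 * effect_sd / cost_per_pupil_usd


# ToT estimate for mathematics reported in the text.
TOT_EFFECT_SD = 0.22

# Per-pupil cost figures quoted in the text (US$); labels are descriptive only.
COST_SCENARIOS_USD = {
    "budget target": 50,
    "including additional teachers": 70,
    "average first-year cost reported by providers": 238,
}

for label, cost in COST_SCENARIOS_USD.items():
    gain = sd_gain_per_100_usd(TOT_EFFECT_SD, cost)
    print(f"{label}: {gain:.2f} SD per US$100")
```

The comparison makes clear how sensitive the cost-effectiveness conclusion is to whether providers actually converge to the budget target.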

However, test score gains and expenditures fail to tell the entire story of the consequences of this public-private partnership. Some providers took unforeseen actions that may be socially undesirable. While the contract did not allow cream-skimming, it did not prohibit providers from capping enrollment in oversubscribed schools or from shifting underperforming teachers to other schools.<sup>3</sup> While most providers kept students in oversubscribed schools and retained existing teachers, one provider did not. This provider, Bridge International Academies, removed pupils after taking control of schools with large class sizes, and removed 74 percent of incumbent teachers from its schools.

<span id="page-2-1"></span><span id="page-2-0"></span><sup>2</sup> See Hanushek and Woessmann (2016) for a review on how school resources affect academic achievement.

<sup>3</sup>In principle, removing underperforming teachers could be positive for the school system. In practice, dismissed teachers ended up either teaching at other public schools or receiving pay without work (as firing public teachers was almost impossible). Reshuffling teachers is unlikely to raise average performance in the system as a whole, and Liberia already has a tight budget and short supply of teachers (the literacy rate is below 50 percent). Similarly, reducing class sizes may be good policy, but shifting students from PSL schools to other schools is unsustainable and may lead us to overstate the scalable impact of the program. While the experiment was designed to overcome any bias from student reallocation and we can track teacher reallocations, it is not designed to measure negative spillovers.

More worryingly, news media have revealed serious sexual abuse scandals involving two of the private providers, one of them a US-based nonprofit that was well regarded by the international community. Over the course of multiple years prior to the launch of the program and this study, a More than Me employee, who died of AIDS in 2016, raped over 30 girls in a More than Me school.<sup>4,5</sup> In 2016, the Board Chair of the Liberian Youth Network (the previous name for the Youth Movement for Collective Action) was found guilty of raping a teenage boy.<sup>6</sup> It is possible that similar scandals take place in regular schools but that these were uncovered due to the heightened scrutiny of the public-private partnership. But at a minimum it shows that private providers are far from an obvious solution to sexual violence issues in public schools.

Some of these issues could arguably have been solved with more complete contracts or better partner selection. The first year was a pilot and a learning year, and the government deliberately tried to select "mission aligned" contractors and left the contracts quite open. However, some of the providers engaged in the worst behavior were considered some of the most promising. These events underscore the challenge of ensuring that private providers act in the public interest in a world of incomplete contracts. Thus, our results suggest that outsourcing has some promising features, but also presents its own set of difficulties.

We make several contributions to both research and policy. Proponents of outsourcing in education argue that combining public finance with private management has the potential to overcome a trade-off between efficiency and equity (Patrinos, Barrera-Osorio, and Guáqueta 2009). On the efficiency side, private schools tend to be better managed than their public counterparts (Bloom et al. 2015, Muralidharan and Sundararaman 2015). On the equity side, fee-charging private schools may increase inequality and induce socioeconomic stratification in education (Hsieh and Urquiola 2006, Lucas and Mbiti 2012, Zhang 2014). Thus, in theory, publicly-financed but privately-managed schools may increase efficiency without compromising equity. Most of the empirical evidence to date on outsourcing education comes from the United States, where charter schools appear to improve learning outcomes when held accountable by a strong commissioning body (Cremata et al. 2013, Woodworth et al. 2017). However, there is limited evidence on whether private administration of public schools can improve learning outcomes in developing countries, where governments tend to have limited capacity to write complete contracts and enforce them. Two noteworthy studies which examine close analogs to PSL in the United States are Abdulkadiroğlu et al. (2016), which studies charter takeovers (where traditional public schools are restarted as charter schools, similar to our setting) in Boston and New Orleans; and Fryer (2014), which studies the implementation of a bundle of best practices from high-performing charter schools into low-performing, traditional public schools in Houston, Texas. In line with our results, both studies find increases in test scores. We provide some of the first experimental estimates on contracting out management of existing public schools in a developing country.<sup>7</sup>

<span id="page-3-0"></span><sup>4</sup> Finlay Young, "Unprotected," *ProPublica*, October 11, 2018 (https://features.propublica.org/liberia/unprotected-more-than-me-katie-meyler-liberia-sexual-exploitation/).

<span id="page-3-1"></span><sup>5</sup>Note that while these incidents occurred prior to the launch of the program, they were revealed in full only after the program launched, which enabled More than Me to dramatically expand its operations. The exhaustive investigation by Young (ibid.) exposes two wrongs. One is the systematic rape of Liberian children. The other is the refusal of More than Me's leadership to accept responsibility, and their (successful) efforts to conceal the case from public scrutiny.

<span id="page-3-2"></span><sup>6</sup> Akoi M. Baysah, "Liberia: Police Charge Youth Activist for Sodomy," *The New Republic Liberia*, November 2, 2016 (web.archive.org/web/20161103182507/https://allafrica.com/stories/201611020824.html).

An additional contribution is related to our experimental design and the treatment effects we are able to identify. Most US studies use admission lotteries to overcome endogeneity issues (for a review, see Chabrier, Cohodes, and Oreopoulos 2016; Betts and Tang 2014). But oversubscribed charter schools are different (and likely better) than undersubscribed ones, truncating the distribution of estimated treatment effects (Tuttle, Gleason, and Clark 2012). We provide treatment effects from across the distribution of outsourced schools, and across the distribution of students within a school. Relatedly, relying on school lotteries implies that the treatment estimates capture the joint impact of outsourcing *and* oversubscribed schools' providers. We provide treatment effects across a list of providers, vetted by the government, and show that the provider matters.

Finally, we contribute to the broader literature on outsourcing service delivery. Hart, Shleifer, and Vishny (1997) argues that the bigger the adverse consequences of noncontractible quality shading, the stronger the case for governments to provide services directly. Empirically, in cases where quality is easy to measure and to enforce, such as water services (Galiani, Gertler, and Schargrodsky 2005) or food distribution (Banerjee et al. 2019), outsourcing seems to work. Similarly, for primary health care, where quality is measurable (e.g., immunization and antenatal care coverage), outsourcing improves outcomes in general (Loevinsohn and Harding 2005, Bloom et al. 2007). In contrast, for services whose quality is difficult to measure, such as prisons (Useem and Goldstone 2002; Cabral, Lazzarini, and de Azevedo 2013), outsourcing seems to be detrimental. In contrast to primary health care, there is some evidence that contracting out advanced care (where quality is harder to measure) increases expenditure without increasing quality (Duggan 2004). Some quality aspects of education are easy to measure (e.g., enrollment and basic learning metrics), but others are harder (e.g., socialization and selection). In our setting, while outsourcing management improves most indices of school quality on average, the effect varies across providers. In addition, some providers' actions had negative unintended consequences and may have generated negative spillovers for the broader education system, underscoring the importance of robust contracting and monitoring for this type of program.

<span id="page-4-0"></span><sup>7</sup>For a review on the few existing nonexperimental studies, see Aslam, Rawal, and Saeed (2017). A related paper to ours (Barrera-Osorio et al. 2017) increased the supply of schools through a public-private partnership in Pakistan. However, it is difficult to disentangle the effect of increasing the supply of schools from the effect of privately managed but publicly funded schools.

### I. Research Design

### A. The Program

Context.—The PSL program broke new ground in Liberia by delegating management of government schools and employees to private providers. Nonetheless, private actors, such as NGOs and USAID contractors, are already common in government schools. Over the past decade, Liberia's basic education budget has been roughly US\$40 million per year (about 2–3 percent of GDP), while external donors contribute about US\$30 million. This distinguishes Liberia from most other low-income countries in Africa, which finance the vast bulk of education spending through domestic tax revenue (UNESCO 2016). The Ministry spends roughly 80 percent of its budget on teacher salaries (Ministry of Education–Liberia 2017a), while almost all the aid money bypasses the Ministry, flowing instead through an array of donor contractors and NGO programs covering nonsalary expenditures. For instance, in 2017 USAID tendered a US\$28 million education program to be implemented by a US contractor in public schools over a five-year period (USAID 2017). The net result is that many "public" education services in Liberia, beyond teacher salaries, are provided by non-state actors. On top of that, more than one-half of children enrolled in preschool and primary attend private schools (Ministry of Education–Liberia 2016a).

A second broad feature of Liberia's education system, relevant for the PSL program, is its performance: not only are learning levels low, but access to basic education and progression through school remains inadequate. The Minister of Education has cited the perception that "Liberia's education system is in crisis" as the core justification for the PSL program.<sup>8</sup> While the world has made great progress toward universal primary education in the past three decades (worldwide net enrollment was almost 90 percent in 2015), Liberia has been left behind. Net primary enrollment stood at only 38 percent in 2014 (World Bank 2014). Low *net* enrollment is partially explained by an extraordinary backlog of over-age children due to the civil war (see online Appendix Figure A.1): the median student in early childhood education is 8 years old and over 60 percent of 15-year-olds are still enrolled in early childhood or primary education (Liberia Institute of Statistics and Geo-Information Services 2016). Learning levels are low: only 25 percent of adult women (there is no information for men) who finish elementary school can read a complete sentence (Liberia Institute of Statistics and Geo-Information Services 2014).

*Intervention.*—The Partnership Schools for Liberia (PSL) program is a public-private partnership (PPP) for school *management*. The Government of Liberia contracted multiple nonstate providers to run 93 existing public primary and pre-primary schools. There are nine grades per school: three early childhood education grades (Nursery, K1, and K2) and six primary grades (grade 1 to grade 6).

<span id="page-5-0"></span><sup>8</sup> George K. Werner, "Liberia Has to Work with International Private School Companies If We Want to Protect Our Children's Future," *Quartz Africa*, January 3, 2017 (https://qz.com/876708/why-liberia-is-working-with-bridge-international-brac-and-rising-academies-by-education-minister-george-werner/).

Providers receive funding on a per-pupil basis. In exchange they are responsible for the daily management of the schools.

The government allocated rights to eight providers to manage public schools under the PSL program. The organizations are as follows: Bridge International Academies (23 schools), BRAC (20 schools), Omega Schools (19 schools), Street Child (12 schools), More than Me (6 schools), Rising Academies (5 schools), Youth Movement for Collective Action (4 schools), and Stella Maris (4 schools). See online Appendix A.5 for more details about each organization.

Rather than attempting to write a complete contract specifying private providers' full responsibilities, the government opted instead to select organizations it deemed aligned with its mission of raising learning levels (i.e., "mission-matching" à la Besley and Ghatak 2005, Akerlof and Kranton 2005). After an open and competitive bidding process led by the Ministry of Education with the support of the Ark Education Partnerships Group, the Liberian government selected seven of the eight organizations listed above, of which six passed financial due diligence. Stella Maris did not complete this step and, although included in our sample, was never paid. While Stella Maris never actually took control of their assigned schools, the government still considers them part of the program (e.g., they were allocated more schools in an expansion of the program not studied in this paper (Ministry of Education–Liberia 2017b)). The government made a separate agreement with Bridge International Academies (not based on a competitive tender), but also considers Bridge part of the PSL program.

PSL schools remain public schools and all grades are required to be free of charge and nonselective (i.e., providers are not allowed to charge fees or to discriminate in admissions). In contrast, traditional public schools are not free for all grades. Public primary education is nominally free starting in Grade 1, but tuition for early childhood education in traditional public schools is stipulated at LBD 3,500 per year (about US\$38).

Teachers in PSL schools are civil servants, drawn from the existing pool of government teachers.<sup>9</sup> The Ministry of Education's financial obligation to PSL schools is the same as all government-run schools: it provides teachers and maintenance, valued at about US\$50 per student. A noteworthy feature of PSL is that providers receive *additional* funding of US\$50 per student (with a maximum of US\$3,250 or 65 students per grade). Donors paid for the transfers made to providers in the first year. Donor money was attached to the PSL program and would not have been available to the government otherwise. Neither Bridge International Academies nor Stella Maris received the extra US\$50 per pupil. As mentioned above, Stella Maris did not complete financial due diligence. Bridge International Academies had a separate agreement with the Ministry of Education and relied entirely on direct grants from donors. Providers have complete autonomy over the use of these funds (e.g., they can be used for teacher training, school inputs, or management personnel). On top of that, providers may raise more funds on their own.
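The per-pupil transfer rule can be made concrete with a small sketch. It assumes the cap applies grade by grade (US\$50 for each of at most 65 pupils, hence at most US\$3,250 per grade), which is one reading of the text; the enrollment figures and the function name are hypothetical, and the rule does not apply to Bridge International Academies or Stella Maris, which received no transfer.

```python
PER_PUPIL_USD = 50          # additional funding per pupil stated in the text
CAP_PUPILS_PER_GRADE = 65   # pupils per grade beyond which no extra funding accrues


def provider_transfer_usd(enrollment_by_grade):
    """Total additional transfer for one school, capping funded pupils per grade."""
    total = 0
    for grade, pupils in enrollment_by_grade.items():
        if pupils < 0:
            raise ValueError(f"negative enrollment for {grade}")
        funded_pupils = min(pupils, CAP_PUPILS_PER_GRADE)  # assumed per-grade cap
        total += PER_PUPIL_USD * funded_pupils
    return total


# Hypothetical enrollment: one oversubscribed grade and one small grade.
example_school = {"grade 1": 80, "grade 2": 40}
example_transfer = provider_transfer_usd(example_school)  # 3,250 + 2,000 = 5,250
```

Under this reading, enrolling pupils beyond 65 in a grade brings no additional transfer, which is relevant to providers' incentives to cap class sizes.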

<span id="page-6-0"></span><sup>9</sup> Providers may spend funds hiring more teachers (or other school staff). Thus, it is possible that some of the teachers in PSL schools are not civil servants. However, this rarely occurred. Only 8 percent of teachers in PSL schools were paid by providers at the end of the school year. Informal interviews with providers indicate that in

<span id="page-7-0"></span>

|                                       | Control schools | PSL treatment schools                                                                   |
|---------------------------------------|-----------------|-----------------------------------------------------------------------------------------|
| Management                            |                 |                                                                                         |
| Who owns school building?             | Government      | Government                                                                              |
| Who employs and pays teachers?        | Government      | Government                                                                              |
| Who manages the school and teachers?  | Government      | Provider                                                                                |
| Who sets curriculum?                  | Government      | *Government + provider supplement*                                                      |
| Funding                               |                 |                                                                                         |
| Primary user fees (annual US\$)       | Zero            | Zero                                                                                    |
| ECE user fees (annual US\$)           | US\$38          | Zero                                                                                    |
| Extra funding per pupil (annual US\$) | NA              | *US\$50<sup>a</sup> + independent fund-raising*                                         |
| Staffing                              |                 |                                                                                         |
| Pupil-teacher ratios                  | NA              | Promised one teacher per grade, allowed to cap class sizes at 45–65 pupils<sup>b</sup> |
| New teacher hiring                    | NA              | First pick of new teacher-training graduates<sup>c</sup>                                |

TABLE 1—POLICY DIFFERENCES BETWEEN TREATMENT AND CONTROL SCHOOLS

Providers must teach the Liberian national curriculum, but may supplement it with remedial programs, prioritization of subjects, longer school days, and nonacademic activities. They are welcome to provide more inputs such as extra teachers, books, or uniforms, as long as they pay for them.

The intended differences between treated (PSL) and control (traditional public) schools are summarized in Table 1. First, PSL schools are managed by private organizations. Second, PSL schools are theoretically guaranteed (as per the contract) one teacher per grade in each school, plus extra funding. Third, private providers are authorized to cap class sizes. Finally, while both PSL and traditional public schools are free for primary students starting in first grade, public schools charge early-childhood education (ECE) fees.

What Do Providers Do?—Providers enjoy considerable flexibility in defining the intervention. They are free to choose their preferred mix of, say, new teaching materials, teacher training, and managerial oversight of the schools' day-to-day operations. Rather than relying on providers' own description of their model, where there may be incentives to exaggerate and activities may be defined in noncomparable ways across providers, we administered a survey module to teachers in treatment schools, asking if they had heard of the provider, and if so, what activities the provider had engaged in. We summarize teachers' responses in Figure 1, which shows considerable variation in the specific activities and the total activity level of providers.

For instance, teachers reported that two providers (Omega and Bridge) provided computers to schools, which fits with the stated approach of these two international, for-profit firms. Other providers, such as BRAC and Street Child, put more focus on teacher training and observing teachers in the classroom, though these

<sup>a</sup>Neither Bridge International Academies nor Stella Maris received the extra US\$50 per pupil.

<sup>b</sup>Bridge International Academies was authorized to cap class sizes at 55 (but in practice capped them at 45 in most cases as this was allowed by the MOU), while other providers were authorized to cap class sizes at 65.

<sup>c</sup>Bridge International Academies had first pick, before other providers, of the new teacher-training graduates.

<span id="page-8-0"></span>

|                   |                                                     | Stella M | YMCA | Omega | BRAC | Bridge | Rising | St. Child | MtM |
|-------------------|-----------------------------------------------------|----------|------|-------|------|--------|--------|-----------|-----|
| Provider support  | Provider staff visits at least once a week (%)      | 0        | 54   | 13    | 93   | 76     | 94     | 91        | 96  |
|                   | Heard of PSL (%)                                    | 42       | 85   | 61    | 42   | 87     | 90     | 68        | 85  |
|                   | Heard of (provider) (%)                             | 46       | 96   | 100   | 95   | 100    | 100    | 100       | 100 |
|                   | Has anyone from (provider) been to this school? (%) | 42       | 88   | 100   | 94   | 100    | 100    | 99        | 100 |
| Ever provided     | Textbooks (%)                                       | 12       | 96   | 73    | 94   | 99     | 71     | 94        | 96  |
|                   | Teacher training (%)                                | 0        | 77   | 62    | 85   | 87     | 97     | 93        | 96  |
|                   | Teacher received training since Aug 2016 (%)        | 23       | 46   | 58    | 45   | 50     | 81     | 58        | 37  |
|                   | Teacher guides (or teacher manuals) (%)             | 0        | 69   | 75    | 54   | 97     | 94     | 68        | 98  |
|                   | School repairs (%)                                  | 0        | 12   | 25    | 24   | 53     | 52     | 13        | 93  |
|                   | Paper (%)                                           | 0        | 92   | 30    | 86   | 70     | 97     | 88        | 98  |
|                   | Organization of community meetings (%)              | 0        | 54   | 27    | 69   | 73     | 87     | 83        | 91  |
|                   | Food programs (%)                                   | 0        | 8    | 2     | 1    | 1      | 10     | 0         | 17  |
|                   | Copybooks (%)                                       | 4        | 65   | 30    | 92   | 18     | 97     | 94        | 91  |
|                   | Computers, tablets, electronics (%)                 | 0        | 0    | 94    | 0    | 99     | 3      | 3         | 2   |
| Most recent visit | Provide/deliver educational materials (%)           | 0        | 4    | 45    | 17   | 18     | 26     | 29        | 50  |
|                   | Observe teaching practices and give suggestions (%) | 0        | 19   | 45    | 81   | 65     | 45     | 74        | 85  |
|                   | Monitor/observe PSL program (%)                     | 0        | 12   | 23    | 11   | 13     | 13     | 35        | 65  |
|                   | Monitor other school-based government programs (%)  | 0        | 0    | 7     | 5    | 10     | 6      | 18        | 9   |
|                   | Monitor health/sanitation issues (%)                | 0        | 8    | 9     | 2    | 5      | 0      | 10        | 28  |
|                   | Meet with PTA committee (%)                         | 0        | 12   | 8     | 10   | 7      | 0      | 21        | 41  |
|                   | Meet with principal (%)                             | 0        | 12   | 54    | 36   | 38     | 6      | 51        | 63  |
|                   | Deliver information (%)                             | 0        | 12   | 36    | 16   | 8      | 6      | 16        | 35  |
|                   | Check attendance and collect records (%)            | 42       | 23   | 43    | 56   | 39     | 19     | 66        | 70  |
|                   | Ask students questions to test learning (%)         | 4        | 4    | 24    | 33   | 18     | 58     | 44        | 43  |

FIGURE 1. WHAT DID PROVIDERS DO?

*Notes:* The figure reports simple proportions (not treatment effects) of teachers surveyed in PSL schools who reported whether the provider responsible for their school had engaged in each of the activities listed. The sample size, n, of teachers interviewed with respect to each provider is: Stella Maris, 26; Omega, 141; YMCA, 26; BRAC, 170; Bridge, 157; Street Child, 80; Rising Academy, 31; More than Me, 46. This sample only includes compliant treatment schools.

differences were not dramatic. In general, providers such as More than Me and Rising Academies showed high activity levels across dimensions, while teacher surveys confirmed administrative reports that Stella Maris conducted almost no activities in its assigned schools.

Cost Data and Assumptions.—The government designed the PSL program based on the estimate that it spends roughly US\$50 per child in all public schools (mostly on teacher salaries), and it planned to continue to do so in PSL schools. <sup>10</sup> As shown in Section II, PSL led to reallocation of additional teaching staff to treatment schools and reduced pupil-teacher ratios in treatment schools, raising the Ministry's per-pupil cost to close to US\$70. On top of this, providers were offered a US\$50 per-pupil payment to cover their costs. As noted above, neither Bridge International Academies nor Stella Maris received the extra US\$50 per pupil. This cost figure

<span id="page-8-1"></span><sup>10</sup>Werner, "Liberia Has to Work with International Private School Companies If We Want to Protect Our Children's Future," Quartz Africa.

<span id="page-9-3"></span>FIGURE 2. BUDGET AND COSTS AS REPORTED BY PROVIDERS

*Notes:* Numbers in panel A are based on ex ante budgets submitted to the program secretariat in a uniform template (inclusive of both fixed and variable costs). Stella Maris did not provide budget data. Numbers in panel B are based on self-reports on ex post expenditures (inclusive of both fixed and variable costs) submitted to the evaluation team by five providers in various formats. Numbers do not include the cost of teaching staff borne by the Ministry of Education.

was chosen because US\$100 was deemed a realistic medium-term goal for public expenditure on primary education nationwide.<sup>11</sup>

In the first year, some providers spent far more than this amount. Ex ante per-pupil budgets submitted to the program secretariat before the school year started (on top of the Ministry's costs) ranged from a low of approximately US\$57 for Youth Movement for Collective Action to a high of US\$1,050 for Bridge International Academies (see panel A of Figure 2). Ex post per-pupil expenditure submitted to the evaluation team at the end of the school year (on top of the Ministry's costs) ranged from a low of approximately US\$48 for Street Child to a high of US\$663 for Bridge International Academies (see panel B of Figure 2). These differences in costs are large relative to differences in treatment effects on learning, implying that cost-effectiveness may be driven largely by cost assumptions.

In principle, the costs incurred by private providers would be irrelevant for policy evaluation in a public-private partnership with this structure. If the providers are willing to make an agreement in which the government pays US\$50 per pupil, providers' losses are inconsequential to the government (philanthropic donors have stepped in to fund some providers' high costs under PSL).<sup>12</sup> Thus, we present analyses using both the Ministry's US\$50 long-term cost target and providers' actual budgets.<sup>13</sup>

<span id="page-9-1"></span><span id="page-9-0"></span><sup>11</sup>Ibid.

<sup>12</sup>These costs matter to the government under at least two scenarios. First, if providers are spending more during the first years of the program to prove effectiveness, they may lower expenditure (and quality) once they have locked in long-term contracts. Second, if private providers are not financially sustainable, they may close schools and disrupt student learning.

<span id="page-9-2"></span><sup>13</sup>While some providers relied almost exclusively on the US\$50 per child subsidy from the PSL pool fund, others have raised more money from donors. Bridge International Academies relied entirely on direct grants from donors and opted not to take part in the competitive bidding process for the US\$50 per-pupil subsidy which closed in June 2016. Bridge did subsequently submit an application for this funding in January 2017, which was not approved, but allows us access to their budget data.

Providers' budgets for the first year of the program are likely a naïve measure of program cost, as they combine start-up costs, fixed costs, and variable costs. It is possible to distinguish start-up costs from other costs as shown in Figure 2, and these make up a small share of the first-year totals for most providers. It is not possible to distinguish fixed from variable costs in the budget data. In informal interviews, some providers (e.g., Street Child) profess to operate a variable-cost model, implying that each additional school costs roughly the same amount to operate. Others (e.g., Bridge) report that their costs are almost entirely fixed and that unit costs would fall at scale; however, we have no direct evidence of this. Our estimate is that Bridge's international operating cost, at scale, is between US\$191 and US\$220 per pupil annually.<sup>14</sup>
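The US\$191 to US\$220 range can be reproduced from the figures quoted in footnote 14. The sketch below is one reading of that arithmetic, not a calculation taken from the paper: it assumes that fee revenue is collected only from pupils outside PPP schools and then averaged over all pupils, and that the US\$12 million in supplementary funding is spread evenly over total enrollment. The constant and function names are illustrative.

```python
# Figures quoted in footnote 14; the way they are combined here is an assumption.
TOTAL_PUPILS = 100_000          # approximate global enrollment (private + PPP)
PPP_PUPILS = 9_000              # roughly, pupils in PPP schools who pay no fees
FEE_RANGE_USD = (78, 110)       # annual fee per fee-paying pupil
EXTRA_FUNDING_USD = 12_000_000  # "more than US$12 million" beyond fee revenue


def per_pupil_cost(fee_usd: float) -> float:
    """Fee revenue averaged over all pupils, plus non-fee funding per pupil."""
    fee_paying = TOTAL_PUPILS - PPP_PUPILS
    fee_revenue_per_pupil = fee_usd * fee_paying / TOTAL_PUPILS
    extra_per_pupil = EXTRA_FUNDING_USD / TOTAL_PUPILS  # US$120 per pupil
    return fee_revenue_per_pupil + extra_per_pupil


low, high = (per_pupil_cost(fee) for fee in FEE_RANGE_USD)
print(round(low), round(high))  # prints: 191 220
```

Under these assumptions the lower bound is 78 × 0.91 + 120 ≈ 191 and the upper bound is 110 × 0.91 + 120 ≈ 220, matching the range stated in the text.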

# B. Experimental Design

Sampling and Random Assignment.—Liberia has 2,619 public primary schools. Private providers and the government agreed that potential PSL schools should have at least six classrooms and six teachers, good road access, a single shift, and should not contain a secondary school on their premises. A few schools were added to the list at the request of Bridge International Academies. Some of these schools had double shifts. Only 299 schools satisfied all the criteria, although some of these are "soft" constraints that can be addressed if the program expands. For example, the government can build more classrooms and add more teachers to the school staff. On average, schools in the experiment are closer to the capital (Monrovia), have more students, greater resources, and better infrastructure. While schools in the RCT generally have better facilities and infrastructure than most schools in the country, they still have deficiencies. For example, the average school in Liberia has 1.8 permanent classrooms, and the median school has 0 permanent classrooms, while the average school in the RCT has 3.16 classrooms. Panel A of Figure 3 shows all public schools in Liberia and those within our sample. Online Appendix Table A.1 has details on the differences between schools in the experiment and other public schools.

Two providers (Omega Schools and Bridge International Academies) required schools with 2G connectivity. Each provider submitted to the government a list of the regions in which they were willing to work (Bridge International Academies had first pick of schools). Based on preferences and requirements, the list of eligible schools was partitioned across providers. We paired schools in the experiment sample within each district according to a principal component analysis (PCA) index of school resources.<sup>15</sup> This pairing stratified treatment by school resources within each

<span id="page-10-1"></span><sup>15</sup>We calculated the index using the first eigenvector of a principal component analysis that included the following variables: students per teacher; students per classroom; students per chair; students per desk; students per bench; students per chalkboard; students per book; whether the school has a permanent building; whether the school has piped water, a pump, or a well; whether the school has a toilet; whether the school has a staff room; whether the school has a generator; and the number of enrolled students.

<span id="page-10-0"></span><sup>14</sup>In written testimony to the UK House of Commons, Bridge stated that its fees were between US\$78 and US\$110 per annum in private schools, and that it had approximately 100,000 students in both private and PPP schools (Bridge International Academies 2017, Kwauk and Robinson 2016). Of these, roughly 9,000 are in PPP schools and pay no fees. In sworn oral testimony, cofounder Shannon May stated that Bridge had supplemented its fee revenue with more than US\$12 million in the previous year (May 2017). This is equal to an additional US\$120 per pupil, and implies Bridge spends between US\$191 and US\$220 per pupil at its current global scale.

<span id="page-11-0"></span>Panel A. Geographical distribution of all public schools in Liberia and those within the RCT

FIGURE 3. PUBLIC PRIMARY SCHOOLS IN LIBERIA

*Notes*: Data on school location are from Ministry of Education–Liberia (2015-2016) data. Geographical information on the administrative areas of Liberia comes from DIVA-GIS (2016).

private provider, but not across providers. We gave a list of pairs to each provider based on their location preferences and requirements, so that each list had twice the number of schools they were to operate. Once each provider approved this list, we randomized the treatment assignment within each pair. There is one triplet due to logistical constraints in the assignment across counties, which resulted in one extra treatment school. In short, schools are assigned to a provider, then paired, and then randomly assigned to treatment or control.
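The assign-pair-randomize sequence described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the text says only that schools were paired within district on the PCA resource index, so pairing adjacent schools in the sorted ranking is an assumption, as are the function name, the dictionary keys, and the handling of an unpaired leftover school (simply skipped here, whereas the study formed one triplet).

```python
import random
from collections import defaultdict


def pair_and_assign(schools, seed=0):
    """Pair schools on a resource index and randomize treatment within pairs.

    schools: list of dicts with keys 'id', 'provider', 'district', 'index'.
    Returns {school_id: (pair_label, treated)}.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for s in schools:
        strata[(s["provider"], s["district"])].append(s)

    assignment = {}
    for key in sorted(strata):
        ranked = sorted(strata[key], key=lambda s: s["index"])
        # Pair neighbours in the resource ranking (assumed rule);
        # a leftover school in an odd-sized stratum is skipped.
        for k in range(0, len(ranked) - 1, 2):
            pair = ranked[k:k + 2]
            treated_id = rng.choice(pair)["id"]
            label = f"{key[0]}-{key[1]}-{k // 2}"
            for s in pair:
                assignment[s["id"]] = (label, s["id"] == treated_id)
    return assignment
```

Because exactly one school per pair is treated, the pair label can later serve as the stratification fixed effect in the estimating equations.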

Private providers did not manage all the schools originally assigned to treatment and we treat these schools as noncompliant, presenting results in an intention-to-treat framework. After providers visited their assigned schools to start preparing for the upcoming school year, two treatment schools turned out to be private schools that were incorrectly labeled in the government data as public schools. Two other schools had only two classrooms each. Of these four schools, two had originally been assigned to More Than Me and two had been assigned to Street Child. Omega Academies opted not to operate two of their assigned schools and Rising Academies opted not to operate one of their assigned schools. In total, there are seven noncompliant treatment schools. Panel B of Figure 3 shows the treatment assignment.

<span id="page-11-1"></span><sup>16</sup>More than Me and Street Child were provided with replacement schools, presenting them with a new list of counterparts and informing them, as before, that they would operate one of each pair of schools (but not which one). Providers approved the list before we randomly assigned replacement schools from it. However, we do not use this list as our main sample since it is not fully experimental. We analyzed results for this "final" treatment and control school list, and they are almost identical to the results for the "original" list. Results for this final list of treatment and control schools are available upon request. Bridge International Academies is managing two extra demonstration schools that were not randomized and are not part of our sample. Rising Academies was given one nonrandomly assigned school, which is not part of our sample either. Thus, the set of schools in our analysis is

Treatment assignment may change the student composition across schools. To prevent differences in the composition of students from driving differences in test scores, we sampled 20 students per school (from K1 to grade 5) from enrollment logs from 2015-2016, the year before the treatment was introduced. We associate each student with his or her "original" school, regardless of what school (if any) he or she attended in subsequent years. The combination of random treatment assignment at the school level with measuring outcomes of a fixed and comparable pool of students allows us to provide clean estimates of the program's intention-to-treat (ITT) effect on test scores within the student population originally attending study schools, uncontaminated by selection.

Time Line of Research and Intervention Activities.—We collected data in schools twice: at the beginning of the school year in September-October 2016 and at the end of the school year in May-June 2017.<sup>17</sup> We collected the first round of data two to eight weeks after the beginning of treatment. While we intended the first survey wave to serve as a baseline, logistical delays led it to take place shortly after the beginning of the school year. We see evidence of treatment effects within this 1–2-month time frame and treat this early wave as a very short-term outcome survey. Hence, we do not control for test scores collected during the first wave of data collection.<sup>18</sup> We focus on time-invariant covariates and administrative data collected before the program began when checking balance between treatment and control schools (see Section IB).

Test Design.—In our sample, literacy cannot be assumed at any grade level, precluding the possibility of written tests. In addition, tests administered by schools would be contaminated by shifts in enrollment and attendance due to treatment. We opted to conduct one-on-one tests in which an enumerator sits with the student (either at school or at home), asks questions, and records answers. For the math part of the test we provided students with scratch paper and a pencil. We designed the tests to capture a wide range of student abilities. To make the test scores comparable across grades, we constructed a single adaptive test for all students. The test has stop rules that skip higher-order skills if the student is not able to answer questions related to more basic skills. Online Appendix Section A.3 has details on the construction of the test.

We estimate an item response theory (IRT) model for each round of data collection. IRT models are the standard in the assessments literature for generating

not identical to the set of schools actually managed by PSL providers. Online Appendix Table A.2 summarizes the overlap between schools in our main sample and the set of schools actually managed by PSL providers.

<span id="page-12-0"></span><sup>17</sup>A third round of data collection took place in March-April 2019 (see online Appendix Figure A.2 for a detailed time line of intervention and research activities).

<span id="page-12-1"></span><sup>18</sup>Our pre-analysis plan was written on the assumption that we would be able to collect baseline data (Romero, Sandefur, and Sandholtz 2017). Hence, the pre-analysis plan includes a specification that controls for test scores collected during the first wave of data collection along with the main specifications used in this paper. We report these results in online Appendix Table A.4. We view the differences in short-term outcomes as treatment effects rather than "chance bias" in randomization for the following reasons. First, time-invariant student characteristics are balanced across treatment and control (see Table 2). Second, the effects on English and math test scores appear to materialize in the later weeks of the fieldwork, as shown in online Appendix Figure A.3. Third, there is no significant effect on abstract reasoning, which is arguably less amenable to short-term improvements through teaching (although the difference between a significant English-math effect and an insignificant abstract reasoning effect here is not itself significant).

comparative test scores.<sup>19</sup> There are two relevant characteristics of IRT models in this setting. First, they simultaneously estimate the test taker's ability and the difficulty of the questions, which allows the contribution of "correct answers" to the ability measure to vary from question to question. Second, they provide a comparable measure of student ability across different grades and survey rounds, even if the question overlap is imperfect. A common scale across grades allows us to estimate treatment effects as additional years of schooling. Following standard practice, we normalize the IRT scores with respect to the control group.
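As a concrete illustration of the two ideas in this paragraph, the sketch below shows an item response function and the normalization step. The two-parameter logistic form is one common IRT specification and is an assumption here, since this section does not state which variant was estimated; the function names and the use of the sample standard deviation are likewise illustrative choices.

```python
import math
import statistics


def prob_correct(ability, difficulty, discrimination=1.0):
    """Two-parameter logistic item response function (assumed functional form)."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


def standardize_to_control(scores, is_control):
    """Rescale scores so the control group has mean 0 and standard deviation 1."""
    control = [s for s, c in zip(scores, is_control) if c]
    mu = statistics.mean(control)
    sd = statistics.stdev(control)  # sample SD; the paper's convention is not stated here
    return [(s - mu) / sd for s in scores]
```

After this rescaling, a treatment-control difference of 0.18 reads directly as 0.18 control-group standard deviations.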

Additional Data.—We surveyed all the teachers in each school and conducted in-depth surveys with those teaching math and English. We asked teachers about their time use and teaching strategies. We also obtained teacher opinions on the PSL program. For a randomly selected class within each school, we conducted a classroom observation using the Stallings Classroom Observation Tool (World Bank 2015a). Furthermore, we conducted school-level surveys to collect information about school facilities, the teacher roster, input availability (e.g., textbooks), and expenditures.

Enumerators collected information on some school practices. Specifically, enumerators recorded whether the school has an enrollment log and what information it stores; whether the school has an official time table and whether it is posted; whether the school has a parent-teacher association (PTA) and if the principal knows the PTA head's contact information (or where to find it); and whether the school has a written budget and keeps a record (and receipts) of past expenditures. Additionally, we asked principals to complete two commonly used human resource instruments to measure their "intuitive score" (Agor 1989) and "time management profile" (Schermerhorn et al. 2011).

For the second wave of data collection, we surveyed a random subset of households from our student sample, recording household characteristics and attitudes of household members. We also gathered data on school enrollment and learning levels for all children 4–8 years old living in these households.

Balance and Attrition.—As mentioned above, the first wave of data was collected 2 to 8 weeks after the beginning of treatment. Hence, we focus on time-invariant characteristics when checking balance across treatment and control. Observable (time-invariant) characteristics of students and schools are balanced across treatment and control (see Table 2). Eighty percent of schools in our sample are in rural areas, over an hour away from the nearest bank (which is usually located in the nearest

<span id="page-13-0"></span><sup>19</sup>For example, IRT models are used to estimate students' ability in the Graduate Record Examinations (GRE), the Scholastic Assessment Test (SAT), the Program for International Student Assessment (PISA), the Trends in International Mathematics and Science Study (TIMSS), and the Progress in International Reading Literacy Study (PIRLS) assessments. The use of IRT models in the development and education literature in economics is less prevalent, but becoming common. For example, see Das and Zajonc (2010); Andrabi et al. (2011); Andrabi, Das, and Khwaja (2017); Singh (2015, forthcoming); Muralidharan, Singh, and Ganimian (2019); and Mbiti et al. (2019). Das and Zajonc (2010) provide an introduction to IRT models, while van der Linden (2018) provides a full treatment of IRT models.

<span id="page-13-1"></span><sup>20</sup>While management practices are difficult to measure, previous work has constructed detailed instruments to measure them in schools (e.g., see Bloom et al. 2015, Crawfurd 2017, Lemos and Scur 2016). Due to budget constraints, we only checked easily observable differences in school management.

TABLE 2—BALANCE: OBSERVABLE, TIME-INVARIANT SCHOOL AND STUDENT CHARACTERISTICS

<span id="page-14-0"></span>

|                                                                                   | Treatment (1)   | Control (2)     | Difference (3) | Difference (FE) (4) |
|-----------------------------------------------------------------------------------|-----------------|-----------------|----------------|---------------------|
| Panel A. School characteristics (observations = 185)                              |                 |                 |                |                     |
| Facilities (PCA)                                                                  | -0.080 (1.504)  | -0.003 (1.621)  | -0.077 (0.230) | -0.070 (0.232)      |
| Percent holds some classes outside                                                | 13.978 (34.864) | 14.130 (35.024) | -0.152 (5.138) | -0.000 (5.094)      |
| Percent rural                                                                     | 79.570 (40.538) | 80.435 (39.888) | -0.865 (5.913) | -0.361 (4.705)      |
| Travel time to nearest bank (minutes)                                             | 75.129 (69.099) | 68.043 (60.509) | 7.086 (9.547)  | 7.079 (8.774)       |
| Panel B. Student characteristics (observations = 3,508)                           |                 |                 |                |                     |
| Age in years                                                                      | 12.394 (2.848)  | 12.291 (2.935)  | 0.104 (0.169)  | 0.059 (0.112)       |
| Percent male                                                                      | 54.949 (49.769) | 56.146 (49.635) | -1.197 (2.041) | -1.459 (1.247)      |
| Wealth index                                                                      | -0.006 (1.529)  | 0.024 (1.536)   | -0.030 (0.140) | 0.011 (0.060)       |
| Percent in top wealth quartile                                                    | 0.199 (0.399)   | 0.219 (0.413)   | -0.020 (0.026) | -0.018 (0.014)      |
| Percent in bottom wealth quartile                                                 | 0.267 (0.442)   | 0.284 (0.451)   | -0.017 (0.039) | -0.011 (0.019)      |
| ECE before grade 1                                                                | 0.832 (0.374)   | 0.818 (0.386)   | 0.014 (0.025)  | 0.013 (0.017)       |
| Panel C. Attrition in the second wave of data collection (observations = 3,511)   |                 |                 |                |                     |
| Percent interviewed                                                               | 95.60 (20.52)   | 95.74 (20.20)   | -0.14 (0.64)   | -0.35 (0.44)        |

Notes: The first wave of data was collected 2 to 8 weeks after the beginning of treatment; hence, the focus here is on time-invariant characteristics (some of these characteristics may vary in response to the program in the long run, but are time-invariant given the duration of our study). This table presents the mean and standard error of the mean (in parentheses) for the treatment (column 1) and control (column 2), as well as the difference between treatment and control (column 3), and the difference taking into account the randomization design (i.e., including pair fixed effects) in column 4. Panel A has two measures of school infrastructure: the first is a school infrastructure index made up of the first component in a principal component analysis of indicator variables for classrooms, staff room, student and adult latrines, library, playground, and an improved water source. The second is whether the school ever needs to hold classes outside due to lack of classrooms. There are two measures of school rurality: a binary variable and the time it takes to travel by motorcycle to the nearest bank. Panel B has student characteristics. The wealth index is the first component of a principal component analysis of indicator variables for whether the student's household has a television, radio, electricity, a refrigerator, a mattress, a motorcycle, a fan, and a phone. Panel C shows the attrition rate (proportion of students interviewed at the first round of data collection whom we were able to interview in the second wave). Standard errors are clustered at the school level.

urban center), and over 10 percent need to hold some classes outside due to insufficient classrooms. Boys make up 55 percent of our students and the students' average age is 12. According to pretreatment administrative data (Ministry of Education–Liberia 2015–2016), the number of students, infrastructure, and resources available to students were not statistically different across treatment and control schools (for details, see online Appendix Table A.3).

We took great care to avoid attrition: enumerators conducting student assessments participated in extra training on tracking and its importance, and dedicated generous time to tracking. Students were tracked to their homes and tested there

when not available at school. Attrition in the second wave of data collection from our original sample is balanced between treatment and control and is below 4 percent (see panel C). Online Appendix Section A.2 has more details on the tracking and attrition that took place during data collection.

# II. Experimental Results

In this section, we first explore how the PSL program affected access to and quality of education. We then turn to mechanisms, looking at changes in material inputs, staffing, and school management. Replication data are available at Romero, Sandefur, and Sandholtz (2018).

#### A. Test Scores

Following our pre-analysis plan (Romero, Sandefur, and Sandholtz 2017), we report treatment-effect estimates from two specifications:

(1)
$$Y_{isg} = \alpha_g + \beta_1 treat_s + \varepsilon_{isg},$$

(2)
$$Y_{isg} = \alpha_g + \beta_2 treat_s + \gamma_2 X_i + \delta_2 Z_s + \varepsilon_{isg}.$$

The first specification amounts to a simple comparison of post-treatment outcomes for treatment and control individuals, in which  $Y_{isg}$  is the outcome of interest for student i in school s and group g (denoting the matched pairs used for randomization);  $\alpha_g$  is a matched-pair fixed effect (i.e., stratification-level dummies);  $treat_s$  is an indicator for whether school s was randomly chosen for treatment; and  $\varepsilon_{isg}$  is an error term. The second specification adds controls for time-invariant characteristics measured at the individual level  $(X_i)$  and school level  $(Z_s)$ . We estimate both specifications via ordinary least squares, clustering the standard errors at the school level.
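A minimal sketch of how specification (1) could be estimated is given below. It is an illustration under stated assumptions, not the authors' replication code: the function name and the tuple layout are hypothetical, the pair fixed effects are absorbed by demeaning within pair, and the school-clustered standard error omits the small-sample corrections that statistical packages usually apply, so it would differ slightly from published values.

```python
import math
from collections import defaultdict


def itt_pair_fe(rows):
    """OLS of y on treat with pair fixed effects, clustering by school.

    rows: iterable of (y, treat, school, pair) tuples, one per student.
    Returns (beta, se), with no small-sample correction applied to se.
    """
    rows = list(rows)
    # Within transformation: subtract pair means from y and treat.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for y, t, _, g in rows:
        acc = sums[g]
        acc[0] += y
        acc[1] += t
        acc[2] += 1
    demeaned = [(y - sums[g][0] / sums[g][2], t - sums[g][1] / sums[g][2], s)
                for y, t, s, g in rows]

    sxx = sum(t * t for _, t, _ in demeaned)
    beta = sum(y * t for y, t, _ in demeaned) / sxx

    # Cluster-robust sandwich: sum the score t * residual within each school.
    score = defaultdict(float)
    for y, t, s in demeaned:
        score[s] += t * (y - beta * t)
    se = math.sqrt(sum(v * v for v in score.values())) / sxx
    return beta, se
```

Specification (2) would extend this by also partialling the controls $X_i$ and $Z_s$ out of both the outcome and the treatment indicator before computing the coefficient.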

Table 3 shows results from student tests. The first three columns show differences between control and treatment schools' test scores after 1–2 months of treatment (September-October 2016), while the last three columns show the difference after 9–10 months of treatment (May-June 2017). Columns 1, 2, 4, and 5 show intention-to-treat (ITT) treatment estimates, while columns 3 and 6 show treatment-on-the-treated (ToT) estimates (i.e., the treatment effect for students that actually attended a PSL school in 2016-2017). The ToT is estimated using the

<span id="page-15-0"></span><sup>21</sup>These controls were specified in the pre-analysis plan and are listed in online Appendix Table A.5 (Romero, Sandefur, and Sandholtz 2017). We had committed in the pre-analysis plan to a specification that controlled for pretreatment individual outcomes:

(3)
$$Y_{isg} = \alpha_g + \beta_3 treat_s + \gamma_3 X_i + \delta_3 Z_s + \zeta_3 Y_{isg,-1} + \varepsilon_{isg}.$$

However, as mentioned before, the first wave of data was collected after the beginning of treatment, so we lack a true baseline of student test scores. We report this specification in online Appendix Table A.4. The results are still statistically significant, but mechanically downward biased.

<span id="page-16-0"></span>

|              | First wave (1–2 months after treatment) |             |             | Second wave (9–10 months after treatment) |             |             |
|--------------|-----------------------------------------|-------------|-------------|-------------------------------------------|-------------|-------------|
|              | ITT (1)                                 | ITT (2)     | ToT (3)     | ITT (4)                                   | ITT (5)     | ToT (6)     |
| English      | 0.09 (0.05)                             | 0.07 (0.03) | 0.08 (0.04) | 0.17 (0.04)                               | 0.18 (0.03) | 0.21 (0.04) |
| Math         | 0.07 (0.04)                             | 0.05 (0.03) | 0.06 (0.04) | 0.19 (0.04)                               | 0.18 (0.03) | 0.22 (0.04) |
| Abstract     | 0.05 (0.05)                             | 0.03 (0.04) | 0.04 (0.04) | 0.05 (0.04)                               | 0.05 (0.04) | 0.06 (0.05) |
| Composite    | 0.08 (0.05)                             | 0.06 (0.03) | 0.07 (0.04) | 0.19 (0.04)                               | 0.18 (0.03) | 0.22 (0.04) |
| New modules  |                                         |             |             | 0.20 (0.04)                               | 0.19 (0.04) | 0.23 (0.04) |
| Conceptual   |                                         |             |             | 0.13 (0.04)                               | 0.12 (0.04) | 0.15 (0.05) |
| Controls     | No                                      | Yes         | Yes         | No                                        | Yes         | Yes         |
| Observations | 3,508                                   | 3,508       | 3,508       | 3,492                                     | 3,492       | 3,492       |

TABLE 3—ITT TREATMENT EFFECTS ON LEARNING

Notes: Columns 1–3 are based on the first wave of data and show the difference between treatment and control schools taking into account the randomization design, i.e., including "pair" fixed effects (column 1), the difference taking into account other student and school controls (column 2), and the treatment-on-the-treated (ToT) estimates (column 3). Columns 4–6 are based on the second wave of data and show the difference between treatment and control taking into account the randomization design, i.e., including "pair" fixed effects (column 4), the difference taking into account other student and school controls (column 5), and the treatment-on-the-treated (ToT) estimates (column 6). The treatment-on-the-treated effects are estimated using the assigned treatment as an instrument for whether the student was in fact enrolled in a PSL school at the time of data collection. Standard errors are clustered at the school level.

assigned treatment as an instrument for whether the student is in fact enrolled in a PSL school during the 2016-2017 academic year.<sup>22</sup>

After 1–2 months of treatment, student test scores increased by $0.05\sigma$ in math (p-value = 0.09) and $0.07\sigma$ in English (p-value = 0.04). Part of these short-term improvements can be explained by the fact that most providers started the school year on time, while most traditional public schools began classes 1–4 weeks later. Hence, most students were already attending classes on a regular basis in treatment schools during our field visit, while their counterparts in control schools were not. We estimate the treatment effect separately for students tested during the first and the second half of the first round of data collection (see online Appendix Figure A.3), and show that the treatment effects fade in during the course of field work, further supporting our conclusion that these results represent early treatment effects as opposed to baseline imbalance.

<span id="page-16-1"></span><sup>22</sup>The percentage of students originally assigned to treatment schools who were actually in treatment schools at the end of the 2016-2017 school year is 81 percent. The percentage of students assigned to control schools who were in treatment schools at the end of the 2016-2017 school year is 0 percent.

In our preferred specification (column 5), the treatment effect of PSL after one academic year is $0.18\sigma$ for English (p-value < 0.001) and $0.18\sigma$ for math (p-value < 0.001). We focus on the ITT effect, but the ToT effect is $0.21\sigma$ for English (p-value < 0.001) and $0.22\sigma$ for math (p-value < 0.001). Our results are robust to different measures of student ability (see online Appendix Table A.6 for details).
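As a rough check on how the ITT and ToT estimates relate: with a single binary instrument, the instrumental-variables estimate reduces to the ITT effect divided by the difference in take-up rates between arms (the Wald ratio). The sketch below applies that ratio to the figures reported in Table 3 and footnote 22. It ignores the controls and "pair" fixed effects in the actual specification, and the function name `wald_tot` is ours, so it should be read as an approximation rather than a replication.

```python
def wald_tot(itt_effect, takeup_treatment, takeup_control):
    """Wald ratio: ITT effect divided by the difference in take-up rates."""
    first_stage = takeup_treatment - takeup_control
    if first_stage == 0:
        raise ValueError("assignment does not shift take-up; ratio undefined")
    return itt_effect / first_stage


# ITT for math from Table 3 (column 5); take-up rates from footnote 22.
itt_math = 0.18
tot_math = wald_tot(itt_math, takeup_treatment=0.81, takeup_control=0.0)
print(round(tot_math, 2))
```

The ratio comes out at roughly $0.22\sigma$, in line with the ToT estimate reported for math. Small discrepancies for other outcomes are to be expected because the published estimates condition on controls.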

An important concern when interpreting these results is whether they represent real gains in learning or better test-taking skills resulting from "teaching to the test." We show suggestive evidence that these results represent real gains. First, the treatment effect is significant ($0.19\sigma$, p-value < 0.001) for new modules that were not in the first wave test (and unknown to the providers or the teachers), and statistically indistinguishable from the treatment effect over all the items ($0.18\sigma$, p-value < 0.001). Second, the treatment effect is positive and significant ($0.12\sigma$, p-value = 0.0014) for the conceptual questions (which do not resemble the format of standard textbook exercises). We cannot rule out that providers narrowed the curriculum by focusing on English and mathematics or, conversely, that they generated additional learning gains in other subjects that we did not test. We find no evidence of heterogeneous treatment effects by students' socioeconomic status, gender, or grade (see online Appendix Table A.8).

### B. Enrollment, Attendance, and Student Selection

The previous section showed that education quality, measured using test scores in an ITT framework, increased in PSL schools. We now ask whether the PSL program increased access to education. To explore this question we focus on three outcomes which were committed to in the pre-analysis plan: enrollment, student attendance, and student selection (Romero, Sandefur, and Sandholtz 2017). PSL increased enrollment overall, but in schools where enrollment was already high and classes were large, the program led to a significant decline in enrollment. This does not appear to be driven by selection of "better" students, but by providers capping class sizes and eliminating double shifts. As shown in online Appendix Section A.5, almost the entirety of this phenomenon is explained by Bridge International Academies.

Enrollment changes across treatment and control schools are shown in panel A of Table 4. There are a few noteworthy items. First, treatment schools are slightly larger before treatment: they have 34 more students on average (p-value = 0.095). Online Appendix Table A.3 uses administrative data, while Table 4 uses data independently collected by our survey teams. While the difference in enrollment in the 2015-2016 academic year is only significant in the latter, the point estimates are similar across both tables. Second, PSL schools on average have 57 more students than control schools in the 2016-2017 academic year (p-value < 0.001), which results in a net increase (after controlling for pretreatment differences) of 25 students per school (p-value = 0.09).

<span id="page-17-0"></span><sup>23</sup> As shown in Table 7, PSL schools have longer school days. As a result, treatment schools spend about 45 minutes per week more in both English and math. However, they do not spend a larger fraction of the school day in English or math (see online Appendix Table A.7). More broadly, we cannot rule out that PSL spent disproportionately more resources improving English and math instruction.

<span id="page-17-1"></span><sup>24</sup>Three Bridge International Academies treatment schools (representing 28 percent of total enrollment in Bridge treatment schools) had double shifts in 2015-2016, but not in 2016-2017. One Omega Schools treatment school (representing 7.2 percent of total enrollment in Omega treatment schools) had double shifts in 2015-2016, but not in 2016-2017. The MOU between Bridge and the Ministry of Education authorized eliminating double shifts (Ministry of Education–Liberia 2016b).

TABLE 4—ITT TREATMENT EFFECTS ON ENROLLMENT, ATTENDANCE, AND SELECTION

<span id="page-18-0"></span>

|                                                    | Treatment | Control  | Difference | Difference (FE) |
|----------------------------------------------------|-----------|----------|------------|-----------------|
|                                                    | (1)       | (2)      | (3)        | (4)             |
| Panel A. School-level data (observations = 175)    |           |          |            |                 |
| Enrollment, 2015-2016                              | 298.45    | 264.11   | 34.34      | 34.18           |
|                                                    | (169.74)  | (109.91) | (21.00)    | (20.28)         |
| Enrollment, 2016-2017                              | 309.71    | 252.75   | 56.96      | 56.89           |
|                                                    | (118.96)  | (123.41) | (18.07)    | (16.29)         |
| 2015-2016 to 2016-2017 enrollment change           | 11.55     | -6.06    | 17.61      | 24.60           |
|                                                    | (141.30)  | (82.25)  | (17.19)    | (14.35)         |
| Attendance percent (spot check)                    | 48.02     | 32.83    | 15.19      | 15.57           |
|                                                    | (24.52)   | (26.55)  | (3.81)     | (3.13)          |
| Percent of students with disabilities              | 0.59      | 0.39     | 0.20       | 0.21            |
|                                                    | (1.16)    | (0.67)   | (0.14)     | (0.15)          |
| Panel B. Student-level data (observations = 3,639) |           |          |            |                 |
| Percent enrolled in the same school                | 80.50     | 83.16    | -2.66      | 0.71            |
|                                                    | (39.63)   | (37.43)  | (3.66)     | (2.06)          |
| Percent enrolled in school                         | 94.13     | 93.99    | 0.14       | 1.23            |
|                                                    | (23.52)   | (23.77)  | (1.33)     | (0.87)          |
| Days missed, previous week                         | 0.85      | 0.85     | -0.00      | -0.06           |
|                                                    | (1.41)    | (1.40)   | (0.10)     | (0.07)          |

Notes: This table presents the mean and standard error of the mean (in parentheses) for the treatment (column 1) and control (column 2) groups, as well as the difference between treatment and control (column 3), and the difference taking into account the randomization design (i.e., including "pair" fixed effects) in column 4. Panel A presents school-level data including enrollment (taken from enrollment logs) and student attendance measured by our enumerators during a spot check in the middle of a school day. If the school was not in session during a regular school day, we mark all students as absent. Panel B presents student-level data including whether the student is still enrolled in the same school, whether the student is enrolled in any school at all, and how many days of school the student missed in the previous week (conditional on being enrolled in school). Standard errors are clustered at the school level.

Since provider compensation is based on the number of students enrolled rather than the number of students actively attending school, increases in enrollment may not translate into increases in student attendance. An independent measure of student attendance conducted by our enumerators during a spot check shows that students in treatment schools are 16 percentage points (p-value < 0.001) more likely to be in school during class time (see panel A of Table 4).

Turning to the question of student selection, we find no evidence that any group of students is systematically excluded from PSL schools. The proportion of students with disabilities is not statistically different in PSL schools and control schools (panel A of Table 4).<sup>25</sup> Among our sample of students (i.e., students sampled from the 2015-2016 enrollment log), students are equally likely across treatment and control to be enrolled in the same school in the 2016-2017 academic year as they were in 2015-2016, and equally likely to be enrolled in any school (see panel B). Finally, selection analysis using student-level data on wealth, gender, and age finds no evidence of systematic exclusions (see online Appendix Table A.9).

<span id="page-18-1"></span><sup>25</sup> However, the fraction of students identified as disabled in our sample is an order of magnitude lower than estimates for the percentage of disabled students in the United States and worldwide using roughly the same criteria (both about 5 percent: see Brault 2011, UNICEF 2013).

Providers are authorized to cap class sizes, which could lead to students being excluded from their previous school (and either transferred to another school or to no school at all). We estimate whether the caps are binding for each student by comparing the average enrollment before treatment in her grade cohort and the two adjacent grade cohorts (i.e., one grade above and below) to the theoretical class-size cap under PSL. We average over three cohorts because some providers used placement tests to reassign students across grade levels. Thus, the "constrained" indicator is defined by the number of students enrolled in the student's 2016-2017 "expected grade" (as predicted based on normal progression from their 2015-2016 grade) and adjacent grades, divided by the "maximum capacity" in those three grades in 2016-2017 (as specified in our pre-analysis plan in Romero, Sandefur, and Sandholtz 2017):

$$c_{igso} = \frac{Enrollment_{is,g-1} + Enrollment_{is,g} + Enrollment_{is,g+1}}{3 \times Maximum_o},$$

where  $c_{igso}$  is our "constrained" measure for student i, expected to be in grade g in 2016-2017, at school s, in a "pair" assigned to provider o;  $Enrollment_{is,g-1}$  is enrollment in the grade below the student's expected grade,  $Enrollment_{is,g}$  is enrollment in the student's expected grade, and  $Enrollment_{is,g+1}$  is enrollment in the grade above the student's expected grade;  $Maximum_o$  is the class cap approved for provider o. We label a student's grade-school combination as "constrained" if  $c_{igso} > 1$ .
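To make the definition concrete, the sketch below computes the "constrained" measure for two students. The enrollment counts and the class cap of 65 are hypothetical values chosen for illustration only; they are not taken from the data or from any provider's contract, and the function names are ours.

```python
def constrained_measure(enroll_below, enroll_expected, enroll_above, max_class_size):
    """Enrollment in the expected grade and its two adjacent grades,
    divided by capacity in those three grades (3 x the provider's cap)."""
    total_enrollment = enroll_below + enroll_expected + enroll_above
    return total_enrollment / (3 * max_class_size)


def is_constrained(c):
    """A grade-school combination is labeled constrained when c exceeds 1."""
    return c > 1


# Hypothetical student A: large grades relative to a hypothetical cap of 65.
c_a = constrained_measure(70, 80, 75, max_class_size=65)
# Hypothetical student B: small grades under the same hypothetical cap.
c_b = constrained_measure(30, 40, 35, max_class_size=65)

print(round(c_a, 2), is_constrained(c_a))
print(round(c_b, 2), is_constrained(c_b))
```

Averaging over three grades means that a single oversized cohort does not by itself flag a student as constrained if the adjacent grades have spare capacity.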

Enrollment in constrained school-grades decreases, while enrollment in unconstrained school-grades increases (see column 1 in Table 5). Thus, schools far below the cap have positive treatment effects on enrollment and schools near or above the cap offset it with declining enrollment. Our student data reveal this pattern as well: columns 2 and 3 in Table 5 show the ITT effect on enrollment depending on whether students were enrolled in a constrained class in 2015-2016. In unconstrained classes students are more likely to be enrolled in the same school (and in any school). But in constrained classes students are less likely to be enrolled in the same school. While there is no effect on overall school enrollment, switching schools may be disruptive for children (Hanushek, Kain, and Rivkin 2004). Finally, test scores improve for students in constrained classes. This result is difficult to interpret as it includes the positive treatment effect over students who did not change schools (compounded by smaller class sizes) with the effect over students removed from their schools. These results are robust to excluding adjacent grades from the "constrained" measure (see online Appendix Table A.10).
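The comparison between constrained and unconstrained classes in Table 5 is summarized by $\alpha_0$, the difference between the two interaction coefficients. As a small illustration, the snippet below recomputes that difference from the rounded coefficients printed in the table. The dictionary keys are our own labels, and because the inputs are rounded, the results can differ from the reported $\alpha_0$ in the second decimal.

```python
# (unconstrained x treatment, constrained x treatment) coefficients from Table 5.
coefficients = {
    "delta_enrollment": (5.30, -11.7),
    "pct_same_school": (3.90, -12.5),
    "pct_in_school": (1.65, 0.085),
    "test_scores": (0.15, 0.35),
}

# alpha_0 = effect in constrained classes minus effect in unconstrained classes.
alpha_0 = {
    outcome: constrained - unconstrained
    for outcome, (unconstrained, constrained) in coefficients.items()
}

for outcome, value in alpha_0.items():
    print(f"{outcome}: {value:.2f}")
```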

#### C. Intermediate Inputs

In this section we explore the effect of the PSL program on school inputs (including teachers), school management (with a special focus on teacher behavior and pedagogy), and parental behavior.

Inputs and Resources.—Teachers, one of the most important inputs of education, change in several ways in treatment schools (see panels A and B in Table 6). PSL schools have 2.6 more teachers on average (p-value < 0.001), but this is not merely the result of operators hiring more teachers. Rather, the Ministry of

<span id="page-20-0"></span>

|                                               | $\Delta$ enrollment (1) | Percent same school (2) | Percent in school (3) | Test scores (4) |
|-----------------------------------------------|-------------------------|-------------------------|-----------------------|-----------------|
| Constrained = 0 $\times$ treatment            | 5.30                    | 3.90                    | 1.65                  | 0.15            |
|                                               | (1.11)                  | (1.40)                  | (0.73)                | (0.034)         |
| Constrained = 1 $\times$ treatment            | -11.7                   | -12.5                   | 0.085                 | 0.35            |
|                                               | (6.47)                  | (7.72)                  | (4.12)                | (0.11)          |
| Observations                                  | 1,635                   | 3,637                   | 3,485                 | 3,490           |
| Mean control (unconstrained)                  | -0.75                   | 81.89                   | 93.38                 | 0.13            |
| Mean control (constrained)                    | -7.73                   | 83.85                   | 94.81                 | -0.08           |
| $\alpha_0$ = constrained - unconstrained      | -17.05                  | -16.36                  | -1.56                 | 0.20            |
| p-value ($H_0$: $\alpha_0 = 0$)               | 0.01                    | 0.04                    | 0.71                  | 0.07            |

TABLE 5—ITT TREATMENT EFFECTS, BY WHETHER CLASS SIZE CAPS ARE BINDING

Notes: Column 1 uses school-grade level data and the outcome is the change in enrollment (between 2015-2016 and 2016-2017) at the grade level. Columns 2–4 use student-level data. The outcomes are whether the student is in the same school (column 2), whether the student is still enrolled in any school (column 3), and the composite test score (column 4). There were 194 constrained classes before treatment (holding 30 percent of students), and 1,468 unconstrained classes before treatment (holding 70 percent of students). Standard errors are clustered at the school level.

Education agreed to release some underperforming teachers from PSL schools, replace those teachers, and provide additional ones. Ultimately, the extra teachers result in lower pupil-teacher ratios (despite increased student enrollment). This reshuffling of teachers means that PSL schools have younger and less-experienced teachers, who are more likely to have worked in private schools in the past and have higher test scores (we conducted simple memory, math, word association, and abstract thinking tests). Replacement and extra teachers are recent graduates from the Rural Teacher Training Institutes (see King et al. 2015 for details on this program). While the program's contracts made no provisions to pay teachers differently in treatment and control schools, teachers in PSL schools report higher wages. A potential explanation is that there are many teachers who are paid by the community in public schools (commonly known as "volunteer" teachers). If higher salaries for teachers in PSL schools are conditional on them working in program schools, then this would create an incentive to perform well. However, we could not find an explanation for these higher salaries. Hence, it is unclear whether higher salaries are tied to the program. But large unconditional increases in teacher salaries have been shown elsewhere to have no effect on student performance in the short run (de Ree et al. 2018).

Our enumerators conducted a "materials" check during classroom observations (see panel C of Table 6). Since we could not conduct classroom observations in schools that were out of session during our visit, online Appendix Table A.11 presents Lee (2009) bounds on these treatment effects (control schools are more likely to be out of session). Conditional on the school being in session during our visit, students in PSL schools are 23 percentage points (p-value < 0.001) more likely to have a textbook and 8.2 percentage points (p-value = 0.051) more likely to have writing materials (both a pen and a copybook). However, we cannot rule out that there is no overall effect as zero is within the Lee (2009) bounds.
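For readers unfamiliar with the bounding procedure, the following is a minimal sketch of the trimming idea behind Lee (2009) bounds, not the code used for online Appendix Table A.11. It assumes no covariates and that treatment only affects whether an outcome is observed in one direction (monotonicity). The arm with the higher share of observed outcomes is trimmed from the top or the bottom until the shares match, which yields a lower and an upper bound. The function name and the data are hypothetical.

```python
from statistics import mean


def lee_bounds(y_treat, y_control, n_treat_total, n_control_total):
    """Return (lower, upper) trimming bounds on the treatment effect.

    y_treat, y_control: outcomes for units that were actually observed.
    n_treat_total, n_control_total: units assigned to each arm, observed or not.
    """
    q_t = len(y_treat) / n_treat_total      # share observed in treatment
    q_c = len(y_control) / n_control_total  # share observed in control
    if q_t >= q_c:
        # Treatment arm is over-represented among observed units: trim it.
        k = round(len(y_treat) * (1 - q_c / q_t))
        ordered = sorted(y_treat)
        lower = mean(ordered[: len(ordered) - k]) - mean(y_control)  # drop top k
        upper = mean(ordered[k:]) - mean(y_control)                  # drop bottom k
    else:
        # Control arm is over-represented: trim it instead.
        k = round(len(y_control) * (1 - q_t / q_c))
        ordered = sorted(y_control)
        lower = mean(y_treat) - mean(ordered[k:])                    # drop bottom k
        upper = mean(y_treat) - mean(ordered[: len(ordered) - k])    # drop top k
    return lower, upper


# Hypothetical data: all 5 treatment classrooms observed, 4 of 5 control ones.
lower, upper = lee_bounds([1, 2, 3, 4, 5], [2, 3, 4, 5], 5, 5)
print(lower, upper)
```

When the interval returned by such a procedure contains zero, as the text reports for the materials outcomes, differential observability alone could account for the conditional difference.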

TABLE 6—ITT TREATMENT EFFECTS ON INPUTS AND RESOURCES

<span id="page-21-0"></span>

|                                                          | Treatment (1) | Control (2) | Difference (3) | Difference (FE) (4) |
|----------------------------------------------------------|---------------|-------------|----------------|---------------------|
| Panel A. School-level outcomes (observations = 185)      |               |             |                |                     |
| Number of teachers                                       | 9.62          | 7.02        | 2.60           | 2.61                |
|                                                          | (2.82)        | (3.12)      | (0.44)         | (0.37)              |
| Pupil-teacher ratio (PTR)                                | 32.20         | 39.95       | -7.74          | -7.82               |
|                                                          | (12.29)       | (18.27)     | (2.31)         | (2.12)              |
| New teachers                                             | 4.81          | 1.77        | 3.03           | 3.01                |
|                                                          | (2.56)        | (2.03)      | (0.34)         | (0.35)              |
| Teachers dismissed                                       | 3.27          | 2.12        | 1.15           | 1.13                |
|                                                          | (3.81)        | (2.62)      | (0.48)         | (0.47)              |
| Panel B. Teacher-level outcomes (observations = 1,1…)    |               |             |                |                     |
| Age in years                                             | 39.09         | 46.37       | -7.28          | -7.10               |
|                                                          | (11.77)       | (11.67)     | (1.02)         | (0.68)              |
| Experience in years                                      | 10.59         | 15.79       | -5.20          | -5.26               |
|                                                          | (9.20)        | (10.77)     | (0.76)         | (0.51)              |
| Percent has worked at a private school                   | 47.12         | 37.50       | 9.62           | 10.20               |
|                                                          | (49.95)       | (48.46)     | (3.76)         | (2.42)              |
| Test score in standard deviations                        | 0.13          | -0.01       | 0.14           | 0.14                |
|                                                          | (1.02)        | (0.99)      | (0.07)         | (0.06)              |
| Percent certified (or tertiary education)                | 60.11         | 58.05       | 2.06           | 4.20                |
|                                                          | (48.99)       | (49.39)     | (4.87)         | (2.99)              |
| Salary (US\$/month): conditional on salary > 0           | 121.36        | 104.54      | 16.82          | 13.90               |
|                                                          | (44.42)       | (60.15)     | (6.56)         | (4.53)              |
| Panel C. Classroom observation (observations = 14…)      |               |             |                |                     |
| Number of seats                                          | 20.64         | 20.58       | 0.06           | 0.58                |
|                                                          | (13.33)       | (13.57)     | (2.21)         | (1.90)              |
| Percent with students sitting on the floor               | 2.41          | 4.23        | -1.82          | -1.51               |
|                                                          | (15.43)       | (20.26)     | (2.94)         | (2.61)              |
| Percent with chalk                                       | 96.39         | 78.87       | 17.51          | 16.58               |
|                                                          | (18.78)       | (41.11)     | (5.29)         | (5.50)              |
| Percent of students with textbooks                       | 37.08         | 17.60       | 19.48          | 22.60               |
|                                                          | (43.22)       | (35.25)     | (6.33)         | (6.32)              |
| Percent of students with pens/pencils                    | 88.55         | 79.67       | 8.88           | 8.16                |
|                                                          | (19.84)       | (30.13)     | (4.19)         | (4.10)              |

Notes: This table presents the mean and standard error of the mean (in parentheses) for the control (column 2) and treatment (column 1) groups, as well as the difference between treatment and control (column 3), and the difference taking into account the randomization design (i.e., including "pair" fixed effects) in column 4. Panel A has school-level outcomes. Panel B presents teacher-level outcomes including their score in tests conducted by our survey teams. Panel C presents data on inputs measured during classroom observations. Since we could not conduct classroom observations in schools that were out of session during our visit, online Appendix Table A.11 presents Lee (2009) bounds on these treatment effects (control schools are more likely to be out of session). Standard errors are clustered at the school level.

School Management.—Two important management changes are shown in Table 7: PSL schools are 8.7 percentage points more likely to be in session (i.e., the school is open, students and teachers are on campus, and classes are taking place) during a regular school day (p-value = 0.058), and have a longer school day that translates into 3.2 more hours per week of instructional time (p-value = 0.0011). Although principals in PSL schools have scores on the "intuitive" and "time management profile" scales that are almost identical to those of their counterparts in traditional public schools, they spend more of their time on management-related activities (e.g., supporting

<span id="page-22-0"></span>

|                                                 | Treatment | Control | Difference | Difference (FE) |
|-------------------------------------------------|-----------|---------|------------|-----------------|
|                                                 | (1)       | (2)     | (3)        | (4)             |
| Percent school in session at spot check         | 92.47     | 83.70   | 8.78       | 8.66            |
|                                                 | (26.53)   | (37.14) | (4.75)     | (4.52)          |
| Instruction time (hours/week)                   | 17.84     | 14.69   | 3.15       | 3.17            |
|                                                 | (4.84)    | (4.04)  | (0.66)     | (0.65)          |
| Intuitive score (out of 12)                     | 4.08      | 4.03    | 0.04       | 0.02            |
|                                                 | (1.35)    | (1.38)  | (0.20)     | (0.19)          |
| Time management score (out of 12)               | 5.60      | 5.69    | -0.09      | -0.10           |
|                                                 | (1.21)    | (1.35)  | (0.19)     | (0.19)          |
| Principal's working time (hours/week)           | 21.43     | 20.60   | 0.83       | 0.84            |
|                                                 | (11.83)   | (14.45) | (1.94)     | (1.88)          |
| Percent of principal's time spent on management | 74.06     | 53.64   | 20.42      | 20.09           |
|                                                 | (27.18)   | (27.74) | (4.12)     | (3.75)          |
| Index of good practices (PCA)                   | 0.41      | -0.00   | 0.41       | 0.40            |
|                                                 | (0.64)    | (1.00)  | (0.12)     | (0.12)          |
| Observations                                    | 93        | 92      | 185        | 185             |

TABLE 7—ITT TREATMENT EFFECTS ON SCHOOL MANAGEMENT

Notes: This table presents the mean and standard error of the mean (in parentheses) for the treatment (column 1) and control (column 2) groups, as well as the difference between treatment and control (column 3), and the difference taking into account the randomization design (i.e., including "pair" fixed effects) in column 4. Intuitive score is measured using Agor's (1989) instrument and time management profile using Schermerhorn et al.'s (2011) instrument. The index of good practices is the first component of a principal component analysis of the variables in online Appendix Table A.12. The index is normalized to have mean zero and standard deviation of 1 in the control group. Standard errors are clustered at the school level.

other teachers, monitoring student progress, meeting with parents) than actually teaching. This suggests a change in the role of the principal in these schools: perhaps as a result of additional teachers, principals in PSL schools did not have to double as teachers. Additionally, management practices (as measured by a "good practices" PCA index normalized to a mean of 0 and standard deviation of 1 in the control group) are $0.4\sigma$ (p-value = 0.0011) higher in PSL schools.<sup>26</sup> This effect size can be viewed as a boost for the average treated school from the fiftieth to the sixty-sixth percentile in management practices.
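The percentile interpretation can be reproduced with the standard normal distribution function. This is a sketch under the assumption that the normalized index is approximately normally distributed in the control group, which the text does not state explicitly.

```python
from statistics import NormalDist

effect_size_sd = 0.4  # treatment effect on the index, in control-group SDs

# A school at the control-group median (50th percentile) sits at z = 0.
# Shifting it by 0.4 SD moves it to the percentile given by the normal CDF.
new_percentile = NormalDist().cdf(effect_size_sd) * 100
print(round(new_percentile))  # 66
```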

*Teacher Behavior.*—An important component of school management is teacher accountability and its effects on teacher behavior. As mentioned above, teachers in PSL schools are drawn from the pool of unionized civil servants with lifetime appointments and who are paid by the Liberian government. In theory, private providers have limited authority to request teacher reassignments and no authority to promote or dismiss civil service teachers. Thus, a central hypothesis underlying the PSL program is that providers can hold teachers accountable through monitoring and support, rather than rewards and threats.<sup>27</sup>

<span id="page-22-1"></span><sup>26</sup> The index includes whether the school has an enrollment log and what information is in it, whether the school has an official time table and whether it is posted, whether the school has a parent-teacher association (PTA) and whether the principal has the PTA head's number at hand, and whether the school keeps a record of expenditures and a written budget. Online Appendix Table A.12 has details on every component of the good practices index.

<span id="page-22-2"></span><sup>27</sup> As mentioned above, in practice the Ministry of Education agreed to release some underperforming teachers from PSL schools at the request of providers. While providers could have provided teachers with performance incentives, we have no evidence that any of them did.

<span id="page-23-0"></span>

|                                                        | Treatment (1)    | Control (2)      | Difference (3) | Difference (FE) (4) |
|--------------------------------------------------------|------------------|------------------|----------------|---------------------|
| Panel A. Spot checks (observations = 185)              |                  |                  |                |                     |
| Percent on schools' campus                             | 60.32            | 40.38            | 19.94          | 19.79               |
|                                                        | (23.10)          | (25.20)          | (3.56)         | (3.48)              |
| Percent in classroom                                   | 47.02            | 31.42            | 15.60          | 15.37               |
|                                                        | (26.65)          | (25.04)          | (3.80)         | (3.62)              |
| Panel B. Student reports (observations = 185)          |                  |                  |                |                     |
| Teacher missed school previous week (percent)          | 17.72            | 25.12            | -7.41          | -7.53               |
|                                                        | (10.79)          | (14.93)          | (1.92)         | (1.95)              |
| Teacher never hits students (percent)                  | 54.73            | 48.20            | 6.52           | 6.59                |
|                                                        | (18.76)          | (17.07)          | (2.64)         | (2.53)              |
| Teacher helps outside the classroom (percent)          | 50.02            | 46.59            | 3.42           | 3.56                |
|                                                        | (18.24)          | (18.01)          | (2.67)         | (2.28)              |
| Panel C. Classroom observations (observations = 185)   |                  |                  |                |                     |
| Instruction (active + passive) (percent of class time) | 49.68            | 35.00            | 14.68          | 14.51               |
|                                                        | (32.22)          | (37.08)          | (5.11)         | (4.70)              |
| Classroom management (percent class time)              | 19.03            | 8.70             | 10.34          | 10.25               |
|                                                        | (20.96)          | (14.00)          | (2.62)         | (2.73)              |
| Teacher off-task (percent class time)                  | 31.29            | 56.30            | -25.01         | -24.77              |
|                                                        | (37.71)          | (42.55)          | (5.91)         | (5.48)              |
| Student off-task (percent class time)                  | 50.41            | 47.14            | 3.27           | 2.94                |
|                                                        | (33.51)          | (38.43)          | (5.30)         | (4.59)              |

Notes: This table presents the mean and standard error of the mean (in parentheses) for the control (column 2) and treatment (column 1) groups, as well as the difference between treatment and control (column 3), and the difference taking into account the randomization design (i.e., including "pair" fixed effects) in column 4. Panel A presents data from spot checks conducted by our survey teams in the middle of a school day. Panel B presents data from our panel of students where we asked them about their teachers' behavior. Panel C presents data from classroom observations. If the school was not in session during a regular school day we mark all teachers not on campus as absent and teachers and students as off-task in the classroom observation. Online Appendix Table A.11 has the results without imputing values for schools not in session. Standard errors are clustered at the school level.

To study teacher behavior, we conducted unannounced spot checks of teacher attendance and collected student reports of teacher behavior (see panels A and B in Table 8). Also, during these spot checks we used the Stallings classroom observation instrument to study teacher time use and classroom management (see panel C in Table 8).

Teachers in PSL schools are 20 percentage points (p-value < 0.001) more likely to be in school during a spot check (from a base of 40 percent) and the unconditional probability of a teacher being in a classroom increases by 15 percentage points (p-value < 0.001). Our spot checks align with student reports on teacher behavior. According to students, teachers in PSL schools are 7.5 percentage points (p-value < 0.001) less likely to have missed school the previous week. Students in PSL schools also report that teachers are 6.6 percentage points (p-value 0.011) less likely to hit them.

Classroom observations also show changes in teacher behavior and pedagogical practices. Teachers in PSL schools are 15 percentage points (p-value 0.0027) more likely to engage in either active instruction (e.g., teacher engaging students through lecture or discussion) or passive instruction (e.g., students working in their seat while the teacher monitors progress) and 25 percentage points (p-value < 0.001) less likely to be off-task.<sup>28</sup> Although these are considerable improvements, the treatment group is still far off the Stallings, Knight, and Markham (2014) good practice benchmark of 85 percent of total class time used for instruction, and below the average time spent on instruction across five countries in Latin America (Bruns and Luque 2014).

These estimates combine the effects on individual teacher behavior with changes to teacher composition. To estimate the treatment effect on teacher attendance over a fixed pool of teachers, we perform additional analyses in online Appendix A.1 using administrative data (EMIS) to restrict our sample to teachers who worked at the school the year before the intervention began (2015-2016). We treat teachers who no longer worked at the school in the 2016-2017 school year as (nonrandom) attriters and estimate Lee (2009) bounds on the treatment effect. Online Appendix Table A.11 shows an ITT treatment effect of 14 percentage points (*p*-value < 0.001) on teacher attendance. Importantly, zero is not part of the Lee (2009) bounds for this effect. This aligns with previous findings showing that management practices have significant effects on worker performance (Bloom, Liang et al. 2015; Bloom, Eifert et al. 2013; Bennedsen et al. 2007).
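The trimming logic behind Lee (2009) bounds can be sketched as follows. This is a simplified illustration on made-up attendance data, not the authors' implementation: it ignores covariates, pair fixed effects, and school-level clustering, and the function name `lee_bounds` is hypothetical.

```python
from statistics import mean

def lee_bounds(y_treat, n_treat_assigned, y_ctrl, n_ctrl_assigned):
    """Lee-style trimming bounds on a treatment effect under differential attrition.

    y_treat, y_ctrl: outcomes observed for non-attriters in each arm.
    n_*_assigned: number of units originally assigned to each arm.
    Returns (lower_bound, upper_bound).
    """
    q_t = len(y_treat) / n_treat_assigned  # retention rate, treatment arm
    q_c = len(y_ctrl) / n_ctrl_assigned    # retention rate, control arm

    if q_t >= q_c:
        # Treatment retains more units: trim the excess share from the treated.
        k = round((q_t - q_c) / q_t * len(y_treat))
        s = sorted(y_treat)
        lower = mean(s[:len(s) - k]) - mean(y_ctrl)  # drop the k highest outcomes
        upper = mean(s[k:]) - mean(y_ctrl)           # drop the k lowest outcomes
    else:
        # Control retains more units: trim the excess share from the controls.
        k = round((q_c - q_t) / q_c * len(y_ctrl))
        s = sorted(y_ctrl)
        lower = mean(y_treat) - mean(s[k:])            # drop the k lowest controls
        upper = mean(y_treat) - mean(s[:len(s) - k])   # drop the k highest controls
    return lower, upper

# Hypothetical data: 1 = teacher present at the spot check, 0 = absent.
# Ten teachers per arm were on the roster the year before; some left the school.
treat_present = [1, 1, 1, 1, 1, 1, 0, 0]  # 8 of 10 still at the school
ctrl_present = [1, 1, 1, 0, 0, 0]         # 6 of 10 still at the school

lower, upper = lee_bounds(treat_present, 10, ctrl_present, 10)
print(f"Bounds on the attendance effect: [{lower:.3f}, {upper:.3f}]")
```

When both bounds lie on the same side of zero, as in the result reported above, the sign of the effect does not depend on which teachers left the school.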

#### D. Other Outcomes

Student data (panel C of Table 9) and household data (panel A) show that the program also increases student and parental satisfaction. Students in PSL are more likely to think going to school is fun, and parents with children in PSL schools (enrolled in 2015-2016) are 7.5 percentage points (*p*-value 0.022) more likely to be satisfied with the education their children are receiving.

Providers are not allowed to charge fees and PSL should be free at all levels, including early-childhood education (ECE) for which fees are permitted in government schools. We interviewed both parents and principals regarding fees. In both treatment and control schools parents are more likely to report paying fees than schools are to report charging them. The amount parents claim to pay in school fees is much higher than the amount schools claim to charge (see panels A and B in Table 9). Since principals may be reluctant to disclose the full amount they charge parents, especially in primary school (which is nominally free), this discrepancy is normal. While the likelihood of charging fees decreases in PSL schools by 26 percentage points according to parents and by 19 percentage points according to principals, 48 percent of parents still report paying some fees in PSL schools.

Providers often provide textbooks and uniforms free of charge to students (see Section IA). Indeed, household expenditures on fees, textbooks, and uniforms drop (see online Appendix Table A.13 for details). In total, annual household expenditures on children's education decrease by US\$6.6 (p-value 0.11). A reduction in household expenditure in education reflects a crowding-out response (i.e., parents decrease private investment in education as school investments increase). To explore whether crowding out goes beyond expenditure we ask parents about engagement in their child's education. However, we see no change on this margin (we summarize parental engagement using the first component from a principal component analysis across several measures of parental engagement; see online Appendix Table A.14 for the effect on each component).

<span id="page-24-0"></span><sup>28</sup> See Stallings, Knight, and Markham (2014) for more details on how active and passive instruction, as well as time off-task and student engagement, are coded.

TABLE 9—ITT TREATMENT EFFECTS ON HOUSEHOLD BEHAVIOR, FEES, AND STUDENT ATTITUDES

<span id="page-25-0"></span>

|                                                    | Treatment (1)    | Control (2)      | Difference (3)   | Difference (FE) (4) |
|----------------------------------------------------|------------------|------------------|------------------|---------------------|
| Panel A. Household behavior (observations = 1,115) |                  |                  |                  |                     |
| Percent satisfied with school                      | 74.90<br>(19.18) | 67.41<br>(23.99) | 7.49<br>(3.20)   | 7.51<br>(3.23)      |
| Percent paying any fees                            | 48.08<br>(50.00) | 73.59<br>(44.13) | -25.50<br>(4.73) | -25.74<br>(3.27)    |
| Fees (US\$/year)                                   | 5.68<br>(10.16)  | 8.06<br>(9.73)   | -2.38<br>(0.97)  | -2.95<br>(0.62)     |
| Expenditure (US\$/year)                            | 65.57<br>(74.84) | 73.53<br>(79.32) | -7.95<br>(6.95)  | -6.60<br>(4.11)     |
| Engagement index (PCA)                             | -0.11<br>(0.84)  | -0.09<br>(0.91)  | -0.03<br>(0.07)  | -0.03<br>(0.06)     |
| Panel B. Fees (observations = 184)                 |                  |                  |                  |                     |
| Percent with >0 ECE fees                           | 11.83<br>(32.47) | 30.77<br>(46.41) | -18.94<br>(5.92) | -18.98<br>(5.42)    |
| Percent with >0 primary fees                       | 12.90<br>(33.71) | 29.67<br>(45.93) | -16.77<br>(5.95) | -16.79<br>(5.71)    |
| ECE fee (US\$/year)                                | 0.57<br>(1.92)   | 1.42<br>(2.78)   | -0.85<br>(0.35)  | -0.87<br>(0.33)     |
| Primary fee (US\$/year)                            | 0.54<br>(1.71)   | 1.22<br>(2.40)   | -0.68<br>(0.31)  | -0.70<br>(0.31)     |
| Panel C. Student attitudes (observations = 3,492)  |                  |                  |                  |                     |
| School is fun                                      | 0.58<br>(0.49)   | 0.53<br>(0.50)   | 0.05<br>(0.02)   | 0.05<br>(0.02)      |
| I use what I'm learning outside of school          | 0.52<br>(0.50)   | 0.49<br>(0.50)   | 0.04<br>(0.02)   | 0.04<br>(0.02)      |
| If I work hard, I will succeed                     | 0.60<br>(0.49)   | 0.55<br>(0.50)   | 0.05<br>(0.03)   | 0.04<br>(0.02)      |
| Elections are the best way to choose a president   | 0.90<br>(0.30)   | 0.88<br>(0.33)   | 0.03<br>(0.01)   | 0.03<br>(0.01)      |
| Boys are smarter than girls                        | 0.69<br>(0.46)   | 0.69<br>(0.46)   | 0.00<br>(0.02)   | 0.01<br>(0.01)      |
| Some tribes in Liberia are bad                     | 0.76<br>(0.43)   | 0.79<br>(0.41)   | -0.03<br>(0.02)  | -0.03<br>(0.01)     |

Notes: This table presents the mean and standard error of the mean (in parentheses) for the control (column 2) and treatment (column 1) groups, as well as the difference between treatment and control (column 3), and the difference taking into account the randomization design (i.e., including "pair" fixed effects) in column 4. Panel A presents data from household surveys. The index for parent engagement is the first component from a principal component analysis across several measures of parental engagement; see online Appendix Table A.14 for details. Expenditure refers to the annual household expenditures on children's education. Panel B presents data from school principals on what fees schools charge. Panel C presents data on whether students agree or disagree with several statements. Standard errors are clustered at the school level.

To complement the effect of the program on cognitive skills, we also look for changes in student attitudes and opinions (see panel C of Table 9). Some of the control group rates are noteworthy: 50 percent of children use what they learn in class outside school, 69 percent think that boys are smarter than girls, and 79 percent think that some tribes in Liberia are bad. Turning to treatment effects, children in PSL schools are more likely to think school is useful, more likely to think elections are the best way to choose a president, and less likely to think some tribes in Liberia are bad. The effect on tribe perceptions is particularly important in light of the recent conflict in Liberia and the ethnic tensions that sparked it. Our results also align with previous findings from Andrabi et al. (2010), which shows that children in private schools in Pakistan are more "pro-democratic" and exhibit lower gender biases (although we do not find any evidence of lower gender biases in this setting). Note, however, that our treatment effects are small in magnitude. It is also impossible to tease out the effect of who is providing education (private providers versus regular public schools) from the effect of better education and the effect of younger and better teachers. Hence, our results show the net change in students' opinions, and cannot be attributed to providers per se but rather to the program as a whole.

# **III. Provider Comparisons**

# A. Raw Differences

As discussed in Section IB and shown in online Appendix Table A.1, PSL schools are not a representative sample of public schools. Furthermore, there is heterogeneity in school characteristics across providers. This is unsurprising since providers stated different preferences for locations and some volunteered to manage schools in more remote and marginalized areas. Therefore, the raw treatment effects for each individual provider are internally valid, but not comparable with each other without further assumptions (see Section IIIB).

We show how the average school for each provider differs from the average public school in Liberia in online Appendix Table A.15. We reject the null that providers' schools have similar characteristics on at least three margins: number of students, pupil/teacher ratio, and the number of permanent classrooms. Bridge International Academies is managing schools that were considerably bigger (in 2015-2016) than the average public school in Liberia (by over 150 students), and these schools are larger than those of other providers by over 100 students. Most providers have schools with better infrastructure than the average public school in the country, except for Omega and Stella Maris. Finally, while all providers have schools that are closer to a paved road than other public schools, Bridge's and BRAC's schools are about 2 km closer than other providers' schools. Overall, these results confirm that some providers were more willing to work in average Liberian schools, while others preferred schools with easier access and better infrastructure.

We now turn to provider-by-provider outcomes. We focus on three margins: (i) *learning*, as measured by test scores; (ii) *sustainability*, providers' willingness to improve the behavior and pedagogy of existing teachers (as opposed to having the worst-performing teachers transferred to other public schools, imposing a negative externality on the broader school system); and (iii) *equity*, or providers' commitment to improving access to quality education (rather than learning gains for a subset of pupils).

The treatment effects on composite test scores are positive and significantly different from zero for three providers: Rising Academies, Bridge International Academies, and Street Child (panel A of Table 10). They are positive but statistically insignificant for Youth Movement for Collective Action, More Than Me, and BRAC. Noncompliance likely explains the negative (but statistically insignificant) effect for Stella Maris and Omega Schools. Stella Maris never took control of its assigned schools. Omega had not taken control of all its schools by the end of the school year. Our teacher interviews reflect these providers' absence. In three out of four Stella Maris schools, all the teachers reported that no one from Stella had been at the school in the previous week. In 6 out of 19 Omega schools, all the teachers reported that no one from Omega had been at the school in the previous week. While we committed in the pre-analysis plan to compare for-profit to nonprofit providers, this comparison yields no clear patterns (Romero, Sandefur, and Sandholtz 2017).

TABLE 10—RAW (FULLY EXPERIMENTAL) TREATMENT EFFECTS BY PROVIDER

<span id="page-27-0"></span>

|                                              | BRAC             | Bridge            | MtM               | Omega            | Rising            | Street Child     | Stella M          | YMCA              |
|----------------------------------------------|------------------|-------------------|-------------------|------------------|-------------------|------------------|-------------------|-------------------|
|                                              | (1)              | (2)               | (3)               | (4)              | (5)               | (6)              | (7)               | (8)               |
| Panel A. Student test scores (ITT)           |                  |                   |                   |                  |                   |                  |                   |                   |
| English (standard deviations)                | 0.19<br>(0.10)   | 0.28<br>(0.09)    | 0.19<br>(0.22)    | -0.07<br>(0.11)  | 0.36<br>(0.24)    | 0.23<br>(0.13)   | -0.23<br>(0.23)   | 0.58<br>(0.26)    |
| Math (standard deviations)                   | 0.09<br>(0.09)   | 0.39<br>(0.09)    | 0.18<br>(0.22)    | -0.06<br>(0.11)  | 0.42<br>(0.23)    | 0.28<br>(0.13)   | -0.17<br>(0.22)   | 0.27<br>(0.26)    |
| Composite (standard deviations)              | 0.14<br>(0.09)   | 0.36<br>(0.09)    | 0.18<br>(0.22)    | -0.08<br>(0.11)  | 0.42<br>(0.23)    | 0.27<br>(0.13)   | -0.19<br>(0.22)   | 0.38<br>(0.26)    |
| Panel B. Changes to the pool of teachers     |                  |                   |                   |                  |                   |                  |                   |                   |
| Percent teachers dismissed                   | -6.75<br>(6.43)  | 50.47<br>(6.29)   | 15.51<br>(11.75)  | -8.58<br>(6.81)  | -5.79<br>(12.72)  | -3.18<br>(8.49)  | -10.99<br>(14.34) | 21.08<br>(14.34)  |
| Percent new teachers                         | 39.53<br>(12.27) | 63.17<br>(12.00)  | 70.88<br>(22.43)  | 24.44<br>(12.99) | 24.30<br>(24.28)  | 41.14<br>(16.19) | -20.32<br>(27.37) | 62.37<br>(27.37)  |
| Age in years (teachers)                      | -5.03<br>(1.93)  | -10.92<br>(2.01)  | -11.20<br>(3.52)  | -5.46<br>(2.03)  | -10.75<br>(3.82)  | -5.79<br>(2.54)  | -4.53<br>(4.30)   | 3.25<br>(4.30)    |
| Test score in standard deviations (teachers) | 0.03<br>(0.17)   | 0.36<br>(0.17)    | 0.48<br>(0.31)    | 0.18<br>(0.17)   | 0.18<br>(0.33)    | 0.32<br>(0.22)   | 0.16<br>(0.38)    | -0.59<br>(0.38)   |
| Panel C. Enrollment and access               |                  |                   |                   |                  |                   |                  |                   |                   |
| $\Delta$ enrollment                          | 38.02<br>(34.33) | -13.26<br>(33.60) | -25.98<br>(62.76) | 51.27<br>(35.26) | 19.31<br>(67.84)  | 44.86<br>(45.21) | -15.92<br>(76.59) | 45.38<br>(76.53)  |
| $\Delta$ enrollment (constrained grades)     | 0.00<br>(0.00)   | -23.85<br>(11.19) | 0.00<br>(0.00)    | 0.28<br>(37.16)  | 0.00<br>(0.00)    | 32.15<br>(61.95) | -1.00<br>(5.13)   | -46.35<br>(27.05) |
| Student attendance (percent)                 | 20.12<br>(9.02)  | 5.25<br>(9.05)    | 37.80<br>(16.50)  | 18.01<br>(9.53)  | 28.76<br>(17.83)  | 19.56<br>(11.88) | 9.72<br>(23.32)   | 13.53<br>(20.11)  |
| Percent students still attending any school  | 1.27<br>(4.45)   | 5.19<br>(4.22)    | -3.12<br>(10.25)  | 4.71<br>(4.99)   | 2.82<br>(11.03)   | 3.64<br>(6.11)   | 5.98<br>(10.57)   | 4.48<br>(12.21)   |
| Percent students still attending same school | 0.80<br>(2.20)   | 4.42<br>(2.09)    | 0.65<br>(5.07)    | 1.56<br>(2.46)   | 3.81<br>(5.45)    | -0.82<br>(3.02)  | 1.03<br>(5.23)    | -0.81<br>(6.04)   |
| Panel D. Satisfaction                        |                  |                   |                   |                  |                   |                  |                   |                   |
| Percent satisfied with school (parents)      | 11.72<br>(7.30)  | 13.22<br>(7.14)   | 0.75<br>(13.34)   | 0.21<br>(7.53)   | 4.95<br>(14.44)   | -4.96<br>(9.62)  | 29.49<br>(16.28)  | 18.02<br>(16.27)  |
| Percent students who think school is fun     | 5.83<br>(4.89)   | 2.11<br>(4.63)    | 0.50<br>(11.25)   | 4.86<br>(5.47)   | 9.44<br>(12.11)   | 2.84<br>(6.71)   | -17.50<br>(11.60) | 20.92<br>(13.40)  |
| Observations                                 | 40               | 45                | 12                | 38               | 10                | 24               | 8                 | 8                 |

Notes: This table presents the raw treatment effect for each provider on different outcomes. The estimates for each provider are *not* comparable to each other without further assumptions, and thus we do not include a test of equality. Panel A presents data on students' test scores. Panel B presents data related to the pool of teachers in each school. Panel C presents data related to school enrollment. $\Delta$ enrollment measures the change in enrollment between the 2015-2016 and 2016-2017 school year. Panel D presents data from household surveys. Standard errors are shown in parentheses. Estimation is conducted on collapsed, school-level data.

To measure teacher selection, we study the number of teachers dismissed and the number of new teachers recruited (panel B of Table 10). As noted above, PSL led to the assignment of 2.6 additional teachers per school and 1.1 additional teachers exiting per school. However, large-scale dismissal of teachers was unique to one provider (Bridge International Academies), while successful lobbying for additional teachers was common across several providers. Although weeding out bad teachers is important, a reshuffling of teachers is unlikely to raise average performance in the system as a whole. We are unable to verify whether the teachers dismissed from PSL schools were reassigned to other public schools.

While enrollment increased across all providers, the smallest treatment effect on this margin is for Bridge, which is consistent with that provider being the only one enforcing class size caps (see panel C in Table 10 and online Appendix Figure A.5 for more details). As shown in Section IIB, in classes where class-size caps were binding (10 percent of all classes holding 30 percent of students at baseline), enrollment fell by 12 students per grade.

# B. Comparable Treatment Estimates

There are two hurdles to comparing provider-specific treatment effects. First, while the assignment of schools within matched pairs was random, the assignment of pairs to providers was not, resulting in nonrandom differences in schools and locations across providers. Second, the sample sizes for most providers are too small to yield reliable estimates.

To mitigate the bias due to differences in locations and schools we control for a comprehensive set of school characteristics (to account for the fact that some providers' schools will score better than others for reasons unrelated to PSL), as well as interactions of those characteristics with a treatment dummy (to account for the possibility that raising scores through PSL relative to the control group will be easier in some contexts than others). We control for both student (age, gender, wealth, and grade) and school characteristics (pretreatment enrollment, facilities, and rurality).

Because randomization occurred at the school level and some providers are managing only four or five treatment schools, the experiment is underpowered to estimate their effects. Additionally, since the "same program" was implemented by different providers, it would be naïve to treat providers' estimators as completely independent from each other. We take a Bayesian approach to this problem, estimating a hierarchical model (Rubin 1981; see Gelman et al. 2014 and Meager 2019 for a recent discussion). By allowing dependency across providers' treatment effects, the model "pools power" across providers, and in the process pulls estimates for smaller providers toward the overall average (a process known as "shrinkage"). The results of the Bayesian estimation are a weighted average of providers' own performance and average performance across all providers, and the proportions depend on the provider's sample size. We apply the Bayesian estimator after adjusting for baseline school differences and estimating the treatment effect of each provider on the average school in our sample.<sup>29</sup>
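The mechanics of shrinkage can be illustrated with a minimal normal-normal sketch. It is not the paper's estimator: it conditions on a fixed between-provider standard deviation `tau` (a hypothetical value chosen here) rather than integrating over a noninformative prior, and it uses the raw composite estimates from Table 10 rather than the covariate-adjusted ones, so it will not reproduce online Appendix Table A.16.

```python
# Raw composite test-score effects and standard errors (in SDs) from Table 10.
providers = ["BRAC", "Bridge", "MtM", "Omega", "Rising",
             "Street Child", "Stella Maris", "YMCA"]
estimates = [0.14, 0.36, 0.18, -0.08, 0.42, 0.27, -0.19, 0.38]
std_errors = [0.09, 0.09, 0.22, 0.11, 0.23, 0.13, 0.22, 0.26]

def shrink(estimates, std_errors, tau):
    """Posterior means of provider effects, conditional on tau.

    Model: true effect_j ~ Normal(mu, tau^2); estimate_j ~ Normal(effect_j, se_j^2),
    with a flat prior on mu. tau is treated as known (an assumption).
    """
    # Precision-weighted pooled mean given tau.
    weights = [1.0 / (se**2 + tau**2) for se in std_errors]
    mu = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)

    shrunk = []
    for y, se in zip(estimates, std_errors):
        pull = se**2 / (se**2 + tau**2)  # weight on the pooled mean
        shrunk.append(pull * mu + (1.0 - pull) * y)
    return mu, shrunk

tau = 0.15  # hypothetical between-provider SD, for illustration only
pooled_mean, shrunk = shrink(estimates, std_errors, tau)

for name, raw, post in zip(providers, estimates, shrunk):
    print(f"{name:13s} raw = {raw:+.2f}  shrunk = {post:+.2f}")
```

Providers with large standard errors (few schools) are pulled furthest toward the pooled mean, while precisely estimated providers keep most of their own estimate.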

<span id="page-28-0"></span><sup>29</sup> This model assumes that the true treatment effect for each provider is drawn from a normal distribution (with unknown mean and variance), and that the observed effect is sampled from a normal distribution with mean equal to the true effect. The "weight" given to the provider's own performance depends on the provider's sample size and the prior distribution for the standard deviation of the distribution of true effects. We assume a noninformative prior for the standard deviation. The results are robust to the choice of prior and are available upon request.

We show the full set of results across providers after adjusting for baseline differences and "shrinking" the estimates using the Bayesian hierarchical model in online Appendix Table A.16. While the comparable effects are useful for comparisons, the raw experimental estimates remain useful for noncomparative statements (e.g., whether a provider had an effect or not). Online Appendix Figure A.4 shows the effects on learning after adjusting for differences in school characteristics (before the Bayesian hierarchical model) and the effects after applying a Bayesian hierarchical model (but without adjusting for school differences). Qualitatively, the results do not change. The learning gains remain positive for the same providers, and even after "shrinking" Bridge remains the only provider with a high (and statistically significant) percentage of teacher dismissal and the only one with a negative (and statistically significant) effect on enrollment in constrained grades.

### C. Excluding Some Providers

What will be the long-run impact of this program? The program was explicitly framed as a pilot, where the government would learn what works and what does not in the first year and adjust accordingly. Adjustments could be made on many different dimensions, but a unique feature of this program is the existence of eight independent operators offering competing services. This provides the opportunity for the PSL program to improve performance not only through learning by operators, but also through learning by the government about operators. Taking operator-specific performance as a fixed characteristic, we calculate how overall program performance could be improved in terms of both learning gains and nonlearning outcomes through selective renewal or cancellation of operator contracts.

<span id="page-29-0"></span><sup>30</sup> An alternative is to estimate the overall treatment effect using a Bayesian framework among the experiments from the providers that are not dropped (akin to Bayesian meta-analysis). However, as we argue above, it would be naïve to treat providers' estimators as completely independent from each other and the treatment effect from the dropped providers is informative of the overall treatment effect (even in the absence of these providers), as well as the treatment effect in the average school in the sample.

For example, setting aside any political economy considerations, the government could drop the two providers that did not make much effort to manage their schools (Omega and Stella Maris). It could also drop any provider who is potentially generating negative externalities (Bridge) or who may fail on dimensions different from test scores such as protecting students from physical and sexual abuse (More Than Me and YMCA). We estimate these potential outcomes by taking an inverse-variance weighted average across providers. We do this using both the raw estimates and comparable treatment estimates for completeness; however, given that the comparable treatment estimates are meant to inform about the treatment estimates from the operators in any school in the experiment we focus on those. Dropping the worst-performing providers (Omega and Stella Maris) increases the overall treatment effect to $0.23\sigma$, while dropping the provider that may generate negative externalities (Bridge) reduces the treatment effect to $0.16\sigma$ (see online Appendix Table A.18 for details). Dropping both the worst performers and Bridge increases the overall treatment effect to $0.2\sigma$. Also dropping More Than Me and YMCA, who have allegedly failed to safeguard children in their schools from sexual abuse,<sup>31</sup> results in an overall treatment effect of $0.19\sigma$. While the political economy of provider selection is nontrivial, we see this as prima facie evidence that the program has the potential to improve outcomes further by selecting providers dynamically.
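The inverse-variance weighting used in these counterfactuals can be sketched as below. The inputs are the raw composite estimates from Table 10, not the comparable estimates the text focuses on, so the numbers produced will differ from those in online Appendix Table A.18; the function name `ivw_average` is hypothetical.

```python
# Raw composite test-score effects and standard errors (in SDs) from Table 10.
raw_effects = {
    "BRAC": (0.14, 0.09),
    "Bridge": (0.36, 0.09),
    "MtM": (0.18, 0.22),
    "Omega": (-0.08, 0.11),
    "Rising": (0.42, 0.23),
    "Street Child": (0.27, 0.13),
    "Stella Maris": (-0.19, 0.22),
    "YMCA": (0.38, 0.26),
}

def ivw_average(effects, exclude=()):
    """Inverse-variance weighted mean of provider effects, skipping `exclude`."""
    kept = {name: est_se for name, est_se in effects.items() if name not in exclude}
    weights = {name: 1.0 / se**2 for name, (_, se) in kept.items()}
    total_weight = sum(weights.values())
    return sum(weights[name] * kept[name][0] for name in kept) / total_weight

scenarios = {
    "all providers": (),
    "drop Omega and Stella Maris": ("Omega", "Stella Maris"),
    "drop Bridge": ("Bridge",),
    "drop Omega, Stella Maris, and Bridge": ("Omega", "Stella Maris", "Bridge"),
}
for label, dropped in scenarios.items():
    print(f"{label}: {ivw_average(raw_effects, dropped):.2f} SD")
```

Because each provider's weight is the inverse of its squared standard error, large and precisely estimated providers such as Bridge and BRAC dominate the average, which is why excluding Bridge moves the overall effect so much.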

#### IV. Was PSL Worth the Cost?

To attempt an answer to this question we make two comparisons: a comparative cost-effectiveness calculation comparing PSL to a business-as-usual expansion of Liberia's public school system, and a cost-benefit calculation based on the net present value of the Mincerian earnings returns to the education provided by PSL. Both calculations require strong assumptions (Dhaliwal et al. 2014), and we discuss a range of plausible alternatives. We focus on cost-effectiveness in this section, but our cost-benefit analysis suggests PSL is worth the investment under a fairly robust set of assumptions if we do not take into account the additional cost incurred by providers (see online Appendix Section A.4 for details).

Our data on operator costs are imperfect (see Section IA), and it is extremely difficult to predict the long-term unit cost of the program. Therefore, we take as a lower bound US\$50 per pupil, which was the government's budget target for PSL and the transfer made to operators. Computing the benefits is more straightforward. The ToT effect is  $0.22\sigma$ , implying test scores increased at most by  $0.44\sigma$  per US\$100 spent (assuming a linear-dose relationship).
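The upper bound follows directly from dividing the ToT effect by the lower bound on cost. Writing $c$ for the true per-pupil cost (a symbol introduced here for illustration) and assuming, as above, that learning gains scale linearly with spending:

$$
c \ge \text{US\$}50 \quad\Longrightarrow\quad \frac{0.22\sigma}{c} \times \text{US\$}100 \;\le\; \frac{0.22\sigma}{\text{US\$}50} \times \text{US\$}100 = 0.44\sigma .
$$

Any provider spending above the US\$50 transfer therefore implies a gain per US\$100 below $0.44\sigma$.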

The PSL program reflects a fairly holistic (and costly) overhaul of how public schools operate. Comparing the average costs and benefits of a large-scale reform to the literature measuring treatment effects of marginal improvements to existing school systems may be uninformative. Nevertheless, some of these reforms, particularly those focused on increasing accountability (e.g., teacher performance pay and school-based management) have generated equal or greater increases in learning in other contexts, at lower cost per child.<sup>32</sup> Further testing would be required to know whether similar results could be achieved in Liberia.

<span id="page-30-0"></span><sup>31</sup> Baysah, "Liberia: Police Charge Youth Activist for Sodomy," *The New Republic Liberia*, and Young, "Unprotected," *ProPublica*.

<span id="page-30-1"></span><sup>32</sup> For example, Glewwe, Ilias, and Kremer (2010) in Kenya and Mbiti et al. (2019) in Tanzania show that teacher performance pay increased test scores by $6.29\sigma$ and $4.58\sigma$ per \$100 spent, respectively. In Indonesia, Pradhan et al. (2014) find that linking school committees to the village council increases test scores by $2.27\sigma$ per \$100 spent. For a review of the most cost-effective school-level interventions in the developing world, see Kremer, Brannen, and Glennerster (2013).

<span id="page-30-2"></span><sup>33</sup> We do not present a cost-effectiveness comparison of the effect of the program on access to schooling since the overall treatment effect on enrollment is not statistically different from zero. However, an alternative policy of increasing the number of teachers may attract new students, particularly if those new teachers were placed in new or understaffed schools.

Arguably, a more informative comparison is between PSL and a business-as-usual increase in expenditure on Liberian public schools. A useful benchmark for comparison is to assume that the government would follow its current pattern of spending almost exclusively on employing teachers. Thus, any increase in government expenditure would either increase teacher salaries or reduce pupil-teacher ratios. Existing experimental estimates from the developing world suggest that either strategy would have, at best, modest impacts on test scores. In Indonesia, de Ree et al. (2018) show that large unconditional increases in teacher salaries have no effect on student performance in the short run. In Kenya, Duflo, Dupas, and Kremer (2015) find that reducing the pupil-teacher ratio by 10 increases test scores by $0.06\sigma$, and in India, Banerjee et al. (2007) find no significant effect (and a point estimate of the opposite sign). Likewise, using data from control schools, we estimate that the relationship between pupil-teacher ratios and student test scores is $-0.0014\sigma$. Spending an extra US\$50 on hiring more teachers would cut pupil-teacher ratios in half (the average student faces a class size of 36) and increase test scores by $0.026\sigma$, compared to $0.22\sigma$ under PSL.

These estimates suggest that additional spending through PSL may be more cost-effective than additional spending (to increase the number of teachers) under business-as-usual. Indeed, increasing school resources without changing the incentives or the accountability structure has been shown to have little impact on learning outcomes in developing countries (Glewwe, Kremer, and Moulin 2009; Das et al. 2013; Sabarwal, Evans, and Marshak 2014; Mbiti et al. 2019).
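The business-as-usual benchmark can be reproduced approximately with a back-of-the-envelope calculation. The sketch below rests on two assumptions that are not stated explicitly in the text: that the $-0.0014\sigma$ estimate is the change in test scores per additional pupil per teacher, and that the relationship is linear over the whole range. The function and variable names are illustrative.

```python
# Figures taken from the text.
PTR_SLOPE_SD = -0.0014     # assumed unit: SD change per additional pupil per teacher
AVG_CLASS_SIZE = 36        # class size faced by the average student
PSL_TOT_EFFECT_SD = 0.22   # ToT effect of PSL for the same extra US$50 per pupil


def gain_from_halving_ptr(slope_sd_per_pupil: float, class_size: float) -> float:
    """Predicted test-score gain (in SD) from cutting the pupil-teacher ratio in half."""
    reduction_in_pupils = class_size / 2          # 36 -> 18 pupils per teacher
    return -slope_sd_per_pupil * reduction_in_pupils


bau_gain = gain_from_halving_ptr(PTR_SLOPE_SD, AVG_CLASS_SIZE)
ratio = PSL_TOT_EFFECT_SD / bau_gain

print(f"Business-as-usual gain: {bau_gain:.3f} SD")
print(f"PSL gain relative to business-as-usual: {ratio:.1f}x")
```

With the rounded coefficient this yields a gain of roughly $0.025\sigma$, slightly below the $0.026\sigma$ reported in the text; the gap is presumably due to rounding of the published coefficient. Either way, the PSL effect is close to an order of magnitude larger for the same additional spending.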

#### V. Conclusions

Public-private partnerships in education are controversial and receive a great deal of attention from policymakers. Yet, there is little evidence for or against them in developing countries (Aslam, Rawal, and Saeed 2017). A typical argument in favor is that privately provided but publicly funded education is a means to inject cost efficiency into education without compromising equity. A typical argument against is that outsourcing will lead to student selection and other negative, unintended consequences.

We present empirical evidence to test both arguments. The Partnership Schools for Liberia program, a public-private partnership that delegated *management* of 93 public schools (3.4 percent of all public schools) to 8 different private organizations, was an effective way to circumvent weak public-sector management and improve learning outcomes. The ITT effects of private management on student test scores after one academic year of treatment are $0.18\sigma$ for English (p-value < 0.001) and $0.18\sigma$ for math (p-value < 0.001).

We find no evidence that providers engaged in student selection: the probability of remaining in a treatment school was unrelated to a student's age, gender, household wealth, or disability. However, costs were high, performance varied across providers, and the largest provider pushed excess pupils and underperforming teachers into other government schools or completely out of the system. In addition, while outside the scope of our experimental analysis, the program has been plagued by accusations that some operators failed to prevent, or actively concealed, sexual abuse in schools they managed. Teachers or staff of two PSL providers (More than Me Academy and Youth Movement for Collective Action) have been accused of sexual abuse since the start of the program, and an investigative report published in late 2018 alleged that More than Me Academy employed a serial child rapist in its schools prior to the start of PSL.<sup>34</sup>

<span id="page-31-0"></span><sup>34</sup> Baysah, "Liberia: Police Charge Youth Activist for Sodomy," *The New Republic Liberia*, and Young, "Unprotected," *ProPublica*.

One interpretation of our results is that contracting rules matter. Changing the details of the contract might improve the results of the program. For instance, contracts could forbid class-size caps or require that students previously enrolled in a school be guaranteed readmission once a school joins the PSL program. Similarly, contracts could require prior permission from the Ministry of Education before releasing a public teacher from their place of work. Stricter government oversight of child protection and vetting of private operators on this basis also appears warranted.

However, fixing the contracts and procurement process is not just a question of technical tweaks; it reflects a key governance challenge for the program. Contract differences reflect political influence: the largest provider opted not to take part in the competitive bidding process and made a separate bilateral agreement with the government. Ultimately, this agreement allowed the provider to push excess pupils and underperforming teachers into other government schools. This underlines the importance of uniform contracting rules and competitive bidding in a public-private partnership.

To our knowledge, we provide the first experimental estimates of the intention-to-treat (ITT) effect of outsourcing the management of existing public schools to private providers in a developing country. In contrast to the US charter school literature, which focuses on experimental effects for the subset of schools and private providers where excess demand necessitates an admissions lottery, we provide treatment effects from across the distribution of outsourced schools in this setting.

However, an assortment of questions remain open for future research. First, given the bundled nature of this program, more evidence is needed to isolate the effect of outsourcing management. Variations of outsourcing also need to be studied (e.g., not allowing any teacher reassignments, or allowing providers to hire teachers directly).

Second, while we identify sources of possible externalities from the program (e.g., pushing excess pupils into nearby schools) we are unable to study the effect of these externalities (positive or negative). Another key potential negative externality for other public schools is the opportunity cost of the program: PSL may deprive other schools of scarce resources by garnering preferential allocations of teachers or funding. On the other hand, traditional public schools may learn better management or pedagogical practices from nearby PSL schools. In addition, the program may lead to changes within the Ministry of Education that improve the performance of the system as a whole. For example, the need to monitor private providers has spurred the Ministry to reform some of its administrative information systems for all schools. All of this points to the need for future research to study these system-level effects and assess the impact of potentially important externalities.

#### REFERENCES

- **Abdulkadiroğlu, Atila, Joshua D. Angrist, Peter D. Hull, and Parag A. Pathak.** 2016. "Charters without Lotteries: Testing Takeovers in New Orleans and Boston." *American Economic Review* 106 (7): 1878–1920.
- **Agor, Weston H.** 1989. "Intuition and Strategic Planning: How Organizations Can Make Productive Decisions." *Futurist* 23 (6).
- **Akerlof, George A., and Rachel E. Kranton.** 2005. "Identity and the Economics of Organizations." *Journal of Economic Perspectives* 19 (1): 9–32.
- **Andrabi, Tahir, Natalie Bau, Jishnu Das, and Asim Ijaz Khwaja.** 2010. "Are Bad Public Schools Public 'Bads?' Test Scores and Civic Values in Public and Private Schools." Unpublished.

- **Andrabi, Tahir, Jishnu Das, and Asim Ijaz Khwaja.** 2017. "Report Cards: The Impact of Providing School and Child Test Scores on Educational Markets." *American Economic Review* 107 (6): 1535–63.
- **Andrabi, Tahir, Jishnu Das, Asim Ijaz Khwaja, and Tristan Zajonc.** 2011. "Do Value-Added Estimates Add Value? Accounting for Learning Dynamics." *American Economic Journal: Applied Economics* 3 (3): 29–54.
- **Aslam, Monazza, Shenila Rawal, and Sahar Saeed.** 2017. *Public-Private Partnerships in Education in Developing Countries: A Rigorous Review of the Evidence.* London: Ark Education Partnerships Group.
- **Banerjee, Abhijit, Shawn Cole, Esther Duflo, and Leigh Linden.** 2007. "Remedying Education: Evidence from Two Randomized Experiments in India." *Quarterly Journal of Economics* 122 (3): 1235–64.
- **Banerjee, Abhijit, Rema Hanna, Jordan Kyle, Benjamin A. Olken, and Sudarno Sumarto.** 2019. "Private Outsourcing and Competition: Subsidized Food Distribution in Indonesia." *Journal of Political Economy* 127 (1): 101–37.
- **Barrera-Osorio, Felipe, David S. Blakeslee, Matthew Hoover, Leigh Linden, Dhushyanth Raju, and Stephen P. Ryan.** 2017. "Delivering Education to the Underserved through a Public-Private Partnership Program in Pakistan." NBER Working Paper 23870.
- **Bennedsen, Morten, Kasper Meisner Nielsen, Francisco Pérez-González, and Daniel Wolfenzon.** 2007. "Inside the Family Firm: The Role of Families in Succession Decisions and Performance." *Quarterly Journal of Economics* 122 (2): 647–91.
- **Besley, Timothy, and Maitreesh Ghatak.** 2005. "Competition and Incentives with Motivated Agents." *American Economic Review* 95 (3): 616–36.
- **Betts, Julian R., and Y. Emily Tang.** 2014. "A Meta-Analysis of the Literature on the Effect of Charter Schools on Student Achievement." CRPE Technical Report.
- **Bloom, Erik, Indu Bhushan, David Clingingsmith, Rathavuth Hong, Elizabeth King, Michael Kremer, Benjamin Loevinsohn, and J. Brad Schwartz.** 2007. "Contracting for Health: Evidence from Cambodia." Unpublished.
- **Bloom, Nicholas, Benn Eifert, Aprajit Mahajan, David McKenzie, and John Roberts.** 2013. "Does Management Matter? Evidence from India." *Quarterly Journal of Economics* 128 (1): 1–51.
- **Bloom, Nicholas, Renata Lemos, Raffaella Sadun, and John Van Reenen.** 2015. "Does Management Matter in Schools?" *Economic Journal* 125 (584): 647–74.
- **Bloom, Nicholas, James Liang, John Roberts, and Zhichun Jenny Ying.** 2015. "Does Working from Home Work? Evidence from a Chinese Experiment." *Quarterly Journal of Economics* 130 (1): 165–218.
- **Brault, Matthew W.** 2011. "School-Aged Children with Disabilities in US Metropolitan Statistical Areas: 2010." *American Community Survey Briefs* ACSBR/10-12.
- **Bridge International Academies.** 2017. "Bridge International Academies' Written Evidence to the International Development Committee Inquiry on DFID's Work on Education: Leaving No One Behind?" London: House of Commons International Development Committee.
- **Bruns, Barbara, and Javier Luque.** 2014. *Great Teachers: How to Raise Student Learning in Latin America and the Caribbean*. Washington, DC: World Bank.
- **Cabral, Sandro, Sergio G. Lazzarini, and Paulo Furquim de Azevedo.** 2013. "Private Entrepreneurs in Public Services: A Longitudinal Examination of Outsourcing and Statization of Prisons." *Strategic Entrepreneurship Journal* 7 (1): 6–25.
- **Chabrier, Julia, Sarah Cohodes, and Philip Oreopoulos.** 2016. "What Can We Learn from Charter School Lotteries?" *Journal of Economic Perspectives* 30 (3): 57–84.
- **Crawfurd, Lee.** 2017. "School Management and Public-Private Partnerships in Uganda." *Journal of African Economies* 26 (5): 539–60.
- **Cremata, Edward, Devora Davis, Kathleen Dickey, Kristina Lawyer, Yohannes Negassi, Margaret E. Raymond, and James L. Woodworth.** 2013. *National Charter School Study: 2013.* Stanford, CA: Center for Research on Education Outcomes.
- **Das, Jishnu, Stefan Dercon, James Habyarimana, Pramila Krishnan, Karthik Muralidharan, and Venkatesh Sundararaman.** 2013. "School Inputs, Household Substitution, and Test Scores." *American Economic Journal: Applied Economics* 5 (2): 29–57.
- **Das, Jishnu, and Tristan Zajonc.** 2010. "India Shining and Bharat Drowning: Comparing Two Indian States to the Worldwide Distribution in Mathematics Achievement." *Journal of Development Economics* 92 (2): 175–87.
- **de Ree, Joppe, Karthik Muralidharan, Menno Pradhan, and Halsey Rogers.** 2018. "Double for Nothing? Experimental Evidence on an Unconditional Teacher Salary Increase in Indonesia." *Quarterly Journal of Economics* 133 (2): 993–1039.

- **Dhaliwal, Iqbal, Esther Duflo, Rachel Glennerster, and Caitlin Tulloch.** 2014. "Comparative Cost-Effectiveness Analysis to Inform Policy in Developing Countries: A General Framework with Applications for Education." In *Education Policy in Developing Countries*, edited by Paul Glewwe, 285–338. Chicago: University of Chicago Press.
- **DIVA-GIS.** 2016. Liberia Administrative Areas. [http://biogeo.ucdavis.edu/data/diva/adm/LBR\_adm.zip](http://biogeo.ucdavis.edu/data/diva/adm/LBR_adm.zip) (accessed June 1, 2016).
- **Duflo, Esther, Pascaline Dupas, and Michael Kremer.** 2015. "School Governance, Teacher Incentives, and Pupil-Teacher Ratios: Experimental Evidence from Kenyan Primary Schools." *Journal of Public Economics* 123: 92–110.
- **Duggan, Mark.** 2004. "Does Contracting Out Increase the Efficiency of Government Programs? Evidence from Medicaid HMOs." *Journal of Public Economics* 88 (12): 2549–72.
- **Fryer, Roland G., Jr.** 2014. "Injecting Charter School Best Practices into Traditional Public Schools: Evidence from Field Experiments." *Quarterly Journal of Economics* 129 (3): 1355–1407.
- **Galiani, Sebastian, Paul Gertler, and Ernesto Schargrodsky.** 2005. "Water for Life: The Impact of the Privatization of Water Services on Child Mortality." *Journal of Political Economy* 113 (1): 83–120.
- **Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin.**  2014. *Bayesian Data Analysis.* Boca Raton, FL: CRC Press.
- **Glewwe, Paul, Nauman Ilias, and Michael Kremer.** 2010. "Teacher Incentives." *American Economic Journal: Applied Economics* 2 (3): 205–27.
- **Glewwe, Paul, Michael Kremer, and Sylvie Moulin.** 2009. "Many Children Left Behind? Textbooks and Test Scores in Kenya." *American Economic Journal: Applied Economics* 1 (1): 112–35.
- **Hanushek, Eric A., John F. Kain, and Steven G. Rivkin.** 2004. "Disruption versus Tiebout Improvement: The Costs and Benefits of Switching Schools." *Journal of Public Economics* 88 (9): 1721–46.
- **Hanushek, Eric A., and Ludger Woessmann.** 2016. "School Resources and Student Achievement: A Review of Cross-Country Economic Research." In *Cognitive Abilities and Educational Outcomes*, edited by Monica Rosén, Kajsa Yang Hansen, and Ulrika Wolff, 149–71. Cham, Switzerland: Springer.
- **Hart, Oliver, Andrei Shleifer, and Robert W. Vishny.** 1997. "The Proper Scope of Government: Theory and an Application to Prisons." *Quarterly Journal of Economics* 112 (4): 1127–61.
- **Holmström, Bengt, and Paul Milgrom.** 1991. "Multitask Principal-Agent Analyses: Incentive Contracts, Asset Ownership, and Job Design." *Journal of Law, Economics, and Organization* 7: 24–52.
- **Hsieh, Chang-Tai, and Miguel Urquiola.** 2006. "The Effects of Generalized School Choice on Achievement and Stratification: Evidence from Chile's Voucher Program." *Journal of Public Economics* 90 (8): 1477–1503.
- **King, Simon, Medina Korda, Lee Nordstrum, and Susan Edwards.** 2015. *Endline Assessment of the Impact of Early Grade Reading and Mathematics Interventions.* Research Triangle Park, NC: RTI International.
- **Kremer, Michael, Conner Brannen, and Rachel Glennerster.** 2013. "The Challenge of Education and Learning in the Developing World." *Science* 340 (6130): 297–300.
- **Kwauk, Christina, and Jenny Perlman Robinson.** 2016. *Bridge International Academies: Delivering Quality Education at a Low Cost in Kenya, Nigeria, and Uganda.* Washington, DC: Brookings Institution Press.
- **Lee, David S.** 2009. "Training, Wages, and Sample Selection: Estimating Sharp Bounds on Treatment Effects." *Review of Economic Studies* 76 (3): 1071–1102.
- **Lemos, Renata, and Daniela Scur.** 2016. "Developing Management: An Expanded Evaluation Tool for Developing Countries." Unpublished.
- **Liberia Institute of Statistics and Geo-Information Services.** 2014. *Liberia Demographic and Health Survey 2013.* Monrovia, Liberia: LISGIS.
- **Liberia Institute of Statistics and Geo-Information Services.** 2016. *Liberia Household Income and Expenditure Survey 2014–2015.* Monrovia, Liberia: LISGIS.
- **Loevinsohn, Benjamin, and April Harding.** 2005. "Buying Results? Contracting for Health Service Delivery in Developing Countries." *Lancet* 366 (9486): 676–81.
- **Lucas, Adrienne M., and Isaac M. Mbiti.** 2012. "Access, Sorting, and Achievement: The Short-Run Effects of Free Primary Education in Kenya." *American Economic Journal: Applied Economics* 4 (4): 226–53.
- **May, Shannon.** 2017. "DFID's Work on Education: Leaving No One Behind?" HC 639. London: House of Commons International Development Committee.
- **Mbiti, Isaac, Karthik Muralidharan, Mauricio Romero, Youdi Schipper, Constantine Manda, and Rakesh Rajani.** 2019. "Inputs, Incentives, and Complementarities in Education: Experimental Evidence from Tanzania." *Quarterly Journal of Economics* 134 (3): 1627–73.

- **Meager, Rachael.** 2019. "Understanding the Average Impact of Microcredit Expansions: A Bayesian Hierarchical Analysis of Seven Randomized Experiments." *American Economic Journal: Applied Economics* 11 (1): 57–91.
- **Millennium Challenge Corporation.** 2013. *Liberia Constraints Analysis*. http://www.liberianembassyus.org/uploads/PDFFiles/LIBERIA%20CONSTRAINTS%20ANALYSIS\_FINAL%20VERSION.pdf (accessed March 11, 2019).
- **Ministry of Education–Liberia.** 2015–2016. Education Management Information System (EMIS) Data. http://moe-liberia.org/emis-data/.
- **Ministry of Education–Liberia.** 2016a. Liberia Education Statistics Report 2015–2016.
- **Ministry of Education–Liberia.** 2016b. Memorandum of Understanding between Ministry of Education, Government of Liberia and Bridge International Academies. www.theperspective.org/2016/ppp\_mou.pdf (accessed August 6, 2017).
- **Ministry of Education–Liberia.** 2017a. Getting to Best Education Sector Plan, 2017–2021.
- **Ministry of Education–Liberia.** 2017b. PSL School Allocation: Decision Points. http://moe.gov/lr/wp-content/uploads/2017/06/Allocation-final.pdf (accessed July 28, 2017).
- **Muralidharan, Karthik, Abhijeet Singh, and Alejandro J. Ganimian.** 2019. "Disrupting Education? Experimental Evidence on Technology-Aided Instruction in India." *American Economic Review* 109 (4): 1426–60.
- **Muralidharan, Karthik, and Venkatesh Sundararaman.** 2015. "The Aggregate Effect of School Choice: Evidence from a Two-Stage Experiment in India." *Quarterly Journal of Economics* 130 (3): 1011–66.
- **Patrinos, Harry Anthony, Felipe Barrera-Osorio, and Juliana Guáqueta.** 2009. *The Role and Impact of Public-Private Partnerships in Education.* Washington, DC: World Bank.
- **Pradhan, Menno, Daniel Suryadarma, Amanda Beatty, Maisy Wong, Arya Gaduh, Armida Alisjahbana, and Rima Prama Artha.** 2014. "Improving Educational Quality through Enhancing Community Participation: Results from a Randomized Field Experiment in Indonesia." *American Economic Journal: Applied Economics* 6 (2): 105–26.
- **Romero, Mauricio, Justin Sandefur, and Wayne Sandholtz.** 2017. "Partnership Schools for Liberia (PSL) Program Evaluation." AEA RCT Registry. July 29. https://doi.org/10.1257/rct.1501-7.0.
- **Romero, Mauricio, Justin Sandefur, and Wayne Sandholtz.** 2018. "Partnership Schools for Liberia." Harvard Dataverse. https://doi.org/10.7910/DNV/5OIYU.
- **Rubin, Donald B.** 1981. "Estimation in Parallel Randomized Experiments." *Journal of Educational and Behavioral Statistics* 6 (4): 377–401.
- **Sabarwal, Shwetlena, David K. Evans, and Anastasia Marshak.** 2014. "The Permanent Input Hypothesis: The Case of Textbooks and (No) Student Learning in Sierra Leone (English)." World Bank Policy Research Working Paper WPS 7021.
- **Schermerhorn, John R., Richard N. Osborn, James G. Hunt, and Mary Uhl-Bien.** 2011. *Organizational Behavior.* New York: Wiley.
- **Singh, Abhijeet.** 2015. "Private School Effects in Urban and Rural India: Panel Estimates at Primary and Secondary School Ages." *Journal of Development Economics* 113 (March): 16–32.
- **Singh, Abhijeet.** Forthcoming. "Learning More with Every Year: School Year Productivity and International Learning Divergence." *Journal of the European Economic Association*.
- **Stallings, Jane A., Stephanie L. Knight, and David Markham.** 2014. "Using the Stallings Observation System to Investigate Time on Task in Four Countries." Washington, DC: World Bank.
- **Tuttle, Christina Clark, Philip Gleason, and Melissa Clark.** 2012. "Using Lotteries to Evaluate Schools of Choice: Evidence from a National Study of Charter Schools." *Economics of Education Review* 31 (2): 237–53.
- **UNESCO.** 2016. *Global Education Monitoring Report 2016.* Paris: UN Publications.
- **UNICEF.** 2013. *The State of the World's Children: Children with Disabilities.* Paris: UN Publications.
- **USAID.** 2017. RFP-SOL-669-17-000004-Read Liberia. www.fbo.gov/index?s=opportunity&mode=form&id=e53cb285301f7014f415ce91b14049a3&tab=core&tabmode=list&= (accessed August 6, 2017).
- **Useem, Bert, and Jack A. Goldstone.** 2002. "Forging Social Order and Its Breakdown: Riot and Reform in US Prisons." *American Sociological Review* 67 (4): 499–525.
- **van der Linden, Wim J.** 2018. *Handbook of Item Response Theory.* Boca Raton, FL: CRC Press.
- **Woodworth, James L., Margaret E. Raymond, Chunping Han, Yohannes Negassi, W. Payton Richardson, and Will Snow.** 2017. *Charter Management Organizations*. Stanford, CA: Center for Research on Education Outcomes.
- **World Bank.** 2013. Net ODA Received (% of GDP). https://datacatalog.worldbank.org/net-oda-received-gdp (accessed April 1, 2019).

- **World Bank.** 2014. Life Expectancy. [http://data.worldbank.org/indicator/SE.PRM.NENR?locations=LR](http://data.worldbank.org/indicator/SE.PRM.NENR?locations=LR).
- **World Bank.** 2015a. *Conducting Classroom Observations: Analyzing Classroom Dynamics and Instructional Time: Using the Stallings 'Classroom Snapshot' Observation System: User Guide.*  Washington, DC: World Bank.
- **World Bank.** 2015b. *World Bank Group Support to Public-Private Partnerships: Lessons from Experience in Client Countries, FY02-12.* Washington, DC: World Bank.
- **World Bank.** 2016. Deposit Interest Rate (%). Data Retrieved from World Development Indicators. https://data.worldbank.org/indicator/FR.INR.DPST?locations=LR.
- **World Bank.** 2017. GDP Per Capita (Current US\$). Data retrieved from World Development Indicators. https://data.worldbank.org/indicator/NY.GDP.PCAP.CD.
- **Zhang, Hongliang.** 2014. "The Mirage of Elite Schools: Evidence from Lottery-Based School Admissions in China." Unpublished.
