ECOS3002 Development Economics
Lecture 1
Faculty of Arts and Social Sciences
School of Economics
Development Economics
Theory and Practice 2nd Edition
By Alain de Janvry & Elisabeth Sadoulet
ISBN: 9780367456474
Chapter 1: What is Development? Indicators and Issues
What is our objective?
• This class is about economic development.
• We will mostly talk about countries’ current state of development, and how to have
more or better development.
• “more” and “better” imply that we have an objective in mind, a standard we’re trying to
attain.
• This lecture/chapter is about how we conceptualize development.
The big idea: enhancing wellbeing
• A basic definition (from our textbook): “Development is about the enhancement of
human wellbeing.”
• While this seems obvious, it’s not very practically useful. People can define “human
wellbeing” in vastly different ways, as different individuals, groups, or nations may
perceive different social needs and aspirations.
• Have you seen this in your own life, perhaps as your own values come into contrast with
those of other family members (different generations) or your friends from different
cultural backgrounds: what is the ideal major, career, income level, or family size?
Note: we won’t focus on objectives
• This class isn’t about deciding what our concept of development should be.
• That is mostly an argument for politicians, philosophers, human rights activists, the general
public, etc., to have, as it touches on ideology, which may be rooted in different baseline
assumptions about how to define “wellbeing.”
• As economists we can and should bring our own perspectives to these debates, as members of
society!
• But our main focus as development economists is on how to achieve the development objectives
of our community, society, or nation. We largely take the objectives as given.
Seven dimensions of development
• Our textbook posits 7 dimensions of development:
1. Income and income growth.
2. Poverty and hunger.
3. Inequality and inequity.
4. Vulnerability.
5. Basic needs in education and health.
6. Environmental sustainability.
7. Quality of life.
• Does this already reflect a particular perspective on development (e.g., Western, globalist,
etc.)?
1. Income and growth
• In market economies, individuals can trade their time, effort, and ingenuity, for income.
• So, on this dimension, more income is better.
• But when it comes to a group, community, or society, we have a distribution of income.
We can describe characteristics of this distribution, e.g.,
• The average or the per capita (pc) income;
• The income at different percentiles;
• The income earned by different sub-groups.
1. Income and growth
• To make comparisons between countries, it isn’t enough to just add up the incomes
earned by all the people in the country, e.g.,
• Some people earn income from wages, but they can also earn returns to capital.
• For businesses, an employee’s wage is an expense that reduces their profits (which is a source of
earnings for their owners).
• People and countries have international ties, e.g., import and export, receiving and sending
remittances, etc.
• A more robust way to measure national income is based on net production.
1. Income and growth
• Gross Domestic Product (GDP): the aggregate of value-added by all firms in the country.
• This includes production for home consumption (you produce what you consume without
selling it on the market), at opportunity cost.
• Gross National Product (GNP): GDP + net factor incomes from abroad (net repatriated
profits + net remittances).
• Gross National Income (GNI): GNP, subtracting depreciation and indirect business taxes.
1. Income and growth
• We can compare countries based on GDP (or GNP, GNI), or on their per capita values,
simply by dividing:
GDPpc = GDP / population
this provides an important correction for the fact that countries may have very different
population sizes.
• To get economic growth, we just take the change in GDP, GDPpc, etc., i.e.,
GDP growth = (GDPt – GDP(t-1)) / GDP(t-1)
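The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function names and all of the GDP and population figures are hypothetical, chosen to make the arithmetic easy to follow. It also shows that GDP and GDPpc can grow at different rates when the population is growing.

```python
def gdp_per_capita(gdp: float, population: float) -> float:
    """GDPpc = GDP / population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return gdp / population


def growth_rate(current: float, previous: float) -> float:
    """Growth = (value_t - value_(t-1)) / value_(t-1)."""
    if previous == 0:
        raise ValueError("previous-period value must be non-zero")
    return (current - previous) / previous


# Hypothetical figures for one country in two consecutive years.
gdp_last_year = 500e9   # GDP in period t-1
gdp_this_year = 520e9   # GDP in period t
pop_last_year = 50e6    # population in period t-1
pop_this_year = 51e6    # population in period t

gdp_growth = growth_rate(gdp_this_year, gdp_last_year)
gdppc_growth = growth_rate(
    gdp_per_capita(gdp_this_year, pop_this_year),
    gdp_per_capita(gdp_last_year, pop_last_year),
)

print(f"GDP growth:   {gdp_growth:.2%}")
print(f"GDPpc growth: {gdppc_growth:.2%}")
```

In this made-up case total GDP grows faster than GDPpc, because part of the extra output is spread over a larger population.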
1. Income and growth
• In principle, to compare between countries, we could just convert GDP to a common
currency (e.g., USD), at the official exchange rate.
• However, in order to make comparisons between countries, we need to deal with the
fact that
(1) countries measure their production in their own currency,
(2) exchange rates may be driven by factors that don’t fully reflect the state of an
economy,
(3) if we want to compare well-being, equivalent goods may have different prices in
different countries.
1. Income and growth
• To deal with (2) and (3), we can:
• Use an “equilibrium exchange rate” that isn’t as affected by arbitrary fluctuations.
• Adjust the currencies for their Purchasing Power Parity (PPP). This adjusts for the fact that
similar-quality goods and services can be much cheaper in poorer countries. Residents of
these countries would look worse off than they actually are, if we don’t correct for the fact
that they pay less for some goods and services.
• This leads to the formula in the textbook:
PPP-adjusted GDPpc = (1 / PPPe)*GDPpcLCU
where LCU refers to the local currency, and PPPe is the PPP-adjusted exchange rate (local
currency units per USD), based on comparable consumption baskets.
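As a rough numerical illustration of the PPP formula, the sketch below converts a GDPpc figure from local currency to USD twice: once at a market exchange rate and once at a PPP rate. All numbers are hypothetical, and the code assumes that PPPe is expressed in local currency units per USD, so that dividing by it yields USD.

```python
def ppp_adjusted_gdppc(gdppc_lcu: float, ppp_rate: float) -> float:
    """PPP-adjusted GDPpc = (1 / PPPe) * GDPpc in LCU.

    ppp_rate (PPPe) is assumed to be quoted as local currency units per USD.
    """
    if ppp_rate <= 0:
        raise ValueError("PPP rate must be positive")
    return gdppc_lcu / ppp_rate


# Hypothetical country: prices are low, so one USD of purchasing power
# costs fewer local currency units than the market exchange rate suggests.
gdppc_lcu = 150_000.0   # GDP per capita in local currency units
market_rate = 75.0      # LCU per USD at the official exchange rate
ppp_rate = 25.0         # LCU per USD at purchasing power parity

at_market = gdppc_lcu / market_rate
at_ppp = ppp_adjusted_gdppc(gdppc_lcu, ppp_rate)

print(f"GDPpc at market exchange rate: {at_market:,.0f} USD")
print(f"GDPpc at PPP:                  {at_ppp:,.0f} USD")
```

With these invented rates the PPP-adjusted figure is three times the market-rate figure, which is the sense in which residents would "look worse off than they actually are" without the correction.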
1. Income and growth
• GNIpc is used by the World Bank to classify countries based on their stage of economic
development.
• The World Bank classifies economies into 4 categories:
• Low income countries (LIC)
• Lower middle income countries (LMIC)
• Upper middle income countries (UMIC)
• High income countries (HIC)
• We will typically consider LIC, LMIC and UMIC as developing countries, and HIC as
developed countries, though this terminology is being phased out.
1. Income and growth
• Beyond broad critiques of using concepts like GDP and GDP growth to measure wellbeing and guide societal development, there are substantive critiques of the approach,
e.g.,
• It excludes goods and services not transacted in the market, such as household work. Given
that globally women tend to perform this work, the contributions of women to the economy
can be seriously undercounted.
• GDP misses negative externalities like crime or social breakdown, and environmental damage.
• Though rarely used in practice, the Genuine Progress Indicator (GPI) adjusts GDP for these omissions:
GPI = GDP + Value of unpaid work − Costs of crime and social breakdown − Cost of environmental
damage
• In the US, GPI < GDP and the gap is widening over time.
2. Poverty and hunger
• While the field of development economics is about the overall development of countries, in practice it focuses most on reducing poverty, that is, on the subset of the population that is considered near or below a poverty line.
• This comes back to the distribution of income and wealth – two countries with the same income (GDP) could have very different rates of poverty. E.g.,
• Having a few very rich people, offset by a large proportion (e.g., 20%, 30%, or 40%) of the population in poverty.
• Having moderate rates of extreme wealth, but also moderate rates of poverty.
2. Poverty and hunger
• We typically define poverty according to a level of income that is sufficient to procure a minimal consumption bundle in a given country – a minimal caloric intake ensuring lack of hunger, clothing, housing, transport, etc., adjusted for household size.
• This deals with the measurement issue that the cost of goods and services might differ between countries, and allows us to compare poverty between countries by comparing rates of poverty (e.g., % in poverty).
• We will return to the issue of poverty measurement in lecture 3 (week 3), when we study Chapter 5, which provides a more comprehensive take on poverty and vulnerability.
3. Inequality and inequity
• While poverty focuses on the “left tail” of the income or wealth distribution, inequality describes the entire distribution. Roughly, inequality is “the share of aggregate income held by the top X percent of the population relative to the bottom Y percent.”
• This brings the focus of economic development not just on the status of the poor, but also on how relatively well the better-off are doing.
• It is not universally accepted that we should be very concerned with inequality; however, it has received increased focus in recent years as many measures of inequality have increased around the world.
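The share-ratio definition of inequality quoted above can be made concrete with a short sketch. The function name, the choice of X = 10 and Y = 40, and both income lists are hypothetical; the two invented economies have the same total income, so any difference in the ratio reflects the distribution alone.

```python
def income_share_ratio(incomes, top_pct, bottom_pct):
    """Share of aggregate income held by the top `top_pct` percent of people,
    divided by the share held by the bottom `bottom_pct` percent."""
    if not incomes:
        raise ValueError("incomes must not be empty")
    ordered = sorted(incomes)
    n = len(ordered)
    # Number of people in each group (at least one person per group).
    n_top = max(1, round(n * top_pct / 100))
    n_bottom = max(1, round(n * bottom_pct / 100))
    total = sum(ordered)
    top_share = sum(ordered[-n_top:]) / total
    bottom_share = sum(ordered[:n_bottom]) / total
    if bottom_share == 0:
        raise ValueError("bottom group has zero income; ratio is undefined")
    return top_share / bottom_share


# Two hypothetical ten-person economies with the same total income (100).
equal_economy = [10] * 10
unequal_economy = [2, 2, 2, 2, 6, 6, 10, 10, 20, 40]

ratio_equal = income_share_ratio(equal_economy, top_pct=10, bottom_pct=40)
ratio_unequal = income_share_ratio(unequal_economy, top_pct=10, bottom_pct=40)

print(f"top 10% / bottom 40%, equal economy:   {ratio_equal:.2f}")
print(f"top 10% / bottom 40%, unequal economy: {ratio_unequal:.2f}")
```

Because average income is identical in the two lists, a per capita income measure would not distinguish them, while the share ratio does.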
Might high rates of inequality decrease overall growth and development?
3. Inequality and inequity
• While inequality is an ex post concept (something we look at afterward), inequity is an ex ante (beforehand) concept: “the degree of equality in opportunities to generate future income or to achieve other development objectives.”
• E.g., the opportunity to obtain education, as measured by the probability of achieving a level of education like a high school diploma.
• This is also related to Sen’s concept of “capabilities” – what people can do with their opportunity set.
• We will return to the issue of inequality and inequity in lecture 3 (week 3), when we study Chapter 6, which provides a more comprehensive take on inequality and inequity.
4. Vulnerability and poverty
• We can define vulnerability (to poverty, food insecurity, or hunger) as “the probability of falling into (poverty, food insecurity, or hunger) for the non-(poor, food insecure, hungry).”
• This introduces the idea of risk into analyzing economic development, in particular the vulnerability to “shocks” – bad events that can increase poverty such as illness, death of a family member, crop failure, civil conflict, etc.
• There is rapidly growing empirical evidence that exposure to uninsured risks is one of the leading causes of poverty.
4. Vulnerability and poverty
• Exposure to shocks can have a number of negative impacts:
• People may change their behavior in anticipation of shocks. For example, a farmer who worries that a promising new seed variety might fail, leaving them with nothing to fall back on, might not try it. This is related to their degree of risk aversion (the concavity of their utility function, as you may have studied in microeconomics).
• Exposure to shocks may pull people into irreversible poverty / hunger / food insecurity, aka chronic or persistent poverty.
This is also known as a “poverty trap.”
• Even if shocks don’t have permanent impacts, they can devastate the well-being of a household for many years. For example, a household facing a shock may sell off productive assets to survive (e.g., livestock, land, gold), and it may take them a long time to recover.
4. Vulnerability and poverty
• How resilient are households to shocks and vulnerability? We can distinguish to what extent households in the same community, region, or nation face the same risk.
1. Covariate risk is the component of risk that is jointly shared by households in a community, region, or nation. If everyone faces the same shock, then it’s hard to help each other, and outside help (e.g., from the government, or other nations) is needed.
2. If instead covariance is low, then we have idiosyncratic risk, and as different shocks hit, the less-affected households can more easily help the more-affected ones (mutual insurance).
4. Vulnerability and poverty
• There are ongoing efforts to reduce vulnerability, both through public schemes like social programs (e.g., social welfare programs like those managed by Centrelink in Australia, or Medicare), and schemes provided mostly through private markets (e.g., insurance, like homeowners or car insurance in Australia).
• We will look more into these issues when we study Chapter 5 on poverty and vulnerability analysis in week 3 (lecture 3), Chapter 13 on financial services for the poor in week 7 (lecture 7) and Chapter 14 on social programs and targeting in week 9 (lecture 8).
5. Basic needs: human development
• The first 4 dimensions of development so far are monetary dimensions of development. But arguably the most important measure of development is the capacity of people – their knowledge and physical well-being.
• Of course, this can also plausibly contribute to monetary metrics of growth.
• A number of thinkers have argued that meeting basic human needs is a key measure of development, in areas such as education, health, nutrition, sanitation and housing. However, without a single metric like a monetary metric, there is still much subjectivity and debate over how to weight the importance of these needs.
5. Basic needs: human development
• Some indicators of basic needs:
• Child health: z-scores (World Health Organization). For indicators without a standard scale, we can calculate a z-score: z = (x - μ)/σ, where x is an individual score (e.g., height, weight), μ is the mean value of x in a population, and σ is the standard deviation of the distribution of x in the population. To standardize across countries, μ and σ are taken from a US reference population. The two main measures are height-for-age and weight-for-age.
• Global Burden of Disease (GBD) (World Health Organization). Calculated as “the gap between the current health status of a population and the ideal situation” (everyone lives to old age, disease and disability free). Measured in Disability Adjusted Life Years (DALY), where a DALY is “one lost year of healthy life due to premature mortality or to ill health or disability,” i.e., DALY = YLL (years of life lost) + YLD (years lived with disability). Then GBD = the share of DALY in ideal life expectancy.
• Malnutrition: food insecurity (Food and Agriculture Organization). Proportion of population below minimal nutritional needs (2,800 kilocalories/person/day for adult men and 2,000 kilocalories/person/day for adult women, with moderate activity and lowest acceptable bodyweight). The depth of hunger is measured as the average distance to the nutritional norm.
5. Basic needs: human development
• Some indicators of basic needs:
• The classic Human Development Index (HDI) (United Nations Development Programme) for country $k$:
\[
HDI_k = \frac{1}{3} \sum_{i=1}^{3} \frac{H_{ik} - H_{i,\min}}{H_{i,\max} - H_{i,\min}}
\]
where $H_{ik}$ represents the value of an index of educational attainment (weighted average of literacy with primary, secondary, and tertiary gross enrollment), health (life expectancy), and income (PPP-adjusted per capita income) for country $k$.
• In 2010, the HDI was redefined to a multiplicative specification, allowing the indices to complement each other:
\[
HDI_k = \left( \prod_{i=1}^{3} \frac{H_{ik} - H_{i,\min}}{H_{i,\max} - H_{i,\min}} \right)^{1/3}
\]
5. Basic needs: human development
• Some indicators of basic needs:
• While the HDI makes an important contribution in providing a relatively simple, cross-country comparable index of human development, it has been criticized for its arbitrariness. E.g., why include income when the idea is to have a metric of basic needs? A key justification of income-only measures is that money can be converted to meet other needs like education and health. Why does each category get equal weight?
• Multidimensional poverty indices (MPI) are designed as an improvement over the HDI. An MPI considers a broader set of measures of living standards (in addition to the health and education measures in the HDI, and after dropping income, it includes access to electricity, drinking water, sanitation, flooring, cooking fuel, and assets), sets a threshold for each, and declares a household “poor” if it is below the threshold in at least 30% of categories. We’ll return to MPI in Chapter 5 on Poverty and Vulnerability Analysis.
6. Sustainability and use of natural resources
• Some of the measures thus far consider time-dependent dimensions of development within a person’s lifetime. E.g., people in school might not be contributing to GDP at the time, but they can produce a lot more later; reducing vulnerability might require up-front investments that reduce GDP, but greatly reduce the risk of people being worse off for sustained periods after facing shocks.
• Even further on the time dimension: what about caring about the wellbeing of future generations? Sustainability is defined as “the concern with intergenerational equity: that the wellbeing of future generations should not be inferior to that of the current generation, as a consequence of the current generation’s behaviour toward the use of natural resources and the environment.”
6. Sustainability and use of natural resources
• While this sounds plausible in principle, in practice it is hard to implement:
• Just as it is hard to find a universal definition of wellbeing for present generations, it is even more so for anticipating the wellbeing of future generations.
• Even if we can solve this definitional problem, we need to think about how future wellbeing will flow out of natural resources.
• If and when we address these challenges, we need to resolve the inevitable tradeoff between the wellbeing of the current generation and the wellbeing of future generations.
• We will return to these issues in lecture 12 (week 13), when we study Chapter 15, on sustainable development and the environment.
7. Quality of life
• The aforementioned 6 categories (income growth, escape from poverty, equality and equity, reduction of vulnerability, basic needs in education and health, and environmental sustainability) lend themselves relatively well to quantitative analysis, either based on money metrics, quantification of human development, or extending these ideas across generations to consider sustainability.
• However, many more theories of human wellbeing and development have been developed, much of this work going beyond the scope of ECOS3002.
• Our textbook additionally raises two further concepts of quality of life.
7. Quality of life
• The Nobel-prize winning economist Amartya Sen is known for his work on topics including famines, poverty measurement, and the “capability approach.”
• In his books Commodities and Capabilities (1985) and Development as Freedom (2000), he develops the idea of development as a process of expanding freedoms, within the following logical framework:
• Greater freedoms derive from capabilities (“the choices that a person makes among ‘functionings’ that could be achieved, and the freedoms he or she has in exercising such choices”).
• Functionings are in turn determined by “entitlements,” “the set of alternative commodities and services that a person can command in a society using the totality of rights and opportunities that he or she faces.”
7. Quality of life
• Entitlements are the fruits of a developed society – public goods, personal characteristics, asset endowments, social norms, environmental conditions, etc.
• Sen helped push the definition of development within the economic profession beyond monetary metrics to consider issues of freedom of choice. Monetary metrics could fail if there are differences in freedom – e.g., a wealthy member of a discriminated group may have a lower level of wellbeing than an average member of a favored group in society.
• Under Sen’s approach, progress in development isn’t just about raising GDPpc, it’s about attacking sources of capability deprivation and expanding the set of capabilities.
7. Quality of life
• As a second alternative, we have William Easterly’s (1999) Indicators of Quality of Life. Easterly is another prolific writer on economic development with a series of popular and influential books.
• Easterly (1999) proposes a set of 81 additional indicators beyond income. Beyond education, health, and inequality, these cover individual rights and democracy, political stability and peace, and the absence of “bads” (fraud, terrorism, crime, pollution, etc.).
• This can be seen as related to the multidimensional approach.
Subjective measures
• Unfortunately there is a tradeoff between how comprehensively we measure development and how easy it is to measure and derive comparisons. Single-index measures (e.g., money) are relatively easy to measure and compare, but may miss important things.
• Another approach is to collect subjective measures of well-being, such as measuring concepts like “happiness.” E.g., ask people on a scale of 1-10, “All things considered, how satisfied are you with your life as a whole these days?”
• The famous “Easterlin paradox” (Easterlin, 1974) involved showing that there is not a strong correlation between income and happiness in industrialized countries.
Subjective measures
• However, more recent work (e.g., Deaton, 2008) shows that rising GDPpc in developing countries does increase happiness.
• It appears that income tends to increase happiness below PPP-adjusted $10,000 USD per capita but not beyond, as other factors may play a greater role in driving happiness once basic needs are met beyond this income level.
• Meanwhile, the evidence on whether income growth increases well-being is relatively problematic and the positive evidence is relatively weak.
MDGs and SDGs
• The “development goals” represent a global effort to define minimum standards of development and to coordinate global efforts to achieve them.
• The Millennium Development Goals (MDGs) were declared in September 2000, and involved attempting to achieve 8 goals by 2015:
• To eradicate extreme poverty and hunger
• To achieve universal primary education
• To promote gender equality and empower women
• To reduce child mortality
• To improve maternal health
• To combat HIV/AIDS, malaria, and other diseases
• To ensure environmental sustainability
• To develop a global partnership for development
MDGs and SDGs
• Each MDG had specific targets and dates for achieving those targets. Progress towards these goals was uneven – some countries achieved many goals, some didn’t achieve any.
• The Sustainable Development Goals (SDGs) succeeded the MDGs in 2016, and are intended to be achieved by 2030. They are a set of 17 interlinked global goals, and in 2017 the UN General Assembly set specific targets for each goal, and indicators to measure progress toward each target (usually between 2020 and 2030, though some have no end date).
• You can learn more about the SDGs here (https://sdgs.un.org/) and review the SDG tracker here: https://sdg-tracker.org/.
Introduction to Impact Evaluation and RCTs (~early sections of Chapter 4)
Overview
Why evaluate impact?
• Organizations funding and implementing international development projects increasingly want to evaluate “impact.”
• Familiar approaches like M&E, process evaluations, qualitative assessments, etc., can verify implementation activities and provide critical diagnostic insights.
• However, such methods generally cannot tell us whether a project is improving the ultimate outcomes we care about – health, income, welfare, well-being, etc.
Impact Evaluations (IEs) can fill this gap.
• Furthermore, other methods can’t cleanly identify the magnitude of the impact of a project (how much does productivity increase, how much do profitability or income go up, etc.).
• Critical for cost-benefit analysis: compare program costs to real program benefits.
• Detailed data collection and measurement also provide insights into why and how programs work that we might not get from a less intrusive approach.
The goal of impact evaluation: causal inference
• The goal of an impact evaluation is typically to answer a question like: what is the (quantitative) effect of X (a “treatment”) on Y (an outcome)? In other words, how much would Y increase (or decrease) on average, purely due to X alone, all other things equal?
• This sounds easy in principle: just compare people, firms, farms, etc., that have the treatment, with those that don’t.
• The challenge comes from the fact that in many of the impact evaluation contexts we care about, human choices and/or other systems intervene to allocate treatment:
• Governments and NGOs choose who to give social, health, or educational benefits or programs to, and people choose whether or not to accept or seek out these benefits
• People choose whether or not to proceed in school
• Financial institutions choose who to lend to or offer other financial services
• Governments or private sector firms choose where to implement infrastructure projects
• Etc.
And the allocation of treatment may be a function of characteristics that also influence outcomes.
Classic example: returns to education
• Education is one of the largest areas of government expenditure, so it is massively important to know the returns to education, to optimize investment.
• The naïve approach would be to compare people who have more or fewer years of education.
• We can run a regression (best fit of a line to the data):
Y = a + b*X + e
where Y is income, and X is years of education.
Would ‘b’ tell us the true average annual return to education?
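One way to explore this question is with simulated data, where the true return is known by construction. The sketch below is not from the textbook: the data-generating process, the parameter values, and the variable called `ability` are all made-up assumptions. An unobserved trait raises both years of schooling and income, and the naive regression slope is then compared with the return that was built in.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_RETURN = 1.0  # assumed causal effect of one extra year of schooling on income
N = 20_000         # number of simulated people

schooling, income = [], []
for _ in range(N):
    ability = random.gauss(0, 1)               # unobserved by the researcher
    x = 12 + 2 * ability + random.gauss(0, 1)  # ability raises schooling
    # Income depends on schooling AND directly on ability.
    y = 10 + TRUE_RETURN * x + 3 * ability + random.gauss(0, 1)
    schooling.append(x)
    income.append(y)


def ols_slope(xs, ys):
    """Slope b from fitting Y = a + b*X + e by least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x


naive_b = ols_slope(schooling, income)
print(f"assumed true return: {TRUE_RETURN:.2f}")
print(f"naive estimate of b: {naive_b:.2f}")
```

In this constructed setup the naive slope comes out well above the assumed true return, because the unobserved trait sits in e and is correlated with X. How large such a gap is in real data depends on how selection actually works, which is exactly what we usually cannot observe.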
Classic example: returns to education
• Recall the naïve regression Y = a + b*X + e, where Y is income and X is years of education. Knowing the returns to education matters for optimizing subsidization, investment, regulation, etc. Would ‘b’ tell us the average return to education?
No! People can choose when to stop schooling, and the system may also have built-in barriers (e.g., entry exams). This can lead people who have innate characteristics that make them better at school to stay longer. If these innate characteristics can also drive earnings independent of X, then ‘b’ is not the “true” effect of schooling on earnings.
Classic example: returns to education
[Diagram: observables and unobservables each affect both X (treatment: education) and Y (outcome: income).]
Classic example: returns to education
• The problem with our regression in this case is that there are likely to be unobservable characteristics that lead people to get more years of schooling and also drive earnings (independent of schooling).
• This is often called “innate ability” but could be lots of things: ambition, drive, social influences (family), genetics, etc.
• In our naïve regression, these characteristics go into e:
Y = a + b*X + e
where Y is income, and X is years of education.
• So let’s just measure these characteristics? It’s generally accepted that that’s a fool’s errand:
◦ In many datasets where we want to evaluate impact, these variables don’t exist.
◦ Even if we collect the data, testing for a large range of characteristics would be incredibly expensive and we don’t have good tests for some of them.
Classic example: returns to education
• Another way to see this is to take a graphical approach.
• Again: the key challenge comes up if some factor(s) that are hard to measure drive both (1) selection, and (2) outcomes.
[Figure: income (vertical axis) at 10 versus 16 years of schooling. Annotations: “We don’t know where this is”; “Ideally we want to get rid of this middle bit (the ‘selection bias’), so the true effect of the treatment is just the difference in outcomes between the two groups.”]
Modern approach to impact evaluation: have a research design
• The modern approach to impact evaluation gives up on trying to run “kitchen sink regressions” (regressing Y on X and a lot of other stuff, to control for sources of selection bias):
Y = a + b*X + c*[Kitchen_Sink] + e
In some cases this may be the best we can do, but it often leads to undesirably weak or misleading results, and quite an unreliable guide for policy.
• In other words, interpreting such results as valid estimates of the true causal effect of the treatment relies on strong assumptions.
◦ Our kitchen sink approach requires us to assume that our paltry or ill-measured set of controls fully controls for all selection bias, and that the functional form of the regression is properly specified.
• Rather, the modern approach is to understand, and ideally control, the allocation of treatment (the selection process), as much as possible. This allows us to make the weakest possible assumptions about our ability to control for selection bias, and hence to generate the most reliable (i.e., believable) possible results.
The so-called “gold standard” of impact evaluation: the RCT
• The most direct way to control treatment assignment is to do it ourselves as researchers, in a way that minimizes the chances that treatment assignment is related to any characteristics of the treatment units.
• The best way to do this is randomization.
• In the simplest case, we run an RCT by taking a list of eligible participants, and randomizing them into two groups: treatment and control.
• For example, list the participants in an Excel sheet, generate a new column with random numbers (e.g., “=rand()”) and then ‘sort’ on the randomly generated variable, taking the top 50% of observations into treatment, and the bottom 50% into control.
RCTs are desirable for their design simplicity
• This is particularly desirable because then we don’t need to rely on fancy econometrics (which often relies on various assumptions) to identify causal effects. We can just run the following, where X is now our randomly-assigned treatment variable (X = 1 for treatment, X = 0 for control), and ‘b’ will be an unbiased estimate of the true average causal effect of the treatment:
Y = a + b*X + e
• So we can display results in a simple table or bar chart, comparing the control mean (a) to the treatment mean (a + b).
• We can add controls if we want, to be more efficient by removing some of the variation in Y, but we don’t need them to deal with selection bias.
• This means that RCTs rely on very weak assumptions. Many methods can help us estimate causal effects, but the stronger the assumptions a method relies on, the less credible it is. RCTs rely on the weakest identifying assumptions.
The “gold standard”: the RCT
• The randomized controlled trial (RCT) is considered the “gold standard” of impact evaluation, because it most convincingly deals with the selection bias problem.
• For this reason, RCTs are advocated as long as they are ethical, feasible, and the cost justifies the knowledge created.
[Figure: income (vertical axis) at 10 versus 16 years of schooling.]
RCTs aren’t the only credible way to do an impact evaluation
• There are other credible ways to do an impact evaluation.
• In economics these are known as “quasi-experimental” methods because they try to imitate what a pure experiment does – separating treatment from the characteristics of the treated units.
• Common methods in applied economics include:
• Regression discontinuity design (RDD)
• Difference-in-differences (DiD)
• Instrumental variables (IV)
• While we can learn a lot from these methods, they all suffer drawbacks relative to RCTs:
◦ RDD only estimates a local average treatment effect
◦ DiD relies on assumptions about counterfactual trends
◦ IV relies on an untestable assumption, the exclusion restriction
RCTs have become increasingly popular in (development) economics
The “credibility revolution”
• The increased use of RCTs has been at the forefront of a “credibility revolution” in empirical economics: an increase in the use of experiments (RCTs) and quasi-experimental methods, to study causal questions.
• “Design-based” empirical work has become increasingly common – relying on clean, transparent research designs that minimize heavy modeling assumptions.
• Overall, empirical work has come to dominate theoretical work, though theory always has an important role to play.
The “credibility revolution” in development economics
• While Banerjee, Duflo, and Kremer officially won the Nobel Prize “for their experimental approach to alleviating global poverty,” unofficially I think that their main influence, especially Duflo, was in pushing a credibility revolution in development economics.
• In the 1990s, development economics was dominated by theoretical work, macro-empirical analyses (so-called cross-country regressions), and a strand of microeconomic empirical research by researchers like Angus Deaton and Christopher Udry.
• Today the large majority of leading researchers are empiricists.
• If not running RCTs, they are conducting empirical research inspired by the experimental approach to causal inference, through quasi-experiments and identification-aware structural modeling.
• I would say that the norm is to think that “if you can randomize, then you should, but we can generate useful knowledge about many important issues without just running RCTs.”

RCTs haven’t “taken over” (development) economics
[Figure source: David McKenzie (2016)]

How are RCTs used to influence policy?
• As the evidence base from RCTs has built up, they have increasingly been used to influence public policy, especially in areas where evidence in favor of interventions was relatively thin or anecdotal.
• A classic example is microfinance. In the 2000s microfinance was all the rage, seen as a “silver bullet” to solve poverty.
• A number of key RCTs came out around 2010 showing that the effects of microfinance on the lives of the poor are much smaller than expected: not completely ineffective, and also not harmful as some critics claimed, but by no means a “silver bullet.”
• This had a huge influence on the global policy debate around supporting microfinance.

How are RCTs used to influence policy?
• It’s important to have a properly nuanced theory of policy influence.
• It’s rarely enough to throw a new fact (e.g., the result of an RCT) at a policy decision-maker and expect them to immediately change a policy.
• Policy change is often a political process. Timing is key, and deep research often takes time, so the results of the RCT may not always come at the right time to immediately influence policy change.
• Furthermore, RCTs often evaluate programs at small scale, or under ideal circumstances, and the reality of program scaling may put some limits on what we can do with the results of an RCT.
• It’s likely that policy change from RCTs, like any detailed research evidence, will involve ongoing processes of informing policymakers, lobbying, and assessing how the new evidence fits into the broader picture of policymaking.

How are RCTs used to influence policy?
• New initiatives all over the world promote the use of RCTs to influence policymaking, closely tying the decisions about which RCTs to implement, where, and when, to policy needs. For example:
◦ The World Bank has a unit specifically dedicated to impact evaluations (DIME), and requires a certain % of their projects to have rigorous impact evaluations or RCTs.
◦ Banerjee and Duflo’s organization, J-PAL, has made formal agreements with a number of states in India (e.g., Delhi, Jharkhand, Punjab, Tamil Nadu) to promote evidence-based policymaking, and partner on making use of rigorous evidence and/or conducting RCTs for public policy.
◦ A number of governments, e.g., Peru, have set up embedded “evaluation labs” within certain government ministries, such as the Ministry of Education.
◦ In the UK the so-called “Nudge Unit” has used behavioural economics-focused experiments to improve the public service. Similar ideas were also implemented in the USA under Obama, and in Australia (“Beta Unit”).

ECOS3002 Development Economics Lecture 1
Introduction to Impact Evaluation and RCTs: Some technical details

RCTs: analytical issues
• RCTs take a population that is eligible to receive a “treatment” and use randomization to decide which group will be offered treatment, and which will not (the control group, which serves as the counterfactual).
• This should lead to “balance” between the treatment and control groups – on average the two groups should have statistically similar characteristics.
• The idea that the two groups will be balanced if we randomize over a large enough sample is based on a result in probability theory, the Law of Large Numbers.
• It is like coin flips – if we flip a fair coin 400 times, it’s highly likely we’d get a share of heads close to 50% (about 200 heads and 200 tails), much more likely than if we flipped it only 4 times.
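The coin-flip intuition and the "random number, then sort" assignment procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1,000 participants, their baseline ages, the seed, and the helper names `heads_share` and `assign_treatment` are all hypothetical and do not come from the lecture or the textbook.

```python
import random


def heads_share(n_flips, rng):
    """Share of heads in n_flips tosses of a fair coin."""
    return sum(rng.random() < 0.5 for _ in range(n_flips)) / n_flips


def assign_treatment(participant_ids, rng):
    """Give each participant a random number, sort on it, and put the
    first half in treatment and the second half in control."""
    draws = {pid: rng.random() for pid in participant_ids}
    ranked = sorted(participant_ids, key=lambda pid: draws[pid])
    half = len(ranked) // 2
    return set(ranked[:half]), set(ranked[half:])


rng = random.Random(3002)  # fixed seed (an arbitrary choice) for reproducibility

# Law of Large Numbers: the share of heads settles near 0.5 as flips increase.
for n in (4, 400, 40000):
    print(n, "flips -> share of heads:", round(heads_share(n, rng), 3))

# Randomly assign 1,000 hypothetical participants who differ in baseline age.
ages = {pid: rng.gauss(35, 10) for pid in range(1000)}
treatment, control = assign_treatment(list(ages), rng)

# With a large sample the two groups should look similar on average.
mean_age_t = sum(ages[p] for p in treatment) / len(treatment)
mean_age_c = sum(ages[p] for p in control) / len(control)
print("mean age, treatment:", round(mean_age_t, 2))
print("mean age, control:  ", round(mean_age_c, 2))
```

Rerunning with a much smaller participant list (say 10 people) typically produces a visibly larger gap in mean age, which is the small-sample imbalance the coin-flip analogy warns about.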
RCTs: analytical issues
• However, in practice we need to verify that the “randomization worked.”
• It’s always possible that we flip a fair coin 400 times and get 300 heads and 100 tails.
• To verify this we do a “balance check”: compare the averages of some key variables in the treatment and control groups.
• Ideally, we would do a statistical test (a “t-test”) to test whether any differences are statistically significant.

RCTs: analytical issues
• Once we’re sure our sample is balanced, we can conduct the intervention on the treatment group.
• Once we collect endline data from the treatment and control groups, we can calculate treatment effects.
• In a simple RCT, this will just be the difference between the mean of an outcome in the treatment group and the mean of the same outcome in the control group.
• We can also implement this comparison in a regression framework, where we regress the outcome on a dummy variable indicating the treatment group.
• Again, we should do a t-test to test whether this difference is statistically significant.

RCTs: analytical issues
• Technically the aforementioned estimator is called the “intent to treat” effect (ITT). Under ITT, we count someone as treated if they were offered the treatment, even if in practice some participants refuse it (non-compliance).
• An alternative estimator of treatment impacts is treatment on the treated (ToT). Under this approach, only people who were offered and took up the treatment are considered treated. ToT is implemented in a two-stage least squares (instrumental variables) setup, in which the first stage uses the random offer to distinguish takers from non-takers of treatment.
• We can also stratify our analysis by sub-groups. For example: men vs women, urban vs rural, young vs old.
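The balance check, ITT, and ToT steps above can be illustrated with simulated data. The sketch below rests on several assumptions that are not in the lecture: a hypothetical sample of 4,000 people, a 60% take-up rate among those offered treatment, no take-up in the control group, and an effect of 10 on those who take it up. ToT is computed with the simple Wald ratio (ITT divided by the difference in take-up rates), which coincides with two-stage least squares when the only instrument is the binary offer and there are no controls; it is used here in place of a full regression routine.

```python
import math
import random
import statistics as st


def welch_t(x, y):
    """t-statistic for a difference in means, allowing unequal variances."""
    se = math.sqrt(st.variance(x) / len(x) + st.variance(y) / len(y))
    return (st.mean(x) - st.mean(y)) / se


def share(flags):
    """Fraction of True values in an iterable of booleans."""
    flags = list(flags)
    return sum(flags) / len(flags)


rng = random.Random(3002)
TRUE_EFFECT = 10.0   # assumed effect on people who actually take up treatment
TAKE_UP_RATE = 0.6   # assumed share of offered people who accept

rows = []
for _ in range(4000):
    baseline = rng.gauss(100, 10)                        # pre-treatment outcome
    offered = rng.random() < 0.5                         # random assignment
    took_up = offered and rng.random() < TAKE_UP_RATE    # imperfect compliance
    endline = baseline + TRUE_EFFECT * took_up + rng.gauss(0, 5)
    rows.append((baseline, offered, took_up, endline))

offered_rows = [r for r in rows if r[1]]
control_rows = [r for r in rows if not r[1]]

# Balance check: baseline means should not differ significantly.
balance_t = welch_t([r[0] for r in offered_rows], [r[0] for r in control_rows])

# ITT: difference in endline means by assignment (offered vs not offered).
# This equals the slope from regressing the outcome on the offer dummy.
y_t = [r[3] for r in offered_rows]
y_c = [r[3] for r in control_rows]
itt = st.mean(y_t) - st.mean(y_c)
itt_t = welch_t(y_t, y_c)

# ToT via the Wald ratio: scale ITT by the difference in take-up rates.
take_up_gap = share(r[2] for r in offered_rows) - share(r[2] for r in control_rows)
tot = itt / take_up_gap

print("balance t-stat (baseline):", round(balance_t, 2))
print("ITT:", round(itt, 2), "t-stat:", round(itt_t, 2))
print("take-up gap:", round(take_up_gap, 2), "ToT:", round(tot, 2))
```

Because only some of those offered take up the treatment, ITT is diluted relative to the effect on takers; dividing by the take-up gap undoes that dilution.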
ECOS3002 Development Economics Lecture 1
Introduction to Impact Evaluation and RCTs: Additional materials and resources

Useful resources on RCTs
Organizations:
◦ IPA: https://www.poverty-action.org/
◦ https://www.poverty-action.org/country/myanmar
◦ J-PAL: https://www.povertyactionlab.org/
◦ CEGA: https://cega.berkeley.edu/
◦ World Bank DIME: http://www.worldbank.org/en/research/dime
◦ 3ie: http://www.3ieimpact.org/

Useful resources on RCTs
Materials on RCTs and IEs
◦ J-PAL introduction to RCTs: https://www.povertyactionlab.org/research-resources
◦ World Bank Impact Evaluation in Practice: http://www.worldbank.org/en/programs/sief-trustfund/publication/impact-evaluation-in-practice
◦ Banerjee, Abhijit and Esther Duflo. 2011. Poor Economics: A Radical Rethinking of the Way to Fight Global Poverty. PublicAffairs: New York, NY. https://www.pooreconomics.com/
◦ World Bank Development Impact Blog: https://blogs.worldbank.org/impactevaluations/
◦ Has an awesome “one stop shop” of articles: https://blogs.worldbank.org/impactevaluations/curated-list-ourpostings-technical-topics-your-one-stop-shop-methodology

Some leading researchers
Abhijit Banerjee (MIT), Esther Duflo (MIT), Sendhil Mullainathan (Chicago) (co-founders of J-PAL)
Ben Olken (MIT), co-director of J-PAL, and co-director of the J-PAL Indonesia office, with Rema Hanna (Harvard)
Dean Karlan (Northwestern), President of IPA, and leading researcher on private sector development
Antoinette Schoar, co-director (with Karlan) of IPA SME initiative
Ted Miguel, leader of CEGA
David McKenzie (World Bank) and Chris Woodruff (Oxford), leading researchers on microenterprises and SMEs
Karthik Muralidharan (UCSD) and Paul Niehaus (UCSD), RCTs at scale

More technical resources: RCTs
Handbook of Field Experiments: https://www.povertyactionlab.org/handbookfield-experiments
Banerjee, Abhijit and Esther Duflo. 2009. "The Experimental Approach to Development Economics." Annual Review of Economics, 1: 151-178. 10.1146/annurev.economics.050708.143235
A more wordy review of experiments (randomized control trials) in development economics.
Duflo, Esther, Rachel Glennerster, and Michael Kremer. 2007. "Using Randomization in Development Economics Research: A Toolkit." In T. Paul Schultz and John Strauss (eds.) Handbook of Development Economics Vol. 4, Elsevier Science Ltd., North Holland, 3895-3962.
A deeper and more detailed review of experimental methods in development economics (compared to Banerjee and Duflo, 2009).

More technical resources: quasi-experimental approaches
Imbens, Guido and Jeffrey Wooldridge. 2009. "Recent Developments in the Econometrics of Program Evaluation." Journal of Economic Literature, 47(1): 5-86.
A solid review of similar material to Angrist and Pischke (2009), in more of a journal than textbook format.
Angrist, Joshua and Jorn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press, Princeton, NJ.
"The Bible" of modern reduced-form econometric methods for causal inference.
Cunningham, Scott. Causal Inference Mixtape. http://scunning.com/cunningham_mixtape.pdf.
A somewhat gentler, more step-by-step introduction to causal inference and impact evaluation.
Research transparency
Berkeley Initiative for Transparency in the Social Sciences (BITSS): https://www.bitss.org/
Presentation by Ted Miguel (UC Berkeley): “Research Transparency and Reproducibility in Economics and Beyond” http://emiguel.econ.berkeley.edu/writing-and-talks/talks/researchtransparency-and-reproducibility-in-economics-and-beyond
Blog on JDE registered reports: https://blogs.worldbank.org/impactevaluations/registered-reports-pilotingpre-results-review-process-journal-development-economics

ECOS3002 Development Economics Lecture 2
Chapter 3: History of Thought in Development Economics
Faculty of Arts and Social Sciences
School of Economics

How did we get where we are now?
• Thought in development economics has been closely shaped by progress in economic development around the world.
• The textbook provides a useful framing of different epochs in world development, and the emergence of development economics out of that.
• Of course, the field has also been shaped by changes in broader society – the role of the university and the evolution of academic disciplines, trends in economics such as moving from a more theoretical to a more empirical field of inquiry over the last 40 years, etc.
• I want to focus on the post-WWII period, when active development policymaking and academic development economics started to at least partly resemble what they are today.

The recovery from WWII
• A number of developments after WWII set the scene for where we are today as a field.
1. There was a sense that to avoid such wars in the future, a world order had to be built on shared institutions. Globalist institutions like the United Nations, World Bank, International Monetary Fund, etc., were founded or significantly reformed.
2. The successful and rapid recovery of nations such as those in Europe, including with US support (the Marshall Plan), bolstered the idea that “big push” or supply-side development could work.
3. Many less-developed countries became independent of colonial oversight.
4. The Cold War, with the USA and USSR as its principals, led to foreign policy competition around the world.
5. WWII was an important time for the emergence of modern mathematical economics and its theoretical edifice, as the war effort drew in funding for areas of research like game theory.

1950s-1960s
• These were “glorious years” for economic recovery, especially in the West.
• Focus was on economic growth and industrialization, with big-push investments (infrastructure, basic industries) and import-substitution industrialization, without an interest in poverty reduction per se.
• So-called “pioneers in development” debated how these investments should be made.
• E.g., Rosenstein-Rodan’s theory of the big push (1943).
• The pioneers showed relatively little faith in the dynamism of the price system, and instead took a structuralist, almost engineering approach to the economy (e.g., Harrod-Domar model).
• Lewis (1954) two-sector model of unlimited supplies of rural labor, as people migrate to work in industry.
• Early on the pioneers ignored agricultural productivity; however, the Green Revolution around the 1970s provided a foundation for economic growth in Asia and Latin America (much less so in Africa).

1970-1982
• While growth went well in the 1950s and 1960s, it was not leading to a reduction in poverty, as rising rural-urban migration generated dilapidated urban slums and increased inequality.
• The agenda evolved to see if growth could be made pro-poor.
• McNamara (1973) redefined the World Bank’s mission as focusing on poverty reduction.
• Various new ideas emerged around redistribution, multiple dimensions of development, and new growth strategies like improving agricultural productivity as a way to drive up domestic demand, human capital investment, etc.
• The 1970s were also a major inflationary period throughout much of the world.

1982-1997
• This period is considered to start in 1982, with Mexico defaulting on its international debts, and a number of other countries doing the same soon afterward.
• The blame for the crisis was put on excessive government investment and intervention, generating the large fiscal deficits that drove up debts that became unsustainable under financial conditions in the early 1980s (e.g., higher interest rates).
• This led to a set of policy reforms called “structural adjustment” under the so-called “Washington consensus,” which can be seen as much as a response to the debt crisis as a real development agenda.

Washington Consensus
• The main elements of the Washington consensus were to reduce the role of the state in economic affairs, in favor of market forces:
1. Fiscal discipline (balanced budgets, tax reforms, removing subsidies)
2. Financial liberalization (e.g., deregulating the banking sector)
3. Trade liberalization for goods and services
4. Deregulation of FDI
5. Privatization of public enterprises
• This also involved renewed globalization.

1982-1997
• Unfortunately, while the Washington consensus agenda did help reduce fiscal imbalances, it wasn’t so helpful for development.
• It imposed austerity that was socially costly, with reductions in investments in health and education.
• The 1990s are considered a “lost decade” for development, especially in Africa. The post-communist transition in Eastern Europe was also a mixed experience.
• This era comes to an end with the Asian financial crisis of 1997.

1982-1997
• In terms of policy engagement, development economics was highly macro-focused during this period, led by professors like Jeffrey Sachs.
• Many of the micro-focused researchers focused on theoretical modeling issues.
• Early emergence of fieldwork-based development economics, e.g., Christopher Udry.

1997-2019
• As a correction, in this period there is a renewed focus on the role of the state in complementing the market.
• Focus back on multidimensionality in development.
• Recognition in development economics that one approach won’t work everywhere – the “end of the era of big ideas.”
• Country diagnostics and customization.
• End of Cold War brings an interest in aid performance.

1997-2019
• The major ideas of this period are the focus of the rest of the book, e.g.,
1. Endogenous growth (innovation, R&D)
2. Open economy industrialization (leveraging globalization to your advantage)
3. New institutional economics (Chapter 20)
4. New political economy (Chapter 21)
5. Sustainable growth and the environment (Chapter 15)
6. Impact evaluation for accountability and learning (Chapter 4)
7. Agriculture for development (Chapter 18)

1997-2019
• During this period, the field of development economics evolves from macro- and theoretically-oriented to highly empirical, agnostic and inductive, and focused on the details of development.
• Opportunities for data collection and intervention-based research (RCTs) grow rapidly.
• The “economist as plumber” model (Duflo, 2017)
• The rise of China – a new (highly successful) development model, and its emergence as a global player (e.g., Belt and Road initiative, investments in Africa) and competitor to the US.
• Many large developing countries are moving into middle-income status.
Most destitute countries typically affected by serious conflict and political instability, and perhaps going forward, climate change.
ECOS3002 Development Economics Lecture 2
Chapter 4: Impact Evaluation of Development Policies and Programs

Why impact evaluation?
• From the 1990s greater attention was placed on development effectiveness, both by better-off countries donating foreign aid, and increasingly by developing countries themselves.
• In 2015 OECD countries donated about USD 153 billion in aid to developing countries.
• Australia’s foreign aid budget is AUD 4.335 billion in 2021-22.
• OECD statistics exclude the growing role of emerging market countries as donors themselves, e.g., China’s aid budget was estimated to be USD 4.4 billion in 2018.
• Even these large numbers are dwarfed by the policy budgets of developing and emerging market countries themselves.
• There are moves to recast aid as support for institution- and knowledge-building, rather than directly paying for programs.

Why impact evaluation?
• Impact evaluations are increasingly required by funders – by governments themselves, or external donors.
• A legal requirement in Mexico.
• Three main objectives of impact evaluations:
1. Accountability. Did the program justify its cost?
2. Results-based management. Rigorously learn about and improve specific programs and policies.
3. Generic lessons for international development. Contribute to the global knowledge base.

What to evaluate?
• We can evaluate the causal impact of many different programs and policies, e.g.,
• The return to schooling (e.g., effect of one extra year of school on future wages, or effect of going to university versus not going to university).
• The impact of microfinance loans on small business owners (profits, growth, employees, etc).
• The effect of management practices on large firms.
• The effect of gender on governance.
• The effect of international trade on the production of large firms.
• The effect of Cognitive Behavioural Therapy on mental health and economic outcomes.
• The effect of increased access to bank financing on small and medium sized firms.
• The effect of joining a sustainability certification program on smallholder farmers.
• The effect of a financial literacy program on financial knowledge and behaviour.

The goal of impact evaluations is causal inference
• We want to identify the effect of a treatment – an intervention or a policy – on an outcome, Y. We might like to know, for any person in the population, denoted $i$,
$\text{Treatment effect}_i = Y_i(1) - Y_i(0)$,
where $Y_i(1)$ is $i$’s outcome with the treatment and $Y_i(0)$ the outcome without it.
• We can never identify this for an individual – a person, a household, a firm, a village, etc – because for any individual we can only observe one or the other, $Y_i(1)$ or $Y_i(0)$. I.e., we can never observe the counterfactual, what would have happened.
• However, we can identify impact in a population, or a population sample, as the average change in outcomes when a subset of the population gets the treatment, compared to another subset that doesn’t.

The goal of impact evaluations is causal inference
• Remember, the key point is that in general if we have two groups, one that gets a treatment and one that doesn’t, they can differ post-treatment for at least two reasons:
1. [treatment effect explanation] The treatment caused them to have a different outcome;
2. [selection bias explanation] They had different characteristics prior to treatment. These different characteristics would have caused them to have different outcomes, even without treatment. And these characteristics made the group receiving treatment more likely to do so.
• In impact evaluations we try to implement research designs that rule out or at least minimize explanation 2, isolating the pure treatment effect.
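The two explanations can be separated in a simulation, where, unlike in real data, both potential outcomes are visible for every person. The sketch below is purely illustrative and built on assumptions that are not in the lecture: a hypothetical "ability" trait that raises outcomes and also makes people more likely to choose treatment, a constant treatment effect of 5, and 20,000 simulated people.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


rng = random.Random(3002)
EFFECT = 5.0  # assumed constant treatment effect, Y(1) - Y(0), for everyone

people = []
for _ in range(20000):
    ability = rng.gauss(0, 1)                    # prior characteristic
    y0 = 50 + 10 * ability + rng.gauss(0, 5)     # outcome without treatment
    y1 = y0 + EFFECT                             # outcome with treatment
    chooses = ability + rng.gauss(0, 1) > 0      # self-selection on ability
    randomized = rng.random() < 0.5              # coin-flip assignment
    people.append((y0, y1, chooses, randomized))


def observed_gap(flag_index):
    """Mean observed outcome of the treated minus that of the untreated,
    where treatment status is taken from the given column."""
    treated = [p[1] for p in people if p[flag_index]]
    untreated = [p[0] for p in people if not p[flag_index]]
    return mean(treated) - mean(untreated)


naive_gap = observed_gap(2)   # people choose their own treatment status
rct_gap = observed_gap(3)     # treatment status is randomized

# Selection bias: the gap in untreated outcomes, Y(0), between choosers and
# non-choosers. It is only computable here because the data are simulated.
selection_bias = (mean(p[0] for p in people if p[2])
                  - mean(p[0] for p in people if not p[2]))

print("naive gap (self-selected):", round(naive_gap, 2))
print("  = effect", EFFECT, "+ selection bias", round(selection_bias, 2))
print("gap under randomization:  ", round(rct_gap, 2))
```

In this setup the self-selected comparison overstates the effect because choosers would have had better outcomes anyway, while the randomized comparison lands close to the assumed effect.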
The goal of impact evaluations is causal inference
• Intuitively speaking, selection bias comes about because people make choices, or set up systems, that allow prior characteristics of potential treatment recipients to determine whether or not they receive the treatment.
• This is not necessarily a bad thing!
• People should have the freedom to accept or reject intervention, policy, or product offers.
• Governments, NGOs or companies should be able to use their expert knowledge to target an intervention, policy, or product to a particular user group.
• However, for us as analysts, this “selection bias” makes life more difficult. Impact evaluation is a toolkit to attempt to overcome or minimize it, often by exploiting arbitrariness in treatment assignment that separates prior characteristics from treatment assignment.

How do we choose between impact evaluation methods?
• There are a number of quantitative impact evaluation methods. The book lists at least 7 different kinds (pg 107, Chap 4).
• Within these categories there are many more sub-methodologies, which we will largely not get into in ECOS3002.
• The basic goal of any impact evaluation / causal inference methodology is to overcome or minimize selection bias to identify the causal effect of treatment.
• However, these methods can be considered more or less useful based on a few subtle factors.

How do we choose between impact evaluation methods?
• Amongst researchers, there is somewhat of a hierarchy between these methods. Methods are typically more preferred (i.e., considered more scientifically valid or rigorous) if:
• They require weaker assumptions (fewer untestable assumptions).
• They give the treatment effect estimate we’re most interested in.
• Some methods give us the “average treatment effect” (for the whole population of interest).
• Some methods give us a “local average treatment effect” by exploiting arbitrariness in treatment assignment that only affects a subset of the population.
• More-preferred methods typically involve a clear understanding of the treatment assignment mechanism (exactly how different units are assigned to treatment and control, which really gets at understanding the way that human choices enter or don’t enter the treatment assignment process).

How do we choose between impact evaluation methods?
• Sometimes certain methods simply aren’t feasible.
• Due to timing: e.g., an RCT can only be done ex ante (before treatment) as it involves directly controlling the treatment assignment process. You can’t go back in time to do an RCT, though you can sometimes cleverly notice an opportunity to exploit naturally-occurring arbitrariness in treatment assignment with other methods. Certain RCTs may not be ethical, and RCTs typically require prior ethical approval.
• Due to data structures: e.g., a differences-in-differences approach requires panel data (at least 2 rounds of data, before and after the treatment change).

Randomized control trial (RCT)
• We discussed this in some detail in the week 1 lecture. Basic example design (from textbook):
1. Consider the “universe” population we’re trying to address.
2. Build a sample of eligibles. Ideally a random sample of the universe (external validity), but this may not be possible.
3. Random assignment to treatment and control (internal validity).
4. Measurement
• Before treatment: baseline survey and/or secondary data.
• After treatment: one or more surveys and/or secondary data.
5. Estimate treatment effects: difference between T and C, with econometric corrections.

Randomized control trial: assessment
• Treatment assignment mechanism
1. Direct control of randomization by researchers working with implementers.
2. Natural experiment with randomization: other party implements randomization (p 112).
• Critical assumption(s)
1. The treatment and control groups are balanced on key observable and unobservable characteristics (verified with a “balance check”). Balance on unobservables is untestable; however, the properties of randomization in a large enough sample, together with balance on observables, are considered to make it highly likely.
◦ Particularly key to check this when a 3rd party implemented the randomization.

Randomized control trial: assessment
• Effect estimates
◦ Average treatment effect (ATE) for eligible population, and universe if representative, using ITT. Can also estimate ToT (a LATE).
• Data structure
◦ One round of data is sufficient (compare T vs C after treatment).
◦ Most studies have one baseline survey round with at least one post-treatment round.
◦ With panel data we can estimate a highly-credible differences-in-differences model.

Randomized control trial: other notes
• As noted in week 1, the RCT is a “design-based” approach, considered the “gold standard,” which makes econometric analysis relatively simple.
• However this is a bit of a double-edged sword – because RCTs are so heavily controlled and planned, there can be an extra sense of artificiality. On pg 113 the text highlights some of these, e.g.,
◦ Hawthorne effect: subjects change behavior if they know they’re being studied.
◦ John Henry effect: if control units know they could have been treated, they might work harder to catch up.
◦ Pioneer effects: researchers work with partners to run a really well implemented intervention and evaluation, but that isn’t how it would be implemented without such scrutiny.

Randomized control trial: other notes
• T v C is the classic design, but much more complex designs are possible and sometimes necessary, e.g.,
• Multiple treatment arms.
• Sometimes you want to compare multiple interventions in a “horse race.”
• Sometimes you do a “mechanism experiment” to tease out what is really causing the intervention to have an effect.
• If a treatment can “spill over” to the control group, that violates the “stable unit treatment value assumption” (SUTVA). Then C is not a perfect counterfactual for T.
  ◦ May need to randomize at a level where T can’t spill over (e.g., village), and/or have multi-level randomization.
  ◦ Can design multi-level RCTs that measure the direct treatment effect AND the spillover effect.
• The up-front control provided by RCTs allows us to make more careful decisions about things like optimal sample size, using a process called “power calculations.”
Quasi-experimental methods
• The RCT (aka field experiment) is the only fully experimental method.
• But it may not be feasible, due to timing, cost, ethics, lack of partnership with an implementer, etc.
• Quasi-experiments can be seen as attempting to approximate the internal validity of an RCT.
• Typically quasi-experiments do this with stronger assumptions and/or without identifying the ATE.
(Propensity score) matching
• Matching, aka propensity score matching (PSM), involves taking a treatment sample and finding a suitable control sample.
• Matching involves multiple steps:
1. Get a sample of units (households, firms, villages, etc.) that are eligible for a treatment, some of which got it. Select a set of observable variables, X, that predict selection into treatment. Fit a regression model for the probability of getting treatment, p(X).
2. Use the function p(X) to predict each unit’s probability of getting treatment, given its characteristics (this is called the propensity score). Match units in T and C based on having very close propensity scores. This may lead to completely excluding some observations in T or C if there simply aren’t comparables from the other group.
3. Take the average difference in outcomes in the matched sample.
Matching: assessment
• Treatment assignment mechanism
  ◦ Potentially little understanding of how treatment is allocated – could be messy.
  ◦ If you know the exact variables that are used to decide treatment status, have access to them, and can match on them, matching could be highly credible, but this is a rare case.
• Critical assumption(s)
1. The treatment and control groups are balanced on key observable and unobservable characteristics. Matching enforces balance on a set of observables in the first step.
  ◦ However, this is a more brute-force route to balance than in an RCT – we need the stronger assumption that enforcing matching on observables leads to matching on unobservables (untestable).
  ◦ Because matching is often done after the fact, the dataset available might not have been designed to have the right matching variables.
Matching: assessment
• Effect estimates
  ◦ Treatment on treated (ToT) – by definition you’re taking a set of treated units (excluding ones offered treatment that didn’t accept) and matching them to a counterfactual. Which subgroups of T and C happen to match may be arbitrary.
• Data structure
  ◦ One round of data is sufficient (compare T vs C after treatment), though our matching variables should be determined prior to treatment.
  ◦ If we can have more rounds of data then we can combine matching with other quasi-experimental methods, e.g., differences-in-differences (run DiD on matched samples).
Differences-in-differences (DiD)
DiD: assessment
• Treatment assignment mechanism
  ◦ Potentially vague understanding of precisely how treatment is allocated.
• Critical assumption(s)
1. The treatment and control groups are balanced in changes in the key outcome(s), i.e., in what would have happened in the absence of treatment, also known as “parallel trends.”
  ◦ This is of course not directly testable – we can never observe what would have happened to the treatment group without treatment.
  ◦ The best we can do is to show parallel pre-trends – that prior to treatment the trends were parallel.
  ◦ Note that we don’t need the two groups to be identical in levels prior to treatment.
2.
The specific policy change we’re looking at isn’t correlated with other “omitted variables” – other simultaneous changes, local policy demand, etc.
DiD: assessment
• Effect estimates
  ◦ Average treatment effect. However, this ATE is tied to the selection of the particular treatment units.
• Data structure
  ◦ Need at least two rounds of data: one before and one after, to estimate DiD.
  ◦ To do a convincing test for parallel pre-trends, ideally want at least a couple of rounds of pre-treatment data.
DiD: notes
• In practice, we don’t just have the simple DiD setup where a policy changes once for a decent number of units, and not others. Many policies we might be interested in roll out over time in different units, or change over time.
  ◦ One of the most prominent recent areas of research in microeconometrics is in working out estimators for these more complex settings of DiD with time-varying treatments.
• DiD can be fruitfully combined with other methods, e.g.,
  ◦ Run DiD on data balanced by PSM.
  ◦ When we have panel data from an RCT, we effectively estimate a DiD on the RCT data. Here the parallel trends and omitted variables assumptions are highly credible, because we know that treatment was randomized.
Regression discontinuity design (RDD)
Lee and Lemieux (2010)
RDD: assessment
• Treatment assignment mechanism
  ◦ We don’t control it, but we directly model it.
• Critical assumption(s)
1. Individuals on either side of the cutoff are otherwise similar in the absence of treatment – that is, treatment is as-if random at the cutoff.
  ◦ This could be violated if the cutoff is chosen in reference to some rapid change in characteristics (rare).
  ◦ This could be affected by “manipulation” of which side of the cutoff individuals fall on, and in some cases individuals may have an incentive to manipulate. Can check for “heaping” at the cutoff, which might suggest manipulation.
RDD: assessment
• Effect estimates
  ◦ Local average treatment effect (LATE): the effect of the treatment for those at the cutoff.
This may generalize beyond the cutoff, but there’s no way to know.
• Data structure
  ◦ One round of data is sufficient, as long as we have data on groups both above and below the cutoff.
  ◦ In some contexts a program might track treatment recipients but not the group that doesn’t receive treatment.
RDD: “fuzzy RDD”
• In some cases the allocation of treatment in an RDD is imperfect.
  ◦ People who were offered the treatment (“above the cutoff”) don’t take it up.
  ◦ People who weren’t supposed to get the treatment (“below the cutoff”) get access because of imperfect implementation.
• This is analogous to imperfect compliance in an RCT – people offered the treatment don’t take it up.
• Relying on similar intuition, a fuzzy RDD uses 2SLS to estimate treatment effects: a first stage where we regress take-up on being above the cutoff, and then a second stage where we regress the outcome on the instrumented value of treatment.
Instrumental variables (IV)
• Find a source of “exogenous variation” that affects whether someone gets the treatment, but only affects their outcomes through whether or not they get the treatment.
• Then we estimate the effect using 2SLS:
  ◦ Regress treatment selection on instrument(s) [First Stage]
  ◦ Regress outcome on projected value of treatment variable [Second Stage]
IV: assessment
• Treatment assignment mechanism
  ◦ Potentially vague understanding of precisely how treatment is allocated.
• Critical assumption(s)
1. Strength of the instrument (“strong first stage”). The instrument(s) need to sufficiently predict treatment selection, or there are “weak instrument” problems. There are common statistical tests for this.
2. Exclusion restriction. The instrument only affects the outcome through the treatment selection process (i.e., Figure 4.10 actually describes the situation). There is no way to formally test this; typically researchers argue for the plausibility of this assumption.
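To make the two stages concrete, here is a minimal sketch of 2SLS on simulated data. Everything in it is an assumption for illustration rather than something from the textbook: the binary instrument `z` (think of a randomized offer, or being above an RDD cutoff), the unobserved confounder `u`, the take-up rule, the helper name `two_stage_least_squares`, and the true effect of 2.0 are all hypothetical. It shows that naive OLS is biased when take-up depends on unobservables, while 2SLS recovers the effect, and that with a single binary instrument 2SLS equals the simple ratio of the intention-to-treat difference to the take-up difference.

```python
import numpy as np


def two_stage_least_squares(y, d, z):
    """2SLS with one instrument z, one treatment d, and an intercept.

    Returns (first-stage coefficient on z, second-stage coefficient on d).
    """
    Z = np.column_stack([np.ones_like(z, dtype=float), z])
    # First stage: regress treatment take-up on the instrument.
    first_stage, *_ = np.linalg.lstsq(Z, d, rcond=None)
    d_hat = Z @ first_stage  # projected (instrumented) treatment
    # Second stage: regress the outcome on the projected treatment.
    X_hat = np.column_stack([np.ones_like(d_hat), d_hat])
    second_stage, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return first_stage[1], second_stage[1]


# Simulated data (all values are hypothetical).
rng = np.random.default_rng(0)
n = 20_000
z = rng.integers(0, 2, size=n).astype(float)  # instrument, e.g. a random offer
u = rng.normal(size=n)                        # unobserved confounder
noise = rng.normal(scale=0.3, size=n)
# Take-up depends on the instrument AND on the confounder (selection).
d = ((0.2 + 0.5 * z + 0.3 * u + noise) > 0.5).astype(float)
true_effect = 2.0
y = 1.0 + true_effect * d + 1.5 * u + rng.normal(size=n)

first_stage_coef, iv_effect = two_stage_least_squares(y, d, z)

# Naive OLS of the outcome on take-up, ignoring selection.
X = np.column_stack([np.ones(n), d])
ols_effect = np.linalg.lstsq(X, y, rcond=None)[0][1]

# Ratio form: ITT effect on the outcome divided by ITT effect on take-up.
wald = (y[z == 1].mean() - y[z == 0].mean()) / (
    d[z == 1].mean() - d[z == 0].mean()
)

print(f"first stage: {first_stage_coef:.3f}")
print(f"OLS: {ols_effect:.3f}  2SLS: {iv_effect:.3f}  ratio: {wald:.3f}")
```

Running the second stage by hand like this gives the right point estimate but not the right standard errors, so in practice one would use a dedicated IV routine for inference.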
IV: assessment
• Effect estimates
  ◦ Local average treatment effect (LATE): the effect of the treatment for those whose treatment selection is actually affected by the instrument (the compliers). It excludes always-takers, who take up treatment regardless of the instrument, and never-takers, who never do. These groups are typically impossible to identify, so we never quite know who the treatment effect applies to.
• Data structure
  ◦ One round of data is sufficient, as long as we have data on groups that receive and don’t receive the treatment.
  ◦ Panel data is useful.
Making IE more useful
• Impact evaluations have only been prominent in economics for about the last 30 years, and in development economics for about 20. This is a rapidly growing field, but there is still much to learn.
• Thus far, a lot of focus on “internal validity” (validity that we estimated a causal effect within the population in our data).
• More focus needed on external validity, e.g.,
  ◦ Carefully selecting the population for the RCT / IE to tell us about larger groups.
  ◦ Replicating studies across multiple sites.
  ◦ Estimating parameters that come out of economic theory.
• More focus also needed on measurement.
Qualitative methods
• Qualitative methods are highly undervalued as a publishable research method in economics, including development economics.
  ◦ Concern about subjectivity, inference from small samples, researcher biases, etc.
• Yet many development economists do rich qualitative fieldwork prior to and during a study. In other cases they partner with a researcher from a field that is more expert in qualitative methods (e.g., sociology, anthropology).
  ◦ Yet this kind of work is not very well incentivized.
• To go beyond scientific findings to directly inform policy, researchers tend to be closely engaged in ways that aren’t visible from their publications alone.
ECOS3002 – Development Economics
Final Exam
Semester 2 – 2021
Section 1. Draw, calculate, interpret
1. (15 points) Inequality analysis.
Suppose a neighborhood has 6 households, with expenditure levels 2, 3, 5, 6, 8 and 10. You can call them HH1, HH2, HH3, HH4, HH5, and HH6, respectively. Suppose the poverty line is 5.
a. (5 points) Draw the Lorenz curve for this neighborhood.
b. (5 points) Calculate the Gini coefficient for this neighborhood. Please provide your workings (how you calculated the solution).
c. (5 points) Suppose you could implement a tax redistribution scheme that takes up to half the income from any household(s), and gives a cash transfer to any household(s), to reduce the value of the Gini coefficient by 10%. How would you do so (which household(s) and how much would you take from them, and which household(s) and how much would you give to them), while minimizing the total amount transferred? Please provide your workings (how you calculate the solution and/or succinctly describe the logic you used).
2. (15 points) Addressing negative externalities. One of the challenges with environmental sustainability is that it often involves market failures – cases where the “free market” does not necessarily lead to a socially optimal solution. This can occur for a number of reasons, including negative externalities like pollution. The following figure, adapted from Figure 15.1 in our textbook, illustrates this situation. While the socially optimal outcome would be at point B, where the Marginal Social Cost (MSC) of pollution equals the Marginal Social Benefit (MSB) of the associated production activities, in a market equilibrium we end up at point C, where MSB = MPC (Marginal Private Cost).
[Figure: price ($) on the vertical axis against quantity (q) on the horizontal axis, with origin O. Curves: MSC = MPC + MEC, MPC, MEC, and Demand = MSB. Labelled points A, B, and C; prices p* and pc; quantities q* and qc.]
The difference between MSC and MPC is the Marginal Externality Cost (MEC). The free market solution effectively ignores the MEC, because the MEC doesn’t directly affect the decision-makers over production.
Moving from point C toward point B generally requires some kind of market intervention, usually by the government, to force the decision-makers over production to internalize the MEC.
a. (5 points) Create a figure similar to the figure above, to illustrate one potential intervention to force relevant decision-makers to internalize the externality. Only use the minimum information in your figure to illustrate the solution, rather than just copying the figure above or the figure in the textbook. However, where you use the same concepts as in the above figure (e.g., B, MPC, q*), you should use the same labels. You will describe your solution in words in part b.
b. (5 points) Describe in words the figure you created in part a, succinctly explaining how it illustrates a potential market intervention to create a more socially optimal outcome. Make sure you describe the policy and precisely explain how it increases social welfare, using objects in the model.
c. (5 points) If you wanted to implement the policy in practice, which entity or entities would be responsible for implementing the policy, and which would be recipients of the policy (i.e., be directly responsible for complying with it)? What would likely be the main political economy challenge with implementing the policy in practice?
Section 2. Interpret an RCT
3. (15 points) Interpret a microfinance RCT. Many of the microfinance RCTs in the literature focus on the question “what is the impact of microfinance (relative to a control)?”, often finding muted effects. However, these studies, and Rachael Meager’s meta-analysis of the early microfinance RCTs, suggest that more experienced microenterprise owners with larger enterprises, while a minority of the general population, can get a fairly significant boost to their business activities from increasing their access to finance through microfinance loans.
Suppose you are evaluating an RCT that further delves into this effect among more experienced microenterprise owners running relatively large microenterprises. The study also concerns itself with the “spillover” effect of boosting some businesses on their neighbors/competitors – i.e., if microfinance can boost the income of loan recipients by 10%, but their neighbors/competitors who aren’t recipients lose 10% of income, then the effect of a loan program on overall economic welfare might be neutral, or even negative.
The study uses the following design. It identifies a sample of 500 villages in rural Brazil which have limited prior presence of microfinance programs, and identifies all of the microenterprises with 3-10 employees (on average, each village has 7.2 such businesses). It randomly allocates:
1. 250 villages to control;
2. 250 villages to treatment. Within these villages, they then randomly allocate 50% of eligible businesses (3-10 employees) to treatment (being offered a microfinance loan), and 50% to control (not being offered any loan).
The study only has resources for 1 post-treatment survey, surveying all ~5,000 eligible businesses (3-10 employees) in the 500 villages. They find that:
• At endline, in the control villages, the average business income is 3200 Brazilian Real per month (1 Real = 0.25 AUD).
• At endline, in the treatment villages,
  ◦ Amongst the 50% that were intended for treatment, the average business income is 3840 Real per month;
  ◦ Amongst the 50% that were not intended for treatment, the average business income is 2880 Real per month.
a. (5 points) What is the estimated effect of microfinance loans on treatment enterprises? What is the estimated “spillover effect” of microfinance loans? Succinctly explain your answer (show your calculation, if you have one).
b. (5 points) Interpret these results: what is the net welfare effect of the microfinance program?
c.
(5 points) Provide at least two caveats to the above results and interpretation, due to information that hasn’t been provided about the data and analysis.
Section 3. Short answer
4. (5 points) Impact evaluation methods. Suppose you are chatting with someone who works at an international development consultancy, and you learn that they specialize in monitoring and evaluation. You ask them about how they do the evaluations, and they tell you that they usually measure the outcomes before an intervention, and after an intervention, for those targeted to receive the intervention, and then calculate the change in outcomes over time for the group receiving the intervention. How would you (diplomatically) explain why this approach doesn’t meet the standards of an impact evaluation, and what they could do instead?
5. (5 points) Education and development. Succinctly explain two reasons why it might be rational for individuals in a developing country setting to end their schooling early, i.e., to demand less than a high school education.
6. (10 points) Evaluate a quasi-experiment. A strand of the literature on institutions and culture in economics attempts to understand the historical factors behind modern-day institutions. A growing literature shows that social and cultural norms can have a significant impact on modern-day economic behavior and development. Suppose that you are reviewing a research paper on the effects of religion on economic outcomes, at the level of regions within countries. The authors motivate their paper by pointing out that religion can be a key source of social and cultural values, including virtues like honesty, hard work, and self-discipline. Hence these values and the norms that come from them can be a key determinant of the quality of informal institutions in regions within countries (as contrasted to formal institutions, like legal and financial institutions, that are also studied in some of the recent literature on institutions).
The empirical strategy of the paper is to use instrumental variables analysis, in a two-stage least squares regression framework:
1. In the first stage, they instrument current religious presence by constructing historical information on the spread of religious missionary groups in Africa, Asia, and South America. I.e., their first stage is a regression like:
$$\text{ReligionPresence}_{i,c} = \alpha + \beta \cdot \text{MissionaryPresence}_{i,c} + \varepsilon_{i,c},$$
where
• $i$ indexes regions-within-countries, and $c$ indexes countries.
• $\text{ReligionPresence}$ is a measure of the extent of community participation in religion in the present day.
• $\text{MissionaryPresence}$ is a measure of the extent of missionary activity in the region from 1750-1950.
• $\alpha$, $\beta$, and $\varepsilon$ are regression parameters ($\varepsilon$ is the “error term”).
2. In the second stage, they regress current economic outcomes on the projected value of the $\text{ReligionPresence}$ variable from the first stage, using a regression like:
$$\text{EconomicOutcome}_{i,c} = \gamma + \delta \cdot \widehat{\text{ReligionPresence}}_{i,c} + \epsilon_{i,c},$$
where
• $i$ indexes regions-within-countries, and $c$ indexes countries.
• $\text{EconomicOutcome}$ is a present-day economic or social outcome, like regional GDPpc, life expectancy, or a measure of innovation.
• $\widehat{\text{ReligionPresence}}$ is the projected value of $\text{ReligionPresence}$ from the first stage.
• $\gamma$, $\delta$, and $\epsilon$ are regression parameters ($\epsilon$ is the “error term”).
This approach addresses the concern with simply regressing economic outcomes on religious presence in an OLS regression, since there could easily be reverse causality from current economic outcomes to current religious intensity.
The authors defend their empirical strategy by arguing that the factors that drove historical missionary movements are unrelated to the main factors behind current economic outcomes, or that they can address those factors with control variables, such as geographic characteristics of a region (e.g., how far it is from the capital city, whether it has direct ocean access or not, how hilly the region is). While the historical data is somewhat limited, they also show that the choices of locations of missionaries seem to be largely uncorrelated with economic outcomes between 1750-1950. Succinctly explain at least two points of critique of this quasi-experimental design, based on concepts and knowledge in ECOS3002.
Section 4. Short essays
7. (15 points) The great aid debate. One of the “great debates” in international development is around the role of international development aid. Jeffrey Sachs is framed as one of the main proponents of foreign aid, a great believer in its effectiveness, particularly to reduce poverty in the developing world. He and his allies have been strong proponents of rich countries having larger foreign aid budgets (e.g., spending 1% of GNI on foreign aid), and of projects like the Millennium Villages, a signature development project in sub-Saharan Africa where villages were provided with a range of simultaneous support in terms of education, health, infrastructure, skills development, etc. On the other side are aid skeptics like William Easterly and Dambisa Moyo. Easterly is particularly critical of how aid has been administered in the past, without accountability to the desires of aid recipients. Moyo highlights the corrosive effects of aid on governance and in stimulating increased corruption, suggesting that all aid should be eliminated, outside of emergency aid. After going through ECOS3002, what is your stance on this debate? Provide at least three arguments either for or against the pro-aid thesis.
As much as possible, use concepts and knowledge from ECOS3002 in constructing your arguments.
8. (20 points) The role of RCTs in development economics. In their book Poor Economics: A Radical Rethinking of the Way to Fight Global Poverty, the 2019 Nobel Prize-winning development economists, Abhijit Banerjee and Esther Duflo, provide one of the first book-length reviews of the so-called RCT revolution in development economics, discussing various findings from RCTs across a series of topics. The development economist Mark Rosenzweig reviewed the book in his 2012 Journal of Economic Literature article, entitled “Thinking Small: A Review of Poor Economics: A Radical Rethinking of the Way to Fight Global Poverty by Abhijit Banerjee and Esther Duflo.” Rosenzweig characterizes the RCT-based approach as “thinking small” – trying to optimize small interventions for the poor using scientific methods, to “marginally improve their welfare.” While he acknowledges that knowledge generated by RCTs can help improve welfare, he argues that this research focus distracts us from studying and helping to optimize the big forces that really cause improvements in economic development, such as major phenomena like the Green Revolution in the 20th century, or broad structural adjustment through migration, trade, etc. Suppose you are Banerjee or Duflo. How would you respond to this critique, to defend the RCT-based approach to development economics? Provide at least three counterarguments to Rosenzweig’s thesis. As much as possible, use concepts and material from ECOS3002 in constructing your arguments.

  