WPS8464

Policy Research Working Paper 8464

Gendered Language

Pamela Jakiela
Owen Ozier

Development Economics
Development Research Group
June 2018

Abstract

Languages use different systems for classifying nouns. Gender languages assign many, sometimes all, nouns to distinct sex-based categories, masculine and feminine. Drawing on a broad range of historical and linguistic sources, this paper constructs a measure of the proportion of each country’s population whose native language is a gender language. At the cross-country level, this paper documents a robust negative relationship between the prevalence of gender languages and women’s labor force participation. It also shows that traditional views of gender roles are more common in countries with more native speakers of gender languages. In African countries where indigenous languages vary in terms of their gender structure, educational attainment and female labor force participation are lower among those whose native languages are gender languages. Cross-country and individual-level differences in labor force participation are large in both absolute and relative terms (when women are compared to men), suggesting that the observed patterns are not driven by development or some unobserved aspect of culture that affects men and women equally. Following the procedures proposed by Altonji, Elder, and Taber (2005) and Oster (2017), this paper shows that the observed correlations are unlikely to be driven by unobservables. Using a permutation test based on the structure of the language tree and the distribution of languages across countries, this paper demonstrates that the results are not driven by spurious correlations within language families. Gender languages appear to reduce women’s labor force participation and perpetuate support for unequal treatment of women.

This paper is a product of the Development Research Group, Development Economics.
It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/research. The authors may be contacted at oozier@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Gendered Language

Pamela Jakiela and Owen Ozier∗

JEL codes: J16, Z10, Z13
Keywords: grammatical gender, language, gender, linguistic determinism, labor force participation, educational attainment

∗ Jakiela: University of Maryland, BREAD, and IZA, email: pjakiela@umd.edu; Ozier: World Bank Development Research Group, BREAD, and IZA, email: oozier@worldbank.org.
We are grateful to the Gender Innovation Lab at the World Bank for funding, to Laura Kincaide, Yujie Lin, and Kattya Quiroga Velasco for research assistance, and to Arun Advani, Sarah Baird, Michal Bauer, Lori Beaman, Premila and Satish Chand, Sameer Chand, Michael Clemens, Austin Davis, Giles Dickenson-Jones, Alice Evans, David Evans, James Fenske, Deon Filmer, Jed Friedman, Julio Garin, Garance Genicot, Jess Goldberg, Guy Grossman, Kyaw Hlaw, Guido Imbens, Clement Imbert, Anett John, Shareen Joshi, Eeshani Kandpal, Madhulika Khanna, Brent Kreider, Eliana La Ferrara, Margreet Luth-Morgan, Jeremy Magruder, Andy Marshall, Justin McCrary, David McKenzie, Ted Miguel, Hannes Mueller, Oyebola Okunogbe, Jessica Olney, A.K. Rahim, Martin Ravallion, Bob Rijkers, Jesse Rothstein, Fré Schreiber, Pieter Serneels, Dominique van de Walle, Andrew Zeitlin, and seminar audiences at Georgetown University, Bocconi University, the University of Warwick, the University of Delaware, the Institute for Fiscal Studies, the European University Institute, the World Bank, UAB (Barcelona), CERGE-EI (Prague), and CSAE (Oxford) for helpful comments. All errors are our own. The findings, interpretations and conclusions expressed in this paper are entirely those of the authors, and do not necessarily represent the views of the World Bank, its Executive Directors, or the governments of the countries they represent.

1 Introduction

Language structures thought. All human beings use language to articulate their ideas and communicate them to others. Yet, the world’s languages show tremendous diversity in terms of their structure and vocabulary. Different languages obviously use different words to describe the same concept, but they also organize the relationships between concepts in remarkably different ways. Because languages are so diverse and language is so fundamental to thought, some scholars have argued that the language we speak may limit the scope of our thinking.
Benjamin Lee Whorf, one of the original proponents of this theory of linguistic determinism, famously argued that it was difficult for humans to think about ideas or concepts for which there was no word in their language (Whorf 2011[1956]a). Though specious anecdotes about obscure languages abound, cognitive scientists have largely refuted the strongest forms of Whorf’s hypothesis (Boroditsky, Schmidt, and Phillips 2003). Nonetheless, there is mounting evidence for weaker forms of linguistic determinism: the languages we speak shape our thoughts in subtle, subconscious ways. For example, implicit association tests show that bilinguals display different subconscious attitudes when tested in their different languages (Ogunnaike, Dunham, and Banaji 2010, Danziger and Ward 2010). Differences in language structure also influence our behavior in the economic realm. For example, Chen (2013) demonstrates that speakers of languages that demarcate the future as separate from the present (e.g. English) save less than those whose languages make no such distinction (e.g. German).

Several recent papers explore the link between language and gender roles. As Alesina, Giuliano, and Nunn (2013) note, views of the appropriate role for women in society differ markedly across cultures. Languages also vary in their treatment of gender. At one extreme, languages such as Finnish and Swahili do not mark gender distinctions in any systematic way: nouns are not categorized as either masculine or feminine; and the same first, second, and third person pronouns are used for males and females. Many languages distinguish between human males and females by using different pronouns: for example, “he” and “she” in English. Some languages go even further, extending the gender distinction to inanimate nouns through a system of grammatical gender. For example, languages such as Spanish and Italian partition all nouns, even inanimate objects, into distinct gender categories.
This feature of language forces gender into every aspect of life: for a speaker of a gender language, gender distinctions are salient in every thought and utterance; every object is either masculine or feminine because it is intrinsically linked to a word that carries a grammatical gender.

Does grammatical gender shape (non-grammatical) gender norms? Does it impact women’s participation in economic life? Writing nearly 100 years ago, Benjamin Lee Whorf argued that the existence of linguistic gender categories likely made other gender divisions appear more natural (Whorf 2011[1956]b), though he did not provide any empirical evidence that this was the case. However, recent work by social scientists supports his claim. For example, seemingly arbitrary grammatical gender distinctions do influence our subconscious thoughts, imbuing inanimate nouns with masculine or feminine attributes (e.g. strength or beauty) in line with their assigned grammatical category (Boroditsky, Schmidt, and Phillips 2003). Pérez and Tavits (forthcoming) show that Estonian/Russian bilinguals are more supportive of gender equality when interviewed in (non-gender) Estonian than in (gender) Russian. In the economic realm, one recent study of immigrants to the United States shows that those who grew up speaking a gender language are more likely to divide household tasks along gender lines (Hicks, Santacreu-Vasut, and Shoham 2015) while another demonstrates that female labor supply is lower among immigrants who speak a gender language at home (Gay, Hicks, Santacreu-Vasut, and Shoham forthcoming).

We provide new evidence to support the hypothesis that grammatical gender shapes views of women’s role in society and directly impacts women’s labor force participation. To do this, we construct a data set characterizing the grammatical gender structure of 4,336 living languages, expanding the number of languages for which systematic data on grammatical gender is available by almost a factor of ten.
We draw on a range of data sources including language textbooks, historical records, academic work by linguists, and, in a small number of cases, firsthand accounts from native speakers and translators; using these data sources, we generate a measure of the grammatical gender structure of each of the languages in our data set. Taken together, these languages account for 6.44 billion people, or over 99 percent of the world population.1

We use these data in two ways. First, we calculate, for every country in the world, an estimate of the proportion of the population whose native language is a gender language. We are able to account for more than 90 percent of the estimated population in all but three countries. In our first piece of analysis, we explore the cross-country relationship between grammatical gender and women’s labor force participation, women’s educational attainment, and gender attitudes among both men and women. We then complement our cross-country analysis by estimating the individual-level association between grammatical gender and women’s participation in economic life using Afrobarometer data from African countries where both gender and non-gender languages are indigenous and widely spoken.

Our cross-country analysis suggests a robust negative relationship between grammatical gender and female labor force participation. Our preferred specification suggests that grammatical gender is associated with a 12 percentage point reduction in women’s labor force participation and an almost 15 percentage point increase in the gender gap in labor force participation. These associations are robust to the inclusion of a wide range of geographic controls (including suitability for the plough) that could not plausibly have been impacted by language. Taken at face value, our coefficient estimates suggest that gender languages keep approximately 125 million women around the world out of the labor force.
Following the approach suggested by Altonji, Elder, and Taber (2005) and Oster (2017), we estimate that unobservable country-level characteristics would need to be 1.44 times more correlated with treatment than observed covariates to fully explain the apparent impact of grammatical gender on the level of female labor force participation; unobserved factors would need to be 3.23 times more closely linked to treatment to explain the impact of grammatical gender on the gender gap in labor force participation.

1 This calculation is based on Ethnologue estimates of the total number of native speakers in the world.

We find a far more muted cross-country relationship between grammatical gender and women’s educational attainment. This may be due to the fact that the average within-country gender gap in educational attainment is much smaller than the gender gap in labor force participation, since many wealthy countries have no gender gap in educational attainment, particularly at the primary school level. The prevalence of gender languages is negatively associated with the gender gap in both primary and secondary school completion after controlling for continent fixed effects, but the estimated relationship is not statistically significant at conventional levels.

Using data from the World Values Survey (WVS), we show that grammatical gender predicts support for traditional gender roles. The coefficient estimate is large in magnitude, suggesting that differences in language could explain the entire gap in gender attitudes between Ukraine (at the 55th percentile of WVS countries in terms of support for gender equality) and Trinidad and Tobago (at the 80th percentile). As Whorf might have hypothesized, gender languages are associated with greater support for traditional gender roles among both men and women.
Though our analysis uses much richer country-level data on grammatical gender than has previously been available, mis-measurement of our independent variable of interest is still a concern. We would typically expect measurement error in the independent variable to bias the estimated association toward zero, but the interval nature of our measure of the country-level prevalence of grammatical gender (when the gender structure of the native language is not known for the entire population) can also lead to invalid inference. Using a bounding technique proposed by Imbens and Manski (2004), we show that our results are robust to correcting for the censored nature of our independent variable of interest.

A more serious inference concern arises from the fact that languages are not independent. Within a language family, individual tongues have evolved in parallel over many centuries. While this slow process of language development may help to address potential concerns about reverse causality, it complicates statistical inference. Intuitively, languages need to be clustered within families, but countries draw from many different clusters. We address this issue by implementing a permutation test that respects the observed pattern of variation in treatment (i.e. grammatical gender) across and within language families and the distribution of languages across countries. We cluster languages at the highest level of the language tree where we do not observe variation in grammatical gender. Generating 10,000 hypothetical assignments of grammatical gender across the 201 clusters so generated allows us to calculate permutation-test p-values indicating the likelihood that the association between grammatical gender and our outcomes of interest would be as strong as the observed relationship under the null hypothesis, given the structure of the language tree, the observed variation in grammatical gender across languages, and the distribution of languages across countries.
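The logic of a cluster-level permutation test of this kind can be sketched in a few lines of Python. This is a simplified illustration rather than the procedure used in the paper: it assumes gender status is simply shuffled across clusters (holding the number of gender clusters fixed), and it summarizes the association with a bivariate OLS slope. The function names, the `weights` matrix of country-by-cluster speaker shares, and the two-sided p-value convention are all assumptions made for the example.

```python
import numpy as np


def country_share(cluster_is_gender, weights):
    """Share of each country's native speakers in gender-language clusters.

    weights[c, k] is the (hypothetical) fraction of country c's native
    speakers whose language belongs to cluster k.
    """
    return weights @ cluster_is_gender


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x with an intercept."""
    x_centered = x - x.mean()
    denom = x_centered @ x_centered
    if denom == 0:  # no variation in x: treat the association as zero
        return 0.0
    return float(x_centered @ (y - y.mean()) / denom)


def permutation_p_value(cluster_is_gender, weights, outcome,
                        n_draws=10_000, seed=0):
    """Two-sided permutation p-value for the slope of outcome on gender share."""
    rng = np.random.default_rng(seed)
    observed = abs(ols_slope(country_share(cluster_is_gender, weights), outcome))
    hits = 0
    for _ in range(n_draws):
        # Reassign gender status across clusters, keeping the number of
        # gender clusters and each country's language mix fixed.
        shuffled = rng.permutation(cluster_is_gender)
        placebo = abs(ols_slope(country_share(shuffled, weights), outcome))
        hits += placebo >= observed
    return hits / n_draws
```

Because whole clusters are reassigned rather than individual languages, related languages move together in every placebo draw, which is the point of clustering at the level of the language tree where grammatical gender does not vary.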
Results suggest that the strong association between grammatical gender and women’s labor force participation is not spurious.

To further assess the likelihood of a link between gender languages and women’s involvement in economic life, we examine the individual-level association between grammatical gender and women’s labor force participation and educational attainment in parts of Sub-Saharan Africa where both gender and non-gender languages are indigenous and widely spoken. Combining our language data with Afrobarometer surveys from Kenya, Nigeria, Niger, and Uganda, we show that grammatical gender is associated with drastically reduced female labor force participation. Women whose native language is a gender language are 11 percentage points less likely to be in the labor force than women whose native language is not a gender language, even after controlling for the level of labor force participation among men from their own ethnic group and for interactions between gender (i.e. the indicator for being female) and religious affiliation. We also find a robust negative relationship between grammatical gender and women’s educational attainment at the individual level within Africa. Applying the approach suggested by Altonji, Elder, and Taber (2005) and Oster (2017) indicates that unobservable characteristics are unlikely to explain the relationship. Thus, gender languages appear to reduce African women’s labor force participation and lower their educational attainment.

The rest of this paper is organized as follows. Section 2 introduces the concept of grammatical gender and surveys recent research on its impacts. Section 3 provides an overview of our data sources, including the data we have compiled on the grammatical structure of more than 4,000 languages. Section 4 presents our cross-country analysis; Section 5 presents individual-level, within-country analysis; Section 6 discusses causality; and Section 7 concludes.
2 Grammatical Gender

Many languages partition the set of all nouns into mutually exclusive categories. Membership in these categories, which are typically referred to as either genders or noun classes (Corbett 1991, Aikhenvald 2003), can be manifest in several ways. First, members of a noun class may be semantically related. For example, Tamil has three noun classes: nouns representing male humans and gods, nouns representing female humans and goddesses, and all other nouns (Corbett 1991, Krishnamurti 2001). However, such strictly semantic systems of noun classification (where a word’s meaning always determines its categorization) are relatively rare. In many languages, members of a class are linked by morphology rather than semantics. For example, members of the KI-/VI- class in Swahili often begin with ki- in the singular and vi- in the plural (e.g. “chair” is kiti and “chairs” is viti). However, though semantic and morphological regularities are a common characteristic of noun classes, they are not required.

Instead, membership in a specific noun class is defined based on agreement: class must be reflected in the conjugation of associated words within the noun phrase or predicate in grammatically correct speech (Aikhenvald 2003).2 In Swahili, for example, the noun class determines the prefixes used to modify adjectives, verbs, demonstratives, and other parts of speech. So, “these new chairs” is viti vipya hivi, while “these new teachers” is walimu wapya hawa because the word “teacher” is part of the M-/WA- class rather than the KI-/VI- noun class.3 Nouns are said to belong to the same agreement class if, “given the same conditions, they will take the same agreement form” (Corbett 1991, p. 148), where the relevant “conditions” are linguistic and typically relate to number and case.

2 There is some debate among linguists as to whether agreement rules that do not involve elements of the noun phrase or the predicate can form the basis of a noun class system; specifically, linguists disagree as to whether requiring “anaphoric agreement” between nouns and associated pronouns constitutes a system of grammatical gender (Corbett 1991, Aikhenvald 2003). Corbett (1991) argues that there is no fundamental distinction between pronominal agreement and other forms of grammatical agreement; he consequently classifies languages that (only) require pronominal agreement (e.g. English) as gender languages in his work (Corbett 2013a, Corbett 2013b, Corbett 2013c). Aikhenvald (2003) agrees that there is no fundamental distinction between pronominal agreement and other forms of grammatical concordance, but advocates the use of the traditional definition of grammatical gender to avoid confusion. She also suggests restricting the use of the term “grammatical gender” to systems of noun classification involving a relatively small number of categories that include masculine and feminine. Since our focus is on the links between grammatical gender and non-grammatical gender norms, we adopt her terminology to avoid confusion. Employing the traditional definition of grammatical gender also facilitates the use of data from a wide range of linguistic and anthropological sources, since many historical sources distinguish between grammatical gender (which involves the assignment of nouns to gender categories) and systems that mark natural/human gender morphologically.

3 Corbett (1991) states: “The existence of gender can be demonstrated only by agreement evidence... Evidence taken only from the nouns themselves, such as the presence of markers on the nouns, does not of itself indicate that a language has genders (or noun classes); if we accepted this type of evidence, then we could equally claim that English had a gender comprising all nouns ending in -ion.” Thus, though many nouns within a class may share particular prefixes or suffixes, it is the requirement that other parts of speech (particularly elements of the noun phrase or the predicate) conjugate or inflect appropriately that distinguishes noun classes from other phonological or orthographic partitions of the set of all nouns.

Systems of noun classification differ widely across languages, and not all languages have such a system. One of the most common bases for a system of noun classification is biological sex: (some) female humans and some other nouns are assigned to one category, while (some) male humans and some other nouns are assigned to a different category (Corbett 1991, Aikhenvald 2003, Hellinger 2003).4 We refer to systems which assign nouns, including some inanimate nouns, to agreement classes that are based on biological sex as grammatical gender; we refer to languages characterized by such systems of grammatical gender as gender languages (Aikhenvald 2003, Hellinger and Bußman 2003). Spanish is a prominent example of a gender language: all Spanish nouns are either masculine or feminine, and both definite articles and adjectives must be consistent with a noun’s gender. So, for example, “the white house” is

la casa blanc-a
the.Fem house white-Fem

because “house” is feminine, but “the white horse” is

el caballo blanc-o
the.Masc horse white-Masc

because “horse” is masculine. A Spanish speaker must therefore maintain a mental map that assigns each noun to one of these two distinct gender categories.

4 Almost all languages also distinguish between singular and plural, but this is not typically treated as a system of noun classification because the singular and plural forms are treated as two variants of the same noun.

Systems of grammatical gender differ along several dimensions.5 Gender languages differ in the extent of agreement across parts of speech, and the extent to which the gender distinction represents a complete partition of the set of all nouns. Languages such as Spanish, with only two sex-based noun classes, are at one end of this spectrum.
In such languages, every inanimate noun must be classified as either feminine or masculine. Languages such as German display a weaker form of grammatical gender because some objects are classified as neither feminine nor masculine. Intuitively, one might think that the partition of nouns into two dichotomous genders suggests that other aspects of the universe should also be so organized (for example, into male and female household tasks). In systems that assign objects (i.e. nouns) without natural gender to gender categories, there is also the question of what the observed grouping signals about the relative status of women and men. Though the rules used to assign nouns to different classes are often phonological (e.g. Spanish nouns that end in “o” are typically masculine), many languages assign some nouns to the feminine gender using semantic guidelines that have a certain cultural intelligibility. For example, dangerous objects are feminine in the Australian language Dyirbal (Lakoff 1987), while one linguist studying the Siberian language Ket suggested that certain small animals were feminine “because they are of no importance to the Kets” (Corbett 1991, p. 19).6, 7

5 Moreover, grammatical gender is only one of several ways that grammatical rules can make human gender distinctions salient. For instance, though typically not classified as a gender language, English employs a system of pronominal agreement: different third-person singular pronouns are used for male and female humans and, in some cases, male and female animals (Aikhenvald 2003, Boroditsky, Schmidt, and Phillips 2003, Hellinger and Bußman 2003, Kilarski 2013). Female pronouns have also traditionally been used to refer to ships and other large transportation vessels. Because pronouns agree with the natural gender of animate nouns, Corbett (1991) classifies English as a gender language with a strictly semantic system of noun classification (i.e. a system of grammatical gender based only on biological gender). Such systems of pronominal agreement based on the biological gender of animate referents (rather than the grammatical gender of the nouns themselves) are present in many languages that show no other form of gender inflection (Aikhenvald 2003, Creissels 2000). Other languages (e.g. Finnish, Hungarian, and Swahili) make no grammatical distinction between males and females. Givati and Troiano (2012) show that countries where the dominant language makes pronominal gender distinctions have shorter government-mandated maternity leaves.

6 In many languages, the grammatical gender of inanimate objects reflects stereotypes about the physical distinctions between males and females. For example, in his discussion of the major Indo-Aryan languages (Bengali, Gujarati, Hindi, Marathi, Oriya, Panjabi, and Sindhi), John Beames (1875) notes: “In all the five languages which have gender expressed, the masculine is used to denote large, strong, heavy, and coarse objects; the feminine weak, small, and fine ones” (p. 148). In the Papuan language Manambu, inanimate objects that are long or thin are masculine, while those that are short or round are feminine (Aikhenvald 2003).

7 No one knows exactly why grammatical gender systems arose in some language families and not in others. Janhunen (1999) hypothesizes that a single innovation in an ancient West Asian language brought grammatical gender into the Indo-European language family, but grammatical gender arose in indigenous language families on every continent. It is, of course, impossible to fully rule out the possibility that some aspect of culture contributed to the emergence of grammatical gender in certain ancestral languages. That said, since language structures evolve over centuries, even millennia, present-day gender attitudes cannot have had a causal impact on modern grammatical structures. Moreover, we have a relatively good understanding of the process through which grammatical gender was lost from certain widely spoken Indo-European languages; this evidence does not suggest a causal relationship between gender norms and the loss of grammatical gender. For example, McWhorter (2005) argues that the influx of Scandinavian adults into the community of English speakers explains the loss of grammatical gender, as an imperfect grasp of inflectional agreement paradigms is common among non-native speakers. This “contact hypothesis” also explains why grammatical gender is typically absent from Creole languages (McWhorter 2005, Muhleisen and Walicek 2010). However, the reduction and simplification of languages resulting from an influx of non-native speakers is not restricted to the loss of grammatical gender (and has no inherent relationship to societal gender norms): McWhorter (2005) argues that the contact hypothesis also explains why Swahili is one of the few Bantu languages that is not tonal. Kastovsky (1999) proposes a complementary explanation, arguing that the English case-number-gender agreement system was, in essence, made precarious by its own complexity and the absence of reliable morphological rules that could be used to predict agreement classes; in this context, small changes in pronunciation could lead to the conflation of declensional paradigms and their subsequent loss. Aikhenvald (2003) points to a similar process of declensional conflation and subsequent gender loss in Bengali and Persian, and to a parallel loss of the neuter gender in French. Thus, the existing evidence tends to suggest that grammatical gender is most often lost through an interplay between linguistic factors (e.g. complexity, similarity between agreement paradigms) and the arrival of large numbers of non-native speakers within a linguistic community.

Whether grammatical gender distinctions influence (non-grammatical) gender attitudes is an empirical question, but the idea that they might is not new. Whorf, for example, argued that gender distinctions in language might make a gendered division of labor seem more natural, suggesting that viewing the world through the lens of a gender language would create “a sort of habitual consciousness of two sex classes as a standing classificatory fact in our thought-world” (Whorf 2011[1956]b, p. 69).8 This argument, which was not supported by a strong body of empirical evidence, has been controversial, to say the least.

8 His argument echoes earlier work by Durkheim and Mauss (1963), who highlighted the parallels between culture-specific systems for classifying humans and those used for classifying other aspects of reality. Describing the extension of the clan system of one group of native Australians to the universe of animals and inanimate objects, they wrote: “The reasons which led to the establishment of the categories have been forgotten, but the category persists and is applied, well or ill, to new ideas” (p. 21).

However, recent work in psychology and political science shows that grammatical gender shapes our subconscious attitudes in subtle and surprising ways. For example, Boroditsky, Schmidt, and Phillips (2003) conducted a study, in English, of native speakers of Spanish and German (all of whom were fluent in English); participants in the study were asked to provide (English) adjectives to describe (English) nouns that had been chosen because they had opposite grammatical genders in Spanish and German. Subjects tended to choose adjectives that aligned with the grammatical gender of the noun in their native language.
For example, native German-speakers described a picture of a bridge (which is feminine in German) as “beautiful” and “elegant” while native Spanish-speakers described the same (masculine in Spanish) bridge as “big” and “dangerous” (Boroditsky, Schmidt, and Phillips 2003). Thus, the results suggest that grammatical gender shapes the way we think about inanimate objects without inherent biological gender.

Grammatical gender also appears to shape gender attitudes, even within individuals. Pérez and Tavits (forthcoming) conduct a survey experiment with Estonian/Russian bilinguals, randomizing the language in which they are interviewed. They show that bilinguals who are interviewed in Russian (a gender language) are less supportive of gender equality than those who are interviewed in (non-gender) Estonian, even though interview languages were randomly assigned.9

9 There is also evidence that pronominal gender impacts the salience of gender distinctions. Guiora (1983) finds that children who grow up speaking Hebrew, English, or Finnish come to understand their own biological genders at different ages; those who grow up using different pronouns for males and females become aware of their own natural gender earlier. As discussed above, English has a system of pronominal gender while Finnish does not. Hebrew also uses a dichotomous system of grammatical gender (all nouns are either masculine or feminine), and male and female Hebrew-speakers must use grammatically correct verb forms, for example, that reflect their natural gender. Hebrew also uses different second-person pronouns for males and females.

Recent work also suggests that the influence of grammatical gender extends into the economic realm. Using the World Atlas of Language Structures (WALS), a comprehensive data set on the grammatical structure of more than 500 languages, a number of authors have examined the links between grammatical gender and economic and political outcomes. For example, Mavisakalyan (2015) and Shoham and Lee (2017) use the WALS to examine the cross-country association between grammatical gender and gender inequality in the labor force. Santacreu-Vasut, Shoham, and Gay (2013) show that countries where the national language uses a sex-based system of grammatical gender are less likely to implement gender quotas for political office, while Santacreu-Vasut, Shenkar, and Shoham (2014) find that those countries also have relatively fewer women in corporate leadership positions. Hicks, Santacreu-Vasut, and Shoham (2015) show that immigrants to the United States assign tasks within the household along gendered lines if they grew up speaking a gender language; no such difference is found among immigrants who came to the U.S. before the age of language acquisition, or among the children of immigrants.10 Importantly, these findings suggest that one’s native language plays a particularly crucial role in shaping one’s views on the appropriate role for women in society. Our main contribution is the creation of a new data set characterizing the grammatical gender structure of most of the world’s native languages, allowing us to replicate and extend existing work to a much larger range of countries and contexts.

3 Data

More than four thousand of the world’s living languages have at least one thousand native speakers, and 383 languages have more than one million native speakers (Lewis, Simons, and Fennig, eds., 2016). Moreover, very few countries are entirely devoid of linguistic heterogeneity, and many countries have two or more widely spoken languages that differ in terms of their gender structure.11 Developing countries, in particular, tend to be characterized by relatively high ethnolinguistic fractionalization (Easterly and Levine 1997).

We compile a new data set characterizing the gender structure of more than 4,000 living languages.
Together, the languages that we classify account for over 99 percent of the world’s population. As discussed below, we compile data from a range of academic publications, pedagogical materials (e.g. language textbooks), and historical sources. The downside of this approach is that, because the underlying sources were not compiled by a single linguist, there may be measurement error at the language level. Specifically, while many historical sources explicitly state that languages either do or do not use a system of grammatical gender, we cannot always be certain that the same precise definition of grammatical gender is being used across sources.12 The strength of our approach is that we are able to characterize the grammatical structure of thousands of languages accounting for almost all of the world’s population.

10 In related work, Gay, Hicks, Santacreu-Vasut, and Shoham (forthcoming) find that female immigrants to the United States exhibit lower labor market participation (working fewer hours, fewer weeks, etc.) if they speak a gender language at home.

11 Only four countries are linguistically homogeneous (i.e. have only one native language that is common to all citizens): Cabo Verde, Maldives, the Democratic People’s Republic of Korea, and San Marino (Lewis, Simons, and Fennig, eds., 2016).

3.1 Building a Grammatical Gender Data Set

3.1.1 Data on Native Languages

Data on the world’s native languages comes from the Ethnologue, a comprehensive database of over 7,000 languages (Lewis, Simons, and Fennig, eds., 2016). For every known language, the Ethnologue provides an estimate of the number of native speakers (if any) in every country. Data are drawn from a range of sources including national censuses and surveys compiled by linguists.
Combining the Ethnologue data with information on the grammatical gender structure of the world’s languages allows us to construct an estimate of the fraction of each country’s population that speaks a gender language as their native language. Of the 7,457 languages included in the Ethnologue database, we drop languages that are extinct or have no native speakers, sign languages, and dying languages that had fewer than 100 native speakers when last assessed by Ethnologue researchers. This leaves 6,190 languages. Together, these languages account for an estimated 6.50 billion native speakers. Of these, we successfully identify academic or historical sources characterizing the gender structure of native languages accounting for 6.44 billion speakers (or more than 99 percent of the total), as described below.

12 Indeed, even recent work by linguists does not always agree on the definition of grammatical gender; see Corbett (1991) and Aikhenvald (2003) for discussion.

3.1.2 Grammatical Gender Data

Data on the gender structure of languages comes from a range of sources. One of the best known is the World Atlas of Language Structures (WALS), which characterizes the noun classification system of 525 languages (Corbett 2013a, Corbett 2013b, Corbett 2013c). Unfortunately (from our perspective), many of these are indigenous languages from Papua New Guinea, Australia, and the Americas with very few living speakers.13 Additional sources of data include: George L.
Campbell’s Compendium of the World’s Languages (Campbell 1991), which characterizes several hundred of the world’s most widely spoken languages; George Abraham Grierson’s eleven-volume Linguistic Survey of India (Grierson 1903a, 1903b, 1904, 1905, 1907, 1908, 1909, 1916, 1919, 1921), which was compiled between 1891 and 1921 and covers more than 300 South Asian languages and dialects; and the UCLA Language Materials Project (UCLA Language Materials Project 2014), which provides detailed descriptions and learning materials for 116 languages. Additional data on the grammatical gender structures of languages comes from academic articles and teaching materials focused on individual languages. We also collected first-person accounts from native speakers for a small number of relatively undocumented languages (e.g. Fiji Hindi and Rohingya). Detailed information on the full range of sources (including the quotes used to characterize each language’s grammatical gender) is provided in our Data Construction Appendix.

13 Data from the WALS must be used with caution because they were compiled by the linguist Greville Corbett; as discussed above, Corbett advocates the use of a non-standard definition of grammatical gender that includes systems of anaphoric pronominal agreement (Corbett 1991). This is particularly problematic when one combines the WALS with other data sources that do not classify systems of pronominal agreement based on the gender of the referent as examples of grammatical gender. We address this by excluding WALS data on languages that are classified as “strictly semantic” (i.e. agreement class can always be inferred from the meaning of the noun) since Corbett considers pronominal agreement an example of such a system. We rely on other sources to classify those languages. Languages that are classified in the WALS as either lacking a grammatical gender system or having a system that is “semantic and formal” are unambiguous.

3.1.3 Identifying Gender Languages

For each mother tongue in the Ethnologue database, we attempt to code two variables characterizing the language’s grammatical gender structure. First, we create an indicator for using any system of grammatical gender. We code a language as a gender language if it meets two criteria: first, the language must use a system of noun classes that includes masculine and feminine as two of the possible categories; second, the masculine and feminine categories must include some inanimate objects, i.e. assignment to the gender noun classes should not be based exclusively on the biological sex (or human gender) of the referents.14 Second, whenever possible, we also code an indicator for dichotomous gender languages (e.g. Spanish) that assign all nouns to either the masculine or the feminine noun class.

We identify languages as gender or non-gender in several different ways. First, some languages are explicitly identified as gender or non-gender languages in linguistic or pedagogical materials. For example, the UCLA Language Materials Project characterizes Serbian by stating: “Three grammatical genders (masculine, feminine, and neuter) and two numbers (singular and plural) are also distinguished” (UCLA Language Materials Project 2014). Some sources are equally explicit about the absence of grammatical gender. For example, A Reference Grammar of Maithili states: “Modern Maithili, however, has no grammatical gender. In other words, in modern Maithili distinctions of gender are determined solely by the sex of the animate noun” (Yadav 1996). In other cases, grammatical materials characterize the different noun classes present within a language (or the absence of a noun classification system), and provide examples of words that fall into each class. For each language, we record specific quotes characterizing the gender structure. Whenever possible, we use two independent sources to confirm the structure of each language.
We successfully classify 4,336 languages which together account for more than 99 percent of the world’s population. We classify all but four of the 383 languages with more than one million native speakers, and we are able to confirm the gender structure using two independent data sources for 324 of these large languages. We are able to account for more than 99 percent of the population in 171 of 193 countries, and we account for less than 95 percent of the population in only eight countries: Eritrea (94.5 percent of native speakers coded), the Islamic Republic of Iran (93.7 percent), Ethiopia (92.6 percent), the Lao People’s Democratic Republic (90.2 percent), Timor-Leste (90.0 percent), Cameroon (89.1 percent), Chad (75.4 percent), and Papua New Guinea (32.0 percent).

14 As discussed above, linguistic sources do not always use the same implicit definition of grammatical gender. For example, the phrase “marks gender” can be used to indicate either grammatical gender or a more limited system of indicating the gender of a human referent. Since many linguistic sources explicitly distinguish between grammatical gender and lexical marking of human/animate gender, we only use sources that indicate whether inanimates are classed in terms of nominal gender.

Figure 1 characterizes the distribution of gender languages around the world. While many countries are dominated by either gender or non-gender languages, there is considerable within-country variation in Canada and the United States, Sub-Saharan Africa, South Asia, and the Andean region of South America. Across all countries, we estimate that approximately 38.6 percent of the world’s population has a gender language as its native language.

3.2 Other Sources of Data

Additional data for our cross-country analysis comes from several sources. Data on labor force participation, income, and population come from the World Bank’s World Development Indicators database.
We use data on labor force participation in 2011, which is available for 178 countries. We also use data on primary and secondary school completion from the Barro-Lee Educational Attainment Data Set (Barro and Lee 2013), which is available for 142 countries. Data on gender attitudes comes from the World Values Survey and is available for 56 countries (World Values Survey Association 2015). Finally, we take several country-level geographic controls (average temperature and precipitation plus suitability for the plough) from Alesina, Giuliano, and Nunn (2013). These data are available for 173 countries.

Data for our individual-level analysis comes from the nationally-representative Afrobarometer Surveys (Afrobarometer Data 2016). Afrobarometer surveys have been conducted in 36 African countries and are representative of the voting age population within each country. Given the salience of ethnolinguistic identities in many African societies, the Afrobarometer collects data on respondents’ native languages. We use data from four countries where gender and non-gender languages are indigenous and widely spoken: Kenya, Niger, Nigeria, and Uganda. Data for Niger is only available in Round 5 of the Afrobarometer (2011–2013). For the other three countries, four rounds of data are available: 2002–2003, 2005–2006, 2008–2010, and 2011–2013.15 We successfully classify the grammatical gender structure of the native languages of 99.1 percent of respondents, yielding a data set of 26,546 respondents who speak 175 different native languages.

4 Cross-Country Analysis

4.1 Empirical Strategy

In our cross-country analysis, we examine the association between women’s labor force participation and the proportion of a country’s population whose native language is a gender language, $Gender_c$.
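The country-level measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the input layout, and the toy figures are hypothetical, and the choice to compute the point estimate among classified speakers only (treating unclassified languages as ignorable) is our reading rather than a documented detail of the data construction. The sketch also reports the bounds implied by the unclassified remainder.

```python
from collections import defaultdict


def gender_share(rows, is_gender):
    """Share of each country's native speakers whose language is a gender language.

    rows: iterable of (country, language, native_speakers) tuples (hypothetical layout).
    is_gender: dict mapping classified languages to True/False; languages that
        could not be classified are simply absent from the dict.
    """
    total = defaultdict(float)       # all native speakers in the country
    classified = defaultdict(float)  # speakers of languages we could classify
    gendered = defaultdict(float)    # speakers of classified gender languages
    for country, language, speakers in rows:
        total[country] += speakers
        if language in is_gender:
            classified[country] += speakers
            if is_gender[language]:
                gendered[country] += speakers

    result = {}
    for country, n in total.items():
        unclassified = n - classified[country]
        result[country] = {
            # Assumption: point estimate computed among classified speakers only.
            "point": gendered[country] / classified[country] if classified[country] else None,
            # Bounds: unclassified speakers are all non-gender (lower) or all gender (upper).
            "lower": gendered[country] / n,
            "upper": (gendered[country] + unclassified) / n,
            "coverage": classified[country] / n,
        }
    return result


# Toy data, invented for illustration only.
rows = [("A", "x", 60), ("A", "y", 30), ("A", "z", 10), ("B", "y", 50)]
is_gender = {"x": True, "y": False}  # language "z" is unclassified
shares = gender_share(rows, is_gender)
```

In the toy data, country A has 10 percent of speakers unclassified, so its measure is an interval (0.6 to 0.7) rather than a point.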
Our main empirical specification is an OLS regression of the form:

$$LFP_c = \alpha + \beta\, Gender_c + \delta_{continent} + \lambda X_c + \varepsilon_c \tag{1}$$

where $LFP_c$ is women’s labor force participation in country $c$ (in 2011), $Gender_c$ is the proportion of the population of country $c$ whose native language is a gender language, $\delta_{continent}$ is a vector of continent fixed effects, $X_c$ is a vector of country-level geography controls, and $\varepsilon_c$ is a conditionally mean-zero error term.16 Standard errors are clustered at the language level (by the most widely spoken language within each country). Our cross-country analysis of the relationship between women’s labor force participation and grammatical gender includes data on 178 countries: all the independent states for which data on women’s labor force participation is available from the World Bank’s World Development Indicators database.

15 Kenya, Nigeria, and Uganda were also included in the first round of the Afrobarometer. However, that data set does not contain detailed information on native languages.

16 As discussed further below, our results are also robust to the inclusion of additional contemporaneous controls such as log GDP per capita and population. However, such controls might be directly impacted by gender norms and women’s involvement in the labor force, creating a “bad controls” problem and biasing the coefficient of interest (Angrist and Pischke 2008, Acharya, Blackwell, and Sen 2016). We therefore focus on geographic controls (proportion tropical, precipitation, temperature, suitability for the plough, and an indicator for being landlocked), which are plausibly exogenous.

Our main outcome of interest is women’s labor force participation. However, we do not wish to conflate gender differences in labor market participation with structural factors that impact labor force participation among both men and women. To rule out this possibility, we include specifications where the outcome variable is the gender difference in labor supply, i.e.
women’s labor force participation minus men’s labor force participation.17

17 As a robustness check, we report specifications that use the ratio of women’s labor force participation to men’s labor force participation as the outcome variable (see Online Appendix Table A1).

We also examine two other outcome variables related to gender norms: women’s educational attainment and gender attitudes. As discussed above, data on women’s educational attainment comes from the Barro-Lee data set, and is available for 142 countries (Barro and Lee 2013). Our analysis of educational outcomes parallels our analysis of labor force participation. We examine rates of primary and secondary school completion among women and differences between women’s and men’s completion rates. Data on gender attitudes comes from the World Values Survey (WVS) and is available for 56 countries. In our main analysis, we construct an index of gender attitudes by taking the first principal component of the eight WVS questions on gender roles. Since we are considering attitudes rather than behaviors, we do not report gender differences; instead we compare attitudes by gender to test whether grammatical gender shapes the views of traditional gender roles among both men and women.

4.2 Labor Force Participation

Figure 2 summarizes female labor force participation in the 178 countries for which data is available. The figure highlights the fact that women’s participation in economic life varies tremendously across countries: the women’s labor force participation rate ranges from 9 percent in the Republic of Yemen to 87 percent in Madagascar. Gender gaps in labor force participation also vary across countries: in Afghanistan, women are 71 percentage points less likely to be in the labor force than men; women are more likely to be in the labor force than men in Burundi and Mozambique. Figure 2 suggests a negative relationship between the prevalence of gender languages and women’s involvement in the labor force.
In the figure, darker bars indicate a higher prevalence of grammatical gender. It is clear that many of the countries with the lowest levels of women’s labor force participation and the largest gender gaps in labor force participation are those where gender languages are dominant.

We confirm the statistical significance of this relationship in a regression framework in Table 1. In the first three columns, the outcome variable is the average level of female labor force participation in country $c$. We report a parsimonious specification with no controls in Column 1. Gender languages are significantly associated with lower levels of female labor force participation. The coefficient estimate suggests that women’s labor force participation is 13.82 percentage points higher in the absence of gender languages (p-value $2.37 \times 10^{-6}$). Column 2 of Table 1 reports a specification that includes continent fixed effects; Column 3 also includes geographic controls (percentage tropical, average temperature and precipitation, an indicator for being landlocked, and Alesina et al.’s (2013) measure of suitability for plough agriculture). The coefficient of interest is negative and statistically significant in both specifications. Moreover, it remains reasonably similar in magnitude: when all of our geographic controls are included, the coefficient suggests that grammatical gender is associated with an 11.89 percentage point decline in women’s labor force participation (p-value 0.001).

In Columns 4 through 6 of Table 1, we replicate our analysis using the gender difference in labor force participation as the dependent variable. Gender languages are also associated with robust differences in women’s labor force participation relative to men.18 In a parsimonious specification with no controls (Column 4), we find that grammatical gender is associated with an 11.59 percentage point increase in the gender gap in labor force participation (p-value $6.46 \times 10^{-6}$).
When we include continent fixed effects and country-level geography controls, the coefficient rises to suggest that grammatical gender is associated with a 14.64 percentage point increase in the gender difference in labor force participation (p-value $1.44 \times 10^{-5}$). Thus, the proportion of a country’s population whose native language is a gender language is a robust predictor of gender differences in labor force participation. Moreover, the estimated coefficients suggest a relationship that is both statistically and economically significant. For instance, the estimated coefficients could help to explain why the gender gap in labor force participation is only 10 percentage points in Haiti but 28 percentage points in the Dominican Republic. Taken at face value, our coefficient estimates suggest that grammatical gender might keep as many as 125 million women around the world out of the labor force.

18 As shown in Online Appendix Table A1, we obtain similar results when we use the ratio of female labor force participation to male labor force participation as the outcome variable.

In the Online Appendix, we report a range of robustness checks, all of which suggest that the relationship between grammatical gender and female labor force participation is not driven by outliers or specification choices. In Online Appendix Table A2, we show that our main result is robust to the inclusion of a range of “bad controls”: intermediate outcomes that could themselves have been impacted by grammatical gender. As is well known, including such controls could bias the coefficient of interest, making it impossible to interpret (Angrist and Pischke 2008, Acharya, Blackwell, and Sen 2016). Nevertheless, we note that our main result is robust to the inclusion of controls for log GDP per capita, population, major world religions, and an indicator for post-Communist regimes.
In Online Appendix Table A3, we demonstrate that our results hold when we drop each of the major world languages (Arabic, English, and Spanish). Finally, in Online Appendix Table A4, we include an additional variable for the proportion of a country’s population whose native language is a dichotomous gender language with only two noun classes (masculine and feminine). Results suggest that even weak forms of grammatical gender predict women’s (lack of) involvement in the labor force.

4.3 Educational Attainment

Next, we examine the association between grammatical gender and women’s educational attainment. Education is a key determinant of wages; in many countries, gender differences in educational attainment translate into gender gaps in wages and economic empowerment (Grant and Behrman 2010). Nonetheless, gender gaps in primary and secondary school completion are not nearly as large as gender gaps in labor force participation. Across the 142 countries in the Barro-Lee data set, the average gender gap in primary school completion is only six percentage points and the average gender gap in secondary school completion is only four percentage points. This reflects the very high rates of primary school completion in many parts of the world: more than two thirds of the countries in the Barro-Lee data set have rates of primary school completion above 90 percent for both men and women. Moreover, many wealthy countries have compulsory schooling laws which tend to reduce gender gaps in educational attainment.

In Table 2, we examine the cross-country relationship between grammatical gender and primary school completion. As expected, the relationship is positive and significant when continent controls are not included, reflecting the fact that primary school completion rates are highest in Europe, where gender languages are dominant. Once continent fixed effects are included, the estimated association is negative but not statistically significant.
In Columns 4 through 6 of Table 2, we examine the relationship between grammatical gender and the gender gap in primary school completion. When continent fixed effects are omitted, we again find a positive association, though it is not statistically significant. After including continent fixed effects, we find a negative relationship that is marginally statistically significant. Coefficient estimates suggest that grammatical gender is associated with a 3.72 percentage point increase in the gender gap in primary school completion (Table 2, Column 6, p-value 0.089).

We observe an even more muted cross-country relationship between gender languages and secondary school completion (Table 3). In the absence of continent fixed effects, we again find a positive and significant association between grammatical gender and secondary school completion, reflecting in part the very high rate of female secondary school completion in Europe and the very low rate in Africa. After including continent fixed effects, the association between grammatical gender and female secondary school completion is never statistically significant. Moreover, we never observe a statistically significant association between grammatical gender and the gender gap in secondary school completion. Thus, grammatical gender explains cross-country variation in female labor force participation, but does not explain most of the observed cross-country variation in women’s educational attainment.

4.4 Gender Attitudes

Our main measure of gender norms is a Gender Attitudes Index that we construct by taking the first principal component of the eight World Values Survey (WVS) questions related to gender. In Figure 3, we plot the cross-country relationship between each of these questions and the proportion of a country whose native language is a gender language. The prevalence of gender languages predicts responses to seven of the eight WVS questions.
For example, WVS respondents from countries where gender languages are dominant are more likely to agree with the statement “When a mother works for pay, the children suffer” or “When jobs are scarce, men should have more right to a job than women”. The country-level prevalence of gender languages is also associated with an increased likelihood of believing that men make better business executives and political leaders than women, that being a housewife is as fulfilling as paid work, that university education is more important for boys, and that wage inequality within the household is likely to cause conflict.

In Table 4, we confirm the association between the prevalence of gender languages and our summary index of gender attitudes in a regression framework. After controlling for continent fixed effects and country-level geography, the coefficient estimate suggests that grammatical gender is associated with a decline in support for gender equality that is equivalent to approximately one standard deviation in our index of gender attitudes. To put this in context, the estimates indicate that grammatical gender alone could explain the gap in gender attitudes between Ukraine (at the 55th percentile) and Trinidad and Tobago (at the 80th percentile). Thus, the estimated association between grammatical gender and non-grammatical gender attitudes is both statistically and culturally significant.

If grammatical gender shapes gender attitudes, we would expect it to impact the beliefs of both men and women. In Table 5, we show that, as expected, we observe a negative association between the country-level prevalence of grammatical gender and gender attitudes among both women (Columns 1 through 3) and men (Columns 4 through 6). The association is always statistically significant after including continent fixed effects. Moreover, though the coefficient is slightly larger for men, we can never reject equality across genders.
Thus, the cross-country evidence suggests that grammatical gender predicts gender differences in behavior (specifically, involvement in the labor force), but also predicts traditional gender attitudes among both men and women.

4.5 Robust Inference

In this section, we discuss two potential concerns with our cross-country analysis. First, as discussed in Section 3.1.3, we were unable to classify the gender structure of some languages. Though these languages tend to be small (in terms of numbers of native speakers), they account for more than one percent of the population in 22 countries. In Section 4.5.1, we present an estimation approach that adjusts for the interval nature of our independent variable of interest, the proportion of each country’s population whose native language is a gender language. In Section 4.5.2, we consider the fact that language structures may be correlated within language families, since modern tongues evolved from common ancestors (Roberts, Winters, and Chen 2015). To address the potential correlation within families while maximizing statistical power (by exploiting variation in grammatical gender both within and between families), we introduce a permutation test based on the structure of the language tree.

4.5.1 Measurement Error

In our cross-country analysis, our independent variable of interest is the proportion of the population whose native language is a gender language. However, as discussed above, we are unable to find information on the grammatical structure of many of the world’s smaller languages. Though these unclassified languages account for less than one percent of the world population, they make up a substantial fraction of the population in a small number of countries (e.g. Chad and Papua New Guinea).
Even in countries where we successfully classify the gender structure of almost everyone, our independent variable of interest is an interval rather than a point in 85 of 193 countries, because the proportion of native speakers whose languages we classify is less than one. This is a case described by Horowitz and Manski (1998) as “censoring of regressors.”

Our analysis so far assumes that this missingness is ignorable. Without this assumption, however, we can still estimate worst-case bounds for the maximum and minimum possible values of the parameter of interest; following Imbens and Manski (2004), we can construct a confidence interval around these bounds. We use numerical optimization to search the space of possible dependent variable values to establish worst-case upper and lower bounds, $\hat{\beta}_u$ and $\hat{\beta}_l$, that would result from estimation of Equation 1.19 We then use the associated standard errors on these extrema to compute a confidence interval, employing a formula analogous to that of Equations 6 and 7 in Imbens and Manski (2004). A confidence interval with coverage probability $\alpha$ is equal to:

$$CI_\alpha = \left[\hat{\beta}_l - \bar{C} \cdot SE(\hat{\beta}_l),\; \hat{\beta}_u + \bar{C} \cdot SE(\hat{\beta}_u)\right] \tag{2}$$

where $\bar{C}$ satisfies

$$CDF\left(\bar{C} + \frac{\hat{\Delta}}{\max\left(SE(\hat{\beta}_l), SE(\hat{\beta}_u)\right)}\right) - CDF\left(-\bar{C}\right) = \alpha \tag{3}$$

for the CDF of Student’s t-distribution with the appropriate number of degrees of freedom.20 Intuitively, the Imbens and Manski approach formalizes a method for shortening each end of the confidence interval relative to the union of the OLS confidence intervals around the worst-case point estimates, since from the perspective of each end, the other end is further away than a 95-percent confidence interval would require.

19 We use MATLAB’s fmincon interior point algorithm, and confirm results using a simple hill-climbing algorithm in Stata.

20 Imbens and Manski do this using the normal distribution, but using the Student t-distribution yields a wider, more conservative confidence interval.
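To make Equations 2 and 3 concrete, the following minimal Python sketch solves Equation 3 for the critical value by bisection and then forms the interval in Equation 2. Two assumptions are worth flagging: we take $\hat{\Delta}$ to be the estimated width of the bounds, $\hat{\beta}_u - \hat{\beta}_l$, as in Imbens and Manski (2004), and we use the standard normal CDF from the Python standard library rather than the Student t-distribution used in the paper, so the resulting interval is slightly narrower. The function name and the example inputs are hypothetical.

```python
from statistics import NormalDist


def imbens_manski_ci(beta_l, beta_u, se_l, se_u, alpha=0.95):
    """Confidence interval for a partially identified coefficient.

    beta_l, beta_u: worst-case lower and upper bound estimates.
    se_l, se_u: standard errors of those two estimates.
    alpha: coverage probability.
    Returns (lower, upper, c_bar).
    """
    norm = NormalDist()  # simplification: normal instead of Student's t
    delta = beta_u - beta_l  # assumed definition of the estimated width
    shift = delta / max(se_l, se_u)

    def coverage(c):
        # Left-hand side of Equation 3; increasing in c.
        return norm.cdf(c + shift) - norm.cdf(-c)

    lo, hi = 0.0, 10.0
    for _ in range(200):  # bisection on the monotone coverage function
        mid = (lo + hi) / 2
        if coverage(mid) < alpha:
            lo = mid
        else:
            hi = mid
    c_bar = (lo + hi) / 2
    return beta_l - c_bar * se_l, beta_u + c_bar * se_u, c_bar


# Illustrative inputs only, not estimates from the paper.
point_case = imbens_manski_ci(-12.0, -12.0, 3.0, 3.0)  # bounds coincide
wide_case = imbens_manski_ci(-20.0, 20.0, 1.0, 1.0)    # bounds far apart
```

When the bounds coincide, the critical value approaches the familiar two-sided value (about 1.96 at 95 percent coverage); when the bounds are far apart relative to their standard errors, it approaches the one-sided value (about 1.645), which is the "shortening" described above.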
In Table 6, we compare naive OLS confidence intervals with the more conservative Imbens-Manski confidence intervals which adjust for censoring of the regressor of interest. As expected, confidence intervals widen slightly, but patterns of significance are unchanged: those confidence intervals that did not include zero in the naive specification do not include zero after adjusting for censoring.

4.5.2 Non-Independence within Language Families

A more serious inference concern arises from the fact that languages are not independent. Different tongues evolve over time from a common ancestor. Grammatical structures vary both across and within language families. Roberts, Winters, and Chen (2015) consider a range of approaches to correcting for the non-independence of modern languages. Many approaches have the drawback that they are statistically less powerful than they could otherwise be because they ignore variation in grammatical structure either within or between language families.

We propose a permutation test approach based on the observed structure of the language tree, as documented by the Ethnologue. Specifically, we cluster together languages up to the highest tree level at which we observe no variation in our treatment of interest, grammatical gender. That is, we form the largest possible clusters that are homogeneous in terms of grammatical gender. Thus, for entire top-level language families that show no variation in gender structure (e.g., the Austronesian language family), we cluster at the language family level. In intermediate cases, we designate clusters at the highest level of the tree where we do not observe variation in grammatical gender (e.g., all Western Nilotic languages cluster together; they are only a branch within the Eastern Sudanic part of the Nilo-Saharan family, which itself contains a number of other such clusters by our definition).
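The clustering rule can be sketched as a short recursive procedure. This is a hedged illustration, not the authors' implementation: the function name, the representation of each language as a classification path plus a gender indicator, and the toy tree with its language names are all assumptions made for the example. Languages that share a full path but differ in gender structure are placed in singleton clusters.

```python
def homogeneous_clusters(languages):
    """Partition languages into the largest subtrees with no gender variation.

    languages: dict mapping language name to (path, is_gender), where path is a
        tuple of tree nodes from the top-level family downward (hypothetical layout).
    Returns a list of clusters, each a sorted list of language names.
    """
    def split(names, depth):
        statuses = {languages[n][1] for n in names}
        if len(statuses) == 1:
            return [sorted(names)]  # homogeneous subtree: one cluster
        groups, singletons = {}, []
        for n in names:
            path = languages[n][0]
            if depth < len(path):
                groups.setdefault(path[depth], []).append(n)
            else:
                singletons.append(n)  # path exhausted: cluster at language level
        clusters = [[n] for n in sorted(singletons)]
        for node in sorted(groups):
            clusters.extend(split(groups[node], depth + 1))
        return clusters

    families = {}
    for name, (path, _) in languages.items():
        families.setdefault(path[0], []).append(name)
    result = []
    for family in sorted(families):
        result.extend(split(families[family], 1))
    return result


# Toy tree with invented names: one family, three groups.
toy = {
    "A1": (("Fam", "A"), True),
    "A2": (("Fam", "A"), True),
    "B1.1": (("Fam", "B", "B1"), True),
    "B1.2": (("Fam", "B", "B1"), False),
    "B2.1": (("Fam", "B", "B2"), True),
    "B2.2": (("Fam", "B", "B2"), True),
    "C1": (("Fam", "C"), False),
    "C2": (("Fam", "C"), False),
}
clusters = homogeneous_clusters(toy)
```

In this toy tree, groups A and C and branch B2 each form one cluster, while the two B1 languages, which share a path but differ in gender structure, form singleton clusters.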
In cases where two languages that differ in their gender structure otherwise share the same classification path through the entire language tree, we cluster at the language level.

Figure 4 illustrates this approach for a hypothetical language family. All of the languages in the Group A branch in the figure are gender languages, so they are assigned to a single cluster. Similarly, all of the languages on the Group C branch are non-gender, so they also represent a single cluster. Within Group B, the B1 languages show language-level variation: Languages B1.1 and B1.2 share the same path for the entire language tree, but they differ in gender structure. Thus, within the B1 branch of this hypothetical tree, individual languages are assigned to unique clusters. Finally, the B2 languages are all gender languages, so they are assigned to a single cluster that is distinct from the B1 clusters. Thus, the hypothetical language tree presented in the figure is partitioned into six clusters, each representing a sub-tree within the language tree that shows no gender variation.

This approach defines a set of 201 clusters, 68 of which have grammatical gender. Having assigned all of the classified languages to clusters in this manner, we conduct a permutation test by randomly generating alternative (hypothetical) allocations of gender structure that would be possible while holding fixed the structure of the treatment variation across the language tree, the distribution of languages across countries, and the number of clusters “treated” with grammatical gender (68 of 201). We use each such hypothetical assignment of treatments to create an associated country-level measure of grammatical gender (which would be observed if treatments were assigned according to our hypothetical allocation rule, given the structure of the language tree and the distribution of languages across countries).
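A minimal sketch of the reallocation step and of the p-value obtained from many such draws follows. Its simplifications are assumptions of the sketch, not features of the paper: the regression is bivariate (no controls or fixed effects), `shares` maps each country to its native-speaker shares by language, `cluster_of` maps each language to its cluster, and all names and numbers are hypothetical.

```python
import random


def country_share(shares, cluster_of, treated):
    """Share of each country's native speakers in 'treated' clusters."""
    return {
        country: sum(s for lang, s in langs.items() if cluster_of[lang] in treated)
        for country, langs in shares.items()
    }


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        return 0.0  # this draw produced no variation in the regressor
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx


def permutation_p_value(shares, cluster_of, treated, outcome, draws=10_000, seed=0):
    """Observed slope and the share of placebo slopes at least as large."""
    rng = random.Random(seed)
    countries = sorted(shares)
    y = [outcome[c] for c in countries]
    clusters = sorted(set(cluster_of.values()))

    def slope(treated_set):
        x = country_share(shares, cluster_of, treated_set)
        return ols_slope([x[c] for c in countries], y)

    observed = slope(set(treated))
    extreme = 0
    for _ in range(draws):
        # Reassign treatment to the same number of clusters, chosen at random.
        placebo = set(rng.sample(clusters, len(treated)))
        if abs(slope(placebo)) >= abs(observed):
            extreme += 1
    return observed, extreme / draws


# Invented data: four languages, four clusters, four countries.
cluster_of = {"a": 1, "b": 2, "c": 3, "d": 4}
shares = {"W": {"d": 1.0}, "X": {"a": 1.0}, "Y": {"b": 1.0}, "Z": {"c": 0.5, "d": 0.5}}
outcome = {"W": 60.0, "X": 10.0, "Y": 50.0, "Z": 55.0}
observed, p_value = permutation_p_value(shares, cluster_of, {1}, outcome, draws=2000)
print(observed, p_value)
```

Because treatment is reassigned at the cluster level and then mapped back to countries through the fixed language shares, each placebo draw respects both the language tree and the distribution of languages across countries.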
We repeat this process 10,000 times, allowing us to estimate the likelihood that the observed associations between grammatical gender and outcomes are spurious, given the structure of the language tree, the correlation in treatment within language families, and the distribution of languages across countries.

In Table 7, we compare naive OLS p-values to those that result from our permutation test. It is clear that appropriate clustering matters: permutation test p-values are substantially higher than the naive OLS p-values. Nevertheless, the negative association between grammatical gender and women’s labor force participation is still statistically significant after adjusting for the non-independence of languages. Figure 5 illustrates the full distribution of coefficient estimates under the null, highlighting the small fraction that exceed the magnitude of the true estimated coefficients. The relationship between grammatical gender and gender attitudes also remains marginally significant, in spite of the relatively small number of countries included in that analysis. Thus, our results do not appear to be driven by the correlation in grammatical structure observed within language families.

5 Within-Country Analysis

5.1 Empirical Strategy

Next, we explore the relationship between gender languages and women’s labor force participation at the individual level in a cultural and institutional context where both gender and non-gender languages are indigenous. There are seven countries in Africa where between 10 and 90 percent of the population speaks a gender native language: Chad, Kenya, Mauritania, Niger, Nigeria, South Sudan, and Uganda. In these countries, both gender and non-gender languages are indigenous (in contrast to, for example, several countries in South America where non-gender indigenous languages and a gender colonial language are both widely spoken).
Of the seven African countries listed above, we focus on the four that have been included in at least one round of the Afrobarometer survey: Kenya, Niger, Nigeria, and Uganda. Four rounds of data are available for Kenya, Nigeria, and Uganda, while only one round of data is available for Niger.²¹ Our sample includes 26,546 Afrobarometer respondents who speak 175 different languages.

²¹ The first round of the Afrobarometer surveys did not include sufficiently detailed data on native languages for inclusion in our analysis. Our analysis includes data from Afrobarometer Rounds 2 through 5 for Kenya, Nigeria, and Uganda. Niger was only added to the Afrobarometer in Round 5; that round is included in our analysis.

Our individual-level analysis parallels our cross-country analysis. We consider two main outcomes: labor force participation (an indicator equal to one if a respondent either does some type of income-generating activity or is actively looking for a job) and education (indicators for having completed primary and secondary school). We report two regression specifications. First, we estimate the association between grammatical gender and labor force participation in a sample of (only) women, estimating the OLS regression equation:

$$Y_{icr} = \alpha + \beta\, Gender_{icr} + \nu_{cr} + \gamma Z_{icr} + \varepsilon_{icr} \tag{4}$$

where $Y_{icr}$ is the outcome of interest for woman $i$ in country $c$ who was interviewed in Afrobarometer Round $r$, $Gender_{icr}$ is an indicator for having a gender language as one’s mother tongue, $\nu_{cr}$ is a vector of country-round fixed effects, $Z_{icr}$ is a vector of controls (age, age squared, and a set of religion dummies), and $\varepsilon_{icr}$ is a mean-zero error term.

As in our cross-country analysis, we wish to avoid confounding the impact of grammatical gender on women’s labor force participation (and education) with other cultural factors that might impact both men’s and women’s labor force attachment. To do this, we also report pooled OLS regressions that include data on both men and women.
These take the form:

$$Y_{icr} = \alpha + \beta\, Gender_{icr} + \zeta\, Female_{icr} + \mu\, Gender \times Female_{icr} + \nu_{cr} + \gamma Z_{icr} + \varepsilon_{icr} \tag{5}$$

where $Gender \times Female_{icr}$ is an interaction between a female dummy and the indicator for being a native speaker of a gender language. In these specifications, we also include interactions between the $Female_{icr}$ dummy and our age and religion controls. Throughout our analysis, we cluster standard errors by language.

5.2 Labor Force Participation

We report the results of our regressions of individual-level labor force participation on the indicator for being a native speaker of a gender language in Table 8. Here, the sample is restricted to women. We find a robust negative association between grammatical gender and women’s labor force participation. After controlling for country-round fixed effects, age, and religion, coefficient estimates suggest that women who speak gender languages as their native languages are 18 percentage points less likely to be in the labor force (p-value $1.85 \times 10^{-5}$).

Speaking a gender native language is also associated with lower female labor force participation relative to men from the same ethnolinguistic group (Table 9). The coefficient on the Female×Gender language interaction is −0.11 (p-value 0.025) after including controls for country-round fixed effects, plus age and religion categories and interactions between those and the female dummy. Thus, grammatical gender is associated with both lower female labor force participation and larger gender gaps in labor force participation at both the cross-country and the individual level.

5.3 Educational Attainment

Next we consider the within-country association between grammatical gender and women’s educational attainment.
In our cross-country analysis, we did not find a statistically significant relationship between the country-level prevalence of gender languages and either women’s educational attainment or gender differences in educational attainment.²² However, rates of primary and secondary school completion are quite high (for both men and women) in many countries, limiting the statistical power of cross-country analysis. Moreover, many countries have compulsory schooling laws in place, and these may attenuate the impacts of both cultural values and beliefs about labor market returns (which will differ by gender if women are less likely to participate in the labor force) in decisions about girls’ enrollment in school.

²² As noted above, we did find a marginally significant association between grammatical gender and the gender gap in primary school completion after controlling for continent fixed effects.

Average levels of education are still quite low in many African countries. In the 37 African countries included in the Barro-Lee data set, the average level of educational attainment among adult males is only 5.6 years. Moreover, gender differences in educational attainment persist throughout Africa. Among African countries in the Barro-Lee data set, the average level of education among women is only 4.3 years, and women obtain less schooling than men in all but five African countries. Primary school has only recently been made free in many African countries, and compulsory schooling laws are still relatively rare.

We estimate the association between having a gender native language and the likelihood of completing primary school (Table 10, Columns 1 through 3) and secondary school (Table 10, Columns 4 through 6) in Kenya, Niger, Nigeria, and Uganda using the Afrobarometer data described above. Coefficient estimates suggest a very strong negative relationship between grammatical gender and educational attainment.
After controlling for country-round fixed effects, age, and religion, we find that speaking a gender native language is associated with a 22 percentage point decline in the likelihood that a woman completed primary school and a 16 percentage point decline in the likelihood that a woman completed secondary school (Table 10, Columns 3 and 6, respectively). Both coefficients are negative and significant at the 99 percent confidence level (p-values $3.58 \times 10^{-5}$ and $5.92 \times 10^{-4}$, respectively).

Moreover, we are once again able to rule out the possibility that differences in levels of education among women are driven by cultural factors (i.e. differences across ethnolinguistic groups) that impact both men and women. We report pooled OLS specifications that include men and women in Table 11. We do find that cultural factors matter: the indicator for having a gender native language is negative and significant in all specifications. After controlling for age and religion plus interactions between these controls and the indicator for being female (along with country-round fixed effects), men whose native language is a gender language are 10 percentage points less likely to finish primary school and secondary school than men whose native language is non-gender. Nonetheless, the interaction between Female and the indicator for gender native languages is also negative and significant in all specifications; moreover, after differencing out the level of education observed among men in each language group, the estimated Female×Gender language interaction is largely impervious to controls. In a specification with no controls (Table 11, Column 1), the coefficient estimate indicates that women whose native language is a gender language are 12 percentage points less likely to have completed primary school (p-value $2.15 \times 10^{-18}$).
When we include controls for country-round fixed effects, age, and religion, the estimated coefficient only drops from −0.12 to −0.11 (p-value $1.07 \times 10^{-8}$). The pattern is similar for secondary school: with or without controls, the coefficient indicates that grammatical gender is associated with a six percentage point decline in the likelihood of completing secondary school (p-value $2.48 \times 10^{-5}$ without controls, 0.009 with controls). Thus, our results suggest an extremely robust negative association between grammatical gender and women’s educational attainment within our African sample.

6 Causality

The analysis presented thus far documents the strong negative relationship between grammatical gender and women’s labor force participation, and shows that it is robust to a permutation test that addresses the potential non-independence of observations. We also find a positive cross-country relationship between grammatical gender and traditional gender attitudes, and a robust negative association between grammatical gender and women’s educational attainment in four African countries. Of course, these are correlations, not necessarily causal relationships.

In most cases, whether a language has retained grammatical gender is driven by idiosyncrasies of history far removed from the outcomes of interest in this paper. For example, scholars believe that English lost grammatical gender because its complex agreement system did not withstand the influx of Scandinavian immigrants (who learned English as a second language in adulthood) into the linguistic community (McWhorter 2005, Kastovsky 1999), not because of changes in gender norms in pre-Norman England. Nevertheless, gender languages are not randomly assigned. The observed correlations may be driven by some unobserved causal factor that is correlated with both language and gender norms.
To assess whether the observed correlation is likely to represent a causal link between language and our outcomes of interest, we follow the approach suggested by Altonji, Elder, and Taber (2005) and further refined by Oster (2017).²³ Under the assumption that the relationship between the outcome variables, treatment, and the observed controls is similar to the relationship between the outcomes, treatment, and unobserved controls, this approach relates changes in coefficient magnitudes as controls are added to changes in the observed $R^2$. Intuitively, omitted variable bias is assumed to be proportional to changes in regression coefficients as controls are added; however, these changes must be scaled by changes in the $R^2$, because adding controls that do not explain the outcome variable does little to address concerns about omitted variable bias.

²³ An alternative approach would be to try to identify a suitable instrument for grammatical gender. However, recent work suggests that conventional approaches may overstate the precision of 2SLS estimates, leading to invalid inference (Young 2018). Thus, OLS with caution may be an equally reasonable approach.

Following the procedures outlined by Oster (2017), we estimate two measures of coefficient stability. These additional statistics are calculated using the results from two OLS regressions: (i) a bivariate regression of an outcome of interest on grammatical gender, which generates a coefficient of interest, $\mathring{\beta}$, and an associated $\mathring{R}^2$; and (ii) a multivariate regression of the same outcome on grammatical gender plus a set of controls, which generates a second OLS coefficient, $\tilde{\beta}$, and an associated $\tilde{R}^2$. In this framework, $\delta^*$ is the proportional selection coefficient.
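To see how the four quantities from these two regressions combine, the sketch below uses the simple approximation discussed by Oster (2017), $\beta^* \approx \tilde{\beta} - \delta\,(\mathring{\beta} - \tilde{\beta})\,(R_{max} - \tilde{R}^2)/(\tilde{R}^2 - \mathring{R}^2)$, together with the value of $\delta$ that would drive this expression to zero. It is an illustration only: the approximation is not the full estimator; $R_{max}$ (the $R^2$ of a hypothetical regression that also included the unobservables) must be chosen by the analyst, and its value here is an assumption; and the inputs are invented numbers, not estimates from Table 12.

```python
def oster_approximation(beta_short, r2_short, beta_long, r2_long,
                        r2_max=1.0, delta=1.0):
    """Approximate bias-adjusted coefficient and break-even delta.

    beta_short, r2_short: coefficient and R-squared without controls.
    beta_long, r2_long:   coefficient and R-squared with controls.
    r2_max:               assumed R-squared if unobservables were included.
    """
    if not (r2_short < r2_long <= r2_max):
        raise ValueError("need r2_short < r2_long <= r2_max")
    # Coefficient movement, scaled by how much R-squared could still rise
    # relative to how much it rose when the observed controls were added.
    scaled_move = (beta_short - beta_long) * (r2_max - r2_long) / (r2_long - r2_short)
    beta_star = beta_long - delta * scaled_move
    # Delta at which the approximate adjusted coefficient equals zero.
    delta_star = float("inf") if scaled_move == 0 else beta_long / scaled_move
    return beta_star, delta_star


# Invented inputs for illustration (not from the paper).
beta_star, delta_star = oster_approximation(
    beta_short=-0.20, r2_short=0.10, beta_long=-0.15, r2_long=0.40, r2_max=0.70
)
print(beta_star, delta_star)
```

In this approximation, a coefficient that barely moves while the $R^2$ rises substantially yields a small scaled movement and therefore a large break-even $\delta$; a negative break-even value arises when adding controls moves the coefficient away from zero.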
Given the empirical relationship between the outcome, the treatment, and the observed controls, $\delta^*$ indicates how much more correlated with treatment the unobservables would need to be in order to explain the entire association between treatment and the outcome of interest. If $\delta^* > 1$, then an observed empirical relationship is relatively robust in that unobservables would need to be more correlated with treatment than observables to explain the association. A second parameter of interest is $\beta^*$. It indicates the likely causal impact of grammatical gender on an outcome of interest under the assumption that $\delta^* = 1$ (i.e. assuming that the covariance structure is the same for observables and unobservables).

Coefficient stability results are presented in Table 12. Cross-country results are presented in Panel A. Results indicate that our estimates of the impact of grammatical gender on women’s labor force participation are unlikely to be driven by selection alone. Unobservables would need to be 1.44 times more correlated with treatment (than observables) to explain the observed link between grammatical gender and the level of women’s labor force participation; unobservables would need to be 3.23 times more correlated with treatment to explain the gender gap in labor force participation. Estimates of $\beta^*$ suggest that grammatical gender has a substantial negative causal impact on both outcomes of interest. Thus, the analysis suggests that gender languages reduce women’s labor force participation in both absolute and relative terms.

Turning to our analysis of the cross-country relationship between grammatical gender and gender attitudes, we again find that the observed association is unlikely to be driven by omitted variables.²⁴ Adding controls to our cross-country regressions of gender attitudes on grammatical gender strengthens the empirical relationship.
Thus, the procedures outlined by Oster (2017) point toward a negative causal impact of grammatical gender on support for gender equality among both women and men.

²⁴ Since adding controls changes the sign of the cross-country relationship between grammatical gender and educational outcomes, it is not amenable to this type of analysis. Moreover, since the observed association is, at best, only marginally statistically significant, it does not seem appropriate to assess whether the observed non-relationship is driven by omitted variables. We nevertheless present the estimated coefficients in Table 12 for completeness.

Our individual-level analysis is presented in Panel B of Table 12. In all cases, the Oster (2017) approach suggests that the empirical relationship between grammatical gender and outcomes of interest is unlikely to be driven by selection on unobservables. For example, unobservable covariates would need to be 1.86 times more closely correlated with treatment than observables to explain the empirical relationship between grammatical gender and female labor force participation (relative to men from the same ethnic group). As discussed above, the estimated relationships between grammatical gender and educational attainment are very robust to controls. Hence, the Oster (2017) approach indicates that unobservables would need to be 4.64 times more correlated with treatment than observables to explain the observed association between grammatical gender and primary school completion (relative to men from the same ethnic group). Unobservables would need to be 6.01 times more correlated with treatment than observables to explain the association between grammatical gender and secondary school completion (again, relative to men from the same ethnic group). Because the estimated coefficients show almost no change as controls are added (though the controls increase the $R^2$ substantially), the approach proposed by Altonji, Elder, and Taber (2005) and Oster (2017) suggests that grammatical gender has a substantial negative causal impact on women’s labor force participation and educational attainment in Sub-Saharan Africa.

Thus, the coefficient stability approach supports the hypothesis that grammatical gender has a causal impact on women’s labor force participation and, in parts of Sub-Saharan Africa, women’s educational attainment. Nevertheless, this approach, like instrumental variables, relies on fundamentally untestable assumptions. Though modern gender attitudes could not plausibly have impacted the grammatical structure of language, we cannot fully rule out the possibility that cultural factors shaped both grammatical structure and gender norms. As in all studies of history and culture, it is not possible to run experiments and relevant sample sizes are fairly small; some measure of caution about causal claims is therefore certainly warranted.

7 Conclusion

Using a new data set on the grammatical gender structure of more than 4,000 languages, we document a robust negative association between gender languages and women’s labor force participation. At the country level, an increase in the proportion of the population whose native language is a gender language is associated with lower female labor force participation and, perhaps more importantly, larger gender differences in labor force participation. Using data from the World Values Survey, we show that grammatical gender also predicts support for traditional gender roles. However, the prevalence of gender languages does not explain cross-country differences in women’s educational attainment.

Focusing on four African countries where both gender and non-gender languages are indigenous, we show that a similar pattern holds within countries. Speaking a gender native language is associated with lower labor force participation and educational attainment among women, both in absolute terms and relative to men from the same ethnolinguistic group. Both our cross-country and our individual-level regressions are robust to the inclusion of controls that could not plausibly have been impacted by treatment; if one is willing to assume that the relationship between unobserved omitted factors, treatment, and the outcomes of interest is similar to the observed relationship between controls, treatment, and the outcomes of interest, our estimates suggest that grammatical gender has a large negative impact on women’s labor force participation.

Our results are consistent with research in psychology, linguistics, and anthropology suggesting that languages shape patterns of thought in subtle and subconscious ways. Languages are a critical part of our cultural heritage, and it would be inappropriate to suggest that some languages are detrimental to development or women’s rights. However, languages do evolve over time; the direction of their evolution is shaped by both individual choices (for example, whether to use gendered pronouns like “he” or “she” or gender-neutral alternatives such as “they”) and conscious decisions by government agencies (e.g. the Académie Française) and other thought leaders (e.g. major newspapers and magazines). Our results suggest that individuals should reflect upon the social consequences of their linguistic choices, as the nature of the language we speak shapes the ways we think, and the ways our children will think in the future.

References

Acharya, A., M. Blackwell, and M. Sen (2016): “Explaining Causal Findings without Bias: Detecting and Assessing Direct Effects,” American Political Science Review, 110(3), 512–529.
Afrobarometer Data (2016): “Kenya, Niger, Nigeria, Tanzania, Uganda, Rounds 2 through 5, 2002–2013,” available at http://www.afrobarometer.org.
Aikhenvald, A. Y. (2003): Classifiers: A Typology of Noun Categorization Devices. Oxford University Press, Oxford, UK.
Alesina, A., P. Giuliano, and N. Nunn (2013): “On the Origins of Gender Roles: Women and the Plough,” Quarterly Journal of Economics, 128(2), 469–530.
Altonji, J. G., T. E. Elder, and C. R. Taber (2005): “Selection on Observed and Unobserved Variables: Assessing the Effectiveness of Catholic Schools,” Journal of Political Economy, 113(1), 151–184.
Angrist, J. D., and J.-S. Pischke (2008): Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press.
Barro, R., and J.-W. Lee (2013): “A New Data Set of Educational Attainment in the World, 1950-2010,” Journal of Development Economics, 104, 184–198.
Beames, J. (1875): Comparative Grammar of the Modern Aryan Languages of India: to wit, Hindi, Punjabi, Sindhi, Gujarati, Marathi, Oriya, and Bangali: Vol. II. The Noun and Pronoun. Trubner and Company, London, U.K.
Boroditsky, L., L. A. Schmidt, and W. Phillips (2003): “Sex, Syntax, and Semantics,” in Language in Mind: Advances in the Study of Language and Thought, ed. by S. Goldin-Meadow, and D. Gentner, pp. 61–79. MIT Press, Cambridge, MA.
Campbell, G. L. (1991): Compendium of the World’s Languages. Routledge, London, UK.
Chen, M. K. (2013): “The Effect of Language on Economic Behavior: Evidence from Savings Rates, Health Behaviors, and Retirement Assets,” American Economic Review, 103(2), 690–731.
Corbett, G. G. (1991): Gender. Cambridge University Press, Cambridge, UK.
Corbett, G. G. (2013a): “Number of Genders,” in The World Atlas of Language Structures Online, ed. by M. S. Dryer, and M. Haspelmath. Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany.
Corbett, G. G. (2013b): “Sex-based and Non-sex-based Gender Systems,” in The World Atlas of Language Structures Online, ed. by M. S. Dryer, and M. Haspelmath. Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany.
Corbett, G. G. (2013c): “Systems of Gender Assignment,” in The World Atlas of Language Structures Online, ed. by M. S. Dryer, and M. Haspelmath. Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany.
Creissels, D. (2000): “Typology,” in African Languages: An Introduction, ed. by B. Heine, and D. Nurse, pp. 231–258. Cambridge University Press, Cambridge, UK.
Danziger, S., and R. Ward (2010): “Language Changes Implicit Associations between Ethnic Groups and Evaluation in Bilinguals,” Psychological Science, 21(6), 799–800.
Durkheim, E., and M. Mauss (1963): Primitive Classification. University of Chicago Press, Chicago, USA, Translated from the French, edited, and with an introduction by Rodney Needham.
Easterly, W., and R. Levine (1997): “Africa’s Growth Tragedy: Policies and Ethnic Divisions,” Quarterly Journal of Economics, 112(4), 1203–1250.
Gay, V., D. L. Hicks, E. Santacreu-Vasut, and A. Shoham (forthcoming): “Decomposing culture: An Analysis of Gender, Language, and Labor Supply in the Household,” Review of Economics of the Household.
Givati, Y., and U. Troiano (2012): “Law, Economics, and Culture: Theory of Mandated Benefits and Evidence from Maternity Leave Policies,” Journal of Law and Economics, 55(2), 339–364.
Grant, M. J., and J. R. Behrman (2010): “Gender Gaps in Educational Attainment in Less Developed Countries,” Population and Development Review, 36(1), 71–89.
Grierson, G. A. (1903a): Linguistic Survey of India: Volume V, Indo-Aryan Family Eastern Group, Part I, Specimens of the Bengali and Assamese Languages. Superintendent Government Printing, Calcutta, India, http://dsal.uchicago.edu/books/lsi/index.html [accessed 13 July 2016].
Grierson, G. A. (1903b): Linguistic Survey of India: Volume V, Indo-Aryan Family Eastern Group, Part II, Specimens of the Bihari and Oriya Languages. Superintendent Government Printing, Calcutta, India, http://dsal.uchicago.edu/books/lsi/index.html [accessed 13 July 2016].
Grierson, G. A. (1904): Linguistic Survey of India: Volume VI, Indo-Aryan Family Mediate Group, Specimens of the Eastern Hindi Language. Superintendent Government Printing, Calcutta, India, http://dsal.uchicago.edu/books/lsi/index.html [accessed 13 July 2016].
Grierson, G. A. (1905): Linguistic Survey of India: Volume VII, Indo-Aryan Family Southern Group, Specimens of the Marathi Language. Superintendent Government Printing, Calcutta, India, http://dsal.uchicago.edu/books/lsi/index.html [accessed 13 July 2016].
Grierson, G. A. (1907): Linguistic Survey of India: Volume IX, Indo-Aryan Family Central Group, Part III, The Bhil Languages. Superintendent Government Printing, Calcutta, India, http://dsal.uchicago.edu/books/lsi/index.html [accessed 13 July 2016].
Grierson, G. A. (1908): Linguistic Survey of India: Volume IX, Indo-Aryan Family Central Group, Part II, Specimens of Rajastani and Gujarati. Superintendent Government Printing, Calcutta, India, http://dsal.uchicago.edu/books/lsi/index.html [accessed 13 July 2016].
Grierson, G. A. (1909): Linguistic Survey of India: Volume III, Tibeto-Burman Family, Part I, General Introduction, Specimens of the Tibetan Dialects, the Himalayan Dialects, and the North Assam Group. Superintendent Government Printing, Calcutta, India, http://dsal.uchicago.edu/books/lsi/index.html [accessed 13 July 2016].
Grierson, G. A. (1916): Linguistic Survey of India: Volume IX, Indo-Aryan Family Central Group, Part I, Specimens of Western Hindi and Panjabi. Superintendent Government Printing, Calcutta, India, http://dsal.uchicago.edu/books/lsi/index.html [accessed 13 July 2016].
Grierson, G. A. (1919): Linguistic Survey of India: Volume VIII, Indo-Aryan Family Northwestern Group, Part II, Specimens of the Dardic or Pisacha Languages. Superintendent Government Printing, Calcutta, India, http://dsal.uchicago.edu/books/lsi/index.html [accessed 13 July 2016].
Grierson, G. A. (1921): Linguistic Survey of India: Volume X, Specimens of Languages of the Eranian Family.
Superintendent Government Printing, Calcutta, India, http://dsal.uchicago.edu/books/lsi/index.html [accessed 13 July 2016].
Guiora, A. Z. (1983): “Language and Concept Formation: A Cross-Lingual Analysis,” Cross-Cultural Research, 18(3), 228–256.
Hellinger, M. (2003): “English,” in Gender Across Languages: The Linguistic Representation of Women and Men, Volume, ed. by M. Hellinger, and H. Bußman, pp. 105–114. John Benjamins Publishing Company, Amsterdam, The Netherlands.
Hellinger, M., and H. Bußman (2003): “The Linguistic Representation of Women and Men,” in Gender Across Languages: The Linguistic Representation of Women and Men, Volume 3, ed. by M. Hellinger, and H. Bußman, pp. 1–26. John Benjamins Publishing Company, Amsterdam, The Netherlands.
Hicks, D. L., E. Santacreu-Vasut, and A. Shoham (2015): “Does Mother Tongue Make for Women’s Work? Linguistics, Household Labor, and Gender Identity,” Journal of Economic Behavior and Organization, 110(2), 19–44.
Horowitz, J. L., and C. F. Manski (1998): “Censoring of outcomes and regressors due to survey nonresponse: Identification and estimation using weights and imputations,” Journal of Econometrics, 84(1), 37–58.
Imbens, G. W., and C. F. Manski (2004): “Confidence Intervals for Partially Identified Parameters,” Econometrica, 72(6), 1845–1857.
Janhunen, J. (1999): “Grammatical Gender from East to West,” in Gender in Grammar and Cognition, ed. by B. Unterbeck, and M. Rissanen, pp. 689–708. Mouton de Gruyter, Berlin, Germany.
Kastovsky, D. (1999): “Inflectional Classes, Morphological Restructuring, and the Dissolution of Old English Grammatical Gender,” in Gender in Grammar and Cognition, ed. by B. Unterbeck, and M. Rissanen, pp. 709–728. Mouton de Gruyter, Berlin, Germany.
Kilarski, M. (2013): Nominal Classification: A History of Its Study from the Classical Period to the Present. John Benjamins Publishing Company, Amsterdam, The Netherlands.
Krishnamurti, B. (2001): Comparative Dravidian Linguistics.
Oxford University Press, Oxford, UK.
Lakoff, G. (1987): Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. University of Chicago Press, Chicago, IL.
Lewis, M. P., G. F. Simons, and C. D. Fennig, eds., (2016): “Ethnologue: Languages of the World, nineteenth edition,” Dallas, Texas: SIL International, Online version: http://www.ethnologue.com [accessed 4 June 2016].
Mavisakalyan, A. (2015): “Gender in Language and Gender in Employment,” Oxford Development Studies, 43(4), 403–424.
McWhorter, J. H. (2005): Defining Creole. Oxford University Press.
Muhleisen, S., and D. E. Walicek (2010): “Language and Gender in the Caribbean: An Overview,” Sargasso: Explorations of Language, Gender, and Sexuality, 2008–2009(I), 15–30.
Ogunnaike, O., Y. Dunham, and M. R. Banaji (2010): “The Language of Implicit Preferences,” Journal of Experimental Social Psychology, 46(6), 999–1003.
Oster, E. (2017): “Unobservable Selection and Coefficient Stability: Theory and Evidence,” Journal of Business and Economic Statistics.
Pérez, E. O., and M. Tavits (forthcoming): “Language Influences Public Attitudes Toward Gender Equality,” Journal of Politics.
Roberts, S. G., J. Winters, and K. Chen (2015): “Future Tense and Economic Decisions: Controlling for Cultural Evolution,” PLOS One, 10(7), e0132145.
Santacreu-Vasut, E., O. Shenkar, and A. Shoham (2014): “Linguistic Gender Marking and Its International Business Ramifications,” Journal of International Business Studies, 45, 1170–1178.
Santacreu-Vasut, E., A. Shoham, and V. Gay (2013): “Do female/male distinctions in language matter? Evidence from gender political quotas,” Applied Economics Letters, 20(5), 495–498.
Shoham, A., and S. M. Lee (2017): “The Causal Impact of Grammatical Gender Marking on Gender Wage Inequality and Country Income Inequality,” Business and Society.
UCLA Language Materials Project (2014): “Teaching Resources for Less Commonly Taught Languages,” http://www.lmp.ucla.edu/ [accessed 4 June 2016].
Whorf, B. L. (2011[1956]a): “Language, Mind, and Reality,” in Language, Thought, and Reality: Selected Writings of Benjamin Lee Whorf, ed. by J. B. Carroll, pp. 246–270. Martino Publishing, Mansfield Centre, CT.
Whorf, B. L. (2011[1956]b): “A Linguistic Consideration of Thinking in Primitive Communities,” in Language, Thought, and Reality: Selected Writings of Benjamin Lee Whorf, ed. by J. B. Carroll, pp. 65–86. Martino Publishing, Mansfield Centre, CT.
World Values Survey Association (2015): “World Values Survey Wave 6 2010–2014 Official Aggregate v.20150418,” available online at www.worldvaluessurvey.org, aggregate file producer: Asep/JDS, Madrid SPAIN.
Yadav, R. (1996): A Reference Grammar of Maithili. Walter de Gruyter, Cambridge, UK.
Young, A. (2018): “Consistency without Inference: Instrumental Variables in Practical Application,” working paper.

Figure 1: The Distribution of Gender Languages

The figure shows the percentage of the native speakers in each country whose native language is a gender language (i.e. the fraction of Ethnologue native speakers whose native language uses a system of grammatical gender). The figure assumes that missing data (on 0.8 percent of all native speakers worldwide) is ignorable.

Figure 2: Cross-Country Variation in Female Labor Force Participation

[Bar chart by country. Top panel vertical axis: LFPfemale. Bottom panel vertical axis: LFPmale - LFPfemale. Legend: Proportion gender < 0.1; 0.1 < proportion gender < 0.9; Proportion gender > 0.9.]

The figure plots the level of female labor force participation (top panel) and the gender difference in labor force participation (bottom panel) by country. Darker bars indicate countries with a higher proportion of native speakers of gender languages.
Figure 3: Cross-Country Variation in Gender Attitudes

[Coefficient plot. Horizontal axis: Proportion speaking gender language. Row labels and p-values shown in the figure:]

When a mother works, the children suffer        p = 0.000 ***
Men have more right to a scarce job             p = 0.002 ***
Men make better political leaders               p = 0.002 ***
Men make better business executives             p = 0.001 ***
Being a housewife as fulfilling as paid work    p = 0.061 *
If a wife earns more, it causes problems        p = 0.080 *
University is more important for boys           p = 0.087 *
Having a job not best way to be independent     p = 0.289

The figure summarizes the results from a series of regressions of (country-level averages of) responses to World Values Survey (WVS) questions on the proportion of a country’s population whose native language is a gender language. We present the results for all eight WVS questions related to gender attitudes. Responses to all eight questions are coded so that the answer most consistent with traditional gender norms (involving separate roles for men and women) is equal to 1 and the response most consistent with gender equality is equal to 0. Each regression is estimated via OLS and includes continent fixed effects. The outcome in the first row is the average response to the question “When a mother works for pay, the children suffer” (agreement is coded as a 1, disagreement as a 0). The outcome variable in the second row is the average response to the statement “When jobs are scarce, men should have more right to a job than women.” In the third row, the outcome variable is based on the statement “On the whole, men make better political leaders than women do.” In the fourth row, the outcome variable is based on the statement “On the whole, men make better business executives than women do.” In the fifth row, the outcome variable is based on the statement “Being a housewife is just as fulfilling as working for pay;” agreement was coded as 0 and disagreement was coded as 1.
In the sixth row, the outcome variable is based on the statement “If a woman earns more money than her husband, it’s almost certain to cause problems.” In the seventh row, the outcome variable is based on the statement “A university education is more important for a boy than for a girl.” In the last row, the outcome variable is based on the statement “Having a job is the best way for a woman to be an independent person;” in this case, disagreement was coded as 1 and agreement was coded as 0.

Figure 4: Assignment to Clusters for the Permutation Test

[Tree diagram: a hypothetical language family (Family) divided into Groups A, B, and C; each group is divided into two subgroups (A1, A2, B1, B2, C1, C2); each subgroup contains three languages (for example, Languages A1.1, A1.2, and A1.3). Languages are assigned to Clusters 1 through 6.]

The figure illustrates a hypothetical language family. Gender languages and branches of the tree that include only gender languages are boxed and printed in red. Languages are assigned to clusters at the highest level of the language tree that shows no variation in grammatical gender.
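The cluster assignment rule stated in the Figure 4 caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the nested-dictionary tree representation, the function names, and the gender status given to each hypothetical language below are all assumptions made for the example (the figure's color coding is not reproduced here).

```python
# Illustrative sketch: assign languages to clusters at the highest level of a
# language tree that shows no variation in grammatical gender.
# A leaf is a bool (True = gender language); an internal node is a dict
# mapping child names to child nodes. This representation is an assumption.

def leaves(node, name):
    """Return (language name, is_gender_language) pairs below a node."""
    if isinstance(node, bool):
        return [(name, node)]
    pairs = []
    for child_name, child in node.items():
        pairs.extend(leaves(child, child_name))
    return pairs


def assign_clusters(node, name="Family"):
    """Return a list of clusters, each a list of language names."""
    members = leaves(node, name)
    statuses = {is_gender for _, is_gender in members}
    if len(statuses) == 1:
        # No variation below this node: the whole branch is one cluster.
        return [[language for language, _ in members]]
    clusters = []
    for child_name, child in node.items():
        # Variation below this node: descend one level and try again.
        clusters.extend(assign_clusters(child, child_name))
    return clusters


# Hypothetical gender statuses, chosen only so the example yields six clusters.
family = {
    "Group A": {
        "Group A1": {"A1.1": True, "A1.2": True, "A1.3": True},
        "Group A2": {"A2.1": True, "A2.2": True, "A2.3": True},
    },
    "Group B": {
        "Group B1": {"B1.1": True, "B1.2": False, "B1.3": True},
        "Group B2": {"B2.1": False, "B2.2": False, "B2.3": False},
    },
    "Group C": {
        "Group C1": {"C1.1": False, "C1.2": False, "C1.3": False},
        "Group C2": {"C2.1": False, "C2.2": False, "C2.3": False},
    },
}

clusters = assign_clusters(family)
for index, cluster in enumerate(clusters, start=1):
    print(f"Cluster {index}: {', '.join(cluster)}")
```

Under these assumed statuses, a branch with no internal variation (such as all of Group A) forms a single cluster, while a subgroup with internal variation (Group B1 here) is split down to individual languages.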
Figure 5: Permutation Tests

Panel A: Female Labor Force Participation
Panel B: Gender Difference in Labor Force Participation

Table 1: Cross-Country OLS Regressions of Labor Force Participation

Dependent variable: LFPf in columns (1)-(3); LFPf - LFPm in columns (4)-(6)

Specification:                        OLS        OLS        OLS        OLS        OLS        OLS
                                      (1)        (2)        (3)        (4)        (5)        (6)
Proportion speaking gender language   -13.82∗∗∗  -17.65∗∗∗  -11.89∗∗∗  -11.59∗∗∗  -18.86∗∗∗  -14.64∗∗∗
                                      (2.80)     (3.52)     (3.35)     (2.47)     (3.14)     (3.25)
Constant                              57.42∗∗∗   60.60∗∗∗   63.53∗∗∗   -16.93∗∗∗  -13.41∗∗∗  -3.22
                                      (1.46)     (2.25)     (6.16)     (1.20)     (1.64)     (4.71)
Continent Fixed Effects               No         Yes        Yes        No         Yes        Yes
Geography Controls                    No         No         Yes        No         No         Yes
Observations                          178        178        178        178        178        178
R²                                    0.15       0.25       0.33       0.12       0.41       0.47

Robust standard errors clustered by most widely spoken language in all specifications. LFPf is the percentage of women in the labor force, measured in 2011. LFPf - LFPm is the gender difference in labor force participation (i.e. the difference between female and male labor force participation), again measured in 2011. Geography controls are the percentage of land area in the tropics or subtropics, average yearly precipitation, average temperature, an indicator for being landlocked, and the Alesina et al. (2013) measure of suitability for the plough.

Table 2: Cross-Country OLS Regressions of Primary School Completion

Dependent variable: Prif in columns (1)-(3); Prif - Prim in columns (4)-(6)

Specification:                        OLS        OLS        OLS        OLS        OLS        OLS
                                      (1)        (2)        (3)        (4)        (5)        (6)
Proportion speaking gender language   14.86∗∗    -4.72      -6.72      1.23       -3.87∗     -3.72∗
                                      (5.83)     (4.45)     (4.41)     (2.14)     (2.04)     (2.17)
Constant                              61.83∗∗∗   40.80∗∗∗   79.24∗∗∗   -6.62∗∗∗   -10.64∗∗∗  -6.93
                                      (5.07)     (3.56)     (8.30)     (1.58)     (1.96)     (4.37)
Continent Fixed Effects               No         Yes        Yes        No         Yes        Yes
Geography Controls                    No         No         Yes        No         No         Yes
Observations                          142        142        142        142        142        142
R²                                    0.06       0.53       0.61       0.003      0.18       0.2

Robust standard errors clustered by most widely spoken language in all specifications. Prif is the rate of primary school completion among women.
Prif - Prim is the gender difference in primary school completion (i.e. the difference between female and male primary school completion). Geography controls are the percentage of land area in the tropics or subtropics, average yearly precipitation, average temperature, an indicator for being landlocked, and the Alesina et al. (2013) measure of suitability for the plough.

Table 3: Cross-Country OLS Regressions of Secondary School Completion

Dependent variable: Secf in columns (1)-(3); Secf - Secm in columns (4)-(6)

Specification:                        OLS        OLS        OLS        OLS        OLS        OLS
                                      (1)        (2)        (3)        (4)        (5)        (6)
Proportion speaking gender language   14.58∗∗    -1.62      0.45       0.48       0.71       -0.87
                                      (5.78)     (4.23)     (3.71)     (1.93)     (2.32)     (2.36)
Constant                              32.13∗∗∗   14.21∗∗∗   66.35∗∗∗   -4.17∗∗∗   -5.37∗∗∗   -3.23
                                      (4.33)     (2.42)     (8.99)     (1.06)     (1.25)     (3.73)
Continent Fixed Effects               No         Yes        Yes        No         Yes        Yes
Geography Controls                    No         No         Yes        No         No         Yes
Observations                          142        142        142        142        142        142
R²                                    0.06       0.47       0.67       0.0008     0.07       0.1

Robust standard errors clustered by most widely spoken language in all specifications. Secf is the rate of secondary school completion among women. Secf - Secm is the gender difference in secondary school completion (i.e. the difference between female and male secondary school completion). Geography controls are the percentage of land area in the tropics or subtropics, average yearly precipitation, average temperature, an indicator for being landlocked, and the Alesina et al. (2013) measure of suitability for the plough.

Table 4: Cross-Country OLS Regressions of Gender Attitudes

Dependent variable: Gender Attitudes Index

Specification:                        OLS        OLS        OLS
                                      (1)        (2)        (3)
Proportion speaking gender language   -0.03      -0.11∗∗∗   -0.12∗∗∗
                                      (0.05)     (0.03)     (0.04)
Constant                              0.54∗∗∗    0.49∗∗∗    0.52∗∗∗
                                      (0.03)     (0.02)     (0.04)
Continent Fixed Effects               No         Yes        Yes
Geography Controls                    No         No         Yes
Observations                          56         56         56
R²                                    0.01       0.74       0.78

Robust standard errors clustered by most widely spoken language in all specifications.
The Gender Attitudes Index is constructed by taking the first principal component of the 8 World Values Survey questions relating to gender norms (described in Figure 3) at the individual level, and then calculating the average of this index within a country. Numbers closer to 1 indicate more support for gender equality. Geography controls are the percentage of land area in the tropics or subtropics, average yearly precipitation, average temperature, an indicator for being landlocked, and the Alesina et al. (2013) measure of suitability for the plough.

Table 5: Gender Attitudes Index among Women vs. Men

Dependent variable: Gender Attitude Index

Sample restriction:                   Women      Women      Women      Men        Men        Men
Specification:                        OLS        OLS        OLS        OLS        OLS        OLS
                                      (1)        (2)        (3)        (4)        (5)        (6)
Proportion speaking gender language   -0.02      -0.09∗∗∗   -0.1∗∗     -0.04      -0.12∗∗∗   -0.14∗∗∗
                                      (0.05)     (0.03)     (0.04)     (0.06)     (0.03)     (0.04)
Constant                              0.56∗∗∗    0.53∗∗∗    0.58∗∗∗    0.5∗∗∗     0.44∗∗∗    0.46∗∗∗
                                      (0.03)     (0.02)     (0.04)     (0.03)     (0.02)     (0.05)
Continent Fixed Effects               No         Yes        Yes        No         Yes        Yes
Geography Controls                    No         No         Yes        No         No         Yes
Observations                          56         56         56         56         56         56
R²                                    0.005      0.69       0.73       0.02       0.74       0.78

Robust standard errors clustered by most widely spoken language in all specifications. Female Labor Force Participation is the percentage of women in the labor force, measured in 2011. The Gender Difference in Labor Force Participation is the difference between male and female labor force participation in 2011. Geography controls are the percentage of land area in the tropics or subtropics, average yearly precipitation, average temperature, an indicator for being landlocked, and the Alesina et al. (2013) measure of suitability for the plough.
Table 6: Robust Inference: Manski-Imbens Worst-Case 95-Percent Confidence Intervals

                                                    Naïve OLS CI          Imbens-Manski CI
Female labor force participation                    [−18.510, −5.269]     [−18.452, −5.006]
Gender difference in labor force participation      [−21.070, −8.204]     [−20.915, −7.737]
Female primary school completion                    [−15.462, 2.031]      n/a
Gender difference in primary school completion      [−8.017, 0.578]       n/a
Female secondary school completion                  [−6.908, 7.806]       n/a
Gender difference in secondary school completion    [−5.538, 3.798]       n/a
Gender attitudes index                              [−0.193, −0.045]      [−0.194, −0.047]
Gender attitudes index among women                  [−0.173, −0.022]      [−0.173, −0.023]
Gender attitudes index among men                    [−0.214, −0.063]      [−0.215, −0.064]

Confidence intervals estimated following procedures outlined in Section 4.5.1. For each outcome, the naïve confidence interval comes from the associated regression in a previous table. The Imbens-Manski worst-case confidence interval is calculated by finding the minimum and maximum possible point estimates of the relevant coefficient based on the interval nature of the dataset (without complete data on the grammatical structure of all languages, the right-hand-side variable, the fraction of a country’s population speaking a gender language, is only observed up to an interval in some cases), then by tightening the confidence interval for correct coverage following Imbens and Manski (2004). Worst-case confidence intervals are only calculated when the naïve CI does not include zero.
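The tightening step described in the note to Table 6 can be illustrated with a short sketch of the Imbens and Manski (2004) critical value. Given the minimum and maximum point estimates and a standard error for each, the critical value c solves Φ(c + Δ / max(se_l, se_u)) − Φ(−c) = 0.95, where Δ is the distance between the two estimates, and the interval is [lower − c·se_l, upper + c·se_u]. The function names and the input numbers below are hypothetical and are not taken from the paper; how the paper obtains the standard errors at the two bounds is not shown in this section, so that part is an assumption.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def imbens_manski_critical_value(lower_est, upper_est, se_lower, se_upper,
                                 coverage=0.95):
    """Solve Phi(c + width / max_se) - Phi(-c) = coverage for c by bisection."""
    width = upper_est - lower_est
    if width < 0:
        raise ValueError("upper_est must be at least lower_est")
    scaled_width = width / max(se_lower, se_upper)

    def excess_coverage(c):
        return normal_cdf(c + scaled_width) - normal_cdf(-c) - coverage

    low, high = 0.0, 10.0  # excess_coverage is negative at 0, positive at 10
    for _ in range(200):
        mid = 0.5 * (low + high)
        if excess_coverage(mid) < 0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def imbens_manski_ci(lower_est, upper_est, se_lower, se_upper, coverage=0.95):
    """Confidence interval for a parameter known only to lie in an interval."""
    c = imbens_manski_critical_value(lower_est, upper_est, se_lower, se_upper,
                                     coverage)
    return lower_est - c * se_lower, upper_est + c * se_upper


# Hypothetical inputs for illustration only (not estimates from the paper).
example_ci = imbens_manski_ci(lower_est=-2.0, upper_est=-1.5,
                              se_lower=0.4, se_upper=0.5)
print(example_ci)
```

When the two bounds coincide, the critical value reduces to the usual two-sided value (about 1.96 at 95 percent); as the identified interval grows wide relative to the standard errors, it approaches the one-sided value (about 1.645).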
Table 7: Robust inference: Language structure

                                                    Naïve OLS    Permutation-based
                                                    p-values     p-values
Female labor force participation                    0.00053      0.01940
Gender difference in labor force participation      0.00001      0.00960
Female primary school completion                    0.13096      0.16870
Gender difference in primary school completion      0.08908      0.09590
Female secondary school completion                  0.90388      0.92130
Gender difference in secondary school completion    0.71246      0.72480
Gender attitudes index                              0.00225      0.05530
Gender attitudes index among women                  0.01223      0.09920
Gender attitudes index among men                    0.00063      0.03430

P-values estimated using 10,000 permutations, following procedures outlined in Section 4.5.2. For each outcome, the naïve p-value comes from the associated regression in a previous table. The permutation-based p-value is the fraction of permutations in which the magnitude of the estimated coefficient (from a hypothetical permutation of the gender indicator that respects the cluster structure of the language tree) exceeds the magnitude of the estimated coefficient in the true (non-permuted) data set. The distributions underlying the first two rows are shown in Figure 5.

Table 8: Individual-Level Regressions of Women’s Labor Force Participation

Dependent variable: In Labor Force

Specification:                        OLS        OLS        OLS
                                      (1)        (2)        (3)
Native language is gender             -0.24∗∗∗   -0.2∗∗∗    -0.18∗∗∗
                                      (0.05)     (0.04)     (0.04)
Constant                              0.67∗∗∗    0.58∗∗∗    0.27∗∗∗
                                      (0.02)     (0.02)     (0.09)
Country-Wave Fixed Effects            No         Yes        Yes
Individual Controls                   No         No         Yes
Observations                          13154      13154      13154
R²                                    0.04       0.07       0.1

Robust standard errors clustered at the language level. The dependent variable is an indicator for being in the labor force (either working for a wage, self-employed, or actively seeking employment). Data is from Afrobarometer Rounds 2 through 5. The analysis includes data from Kenya, Niger, Nigeria, and Uganda; Niger was only added to the Afrobarometer in Round 5, while the other countries appear in all four rounds.
Individual controls are age and age-squared and indicators for identifying as Muslim, Catholic, Protestant, or another religion.

Table 9: Individual-Level Regressions of Gender Differences in Labor Force Participation

Dependent variable: In Labor Force

Specification:                        OLS        OLS        OLS
                                      (1)        (2)        (3)
Female × gender language              -0.17∗∗∗   -0.16∗∗∗   -0.11∗∗
                                      (0.05)     (0.05)     (0.05)
Native language is gender             -0.08∗∗∗   -0.04      -0.07∗∗∗
                                      (0.02)     (0.02)     (0.03)
Female                                -0.1∗∗∗    -0.1∗∗∗    0.06
                                      (0.01)     (0.01)     (0.04)
Constant                              0.77∗∗∗    0.7∗∗∗     0.24∗∗∗
                                      (0.01)     (0.02)     (0.07)
Country-Wave Fixed Effects            No         Yes        Yes
Individual Controls                   No         No         Yes
Observations                          26328      26328      26328
R²                                    0.04       0.07       0.12

Robust standard errors clustered at the language level. The dependent variable is an indicator for being in the labor force (either working for a wage, self-employed, or actively seeking employment). Data is from Afrobarometer Rounds 2 through 5. The analysis includes data from Kenya, Niger, Nigeria, and Uganda; Niger was only added to the Afrobarometer in Round 5, while the other countries appear in all four rounds. Individual controls are age and age-squared and indicators for identifying as Muslim, Catholic, Protestant, or another religion, plus interactions between these controls and the female dummy.

Table 10: Individual-Level OLS Regressions of Women’s Education

Dependent variable: Primary School in columns (1)-(3); Secondary School in columns (4)-(6)

Specification:                        OLS        OLS        OLS        OLS        OLS        OLS
                                      (1)        (2)        (3)        (4)        (5)        (6)
Native language is gender language    -0.31∗∗∗   -0.3∗∗∗    -0.22∗∗∗   -0.19∗∗∗   -0.23∗∗∗   -0.16∗∗∗
                                      (0.04)     (0.06)     (0.05)     (0.04)     (0.06)     (0.04)
Constant                              0.7∗∗∗     0.67∗∗∗    0.93∗∗∗    0.35∗∗∗    0.33∗∗∗    0.49∗∗∗
                                      (0.03)     (0.03)     (0.04)     (0.04)     (0.03)     (0.05)
Country-Wave Fixed Effects            No         Yes        Yes        No         Yes        Yes
Individual Controls                   No         No         Yes        No         No         Yes
Observations                          13142      13142      13142      13142      13142      13142
R²                                    0.06       0.12       0.21       0.02       0.1        0.15

Robust standard errors clustered at the language level.
The dependent variable is an indicator for being in the labor force (either working for a wage, self-employed, or actively seeking employment). Data is from Afrobarometer Rounds 2 through 5. The analysis includes data from Kenya, Niger, Nigeria, and Uganda; Niger was only added to the Afrobarometer in Round 5, while the other countries appear in all four rounds. Individual controls are age and age-squared and indicators for identifying as Muslim, Catholic, Protestant, or another religion.

Table 11: Individual-Level OLS Regressions of Gender Differences in Education

Dependent variable: Primary School in columns (1)-(3); Secondary School in columns (4)-(6)

Specification:                        OLS        OLS        OLS        OLS        OLS        OLS
                                      (1)        (2)        (3)        (4)        (5)        (6)
Female × gender language              -0.12∗∗∗   -0.11∗∗∗   -0.11∗∗∗   -0.06∗∗∗   -0.06∗∗∗   -0.06∗∗∗
                                      (0.01)     (0.01)     (0.02)     (0.01)     (0.01)     (0.02)
Female                                -0.08∗∗∗   -0.08∗∗∗   0.05       -0.08∗∗∗   -0.08∗∗∗   0.08∗
                                      (0.009)    (0.009)    (0.05)     (0.009)    (0.008)    (0.05)
Native language is gender language    -0.19∗∗∗   -0.17∗∗∗   -0.1∗∗∗    -0.13∗∗∗   -0.17∗∗∗   -0.1∗∗∗
                                      (0.04)     (0.05)     (0.04)     (0.04)     (0.05)     (0.03)
Constant                              0.78∗∗∗    0.76∗∗∗    0.88∗∗∗    0.43∗∗∗    0.41∗∗∗    0.41∗∗∗
                                      (0.02)     (0.03)     (0.06)     (0.04)     (0.03)     (0.05)
Country-Wave Fixed Effects            No         Yes        Yes        No         Yes        Yes
Individual Controls                   No         No         Yes        No         No         Yes
Observations                          26294      26294      26294      26294      26294      26294
R²                                    0.06       0.12       0.21       0.03       0.11       0.15

Robust standard errors clustered at the language level. The dependent variable is an indicator for being in the labor force (either working for a wage, self-employed, or actively seeking employment). Data is from Afrobarometer Rounds 2 through 5. The analysis includes data from Kenya, Niger, Nigeria, and Uganda; Niger was only added to the Afrobarometer in Round 5, while the other countries appear in all four rounds. Individual controls are age and age-squared and indicators for identifying as Muslim, Catholic, Protestant, or another religion, plus interactions between these controls and the female dummy.
Table 12: Coefficient Stability

                                                  $\mathring{\beta}$  $\tilde{\beta}$  $\beta^{*}(R_{\max}, 1)$  $\delta^{*}$
Panel A. Cross-Country Regressions
Female labor force participation                  -13.82              -11.89           -8.30                     1.44
Gender difference in labor force participation    -11.59              -14.64           -17.85                    3.23
Female primary school completion                  14.86               -6.72            -19.47                    $\delta < 0$
Gender difference in primary school               1.23                -3.72            -6.28                     $\delta < 0$
Female secondary school completion                14.58               0.45             -9.74                     0.05
Gender difference in secondary school             0.48                -0.87            -1.80                     $\delta < 0$
Gender attitude index                             -0.03               -0.12            -0.20                     $\delta < 0$
Gender attitudes among women                      -0.02               -0.10            -0.18                     $\delta < 0$
Gender attitudes among men                        -0.04               -0.14            -0.23                     $\delta < 0$
Panel B. Individual-Level Regressions
In labor force                                    -0.24               -0.18            -0.13                     2.11
Female × in labor force                           -0.17               -0.11            -0.06                     1.86
Completed primary school                          -0.31               -0.22            -0.15                     2.18
Female × completed primary school                 -0.12               -0.11            -0.10                     4.64
Completed secondary school                        -0.19               -0.16            -0.14                     3.47
Female × completed secondary school               -0.06               -0.06            -0.06                     6.01

Parameters estimated following procedures outlined in Altonji, Elder, and Taber (2005) and Oster (2017). $\mathring{\beta}$ is the coefficient of interest from a bivariate regression. $\tilde{\beta}$ is the coefficient from a regression that includes the full set of observable controls. $\beta^{*}(R_{\max}, 1)$ is the implied causal impact of grammatical gender on each outcome assuming a proportional selection coefficient ($\delta$) equal to 1 and a maximum $R^2$ equal to 1.3 times the $R^2$ from the regression with controls (Oster 2017). $\delta^{*}$ is the proportional selection coefficient required to explain the observed relationship under the null hypothesis of no causal effect of grammatical gender on outcomes of interest.
A Online Appendix: not for print publication

Table A1: Cross-Country Regressions of LFP Ratio

Dependent variable: LFPratio

Specification:                        OLS        OLS        OLS
                                      (1)        (2)        (3)
Proportion speaking gender language   -0.16∗∗∗   -0.25∗∗∗   -0.18∗∗∗
                                      (0.03)     (0.04)     (0.04)
Constant                              0.77∗∗∗    0.81∗∗∗    0.92∗∗∗
                                      (0.02)     (0.02)     (0.06)
Continent Fixed Effects               No         Yes        Yes
Geography Controls                    No         No         Yes
Observations                          178        178        178
R²                                    0.13       0.37       0.44

Robust standard errors clustered by most widely spoken language in all specifications. Female Labor Force Participation is the percentage of women in the labor force, measured in 2011. The Gender Difference in Labor Force Participation is the difference between male and female labor force participation in 2011. Geography controls are the percentage of land area in the tropics or subtropics, average yearly precipitation, average temperature, an indicator for being landlocked, and the Alesina et al. (2013) measure of suitability for the plough.

Table A2: Cross-Country Regressions of LFP, Including “Bad” Controls

Dependent variable:                   Female LFP   Gender Diff.
Specification:                        OLS          OLS
                                      (1)          (2)
Proportion speaking gender language   -6.60∗∗      -10.39∗∗∗
                                      (3.21)       (2.84)
Constant                              67.50∗∗∗     1.22
                                      (9.27)       (6.14)
Continent Fixed Effects               Yes          Yes
Geography Controls                    Yes          Yes
Bad Controls                          Yes          Yes
Observations                          176          176
R²                                    0.57         0.68

Robust standard errors clustered by most widely spoken language in all specifications. Female LFP is the percentage of women in the labor force, measured in 2011. Gender Diff. denotes the gender difference in labor force participation, which is the difference between male and female labor force participation in 2011. Geography controls are the percentage of land area in the tropics or subtropics, average yearly precipitation, average temperature, an indicator for being landlocked, and the Alesina et al. (2013) measure of suitability for the plough.
Bad controls are log GDP per capita (in 2011), log population (in 2011), and the percentage Catholic, Protestant, other Christian, Muslim, and Hindu (taken from Alesina et al. 2013), and an indicator for former communist countries.

Table A3: Cross-Country Regressions of LFP, Dropping Major World Languages

Dependent variable: Female Labor Force Participation in columns (1)-(3); Gender Difference in Labor Force Participation in columns (4)-(6)

Omitted Language:                     Arabic     English    Spanish    Arabic     English    Spanish
Specification:                        OLS        OLS        OLS        OLS        OLS        OLS
                                      (1)        (2)        (3)        (4)        (5)        (6)
Proportion speaking gender language   -6.18∗     -12.30∗∗∗  -10.05∗∗∗  -9.09∗∗∗   -15.28∗∗∗  -11.27∗∗∗
                                      (3.56)     (3.84)     (3.87)     (3.52)     (3.59)     (3.39)
Constant                              67.37∗∗∗   62.63∗∗∗   62.97∗∗∗   -1.03      -4.34      -3.47
                                      (6.12)     (6.53)     (6.34)     (4.72)     (4.98)     (4.98)
Continent Fixed Effects               No         Yes        Yes        No         Yes        Yes
Geography Controls                    No         No         Yes        No         No         Yes
Observations                          159        167        160        159        167        160
R²                                    0.21       0.34       0.37       0.31       0.49       0.51

Robust standard errors clustered by most widely spoken language in all specifications. Female Labor Force Participation is the percentage of women in the labor force, measured in 2011. The Gender Difference in Labor Force Participation is the difference between male and female labor force participation in 2011. Geography controls are the percentage of land area in the tropics or subtropics, average yearly precipitation, average temperature, an indicator for being landlocked, and the Alesina et al. (2013) measure of suitability for the plough.

Table A4: Cross-Country Regressions of LFP, Weak vs. Strong Gender Categories

Dependent variable: Female Labor Force Participation in columns (1)-(3); Gender Difference in Labor Force Participation in columns (4)-(6)

Specification:                                    OLS        OLS        OLS        OLS        OLS        OLS
                                                  (1)        (2)        (3)        (4)        (5)        (6)
Proportion speaking gender language               -6.65∗∗∗   -8.09∗∗    -7.18∗     4.30∗∗∗    -5.59      -5.77
                                                  (2.54)     (3.63)     (3.91)     (1.65)     (4.23)     (4.34)
Proportion speaking dichotomous gender language   -10.58∗∗   -11.61∗∗∗  -6.55      -23.44∗∗∗  -16.13∗∗∗  -12.34∗∗∗
                                                  (4.79)     (3.86)     (4.16)     (3.55)     (4.19)     (4.53)
Constant                                          57.44∗∗∗   61.04∗∗∗   62.85∗∗∗   -16.89∗∗∗  -12.80∗∗∗  -4.49
                                                  (1.46)     (2.24)     (6.12)     (1.20)     (1.57)     (4.68)
Continent Fixed Effects                           No         Yes        Yes        No         Yes        Yes
Geography Controls                                No         No         Yes        No         No         Yes
Observations                                      178        178        178        178        178        178
R²                                                0.19       0.27       0.33       0.3        0.46       0.5

Robust standard errors clustered by most widely spoken language in all specifications. Female Labor Force Participation is the percentage of women in the labor force, measured in 2011. The Gender Difference in Labor Force Participation is the difference between male and female labor force participation in 2011. Geography controls are the percentage of land area in the tropics or subtropics, average yearly precipitation, average temperature, an indicator for being landlocked, and the Alesina et al. (2013) measure of suitability for the plough.