Report No: ACS19286

Somali High Frequency Survey
Wave 1 Preliminary Results

30th June, 2016

GPV01
AFRICA

Document of the World Bank

Standard Disclaimer:

This volume is a product of the staff of the International Bank for Reconstruction and Development/ The World Bank. The findings, interpretations, and conclusions expressed in this paper do not necessarily reflect the views of the Executive Directors of The World Bank or the governments they represent. The World Bank does not guarantee the accuracy of the data included in this work. The boundaries, colors, denominations, and other information shown on any map in this work do not imply any judgment on the part of The World Bank concerning the legal status of any territory or the endorsement or acceptance of such boundaries.

Copyright Statement:

The material in this publication is copyrighted. Copying and/or transmitting portions or all of this work without permission may be a violation of applicable law. The International Bank for Reconstruction and Development/ The World Bank encourages dissemination of its work and will normally grant permission to reproduce portions of the work promptly. For permission to photocopy or reprint any part of this work, please send a request with complete information to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA, telephone 978-750-8400, fax 978-750-4470, http://www.copyright.com/. All other queries on rights and licenses, including subsidiary rights, should be addressed to the Office of the Publisher, The World Bank, 1818 H Street NW, Washington, DC 20433, USA, fax 202-522-2422, e-mail pubrights@worldbank.org.

This document was prepared by Johan Mistiaen (TTL; Senior Economist, GPV01) and Utz Pape (TTL; Economist, GPV01) with substantial contributions from Gonzalo Nunez (Consultant, GPV01) and Philip Wollburg (Consultant, GPV01).
The team is grateful for inputs and comments from Abdulqafar Abdullahi (Consultant, GMF07), Kevin Carey (Lead Economist, GMF07), Andrew Dabalen (Lead Economist, GPV01), John Randa (Senior Economist, GMF07) and Yutaka Yoshino (Program Leader, AFCE1).

Vice President: Makhtar Diop
Country Director: Bella Bird
Senior Director: Ana Revenga
Practice Manager: Pablo Fajnzylber
Task Team Leaders: Johan Mistiaen & Utz Pape

Somali High Frequency Survey Wave 1
Preliminary Summary

Overview

Somalia's history of civil war and political insecurity has resulted in a lack of socioeconomic, perception and other key data. The Somalia Socioeconomic Survey 2002 was the last Somalia-wide representative survey. This lack of data makes it difficult for the government and its development partners to plan and implement the policies and programs needed to support economic growth and stability. In particular, the lack of poverty numbers undermined the development of an interim poverty reduction strategy paper, which is required to apply for HIPC debt relief. The Somali High Frequency Survey closes this crucial data gap.

The first wave of the Somali High Frequency Survey was conducted as part of the Somalia Poverty TA program in February 2016. This summary document describes the methodology and presents a few preliminary findings of the first wave. A more comprehensive compilation of preliminary findings is available in the accompanying presentation. An in-depth analysis has not yet been conducted but is proposed in the last section of this document.

Methodology

Data collection in Somalia is challenging due to insecurity in some areas. Traditional sampling methodologies require a full listing of enumeration areas, which is impossible in insecure areas. In addition, face-to-face time is limited to about 60 minutes, while a full consumption questionnaire takes 90 to 120 minutes.
Finally, limited field access makes monitoring of data quality difficult.

The poverty team developed solutions to overcome these challenges and allow household consumption data collection in Somalia. The new solutions were tested in a pilot survey in Mogadishu. The first challenge, sampling, was resolved by employing a segmentation approach instead of requiring a full listing in insecure areas. The second challenge, limited face-to-face time, was overcome by a newly developed methodology to collect consumption data in 60 minutes. An ex-post simulation using data from Hargeisa showed that the methodology is able to provide accurate poverty estimates. The third challenge, limited monitoring, was tackled by designing a remote real-time data monitoring system. Implementing these innovations in the Somali High Frequency Survey ensured high data quality despite the limitations on field monitoring.

The sample design had to be adapted because enumeration area maps were missing. The survey was originally planned to rely on the sample frame provided by PESS. However, maps for a large number of rural enumeration areas were not available. Therefore, the sample design was altered for areas without existing maps. For those areas, settlement data from PESS as well as from UNDP (2005) were used to create a sample frame. The draft sample frame was cleaned by merging duplicate enumeration areas and by splitting larger settlements into multiple enumeration areas. Boundaries of the enumeration areas were constructed as circles and then transformed into non-overlapping Thiessen polygons [1]. The survey covered Somalis living in Mogadishu, in urban and rural areas in Puntland and Somaliland, and in Internally Displaced Persons (IDP) settlements. The survey did not include the nomadic population, which represents about one quarter of the Somali population (according to PESS).
This ad hoc approach to creating the missing sample frame aimed to ensure representativeness of the covered population but has technical limitations, which should be kept in mind when interpreting results.

[1] Two overlapping circles become two polygons whose shared boundary is given by the intersection of the two original circles.

Table 1: Sample Properties

                               Overall      Mogadishu   Other Urban   Rural       IDP Settlements   Nomads
Sample Size (Households)       4,117        816         2,048         822         431               0
Covered Households             923,092      187,246     445,113       88,770      201,963           0
Sample Size (Individuals)      21,026       3,619       11,123        4,094       2,190             0
Covered Individuals            4,930,351    895,915     2,459,482     463,266     1,111,689         0
Population (PESS)              12,316,895   1,280,939   3,935,453     2,806,787   1,106,751         3,186,965
Population Covered             40%          70%         62%           17%         100%              0%
Number of Enumeration Areas    341          67          170           69          35                0

The questionnaire of the survey is focused on consumption. Consumption is measured using the Rapid Consumption Methodology. The approach partitions consumption items into core and optional modules. Only the core and one optional module are administered to each household. The missing consumption information is imputed within the survey using multiple imputation techniques. The questionnaire also covered livestock and perception data. In more secure areas, a long form of the questionnaire was administered, including modules for income and remittances, household enterprises, and shocks.

Poverty is estimated using the international 1.90 USD 2011 PPP poverty line. As the poverty line is defined in USD 2011 PPP, it must be converted to the currency used to measure consumption in the survey. First, USD 2011 PPP are converted into Somali Shilling in 2011 using the regression-based PPP estimate for Somalia. Second, the change in purchasing power per Somali Shilling is accounted for by estimating inflation from 2011 to 2016.
In the absence of a CPI, the consumption shares of the survey are used together with prices collected from 2011 to 2016 by the Food Security and Nutrition Analysis Unit (FSNAU) Somalia led by FAO. Third, the poverty line is converted back to USD using the current exchange rate. The resulting poverty line is 1.58 USD (2016) per day per person. The consumption estimates from the households are accordingly converted into USD using the region-specific exchange rates collected by the market price surveys. Food consumption is also spatially deflated using a Laspeyres deflator to ensure comparability.

Preliminary Findings

Poverty ranges from 35 to 71 percent across different parts of the Somali population. The poverty rate in Mogadishu is similar to that in other urban areas, while rural areas are poorer. Most people in IDP settlements are poor. Households receiving remittances are better off than households that do not receive remittances (Figure 1).

Figure 1: Poverty incidence (% of population living on less than $1.90 per day in 2011 PPP terms).
[Bar chart: poverty headcount rate, with the overall average, for Mogadishu, Other Urban, Rural, IDP Settlements, female-headed and male-headed households, and households with and without remittances.]

One in four working-age persons participates in the labor market. One in three of those is unemployed (Figure 2). Women often do not participate in the labor market because they engage in household work and/or are not allowed by their husbands to seek work. More than half of youth pursue education.

Figure 2: Education and Labor Status
[Stacked bar chart by group (Poor, Non-Poor, Youth, Men, Women, IDP Settlements, Rural, Other Urban, Mogadishu, Overall) with the categories Education Only; Education and Employment; Employment Only; Unemployed; Inactive, Household work; Inactive, Other; Inactive, Discouraged.]

More than half of the population is literate. Literacy is higher in urban than in rural areas (Figure 3).
Women are slightly less often literate. Wealthier households have higher literacy rates than poorer households.

Figure 3: Literacy
[Bar chart: percent literate by group, with the overall average.]

Somalis are optimistic about the future. Asked about their outlook on living standards and employment opportunities, 4 in 10 people are optimistic while only 2 in 10 people are pessimistic (Figure 4). People living in IDP settlements are generally more pessimistic. Wealthier and female-headed households are more optimistic.

Figure 4: Outlook on living standards and employment opportunities.
[Two stacked bar charts, Employment opportunities and Living standards, by group (Poor, Non-Poor, quintiles Q1 (bottom 20) to Q5 (top 20), Male Headed, Female Headed, IDP Settlements, Rural, Other Urban, Mogadishu, Overall) with the categories Getting Worse, About the same, Improving.]

Next Steps

The survey data offer the opportunity for an in-depth analysis of poverty and related socio-economic indicators to improve our understanding of poverty in Somalia. This work is planned for FY17, contributing to the preparation of the Systematic Country Diagnostic while also feeding into the Country Economic Memorandum.

Additional waves of the High Frequency Survey will help to understand poverty dynamics and can increase coverage, e.g. by including nomads. The first wave of the survey only gave a snapshot of poverty for the represented part of the population. Additional waves are planned, covering previously insecure areas as well as the nomadic population. The additional waves will be funded by the Somalia Multi-Partner Knowledge Fund.

The incomplete sampling frame for Somalia will negatively affect the representativeness of any survey in Somalia. Therefore, it is urgent to update the sampling frame for Somalia.
Based on the experience gained in complementing the existing sample frame for the first wave of the survey, it is proposed to support the Government of Somalia with technical assistance to construct a complete sample frame. Because security constraints limit the field work required to update the sampling frame, an approach based on satellite images with selected verification will be considered. Given the reach beyond the High Frequency Survey and the strong capacity building component in this undertaking, the TFSCB could be the appropriate vehicle for this.

Selected References

Himelein, K., S. Eckman, S. Murray, and J. Bauer (2016), “Second-stage sampling for conflict areas: methods and implications”, Policy Research Working Paper No. WPS 7617. Washington, D.C.: World Bank Group.
Pape, U. and J. Mistiaen (2015), “Measuring Household Consumption and Poverty in 60 Minutes: The Mogadishu High Frequency Survey”, World Bank.
World Bank (2015), “Informing the Somali High Frequency Survey”, Report No. ACS14146. Washington, D.C.: World Bank Group.

- DRAFT - Not For Circulation

Somali High Frequency Survey Wave I (February 2016)
Overview and Preliminary Results

Utz Johann Pape & Johan A. Mistiaen
Global Poverty and Equity Practice
The World Bank
August 30, 2016

Agenda
1. Methodology
   a) Sample
   b) Questionnaire
   c) Rapid Consumption Methodology
   d) Fieldwork Monitoring
   e) Poverty Measurement
2. Results
   a) Demographics
   b) Poverty
   c) Labor
   d) Education
   e) Access to Services
   f) Perceptions
3. Recap of Main Findings
4. Discussion and Next Steps

Context-adapted sample design
[Diagram: all pre-war regions of Somalia are divided into accessible areas, inaccessible areas and nomads. Accessible areas comprise the Mogadishu, Other Urban, Rural and IDP strata; the diagram labels each stratum with its map source (PESS EA maps, UNHCR maps, or no EA maps) and its listing method (micro-listing or full listing).]

Sample Design without EA maps
PESS maps were only available for some areas.
In areas without PESS maps, we designed a sample frame based on settlements recorded in other data sources.
1. Removal of duplicates
2. Demarcate boundaries in urban areas using Thiessen polygons
3. Check demarcation

Full vs. Micro-listing
Traditionally, all households in an enumeration area are listed before households are selected randomly for interviews. A full listing can raise suspicion in some areas. Thus, we opted for a micro-listing approach.

The micro-listing approach splits an enumeration area into multiple segments. Each segment is further split into blocks. Within a selected block, the enumerator records all housing structures. The tablet then randomly selects one structure, for which the enumerator records all households. Finally, the tablet randomly selects the household to be interviewed. This methodology provides unbiased estimates but reduces precision due to design effects introduced by the additional layers of hierarchy.
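The micro-listing steps described above can be sketched in a few lines of Python. This is only an illustration, not the survey's tablet software: the function name `micro_list`, the nested-dictionary layout of an enumeration area and the example identifiers are all hypothetical. It also assumes that segments and blocks are drawn uniformly at random (the text only states that structures and households are selected randomly) and that every structure contains at least one household.

```python
import random
from dataclasses import dataclass


@dataclass
class Selection:
    segment: str
    block: str
    structure: str
    household: str
    probability: float  # within-EA selection probability (uniform-draw assumption)


def micro_list(ea, rng):
    """Select one household from an enumeration area (EA).

    `ea` maps segment -> block -> structure -> list of households.
    Each stage is a uniform random draw (an assumption of this sketch),
    so the selection probability is the product of 1/size over the stages.
    """
    prob = 1.0
    segment = rng.choice(sorted(ea))          # stage 1: segment
    prob /= len(ea)
    blocks = ea[segment]
    block = rng.choice(sorted(blocks))        # stage 2: block within segment
    prob /= len(blocks)
    structures = blocks[block]                # enumerator records all structures
    structure = rng.choice(sorted(structures))  # stage 3: structure
    prob /= len(structures)
    households = structures[structure]        # enumerator records all households
    household = rng.choice(households)        # stage 4: household to interview
    prob /= len(households)
    return Selection(segment, block, structure, household, prob)


# Hypothetical enumeration area with two segments.
ea = {
    "seg-A": {
        "blk-1": {"s1": ["hh1", "hh2"], "s2": ["hh3"]},
        "blk-2": {"s3": ["hh4"]},
    },
    "seg-B": {"blk-3": {"s4": ["hh5", "hh6", "hh7"]}},
}

pick = micro_list(ea, random.Random(2016))
# The inverse of the selection probability is the within-EA design weight.
print(pick.household, 1 / pick.probability)
```

Because households in small structures, blocks or segments are selected with higher probability, the weights vary across households. That variation is one source of the design effects, and hence the loss of precision, mentioned above.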
Wave I surveyed 4,117 households across rural and urban areas and IDP settlements, representing 40 percent of the population

Sample properties of the SHFS

                                    Overall      Mogadishu   Other Urban   Rural       IDP Settlements   Nomads
Sample Size (Households)            4,117        816         2,048         822         431               0
Population (Covered Households)     923,092      187,246     445,113       88,770      201,963           0
Sample Size (Individuals)           21,026       3,619       11,123        4,094       2,190             0
Population (Covered Individuals)    4,930,351    895,915     2,459,482     463,266     1,111,689         0
Population (Total PESS)             12,316,895   1,280,939   3,935,453     2,806,787   1,106,751         3,186,965
Percentage of Population Covered    40%          70%         62%           17%         100%              0%
Number of Enumeration Areas         341          67          170           69          35                0

Notes
1. Wave I of the SHFS covered the following pre-war regions: Awdal, Banadir, Bari, Mudug, Nugaal, Sanaag, Sool, Togdheer, and Woqooyi Galbeed. Not included were Bakool, Bay, Galgaduud, Gedo, Hiraan, Lower Juba, Lower Shabelle, Middle Juba, and Middle Shabelle.
2. ‘Covered’ population includes extrapolation to inaccessible areas within covered pre-war regions. It is assumed that inaccessible areas are similar to the bottom 25 percent of enumeration areas in the same analytical strata. IDP settlements are scaled to all regions.
3. Percentage of Population Covered is based on PESS population estimates.

Questionnaire – Modules
- Household Roster (110 questions)
- Household Characteristics (38 questions)
- Consumption
  - Food (30 questions per item)
  - Non-Food (14 questions per item)
  - Livestock (39 questions per item)
  - Durables (16 questions per item)
- Perception (24 questions)
- Food Security* (24 questions)
- Income and Remittances* (14 questions)
- Household Enterprise* (172 questions)
- Shocks* (15 questions)
* Only administered in areas with full listing

Questionnaire – Dataset
- Household: 348 variables
- Household members: 148 variables
- Food: 33 variables
- Non-Food: 18 variables
- Durables: 30 variables
- Livestock: 33 variables
- Shocks: 16 variables

Rapid Consumption Methodology
In traditional household surveys, consumption is measured using a long list of more than 300 items. This takes multiple hours or days.
[Diagram: item lists under alternative questionnaire designs.
Full: maize (flour), maize (grain), millet (flour), millet (grain), apples, bananas, melons, pears, camel milk, cow milk, goat milk, milk powder.
Aggregated: maize, millet, vegetables, milk.
Reduced: maize (flour), millet (grain), bananas, melons, cow milk, milk powder.
Scaled: BUT scale factor usually unknown.
The diagram also marks the bias in consumption and in the poverty line for each design; interview durations are shown as 120’, 45’, 45’ and 45’.]

Rapid Consumption Methodology
[Diagram: consumption items are split into a core and optional modules 1 and 2. Group 1 and Group 2 each answer the core plus one optional module and skip the other; total consumption is imputed.]
• Items are partitioned into a core and multiple optional modules
• Households are assigned to the core and one optional module

Rapid Consumption Methodology: Pilot Simulation Results
[Two bar charts, Relative Bias and Relative Error (axes from -1.5% to 2.0%), for FGT0, FGT1, FGT2 and for consumption at the household (HH), enumeration area (EA) and overall (All) level.]
The Rapid Consumption Methodology performs well, as the simulation results indicate. The simulation uses household consumption survey data and compares indicators based on full consumption with indicators based on ex ante implemented Rapid Consumption.
Source: Somaliland Household Survey 2012

Rapid Consumption Methodology – Mogadishu 2015 Pilot

Food Consumption
            Number of Items   Share Hargeisa   Share Mogadishu   Share Mogadishu Imputed
Core        33                91%              64%               54%
Module 1    19                3%               9%                16%
Module 2    20                2%               14%               14%
Module 3    15                2%               5%                6%
Module 4    15                2%               8%                9%

Non-Food Consumption
            Number of Items   Share Hargeisa   Share Mogadishu   Share Mogadishu Imputed
Core        26                76%              62%               52%
Module 1    15                7%               9%                12%
Module 2    15                5%               9%                12%
Module 3    15                6%               8%                9%
Module 4    15                6%               11%               15%

Only recording consumption from ‘core’ items will result in severe under-estimation of consumption and, thus, over-estimation of poverty.
Source: Somaliland Household Survey 2012 and HFS Mogadishu 2015 Pilot

Field Monitoring
Data collection was monitored daily using a real-time monitoring system.
1. Number of interviews
[Chart: number of interviews per day (axis 0 to 120) and cumulative valid and successful interviews (axis 0 to 1,200) against the target number of interviews.]
2. Allocation of Optional Modules
[Bar chart: sum of the number of interviews by assigned optional module (Treat = 1 to 4); bar labels read 336, 249, 247 and 237.]
3. Duration of interviews by enumerators in minutes
[Chart: average of mean and of median interview duration in minutes (axis 0 to 180) by enumerator ID.]

Poverty Measurement from the SHFS
1. Consumption Aggregate
• For food and non-food items
• For assets by estimating consumption flow
• Impute ‘missing’ consumption values
2. Deflator
• Laspeyres: calculate spatial price indices using a common food basket and spatial prices
• Apply to food and non-food consumption aggregate
3. Define a Poverty Line based on 1.90 USD PPP 2011
• Convert 1.90 USD PPP to SSh in 2011
• Estimate inflation of SSh from 2011 to 2016 by a CPI-like index based on estimated consumption shares and FSNAU price data (food and non-food)
• Convert poverty line back to current USD using current exchange rate from SSh to USD
• Resulting poverty line: 1.58 USD (2016)

The population is predominantly young
Almost half of the population is less than 15 years old
[Two population pyramids, SHFS and PESS: share of men and women (0% to 12% on each side) by five-year age group from 0-4 years to 85+ years.]

Almost half of Somali households are headed by women
2 in 3 households in Mogadishu and IDP Settlements are headed by men
Percent of households headed by women
[Bar chart: female household heads, with the overall average, for Mogadishu, Other Urban, Rural, IDP Settlements, quintiles Q1 (bottom 20) to Q5 (top 20), Non-Poor and Poor.]

The average household size is 5.3
Household size decreases with income
Number of members per household
[Bar chart: household size by group, with the overall average.]

The poverty headcount ranges from 35 to 71 percent
Poverty incidence (% of population living on less than $1.90 per day in 2011 PPP terms)
[Bar chart: poverty headcount rate by group, with the overall average.]

The population ranks among the poorest of the world
Poverty incidence (% of population living on less than $1.90 per day in 2011 PPP terms)
[Bar chart of poverty incidence by country; bar values 82, 77, 77, 71, 63, 62, 57, 52, 47, 43, 33, 29, 26.]

The poverty gap ranges from 14 to 39 percent
Poverty gap (% shortfall relative to poverty line)
[Bar chart: poverty severity, with the overall average, for Mogadishu, Other Urban, Rural, IDP Settlements, female-headed and male-headed households, and households with and without remittances.]

The distribution of per capita consumption expenditures rises steeply to the poverty line => highly elastic (pros and cons)
[Chart: cumulative percentage of the population (0 to 100) against daily consumption expenditure per capita in current US$ (0 to 8), with the poverty line and poverty rate marked.]

IDPs are the poorest, while urban areas outside Mogadishu are wealthiest along every point of the distribution
[Chart: cumulative percentage of the population against daily consumption expenditure per capita in current US$ (0 to 6.5) for Mogadishu, Other Urban, Rural and IDP Settlements.]

The top 20 percent consume seven times more than the bottom 20 percent
Daily consumption expenditure per capita by consumption quintile (current US$)

                  Overall   Mogadishu   Other Urban   Rural   IDP Settlements
Q1 (bottom 20)    0.52      0.52        0.57          0.60    0.49
Q2                0.94      0.95        0.94          0.92    0.93
Q3                1.38      1.37        1.38          1.40    1.39
Q4                2.05      2.02        2.05          2.06    2.11
Q5 (top 20)       3.77      3.76        3.85          3.53    3.27

A majority of poor households are in urban areas
3 in 10 poor households are in IDP Settlements
Percentage breakdown of the poor population by region
[Pie chart: Mogadishu 21%, Other Urban 38%, Rural 9%, IDP Settlements 32%.]

3 in 4 households did not experience hunger in February 2016
Households in IDP Settlements report hunger more often
Experience of hunger in the past 4 weeks
[Stacked bar chart by group (Poor, Non-Poor, quintiles Q1 (bottom 20) to Q5 (top 20), Male Headed, Female Headed, IDP Settlements, Rural, Other Urban, Mogadishu, Overall) with the categories Never, Rarely (1-2 times), Sometimes (3-10 times), Often (more than 10 times).]

Labor Market Statistics: Key Concepts (I/II)
• The working-age population (15 to 64 years) is made up of people who are either inside (‘active’) or outside of the labor force (‘inactive’). The working-age youth are those aged between 15 and 24 years.
• The labor force is made up of employed and unemployed people.
• Employed people are those who are of working age (15 to 64 years) and engaged in activities producing goods or providing services for at least one hour during the last 7 days. This includes workers who contributed within the family establishment.
• Unemployed people are those who are not employed but are looking for work and are available to work.
• Long-term unemployed are those who have been unemployed for at least 12 months.
• First-time job-searchers are those who are currently unemployed, looking for work, and have never worked before.
• Those outside of the labor force are called ‘inactive’; these are people who are not employed, not looking for work, and/or not available to work.

Labor Market Statistics: Key Concepts (II/II)
[Diagram: the working-age population (15 to 64 years) is split into those inside the labor force (employed, either more than 20 hours per week or 20 hours or less per week, and unemployed) and those outside the labor force (household work, education, other).]
In contrast to a labor force survey, this survey only asked the main respondent about labor activities for all household members. This can result in under-reporting of activity status, employment and activities to look for work.

1 in 4 working-age persons participate in the labor market.
More men than women are inside the labor force; inactivity is highest among the youth.
Labor Force Participation
[Chart: share in the labor force (‘active’) by group (axis 0% to 35%), with the overall average, and absolute size of the labor force by group (axis 0 to 400,000).]
* Youth are defined as the population aged 15 to 24 years

3 in 10 working-age persons are pursuing education
Among young people (15 to 24 years) more than half are pursuing education
Education and labor status
[Stacked bar chart by group (Poor, Non-Poor, Youth, Men, Women, IDP Settlements, Rural, Other Urban, Mogadishu, Overall) with the categories Education Only; Education and Employment; Employment Only; Unemployed; Inactive, Household work; Inactive, Other; Inactive, Discouraged.]

4 in 10 ‘inactive’ women aged 15 and older work in the household
Almost 3 in 10 ‘inactive’ men are in education
Reasons for inactivity
[Bar chart: reasons for inactivity for women and men (axis 0% to 60%).]

More than half of the labor force is looking for work
Unemployment is highest in IDP settlements; long-term unemployment is low
Many are looking for work for the first time
Employment and unemployment in detail
[Stacked bar chart by group with the categories Employed, more than 20 hours per week; Employed, 20 hours or less per week; Unemployed; Long-term Unemployed (12 months or more); Unemployed, First-time Job Searcher.]

Young people (15 to 24 years) are unemployed more often than adults (25 to 64 years)
Adult unemployment and youth unemployment (percentage of the active adult/youth population)
[Bar chart by group: Unemployment (25-64 years) and Youth Unemployment (15-24 years).]

Half of working adults are workers who receive a salary
Women more often work as own-account workers or contributing family workers
Status in employment
[Stacked bar chart by group (Poor, Non-Poor, quintiles Q1 (bottom 20) to Q5 (top 20), Men, Women, IDP Settlements, Rural, Other Urban, Mogadishu, Overall) with the categories Paid employee, Employer, Own account worker, Unpaid family worker, Unpaid working for others.]

More than half of individuals can read and write
Wealthier individuals and residents of urban areas are literate more often
Literacy rate (percent)
[Bar chart: percent literate by group, with the overall average.]

Less than half of the population has no education
Educational attainment is highest in urban areas and among wealthier households
The younger generation (15-29 years) is more educated than the older generations (30+ years)
Educational attainment
[Stacked bar chart by group with the categories No education, Incomplete Primary, Complete Primary/Incomplete Secondary, Complete Secondary, University, Other.]

More than half of children aged 6 to 17 are enrolled in school
Children in non-poor households are enrolled in school more often
Percent of school-aged children (6-17) enrolled in school
[Bar chart: enrollment rate by group.]

Enrollment increases between ages 6 and 11, indicating that children start school late
3 in 4 children between 11 and 17 go to school
Percent of children enrolled by age
[Chart: percent enrolled by single year of age, from 6 to 18.]

Location matters more than income level for access to high quality amenities (I)
Toilet type
[Stacked bar chart by group (Overall, Mogadishu, Other Urban, Rural, IDP Settlements, Female Headed, Male Headed, quintiles Q1 (bottom 20) to Q5 (top 20), Non-Poor, Poor) with the categories Water closet (flush), Pit latrine, Public toilet, Open space, Other.]
Water source
[Stacked bar chart by the same groups with the categories Piped water; Public tap; Borehole, protected well/spring; Tanker-truck, bottled water; Unprotected well/spring, others.]

Location matters more than income level for access to high quality amenities (II)
Drinking water quality
[Stacked bar chart by group with the categories Protected water source, Treated water, Untreated water.]

4 in 10 households are optimistic about the future
Households in IDP Settlements are more pessimistic; wealthier households are more optimistic
[Two stacked bar charts, Employment opportunities and Living standards, by group (Poor, Non-Poor, quintiles Q1 (bottom 20) to Q5 (top 20), Male Headed, Female Headed, IDP Settlements, Rural, Other Urban, Mogadishu, Overall) with the categories Getting Worse, About the same, Improving.]

Most households feel safe
Households in Mogadishu and IDP Settlements feel least safe
[Two stacked bar charts, Safety in neighborhood and Safety in travelling, by the same groups with the categories Very unsafe, Unsafe, Somewhat safe, Very safe.]

Recap of Preliminary Findings
• The population is predominantly young
• 52% of the population covered by the SHFS lives in poverty (below $1.90 per day in 2011 PPP terms), ranking among the poorest in the world
• Households in IDP Settlements are most affected by poverty and unemployment
• More than half of the working-age population is ‘outside the labor force’
• Women are more often outside the labor force and working in the household
• More than half of people inside the labor force are unemployed, but almost half are optimistic about their labor market prospects
• The youth are better educated but also more often unemployed than adults
• Many Somali children enroll in school late

Technical Appendix: Wave 1 Somali High Frequency Survey (HFS)
Revised: 30th August 2016

This technical appendix describes sample design, cleaning and construction of consumption aggregates for the Wave 1 Somali High Frequency Survey data.

Introduction

Estimating monetary poverty rates requires a sound, reproducible methodology. The methodology starts with the sample design and continues with the questionnaire design, the construction of food and non-food consumption aggregates, the selection of spatial price deflators, the valuation of consumption derived from assets, and the construction of the poverty line. This appendix describes the methodology used to estimate poverty for the Wave 1 Somali High Frequency Survey.

The chosen methodology balances a trade-off between feasibility and accuracy. Somalia is a fragile country with severe security constraints for field work and widespread displacement. The sampling methodology was adapted to the context by excluding several inaccessible areas. The questionnaire design utilized the Rapid Consumption methodology, which can be easily and quickly implemented. The choice of deflators and the poverty line were driven by data quality.
A household is defined as poor if its per-capita household consumption does not exceed a given threshold:

$$y_i \le z \qquad (1)$$

where $y_i$ is the nominal per-capita household expenditure and $z$ is the poverty line at the nominal level. In the following, we discuss the selection of households $i$ as part of the sample design and present the construction of the consumption aggregate $y_i$ before discussing the choice of the poverty line $z$ and standard poverty measures.

Sample

The Population Estimation Survey of Somalia (PESS) was used as the sample frame, alongside a list of settlements from three different sources (UNDP 1997, UNDP 2006 and FSAU 2003) to complement missing rural and semi-urban settlements. The combined sample frame was cleaned and preprocessed before the number of enumeration areas per stratum was calculated and enumeration areas were selected with probability proportional to size. Depending on the stratum, different multi-stage clustering approaches were used to select households.

Sample Frame

Because different data sources were combined, the resulting sample frame included enumeration areas as well as settlements. While enumeration areas are defined as geographical areas with about 50 to 200 households, settlements are often larger areas with a larger population. In fact, all rural and a large fraction of semi-urban enumeration areas and settlements did not have boundaries available but were only defined by a GPS position. Since PESS is also partially based on the same data sources (especially UNDP 1997 and UNDP 2006) and since some PESS enumeration areas had the same GPS location, several GPS positions were very close to each other and were thus considered duplicates (Figure 1). Technically, two GPS positions are defined as duplicates if the distance between them is below 75 m. In groups with multiple duplicates, the additional criterion was introduced that all GPS positions must have pairwise distances below 200 m, to prevent long chains of neighboring GPS positions from being merged into one large area.
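The duplicate rule can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the procedure used for the survey: the function names, the greedy (order-dependent) grouping and the haversine distance are choices made here, and only the 75 m and 200 m thresholds come from the text.

```python
import math


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    radius_m = 6_371_000.0  # mean Earth radius (assumed spherical Earth)
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def group_duplicates(points, link_m=75.0, span_m=200.0):
    """Greedily group GPS positions into duplicate groups.

    A point joins an existing group if it is closer than link_m to at least
    one member and closer than span_m to every member; otherwise it starts a
    new group. Returns a list of groups, each a list of indices into points.
    """
    groups = []
    for idx, point in enumerate(points):
        for group in groups:
            dists = [haversine_m(point, points[j]) for j in group]
            if min(dists) < link_m and max(dists) < span_m:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


if __name__ == "__main__":
    # Hypothetical positions spaced roughly 56 m apart along a meridian.
    positions = [(2.0 + k * 0.0005, 45.0) for k in range(6)]
    print(group_duplicates(positions))
```

In the hypothetical example, the fifth position is within 75 m of its neighbor but more than 200 m from the first position, so it opens a new group instead of extending the chain.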
Duplicates were merged into one ‘hypothetical’ enumeration area tagged with the number of duplicates. These duplicate counts were used to manually position midpoints for new enumeration areas around the main duplicate GPS position, ensuring that larger settlements have an appropriate number of surrounding enumeration areas. [1]

In a second step, boundaries of enumeration areas without corresponding shapefiles were drawn automatically. First, the GPS positions were used as midpoints of circles with a radius of 200 m. Overlapping circles were transformed into Thiessen polygons, where the line connecting the points at which the circles overlap becomes the new boundary. The algorithm was tested for areas where PESS shapefiles were available (Figure 2).

[1] Note that this was only done for selected duplicate enumeration areas to reduce manual processing.

Figure 1: Example of duplicate GPS positions.

Figure 2: Test of Thiessen polygons with bold boundaries representing the known enumeration area boundaries.

Sample Stratification and Size

The sample is designed based on the predicted statistical precision of consumption as well as cost considerations. Without political implications, the survey stratifies the sample into four zones: A including Mogadishu, B including Garowe, C including Hergeiza, and D for Sanaag, Sool and Togdheer. Within each zone, the sample is stratified into economic/political centers, urban centers, other urban settlements, rural settlements and, where they exist, IDP camps.
The result is 16 strata (an asterisk marks strata where a micro-listing approach was used; see below):
• A: Mogadishu*; IDPs*
• B: Garowe; Urban Centers; Other Urban; Rural; IDPs*
• C: Hergeiza; Urban Centers; Other Urban; Rural; IDPs*
• D: Sanaag Urban; Sanaag Rural; Sool Urban; Sool Rural; Togdheer Urban; Togdheer Rural

The sample employs a clustered design with the enumeration area as the Primary Sampling Unit (PSU). Within each enumeration area, 12 households are selected for interviews. A larger number of households per enumeration area would only marginally benefit the statistical estimation of indicators. A smaller number of households would result in fewer than 3 observations for each of the four optional modules capturing consumption data. A total sample of about 3,800 households is sufficient to obtain consumption estimators with a relative standard error below 1 percent. After rounding the number of enumeration areas to ensure 12 households per enumeration area, 324 enumeration areas were initially selected.

The 324 enumeration areas are first distributed across the 16 strata. The number of enumeration areas per stratum is determined by (i) the population of the stratum, (ii) the variability of consumption within the stratum, and (iii) the requirement of at least two enumeration areas per stratum. Strata with a larger population and larger variability need a larger sample to achieve the same relative standard error as a stratum with a smaller population and lower consumption variability (Table 12). Variability is estimated based on previous surveys and a pilot in Mogadishu. The Mogadishu stratum was later extended by an additional 20 enumeration areas to correct for a faulty optional module assignment in the first days of data collection.

Household Selection

Depending on the stratum, different clustering approaches were used.
In strata with more volatile security as well as in IDP camps, a multi-stage cluster design called micro-listing was employed. Each selected enumeration area was divided into multiple segments, and each segment was further divided into blocks. A block is defined as a geographical area in which an enumerator can see (and list) all households from one location in the center of the block. Within each enumeration area, one segment was randomly selected, and within the segment 12 blocks were chosen. In each block, all structures were listed before one structure was randomly selected. Within the selected structure, all households were listed and one household was randomly selected for interview. This multi-stage clustering approach substantially reduces the time in the field and contributes to a lower profile of enumerators, which is paramount in fragile areas. In less volatile strata, the complete enumeration area was listed before 12 households were randomly selected for interviews (called full-listing).

Data Collection and Replacements

The survey was implemented using tablets as survey devices (CAPI). The data collection system consisted of Samsung smartphones equipped with SIM cards, mobile data plans, microSD cards (16 GB capacity), and external battery packs. The phones were secured with Android’s native encryption and protected by a password. A GPS tracker helped to track all devices through a web interface (www.gps-server.net), Barcode Scanner allowed the use of barcodes to identify enumerators, and a parental control application provided a safe, contained working environment for enumerators. Interviews were conducted using SurveyCTO Collect on the tablet, with data transmitted to a secure SurveyCTO server in a cloud computing environment.

EAs were replaced if security rendered field work unfeasible (Table 12). Replacements were approved by the project manager.
Replacements of households were approved by the supervisor after a total of three unsuccessful visits to the household.

Incoming data is processed to create a raw, consistent data set. Interviews with wrongly entered EAs were manually corrected. Interviews conducted outside sampled EAs were discarded. For duplicate submissions, only one record is kept. [2] Sampling weights are added to the final dataset, which is subsequently anonymized at the strata level. Missing values are recoded into four different types: (i) genuinely missing values, coded as “.”; (ii) respondent indicated “don’t know”, coded as “.a”; (iii) respondent refused to respond to the question, coded as “.b”; and (iv) values missing due to the questionnaire skip pattern because the question does not apply to the respondent, coded as “.z”.

[2] Two types of duplicate households are identified. Technical duplicates are defined as duplicate submissions of the same interview. They are identified as households with identical GPS data (latitude, longitude and altitude coordinates). Manual duplicates are defined as two interviews conducted with the same household. They are identified by almost identical household rosters. The interview with more information is kept, based on manual inspection.

Cleaning Process of Submissions

The total number of interviews submitted through SurveyCTO was 4,590, with the following breakdown by zone:
- A: 1,069
- B: 1,035
- C: 2,366
- D: 120

The first step corresponds to a cleaning process identifying general issues and inconsistencies with submissions.
- B: 1 empty household record was dropped
- C:
  - 3 household records were deleted because they were submitted through the web as part of a test to monitor scripts before fieldwork
  - 1 submission was dropped because it corresponds to a test made by a team leader to check whether the GPS of one of his enumerators' phones was working
  - 1 additional household record was dropped because it corresponds to an interview completed by the enumerator to check that he had the latest version of the questionnaire

After these adjustments, the number of correct submissions was 4,584, with the following breakdown by zone:
- A: 1,069
- B: 1,034
- C: 2,361
- D: 120

The second step excludes submissions from EAs and blocks that were not included in the final sample.
- A: 3 submissions were dropped because they belong to a block that was not included in the final sample
- B: 12 submissions were dropped because they correspond to an EA that was not included in the final sample, since it was a replacement EA that was never executed
- C: 3 interviews were dropped because the enumerators selected a wrong EA that had been replaced

After these adjustments, the number of correct submissions was 4,566, with the following breakdown by zone:
- A: 1,066
- B: 1,022
- C: 2,358
- D: 120

The next step was to validate submissions against six acceptance criteria. Interviews that failed to meet at least one of them were dropped:
- The duration of the interview had to exceed a threshold of 30 minutes
  - 26 submissions were excluded because they were completed in 30 minutes or less
- Random sound bite check, including respondent and enumerator voices. This criterion is assumed to hold if a specific interview was not checked against it.
  - No interview was removed for this reason
- The interview has GPS coordinates and was conducted within a buffer area of the corresponding EA
  - 5 interviews did not have GPS coordinates
  - 5 were excluded because the GPS coordinates indicate that the interview did not take place within the boundaries of the EA
- If the interview was not completed in the first visit, the household record for the first visit must be valid under the previous criteria (except for the duration), and both household records must contain matching GPS positions, with a margin of +/- 10 meters
  - 34 interviews were dropped because they corresponded to a second visit and the record from the previous visit did not exist or was not valid
  - 26 additional submissions were not considered because the GPS coordinates of the first visit did not match those of the subsequent visit
- If the interview corresponds to a replacement household, the record of the original household must be valid, except for the duration of the interview
  - 67 submissions were not considered because the interview corresponded to a replacement household with a nonexistent or invalid record for the original household
- Finally, unsuccessful interviews were discarded, that is, those where no one answered the door, no knowledgeable adult was present, or the respondent did not give permission to continue
  - 282 submissions were not successful and were thus excluded

At this point, the dataset had a total of 4,121 submissions, with the following breakdown by zone:
- A: 1,031
- B: 929
- C: 2,045
- D: 116

The final step excludes interviews that were incomplete, that is, those with several sections without a single response. 4 households did not have any record in the sections corresponding to food consumption, assets and livestock, and were thus excluded.
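The counts reported at each step of the cleaning process can be reconciled with a short Python sketch. It only re-adds figures from the text and is not part of the survey's processing scripts. Two points are assumptions made here: the initial count for zone A is taken as 1,069, the value implied by the reported total of 4,590, and the 4 incomplete interviews are attributed to zone C, as implied by the final zone breakdown. The names `STAGES`, `ACCEPTANCE_DROPS`, `stage_totals` and `drops_between` are hypothetical.

```python
ZONES = ("A", "B", "C", "D")

# Zone counts (A, B, C, D) after each cleaning step, as reported in the text.
# Assumption: zone A starts at 1,069, the value implied by the total of 4,590.
STAGES = [
    ("submitted", (1069, 1035, 2366, 120)),
    ("general cleaning", (1069, 1034, 2361, 120)),
    ("EA/block in final sample", (1066, 1022, 2358, 120)),
    ("acceptance criteria", (1031, 929, 2045, 116)),
    ("complete interviews", (1031, 929, 2041, 116)),
]

# Submissions dropped under each acceptance criterion, as reported in the text.
ACCEPTANCE_DROPS = {
    "duration of 30 minutes or less": 26,
    "sound bite check": 0,
    "no GPS coordinates": 5,
    "outside EA boundaries": 5,
    "first visit missing or invalid": 34,
    "GPS mismatch between visits": 26,
    "original household record invalid": 67,
    "unsuccessful interview": 282,
}


def stage_totals(stages):
    """Total number of submissions remaining after each stage."""
    return [sum(counts) for _, counts in stages]


def drops_between(stages):
    """Number of submissions removed between consecutive stages."""
    totals = stage_totals(stages)
    return [before - after for before, after in zip(totals, totals[1:])]


if __name__ == "__main__":
    for (name, counts), total in zip(STAGES, stage_totals(STAGES)):
        print(f"{name:26s} {dict(zip(ZONES, counts))} total={total}")
    print("drops per step:", drops_between(STAGES))
    print("acceptance drops:", sum(ACCEPTANCE_DROPS.values()))
```

The per-criterion drops sum to the difference between the totals before and after the acceptance step, so the reported figures are internally consistent under the two assumptions above.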
Therefore, the final dataset includes a total of 4,117 complete, valid and successful submissions from valid EAs and blocks, with the following breakdown by zone:
- A: 1,031
- B: 929
- C: 2,041
- D: 116

Sampling weights

This section describes the calculation of sampling weights for households in the dataset. The sample design differed for some strata due to security volatility. Thus, the methods differ between micro-listing and full-listing. After the sampling weights were calculated as described below, they were scaled to the number of households accessible with GPS from the sample frame.

- Full listing: The sample was drawn in a two-stage process for strata 201-204, 301-304 and 1103-1304. Therefore, the weights were calculated based on the sampling probabilities for each sampling stage and for each cluster in the following way:

$$P_{hij} = P_1 P_2 = \frac{n_j M_{ij}}{M_j} \cdot \frac{m_{ij}}{L_{ij}}$$

such that
$P_{hij}$: Probability of selecting household h in EA i of strata j
$P_1$: Probability of selecting the EA in stage 1
$P_2$: Probability of selecting the household in stage 2
$n_j$: Number of EAs selected in strata j
$M_{ij}$: Number of households estimated in the sample frame for EA i
$M_j$: Number of households estimated in the sample frame in strata j
$m_{ij}$: Number of households selected in EA i
$L_{ij}$: Number of households listed in EA i

Therefore, the sampling weight for each household corresponds to $w_{hij} = 1/P_{hij}$.

- Micro-listing: In strata 101, 105, 205 and 305, the sample was segmented into blocks within EAs, in addition to the two-stage stratified cluster sampling design. [3] Therefore, the weights were calculated based on the sampling probabilities for each sampling stage and for each cluster in the following way:

$$P_{hij} = P_1 P_2 P_3 = \frac{n_j M_{ij}}{M_j} \cdot \frac{b_{ij}}{B_{ij}} \cdot \frac{m_{ij}}{H_{ij}}$$

such that
$P_{hij}$: Probability of selecting household h in EA i of strata j
$P_1$: Probability of selecting the EA
$P_2$: Probability of selecting the block
$P_3$: Probability of selecting the household
$n_j$: Number of EAs selected in strata j
$M_{ij}$: Number of households estimated in the sample frame for EA i
$M_j$: Number of households estimated in the sample frame in strata j
$b_{ij}$: Number of blocks selected in EA i
$B_{ij}$: Number of blocks in EA i
$m_{ij}$: Number of households selected in EA i
$H_{ij}$: Number of households in EA i

Therefore, the sampling weight for each household corresponds to $w_{hij} = 1/P_{hij}$.

Finally, three types of sampling weights were estimated:
1) Unadjusted weights: Consider all submissions (4,117) and scale the weights so that the sum of the sampling weights by analytical strata matches the total number of accessible households with GPS according to the sample frame.
2) Adjusted weights: Consider all submissions (4,117) and scale the weights uniformly so that the sum of the weights by analytical strata matches the total number of households according to the PESS (Table 3). [4]
3) Adjusted weights for consumption and poverty variables: Consider only submissions with consumption data (excluding 53 submissions with missing values in the consumption of food, non-food and durables) and adjust the weights of the remaining 4,064 submissions according to the following scenarios:
   - If the number of accessible households with GPS (i.e. the sum of weights) is larger than the total number of households according to PESS by analytical strata, the weights were scaled downwards uniformly to match the total number of households from PESS, which already reflects the re-allocation of the weights from the 53 excluded submissions.
   - If the number of accessible households with GPS (i.e. the sum of weights) is smaller than the total number of households according to PESS, the weights were scaled upwards in two steps: i) re-allocating the weights from the 53 excluded households uniformly across the 4,064 submissions; and then ii) assigning the additional weights needed to match the figures from PESS only to those households or submissions in the bottom 25 percent of the total consumption distribution for the respective analytical strata.

The bottom 25 percent took up the weight of the additional households to reflect the fact that excluded enumeration areas were not randomly chosen but differed from other enumeration areas in their inaccessibility due to security and/or infrastructure. As those enumeration areas are expected to be more deprived than the average enumeration area, they were assumed to be similar to the bottom 25 percent.

[3] The segmentation step cancels out as exactly one segment is chosen.

[4] Usually, the household number from the sample frame should reflect the number of households from the last Census. However, the incomplete sample frame necessitated using different (overlapping) data sources for the sample frame. While the probabilities of selection for duplicates are already adjusted for in the EA selection step, the total number of households did not automatically sum up to the number of households from PESS.
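The weighting logic described in this section can be sketched in Python as follows. This is a simplified illustration, not the code used to produce the survey weights. It assumes that the first-stage probability follows probability-proportional-to-size selection, that the bottom 25 percent is determined on unweighted consumption, and that the additional weight is spread across the bottom 25 percent in proportion to existing weights; the text does not specify these details. All function and parameter names are hypothetical.

```python
def full_listing_weight(n_eas, hh_frame_ea, hh_frame_stratum, hh_selected, hh_listed):
    """Inverse selection probability for a household under full listing.

    Stage 1 (assumed PPS): n_eas * hh_frame_ea / hh_frame_stratum.
    Stage 2: hh_selected / hh_listed within the EA.
    """
    p1 = n_eas * hh_frame_ea / hh_frame_stratum
    p2 = hh_selected / hh_listed
    return 1.0 / (p1 * p2)


def scale_to_total(weights, target_total):
    """Scale weights uniformly so that they sum to target_total."""
    factor = target_total / sum(weights)
    return [w * factor for w in weights]


def top_up_bottom_quartile(weights, consumption, target_total):
    """Assign a weight shortfall to the bottom 25 percent of consumption.

    If the weights already reach target_total, they are scaled down uniformly.
    Otherwise the shortfall is added to the households with the lowest
    consumption, in proportion to their existing weights (an assumption).
    """
    shortfall = target_total - sum(weights)
    if shortfall <= 0:
        return scale_to_total(weights, target_total)
    order = sorted(range(len(weights)), key=lambda i: consumption[i])
    n_bottom = max(1, len(order) // 4)
    bottom = set(order[:n_bottom])
    bottom_sum = sum(weights[i] for i in bottom)
    factor = 1 + shortfall / bottom_sum
    return [w * factor if i in bottom else w for i, w in enumerate(weights)]


if __name__ == "__main__":
    # Hypothetical stratum: 10 EAs drawn from 10,000 households in the frame.
    print(full_listing_weight(10, 100, 10_000, 12, 120))
    print(top_up_bottom_quartile([10, 10, 10, 10], [5, 1, 9, 7], 50))
```

Concentrating the shortfall on the poorest quarter, rather than scaling all weights uniformly, mirrors the stated assumption that inaccessible enumeration areas resemble the most deprived households.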
Table 1: Total number of households by PESS region and analytical strata

| PESS Region | Type | Analytical Strata | Number of households |
| All | IDP | All IDPs | 201,963 |
| Banadir | Urban | Mogadishu | 187,246 |
| Nugaal | Urban | Garowe | 23,119 |
| Bari and Mudug | Urban | Urban Bari and Mudug | 140,334 |
| Woqooyi Galbeed | Urban | Hergeiza | 123,390 |
| Awdal, Sanaag, Sool and Togdheer | Urban | Urban Awdal, Sanaag, Sool and Togdheer | 158,279 |
| Bari, Mudug and Nugaal | Rural | Rural Bari, Mudug and Nugaal | 27,684 |
| Awdal, Sanaag, Sool, Togdheer and Woqooyi Galbeed | Rural | Rural Awdal, Sanaag, Sool, Togdheer and Woqooyi Galbeed | 61,086 |

Consumption Aggregate

The nominal household consumption aggregate is the sum of three components, namely 1) expenditures on food items, 2) expenditures on non-food items, and 3) the value of the consumption flow from durable goods:

$$y_i = y_i^f + y_i^n + y_i^d \qquad (2)$$

This section describes in detail the cleaning of the recorded data for each of the three components. Subsequently, the construction of the consumption aggregate using the Rapid Consumption Methodology is explained, as well as the estimation of the consumption flow for durables and the details of the deflator used to calculate spatial price indices. In addition, 53 households were assigned a missing value for consumption, since 52 of them reported not consuming any food items and 1 household reported consuming only a non-core food item.

Cleaning

Food

Food expenditure data is cleaned in a four-step process. First, units for reported quantities of consumption and purchase are corrected. Typical mistakes include a recorded consumption of 100 kg of a product (like salt) where the correct unit is grams. These mistakes are corrected using generic rules (Table 5). Second, we introduce a conversion factor to kg for some specific items and units. For example, a small piece of bread must have a different weight than a small piece of garlic (Table 6). The third step consists of correcting issues with the exchange rate selected (Table 7).
Finally, outliers are detected using the six cleaning rules below to correct quantities and prices.
- Rule 1:
  o Consumption quantities with missing values for items reported as consumed were replaced with item-specific median consumption quantities.
  o Missing purchase quantities and missing prices for items consumed were replaced with the item-specific median purchase quantity and the item-specific median purchase price.
- Rule 2: Records where the respondent did not know or refused to say whether the household had consumed the item were replaced with the mean value, including non-consumed records.
- Rule 3: Records with the same value for quantity consumed or quantity purchased and price are assumed to have a data entry error in the price or quantity and are replaced with the item-specific medians.
- Rule 4: Records that have the same value in quantity consumed and quantity purchased but different units are assumed to have a wrong unit either for consumption or purchase. For both quantities, the item-specific distribution of quantities in kg is calculated to determine the deviation of the entered figure from the median of the distribution. The unit of the quantity that is further away from the median is corrected with the unit of the quantity closer to the median.
- Rule 5:
  o Missing and zero prices are replaced with item-specific medians.
  o Outliers for unit prices were identified and replaced with the item-specific median. This includes unit prices in the top 10 percent of the overall cumulative distribution (considering all items), and unit prices below 0.07 USD.
- Rule 6: The consumption value in USD was truncated to the mean plus 3 times the standard deviation of the cumulative distribution for each item, if the record exceeded this threshold.

All medians are estimated at the EA level if a minimum of 5 observations is available, excluding previously tagged records.
If the minimum number of observations is not met, medians are estimated at the strata level before proceeding to the survey level. In addition, medians greater than 20 kg or smaller than 0.02 kg were not considered for quantities, while medians greater than 20 USD or smaller than 0.005 USD were excluded for unit prices.

Non-Food

The non-food dataset only contains values, without quantities and units. First, we apply the same cleaning rules for currencies (Table 7), and then the following cleaning rules:
- Rule 1: Zero prices, missing prices and missing currencies for purchased items are replaced with item-specific medians.
- Rule 2: Records where the respondent did not know or refused to say whether the household had purchased the item were replaced with the mean value, including non-consumed records.
- Rule 3: Prices that are beyond a specific threshold for each recall period (Table 8) are replaced with item-specific medians.
- Rule 4: Prices below the 1st percentile and above the 95th percentile of the cumulative distribution for each item are replaced with item-specific medians.
- Rule 5: The purchase value in USD was truncated to the mean plus 3 times the standard deviation of the cumulative distribution for each item, if the record exceeded this threshold.

The item-specific medians were applied at the EA, strata and survey level as described above.

Durables

For durables, we also apply the same cleaning rules for currencies (Table 7), and then the following cleaning rules:
- Rule 1: Vintages with missing values or greater than 10 years are replaced with item-specific medians.
- Rule 2: Current and purchase prices equal to zero are replaced with item-specific medians.
- Rule 3: Records that have the same figure in current value and purchase price are considered incorrect. For both, the item-vintage-specific distribution is calculated to determine the deviation of the entered figure from the median. The one that is further away from that median is corrected with the item-year-specific median value.
- Rule 4: Depreciation rates are replaced by the item-specific medians in the following cases:
  o Negative records
  o Depreciation rates in the top 10 percent and a vintage of one year
  o Depreciation rates in the bottom 10 percent and a vintage greater than or equal to 3 years
- Rule 5: Records with 100 items or more, and those that reported owning a durable good but did not report the number, were replaced with the item-specific medians of consumption in USD.
- Rule 6: Consumption values in the top and bottom 1 percent of the overall distribution were replaced with item-specific medians.
- Rule 7: Records where the respondent did not know or refused to say whether the household owned the asset were replaced with the mean consumption value, including non-consumed records.
- Rule 8: The consumption value in USD was truncated to the mean plus 3 times the standard deviation of the cumulative distribution for each item, if the record exceeded this threshold.

All medians are estimated at the EA level if a minimum of 3 observations is available, excluding previously tagged records. If the minimum number of observations is not met, medians are estimated at the strata level before proceeding to the survey level. Table 9 contains a general overview of consumption of durables, while Table 10 presents the details by item. Table 11 shows the median depreciation rate by durable good.

Rapid Consumption Methodology: Food and Non-Food Aggregates

The survey used the new Rapid Consumption methodology to estimate consumption. A detailed description, including an ex post assessment of the methodology, is available in a separate document. [5] The rapid survey consumption methodology consists of five main steps. First, core items are selected based on their importance for consumption.
Second, the remaining items are partitioned into optional modules. Third, optional modules are assigned to groups of households. Fourth, after data collection, consumption of optional modules is imputed for all households. Fifth, the resulting consumption aggregate is used to estimate poverty indicators.

First, core consumption items are selected. Consumption in a country bears some variability, but usually a small number of a few dozen items captures the majority of consumption. These items are assigned to the core module, which is administered to all households. Important items can be identified by their average food share per household or across households. Previous consumption surveys in the same country or consumption shares of neighboring or similar countries can be used to estimate food shares. [6] In the worst case, a random assignment results in a larger standard error but does not introduce a bias.

Second, non-core items are partitioned into optional modules. Different methods can be used for the partitioning into optional modules. In the simplest case, the remaining items are ordered according to their food share and assigned one by one while iterating over the optional modules in each step. A more sophisticated method takes into account the correlation between items and partitions them into orthogonal sets per module. This leads to high correlation between modules, which supports the estimation of total consumption. The conceptual division into core and optional items is not reflected in the layout of the questionnaire. Rather, all items per household are grouped into categories of consumption items (like cereals) and different recall periods. Using CAPI, it is straightforward to hide the modular structure from the enumerator.

Third, optional modules are assigned to groups of households. The assignment of optional modules is performed randomly, stratified by enumeration area, to ensure appropriate representation of optional modules in each enumeration area.
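A random assignment of optional modules stratified by enumeration area can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the assignment routine used in the survey: the function name, the seed handling and the shuffle-then-cycle scheme are choices made here. The number of optional modules (four) and the 12 households per enumeration area are taken from the sample design described earlier.

```python
import random


def assign_optional_modules(households_by_ea, n_modules=4, seed=0):
    """Randomly assign one optional module (1..n_modules) to each household.

    Within each enumeration area the households are shuffled and modules are
    then handed out in rotation, so module counts within an EA differ by at
    most one. Returns a dict mapping household id to module number.
    """
    rng = random.Random(seed)
    assignment = {}
    for ea in sorted(households_by_ea):
        shuffled = list(households_by_ea[ea])
        rng.shuffle(shuffled)
        for position, household in enumerate(shuffled):
            assignment[household] = 1 + position % n_modules
    return assignment


if __name__ == "__main__":
    # Hypothetical sample: two enumeration areas with 12 households each.
    sample = {
        "ea1": [f"ea1-hh{i}" for i in range(12)],
        "ea2": [f"ea2-hh{i}" for i in range(12)],
    }
    modules = assign_optional_modules(sample)
    for ea, households in sample.items():
        print(ea, sorted(modules[h] for h in households))
```

With 12 households and four optional modules, each module is administered to exactly three households per enumeration area.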
This step is followed by the actual data collection.

Fourth, household consumption is estimated by imputation. The average consumption of each optional module can be estimated based on the sub-sample of households assigned to that module. In the simplest case, a simple average can be used. More sophisticated techniques can employ a welfare model based on household characteristics and consumption of the core items. The results presented in this note use a multiple imputation technique based on a multivariate normal approximation.

Next, the methodology is formalized and assessed using an ex post simulation based on the consumption data from Hergeiza in the Somaliland 2012 household survey (SHS12). Food and non-food consumption for household $i$ are estimated by the sum of expenditures for a set of items

$$y_i^f = \sum_{j=1}^{m} y_{ij}^f \quad \text{and} \quad y_i^n = \sum_{j=1}^{m} y_{ij}^n$$

where $y_{ij}^f$ and $y_{ij}^n$ denote the food and non-food consumption of item $j$ in household $i$. As the estimation for food and non-food consumption follows the same principles, we neglect the upper index $f$ and $n$ in the remainder of this section. The list of items can be partitioned into $M+1$ modules, each with $m_k$ items:

$$y_i = \sum_{k=0}^{M} y_i^{(k)} \quad \text{with} \quad y_i^{(k)} = \sum_{j=1}^{m_k} y_{ij}^{(k)}$$

For each household, only the core module $y_i^{(0)}$ and one additional optional module $y_i^{(k^*)}$ are collected. The assignment of items to modules is based on the SHS12 survey with manual modifications, especially to treat 'other' items correctly.⁷

The core module was designed to maximize its consumption share, resulting in 91 percent of food consumption and 76 percent of non-food consumption being captured in the core module (based on SHS12 consumption; Table 2). Optional modules are constructed using an algorithm that assigns items iteratively to optional modules so that items are orthogonal within modules and correlated between modules. In each step, the unassigned item with the highest consumption share is selected. For each module, total per capita consumption is regressed on household size, the consumption of all items already assigned to this module, and the new unassigned item. The item is assigned to the module with the highest increase in $R^2$ relative to the regression excluding the new item. The sequenced assignment of items based on their consumption share can lead to considerable differences in the captured consumption share across optional modules. Therefore, a parameter $d$ is introduced, ensuring that in each step of the assignment procedure the difference in the number of assigned items per module does not exceed $d$. Using $d=1$ assigns items to modules (almost) maximizing equal consumption share across modules.⁸ Increasing $d$ puts increasing weight on orthogonality within and correlation between modules. The parameter was set to $d=3$, balancing the two objectives.

In each enumeration area, 12 households were interviewed with an ideal partition of three households per optional module.⁹ The assignment of optional modules must ensure that a sufficient number of households is assigned to each optional module. Household consumption was then estimated using the core module, the assigned module and estimates for the remaining optional modules

$$\hat{y}_i = y_i^{(0)} + y_i^{(k^*)} + \sum_{k \in K^*} \hat{y}_i^{(k)}$$

where $K^* := \{1, \dots, k^*-1, k^*+1, \dots, M\}$ denotes the set of non-assigned optional modules.

⁵ Pape & Mistiaen (2015), "Measuring Household Consumption and Poverty in 60 Minutes: The Mogadishu High Frequency Survey", World Bank (2015).
⁶ As shown later, the assignment of items to modules is very robust and, thus, even rough estimates of consumption shares are sufficient to inform the assignment without requiring a baseline survey.
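The iterative assignment of items to optional modules can be sketched as follows. This is a minimal illustration under stated assumptions, not the survey's code: the function and argument names are hypothetical, item consumption shares are approximated by column totals, and the balance rule is implemented by allowing a module to receive an item only if its item count would not exceed the current smallest module by more than `d` (which requires `d >= 1`).

```python
import numpy as np

def r_squared(y, columns):
    """R^2 of an OLS regression of y on the given columns plus an intercept."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    centered = y - y.mean()
    return 1.0 - (residuals @ residuals) / (centered @ centered)

def assign_items(items, total_pc, hh_size, n_modules, d):
    """Greedy assignment of non-core items (columns of `items`) to modules.

    items    : (households x items) array of non-core item consumption
    total_pc : total per capita consumption per household
    hh_size  : household size per household
    d        : maximum allowed difference in item counts between modules (>= 1)
    """
    modules = [[] for _ in range(n_modules)]
    order = np.argsort(-items.sum(axis=0))  # largest consumption share first
    for j in order:
        counts = [len(m) for m in modules]
        best_k, best_gain = None, -np.inf
        for k in range(n_modules):
            if counts[k] + 1 - min(counts) > d:
                continue  # adding here would violate the balance parameter d
            base = [hh_size] + [items[:, c] for c in modules[k]]
            # increase in R^2 from adding item j to the regression for module k
            gain = r_squared(total_pc, base + [items[:, j]]) - r_squared(total_pc, base)
            if gain > best_gain:
                best_k, best_gain = k, gain
        modules[best_k].append(int(j))
    return modules
```

With `d=1` only the modules with the fewest items are eligible in each step, so the item counts stay balanced; larger values of `d` give the R² criterion more room to decide.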
Consumption of non-assigned optional modules is estimated using multiple imputation techniques, taking into account the variation absorbed in the residual term. Multiple imputation was implemented using multivariate normal regression based on an EM-like algorithm that iteratively estimates model parameters and missing data. This technique is guaranteed to converge in distribution to the optimal values. An EM algorithm draws missing data from a prior (often non-informative) distribution and runs an OLS regression to estimate the coefficients. Iteratively, the coefficients are updated by re-estimation using imputed values for the missing data drawn from the posterior distribution of the model. The implemented technique employs a Data Augmentation (DA) algorithm, which is similar to an EM algorithm but, unlike the EM algorithm, updates parameters in a non-deterministic fashion. Thus, coefficients are drawn from the parameter posterior distribution rather than chosen by likelihood maximization. Hence, the iterative process is a Markov chain Monte Carlo (MCMC) procedure in the parameter space that converges to the stationary distribution, which averages over the missing data. The distribution for the missing data stabilizes at the exact distribution to be drawn from to retrieve model estimates averaging over the missing value distribution. The DA algorithm usually converges considerably faster than standard EM algorithms:

$$\hat{y}_i^{(k)} = \beta_0^{(k)} + \beta^{(k)} y_i^{(0)} + u_i^{(k)}$$

with module-specific coefficients $\beta_0^{(k)}$ and $\beta^{(k)}$ and residual term $u_i^{(k)}$.

⁷ Items 'other' are often found to capture remaining items for a food category. Using the Rapid Consumption Methodology, this creates problems as 'other' will include different items depending on which optional module is administered. This can lead to double counting after the imputation. Therefore, 'other' items are re-formulated and carefully assigned so that double counting cannot occur.
⁸ Even with d=1, equal consumption share across modules is not maximized because, among the modules with the same number of assigned items, the new item will be assigned to the module it is most orthogonal to rather than to the module with the lowest consumption share.
⁹ Field work implementation aimed to achieve a balanced partition among optional modules, but due to challenges in following the protocol exactly some enumeration areas are not completely balanced.

The performance of the estimation technique was assessed in an ex post simulation using the Hergeiza data from SHS12, mimicking the Rapid Consumption methodology by masking consumption of items that were not administered to households. The results of the simulation were compared with the estimates using the full consumption from SHS12 as reference. The simulation results distinguish between different levels of aggregation to estimate consumption.¹⁰ The methodology generally does not perform well at the household level (HH) but improves considerably at the enumeration area level (EA), where the average of 12 households is estimated. At the national aggregation level, the Rapid Consumption methodology slightly over-estimates consumption by 0.3 percent. For the standard poverty measures, including poverty headcount (FGT0), poverty depth (FGT1) and poverty severity (FGT2), the simulation results show that the Rapid Consumption methodology retrieves estimates within 1.5 percent of the reference measure (Figure 3). Generally, the estimates are robust, as suggested by the low standard errors (Figure 4).

Figure 3: Relative bias of simulation results using Rapid Consumption estimation.
Figure 4: Relative standard error of simulation results using Rapid Consumption estimation.
[Figures 3 and 4: bar charts of relative bias and relative standard error for Consumption HH, Consumption EA, Consumption All, FGT0, FGT1 and FGT2; vertical axes in percent.]
Source: Authors' own calculations based on SHS12 data.

Table 2: Item partitions based on SHS12 and pilot in Mogadishu.

Module | Food: Number of items | Food: Share Hergeiza | Food: Share Mogadishu | Food: Share Mogadishu Imputed | Non-food: Number of items | Non-food: Share Hergeiza | Non-food: Share Mogadishu | Non-food: Share Mogadishu Imputed
Core | 33 | 91% | 64% | 54% | 26 | 76% | 62% | 52%
Module 1 | 19 | 3% | 9% | 16% | 15 | 7% | 9% | 12%
Module 2 | 20 | 2% | 14% | 14% | 15 | 5% | 9% | 12%
Module 3 | 15 | 2% | 5% | 6% | 15 | 6% | 8% | 9%
Module 4 | 15 | 2% | 8% | 9% | 15 | 6% | 11% | 15%

Source: Authors' own calculations based on SHS12 and Mogadishu Pilot data.

¹⁰ The performance of the estimation techniques is presented using the relative bias (mean of the error distribution) and the relative standard error. The relative error is defined as the percentage difference between the estimated consumption and the reference consumption (based on the full consumption module, averaged over all imputations). The relative bias is the average of the relative error. The relative standard error is the standard deviation of the relative error. The simulation is run over different household-module assignments while ensuring that each optional module is assigned equally often to a household per enumeration area. The relative bias and the relative standard error are reported across all simulations.

Durable consumption flow

The consumption aggregate includes the consumption flow of durables, calculated using the user-cost approach. The consumption flow distributes the consumption value of the durable over multiple years. The user-cost principle defines the consumption flow of an item as the difference between selling the asset at the beginning and at the end of the year, as this is the opportunity cost to the household of keeping the item. The opportunity cost is composed of the difference in the sales price and the interest that would be earned if the asset were sold at the beginning of the year. If the durable item is sold at the beginning of the year, the household would receive the market price $p_t$ for the item and the interest on the revenue for one year. With $i_t$ denoting the interest rate, the value of the item is thus $p_t(1 + i_t)$. If the item is sold at the end of the year, the household will receive the depreciated value of the item, adjusted for inflation. With $\pi_t$ being the inflation rate during year $t$, the household would obtain $p_t(1 + \pi_t)(1 - \delta)$, with the annual physical or technological depreciation rate $\delta$ assumed constant over time.¹¹ The difference between these two values is the cost that the household is willing to pay for using the durable good for one year. Hence, the consumption flow is:

$$y_t = p_t(1 + i_t) - p_t(1 + \pi_t)(1 - \delta) \qquad (3)$$

By assuming that $\delta \times \pi_t \cong 0$, the equation simplifies to

$$y_t = p_t(i_t - \pi_t + \delta) = p_t(r_t + \delta) \qquad (4)$$

where $r_t$ is the real market interest rate in period $t$. Therefore, the consumption flow of an item can be estimated from the current market value $p_t$, the current real interest rate $r_t$, and the depreciation rate $\delta$. Assuming an average annual inflation rate $\pi$, the depreciation rate $\delta$ can be estimated from its relationship to the market price of an item purchased $k$ years earlier:¹²

$$p_t = p_{t-k}(1 + \pi)^k (1 - \delta)^k \qquad (5)$$

The equation can be solved for $\delta$, obtaining:

$$\delta = 1 - \left( \frac{p_t}{p_{t-k}(1 + \pi)^k} \right)^{1/k} \qquad (6)$$

Based on this equation, item-specific median depreciation rates are estimated assuming an inflation rate of 0.5 percent, a nominal interest rate of 2.0 percent and, thus, a real interest rate of 1.5 percent (Table 10). For all households that own a durable but did not report its current value, the item-specific median consumption flow is used.
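As a worked illustration of equations (4) and (6), the following minimal Python sketch computes a depreciation rate and the resulting annual consumption flow. The inflation, nominal and real interest rates are those stated in the text; the function names and the example prices (an item bought for 100 three years ago and valued at 50 today) are hypothetical.

```python
def depreciation_rate(p_now, p_purchase, age_years, inflation):
    """Equation (6): annual depreciation rate implied by the current value,
    the purchase price, the item's age in years and average annual inflation."""
    ratio = p_now / (p_purchase * (1.0 + inflation) ** age_years)
    return 1.0 - ratio ** (1.0 / age_years)

def consumption_flow(p_now, real_rate, delta):
    """Equation (4): annual consumption flow of a durable good."""
    return p_now * (real_rate + delta)

INFLATION = 0.005       # 0.5 percent, as assumed in the text
NOMINAL_RATE = 0.020    # 2.0 percent, as assumed in the text
REAL_RATE = NOMINAL_RATE - INFLATION  # 1.5 percent

# Hypothetical item: bought for 100 three years ago, worth 50 today.
delta = depreciation_rate(p_now=50.0, p_purchase=100.0, age_years=3, inflation=INFLATION)
flow = consumption_flow(p_now=50.0, real_rate=REAL_RATE, delta=delta)
```

Substituting `delta` back into equation (5) reproduces the current value of 50, which is a quick consistency check on the two formulas.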
For households that own more than one unit of a durable, the consumption flow of the newest item is added to the item-specific median consumption flow multiplied by the number of remaining items (not counting the newest item).¹³

¹¹ Assuming a constant depreciation rate is equivalent to assuming a "radioactive decay" of durable goods (see Deaton and Zaidi, 2002).
¹² In particular, $\pi$ solves the equation $\prod_{s=t-k}^{t-1}(1 + \pi_s) = (1 + \pi)^k$.
¹³ The 2016 HFS questionnaire provides information on a) the year of purchase and b) the purchasing price only for the most recent durable owned by the household.

Deflator

Prices fluctuate considerably between regions. We therefore calculated spatial price indices using a common food basket and spatial prices to make consumption comparable across regions. The Laspeyres index is chosen as the deflator due to its moderate data requirements. The deflator is calculated by analytical stratum based on the price data collected by the HFS. The Laspeyres index reflects the item-weighted relative price differences across products. Item weights are estimated as the household-weighted average consumption share across all households before imputation. Following the democratic approach, consumption shares are calculated at the household level. Core items use total household core consumption as reference, while items from optional modules use the total household consumption of the assigned optional module as reference. The shares are aggregated at the national level (using household weights) and then calibrated by average consumption per module to arrive at item weights summing to 1. The item weights are applied to the relative differences of median item prices for each analytical stratum. Missing prices are replaced by the item-specific median over all households. A large Laspeyres index indicates a high price level and deflates consumption more strongly than a lower Laspeyres index. The resulting indices show the fluctuation of prices across regions (Table 3).

Table 3: Laspeyres Deflators by Analytical Strata

Analytical Strata | Deflator
All IDPs | 0.923
Mogadishu | 0.964
Garowe | 0.862
Urban Bari and Mudug | 1.107
Hergeiza | 1.133
Urban Awdal, Sanaag, Sool and Togdheer | 0.922
Rural Bari, Mudug and Nugaal | 1.013
Rural Awdal, Sanaag, Sool, Togdheer and Woqooyi Galbeed | 1.075

Tables for Cleaning Rules

Table 4: Summary of Unit Cleaning Rules for Food Items

Unit | Condition | Correction | Affected Records¹⁴
250 ml tin | <=0.03 | Multiply by 4 | 2; 39
Animal back, ribs, shoulder, thigh, head or leg | >=7 | Divide by 10 | 4; 35
Basket or Dengu (2 kg) | >=10 | Divide by 10 | 1,004; 20
Bottle (1 kg) | >=10 | Divide by 10 | 473; 281
Cup (200 g) | >200 | Divide by 2 | 447; 24
Faraasilad (12kg) | >12 | Divide by 12 | 544; 60
Gram (if item corresponds to a spice) | <1 | Multiply by 100 | 115; 5
Gram (if item does not correspond to a spice) | <1 | Multiply by 1,000 | 69; 19
Haaf (25 kg) | >=25 | Divide by 25 | 357; 921
Heap (700g) | >=0.69 | Divide by 7 | 182; 11
Kilogram | >=100 | Divide by 1,000 | 68; 4
Large bag (50 kg) | >=50 | Divide by 50 | 1; 27
Liter | >=10 | Divide by 10 | 3; 32
Madal/Nus kilo ruba (0.75kg) | >=7.5 | Divide by 10 | 849; 20
Meals (300 g) | >2.1 | Divide by 10 | 366; 208
Packet sealed box/container (500 g) | >=5 | Divide by 10 | 340; 16
Piece (large - 300g) | >=3 | Divide by 10 | 397; 43
Piece (small - 150g) | >=1.5 | Divide by 10 | 95; 5
Rufuc/Jodha (12.5kg) | >=12.5 | Divide by 10 | 37; 15
Saxarad (20kg) | >=20 | Divide by 10 | 312; 793
Small bag (1 kg) | >=10 | Divide by 10 | 110; 8
Teaspoon (10 g) | <0.009 | Multiply by 10 | 45; 4

¹⁴ The first number indicates the number of affected records reported for consumption while the second number states the number of affected records for purchases.
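The unit cleaning rules for food items can be read as simple condition-and-correction pairs. The sketch below encodes three of the rules from Table 4 in Python; it assumes that the condition is checked against the reported quantity for the given unit, and the dictionary layout and function name are hypothetical rather than taken from the survey's cleaning scripts.

```python
import operator

# (comparison, threshold, correction) for a few units taken from Table 4.
UNIT_RULES = {
    "Bottle (1 kg)": (operator.ge, 10, lambda q: q / 10),
    "Faraasilad (12kg)": (operator.gt, 12, lambda q: q / 12),
    "Teaspoon (10 g)": (operator.lt, 0.009, lambda q: q * 10),
}

def clean_quantity(unit, quantity):
    """Apply the cleaning rule for `unit` if its condition is met;
    otherwise return the quantity unchanged."""
    rule = UNIT_RULES.get(unit)
    if rule is None:
        return quantity  # no rule encoded for this unit
    compare, threshold, correct = rule
    return correct(quantity) if compare(quantity, threshold) else quantity
```

For example, `clean_quantity("Bottle (1 kg)", 20)` returns `2.0`, while a value of `5` for the same unit is left unchanged because it does not meet the condition.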
Table 5: Conversion factor to Kg for specific units and items

Item | Unit | Conversion to Kg
Biscuits | Piece - large | 0.030
Biscuits | Piece - small | 0.010
Bread | Piece - large | 0.400
Bread | Piece - small | 0.100
Eggs | Piece - large | 0.070
Eggs | Piece - small | 0.050
Canned fish/shellfish | Piece - large | 0.420
Canned fish/shellfish | Piece - small | 0.140
Grapefruits, lemons, guavas, limes | Piece - large | 0.350
Grapefruits, lemons, guavas, limes | Piece - small | 0.100
Milk | Piece - large | 0.750
Milk | Piece - small | 0.250
Milk powder | Piece - large | 0.450
Milk powder | Piece - small | 0.100
Milk powder | Small bag | 1.00
Garlic | Piece - large | 0.065
Garlic | Piece - small | 0.040
Onion | Piece - large | 0.150
Onion | Piece - small | 0.095
Tomatoes | Piece - large | 0.200
Tomatoes | Piece - small | 0.110
Bell-pepper | Piece - large | 0.150
Bell-pepper | Piece - small | 0.080
Sweet/ripe bananas | Piece - large | 0.110
Sweet/ripe bananas | Piece - small | 0.070
Canned vegetables | Piece - large | 0.400
Canned vegetables | Piece - small | 0.200
Sorghum, flour | Cup | 0.200
Cooking oats, corn flakes | Cup | 0.200
Other cooked foods from vendors | Small bag | 1.00
Purchased/prepared tea/coffee consumed at home | Small bag | 0.400
Other spices | Small bag | 0.400

Table 6: Summary Cleaning Rules for Currency

Currency | Condition | Correction
Somaliland shillings | Entry in Somaliland shilling | Replace currency to Somali shillings
Somaliland shillings | Price <=500 | Replace currency to Somaliland shillings (Thousands)
Somaliland shillings | Price >=500,000 | Divide by 10
Somali shillings | Entry in Somali shilling | Replace currency to Somaliland shillings
Somali shillings | Price <=500 | Replace currency to Somali shillings (Thousands)
Somali shillings | Price >=500,000 | Divide by 10
USD | Price >1,000 | Replace currency to Somali(land) shillings

Table 7: Threshold for Non-Food Item Expenditure (USD)

Recall period | Min | Max
1 Week | 0.05 | 30
1 Month | 0.20 | 95
3 Months | 0.45 | 200
1 Year | 0.80 | 1,200

Table 8: Consumption of durable goods (per week in current USD)

Statistic | SOM Wave 1 All regions | SOM Wave 1 Mogadishu | Pilot Mogadishu
Median | 0.74 | 1.17 | 1.01
Mean | 1.24 | 1.52 | 1.91
Sd | 1.51 | 1.49 | 2.62

Table 9: Median consumption of durable goods (per week in current USD)

Item | SOM Wave 1 All regions | SOM Wave 1 Mogadishu | Pilot Mogadishu
Air conditioner | 0.005 | 0.005 | 0.041
Bed | N/A | N/A | 0.861
Bed with mattress | 0.700 | 0.746 | N/A
Car | 0.001 | 0.001 | 0.001
Cell phone | 0.361 | 0.413 | 0.430
Chair | 0.073 | 0.072 | 0.253
Clock | 0.028 | 0.003 | 0.046
Coffee table (for sitting room) | 0.005 | 0.005 | 0.106
Computer equipment & accessories | 0.020 | 0.020 | 2.837
Cupboard, drawers, bureau | 0.240 | 0.240 | 1.099
Desk | 0.047 | 0.005 | 0.429
Electric stove or hot plate | 0.001 | 0.001 | N/A
Electric or gas stove; hot plate | N/A | N/A | 0.012
Electric stove | N/A | N/A | 0.004
Fan | 0.069 | 0.064 | 0.101
Gas stove | 0.007 | 0.007 | 0.275
Generator | 0.000 | 0.000 | 0.000
Iron | 0.043 | 0.035 | N/A
Kerosene/paraffin stove | 0.024 | 0.007 | 0.009
Kitchen furniture | 0.023 | 0.015 | 1.112
Lantern (paraffin) | 0.000 | 0.000 | 0.002
Lorry | 0.000 | 0.000 | 0.000
Mattress without bed | 0.217 | 0.212 | N/A
Mini-bus | 0.000 | 0.000 | 0.001
Mortar/pestle | 0.016 | 0.009 | 0.112
Motorcycle/scooter | 0.002 | 0.002 | 0.006
Photo camera | 0.001 | 0.001 | 0.595
Radio ('wireless') | 0.021 | 0.001 | 0.016
Refrigerator | 0.282 | 0.018 | 0.267
Satellite dish | 0.117 | 0.008 | 0.265
Sewing machine | 0.002 | 0.002 | 0.732
Small solar light | 0.003 | 0.003 | N/A
Solar panel | 0.000 | 0.000 | 0.018
Stove for charcoal | 0.032 | 0.023 | 0.020
Table | 0.042 | 0.042 | 0.092
Tape or CD/DVD player; HiFi | 0.001 | 0.001 | 0.092
Television | 0.330 | 0.278 | 0.417
Upholstered chair, sofa set | 0.019 | 0.019 | 2.657
VCR | 0.000 | 0.000 | 0.000
Washing machine | 0.405 | 0.368 | 0.557

Table 10: Median depreciation rate of durable goods

Item | SOM Wave 1 All | SOM Wave 1 Mogadishu | Pilot Mogadishu | Wave 1: Awdal, Sanaag, Sool, Togdheer and Woqooyi Galbeed | SHS12
Air conditioner | 0.278 | 0.241 | 0.210 | 0.134 | 0.145
Bed | N/A | N/A | 0.364 | N/A | 0.088
Bed with mattress | 0.172 | 0.172 | N/A | 0.172 | N/A
Car | 0.118 | 0.118 | 0.111 | 0.118 | 0.066
Cell phone | 0.188 | 0.188 | 0.296 | 0.188 | 0.169
Chair | 0.149 | 0.149 | 0.371 | 0.149 | 0.114
Clock | 0.204 | 0.204 | 0.228 | 0.204 | 0.110
Coffee table (for sitting room) | 0.279 | 0.279 | 0.329 | 0.279 | 0.114
Computer equipment & accessories | 0.182 | 0.240 | 0.364 | 0.150 | 0.204
Cupboard, drawers, bureau | 0.150 | 0.150 | 0.296 | 0.150 | 0.098
Desk | 0.134 | 0.134 | 0.502 | 0.134 | 0.108
Electric stove or hot plate | 0.262 | 0.257 | 0.005 | 0.252 | N/A
Electric stove | N/A | N/A | 0.296 | N/A | 0.138
Fan | 0.131 | 0.131 | 0.235 | 0.131 | 0.134
Gas stove | 0.174 | 0.135 | 0.296 | 0.174 | 0.333
Generator | N/A | N/A | 0.296 | N/A | 0.127
Iron | 0.161 | 0.161 | N/A | 0.161 | 0.110
Kerosene/paraffin stove | 0.224 | 0.224 | 0.296 | 0.224 | 0.210
Kitchen furniture | 0.188 | 0.188 | 0.393 | 0.188 | 0.101
Lantern (paraffin) | 0.064 | N/A | 0.067 | 0.064 | 0.114
Lorry | 0.154 | N/A | 0.296 | 0.154 | 0.052
Mattress without bed | 0.185 | 0.185 | N/A | 0.185 | N/A
Mini-bus | 0.153 | 0.172 | 0.296 | 0.153 | 0.039
Mortar/pestle | 0.210 | 0.210 | 0.254 | 0.210 | 0.114
Motorcycle/scooter | 0.172 | 0.172 | 0.138 | N/A | N/A
Photo camera | 0.134 | 0.134 | 0.296 | 0.122 | 0.171
Radio ('wireless') | 0.210 | 0.210 | 0.337 | 0.210 | 0.134
Refrigerator | 0.133 | 0.133 | 0.065 | 0.133 | 0.096
Satellite dish | 0.110 | 0.110 | 0.303 | 0.110 | 0.097
Sewing machine | 0.138 | 0.114 | 0.296 | 0.138 | 0.134
Small solar light | 0.296 | N/A | N/A | 0.471 | N/A
Solar panel | 0.005 | 0.038 | 0.296 | 0.005 | 0.110
Stove for charcoal | 0.226 | 0.226 | 0.337 | 0.254 | 0.188
Table | 0.157 | 0.157 | 0.296 | 0.160 | 0.114
Tape or CD/DVD player; HiFi | 0.172 | N/A | 0.138 | 0.172 | 0.092
Television | 0.131 | 0.131 | 0.240 | 0.131 | 0.099
Upholstered chair, sofa set | 0.168 | 0.168 | 0.289 | 0.168 | 0.101
VCR | 0.166 | 0.488 | 0.296 | 0.130 | 0.092
Washing machine | 0.138 | 0.138 | 0.171 | 0.138 | 0.114

Table 11: Sample size calculation, number of replacements and final sample.¹⁵

Strata | Code (new) | Number of EAs | Number of accessible EAs | Number of accessible EAs with GPS | Number of households | Number of households in accessible EAs with GPS | Percent in Sample | N | Strata "weights" | Consumption | Est. standard deviation | n | Design Effect | (NhSh) | "Optimum" allocation | Rounded optimum | Clusters | Relative standard error | Standard error | Replacements | Post Sample
A - Mogadishu | 101 | 1347 | 1299 | 136592 | 136592 | 131914 | 97% | 131,914 | 0.146 | 512.9 | 38.4 | 676 | 3.327 | 5,062,022 | 558.88 | 564 | 47 | 5.4 | 0.01 | 27 | 67
A - IDPs | 105 | | | | | | | 33,333 | 0.037 | 97.9 | 54.0 | 122 | 2.858 | 1,801,172 | 198.86 | 204 | 17 | 10.8 | 0.11 | 10 | 17
B - Garowe | 201 | 149 | 149 | 149 | 16351 | 16351 | 100% | 16,351 | 0.018 | 484.6 | 59.4 | 573 | 1.091 | 971,660 | 107.28 | 108 | 9 | 6.2 | 0.01 | 1 | 9
B - Urban Centers | 202 | 111 | 111 | 111 | 12534 | 12534 | 100% | 12,534 | 0.014 | 484.6 | 59.4 | 573 | 1.091 | 744,834 | 82.23 | 84 | 7 | 7.1 | 0.01 | 1 | 7
B - Other Urban | 203 | 1426 | 1399 | 1230 | 160891 | 142339 | 88% | 142,339 | 0.157 | 386.0 | 27.8 | 279 | 2.130 | 3,962,628 | 437.50 | 432 | 36 | 2.9 | 0.01 | 6 | 36
B - Rural | 204 | 1230 | 1067 | 475 | 101226 | 46235 | 46% | 46,235 | 0.051 | 347.1 | 30.9 | 873 | 2.406 | 1,428,335 | 157.70 | 156 | 13 | 6.0 | 0.02 | 6 | 13
B - IDPs | 205 | | | | | | | 21,500 | 0.024 | 97.9 | 54.0 | 122 | 2.858 | 1,161,756 | 128.27 | 132 | 11 | 13.4 | 0.14 | 3 | 11
C - Hergeiza | 301 | 1617 | 1617 | 1617 | 139345 | 139345 | 100% | 139,345 | 0.154 | 484.6 | 59.4 | 573 | 1.091 | 8,280,591 | 914.24 | 912 | 76 | 2.1 | 0.00 | 5 | 76
C - Urban Centers | 302 | 1071 | 1071 | 1071 | 114435 | 114435 | 100% | 114,435 | 0.126 | 386.0 | 27.8 | 279 | 2.130 | 3,185,798 | 351.73 | 348 | 29 | 3.2 | 0.01 | 1 | 29
C - Other Urban | 303 | 268 | 250 | 237 | 26294 | 23224 | 88% | 23,224 | 0.026 | 386.0 | 27.8 | 279 | 2.130 | 646,542 | 71.38 | 72 | 6 | 7.0 | 0.02 | 0 | 6
C - Rural | 304 | 1296 | 1218 | 1013 | 241531 | 185700 | 77% | 185,700 | 0.205 | 347.1 | 30.9 | 873 | 2.406 | 5,736,817 | 633.38 | 636 | 53 | 2.9 | 0.01 | 98 | 53
C - IDPs | 305 | | | | | | | 14,167 | 0.016 | 97.9 | 54.0 | 122 | 2.858 | 765,498 | 84.52 | 84 | 7 | 16.8 | 0.17 | 0 | 7
D - Sanaag Urban | 1103 | 57 | 57 | 57 | 6088 | 6088 | 100% | 6,088 | 0.007 | 386.0 | 27.8 | 279 | 2.130 | 169,486 | 18.71 | 24 | 2 | 12.1 | 0.03 | 0 | 2
D - Sanaag Rural | 1104 | 43 | 43 | 43 | 5131 | 5131 | 100% | 5,131 | 0.006 | 347.1 | 30.9 | 873 | 2.406 | 158,512 | 17.50 | 12 | 2 | 15.2 | 0.04 | 4 | 0
D - Sool Urban | 1203 | 10 | 10 | 10 | 1352 | 1352 | 100% | 1,352 | 0.001 | 386.0 | 27.8 | 279 | 2.130 | 37,639 | 4.16 | 12 | 2 | 12.1 | 0.03 | 4 | 2
D - Sool Rural | 1204 | 29 | 29 | 10 | 2387 | 919 | 39% | 919 | 0.001 | 347.1 | 30.9 | 873 | 2.406 | 28,391 | 3.13 | 12 | 2 | 15.2 | 0.04 | 4 | 3
D - Togdheer Urban | 1303 | 128 | 128 | 128 | 11064 | 11064 | 100% | 11,064 | 0.012 | 386.0 | 27.8 | 279 | 2.130 | 308,015 | 34.01 | 36 | 3 | 9.9 | 0.03 | 2 | 3
D - Togdheer Rural | 1304 | 8 | 8 | 2 | 541 | 150 | 28% | 150 | 0.000 | 347.1 | 30.9 | 873 | 2.406 | 4,634 | 0.51 | 12 | 2 | 15.2 | 0.04 | 2 | 0
Total | | 8790 | 8456 | 142745 | 975762 | 836781 | | 905,781 | | 369.8 | 288.1 | 8124 | 36.65526 | 34,454,327 | 3,804 | 3,840 | 324 | 1.4 | 0.004 | 174 | 341

¹⁵ Note that the number of (accessible) households does not necessarily resemble the number of PESS households due to the merging of multiple data sources. Therefore, sample weights were adjusted accordingly to scale with PESS household estimates.
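The "(NhSh)" and "optimum" allocation columns of the sample size table are numerically consistent with an allocation proportional to NhSh. The table itself does not state the formula, so the following Python sketch is an assumption: it allocates the total of 3,804 interviews proportionally to NhSh and rounds to whole clusters of 12 households. The function names are hypothetical.

```python
TOTAL_NHSH = 34_454_327   # sum of (NhSh) over all strata, from the Total row
TOTAL_N = 3_804           # total "optimum" allocation, from the Total row
CLUSTER_SIZE = 12         # households interviewed per enumeration area

def optimum_allocation(nh_sh, total_nh_sh=TOTAL_NHSH, total_n=TOTAL_N):
    """Interviews allocated to a stratum, proportional to its NhSh (assumed rule)."""
    return total_n * nh_sh / total_nh_sh

def round_to_clusters(n, cluster_size=CLUSTER_SIZE):
    """Round an allocation to the nearest whole number of clusters.
    Returns (rounded number of interviews, number of clusters)."""
    clusters = round(n / cluster_size)
    return clusters * cluster_size, clusters

mogadishu = optimum_allocation(5_062_022)         # table value: 558.88
mogadishu_rounded = round_to_clusters(mogadishu)  # table values: 564 and 47
```

The smallest strata in the table (Sanaag, Sool and Togdheer) do not follow this simple rounding, presumably because a minimum number of clusters was imposed; that rule is not modeled here.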