Report No. 46435-PK

PAKISTAN'S INVESTMENT CLIMATE
LAYING THE FOUNDATION FOR GROWTH
(In Three Volumes)
Volume III: Background Paper on Econometric Methods

December 2009

Poverty Reduction, Economic Management Department
Finance and Private Sector Unit
South Asia Department

Document of the World Bank

Background paper prepared for the Investment Climate Assessment of Pakistan 2008

Econometric Methods for Investment Climate Assessment on Economic Performance in Pakistan
Analysis based on firm level data from manufacturing and service sectors from 2002 and 2007

Álvaro Escribano,* Manuel de Orte+ and Jorge Pena++

May, 2009

Summary

In this paper we describe the econometric methodology applied in searching for empirical regularities in the effects of the investment climate (IC) on the economic performance of firms in Pakistan. We use firm level data from the investment climate surveys (ICSs) of Pakistan, covering manufacturing and services, for 2007 and 2002. To identify the main IC effects on economic performance we follow the robust productivity methodology developed in Escribano and Guasch (2005 and 2008) and Escribano et al (2008b). Once we have robust IC elasticities and semi-elasticities on productivity and on other economic performance measures, we follow Escribano et al (2008a) and Escribano, Guasch and Pena (2008) to evaluate the impact of the IC variables in terms of the Olley and Pakes (1996) decomposition of aggregate productivity and in terms of the sample means of alternative economic performance measures (employment, wages, probability of exporting and probability of receiving FDI). This econometric methodology has been widely used in the ICAs of many other countries, allowing us to make cross-country comparisons of demeaned productivity.
Furthermore, it also allows us to estimate robust IC productivity effects controlling for endogeneity and observable fixed effects in the presence of some simultaneity among covariates, missing observations and measurement errors.

* Telefónica UC3M Chair of Economics of Telecommunications, Universidad Carlos III de Madrid, alvaroe@eco.uc3m.es
+ Laboratory of Economics of Telecommunications, Universidad Carlos III de Madrid, morte@eco.uc3m.es
++ Laboratory of Economics of Telecommunications, Universidad Carlos III de Madrid, jpizquie@eco.uc3m.es

Table of Contents

1 INTRODUCTION
2 DATA: THE INVESTMENT CLIMATE SURVEYS OF PAKISTAN
  2.1 Description of the data sets based on the three IC Surveys of Pakistan
  2.2 Cleaning of data or imputation methods
3 ECONOMETRIC ESTIMATION OF IC ELASTICITIES AND SEMI-ELASTICITIES ON PRODUCTIVITY (ANALYSIS FOR MANUFACTURING ONLY)
  3.1 Robustness of IC elasticities and semi-elasticities: single step and two step estimation, restricted and unrestricted input-output elasticities
  3.2 Endogeneity of production function (PF) variables
  3.3 Role of prices on production function (sales generating function)
  3.4 Endogeneity of IC variables
  3.5 Selection of the relevant models
  3.6 Further robustness: two years FY06-FY07 panel
  3.7 Estimation under alternative replacement processes for missing data
4 ECONOMETRIC ANALYSIS OF IC AND PRODUCTIVITY IMPACT ON EMPLOYMENT, REAL WAGES, PROBABILITY OF EXPORTING AND PROBABILITY OF RECEIVING FDI
5 IC ASSESSMENT ON AGGREGATE PRODUCTIVITY (OLLEY AND PAKES DECOMPOSITION) AND OTHER MEASURES OF ECONOMIC PERFORMANCE
  5.2 O&P decompositions: in levels and mixed
  5.3 IC effects on productivity measure in the terms of the mixed O&P decomposition
  5.4 Simulations based on the IC effects on the O&P decomposition of TFP
  5.5 International comparisons of IC effects on aggregate demeaned productivity
  5.6 IC evaluation on the sample means of employment and wages, on the probability of exporting and on the probability of receiving FDI
6 ECONOMETRIC METHODOLOGY FOR THE SERVICES SECTOR
7 ECONOMETRIC METHODOLOGY ICS PANEL FOR FY02 AND FY07
8 RESULTS (I): IDENTIFICATION OF IC EFFECTS ON ECONOMIC PERFORMANCE
  8.2 IC elasticities and semi-elasticities with respect to productivity (manufacturing FY07)
  8.3 IC elasticities and semi-elasticities with respect to economic performance (manufacturing FY07)
  8.4 IC elasticities and semi-elasticities with respect to labor productivity of services sector
  8.5 IC effects on the probability of observing a productivity increase in the manufacturing sector from a panel from FY02 to FY07
9 RESULTS (II): IC EVALUATION ON ECONOMIC PERFORMANCE
  9.1 Olley and Pakes decompositions
  9.2 Effects of market power on measured productivities
  9.3 IC contributions to the terms of the Olley and Pakes decomposition
  9.4 IC contributions to the sample means of employment, wages, exporting propensity and FDI propensity
  9.5 IC contributions to the Olley and Pakes decomposition of labor productivity of services sector
  9.6 IC contributions to the probability of productivity increase
10 CONCLUSIONS
REFERENCES

List of Figures and Tables

Figures included in the main text

Figure 1: Decomposition of GDP gap between Pakistan and East Asia Region, 1990/2006
Figure 2: Olley and Pakes decomposition in levels by state (FY07)
Figure 3: Mixed Olley and Pakes decomposition by state (FY07)
Figure 4: Demeaned mixed Olley and Pakes decomposition in Pakistan and comparators
Figure 6: Kernel density estimate of productivity densities (I)
Figure 7: Kernel density estimate of productivity densities (II)
Figure 8: Kernel density estimate of productivity densities (III)
Figure 9: Kernel density estimate of productivity densities, only oligopolies (IV)
Figure 10: Number of oligopolies by size and industry
Figure 11: Box-plots of share of sales by market type and market structure
Figure 12: Box-plots of productivities by share of sales
Figure 13: Mixed Olley and Pakes decompositions with and without outliers
Figure 14: Demeaned Mixed Olley and Pakes decompositions with and without outliers
Figure 15: IC Percentage contributions to aggregate log-productivity (manufacturing FY07)
Figure 16: Percentage change in aggregate productivity (TFP) from a 20% improvement of IC variables (manufacturing FY07)
Figure 17: IC percentage contributions to average log-employment (manufacturing FY07)
Figure 18: IC percentage contributions to average log-wage (manufacturing FY07)
Figure 19: IC percentage contributions to the probability of exporting (manufacturing FY07)
Figure 20: IC percentage contributions to the probability of receiving FDI (manufacturing FY07)
Figure 21: IC percentage contributions to the O&P decomposition of labor productivity (services FY07)
Figure 22: IC percentage contributions to the probability of having a productivity (TFP) increase in terms of IC variables (manufacturing panel FY02-FY07)
Figure 23: Weight of each block of IC variables on aggregate productivity, average productivity and allocative efficiency, by contributions and by simulations (manuf. FY07)
Figure 24: Weight of each block of IC variables on the sample means of economic performance measures (manuf. FY07)

Tables included in the appendix

Table 1.1: General information on plant level and production function (productivity) variables for analysis of manufacturing sector
Table 1.2: General information on plant level and labor productivity variables for analysis of services sector
Table 2.1: Definition of IC variables: infrastructure
Table 2.2: Definition of IC variables: economic governance
Table 2.3: Definition of IC variables: finance
Table 2.4: Definition of IC variables: innovation and competition
Table 2.5: Definition of IC variables: labor markets and skills
Table 2.6: Definition of IC variables: corporate governance
Table 2.7: Definition of IC variables: other control variables
Table 3.1: Number of observations and response rate (in parentheses) of infrastructure variables
Table 3.2: Number of observations and response rate (in parentheses) of economic governance variables
Table 3.3: Number of observations and response rate (in parentheses) of finance variables
Table 3.4: Number of observations and response rate (in parentheses) of innovation and competition variables
Table 3.5: Number of observations and response rate (in parentheses) labor markets and skills variables
Table 3.6: Number of observations and response rate (in parentheses) of corporate governance variables
Table 3.7: Number of observations and response rate (in parentheses) of other control variables
Table 4.1: Missing values and outliers in productivity and labor productivity figures before the cleaning process and percentage in parenthesis (FY07 manufacturing, FY02-FY07 panel and FY07 services)
Table 4.2: Representativeness by industry and state in the sampling frame and in the complete case, 2007 manufacturing ICS
Table 4.3: Representativeness by industry and state in the sampling frame and in the complete case, 2007 services ICS
Table 5.1: Missing values and outliers in productivity and labor productivity figures after the cleaning process (FY07 manufacturing, FY02-FY07 panel and FY07 services)
Table 5.2: Representativeness by industry and state in the sampling frame and in the sample with replacement of missing values in FY07, 2007 manufacturing ICS
Table 5.3: Representativeness by industry and state in the sampling frame and in the sample with replacement of missing values in FY07, 2007 services ICS
Table 5.4: Patterns of missing values in production function variables
Table 5.5: Pattern of missing values in India by key IC variables (% of missing values in PF vars with respect to categories of IC vars)
Table 5.6: Number of missing values in production function variables by size
Table 5.7: Representativity of sampling frame, complete case and sample with replacement in India
Table 5.8: Percentage of observations available due to missing values, by industry and region
Table 6.1: Robust IC elasticities and semi-elasticities with respect to productivity – OLS Estimation (manuf. FY07)
Table 6.2: Further robustness; IC elasticities and semi-elasticities with respect to productivity – Random effects estimation (manuf. FY07 & FY06)
Table 6.3: Further robustness; IC elasticities and semi-elasticities under different replacement procedures of missing data (manuf. FY07)
Table 7: IC percentage contributions to aggregate log-productivity (manufacturing FY07)
Table 8.1: IC elasticities and semi-elasticities with respect to employment – IV Estimation (manuf. FY07)
Table 8.2: IC elasticities and semi-elasticities with respect to employment – IV Estimation (manuf. FY07)
Table 8.3: IC elasticities and semi-elasticities with respect to wages – IV Estimation (manuf. FY07)
Table 8.4: IC linear probability coefficients with respect to the probability of exporting – IV Estimation (manufac. FY07)
Table 8.5: IC linear probability coefficients with respect to the probability of receiving FDI – IV Estimation (manufacturing FY07)
Table 9: IC elasticities and semi-elasticities with respect to labor productivity (services FY07)
Table 10: IC effects on the probability of productivity increase between FY02 and FY07

1 Introduction.

The institutional, social, political and economic arrangements a society employs in the production of goods and services (the investment climate, for our purposes) have an important impact on economic growth and living standards, particularly in emerging and transition economies like Pakistan. Identifying the significant IC bottlenecks on economic performance at the firm level is a key step in disentangling and understanding the causes of the differences between rich and poor countries.

[Figure 1: Decomposition of GDP gap between Pakistan and East Asia Region, 1990/2006. Line chart in percentages, 1991 to 2006, with three series: GDP per capita gap, labor productivity gap, workforce participation gap.]
Note: The GDP gap can be decomposed into a labor productivity gap and a workforce participation gap as (Y1/N1)/(Y2/N2) = [(Y1/L1)/(Y2/L2)] * [(L1/N1)/(L2/N2)], where Y1 is the GDP of country 1, Y2 the GDP of country 2, N is total population, and L is labor force.
Source: Authors' calculations with World Development Indicators, World Bank, 2008.

The GDP per capita gap (divergence) between Pakistan and the East Asia region has increased dramatically during the last twenty years, as the declining path in Figure 1 illustrates. In 1990 Pakistan's per capita GDP was 90% of that of the East Asia region; by 2006 it had fallen to almost 40%.
Clearly, the causes of the diverging process in living standards in Pakistan can be found in the poor evolution of output per worker, which fell from 140% of the labor productivity of the East Asian region in 1990 to 60% in 2006, revealing the striking weakness of Pakistan's economy in terms of competitiveness since 1990.¹

¹ The well-known decomposition of GDP per capita into the product of labor productivity and workforce participation was measured relative to other countries from the East Asia region.

A good candidate to explain the diverging evolution of labor productivity in Pakistan is the ability of the economy to produce the maximum possible output with a given amount of inputs, or total factor productivity (TFP).² Other candidates are the capital-labor ratio, the capacity of firms to enter and successfully compete in global markets, or the capacity of factor markets to integrate the supply of labor in the production of goods and services. The objective of this paper is to explore econometrically the role of the investment climate in the GDP per capita and labor productivity differences illustrated in Figure 1, through the relation of the IC with economic performance at the firm level.

² Differences in labor productivity have traditionally been attributed to differences in TFP in the literature; see Cole et al (2004), Caselli (2005), Hall and Jones (1999), and Klenow and Rodríguez-Clare (1997).

The main idea of the methodology is to use the rich and complete set of information contained in the investment climate surveys (ICSs) of Pakistan to perform the econometric estimation and, at the same time, to address most of the econometric difficulties that arise with firm-level survey data. Table A summarizes the econometric methods applied in the paper.

Table A: Summary of econometric methods

Dataset: Manufacturing 2007 [cross-section 2007 and with recall data]
  i. Econometric model used to identify IC effects
     Left hand side or dependent variables: productivity; employment; wages; probability of exporting; probability of receiving FDI
     Right hand side or explanatory variables: 140 investment climate variables + other controls and more than 20 industry/region/size controls + simultaneous effects
  ii. Evaluation of IC contributions: IC contributions to aggregate productivity through average productivity and allocative efficiency; IC contributions to the sample means of employment, wages, probability of exporting and probability of receiving FDI

Dataset: Services 2007 [cross-section 2007 and with recall data]
  i. Econometric model used to identify IC effects
     Left hand side or dependent variables: labor productivity
     Right hand side or explanatory variables: 110 investment climate variables + other controls and 10 industry/region/size controls
  ii. Evaluation of IC contributions: IC contributions to the Olley and Pakes decomposition of aggregate labor productivity

Dataset: Manufac. 2007/02 panel
  i. Econometric model used to identify IC effects
     Left hand side or dependent variables: probability of productivity increase between 2002 and 2007
     Right hand side or explanatory variables: 120 investment climate variables + other controls and 20 industry/region/size controls in 2002
  ii. Evaluation of IC contributions: IC contributions to the sample mean of the probability of productivity increase

The core of the methodology is a structural system of equations with the investment climate variables as right hand side (explanatory) variables and productivity, employment, wages, probability of exporting and probability of receiving FDI as endogenous or left hand side variables. In the methodological aspects of the estimation of the system we follow Escribano and Guasch (2005 and 2008) and Escribano et al (2008b).
Once we have identified the significant IC effects on economic performance we evaluate the IC productivity contributions in terms of the Olley and Pakes (1996) decomposition of aggregate productivity and the IC contributions in terms of the sample means of the remaining economic performance measures (see Escribano et al. 2008a and Escribano, Guasch and Pena, 2008).

We also exploit the information of the ICSs of Pakistan's services sector on labor productivity (following Escribano and de Orte, 2008) and explore the role of the IC in the productivity change of the manufacturing sector between fiscal years 2002 (FY02 in what follows) and 2007 (FY07). In a second step we evaluate the IC contribution to the Olley and Pakes decomposition of labor productivity and the IC contribution to the probability of having productivity increases between 2002 and 2007.

The results of the econometric analysis show a clear relation between IC and economic performance in Pakistan. We find that infrastructure factors such as the quality of power supply, together with access to finance and informality, are significantly associated with the large differences in productivity observed in the sample of establishments used. In addition, we observe that firms that are larger in terms of market share cope better with the bottlenecks imposed by the investment climate and are able to benefit more from the positive aspects of the IC. The analysis of the services sector and the FY02-FY07 panel also confirm the importance of the IC for economic performance.
Despite the methodological difficulties inherent in the econometric estimation of the IC effects, we believe that these empirical regularities offer a valuable first insight into where the critical IC bottlenecks affecting the GDP per capita convergence of Pakistan lie.

The econometric methodology used is explained in the remaining sections of the paper. In particular, in section two we explain the cleaning procedures applied to the database in terms of the treatment of missing values, outliers and measurement errors, following Escribano and Pena (2009). Section three focuses on the methodology applied to estimate the IC effect on productivity and section four describes the econometric methodology for the remaining economic performance measures. Section five describes how to assess the IC contributions to the terms of the Olley and Pakes productivity decomposition and to the sample means of the remaining economic performance measures. In sections six and seven we present the econometric methodology applied to the analysis of services and to the panel of manufacturing firms during fiscal years 2002 and 2007 respectively. Section eight briefly describes the empirical regularities found in terms of the IC effects. Section nine describes the IC contributions to the corresponding sample means of the economic performance measures and to the O&P productivity decompositions. Finally, section 10 concludes. Most of the tables with the results of the data cleaning process and of the estimation of IC effects are included in a large appendix at the end of the paper.

2 Data: the Investment Climate Surveys of Pakistan.

We have three different IC data sets for the analysis. The first dataset comes from a stratified random sample of manufacturing firms, with industry and region as stratification variables.
In order to ensure a large enough number of large establishments in the sample, a sampling approach that oversampled large firms was applied. The result is a sample of 784 manufacturing establishments. The second dataset consists of 151 services establishments. The sampling scheme follows the same statistical methodology described for the manufacturing sector.

In both the manufacturing and services samples we use proper weighting to correct for the oversampling of large firms when doing descriptive analysis. However, for the regression analysis we use unweighted estimation: the stratification is not based on the dependent variable of the regression, and we control for it by adding firm size dummies to the estimation.³

³ In econometrics we often prefer to follow a structural approach, assuming that the parameters do not change across strata. In that case, stratification causes no complications and unweighted estimation is used. A major problem appears when the stratification is based on values of the dependent variable, for instance when low income groups are oversampled and the dependent variable is income. However, we believe this does not apply to the case of the Pakistan ICSs. There is no problem if stratification is based on a particular regressor and the dependent variable is oversampled only indirectly, when our interest lies in the regression of y on x. Therefore, provided that the conditional expectation of y given x is correctly specified, we can add dummy variables to control for the stratification variables. See Cameron and Trivedi (2005) for more details.

On the other hand, clustering generally leads to standard error estimates that appreciably understate the true standard errors, and can sometimes even lead to inconsistencies, unless an adjustment is made for clustering.
To correct for this, apart from the conventional heteroskedasticity-robust White standard error estimators, we compute cluster-robust standard errors, allowing for correlation within industry and region.⁴

Third, we also take advantage of the 2002 Pakistan ICS in order to build a dynamic panel. However, important differences between the approach taken in the 2002 survey and that of the 2007 survey limit the range of panel-type analyses to some degree. In particular, as the frame in 2002 was not representative of the population, and given that no weighting was done in 2002, inferences made from the panel analysis have to be interpreted with care, and the predictions are not statistically representative of the population in either FY02 or FY07. However, they could be indicative of productivity changes related to the investment climate over the five-year period.

The Pakistan 2007 ICS was also designed to include up to 600 firms from the original sample, out of a total of 846 establishments surveyed in 2002. The remaining 246 establishments were kept as potential replacements in case of non-response by an establishment of similar characteristics in the original panel sample. In the end, the FY02-FY07 panel contains only 402 interviews out of 795 firms contacted.

The final sample of 402 firms is clearly not representative of the population of establishments in either FY02 or FY07, and could suffer from endogenous sample selection of firms. The initial set of 600 candidate establishments for the panel is in turn a non-representative subset of the original 2002 sample of firms. Out of the 600 firms, 402 were located and interviewed, probably those that did not exit the market and continued operating over the five-year period. Therefore the sub-sample is likely to over-represent the more profitable and successful firms.
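As a rough illustration of this selection concern, the following sketch simulates a population of firms in which the chance of still operating (and hence of being re-interviewed) rises with productivity. The logistic survival rule, the distribution of log-productivity and all function and variable names are assumptions made purely for illustration; nothing here is estimated from the Pakistan ICS data.

```python
import math
import random


def simulate_survivor_panel(n_firms=50000, seed=1):
    """Compare mean log-productivity of all firms with that of survivors.

    Hypothetical setup: log-productivity is standard normal, and the
    probability that a firm survives to the second survey wave increases
    with its productivity through a logistic rule (an assumption).
    """
    rng = random.Random(seed)
    productivity = [rng.gauss(0.0, 1.0) for _ in range(n_firms)]

    # Keep a firm in the panel with probability 1 / (1 + exp(-productivity)).
    survivors = [
        p for p in productivity
        if rng.random() < 1.0 / (1.0 + math.exp(-p))
    ]

    mean_all = sum(productivity) / len(productivity)
    mean_survivors = sum(survivors) / len(survivors)
    survival_rate = len(survivors) / n_firms
    return mean_all, mean_survivors, survival_rate


mean_all, mean_survivors, survival_rate = simulate_survivor_panel()
print(f"mean log-productivity, all firms: {mean_all:.3f}")
print(f"mean log-productivity, survivors: {mean_survivors:.3f}")
print(f"share of firms that survive:      {survival_rate:.3f}")
```

Under these assumptions the surviving firms have a visibly higher average productivity than the full population, which is why sample means computed on a panel of survivors should not be read as population quantities.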
In spite of all these sampling problems, we believe that the FY02-FY07 panel is useful for finding certain empirical regularities in the evolution of the investment climate constraints in Pakistan and their effects on firms five years later. Clearly, we will rely on the representative sample of the 2007 ICS in order to do inference about the population.

⁴ In order to check robustness to other patterns of clustering, we also correct for correlation within industries, and within regions. The results are robust to the cluster used (results are available upon request).

2.1 Description of the data sets based on the three IC Surveys of Pakistan

For the econometric analysis of the IC effects on economic performance we start with a flexible structural system of equations. The dependent variables of the system are, in the case of the manufacturing sector, productivity (TFP), demand for labor, real wages, probability of exporting and probability of receiving FDI; in the case of services we use labor productivity instead of TFP. Apart from the endogenous variables that capture the simultaneous structure of the system, the explanatory or right hand side variables are, for both the manufacturing and services sectors, the investment climate variables plus the dummy variables controlling for industry, region and firm size effects.

Table 1.1 describes the variables used to generate the dependent variable of each of the equations of the system and the industry, region and firm size dummies from the manufacturing ICSs. Table 1.2 describes the variables used to generate the dependent variables in the case of the services sector, where we use labor productivity and not TFP.

Tables 2.1 to 2.7 include the list of IC variables used, along with a brief description of how they are measured.
We classify them in seven broad blocks or IC categories: 1) infrastructure, 2) economic governance, 3) finance, 4) innovation and competition, 5) labor market and skills, 6) corporate governance and 7) other control variables.

Although the Investment Climate surveys are very valuable instruments for improving our understanding of the economic, social, political and institutional constraints affecting economic growth, they have some problems in terms of the quality of the information provided: measurement errors, outliers and missing observations. The ICSs of Pakistan are no exception: missing cells of data, outlier observations and measurement errors are frequent.

Tables 3.1 to 3.7 show the response rates of the IC variables in the three samples considered. There is a wide dispersion in the response rates, ranging from a scarce 5% to a complete 100%. In our econometric models we want to use the large set of IC variables. Missing even a small number of observations in each IC variable would imply losing a considerable portion of firms from the original sample frame, with negative consequences in terms of sample representativeness and efficiency. Therefore, it is important to apply some statistical treatment of missing observations, as will become clear later on.

The same problem is observed in the variables used to construct the different measures of productivity (TFP and labor productivity), which are among the dependent variables of our regression analysis. Table 4.1 shows the number of missing observations in productivity (TFP) for the manufacturing sector (panels A and B) and in labor productivity for services (panel C).
We also include the number of outliers, defined as those cross-sectional observations with ratios of materials to sales (electricity cost plus communication cost to sales in the case of services) and/or labor cost to sales larger than one.

In the case of the 2007 manufacturing ICS (Table 4.1, panel A) the information on capital and employment is completely missing in FY06. Moreover, capital stock data are missing for almost half of the establishments in FY07. However, the information on the rest of the variables needed to measure TFP is missing for less than 5% of the establishments in FY07 and for less than 7% in FY06. Therefore, the percentages of useful observations (those available for regression analysis in the complete deletion case) are 45% in FY07 and 0% in FY06, a considerable reduction of observations with respect to the complete sample.

Table 4.1, panel B, shows the available observations in the FY02-FY07 panel. Again, the problem is in the capital stock figures for FY07. Excluding missing observations and outliers from the analysis would imply using only 54% of the sample in FY07 and 94% in FY02.

Panel C of Table 4.1 concentrates on services. The rental cost of capital is reported by less than 35% of the interviewed establishments in FY07 and FY06; this dramatically reduces the share of available firms to only 26% for FY07 and FY06.

The reduced number of observations available in the complete deletion case has important consequences in terms of representativeness and efficiency losses for the manufacturing and service sectors. Tables 4.2 and 4.3 show the number of observations in the sampling frame and in the complete deletion case by industry and state for the manufacturing and service sectors.
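The outlier rule and the complete deletion count described in this passage can be sketched as follows. This is only a minimal illustration: the record layout, the field names and the four toy firms are hypothetical, and the sketch assumes that sales are strictly positive whenever they are reported.

```python
REQUIRED_FIELDS = ("sales", "materials", "labor_cost", "capital")  # assumed names


def is_outlier(sales, materials, labor_cost):
    """Flag an observation whose materials or labor cost exceeds its sales."""
    return materials / sales > 1.0 or labor_cost / sales > 1.0


def usable_share(records):
    """Share of records that survive complete deletion.

    A record is dropped if any required field is missing (None) or if it
    is flagged as an outlier by the ratio rule above.
    """
    usable = 0
    for rec in records:
        if any(rec.get(name) is None for name in REQUIRED_FIELDS):
            continue  # missing input: cannot compute productivity
        if is_outlier(rec["sales"], rec["materials"], rec["labor_cost"]):
            continue  # cost-to-sales ratio larger than one
        usable += 1
    return usable / len(records)


# Toy records, invented for illustration only.
firms = [
    {"sales": 100.0, "materials": 60.0, "labor_cost": 20.0, "capital": 80.0},
    {"sales": 100.0, "materials": 120.0, "labor_cost": 20.0, "capital": 50.0},
    {"sales": 100.0, "materials": 50.0, "labor_cost": 30.0, "capital": None},
    {"sales": 200.0, "materials": 90.0, "labor_cost": 40.0, "capital": 150.0},
]
share = usable_share(firms)
print(f"usable share under complete deletion: {share:.2f}")
```

In this toy sample the second firm is dropped as an outlier and the third for missing capital, so half of the records remain. The same logic, applied variable by variable, is what shrinks the usable sample so sharply when capital stock is widely unreported.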
From Tables 4.2 and 4.3 it is clear that the representativeness of the original sample is modified in the complete case, especially in the Baluchistan and NWFP regions.

Apart from the clear implications in terms of representativeness and efficiency, as Escribano and Pena (2009) signal, the missing data problem may also have important implications for the consistency of the IC parameter estimates. To see this, suppose a simple linear conditional expectation model that relates $y$ to a set of explanatory variables $x_1, x_2, \dots, x_n$ and a random error term,

$$ y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_n x_{ni} + u_i . \tag{1.1} $$

Let the pattern of missing observations be indicated by $s_i$, where $s_i = 1$ if we observe the ith observation and $s_i = 0$ if it is a missing observation. Following Wooldridge (2007), the regression model (1.1) with missing observations becomes

$$ s_i y_i = s_i \beta_0 + \beta_1 s_i x_{1i} + \beta_2 s_i x_{2i} + \dots + \beta_n s_i x_{ni} + s_i u_i . \tag{1.2} $$

Rubin (1976) rigorously defined the different missing data mechanisms (MDM from now on) we can find. In essence, when the MDM is correlated with the dependent variable of (1.2) we have a problem of self-selection (sample selection) and we say that the MDM is non-ignorable. In this case least squares, applied either to the complete deletion case or to the sample with replacement, will lead to biased and inconsistent parameter estimates. Other estimators, such as the Heckman selection model, should be applied in this context.

When the MDM is uncorrelated with any other variable of (1.1) we say that it is ignorable and, in the terminology of Rubin (1976), that the data are missing completely at random (MCAR).
In this case least squares on (1.2) is consistent because $E(s_i u_i) = 0$ and $E[(s_i x_{ji})(s_i u_i)] = E[s_i x_{ji} u_i] = 0$ for all $j = 1, 2, \dots, n$. However, a more efficient least squares estimator can be applied to the sample with some replacement (imputation) of missing values.

When the pattern of missing values is determined only by the explanatory variables of (1.1) (for instance, when the missing values follow patterns by time, size, industry, region, exporter/non-exporter status, domestic/foreign ownership, etc.) we say that the data are missing at random (MAR). In this case, for consistency we also need

$$ E(s_i u_i \mid s_i x_{1i}, s_i x_{2i}, \dots, s_i x_{ni}) = s_i E(u_i \mid s_i x_{1i}, s_i x_{2i}, \dots, s_i x_{ni}) = 0 . $$

That is, we need to control for any exogenous variable affecting the pattern of missing values.

Escribano and Pena (2009) point out that in general MAR is a plausible assumption in the context of ICSs, where missing data appear to be highly correlated with several IC variables such as auditing or accountability, informality and corruption, and also with the capacity of the firms: firms engaged in R&D, quality, innovation of new products and technologies, and operating in more demanding and competitive export markets, tend to report fewer missing values.

In the estimation of the structural models of this report we follow Escribano and Pena (2009) and assume that the MDM is missing at random (MAR), so that we can use the large set of IC variables in Tables 2.1-2.7 to control for the MDM. Notice that once we have controlled for all these variables, we can estimate (1.2) in the complete case consistently, although at the cost of losing efficiency.

In order to check the sensitivity of the results to possible non-ignorability of the missing values (sample selection) we also suggest estimating the model with Heckman's selection model.
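The practical difference between the three mechanisms can be seen in a small simulation. The sketch below is only illustrative: the data generating process ($y = 1 + 2x + u$ with standard Normal $x$ and $u$), the three deletion rules and the sample size are assumptions chosen for the example, not features of the Pakistan ICS.

```python
import random


def ols_slope(xs, ys):
    """Slope of a one-regressor least squares fit with intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, seed=0):
    """Complete-case OLS slopes under three hypothetical missing data mechanisms."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = 1.0 + 2.0 * x + rng.gauss(0, 1)  # true slope is 2
        data.append((x, y, rng.random()))
    rules = {
        "MCAR": lambda x, y, r: r < 0.5,           # observed at random
        "MAR": lambda x, y, r: x < 0.0,            # observed depending on the regressor
        "non-ignorable": lambda x, y, r: y < 1.0,  # observed depending on the outcome
    }
    slopes = {}
    for name, keep in rules.items():
        kept = [(x, y) for x, y, r in data if keep(x, y, r)]
        slopes[name] = ols_slope([x for x, _ in kept], [y for _, y in kept])
    return slopes


slopes = simulate()
for name, b in slopes.items():
    print(name, round(b, 3))
```

Under the first two rules the complete-case slope should stay close to 2, because selection is unrelated to $u$ once $x$ is controlled for; under the third rule selection depends on $y$ and the slope is attenuated, which is the case where a selection model is needed.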
The underlying assumption of the Heckman selection model is that the MDM is endogenous and is therefore correlated with the dependent variables of our regression model (productivity or sales, as we will see later on). As Escribano and Pena (2009) mention, the condition to study here is to what extent the pattern of missing values has to do with the productivity or sales of the firms, or whether it can be explained by firms' attributes and other IC characteristics of the firms.

2.2 Cleaning of data or imputation methods.

The question of interest now is whether we should do something about the missing observations or simply delete those firms with missing observations in any of the regression variables (complete deletion case). Operating with the complete case is in general acceptable if incomplete cases attributable to missing data comprise a small percentage, say 5% or less, of the sample (Schafer, 1997), and when the complete case preserves the representativeness of the original sampling frame. In addition, in models with a large number of regressors the missing data problem may encourage analysts to leave out of the regression explanatory variables with a high proportion of missing values. As Cameron and Trivedi (2005) point out, this practice may be misleading as it leads to an omitted variables problem, which is more serious than the missing data problem per se.

We implement a method to impute the missing observations in the regression variables of Tables 1.1, 1.2 and 4.1 in several steps based on the EM algorithm. First, for manufacturing we replace missing data on employment and capital stock in FY06, and for the services sector we impute the missing values of the rental cost of capital.
Second, for the remaining missing values of the right-hand-side variables of the system, for both manufacturing and services, we apply a simple and direct replacement method from Escribano and Pena (2009). The methodology for missing observations in IC variables is explained in more detail in section 3.

Imputation of missing values in employment and capital stock for 2006/05 manufacturing ICS

The information for employment and capital is completely missing in the Pakistan manufacturing ICS for FY06; see Table 4.1. Therefore, for the estimation of the IC effects on economic performance we only use information for FY07; however, in order to apply panel data estimators as a further robustness check, we also impute the missing values of capital stock and employment in FY06. For example, we recursively estimate the missing values of the capital stock from the information available on the replacement cost and on net investment in machinery and equipment. The perpetual inventory method establishes that the capital stock at moment $t$ is given by $K_{it} = K_{it-1}(1-\delta) + I_{it}$. By inverting this formula we obtain the value of capital at moment $t-1$ as $K_{it-1} = (K_{it} - I_{it})/(1-\delta)$, where $K_{it}$ is approximated by the net book value of machinery and equipment (NBVC), $I_{it}$ is net investment in machinery and equipment, and $\delta$ is the depreciation rate.5

5  The depreciation rate used is 15 percent, a standard percentage commonly applied in other works. Other percentages were also used in order to check robustness.

Pakistan IC surveys (ICSs) provide data on employment for FY07 and FY04, but not for FY06. The replacement method is explained graphically in the figure at the right. 60% of establishments reported the same employment figures in FY07 and FY04, 37%
reported a slightly higher number of employees in FY07 than in FY04 and only 3% reported a reduction of employees from FY04 to FY07. Taking into account the low variability in employment observed between both years, we estimate employment in FY06 as a simple interpolation of the data observed in FY04 and FY07.

Imputation of missing values in rental cost of capital in 2006 services ICS

Rental cost of capital is critical for the econometric analysis of the services sector because it proxies the price of capital. Unfortunately, information on this variable is missing for more than 65% of the establishments of the sample. Out of this percentage, more than 63% of the firms report a rental cost of capital equal to zero, and only the remaining 2% are missing values per se. When the value reported is equal to zero the establishment is not renting land/buildings, equipment or furniture. In these cases we estimate the rental cost of capital based on the information available on the area of the premises occupied by the firm in square feet, the industry, the state and the size.

In particular, we propose an EM algorithm based on the following population model:

$$ \begin{pmatrix} J_1 \\ J_{Mis} \end{pmatrix} = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \beta + \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} , $$

where $J$ is the rental cost of capital and $X$ is the matrix of explanatory variables, in our case the area of the premises and industry, region and size dummies. The sample is split into the $N_1$ available observations and the $N_2$ missing observations. The algorithm chooses the candidate values to replace the missing cells of the rental cost of capital that maximize the likelihood function conditional on the vector of parameters of that model.
In particular, our EM algorithm consists of the following steps (see Cameron and Trivedi (2005) for more details):
i) estimate $\hat{\beta}$ using the $N_1$ available observations;
ii) generate $\hat{J}_{Mis} = X_2 \hat{\beta}$;
iii) in order to mimic the distribution of $J_1$, generate adjusted values $\hat{J}^{a}_{Mis} = (\hat{V}^{1/2} \odot \hat{J}_{Mis})\, u_m$, where $u_m$ is a Monte Carlo draw from the $N(0, s^2)$ distribution, $s^2$ being the variance of $u_1$; an estimate of $V$ can be obtained as $\hat{V}(\hat{J}_{Mis}) = \hat{V}(J \mid X_2) = s^2 (I_{N_2} + X_2 [X_1' X_1]^{-1} X_2')$, and $\odot$ denotes element-by-element multiplication;
iv) using the augmented sample, obtain a revised estimate of $\hat{\beta}$;
v) repeat steps i) to iv) until convergence is achieved, in the sense that the change in the sum of squared residuals becomes arbitrarily small.

Note that steps iii) and iv) are simply random draws from the conditional distributions of $J$ given $\beta$ in the case of step iii), and of $\beta$ given $s^2$ in the case of step iv). The estimated conditional distributions of the rental cost of capital after and before the replacement method are shown in the figure at the right.

[Figure: Kernel density of rental cost of capital after and before replacement (series lr_original and lr_new).]

Replacement of remaining missing values

When the MDM is ignorable, the objective of the replacement methods is not to augment the sample size, but to preserve the sample representativeness, to gain efficiency in the estimation and to retrieve for the analysis a larger number of firms from very expensive interviews.
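Returning to the EM-type imputation of the rental cost of capital, steps i) to v) can be sketched in a deliberately simplified form. The sketch assumes a single regressor (the area of the premises) instead of the full matrix $X$, replaces the variance adjustment of step iii) with plain additive $N(0, s^2)$ noise, and holds the Monte Carlo draws fixed across iterations so that the loop settles; the function names and the simulated data are hypothetical, and this is not the authors' implementation.

```python
import random


def ols(xs, ys):
    """Intercept and slope of a one-regressor least squares fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b


def ssr(xs, ys, a, b):
    """Sum of squared residuals of the fitted line."""
    return sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))


def em_impute(x_obs, j_obs, x_mis, seed=0, tol=1e-10, max_iter=200):
    """Simplified regression-based imputation loop (assumption-laden sketch)."""
    rng = random.Random(seed)
    a, b = ols(x_obs, j_obs)                                  # step i
    s = (ssr(x_obs, j_obs, a, b) / (len(x_obs) - 2)) ** 0.5   # residual std. dev.
    noise = [rng.gauss(0, s) for _ in x_mis]                  # Monte Carlo draws, held fixed
    xs = x_obs + x_mis
    prev = None
    for _ in range(max_iter):
        # steps ii-iii (simplified): fitted value plus additive noise
        j_mis = [a + b * x + e for x, e in zip(x_mis, noise)]
        a, b = ols(xs, j_obs + j_mis)                         # step iv: augmented sample
        cur = ssr(xs, j_obs + j_mis, a, b)
        if prev is not None and abs(cur - prev) < tol:        # step v: stop when SSR settles
            break
        prev = cur
    return j_mis, (a, b)


# Hypothetical data: rental cost rises with area; the last 100 firms have no reported cost.
rng = random.Random(42)
area = [rng.uniform(1, 10) for _ in range(300)]
cost = [2.0 + 0.5 * x + rng.gauss(0, 0.3) for x in area]
imputed, (a_hat, b_hat) = em_impute(area[:200], cost[:200], area[200:])
print(len(imputed), round(b_hat, 2))
```

Adding noise to the fitted values is what keeps the imputed series from being artificially smooth, which is the purpose the text gives for step iii).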
Our method of imputing missing data, which we call the ICA method, shares the expectation step of the Expectation-Maximization (EM) algorithm proposed in the seminal paper of Dempster, Laird and Rubin (1977). In particular, the replacement strategy starts from the expectation of the production function variables conditional on the industry, region and size to which the corresponding observation belongs ('expectation step'). In other words, we replace the missing value by the expectation of the distribution of the variable conditional on the information on sector, region and size, according to the following equation:

$$ E(J_{it} \mid D_{R,it}, D_{I,it}, D_{S,it}) = \gamma_0 + \gamma_{R,J}' D_{R,it} + \gamma_{I,J}' D_{I,it} + \gamma_{S,J}' D_{S,it}, \qquad J = Y, L, M, K, LC \tag{1.3} $$

where $Y$, $L$, $M$, $K$ and $LC$ represent output, labor, materials, capital and labor cost, and $D_R$, $D_I$ and $D_S$ are vectors of region, industry and size dummies respectively. We choose (1.3) so that it reflects the special features of the IC datasets, since in the IC surveys industry, region and firm size are the variables commonly used to stratify the sample.

After excluding from the replacement process6 those observations with all the production function variables missing, the imputed values replacing the incomplete data (missing observations) are given by

$$ \hat{J}_{it} = \hat{\gamma}_0 + \hat{\gamma}_{R,J}' D_{R,it} + \hat{\gamma}_{I,J}' D_{I,it} + \hat{\gamma}_{S,J}' D_{S,it}, \qquad J = Y, L, M, K, LC . \tag{1.4} $$

In this ICA method we assume first that each imputed variable can be approximated by a linear function of the variables used to stratify the sample (dummies of industry, region and size).

The second condition for multiple imputations to work well is that all the variables, including those replaced and those used to replace, have Normal distributions (see Allison, 2001).
Although these are strong assumptions, it has been shown that multiple imputation methods seem to work well even when the variables have distributions that are not Normal; see Schafer (1997). In addition, as we have already pointed out, we also control for any explanatory variable correlated with the pattern of missing values ($s_i$) to get consistency when estimating the parameters of the regression models with IC variables.

When the two assumptions mentioned above (Normality and linearity of imputed variables on dummies of industry, region and size) do not hold, the replacement strategy is risky. In those cases, we can understand our replaced variables as the classical problem of variables measured with error. To illustrate this, let our model be given by $y_i = x_i \beta + u_i$, where $y_i$ represents sales and $x_i$ is a vector of inputs. Suppose that in the population $E(u_i \mid x_i) = 0$, and that $x_i$ is missing when $i \in S$. When we predict $x_i$ for all $i \in S$ such that $x_i = \hat{x}_i + v_i$, where $\hat{x}_i$ is our predicted value, the model becomes $y_i = \tilde{x}_i \beta + v_i \beta + u_i$, where $\tilde{x}_i = x_i$ and $v_i = 0$ when $i \notin S$, while $\tilde{x}_i = \hat{x}_i$ and $v_i = x_i - \hat{x}_i$ when $i \in S$. Therefore, consistency of the estimates of $\beta$ depends on whether $E(v_i \mid \tilde{x}_i) = 0$. For example, if we use the region-industry-size average of the plant-level IC variables instead of the IC variables themselves, this will generate a consistent procedure if the plant-level observations on the IC variables are noisy measures of the true level of the IC variables and we average over enough observations. The procedure will generate inconsistent estimates if the effect of these variables differs by firm (giving us a $v_i$), the interfirm variance in the variables correctly reflects that variance, and firms that have higher levels of the IC variables are more productive and hence employ more of one or more factors of production.7

6  The ICA method is conservative in the sense that we do not replace missing cells for those observations with all but one PF variable unobserved. We force the industry-region-size cells to have at least 18 values to estimate the sample average consistently. Moreover, in order to avoid biases caused by outlier observations we use the within-group median instead of the within-group mean.

Evaluation of missing data mechanism

We briefly discuss the pattern of missing values (missing data mechanism) observed in the IC survey of Pakistan for FY07. Table 5.4 explores the pattern of missing values in production function variables. This table makes clear the particular pattern of missing values we have to deal with. The most common situation is that in which we lack information on capital but observe the remaining production function figures. This situation comprises 44.8% of total observations for FY07.

Table 5.5 explores the relation between having a missing value in at least one of the PF variables and other key IC variables related to firm performance (as the correlations with sales and TFP show). From this table it is clear that the missing data mechanism has to do with the capacity and performance of the firm. For instance, having missing values is more common among firms lacking a power generator, having power or water outages or suffering crime losses. Conversely, a firm is less likely to have a missing value if it uses e-mail, the internet or external auditing, has access to a loan or credit line, or introduces new product improvements.
This negative relation between missing values and firm performance, capacity, innovation and formality is supported by the correlation between the IC variables and the number of missing values reported in the last column of Table 5.5. Table 5.6 shows that missing values in PF figures depend on the size of the firm.

Tables 5.4, 5.5 and 5.6 make patent the need for a method to deal with the missing data mechanism, as it is correlated with various key IC variables. The results after the replacement method proposed above are reported in Tables 5.7 and 5.8. In the complete case (complete deletion case) the representativeness of the original sampling frame is modified, as the percentage of useful observations varies by size, region and industry (the variables used to stratify the population). After the replacement mechanism the percentages of useful observations are more consistent with those of the sampling frame.

7  We thank Ariel Pakes for this suggestion.

Summary of results of the cleaning (imputation) process of the data base

Tables 5.1, 5.2 and 5.3 summarize the results of the cleaning process. Table 5.1 shows the number of observations available for regression analysis once we have imputed values for certain missing observations. After this cleaning process we are able to use 93.4% of the whole sample in the 2006 manufacturing ICS, almost 90% in the panel and 95% in the services ICS. From Tables 5.2 and 5.3 it is clear that the sample with replacement is consistent with the proportions of the original sampling frame in terms of representativeness.
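The cell-based replacement of equation (1.4), together with the safeguards of footnote 6 (within-cell median, a minimum of 18 observed values per cell, no replacement for firms with all or all but one PF variables missing), can be sketched as follows. The record layout, the helper `firm` and the toy values are hypothetical, and the example lowers the minimum cell size only so that a five-firm sample can exercise the rule.

```python
from statistics import median

PF_VARS = ("Y", "L", "M", "K", "LC")  # output, labor, materials, capital, labor cost


def ica_impute(firms, min_cell=18):
    """Fill missing PF variables with the industry-region-size cell median."""
    out = [dict(f) for f in firms]
    for var in PF_VARS:
        # Observed values of this variable, grouped by stratification cell.
        cells = {}
        for f in firms:
            if f[var] is not None:
                cells.setdefault((f["industry"], f["region"], f["size"]), []).append(f[var])
        for f, g in zip(firms, out):
            n_missing = sum(f[v] is None for v in PF_VARS)
            # Conservative rule: skip firms with all, or all but one, PF variables missing.
            if f[var] is None and n_missing < len(PF_VARS) - 1:
                values = cells.get((f["industry"], f["region"], f["size"]), [])
                if len(values) >= min_cell:
                    g[var] = median(values)
    return out


def firm(Y, L, M, K, LC):
    """Hypothetical record; the cell labels are illustrative only."""
    return {"industry": "textiles", "region": "Punjab", "size": "small",
            "Y": Y, "L": L, "M": M, "K": K, "LC": LC}


sample = [firm(100, 10, 40, 10, 20), firm(120, 12, 50, 20, 25), firm(90, 9, 30, 30, 18),
          firm(110, 11, 45, None, 22),         # only capital missing: imputed
          firm(95, None, None, None, None)]    # all but one missing: left untouched
filled = ica_impute(sample, min_cell=3)
print(filled[3]["K"], filled[4]["L"])
```

With the default `min_cell=18` nothing would be imputed in this toy sample, which is the intended behaviour when a cell is too thin to estimate its centre reliably.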
3 Econometric estimation of IC elasticities and semi-elasticities on productivity (TFP) (analysis for manufacturing only)

In the identification of the significant investment climate effects on economic performance (productivity, demand for labor, real wages, probability of exporting and probability of receiving FDI) it is important to condition on the whole set of information contained in the IC survey. In particular, we propose a simultaneous equations system that captures the interactions between the investment climate variables and firms' economic performance measures. Following Escribano and Guasch (2005, 2008), we relate IC and C variables to firm-level productivity (TFP) through the following system of equations with fixed effects:8

$$ \log Y_{it} = \alpha_L \log L_{it} + \alpha_M \log M_{it} + \alpha_K \log K_{it} + \log TFP_{it} \tag{3.1a} $$

$$ \log TFP_{it} = a_i + \alpha_{DR}' D_r + \alpha_{Ds}' D_j + \alpha_{DT}' D_t + \alpha_P + w_{it} \tag{3.1b} $$

$$ a_i = \alpha_{IC}' IC_{P,i} + \alpha_C' C_{P,i} + \varepsilon_i \tag{3.1c} $$

where $Y$ is firms' output (sales), $L$ is employment, $M$ denotes intermediate materials, $K$ is the capital stock, $IC$ and $C$ are vectors of time-fixed investment climate and control variables, and $D_r$, $D_j$ and $D_t$ are the vectors of state, industry and year dummies.

8  In the ICS of Pakistan we have data on production function variables for two years, FY06 and FY07. However, as we pointed out in section 2, the information for FY06 is flawed and plagued by missing information. Although for the complete analysis we rely only on the information available for FY07 (cross section), we keep the time sub-index t for the other models.
The usually unobserved time fixed effects ($a_i$) of the TFP equation (3.1b) are here proxied by the set of observed time-fixed components, the IC and C variables of (3.1c), and a remaining unobserved random effect ($\varepsilon_i$). The two random error terms of the system, $\varepsilon_i$ and $w_{it}$, are assumed to be conditionally uncorrelated with the explanatory L, M, K, IC and C variables9 of equation (3.2),

$$ \log Y_{it} = \alpha_L \log L_{it} + \alpha_M \log M_{it} + \alpha_K \log K_{it} + \alpha_{IC}' IC_{P,i} + \alpha_C' C_{P,i} + \alpha_{DR}' D_r + \alpha_{Ds}' D_j + \alpha_{DT}' D_t + \alpha_P + u_{it} \tag{3.2} $$

Therefore, the regression equation (3.2) represents the conditional expectation plus a composite random-effect error term equal to $u_{it} = \varepsilon_i + w_{it}$.

Before introducing the remaining equations of the system we explain the main econometric issues that we have to address in the estimation of productivity (TFP) equations.

3.1 Robustness of IC elasticities and semi-elasticities: single step and two step estimation, restricted and unrestricted input-output elasticities.

By simply plugging (3.1c) into (3.1b) we get the following expression for productivity:

$$ \log TFP_{it} = \alpha_{IC}' IC_{P,i} + \alpha_C' C_{P,i} + \alpha_{DR}' D_r + \alpha_{Ds}' D_j + \alpha_{DT}' D_t + \alpha_P + u_{it} \tag{3.3} $$

where IC and C are, respectively, the observable fixed-effects vectors of investment climate and control variables listed in Tables 2.1 to 2.7 of the Appendix. In the regressions, we always control for several region dummies ($D_r$, r = 1, 2, ..., R), sector-industry dummies ($D_j$, j = 1, 2, ..., qD), a constant term ($\alpha_P$) and, in the panel data case, a set of time dummies ($D_t$, t = 1, 2, ..., qT).
Since there is no single salient measure of productivity (or $\log TFP_{it}$), any empirical evaluation of the productivity impact of the IC might critically depend on the particular productivity measure used. Escribano and Guasch (2005, 2008) suggested, following the literature on sensitivity analysis of Magnus and Vasnev (2006), looking for empirical results (elasticities) that are robust to several productivity measures. This is also the approach we follow in this paper.

9  Under this formulation (and other standard conditions) the OLS estimator of the productivity equation (3.2) with robust standard errors is consistent, although a more efficient estimator (GLS) is given by the random effects (RE) estimator that takes into consideration the particular covariance structure of the error term, $\varepsilon_i + w_{it}$, which introduces a particular type of heteroskedasticity in the regression errors of (3.2).

In particular, we want the elasticities of IC on productivity (TFP) to be robust (with equal signs and similar magnitudes) across the 6 different productivity measures used. The alternative productivity measures used come from considering:
a) different functional forms of the production function (Cobb-Douglas and Translog),
b) different sets of assumptions (technology and market conditions) to get consistent estimators based on Solow's residuals, ordinary least squares (OLS), random effects (RE), and so on,
c) different aggregation levels when measuring input-output elasticities (industry level or aggregate country level).

Table B: Summary of productivity (P) measures and estimated investment climate (IC) elasticities

Functional form of production function | Estimation procedure | Aggregation level of coefficients of PF | Result
1. Solow's residual | Two-step estimation | 1.1 Restricted coefficients; 1.2 Unrestricted coefficients | 2 (TFP) measures; 2 (IC) elasticities
2. Cobb-Douglas | Single-step estimation | 2.1 Restricted coefficients; 2.2 Unrestricted coefficients | 2 (TFP) measures; 2 (IC) elasticities
3. Translog | Single-step estimation | 3.1 Restricted coefficients; 3.2 Unrestricted coefficients | 2 (TFP) measures; 2 (IC) elasticities
Total | | | 6 (TFP) measures and therefore 6 estimates of IC elasticities (or semi-elasticities)

Note: Restricted coefficient = equal input-output elasticities in all industries. Unrestricted coefficient = different input-output elasticities by industry.

Table B above summarizes the productivity measures used for the robust IC evaluation. The two-step estimation starts from the nonparametric approach based on cost shares from Hall (1990) to obtain Solow's residuals in logs under two different assumptions:10 (a) the cost shares are constant for all plants located in the same country (restricted Solow residual), and (b) the cost shares vary among industries in the same country (unrestricted by industry Solow residual). Once we have estimated the Solow residuals ($\log TFP_{it}$) in the first step, in the second step we can estimate equation (3.3) by OLS with robust standard errors, allowing for clustered correlation within industries and states. For further robustness we use the available panel data (FY06 and FY07) for productivity and production function variables and also estimate (3.3) by RE.

In the single-step estimation approach, we start with the OLS parametric estimation (and RE for the case of the FY06-FY07 panel) of the extended production function (3.2). We use two different functional forms of the PF (Cobb-Douglas and Translog) under two different aggregation conditions on the input-output elasticities: equal input-output elasticities in all industries (restricted case) and different input-output elasticities by industry (unrestricted case).
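The first step of the two-step approach can be sketched as below. The text only states that Solow residuals are built from cost shares, constant across the country (restricted) or varying by industry (unrestricted); how each share is computed here (each cost item over the sum of labor, materials and capital costs, averaged over plants) and the field names are assumptions made for illustration.

```python
import math


def cost_shares(firms):
    """Average cost shares of labor, materials and capital over a group of firms."""
    keys = ("labor_cost", "materials_cost", "capital_cost")
    shares = []
    for f in firms:
        total = sum(f[k] for k in keys)
        shares.append([f[k] / total for k in keys])
    n = len(shares)
    return [sum(s[i] for s in shares) / n for i in range(3)]


def solow_residuals(firms, by_industry=False):
    """log TFP = log Y - sL*log L - sM*log M - sK*log K, with group-level shares."""
    def group(f):
        return f["industry"] if by_industry else "all"

    groups = {}
    for f in firms:
        groups.setdefault(group(f), []).append(f)
    shares = {g: cost_shares(members) for g, members in groups.items()}
    residuals = []
    for f in firms:
        s_l, s_m, s_k = shares[group(f)]
        residuals.append(math.log(f["Y"]) - s_l * math.log(f["L"])
                         - s_m * math.log(f["M"]) - s_k * math.log(f["K"]))
    return residuals


# Hypothetical plants in two industries (values are illustrative only).
plants = [
    {"industry": "A", "Y": 1000.0, "L": 10.0, "M": 100.0, "K": 10.0,
     "labor_cost": 50.0, "materials_cost": 30.0, "capital_cost": 20.0},
    {"industry": "B", "Y": 100.0, "L": 10.0, "M": 10.0, "K": 10.0,
     "labor_cost": 20.0, "materials_cost": 50.0, "capital_cost": 30.0},
]
restricted = solow_residuals(plants)
unrestricted = solow_residuals(plants, by_industry=True)
print(restricted, unrestricted)
```

The second step would then regress either series on the IC, C and dummy variables as in equation (3.3); comparing the resulting IC coefficients across the two series is one of the robustness checks summarized in Table B.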
3.2 Endogeneity of production function (PF) variables.

There is an identification issue in separating TFP from the PF when any PF input is influenced by unobserved common causes affecting productivity, such as a firm's fixed effects. This creates simultaneous equation bias if least squares is used to estimate equation (3.1a) to measure TFP. However, this endogeneity problem of the inputs is overcome by using the single-step least squares estimation of equation (3.2), following the approach proposed by Escribano and Guasch (2005, 2008). That is, in (3.2) we proxy the usually unobserved firm-specific fixed effects (which are the main cause of the inputs' endogeneity) by a long list of observed firm-specific fixed effects coming from the investment climate surveys. Controlling for the largest set of IC variables and plant C characteristics, we can, under standard regularity conditions, get consistent and unbiased least squares estimators of the parameters of the PF and the corresponding IC elasticities on TFP in one step.

10  The advantage of the Solow residuals is that they require neither the inputs (L, M, K) to be exogenous nor the input-output elasticities to be constant or homogeneous (Escribano and Guasch, 2005 and 2008). The drawback is that they require constant returns to scale (CRS) and, at least, competitive input markets.

Notice that even if we were only interested in assessing the impact of one block of IC variables, say infrastructure, we would not limit the scope of the analysis to that block alone. We include (and therefore control for) IC factors from all the blocks because of the crucial role IC variables play as proxies for the unobserved fixed effects. This is the key feature of the Escribano and Guasch (2005, 2008) econometric methodology to provide robust empirical regularities.
If, for example, we try to estimate the impact of, say, infrastructure without controlling for the other blocks of IC variables, we can get different signs on certain coefficients due to the omitted variables problem; see Escribano and Guasch (2008).

3.3 Role of prices on production function (sales generating functions and market power).

The role of prices in the system (3.1a)-(3.1c) deserves special attention. As our dependent variable is sales, rather than units of physical output, it reflects prices. In fact, according to the current literature, the term sales generating function seems more appropriate than production function for equation (3.1a), as in the work of Olley and Pakes (1996). If prices are not identical across firms, what seems to be a highly productive plant may just be an establishment that is charging high prices, which in turn may be a consequence of either market power (non-zero mark-ups) or differences in the quality of final goods. With homogeneous products, high measured productivity could be a reflection of high prices, or in other words a reflection of market power (Melitz, 2000; Bernard, et al., 2003; Katayama, et al., 2006; Foster et al, 2008). Under heterogeneous or differentiated products, high prices could be a consequence of higher quality, which could translate into over-measured productivity, as some plants would be able to produce higher-quality (and higher-price) products with the same amount of output (Levinsohn and Melitz, 2002; de Loecker, 2007; Katayama, et al., 2006; Gorodnichenko, 2007). These points are especially important in developing countries, where market power is usually a severe constraint to growth. Addressing these issues is not a straightforward task with the data available.
A more comprehensive analysis would need information on plant-level input prices to incorporate the demand side of the model.

As long as these data are not available, a plausible solution is to estimate the system (3.1a)-(3.1c) following a control approach. Now, instead of observing output ($Y$) we observe sales ($P_y Y$), where $P_y$ denotes prices, and equation (3.1a) is transformed into (3.1a')

$$ \log Y_{it} + \log P_{y,it} = \log P_{y,it} + \alpha_L \log L_{it} + \alpha_M \log M_{it} + \alpha_K \log K_{it} + \log TFP_{it} \tag{3.1a'} $$

Notice that as long as we control for $\log P_y$ on the right-hand side of equation (3.1a'), productivity on the RHS of the equation is still $\log TFP$. Since within a year there is low price variability at the firm level, we assume that $\log P_y$ can be proxied by a constant term, control variables C that are time-fixed firm-level variables, and a set of dummy variables $D_r$, $D_j$ and $D_t$, the vectors of state, industry and year dummies. Therefore, after including all those variables we could assume that

$$ \log P_{y,it} = \alpha_{IC}' IC_{P,i} + \alpha_C' C_{P,i} + \alpha_{DR}' D_r + \alpha_{Ds}' D_j + \alpha_{DT}' D_t $$

and therefore we can get an expression similar to (3.2) incorporating prices,

$$ \log Y_{it} + \log P_{y,it} = \alpha_L \log L_{it} + \alpha_M \log M_{it} + \alpha_K \log K_{it} + \alpha_{IC}' IC_{P,i} + \alpha_C' C_{P,i} + \alpha_{DR}' D_r + \alpha_{Ds}' D_j + \alpha_{DT}' D_t + \alpha_P + u_{it} \tag{3.2'} $$

Estimating sales in (3.2'), as we do in our empirical analysis, can provide evidence that TFP can be "interpreted" as "technical efficiency".11 Finally, to control for the mark-up (market power effect) and/or quality (differentiated products) we also include several IC and C variables related to competition (see the list of IC variables included in the group of other control variables).
In the empirical section we also compare productivities by market structure (monopolies, duopolies, oligopolies or fragmented markets), market type (local, national, international), size, sector and state. We believe that, in addition to the control approach followed, this type of analysis may help us identify whether the measured productivities may be driven by mark-ups and/or differentiated products rather than by differences in efficiency.

11  Notice, however, that the term technical efficiency is too narrow in the IC context, since many of the efficiencies that IC variables contribute to TFP are not technical (regulatory, governance, institutional, etc.).

3.4 Endogeneity of IC variables.

Another econometric problem we have to face when estimating the parameters of IC and C variables, from either the two-step or the single-step procedure, is the possible endogeneity of some of these explanatory variables. That is, many IC variables are likely to be determined simultaneously with any TFP measure. With these productivity equations, the traditional instrumental variable (IV) approach is difficult to implement, given that we only have information for one year and therefore cannot use natural instruments such as those provided by their own lags. As an alternative correction for the endogeneity of the IC variables, we use the region-industry-size average of plant-level IC variables instead of the crude IC variables,12 which is a common solution in panel data studies at the firm level.13

However, one should avoid including too many industry-region-size variables since this may lead to multicollinearity problems, especially if the number of states, sizes and industries is not large enough and there are common regional and/or industry processes affecting the variables.
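The region-industry-size averaging, with the minimum cell size and fallback order described in footnote 12, can be sketched as follows. The function name, the record layout and the variable `outages` are hypothetical; the default minimum of 15 firms is the lower end of the 15 to 20 range mentioned in that footnote, and the example lowers it only so that a three-firm sample can exercise the fallback.

```python
def cell_average(firms, var, min_cell=15):
    """For each firm, the average of `var` over the finest cell that is large enough."""
    # Fallback order: industry-region-size, then region-industry,
    # then industry-size, then region-size.
    levels = [("industry", "region", "size"), ("region", "industry"),
              ("industry", "size"), ("region", "size")]
    tables = []
    for keys in levels:
        groups = {}
        for f in firms:
            if f[var] is not None:
                groups.setdefault(tuple(f[k] for k in keys), []).append(f[var])
        tables.append((keys, groups))
    averages = []
    for f in firms:
        value = None  # stays None when no cell is large enough
        for keys, groups in tables:
            values = groups.get(tuple(f[k] for k in keys), [])
            if len(values) >= min_cell:
                value = sum(values) / len(values)
                break
        averages.append(value)
    return averages


# Hypothetical sample: the "large" firm is alone in its finest cell.
sample_ic = [
    {"industry": "textiles", "region": "Punjab", "size": "small", "outages": 1.0},
    {"industry": "textiles", "region": "Punjab", "size": "small", "outages": 3.0},
    {"industry": "textiles", "region": "Punjab", "size": "large", "outages": 8.0},
]
ic_avg = cell_average(sample_ic, "outages", min_cell=2)
print(ic_avg)
```

Because the averaged variable is constant within a cell, adding many such variables with few distinct cells makes them close to linear combinations of the region, industry and size dummies, which is the multicollinearity risk noted above.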
So a proper a priori consideration of the endogeneity of IC and C variables is important.

Using industry-region-size averages also mitigates the effect of missing individual IC observations at the plant level, which, as mentioned in Section 2, represent one of the most important difficulties in using ICSs. We also follow a second strategy to deal with the missing values of some IC and C variables. In order to keep as many observations in the regressions as possible and avoid losing efficiency, when the response rate of a variable is large enough we replace its missing observations with the corresponding industry-region-size average.14 Thus we gain observations, efficiency and representativity, possibly at the cost of introducing some measurement error into the explanatory variables.15

For those variables whose endogeneity is intrinsic to the construction of the simultaneous system of equations (the probability of exporting and the probability of receiving FDI inflows) we apply standard IV estimators (2SLS), using as instruments either the industry-region-size averages or those exogenous IC variables from the list of explanatory covariates of the corresponding equation.

12 For the creation of cells a minimum number of firms is imposed: there must be at least 15 to 20 firms in each industry-region-size cell to create the average; otherwise we apply the region-industry average. If the problem persists, we apply the industry-size or the region-size average.

13 This two-step estimation approach is a simplified version of an instrumental variable estimator (two-stage least squares, 2SLS).

14 Notice that this replacement strategy has a straightforward weighted least squares interpretation, since we are giving greater weight to those observations with more variance (Escribano et al., 2008b).

15 Depending on the assumption we make, the measurement error may introduce a downward bias in the parameters that depends on the ratio between the variances of the variables and of the measurement error. Since those explanatory variables are constant within regions, sizes and industries, we expect their variances to be small.

Unfortunately, endogeneity is still an unsettled issue in ICSs. Techniques that allow causal interpretations, such as those derived from the concept of ‘Granger causality’ or experimental or quasi-experimental methods, are infeasible in the actual context of IC surveys, with cross-sectional datasets or incomplete panels with a very short time dimension. Although the solutions proposed in this report can reduce the degree of endogeneity of both IC and PF variables, they do not allow us to place causal interpretations on the results obtained. Rather, we have to satisfy ourselves with empirical regularities in the relationships between IC variables and measures of firms’ economic performance.

3.5 Selection of the relevant models.

The econometric methodology applied for the selection of the variables (IC and C) goes from the general to the specific. The omitted variables problem that we would otherwise encounter, starting from a too-simple model, generates biased and inconsistent parameter estimates. We start the selection of IC variables with a wide set of up to 160 variables. We avoid including simultaneously in the regression explanatory IC variables that provide similar information (highly correlated), which mitigates the multicollinearity problem that could otherwise arise.
We then remove the least significant variables from the regressions one by one, until we obtain the final set of IC variables, significant in at least one of the alternative TFP regressions and with parameters varying within a reasonable range of values. Once we have selected a preliminary model we test for omitted IC variables (those initially dropped).

The robust TFP effects obtained for the IC and C variables, along with their levels of significance, are listed in Table 6.1 of the appendix at the end of the report. The Table also indicates the form in which each variable enters the regression (industry-region-size average, missing values replaced by the industry-region-size average, logs, etc.). In all cases we use cluster standard errors.

3.6 Further robustness: two years FY06-FY07 panel.

In the estimation of the (2.1a)-(2.1c) system of equations of Table 6.1, we used cross-section data from the ICSs of FY07. We initially dropped the data on PF variables for FY06 because they are more affected by missing observations and measurement errors. We believe that these problems could introduce unnecessary noise into the estimated results, in spite of the usual advantages of a panel data structure (efficiency gains from additional panel data estimators such as random effects). However, as a further robustness check we also apply the estimation procedures described in sections 3.1 to 3.4 to the FY06-FY07 panel, adding the random effects estimator. The results, reported in Table 6.2, are similar. In all cases we use cluster standard errors.

3.7 Estimation under alternative replacement processes for missing data.

A second set of robustness results is obtained by using different imputation mechanisms to replace missing values in the production function variables.
We allow for different assumptions on the missing data mechanism (MDM): missing at random (MAR) or non-ignorable MDM; for more information on MDM see, for instance, Little and Rubin (1987). As pointed out in section two of this paper, the simple imputation mechanism that we usually apply in ICSs (called the ICA method) is based on the conditional expectation of each missing production function variable given the firm’s industry, region and size. Here we analyze the robustness of the results under alternative imputation methods, briefly described below; for a complete explanation of these imputation methods see Escribano and Pena (2009).

We first describe the alternative replacement or imputation mechanisms to the ICA method that rely on the missing at random (MAR) assumption:

a) Bootstrap ICA method: If we use imputed observations as if they were real data, the resulting regression standard error estimates will in general be too low, and inference may find too many significant variables. This is because the uncertainty in the estimation of the parameters of the imputation model is ignored. Conventional formulas for standard errors do not correct for the fact that certain observations were imputed rather than observed. To correct for this, a plausible solution is to compute bootstrap estimates of the standard errors of the estimated coefficients of equation (3.3). The idea is to create ‘r’ replications of the original sample using industry and region as strata. In the next step, for each replication, we estimate equation (1.4) by least squares and replace the missing observations before the new estimation of equation (3.3).
From the resulting bootstrap empirical distribution of the estimators of equation (3.3), after several iterations of the replacement of the missing values, we obtain the estimated bootstrap standard errors.

b) EM algorithm on industry, state, size variables. Dempster, Laird and Rubin (1977) introduced the EM imputation algorithm, which has been widely applied in a variety of contexts. Basically, the EM algorithm imputes missing data conditional on a given model, and consequently chooses the candidate values to replace the missing cells that maximize the likelihood function conditional on the vector of parameters of that model. The purpose here is to apply all the steps of the EM algorithm to the problem at hand, using as model the standard regression model applied in the ICA replacement method; that is, using the industry, region (state) and size variables as covariates of the regression model. Notice that the estimation of the IC elasticities and semi-elasticities is achieved by following an iterative procedure: i) first, we apply the EM algorithm to replace the missing cells in sales, labor, capital and materials (used to compute productivity) using the procedure discussed in section 1.2 to replace missing data; ii) we compute the corresponding productivity measure (the restricted Solow residual); iii) we estimate equation (3.3) under the new imputed values of the missing observations.

c) EM algorithm on industry, state, size variables and production function variables. In this case we extend the set of covariates of the regression model (1.4) to include the production function variables. We follow the same iterative procedure described in case b).

d) EM algorithm on equation (3.3).
Now we simply apply the EM algorithm to equation (3.3) and estimate the parameters of the model in a single step by maximizing the log-likelihood function resulting from (3.3) and the MDM. The procedure is slightly different from that in cases b) and c): missing values are replaced and the IC parameters are estimated jointly, provided the population models used to replace missing data and to estimate the IC effects on productivity are the same. Notice that the EM algorithm is always iterative; several repetitions of the replacement procedure are needed until the likelihood function converges to the conditional maximum.

We also propose mechanisms to deal with the missingness problem when we assume that the missing data mechanism (MDM) is correlated with the dependent variable of our model (non-ignorable MDM). In these cases one can implement the Heckman (1976) method (Heckit) to correct for self-selection, since OLS applied either to the complete deletion case or to the sample with replacement is inconsistent. The probability of selection (of observing the data) is modeled with the same IC variables of the TFP regression model plus other investment climate variables, to solve the identification problem.

We also estimate equation (3.3) in the complete deletion case; that is, using only the available information on PF variables without replacement (dropping all firms with missing observations from the sample). For consistency in this case we need the missing completely at random (MCAR) assumption, unless we correct for the correlation between the MDM and the covariates of the model by controlling for those IC variables correlated with the MDM.
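The cell-average replacement that underlies the ICA method (and the fallback rule of footnote 12) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names, the dictionary layout of a firm record, the order of the fallback cells and the toy numbers are hypothetical, and the minimum cell size is counted here over firms with an observed value, which the text does not specify.

```python
from collections import defaultdict


def cell_means(firms, keys, var, min_cell):
    """Average of `var` by cell, keeping only cells with enough observed firms."""
    groups = defaultdict(list)
    for firm in firms:
        if firm[var] is not None:
            groups[tuple(firm[k] for k in keys)].append(firm[var])
    return {cell: sum(v) / len(v) for cell, v in groups.items() if len(v) >= min_cell}


def impute_ica(firms, var, min_cell=15):
    """Replace missing `var` by the cell average, moving to coarser cells if needed.

    Assumed fallback order: industry-region-size, industry-region,
    industry-size, region-size. Firms with no usable cell stay missing.
    """
    hierarchy = [
        ("industry", "region", "size"),
        ("industry", "region"),
        ("industry", "size"),
        ("region", "size"),
    ]
    tables = [(keys, cell_means(firms, keys, var, min_cell)) for keys in hierarchy]
    filled = []
    for firm in firms:
        new = dict(firm)
        new["imputed"] = False
        if new[var] is None:
            for keys, means in tables:
                cell = tuple(new[k] for k in keys)
                if cell in means:
                    new[var] = means[cell]
                    new["imputed"] = True
                    break
        filled.append(new)
    return filled


# Toy data (hypothetical); min_cell is lowered to 2 so the example stays small.
firms = [
    {"industry": "textiles", "region": "A", "size": "small", "sales": 10.0},
    {"industry": "textiles", "region": "A", "size": "small", "sales": 20.0},
    {"industry": "textiles", "region": "A", "size": "small", "sales": None},
    {"industry": "textiles", "region": "A", "size": "large", "sales": 60.0},
    {"industry": "textiles", "region": "A", "size": "large", "sales": None},
    {"industry": "food", "region": "B", "size": "small", "sales": None},
]
filled = impute_ica(firms, "sales", min_cell=2)
```

The `imputed` flag is kept so that later steps (for instance a bootstrap that repeats the replacement in every replication) can tell observed from replaced values.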
In order to compare the results of the ICA method with alternative replacement procedures we apply all these imputation models to equation (3.3), using only the ICSs data for FY07. The results from the seven alternative imputation procedures used are reported in Table 6.3 and are very robust, as we mention later on.

4 Econometric analysis of IC and productivity impact on employment, real wages, probability of exporting and probability of receiving FDI.

The same idea of approximating the unobservable fixed effect by the firm-level investment climate conditions is applied in the remaining equations of the model.

The demand for labor is determined by firm-level productivity (logTFPit) and by real wages in logs (logWit), and is given by

\[ \log L_{it} = \alpha_L + a_{L,i} + \alpha_P \log TFP_{it} + \alpha_w \log W_{it} + \alpha_{Exp}\, y^{Exp}_{it} + \alpha_{FDI}\, y^{FDI}_{it} + \alpha'_{DR} D_r + \alpha'_{Ds} D_j + \alpha'_{DM} D_m + \alpha'_{DT} D_t + \varepsilon_{L,it} \tag{4.1a} \]

\[ a_{L,i} = \alpha'_{IC}\, IC_i^{L} + \alpha'_{C}\, C_i^{L} + v_{L,i} \tag{4.1b} \]

The wage equation is determined by the productivity (TFP) level after controlling for all the IC effects and by the fact that certain firms export and receive FDI:

\[ \log W_{it} = \alpha_W + a_{W,i} + \alpha_P \log TFP_{it} + \alpha_{Exp}\, y^{Exp}_{it} + \alpha_{FDI}\, y^{FDI}_{it} + \alpha'_{DR} D_r + \alpha'_{Ds} D_j + \alpha'_{DM} D_m + \alpha'_{DT} D_t + \varepsilon_{W,it} \tag{4.2a} \]

\[ a_{W,i} = \alpha'_{IC}\, IC_i^{W} + \alpha'_{C}\, C_i^{W} + v_{W,i} \tag{4.2b} \]

The probability of firms entering the export market depends on firm-level productivity (TFP), on the investment climate and on the fact that certain firms receive FDI:

\[ y^{Exp}_{it} = \alpha_{Exp} + a_{Exp,i} + \alpha_P \log TFP_{it} + \alpha_{FDI}\, y^{FDI}_{it} + \alpha'_{DR} D_r + \alpha'_{Ds} D_j + \alpha'_{DM} D_m + \alpha'_{DT} D_t + \varepsilon_{Exp,it} \tag{4.3a} \]

\[ a_{Exp,i} = \alpha'_{IC}\, IC_i^{Exp} + \alpha'_{C}\, C_i^{Exp} + v_{Exp,i} \tag{4.3b} \]

Finally, the probability of receiving foreign direct investment depends on firm-level productivity (TFP), on the investment climate and on the fact that certain firms export:

\[ y^{FDI}_{it} = \alpha_{FDI} + a_{FDI,i} + \alpha_P \log TFP_{it} + \alpha_{Exp}\, y^{Exp}_{it} + \alpha'_{DR} D_r + \alpha'_{Ds} D_j + \alpha'_{DM} D_m + \alpha'_{DT} D_t + \varepsilon_{FDI,it} \tag{4.4a} \]

\[ a_{FDI,i} = \alpha'_{IC}\, IC_i^{FDI} + \alpha'_{C}\, C_i^{FDI} + v_{FDI,i} \tag{4.4b} \]

Each equation has its own coefficients; the equation index is omitted from the coefficients to lighten the notation.

Notice that since the variable $y^{r}_{it}$, with r = Exp or FDI, is a binary random variable taking only the values 0 and 1, $P(y^{r}_{it} = 1 \mid x) = E(y^{r}_{it} \mid x)$. Then: a) the conditional probability is equal to the conditional expectation, which is usually assumed to follow a Probit or a Logit model, and b) the conditional variance (heteroskedasticity) is equal to the product of the conditional probabilities of the two events. In general, linear probability models (LPM) approximate the nonlinear Probit and Logit models well when the variables are evaluated close to their sample means. Since we are interested in the mean IC contribution relative to the mean values of the dependent variables of (4.1a) to (4.4a), we concentrate only on linear probability specifications, like (4.3a) and (4.4a).
The main advantage of the LPM is its simplicity, since the parameters of the explanatory variables of (4.3a) and (4.4a) measure the change in probability when one of the explanatory variables changes, holding the rest of the explanatory variables constant. This is important for the economic interpretation of the coefficients obtained in the empirical section.

By substituting the usually unobserved fixed effects components by their corresponding equations we can simplify the system of equations, including productivity, to:

\[ \log TFP_{it} = \alpha_P + \alpha'_{IC}\, IC_i^{P} + \alpha'_{C}\, C_i^{P} + \alpha_{Exp}\, y^{Exp}_{it} + \alpha_{FDI}\, y^{FDI}_{it} + \alpha'_{DR} D_r + \alpha'_{Ds} D_j + \alpha'_{DM} D_m + \alpha'_{DT} D_t + (v_{P,i} + \varepsilon_{P,it}) \tag{4.5} \]

\[ \log L_{it} = \alpha_L + \alpha_P \log TFP_{it} + \alpha_w \log W_{it} + \alpha_{Exp}\, y^{Exp}_{it} + \alpha_{FDI}\, y^{FDI}_{it} + \alpha'_{IC}\, IC_i^{L} + \alpha'_{C}\, C_i^{L} + \alpha'_{DR} D_r + \alpha'_{Ds} D_j + \alpha'_{DM} D_m + \alpha'_{DT} D_t + (v_{L,i} + \varepsilon_{L,it}) \tag{4.6} \]

\[ \log W_{it} = \alpha_W + \alpha_P \log TFP_{it} + \alpha_{Exp}\, y^{Exp}_{it} + \alpha_{FDI}\, y^{FDI}_{it} + \alpha'_{IC}\, IC_i^{W} + \alpha'_{C}\, C_i^{W} + \alpha'_{DR} D_r + \alpha'_{Ds} D_j + \alpha'_{DM} D_m + \alpha'_{DT} D_t + (v_{W,i} + \varepsilon_{W,it}) \tag{4.7} \]

\[ y^{Exp}_{it} = \alpha_{Exp} + \alpha_P \log TFP_{it} + \alpha_{FDI}\, y^{FDI}_{it} + \alpha'_{IC}\, IC_i^{Exp} + \alpha'_{C}\, C_i^{Exp} + \alpha'_{DR} D_r + \alpha'_{Ds} D_j + \alpha'_{DM} D_m + \alpha'_{DT} D_t + (v_{Exp,i} + \varepsilon_{Exp,it}) \tag{4.8} \]

\[ y^{FDI}_{it} = \alpha_{FDI} + \alpha_P \log TFP_{it} + \alpha_{Exp}\, y^{Exp}_{it} + \alpha'_{IC}\, IC_i^{FDI} + \alpha'_{C}\, C_i^{FDI} + \alpha'_{DR} D_r + \alpha'_{Ds} D_j + \alpha'_{DM} D_m + \alpha'_{DT} D_t + (v_{FDI,i} + \varepsilon_{FDI,it}) \tag{4.9} \]

The composite error term of each equation r of the system, with r = P, L, W, Exp and FDI, has three components: a firm fixed effect, a remaining unobserved firm effect $v_{r,i}$ and an idiosyncratic error $\varepsilon_{r,it}$. The firm fixed effects are approximated by the set of observed time-invariant, firm-level IC and C variables. The remaining unobserved firm effects are assumed to be distributed independently of the IC and C variables, so what remains are random effects ($v_{r,i}$). Therefore, we assume that the error terms $(v_{r,i} + \varepsilon_{r,it})$ are uncorrelated with all the explanatory variables of each equation r, where r = P, Exp, FDI, W and L.
However, for certain explanatory variables this exogeneity condition is not satisfied. The endogeneity of certain IC variables induces a correlation between those IC variables and the errors $(v_{r,i} + \varepsilon_{r,it})$ of the system of equations (4.5) to (4.9) and creates simultaneous equation biases and inconsistencies in least squares estimators such as pooled OLS or random effects (RE). This correlation is in general mitigated by replacing those plant-level IC variables by their region-industry averages ($\overline{IC}_j$). However, for other explanatory variables such as productivity, wages, exports and FDI, the endogeneity is intrinsic to the simultaneous structure of the system of equations. Therefore, we estimate each equation by instrumental variables (IV) techniques (2SLS) using heteroskedasticity-robust standard errors. We could have used 3SLS, which is more efficient than 2SLS under correct specification. However, since with system estimation techniques the misspecification of one equation affects the whole system, we believe that the results from 2SLS are more robust.

Since we instrument the productivity (TFP) variable in the employment, real wages, exports and FDI equations using instruments from the investment climate survey, it is convenient to specify a number of rules for choosing the list of instruments. First, estimation of the system of equations (4.5) to (4.9) by IV techniques is done equation by equation. The productivity equation is at the core of this process and is estimated following the robust procedures of Escribano and Guasch (2005 and 2008). Once we have obtained robust IC and C coefficients for different productivity (TFP) measures, we use the set of significant explanatory variables to instrument productivity in the rest of the equations.
Notice that some of these variables will be used as included instruments, while many others will be excluded instruments, as they may appear as explanatory variables in other equations.

The next step is to obtain a preliminary specification for the remaining equations of the system by OLS with robust standard errors. As in the productivity case, in order to avoid omitted variables problems, the selection of the model goes from the general to the specific. We start selecting the preliminary model from a set of more than 160 IC and C variables, industry, state and year dummies, productivity and a constant term (and real wages in the case of the labor demand equation).

Once we have a preliminary valid model for each equation of the system we start instrumenting productivity. We then remove instruments from the list of excluded instruments, seeking a partial R-squared (or ‘Shea’ partial R-squared) as high as possible, subject to the over-identifying restrictions not being rejected. To test the over-identifying restrictions we use the Hansen test, a variant of the classical Sargan test that is robust to general heteroskedasticity. In addition, we take into account the significance of the first-stage estimates when removing instruments. We also remove instruments from the matrix of included instruments if some of them become insignificant in the process of IV selection.

A similar process is applied when we have to instrument any other simultaneous variable, such as real wages in the labor demand equation, or exports or FDI when they appear as significant explanatory variables in other equations. A strategy that works well is to estimate first by OLS and then switch to IV once we have the set of instruments, which in this case is given by the explanatory variables of the corresponding equation, excluding of course the endogenous covariates.
Then we proceed as in the productivity case, removing instruments, either included or excluded, according to the criteria mentioned before.

Identification of the system of equations

To discuss the identification issues underlying the proposed system of equations it is useful to apply matrix notation. The structural form of the system (4.5)-(4.9) is given by

\[ A y_t + B x_t = u_t \tag{4.10} \]

where $y_t$ is the $5 \times 1$ vector of observations of the dependent variables (log-productivity, $y^{Exp}_{it}$ and $y^{FDI}_{it}$, log-employment and log-wages); $x_t$ is the 140x1 vector of explanatory variables (ICi, Ci, Dr, Dj and Dt); $u_t$ is the $5 \times 1$ vector of errors; $A$ is a $5 \times 5$ matrix of coefficients of the simultaneous dependent variables; $B$ is a 5x164 matrix of coefficients corresponding to the exogenous/endogenous IC and C variables.

In the system (4.5)-(4.9) we are imposing a certain structure; for example, that employment has no direct effect in any other equation of the system and that real wages only affect employment demand, after controlling for all IC and C variables. Therefore, we can explicitly write the first LHS term of (4.10) as

\[
A y_t =
\begin{pmatrix}
1 & a_{P,Exp} & a_{P,FDI} & 0 & 0 \\
a_{Exp,P} & 1 & a_{Exp,FDI} & 0 & 0 \\
a_{FDI,P} & a_{FDI,Exp} & 1 & 0 & 0 \\
a_{L,P} & a_{L,Exp} & a_{L,FDI} & 1 & a_{L,W} \\
a_{W,P} & a_{W,Exp} & a_{W,FDI} & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\log TFP_{it} \\ y^{Exp}_{it} \\ y^{FDI}_{it} \\ \log L_{it} \\ \log W_{it}
\end{pmatrix}
=
\begin{pmatrix}
\log TFP_{it} + a_{P,Exp}\, y^{Exp}_{it} + a_{P,FDI}\, y^{FDI}_{it} \\
y^{Exp}_{it} + a_{Exp,P} \log TFP_{it} + a_{Exp,FDI}\, y^{FDI}_{it} \\
y^{FDI}_{it} + a_{FDI,P} \log TFP_{it} + a_{FDI,Exp}\, y^{Exp}_{it} \\
\log L_{it} + a_{L,P} \log TFP_{it} + a_{L,W} \log W_{it} + a_{L,Exp}\, y^{Exp}_{it} + a_{L,FDI}\, y^{FDI}_{it} \\
\log W_{it} + a_{W,P} \log TFP_{it} + a_{W,Exp}\, y^{Exp}_{it} + a_{W,FDI}\, y^{FDI}_{it}
\end{pmatrix}.
\]

The rank condition is a necessary and sufficient condition for the system (4.10) to be identified.
To discuss whether the rank condition is satisfied in, say, the first equation, let $\alpha$ be the first row of $A$ and $\beta$ the first row of $B$. We may now partition these vectors into two components corresponding to the included ($\alpha_1$ and $\beta_1$) and the excluded ($\alpha_2$ and $\beta_2$) variables in the productivity equation, such that

\[ A = \begin{pmatrix} \alpha_1 & 0 \\ A_1 & A_2 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} \beta_1 & 0 \\ B_1 & B_2 \end{pmatrix}, \]

which allows us to construct the matrix

\[ D = \begin{pmatrix} 0 & 0 \\ A_2 & B_2 \end{pmatrix}. \]

By the rank condition, the productivity equation is identified if $\operatorname{rank}(D) = 5 - 1$. The same holds for the rest of the equations of the system. Thus, although we have several exclusion restrictions in matrix $A$ (in the productivity, wages and employment equations), these restrictions are not enough to ensure that the rank condition is satisfied. To achieve identification, we force the coefficients of certain IC variables to be 0 before estimating the system; for more details on further identification issues see Escribano et al. (2008b).

The empirical IC results based on 2SLS are included in Tables 8.1 to 8.5 of the Appendix. In all cases we found evidence that TFP has a significant and positive impact on employment demand, on real wages, and on the probabilities of exporting and of receiving FDI. Notice that TFP is always significant, even after controlling for IC and other C variables.

5 IC assessment on aggregate productivity (Olley and Pakes decomposition) and other measures of economic performance.

In the second part of the analysis, taking advantage of the robustness of the estimated IC and C elasticities, we concentrate on the TFP measure that comes from the restricted Solow residuals.
Our aim is to evaluate the IC effects on the average productivity and allocative efficiency components of the Olley and Pakes (1996) decomposition (O&P) of aggregate productivity in levels (TFP) and on the mixed O&P decomposition (logTFP).

5.2 O&P decompositions: in levels and mixed.

The O&P decomposition of aggregate productivity in levels is

\[ TFP = \overline{TFP} + N\, \widehat{\operatorname{cov}}(s^{Y}_{it}, TFP_{it}) \tag{5.1a} \]

where $TFP$ is aggregate productivity (or weighted average productivity, where the weights are given by the share of sales), $\overline{TFP}$ is the sample average productivity and the last term is N times the sample covariance between the share of sales and firm-level productivity. This last term is the allocative efficiency term, describing the ability of markets to reallocate resources from less to more productive establishments. Furthermore, we want to exploit the log-linear properties of the following mixed16 O&P decomposition in order to obtain closed-form O&P decompositions in terms of IC and C variables:

\[ \log TFP = \overline{\log TFP} + N\, \widehat{\operatorname{cov}}(s^{Y}_{it}, \log TFP_{it}) \tag{5.1b} \]

Expressions (5.1a) and (5.1b) can easily be applied by industry, state, size, age or for the whole sample. The results of the decomposition by state and at the country level, in levels and in logs, are in Figures 2 and 3 of section 9.

16 It is called mixed Olley and Pakes (O&P) decomposition because in the original O&P decomposition both TFP and the share of sales were in levels, while now TFP in (5.1b) is in logs (log P).

5.3 IC effects on productivity measure in the terms of the mixed O&P decomposition.

The useful additive property of equation (3.3) in logarithms allows us to obtain an exact closed-form solution for the decomposition of aggregate log productivity according to equation (5.1b). Following Escribano et al. (2008a), we can express aggregate log productivity as a weighted sum of the average values of the IC, C and dummy D variables, the intercept and the average productivity residuals ($\overline{\hat{u}}$) from (3.3), plus the sum of the covariances between the share of sales and the investment climate variables IC, C, the dummies D and the productivity residuals ($\hat{u}$):

\[
\begin{aligned}
\log TFP ={}& \hat\alpha'_{IC}\, \overline{IC}_{P} + \hat\alpha'_{C}\, \overline{C}_{P} + \hat\alpha'_{DR}\, \overline{D}_r + \hat\alpha'_{Ds}\, \overline{D}_j + \hat\alpha'_{DM}\, \overline{D}_m + \hat\alpha'_{DT}\, \overline{D}_t + \hat\alpha_{P} + \overline{\hat{u}}_{it} \\
&+ N \hat\alpha'_{IC}\, \widehat{\operatorname{cov}}(s^{Y}_{it}, IC_{P,i}) + N \hat\alpha'_{C}\, \widehat{\operatorname{cov}}(s^{Y}_{it}, C_{P,i}) + N \hat\alpha'_{Ds}\, \widehat{\operatorname{cov}}(s^{Y}_{it}, D_j) + N \hat\alpha'_{DR}\, \widehat{\operatorname{cov}}(s^{Y}_{it}, D_r) \\
&+ N \hat\alpha'_{DM}\, \widehat{\operatorname{cov}}(s^{Y}_{it}, D_m) + N \hat\alpha'_{DT}\, \widehat{\operatorname{cov}}(s^{Y}_{it}, D_t) + N\, \widehat{\operatorname{cov}}(s^{Y}_{it}, \hat{u}_{it})
\end{aligned}
\tag{5.2}
\]

where the set of estimated parameters comes from the two-step TFP estimation, with the restricted Solow residual as the dependent variable of regression equation (3.3).

The contributions of the IC variables to aggregate log-TFP in equation (5.2) can be computed for the whole sample or by industry/sector, state, size, etc. In particular, we compute the IC contributions relative to aggregate productivity as follows:

\[
\begin{aligned}
100 = \frac{100}{\log TFP} \Big[ & \hat\alpha'_{IC}\, \overline{IC}_{P} + \hat\alpha'_{C}\, \overline{C}_{P} + \hat\alpha'_{DR}\, \overline{D}_r + \hat\alpha'_{Ds}\, \overline{D}_j + \hat\alpha'_{DM}\, \overline{D}_m + \hat\alpha'_{DT}\, \overline{D}_t + \hat\alpha_{P} + \overline{\hat{u}}_{it} \\
&+ N \hat\alpha'_{IC}\, \widehat{\operatorname{cov}}(s^{Y}_{it}, IC_{P,i}) + N \hat\alpha'_{C}\, \widehat{\operatorname{cov}}(s^{Y}_{it}, C_{P,i}) + N \hat\alpha'_{Ds}\, \widehat{\operatorname{cov}}(s^{Y}_{it}, D_j) + N \hat\alpha'_{DR}\, \widehat{\operatorname{cov}}(s^{Y}_{it}, D_r) \\
&+ N \hat\alpha'_{DM}\, \widehat{\operatorname{cov}}(s^{Y}_{it}, D_m) + N \hat\alpha'_{DT}\, \widehat{\operatorname{cov}}(s^{Y}_{it}, D_t) + N\, \widehat{\operatorname{cov}}(s^{Y}_{it}, \hat{u}_{it}) \Big]
\end{aligned}
\tag{5.3}
\]

There are several advantages of using equation (5.3). First, we can compare net contributions by isolating the impact of IC variables from the impact of industry dummies, the intercept, and the residuals.
Second, we can split the total effect on aggregate productivity into the part explained only by the IC and C variables (demeaned logTFP) and the part due to the rest (constant term, industry dummies and so on). The empirical results of decomposition (5.3) are in Table 7.

We could also get rid of the different directional effects (positive or negative) of the various IC effects by simply computing the percentage contributions in absolute value. This slight modification allows us to make direct comparisons of the IC absolute percentage contributions (or the weight of each IC variable relative to the total weight of the other IC variables) to aggregate log-productivity, to average log-productivity and to the allocative efficiency term. The results are in Figure 13.

5.4 Simulations based on the IC effects on the O&P decomposition of TFP.

So far, we have exploited the linear properties of the logarithmic form of the mixed O&P decomposition of TFP. However, the original O&P decomposition was done in terms of TFP and the share of sales (in levels). Therefore the O&P decomposition also captures nonlinear relations between market shares and IC variables coming from (5.1a) and equation (3.3). To know to what extent these nonlinear terms affect this relation, we perform simulation experiments17 on INF, IC and C variables, and evaluate the consistency of the results with the ones obtained from the previous mixed O&P decomposition; see (5.3). The IC simulations are done variable by variable (one at a time), keeping the rest of the variables constant; that is, we propose a scenario in which one of the IC variables experiences a 20 percent improvement in all the establishments. We compute the corresponding rate of change of aggregate productivity, average productivity and allocative efficiency caused by such an improvement. We repeat the same experiment for the rest of the IC and C variables and, for comparative purposes, we also evaluate the relative IC effect by group of IC variables.

17 We are indebted to Ariel Pakes for this suggestion.

The resulting simulations of a 20% improvement in IC variables are in Figure 14. A comparison between the simulations and the IC absolute percentage contributions is in Figure 13.

5.5 International comparisons of IC effects on aggregate demeaned productivity.

To make cross-country comparisons based on IC impacts on productivity, avoiding the problem of comparing apples and oranges, it is desirable to create an index (demeaned productivity). After subtracting the mean (that is, the constant term, time effects, industry effects and country-specific effects) from firm-level log-productivity we can concentrate on the part of log-productivity explained by the IC variables. Thus, demeaned log-productivity at the firm level is simply

\[ \text{Demeaned } \log TFP_i = \hat\alpha'_{IC}\, IC_i^{P} + \hat\alpha'_{C}\, C_i^{P} \tag{5.4} \]

With the expression given by (5.4) we can easily compute the O&P decompositions (5.1a) and (5.1b) based on this demeaned part of productivity to make international comparisons of IC impacts on aggregate productivity. The share of aggregate log-productivity attributable to the IC can be found at the end of Table 7, while a comparison of the demeaned productivity in Pakistan with that of other countries is in Figure 4 of section 9.
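The level decomposition (5.1a) and the one-variable-at-a-time simulations of section 5.4 can be illustrated with a short Python sketch. It is not the code used for the report: the function names and the toy numbers are hypothetical, the covariance is taken with the 1/N normalisation (so that N times the covariance closes the identity exactly), and sales shares are held fixed in the simulation, which is an assumption made here for simplicity.

```python
import math


def op_decomposition(sales, productivity):
    """Return (aggregate, average, allocative), with aggregate = average + allocative."""
    n = len(sales)
    total_sales = sum(sales)
    shares = [s / total_sales for s in sales]
    average = sum(productivity) / n
    mean_share = 1.0 / n
    # N times the (1/N-normalised) sample covariance of shares and productivity.
    allocative = sum((s - mean_share) * (p - average)
                     for s, p in zip(shares, productivity))
    aggregate = sum(s * p for s, p in zip(shares, productivity))
    return aggregate, average, allocative


def simulate_change(sales, log_tfp, ic_values, coef, change=0.20):
    """Percentage change of the three level O&P terms when one IC variable
    changes by `change` (20% by default) in every firm.

    Assumptions: log TFP moves by coef * change * ic_value for each firm,
    sales shares stay fixed, and the baseline allocative term is not zero.
    For an IC variable where an improvement means a reduction, pass a
    negative `change`.
    """
    base = op_decomposition(sales, [math.exp(v) for v in log_tfp])
    shifted = [v + coef * change * x for v, x in zip(log_tfp, ic_values)]
    new = op_decomposition(sales, [math.exp(v) for v in shifted])
    return tuple(100.0 * (b1 - b0) / b0 for b0, b1 in zip(base, new))


# Toy example with three firms (hypothetical numbers).
sales = [10.0, 30.0, 60.0]
aggregate, average, allocative = op_decomposition(sales, [1.0, 2.0, 3.0])

log_tfp = [0.0, 0.5, 1.0]
ic_values = [0.5, 1.0, 2.0]   # e.g. an IC variable measured in logs
rates = simulate_change(sales, log_tfp, ic_values, coef=0.3)
```

Applying `op_decomposition` to log productivity instead of its level gives the mixed decomposition (5.1b), and applying it to the fitted IC part of (5.4) gives the demeaned version used for cross-country comparisons.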
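5.6 IC evaluation on the sample means of employment and wages, on the probability of exporting and on the probability of receiving FDI

The objective now is to measure the partial direct effect of each IC variable on each dependent variable measuring economic performance in the system of equations (4.5)-(4.9), at different aggregation levels (aggregate level, by sector, by region, by size of the firm, by age of the firm, etc.). For that purpose, we evaluate the impact of the average IC variable on the sample average values of the dependent variables of the system. In what follows, we substitute all the unknown parameters of the system (4.5) to (4.9) by their corresponding 2SLS estimates.

The labor demand and the wage equations evaluated at the sample means and in relative terms are

\[ 100 = \frac{100}{\overline{\log L}_t} \Big[ \hat\alpha_L + \hat\alpha_P\, \overline{\log TFP}_t + \hat\alpha_w\, \overline{\log W}_t + \hat\alpha_{Exp}\, \bar{y}^{Exp}_t + \hat\alpha_{FDI}\, \bar{y}^{FDI}_t + \hat\alpha'_{IC}\, \overline{IC}^{L} + \hat\alpha'_{C}\, \overline{C}^{L} + \hat\alpha'_{DR}\, \overline{D}_r + \hat\alpha'_{Ds}\, \overline{D}_j + \hat\alpha'_{DM}\, \overline{D}_m \Big] \tag{5.5} \]

\[ 100 = \frac{100}{\overline{\log W}_t} \Big[ \hat\alpha_W + \hat\alpha_P\, \overline{\log TFP}_t + \hat\alpha_{Exp}\, \bar{y}^{Exp}_t + \hat\alpha_{FDI}\, \bar{y}^{FDI}_t + \hat\alpha'_{IC}\, \overline{IC}^{W} + \hat\alpha'_{C}\, \overline{C}^{W} + \hat\alpha'_{Ds}\, \overline{D}_j + \hat\alpha'_{DR}\, \overline{D}_r + \hat\alpha'_{DM}\, \overline{D}_m \Big] \tag{5.6} \]

Since $y^{Exp}_{it}$ and $y^{FDI}_{it}$ are binary variables, evaluating the impact at the sample mean implies evaluating it on the probability (frequency) of exporting and of receiving FDI, respectively. In particular, equations (4.8) and (4.9) relative to the frequency of exporting and of receiving FDI become

\[ 100 = \frac{100}{\hat{P}(Exp_t > 0)} \Big[ \hat\alpha_{Exp} + \hat\alpha_P\, \overline{\log TFP}_t + \hat\alpha_{FDI}\, \bar{y}^{FDI}_t + \hat\alpha'_{IC}\, \overline{IC}^{Exp} + \hat\alpha'_{C}\, \overline{C}^{Exp} + \hat\alpha'_{Ds}\, \overline{D}_j + \hat\alpha'_{DR}\, \overline{D}_r + \hat\alpha'_{DM}\, \overline{D}_m \Big] \tag{5.7} \]

\[ 100 = \frac{100}{\hat{P}(FDI_t > 0)} \Big[ \hat\alpha_{FDI} + \hat\alpha_P\, \overline{\log TFP}_t + \hat\alpha_{Exp}\, \bar{y}^{Exp}_t + \hat\alpha'_{IC}\, \overline{IC}^{FDI} + \hat\alpha'_{C}\, \overline{C}^{FDI} + \hat\alpha'_{Ds}\, \overline{D}_j + \hat\alpha'_{DR}\, \overline{D}_r + \hat\alpha'_{DM}\, \overline{D}_m \Big] \tag{5.8} \]

The results of equations (5.5) to (5.8) are in Figures 7 to 10 of section 9.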
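The arithmetic behind (5.5) to (5.8), and behind the absolute percentage contributions of section 5.3, is simple enough to show with a small Python sketch. The variable names, coefficients and sample means below are hypothetical and chosen only for illustration; the contributions add up to 100 only when the mean of the dependent variable equals the sum of the listed terms, that is, when the residuals average to zero.

```python
def percentage_contributions(coefs, means, mean_dep):
    """Share (in %) of the sample mean of the dependent variable attributed to
    each term: 100 * coefficient * sample mean of the regressor / mean_dep.
    The intercept is treated as a regressor with mean 1."""
    return {name: 100.0 * coefs[name] * means[name] / mean_dep for name in coefs}


def absolute_weights(contributions, ic_names):
    """Weight (in %) of each IC variable in the total absolute contribution
    of the IC variables listed in `ic_names`."""
    total = sum(abs(contributions[name]) for name in ic_names)
    return {name: 100.0 * abs(contributions[name]) / total for name in ic_names}


# Hypothetical estimates and sample means for one equation of the system.
coefs = {"const": 1.0, "log_tfp": 0.5, "days_power_outage": -0.02, "share_internet": 0.4}
means = {"const": 1.0, "log_tfp": 4.0, "days_power_outage": 25.0, "share_internet": 0.5}

# With zero-mean residuals the mean of the dependent variable is the sum of the terms.
mean_dep = sum(coefs[name] * means[name] for name in coefs)

contributions = percentage_contributions(coefs, means, mean_dep)
weights = absolute_weights(contributions, ["days_power_outage", "share_internet"])
```

Signed contributions keep the direction of each effect, while the absolute weights only rank the IC variables by importance.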
6 Econometric methodology for the services sector

The econometric analysis of the services sector is based on the concept of labor productivity, instead of total factor productivity, due to a lack of information on some of the basic inputs of the production function, such as the capital stock.18 Therefore, the analysis of the retail sector of Pakistan is based on the reduced form of the production function. In particular, the log of labor productivity is explained by the input prices (w = wages and r = the rental cost of capital), by the investment climate (IC) variables and other control variables, and by the usual dummy variables for industry, region and size.

The reduced form equation of labor productivity in terms of input prices (w, r), investment climate (IC) variables and other control (C) variables is given by

\[
\log\left(\frac{Y}{L}\right)_{it} = \alpha_0 + \alpha_1 \log w_{it} + \alpha_2 \log r_{it} + \alpha_3 IC_i + \alpha_4 C_i + \alpha_5 D_t + \alpha_6 D_m + \alpha_7 D_s + \alpha_8 D_r + \varepsilon_{it,YL}, \qquad (6.1)
\]

where i = 1, ..., 150 and t = FY06 and FY07. Most of the general econometric issues previously discussed for the manufacturing model also apply to the services sector.

The endogeneity of the explanatory variables of (6.1) is treated by considering the firm-level investment climate (IC) variables as observable fixed effects, as initially suggested in Escribano and Guasch (2005 and 2008). Under standard general conditions, the ordinary least squares (OLS) estimator is a consistent estimator of the parameters of equation (6.1). The remaining problem is the possible endogeneity of certain IC variables. When we have evidence of the endogeneity of certain IC variables, we replace each endogenous IC variable with its corresponding region-industry average.
To control for the usual heteroskedasticity of firm-level data, we use robust standard errors allowing for clustering by sector and region. In the cleaning of the database and the treatment of missing observations we apply the same procedures discussed before for the manufacturing sector.

18 This econometric model is based on Escribano and de Orte (2008). We refer the reader to this paper for more details on the derivation of equation (6.1).

The main difference with respect to the manufacturing case is the way robustness is addressed here. We check the robustness to: a) alternative measures of the input prices used in (6.1), wages and rental cost of capital;19 b) the use of the panel dataset; c) the use of cross-section data; and d) the use of restricted and unrestricted input price elasticities on labor productivity, by industry, by region and by industry/region.

In order to evaluate the contribution of each explanatory variable to labor productivity we apply the O&P decomposition to (log) labor productivity and compute the contribution of each IC variable to aggregate log labor productivity and to its two components: average log labor productivity and allocative efficiency.
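The replacement of an endogenous IC variable by its region-industry average, mentioned above in the treatment of equation (6.1), can be sketched as follows. The firm records, the field names and the variable `days_without_power` are hypothetical; the sketch uses the plain cell average over all firms in the cell, which is an assumption, since the text does not say whether the firm's own observation is excluded.

```python
from collections import defaultdict


def cell_averages(firms, var, keys=("region", "industry")):
    """Average of `var` within each cell defined by `keys`, skipping missing values."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for firm in firms:
        if firm[var] is None:
            continue
        cell = tuple(firm[k] for k in keys)
        sums[cell] += firm[var]
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


def add_cell_average(firms, var, keys=("region", "industry")):
    """Return copies of the firm records with an extra `<var>_avg` field."""
    averages = cell_averages(firms, var, keys)
    out = []
    for firm in firms:
        cell = tuple(firm[k] for k in keys)
        new_firm = dict(firm)
        new_firm[var + "_avg"] = averages.get(cell)
        out.append(new_firm)
    return out


# Hypothetical firm records.
firms = [
    {"region": "Punjab", "industry": "Textiles", "days_without_power": 2.0},
    {"region": "Punjab", "industry": "Textiles", "days_without_power": 4.0},
    {"region": "Sindh", "industry": "Food", "days_without_power": 10.0},
    {"region": "Sindh", "industry": "Food", "days_without_power": None},
]
firms_avg = add_cell_average(firms, "days_without_power")
print([f["days_without_power_avg"] for f in firms_avg])
```

The averaged variable is then used in the regression in place of the firm-level one; a firm with a missing response still receives the average of its cell.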
As in the manufacturing TFP case, we can evaluate the IC contributions to the terms of the O&P decomposition of aggregate labor productivity according to the following decomposition:

\[
\begin{aligned}
100 = \frac{100}{\log (Y/L)_t}\Big[\; & \hat{\alpha}_0 + \hat{\alpha}_1 \overline{\log w}_t + \hat{\alpha}_2 \overline{\log r}_t + \hat{\alpha}_3 \overline{IC} + \hat{\alpha}_4 \bar{C} + \hat{\alpha}_5 D_t + \hat{\alpha}_6 D_m + \hat{\alpha}_7 D_s + \hat{\alpha}_8 D_r + \bar{\hat{\varepsilon}}_{t,YL} \\
& + \hat{\alpha}_1 N_t \widehat{\operatorname{cov}}(s_i^Y, \log w_i) + \hat{\alpha}_2 N_t \widehat{\operatorname{cov}}(s_i^Y, \log r_i) + \hat{\alpha}_3 N_t \widehat{\operatorname{cov}}(s_i^Y, IC_i) + \hat{\alpha}_4 N_t \widehat{\operatorname{cov}}(s_i^Y, C_i) \\
& + \hat{\alpha}_5 N_t \widehat{\operatorname{cov}}(s_i^Y, D_t) + \hat{\alpha}_6 N_t \widehat{\operatorname{cov}}(s_i^Y, D_m) + \hat{\alpha}_7 N_t \widehat{\operatorname{cov}}(s_i^Y, D_s) + \hat{\alpha}_8 N_t \widehat{\operatorname{cov}}(s_i^Y, D_r) + N_t \widehat{\operatorname{cov}}(s_i^Y, \hat{\varepsilon}_{i,YL}) \Big]
\end{aligned}
\qquad (6.2)
\]

The estimation results of equation (6.1) are in Table 9 of the appendix and the results of expression (6.2) are in Figure 11 of section 9.

19 We also tested robustness to the use of electricity cost and communication cost. The results obtained are robust to the input prices used. The results are available upon request.

7 Econometric methodology ICSs panel for FY02 and FY07.

The evolution of TFP from FY02 to FY07 shows an increase in the aggregate productivity of the manufacturing sector of Pakistan. In order to gain additional insight into the IC factors associated with that increase in productivity, we use the information available for the 402 panel manufacturing firms. As we already pointed out, this information is far from being a stratified sample representative of the population; therefore the results should be interpreted with care.

In particular, we propose a model for the probability of a productivity increase from FY02 to FY07, conditional on the investment climate faced in FY02, after controlling for region, size and sector. The population model is
\[
P(y_{it} = 1 \mid IC_{it-5}, C_{it-5}, D_{r,t-5}, D_{j,t-5}, D_{m,t-5};\ \alpha_{IC}, \alpha_{C}, \alpha_{DS}, \alpha_{DR}, \alpha_{DM}) = G(\alpha_0 + \alpha'_{IC} IC_{it-5} + \alpha'_{C} C_{it-5} + \alpha_{DR} D_{r,t-5} + \alpha_{DS} D_{j,t-5} + \alpha_{DM} D_{m,t-5}) \qquad (7.1)
\]

Notice that since the variable \(y_{it}\) is a binary random variable, taking the value 1 if there is a TFP increase from FY02 to FY07 and 0 otherwise, \(P(y_{it} = 1 \mid x) = E(y_{it} \mid x)\). As usual, in the estimation of (7.1) we use the classical specifications for the unknown nonlinear function G(.). As we did before, we select the set of significant IC variables going from the general to the specific using the linear probability model (LPM) specification, since our goal is to evaluate the estimated model close to the mean values. However, once we have selected the significant IC variables we also use Logit and Probit models to test the robustness of the empirical regularities obtained.

Most of the econometric issues discussed before for the manufacturing econometric model apply to (7.1) as well. The endogeneity of IC variables is addressed by controlling for observable fixed effects and also by a selective use of industry-region-size averages as instruments. The heteroskedasticity of the errors is addressed by using robust standard errors, allowing for clustering by region and size. Remember that we are only looking for empirical regularities and therefore do not attempt to interpret the results as causal relations.
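To illustrate why LPM and Logit estimates can be compared near the mean values, the sketch below uses the simplest possible case: one binary explanatory variable. In that case both models are saturated, so the LPM slope and the Logit discrete-change effect both equal the difference between the two group frequencies, and the Logit coefficient is the log of the odds ratio. The data and the interpretation of `x` (say, a dummy for an IC constraint reported in FY02) are invented for the example; with continuous or multiple regressors the Logit model would have to be fitted numerically.

```python
import math


def group_frequencies(y, x):
    """Frequency of y = 1 among observations with x = 0 and with x = 1."""
    y0 = [yi for yi, xi in zip(y, x) if xi == 0]
    y1 = [yi for yi, xi in zip(y, x) if xi == 1]
    return sum(y0) / len(y0), sum(y1) / len(y1)


def lpm(y, x):
    """OLS of y on a constant and x (linear probability model)."""
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def logit_binary_regressor(y, x):
    """Closed-form Logit MLE when the only regressor is a 0/1 dummy."""
    p0, p1 = group_frequencies(y, x)
    log_odds = lambda p: math.log(p / (1.0 - p))
    b0 = log_odds(p0)
    return b0, log_odds(p1) - b0


# Hypothetical data: y = 1 if productivity increased, x = 1 if the dummy is on.
x = [0] * 10 + [1] * 10
y = [1] * 4 + [0] * 6 + [1] * 7 + [0] * 3

p0, p1 = group_frequencies(y, x)
a, b = lpm(y, x)
b0, b1 = logit_binary_regressor(y, x)
odds_ratio = math.exp(b1)
logit_effect = 1 / (1 + math.exp(-(b0 + b1))) - 1 / (1 + math.exp(-b0))
print(b, logit_effect, odds_ratio)
```

Here the LPM slope and the Logit discrete-change effect coincide, which is the sense in which marginal effects make the nonlinear models comparable with the LPM.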
The IC effects are evaluated in terms of the probability of a productivity increase, according to the following decomposition obtained from (7.1) and based on the LPM estimates:

\[
100 = \frac{100}{\hat{P}(y_t = 1)}\Big[\hat{\alpha}_0 + \hat{\alpha}'_{IC}\,\overline{IC}_{t-5} + \hat{\alpha}'_{C}\,\bar{C}_{t-5} + \hat{\alpha}_{DR} D_{r,t-5} + \hat{\alpha}_{DS} D_{j,t-5} + \hat{\alpha}_{DM} D_{m,t-5}\Big] \qquad (7.2)
\]

The results of the estimation of equation (7.1) can be found in Table 10 of the Appendix. The percentage IC contributions from equation (7.2) are included in Figure 12 of section 9.

8 Results (I): Identification of IC effects on economic performance.

In this section we describe the empirical results obtained from the econometric methodologies discussed in sections 2, 3, 4, 5, 6 and 7. We first focus on the IC elasticities and semi-elasticities with respect to productivity; all these empirical results are included in Tables 6.1 to 6.4 of the Appendix. Once we have estimated the robust IC parameters with respect to productivity we can concentrate on the effects on the remaining measures of economic performance, the system of equations (4.5)-(4.9), whose results are included in Tables 8.1-8.5. Finally, we present the results obtained for the services sector and for the FY02-FY07 panel, in Tables 9 and 10 respectively.

8.2 IC elasticities and semi-elasticities with respect to productivity (manufacturing FY07).

The robust IC effects on productivity (TFP) of the manufacturing sector are included in Table 6.1. After the IC selection process we identify twenty-five IC variables that are significant in at least one of the six TFP specifications considered, with the expected IC effects varying within a reasonable range of values and always preserving the sign. In this set of results, we estimate the system (3.1a)-(3.1c) by OLS using data for FY07, after replacing the missing observations with the ICA method described in section 2.
All the empirical regularities included in Table 6.1 are interpreted in terms of the IC effects on productivity (TFP). For instance, having a large number of power outages is associated on average with a lower level of productivity. We do not place a causal interpretation on those results. As a result of the selection process, no IC variable is finally used as an industry/region/size average, although for many of them we replace missing data in the IC variables by using the ICA method. Note that the coefficients are subject to biases depending on the assumption we make about the measurement error introduced. No bias is expected if we assume a random measurement error, while a downward bias can follow under the assumption of correlation between the error and the plant-level variable. In either case, the bias is expected to be small, first because we try to replace missing values only in those variables where the response rate is higher than 90% and second because, as we already pointed out, the variance of the error term is expected to be small.

Comparison with FY06-FY07 panel estimation by random effects (RE)

Further robustness of the IC parameters is presented in Table 6.2. Now we expand the data range to include FY07 and FY06 and we apply the random effects estimator to (3.1a)-(3.1c); the empirical results are robust to the estimator used with cross-section data and with panel data. Notice that under the composite error term of (3.2) and (3.3), although the OLS estimator is consistent, estimation by RE is more efficient. The IC coefficients in Table 6.2 do not vary with respect to Table 6.1; however, significance is reduced for some variables.
Comparison under different replacement procedures of missing data

Table 6.3 is of special importance: in it we check the robustness of our empirical regularities to other replacement procedures for IC data in FY07. The ICA method applied to equation (3.3), with the dependent variable being the restricted Solow residual, is included in the first column. The second column corresponds to the same replacement procedure under resampling (bootstrap standard errors), where we perform 1000 replications by industry and region and compute the bootstrap standard errors of the IC parameters. The ICA method shows robustness to this new replacement procedure, as the variation in significance is not too large, even using the corrected standard errors.

Table 6.3 also shows the next set of results, comparing the ICA method with different specifications of the EM algorithm (third, fourth and fifth columns). IC effects are homogeneous across the different EM algorithms, significance does not vary much, and the range of values is not too large between the different specifications.

So far we have compared the performance of the ICA method with other replacement procedures that require the MAR assumption for consistency. The results in column 6 (Table 6.3) account for the more restrictive non-ignorable MDM assumption by applying the Heckman correction for endogenous sample selection, where the selection equation is exclusively modeled with a larger set of IC variables (and industry/state/size controls). Once again, the ICA method is robust to this general context, since we do not observe significant changes in the IC parameters. In fact Heckman's lambda is not significant, which is consistent with the conclusion that the non-ignorable MDM assumption is not supported by the missing data mechanism of these ICSs of Pakistan.
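The resampling step behind the bootstrap standard errors mentioned above can be sketched as follows: observations are redrawn with replacement within each stratum (here standing in for an industry-region cell), the coefficient is re-estimated on each resample, and the standard deviation of the estimates is the bootstrap standard error. The data, the stratum labels and the use of a simple one-regressor OLS slope as the statistic are assumptions made for the example; the paper's own procedure also involves the ICA replacement step, which is not reproduced here.

```python
import random
import statistics


def ols_slope(pairs):
    """OLS slope of y on x (with a constant) for a list of (x, y) pairs."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    return sxy / sxx


def stratified_bootstrap_se(strata, n_reps=1000, seed=0):
    """Bootstrap standard error of the slope, resampling within each stratum."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        sample = []
        for observations in strata.values():
            # Draw as many observations as the stratum has, with replacement.
            sample.extend(rng.choices(observations, k=len(observations)))
        estimates.append(ols_slope(sample))
    return statistics.stdev(estimates)


# Hypothetical data: two strata with non-overlapping x values, y = 2x + noise.
noise = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
strata = {
    "Punjab/Textiles": [(x, 2 * x + e) for x, e in zip(range(1, 7), noise)],
    "Sindh/Food": [(x, 2 * x + e) for x, e in zip(range(7, 13), noise)],
}
se = stratified_bootstrap_se(strata, n_reps=1000, seed=0)
print(se)
```

Fixing the seed makes the replications reproducible; resampling within strata keeps the industry-region composition of every bootstrap sample equal to that of the original sample.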
The previous comparison is complemented with the estimation of equation (3.3) in the complete case. Using the model previously selected under the ICA method, when we use only the complete case the empirical results lead to similar conclusions in terms of parameter estimates, but not in terms of significance. This is a natural consequence of the efficiency loss from reducing the sample by 50%. Notice that we still replace the missing values of IC variables; the complete case refers here only to the PF variables.

Therefore, from Tables 6.1, 6.2 and 6.3 we find IC elasticities and semi-elasticities with respect to productivity that are robust under alternative sets of conditions. Given this robustness, for the economic interpretation of the IC effects on economic performance we will concentrate on one set of estimated IC coefficients: those based on the ICSs from FY07 with missing observations replaced by the ICA method, with TFP coming from the two-step estimation procedure (Solow residuals as the productivity (TFP) measure). We have found a set of empirical regularities that are reasonably robust to other assumptions, even though some of these assumptions might be too strong. Obviously, the numerical results are not identical across specifications, but we believe that the variation of the estimates and of their significance is reasonable.

8.3 IC elasticities and semi-elasticities with respect to economic performance (manufacturing FY07).

Given that we have found robust IC effects on productivity, the estimation of the system (4.5) to (4.9) is based on only two productivity measures, the restricted and unrestricted Solow residuals. The results are in Tables 8.1 to 8.5 of the appendix.
In Table 8.1 the productivity equation is now estimated by 2SLS, where the probability of receiving FDI is instrumented exclusively with IC variables, either included (covariates in the productivity equation) or excluded (covariates of FDI in equation (4.8)). The IV estimate of the coefficient of FDI on TFP is similar to the one obtained with OLS, although FDI is now not significant. The remaining IC effects on TFP do not change with respect to the OLS case.

In the case of the employment equation, Table 8.2, we instrument two explanatory variables (real wages and productivity) with IC variables, as explained in the previous case. We find a positive relation between employment and productivity and, as expected, a negative effect of real wages on employment. The equations for real wages, the probability of exporting and the probability of receiving FDI are in Tables 8.3, 8.4 and 8.5. In these cases we only instrument productivity (TFP), using as excluded instruments the exogenous covariates of the productivity equation.

In all cases the IC parameters have the expected signs and are significant in at least one of the two regressions. We also evaluate the validity of the instruments used with the R-squared of the first-stage regression, which measures the significance of both included and excluded instruments, and with the partial R-squared, which measures the squared partial correlation between the endogenous covariate and the excluded instruments. A p-value close to zero for the F-test of the partial R-squared implies significance of the excluded instruments. Finally, the Hansen test checks the overidentifying restrictions: under the null hypothesis the instruments are valid (i.e. uncorrelated with the error term) and therefore the excluded instruments were correctly excluded from the estimating equation.
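The logic of instrumenting an endogenous covariate and checking instrument relevance can be illustrated with the simplest just-identified case: one endogenous regressor and one excluded instrument, where the 2SLS estimate reduces to cov(z, y) / cov(z, x) and the first-stage R-squared is the squared correlation between x and z. The data below are constructed by hand so that the instrument is exactly uncorrelated with the error while the regressor is not; they are not related to the paper's data, and the Hansen test is not shown because it requires more instruments than endogenous regressors.

```python
def cov(a, b):
    """Population covariance of two equally long lists."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / n


def ols_slope(x, y):
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """IV (just-identified 2SLS) slope with a single instrument z for x."""
    return cov(z, y) / cov(z, x)


def first_stage_r2(z, x):
    """R-squared of the regression of x on a constant and z."""
    return cov(z, x) ** 2 / (cov(z, z) * cov(x, x))


# Hypothetical data: the error u enters both x and y, so x is endogenous;
# z is uncorrelated with u by construction. The true slope is 2.
z = [0, 0, 1, 1] * 5
u = [1, -1, 1, -1] * 5
x = [zi + ui for zi, ui in zip(z, u)]
y = [2 * xi + ui for xi, ui in zip(x, u)]

beta_ols = ols_slope(x, y)
beta_iv = iv_slope(z, x, y)
r2 = first_stage_r2(z, x)
print(beta_ols, beta_iv, r2)
```

In this constructed example OLS overstates the slope because the regressor is correlated with the error, while the IV estimate recovers the true value; a first-stage R-squared close to zero would signal a weak instrument.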
The instrumental variables used satisfy all the tests proposed in the five equations for all the endogenous covariates considered. In particular, we find a positive and significant relation between productivity (TFP) and employment, real wages, the probability of exporting and the probability of receiving FDI. As noted, the effect of wages on employment is negative and significant, while receiving FDI is positively associated with productivity, although insignificantly. In spite of the non-significant IV estimate obtained, we still keep the FDI variable in the set of TFP covariates, because the IV estimate of the FDI coefficient does not differ from the OLS case and because for further evaluations of productivity effects we use the OLS estimates of the IC effects on TFP.

8.4 IC elasticities and semi-elasticities with respect to labor productivity of services sector.

We briefly comment on the results obtained from the OLS estimation of equation (6.1) in Table 9 of the Appendix. We follow the same idea of searching for robust results under alternative scenarios and we find IC effects that are robust to different specifications of the labor productivity equation. The results are robust to: a) using a single cross-section or a two-year panel, b) including one or two input prices and c) whether we allow input elasticities to vary by state, industry or both. In the evaluation of the IC effects on the O&P decomposition of labor productivity we rely on the IC estimates based on OLS, the first column of Table 9, with panel data for FY06 and FY07, restricted input price coefficients and two input prices.

8.5 IC effects on the probability of observing a productivity increase in the manufacturing sector from a panel from FY02 to FY07.
The last set of IC estimates measures the effects on the probability of a productivity increase from FY02 to FY07; see Table 10. The results are intuitive and support a strong association between the IC constraints firms faced in FY02 and the probability of a productivity increase five years later. Table 10 shows the results based on the linear probability model (LPM), the Logit model and the Probit model, along with the odds ratios and the marginal effects, to make comparisons with the results of the LPM easier.

9 Results (II): IC evaluation on economic performance.

The objective so far was to obtain a robust identification of IC effects under alternative conditions. The aim now is to use the estimated coefficients to evaluate, using alternative procedures, the economic relevance of the IC contributions to the several measures of firm economic performance in Pakistan. Given that we have obtained results that are robust to different TFP measures, from now on we only consider one set of IC elasticities and semi-elasticities on TFP, i.e. those obtained from the restricted Solow residual case.

First, we present the O&P decomposition of Pakistan, which is at the core of all the analysis. Second, we introduce the concept of demeaned productivity for Pakistan and compare it with those obtained for other economies. Third, we evaluate the IC contributions in terms of: a) the O&P decomposition and the simulation results and b) the IC contributions to the sample means of the remaining economic performance measures. This section ends with a brief description of the IC empirical regularities obtained for the services sector and for the manufacturing sector from the FY02-FY07 panel.

9.1 Olley and Pakes decompositions.
A key characteristic of Pakistan's manufacturing sector is the large share of the allocative efficiency term relative to the average productivity term, as illustrated in Figures 2 and 3. This result is homogeneous by state and robust to the use of the O&P decomposition in levels or its mixed O&P counterpart (expressions 5.1a and 5.1b). In particular, for the whole country, the allocative efficiency component represents more than half of aggregate productivity (a slightly lower share in the case of the mixed decomposition). Obviously there is some variation among states; a better allocation of resources is observed in Sindh and Punjab relative to Baluchistan and NWFP.

[Figure 2: Olley and Pakes decomposition in levels by state (FY07). Bars: Aggregate Productivity, Average Productivity, Efficiency Term (TFP in levels) for Punjab, Sindh, Baluchistan, NWFP and Whole Pakistan. Note: Olley and Pakes decomposition in levels according to equation (5.1a). The productivity measure used is the restricted Solow residual in levels. Source: Authors' calculations with Pakistan ICS data.]

[Figure 3: Mixed Olley and Pakes decomposition by state (FY07). Bars: Aggregate Productivity, Average Productivity, Efficiency Term (TFP in logs) for Punjab, Sindh, Baluchistan, NWFP and Whole Pakistan. Note: Olley and Pakes decomposition in levels according to equation (5.1b). The productivity measure used is the restricted Solow residual in logs. Source: Authors' calculations with Pakistan ICS data.]

This is a key result in the case of Pakistan that will have implications for the analysis that follows in the next sections. A large allocative efficiency effect implies a better allocation of resources toward the most productive firms, but also a large gap between low- and high-productivity firms.
Therefore, there is room to reallocate resources and/or to bring the low-productivity firms closer to the more successful ones. This is important in terms of potential productivity gains from the perspective of economic policies relaxing constraints of the investment climate.

Additional insight on the role of the investment climate in productivity is obtained by applying the O&P decomposition to the concept of demeaned productivity, the share of productivity associated only with investment climate variables. Figure 4 compares the O&P demeaned decomposition of Pakistan with those of other countries.

[Figure 4: Demeaned mixed Olley and Pakes decomposition in Pakistan and comparators. Bars: Aggregate Productivity, Average Productivity, Allocative Efficiency (TFP in logs) for Philippines, Turkey, Indonesia, Bangladesh, India, Croatia, Mexico, Pakistan, South Africa, Brazil and Chile. Note: Olley and Pakes decomposition in levels according to equation (5.4). The productivity measure used is the demeaned restricted Solow residual in logs. Source: Authors' calculations with Pakistan ICS data.]

Figure 4 shows that Pakistan's aggregate log-productivity is in general positively influenced by the IC. This does not mean that Pakistan is more productive than other countries, but that the effect of the IC on productivity is larger in Pakistan and that the positive IC factors dominate the negative ones. As pointed out in the previous paragraphs, the dominant contributor is the allocative efficiency term. As the positive IC effects are concentrated in high-market-share firms and the negative ones in low-market-share firms, the negligible contribution to the average
productivity is considerably amplified. This result is important because, as we will see in the next sections, it suggests channels for obtaining productivity increases.

9.2 Effects of market power on measured productivities

In this section we analyze to what extent the sample differences observed in measured productivities may be driven by market power.

Under market power, productivities are likely to be over-measured, leading in turn to over-measured aggregate productivities. Under market structures that allow market power (monopolies, duopolies or oligopolies), firms are able to set a larger mark-up, and as a consequence measured productivity reflects market power as well as efficiency. Note that this problem affects either productivity or the share of sales, so it may affect aggregate productivity through either average productivity or allocative efficiency.

For a proper evaluation of market power we would need to rely on the concept of the market of reference in which firms operate. Unfortunately, the ICSs only provide information on the number of competitors and on the market in which firms sell their products (local, national, international). We analyze differences in the distribution of measured productivities by market structure and market type (local, national, international), looking for differences in the sample densities that might reveal a problem of market power in the measurement of productivities.20

Figure 5 includes the number of firms by the market in which firms operate and by the number of competitors. 344 firms, 44% of the sample, operate in local markets, 360 in national markets and 75, only 10%, in international markets. Assuming that firms operating in international markets compete against more than 4 competitors, panel B shows the number of firms by number of competitors.
Most of the firms compete against more than 5 firms, representing 82% of the total sample, 38% in local markets and 46% in national markets. Oligopolies represent only 13% of the sample, whereas duopolies and monopolies represent less than 4% of the final sample of firms.

20 Other variables included in the ICS of Pakistan were used to test the effect of competition on measured productivities: behavior of prices and quantities, size, state, industries. All of them led to conclusions similar to those presented here. Results are available upon request.

[Figure 5: Number of firms by market type and market structure. Panel A: by market type (Local, National, International), total number of firms and percentage in parentheses. Panel B: by market type (Local, National) and market structure (Monopoly, Duopoly, Oligopoly, More than 5 competitors), total number and percentage in parentheses. Source: Authors' calculations with Pakistan IC data.]

Assuming that market power situations are more likely to appear in oligopolies (between 3 and 5 competitors), duopolies and monopolies, from Figure 5 we may conclude that the problem is limited to a low percentage of the final sample of firms, about 17%. Specifically, in what follows we can focus on the measured productivity of oligopolies, firms competing against more than 5 competitors and firms competing in international markets. The purpose now is therefore to analyze differences in estimated productivities between these groups, assuming that the more competition is present in the market, the lower the mark-up effect tends to be.
Demanding international markets are the most competitive, followed first by firms competing against more than 5 competitors in national markets, second by firms competing against more than 5 in local markets, third by national oligopolies and lastly by local oligopolies.

[Figure 6: Kernel density estimate of productivity densities (I). Panel A: for international establishments and by market structure. Panel B: by market type and market structure. Notes: Productivity measure used is the restricted Solow residual. Epanechnikov kernel. Source: Authors' calculations with Pakistan IC data.]

Figure 6 shows the kernel estimates of productivity densities by market type first (panel A) and by market type and market structure (panel B). From panel A, establishments operating in international markets are more productive than their counterparts operating in local and national markets, supporting the view that more demanding international markets are associated with higher levels of productivity. Panel B confirms this idea. However, in this case we observe that the distribution of productivities of national oligopolies is similar to that of international establishments. Local establishments, either oligopolies or competing against more than 5 firms, show the lowest levels of productivity. The median of the group of national firms competing against more than 5 is 1.11, in between local firms on the one hand and national oligopolies and international firms on the other.

Figure 6 confirms the idea that larger productivities are associated with more demanding and competitive markets. Nonetheless, this pattern is broken by the national oligopolies. This atypical behavior of oligopolies in national markets raises some doubts about whether measured productivities in this case may be driven by market power effects.

[Figure 7: Kernel density estimate of productivity densities (II). Panel A: by intensity of domestic competition. Panel B: by intensity of international competition. Notes: Productivity measure used is the restricted Solow residual. Epanechnikov kernel. Source: Authors' calculations with Pakistan IC data.]

More insight about the competitive conditions facing firms in Pakistan is given in Figure 7. Managers are asked about their perceptions of the intensity of the competitive forces they have to deal with. From panel A we do not observe substantial differences between establishments operating in markets with intense domestic competition and those that do not. On the other hand, panel B shows the distributions of firms operating in markets with intense international competition. In this case the difference between the productivity distributions of firms facing intense international competition and those that do not is evident.

Figure 8 shows productivity distributions by whether establishments consider domestic or international competition important for costs. Panel A shows slight differences between estimated productivities for firms that consider domestic competition important for costs and those that do not. Specifically, the distribution of those considering it important is slightly skewed toward the right, with a median of 1.2, whereas the median of the establishments whose costs are influenced by domestic competition is 1.02. However, at this point, this is not conclusive evidence of a negative effect of competition on productivity, nor does it contradict the findings of Figure 7.
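For readers unfamiliar with the density estimates reported in these figures, the sketch below shows how a kernel density estimate with the Epanechnikov kernel is computed and how group medians such as those quoted above are obtained. The productivity values and the bandwidth are invented for the example; the figure notes state the kernel but not the bandwidth, so the value used here is purely an assumption.

```python
import statistics


def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kde(x, sample, bandwidth):
    """Kernel density estimate at point x."""
    total = sum(epanechnikov((x - xi) / bandwidth) for xi in sample)
    return total / (len(sample) * bandwidth)


# Hypothetical productivity values (restricted Solow residuals) for one group.
sample = [0.4, 0.9, 1.1, 1.3, 2.0]
bandwidth = 0.5  # assumed value, chosen only for the illustration

median = statistics.median(sample)

# Check numerically that the estimated density integrates to about one.
step = 0.001
grid = [-1.0 + step * (k + 0.5) for k in range(4500)]  # midpoints on [-1, 3.5]
area = sum(kde(x, sample, bandwidth) for x in grid) * step
print(median, area)
```

Comparing two groups then amounts to evaluating `kde` on a common grid for each group's sample and plotting the two curves together.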
A plausible argument is that incumbents with market power are more likely to have cost advantages over entrants, due to large size, good management or economies of scale and scope. We can therefore expect a negative answer to this question from firms with market power, probably national oligopolies. Note that, by contrast, the influence of international competition on costs is important in explaining productivity differences, as panel B shows. Firms whose costs are influenced by international competition show higher levels of productivity.

Figure 8: Kernel density estimate of productivity densities (III). Panels: A. By importance of domestic competition for costs; B. By importance of international competition for costs. Notes: Productivity measure used is the restricted Solow residual. Epanechnikov kernel. Source: Authors' calculations with Pakistan IC data.

It is now interesting to analyze the behavior of oligopolies facing different competitive conditions. Figure 9 shows the productivity distributions of oligopolies by intensity of domestic competition and by importance of domestic competition for costs. Within oligopolies we measure higher productivities for those operating in non-intense domestic markets (panel A) and for those that consider domestic competition not important for costs (panel B). High measured levels of productivity are therefore associated with oligopolies that do not regard competition as a determinant of their behavior and costs. By contrast, for oligopolies operating under intense competition the density of measured productivity lies below the density for all firms.

Figure 9: Kernel density estimate of productivity densities, only oligopolies (IV). Panels: A. By intensity of domestic competition; B. By importance of domestic competition for costs. Notes: Productivity measure used is the restricted Solow residual. Epanechnikov kernel. Source: Authors' calculations with Pakistan IC data.

From Figures 5 to 9 we are able to identify the firms most likely to be affected by the problem of mark-ups in measured productivities, namely national oligopolies operating in non-intense competitive conditions. They represent a small share of the final sample, less than 13% (including local oligopolies). The next step is, first, to identify the size categories and industries in which national oligopolies are most likely to be found and, second, to link productivity and market shares.

According to Figure 10, most of the local oligopolies are small firms. By contrast, national oligopolies in the sample are evenly distributed across sizes. By industry, most of the local oligopolies operate in the food & beverages and textiles sectors, whereas national oligopolies can be found in the chemical and machinery and equipment sectors.

Figure 10: Number of oligopolies by size and industry. Panels: A. Oligopolies by size (percentage in parentheses); B. Oligopolies by industry (percentage in parentheses). Series: local and national oligopolies. Source: Authors' calculations with Pakistan IC data.
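The density comparisons in Figures 7 to 9 can be illustrated with a minimal sketch of an Epanechnikov kernel density estimate for two groups of firms. The data below are simulated and purely hypothetical, and the bandwidth rule is an assumption (a normal-reference rule of thumb), since the paper does not report the bandwidth behind its figures.

```python
import numpy as np

def epanechnikov_kde(sample, grid, bandwidth=None):
    """Kernel density estimate on `grid` using an Epanechnikov kernel."""
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = sample.size
    if bandwidth is None:
        # Assumption: normal-reference rule of thumb for this kernel.
        bandwidth = 2.34 * sample.std(ddof=1) * n ** (-1 / 5)
    u = (grid[:, None] - sample[None, :]) / bandwidth
    # K(u) = 0.75 * (1 - u^2) on [-1, 1], zero elsewhere.
    kernel = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)
    return kernel.sum(axis=1) / (n * bandwidth)

# Hypothetical productivity measures for two groups of firms.
rng = np.random.default_rng(0)
prod_intense = rng.normal(loc=1.3, scale=0.8, size=400)
prod_not_intense = rng.normal(loc=1.0, scale=0.8, size=400)

grid = np.linspace(-4.0, 6.0, 1001)
dens_intense = epanechnikov_kde(prod_intense, grid)
dens_not_intense = epanechnikov_kde(prod_not_intense, grid)

# Sanity check: each estimated density should integrate to roughly one.
step = grid[1] - grid[0]
area_intense = dens_intense.sum() * step
median_gap = np.median(prod_intense) - np.median(prod_not_intense)
print(f"area under density: {area_intense:.3f}, median gap: {median_gap:.2f}")
```

Plotting `dens_intense` and `dens_not_intense` against `grid` would give a two-curve panel of the kind shown in the figures; comparing medians, as the text does for Figure 8, is a simple numerical summary of the shift between the two curves.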
Figure 11 reveals two firms, one operating in the national market and the other in the international market, that represent more than 45% of total market sales. Apart from these two outliers, larger market shares are concentrated in establishments facing more competition (see panel B) and in establishments oriented toward international markets (panel A), in line with what we saw for measured productivity.

Figure 11: Box-plots of share of sales by market type and market structure. Panels: A. Share of sales by market type; B. Share of sales by market structure. Source: Authors' calculations with Pakistan IC data.

Finally, productivity and share of sales co-move, as Figure 12 clearly shows.

Figure 12: Box-plots of productivities by share of sales. Groups: up to percentile 20; percentiles 21 to 40; percentiles 41 to 60; percentiles 61 to 80; from percentile 81. Horizontal axis: productivity. Source: Authors' calculations with Pakistan IC data.

The conclusion of this sub-section is that in the productivity analysis we should be concerned about the measured levels of efficiency of oligopolies, especially those operating in national markets. For the remaining firms, however, we do not observe a clear relation between market power and productivity. We argue that, in the absence of market power and in the presence of competition, measured productivity, although likely to be affected by prices, is correlated with the real level of efficiency, in line with Foster et al. (2008), and that even for non-competitive oligopolies measured productivity can be used as a proxy for real productivity.
In the analysis, however, one should be cautious and take all these issues into account, especially when there are outliers in market shares such as those observed in Figure 11, which may seriously bias the results for aggregate productivity. To avoid this problem we follow a conservative procedure and delete the upper and lower 5% of productivity values when computing the O&P decompositions.

Figure 13 confirms the need to control for these outlier observations. Panel A shows the O&P decomposition by industry after excluding the upper and lower 5% of measured productivity levels, and panel B shows the case including all observations. The outliers of Figure 11 are oligopolies concentrated in textiles and, as a result, the allocation effect increases considerably in panel B with respect to panel A. In the remaining sectors the allocation effect is almost the same.

Figure 13: Mixed Olley and Pakes decompositions with and without outliers. Panels: A. Without outliers (upper and lower 5% of productivity levels dropped); B. With outliers (upper and lower 5% of productivity levels included). Series by industry: aggregate log-productivity, average log-productivity, (log) allocative efficiency. Source: Authors' calculations with Pakistan IC data.

Figure 14 illustrates the same idea using the demeaned mixed O&P decomposition. Results in panel B are profoundly affected by outlier observations.

Figure 14: Demeaned mixed Olley and Pakes decompositions with and without outliers. Panels: A. Without outliers (upper and lower 5% of productivity levels dropped); B. With outliers (upper and lower 5% of productivity levels included). Series by industry: demeaned aggregate log-productivity, demeaned average log-productivity, demeaned (log) allocative efficiency. Source: Authors' calculations with Pakistan IC data.

To conclude, it is difficult to control for mark-up effects, since the models recently developed in the literature to deal with this problem are difficult to implement in the context of IC surveys (see Katayama et al. (2006), Gorodnichenko (2007) or de Loecker (2006) for examples).
As these methods are not available, we follow a control approach, approximating the effect of prices with industry/region/size dummies plus a set of other IC variables, such as the market of the firm (local, national, international) or the market structure (monopolies, oligopolies, more than 5 competitors). In spite of these difficulties, we find that measured productivities increase as firms' environments become more competitive, with the exception of national oligopolies, which allows us to conclude that our measured productivities are correlated with true productivities. As an additional correction for price effects in either productivities or market shares, we propose a conservative procedure that excludes outlier observations. This procedure appears to work well, as the O&P decompositions for the whole sample are very sensitive to the presence of outliers, which are in turn likely to be a consequence of price effects.

9.3 IC contributions to the terms of the Olley and Pakes decomposition

From the previous subsections we know that the IC plays an important role in productivity in Pakistan. The aim now is to identify the main IC contributors to the effect observed in Figure 3. We address two questions: first, the magnitude of the IC effect (how many firms are affected by the problem and how much of it they have to deal with, e.g. how many firms suffer power outages and how many outages they suffer); and second, which firms in the population are predominantly affected by the problem.
The second is particularly important because a positive investment climate effect concentrated in high market share firms amplifies the original effect: these firms use the largest share of the economy's resources, and as they use them more productively they obtain more output from the same amount of inputs, which translates into welfare gains for the economy as a whole (we say that the economy is using its resources more efficiently). The same holds for negative aspects of the investment climate: if they are concentrated in firms with a high share of sales, the negative impact is amplified in the same way. In turn, the IC contribution to aggregate log-productivity can be decomposed into the accumulation effect [21] (average productivity) and the distributional effect [22] (allocative efficiency).

The effect of the individual IC variables is transmitted to aggregate productivity through average log-productivity and, to a lesser extent, through the effect of the IC on the allocative efficiency term (but not on the share of sales) [23]. Specifically, setting the effect of the IC on aggregate log-productivity to 100, 14% is due to average log-productivity and the remaining 86% to the allocation effect.

[21] That is, in the sample the average firm suffers a certain number of power outages, spends a certain amount of time dealing with bureaucracy, reports a certain percentage of sales for tax purposes, has a certain probability of having a loan, etc. We evaluate the investment climate of the average (or representative) firm.

[22] Here we take into account how the IC variable is distributed across firms according to their share of sales. The idea is that inputs should be used to produce output efficiently: we do not want low productivity firms using large amounts of resources; we want scarce resources to be allocated to firms that use them efficiently.

The main IC contributors to aggregate log-productivity are:

•  Number of power outages (infrastructure). The contribution of the number of power outages to the average is -13.3% and the contribution to the allocative efficiency is 6.7%, so the negative effect on productivity is biased towards low market share firms.
•  Days of inventory of main intermediate material (infrastructure). The contribution to average log-productivity is 14%. The effect on the allocation (9%) is in this case positive and adds to the effect on the average, indicating that the positive effect is concentrated in high market share firms.
•  Sales reported to taxes (economic governance). This variable accounts for 12.8% of aggregate log-productivity. The main contributor is the average effect, with 12.3%.
•  Working capital financed by internal funds (finance). Of the total -6.6% contribution of this variable to aggregate log-productivity, -8.3% is driven by average log-productivity. The remaining 1.8% comes from the contribution to the allocative efficiency, indicating that the effect is to some extent concentrated in firms with a low share of sales.
•  Working capital financed by private banks (finance). Being financed by private banks is positively associated with aggregate log-productivity, with a contribution of 11.3%. The allocation effect is more important in this case, as it accounts for almost all of the contribution (9.6% out of 11.3%). That is, as the positive effect of financing from private banks is mostly concentrated in firms with a large share of sales, the overall effect on aggregate log-productivity is considerably amplified.
•  Dummy for process innovation (innovation and competition). The contribution to aggregate log-productivity is 15.7%. The main contributor is allocative efficiency (13.1%); the average effect accounts for around one sixth of the overall contribution (2.6%).
•  Dummy for training (labor markets and skills). Of the total 12.4% contribution to aggregate log-productivity, 11% is due to the allocation effect and only 1.4% to the average.
•  Largest shareholder (corporate governance). The aggregate effect is -8.1%; in this case the positive effect on the allocative efficiency (8.1%) partly offsets the average effect (-16.2%).
•  Other variables with a relatively important contribution are: dummy for conflicts in courts, security expenses and dummy for checking or savings account.

[23] In fact we cannot establish the real IC effect on the allocation, as we regress TFP, and not the share of sales, on the IC variables.

Figure 15: IC percentage contributions to aggregate log-productivity (manufacturing FY07). Series: aggregate productivity, average productivity, allocative efficiency. Variable blocks: infrastructures; economic governance; finance; innovation and competition; labor markets and skills; corporate governance; other control variables. Note: Contributions computed according to section 5.4. The productivity measure used is the restricted Solow residual. Source: Authors' calculations with Pakistan ICS data.

Figure 16 presents the change in aggregate productivity when we improve IC variables by 20%. The results confirm the importance of IC variables for productivity growth. For instance, if we were able to reduce informality by 20%, aggregate productivity could increase by 2.5% (variable 2.5, sales reported to taxes, in Figure 16). As another example, if we reduce the share of working capital financed by internal funds (variable 3.2), aggregate productivity could increase by 1.7%.

Figure 16: Percentage change in aggregate productivity (TFP) from a 20% improvement of IC variables (manufacturing FY07). Series: aggregate productivity, average productivity, allocative efficiency. Variable blocks: infrastructures; economic governance; finance; innovation and competition; labor markets and skills; corporate governance; other control variables. Note: Simulations computed according to section 4.3. The productivity measure used is the restricted Solow residual. Source: Authors' calculations with Pakistan ICS data.
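The mixed O&P decomposition and the split of an IC variable's contribution between the average and allocative terms can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are simulated, `coef_outages` is a hypothetical semi-elasticity, log-productivity is assumed to be linear in the IC variable, and sales shares are recomputed after trimming the upper and lower 5% of productivity.

```python
import numpy as np

def olley_pakes(log_prod, sales):
    """Split share-weighted log-productivity into average and allocative terms."""
    log_prod = np.asarray(log_prod, dtype=float)
    shares = np.asarray(sales, dtype=float) / np.sum(sales)
    average = log_prod.mean()
    # Sum of cross-deviations between sales shares and log-productivity.
    allocative = np.sum((shares - shares.mean()) * (log_prod - average))
    aggregate = np.sum(shares * log_prod)
    return aggregate, average, allocative

def trim_tails(log_prod, sales, tail=0.05):
    """Drop firms in the lower and upper `tail` fraction of log-productivity."""
    lo, hi = np.quantile(log_prod, [tail, 1.0 - tail])
    keep = (log_prod >= lo) & (log_prod <= hi)
    return log_prod[keep], sales[keep], keep

def ic_contribution(coef, ic_var, sales, aggregate):
    """Percentage contribution of one IC variable to each O&P term."""
    shares = sales / sales.sum()
    to_average = coef * ic_var.mean()
    to_allocative = coef * np.sum(
        (shares - shares.mean()) * (ic_var - ic_var.mean())
    )
    return 100.0 * to_average / aggregate, 100.0 * to_allocative / aggregate

# Hypothetical firm-level data (not the Pakistan ICS sample).
rng = np.random.default_rng(42)
n_firms = 500
outages = rng.poisson(10, n_firms).astype(float)
coef_outages = -0.02  # hypothetical semi-elasticity of productivity
log_prod = 2.0 + coef_outages * outages + rng.normal(0.0, 0.6, n_firms)
sales = np.exp(rng.normal(10.0, 1.0, n_firms))

lp_trim, sales_trim, keep = trim_tails(log_prod, sales)
aggregate, average, allocative = olley_pakes(lp_trim, sales_trim)
pct_avg, pct_alloc = ic_contribution(
    coef_outages, outages[keep], sales_trim, aggregate
)
print(f"aggregate={aggregate:.3f} average={average:.3f} allocative={allocative:.3f}")
print(f"outages: {pct_avg:.1f}% via average, {pct_alloc:.1f}% via allocation")
```

By construction the aggregate equals the average plus the allocative term, and the two percentage contributions add up to the variable's total share of aggregate log-productivity. Under the same linearity assumption, a 20% change in the variable would shift each term by roughly one fifth of its contribution.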
9.4 IC contributions to the sample means of employment, wages, exporting propensity and FDI propensity

The next set of figures focuses on the IC percentage contributions to the sample means of the remaining measures of economic performance.

The main contributors to average log-employment are real wages and productivity. Apart from these two variables, the innovation and competition variables and the labor markets and skills variables are also important contributors to employment. For instance, the percentage of workers in staff is positively associated with total employment, contributing 9.2% to average log-employment. Within this group other important factors are the experience of the manager (variable 7.6) and the percentage of production workers in staff (variable 7.1).

Within innovation and competition it is worth mentioning the positive effect of competition, as shown by the 6.1% contribution of 'Dummy for more than 5 competitors' (variable 6.6).

Figure 17: IC percentage contributions to average log-employment (manufacturing FY07). Variable blocks: productivity; real wages; infrastructures; economic governance; finance; innovation and competition; labor markets and skills; corporate governance; other control variables. Note: Simulations computed according to section 5.5. The productivity measure used is the restricted Solow residual. Source: Authors' calculations with Pakistan ICS data.
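A percentage contribution to a sample mean can be sketched for a linear regression with an intercept, where the mean of the dependent variable equals the sum of each coefficient times the mean of its regressor. The example below assumes this form of contribution (section 5.5 is not reproduced here), and all variable names, coefficients and data are hypothetical.

```python
import numpy as np

# Hypothetical firm-level data (not the Pakistan ICS sample).
rng = np.random.default_rng(1)
n = 300
log_tfp = rng.normal(1.5, 0.5, n)
log_wage = rng.normal(4.0, 0.3, n)
many_competitors = rng.integers(0, 2, n).astype(float)  # 0/1 dummy
log_emp = (
    3.0
    + 0.4 * log_tfp
    - 0.5 * log_wage
    + 0.2 * many_competitors
    + rng.normal(0.0, 0.3, n)
)

# OLS with an intercept; residuals then average to zero in the sample.
names = ["intercept", "log_tfp", "log_wage", "many_competitors"]
X = np.column_stack([np.ones(n), log_tfp, log_wage, many_competitors])
beta, *_ = np.linalg.lstsq(X, log_emp, rcond=None)

# Percentage contribution of each regressor to the sample mean of log-employment.
contrib = 100.0 * beta * X.mean(axis=0) / log_emp.mean()
for name, pct in zip(names, contrib):
    print(f"{name}: {pct:.1f}%")
```

Because the residuals average to zero, the contributions (intercept included) sum to 100%. A regressor with a negative coefficient and a positive mean, such as the log wage here, shows up as a negative contribution.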
Figure 18: IC percentage contributions to average log-wage (manufacturing FY07). Variable blocks: productivity; infrastructures; economic governance; finance; innovation and competition. Note: Simulations computed according to section 5.6. The productivity measure used is the restricted Solow residual. Source: Authors' calculations with Pakistan ICS data.

The case of wages is illustrated in Figure 18. The largest contributor is productivity, with 94.8% of the total contribution. The finance group contributes six variables; in particular, access to financing and auditing/accounting are positively related to wages, although their contribution is ten times lower than that of productivity. Finally, wages appear to be lower for firms suffering losses due to power outages; the contribution of this variable is -14.9%.

Productivity is also the most important contributor to the probability of exporting (Figure 19), accounting for almost half of the total contribution. Within infrastructures, the average number of days waiting to clear customs is very important, with a contribution of -38%.
Finance variables appear to play an important role in the exporting propensity: the percentage of working capital financed by internal funds contributes 24.5% and 'Dummy for checking or saving account' 13.4%. Other important variables are 'Dummy for tax exemption', 'Dummy for e-mail' and 'Dummy for more than 5 competitors'.

Figure 19: IC percentage contributions to the probability of exporting (manufacturing FY07). Variable blocks: productivity; infrastructures; economic governance; finance; innovation and competition; corporate governance; other control variables. Note: Simulations computed according to section 5.7. The productivity measure used is the restricted Solow residual. Source: Authors' calculations with Pakistan ICS data.

Finally, the case of FDI is represented in Figure 20. Again, the most important contributor is productivity. Security expenses and the payments to deal with bureaucratic issues have contributions greater than 20%.
Other variables are 'Dummy for foreign technology', training (variable 6.1), 'Dummy for incorporated company' and 'Dummy for quality certification', all of them with contributions between 14% and 17%.

In general, almost all IC contributions to the probability of receiving FDI range between 10% and 20%. By groups of variables, those related to innovation and competition appear to be strongly associated with the decision of foreign firms to invest in Pakistan. Economic governance and infrastructures are also important, and especially the productivity of the firms.

Figure 20: IC percentage contributions to the probability of receiving FDI (manufacturing FY07). Variable blocks: productivity; infrastructure; economic governance; finance; innovation and competition; labor markets and skills; corporate governance; other control variables. Note: Simulations computed according to section 5.8. The productivity measure used is the restricted Solow residual.
Source: Authors' calculations with Pakistan ICS data.

9.5 IC contributions to the Olley and Pakes decomposition of labor productivity of services sector

The IC contributions to aggregate (log) labor productivity confirm the large differences in productivities that exist in Pakistan (Figure 21). As in the manufacturing case, the contribution of IC variables is in most cases amplified by this fact. This is the case of 'Dummy for informal competition' and 'Staff with computer': the contribution to the average is amplified because the impact is concentrated in firms with higher market shares.

It is also worth mentioning the positive contribution of having an own generator, and the negative contributions of the number of inspections suffered (variable 2.3) and of the sales reported to taxes (variable 2.4). In general, the results for the services sector support those obtained for manufacturing, with some slight differences.

Figure 21: IC percentage contributions to the O&P decomposition of labor productivity (services FY07). Series: aggregate labor productivity, average labor productivity, allocative efficiency. Variable blocks: infrastructures; economic governance; finance; innovation and competition. Note: Simulations computed according to section 5.8. The productivity measure used is the restricted Solow residual. Source: Authors' calculations with Pakistan ICS data.

9.6 IC contributions to the probability of productivity increase

The IC contributions to the probability of a productivity increase are shown in Figure 22. The largest contributions come from the economic governance block. The number of inspections firms suffered in FY02 is negatively associated with increases in productivity, with a percentage contribution of -33%. Within this group, security expenses, corruption and informality also appear to be closely related to the increase in productivity between FY02 and FY07.

The number of power outages has a negative contribution (-9%); also within the infrastructure group, the low quality of the supplies received contributes 8%. The finance group contributes four significant covariates of the probability of a productivity increase. Having a loan has a positive effect, but this effect is reduced if the loan has collateral. Firms that belonged to a trade association in FY02 have a higher probability of experiencing an increase in productivity in FY07; the contribution is 7%. The contributions of labor markets and skills are lower relative to other variables, although this group contributes three variables.
Figure 22: IC percentage contributions to the probability of having a productivity (TFP) increase in terms of IC variables (manufacturing panel FY02-FY07). Variable blocks: infrastructures; economic governance; finance; labor markets and skills; other control variables. Note: Simulations computed according to section 5.8. The productivity measure used is the restricted Solow residual. Source: Authors' calculations with Pakistan ICS data.

10 Conclusions

The objective of this methodological paper is to identify and evaluate the importance of the significant IC factors associated with economic performance in Pakistan. The strategy for the identification of IC effects is based on the robustness of the empirical regularities. We found IC elasticities and semi-elasticities with respect to productivity that are robust under alternative economic environments (assumptions). Given this robustness, for the evaluation of the IC effects on economic performance we concentrate on only one set of IC parameters.
The idea is to obtain empirical regularities that are reasonably robust (in terms of signs and magnitude) under alternative econometric conditions, even when these assumptions do not hold. Obviously, the results are not numerically identical across specifications, but the observed variation of the estimates and of their significance is reasonable and gives more credibility to the empirical results obtained.

The identification and subsequent assessment are not straightforward because of the numerous methodological difficulties we have encountered. To list a few: endogeneity of regressors, the choice of productivity (TFP) measures, selection of the relevant model, simultaneous effects and the low quality of the database (missing observations, outliers, etc.) have all been addressed with the ICSs of Pakistan. We believe that the empirical regularities observed give valuable insight into the main areas of reform regarding the investment climate.

[Stacked bar chart, 0% to 100%. Panels: Aggregate productivity, Average productivity, Efficiency term, each shown by CONTRIBUTIONS and by SIMULATIONS. Blocks: Infrastructure, Economic governance, Finance, Innovation and competition, Labor markets and skills, Corporate governance, Other control variables.]
Figure 23: Weight of each block of IC variables on aggregate productivity, average productivity and allocative efficiency, by contributions and by simulations (manuf. FY07)
Note: The weight of each block or group of IC variables from contributions comes from Figure 13 (section 9). We take the percentage contributions of Figure 13 in absolute value and compute the relative weight of each block. For simulations we do the same with the percentage increases of Figure 14 (section 9). The productivity measure used is the demeaned restricted Solow residual in logs.
Source: Authors’ calculations with Pakistan ICS data.
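The weighting rule in the note to Figure 23 (absolute values of the percentage contributions, then the relative weight of each block) can be sketched as follows. The block names follow the figure legend; the function name and the numbers are hypothetical and only illustrate the arithmetic.

```python
def block_weights(contributions_by_block):
    """Relative weight (in %) of each block of IC variables.

    contributions_by_block maps a block name to the list of percentage
    contributions of its variables. Contributions are taken in absolute
    value, summed within each block, and divided by the grand total.
    """
    abs_totals = {
        block: sum(abs(value) for value in values)
        for block, values in contributions_by_block.items()
    }
    grand_total = sum(abs_totals.values())
    if grand_total == 0:
        raise ValueError("all contributions are zero")
    return {block: 100.0 * total / grand_total for block, total in abs_totals.items()}


# Hypothetical contributions, for illustration only.
weights = block_weights({
    "Infrastructure": [-6.0, 3.0],
    "Finance": [1.0],
})
```

Because absolute values are used, positive and negative contributions within a block do not cancel out, and the weights sum to 100% by construction.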
Figure 23 summarizes the results obtained for productivity. Both IC percentage contributions and simulations reveal the important role of infrastructure in productivity, particularly the low quality of the power supply, the benefits of being located in an industrial zone and the size of inventories (a possible measure of competitiveness and/or informality). The total absolute IC contribution of the infrastructure group to aggregate productivity is around 30%.

Aggregate productivity is also associated with a number of economic governance variables, mainly related to informality and to security, crime and courts. The final absolute contribution of the economic governance group is almost 20%. A similar contribution comes from the finance group. The way firms obtain financing appears to be important for efficiency: establishments financed with internal funds are associated with lower levels of productivity, while financing with funds from private banks is positively related to productivity.

The negative association of corporate governance is represented by the share of the firm owned by the largest shareholder. Concentration of ownership is associated with low productivity levels, and the effect is concentrated in firms with low market share.

Finally, it is worth mentioning the large contribution of innovation and competition to aggregate productivity. Introducing product improvements, using new machinery and receiving FDI together contribute almost 20% to aggregate productivity. Notice that the contribution of this group to average productivity is lower than 5%, but because the pro-productivity effects of these factors are mainly associated with firms with large market share, the final contribution to aggregate productivity is considerably amplified.
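The amplification described above follows from the structure of the Olley and Pakes (1996) decomposition, in which share-weighted aggregate productivity equals unweighted average productivity plus a covariance-type efficiency term. The sketch below uses the standard textbook form of that decomposition, not the authors' demeaned implementation; the function name and the three-firm example are hypothetical.

```python
def olley_pakes(shares, productivity):
    """Return (aggregate, average, efficiency_term) for firm-level data.

    shares: market shares, assumed to sum to 1.
    productivity: firm-level (log) productivity.
    aggregate = sum_i s_i * p_i = mean(p) + sum_i (s_i - mean(s)) * (p_i - mean(p))
    """
    n = len(shares)
    if n == 0 or n != len(productivity):
        raise ValueError("shares and productivity must be non-empty and equal length")
    mean_p = sum(productivity) / n
    mean_s = sum(shares) / n
    aggregate = sum(s * p for s, p in zip(shares, productivity))
    efficiency = sum((s - mean_s) * (p - mean_p) for s, p in zip(shares, productivity))
    return aggregate, mean_p, efficiency


# Hypothetical example: the most productive firm also has the largest share.
agg, avg, eff = olley_pakes([0.5, 0.3, 0.2], [2.0, 1.0, 0.0])
```

In this example the efficiency term is positive because market share and productivity move together, so a factor that mainly raises the productivity of large-share firms adds more to the aggregate than to the average.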
[Stacked bar chart, 0% to 100%. Bars: Productivity, Employment, Wages, Exports, FDI. Blocks: Infrastructure, Economic governance, Finance, Innovation and competition, Labor markets and skills, Corporate governance, Other control variables, Productivity, Wages.]
Figure 24: Weight of each block of IC variables on the sample means of economic performance measures (manuf. FY07)
Note: The weight of each block or group of IC variables on the sample means comes from the contributions in Figures 7 to 10 (section 9). We take the percentage contributions in absolute value and compute the relative weight of each block. For productivity we take the IC contributions to average log-productivity from Figure 13.
Source: Authors’ calculations with Pakistan ICS data.

The IC contributions to the sample means of all the economic performance measures considered in the paper are summarized in Figure 24. The infrastructure group appears to be especially important for productivity and exporting activity and, to a lesser extent, for attracting FDI, for employment and for wages. The weight of the economic governance group is larger than 20% in productivity and FDI and only slightly lower in the exporting equation. The finance IC group represents more than 20% of the total IC contributions to wages and exports, and 18% of the IC contribution to productivity. Innovation and competition factors have more relative importance in exports and FDI, whereas the contribution of labor skills to employment is especially important. The contribution of corporate governance is remarkable only in the case of productivity. The contribution of wages to employment is 22%, and productivity (TFP) is a key factor in the wage, exporting and FDI equations, contributing 58%, 23% and 25% respectively.
References

Allison, P.D. (2001); “Missing Data,” Quantitative Applications in the Social Sciences, Sage University Paper.
Bernard, A. B., J. Eaton, J. B. Jensen, and S. Kortum (2003); “Plants and Productivity in International Trade,” American Economic Review, 93(4), 1268-1290.
Cameron, A.C. and P.K. Trivedi (2005); “Microeconometrics: Methods and Applications,” Cambridge University Press.
Caselli, F. (2005); “Accounting for Income Differences Across Countries,” chapter 9 in the Handbook of Economic Growth Vol. 1A, P. Aghion and S. Durlauf, eds., North Holland.
Cole, H. L., L. E. Ohanian, A. Riascos and J. A. Schmitz Jr. (2004); “Latin America in the Rearview Mirror,” National Bureau of Economic Research WP #11008, December.
Dempster, A.P., N.M. Laird and D.B. Rubin (1977); “Maximum Likelihood from Incomplete Data via the EM Algorithm,” Journal of the Royal Statistical Society, Series B, 39, 1-38.
Escribano, A. and J. L. Guasch (2005); “Assessing the Impact of the Investment Climate on Productivity using Firm Level Data: Methodology and the Cases of Guatemala, Honduras and Nicaragua,” World Bank Policy Research Working Paper # 3621, The World Bank, June.
Escribano, A. and J. L. Guasch (2008); “Robust Methodology for Investment Climate Assessment on Productivity: Application to Investment Climate Surveys from Central America,” Working Paper # 08-19 (11), Universidad Carlos III de Madrid.
Escribano, A., J. L. Guasch, M. de Orte and J. Pena (2008a); “Investment Climate and Firm’s Performance: Econometric and Applications to Turkey’s Investment Climate Survey,” Working Paper # 08-20 (12), Universidad Carlos III de Madrid.
Escribano, A., J. L. Guasch, M. de Orte and J. Pena (2008b); “Investment Climate Assessment Based on Demeaned Olley and Pakes Decompositions: Methodology and Applications to Turkey’s Investment Climate Survey,” Working Paper # 08-20 (13), Universidad Carlos III de Madrid.
Escribano, A., J. L. Guasch and J. Pena (2009); “Assessing the Impact of Infrastructure Quality on Firm Productivity in Africa,” World Bank Policy Research Working Paper (forthcoming), The World Bank, Washington DC.
Escribano, A. and M. de Orte (2008); “Investment Climate Assessment of Labour Productivity and Employment in the Retail Sector of India: Analysis Based on Firm Level Data,” mimeo, Universidad Carlos III de Madrid.
Escribano, A. and J. Pena (2008); “Empirical Econometric Evaluation of Alternative Methods of Dealing with Missing Values in Investment Climate Surveys,” mimeo, Universidad Carlos III de Madrid.
Foster, L., J. Haltiwanger and C. Syverson (2008); “Reallocation, Firm Turnover, and Efficiency: Selection on Productivity or Profitability?” American Economic Review, March 2008.
Gorodnichenko, Y. (2007); “Using Firm Optimization to Evaluate and Estimate Returns to Scale,” Mimeo, UC Berkeley, Sept.
Hall, R. E. and C. I. Jones (1999); “Why Do Some Countries Produce So Much More Output Per Worker Than Others?” Quarterly Journal of Economics 114: 83-116.
Heckman, J.J. (1976); “The Common Structure of Statistical Models of Truncation, Sample Selection and Limited Dependent Variables and a Simple Estimator for Such Models,” Annals of Economic and Social Measurement 5, 475-492.
Katayama, H., S. Lu, and J. R. Tybout (2006); “Firm-Level Productivity Studies: Illusions and a Solution,” unpublished paper, The University of Sydney, Charles River Associates and Pennsylvania State University.
Klenow, P. J. and A. Rodríguez-Clare (1997); “The Neoclassical Revival in Growth Economics: Has It Gone Too Far?” in B.
Bernanke and J. Rotemberg, eds., NBER Macroeconomics Annual (MIT Press, Cambridge) 73-103.
Levinsohn and Melitz (2002); “Productivity in a Differentiated Products Market Equilibrium,” Mimeo, NBER.
de Loecker, J. (2007); “Product Differentiation, Multi-Product Firms, and Estimating the Impact of Trade Liberalization on Productivity,” NBER Working Paper No. 13155.
Melitz, M. (2000); “Estimating Firm-level Productivity in Differentiated Product Industries,” Unpub. paper, Harvard University.
Olley, G. S. and A. Pakes (1996); “The Dynamics of Productivity in the Telecommunications Equipment Industry,” Econometrica, Vol. 64, 6, 1263-1297.
Rubin, D.B. (1976); “Inference and Missing Data,” Biometrika, 63, 581-592.
Schafer, J.L. (1997); “Analysis of Incomplete Multivariate Data,” London: Chapman and Hall.
Wooldridge, J.M. (2007); “Econometric Analysis of Cross Section and Panel Data,” The MIT Press. Cambridge, Massachusetts.

Appendix on tables and figures

Table 1.1: General information on plant level and production function (productivity) variables for analysis of manufacturing sector
General information at plant level
  Industrial classification: a) Food Processing, Beverages, and Tobacco; b) Textile and Leather; c) Wood and Wood Products, other than Pulp; d) Pulp, Paper, and Printing; e) Chemical, including Petrochemical; f) Non-Metallic Mineral Products; g) Iron, Steel, and Non-Ferrous Metal; h) Machinery; i) Miscellaneous Manufacturing Industries.
  Regional classification: a) Punjab; b) Sindh; c) Baluchistan; d) NWFP.
  Size classification: a) Small (< 25 employees); b) medium (>=25 & <100); c) large (>=100).
Production function variables
  Sales: Used as the measure of output for the production function estimation. Sales are defined as total annual sales. The series are deflated by using the Producer Price Indexes (PPI), base 2002.
  Employment: Total number of permanent and temporary workers.
  Total hours worked per year: Total number of employees multiplied by the average hours worked per year.
  Materials: Total costs of intermediate and raw materials used in production (excluding fuel). The series are deflated by using the Producer Price Indexes (PPI), base 2002.
  Capital stock: Net book value of machinery and equipment. The series are deflated by using the Producer Price Indexes (PPI), base 2002.
  User cost of capital: The user cost of capital is defined in terms of the opportunity cost of using capital; it is set at 15% of the net book value of machinery and equipment.
  Labor cost: Total expenditures on personnel. The series are deflated by using the Producer Price Indexes (PPI), base 2002.
Dependent variables in equation regressions and linear probability models
  Exports: Dummy variable that takes value 1 if exports are greater than 10%.
  Foreign Direct Investment: Dummy variable that takes value 1 if any part of the capital of the firm is foreign.
  Wages: Real wage is defined as the total expenditures on personnel (deflated by using the Producer Price Indexes (PPI), base 2002) divided by the total number of permanent and temporary workers.
  Employment: Total number of permanent and temporary workers.
All series are in US dollars; data obtained from WDI, The World Bank, 2008.

Table 1.2: General information on plant level and labor productivity variables for analysis of services sector
General information at plant level
  Industrial classification: a) Wholesale and retail trade; b) other services including IT, construction and transport.
  Regional classification: a) Punjab; b) Sindh; c) Baluchistan; d) NWFP.
  Size classification: a) Small (< 25 employees); b) medium (>=25 & <100); c) large (>=100).
Labor productivity variables
  Sales: Used as the measure of output. Sales are defined as total annual sales. The series are deflated by using the Producer Price Indexes (PPI), base 2002.
  Employment (demand for labor): Total number of permanent and temporary workers.
  Total hours worked per year: Total number of employees multiplied by the average hours worked per year.
  Electricity cost: Total annual costs of electricity. The series are deflated by using the Producer Price Indexes (PPI), base 2002.
  Communication cost: Total annual costs of communications services. The series are deflated by using the Producer Price Indexes (PPI), base 2002.
  Rental cost of capital: Total annual cost of rental of land/buildings, equipment, furniture. The series are deflated by using the Producer Price Indexes (PPI), base 2002.
  Labor cost: Total annual cost of labor (including wages, salaries, bonuses, social payments). The series are deflated by using the Producer Price Indexes (PPI), base 2002.
All series are in US dollars; data obtained from WDI, The World Bank, 2008.

Table 2.1: Definition of IC variables: infrastructure
Name: Definition
Days to clear customs to export: Average number of days to clear customs when exporting directly.
Longest # of days to clear cust. to export: Longest number of days to clear customs when exporting directly.
Days to clear customs to import: Average number of days to clear customs when importing.
Longest # of days to clear cust. to import: Longest number of days to clear customs when importing.
Power outages: Total number of power outages suffered by the plant in 2005.
Average duration of power outages: Average duration of power outages suffered, in hours, conditional on the plant reporting power outages.
Total duration of power outages by year: Total duration of power outages suffered by the plant by month, in hours, conditional on the plant reporting power outages.
Losses due to power outages: Losses due to power outages as a percentage of total annual sales, conditional on the plant reporting power outages.
Wait for a power supply: Number of days waiting to obtain an electricity supply, conditional on having applied for an electrical connection.
Dummy for gifts to obtain a power supply: Gifts expected or requested to obtain an electrical connection, conditional on having applied for an electrical connection.
Dummy for own generator: Dummy variable taking value 1 if the firm has its own power generator.
Electricity from a generator: Percentage of the electricity used by the plant that is provided by its own generator.
Dummy for insufficient water supply: Dummy variable that takes value 1 if the firm has experienced insufficient water supply for production during 2005.
Water outages: Total number of water outages suffered by the plant in 2005.
Average duration of water outages: Average duration of water outages suffered, in hours, conditional on the plant reporting water outages.
Total duration of water outages by year: Total duration of water outages suffered by the plant by month, in hours, conditional on the plant reporting water outages.
Water from public sources: Percentage of water supply from public sources.
Wait for a water supply: Number of days waiting for a water supply, conditional on having applied for a water supply.
Dummy for gifts to obtain a water supply: Gifts expected or requested to obtain a water supply, conditional on having applied for a water supply.
Wait for a phone connection: Number of days waiting to obtain a phone connection, conditional on having applied for a phone connection.
Dummy for gifts to obtain a phone connection: Gifts expected or requested to obtain a phone connection, conditional on having applied for a phone connection.
Dummy for webpage: Dummy variable taking value 1 if the plant uses its own web page to communicate with clients and suppliers.
Dummy for e-mail: Dummy variable taking value 1 if the plant uses electronic mail to communicate with clients and suppliers.
Dummy for own transport: Dummy variable taking value 1 if the plant uses its own transport to make shipments to its customers.
Products with own transport: Percentage of shipments to customers that were transported by the establishment's own transport, as a percentage of annual revenue, conditional on the plant reporting having its own transport.
Shipment losses, exports: Percentage of the consignment value of the products shipped for direct export lost while in transit because of theft, breakage or spoilage.
Shipment losses, domestic: Percentage of the consignment value of the products shipped for import lost while in transit because of theft, breakage or spoilage.
Days of inventory of main input: Average number of days (measured in production days) that the main input is available in stock.
Wait for an import license: Total days to obtain an import license, conditional on having applied for an import license.
Dummy for gifts to obtain an import license: Gifts expected or requested to obtain an import license, conditional on having applied for an import license.
Dummy for industrial zone: Dummy taking value 1 if the plant is located in an industrial zone.
Low quality supplies*: Percentage of supplies that are of lower than agreed upon quality.
Sales lost due to delivery delays*: Percentage of sales lost due to delivery delays of key inputs.
Dummy for broadband internet connection**: Dummy variable taking value 1 if the establishment has a high-speed, broadband Internet connection on its premises.
Number of internet outages**: Average number of times per month the establishment experienced unavailability of Internet connection.
Average duration of internet outages**: In a typical month last fiscal year, average duration of unavailability of Internet connection.
Dummy for security expenses for internet**: Dummy variable taking value 1 if security of Internet connections or authentication of parties in a transaction affects the volume and/or nature of purchases that the establishment makes over the Internet.
* Available only in 2002 ICS, ** Available only in services survey.

Table 2.2: Definition of IC variables: economic governance
Name: Definition
Dummy for conflicts with clients with a third party: Dummy taking value 1 if the plant has conflicts with clients with a third party involved.
Dummy for conflicts with clients with a court involved: Dummy taking value 1 if the plant has conflicts with clients with a court involved (conditional on having conflicts with clients with a third party involved).
Weeks to judgment: Number of weeks that the court took to come to judgment in the last conflict with clients (conditional on having conflicts with clients with a third party involved).
Dummy for security expenses: Dummy taking value 1 if the plant has security expenses.
Security expenses: Security expenses as a percentage of annual total sales.
Dummy for crime losses: Dummy taking value 1 if the plant has experienced losses due to criminal attempts in 2005.
Crime losses: Crime losses as a percentage of annual total sales in 2005.
Manager's time spent in bur. issues: In a typical week, percentage of manager's time spent dealing with bureaucratic issues.
Weeks to bureaucracy: Total number of weeks spent by management dealing with bureaucracy in 2005.
Number of tax inspections: Total number of inspections by tax officials received by the plant in 2005.
Dummy for gifts in tax inspections: Gifts expected or requested in inspections by tax officials.
Dummy for payments to obtain a contract with the government: Dummy that takes value 1 if firms operating in the same sector as the surveyed plant have to offer informal payments to obtain a contract with the government.
Payments to obtain a contract with the government: Payments to obtain a contract with the government as a percentage of contract value.
Payments to deal with bur. issues: Gifts or informal payments to public officials to “get things done” with regard to customs, taxes, licenses, regulations, services etc., as a percentage of annual sales.
Sales reported to taxes: Percentage of total annual sales that a typical firm operating in the plant's sector reports for tax purposes.
Workforce reported to taxes: Percentage of total work force that a typical firm operating in the plant's sector reports for tax purposes.
Dummy for interventionist labor regulation: Dummy variable that takes value 1 if labor regulation has affected the plant's employment decisions.
Wait for a construction permit: Days waiting to obtain a construction permit (conditional on having applied for a construction permit).
Dummy for gifts to obtain a construction permit: Gifts expected or requested to obtain a construction permit, conditional on having applied for a construction permit.
Wait for an operation license: Days waiting to obtain a main operating license (conditional on having applied for an operating license).
Dummy for gifts to obtain an operating license: Gifts expected or requested to obtain an operating license, conditional on having applied for an operating license.
Dummy for tax exemption: Dummy taking value 1 if the establishment is currently using or benefiting from: customs duty drawback, export rebate, sales tax refunds and profit tax exemption.
Number of labor inspections: Total number of inspections by labor officials received by the plant in 2005.
Days of production lost due to absenteeism*: Total number of production days lost due to worker absenteeism.
Illegal payments in protection*: Total amount of illegal payments to prevent violence (e.g. organized crime).
Dummy for informal competition**: Dummy taking value 1 if the establishment competes against unregistered or informal trading firms.
Dummy for bribes to public officials**: Dummy variable taking value 1 if the establishment has made any informal payment (payment required/expected) to the police, political party, etc. (i.e. Bhatha) to ensure the security of the establishment (protection from robbery, arson, or any other crime).
Dummy for policy discussion with government**: Dummy taking value 1 if the establishment participated, directly or through its representative body, in policy discussions with the local, provincial or federal government bodies during the last fiscal year.
* Available only in 2002 ICS, ** Available only in services survey.
Table 2.3: Definition of IC variables: finance
Name of the variable: Definition
Purchases paid before delivery: Percentage of annual purchases paid for before the delivery.
Purchases paid on delivery: Percentage of annual purchases paid for on delivery.
Purchases paid after delivery: Percentage of annual purchases paid for after the delivery.
Sales paid before delivery: Percentage of annual sales paid for before the delivery.
Sales paid on delivery: Percentage of annual sales paid for on delivery.
Sales paid after delivery: Percentage of annual sales paid for after the delivery.
Working capital financed by internal funds: Percentage of firm's working capital financed with internal funds.
Working capital financed by private banks: Percentage of firm's working capital financed with funds from private commercial banks.
Working capital financed by state-owned banks: Percentage of firm's working capital financed with funds from state-owned banks.
Working capital financed by family/friends: Percentage of firm's working capital financed with family/friends funds.
Working capital financed by non-bank financial institutions: Percentage of firm's working capital financed with funds from non-banking financial institutions.
Working capital financed by trade credit: Percentage of firm's working capital financed with credits from suppliers.
Working capital financed by informal funds: Percentage of firm's working capital financed with funds from informal sources.
New fixed assets financed by internal funds: Percentage of investments in new fixed assets financed with internal funds.
New fixed assets financed by private banks: Percentage of investments in new fixed assets financed with funds from private commercial banks.
New fixed assets financed by state-owned banks: Percentage of investments in new fixed assets financed with funds from state-owned banks.
New fixed assets financed by family/friends: Percentage of investments in new fixed assets financed with family/friends funds.
New fixed assets financed by non-bank financial institutions: Percentage of investments in new fixed assets financed with funds from non-banking financial institutions.
New fixed assets financed by trade credit: Percentage of investments in new fixed assets financed with credits from suppliers.
New fixed assets financed by informal funds: Percentage of investments in new fixed assets financed with funds from informal sources.
Dummy for checking or saving account: Dummy taking value 1 if the plant has a checking or saving account.
Own of the land: Percentage of the land on which the plant operates that is owned by the firm.
Dummy for credit line: Dummy that takes value 1 if the firm has access to a credit line or overdraft facility.
Dummy for loan: Dummy that takes value 1 if the firm has access to a loan line.
Dummy for loan with collateral: Dummy that takes value 1 if the firm has access to a loan line with collateral (conditional on having a loan line).
Value of the collateral: Value of the collateral as a percentage of the loan value (conditional on having a loan with collateral).
Dummy for debt: Dummy taking value 1 if any of the number of rejected loan applications is larger than the number of applications for a loan.
Dummy no loan because of complexity: Dummy that takes value 1 if the firm did not apply for a loan because of its complexity.
Dummy no loan because of cost: Dummy that takes value 1 if the firm did not apply for a loan because of its cost.
Dummy no loan because of collateral: Dummy that takes value 1 if the firm did not apply for a loan because of its collateral.
Rejected credit applications: Percentage of rejected credit applications.
Accepted credit applications: Percentage of accepted credit applications.
Dummy for external auditory: Dummy that takes value 1 if the firm has its annual statements externally audited.
Dummy for trade association*: Dummy taking value 1 if the establishment belongs to any trade association.
Percentage of credit line unused*: Percentage of the credit line currently unused.
Borrows in foreign currency*: Percentage of the establishment’s borrowing denominated in foreign currency.
Wait to clear a check*: Days that it takes to clear a check with the establishment’s banking institution.
Charges to clear a check*: Charge to clear a check with the establishment’s banking institution.
Wait to clear a domestic currency wire*: Days that it takes to clear a domestic currency wire with the establishment’s banking institution.
Charges to clear a domes. currency wire*: Charges to clear a domestic currency wire with the establishment’s banking institution.
Wait to clear a foreign currency wire*: Days that it takes to clear a foreign currency wire with the establishment’s banking institution.
Charges to clear a foreign currency wire*: Charge to clear a foreign currency wire with the establishment’s banking institution.
Dummy for clear title for owned land**: Dummy taking value 1 if, out of the lands owned, the establishment has a clear title.
* Available only in 2002 ICS, ** Available only in services survey.

Table 2.4: Definition of IC variables: innovation and competition
Name: Definition
Dummy for quality certification: Dummy taking value 1 if the firm has any kind of quality certification.
Dummy for foreign technology: Dummy taking value 1 if the plant uses technology licensed from a foreign-owned company.
Dummy for product innovation: Dummy taking value 1 if the plant has introduced any product innovation in the last 3 years.
Dummy for process innovation: Dummy taking value 1 if the plant has introduced any production process improvement in the last 3 years.
Dummy for joint venture: Dummy taking value 1 if the plant has agreed any new joint venture with a foreign company.
Dummy for outsourcing: Dummy taking value 1 if the plant subcontracts any part of the activity.
Dummy for R&D: Dummy that takes value 1 if the firm performed R&D activities during last year.
Computer controlled machinery: Percentage of plant’s machinery that is controlled by computer.
Staff with computer: Percentage of staff using a computer at work.
New equipment: Percentage of plant’s equipment that is less than 5 years old.
Dummy for FDI: Dummy that takes value 1 if any part of firm's capital is foreign.
Dummy for importer: Dummy taking value 1 if the firm imports more than 10% of the total purchases of intermediate materials.
Share of imports: Share of imported inputs over total purchases of intermediate materials.
Dummy for exporter: Dummy taking value 1 if the firm exports more than 10% of the total annual sales.
Exporting experience: Number of years of exporting experience.
Share of exports: Share of exports over total annual sales.
Dummy for local monopoly: Dummy taking value 1 if the firm is a local monopoly.
Dummy for less than 5 competitors: Dummy taking value 1 if the plant has fewer than 5 competitors in the local market.
Dummy for more than 5 competitors: Dummy taking value 1 if the plant has 5 or more competitors in the local market.
Number of competitors*: Total number of competitors within plant’s main product line.
Dummy for new service offered**: Dummy taking value 1 if during the last 3 years the establishment introduced to the market any new or improved service.
Dummy for new methods of providing services**: Dummy taking value 1 if during the last 3 years the establishment introduced to the market any new or improved methods of providing services.
Dummy for investment in IT**: Dummy taking value 1 if during the last year the establishment invested in information technologies.
* Available only in 2002 ICS, ** Available only in services survey.

Table 2.5: Definition of IC variables: labor markets and skills
Name: Definition
Staff - production workers: Percentage of production workers in staff.
Staff - female workers: Percentage of female workers in staff.
Staff - skilled workers: Percentage of skilled production workers in staff.
Staff - university education: Dummy taking value 1 if the typical production worker has at least one year of university education.
Dummy for training: Dummy taking value 1 if the firm provides formal (beyond on the job) training to its employees.
Training to production workers: Percentage of production workers receiving formal (beyond on the job) training.
Training to non-production workers: Percentage of non-production workers receiving formal (beyond on the job) training.
University education*: Manager experience in years.
Tenure*: Average tenure of the staff of the plant.
Experience of the manager: Number of years of experience of the manager in the establishment’s sector.
Education of the manager: Dummy taking value 1 if the manager has at least a bachelor degree.
Education of the manager (post-grade)**: Dummy taking value 1 if the manager has a postgraduate degree (MA, PhD).
* Available only in 2002 ICS, ** Available only in services survey.

Table 2.6: Definition of IC variables: corporate governance
Name: Definition
Largest shareholder: Percentage of firm's capital owned by the largest shareholder.
Dummy for incorporated company: Dummy that takes value 1 if the firm is an incorporated company.
Dummy for limited company: Dummy that takes value 1 if the firm is a limited company.
Dummy for state-owned firm: Dummy that takes value 1 if any part of firm's capital is public.
* Available only in 2002 ICS, ** Available only in services survey.

Table 2.7: Definition of IC variables: other control variables
Name: Definition
Age: Age of the firm in 2005.
Trade union: Percentage of workforce unionized.
Capacity utilization: Percentage of capacity utilized.
Dummy for increased sales Dummy taking value 1 if the plant has increased its sales Dummy for decreased sales Dummy taking value 1 if the plant has decreased its sales Dummy for help from SMEDA Dummy 1 if the plant is receiving support from SMEDA Dummy for help from EPB Dummy 1 if the plant is receiving support from EPB Dummy for help from BOI Dummy 1 if the plant is receiving support from BOI Days of production lost due to Total number of working days lost due to strikes strikes* Days of production lost due to civil Total number of production days lost due to civil unrests. unrests* Dummy for materials from rural Dummy 1 if the direct purchase at the source is the mechanism of supply used for raw villages with direct purchase materials originated from rural villages. Dummy for materials from rural Dummy 1 if a regular supplier located in the rural area is the mechanism of supply used for villages with local supplier raw materials originated from rural villages. Dummy for materials from rural Dummy 1 if a regular supplier located establishment’s city is the mechanism of supply used for villages with supplier from raw materials originated from rural villages. Dummy for materials from rural Dummy 1 if the sub-contractual arrangement is the mechanism of supply used for raw villages with sub-contractual materials originated from rural villages. Materials from rural villages Percentage of the inputs used by the establishment originate from rural villages Local area in square feet Total area of the local occupied by the establishment in squared feet. Dummy for initiatives to address Dummy taking value 1 if the firm overcame any initiative to address AIDS among its employees AIDS during last fiscal year. * Available only in 2002 ICS, ** Available only in services survey. 72 Table 3.1: Number of observations and response rate (in parentheses) of infrastructure variables Name of the variable Manufact. 
Manufacturing panel Services FY07 FY02 FY07 FY07 Days to clear customs to export 117(14.9) 103(25.6) 138(34.3) 3(1.9) Longest number of days to clear customs to export 114(14.5) 103(25.6) 130(32.3) 3(1.9) Days to clear customs to import 109(13.9) 66(16.4) 109(27.1) 0(0) Longest number of days to clear customs to import 104(13.3) 66(16.4) 104(25.9) 0(0) Power outages 780(99.5) 402(100) 401(99.8) 149(98.6) Average duration of power outages 783(99.9) 0(0) 401(99.8) 147(97.3) Total duration of power outages by year 780(99.5) 0(0) 401(99.8) 147(97.3) Losses due to power outages 758(96.7) 402(100) 394(98.0) 143(94.7) Wait for a power supply 50(6.4) 32(8.0) 34(8.5) 149(98.6) Dummy for gifts to obtain a power supply 775(98.9) 402(100) 399(99.3) 147(97.3) Dummy for own generator 784(100) 402(100) 402(100) 151(100) Electricity from a generator 778(99.2) 400(99.5) 400(99.5) 150(99.3) Dummy for insufficient water supply 781(99.6) 0(0) 402(100) 0(0) Water outages 778(99.2) 395(98.3) 401(99.8) 0(0) Average duration of water outages 773(98.6) 0(0) 400(99.5) 0(0) Total duration of water outages by year 772(98.5) 0(0) 399(99.3) 0(0) Water from public sources 778(99.2) 0(0) 398(99.0) 0(0) Wait for a water supply 19(2.4) 0(0) 20(5.01) 0(0) Dummy for gifts to obtain a water supply 779(99.4) 0(0) 400(99.5) 0(0) Wait for a phone connection 59(7.5) 57(14.2) 68(16.9) 149(98.6) Dummy for gifts to obtain a phone connection 773(98.6) 402(100) 397(98.8) 149(98.6) Dummy for webpage 202(25.8) 402(100) 197(49.0) 151(100) Dummy for e-mail 784(100) 402(100) 402(100) 151(100) Dummy for own transport 783(99.9) 0(0) 402(100) 0(0) Products with own transport 774(98.7) 0(0) 399(99.3) 0(0) Shipment losses, exports 120(15.3) 0(0) 139(34.6) 3(1.9) Shipment losses, domestic 760(96.9) 0(0) 380(94.5) 0(0) Days of inventory of main input 773(98.6) 401(99.8) 397(98.8) 140(92.7) Wait for an import license 7(0.9) 2(0.5) 12(3.0) 146(96.6) Dummy for gifts to obtain an import license 773(98.6) 402(100) 400(99.5) 
146(96.6) Dummy for industrial zone 784(100) 0(0) 402(100) 0(0) Low quality supplies * 0(0) 401(99.8) 0(0) 0(0) Sales lost due to delivery delays * 0(0) 401(99.8) 0(0) 0(0) Dummy for broadband internet connection** 0(0) 0(0) 0(0) 149(98.6) Number of internet outages** 0(0) 0(0) 0(0) 148(98) Average duration of internet outages** 0(0) 0(0) 0(0) 147(97.3) Dummy for security expenses for internet** 0(0) 0(0) 0(0) 11(7.2) * Available only in 2002 ICS, ** Available only in services survey. 73 Table 3.2: Number of observations and response rate (in parentheses) of economic governance variables Name of the variable Manufact. Manufacturing panel Services FY07 FY02 FY07 FY07 Dummy for conflicts with clients with a third part 776(99.0) 402(100) 399(99.3) 150(99.3) Dummy for conflicts with clients with a court involved 773(98.6) 402(100) 397(98.8) 150(99.3) Weeks to judgment 702(89.5) 0(0) 359(89.3) 141(93.3) Dummy for security expenses 783(99.9) 397(98.8) 402(100) 150(99.3) Security expenses 453(57.8) 397(98.8) 312(77.6) 86(56.9) Dummy for crime losses 780(99.5) 395(98.3) 401(99.8) 150(99.3) Crime losses 773(98.6) 395(98.3) 398(99.0) 150(99.3) Manager's time spent in bur. issues 781(99.6) 402(100) 402(100) 148(98.0) Weeks to bureaucracy 781(99.6) 402(100) 402(100) 148(98.0) Number of tax inspections 765(97.6) 397(98.8) 392(97.5) 150(99.3) Dummy for gifts in tax inspections 746(95.2) 392(97.5) 387(96.3) 143(94.7) Dummy for payments to obtain a contract with the government 280(35.7) 0(0) 181(45.0) 71(47.0) Payments to obtain a contract with the government 280(35.7) 0(0) 181(45.0) 71(47.0) Payments to deal with bur. 
issues 784(100) 399(99.3) 402(100) 151(100) Sales reported to taxes 684(87.2) 0(0) 366(91.0) 131(86.7) Workforce reported to taxes 652(83.2) 0(0) 363(90.3) 130(86.0) Dummy for interventionist labor regulation 784(100) 402(100) 402(100) 139(92.0) Wait for a construction permit 25(3.2) 5(1.2) 38(9.5) 147(97.3) Dummy for gifts to obtain a construction permit 763(97.3) 402(100) 394(98.0) 147(97.3) Wait for an operation license 15(1.9) 3(0.7) 13(3.2) 17(11.2) Dummy for gifts to obtain an operating license 775(98.9) 402(100) 399(99.3) 148(98.0) Dummy for tax exemption 783(99.9) 402(100) 402(100) 0(0) Number of labor inspections 757(96.6) 375(93.3) 394(98.0) 145(96.0) Days of production lost due to absenteeism * 0(0) 401(99.8) 0(0) 0(0) Illegal payments in protection * 0(0) 396(98.5) 0(0) 0(0) Dummy for informal competition** 0(0) 0(0) 0(0) 146(96.6) Dummy for bribes to public officials** 0(0) 0(0) 0(0) 148(98.0) Dummy for policy discussion with government** 0(0) 0(0) 0(0) 147(97.3) * Available only in 2002 ICS, ** Available only in services survey. 74 Table 3.3: Number of observations and response rate (in parentheses) of finance variables Name of the variable Manufact. 
Manufacturing panel Services FY07 FY02 FY07 FY07 Purchases paid before delivery 773(98.6) 402(100) 400(99.5) 147(97.3) Purchases paid on delivery 773(98.6) 402(100) 399(99.3) 147(97.3) Purchases paid after delivery 782(99.7) 402(100) 401(99.8) 147(97.3) Sales paid before delivery 772(98.5) 0(0) 398(99.0) 144(95.3) Sales paid on delivery 143(18.2) 0(0) 57(14.2) 88(58.2) Sales paid after delivery 781(99.6) 0(0) 402(100) 145(96.0) Working capital financed by internal founds 782(99.7) 402(100) 402(100) 150(99.3) Working capital financed by private banks 771(98.3) 401(99.8) 398(99.0) 145(96.0) Working capital financed by state-owned banks 769(98.1) 401(99.8) 398(99.0) 145(96.0) Working capital financed by family/friends 769(98.1) 401(99.8) 398(99.0) 145(96.0) Working capital financed bi non-bank financial institutions 769(98.1) 0(0) 398(99.0) 145(96.0) Working capital financed by trade credit 769(98.1) 402(100) 399(99.3) 145(96.0) Working capital financed by informal founds 769(98.1) 402(100) 398(99.0) 145(96.0) New fixed assets financed by internal founds 141(18.0) 98(24.4) 107(26.6) 26(17.2) New fixed assets financed by private banks 133(17.0) 98(24.4) 104(25.9) 25(16.5) New fixed assets financed by state-owned banks 132(16.8) 98(24.4) 104(25.9) 25(16.5) New fixed assets financed by family/friends 132(16.8) 98(24.4) 104(25.9) 25(16.5) New fixed assets financed by non-bank financial institutions 132(16.8) 0(0.01) 104(25.9) 25(16.5) New fixed assets financed by trade credit 132(16.8) 97(24.1) 104(25.9) 25(16.5) New fixed assets financed by informal founds 132(16.8) 98(24.4) 104(25.9) 25(16.5) Dummy for checking or saving account 778(99.2) 0(0) 400(99.5) 151(100) Own of the land 782(99.7) 385(95.8) 402(100) 149(98.6) Dummy for credit line 772(98.5) 402(100) 396(98.5) 147(97.3) Dummy for loan 776(99.0) 402(100) 396(98.5) 148(98.0) Dummy for loan with collateral 771(98.3) 402(100) 391(97.3) 147(97.3) Value of the collateral 784(100) 380(94.5) 402(100) 151(100) Dummy for 
debt 784(100) 0(0.01) 402(100) 151(100) Dummy no loan because of complexity 710(90.6) 315(78.4) 335(83.3) 132(87.4) Dummy no loan because of cost 712(90.8) 172(42.8) 335(83.3) 132(87.4) Dummy no loan because of collateral 710(90.6) 160(39.8) 335(83.3) 132(87.4) Rejected credit applications 44(5.6) 0(0) 51(12.7) 8(5.3) Accepted credit applications 44(5.6) 0(0) 51(12.7) 8(5.3) Dummy for external auditory 764(97.4) 402(100) 396(98.5) 150(99.3) Dummy for trade association* 0(0) 402(100) 0(0) 0(0) Percentage of credit line unused* 0(0) 96(23.9) 0(0) 0(0) Borrows in foreign currency* 0(0) 398(99.0) 0(0) 0(0) Wait to clear a check* 0(0) 385(95.8) 0(0) 0(0) Charges to clear a check* 0(0) 383(95.3) 0(0) 0(0) Wait to clear a domestic currency wire* 0(0) 364(90.5) 0(0) 0(0) Charges to clear a domestic currency wire* 0(0) 362(90.0) 0(0) 0(0) Wait to clear a foreign currency wire* 0(0) 299(74.4) 0(0) 0(0) Charges to clear a foreign currency wire* 0(0) 333(82.8) 0(0) 0(0) Dummy for clear title for owned land** 0(0) 0(0) 0(0) 144(95.3) * Available only in 2002 ICS, ** Available only in services survey. 75 Table 3.4: Number of observations and response rate (in parentheses) of innovation and competition variables Name of the variable Manufact. Manufacturing panel Services FY07 FY02 FY07 FY07 Dummy for quality certification 777(99.1) 400(99.5) 394(98.0) 145(96.0) Dummy for foreign technology 777(99.1) 0(0) 399(99.3) 0(0) Dummy for product innovation 779(99.4) 0(0) 402(100.) 0(0) Dummy for process innovation 781(99.6) 0(0) 402(100.) 
0(0) Dummy for joint venture 782(99.7) 0(0) 400(99.5) 0(0) Dummy for outsourcing 781(99.6) 0(0) 400(99.5) 0(0) Dummy for R&D 782(99.7) 0(0) 400(99.5) 0(0) Computer controlled machinery 783(99.9) 0(0) 401(99.8) 0(0) Staff with computer 780(99.5) 401(99.8) 401(99.8) 145(96.0) New equipment 773(98.6) 400(99.5) 398(99.0) 0(0) Dummy for FDI 782(99.7) 402(100) 402(100) 149(98.6) Dummy for importer 756(96.4) 402(100) 393(97.8) 0(0) Share of imports 756(96.4) 0(0.01) 393(97.8) 0(0) Dummy for exporter 727(92.7) 399(99.3) 395(98.3) 120(79.4) Exporting experience 782(99.7) 402(100) 401(99.8) 151(100) Share of exports 727(92.7) 399(99.3) 395(98.3) 120(79.4) Dummy for local monopoly 692(88.3) 400(99.5) 272(67.7) 0(0) Dummy for less that 5competitors 692(88.3) 400(99.5) 272(67.7) 0(0) Dummy for more than 5competitors 692(88.3) 400(99.5) 272(67.7) 0(0) Number of competitors* 0(0) 400(99.5) 0(0) 0(0) Dummy for new service offered** 0(0) 0(0) 0(0) 150(99.3) Dummy for new methods of providing services** 0(0) 0(0) 0(0) 150(99.3) Dummy for investment in IT** 0(0) 0(0) 0(0) 148(98.0) * Available only in 2002 ICS, ** Available only in services survey. Table 3.5: Number of observations and response rate (in parentheses) labor markets and skills variables Name of the variable Manufact. 
Manufacturing panel Services FY07 FY02 FY07 FY07 Staff - production workers 777(99.1) 402(100) 400(99.5) 0(0) Staff - female workers 745(95.0) 394(98.0) 394(98.0) 0(0) Staff - skilled workers 776(99.0) 402(100) 399(99.3) 0(0) Staff - university education 776(99.0) 400(99.5) 400(99.5) 148(98.0) Dummy for training 780(99.5) 402(100) 400(99.5) 148(98.0) Training to production workers 776(99.0) 389(96.8) 397(98.8) 146(96.6) Training to non-production workers 769(98.1) 389(96.8) 394(98.0) 146(96.6) University education* 0(0) 400(99.5) 0(0) 0(0) Tenure* 0(0) 400(99.5) 0(0) 0(0) Experience of the manager 780(99.5) 401(99.8) 402(100) 151(100) Education of the manager 781(99.6) 400(99.5) 402(100) 150(99.3) Education of the manager (post-grade) ** 0(0) 0(0) 0(0) 150(99.3) * Available only in 2002 ICS, ** Available only in services survey. 76 Table 3.6: Number of observations and response rate (in parentheses) of corporate governance variables Name of the variable Manufact. Manufacturing panel Services FY07 FY02 FY07 FY07 Largest shareholder 783(99.9) 0(0) 402(100) 151(100) Dummy for incorporated company 784(100) 402(100) 402(100) 151(100) Dummy for limited company 784(100) 402(100) 402(100) 151(100) Dummy for state-owned firm 782(99.7) 402(100) 402(100) 150(99.3) * Available only in 2002 ICS, ** Available only in services survey. Table 3.7: Number of observations and response rate (in parentheses) of other control variables Name of the variable Manufact. 
Manufacturing panel Services FY07 FY02 FY07 FY07 Age 784(100) 402(100) 402(100) 151(100) Trade union 777(99.1) 402(100) 397(98.8) 147(97.3) Capacity utilization 783(99.9) 399(99.3) 402(100) 0(0) Dummy for increased sales 784(100) 0(0) 402(100) 0(0) Dummy for decreased sales 784(100) 0(0) 402(100) 0(0) Dummy for help from SMEDA 778(99.2) 402(100) 396(98.5) 0(0) Dummy for help from EPB 777(99.1) 402(100) 397(98.8) 0(0) Dummy for help from BOI 776(99.0) 402(100) 397(98.8) 0(0) Days of production lost due to strikes* 0(0) 397(98.8) 0(0) 0(0) Days of production lost due to civil unrests* 0(0) 398(99.0) 0(0) 0(0) Dummy for materials from rural villages with direct purchase 674(86.0) 0(0) 307(76.4) 0(0) Dummy for materials from rural villages with local supplier 611(77.9) 0(0) 308(76.6) 0(0) Dummy for materials from rural villages with supplier from 614(78.3) 0(0) 295(73.4) 0(0) Dummy for materials from rural villages with sub-contractual 590(75.3) 0(0) 335(83.3) 0(0) Materials from rural villages 759(96.8) 0(0) 396(98.5) 0(0) Local area in square feet 0(0) 0(0) 0(0) 143(94.7) Dummy for initiatives to address AIDS 0(0) 0(0) 0(0) 150(99.3) * Available only in 2002 ICS, ** Available only in services survey. 77 Table 4.1: Missing values and outliers in productivity and labor productivity figures before the cleaning process and percentage in parenthesis (FY07 manufacturing, FY02- FY07 panel and FY07 services) A/. Pakistan ICS FY07, manufacturing FY06 FY07 Missing Missing observations Outliers observations Outliers Sales 43 (5.4) - 34 (4.3) - Materials 57 (7.2) 29 (3.7) 32 (4.1) 21 (2.7) Capital 784 (100) - 394 (50.2) - Employment 784 (100) - 2 (0.25) - Labor cost 40 (5.1) 4 (0.5) 20 (2.5) 7 (0.9) Useful 0 (0%) 358 (45.6%) Total 784 784 B/. 
Pakistan FY02-FY07 panel, manufacturing FY02 FY07 Missing Missing observations Outliers observations Outliers Sales 1 (0.2) - 23 (5.7) - Materials 1 (0.2) 16 (3.9) 29 (7.2) 21 (5.2) Capital 4 (1) 144 (35) - Employment 3 (1) - 1 (0.2) - Labor cost 8 (0.7) 2 (0.5) 19 (4.7) 7 (1.7) Useful 378 (94%) 215 (53.8%) Total 402 402 C/. Pakistan ICS FY07, services FY06 FY07 Missing Missing observations Outliers observations Outliers Sales 15 (9.9) 11 (7.2) Labor cost 9 (5.9) 4 (2.6) 2 (1.3) 6 (3.9) Electricity cost 3 (1.9) 1 (0.6) Communication cost 5 (3.3) 1 (0.7) 4 (2.6) 1 (0.7) Rental cost of capital 106 (70.2) 100 (66.2) Employment 2 (1.3) 1 (0.7) Useful 39 (25.8%) 40 (26.4%) Total 151 151 Note: For manufacturing, the variables listed are needed to compute productivity or total factor productivity. For services, the variables listed are needed to compute labor productivity. Outliers are defined as those observations with materials to sales or labor cost to sales larger than one. The ratio of materials to sales is replaced by the ratio of electricity costs plus communication costs to sales in the case of services. Useful observations are those observations without either missing values or outliers. Source: Authors’ calculations with ICS data. 78 Table 4.2: Representativeness by industry and state in the sampling frame and in the complete case, 2007 manufacturing ICS State Punjab Sindh Baluchistan NWFP Total by industry Industry/sector Sampli Complete Sampling Complete Sampling Complete Sampling Complete Sampling Complete ng case frame case frame case frame case frame case frame Food, Bev, # of obs. 71 30 50 26 13 8 18 2 152 66 and Tobacco % industry 46.7 45.5 32.9 39.4 8.6 12.1 11.8 3.0 100.0 100.0 Textile and # of obs. 154 54 90 35 8 6 8 1 260 96 Leather % industry 59.2 56.3 34.6 36.5 3.1 6.3 3.1 1.0 100.0 100.0 Wood and # of obs. 15 8 3 1 2 2 6 1 26 12 Wood % industry Products 57.7 66.7 11.5 8.3 7.7 16.7 23.1 8.3 100.0 100.0 Pulp, Paper, # of obs. 
22 15 8 4 2 2 3 0 35 21 and Printing % industry 62.9 71.4 22.9 19.0 5.7 9.5 8.6 0.0 100.0 100.0 Chemical # of obs. 55 25 19 10 5 5 2 1 81 41 % industry 67.9 61.0 23.5 24.4 6.2 12.2 2.5 2.4 100.0 100.0 Non-Metallic # of obs. 18 10 11 7 4 4 0 0 33 21 Mineral % industry Products 54.5 47.6 33.3 33.3 12.1 19.0 0.0 0.0 100.0 100.0 Iron, Steel, # of obs. 15 7 2 1 0 0 0 0 17 8 and Non- % industry Ferrous Metal 88.2 87.5 11.8 12.5 0.0 0.0 0.0 0.0 100.0 100.0 Machinery # of obs. 97 54 22 11 3 3 6 1 128 69 % industry 75.8 78.3 17.2 15.9 2.3 4.3 4.7 1.4 100.0 100.0 Miscellaneous # of obs. 35 18 15 4 2 2 0 0 52 24 Industries % industry 67.3 75.0 28.8 16.7 3.8 8.3 0.0 0.0 100.0 100.0 Total by state # of obs. 482 221 220 99 39 32 43 6 784 358 % industry 61.5 61.7 28.1 27.7 5.0 8.9 5.5 1.7 100.0 100.0 Source: authors’ calculations with ICS data Table 4.3: Representativeness by industry and state in the sampling frame and in the complete case, 2007 services ICS State Punjab Sindh Baluchistan NWFP Total by industry Industry/sect Sampli Complete Sampling Complete Sampling Complete Sampling Complete Sampling Complete or ng case frame case frame case frame case frame case frame Wholesale and # of obs. 10 2 20 0 12 1 10 8 52 11 retail trade % industry 19.2 18.2 38.5 0.0 23.1 9.1 19.2 72.7 100.0 100.0 Other services # of obs. 25 4 35 10 18 8 20 6 98 28 % industry 25.5 14.3 35.7 35.7 18.4 28.6 20.4 21.4 100.0 100.0 Total by state # of obs. 35 6 55 10 30 9 30 14 150 39 % industry 23.3 15.4 36.7 25.6 20.0 23.1 20.0 35.9 100.0 100.0 Source: authors’ calculations with ICS data 79 Table 5.1: Missing values and outliers in productivity and labor productivity figures after the cleaning process (FY07 manufacturing, FY02-FY07 panel and FY07 services) A/. Pakistan ICS 2007, manufacturing FY06 FY07 Missing Missing observations Outliers observations Outliers Sales 26 - 12 - Materials 29 22 12 20 Capital 39 - 24 - Employment 1 - 1 - Labor cost 26 1 12 1 Useful 710 (90.5%) 731 (93.2%) Total 784 784 B/. 
Pakistan FY02-FY07 panel, manufacturing
Variable | FY02: missing observations | FY02: outliers | FY07: missing observations | FY07: outliers
Sales | 1 | - | 8 | -
Materials | 1 | 4 | 8 | 14
Capital | 0 | - | 12 | -
Employment | 0 | - | 1 | -
Labor cost | 0 | 2 | 8 | 4
Useful | 394 (98%) | | 360 (89.6%) |
Total | 402 | | 402 |

C/. Pakistan ICS FY07, services
Columns: FY06 missing observations, FY06 outliers, FY07 missing observations, FY07 outliers
Sales 6 1
Labor cost 5 2 0 2
Electricity cost 2 0 4 3
Communication cost 3 1
Rental cost of capital 1 1
Employment 1 0
Useful 137 (90.7%) 143 (94.7%)
Total 151 151

Note: For manufacturing, the variables listed are needed to compute productivity or total factor productivity. For services, the variables listed are needed to compute labor productivity. Outliers are defined as those observations with a ratio of materials to sales or of labor cost to sales larger than one. For services, the ratio of materials to sales is replaced by the ratio of electricity costs plus communication costs to sales. Useful observations are those with neither missing values nor outliers.
Source: Authors' calculations with ICS data.

Table 5.2: Representativeness by industry and state in the sampling frame and in the sample with replacement of missing values in FY07, 2007 manufacturing ICS.
Industry/sector | Measure | Punjab: sampling frame | Punjab: sample with replac. | Sindh: sampling frame | Sindh: sample with replac. | Baluchistan: sampling frame | Baluchistan: sample with replac. | NWFP: sampling frame | NWFP: sample with replac. | Total by industry: sampling frame | Total by industry: sample with replac.
Food, Bev, and Tobacco | # of obs. | 71 | 30 | 50 | 26 | 13 | 8 | 18 | 2 | 152 | 66
Food, Bev, and Tobacco | % industry | 46.7 | 45.5 | 32.9 | 39.4 | 8.6 | 12.1 | 11.8 | 3.0 | 100.0 | 100.0
Textile and Leather | # of obs. | 154 | 54 | 90 | 35 | 8 | 6 | 8 | 1 | 260 | 96
Textile and Leather | % industry | 59.2 | 56.3 | 34.6 | 36.5 | 3.1 | 6.3 | 3.1 | 1.0 | 100.0 | 100.0
Wood and Wood Products | # of obs. | 15 | 8 | 3 | 1 | 2 | 2 | 6 | 1 | 26 | 12
Wood and Wood Products | % industry | 57.7 | 66.7 | 11.5 | 8.3 | 7.7 | 16.7 | 23.1 | 8.3 | 100.0 | 100.0
Pulp, Paper, and Printing | # of obs. | 22 | 15 | 8 | 4 | 2 | 2 | 3 | 0 | 35 | 21
Pulp, Paper, and Printing | % industry | 62.9 | 71.4 | 22.9 | 19.0 | 5.7 | 9.5 | 8.6 | 0.0 | 100.0 | 100.0
Chemical | # of obs. | 55 | 25 | 19 | 10 | 5 | 5 | 2 | 1 | 81 | 41
Chemical | % industry | 67.9 | 61.0 | 23.5 | 24.4 | 6.2 | 12.2 | 2.5 | 2.4 | 100.0 | 100.0
Non-Metallic Mineral Products | # of obs. | 18 | 10 | 11 | 7 | 4 | 4 | 0 | 0 | 33 | 21
Non-Metallic Mineral Products | % industry | 54.5 | 47.6 | 33.3 | 33.3 | 12.1 | 19.0 | 0.0 | 0.0 | 100.0 | 100.0
Iron, Steel, and Non-Ferrous Metal | # of obs. | 15 | 7 | 2 | 1 | 0 | 0 | 0 | 0 | 17 | 8
Iron, Steel, and Non-Ferrous Metal | % industry | 88.2 | 87.5 | 11.8 | 12.5 | 0.0 | 0.0 | 0.0 | 0.0 | 100.0 | 100.0
Machinery | # of obs. | 97 | 54 | 22 | 11 | 3 | 3 | 6 | 1 | 128 | 69
Machinery | % industry | 75.8 | 78.3 | 17.2 | 15.9 | 2.3 | 4.3 | 4.7 | 1.4 | 100.0 | 100.0
Miscellaneous Industries | # of obs. | 35 | 18 | 15 | 4 | 2 | 2 | 0 | 0 | 52 | 24
Miscellaneous Industries | % industry | 67.3 | 75.0 | 28.8 | 16.7 | 3.8 | 8.3 | 0.0 | 0.0 | 100.0 | 100.0
Total by state | # of obs. | 482 | 221 | 220 | 99 | 39 | 32 | 43 | 6 | 784 | 358
Total by state | % industry | 61.5 | 61.7 | 28.1 | 27.7 | 5.0 | 8.9 | 5.5 | 1.7 | 100.0 | 100.0
Source: Authors' calculations with ICS data.

Table 5.3: Representativeness by industry and state in the sampling frame and in the sample with replacement of missing values in FY07, 2007 services ICS.
Industry/sector | Measure | Punjab: sampling frame | Punjab: sample with replac. | Sindh: sampling frame | Sindh: sample with replac. | Baluchistan: sampling frame | Baluchistan: sample with replac. | NWFP: sampling frame | NWFP: sample with replac. | Total by industry: sampling frame | Total by industry: sample with replac.
Wholesale and retail trade | # of obs. | 10 | 2 | 20 | 0 | 12 | 1 | 10 | 8 | 52 | 11
Wholesale and retail trade | % industry | 19.2 | 18.2 | 38.5 | 0.0 | 23.1 | 9.1 | 19.2 | 72.7 | 100.0 | 100.0
Other services | # of obs. | 25 | 4 | 35 | 10 | 18 | 8 | 20 | 6 | 98 | 28
Other services | % industry | 25.5 | 14.3 | 35.7 | 35.7 | 18.4 | 28.6 | 20.4 | 21.4 | 100.0 | 100.0
Total by state | # of obs. | 35 | 6 | 55 | 10 | 30 | 9 | 30 | 14 | 150 | 39
Total by state | % industry | 23.3 | 15.4 | 36.7 | 25.6 | 20.0 | 23.1 | 20.0 | 35.9 | 100.0 | 100.0
Source: Authors' calculations with ICS data.

Table 5.4: Patterns of missing values in production function variables
Sales | Materials | Capital | Employment | Labor cost | # of m.v. | #Obs | %Obs
| | | | | 0 | 377 | 48.1
| | | | | 1 | 351 | 44.8
| | | | | 2 | 13 | 1.7
| | | | | 2 | 12 | 1.5
| | | | | 4 | 10 | 1.3
| | | | | 1 | 6 | 0.8
| | | | | 1 | 3 | 0.4
| | | | | 3 | 3 | 0.4
| | | | | 3 | 2 | 0.3
| | | | | 1 | 1 | 0.1
| | | | | 2 | 1 | 0.1
| | | | | 2 | 1 | 0.1
| | | | | 2 | 1 | 0.1
| | | | | 3 | 1 | 0.1
| | | | | 3 | 1 | 0.1
Yellow means that data is available. Conversely, white means that data is missing.
Source: Authors' calculations with Pakistan ICS data.
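A tabulation like Table 5.4 can be produced by counting, for each firm, which production function variables are missing. The following is only a minimal sketch under stated assumptions: the field names, the `missing_patterns` helper and the toy records are hypothetical and are not taken from the ICS files or from the authors' code.

```python
from collections import Counter

# Hypothetical field names for the five production function (PF) variables.
PF_VARS = ["sales", "materials", "capital", "employment", "labor_cost"]


def missing_patterns(records):
    """Count distinct missing-value patterns across the PF variables.

    Each record is a dict; a value of None (or an absent key) marks a
    missing observation. Returns rows (pattern, n_missing, n_obs, pct_obs),
    most frequent pattern first.
    """
    counts = Counter(
        tuple(rec.get(var) is None for var in PF_VARS) for rec in records
    )
    total = sum(counts.values())
    rows = []
    for pattern, n_obs in counts.most_common():
        pct = round(100.0 * n_obs / total, 1)
        rows.append((pattern, sum(pattern), n_obs, pct))
    return rows


# Toy records for illustration only (not ICS data).
toy = [
    {"sales": 10.0, "materials": 4.0, "capital": 7.0, "employment": 12, "labor_cost": 2.0},
    {"sales": 8.0, "materials": 3.0, "capital": None, "employment": 9, "labor_cost": 1.5},
    {"sales": 5.0, "materials": 2.0, "capital": None, "employment": 6, "labor_cost": 1.0},
    {"sales": None, "materials": None, "capital": None, "employment": 4, "labor_cost": None},
]

patterns = missing_patterns(toy)
for pattern, n_missing, n_obs, pct in patterns:
    # "X" marks an available value, "." a missing one, in PF_VARS order.
    flags = " ".join("." if is_missing else "X" for is_missing in pattern)
    print(flags, n_missing, n_obs, pct)
```

Each output row mirrors one row of the table: the availability pattern, the number of missing variables in that pattern, the number of firms sharing it, and their share of the sample.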
Table 5.5: Pattern of missing values in Pakistan by key IC variables (% of missing values in PF vars with respect to categories of IC vars)
IC variable | NO: no M.V. | NO: any M.V. | YES: no M.V. | YES: any M.V. | Corr. with TFP | Corr. with sales | Corr. with # of M.V.
Own generator | 43.0 | 57.0 | 61.5 | 38.5 | 0.2123 | 0.1522 | -0.1011
Power outages | 50.6 | 49.4 | 47.6 | 52.4 | -0.1271 | -0.1004 | 0.0186
Water outages | 47.7 | 52.3 | 48.7 | 51.3 | 0.1239 | -0.0015 | 0.0485
E-mail usage | 39.3 | 60.7 | 64.9 | 35.1 | 0.2565 | 0.1247 | -0.1734
Internet usage | 50.0 | 50.0 | 71.9 | 28.1 | -0.0267 | 0.0244 | -0.1345
Informality: sales reported | 47.6 | 52.4 | 52.2 | 47.8 | -0.029 | -0.0187 | -0.0097
Informality: workforce reported | 51.2 | 48.8 | 32.1 | 67.9 | -0.0722 | -0.0347 | 0.0923
Corruption: payments to deal with bureaucracy | 39.1 | 60.9 | 55.7 | 44.3 | 0.0745 | 0.0164 | -0.088
Corruption: payments to obtain contract with govern. | 53.2 | 46.8 | 67.4 | 32.6 | -0.056 | -0.0386 | -0.1178
Crime losses | 47.5 | 52.5 | 53.5 | 46.5 | 0.0913 | 0.0906 | -0.0324
Security expenses | 33.0 | 67.0 | 58.1 | 41.9 | 0.1651 | 0.069 | -0.201
Loan | 44.6 | 55.4 | 70.3 | 29.7 | 0.189 | 0.1343 | -0.1426
Credit line | 43.3 | 56.7 | 64.3 | 35.7 | 0.2316 | 0.1611 | -0.1541
Auditory | 41.0 | 59.0 | 66.5 | 33.5 | 0.2505 | 0.1516 | -0.0701
ISO | 44.4 | 55.6 | 62.4 | 37.6 | 0.1949 | 0.1719 | -0.0429
New product | 45.8 | 54.2 | 64.5 | 35.5 | 0.1779 | 0.139 | 0.02
Source: Authors' calculations with Pakistan ICS data.

Table 5.6: Number of missing values in production function variables by size
Variable | Measure | Small | Medium | Large
Totals by size | | 534 | 139 | 103
Sales | Number of missing | 20 | 10 | 4
Sales | Perc. over totals by size | 3.7 | 7.2 | 3.9
Employment | Number of missing | 2 | 0 | 2
Employment | Perc. over totals by size | 0.4 | 0.0 | 1.9
Materials | Number of missing | 22 | 6 | 3
Materials | Perc. over totals by size | 4.1 | 4.3 | 2.9
Capital | Number of missing | 304 | 54 | 33
Capital | Perc. over totals by size | 56.9 | 38.8 | 32.0
Labor cost | Number of missing | 11 | 5 | 3
Labor cost | Perc. over totals by size | 2.1 | 3.6 | 2.9
Source: Authors' calculations with Pakistan ICS data.
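The percentages in Table 5.6 are the number of missing values divided by the number of firms in each size class. The short check below recomputes them from the counts reported in the table; the dictionary layout and the `missing_share` helper are illustrative assumptions, not part of the original analysis.

```python
# Counts copied from Table 5.6: firms per size class and missing values.
TOTALS = {"small": 534, "medium": 139, "large": 103}
MISSING = {
    "sales": {"small": 20, "medium": 10, "large": 4},
    "employment": {"small": 2, "medium": 0, "large": 2},
    "materials": {"small": 22, "medium": 6, "large": 3},
    "capital": {"small": 304, "medium": 54, "large": 33},
    "labor_cost": {"small": 11, "medium": 5, "large": 3},
}


def missing_share(variable, size):
    """Percentage of firms in a size class with `variable` missing."""
    return 100.0 * MISSING[variable][size] / TOTALS[size]


for variable in MISSING:
    shares = [f"{missing_share(variable, size):.1f}" for size in TOTALS]
    print(variable, *shares)
```

Capital stands out: more than half of small firms lack a capital figure, against roughly a third of large firms, which is the pattern the replacement procedures have to address.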
Table 5.7: Representativeness of sampling frame, complete case and sample with replacement in Pakistan
a) By industry
Sample | Measure | Food, Bevs., Tobacco | Textile and Leather | Wood | Pulp | Chemical | Non-Metallic | Iron, Steel | Machinery | Miscellaneous | Total
Sampling frame | # Obs | 151 | 258 | 25 | 35 | 80 | 33 | 17 | 128 | 52 | 779
Sampling frame | Perc over total | 19.4 | 33.1 | 3.2 | 4.5 | 10.3 | 4.2 | 2.2 | 16.4 | 6.7 | 100.0
Complete case | # Obs | 65 | 95 | 11 | 21 | 40 | 21 | 8 | 69 | 24 | 354
Complete case | Perc over total | 18.4 | 26.8 | 3.1 | 5.9 | 11.3 | 5.9 | 2.3 | 19.5 | 6.8 | 100.0
Sample with replacing | # Obs | 142 | 240 | 19 | 32 | 76 | 31 | 16 | 121 | 49 | 726
Sample with replacing | Perc over total | 19.6 | 33.1 | 2.6 | 4.4 | 10.5 | 4.3 | 2.2 | 16.7 | 6.7 | 100.0
b) By size
Sample | Measure | Small | Medium | Large | Total
Sampling frame | # Obs | 534 | 139 | 103 | 776
Sampling frame | Perc over total | 68.8 | 17.9 | 13.3 | 100.0
Complete case | # Obs | 215 | 79 | 60 | 354
Complete case | Perc over total | 60.7 | 22.3 | 16.9 | 100.0
Sample with replacing | # Obs | 502 | 132 | 90 | 724
Sample with replacing | Perc over total | 69.3 | 18.2 | 12.4 | 100.0
Source: Authors' calculations with Pakistan ICS data.

Table 5.8: Percentage of observations available due to missing values, by industry and region
Industry | Sample | Small: #Obs | Small: Perc. | Medium: #Obs | Medium: Perc. | Large: #Obs | Large: Perc. | Total: #Obs | Total: Perc.
Food | Original sample | 111 | | 28 | | 12 | | 151 |
Food | Without replacing | 43 | 38.7 | 16 | 57.1 | 6 | 50.0 | 65 | 43.0
Food | With replacing | 107 | 96.4 | 27 | 96.4 | 8 | 66.7 | 142 | 94.0
Textile and leather | Original sample | 171 | | 45 | | 40 | | 256 |
Textile and leather | Without replacing | 52 | 30.4 | 23 | 51.1 | 20 | 50.0 | 95 | 37.1
Textile and leather | With replacing | 162 | 94.7 | 42 | 93.3 | 34 | 85.0 | 238 | 93.0
Wood | Original sample | 23 | | 1 | | 1 | | 25 |
Wood | Without replacing | 10 | 43.5 | 1 | 100.0 | 0 | 0.0 | 11 | 44.0
Wood | With replacing | 18 | 78.3 | 1 | 100.0 | 0 | 0.0 | 19 | 76.0
Pulp, paper and printing | Original sample | 22 | | 7 | | 6 | | 35 |
Pulp, paper and printing | Without replacing | 11 | 50.0 | 5 | 71.4 | 5 | 83.3 | 21 | 60.0
Pulp, paper and printing | With replacing | 20 | 90.9 | 6 | 85.7 | 6 | 100.0 | 32 | 91.4
Chemicals | Original sample | 44 | | 16 | | 20 | | 80 |
Chemicals | Without replacing | 19 | 43.2 | 7 | 43.8 | 14 | 70.0 | 40 | 50.0
Chemicals | With replacing | 41 | 93.2 | 16 | 100.0 | 19 | 95.0 | 76 | 95.0
Non metallic minerals | Original sample | 18 | | 9 | | 6 | | 33 |
Non metallic minerals | Without replacing | 10 | 55.6 | 6 | 66.7 | 5 | 83.3 | 21 | 63.6
Non metallic minerals | With replacing | 17 | 94.4 | 9 | 100.0 | 5 | 83.3 | 31 | 93.9
Iron, steel | Original sample | 11 | | 6 | | 0 | | 17 |
Iron, steel | Without replacing | 4 | 36.4 | 4 | 66.7 | 0 | 0.0 | 8 | 47.1
Iron, steel | With replacing | 10 | 90.9 | 6 | 100.0 | 0 | 0.0 | 16 | 94.1
Machinery | Original sample | 91 | | 21 | | 15 | | 127 |
Machinery | Without replacing | 49 | 53.8 | 12 | 57.1 | 8 | 53.3 | 69 | 54.3
Machinery | With replacing | 87 | 95.6 | 19 | 90.5 | 15 | 100.0 | 121 | 95.3
Miscellaneous | Original sample | 43 | | 6 | | 3 | | 52 |
Miscellaneous | Without replacing | 17 | 39.5 | 5 | 83.3 | 2 | 66.7 | 24 | 46.2
Miscellaneous | With replacing | 40 | 93.0 | 6 | 100.0 | 3 | 100.0 | 49 | 94.2
Total | Original sample | 534 | | 139 | | 103 | | 776 |
Total | Without replacing | 215 | 40.3 | 79 | 56.8 | 60 | 58.3 | 354 | 45.6
Total | With replacing | 502 | 94.0 | 132 | 95.0 | 90 | 87.4 | 724 | 93.3
Source: Authors' calculations with Pakistan ICS data.

Table 6.1: Robust IC elasticities and semi-elasticities with respect to productivity – OLS Estimation (manuf. FY07)
Block of ICA variables | Explanatory ICA variable | Two steps, Solow residual: Restricted | Two steps, Solow residual: Unrestric. | Single step, Cobb-Douglas: Restricted | Single step, Cobb-Douglas: Unrestric. | Single step, Translog: Restricted | Single step, Translog: Unrestric.
Infrastructure | Number of power outages (log)(b) | -0.050* | -0.050* | -0.050* | -0.036 | -0.044* | -0.026
Infrastructure | Dummy for own generator (b) | 0.071 | 0.058 | 0.108* | 0.092* | 0.095* | 0.074
Infrastructure | Products with own transport (%)(b) | 0.001 | 0.001* | 0.001 | 0.002** | 0.001* | 0.001
Infrastructure | Days of inventory of main intermediate material (log)(b) | 0.057** | 0.060** | 0.068*** | 0.075*** | 0.058** | 0.065***
Infrastructure | Dummy for industrial zone | 0.128* | 0.131* | 0.202** | 0.199** | 0.172** | 0.159**
Economic governance | Dummy for conflicts with clients with a court involved (b) | 0.543*** | 0.501*** | 0.542*** | 0.352** | 0.503*** | 0.545***
Economic governance | Dummy for security expenses (b) | 0.085* | 0.084* | 0.099** | 0.081* | 0.090* | 0.105**
Economic governance | Crime losses (%)(b) | -0.005*** | -0.005** | -0.003 | -0.003* | -0.004** | -0.004**
Economic governance | Payments to obtain a contract with the govern. (b) | -0.0002 | -0.0002 | -0.0003 | -0.0001 | -0.0003 | -0.0004**
Economic governance | Sales reported to taxes (%)(b) | 0.001 | 0.002* | 0.001 | 0.001* | 0.001* | 0.001
Economic governance | Dummy for gifts in tax inspections (b) | -0.078 | -0.079* | -0.07 | -0.042 | -0.068 | -0.025
Finance | Purchases paid before delivery (%)(b) | -0.002 | -0.002 | -0.002 | -0.001 | -0.002* | -0.002*
Finance | Working capital financed by internal funds (%)(b) | -0.001 | -0.001 | -0.001 | -0.002** | -0.001 | -0.002*
Finance | Working capital financed by private banks (%)(b) | 0.004* | 0.004* | 0.005** | 0.002 | 0.004* | 0.002
Finance | Working capital financed by family/friends (%)(b) | -0.013* | -0.012* | -0.011* | -0.011* | -0.010** | -0.01
Finance | Working capital financed by informal funds (%)(b) | -0.019 | -0.018 | -0.025** | -0.026** | -0.022* | -0.025**
Finance | Dummy for checking or saving account (b) | 0.075 | 0.063 | 0.09 | 0.133** | 0.085 | 0.111
Innovation and competition | Dummy for process innovation (b) | 0.300*** | 0.291** | 0.311** | 0.265*** | 0.310** | 0.267**
Innovation and competition | New equipment (%)(b) | 0.001** | 0.001** | 0.001* | 0.001 | 0.001* | 0.001
Innovation and competition | Dummy for FDI (b) | 0.191 | 0.238 | 0.26 | 0.267 | 0.343* | 0.295
Labor markets and skills | Staff - female workers (%)(b) | -0.005* | -0.006* | -0.006** | -0.007*** | -0.005* | -0.005*
Labor markets and skills | Dummy for training (b) | 0.186* | 0.179 | 0.262*** | 0.192** | 0.268*** | 0.219*
Corporate governance | Largest shareholder (%)(b) | -0.002** | -0.002** | -0.002** | -0.002** | -0.002** | -0.001
Other control variables | Dummy for help from BOI (b) | 0.272** | 0.263** | 0.324*** | 0.318*** | 0.302** | 0.363***
Other control variables | Dummy for materials from rural villages with local supplier (b) | 0.144 | 0.108 | 0.217* | 0.107 | 0.138 | 0.146
Observations | | 727 | 727 | 727 | 727 | 727 | 727
R-squared | | 0.22 | 0.21 | 0.92 | 0.93 | 0.93 | 0.94
NOTES: Two steps estimation: in the first step, equation (b2.1) is estimated by non-parametric techniques to compute productivity (Solow residual); in the second step, (3.2) and (3.3) are estimated by OLS using as dependent variable the Solow residual from the first step, either restricted or unrestricted. Single step estimation: (3.1), (3.2) and (3.3) are estimated in a single step by OLS, where (3.1) can be a Cobb-Douglas or a translog production function. Restricted: equal input-output elasticities for all the establishments in the country. Unrestricted: equal input-output elasticities for all the establishments in the same sector. * significant at 10%; ** significant at 5%; *** significant at 1%, based on robust standard errors clustered by industry and region. Each regression includes a set of industry, size and region dummies and a constant term. (a) Variables instrumented with the industry-region-size average. (b) Variables approximated with a proxy (only missing values replaced by the industry-region-size average).
Source: Authors' calculations with Pakistan ICS data.

Table 6.2: Further robustness; IC elasticities and semi-elasticities with respect to productivity – Random effects estimation (manuf. FY07 & FY06)
Block of ICA variables | Explanatory ICA variable | Two steps, Solow residual: Restricted | Two steps, Solow residual: Unrestric. | Single step, Cobb-Douglas: Restricted | Single step, Cobb-Douglas: Unrestric. | Single step, Translog: Restricted | Single step, Translog: Unrestric.
Infrastructure | Number of power outages (log)(b) | -0.042** | -0.041** | -0.035* | -0.027 | -0.025 | -0.011
Infrastructure | Dummy for own generator (b) | 0.091 | 0.08 | 0.174** | 0.144** | 0.165** | 0.147*
Infrastructure | Products with own transport (%)(b) | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001
Infrastructure | Days of inventory of main intermediate material (log)(b) | 0.050* | 0.055* | 0.091*** | 0.087*** | 0.076** | 0.073**
Infrastructure | Dummy for industrial zone | 0.145** | 0.146** | 0.256*** | 0.282*** | 0.247*** | 0.241***
Economic governance | Dummy for conflicts with clients with a court involved (b) | 0.697*** | 0.657*** | 0.707*** | 0.647*** | 0.676*** | 0.773***
Economic governance | Dummy for security expenses (b) | 0.104* | 0.104* | 0.167*** | 0.137** | 0.114* | 0.143**
Economic governance | Crime losses (%)(b) | -0.004 | -0.004 | -0.003 | -0.002 | -0.002 | -0.002
Economic governance | Payments to obtain a contract with the govern. (b) | -0.0002 | -0.0003 | 0.0001 | -0.0002 | -0.0004 | -0.0003
Economic governance | Sales reported to taxes (%)(b) | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 | 0.001
Economic governance | Dummy for gifts in tax inspections (b) | -0.06 | -0.061 | -0.033 | -0.027 | -0.05 | -0.028
Finance | Purchases paid before delivery (%)(b) | -0.002 | -0.002 | -0.002 | -0.001 | -0.002* | -0.002*
Finance | Working capital financed by internal funds (%)(b) | -0.001 | -0.001 | -0.001 | -0.002 | -0.001 | -0.002
Finance | Working capital financed by private banks (%)(b) | 0.003 | 0.003 | 0.003 | 0.001 | 0.002 | 0.001
Finance | Working capital financed by family/friends (%)(b) | -0.007 | -0.006 | -0.01 | -0.009 | -0.009 | -0.007
Finance | Working capital financed by informal funds (%)(b) | -0.013 | -0.011 | -0.018 | -0.023 | -0.018 | -0.024
Finance | Dummy for checking or saving account (b) | 0.037 | 0.027 | 0.141** | 0.162** | 0.116* | 0.133**
Innovation and competition | Dummy for process innovation (b) | 0.251** | 0.246** | 0.285*** | 0.273*** | 0.306*** | 0.262**
Innovation and competition | New equipment (%)(b) | 0.001 | 0.001 | 0.001 | 0 | 0.001 | 0.001
Innovation and competition | Dummy for FDI (b) | 0.141 | 0.18 | 0.252 | 0.184 | 0.303 | 0.236
Labor markets and skills | Staff - female workers (%)(b) | -0.004 | -0.005 | -0.003 | -0.006 | -0.004 | -0.004
Labor markets and skills | Dummy for training (b) | 0.102 | 0.096 | 0.255** | 0.085 | 0.252** | 0.183
Corporate governance | Largest shareholder (%)(b) | -0.003*** | -0.003*** | -0.004*** | -0.003*** | -0.003*** | -0.003***
Other control variables | Dummy for help from BOI (b) | 0.215 | 0.207 | 0.306** | 0.261** | 0.311** | 0.335**
Other control variables | Dummy for materials from rural villages with local supplier (b) | 0.203 | 0.16 | 0.298** | 0.265** | 0.214 | 0.197
Observations | | 1428 | 1428 | 1428 | 1428 | 1428 | 1428
R-squared | | 0.19 | 0.19 | 0.91 | 0.92 | 0.91 | 0.93
NOTES: Two steps estimation: in the first step, equation (b2.1) is estimated by non-parametric techniques to compute productivity (Solow residual); in the second step, (3.2) and (3.3) are estimated by OLS using as dependent variable the Solow residual from the first step, either restricted or unrestricted. Single step estimation: (3.1), (3.2) and (3.3) are estimated in a single step by OLS, where (3.1) can be a Cobb-Douglas or a translog production function. Restricted: equal input-output elasticities for all the establishments in the country. Unrestricted: equal input-output elasticities for all the establishments in the same sector. * significant at 10%; ** significant at 5%; *** significant at 1%, based on robust standard errors clustered by industry and region. Each regression includes a set of time, industry, size and region dummies and a constant term. (a) Variables instrumented with the industry-region-size average. (b) Variables approximated with a proxy (only missing values replaced by the industry-region-size average).
Source: Authors' calculations with Pakistan ICS data.

Table 6.3: Further robustness; IC elasticities and semi-elasticities under different replacement procedures of missing data (manuf. FY07)
Dependent variable: restricted Solow residual (productivity)
Blocks | Explanatory ICA variables | [1] ICA method | [2] Boot. ICA meth. | [3] EM alg. I | [4] EM alg. II | [5] EM alg. III | [6] Heckit | [7] Complete case
Infrastructure | Number of power outages (log)(b) | -0.050* | -0.050* | -0.055** | -0.046** | -0.069*** | -0.061** | -0.055
Infrastructure | Dummy for own generator (b) | 0.071 | 0.071* | 0.084 | 0.103 | 0.105** | 0.0671 | 0.069
Infrastructure | Products with own transport (%)(b) | 0.001 | 0.001 | 0.001** | 0.001* | 0.002*** | 0.001 | 0.002
Infrastructure | Days of inventory of main intermediate material (log)(b) | 0.057** | 0.057*** | 0.075** | 0.069* | 0.043** | 0.047 | 0.046
Infrastructure | Dummy for industrial zone | 0.128* | 0.128* | 0.147** | 0.228*** | 0.0990* | 0.087 | 0.081
Economic governance | Dummy for conflicts with clients with a court involved (b) | 0.543*** | 0.543** | 0.372** | 0.455** | 0.190* | 0.494** | 0.438***
Economic governance | Dummy for security expenses (b) | 0.085* | 0.085 | 0.097* | 0.101* | 0.126*** | 0.090 | 0.091
Economic governance | Crime losses (%)(b) | -0.005*** | -0.005 | -0.004* | -0.010 | -0.001 | -0.002 | -0.003
Economic governance | Payments to obtain a contract with the government (b) | -0.0002 | -0.000 | 0.000 | 0.000 | 0.000 | -0.0003 | -0.0004
Economic governance | Sales reported to taxes (%)(b) | 0.001 | 0.001** | 0.001 | 0.001 | 0.000 | 0.0010 | 0.001
Economic governance | Dummy for gifts in tax inspections (b) | -0.078 | -0.078 | -0.071 | -0.059 | -0.091** | -0.105 | -0.101
Finance | Purchases paid before delivery (%)(b) | -0.002 | -0.002 | -0.002 | -0.001 | -0.003*** | -0.002* | -0.002*
Finance | Working capital financed by internal funds (%)(b) | -0.001 | -0.001 | -0.0002 | -0.002* | -0.002** | 0.0001 | 0.0004
Finance | Working capital financed by private banks (%)(b) | 0.004* | 0.004** | 0.004 | 0.001 | 0.003 | 0.007*** | 0.008*
Finance | Working capital financed by family/friends (%)(b) | -0.013* | -0.013 | -0.020 | -0.021*** | -0.003 | -0.003 | -0.003
Finance | Working capital financed by informal funds (%)(b) | -0.019 | -0.019*** | 0.003 | 0.019** | 0.001 | -0.022 | -0.023
Finance | Dummy for checking or saving account (b) | 0.075 | 0.075*** | 0.035 | 0.099 | -0.007 | -0.025 | -0.030
Innovation and competition | Dummy for process innovation (b) | 0.300*** | 0.300*** | 0.316** | 0.362*** | 0.185*** | 0.334*** | 0.332***
Innovation and competition | New equipment (%)(b) | 0.001** | 0.001 | 0.002** | 0.002** | 0.002*** | 0.001* | 0.002*
Innovation and competition | Dummy for FDI (b) | 0.191 | 0.191 | 0.018 | 0.186 | 0.054 | 0.241 | 0.153
Labor mkts Staff - female workers (%)(b) -0.005* -0.005 -0.004 
-0.002 -0.002 -0.003 -0.002 and skills Dummy for training (b) 0.186* 0.186 0.110 0.016 0.211** 0.136 0.116 Corporate Largest shareholder (%)(b) -0.002** -0.002** -0.003*** -0.002** -0.001 -0.001 -0.002 governance Other Dummy for help from BOI (b) 0.272** 0.272 0.222 0.168 0.095 0.072 0.071 control Dummy for materials from rural villages with local 0.144 0.144*** 0.137 0.142 0.024 0.151 0.159 variables supplier (b) Observations 727 727 712 764 764 751 358 R-squared 0.22 0.22 0.20 0.19 0.40 0.30 Heckman's Lambda -0.056 [0.117] All regressions corresponds to the two step restricted case, see equation (3.3) in section 2.1. *significant at 10%; ** significant at 5%; *** significant at 1% given by robust standard errors corrected for correlation between cluster (industry and region). Each regression includes a set of time, industry, size and region dummies and a constant term. (a) Variables instrumented with the industry-region-size average. (b) Variables approximated with a proxy (only missing values replaced by the industry-region-size average). Source: Authors’ calculations with Pakistan ICS data. [1] Results from the replacement by ICA method, first column of Table 6.1. [2] ICA method with standard errors obtained by bootstrap, 1000 repetitions. [3] EM algorithm on inputs and output using as covariates industry/region/size dummies output and inputs. Convergence achieved after 6 iterations in the case of replacement of capital figures, 11 iterations for materials and 9 for sales. [4] EM algorithm on inputs and output using as covariates industry/region/size dummies. Convergence achieved after 6 iterations in the case of replacement of capital figures, 14 iterations for materials and 9 for sales. [5] EM algorithm on productivity (restricted Solow residual) using as covariates industry/region/size dummies and IC explanatory variables of productivity equation. Convergence achieved after 6 iterations. 
[6] Heckman endogenous selection model; probability of selection modeled with other IC variables.
[7] Model on the complete case, only available observations (no missing data and no replacement) used.
Source: Authors’ calculations with Pakistan ICS data.

Table 7: IC percentage contributions to aggregate log-productivity (manufacturing FY07)

| Blocks | Explanatory ICA variables | Aggregate log-productivity | Average log-productivity | Allocative efficiency |
|---|---|---|---|---|
| Infrastructure | Number of power outages (b) | -2.42 | -4.93 | 2.51 |
| | Dummy for own generator (b) | 2.29 | 0.65 | 1.65 |
| | Products with own transport (b) | 0.5 | 0.46 | 0.04 |
| | Days of inventory of main intermediate material (b) | 8.99 | 5.6 | 3.39 |
| | Dummy for industrial zone | 2.53 | 1.45 | 1.08 |
| Economic governance | Dummy for conflicts with clients with a court involved (b) | 2.18 | 0.55 | 1.63 |
| | Dummy for security expenses (b) | 2.5 | 1.61 | 0.89 |
| | Crime losses (b) | -0.02 | -0.1 | 0.09 |
| | Payments to obtain a contract with the government (b) | 0 | -0.07 | 0.06 |
| | Sales reported to taxes (b) | 4.61 | 4.44 | 0.17 |
| | Dummy for gifts in tax inspections (b) | -0.37 | -0.58 | 0.21 |
| Finance | Purchases paid before delivery (b) | -0.19 | -0.5 | 0.31 |
| | Working capital financed by internal funds (b) | -2.56 | -3.25 | 0.69 |
| | Working capital financed by private banks (b) | 4.05 | 0.61 | 3.44 |
| | Working capital financed by family/friends (b) | 0 | -0.06 | 0.06 |
| | Working capital financed by informal funds (b) | 0 | -0.02 | 0.02 |
| | Dummy for checking or saving account (b) | 2.52 | 1.91 | 0.61 |
| Innovation and competition | Dummy for process innovation (b) | 5.67 | 0.93 | 4.74 |
| | New equipment (b) | 2.66 | 1.01 | 1.65 |
| | Dummy for FDI (b) | 1.85 | 0.15 | 1.7 |
| Labor markets and skills | Staff - female workers (b) | -1.24 | -0.33 | -0.91 |
| | Dummy for training (b) | 4.04 | 0.44 | 3.6 |
| Corporate governance | Largest shareholder (b) | -2.9 | -5.84 | 2.94 |
| Other control variables | Dummy for help from BOI (b) | 0.97 | 0.4 | 0.57 |
| | Dummy for materials from rural villages with local supplier (b) | 0.07 | 0.17 | -0.1 |
| | Total contribution of IC (demeaned log-productivity) | 35.75 | 4.71 | 31.05 |
| Other stuff | Industry/region/size controls | -13.1 | -4.4 | -8.7 |
| | Constant term | 41.01 | 41.01 | 0 |
| | Residual | 36.34 | 0 | 36.34 |
| | Total contribution of other stuff | 64.25 | 36.61 | 27.64 |
| | Total | 100 | 41.32 | 58.68 |

NOTES: Results from equation (5.3). The contribution of IC to aggregate log-productivity is equal to the sum of the contributions to average log-productivity and to the allocative efficiency. Demeaned log-productivity is the part of productivity associated with the investment climate. The productivity measure used is the restricted Solow residual.
(a) Variables instrumented with the industry-region-size average.
(b) Variables approximated with a proxy (only missing values replaced by the industry-region-size average).
Source: Authors’ calculations with Pakistan ICS data.

Table 8.1: IC elasticities and semi-elasticities with respect to employment – IV Estimation (manuf. FY07)
Dependent variable: employment

| Blocks | Explanatory ICA variables | Restricted Solow residual: Coefficient | Restricted Solow residual: % Contrib | Unrestricted Solow residual: Coefficient | Unrestricted Solow residual: % Contrib |
|---|---|---|---|---|---|
| Infrastructure | Number of power outages (log)(b) | -0.050* | -11.3 | -0.050* | -11.4 |
| | Dummy for own generator (b) | 0.072 | 1.3 | 0.059 | 1.1 |
| | Products with own transport (%)(b) | 0.001 | 1.1 | 0.001** | 1.4 |
| | Days of inventory of main intermediate material (log)(b) | 0.057** | 12.4 | 0.061** | 13.3 |
| | Dummy for industrial zone | 0.123 | 3.0 | 0.127* | 3.1 |
| Economic governance | Dummy for conflicts with clients with a court involved (b) | 0.524*** | 1.0 | 0.485*** | 1.0 |
| | Dummy for security expenses (b) | 0.081 | 3.6 | 0.079 | 3.5 |
| | Crime losses (%)(b) | -0.006*** | -0.4 | -0.005*** | -0.4 |
| | Payments to obtain a contract with the government (b) | -0.0002 | -0.2 | -0.0002 | -0.1 |
| | Sales reported to taxes (%)(b) | 0.001 | 10.4 | 0.002 | 12.3 |
| | Dummy for gifts in tax inspections (b) | -0.071 | -1.3 | -0.072 | -1.4 |
| Finance | Purchases paid before delivery (%)(b) | -0.002 | -1.0 | -0.002 | -1.1 |
| | Working capital financed by internal funds (%)(b) | -0.001 | -8.6 | -0.001 | -7.3 |
| | Working capital financed by private banks (%)(b) | 0.004 | 1.2 | 0.004 | 1.2 |
| | Working capital financed by family/friends (%)(b) | -0.013* | -0.1 | -0.012* | -0.1 |
| | Working capital financed by informal funds (%)(b) | -0.02 | -0.1 | -0.018 | -0.1 |
| | Dummy for checking or saving account (b) | 0.076 | 4.3 | 0.064 | 3.6 |
| Innovation and competition | Dummy for process innovation (b) | 0.288** | 1.8 | 0.282** | 1.8 |
| | New equipment (%)(b) | 0.001** | 2.4 | 0.002** | 2.5 |
| | Dummy for FDI (b)¹ | 0.461 | 0.8 | 0.473 | 0.8 |
| Labor markets and skills | Staff - female workers (%)(b) | -0.005 | -0.7 | -0.006* | -0.8 |
| | Dummy for training (b) | 0.142 | 0.8 | 0.139 | 0.8 |
| Corporate governance | Largest shareholder (%)(b) | -0.002** | -14.9 | -0.002** | -14.5 |
| Other control variables | Dummy for help from BOI (b) | 0.152 | 0.9 | 0.115 | 0.9 |
| | Dummy for materials from rural villages with local supplier (b) | 0.261** | 0.4 | 0.252** | 0.3 |
| Instruments evaluation | First stage R-squared² | 0.34 | | 0.34 | |
| | Partial R-squared: dummy for FDI³ | 0.14 | | 0.15 | |
| | Partial R-squared F test (p-value)⁴ | 0.00 | | 0.00 | |
| | Hansen test (p-value)⁵ | 0.67 | | 0.67 | |
| | Observations | 725 | | 725 | |

NOTES: * significant at 10%; ** significant at 5%; *** significant at 1% (robust standard errors corrected for clustering by industry and region). Each regression includes a set of industry, region and size dummies and a constant term.
(a) Variables instrumented with the industry-region-size average.
(b) Variables approximated with a proxy (only missing values replaced by the industry-region-size average).
1 Dummy for FDI is endogenous and the list of variables used as excluded instruments comes from the list of explanatory variables from their corresponding equations.
2 First stage R-squared from the regression of productivity on both the included and the excluded instruments.
3 The partial R-squared measures the squared partial correlation between the excluded instruments and the dummy for FDI.
4 F-test of joint significance of the excluded instruments that corresponds to the partial R-squared.
5 The Hansen test is a test of overidentifying restrictions.
The null hypothesis is that the instruments are valid instruments, that is, uncorrelated with the error term, and therefore the excluded instruments are correctly excluded from the estimated equation.
Source: Authors’ calculations with Pakistan ICS data.

Table 8.2: IC elasticities and semi-elasticities with respect to employment – IV Estimation (manuf. FY07)
Dependent variable: employment (demand for labor)

| Blocks | Explanatory ICA variables | Restricted Solow residual: Coefficient | Restricted Solow residual: % Contrib | Unrestricted Solow residual: Coefficient | Unrestricted Solow residual: % Contrib |
|---|---|---|---|---|---|
| | Productivity¹ | 0.266** | 16.54 | 0.275** | 15.3 |
| | Real wages¹ | -0.250* | -51.41 | -0.248* | -45.65 |
| Infrastructure | Days to clear customs for exports - interaction with firms that do export (a) | -0.189* | -1.86 | -0.188* | -2.12 |
| | Electricity from a generator (b) | 0.003 | 0.94 | 0.003 | 1.22 |
| | Dummy for insufficient water supply | -0.215* | -2.05 | -0.217* | -2.07 |
| | Products with own transport (b) | 0.003*** | 1.93 | 0.003*** | 1.99 |
| | Shipment losses, exports (a) | -0.021** | -0.54 | -0.023** | -0.5 |
| | Days of inventory of main intermediate material (b) | 0.110*** | 13.56 | 0.109*** | 12.28 |
| Economic governance | Dummy for security expenses | 0.395*** | 11.79 | 0.395*** | 11.4 |
| | Crime losses (b) | -0.007** | -0.28 | -0.007** | -0.26 |
| Finance | Sales paid after delivery (b) | 0.001 | 4.38 | 0.001 | 3.91 |
| | Working capital financed by state-owned banks (b) | 0.008 | 0.34 | 0.008 | 0.35 |
| | Dummy for checking or saving account (b) | 0.177*** | 5.78 | 0.179*** | 5.43 |
| | Own of the land (b) | 0.002*** | 8.42 | 0.002*** | 7.49 |
| | Dummy for credit line | 0.192* | 1.69 | 0.188* | 1.93 |
| | Dummy for external auditory | 0.364*** | 4.2 | 0.365*** | 5.07 |
| Innovation and competition | Dummy for quality certification (b) | 0.269** | 2.29 | 0.268** | 2.84 |
| | Computer controlled machinery (b) | 0.007** | 1.3 | 0.007** | 1.75 |
| | Staff with computer (b) | 0.005** | 0.95 | 0.005** | 1.21 |
| | Dummy for e-mail | 0.219** | 3.76 | 0.219** | 4.22 |
| | Exporting experience (b) | 0.183*** | 3.03 | 0.182*** | 3.57 |
| | Dummy for more than 5 competitors (b) | 0.209** | 9.67 | 0.209** | 8.4 |
| Labor markets and skills | Staff - production workers (b) | 0.003*** | 11.45 | 0.003*** | 9.96 |
| | Staff - female workers (b) | 0.014*** | 1.6 | 0.014*** | 1.85 |
| | Staff - skilled workers (b) | 0.004*** | 14.86 | 0.004*** | 12.42 |
| | Training to non-production workers (b) | 0.006** | 0.56 | 0.006** | 0.85 |
| | Experience of the manager (b) | 0.090** | 12.03 | 0.092** | 10.75 |
| | Education of the manager (b) | 0.307*** | 6.17 | 0.308*** | 6.49 |
| Corporate governance | Dummy for incorporated company | 1.234*** | 1.63 | 1.224*** | 2.7 |
| | Dummy for limited company | 0.544*** | 4.17 | 0.543*** | 5.03 |
| Other control variables | Trade union (b) | 0.005** | 0.65 | 0.005** | 1 |
| | Capacity utilization (b) | 0.003** | 12.47 | 0.003** | 11 |
| | Dummy for help from EPB (b) | 0.241* | 0.82 | 0.241* | 0.99 |
| | Dummy for mats. from rural villages with supplier from firm's city (b) | -0.472*** | -1.04 | -0.477*** | -0.95 |
| | Dummy for mats. from rural villages with sub-contractual arrang. (b) | 0.333 | 0.18 | 0.329* | 0.17 |
| Instruments evaluation | First stage R-squared: productivity² | 0.35 | | 0.35 | |
| | Partial R-squared: productivity³ | 0.21 | | 0.2 | |
| | Partial R-squared F test (p-value): productivity⁴ | 0.00 | | 0.00 | |
| | First stage R-squared: wages² | 0.27 | | 0.27 | |
| | Partial R-squared: wages³ | 0.06 | | 0.06 | |
| | Partial R-squared F test (p-value): wages⁴ | 0.00 | | 0.00 | |
| | Hansen test (p-value)⁵ | 0.69 | | 0.69 | |
| | Observations | 690 | | 690 | |

NOTES: * significant at 10%; ** significant at 5%; *** significant at 1% (robust standard errors corrected for clustering by industry and region). Each regression includes a set of industry, region and size dummies and a constant term.
(a) Variables instrumented with the industry-region-size average.
(b) Variables approximated with a proxy (only missing values replaced by the industry-region-size average).
1 Productivity and real wages are endogenous and the list of variables used as excluded instruments comes from the list of explanatory variables from their corresponding equations.
2 First stage R-squared from the regression of productivity on both the included and the excluded instruments.
3 The partial R-squared measures the squared partial correlation between the excluded instruments and the productivity.
4 F-test of joint significance of the excluded instruments that corresponds to the partial R-squared.
5 The Hansen test is a test of overidentifying restrictions. The null hypothesis is that the instruments are valid instruments, that is, uncorrelated with the error term, and therefore the excluded instruments are correctly excluded from the estimated equation.
Source: Authors’ calculations with Pakistan ICS data.

Table 8.3: IC elasticities and semi-elasticities with respect to wages – IV Estimation (manuf. FY07)
Dependent variable: real wages

| Blocks | Explanatory ICA variables | Restricted Solow residual: Coefficient | Restricted Solow residual: % Contrib | Unrestricted Solow residual: Coefficient | Unrestricted Solow residual: % Contrib |
|---|---|---|---|---|---|
| | Productivity¹ | 0.823*** | 94.1 | 0.808*** | 91.74 |
| Infrastructures | Losses due to power outages (a) | -0.020* | -14.9 | -0.019* | -13.75 |
| | Products with own transport (b) | -0.002* | -2.9 | -0.002* | -3.2 |
| | Shipment losses, domestic (a) | -0.018** | -2.1 | -0.019*** | -2.28 |
| Economic governance | Security expenses (b) | 0.01 | 1.5 | 0.007 | 1.52 |
| | Dummy for crime losses (b) | -0.239* | -2.5 | -0.237* | -2.45 |
| Finance | Purchases paid before delivery (b) | -0.003* | -2 | -0.003* | -1.99 |
| | Working capital financed by family/friends (b) | -0.025*** | -0.3 | -0.026*** | -0.3 |
| | Working capital financed by private banks (b) | -0.011* | -5.8 | -0.010* | -5.66 |
| | Dummy for checking or saving account (b) | 0.145* | 9.7 | 0.157* | 10.56 |
| | Dummy for credit line (b) | 0.378*** | 8.7 | 0.372*** | 8.73 |
| | Dummy for external auditory (b) | 0.2 | 6.2 | 0.211* | 6.48 |
| Innovation and competition | Dummy for quality certification (b) | 0.184* | 3.8 | 0.185* | 3.95 |
| | Dummy for product innovation (b) | 0.221* | 2.8 | 0.233* | 3.01 |
| | Staff with computer (b) | 0.005* | 2.9 | 0.005** | 3.03 |
| | Dummy for FDI (b) | 0.358* | 0.7 | 0.324 | 0.61 |
| Instruments evaluation | First stage R-squared² | 0.2 | | 0.2 | |
| | Partial R-squared: productivity³ | 0.05 | | 0.05 | |
| | Partial R-squared F test (p-value)⁴ | 0 | | 0 | |
| | Hansen test (p-value)⁵ | 0.78 | | 0.78 | |
| | Observations | 724 | | 724 | |

NOTES: * significant at 10%; ** significant at 5%; *** significant at 1% (robust standard errors corrected for clustering by industry and region).
Each regression includes a set of industry, region and size dummies and a constant term.
(a) Variables instrumented with the industry-region-size average.
(b) Variables approximated with a proxy (only missing values replaced by the industry-region-size average).
1 Productivity is endogenous and the list of variables used as excluded instruments comes from the list of explanatory variables from their corresponding equations.
2 First stage R-squared from the regression of productivity on both the included and the excluded instruments.
3 The partial R-squared measures the squared partial correlation between the excluded instruments and the productivity.
4 F-test of joint significance of the excluded instruments that corresponds to the partial R-squared.
5 The Hansen test is a test of overidentifying restrictions. The null hypothesis is that the instruments are valid instruments, that is, uncorrelated with the error term, and therefore the excluded instruments are correctly excluded from the estimated equation.
Source: Authors’ calculations with Pakistan ICS data.

Table 8.4: IC linear probability coefficients with respect to the probability of exporting – IV Estimation (manufac. FY07)
Dependent variable: probability of exporting

| Blocks | Explanatory ICA variables | Restricted Solow residual: Coefficient | Restricted Solow residual: % Contrib | Unrestricted Solow residual: Coefficient | Unrestricted Solow residual: % Contrib |
|---|---|---|---|---|---|
| | Productivity¹ | 0.106*** | 46.6 | 0.108*** | 46.54 |
| Infrastructure | Days to clear customs to export (a) | -0.067 | -38 | -0.1 | -39.06 |
| | Electricity from a generator (b) | 0.002** | 2.4 | 0.002** | 2.45 |
| | Products with own transport (b) | 0.001*** | 3.9 | 0.001** | 4.07 |
| Economic governance | Security expenses (b) | 0.001 | 1 | 0.001* | 0.98 |
| | Payments to deal with bur. issues (b) | 0.001*** | 8.7 | 0.001*** | 8.81 |
| | Number of inspections (b) | -0.046** | -9.5 | -0.046** | -9.59 |
| | Dummy for tax exemption (b) | 0.174*** | 12.2 | 0.174*** | 12.32 |
| Finance | Working capital financed by internal funds (b) | 0.001* | 24.5 | 0.001* | 25.01 |
| | Working capital financed by state-owned banks (b) | 0.009*** | 1.8 | 0.009*** | 1.86 |
| | Own of the land (b) | 0.001** | 13.4 | 0.001** | 13.5 |
| Innovation and competition | Dummy for quality certification (b) | 0.157*** | 7.2 | 0.157*** | 7.22 |
| | Dummy for joint venture (b) | 0.182* | 0.6 | 0.181* | 0.62 |
| | Dummy for e-mail (b) | 0.118*** | 10.8 | 0.117*** | 11 |
| | Dummy for more than 5 competitors (b) | 0.037* | 11.8 | 0.037* | 11.69 |
| Corporate governance | Dummy for state-owned firm | -0.194* | -0.3 | -0.194* | -0.31 |
| Other control variables | Dummy for help from EPB (b) | 0.176** | 3.3 | 0.178** | 3.31 |
| | Dummy for materials from rural villages with sub-contractual arrangement (b) | -0.131* | -0.4 | -0.132* | -0.42 |
| Instruments evaluation | First stage R-squared² | 0.35 | | 0.35 | |
| | Partial R-squared: productivity³ | 0.22 | | 0.22 | |
| | Partial R-squared F test (p-value)⁴ | 0 | | 0 | |
| | Hansen test (p-value)⁵ | 0.88 | | 0.88 | |
| | Observations | 673 | | 673 | |

NOTES: * significant at 10%; ** significant at 5%; *** significant at 1% (robust standard errors corrected for clustering by industry and region). Each regression includes a set of industry, region and size dummies and a constant term.
(a) Variables instrumented with the industry-region-size average.
(b) Variables approximated with a proxy (only missing values replaced by the industry-region-size average).
1 Productivity is endogenous and the list of variables used as excluded instruments comes from the list of explanatory variables from their corresponding equations.
2 First stage R-squared from the regression of productivity on both the included and the excluded instruments.
3 The partial R-squared measures the squared partial correlation between the excluded instruments and the productivity.
4 F-test of joint significance of the excluded instruments that corresponds to the partial R-squared.
5 The Hansen test is a test of overidentifying restrictions. The null hypothesis is that the instruments are valid instruments, that is, uncorrelated with the error term, and therefore the excluded instruments are correctly excluded from the estimated equation.
Source: Authors’ calculations with Pakistan ICS data.

Table 8.5: IC linear probability coefficients with respect to the probability of receiving FDI – IV Estimation (manufacturing FY07)
Dependent variable: probability of receiving FDI

| Blocks | Explanatory ICA variables | Restricted Solow residual: Coefficient | Restricted Solow residual: % Contrib | Unrestricted Solow residual: Coefficient | Unrestricted Solow residual: % Contrib |
|---|---|---|---|---|---|
| | Productivity¹ | 0.029* | 174.6 | 0.030* | 796.4 |
| Infrastructure | Days to clear customs to export - interaction with firms that do export (a) | -0.018** | -24.8 | -0.018** | -63.7 |
| | Products with own transport (b) | -0.0003** | -28.7 | -0.0003** | -90.4 |
| | Number of water outages (b) | -0.008* | -19 | -0.008* | -73.2 |
| | Security expenses (a) | 0.007** | 13.5 | 0.007** | 57.6 |
| Economic governance | Crime losses (b) | -0.001* | 85.4 | -0.001* | 386.8 |
| | Payments to deal with bur. issues (b) | -0.000** | -2.8 | -0.000** | -12.6 |
| | Number of labor inspections (b) | -0.011** | -60.8 | -0.011** | -303 |
| Finance | Purchases paid before delivery (b) | -0.000** | -35.6 | -0.000** | -136 |
| | Working capital financed by private banks (b) | 0.001* | -17.8 | 0.001 | -74.5 |
| Innovation and competition | Dummy for quality certification (b) | 0.032* | 19.7 | 0.032* | 46.8 |
| | Dummy for foreign technology (b) | 0.148*** | 36.1 | 0.149*** | 72.1 |
| | Dummy for outsourcing (b) | -0.206** | 45 | -0.206** | 50.6 |
| | Staff with computer (b) | 0.001* | -32.5 | 0.001* | -52.6 |
| Labor markets and skills | Training to non-production workers (a) | 0.003* | 34.6 | 0.003 | 60.3 |
| Corporate governance | Dummy for incorporated company | 0.173** | 44.8 | 0.172** | 163.3 |
| Other control variables | Trade union (b) | -0.001* | 39.3 | -0.001 | 17.3 |
| | Dummy for materials from rural villages with sub-contractual arrangement (b) | 0.293* | -10.8 | 0.293 | -9.8 |
| Instruments evaluation | First stage R-squared² | 0.35 | | 0.35 | |
| | Partial R-squared: productivity³ | 0.29 | | 0.29 | |
| | Partial R-squared F test (p-value)⁴ | 0 | | 0 | |
| | Hansen test (p-value)⁵ | 0.28 | | 0.28 | |
| | Observations | 725 | | 725 | |

NOTES: * significant at 10%; ** significant at 5%; *** significant at 1% (robust standard errors corrected for clustering by industry and region). Each regression includes a set of industry, region and size dummies and a constant term.
(a) Variables instrumented with the industry-region-size average.
(b) Variables approximated with a proxy (only missing values replaced by the industry-region-size average).
1 Productivity is endogenous and the list of variables used as excluded instruments comes from the list of explanatory variables from their corresponding equations.
2 First stage R-squared from the regression of productivity on both the included and the excluded instruments.
3 The partial R-squared measures the squared partial correlation between the excluded instruments and the productivity.
4 F-test of joint significance of the excluded instruments that corresponds to the partial R-squared.
5 The Hansen test is a test of overidentifying restrictions.
The null hypothesis is that the instruments are valid instruments, that is, uncorrelated with the error term, and therefore the excluded instruments are correctly excluded from the estimated equation.
Source: Authors’ calculations with Pakistan ICS data.

Table 9: IC elasticities and semi-elasticities with respect to labor productivity (services FY07)
Dependent variable: log of labor productivity [log(sales/employment)]

Input prices: Log of real wages 0.400*** 0.407*** 0.393*** 0.389***; Log of rental cost of capital 0.06 0.055

| Blocks | Explanatory ICA variables | Restricted: panel data FY06-FY07 | Restricted: panel data FY06-FY07 | Restricted: cross-sectional data FY07 | Restricted: cross-sectional data FY07 | Unr. by industry | Unr. by region | Unr. by ind. and reg. |
|---|---|---|---|---|---|---|---|---|
| Infrastructures | Number of power outages (log)(b) | -0.806** | -0.799** | -0.839** | -0.821** | -0.796* | -0.792* | -0.709 |
| | Dummy for own generator (b) | 0.359* | 0.352* | 0.33 | 0.342* | 0.356* | 0.302* | 0.3 |
| Red tape, corruption and crime | Dummy for bribes from public officials (b) | -0.268 | -0.288* | -0.35* | -0.375* | -0.297 | -0.248 | -0.284 |
| | Manager's time spent in bur. issues (%)(b) | -0.046** | -0.048** | -0.038* | -0.041* | -0.045** | -0.042** | -0.042** |
| | Number of inspections (log)(b) | -0.237* | -0.23 | -0.216 | -0.203 | -0.221 | -0.233* | -0.221 |
| | Sales reported to taxes (%)(b) | -0.002 | -0.002 | -0.001 | -0.001 | -0.001 | -0.002 | -0.001 |
| | Dummy for informal competition (b) | 0.459** | 0.486** | 0.380* | 0.410** | 0.428** | 0.423** | 0.382** |
| Finance and corporate governance | Working capital financed by trade credit (%)(b) | -0.027*** | -0.026** | -0.026** | -0.025** | -0.028*** | -0.027*** | -0.029*** |
| | Working capital financed by family/friends (%)(b) | 0.008 | 0.011* | 0.004 | 0.009 | 0.006 | 0.001 | -0.001 |
| | Owner of the lands (%)(b) | -0.003 | -0.003 | -0.002 | -0.002 | -0.003 | -0.003 | -0.003 |
| | Dummy for loan (b) | 0.743 | 0.806 | 0.804 | 0.821* | 0.831 | 0.988* | 1.096* |
| | Value of the collateral (%)(b) | -0.004** | -0.005** | -0.004** | -0.004** | -0.004 | -0.006** | -0.007** |
| Innovation and competition | Dummy for quality certification (b) | 0.412* | 0.445 | 0.419 | 0.458* | 0.415 | 0.521 | 0.537 |
| | Staff with computer (%)(b) | 0.008 | 0.008 | 0.008* | 0.008 | 0.008* | 0.006 | 0.006 |
| | Exporting experience (log)(b) | 0.357 | 0.381* | 0.331 | 0.376* | 0.315 | 0.422** | 0.385** |
| Other control variables | Area of the local in square feet (log)(b) | 0.136* | 0.152* | 0.123* | 0.137** | 0.127* | 0.133* | 0.124* |
| | Observations | 277 | 277 | 143 | 143 | 277 | 277 | 277 |
| | R-squared | 0.63 | 0.62 | 0.64 | 0.64 | 0.63 | 0.65 | 0.65 |

NOTES: Estimation of equation (6.1). Restricted: equal input prices for all the establishments in the country. Panel data estimation uses data for FY06 and FY07. Cross-sectional estimation uses data for FY07. Unrestricted: equal input prices for all the establishments in the same sector, region or region/sector.
* significant at 10%; ** significant at 5%; *** significant at 1% (robust standard errors corrected for clustering by industry and region). Each regression includes a set of industry, size and region dummies and a constant term (also a set of time dummies in the panel data estimation).
(a) Variables instrumented with the industry-region-size average.
(b) Variables approximated with a proxy (only missing values replaced by the industry-region-size average).
Source: Authors’ calculations with Pakistan ICS data.

Table 10: IC effects on the probability of productivity increase between FY02 and FY07
Dependent variable: probability of productivity increase

| Blocks | Explanatory ICA variables | Linear Probability Model | Logit Model: Coeffs. | Logit Model: Odds ratios | Probit Model: Coeffs. | Probit Model: Marginal effects |
|---|---|---|---|---|---|---|
| Infrastructure | Days to clear customs for imports - interaction with firms that do import (log)(a) | -0.043 | -0.23 | 0.795 | -0.13* | -0.049* |
| | Total number of power outages (log)(b) | -0.044* | -0.282** | 0.754** | -0.170** | -0.064** |
| | Low quality supplies (%)(a) | -0.009* | -0.043** | 0.958** | -0.026** | -0.010** |
| Economic governance | Security expenses (%)(b) | 0.018 | 0.091* | 1.095* | 0.055 | 0.021 |
| | Crime losses (%)(b) | -0.024*** | -0.144** | 0.866** | -0.091*** | -0.034*** |
| | Manager's time spent in bur. issues (%) | -0.003* | -0.014 | 0.986 | -0.008 | -0.003 |
| | Total number of inspections (log)(a) | -0.118* | -0.516 | 0.597 | -0.335* | -0.126* |
| | Payments to deal with bur. issues (%)(b) | 0.011** | 0.119** | 1.127** | 0.073** | 0.027** |
| | Total number of labor inspections (log)(b) | 0.059*** | 0.305*** | 1.356*** | 0.181*** | 0.068*** |
| | Illegal payments in protection (%)(b) | -0.044* | -0.246* | 0.782* | -0.147* | -0.055* |
| Finance | Dummy for credit line | 0.075 | 0.407 | 1.503 | 0.263* | 0.096* |
| | Dummy for loan | 0.165** | 1.167 | 3.213 | 0.656* | 0.225* |
| | Dummy for loan with collateral | -0.263*** | -1.630* | 0.196* | -0.934** | -0.359** |
| | Dummy for trade association | 0.072 | 0.4 | 1.492 | 0.249* | 0.094* |
| Labor markets and skills | Staff - female workers (%)(b) | -0.009*** | -0.042** | 0.959** | -0.027*** | -0.010*** |
| | Staff - university education (%)(b) | 0.194 | 1.312 | 3.715 | 0.787 | 0.239 |
| | Training to non-production workers (%)(a) | 0.149* | 0.715 | 2.045 | 0.392 | 0.148 |
| Corporate governance | Dummy for state-owned firm | 0.304** | 1.444* | 4.239* | 0.894* | 0.261* |
| | R-squared/Pseudo R-squared | 0.17 | 0.14 | 0.14 | 0.14 | 0.14 |
| | Observations | 352 | 352 | 352 | 352 | 352 |

NOTES: Estimation of equation (7.1).
* significant at 10%; ** significant at 5%; *** significant at 1% (robust standard errors corrected for clustering by industry and region). Each regression includes a set of industry, size and region dummies and a constant term.
(a) Variables instrumented with the industry-region-size average.
(b) Variables approximated with a proxy (only missing values replaced by the industry-region-size average).
Source: Staff calculations with Pakistan ICS data.
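
The odds ratios reported in Table 10 can be cross-checked against the logit coefficients, because an odds ratio is the exponential of the corresponding logit coefficient. The sketch below is a minimal illustration in Python, not the authors' code: the names `TABLE_10_LOGIT`, `odds_ratio`, `pct_change_in_odds` and `check_rows` are hypothetical, only a handful of rows are transcribed, and the tolerance of 0.002 is an assumption that allows for the three-decimal rounding of the published coefficients.

```python
import math

# (logit coefficient, reported odds ratio) pairs transcribed from Table 10.
# Only a subset of rows is included, for illustration.
TABLE_10_LOGIT = {
    "Total number of power outages (log)": (-0.282, 0.754),
    "Low quality supplies (%)": (-0.043, 0.958),
    "Crime losses (%)": (-0.144, 0.866),
    "Total number of labor inspections (log)": (0.305, 1.356),
    "Dummy for loan with collateral": (-1.630, 0.196),
}


def odds_ratio(beta: float) -> float:
    """Odds ratio implied by a logit coefficient: exp(beta)."""
    return math.exp(beta)


def pct_change_in_odds(beta: float) -> float:
    """Percentage change in the odds for a one-unit increase in the regressor."""
    return 100.0 * (math.exp(beta) - 1.0)


def check_rows(rows: dict, tol: float = 0.002) -> list:
    """Return labels whose reported odds ratio differs from exp(beta) by more than tol.

    tol is an assumed allowance for rounding of the published coefficients.
    """
    return [
        label
        for label, (beta, reported) in rows.items()
        if abs(odds_ratio(beta) - reported) > tol
    ]


if __name__ == "__main__":
    for label, (beta, reported) in TABLE_10_LOGIT.items():
        print(f"{label}: exp({beta}) = {odds_ratio(beta):.3f}, reported {reported}")
    print("Rows outside tolerance:", check_rows(TABLE_10_LOGIT))
```

An odds ratio below one (for example, power outages) means the odds of a productivity increase fall as the variable rises, and `pct_change_in_odds` expresses the same coefficient as a percentage change in those odds. The probit marginal effects in the last column cannot be recovered this way, since they depend on the values at which the regressors are evaluated.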