The devil is in the shadow. Do institutions affect income and productivity or only official income and official productivity?

This paper assesses the relationship between institutions, output, and productivity when official output is corrected for the size of the shadow economy. Our results confirm the usual positive impact of institutional quality on official output and total factor productivity, and its negative impact on the size of the underground economy. However, once output is corrected for the shadow economy, the relationship between institutions and output becomes weaker. The impact of institutions on total (“corrected”) factor productivity becomes insignificant. Differences in corrected output must then be attributed to differences in factor endowments. These results survive several tests for robustness.


Introduction
The consensus that institutions are a key determinant of economic development has led international organizations to devote a great deal of attention and resources to improving the institutional frameworks of developing countries. Various conventions have accordingly been set up, such as the 1998 UN resolution or the OECD's (1999) "Convention on combating bribery." The political consensus is indeed backed by a parallel consensus based on results from a decade of empirical research. Spurred by the seminal papers of Mauro (1995) or Knack and Keefer (1995), this line of research has repeatedly concluded that ailing institutions are associated with lower GDP per capita growth. Later studies, such as Hall and Jones (1999) and Acemoglu et al. (2001), extended this finding to the level of per capita income. Furthermore, Hall and Jones (1999) and Olson et al. (2000) observed that the bulk of the relationship between institutions and income runs through the impact of institutions on total factor productivity.
Although these consonant observations have drawn a consistent picture of the relationship between institutions and development, they all share a common drawback that may turn out to be crucial in the context of developing economies in particular: they use official output figures. The problem here is that most official output figures neglect a sizeable share of economic activity which takes place in the informal sector and, therefore, remains unrecorded in official statistics, namely the shadow economy. According to Schneider (2005a, 2005b, 2007), the shadow economy amounted to 39 % of economic activity in developing countries, on average, and up to 40 % in transition countries in 2002-2003. These figures consequently call for caution in interpreting empirical results emphasizing the negative impact of defective institutions on income. They emphasize that official income decreases when, for instance, corruption increases or the rule of law deteriorates, but do not guarantee that the same holds true for total income, defined as the sum of official and unofficial income. Previous research moreover suggests that the shadow economy flourishes in countries laden with defective institutions, thus acting as a substitute for the official economy. In Johnson et al.'s (1997) model, for instance, corruption increases the size of the shadow economy because it can be seen as a form of taxation and regulation, which drives entrepreneurs underground. Hindriks et al. (1999), on the other hand, argue that the shadow economy is a complement to corruption. In their view, taxpayers can collude with corrupt inspectors so that the latter underreport the tax liability of the taxpayer in exchange for a bribe. According to the empirical results in Dreher and Schneider (2010), better institutions reduce the size of the shadow economy. Similarly, Johnson et al. (1998a) observed that a one point increase in Transparency International's corruption index would imply a 5.1 point decrease in the share of the shadow economy. 1
To summarize, good institutions seem to increase official output, while at the same time reducing unofficial output. One may accordingly contend that the observed correlation between institutions and income may be less substantial than it first seems and result only from a drop in recorded output. In other words, production might not disappear, but only go underground, which is a special case of Hirschman's (1970) exit option, as Schneider and Enste (2000) argue. Even if this substitution from official to shadow production were imperfect, the negative impact of bad institutions on overall income would be dampened. This intuition is consistent with Johnson et al. (1998b), who report that the relationship between corruption and growth becomes insignificant once the shadow economy is added as an explanatory variable. At any rate, a systematic investigation of the relationship between institutions and total income, as opposed to official income, is warranted. This is precisely the aim of this paper.
To foreshadow our main results, we confirm the positive impact of institutional quality on official output and total factor productivity, and its negative impact on the size of the underground economy, reported in the previous literature. However, once output is corrected for the shadow economy, the relationship between institutions and output becomes weaker. The impact of institutions on total ("corrected") factor productivity even becomes insignificant.
Our line of reasoning is based on the following steps. In the next section, we recall the theoretical impact of institutional quality on the shadow economy, output, and productivity. In Sect. 3, we correct official output figures for the shadow economy and compare the distribution of per capita income using both raw and corrected figures. We then probe deeper into the impact of the shadow economy on output by performing a development accounting analysis, following Hall and Jones (1999) and Caselli (2005). Here, differences in income are broken down into differences in factor endowments and total factor productivity. Section 4 uses this decomposition to determine the channels through which institutions affect per capita income. We thus check whether institutions are still significantly correlated with per capita output and total factor productivity once we correct for the size of the shadow economy. The final section concludes.

The impact of institutions on the shadow economy, output, and productivity: theoretical considerations
In this section, we briefly provide our theory on the impact of institutions on the size of the shadow economy, output, and productivity.

The impact of institutions on the shadow economy
As argued in the Introduction, the size of the shadow economy should be sensitive to the quality of institutions. The shadow economy is part of the general institutional context, as Tanaka (2010) points out. Consequently, various dimensions of the institutional framework affect how taxes and regulations are implemented, thereby affecting the costs and benefits of being formal or informal (e.g., Teobaldelli 2011). Their role may even be more important than the actual burden of taxes and regulations. The first dimension of the institutional framework that affects the incentive to be formal or informal is corruption. Johnson et al. (1997) remark that corruption works as an additional form of taxation and regulation, therefore increasing the costs of being formal. Hindriks et al. (1999) argue further that corruption and the shadow economy are complementary in nature, because corruption makes it possible for taxpayers to bribe tax inspectors so that the latter underreport the former's tax liability.
Second, Johnson et al. (1998a) emphasize that arbitrary implementation of legal rules is an additional burden on official activity, providing the incentive to move to, or remain in, the shadow economy. By the same token, Chong and Gradstein (2007) argue that weak formal property rights reduce the benefit of operating in the formal sector, thereby increasing the size of the shadow economy.
Finally, Dabla-Norris et al. (2008) argue that better institutions should result in a higher probability of detection of firms operating in the shadow economy. Better institutions should therefore increase the incentive to stay in the formal sector, and decrease the size of the shadow economy.

The impact of institutions on output, factor accumulation, and TFP
The notion that institutions are a fundamental determinant of economic outcomes can at the very least be traced back to the work summarized by North (1994) and Acemoglu et al. (2005). Political and social institutions shape economic institutions, which in turn, shape economic incentives. More precisely, institutions that provide more secure property rights will provide individuals with stronger incentives to accumulate factors of production, innovate, and participate in economic activity in general.
Accordingly, the quality of institutions may affect the accumulation of both physical and human capital. Reducing investment when the institutional environment provides lower expected returns is a rational behavior that has been repeatedly observed using many dimensions of the institutional framework, surveyed by Brunetti and Weder (1998). The accumulation of human capital may be affected in a similar way. Moreover, one may allude to the fact that education is often provided by the state and necessitates public infrastructure. If the institutional framework leads to the diversion of resources from education to less productive uses, then its effect will be reinforced. Unsurprisingly, institutional quality has been found to be correlated with the stock of human capital, for instance, by Hall and Jones (1999).
Besides factor accumulation, institutions may also affect productivity for three main reasons: predation, diversion of productive resources, and the quality of accumulated factors of production. First, predation acts as a tax on productive activities and reduces the returns accruing to those responsible for them. This provides an incentive to use productive resources less intensively, thereby reducing total productivity. Second, the diversion of resources away from productive activities is a corollary to the risk of predation. A weak rule of law, resulting in widespread theft, for example, prompts agents to divert productive resources elsewhere in order to protect their property. Similarly, an ill-designed regulatory framework may encourage rent-seeking, i.e., taking advantage of loopholes in the protection of property rights. Finally, weak institutions may affect the quality of accumulated factors of production. Henisz (2000), for instance, argues that the risk of expropriation encourages the accumulation of generic, as opposed to specific capital, because the former can be reallocated with more ease. As specific capital is bound to be more efficient in performing the task for which it is designed, this should affect productivity. This list of negative effects of weak institutions is nonexhaustive. It is detailed and complemented in Hall and Jones (1999), Méon and Weill (2005), and Méon et al. (2009).

Development accounting with the shadow economy
In this section, we estimate productivity levels across countries with and without the shadow economy, and compare the results. To do so, we first present how we corrected the output figures. We then introduce the development accounting method and the data on which it was applied, and report the results.

Correcting output figures
The prerequisite to correcting output figures is to measure the shadow economy. Data for the shadow economy are taken from Schneider (2005a, 2005b), who calculates the size and development of the shadow economy of 145 nations, including developing, transition, and highly developed OECD countries, over the 1999 to 2003 period. 2 Schneider (2005b) estimates the relative size of the shadow economy with the help of structural equation modeling (DYMIMIC: dynamic multiple indicators, multiple causes), employing variables such as direct and indirect taxation, customs duties, government regulations, the rate of unemployment, growth rate of real GDP, and currency circulation. While the DYMIMIC approach produces estimated relative sizes of the shadow economy, another step is necessary to gain absolute values. In order to calibrate absolute figures of the size of the shadow economies from the relative DYMIMIC estimation results, Schneider uses previous estimates for a number of countries.
According to these data, the average size of the shadow economy as a percentage of official GDP in 37 African states is 41 %. In Central and South America, this value is also 41 %, while in Asia, the average value is much lower (29 %). Regarding transition countries, the value is 38 %, and it is 17 % for the OECD. Looking at the unweighted average of the 145 countries in the sample, the relative size of the shadow economy is 34 % in 1999-2000.
We added the shadow economy output to the official output figures, thereby obtaining what we refer to as total output. 3 The data on official output stem from the Penn World Tables, version 6.2. 4 While Heston et al. (2006) do not introduce any corrections to the official data in order to correct for the size of the shadow economy, some countries do adjust their official data. This may result in double counting when correcting output figures.
We carefully checked individual countries' practices to determine which ones officially correct their GDP figures. Our search revealed that 39 of the 133 countries included in our sample perform such an adjustment. 5 To the best of our knowledge, the country that performs the most careful correction of its GDP figures is Italy. The Italian statistical office (ISTAT) calculates the size and development of the shadow economy, thanks mainly to the discrepancy method, and uses these estimates to compute official GDP figures. 6 In other countries, statistical offices also mostly use the discrepancy method, as well as microdata on the size and development of the shadow economy in certain sectors on occasion (e.g., the services sector (hotels, restaurants) or the construction sector). A substantial problem with these corrections is that they are carried out only for a few specific service sectors, and that they are merely mentioned in a footnote, if at all. There is no detailed documentation of what has been corrected, whether the total size of the shadow economy actually has been taken into account, or more likely, whether shadow economy activities have been corrected for in a few sectors only. Given that we cannot be sure what has been corrected for, and whether the bulk of shadow economy activities are really captured, we decided to ignore these corrections for the majority of our study. Accordingly, we correct the official GDP in all countries by adding the size of the shadow economy estimated according to the sources we use in this study. Admittedly, this may involve some double counting and overestimation of total GDP in countries that already correct official GDP figures for some shadow activities. 7
Table 1 compares official (y) and total (y_tot) output per worker measured in PPP dollars. We focus on the year 2000, because it maximizes the number of observations in our sample. We report data on two samples: First, we describe the largest sample for which we could find data on output per worker and the shadow economy, which features 136 countries. Second, we use a restricted sample for which we have not only data on output per worker and the shadow economy, but also on human and physical capital stocks.
As Table 1 shows, adding the output produced in the shadow economy to official output increases both the mean and standard deviation of output. This is not surprising, because the shadow economy cannot be negative. However, Table 1 also reports the ratio of maximum to minimum output. The ratio decreases with the inclusion of the shadow economy to the output figures in both samples, which suggests that the distribution of outputs is more concentrated. 8 This is due to the fact that the shadow economy tends to be larger in poorer countries. To be more specific, the coefficient of correlation between the share of the shadow economy and official output per worker is −0.67 in the larger sample and −0.71 in the restricted sample. Official figures therefore tend to overestimate the differences in output across countries. The observed differences in outputs are therefore reduced when the shadow economy is taken into account. In the next section, we investigate the impact of this correction on the development accounting results.
Note to Table 1: y denotes official output per worker and y_tot total output per worker, defined as official output plus the shadow economy.
2 Note that our results do not depend on the choice of a particular set of estimates for the shadow economy. As alternative estimates for the size of the shadow economy, we take two indicators from Friedman et al. (2000), who collected data on the unofficial economy for 69 countries from various sources. We use three sets of estimates taken from Schneider and Enste (2000): First, we use average estimates for the years 1990 to 1993 employing the physical input (electricity) method. Second, we use their MIMIC estimates over the same period of time, while our third indicator complements the first with data for the years 1989 and 1990, using the same method (taken from Johnson et al. 1997). As we report in Tables A7 and A8 in the working paper version of this article (Dreher et al. 2012), our results are robust to this choice.
3 Note that we cannot correct input figures for the size of the shadow economy. Regarding labor, we proxy the labor force with the working age population. As shown by Caselli (2005), taking into account differences in hours worked is unlikely to affect the outcome of development accounting analyses. Not adjusting labor inputs should therefore prove of little importance for our results. Regarding capital, it has been stressed that small scale activities dominate this sector (ILO 1972), at least in developing economies, and that it therefore generates low levels of income and little accumulation (Gërxhani 2004). As Tanzi (1999) remarks, individuals who operate in the shadow economy often use capital or tools that are borrowed from the official economy. Hillman et al. (1995) back this claim with anecdotal evidence from Bulgaria, reporting that formal state enterprises have been rented out to informal entrepreneurs. This implies that there may be little capital specific to the shadow economy. To the extent that some unregistered capital operates in the shadow economy, our results might overestimate TFP. See Sect. 5 in Dreher et al. (2012) for details. Moreover, countries with poor institutions are also those where public investment is the least efficient, as Pritchett (2000) argues, and experience more frequent destruction due to natural disasters, as Caselli and Malhotra (2004) emphasize. The perpetual inventory method, which does not control for the efficiency of investment and assumes a constant rate of depreciation across countries, thus overestimates badly governed countries' capital stocks, and underestimates their TFPs. Therefore, using more accurate measures of the capital stock would further weaken the link between institutions and TFP, and strengthen our findings.
4 To obtain meaningful international comparisons, we used PPP-converted GDP per capita figures for official GDP. It is referred to as rgdpwok in PWT6.2. In the sample of countries for which we could obtain data on both GDP and the shadow economy, the mean of the uncorrected official output was $18,941 PPP per capita, and the mean of total corrected output was $23,640 PPP per capita. Table 1 below provides more detailed descriptive statistics on official and corrected outputs. Appendix A reports descriptive statistics for all variables, while Appendix B shows the definitions and sources.
5 Listed in Appendix F of Dreher et al. (2012).
6 The discrepancy method compares Gross National Product (GNP) based on expenditures with GNP based on income. According to the estimates, the size of the Italian shadow economy amounts to 15-17 % of GDP in the year 2000 (Castellucci 2007), while the estimates we use here range between 20 % and 27 %.
7 We replicated the analysis by not adjusting official GDP at all in these countries. As we show in Tables A9 and A10 of Dreher et al. (2012), our results are strengthened by this.
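The correction described in this subsection can be illustrated with a minimal sketch. It assumes, as in the Schneider data, that the shadow economy is expressed as a share of official GDP, so that total output equals official output times one plus that share. The function names and the three country figures are hypothetical and are not taken from the paper's data.

```python
def total_output(official, shadow_share):
    """Official output scaled up by the shadow economy (share of official GDP)."""
    return official * (1.0 + shadow_share)


def max_min_ratio(values):
    """Ratio of the largest to the smallest value, a crude dispersion measure."""
    return max(values) / min(values)


# Hypothetical figures: richer countries are given smaller shadow shares,
# mirroring the negative correlation reported in the text.
official = [40000.0, 10000.0, 2000.0]
shadow = [0.10, 0.30, 0.45]
total = [total_output(y, s) for y, s in zip(official, shadow)]

mean_official = sum(official) / len(official)
mean_total = sum(total) / len(total)
ratio_official = max_min_ratio(official)
ratio_total = max_min_ratio(total)
```

In this toy example the mean rises while the max-to-min ratio falls, which is the pattern the text attributes to Table 1: a larger shadow share in poorer countries compresses the distribution of corrected output.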

The development accounting method
Development accounting estimates countries' outputs as a function of their factor endowments and compares the estimated figures with actual output figures. The difference between the two gives total factor productivity (TFP), or the Solow residual, depending on the reader's optimism. To do so, the standard method in the literature is the calibration approach surveyed by Caselli (2005), and used by King and Levine (1994), Klenow and Rodriguez-Clare (1997), Prescott (1998) and Hall and Jones (1999), among others. Following the standard specification used in the vast majority of development accounting studies, we assume a Cobb-Douglas specification. As Aiyar and Dalgaard (2009) show, it is a good approximation for cross-country development accounting. In per worker terms, the production function reads:
$$y = A k^{\alpha} h^{1-\alpha} \qquad (1)$$
where y is the country's output per worker, k its physical capital stock per worker, and h the average human capital stock per worker. A is total factor productivity, and α a parameter that measures the contribution of capital to output. Total factor productivity is then estimated by solving the above equation for A. This gives:
$$A = \frac{y}{k^{\alpha} h^{1-\alpha}}$$
One may remark that if output y is underestimated, then A will be as well. As we have just shown in the previous section, the share of total economic activity that takes place in the shadow economy is systematically overlooked by most official statistics. TFP computed from official figures may thus be biased.
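A short sketch may make the size of this bias concrete. Because inputs are left unchanged, scaling output by one plus the shadow share scales the residual A by exactly the same factor. The function name, the capital share of 0.57, and all country figures below are illustrative assumptions rather than values from the paper's dataset.

```python
def tfp(y, k, h, alpha):
    """Solow residual A = y / (k**alpha * h**(1 - alpha)), in per worker terms."""
    return y / (k ** alpha * h ** (1.0 - alpha))


# Hypothetical country (illustrative values only)
alpha = 0.57
y_official, k, h = 10000.0, 30000.0, 2.0
shadow_share = 0.40  # shadow economy as a share of official output

a_official = tfp(y_official, k, h, alpha)
a_total = tfp(y_official * (1.0 + shadow_share), k, h, alpha)
bias_factor = a_total / a_official  # equals 1 + shadow_share
```

Under these assumptions, a country with a shadow economy worth 40 % of official output has its TFP understated by the same proportion when official figures are used.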
To compute A, one needs a value for α and data on y, k, and h. It is commonly assumed that a reasonable estimate for α is around 0.3, such as in Caselli (2005), Hall and Jones (1999), Klenow and Rodriguez-Clare (1997), Prescott (1998), or Collins et al. (1996). 9 However, although this parameter's value is critical in development accounting exercises, as Caselli (2005) shows, the specific value admittedly is arbitrary. Although it is true that it corresponds to the US long run average, it may be quite different for other countries. Indeed, the estimates of α that are reported in the literature vary widely. Thus, Cavalcanti Ferreira et al. (2004) report estimates of α that are approximately equal to 0.43. Moreover, estimates of α obtained when the production function is estimated based on efficiency frontier techniques frequently reach 0.8, as in Kneller and Stevens (2003). Abu-Qarn and Abu-Bader (2007) assess α in MENA countries and conclude that the share of capital often exceeds 0.6 there. They even report estimates of more than 0.9 for the region's α. When studying OECD countries, Abu-Qarn and Abu-Bader (2009) also reject the 0.3 estimate and find that α is robustly greater than 0.5. However, the most systematic attempt at assessing the share of capital in a large sample of countries is Senhadji's (2000), providing estimates for a sample of 88 developed and developing countries. He also rejects the estimate of 0.3, observing large cross-country variations for this parameter. The world mean and the world median are found to be 0.55 and 0.57, respectively.
Therefore, as 0.3 seems to be a very small value for α, and any exogenously imposed value is arbitrary, we endogenized the magnitude of this parameter, following various methods. Specifically, we first estimated the coefficient of the Cobb-Douglas production function given by (1) on our cross-country sample. This allowed us to use both official and total output. We then used the panel dimension of our data over the 1980-2000 period. 10 Here, we first computed both pooled and between regressions and then ran a panel regression with fixed country effects, as well as subsequently adding fixed time effects. We also ran two random effects regressions, one including country effects only, and another including random time effects as well.
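As a rough illustration of the cross-sectional step, one possible specification takes logs of (1) and imposes constant returns, which gives log(y/h) = log A + α log(k/h), so that α is the OLS slope. This is an assumed specification for the sketch only: the text does not spell out the exact regression, and the function name and the synthetic data below are hypothetical.

```python
import math
import random


def estimate_alpha(y, k, h):
    """OLS slope of log(y/h) on log(k/h), imposing constant returns to scale."""
    x = [math.log(ki / hi) for ki, hi in zip(k, h)]
    z = [math.log(yi / hi) for yi, hi in zip(y, h)]
    x_bar = sum(x) / len(x)
    z_bar = sum(z) / len(z)
    cov = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    var = sum((xi - x_bar) ** 2 for xi in x)
    return cov / var


# Synthetic cross-section generated with a known capital share (illustrative only)
rng = random.Random(0)
true_alpha = 0.57
k = [math.exp(rng.gauss(10.0, 1.0)) for _ in range(200)]
h = [math.exp(rng.gauss(0.8, 0.2)) for _ in range(200)]
a = [math.exp(rng.gauss(0.0, 0.05)) for _ in range(200)]
y = [ai * ki ** true_alpha * hi ** (1.0 - true_alpha)
     for ai, ki, hi in zip(a, k, h)]

alpha_hat = estimate_alpha(y, k, h)
```

The synthetic productivity term is drawn independently of capital, so the slope recovers the true share closely; the pooled, between, fixed effects, and random effects variants mentioned in the text are not reproduced here.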
According to our results, the estimates of α remain in a fairly narrow range of 0.5-0.6. 11 They approximately average out to 0.57, which corresponds to Senhadji's (2000) estimate of the world median. We will therefore use this value in our calculations below. Note, however, that using the mean value of 0.55 obtained by Senhadji does not change our results. Arguably, an α of 0.57 remains a conservative guess, given that Senhadji (2000) reports estimates of α for individual countries or regions that often exceed it. This value allows for an investigation of the impact of increasing the capital share in the production function, while leaving a role for differences in TFP in explaining cross-country differences in per worker output. As Caselli (2005) shows, variations in factor endowments explain the totality of cross-country differences in output per worker with values of α exceeding 0.6. 12
10 We restricted our observations to this period in order to minimize the impact of the initial capital stock.
11 See Appendix D in Dreher et al. (2012).
12 Because we estimate the value of α instead of postulating it, our strategy goes beyond the standard development accounting literature. We tested the robustness of our results by setting α to 0.3 and 0.6. When we use a value of 0.6 the results match those reported below. When employing a value of 0.3, the results are weaker, due to the reduced role of capital in the production function. As we show below, the impact of institutions on output mainly works through its impact on capital, so that using a lower α increases not only the Solow residual, but also its correlation with institutions. See Table A5 in Dreher et al. (2012). One may also want to use a specific value of α for each country or each group of countries. However, assuming different αs in a cross-country development accounting exercise would be at odds with the objective of that literature, which is precisely to see how much of the cross-country variation in output a common production function can explain. This is why the literature systematically assumes a constant α across countries. We adopt this assumption for the same reason. Most importantly, deviating from this assumption would prevent any meaningful comparison between our results and previous studies explaining differences in output employing a single production function. However, while that would go beyond the scope of the present paper, we acknowledge that considering country or region-specific αs would be interesting for future research.
The number of workers was computed from the Penn World Tables 6.2 dataset. 13 The human capital stock is usually computed as a function of years of schooling in the population. Following Hall and Jones (1999) and Caselli (2005), we accordingly define h as:
$$h = e^{\phi(s)}$$
where s is the average number of years of schooling in the population over 25 years old, taken from Barro and Lee's (2001) dataset, and $\phi(s)$ is a piecewise linear function of s. Since Hall and Jones (1999), this definition of human capital is routinely used in development accounting. Its motivation is as follows: According to our model, workers' wages should be proportional to their human capital. The relationship between wage and education is commonly assumed to be log-linear at the country level, yet the cross-country pattern of the education-wage profile seems convex. Using a piecewise linear specification allows taking stock of within and cross-country evidence. In Barro and Lee's dataset, the last year for which this statistic can be computed is 2000.
The last set of data required for our calculations is the stock of capital. Again, we followed the literature and computed it by applying the perpetual inventory method, where the capital stock of a particular year, $K_t$, is defined as the sum of previous year's investment, $I_{t-1}$, and capital stock, $K_{t-1}$, to which a depreciation rate, δ, is applied. Hence, the capital stock is given by:
$$K_t = I_{t-1} + (1-\delta) K_{t-1}$$
Again, data on real investment in PPP terms were obtained from the Penn World Tables 6.2. 14 These data are available from 1950 until 2004. However, not all countries have complete series for the entire period. We therefore restricted our analysis to countries for which the information was available from 1970 at the least.
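The two constructions above can be sketched as follows. The slopes of the piecewise linear schooling function (0.134 for the first four years, 0.101 for the next four, 0.068 beyond eight) are the values used by Hall and Jones (1999) and Caselli (2005); they are assumed here because the text does not restate them. The function names, the 6 % depreciation default, and the investment series are likewise illustrative, and the initial capital stock is simply taken as given.

```python
import math


def phi(s):
    """Piecewise linear return to s years of schooling (Hall-Jones rates, assumed)."""
    if s <= 4:
        return 0.134 * s
    if s <= 8:
        return 0.134 * 4 + 0.101 * (s - 4)
    return 0.134 * 4 + 0.101 * 4 + 0.068 * (s - 8)


def human_capital(s):
    """Human capital per worker, h = exp(phi(s))."""
    return math.exp(phi(s))


def capital_path(investment, k0, delta=0.06):
    """Perpetual inventory: K_t = I_{t-1} + (1 - delta) * K_{t-1}, starting from k0."""
    path = [k0]
    for inv in investment:
        path.append(inv + (1.0 - delta) * path[-1])
    return path


# Hypothetical series: investment that exactly offsets depreciation
# keeps the capital stock constant at its starting level.
path = capital_path([60.0] * 5, k0=1000.0)
h_ten_years = human_capital(10.0)
```

With ten years of schooling, the assumed rates give φ = 0.536 + 0.404 + 0.136 = 1.076, so h is exp(1.076) times the human capital of a worker with no schooling.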
To apply the above formula, however, we need the initial capital stock. Still following Caselli (2005) and Hall and Jones (1999), we assume that the capital stock in the initial year is equal to its steady-state value in the Solow growth model, namely $K_0 = I_0/(g + \delta)$, where δ is usually set to 0.06 in the literature, $I_0$ is the value of investment in the first year for which an observation is available, and g is the average rate of growth for the investment series between that year and 1970. 15 Finally, we used the measures of uncorrected and corrected output per worker defined in the previous subsection. Overall, we were able to obtain data on output per worker, physical capital, human capital, and the shadow economy for 76 countries in the year 2000. The key point here is to determine the extent to which the inclusion of the shadow economy in output figures affects the observed role of the residual in explaining cross-country income differences.
To measure the impact of including the shadow economy in output figures on the capacity of factor endowments to explain income differences, we compare figures on actual output per worker with output predicted by a model considering factor endowments only, i.e., $y_{KH} \equiv k^{\alpha} h^{1-\alpha}$, called the factor-only model. We then assess its relevance by computing the two measures of success defined in Caselli (2005). The first one is the ratio of the log-variance of the factor-only output to the log-variance of observed output:
$$success_1 \equiv \frac{var[\log(y_{KH})]}{var[\log(y)]}$$
The second one is the ratio of the 90th to the 10th percentile of the factor-only output to the ratio of the 90th to the 10th percentile of observed output:
$$success_2 \equiv \frac{y_{KH}^{90}/y_{KH}^{10}}{y^{90}/y^{10}}$$
13 Specifically, the number of workers was obtained by dividing total GDP by GDP per worker, that is rgdpch * pop * 1000/rgdpwok, according to notations in the PWT6.2.
14 That is rgdpl * pop * ki in PWT6.2.
15 Note that the impact of $K_0$ on capital stocks in 2000 is quite small as we use no base year subsequent to 1970. Since the annual rate of depreciation is 6 %, the maximum share of the initial capital stock still in use in 2000 cannot exceed 15 % of its initial value. Caselli (2005) moreover points out that although changing the value of this parameter changes the relative weight of past and recent investment, it has almost no effect on the outcome of development accounting.
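The two measures of success can be computed along the following lines. This is a minimal sketch: the function names are hypothetical, the percentile helper uses linear interpolation between order statistics (one convention among several), and the data are made up to provide two easy sanity checks.

```python
import math
import statistics


def factor_only(k, h, alpha):
    """Output predicted by factor endowments alone: k**alpha * h**(1 - alpha)."""
    return k ** alpha * h ** (1.0 - alpha)


def percentile(values, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def success_1(y_kh, y):
    """Ratio of the log-variance of factor-only output to that of observed output."""
    var_kh = statistics.variance([math.log(v) for v in y_kh])
    var_y = statistics.variance([math.log(v) for v in y])
    return var_kh / var_y


def success_2(y_kh, y):
    """Ratio of the 90/10 percentile ratios of factor-only and observed output."""
    spread_kh = percentile(y_kh, 90) / percentile(y_kh, 10)
    spread_y = percentile(y, 90) / percentile(y, 10)
    return spread_kh / spread_y


# Made-up data: identical TFP everywhere versus TFP that rises with endowments
y_kh = [1.0, 2.0, 4.0, 8.0, 16.0]
y_same_tfp = [3.0 * v for v in y_kh]   # factor-only model explains everything
y_rising_tfp = [v ** 2 for v in y_kh]  # half of the log dispersion is due to TFP

s1_full = success_1(y_kh, y_same_tfp)
s2_full = success_2(y_kh, y_same_tfp)
s1_partial = success_1(y_kh, y_rising_tfp)
```

When TFP is identical across countries both measures equal one; when TFP rises with endowments, the factor-only model accounts for only part of the dispersion and the measures fall below one.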

Results
We compute the two measures of success with official output figures and output figures corrected for the shadow economy, respectively. The first two columns of Table 2 display the results of our calculations. The results displayed in Table 2 are in line with the usual findings in the literature.16 Namely, it appears that the factor-only model fails to account for all of the variance in output. However, the key finding of Table 2 appears when comparing the results obtained with official figures to those obtained with corrected figures. Here, we observe that the measures of success of the factor-only model are substantially greater when corrected figures are used instead of official figures. Table 2 shows that, in our sample, the correction adds around 20 percentage points to those measures.
The last three columns of Table 2 provide more information on the impact of adding the shadow economy to official output figures. The distribution of official TFPs is described in the first line and the distribution of corrected TFPs (including the shadow economy) in the following line. These results resemble those in Table 1. Specifically, it appears that both the mean and the standard deviation of TFPs increase when the shadow economy is taken into account. Additionally, the ratio of maximum to minimum TFP decreases, implying that the distribution of TFPs becomes more concentrated.17 The rationale for this result again stems from the fact that a smaller share of output is officially reported in poorer countries, with their TFP therefore likely to be relatively more underestimated than that of richer countries. As a result, correcting for the shadow economy leads to a more concentrated distribution.
Table 3 picks some countries from the sample and displays their respective official incomes, total incomes, and productivity levels. To facilitate comparison, all values are given relative to the United States, and countries are ranked in descending order based on official GDP per worker. Table 3 also recalls summary statistics for the entire sample, and the correlation of each displayed variable with official output per worker. Table 3 confirms that productivity differences are responsible for the bulk of differences in output per worker. The same diagnosis can be made regardless of the definition of output, official or corrected, used for computations.18 On the other hand, the table shows that the rise in output due to the inclusion of the shadow economy can indeed be quite large for some countries, especially poorer ones, with the result being that total productivity can be severely underestimated in countries with a large shadow sector. For instance, Malawi's official productivity level is 55 % of that of the United States. However, when corrected figures are used, Malawi's TFP relative to the United States becomes 71 %. A similar order of magnitude can be found in a middle-income country like Brazil, where overall TFP relative to the United States is 93 %, compared to 73 % for official TFP. Its GDP thus becomes 93 % of that of the United States, instead of 73 %. Even among developed countries, the inclusion of the shadow economy can affect our perception of TFP differences, although its absolute increase remains smaller. Countries like Italy or Great Britain can thus make up all of their difference vis-à-vis the United States, as the upper half of Table 3 shows.
16 The results also remained qualitatively unchanged when we used our alternative definition of total output, not adjusting for those countries that do provide partial adjustment themselves. Both success 1 and success 2 rose when using corrected figures instead of official ones.
17 Again, the same results were obtained when using the alternative definition of total output.
18 Some TFP estimates may look surprising. For instance, China's estimated TFP is 0.51 times that of US TFP with official output, and 0.53 with total output. Those figures may look small, but they are larger than the estimates provided by Hall and Jones (1999), who estimated China's TFP to be around 10 % of the US level in 1988. TFP conversely seems high in Egypt, but that finding is in line with Abu-Qarn and Abu-Bader (2007), who report that TFP growth accounted for the bulk of the country's growth over the 1960-1998 period.
More generally, all the variables displayed in Table 3 appear to be positively correlated with official output per worker. Richer countries also tend to be more productive. More remarkable, however, is the fact that the correlation of TFP with output decreases from 0.44 to 0.14 when shadow output is added to official output. The intuition behind this finding is that the share of the shadow economy tends to increase when income decreases. Poorer countries therefore report a smaller fraction of their total output. This introduces a systematic bias that results in underestimating TFP in poor countries, which increases the correlation between output and TFP. When this statistical artifact is corrected, the correlation between output and TFP consequently becomes less clear, which is precisely what our calculations reveal.
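The mechanism described in this section can be illustrated with a small numerical sketch. It is not the paper's computation: the five countries, their factor components, TFP levels, and shadow-economy shares are made-up values; TFP is taken to be the residual of a factor-only model (log TFP = log output minus log factor component); and the first measure of success is assumed to be the variance of the log factor component divided by the variance of log output. The only feature borrowed from the text is that the shadow share falls as income rises.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

def variance(u):
    """Population variance (the choice of divisor cancels in the ratios below)."""
    mu = sum(u) / len(u)
    return sum((a - mu) ** 2 for a in u) / len(u)

# Hypothetical data, ordered from the poorest to the richest country.
log_factors = [0.0, 1.0, 2.0, 3.0, 4.0]      # log of the factor-only component
log_tfp_total = [0.1, 0.4, 0.0, 0.5, 0.3]    # log TFP based on total output
shadow_share = [0.6, 0.5, 0.4, 0.3, 0.2]     # shadow economy / official GDP

# Total output = official output * (1 + shadow share), so in logs:
log_y_total = [f + a for f, a in zip(log_factors, log_tfp_total)]
log_y_official = [y - math.log(1 + s) for y, s in zip(log_y_total, shadow_share)]

# TFP is a residual, so under-reported output lowers measured TFP one for one.
log_tfp_official = [y - f for y, f in zip(log_y_official, log_factors)]

corr_official = pearson(log_tfp_official, log_y_official)
corr_total = pearson(log_tfp_total, log_y_total)
success_official = variance(log_factors) / variance(log_y_official)
success_total = variance(log_factors) / variance(log_y_total)

print(f"corr(TFP, output): official {corr_official:.2f}, corrected {corr_total:.2f}")
print(f"variance-ratio success: official {success_official:.2f}, corrected {success_total:.2f}")
```

In this toy sample the correlation between TFP and output is higher with official figures than with corrected ones, and the factor-only model accounts for a larger share of the variance once output is corrected, which is the qualitative pattern reported for Tables 2 and 3.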
Overall, these results may cast some doubt on the usual finding that the quality of institutions is positively correlated with productivity, because the observed relationship may also be driven by unreported output. The next section investigates this possibility.

Do institutions really affect output and productivity after all?
The aim of this section is to assess the robustness of previous results, which have emphasized institutional quality as a major determinant of per capita income and TFP, to the correction of official GDP figures for the shadow economy. Accordingly, we first look at whether we can replicate previous findings on the negative impact of institutional quality on the informal sector. We then proceed by examining the impact of institutions on per capita output, and finally on TFP.
To do so, our primary measure of institutional quality is the rule of law, which has been an important focus of the literature on institutions and economic performance, e.g., Rodrik et al. (2004), or Dollar and Kraay (2003).19 It is measured by the World Bank's rule of law index (Kaufmann et al. 2006) for the year 2000, an index measuring whether, and to what extent, institutions protect property rights, and reliably enforced laws and regulations govern economic and social interactions. It is based on perceptions recorded in a large number of independent polls and surveys.

The shadow economy
To assess the impact of institutional quality on the size of the shadow economy, we chose a parsimonious model, including GDP per capita as the only additional explanatory variable. The results show that GDP per capita does not affect the share of the underground economy at conventional levels of significance.20 The result, however, confirms previous research showing that good institutions are negatively related to the shadow economy. At the 1 % level of significance, a better rule of law reduces the share of the shadow economy. Specifically, an increase in the rule of law index by one point reduces the shadow economy by 0.3 percentage points.21 This amounts to a standardized beta coefficient of almost 0.8. This result, for 133 countries, is in line with the models of Johnson et al. (1997) and Hindriks et al. (1999), as well as the results reported in Johnson et al. (1997, 1998b), which show that corruption increases the shadow economy in a cross section of 15 and 39 countries, respectively.
19 We test for the stability of our results by replacing the rule of law with the World Bank's control of corruption, voice and accountability, political stability, government effectiveness, regulatory quality, and the average of the individual indicators. As we report in Appendices A2 and A3 of this paper's working paper version (Dreher et al. 2012), the results are robust to the choice of indicator.
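For readers less familiar with standardized (beta) coefficients, the sketch below shows how one is obtained from a raw regression slope: the slope is multiplied by the ratio of the standard deviation of the regressor to that of the dependent variable. The data are hypothetical, and the regression has a single regressor, unlike the specification above, which also includes GDP per capita.

```python
import math

def sample_sd(values):
    """Sample standard deviation (divisor n - 1)."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x with a constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

def standardized_beta(slope, x, y):
    """Change in y, in standard deviations, per one-standard-deviation change in x."""
    return slope * sample_sd(x) / sample_sd(y)

# Hypothetical data: rule of law index and shadow economy in percent of official GDP.
rule_of_law = [-2.0, -1.0, 0.0, 1.0, 2.0]
shadow_share = [40.0, 35.0, 25.0, 20.0, 10.0]

slope = ols_slope(rule_of_law, shadow_share)
beta = standardized_beta(slope, rule_of_law, shadow_share)
print(f"raw slope: {slope:.2f}, standardized beta: {beta:.3f}")
```

With a single regressor the standardized coefficient coincides with the correlation between the two variables, which is why it always lies between −1 and 1 in that case.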

Output
We proceed by examining the impact of institutions on official and unofficial (logged) output per worker. Again, we follow Hall and Jones (1999) and opt for a parsimonious specification, including only the rule of law as an explanatory variable, and starting with OLS. However, institutions might well depend on GDP and could as such be endogenous. To control for a potential endogeneity bias, we instrument the rule of law index employing the variables suggested in Hall and Jones (1999) as instruments for institutional quality. Their instruments measure the extent of Western influence in a country from the sixteenth to nineteenth century, which is exogenous to GDP, but highly correlated with institutions. According to Hall and Jones, European influence is unlikely to have been stronger in regions more likely to have a higher GDP today. The first reason for this is that, for the most part, Europeans conquered resource-rich regions, which are not systematically among the countries with high output per worker today. The second reason is that European influence concentrated on sparsely settled regions. As these were frequently regions with low productivity, there should again be no tendency for these regions to be among those with high output per worker today.
Despite this, past European influence is still likely to be highly correlated with the rule of law. As Hall and Jones (1999) point out, countries most strongly influenced by Western Europe are among those most likely to adopt favorable institutional infrastructures. We employ the percentage of a country's population speaking one of the five primary Western European languages as their mother tongue. In addition, we use the absolute value of a country's latitude in degrees, which measures the distance from the equator.22
Table 4 shows the results. Columns 1 and 2 report OLS estimates. While column 1 refers to official GDP, column 2 employs corrected output figures, i.e., overall output including the shadow economy. Given the negative impact of the rule of law on the shadow economy, we would expect the impact of the rule of law on output to be smaller or vanish completely once the underground economy is included. The results show that the impact of the rule of law on total output is smaller than its impact on official output. However, improvements in the rule of law still increase output when the shadow economy is taken into account; the positive impact on official GDP apparently dominates the negative impact on the size of the shadow economy. According to the coefficients, an increase in the rule of law index by one point increases official output by 9.3 %, while increasing total output by only 8.6 %. With the rule of law index varying from −2.37 to 2.11 among the countries included in our sample, the difference between the parameters of the two models is not dramatic. While the difference is significant at the 1 % level, the standardized regression (beta) coefficients are 0.8 and 0.78, respectively. Columns 3 and 4 of Table 4 report the results of our instrumental variables approach.
20 See Table A1 in Dreher et al. (2012).
Table 4 (surviving fragment): Sargan test (p-value): 0.14, 0.11. Notes: Robust t statistics in parentheses; *** significant at 1 %. Constant term included but not reported.
As shown in the table, the overidentifying restrictions are not rejected at conventional levels of significance. The instruments are jointly significant at the 1 % level in both first-stage regressions, and the F-test statistic easily exceeds the rule-of-thumb threshold of 10 proposed by Staiger and Stock (1997), indicating that the instruments have some power.
As can be seen, the impact of the rule of law on output remains significant at the 1 % level in both specifications, with a positive coefficient. The coefficients show that an increase in the rule of law index by 0.1 increases official output by 13.1 %, and total output by 12.3 %. Again, the impact of the rule of law is thus smaller when focusing on total output as compared to official output. The difference between the parameters of the two models is significant at the 5 % level, but they are of similar magnitude. This is confirmed by the beta coefficients of 1.12 and 1.11, respectively.23
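The instrumental variables approach used in this section can be sketched as a manual two-stage least squares regression with one endogenous regressor and two instruments. This is only an illustration under stated assumptions: the six observations are invented, the variable names are placeholders, and the error term is constructed to be exactly orthogonal to the instruments in the sample so that the second stage recovers the assumed coefficient of 0.9. It does not reproduce the estimates of Table 4 and omits standard errors and the Sargan test.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def two_stage_least_squares(y, x, z1, z2):
    """Return [constant, slope] of y on x, instrumenting x with z1 and z2."""
    z_rows = [[1.0, a, b] for a, b in zip(z1, z2)]
    gamma = ols(z_rows, x)                                   # first stage
    x_hat = [sum(g * v for g, v in zip(gamma, row)) for row in z_rows]
    return ols([[1.0, xh] for xh in x_hat], y)               # second stage

# Hypothetical instruments (stand-ins for latitude and a language measure).
latitude = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
language = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
# Error term built to be orthogonal to the constant and to both instruments.
u = [0.5, 0.5, -1.0, -1.0, 0.5, 0.5]
# The regressor is correlated with u (endogenous); the assumed true slope is 0.9.
rule_of_law = [0.5 + a + 0.5 * b + e for a, b, e in zip(latitude, language, u)]
log_output = [1.0 + 0.9 * x + e for x, e in zip(rule_of_law, u)]

ols_const, ols_slope = ols([[1.0, x] for x in rule_of_law], log_output)
iv_const, iv_slope = two_stage_least_squares(log_output, rule_of_law, latitude, language)
print(f"OLS slope: {ols_slope:.3f}, 2SLS slope: {iv_slope:.3f}")
```

Because the regressor is positively correlated with the error term in this constructed sample, OLS overstates the slope, while the 2SLS estimate equals the assumed value up to rounding.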

Total factor productivity
Tables 5a and 5b focus on total factor productivity. When instrumenting institutional quality with latitude and the percentage of major European languages spoken, the Sargan test rejects the overidentifying restrictions, casting doubt on the exogeneity of the instruments. The analysis presented in the table therefore employs the share of native English speakers instead of focusing on five languages (as suggested by Hall and Jones 1999) and GDP per capita. It now appears that while the rule of law is highly correlated with GDP per capita, it is not significantly correlated with total factor productivity (0.84 and 0.3, respectively). This suggests that the impact of institutions on output mainly runs through factor endowments as opposed to productivity. While the results are not affected by the choice of instruments, now the Sargan test does not reject the overidentifying restrictions at conventional levels of significance.
23 To test for robustness, we split the sample according to income and the rule of law. The threshold value for income was set to US$6000, which is the average value among our sample of countries. The strong rule of law group includes countries with a rule of law in the highest quartile, while we define countries with a weak rule of law to contain all other countries. The results for the low-income and weak rule of law groups very much resemble those of the previous sample (see Table A4 in Dreher et al. 2012). In the group of countries with high income and a strong rule of law, there is a substantial decline in the magnitude of the impact of the rule of law on output when we take the shadow economy into account. In the 2SLS regressions the impact of the rule of law is no longer significant at conventional levels. The reasoning behind this finding is twofold. First, it may be due to the fact that the small number of countries in the sub-sample does not allow for the identification of a significant impact. Second, it may be due to the fact that the elasticity of output with respect to governance is smaller in countries with good governance.
Table notes (surviving fragment): Robust t statistics in parentheses; ** significant at 5 %; *** significant at 1 %. Constant term included but not reported.
We present two sets of results. First, Table 5a again employs a parsimonious model, including the rule of law index as the only explanatory variable. Second, in Table 5b, we additionally control for government consumption (as a percentage of GDP) and the rate of inflation. Both variables create distortions; thus it is reasonable to expect that they will affect the relationship between the rule of law and factor productivity. According to the results of both specifications, the rule of law significantly increases total factor productivity when official output is concerned. However, turning to total output, this result no longer holds. According to the OLS and instrumental variables estimates, the impact of the rule of law on total factor productivity no longer exists once we control for the size of the shadow economy. This result has important implications for empirical research on the impact of institutional quality on productivity. While good institutions increase official output, they simultaneously decrease the size of the underground economy. As a consequence, total factor productivity does not seem to be affected by the quality of institutions. Regarding the control variables, Table 5b shows that government consumption does not affect factor productivity in any specification, while productivity declines with inflation, at the 5 % level of significance. The first-stage F-test indicates that the instruments have some explanatory power, while the Sargan test does not reject the overidentifying restrictions at conventional levels of significance. Overall, these results suggest a new implication regarding the findings reported by Hall and Jones (1999) or Lambsdorff (2003), for example. Their finding that the quality of institutions affects official productivity may indeed be driven by the fact that they used official figures and, therefore, underestimated output. 
Consequently, their estimates may not only imply that some production disappears in weak institutional frameworks, but also that some production goes underground.
However, as argued in Sect. 2.2, the quality of institutions not only affects TFP and the size of the shadow economy, but also the accumulation of factors of production. Accordingly, Hall and Jones (1999) show that the impact of institutions on output is due to their joint effect on the stock of physical capital, the stock of human capital, and TFP. Table 6 provides a similar decomposition, taking into account the extra impact of institutions on the shadow economy. With total output being equal to official output plus the shadow economy, expression (1) implies that the sum of the coefficients of institutions that appear in columns (2) to (5) of Table 6 should equal the coefficient that appears in the first column.24 According to Table 6, total output increases with the rule of law in the restricted sample of 76 countries, at the 1 % level of significance, confirming the results of Table 4. These results confirm that the impact of institutional quality on total output runs through its effect on physical capital, human capital, and TFP, but also show that it is partly compensated by its impact on the shadow economy. They also imply, however, that the biggest share of the impact of institutional quality on output runs through the capital stock per worker. More specifically, the estimates reported in the first column of Table 6 show that an increase in the rule of law index by 0.1 results in a 9.9 % increase in total output per worker. Out of this 9.9 %, 8.3 % is due to an increase in the physical capital stock, 1 % to an increase in the human capital stock, and 1.4 % to TFP. However, the same improvement also decreases the shadow economy, and this decomposition illustrates how neglecting the shadow economy leads to overestimating the impact of institutions on output.25
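The adding-up property stated above allows a quick back-of-the-envelope check of the decomposition. The sketch below takes the percentages quoted in the text and computes the contribution left over for the shadow economy. That residual is not a figure reported in the text; it is simply what the adding-up property would imply, assuming columns (2) to (5) of Table 6 correspond to physical capital, human capital, TFP, and the shadow economy.

```python
# Effects of a 0.1 increase in the rule of law index, in percent, as quoted in the text.
total_effect = 9.9
channel_effects = {
    "physical capital": 8.3,
    "human capital": 1.0,
    "TFP": 1.4,
}

# Sum of the three channels that operate through official output.
official_channels = sum(channel_effects.values())

# Residual implied by the adding-up property (an inference, not a reported estimate).
implied_shadow_effect = round(total_effect - official_channels, 1)

print(f"sum of factor and TFP channels: {official_channels:.1f} %")
print(f"implied shadow economy channel: {implied_shadow_effect:.1f} %")
```

Under these assumptions the three official channels add up to more than the total effect, so the shadow economy channel must be negative, which is consistent with the statement that the effect of better institutions on total output is partly compensated by a shrinking shadow economy.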

Concluding remarks
In this paper, we reexamined the nexus between output, productivity, and institutions, while taking account of the importance of the shadow economy across the world. With this in mind, we studied the distribution and institutional determinants of output and total factor productivity (TFP), comparing the results obtained with both official output and total output, with the latter being defined as the sum of output produced in the official and shadow economies.
According to our results, the cross-country distribution of output becomes less dispersed when official output figures are corrected for the shadow economy. This is due to the fact that the share of unrecorded activity is larger in poorer countries. Thus, these countries' total production tends to be underestimated by official figures.
To check how the omission of the shadow economy from official output figures may bias productivity measures, we performed a development accounting analysis with both official and corrected output figures. Our results show that when using official figures, total factor productivity is underestimated, especially in poor countries. Moreover, we observe that correcting output for the shadow economy leads to an increase in the predictive power of the factor-only model. Part of the puzzle as to why factor endowments have a limited ability in explaining cross-country differences in output per worker may thus be explained by the existence of the shadow economy.
Deepening the level of explanation regarding differences in countries' economic performance, we then studied the impact of the quality of the institutional framework. While we were able to replicate the usual association of output and TFP with institutions when we used official figures, we obtained more qualified results upon employing corrected output figures. In particular, although total output is significantly positively correlated with institutional quality, the estimated impact of institutions is smaller than the one obtained with official output. Even more striking is the impact of our correction on the relationship between TFP and institutions. More specifically, even though we observe the usual positive correlation between TFP and institutional quality when output is measured by official figures, this correlation loses significance when corrected output is used instead. These findings call for a reinterpretation of earlier studies that have emphasized the relationship between measured TFP and institutional quality.
25 Again, we also split the sample according to income and the rule of law. As shown in Table A5 in Dreher et al. (2012), the results hold for the sample of 50 low-income countries, while neither official nor total productivity is affected in the sample of 26 countries with above average GDP per capita. This suggests that our results are mainly driven by low-income countries. When we split the sample according to the rule of law, all coefficients are insignificant at conventional levels in the sample with weak rule of law countries. In the sample with strong rule of law countries, one coefficient even turns significantly negative. Due to both the small sample sizes and the reduced variance due to splitting the sample according to the dependent variable, it is not possible to give much credence to these results.
The main rationale behind our results is that weak institutions not only lead to less factor accumulation, but also encourage participation in the shadow economy. The observed negative correlation between weak institutions and official output is therefore driven both by a reduction in production and a switch from the formal to the informal sector. Using official output to estimate the relationship between institutions and output implies that the production of countries with weaker institutions will be underestimated, thereby inflating the observed relationship. As a result, when shadow output is added to official output the correlation weakens. Using official output figures to compute TFP leads to the same bias. Thus, correcting official figures for the shadow economy also weakens the relationship between institutions and TFP, or even goes as far as removing it altogether. The essence of our results suggests that part of the observed relationships reported in the previous literature is not due to a reduction of output, but instead due to a switch from the formal to the informal sector.
Our results have broad implications for the empirical literature on the determinants and consequences of GDP. Since the shadow economy tends to be larger in countries with a lower official GDP, results employing uncorrected figures will reflect this bias. Whenever the interest of the researcher is based on income, instead of official income, corrected figures should be used instead of official ones. These results also have important implications for policy makers. What matters for policy makers in most cases is income, and not income as it is officially measured. Statistical offices around the world should give priority to more precise estimates of the amount of underground activity in the country, thus producing reliable estimates of overall economic activity. Arguably, for some governments this would imply acknowledging the existence of a substantial amount of underground activity, which politicians in many countries will not find easy.
At the same time, what our results underline is that development accounting is a powerful tool of analysis that still needs improvement. This paves the way for exciting future research that may still change our understanding of the determinants of relative economic performance of countries all over the world.

Shadow economy: Size of the shadow economy in percent of official GDP. Source: Schneider (2005a, 2005b).
Total output per worker (y_tot): Official output corrected for the share of the shadow economy.
Number of workers (L): Computed as rgdpch * pop * 1000 / rgdpwok. Source: Penn World Tables version 6.2.
Investment (I): Computed as rgdpl * pop * ki. Source: Penn World Tables version 6.2.
Initial capital stock (K_0): Estimated as K_0 = I_0 / (g + δ). Source: Penn World Tables version 6.2.
Capital stock (K): Computed as K_t = K_{t−1} * (1 − δ) + I_{t−1}.
Schooling (s): Average number of years of schooling in the population over 25 years old. Source: Barro and Lee (2001).
Human capital (h): Defined as h = e^{φ(s)}, where φ is a piecewise linear function.
Latitude: Distance in degrees from the equator. Source: Easterly and Sewadeh (2001).
Language: Percentage of the population speaking one of the five primary European languages: Portuguese, Spanish, English, French, German. Source: Alesina et al. (2003).
Rule of law: Perceptions based index, with higher numbers showing "better" environments. Source: Kaufmann et al. (2006).
Corruption: Perceptions based index, with higher numbers showing "better" environments. Source: Kaufmann et al. (2006).
Government effectiveness: Perceptions based index, with higher numbers showing "better" environments. Source: Kaufmann et al. (2006).
Quality of bureaucracy: Perceptions based index, with higher numbers showing "better" environments. Source: Kaufmann et al. (2006).
Voice and accountability: Perceptions based index, with higher numbers showing "better" environments. Source: Kaufmann et al. (2006).
Overall index: Average of rule of law, corruption, government effectiveness, and quality of bureaucracy. Source: Kaufmann et al. (2006).