# The Distribution of Human Capital in Sweden, ch. 4

## 4. Model specification

This chapter is deliberately exhaustive. It aims to offer a thorough, and hopefully interesting, discussion of the estimating model, elaborating on the potential sources of bias in the measure of human capital and on how they have been dealt with in this study.

As a preview, Table 4.1 in section 4.1 contains the estimation results for the experience component in $x_{it}\beta$, while Table A.1 in the appendix presents the estimates for the controls in $x_{it}\beta$.

### 4.1 The experience component of human capital

This section addresses the estimation of the experience component of human capital. Before turning to the more delicate, argumentative part of this process, I reconnect to the formal framework developed in chapter two and, to help the reader follow the argument, restate our core statistical model and the assumption of strict exogeneity needed for unbiased estimation of the elements in our parameter vector $\beta$:

(4.1) $w_{it} = \Theta_i + x_{it}\beta + \varepsilon_{it}$

(4.2) $E(\varepsilon_{it} \mid x_{i1}, \ldots, x_{iT}, \Theta_i) = 0$

To ensure that the strict exogeneity assumption holds for the experience component in $x_{it}\beta$, we have to make sure that nothing outside of the model influences both wage and experience once $\Theta_i$ has been accounted for. Suppose, for example, that the return to experience depends on whether you are a woman or a man, and that this dependency in turn changes over time, as we might expect it to do. The assumption would then fail if this dependency were left outside of the model. Hence, we would need a gender dummy. However, recall that we also need to be sure that there are no perfect linear relationships between the explanatory variables in the model. We stated this as an assumption that the outer product matrix of $\tilde{x}_{it}$ is of full rank (see equation (2.9)), which implies that we cannot include time-invariant variables in our explanatory variable vector $x$, since such variables are perfectly collinear with the person effect $\Theta_i$. So how do we control for gender without including a time-invariant variable in the model? We simply interact the time-invariant variable gender with the time-varying variable experience, thereby obtaining separate estimates of the return to experience for women and men. And since we are already controlling for constant differences through $\Theta_i$, controlling for different trends of time-invariant variables will suffice.
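The rank argument can be illustrated with a small simulation. The following is a minimal sketch on simulated data, not the estimation code used in this study: the panel dimensions, the coefficient values and the helper `within` are all hypothetical. It shows that a time-invariant gender dummy is wiped out when each person's time mean is subtracted, whereas its interaction with experience survives and lets the return to experience differ by gender.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_years = 1000, 6  # hypothetical panel dimensions

person = np.repeat(np.arange(n_people), n_years)
female = np.repeat(rng.integers(0, 2, n_people), n_years).astype(float)

# Experience grows by 0.5 to 1 year per calendar year (time-varying).
start = rng.integers(0, 20, n_people)[:, None]
growth = rng.uniform(0.5, 1.0, (n_people, n_years)).cumsum(axis=1)
experience = (start + growth).ravel()

theta = np.repeat(rng.normal(0.0, 0.5, n_people), n_years)  # person effects
noise = rng.normal(0.0, 0.05, n_people * n_years)

# Simulated log wage: return to experience 0.03 for men, 0.02 for women.
wage = theta + 0.03 * experience - 0.01 * female * experience + noise


def within(v, groups):
    """Subtract each person's own time mean (the within transformation)."""
    means = np.bincount(groups, weights=v) / np.bincount(groups)
    return v - means[groups]


exp_w = within(experience, person)
female_w = within(female, person)              # identically zero
inter_w = within(female * experience, person)  # still varies over time

rank_with_dummy = np.linalg.matrix_rank(np.column_stack([exp_w, female_w]))
rank_with_interaction = np.linalg.matrix_rank(np.column_stack([exp_w, inter_w]))

# Within estimator using experience and its interaction with gender.
X = np.column_stack([exp_w, inter_w])
beta_hat, *_ = np.linalg.lstsq(X, within(wage, person), rcond=None)
print(rank_with_dummy, rank_with_interaction, beta_hat.round(3))
```

In this sketch the regressor matrix containing the demeaned dummy has rank 1 rather than 2, while the matrix containing the interaction has full rank, and the two estimated coefficients land close to the simulated values of 0.03 and -0.01.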

Consequently, anticipating two potential sources of bias, the variables gender and immigrant are interacted with experience and year in (2.11) to control for different trends in the relationship between labor force experience and wage between the groups. In addition, experience is modeled through a third-degree polynomial to allow for a non-linear relationship with wages. The order of the polynomial was decided upon after a visual inspection of the wage at different levels of experience.
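To make the structure of such a specification concrete, the sketch below builds the regressor columns for a third-degree polynomial in experience together with its interactions with two group dummies. The function name `experience_terms` and the column layout are illustrative assumptions and need not match the exact form of (2.11); the interactions with year are left out for brevity.

```python
import numpy as np


def experience_terms(experience, female, immigrant, degree=3):
    """Polynomial in experience plus its interactions with two group dummies.

    Returns (names, X) where X has one column per term. The layout is an
    illustrative assumption, not the exact specification in (2.11).
    """
    experience = np.asarray(experience, dtype=float)
    groups = {
        "female": np.asarray(female, dtype=float),
        "immigrant": np.asarray(immigrant, dtype=float),
    }
    names, cols = [], []
    for k in range(1, degree + 1):
        power = experience ** k
        names.append(f"exp^{k}")
        cols.append(power)
        # Interacting each power with a dummy lets the whole profile differ by group.
        for label, dummy in groups.items():
            names.append(f"exp^{k}:{label}")
            cols.append(power * dummy)
    return names, np.column_stack(cols)


names, X = experience_terms([1, 2, 3, 10], female=[0, 1, 0, 1], immigrant=[0, 0, 1, 1])
print(names)
print(X)
```

Each power of experience appears once on its own and once per group dummy, so with a cubic and two dummies the sketch produces nine columns. Because every column varies over time whenever experience does, none of them is removed by the person effect.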

A rather different, but perhaps more interesting, question is what to do with the different estimates. Remember that our objective is to obtain a measure of human capital, and that we previously established that all components pertaining to an individual’s capacity to produce are valid components in a measure of human capital, while the rest are not. The question is, and this is where things get delicate, whether the different trends of men and women, and of immigrants and non-immigrants, are the result of a fair market valuation of the quality of their labor force experience. By fair, I mean free from discriminatory elements.

However, before going into what the potential sources of discrimination might be, I would like to make a distinction between pre-market and in-market discrimination. For example, if some lucrative parts of the labor market, such as CEO positions, are inaccessible to women, they will select into career paths that differ from those of men because the expected returns to their labor are lower. As a consequence, even though the within-occupation expected returns to labor might be exactly equal between men and women, the labor market will be segregated, with women predominantly working in less lucrative occupations due to the selection mechanism. This is obviously an economically inefficient and morally abominable outcome, but in terms of using the labor force experience of an individual as a component in the measure of their human capital it does not matter. As long as the quality of experience is appraised solely on factors pertaining to the production capacity of the individual, a segregated labor market will not cause our measure to be biased. It is only the valuation of the experience that needs to be free of discriminatory elements, not the actual process of its accumulation. Another way of putting it is that the returns to productivity should be equal across groups that are identical in their productive capacities.

In the field of labor economics, seminal work on the topic of discrimination was done by Gary S. Becker (1957), Edmund S. Phelps (1972) and Kenneth Arrow (1973). They focus on two different kinds of discrimination, originating from different sources. Becker developed the framework for what is commonly known as taste-based discrimination, where one group of individuals is preferred to another. Phelps and Arrow, on the other hand, focused on discrimination that arises due to the prevalence of imperfect information in the labor market, known as statistical discrimination. To determine the extent of in-market discrimination in the Swedish labor market and its consequences for our estimation of human capital, we need a theoretical foundation from which to derive conclusions about the matter. I will therefore briefly and non-formally describe the ideas underpinning the theories of taste-based and statistical discrimination.

Differences in preferences may be conscious, such as racists preferring to work with people of their own color. This is the kind of preference that Becker concerned himself with. But they may also be unconscious, such as a person who tends to tip a white taxi driver slightly more than a black one (Ayres, 2004). This type of implicit discrimination has quite recently emerged as a subject of interest for labor economists, and its relevance is convincingly argued for by Bertrand et al. (2005). Now, if preference-based discrimination is established in the economy, it would probably lead to in-market discrimination through wage differentials. I say probably because it matters who the source of the discrimination is. If the discriminatory behavior is on the part of the consumer, a non-preferred individual with the same level of productivity as a preferred individual would always be forced to accept a wage penalty to compensate for the expected loss of sales that their presence would imply. It does not matter whether the loss actually occurs in any given case, only that it does on average. If, on the other hand, the discriminatory behavior is on the part of employers or co-workers, the non-preferred individuals would only be forced to accept a wage penalty if their labor supply exceeded the labor demand of those employers and co-workers who do not discriminate, assuming that such employers and co-workers exist. As a consequence, I think it likely that taste-based discrimination does lead to in-market discrimination in the form of wage differentials between groups.

As mentioned previously, statistical discrimination arises because labor markets are characterized by imperfect information. The consequence of this is that differences in expected productivity and in the reliability of information lead to wage differentials between groups (Phelps, 1972; Arrow, 1973). However, assuming that marginal productivity pricing of labor is the prevalent compensation scheme, differences in expected productivity between groups only imply that above-average individuals within the groups will be undercompensated for their productivity while below-average individuals will be overcompensated. The average compensation for the group will still equal its average productivity. Thus statistical discrimination due solely to differences in expected productivity between groups is not a problem for our estimates. But suppose you are looking for a mechanic to fix your car. You are free to choose between two offers, one from a mechanic you have used previously and know to do a good job, and one from a mechanic with equal experience but whom you have never previously employed. Assuming that you are risk-averse, you would require some compensation to employ the previously unknown mechanic. Hence, at equal prices the old mechanic will be hired because the information that you have on your old mechanic is more reliable. To summarize, if we assume that employers are risk-averse, and that the quality of information differs between groups, the individuals in a group whose information is perceived as less reliable will be systematically undercompensated for their productive capacity. Of course, the converse situation might also arise, where you know for certain that your old mechanic will do a poor job and only expect that the new mechanic will perform equally badly. In this case the unreliability of the information will cause you to hire the new mechanic on the off chance that the new mechanic will surprise you and do a good job. Unfortunately, though, there is nothing to suggest that these circumstances will average out in the long run between groups. Hence statistical discrimination, pertaining to perceived differences in the reliability of information between groups, will also lead to in-market discrimination through wage differentials.
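The two mechanisms can be illustrated numerically. The sketch below is a stylized simulation under assumptions that are not part of this study: productivity and the employer's signal are normally distributed, the wage equals expected productivity given the signal, and a risk-averse employer deducts a certainty-equivalent discount of the form implied by constant absolute risk aversion. All parameter values and the function `offered_wages` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
mu, sd_q = 10.0, 2.0  # hypothetical mean and spread of productivity in a group


def offered_wages(sd_noise, risk_aversion=0.0):
    """Wages when the employer sees only a noisy signal of productivity."""
    q = rng.normal(mu, sd_q, n)                     # true productivity
    signal = q + rng.normal(0.0, sd_noise, n)       # what the employer observes
    gamma = sd_q**2 / (sd_q**2 + sd_noise**2)       # reliability of the signal
    expected_q = (1 - gamma) * mu + gamma * signal  # E[q | signal]
    posterior_var = (1 - gamma) * sd_q**2           # Var[q | signal]
    # Certainty equivalent under CARA utility and normal risk (an assumption).
    wage = expected_q - 0.5 * risk_aversion * posterior_var
    return q, wage


# Risk-neutral employer: individuals are mispriced, the group is not.
q, w = offered_wages(sd_noise=1.0)
group_gap = w.mean() - q.mean()
gap_above = (w - q)[q > mu].mean()   # above-average workers
gap_below = (w - q)[q <= mu].mean()  # below-average workers

# Risk-averse employer, two groups with identical productivity distributions
# but differently reliable signals.
_, w_reliable = offered_wages(sd_noise=1.0, risk_aversion=0.1)
_, w_unreliable = offered_wages(sd_noise=3.0, risk_aversion=0.1)
print(group_gap, gap_above, gap_below, w_reliable.mean(), w_unreliable.mean())
```

With a risk-neutral employer the average gap between wage and productivity is approximately zero for the group as a whole, negative for above-average individuals and positive for below-average individuals. Once risk aversion is introduced, the group with the less reliable signal receives a lower average wage even though both groups have the same distribution of productivity.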

Now that we have established a theoretical framework from which to draw conclusions about the consequences of discrimination, we shift our focus to the empirical question of whether discrimination is prevalent in the Swedish labor market.

Evidence suggests that there is substantial selection into occupations by gender, and that this is the main source of wage differentials between women and men in Sweden (Meyersson-Milgrom et al., 2001). Even so, differences within occupations still remain after controlling for observable characteristics such as education and experience. Furthermore, the wage differential increases along the wage distribution and accelerates in the upper part (Albrecht et al., 2003), suggesting that the most lucrative jobs in the labor market are inaccessible to women. But whether this will cause our estimates to be biased depends on whether the wage differential between women and men that remains after controlling for observable characteristics is due to differences in unobservable characteristics. If it is, not taking account of it in our measure of human capital would imply that we are excluding a market valuation of the returns to experience due to differences in unobservable characteristics between the groups. In other words, the conceptual advantage of our specification would diminish substantially. On the other hand, if we do take account of the differences in returns to experience between women and men, we would have to assume that the unexplained wage differential is due to unobservable factors that affect an individual’s capacity to produce. And if that assumption does not hold, the validity of our human capital measure will diminish. So which direction do we take? Without any clear guidance on the subject from previous empirical research on the Swedish labor market, I think it is sensible to mimic what has previously been done in similar estimation strategies, if for no other reason than comparability. Abowd, Kramarz and Margolis (1999), Abowd, Lengermann and McKinney (2003) and Abowd et al. (2005) all accounted for gender differences in their human capital measure. Consequently, while stressing that this is not an obvious or necessarily correct choice, differences in returns to experience between the genders will be accounted for in our human capital measure.

When it comes to discrimination against immigrants, the empirical picture is quite similar, with the selection mechanism as the main catalyst of discrimination. Immigrants have a much harder time getting a job interview (Carlsson and Roth, 2007; Åslund and Skans, 2007) and subsequent employment (Eriksson and Lagerström, 2011). Findings also suggest that immigrants from African, Asian and Middle-Eastern countries are especially subjected to discrimination (Arai and Skogman Thoursie, 2009; le Grand and Szulkin, 2003; Carlsson and Roth, 2007; Eriksson and Lagerström, 2011). Furthermore, differentials between immigrants and non-immigrants also remain after controlling for observable characteristics (le Grand and Szulkin, 2003). But just like the gender wage differential, the immigrant wage differential may be due to unobservable characteristics pertaining to productivity. In fact, I think it is more plausible in this case that there are unobservable characteristics that differ between the groups and that can explain the wage differential. An immigrant might have less access to social networks that are relevant to the employer, and might struggle with cultural differences and language barriers. Of course, the opposite could also be true, where an immigrant has unique access to a social network that is essential to the employer, or has an understanding of the culture that the employer wants to expand the business to, or is fluent in a language that the employer wants to be able to communicate in. But whatever the case, it seems likely that more differences in unobserved factors of human capital exist between immigrants and non-immigrants than between women and men. In addition, an ultimately decisive reason why different returns to experience for immigrants should be included in the human capital measure is that evidence suggests that immigrants tend to gradually assimilate to the society they have joined (Borjas, 1985; Friedberg, 1995). Briefly explained, the assimilation process implies that the differences between immigrants and non-immigrants decline over time as the immigrant becomes more and more integrated into the new society. Therefore, taking into consideration the plausibility of differences in unobserved factors of human capital and the assimilation process of immigrants, differences in returns to experience between immigrants and non-immigrants will also be accounted for in our measure of human capital.

To summarize, an assessment of the evidence suggests that discrimination is prevalent in the Swedish labor market. It also suggests that it primarily operates by causing a segregated labor market through a selection mechanism created by differences in expected returns to labor between the groups. However, even after controlling for occupation and other observables, the wage differentials remain. But because it is not possible to determine whether those differences reflect differences in the returns to productivity or differences in unobserved factors between the groups, it is not entirely obvious whether they should be accounted for in our measure of human capital. Lacking more substantial guidance, I decided to account for the differences between women and men in the measure of human capital for reasons of comparability with previous studies. The same conclusion was reached for differences pertaining to nativity, but for slightly different reasons. First, it seems plausible that differences in unobserved factors of human capital between immigrants and non-immigrants exist and are reflected in the wage. Second, the process of assimilation is likely expressed through differences in returns to experience between immigrants and non-immigrants. Hence, the differences between the groups’ returns to experience are included in the human capital measure.

Table 4.1 below contains the estimation results for the experience component in $x_{it}\beta$.

Table 4.1 [Table not shown]

The sizes of the coefficients are consistent with what we might expect given the prevalence of discrimination. The experience component, as well as its rate of decay, is largest for non-immigrant males, followed by immigrant males, non-immigrant females and lastly immigrant females. This ordering is quite similar to that obtained in Abowd, Lengermann and McKinney (2003), although they modeled experience through a quadratic and did not estimate separate effects for immigrants and non-immigrants. The coefficients obtained in this study are, however, much smaller.

### 4.2 The unobserved component of human capital

As equation (2.15) shows, the elements in $\Theta_i$ correspond to each individual’s average log wage over the time period when the explanatory variables are held at zero. Thus, the elements in $x_{it}$ are variables whose association with the wage is controlled for in the model. In the case of the experience component in $x_{it}$, the strength of this association, as measured by the corresponding element in $\beta$, is of particular interest because it is a part of our measure of human capital. But for the rest of the variables in $x_{it}$, the strength of the association with the wage is not as important. What is essential is that their influence on the wage is controlled for, to ensure that the sources of variation that are captured by $\Theta_i$ are valid components of human capital.
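A minimal sketch of how person effects of this kind can be recovered once the slope coefficients are known: for each individual, take the time average of the log wage after subtracting the part explained by the regressors. This is the standard fixed-effects calculation and is assumed, not confirmed, to correspond to equation (2.15); the function `person_effects` and the toy data are hypothetical.

```python
import numpy as np


def person_effects(wage, X, person, beta_hat):
    """Average, per person, of the wage left after removing X @ beta_hat.

    `person` must hold integer labels 0, 1, ..., N-1, one per observation.
    """
    residual_wage = np.asarray(wage, dtype=float) - np.asarray(X, dtype=float) @ beta_hat
    totals = np.bincount(person, weights=residual_wage)
    counts = np.bincount(person)
    return totals / counts


# Tiny noise-free example: two people, three years, one regressor.
person = np.array([0, 0, 0, 1, 1, 1])
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
beta = np.array([0.5])
true_theta = np.array([2.0, 3.0])
wage = true_theta[person] + X @ beta

theta_hat = person_effects(wage, X, person, beta)
print(theta_hat)
```

Because the toy data contain no noise, the recovered effects coincide with the values used to generate the wages. With real data, any variable left out of the regressors would have its person-level average absorbed into these estimates, which is why the choice of controls matters for the interpretation of the person effects.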

Since we have already established that all components pertaining to an individual’s capacity to produce are valid components in a measure of human capital, the effect on the wage of, say, moving from one city to another is not a valid measure of human capital. Unless some remarkable discovery of a “well of knowledge” or “fountain of youth” is made, surely an individual’s accumulated knowledge and experience, physical and mental health or cognitive and motor skills do not change in accordance with her position on the globe. Likewise, they do not change when a person goes from being employed by the private sector to the public sector, or from one branch of industry to another. But all of these changes do affect the wage, and must therefore be properly controlled for in (2.11). Thus the variables industry, county, sector and year are controlled for in the model.

The industry variable divides the labor market into 19 different branches of industry and indicates which one the main employer of the individual belongs to. Hence, it will control for variation in wages due solely to being employed in different branches of industry. The county variable indicates the residency of each individual in relation to the 21 counties in Sweden, and will control for variation in wages due solely to moving from one county to another. Likewise, the sector variable indicates whether the main employer of the individual belongs to the private sector, the public regional sector or the public state sector and will thus control for variation in wages due solely to being employed by different sectors of the economy. And finally, the year variable controls for variation in wages due to annual variations in the business cycle and exogenous shocks to the economy.

In addition, the variables child and fulltime are also controlled for in (2.11). The reason for this is, however, not as straightforward. Considering the child variable, it is probable that the absence from the labor market that is implied by parenting, especially for mothers (Albrecht et al., 2003), constitutes an actual depreciation of human capital. And if we assume that the labor market is efficient, the lower expected productivity of parents should equal the depreciation of their human capital, and the corresponding wage effect should thus not be controlled for in our model. But it seems unlikely that this is the only impact of parenting on wages. Inherent in the decision of employment is also the potential of the employee to accumulate human capital. In the eyes of employers, this potential is valuable and therefore compensated through the wage. Now, to the extent that parenting lowers this future potential, an employer would impose a penalty on the wage of parents. Thus a parent will not receive the same wage as a non-parent at a given level of productive capacity. As a consequence, the wage penalty imposed on parents should be, and is, controlled for in the model.

For the fulltime variable, a different mechanism is likely to cause differences in wages for given levels of production capacity. Assume that the standard contract of employment stipulates that an employee is to work full-time for a specific wage. Assume further that the marginal utility of income is decreasing, and that the employer prefers to offer part-time employment. For an employee to consent to work less than full-time, the employer would have to pay a premium to the employee to compensate for the disproportionate loss of utility induced by the decrease in income as the income level moves further down the distribution. In an analogous fashion, an employee working part-time would be willing to accept a lower hourly wage in exchange for working full-time. On the other hand, if it is the preference of the employee that has changed so that the utility of leisure has increased, then a part-time employment contract would be more valuable than a full-time contract, and consequently the premium would be paid by the employee in the form of a lower hourly wage. Hence the direction of this effect is ambiguous, but it is nevertheless also controlled for in the model.

Table A.1 in the appendix contains the estimation results for the controls in $x_{it}\beta$. Despite the large number of parameters estimated, the precision of the estimates is such that all effects are significant at the 1 percent level, except for the industry factor levels Electricity, gas and water supply, which is significant at the 5 percent level, and Other business activities, which is significant at the 10 percent level. The size and direction of the controls are broadly in line with what we would expect, confirming both the wage premium demanded by part-time employees and the wage penalty suffered by parents. A word of caution about the factor level Extraterritorial organizations: even though the estimate is significant, the sample only contained 75 individuals working for an employer in that particular branch of industry (compared to several hundreds or thousands in the other branches). Using the estimates in Table 4.1 and Table A.1, the person effects, $\Theta_i$, were estimated as described in chapter 2 (see equation (2.15)). Those estimates are presented in the next chapter.