A Rough Guide to Getting Inequality in OLGs

robertdkirkby · September 27, 2023, 1:54am

Was writing the following as an email, figured I would clean it up for general interest

We are often interested in getting realistic levels of inequality in life-cycle and OLG models. This is a brief description of how this is typically done and covers various aspects of inequality: earnings, consumption, wealth, and hours worked. I focus on the most basic/standard approaches.

Earnings Inequality
In an exogenous labor supply model, earnings inequality is directly generated by our process on earnings. In an endogenous labor supply model earnings inequality is a combination of the process on earnings with decisions on how much to work. Here I will describe the case with exogenous labor supply, but most of the literature with endogenous labor supply simply does the same things, except as a process on earnings-per-unit-of-time-worked, rather than as a process on earnings.

The ‘standard’ approach (as of early 2020s) is to model earnings as

earnings=exp(\alpha_i+\kappa_j+z+e)

Where \alpha_i is a fixed-effect, \kappa_j is a deterministic age-profile (often fitted as a quadratic or cubic function), z is an AR(1) process, and e is an i.i.d. process. [The exponential is there as then we can estimate this from data on log-earnings, and because it eases the interpretation of \alpha_i, \kappa_j, z and e as they are ‘percentage deviations’.]
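As a rough illustration of this process, here is a minimal simulation sketch in Python (not VFI Toolkit code). All parameter values and the quadratic age profile are made up for illustration, and the function name `simulate_log_earnings` is hypothetical.

```python
import numpy as np

def simulate_log_earnings(n_agents, n_ages, rho, sigma_eps, sigma_e, sigma_alpha, kappa, seed=0):
    """Simulate log-earnings alpha_i + kappa_j + z + e for a panel of agents."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(0.0, sigma_alpha, size=n_agents)       # fixed effect, one draw per agent
    z = np.zeros((n_agents, n_ages))                           # persistent AR(1) component
    z[:, 0] = rng.normal(0.0, sigma_eps, size=n_agents)
    for j in range(1, n_ages):
        z[:, j] = rho * z[:, j - 1] + rng.normal(0.0, sigma_eps, size=n_agents)
    e = rng.normal(0.0, sigma_e, size=(n_agents, n_ages))      # i.i.d. transitory component
    return alpha[:, None] + kappa[None, :] + z + e

ages = np.arange(40)
kappa = 0.05 * ages - 0.001 * ages**2      # hypothetical quadratic age profile
log_earn = simulate_log_earnings(20000, 40, rho=0.95, sigma_eps=0.1,
                                 sigma_e=0.2, sigma_alpha=0.3, kappa=kappa)
earnings = np.exp(log_earn)                # the exponential keeps earnings positive
var_by_age = log_earn.var(axis=0)          # cross-sectional variance of log-earnings by age
```

With a persistent z the cross-sectional variance of log-earnings rises with age, which is the feature that later feeds into consumption inequality.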

This process is capable of generating realistic earnings inequality (e.g., all the deciles). But we can do substantially better with a few small changes. The first is to make the parameters of z and e depend on age. The second is to change the innovations to z and e, which are typically normal/gaussian distributions, and instead to use gaussian-mixture innovations. [VFI Toolkit contains routines to discretize all of these, see Appendix on exogenous shock processes in the Intro to Life-Cycle Models.]
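To see what gaussian-mixture innovations buy you, here is a small Python sketch (not the toolkit's discretization routines). It draws from a two-component mixture normalised to have mean zero; the parameter values are invented for illustration and `draw_mixture` is a hypothetical name.

```python
import numpy as np

def draw_mixture(n, p1, mu1, sigma1, sigma2, seed=0):
    """Draw n innovations from a two-component Gaussian mixture with overall mean zero."""
    rng = np.random.default_rng(seed)
    mu2 = -p1 * mu1 / (1.0 - p1)            # second mean chosen so the mixture mean is zero
    from_first = rng.random(n) < p1          # which component each draw comes from
    return np.where(from_first,
                    rng.normal(mu1, sigma1, n),
                    rng.normal(mu2, sigma2, n))

eps = draw_mixture(200_000, p1=0.1, mu1=-0.5, sigma1=0.6, sigma2=0.1)
m, s = eps.mean(), eps.std()
skewness = np.mean((eps - m) ** 3) / s**3
kurtosis = np.mean((eps - m) ** 4) / s**4    # equals 3 for a single normal
```

A rare, wide, negative-mean component next to a frequent, narrow one produces negative skewness and kurtosis above 3, neither of which a single normal innovation can deliver.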

There are two more things we might add to earnings to make them more realistic. The first is a ‘heterogeneous income profile’ (HIP): instead of the fixed effect, \alpha_i, being just a constant, we can also have a slope \alpha^{HIP}_i * j which generates an ‘income profile’. [Think of it as a reduced-form approximation of modelling human capital that is accumulated with age/work and which impacts earnings.] The second is to estimate different earnings profiles for different educational groups. [GKOS2023 and Blundell, Graber & Mogstad (2015) advocate these two, respectively.]

There is one last thing worth mentioning for earnings inequality. Adding a ‘non-employment’ shock can help hit earnings. [Later I emphasize non-employment shocks as a way to hit hours-worked inequality, but they also help for earnings.] You can either use a ‘partial non-employment’ shock (if you are hit with the shock, earnings are, e.g., 0.7 of what they would otherwise be), or a full non-employment shock (if you are hit with the shock, earnings are zero). [Kaplan (2012) does partial, GKOS2022 does full.]

People used to use unit-root/permanent shocks. An estimation like that of GKOS2022 would estimate an age-conditional autocorrelation of one if this was the ‘correct’ process. They do not. This kind of evidence strongly suggests permanent shocks are not very empirically realistic.

Getting earnings inequality ‘right’ obviously has important impacts on getting consumption and wealth inequality.

Consumption Inequality
Consumption inequality in the data increases with age. As long as your earnings process gives a substantial role to permanent/persistent shocks, and/or to heterogeneous income profiles (and you have a tight borrowing constraint for the young), your model is going to get this just fine. Obviously getting the details is trickier, but the broad-brush is easy enough. See, e.g., Storesletten, Telmer & Yaron (2004).

Wealth Inequality
Wealth inequality is substantially higher than earnings inequality (in the data). There are two sides to getting it empirically roughly right. The first is to get that a decent chunk of the population holds essentially a zero fraction of total wealth (but not quite zero wealth); the second is to get that the top 10/5/1% hold an obscenely large amount. Getting that a chunk of the population holds near zero can be done in two ways: have a flat-earnings profile together with a pension that is roughly the same amount as their earnings (this kills off the life-cycle consumption-smoothing motive for saving), or you can have them be relatively impatient (either have different agents with different discount factors, beta, so that those with lower beta will save little to nothing given equilibrium interest rates; or make them suffer impatience (quasi-hyperbolic discounting) or temptation (Gul-Pesendorfer preferences)). [VFI Toolkit can handle any/all of these using permanent types.]
There are roughly three ways to get the wealth inequality at the top of the wealth distribution: (i) bequests, (ii) preference heterogeneity (typically different discount factors), (iii) heterogeneous rates of return. Note that without bequests, the effect of all the others is muted. Often the heterogeneous rates of return are more explicitly modelled as entrepreneurship. For the top quintile of wealth, and for the savings of the elderly, the risk of medical expenses in old age can be modelled explicitly. When using bequests the intergenerational persistence of earnings ability is also important. See De Nardi & Fella (2017) for more. The heterogeneous rates of returns is easily implemented in a standard life-cycle model where there is a rate-of-return r to assets, simply making r stochastic (more persistent markov processes will generate more wealth inequality).

Hours Worked Inequality
If you just add endogenous labor as a (continuous) decision you won’t capture hours worked inequality. There are two levels at which you can get the hours worked inequality. The first is to use non-employment shocks to simply force people to be occasionally unable to work (and allow them to depend on age); you can find this in Kaplan (2012). The second endogenizes more and uses labor-search (you can decide to try to find a job, but only end up with a job with some probability; model employment-status as a ‘semi-exogenous state’) together with earnings being a convex function of hours worked and a fixed-cost of working (Erosa, Fuster & Kambourov, 2016).

Initial Inequality
Your model is likely to start in period 1 with everyone already being, e.g., 23 years old. Inequality already exists at this age, and will simply have to be put into the model in the form of the initial age j=1 distribution. [Obviously the more state variables and permanent types in the model, the easier it is to have a lot of heterogeneity/inequality in the age j=1 distribution.] To give a simple example, imagine the model has one endogenous state for assets, and one markov exogenous state for earnings. We could estimate a joint log-normal iid distribution for assets and earnings from the data on 23 year olds, and then discretize this as our age j=1 distribution. [Log-normal as the empirical distributions have a skew towards high earnings/assets, and the log-normal is a simple way to capture this which normal would not be able to do. Even better might be something like gaussian-mixture.]
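One simple way to do that discretization is by simulation: draw from the joint log-normal and assign each draw to the nearest point of the asset and earnings grids. The Python sketch below does this; it is an illustration rather than toolkit code, and the grids, mean vector and covariance matrix are placeholders, not estimates from any data.

```python
import numpy as np

def nearest_index(grid, x):
    """Index of the grid point nearest to each x (grid must be increasing)."""
    idx = np.searchsorted(grid, x).clip(1, len(grid) - 1)
    left_closer = (x - grid[idx - 1]) <= (grid[idx] - x)
    return np.where(left_closer, idx - 1, idx)

def initial_dist_lognormal(a_grid, z_grid, mean_log, cov_log, n_draws=200_000, seed=0):
    """Approximate a joint log-normal over (assets, earnings) by a pmf on a_grid x z_grid."""
    rng = np.random.default_rng(seed)
    logs = rng.multivariate_normal(mean_log, cov_log, size=n_draws)
    a_draws, z_draws = np.exp(logs[:, 0]), np.exp(logs[:, 1])
    ia = nearest_index(a_grid, a_draws)      # draws beyond the grid land on the end points
    iz = nearest_index(z_grid, z_draws)
    pmf = np.zeros((len(a_grid), len(z_grid)))
    np.add.at(pmf, (ia, iz), 1.0)            # count draws in each (a, z) cell
    return pmf / n_draws

a_grid = np.linspace(0.0, 10.0, 101)
z_grid = np.linspace(0.1, 6.0, 21)
mean_log = np.array([0.0, 0.0])                   # placeholder means of log-assets, log-earnings
cov_log = np.array([[0.5, 0.2], [0.2, 0.3]])      # placeholder covariance, positive correlation
pmf0 = initial_dist_lognormal(a_grid, z_grid, mean_log, cov_log)
mean_assets = (pmf0.sum(axis=1) * a_grid).sum()   # mean assets implied by the discretized pmf
```

The resulting array sums to one and can play the role of the age j=1 distribution over (assets, earnings); in practice the earnings grid would be the grid of the markov process used in the model.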

A few final points
Another thing we might want to get in terms of inequality is family/gender; Borella, De Nardi & Yang (2018) is a nice example. Human capital is going to be important in terms of getting earnings inequality and how it reacts to any changes in the economy. We might also be interested in making education endogenous.

aledinola · September 28, 2023, 11:03am

As an example of a model with entrepreneurship and human capital accumulation set in an OLG framework, see Zeida (2022). The aim of Zeida’s paper is to evaluate the macroeconomic and distributional consequences of Trump’s corporate tax cuts in 2017.

Many other interesting papers on entrepreneurship and wealth inequality, such as Kitao (2008) and Bruggeman (2021) are set instead in an infinite-horizon framework. On top of this, Bruggeman’s paper also has stochastic ageing [young agents choose their occupation, whether to be a worker or an entrepreneur; they become old with some exogenous probability and when old they receive a pension; old agents die with some exogenous probability and are reborn with the same assets]

All three papers mentioned above should be replicable with the toolkit (I think, at least for the steady-state). Bruggeman (2021) and Zeida (2022) have online replication codes (in Fortran) available.

FedeLondon2024 · May 22, 2024, 1:19pm

Hi Robert,

I’m interested in the point you made above in relation to idiosyncratic rates of return:

‘The heterogeneous rates of returns is easily implemented in a standard life-cycle model where there is a rate-of-return r to assets, simply making r stochastic (more persistent markov processes will generate more wealth inequality).’

I actually want to study the implications of this, but I’m slightly puzzled, since it’s been drilled into me that to solve for general eqm in incomplete markets we iterate over the household problem until the interest rate consistent with a certain amount of capital, aggregated over all households, is equal to aggregate savings. So, if r is the equilibrium object, how can it be stochastic? I know Hubmer and Krusell put out a paper where they hardwire portfolio shares from Norwegian and Swedish administrative data and (I think) obtain different rates of return, but the explanation in the paper itself is obscure, and the code is in Fortran, which is another barrier.

On an unrelated note, I’ve just signed up to Microsoft Azure to get hold of some GPUs, and I have spent an afternoon trying to set it up, but to no avail yet. I wonder if you have (or are aware of) any resources that could serve as a roadmap to make that possible. I heard Fernandez-Villaverde after a presentation advising people to use Amazon Web Services, and I’ve also looked at that and it seems to be a maze much like Azure. I wonder whether you personally use either of these two platforms, and if so, what would you recommend? I’m purely interested in getting access to GPUs for the purpose of using the toolkit.

Best regards,
Federico

robertdkirkby · May 23, 2024, 12:12am

When solving the Aiyagari model, w and r are both just numbers (in equilibrium they are both deterministic constants). But we have that household income is w*z which is stochastic (and the stochastics just disappear in the aggregates when we take the expectation as there is a continuum of households). You can just pull the same trick on the asset returns, make it so that asset returns are r*z, and that way r is still a constant to be determined in eqm, while the households get a stochastic r*z. This is not the only option, but it gives you an intuition on how it can work with the general eqm. [I don’t know what exactly Hubmer & Krusell do.]
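A quick numerical illustration of the r*z idea, as a Python sketch with made-up numbers. It assumes the return shock z is normalised to have mean one and, for simplicity, is independent of asset holdings. In a model with persistent z the two would be correlated, so aggregate capital income would be r times the mean of z*a rather than r times mean assets, and the general eqm conditions would need to account for that.

```python
import numpy as np

rng = np.random.default_rng(0)
r = 0.04                        # the equilibrium interest rate is a single number
n = 1_000_000                   # many households stand in for the continuum
sigma = 0.5                     # hypothetical dispersion of the return shock

# idiosyncratic return shock: log-normal, normalised so that E[z] = 1
z = np.exp(rng.normal(-0.5 * sigma**2, sigma, size=n))
assets = rng.uniform(0.0, 10.0, size=n)    # hypothetical asset holdings, independent of z

household_returns = r * z                  # each household faces its own stochastic return
aggregate_capital_income = np.mean(household_returns * assets)
deterministic_benchmark = r * np.mean(assets)
```

Individual returns differ a lot across households, but `aggregate_capital_income` is (up to sampling error) equal to `deterministic_benchmark`: the idiosyncratic part washes out in the aggregate, exactly as with w*z for labor income.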

I used Azure a decade ago, but my knowledge would be very out of date by now. My own setup is (i) desktop computer with a mid-quality GPU, (ii) server/cluster that the university runs which has a more powerful GPU. I code everything on (i) with small grids, and once it is working I increase the grids and send it to (ii). In principle you could do the same with Azure/AWS as your (ii), but I don’t know exactly how one sets this up. My practical suggestion is, email your Engineering/Computer Science department, ask if they use Azure/AWS or if they have a server/cluster. If they use Azure/AWS they can probably send you their info they give to students (or let you sit in on the class while they explain to students). If they have a server/cluster, then they may let you use that (if it is University’s they will let you, but if it belongs to their school, probably not). [I stopped using Azure 9ish years ago when the University I work at put together the server/cluster that I nowadays use (they periodically upgrade parts of it)]

FedeLondon2024 · May 23, 2024, 11:12am

Thanks Robert. That makes a lot of sense.

In your experience, what is the highest number of continuous state variables that can be used if one has access to GPUs? I’m guessing that depends on the GPU, but I’m sure you’ve experimented and have a rough idea. I understand there’s Fortran code that does not run on GPUs that can handle 3 (I’m thinking of some paper I’ve seen, where there’s assets, idiosyncratic productivity, and cumulated labour earnings). Now, I’m told Fortran is particularly fast, so I guess that is the reason a lot of papers featuring OLG models with idiosyncratic uncertainty are accompanied by Fortran code (and I suspect it’s the same code worked over and over again and adapted to different needs). Until I set myself up so I can use your toolbox I’m merely speculating, but I wonder if your Matlab toolbox leveraging GPUs is on par with Fortran’s performance or whether it exceeds that, and whether therefore it can accommodate a higher number of continuous state variables?

robertdkirkby · May 23, 2024, 9:43pm

It is important to distinguish different types of state variables, in particular endogenous vs exogenous states. So assets is endo, idiosyncratic productivity is exo, and cumulated labour earnings is endo. The reason is that the endo states require you to think about both this period value, as well as choose next period value. Whereas the exo states you need to think about this period value, but next period value is not a choice and is just used for expectations. As a result the computational cost of endo states is vastly larger than the computational cost of exo states.
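A back-of-the-envelope count makes the difference concrete. The Python sketch below assumes a pure brute-force maximization over every (next-period endogenous state, this-period endogenous state, exogenous state) combination at each age. It is only a counting exercise under that assumption, not a description of the toolkit's internals, and the grid sizes are arbitrary.

```python
import math

def brute_force_evaluations(n_a, n_z):
    """Number of (a', a, z) combinations per age when every choice is checked at every state.

    n_a: grid sizes of the endogenous states; n_z: grid sizes of the exogenous states.
    """
    n_endo = math.prod(n_a)
    n_exo = math.prod(n_z)
    return n_endo * n_endo * n_exo     # endogenous states appear twice: as a' and as a

base = brute_force_evaluations([200], [10])            # one endo, one exo state
extra_exo = brute_force_evaluations([200], [10, 10])   # add a second exogenous state
extra_endo = brute_force_evaluations([200, 200], [10]) # add a second endogenous state
```

Here `base` is 400,000, `extra_exo` is 4,000,000 and `extra_endo` is 16,000,000,000: a second exogenous state with 10 points multiplies the work by 10, while a second endogenous state with 200 points multiplies it by 200 squared.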

The model you describe: assets, cumulated labour earnings, idiosyncratic productivity, so two endo and one exo states, is about the limit of what VFI Toolkit can solve (with a good GPU). But that is just because the toolkit uses a highly robust but not so fast algorithm. You could definitely solve it with more customized code using matlab+gpu.

Most of the code for older papers is in Fortran, but really the biggest speed gains come from better algorithms. Be aware though that good CPU algorithms and good GPU algorithms are two very different things, because of how the hardware differs in the two cases. The following paragraph hopefully gives some idea of why algorithms are the most important component when it comes to determining the code runtimes.

The steps in any value function problem (whether you use VFI or the FOCs) are essentially: the maximization (choose next period assets), the expectations, and an approximation/fitting of the value function (or other function if FOCs). In decreasing order of computation time these are max, fitting, and expectations. A good example of an algorithm is thus EGM, which replaces the max with a function inverse, so it is replacing the slowest part of value function iteration with something very computationally easy. This is why EGM is so fast. But also not all max can be replaced with a function inverse, which is why EGM can be complex to implement in more advanced models. It does however show you the importance of picking good algorithms if you really want fast code.
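To make the ‘replace the max with a function inverse’ idea concrete, here is a minimal Python sketch of a single backward EGM step. It assumes the simplest possible setting, which is not one from the thread: CRRA utility, constant income y, no uncertainty (so there is no expectations step), budget constraint c + a' = (1+r)a + y, and the borrowing constraint a' >= a_grid[0]. The function name `egm_step` is made up.

```python
import numpy as np

def egm_step(c_next, a_grid, beta, r, y, sigma):
    """One backward step of the endogenous grid method.

    c_next[i] is next period's consumption when next period's assets are a_grid[i].
    Returns this period's consumption on a_grid.
    """
    R = 1.0 + r
    # Invert the Euler equation c^(-sigma) = beta*R*c_next^(-sigma): no maximization needed
    c_today = (beta * R) ** (-1.0 / sigma) * c_next
    # Budget constraint gives the assets today at which choosing a' = a_grid[i] is optimal
    a_endo = (c_today + a_grid - y) / R
    # Fitting step: interpolate back onto the fixed grid
    # (np.interp holds the end value flat for grid points above a_endo[-1])
    c_on_grid = np.interp(a_grid, a_endo, c_today)
    # Below a_endo[0] the borrowing constraint binds, so a' = a_grid[0]
    constrained = a_grid < a_endo[0]
    c_on_grid[constrained] = R * a_grid[constrained] + y - a_grid[0]
    return c_on_grid

r, y, sigma = 0.04, 1.0, 2.0
a_grid = np.linspace(0.0, 10.0, 101)
c_last = (1.0 + r) * a_grid + y                  # final period: consume everything
c_prev = egm_step(c_last, a_grid, beta=1.0 / (1.0 + r), r=r, y=y, sigma=sigma)
c_impatient = egm_step(c_last, a_grid, beta=0.5, r=r, y=y, sigma=sigma)
```

With beta*(1+r)=1 the household equalises consumption across the last two periods, so `c_prev` can be checked by hand against (1+r)^2/(2+r)*a + y. The whole step is a couple of array operations and an interpolation, compared with a search over all a' for every a in brute-force VFI.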

So VFI Toolkit, because it uses fairly ‘generic’ algorithms cannot compete for speed with custom-code. The role of Fortran vs Matlab here is not zero, Fortran is slightly faster for CPUs, but it is minor compared to the role of algorithms. The advantage of the fairly generic algorithms used by VFI Toolkit is that it can easily solve just about anything. The GPU is important because it allows these generic/robust algorithms to still have reasonable runtimes (they would be painfully slow on CPU).

[EGM=endogenous grid method]

Shiki · May 24, 2024, 4:09am

Hi Robert, can VFI toolkit solve an OLG model with housing?
For example, the model in Floetotto et al. (2016) contains savings and housing assets. If I want to replicate the results, can I start with setting n_a in OLGModel4 to be a vector and do I need to change codes (besides consumption and return function) in other m files?

robertdkirkby · May 24, 2024, 4:34am

If the housing, h, only takes a handful of values (e.g., just 5 grid points) and you have a high-end GPU then you will be able to solve it. But if housing is a full second asset then it will just be too slow [hopefully I can implement divide-and-conquer for two endogenous states later this year, and hopefully that is fast enough, so then you will be fine with two full endogenous states].

‘can I start with setting n_a in OLGModel4 to be a vector and do I need to change codes (besides consumption and return function) in other m files?’

What you just described is correct, make n_a a row vector (and a_grid a stacked column vector), and change the inputs to ReturnFn and the FnsToEvaluate (so they are something like a1prime,a2prime,a1,a2,z). Those are essentially the only changes you need.

robertdkirkby · May 24, 2024, 9:04am

ps. details I won’t go into mean the ‘cumulated labour earnings’ is easier/faster (code wise) than a full second asset. Hence why when it is the second asset you are fine, but a full second asset like housing is only going to work if it has few grid points (at least until I get divide-and-conquer for two assets going)

aledinola · May 26, 2024, 7:31pm

@Shiki

The model in Floetotto et al. (2016) is quite complicated to solve. See the recursive problem of the household:

If you just want to learn how to solve a model with housing and home-ownership, a simpler example to start with is K. Chen RED (2010). You can find the Fortran code on the Review of Economic Dynamics website here and you could try to replicate the model using the VFI toolkit in Matlab.

The stationary model in Chen’s paper is relatively simple. There are 4 state variables: two continuous states (assets and housing), one Markov shock for labor productivity and age. The choices are consumption/savings, next-period housing and rent vs own. Labor supply is inelastic. If you limit the number of grid points on housing to something like 7-11, you should be able to use the toolkit to solve for the steady-state. Transition probably would take a long time. I attach below the household’s problem in Chen’s paper.

Shiki · May 27, 2024, 2:53am

Thank you. I’ll try it.

Shiki · June 2, 2024, 3:07pm

I have a quick question: when VFI toolkit calculates the general equilibrium condition (equations approach to 0), does it have a default tolerance value? If so, can I change it?

aledinola · June 2, 2024, 3:51pm

Yes you can change the default value, please see here:

jake88 · August 20, 2024, 5:46pm

I’m also interested in OLG model with housing. I’m trying to solve the model in this Chen paper that you suggested but I don’t know where to write the equation V(a,h,eta,j)=max{V_o,V_r}. Following other examples I set up the ReturnFn for the owner:

function F = ReturnFn(hprime,aprime,a,h,eta,j)
% tau and util are written as separate m files
F = -inf;
c = (1+r)*a+weps*eta+(1-delta_o)*h-tau(h,hprime)+indb;

if c>0 && aprime>=-(1-gamma)*hprime
    F = util(c,hprime);
end
end

Should I set up another return function for the renter? Probably from my questions you can see that I am a beginner.
By the way, is there an OLG with housing vs renting and general equilibrium implemented somewhere with the toolkit? Maybe it is more efficient if I look first at that.
Thanks!

aledinola · August 20, 2024, 8:33pm

Hi @jake88, as far as I know, each model must have a single ReturnFn. I think that to deal with the owner-renter choice V=max{V_o,V_r} you could introduce another endogenous state variable, call it housing tenure, with only two possible values: 1 = owner, 2 = renter.
Take this suggestion with a grain of salt, though: the problem would become a bit heavy since you would have three endogenous states.

As a passing note, in the ReturnFn you should also declare as inputs any parameter or variable that you use, like for example the interest rate r. I also assume that the utility function (which you have coded separately) should have some of its own parameters (e.g. crra, etc.)

jake88 · August 22, 2024, 1:28pm

Then the input n_a should be a vector with 3 elements? And n_z should be 1 element?
Maybe it could help if there was an example of a similar setup with the vfi toolkit. I looked at the repository with replications and the one with OLG but I didn’t find anything really related, unless I missed it

aledinola · September 4, 2024, 7:18am

Yes, n_a = [n_assets, n_house, 2]
Not aware of any example similar to this model

robertdkirkby · September 5, 2024, 7:57am

I’ve implemented a rough version of the Chen (2010) baseline model. I was doing divide-and-conquer for two endogenous states and figured this might be a nice test to see how the new code works in practice.

Codes are a bit rough, but 90% there.

There were three tricks to solving the model, first is relevant to VFI Toolkit, other two are general.
i) If you look closely at model, the distinction between “renter” and “owner” is just that a renter is someone choosing hprime=0 (zero next period housing) and an owner is someone choosing hprime>0. This can be easily handled by VFI Toolkit with an if-else statement inside the ReturnFn. Hence the state-space of the model is just (a,h,z,j)
ii) Renters choose how much to spend on consumption, c, and how much on housing services, d. It is possible to define cspend=c+pd. Then budget constraint can be used to calculate cspend, and the split of cspend into c and d can be solved analytically. See pdf in the github repo, as that does the derivation, the formula is then used in the ReturnFn.
iii) Paper describes the utility function as combining c and hprime: u(c,x)=\frac{[(\theta c^\upsilon + (1-\theta) x^\upsilon)^{1/\upsilon}]^{1-\sigma}}{1-\sigma}, where x is the housing services (x=h' for owners and x=d for renters). But it then calibrates \upsilon=0. With \upsilon=0 the formula in the paper (the inner part) cannot be used as written, since the exponent 1/\upsilon is undefined; it should be replaced by its limit as \upsilon goes to zero, which is a Cobb-Douglas. So the actual utility function becomes u(c,x)=\frac{[c^\theta x^{1-\theta}]^{1-\sigma}}{1-\sigma}
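Tricks (ii) and (iii) can be checked numerically. The Python sketch below is not the code from the repo and has not been compared against the pdf there. It assumes the Cobb-Douglas case, where maximizing c^\theta d^{1-\theta} subject to c+pd=cspend gives the standard constant-shares split, and it checks that the CES inner part approaches the Cobb-Douglas as \upsilon goes to zero. Function names and parameter values are made up.

```python
import numpy as np

def split_spending(cspend, p, theta):
    """Cobb-Douglas split of total spending cspend = c + p*d into c and d."""
    c = theta * cspend                  # share theta of spending goes to consumption
    d = (1.0 - theta) * cspend / p      # the rest buys housing services at price p
    return c, d

def ces(c, x, theta, upsilon):
    """Inner part of the utility function as written in the paper (needs upsilon != 0)."""
    return (theta * c**upsilon + (1.0 - theta) * x**upsilon) ** (1.0 / upsilon)

def cobb_douglas(c, x, theta):
    """Limit of the CES inner part as upsilon goes to zero."""
    return c**theta * x ** (1.0 - theta)

theta, p, cspend = 0.7, 1.5, 2.0        # hypothetical values
c_star, d_star = split_spending(cspend, p, theta)

# brute-force check of the split: search over c on a fine grid
c_candidates = np.linspace(0.001, cspend - 0.001, 20001)
best_c = c_candidates[np.argmax(cobb_douglas(c_candidates, (cspend - c_candidates) / p, theta))]

# the CES inner part gets close to Cobb-Douglas as upsilon shrinks
gap = abs(ces(c_star, d_star, theta, 1e-6) - cobb_douglas(c_star, d_star, theta))
```

Because the split is available in closed form, d never has to be a separate decision variable: cspend comes out of the budget constraint and c and d are recovered from it inside the return function.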

Some further comments:

I renamed two things: eta I rename z (the AR(1) process on earnings), and epsilonj I rename kappaj (the deterministic process on earnings).
The earnings seem a bit weird. The standard deviation of the initial earnings seems to be so large that everyone starts near edges of grid on z and then earnings dispersion decreases with age. This might be that I just misunderstand how exactly Chen2010 intends for initial earnings to be done.
Model has deterministic productivity growth g, this just means a simple renormalization of the model before solving. But I can’t be bothered so I just solve it as though g=0. (Intro to Life-Cycle Models explains what to do in model 22; and Intro to OLG Models in model 8.) [If anyone does this, I am happy to update repo based on your fix]
Paper Table 2 reports Gini of Financial Wealth. But I interpret financial wealth as asset a, and this can take negative values so the Gini is not well defined. Maybe paper just does Gini for positive a?
Codes require a pretty solid GPU. I can run it on my desktop with GPU with 8gb GDDR (gpu memory). But really it should be run with larger grids.
It is using divide-and-conquer “DC2B”, so it only does divide-and-conquer in the first endogenous state but does all the points in the second endogenous state. This will give the correct answer as long as the optimal choice of the first endogenous state is monotone in its current value (conditional on the other endogenous state, and on any decision and exogenous state variables).
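For intuition on why monotonicity matters, here is a one-dimensional Python sketch of divide-and-conquer. It is not the toolkit's DC2B implementation, only the basic idea for a single endogenous state, and the return matrix is a made-up consumption-savings example with a concave continuation value. Once the policy is known at two states, the optimal choice at any state between them must lie between those two policies, so only that slice is searched.

```python
import numpy as np

def solve_brute_force(F):
    """F[i_aprime, i_a] is the value of choosing a' at state a; search all a' for every a."""
    return F.argmax(axis=0)

def solve_divide_and_conquer(F):
    """Same argmax, exploiting that the optimal a' index is non-decreasing in a."""
    n_a = F.shape[1]
    policy = np.empty(n_a, dtype=int)
    policy[0] = F[:, 0].argmax()                            # lowest state: full search
    policy[-1] = policy[0] + F[policy[0]:, -1].argmax()     # highest state: search above policy[0]
    stack = [(0, n_a - 1)]
    while stack:
        lo, hi = stack.pop()
        if hi - lo < 2:
            continue                                        # no interior point left
        mid = (lo + hi) // 2
        lb, ub = policy[lo], policy[hi]                     # monotonicity bounds the search
        policy[mid] = lb + F[lb:ub + 1, mid].argmax()
        stack.append((lo, mid))
        stack.append((mid, hi))
    return policy

a_grid = np.linspace(0.0, 10.0, 201)
R, y, beta = 1.03, 1.0, 0.96                                # hypothetical parameters
c = R * a_grid[None, :] + y - a_grid[:, None]               # rows: a', columns: a
V_next = np.log(1.0 + a_grid)                               # hypothetical concave continuation value
F = np.where(c > 0, np.log(np.maximum(c, 1e-12)) + beta * V_next[:, None], -np.inf)

policy_bf = solve_brute_force(F)
policy_dc = solve_divide_and_conquer(F)
```

Both functions return the same policy here, but the divide-and-conquer version looks at far fewer a' candidates per state. If the true policy were not monotone, the restricted search could miss the optimum, which is why the monotonicity condition matters.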

PS. Trick (ii) I have seen before, so I knew the trick and just had to work through the algebra to derive the formula. But I only realised (iii) after the results looked stupid. I don’t think paper mentions either of them (that I saw).
PPS. The two endo state divide-and-conquer was only implemented yesterday (which is why I wanted to try it out on this model today). Should get some public documentation later this month.

@jake88

jake88 · September 6, 2024, 2:45pm

Awesome, thanks @robertdkirkby!

robertdkirkby · September 11, 2024, 12:35am

I fixed the GE conditions in the Chen (2010) codes to deal with the rental housing coming out of assets: K’=A’-H^r (1 − p).

But I think there is a typo in the Chen (2010) appendix. Reading section 2.7 about the financial intermediary, it reads like the financial intermediary takes assets A and turns it into a combination of rental housing services Hr and physical capital K. This agrees with Appendix 1 7c equation for “Asset Markets Clear”. Where I think there is an error is equation A.3 in Appendix 1, for “Housing Markets Clear”, which reads H=Hr+H0, all three of which are defined on the same page. But definition for H is H=\int h'(s) d\mu and definition of H0 is H0=\int_{h'>0} h'(s)d\mu, since h'\geq0, it follows that H=H0 (as H is just H0 plus some zeros). This would leave Hr=0, which is clearly stupid and in conflict with the rest of the model. Thus I think the “Housing Markets Clear” should just be erased entirely (it would be H=H0 but this is just degenerate, so just delete the equation as it is trivially satisfied); this makes sense as based on Section 2.7 the rental housing is not coming out of housing, it is coming out of assets. [The housing markets clear eqn was not used in my codes anyway, so no difference there.]