Why do the elderly save? The role of medical expenses

Is there any Matlab replication with toolkit of the paper “Why do the elderly save?” by De Nardi, French and Jones? I found the replication code at one author’s website:

But it is written in C and I struggle quite a bit to understand it.
I have now got access to AWS so I should be able to run even the more complicated life-cycle models

2 Likes

Largely irrelevant comment: I’m surprised they used C instead of Fortran

DeNardi, French & Jones (2010) is on my to-do list (now that VFI Toolkit does GMM estimation of life-cycle models it was something that was a clear goal for me). I didn’t realise there were replication materials, and now that I know I will give it a go (saves me doing the data work which would have been hard work, the replication materials appear to contain all the data moments you need for the GMM).

As to why C, their ReadMe point 4 says: “The codes solving for the value functions and simulating households’ histories are written in the C language. We use GAUSS for the econometrics. The GAUSS programs call the C programs, send them the necessary inputs - including parameter values and initial values of the state variables - and retrieve the simulated histories. The GAUSS programs then use the simulated histories and the data to compute the GMM criterion function.” But Gauss website says it can also call Fortran, so that cannot be the reason.

2 Likes

Thanks for your reply. If I wanted to try to replicate this paper on my own, is there a model on the toolkit website that I can use as a template?

1 Like

What you need to code DFJ2010 in VFI Toolkit is as follows:
Permanent types, 10 ptypes: two genders, and 5 income levels
One endogenous state: assets (not cash-in-hand as DFJ2010 switch to)
Two markov exogenous states: health, zeta (AR(1) on medical expenses)
One iid exogenous state: xi (i.i.d on medical expenses)
[note, zero decision variables]

DFJ2010 have lots of parameters that depend on age and permanent type, e.g yn(g,h,I,t),m(g,h,I,t), sigma(g,h,I,t). If these didn’t depend on h they would be easy, just age-conditional parameters that differ by ptype. But the dependence on h, which is a markov z state complicates things. Since h only takes two values, I suggest setting them up with each as two (h=0 and h=1) age-conditional parameters that differ by permanent type.

That the transitions of h depend on (g,I,t) is easy (age-dependent exogenous markovs, but then differ by ptype). [eqn 5 of DFJ2010]

The awkward part for VFI Toolkit is that the conditional survival probabilities depend on h (the markov x state). VFI Toolkit does not handle state-dependent discount factors (which is what conditional survival probabilities effectively are). Instead you have to add an additional “dead” grid point to the health state and handle the conditional survival probabilities based on the transition probabilities to this state (and in that state, return fn evaluates to F=0) [dead state as conditional survival probabilities]. Actually, because of the warm-glow of bequests you need to go even further and add two grid points to health state, ‘death’ which is the period you die and gives the warm-glow of bequests, and ‘dead’ which returns 0 [when in ‘death’ you transition with probability one to ‘dead’]; this is because the warm-glow has to depend on the conditional survival probability, which is here being controlled by transition probabilities of the health state.

This (two extra health state values) is computationally a bit wasteful, but the model is otherwise pretty simple so it will still compute easy fast enough to GMM estimate, and you can use simoptions.conditionalrestrictions to make all the model stats conditional on not being death/dead (GMM estimation works with conditional restrictions as target moments).

2 Likes

Thank you, I will study your answer carefully and start implementing the code

1 Like

Took all day, but here is the model. Most of the time was spent sorting out the parameters, this model has hundreds of parameters.

I have not checked outputs at all, but probably is good. Currently it just sets up and solves the value fn and policy fn.

Note: I am missing three parameter values ‘tautilde’ and ‘xtilde’ which are the estate tax rate and estate tax exemption, and ‘r’ which is the return on assets. DFJ2010 appear to import them from files which are not in the replication materials (not 100% sure yet, but cannot find them so far). I put placeholders in code.

Looking more closely at DFJ2010 the GMM estimation will be tricky. Most papers I have seen do some kind of ‘fixed effect for cohort’ and hence get one set of moments to target. This paper seems to instead target each of the four cohorts separately. I don’t full understand yet but I am guessing it might be possible if I set up cohort as another dimension of the permanent type? Anyway, will post again once I have a better grip on the estimation.

2 Likes

Added a pdf that explains/describes the model (to same github repo as codes).

I did a quick/rough count and there are 2000+ parameters (because of four parameters that depend on age (31), gender (x2), and income quintile (x5, so 300+); three of the four further depend on health (x2, so 600)). [3x600+300>2000 :exploding_head:]

2 Likes

Thanks, this helps a lot! A lot of codes for me to study!

1 Like

I’ve been reading DFJ2010 more closely, in terms of the GMM estimation (SMM in their case).

The estimation is standard. You require three things to implement:
(i) the initial distribution of agents (they use the 1996 empirical distribution, but appear add some measurement error; see from line 450 in ‘getdat25_wave8.g’)
(ii) the target moments
(iii) the variance-covariance matrix of the target moments
I cannot find any of these three in their materials. I looks like the 1996 data distribution is imported by the gauss codes, but I don’t think that the file that would be imported is included. Nor can I find (ii) and (iii), my impression is that these might be calculated as part of ‘gmmcrit25.g’ which has the GMM criterion around line 1097 ‘mmtmtx = mmtmtx~mmtmtx_m;’ being the difference between data and model moments, and appears to calculate the data moments ‘mmtmtx’ earlier in the same code.

Without these I am unlikely to try and implement the GMM estimation itself, but from reading DFJ2010 closesly I am pretty sure that with these you can do the estimation using the standard VFI Toolkit commands.

1 Like

The code in your github are very useful! I was lost in going through the C code but the toolkit makes things much easier.
Question: I would like to generate the model data for median assets by cohort, permanent income quintile and obviously age (see figures 1 etc of the paper). Not a toolkit pro, but I think I should first simulate a panel of households in the model using the policy functions compute by the vfiter command. Am I on the right direction? Could you please provide some guidance or point me to some existing code? Thanks!

1 Like

There are three ways to do this, the second one below being what you describe.
(i) since ‘permanent income quintile’ is actually just a permanent type in the model, you can just use “LifeCycleProfiles_FHorz_Case1_PType()” and this will report the median conditional on age and permanent type; this requires StationaryDist as an input.
(ii) you can simulate panel data using “SimPanelValues_FHorz_Case1()” (and same “_PType”). You can see an example in Life-Cycle Model 13. Then you just calculate the stats like you would off of any panel data.
(iii) you can calculate the stats using the ‘value of the FnsToEvaluate on the grid’ plus the weights on the grid which is of course just the ‘StationaryDist’. This would involve using the “EvalFnOnAgentDist_ValuesOnGrid_FHorz_Case1()” (and same “_PType”) command.

Note: before you can do any of this, you need the initial age j=1 distribution of agents. DFJ2010 says they used the 1996 distribution of agents as the initial dist, but I couldn’t find a copy of it anywhere in the materials. My (possibly incomplete) understanding is that the “cohorts” were just different splits of the initial distribution based on the age in the initial distribution, but I am not certain of this. Implementing this (being born at periods after j=1) in the toolkit is non-trivial.

PS. Creating a simulation manually would involve using a random number generator to determine the evolution of the exogenous shocks, together with using the policy function to determine the evolution of the endogenous states. Everything else is just a function of the endogenous and exogenous states, so once you simulate these you can directly calculate everything else.

2 Likes

I tried (ii) and I followed Life-Cycle Model 13. After doing the value function iteration, I appended this commands to your script:

simoptions=struct(); % Use the default options

FnsToEvaluate.assets=@(h,aprime,a,z) a; % a is the current asset holdings

% To simulate panel data we will set the number of time periods

simoptions.simperiods=N_j; % N_j is the default value

simoptions.numbersims=10^3; % 10^3 is the default value

% To simulate panel data you have to define ‘where’ an individual household

% simulation starts from which we call InitialDist below (and from which

% starting points will be drawn randomly)

InitialDist=StationaryDist(:,:,1);

SimPanelValues=SimPanelValues_FHorz_Case1_PType(InitialDist,PTypeDistParamNames,Policy,FnsToEvaluate,Params,n_d,n_a,n_z,N_j,Names_i,d_grid,a_grid,z_grid,pi_z, simoptions);

For now I do only assets to test the commands. Problem is that I don’t know how to define a few missing variables for the simpanel command

  • InitialDist: I would like everyone to start with zero assets at age=1. How do I do this?
  • PTypeDistParamNames: How do I set up this veriable?
  • pi_z is not found

Thanks!!

To create InitialDist you want roughly,

InitialDist=zeros([n_a,n_z,n_e],'gpuArray');
InitialDist(1,1,8,4)=1;

This points everyone in first grid point on assets (which is zero assets), first grid point on health (which is good), the middle grid point on zeta, and the middle grid point on xi. Note, more generally, this is just a distribution over the grid points on (a,z,e) at age j=1, and the masses should sum to one. [Given DFJ2010 is about retirement, obviously most people should start with some assets, not zero like above, but is just to illustrate what InitialDist involves.]

PTypeDistParamNames is just a name of where in the parameter structure to find the masses of the different permanent types (which should sum to one). So something like

PTypeDistParamNames={'ptypemasses'};
Params.ptypemasses=1/10*ones(1,length(Names_i));

which puts all quintiles at equal mass (so 1/5) together with half of each gender (so 1/2), and together means 1/10 of each permanent type.

pi_z was not found because you are calling it pi_z_J instead [I just use the _J notation to remind myself that it depends on age, VFI Toolkit can handle that there is or is not a dependence on age, it figures out which based on the size]

PS. The way DFJ2010 do things, you would have to set InitialDist to match the ‘1996 distribution’ that they get from the data (and same for the distribution of masses of the permanent types which is probably not perfectly 1/10 of each in data).

PPS. It is possible to put the permanent type weights into InitialDist by adding permanent type as an extra dimension (in which case you will see a warning that those from PTypeDistParamNames are being ignored because the InitialDist already has weights which are going to be used). Alternatively, you can set up InitialDist.ptype001, InitialDist.ptype002, etc. and get dependence on permanent type that way, as you would for getting anything else to depend on permanent type (in this case each of them should be mass 1, and PTypeDistParamNames is used for the relative masses). I expect that the InitialDist in the 1996 data used by DFJ2010 the distributions are different by each permanent type.

PPPS. In general, InitialDist will be the same size as the value function, except without the N_j dimension (and without the N_i dimension if the InitialDist does not depend on permanent type).

2 Likes

I followed your tips (thanks!!) but I got this error:

Error using gpuArray/reshape
Number of elements must not change. Use as one of the size inputs to automatically calculate the appropriate
size for that dimension.

Error in KronPolicyIndexes_FHorz_Case1 (line 22)
Policy=reshape(Policy,[size(Policy,1),N_a,N_z,N_j]);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in SimPanelValues_FHorz_Case1 (line 221)
PolicyIndexesKron=KronPolicyIndexes_FHorz_Case1(Policy, n_d, n_a, n_z, N_j); % Create it here as want it both here and inside SimPanelIndexes_FHorz_Case1 (which will recognise that it is already in this form)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in SimPanelValues_FHorz_Case1_PType (line 242)
SimPanelValues_ii=SimPanelValues_FHorz_Case1(InitialDist_temp,Policy_temp,FnsToEvaluate_temp,Parameters_temp,FnsToEvaluateParamNames_temp,n_d_temp,n_a_temp,n_z_temp,N_j_temp,d_grid_temp,a_grid_temp,z_grid_temp,pi_z_temp, simoptions_temp);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in DeNardiFrenchJoines2010 (line 694)
SimPanelValues=SimPanelValues_FHorz_Case1_PType(InitialDist,PTypeDistParamNames,Policy,FnsToEvaluate,Params,n_d,n_a,n_z,N_j,Names_i,d_grid,a_grid,z_grid,pi_z_J, simoptions);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

My code is this:

simoptions=struct(); 
simoptions.simperiods=N_j; 
simoptions.numbersims=1000; 

InitialDist=zeros([n_a,n_z,n_e],'gpuArray');
InitialDist(1,1,8,4)=1;

PTypeDistParamNames = {'ptypemasses'};
Params.ptypemasses = (1/10)*ones(1,length(Names_i));

SimPanelValues=SimPanelValues_FHorz_Case1_PType(InitialDist,PTypeDistParamNames,Policy,FnsToEvaluate,Params,n_d,n_a,n_z,N_j,Names_i,d_grid,a_grid,z_grid,pi_z_J, simoptions);

The rest of the code is the same as yours on github

You need

simoptions.n_e=vfoptions.n_e;
simoptions.pi_e=vfoptions.pi_e;
simoptions.e_grid=vfoptions.e_grid;

The simulation command doesn’t know there is an e variable because you haven’t told it, so it expects Policy to be a different size, hence the error.

[They were actually in the original code, but you overwrote them with simoptions=struct()]

1 Like

Thanks! My new code is

FnsToEvaluate.assets=@(aprime,a,h,zeta,xi) a;

simoptions.simperiods=N_j; 
simoptions.numbersims=1000; 

InitialDist=zeros([n_a,n_z,n_e],'gpuArray');
InitialDist(1,1,8,4)=1;

PTypeDistParamNames = {'ptypemasses'};
Params.ptypemasses = (1/10)*ones(1,length(Names_i));

SimPanelValues=SimPanelValues_FHorz_Case1_PType(InitialDist,PTypeDistParamNames,Policy,FnsToEvaluate,Params,n_d,n_a,n_z,N_j,Names_i,d_grid,a_grid,z_grid,pi_z_J, simoptions);

My results are strange because I get that assets by age have a weird spike at age 3 but they are zero after age 5

assets_age = mean(SimPanelValues.assets,2);
plot(1:N_j,assets_age)

Any idea why this is the case? Did I make some mistake in my code?

1 Like

Do you mind sharing your graph here, to see how it looks like?

Never mind, I ran Robert’s code with your code and I got this

This is indeed strange :frowning:
Not sure what is wrong after a quick look

1 Like

I will take a look, but currently on the road without gpu, so won’t be able to do so until April 20-something.

P.S. I did the value and policy fn without testing them which is always a dangerous game, so is possible the issue lies there.

2 Likes