Age-dependent health shocks

( Sorry for the spam :smiley:

I was looking at this interesting paper on health risk and optimal tax progressivity:

Juergen Jung, Chung Tran, Health Risk, Insurance, and Optimal Progressive Income Taxation, Journal of the European Economic Association , Volume 21, Issue 5, October 2023, Pages 2043–2097, Health Risk, Insurance, and Optimal Progressive Income Taxation | Journal of the European Economic Association | Oxford Academic

There are two idiosyncratic shocks, z_n and z_h. The income shock z_n is not age-dependent, the health shock z_h depends on age and on a permanent income type and affects both working-age and retired individuals. Can the toolkit handle this specification and, if so, is there any available example? (The fact that the health shock depends on the income type is not really needed)

EDIT
I ran the example LifeCycleModelA5 but I got an error reported below. However I understood the logic and I think I can do the following. I define both shocks as age-dependent (even though only the health shock is age dependent) thus getting

zn_grid_J, pi_zn_J for the income shock

and

zh_grid_J, pi_zh_J for the health shock

Then I combine the two shock by taking the kron product, thus obtaining

pi_z_J = kron(pi_zh_J,pi_zn_J)

(But is the kron product defined for arrays with 3 dimensions?)

and

z_grid = [zn_grid_J; zh_grid_J]

Then in the return function I can write

function F = ReturnFn(aprime,a,zn,zh,parameters)

Do you think it is a good approach or would you recommend something different? Thanks in advance!

Here is the error

Warning: LifeCycleProfiles_FHorz_Case1 has changed the order of the fourth and fifth inputs (should now be
something like Params,[] when previously it would have been [],Params) 
> In LifeCycleProfiles_FHorz_Case1 (line 23)
In LifeCycleModelA5 (line 315) 
Warning: Annoying, but it makes inputs to LifeCycleProfiles_FHorz_Case1 have same order as those of
similar functions
1 Like

Yep, easy one :smiley:

Essentially,
(i) n_z=[n1,n2]; with n1 the number of grid points for zn (income), and n2 the number of grid points for zh (health). Let N_j be the number of model periods.
(ii) create [zn_grid, pi_zn] for the (not age-dependent) income shock (sizes are [n1,1] and [n1,n1]; e.g. use Farmer-Toda method for an AR(1) to discretize)
(iii) create [zh_grid_J, pi_zh_J] for the (age-dependent) health shock (sizes are [n2,N_j] and [n2,n2,N_j])
(iv) combine the grids: z_grid_J=[zn_grid.*ones(1,N_j); zh_grid_J] (size will be [n1+n2,N_j]; note it is essentially a stacked-column vector for each age)
(v) combine the transition matrices:

pi_z_J=zeros(n1*n2,n1*n2,N_j);
for jj=1:N_j
    pi_z_J(:,:,jj)=kron(pi_zh_J(:,:,jj),pi_zn); % note, kron() in reverse order
end

(vi) Done! Just pass n_z, z_grid_J and pi_z_J as inputs to commands in the usual fashion (where you pass n_z, z_grid and pi_z in the simple examples).

Note: From the perspective of the toolkit/computation, at this point it is kind of silly not to just use age-dependent income shocks as well (although obviously that has higher data needs).

PS. I need to overhaul the appendix about exogenous shocks. The toolkit is a bit nicer and smoother now than when it was written.

2 Likes

Great explanation, thanks!

EDIT
Comment 1
I think when you write

pi_z_J=zeros(n1+n2,n1+n2,N_j);

you actually mean

pi_z_J=zeros(n1*n2,n1*n2,N_j);

correct?

Comment 2

In the example LifeCycleModelA5 I found the comment below. This is no longer necessary with the updated toolkit…

% To use exogenous shocks that depend on age you have to add them to vfoptions and simoptions

%vfoptions.z_grid_J=z_grid_J; % Note: naming of vfoptions.z_grid_J has to be exactly as is.

%vfoptions.pi_z_J=pi_z_J; % Note: naming of vfoptions.z_grid_J has to be exactly as is.

%simoptions.z_grid_J=z_grid_J; % Note: naming of vfoptions.z_grid_J has to be exactly as is.

%simoptions.pi_z_J=pi_z_J; % Note: naming of vfoptions.z_grid_J has to be exactly as is.

% You then just pass a 'placeholder' for z_grid and pi_z, and the commands

% will ignore these and will only use what is in vfoptions/simoptions

%z_grid=z_grid_J(:,1); % Not actually used

%pi_z=pi_z_J(:,:,1); % Not actually used

Let me write here the model I have in mind to understand if the toolkit is able to handle it

  1. Two decision variables, labor supply d1 and health effort d2.
  2. One endogenous asset, savings, a.
  3. Two shocks, let’s call them zn (labor productivity) and zh (health shock) which are age-dependent.
  4. A permanent type that we call theta_i for no-college and college.
  5. The second shock, health status, zh is semi-exogenous. This means that the transition matrix P(zh,zh') depends on health effort which is the second decision variable, d2.
  6. The transition matrix for health status, P(zh,zh'), depends not only on health effort (see 5.) but also on the permanent type theta_i. The idea is that college-educated individuals are more likely to be healthy.
  7. The survival probability depends on age zh.

I think 1-2-3-4 are entirely standard with the toolkit. 5 should be feasible given the example with fertility and our example with search. I was wondering about 6 and 7.

Thanks a lot in advance for any help!

Comment 1: you are right, have fixed this
Comment 2: you now just pass z_grid_J as an input in any place you would otherwise put z_grid, this is an improvement to the old way the toolkit used to work which was that you had to put z_grid_J into vfoptions. [Unimportant note: whether you call it z_grid of z_grid_J is irrelevant internally, what the toolkit does is look at the size and shape of the thing you input and based on this figures out if it depends on model period j or not]

1 Like

Oops, forgot about the two different health shocks part. Happily it is also easy as it is just different permanent types.

You can do this using permanent types. First set up Names_i, for example,
Names_i={'ondeathsdoor','healthnut'}
as the two names of the two different permanent types (you probably will want to use less ridiculous names than those I am using here :slight_smile: )

Then all you do is to create

z_grid_J.ondeathsdoor
pi_z_J.ondeathsdoor
z_grid_J.healthnut
pi_z_J.healthnut

[the process for creating these will be like described earlier]

The way Names_i works, you can essentially create any input as a structure with fields using those names (here the grid and transition matrix for the z exogenous state, but could be vfoptions, the ReturnFn, a specific parameter, etc), and internally the toolkit does: (i) if the input does not depend on Names_i then it is passed to all permanent types, (ii) if the input does depend on Names_i then it is only passed to the permanent type of that specific name.

Note, you can even set n_z.ondeathsdoor and n_z.healtnut to differ if you wanted to.

When setting up anything that depends on permanent type in this way, you just set X.permanenttypename to be the same kind of object that X would normally be (so z_grid_J.healthnut will be set up as a standard kind of z_grid_J)

1 Like

1-6 is all easy. 7 can be done.

You set up zn as an exogenous z markov state. You set up zh as a semi-exogenous semiz state.

You set up d1_labor and d2_health as two decision variables; the decision variable that matters for the semi-exogenous state transitions must be the ‘last’ decision variable.

You can set up z_grid_J and pi_z_J to depends on permanent type using Names_i, you can do the same with semi-exogenous shocks (either be setting up vfoptions.SemiExoStateFn.healthnut, or by setting up vfoptions.healthnut so that this can differ by permanent type; of course it may be that you just use the same SemiExoStateFn function, but have different parameters for the function for each permanent type which is very easy to set up, depends what you are after).

The survival probability is trickier. The toolkit does not have a clean way to set this up, but you can easily incorporate it. What you can do is add a semi-exogenous state into your ‘health’ states that represents ‘dead’, and then set the probability of going to this, and make it so that this state just always gives ‘return fn value’ of zero (then the discount factor parameter in the model will just be a patience parameter beta, rather than beta*sj, as survival will be handled in the transition probability to the ‘dead’ state). [The reason I describe this as ‘trickier’ is that you have to be very careful with how you do the FnsToEvaluate, although using the simoptions.conditionalrestrictions should make it easier]

PS. You could also calibrate the whole model to some target moments using CalibrateLifeCycleModel_PType()

1 Like

Thank you so much for your detailed answer, Rob!

To keep things simple I will abstract from the permanent type affecting the health state. The importat part is that the health shock zh is both age-dependent and semi-exogenous. I was wondering how to do this. I have to create zn_grid_J and pi_zn_J (even though zn does not depend on age)? But then I cannot create pi_zh_J since zh is semi-exogenous…

In the mail file I declare

% Note that z = [zn,zh], where only zh is semi-exogenous and age-dependent
n_z = [7,2]; % 7 points for zn, 2 points for zh
% Generate zn_grid_J with size [n_z(1),N_j] and pi_zn_J with size [n_z(1),n_z(1),N_j]
% Since zn is not age-dependent, the age dimension is redundant
% Generate zh_grid_J with size [n_z(2),N_j] 
vfoptions.n_semiz    = n_z(2);
vfoptions.semiz_grid = zh_grid_J;
%no of decision d variables relevant for semi-exo z=[zh]. We have d=[effort]
vfoptions.numd_semiz = 1; 

Then I create the function

function [prob] = f_SemiExoStateFn(zh,zh_prime,d_effort,jj,p_bb,p_gb,a)
% zh is *SEMI-EXOGENOUS* and *age-dependent*
% zh:       Semi-exogenous health status today
% zh_prime: Semi-exogenous health status tomorrow
% d:        Decision variable, effort
% jj:       Age (from 1 to N_j)

% By how much exerting effort reduces the prob of bad health. If no effort,
% this is 1.
decrease = exp(-a*d_effort);

% THIS IS WHAT I WANT TO DO
% % Build transition matrix for health, given age=jj and given d
% % prob of leaving the bad (state=1) state
% pi_zh_j(1,:) = [p_bb(jj)*decrease,1-p_bb(jj)*decrease];
% % prob of leaving the good (state=2) state
% pi_zh_j(2,:) = [p_gb(jj)*decrease,1-p_gb(jj)*decrease];

if zh==0
    % Bad health today
    if zh_prime==0
        % bad health tomorrow
        prob = p_bb(jj)*decrease;
    elseif zh_prime==1
        % good health tomorrow
        prob = 1-p_bb(jj)*decrease;
    end
elseif zh==1
    % Good health today
    if zh_prime==0
        % bad health tomorrow
        prob = p_gb(jj)*decrease;
    elseif zh_prime==1
        prob = 1-p_gb(jj)*decrease;
    end
end %end if zh

end %end function

To make sure that the transition probabilities for zh are age-dependent, I pass also the parameter agej to the function f_SemiExoStateFn. Given agej and d_effort, and given zh and zh_prime, this function will spit out the scalar variable prob. This prob tells me the conditional probability of going from zh to zh_prime, given age and effort.

Problem: I think I cannot pass the vectors p_bb and p_gb as inputs to f_SemiExoStateFn, given how arrayfun works. This wouldn’t be a problem if zh was not semi-exogenous. In that case, I would define pi_zh_J as shown below. But now I can’t do that.

pi_zh_J   = zeros(n_zh,n_zh,N_j);
% Build state vector and transition matrix for health
for jj=1:N_j
    % prob of leaving the bad (state=1) state
    pi_zh_J(1,:,jj) = [p_bb(jj),1-p_bb(jj)];
    % prob of leaving the good (state=2) state
    pi_zh_J(2,:,jj) = [p_gb(jj),1-p_gb(jj)];
end

I cannot pass every individual element of the vectors p_bb and p_gb to f_SemiExoStateFn since there are too many. Can I create pi_zh_J as a 4-dim array (zh,zh_prime,age,d) ?
Another solution would be to parametrize p_bb and p_gb using a low order polynomial in age…

UPDATE

I actually solved one of the problem described above, namely I parametrized p_bb and p_gb with a low-order polynomial in age, so I have to pass only 4 parameters (p_gb_0,p_gb_1,p_bb_0,p_bb_1).

However, I’m still confused about how to declare the second shock as both age-dependent and semi-exogenous. What I did is the following:

n_zn = 7; % Markov shock
% Create zn_grid and pi_zn using Tauchen or variants
n_zh = 2; % Health shock: semi-exo and age-dependent
% Create zh_grid but not pi_zh yet. See below
vfoptions.SemiExoStateFn = @(zh,zh_prime,d,agej,a,p_gb_0,p_gb_1,p_bb_0,p_bb_1) ...
    f_SemiExoStateFn(zh,zh_prime,d,agej,a,p_gb_0,p_gb_1,p_bb_0,p_bb_1);
% --- Standard Markov shock
n_z        = n_zn;
z_grid     = zn_grid;
pi_z       = pi_zn;
% --- Semi-exogenous and age-depedent shock
n_semiz    = n_zh;
semiz_grid = zh_grid;
vfoptions.n_semiz    = n_semiz;
vfoptions.semiz_grid = semiz_grid;
vfoptions.numd_semiz = 1; % no of decision d variables relevant for semi-exo z

% We also need to tell simoptions about the semi-exogenous states
simoptions.n_semiz        = vfoptions.n_semiz;
simoptions.semiz_grid     = vfoptions.semiz_grid;
simoptions.SemiExoStateFn = vfoptions.SemiExoStateFn;
simoptions.numd_semiz     = vfoptions.numd_semiz;

I also show below my function

function [prob] = f_SemiExoStateFn(zh,zh_prime,d_effort,agej,a,p_gb_0,p_gb_1,p_bb_0,p_bb_1)
% zh is *SEMI-EXOGENOUS* and *age-dependent*
% ------------------------------------------
% INPUTS
% zh:       Semi-exogenous health status today
% zh_prime: Semi-exogenous health status tomorrow
% d:        Decision variable, effort
% jj:       Age (from 1 to N_j)
% a,p_gb_0,p_gb_1,p_bb_0,p_bb_1: Parameters
% OUTPUT
% prob:     Prob(zh_prime|zh), given d and age
% ------------------------------------------

prob = -inf; % Just a placeholder (one that will cause errors if not overwritten)

% By how much exerting effort reduces the prob of bad health. If no effort,
% this is 1.
decrease = exp(-a*d_effort);

% THIS IS WHAT I WANT TO DO
% % Build transition matrix for health, given age=jj and given d
% % prob of leaving the bad (state=1) state
% pi_zh_j(1,:) = [p_bb(jj)*decrease,1-p_bb(jj)*decrease];
% % prob of leaving the good (state=2) state
% pi_zh_j(2,:) = [p_gb(jj)*decrease,1-p_gb(jj)*decrease];

% Prob of going from good health to bad, as a polynomial in age
p_gb = p_gb_0 + p_gb_1*agej;
% Prob of going from bad health to bad, as a polynomial in age
p_bb = p_bb_0 + p_bb_1*agej;

if zh==0
    % Bad health today
    if zh_prime==0
        % bad health tomorrow
        prob = p_bb*decrease;
    elseif zh_prime==1
        % good health tomorrow
        prob = 1-p_bb*decrease;
    end
elseif zh==1
    % Good health today
    if zh_prime==0
        % bad health tomorrow
        prob = p_gb*decrease;
    elseif zh_prime==1
        prob = 1-p_gb*decrease;
    end
end %end if zh

end %end function

My understanding: you want SemiExoStateFn() —which is the transition probabilities of the semi-exogenous state— to depend on age.

This can be done by making parameters that are input to the SemiExoStateFn() depend on age. When you write SemiExoStateFn() you just treat them as scalar inputs, e.g., you just input p_bb and p_gb. Then in parameters you set up Params.p_bb and Params_p_gb as being N_j-by-1 vectors (so age dependent parameters; can by 1-by-N_j, doesn’t matter). Internally the toolkit then recognise that p_bb and p_gb depend on age and use the value for that age when creating the semi-exogenous transition probabilities for that age.

Note that this is exactly how you use age dependent parameters in the return function, or as survival probabilities. You just tell toolkit to use ‘sj’, and make Params.sj an N_j-by-1 vector, the toolkit then realises it should use the value of sj for the current period everywhere it comes across sj.

So what you write is largely fine, just delete the (jj) from inside your f_SemiExoStateFn(), as the toolkit will essentially be doing that for you whenever it actually calls f_SemiExoStateFn().

You could parametrize p_bb and p_gb as low order polynomial in age. In this case you need to pass the 3 (pretend it is a quadratic polynomial, so 3 coeffs) parameters and also an age-dependent parameter that indicates the age.

PS. Internally, creating matrices on the gpu is done using arrayfn(). If you try to pass a 2-by-1 parameter, it just ends up creating this as an extra dimension of the matrix. Parameters should not be matrix dimensions and this is why all parameters must be scalar eventually (if you put in age-dependent or permanent-type-dependent parameters, then internally just the scalar that corresponds to the current age/ptype gets pulled out and passed to arrayfn() ). Hence why VFI Toolkit requires all parameters be scalar (after accounting for age or ptype dependence).

1 Like

Thanks a lot! Actually, I am writing a new post then I noticed your reply :slight_smile:

I figured out that it’d be easier to parametrize the transition rates with a polynomial, since this is what I will do with the data in any case.

1 Like

I am a bit confused regarding the ordering of the variables

Should it be (aprime,a,z,semiz) or (aprime,a,semiz,z)? On page 48, footnote 51 of lifecyclemodels pdf it says

The ordering of inputs for any functions to evaluate are the same, (d, aprime, a, z, semiz, e, …).

However, in IntroToLifeCycleModels/LifeCycleModel29_ReturnFn.m at main · vfitoolkit/IntroToLifeCycleModels · GitHub it shows a different ordering

My understanding is that the ordering should be

(d,aprime,semiz,z,e)

1 Like

It should say that for ReturnFn and FnsToEvaluate you use the ordering (…,semiz,z,…)

I have fixed the typo in pdf. Thanks for spotting!

[when it was originally coded I used z,semiz, but then I had to reverse them everywhere so that I could implement taking advantage of the Tan improvement in this situation. I missed updating that part in the docs]

1 Like

Great, thanks! I think there is a typo also in the matlab file lifecycle29 (fertlity example),

see this comment:

%The semi-exogenous states must be included in the return fn, fns to evaluate, etc., as the last 'z' variables.

Small comment: perhaps for clarity I would also change the order of state variables in the Bellman equation. Now you write …,z,n1,n2,… where n1 and n2 are two semi-exogenous shocks. I would switch to …,n1,n2,z,… to make the math consistent with the algorithm.

I’m not sure I understand how to set up the survival probability depending on the health shock. It would be great if you could provide a simple example on how to do this :slight_smile:

Consider a very simple case where there is only one shock, z=[sick,healthy] and is standard, i.e. not semi-exogenous nor age-dependent.

The way I would set this up is like this

clear
clc
close all

myf = 'C:\Users\aledi\Documents\GitHub\VFIToolkit-matlab\VFIToolkit-matlab';
%myf = 'C:\Users\haomi\Documents\GitHub\VFIToolkit-matlab'; 
addpath(genpath(myf))

Params.J = 80;
N_j = Params.J;
Params.Jr = 45;
Params.r = 0.04;
Params.beta = 0.95;
Params.w = 1;
Params.kappa_j=[linspace(0.5,2,Params.Jr-15),linspace(2,1,14),zeros(1,Params.J-Params.Jr+1)];
dj=[0.006879, 0.000463, 0.000307, 0.000220, 0.000184, 0.000172, 0.000160, 0.000149, 0.000133, 0.000114, 0.000100, 0.000105, 0.000143, 0.000221, 0.000329, 0.000449, 0.000563, 0.000667, 0.000753, 0.000823,...
    0.000894, 0.000962, 0.001005, 0.001016, 0.001003, 0.000983, 0.000967, 0.000960, 0.000970, 0.000994, 0.001027, 0.001065, 0.001115, 0.001154, 0.001209, 0.001271, 0.001351, 0.001460, 0.001603, 0.001769, 0.001943, 0.002120, 0.002311, 0.002520, 0.002747, 0.002989, 0.003242, 0.003512, 0.003803, 0.004118, 0.004464, 0.004837, 0.005217, 0.005591, 0.005963, 0.006346, 0.006768, 0.007261, 0.007866, 0.008596, 0.009473, 0.010450, 0.011456, 0.012407, 0.013320, 0.014299, 0.015323,...
    0.016558, 0.018029, 0.019723, 0.021607, 0.023723, 0.026143, 0.028892, 0.031988, 0.035476, 0.039238, 0.043382, 0.047941, 0.052953, 0.058457, 0.064494,...
    0.071107, 0.078342, 0.086244, 0.094861, 0.104242, 0.114432, 0.125479, 0.137427, 0.150317, 0.164187, 0.179066, 0.194979, 0.211941, 0.229957, 0.249020, 0.269112, 0.290198, 0.312231, 1.000000]; 
Params.sj = 1-dj;

%% Grids
n_d = 0;
d_grid = [];
n_a = 600;
a_grid = linspace(0,50,n_a)';

n_z = 2;
z_grid = [0.1,1]';
pi_z = [0.5,0.5;0.5,0.5];

%% Return function
DiscountFactorParamNames={'beta','sj'};

% Notice we still use 'LifeCycleModel8_ReturnFn'
ReturnFn=@(aprime,a,z,w,r,kappa_j) myReturnFn(aprime,a,z,w,r,kappa_j);

%% VFI
vfoptions.verbose = 1;
[V, Policy]=ValueFnIter_Case1_FHorz(n_d,n_a,n_z,N_j, d_grid, a_grid, z_grid, pi_z, ReturnFn, Params, DiscountFactorParamNames, [], vfoptions);

%% Distribution
Params.mewj=ones(1,Params.J); % Marginal distribution of households over age
for jj=2:length(Params.mewj)
    Params.mewj(jj)=Params.sj(jj-1)*Params.mewj(jj-1);
end
Params.mewj=Params.mewj./sum(Params.mewj); % Normalize to one
AgeWeightsParamNames={'mewj'};

jequaloneDist=zeros(n_a,n_z); % Put no households anywhere on grid
% All agents start with zero assets, 5% are sick and 95% are healthy
jequaloneDist(1,:)=[0.05,0.95];

simoptions.verbose = 1;
StationaryDist=StationaryDist_FHorz_Case1(jequaloneDist,AgeWeightsParamNames,Policy,n_d,n_a,n_z,N_j,pi_z,Params,simoptions);

mu_a = sum(StationaryDist,[2,3]);

plot(a_grid,mu_a)

and this is the return function

function F = myReturnFn(aprime,a,z,w,r,kappa_j)


cons = w*kappa_j*z+(1+r)*a-aprime;

F = -inf;
if cons>0
    F = log(cons);
end

end

Now the question is how I impose, say, that sj if sick (z=z(1)) is smaller than sj if healthy (z=z(2)). If I coded this myself I would define sj as an array with size [n_z,N_j]
and

sj(2,:) = as before
sj(1,:) = 0.9*sj(2,:);

Thanks!

made the two changes to Life-Cycle Model 29 you suggested. thanks!

1 Like

Let pd be probability of death.

The trick is that in standard bellman eqn we have (near the right hand end) a term:
+ (1-pd) * beta * E[V']
Notice that we could rewrite this as:
+ (1-pd) * beta * E[V'] + pd * beta * 0

In the first version we are passing pd as a discount factor (so the discount factor is (1-pd) * beta).

In the second version we implement this instead just using an exogenous state called ‘death’ and have that the return value for ‘death’ is zero (so the discount factor is just beta). The (1-pd) term is now applied as a probability, instead of as a discount factor [pd is probability of transitioning to ‘death’, 1-pd is probability of transitioning to a different (alive) state]. Obviously in the second version we can rewrite this as beta * E_d[V’], where E_d is understood as the expectations operator E combined with an expectation over the pd (this is awkward wording).

So in the code, you just set DiscountFactorParamNames={‘beta’}, and then deal with the survival probabilities via the state transition probabilities, plus an extra state value called ‘death’ which has a return value of 0. In your toy example you thus need z=[sick,healthy,death].

1 Like

Ok but how do I make the sj (or pd) term depend on z?

say I set
z_grid=[alive,dead]
then
pi_z=[1-dj, dj; 0, 1]

(obviously this is not the actual grid, but it gives the idea)

or a bigger example
z_grid=[healthy, sick, dead]
then

pi_z=[h*(1-dh), (1-h)*(1-dh), dh;
          (1-s)*(1-ds), s*(1-ds), ds;
         0,0,1]

where h is probability of remaining healthy (conditional on staying alive), s is probability of remaining sick (conditional on staying alive), and dh and ds as the two probabilities of dying (and you would set ds>dh). So, e.g., the second row first column is the probability of transitioning sick–>healthy, which is (1-s) [that you are no longer sick, conditional on staying alive] times (1-ds) [that you stay alive].

(I find it easiest to think about first setting up the top-left 2x2 for the probabilities given you stay alive, then expanding this into the full thing by multiplying it by the probabilities of staying alive (which here differ by row) and then put the last column and last row in which are fairly simple to write out)

1 Like