Stationary age distribution

Shiki · August 1, 2025, 1:13pm

Hi Robert, I have an empirical question. I have seen many papers in which the authors use the stationary age distribution, \mu_{j+1}=s_j*\mu_j/(1+n). But if the benchmark model is calibrated using census or survey data, wouldn't it be natural to just set \mu_j to match the age distribution observed in the data?

robertdkirkby · August 2, 2025, 12:30am

TLDR: If you want population growth counterfactuals, you need the age weights \mu_j to depend on n. If you are not interested in population growth counterfactuals, taking the age weights from empirical data will lead to more accurate/realistic results.

If you are doing a life-cycle model, just using \mu_j to match the empirical age distribution makes sense (whether you get this from the same data you are already using, or you use the census data because it is likely more accurate; it depends on what exactly you want the model to ‘reproduce’). [Some papers don’t even need to say what \mu_j is because they only report age-conditional stats, so the \mu_j are not relevant.]

If you are doing an OLG model, you may want to think about how population growth matters for various things. In this case you want n, and \mu_{j+1}=s_j*\mu_j/(1+n) is the obvious/standard way to use it in the model. Of course, you can do an OLG model with just the empirical age distribution, but then you will have difficulty with any counterfactuals relating to n.
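As a rough illustration of why the formula matters for counterfactuals, here is a minimal MATLAB sketch that builds the stationary age weights from \mu_{j+1}=s_j*\mu_j/(1+n) and recomputes them under a different n. The variable names (N_j, sj, mewj), the choice to store sj with N_j-1 entries, and all the numbers are illustrative assumptions, not calibrated values or anything required by a particular toolkit.

```matlab
% Stationary age distribution: mu_{j+1} = s_j*mu_j/(1+n), normalised to sum to one.
% All numbers below are made-up illustrative values.
N_j = 5;                          % number of model ages (assumed)
sj  = [0.99, 0.97, 0.93, 0.85];   % s_j: prob. of surviving from age j to j+1 (assumed)
n   = 0.01;                       % population growth rate per model period (assumed)

mewj = ones(1, N_j);              % unnormalised mass, with age 1 set to one
for jj = 1:N_j-1
    mewj(jj+1) = sj(jj) * mewj(jj) / (1 + n);
end
mewj = mewj / sum(mewj);          % age weights sum to one

% Counterfactual: same survival probabilities, zero population growth
n_cf = 0;
mewj_cf = ones(1, N_j);
for jj = 1:N_j-1
    mewj_cf(jj+1) = sj(jj) * mewj_cf(jj) / (1 + n_cf);
end
mewj_cf = mewj_cf / sum(mewj_cf);

disp([mewj; mewj_cf])             % row 1: baseline n, row 2: counterfactual n
```

With lower n the weights shift towards older ages, which is exactly the channel that is unavailable if the \mu_j are fixed at their empirical values.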

In an OLG transition path, the current period \mu_{j,t} depends on the current and past values of n_t (which is now the growth rate of period 1 agents, and is no longer the same thing as the growth rate of the population). So it makes sense that we should not expect a real-world age distribution to match any single value of n.
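To make the dependence on past n_t concrete, here is a small MATLAB sketch under stated assumptions: cohort masses follow N_{1,t}=(1+n_t)*N_{1,t-1} and N_{j+1,t+1}=s_j*N_{j,t}, the economy starts in the stationary distribution for an old growth rate, and n_t then drops permanently. The names (N, n_path, mewj_path) and all numbers are hypothetical, chosen only for illustration.

```matlab
% Age distribution along a transition when the growth rate of age-1 agents changes.
% Illustrative values only.
N_j = 5;  T = 30;
sj  = [0.99, 0.97, 0.93, 0.85];   % survival from age j to j+1 (assumed)
n_old = 0.02;  n_new = 0.00;      % growth rate of newborn cohort, before/after (assumed)

N = zeros(N_j, T);                % N(j,t): mass of age j in period t
N(1,1) = 1;
for jj = 1:N_j-1                  % period 1: stationary distribution under n_old
    N(jj+1,1) = sj(jj) * N(jj,1) / (1 + n_old);
end

n_path = n_new * ones(1, T);      % n_t applies to the age-1 cohort from period 2 on
for tt = 2:T
    N(1,tt)     = (1 + n_path(tt)) * N(1,tt-1);   % new cohort
    N(2:N_j,tt) = sj(:) .* N(1:N_j-1,tt-1);       % survivors age by one period
end

mewj_path  = N ./ sum(N, 1);                      % age weights mu_{j,t}, columns sum to one
pop_growth = sum(N(:,2:T),1) ./ sum(N(:,1:T-1),1) - 1;   % growth of total population

disp(pop_growth(1:N_j))           % differs from n_new until old cohorts have died out
```

In this sketch the age weights only settle at the new stationary distribution once every living cohort was born under n_new, and until then total population growth is not equal to n_t.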

Note that in real-world data, the current age masses will depend not just on current and past s_j and n, but also on any immigration/emigration and the ages of the migrants.

Shiki · August 2, 2025, 6:44am

Thanks Robert. I want my benchmark OLG model to match some observed data by age group (e.g. saving rates at ages 20-30, 30-40…not an exact match, but as close as possible) and then examine the impact of some policies. For the OLG transition path, can I do the following: use the empirical age distribution as the starting equilibrium; then over the first 20 or 25 periods, the population shares converge to the stationary age distribution implied by n.

robertdkirkby · August 3, 2025, 1:05am

What you say seems fine, but there is perhaps a better option.

For a specific country there are likely population forecasts. You could do what you say, but if the point is what will happen in country X over the next 20 years, then it might make more sense to use the population forecasts for country X rather than converging them to the stationary distribution.

The correct answer as to how you should set up the population weights will, as is probably clear, depend on exactly what you want to do/ask with the model.

PS. For the codes to run just fine, whether or not the conditional survival probabilities match what happens in the population is not important (you might want them to match for conceptual reasons, but codes will run just fine either way).

Shiki · August 3, 2025, 7:38am

> if the point is what will happen in country X over the next 20 years, then it might make more sense to use the population forecasts for country X rather than converging them to the stationary distribution.

If I use VFI Toolkit to compute the transition path, then I should obtain the forecasted age structure for 20 years and manually type

ParamPath.mewj(1)=[forecasted value for 20 periods]
ParamPath.mewj(2)=[forecasted value for 20 periods]
…
ParamPath.a=a_new*ones(20,1) %Assume that I want to examine a policy that changes parameter a (and consider the demographic change).

Is it the correct way to proceed?

robertdkirkby · August 3, 2025, 9:01pm

Essentially, yes.

If you set up a parameter omega as ParamPath.omega to be N_j-by-T, or T-by-N_j, then VFI Toolkit will recognise the size and interpret it as an age-conditional parameter that varies over the transition path. As the codes loop internally over the transition path time period t they simply update Params.omega=ParamPath.omega(:,t), or equivalent, and use this for everything in that time period.

So your code has a minor typo: it should say ParamPath.mewj(1,:)=[forecasted value for 20 periods], where I just added the ,: but other than that all good. [It could equally be (:,1); which orientation you use is not important.]
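As a sketch of the shape described above, the following MATLAB snippet fills ParamPath.mewj as an N_j-by-T matrix from forecast head-counts and sets a constant path for the policy parameter a. The forecast numbers are made up, normalising each period's weights to sum to one is my assumption, and the loop at the end is only a stand-in for what the toolkit does internally (it does not call any toolkit function).

```matlab
% Build an age-by-time parameter path from (hypothetical) population forecasts.
N_j = 4;  T = 20;

% Made-up forecast head-counts: rows are age groups, columns are periods
basePop = [100; 90; 80; 60];
growth  = [0.99; 1.00; 1.01; 1.02];          % per-period growth of each age group (assumed)
forecastPop = basePop .* (growth .^ (0:T-1)); % N_j-by-T via implicit expansion

% Age weights in each period; each column sums to one (normalisation is an assumption)
ParamPath.mewj = forecastPop ./ sum(forecastPop, 1);

% Policy parameter held at its new value in every period of the path
a_new = 0.3;                                  % hypothetical value
ParamPath.a = a_new * ones(T, 1);

% Stand-in for the internal loop over transition periods described in the post
Params = struct();
for tt = 1:T
    Params.mewj = ParamPath.mewj(:, tt);      % age-conditional parameter for period tt
    Params.a    = ParamPath.a(tt);
    % ... period-tt computations would use Params here
end
```

Filling the whole matrix at once avoids typing one row per age, and the row-by-row form ParamPath.mewj(1,:)=... gives the same result.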

robertdkirkby · August 4, 2025, 1:27am

Btw, the new workshop on VFI Toolkit has a part on transition paths in OLG models that shows off some features that are not documented anywhere else yet. I haven’t made the video yet, but the slides are there along with the codes that go with them (the video should show up in the next month or so).

Shiki · August 5, 2025, 9:30pm

Thanks. These slides are very useful. One thing I would like more details on is the choice of the number of periods T in the transition path. Normally people use T=50-100 periods. If in a model agej represents 5 years (e.g. agej==1 means age 20-24), then should I set T to be a smaller number like 10 or 20 (since now 1 period = 5 years and 10-20 periods are equivalent to 50-100 years)?

robertdkirkby · August 5, 2025, 9:53pm

I tend to think of choosing T like choosing the number of lags to include in a VAR.

If you set T too small the answer will be wrong (biased in the VAR, probably just fail to compute in the transition path).

If you set T too big, it is just inefficient (lower statistical efficiency in the VAR because estimating ‘too many unnecessary’ parameters, slower to compute in the transition path).

How big T should be will differ across models. My advice is first solve once for your model with a big T, because you know this will work. Then look at the result and pick a T that is long enough for the model to be essentially converged to the final stationary general eqm, but not much longer than is needed for that to occur. This will give you a good T to use in practice as it is not too short (not so short that model fails to converge to final eqm and so transition path results are incorrect), but not so long that you waste compute and have unnecessarily high runtimes.
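One way to act on this advice can be sketched in MATLAB as follows. The price path here is a synthetic stand-in for the output of a long-T solve (in practice you would use the path from your own model), and the tolerance, the 20% safety margin, and all variable names are assumptions for illustration.

```matlab
% Pick T from a long transition path: find when the path has essentially converged.
T_big   = 200;                                 % deliberately large T for the first solve
r_final = 0.04;  r_init = 0.06;                % final and initial interest rate (made up)

% Synthetic stand-in for the general eqm price path from the long-T solve
r_path = r_final + (r_init - r_final) * 0.9 .^ (0:T_big-1);

tol = 1e-4;                                    % what counts as 'essentially converged' (assumed)
dev = abs(r_path - r_final);                   % distance from the final stationary eqm value

lastOutside = find(dev > tol, 1, 'last');      % last period still outside the tolerance
if isempty(lastOutside)
    lastOutside = 0;                           % path is within tolerance from the start
end
T_converged = lastOutside + 1;                 % from here on the path stays within tol
T_choice    = ceil(1.2 * T_converged);         % modest safety margin (assumed)

fprintf('Converged by period %d; suggested T = %d\n', T_converged, T_choice);
```

Using the last period outside the tolerance, rather than the first period inside it, guards against paths that overshoot and come back.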

What you say about models with a 5 year time period likely being able to have a smaller T than models with a 1 year time period makes sense. The real reason is that you would expect to need fewer time periods to converge to the final stationary general eqm and therefore a smaller T should be fine (rather than because of what the time period is per se).