Calculate Correlations (and Covariances) from values, weights, and transition probabilities

robertdkirkby · July 16, 2025, 11:41pm

Following is Matlab code demonstrating how to compute autocorrelation coefficients from matrices of values and weights, as well as the transition probabilities [so for a discrete markov process].


% Calculate autocorrelations from values, weights, and transition probabilities

% Idea: there are N points in state-space
% We have weights W and values V (on the state-space)
% We have transition probabilities P
% [In finite horizon, or just generally outside the a stationary eqm, we
% also have next period weights W2 and values V2. In a stationary eqm for
% infinite horizon model these will just be copies of W and V.]
%
% First section of code just sets all this up
% Second section of code shows how to calculate the correlation of the two
% periods. Done by first calculating the covariance and the standard
% deviations.

%% Setup
InfHorz=1;
HighCorr=0; % change P to make correlation high (only works for infinite horizon setup)

% Number of points in state space
N=2;

% We have function of the state space, and this period it takes values
V=10*rand(N,1);

% Transition matrix for the states is
P=rand(N,N);
P=P./(sum(P,2)); % normalize rows to 1
if HighCorr==1 % Force high correlation
    P=P+eye(N,N); % move mass towards diagonal
    P=P./(sum(P,2)); % normalize rows to 1
elseif HighCorr==2 % perfect correlation, so will get corr=1
    P=eye(N,N);
end


% Set up weights for this period and next period, and values for next period
if InfHorz==1 % Infinite horizon
    % Use the weights that correspond to the stationary dist of P
    dist=1;
    W=rand(N,1);
    W=W/sum(W); % total mass of 1
    while dist>0
        Wlag=W;
        W=P'*W;
        dist=max(abs(W-Wlag));
    end

    V2=V;
    W2=W;

    fprintf('double-check, following should be zero (or at least 10^-9 ish): %1.10f \n', max(abs(W-(P'*W))) )
else
    V2=10*rand(N,1);

    % Say the distribution over states this period is
    W=rand(N,1);
    W=W/sum(W); % total mass of 1
    % Then value over states next period is
    W2=P'*W;

end



%% Autocorrelation
% We want to calculate the autocorrelation
% Correlation(x,y)=Cov(x,u)/(stddev(x)*stddev(y))
% So first calculate the covariance and the two standard deviations

meanV=sum(W.*V);
meanV2=sum(W2.*V2);
stddevV=sqrt(sum(W.*(V-meanV).^2));
stddevV2=sqrt(sum(W2.*(V2-meanV2).^2));

% Calculate covariance between this period and next period values
Covar=(W.*V)'*P*V2 - meanV*meanV2;
fprintf('Covariance is: %2.4f  \n', Covar)

% Calculate the correlation
Corr=Covar/(stddevV*stddevV2);
fprintf('Correlation is: %2.4f  \n', Corr)


%% Double check by simulating the model and calculating the autocorrelation from this
S=10000; % number of simulations

% first, get pairs of indices for this period and next period
Wcum=cumsum(W);
Pcum=cumsum(P,2);
indexThisPeriod=zeros(S,1);
indexNextPeriod=zeros(S,1);
for ss=1:S
    draw=rand(1,1);
    [~,ii]=max(Wcum>draw,[],1);
    indexThisPeriod(ss)=ii;
    draw2=rand(1,1);
    [~,ii2]=max(Pcum(ii,:)>draw2,[],2);
    indexNextPeriod(ss)=ii2;
end
Vsim=V(indexThisPeriod);
V2sim=V2(indexNextPeriod);

CovarSim=cov(Vsim,V2sim);
CovarSim=CovarSim(1,2); % the covariance is the off-diagonal [Matlab computes the whole Covariance Matrix, the diagonals are the variances]

fprintf('Covariance from two different approaches should be roughly same: %2.4f, %2.4f  \n', Covar, CovarSim)
% If you set S really large, they should become essentially identical.
% Only difference should be the sampling error


% Double-check: W and histcounts(indexThisPeriod) should be roughly the same (once you normalize the later; looks fine)
% H=histcounts(indexThisPeriod)'; H/sum(H)
% Double-check: if you set HighCorr==2 then get indexThisPeriod=indexNextPeriod (passed this)

robertdkirkby · July 17, 2025, 12:01am

If you want, e.g., five-period autocorrelations you just use P^5 in place of P (the five-period transition matrix in place of the one-period transition matrix), or the finite-horizon equivalent of doing this.

javier_fernan · July 17, 2025, 4:22pm

This is really useful. Going back to @aledinola example, I can set

W=W2=StationaryDist(
V=V2=pol_n(

How do I define P? Is it simply pi_z?

aledinola · July 17, 2025, 6:32pm

Hi @javier_fernan,

I think that P in Robert’s code is a matrix that gives the probability of going from x to x' where x is a generic point on the state space. In the example, x=(a,z). So you can think of P(x,x')=Prob(a',z' | a,z). To generate this matrix you have to combine the policy function for a' with the exogenous transition \pi(z,z').
Here is my code to do this:

Ptemp = cell(N_z,1);
for z_c=1:N_z
   Ptemp{z_c} = sparse((1:N_a),Policy(:,z_c),1,N_a,N_a);
   Ptemp{z_c} = kron(pi_z(z_c,:), Ptemp{z_c} );
end
P = cell2mat(Ptemp);

You can verify that (1) the size of P is [N_a*N_z,N_a*N_z] and that (2) the rows of P sum to one.

robertdkirkby · July 17, 2025, 8:08pm

As @aledinola says, P will map (a,z) into (a’,z’), and is constructed by combining Policy with pi_z.

[Note: this is the same P you would use to iterate on the agent distribution. Of course, nowadays you would use Tan improvement instead to do the iterations as two sub-steps without actually constructing the whole P. But just mention in case you ever saw that before.]

aledinola · October 5, 2025, 7:59pm

I have a question regarding correlations so this post seems the right place

I have a model with semi-exogenous shocks and I would lile to compute the expected value of semiz \times semiz'. Of course the transition matrix of semiz depends on a decision variable d and other stuff.

Is there a way of doing this calculation with the toolkit? Functions to evaluate?

Here is a mathamtical formulation of the problem for clarity:

Consider a life-cycle model with semi-exogenous shocks. The variables in the action space are

(d,a^{\prime },a,semiz,j)

The transition probabilities for semiz are given by \pi (semiz,semiz^{\prime }|d,j). The agents distribution is \mu (a,semiz,j) .
We can compute \mathbb{E}(semiz\times semiz^{\prime }) as

\sum_{a,semiz,j} \sum_{semiz^{\prime}} semiz\times semiz^{\prime }\times \pi (semiz,semiz^{\prime }|d^{\ast }(a,semiz,j),j) \times \mu (a,semiz,j)

where d^{\ast} is the policy function for the decision variable d .

robertdkirkby · October 6, 2025, 6:43am

Currently there is not a toolkit command for this. I am hoping to do an auto-correlation command (for life-cycle models) at some point, but it does not yet exist.

You can do one manually. You can use ‘ValuesOnGrid’ to get the values for semiz and semiz’ (or any other FnsToEvaluate). You can construct P (see below), and then using P and the ‘ValuesOnGrid’ you can do the same calculation as in the first post in this topic.

The missing piece here then is how to construct P with a semi-exogenous state. When VFI Toolkit does the agent distribution using the Tan improvement it comes very close to creating P, e.g.,

github.com/vfitoolkit/VFIToolkit-matlab

StationaryDist/FHorz/SemiExo/StationaryDist_FHorz_SemiExo_Iteration_raw.m

master

function StationaryDistKron=StationaryDist_FHorz_SemiExo_Iteration_raw(jequaloneDistKron,AgeWeightParamNames,PolicyIndexesKron,N_d1,N_a,N_z,N_semiz,N_j,pi_z_J,pi_semiz_J,Parameters,simoptions)
% Will treat the agents as being on a continuum of mass 1.

N_bothz=N_z*N_semiz;

if N_d1==0
    optd2prime=gather(reshape(PolicyIndexesKron(1,:,:,:),[N_a*N_bothz,N_j])); % Note: column vector (conditional on jj)
else
    optd2prime=gather(reshape(ceil(PolicyIndexesKron(1,:,:,:)/N_d1),[N_a*N_bothz,N_j]));   
end
optaprime=gather(reshape(PolicyIndexesKron(2,:,:,:),[N_a*N_bothz,N_j])); % Note: column vector (conditional on jj)

%% Tan improvement verion

% To do Tan improvement with semiz shocks we treat the first step as
% (a,semiz,z) to (a',semiz',z) and then the second is the standard just
% updating z to z'.
StationaryDistKron=zeros(N_a*N_bothz,N_j,'gpuArray');
StationaryDistKron(:,1)=jequaloneDistKron;
StationaryDistKron_jj=sparse(gather(jequaloneDistKron));

This file has been truncated. show original

The core trick to the semi-exogenous shock is that you take ‘pi(semiz,semiz’|d,j)’ and index the ‘d’ dimension based on Policy(1,:,:,j) [1 is the decision variable, rather than next period endogenous state], and this gives you ‘pi(semiz,semiz’|a,z,j)'. In the linked code this is pi_semiz_jj and the index fullindex which gives semiztransition.

Of course in the Tan improvement you create what the code calls Gamma and pi_z, and P then comes from putting these two together [the point of the Tan improvement is that instead of creating P you can do two substeps with Gamma and pi_z, respectively, which are just the two components of P and this saves on computations.]

In sum, there is not yet a dedicated command for this, but you could string something together based on existing parts.

PS. An autocorrelation command is actually substantially more involved than age-conditional stats, simply because it will require a different internal code for each setting (in the same way the agent dist commands do).

aledinola · October 6, 2025, 7:31am

Is there a toolkit function I can use to generate pi(semiz,semiz’|d,j), rather than doing it manually by looping over f_EXOGshockfun?

Thanks!

robertdkirkby · October 6, 2025, 8:37am

SemiExogShockSetup_FHorz()

% Internally, only ever use age-dependent joint-grids (makes all the code much easier to write)
vfoptions=SemiExogShockSetup_FHorz(n_d,N_j,d_grid,Parameters,vfoptions,2);
% output: vfoptions.semiz_gridvals_J, vfoptions.pi_semiz_J
% size(semiz_gridvals_J)=[prod(n_z),length(n_z),N_j]
% size(pi_semiz_J)=[prod(n_semiz),prod(n_semiz),prod(n_dsemiz),N_j]
% If no semiz, then vfoptions just does not contain these field