Aiyagari example on CPU, Part II

[This post is a continuation of Part I]
In the previous post I outline a solution that requires to change the code in CreateReturnFnMatrix_Case1_Disc and change Aiyagari1994_ReturnFn as well. This requires to keep two versions of Aiyagari1994_ReturnFn, one for the GPU (not vectorized) and one for the CPU (vectorized). To avoid this, I tried to use the option vfoptions.returnmatrix==1 suggested by the Toolkit.

As far as I understand, I have to create myself the 3-dimensional return matrix over (a’,a,z) and pass it to VFI as the input ReturnFn. In the standard case, ReturnFn is a function handle, but now it is a numeric array (right?). So here’s my code:

% Pre-compute here the return matrix
tic
if vfoptions.parallel==0 && vfoptions.returnmatrix==1
    z_grid = gather(z_grid);
    pi_z   = gather(pi_z);
    ReturnFn = zeros(n_a,n_a,n_z);
    for i3=1:n_z
        for i2=1:n_a
            for i1=1:n_a
                ReturnFn(i1,i2,i3)=Aiyagari1994_ReturnFn(a_grid(i1),a_grid(i2),z_grid(i3),Params.alpha,Params.delta,Params.mu,Params.r);
            end
        end
    end
end
toc

This is quite fast even if it is not vectorized. It takes 0.6 seconds with n_a=512, n_z=11.

Unfortunately, I got the following error message

Incorrect number or types of inputs or outputs for function func2str.

Error in getAnonymousFnInputNames (line 3)
temp=func2str(FnToEvaluate); % Note that this strips out any spaces automatically

Error in ValueFnIter_Case1 (line 191)
    temp=getAnonymousFnInputNames(ReturnFn);

Error in Aiyagari1994 (line 130)
[V,Policy]=ValueFnIter_Case1(n_d,n_a,n_z,d_grid,a_grid,z_grid,pi_z,ReturnFn,Params,DiscountFactorParamNames,[],vfoptions);

For completeness, I report my entire code below:

% Example based on Aiyagari (1994).
%
% These codes set up and solve the Aiyagari (1994) model for a given
% parametrization. After solving the model they then show how some of the
% vfitoolkit commands to easily calculate things like the Gini coefficient
% for income, and how to plot the distribution of asset holdings.
%
% VFI Toolkit automatically detects hardware (GPU? Number of CPUs?) and
% sets defaults accordingly. It will run without a GPU, but slowly. It is
% indended for use with GPU.

clear,clc,close all
addpath(genpath('C:\Users\aledi\Dropbox\Family Room\6_Computation\VFI_toolkit\VFIToolkit-matlab-master'))

%% Set some basic variables

% VFI Toolkit thinks of there as being:
% k: an endogenous state variable (assets)
% z: an exogenous state variable (exogenous labor supply)

% Size of the grids
n_k = 512;
n_z = 11; %21;

% Parameters
Params.beta  = 0.96; %Model period is one-sixth of a year
Params.alpha = 0.36;
Params.delta = 0.08;
Params.mu    = 3;
Params.sigma = 0.2;
Params.rho   = 0.6;

%% Set up the exogenous shock process
% Create markov process for the exogenous labour productivity, l.
Tauchen_q=3; % Footnote 33 of Aiyagari(1993WP, pg 25) implicitly says that he uses q=3
[z_grid,pi_z]=discretizeAR1_Tauchen(0,Params.rho,sqrt((1-Params.rho^2)*Params.sigma^2),n_z,Tauchen_q);
% Note: sigma is standard deviations of s, input needs to be standard deviation of the innovations
% Because s is AR(1), the variance of the innovations is (1-rho^2)*sigma^2

z_grid = exp(z_grid);
aux    = pi_z^1000;
z_prob = aux(1,:)';
z_mean = dot(z_grid,z_prob);
z_grid = z_grid/z_mean;

%[z_mean,z_variance,z_corr,~]=MarkovChainMoments(z_grid,pi_z);
% Get some info on the markov process
%[Expectation_l,~,~,~]=MarkovChainMoments(z_grid,pi_z); %Since l is exogenous, this will be it's eqm value
% Note: Aiyagari (1994) actually then normalizes l by dividing it by Expectation_l (so that the resulting process has expectation equal to 1)
%z_grid=z_grid./Expectation_l;
%[Expectation_l,~,~,~]=MarkovChainMoments(z_grid,pi_z);


% If you look at Expectation_l you will see it is now equal to 1
Params.Expectation_l = dot(z_grid,z_prob);

%% Grids

% In the absence of idiosyncratic risk, the steady state equilibrium is given by
r_ss=1/Params.beta-1;
K_ss=((r_ss+Params.delta)/Params.alpha)^(1/(Params.alpha-1)); %The steady state capital in the absence of aggregate uncertainty.

% Set grid for asset holdings
k_grid=10*K_ss*(linspace(0,1,n_k).^3)'; % linspace ^3 puts more points near zero, where the curvature of value and policy functions is higher and where model spends more time

% Bring model into the notational conventions used by the toolkit
d_grid=0; %There is no d variable
a_grid=k_grid;
% pi_z;
% z_grid

n_d=0;
n_a=n_k;
% n_z

% Create functions to be evaluated
FnsToEvaluate.K = @(aprime,a,s) a; %We just want the aggregate assets (which is this periods state)

% Now define the functions for the General Equilibrium conditions
% Should be written as LHS of general eqm eqn minus RHS, so that the closer the value given by the function is to
% zero, the closer the general eqm condition is to holding.
GeneralEqmEqns.CapitalMarket = @(r,K,alpha,delta,Expectation_l) r-(alpha*(K^(alpha-1))*(Expectation_l^(1-alpha))-delta); %The requirement that the interest rate corresponds to the agg capital level
% Inputs can be any parameter, price, or aggregate of the FnsToEvaluate

fprintf('Grid sizes are: %i points for assets, and %i points for exogenous shock \n', n_a,n_z)

%%
DiscountFactorParamNames={'beta'};

% (a',a,s) are next-period endog state,endog state,exogenous state (Markov)
ReturnFn=@(aprime,a,s,alpha,delta,mu,r) Aiyagari1994_ReturnFn(aprime,a,s,alpha,delta,mu,r);


%%

% Use the toolkit to find the equilibrium price index
GEPriceParamNames={'r'};
% Set initial value for interest rates (Aiyagari proves that with idiosyncratic
% uncertainty, the eqm interest rate is limited above by it's steady state value
% without idiosyncratic uncertainty, that is that r<r_ss).
Params.r=0.038;

%% Test VFI
vfoptions=struct();
vfoptions.verbose  = 1;
vfoptions.parallel = 0;
vfoptions.returnmatrix=1;

% Pre-compute here the return matrix
tic
if vfoptions.parallel==0 && vfoptions.returnmatrix==1
    z_grid = gather(z_grid);
    pi_z   = gather(pi_z);
    ReturnFn = zeros(n_a,n_a,n_z);
    for i3=1:n_z
        for i2=1:n_a
            for i1=1:n_a
                ReturnFn(i1,i2,i3)=Aiyagari1994_ReturnFn(a_grid(i1),a_grid(i2),z_grid(i3),Params.alpha,Params.delta,Params.mu,Params.r);
            end
        end
    end
end
toc

% Special version for CPU
if isfield(vfoptions,'parallel') && vfoptions.parallel==0 && vfoptions.returnmatrix==0
   ReturnFn=@(aprime,a,s,alpha,delta,mu,r) Aiyagari1994_ReturnFn_cpu(aprime,a,s,alpha,delta,mu,r);
end

[V,Policy]=ValueFnIter_Case1(n_d,n_a,n_z,d_grid,a_grid,z_grid,pi_z,ReturnFn,Params,DiscountFactorParamNames,[],vfoptions);

mean(V,"all")

size(Policy)
pol_aprime = squeeze(Policy(1,:,:));
pol_aprime_val = a_grid(pol_aprime);

figure
plot(a_grid,a_grid,'--')
hold on
plot(a_grid,pol_aprime_val(:,1))
hold on
plot(a_grid,pol_aprime_val(:,n_z))

Way back in version 0.0 of the toolkit, my first idea was that you would pass the return function as a matrix, and then code did the rest. Somewhere around 0.0 I figured out how to instead pass the return function as a function, and then toolkit internally made this a matrix and done!

vfoptions.returnmatrix==1

is that v0.0 functionality, and because I’ve hardly used it since I haven’t checked that it can be used.

Anyway, I eliminated the error you hit (just added an if vfoptions.returnmatrix~=1). Can’t promise there isn’t another hitch as I haven’t run it. I am happy to make any further modifications you mention to get it to work but I haven’t written a test of it myself.

1 Like

Thanks a lot Robert! I will test it and let you know.

One question: if you run the general equilibrium command, the routine fminsearch (or similar) will change the GE parameters many times until GE conditions are satisfied. Since the GE parameters affect the Return matrix (in most cases), also the Return matrix should be recomputed manually at every iteration, right?

In other words, if I use the returnmatrix=1 option, I cannot use the toolkit GE command but I have to write my own function

I ran the test and I found a new error in UnKronPolicyIndexes_Case1. I report here the error message:

Grid sizes are: 601 points for assets, and 11 points for exogenous shock(s) 

vfoptions = 

  struct with fields:

                        verbose: 1
                       parallel: 0
                   returnmatrix: 1
                     solnmethod: 'purediscretization_refinement'
               divideandconquer: 0
                      lowmemory: 0
                      tolerance: 1.0000e-09
                        howards: 80
                     maxhowards: 500
                        maxiter: Inf
                 endogenousexit: 0
                       endotype: 0
                incrementaltype: 0
                    polindorval: 1
        policy_forceintegertype: 0
    piz_strictonrowsaddingtoone: 0
                     outputkron: 0
                       actualV0: 0

Creating return fn matrix
Time to create return fn matrix:   0.0001 
Starting Value Function 
Time to solve for Value Fn and Policy:   0.1925 
Transforming Value Fn and Optimal Policy matrices back out of Kronecker Form
Index in position 2 exceeds array bounds. Index must not exceed 1.

Error in UnKronPolicyIndexes_Case1 (line 18)
                optaindexKron=Policy(i,j);

Error in ValueFnIter_Case1 (line 753)
    Policy=UnKronPolicyIndexes_Case1(Policy, n_d, n_a, n_z,vfoptions);

Error in Aiyagari1994 (line 150)
[V,Policy]=ValueFnIter_Case1(n_d,n_a,n_z,d_grid,a_grid,z_grid,pi_z,ReturnFn,Params,DiscountFactorParamNames,[],vfoptions);

I had a look under the hood and the problem is this part of the code:

if vfoptions.parallel~=2
        PolicyTemp=zeros(l_a,N_a,N_z);
        for i=1:N_a
            for j=1:N_z
                optaindexKron=Policy(i,j);
                optA=ind2sub_homemade([n_a'],optaindexKron);
                PolicyTemp(:,i,j)=[optA'];
            end
        end
        Policy=reshape(PolicyTemp,[l_a,n_a,n_z]);

On entry to the function, Policy has size [n_a*n_z,1], but later it is accessed as Policy(i,j) and when j is greater than one you get an error.

I report below the exact error message that I got after running Aiayagari1994.m with vfoptions.returnmatrix=1

Grid sizes are: 601 points for assets, and 11 points for exogenous shock(s) 

vfoptions = 

  struct with fields:

                        verbose: 1
                       parallel: 0
                   returnmatrix: 1
                     solnmethod: 'purediscretization_refinement'
               divideandconquer: 0
                      lowmemory: 0
                      tolerance: 1.0000e-09
                        howards: 80
                     maxhowards: 500
                        maxiter: Inf
                 endogenousexit: 0
                       endotype: 0
                incrementaltype: 0
                    polindorval: 1
        policy_forceintegertype: 0
    piz_strictonrowsaddingtoone: 0
                     outputkron: 0
                       actualV0: 0

Creating return fn matrix
Time to create return fn matrix:   0.0001 
Starting Value Function 
Time to solve for Value Fn and Policy:   0.1925 
Transforming Value Fn and Optimal Policy matrices back out of Kronecker Form
Index in position 2 exceeds array bounds. Index must not exceed 1.

Error in UnKronPolicyIndexes_Case1 (line 18)
                optaindexKron=Policy(i,j);

Error in ValueFnIter_Case1 (line 753)
    Policy=UnKronPolicyIndexes_Case1(Policy, n_d, n_a, n_z,vfoptions);

Error in Aiyagari1994 (line 150)
[V,Policy]=ValueFnIter_Case1(n_d,n_a,n_z,d_grid,a_grid,z_grid,pi_z,ReturnFn,Params,DiscountFactorParamNames,[],vfoptions);

Another question, perhaps unrelated: the verbose message reports solnmethod: 'purediscretization_refinement', but here we don’t have any d variable, so shouldn’t it be purediscretization?

Actually the mistake above occurs even if vfoptions.returnmatrix=0. I think it depends on the new ValueFnIter_Case1_NoD_raw. The last line should be

Policy=reshape(PolicyIndexes,[N_a,N_z]);

After fixing this, the VFI works well on a Mac computer without supported GPU. Here are the results in terms of timing, for the two separate cases of returnmatrix 0 or 1:

returnmatrix=0

Grid sizes are: 601 points for assets, and 11 points for exogenous shock(s) 

Creating return fn matrix
NOTE: When using CPU you can speed things up by giving return fn as a matrix; see vfoptions.returnmatrix=1 in VFI Toolkit documentation. 
Time to create return fn matrix:   3.8978 
Starting Value Function 
Time to solve for Value Fn and Policy:   0.1905 
Transforming Value Fn and Optimal Policy matrices back out of Kronecker Form
Time to create UnKron Value Fn and Policy:   0.0114 

returnmatrix=1

Grid sizes are: 601 points for assets, and 11 points for exogenous shock(s) 
Time to create the Return Matrix manually:
Elapsed time is 1.065759 seconds.


Creating return fn matrix
Time to create return fn matrix:   0.0002 
Starting Value Function 
Time to solve for Value Fn and Policy:   0.1775 
Transforming Value Fn and Optimal Policy matrices back out of Kronecker Form
Time to create UnKron Value Fn and Policy:   0.0075 

Thanks to the new ValueFnIter_Case1_NoD_raw, the running time of Value Fn and Policy is more than 10 times faster! Thanks for overhauling it.

The bottleneck is the creation of return function, as expected. If I do it myself with returnmatrix=1, it takes approx. 1 second, if I let the toolkit do it it takes almost 4 seconds. Please note that the manual computation of the return matrix is done using exactly the same Aiyagari1994_ReturnFn that is used by the GPU. I suspect that there might be some inefficiency in how the toolkit does it internally. Here is my code:

ReturnFn = zeros(n_a,n_a,n_z);
   for z_c=1:n_z
       for a_c=1:n_a
           for aprime_c=1:n_a
                ReturnFn(aprime_c,a_c,z_c) = Aiyagari1994_ReturnFn(a_grid(aprime_c),a_grid(a_c),z_grid(z_c),Params.alpha,Params.delta,Params.mu,Params.r);
           end
       end
   end

Another error in EvalFnOnAgentDist_AllStats_Case1. This should be the last one :slight_smile:
After fixing this, the Aiyagari example will work nicely on the CPU!

Error using zeros
GPU acceleration with Parallel Computing Toolbox is not supported on macOS.

Error in StatsFromWeightedGrid (line 168)
            QuantileMeans=zeros(nquantiles,1,'gpuArray');

Error in EvalFnOnAgentDist_AllStats_Case1 (line 204)
        AllStats.(FnsToEvalNames{ff})=StatsFromWeightedGrid(Values,StationaryDistVec,simoptions.npoints,simoptions.nquantiles,simoptions.tolerance,0,simoptions.whichstats);

Error in Aiyagari1994 (line 196)
AllStats=EvalFnOnAgentDist_AllStats_Case1(StationaryDist, Policy, FnsToEvaluateIneq,Params, [],n_d, n_a, n_z, d_grid, a_grid,z_grid,simoptions);

Done. Should work now.

(not sure if vfoptions.returnmatrix=1 works, let me know if there is a change I need to make, not quite clear from your earlier post if there was or not)

1 Like

I tested the code and it works well with parallel=1 (parallel CPU) but it gives an error with parallel=0 (simple CPU). I set returnmatrix=0 in both cases.

parallel = 1 GOOD

Grid sizes are: 601 points for assets, and 11 points for exogenous shock(s) 

vfoptions = 

  struct with fields:

                        verbose: 1
                       parallel: 1
                   returnmatrix: 0
                     solnmethod: 'purediscretization_refinement'
               divideandconquer: 0
                      lowmemory: 0
                      tolerance: 1.0000e-09
                        howards: 80
                     maxhowards: 500
                        maxiter: Inf
                 endogenousexit: 0
                       endotype: 0
                incrementaltype: 0
                    polindorval: 1
        policy_forceintegertype: 0
    piz_strictonrowsaddingtoone: 0
                     outputkron: 0
                       actualV0: 0

Creating return fn matrix
NOTE: When using CPU you can speed things up by giving return fn as a matrix; see vfoptions.returnmatrix=1 in VFI Toolkit documentation. 
Time to create return fn matrix:   0.7514 
Starting Value Function 
Time to solve for Value Fn and Policy:   0.6098 
Transforming Value Fn and Optimal Policy matrices back out of Kronecker Form
Time to create UnKron Value Fn and Policy:   0.0086 

ans =

    6.8380


ans =

     1   601    11

   601    11


ans =

    4.1156

The Gini coeff for Earnings is:   0.1131 
 The Gini coeff for Income is:   0.3102 
 The Gini coeff for Wealth is:   0.3918 

parallel = 0 ERROR

Grid sizes are: 601 points for assets, and 11 points for exogenous shock(s) 

vfoptions = 

  struct with fields:

                        verbose: 1
                       parallel: 0
                   returnmatrix: 0
                     solnmethod: 'purediscretization_refinement'
               divideandconquer: 0
                      lowmemory: 0
                      tolerance: 1.0000e-09
                        howards: 80
                     maxhowards: 500
                        maxiter: Inf
                 endogenousexit: 0
                       endotype: 0
                incrementaltype: 0
                    polindorval: 1
        policy_forceintegertype: 0
    piz_strictonrowsaddingtoone: 0
                     outputkron: 0
                       actualV0: 0

Creating return fn matrix
NOTE: When using CPU you can speed things up by giving return fn as a matrix; see vfoptions.returnmatrix=1 in VFI Toolkit documentation. 
Time to create return fn matrix:   1.1095 
Starting Value Function 
Time to solve for Value Fn and Policy:   0.2039 
Transforming Value Fn and Optimal Policy matrices back out of Kronecker Form
Index in position 2 exceeds array bounds. Index must not exceed 1.

Error in UnKronPolicyIndexes_Case1 (line 18)
                optaindexKron=Policy(i,j);

Error in ValueFnIter_Case1 (line 753)
    Policy=UnKronPolicyIndexes_Case1(Policy, n_d, n_a, n_z,vfoptions);

Error in Aiyagari1994 (line 122)
[V,Policy]=ValueFnIter_Case1(n_d,n_a,n_z,d_grid,a_grid,z_grid,pi_z,ReturnFn,Params,DiscountFactorParamNames,[],vfoptions);

I think the bug is in ValueFnIter_Case1_NoD_raw, please change line
Policy=PolicyIndexes;
to
Policy=reshape(PolicyIndexes,[N_a,N_z]);

Done :slight_smile:

vfoptions.parallel=0 now works for Aiyagari example

1 Like