Bug in ValueFnIter_postGI_sparse_raw

I ran a model in infinite horizon, two d variables (like Bruggeman’s model: d1 is labor supply and d2 is occupational choice), one endogenous state and three Markov shocks. I chose howardssparse=1 and got this error message:

Error using gpuArray/reshape
Number of elements must not change. Use [] as one of the size inputs to automatically calculate the appropriate size for that dimension.

Error in ValueFnIter_postGI_sparse_raw (line 90)
        [Vtempii,maxindexL2]=max(reshape(entireRHS_ii_z,[n_d*n2long,n_a]),[],1);
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in ValueFnIter_GridInterpLayer (line 151)
                    [V,Policy] = ValueFnIter_postGI_sparse_raw(V0, n_d, n_a, n_z, d_gridvals, a_grid, z_gridvals, pi_z, ReturnFn, DiscountFactorParamsVec, ReturnFnParamsVec, vfoptions);
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in ValueFnIter_Case1 (line 464)
    [V,Policy]=ValueFnIter_GridInterpLayer(V0, n_d, n_a, n_z, d_gridvals, a_grid, z_gridvals, pi_z, ReturnFn, DiscountFactorParamsVec, ReturnFnParamsVec, vfoptions);
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in solve_toolkit (line 222)
[V,Policy] = ValueFnIter_Case1(n_d,n_a,n_z,d_grid,a_grid,z_grid,pi_z,ReturnFn,...
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in main (line 28)
    [V,Policy,StationaryDist,pol_vals,mom,Params,AllStats] = solve_toolkit(GE_guess,Params,par);
                                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Toolkit issue: gridinterplayer=1, howardssparse=1 fails with multi-dimensional n_d

Summary

I ran into a bug in the infinite-horizon Case 1 interpolation-layer path when:

  • vfoptions.gridinterplayer = 1
  • vfoptions.howardssparse = 1
  • vfoptions.howardsgreedy = 0
  • there is at least one discrete decision block, and in particular when n_d is a vector rather than a scalar

In my application, the decision block is:

  • d = (l,e)
  • n_d = [n_l, n_e]

so n_d is not scalar.

The non-sparse interpolation path (howardssparse=0) gets past this particular issue, but the sparse path fails immediately with a reshape error.

Error message

Error using gpuArray/reshape
Number of elements must not change. Use [] as one of the size inputs to automatically calculate the appropriate size for that dimension.

Error in ValueFnIter_postGI_sparse_raw (line 90)
        [Vtempii,maxindexL2]=max(reshape(entireRHS_ii_z,[n_d*n2long,n_a]),[],1);
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in ValueFnIter_GridInterpLayer (line 151)
                    [V,Policy] = ValueFnIter_postGI_sparse_raw(V0, n_d, n_a, n_z, d_gridvals, a_grid, z_gridvals, pi_z, ReturnFn, DiscountFactorParamsVec, ReturnFnParamsVec, vfoptions);
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in ValueFnIter_Case1 (line 464)
    [V,Policy]=ValueFnIter_GridInterpLayer(V0, n_d, n_a, n_z, d_gridvals, a_grid, z_gridvals, pi_z, ReturnFn, DiscountFactorParamsVec, ReturnFnParamsVec, vfoptions);
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Relevant toolkit files

  • ValueFnIter/InfHorz/GridInterpLayer/ValueFnIter_GridInterpLayer.m
  • ValueFnIter/InfHorz/GridInterpLayer/ValueFnIter_postGI_sparse_raw.m

The dispatch appears to be:

  • ValueFnIter_Case1
  • ValueFnIter_GridInterpLayer
  • ValueFnIter_postGI_sparse_raw

for the combination:

  • scalar endogenous asset state a
  • N_d > 0
  • vfoptions.preGI = 0
  • vfoptions.howardssparse = 1

Diagnosis

The sparse post-GI routine seems to assume that n_d is a scalar count, but in Case 1 the toolkit interface allows n_d to be a vector describing multiple discrete decision dimensions.

In ValueFnIter_postGI_sparse_raw.m, the problematic lines are currently:

[Vtempii,maxindexL2]=max(reshape(entireRHS_ii_z,[n_d*n2long,n_a]),[],1);
[maxindexL2_d, maxindexL2_a] = ind2sub([n_d,n2long],maxindexL2);

If n_d = [n_l,n_e], then:

  • n_d*n2long is not the intended scalar prod(n_d)*n2long
  • ind2sub([n_d,n2long],...) is also inconsistent with the flattened first dimension returned by the return-matrix constructor

The routine should instead use:

  • N_d = prod(n_d)

because the first dimension of the return object is the flattened discrete-choice dimension.

Why this seems to be the right interpretation

Earlier in the same routine, the code calls:

ReturnMatrix_ii_z = CreateReturnFnMatrix_Case1_Disc_DC1_Par2(ReturnFn, n_d, special_n_z, d_gridvals, aprime_ii, a_gridvals, z_val, ReturnFnParamsVec,1);

and the comments indicate that ReturnMatrix_ii_z is of size:

(N_d,n2long,n_a)

So downstream reshaping/indexing should be based on N_d = prod(n_d), not on the vector n_d itself.

Minimal patch proposal

In ValueFnIter_postGI_sparse_raw.m:

  1. Define
N_d = prod(n_d);
  1. Replace
[Vtempii,maxindexL2]=max(reshape(entireRHS_ii_z,[n_d*n2long,n_a]),[],1);

with

[Vtempii,maxindexL2]=max(reshape(entireRHS_ii_z,[N_d*n2long,N_a]),[],1);
  1. Replace
[maxindexL2_d, maxindexL2_a] = ind2sub([n_d,n2long],maxindexL2);

with

[maxindexL2_d, maxindexL2_a] = ind2sub([N_d,n2long],maxindexL2);
  1. Optional consistency cleanup:
midpoint=max(min(maxindex,N_a-1),2);

instead of mixing n_a and N_a in that line.

Additional note on lowmemory

I also checked whether vfoptions.lowmemory=1 meaningfully reduces memory use in the gridinterplayer=1 path.

My reading of the code is:

  • lowmemory=1 does reduce memory use in the return-matrix construction stage by looping over z
  • but in ValueFnIter_Refine_postGI_raw.m, the main post-GI iteration still constructs large interpolation objects such as EVpre, EVinterp, and entireRHS, so peak memory can remain very high

So the reshape error above appears separate from the memory issue.

Reproducibility context

Model-side setup:

  • endogenous state: a
  • exogenous state: z = (eps, theta, age)
  • discrete decision block: d = (l,e)
  • hence n_d = [n_l,n_e]

This is a standard Case 1 use in which there are multiple discrete decision dimensions stacked into d_grid.

Question

Is the intended behavior that ValueFnIter_postGI_sparse_raw.m should support vector-valued n_d the same way the rest of Case 1 does?

If yes, then the patch above seems like the minimal fix. If not, it would help to document that the sparse post-GI path currently only supports scalar n_d.

Regarding this problem, I’ve proposed a number of patches to this problem, many of which have been accepted. Fixing this will just be a case of extending the logic of not naively creating these variables before lowmemory has had its say.

1 Like

Actually my post deals with two issues at the same time (which is maybe not a good idea).

The most urgent one is that the function gives an error message, which signals a bug.

The secondary issue is that the grid interp layer codes are too memory intensive, even with lowmemory=1.

To provide context, my machine has 16 GB of GPU RAM, which is not small.
Are you sure your changes have been merged into the toolkit? As far as I understand, the lowmemory=1 option does not seem to reduce peak memory usage with gridinterplayer and refine. I checked in my task manager and peak memory usage was hitting the 16 GB of RAM.
The reason is that lowmemory=1 loops over z in the construction of the return function (good) but then does NOT loop over z in the memory-intensive vfi with interpolation. I would consider NOT precomputing the big return array when lowmemory=1.

Context information on the model.
Based on Bruggeman (AEJ:Macro 2021), Higher Taxes at the top: the role of entrepreneurs. In terms of grid sizes, the only change is that I have N_z=9×11+1=100, but I have a small n_d=[6,3].

My changes have been in other areas (experience assets), but solving the same problems. The bulk of my changes are being reviewed now but I think some others are already in.

1 Like