Fixed. Github doesn’t send me notifications for issues so I had missed seeing them. (Going to try to change my Github settings so I get notifications for issues.)
Comment on Boar and Midrigan (2023) - Should We Tax Capital Income or Wealth? [note, not 2022]
I was considering implementing this too, but then I realised this is NOT a model of entrepreneurial choice. Which is really unusual, especially given almost everything they cite is.
In entrepreneurial choice models (Quadrini, 2000; Cagetti & De Nardi, 2006; Kitao, 2008; all cited by BM2023, as well as many of their other cites) agents choose every period whether to be an entrepreneur or a worker, based on their savings and on their productivity in each of the two roles (some models allow switching every period, others take a period to switch). Notice also that being an entrepreneur and being a worker are mutually exclusive. In such models there might be a markov process on labor productivity (as a worker), call it z_w, and a markov process on entrepreneur productivity, call it z_e. All agents have both markovs, and the decision on whether to be a worker or entrepreneur switches back and forth based on which of z_e and z_w is high/low (as well as other things like savings).
But in BM2023 there is no switching. Instead there are two different permanent types of agent, fraction 1-\psi are ‘pure-workers’ and fraction \psi are ‘worker-entrepreneurs’. The ‘pure-workers’ have only a markov on their labor productivity, z_w, and just work every period. The ‘worker-entrepreneurs’ have two markovs, z_w and z_e, and every period they get both labor earnings from working (based on z_w) and entrepreneurial income from ‘managing/producing/entrepreneuring’ (based on z_e). Notice that this eliminates any entrepreneurial entry/exit. It also substantially changes the risk faced by entrepreneurs as they still get worker earnings even while doing entrepreneurship. Because worker-entrepreneurs are always entrepreneurs it likely also substantially changes their savings and likely means they run into the collateral constraint less often.
I am guessing that an implication of the ‘permanent worker-entrepreneur’ is that the collateral constraints bind much less for high productivity entrepreneurs in this model than in a standard entrepreneurial-choice model. This is likely a major contributor to why BM2023 conclude that as “pointed out by [Guvenen, Kambourov, Kuruscu, Ocampo, & Chen (2023)], heterogeneity in rates of return [arising from collateral-constrained entrepreneurs] implies a nontrivial distinction between capital income and wealth taxation. … We find that capital income taxation is preferred to wealth taxation…because the redistribution motive dominates the efficiency gains”. Note that by making entrepreneur a permanent state rather than a choice BM2023 are both increasing the redistribution motive and reducing the efficiency gains.
PS. This is 100% not obvious reading BM2023, which in addition to citing a large number of entrepreneurial-choice models makes comments like “In addition, a fraction \psi of them are also endowed with entrepreneurial ability and can run a private business”. Note the “can”, which suggests a possibility rather than an obligation. I had first understood this to mean two permanent types, one being pure workers and the other being the standard entrepreneurial choice. Only after digging into their code (happily available because it is mandatory as part of the publication) did it become clear what the model was actually doing. This is not to say the model description in the paper is incorrect; it is correct, and once you reread it knowing that there are just two permanent types everything checks out. But this massive departure of the model from what is standard in all the cited literature somehow managed to not even warrant a sentence to alert the reader that they were doing things differently to everyone else.
PPS. BM2023 denote e what I here call z_w and denote z what I here call z_e.
Are you referring to the AER Insights? I think BM have an older working paper where the benchmark model is a simple Aiyagari with endogenous labor supply. They then have an extension with entrepreneurs to show that their results are robust to the inclusion of heterogeneous returns on assets (they get heterogeneous returns on assets by adding an entrepreneurial technology with financial frictions).
This older working paper got split into the JME paper (only Aiyagari) and the AER Insights paper (model with entrepreneurs).
It would still be interesting to partially replicate the AER Insights paper with the toolkit. Can it be done using permanent types? The fact that there is no occupational choice makes it easier to solve!
Unfortunately I am quite busy with teaching and other commitments in this period and haven’t had much time to look into your example yet. I also wanted to go back to the way you implement interpolation and see if it can be sped up.
You can do Boar & Midrigan (2023) in VFI Toolkit using permanent types (I got most of the way through if you want a copy of the script). But I cannot imagine anyone much wanting to use a model about entrepreneurs in which no-one ever starts a business. Seems to miss one of the two main points of putting entrepreneurs into a model (the other being wealth inequality). Hence why I stopped working on it once I realised what was up.
I did some work and if you update both the BM2022 codes and the toolkit the infinite horizon value fn drops from 18s to 1.6s (on my desktop; 18s dropped to 4s just by switching to vfoptions.greedyhowards=0, which is now the default in this setting; the 4s to 1.6s came from implementing a new ‘post-GI’ instead of ‘pre-GI’, which is just about the details of how/when the grid interpolation layer is done). Obviously this makes the whole thing much easier to use, and even with a poor initial guess finding the final stationary eqm only takes 104s (it could be faster still if you set up a decent initial guess and change heteroagentoptions.fminalgo as discussed in a previous post).
While I was there I cleaned up the transition paths as well. This model is really fragile (something about the iota lump-sum transfer). It solves, but I had to set the update factors really small for it to converge; this works, but it means the transition paths take ages to solve versus ‘normal’ because they require a huge number of iterations. There is for sure a better way to do this, and without such small update steps it would be plenty fast enough (if you time the ValueFnOnTransPath_Case1() command it is decently fast for a single iteration).
This looks quite interesting, I look forward to using the improved codes. I have a couple of comments:
Transition path.
In the working paper version, Boar and Midrigan acknowledge that the fixed point iteration (which is the standard method implemented in the VFI toolkit, I think) is “notoriously slow and unreliable” and suggest using a different method.
I think there is another post in this forum where we discussed this algorithm (Type-I Anderson acceleration). I wonder if this “unreliability” is somehow related to the problem I had found solving for the transition after a change in the discount factor in the Aiyagari model (see my older post here)
VFI Infinite horizon
Is the greedyhowards method based on building a large transition matrix P(a',z'|a,z)? I’ve always found this method quite slow compared to the others, and the reason might be that this transition matrix is sparse but not that sparse. It is the same logic as the Tan improvement, just here you do it in reverse order: first you apply the transition from z to z', and then as a separate step the transition from a to a'. I hope this makes sense. This is the logic I have used in my code howardssparse that you have added to the toolkit (I’m quite proud that my contribution was accepted, by the way :) )
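In case it helps, here is a minimal sketch of the idea (variable names like V, Freturn, PolicyInd, pi_z are just illustrative, this is not the actual toolkit code):

```matlab
% Howards/policy evaluation done 'Tan-in-reverse': instead of building the big
% sparse P(a',z'|a,z), apply the z->z' step and the a->a' (policy) step separately.
% V is n_a-by-n_z, pi_z is n_z-by-n_z (rows z, columns z'), PolicyInd holds the
% index of the optimal a', Freturn is the return evaluated at the optimal policy.
for h = 1:HowardsIter
    EV = V*pi_z';                       % step 1: EV(a',z) = sum_{z'} pi(z,z')*V(a',z')
    for zi = 1:n_z                      % step 2: gather EV at the policy a'=g(a,z)
        V(:,zi) = Freturn(:,zi) + beta*EV(PolicyInd(:,zi),zi);
    end
end
```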
Good point about breaking howards iterations into howardssparse by using the Tan improvement approach in reverse. I should try to do this; it was a nice bit of code you wrote.
VFI Toolkit does transitions based on shooting algorithms. When BM2022 say ‘using a fixed point method’, I believe they meant a Newton method (so update based on the Jacobian, that is, based on derivatives). As you say, they did Type-I Anderson acceleration which is like using the average of the Jacobian over the last few iterations.
At some point I want to add the Newton method, and an Anderson acceleration version, for transition paths to the toolkit. But it is not a current priority for me as the shooting works fine in my current research papers.
I think in principle it could be done also for the case with interpolation. I think now howardssparse works only if gridinterplayer=0.
Regarding gridinterplayer, whenever you have a bit of time, it would be nice if you could update the pdf that explains the algorithm with the new feature you added (post-GI)
Yes and yes.
Will update the docs soon.
Added a description of pre-GI and post-GI for implementing grid interpolation layer in infinite-horizon VFI to the pdf about grid interpolation layer. Both are compromises that aim to avoid having to recompute the ‘second layer’ at each iteration.
Actually, what you might really want to do is post-GI, but with vfoptions.maxaprimediff set small, and then just perform an additional second stage of the post-GI, again with a small vfoptions.maxaprimediff. But I am not sure I can be bothered writing the code for this, as post-GI is already pretty good, even though I suspect it would be very powerful. (Have 2 or even 4 post-GI stages; you could even interpolate further.)
I had a look at the replication for BM 2022 and I found these lines:
vfoptions.gridinterplayer=1;
vfoptions.ngridinterp=20;
simoptions.gridinterplayer=vfoptions.gridinterplayer;
simoptions.ngridinterp=vfoptions.ngridinterp;
Is this doing pre-GI or post-GI?
Thanks!
Update
I realized that the default is vfoptions.preGI=0. Is there a way to set maxaprimediff?
Thanks for updating the pdf! As far as I understand, post-GI might deliver the same solution as pre-GI, and in cases where they do differ, pre-GI is more accurate. Is my understanding correct?
In order to save memory, wouldn’t it be enough to compute the return matrix at each iteration of the value function? Here is the pseudo-code I have in mind (a rough MATLAB sketch follows below):
- Define a coarse grid A with n_a elements
- Precompute F(a',a,z) where a' belongs to the coarse grid A. We precompute only this, not F on the finer grid
- Guess V_0(a,z)
- Start value function iteration
- Compute continuation value EV(a',z) = \sum_{z'} V_0(a',z') \pi(z,z') using matrix multiplication
- Compute g(a,z) = \arg \max_{a'\in A} \{ F(a',a,z)+\beta EV(a',z) \} where g(a,z) is the location of the optimal a' on the coarse grid that maximizes the right-hand side of the Bellman equation
- Define a finer grid but only in the subinterval \mathcal{A} = [A(g(a,z)-1), A(g(a,z)+1)]. Special care must be taken if g(a,z)=1 or g(a,z)=n_a.
- Compute F(a',a,z) with a' on the finer grid, using arrayfun. This is done for every iteration of the VFI.
- Do another maximization, this time on the finer grid, using linear interpolation of the continuation value EV.
- Continue until convergence
I hope this is not too confusing.
The method I wrote above should be slower than post-GI because it is calling arrayfun every time within each iteration of the value function, but more accurate.
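To make this more concrete, here is a rough MATLAB sketch of what I have in mind (all the names, F_coarse, ReturnFn, a_grid, z_grid, pi_z, n_fine, are illustrative, not toolkit code):

```matlab
% F_coarse(a',a,z) is precomputed on the coarse grid a_grid (n_a points).
V = zeros(n_a,n_z);                       % guess V_0
dist = Inf;
while dist > tol
    EV = V*pi_z';                         % EV(a',z) = sum_{z'} V(a',z')*pi(z,z')
    % coarse-grid maximization: g(a,z) = location of the optimal a' on the coarse grid
    [~,gind] = max(F_coarse + beta*reshape(EV,[n_a,1,n_z]),[],1);
    gind = squeeze(gind);
    Vnew = zeros(n_a,n_z);
    for zi = 1:n_z
        for ai = 1:n_a
            % finer grid only on [A(g-1),A(g+1)], taking care at the edges
            jlo = max(gind(ai,zi)-1,1);  jhi = min(gind(ai,zi)+1,n_a);
            afine = linspace(a_grid(jlo),a_grid(jhi),n_fine)';
            Ffine = arrayfun(@(ap) ReturnFn(ap,a_grid(ai),z_grid(zi)),afine);  % recomputed every iteration
            EVfine = interp1(a_grid,EV(:,zi),afine);                           % linear interpolation of EV
            Vnew(ai,zi) = max(Ffine + beta*EVfine);
        end
    end
    dist = max(abs(Vnew(:)-V(:)));
    V = Vnew;
end
```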
As far as I understand, post-GI might deliver the same solution as pre-GI, and in cases where they do differ, pre-GI is more accurate. Is my understanding correct?
Correct. If vfoptions.maxaprimediff is large enough (in the examples I have done, anything from 2 to 10 was plenty; for most models 2 to 4 was enough) then pre-GI and post-GI give the same solution. If vfoptions.maxaprimediff is too small, then pre-GI is the most accurate, followed by post-GI, followed by not using grid interpolation at all.
The method you describe is like doing the second stage of post-GI, but recomputing the ReturnFn matrix for it within each iteration, so that it is allowed to differ across iterations. That method will achieve the full accuracy of pre-GI, but will be slower than post-GI (I expect it is slower, though I have not tried; technically it depends on how many iterations you need, as the ReturnFn matrix is smaller within each iteration than it would be for post-GI; you are essentially setting vfoptions.maxaprimediff=1 within each iteration. If the number of iterations is small enough, your suggestion would be faster).
This is close to the kind of thing I was suggesting when I said “have 2 or even 4 post-GI stages, you could even interpolate further”. I was thinking that rather than recompute the ReturnFn matrix every iteration you could create a ReturnFn matrix, iterate until convergence (rather than just once as in your example), then create a new ReturnFn matrix and iterate until convergence again; each stage would just use a small value like vfoptions.maxaprimediff=2 or even =1, which would correspond to the case in your suggestion.
Worth noting that because the second-stage post-GI ReturnFn matrix is just based on the finer grid for ±vfoptions.maxaprimediff, creating it takes very little of the total VFI runtime. So recomputing it each iteration like you suggest is a plausible option. Would kind of just need to code a few different alternatives and see which is fastest, as it is not obvious a priori.
In any case, post-GI already gives a big runtime and accuracy improvement, and advanced users can easily make sure to set vfoptions.maxaprimediff so that post-GI gives the full accuracy of pre-GI. Maybe I will revisit something more advanced next year, but for now I am happy with the gains.
I have tested pre-GI vs post-GI on a simple Aiyagari model, using a standard calibration. Unfortunately, post-GI is faster but gives wrong results. Maybe only slightly wrong…
Grid sizes are: 1000 points for assets, and 7 points for exogenous shock
Start VFI...
vfoptions.gridinterplayer = 1
vfoptions.preGI = 1
Run time VFI = 1.186103
Start VFI...
vfoptions.gridinterplayer = 1
vfoptions.preGI = 0
Run time VFI = 0.924625
Difference pre-GI post-GI, value function = 0.000000
Difference pre-GI post-GI, policy function = 2.000000
There is a difference in the policy function. You can find my code here, see until line 68 of main: GitHub - aledinola/AiyagariFortranMatlab: Aiyagari model solved in Matlab and in Fortran, comparison
I expect the difference is just because vfoptions.maxaprimediff is too small. I also expect that the value fn difference is not zero, but is more like 1e-9 so just appears rounded to zero as you have forced six decimal places.
Solve it with pre-GI, and then solve it without grid interpolation. Then calculate A=max(PolicyA(1,:,:)-PolicyB(1,:,:)), then set vfoptions.maxaprimediff=A+1. Once you do this the post-GI should then give same solution as the pre-GI. [PolicyA and PolicyB are the pre-GI and without GI solutions, the +1 in A+1 is because the pre-GI will report lower grid point]
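For concreteness, the check might look something like this (PolicyA is the pre-GI policy, PolicyB the one solved without GI; the names are just whatever you called them):

```matlab
% First index is the (lower) grid point of the choice of next-period assets.
A = max(PolicyA(1,:,:)-PolicyB(1,:,:),[],'all');
vfoptions.maxaprimediff = A+1;   % +1 because pre-GI reports the lower grid point
```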
In bigger models the post-GI is more relevant, as it reduces memory use and runtimes by much more in larger models. It isn’t going to give you much gain in Aiyagari model.
Yes, the two methods give the same result if I set maxaprimediff=6. But then if I increase n_a the results are again different. It seems that post-GI is a bit fragile. It’s good to have it in the toolkit but I wouldn’t set it as default for gridinterplayer. It’s a bit like divide and conquer, faster but dangerous.
If I have time, I was thinking about coding up a couple of things, and then submitting them to you for review:
- post-GI using the pseudo-code I outlined in my post above
- Howards sparse for gridinterplayer
Do you think it might be a good idea? My plan is to do this for the project I have with infinite horizon and entrepreneurs, so I might do something a bit project-specific but it could be easily recycled for the toolkit, I think.
post-GI is fragile, but it remains substantially more accurate than without GI, and for anything beyond the basics pre-GI just runs out of memory. So I am comfortable with post-GI as the default [it is only the default when you explicitly say you want to use the grid interpolation layer; post-GI guarantees most, but not always all, of the extra accuracy from the grid interpolation layer]. At some point I will go and add an additional stage or two of post-GI. Given your discomfort I might add it sooner rather than later.
For sure, if you do code either/both of the two you suggest I will certainly incorporate them into the toolkit.
For the entrepreneurial choice models, you probably want to start from modifying the “postGI2B” code. This is where there are two endogenous states and GI is applied only to the first of the two (this is what the 2B refers to, two endo states but apply only to the first). Here is the main grid interpolation layer in InfHorz command, you can see all the if-statements about deciding which setting to run. For solving Kitao (2008) I am using “postGI2B_nod” while for solving Bruggemann (2021) I am using “Refine_postGI2B”. On my desktop I need vfoptions.lowmemory=1 when solving B2021. [Entrepreneurial-choice in both these two models has two endogenous states, the first is assets and the second is entrepreneur-worker-status. K2008 has two markovs, B2021 has three markovs. K2008 has exogenous labor supply, so no decision variables. B2021 has endogenous labor supply, so one decision variable.]
One question about the transition path: does it converge only with interpolation or also with pure discretization? Pure discretization should be faster, and if it works it could be used to get a good initial guess for the prices on the transition path.
Currently it has vfoptions.gridinterplayer=1. But you can turn it off and see how it goes. Obviously it will be faster per-transition-path-iteration, but I don’t expect it will make the convergence more stable. The reason the convergence is slow is largely because the damping parameters are crazy small, which is because otherwise the convergence was unstable.
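For context, the shooting-algorithm update of the price path with damping looks roughly like the following sketch (solve_backward_forward is a hypothetical stand-in for solving the value functions backward and the agent distribution forward along the path); with a very small damping the update is stable but needs many iterations, which is what is happening here:

```matlab
% Illustrative damped fixed-point (shooting) update on a guessed price path.
pricepath = pricepath0;
for iter = 1:maxiter
    pricepath_implied = solve_backward_forward(pricepath);   % hypothetical: implied prices given the guessed path
    dist = max(abs(pricepath_implied(:)-pricepath(:)));
    if dist < tol, break; end
    pricepath = damping*pricepath_implied + (1-damping)*pricepath;   % small damping => slow but stable
end
```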
I read the file BoarMidrigan2022/BoarMidrigan2022.m at main · robertdkirkby/BoarMidrigan2022 · GitHub and I don’t understand something regarding the GE conditions (it might be a mistake?).
The market clearing condition for capital is set as
GeneralEqmEqns_pre.capitalmarket=@(r,K,L,alpha,delta) r-(alpha*(K^(alpha-1))*(L^(1-alpha))-delta);
which is equivalent to r = \alpha K^{\alpha-1} L^{1-\alpha} - \delta,
and K is defined as aggregate assets in the code. But in this model, the asset market clearing condition is K + B = A (private capital plus government debt equals total household assets), hence the market clearing in the code should be written as r = \alpha (A - B)^{\alpha-1} L^{1-\alpha} - \delta.
(In the interest of clarity, one could define K as an intermediate variable equal to assets minus government debt, but I don’t know how to do it.)
In any case, the market clearing condition for capital seems incorrect to me. It follows that the other market clearing conditions are also incorrect: private capital K should always equal total assets minus government debt.
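For example, the corrected condition might look something like this (just a sketch; I am assuming aggregate household assets and government debt would be passed to the general eqm eqns as A and B, whatever they are actually called in the code):

```matlab
% Sketch only: capital market condition with K replaced by assets minus government debt.
GeneralEqmEqns_pre.capitalmarket=@(r,A,B,L,alpha,delta) r-(alpha*((A-B)^(alpha-1))*(L^(1-alpha))-delta);
```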
