Constrain General Eqm Parameters

You can now constrain the general eqm parameters (the ones named in GEPriceParamNames). This works for both infinite and finite horizon models (and is now explained in an Appendix of the Intro to OLG Models; the following is pretty much copy-pasted from there).

Say you are finding the wage, w, in general equilibrium. You know it has to be positive, and if you try to solve the model with a negative wage then it gives errors. You can set constraints on the general equilibrium parameters to deal with these situations. There are three possible constraints, all of which are applied by specifying the name of the parameter you want to constrain. In the following, we will use w as the name of the parameter (in GEPriceParamNames) that we wish to constrain.

You can constrain a parameter to be positive using
heteroagentoptions.constrainpositive={'w'}
You can constrain a parameter to be between 0 and 1 (e.g., because it is a probability) using
heteroagentoptions.constrain0to1={'w'}
You can constrain a parameter to be between A and B using
heteroagentoptions.constrainAtoB={'w'}
and then you specify the values of A and B, e.g. that w is between 3 and 5, with
heteroagentoptions.constrainAtoBlimits.w=[3,5]
[Use the name of the parameter as the field, '.w', and then a vector of two values for the range to apply to that parameter]


Note that this is the exact same setup that you use to constrain parameters when doing calibration or GMM estimation, except of course that in those cases you put these into caliboptions and estimoptions, respectively, rather than heteroagentoptions.


Small question: why is the 0-1 case specified separately from A-B?

Essentially because A-B works by converting to 0-1 and then using that. So I had to implement 0-1 as part of implementing A-B, and then there seemed no reason not to give the user direct access to 0-1: even though it is a subcase of A-B, it is a commonly used one, e.g. for probabilities.

The rest of this just gives unnecessary detail explaining why implementing A-B works via 0-1…

Internally, ‘constrain positive’ replaces \theta (which is \in [0,\infty)) with \hat{\theta} \equiv \log(\theta) (which is \in \mathbb{R}), then performs the unconstrained optimization using \hat{\theta}, and in each iteration inside the optimization just evaluates \theta = \exp(\hat{\theta}) (which is \in (0,\infty)) before passing it to the model.
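Just to illustrate the idea (the toolkit itself is MATLAB; this is a hypothetical Python sketch with made-up function names, not the toolkit's code), the positive constraint is a simple log/exp round trip:

```python
import math

def to_unconstrained(theta):
    # hat-theta = log(theta): maps (0, Inf) onto all of R
    return math.log(theta)

def to_constrained(theta_hat):
    # theta = exp(hat-theta): maps R back into (0, Inf)
    return math.exp(theta_hat)

w_hat = to_unconstrained(2.5)   # the optimizer works with this unconstrained value
w = to_constrained(w_hat)       # the model always sees a strictly positive w
```

Whatever (finite) value the optimizer proposes for w_hat, the model only ever sees a strictly positive w.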

Similarly, ‘constrain 0 to 1’ replaces \theta (which is \in [0,1]) with \hat{\theta} \equiv \log(\frac{\theta}{1-\theta}) (which is \in \mathbb{R}), then performs the unconstrained optimization using \hat{\theta}, and in each iteration inside the optimization just evaluates \theta = 1/(1+\exp(-\hat{\theta})) (which is \in (0,1)) before passing it to the model.

Finally, ‘constrain A to B’ first replaces \theta (which is \in [A,B]) with \tilde{\theta} \equiv \frac{\theta - A}{B-A} (which is \in [0,1]), and then does the same as constrain 0 to 1 (so \hat{\theta} \equiv \log(\frac{\tilde{\theta}}{1-\tilde{\theta}}) (which is \in \mathbb{R})), does the unconstrained optimization using \hat{\theta}, and in each iteration inside the optimization first evaluates \tilde{\theta} = 1/(1+\exp(-\hat{\theta})) (which is \in (0,1)) and then \theta = A+(B-A)\tilde{\theta} (which is \in (A,B)) before passing it to the model.
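In code, the A-to-B constraint is just the 0-to-1 logit transform wrapped in a linear rescale (again a hypothetical Python sketch, not the MATLAB implementation):

```python
import math

def AtoB_to_unconstrained(theta, A, B):
    t = (theta - A) / (B - A)        # rescale [A,B] onto [0,1]
    return math.log(t / (1.0 - t))   # then the 0-to-1 logit: (0,1) -> R

def unconstrained_to_AtoB(theta_hat, A, B):
    t = 1.0 / (1.0 + math.exp(-theta_hat))  # inverse logit: R -> (0,1)
    return A + (B - A) * t                  # rescale back onto (A,B)

w_hat = AtoB_to_unconstrained(4.0, 3.0, 5.0)  # w constrained to [3,5]
w = unconstrained_to_AtoB(w_hat, 3.0, 5.0)    # round trip recovers w = 4.0
```

Setting A=0 and B=1 makes the rescale a no-op, which is exactly why 0-to-1 falls out as a subcase.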

Note that one minor weakness of all this is that the constrained parameters can’t quite hit the endpoints. So if, e.g., you constrain 0-to-1 and the true solution is 0 (or 1), you will get a solution like 0.00001 (or 0.99999). You can see this in each paragraph above: the sets are closed (square brackets [,]) to begin with, and open (round brackets (,)) at the end.
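You can see the endpoint issue numerically: for any finite input, the inverse logit stays strictly inside (0,1), only approaching the endpoints (illustrative Python, not toolkit code):

```python
import math

def inv_logit(x):
    # theta = 1/(1+exp(-hat-theta)): strictly inside (0,1) for any finite input
    return 1.0 / (1.0 + math.exp(-x))

near_zero = inv_logit(-20.0)  # roughly 2e-9: close to, but not exactly, 0
near_one = inv_logit(20.0)    # roughly 1 - 2e-9: close to, but not exactly, 1
```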

Thanks for the answer!

"in each iteration inside the optimization just evaluates \hat{\theta} = \exp(\theta) "

What about possible overflows? I have used the exp/log transformation in the past, but if you have large numbers, the exp of a large number can overflow. For example, in my Matlab, exp(x)=Inf if x>=710.

John d’Errico, in his implementation of fminsearchbnd, uses a more robust transformation. This post on the Matlab forum might also be useful.

Not suggesting to change this, just something to be aware of.

I kind of just fudge these internally. If your x goes above something very big, I just do min(x,C), where C is something large, before doing operations like ‘take the exponential’; so effectively x is capped at C, and thus exp(x) is capped at exp(C), hence avoiding the Infs. There is of course a loss, but it is pretty minuscule, as it simply implicitly caps the maximum value of the variable constrained to be positive at about 5x10^21 (I cannot think of an economic application where you need a parameter value larger than this; it would tend to be a sign the model is not well constructed, and even a model of global GDP measured in cents won’t get close to this :rofl:).
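The clamp idea can be sketched as follows (hypothetical Python, with C=50 as an assumed cap matching the exp(C) ≈ 5x10^21 figure above; the actual toolkit code is MATLAB):

```python
import math

def safe_exp(theta_hat, cap=50.0):
    # Clamp hat-theta at some large C before exponentiating,
    # so exp() can never overflow to Inf
    return math.exp(min(theta_hat, cap))

capped = safe_exp(1000.0)  # returns exp(50), about 5.18e21, rather than Inf
```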

I have no objection to an improved implementation, but it is simply what I came up with at the time and works for everything I have tried so far.

You can see exactly how the handling of the constraints is done in the first part of the ‘objective function’ used for calibration/estimation. Currently it is lines 6-32ish (looks like C is 50 :wink:, which gives the exp(C)=5x10^21 described above). This is the ‘convert \hat{\theta} back into \theta’ part described in my previous post:
