A recent commit added this error message to the toolkit:
On transition paths you can only use vfoptions.lowmemory>0 when using transpathoptions.fastOLG=1, because otherwise the runtimes will anyway be so slow as to be essentially unusable
To that I say “yes, and no”. What I have measured is that for a large model, which needs a lowmemory option just to fit, fastOLG is just another dimension (age) that needs to be either prioritized or iterated as appropriate.
In my case, I have 32GB of GPU RAM and 265GB of CPU RAM (of which 128GB can be “shared” with the GPU). The statistics I measured were:
fastOLG: 26-31GB GPU RAM in use; 16-27GB shared GPU RAM, 74GB of CPU RAM (includes shared GPU RAM), and a GPU utilization rate that fluctuated mostly between 40% and 70%. Total runtime for a transition iteration: 2431 seconds. There was no indication that the GPU was thrashing (nothing was charged against time spent in Copy).
slowOLG: 9.5GB GPU RAM; no shared GPU RAM, 40GB of CPU RAM, and a GPU utilization rate above 98%. Total runtime for a transition iteration: 1025 seconds.
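For a sense of scale, here is the ratio of the two runtimes above as a quick calculation. It is written in Python purely for illustration (the toolkit itself is MATLAB), the variable names are my own, and the only inputs are the two runtimes reported above.

```python
# Runtimes copied from the measurements above, in seconds per transition iteration.
runtime_fast_olg = 2431  # fastOLG run
runtime_slow_olg = 1025  # slowOLG run

# How many times longer the fastOLG iteration took than the slowOLG one.
speedup = runtime_fast_olg / runtime_slow_olg

# Fraction of the fastOLG runtime saved by switching to slowOLG.
time_saved = 1 - runtime_slow_olg / runtime_fast_olg

print(f"slowOLG speedup over fastOLG: {speedup:.2f}x")
print(f"runtime saved per iteration: {time_saved:.0%}")
```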
I think that as long as the GPU compute cores are fully occupied, the toolkit is making best use of the cores, and there’s no need to enforce that a particular dimension be loaded into the GPU.
It might be useful to create a test harness that can benchmark various iteration plans so that the user can put tunings into their models, but that’s just a thought for the future.
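To make the harness idea a bit more concrete, here is a minimal sketch of what I have in mind. It is written in Python only to show the shape (the toolkit is MATLAB), and everything in it is hypothetical: `benchmark_plans`, `fastest_plan`, and the plan names are mine, and the dummy workloads stand in for real transition iterations under different lowmemory/fastOLG settings. A real harness would also need to wait for the GPU to finish its queued work before reading the clock, and would probably want to record peak memory alongside runtime.

```python
import time
from typing import Callable, Dict


def benchmark_plans(plans: Dict[str, Callable[[], None]],
                    repeats: int = 3) -> Dict[str, float]:
    """Run each candidate plan `repeats` times; return best wall-clock seconds."""
    results: Dict[str, float] = {}
    for name, run in plans.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run()  # hypothetical: one transition iteration under this plan
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


def fastest_plan(results: Dict[str, float]) -> str:
    """Name of the plan with the smallest measured runtime."""
    return min(results, key=results.get)


# Dummy workloads standing in for real solves (assumption: the real ones
# would call the toolkit with different option settings).
demo_plans = {
    "iterate_over_age": lambda: time.sleep(0.005),
    "vectorize_over_age": lambda: time.sleep(0.02),
}

demo_results = benchmark_plans(demo_plans, repeats=2)
for plan_name, seconds in demo_results.items():
    print(f"{plan_name}: {seconds:.4f} s")
print("fastest:", fastest_plan(demo_results))
```

The point of taking the best of several repeats is to reduce noise from warm-up effects; for runs as long as the ones I measured, a single repeat per plan would probably be enough.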