How to speed up model generation?
#1
Hi,

At the Danish Energy Agency we have a large model; currently, our runtime is in excess of 2 hours. So, to allow faster development, adjusting and testing, we have divided the model into separate sectors - a kind of partial modelling - and introduced artificial imports for the commodities that cross sectors (e.g. electricity), active only in the partial models.

Each team has the opportunity to develop, adjust and test its part of the model without running the whole model, thus saving time.

Our experience is that the total runtime of the whole model is much, much longer than the combined runtimes of the partial models: 2 hours compared to a combined partial runtime of about 30 minutes.

But it isn't the CPLEX solver that's the issue here; model generation is the culprit. CPLEX can run several threads and flies away, but model generation runs on a single thread and is excruciatingly slow.

I think the problem lies in dimensionality. The full model grows quickly in the number of processes and commodities, and the number of commodity-process permutations goes up accordingly. This becomes apparent in the model generation step.

For example, in the "prepparm.gms" TIMES source file, the controlling set is defined with

Code:
UNCD7(%2,LL--ORD(LL),%3%4)$(%9(%1(%2,'%DFLBL%',%3))>-.5) $= %1(%2,LL,%3);


which for the PRC_MARK parameter call translates to

Code:
UNCD7(R,LL--ORD(LL),P,ITEM,C,BD,'0')$((PRC_MARK(R,'0',P,ITEM,C,BD))>-.5) $= PRC_MARK(R,LL,P,ITEM,C,BD);


Here, for example, LL is aliased from ALLYEAR, which is defined over 1980:2200, and ITEM is aliased from the universe set - which in the PRC_MARK case is just the commodity.

Just replacing LL with DM_YEAR, which contains only the years of the model horizon, reduces the runtime of that line quite a lot.

So it feels like GAMS spends a lot of time on non-existing combinations, perhaps on evaluating the conditional for those non-existing combinations.
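
To illustrate this outside TIMES, here is a toy GAMS sketch (my own construction with made-up set sizes, and omitting the --ORD(LL) lag of the real line). Because the $ condition also holds for zero values, GAMS cannot exploit sparsity and must visit the full Cartesian product of the controlling sets; a restricted year subset shrinks that product considerably:

Code:
* Toy sketch, not TIMES code: why a smaller controlling domain helps.
Set ll 'all years, mimicking ALLYEAR' / y1980*y2200 /;
Set dm_year(ll) 'model horizon only'  / y2020*y2050 /;
Set r 'regions'   / r1*r10 /;
Set p 'processes' / p1*p1000 /;
Parameter dat(r,ll,p);
dat(r,dm_year,p)$(uniform(0,1) < 0.1) = 1;
Parameter out(r,ll,p);
* Slow: 0 > -0.5 is true, so all 10*221*1000 (r,ll,p) tuples are visited.
out(r,ll,p)$(dat(r,ll,p) > -0.5) = dat(r,ll,p);
* Faster: the controlling domain covers only the 31 horizon years.
out(r,dm_year,p)$(dat(r,dm_year,p) > -0.5) = dat(r,dm_year,p);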

Would it be possible to make adjustments so that e.g. genparm.gms only works on actual combinations of processes and commodities (perhaps via the RPC set)?
Or maybe run the model generation in parallel, using more threads on the CPU?

Or perhaps any other suggestions?

Thanks,
Lars
#2
Could you provide a listing file demonstrating an example of long model processing and generation time? From such a listing file (produced with the default profile option), one would be able to see the performance bottlenecks and focus on them.
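
For reference, a minimal sketch of how such a profile can be requested (standard GAMS settings; the tolerance value is only an example):

Code:
* On the GAMS command line:
*   gams mymodel.gms profile=1 profileTol=0.1
* or equivalently inside the .gms/.run file:
option profile = 1;
* report only statements taking at least 0.1 seconds
option profileTol = 0.1;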

> UNCD7(R,LL--ORD(LL),P,ITEM,C,BD,'0')$((PRC_MARK(R,'0',P,ITEM,C,BD))>-.5) $= PRC_MARK(R,LL,P,ITEM,C,BD);
> Just replacing LL with DM_YEAR, which contains only the years of the model horizon, reduces the runtime of that line quite a lot.


I don't quite understand what you suggest.  You cannot replace LL with DM_YEAR here, because 1) DM_YEAR is not an ordered set, and 2) the logic would then not make sense for the original purpose.
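
To illustrate the first point with a generic toy example (my own sketch, not TIMES code): ORD() and the lag/lead operations require a static, ordered set, and a dynamic subset does not qualify:

Code:
Set ll 'static, ordered set' / y2020*y2030 /;
Set dyn(ll) 'dynamic subset';
dyn(ll)$(ord(ll) > 5) = yes;
Parameter a(ll);
a(ll) = ord(ll);      * fine: ll is static and ordered
* a(dyn) = ord(dyn);  * rejected by GAMS: ord() needs an ordered (static) set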

Last year we had an ETSAP project together with GAMS on the performance of the TIMES code, and the conclusion was that it performs pretty well (the GAMS people made that judgement). The project also implemented a number of improvements, based on test models. But of course, we apparently did not have your model as a test model; we had a number of other big models. Please provide a test model (the TIMES input files) if you wish the maintainer to improve the performance with respect to your model.

I would myself be happy to implement performance improvements into the code where possible, as soon as I have a test model demonstrating some bad performance.

> Would it be possible to make adjustments so that e.g. genparm.gms only works on actual combinations of processes and commodities (perhaps via the RPC set)?

Hmm... but there is no such code file "genparm.gms" in the TIMES code. Maybe that is your own supplementary code file?
#3
Thanks for the reply.
Yes, rereading the post I realised that I had also removed the ORD(LL) part of the line, so the order of the set becomes irrelevant and DM_YEAR works. It obviously doesn't work when the ORD() is kept.
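For concreteness, the modified line was roughly as follows (a reconstruction of the experiment described above, not the original TIMES code):

Code:
UNCD7(R,DM_YEAR,P,ITEM,C,BD,'0')$((PRC_MARK(R,'0',P,ITEM,C,BD))>-.5)
   $= PRC_MARK(R,DM_YEAR,P,ITEM,C,BD);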
Perhaps I just hoped that there was an easier fix or workaround.
I had been looking at the review of the performance project you mentioned, and it is unclear to me how many of the suggested improvements have been introduced into the latest TIMES code.

I will try and post listing files showing the bottlenecks, but one is definitely the genparm.gms call that does the interpolations.
I will also need to make a test version of the model to illustrate my issue - which, I know, is partly our own fault for having so many processes and commodities. Unfortunately, in a larger organisation, and with a model that has evolved over a long time, this is a difficult thing to change.

This will take some time.

Perhaps I will also try to do my own interpolations, as we work heavily in R for data management before setting up the Excel files for VEDA. It won't be a problem to handle the interpolations there, and then not ask TIMES to do them.

regards,
Lars
#4
> I had been looking at the review of the performance project you mentioned, and it is unclear to me how many of the suggested improvements have been introduced into the latest TIMES code.

Basically all of them were implemented, but the resulting performance improvement was relatively modest. However, different models may manifest quite different performance issues, depending on how intensively they use certain TIMES features. I am therefore quite sure your performance issue could be largely eliminated, if you would first let us see it.

> Perhaps I will also try to do my own interpolations, as we work heavily in R for data management before setting up the Excel files for VEDA. It won't be a problem to handle the interpolations there, and then not ask TIMES to do them.

Well, I think it would be very kind of you to help improve the TIMES code in this respect; I think the ETSAP community would be thankful for such help. And in this case, helping to improve the performance would be very simple: just provide the input files (*.DD and *.RUN) and the listing file (*.LST) for a test model demonstrating the performance issue. I cannot see why you would choose not to help us here.
#5
Perhaps I came across wrongly.

I would very much like to help.
My suggestion of trying to do my own interpolations was actually meant to help identify my issue: running with and without the interpolation rules done by TIMES, to see whether the combined runtime of the partial models then gets closer to the runtime of the full model.

I would by all means rather have peer-reviewed code doing these tasks than something I have written myself, which I would then also have to maintain myself.

Unfortunately, I can't provide the full *.DD and *.RUN files because of confidentiality issues - here my hands are tied :-(
I will try to set up a test model that shows my issue, and then the community can evaluate it if needed.
regards
Lars
#6
I fully agree with Antti.
#7
Sorry, it was the prepparm.gms file in the TIMES code.
#8
Ok, I do understand your confidentiality point, even though many other modellers with equally confidential models have trusted me in that respect in the past (with me keeping the *.DD and *.RUN files confidential). However, I cannot really see any confidentiality concerns about the listing file (with no equation listing), so could you consider at least providing such a *.LST file, which would show the performance profile?
#9
In case you are in fact seeing a notable performance issue with that PRC_MARK processing: just to illustrate that this may not be easily visible in other models, I have a model with moderate use of FLO_MARK/PRC_MARK (~100,000 data entries), and the prepparm processing for it takes only 0.16 seconds. With test results of that kind, it has thus far not been identified as a notable performance issue, because it does not really stand out in the test models.

I guess you might be using FLO_MARK/PRC_MARK very intensively in your model (perhaps millions of data entries)?  As mentioned, it would be useful to see the listing file from your case to get a better understanding of the performance issue (with unmodified TIMES code, of course).
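
If you want to check the order of magnitude on your side, a standard GAMS cardinality query will show it (a generic sketch, to be run where the data is loaded):

Code:
* Count the stored data entries of PRC_MARK
Scalar prc_mark_entries 'number of stored PRC_MARK records';
prc_mark_entries = card(PRC_MARK);
display prc_mark_entries;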