Posts: 14
Threads: 6
Likes Received: 0 in 0 posts
Likes Given: 3
Joined: Apr 2018
Hi,
At the Danish Energy Agency we have a large model; currently, our runtime is in excess of 2 hours. To allow faster development, adjustment and testing, we have divided the model into separate sectors - a kind of partial modelling - and introduced artificial imports of the commodities that cross sectors (e.g. electricity), so that these remain active in the partial models.
Each team has the opportunity to develop, adjust and test its part of the model without running the whole model, thus saving time.
Our experience is that the total runtime for the whole model is much, much longer than the combined runtimes of the partial models: 2 hours compared to a combined 30 minutes for the partials.
But it isn't the CPLEX solver that's the issue here; model generation is the culprit. CPLEX can run several threads and flies along, but model generation runs on a single thread and is excruciatingly slow.
I think the problem lies in dimensionality. The full model grows quickly in the number of processes and commodities, and the number of process-commodity permutations grows accordingly. This is apparent in the model generation step.
For example, in one section of the "prepparm.gms" TIMES source file, the controlling set is defined with
Code: UNCD7(%2,LL--ORD(LL),%3%4)$(%9(%1(%2,'%DFLBL%',%3))>-.5) $= %1(%2,LL,%3);
which for the PRC_MARK parameter call translates to
Code: UNCD7(R,LL--ORD(LL),P,ITEM,C,BD,'0')$((PRC_MARK(R,'0',P,ITEM,C,BD))>-.5) $= PRC_MARK(R,LL,P,ITEM,C,BD);
Here, for example, LL is aliased from ALLYEAR, which is defined over 1980:2200, and ITEM is aliased from the universe set - which for the PRC_MARK case is just the commodity.
Just replacing the LL with DM_YEAR, which holds just the years of the model horizon, reduces the runtime for that line quite a lot.
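To give a rough feel for the scale involved (the horizon below is just an assumed example, not our actual model years), a small self-contained GAMS sketch:
Code:
* Illustration only: ALLYEAR holds 221 year labels, while an assumed
* 2020-2060 horizon holds 41, so a controlling year index restricted to
* the horizon scans roughly 5 times fewer candidate tuples per
* (R,P,ITEM,C,BD) combination.
SET ALLYEAR 'all years' / 1980*2200 /;
SET DM_YEAR(ALLYEAR) 'assumed model horizon' / 2020*2060 /;
SCALAR RATIO 'reduction factor';
RATIO = CARD(ALLYEAR) / CARD(DM_YEAR);
DISPLAY RATIO;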
So it feels like GAMS spends a lot of time on non-existing combinations, perhaps in testing the conditional for those non-existing combinations.
Would it be possible to make adjustments so that, e.g., genparm.gms only works on actual combinations of processes and commodities (perhaps the RPC set)?
Or maybe have the model generation run in parallel, using more threads on the CPU?
Or perhaps any other suggestions?
Thanks,
Lars
Posts: 1,981
Threads: 26
Likes Received: 66 in 57 posts
Likes Given: 20
Joined: Jun 2010
7 hours ago
(This post was last modified: 6 hours ago by Antti-L.)
Could you provide a listing file demonstrating an example of long model processing and generation times? From such a listing file (using the default profile option), one would be able to see the performance bottlenecks and focus on them.
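If the profile option is not already active in your run setup, it can be switched on without touching the TIMES code; a rough sketch using standard GAMS options (the exact placement depends on how you launch the run, e.g. from VEDA):
Code:
* Either pass profile=1 (and e.g. profiletol=0.1) on the GAMS command line,
* or set the options in the *.RUN file before the TIMES code is included:
OPTION PROFILE = 1;
* Optionally hide statements taking less than 0.1 seconds:
OPTION PROFILETOL = 0.1;
The listing file then shows the execution time of each statement, so the slow lines in the model generation stand out.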
> UNCD7(R,LL--ORD(LL),P,ITEM,C,BD,'0')$((PRC_MARK(R,'0',P,ITEM,C,BD))>-.5) $= PRC_MARK(R,LL,P,ITEM,C,BD);
> Just replacing the LL with DM_YEAR, which is just the years for the model horizon, reduces the runtime for that line quite a lot.
I don't quite understand what you suggest. You cannot replace LL with DM_YEAR here, because 1) DM_YEAR is not an ordered set, and 2) the logic would then not make sense for the original purpose.
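To illustrate point 1, a minimal sketch (the set names here are only illustrative, and the dynamic assignment is just one common way a set ends up unordered):
Code:
SET ALLYEAR / 1980*2200 /;
ALIAS (ALLYEAR, LL);
* A dynamically assigned subset is not an ordered set in GAMS:
SET DM_YEAR(ALLYEAR);
DM_YEAR('2020') = YES;
PARAMETER TEST(ALLYEAR);
* The next line would be rejected at compile time, because ORD (and the
* lag/lead operators, such as the one in LL--ORD(LL)) require an ordered set:
* TEST(DM_YEAR) = ORD(DM_YEAR);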
Last year we had an ETSAP project together with GAMS on the performance of the TIMES code, and the conclusion was that it does have pretty good performance (the GAMS people made that judgement). The project also implemented a number of improvements, based on test models. But of course, we apparently did not have your model as a test model; we had a number of other big models. Please provide a test model (the TIMES input files) if you wish the maintainer to improve the performance with respect to your model.
I would myself be happy to implement performance improvements in the code where possible, as soon as I have a test model demonstrating some bad performance.
> Would it be possible to make adjustments, so e.g. genparm.gms only works on actual combinations of processes and commodities (perhaps the RPC set)?
Hmm... but there is no such code file "genparm.gms" in the TIMES code. Maybe that is your own supplementary code file?
Posts: 14
Threads: 6
Likes Received: 0 in 0 posts
Likes Given: 3
Joined: Apr 2018
Thanks for the reply.
Yes, rereading the post I realised that I also removed the ORD(LL) part of the line, so the order of the set becomes irrelevant and DM_YEAR works. It obviously doesn't work when the ORD() is kept.
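To be concrete, the modified line looked roughly like this (a reconstruction for illustration only; as discussed, it no longer does what the original line is meant to do):
Code: UNCD7(R,DM_YEAR,P,ITEM,C,BD,'0')$((PRC_MARK(R,'0',P,ITEM,C,BD))>-.5) $= PRC_MARK(R,DM_YEAR,P,ITEM,C,BD);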
Perhaps I just hoped that there was an easier fix/workaround.
I had been looking at the review of the performance project you mentioned, and it is unclear to me how many of the suggested improvements have made it into the latest TIMES code.
I will try to post listing files showing the bottlenecks, but one is definitely the genparm.gms call that does the interpolations.
I will also need to make a test version of the model to illustrate my issue, which I know is partly our own fault for having so many processes and commodities - unfortunately, in a larger organisation, and with a model that has evolved over a long time, this is a difficult issue.
This will take some time.
Perhaps I will also try to do my own interpolations, as we work heavily in R for data management before setting up the Excel files for VEDA. It won't be a problem to handle this there, and then not ask TIMES to do the interpolations.
regards,
Lars
Posts: 1,981
Threads: 26
Likes Received: 66 in 57 posts
Likes Given: 20
Joined: Jun 2010
6 hours ago
(This post was last modified: 6 hours ago by Antti-L.)
> I had been looking at the review of the perfomance project you mentioned, and it is unclear for me how many of the suggested improvements have been introduced in the latest TIMES code.
Basically all of them were implemented, but the resulting performance improvement was relatively modest. However, different models may manifest quite different performance issues, depending on how intensively they use certain TIMES features. I am therefore quite sure your performance issue could be largely eliminated, if you would first let us see it.
> Perhaps I will also try to do my own interpolations, as we work heavily in R to do data management before setting up excel files for veda. It won't be a problem to handle this here, and then don't ask TIMES to do them.
Well, I think it would be very kind of you if you would help improve the TIMES code in this respect. I think the ETSAP community would be thankful to you for such help. And in this case, helping to improve the performance would be very simple: just provide the input files (*.DD and *.RUN) and the listing file (*.LST) for a test model demonstrating the performance issue. I cannot see why you would choose not to help us here to improve the performance.
Posts: 14
Threads: 6
Likes Received: 0 in 0 posts
Likes Given: 3
Joined: Apr 2018
6 hours ago
(This post was last modified: 6 hours ago by lbtermansen.)
Perhaps I came across the wrong way.
I would very much like to help.
My suggestion of doing my own interpolations was actually meant to help identify my issue: running with and without the interpolation rules done by TIMES, to see whether the combined runtime of the partials then gets closer to the runtime of the full model.
I would on all accounts rather have peer-reviewed code do these tasks than something I have written myself and would then also have to maintain myself.
Unfortunately, I can't provide the full DD and RUN files because of confidentiality issues - here my hands are tied :-(
I will try to set up a test model that shows my issue. The community can then evaluate it if needed.
regards
Lars
Posts: 1,063
Threads: 42
Likes Received: 20 in 16 posts
Likes Given: 27
Joined: May 2010
Reputation: 20
I fully agree with Antti.
Posts: 14
Threads: 6
Likes Received: 0 in 0 posts
Likes Given: 3
Joined: Apr 2018
Sorry, it was the prepparm.gms file in the TIMES code.
Posts: 1,981
Threads: 26
Likes Received: 66 in 57 posts
Likes Given: 20
Joined: Jun 2010
5 hours ago
(This post was last modified: 5 hours ago by Antti-L.)
OK, I do understand your confidentiality point, even though many other modelers with equally confidential models have trusted me in that respect in the past (i.e. trusted me to keep the *.DD and *.RUN files confidential). However, I cannot really see any confidentiality concerns about the listing file (with no equation listing), so could you consider at least providing such a *.LST file, which would show the performance profile?
Posts: 1,981
Threads: 26
Likes Received: 66 in 57 posts
Likes Given: 20
Joined: Jun 2010
In case you are in fact seeing a notable performance issue with that PRC_MARK processing: just to illustrate that such an issue may not be easily seen in other models, I have a model with moderate use of FLO_MARK/PRC_MARK (~100,000 data entries), and the prepparm processing for it takes only 0.16 seconds. With test results of that kind, it has thus far not been identified as a notable performance issue, because it does not really stand out in the test models.
I guess you might be using FLO_MARK/PRC_MARK very intensively in your model (perhaps millions of data entries)? As mentioned, it would be useful to see the listing file from your case to get a better understanding of the performance issue (with unmodified TIMES code, of course).