Timeouts in newer MG5_aMC Versions — Configurable?

Asked by Zachary Marshall

Hi there,

We've been seeing an increasing number of timeouts when running MG5_aMC recently. This is correlated with our moves to newer versions, but I've not been able to identify a killer moment at which the timeouts started occurring. There are two places we've seen them so far.

1) In model loading, around L3705 of madgraph/interface/common_run_interface.py:

             AskforEditCard.update_dependent(interface, interface.me_dir, card, path, timer=20)

2) In LHAPDF loading, in get_lhapdf_version_static of madgraph/interface/common_run_interface.py, which seems to ultimately be called from here:

        self.do_update('dependent', timer=20)

We're trying to track down the source of these timeouts on our side, to see if anything has changed, but I wanted to ask two questions of you all in the meantime. First: has anything changed to do with these timeouts or their treatment in recent releases? Second: could the timeouts be made configurable (rather than hard-coded to 20) so that we can make them a bit longer in systems that need a little extra time?
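
To make the second request concrete, here is a minimal sketch of what a configurable timeout could look like. The environment variable name `MG5_DEPENDENT_TIMEOUT` and the helper `resolve_timeout` are hypothetical and only for illustration; nothing in this thread suggests MG5_aMC reads such a setting today.

```python
import os

DEFAULT_TIMEOUT = 20  # seconds; the value hard-coded in the two calls quoted above


def resolve_timeout(env=None, name="MG5_DEPENDENT_TIMEOUT", default=DEFAULT_TIMEOUT):
    """Return the timeout in seconds, or None to mean "no timer".

    `name` is a hypothetical variable, not an existing MG5_aMC option.
    """
    env = os.environ if env is None else env
    raw = env.get(name, "").strip().lower()
    if not raw:
        return default              # unset or empty: keep the current default
    if raw in ("none", "off"):
        return None                 # explicit request to run without a timer
    try:
        value = float(raw)
    except ValueError:
        return default              # unparsable value: fall back to the default
    return value if value > 0 else default
```

With something like this, a site that needs more time would export the variable once (for example to 60) rather than patching each release.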

Thanks,
Zach

Question information

Language: English
Status: Open
For: MadGraph5_aMC@NLO
Assignee: No assignee
Olivier Mattelaer (olivier-mattelaer) said :
#1

Hi,

So the reasons are multiple:
1) In older versions of the code, this function did not depend on LHAPDF, so the timeout was only triggered for slow/big models (and was handled correctly: stop the function, warn the user, and continue running without that optional feature).
2) We did not expect LHAPDF to be so slow that the timeout would occur within an LHAPDF call, so the timeout signal was/is not yet intercepted correctly if the timeout occurs during such a call.

In the previous thread about this issue I provided you with a patch to avoid it (I am not sure which solution I picked: either allowing infinite time for LHAPDF to respond, or changing the exception handling so that it cancels the call to LHAPDF if the timeout occurs).
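
The intended behaviour described above (stop the function, warn the user, continue without the optional feature) can be sketched with a generic signal-based timer. This is not the MG5_aMC implementation or the patch mentioned here; `run_with_timeout`, `StepTimeout`, and `slow_step` are hypothetical names. It assumes a Unix system and a call from the main thread, since it relies on SIGALRM.

```python
import signal
import time
import warnings


class StepTimeout(Exception):
    """Raised inside the guarded call when the timer expires."""


def run_with_timeout(func, seconds=None):
    """Run func() and return (result, timed_out).

    seconds=None runs without any timer (the "infinite time" option).
    """
    if seconds is None:
        return func(), False

    def _on_alarm(signum, frame):
        raise StepTimeout()

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return func(), False
    except StepTimeout:
        # Cancel the optional step, warn, and let the caller carry on.
        warnings.warn("optional step timed out after %s s; skipping it" % seconds)
        return None, True
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)   # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)   # restore the old handler


def slow_step():
    """Stand-in for a slow optional step."""
    time.sleep(1)
    return "done"
```

Calling `run_with_timeout(slow_step, seconds=0.1)` returns `(None, True)` after a warning, while `run_with_timeout(slow_step)` waits for the step to finish. Note that Python runs signal handlers only when the interpreter regains control, so a long blocking call into compiled code can delay the timeout.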

> Second: could the timeouts be made configurable (rather than hard-coded to 20) so that we can make them a bit longer in systems that need a little extra time?

I do not see the point of making that timeout configurable. Either you care about the feature, and then you have to run it explicitly (in which case it runs without a timer), or you do not really care about it, and then the default timer is reasonable. Having a mix of the two modes is obviously possible, but I do not see the point.

I do understand that the crash due to a super slow LHAPDF is an issue, but the patch should solve it without having to introduce an additional parameter.

Zachary Marshall (zach-marshall) said :
#2

Thanks Olivier. The point about LHAPDF is clear; what about the model loading? Should we also disable these?

I'd be ok with a top-level switch to disable them (and then it's up to us to do something sensible) — wouldn't that be nicer than our privately patching MG5_aMC to disable them for each new version?

Thanks,
Zach

Olivier Mattelaer (olivier-mattelaer) said :
#3

Which option do you want: always disable the call to the function, or only disable the timer?

I guess always disabling the call to the function can indeed make sense in your case, since this function is mainly for people who provide inconsistent BSM information in the param_card.
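
The difference between the two options can be spelled out as a small sketch. The switch values `"skip"`, `"untimed"`, and `"timed"` and the helper `plan_dependent_update` are hypothetical; no such option is confirmed to exist in MG5_aMC.

```python
def plan_dependent_update(mode, default_timer=20):
    """Map a hypothetical top-level switch to (call_function, timer_seconds)."""
    if mode == "skip":       # never call the function at all
        return False, None
    if mode == "untimed":    # call it, but let it run as long as it needs
        return True, None
    if mode == "timed":      # call it with the 20 s timer quoted in this thread
        return True, default_timer
    raise ValueError("unknown mode: %r" % (mode,))
```

Here `"skip"` corresponds to always disabling the call, and `"untimed"` to disabling only the timer.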

Cheers,

Olivier

> On 19 Oct 2023, at 17:40, Zachary Marshall <email address hidden> wrote:
>
> Question #708223 on MadGraph5_aMC@NLO changed:
> https://answers.launchpad.net/mg5amcnlo/+question/708223
>
> Status: Answered => Open

Zachary Marshall (zach-marshall) said :
#4

We're in MG5_aMC 3.5.1 at the moment, but if you just put it into the main branch we'll get it eventually.

I would be inclined to disable the timer — the call was timing out when loading a restrict card. If we're loading a restrict card that is well-tested, is it in fact safe to just disable the checks? If so, that'd also be fine with me! We can make sure restrict cards used in production are well tested.

Thanks,
Zach
