Meaning of the expected CLs reported by pyhf. #1619
7 comments · 13 replies
-
Dear Sabine,
The expected limit is the limit you would get if the observed data were exactly the yield you expect under the background-only hypothesis.
Does that help? I can try giving an example.
Best,
Lukas
On Wed, 6 Oct 2021 at 17:59, Sabine Kraml wrote:
Dear pyhf developers,
We'd like to inquire about the exact definition of the expected CLs
reported by pyhf.
As we understand it, expected limits, and thus also CLs_exp, are computed
under the assumption that observation equals expectation (thus being
agnostic of the actual observed data). We tried this out with the full
likelihoods published by ATLAS, replacing in the json file the number of
observed events in each signal region by the number of expected background
events. This gives CLs_exp=CLs_obs, as it should be, but this value is
different from the CLs_exp obtained with the original json file.
To give a concrete example from ATLAS-SUSY-2018-04 (hadronic stau search
with two signal regions):
SR1cuts: #exp=6.057 #obs=10
SR2cuts: #exp=10.338 #obs=7
Here, #exp and #obs denote the number of expected SM events and the number
of observed events, respectively.
Patching a signal of 1.568 events in region SR1cuts (=SRlow in the paper)
and 7.701 events in region SR2cuts (=SRhigh in the paper), which
corresponds to a benchmark point with m_stauLR = 400 GeV and m_LSP = 40
GeV, gives
CLs_obs = 0.94 and CLs_exp = 0.85.
However, when verifying the expected CLs "by hand", that is using
SR1cuts: #exp=6.057 #obs=6.057
SR2cuts: #exp=10.338 #obs=10.338
one obtains
CLs_obs = CLs_exp = 0.71
for the same signal. Some clarification as to where the difference comes from would be much appreciated.
-
Just a quick comment for now: the computation of the expected CLs first requires a fit to generate the Asimov data, and this fit depends on the observed data you specify. The Asimov dataset will therefore depend on the observed dataset. Does that clarify?
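As an editorial sketch of this point (not part of the original reply): using a simplified stand-in for the published likelihood, built with pyhf.simplemodels.uncorrelated_background and invented background uncertainties, the expected CLs changes when the observed counts change, because the Asimov dataset comes from a background-only fit to whatever data is passed in.

```python
import pyhf

# Toy stand-in for the ATLAS-SUSY-2018-04 numbers quoted above;
# the background uncertainties are invented for illustration only.
model = pyhf.simplemodels.uncorrelated_background(
    signal=[1.568, 7.701], bkg=[6.057, 10.338], bkg_uncertainty=[1.0, 1.5]
)

# First the actual observed counts, then "observation = background expectation".
for obs_counts in ([10.0, 7.0], [6.057, 10.338]):
    data = obs_counts + model.config.auxdata
    cls_obs, cls_exp_band = pyhf.infer.hypotest(
        1.0, data, model, test_stat="qtilde", return_expected_set=True
    )
    # cls_exp_band is the -2σ ... +2σ expected band; index 2 is the median.
    print(obs_counts, cls_obs, cls_exp_band[2])
```

In the second case the median expected CLs coincides with the observed one, mirroring the "by hand" check in the question; in the first case it does not, because the Asimov dataset is built from a fit to the observed counts.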
-
Here is an example: [image attachment: image.png]
On Wed, Oct 6, 2021 at 6:09 PM Giordon Stark wrote:
One thing @alexander-held reminded me of to clarify, which goes along with what @lukasheinrich said: the Asimov dataset will center your background expectation on the observed data.
If you switch from observed data to expected data, your background-only model will have nuisance parameters (NPs) centered at different values because they are fit to different datasets. The expected CLs therefore has to be different.
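In pyhf terms this can be sketched as follows (editorial illustration with the same invented toy model and uncertainties as above): the background-only conditional fit is performed on whichever dataset is supplied, and the Asimov dataset is the model expectation at the fitted nuisance-parameter values.

```python
import pyhf

# Invented toy model for illustration; not the published likelihood.
model = pyhf.simplemodels.uncorrelated_background(
    signal=[1.568, 7.701], bkg=[6.057, 10.338], bkg_uncertainty=[1.0, 1.5]
)
observed = [10.0, 7.0] + model.config.auxdata

# Conditional maximum-likelihood fit with the signal strength fixed to zero:
# the nuisance parameters get pulled towards the observed data.
bkg_only_pars = pyhf.infer.mle.fixed_poi_fit(0.0, observed, model)

# The Asimov dataset is the model expectation at those fitted values,
# so it changes whenever the observed data change.
asimov_data = model.expected_data(bkg_only_pars)
print(asimov_data)
```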
-
Hi guys, thanks for the detailed answers. So pyhf gives an "a-posteriori" expected CLs, and this can be quite different from the "a-priori" one where one sets data = SM expectation (e.g. before unblinding and/or to determine sensitivities). I think it is worthwhile to make this very clear to the user, not only by quoting the underlying formulae, as in #1367, but also by giving some physics understanding. The point is that many of the official limit plots from the experiments show the a-priori expected limit, not the a-posteriori one. So when using a published full likelihood from ATLAS and naively comparing pyhf's CLs_exp with the expected limit from the analysis, there is a discrepancy. Perhaps experimentalists are all aware of this, but it is a potentially serious pitfall for users from the theory side (recasters).
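For concreteness, the "data = SM" (a-priori style) check described in the opening post could be scripted along the following lines; the file name is a placeholder, and the assumption that each signal region is a single-bin channel with the names used above would have to be checked against the actual workspace.

```python
import json
import pyhf

# Placeholder name for the published workspace with the signal patch already applied.
with open("workspace_with_signal.json") as f:
    spec = json.load(f)

# Replace the observed counts by the expected background yields
# (assumes single-bin channels named SR1cuts and SR2cuts).
expected_bkg = {"SR1cuts": [6.057], "SR2cuts": [10.338]}
for obs in spec["observations"]:
    if obs["name"] in expected_bkg:
        obs["data"] = expected_bkg[obs["name"]]

ws = pyhf.Workspace(spec)
model = ws.model()
data = ws.data(model)

cls_obs, cls_exp_band = pyhf.infer.hypotest(
    1.0, data, model, test_stat="qtilde", return_expected_set=True
)
# With the data set to the background expectation, the observed CLs and the
# median expected CLs coincide, as noted in the question.
print(cls_obs, cls_exp_band[2])
```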
-
OK, maybe I'm mistaken here; I will get back to the books. (But pseudo-data can also be sampled from the SM hypothesis without knowledge of the measured data, can't it?)
In any case, my original question was not about the difference between expected and observed limits, as in pyhf/pyhf-tutorial#10, but about the different ways to compute an expected limit, and how this is done in pyhf. This is clarified. What we have to do on our side is to use one method consistently when recasting, and be doubly careful when comparing with expected limits in the experimental publications, in order to compare apples with apples.
On Thu, Oct 7, 2021 at 11:13 PM Matthew Feickert wrote:
I think it is worthwhile to make this very clear for the user not only by quoting the underlying formulae, like in #1367, but also by giving some physics understanding.

We're quite happy to try and make this more clear for people. So we'll address this in:
- Issue #1620
- pyhf/pyhf-tutorial#10

The point is that many if not most of the official limit plots from the experiments show the a-priori expected limit, not the a-posteriori one. So when using a published full likelihood from ATLAS and naively comparing pyhf's CLs_exp with the expected limit from the analysis, there is a discrepancy.

Can you point us to some examples of this @sabinekraml? I think @lukasheinrich, @kratsg, @alexander-held, and I are all surprised to hear this, as the ATLAS SUSY group definitely uses Asimov data to produce expected limits. This is also what is recommended in the joint recommendations from ATLAS and CMS in "Procedure for the LHC Higgs boson search combination in Summer 2011" <https://inspirehep.net/literature/1196797>, Section 1 (there the phrase "pseudo-data" is used), and (using "Asimov") in the February 2011 ATLAS Frequentist Limit Recommendation <https://indico.cern.ch/event/126652/contributions/1343592/attachments/80222/115004/Frequentist_Limit_Recommendation.pdf>, Section 3.
-
In the Combine docs, there is a section on Asymptotic Frequentist Limits <https://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/part3/commonstatsmethods/#asymptotic-frequentist-limits> that confirms that they use the Asimov approach as well

Yes, and scrolling down to "Computing the expected significance" one finds the "a-priori expected" as the default behavior.

Anyways, it would be really good to discuss "live" if you have time next week.
On Fri, Oct 8, 2021 at 4:19 PM Matthew Feickert wrote:
I don't actually know what CMS does in these cases given my edited response above, so that will be worth finding out as well.

The good news is that CMS Combine is indeed doing the same thing that ATLAS does (e.g. using the Asimov data for expected limits) *when* they are performing Frequentist limits (they may also set Bayesian limits, but that is apparently not common these days).

In the Combine docs, there is a section on Asymptotic Frequentist Limits <https://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/part3/commonstatsmethods/#asymptotic-frequentist-limits> that confirms that they use the Asimov approach as well:

The AsymptoticLimits calculation follows the frequentist paradigm for calculating expected limits. This means that the routine will first fit the observed data, conditionally for a fixed value of r, and set the nuisance parameters to the values obtained in the fit for generating the Asimov data, i.e. it calculates the *post-fit* or *a-posteriori* expected limit.

This is doubly confirmed when looking at the Combine FAQ <http://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/part4/usefullinks/#faq> entry "Why does changing the observation in data affect my expected limit?":

The expected limit (if using either the default behaviour of -M AsymptoticLimits or using the LHC-limits style limit setting with toys) uses the *post-fit* expectation of the background model to generate toys. This means that first the model is fit to the *observed data* before toy generation.

Though it is worth noting that Combine has a "Blind limits" option <https://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/part3/commonstatsmethods/#blind-limits>:

In order to use the *pre-fit* nuisance parameters (to calculate an *a-priori* limit), you must add the option --noFitAsimov or --bypassFrequentistFit.

Though I would assume (read as: someone who knows should correct me) that CMS would make this *very* explicit if they were to ever present limits like this, given the Asimov procedure being viewed as the "default" in ATLAS and CMS.

(Thanks @bregnery for pointing me in the right direction on the Combine docs. 👍)
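For reference, a rough pyhf analogue of such a pre-fit ("a-priori") computation could look like the sketch below; this is an editorial illustration with an invented toy model, not a built-in pyhf or Combine option.

```python
import pyhf

# Invented toy model for illustration only.
model = pyhf.simplemodels.uncorrelated_background(
    signal=[1.568, 7.701], bkg=[6.057, 10.338], bkg_uncertainty=[1.0, 1.5]
)

# Background-only dataset built from the pre-fit (nominal) parameter values,
# i.e. without fitting any nuisance parameters to observed data.
pars = model.config.suggested_init()
pars[model.config.poi_index] = 0.0
prefit_data = model.expected_data(pars)

_, cls_exp_band = pyhf.infer.hypotest(
    1.0, prefit_data, model, test_stat="qtilde", return_expected_set=True
)
# The median of this band plays the role of the "a-priori" expected CLs.
print(cls_exp_band[2])
```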
-
(If you allow me to chime in.) Interesting discussion! For me, the words "expected limit" would have implied "the hypothesized limit computed before I make my observation, assuming I find no new physics", which is in contradiction with the word "a-posteriori", which for me implies "after having made my observation". But I see that in this context "expected limit" means something like "the limit, given my observation, under the assumption that what I measured is the expectation value of the SM background, thus under the assumption that the SM null hypothesis holds". It's conceptually a weird thing, no? The limit on the "signal strength" parameter, given the data, under the assumption that that parameter is zero? I am not sure I know what I can do with this quantity, particularly in the context of searches, where I do not assume the SM hypothesis to hold. It's something like the sensitivity, but trying to update my background estimates by assuming no new physics is in my data, which I cannot know. (Am I misunderstanding something?) Anyways, even if I cannot fully wrap my head around its meaning, it's well-defined and we can use it to validate the results in our database, so thanks a lot for the clarification.