Adding Documentation Section focused on underlying stats without code #839

Merged
merged 77 commits into from
Mar 14, 2024
Changes from 68 commits
Commits
77 commits
9432a92
Work on introductory sections
kcormi May 7, 2023
e997e15
Try to fix some math rendering
kcormi May 7, 2023
ce50cfc
more attempted math fix
kcormi May 7, 2023
7174101
Work on fleshing out likelihood
kcormi May 7, 2023
68be330
Try fixing some math rendering
kcormi May 7, 2023
cd4b6c7
math format test
kcormi May 7, 2023
0360454
math format test
kcormi May 7, 2023
7f34cd6
math format test
kcormi May 7, 2023
f985c4c
math format test
kcormi May 7, 2023
0e1580f
math format test
kcormi May 7, 2023
5be19fc
math format test
kcormi May 7, 2023
19c4c01
math format test
kcormi May 7, 2023
a489941
change mkdocs
kcormi May 7, 2023
884b31d
Update Model and Likelihood, concisify, start stats tests
kcormi May 9, 2023
e3c3656
Minor fixes to likelihood explanations
kcormi May 9, 2023
8044451
Minor wording fix
kcormi May 9, 2023
cdd73ed
WIP on fitting concepts and statistical tests info
kcormi May 10, 2023
c2706b6
rename how --> what for accuracy in intro labels
kcormi May 10, 2023
d5dced5
Major update on statistical tests. Improve introduction section.
kcormi May 11, 2023
43bc53f
General improvements to structure and details
kcormi May 11, 2023
b4685fb
Textual improvements; also fix to some details of CLs description
kcormi May 11, 2023
0f83ff0
Minor nomenclature fixes
kcormi May 11, 2023
35b7ed1
Small fix to constraint term descriptions
kcormi May 11, 2023
d35f2ff
Fix details blocks rendering issue
kcormi Sep 27, 2023
4e8dcc5
Updates for details boxes
kcormi May 11, 2023
1eec83d
Update full likelihood figure
kcormi May 11, 2023
a9bef05
Minor improvements and fixes
kcormi May 13, 2023
15b98cb
Rearrange likelihood explanations a little
kcormi May 13, 2023
bdebfa9
First pass at adding channel compatibility and other gof test (KS,AD)
kcormi May 13, 2023
1e4c73f
Fix unbinned likelihood eqn
kcormi May 14, 2023
fb76b98
(Possibly incorrect) attempt at integrating unbinned likelihod info
kcormi May 15, 2023
55ba2dd
Minor wording tweak
kcormi May 15, 2023
d333609
Fix docs README for updates
kcormi May 16, 2023
870e4e3
Remove numerical fitting for now
kcormi May 16, 2023
1f0bf54
fix removed docs README line
kcormi May 16, 2023
57cc4ca
Fix repo url for mkdocs
kcormi May 16, 2023
2413ad2
Try to improve statistical tests description
kcormi May 26, 2023
abc6ccb
Update for "what combine does" sections
kcormi Jun 26, 2023
a40f2f4
Statistical test and fitting concepts updates
kcormi Jun 27, 2023
a2825dc
minor wording update
kcormi Jun 27, 2023
ce3a416
Test link fix
kcormi Jun 27, 2023
980494f
Fix relative links
kcormi Jun 27, 2023
48db95b
Minor fixes
kcormi Jun 27, 2023
6ddf4c5
minor fixes
kcormi Jun 27, 2023
da2cb66
Attempt to improve CLs section
kcormi Jun 29, 2023
f696dfd
Minor typo fix
kcormi Jun 29, 2023
2968c05
Change to pmu/(1-pb) for CLs
kcormi Jun 30, 2023
2538efd
remove commented out section
kcormi Jun 30, 2023
8ab6904
minor notation changes
kcormi Sep 27, 2023
e3436d8
Minor wording updates
kcormi Nov 30, 2023
94b9f46
Fix interpolation functions
kcormi Dec 1, 2023
6b8e81f
Fix likelihood graphic interp functions
kcormi Dec 1, 2023
9f365da
Update png of llhood equations
kcormi Dec 1, 2023
77fa520
Minor update
kcormi Feb 8, 2024
3dad7c7
Fix some notation
kcormi Mar 6, 2024
2122f94
Minor updates -- synchronization
kcormi Mar 7, 2024
4af18df
Update likelihood equation diagram
kcormi Mar 8, 2024
721da22
Fix some more mathematical conventions
kcormi Mar 8, 2024
1e9963d
Some more notation fixes
kcormi Mar 8, 2024
8526fc8
More notation and other fixes
kcormi Mar 8, 2024
f93161a
Try updating some text
kcormi Mar 11, 2024
e305a9e
Some more wording fixes
kcormi Mar 11, 2024
5e3db88
more small changes
kcormi Mar 11, 2024
5bf8856
typo fix
kcormi Mar 11, 2024
370fe63
Some fixes here and there
kcormi Mar 12, 2024
8967747
Some more minor fixes
kcormi Mar 12, 2024
ddebb2b
More accurate wording
kcormi Mar 12, 2024
73f5d73
Avoid math in headings
kcormi Mar 13, 2024
5e88bdf
Some more paper conventions I missed
kcormi Mar 13, 2024
247a3d8
some data/primary constraint/auxiliary spots that were missed
kcormi Mar 13, 2024
edbf95c
More minor fixes
kcormi Mar 13, 2024
50fd734
More minor consistency of notation points
kcormi Mar 14, 2024
c2028dd
More notation fixes
kcormi Mar 14, 2024
76d7ce4
More notation consistency fixing
kcormi Mar 14, 2024
aac4767
improve a little wording
kcormi Mar 14, 2024
8a56802
Slight improvement in statistical tests description
kcormi Mar 14, 2024
a4d22c7
Minor comment updates
kcormi Mar 14, 2024
5 changes: 5 additions & 0 deletions docs/README.md
@@ -3,6 +3,11 @@
Note to developers: navigation bar found in `mkdocs.yml`.

* [**Getting started**](index.md)
* **What Combine Does**
* [Intro](what_combine_does/introduction.md)
* [Model and Likelihood](what_combine_does/model_and_likelihood.md)
* [Fitting Concepts](what_combine_does/fitting_concepts.md)
* [Statistical Tests](what_combine_does/statistical_tests.md)
* **Setting up the analysis**
* [Preparing the datacard](part2/settinguptheanalysis.md#preparing-the-datacard)
* [Counting experiment](part2/settinguptheanalysis.md#a-simple-counting-experiment)
Binary file added docs/what_combine_does/CombineLikelihoodEqns.odp
Binary file added docs/what_combine_does/CombineLikelihoodEqns.pdf
Binary file added docs/what_combine_does/CombineLikelihoodEqns.png
5,439 changes: 5,439 additions & 0 deletions docs/what_combine_does/CombineLikelihoodEqns.svg
162 changes: 162 additions & 0 deletions docs/what_combine_does/fitting_concepts.md
@@ -0,0 +1,162 @@
# Likelihood based fitting

"Fitting" simply means estimating some parameters of a model (or really a [set of models](../../what_combine_does/model_and_likelihood/#sets-of-observation-models)) based on data.
Likelihood-based fitting does this through the [likelihood function](../../what_combine_does/model_and_likelihood/#the-likelihood).

In frequentist frameworks, this typically means doing [maximum likelihood estimation](https://pdg.lbl.gov/2022/web/viewer.html?file=../reviews/rpp2022-rev-statistics.pdf#subsection.40.2.2).
In Bayesian frameworks, [posterior distributions](https://pdg.lbl.gov/2022/web/viewer.html?file=../reviews/rpp2022-rev-statistics.pdf#subsection.40.2.6) of the parameters are usually calculated from the likelihood.

## Fitting Frameworks

Likelihood fits typically follow either a frequentist framework of maximum likelihood estimation, or a Bayesian framework in which prior estimates are updated to find posterior distributions given the data.

### Maximum Likelihood fits

A [maximum likelihood fit](https://pdg.lbl.gov/2022/web/viewer.html?file=../reviews/rpp2022-rev-statistics.pdf#subsection.40.2.2) means finding the values of the model parameters $(\vec{\mu}, \vec{\nu})$ which maximize the likelihood, $\mathcal{L}(\vec{\mu},\vec{\nu};\mathrm{data})$.
The values which maximize the likelihood are the parameter estimates, denoted with a "hat" ($\hat{}$):

$$(\vec{\hat{\mu}}, \vec{\hat{\nu}}) \equiv \underset{\vec{\mu},\vec{\nu}}{\operatorname{argmax}} \mathcal{L}(\vec{\mu}, \vec{\nu};\mathrm{data})$$

These values provide **point estimates** for the parameter values.

Because the likelihood is equal to the probability of observing the data given the model, the maximum likelihood estimate finds the parameter values for which the data is most probable.
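
As a concrete illustration (not how Combine itself is implemented), the following sketch performs a maximum likelihood fit for a hypothetical one-bin counting model with a signal strength $\mu$ and a single constrained nuisance parameter $\nu$; all yields, counts, and constraint values are invented placeholders.

```python
import numpy as np
from scipy import optimize, stats

# Hypothetical one-bin counting model: expected yield = mu*s + nu*b,
# with a Gaussian constraint on the nuisance parameter nu.
n_obs, s, b = 15, 10.0, 5.0       # toy observed count and nominal yields
nu0, sigma_nu = 1.0, 0.1          # toy auxiliary measurement of nu

def nll(params):
    """Negative log-likelihood, -log L(mu, nu; data)."""
    mu, nu = params
    expected = mu * s + nu * b
    if expected <= 0:
        return np.inf
    return -(stats.poisson.logpmf(n_obs, expected)
             + stats.norm.logpdf(nu, nu0, sigma_nu))

# (mu_hat, nu_hat) = argmax L = argmin (-log L)
fit = optimize.minimize(nll, x0=[1.0, 1.0], method="Nelder-Mead")
mu_hat, nu_hat = fit.x
print(f"mu_hat = {mu_hat:.3f}, nu_hat = {nu_hat:.3f}")
```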

### Bayesian Posterior Calculation

In a Bayesian framework, the likelihood represents the probability of observing the data given the model; it is combined with a prior probability distribution over the model parameters.

The prior probability distribution of the parameters, $\pi(\vec{\Phi})$, is updated based on the data to give the [posterior distribution](https://pdg.lbl.gov/2022/web/viewer.html?file=../reviews/rpp2022-rev-statistics.pdf#subsection.40.2.6):

$$ p(\vec{\Phi};\mathrm{data}) = \frac{ p(\mathrm{data};\vec{\Phi}) \pi(\vec{\Phi}) }{ p(\mathrm{data}) } = \frac{ \mathcal{L}(\vec{\Phi};\mathrm{data}) \pi(\vec{\Phi}) }{ \int_{\vec{\Phi'}} \mathcal{L}(\vec{\Phi'};\mathrm{data}) \pi(\vec{\Phi'}) }$$

The posterior distribution $p(\vec{\Phi};\mathrm{data})$ defines the updated belief about the parameters $\vec{\Phi}$.
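
For illustration only, the sketch below evaluates this formula numerically on a grid for a toy one-bin counting model with a flat prior on a single parameter $\mu$; the observed count and yields are invented, and the normalization sum plays the role of $p(\mathrm{data})$.

```python
import numpy as np
from scipy import stats

# Toy one-bin counting model with the background fixed; flat prior on mu.
n_obs, s, b = 15, 10.0, 5.0
mu_grid = np.linspace(0, 3, 601)
d_mu = mu_grid[1] - mu_grid[0]

likelihood = stats.poisson.pmf(n_obs, mu_grid * s + b)   # L(mu; data)
prior = np.ones_like(mu_grid)                            # pi(mu), flat
posterior = likelihood * prior
posterior /= posterior.sum() * d_mu                      # divide by p(data)

print("posterior mean of mu:", np.sum(mu_grid * posterior) * d_mu)
```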

## Methods for considering subsets of models

Often, one is interested in only a particular aspect of a model; for example, in the parameters of interest but not the nuisance parameters.
In this case, one needs a method for specifying precisely what is meant by a model considering only those parameters of interest.

There are several methods for considering sub-models, each with its own interpretation and use cases.

### Conditioning

Conditional sub-models can be made by simply restricting the values of some parameters.
The conditional likelihood of the parameters $\vec{\mu}$, conditioned on particular values of the parameters $\vec{\nu}$, is:

$$ \mathcal{L}(\vec{\mu},\vec{\nu}) \xrightarrow{\mathrm{conditioned\ on\ } \vec{\nu} = \vec{\nu}_0} \mathcal{L}(\vec{\mu}) = \mathcal{L}(\vec{\mu},\vec{\nu}_0) $$

### Profiling

The profiled likelihood $\mathcal{L}(\vec{\mu})$ is defined from the full likelihood, $\mathcal{L}(\vec{\mu},\vec{\nu})$, such that for every point $\vec{\mu}$ it is equal to the full likelihood at $\vec{\mu}$ maximized over $\vec{\nu}$.

$$ \mathcal{L}(\vec{\mu},\vec{\nu}) \xrightarrow{\mathrm{profiling\ } \vec{\nu}} \mathcal{L}({\vec{\mu}}) = \max_{\vec{\nu}} \mathcal{L}(\vec{\mu},\vec{\nu})$$

In some sense, the profiled likelihood is the best estimate of the likelihood at every point $\vec{\mu}$; it is sometimes also denoted with the double-hat notation $\mathcal{L}(\vec{\mu},\vec{\hat{\hat{\nu}}}(\vec{\mu}))$.
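
A minimal numerical sketch of profiling, using the same hypothetical counting model as in the earlier examples (all numbers are placeholders): for each fixed value of $\mu$, the likelihood is maximized over the nuisance parameter $\nu$.

```python
import numpy as np
from scipy import optimize, stats

# Toy model: Poisson count with expected yield mu*s + nu*b and a Gaussian
# constraint on nu; profiling minimizes -log L over nu at each fixed mu.
n_obs, s, b, nu0, sigma_nu = 15, 10.0, 5.0, 1.0, 0.1

def nll(mu, nu):
    return -(stats.poisson.logpmf(n_obs, mu * s + nu * b)
             + stats.norm.logpdf(nu, nu0, sigma_nu))

def profiled_nll(mu):
    """-log L(mu, nu_hat_hat(mu)), the profiled negative log-likelihood."""
    res = optimize.minimize_scalar(lambda nu: nll(mu, nu),
                                   bounds=(0.5, 1.5), method="bounded")
    return res.fun

for mu in (0.0, 0.5, 1.0, 1.5):
    print(f"mu = {mu:.1f}: profiled -log L = {profiled_nll(mu):.3f}")
```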

### Marginalization

Marginalization is a procedure for producing a probability distribution $p(\vec{\mu};\mathrm{data})$ for a set of parameters $\vec{\mu}$, which are only a subset of the parameters in the full distribution $p(\vec{\mu},\vec{\nu};\mathrm{data})$.
The marginal probability density $p(\vec{\mu})$ is defined such that for every point $\vec{\mu}$ it is equal to the probability at $\vec{\mu}$ integrated over $\vec{\nu}$.

$$ p(\vec{\mu},\vec{\nu}) \xrightarrow{\mathrm{marginalizing\ } \vec{\nu}} p({\vec{\mu}}) = \int_{\vec{\nu}} p(\vec{\mu},\vec{\nu})$$

The marginalized probability $p(\vec{\mu})$ is the probability for the parameter values $\vec{\mu}$ taking into account all possible values of $\vec{\nu}$.

Marginalized likelihoods can also be defined through their relationship to the probability distributions:

$$ \mathcal{L}(\vec{\mu};\mathrm{data}) = p(\mathrm{data};\vec{\mu}) $$
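
The sketch below illustrates marginalization numerically for the toy two-parameter counting model used above, integrating a grid-evaluated, flat-prior distribution over the nuisance parameter $\nu$; it is only a schematic example with invented numbers.

```python
import numpy as np
from scipy import stats

# Toy joint distribution p(mu, nu; data) on a grid, marginalized over nu.
n_obs, s, b, nu0, sigma_nu = 15, 10.0, 5.0, 1.0, 0.1

mu_grid = np.linspace(0, 3, 301)
nu_grid = np.linspace(0.5, 1.5, 201)
MU, NU = np.meshgrid(mu_grid, nu_grid, indexing="ij")

# Unnormalized p(mu, nu; data): likelihood times (flat) priors
joint = stats.poisson.pmf(n_obs, MU * s + NU * b) * stats.norm.pdf(NU, nu0, sigma_nu)

d_mu, d_nu = mu_grid[1] - mu_grid[0], nu_grid[1] - nu_grid[0]
marginal = joint.sum(axis=1) * d_nu        # integrate over nu at each mu
marginal /= marginal.sum() * d_mu          # normalize p(mu; data)
print("marginal posterior mode of mu:", mu_grid[np.argmax(marginal)])
```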


## Parameter Uncertainties

Parameter uncertainties describe regions of parameter values which are considered reasonable, rather than single point estimates.
These can be defined either in terms of frequentist **confidence regions** or Bayesian **credibility regions**.

In both cases the region is defined by a confidence or credibility level $CL$, which quantifies the meaning of the region.
For frequentist confidence regions, the confidence level $CL$ describes how often the confidence region will contain the true parameter values if the model is a sufficiently accurate approximation of the truth.
For Bayesian credibility regions, the credibility level $CL$ describes the Bayesian probability that the true parameter value is in that region under the given model.


The confidence or credibility regions are described by a set of points $\{ \vec{\mu} \}_{\mathrm{CL}}$ which meet some criteria.
In most situations of interest, the credibility region or confidence region for a single parameter, $\mu$, is effectively described by an interval:

$$ \{ \mu \}_{\mathrm{CL}} = [ \mu^{-}_{\mathrm{CL}}, \mu^{+}_{\mathrm{CL}} ] $$

Typically indicated as:

$$ \mu = X^{+\mathrm{up}}_{-\mathrm{down}} $$

or, if symmetric intervals are used:

$$ \mu = X \pm \mathrm{unc.} $$


### Frequentist Confidence Regions

Frequentist confidence regions are constructed from the observed data, and are therefore themselves random variables.
They are very often the construction used to define the uncertainties reported on a parameter.

If the same experiment is repeated multiple times, different data will be observed each time and a different confidence set $\{ \vec{\mu}\}_{\mathrm{CL}}^{i}$ will be found for each experiment.
If the data are generated by the model with some set of values $\vec{\mu}_{\mathrm{gen}}$, then the fraction of the regions $\{ \vec{\mu}\}_{\mathrm{CL}}^{i}$ which contain the values $\vec{\mu}_{\mathrm{gen}}$ will be equal to the confidence level ${\mathrm{CL}}$.
The fraction of intervals which contain the generating parameter value is referred to as the "coverage".

From first principles, the intervals can be constructed using the [Neyman construction](https://pdg.lbl.gov/2022/web/viewer.html?file=../reviews/rpp2022-rev-statistics.pdf#subsubsection.40.4.2.1).

In practice, the likelihood can be used to construct confidence regions for a set of parameters $\vec{\mu}$ by using the profile likelihood ratio:

$$ \Lambda \equiv \frac{\mathcal{L}(\vec{\mu},\vec{\hat{\nu}}(\vec{\mu}))}{\mathcal{L}(\vec{\hat{\mu}},\vec{\hat{\nu}})} $$

i.e. the ratio of the profile likelihood at point $\vec{\mu}$ to the maximum likelihood. For technical reasons, the negative logarithm of this quantity is typically used in practice.

Each point $\vec{\mu}$ can be tested to see if it is in the confidence region, by checking the value of the likelihood ratio at that point and comparing it to the expected distribution if that point were the true generating value of the data.

$$ \{ \vec{\mu} \}_{\mathrm{CL}} = \{ \vec{\mu} : -\log(\Lambda) \lt \gamma_{\mathrm{CL}}(\vec{\mu}) \} $$

The cutoff value $\gamma_{\mathrm{CL}}$ must be chosen to match the desired coverage of the confidence set.

Under some conditions, the value of $\gamma_{\mathrm{CL}}$ is known analytically for any desired confidence level, and is independent of $\vec{\mu}$, which greatly simplifies estimating confidence regions.
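
As a rough illustration of this construction, assuming Wilks' theorem holds so that $\gamma_{\mathrm{CL}}$ can be taken from a $\chi^2$ distribution with one degree of freedom, the sketch below tests whether a chosen point is inside the confidence region for the toy counting model used in the previous examples; all numbers are placeholders.

```python
import numpy as np
from scipy import optimize, stats

# Membership test for {mu : -log(Lambda) < gamma_CL}, toy counting model.
n_obs, s, b, nu0, sigma_nu = 15, 10.0, 5.0, 1.0, 0.1

def nll(mu, nu):
    return -(stats.poisson.logpmf(n_obs, mu * s + nu * b)
             + stats.norm.logpdf(nu, nu0, sigma_nu))

def profiled_nll(mu):
    return optimize.minimize_scalar(lambda nu: nll(mu, nu),
                                    bounds=(0.5, 1.5), method="bounded").fun

best = optimize.minimize(lambda p: nll(*p), x0=[1.0, 1.0], method="Nelder-Mead")
nll_min = best.fun

cl = 0.683
gamma_cl = stats.chi2.ppf(cl, df=1) / 2.0   # ~0.5, assuming Wilks' theorem

mu_test = 0.5
neg_log_lambda = profiled_nll(mu_test) - nll_min
print(f"mu = {mu_test} in the {cl:.1%} CL region:", neg_log_lambda < gamma_cl)
```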

/// details | **Constructing Frequentist Confidence Regions in Practice**

When a single fit is performed by some numerical minimization program and parameter values are reported along with uncertainty values, these are usually reported as frequentist intervals.
The [MINUIT minimizer](https://root.cern/root/htmldoc/guides/minuit2/Minuit2.pdf), which is used to minimize the negative log-likelihood, has two methods for [estimating parameter uncertainties](https://root.cern/root/htmldoc/guides/minuit2/Minuit2.pdf#section.2.5).

These are the two most commonly used methods for estimating confidence regions in a fit: the [**minos** method](https://root.cern/root/htmldoc/guides/minuit2/Minuit2.pdf#subsection.2.5.3) and the [**hessian** method](https://root.cern/root/htmldoc/guides/minuit2/Minuit2.pdf#subsection.2.5.2).
In both cases, Wilks' theorem is assumed to hold at all points in parameter space, such that $\gamma_{\mathrm{CL}}$ is independent of $\vec{\mu}$.

When $\gamma_{\mathrm{CL}}$ is independent of $\vec{\mu}$ the problem simplifies to finding the boundaries where $-\log(\Lambda) = \gamma_{\mathrm{CL}}$.
Such a boundary point is referred to as a "crossing", i.e. a point where $-\log(\Lambda)$ crosses the threshold value.

#### The Minos method for estimating confidence regions

In the minos method, once the best fit point $\vec{\hat{\mu}}$ is determined, the confidence region for any parameter $\mu_i$ can be found by moving away from its best fit value $\hat{\mu}_i$.
At each value of $\mu_i$, the other parameters are profiled, and $-\log{\Lambda}$ is calculated.

Following this procedure, the values of $\mu_i$ are scanned to find the boundary of the confidence region, where $-\log{\Lambda} = \gamma_{\mathrm{CL}}$.

The search is performed in both directions away from the best fit value of the parameter, and the two crossings are taken as the borders of the confidence region.

This procedure has to be followed separately for each parameter $\mu_i$ for which a confidence interval is calculated.
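
The sketch below mimics such a search for the toy counting model from the earlier examples; it is a simple root-finding illustration, not the actual MINOS algorithm.

```python
import numpy as np
from scipy import optimize, stats

# Minos-like search for the crossings -log(Lambda) = gamma_CL (toy model).
n_obs, s, b, nu0, sigma_nu = 15, 10.0, 5.0, 1.0, 0.1

def nll(mu, nu):
    expected = mu * s + nu * b
    if expected <= 0:
        return np.inf
    return -(stats.poisson.logpmf(n_obs, expected)
             + stats.norm.logpdf(nu, nu0, sigma_nu))

def profiled_nll(mu):
    return optimize.minimize_scalar(lambda nu: nll(mu, nu),
                                    bounds=(0.5, 1.5), method="bounded").fun

best = optimize.minimize(lambda p: nll(*p), x0=[1.0, 1.0], method="Nelder-Mead")
mu_hat, nll_min = best.x[0], best.fun
gamma_cl = 0.5  # 68.3% CL for one parameter, assuming Wilks' theorem

def crossing(mu):
    return profiled_nll(mu) - nll_min - gamma_cl

mu_down = optimize.brentq(crossing, 0.0, mu_hat)          # search downwards
mu_up = optimize.brentq(crossing, mu_hat, mu_hat + 3.0)   # search upwards
print(f"mu = {mu_hat:.2f} +{mu_up - mu_hat:.2f} -{mu_hat - mu_down:.2f}")
```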

#### The Hessian method for estimating confidence regions

The Hessian method relies on the second derivatives (i.e. the [Hessian](https://en.wikipedia.org/wiki/Hessian_matrix)) of the likelihood at the best fit point.

By assuming that the shape of the likelihood function is well described by its second-order approximation, the values at which $-\log(\Lambda) = \gamma_{\mathrm{CL}}$ can be calculated analytically, without the need for a search:

$$ \mu_i^{\mathrm{crossing}} - \hat{\mu}_i \propto \left(\frac{\partial^2{(-\log\mathcal{L})(\vec{\hat{\mu}})}}{\partial\mu_i^2}\right)^{-1/2} $$

By computing and then inverting the full Hessian matrix, all individual confidence regions and the full covariance matrix are determined.
By construction, this method always reports symmetric confidence intervals, as it assumes that the likelihood is well described by a second-order expansion.
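
For illustration, the following sketch evaluates the Hessian of $-\log\mathcal{L}$ by finite differences for the toy counting model used above and inverts it to obtain symmetric uncertainties; a real implementation such as MINUIT's HESSE treats step sizes and numerical stability far more carefully.

```python
import numpy as np
from scipy import optimize, stats

# Hessian-based symmetric uncertainties for the toy counting model.
n_obs, s, b, nu0, sigma_nu = 15, 10.0, 5.0, 1.0, 0.1

def nll(params):
    mu, nu = params
    return -(stats.poisson.logpmf(n_obs, mu * s + nu * b)
             + stats.norm.logpdf(nu, nu0, sigma_nu))

x_hat = optimize.minimize(nll, x0=[1.0, 1.0], method="Nelder-Mead").x

# Hessian of -log L at the best fit point, by central finite differences.
# For i == j the mixed-difference stencil reduces to the usual second
# derivative with step 2*eps.
eps = 1e-3
hessian = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        xpp = x_hat.copy(); xpp[i] += eps; xpp[j] += eps
        xmm = x_hat.copy(); xmm[i] -= eps; xmm[j] -= eps
        xpm = x_hat.copy(); xpm[i] += eps; xpm[j] -= eps
        xmp = x_hat.copy(); xmp[i] -= eps; xmp[j] += eps
        hessian[i, j] = (nll(xpp) - nll(xpm) - nll(xmp) + nll(xmm)) / (4 * eps**2)

covariance = np.linalg.inv(hessian)   # parameter covariance matrix estimate
print("symmetric uncertainties:", np.sqrt(np.diag(covariance)))
```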

///

### Bayesian Credibility Regions

Often the full posterior probability distribution is summarized in terms of a **credible region** which contains a specified fraction of the posterior probability of the parameter.

$$ \{ \vec{\mu} \}_{\mathrm{CL}} = \{ \vec{\mu} : \vec{\mu} \in \Omega, \int_{\Omega} p(\vec{\mu};\mathrm{data}) = \mathrm{CL} \}$$

The credible region represents a region in which the Bayesian probability of the parameter lying in that region is equal to the chosen credibility level.
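
A minimal sketch, reusing the toy grid posterior from the marginalization example above (all numbers invented), of extracting a highest-posterior-density credible region; a real analysis would more typically use Markov Chain Monte Carlo sampling rather than a grid.

```python
import numpy as np
from scipy import stats

# Grid posterior for the toy model (flat prior on mu), nuisance marginalized.
n_obs, s, b, nu0, sigma_nu = 15, 10.0, 5.0, 1.0, 0.1
mu_grid = np.linspace(0, 3, 601)
nu_grid = np.linspace(0.5, 1.5, 201)
MU, NU = np.meshgrid(mu_grid, nu_grid, indexing="ij")

joint = stats.poisson.pmf(n_obs, MU * s + NU * b) * stats.norm.pdf(NU, nu0, sigma_nu)
posterior = joint.sum(axis=1)
posterior /= posterior.sum()   # normalized posterior probability per grid point

# Highest-posterior-density region: include grid points in decreasing order of
# posterior probability until the requested credibility level is reached.
order = np.argsort(posterior)[::-1]
selected = order[np.cumsum(posterior[order]) <= 0.683]
print(f"68% credible interval for mu: "
      f"[{mu_grid[selected].min():.2f}, {mu_grid[selected].max():.2f}]")
```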

56 changes: 56 additions & 0 deletions docs/what_combine_does/introduction.md
@@ -0,0 +1,56 @@
# Introduction And Capabilities

Combine is a tool for performing statistical analyses based on a model of expected observations and a dataset.
Example statistical analyses are claiming discovery of a new particle or process, setting limits on the existence of new physics, and measuring cross sections.

The package has no physics-specific knowledge: it is completely agnostic to the interpretation of the analysis being performed, but its usage and development are based around common cases in High Energy Physics.
This documentation is a description of what combine does and how you can use it to run your analyses.

Roughly, combine does three things:

1. Helps you to build a statistical model of expected observations;
2. Runs statistical tests on the model and observed data;
3. Provides tools for validating, inspecting, and understanding the model and the statistical tests.

Combine can be used for analyses in HEP ranging from simple counting experiments to unfolded measurements, new physics searches, combinations of measurements, and EFT fits.

## Model Building

Combine provides a powerful, human-readable, and lightweight interface for [building likelihood models](../../part2/settinguptheanalysis/#preparing-the-datacard) for both [binned](../../part2/settinguptheanalysis/#binned-shape-analysis) and [unbinned](../../part2/settinguptheanalysis/#unbinned-or-parametric-shape-analysis) data.
The likelihood definition allows the user to define many processes which contribute to the observation, as well as multiple channels which may be fit simultaneously.

Furthermore, combine provides a powerful and intuitive interface for [combining models](../../part2/settinguptheanalysis/#combination-of-multiple-datacards), as it was originally developed for combinations of Higgs boson analyses at the CMS experiment.

The interface simplifies many common tasks, while providing many options for customization.
Common nuisance parameter types are defined for easy use, and user-defined functions can also be provided.
Input histograms defining the model can be provided in ROOT format, or in other tabular formats compatible with pandas.

Custom [physics models](../../part2/physicsmodels/), which determine how the parameters of interest alter the model, can be defined in Python, and a number of predefined models are provided by default.

A number of tools are also provided for run-time alterations of the model, allowing for straightforward comparisons of alternative models.

## Statistical Tests

Combine can be used for statistical tests in frequentist or Bayesian frameworks, as well as for some hybrid frequentist-Bayesian analysis tasks.

Combine implements various methods for [commonly used statistical tests](../../part3/commonstatsmethods/) in high energy physics, including discovery, limit setting, and parameter estimation.
Statistical tests can be customized to use various test statistics and confidence levels, as well as to provide different output formats.

A number of asymptotic methods, relying on Wilks' theorem and valid under appropriate conditions, are implemented for fast evaluation.
Generation of pseudo-data from the model can also be performed, and tests are implemented to run automatically over the resulting empirical distributions without relying on asymptotic approximations.
Pseudo-data generation and fitting over the pseudo-data can be customized in a number of ways.

## Validation and Inspection

Combine provides tools for [inspecting the model](../../part3/validation/#validating-datacards) for things like potentially problematic input templates.

[Various methods](../../part3/nonstandard/) are provided for inspecting the likelihood function and the performance of the fits.

Methods are provided for comparing pre-fit and post-fit results for all values, including nuisance parameters, and summaries of the results can be produced.

Plotting utilities allow the pre- and post-fit model expectations and their uncertainties to be plotted, as well as summaries of debugging steps such as the nuisance parameter values and likelihood scans.




