Remco Bouckaert edited this page Nov 17, 2021 · 9 revisions

The analysis prints out multiple ML estimates with their SDs. Which one to choose?

The difference between the estimates lies in the way they are computed from the nested sampling run. Since these estimates require random sampling, they differ from one another. When the standard deviation is small, the estimates will be very close, but when the standard deviation is large, the ML estimates can differ substantially. Regardless, any of the reported estimates is a valid estimate, but make sure to report it together with its standard deviation.

What is the difference between the plain, subsample and bootstrap estimate?

Nested sampling produces a sequence of saved points with increasing likelihoods L1...Lk. Each likelihood Li is associated with a value Xi in a series X1...Xk, where Xi is the probability that a point sampled from the prior has likelihood at least Li. The likelihoods L1...Lk are known exactly, but the X1...Xk values are too complex to calculate. However, we know that they form a decreasing sequence starting at X1=1, and that each subsequent Xi is the previous value X(i-1) multiplied by a factor distributed according to a beta(N,1) distribution, where N is the number of points used for the nested sampling analysis.

There are different ways to treat this random variable. The simplest is to take the mean value, Xi=exp(-i/N), which is what the plain estimate in the output uses.
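As a rough illustration, the plain estimate can be computed from the saved likelihoods like this. This is a minimal sketch, not the NS package's actual code; the function name and arguments are made up, and it works with raw likelihoods, so it would overflow for realistic log-likelihoods (a log-sum-exp formulation avoids that).

```python
import math

def plain_estimate(log_likelihoods, n_points):
    """Plain NS estimate: replace each Xi by its deterministic
    mean-value approximation exp(-i/N) and sum the weights.
    Hypothetical sketch for illustration only."""
    x_prev = 1.0
    z = 0.0
    for i, logl in enumerate(log_likelihoods, start=1):
        x_i = math.exp(-i / n_points)          # mean-value shrinkage
        z += math.exp(logl) * (x_prev - x_i)   # weight = prior mass shed
        x_prev = x_i
    return math.log(z)
```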

An alternative is to randomly sample Xi as a fraction of the previous X(i-1), by multiplying with a value drawn from the beta(N,1) distribution. This sampling from the beta distribution can be repeated several times, and the mean and variance of the ML estimate can then be computed from the resulting sample. Resampling the likelihoods with replacement at the same time gives the estimate labelled "subsample estimate". There is yet another way to obtain likelihood sequences, by unravelling the sequence (as described in Algorithm 2 of https://arxiv.org/pdf/1703.09701.pdf), which produces the estimate labelled "bootstrap estimate".
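The repeated-sampling idea can be sketched as follows. This is a simplified illustration (hypothetical function name): it samples the shrinkage factors from beta(N,1) and averages the resulting log-ML estimates, but omits the resampling of likelihoods with replacement that the subsample estimate also performs.

```python
import math
import random
import statistics

def sampled_estimate(log_likelihoods, n_points, n_reps=100, seed=1):
    """Sample Xi = t * X(i-1) with t ~ beta(N, 1), repeat n_reps
    times, and report the mean and SD of the log-ML estimates.
    Simplified sketch; not the NS package's actual procedure."""
    rng = random.Random(seed)
    log_zs = []
    for _ in range(n_reps):
        x_prev, z = 1.0, 0.0
        for logl in log_likelihoods:
            x_i = x_prev * rng.betavariate(n_points, 1.0)  # shrinkage draw
            z += math.exp(logl) * (x_prev - x_i)
            x_prev = x_i
        log_zs.append(math.log(z))
    return statistics.mean(log_zs), statistics.stdev(log_zs)
```

The spread of the sampled estimates is exactly what the reported SD summarises: with more points N the beta(N,1) draws concentrate near 1 and the estimates tighten.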

How do I know the sub-chain length is large enough?

NS works, in theory, if and only if the points generated at each iteration are independent. If you have already run an MCMC analysis and know the effective sample size (ESS) for each parameter, you can take the length of the MCMC run divided by the smallest ESS as the sub-chain length; this ensures every parameter in every sample is independent. This tends to result in quite large sub-chain lengths.

In practice, we can get away with much smaller sub-chain lengths, which you can verify by running multiple NS analyses with increasing sub-chain lengths. If the ML and SD estimates do not differ substantially (you want the difference between ML1 and ML2 to be at most 2*sqrt(SD1*SD1+SD2*SD2)), you know the shorter sub-chain length was sufficient.
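That check is easy to script; a minimal sketch (the function name is made up for illustration):

```python
import math

def estimates_agree(ml1, sd1, ml2, sd2):
    """True when |ML1 - ML2| <= 2*sqrt(SD1^2 + SD2^2), i.e. the
    two runs agree to within sampling noise."""
    return abs(ml1 - ml2) <= 2.0 * math.sqrt(sd1 * sd1 + sd2 * sd2)
```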

How many particles do I need?

To start, use only a few particles. This should give you a sense of the information H, which is one of the estimates reported by the NS analysis. If you want to compare two hypotheses, the difference between ML1 and ML2 should be at least 2*sqrt(SD1*SD1+SD2*SD2) to make sure it is not due to randomisation.

If the difference is larger, you do not need more particles.

If the difference is smaller, you can estimate how much the SD estimates must shrink to get a sufficiently large difference. Since SD=sqrt(H/N), we have N=H/(SD*SD), where H comes from the NS run with a few particles. Run the analysis again with the increased number of particles and see whether the difference becomes large enough.
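Inverting SD=sqrt(H/N) gives a quick back-of-the-envelope calculation for the particle count (hypothetical helper, not part of the NS package):

```python
import math

def particles_needed(h, target_sd):
    """Invert SD = sqrt(H/N) to N = H / SD^2, rounded up.
    h is the information H reported by a pilot NS run."""
    return math.ceil(h / (target_sd * target_sd))
```

For example, with H = 4 and a target SD of 0.5, about 16 particles are needed.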

If the difference is less than 2, the hypotheses may not be distinguishable -- in terms of Bayes factors, such differences are barely worth mentioning.

Is NS faster than path sampling/stepping stone (PS/SS)?

This depends on many things, but mostly on how accurate the estimates need to be. NS provides an estimate of the SD, which is not available for PS/SS. If the hypotheses have very large differences in ML, NS requires very few particles (maybe just one) and will be very fast. If the differences are smaller, more particles may be required, and the run time of NS is linear in the number of particles.

The parallel implementation makes it possible to run many particles in parallel, giving a many-particle estimate in the same time as a single particle estimate (PS/SS can be parallelised by steps as well).

The output is written to the screen, which I forgot to save. Can I recover the estimates from the log files?

The NS package has a NSLogAnalyser application that you can run via the menu File/Launch apps in BEAUti -- a window pops up where you select the NSLogAnalyser, and a dialog shows you various options to fill in. You can also run it from the command line on OS X or Linux using

/path/to/beast/bin/applauncher NSLogAnalyser -N 1 -log xyz.log

where the argument after -N is the particleCount you specified in the XML, and xyz.log is the trace log produced by the NS run.

Why are some NS runs longer than others?

Nested sampling stops automatically when the accuracy in the ML estimate cannot be improved upon. Because it is a stochastic process, some analyses get there faster than others, resulting in different run times.

Why are the ESSs so low when I open a log file in Tracer?

An NS analysis produces two trace log files: one for the nested sampling run (say myFile.log) and one with the posterior sample (myFile.posterior.log).

The ESSs that Tracer reports for log files with posterior samples are meaningless, because the log file is ordered by the nested sampling run; if you look at the trace of the likelihood, it should show a continuously increasing function. It is not yet clear how to estimate the ESS of a nested sampling run, though the number of entries in the posterior log equals the maximum theoretical ESS, which is almost surely an overestimate.