
Add CensoredDistribution #1489

Closed · wants to merge 4 commits
Conversation

ae-foster (Contributor)

We currently have TransformedDistribution, which handles invertible transformations of random variables. The new CensoredDistribution, added here, applies the following non-invertible transformation:

```
X ~ dist
X[X >= upper_lim] = upper_lim
X[X <= lower_lim] = lower_lim
```

The new log_prob uses the log of the cdf mass at lower_lim, the log of the complementary mass at upper_lim, and the original log-pdf elsewhere. This is a valid probability density, albeit with respect to a new base measure (Lebesgue measure plus atoms at the two bounds).

These distributions find applications in the pyro.contrib.oed work (they make interesting OED problems, because censoring leads to lower or even zero information gain).
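To illustrate the density described above, here is a minimal pure-Python sketch for a Normal base distribution. The function names (`censored_normal_log_prob` and friends) are hypothetical and are not the PR's API; the actual PR operates on torch tensors.

```python
import math

SQRT2 = math.sqrt(2.0)

def normal_log_pdf(x, loc=0.0, scale=1.0):
    # log density of Normal(loc, scale) at x
    z = (x - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)

def normal_cdf(x, loc=0.0, scale=1.0):
    # Phi((x - loc) / scale), computed via the complementary error function
    return 0.5 * math.erfc(-(x - loc) / (scale * SQRT2))

def censored_normal_log_prob(x, lower, upper, loc=0.0, scale=1.0):
    # Point mass log F(lower) at the lower bound, log(1 - F(upper)) at the
    # upper bound, and the unchanged base log-pdf in between: a density with
    # respect to Lebesgue measure plus two atoms.
    if x <= lower:
        return math.log(normal_cdf(lower, loc, scale))
    if x >= upper:
        return math.log(1.0 - normal_cdf(upper, loc, scale))
    return normal_log_pdf(x, loc, scale)
```

Note that the bound branches take an explicit `math.log` of a cdf value; this is exactly where the numerical trouble discussed later in this thread arises when the censoring point sits far in a tail.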

@fritzo (Member) left a comment

Can you please add some tests? It's fine to add a new file `tests/distributions/test_censored.py`.

```python
def rsample(self, sample_shape=torch.Size()):
    x = self.base_dist.sample(sample_shape)
    x[x > self.upper_lim] = self.upper_lim
    x[x < self.lower_lim] = self.lower_lim
```
fritzo (Member)

`return x` and add a test that would have caught this.
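One way to write the kind of test suggested here, sketched in pure Python with a hypothetical `censored_sample` helper standing in for the PR's `rsample`:

```python
import random

def censored_sample(sample_fn, lower, upper):
    # Draw from the base sampler and clamp into [lower, upper]. Forgetting
    # the final `return` (as in the snippet above) would make this None.
    x = sample_fn()
    return min(max(x, lower), upper)

def test_censored_sample_returns_value_in_bounds():
    random.seed(0)
    for _ in range(1000):
        x = censored_sample(lambda: random.gauss(0.0, 3.0), -1.0, 1.0)
        assert x is not None        # fails if the return statement is missing
        assert -1.0 <= x <= 1.0     # samples must respect the censoring bounds
```

The `x is not None` assertion is the one that would have caught the missing-return bug directly; the bounds check guards the clamping logic itself.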


fritzo commented Oct 23, 2018

cc @alicanb who prototyped TruncatedDistribution in probtorch/pytorch#121

@ae-foster (Contributor, Author)

Thanks @fritzo! I'll get on and write some tests.

@activatedgeek (Contributor)

Hey @ae-foster, can you point me to some references where a censored distribution was required?

@ae-foster (Contributor, Author)

I'm going to close this PR unmerged. And it's not because I'm too lazy to write the tests (OK, it's partly that).

The main reason is that I didn't use this code as it stands in pyro.contrib.oed. I want to point out the issue with this code, in case anyone tries something similar in future. The issue is, quite simply, the numerical stability of torch.log(cdf). Log-cdf values, like log-pdf values, can be large and negative, and in that regime the cdf itself has already underflowed to zero in floating point, so taking its explicit log fails. What is needed instead is an asymptotic expansion of the cdf or its log. Fortunately, for certain distributions (e.g. the Normal) such an expansion exists, and the resulting algorithm blends between log(cdf) and the asymptotic expansion to avoid the instability. Unfortunately, this is distribution-specific, which precludes a generic CensoredDistribution tool.
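To make the blending idea concrete for the standard Normal: the classical tail expansion is Phi(x) ~ phi(x)/(-x) * (1 - 1/x^2 + 3/x^4) as x -> -inf. A pure-Python sketch of a stable log-cdf follows; the cutover point of -10 and the number of series terms are illustrative choices, not taken from any particular library.

```python
import math

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)

def stable_log_normal_cdf(x):
    """Numerically stable log(Phi(x)) for the standard Normal.

    For moderate x the direct formula log(0.5 * erfc(-x / sqrt(2))) is
    accurate, but for large negative x the cdf underflows to exactly 0.0
    and its log fails. There we switch to the asymptotic tail expansion
        Phi(x) ~ phi(x) / (-x) * (1 - 1/x**2 + 3/x**4),   x -> -inf.
    """
    if x > -10.0:
        return math.log(0.5 * math.erfc(-x / math.sqrt(2.0)))
    # log phi(x) = -x^2/2 - log(sqrt(2*pi)); divide by -x; add the series
    return (-0.5 * x * x - LOG_SQRT_2PI - math.log(-x)
            + math.log(1.0 - 1.0 / (x * x) + 3.0 / x ** 4))
```

At x = -40 the direct cdf underflows to exactly 0.0, so its log is undefined, while the stable version returns a finite value near -804.6; the two branches agree to roughly 1e-5 around the cutover.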

For details of my new approach, see https://github.com/ae-foster/pyro/blob/oed-master/pyro/distributions/censored_sigmoid_normal.py

ae-foster closed this Mar 18, 2019
4 participants