Factories for distributions that can be reparametrized #240

Open
luke14free opened this issue Mar 29, 2020 · 4 comments

@luke14free

luke14free commented Mar 29, 2020

It would be very nice to implement a pattern where we can have factories for distributions that can be reparametrized in terms of loc/scale. A good example is the Beta, where I'd love to be able to write something like:

pm.Beta.from_loc_scale(name='beta', loc=..., scale=...)

This could become very handy in a number of cases (I was thinking about regressions, but there are many more).

@tirthasheshpatel
Contributor

This can be done easily and is very handy!

Quick fix, straight from Wikipedia:
    # Assumes `import tensorflow as tf` at module level.
    @staticmethod
    def from_loc_scale(name, loc, scale, **kwargs):
        """
        Beta distribution from `loc` and `scale` parameters.

        Parameters
        ----------
        loc : tensor, float
            Mean of the distribution
        scale : tensor, float
            Variance of the distribution
        """
        # Method of moments: nu = alpha + beta = mean * (1 - mean) / variance - 1
        nu = (loc * (1 - loc)) / scale - 1
        # alpha and beta must be strictly positive, so nu must be too
        if tf.reduce_any(nu <= 0):
            raise ValueError("invalid value for `loc` or `scale`")
        alpha = loc * nu
        beta = (1 - loc) * nu
        return Beta(name=name, concentration0=alpha, concentration1=beta, **kwargs)

I wonder if this can be done for multivariate distributions though (or if it even makes sense to do so)? What do you say, @luke14free?
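For anyone who wants to check the transformation in isolation, here is a minimal pure-Python sketch of the same moment matching, with a round trip through the closed-form mean and variance of the Beta distribution. The function names are hypothetical and nothing here depends on PyMC4 or TensorFlow; as in the docstring above, the second argument is the variance.

```python
import math


def beta_params_from_mean_variance(mean, variance):
    """Return (alpha, beta) of the Beta distribution with this mean and variance."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    # A Beta distribution's variance is always below mean * (1 - mean).
    if not 0.0 < variance < mean * (1.0 - mean):
        raise ValueError("variance must lie in (0, mean * (1 - mean))")
    nu = mean * (1.0 - mean) / variance - 1.0  # nu = alpha + beta
    return mean * nu, (1.0 - mean) * nu


def beta_mean_variance(alpha, beta):
    """Closed-form mean and variance of Beta(alpha, beta), for the round trip."""
    total = alpha + beta
    return alpha / total, alpha * beta / (total ** 2 * (total + 1.0))


alpha, beta = beta_params_from_mean_variance(0.3, 0.01)
mean, variance = beta_mean_variance(alpha, beta)
# The recovered moments should match the inputs up to floating-point error.
print(math.isclose(mean, 0.3), math.isclose(variance, 0.01))
```

The bound on the variance is the same condition as `nu > 0` in the snippet above, just stated in terms of the inputs.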

@luke14free
Author

Yes, static methods for the win here. I am not sure how useful it would be to have this on multivariate distributions (there might be use cases, but none come to mind immediately). My point was to avoid recomputing simple transformations every time (in the past I managed to introduce a couple of stupid bugs by transcribing the wrong transformations from paper to code).

Maybe it would make sense to have it for multivariate distributions like the Dirichlet and Multinomial, while the most used ones, like the multivariate Gaussian and Student's t, are already expressed in terms of mean/scale.

@lucianopaz
Contributor

We plan to allow a single parametrization in each distribution's initialization function. PyMC3 supported multiple parametrizations in __init__ (e.g. the Normal) and that made things harder to maintain.
That being said, @luke14free, your idea of having a static factory method do this automatically is a perfectly valid approach. We just need to agree on the design here. I think the simplest way would be to implement these static methods in each distribution class that needs them, but that would lead to essentially duplicate code in many places and would be harder to maintain. Maybe there could be some base classes that implement common reparametrizations (e.g. a normal's scale and precision), with the appropriate classes inheriting from them. I would like to hear what the others think. @twiecki, @junpenglao?
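To make the base-class idea concrete, here is a rough sketch under stated assumptions: `ScalePrecisionMixin`, `from_precision`, and the stand-in `Normal` and `LogNormal` classes are all hypothetical and are not the PyMC4 classes. The sketch uses a classmethod rather than a static method so that the inherited factory builds whichever subclass it is called on.

```python
import math


class ScalePrecisionMixin:
    """Hypothetical mixin adding a `from_precision` factory to any class
    whose constructor accepts `name`, `loc` and `scale`."""

    @classmethod
    def from_precision(cls, name, loc, precision, **kwargs):
        if precision <= 0:
            raise ValueError("`precision` must be positive")
        # precision = 1 / scale**2, so scale = 1 / sqrt(precision)
        return cls(name, loc=loc, scale=1.0 / math.sqrt(precision), **kwargs)


class Normal(ScalePrecisionMixin):
    """Stand-in distribution with a single canonical parametrization."""

    def __init__(self, name, loc, scale):
        self.name, self.loc, self.scale = name, loc, scale


class LogNormal(ScalePrecisionMixin):
    """Second stand-in that reuses the same factory without duplicating it."""

    def __init__(self, name, loc, scale):
        self.name, self.loc, self.scale = name, loc, scale


x = Normal.from_precision("x", loc=0.0, precision=4.0)
y = LogNormal.from_precision("y", loc=1.0, precision=0.25)
print(type(x).__name__, x.scale, type(y).__name__, y.scale)
```

With this layout `__init__` keeps exactly one parametrization, and the conversion lives in one place instead of being copied into every distribution.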

@twiecki
Member

twiecki commented Apr 1, 2020

Yeah, I like the static method approach. In PyMC3 we just supported multiple kwargs, which didn't work terribly either; usually there are no more than 2 parameterizations. Any reason not to do that?
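For comparison, the multiple-kwargs style could look roughly like the sketch below. It is an illustration only: `resolve_scale` and this `Normal` class are hypothetical names and do not reproduce the actual PyMC3 code. The constructor accepts either keyword and normalizes to a single internal parameter.

```python
import math


def resolve_scale(scale=None, precision=None):
    """Hypothetical helper: return the scale given exactly one of the two."""
    if (scale is None) == (precision is None):
        raise ValueError("pass exactly one of `scale` or `precision`")
    if scale is None:
        if precision <= 0:
            raise ValueError("`precision` must be positive")
        # precision = 1 / scale**2
        scale = 1.0 / math.sqrt(precision)
    elif scale <= 0:
        raise ValueError("`scale` must be positive")
    return scale


class Normal:
    """Stand-in distribution whose __init__ accepts both parametrizations."""

    def __init__(self, name, loc, scale=None, precision=None):
        self.name = name
        self.loc = loc
        self.scale = resolve_scale(scale=scale, precision=precision)


a = Normal("a", 0.0, scale=0.5)
b = Normal("b", 0.0, precision=4.0)
print(a.scale == b.scale)
```

The trade-off is that every extra parametrization widens the `__init__` signature and its validation, whereas a factory method keeps the constructor fixed.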
