Numerical instabilities with tf.float32 for cholesky decomposition #307

Open
Sayam753 opened this issue Jul 28, 2020 · 3 comments

@Sayam753 (Member) commented Jul 28, 2020

Carrying forward the discussion in #289, it would be great to investigate the Cholesky decomposition errors with tf.float32 further. Here's a good starting point:

import tensorflow as tf
tf.random.set_seed(42)

def cov(shape, dtype):
  # Build a random Gram matrix (sigma^T sigma) and try to factor it.
  sigma = tf.random.normal((shape, shape), dtype=dtype)
  return tf.linalg.cholesky(tf.matmul(sigma, sigma, transpose_a=True))

cov(10**4, tf.float32)  # fails with decomposition errors
cov(10**4, tf.float64)  # works fine

Should I open this issue on the TF side?

@tirthasheshpatel (Contributor)

Your normal distribution is centered at 0, so the diagonal entries are likely to be near zero, which presents challenges for the float32 datatype. If you center the data differently, the results may change... I did this and got a success rate of about 70% on random data of shape 5000 x 5000 over 100 experiments...

Here's the code:

import sys

import tensorflow as tf

def run(N, nrep, dtype):
    # Count how often the Cholesky factorization of a random Gram matrix
    # (shifted away from zero on the diagonal) succeeds in this precision.
    sr = 0
    for _ in range(nrep):
        a = tf.random.normal((N, N), dtype=dtype) + tf.linalg.eye(N, dtype=dtype)
        a = tf.linalg.matmul(a, a, transpose_a=True)
        try:
            tf.linalg.cholesky(a)
            sr += 1
            print(".", end="")
        except tf.errors.InvalidArgumentError:
            # Raised (in eager mode on CPU) when the matrix is not
            # numerically positive definite.
            print("x", end="")
    print()
    return sr / nrep

if __name__ == "__main__":
    if len(sys.argv) != 4:
        print("usage: python testing.py <N> <nrep> <dtype>")
    else:
        dtype = tf.float32 if sys.argv[3] == 'float32' else tf.float64
        print("success rate:", run(int(sys.argv[1]), int(sys.argv[2]), dtype))

This is not specific to TensorFlow, either; I tried the same experiment with NumPy and got about 71% success... So, the float32 datatype is generally not a good choice for large data...
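
For reference, a rough NumPy version of the same experiment could look like the sketch below (same setup as the TF script; the seed and the use of np.random.default_rng are my own choices, not taken from the run above):

import numpy as np

def run_numpy(N, nrep, dtype):
    # Same idea as the TF script: count how often the Cholesky
    # factorization of a random Gram matrix succeeds in this precision.
    rng = np.random.default_rng(0)
    successes = 0
    for _ in range(nrep):
        a = rng.standard_normal((N, N), dtype=dtype) + np.eye(N, dtype=dtype)
        a = a.T @ a
        try:
            np.linalg.cholesky(a)
            successes += 1
        except np.linalg.LinAlgError:
            # Raised when the matrix is not numerically positive definite.
            pass
    return successes / nrep

print("success rate:", run_numpy(5000, 100, np.float32))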

@tirthasheshpatel (Contributor)

GPy and other libraries just keep adding noise (jitter) to the diagonal until the Cholesky decomposition passes. So, I don't think there is a better solution to this problem, though it is worth some discussion...
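
For context, that jitter trick is roughly the sketch below (a hypothetical helper, not GPy's or PyMC's actual code): retry the decomposition with an increasing diagonal jitter. It assumes the failure surfaces as tf.errors.InvalidArgumentError, which is what eager mode on CPU raises; on GPU the result may instead silently come back filled with NaNs.

import tensorflow as tf

def cholesky_with_jitter(a, max_tries=6):
    # Hypothetical helper: retry tf.linalg.cholesky with increasing
    # diagonal jitter, roughly mirroring the GPy-style workaround.
    eye = tf.linalg.eye(a.shape[0], dtype=a.dtype)
    jitter = 0.0
    for _ in range(max_tries):
        try:
            return tf.linalg.cholesky(a + jitter * eye)
        except tf.errors.InvalidArgumentError:
            # Start with a small jitter, then grow it by 10x per retry.
            jitter = max(jitter * 10.0, 1e-6)
    raise ValueError("Cholesky decomposition failed even with jitter added")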

@Sayam753 (Member, Author)

So, the float32 datatype is generally not a good choice for large data...

That's a good observation. I think I should add a note about this in the Notes section for Full Rank ADVI. Thanks @tirthasheshpatel for providing more insights.
