Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

An example where estimated prior variance are wrongfully zero #27

Open
gaow opened this issue Mar 25, 2020 · 0 comments
Open

An example where estimated prior variance are wrongfully zero #27

gaow opened this issue Mar 25, 2020 · 0 comments

Comments

@gaow
Copy link
Member

gaow commented Mar 25, 2020

problem

This is in part related to #26 . I got this example from @DongyueXie complaining that mmbr does not work for his experiment:

set.seed(12345)
n = 30
p = 150
k=5
max_cluster = 2
L=matrix(0,nrow=n,ncol=k)
for(i in 1:n){
  L[i,] = rowSums(rmultinom(max_cluster,1,rep(1,k))%*%diag(rnorm(max_cluster,2,1)))
}
FF = matrix(0,nrow=k,ncol=p)
idx = round(seq(1,p,by=p/k))
idx = c(idx,p+1)
for(i in 1:(k)){
  FF[i,(idx[i]):(idx[i+1]-1)] = 5
}
sigma = sqrt(var(c(L%*%FF))/10)
Y = L%*%FF + matrix(rnorm(n*p,0,sigma),nrow=n,ncol=p)
library(mmbr)
library(mashr)
prior_covar = create_mash_prior(sample_data = list(X=t(FF),Y=t(Y),residual_variance=cov(t(Y))), max_mixture_len=-1)
fit = msusie(t(FF),t(Y),prior_variance = prior_covar,precompute_covariances=T)

Where there are 150 samples, 30 conditions and 5 variables. all variables have at least an effect in one of the 30 conditions

The result:

> fit$pip   
[1] 0 0 0 0 0

Interestingly though, if I only consider the first 10 conditions by setting y2 = t(Y)[,1:10] and rerun exactly the command above, I get:

> y2 = t(Y)[,1:10]
> prior_covar = create_mash_prior(sample_data = list(X=t(FF),Y=y2,residual_variance=cov(y2)), max_mixture_len=-1)
> fit = msusie(t(FF),y2,prior_variance = prior_covar,precompute_covariances=T)
> fit$pip
[1] 0.9999996 1.0000000 1.0000000 0.9999965 1.0000000

as number of conditions increase PIP ended up zero.

analyze with SuSiE

for (i in 1:ncol(t(Y))) print(susieR::susie(t(FF),t(Y)[,i])$pip)

many of the PIPs look good. For example,

[1] 0 1 0 0 0
[1] 0 1 1 0 0
[1] 0 0 1 0 0
[1] 0 0 1 1 0
[1] 0 1 1 0 0
[1] 1 0 0 1 0
...

Not estimate prior variance

If we dont estimate prior variance, though:

> fit = msusie(t(FF),t(Y),prior_variance = prior_covar,precompute_covariances=T,estimate_prior_variance=F)
> fit$pip
[1] 0.8921862 0.8927010 0.8925958 0.8930521 0.8925924
> fit$V
 [1] 1 1 1 1 1 1 1 1 1 1
> 

Estimate prior variance but not set it to zero

Now, if I use EM update but not compare sigma_0 at the end of each iteration, #26

> fit = msusie(t(FF),t(Y),prior_variance = prior_covar, estimate_prior_variance=T, estimate_prior_method='EM',check_null_threshold=NA,max_iter=1000)
> fit$pip
[1] 0.8924915 0.8928845 0.8926351 0.8925828 0.8925347
> fit$V
 [1] 0.0006035701 0.0006035703 0.0006035706 0.0006035709 0.0006035712
 [6] 0.0006035714 0.0006035717 0.0006035720 0.0006035723 0.0006035726

The PIP are similar to when prior is not estimated. But the estimated prior are very small. Also it took very long time to converge,

> fit$niter
[1] 783

Use oracle prior and EM estimate for prior scalar

Finally, I try it with the oracle prior, cor(t(L))

> fit = msusie(t(FF),t(Y),prior_variance = cor(t(L)),estimate_prior_variance=T, estimate_prior_method='EM',check_null_threshold=NA,max_iter=1000)
> fit$V 
 [1] 0.3395068 0.2582417 0.2482696 0.1126837 0.1118897 0.1117417 0.1116513
 [8] 0.1115740 0.1114970 0.1114130
> fit$pip
[1] 0.8503046 0.9974211 0.9975020 0.9003030 0.9995462
> fit$niter
[1] 19

The results looks much better now. However if I do check it with zero, at the end of each iteration, #26,

> fit = msusie(t(FF),t(Y),prior_variance = cor(t(L)),estimate_prior_variance=T, estimate_prior_method='EM',max_iter=1000)
> fit$niter
[1] 17
> fit$pip
[1] 0.002025137 0.999861402 0.999674562 0.996728430 0.999980420
> fit$V
 [1] 0.4186166 0.3103659 0.3255259 0.0000000 0.0000000 0.0000000 0.0000000
 [8] 0.0000000 0.0000000 0.2208241

I'm documenting it here for now and will look into it some more later.

gaow added a commit that referenced this issue Apr 15, 2020
@gaow gaow changed the title Estimated prior variance are wrongfully zero An example where estimated prior variance are wrongfully zero May 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant