Error Using OEM with Categorical Variables #18

chandrabanerjee opened this issue Mar 6, 2019 · 2 comments

@chandrabanerjee

Hello,
I am trying to use OEM for grouped LASSO on a tall dataset which contains many categorical variables (and as a result, lots of binary variables in the model matrix). While running cv.oem, I keep getting the following error:

Error in oemfit.binomial(is.sparse, x, y, family, penalty, weights, groups, :
TridiagEigen: failed to compute all the eigenvalues

This is not limited to my use case: it also occurs with smaller datasets, such as the birthwt data from the MASS package. I am pasting a reproducible example below:

```r
library(MASS)
library(splines)
library(oem)

# Load and create Model Matrix
data("birthwt")
View(birthwt)  # base R viewer is View(), not view(); interactive sessions only

birthwt$race = as.factor(birthwt$race)
birthwt$smoke = as.factor(birthwt$smoke)
birthwt$low = as.factor(birthwt$low)

X = model.matrix(low~ns(age,3)+ns(lwt,3)+race+smoke+ptl, birthwt)[,-1]
Y = birthwt$low

# Define Groups
grouping = c(1,1,1,2,2,2,3,3,4,5,5,5)

# Run cv.oem for Logistic Regression with Group LASSO penalty:
cvoem = cv.oem(X, Y, family = "binomial", penalty = "grp.lasso", groups = grouping, nfolds = 10)
```

Any help regarding: 1) an explanation of the issue and 2) a workaround would be much appreciated. cv.gglasso is just too slow!

Thanks.

@jaredhuling
Owner

I am not able to reproduce your error. It is a known issue, however, and should be fixed in the current development version on GitHub. If you install the current GitHub version of oem, do you still get this error?

I'm confused by your example, since the number of columns in X in your code is 10, but you have specified a group structure that has length 12. When I fix your code above, there is no error:

```r
library(MASS)
library(splines)
library(oem)

# Load and create Model Matrix
data("birthwt")
View(birthwt)  # base R viewer is View(), not view(); interactive sessions only

birthwt$race = as.factor(birthwt$race)
birthwt$smoke = as.factor(birthwt$smoke)
birthwt$low = as.factor(birthwt$low)

X = model.matrix(low~ns(age,3)+ns(lwt,3)+race+smoke+ptl, birthwt)[,-1]
Y = birthwt$low

# Define Groups (10 entries, one per column of X)
grouping = c(1,1,1,2,2,2,3,3,4,5)

# Run cv.oem for Logistic Regression with Group LASSO penalty:
cvoem = cv.oem(X, Y, family = "binomial", penalty = "grp.lasso", groups = grouping, nfolds = 10)
```
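
As a side note on the length mismatch described above, one way to avoid counting columns by hand is to read the group labels off the `assign` attribute that `model.matrix` attaches to its result. The following is only a minimal sketch, assuming each formula term should form one group; the names `bw`, `mm`, `X_check`, and `grouping_check` are illustrative and not part of the original example.

```r
library(MASS)
library(splines)

data("birthwt")
bw <- birthwt                      # work on a copy (illustrative name)
bw$race  <- as.factor(bw$race)
bw$smoke <- as.factor(bw$smoke)
bw$low   <- as.factor(bw$low)

# Build the full model matrix first: subsetting with [, -1] drops the
# "assign" attribute, so it has to be read before removing the intercept.
mm <- model.matrix(low ~ ns(age, 3) + ns(lwt, 3) + race + smoke + ptl, bw)

# "assign" maps every column to the index of its formula term (0 = intercept).
grouping_check <- attr(mm, "assign")[-1]
X_check <- mm[, -1]

# Fail early if the group vector and the design matrix disagree.
stopifnot(length(grouping_check) == ncol(X_check))
```

With `ptl` left numeric, as in the code above, this should give one group label per column of the design matrix, so the check passes before anything is handed to `cv.oem`.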

@chandrabanerjee
Author

Hi Jared,
Thanks for the quick response. I think I deleted one categorical variable by mistake. Attached is the updated code.

```r
library(MASS)
library(splines)
library(oem)

# Load and create Model Matrix
data("birthwt")
View(birthwt)  # base R viewer is View(), not view(); interactive sessions only

birthwt$race = as.factor(birthwt$race)
birthwt$smoke = as.factor(birthwt$smoke)
birthwt$ptl = as.factor(birthwt$ptl)  # ptl is now categorical
birthwt$low = as.factor(birthwt$low)

X = model.matrix(low~ns(age,3)+ns(lwt,3)+race+smoke+ptl, birthwt)[,-1]

Y = birthwt$low

# Define Groups (12 entries, one per column of X)
grouping = c(1,1,1,2,2,2,3,3,4,5,5,5)

cvoem = cv.oem(X, Y, family = "binomial", penalty = "grp.lasso", groups = grouping, nfolds = 10)
```

This returns the error. I will try and test out the development version of the package.
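
To double-check that the updated example is at least dimensionally consistent, the column count per term can be tabulated once `ptl` is a factor. This is a rough sketch rather than a diagnosis of the `TridiagEigen` failure; `bw2`, `mm2`, `X2`, and `assign2` are illustrative names. It also prints the level counts of `ptl`, since rare factor levels can produce near-constant dummy columns within cross-validation folds. Whether that is related to the error here is not established in this thread.

```r
library(MASS)
library(splines)

data("birthwt")
bw2 <- birthwt                     # work on a copy (illustrative name)
bw2$race  <- as.factor(bw2$race)
bw2$smoke <- as.factor(bw2$smoke)
bw2$ptl   <- as.factor(bw2$ptl)    # categorical, as in the updated code
bw2$low   <- as.factor(bw2$low)

mm2 <- model.matrix(low ~ ns(age, 3) + ns(lwt, 3) + race + smoke + ptl, bw2)
assign2 <- attr(mm2, "assign")[-1] # term index per column, intercept dropped
X2 <- mm2[, -1]

# Number of design-matrix columns contributed by each formula term.
print(table(assign2))

# Observations per level of ptl; sparse levels are worth noting.
print(table(bw2$ptl))

# The hand-written grouping from the updated example should agree with this.
grouping <- c(1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 5, 5)
stopifnot(length(grouping) == ncol(X2), all(grouping == assign2))
```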
