
Missing learner (no weight = 0) #232

Open
kmishra9 opened this issue Aug 15, 2019 · 4 comments
@kmishra9

Hey there,

I fit a model with the following code:

# Define Super Learner
stack = make_learner(
    Stack,
    lrnr_glm,
    lrnr_randomForest,
    lrnr_xgboost,
    lrnr_xgboost_limited,
    lrnr_svm,
    lrnr_earth
)
metalearner   = make_learner(Lrnr_nnls)
super_learner = Lrnr_sl$new(learners = stack, metalearner = metalearner)

# Sequential model training
model_13 = super_learner$train(train_task)

Everything looks good except that the Lrnr_earth_2_3_backward_0_1_0_0 learner, shown in the picture below, is missing from the weights entirely. It doesn't even have a weight of 0 like lrnr_glm does 🤷‍♂, and yet training works when I call lrnr_earth$train() manually.
image

@imalenica
Member

I would guess it fails on one of the folds, hence it is removed from the list of learners used when fitting the final weights.

@kmishra9
Author

That makes sense. It takes a while to run, so I've been checking that the learners run on samples. Any ideas on how I could work out which fold or piece of data it's failing on?

@kmishra9
Author

Hmm, so I'm not sure it's the case that lrnr_earth is simply failing. I split the dataset into chunks of 100 rows and trained lrnr_earth on each one, then verified that each had been trained and that there were no errors:

results = foreach(i = seq(from = 1, to = nrow(final_log_continuous_dataset_train), by=100)) %do% {
    task = make_sl3_Task(
        data         = final_log_continuous_dataset_train %>% dplyr::slice(i:(i+99)),
        covariates   = covariates,
        outcome      = outcome,
        outcome_type = "continuous",
        weights      = weights
    )
    return(lrnr_earth$train(task))
}
for (i in results) {i$assert_trained()}

In addition, when I ran the same super_learner on a sample_task instead of the train_task from above, I got the same result (6 input learners, 5 learners with weights).

Finally, I ran the following code:

small_stack = make_learner(
    Stack,
    lrnr_earth
)
small_super_learner = Lrnr_sl$new(learners = small_stack, metalearner = metalearner)
scheduled_small_super_learner = Scheduler$new(
    delayed_object = delayed_learner_train(learner = small_super_learner, task = train_task),
    job_type       = FutureJob,
    nworkers       = cpus_logical,
    verbose        = TRUE
)
scheduled_small_super_learner$compute()

This errors with the following:

... [below repeated a bunch of times]
Error in private$.train(subsetted_task, trained_sublearners) : 
  All learners in stack have failed
In addition: Warning message:
In private$.train(subsetted_task, trained_sublearners) :
  Lrnr_earth_2_3_backward_0_1_0_0 failed with message: no function 'earth' could be found. It will be removed from the stack
updating Stack from ready to running
run:11 ready:0 workers:60
updating Stack from running to resolved
updating Stack from running to resolved
updating Stack from running to resolved
updating Stack from running to resolved
updating Stack from running to resolved
updating Stack from running to resolved
updating Stack from running to resolved
updating Stack from running to resolved
updating Stack from running to resolved
updating Stack from running to resolved
updating Stack from running to resolved
Failed on Stack
Error in self$compute_step() : 
  Error in private$.train(subsetted_task, trained_sublearners) : 
  All learners in stack have failed

That foreach loop that worked above? Doesn't work now:

Failed on Lrnr_earth_2_3_backward_0_1_0_0
... [repeat a bunch of times] ... 
Failed on Lrnr_earth_2_3_backward_0_1_0_0
Error in { : task 1 failed - "no function 'earth' could be found"

However, this seems to show the output I'd expect:

getS3method("earth", "default")

@imalenica
Member

Just a few short comments: to check manually which fold it fails on, you would want to grab the folds generated by sl3, with the corresponding samples in each, and train the learner on them one at a time.
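A minimal sketch of that per-fold check, in base R only. It assumes the folds are a list of origami-style folds, each carrying `training_set` and `validation_set` row indices. `find_failing_folds`, `fit_fun`, and the toy data are hypothetical names introduced here for illustration; they are not part of sl3.

```r
# Sketch: try a fit on every fold's training rows and record which folds error.
# `folds`   : list of folds, each with `training_set` / `validation_set` indices
#             (assumed to mirror the structure origami uses).
# `fit_fun` : hypothetical stand-in for "train the learner on these rows".
find_failing_folds <- function(folds, fit_fun) {
  messages <- vapply(folds, function(fold) {
    tryCatch({
      fit_fun(fold$training_set)
      NA_character_                      # NA means the fit succeeded
    }, error = function(e) conditionMessage(e))
  }, character(1))
  failed <- which(!is.na(messages))
  data.frame(fold = failed, message = messages[failed],
             stringsAsFactors = FALSE)
}

# Toy data: 30 rows split into 3 folds of 10 validation rows each.
toy_data <- data.frame(x = seq_len(30), y = seq_len(30) * 2)
toy_folds <- lapply(1:3, function(v) {
  validation <- ((v - 1) * 10 + 1):(v * 10)
  list(v = v,
       training_set = setdiff(seq_len(30), validation),
       validation_set = validation)
})

# Toy fit that deliberately fails when row 1 is not in the training rows.
toy_fit <- function(rows) {
  if (!(1 %in% rows)) stop("simulated failure")
  lm(y ~ x, data = toy_data[rows, ])
}

failures <- find_failing_folds(toy_folds, toy_fit)
print(failures)
```

With sl3, the folds would presumably come from the task itself (something like `train_task$folds`) and `fit_fun` would train `lrnr_earth` on the task subset to those rows. Treat both as assumptions to check against the installed sl3 version.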

Just from glancing at the error, it looks like earth is not loaded or installed in the session where training runs. Can you provide your sessionInfo() output?
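The wording "no function 'earth' could be found" matches the error base R's `getS3method()` raises when it cannot find the generic from the calling environment. That usually means the package defining the generic is not attached in that session. Whether sl3 resolves earth this way internally is an assumption here. A small sketch for checking the three relevant conditions (`check_generic` is a hypothetical helper, not an sl3 function):

```r
# Sketch: report whether a package is installed, whether it is attached,
# and whether getS3method() can resolve the generic's default method.
check_generic <- function(generic, pkg) {
  list(
    installed    = requireNamespace(pkg, quietly = TRUE),
    attached     = paste0("package:", pkg) %in% search(),
    # optional = TRUE makes getS3method() return NULL instead of erroring
    method_found = !is.null(getS3method(generic, "default", optional = TRUE))
  )
}

# A generic that is always available, for comparison.
known <- check_generic("print", "base")

# A made-up generic and package name, to show the failing case.
missing_case <- check_generic("not_a_real_generic_xyz", "notARealPackageXyz")

# Without optional = TRUE the lookup errors; capture the message.
msg <- tryCatch(
  getS3method("not_a_real_generic_xyz", "default"),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Running `check_generic("earth", "earth")` both in the interactive session and inside whatever process does the training would show whether the two differ. If `installed` is TRUE but `attached` is FALSE where training runs, calling `library(earth)` in that process first is the likely fix.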
