Forward pass adds same linear basis function #195
Comments
@CherylCB Thanks for reporting this. I will look into it as soon as I can, which might be a while. It's annoying, but if this bug is causing problems for you the best workaround might be to add your own code to prune any duplicate basis functions. If you want to do that and need guidance on where to start, please comment here and I can elaborate. It seems you're using py-earth essentially as a variable selection mechanism for linear regression. Is that right? If so, you could also just extract the set of selected variables and use them with |
Thanks for your answer, @jcrudy. For now I have indeed added my own code to work around the problem of duplicate basis functions being added.
@CherylCB Glad you were able to work around this bug. Please feel free to post code for your workaround in this thread if it's shareable. It might help someone out later. I'll be leaving this issue open until it's fixed.
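For anyone who needs a stopgap in the meantime, the kind of pruning discussed above might look roughly like the sketch below. It is not the workaround @CherylCB wrote, and it does not use any py-earth API. It assumes you already have the transformed design matrix as a list of rows, with one column per basis function, and it drops any column that repeats an earlier one. The names `unique_column_indices` and `drop_duplicate_columns` and the tolerance `tol` are made up for illustration.

```python
from typing import List, Sequence


def unique_column_indices(rows: Sequence[Sequence[float]], tol: float = 1e-12) -> List[int]:
    """Return indices of columns to keep, skipping any column that
    matches an earlier kept column element-wise (within tol)."""
    if not rows:
        return []
    n_cols = len(rows[0])
    # Transpose so each basis function's values are in one list.
    columns = [[row[j] for row in rows] for j in range(n_cols)]
    kept: List[int] = []
    for j, col in enumerate(columns):
        is_duplicate = any(
            all(abs(a - b) <= tol for a, b in zip(col, columns[k]))
            for k in kept
        )
        if not is_duplicate:
            kept.append(j)
    return kept


def drop_duplicate_columns(rows: Sequence[Sequence[float]], tol: float = 1e-12) -> List[List[float]]:
    """Return a copy of the matrix containing only the non-duplicate columns."""
    kept = unique_column_indices(rows, tol)
    return [[row[j] for j in kept] for row in rows]


if __name__ == "__main__":
    # Hypothetical matrix: intercept, a linear term, the same linear term again, another term.
    design = [
        [1.0, 2.0, 2.0, 5.0],
        [1.0, 3.0, 3.0, 6.0],
        [1.0, 4.0, 4.0, 7.0],
    ]
    print(unique_column_indices(design))
    print(drop_duplicate_columns(design))
```

The comparison is pairwise, which is fine for the handful of terms a model like this usually has. The deduplicated matrix can then be passed to whatever linear model you prefer.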
@jcrudy I have some time in the coming days to work on this bug. Do you have any suggestions on where I should look first?
@CherylCB That's great. I'll take all the help I can get. Going off of the current master, I'd start by looking here. As you'll see, I wrote some special code to try to prevent exactly what you are seeing. One of two possible things is probably happening:
You're actually in a good position to figure this out, since you have an example data set that shows the problem. I'd suggest you try to debug what's happening in the forward pass using a script that fits a model to your data set. Unfortunately, it's hard to set up a debugger to work with cython. Perhaps you're more skilled than me in this area, but if not I suggest you just use print statements in the cython files. The workflow is something like this:
If you have any problems, don't hesitate to get in touch. You can reply here or email me (my address is on my github profile). You're potentially saving me a lot of time by working on this, so of course I'm very happy to spend some time helping you succeed at it. Good luck!
I'm trying to run a pyearth model with enable_pruning = False and only linear features, with a max_degree of 1. I noticed that the same feature is added twice. See results below:
I noticed issue #135, which seems to suggest the same bug, addressed by @jcrudy.
I'm installing from the latest commit
git+https://github.com/scikit-learn-contrib/py-earth.git@b209d1916f051dbea5b142af25425df2de469c5a#egg=sklearn-contrib-py-earth
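A quick way to check a fitted model for this symptom is to count how often each term name appears. The sketch below assumes the term names have already been collected into a list of strings, however they are obtained from the fitted model. The function name `duplicated_terms` and the example names are hypothetical, not output from the run described above.

```python
from collections import Counter
from typing import Dict, Iterable


def duplicated_terms(term_names: Iterable[str]) -> Dict[str, int]:
    """Map each term name that occurs more than once to its count."""
    counts = Counter(term_names)
    return {name: n for name, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Hypothetical term names from a linear-only model with pruning disabled.
    names = ["(Intercept)", "x3", "x7", "x3"]
    print(duplicated_terms(names))
```

An empty result means no term was added more than once.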