-
-
Notifications
You must be signed in to change notification settings - Fork 205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
General Refactoring TODO #1
Comments
May I add some things?! I have seen some double comparisons with a if(a == b && a == x) in distributions.Uniform:26 (and in many other lines). I would change every occurrence in the code, but usually I use In addition, I already corrected some java docs. You can see it in my dev branch. Hell, you wrote a lot of code. How long have you been developing JSAT? |
Of course!
In that case it is actually safe. if a == b && a == x is the case where the uniform distribution is over one infinitesimal point, which really doesn't make sense, but is needed because otherwise the (a <= x && x <= b) case would apply, and result in a NaN from a division by zero. I'm generally aware of the dangers of floating point, but there are instances where such == checks are done intentionally, such as the case in the Uniform distribution. There are a few other instances of code smells in JSAT that are done intentionally, hopefully I did a better job documenting them then I did that one in Uniform.java
Feel free to do a pull request on them!
Been a free time project for just over 4 years now :) I'm very much a "learn by doing" kinda guy, so whenever I see a paper I'm interested in I just go and try and implement it when I have the time. Though I did have more free time before working for money :-/ |
I will try to do so, but I already added some things which I would say are necessary (apache-commons-math for safe double comparison; plan to add more, e.g. trove for primitive lists). I ran findbugs on the project and it found some bad double compares and x == Double.NaN checks. |
As I said, some of those checks are done intentionally. Right now I do not want to have any dependencies in JSAT. Also, FYI - there is some code where you will experience a serious performance regression in using Trove instead of the code I have for primitive maps. (And some code that is on my local repo and not committed yet). |
I think I found some places where the checks should not be done but I will write a testcase first which may take some time. Will let you know when it's done. Do you have any advice how to compare two doubles for equality in JSAT without an external library? I have an array of doubles and want to check whether they are all the same.
Ok :(
I would use them e.g. instead of List to prevent a lot of AutoBoxing (I had an algorithm where removing boxing/unboxing increased the algorithm by factor 2). |
Feel free to email me the places where you have concerns and I will review them You need to decide what level of "sameness" you need. In ML we usually don't care about the difference in value less than 1e-3, so doing abs(a-b) < 1e-3 should be good for most cases. In the Vec class there is a compare to method that takes a threshold on the absolute difference. As I said, I have my own primitive collections in JSAT, I did not say to use generics. |
Ahh, found them by now. Is there a reason why you use List/List then? E.g. NaiveBayes.getSampleVariableVector (l. 388) |
That code was written before I created the DoubleList class, and I missed changing that one. |
I would separate abstract classes/interfaces from concrete implementations by putting them into sub-packages. |
Notes for myself on some refactoring I want to do at some point. Probably going to do these all at once when I move to Java 8
The text was updated successfully, but these errors were encountered: