SPICE only implementation #110

Open · wants to merge 5 commits into base: main
Conversation

akoumjian

@akoumjian akoumjian commented Sep 9, 2024

Update:
I’ve finally had a chance to do the analysis to compare this version of ASSIST (SPICE) with the current implementation (ASCII derived binary files). I wanted to do this analysis to make sure that my changes did not negatively impact the accuracy or performance.

I have run two similar types of tests. The first compares against future state vectors provided by JPL Horizons, the second compares against real historical observations in the MPC (using our ephemeris generation on top of ASSIST).

In addition to testing spk vs ascii, I also wanted to include comparisons for ephemeris version (440 vs 441), FFP contraction (on vs off), and coefficient truncation (50 bits vs the full 52). The latter is only tested for the SPICE implementation and only because there were some indications that the truncated values were aligning closer to JPL.
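For context on what "truncation" means here: reducing a double's mantissa from the full 52 bits to 50 bits just zeroes its two low-order bits. A minimal Python sketch of that operation (illustrative only; the function name is mine, and ASSIST itself does this in C):

```python
import struct

def truncate_mantissa(x: float, keep_bits: int = 50) -> float:
    """Zero the low-order (52 - keep_bits) mantissa bits of an IEEE 754 double."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    drop = 52 - keep_bits
    mask = ~((1 << drop) - 1) & 0xFFFFFFFFFFFFFFFF
    return struct.unpack("<d", struct.pack("<Q", bits & mask))[0]

# e.g. truncate_mantissa(1.0 + 2**-52) == 1.0  (the low mantissa bit is cleared)
```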

Since we are analyzing multiple objects, and each object has a different number of dates we are comparing, I’m providing two numbers to represent the performance.

Mean residual – For a given (environment, object) result pair, we take the mean of all residuals. We then take the mean across all objects for each environment. Smaller is better.

Inverse, normalized residual – Take the worst mean residual for each (environment, object) pair. Normalize all other (environment, object) pair mean residuals against it (e.g. the worst performer gets a 1; an environment with half the residuals of the worst gets 0.5). Then take 1 minus the normalized residual. Lastly, take the mean of those inverse, normalized residuals across all objects for each distinct environment. Since this is an inverse value, larger is better.
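The two metrics above can be sketched as follows. This is one reading of the definitions (the data shape and function name are hypothetical; the real analysis lives in the adam-assist branch linked below):

```python
from collections import defaultdict
from statistics import mean

def summarize(residuals):
    """residuals maps (environment, object) -> list of residual values."""
    # Mean residual per (environment, object) pair.
    pair_means = {k: mean(v) for k, v in residuals.items()}

    # Metric 1: mean of the per-object means, per environment.
    by_env = defaultdict(list)
    for (env, obj), m in pair_means.items():
        by_env[env].append(m)
    mean_residual = {env: mean(ms) for env, ms in by_env.items()}

    # Metric 2: normalize each pair mean by the worst mean for that object
    # across environments, take 1 - normalized, then average per environment.
    worst_by_obj = defaultdict(float)
    for (env, obj), m in pair_means.items():
        worst_by_obj[obj] = max(worst_by_obj[obj], m)
    inv_by_env = defaultdict(list)
    for (env, obj), m in pair_means.items():
        inv_by_env[env].append(1.0 - m / worst_by_obj[obj])
    inverse_normalized = {env: mean(vs) for env, vs in inv_by_env.items()}

    return mean_residual, inverse_normalized
```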

Summary of Results

Based on how small the differences are, and on the fact that main and spk each perform slightly better in different scenarios, I am comfortable making this transition. To me, the open question is whether or not to truncate the values. I am inclined not to, since truncation costs extra computation and we don't know whether its effect is lost in the noise.

Minor Planet Center

The main branch performs better, on the order of 0.05 milliarcseconds relative to the spk branch. I am not sure about the statistical significance. It is interesting that spk with truncation also aligns closer to the MPC, and that the main branch is itself using truncated coefficients via the ASCII file.

| branch | truncation | ffp | de  | residuals_mean (milliarcseconds) | inverse_normalized_residual |
|--------|------------|-----|-----|----------------------------------|-----------------------------|
| main   | None       | off | 440 | 960.421052                       | 8.142460e-05                |
| main   | None       | off | 441 | 960.421052                       | 8.142460e-05                |
| main   | None       | on  | 440 | 960.426462                       | 7.685354e-05                |
| main   | None       | on  | 441 | 960.426462                       | 7.685354e-05                |
| spk    | 50         | on  | 440 | 960.478827                       | 3.260136e-05                |
| spk    | 50         | on  | 441 | 960.478827                       | 3.260136e-05                |
| spk    | 50         | off | 441 | 960.483157                       | 2.893960e-05                |
| spk    | 50         | off | 440 | 960.483157                       | 2.893960e-05                |
| spk    | None       | on  | 440 | 960.517395                       | 4.288764e-09                |
| spk    | None       | on  | 441 | 960.517395                       | 4.288764e-09                |
| spk    | None       | off | 440 | 960.517396                       | 1.694934e-09                |
| spk    | None       | off | 441 | 960.517396                       | 1.694934e-09                |

JPL Horizons

The SPICE branch with truncation performs best, on the order of 5-10 meters. Once again, I don't see any statistical significance. It is again interesting that the SPICE implementation performs better with coefficient truncation.

| branch | truncation | ffp | de  | residuals_mean (km) | inverse_normalized_residual |
|--------|------------|-----|-----|---------------------|-----------------------------|
| spk    | 50         | on  | 441 | 1.674012            | 0.000817                    |
| spk    | 50         | off | 441 | 1.674043            | 0.000783                    |
| spk    | 50         | off | 440 | 1.688946            | 0.000595                    |
| spk    | None       | off | 440 | 1.683057            | 0.000540                    |
| main   | None       | off | 441 | 1.688224            | 0.000514                    |
| main   | None       | on  | 441 | 1.688197            | 0.000499                    |
| spk    | 50         | on  | 440 | 1.690831            | 0.000499                    |
| spk    | None       | on  | 440 | 1.683059            | 0.000487                    |
| spk    | None       | off | 441 | 1.691399            | 0.000390                    |
| spk    | None       | on  | 441 | 1.691395            | 0.000333                    |
| main   | None       | on  | 440 | 1.698822            | 0.000293                    |
| main   | None       | off | 440 | 1.698833            | 0.000267                    |

I’m using a branch of our adam-assist tool to do the analysis. You can find the relevant files here:
https://github.com/B612-Asteroid-Institute/adam-assist/blob/ak/test-assist-versions/tests/test_complete_residuals.py
https://github.com/B612-Asteroid-Institute/adam-assist/blob/ak/test-assist-versions/test-assist-versions.sh

Performance

I ran the benchmark test with the sample size increased to 1000; here are the results on my MacBook. I wasn't expecting it to be faster, but it is, slightly.

main
0.018496s 0.002606s 1.1.9 Sep 12 2024 17:20:51 d450571a0fa3563ead12dc8f013c949c2a5c3834
ak/spk_only
0.017530s 0.002283s 1.1.9 Sep 12 2024 17:23:44 61c95ebeccc0e4b90d69e89a0efd87ad97ddd28f

Bug in Loading DE441

This PR also fixes a bug in the SPICE kernel loading. SPICE files do not require that summary records for the same target be adjacent to each other. The previous assist_spk_init would miss the second run of records in DE441.
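The shape of the bug can be illustrated with toy data (this is not the real DAF record layout, just the grouping logic): records for one target can be interleaved with other targets, so the loader has to scan the whole summary list rather than stopping at the first non-matching record after a match.

```python
# Toy summaries as (target_id, start_et, end_et). In files like DE441,
# records for a given target are not guaranteed to be adjacent.
summaries = [
    (3, 0.0, 100.0),
    (4, 0.0, 100.0),
    (3, 100.0, 200.0),  # second run for target 3, not adjacent to the first
]

def collect_adjacent(summaries, target):
    """Buggy approach: stops at the first non-matching record after a match,
    so it only ever sees a target's first contiguous run."""
    out = []
    for s in summaries:
        if s[0] == target:
            out.append(s)
        elif out:
            break
    return out

def collect_all(summaries, target):
    """Fixed approach: scan every summary record for the target."""
    return [s for s in summaries if s[0] == target]
```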


As per meeting with @matthewholman and @cylon359 ...

This PR moves to supporting only the SPICE (.bsp) files for planet ephemeris. Most of the significant changes are the same as in #107. References to the .440 ASCII-derived file have been replaced.
I have removed the truncation, as we no longer need to find agreement with the ASCII-derived files. I still intend to determine whether truncation provides better agreement with Horizons on a wide scale, as Davide seemed rightly skeptical.

This PR remains a work in progress while I determine why the onthefly_interpolation and onthefly_backwards_interpolation unit tests are failing on GitHub CI. I noticed that these tests produce different results on Apple M machines vs the GitHub x86_64 VMs. The difference is enough to make one test pass on Apple but fail on x86_64.

I have been able to narrow the difference down to the step size being used by rebound. I'm wondering if the higher precision of the SPICE ephemeris is triggering different behavior in some of the math happening in rebound (but only for certain architectures/libraries).

@hannorein
Collaborator

Regarding the machine dependent results you observe, this depends on the compiler. You might need to turn off floating point contractions. I don't know what compiler you're using. If it's clang, use the -ffp-contract=off option.

@akoumjian
Author

> Regarding the machine dependent results you observe, this depends on the compiler. You might need to turn off floating point contractions. I don't know what compiler you're using. If it's clang, use the -ffp-contract=off option.

Thanks @hannorein . I'll investigate that.


# Iterate through each directory in the unit_tests directory and compile the test files
# then run them
test:
Author

I wanted an easy way to run all the C unit tests locally.

Comment on lines +28 to +41
To use ASSIST, you also need to download ephemeris data files: one file for planet ephemeris and a supplementary file for asteroid ephemeris. You can do this with Python packages or by downloading the files directly. Note that these are large files, almost 1 GB in size.

pip install naif-de440
pip install jpl-small-bodies-de441-n16


You can initialize the ephemeris data from the packages like so:

python3

>>> import assist
>>> from naif_de440 import de440
>>> from jpl_small_bodies_de441_n16 import de441_n16
>>> ephem = assist.Ephem(de440, de441_n16)
Author

I hope it's okay that I demonstrate how the python packages can be used in lieu of downloading the files from JPL.

printf("Difference: %.2f m\n",diff);
assert(diff < 250);
fprintf(stderr,"diff to JPL Small Body code %fm\n",diff);
assert(diff < 200);
Author

This particular moment in time gets closer to JPL with SPICE ephem.

@@ -51,7 +51,7 @@ int main(int argc, char* argv[]){

// Compare results
for (int i=0; i<Noutput; i++){
assert(fabs(x_direct[i]-x_interpolate[i]) < 1e-13); // Check that results agree
assert(fabs(x_direct[i]-x_interpolate[i]) < 2e-13); // Check that results agree
Author

I believe the switch to SPICE is making the interpolate function agree ever so slightly less. After doing many comparisons with JPL Horizons and seeing equal or better residuals (compared to the .440 version of ASSIST), I'm inclined to believe this has more to do with the interpolation than with reduced accuracy in the integration.

@akoumjian akoumjian changed the title WIP: SPICE only implementation SPICE only implementation Sep 24, 2024
@akoumjian
Author

@matthewholman @cylon359 I think this is ready for your review. If it does get merged, it seems like a major version increment is warranted given the lack of backwards compatibility with the other data files.

@hannorein
Collaborator

What was the reason to remove support for the other data files?

@akoumjian
Author

Rob and Matt suggested that maintaining two distinct paths for loading the coefficients wasn't necessary, which is why I came back with this simpler implementation (the prior PR supported both). There was also an indication that the ASCII-derived files are missing a couple of bits of data, likely due to the serialization process, though there's no knowing whether those bits carry meaningful precision.

@hannorein
Collaborator

I see. Oh well. I would have voted to keep backwards compatibility. 😉

@akoumjian
Author

I'm happy to accommodate whatever consensus is reached. If I do go back to the multiple-file implementation, I will need to add back a couple of small improvements, and I would want to reorganize how the code runs more aggressively. As a new contributor, I initially wanted to modify as little as possible.

@hannorein
Collaborator

IMO, the old implementation could just live in a separate "legacy" file with the intent of not updating it anymore or providing much support.

@akoumjian
Author

It's not quite that straightforward, as there are two slightly different interfaces for how the masses are loaded and the coefficients generated. I would probably update both to have a similar or identical interface, so that every place that differs in assist.c or forces.c would be a simple conditional or hidden behind a common calling function.

@hannorein
Collaborator

Yeah, I see there is a lot of change in the backend.

What was the idea behind using a getter function for the constants rather than just populating constants in a struct like it used to be? It probably doesn't matter too much for the run time, but now every time a constant is used in the code, it might require up to 8 string comparisons just to read the value.

@akoumjian
Author

Yes, that was left over from the two implementations version. I was wondering if it might impact performance and was surprised that it benchmarked faster. Perhaps I could get it even faster removing those comparisons.

@hannorein
Collaborator

It probably doesn't matter much for the runtime, just seems a bit unnecessary.

In fact, I would remove struct spk_global, struct spk_constants, and struct mass_data all together. I think all of the contents can just be put into struct assist_ephem. I don't see any added benefit from the nested structs.

Also, is the data in struct mass_data actually ever needed after initialization (it looks like it gets copied to sp->targets[m].mass)? If not, I would not keep it around. That makes memory management easier and avoids confusion, since only one instance of each constant exists.

@akoumjian
Author

I'm definitely open to that. If we maintain the file type readers, I would consolidate both implementations into the assist_ephem struct.

@hannorein
Collaborator

FYI: I'm also happy to put in some clean-up work. (I just don't want to mess up your pull request too much!)

It looks like the actual new "datafile reading part" is great and doesn't need any changes, so the precision tests should not be affected by any of the more or less stylistic code changes.

@akoumjian
Author

If you want to branch off this, or merge it in, then clean it up, all are fine with me. I would like to get a version published to pypi that supports this format soon, as it will make a lot of our pipelines cleaner and we won't need to download multiple versions of the coefficient files. I won't have much time this next week to put towards it, but if you'd like me to make those changes I can do it later.

@hannorein
Collaborator

I'm also pretty busy this and (most of) next week. If you want to use it in your pipeline I wouldn't worry too much about putting it on pypi, just install it from your copy. Then you're also safe from any sudden changes which would break the pipeline and can update manually when a new feature is added to the upstream version that you need. But I have no objections to merging this PR and then later changing things (it's just more likely to make other users unhappy if it breaks something for them).
