Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to fitsio? #419

Open
astrofle opened this issue Oct 28, 2024 · 1 comment
Open

Switch to fitsio? #419

astrofle opened this issue Oct 28, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@astrofle
Copy link
Collaborator

Feature description
dysh is slow (as of 0.4.0). We need to make dysh faster.

Could switching to fitsio help with some of that?
Also, fitsio supports reading row ranges -- which astropy doesn't. This could be useful for an online mode.

A quick test shows fitsio may be faster at reading data (taken from a notebook):

import numpy as np
import pandas as pd

import fitsio
from astropy.io import fits
from dysh.util import get_project_testdata

filename = get_project_testdata() / "AGBT20B_014_03.raw.vegas/AGBT20B_014_03.raw.vegas.A6.fits"

%%timeit -n 10
hdu = fits.open(filename)
df = pd.DataFrame(np.lib.recfunctions.drop_fields(hdu[1].data, "DATA"))
    36.1 ms ± 5.48 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit -n 10
data = fitsio.read(filename)
df = pd.DataFrame(np.lib.recfunctions.drop_fields(data, "DATA"))
    13.4 ms ± 1.49 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

That is almost a third of the time.

@astrofle astrofle added the enhancement New feature or request label Oct 28, 2024
@teuben
Copy link
Collaborator

teuben commented Oct 28, 2024

every little bit helps, but we need to see this in the "bigger" picture. If we have a large getps() - and we do have a benchmark. In issue #279 I did some work, totally unfinished, but telling, about a benchmark. LIke with all good software, we should define a benchmark that a user can run, and verify the result, e.g. in the output of the statistics of the spectrum to many digits. I've advocated this "qac" technique as an option to the Spectrum.stats() function.

I'm actually surprised, I've naively thought fits.open() would use fitsio.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants