Hi, I need to extract a few columns from ~20k different FITS files. Each file is relatively small, ~0.2 MB. So far I have been doing this with a loop and astropy, like this:
import numpy as np
from astropy.io import fits

data = []
for file_name in fits_files_list:
    # memmap=False reads each small file fully into memory
    with fits.open(file_name, memmap=False) as hdulist:
        lam = np.around(10**hdulist[1].data['loglam'], 4)
        flux = np.around(hdulist[1].data['flux'], 4)
        z = np.around(hdulist[2].data['z'], 4)
    data.append([lam, flux, z])
For the 20k FITS files this takes ~2.5 hours, and from time to time I need to loop through the files for other reasons. To cut that time down, I tried fitsio in this way:
import fitsio

data = []
# Timing test on the first 300 files only
for file_name in fits_files_list[:300]:
    # The context manager closes each file after reading
    with fitsio.FITS(file_name) as hdulist:
        lam = np.around(10**hdulist[1]['loglam'][:], 4)
        flux = np.around(hdulist[1]['flux'][:], 4)
        z = np.around(hdulist[2]['z'][:], 4)
    data.append([lam, flux, z])
Unfortunately, it gives me little or no improvement in run time. So my questions are: Can I make the loop faster with fitsio? Do you know of other packages that would help? Can I change my algorithm to make it run faster, e.g. by somehow vectorizing the loop? Or is there software that can quickly stack 20k FITS files into a single FITS file (TOPCAT has no function that does this for more than 2 files)? Thanks for any ideas and comments!
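For reference, here is a minimal sketch of one variation that might be worth timing: running the same per-file read concurrently with the standard library's `concurrent.futures`. It is untested on this data and would only help if most of the time per file is spent waiting on disk or network rather than on computation. The names `read_spectrum` and `read_all` and the `max_workers=8` default are placeholders made up for this sketch, not part of astropy or fitsio.

```python
from concurrent.futures import ThreadPoolExecutor


def read_spectrum(file_name):
    """Read the three columns from one file (same logic as the astropy loop)."""
    # Imported here so that read_all below can be reused with any reader.
    import numpy as np
    from astropy.io import fits

    with fits.open(file_name, memmap=False) as hdulist:
        lam = np.around(10 ** hdulist[1].data["loglam"], 4)
        flux = np.around(hdulist[1].data["flux"], 4)
        z = np.around(hdulist[2].data["z"], 4)
    return [lam, flux, z]


def read_all(file_names, read_one=read_spectrum, max_workers=8):
    """Apply read_one to every file concurrently; results keep input order."""
    # max_workers=8 is an arbitrary starting point, not a tuned value.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(read_one, file_names))
```

It would be called as `data = read_all(fits_files_list)`. If the work turns out to be CPU-bound instead, `ProcessPoolExecutor` from the same module offers the same `map` interface and could be swapped in, at the cost of pickling the arrays between processes.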