Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Continuum normalization #291

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Conversation

alexji
Copy link

@alexji alexji commented Aug 27, 2018

This is addressing #231.
To start, two continuum fitting functions have been defined:

  • fit_continuum_generic takes in a spectrum and fits a user-specified model (from astropy.modeling) with iterative sigma clipping. Exclusion regions can be specified. This is the most minimal implementation of the interface suggested here
  • fit_continuum_linetools that is a direct port of the algorithm in linetools. The only change is making it compatible with specutils.Spectrum1D and changing the Akima spline to scipy's version. This does not yet satisfy the expected interface.

Needed before PR is complete:

  • unit tests
  • examples in docs
  • ideally an astropy.modeling subclass that implements spline-like models (and possibly a corresponding fitter class?)
  • proper implementation of quantities (I have just turned everything into an array ignoring units)

Advice/examples of how to organize unit tests/testing data would be very helpful!
I have also been unable to write/run tests because most of my spectra cannot be read by the current spectrum readers. These are spectra that are compatible with IRAF (multispec format).

@profxj
Copy link

profxj commented Aug 28, 2018

Great work @alexji !

Shall I try to add a script like lt_continuum and give it a test drive?

ps. Testing is going to be a challenge here, no doubt..

pps. There is an extensive module in linetools for I/O: readspec.py
Maybe it is time I try to port a bunch of it. That said, IRAF is a defunct code ;)

@eteq eteq requested a review from brechmos-stsci September 5, 2018 15:14
@alexji
Copy link
Author

alexji commented Sep 5, 2018

This is important to do, but right now I don't think anything will have changed because it is almost literally a direct port.

However after the next step (wrapping the output spline into an astropy model) I think this would definitely be worth it to make sure all the interfaces still make sense.

@eteq
Copy link
Member

eteq commented Sep 5, 2018

A few things on this:

  • This has collided a bit with the work in Fitting of Spectrum1D #210 (which is based on the notebook that I linked but has evolved a bit as things do upon implementation). @brechmos-stsci has made some very basic (even more "generic" than yours I think) start on continuum fitting as part of that that has a similar interface to this, but slightly different - I'll let @brechmos-stsci speak to that in more detail.
  • That said, this is definitely more thorough/useful for many science cases/etc. The question is more one of naming: I'm not a fan of linetools being in the name because that's not very descriptive of the algorithm and/or use case. What about something like fit_continuum_qso (although that actually seems too specific to me, but I had trouble coming up with something better...)? Similarly, perhaps fit_continuum_generic should be fit_continuum_sigmaclip? (The one in Fitting of Spectrum1D #210 is somewhat similar but uses median smoothing... So maybe it should be fit_continuum_median?)
  • We are trying to sprint to get a release of specutils out at the end of this week. That might be a bit agressive for getting this in unless you have lots of time the rest of this week, @alexji - do you mind if we aim this for 0.5 (which doesn't yet have a specific timeline, but should at least be before the end of the year)

@brechmos-stsci
Copy link

@eteq and @alexji Yes, my continuum fitting is super simple. The PR #210 that @eteq references provides more of a framework for the continuum fitting (and is based on a basic model fitting). I think this PR could reasonably easily be fit into that framework once #210 is merged. I am happy to help with guidance in any way.

@alexji
Copy link
Author

alexji commented Sep 5, 2018

@eteq yes, 0.5 sounds good to me! @brechmos-stsci the #210 framework looks good and I will adapt this to match that.

fit_continuum_sigmaclip sounds good.
I realized later it is somewhat duplicating functionality provided in astropy.modeling.fitting.FittingWithOutlierRemoval link, but if we allow the user to specify the fitter then reimplementing this seems inevitable.

@eteq
Copy link
Member

eteq commented Sep 6, 2018

@alexji - one other thing that just occurred to me: I think moving forward we will want to support spline fitting using an astropy.modeling if we can. It may be ok to sort of "hack" it for this particular application, but do you think it might be possible to prototype out a modeling-compatible fitter and matched model? The hope would be that we could include that here so that it doesn't block further development, but then try to port it back to astropy if possible?

@alexji
Copy link
Author

alexji commented Sep 6, 2018

Yeah, I agree this is a good idea!

The only way I have thought to do this is to have spline-specific models and fitters that are meant to work together, and will error if the wrong model/fitter combination are used. Otherwise I think we would need to reimplement the highly optimized spline-specific fitters.

If that sounds good, then I will make relevant subclasses and include them here in a way that hopefully can be easily ported to astropy.

(Edit: timescale for this is probably about 2 weeks)

@eteq
Copy link
Member

eteq commented Sep 6, 2018

spline-specific models and fitters that are meant to work together

Yes, exactly what I was thinking, @alexji . It's a bit of a violation of the idea that they are mix-and-match but I think that makes sense for splines. Although now that I think about it, in principal you could use any arbitrary fitter as long as the model exposes all the parameters (knot locations + coeffs), you just probably wouldn't want to use anything other than the spline-specific one.

@hcferguson
Copy link
Contributor

hcferguson commented Sep 6, 2018

For what it's worth, I've had the most success with scipy.interpolate.UniverariateSpline, although it generally takes several iterations of fiddling with s and k before I am happy or fed up.

That routine has get_coeffs and get_knots methods, which might be handy. It has other methods as well, which I suppose one might like to have available within the Astropy version.

@alexji
Copy link
Author

alexji commented Sep 9, 2018

@eteq I have put in a very basic spline fitter version and a simple test (rebased onto current version). Basically I have just rewrapped scipy's splines into the astropy modeling framework.

One thing I realized is there's basically two modes of spline fitting: you can either specify knots, or you can have it automatically determine knots when fitting.
The former makes it straightforward to expose knots etc in the modeling framework. The latter is often convenient to do, but mixes model definition and fitting. For now I have just allowed both, but I wonder if we should scrap automatic knot determination from data (or separate it into a separate function)?

SaOgaz pushed a commit that referenced this pull request Mar 25, 2019
WIP: Updating the update script for the testers
@rosteen
Copy link
Contributor

rosteen commented Apr 28, 2020

@alexji this is back on our radar as something we want to get merged in, but it will likely need to be updated to match up with the current API. I'll be digging into what exactly needs to be changed soon, but I wanted to check in with you first and see if you have the availability/desire to do the updates. I'm happy to take this on if not, just let me know either way.

@alexji
Copy link
Author

alexji commented Apr 28, 2020

@rosteen it sounds like you have a plan in mind for this, so you should go ahead! I'm happy to contribute in any way that is helpful, but I haven't been paying attention to the other API changes so would need some time to get back up to speed.

@rosteen
Copy link
Contributor

rosteen commented Apr 30, 2020

@alexji I actually haven't had time to dig in and figure out exactly what needs to be updated/formulate a plan yet - I think that @nmearl or @eteq have a good idea of what updates are needed here and can chime in with the answer sooner than I could. This isn't super time sensitive, so as long as you're willing it would be awesome if you could do the updates. I just wanted to offer to step in if you were too busy or no longer had any interest.

@alexji
Copy link
Author

alexji commented May 1, 2020

Yeah I am happy to do it, especially if not time sensitive. I can finish this before the end of May. But it would be very helpful to be pointed to any resource laying out the anticipated API.

@nmearl
Copy link
Contributor

nmearl commented May 6, 2020

@alexji Do you mind if I rebase and push directly to your branch with some low-level fixes?

@alexji
Copy link
Author

alexji commented May 6, 2020 via email

@nmearl nmearl force-pushed the continuum-norm branch from 1c88ccf to 96fd450 Compare May 6, 2020 17:29
Copy link
Contributor

@nmearl nmearl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @alexji. It seems though there's quite a few unfilled methods in the fitter. I wasn't able to get the fitter working even in a somewhat useable fashion, so we certainly want to revisit the implementation to see how far along it is in serving as a fitter.

Also, in general, we don't support plotting within the specutils library, so the references in tests as well as in the fit_continuum_linetools function will need to be removed.

Likewise, I'm not sure of the use of fit_continuum_linetools as it seems to be very QSO-centric. Is the idea to adopt the generic portions into fit_continuum_generic?


if not isinstance(model, modeling.FittableModel):
raise ValueError('The model parameter must be a astropy.modeling.FittableModel object')
# TODO: this is waiting on a refactor in modeling.fitting to work
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexji can you expand on what specifically this is waiting for in the refactor?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember exactly, but I believe it had to do with the fact that at the time astropy.modeling did not support any type of spline fitting.

return continuum_model


def fit_continuum_linetools(spec, edges=None, ax=None, debug=False, kind="QSO", **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexji can you expand on where this is planned to be used?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was not a plan, but it was requested in #231 so I ported it in.

The only changes are switching to Scipy's Akima1D interpolator and
changing the relevant syntax.
"""
assert kind in ["QSO"], kind
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We certainly want this to be useful for other than QSO spectra. It'd be nice to have an idea of what might need to be done to make it so!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main difference between QSO spectra and others is the pre-defined knots (and shifting them by some redshift assuming the units, as mentioned below).

However the fitting algorithm iteratively rejects knots based on goodness-of-fit and can be generalized. A good default knot chunking can be done by just spacing knots linearly, and the user can specify them in more detail.

It may be out of scope for this package, but it does seem nice to include pre-defined knots to enable easier out-of-the-box usage. However it also makes sense to remove them entirely for maximum generality, and to define another package with those pre-defined knots.

assert len(wa) > 0

zp1 = 1 + redshift
div = np.rec.fromrecords([(200. , 500. , 25),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this assumes the spectrum is in wavelength space, which we can't guarantee.

sci_y = scipymodel(x)
assert np.allclose(my_y, sci_y, atol=1e-6)

if make_plot:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't particularly useful for automated testing infrastructures (and we also do not support plotting functionalities within the library), so this will eventually have to be removed.

@alexji
Copy link
Author

alexji commented May 7, 2020

It looks like there have been updates to the API for astropy.modeling since I wrote the SplineModel class. I will go through this and fix it up.

For now, we can put it into specutils.fitting.spline, but ultimately this should probably go into astropy.modeling (cf astropy/astropy#7522)

I will also see if I can implement the linetools algorithm as an astropy fitter. My concern is the number of knots (i.e. number of parameters in the spline model) will change after running the fitter, and I think changing the number of parameters in an astropy model is not good.

@alexji
Copy link
Author

alexji commented May 8, 2020

OK I have fixed up the SplineModel so it passes the test (at least locally, travis is running). For the fitter, I am not sure what methods have to be implemented.

It is not possible to define a fixed set of parameters for all classes of spline models that the fitter than determines, because by default a spline will automatically determine knot locations from the data.
However it is possible to do this if you force the user to define knots as part of the model. This is probably the most common use case. Should I restrict the SplineModel to only be this case?

@rosteen
Copy link
Contributor

rosteen commented May 11, 2020

I should have time to pull down the latest changes and think more about the knot question today.

My initial thought is that while it would be nice to be able to specify the number of knots without explicitly defining their location, if that's a big effort it would be ok to require the user to explicitly define the knots as part of the model in the interest of having at least some usable spline fitting. We can iterate later to add functionality.

It's been quite a while since I was using splines for science - @eteq do you want to chime in to agree/disagree that user-defined knots are the most common use case anyway?

@hcferguson
Copy link
Contributor

@rosteen I'm used to specifying the number of knots rather than their locations for this kind of application. Also, my recollection is that there are huge performance advantages to having data on a regular grid, at least for the 2D case in scipy (RectBivariateSpline is much faster than BivariateSpline). Not sure if the same goes for 1D splines.

Also, I've not seen it applied in the context of continuum fitting, but Gaussian Process Regression might be an interesting alternative to splines, potentially even making use of the uncertainties. There's a package in scikit-learn. [https://scikit-learn.org/stable/modules/gaussian_process.html](Gaussian Process Regression).

@alexji
Copy link
Author

alexji commented May 11, 2020

@hcferguson 's common use case of specifying knot number is easy to include. It just involves a helper class to automatically generate evenly spaced knots over the spectrum's dispersion axis and pass that into the other model [and probably creating a convenience class for this type of model].

@rosteen
Copy link
Contributor

rosteen commented May 12, 2020

@alexji I was thinking pretty much the same thing about auto-generating the N knots. @hcferguson would that solution fit with your needs? We'd certainly document what exactly specifying the number of knots does, in terms of them being fixed rather than free-floating.

Base automatically changed from master to main March 22, 2021 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants