fix error in fitting-a-line equation (32) as identified by Yin-Zhe Ma #18

Open
davidwhogg opened this issue Jan 17, 2013 · 9 comments

@davidwhogg (Owner) commented Jan 17, 2013

email dated 2013-01-17

ghost assigned davidwhogg Jan 17, 2013
@jobovy (Collaborator) commented Jan 17, 2013

Andisheh Mahdavi also pointed this out in an email 13 months ago. I changed the code at that point:

http://trac.astrometry.net/changeset?reponame=&new=20081%40trunk%2Fprojects%2Fstraightline%2Fpy%2Fex12.py&old=16114%40trunk%2Fprojects%2Fstraightline%2Fpy%2Fex12.py

but it seems we never changed it in the document. It also seems the figure was not updated, and there is probably another figure that needs updating (this is a little confusing, since the mapping between Exercises in the document and in the code is not the same).

@davidwhogg (Owner, Author)
Oh, good point. Can you try to track down the code? When you think the figures are fixed, can you re-assign this issue to me to fix the text? I will then go through the remaining issues and update on arXiv. We should thank Andisheh and Yin-Zhe too.

ghost assigned jobovy Jan 18, 2013
@jobovy (Collaborator) commented Jan 22, 2013

I updated Figure 9 and the left panel of Figure 10, which were (I think) the only figures affected by this bug. I haven't re-LaTeXed, so I don't know whether the figures look right.

I also imported the code that was in astrometry/projects/straightline/py (*.py and data*.dat).

ghost assigned davidwhogg Jan 22, 2013
@davidwhogg (Owner, Author) commented May 27, 2017

Also got an SMS on 2017-05-23 about this from @dfm:

> Dude. I think there's a tiny thinko in eq. 32 of fitting a line (the section with arbitrary measurement covariances). When you integrate the likelihood function over datasets x and y, you get a function of the slope, not a constant. You can also find the same result if you set all the elements of S to zero except sigma_y. This might actually have the behavior that you want (penalizing large slopes), but have you thought about this before? I'm trying to derive the jitter generalization in N dimensions right now...

@dfm (Collaborator) commented May 30, 2017

I think I tracked down the commits Jo is talking about, and (I may be totally wrong, but) I think the issue is actually slightly more subtle. If I understand the changes correctly, the likelihood function you're now using is:

ln(L) = K - 0.5 * sum(Delta**2/Sigma**2 + ln(Sigma**2))

and that is absolutely correct if you treat Delta as your data, but I think it gets more complicated if x and y are the data.
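
For concreteness, here is a minimal sketch of that likelihood for the straight-line model (my own sketch, assuming Delta_i is the orthogonal displacement of point (x_i, y_i) from the line and Sigma_i^2 is the projection of the covariance S_i onto the line's unit normal, as in the paper, with the constant K dropped):

```python
import numpy as np

def ln_like_delta(m, b, x, y, S):
    """ln(L) = -0.5 * sum(Delta**2/Sigma**2 + ln(Sigma**2)), dropping K.

    x, y: data arrays of shape (N,); S: covariance matrices of shape (N, 2, 2).
    """
    theta = np.arctan(m)
    v = np.array([-np.sin(theta), np.cos(theta)])    # unit normal to the line
    Delta = v[0] * x + v[1] * y - b * np.cos(theta)  # orthogonal displacements
    Sigma2 = np.einsum("i,nij,j->n", v, S, v)        # Sigma_n^2 = v^T S_n v
    return -0.5 * np.sum(Delta**2 / Sigma2 + np.log(Sigma2))
```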

One way to see this qualitatively is with the special case where the covariance matrix is S = [[0, 0], [0, sigma_y^2]]. In this case, the likelihood doesn't simplify to the usual thing. Instead, since Sigma^2 = sigma_y^2 * cos^2(theta), the likelihood becomes:

ln(L) = K - 0.5 * sum((m*x + b - y)**2/sigma_y**2 + ln(sigma_y**2 * cos(theta)**2))

I think we want that extra cos^2(theta) to be cancelled by the K "constant". You can derive the same thing by integrating L over y in (-inf, inf) and x over [xmin, xmax] and setting the result equal to one. This gives something like K = 0.5 * ln(cos(theta)**2) - ln(xmax - xmin) + constant (don't quote me on that).
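
The normalization failure is easy to check symbolically. A minimal sympy sketch (my own check, using the sigma_x = 0 special case above): a properly normalized sampling distribution for y would integrate to 1 over y, but this one integrates to a slope-dependent factor:

```python
import sympy as sp

y, x, b = sp.symbols("y x b", real=True)
m, sigma_y = sp.symbols("m sigma_y", positive=True)

# per-point likelihood with Delta = (y - m*x - b)*cos(theta), Sigma^2 = sigma_y^2*cos^2(theta)
theta = sp.atan(m)
Delta = (y - m * x - b) * sp.cos(theta)
Sigma2 = sigma_y**2 * sp.cos(theta) ** 2
L = sp.simplify(sp.exp(-(Delta**2) / (2 * Sigma2)) / sp.sqrt(2 * sp.pi * Sigma2))

print(sp.simplify(sp.integrate(L, (y, -sp.oo, sp.oo))))  # sqrt(m**2 + 1), not 1
```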

The thing that I'm now confused about is how to generalize this to higher dimensions. It's easy to write down the sampling distribution for Delta in arbitrary dimensions, but the integral to compute K is a FPITA to do... anyone have thoughts on this?

@jobovy (Collaborator) commented May 31, 2017

@dfm I see what you mean; yes, it doesn't seem right that the expression doesn't reduce to the regular chi^2 fit when the x uncertainties go to zero. I think this ultimately comes from the assumption (left implicit in the text) of a prior that is a wide Gaussian 'along the line' rather than 'along x' (taken infinitely wide to derive eq. [32]), which leads to the cos(theta) factors.

Here's one way to derive what is probably the correct expression. (I wonder whether introducing the 'perpendicular distance' to the line is ultimately more confusing, because I don't use it in this derivation.) We'll need the following math for conditioning Gaussians: for a 2D Gaussian N([x, y] | [0, b], [[alpha^2, rho*alpha*beta], [rho*alpha*beta, beta^2]]), we have p(y|x) = N(y | b + rho*(beta/alpha)*x, beta^2*(1 - rho^2)).
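
That conditioning identity is easy to spot-check numerically. A quick sketch (my own, with arbitrary made-up parameter values): sample from the 2D Gaussian, take a thin slice in x, and compare the slice's mean and variance to the formula:

```python
import numpy as np

rng = np.random.default_rng(42)
b, alpha, beta, rho = 1.0, 2.0, 1.5, 0.8
cov = np.array([[alpha**2, rho * alpha * beta],
                [rho * alpha * beta, beta**2]])
xy = rng.multivariate_normal([0.0, b], cov, size=2_000_000)

# condition on a thin slice around x = x0 and compare to the formula
x0 = 0.7
y_slice = xy[np.abs(xy[:, 0] - x0) < 0.01, 1]
print(y_slice.mean(), b + rho * (beta / alpha) * x0)  # conditional mean
print(y_slice.var(), beta**2 * (1 - rho**2))          # conditional variance
```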

The model for the thin 2D line is p(x, y) = p(x) p(y|x), with p(x) = N(x | 0, alpha^2), a Gaussian with mean zero and std. dev. alpha, and p(y|x) = N(y | m*x + b, 0). We can write this as a 2D Gaussian using the reverse of the conditioning math above: N([x, y] | [0, b], [[alpha^2, rho*alpha*beta], [rho*alpha*beta, beta^2]]), where to get the line to emerge we need (a) rho = 1 (so the conditional variance is zero: the thin line) and (b) beta/alpha = m.

We can convolve the 2D Gaussian model with the uncertainty covariance S; the variance then simply becomes [[alpha^2 + sigma_x^2, alpha*beta + sigma_xy], [alpha*beta + sigma_xy, beta^2 + sigma_y^2]], while the mean remains [0, b]. This Gaussian has correlation coefficient (alpha*beta + sigma_xy) / sqrt(alpha^2 + sigma_x^2) / sqrt(beta^2 + sigma_y^2).

Then we can write this 2D Gaussian again in the form p(x) p(y|x), with p(x) = N(x | 0, alpha^2 + sigma_x^2). p(y|x) is a Gaussian with mean b + (alpha*beta + sigma_xy)/(alpha^2 + sigma_x^2) * x; in the limit alpha --> infinity while keeping beta/alpha = m (to get an uninformative prior on x), this becomes b + m*x. The variance is (beta^2 + sigma_y^2) * (1 - (alpha*beta + sigma_xy)^2 / ((alpha^2 + sigma_x^2) * (beta^2 + sigma_y^2))), which in the limit alpha --> infinity reduces to sigma_y^2 + m^2*sigma_x^2 - 2*m*sigma_xy. This is the same variance as Sigma^2 up to a cos^2(theta) factor.
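
The limit can be checked symbolically. A minimal sympy sketch (my own, substituting beta = m*alpha so that beta/alpha = m stays fixed as alpha grows):

```python
import sympy as sp

alpha, m, sigma_x, sigma_y, sigma_xy = sp.symbols(
    "alpha m sigma_x sigma_y sigma_xy", positive=True)
beta = m * alpha  # keep beta/alpha = m fixed as alpha -> oo

slope = (alpha * beta + sigma_xy) / (alpha**2 + sigma_x**2)
var = (beta**2 + sigma_y**2) * (
    1 - (alpha * beta + sigma_xy) ** 2
    / ((alpha**2 + sigma_x**2) * (beta**2 + sigma_y**2)))

print(sp.limit(slope, alpha, sp.oo))             # m
print(sp.simplify(sp.limit(var, alpha, sp.oo)))  # m**2*sigma_x**2 - 2*m*sigma_xy + sigma_y**2
```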

So, in the limit alpha --> infinity, the log likelihood is ln[p(y|x)] plus a ln[p(x)] term that doesn't vary with the parameters (and is formally negative infinity), or:

ln L = -0.5 * sum[ ln(sigma_y^2 + m^2*sigma_x^2 - 2*m*sigma_xy) + (y - m*x - b)^2 / (sigma_y^2 + m^2*sigma_x^2 - 2*m*sigma_xy) ]

The second term is the same as we have in the paper (because the cos^2(theta) factors in both Delta^2 and Sigma^2 cancel); the first term is different. This reduces to the correct chi^2 expression for sigma_x --> 0.
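
To make the relationship concrete, here is a minimal sketch (my own, with made-up toy numbers) comparing the two forms; they differ by exactly 0.5*N*ln(1 + m^2), which is the slope-dependent cos^2(theta) normalization term:

```python
import numpy as np

def ln_like_delta(m, b, x, y, S):
    # the Delta/Sigma form of eq. (32), rewritten using Sigma^2 = cos^2(theta) * V
    V = S[:, 1, 1] + m**2 * S[:, 0, 0] - 2 * m * S[:, 0, 1]
    return -0.5 * np.sum((y - m * x - b) ** 2 / V + np.log(V / (1 + m**2)))

def ln_like_corrected(m, b, x, y, S):
    # the corrected expression: V = sigma_y^2 + m^2*sigma_x^2 - 2*m*sigma_xy
    V = S[:, 1, 1] + m**2 * S[:, 0, 0] - 2 * m * S[:, 0, 1]
    return -0.5 * np.sum((y - m * x - b) ** 2 / V + np.log(V))

rng = np.random.default_rng(0)
N, m, b = 50, 2.0, 1.0
x = rng.uniform(0, 10, N)
y = m * x + b + rng.normal(0, 0.5, N)
S = np.tile(np.diag([0.3**2, 0.5**2]), (N, 1, 1))  # same diagonal covariance per point

# the two log likelihoods differ by a slope-dependent offset
print(ln_like_delta(m, b, x, y, S) - ln_like_corrected(m, b, x, y, S))  # 40.24...
print(0.5 * N * np.log(1 + m**2))                                       # 40.24...
```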

So I think the discussion in terms of Delta (the perpendicular distance from the line) is confusing, because it introduces cos^2(theta) factors that shouldn't be there. If we agree on this, I guess I should change the code and figures again. And we should probably update the arXiv version (@davidwhogg).

I think it should be straightforward to generalize the derivation above to more than two dimensions and include intrinsic variance. The math for conditioning Gaussians in more than 2D is similar to that above (I'm sure you know this anyway, but it is also given in the Appendix of the arXiv v1 version of the XD paper).

@dfm (Collaborator) commented May 31, 2017

Perfect. I actually just derived that same expression following a similar argument and it definitely makes me happier! I'd be in favor of updating the document because it does seem to make a difference in the cases I've tested.

@davidwhogg (Owner, Author)
fixing now...

@davidwhogg (Owner, Author)
Reopening this issue because I haven't re-posted to arXiv, and I think I need to do a final read and edit. I also have unanswered email from Maddox (2018-12-04) and Dunne (2018-11-07) related to this that I have to work out.

davidwhogg reopened this May 2, 2020