Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow cropping non-geos areas #516

Merged
merged 4 commits into from
Jun 28, 2023
Merged

Conversation

mraspaud
Copy link
Member

  • Tests added
  • Tests passed
  • Passes git diff origin/main **/*py | flake8 --diff
  • Fully documented

@mraspaud mraspaud requested a review from djhoese May 23, 2023 13:44
@mraspaud mraspaud self-assigned this May 23, 2023
@mraspaud
Copy link
Member Author

@ameraner you might want to have a look at this also

@codecov
Copy link

codecov bot commented May 23, 2023

Codecov Report

Merging #516 (44af636) into main (2a6d769) will decrease coverage by 0.10%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #516      +/-   ##
==========================================
- Coverage   94.34%   94.25%   -0.10%     
==========================================
  Files          82       82              
  Lines       13026    13030       +4     
==========================================
- Hits        12290    12282       -8     
- Misses        736      748      +12     
Flag Coverage Δ
unittests 94.25% <100.00%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pyresample/geometry.py 87.64% <100.00%> (+0.05%) ⬆️
pyresample/test/test_geometry/test_area.py 99.21% <100.00%> (+<0.01%) ⬆️

... and 1 file with indirect coverage changes

@djhoese
Copy link
Member

djhoese commented May 23, 2023

If I'm not mistaken this will mean that reduce_data in Satpy will now cause a performance hit for all LEO data right? For some reason I thought reduce_data depended on this failing to not have reduce_data be performed on LEO/SwathDefinition data.

@djhoese
Copy link
Member

djhoese commented May 23, 2023

Copy link
Member

@ameraner ameraner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you for "unlocking" this useful feature! I left just a clarification question.

On a side note, the code for SwathDef data now falls into here

area_boundary = AreaDefBoundary(area_to_cover, 100)
which then reaches a PendingDeprecationWarning
class AreaDefBoundary(AreaBoundary):
"""Boundaries for area definitions (pyresample)."""
def __init__(self, area, frequency=1):
lons, lats = area.get_bbox_lonlats()
warnings.warn("'AreaDefBoundary' will be removed in the future. " +
"Use the Swath/AreaDefinition 'boundary' method instead!.",
PendingDeprecationWarning, stacklevel=2)
AreaBoundary.__init__(self,
*zip(lons, lats))
if frequency != 1:
self.decimate(frequency)

"equal.")

data_boundary = Boundary(*get_geostationary_bounding_box_in_lonlats(self))
data_boundary = self.boundary(frequency=100)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the rationale for choosing 100 here? (we also hardcode it a bit later

area_boundary = AreaDefBoundary(area_to_cover, 100)
). I see that the geos case has some more handling of the frequency here
def _get_geo_boundary_sides(self, frequency=None):
"""Retrieve the boundary sides list for geostationary projections."""
# Define default frequency
if frequency is None:
frequency = 50
# Ensure at least 4 points are used
if frequency < 4:
frequency = 4
# Ensure an even number of vertices for side creation
if (frequency % 2) != 0:
frequency = frequency + 1
.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We set that default just to restrict the number of vertices (several thousands)... otherwise the polygon intersection would takes too long

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we want to keep the number of vertices low to speed up the process.

@mraspaud
Copy link
Member Author

If I'm not mistaken this will mean that reduce_data in Satpy will now cause a performance hit for all LEO data right? For some reason I thought reduce_data depended on this failing to not have reduce_data be performed on LEO/SwathDefinition data.

So maybe we should check that the it is an areadef not swathdef here? or should that be done in satpy to keeps things cleaner here?

@djhoese
Copy link
Member

djhoese commented May 25, 2023

Oh I'm wrong. There is an AreaDefinition check at the top of the method. This is probably good then.

@ghiggi
Copy link
Contributor

ghiggi commented May 25, 2023

@mraspaud: FYI there are several edge cases where this method currently fails:

  • When a swath/area intersects twice with another area, the polygon intersection routine currently runs infinitely if the intersections are more than 1
  • When an area is within (or touches) another, I think that the intersection now runs infinitely ...

If you implement get_array_indices_from_lonlat method also for SwathDefinition, then the method would work also when area_to_cover is SwathDefinition ;)

data_boundary = Boundary(*get_geostationary_bounding_box_in_lonlats(self))
data_boundary = self.boundary(frequency=100)
else:
data_boundary = Boundary(*get_geostationary_bounding_box_in_lonlats(self))
Copy link
Contributor

@ghiggi ghiggi May 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add/call get_geostationary_bounding_box_in_lonlats in https://github.com/mraspaud/pyresample/blob/4e245a906bf3aeff072f6b61febdd1df298fb425/pyresample/geometry.py#L299 get_bbox_lonlats when is geostationary you can simplify all the if_else conditions here just do self.boundary() with a sensible frequency defaults ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here below:

AreaDefBoundary(area_to_cover, 100) fails if area_to_cover is SwathDefinition.
Maybe use area_to_cover.boundary() also here in all cases (with sensible frequency defaults)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this simplification is better suited for another PR

@mraspaud
Copy link
Member Author

Oh I'm wrong. There is an AreaDefinition check at the top of the method. This is probably good then.

I can't see it, could you point me to it please?

@djhoese
Copy link
Member

djhoese commented May 26, 2023

Here:

if not isinstance(area_to_cover, AreaDefinition):
raise NotImplementedError('Only AreaDefinitions can be used')

@djhoese djhoese added this to the v1.28.0 milestone Jun 19, 2023
@djhoese
Copy link
Member

djhoese commented Jun 27, 2023

Hhhmmm maybe I should have merged this before my big config PR. Sorry, but I've caused some conflicts.

@mraspaud
Copy link
Member Author

Fixed.

@coveralls
Copy link

coveralls commented Jun 28, 2023

Coverage Status

coverage: 93.864% (-0.09%) from 93.953% when pulling 44af636 on mraspaud:feature-non-geos-crop into 2a6d769 on pytroll:main.

@mraspaud
Copy link
Member Author

@djhoese please review my two last commits to see if you agree with the refactoring I made.

This refactoring took me some time because I found a bug/inconsistency in the usage of the frequency argument in AreaDefBoundary and the boundary method of area defintion: in the first case it means the step we use to determine the length of the boundary, in the second case it is the length of the boundary itself. I'm working on a fix now (which will be another PR).

@ghiggi
Copy link
Contributor

ghiggi commented Jun 28, 2023

I remember I noticed such inconsistencies already a while ago. For GEO areas is the total number of vertices, while for swath and other areas is the number of steps on each boundary side right? I personally would really like to have this as the number of vertices per boundary side, and allow to specify a tuple (y,x) or (x,y) instead of a single value for all dimensions.

I also remember somewhat that quite ago, by using the step approach, we were discarding boundary corners values, but I think this was fixed by one of mine or your PR @mraspaud.

@mraspaud
Copy link
Member Author

@ghiggi I think both approaches are valid, the usage just depends on what the user needs at that point.

in #526 I have now just replaced frequency with bbox_shape as a more accurate name. I don't change any functionality or behaviour though, so allowing to pass a tuple for example isn't supported (yet?)

@mraspaud
Copy link
Member Author

which then reaches a PendingDeprecationWarning

Indeed, this is now fixed.

@djhoese
Copy link
Member

djhoese commented Jun 28, 2023

@mraspaud I think I understand the last 2 commits. I assume the main motivation was to:

  1. Remove use of AreaDefBoundary
  2. Use one method for getting the boundaries for both the source and cover(target) areas.

So if that is true then 👍 from me.

@mraspaud mraspaud merged commit 2048b12 into pytroll:main Jun 28, 2023
@mraspaud mraspaud deleted the feature-non-geos-crop branch June 28, 2023 20:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants