This is related to #29. We make a single costly EE RPC call that may not be essential. Right now, we read the `system:time_start` property from every image in the collection, which is slow. If we instead read only the first and last (few) values and interpolated the rest, we could save time in the EE backend and avoid the biggest bottleneck for Xee. The tradeoff is that the interpolated values may differ from the actual values, which could cause data errors.
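To make the tradeoff concrete, here is a minimal sketch of the interpolation idea in plain Python. It is not Xee's implementation: the function names are hypothetical, timestamps are assumed to be milliseconds since the epoch (as `system:time_start` is in Earth Engine), and the spacing is assumed to be linear.

```python
from typing import List


def interpolate_time_starts(first_ms: int, last_ms: int, count: int) -> List[int]:
    """Evenly spaces `count` timestamps (ms since epoch) between two endpoints."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        return [first_ms]
    step = (last_ms - first_ms) / (count - 1)
    return [round(first_ms + i * step) for i in range(count)]


def max_abs_error_ms(actual: List[int], estimated: List[int]) -> int:
    """Largest absolute difference between actual and interpolated timestamps."""
    return max(abs(a - e) for a, e in zip(actual, estimated))


DAY_MS = 24 * 60 * 60 * 1000

# A regular daily series is reproduced exactly from its two endpoints.
regular = [i * DAY_MS for i in range(5)]
assert interpolate_time_starts(regular[0], regular[-1], len(regular)) == regular

# An irregular series (a gap after the second image) is not: interior values drift.
irregular = [0, DAY_MS, 3 * DAY_MS, 4 * DAY_MS, 5 * DAY_MS]
estimate = interpolate_time_starts(irregular[0], irregular[-1], len(irregular))
print(max_abs_error_ms(irregular, estimate))  # 43200000, i.e. half a day
```

For regularly spaced collections the interpolated axis matches the real one. For collections with gaps it silently mislabels images, which is the data error described above.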
A good approach seems to be to add this capability behind a feature flag (name TBD). Users who understand their data well and know that interpolation is safe would get a faster way to open data. For datasets where time (or another primary dimension) is hard to interpolate, users keep the current behavior as a fallback.
Once this feature flag exists, we'd also need to rely on the fallback for slicing image ids in `_slice_collection()`.
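One way the flag and its fallback could fit together is sketched below. Everything here is an assumption for illustration: the flag name `interpolate_time` is a placeholder (the real name is TBD), and `fetch_all` and `fetch_endpoints` stand in for the expensive and cheap EE requests rather than calling the Earth Engine API.

```python
from typing import Callable, List, Tuple


def load_time_starts(
    count: int,
    fetch_all: Callable[[], List[int]],
    fetch_endpoints: Callable[[], Tuple[int, int]],
    interpolate_time: bool = False,  # hypothetical flag name; off by default
) -> List[int]:
    """Returns the time axis, interpolating only when the caller opts in."""
    if not interpolate_time or count < 2:
        # Fallback: the slow but exact path that reads every value.
        return fetch_all()
    first_ms, last_ms = fetch_endpoints()
    step = (last_ms - first_ms) / (count - 1)
    return [round(first_ms + i * step) for i in range(count)]


# Stub "requests" that record which path was taken.
calls: List[str] = []
true_values = [0, 10, 20, 30]


def slow_fetch() -> List[int]:
    calls.append("all")
    return list(true_values)


def fast_fetch() -> Tuple[int, int]:
    calls.append("endpoints")
    return true_values[0], true_values[-1]


default_axis = load_time_starts(len(true_values), slow_fetch, fast_fetch)
fast_axis = load_time_starts(
    len(true_values), slow_fetch, fast_fetch, interpolate_time=True
)
```

Keeping the flag off by default means existing users see no change in values, and only users who opt in accept the risk of interpolated coordinates.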
Makes improvements towards #29. Here, we group together all ee.getInfo() calls into one RPC call. In addition, this PR helped identify the underlying reason why Xee is slow (see #30).
Status Quo:
```
open_dataset():avg=51.30,std=10.21,best=39.41,worst=68.27
open_and_chunk():avg=52.94,std=7.01,best=43.06,worst=63.03
open_and_write():avg=113.94,std=27.35,best=90.03,worst=173.90
```
After:
```
open_dataset():avg=39.82,std=8.67,best=25.24,worst=55.54
open_and_chunk():avg=36.46,std=11.96,best=25.71,worst=59.83
open_and_write():avg=91.48,std=4.74,best=86.33,worst=104.08
```
PiperOrigin-RevId: 570480601
Makes improvements towards #29. Here, we group together all ee.getInfo() calls into one RPC call. In addition, this PR helped identify the underlying reason why Xee is slow (see #30).
Status Quo:
```
open_dataset():avg=51.30,std=10.21,best=39.41,worst=68.27
open_and_chunk():avg=52.94,std=7.01,best=43.06,worst=63.03
open_and_write():avg=113.94,std=27.35,best=90.03,worst=173.90
```
After:
```
open_dataset():avg=39.82,std=8.67,best=25.24,worst=55.54
open_and_chunk():avg=36.46,std=11.96,best=25.71,worst=59.83
open_and_write():avg=91.48,std=4.74,best=86.33,worst=104.08
```
PiperOrigin-RevId: 570806762