-
Notifications
You must be signed in to change notification settings - Fork 1.1k
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lazy resampling executes non-lazy transforms without executing pending transforms #6450
Comments
Thanks and agree that adding an option will allow for different use cases, one of the use cases is to achieve fast but less precise results compared with the non-lazy... {"optimize": "lazy_to_front", another_option: 3"} #json-like Looks highly extensible |
hi @atbenmurray I don't understand the issue well
which Noise? there is no intensity Noise transform here. or do you mean something else? --
Will it move all lazy transforms to the front? (then it won't be the right sequence of transforms ,) . If I understand it correctly - to move ALL lazy trasforms to front/back, then it will be applicable only to some special cases (and not in general), since we can't simply re-arrange transforms. A common transform list I use is following (with Indentities for current "lazy_evaluation" version)
I don't think we can simply re-arrange lazy (spatial) transforms to the front. I personally think we should do lazy transform explicit. Anyone who wants to use them must create a separate list of spatial transforms with Compose( spatial_transforms, lazy=True), and then use that new compose_transform as a step between main transforms |
@myron thanks for the detailed response! In general, people shouldn't have to use this compose option. The primary purpose for it is to work around the situation we have now for the upcoming release. As I understand it, we have benchmarked pipelines that rely on the dev branch behaviour. The dev branch produces incorrect output in the general case of having lazy and non lazy transforms mixed together, because, unless you use Identity/Identityd (becoming ApplyPending/ApplyPendingd) to enforce ordering, all non-lazy transforms will be executed first. This decision was taken to improve performance on a benchmark that had a Lambda transform in it (as far as I am aware from the other conversations), but it will break our user's pipelines in the general case. I'm doing some experiments with your pipeline so that I fully understand the implications @wyli I guess I really need to get a complete list of the pipelines this behaviour was done for so we can be sure that this fix would sort those pipelines for benchmarking while allowing default behaviour to not break our users' pipelines |
I think we just use the by the way I disagree with "The dev branch behaviour has a major issue in that unless you use Identity/Identityd (becoming ApplyPending/ApplyPendingd) to enforce ordering", could you rephrase to avoid misleading info? |
The goal has always been to allow it to work without breaking anything in the general case. That way it is accessible just as a switch that can be flipped at the policy level rather than having to rewrite pipelines. The user is always then free to get very fancy with how they organise their transforms to get the extra benefit, but |
I've modified it to say 'will produce incorrect output'. |
@wyli I really need to see the pipelines that the dev branch behaviour was intended for. Surely you can point me at them. The flag isn't a fix unless we know it fixes those specific cases. |
Great! So if we eyeball these and verify that moving all the lazy transforms last is the equivalent of dev branch behaviour, then this is the right modification for the PR and we should be able to avoid needing to re-benchmark |
I think I understand your point, but my point is that, there are 200+ transforms in monai, even if it's in non-lazy mode, you can easily compose certain sequences of some of them that may not work as you think (especially if you don't read the documentation)... we can't say that the monai codebase is incorrect or has major issue, right? |
also, we'll never achieve exactly the same result by design, like the padding modes, output data types (quantisations), and interpolation modes. In most use cases, there's a trade-off for speed/interpolation quality/memory footprint, in my opinion, we just need more documentation about the details and let the user decides what transform sequence to use. |
That's why the default behaviour should be as has always been stated in the design and presentations and discussions, i.e. that the ordering of transform execution is unmodified, and that any changes to transform ordering should come from option flags. Anyway, our plan is to add the option flag which should allow the benchmarked pipelines to run as discussed, and have the default behaviour be to not modify transform order. There won't be any need to document anything as an issue in docstrings or release notes, because it won't be an issue, just an option that allows you to move all the lazy transforms last if you want to do so. draft of a docstring fragment for compose options: option 'move_lazy_last': This option reorders the transforms in a given pipeline so that they are executed after all non-lazy transforms. This can be helpful for performance and image quality, as it results in fewer resampling operations. Note that this option should not be used with pipelines that require ordering to be preserved, such as if you have a mixed sequence of lazy and non-lazy spatial transforms that must be performed in the specified order. If your pipeline uses |
Ok, so after trying a couple of different interpretations of the dev lazy mode, I know what So: 'LX' means lazy transform with id X (id doesn't mean anything here, just helps with clarity in the examples) Given a list of transforms:
The presence of an ApplyPending splits the transform list into multiple sublists
Each sublist is then sorted so non-lazy transforms go first, lazy transforms go next, and then the ApplyPending goes last:
Which then concatenates back to:
This precisely mimics the behaviour in dev and makes sense from a user perspective; it is easy to explain that an ApplyPending effectively splits the list of transforms into sublists and schedules each separately. I've implemented this and I am running a wide selection of tests on it. |
On the dev branch there is no reordering of the user-defined transforms with or without lazy resampling, you can turn on the verbose=True to confirm this. If you are talking about the 'applied_operations' #6371 #6439, these are about inversing of a compose using MONAI/tests/test_integration_lazy_samples.py Line 159 in d29914d
It has nothing to do with the transform call, that is |
There is effective reordering on Given the following pipeline on dev:
Flip is executed lazily
Spacing is executed lazily
Rand3DElastic is executed non-lazily, without the pending transforms being executed first. This means that Rand3DElastic is the first to change the actual data (non-lazy execution), ie out of order
Rotate is executed lazily
Finally, the pending transforms are executed
Rand3DElastic is the first transform to be executed on the data; it should have been the 3rd It doesn't matter that the logging shows that Rand3DElastic was visited in order. It is the first transform that is executed on the data. That is what I mean by effective reordering. This should not be the default behaviour, because we want users to switch on lazy without it breaking their pipelines. |
The default behaviour is no lazy resampling, and the user should read the tutorials and examples before considering enabling lazy resampling. The definition of 'breaking their pipelines' is unclear here. as I mentioned earlier, the padding mode, interpolation mode, output data type may be different compared with the non-lazy in many use cases. But they are not wrong. In v1.2, to fully leverage lazy resampling and get the optimal outcome, the existing preprocessing transforms should be revisited, instead of simply setting lazy_evaluation=True (it'll either not provide any speedup or not provide exactly the same preprocessing result). And for your reordering example, the behaviour will be different when it's used with a cache dataset. Again I don't think your reordering is broken, as long as the program's behaviour is consistent with its documentation, then it's a feature not a bug. The user should have all the freedom to implement their preprocessing logic. So what I'm suggesting is that in one of the option for the |
The example that I showed you above shows that this is a breaking change. If I put Rand3DElastic 3rd, I want it to execute 3rd. #6371 makes my point for me To quote: Disabling This highlights the importance of discussing changes to the proposed design. #6371 was opened, patched, PRed and merged without you requesting a review from me. If you had, I could have pointed out that it isn't just a problem for inversion, but it will break existing pipelines where the user mixes lazy and non-lazy spatial/cropping and other transforms, and that it shouldn't be necessary to disable the invert in these circumstances. Essentially, all we have to do is ensure that the forward execution order is respected for inverse. We already do this quite neatly for Compose-based subclasses such as RandomOrder. The full solution is what I have described above, but I'll repeat it here briefly:
The following should work fine in the upcoming commit:
|
This reordering is already ambiguous for cache Vs non-cache use cases, it'll be 'breaking the pipeline' by your definition. But I don't think it's a bug, it's a feature...the cache + lazy feature interactions are documented. |
It isn't ambiguous for cache vs non-cache. There are clear rules. Caching will only execute transforms up to the first random transform. It then calls the compose pipeline with those transforms only. The reordering rules can be applied for those transforms. The loaded data is then run with the remaining part of the pipeline. The reordering rules can be applied for those transforms. Even if this were the case, there are plenty of users who won't be using the caching / persistent mechanisms, so this is an aside from the core points I made above. It is a bug that
gets effectively executed as
This is not what users expect and won't produce the results they intended for their pipeline. It is also not necessary to disable inversion in the case of
this can be made to work fine. |
Agreed, that's what we have on the Dev branch, that's what we have for lazy resampling on the Dev branch. |
It is a bug that
gets effectively executed as
The default behaviour should be that no reordering occurs, because it breaks what the user has defined for their pipeline. Reordering must be opt in. PR will allow original reordering, reordering, inversion of reordering, for both cache and non-cache. |
I don't think that's a valid/good assumption for monai now and future |
I only point that out because you mentioned that rather than addressing the core point and in the previous paragraph I explained that the behaviour is clearly defined for caching / non-caching I've already shown in previous discussions, presentations and so forth that we can integrate caching into the same mechanism that does things like specifying out of order performance optimisations. |
ok, in the interests of time for releasing v1.2, just to reiterate my understanding:
If you don't agree with any of the above, let's have an offline discussion with the wider team... |
I'd phrase as the following:
But yes, in essence, it should be fine |
Sure, that's good. Implementation-wise, do you plan to include all the code changes in PR #6257 and what's the timeline? If needed I can help run some benchmarks and integration tests with all the testing environments we currently have. |
Yes, PR #6257 should come across in its entirety. It has the default lazy mode, the lazy_to_last flag mode, and the restored ability to run invert. I hope to get the commit done by afternoon / late afternoon today. |
I note that in the pipeline
Rand3DElastic doesn't appear in the applied_operations list. Is this intentional? If not, I can resolve that while I am at it. |
Theres work in progress #1793 for that, but need a major rework to use the latest API |
Ok, so it looks like at least some of the pipelines result in moving execution order around on a per key basis. As a result, it won't be possible to have inversion work for that mode without largely rewriting Example:
So I intend to do the following:
Then for 1.3, I'll have to look at reimplementing the invert mechanism to work with inconsistent ordering on different dictionary keys. It might be that the full lazy implementation already handles this but I'll have to check. |
The commits for this are now on #6257 |
@wyli Ok so there is still an issue. Vanilla lazy mode requires that the user does not have to modify their pipeline in order to run lazy=True, so The only way that I can think to handle this is to have transforms such as |
For me, warning people isn't enough. lazy with options=None should never break a pipeline. I'm also not keen on the idea of cropping transforms calling apply_pending internally. It can't happen anyway because the existence of Transforms that require it should be able to indicate that they do through the LazyTrait interface. I'm testing something out that solves the issue in the in_order pipeline |
the root cause is that the crop sampling locations are defined in the image space instead of the physical space. in the end the coordinates should be defined in the original physical space and they should carry the geometric information during the preprocessing then there's no ambiguity. |
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Describe the bug
Lazy resampling will execute the following pipeline incorrectly:
My understanding is that this was implemented deliberately to provide improved performance on some of the existing pipelines that are being benchmarked, but it will break a lot of our users pipelines if the non-lazy transforms rely on the pending transforms to have been run:
. Noise will undergo spatial distortion that was not intended
. Non-lazy spatial transforms will output the wrong result
While we want the best performance on our pipelines, we shouldn't expect our users to have to modify their pipelines to execute correctly.
The most obvious solution is a new flag for Compose. We've discussed it as a 1.3 feature but now we need it for this release.
Options can be thought of like compile flags for a c++ compiler.
We should pick a way of specifying them that is extensible enough for our purpose and easy for the user:
*** Flags for 1.2 ***
We only need one or two flags for 1.2
.
"lazy_to_front"
,"lazy_to_back"
These should physically reorder the transforms for execution. It can reshuffle the transforms in a way similar to RandomOrder.
@myron This should cover the benchmark performance that you need, yes?
@wyli This seems implementable in #6257 fairly simply. Given that RandomOrder works fine during inverse AFAIK, it means we also don't need to lose invert
The text was updated successfully, but these errors were encountered: