-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NIFI-12270: Add new provenance event: UPLOAD #8094
Conversation
For uploading local files isn't that what the SEND is for? Or is this to upload from an external location into an external system without ever storing locally? If that's the use case please update the Jira so it's clearer what the new event type is for, thanks! |
Similar to Matt's comment why does a SEND event not cover this case sufficiently? In this case it is that NiFi is effectively being used to take a given external reference and send that data to another system all without having stored locally (within nifi) the bytes. Never the less NiFi is the system that sends the data it seems and that is the purpose of the SEND event. A send without having had a CREATE/RECEIVE is fair game now given this paradigm... that is ok. I think we should avoid adding an event type unless the need/case is really strong. |
This PR is following @markap14 's recommendation in this email thread: |
Ahh that discussion context is helpful - would make sense to have that already linked in either the JIRA itself or this PR. If it was there and I missed it sorry about that. Otherwise, we need to put links to such good/useful discussions on these to help others who might review later. To the issue at hand, I do not share the view we should create a different event type here. The meaning of 'SEND' is that NiFi is responsible for having taken the bits of a thing and sent it to some external system. That is precisely what is happening here. Now there is value in knowing as well then that NiFi itself never actually stored the bits. This is known though already from the provenance trail of this object. Presumably it never had a CREATE/RECEIVE and the bytes will of course have never been populated other than perhaps a link/reference. Given there was already a decent discussion and some apparent consensus and clearly this was a lot of work I am not saying I am a -1 on this but it does add more code/more maintenance/more semantics and I'm not clear what the value add will be in that sense. If the issue of size is being worked around here then ideally we'd have a different/simpler answer. If we continue with this as-is the description of a SEND should ensure devs use that when sending content from within NiFi to an external system and an upload only when sending external content to an external system. Because as written in teh docs/guide right now UPLOAD looks like a subcase of SEND which it is not. |
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/pom.xml
Outdated
Show resolved
Hide resolved
...mponents/src/main/java/org/apache/nifi/controller/repository/StandardProvenanceReporter.java
Outdated
Show resolved
Hide resolved
...-provenance-utils/src/main/java/org/apache/nifi/provenance/upload/StandardUploadContext.java
Outdated
Show resolved
Hide resolved
...a-provenance-utils/src/main/java/org/apache/nifi/provenance/upload/StandardFileResource.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution, @Lehel44! Overall, I think it's reasonable to add the new UPLOAD event but see inline comments about implementation details.
@joewitt I added the discussion link to the Jira, thanks for pointing out. The main issue here was that the SEND event used the size of the incoming FlowFile but instead of a copy of this FlowFile an external file was uploaded and sent to the external system, therefore its size should have been represented instead of the FlowFile's size. From the documentation: "SEND: Emits a Provenance Event of type that indicates that a copy of the given FlowFile was sent to an external destination.". The UPLOAD event would describe an external resource being uploaded to an external destination. While the SEND event automatically uses the size of the FlowFile argument when creating the event, the UPLOAD takes a FlowFile for necessary event attributes (flowfile UUID etc.), but also takes a FileResource object which represents the external resource and uses its attributes when constructing the event: for instance the size and the path are will be displayed in the provenance event. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update @Lehel44. I noticed that FileResource
was changed from an interface to a class, but this is a breaking change at the nifi-api
level, so it should be reverted. I also recommend only adding a minimal number of new methods to ProvenanceReporter, instead of all possible versions.
nifi-api/src/main/java/org/apache/nifi/provenance/upload/FileResource.java
Show resolved
Hide resolved
nifi-mock/src/main/java/org/apache/nifi/util/MockProvenanceReporter.java
Outdated
Show resolved
Hide resolved
Thanks for the review @markap14 @exceptionfactory. Is this one good to go in? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working through the feedback and scoping down the number of new methods @Lehel44, the latest version looks good. +1 merging
Summary
NIFI-12270
See discussion
Tracking
Please complete the following tracking steps prior to pull request creation.
Issue Tracking
Pull Request Tracking
NIFI-00000
NIFI-00000
Pull Request Formatting
main
branchVerification
Please indicate the verification steps performed prior to pull request creation.
Build
mvn clean install -P contrib-check
Licensing
LICENSE
andNOTICE
filesDocumentation