NIFI-13427: Python API extended to support source processors #9000

Closed
wants to merge 2 commits into from

Conversation

pgyori
Contributor

@pgyori pgyori commented Jun 24, 2024

Summary

Extends the NiFi Python API so that one can create Python processors that have no incoming connections but can create FlowFiles themselves.

NIFI-13427
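
For illustration, a source processor written against this extension might look like the sketch below. The module path `nifiapi.flowfilesource`, the `FlowFileSource` base class, the `FlowFileSourceResult` constructor arguments, and the Java interface name are assumptions not confirmed anywhere in this conversation, and `CreateGreeting` is a hypothetical processor name. The fallback classes are stand-ins so the sketch also runs outside a NiFi Python runtime.

```python
try:
    # Assumed module and class names; importable only inside a NiFi Python runtime.
    from nifiapi.flowfilesource import FlowFileSource, FlowFileSourceResult
except ImportError:
    # Minimal stand-ins (assumptions) so the sketch can be exercised outside NiFi.
    class FlowFileSource:
        pass

    class FlowFileSourceResult:
        def __init__(self, relationship, attributes=None, contents=None):
            self.relationship = relationship
            self.attributes = attributes
            self.contents = contents


class CreateGreeting(FlowFileSource):
    """Hypothetical source processor: needs no incoming connection."""

    class Java:
        # Assumed name of the Java-side interface backing source processors.
        implements = ['org.apache.nifi.python.processor.FlowFileSource']

    class ProcessorDetails:
        version = '0.0.1-SNAPSHOT'
        description = 'Creates a FlowFile without an incoming connection.'
        tags = ['example', 'source']

    def __init__(self, **kwargs):
        pass

    def create(self, context):
        # Each invocation describes one new FlowFile: its target relationship,
        # its attributes, and its content.
        return FlowFileSourceResult(
            relationship='success',
            attributes={'greeting': 'hello'},
            contents='Hello World!',
        )
```

Outside NiFi, `CreateGreeting().create(None)` returns the result object directly. Inside NiFi, the framework would be expected to call `create` on each trigger and turn the result into a FlowFile.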

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, such as NIFI-00000

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using mvn clean install -P contrib-check
    • JDK 21

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

@henrikjohansen

Hey @pgyori 👋 Just out of curiosity - have you thought about offering a version of the FlowFileSourceProxy for processors that are not safe for concurrent execution?

Currently there is no way of setting annotations such as @TriggerSerially from the Python side - most regular processors are fine with 'Concurrent Tasks' > 1 but I have several source processors I would like to develop where running multiple concurrent tasks would be rather ... problematic 😇

Contributor

@exceptionfactory exceptionfactory left a comment

Great work on this new Python Processor interface @pgyori! The system test is very helpful as well.

If you can address the comments from @lordgamez, this looks like it should be ready to go today.

@exceptionfactory
Contributor

> Hey @pgyori 👋 Just out of curiosity - have you thought about offering a version of the FlowFileSourceProxy for processors that are not safe for concurrent execution?
>
> Currently there is no way of setting annotations such as @TriggerSerially from the Python side - most regular processors are fine with 'Concurrent Tasks' > 1 but I have several source processors I would like to develop where running multiple concurrent tasks would be rather ... problematic 😇

Thanks for raising this question @henrikjohansen. This is probably worth considering in a discussion in Jira, where it would be helpful to outline the type of Processor to be developed. It is worth noting that a large number of the List Processors in Java have the TriggerSerially annotation, so it might even make sense to make that the standard behavior. However, I think this can be considered as a follow-on effort, as the implications may be a bit different given the relationship between Java Threads and Python processes.

@pgyori
Contributor Author

pgyori commented Jun 27, 2024

> Hey @pgyori 👋 Just out of curiosity - have you thought about offering a version of the FlowFileSourceProxy for processors that are not safe for concurrent execution?
>
> Currently there is no way of setting annotations such as @TriggerSerially from the Python side - most regular processors are fine with 'Concurrent Tasks' > 1 but I have several source processors I would like to develop where running multiple concurrent tasks would be rather ... problematic 😇

Hi @henrikjohansen,
That is a very interesting, and challenging, problem. Thank you for asking that! I agree with @exceptionfactory: let's open a separate Jira ticket for that and discuss it further.

@pgyori
Contributor Author

pgyori commented Jun 27, 2024

Thank you @lordgamez, @henrikjohansen, @exceptionfactory!
I appreciate the feedback. I pushed a commit addressing the review comments.

@exceptionfactory
Contributor

Thanks @pgyori! The changes look good, so I plan to merge after verifying the builds.

Contributor

@lordgamez lordgamez left a comment

LGTM

Contributor

@exceptionfactory exceptionfactory left a comment

Thanks again @pgyori! +1 merging
