What motivated this proposal?
Is there a way to micro-batch the actual NATS streams? Let's say I just want to try to pull 20,000 messages if there are that many, or whatever is in the queue otherwise. Something like "Pull X or MAX(queue_length)", i.e. at most X messages per pull.
It's understandable that the connector is in its early stages, but this would be a common use case. Having PySpark examples would also help.
The idea is that even if NATS is a streaming technology, it makes perfect sense to work in micro-batches with standard Spark DataFrames instead of StreamReaders, etc.
Something similar to what is implemented in this Python library here.
What is the proposed change?
Provide a flavour of the connector, or a method, that pulls a fixed number of messages into a Spark DataFrame, avoiding all the streaming APIs once the messages are pulled.
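To make the request concrete, here is a minimal sketch of the "pull X or whatever is in the queue" loop. It is not part of the connector: `pull_up_to`, `make_fake_fetch`, `chunk_size`, and the shape of the `fetch` callable are all hypothetical names chosen for illustration, and the in-memory source only stands in for a real NATS consumer.

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterable, List

# Hypothetical signature: fetch(batch, timeout) returns up to `batch` messages,
# or raises asyncio.TimeoutError when nothing is available.
Fetch = Callable[[int, float], Awaitable[List[Any]]]


async def pull_up_to(fetch: Fetch, max_messages: int,
                     chunk_size: int = 500, timeout: float = 1.0) -> List[Any]:
    """Collect at most `max_messages`, stopping early once the source is drained."""
    collected: List[Any] = []
    while len(collected) < max_messages:
        want = min(chunk_size, max_messages - len(collected))
        try:
            chunk = await fetch(want, timeout)
        except asyncio.TimeoutError:
            break  # nothing left to pull right now
        if not chunk:
            break
        collected.extend(chunk)
        if len(chunk) < want:
            break  # a short chunk means the queue is currently empty
    return collected


def make_fake_fetch(items: Iterable[Any]) -> Fetch:
    """In-memory stand-in for a pull consumer (assumption, for demonstration only)."""
    pending = list(items)

    async def fetch(batch: int, timeout: float) -> List[Any]:
        if not pending:
            raise asyncio.TimeoutError()
        chunk = pending[:batch]
        del pending[:batch]
        return chunk

    return fetch


if __name__ == "__main__":
    # Ask for 20,000 messages while only 1,200 are queued.
    source = make_fake_fetch(f"msg-{i}".encode() for i in range(1200))
    batch = asyncio.run(pull_up_to(source, max_messages=20_000))
    print(len(batch))
```

With a real client, `fetch` could presumably be a thin wrapper around a JetStream pull subscription's `fetch(batch, timeout=...)` in nats-py, and the collected payloads could then be handed to `spark.createDataFrame` to get an ordinary batch DataFrame. That wiring is an assumption about how one might do it in user code today, not something the connector currently offers.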
Who benefits from this change?
Anyone who prefers working from examples written in Python.
What alternatives have you evaluated?
Some users who have the specific use case of batching NATS messages and want to avoid the hurdle of being forced to work with streaming APIs.