[Format][C] Add ArrowAsyncDeviceStreamHandler to C Data Interface #43631

Closed
zeroshade opened this issue Aug 9, 2024 · 1 comment
Comments

@zeroshade
Member

Describe the enhancement requested

As discussed in apache/arrow-adbc#811, there is a need for an asynchronous version of the C Data Stream interface to allow for more use cases and for interaction with other runtimes (such as R, Python, and Ruby).

Currently, the C Data Stream interface is difficult to use with constructs such as Python's async/await, because it is inherently synchronous for the consumer: the consumer asks for the next batch and waits for it. Managing it asynchronously requires additional work by users. Instead, we can define an asynchronous-style stream interface that enables such use cases, and ADBC can then build new APIs on top of it.
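To make the pull-versus-push distinction concrete, here is a minimal sketch in C. Every name in it (`ToyBatch`, `ToyStream`, `ToyAsyncHandler`, `toy_produce`, `demo_total_rows`) is hypothetical and heavily simplified; it is not the proposed `ArrowAsyncDeviceStreamHandler` layout, only an illustration of a consumer that registers callbacks instead of blocking on a "get next" call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a record batch; not an Arrow struct. */
struct ToyBatch { int num_rows; };

/* Pull style: the consumer calls get_next and waits for the result. */
struct ToyStream {
  int (*get_next)(struct ToyStream* self, struct ToyBatch* out);
  void* private_data;
};

/* Push style: the producer calls back into the consumer when data is ready.
 * A NULL batch signals end of stream (an assumption of this sketch). */
struct ToyAsyncHandler {
  int (*on_next)(struct ToyAsyncHandler* self, const struct ToyBatch* batch);
  void (*on_error)(struct ToyAsyncHandler* self, int code);
  void* private_data;
};

/* Consumer state: counts rows across all pushed batches. */
struct RowCounter { long total_rows; int finished; int error_code; };

static int counter_on_next(struct ToyAsyncHandler* self,
                           const struct ToyBatch* batch) {
  struct RowCounter* c = (struct RowCounter*)self->private_data;
  if (batch == NULL) { c->finished = 1; return 0; }
  c->total_rows += batch->num_rows;
  return 0;
}

static void counter_on_error(struct ToyAsyncHandler* self, int code) {
  struct RowCounter* c = (struct RowCounter*)self->private_data;
  c->error_code = code;
}

/* Toy producer: pushes batches in a loop. A real producer would invoke the
 * callbacks from an event loop or worker thread as data arrives. */
void toy_produce(struct ToyAsyncHandler* handler, const int* rows, size_t n) {
  for (size_t i = 0; i < n; i++) {
    struct ToyBatch batch = { rows[i] };
    int rc = handler->on_next(handler, &batch);
    if (rc != 0) { handler->on_error(handler, rc); return; }
  }
  handler->on_next(handler, NULL);  /* end of stream */
}

/* Usage: wire a row-counting consumer to the toy producer. */
long demo_total_rows(void) {
  struct RowCounter counter = { 0, 0, 0 };
  struct ToyAsyncHandler handler = { counter_on_next, counter_on_error, &counter };
  const int rows[] = { 3, 5, 2 };
  toy_produce(&handler, rows, 3);
  assert(counter.finished && counter.error_code == 0);
  return counter.total_rows;
}
```

In the push style the consumer never blocks; it only reacts to callbacks, which is the shape that maps naturally onto event loops and async/await runtimes.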

Component(s)

C, Format

zeroshade added a commit that referenced this issue Nov 6, 2024
…3632)

### Rationale for this change
See apache/arrow-adbc#811 and #43631

### What changes are included in this PR?
Definition of `ArrowAsyncDeviceStreamHandler` and addition of it to the docs.

I've sent an [email to the mailing list](https://lists.apache.org/thread/yfokmfkrmmp7tqvq0m3rshcvloq278cq) to start a discussion on this topic, so this may change over time due to those discussions.

* GitHub Issue: #43631

Lead-authored-by: Matt Topol <[email protected]>
Co-authored-by: Felipe Oliveira Carvalho <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Co-authored-by: Raúl Cumplido <[email protected]>
Co-authored-by: Dane Pitkin <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Co-authored-by: David Li <[email protected]>
Co-authored-by: Ian Cook <[email protected]>
Signed-off-by: Matt Topol <[email protected]>
@zeroshade zeroshade added this to the 19.0.0 milestone Nov 6, 2024
@zeroshade
Member Author

Issue resolved by pull request #43632.

zeroshade added a commit that referenced this issue Nov 11, 2024
)

### Rationale for this change
Building on #43632, which created the async C Data structures, this adds helper functions to `bridge.h`/`bridge.cc` for managing the async C Data interfaces.

### What changes are included in this PR?
Two functions are added to `bridge.h`:

1. `CreateAsyncDeviceStreamHandler` populates an `ArrowAsyncDeviceStreamHandler` and uses an `Executor` to provide a future that resolves to an `AsyncRecordBatchGenerator`, which produces record batches as they are pushed asynchronously. The `ArrowAsyncDeviceStreamHandler` can then be passed to any asynchronous producer.
2. `ExportAsyncRecordBatchReader` takes a record batch generator and a schema, along with an `ArrowAsyncDeviceStreamHandler` whose callbacks it calls to push data as it becomes available from the generator.
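The division of labor between the two helpers can be sketched with a toy, single-threaded C model. Everything below is hypothetical (`ToyHandler`, `ToyQueue`, `toy_create_handler`, `toy_export_generator`, and the rest); it does not reproduce the signatures in `bridge.h`, and it replaces the future and executor with a plain bounded queue. The consumer-side helper fills in a handler that buffers pushed batches, and the producer-side helper drains a generator into that handler.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_QUEUE_CAPACITY 8

/* Hypothetical stand-in for a record batch; not an Arrow struct. */
struct ToyBatch { int num_rows; };

/* Bounded ring buffer that holds batches pushed by the producer. */
struct ToyQueue {
  struct ToyBatch items[TOY_QUEUE_CAPACITY];
  size_t head, count;
  int finished;
};

/* Callback table the producer pushes into. NULL batch means end of stream. */
struct ToyHandler {
  int (*on_next)(struct ToyHandler* self, const struct ToyBatch* batch);
  void* private_data;
};

static int queue_on_next(struct ToyHandler* self, const struct ToyBatch* batch) {
  struct ToyQueue* q = (struct ToyQueue*)self->private_data;
  if (batch == NULL) { q->finished = 1; return 0; }
  if (q->count == TOY_QUEUE_CAPACITY) return -1;  /* queue full */
  q->items[(q->head + q->count) % TOY_QUEUE_CAPACITY] = *batch;
  q->count++;
  return 0;
}

/* Consumer side: populate a handler whose callbacks fill the queue. */
void toy_create_handler(struct ToyHandler* handler, struct ToyQueue* queue) {
  queue->head = 0;
  queue->count = 0;
  queue->finished = 0;
  handler->on_next = queue_on_next;
  handler->private_data = queue;
}

/* Consumer side: take one buffered batch; returns 1 if one was available. */
int toy_queue_pop(struct ToyQueue* q, struct ToyBatch* out) {
  if (q->count == 0) return 0;
  *out = q->items[q->head];
  q->head = (q->head + 1) % TOY_QUEUE_CAPACITY;
  q->count--;
  return 1;
}

/* A generator returns 1 and fills *out while batches remain, else 0. */
typedef int (*ToyGenerator)(void* state, struct ToyBatch* out);

/* Producer side: push every generated batch through the handler. */
int toy_export_generator(ToyGenerator gen, void* state, struct ToyHandler* handler) {
  struct ToyBatch batch;
  while (gen(state, &batch)) {
    int rc = handler->on_next(handler, &batch);
    if (rc != 0) return rc;
  }
  return handler->on_next(handler, NULL);  /* end of stream */
}

/* Demo generator: yields batches with next, next+1, ..., limit rows. */
struct CountingState { int next, limit; };

static int counting_gen(void* state, struct ToyBatch* out) {
  struct CountingState* s = (struct CountingState*)state;
  if (s->next > s->limit) return 0;
  out->num_rows = s->next++;
  return 1;
}

/* Usage: export a generator, then drain what the handler buffered. */
int demo_drain_total(void) {
  struct ToyQueue queue;
  struct ToyHandler handler;
  struct CountingState state = { 1, 4 };
  struct ToyBatch batch;
  int total = 0;
  toy_create_handler(&handler, &queue);
  int rc = toy_export_generator(counting_gen, &state, &handler);
  assert(rc == 0 && queue.finished);
  while (toy_queue_pop(&queue, &batch)) total += batch.num_rows;
  return total;
}
```

The sketch runs producer and consumer back to back on one thread for simplicity; the helpers described above are meant for the case where batches arrive asynchronously.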

### Are these changes tested?
Unit tests are added (currently only one test; more are to be added).

### Are there any user-facing changes?
No

* GitHub Issue: #43631

Lead-authored-by: Matt Topol <[email protected]>
Co-authored-by: David Li <[email protected]>
Co-authored-by: Benjamin Kietzman <[email protected]>
Signed-off-by: Matt Topol <[email protected]>