-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify image tiling requirements for cl_khr_external_memory #861
Comments
We have done some thinking about how the tiling considerations in external memory and image from buffer (#710) intersect with ext_image_tiling_control and exisiting spec behavior. We worked through some principles using which these considerations could be reconciled and we would be able to move forward with the external sharing extension. The key goals of this exercise were.
There's a lot of text in our next comment, but hopefully this summary will help with reviewing our proposal. Thanks, |
When EnqueueMapbuffer is called the spec requires the implementation to present a linear view of the image data. This may be accomplished through When an image is created from a buffer and the image row pitch is not zero, the tiling mode is now implicitly linear since the application specified image row pitch would
We recommend adding a new param name CL_DEVICE_INFER_EXTERNAL_IMAGE_INFO_WITH_UUID for GetDeviceInfo as part of the External Memory extension. We also need to add a new property When importing the external image into OpenCL. If the implementation supports automatic inference of external images,
By default a clEnqueueMapImage will present a linear view of image data to the host CPU regardless of the actual image data layout in device memory. 4)DRM format modifiers can be a reliable mechanism for importing tiled images and can be seen as an alternative to specifying the uuid while Thanks, |
Here are the broad takeaways from the most recent discussion.
Thanks, |
An additional suggestion for implementing robust import of optimally tiled images from an API like Vulkan to OpenCL is to define a vendor extension. For example vendorXYZ_cl_vk_mem_import. The presence of the extension lets the application know that there is a robust mechanism in place for transferring the image tiling information even in the absence of drm format modifiers. |
Just capturing some of the key things we are looking for -
Allowing linear layout as one of the optimal layout sounds reasonable to me. I have been looking at drm format modifiers. It seems like this path could work for us. So, we are open to explore this. In terms of unblocking external memory and semaphore extensions in the short term, if we can not get image layout exchange sorted out in time, one way to move forward is to continue external memory extensions with buffer-only support with image support added later on. For implementations that do not support images, image requirements are already waived off for these extensions. We may just need to add a separate query to advertise image support for external memory sharing separate from default image support. |
From a portability point-of-view I don't think the default and minimum baseline tiling requirement for external memory import can be anything other than linear tiling. I think we should do the following:
Does this sound reasonable? |
@bashbaug - thanks for your feedback. While separating out optimally tiled images into a different extension and requiring linear tiling would simplify things for some vendors, it might not be ideal for other vendors that really depend on optimal tiling for good performance. If we are able to get the ext_image_tiling_control and the drm format modifers extension done, then we can define clean mechanisms for the import of both linear and optimally tiled images within cl_khr_external_memory. In this situation, we may not need to define a default tiling mode. It comes down to a question of timing in terms of reconciling all of the specs. |
I thought we were converging on default being optimal which can be linear as well as non-linear tiling. Was this the case only for regular images? Is there a reason to have linear layout as default for external images? We will need to add export memory mechanism at some point in the future where we would also need to query exportable handles from regular images created within OpenCL and having different default settings between the two may hurt in some way. In any case, if the goal is to be explicit about image tiling/layout, then I agree that having default is not necessary and we can make it mandatory for applications to pass layout information (including linear, optimal and DRM format exchange) to ensure the behavior is deterministic. We can possibly error out in the absence of tiling information while importing external images. I tend to agree with Ben's point (2) that optimal as is defined by Vulkan is problematic and we will mostly need to leave it implementation defined in the spec. If we want to avoid issues with this path, may be we should just focus on having DRM format/similar other mechanism for exchanging layout. Linear and any other optimal layouts that vendors want to support can be expressed via this mechanism. May be, we don't even need to define ambiguous /implementation defined optimal layout as such. In addition to this, if we consider a case where the same memory is bound to different layouts (buffer as well as image, linear layout image as well as optimal layout image) and if the same is imported by OpenCL, the OpenCL's view can only be limited to the layout information passed by application or assumed by the implementation. In such cases, we may still have limitations to have a single coherent view of the data across APIs even after having an explicit mechanism in place. (PS: Some scenarios here may be hypothetical and may not happen in practice, but are still possible). |
Some questions for the discussion today -
|
cl_khr_external_memory currently waives off need for image import for implementations not supporting images which already brings down the minimum bar for the functionality to importing buffers only. Also, there are other mechanisms to extend support for linear images by importing external memory as buffer using cl_khr_external_memory and then using cl_khr_image_from_buffer. So, don't think we need to mandate the linear layout as default in cl_khr_external_memory to get the same functionality supported across vendors/implementations. |
In terms of wider interoperability which is probably the biggest motivating factor to have robust mechanism for exchanging image information, I can see below categories of interop -
While we should pursue 3 to have more generic memory sharing and solving image information exchange may be a small part of it, it may require more than just this. May be we should focus on 1 and 2 for now and then continue to explore 3. |
Lastly, we need some clarity on dependencies across cl_khr_external_memory, cl_khr_image_from_buffer, cl_ext/khr_image_tiling and DRM format modifier or equivalent extensions -
|
Proposing these ideas as a thought experiment to work out if there is a way to robustly support import of optimally tilled images without implying that linear tiling is default. If we make the external memory extension depend on both image tiling control, then we can require that the image tiling mode be explicitly specified when importing images. Implementations that support image tiling control would be required to support both linear and optimal layouts. However an optimally tiled image layout could be implementing using linear tiling. Since the optimal layout is opaque to the application, this would work. This means that the spec is implementable even on platforms that only have linear tiling. DRM format modifiers are currently available on Linux only. Until they are universally available, we could add a device info query ( TILED_IMAGE_SOURCE ) which lists the driver uuids for which image tiling information can be inferred. In the case of an implementation that supports the impot of optimally tiled Vulkan images, this would return the Vulkan driver uuid. In this case, the Vulkan and OpenCL implementations must be from the same vendor. Long term, the robust mechanism for transferring tiling information is drm format modifers. However , we would need to get clarity on which operating systems are supported, which APIs would support them and how to add new modiifers. Note that it should be possible for an application to support only the drm format modifiers corresponding to linear layouts. For OpenCL images created with commands such as clCreateImage, the tiling mode would be implicitly optimal. For specific cases, such as creating an image from a buffer, the tiling mode would be linear if not explicitly specified. These cases would be documented in the image tiling control spec. Overall, the spec would not have any notion of which tiling mode is default. Implementations would support both and for cases where the tiling mode was implicit, the spec would document which mode applied. |
Thanks Balaji. This is a longer route to take, but may be bit better than implicitly assuming some layout to counter the problem of lack of layout information.
By both do you mean both linear and optimal layouts or both image tiling control and DRM modifier extensions?
What kind of inference are we looking for? Can you help understand purpose of this query? Also, uuids should be API agnostic and truly be the property of device. Not sure what we mean by Vulkan driver uuid. |
I think all of us are in agreement that DRM format modifiers or equivalent mechanism is the way to go for robust image information exchange.
|
One of my takeaways from yesterday's teleconference is that the tiling requirements may be different depending on the external memory type. IIRC "Android Hardware Buffers" carry tiling information with them, hence for this type of external memory both linear tiling and optimal tiling could (should? must?) be supported, and explicitly passing any tiling information would be redundant. Assuming this is the case can we define the tiling behavior for each of the external memory types we support?
|
Thanks Ben. Just jotting down few things for further discussion -
For inference capability, is it solely a property of handle type or a property of platform + device + handle_type? |
Brief summary of offline discussion: we're trending toward defining tiling requirements and defaults for the external memory handle types we support. We think we know what to do for OPAQUE_WIN32 handle types and Android hardware buffers. We need to decide how to handle DMA_BUF and OPAQUE_FD handle types. Details: We tried to answer the following questions for each external memory handle type:
For DMA_BUF and OPAQUE_FD handles we cannot infer the tiling layout. We dicussed two possible options, which does not have an (initial) dependency on an image_tiling_controls extension, and one which does: Behavior with Option 1: DMA_BUF and OPAQUE_FD with no initial dependency on image_tiling_controls:
What would a query look like if we added one for (3)?
Possible behavior with Option 2: DMA_BUF and OPAQUE_FD with an initial dependency on image_tiling_controls:
We need to decide whether to pursue one of these two options or a different option. |
After further internal discussion, we'd like to propose the following -
Thanks, |
@bcalidas Also, with item 4, do we need tiling_control extension to depend on DRM format modifier or will this be a restriction only for externally imported images? |
@nikhiljnv v - I was not aware of cases with AHB that did not support automatic inference. This can be discussed as part of the AHB handle type extension. Regarding the tiling_control extension, it will not depend on DRM format modifier but external memory will depend on both tiling and drm format modifier. |
We did some more thinking on item 2 from the earlier post. "This means that external_memory extension now depends on tiling_control extension" It may be possible to remove the dependence of external memory on tiling_control by stating that images imported without a tiling property explicitly specified, are linear. This would not necessarily imply that linear tiling in general is the default for OpenCL , but that for this specific case of external memory usage, linear would apply. If this image had been created in Vulkan with Optimal tiling, then the results would be undefined. The related questions that come up are - |
Some questions -
|
One idea to potentially "un-stick" this issue: We essentially have implementation-defined behavior right now, where some implementations assume that external memory handles for images are linearly tiled and other implementations assume optimal tiling. Since it appears unlikely we will be able to unify this behavior, perhaps we should consider a query so applications will be able to determine what the behavior is and react accordingly. This could be as simple as a single per-device query that returns whether the device assumes linear tiling ( This query would only describe the default tiling behavior when no explicit tiling information is provided. So, if an implementation supports something like the image tiling controls extension or a DRM format modifier extension, any explicit tiling information would override the default and allow other types of tiling. If this sounds interesting, things we will need to decide are: Can we have just one query for the device covering all handle types or do we need handle-specific queries? Does the query make sense for all handle types or just a few handle types? |
If we added a query for default tiling behavior, it would need to be per device and per handle. However, it still leaves unresolved the question of how an implementation can guarantee correct transfer of tiling information if the external image is not linearly tiled. |
Makes sense. We currently have a device-specific query to determine which handle types are supported for importing:
We could add a similar query to determine which handle types are assumed linear? Don't take the enum name too seriously - I know it's really long right now:
Would this work? Note: we have a similar platform query for the handle types that are supported for importing. We could add a platform query for the "assumed linear tiling" query also, if desired. I'm not sure how useful this is, though...
Yes - the "these handles assume linear tiling" query would only be a solution for some use-cases, but it's something we could do easily in the short-term. A longer-term solution would use something like the image tiling extension or a TBD DRM format modifiers extension to transfer the tiling information explicitly. |
I assume CL_DEVICE_EXTERNAL_MEMORY_IMPORT_ASSUME_LINEAR_HANDLE_TYPES_KHR can return a subset of handle types returned by CL_DEVICE_EXTERNAL_MEMORY_IMPORT_HANDLE_TYPES_KHR
I assume the diff between lists returned by CL_DEVICE_EXTERNAL_MEMORY_IMPORT_HANDLE_TYPES_KHR and CL_DEVICE_EXTERNAL_MEMORY_IMPORT_ASSUME_LINEAR_HANDLE_TYPES_KHR would imply handle types with non-linear tiling. Do we also need a similar query for reporting handle types where inference can be supported e.g. CL_DEVICE_EXTERNAL_MEMORY_IMPORT_CAN_INFER_HANDLE_TYPES_KHR ? |
Also, I think that assumes_linear query is useful in general to allow implementations to report the default assumption for images in general in the absence of image tiling information and not just external images for different handle types. |
Thanks for the feedback and questions!
Yes, definitely. It wouldn't include any new handle types that are not supported for importing (we should have a test for this), but it could certainly be a subset.
I think we could use a similar mechanism to query other properties if needed, but I think to "un-stick" this issue we would only need the "assume linear" query. |
We are ok with this proposal. Do we have agreement that the CL_DEVICE_EXTERNAL_MEMORY_IMPORT_ASSUME_LINEAR_HANDLE_TYPES_KHR query is sufficient to "un-stick" the issue. |
Hi, Quick update from Qualcomm. We further reasoned that if an implementation reported a handle type in the CL_DEVICE_EXTERNAL_MEMORY_IMPORT_ This allows us to proceed forward with the base cl_khr_external_memory spec. We can follow up by finishing the ext_image_tiling_control which introduces the concept of linear and optimal tiling. Finally we can add an extension that layers on both image tiling control and external_memory and provides explicit control over the tiling of imported images. With this extension we can also add a per-handle type query about which formats would support automatic layout inference. This would be needed for handle types like Android Hardware Buffer. Hopefully this plan can work and get us moving forward in stages. Feedback is appreciated. Thanks, |
Add query to clGetDeviceInfo for external memory handle types that are assumed to be linear.
@bcalidas If you already have a draft spec, will it be possible to push a PR sometime this week or early next week? May be we can continue to discuss any issues as part of spec PR review. |
@nikhiljnv This PR adds the spec update for assume_linear_handle_types. There is no expectation that the layout of handle types not in the assume_list will be inferred. Just that when no other information is provided, for these handle types the layout is not guaranteed to be linear. Thanks, |
To try and summarize my position on the call of July 18, which stems from @bcalidas suggestions of adding hints to help infer tiling:
The Objective here is to get an API that we can evolve while keeping backward compatible. |
To expand further on Brice's comments, here's how this could be done.
*For some implementations these hints are necessary since a given handle type could have come from one of many different sources. The inference mechanism could be different for each source.
The overall takeaway is that in many cases tiling is already handled implicitly through inferencing. We can use the properties list in clCreateImageWithProperties() to bring clarity to the inferencing mechanisms. |
Hmm, I had slightly different takeaways from our discussion last week. My key observation was that there really are only two "classes" (or "types") of tiling, which are mutually exclusive (you only have one or the other, never both): The first class is "explicit tiling layouts", where an application specifies (prescribes) a specific tiling layout. If an application specifies this specific tiling layout incorrectly then the program is incorrect and will very likely result in an error or in undefined behavior (e.g. a corrupted image). The most obvious explicit tiling layout is "linear tiling", but there could be other explicit tiling layouts as well, such as DRM format modifiers. The second class is "implicit tiling layouts", where no specific tiling layout is specified. With this class of tiling an implementation must to infer or otherwise determine what the tiling layout is for a handle. There are several mechanisms that could be used to do this: for example an implementation may be able to query the tiling layout (e.g. through the OS?) for some handle types, or an implementation may know by policy what the tiling layout is for certain types of handles plus image properties (e.g. image type, format, dimensions, ...). A few follow-on observations:
(I think this is all compatible with @Kerilk's observations above, maybe just said slightly differently and with a few more words.) edit: spelling |
The conclusion from the most recent discussion is -
|
#940 has been updated per review comments. |
This is the pull request for the header file. |
Add query to clGetDeviceInfo for external memory handle types that are assumed to be linear.
* External memory import: Add assume linear query (#861) Add query to clGetDeviceInfo for external memory handle types that are assumed to be linear. * Assign macro numeric value and improve formatting * Address review comments * Fixed token name to what was agreed on in the review * Updated minor version --------- Co-authored-by: Joshua Kelly <[email protected]>
Need to check CTS coverage and decide if need a CTS issue to track further changes. |
Closing as discussed in memory subgroup call on Sept 19, 2023. |
cl_khr_external_memory spec currently doesn't specify if tiled images can be imported from APIs like Vulkan and
more specifically what minimal tiling modes needs to be supported.
This came up during review of KhronosGroup/OpenCL-CTS#1559
This issue tracks required changes to the spec.
The text was updated successfully, but these errors were encountered: