update strided implementation #191

Jutho · 2024-10-07T14:41:50Z

This PR contains two changes:

It adds the backend argument throughout the lower-level methods of the StridedBLAS implementation. This allows those methods (such as blas_contract!) to be reused by other backends that only provide a specialised tensor permutation implementation, in particular the upcoming HPTT.
It adds the allocator argument to the specialised stridedtensorcontract! method for the case where all arguments are simply matrices. Without this argument, that method would never be called: the allocator is inserted higher up in the call chain, so the general method for arbitrary-rank tensors ends up being used instead.
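As a rough illustration of the first change, the sketch below shows how forwarding a backend argument lets a shared lower-level routine be reused by a backend that only specialises the permutation step. Every name here (AbstractDemoBackend, NativeDemo, LazyPermuteDemo, demo_permute, demo_contract) is hypothetical and chosen for illustration only; none of them is the TensorOperations.jl API.

```julia
# Hypothetical stand-ins for illustration; not the TensorOperations.jl API.
abstract type AbstractDemoBackend end
struct NativeDemo <: AbstractDemoBackend end       # plays the role of the default backend
struct LazyPermuteDemo <: AbstractDemoBackend end  # plays the role of a backend with its own permutation kernel

# Each backend only specialises the permutation step.
demo_permute(A, perm, ::NativeDemo) = permutedims(A, perm)
demo_permute(A, perm, ::LazyPermuteDemo) = collect(PermutedDimsArray(A, perm))

# Shared lower-level routine: multiply, then permute the result.
# The backend is forwarded instead of hard-coding NativeDemo(), so any
# backend that defines demo_permute can reuse this function unchanged.
function demo_contract(A::AbstractMatrix, B::AbstractMatrix, pAB,
                       backend::AbstractDemoBackend)
    C = A * B
    return demo_permute(C, pAB, backend)
end

M = reshape(collect(1.0:6.0), 2, 3)
N = reshape(collect(1.0:12.0), 3, 4)
C_native = demo_contract(M, N, (2, 1), NativeDemo())
C_lazy = demo_contract(M, N, (2, 1), LazyPermuteDemo())
```

Both calls go through the same demo_contract method; only the permutation kernel differs between them.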

lkdvos

I like this change a lot, as this is definitely functionality that can be re-used in many other places.
The only comment on the contents I can come up with is whether the stridedtensorcontract! implementation, which basically checks the memory costs and then dispatches to blas_contract!, really is specific to Strided, but I guess the memory model depends on the strided way of doing permutations.

Otherwise, maybe we could consider moving the blas_contract! functions into a different file? In principle, the implementation that's now here is not even BLAS-specific. It really is just transpose-transpose-gemm-transpose, which should also work for any AbstractArray, and it maybe no longer really belongs in the strided file.
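For readers unfamiliar with the transpose-transpose-gemm-transpose pattern, here is a minimal sketch on plain arrays, using only Base functions (permutedims, reshape and matrix multiplication). The helper ttgt_contract and its argument names are hypothetical and are not the signature of blas_contract!; unlike the real implementation, it always copies, ignores α, β and conjugation, and allocates a fresh output.

```julia
# Hypothetical helper for illustration; not part of TensorOperations.jl.
# Contracts dimensions cindA of A with dimensions cindB of B, then permutes
# the remaining (open) dimensions of the result according to pAB.
function ttgt_contract(A::AbstractArray, oindA, cindA,
                       B::AbstractArray, cindB, oindB, pAB)
    # 1. transpose: bring A into (open, contracted) order and flatten to a matrix
    Ap = permutedims(A, (oindA..., cindA...))
    szoA = size(Ap)[1:length(oindA)]
    Amat = reshape(Ap, prod(szoA), :)
    # 2. transpose: bring B into (contracted, open) order and flatten to a matrix
    Bp = permutedims(B, (cindB..., oindB...))
    szoB = size(Bp)[(length(cindB) + 1):end]
    Bmat = reshape(Bp, :, prod(szoB))
    # 3. gemm: a single matrix-matrix product
    Cmat = Amat * Bmat
    # 4. transpose: restore the tensor shape and permute into the requested order
    C = reshape(Cmat, szoA..., szoB...)
    return permutedims(C, pAB)
end

# C[d, a] = sum over b, c of A[a, b, c] * B[c, b, d]
A = reshape(collect(1.0:24.0), 2, 3, 4)
B = reshape(collect(1.0:60.0), 4, 3, 5)
C = ttgt_contract(A, (1,), (2, 3), B, (2, 1), (3,), (2, 1))  # C has size (5, 2)
```

Nothing in this sketch depends on strided memory layouts, which is the point made above: only the permutation steps would need a backend-specific implementation.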

lkdvos · 2024-10-07T20:37:31Z

src/implementation/strided.jl

@@ -228,7 +231,7 @@ function blas_contract!(C, A, pA, B, pB, pAB, α, β, allocator)
        C_ = SV(tensoralloc_add(TC, C, ipAB, false, Val(true), allocator))
        _unsafe_blas_contract!(C_, A_, pA, B_, pB, trivialpermutation(ipAB),
                               one(TC), zero(TC))
-        stridedtensoradd!(C, C_, pAB, α, β, StridedNative(), allocator)
+        stridedtensoradd!(C, C_, pAB, α, β, backend, allocator)


Should this be stridedtensoradd! or just tensoradd!?

Right, if this is to work generally, it should be tensoradd!.

Jutho · 2024-10-07T20:47:11Z

Yes, I think making a separate blas_contract file is a good idea. I do think it is ok to keep using StridedView arguments within those lower level functions such as isblascontractable. That functionality probably even works for a CuStridedView, so it can probably even be used in conjunction with a pure CUDA implementation of tensoradd!.

Jutho · 2024-10-09T08:38:20Z

OK, the errors seem to be CUDA-related on x86 platforms, and only in the latest version (1.11). I guess we can ignore them and this is ready to be merged?

lkdvos · 2024-10-09T09:56:51Z

I think so, yes

update strided implementation

644abb6

Jutho requested a review from lkdvos October 7, 2024 14:42

lkdvos approved these changes Oct 7, 2024

Jutho added 3 commits October 8, 2024 00:32

move blascontract

81d0c5d

fix forgotten conj argument

27ef972

add lts to ci

a8a678a

lkdvos merged commit e5651a0 into master Oct 14, 2024
16 of 18 checks passed
