-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zpotrf_dtd fixed for the GPU, updated kernels to cublas_v2, although results still suspicious #94
Conversation
This is the continuation of PR #55 |
@@ -209,4 +209,85 @@ enum dplasma_matrix_type_e { | |||
extern char *dplasma_lapack_const_strings[]; | |||
#define dplasma_lapack_const(plasma_const) (dplasma_lapack_const_strings[plasma_const][0]) | |||
|
|||
#if defined(DPLASMA_HAVE_CUDA) | |||
#include <cublas.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cubla_v2.h
? In CUDA12, cubla.h
and cublas_v2.h
are incompatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it depends on the order of inclusion of the headers. In order not to break the rest of dplasma, it stays cublas.h here but cublas_v2 is included elsewhere and this should compile ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. We had issues using CUDA12: #92. We can open another PR for that to upgrade from cublas
to cublas_v2
.
CODE; \ | ||
} \ | ||
} while(0) | ||
|
||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe move all cublas-related to potrf_cublas_utils.h
? @abouteiller
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not potrf, but something more generic such as dplasma_cublas_utils.h. Moreover I don't understand what the purpose of the potrf_cublas_utils.h
file is, the types defined inside do not seem to be used anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know about all the types but dplasma_potrf_workspace_t
is used by zpotrf_wrapper, and my test when creating, destroying and acessing the workspace.
tests/testing_zpotrf_dtd.c
Outdated
status = cusolverDnDpotrf_bufferSize(cusolverDnHandle, cublas_uplo, nb, NULL, mb, &workspace_size); | ||
assert(CUSOLVER_STATUS_SUCCESS == status); | ||
|
||
cusolverDnDestroy(cusolverDnHandle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw PTG does the same thing. So this cusolverDnHandle
does not need to be the ONE that is used in the potrf_zpotrf
task_class, right? @abouteiller @therault
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically each stream needs a handle, but the handle creation is a function in userland. Unfortunately, we do not expose the streams to the users so these need to be done as needed via a function exported by the user code.
CODE; \ | ||
} \ | ||
} while(0) | ||
|
||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not potrf, but something more generic such as dplasma_cublas_utils.h. Moreover I don't understand what the purpose of the potrf_cublas_utils.h
file is, the types defined inside do not seem to be used anywhere.
cublasStatus_t status = cublasInit(); | ||
assert(CUBLAS_STATUS_SUCCESS == status); | ||
parsec_info_register(&parsec_per_stream_infos, "DPLASMA::CUDA::HANDLES", | ||
CuHI = parsec_info_register(&parsec_per_stream_infos, "DPLASMA::CUDA::HANDLES", | ||
destroy_cuda_handles, NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why one of these function is name-spaced with dplasma while the other one is not ? Why is the info called parsec_per_stream_infos
when it clearly belongs to dplasma. I know there issues are not related to this PR, but I just noticed them while reading through the code.
src/dplasmaaux.c
Outdated
@@ -116,6 +116,9 @@ dplasma_aux_getGEMMLookahead( parsec_tiled_matrix_t *A ) | |||
#include "potrf_cublas_utils.h" | |||
#include "parsec/utils/zone_malloc.h" | |||
|
|||
parsec_info_id_t CuHI = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
documentation.
6d0156a
to
18381bd
Compare
code refactoring, removed useless includes, changed redundancy in lapack to cublas converstions reverted previous refactoring, added gpu support for gemm_dtd added dtd sources and changed to cusolverDnXpotrf refactoring, cusolver workspaces are allocated per streams fixed typo and changed comment changed names fixed cmake when not DPLASMA_HAVE_CUDA fixed chore declaration a order and cmake
testing_zpotrf_dtd now works with GPU. However, the factorisation and results produced are suspicious