CUDA implementation of DBSCAN. Inspired by the paper https://www.sciencedirect.com/science/article/pii/S1877050913003438
## Source code
1. `gdbscan_paper.cu` contains the implementation most similar to the one proposed in the original paper.
2. `gdbscan.cu` contains my naive implementation, which improves on 1 by removing some sequential host-side loops from the BFS code.
3. `gdbscan_shifted.cu` is an attempt to change the memory access pattern in `compute_degrees` and `compute_adjacency_list`. It modifies 2.
4. `gdbscan_shared.cu` is an attempt to exploit shared memory in `compute_degrees` and `compute_adjacency_list`. It modifies 2.
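
As a rough CPU-side illustration of the stages named above (degree computation, adjacency list construction, BFS), here is a minimal pure-Python sketch. It is not the code in the `.cu` files: the function signatures, the `exclusive_scan`, `bfs_clusters` and `gdbscan` helpers, and the choice to treat a point as core when it has at least `min_pts` neighbours other than itself are assumptions made for illustration only.

```python
from collections import deque
from math import dist


def compute_degrees(points, eps):
    # For each point, count the other points within distance eps.
    # On the GPU this is the kind of work one thread per point would do.
    n = len(points)
    return [
        sum(1 for j in range(n) if j != i and dist(points[i], points[j]) <= eps)
        for i in range(n)
    ]


def exclusive_scan(degrees):
    # starts[i] is the offset of point i's neighbours in the flat adjacency list.
    starts, total = [], 0
    for d in degrees:
        starts.append(total)
        total += d
    return starts


def compute_adjacency_list(points, eps, starts, total):
    # Fill a single flat array with every point's neighbour indices.
    adj = [0] * total
    for i in range(len(points)):
        k = starts[i]
        for j in range(len(points)):
            if j != i and dist(points[i], points[j]) <= eps:
                adj[k] = j
                k += 1
    return adj


def bfs_clusters(degrees, starts, adj, min_pts):
    # Label clusters with a BFS from each unvisited core point; -1 means noise.
    labels = [-1] * len(degrees)
    cluster = 0
    for s in range(len(degrees)):
        if labels[s] != -1 or degrees[s] < min_pts:
            continue
        labels[s] = cluster
        queue = deque([s])
        while queue:
            u = queue.popleft()
            if degrees[u] < min_pts:
                continue  # border point: labelled, but not expanded
            for v in adj[starts[u]:starts[u] + degrees[u]]:
                if labels[v] == -1:
                    labels[v] = cluster
                    queue.append(v)
        cluster += 1
    return labels


def gdbscan(points, eps, min_pts):
    # Hypothetical driver that chains the four stages together.
    degrees = compute_degrees(points, eps)
    starts = exclusive_scan(degrees)
    adj = compute_adjacency_list(points, eps, starts, sum(degrees))
    return bfs_clusters(degrees, starts, adj, min_pts)
```

The flat adjacency array plus per-point offsets is what makes the neighbour lists easy to fill in parallel: once the offsets are known, each point writes only to its own slice.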

`G-DBSCAN.ipynb` is the code run on Google Colab to compare sklearn's DBSCAN against my implementations.

Profiling was conducted with 200,000 10-dimensional points generated by sklearn's `make_blobs`. The data is stored in `profiling/data.txt`.

- In `profiling/paper` there are execution times and per-kernel profiling results for the `gdbscan_paper.cu` source code.
- In `profiling/standard` there are execution times and per-kernel profiling results for the `gdbscan.cu` source code.
- In `profiling/shifted` there are execution times and per-kernel profiling results for the `gdbscan_shifted.cu` source code.
- In `profiling/shared` there are execution times and per-kernel profiling results for the `gdbscan_shared.cu` source code.

The clustering output files are also included, so that clustering quality can be verified against sklearn's implementation.
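
When comparing two sets of cluster labels, the cluster IDs themselves need not match, only the grouping of points. A minimal sketch of such a check follows, assuming both labelings have already been loaded into Python lists and that noise is marked with `-1` (sklearn's convention). The helper name `same_partition` is hypothetical, and the format of the output files is not assumed here.

```python
def same_partition(labels_a, labels_b, noise=-1):
    """Return True if both labelings describe the same clusters up to renaming."""
    if len(labels_a) != len(labels_b):
        return False
    a_to_b, b_to_a = {}, {}
    for a, b in zip(labels_a, labels_b):
        # Noise must match noise; it is not a cluster that can be renamed.
        if (a == noise) != (b == noise):
            return False
        if a == noise:
            continue
        # Enforce a one-to-one mapping between the two sets of cluster IDs.
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True
```

A strict check like this can report a mismatch even when both results are valid, because a border point within reach of two clusters may be assigned to either one depending on visiting order.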