Fix bfs iterator for multiple source nodes #382

Tortar · 2024-06-04T01:28:26Z

I noticed that the previous implementation of multi-source BFS was wrong because it didn't start with the first-level nodes (also my fault :( ). This version should be correct, and it is also faster than the one in #381: I see a 1.7x improvement over the previous version on an erdos_renyi(1000000, 0.00001) graph, starting from a random node.
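For readers following along, here is a minimal sketch of the behaviour described above. It is not the code merged in this PR. It assumes that "first-level nodes" means the source vertices themselves, which is what the updated test expectation in this PR suggests. The name `multi_source_bfs` and the plain adjacency-list representation are hypothetical and only serve the illustration.

```julia
# Level-by-level BFS from several sources over a plain adjacency list.
# `adj[v]` holds the neighbors of vertex `v` (hypothetical representation,
# not the Graphs.jl data structure).
function multi_source_bfs(adj::Vector{Vector{Int}}, sources::Vector{Int})
    visited = falses(length(adj))
    order = Int[]
    frontier = Int[]
    # Level 0: every source is emitted before any of its neighbors.
    for s in sources
        if !visited[s]
            visited[s] = true
            push!(order, s)
            push!(frontier, s)
        end
    end
    # Each pass expands one whole level and collects the next one.
    while !isempty(frontier)
        next_frontier = Int[]
        for u in frontier, v in adj[u]
            if !visited[v]
                visited[v] = true
                push!(order, v)
                push!(next_frontier, v)
            end
        end
        frontier = next_frontier
    end
    return order
end

# Path graph 1 - 2 - 3 - 4 - 5, started from both ends.
path = [[2], [1, 3], [2, 4], [3, 5], [4]]
multi_source_bfs(path, [1, 5])  # returns [1, 5, 2, 4, 3]
```

Both sources come first, then the vertices at distance 1, then distance 2, which mirrors the change from `[1, 2, 3, 6, ...]` to `[1, 6, 2, 3, ...]` in the test of this PR.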

There is still a problem, though: I think multi-source DFS suffers from a similar issue. Unfortunately, I don't have time to fix it at the moment.

codecov · 2024-06-04T01:36:36Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.31%. Comparing base (43f9f18) to head (5d1275c).
Report is 1 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #382      +/-   ##
==========================================
- Coverage   97.31%   97.31%   -0.01%     
==========================================
  Files         120      120              
  Lines        6954     6953       -1     
==========================================
- Hits         6767     6766       -1     
  Misses        187      187


Tortar · 2024-06-18T11:36:09Z

Gentle bump

Tortar · 2024-06-19T18:40:08Z

Hi @gdalle @simonschoelly, sorry for the ping, but I think this one would be good for a patch release, if you have some time to review it.

gdalle · 2024-06-19T19:05:13Z

test/iterators/bfs.jl

@@ -35,7 +35,7 @@
        end
    end
    nodes_visited = collect(BFSIterator(g2, [1, 6]))
-    @test nodes_visited == [1, 2, 3, 6, 5, 7, 4]
+    @test nodes_visited == [1, 6, 2, 3, 5, 7, 4]


If these tests are sensitive to the ordering at each level, which is an implementation detail, can you rework them to make them independent?

Tortar · Jun 19, 2024

I'm a bit unsure whether the ordering on each level can be considered an implementation detail. If it is, we could speed up the running time by 2x by sorting the nodes of each level (for cache-locality reasons, I presume). But I think that would actually be wrong, because you want to follow strictly how BFS works: look at all the neighbors of one node in the previous level, and only then move on to the next node of the previous level.

gdalle · Jun 21, 2024

If you consider two graph structures where the neighbors of each vertex make up the same (mathematical) set but are stored in different orders, the BFS algorithm will return different things for nodes_visited. And they will both be correct. So our tests should be agnostic to that.
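To make the point above concrete, here is a small self-contained sketch. It uses a hypothetical queue-based `bfs_order` helper on plain adjacency lists rather than the Graphs.jl iterator, and runs it on two lists that describe the same graph with neighbors stored in different orders.

```julia
# Queue-based BFS over an adjacency list; returns the visiting order.
# `bfs_order` is a hypothetical helper, not part of Graphs.jl.
function bfs_order(adj::Vector{Vector{Int}}, source::Int)
    visited = falses(length(adj))
    visited[source] = true
    queue = [source]   # doubles as the output: vertices in visiting order
    head = 1
    while head <= length(queue)
        u = queue[head]
        head += 1
        for v in adj[u]
            if !visited[v]
                visited[v] = true
                push!(queue, v)
            end
        end
    end
    return queue
end

# The 4-cycle 1 - 2 - 4 - 3 - 1: same neighbor sets, different storage order.
adj_a = [[2, 3], [1, 4], [1, 4], [2, 3]]
adj_b = [[3, 2], [4, 1], [4, 1], [3, 2]]

bfs_order(adj_a, 1)  # returns [1, 2, 3, 4]
bfs_order(adj_b, 1)  # returns [1, 3, 2, 4]
```

Both results are valid BFS orders of the same graph, so a test that pins one exact vector is really testing the storage order.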

Tortar · Jun 21, 2024

Yes, I think you are right that we should be agnostic to that. Just to clarify what I mean:

          0
        /   \
       1     2
      / \   / \
     3   4 5   6

Given a graph like this, if we start at node 0 it is okay to get e.g. 0, 1, 2, 3, 4, 5, 6 or 0, 2, 1, 5, 6, 3, 4, but not 0, 1, 2, 5, 6, 3, 4 or 0, 1, 2, 5, 3, 4, 6.

But I think this is what you are actually saying in your last comment, so we need tests that accept all valid orderings.

gdalle · Jun 21, 2024

Exactly. It's a bit of a pain, so I'm not making it strictly necessary for the PR to be merged, but essentially in your example we would want to check that the returned vector has the form [.|..|....], where the first subset is {0}, the second is {1, 2} and the third is {3, 4, 5, 6}, each in any order.
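One way such a check could look is sketched below. It is not necessarily what the adjusted tests in this PR do. `is_level_order` is a hypothetical helper, and the expected levels are written out by hand using the 0-based labels of the example tree.

```julia
# Return true if `order` lists the vertices level by level, accepting any
# permutation inside each level. `levels[k]` is the expected vertex set at
# depth k - 1. Hypothetical helper for illustration only.
function is_level_order(order::Vector{Int}, levels::Vector{Vector{Int}})
    length(order) == sum(length, levels; init=0) || return false
    start = 1
    for level in levels
        stop = start + length(level) - 1
        # Compare as sets so the order inside a level does not matter.
        Set(order[start:stop]) == Set(level) || return false
        start = stop + 1
    end
    return true
end

levels = [[0], [1, 2], [3, 4, 5, 6]]

is_level_order([0, 1, 2, 3, 4, 5, 6], levels)  # true
is_level_order([0, 2, 1, 5, 6, 3, 4], levels)  # true
is_level_order([0, 1, 2, 5, 6, 3, 4], levels)  # true under this looser definition
is_level_order([0, 1, 3, 2, 4, 5, 6], levels)  # false: 3 appears before 2
```

Note that the third ordering passes here even though the stricter, neighbor-by-neighbor reading of BFS would reject it. In a test suite the helper could wrap the iterator output, for example `@test is_level_order(nodes_visited, levels)`.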

Tortar · Jun 21, 2024

Your definition iterates slightly differently from what I had in mind... but that's totally okay. I just wanted to understand which one was preferable, and indeed parallelizing the algorithm effectively would require dropping mine. So let's go with yours. I think we can also get a 2x speed-up by sorting with that :-)
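As a rough illustration of the sorting idea, the sketch below sorts each level before emitting and expanding it. This is an assumption about what "sorting" refers to, not the merged implementation, and the 2x figure quoted above is not verified here. `sorted_level_bfs` and the adjacency-list input are hypothetical.

```julia
# Level-synchronous BFS that sorts every frontier before emitting it.
# Sorting only permutes vertices inside a level, so the result is still a
# valid order under the level-set definition discussed above.
function sorted_level_bfs(adj::Vector{Vector{Int}}, sources::Vector{Int})
    visited = falses(length(adj))
    frontier = sort(unique(sources))
    visited[frontier] .= true
    order = copy(frontier)
    while !isempty(frontier)
        next_frontier = Int[]
        for u in frontier, v in adj[u]
            if !visited[v]
                visited[v] = true
                push!(next_frontier, v)
            end
        end
        sort!(next_frontier)           # increasing vertex ids within the level
        append!(order, next_frontier)
        frontier = next_frontier
    end
    return order
end

# The example tree with 1-based labels and neighbors stored out of order.
tree = [[3, 2], [1, 5, 4], [1, 7, 6], [2], [2], [3], [3]]
sorted_level_bfs(tree, [1])  # returns [1, 2, 3, 4, 5, 6, 7]
```

Any speed-up would come from visiting adjacency data in increasing vertex order, which is the cache-locality guess made earlier in the thread, and it has to outweigh the cost of the sort itself.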

gdalle · Jun 21, 2024

I'm slightly biased by a recent bachelor project I supervised on... parallel BFS ;) Check out the repo of my interns, https://github.com/KassFlute/ParallelGraphs.jl, for a multithreaded and even BLAS-ified version of BFS that is much faster than the one here! Ping @KassFlute and @AntoineBut

gdalle · Jun 25, 2024

Gentle ping @Tortar, if you want to adjust the tests so that I can merge while it's still fresh in our minds.

Tortar · Jun 25, 2024

Should be okay now 👍

src/iterators/bfs.jl

Co-authored-by: Guillaume Dalle <[email protected]>

Tortar added 2 commits June 4, 2024 03:25

Fix bfs iterator for multiple source nodes

2224594

Update bfs.jl

813cbe2

Tortar mentioned this pull request Jun 4, 2024

Optimize a bit bfs and dfs iterators #381

Closed

Tortar added 8 commits June 4, 2024 03:40

fix another possible source of problems

2946a40

actually we can go faster

f16b695

Update bfs.jl

a627fdf

sort for faster bfs

a73a5ba

better not

bad4117

simpler

86254e1

Update bfs.jl

cf5203c

Update bfs.jl

49ec54d

Update doc

f067c3d

gdalle requested changes Jun 19, 2024


Tortar and others added 5 commits June 19, 2024 23:15

Update src/iterators/bfs.jl

04e4109

Co-authored-by: Guillaume Dalle <[email protected]>

Update bfs.jl

2566b17

sort is applicable

c020f26

Update bfs.jl

2b16385

adjust tests

5d1275c

gdalle approved these changes Jun 26, 2024


gdalle merged commit c5ea323 into JuliaGraphs:master Jun 26, 2024
9 checks passed
