-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add checkpointing to metagraph build #197
base: master
Are you sure you want to change the base?
Conversation
Could you add a test where it builds one graph, kills the process, so it fails to build to the end. So we check that there are no collisions here and you can correctly generate multiple graphs in parallel on the same machine. |
I can add a test that builds 2 graphs in parallel (using phases), but I
won't kill the process as that causes the test to be undeterministic.
…On Tue, 8 Sep 2020 at 19:31, Mikhail Karasikov ***@***.***> wrote:
Could you add a test where it builds one graph, kills the process, so it
fails to build to the end.
Then it does the same thing for a different graph. Then restarts building
both graphs in parallel, and checks that both graphs are correctly
generated.
So we check that there are no collisions here and you can correctly
generate multiple graphs in parallel on the same machine.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#197 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOYQLCNGGYLC7VF3SD5WZLSEZS75ANCNFSM4RAHV52Q>
.
|
Ok, I added a test that builds 2 graphs at the same time. First it builds
phase 1 for both graphs, then it builds phase2 for both and asserts that
the graphs are correctly built.
…On Tue, 8 Sep 2020 at 19:37, Daniel Danciu ***@***.***> wrote:
I can add a test that builds 2 graphs in parallel (using phases), but I
won't kill the process as that causes the test to be undeterministic.
On Tue, 8 Sep 2020 at 19:31, Mikhail Karasikov ***@***.***>
wrote:
> Could you add a test where it builds one graph, kills the process, so it
> fails to build to the end.
> Then it does the same thing for a different graph. Then restarts building
> both graphs in parallel, and checks that both graphs are correctly
> generated.
>
> So we check that there are no collisions here and you can correctly
> generate multiple graphs in parallel on the same machine.
>
> —
> You are receiving this because you authored the thread.
> Reply to this email directly, view it on GitHub
> <#197 (comment)>,
> or unsubscribe
> <https://github.com/notifications/unsubscribe-auth/AAOYQLCNGGYLC7VF3SD5WZLSEZS75ANCNFSM4RAHV52Q>
> .
>
|
metagraph/src/graph/representation/succinct/build_checkpoint.hpp
Outdated
Show resolved
Hide resolved
metagraph/src/graph/representation/succinct/build_checkpoint.hpp
Outdated
Show resolved
Hide resolved
metagraph/src/graph/representation/succinct/build_checkpoint.cpp
Outdated
Show resolved
Hide resolved
metagraph/src/graph/representation/succinct/build_checkpoint.cpp
Outdated
Show resolved
Hide resolved
if (std::filesystem::exists(checkpoint_file_)) { | ||
std::ifstream f(checkpoint_file_); | ||
f >> checkpoint_; | ||
if (checkpoint_ > 0) { | ||
f >> kmer_dir_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Write the phase too. Otherwise, how do you know which phase this checkpoint corresponds to?
If you assume that different phases have different checkpoints, like 1: (1,2,3), 2: (4,5,6), then make the interface not allow setting checkpoint = 0
for the second phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like you said, the phases correspond to different checkpoints, so writing the phase is not useful - it just tells the program up to which checkpoint it should go before stopping.
So if the users enters phase=1, we might make it up to checkpoint 2, and then resume the operation from checkpoint 2 and finalize the phase.
Setting checkpoint_=0 for phase_=2 is perfectly valid. One tells us how far we got with the computation, the other how far we wish to go. The other way around, setting checkpoint_=5 and phase_=1 could be invalid, but in our case phase_=1 corresponds to having all checkpoints done, so any combination of valid checkpoint (0..5) and phase values (1..2) is fine.
metagraph/src/graph/representation/succinct/boss_chunk_construct.cpp
Outdated
Show resolved
Hide resolved
metagraph/src/graph/representation/succinct/boss_chunk_construct.cpp
Outdated
Show resolved
Hide resolved
@@ -703,7 +819,7 @@ void recover_dummy_nodes(const KmerCollector &kmer_collector, | |||
return kmer::transform<KMER>(reinterpret_cast<const KMER_REAL &>(v), k + 1) + kmer_delta; | |||
} | |||
}, | |||
real_split_by_W, true | |||
real_split_by_W, false /* remove sources */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like a variable name commented out
real_split_by_W, false /* remove sources */ | |
real_split_by_W, false // remove sources |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the standard way of commenting parameters represented by constants. You can only use // here because the param happens to be at the end of the line.
metagraph/src/graph/representation/succinct/boss_chunk_construct.cpp
Outdated
Show resolved
Hide resolved
metagraph/src/graph/representation/succinct/boss_chunk_construct.cpp
Outdated
Show resolved
Hide resolved
It's better to have non-deterministic tests than no tests.
|
58db659
to
5d81032
Compare
As discussed via chat, there are multiple checkpoints after each major operation: generate kmers, generate reverse complements, splitting into chunks, generate dummy1, concatenate chunks, generate dummy2..k.
The client interface exposes only 2 phases:
I tested by inserting exit statements in the code after each phase and making sure the build is successful on resume.
I also added functional tests for building in phases.