Add checkpointing to metagraph build #197
Who is responsible for cleaning these files up? Does this mean the old temp files are never removed until all the k-mers are sorted, so disk usage grows a lot?
The only files affected by this change are the original collected chunks. These will indeed be deleted after checkpoint 5 instead of after checkpoint 2.
Why not delete them now? Because if the program crashes between `merge_queue_.shutdown()` (at which point the merge queue still has elements to merge, but no new elements will be added) and the queue being emptied, we lose all data: the old chunks have been deleted, but the new merge is not yet ready. The time span between the files being deleted and the merge queue being emptied is quite short, so the probability of this happening is low, but I cannot leave this flaw in the code with a clear conscience.
Since `recover_dummy_kmers` in `boss_chunk_construct` doesn't receive a reference to this object, but only to the `ChunkedWaitQueue` it creates, it cannot even manually call `clear()` on it when the merge is done.
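To illustrate the ordering argument, here is a minimal sketch of "delete the inputs only after the merged output is complete". It is not the code in this PR: `merge_then_cleanup` and the file names are hypothetical, and plain concatenation stands in for the real k-mer merge. The merged data is first written to a temporary file and renamed into place, and the original chunks are removed only after that rename succeeds, so a crash at any earlier point leaves the chunks intact.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (an assumption for illustration, not part of metagraph):
// concatenates `chunks` into `merged` and deletes the chunks only once the
// merged file has been fully written and renamed into place.
bool merge_then_cleanup(const std::vector<fs::path> &chunks,
                        const fs::path &merged) {
    fs::path tmp = merged;
    tmp += ".tmp";
    {
        std::ofstream out(tmp, std::ios::binary);
        if (!out)
            return false;
        char buf[4096];
        for (const auto &chunk : chunks) {
            std::ifstream in(chunk, std::ios::binary);
            if (!in)
                return false;  // inputs are untouched, safe to retry later
            // Copy in blocks; the last read may be partial.
            while (in.read(buf, sizeof(buf)) || in.gcount() > 0)
                out.write(buf, in.gcount());
        }
        out.flush();
        if (!out)
            return false;
    }
    std::error_code ec;
    fs::rename(tmp, merged, ec);  // the merge result becomes visible here
    if (ec)
        return false;
    // Only now is it safe to drop the original chunks.
    for (const auto &chunk : chunks)
        fs::remove(chunk, ec);
    return true;
}
```

If the process dies before the rename, a restart still finds every original chunk and can redo the merge, at the cost of keeping the chunks on disk a little longer.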