This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Re-write adding footer logic to prevent unnecessary file copying #330

Open
jmazanec15 opened this issue Feb 25, 2021 · 0 comments
Labels
Enhancements Improvement on existing component

Comments

@jmazanec15
Member

For Lucene to handle the ".hnsw" files correctly, we need to append a footer to them after the graphs are created. Currently, we copy the data from a temporary file containing the graph to a Lucene IndexOutput and then write the footer.

This copy may not be necessary if we add the footer to the file ourselves:

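// Manually write the footer: FOOTER_MAGIC (int), algorithm ID 0 (int), CRC-32 checksum (long)
try (DataOutputStream os = new DataOutputStream(
        Files.newOutputStream(Paths.get(indexPath), StandardOpenOption.APPEND))) {
    // writeInt emits 4 big-endian bytes; OutputStream.write(int) would emit only the low byte
    os.writeInt(FOOTER_MAGIC);
    os.writeInt(0);
    os.flush(); // make the two ints visible to the checksum pass below

    long value;
    try (ChecksumIndexInput in = state.directory.openChecksumInput(hnswFileName, state.context)) {
        // getChecksum() only covers bytes already read, so read through to the end first
        in.seek(in.length());
        value = in.getChecksum();
    }
    if ((value & 0xFFFFFFFF00000000L) != 0) {
        throw new IllegalStateException("Illegal CRC-32 checksum: " + value + " (resource=" + indexPath + ")");
    }
    os.writeLong(value);
}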
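For reference, the footer layout the snippet above targets can be sketched with the JDK alone. This is a minimal sketch, not the plugin's implementation: `FooterAppender` and `appendFooter` are hypothetical names, the magic value is assumed to equal Lucene's `CodecUtil.FOOTER_MAGIC`, and the checksum is assumed to be a plain CRC-32 over every byte that precedes it, including the magic and the algorithm ID.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

// Hypothetical helper; names are illustrative and not part of the plugin.
final class FooterAppender {
    // Assumption: same value as Lucene's CodecUtil.FOOTER_MAGIC (~CODEC_MAGIC).
    static final int FOOTER_MAGIC = ~0x3fd76c17;
    static final int FOOTER_LENGTH = 16; // int magic + int algorithm ID + long checksum

    /** Appends the footer to the file in place and returns the checksum written. */
    static long appendFooter(Path file) throws IOException {
        CRC32 crc = new CRC32();
        // Stream the existing graph bytes once to checksum them; nothing is copied.
        try (InputStream in = new CheckedInputStream(Files.newInputStream(file), crc)) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // CheckedInputStream updates the CRC as a side effect of reading.
            }
        }

        ByteBuffer footer = ByteBuffer.allocate(FOOTER_LENGTH); // big-endian by default
        footer.putInt(FOOTER_MAGIC);
        footer.putInt(0); // algorithm ID
        // The checksum also covers the magic and algorithm ID just written.
        crc.update(footer.array(), 0, 8);
        footer.putLong(crc.getValue());
        footer.flip();

        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            while (footer.hasRemaining()) {
                ch.write(footer);
            }
            ch.force(true); // push the footer to disk before reporting success
        }
        return crc.getValue();
    }
}
```

Calling `FooterAppender.appendFooter(Paths.get(indexPath))` grows the file by exactly 16 bytes. The graph data is still read once to compute the checksum, so the saving over the current approach is the write of the copy, not the read.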

This could save time during index and merge operations; how much would need to be measured through testing. We would also need to test that this approach is fault tolerant and does not corrupt the files.
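One way to test for corruption is to re-read a finished file and confirm that the footer is well formed and the stored checksum matches the contents. The sketch below uses only the JDK and makes the same assumptions about the footer layout (magic value equal to Lucene's `CodecUtil.FOOTER_MAGIC`, algorithm ID 0, big-endian CRC-32 over all preceding bytes). `FooterVerifier` and `hasValidFooter` are hypothetical names. Inside the plugin, Lucene's own `CodecUtil.checksumEntireFile` would likely be the more appropriate check.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

// Hypothetical test helper; names are illustrative and not part of the plugin.
final class FooterVerifier {
    // Assumption: same value as Lucene's CodecUtil.FOOTER_MAGIC (~CODEC_MAGIC).
    static final int FOOTER_MAGIC = ~0x3fd76c17;
    static final int FOOTER_LENGTH = 16; // int magic + int algorithm ID + long checksum

    /** Returns true only if the file ends with a footer whose checksum matches the contents. */
    static boolean hasValidFooter(Path file) throws IOException {
        // Reading the whole file is fine for a test; large graphs would be streamed instead.
        byte[] all = Files.readAllBytes(file);
        if (all.length < FOOTER_LENGTH) {
            return false; // truncated before a full footer was written
        }
        int footerStart = all.length - FOOTER_LENGTH;
        ByteBuffer footer = ByteBuffer.wrap(all, footerStart, FOOTER_LENGTH);
        if (footer.getInt() != FOOTER_MAGIC) {
            return false;
        }
        if (footer.getInt() != 0) {
            return false; // unknown algorithm ID
        }
        long stored = footer.getLong();

        // Recompute the CRC over the data plus the magic and algorithm ID.
        CRC32 crc = new CRC32();
        crc.update(all, 0, footerStart + 8);
        return stored == crc.getValue();
    }
}
```

A fault-tolerance test could interrupt or truncate the write at different points and assert that `hasValidFooter` returns false for every partial file, so that a half-written footer is never mistaken for a complete one.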
