Extended `balance_root` to handle root underflow case #621
base: main
Did some code refactoring.
… trying to convert overflow pages to a `PageType` variant; I have changed the places that use the `page_type` function to use `maybe_page_type` instead.
cd7c875 to 8a54812
I'm not sure why the test is failing only on Windows. I have fixed the same error on my Ubuntu machine. This is the only check that's failing.
core/storage/btree.rs
// In SQLite, we want to leave some space for future cell pointers
// Each cell pointer takes 2 bytes
const CELL_POINTER_SIZE: usize = 2;
const ESTIMATED_FUTURE_CELLS: usize = 4; // Leave room for a few more cells
Does this come from somewhere, e.g. the SQLite source code? If so, it would be good to leave a link in a comment.
I've removed this function to keep this PR focused on handling just the root. To check for root overflow, we only need to check whether the page contains any overflow cells. See SQLite here: https://github.com/sqlite/sqlite/blob/c2e400af042bdd7d21e159a41fcf34c05398044c/src/btree.c#L9059-L9060
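To illustrate the check described above, here is a minimal sketch. The `Page` struct and its `overflow_cells` field are hypothetical stand-ins for illustration, not necessarily the types used in this codebase; the idea is only that a page counts as overfull when at least one cell could not be placed on it.

```rust
/// Hypothetical, simplified stand-in for an in-memory b-tree page.
struct Page {
    /// Cells that did not fit on the page after an insert (assumed field name).
    overflow_cells: Vec<Vec<u8>>,
}

/// A page is considered overfull when it holds at least one overflow cell.
fn is_overflow(page: &Page) -> bool {
    !page.overflow_cells.is_empty()
}

fn main() {
    let mut root = Page { overflow_cells: Vec::new() };
    assert!(!is_overflow(&root));

    // Simulate an insert whose cell did not fit on the page.
    root.overflow_cells.push(vec![1, 2, 3]);
    assert!(is_overflow(&root));
}
```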
}
}

// Helper function to handle page 1's special case
This should at least describe what the special case is and why it occurs
I would rename the function to hint at what the special case is
This is not mentioned in SQLite. The overflow case is almost copied from the previous `balance_root` implementation, which handled only the overflow case.
`balance_deeper` in SQLite handles the root overflow case, but it works a bit differently: it allocates a new child page, makes it the right-child page, and then copies the root's contents into the child page. Then `balance_nonroot` handles the child page overflow by splitting the page.
You can find the `balance_deeper` docs and implementation here: https://github.com/sqlite/sqlite/blob/master/src/btree.c#L8940-L9004
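The `balance_deeper` steps described above can be sketched on a toy in-memory tree. Everything here (`Page`, `Pager`, `allocate`, and this `balance_deeper` function) is a hypothetical simplification for illustration; real pages are byte buffers with headers, and this is not the SQLite or limbo implementation.

```rust
/// Hypothetical in-memory page; real pages are byte buffers with headers.
#[derive(Debug, Clone, PartialEq)]
struct Page {
    is_leaf: bool,
    cells: Vec<u32>,
    /// Page number of the right-most child (interior pages only).
    right_child: Option<usize>,
}

/// Toy pager: a page's number is its index in the vector.
struct Pager {
    pages: Vec<Page>,
}

impl Pager {
    fn allocate(&mut self, page: Page) -> usize {
        self.pages.push(page);
        self.pages.len() - 1
    }
}

/// Move the overfull root's contents into a newly allocated child and leave
/// the root as an empty interior page pointing at that child. Splitting the
/// (still overfull) child is left to a separate non-root balancing step.
fn balance_deeper(pager: &mut Pager, root: usize) -> usize {
    let child = pager.pages[root].clone();
    let child_no = pager.allocate(child);
    let root_page = &mut pager.pages[root];
    root_page.is_leaf = false;
    root_page.cells.clear();
    root_page.right_child = Some(child_no);
    child_no
}

fn main() {
    let root = Page { is_leaf: true, cells: vec![1, 2, 3, 4, 5], right_child: None };
    let mut pager = Pager { pages: vec![root] };
    let child = balance_deeper(&mut pager, 0);
    assert!(pager.pages[0].cells.is_empty());
    assert_eq!(pager.pages[0].right_child, Some(child));
    assert_eq!(pager.pages[child].cells, vec![1, 2, 3, 4, 5]);
}
```

Note that the root keeps its page number, which matters because other structures refer to the tree by its root page.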
Why do we handle root underflow? Balance root tries to map 1:1 to the `balance_deeper` implementation, which is overflow balancing of the root. Underflow should be managed generically for all cases. Furthermore, can a root page even underflow? Isn't it possible to create a btree page that, once you do the first insertion, will see itself as underfull?
@pereman2 I've looked at both the references below and thought it is better to handle the root underflow case in the … I think the comment from [1] describes the root underflow case.
[1] https://github.com/sqlite/sqlite/blob/master/src/btree.c#L8867-L8881
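For context, one way to picture a root underflow is the following sketch. It assumes that "root underflow" means a root with no cells left and a single right child, in which case the child's contents are copied up into the root so the tree becomes one level shallower. The `Page` type and `collapse_empty_root` function are hypothetical and only illustrate that assumption.

```rust
/// Hypothetical in-memory page, as a stand-in for a real page buffer.
#[derive(Debug, Clone, PartialEq)]
struct Page {
    is_leaf: bool,
    cells: Vec<u32>,
    /// Page number (index into the slice) of the right-most child, if any.
    right_child: Option<usize>,
}

/// If the root has no cells but still has a right child, copy that child's
/// contents into the root. Returns true when the tree was made shallower.
/// In a real pager the child page would then be released for reuse.
fn collapse_empty_root(pages: &mut [Page], root: usize) -> bool {
    let child_no = match (pages[root].cells.is_empty(), pages[root].right_child) {
        (true, Some(n)) => n,
        _ => return false,
    };
    let child = pages[child_no].clone();
    pages[root] = child;
    true
}

fn main() {
    let mut pages = vec![
        Page { is_leaf: false, cells: vec![], right_child: Some(1) },
        Page { is_leaf: true, cells: vec![7, 8], right_child: None },
    ];
    assert!(collapse_empty_root(&mut pages, 0));
    assert!(pages[0].is_leaf);
    assert_eq!(pages[0].cells, vec![7, 8]);
}
```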
core/storage/btree.rs
self.usable_space(),
);
let buf = child_contents.as_ptr();
cells.push(buf[start..start + len].to_vec());
Can't we `insert_into_cell` here instead of cloning vecs?
core/storage/btree.rs
let new_root_buf = new_root_page_contents.as_ptr();
new_root_buf[0..DATABASE_HEADER_SIZE]
.copy_from_slice(&current_root_buf[0..DATABASE_HEADER_SIZE]);
if self.is_underflow(contents) {
Maybe do:
if self.is_overflow(contents) {
    ...
} else if self.is_underflow(contents) {
    ...
} else {
    unreachable!("balance_root was called where we didn't have any overflow or underflow")
}
core/storage/btree.rs
let child_page = self
.pager
.read_page(child_page_id as usize)
.expect("This shouldn't have happened, child page not found");
Better to propagate errors:
- let child_page = self
-     .pager
-     .read_page(child_page_id as usize)
-     .expect("This shouldn't have happened, child page not found");
+ let child_page = self
+     .pager
+     .read_page(child_page_id as usize)?;
And I think we are not handling the case where the child page is not loaded/locked?
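As a self-contained illustration of the suggestion above, the sketch below shows how `?` returns a failed page read to the caller instead of panicking. The `Pager`, `PagerError`, and `first_byte_of_child` names are hypothetical and do not reflect the real pager API in this codebase.

```rust
use std::collections::HashMap;

/// Hypothetical error type for a failed page read.
#[derive(Debug, PartialEq)]
enum PagerError {
    PageNotFound(usize),
}

/// Toy pager that maps page ids to raw page bytes.
struct Pager {
    pages: HashMap<usize, Vec<u8>>,
}

impl Pager {
    fn read_page(&self, id: usize) -> Result<&[u8], PagerError> {
        self.pages
            .get(&id)
            .map(|page| page.as_slice())
            .ok_or(PagerError::PageNotFound(id))
    }
}

/// `?` hands the error to the caller; `.expect(...)` would abort the process.
fn first_byte_of_child(pager: &Pager, child_page_id: u32) -> Result<u8, PagerError> {
    let child_page = pager.read_page(child_page_id as usize)?;
    Ok(child_page.first().copied().unwrap_or(0))
}

fn main() {
    let mut pages = HashMap::new();
    pages.insert(2, vec![13u8, 0]);
    let pager = Pager { pages };

    assert_eq!(first_byte_of_child(&pager, 2), Ok(13));
    assert_eq!(first_byte_of_child(&pager, 9), Err(PagerError::PageNotFound(9)));
}
```

The caller can then decide whether a missing page is fatal, retryable, or a sign that the page is not yet loaded.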
core/storage/btree.rs
return;
}

let grandchild_ptr = child_contents.rightmost_pointer();
Rename `grandchild_ptr` -> `child_rightmost_pointer`? There could be a lot of grandchildren, so this feels more exact.
Moreover, if we are balancing the root because it doesn't have any more cells, why do we expect the child not to be a leaf page? I mean, if you take the rightmost pointer, it means that the child is an interior page and, therefore, the tree is not balanced.
core/storage/btree.rs
@@ -1285,8 +1400,7 @@ impl BTreeCursor {

// setup overflow page
let contents = page.get().contents.as_mut().unwrap();
let buf = contents.as_ptr();
buf.fill(0);
contents.write_u32(0, 0); // Initialize next overflow page pointer to 0
Why 0? Can we use one of the constants? This 0 means `PAGE_HEADER_OFFSET_PAGE_TYPE`; are you sure this is what this 0 write means?
Yes, what I did is wrong. Will fix it.
@pereman2 can you please tell me what's happening here?: https://github.com/sqlite/sqlite/blob/c2e400af042bdd7d21e159a41fcf34c05398044c/src/btree.c#L7166-L7172
This is my understanding:
// Writes the page number of the new overflow page (pgnoOvfl) to pPrior. For the first overflow page, pPrior points to the end of the local cell content.
put4byte(pPrior, pgnoOvfl);
// Ignore this
releasePage(pToRelease);
pToRelease = pOvfl;
// Points pPrior to the start of the new overflow page's data area
pPrior = pOvfl->aData;
// Writes 0 at the start of the overflow page, which shows it's the last overflow page
put4byte(pPrior, 0);
// Sets pPayload to point past the 4-byte pointer. This is where the actual data will be stored in the overflow page
pPayload = &pOvfl->aData[4];
// Calculates the remaining space for data in this overflow page: usable space minus the 4 bytes used for the next-page pointer
spaceLeft = pBt->usableSize - 4;
Where are we doing the `put4byte(pPrior, 0);` equivalent in limbo?
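To make the layout discussed above concrete, here is a minimal sketch of building an overflow chain in memory. It assumes the SQLite overflow page layout (a 4-byte big-endian next-page number at offset 0, zero on the last page, followed by payload bytes). The function `build_overflow_chain` and its parameters are hypothetical and are not limbo's API.

```rust
/// Split `payload` across overflow pages of `usable_size` bytes each.
/// `page_numbers[i]` is the (caller-supplied, hypothetical) page number of
/// the i-th overflow page in the chain.
fn build_overflow_chain(payload: &[u8], usable_size: usize, page_numbers: &[u32]) -> Vec<Vec<u8>> {
    assert!(usable_size > 4, "page must have room after the 4-byte pointer");
    // Space left for data once the next-page pointer is accounted for.
    let space_left = usable_size - 4;
    let chunks: Vec<&[u8]> = payload.chunks(space_left).collect();
    assert!(chunks.len() <= page_numbers.len(), "not enough page numbers");

    chunks
        .iter()
        .enumerate()
        .map(|(i, chunk)| {
            let mut page = vec![0u8; usable_size];
            // Pointer to the next overflow page, or 0 when this is the last one.
            let next = if i + 1 < chunks.len() { page_numbers[i + 1] } else { 0 };
            page[0..4].copy_from_slice(&next.to_be_bytes());
            // Payload starts right after the 4-byte pointer.
            page[4..4 + chunk.len()].copy_from_slice(chunk);
            page
        })
        .collect()
}

fn main() {
    let payload: Vec<u8> = (0u8..10).collect();
    let pages = build_overflow_chain(&payload, 8, &[5, 6, 7]);
    assert_eq!(pages.len(), 3);
    assert_eq!(pages[0][0..4], [0, 0, 0, 6]);
    assert_eq!(pages[2][0..4], [0, 0, 0, 0]);
}
```

Writing only the 4-byte pointer, rather than zero-filling the whole page, is enough to mark the end of the chain as long as readers never look past the payload length.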
1. Added an `unreachable!()` to the if-else under/overflow checker.
2. Inserted directly into cells using `insert_into_cell()` instead of cloning vecs into an external vector.
3. Propagated errors using `?` instead of `expect` and handled the case where the child page is not loaded/locked.
4. Restored the `allocate_overflow_page()` function to its previous state.
@pereman2 @jussisaurio I've added the following changes, can you please take a look?
Balance root now handles root underflow case:
balance_leaf