Skip to content

Commit

Permalink
Merge #131
Browse files Browse the repository at this point in the history
131: Append perf bugfix r=Kerollmops a=saik0

Fixes #130 

## Cause

Where _N_ is the length of the iterator: _N_ calls to `RoaringBitmap::push` required N calls to`RoaringBitmap::max`, which is not cheap.

## Solution

Compare only the first item in the iterator to the bitmap max, then compare each element to its predecessor.

## Future work

This could be strictly faster than inserting, given that it's always inserting into the last container (or pushing a new container)

## Before
![Screen Shot 2022-01-09 at 9 35 20 PM](https://user-images.githubusercontent.com/997050/148730686-152f7232-4ef0-43da-b37f-7bef6d1502da.png)
![Screen Shot 2022-01-09 at 9 35 36 PM](https://user-images.githubusercontent.com/997050/148730691-a15045a7-0abc-47c3-8208-b95aa406a681.png)


## After

![Screen Shot 2022-01-09 at 11 13 35 PM](https://user-images.githubusercontent.com/997050/148729666-55c48265-d38a-4865-a92d-ceb83efdb500.png)
![Screen Shot 2022-01-09 at 11 13 43 PM](https://user-images.githubusercontent.com/997050/148729673-41dfc7e1-62ca-4cc4-8c7c-8cd3b59adc4c.png)



Co-authored-by: saik0 <[email protected]>
  • Loading branch information
bors[bot] and saik0 authored Jan 10, 2022
2 parents 9a1b540 + ef37842 commit 5bd5fc5
Show file tree
Hide file tree
Showing 2 changed files with 52 additions and 4 deletions.
30 changes: 26 additions & 4 deletions src/bitmap/iter.rs
Original file line number Diff line number Diff line change
Expand Up @@ -184,13 +184,35 @@ impl RoaringBitmap {
&mut self,
iterator: I,
) -> Result<u64, NonSortedIntegers> {
let mut count = 0;
// Name shadowed to prevent accidentally referencing the param
let mut iterator = iterator.into_iter();

let mut prev: u32 = match iterator.next() {
None => return Ok(0),
Some(first) => {
if let Some(max) = self.max() {
if first <= max {
return Err(NonSortedIntegers { valid_until: 0 });
}
}

first
}
};

self.insert(prev);
let mut count = 1;

// It is now guaranteed that so long as the values are iterator are monotonically
// increasing they must also be the greatest in the set.

for value in iterator {
if self.push(value) {
count += 1;
} else {
if value <= prev {
return Err(NonSortedIntegers { valid_until: count });
} else {
self.insert(value);
prev = value;
count += 1;
}
}

Expand Down
26 changes: 26 additions & 0 deletions tests/push.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,32 @@ fn append() {
test_from_sorted_iter!(vec![1, 2, 4, 5, 7, 8, 9], RoaringBitmap);
}

#[test]
fn append_empty() {
assert_eq!(RoaringBitmap::new().append(vec![]), Ok(0u64))
}

#[test]
fn append_error() {
match [100u32].iter().cloned().collect::<RoaringBitmap>().append(vec![10, 20, 0]) {
Ok(_) => {
panic!("The 0th element in the iterator was < the max of the bitmap")
}
Err(non_sorted_error) => {
assert_eq!(non_sorted_error.valid_until(), 0)
}
}

match [100u32].iter().cloned().collect::<RoaringBitmap>().append(vec![200, 201, 201]) {
Ok(_) => {
panic!("The 3rd element in the iterator was < 2nd")
}
Err(non_sorted_error) => {
assert_eq!(non_sorted_error.valid_until(), 2)
}
}
}

#[test]
fn append_tree() {
test_from_sorted_iter!((0..1_000_000).map(|x| 13 * x).collect::<Vec<u64>>(), RoaringTreemap);
Expand Down

0 comments on commit 5bd5fc5

Please sign in to comment.