Top-N: Rework to use heap of sort keys (duckdb#14424)
This PR reworks the Top-N implementation to use a heap of sort keys.
Previously, we leaned on our sort implementation and approximated a heap
by repeatedly re-sorting and discarding entries, in combination with
some early filtering. See duckdb#2172.

The main reason we implemented it this way is that we had to implement
the Top-N operator for many types, including nested types, and it was
easier to lean on the existing sort implementation - which was also an
improvement over the `Value`-based implementation we had previously. Now
that we have sort keys, it is much easier to implement the Top-N
algorithm using an actual heap. This PR does exactly that: it implements
Top-N as a heap of sort keys, using the heap functions from the C++
standard library (`std::push_heap` and `std::pop_heap` over a vector).
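
As a rough illustration of the approach described above, the following is a minimal sketch of a bounded heap over byte-comparable keys. It is not the code from this PR: `EncodeSortKey`, `Entry` and `TopNHeap` are hypothetical names, and the key encoding (big-endian bytes, inverted for `DESC`) is a simplified assumption that only handles a single `uint32_t` column without NULLs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sort-key encoding (an assumption, not DuckDB's actual format):
// big-endian bytes make byte-wise string comparison match numeric order.
// For DESC every byte is inverted, which reverses the comparison.
std::string EncodeSortKey(uint32_t value, bool descending) {
    std::string key(4, '\0');
    for (int i = 0; i < 4; i++) {
        unsigned char byte = static_cast<unsigned char>(value >> (8 * (3 - i)));
        key[i] = static_cast<char>(descending ? static_cast<unsigned char>(~byte) : byte);
    }
    return key;
}

// One heap entry: the sort key plus a (hypothetical) reference to the payload row.
struct Entry {
    std::string sort_key;
    std::size_t row_index;
};

// Keeps the `limit` entries with the smallest sort keys in a max-heap,
// so the largest retained key is always at the front of the vector.
class TopNHeap {
public:
    explicit TopNHeap(std::size_t limit) : limit_(limit) {
        assert(limit > 0);
    }

    void Insert(std::string sort_key, std::size_t row_index) {
        if (heap_.size() < limit_) {
            // Heap not yet full: always accept the entry.
            heap_.push_back({std::move(sort_key), row_index});
            std::push_heap(heap_.begin(), heap_.end(), Less);
            return;
        }
        // Heap is full: entries that do not beat the current maximum are skipped.
        if (sort_key >= heap_.front().sort_key) {
            return;
        }
        // Move the current maximum to the back and overwrite it with the new entry.
        std::pop_heap(heap_.begin(), heap_.end(), Less);
        heap_.back() = {std::move(sort_key), row_index};
        std::push_heap(heap_.begin(), heap_.end(), Less);
    }

    // Returns the retained entries ordered by sort key (ascending).
    std::vector<Entry> Sorted() const {
        std::vector<Entry> result = heap_;
        std::sort_heap(result.begin(), result.end(), Less);
        return result;
    }

private:
    static bool Less(const Entry &a, const Entry &b) {
        return a.sort_key < b.sort_key;
    }

    std::size_t limit_;
    std::vector<Entry> heap_;
};
```

Because ordering is folded into the key bytes, the heap itself only ever compares strings; the same `TopNHeap` serves `ASC` and `DESC` depending on how the key was encoded.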

This allows some clean-up of code as we can remove specialized code
(`VectorOperations::DistinctLessThanNullsFirst`/`VectorOperations::DistinctGreaterThanNullsFirst`).
In addition, we improve performance in many cases. In particular, sort
keys allow us to also easily keep track of a "global boundary value"
across all threads - that allows us to do much more skipping in the
adversarial case where data is reverse-sorted on the order key. This
makes performance much more stable.
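
To make the "global boundary value" idea more concrete, here is a hedged sketch of how such a shared boundary could be maintained. The class `GlobalBoundary` and its methods are hypothetical and not taken from this PR; the sketch assumes byte-comparable sort keys stored as `std::string` and uses a plain mutex for simplicity.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical shared state: the smallest "largest retained key" reported by any
// thread whose local heap is already full. A row whose key is not below this
// boundary cannot improve the global Top-N result (ties are broken arbitrarily).
class GlobalBoundary {
public:
    // Returns true if a row with this sort key can be skipped without
    // constructing a heap entry for it.
    bool CanSkip(const std::string &sort_key) {
        std::lock_guard<std::mutex> guard(lock_);
        return has_boundary_ && sort_key >= boundary_;
    }

    // Called by a thread whose local heap is full; `local_max` is the largest
    // key currently retained in that heap. The boundary only ever gets tighter.
    void Tighten(const std::string &local_max) {
        std::lock_guard<std::mutex> guard(lock_);
        if (!has_boundary_ || local_max < boundary_) {
            boundary_ = local_max;
            has_boundary_ = true;
        }
    }

private:
    std::mutex lock_;
    bool has_boundary_ = false;
    std::string boundary_;
};
```

In this sketch a thread would call `CanSkip` before inserting into its local heap and `Tighten` after its heap fills up or its maximum changes, so that a thread scanning a favourable part of the data lets every other thread discard rows early.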

Below are some performance numbers running on TPC-H SF10:


```sql
-- natural sort order, small limit, large payload
SELECT * FROM lineitem ORDER BY l_orderkey LIMIT 5;
-- old: 0.18s, new: 0.22s

-- inverse natural sort order, small limit, large payload
SELECT * FROM lineitem ORDER BY l_orderkey DESC LIMIT 5;
-- old: 0.76s, new: 0.24s

-- inverse natural sort order, large limit, large payload
SELECT * FROM lineitem ORDER BY l_orderkey DESC LIMIT 10000;
-- old: 1.59s, new: 0.34s

-- natural sort order, small limit, small payload
SELECT l_orderkey FROM lineitem ORDER BY l_orderkey LIMIT 5;
-- old: 0.03s, new: 0.06s

-- inverse natural sort order, small limit, small payload
SELECT l_orderkey FROM lineitem ORDER BY l_orderkey DESC LIMIT 5;
-- old: 0.16s, new: 0.07s

-- inverse natural sort order, large limit, small payload
SELECT l_orderkey FROM lineitem ORDER BY l_orderkey DESC LIMIT 10000;
-- old: 0.32s, new: 0.14s
```

In general, we can see that performance is much more stable and greatly
improved in several cases. There are a number of small regressions - in
particular when sorting on individual integer keys in natural sort order
the old algorithm is sometimes better. That is mostly because in these
cases we can filter out values immediately. In the old implementation we
would figure this out directly with the sort values, whereas in the new
implementation we still spend time constructing the sort keys. We could
remedy that by adding templated heaps for primitive types in the future.
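
The templated heap mentioned above could look roughly like the sketch below. This is only an illustration of the idea, not something this PR contains: `PrimitiveTopN` is a hypothetical name, and the sketch ignores NULL handling and payload columns. The point is that values are compared directly, so no sort key has to be constructed before a row is rejected.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical Top-N heap specialised on a primitive type T.
// With std::less it keeps the N smallest values (ORDER BY ... ASC);
// with std::greater it keeps the N largest values (ORDER BY ... DESC).
template <class T, class Compare = std::less<T>>
class PrimitiveTopN {
public:
    explicit PrimitiveTopN(std::size_t limit, Compare cmp = Compare()) : limit_(limit), cmp_(cmp) {
        assert(limit > 0);
    }

    void Insert(const T &value) {
        if (heap_.size() < limit_) {
            heap_.push_back(value);
            std::push_heap(heap_.begin(), heap_.end(), cmp_);
            return;
        }
        // The front of the heap is the worst retained value; reject anything
        // that does not order strictly before it.
        if (!cmp_(value, heap_.front())) {
            return;
        }
        std::pop_heap(heap_.begin(), heap_.end(), cmp_);
        heap_.back() = value;
        std::push_heap(heap_.begin(), heap_.end(), cmp_);
    }

    // Returns the retained values in the order defined by the comparator.
    std::vector<T> Sorted() const {
        std::vector<T> result = heap_;
        std::sort_heap(result.begin(), result.end(), cmp_);
        return result;
    }

private:
    std::size_t limit_;
    Compare cmp_;
    std::vector<T> heap_;
};
```

For example, `PrimitiveTopN<int64_t>` would cover an ascending integer key, while `PrimitiveTopN<int64_t, std::greater<int64_t>>` would cover the descending case.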
Mytherin authored Oct 18, 2024
2 parents 4d19100 + dca9938 commit 1979504
Showing 9 changed files with 198 additions and 344 deletions.
38 changes: 0 additions & 38 deletions src/common/vector_operations/is_distinct_from.cpp
@@ -502,16 +502,6 @@ idx_t PositionComparator::Final<duckdb::DistinctLessThan>(Vector &left, Vector &
return VectorOperations::DistinctGreaterThan(right, left, &sel, count, true_sel, false_sel, null_mask);
}

template <>
idx_t PositionComparator::Final<duckdb::DistinctLessThanNullsFirst>(Vector &left, Vector &right,
const SelectionVector &sel, idx_t count,
optional_ptr<SelectionVector> true_sel,
optional_ptr<SelectionVector> false_sel,
optional_ptr<ValidityMask> null_mask) {
// DistinctGreaterThan has NULLs last
return VectorOperations::DistinctGreaterThan(right, left, &sel, count, true_sel, false_sel, null_mask);
}

template <>
idx_t PositionComparator::Final<duckdb::DistinctGreaterThan>(Vector &left, Vector &right, const SelectionVector &sel,
idx_t count, optional_ptr<SelectionVector> true_sel,
@@ -520,16 +510,6 @@ idx_t PositionComparator::Final<duckdb::DistinctGreaterThan>(Vector &left, Vecto
return VectorOperations::DistinctGreaterThan(left, right, &sel, count, true_sel, false_sel, null_mask);
}

template <>
idx_t PositionComparator::Final<duckdb::DistinctGreaterThanNullsFirst>(Vector &left, Vector &right,
const SelectionVector &sel, idx_t count,
optional_ptr<SelectionVector> true_sel,
optional_ptr<SelectionVector> false_sel,
optional_ptr<ValidityMask> null_mask) {
// DistinctLessThan has NULLs last
return VectorOperations::DistinctLessThan(right, left, &sel, count, true_sel, false_sel, null_mask);
}

using StructEntries = vector<unique_ptr<Vector>>;

static void ExtractNestedSelection(const SelectionVector &slice_sel, const idx_t count, const SelectionVector &sel,
@@ -1198,15 +1178,6 @@ idx_t VectorOperations::DistinctGreaterThan(Vector &left, Vector &right, optiona
null_mask);
}

// true := A > B with nulls being minimal
idx_t VectorOperations::DistinctGreaterThanNullsFirst(Vector &left, Vector &right,
optional_ptr<const SelectionVector> sel, idx_t count,
optional_ptr<SelectionVector> true_sel,
optional_ptr<SelectionVector> false_sel,
optional_ptr<ValidityMask> null_mask) {
return TemplatedDistinctSelectOperation<duckdb::DistinctGreaterThanNullsFirst>(left, right, sel, count, true_sel,
false_sel, null_mask);
}
// true := A >= B with nulls being maximal
idx_t VectorOperations::DistinctGreaterThanEquals(Vector &left, Vector &right, optional_ptr<const SelectionVector> sel,
idx_t count, optional_ptr<SelectionVector> true_sel,
@@ -1224,15 +1195,6 @@ idx_t VectorOperations::DistinctLessThan(Vector &left, Vector &right, optional_p
null_mask);
}

// true := A < B with nulls being minimal
idx_t VectorOperations::DistinctLessThanNullsFirst(Vector &left, Vector &right, optional_ptr<const SelectionVector> sel,
idx_t count, optional_ptr<SelectionVector> true_sel,
optional_ptr<SelectionVector> false_sel,
optional_ptr<ValidityMask> null_mask) {
return TemplatedDistinctSelectOperation<duckdb::DistinctGreaterThanNullsFirst>(right, left, sel, count, true_sel,
false_sel, nullptr);
}

// true := A <= B with nulls being maximal
idx_t VectorOperations::DistinctLessThanEquals(Vector &left, Vector &right, optional_ptr<const SelectionVector> sel,
idx_t count, optional_ptr<SelectionVector> true_sel,