feat: minimalistic storage layer setup #11
base: main
Conversation
Signed-off-by: Yuchen Liang <[email protected]>
Demo

Imagine the following schema and query:

CREATE TABLE t1(v1 INTEGER, v2 TEXT);
CREATE TABLE t2(v1 INTEGER, v2 TEXT);

SELECT * from t1 inner join t2 on t1.v1 = t2.v1 where t1.v2 = 'foo';

The corresponding logical plan is:

- LogicalFilter (on: t1.v2 = 'foo')
  - child: LogicalJoin (inner, on: t1.v1 = t2.v1)
    - left: LogicalScan (t1)
    - right: LogicalScan (t2)

Besides adding this plan in initially, we also apply the join commutativity rule and register the equivalent join with its left and right children swapped.

$ cargo run --bin storage_demo_with_trait

After running the demo, you will have:

$ sqlite3 test_memo.db
sqlite> select l.id, l.group_id, l.created_at, desc.name from logical_exprs as l, logical_op_kinds as desc where l.logical_op_kind_id = desc.id;
┌────┬──────────┬─────────────────────┬───────────────┐
│ id │ group_id │ created_at │ name │
├────┼──────────┼─────────────────────┼───────────────┤
│ 1 │ 1 │ 2025-01-24 05:52:45 │ LogicalScan │
│ 2 │ 2 │ 2025-01-24 05:52:45 │ LogicalScan │
│ 3 │ 3 │ 2025-01-24 05:52:45 │ LogicalJoin │
│ 4 │ 3 │ 2025-01-24 05:52:45 │ LogicalJoin │
│ 5 │ 4 │ 2025-01-24 05:52:45 │ LogicalFilter │
└────┴──────────┴─────────────────────┴───────────────┘
sqlite> select * from logical_joins;
┌─────────────────┬───────────┬──────┬───────┬───────────────┐
│ logical_expr_id │ join_type │ left │ right │ join_cond │
├─────────────────┼───────────┼──────┼───────┼───────────────┤
│ 3 │ 0 │ 1 │ 2 │ t1.v1 = t2.v1 │
│ 4 │ 0 │ 2 │ 1 │ t1.v1 = t2.v1 │
└─────────────────┴───────────┴──────┴───────┴───────────────┘
sqlite> select * from logical_scans;
┌─────────────────┬────────────┐
│ logical_expr_id │ table_name │
├─────────────────┼────────────┤
│ 1 │ t1 │
│ 2 │ t2 │
└─────────────────┴────────────┘
sqlite> select * from logical_filters;
┌─────────────────┬───────┬───────────────┐
│ logical_expr_id │ child │ predicate │
├─────────────────┼───────┼───────────────┤
│ 5 │ 3 │ t1.v2 = 'foo' │
└─────────────────┴───────┴───────────────┘

It is also fine if you run the demo multiple times. The created_at timestamps won't change.
Generally LGTM with the approach -- I could see some minor problems, and we can figure them out later.
fn try_from(value: i32) -> Result<Self, Self::Error> {
    use JoinType::*;
    match value {
consider using a 3rd party crate to reduce boilerplate code
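For context, the hand-rolled conversion being reviewed typically looks like the sketch below, and crates such as num_enum (with its TryFromPrimitive derive) or strum can generate the same impl from an attribute. The variants and discriminant values here are illustrative, not necessarily the PR's exact JoinType.

```rust
// Sketch of the boilerplate the reviewer wants a crate to generate.
// JoinType's variants and discriminants are assumptions for illustration.
#[derive(Debug, PartialEq)]
enum JoinType {
    Inner,
    Left,
    Right,
}

impl TryFrom<i32> for JoinType {
    type Error = String;

    fn try_from(value: i32) -> Result<Self, Self::Error> {
        use JoinType::*;
        match value {
            0 => Ok(Inner),
            1 => Ok(Left),
            2 => Ok(Right),
            other => Err(format!("unknown join type discriminant: {other}")),
        }
    }
}

fn main() {
    assert_eq!(JoinType::try_from(0), Ok(JoinType::Inner));
    assert!(JoinType::try_from(99).is_err());
}
```

With num_enum, the whole impl collapses to a `#[derive(TryFromPrimitive)]` plus `#[repr(i32)]` on the enum, which is the boilerplate reduction being suggested.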
let mut exprs = Vec::with_capacity(records.len());

for (record, name) in records {
sounds like a lot of round trips to the database
}

#[derive(Debug)]
pub struct LogicalExprWithId {
just FYI the old codebase doesn't have this and I'm unsure how useful it is to have it
fn insert_op(&self, id: LogicalExprId, storage: &mut StorageManager);

/// Gets the logical operator kind id.
fn op_kind(&self, storage: &mut StorageManager) -> LogicalOpKindId {
Do we have a plan to use the diesel async interface? It sounds like we are starting with sync first and will need a significant refactor later.
I'm looking into how diesel-async works. Would there be a lot of problems besides adding in the await points?
/// Otherwise, it inserts the logical expression into the database and returns the generated logical expression id and
/// the relational group id.
fn add(&self, storage: &mut StorageManager) -> (LogicalExprId, RelGroupId) {
    if let Some((id, rel_group_id)) = self.get_identifiers(storage) {
Do the check and insert in one SQL statement using insert on conflict?
TODO: wrap all these ops into a txn
do check and insert in one sql using insert on conflict?
Yeah, I think to do this we probably need to build a unique index on the data columns, and then we can start using upserts.
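As a sketch of what that single-statement approach could look like: with a unique index over the operator's data columns, SQLite's INSERT ... ON CONFLICT with RETURNING (available since SQLite 3.35) can hand back the existing row's id on conflict. The index name and the no-op DO UPDATE trick below are assumptions, not code from the PR; the table and column names follow the demo's logical_scans table. The statements are shown as plain strings since executing them would need a SQLite driver such as rusqlite or diesel.

```rust
// Hypothetical upsert shape for the check-and-insert discussed above.
// ON CONFLICT needs a unique constraint on its target columns, and the no-op
// DO UPDATE (rather than DO NOTHING) makes RETURNING yield a row even when
// the record already exists.
const CREATE_UNIQUE_INDEX: &str =
    "CREATE UNIQUE INDEX IF NOT EXISTS logical_scans_data
     ON logical_scans (table_name);";

const UPSERT_SCAN: &str =
    "INSERT INTO logical_scans (logical_expr_id, table_name)
     VALUES (?1, ?2)
     ON CONFLICT (table_name) DO UPDATE SET table_name = excluded.table_name
     RETURNING logical_expr_id;";

fn main() {
    // Sanity-check the statement shapes; real execution is out of scope here.
    assert!(UPSERT_SCAN.contains("ON CONFLICT"));
    assert!(UPSERT_SCAN.contains("RETURNING"));
    println!("{CREATE_UNIQUE_INDEX}\n\n{UPSERT_SCAN}");
}
```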
I would also think that an in-memory caching layer is necessary to speed things up.
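One minimal shape such a cache could take, assuming expressions can be fingerprinted into a key: check a map first and fall back to the database only on a miss. The type names and the string fingerprint below are illustrative stand-ins, not the PR's API.

```rust
use std::collections::HashMap;

// Hypothetical sketch of the suggested in-memory caching layer.
type LogicalExprId = i64;

struct ExprCache {
    by_fingerprint: HashMap<String, LogicalExprId>,
}

impl ExprCache {
    fn new() -> Self {
        Self { by_fingerprint: HashMap::new() }
    }

    /// Returns the cached id, or runs `load` (standing in for the database
    /// round trip) and remembers its result.
    fn get_or_load(
        &mut self,
        fingerprint: &str,
        load: impl FnOnce() -> LogicalExprId,
    ) -> LogicalExprId {
        if let Some(&id) = self.by_fingerprint.get(fingerprint) {
            return id;
        }
        let id = load();
        self.by_fingerprint.insert(fingerprint.to_owned(), id);
        id
    }
}

fn main() {
    let mut cache = ExprCache::new();
    let mut db_calls = 0;
    let first = cache.get_or_load("LogicalScan(t1)", || { db_calls += 1; 1 });
    // Second lookup is served from memory; `load` never runs.
    let second = cache.get_or_load("LogicalScan(t1)", || { db_calls += 1; 99 });
    assert_eq!((first, second, db_calls), (1, 1, 1));
}
```

Invalidation is the hard part once group merging exists, which is presumably why this is deferred.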
/// Gets the logical expression id if it is already in the database.
fn id(&self, storage: &mut StorageManager) -> Option<LogicalExprId>;

fn insert_op(&self, id: LogicalExprId, storage: &mut StorageManager);
What does this function do, and is there an example using it?
This is used by add to insert an entry into the per-operator tables. This reduces the lines of code needed for each operator.
I also recommend having a single counter for both group id and expr id so that one number uniquely identifies an entity in the system. This helps with debugging, i.e., if we assign 1 as a group id, then 1 won't be used as an expression id. It is possible with Postgres's
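The single-counter idea above can be sketched in a few lines: group ids and expression ids are drawn from one shared sequence, so any given number identifies at most one entity. The names here are illustrative, not the PR's API (a SQLite-backed version would keep the sequence in the database instead of process memory).

```rust
use std::sync::atomic::{AtomicI64, Ordering};

// One shared sequence for every entity kind, so ids never collide across kinds.
static NEXT_ID: AtomicI64 = AtomicI64::new(1);

fn next_id() -> i64 {
    NEXT_ID.fetch_add(1, Ordering::Relaxed)
}

fn main() {
    let group_id = next_id();
    let expr_id = next_id();
    // A number handed out as a group id is never reused as an expression id.
    assert_ne!(group_id, expr_id);
}
```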
First pass (super high level) review:
I added some comments about module file structure
In terms of style, I think we should be very intentional about naming. Things like Expr are pretty self-explanatory, but I actually have to think hard when stumbling across something like physical_op_kinds and rel_group -- it will be even harder for someone new jumping into this codebase. Default to spelling out everything unless it becomes unwieldy (like in the case of writing out the entire Expression). Things like Rel and OpKind are, in my opinion, hard to grok.
@@ -0,0 +1,30 @@
//! The logical operator objects in the optimizer storage layer.
This should be a mod.rs file, considering that we're going to have tens of operators and the directory structure is going to be super difficult to navigate if the submodule root is located in a different place than the operator sub-submodules.
I prefer having a consistent project structure and avoiding mod.rs. With the same convention, people will know where to find submodules.
Yes I agree, and I'm saying that there is not much justification here in having a logical_operator.rs file located so far away from the logical_operator folder, especially since we're going to have a lot of modules.
See discussions:
- https://www.reddit.com/r/rust/comments/18pytwt/noob_question_foomodrs_vs_foors_foo_for_module/
- https://users.rust-lang.org/t/module-mod-rs-or-module-rs/122653
- https://internals.rust-lang.org/t/the-module-scheme-module-rs-file-module-folder-instead-of-just-module-mod-rs-introduced-by-the-2018-edition-maybe-a-little-bit-more-confusing/21977/17?u=zirconium-n
This is obviously not a black and white conversation, but in my opinion the only argument against having mod.rs is that you have multiple files named the same thing. But with the last link listed above, plus the fact that mod.rs should really only be re-exporting stuff, I don't see any good reason to follow this.
@@ -0,0 +1,30 @@
//! The physical operator objects in the optimizer storage layer.
This should also be a mod.rs file.
pub mod common;
pub mod logical_expr;
pub mod logical_operators;
pub mod physical_expr;
pub mod physical_operators;
pub mod rel_group;
This definitely needs to be in a mod.rs file.
@@ -0,0 +1,116 @@
// @generated automatically by Diesel CLI.
So I know that this is code-generated, but is it possible to put comments here anyway? The workflow that worked for the sea-orm codegen was that we added extra stuff into this file (like comments and lints), and when the codegen overwrote the extra stuff we just used git to put it back.
@@ -91,7 +91,7 @@ jobs:
uses: dtolnay/install@cargo-docs-rs
- name: cargo docs-rs
# TODO: Once we figure out the crates, rename this.
run: cargo docs-rs -p optd-tmp
run: cargo docs-rs -p optd
I still personally think that this should be called optd-storage instead of just optd, because it will become confusing.
I think last time we decided that initially we are going to have a single crate called optd. storage will be a module inside that crate.
If that's the case, then we arguably shouldn't have a workspace at all.
With what @SarveshOO7 was working on, I think it makes sense to have a separate crate for everything around the pipeline we talked about because we do not want to have to recompile datafusion every single time we make a change.
I think either we completely remove the workspace and call it optd, or we start with 2 crates (something like optd-core and optd-datafusion) and make sure we don't expand beyond 2 without good reason.
resolver = "2"

[workspace.dependencies]
anyhow = "1"
We probably want to use snafu given the service-like-library nature of this project.
#[diesel(belongs_to(RelGroup))]
#[diesel(belongs_to(LogicalOpKind))]
#[diesel(check_for_backend(diesel::sqlite::Sqlite))]
pub struct LogicalExprRecord {
I don't think we ever talked about this type; this seems to be a new table? I see that it is used in one other place in get_all_logical_exprs_in_group, but I don't immediately see the purpose of this...
This is the flat record type on the logical_exprs table. You need both this and the specific operator table to get the full LogicalExpr enum out. Subject to change, but I would argue this does not change the overall API.
If that's the case, can you make the visibility pub(super), or whatever the lowest visibility is that still gives the memo table access to it?
I think there should be a storage_manager submodule in the storage module that only contains the StorageManager type and impl.
@@ -0,0 +1,8 @@
-- The physical operator descriptor table stores all the
-- physical operators that can be used in the optimizer.
CREATE TABLE physical_op_kinds (
TBH, I think having an enum here might be more practical. It will also make all the queries simpler and save us a join. Might as well codegen everywhere, right?
This table will never change at run time.
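A minimal sketch of the enum alternative: the kind set is fixed at compile time, the column stores the discriminant directly, and the name comes from code rather than a join against a lookup table. The variants and discriminant values below are illustrative, not the PR's actual operator list.

```rust
// Hypothetical replacement for the physical_op_kinds lookup table.
#[derive(Debug, Clone, Copy)]
enum PhysicalOpKind {
    TableScan = 0,
    HashJoin = 1,
    Filter = 2,
}

impl PhysicalOpKind {
    // The human-readable name lives in code, so no join is needed to show it.
    fn name(self) -> &'static str {
        match self {
            PhysicalOpKind::TableScan => "TableScan",
            PhysicalOpKind::HashJoin => "HashJoin",
            PhysicalOpKind::Filter => "Filter",
        }
    }
}

fn main() {
    let kind = PhysicalOpKind::HashJoin;
    // The database column would simply store the discriminant, 1.
    assert_eq!((kind as i32, kind.name()), (1, "HashJoin"));
}
```

The tradeoff is that adding an operator becomes a code change plus recompile rather than a row insert, which is fine given the table never changes at run time.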
I'll experiment with that during the WE.
I agree, even though I haven't actually given the logic of this PR a review, we should be pushing as much work as we can to compile-time / codegen / preprocessing, especially since we want the foundation to be minimal at runtime so that we don't run into a wave of runtime bugs in the future
.select(logical_op_kinds::id)
.first::<LogicalOpKindId>(&mut storage.conn)?;

let logical_filter_id = logical_op_kinds::table
If we keep the table approach, I guess we could have a big function that loads all of them and caches the ids in the future.
We could also always carry the operator kind.
-- which group a logical expression belongs to.
CREATE TABLE logical_exprs (
    -- The logical expression id.
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
If we go with Chi's suggestion of having a unique ID for each object, we shouldn't use AUTOINCREMENT here.
-- The logical expression id.
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- The logical operator descriptor id.
logical_op_kind_id BIGINT NOT NULL,
Probably doesn't need to be BIGINT here? Or do foreign key references HAVE to be BIGINT?
@@ -0,0 +1,8 @@
-- A relational group contains a set of relational expressions
-- that are logically equivalent.
CREATE TABLE rel_groups (
Yeah, I think we should expand out the "relational" here, and also be consistent with what the migration file directory is called.
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- Time at which the group is created.
created_at TIMESTAMP DEFAULT (CURRENT_TIMESTAMP) NOT NULL
);
Add TODOs here saying that we will add other columns like an optional winner, potentially cost, and also other group metadata related to our union-find parent pointer idea
It seems like there might be an unnecessary level of indirection here?
For the sake of this example, suppose we only have logical expression types. My understanding is that there is a logical_exprs table that tracks expression IDs and maps them to their logical operator kind (Scan, Filter, Join). The purpose of this is so that every logical expression ID lookup can be paired with a logical operator kind, so that we know which specific table to go look at.
What if instead of having a full table that makes this mapping, we instead encode the operator kind inside of the expression ID? For example, we could have it such that the upper 16 bits of a 64-bit ID encode the operator kind, and the lower 48 bits encode the unique expression given the operator kind (we would likely not even need all 16 bits, we could probably get away with the top 8 bits even).
You could probably argue that this introduces complexity, but in my mind I feel that having extra tables that are not strictly necessary introduces even more complexity. When I was trying to read this PR I had to spend a non-trivial amount of time trying to understand the architecture, since I did not expect that specific table when we initially talked about the architecture in previous meetings.
In terms of implementation of the above proposal, yes it would be a bit ugly, but if abstracted correctly at the storage layer level, there should only really be 1 function that handles the conversion into some sort of strongly-typed struct like this:

struct ExprId {
    kind: u16, // could also encode logical/physical difference before converting into stronger types
    id: u64,   // truncated at 48 bits
}
This is opposed to having a dedicated table that makes this mapping, where every single expression lookup now has to look up an extra record on disk (and even in memory, this round trip is going to need to happen for every single lookup during optimization).
As for the group ID mapping, I think that needs to be in a separate junction table regardless for group operations. Unless this is actually supposed to be the junction table? If that's the case, then all 3 things in this table would have needed to be a composite primary key tuple.
I think there are tradeoffs to both models, but I am leaning towards removing the layer of indirection.
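The proposed packing can be sketched concretely: the upper 16 bits of a 64-bit id carry the operator kind, the lower 48 bits carry the per-kind sequence number, and unpacking recovers the kind without touching any table. The names and bit widths follow the comment above; none of this code is in the PR itself.

```rust
// Hypothetical bit-packed expression id, per the proposal above.
const ID_BITS: u32 = 48;
const ID_MASK: u64 = (1 << ID_BITS) - 1;

#[derive(Debug, PartialEq)]
struct ExprId {
    kind: u16,
    id: u64, // only the low 48 bits are significant
}

fn pack(expr: &ExprId) -> u64 {
    ((expr.kind as u64) << ID_BITS) | (expr.id & ID_MASK)
}

fn unpack(raw: u64) -> ExprId {
    ExprId {
        kind: (raw >> ID_BITS) as u16,
        id: raw & ID_MASK,
    }
}

fn main() {
    let scan = ExprId { kind: 1, id: 42 };
    let raw = pack(&scan);
    // Round-trips losslessly; the kind is recovered with a shift, not a lookup.
    assert_eq!(unpack(raw), scan);
    assert_eq!(raw >> ID_BITS, 1);
}
```

One caveat worth noting if this route is taken: SQLite's INTEGER column is a signed 64-bit value, so kinds with the top bit set would produce negative ids unless the kind space is kept below 0x8000.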
Problem
Persistent storage is at the top of the optd optimizer wishlist.
Summary of changes
Future work
This PR unlocks the opportunity for people to work on rule matching. A fully sketched-out schema will be implemented gradually over this week and next.