
feat: minimalistic storage layer setup #11

Open
wants to merge 11 commits into main
Conversation

@yliang412 (Member) commented Jan 24, 2025

Problem

Persistent storage is at the top of the optd optimizer wishlist.

Summary of changes

  • Models the database schema and brings it to life using Diesel as the ORM.
  • The schema uses an operator-per-table model.
  • Implements a storage manager API on top of the ORM; it currently supports:
impl StorageManager {
    pub fn add_logical_expr(&mut self, logical_expr: LogicalExpr) -> (LogicalExprId, RelGroupId);

    pub fn add_logical_expr_to_group(
        &mut self,
        logical_expr: LogicalExpr,
        rel_group_id: RelGroupId,
    ) -> LogicalExprId;

    pub fn get_logical_expr_identifiers(
        &mut self,
        logical_expr: &LogicalExpr,
    ) -> Option<(LogicalExprId, RelGroupId)>;

    pub fn get_all_logical_exprs_in_group(
        &mut self,
        rel_group_id: RelGroupId,
    ) -> Vec<LogicalExprWithId>;

    // more private methods ...
}
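A minimal usage sketch of this API follows. The constructor name, the LogicalExpr variant shape, and the error handling are assumptions for illustration, not part of this PR:

fn main() {
    // Names marked "assumed" below are illustrative; only the four public methods are from this PR.
    let mut storage = StorageManager::new("test_memo.db").expect("open memo db"); // constructor assumed

    // Adding an unseen expression creates a new group and returns both identifiers.
    let scan = LogicalExpr::Scan(LogicalScan { table_name: "t1".into() }); // variant shape assumed
    let (scan_id, group_id) = storage.add_logical_expr(scan.clone());

    // Looking the expression up again returns the same identifiers.
    assert_eq!(
        storage.get_logical_expr_identifiers(&scan),
        Some((scan_id, group_id))
    );

    // Register another expression (e.g. one produced by a rewrite rule) into a known group.
    let equivalent = LogicalExpr::Scan(LogicalScan { table_name: "t1".into() });
    let _expr_id = storage.add_logical_expr_to_group(equivalent, group_id);

    // Enumerate everything the group currently contains.
    for expr in storage.get_all_logical_exprs_in_group(group_id) {
        println!("{expr:?}");
    }
}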

Future work

This PR unlocks the opportunity for people to work on rule matching. The fully sketched-out schema will be implemented gradually over this week and the next.

@yliang412 (Member, Author) commented Jan 24, 2025

Demo

Imagine the following schema and query:

CREATE TABLE t1(v1 INTEGER, v2 TEXT);
CREATE TABLE t2(v1 INTEGER, v2 TEXT);
SELECT * FROM t1 INNER JOIN t2 ON t1.v1 = t2.v1 WHERE t1.v2 = 'foo';

The query corresponds to the following logical plan:

- LogicalFilter (on: t1.v2 = 'foo')
  - child: LogicalJoin (inner, on t1.v1 = t2.v1)
    - left: LogicalScan (t1)
    - right: LogicalScan (t2)

Besides adding the initial plan, we also apply the join commutativity rule and register the equivalent expression:

- LogicalJoin (inner, on t1.v1 = t2.v1)
  - left: LogicalScan (t2)
  - right: LogicalScan (t1)

Run the demo with:

$ cargo run --bin storage_demo_with_trait

After running the demo, you will have:

$ sqlite3 test_memo.db
sqlite> select l.id, l.group_id, l.created_at, desc.name from logical_exprs as l, logical_op_kinds as desc where l.logical_op_kind_id = desc.id;
┌────┬──────────┬─────────────────────┬───────────────┐
│ id │ group_id │     created_at      │     name      │
├────┼──────────┼─────────────────────┼───────────────┤
│ 1  │ 1        │ 2025-01-24 05:52:45 │ LogicalScan   │
│ 2  │ 2        │ 2025-01-24 05:52:45 │ LogicalScan   │
│ 3  │ 3        │ 2025-01-24 05:52:45 │ LogicalJoin   │
│ 4  │ 3        │ 2025-01-24 05:52:45 │ LogicalJoin   │
│ 5  │ 4        │ 2025-01-24 05:52:45 │ LogicalFilter │
└────┴──────────┴─────────────────────┴───────────────┘

sqlite> select * from logical_joins;
┌─────────────────┬───────────┬──────┬───────┬───────────────┐
│ logical_expr_id │ join_type │ left │ right │   join_cond   │
├─────────────────┼───────────┼──────┼───────┼───────────────┤
│ 3               │ 0         │ 1    │ 2     │ t1.v1 = t2.v1 │
│ 4               │ 0         │ 2    │ 1     │ t1.v1 = t2.v1 │
└─────────────────┴───────────┴──────┴───────┴───────────────┘
sqlite> select * from logical_scans;
┌─────────────────┬────────────┐
│ logical_expr_id │ table_name │
├─────────────────┼────────────┤
│ 1               │ t1         │
│ 2               │ t2         │
└─────────────────┴────────────┘
sqlite> select * from logical_filters;
┌─────────────────┬───────┬───────────────┐
│ logical_expr_id │ child │   predicate   │
├─────────────────┼───────┼───────────────┤
│ 5               │ 3     │ t1.v2 = 'foo' │
└─────────────────┴───────┴───────────────┘

It is also fine to run the demo multiple times; the created_at timestamps won't change.

Signed-off-by: Yuchen Liang <[email protected]>
@yliang412 yliang412 marked this pull request as ready for review January 24, 2025 06:19
@yliang412 yliang412 changed the title feat: minimalist storage layer setup feat: minimalistic storage layer setup Jan 24, 2025
@skyzh (Member) left a comment

Generally LGTM with the approach -- I can see some minor problems, and we can figure them out later.


fn try_from(value: i32) -> Result<Self, Self::Error> {
    use JoinType::*;
    match value {
Member:

consider using a 3rd party crate to reduce boilerplate code
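For example, a sketch assuming the num_enum crate (variant names and discriminants below are illustrative):

use num_enum::TryFromPrimitive;

// Deriving TryFrom<i32> removes the hand-written match arm per variant.
#[derive(Debug, Clone, Copy, PartialEq, Eq, TryFromPrimitive)]
#[repr(i32)]
pub enum JoinType {
    Inner = 0,
    Left = 1,
    Right = 2,
    Full = 3,
}

// JoinType::try_from(0) == Ok(JoinType::Inner)
// JoinType::try_from(42) returns an error instead of needing a manual fallback arm.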


let mut exprs = Vec::with_capacity(records.len());

for (record, name) in records {
Member:

sounds like a lot of round trips to the database
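One possible way to cut down the round trips (a sketch only; LogicalScanRecord, the id field, and the column names are assumed from the demo schema):

use diesel::prelude::*;

// Gather the expression ids per operator kind first, then fetch each operator
// table with a single IN query instead of one SELECT per (record, name) pair.
fn load_scans(
    storage: &mut StorageManager,
    records: &[(LogicalExprRecord, String)],
) -> QueryResult<Vec<LogicalScanRecord>> {
    let scan_ids: Vec<LogicalExprId> = records
        .iter()
        .filter(|(_, name)| name.as_str() == "LogicalScan")
        .map(|(record, _)| record.id)
        .collect();

    logical_scans::table
        .filter(logical_scans::logical_expr_id.eq_any(scan_ids))
        .load(&mut storage.conn)
}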

}

#[derive(Debug)]
pub struct LogicalExprWithId {
Member:

just FYI the old codebase doesn't have this and I'm unsure how useful it is to have it

fn insert_op(&self, id: LogicalExprId, storage: &mut StorageManager);

/// Gets the logical operator kind id.
fn op_kind(&self, storage: &mut StorageManager) -> LogicalOpKindId {
Member:

Do we have a plan to use the diesel async interface? It sounds like we are starting with sync first and will need a significant refactor later.

Member Author:

I'm looking into how diesel-async works. Would there be a lot of problems besides adding in the await points?
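For reference, a rough sketch of what a query could look like with diesel-async (assuming a backend it supports, e.g. Postgres; the column names follow the demo schema):

use diesel::prelude::*;
use diesel_async::{AsyncPgConnection, RunQueryDsl};

// The main difference from the sync code path is the `.await` on each query;
// the query-builder calls themselves stay the same.
async fn all_exprs_in_group(
    conn: &mut AsyncPgConnection,
    group: RelGroupId,
) -> QueryResult<Vec<LogicalExprRecord>> {
    logical_exprs::table
        .filter(logical_exprs::group_id.eq(group))
        .load::<LogicalExprRecord>(conn)
        .await
}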

/// Otherwise, it inserts the logical expression into the database and returns the generated logical expression id and
/// the relational group id.
fn add(&self, storage: &mut StorageManager) -> (LogicalExprId, RelGroupId) {
if let Some((id, rel_group_id)) = self.get_identifiers(storage) {
Member:

do check and insert in one sql using insert on conflict?

Member:

TODO: wrap all these ops into a txn
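For example, a rough sketch with Diesel's transaction API (the body stands in for the existing lookup/insert logic):

use diesel::prelude::*;
use diesel::result::Error;

// Run the lookup and the per-operator inserts in one transaction,
// so any failure rolls the whole thing back.
fn add_in_txn(storage: &mut StorageManager) -> Result<(LogicalExprId, RelGroupId), Error> {
    storage.conn.transaction(|conn| {
        // ... the existing get_identifiers / insert_into(...) calls, using `conn` ...
        todo!()
    })
}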

Member Author:

> do check and insert in one sql using insert on conflict?

Yea, I think to do this we probably need to build an index on the data columns and then you can start using upserts.
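For example, in SQLite syntax (a sketch; the unique column shown is illustrative for logical_scans):

-- A uniqueness constraint on the data column(s) makes the insert idempotent.
CREATE UNIQUE INDEX logical_scans_table_name_key ON logical_scans (table_name);

-- The check-then-insert then collapses into a single statement.
INSERT INTO logical_scans (logical_expr_id, table_name)
VALUES (?, ?)
ON CONFLICT (table_name) DO NOTHING;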

Member Author:

I would also think that an in-memory caching layer is necessary to speed things up.

/// Gets the logical expression id if it is already in the database.
fn id(&self, storage: &mut StorageManager) -> Option<LogicalExprId>;

fn insert_op(&self, id: LogicalExprId, storage: &mut StorageManager);
Member:

what does this function do and any example using it?

Member Author:

This is used by add to insert an entry into the per-operator tables. It reduces the lines of code needed for each operator.
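For example, a sketch of a per-operator insert (table, column, and type names follow the demo schema; this is illustrative, not the PR's exact code):

use diesel::prelude::*;

// Sketch of the per-operator insert that `add` delegates to: one row in the
// operator-specific table, keyed by the shared logical expression id.
fn insert_scan_op(scan: &LogicalScan, id: LogicalExprId, storage: &mut StorageManager) {
    diesel::insert_into(logical_scans::table)
        .values((
            logical_scans::logical_expr_id.eq(id),
            logical_scans::table_name.eq(&scan.table_name),
        ))
        .execute(&mut storage.conn)
        .expect("failed to insert logical scan");
}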

@skyzh (Member) commented Jan 24, 2025

I also recommend having a single counter for both group ids and expression ids so that one number uniquely identifies an entity in the system. This helps with debugging: if we assign 1 as a group id, then 1 won't be used as an expression id. It is possible with Postgres's CREATE SEQUENCE. (Something for the future.)
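For example, a Postgres sketch (table and column names mirror the SQLite schema in this PR):

-- One shared sequence, so a value handed out to a group is never reused for an expression.
CREATE SEQUENCE entity_id_seq;

CREATE TABLE rel_groups (
    id BIGINT PRIMARY KEY DEFAULT nextval('entity_id_seq'),
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE logical_exprs (
    id BIGINT PRIMARY KEY DEFAULT nextval('entity_id_seq'),
    group_id BIGINT NOT NULL REFERENCES rel_groups (id),
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);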

@connortsui20 (Member) left a comment

First pass (super high level) review:

I added some comments about module file structure

In terms of style, I think we should be very intentional about naming. Things like Expr are pretty self-explanatory, but I actually have to think hard when stumbling across something like physical_op_kinds and rel_group -- it will be even harder for someone new jumping into this codebase. Default to spelling out everything unless it becomes unwieldy (like in the case of writing out the entire Expression). Things like Rel and OpKind are, in my opinion, hard to grok.

@@ -0,0 +1,30 @@
//! The logical operator objects in the optimizer storage layer.
Member:

This should be a mod.rs file considering that we're going to have tens of operators, and the directory structure is going to be super difficult to navigate if the submodule root is located in a different place than the operator sub-submodules.

Member Author:

I prefer having a consistent project structure and avoiding mod.rs. With the same convention, people will know where to find submodules.

Member:

Yes, I agree, and I'm saying that there is not much justification for having a logical_operator.rs file located so far away from the logical_operator folder, especially since we're going to have a lot of modules.

See discussions:

This is obviously not a black and white conversation, but in my opinion the only argument against having mod.rs is that you have multiple files named the same thing. But with the last link listed above plus the fact that mod.rs should really only be re-exporting stuff, I don't see any good reason to follow this.

@@ -0,0 +1,30 @@
//! The physical operator objects in the optimizer storage layer.
Member:

This should also be a mod.rs file

Comment on lines +1 to +6
pub mod common;
pub mod logical_expr;
pub mod logical_operators;
pub mod physical_expr;
pub mod physical_operators;
pub mod rel_group;
Member:

This definitely needs to be in a mod.rs file

@@ -0,0 +1,116 @@
// @generated automatically by Diesel CLI.
Member:

So I know that this is code-generated, but is it possible to put comments here anyway? The workflow that worked for the sea-orm codegen was that we added extra stuff to this file (like comments and lints), and when the codegen overwrote the extra stuff we just used git to put it back.

@@ -91,7 +91,7 @@ jobs:
uses: dtolnay/install@cargo-docs-rs
- name: cargo docs-rs
# TODO: Once we figure out the crates, rename this.
run: cargo docs-rs -p optd-tmp
run: cargo docs-rs -p optd
Member:

I still personally think that this should be called optd-storage instead of just optd because it will become confusing

Member Author:

I think last time we decided that initially we are going to have a single crate called optd; storage will be a module inside that crate.

@connortsui20 (Member) Jan 24, 2025

If that's the case, then we arguably shouldn't have a workspace at all.

With what @SarveshOO7 was working on, I think it makes sense to have a separate crate for everything around the pipeline we talked about because we do not want to have to recompile datafusion every single time we make a change.

I think either we completely remove the workspace and call it optd, or we start with 2 crates (something like optd-core and optd-datafusion) and make sure we don't expand beyond 2 without good reason.

resolver = "2"

[workspace.dependencies]
anyhow = "1"
Member:

We probably want to use snafu given the service-like-library nature of this project
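For example, a sketch of what a snafu-based error type for the storage layer could look like (variant and field names are illustrative):

use snafu::{ResultExt, Snafu};

#[derive(Debug, Snafu)]
pub enum StorageError {
    #[snafu(display("failed to run database query: {source}"))]
    Query { source: diesel::result::Error },

    #[snafu(display("failed to open connection to {path}: {source}"))]
    Connect {
        source: diesel::ConnectionError,
        path: String,
    },
}

// Call sites attach context instead of bubbling raw diesel errors, e.g.:
// let conn = SqliteConnection::establish(path).context(ConnectSnafu { path })?;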

#[diesel(belongs_to(RelGroup))]
#[diesel(belongs_to(LogicalOpKind))]
#[diesel(check_for_backend(diesel::sqlite::Sqlite))]
pub struct LogicalExprRecord {
Member:

I don't think we ever talked about this type, this seems to be a new table? I see that it is used in one other place in get_all_logical_exprs_in_group, but I don't immediately see the purpose of this...

Member Author:

This is the flat record type on the logical_exprs table. You need both this and the specific operator table to get out the full LogicalExpr enum. It is subject to change, but I would argue this does not change the overall API.

Member:

If that's the case, can you make the visibility pub(super) or whatever is the lowest visibility for the memo table to have access to it?

Member:

I think there should be a storage_manager submodule in the storage module that only contains the StorageManager type and impl

Signed-off-by: Yuchen Liang <[email protected]>
Signed-off-by: Yuchen Liang <[email protected]>
@@ -0,0 +1,8 @@
-- The physical operator descriptor table stores all the
-- physical operators that can be used in the optimizer.
CREATE TABLE physical_op_kinds (
@AlSchlo (Collaborator) Jan 24, 2025

TBH, I think having an enum here might be more practical. It will also make all the queries simpler and save us a join. Might as well codegen everywhere, right?

This table will never change at run time.

Collaborator:

I'll experiment with that over the weekend.

@connortsui20 (Member) Jan 24, 2025

I agree. Even though I haven't actually given the logic of this PR a review, we should be pushing as much work as we can to compile time / codegen / preprocessing, especially since we want the foundation to be minimal at runtime so that we don't run into a wave of runtime bugs in the future.

.select(logical_op_kinds::id)
.first::<LogicalOpKindId>(&mut storage.conn)?;

let logical_filter_id = logical_op_kinds::table
Collaborator:

if we keep the table approach, I guess we could have a big function that loads all of them & caches the ids in the future.

Member Author:

We could also always carry the operator kind.

-- which group a logical expression belongs to.
CREATE TABLE logical_exprs (
-- The logical expression id.
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
Member:

If we go with Chi's suggestion of having a unique ID for each object we shouldn't use autoincrement here

-- The logical expression id.
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- The logical operator descriptor id.
logical_op_kind_id BIGINT NOT NULL,
Member:

Probably doesn't need to be BIGINT here? Or do foreign key references HAVE to be BIGINT?

@@ -0,0 +1,8 @@
-- A relational group contains a set of relational expressions
-- that are logically equivalent.
CREATE TABLE rel_groups (
Member:

Yeah, I think we should expand out the "relational" here, and also be consistent with what the migration file directory is called.

id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- Time at which the group is created.
created_at TIMESTAMP DEFAULT (CURRENT_TIMESTAMP) NOT NULL
);
Member:

Add TODOs here saying that we will add other columns like an optional winner, potentially cost, and also other group metadata related to our union-find parent pointer idea
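For example, the migration could carry the TODOs inline (a sketch; the commented-out columns are placeholders, not decided schema):

CREATE TABLE rel_groups (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
    -- TODO(future): optional winner physical expression for this group.
    -- winner_physical_expr_id BIGINT,
    -- TODO(future): best cost found so far.
    -- cost REAL,
    -- TODO(future): union-find parent pointer for merged groups.
    -- parent_group_id BIGINT,
    -- Time at which the group is created.
    created_at TIMESTAMP DEFAULT (CURRENT_TIMESTAMP) NOT NULL
);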

@connortsui20 (Member) Jan 25, 2025

It seems like there might be an unnecessary level of indirection here?

For the sake of this example, suppose we only have logical expression types. My understanding of this is that there is a logical_exprs table that tracks expression IDs and maps them to their logical operator kind (Scan, Filter, Join). The purpose of this is so that every logical expression ID lookup can be paired with a logical operator kind so that we know what specific table to go look at.

What if instead of having a full table that makes this mapping, we instead encode the operator kind inside of the expression ID? For example, we could have it such that the upper 16 bits of a 64-bit ID encode the operator kind, and the lower 48 bits encode the unique expression given the operator kind (we would likely not even need all 16 bits, we could probably get away with the top 8 bits even).

You could probably argue that this introduces complexity, but in my mind I feel that having extra tables that are not strictly necessary introduces even more complexity. When I was trying to read this PR I had to spend a non-trivial amount of time trying to understand the architecture, since I did not expect that specific table when we initially talked about the architecture in previous meetings.

In terms of implementation of the above proposal, yes it would be a bit ugly, but if abstracted correctly at the storage layer level, there should only really be 1 function that handles the conversion into some sort of strongly-typed struct like this:

struct ExprId {
    kind: u16, // could also encode logical/physical difference before converting into stronger types
    id: u64,   // truncated at 48 bits
}

This is opposed to having a dedicated table that makes this mapping, where every single expression lookup now has to look up an extra record on disk (and even in memory, this roundtrip is going to need to happen for every single lookup during optimization).

As for the group ID mapping, I think that needs to be in a separate junction table regardless for group operations. Unless this is actually supposed to be the junction table? If that's the case, then all 3 things in this table would have needed to be a composite primary key tuple.

I think there are tradeoffs to both models, but I am leaning towards removing the layer of indirection.
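A minimal sketch of the proposed packing (bit widths as described above; field and type names are illustrative):

const ID_BITS: u32 = 48;
const ID_MASK: u64 = (1 << ID_BITS) - 1;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ExprId {
    kind: u16, // operator kind (could also encode the logical/physical split)
    id: u64,   // truncated to 48 bits
}

impl ExprId {
    // Pack the kind into the upper 16 bits and the per-kind id into the lower 48.
    fn encode(self) -> u64 {
        ((self.kind as u64) << ID_BITS) | (self.id & ID_MASK)
    }

    fn decode(raw: u64) -> Self {
        Self {
            kind: (raw >> ID_BITS) as u16,
            id: raw & ID_MASK,
        }
    }
}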
