Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect stack overflow and reduce stack usage on debug build #1465

Open
Eason0729 opened this issue Oct 10, 2024 · 4 comments
Open

Detect stack overflow and reduce stack usage on debug build #1465

Eason0729 opened this issue Oct 10, 2024 · 4 comments

Comments

@Eason0729
Copy link
Contributor

related to apache/datafusion#12731

The issue is that unoptimized(debug) build consume more stack memory, which known to cause stack overflow if stack size is less than 1MiB.

Expected change:

  1. Return error on stack overflow
  2. Reduce stack usage (or grow stack
@Eason0729
Copy link
Contributor Author

I would like to take stacker approach, that is:

  1. implement stack guard at some point
  2. if red zone is reached, try grow stack
  3. if unable to grow stack, return an error

If those words aren't enough to explain the approach I would like to take, I can submit a draft PR to make it clear with code.

@alamb
Copy link
Contributor

alamb commented Oct 10, 2024

@Eason0729
Copy link
Contributor Author

#1468

@Eason0729
Copy link
Contributor Author

Eason0729 commented Oct 17, 2024

I spent some time manually run reg read sp fp, and write down rbp-rsp (in debug build.

Detail

export RUSTFLAGS="-C force-frame-pointers=yes"

parse_query 12592
parse_boxed_query_body 12592
parse_select 17024
parse_optional_alias 256
parse_comma_separated 1056
parse_select 17024
parse_table_and_joins 10016
parse_derived_table_factor 2384

And here is some large struct(in byte):

Expr 296
Statement 3528
let parser = Parser::new(&GenericDialect {});
let sql = r#"
    SELECT seq
    FROM (
        SELECT seq,
        LEAD(seq) OVER (ORDER BY seq) AS next_seq
        FROM parquet_table
    ) AS subquery
    WHERE next_seq - seq > 1;
"#;
let stmt = parser.try_with_sql(sql).unwrap().parse_statement().unwrap();
print!("{}", stmt.to_string());

In this example, sqlparser consume about 140KB stack at peak, so I think there is little to optimize in sqlparser.
I would try, but we still need to reduce stack usage on datafusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants