Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Fetch SQL queries in Postgres backend #196

Closed

Conversation

rblaine95
Copy link
Contributor

@rblaine95 rblaine95 commented Nov 6, 2023

An attempt at some optimizations on the FETCH_QUERY and FETCH_QUERY_UPDATE Postgres queries.

Opening as a draft PR while I work on figuring out a way to benchmark the updated queries.

According to GPT:

The refactored versions with the LEFT JOIN and GROUP BY clauses, as well as the use of CTEs are likely to be more efficient in terms of performance, especially when dealing with large datasets. It allows for a more optimized approach to data retrieval, particularly when combining information from multiple tables.

Please be as pedantic as possible (-Wclippy::pedantic -Wclippy::nursery).

Related to #195

@swcurran
Copy link
Contributor

swcurran commented Nov 6, 2023

@andrewwhitehead — any idea why the tests failed? Doesn’t look like it would be related to the change?

Fun PR — thanks!

@andrewwhitehead
Copy link
Contributor

It appears that the latest ring dependency breaks support for the manylinux aarch64 environment we are using (briansmith/ring#1728). We could update to manylinux_2_28 or try one of the other fixes, I'll have to test that in another PR.

@rblaine95
Copy link
Contributor Author

rblaine95 commented Nov 9, 2023

Small update - I've made some progress on building a benchmark using Criterion to test if there's any benefit in the refactored SQL queries.

Preliminary results are looking good.
On my local (M2 Macbook Pro (10CPU, 16GiB), bash ./tests/docker_pg.sh), I pre-populated the DB with 5000 profiles, then benchmarked the old SQL vs the new SQL creating 1000 profiles on top of the existing 5000.

Red = Old SQL
Blue = Refactored SQL
image

I'll work on a larger benchmark - pre-populate with 50000 profiles and then compare the SQL queries.
The expectation is that the refactored version should be faster compared to the old version as the number of profiles increases.

Correction:
It looks like I severely misunderstood.
The benchmark, currently, isn't running in multi-tenant mode.
Apologies for that.

@rblaine95 rblaine95 force-pushed the chore/optimize-pg-sql branch 2 times, most recently from ab73e4a to b8c18d3 Compare November 10, 2023 10:16
@rblaine95
Copy link
Contributor Author

@swcurran, @andrewwhitehead, is there a channel on the Hyperledger Discord we could chat in?
I realized I may have been a little overly excited with my benchmark results earlier and that the tests/benchmarks aren't actually running in Multitenant mode. I'd like to figure out a way to pre-populate the DB with a number of profiles so that the benchmarking results are a little more appropriate.

@swcurran
Copy link
Contributor

There is an aries-askar channel on the Hyperledger Discord Server (invite: https://discord.gg/hyperledger) that I just created that we can use. Should have been created long ago :-).

@rblaine95 rblaine95 force-pushed the chore/optimize-pg-sql branch 3 times, most recently from 94f4d97 to 3d2d75f Compare November 17, 2023 09:08
@rblaine95 rblaine95 closed this Nov 17, 2023
@rblaine95 rblaine95 deleted the chore/optimize-pg-sql branch December 11, 2023 09:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants