Reduce complexity of database interactions #423

sdsantos · 2025-01-27T18:15:50Z

Reference ooni/probe#2024

Simplifications:

Drop the urls and networks tables
Drop the results table
Make the creation of db measurement entries stateless

The text was updated successfully, but these errors were encountered:

sdsantos · 2025-01-27T18:26:30Z

Drop the urls and networks tables

The Url and Network tables can be dropped and their columns moved into the Measurement table.

Advantages:

A lot less joins should make queries faster

Disadvantages:

More space used, due to duplications:
- URLs are usually tested multiple times across time.
- Networks don't vary much across time, and they used to be linked to Results, not Measurements.
We currently show the same Network for a result, but this would mean there could be multiple Networks for the same result_id.

sdsantos · 2025-01-27T18:33:23Z

Drop the results table

Going column by column of the Result table:

id (result_id on measurement)
descriptor_name / descriptor_runId / descriptor_revision (moved to measurement)
start_time (could be fetched from the first measurement)
is_viewed (could be moved to the measurement and aggregated: only true if true for all measurements of the same result)
data_usage_up / data_usage_down (needs to be moved to the measurement and summed)
failure_msg (not used at the moment, can be dropped)
network_id (we're planning to drop the network table, meaning measurements will need to store the network info)
task_origin (moved to the measurement)
is_done
Used for:
- knowing when we can re-run a websites result
- showing a loading animation when result is ongoing
- showing a result has an error because it's done but empty of measurements
Just checking if measurements are done is not enough to deduce this state, because there are moments between one measurement being done and another one being created.

Summary, most columns are solvable, only is_done is more tricky, but it's not necessary for core features.

When we're showing the results list, we're already aggregating information from all the measurements that are part of a result. So going straight to the measurements table could have a slightly performance improvement. Space should increase due to the duplicated information. And some care will be needed to ensure information is coherent (for example, the same descriptor and task origin are used for the same result_id).

sdsantos · 2025-01-27T18:37:10Z

Make the creation of db measurement entries stateless

Right now we show the ongoing Measurement inside the Results screen. We would need to drop that feature.
We also update the fields is_uploaded, is_upload_failed and upload_failure_msg when we retry to upload a measurement. But we could create a new entry instead, when that happens.

Nevertheless, SQLdelight handles well changes in values inside DB rows, and even provides a reactive stream of the row state changes. So this simplification may not gain us much in terms of complexity.

sdsantos moved this to Icebox in Roadmap Jan 27, 2025

sdsantos added this to Roadmap Jan 27, 2025

hellais removed this from Roadmap Jan 29, 2025

hellais added the funder/otffoss2025 label Jan 30, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reduce complexity of database interactions #423

Reduce complexity of database interactions #423

sdsantos commented Jan 27, 2025

sdsantos commented Jan 27, 2025 •

edited

Loading

sdsantos commented Jan 27, 2025 •

edited

Loading

sdsantos commented Jan 27, 2025 •

edited

Loading

Reduce complexity of database interactions #423

Reduce complexity of database interactions #423

Comments

sdsantos commented Jan 27, 2025

sdsantos commented Jan 27, 2025 • edited Loading

Drop the urls and networks tables

sdsantos commented Jan 27, 2025 • edited Loading

Drop the results table

sdsantos commented Jan 27, 2025 • edited Loading

Make the creation of db measurement entries stateless

sdsantos commented Jan 27, 2025 •

edited

Loading

sdsantos commented Jan 27, 2025 •

edited

Loading

sdsantos commented Jan 27, 2025 •

edited

Loading