feat(sdk): Event cache experimental store: LinkedChunk #3166

Merged
merged 2 commits into matrix-org:main from feat-sdk-event-cache-store-experimental on Mar 14, 2024

Conversation

Hywan
Member

@Hywan Hywan commented Feb 26, 2024

This PR is a work-in-progress. It explores an experimental data structure to store events in an efficient way.

Note: in this comment, I will use the term store to mean database or storage.

The biggest constraint is the following: events can be ordered in multiple ways, either in topological order or in sync order. The problem is that, when syncing events (with /sync), or when fetching events (with /messages), we don't know how to order the newly received events relative to the already downloaded events. A reconciliation algorithm must be written (see #3058). However, from the “storage” point of view, events must be read, written and re-ordered efficiently.

Ordering index

The simplest approach would be to use an ordering_index. Every time a new event is inserted, it takes the ordering index of the last event, increments it by one, and done.

However, inserting a new event in the middle of existing events would shift all events on one side of the insertion point: given a, b, c, d, e, f with f being the most recent event, if g needs to be inserted between b and c, then c, d, e, f's ordering positions need to be shifted. That's not optimal at all as it would imply a lot of updates in the store.

Example of a relational database:

| ordering_index | event |
|----------------|-------|
| 0              | `a`   |
| 1              | `b`   |
| 2              | `g`   |
| 3              | `c`   |
| …              | …     |

An insertion can be O(n), and it can happen more frequently than one might think. Let's imagine a permalink to an old message: the user opens it, a couple of events are fetched (with /messages), and these events must be inserted in the store, thus potentially shifting a lot of existing events. Another example: imagine the SDK has a search API for events; as long as no search result is found, the SDK will back-paginate until reaching the beginning of the room; every time there is a back-pagination, a block of events will be inserted, and there are more and more events to shift at each back-pagination.
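To make the shifting cost concrete, here is a minimal in-memory sketch of the ordering-index approach (illustrative only, not code from this PR): inserting in the middle forces one update per shifted row.

```rust
/// Minimal in-memory model of the `ordering_index` approach: each row is
/// `(ordering_index, event)`, as in the table above.
struct OrderingIndexStore {
    rows: Vec<(u64, char)>,
}

impl OrderingIndexStore {
    /// Insert `event` so that it gets `ordering_index == at`. Every row at or
    /// after `at` must be rewritten: one UPDATE per shifted row, i.e. O(n).
    fn insert_at(&mut self, at: u64, event: char) {
        for (index, _) in self.rows.iter_mut() {
            if *index >= at {
                *index += 1;
            }
        }
        self.rows.push((at, event));
        self.rows.sort_by_key(|(index, _)| *index);
    }
}

fn main() {
    let mut store = OrderingIndexStore {
        rows: vec![(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e'), (5, 'f')],
    };

    // Inserting `g` between `b` and `c` shifts `c`, `d`, `e` and `f`.
    store.insert_at(2, 'g');

    assert_eq!(
        store.rows,
        vec![(0, 'a'), (1, 'b'), (2, 'g'), (3, 'c'), (4, 'd'), (5, 'e'), (6, 'f')]
    );
}
```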

Linked list

OK, let's forget the ordering_index. Let's use a linked list, then? Each event has a link to the previous and to the next event.

Inserting an event would be at worst O(3) in this case: if the previous event exists, it must be updated; if the next event exists, it must be updated; finally, the new event itself is inserted.

Example with a relational database:

| previous | id      | event | next    |
|----------|---------|-------|---------|
| null     | `id(a)` | `a`   | `id(b)` |
| `id(a)`  | `id(b)` | `b`   | `id(c)` |
| `id(b)`  | `id(c)` | `c`   | null    |

This approach ensures fast writing, but terribly slow reading. Indeed, reading N events requires N queries in the store. Events aren't contiguous in the store, and cannot be ordered by the database engine (e.g. with ORDER BY for SQL-based databases). So it really requires one query per event. That's a no-go.
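As a rough illustration of that read cost (again a sketch, not this PR's code): reading events stored as a linked list means following one `next` pointer, i.e. one store lookup, per event.

```rust
use std::collections::HashMap;

/// One row of the linked-list table above (the `previous` column is omitted
/// here, since this sketch only reads forward).
struct Row {
    event: char,
    next: Option<u64>,
}

/// Reading n events requires n lookups ("queries"): rows can only be reached
/// one at a time by following the `next` pointer.
fn read_forward(store: &HashMap<u64, Row>, first: u64) -> (Vec<char>, usize) {
    let mut events = Vec::new();
    let mut queries = 0;
    let mut current = Some(first);

    while let Some(id) = current {
        queries += 1; // one store query per event
        let row = &store[&id];
        events.push(row.event);
        current = row.next;
    }

    (events, queries)
}

fn main() {
    let store = HashMap::from([
        (0, Row { event: 'a', next: Some(1) }),
        (1, Row { event: 'b', next: Some(2) }),
        (2, Row { event: 'c', next: None }),
    ]);

    let (events, queries) = read_forward(&store, 0);
    assert_eq!(events, vec!['a', 'b', 'c']);
    assert_eq!(queries, 3); // O(n) queries to read n events
}
```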

What about gaps?

In the two scenarios above, another problem arises: how do we represent a gap? Indeed, when new events are synced (via /sync), the response sometimes contains a limited flag, which means that the results are partial.

Let's take the following example: the store contains a, b, c. After a long offline period (during which the room has been pretty active), a sync is started, which provides the following events: x, y, z + the limited flag. The app is killed and reopened later. The event cache store will contain a, b, c, x, y, z. How do we know that there is a hole/a gap between c and x? This is important information! When z, y and x are displayed, and the user would like to scroll up, the SDK must know that it must back-paginate before providing c, b and a.

So the data structure we use must also represent gaps. This information is also crucial for the events reconciliation algorithm.

Proposal

What about a mix between the two? Here is Linked Chunk.

A linked chunk is like a linked list, except that each node is either a Gap or an Items. A Gap contains nothing; it's just a gap. An Items contains several events. A node is called a Chunk. A chunk has a maximum size, which is called a capacity. When a chunk is full, a new chunk is created and linked appropriately. Inside a chunk, an ordering index is used to order events. At this point, it becomes a trade-off to find the appropriate chunk size to balance the performance between reading and writing. Nonetheless, if the chunk size is 50, then reading events is 50 times more efficient with a linked chunk than with a linked list, and writing events is at worst O(49), compared to the O(n - 1) of the ordering index.
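To make the shape of the structure concrete, here is a heavily simplified in-memory sketch, assuming a capacity of 3 for readability. The names and code below are illustrative only; the actual implementation lives in `crates/matrix-sdk/src/event_cache/linked_chunk.rs` and differs in its details (chunk identifiers, iterators, error handling).

```rust
/// Maximum number of items per chunk (3 here to keep the example small).
const CAPACITY: usize = 3;

enum ChunkContent<T> {
    /// A hole in the timeline: events exist on the server but haven't been
    /// fetched yet (e.g. after a `limited` sync response).
    Gap,
    /// Up to `CAPACITY` items; the position in the `Vec` is the per-chunk
    /// ordering index.
    Items(Vec<T>),
}

/// A node of the linked chunk. A singly-linked version is enough for this
/// sketch; the real structure also knows its previous chunk.
struct Chunk<T> {
    content: ChunkContent<T>,
    next: Option<Box<Chunk<T>>>,
}

impl<T> Chunk<T> {
    /// Append an item at the very end of the linked chunk, opening a new
    /// `Items` chunk when the last chunk is full (or is a gap).
    fn push_back(&mut self, item: T) {
        match self.next {
            // Not the last chunk yet: recurse to the end.
            Some(ref mut next) => next.push_back(item),
            None => match &mut self.content {
                // The last chunk still has room: append inside it.
                ChunkContent::Items(items) if items.len() < CAPACITY => items.push(item),
                // The last chunk is full, or is a gap: link a new chunk.
                _ => {
                    self.next = Some(Box::new(Chunk {
                        content: ChunkContent::Items(vec![item]),
                        next: None,
                    }));
                }
            },
        }
    }
}

fn main() {
    let mut timeline = Chunk { content: ChunkContent::Items(Vec::new()), next: None };
    for event in ['a', 'b', 'c', 'd'] {
        timeline.push_back(event);
    }
    // With a capacity of 3, pushing `d` opened a second `Items` chunk.
    assert!(timeline.next.is_some());

    // A `limited` sync would append a `Gap` chunk, followed by an `Items`
    // chunk holding the newly synced events (not implemented in this sketch).
    let _gap = ChunkContent::<char>::Gap;
}
```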

Example with a relational database. First table is events, second table is chunks.

| chunk id | index | event |
|----------|-------|-------|
| `$0`     | 0     | `a`   |
| `$0`     | 1     | `b`   |
| `$0`     | 2     | `c`   |
| `$0`     | 3     | `d`   |
| `$2`     | 0     | `e`   |
| `$2`     | 1     | `f`   |
| `$2`     | 2     | `g`   |
| `$2`     | 3     | `h`   |

| chunk id | type  | previous | next |
|----------|-------|----------|------|
| `$0`     | items | null     | `$1` |
| `$1`     | gap   | `$0`     | `$2` |
| `$2`     | items | `$1`     | null |

Reading the last chunk consists of reading all events where the chunk_id is $2, for example; it contains events e, f, g and h. We can sort them easily by using the event_index column. The previous chunk is a gap. The chunk before that gap contains events a, b, c and d.

Being able to read events by chunk clearly limits the amount of reading and writing in the store. It is also close to how the store will really be used in practice. It also makes it possible to represent gaps. We can replace a gap with a new chunk pretty easily, with few writes.
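As a sketch of the read path against the two tables above (with hypothetical row types, not this PR's storage code): fetching one chunk is a single filtered query, and sorting by the per-chunk index restores the order.

```rust
/// One row of the hypothetical `events` table above.
#[derive(Clone)]
struct EventRow {
    chunk_id: u32,
    index: u32,
    event: char,
}

/// Roughly "SELECT * FROM events WHERE chunk_id = ? ORDER BY index": one
/// query returns a whole chunk, and the database can sort it for us.
fn read_chunk(events: &[EventRow], chunk_id: u32) -> Vec<char> {
    let mut rows: Vec<EventRow> = events
        .iter()
        .filter(|row| row.chunk_id == chunk_id)
        .cloned()
        .collect();
    rows.sort_by_key(|row| row.index);
    rows.into_iter().map(|row| row.event).collect()
}

fn main() {
    // The `events` table from the example: chunks `$0` and `$2`, separated by
    // the gap chunk `$1` (which has no event rows).
    let events = vec![
        EventRow { chunk_id: 0, index: 0, event: 'a' },
        EventRow { chunk_id: 0, index: 1, event: 'b' },
        EventRow { chunk_id: 0, index: 2, event: 'c' },
        EventRow { chunk_id: 0, index: 3, event: 'd' },
        EventRow { chunk_id: 2, index: 0, event: 'e' },
        EventRow { chunk_id: 2, index: 1, event: 'f' },
        EventRow { chunk_id: 2, index: 2, event: 'g' },
        EventRow { chunk_id: 2, index: 3, event: 'h' },
    ];

    // Reading the last chunk (`$2`) is a single filtered, sorted read.
    assert_eq!(read_chunk(&events, 2), vec!['e', 'f', 'g', 'h']);
}
```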

A summary:

| Data structure | Reading           | Writing         |
|----------------|-------------------|-----------------|
| Ordering index | “O(1)”[^1] (fast) | O(n - 1) (slow) |
| Linked list    | O(n) (slow)       | O(3) (fast)     |
| Linked chunk   | O(n / capacity)   | O(capacity - 1) |

Implementation

This PR contains a draft implementation of a linked chunk. It will strictly only contain the API required for the EventCache; understand that it is not designed as a generic data structure type.


[^1]: O(1) because it's simply one query to run; the database engine does the sorting for us in a very efficient way, particularly if the `ordering_index` is an unsigned integer.

@Hywan Hywan force-pushed the feat-sdk-event-cache-store-experimental branch from a2fc51a to eeeca2d on February 28, 2024 08:48
@Hywan Hywan mentioned this pull request Feb 28, 2024
2 tasks
@Hywan Hywan force-pushed the feat-sdk-event-cache-store-experimental branch 2 times, most recently from 7a00876 to 3b4ef21 on February 29, 2024 12:41
@Hywan Hywan force-pushed the feat-sdk-event-cache-store-experimental branch from 3470f18 to 9b815dd on March 11, 2024 12:39
Member

@bnjbvr bnjbvr left a comment


Great stuff! A first round of comments here. I'm mostly worried about the naming conventions for the iterators, as I'm 100% sure I'm going to get them wrong; let's see if the renaming proposal below makes sense. Thanks for all the tests <3

@Hywan Hywan force-pushed the feat-sdk-event-cache-store-experimental branch from 1d2dc08 to 4ec21e8 on March 13, 2024 15:36
@Hywan Hywan marked this pull request as ready for review March 14, 2024 10:14
@Hywan Hywan requested a review from a team as a code owner March 14, 2024 10:14
@Hywan Hywan requested review from bnjbvr and removed request for a team March 14, 2024 10:14
@Hywan Hywan force-pushed the feat-sdk-event-cache-store-experimental branch from bf60cf0 to 743b4fd on March 14, 2024 13:04
Member

@bnjbvr bnjbvr left a comment


Thanks for renaming the iterators 🫶

LGTM, let's ship this so we can move forward with the integration!

Comment on lines +251 to +256
```rust
        items: I,
        chunk_identifier: ChunkIdentifier,
    ) -> Result<(), LinkedChunkError>
    where
        I: IntoIterator<Item = T>,
        I::IntoIter: ExactSizeIterator,
```
Member


(Nothing to action here, just a FYI: note we may need something slightly different, since the back-pagination replaces a gap with the events returned by the back-pagination and a previous gap, if there was another previous-token in the back-pagination response.)

Member Author


Well, we can use replace_gap_at + insert_gap_at then?
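To illustrate what that combination amounts to, here is a minimal sketch using a `Vec` of chunk contents as a stand-in for the linked structure; the `resolve_gap` helper and its signature are illustrative, not this PR's API.

```rust
/// Simplified, `Vec`-based stand-in for the linked chunk, only to illustrate
/// resolving a gap after a back-pagination.
#[derive(Debug, PartialEq)]
enum ChunkContent {
    Gap,
    Items(Vec<char>),
}

/// Replace the gap at `position` with the events returned by the
/// back-pagination and, if the response carried another previous-token, keep
/// a fresh gap before them (cf. `replace_gap_at` + `insert_gap_at`).
fn resolve_gap(
    chunks: &mut Vec<ChunkContent>,
    position: usize,
    events: Vec<char>,
    more_to_paginate: bool,
) {
    assert_eq!(chunks[position], ChunkContent::Gap, "can only resolve a gap");

    chunks[position] = ChunkContent::Items(events);
    if more_to_paginate {
        chunks.insert(position, ChunkContent::Gap);
    }
}

fn main() {
    // `a`, `b`, `c`, then a gap created by a `limited` sync, then `x`, `y`, `z`.
    let mut chunks = vec![
        ChunkContent::Items(vec!['a', 'b', 'c']),
        ChunkContent::Gap,
        ChunkContent::Items(vec!['x', 'y', 'z']),
    ];

    // Back-paginating from the gap returns `u`, `v`, `w` plus another
    // previous-token: the gap is filled, but a new gap is kept before the
    // freshly inserted events.
    resolve_gap(&mut chunks, 1, vec!['u', 'v', 'w'], true);

    assert_eq!(
        chunks,
        vec![
            ChunkContent::Items(vec!['a', 'b', 'c']),
            ChunkContent::Gap,
            ChunkContent::Items(vec!['u', 'v', 'w']),
            ChunkContent::Items(vec!['x', 'y', 'z']),
        ]
    );
}
```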

@Hywan Hywan force-pushed the feat-sdk-event-cache-store-experimental branch from 743b4fd to 4cc13a9 on March 14, 2024 13:10

codecov bot commented Mar 14, 2024

Codecov Report

Attention: Patch coverage is 77.81570% with 65 lines in your changes missing coverage. Please review.

Project coverage is 83.69%. Comparing base (7718f90) to head (e8cf6dc).
Report is 2 commits behind head on main.

| Files | Patch % | Lines |
|-------|---------|-------|
| crates/matrix-sdk/src/event_cache/linked_chunk.rs | 86.36% | 36 Missing ⚠️ |
| crates/matrix-sdk/src/event_cache/store.rs | 0.00% | 29 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3166      +/-   ##
==========================================
- Coverage   83.76%   83.69%   -0.08%     
==========================================
  Files         234      235       +1     
  Lines       24174    24467     +293     
==========================================
+ Hits        20249    20477     +228     
- Misses       3925     3990      +65     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Hywan Hywan force-pushed the feat-sdk-event-cache-store-experimental branch 2 times, most recently from 12cd505 to 97ccf5c on March 14, 2024 13:58
@Hywan Hywan force-pushed the feat-sdk-event-cache-store-experimental branch from 9596e74 to e8cf6dc on March 14, 2024 14:09
@Hywan Hywan merged commit 0a7e28f into matrix-org:main Mar 14, 2024
34 checks passed
@Hywan Hywan changed the title from feat(sdk): Event cache experimental store to feat(sdk): Event cache experimental store: LinkedChunk on Mar 21, 2024