This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Update backward extremity docs to make it clear that it does not indicate whether we have fetched an event's prev_events #11469

Merged
merged 6 commits into from
Dec 4, 2021
1 change: 1 addition & 0 deletions changelog.d/11469.doc
@@ -0,0 +1 @@
+Update section about backward extremities in the room DAG concepts doc to correct the misconception about backward extremities indicating whether we have fetched an events' `prev_events`.
13 changes: 5 additions & 8 deletions docs/development/room-dag-concepts.md
@@ -38,16 +38,14 @@ Most-recent-in-time events in the DAG which are not referenced by any other even
The forward extremities of a room are used as the `prev_events` when the next event is sent.


-## Backwards extremity
+## Backward extremity

The current marker of where we have backfilled up to and will generally be the
oldest-in-time events we know of in the DAG.
Member


... or rather, the prev_events of the oldest-in-time events we know of.

Member


would it be worth adding something like "Gives the starting point when backfilling history"?

Contributor Author


> ... or rather, the prev_events of the oldest-in-time events we know of.

They are the oldest-in-time events we know of, but sometimes we only know their `event_id`s. Any suggestions for better phrasing?


-This is an event where we haven't fetched all of the `prev_events` for.
-
-Once we have fetched all of its `prev_events`, it's unmarked as a backwards
-extremity (although we may have formed new backwards extremities from the prev
-events during the backfilling process).
Contributor Author

@MadLittleMods Dec 1, 2021


These statements are not true according to the code. The actual behavior is described in the new part of the diff 👉

Relevant code:

    def _update_backward_extremeties(self, txn, events):
        """Updates the event_backward_extremities tables based on the new/updated
        events being persisted.

        This is called for new events *and* for events that were outliers, but
        are now being persisted as non-outliers.

        Forward extremities are handled when we first start persisting the events.
        """
        # From the events passed in, add all of the prev events as backwards extremities.
        # Ignore any events that are already backwards extrems or outliers.
        query = (
            "INSERT INTO event_backward_extremities (event_id, room_id)"
            " SELECT ?, ? WHERE NOT EXISTS ("
            "   SELECT 1 FROM event_backward_extremities"
            "   WHERE event_id = ? AND room_id = ?"
            " )"
            " AND NOT EXISTS ("
            "   SELECT 1 FROM events WHERE event_id = ? AND room_id = ? "
            "   AND outlier = ?"
            " )"
        )

        txn.execute_batch(
            query,
            [
                (e_id, ev.room_id, e_id, ev.room_id, e_id, ev.room_id, False)
                for ev in events
                for e_id in ev.prev_event_ids()
                if not ev.internal_metadata.is_outlier()
            ],
        )

        # Delete all these events that we've already fetched and now know that their
        # prev events are the new backwards extremeties.
        query = (
            "DELETE FROM event_backward_extremities"
            " WHERE event_id = ? AND room_id = ?"
        )
        txn.execute_batch(
            query,
            [
                (ev.event_id, ev.room_id)
                for ev in events
                if not ev.internal_metadata.is_outlier()
            ],
        )

+When we persist a non-outlier event, we clear it as a backward extremity and set
+all of its `prev_events` as the new backward extremities if they aren't already
+persisted in the `events` table.
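
The bookkeeping described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not Synapse's implementation: `Event`, `RoomStore` and `persist` are hypothetical names, a dict stands in for the `events` table, a set stands in for `event_backward_extremities`, and rooms are ignored. Following the SQL quoted in the review comment, a `prev_event` that is only stored as an outlier is treated as not yet persisted.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    # Hypothetical, simplified stand-in for a room event.
    event_id: str
    prev_event_ids: list[str]
    outlier: bool = False


@dataclass
class RoomStore:
    # Stand-in for the `events` table: event_id -> Event.
    events: dict[str, Event] = field(default_factory=dict)
    # Stand-in for the `event_backward_extremities` table.
    backward_extremities: set[str] = field(default_factory=set)

    def persist(self, event: Event) -> None:
        self.events[event.event_id] = event
        if event.outlier:
            # Persisting an outlier leaves the backward extremities untouched.
            return
        for prev_id in event.prev_event_ids:
            known = self.events.get(prev_id)
            # Mark the prev event unless it is already persisted as a non-outlier.
            if known is None or known.outlier:
                self.backward_extremities.add(prev_id)
        # The event itself is now persisted, so it stops being a backward extremity.
        self.backward_extremities.discard(event.event_id)


store = RoomStore()
store.persist(Event("$C", ["$B"]))   # backward extremities: {"$B"}
store.persist(Event("$B", ["$A"]))   # backward extremities: {"$A"}
store.persist(Event("$A", [], outlier=True))  # still {"$A"}: outliers change nothing
print(store.backward_extremities)
```

Note that `$B` stops being a backward extremity as soon as it is persisted, even though its own `prev_event` `$A` has not been fetched as a non-outlier. The set only tracks which referenced events are still missing.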


## Outliers
@@ -56,8 +54,7 @@ We mark an event as an `outlier` when we haven't figured out the state for the
room at that point in the DAG yet.

We won't *necessarily* have the `prev_events` of an `outlier` in the database,
-but it's entirely possible that we *might*. The status of whether we have all of
-the `prev_events` is marked as a [backwards extremity](#backwards-extremity).
+but it's entirely possible that we *might*.

For example, when we fetch the event auth chain or state for a given event, we
mark all of those claimed auth events as outliers because we haven't done the