-
Notifications
You must be signed in to change notification settings - Fork 252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sync: Avoid crash on invalid state events #3124
Conversation
The bad events will now be saved as None, which falls back to the default for the event type. For example JoinRule::_Custom("test") falls back to JoinRule::Public. This may or may not be the correct behavior... Here is the code where it falls back to the default: matrix-rust-sdk/crates/matrix-sdk-base/src/rooms/normal.rs Lines 1141 to 1146 in 7493176
|
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #3124 +/- ##
==========================================
- Coverage 83.85% 83.83% -0.03%
==========================================
Files 229 229
Lines 23706 23718 +12
==========================================
+ Hits 19879 19884 +5
- Misses 3827 3834 +7 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for writing a fix. The approach isn't great, in the sense that the lib user might have been able to do something with the custom event, and we're losing it here.
Instead, could we try to serialize the Raw<Event>
s in the BaseRoomInfo
, have getters that automatically deserialize them, please? It shouldn't require a migration, in theory (because a serialized Raw<T>
is the same as the serialize T
, when the serialization mechanism is JSON). If it does require a migration, that would result in a logout in the EX apps, so it would be an easy way to try it out and make sure no migration is needed.
39ea904
to
915974d
Compare
@@ -1778,6 +1823,65 @@ mod tests { | |||
); | |||
} | |||
|
|||
#[async_test] | |||
async fn test_save_bad_joinrules() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is currently not testing what it's supposed to, because it runs in-memory instead of on sqlite or indexeddb. How can I change that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make_room
below likely creates and configures the Client
; you'd need another Client
that is configured to use sqlite or indexeddb.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you address this comment? ^
915974d
to
f34b593
Compare
|
||
for (room_id, redactions) in &changes.redactions { | ||
// TODO: Load room version lazily, only if it is necessary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the lazy loading to make it simpler while implementing, but I should add it back in. The approach from the sqlite make_room_version closure doesn't work here because it's async
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the last set of changes, I had to make the sqlite version non-async too, to allow passing the room version into the base_info.handle_redaction function.
f34b593
to
6701933
Compare
This PR is ready for another review. I've changed the fundamental approach, so now the raw events are stored in the database. I did a quick test using element x android (sqlite), but I could not test indexeddb yet and I haven't tested redactions yet. |
The problem was in matrix_sdk_sqlite::state_store::SqliteStateStore save_changes, which saved the RoomInfo to the disk by serializing it. Now the events are serialized as Raws using a separate struct.
@@ -1778,6 +1823,65 @@ mod tests { | |||
); | |||
} | |||
|
|||
#[async_test] | |||
async fn test_save_bad_joinrules() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make_room
below likely creates and configures the Client
; you'd need another Client
that is configured to use sqlite or indexeddb.
@@ -77,33 +78,44 @@ impl fmt::Display for DisplayName { | |||
#[derive(Clone, Debug, Serialize, Deserialize)] | |||
pub struct BaseRoomInfo { | |||
/// The avatar URL of this room. | |||
#[serde(skip_serializing)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of all those annotations, can we use something like #[serde(into = "RawRoomInfo", from = "RawRoomInfo")]
, so that the (de)serialization of this struct goes through that of RawRoomInfo
instead, and we don't have to worry about redacting events?
@@ -116,6 +128,34 @@ pub struct BaseRoomInfo { | |||
pub(crate) notable_tags: RoomNotableTags, | |||
} | |||
|
|||
#[derive(Clone, Debug, Default, Serialize, Deserialize)] | |||
pub struct RawRoomInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: doc comment for the whole struct please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, question: does it need to be public so that downstream implementations of the state store trait can reuse it? (If so, please indicate so in the comment.)
I wonder if it still needs that, with my suggestion above to use serde(from / into)
.
// Clean up raw events we saved for roominfo | ||
if room_infos.get(room_id).is_some() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this at all? 😨 Do we need something similar for edits? If we actually have to keep this, can we commonize the code with the indexeddb implementation?
@@ -101,6 +102,7 @@ mod keys { | |||
|
|||
pub const ROOM_STATE: &str = "room_state"; | |||
pub const ROOM_INFOS: &str = "room_infos"; | |||
pub const RAW_ROOM_INFOS: &str = "raw_room_infos"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you'd need a new table, in both cases; the justification being that serializing Raw<T>
with serde_json is the same as serializing T
with serde_json. So you should be able to reuse the same table.
32a1fa3
to
7de60a1
Compare
7de60a1
to
17c4201
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3124 +/- ##
==========================================
- Coverage 83.86% 83.83% -0.03%
==========================================
Files 231 229 -2
Lines 23856 23718 -138
==========================================
- Hits 20006 19884 -122
+ Misses 3850 3834 -16 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the new version of the PR! Here's another round of comments. I'm a bit scared by the code duplication, so let's see if we can lower it a bit.
if event.deserialize().is_ok_and(|e| e.event_id() == event_id) { | ||
if let Ok(mut object) = event.deserialize_as() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we avoid deserializing twice here? Maybe call deserialize_as
and check against the target event_id
instead?
if event | ||
.as_ref() | ||
.and_then(|a| a.deserialize().ok()) | ||
.is_some_and(|e| e.event_id() == event_id) | ||
{ | ||
if let Ok(mut object) = event.as_ref().unwrap().deserialize_as() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto, the event is deserialized twice here: can we deserialize it once with deserialize_as
and then check the event_id
matches later, instead?
.is_some_and(|e| e.event_id() == event_id) | ||
{ | ||
if let Ok(mut object) = event.as_ref().unwrap().deserialize_as() { | ||
if redact_in_place(&mut object, room_version, None).is_ok() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use a match and log the error message instead?
for event in map.values_mut() { | ||
if event.deserialize().is_ok_and(|e| e.event_id() == event_id) { | ||
if let Ok(mut object) = event.deserialize_as() { | ||
if redact_in_place(&mut object, room_version, None).is_ok() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto here
}, | ||
// Clean up rtc_members | ||
&|map| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style nit: please let the code breath 😁
}, | |
// Clean up rtc_members | |
&|map| { | |
}, | |
// Clean up rtc_members | |
&|map| { |
Some(MinimalStateEvent::Redacted(ev)) => &ev.content.history_visibility, | ||
None => &HistoryVisibility::WorldReadable, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the motivation behind this change?
Some(MinimalStateEvent::Redacted(ev)) => &ev.content.join_rule, | ||
None => &JoinRule::Public, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto?
@@ -1778,6 +1823,65 @@ mod tests { | |||
); | |||
} | |||
|
|||
#[async_test] | |||
async fn test_save_bad_joinrules() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you address this comment? ^
if let Some(room_state_changes) = changes.state.get(room_id) { | ||
raw_room_info.base_info.extend_with_changes(&room_state_changes); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems it would be much simpler to throw away the previous RawRoomInfo
and convert the RoomInfo
struct into a RawRoomInfo
struct, instead of applying a diff of changes to the previously-serialized RawRoomInfo
? That way maybe we wouldn't even need extend_with_changes
at all.
|
||
// Clean up raw events we saved for roominfo | ||
if changes.room_infos.get(room_id).is_some() { | ||
// This roominfo has changed, so let's update the raw events |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we need this :/ It seems that a redaction would cause a change to the RoomInfo
, so we'd convert it into a RawRoomInfo
before persisting it, and we wouldn't need any special handling of redacted events?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I think I understand it now. So as I've suggested in a private channel, the duplication of the code applying the redaction to events both in BaseRoomInfo
and RawRoomInfo
is a bit problematic, and a hindrance in the long term. I wonder if we could regenerate a BaseRoomInfo
from the RawRoomInfo
every time we received an update to the RawRoomInfo
, instead? That would keep the redaction code in a single place and avoid duplication of concerns.
This wasn't the path we wanted to take with this, so closing, we can revisit the approach later, as we still have an issue open describing the issue. |
The problem was in matrix_sdk_sqlite::state_store::SqliteStateStore save_changes, which saved the RoomInfo to the disk by serializing it.
Fixes #2949