Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sync: Avoid crash on invalid state events #3124

Closed
wants to merge 2 commits into from
Closed

Conversation

timokoesters
Copy link
Contributor

@timokoesters timokoesters commented Feb 13, 2024

The problem was in matrix_sdk_sqlite::state_store::SqliteStateStore save_changes, which saved the RoomInfo to the disk by serializing it.

  • Public API changes documented in changelogs (optional)

Fixes #2949

@timokoesters timokoesters requested a review from a team as a code owner February 13, 2024 15:24
@timokoesters timokoesters requested review from bnjbvr and removed request for a team February 13, 2024 15:24
@timokoesters
Copy link
Contributor Author

timokoesters commented Feb 13, 2024

The bad events will now be saved as None, which falls back to the default for the event type. For example JoinRule::_Custom("test") falls back to JoinRule::Public. This may or may not be the correct behavior...

Here is the code where it falls back to the default:

fn join_rule(&self) -> &JoinRule {
match &self.base_info.join_rules {
Some(MinimalStateEvent::Original(ev)) => &ev.content.join_rule,
_ => &JoinRule::Public,
}
}

Copy link

codecov bot commented Feb 13, 2024

Codecov Report

Attention: 8 lines in your changes are missing coverage. Please review.

Comparison is base (060eaef) 83.85% compared to head (39ea904) 83.83%.
Report is 33 commits behind head on main.

❗ Current head 39ea904 differs from pull request most recent head 6701933. Consider uploading reports for the commit 6701933 to get more accurate results

Files Patch % Lines
crates/matrix-sdk-base/src/rooms/mod.rs 55.55% 8 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3124      +/-   ##
==========================================
- Coverage   83.85%   83.83%   -0.03%     
==========================================
  Files         229      229              
  Lines       23706    23718      +12     
==========================================
+ Hits        19879    19884       +5     
- Misses       3827     3834       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@bnjbvr bnjbvr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing a fix. The approach isn't great, in the sense that the lib user might have been able to do something with the custom event, and we're losing it here.

Instead, could we try to serialize the Raw<Event>s in the BaseRoomInfo, have getters that automatically deserialize them, please? It shouldn't require a migration, in theory (because a serialized Raw<T> is the same as the serialize T, when the serialization mechanism is JSON). If it does require a migration, that would result in a logout in the EX apps, so it would be an easy way to try it out and make sure no migration is needed.

@@ -1778,6 +1823,65 @@ mod tests {
);
}

#[async_test]
async fn test_save_bad_joinrules() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is currently not testing what it's supposed to, because it runs in-memory instead of on sqlite or indexeddb. How can I change that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make_room below likely creates and configures the Client; you'd need another Client that is configured to use sqlite or indexeddb.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you address this comment? ^


for (room_id, redactions) in &changes.redactions {
// TODO: Load room version lazily, only if it is necessary
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the lazy loading to make it simpler while implementing, but I should add it back in. The approach from the sqlite make_room_version closure doesn't work here because it's async

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the last set of changes, I had to make the sqlite version non-async too, to allow passing the room version into the base_info.handle_redaction function.

@timokoesters
Copy link
Contributor Author

This PR is ready for another review. I've changed the fundamental approach, so now the raw events are stored in the database. I did a quick test using element x android (sqlite), but I could not test indexeddb yet and I haven't tested redactions yet.

The problem was in matrix_sdk_sqlite::state_store::SqliteStateStore save_changes, which saved the RoomInfo to the disk by serializing it.
Now the events are serialized as Raws using a separate struct.
@@ -1778,6 +1823,65 @@ mod tests {
);
}

#[async_test]
async fn test_save_bad_joinrules() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make_room below likely creates and configures the Client; you'd need another Client that is configured to use sqlite or indexeddb.

@@ -77,33 +78,44 @@ impl fmt::Display for DisplayName {
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct BaseRoomInfo {
/// The avatar URL of this room.
#[serde(skip_serializing)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of all those annotations, can we use something like #[serde(into = "RawRoomInfo", from = "RawRoomInfo")], so that the (de)serialization of this struct goes through that of RawRoomInfo instead, and we don't have to worry about redacting events?

@@ -116,6 +128,34 @@ pub struct BaseRoomInfo {
pub(crate) notable_tags: RoomNotableTags,
}

#[derive(Clone, Debug, Default, Serialize, Deserialize)]
pub struct RawRoomInfo {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: doc comment for the whole struct please

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, question: does it need to be public so that downstream implementations of the state store trait can reuse it? (If so, please indicate so in the comment.)

I wonder if it still needs that, with my suggestion above to use serde(from / into).

Comment on lines 1241 to 1242
// Clean up raw events we saved for roominfo
if room_infos.get(room_id).is_some() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this at all? 😨 Do we need something similar for edits? If we actually have to keep this, can we commonize the code with the indexeddb implementation?

@@ -101,6 +102,7 @@ mod keys {

pub const ROOM_STATE: &str = "room_state";
pub const ROOM_INFOS: &str = "room_infos";
pub const RAW_ROOM_INFOS: &str = "raw_room_infos";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you'd need a new table, in both cases; the justification being that serializing Raw<T> with serde_json is the same as serializing T with serde_json. So you should be able to reuse the same table.

@timokoesters timokoesters force-pushed the timo/badstate branch 3 times, most recently from 32a1fa3 to 7de60a1 Compare March 4, 2024 11:32
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 55.55556% with 8 lines in your changes are missing coverage. Please review.

Project coverage is 83.83%. Comparing base (59c468c) to head (39ea904).
Report is 49 commits behind head on main.

❗ Current head 39ea904 differs from pull request most recent head 17c4201. Consider uploading reports for the commit 17c4201 to get more accurate results

Files Patch % Lines
crates/matrix-sdk-base/src/rooms/mod.rs 55.55% 8 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3124      +/-   ##
==========================================
- Coverage   83.86%   83.83%   -0.03%     
==========================================
  Files         231      229       -2     
  Lines       23856    23718     -138     
==========================================
- Hits        20006    19884     -122     
+ Misses       3850     3834      -16     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@bnjbvr bnjbvr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the new version of the PR! Here's another round of comments. I'm a bit scared by the code duplication, so let's see if we can lower it a bit.

Comment on lines +202 to +203
if event.deserialize().is_ok_and(|e| e.event_id() == event_id) {
if let Ok(mut object) = event.deserialize_as() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we avoid deserializing twice here? Maybe call deserialize_as and check against the target event_id instead?

Comment on lines +187 to +192
if event
.as_ref()
.and_then(|a| a.deserialize().ok())
.is_some_and(|e| e.event_id() == event_id)
{
if let Ok(mut object) = event.as_ref().unwrap().deserialize_as() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, the event is deserialized twice here: can we deserialize it once with deserialize_as and then check the event_id matches later, instead?

.is_some_and(|e| e.event_id() == event_id)
{
if let Ok(mut object) = event.as_ref().unwrap().deserialize_as() {
if redact_in_place(&mut object, room_version, None).is_ok() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use a match and log the error message instead?

for event in map.values_mut() {
if event.deserialize().is_ok_and(|e| e.event_id() == event_id) {
if let Ok(mut object) = event.deserialize_as() {
if redact_in_place(&mut object, room_version, None).is_ok() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto here

Comment on lines +198 to +200
},
// Clean up rtc_members
&|map| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style nit: please let the code breath 😁

Suggested change
},
// Clean up rtc_members
&|map| {
},
// Clean up rtc_members
&|map| {

Comment on lines +1199 to +1200
Some(MinimalStateEvent::Redacted(ev)) => &ev.content.history_visibility,
None => &HistoryVisibility::WorldReadable,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation behind this change?

Comment on lines +1207 to +1208
Some(MinimalStateEvent::Redacted(ev)) => &ev.content.join_rule,
None => &JoinRule::Public,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto?

@@ -1778,6 +1823,65 @@ mod tests {
);
}

#[async_test]
async fn test_save_bad_joinrules() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you address this comment? ^

Comment on lines +637 to +639
if let Some(room_state_changes) = changes.state.get(room_id) {
raw_room_info.base_info.extend_with_changes(&room_state_changes);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems it would be much simpler to throw away the previous RawRoomInfo and convert the RoomInfo struct into a RawRoomInfo struct, instead of applying a diff of changes to the previously-serialized RawRoomInfo? That way maybe we wouldn't even need extend_with_changes at all.


// Clean up raw events we saved for roominfo
if changes.room_infos.get(room_id).is_some() {
// This roominfo has changed, so let's update the raw events
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why we need this :/ It seems that a redaction would cause a change to the RoomInfo, so we'd convert it into a RawRoomInfo before persisting it, and we wouldn't need any special handling of redacted events?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think I understand it now. So as I've suggested in a private channel, the duplication of the code applying the redaction to events both in BaseRoomInfo and RawRoomInfo is a bit problematic, and a hindrance in the long term. I wonder if we could regenerate a BaseRoomInfo from the RawRoomInfo every time we received an update to the RawRoomInfo, instead? That would keep the redaction code in a single place and avoid duplication of concerns.

@bnjbvr
Copy link
Member

bnjbvr commented May 2, 2024

This wasn't the path we wanted to take with this, so closing, we can revisit the approach later, as we still have an issue open describing the issue.

@bnjbvr bnjbvr closed this May 2, 2024
@poljar poljar deleted the timo/badstate branch November 5, 2024 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Not robust against invalid state event
3 participants