Generate RPC endpoint documentation for api.oxen.io #478
base: dev
Conversation
The `_t` suffix denotes a typedef (e.g. `size_t`, `uint64_t`, `std::enable_if_t`, but not `unsigned long` or `std::enable_if`). In storage server the suffix was wrongly used on some concrete types (`user_pubkey_t`, `all_stats_t`, `peer_stats_t`). This fixes it, dropping the `_t` suffix from the problematic structs.
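For illustration, a minimal sketch of the convention (hypothetical definitions, not copied from the storage server source):

```cpp
#include <cstdint>
#include <string>

// `_t` marks a type alias (a typedef of some other type):
using swarm_id_t = std::uint64_t;

// A concrete struct gets a plain name; previously these were misnamed
// `user_pubkey_t`, `all_stats_t`, and `peer_stats_t`:
struct user_pubkey {
    std::string raw;  // encoded key bytes (illustrative field only)
};
```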
- bump the minimum required hard fork to the current live version
- remove some feature guards for previous HF versions
- remove now-unused code
Thank you for writing the document! I think we need to be very careful when utilizing a deterministic blinding mechanism like this.

According to https://oxen.caliban.org/operator/ranking, the foundation controls 170+ nodes, while a significant number of other operators manage between 10 and 120+ nodes. These operators have access to a significant number of Session identity public keys, covering from 5% to over 40% of the Session ID partition space across a total of 260+ swarms. If operators decide to scan their nodes and extract Session IDs, coupled with the closed-group public keys observed from their respective nodes, they could essentially replicate the deterministic blinding mechanism: by running through all possible combinations of scraped Session IDs and observed closed-group keys, they could recover which Session IDs belong to which groups.
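To make the concern concrete, here is a minimal sketch (my own illustration, not actual Session or storage-server code; `blind_for_group` is a placeholder hash standing in for whatever the real deterministic derivation is) of how a malicious operator could correlate scraped Session keys with blinded group members observed on their nodes:

```cpp
#include <sodium.h>

#include <algorithm>
#include <array>
#include <set>
#include <utility>
#include <vector>

using pubkey = std::array<unsigned char, 32>;

// Placeholder for the deterministic blinding; NOT the real derivation.
pubkey blind_for_group(const pubkey& session_pk, const pubkey& group_pk) {
    std::array<unsigned char, 64> input;
    std::copy(session_pk.begin(), session_pk.end(), input.begin());
    std::copy(group_pk.begin(), group_pk.end(), input.begin() + 32);
    pubkey out;
    crypto_generichash(out.data(), out.size(), input.data(), input.size(), nullptr, 0);
    return out;
}

// Try every (Session pubkey, group pubkey) pair and keep the ones whose
// deterministic blinding matches a blinded ID observed on the operator's nodes.
std::vector<std::pair<pubkey, pubkey>> correlate(
        const std::vector<pubkey>& scraped_session_pks,
        const std::vector<pubkey>& observed_group_pks,
        const std::set<pubkey>& observed_blinded_ids) {
    std::vector<std::pair<pubkey, pubkey>> links;
    for (const auto& spk : scraped_session_pks)
        for (const auto& gpk : observed_group_pks)
            if (observed_blinded_ids.count(blind_for_group(spk, gpk)))
                links.emplace_back(spk, gpk);  // this Session ID is in this group
    return links;
}
```

The cost of such a scan is only (number of scraped Session IDs) × (number of observed group keys) derivations, which is entirely feasible at the scales described above.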
First point: there's nothing in here that specifically requires that; you could generate a random key instead.

The information leak is something of a problem, but it is somewhat mitigated by the fact that the storage server doesn't know which keys have been issued: the admin simply generates it and sends it to a user, so there's no (easy) ability to scrape pubkeys in use (unlike, say, with SOGS). You'd have to modify the storage server, hook into the authentication code to dump incoming auth requests, and then use that -- and that is only doable on SNodes in your control.

One of the compelling reasons to want a deterministic key, however, is that it allows revoking keys without having to know all the subkeys that have been issued -- so, for example, I could revoke user …
Indeed, I was aware of this. My intention was to discuss the design of closed groups in general, but I was unable to find an appropriate thread; I apologize for hijacking this one. Please let me know if you have a better suggestion for where to discuss closed-group design.
That's an aspect I hadn't considered; thank you for pointing it out. It certainly makes sense. However, ideally we should try as hard as we can to protect the association between a user's unblinded and blinded IDs from being scraped.

A resourceful attacker might be able to gain access to 50% of unique swarm IDs by controlling only 10% of nodes, according to the swarm ID coverage of individual service node operators. If an attacker were to stake 3750 Oxen per node and offer an extremely low operator fee to attract contributors, they would need only 170 * 3750 = 637,500 Oxen to control 10% of nodes, i.e. 50% of unique swarms. With access to 50% of Session IDs, the attacker might be able to recover a significant portion of the social graph if they are able to modify the storage server and recompute the deterministic link between Session ID and Closed Group ID.

According to research like Robust De-anonymization of Large Sparse Datasets, an anonymous social network can be de-anonymized if its social graph is known and a third-party dataset from a different social network, covering roughly the same set of people, is used for comparison and matching.

If we don't prevent potential social graph leakage as hard as we can from the beginning, then the longer we operate the network, the less confidence we can have in the social graph protection we provide to our users. It would be incredibly difficult to assure users of the likelihood of their social graph being protected or their privacy being preserved. It would be complex to explain to users that "under certain assumptions, if the number of bad operators is below a certain threshold, then the likelihood of your privacy being compromised is no higher than a certain value; please do your own research to evaluate your risk based on your threat model".
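As a rough sanity check of the "10% of nodes → ~50% of swarms" figure (my own back-of-envelope estimate, assuming swarms of roughly 7 nodes assigned independently at random, which only approximates real swarm assignment):

$$P(\text{a given swarm contains at least one attacker node}) = 1 - (1 - 0.10)^{7} \approx 0.52$$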
According to https://api.oxen.io/storage-rpc/#/subaccounts?id=authentication and https://api.oxen.io/storage-rpc/#/subaccounts?id=revoke_subaccount, as well as the related source code and other documents, I understand that a persistently maintained list of keys belonging to removed users resides on the storage server.

Ideally, I think the resources we allocate to a task should be proportionate to the value it confers on the user. When a user is removed from a group, we cease to offer them any value. If we are obliged to indefinitely retain some metadata to prevent that user from scraping the storage server, then the resources we allocate are disproportionately higher than the value they afford the user. In this context, our 'resources' include storage, time, and future maintenance obligations. Holding an indefinite maintenance obligation for a user who no longer accesses a group seems disproportionate. Conversely, a design in which the authentication key is rotated each time a user is removed would align more closely with the principle that the resources we allocate should be proportional to the value they provide to the user.

Let's imagine a catastrophic event leading to some swarms losing random data. Or, in a future update, we may accidentally delete or overwrite some revocation lists, or some of their elements, during a migration. This would place us in an unfortunate situation, as we would be unable to identify which groups' scraping protections are affected and which remain secure. If we were to implement a key rotation mechanism for scraping protection, the worst case in a catastrophic event involving the loss of random data (including authentication public keys) would be having to ask users to issue new public keys. Despite this inconvenience, we could still assure users that their groups would not be vulnerable to scraping by former members. In essence, an 'authentication key rotation' design would be more robust than a 'revocation list' design in the event of a catastrophe.
Indeed, and the subaccount revocation is a non-airtight solution that is mainly advisory rather than providing long-term security. The long-term security happens within the group itself: once a user is removed, the group's keys are rotated so that even if the user still has access to the group (e.g. because the revocation wasn't added, or because they are an admin of a storage server in the group's swarm), they can't do anything with the group data, as they are no longer issued encryption keys.

This is by design: we need groups to be re-joinable on a newly restored device, or when a device has been offline for a considerable amount of time. Our design achieves this by storing the keys within the swarm, with a best-effort (but not airtight) authentication revocation on the outside, so that a user on a new device, or who has been away for a long time, can still come back to the group, re-fetch keys/info/members messages, and get the updated keys needed to decrypt and rebuild all the group's info.

It's important that a removed user cannot get that data, though, and so there is a two-pronged approach here: a best-effort revocation list, plus an (inner) key rotation of all the stored messages, so that even if the best-effort revocation isn't enough, the user still can't get anything out of the group other than seeing that opaque binary blobs are being posted in it.

A key rotation mechanism on the storage server itself would mean that there is state (the access keys) that has to be externally communicated when it changes: until you get the new keys, you can't access the storage server at all, so if you are away for an extended period during which keys rotated, you would have no way to get those keys yourself; we'd need to build out-of-band communication with an admin, which is a pretty big usability drag.
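A conceptual sketch of those two layers (purely illustrative types and logic, not the actual storage-server or client code):

```cpp
#include <cstdint>
#include <optional>
#include <set>
#include <string>
#include <vector>

struct Message {
    std::vector<unsigned char> ciphertext;
    std::uint32_t key_generation;  // which group key generation encrypted it
};

// Layer 1 (outer, best-effort): the swarm rejects requests from subaccount
// tags it knows have been revoked; this list is advisory, not airtight.
struct Swarm {
    std::set<std::string> revoked_subaccount_tags;
    bool allow_retrieve(const std::string& subaccount_tag) const {
        return !revoked_subaccount_tags.count(subaccount_tag);
    }
};

// Layer 2 (inner, the real security): after a removal the group rotates its
// keys, so a removed member who somehow still retrieves messages only sees
// opaque blobs encrypted under generations they were never issued.
struct GroupMember {
    std::uint32_t newest_key_generation_held;
    std::optional<std::vector<unsigned char>> try_decrypt(const Message& m) const {
        if (m.key_generation > newest_key_generation_held)
            return std::nullopt;  // removed member: ciphertext only
        return m.ciphertext;      // stand-in for actual decryption
    }
};
```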
This enhances/fixes/updates storage server's RPC documentation to make it convertible to markdown documentation (and thus in-browser documentation, via docsify) for https://api.oxen.io/storage-rpc.
(This is currently built on top of #477, as I wanted to document the new subaccount endpoints without bothering with the old subkey endpoints; this PR is done, but is pending completion/merging of #477.)
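For context, a hypothetical example of the kind of structured endpoint doc comment that such a generator can extract into per-endpoint markdown pages (the field names and wording here are illustrative, not quoted from the source):

```cpp
namespace rpc {

/// Retrieves stored messages for an account.
///
/// Inputs:
/// - `pubkey` -- the account to retrieve from, in hex.
/// - `namespace` -- (optional) message namespace to retrieve from.
///
/// Outputs:
/// - `messages` -- the list of stored messages.
struct retrieve {
    static constexpr auto name = "retrieve";
    // ... request fields and parsing omitted ...
};

}  // namespace rpc
```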