Generate RPC endpoint documentation for api.oxen.io #478
base: dev
Conversation
The `_t` suffix denotes a typedef (e.g. `size_t`, `uint64_t`, `std::enable_if_t`, but not `unsigned long` or `std::enable_if`). In storage server the suffix was wrongly used on some concrete types (`user_pubkey_t`, `all_stats_t`, `peer_stats_t`). This fixes it, dropping the `_t` suffix from the problematic structs.
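For illustration, a minimal sketch of the convention (hypothetical definitions, not copied from the storage server source):

```cpp
#include <cstdint>
#include <string>

// `_t` marks a type alias (a typedef of some other type):
using swarm_id_t = std::uint64_t;

// A concrete struct gets a plain name; previously these were misnamed
// `user_pubkey_t`, `all_stats_t`, and `peer_stats_t`:
struct user_pubkey {
    std::string raw;  // encoded key bytes (illustrative field only)
};
```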
- bump the minimum required hard fork to the current live version
- remove some feature guards for previous HF versions
- remove now-unused code
Thank you for writing the document! I think we need to be very careful when utilizing a deterministic blinding mechanism like this.

According to https://oxen.caliban.org/operator/ranking, the foundation controls 170+ nodes, while a significant number of other operators manage between 10 and 120+ nodes. These operators have access to a significant number of Session identity public keys, covering from 5% to over 40% of the Session ID partition space across a total of 260+ swarms. If operators decide to scan their nodes and extract Session IDs, coupled with the closed-group public keys observed from their respective nodes, they could essentially replicate the deterministic blinding mechanism: by running through all possible combinations of scraped Session IDs and observed closed-group keys, they could recover which Session IDs belong to which groups.
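To make the concern concrete, here is a minimal sketch (my own illustration, not actual Session or storage-server code; `blind_for_group` is a placeholder hash standing in for whatever the real deterministic derivation is) of how a malicious operator could correlate scraped Session keys with blinded group members observed on their nodes:

```cpp
#include <sodium.h>

#include <algorithm>
#include <array>
#include <set>
#include <utility>
#include <vector>

using pubkey = std::array<unsigned char, 32>;

// Placeholder for the deterministic blinding; NOT the real derivation.
pubkey blind_for_group(const pubkey& session_pk, const pubkey& group_pk) {
    std::array<unsigned char, 64> input;
    std::copy(session_pk.begin(), session_pk.end(), input.begin());
    std::copy(group_pk.begin(), group_pk.end(), input.begin() + 32);
    pubkey out;
    crypto_generichash(out.data(), out.size(), input.data(), input.size(), nullptr, 0);
    return out;
}

// Try every (Session pubkey, group pubkey) pair and keep the ones whose
// deterministic blinding matches a blinded ID observed on the operator's nodes.
std::vector<std::pair<pubkey, pubkey>> correlate(
        const std::vector<pubkey>& scraped_session_pks,
        const std::vector<pubkey>& observed_group_pks,
        const std::set<pubkey>& observed_blinded_ids) {
    std::vector<std::pair<pubkey, pubkey>> links;
    for (const auto& spk : scraped_session_pks)
        for (const auto& gpk : observed_group_pks)
            if (observed_blinded_ids.count(blind_for_group(spk, gpk)))
                links.emplace_back(spk, gpk);  // this Session ID is in this group
    return links;
}
```

The cost of such a scan is only (number of scraped Session IDs) × (number of observed group keys) derivations, which is entirely feasible at the scales described above.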
First point: there's nothing in here that specifically requires that; you could generate a random key instead.

The information leak is something of a problem, but it is somewhat mitigated by the fact that the storage server doesn't know which keys have been issued: the admin simply generates it and sends it to a user, so there's no (easy) ability to scrape pubkeys in use (unlike, say, with SOGS). You'd have to modify the storage server, hook into the authentication code to dump incoming auth requests, and then use that -- and that is only doable on SNodes in your control.

One of the compelling reasons to want a deterministic key, however, is that it allows revoking keys without having to know all the subkeys that have been issued -- so, for example, I could revoke user …
Indeed, I was aware of this. My intention was to discuss the design of closed groups in general, but I was unable to find an appropriate thread; I apologize for hijacking this one. Please let me know if you have a better suggestion for where to discuss closed-group design.
That's an aspect I hadn't considered; thank you for pointing it out. It certainly makes sense. However, ideally we should try as hard as we can to protect the association between a user's unblinded and blinded IDs from being scraped.

A resourceful attacker might be able to gain access to 50% of unique swarm IDs by controlling only 10% of nodes, according to the swarm ID coverage of individual service node operators. If an attacker were to stake 3750 Oxen per node and offer an extremely low operator fee to attract contributors, they would need only 170 * 3750 = 637,500 Oxen to control 10% of nodes, i.e. 50% of unique swarms. With access to 50% of Session IDs, the attacker might be able to recover a significant portion of the social graph if they are able to modify the storage server and recompute the deterministic link between Session ID and Closed Group ID.

According to research like Robust De-anonymization of Large Sparse Datasets, an anonymous social network can be de-anonymized if its social graph is known and a third-party dataset from a different social network, covering roughly the same set of people, is used for comparison and matching.

If we don't prevent potential social graph leakage as hard as we can from the beginning, then the longer we operate the network, the less confidence we can have in the social graph protection we provide to our users. It would be incredibly difficult to assure users of the likelihood of their social graph being protected or their privacy being preserved. It would be complex to explain to users that "under certain assumptions, if the number of bad operators is below a certain threshold, then the likelihood of your privacy being compromised is no higher than a certain value; please do your own research to evaluate your risk based on your threat model".
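As a rough sanity check of the "10% of nodes → ~50% of swarms" figure (my own back-of-envelope estimate, assuming swarms of roughly 7 nodes assigned independently at random, which only approximates real swarm assignment):

$$P(\text{a given swarm contains at least one attacker node}) = 1 - (1 - 0.10)^{7} \approx 0.52$$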
According to https://api.oxen.io/storage-rpc/#/subaccounts?id=authentication and https://api.oxen.io/storage-rpc/#/subaccounts?id=revoke_subaccount, as well as the related source code and other documents, I understand that a persistently maintained list of keys belonging to removed users resides on the storage server.

Ideally, I think the resources we allocate to a task should be proportionate to the value it confers on the user. When a user is removed from a group, we cease to offer them any value. If we are obliged to indefinitely retain some metadata to prevent that user from scraping the storage server, then the resources we allocate are disproportionately higher than the value they afford the user. In this context, our 'resources' include storage, time, and future maintenance obligations. Holding an indefinite maintenance obligation for a user who no longer accesses a group seems disproportionate. Conversely, a design in which the authentication key is rotated each time a user is removed would align more closely with the principle that the resources we allocate should be proportional to the value they provide to the user.

Let's imagine a catastrophic event leading to some swarms losing random data. Or, in a future update, we may accidentally delete or overwrite some revocation lists, or some of their elements, during a migration. This would place us in an unfortunate situation, as we would be unable to identify which groups' scraping protections are affected and which remain secure. If we were to implement a key rotation mechanism for scraping protection, the worst case in a catastrophic event involving the loss of random data (including authentication public keys) would be having to ask users to issue new public keys. Despite this inconvenience, we could still assure users that their groups would not be vulnerable to scraping by former members. In essence, an 'authentication key rotation' design would be more robust than a 'revocation list' design in the event of a catastrophe.
Indeed, and the subaccount revocation is a non-airtight solution that is mainly advisory rather than providing long-term security. The long-term security happens within the group itself: once a user is removed, the group's keys are rotated so that even if the user still has access to the group (e.g. because the revocation wasn't added, or because they are an admin of a storage server in the group's swarm), they can't do anything with the group data, as they are no longer issued encryption keys.

This is by design: we need groups to be re-joinable on a newly restored device, or when a device has been offline for a considerable amount of time. Our design achieves this by storing the keys within the swarm, with a best-effort (but not airtight) authentication revocation on the outside, so that a user on a new device, or who has been away for a long time, can still come back to the group, re-fetch keys/info/members messages, and get the updated keys needed to decrypt and rebuild all the group's info.

It's important that a removed user cannot get that data, though, and so there is a two-pronged approach here: a best-effort revocation list, plus an (inner) key rotation of all the stored messages, so that even if the best-effort revocation isn't enough, the user still can't get anything out of the group other than seeing that opaque binary blobs are being posted in it.

A key rotation mechanism on the storage server itself would mean that there is state (the access keys) that has to be externally communicated when it changes: until you get the new keys, you can't access the storage server at all, so if you are away for an extended period during which keys rotated, you would have no way to get those keys yourself; we'd need to build out-of-band communication with an admin, which is a pretty big usability drag.
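A conceptual sketch of those two layers (purely illustrative types and logic, not the actual storage-server or client code):

```cpp
#include <cstdint>
#include <optional>
#include <set>
#include <string>
#include <vector>

struct Message {
    std::vector<unsigned char> ciphertext;
    std::uint32_t key_generation;  // which group key generation encrypted it
};

// Layer 1 (outer, best-effort): the swarm rejects requests from subaccount
// tags it knows have been revoked; this list is advisory, not airtight.
struct Swarm {
    std::set<std::string> revoked_subaccount_tags;
    bool allow_retrieve(const std::string& subaccount_tag) const {
        return !revoked_subaccount_tags.count(subaccount_tag);
    }
};

// Layer 2 (inner, the real security): after a removal the group rotates its
// keys, so a removed member who somehow still retrieves messages only sees
// opaque blobs encrypted under generations they were never issued.
struct GroupMember {
    std::uint32_t newest_key_generation_held;
    std::optional<std::vector<unsigned char>> try_decrypt(const Message& m) const {
        if (m.key_generation > newest_key_generation_held)
            return std::nullopt;  // removed member: ciphertext only
        return m.ciphertext;      // stand-in for actual decryption
    }
};
```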
This enhances/fixes/updates storage server's RPC documentation to make it convertible to markdown documentation (and thus in-browser documentation, via docsify) for https://api.oxen.io/storage-rpc.
(This is currently built on top of #477, as I wanted to document the new subaccount endpoints without bothering with the old subkey endpoints; this PR is done, but is pending completion/merging of #477.)
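For context, a hypothetical example of the kind of structured endpoint doc comment that such a generator can extract into per-endpoint markdown pages (the field names and wording here are illustrative, not quoted from the source):

```cpp
namespace rpc {

/// Retrieves stored messages for an account.
///
/// Inputs:
/// - `pubkey` -- the account to retrieve from, in hex.
/// - `namespace` -- (optional) message namespace to retrieve from.
///
/// Outputs:
/// - `messages` -- the list of stored messages.
struct retrieve {
    static constexpr auto name = "retrieve";
    // ... request fields and parsing omitted ...
};

}  // namespace rpc
```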