SMIP: go-spacemesh API implementation #21
I think we should separate the two purposes of the API. The first is to provide an API to node functions such as GetBalance, ChangeRewardAddress, etc. I agree this could be changed to use a gRPC API. The other functionality discussed here is the "events" functionality. This will probably not be used by end users who want to connect to the node; rather, it will serve apps such as a block explorer or dashboard. IMO this is a separate requirement and should be designed a bit differently. Having said that, I'd be happy to discuss whether we could use gRPC streams and get the same level of robustness and flexibility, if you indeed think it's better to implement events that way.
Note that as part of this change, I think we should also address the local testnet events issue.
@antonlerner thanks for taking a look and for the thorough reply! Your timing is great :) I'm working on the non-stream API endpoints for now, and haven't begun implementation of the streams yet. To respond to a few of your points:
While I agree there's an important distinction between the "one-off" endpoints and the streams, I'm not entirely sure the streams won't be used by end users. For example, I'm pretty sure that @avive and @IlyaVi plan to subscribe to events in the wallet and use them to display account-related events to the user: incoming transactions, rewards, etc. From a design perspective, this may be cleaner and easier than polling the node. I think @avive has stronger thoughts on this.
Curious to hear more about why you feel that pubsub is more robust and makes it easier to subscribe and unsubscribe from different topics. Also, AFAICT the two are not necessarily mutually exclusive - I think we could have the same set of events exposed using the existing pubsub framework, or grpc streams, or both (modulo questions about serialization and multiplexing, as you point out). I haven't gotten deep enough into the implementation yet to know with confidence.
Totally agree. Would appreciate your advice on how to test these!
Another good point. I'd love to hear thoughts from @ilans on this. Does the API design as it stands contain the correct endpoints? And is there a preferred protocol for consuming these data?
It all depends on how it's implemented. If events are raised only while the node is running, then in order to get all the data you'd need to restart the node and sync from genesis.
As I understand it, the gRPC stream will give you all the data in the mesh in a single stream, so you can't take only part of the data without deserialising it first. The alternative is to create a separate endpoint for each data type and/or identity type (i.e., account, node, etc.). Pubsub can make this more robust in the sense that each identity and data type can be made a topic, allowing much more flexible querying and filtering.
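To make the topic-selection point concrete, here is a toy, stdlib-only Go sketch of the kind of topic-prefix filtering described above. The `TopicBus` type, the `account/<addr>/...` topic scheme, and the drop-on-full policy are all illustrative assumptions, not actual go-spacemesh code:

```go
package main

import (
	"fmt"
	"strings"
)

// Subscription pairs a topic prefix with a delivery channel.
type Subscription struct {
	prefix string
	ch     chan string
}

// TopicBus is a toy pubsub where each identity and data type is a topic,
// e.g. "account/0xabc/tx". A subscriber sees only matching topics, so it
// never has to deserialise unrelated data.
type TopicBus struct {
	subs []*Subscription
}

// Subscribe registers interest in every topic sharing the given prefix.
func (b *TopicBus) Subscribe(prefix string) <-chan string {
	s := &Subscription{prefix: prefix, ch: make(chan string, 16)}
	b.subs = append(b.subs, s)
	return s.ch
}

// Publish delivers payload to every subscriber whose prefix matches topic,
// dropping the event if a subscriber's buffer is full so a slow reader
// cannot block the publisher.
func (b *TopicBus) Publish(topic, payload string) {
	for _, s := range b.subs {
		if strings.HasPrefix(topic, s.prefix) {
			select {
			case s.ch <- payload:
			default: // drop rather than block
			}
		}
	}
}

func main() {
	bus := &TopicBus{}
	acct := bus.Subscribe("account/0xabc/")
	bus.Publish("account/0xabc/tx", "incoming tx")
	bus.Publish("account/0xdef/tx", "unrelated tx")
	fmt.Println(<-acct) // → incoming tx
}
```

The same selectivity would have to be reproduced on the gRPC side either as one endpoint per data/identity type or as request-level filters on a shared stream.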
Can we map out all the uses for the streams we know of? This will help us understand how many endpoints we will need to support and what the equivalent effort (topic selection) would be on our pubsub side. I think it will also help us choose between the two.
We can read the gRPC stream code. Also, another advantage of mapping the required endpoints is that it will tell us how many streams will be simultaneously open and active when querying the node.
We have designed the API around services and 3 clients, did the big code review of the API around those services (e.g., the node, mesh, global-state, and transaction services), and have implemented all review suggestions. I feel that we have a good design and I see little reason to separate facets differently. Everything is mapped out in the current gRPC service definitions of these services, so I don't understand the ask for mapping things out. All clients use different kinds of methods to get what they want: current data, streams for future data, and queries for historical data. The wallets definitely need streams so they can stop polling the node in a loop, as smapp does today, which is bad and very wasteful. Also, streams do not give all the data in the mesh in a single stream; what they return depends on what they were defined to return, based on the user's input filters.
How will one subscribe to new data from the stream? Also, was this review done with @ilans? He also wants to have certain probes inside nodes and to receive some data/events from them.
Quick update here: I've begun implementing streams (spacemeshos/go-spacemesh#2061). I created a new singleton struct that basically just stores a list of channels, one per data type that we care about. I considered integrating into the existing events/pubsub framework, but decided against it for several reasons.
To be clear, I'm talking specifically about how the API backend is implemented internally, not about how data is collected/published externally. I haven't touched the existing pubsub code, and it's likely this API code will be totally orthogonal to it.
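For illustration, the channel-per-data-type approach described above might look roughly like the following stdlib-only Go sketch. The `Hub` type, the event-type names, and the drop-on-full policy are illustrative assumptions, not the actual go-spacemesh implementation:

```go
package main

import "fmt"

// EventType enumerates the data types the API backend cares about.
// These names are hypothetical placeholders.
type EventType int

const (
	NewBlock EventType = iota
	NewATX
	RewardReceived
)

// Hub is a singleton-style struct holding subscriber channels keyed by
// event type; each stream endpoint reads from its own channel.
type Hub struct {
	subscribers map[EventType][]chan interface{}
}

func NewHub() *Hub {
	return &Hub{subscribers: make(map[EventType][]chan interface{})}
}

// Subscribe returns a buffered channel that receives events of type t.
func (h *Hub) Subscribe(t EventType) <-chan interface{} {
	ch := make(chan interface{}, 16)
	h.subscribers[t] = append(h.subscribers[t], ch)
	return ch
}

// Publish delivers ev to every subscriber of type t, dropping the event
// when a subscriber's buffer is full so a slow stream consumer cannot
// stall the node.
func (h *Hub) Publish(t EventType, ev interface{}) {
	for _, ch := range h.subscribers[t] {
		select {
		case ch <- ev:
		default: // drop rather than block
		}
	}
}

func main() {
	hub := NewHub()
	blocks := hub.Subscribe(NewBlock)
	hub.Publish(NewBlock, "block-1")
	fmt.Println(<-blocks) // → block-1
}
```

A gRPC server-streaming handler would then sit in a loop receiving from its subscribed channel and forwarding each event to the client.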
go-spacemesh API implementation
Overview
We have a robust data model design (#13), and an API design based on it (https://github.com/spacemeshos/api/). We need go-spacemesh to expose the data in the API to several classes of clients, including:
Goals and Motivation
Design
Benefits of gRPC
gRPC allows us to achieve all of these goals, in addition to having the following other niceties:
Downsides to/limitations of gRPC
By default gRPC has a maximum message size limit of 4 MB, but this can be increased pretty easily. (We already ran into this once.) I don't foresee any major design or implementation challenges as a result.
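For reference, in grpc-go the limit is raised via server and dial options; a fragment along these lines (the option names are from grpc-go, the 32 MB size is illustrative):

```go
// Server side: raise the default 4 MB receive/send limits.
srv := grpc.NewServer(
    grpc.MaxRecvMsgSize(32*1024*1024),
    grpc.MaxSendMsgSize(32*1024*1024),
)

// Client side: raise the per-call receive limit for all calls on the connection.
conn, err := grpc.Dial(addr, grpc.WithDefaultCallOptions(
    grpc.MaxCallRecvMsgSize(32*1024*1024),
))
```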
In theory, the RPC design pattern requires tighter coupling between the client and the server than pubsub (which is very loosely coupled: the publisher doesn't even need to know of the subscriber's existence). In practice I don't think this will be an issue for us since I think we can use the existing pubsub-based events framework for all events, which can be delivered to pubsub subscribers and/or to the API streams transparently.
Proposed Implementation
gRPC vs. existing pubsub framework
pubsub is a low-level message-passing protocol that allows a set of events, such as "new block", "block valid", "new ATX", "reward received", "created block", etc., to be broadcast to any number of subscribers. It's currently being used in a multi-node test that allows many node instances to share data very rapidly in order to simulate a network in fast-forward. It could hypothetically be used to pass these same events to downstream clients, e.g., for analytics purposes or a block explorer.
However, being a low-level protocol, pubsub is missing a lot of the features that we get for free with gRPC, so this sort of use case would require considerable additional effort: developing SDKs/connectors for the clients, handling type conversions, clearly defining the protocol, load balancing, and encryption. Also, pubsub would not support certain required use cases well, such as web/mobile clients.
Finally, many of our API endpoints are in fact remote procedure calls: they take arguments, cause the backend to perform some action, and return some value. This use case is not natively supported in pubsub.
We can pretty easily implement and emulate all of the features of pubsub using gRPC streams; in fact, this design work is already done.
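As a sketch of what that emulation could look like, a server-streaming RPC with a client-supplied filter plays the role of a pubsub topic subscription. The service, message, and field names below are illustrative, not the actual spacemeshos/api definitions:

```proto
// Illustrative only: one server-streaming RPC replaces a pubsub topic.
service GlobalStateService {
  // The client picks its "topic" via the request filter; the server
  // pushes matching events for as long as the stream stays open.
  rpc AccountDataStream(AccountDataStreamRequest)
      returns (stream AccountData);
}

message AccountDataStreamRequest {
  bytes address = 1;      // which identity ("topic") to follow
  uint32 event_kinds = 2; // bitmask selecting txs, rewards, balance changes
}
```

Subscribing is opening the stream with a filter; unsubscribing is closing it, which gives per-client topic selection without a separate pubsub protocol.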
Implementation plan
See spacemeshos/go-spacemesh#1764
Dependencies and Interactions
Dependencies:
Interactions:
Stakeholders and Reviewers
Testing and Performance
Testing: Existing API tests will be rewritten and expanded to work with the new API code. New tests will be written for any new functionality added, e.g., gRPC streams.
Performance: We may want to do some profiling/performance/stress tests to make sure that the new API code, especially events/streams, does not negatively impact go-spacemesh performance. Per @antonlerner, we should also test how many simultaneous connections gRPC supports.