Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a subject type #222

Closed
wants to merge 28 commits into from
Closed

Conversation

MattesWhite
Copy link
Contributor

This PR adds a Subject type to the NATS client. It also includes some convenience stuff like a mutable, owned version SubjectBuf, Tokens as an own type, the possibility to iterate a subjects tokens and a matching algorithm.

For the validation algorithm I tried to stick to the docs from nats.io.

The subject type is for now nowhere integrated, i.e. it would only be useful for users. However, piece by piece substituting str with Subject(Buf) should proof beneficial.

This could also be used to solve #221 .

Copy link
Member

@derekcollison derekcollison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take a look, most clients do not vigorously test subjects since the server protects itself etc.

@MattesWhite
Copy link
Contributor Author

Yes, in my application I don't validate subjects that much either. But frequently use the Subjects to parse messages I receive from subscription with wildcards to check out where it actually came from and subject.tokens().nth(4) is just way simpler than writting some custum parser every time.

@caspervonb caspervonb self-assigned this Nov 8, 2021
@caspervonb
Copy link
Collaborator

Interesting but I'm unsure.
Any thoughts on this addition @Jarema?

@derekcollison
Copy link
Member

I think the ability to pull tokens and such is a good addition, but would not want to force changes to normal publish and subscribe.

Copy link
Member

@Jarema Jarema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I think this is nice to have such an API.

Though I'm wondering - how it would look without incorporating it into publish/subscribe?

Without that, it looks a little like we couldn't decide if we want subject validation or not, so we have a compromise that might raise eyebrows why it's there, but not part of pub/sub.

Incorporating it into publish and subscribe doesn't sound bad, especially if it implements nicely some kind of Into trait that still allows passing &str or String. Also, the performance culprit of such an addition should be marginal.

An alternative (but still a compromise) would be to have it exposed under the subject module and treat it as a utility.

#[test_case("abc.def", &["*", "fed"], "abc.def.*.fed" ; "single wildcard and more")]
#[test_case("abc", &[">"], "abc.>" ; "multi wildcard")]
#[test_case("abc", &[">", "cba"], "" => panics ; "multi wildcard and more")]
fn join_subject(base: &str, appends: &[&str], expect: &str) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be useful for internals.
I like it.
It's pretty easy to mess up final subject while concating tokens of the subject (or specifically - prefixes with user provided subjects, JS api prefix with KV prefix etc).

Having correctness check of the subject ensured by some tested function has a value.

#[test_case("probe" => true ; "valid name")]
#[test_case("pröbe" => false ; "non alphanumeric")]
#[test_case("PrObE007" => true ; "wild stuff")]
fn validate_token(token: &str) -> bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is nice and can help both users and contributors, but is IMHO too strict.

Yes, NATS docs specifies that subject tokens should contain only ASCII alphanumeric chars, but that is said only to ensure as much compatibility across NATS Clients as possible. NATS Server actually accepts any UTF-8 charts as part of a subject token as long as they're not . and space, as those are part of NATS proto.

I found myself (as a user) useful to be able to use non alphanumeric characters and it would be shame to artifically remove such option from the client.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After using NATS more I recognized this as well. $SYS subjects wouldn't be valid as well. So I'll relax the constraints

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So everything is a token as long as !( token.is_empty() || token.chars().any(|c| c == '.' || c == ' ') ), right?

#[test_case("cba", ">" => true ; "multi wildcard reverse")]
#[test_case("*", ">" => true ; "mixed wildcards")]
#[test_case("cba", "abc" => false ; "unequal tokens")]
fn match_tokens(l: &str, r: &str) -> bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about that.
What are the use cases of that?

@MattesWhite was that added for testing capabilities?

If you subscribe - you will get what you asked for and matching won't help.
Checking if you will get what you asked for before subbing looks IMHO a little too defensive.
For publishing - matching beforehand sounds almost like coupling with consumers of the subject.
But maybe I'm too picky here :).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This originats from the current implementation of the async client. For each subscription a thread is spawn, epsecially when you have lots of request handling this produces a lot of overhead.
As a workaround for our PoC we have one > subscription and distribute the messages to the async parts via channels. This is not good and will probably become a bottleneck but for now works ok. It would be okay for me to remove this once the async client is no longer depending on the blocking crate.

@MattesWhite
Copy link
Contributor Author

Yes, an integration of the Subject API into the crate was my long time goal. However, I was not sure if the work would be accepted so I waited for a positive response. When I'm done with my other PR I'll continue here.

@derekcollison derekcollison mentioned this pull request Jan 8, 2022
@MattesWhite MattesWhite changed the title Add a subject type [WIP] Add a subject type Jan 9, 2022
@MattesWhite
Copy link
Contributor Author

MattesWhite commented Jan 9, 2022

As predicted, incorporating Subject and SubjectBuf into the whole crate is a lot of work. At the moment I'm not done but the work in the crate itself is mostly done.

I reworked Subject and Token into dynamically sized types, i.e. they behave like str or Path now. Sadly, this includes a little bit of unsafe. I'd like to have your feedback on this matter.

ToDo:

  • Fix benches, examples and tests
  • Reevaluate the uses of Subject::new_unchecked() and SubjectBuf::new_unchecked()
  • Write proper documentation for Subject and SubjectBuf
  • DISCUSS: Public API could take impl TryInto<&Subject> or a bound to another converter trait. This would probably lessen the migration effort for users.

SubjectBuf implements Deref
@derekcollison
Copy link
Member

FYI we need two types. One for publish which can not have wildcards (must be literal) and one for subscriptions which can include wildcards.

@MattesWhite
Copy link
Contributor Author

The current implementation has a contains_wildcards() -> bool isn't that enough?

@derekcollison
Copy link
Member

Possibly, depends on the contract and when they are valid or not and whether or not we want that correctness enforced in a zero abstractions scenario.

@caspervonb
Copy link
Collaborator

caspervonb commented Jan 25, 2022

Discussed this a little bit offline last week with @Jarema.
I think a type for this can be useful, in that we provide a guarantee that once constructed the type is valid but this feels a bit heavy handed with both Subject and SubjectBuf's.

I feel like the sweet spot could be to have a Subject type, that is a validated type. Then we can implement AsSubject/Into traits for conversions from &str, String, etc. Basically somewhere in-between this PR and #295

I think I might take a quick pass at this soon, see if my vague idea is actually any better in practice 🙂

@MattesWhite
Copy link
Contributor Author

So after long time I have again some time to work on this. So far I got everything to compile again. Next I'll marge main to catch up with recent development (good work by the way +1). Then I'll fix tests to work again. And after all this is done we can start reviewing this giant of a PR...

Copy link
Contributor Author

@MattesWhite MattesWhite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self review

@MattesWhite
Copy link
Contributor Author

@caspervonb
Finally, after months of delay this PR is ready for review, CI is passing, and Subject is introduced to the whole crate.

I slimed the added features a bit down:

  • Token as an own type is gone
  • Subject::sub_subject() is removed

The AsSubject trait was introduced to allow the use of &str in most places of the public API so the doc tests are mostly left unchanged, and the change for users should be minimal.

@MattesWhite MattesWhite changed the title [WIP] Add a subject type Add a subject type Apr 6, 2022
@Jarema
Copy link
Member

Jarema commented Apr 28, 2022

@MattesWhite hey! We will look into this as soon as we will tackle this topic in async-nats client to have it in both, consistent.

@MattesWhite
Copy link
Contributor Author

Closing this PR as it is way outdated and superseded by #596 and #952

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants