-
-
Notifications
You must be signed in to change notification settings - Fork 214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add domain name action #907
base: main
Are you sure you want to change the base?
Conversation
It still needs docs for the website |
Thank you so much! 🙏 I am focusing on our v1 release first before reviewing this PR. In the meantime, I recommend using the regex with our |
Too bad, I wanted to do it right away :( But isn't the following regex better suited for this purpose?
It better aligns with RFC 1035, doesn't it? This way, it also includes the restrictions that apply to a domain and its subdomains. |
I haven't checked the regex yet. Almost every regex of Valibot is handmade. I will check the regex before merging. But please feel free to discuss and research so that the process of checking and merging will be faster. |
This regex is used to validate domain names. Exactly that was my intention. I looked more closely into RFC 1035 and created this regex with the goal of adhering to the specifications as much as possible so that tools like Netscaler or DNS processes will consistently accept these domains. There is a lot of guidance in RFC 1035, especially concerning octets and the conditions for when and how certain characters are allowed. Here’s a detailed breakdown of how the regex works:
Explanation of each part:
Examples of valid domains:
Examples of invalid domains:
This regex is designed to follow DNS standards closely, enabling reliable domain validation across different DNS systems and tools. This way, I can still help a bit :) |
Sorry @JacKyDev I didn't mean to steal your thunder. I just had some spare time left and since it been a few days since you responded I figured I'd step in instead. I will look into the regex you proposed when I get the chance today! |
@paksa92 All good, I'll survive :D |
I'm not sure if this is common practice at Valibot, so I’ll just mention it. I would assume that the documentation could also be updated so that Additionally, I would suggest that something similar to If I haven't overlooked anything, those would be:
Also, the But I'm not sure if this is common practice at Valibot or if I might be overthinking it. Perhaps you should wait for Fabian to respond, as I also asked him this question here: #894. |
@JacKyDev yeah, I already mentioned that it is missing in the PR, but I will add it.
|
I suppose devs rarely look at the |
@JacKyDev I'm currently testing this regex and I think it's wrong. First off, TLD's may not exceed a length of 6. Changing the regex to:
But, this regex still allows TLD's exceeding a length of 6. I suspect one of the negative lookaheads I tweaked the original regex in my PR to account for the max length of a domain name, which now should cover all the cases you mentioned:
Please test it and let me know if you find any issues. |
In the RFC, I don’t see any limitation on the domain suffix Generally, the value is that a domain, without a It’s true that the usual domains are about 6 characters long,
So, I believe the restriction isn’t quite correct. Correct me if you have other information, but in that case, Anyway, roughly speaking, I’ll take another look later so we can |
I also did some additional research and couldn’t find anything that limits it to 6 characters. From my understanding, that would mean the regex should match as specified. Do you have anything that suggests otherwise? In the meantime, I ran your tests, and apart from the one mentioned below, all tests are passing. The failing one makes sense, though, since a TLD could now be longer than 6 characters after the change. test("should not match 'example.commmerce' - TLD too long (more than 6 characters)", () => {
expectActionIssue(action, baseIssue, ['example.commmerce']);
});
Fun fact: there’s also a .shopping TLD, which is offered for ecommerce If you don’t find anything else, then the regex should be all set, and the next step would be to get feedback from Fabian hopefully, we’ve managed to save him some work :) |
You're right @JacKyDev, there is no limit of 6 characters on the TLD. My assumption was based on some stackoverflow posts I've seen. I think the regex is all set now! I will commit it today :) |
|
@tats-u : I understand what you mean, but when I test this regex with emojis or other Unicode characters, it consistently flags them as invalid. This aligns with what I expected, since domains technically only allow ASCII characters. However, I can't replicate what you're describing in a unit test—here, all tests fail as soon as characters outside of From a technical standpoint, this is exactly what I'd expect, as any characters outside the ASCII range should be represented in Punycode to be valid in a domain. Converting to Punycode means that an emoji, for example, doesn't count as a single character, but as multiple ASCII characters, which would also increase the string length—especially since Punycode includes a prefix. Do you have a test example we can try? Currently, emojis and special characters like "ü" are rejected, just as intended. |
If so we don't have to care about surrogate pairs. You designed this validator on the premise that we need to make Unicode domains punycoded in advance. |
I think we’re in a good place by strictly requiring specific values... This way, the domain itself is validated very precisely. And we wouldn’t be offering any special exceptions. Separately, one could consider adding a I wouldn’t really be against the additional check, because you’re right—it’s not much more code. I still lack experience with the process here, as a regex action might feel messy if it includes code checks. But a length Check ist really small... :) |
I agree with it. There are 2 strategies:
Users pass the punycoded domains to other libraries in 1. while they utilize unpunycoded Unicode domains as are. |
I think in the end it might need to be more of a combination domainName validates a string as a real domain as the status quo. Background: Point 1: Point 2: Basically, as you described, it’s for cases where someone wants Point 3: domainSchema = v.pipe(v.string(), v.toPunyCode(), v.domainName()) The aim is to check the actual domain in DNS or another service. This approach would cover all use cases, and if someone wants My suggestion would be to rename the current action to rfcDomain,
The advantage here would be that we offer a variety of options, |
Hey 👋 I am pretty busy at the moment. However, I will try to give you feedback this week to point you in the right direction if necessary. Thanks for your contribution! |
Because I love valibot so much I wanted to do something in return for all the headache she saved me from so here's a PR that adds domain name validation as requested by #894.