email validation: provide spec validation #204

Closed
kurtextrem opened this issue Oct 7, 2023 · 6 comments · Fixed by #912
Assignees
Labels
enhancement New feature or request workaround Workaround fixes problem

Comments

@kurtextrem
Contributor

The currently used email regex does not match emails according to the spec, which means that emails accepted by browsers will be rejected by Valibot (by design).

So we want to use this issue to find out if others would be interested in using a regexp that validates more emails: #180 (comment)

@fabian-hiller fabian-hiller self-assigned this Oct 7, 2023
@fabian-hiller fabian-hiller added the enhancement New feature or request label Oct 7, 2023
@fabian-hiller
Owner

Here is some more context on this issue. Currently, our email validation is deliberately limited to "normal" email addresses. This has several advantages.

On the one hand, the validation is more secure as it excludes various special characters like ` and | which can be used for SQL injection attacks. On the other, the validation is more accurate, allowing typos in common email addresses to be detected.

The downside is that this regex differs from the standard and does not allow email addresses that use an IP address at the end, for example. If this behavior is needed, our regex function can be used as a workaround with a regex that conforms to the RFC standard.

W3C Working Draft Regex from w3.org:

/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/

HTML Standard Regex from whatwg.org:

/^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/

This issue was created, as described above, to get feedback to see if there is a need to add, for example, a specEmail function to match the validation of the <input type="email" /> element in the browser.
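To make the difference concrete, here is a minimal sketch that applies the HTML Standard regex quoted above to a few sample addresses. The helper name `isSpecEmail` and the sample addresses are assumptions for illustration only; they are not part of Valibot. The same pattern could be handed to Valibot's regex validation as the workaround described above, but the exact call shape depends on the Valibot version, so it is left out here.

```typescript
// HTML Standard regex, copied verbatim from the comment above.
const HTML_SPEC_EMAIL_REGEX =
  /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

// Hypothetical helper (not a Valibot export): true when the input
// matches the HTML Standard pattern used by <input type="email" />.
function isSpecEmail(input: string): boolean {
  return HTML_SPEC_EMAIL_REGEX.test(input);
}

// Illustrative sample addresses (assumptions, not taken from the issue).
const samples: string[] = [
  "jane.doe@example.com", // ordinary address
  "o'brien@example.com", // apostrophe is allowed in the local part
  "user@localhost", // a domain without a dot is allowed by this pattern
  "no-at-sign.example.com", // missing "@"
  "a@b@c.example", // more than one "@"
];

for (const sample of samples) {
  console.log(`${sample} -> ${isSpecEmail(sample)}`);
}
```

Note that this pattern accepts characters such as ' and `, so any value that passes it still has to be treated as untrusted input further down the line.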

@fabian-hiller fabian-hiller added the workaround Workaround fixes problem label Oct 7, 2023
@kurtextrem
Contributor Author

Plus, the spec allows emails like:

which are both emails that are valid, but contain characters that are rarely encountered.

@kazizi55
Contributor

kazizi55 commented Oct 8, 2023

I don’t think there is much need to add the specEmail() function.

In other words, I think there should be only one email validation method, email(), which implements the HTML Standard regex.

That’s because Valibot users (I mean the programmers) would not be able to easily decide which email method to choose, since there is only a small difference between email() and specEmail(), and the two are based on different standards.

Who can predict the application users would type rarely encountered characters or not?

What do you guys think about this?

@kurtextrem
Contributor Author

> Who can predict the application users would type rarely encountered characters or not?

I share the same opinion.
However, I can also understand the argument of avoiding accidental security issues, as not everyone might be aware that ' or ` is an allowed character - although I'd say this is more of a teaching/docs issue, as prepared statements (or stored procedures, or at the very least escaping user input) should always be used in the first place.

@kazizi55
Contributor

kazizi55 commented Oct 8, 2023

> I can also understand the argument of avoiding accidental security issues

Yes, I can understand the argument too. 😄

> as not everyone might be aware that ' or ` is an allowed character - although I'd say this is more of a teaching/docs

I agree with you, so I think we also have to add an explanation to the email regex docs on the Valibot website, stating that ' and ` are allowed characters and could lead to accidental security issues.

In short, I think implementing email() with the HTML Standard regex and adding an explanation to the docs would be appropriate for this issue.

@fabian-hiller
Owner

I decided against the RFC standard for the regex of email. The reasons can be found in PR #180. I think it would be a good idea to point this out in the docs as soon as we extend the API reference.

If there are any counter arguments against my decision, feel free to create a new issue for this topic in order to get feedback from more people. This issue is meant for people to submit use cases where email is too strict, to determine in the long run if the described workaround with regex is sufficient or if we should add specEmail.

3 participants