Sanitizer built-ins document #244
FYI, this PR is still called "Not ready yet". Let me know when you are seeking review.
About now would be good, so I renamed it. :) I added more "sections" with spec links, also for attributes. I think that should cover all of HTML; while SVG + MathML coverage is still a bit of a mess. There are a lot more leftover ("other") attributes than there were with elements. Changes so far are:
Thanks for making this list. Very useful to have something based on the HTML standard. Here's an initial very conservative safelist for "default":
- html
- head
- title
- All of Sections (body, article, ...)
- All of Grouping Content (p, hr, ...)
- Most of Text-level Semantics:
  - For a, I would omit the attributes target, referrerpolicy, download, and ping
- All of Edits (ins, del)
- All of Tabular Data
- SVG & MathML: TBD
- Global attributes:
  - dir
  - lang
  - title
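To make the proposal easier to compare against the lists in the PR, here is a minimal sketch of it as data. The `Safelist` shape and the helper names are hypothetical and are not the Sanitizer API's configuration format. Only elements named explicitly above are included; the "..." groups and the "All of Tabular Data" entry are intentionally left unexpanded.

```typescript
// Hypothetical shape for illustration only; not the Sanitizer API config format.
interface Safelist {
  elements: Set<string>;
  globalAttributes: Set<string>;
  // Attributes explicitly omitted for a given element.
  omittedAttributes: Map<string, Set<string>>;
}

const conservativeDefault: Safelist = {
  // Only the element names spelled out in the comment above.
  elements: new Set<string>([
    "html", "head", "title", "body", "article", "p", "hr", "a", "ins", "del",
  ]),
  globalAttributes: new Set<string>(["dir", "lang", "title"]),
  omittedAttributes: new Map<string, Set<string>>([
    ["a", new Set<string>(["target", "referrerpolicy", "download", "ping"])],
  ]),
};

// True if the element name is on the safelist.
function isElementAllowed(list: Safelist, element: string): boolean {
  return list.elements.has(element);
}

// True if the attribute was explicitly called out as omitted for this element.
function isAttributeOmitted(list: Safelist, element: string, attribute: string): boolean {
  const omitted = list.omittedAttributes.get(element);
  return omitted !== undefined && omitted.has(attribute);
}
```

Keeping the explicit omissions separate from the allowed sets mirrors how the comment is phrased: most text-level elements are allowed, with a few attributes of `a` carved out.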
"ondragenter",
"ondragleave",
"ondragover",
"ondragstart",
"ondrop",
"ondurationchange",
"onemptied",
"onend",
"onended",
"onerror",
"onfocus",
"onfocusin",
"onfocusout",
"onformdata",
"ongotpointercapture",
"onhashchange",
"oninput",
"oninvalid",
"onkeydown",
"onkeypress",
"onkeyup",
"onlanguagechange",
"onload",
"onloadeddata",
"onloadedmetadata",
"onloadstart",
"onlostpointercapture",
"onmessage",
"onmessageerror",
"onmousedown",
"onmouseenter",
"onmouseleave",
"onmousemove",
"onmouseout",
"onmouseover",
"onmouseup",
"onmousewheel",
"onmove",
"onoffline",
"ononline",
"onorientationchange",
"onoverscroll",
"onpagehide",
"onpageshow",
"onpaste",
"onpause",
"onplay",
"onplaying",
"onpointercancel",
"onpointerdown",
"onpointerenter",
"onpointerleave",
"onpointermove",
"onpointerout",
"onpointerover",
"onpointerrawupdate",
"onpointerup",
"onpopstate",
"onprogress",
"onratechange",
"onrepeat",
"onreset",
"onresize",
"onresolve",
"onscroll",
"onscrollend",
"onscrollsnapchange",
"onscrollsnapchanging",
"onsearch",
"onsecuritypolicyviolation",
"onseeked",
"onseeking",
"onselect",
"onselectionchange",
"onselectstart",
"onshow",
"onslotchange",
"onstalled",
"onstorage",
"onsubmit",
"onsuspend",
"ontimeupdate",
"ontimezonechange",
"ontoggle",
"ontouchcancel",
"ontouchend",
"ontouchmove",
"ontouchstart",
"ontransitionend",
"onunload",
"onvalidationstatuschange",
"onvolumechange",
"onwaiting",
"onwebkitanimationend",
"onwebkitanimationiteration",
"onwebkitanimationstart",
"onwebkitfullscreenchange",
"onwebkitfullscreenerror",
"onwebkittransitionend",
"onwheel"
What should we do here? In terms of spec purity, I believe we should stick to the event handlers in the HTML standard, add a big note that many engines support non-standardized ones, and list those as a hint or similar?
But in reality, I can see this going wrong.
@evilpie: How would we best identify the list of supported event handler attributes in Gecko?
We should probably just check whether an attribute is an event handler content attribute (https://html.spec.whatwg.org/#event-handler-content-attributes). We could then maybe non-normatively list all of them (they're also in an index in HTML). Implementations can do roughly the same thing they do for Trusted Types.
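As a rough sketch of that check: the predicate below tests an attribute name against a known set of event handler content attributes rather than hard-coding the list into the configuration. The set shown is a small illustrative subset taken from the list under review, and the function names are hypothetical; a real implementation would use whatever table the engine already maintains.

```typescript
// Illustrative subset only; an engine would consult its own complete table.
const KNOWN_EVENT_HANDLER_ATTRIBUTES: ReadonlySet<string> = new Set<string>([
  "onerror", "onfocus", "onload", "onsubmit", "onwheel",
]);

// Hypothetical predicate: is this attribute name a known event handler content attribute?
// Names are compared in lowercase, matching how the HTML parser normalizes them.
function isEventHandlerContentAttribute(name: string): boolean {
  return KNOWN_EVENT_HANDLER_ATTRIBUTES.has(name.toLowerCase());
}

// Returns a copy of the attribute map with event handler attributes dropped.
function stripEventHandlers(attributes: Record<string, string>): Record<string, string> {
  const kept: Record<string, string> = {};
  for (const [name, value] of Object.entries(attributes)) {
    if (!isEventHandlerContentAttribute(name)) {
      kept[name] = value;
    }
  }
  return kept;
}
```

The lookup is deliberately a set membership test and not an `on` prefix match, so the result depends only on which names the implementation actually treats as event handlers.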
In Gecko, Trusted Types currently uses EventNameList.h.
I've now removed the list of event handlers, instead adding a rule to remove event-handler-content-attributes. I'm iterating over those as if they were a list. Not sure if that's legitimate.
I've also added a note and a script that merges in a copy of the event handlers, so it's easier to see what this does. This should make it easy to modify and, eventually, to just use a list directly derived from the HTML spec text.
Unfortunately, the preview doesn't run the scripts, so that particular bit isn't easy to review.
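Since the preview can't run the script, here is a minimal sketch of what such a merge step could look like. It is not the script from the PR: the `BaselineConfig` shape, the function name, and the sample entries are all assumptions made for illustration.

```typescript
// Hypothetical config shape; the sample entries below are placeholders.
interface BaselineConfig {
  removeElements: string[];
  removeAttributes: string[];
}

// Returns a new config whose removeAttributes also contains every event handler
// name, de-duplicated and sorted so the rendered result is stable.
function mergeEventHandlers(
  baseline: BaselineConfig,
  eventHandlers: readonly string[],
): BaselineConfig {
  const merged = new Set<string>(baseline.removeAttributes.concat(eventHandlers));
  return {
    removeElements: baseline.removeElements.slice(),
    removeAttributes: Array.from(merged).sort(),
  };
}

// Placeholder inputs, not the real baseline or the full handler list.
const sampleBaseline: BaselineConfig = {
  removeElements: ["script"],
  removeAttributes: ["onload"],
};
const effective = mergeEventHandlers(sampleBaseline, ["onwheel", "onerror", "onload"]);
```

Returning a fresh object leaves the stored baseline untouched, so the merged view can be regenerated whenever the handler list changes.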
I think iterating over them is okay. We might have to revisit this when upstreaming.
index.bs
Outdated
<span class="marker">Note:</span> The [=remove unsafe=] algorithm specifies
to additionally remove any [=event handler content attributes=], as defined
in [[HTML]].
If a [=user agent=] defines extensions to the [[HTML]] spec with additional
[=event handler content attributes=], it is its responsibility to decide how
to handle them. Using the current [=event handler content attributes=] list,
the safe baseline configuration looks effectively like so:
Nice, this is very important.
nit picking, but is there a higher level of severity than a "note"? Can we do warnings? :D
> Nice, this is very important. nit picking, but is there a higher level of severity than a "note"? Can we do warnings? :D

The tool offers Notes, Issues, Examples, and Advisements.
HTML supports highly dramatic warnings, with a few usages in the spec, e.g. here.
If we want to use a warning here, I'd just emulate the CSS in the spec directly.
Guess it's best to do what's easiest and not worth nit picking too hard. :) thank you!
You could add a source comment to upgrade this to class=warning when upstreaming.
Using "advisement" with the label "Warning:" now. Also added a comment to use "class=warning" eventually.
There's also a pre-defined CSS style for "annoying-warning", but that sounds.. like too much. :)
I'm happy with this 🥳 thank you!
lgtm 👍
This seems fine. I'm a bit surprised we block nested documents in baseline though. Did we discuss that and I forgot about it?
We discussed it at the end of last year; see the minutes of the 2024-12-11 call. At the time, there was a rather large "other" category that we went through. The minutes record "some gut calls during the call", and "no" for frame-related stuff. I do remember these as gut calls indeed, with no one having super strong opinions one way or another.
Thanks! Admittedly it's also somewhat hard to do a nested document policy that makes sense, so once there's demand for that we can try to figure out more dedicated syntax I suppose. Or point people towards "unsafe".
Landing this. I'll note that the list format makes this really easy to change, spec-wise. :)
This PR was merged into the 6.4 branch.

Discussion
----------

[HtmlSanitizer] fix tests

| Q             | A
| ------------- | ---
| Branch?       | 6.4
| Bug fix?      | no
| New feature?  | no
| Deprecations? | no
| Issues        |
| License       | MIT

The files we used to download are no longer part of the WICG/sanitizer-api repository (see WICG/sanitizer-api#244).

Commits
-------

8f06032 fix tests
This is meant as a starting point for our built-ins. It's a "mostly free-form" document, in that it's just text with one line per element, plus markdown-style headings. The idea is to classify elements and attributes into groups.
I started out putting everything into "other", and then moving them into better defined groups. The idea is to work down the "other" list until it's empty.
I copied all the elements from the "proposed allow lists" doc into the "harmless" category. Will do the same for attributes.