src: implement whatwg's URLPattern spec #56452

Open
wants to merge 1 commit into main

Conversation

anonrig
Member

@anonrig anonrig commented Jan 3, 2025

Co-authored-by: Daniel Lemire (@lemire)

Blocked

This is blocked from landing due to the old macOS machines we use in our infrastructure (cc @nodejs/build)

19:50:46 ../deps/ada/ada.h:8457:10: fatal error: 'ranges' file not found
19:50:46 #include <ranges>
19:50:46          ^~~~~~~~
19:50:46   /usr/local/bin/ccache cc -o /Users/iojs/build/workspace/node-test-commit-osx/nodes/osx11-
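
For context, the failure above means the toolchain's standard library does not ship the C++20 `<ranges>` header. The sketch below only illustrates how header availability can be probed at compile time; it is not what Ada does (Ada includes `<ranges>` unconditionally), and `kHasRanges` and `contains_value` are hypothetical names.

```cpp
// Sketch: detect at compile time whether the standard library ships <ranges>.
// __has_include is standard since C++17; __cpp_lib_ranges comes from <version>.
#ifdef __has_include
#  if __has_include(<version>)
#    include <version>
#  endif
#endif

#include <algorithm>
#include <vector>

#if defined(__cpp_lib_ranges)
#  include <ranges>
inline constexpr bool kHasRanges = true;
#else
inline constexpr bool kHasRanges = false;
#endif

// Hypothetical helper: uses the ranges overload when it is available and
// falls back to the iterator-pair overload otherwise.
inline bool contains_value(const std::vector<int>& values, int needle) {
#if defined(__cpp_lib_ranges)
  return std::ranges::find(values, needle) != values.end();
#else
  return std::find(values.begin(), values.end(), needle) != values.end();
#endif
}
```

On the old macOS machines the fallback branch would be taken; on a C++20 toolchain the ranges branch compiles instead.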

Notes:

  • Ada now requires C++20
  • URLPattern is now a global class.
  • URLPattern is also exposed in node:url module
  • Ada now enables exceptions just like V8. This is necessary because std::regex, the default C++ regex library, reports errors only through exceptions; unlike std::filesystem, it has no non-throwing API surface. The alternative to enabling exceptions is to bundle Ada with a regex library or to implement its own regex parser, which is too much work for URLPattern at this stage. Future Ada releases can support such changes in order to disable exceptions.
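
The std::regex constraint in the last note can be illustrated with a minimal sketch. An invalid pattern is reported by throwing `std::regex_error` from the constructor, so code that wants a non-throwing interface has to catch it, which in turn requires building with exceptions enabled. `try_compile` is a hypothetical helper for illustration, not part of Ada.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper, not part of Ada: compiles a pattern and converts the
// std::regex_error exception into an empty optional.
inline std::optional<std::regex> try_compile(const std::string& pattern) {
  try {
    return std::regex(pattern, std::regex::ECMAScript);
  } catch (const std::regex_error&) {
    // The constructor has no error-code overload, so this catch block is the
    // only way to observe a malformed pattern.
    return std::nullopt;
  }
}
```

By contrast, most std::filesystem operations offer an overload that takes a `std::error_code&` and does not throw on failure.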

TODOs

  • Pass all web-platform tests
  • Release Ada v3 before landing this PR
  • Make sure to split all changes to multiple commits
  • Add @lemire as co-author to all commits
  • Land the upstream pull request "implement URLPattern" (ada-url/ada#785)
  • Add documentation for global and node:url module declarations.

cc @nodejs/cpp-reviewers

Fixes #40844

@anonrig anonrig requested review from jasnell and RafaelGSS January 3, 2025 16:07
@nodejs-github-bot
Collaborator

Review requested:

  • @nodejs/gyp
  • @nodejs/security-wg
  • @nodejs/startup
  • @nodejs/url
  • @nodejs/web-standards

@nodejs-github-bot nodejs-github-bot added lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Jan 3, 2025
@targos targos added the semver-major PRs that contain breaking changes and should be released in the next major version. label Jan 3, 2025
@anonrig anonrig added macos Issues and PRs related to the macOS platform / OSX. blocked PRs that are blocked by other issues or PRs. build-agenda labels Jan 3, 2025
@targos
Member

targos commented Jan 3, 2025

> Ada now enables exceptions just like UV and V8

Can you elaborate? libuv is a C library so I don't think exceptions exist there, and I'm pretty sure V8 is built with exceptions disabled.

@anonrig
Member Author

anonrig commented Jan 3, 2025

> > Ada now enables exceptions just like UV and V8
>
> Can you elaborate? libuv is a C library so I don't think exceptions exist there, and I'm pretty sure V8 is built with exceptions disabled.

My bad, UV does not enable exceptions. Referencing the v8.gyp file:

{
  'target_name': 'torque_base',
  'type': 'static_library',
  'toolsets': ['host', 'target'],
  'sources': [
    '<!@pymod_do_main(GN-scraper "<(V8_ROOT)/BUILD.gn"  "\\"torque_base.*?sources = ")',
  ],
  'dependencies': [
    'v8_shared_internal_headers',
    'v8_libbase',
  ],
  'defines!': [
    '_HAS_EXCEPTIONS=0',
    'BUILDING_V8_SHARED=1',
  ],
  'cflags_cc!': ['-fno-exceptions'],
  'cflags_cc': ['-fexceptions'],
  'xcode_settings': {
    'GCC_ENABLE_CPP_EXCEPTIONS': 'YES',  # -fexceptions
  },
  'msvs_settings': {
    'VCCLCompilerTool': {
      'RuntimeTypeInfo': 'true',
      'ExceptionHandling': 1,
    },
  },
}

@targos
Member

targos commented Jan 3, 2025

This is not really V8. It's a build-time executable (torque) used to generate code for V8.

@anonrig anonrig requested a review from Qard January 3, 2025 16:27

MaybeLocal<Value> URLPattern::Hash() const {
  auto context = env()->context();
  return ToV8Value(context, url_pattern_.get_hash());
Member

I think the key challenge here is that this will copy the string on every call. Any chance of memoizing the string once it has been created?

URLPattern::URLPattern(Environment* env,
                       Local<Object> object,
                       ada::url_pattern&& url_pattern)
    : BaseObject(env, object) {
Member

We likely should introduce this as experimental in the first release, even if it graduates from experimental quickly. There should likely be a warning emitted on the first construction.

Member Author

AFAIK, there is no easy way to emit an experimental warning from C++ that can be suppressed with a CLI flag. For now, I have marked it as experimental in the Node.js docs.

@jasnell
Member

jasnell commented Jan 3, 2025

Can you also include a fairly simple benchmark?

codecov bot commented Jan 3, 2025

Codecov Report

Attention: Patch coverage is 86.77043% with 68 lines in your changes missing coverage. Please review.

Project coverage is 88.71%. Comparing base (7b472fd) to head (2a19d32).
Report is 124 commits behind head on main.

Files with missing lines    Patch %   Lines
src/node_url_pattern.cc     86.87%    22 Missing and 44 partials ⚠️
src/node_url_pattern.h      0.00%     2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #56452      +/-   ##
==========================================
- Coverage   89.16%   88.71%   -0.45%     
==========================================
  Files         661      664       +3     
  Lines      191421   192638    +1217     
  Branches    36845    36770      -75     
==========================================
+ Hits       170673   170905     +232     
- Misses      13615    14482     +867     
- Partials     7133     7251     +118     
Files with missing lines                                 Coverage Δ
...internal/bootstrap/web/exposed-window-or-worker.js    93.89% <100.00%> (+0.19%) ⬆️
lib/internal/url.js                                      95.79% <100.00%> (-1.89%) ⬇️
lib/url.js                                               98.94% <100.00%> (-1.06%) ⬇️
src/node_binding.cc                                      83.66% <ø> (ø)
src/node_external_reference.h                            100.00% <ø> (ø)
src/node_url_pattern.h                                   0.00% <0.00%> (ø)
src/node_url_pattern.cc                                  86.87% <86.87%> (ø)

... and 151 files with indirect coverage changes

Member

@mcollina mcollina left a comment

I don’t think this is a good pattern to land in Node.js. Specifically, a server using this will create one URLPattern per route and iterate over them in a loop. This will be slow, especially when the matching route is the last one in the list.

(This feedback was provided when URLPattern was standardized and essentially ignored).

For this to be useful, we would need a Node.js-specific API that organizes these URLPattern instances in a radix (prefix) trie and does the matching all at once.
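
To make the concern concrete, here is a minimal sketch of trie-based route matching. It is an illustration only: it uses a simple per-segment trie rather than a compressed radix trie, it supports only literal segments and `:name` wildcards rather than URLPattern syntax, and `SegmentTrie`, `split_path`, and the integer route ids are hypothetical names, not a proposed Node.js API.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Splits "/users/42/posts" into {"users", "42", "posts"}.
inline std::vector<std::string_view> split_path(std::string_view path) {
  std::vector<std::string_view> parts;
  std::size_t start = 0;
  while (start < path.size()) {
    if (path[start] == '/') {
      ++start;
      continue;
    }
    std::size_t end = path.find('/', start);
    if (end == std::string_view::npos) end = path.size();
    parts.push_back(path.substr(start, end - start));
    start = end;
  }
  return parts;
}

// Hypothetical router: one trie node per path segment. A pattern segment that
// starts with ':' is a wildcard matching any single segment.
class SegmentTrie {
 public:
  void add(std::string_view pattern, int route_id) {
    Node* node = &root_;
    for (std::string_view part : split_path(pattern)) {
      if (part.front() == ':') {
        if (!node->wildcard) node->wildcard = std::make_unique<Node>();
        node = node->wildcard.get();
      } else {
        auto& child = node->literals[std::string(part)];
        if (!child) child = std::make_unique<Node>();
        node = child.get();
      }
    }
    node->route_id = route_id;
  }

  // Returns the id of the matching route, preferring literal segments over
  // wildcards, or std::nullopt when no route matches.
  std::optional<int> match(std::string_view path) const {
    const std::vector<std::string_view> parts = split_path(path);
    return match_from(root_, parts, 0);
  }

 private:
  struct Node {
    std::map<std::string, std::unique_ptr<Node>, std::less<>> literals;
    std::unique_ptr<Node> wildcard;
    std::optional<int> route_id;
  };

  static std::optional<int> match_from(
      const Node& node, const std::vector<std::string_view>& parts,
      std::size_t index) {
    if (index == parts.size()) return node.route_id;
    auto it = node.literals.find(parts[index]);
    if (it != node.literals.end()) {
      if (auto hit = match_from(*it->second, parts, index + 1)) return hit;
    }
    if (node.wildcard) return match_from(*node.wildcard, parts, index + 1);
    return std::nullopt;
  }

  Node root_;
};
```

With this layout, lookups follow the segments of the request path, so routes that share no prefix with the path are never examined. A linear scan instead tests every registered pattern until one matches.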

I can possibly be persuaded that we need this for Web platform compatibility, but it’s not that popular either (unlike fetch()).

@mcollina
Member

mcollina commented Jan 3, 2025

@jasnell I’ll try to build this and get a benchmark going against the ecosystem routers.

@anonrig
Member Author

anonrig commented Jan 3, 2025

> @jasnell I’ll try to build this and get a benchmark going against the ecosystem routers.

Right now, this pull request does not pass WPT and is not optimized at all. Benchmarks would not be meaningful yet.

@anonrig anonrig requested a review from jasnell January 24, 2025 21:42
@anonrig anonrig force-pushed the yagiz/implement-url-pattern branch from 981efd1 to b514115 Compare January 24, 2025 21:44
}

void URLPattern::MemoryInfo(MemoryTracker* tracker) const {
  // TODO(anonrig): Implement this.
Member

Don't forget this :-)

Member Author

I'm not sure how to properly set the memory of a url_pattern. Any suggestions?

Member

Just estimate, it does not need to be exact. This information is used when generating a heap snapshot so it's largely informational.
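
A rough estimate along these lines could look like the sketch below. `PatternComponent` is a hypothetical stand-in, since the real layout of `ada::url_pattern` is not shown in this thread, and `estimate_size` is an illustrative helper. In Node.js the resulting number would then be reported to the tracker, for example through `MemoryTracker::TrackFieldWithSize`, as far as I know.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one parsed pattern component; the real
// ada::url_pattern has a different layout.
struct PatternComponent {
  std::string pattern;
  std::vector<std::string> group_names;
};

// Rough estimate: the object itself plus the buffers it owns. Strings stored
// inline (small-string optimization) may be counted twice, which is
// acceptable for a value that is only informational.
inline std::size_t estimate_size(const PatternComponent& component) {
  std::size_t total = sizeof(PatternComponent) + component.pattern.capacity();
  total += component.group_names.capacity() * sizeof(std::string);
  for (const std::string& name : component.group_names) {
    total += name.capacity();
  }
  return total;
}
```

The estimate errs on the high side, which is usually fine for heap snapshot labels.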

Local<Value> ignore_case;
if (obj->Get(env->context(),
             FIXED_ONE_BYTE_STRING(env->isolate(), "ignoreCase"))
        .ToLocal(&ignore_case)) {
Member

Handle the ToLocal(...) == false case properly ;-)

Member Author

If it can't get it, we shouldn't fail, since the option is not required. I think this is unnecessary.

    return;
  }
  info.GetReturnValue().Set(result);
}
Member

The way these are defined, the property is going to be reserialized as a new string on every call. We should probably memoize these to avoid the redundant copies.

Member Author

Any suggestions on how to memoize this with the least code and the most memory safety possible?

Member

You would need to have v8::Global<v8::String> member fields to cache the values. Or maybe use SetLazyDataProperty(), though I'm not sure the latter would be spec compliant, as I believe it makes the properties instance properties rather than prototype properties.
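
The member-field approach can be sketched without V8 as follows. This is an analogy only: `Converted` is a hypothetical stand-in for the handle that would live in a `v8::Global<v8::String>` member, and `HashAccessor` is an illustrative class, not code from this pull request.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in: `Converted` plays the role of the V8 string that
// would be held in a v8::Global<v8::String> member.
struct Converted {
  std::string value;
};

class HashAccessor {
 public:
  explicit HashAccessor(std::string hash) : hash_(std::move(hash)) {}

  // Converts on the first call only; later calls return the cached object.
  const Converted& get() const {
    if (!cached_) {
      ++conversions_;
      cached_ = Converted{hash_};
    }
    return *cached_;
  }

  // Number of conversions performed so far, exposed for illustration.
  int conversions() const { return conversions_; }

 private:
  std::string hash_;
  mutable std::optional<Converted> cached_;
  mutable int conversions_ = 0;
};
```

Because the pattern is immutable after construction in this sketch, the cache never needs invalidation; one cached member per getter is enough.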

@anonrig anonrig force-pushed the yagiz/implement-url-pattern branch 4 times, most recently from 055fefc to 201bf3c Compare January 25, 2025 00:46
@nodejs-github-bot
Collaborator

@anonrig
Member Author

anonrig commented Jan 25, 2025

@nodejs/tsc @nodejs/build @nodejs/platform-macos I cannot land this pull request because of the old macOS infrastructure. This is currently the only blocker for this pull request.

19:50:46 ../deps/ada/ada.h:8457:10: fatal error: 'ranges' file not found
19:50:46 #include <ranges>
19:50:46          ^~~~~~~~
19:50:46   /usr/local/bin/ccache cc -o /Users/iojs/build/workspace/node-test-commit-osx/nodes/osx11-

@mcollina
Member

What's the status with WPTs?
Did you perform any optimizations?

    }
    return Null(env->isolate());
  }
  env->ThrowTypeError("Failed to exec URLPattern");
Member

These throws (here and the "Failed to test ..." one below) need proper error codes.

Member Author

Any suggestion on error code naming?

@anonrig
Member Author

anonrig commented Jan 25, 2025

> What's the status with WPTs?

There are currently 19 failing tests; 8 of them are invalid and need to be fixed in WPT. I have an open PR: web-platform-tests/wpt#49782

> Did you perform any optimizations?

Not at the moment. My goal is to pass almost all WPT and land this pull-request before optimizing it. Passing all WPT will give us the confidence to optimize more aggressively.

@anonrig anonrig force-pushed the yagiz/implement-url-pattern branch from 201bf3c to 77419ea Compare January 25, 2025 18:36
@anonrig anonrig force-pushed the yagiz/implement-url-pattern branch from 77419ea to 2a19d32 Compare January 25, 2025 18:45
Labels
  • blocked: PRs that are blocked by other issues or PRs.
  • build-agenda
  • lib / src: Issues and PRs related to general changes in the lib or src directory.
  • macos: Issues and PRs related to the macOS platform / OSX.
  • needs-ci: PRs that need a full CI run.
Development

Successfully merging this pull request may close these issues.

implement URLPattern