-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Solving the issue "Template matching regex features only matches prefix" #170
base: master
Are you sure you want to change the base?
Conversation
Can you add some tests to demonstrate the problem this solves? |
@halfak Okay I'll try doing that |
CN_TEMPLATES = [ | ||
r"Citation[_ ]needed", | ||
r"Cn", | ||
r"Fact" | ||
] | ||
cn_templates = wikitext.revision.template_names_matching( | ||
"|".join(CN_TEMPLATES), name="enwiki.revision.cn_templates") | ||
"$|".join(CN_TEMPLATES)+"$", name="enwiki.revision.cn_templates") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you considered a regex in the format (?:foo|bar|baz)$
, with a single $
after the parenthesis?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay I'll try to check that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, not really. I was thinking in something like this:
(?:Citation[_ ]needed|Cn|Fact)$
See https://regex101.com/r/GinCwl/1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, not really. I was thinking in something like this:
(?:Citation[_ ]needed|Cn|Fact)$
See https://regex101.com/r/GinCwl/1.
Oh that does seem much better, thank you.
I'll do that and run some tests
Heyy, I tried to solve this issue referring https://phabricator.wikimedia.org/T250522
I initially did want to pass a param to the template_names_matching() method but I believe it would have demanded changes in the definition in revscoring and passing a delimiter would still require the same amount of work, hence I individually added $ to every regex which called template_names_matching().