Add regex::Regex::get_config
#1250
Comments
We absolutely cannot expose [...] Morphing your feature request into something more practical, this means exposing the builder options on [...] But even if this were done, in order to write a round-trip serialization routine, you would either need to: [...]
My take here is that if you want to use a [...] Note also that there are builder options (like size limits) that cannot be expressed in the pattern string itself.
Defining a wrapper config to avoid exposing a public dependency on [...] In the feature request, I avoided discussing the representation used for serialization as it's a concern out of the scope of the [...] In my particular use-case, I would favor the rich object representation without default values; so keeping [...]
I am willing to work on a pull request to define a [...]
This is probably trickier to implement than you might imagine. A [...] So I'm pretty sure that in order to do this, you'd need to add more glue code in [...]
To be honest, I'm not sure this is worth doing. Like it's not clear to me why you can't put the options you care about into the pattern string itself. You might have specifically avoided talking about that, but I specifically chose to talk about it because use cases are what motivate API expansion and I want to better understand the use cases. If you have avenues available to you to achieve your goal without too much fuss, then adding new API surface area might not be worth doing.
Tangentially, I believe I would prefer these be getter methods on a [...]
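To make the "put the options into the pattern string" suggestion concrete, here is a minimal sketch using only the standard library. The helper name `with_inline_flags` is hypothetical and not part of any crate. It assumes the inline flag group syntax documented by the regex crate (`i` for case-insensitive, `u` for Unicode mode) and the builder defaults of case-sensitive matching with Unicode enabled. As noted earlier in the thread, options such as size limits have no inline equivalent, so this covers only part of the configuration.

```rust
/// Hypothetical helper: fold two builder-style options into the pattern text
/// by wrapping it in a flag group such as `(?i:...)`.
/// Assumes the builder defaults: case-sensitive, Unicode enabled.
pub fn with_inline_flags(pattern: &str, case_insensitive: bool, unicode: bool) -> String {
    let mut flags = String::new();
    if case_insensitive {
        flags.push('i'); // enable case-insensitive matching
    }
    if !unicode {
        flags.push_str("-u"); // disable Unicode mode
    }
    if flags.is_empty() {
        // Nothing differs from the assumed defaults, so keep the pattern untouched.
        pattern.to_string()
    } else {
        // A flag group is non-capturing, so capture group indices are unchanged.
        format!("(?{}:{})", flags, pattern)
    }
}
```

The resulting string could then be stored or sent as-is and compiled on the other side without touching a builder. Flags set inside the original pattern would still take precedence over the outer group.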
Thank you for the warnings. I still think that the exact representation is not that relevant as long as the data is available.
My use-case in particular is serialization for IPC. I have a config management tool (similar to Puppeteer or Ansible) that forks itself and uses a multi-process architecture with an orchestrator and agents. Communication is mainly done through pipes, using [...] So far I was using a custom [...] I also was aware of the [...]
Based on your last message, it feels, though, that not all of the config is still available once the regex is built. The goal of having getters that return enough data for serialization directly in the main crate would have been completeness. If it's not possible to have all the data needed to rebuild a regex, then it's maybe better to keep a separate representation, as I was doing so far.
Regarding the getter methods vs a config, you're right that getters may actually be simpler. The config struct would probably have had a lifetime, and structs with lifetimes are usually not very ergonomic.
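The "separate representation" approach mentioned above could look roughly like the following sketch. Everything here is an assumption for illustration: the `RegexSpec` type, its fields, and the hand-rolled text format are hypothetical and are not what the commenter actually used. A real tool would more likely derive a serialization format with a library. The point is only that the pattern and the options travel together, so the receiving process can rebuild an equivalent regex.

```rust
/// Hypothetical IPC-friendly description of a regex: the pattern plus the
/// builder options needed to rebuild it on the receiving side.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RegexSpec {
    pub pattern: String,
    pub case_insensitive: bool,
    pub unicode: bool,
    pub size_limit: Option<usize>,
}

impl RegexSpec {
    /// Encode as one header line of options followed by the raw pattern.
    pub fn encode(&self) -> String {
        let size = self.size_limit.map(|n| n.to_string()).unwrap_or_default();
        format!(
            "i={};u={};size={}\n{}",
            self.case_insensitive as u8, self.unicode as u8, size, self.pattern
        )
    }

    /// Decode the format produced by `encode`; returns `None` on malformed input.
    pub fn decode(input: &str) -> Option<RegexSpec> {
        // Split at the first newline only, so the pattern itself may contain newlines.
        let (header, pattern) = input.split_once('\n')?;
        let mut fields = header.split(';');
        let i = fields.next()?.strip_prefix("i=")?;
        let u = fields.next()?.strip_prefix("u=")?;
        let size = fields.next()?.strip_prefix("size=")?;
        if fields.next().is_some() {
            return None; // unexpected extra header field
        }
        Some(RegexSpec {
            pattern: pattern.to_string(),
            case_insensitive: parse_flag(i)?,
            unicode: parse_flag(u)?,
            size_limit: if size.is_empty() { None } else { Some(size.parse().ok()?) },
        })
    }
}

/// Parse a "0"/"1" header value into a bool.
fn parse_flag(s: &str) -> Option<bool> {
    match s {
        "0" => Some(false),
        "1" => Some(true),
        _ => None,
    }
}
```

On the receiving side, the decoded fields would be fed to a regex builder. Unlike inline flags, this form can also carry options such as a size limit.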
I think it's definitely possible, it's just not as simple as exposing something that's already there. Like right now a [...] (Lines 101 to 104 in 1a069b9)
I think that would probably need to be changed to this:
#[derive(Clone)]
pub struct Regex {
pub(crate) meta: meta::Regex,
pub(crate) pattern: Arc<Config>,
}
struct Config {
pattern: Box<str>,
case_insensitive: bool,
unicode: bool,
...
}
And then the surrounding code would need to build the [...]
I don't think this is a huge deal to do. Philosophically, I'm not opposed to your thoughts here. I'm open to this. If you put up a PR, I can take a look. I can't promise I'll merge it, but I'll definitely consider it.
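As a rough illustration of how the shape proposed above combines with the getter-method preference expressed earlier in the thread, here is a standalone mock-up. It is a sketch under stated assumptions, not the crate's implementation: the `Handle` type and all method names are hypothetical, the compiled matcher is left out entirely, and only the fields spelled out in the proposal are modeled.

```rust
use std::sync::Arc;

/// Stand-in for the options captured at build time (field set is illustrative).
#[derive(Debug)]
struct Config {
    pattern: Box<str>,
    case_insensitive: bool,
    unicode: bool,
}

/// Hypothetical stand-in for a regex handle; the compiled matcher is omitted.
#[derive(Clone, Debug)]
pub struct Handle {
    config: Arc<Config>,
}

impl Handle {
    /// Capture the pattern and options once, at construction time.
    pub fn new(pattern: &str, case_insensitive: bool, unicode: bool) -> Handle {
        let config = Config { pattern: pattern.into(), case_insensitive, unicode };
        Handle { config: Arc::new(config) }
    }

    /// Getters borrow from `self`, so no config type with a lifetime is exposed.
    pub fn as_str(&self) -> &str {
        &self.config.pattern
    }

    pub fn is_case_insensitive(&self) -> bool {
        self.config.case_insensitive
    }

    pub fn is_unicode(&self) -> bool {
        self.config.unicode
    }

    /// True when both handles point at the same shared config allocation.
    pub fn shares_config_with(&self, other: &Handle) -> bool {
        Arc::ptr_eq(&self.config, &other.config)
    }
}
```

Because the options sit behind an `Arc`, cloning a handle only bumps a reference count, which keeps the extra bookkeeping cheap.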
Feature Request
Please add a method regex::Regex::get_config which would expose the data from regex_automata::meta::Regex::get_config.
Why?
The main reason is to support full round-trips when serializing regexes. In particular, the serde_regex crate (officially recommended by regex) is not able to preserve config flags such as case insensitivity.
The current situation is such that neither regex_automata::meta::Regex nor regex::Regex is suitable for round-tripping serialization.
- regex_automata::meta::Regex stores the internal impl and config, and even makes the config readable; but it does not keep track of the input pattern
- regex::Regex is a wrapper around the previous struct, which also keeps track of the input pattern; however it does not provide any method to read the config
Alternatives
An alternative would be for regex_automata::meta::Regex to provide a way to recover the input pattern, but it feels a bit backwards. regex::Regex already has all the data available, so exposing it to enable lossless serialization feels like a better approach.