Regex to match crate and version #1143
-
I want to be able to capture the name of the crate and the version in any of these formats:
hello
hello = ''
hello = { version = '' }
hello = { version = '', features = [''] }
I tried many different queries with no success, e.g.
And ended up using a simple alternation:
The first one matches none of them. Can anyone explain why, and how I could've done it correctly without an alternation (only if it's easier than an alternation)?
-
This program, strictly speaking, satisfies your prompt:
use regex::Regex;
fn main() -> anyhow::Result<()> {
    let hay = "
hello
hello = '1.2.3'
hello = { version = '1.2.3' }
hello = { version = '1.2.3', features = [''] }
";
    let re = Regex::new(
        r"(?xm)
        (?<krate>^\S+)(?:\s*=\s*(?:\{\s*)?(?:version\s*=\s*)?'(?<version>[^']+)')?
        ",
    )
    .unwrap();
    for caps in re.captures_iter(hay) {
        dbg!(&caps);
    }
    Ok(())
}
And the Cargo.toml:
[package]
publish = false
name = "d1143"
version = "0.1.0"
edition = "2021"
[dependencies]
anyhow = "1.0.77"
regex = "1.10.2"
[[bin]]
name = "d1143"
path = "main.rs"
And then running it gives:
And that seems to satisfy what you want here for these inputs. But I expect the regex will match things other than what you intend, and it may simultaneously be overfitted to this particular input.
I actually like the alternation approach here better personally, because I find it easier to understand. The main issue with this approach is that you can't have duplicate capture group names. (I'm not sure why you didn't run into that problem? You have shown a regex with multiple groups using the same name.) However, you can side-step that by using the multi-regex support in regex-automata:
use regex_automata::meta::Regex;
fn main() -> anyhow::Result<()> {
    let hay = "
hello
hello = '1.2.3'
hello = { version = '1.2.3' }
hello = { version = '1.2.3', features = [''] }
";
    let re = Regex::new_many(&[
        r"(?m)(?<krate>^\S+)$",
        r"(?m)(?<krate>^\S+)\s*=\s*'(?<version>[^']+)'$",
        r"(?m)(?<krate>^\S+)\s*=\s*\{.*version\s*=\s*'(?<version>[^']+).*}$",
    ])
    .unwrap();
    for caps in re.captures_iter(hay) {
        let krate = &hay[caps.get_group_by_name("krate").unwrap()];
        let version =
            caps.get_group_by_name("version").map_or("unknown", |sp| &hay[sp]);
        println!("crate: {krate:?}, version: {version:?}");
    }
    Ok(())
}
And the Cargo.toml:
[package]
publish = false
name = "d1143"
version = "0.1.0"
edition = "2021"
[dependencies]
anyhow = "1.0.77"
regex-automata = "0.4.3"
[[bin]]
name = "d1143"
path = "main.rs"
And the output:
The meta regex API is a bit lower level and more cumbersome to use, but it lets you use multiple regexes with the same capture names.