Why does Regex::new report an error when reading the pattern from a file, but not when using a string literal?
#1245
What version of regex are you using?
1.11.1

Describe the bug at a high level.
I'm getting regular expressions from the nmap fingerprint library. When I use them individually, the regular expressions work, but when I create them in a loop, I get the error: Backreferences are not supported.

What are the steps to reproduce the behavior?
https://github.com/nmap/nmap/blob/master/nmap-service-probes
nmap-service-probes.txt
Your reproduction instructions were insufficient. Firstly, your program wouldn't compile. Here is an updated program that does:

    use regex::Regex;

    fn main() {
        let date: String = "\u{5}\0\r\u{3}\u{10}\0\0\0\u{18}\0\0\0\0\u{8}\u{1}@\u{4}\0\u{1}\u{5}\0\0\0\0".parse().unwrap();
        let mut matten: Vec<String> = Vec::new();
        let nsp_str = include_str!("./nmap-service-probes.txt");
        let mut nsp_lines = Vec::new();
        for l in nsp_str.lines() {
            nsp_lines.push(l.to_string());
        }
        for line in nsp_lines {
            if line.contains("#") {
                continue;
            } else if line.contains("Exclude") {
                continue;
            }
            if line.starts_with("match") {
                let line_split: Vec<&str> = line.split(" ").collect();
                let line_other = line_split[2..].to_vec().join(" ");
                let line_other_replace = line_other.replace("|s", "|");
                let line_other_split: Vec<&str> =
                    line_other_replace.split("|").collect();
                let pattern = line_other_split[1].to_string();
                // println!("{}", pattern);
                matten.push(pattern);
            }
        }
        let mut true_patten: Vec<String> = Vec::new();
        // let set = RegexSet::new(matten.clone()).unwrap();
        // let matches: Vec<_> = set.matches(&date).into_iter().collect();
        for ma in matten.clone() {
            match Regex::new(&ma) {
                Ok(_) => true_patten.push(ma),
                Err(e) => {
                    if ma.contains("0....") {
                        println!("Regex Error: {:?}", e);
                        println!("Regex Error: {}", ma);
                    }
                    // println!("Regex Error: {}", ma);
                    continue;
                }
            }
        }
        let test = Regex::new(
            "^\x05\0\r\x03\x10\0\0\0\x18\0\0\0....\x04\0\x01\x05\0...$",
        )
        .unwrap();
        let res = test.captures(&date);
        println!("{:?}", res);
    }

It also didn't compile because of the missing
In the future, please provide a working program and all necessary steps to reproduce the problem you're seeing. And please try to minimize the program provided. The one you gave is kind of a mess.

In any case, this is absolutely not a bug or a problem with this crate. As far as I can tell, the issue is that the patterns you're parsing from the text file are just not valid syntax for the

    let test = Regex::new(
        "^\x05\0\r\x03\x10\0\0\0\x18\0\0\0....\x04\0\x01\x05\0...$",
    )
    .unwrap();

Is because escape sequences like

    let test = Regex::new(
        r"^\x05\0\r\x03\x10\0\0\0\x18\0\0\0....\x04\0\x01\x05\0...$",
    )
    .unwrap();

Then you'll get an error:
Raw strings disable Rust string literal escape sequences. Raw strings are recommended for regex patterns. Otherwise, you end up needing to deal with not only the regex language but Rust's string literal language as well. Most programming languages have a "raw string" syntax.

Bottom line here is that there is no inconsistency. The problem with your test is that you're comparing apples and oranges.

So with that explanation out of the way, what can you do? Well in this case, it looks like your patterns assume the regex engine interprets

    for ma in matten.clone() {
        match regex::RegexBuilder::new(&ma).octal(true).build() {
            Ok(_) => true_patten.push(ma),
            Err(e) => {
                if ma.contains("0....") {
                    println!("Regex Error: {:?}", e);
                    println!("Regex Error: {}", ma);
                }
                // println!("Regex Error: {}", ma);
                continue;
            }
        }
    }

    let test = regex::RegexBuilder::new(
        r"^\x05\0\r\x03\x10\0\0\0\x18\0\0\0....\x04\0\x01\x05\0...$",
    )
    .octal(true)
    .build()
    .unwrap();

And now running your program, I get:
So it looks like you still have one regex pattern that doesn't work. And in this case, the error should make it pretty clear that
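
To make the difference between a regular string literal and a raw string literal concrete, here is a minimal sketch that is not part of the original reply. It uses only the standard library and never calls the regex crate; it just inspects the text that would be handed to `Regex::new` in each case. The `visible` helper and the shortened `\x05\0` pattern are hypothetical, chosen for illustration.

```rust
// Compare what a regular string literal and a raw string literal contain.
// No regex engine is involved; this only inspects the pattern text itself.

/// Hypothetical helper: render every char as a printable escape so that
/// control characters become visible.
fn visible(s: &str) -> String {
    s.chars().flat_map(|c| c.escape_default()).collect()
}

fn main() {
    // Regular literal: the Rust compiler turns \x05 and \0 into single
    // control characters before the regex engine ever sees the pattern.
    let cooked = "\x05\0";

    // Raw literal: the backslashes survive untouched, which is also what
    // text read from a file (for example via include_str!) looks like.
    let raw = r"\x05\0";

    // Two control characters versus six ordinary characters.
    assert_eq!(cooked.len(), 2);
    assert_eq!(raw.len(), 6);

    println!("regular literal: {} ({} bytes)", visible(cooked), cooked.len());
    println!("raw literal:     {} ({} bytes)", visible(raw), raw.len());
}
```

In the raw form it is the regex engine, not the Rust compiler, that has to make sense of `\x05` and `\0`. That is the form the patterns parsed from the text file arrive in, which is why the loop and the non-raw `test` literal behave differently.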