-
Out of curiosity, why not just implement the sparse set yourself? It's not a lot of code and it's pretty simple.

As to your question... it's hard to say. If you could show an MRE Rust program that uses a lot more memory than an equivalent C++ program using RE2, then I could likely give you a better answer.

Otherwise, the main knob to turn is probably to disable Unicode mode. The regex crate and RE2 have different support for Unicode. While there are a lot of differences in the details, probably the biggest difference is that the regexes

Also, in a multi-threaded environment, I would generally expect the regex crate to use more memory than RE2. RE2's lazy DFA makes use of one shared cache across all threads for a single lazy DFA, whereas the regex crate uses one cache per thread. But if you aren't measuring memory in a multi-threaded benchmark, this difference shouldn't come up.

There could also be memory usage related bugs. For example, I haven't looked at #1116 yet.

Otherwise it's very difficult to answer your question in the abstract. I would really want an MRE to give a more precise answer. (For example, maybe there's something different about how the NFAs are represented? Or maybe there's something causing a prefilter somewhere to be bigger than one might expect?)

In general, there is a lot of pressure on runtime performance in the form of benchmarks, but very little on memory usage. So there could very well be gains to be had here.
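On the sparse set mentioned at the top of this reply: here is a minimal sketch of what "implement it yourself" could look like, assuming the classic two-array (dense/sparse) layout for a fixed universe of integer IDs. The `SparseSet` name and its methods are made up for illustration; this is not the type the regex crate uses internally.

```rust
/// A fixed-capacity set of integers in `0..capacity`.
///
/// `dense` holds the members in insertion order, and `sparse[id]` holds the
/// position of `id` inside `dense`. Stale entries in `sparse` are harmless
/// because `contains` always cross-checks against `dense`.
pub struct SparseSet {
    dense: Vec<usize>,
    sparse: Vec<usize>,
}

impl SparseSet {
    /// Creates an empty set able to hold IDs in `0..capacity`.
    pub fn new(capacity: usize) -> SparseSet {
        SparseSet {
            dense: Vec::with_capacity(capacity),
            sparse: vec![0; capacity],
        }
    }

    /// Number of IDs currently in the set.
    pub fn len(&self) -> usize {
        self.dense.len()
    }

    /// True when the set has no members.
    pub fn is_empty(&self) -> bool {
        self.dense.is_empty()
    }

    /// O(1) membership test; out-of-range IDs are simply not members.
    pub fn contains(&self, id: usize) -> bool {
        match self.sparse.get(id) {
            Some(&pos) => self.dense.get(pos) == Some(&id),
            None => false,
        }
    }

    /// Inserts `id`, returning true if it was not already present.
    /// Panics if `id` is outside `0..capacity`.
    pub fn insert(&mut self, id: usize) -> bool {
        assert!(id < self.sparse.len(), "id out of range");
        if self.contains(id) {
            return false;
        }
        self.sparse[id] = self.dense.len();
        self.dense.push(id);
        true
    }

    /// O(1) clear: only the dense length is reset, `sparse` is left as-is.
    pub fn clear(&mut self) {
        self.dense.clear();
    }

    /// Members in insertion order.
    pub fn as_slice(&self) -> &[usize] {
        &self.dense
    }
}
```

The appeal of this layout is that insert, lookup, and clear are all constant time, and iteration only touches the members actually present. The cost is two `usize` slots per possible ID.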
-
I've finally done some very basic benching of `regex-filtered` against `FilteredRE2`: ua-parser/uap-rust@1af15c1

After fixing a dumb mistake it's basically on par in terms of runtime (and that's not even its final form).

However there's another standout issue: the RSS / memory footprint of the `regex-filtered` version is much higher than that of the re2 version, and simple tracking of the RSS over phases using memory-stats shows it is largely attributable to `regex`. This was confirmed by just loading all the regexes into a `Vec<Regex>` and finding that the RSS climbs from ~2MB (after loading the arguments and reading the 633 regex strings from the file) to ~100MB (and then a bit further to ~140MB when matching; I assume that's because caches come into play then).

I looked at the different features of regex, and the only knob which seemed to influence RSS is `dfa`: disabling it reduces RSS down to ~50MB before matching, which is similar to re2 (though caches then tack ~20MB on when matching occurs). Sadly, it's also very much load-bearing, and disabling it makes runtime shoot up from 46 seconds to 157 seconds on my machine (for 100 iterations over 75158 strings, using 633 regexes).

And so I wanted to know, is there a knob I missed? Maybe something wrong I did with regex::bytes or something? Or is it the tradeoff? I assume `regex-lite` would also be an option but wouldn't really be competitive with re2?

In fairness I can't say it's of major importance, the runtime seems a lot more relevant, but it's useful information to stash away given the use case for `regex-filtered` would routinely be to load hundreds to thousands of regexes, and it might inform e.g. lazy-compiling regexes (under the assumption that most of them will never actually be needed).
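For the lazy-compiling idea, a rough sketch of what I have in mind, using only the standard library so it stays independent of any regex engine. `LazySet`, `get_or_compile` and `compiled_count` are hypothetical names, and the `compile` closure is a stand-in for the real compilation step; with the regex crate that would presumably be something like `|s| Regex::new(s)`, storing the `Result`.

```rust
use std::sync::OnceLock;

/// Keeps the pattern strings around and compiles each one at most once,
/// the first time it is actually requested.
pub struct LazySet<T> {
    sources: Vec<String>,
    compiled: Vec<OnceLock<T>>,
}

impl<T> LazySet<T> {
    /// Stores the pattern sources without compiling anything.
    pub fn new<I, S>(sources: I) -> LazySet<T>
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        let sources: Vec<String> = sources.into_iter().map(Into::into).collect();
        let compiled = sources.iter().map(|_| OnceLock::new()).collect();
        LazySet { sources, compiled }
    }

    /// Returns the compiled form of pattern `index`, running `compile` only
    /// if that pattern has never been requested before.
    /// Panics if `index` is out of bounds.
    pub fn get_or_compile<F>(&self, index: usize, compile: F) -> &T
    where
        F: FnOnce(&str) -> T,
    {
        self.compiled[index].get_or_init(|| compile(self.sources[index].as_str()))
    }

    /// How many patterns have been compiled so far.
    pub fn compiled_count(&self) -> usize {
        self.compiled.iter().filter(|cell| cell.get().is_some()).count()
    }
}
```

With this shape, memory only grows for the patterns the prefilter actually selects, at the cost of paying compilation on the first match attempt instead of upfront. Whether that is a net win would depend on how many of the regexes really go unused.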