Document that for "usual" regex behavior multiline is required
Regular expression users typically expect `$` in a multiline string to
match the end of the current line, not the end of the whole string many
lines later. This is the default behavior in pretty much every regexp
engine: `grep`, `perl`, text editors, you name it… So the expectation is
a fair one, and the documentation should warn users that they need to
pass `multiline`.

Fixes: purescript-contrib#231
Hi-Angel committed Oct 16, 2024
1 parent c38e6ea commit a7eaab0
Showing 1 changed file with 21 additions and 9 deletions.
30 changes: 21 additions & 9 deletions src/Parsing/String.purs
@@ -231,14 +231,24 @@ match p = do
 -- |
 -- | #### Example
 -- |
--- | This example shows how to compile and run the `xMany` parser which will
--- | capture the regular expression pattern `x*`.
+-- | Compiling and running different regex parsers:
 -- |
 -- | ```purescript
--- | case regex "x*" noFlags of
--- |   Left compileError -> unsafeCrashWith $ "xMany failed to compile: " <> compileError
--- |   Right xMany -> runParser "xxxZ" do
--- |     xMany
+-- | example re flags text =
+-- |   case regex re flags of
+-- |     Left compileError -> unsafeCrashWith $ "xMany failed to compile: " <> compileError
+-- |     Right xMany -> runParser text do
+-- |       xMany
+-- |
+-- | -- Capturing a string per `x*` regex.
+-- | exampleXMany = example "x*" noFlags "xxxZ"
+-- |
+-- | -- Capturing everything till end of line.
+-- | exampleCharsTillEol = example ".*$" multiline "line1\nline2"
+-- |
+-- | -- Capturing everything till end of string. Note the distinction with
+-- | -- `exampleCharsTillEol`.
+-- | exampleCharsTillEos = example ".*$" (dotAll <> multiline) "line1\nline2"
 -- | ```
 -- |
 -- | #### Flags
@@ -249,9 +259,11 @@ match p = do
 -- | regex "x*" (dotAll <> ignoreCase)
 -- | ```
 -- |
--- | The `dotAll`, `unicode`, and `ignoreCase` flags might make sense for
--- | a `regex` parser. The other flags will
--- | probably cause surprising behavior and you should avoid them.
+-- | The `dotAll`, `multiline`, `unicode`, and `ignoreCase` flags might make
+-- | sense for a `regex` parser. In fact, per JS RegExp semantics matching a
+-- | single line boundary in a multiline string requires passing `multiline`.
+-- |
+-- | Other flags will probably cause surprising behavior and should be avoided.
 -- |
 -- | [*MDN Advanced searching with flags*](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions#advanced_searching_with_flags)
 regex :: forall m. String -> RegexFlags -> Either String (ParserT String m String)
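
The distinction the new doc comment draws between `exampleCharsTillEol` and `exampleCharsTillEos` can be checked directly against JS RegExp, independent of the PureScript library. The sketch below is only an illustration of the underlying flag semantics: `matchAtStart` is a hypothetical helper, and its leading `^` is an assumed stand-in for "match at the current input position", not a claim about how `regex` is implemented.

```typescript
// Hypothetical helper (not part of purescript-parsing): returns the text
// matched at the start of `input`, or null when the pattern does not match there.
function matchAtStart(pattern: string, flags: string, input: string): string | null {
  // The `^` prefix is an illustrative assumption standing in for
  // "match at the current position".
  const result = new RegExp("^(?:" + pattern + ")", flags).exec(input);
  return result === null ? null : result[0];
}

const text = "line1\nline2";

// No flags: `$` matches only at the end of the whole input and `.` does not
// match "\n", so `.*$` cannot succeed from the start of the first line.
const noFlags = matchAtStart(".*$", "", text);

// `m` (multiline): `$` also matches just before a line break.
const multilineOnly = matchAtStart(".*$", "m", text);

// `s` (dotAll) plus `m`: `.` now matches "\n", so the greedy `.*` runs to the
// end of the input.
const dotAllMultiline = matchAtStart(".*$", "ms", text);

console.log(JSON.stringify(noFlags));         // null
console.log(JSON.stringify(multilineOnly));   // "line1"
console.log(JSON.stringify(dotAllMultiline)); // "line1\nline2"
```

Under these assumptions, the no-flags case fails outright, which is the surprise the commit message describes.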