pgsql: Change regex \D and \W shorthands to always match newlines.



Tom Lane-2
Change regex \D and \W shorthands to always match newlines.

Newline is certainly not a digit, nor a word character, so it is
sensible that it should match these complemented character classes.
Previously, \D and \W acted that way by default, but in
newline-sensitive mode ('n' or 'p' flag) they did not match newlines.

This behavior was previously forced because explicit complemented
character classes don't match newlines in newline-sensitive mode;
but as of the previous commit that implementation constraint no
longer exists.  It seems useful to change this because the primary
real-world use for newline-sensitive mode seems to be to match the
default behavior of other regex engines such as Perl and JavaScript
... and their default behavior is that these match newlines.
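
As a rough illustration of the "other engines" behavior referred to
above, here is a minimal JavaScript sketch (the variable names are
illustrative only, and it says nothing about PostgreSQL's own
implementation).  With no flags set, \D and \W match a newline, while
"." does not:

```javascript
// Default (flag-less) JavaScript regex behavior on a lone newline.
const newline = "\n";

// \D is "any character that is not a digit"; newline qualifies.
const nonDigitMatchesNewline = /\D/.test(newline);

// \W is "any character that is not a word character"; newline qualifies.
const nonWordMatchesNewline = /\W/.test(newline);

// "." skips line terminators unless the s (dotAll) flag is given.
const dotMatchesNewline = /./.test(newline);

console.log({
  nonDigitMatchesNewline, // true
  nonWordMatchesNewline,  // true
  dotMatchesNewline,      // false
});
```

This is the combination that newline-sensitive mode is typically used
to approximate: "." stops at newlines, yet \D and \W still match them.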

The old behavior can be kept by writing an explicit complemented
character class, i.e. [^[:digit:]] or [^[:word:]].  (This means
that \D and \W are not exactly equivalent to those strings, but
they weren't anyway.)

Discussion: https://postgr.es/m/3220564.1613859619@...

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/7dc13a0f0805a353cea0455ed95701322b39d4dd

Modified Files
--------------
doc/src/sgml/func.sgml                             | 28 +++++++++++++++-------
src/backend/regex/re_syntax.n                      |  7 +++++-
src/backend/regex/regcomp.c                        |  6 ++---
.../modules/test_regex/expected/test_regex.out     | 12 ++++++----
4 files changed, 36 insertions(+), 17 deletions(-)