NOTIFY and pg_notify performance when deduplicating notifications

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

NOTIFY and pg_notify performance when deduplicating notifications

julien-2
Hi,

Back in 2016 a patch was proposed to fix the O(N^2) performance on transactions that generate many notifications. The performance issue is caused by the check for duplicate notifications.

I have a feature built around LISTEN / NOTIFY that works perfectly well, except for the enormous performance impact to transactions that emit large numbers of notifications. It’s very hard to work around the problem on the application side and transactions that could take just a few seconds end up taking over 30 minutes.

The patch that was proposed was nearly finalized, but ended up being abandoned. The latest patch is here: https://www.postgresql.org/message-id/CAP_rwwmKjO_p3kYB4jYceqcvcyRYjBQdji1GSCyqvLK%3D5nZzWQ%40mail.gmail.com .

I understand that the only work left to be done on the patch was to address comments made on the proposed syntax. I’m attaching an updated patch that changes the syntax to allow for a variable number of modes. The new syntax would be NOTIFY channel [ , payload [ , collapse_mode ] ] ; where collapse_mode can be 'never' or 'maybe'.

I hope this patch can be reviewed and included in PostgreSQL.

Best regards.

--
Julien Demoor
 

postgresql-notify-collapse-mode.patch (14K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: NOTIFY and pg_notify performance when deduplicating notifications

Catalin Iacob
On Tue, Oct 2, 2018 at 7:20 PM <[hidden email]> wrote:
> I have a feature built around LISTEN / NOTIFY that works perfectly well, except for the enormous performance impact to transactions that emit large numbers of notifications.

Indeed, I have the same and am very interested in this.

> I hope this patch can be reviewed and included in PostgreSQL.

I added this to the next Commitfest and added myself as a reviewer.
Will try to a review beginning of next week.
https://commitfest.postgresql.org/20/1820/

Reply | Threaded
Open this post in threaded view
|

RE: NOTIFY and pg_notify performance when deduplicating notifications

julien-2
> Indeed, I have the same and am very interested in this.
>
> > I hope this patch can be reviewed and included in PostgreSQL.
>
> I added this to the next Commitfest and added myself as a reviewer.
> Will try to a review beginning of next week.
> https://commitfest.postgresql.org/20/1820/

Thank you for reviewing.

I just caught an error in my patch, it's fixed in the attachment. The 'never' and 'maybe' collapse modes were mixed up in one location.

I can't find a reasonable way to build a regression test that checks whether notifications are effectively deduplicated. The output of the LISTEN command lists the PID of the notifying backend for each notification, e.g. : 'Asynchronous notification "foobar" received from server process with PID 24917'. I can't just add this to async.out. I did test manually for all eight combinations : four collapse mode values (missing, empty string, 'maybe' and 'never'), both with NOTIFY and pg_notify().

postgres-notify-all-v7.patch (14K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: NOTIFY and pg_notify performance when deduplicating notifications

Catalin Iacob
On Tue, Oct 9, 2018 at 2:17 PM <[hidden email]> wrote:
> I just caught an error in my patch, it's fixed in the attachment. The 'never' and 'maybe' collapse modes were mixed up in one location.

Here's a partial review of this version, did not read the doc part
very carefully.

First of all, I agree that this is a desirable feature as, for a large
number of notiifications, the O(n^2) overhead quickly becomes very
noticeable.

I would expect the collapse mode to be an enum which is created from
the string early on during parsing and used for the rest of the code.
Instead the string is used all the way leading to string comparisons
in the notification dispatcher and to the need of hardcoding special
strings in various places, including the contrib module.

This comment in the beginning of async.c should also be updated:
*   Duplicate notifications from the same transaction are sent out as one
*   notification only. This is done to save work when for example a trigger

pg_notify_3args duplicates pg_notify, I would expect a helper function
to be extracted and called from both.

There are braces placed on the same line as the if, for example if
(strlen(collapse_mode) != 0) { which seems to not be the project's
style.

>
> I can't find a reasonable way to build a regression test that checks whether notifications are effectively deduplicated. The output of the LISTEN command lists the PID of the notifying backend for each notification, e.g. : 'Asynchronous notification "foobar" received from server process with PID 24917'. I can't just add this to async.out. I did test manually for all eight combinations : four collapse mode values (missing, empty string, 'maybe' and 'never'), both with NOTIFY and pg_notify().

One way could be to take inspiration from
src/test/isolation/specs/async-notify.spec and check that
pg_notification_queue_usage() does grow when repeating the same
payload with collapse_mode='never' (while for always it would grow).
But I'm not sure it's worth the effort.

Reply | Threaded
Open this post in threaded view
|

Re: NOTIFY and pg_notify performance when deduplicating notifications

Catalin Iacob
On Wed, Oct 10, 2018 at 5:42 PM Catalin Iacob <[hidden email]> wrote:
> One way could be to take inspiration from
> src/test/isolation/specs/async-notify.spec and check that
> pg_notification_queue_usage() does grow when repeating the same
> payload with collapse_mode='never' (while for always it would grow).

Sorry, the last part should be "(while for *maybe* it would *not* grow)".

Reply | Threaded
Open this post in threaded view
|

Re: NOTIFY and pg_notify performance when deduplicating notifications

julien-2
In reply to this post by Catalin Iacob
On 10/10/2018 19:42, Catalin Iacob wrote:

> On Tue, Oct 9, 2018 at 2:17 PM <[hidden email]> wrote:
>> I just caught an error in my patch, it's fixed in the attachment. The
>> 'never' and 'maybe' collapse modes were mixed up in one location.
>
> Here's a partial review of this version, did not read the doc part
> very carefully.
>
> First of all, I agree that this is a desirable feature as, for a large
> number of notiifications, the O(n^2) overhead quickly becomes very
> noticeable.
>
> I would expect the collapse mode to be an enum which is created from
> the string early on during parsing and used for the rest of the code.
> Instead the string is used all the way leading to string comparisons
> in the notification dispatcher and to the need of hardcoding special
> strings in various places, including the contrib module.
>
> This comment in the beginning of async.c should also be updated:
> *   Duplicate notifications from the same transaction are sent out as one
> *   notification only. This is done to save work when for example a trigger
>
> pg_notify_3args duplicates pg_notify, I would expect a helper function
> to be extracted and called from both.
>
> There are braces placed on the same line as the if, for example if
> (strlen(collapse_mode) != 0) { which seems to not be the project's
> style.
Thank you for the review. I've addressed all your points in the attached
patch. The patch was made against release 11.1.

I couldn't find a way to make a good helper function for pg_notify_3args
and pg_notify, I hope my proposed solution is acceptable.

postgres-notify-all-v8.patch (35K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: NOTIFY and pg_notify performance when deduplicating notifications

Dmitry Dolgov
> On Mon, Nov 19, 2018 at 8:30 AM Julien Demoor <[hidden email]> wrote:
>
> Thank you for the review. I've addressed all your points in the attached
> patch. The patch was made against release 11.1.

I've noticed, that cfbot complains about this patch [1], since:

Duplicate OIDs detected:
3423
found 1 duplicate OID(s) in catalog data

At the same time it's quite minor problem, so I'm moving it to the nexc CF as
"Needs review".

Also since it's performance related patch, and the latest tests I see in the
history of this topic were posted around the time of the initial thread (which
was quite some time ago), could we expect to see some new benchmarks with this
patch and without? Probably the positive difference would be obvious, but
still.

[1]: https://travis-ci.org/postgresql-cfbot/postgresql/builds/461617098

Reply | Threaded
Open this post in threaded view
|

Re: NOTIFY and pg_notify performance when deduplicating notifications

julien-2
Is there a particular format that is needed for the benchmark? Here's a
quick benchmark below.

Unpatched, generating N notifications takes t milliseconds:
N= 10000, t= 348
N= 20000, t=1419
N= 30000, t=3253
N= 40000, t=6170

Patched, with the 'never' collapse mode, on another (much faster) machine:
N= 10000, t=   6
N= 20000, t=  32
N= 30000, t=  11
N= 40000, t=  14
N=100000, t=  37
N=200000, t=  86

The benchmark shows that the time to send notifications grows
quadratically with the number of notifications per transaction without
the patch while it grows linearly with the patch applied and
notification collapsing disabled.


The actual psql output is below.

-- Unpatched

jdemoor=# listen my_channel;
LISTEN
Time: 0.100 ms
jdemoor=#
jdemoor=#
jdemoor=# begin;
BEGIN
Time: 0.034 ms
jdemoor=# select count(*) from (select pg_notify('my_channel', g::text)
from generate_series(1, 10000) g) x;
  count
-------
  10000
(1 row)

Time: 348.214 ms
jdemoor=# rollback;
ROLLBACK
Time: 0.054 ms
jdemoor=#
jdemoor=# begin;
BEGIN
Time: 0.050 ms
jdemoor=# select count(*) from (select pg_notify('my_channel', g::text)
from generate_series(1, 20000) g) x;
  count
-------
  20000
(1 row)

Time: 1419.723 ms (00:01.420)
jdemoor=# rollback;
ROLLBACK
Time: 0.062 ms
jdemoor=#
jdemoor=# begin;
BEGIN
Time: 0.020 ms
jdemoor=# select count(*) from (select pg_notify('my_channel', g::text)
from generate_series(1, 30000) g) x;
  count
-------
  30000
(1 row)

Time: 3253.845 ms (00:03.254)
jdemoor=# rollback;
ROLLBACK
Time: 0.064 ms
jdemoor=#
jdemoor=# begin;
BEGIN
Time: 0.020 ms
jdemoor=# select count(*) from (select pg_notify('my_channel', g::text)
from generate_series(1, 40000) g) x;
  count
-------
  40000
(1 row)

Time: 6170.646 ms (00:06.171)
jdemoor=# rollback;
ROLLBACK
Time: 0.063 ms
jdemoor=#



-- Patched

postgres=# listen my_channel;
LISTEN
Time: 0.164 ms
postgres=#
postgres=#
postgres=# begin;
BEGIN
Time: 0.099 ms
postgres=# select count(*) from (select pg_notify('my_channel', g::text,
'never') from generate_series(1, 10000) g) x;
  count
-------
  10000
(1 row)

Time: 6.092 ms
postgres=# rollback;
ROLLBACK
Time: 0.112 ms
postgres=#
postgres=# begin;
BEGIN
Time: 0.032 ms
postgres=# select count(*) from (select pg_notify('my_channel', g::text,
'never') from generate_series(1, 20000) g) x;
  count
-------
  20000
(1 row)

Time: 7.378 ms
postgres=# rollback;
ROLLBACK
Time: 0.258 ms
postgres=#
postgres=# begin;
BEGIN
Time: 0.070 ms
postgres=# select count(*) from (select pg_notify('my_channel', g::text,
'never') from generate_series(1, 30000) g) x;
  count
-------
  30000
(1 row)

Time: 11.782 ms
postgres=# rollback;
ROLLBACK
Time: 0.256 ms
postgres=#
postgres=# begin;
BEGIN
Time: 0.073 ms
postgres=# select count(*) from (select pg_notify('my_channel', g::text,
'never') from generate_series(1, 40000) g) x;
  count
-------
  40000
(1 row)

Time: 14.269 ms
postgres=# rollback;
ROLLBACK
Time: 0.144 ms
postgres=# begin;
BEGIN
Time: 0.204 ms
postgres=# select count(*) from (select pg_notify('my_channel', g::text,
'never') from generate_series(1, 100000) g) x;
  count
--------
  100000
(1 row)

Time: 37.199 ms
postgres=# rollback;
ROLLBACK
Time: 0.864 ms
postgres=#
postgres=#
postgres=# begin;
BEGIN
Time: 0.126 ms
postgres=# select count(*) from (select pg_notify('my_channel', g::text,
'never') from generate_series(1, 200000) g) x;
  count
--------
  200000
(1 row)

Time: 86.477 ms
postgres=# rollback;
ROLLBACK
Time: 1.292 ms




On 01/12/2018 02:35, Dmitry Dolgov wrote:

>> On Mon, Nov 19, 2018 at 8:30 AM Julien Demoor <[hidden email]> wrote:
>>
>> Thank you for the review. I've addressed all your points in the attached
>> patch. The patch was made against release 11.1.
>
> I've noticed, that cfbot complains about this patch [1], since:
>
> Duplicate OIDs detected:
> 3423
> found 1 duplicate OID(s) in catalog data
>
> At the same time it's quite minor problem, so I'm moving it to the nexc CF
> as
> "Needs review".
>
> Also since it's performance related patch, and the latest tests I see in the
> history of this topic were posted around the time of the initial thread
> (which
> was quite some time ago), could we expect to see some new benchmarks with
> this
> patch and without? Probably the positive difference would be obvious, but
> still.
>
> [1]: https://travis-ci.org/postgresql-cfbot/postgresql/builds/461617098
>


Reply | Threaded
Open this post in threaded view
|

Re: NOTIFY and pg_notify performance when deduplicating notifications

Tom Lane-2
In reply to this post by julien-2
Julien Demoor <[hidden email]> writes:
> [ postgres-notify-all-v8.patch ]

I took a quick look through this.  A few comments:

* I find the proposed syntax extension for NOTIFY to be fairly bizarre;
it's unlike the way that we handle options for any other utility
statement.  It's also non-orthogonal, since you can't specify a collapse
mode without giving a payload string.  I think possibly a better answer
is the way that we've been adding optional parameters to VACUUM and
suchlike recently:

         NOTIFY [ (collapse = off/on) ] channel [ , payload ]

This'd be more extensible if we ever find a need for other options,
too.

* I'm also unimpressed with the idea of having a "maybe" collapse mode.
What's that supposed to do?  It doesn't appear to be different from
"always", so why not just reduce this to a boolean collapse-or-not
flag?

* The documentation doesn't agree with the code, since it fails to
mention "always" mode.

* I was kind of disappointed to find that the patch doesn't really
do anything to fix the performance problem for duplicate notifies.
The thread title had led me to hope for more ;-).  I wonder if we
couldn't do something involving hashing.  OTOH, it's certainly
arguable that that would be an independent patch, and that this
one should be seen as a feature patch ("make NOTIFY's behavior
for duplicate notifies more flexible and more clearly specified")
rather than a performance patch.

                        regards, tom lane