Buffers from parallel workers not accumulated to upper nodes with gather merge

Jehan-Guillaume de Rorthais
Hi,

A bug report[1] on the pev2 project exposes another bug in EXPLAIN (ANALYZE,
BUFFERS), similar to the one fixed two years ago[2].

In this report, buffers from the workers are not propagated to upper nodes
sitting on top of a Gather Merge node. In contrast, buffers from workers ARE
propagated to upper nodes on top of a Gather node.

Here is some SQL to reproduce:

  CREATE TABLE contacts (contactSourceName text, filler text)
      WITH ( autovacuum_enabled = off);
  INSERT INTO contacts
      SELECT rpad(chr(64+i)::text,10,'0'),
          rpad(j::text,500,'0')
      FROM generate_series(1,26) i
      CROSS JOIN  generate_series(1,3200) j;
 
  SET work_mem TO '4MB';
  SET random_page_cost TO 4;
  SET jit TO off;
  SET parallel_setup_cost TO 0;
  SET parallel_tuple_cost TO 0;
 
  EXPLAIN (ANALYZE, BUFFERS, VERBOSE, COSTS OFF, TIMING OFF)
  SELECT contactSourceName, count(*)
  FROM contacts
  GROUP BY contactSourceName;

  Finalize GroupAggregate (actual rows=26 loops=1)
    Buffers: shared hit=2087 read=1463 written=32
    ->  Gather Merge (actual rows=70 loops=1)
          Workers Planned: 2
          Workers Launched: 2
          Buffers: shared hit=3910 read=2049 written=96
          ->  Sort (actual rows=23 loops=3)
                Buffers: shared hit=3910 read=2049 written=96
                Worker 0:
                  Buffers: shared hit=914 read=287 written=32
                Worker 1:
                  Buffers: shared hit=909 read=299 written=32
  [...]

Gather Merge reports shared hit=3910.
Finalize GroupAggregate reports only shared hit=2087.
The 914 and 909 buffers from workers 0 and 1 have not been propagated:
2087 + 914 + 909 = 3910.

Comparing with a Gather node:

  EXPLAIN (ANALYZE, BUFFERS, VERBOSE, COSTS OFF, TIMING OFF) SELECT count(*) AS
  c FROM contacts;

  Finalize Aggregate (actual rows=1 loops=1)
    Output: count(*)
    Buffers: shared hit=3894 read=2049 written=96
    ->  Gather (actual rows=3 loops=1)
          Workers Planned: 2
          Workers Launched: 2
          Buffers: shared hit=3894 read=2049 written=96
          ->  Partial Aggregate (actual rows=1 loops=3)
                Buffers: shared hit=3894 read=2049 written=96
                Worker 0:
                  Buffers: shared hit=870 read=299 written=32
                Worker 1:
                  Buffers: shared hit=887 read=301 written=32
  [...]

Buffers are correctly propagated to the Finalize Aggregate node.

I'm not perfectly familiar with the code, but here is my analysis of the issue
and a proposed fix.

The Gather node works correctly because it calls ExecShutdownGatherWorkers as
soon as the workers are exhausted, in gather_readnext. Because of this,
buffers from workers are already accounted for and propagated to upper nodes
before the recursive call of ExecShutdownNode on each node. There's no similar
call to ExecShutdownGatherMergeWorkers for Gather Merge. Adding a call to
ExecShutdownGatherMergeWorkers in gather_merge_getnext when workers are
exhausted seems to fix the issue, but I feel like this is the wrong place to
fix it.
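
Sketched as pseudocode, the idea would sit roughly here (the identifiers are
the ones named in this thread; the exact placement and structure fields are
assumptions, not verified against the source tree):

```
/* In gather_merge_getnext(): once the binary heap is empty, every
 * source (workers and leader) is exhausted.  Mirroring what
 * gather_readnext() does for Gather, the workers could be shut down
 * right here, which would fold their buffer counters into
 * pgBufferUsage before ExecShutdownNode walks the upper nodes: */
if (binaryheap_empty(gm_state->gm_heap))
{
    /* All done: release workers and accumulate their stats now. */
    ExecShutdownGatherMergeWorkers(gm_state);
    return NULL;        /* no more tuples */
}
```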

During the recursive call to ExecShutdownNode, the upper nodes call
InstrStartNode *after* the workers' buffers have been propagated into
pgBufferUsage. Because of this, pgBufferUsage no longer changes before the
call to InstrStopNode, and buffers from parallel workers are only accounted up
to the Gather Merge node, not in the upper nodes.

Note that this bug might affect WAL counters as well, but I did not try
to reproduce or analyse that.

Moving the call to InstrStartNode before the recursive call seems to fix the
issue for buffers, and might fix the WAL counters as well, by setting
bufusage_start before buffers from workers are accumulated into pgBufferUsage.

You'll find attached a patch implementing this fix. Note that I asked
"stephanesql" on the GitHub issue whether he wants to be credited as the
reporter, and under what name he would like to appear.

Regards,

[1] https://github.com/dalibo/pev2/issues/240
[2] see 85c9d3475e4f680dbca7c04fe096af018f3b8760 and
https://www.postgresql.org/message-id/flat/86137f17-1dfb-42f9-7421-82fd786b04a1%40anayrat.info

v1-0001-Fix-explain-analyse-buffers-for-nodes-on-top-of-Gath.patch (1K)

Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

akapila
On Sat, Jul 18, 2020 at 7:32 PM Jehan-Guillaume de Rorthais
<[hidden email]> wrote:

>
> [...]
>
> The Gather node works correctly because it calls ExecShutdownGatherWorkers as
> soon as the workers are exhausted, in gather_readnext. Because of this,
> buffers from workers are already accounted for and propagated to upper nodes
> before the recursive call of ExecShutdownNode on each node. There's no similar
> call to ExecShutdownGatherMergeWorkers for Gather Merge. Adding a call to
> ExecShutdownGatherMergeWorkers in gather_merge_getnext when workers are
> exhausted seems to fix the issue, but I feel like this is the wrong place to
> fix it.
>

Why do you think so?  Irrespective of whether we want to call
ExecShutdownGatherMergeWorkers in gather_merge_getnext (when we know
we are done, aka binaryheap_empty) to fix this particular issue, I think it
is better to shut down the workers as soon as we are done, similar to what
we do for the Gather node.  It is good to release resources as soon as we
can.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

akapila
On Mon, Jul 20, 2020 at 11:30 AM Amit Kapila <[hidden email]> wrote:

>
> [...]
>
> Why do you think so?  I think irrespective of whether we want to call
> ExecShutdownGatherMergeWorkers in gather_merge_getnext (when we know
> we are done aka binaryheap_empty) to fix this particular issue, it is
> better to shutdown the workers as soon as we are done similar to what
> we do for Gather node.  It is good to release resources as soon as we
> can.
>

Apart from the above, we should also do what you are suggesting in the
patch, as that is also a problem, and your fix seems correct to me.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

Jehan-Guillaume de Rorthais
In reply to this post by akapila
On Mon, 20 Jul 2020 11:30:34 +0530
Amit Kapila <[hidden email]> wrote:

> On Sat, Jul 18, 2020 at 7:32 PM Jehan-Guillaume de Rorthais
> <[hidden email]> wrote:
[...]

> > The Gather node works correctly because it calls ExecShutdownGatherWorkers
> > as soon as the workers are exhausted, in gather_readnext. Because of this,
> > buffers from workers are already accounted for and propagated to upper
> > nodes before the recursive call of ExecShutdownNode on each node. There's
> > no similar call to ExecShutdownGatherMergeWorkers for Gather Merge. Adding
> > a call to ExecShutdownGatherMergeWorkers in gather_merge_getnext when
> > workers are exhausted seems to fix the issue, but I feel like this is the
> > wrong place to fix it.
>
> Why do you think so?

Because the fix seemed too specific to the Gather Merge node alone. This bug
might exist for other nodes (I didn't look for them) and the trap will
remain for future ones.

The fix in the ExecShutdownNode recursion has a broader impact on all present
and future nodes.

> I think irrespective of whether we want to call
> ExecShutdownGatherMergeWorkers in gather_merge_getnext (when we know
> we are done aka binaryheap_empty) to fix this particular issue, it is
> better to shutdown the workers as soon as we are done similar to what
> we do for Gather node.  It is good to release resources as soon as we
> can.

Indeed. I was focusing on the bug and didn't think about that. The patch I
tested was really just a quick test. I'll have a closer look at this, but I
suppose this might be considered a separate commit?

Regards,



Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

akapila
On Mon, Jul 20, 2020 at 1:29 PM Jehan-Guillaume de Rorthais
<[hidden email]> wrote:

>
> [...]
>
> Indeed. I was focusing on the bug and didn't think about that. The patch I
> tested was really just a quick test. I'll have a closer look at this, but I
> suppose this might be considered a separate commit?
>

Good question.  Initially, I thought we would have it in the same commit,
but thinking about it again, we might want to keep this one for HEAD only
and the ExecShutdownNode-related change in the back-branches.  BTW, can you
please test whether the problem exists in the back-branches, and whether
your change fixes it there as well?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

Jehan-Guillaume de Rorthais


On 20 July 2020 at 11:59:00 GMT+02:00, Amit Kapila <[hidden email]> wrote:

>[...]
>
>Good question.  Initially, I thought we would have it in the same commit,
>but thinking about it again, we might want to keep this one for HEAD only
>and the ExecShutdownNode-related change in the back-branches.  BTW, can you
>please test whether the problem exists in the back-branches, and whether
>your change fixes it there as well?

Yes. I'll take care of that later today or tomorrow.

Regards,



Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

Jehan-Guillaume de Rorthais
In reply to this post by akapila
On Mon, 20 Jul 2020 15:29:00 +0530
Amit Kapila <[hidden email]> wrote:

> [...]
>
> Good question.  Initially, I thought we would have it in the same commit,
> but thinking about it again, we might want to keep this one for HEAD only
> and the ExecShutdownNode-related change in the back-branches.
OK.

I'll send a patch proposal for Gather Merge on HEAD tomorrow.

> BTW, can you please test whether the problem exists in the back-branches,
> and whether your change fixes it there as well?

Gather Merge appeared in v10, but I haven't been able to produce a plan using
it there. The SQL scenario exposes the same bug in v11, v12, v13 and head.

Nevertheless, the recursion in ExecShutdownNode always happens before
InstrStartNode back to v10. The code in 9.6 is fine: the recursion happens
after both InstrStartNode and InstrStopNode.

You'll find in attachment patches for versions from 10 to head.

Regards,

v1-0001-Fix-explain-analyse-buffers-for-nodes-on-top-of-Gath-head.patch (2K)
v1-0001-Fix-explain-analyse-buffers-for-nodes-on-top-of-Gath-v10.patch (2K)
v1-0001-Fix-explain-analyse-buffers-for-nodes-on-top-of-Gath-v11.patch (2K)
v1-0001-Fix-explain-analyse-buffers-for-nodes-on-top-of-Gath-v12.patch (2K)
v1-0001-Fix-explain-analyse-buffers-for-nodes-on-top-of-Gath-v13.patch (2K)

Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

Jehan-Guillaume de Rorthais
On Tue, 21 Jul 2020 17:34:57 +0200
Jehan-Guillaume de Rorthais <[hidden email]> wrote:
> [...]
>
> OK.
>
> I'll send a patch proposal for Gather Merge on HEAD tomorrow.
Please find attached the patch to release Gather Merge workers as soon as
they are exhausted.

I was wondering how interesting it would be to release each worker
individually, independently of the others. This might help for long-running
queries and/or large numbers of workers, and would apply to Gather as well.
Do you think it is worth a try? Maybe it has already been discussed in the
past?

Regards,

v1-0001-Shutdown-Gather-Merge-workers-when-exhausted.patch (931 bytes)

Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

akapila
In reply to this post by Jehan-Guillaume de Rorthais
On Tue, Jul 21, 2020 at 9:05 PM Jehan-Guillaume de Rorthais
<[hidden email]> wrote:

>
> On Mon, 20 Jul 2020 15:29:00 +0530
> Amit Kapila <[hidden email]> wrote:
>
>
> > BTW, can you please test if the problem exist in back-branches, and does
> > your change fix it there as well?
>
> Gather Merge appears in v10, but I haven't been able to produce a plan using it.
>
I am able to do it for v10.  I think you can try with "set
enable_hashagg = off;" while following all the other steps mentioned in your
email.  Attached, please find the patch for v10 with a modified commit
message.  I will do it for the other versions as well if you are fine with
it?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

v2-0001-Fix-buffer-usage-stats-for-nodes-above-Gather-Merge.patch (2K)

Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

Jehan-Guillaume de Rorthais
On Thu, 23 Jul 2020 15:44:16 +0530
Amit Kapila <[hidden email]> wrote:

> On Tue, Jul 21, 2020 at 9:05 PM Jehan-Guillaume de Rorthais
> <[hidden email]> wrote:
> >
> > On Mon, 20 Jul 2020 15:29:00 +0530
> > Amit Kapila <[hidden email]> wrote:
> >
> >  
> > > BTW, can you please test if the problem exist in back-branches, and does
> > > your change fix it there as well?  
> >
> > Gather Merge appears in v10, but I haven't been able to produce a plan
> > using it.
>
> I am able to do it for v10.  I think you can try by "set
> enable_hashagg = off;" and following all other steps mentioned in your
> email?

Indeed, I confirm it produces the same result. I've been able to successfully
test the proposed fix.

> Attached, please find the patch for v10 with a modified commit
> message.  I will do it for other versions as well if you are fine with
> it?

I'm not a native English speaker, but I would rephrase this:

  We can do a similar thing for Gather Merge as well, but it seems better
  to account for the stats during node shutdown, after execution completes.

Thanks!

Regards,



Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

akapila
In reply to this post by Jehan-Guillaume de Rorthais
On Thu, Jul 23, 2020 at 2:37 PM Jehan-Guillaume de Rorthais
<[hidden email]> wrote:

>
> On Tue, 21 Jul 2020 17:34:57 +0200
> Jehan-Guillaume de Rorthais <[hidden email]> wrote:
> > On Mon, 20 Jul 2020 15:29:00 +0530
> >
> > I'll send a patch proposal for Gather Merge on HEAD tomorrow.
>
> Please, find in attachment the patch to release Gather Merge workers as soon as
> they are exhausted.
>

Okay, thanks.  I'll take care of this after pushing the bug-fix patch.

> I was wondering how interesting it would be to release each worker
> individually, independently of the others. This might help for long-running
> queries and/or large numbers of workers, and would apply to Gather as well.
> Do you think it is worth a try? Maybe it has already been discussed in the
> past?
>

I am not sure that would buy us much, because the workers exit as
soon as their work is done.  Accumulating the stats for each
individual worker and ensuring it is gone separately would be more
work without much gain.


--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

akapila
In reply to this post by Jehan-Guillaume de Rorthais
On Thu, Jul 23, 2020 at 3:59 PM Jehan-Guillaume de Rorthais
<[hidden email]> wrote:

> [...]
>
> I'm not a native English speaker, but I would rephrase this:
>
>   We can do a similar thing for Gather Merge as well, but it seems better
>   to account for the stats during node shutdown, after execution completes.
>

Fair enough.  I will adopt your suggestion before commit.  I'll wait
for a day or so to see if anyone has more comments.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

akapila
On Thu, Jul 23, 2020 at 4:24 PM Amit Kapila <[hidden email]> wrote:

>
> [...]
>
> Fair enough.  I will adopt your suggestion before commit.   I'll wait
> for a day or so to see if anyone has more comments.
>

Pushed.

--
With Regards,
Amit Kapila.



Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

Jehan-Guillaume de Rorthais
On Sat, 25 Jul 2020 13:16:55 +0530
Amit Kapila <[hidden email]> wrote:

> [...]
>
> Pushed.

Thank you!

Regards,