Default setting for enable_hashagg_disk


Default setting for enable_hashagg_disk

Jeff Davis-8
This is just a placeholder thread for an open item that I'm adding to
the Open Items list. We can make a decision later.

Now that we have Disk-based Hash Aggregation, there are a lot more
situations where the planner can choose HashAgg. The
enable_hashagg_disk GUC, if set to true, chooses HashAgg based on
costing. If false, it only generates a HashAgg path if it thinks it
will fit in work_mem, similar to the old behavior (though it will now
spill to disk if the planner was wrong about it fitting in work_mem).
The current default is true.
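
To make the two modes concrete, here is a minimal session sketch; `big_table` is a hypothetical stand-in for any table whose aggregate hash table exceeds work_mem:

```sql
SET work_mem = '4MB';

SET enable_hashagg_disk = on;   -- default: choose HashAgg purely on cost,
                                -- even if it is expected to spill
EXPLAIN SELECT a, count(*) FROM big_table GROUP BY a;

SET enable_hashagg_disk = off;  -- old-style: only generate a HashAgg path
                                -- if the hash table is expected to fit
EXPLAIN SELECT a, count(*) FROM big_table GROUP BY a;
```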

I expect this to be a win in a lot of cases, obviously. But as with any
planner change, it will be wrong sometimes. We may want to be
conservative and set the default to false depending on the experience
during beta. I'm inclined to leave it as true for now though, because
that will give us better information upon which to base any decision.

Regards,
        Jeff Davis





Re: Default setting for enable_hashagg_disk

Justin Pryzby
On Tue, Apr 07, 2020 at 11:20:46AM -0700, Jeff Davis wrote:
> The enable_hashagg_disk GUC, if set to true, chooses HashAgg based on
> costing. If false, it only generates a HashAgg path if it thinks it will fit
> in work_mem, similar to the old behavior (though it will now spill to disk if
> the planner was wrong about it fitting in work_mem).  The current default is
> true.

Are there any other GUCs that behave like that ?  It's confusing to me when I
see "Disk Usage: ... kB", despite setting it to "disable", and without the
usual disable_cost.  I realize that postgres chose the plan on the hypothesis
that it would *not* exceed work_mem, and that spilling to disk is considered
preferable to ignoring the setting, and that "going back" to planning phase
isn't a possibility.
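
For comparison, when one of the other enable_* GUCs loses, the plan leaves a visible trace: the unavoidable node still appears, but with disable_cost (1.0e10) folded into its estimate. A sketch, again with a hypothetical table:

```sql
SET enable_sort = off;
-- If a sort is the only way to satisfy the ORDER BY, the plan still
-- contains a Sort node, but its startup cost is inflated by ~1e10,
-- so EXPLAIN (with costs) makes the "disabled" state obvious.
EXPLAIN SELECT a FROM big_table ORDER BY a;
```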

template1=# explain (analyze, costs off, summary off) SELECT a, COUNT(1) FROM generate_series(1,999999) a GROUP BY 1 ;
 HashAggregate (actual time=1370.945..2877.250 rows=999999 loops=1)
   Group Key: a
   Peak Memory Usage: 5017 kB
   Disk Usage: 22992 kB
   HashAgg Batches: 84
   ->  Function Scan on generate_series a (actual time=314.507..741.517 rows=999999 loops=1)

A previous version of the docs said this, which I thought was confusing, and you removed it.
But I guess this is the behavior it was trying to .. explain.

+      <term><varname>enable_hashagg_disk</varname> (<type>boolean</type>)
+        ... This only affects the planner choice;
+        execution time may still require using disk-based hash
+        aggregation. The default is <literal>on</literal>.

I suggest that should be reworded and then re-introduced, unless there's some
further behavior change allowing the previous behavior of
might-exceed-work-mem.

"This setting determines whether the planner will elect to use a hash plan
which it expects will exceed work_mem and spill to disk.  During execution,
hash nodes which exceed work_mem will spill to disk even if this setting is
disabled.  To avoid spilling to disk, either increase work_mem (or set
enable_hashagg=off)."
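
The two escape hatches in that suggested wording, as a user would apply them at the session level:

```sql
-- Option 1: raise the spill threshold so the hash table fits in memory
SET work_mem = '256MB';

-- Option 2: rule out hash aggregation entirely, steering the planner
-- toward a Sort + GroupAggregate plan
SET enable_hashagg = off;
```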

For sure the release notes should recommend re-calibrating work_mem.

--
Justin



Re: Default setting for enable_hashagg_disk

Tomas Vondra-4
On Tue, Apr 07, 2020 at 05:39:01PM -0500, Justin Pryzby wrote:

>On Tue, Apr 07, 2020 at 11:20:46AM -0700, Jeff Davis wrote:
>> The enable_hashagg_disk GUC, if set to true, chooses HashAgg based on
>> costing. If false, it only generates a HashAgg path if it thinks it will fit
>> in work_mem, similar to the old behavior (though it will now spill to disk if
>> the planner was wrong about it fitting in work_mem).  The current default is
>> true.
>
>Are there any other GUCs that behave like that ?  It's confusing to me when I
>see "Disk Usage: ... kB", despite setting it to "disable", and without the
>usual disable_cost.  I realize that postgres chose the plan on the hypothesis
>that it would *not* exceed work_mem, and that spilling to disk is considered
>preferable to ignoring the setting, and that "going back" to planning phase
>isn't a possibility.
>

Is it really any different from our enable_* GUCs? Even if you do e.g.
enable_sort=off, we may still do a sort. Same for enable_groupagg etc.

>template1=# explain (analyze, costs off, summary off) SELECT a, COUNT(1) FROM generate_series(1,999999) a GROUP BY 1 ;
> HashAggregate (actual time=1370.945..2877.250 rows=999999 loops=1)
>   Group Key: a
>   Peak Memory Usage: 5017 kB
>   Disk Usage: 22992 kB
>   HashAgg Batches: 84
>   ->  Function Scan on generate_series a (actual time=314.507..741.517 rows=999999 loops=1)
>
>A previous version of the docs said this, which I thought was confusing, and you removed it.
>But I guess this is the behavior it was trying to .. explain.
>
>+      <term><varname>enable_hashagg_disk</varname> (<type>boolean</type>)
>+        ... This only affects the planner choice;
>+        execution time may still require using disk-based hash
>+        aggregation. The default is <literal>on</literal>.
>
>I suggest that should be reworded and then re-introduced, unless there's some
>further behavior change allowing the previous behavior of
>might-exceed-work-mem.
>

Yeah, it would be good to mention this is a best-effort setting.

>"This setting determines whether the planner will elect to use a hash plan
>which it expects will exceed work_mem and spill to disk.  During execution,
>hash nodes which exceed work_mem will spill to disk even if this setting is
>disabled.  To avoid spilling to disk, either increase work_mem (or set
>enable_hashagg=off)."
>
>For sure the release notes should recommend re-calibrating work_mem.
>

I don't follow. Why would the recalibrating be needed?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Default setting for enable_hashagg_disk

Justin Pryzby
On Thu, Apr 09, 2020 at 01:48:55PM +0200, Tomas Vondra wrote:

> On Tue, Apr 07, 2020 at 05:39:01PM -0500, Justin Pryzby wrote:
> > On Tue, Apr 07, 2020 at 11:20:46AM -0700, Jeff Davis wrote:
> > > The enable_hashagg_disk GUC, if set to true, chooses HashAgg based on
> > > costing. If false, it only generates a HashAgg path if it thinks it will fit
> > > in work_mem, similar to the old behavior (though it will now spill to disk if
> > > the planner was wrong about it fitting in work_mem).  The current default is
> > > true.
> >
> > Are there any other GUCs that behave like that ?  It's confusing to me when I
> > see "Disk Usage: ... kB", despite setting it to "disable", and without the
> > usual disable_cost.  I realize that postgres chose the plan on the hypothesis
> > that it would *not* exceed work_mem, and that spilling to disk is considered
> > preferable to ignoring the setting, and that "going back" to planning phase
> > isn't a possibility.
>
> Is it really any different from our enable_* GUCs? Even if you do e.g.
> enable_sort=off, we may still do a sort. Same for enable_groupagg etc.

Those show that the GUC was disabled by showing disable_cost.  That's what's
different about this one.

Also.. there's no such thing as enable_groupagg?  Unless I've been missing out
on something.

> > "This setting determines whether the planner will elect to use a hash plan
> > which it expects will exceed work_mem and spill to disk.  During execution,
> > hash nodes which exceed work_mem will spill to disk even if this setting is
> > disabled.  To avoid spilling to disk, either increase work_mem (or set
> > enable_hashagg=off)."
> >
> > For sure the release notes should recommend re-calibrating work_mem.
>
> I don't follow. Why would the recalibrating be needed?

Because HashAgg plans which used to run fine (because they weren't prevented
from overflowing work_mem) might now run poorly after spilling to disk (because
of overflowing work_mem).

--
Justin



Re: Default setting for enable_hashagg_disk

Jeff Davis-8
On Thu, 2020-04-09 at 12:24 -0500, Justin Pryzby wrote:
> Also.. there's no such thing as enable_groupagg?  Unless I've been
> missing out
> on something.

I thought about adding that, and went so far as to make a patch. But it
didn't seem right to me -- the grouping isn't what takes the time, it's
the sorting. So what would the point of such a GUC be? To disable
GroupAgg when the input data is already sorted? Or a strange way to
disable Sort?

> Because HashAgg plans which used to run fine (because they weren't
> prevented
> from overflowing work_mem) might now run poorly after spilling to
> disk (because
> of overflowing work_mem).

It's probably worth a mention in the release notes, but I wouldn't word
it too strongly. Typically the performance difference is not a lot if
the workload still fits in system memory.

Regards,
        Jeff Davis





Re: Default setting for enable_hashagg_disk

Robert Haas
In reply to this post by Tomas Vondra-4
On Thu, Apr 9, 2020 at 7:49 AM Tomas Vondra
<[hidden email]> wrote:
> Is it really any different from our enable_* GUCs? Even if you do e.g.
> enable_sort=off, we may still do a sort. Same for enable_groupagg etc.

I think it's actually pretty different. All of the other enable_* GUCs
disable an entire type of plan node, except for cases where that would
otherwise result in planning failure. This just disables a portion of
the planning logic for a certain kind of node, without actually
disabling the whole node type. I'm not sure that's a bad idea, but it
definitely seems to be inconsistent with what we've done in the past.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Default setting for enable_hashagg_disk

Jeff Davis-8
On Thu, 2020-04-09 at 15:26 -0400, Robert Haas wrote:
> I think it's actually pretty different. All of the other enable_*
> GUCs
> disable an entire type of plan node, except for cases where that
> would
> otherwise result in planning failure. This just disables a portion of
> the planning logic for a certain kind of node, without actually
> disabling the whole node type. I'm not sure that's a bad idea, but it
> definitely seems to be inconsistent with what we've done in the past.

The patch adds two GUCs. Both are slightly weird, to be honest, but let
me explain the reasoning. I am open to other suggestions.

1. enable_hashagg_disk (default true):

This is essentially there just to get some of the old behavior back, to
give people an escape hatch if they see bad plans while we are tweaking
the costing. The old behavior was weird, so this GUC is also weird.

Perhaps we can make this a compatibility GUC that we eventually drop? I
don't necessarily think this GUC would make sense, say, 5 versions from
now. I'm just trying to be conservative because I know that, even if
the plans are faster for 90% of people, the other 10% will be unhappy
and want a way to work around it.

2. enable_groupingsets_hash_disk (default false):

This is about how we choose which grouping sets to hash and which to
sort when generating mixed mode paths.

Even before this patch, there are quite a few paths that could be
generated. It tries to estimate the size of each grouping set's hash
table, and then see how many it can fit in work_mem (knapsack), while
also taking advantage of any path keys, etc.

With Disk-based Hash Aggregation, in principle we can generate paths
representing any combination of hashing and sorting for the grouping
sets. But that would be overkill (and grow to a huge number of paths if
we have more than a handful of grouping sets). So I think the existing
planner logic for grouping sets is fine for now. We might come up with
a better approach later.

But that created a testing problem, because if the planner estimates
correctly, no hashed grouping sets will spill, and the spilling code
won't be exercised. This GUC makes the planner disregard which grouping
sets' hash tables will fit, making it much easier to exercise the
spilling code. Is there a better way I should be testing this code
path?

Regards,
        Jeff Davis





Re: Default setting for enable_hashagg_disk

Melanie Plageman


On Thu, Apr 9, 2020 at 1:02 PM Jeff Davis <[hidden email]> wrote:
2. enable_groupingsets_hash_disk (default false):

This is about how we choose which grouping sets to hash and which to
sort when generating mixed mode paths.

Even before this patch, there are quite a few paths that could be
generated. It tries to estimate the size of each grouping set's hash
table, and then see how many it can fit in work_mem (knapsack), while
also taking advantage of any path keys, etc.

With Disk-based Hash Aggregation, in principle we can generate paths
representing any combination of hashing and sorting for the grouping
sets. But that would be overkill (and grow to a huge number of paths if
we have more than a handful of grouping sets). So I think the existing
planner logic for grouping sets is fine for now. We might come up with
a better approach later.

But that created a testing problem, because if the planner estimates
correctly, no hashed grouping sets will spill, and the spilling code
won't be exercised. This GUC makes the planner disregard which grouping
sets' hash tables will fit, making it much easier to exercise the
spilling code. Is there a better way I should be testing this code
path?


So, I was catching up on email and noticed the last email in this
thread.

I think I am not fully understanding what enable_groupingsets_hash_disk
does. Is it only for testing?

Using the tests you added to src/test/regress/sql/groupingsets.sql, I
did get a plan that looks like hashagg is spilling to disk (goes through
hashagg_spill_tuple() code path and has number of batches reported in
Explain) in a MixedAgg plan for a grouping sets query even with
enable_groupingsets_hash_disk set to false. You don't have the exact
query I tried (below) in the test suite, but it is basically what is
already there, so I must be missing something.

set enable_hashagg_disk = true;
SET enable_groupingsets_hash_disk = false;
SET work_mem='64kB';
set enable_hashagg = true;
set jit_above_cost = 0;
drop table if exists gs_hash_1;
create table gs_hash_1 as
select g1000, g100, g10, sum(g::numeric), count(*), max(g::text) from
  (select g%1000 as g1000, g%100 as g100, g%10 as g10, g
   from generate_series(0,199999) g) s
group by cube (g1000,g100,g10);

explain (analyze, costs off, timing off)
select g1000, g100, g10
from gs_hash_1 group by cube (g1000,g100,g10);

                          QUERY PLAN                          
--------------------------------------------------------------
 MixedAggregate (actual rows=9648 loops=1)
   Hash Key: g10
   Hash Key: g10, g1000
   Hash Key: g100
   Hash Key: g100, g10
   Group Key: g1000, g100, g10
   Group Key: g1000, g100
   Group Key: g1000
   Group Key: ()
   Peak Memory Usage: 233 kB
   Disk Usage: 1600 kB
   HashAgg Batches: 2333
   ->  Sort (actual rows=4211 loops=1)
         Sort Key: g1000, g100, g10
         Sort Method: external merge  Disk: 384kB
         ->  Seq Scan on gs_hash_1 (actual rows=4211 loops=1)

Anyway, when I throw in the stats trick that is used in join_hash.sql:

alter table gs_hash_1 set (autovacuum_enabled = 'false');
update pg_class set reltuples = 10 where relname = 'gs_hash_1';

I get a MixedAgg plan that doesn't have any Sort below and uses much
more disk.

                        QUERY PLAN                        
----------------------------------------------------------
 MixedAggregate (actual rows=4211 loops=1)
   Hash Key: g1000, g100, g10
   Hash Key: g1000, g100
   Hash Key: g1000
   Hash Key: g100, g10
   Hash Key: g100
   Hash Key: g10, g1000
   Hash Key: g10
   Group Key: ()
   Peak Memory Usage: 405 kB
   Disk Usage: 59712 kB
   HashAgg Batches: 4209
   ->  Seq Scan on gs_hash_1 (actual rows=200000 loops=1)

I'm not sure if this is more what you were looking for--or maybe I am
misunderstanding the guc.
 
--
Melanie Plageman

Re: Default setting for enable_hashagg_disk

Justin Pryzby
On Tue, Jun 09, 2020 at 06:20:13PM -0700, Melanie Plageman wrote:

> On Thu, Apr 9, 2020 at 1:02 PM Jeff Davis <[hidden email]> wrote:
>
> > 2. enable_groupingsets_hash_disk (default false):
> >
> > This is about how we choose which grouping sets to hash and which to
> > sort when generating mixed mode paths.
> >
> > Even before this patch, there are quite a few paths that could be
> > generated. It tries to estimate the size of each grouping set's hash
> > table, and then see how many it can fit in work_mem (knapsack), while
> > also taking advantage of any path keys, etc.
> >
> > With Disk-based Hash Aggregation, in principle we can generate paths
> > representing any combination of hashing and sorting for the grouping
> > sets. But that would be overkill (and grow to a huge number of paths if
> > we have more than a handful of grouping sets). So I think the existing
> > planner logic for grouping sets is fine for now. We might come up with
> > a better approach later.
> >
> > But that created a testing problem, because if the planner estimates
> > correctly, no hashed grouping sets will spill, and the spilling code
> > won't be exercised. This GUC makes the planner disregard which grouping
> > sets' hash tables will fit, making it much easier to exercise the
> > spilling code. Is there a better way I should be testing this code
> > path?
>
> So, I was catching up on email and noticed the last email in this
> thread.
>
> I think I am not fully understanding what enable_groupingsets_hash_disk
> does. Is it only for testing?

If so, it should be in category: "Developer Options".

> Using the tests you added to src/test/regress/sql/groupingsets.sql, I
> did get a plan that looks like hashagg is spilling to disk (goes through
> hashagg_spill_tuple() code path and has number of batches reported in
> Explain) in a MixedAgg plan for a grouping sets query even with
> enable_groupingsets_hash_disk set to false.

> I'm not sure if this is more what you were looking for--or maybe I am
> misunderstanding the guc.

The behavior of the GUC is inconsistent with the other GUCs, which is
confusing.  See also Robert's comments in this thread.
https://www.postgresql.org/message-id/20200407223900.GT2228%40telsasoft.com

The old (pre-13) behavior was:
 - work_mem is the amount of RAM to which each query node tries to constrain
   itself, and the planner will reject a plan if it's expected to exceed that.
   ...But a chosen plan might exceed work_mem anyway.

The new behavior in v13 seems to be:
 - HashAgg now respects work_mem, but instead enable*hash_disk are
   opportunistic.  A node which is *expected* to spill to disk will be
   rejected.
   ...But at execution time, a node which exceeds work_mem will be spilled.

If someone sees a plan which spills to disk and wants to improve performance by
avoiding spilling, they might SET enable_hashagg_disk=off, which might do what
they want (if the plan is rejected at plan time), or it might not, which I
think will be a surprise every time.

If someone agrees, I suggest to add this as an Opened Item.

Maybe some combination of these would be an improvement:

 - change documentation to emphasize behavior;
 - change EXPLAIN output to make it obvious this isn't misbehaving;
 - rename the GUC to not start with enable_* (work_mem_exceed?)
 - rename the GUC *values* to something other than on/off. On/Planner?
 - change the GUC to behave like it sounds like it should, which means "off"
   would allow the pre-13 behavior of exceeding work_mem.
 - Maybe make it ternary, like:
   exceed_work_mem: {spill_disk, planner_reject, allow}

--
Justin



Re: Default setting for enable_hashagg_disk

Melanie Plageman

On Tue, Jun 9, 2020 at 7:15 PM Justin Pryzby <[hidden email]> wrote:
On Tue, Jun 09, 2020 at 06:20:13PM -0700, Melanie Plageman wrote:
> On Thu, Apr 9, 2020 at 1:02 PM Jeff Davis <[hidden email]> wrote:
>
> > 2. enable_groupingsets_hash_disk (default false):
> >
> > This is about how we choose which grouping sets to hash and which to
> > sort when generating mixed mode paths.
> >
> > Even before this patch, there are quite a few paths that could be
> > generated. It tries to estimate the size of each grouping set's hash
> > table, and then see how many it can fit in work_mem (knapsack), while
> > also taking advantage of any path keys, etc.
> >
> > With Disk-based Hash Aggregation, in principle we can generate paths
> > representing any combination of hashing and sorting for the grouping
> > sets. But that would be overkill (and grow to a huge number of paths if
> > we have more than a handful of grouping sets). So I think the existing
> > planner logic for grouping sets is fine for now. We might come up with
> > a better approach later.
> >
> > But that created a testing problem, because if the planner estimates
> > correctly, no hashed grouping sets will spill, and the spilling code
> > won't be exercised. This GUC makes the planner disregard which grouping
> > sets' hash tables will fit, making it much easier to exercise the
> > spilling code. Is there a better way I should be testing this code
> > path?
>
> So, I was catching up on email and noticed the last email in this
> thread.
>
> I think I am not fully understanding what enable_groupingsets_hash_disk
> does. Is it only for testing?

If so, it should be in category: "Developer Options".

> Using the tests you added to src/test/regress/sql/groupingsets.sql, I
> did get a plan that looks like hashagg is spilling to disk (goes through
> hashagg_spill_tuple() code path and has number of batches reported in
> Explain) in a MixedAgg plan for a grouping sets query even with
> enable_groupingsets_hash_disk set to false.

> I'm not sure if this is more what you were looking for--or maybe I am
> misunderstanding the guc.

The behavior of the GUC is inconsistent with the other GUCs, which is
confusing.  See also Robert's comments in this thread.
https://www.postgresql.org/message-id/20200407223900.GT2228%40telsasoft.com

The old (pre-13) behavior was:
 - work_mem is the amount of RAM to which each query node tries to constrain
   itself, and the planner will reject a plan if it's expected to exceed that.
   ...But a chosen plan might exceed work_mem anyway.

The new behavior in v13 seems to be:
 - HashAgg now respects work_mem, but instead enable*hash_disk are
   opportunistic.  A node which is *expected* to spill to disk will be
   rejected.
   ...But at execution time, a node which exceeds work_mem will be spilled.

If someone sees a plan which spills to disk and wants to improve performance by
avoiding spilling, they might SET enable_hashagg_disk=off, which might do what
they want (if the plan is rejected at plan time), or it might not, which I
think will be a surprise every time.


But I thought that the enable_groupingsets_hash_disk GUC allows us to
test the following scenario:

The following is true:
- planner thinks grouping sets' hashtables table would fit in memory
  (spilling is *not* expected)
- user is okay with spilling
- some grouping keys happen to be sortable and some hashable

The following happens:
- Planner generates some HashAgg grouping sets paths
- A MixedAgg plan is created
- During execution of the MixedAgg plan, one or more grouping sets'
  hashtables would exceed work_mem, so the executor spills those tuples
  to disk instead of exceeding work_mem

Especially given the code and comment:
    /*
    * If we have sortable columns to work with (gd->rollups is non-empty)
    * and enable_groupingsets_hash_disk is disabled, don't generate
    * hash-based paths that will exceed work_mem.
    */
    if (!enable_groupingsets_hash_disk &&
            hashsize > work_mem * 1024L && gd->rollups)
            return; /* nope, won't fit */

If this is the scenario that the GUC is designed to test, it seems like
you could exercise it without the enable_groupingsets_hash_disk GUC by
lying about the stats, no?

--
Melanie Plageman

Re: Default setting for enable_hashagg_disk

Jeff Davis-8
In reply to this post by Melanie Plageman
On Tue, 2020-06-09 at 18:20 -0700, Melanie Plageman wrote:
> So, I was catching up on email and noticed the last email in this
> thread.
>
> I think I am not fully understanding what
> enable_groupingsets_hash_disk
> does. Is it only for testing?

It's mostly for testing. I could imagine cases where it would be useful
to force groupingsets to use the disk, but I mainly wanted the setting
there for testing the grouping sets hash disk code path.

> Using the tests you added to src/test/regress/sql/groupingsets.sql, I
> did get a plan that looks like hashagg is spilling to disk (goes
> through

I had something that worked as a test for a while, but then when I
tweaked the costing, it started using the Sort path (therefore not
testing my grouping sets hash disk code at all) and a bug crept in. So
I thought it would be best to have a more forceful knob.

Perhaps I should just get rid of that GUC and use the stats trick?

Regards,
        Jeff Davis





Re: Default setting for enable_hashagg_disk

Jeff Davis-8
In reply to this post by Justin Pryzby
On Tue, 2020-06-09 at 21:15 -0500, Justin Pryzby wrote:
> The behavior of the GUC is inconsistent with the other GUCs, which is
> confusing.  See also Robert's comments in this thread.
>
https://www.postgresql.org/message-id/20200407223900.GT2228%40telsasoft.com

enable_* GUCs are planner GUCs, so it would be confusing to me if they
affected execution-time behavior.

I think the point of confusion is that it's not enabling/disabling an
entire execution node; it only "disables" HashAgg if it thinks it will
spill. I agree that is a difference with the other GUCs, and could
cause confusion.

Stepping back, I was trying to solve two problems with these GUCs:

1. Testing the spilling of hashed grouping sets: I'm inclined to just
get rid of enable_groupingsets_hash_disk and use Melanie's stats-
hacking approach instead.

2. Trying to provide an escape hatch for someone who experiences a
performance regression and wants something like the old behavior back.
There are two aspects of the old behavior that a user could potentially
want back:
  a. Don't choose HashAgg if it's expected to have more groups than fit
into a work_mem-sized hashtable.
  b. If executing HashAgg, and the hash table exceeds work_mem, just
keep going.

The behavior in v13 master is, by default, analogous to Sort or
anything else that adapts at runtime to spill. If we had spillable
HashAgg the whole time, we wouldn't be worried about #2 at all. But,
out of conservatism, I am trying to accommodate users who want an
escape hatch, at least for a release or two until users feel more
comfortable with disk-based HashAgg.

Setting enable_hashagg_disk=false implements 2(a). This name apparently
causes confusion, but it's hard to come up with a better one because
the v12 behavior has nuance that's hard to express succinctly. I don't
think the names you suggested quite fit, but the idea to use a more
interesting GUC value might help express the behavior. Perhaps making
enable_hashagg a ternary "enable_hashagg=on|off|avoid_disk"? The word
"reject" is too definite for the planner, which is working with
imperfect information.

In master, there is no explicit way to get 2(b), but you can just set
work_mem higher in a lot of cases. If enough people want 2(b), I can
add it easily. Perhaps hashagg_overflow=on|off, which would control
execution time behavior?

Regards,
        Jeff Davis





Re: Default setting for enable_hashagg_disk

Jeff Davis-8
In reply to this post by Jeff Davis-8
On Tue, 2020-04-07 at 11:20 -0700, Jeff Davis wrote:

> Now that we have Disk-based Hash Aggregation, there are a lot more
> situations where the planner can choose HashAgg. The
> enable_hashagg_disk GUC, if set to true, chooses HashAgg based on
> costing. If false, it only generates a HashAgg path if it thinks it
> will fit in work_mem, similar to the old behavior (though it will now
> spill to disk if the planner was wrong about it fitting in work_mem).
> The current default is true.
>
> I expect this to be a win in a lot of cases, obviously. But as with
> any
> planner change, it will be wrong sometimes. We may want to be
> conservative and set the default to false depending on the experience
> during beta. I'm inclined to leave it as true for now though, because
> that will give us better information upon which to base any decision.

A compromise may be to multiply the disk costs for HashAgg by, e.g. a
1.5 - 2X penalty. That would make the plan changes less abrupt, and may
mitigate some of the concerns about I/O patterns that Tomas raised
here:


https://www.postgresql.org/message-id/20200519151202.u2p2gpiawoaznsv2@development

The issues were improved a lot, but it will take us a while to really
tune the IO behavior as well as Sort.

Regards,
        Jeff Davis





Re: Default setting for enable_hashagg_disk

Melanie Plageman
In reply to this post by Jeff Davis-8


On Wed, Jun 10, 2020 at 10:39 AM Jeff Davis <[hidden email]> wrote:
On Tue, 2020-06-09 at 18:20 -0700, Melanie Plageman wrote:
> So, I was catching up on email and noticed the last email in this
> thread.
>
> I think I am not fully understanding what
> enable_groupingsets_hash_disk
> does. Is it only for testing?

It's mostly for testing. I could imagine cases where it would be useful
to force groupingsets to use the disk, but I mainly wanted the setting
there for testing the grouping sets hash disk code path.

> Using the tests you added to src/test/regress/sql/groupingsets.sql, I
> did get a plan that looks like hashagg is spilling to disk (goes
> through

I had something that worked as a test for a while, but then when I
tweaked the costing, it started using the Sort path (therefore not
testing my grouping sets hash disk code at all) and a bug crept in. So
I thought it would be best to have a more forceful knob.

Perhaps I should just get rid of that GUC and use the stats trick?


I like the idea of doing the stats trick. For extra security, you could
throw in that other trick that is used in groupingsets.sql and make some
of the grouping columns unhashable and some unsortable so you know that
you will not pick only the Sort Path and do just a GroupAgg.

This slight modification of my previous example will probably yield
consistent results:

set enable_hashagg_disk = true;
SET enable_groupingsets_hash_disk = false;
SET work_mem='64kB';
SET enable_hashagg = true;
drop table if exists gs_hash_1;
create table gs_hash_1 as
  select g%1000 as g1000, g%100 as g100, g%10 as g10, g,
          g::text::xid as g_unsortable, g::bit(4) as g_unhashable
   from generate_series(0,199999) g;
analyze gs_hash_1;

alter table gs_hash_1 set (autovacuum_enabled = 'false');
update pg_class set reltuples = 10 where relname = 'gs_hash_1';

explain (analyze, costs off, timing off)
select g1000, g100, g10
  from gs_hash_1
  group by grouping sets ((g1000,g100), (g10, g_unhashable), (g100, g_unsortable));

                           QUERY PLAN                          
----------------------------------------------------------------
 MixedAggregate (actual rows=201080 loops=1)
   Hash Key: g100, g_unsortable
   Group Key: g1000, g100
   Sort Key: g10, g_unhashable
     Group Key: g10, g_unhashable
   Peak Memory Usage: 109 kB
   Disk Usage: 13504 kB
   HashAgg Batches: 10111
   ->  Sort (actual rows=200000 loops=1)
         Sort Key: g1000, g100
         Sort Method: external merge  Disk: 9856kB
         ->  Seq Scan on gs_hash_1 (actual rows=200000 loops=1)

While we are on the topic of the tests, I was wondering if you had
considered making a user defined type that had a lot of padding so that
the tests could use fewer rows. I did this for adaptive hashjoin and it
helped me with iteration time.
I don't know if that would still be the kind of test you are looking for,
since a user probably wouldn't have a couple hundred really fat
untoasted tuples, but I thought I would check whether that would be
useful.
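
For what it's worth, a minimal sketch of that idea (hypothetical table and
column names; a plain wide row rather than a full user-defined C type) might
look like:

```sql
-- Hypothetical sketch: pad each row so far fewer rows are needed to exceed
-- work_mem. Fixed-width columns avoid TOAST compression shrinking the padding.
create table gs_hash_fat as
  select g % 10 as g10,
         g,
         md5(g::text)::uuid as pad1,      -- 16 bytes each, not compressible away
         md5((g + 1)::text)::uuid as pad2,
         md5((g + 2)::text)::uuid as pad3
    from generate_series(0, 999) g;
analyze gs_hash_fat;
```

With rows this wide, an aggregate should overflow a 64kB work_mem after a few
hundred input rows instead of a couple hundred thousand.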

--
Melanie Plageman

Re: Default setting for enable_hashagg_disk

Jeff Davis-8
In reply to this post by Jeff Davis-8
On Wed, 2020-06-10 at 11:39 -0700, Jeff Davis wrote:
> 1. Testing the spilling of hashed grouping sets: I'm inclined to just
> get rid of enable_groupingsets_hash_disk and use Melanie's stats-
> hacking approach instead.

Fixed in 92c58fd9.

> think the names you suggested quite fit, but the idea to use a more
> interesting GUC value might help express the behavior. Perhaps making
> enable_hashagg a ternary "enable_hashagg=on|off|avoid_disk"? The word
> "reject" is too definite for the planner, which is working with
> imperfect information.

I renamed enable_hashagg_disk to hashagg_avoid_disk_plan, which I think
satisfies the concerns raised here. Also in 92c58fd9.
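
For reference, the renamed settings after 92c58fd9 can be exercised per
session like this (a sketch; check the current defaults in your build):

```sql
-- After 92c58fd9:
SET enable_hashagg = on;           -- allow hashed aggregation paths at all
SET hashagg_avoid_disk_plan = on;  -- only plan HashAgg when it is expected
                                   -- to fit in work_mem (v12-like planning)
RESET hashagg_avoid_disk_plan;     -- back to the default: costing decides,
                                   -- and HashAgg may plan to spill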

There is still the original topic of this thread, which is whether we
need to change the default value of this GUC, or penalize disk-based
HashAgg in some way, to be more conservative about plan changes in v13.
I think we can wait a little longer to make a decision there.

Regards,
        Jeff Davis





Re: Default setting for enable_hashagg_disk

Justin Pryzby
On Thu, Jun 11, 2020 at 01:22:57PM -0700, Jeff Davis wrote:

> On Wed, 2020-06-10 at 11:39 -0700, Jeff Davis wrote:
> > 1. Testing the spilling of hashed grouping sets: I'm inclined to just
> > get rid of enable_groupingsets_hash_disk and use Melanie's stats-
> > hacking approach instead.
>
> Fixed in 92c58fd9.
>
> > think the names you suggested quite fit, but the idea to use a more
> > interesting GUC value might help express the behavior. Perhaps making
> > enable_hashagg a ternary "enable_hashagg=on|off|avoid_disk"? The word
> > "reject" is too definite for the planner, which is working with
> > imperfect information.
>
> I renamed enable_hashagg_disk to hashagg_avoid_disk_plan, which I think
> satisfies the concerns raised here. Also in 92c58fd9.

Thanks for considering :)

I saw you updated the Open Items page, but put the items into "Older Bugs /
Fixed".

I moved them underneath "Resolved" since they're all new in v13.
https://wiki.postgresql.org/index.php?title=PostgreSQL_13_Open_Items&diff=34995&oldid=34994

--
Justin



Re: Default setting for enable_hashagg_disk

Greg Stark
In reply to this post by Justin Pryzby


On Thu, 9 Apr 2020 at 13:24, Justin Pryzby <[hidden email]> wrote:
> On Thu, Apr 09, 2020 at 01:48:55PM +0200, Tomas Vondra wrote:
> > Is it really any different from our enable_* GUCs? Even if you do e.g.
> > enable_sort=off, we may still do a sort. Same for enable_groupagg etc.
>
> Those show that the GUC was disabled by showing disable_cost.  That's what's
> different about this one.

Fwiw, in the past this was seen not so much as a positive thing but as a bug to be fixed. We've talked about carrying a boolean "disabled plan" flag which would be treated as a large cost penalty but not actually added to the cost recorded in the plan.

The problems with the disable_cost in the cost are (at least):

1) It makes the resulting costs useless for comparison with the costs of other plans.

2) It can cause other planning decisions to be distorted in strange non-linear ways.
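
(As a concrete illustration of point 1, assuming a toy table with no
alternative access path:)

```sql
-- Sketch: disable_cost (currently 1.0e10) is added into the path's cost
-- rather than tracked as a separate "disabled" flag.
create table t as select generate_series(1, 1000) as a;
analyze t;
set enable_seqscan = off;
explain select * from t;
-- A Seq Scan is still chosen (there is no other way to scan t), but its
-- reported cost is inflated by disable_cost, so it can no longer be
-- compared meaningfully with the costs of other plans.
```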

 
--
greg

Re: Default setting for enable_hashagg_disk

Justin Pryzby
In reply to this post by Jeff Davis-8
On Thu, Jun 11, 2020 at 01:22:57PM -0700, Jeff Davis wrote:
> > think the names you suggested quite fit, but the idea to use a more
> > interesting GUC value might help express the behavior. Perhaps making
> > enable_hashagg a ternary "enable_hashagg=on|off|avoid_disk"? The word
> > "reject" is too definite for the planner, which is working with
> > imperfect information.
>
> I renamed enable_hashagg_disk to hashagg_avoid_disk_plan, which I think
> satisfies the concerns raised here. Also in 92c58fd9.

I think this should be re-arranged to be in alphabetical order
https://www.postgresql.org/docs/devel/runtime-config-query.html

--
Justin



Re: Default setting for enable_hashagg_disk

Robert Haas
In reply to this post by Jeff Davis-8
On Wed, Jun 10, 2020 at 2:39 PM Jeff Davis <[hidden email]> wrote:

> The behavior in v13 master is, by default, analogous to Sort or
> anything else that adapts at runtime to spill. If we had spillable
> HashAgg the whole time, we wouldn't be worried about #2 at all. But,
> out of conservatism, I am trying to accommodate users who want an
> escape hatch, at least for a release or two until users feel more
> comfortable with disk-based HashAgg.
>
> Setting enable_hash_disk=false implements 2(a). This name apparently
> causes confusion, but it's hard to come up with a better one because
> the v12 behavior has nuance that's hard to express succinctly. I don't
> think the names you suggested quite fit, but the idea to use a more
> interesting GUC value might help express the behavior. Perhaps making
> enable_hashagg a ternary "enable_hashagg=on|off|avoid_disk"? The word
> "reject" is too definite for the planner, which is working with
> imperfect information.
>
> In master, there is no explicit way to get 2(b), but you can just set
> work_mem higher in a lot of cases. If enough people want 2(b), I can
> add it easily. Perhaps hashagg_overflow=on|off, which would control
> execution time behavior?

Planner GUCs are a pretty blunt instrument for solving problems that
users may have with planner features. There's no guarantee that the
experience a user has with one query will be the same as the
experience they have with another query, or even that you couldn't
have a single query which contains two different nodes where the
optimal behavior is different for one than it is for the other. In the
first case, changing the value of the GUC on a per-query basis is
pretty painful; in the second case, even that is not good enough. So,
as Tom has said before, the only really good choice in a case like
this is for the planner to figure out the right things automatically;
anything that boils down to a user-provided hint pretty well sucks.

So I feel like the really important thing here is to fix the cases
that don't come out well with default settings. If we can't do that,
then the feature is half-baked and maybe should not have been
committed in the first place. If we can, then we don't really need the
GUC, let alone multiple GUCs. I understand that some of the reason you
added these was out of paranoia, and I get that: it's hard to be sure
that any feature of this complexity isn't going to have some rough
patches, especially given how defective work_mem is as a model in
general. Still, we don't want to end up with 50 planner GUCs enabling
and disabling individual bits of various planner nodes, or at least I
don't think we do, so I'm very skeptical of the idea that we need 2
just for this feature. That doesn't feel scalable. I think the right
number is 0 or 1, and if it's 1, very few people should be changing
the default. If anything else is the case, then IMHO the feature isn't
ready to ship.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Default setting for enable_hashagg_disk

Justin Pryzby
On Mon, Jun 22, 2020 at 10:52:37AM -0400, Robert Haas wrote:

> On Wed, Jun 10, 2020 at 2:39 PM Jeff Davis <[hidden email]> wrote:
> > The behavior in v13 master is, by default, analogous to Sort or
> > anything else that adapts at runtime to spill. If we had spillable
> > HashAgg the whole time, we wouldn't be worried about #2 at all. But,
> > out of conservatism, I am trying to accommodate users who want an
> > escape hatch, at least for a release or two until users feel more
> > comfortable with disk-based HashAgg.
> >
> > Setting enable_hash_disk=false implements 2(a). This name apparently
> > causes confusion, but it's hard to come up with a better one because
> > the v12 behavior has nuance that's hard to express succinctly. I don't
> > think the names you suggested quite fit, but the idea to use a more
> > interesting GUC value might help express the behavior. Perhaps making
> > enable_hashagg a ternary "enable_hashagg=on|off|avoid_disk"? The word
> > "reject" is too definite for the planner, which is working with
> > imperfect information.
> >
> > In master, there is no explicit way to get 2(b), but you can just set
> > work_mem higher in a lot of cases. If enough people want 2(b), I can
> > add it easily. Perhaps hashagg_overflow=on|off, which would control
> > execution time behavior?
>
> don't think we do, so I'm very skeptical of the idea that we need 2
> just for this feature. That doesn't feel scalable. I think the right
> number is 0 or 1, and if it's 1, very few people should be changing
> the default. If anything else is the case, then IMHO the feature isn't
> ready to ship.

This was addressed in 92c58fd94801dd5c81ee20e26c5bb71ad64552a8
https://wiki.postgresql.org/index.php?title=PostgreSQL_13_Open_Items&diff=34994&oldid=34993

--
Justin

