Partition-wise aggregation/grouping

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
85 messages Options
12345
Reply | Threaded
Open this post in threaded view
|

Partition-wise aggregation/grouping

Jeevan Chalke
Hi all,

Declarative partitioning is supported in PostgreSQL 10 and work is already in
progress to support partition-wise joins. Here is a proposal for partition-wise
aggregation/grouping.  Our initial performance measurement has shown 7 times
performance when partitions are on foreign servers and approximately 15% when
partitions are local.

Partition-wise aggregation/grouping computes aggregates for each partition
separately.  If the group clause contains the partition key, all the rows
belonging to a given group come from one partition, thus allowing aggregates
to be computed completely for each partition.  Otherwise, partial aggregates
computed for each partition are combined across the partitions to produce the
final aggregates. This technique improves performance because:
i. When partitions are located on foreign server, we can push down the
aggregate to the foreign server.
ii. If hash table for each partition fits in memory, but that for the whole
relation does not, each partition-wise aggregate can use an in-memory hash
table.
iii. Aggregation at the level of partitions can exploit properties of
partitions like indexes, their storage etc.

Attached an experimental patch for the same based on the partition-wise join
patches posted in [1].

This patch currently implements partition-wise aggregation when group clause
contains the partitioning key.  A query below, involving a partitioned table
with 3 partitions containing 1M rows each, producing total 30 groups showed
15% improvement over non-partition-wise aggregation. Same query showed 7 times
improvement when the partitions were located on the foreign servers.

Here is the sample plan:

postgres=# set enable_partition_wise_agg to true;
SET
postgres=# EXPLAIN ANALYZE SELECT a, count(*) FROM plt1 GROUP BY a;
                                                  QUERY PLAN                                                 
--------------------------------------------------------------------------------------------------------------
 Append  (cost=5100.00..61518.90 rows=30 width=12) (actual time=324.837..944.804 rows=30 loops=1)
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual time=324.837..324.838 rows=10 loops=1)
         Relations: Aggregate on (public.fplt1_p1 plt1)
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual time=309.954..309.956 rows=10 loops=1)
         Relations: Aggregate on (public.fplt1_p2 plt1)
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual time=310.002..310.004 rows=10 loops=1)
         Relations: Aggregate on (public.fplt1_p3 plt1)
 Planning time: 0.370 ms
 Execution time: 945.384 ms
(9 rows)

postgres=# set enable_partition_wise_agg to false;
SET
postgres=# EXPLAIN ANALYZE SELECT a, count(*) FROM plt1 GROUP BY a;
                                                              QUERY PLAN                                                              
---------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=121518.01..121518.31 rows=30 width=12) (actual time=6498.452..6498.459 rows=30 loops=1)
   Group Key: plt1.a
   ->  Append  (cost=0.00..106518.00 rows=3000001 width=4) (actual time=0.595..5769.592 rows=3000000 loops=1)
         ->  Seq Scan on plt1  (cost=0.00..0.00 rows=1 width=4) (actual time=0.007..0.007 rows=0 loops=1)
         ->  Foreign Scan on fplt1_p1  (cost=100.00..35506.00 rows=1000000 width=4) (actual time=0.587..1844.506 rows=1000000 loops=1)
         ->  Foreign Scan on fplt1_p2  (cost=100.00..35506.00 rows=1000000 width=4) (actual time=0.384..1839.633 rows=1000000 loops=1)
         ->  Foreign Scan on fplt1_p3  (cost=100.00..35506.00 rows=1000000 width=4) (actual time=0.402..1876.505 rows=1000000 loops=1)
 Planning time: 0.251 ms
 Execution time: 6499.018 ms
(9 rows)

Patch needs a lot of improvement including:
1. Support for partial partition-wise aggregation
2. Estimating number of groups for every partition
3. Estimating cost of partition-wise aggregation based on sample partitions
similar to partition-wise join
and much more.

In order to support partial aggregation on foreign partitions, we need support
to fetch partially aggregated results from the foreign server. That can be
handled as a separate follow-on patch.

Though is lot of work to be done, I would like to get suggestions/opinions from
hackers.

I would like to thank Ashutosh Bapat for providing a draft patch and helping
me off-list on this feature while he is busy working on partition-wise join
feature.

[1] https://www.postgresql.org/message-id/CAFjFpRcbY2QN3cfeMTzVEoyF5Lfku-ijyNR%3DPbXj1e%3D9a%3DqMoQ%40mail.gmail.com

Thanks

--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pg_partwise_agg_WIP.patch (62K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Antonin Houska-2
Jeevan Chalke <[hidden email]> wrote:

> Declarative partitioning is supported in PostgreSQL 10 and work is already in
> progress to support partition-wise joins. Here is a proposal for partition-wise
> aggregation/grouping. Our initial performance measurement has shown 7 times
> performance when partitions are on foreign servers and approximately 15% when
> partitions are local.
>
> Partition-wise aggregation/grouping computes aggregates for each partition
> separately. If the group clause contains the partition key, all the rows
> belonging to a given group come from one partition, thus allowing aggregates
> to be computed completely for each partition. Otherwise, partial aggregates
> computed for each partition are combined across the partitions to produce the
> final aggregates. This technique improves performance because:

> i. When partitions are located on foreign server, we can push down the
> aggregate to the foreign server.

> ii. If hash table for each partition fits in memory, but that for the whole
> relation does not, each partition-wise aggregate can use an in-memory hash
> table.

> iii. Aggregation at the level of partitions can exploit properties of
> partitions like indexes, their storage etc.

I suspect this overlaps with

https://www.postgresql.org/message-id/29111.1483984605%40localhost

I'm working on the next version of the patch, which will be able to aggregate
the result of both base relation scans and joins. I'm trying hard to make the
next version available before an urgent vacation that I'll have to take at
random date between today and early April. I suggest that we coordinate the
effort, it's lot of work in any case.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Jeevan Chalke


On Tue, Mar 21, 2017 at 1:47 PM, Antonin Houska <[hidden email]> wrote:
Jeevan Chalke <[hidden email]> wrote:

> Declarative partitioning is supported in PostgreSQL 10 and work is already in
> progress to support partition-wise joins. Here is a proposal for partition-wise
> aggregation/grouping. Our initial performance measurement has shown 7 times
> performance when partitions are on foreign servers and approximately 15% when
> partitions are local.
>
> Partition-wise aggregation/grouping computes aggregates for each partition
> separately. If the group clause contains the partition key, all the rows
> belonging to a given group come from one partition, thus allowing aggregates
> to be computed completely for each partition. Otherwise, partial aggregates
> computed for each partition are combined across the partitions to produce the
> final aggregates. This technique improves performance because:

> i. When partitions are located on foreign server, we can push down the
> aggregate to the foreign server.

> ii. If hash table for each partition fits in memory, but that for the whole
> relation does not, each partition-wise aggregate can use an in-memory hash
> table.

> iii. Aggregation at the level of partitions can exploit properties of
> partitions like indexes, their storage etc.

I suspect this overlaps with

https://www.postgresql.org/message-id/29111.1483984605%40localhost

I'm working on the next version of the patch, which will be able to aggregate
the result of both base relation scans and joins. I'm trying hard to make the
next version available before an urgent vacation that I'll have to take at
random date between today and early April. I suggest that we coordinate the
effort, it's lot of work in any case.

IIUC, it seems that you are trying to push down the aggregation into the
joining relations. So basically you are converting
Agg -> Join -> {scan1, scan2} into
FinalAgg -> Join -> {PartialAgg -> scan1, PartialAgg -> scan2}.
In addition to that your patch pushes aggregates on base rel to its children,
if any.

Where as what I propose here is pushing down aggregation below the append
node keeping join/scan as is. So basically I am converting
Agg -> Append-> Join -> {scan1, scan2} into
Append -> Agg -> Join -> {scan1, scan2}.
This will require partition-wise join as posted in [1].
But I am planning to make this work for partitioned relations and not for
generic inheritance.

I treat these two as separate strategies/paths to be consider while planning.

Our work will overlap when we are pushing down the aggregate on partitioned
base relation to its children/partitions.

I think you should continue working on pushing down aggregate onto the
joins/scans where as I will continue my work on pushing down aggregates to
partitions (joins as well as single table). Once we are done with these task,
then we may need to find a way to integrate them.

[1] https://www.postgresql.org/message-id/flat/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@...#CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@...

 

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Antonin Houska-2
The promised new version of my patch is here:

https://www.postgresql.org/message-id/9666.1491295317%40localhost

Jeevan Chalke <[hidden email]> wrote:

> On Tue, Mar 21, 2017 at 1:47 PM, Antonin Houska <[hidden email]> wrote:
>
>  Jeevan Chalke <[hidden email]> wrote:
>
> IIUC, it seems that you are trying to push down the aggregation into the
> joining relations. So basically you are converting
> Agg -> Join -> {scan1, scan2} into
> FinalAgg -> Join -> {PartialAgg -> scan1, PartialAgg -> scan2}.
> In addition to that your patch pushes aggregates on base rel to its children,
> if any.
>
> Where as what I propose here is pushing down aggregation below the append
> node keeping join/scan as is. So basically I am converting
> Agg -> Append-> Join -> {scan1, scan2} into
> Append -> Agg -> Join -> {scan1, scan2}.
> This will require partition-wise join as posted in [1].
> But I am planning to make this work for partitioned relations and not for
> generic inheritance.
>
> I treat these two as separate strategies/paths to be consider while planning.
>
> Our work will overlap when we are pushing down the aggregate on partitioned
> base relation to its children/partitions.
>
> I think you should continue working on pushing down aggregate onto the
> joins/scans where as I will continue my work on pushing down aggregates to
> partitions (joins as well as single table). Once we are done with these task,
> then we may need to find a way to integrate them.
>
> [1] https://www.postgresql.org/message-id/flat/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@...#CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@...

My patch does also create (partial) aggregation paths below the Append node,
but only expects SeqScan as input. Please check if you patch can be based on
this or if there's any conflict.

(I'll probably be unable to respond before Monday 04/17.)

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Antonin Houska-2
Antonin Houska <[hidden email]> wrote:

>
> Jeevan Chalke <[hidden email]> wrote:
>
> > Our work will overlap when we are pushing down the aggregate on partitioned
> > base relation to its children/partitions.
> >
> > I think you should continue working on pushing down aggregate onto the
> > joins/scans where as I will continue my work on pushing down aggregates to
> > partitions (joins as well as single table). Once we are done with these task,
> > then we may need to find a way to integrate them.
> >
> > [1] https://www.postgresql.org/message-id/flat/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@...#CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@...
>
> My patch does also create (partial) aggregation paths below the Append node,
> but only expects SeqScan as input. Please check if you patch can be based on
> this or if there's any conflict.

Well, I haven't imposed any explicit restriction on the kind of path to be
aggregated below the Append path. Maybe the only thing to do is to merge my
patch with the "partition-wise join" patch (which I haven't checked yet).

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Antonin Houska-2
Antonin Houska <[hidden email]> wrote:

> Antonin Houska <[hidden email]> wrote:
> >
> > Jeevan Chalke <[hidden email]> wrote:
> >
> > > Our work will overlap when we are pushing down the aggregate on partitioned
> > > base relation to its children/partitions.
> > >
> > > I think you should continue working on pushing down aggregate onto the
> > > joins/scans where as I will continue my work on pushing down aggregates to
> > > partitions (joins as well as single table). Once we are done with these task,
> > > then we may need to find a way to integrate them.
> > >
> > > [1] https://www.postgresql.org/message-id/flat/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@...#CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@...
> >
> > My patch does also create (partial) aggregation paths below the Append node,
> > but only expects SeqScan as input. Please check if you patch can be based on
> > this or if there's any conflict.
>
> Well, I haven't imposed any explicit restriction on the kind of path to be
> aggregated below the Append path. Maybe the only thing to do is to merge my
> patch with the "partition-wise join" patch (which I haven't checked yet).
Attached is a diff that contains both patches merged. This is just to prove my
assumption, details to be elaborated later. The scripts attached produce the
following plan in my environment:

                   QUERY PLAN                  
------------------------------------------------
 Parallel Finalize HashAggregate
   Group Key: b_1.j
   ->  Append
         ->  Parallel Partial HashAggregate
               Group Key: b_1.j
               ->  Hash Join
                     Hash Cond: (b_1.j = c_1.k)
                     ->  Seq Scan on b_1
                     ->  Hash
                           ->  Seq Scan on c_1
         ->  Parallel Partial HashAggregate
               Group Key: b_2.j
               ->  Hash Join
                     Hash Cond: (b_2.j = c_2.k)
                     ->  Seq Scan on b_2
                     ->  Hash
                           ->  Seq Scan on c_2


Note that I had no better idea how to enforce the plan than hard-wiring zero
costs of the partial aggregation paths. This simulates the use case of partial
aggregation performed on remote node (postgres_fdw). Other use cases may
exist, but I only wanted to prove the concept in terms of coding so far.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

agg_pushdown_partition_wise.diff (399K) Download Attachment
test_setup_partition_wise.sql (597 bytes) Download Attachment
query_partition_wise.sql (198 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Robert Haas
On Wed, Apr 26, 2017 at 6:28 AM, Antonin Houska <[hidden email]> wrote:

> Attached is a diff that contains both patches merged. This is just to prove my
> assumption, details to be elaborated later. The scripts attached produce the
> following plan in my environment:
>
>                    QUERY PLAN
> ------------------------------------------------
>  Parallel Finalize HashAggregate
>    Group Key: b_1.j
>    ->  Append
>          ->  Parallel Partial HashAggregate
>                Group Key: b_1.j
>                ->  Hash Join
>                      Hash Cond: (b_1.j = c_1.k)
>                      ->  Seq Scan on b_1
>                      ->  Hash
>                            ->  Seq Scan on c_1
>          ->  Parallel Partial HashAggregate
>                Group Key: b_2.j
>                ->  Hash Join
>                      Hash Cond: (b_2.j = c_2.k)
>                      ->  Seq Scan on b_2
>                      ->  Hash
>                            ->  Seq Scan on c_2

Well, I'm confused.  I see that there's a relationship between what
Antonin is trying to do and what Jeevan is trying to do, but I can't
figure out whether one is a subset of the other, whether they're both
orthogonal, or something else.  This plan looks similar to what I
would expect Jeevan's patch to produce, except i have no idea what
"Parallel" would mean in a plan that contains no Gather node.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Antonin Houska-2
Robert Haas <[hidden email]> wrote:

> On Wed, Apr 26, 2017 at 6:28 AM, Antonin Houska <[hidden email]> wrote:
> > Attached is a diff that contains both patches merged. This is just to prove my
> > assumption, details to be elaborated later. The scripts attached produce the
> > following plan in my environment:
> >
> >                    QUERY PLAN
> > ------------------------------------------------
> >  Parallel Finalize HashAggregate
> >    Group Key: b_1.j
> >    ->  Append
> >          ->  Parallel Partial HashAggregate
> >                Group Key: b_1.j
> >                ->  Hash Join
> >                      Hash Cond: (b_1.j = c_1.k)
> >                      ->  Seq Scan on b_1
> >                      ->  Hash
> >                            ->  Seq Scan on c_1
> >          ->  Parallel Partial HashAggregate
> >                Group Key: b_2.j
> >                ->  Hash Join
> >                      Hash Cond: (b_2.j = c_2.k)
> >                      ->  Seq Scan on b_2
> >                      ->  Hash
> >                            ->  Seq Scan on c_2
>
> Well, I'm confused.  I see that there's a relationship between what
> Antonin is trying to do and what Jeevan is trying to do, but I can't
> figure out whether one is a subset of the other, whether they're both
> orthogonal, or something else.  This plan looks similar to what I
> would expect Jeevan's patch to produce,
The point is that the patch Jeevan wanted to work on is actually a subset of
[1] combined with [2].

> except i have no idea what "Parallel" would mean in a plan that contains no
> Gather node.

parallel_aware field was set mistakenly on the AggPath. Fixed patch is
attached below, producing this plan:

                   QUERY PLAN                  
------------------------------------------------
 Finalize HashAggregate
   Group Key: b_1.j
   ->  Append
         ->  Partial HashAggregate
               Group Key: b_1.j
               ->  Hash Join
                     Hash Cond: (b_1.j = c_1.k)
                     ->  Seq Scan on b_1
                     ->  Hash
                           ->  Seq Scan on c_1
         ->  Partial HashAggregate
               Group Key: b_2.j
               ->  Hash Join
                     Hash Cond: (b_2.j = c_2.k)
                     ->  Seq Scan on b_2
                     ->  Hash
                           ->  Seq Scan on c_2

[1] https://www.postgresql.org/message-id/9666.1491295317%40localhost

[2] https://commitfest.postgresql.org/14/994/

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

agg_pushdown_partition_wise_v2.diff (398K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Jeevan Chalke


On Thu, Apr 27, 2017 at 4:53 PM, Antonin Houska <[hidden email]> wrote:
Robert Haas <[hidden email]> wrote:

> On Wed, Apr 26, 2017 at 6:28 AM, Antonin Houska <[hidden email]> wrote:
> > Attached is a diff that contains both patches merged. This is just to prove my
> > assumption, details to be elaborated later. The scripts attached produce the
> > following plan in my environment:
> >
> >                    QUERY PLAN
> > ------------------------------------------------
> >  Parallel Finalize HashAggregate
> >    Group Key: b_1.j
> >    ->  Append
> >          ->  Parallel Partial HashAggregate
> >                Group Key: b_1.j
> >                ->  Hash Join
> >                      Hash Cond: (b_1.j = c_1.k)
> >                      ->  Seq Scan on b_1
> >                      ->  Hash
> >                            ->  Seq Scan on c_1
> >          ->  Parallel Partial HashAggregate
> >                Group Key: b_2.j
> >                ->  Hash Join
> >                      Hash Cond: (b_2.j = c_2.k)
> >                      ->  Seq Scan on b_2
> >                      ->  Hash
> >                            ->  Seq Scan on c_2
>
> Well, I'm confused.  I see that there's a relationship between what
> Antonin is trying to do and what Jeevan is trying to do, but I can't
> figure out whether one is a subset of the other, whether they're both
> orthogonal, or something else.  This plan looks similar to what I
> would expect Jeevan's patch to produce,

The point is that the patch Jeevan wanted to work on is actually a subset of
[1] combined with [2].

Seems like, as you are targeting every relation whether or not it is
partitioned. Where as I am targeting only partitioned relations in my
patch.
 

> except i have no idea what "Parallel" would mean in a plan that contains no
> Gather node.

parallel_aware field was set mistakenly on the AggPath. Fixed patch is
attached below, producing this plan:

                   QUERY PLAN
------------------------------------------------
 Finalize HashAggregate
   Group Key: b_1.j
   ->  Append
         ->  Partial HashAggregate
               Group Key: b_1.j
               ->  Hash Join
                     Hash Cond: (b_1.j = c_1.k)
                     ->  Seq Scan on b_1
                     ->  Hash
                           ->  Seq Scan on c_1
         ->  Partial HashAggregate
               Group Key: b_2.j
               ->  Hash Join
                     Hash Cond: (b_2.j = c_2.k)
                     ->  Seq Scan on b_2
                     ->  Hash
                           ->  Seq Scan on c_2

With my patch, I am getting following plan where we push entire
aggregation below append.

                QUERY PLAN               
------------------------------------------
 Append
   ->  HashAggregate
         Group Key: b_1.j
         ->  Hash Join
               Hash Cond: (b_1.j = c_1.k)
               ->  Seq Scan on b_1
               ->  Hash
                     ->  Seq Scan on c_1
   ->  HashAggregate
         Group Key: b_2.j
         ->  Hash Join
               Hash Cond: (b_2.j = c_2.k)
               ->  Seq Scan on b_2
               ->  Hash
                     ->  Seq Scan on c_2
(15 rows)


Antonin, I have tried applying your patch on master but it doesn't get
apply. Can you please provide the HEAD and any other changes required
to be applied first?

How the plan look like when GROUP BY key does not match with the
partitioning key i.e. GROUP BY b.v ?
 

[1] https://www.postgresql.org/message-id/9666.1491295317%40localhost

[2] https://commitfest.postgresql.org/14/994/

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers




--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 66449694

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Antonin Houska-2
Jeevan Chalke <[hidden email]> wrote:

> On Thu, Apr 27, 2017 at 4:53 PM, Antonin Houska <[hidden email]> wrote:
>
> > Robert Haas <[hidden email]> wrote:
> > > Well, I'm confused. I see that there's a relationship between what
> > > Antonin is trying to do and what Jeevan is trying to do, but I can't
> > > figure out whether one is a subset of the other, whether they're both
> > > orthogonal, or something else. This plan looks similar to what I
> > > would expect Jeevan's patch to produce,

> >  The point is that the patch Jeevan wanted to work on is actually a subset of
> >  [1] combined with [2].

> Seems like, as you are targeting every relation whether or not it is
> partitioned.

Yes.

> With my patch, I am getting following plan where we push entire
> aggregation below append.
>
> QUERY PLAN
> ------------------------------------------
> Append
> -> HashAggregate
> Group Key: b_1.j
> -> Hash Join
> Hash Cond: (b_1.j = c_1.k)
> -> Seq Scan on b_1
> -> Hash
> -> Seq Scan on c_1
> -> HashAggregate
> Group Key: b_2.j
> -> Hash Join
> Hash Cond: (b_2.j = c_2.k)
> -> Seq Scan on b_2
> -> Hash
> -> Seq Scan on c_2
> (15 rows)

I think this is not generic enough because the result of the Append plan can
be joined to another relation. As such a join can duplicate the
already-aggregated values, the aggregates should not be finalized below the
top-level plan.

> Antonin, I have tried applying your patch on master but it doesn't get
> apply. Can you please provide the HEAD and any other changes required
> to be applied first?

I've lost that information. I'll post a new version to the [1] thread asap.

> How the plan look like when GROUP BY key does not match with the
> partitioning key i.e. GROUP BY b.v ?

EXPLAIN (COSTS false)
SELECT          b.v, avg(b.v + c.v)
FROM            b
                JOIN
                c ON b.j = c.k
GROUP BY        b.v;

                   QUERY PLAN                  
------------------------------------------------
 Finalize HashAggregate
   Group Key: b_1.v
   ->  Append
         ->  Partial HashAggregate
               Group Key: b_1.v
               ->  Hash Join
                     Hash Cond: (b_1.j = c_1.k)
                     ->  Seq Scan on b_1
                     ->  Hash
                           ->  Seq Scan on c_1
         ->  Partial HashAggregate
               Group Key: b_2.v
               ->  Hash Join
                     Hash Cond: (b_2.j = c_2.k)
                     ->  Seq Scan on b_2
                     ->  Hash
                           ->  Seq Scan on c_2


> >  [1] https://www.postgresql.org/message-id/9666.1491295317%40localhost
> >
> >  [2] https://commitfest.postgresql.org/14/994/


--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Robert Haas
On Fri, Apr 28, 2017 at 3:03 AM, Antonin Houska <[hidden email]> wrote:
> I think this is not generic enough because the result of the Append plan can
> be joined to another relation. As such a join can duplicate the
> already-aggregated values, the aggregates should not be finalized below the
> top-level plan.

If the grouping key matches the partition key, then it's correct to
push the entire aggregate down, and there's probably a large
performance advantage from avoiding aggregating twice.  If the two
don't match, then pushing the aggregate down necessarily involves a
"partial" and a "finalize" stage, which may or may not be cheaper than
doing the aggregation all at once.  If you have lots of 2-row groups
with 1 row in the first branch of the append and 1 row in the second
branch of the append, breaking the aggregate into two steps is
probably going to be a loser.  If the overall number of groups is
small, it's probably going to win.  But when the grouping key matches
the partition key, so that two-stage aggregation isn't required, I
suspect the pushdown should almost always win.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Jeevan Chalke
In reply to this post by Jeevan Chalke
Hi,

Attached is the patch to implement partition-wise aggregation/grouping.

As explained earlier, we produce a full aggregation for each partition when
partition keys are leading group by clauses and then append is performed.
Else we do a partial aggregation on each partition, append them and then add
finalization step over it.

I have observed that cost estimated for partition-wise aggregation and cost
for the plans without partition-wise aggregation is almost same. However,
execution time shows significant improvement (as explained my in the very
first email) with partition-wise aggregates. Planner chooses a plan according
to the costs, and thus most of the time plan without partition-wise
aggregation is chosen. Hence, to force partition-wise plans and for the
regression runs, I have added a GUC named partition_wise_agg_cost_factor to
adjust the costings.

This feature is only used when enable_partition_wise_agg GUC is set to on.

Here are the details of the patches in the patch-set:

0001 - Refactors sort and hash final grouping paths into separate functions.
Since partition-wise aggregation too builds paths same as that of
create_grouping_paths(), separated path creation for sort and hash agg into
separate functions. These functions later used by main partition-wise
aggregation/grouping patch.

0002 - Passes targetlist to get_number_of_groups().
We need to estimate groups for individual child relations and thus need to
pass targetlist corresponding to the child rel.

0003 - Adds enable_partition_wise_agg and partition_wise_agg_cost_factor GUCs.

0004 - Implements partition-wise aggregation.

0005 - Adds test-cases.

0006 - postgres_fdw changes which enable pushing aggregation for other upper
relations.


Since this patch is highly dependent on partition-wise join [1], one needs to
apply all those patches on HEAD (my repository head was at:
66ed3829df959adb47f71d7c903ac59f0670f3e1) before applying these patches in
order.

Suggestions / feedback / inputs ?

[1] https://www.postgresql.org/message-id/CAFjFpRd9Vqh_=-Ldv-XqWY006d07TJ+VXuhXCbdj=P1jukYBrw@...


On Tue, Mar 21, 2017 at 12:47 PM, Jeevan Chalke <[hidden email]> wrote:
Hi all,

Declarative partitioning is supported in PostgreSQL 10 and work is already in
progress to support partition-wise joins. Here is a proposal for partition-wise
aggregation/grouping.  Our initial performance measurement has shown 7 times
performance when partitions are on foreign servers and approximately 15% when
partitions are local.

Partition-wise aggregation/grouping computes aggregates for each partition
separately.  If the group clause contains the partition key, all the rows
belonging to a given group come from one partition, thus allowing aggregates
to be computed completely for each partition.  Otherwise, partial aggregates
computed for each partition are combined across the partitions to produce the
final aggregates. This technique improves performance because:
i. When partitions are located on foreign server, we can push down the
aggregate to the foreign server.
ii. If hash table for each partition fits in memory, but that for the whole
relation does not, each partition-wise aggregate can use an in-memory hash
table.
iii. Aggregation at the level of partitions can exploit properties of
partitions like indexes, their storage etc.

Attached an experimental patch for the same based on the partition-wise join
patches posted in [1].

This patch currently implements partition-wise aggregation when group clause
contains the partitioning key.  A query below, involving a partitioned table
with 3 partitions containing 1M rows each, producing total 30 groups showed
15% improvement over non-partition-wise aggregation. Same query showed 7 times
improvement when the partitions were located on the foreign servers.

Here is the sample plan:

postgres=# set enable_partition_wise_agg to true;
SET
postgres=# EXPLAIN ANALYZE SELECT a, count(*) FROM plt1 GROUP BY a;
                                                  QUERY PLAN                                                 
--------------------------------------------------------------------------------------------------------------
 Append  (cost=5100.00..61518.90 rows=30 width=12) (actual time=324.837..944.804 rows=30 loops=1)
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual time=324.837..324.838 rows=10 loops=1)
         Relations: Aggregate on (public.fplt1_p1 plt1)
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual time=309.954..309.956 rows=10 loops=1)
         Relations: Aggregate on (public.fplt1_p2 plt1)
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual time=310.002..310.004 rows=10 loops=1)
         Relations: Aggregate on (public.fplt1_p3 plt1)
 Planning time: 0.370 ms
 Execution time: 945.384 ms
(9 rows)

postgres=# set enable_partition_wise_agg to false;
SET
postgres=# EXPLAIN ANALYZE SELECT a, count(*) FROM plt1 GROUP BY a;
                                                              QUERY PLAN                                                              
---------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=121518.01..121518.31 rows=30 width=12) (actual time=6498.452..6498.459 rows=30 loops=1)
   Group Key: plt1.a
   ->  Append  (cost=0.00..106518.00 rows=3000001 width=4) (actual time=0.595..5769.592 rows=3000000 loops=1)
         ->  Seq Scan on plt1  (cost=0.00..0.00 rows=1 width=4) (actual time=0.007..0.007 rows=0 loops=1)
         ->  Foreign Scan on fplt1_p1  (cost=100.00..35506.00 rows=1000000 width=4) (actual time=0.587..1844.506 rows=1000000 loops=1)
         ->  Foreign Scan on fplt1_p2  (cost=100.00..35506.00 rows=1000000 width=4) (actual time=0.384..1839.633 rows=1000000 loops=1)
         ->  Foreign Scan on fplt1_p3  (cost=100.00..35506.00 rows=1000000 width=4) (actual time=0.402..1876.505 rows=1000000 loops=1)
 Planning time: 0.251 ms
 Execution time: 6499.018 ms
(9 rows)

Patch needs a lot of improvement including:
1. Support for partial partition-wise aggregation
2. Estimating number of groups for every partition
3. Estimating cost of partition-wise aggregation based on sample partitions
similar to partition-wise join
and much more.

In order to support partial aggregation on foreign partitions, we need support
to fetch partially aggregated results from the foreign server. That can be
handled as a separate follow-on patch.

Though is lot of work to be done, I would like to get suggestions/opinions from
hackers.

I would like to thank Ashutosh Bapat for providing a draft patch and helping
me off-list on this feature while he is busy working on partition-wise join
feature.

[1] https://www.postgresql.org/message-id/CAFjFpRcbY2QN3cfeMTzVEoyF5Lfku-ijyNR%3DPbXj1e%3D9a%3DqMoQ%40mail.gmail.com

Thanks

--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company




--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

partition-wise-agg-v1.tar.gz (34K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Jeevan Chalke


On Wed, Aug 23, 2017 at 4:43 PM, Jeevan Chalke <[hidden email]> wrote:
Hi,

Attached is the patch to implement partition-wise aggregation/grouping.

As explained earlier, we produce a full aggregation for each partition when
partition keys are leading group by clauses and then append is performed.
Else we do a partial aggregation on each partition, append them and then add
finalization step over it.

I have observed that cost estimated for partition-wise aggregation and cost
for the plans without partition-wise aggregation is almost same. However,
execution time shows significant improvement (as explained my in the very
first email) with partition-wise aggregates. Planner chooses a plan according
to the costs, and thus most of the time plan without partition-wise
aggregation is chosen. Hence, to force partition-wise plans and for the
regression runs, I have added a GUC named partition_wise_agg_cost_factor to
adjust the costings.

This feature is only used when enable_partition_wise_agg GUC is set to on.

Here are the details of the patches in the patch-set:

Here are the new patch-set re-based on HEAD (f0a0c17) and
latest partition-wise join (v29) patches.
 

0001 - Refactors sort and hash final grouping paths into separate functions.
Since partition-wise aggregation too builds paths same as that of
create_grouping_paths(), separated path creation for sort and hash agg into
separate functions. These functions later used by main partition-wise
aggregation/grouping patch.

0002 - Passes targetlist to get_number_of_groups().
We need to estimate groups for individual child relations and thus need to
pass targetlist corresponding to the child rel.

0003 - Adds enable_partition_wise_agg and partition_wise_agg_cost_factor GUCs.

0004 - Implements partition-wise aggregation.

0005 - Adds test-cases.

0006 - postgres_fdw changes which enable pushing aggregation for other upper
relations.

0007 - Provides infrastructure to allow partial aggregation
This will allow us to push the partial aggregation over fdw.
With this one can write SUM(PARTIAL x) to get a partial sum
result. Since PARTIAL is used in syntax, I need to move that
to a reserved keywords category. This is kind of PoC patch
and needs input over approach and the way it is implemented.

0008 - Teaches postgres_fdw to push partial aggregation
With this we can push aggregate on remote server when
GROUP BY key does not match with the PARTITION key too.
 


Since this patch is highly dependent on partition-wise join [1], one needs to
apply all those patches on HEAD (my repository head was at:
66ed3829df959adb47f71d7c903ac59f0670f3e1) before applying these patches in
order.

Suggestions / feedback / inputs ?

[1] https://www.postgresql.org/message-id/CAFjFpRd9Vqh_=-Ldv-XqWY006d07TJ+VXuhXCbdj=P1jukYBrw@...



--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

partition-wise-agg-v2.tar.gz (44K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

rajkumar.raghuwanshi

On Fri, Sep 8, 2017 at 5:47 PM, Jeevan Chalke <[hidden email]> wrote:
Here are the new patch-set re-based on HEAD (f0a0c17) and
latest partition-wise join (v29) patches.

Hi Jeevan,

I have started testing partition-wise-aggregate and got one observation, please take a look.
with the v2 patch, here if I change target list order, query is not picking full partition-wise-aggregate.

SET enable_partition_wise_agg TO true;
SET partition_wise_agg_cost_factor TO 0.5;
SET enable_partition_wise_join TO true;
SET max_parallel_workers_per_gather TO 0;

CREATE TABLE pagg_tab (a int, b int, c int) PARTITION BY RANGE(a);
CREATE TABLE pagg_tab_p1 PARTITION OF pagg_tab FOR VALUES FROM (0) TO (10);
CREATE TABLE pagg_tab_p2 PARTITION OF pagg_tab FOR VALUES FROM (10) TO (20);
CREATE TABLE pagg_tab_p3 PARTITION OF pagg_tab FOR VALUES FROM (20) TO (30);
INSERT INTO pagg_tab SELECT i % 30, i % 30, i % 50 FROM generate_series(0, 299) i;
ANALYZE pagg_tab;

postgres=# explain (verbose, costs off) select a,b,count(*) from pagg_tab group by a,b order by 1,2;
                          QUERY PLAN                         
--------------------------------------------------------------
 Sort
   Output: pagg_tab_p1.a, pagg_tab_p1.b, (count(*))
   Sort Key: pagg_tab_p1.a, pagg_tab_p1.b
   ->  Append
         ->  HashAggregate
               Output: pagg_tab_p1.a, pagg_tab_p1.b, count(*)
               Group Key: pagg_tab_p1.a, pagg_tab_p1.b
               ->  Seq Scan on public.pagg_tab_p1
                     Output: pagg_tab_p1.a, pagg_tab_p1.b
         ->  HashAggregate
               Output: pagg_tab_p2.a, pagg_tab_p2.b, count(*)
               Group Key: pagg_tab_p2.a, pagg_tab_p2.b
               ->  Seq Scan on public.pagg_tab_p2
                     Output: pagg_tab_p2.a, pagg_tab_p2.b
         ->  HashAggregate
               Output: pagg_tab_p3.a, pagg_tab_p3.b, count(*)
               Group Key: pagg_tab_p3.a, pagg_tab_p3.b
               ->  Seq Scan on public.pagg_tab_p3
                     Output: pagg_tab_p3.a, pagg_tab_p3.b
(19 rows)

-- changing target list order
-- picking partial partition-wise aggregation path
postgres=# explain (verbose, costs off) select b,a,count(*) from pagg_tab group by a,b order by 1,2;
                                 QUERY PLAN                                
----------------------------------------------------------------------------
 Finalize GroupAggregate
   Output: pagg_tab_p1.b, pagg_tab_p1.a, count(*)
   Group Key: pagg_tab_p1.b, pagg_tab_p1.a
   ->  Sort
         Output: pagg_tab_p1.b, pagg_tab_p1.a, (PARTIAL count(*))
         Sort Key: pagg_tab_p1.b, pagg_tab_p1.a
         ->  Append
               ->  Partial HashAggregate
                     Output: pagg_tab_p1.b, pagg_tab_p1.a, PARTIAL count(*)
                     Group Key: pagg_tab_p1.b, pagg_tab_p1.a
                     ->  Seq Scan on public.pagg_tab_p1
                           Output: pagg_tab_p1.b, pagg_tab_p1.a
               ->  Partial HashAggregate
                     Output: pagg_tab_p2.b, pagg_tab_p2.a, PARTIAL count(*)
                     Group Key: pagg_tab_p2.b, pagg_tab_p2.a
                     ->  Seq Scan on public.pagg_tab_p2
                           Output: pagg_tab_p2.b, pagg_tab_p2.a
               ->  Partial HashAggregate
                     Output: pagg_tab_p3.b, pagg_tab_p3.a, PARTIAL count(*)
                     Group Key: pagg_tab_p3.b, pagg_tab_p3.a
                     ->  Seq Scan on public.pagg_tab_p3
                           Output: pagg_tab_p3.b, pagg_tab_p3.a
(22 rows)

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation


 
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Jeevan Chalke


On Tue, Sep 12, 2017 at 3:24 PM, Rajkumar Raghuwanshi <[hidden email]> wrote:

On Fri, Sep 8, 2017 at 5:47 PM, Jeevan Chalke <[hidden email]> wrote:
Here are the new patch-set re-based on HEAD (f0a0c17) and
latest partition-wise join (v29) patches.

Hi Jeevan,

I have started testing partition-wise-aggregate and got one observation, please take a look.
with the v2 patch, here if I change target list order, query is not picking full partition-wise-aggregate.

Thanks Rajkumar for reporting this.

I am looking into this issue and will post updated patch with the fix.

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation


 

--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Jeevan Chalke


On Tue, Sep 12, 2017 at 6:21 PM, Jeevan Chalke <[hidden email]> wrote:


On Tue, Sep 12, 2017 at 3:24 PM, Rajkumar Raghuwanshi <[hidden email]> wrote:

Hi Jeevan,

I have started testing partition-wise-aggregate and got one observation, please take a look.
with the v2 patch, here if I change target list order, query is not picking full partition-wise-aggregate.

Thanks Rajkumar for reporting this.

I am looking into this issue and will post updated patch with the fix.

Logic for checking whether partition keys lead group by keys needs to be
updated here. The group by expressions can appear in any order without
affecting the final result. And thus, the need for partition keys should
be leading the group by keys to have full aggregation is not mandatory.
Instead we must ensure that the partition keys are part of the group by
keys to compute full aggregation on a partition.

Attached, revised patch-set with above fix.

Also, in test-cases, I have removed DROP/ANALYZE commands on child
relations and also removed VERBOSE from the EXPLAIN.

Notes:
HEAD: 8edacab209957520423770851351ab4013cb0167
Partition-wise Join patch-set version: v32

Thanks

--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

partition-wise-agg-v3.tar.gz (44K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

rajkumar.raghuwanshi


On Mon, Sep 18, 2017 at 12:37 PM, Jeevan Chalke <[hidden email]> wrote:


On Tue, Sep 12, 2017 at 6:21 PM, Jeevan Chalke <[hidden email]> wrote:


On Tue, Sep 12, 2017 at 3:24 PM, Rajkumar Raghuwanshi <[hidden email]> wrote:

Hi Jeevan,

I have started testing partition-wise-aggregate and got one observation, please take a look.
with the v2 patch, here if I change target list order, query is not picking full partition-wise-aggregate.

Thanks Rajkumar for reporting this.

I am looking into this issue and will post updated patch with the fix.

Logic for checking whether partition keys lead group by keys needs to be
updated here. The group by expressions can appear in any order without
affecting the final result. And thus, the need for partition keys should
be leading the group by keys to have full aggregation is not mandatory.
Instead we must ensure that the partition keys are part of the group by
keys to compute full aggregation on a partition.

Attached, revised patch-set with above fix.

Also, in test-cases, I have removed DROP/ANALYZE commands on child
relations and also removed VERBOSE from the EXPLAIN.

Notes:
HEAD: 8edacab209957520423770851351ab4013cb0167
Partition-wise Join patch-set version: v32

Thanks for the patch. I have tested it and issue is fixed now.  
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

rajkumar.raghuwanshi
In reply to this post by Jeevan Chalke
On Mon, Sep 18, 2017 at 12:37 PM, Jeevan Chalke <[hidden email]> wrote:

On Tue, Sep 12, 2017 at 6:21 PM, Jeevan Chalke <[hidden email]> wrote:


On Tue, Sep 12, 2017 at 3:24 PM, Rajkumar Raghuwanshi <[hidden email]> wrote:

Hi Jeevan,

I have started testing partition-wise-aggregate and got one observation, please take a look.
with the v2 patch, here if I change target list order, query is not picking full partition-wise-aggregate.

Thanks Rajkumar for reporting this.

I am looking into this issue and will post updated patch with the fix.

Logic for checking whether partition keys lead group by keys needs to be
updated here. The group by expressions can appear in any order without
affecting the final result. And thus, the need for partition keys should
be leading the group by keys to have full aggregation is not mandatory.
Instead we must ensure that the partition keys are part of the group by
keys to compute full aggregation on a partition.

Attached, revised patch-set with above fix.

Also, in test-cases, I have removed DROP/ANALYZE commands on child
relations and also removed VERBOSE from the EXPLAIN.

Notes:
HEAD: 8edacab209957520423770851351ab4013cb0167
Partition-wise Join patch-set version: v32

Hi Jeevan,

while testing latest v3 patches, I am  getting a server crash if I reset partition_wise_agg_cost_factor, please take a look.

CREATE TABLE lp (a TEXT, b FLOAT, c INT) PARTITION BY LIST(c);
CREATE TABLE lp1 PARTITION OF lp FOR VALUES IN (10,20);
CREATE TABLE lp2 PARTITION OF lp FOR VALUES IN (30,40);

INSERT INTO lp VALUES ('a1',800, 20);
INSERT INTO lp VALUES ('a2',1250,30);
INSERT INTO lp VALUES ('a3',2975,20);
INSERT INTO lp VALUES ('a3',2850,30);

postgres=# SET enable_partition_wise_agg TO true;
SET
postgres=# SET partition_wise_agg_cost_factor TO 0.5;
SET
postgres=#
postgres=# SELECT MAX(b), AVG(b) FROM lp GROUP BY a HAVING a = 'a3' ORDER BY 1,2;
 max  |  avg  
------+--------
 2975 | 2912.5
(1 row)

postgres=# RESET partition_wise_agg_cost_factor;
RESET
postgres=#
postgres=# SELECT MAX(b), AVG(b) FROM lp GROUP BY a HAVING a = 'a3' ORDER BY 1,2;
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation
Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Jeevan Chalke


On Thu, Sep 21, 2017 at 5:00 PM, Rajkumar Raghuwanshi <[hidden email]> wrote:
Hi Jeevan,

while testing latest v3 patches, I am  getting a server crash if I reset partition_wise_agg_cost_factor, please take a look.

CREATE TABLE lp (a TEXT, b FLOAT, c INT) PARTITION BY LIST(c);
CREATE TABLE lp1 PARTITION OF lp FOR VALUES IN (10,20);
CREATE TABLE lp2 PARTITION OF lp FOR VALUES IN (30,40);

INSERT INTO lp VALUES ('a1',800, 20);
INSERT INTO lp VALUES ('a2',1250,30);
INSERT INTO lp VALUES ('a3',2975,20);
INSERT INTO lp VALUES ('a3',2850,30);

postgres=# SET enable_partition_wise_agg TO true;
SET
postgres=# SET partition_wise_agg_cost_factor TO 0.5;
SET
postgres=#
postgres=# SELECT MAX(b), AVG(b) FROM lp GROUP BY a HAVING a = 'a3' ORDER BY 1,2;
 max  |  avg  
------+--------
 2975 | 2912.5
(1 row)

postgres=# RESET partition_wise_agg_cost_factor;
RESET
postgres=#
postgres=# SELECT MAX(b), AVG(b) FROM lp GROUP BY a HAVING a = 'a3' ORDER BY 1,2;
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>


Oops. Will fix this.

I have added these tests in testcase, but testcase is working as expected.
However running those steps on psql reproduces the crash (not consistent though).

Looking into it. Thanks for reporting.
 
Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation


--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Re: Partition-wise aggregation/grouping

Ashutosh Bapat
In reply to this post by Jeevan Chalke
Hi Jeevan,
I have started reviewing these patches.

0001 looks fine. There might be some changes that will be needed, but
those will be clear when I review the patch that uses this
refactoring.

0002
+ *
+ * If targetlist is provided, we use it else use targetlist from the root.
  */
 static double
 get_number_of_groups(PlannerInfo *root,
                     double path_rows,
-                    grouping_sets_data *gd)
+                    grouping_sets_data *gd,
+                    List *tlist)
 {
    Query      *parse = root->parse;
    double      dNumGroups;
+   List       *targetList = (tlist == NIL) ? parse->targetList : tlist;

May be we should just pass targetlist always. Instead of passing NIL,
pass parse->targetList directly. That would save us one conditional
assignment. May be passing NIL is required for the patches that use
this refactoring, but that's not clear as is in this patch.

0003
In the documenation of enable_partition_wise_aggregate, we should
probably explain why the default is off or like partition_wise_join
GUC, explain the consequences of turning it off. I doubt if we could
accept something like partition_wise_agg_cost_factor looks. But we can
discuss this at a later stage. Mean time it may be worthwhile to fix
the reason why we would require this GUC. If the regular aggregation
has cost lesser than partition-wise aggregation in most of the cases,
then probably we need to fix the cost model.

I will continue reviewing rest of the patches.

On Mon, Sep 18, 2017 at 12:37 PM, Jeevan Chalke
<[hidden email]> wrote:

>
>
> On Tue, Sep 12, 2017 at 6:21 PM, Jeevan Chalke
> <[hidden email]> wrote:
>>
>>
>>
>> On Tue, Sep 12, 2017 at 3:24 PM, Rajkumar Raghuwanshi
>> <[hidden email]> wrote:
>>>
>>>
>>> Hi Jeevan,
>>>
>>> I have started testing partition-wise-aggregate and got one observation,
>>> please take a look.
>>> with the v2 patch, here if I change target list order, query is not
>>> picking full partition-wise-aggregate.
>>
>>
>> Thanks Rajkumar for reporting this.
>>
>> I am looking into this issue and will post updated patch with the fix.
>
>
> Logic for checking whether partition keys lead group by keys needs to be
> updated here. The group by expressions can appear in any order without
> affecting the final result. And thus, the need for partition keys should
> be leading the group by keys to have full aggregation is not mandatory.
> Instead we must ensure that the partition keys are part of the group by
> keys to compute full aggregation on a partition.
>
> Attached, revised patch-set with above fix.
>
> Also, in test-cases, I have removed DROP/ANALYZE commands on child
> relations and also removed VERBOSE from the EXPLAIN.
>
> Notes:
> HEAD: 8edacab209957520423770851351ab4013cb0167
> Partition-wise Join patch-set version: v32
>
> Thanks
>
> --
> Jeevan Chalke
> Principal Software Engineer, Product Development
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
>
>
> --
> Sent via pgsql-hackers mailing list ([hidden email])
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
12345
Previous Thread Next Thread