BUG #16053: Query planner performance regression in sql query with multiple CTE in v12

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

BUG #16053: Query planner performance regression in sql query with multiple CTE in v12

PG Doc comments form
The following bug has been logged on the website:

Bug reference:      16053
Logged by:          Federico Caselli
Email address:      [hidden email]
PostgreSQL version: 12.0
Operating system:   Windows 10
Description:        

After upgrading to v12 from v11.4 I've noticed one of the queries run by an
application has a significant performance degradation.

I've managed to create a toy db that illustrates the issue. Following is a
link of the db and the query in question (there seems to be no way to attach
a file from the web form. If I should upload them somewhere else let me
know)
https://www.dropbox.com/s/6n3ctdn2zohoann/toy-db_and_query.zip?dl=0 

The example query is not very representative in this toy case, but
illustrates the difference between the two planning. (e.g. cte_D is
superfluous in the example, and removing it does remove the execution time
differente in v12, but it is usually used to execute other operations in a
real query). I've noticed that changing the groupby expression (e.g. using
only pid) produces similar execution times in v12 compared with v11.
Following are the output of explain analyze on the query for version 12 and
11. v12 seems to have a similar execution time when adding a join with cte_b
on the final select.

I'm not asking for help optimizing the query, I just wanted to report the
performance regression in this case I've come across.
This is my first time reporting an issue, so I hope I've not missed
anything
Best,
Federico Caselli

Version 12:
Nested Loop  (cost=7503.23..7867.82 rows=1 width=16) (actual
time=41.737..7772.635 rows=307 loops=1)
  Join Filter: ((cte_a.cid = inv.cid) AND (cte_a.pid = inv.pid))
  Rows Removed by Join Filter: 170692
  CTE cte_b
    ->  Seq Scan on inv_y inv_y_1  (cost=0.00..31.89 rows=557 width=8)
(actual time=0.008..0.128 rows=557 loops=1)
          Filter: (y = 1)
          Rows Removed by Filter: 1114
  ->  Hash Join  (cost=107.71..112.76 rows=1 width=24) (actual
time=0.816..3.014 rows=557 loops=1)
        Hash Cond: ((cte_b.cid = cte_a.cid) AND (cte_b.pid = cte_a.pid))
        ->  HashAggregate  (cost=67.30..69.30 rows=200 width=16) (actual
time=0.412..1.234 rows=557 loops=1)
              Group Key: cte_b.cid, cte_b.pid
              ->  Hash Left Join  (cost=49.05..63.12 rows=557 width=16)
(actual time=0.035..0.304 rows=557 loops=1)
                    Hash Cond: ((cte_b.cid = expe.cid) AND (cte_b.pid =
expe.pid))
                    ->  CTE Scan on cte_b  (cost=0.00..11.14 rows=557
width=8) (actual time=0.011..0.219 rows=557 loops=1)
                    ->  Hash  (cost=49.04..49.04 rows=1 width=16) (actual
time=0.017..0.017 rows=0 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 8kB
                          ->  GroupAggregate  (cost=49.01..49.03 rows=1
width=16) (actual time=0.017..0.017 rows=0 loops=1)
                                Group Key: expe.cid, expe.pid
                                ->  Sort  (cost=49.01..49.01 rows=1
width=16) (actual time=0.016..0.016 rows=0 loops=1)
                                      Sort Key: expe.cid, expe.pid
                                      Sort Method: quicksort  Memory: 25kB
                                      ->  Hash Join  (cost=33.67..49.00
rows=1 width=16) (actual time=0.010..0.010 rows=0 loops=1)
                                            Hash Cond: ((cte_b_1.cid =
expe.cid) AND (cte_b_1.pid = expe.pid))
                                            ->  CTE Scan on cte_b cte_b_1
(cost=0.00..11.14 rows=557 width=8) (actual time=0.000..0.000 rows=1
loops=1)
                                            ->  Hash  (cost=33.55..33.55
rows=8 width=16) (actual time=0.006..0.006 rows=0 loops=1)
                                                  Buckets: 1024  Batches: 1
Memory Usage: 8kB
                                                  ->  Seq Scan on expe
(cost=0.00..33.55 rows=8 width=16) (actual time=0.006..0.006 rows=0
loops=1)
                                                        Filter: ((invd >=
'2018-07-01'::date) AND (invd <= '2018-12-31'::date))
        ->  Hash  (cost=37.95..37.95 rows=164 width=8) (actual
time=0.399..0.399 rows=557 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 30kB
              ->  Subquery Scan on cte_a  (cost=34.67..37.95 rows=164
width=8) (actual time=0.255..0.339 rows=557 loops=1)
                    ->  HashAggregate  (cost=34.67..36.31 rows=164 width=16)
(actual time=0.255..0.311 rows=557 loops=1)
                          Group Key: inv_y.cid, inv_y.pid
                          ->  Seq Scan on inv_y  (cost=0.00..31.89 rows=557
width=8) (actual time=0.014..0.145 rows=557 loops=1)
                                Filter: (y = 1)
                                Rows Removed by Filter: 1114
  ->  Merge Join  (cost=7363.63..7723.15 rows=1 width=32) (actual
time=0.142..13.926 rows=307 loops=557)
        Merge Cond: ((inv.cid = cte_b_3.cid) AND (inv.pid = cte_b_3.pid))
        ->  GroupAggregate  (cost=4523.47..4785.28 rows=200 width=16)
(actual time=0.100..10.260 rows=307 loops=557)
              Group Key: inv.cid, inv.pid
              ->  GroupAggregate  (cost=4523.47..4775.61 rows=438 width=16)
(actual time=0.066..10.169 rows=307 loops=557)
                    Group Key: inv.cid, inv.pid
                    ->  Merge Join  (cost=4523.47..4767.95 rows=438
width=16) (actual time=0.032..7.534 rows=24765 loops=557)
                          Merge Cond: ((inv.cid = cte_b_2.cid) AND (inv.pid
= cte_b_2.pid))
                          ->  Sort  (cost=4486.93..4565.57 rows=31456
width=16) (actual time=0.031..1.286 rows=31011 loops=557)
                                Sort Key: inv.cid, inv.pid
                                Sort Method: quicksort  Memory: 2222kB
                                ->  Seq Scan on inv  (cost=0.00..2137.00
rows=31456 width=16) (actual time=0.013..10.231 rows=31011 loops=1)
                                      Filter: ((invd >= '2017-01-01'::date)
AND (invd <= '2018-12-31'::date))
                                      Rows Removed by Filter: 68989
                          ->  Sort  (cost=36.54..37.94 rows=557 width=8)
(actual time=0.000..0.780 rows=25015 loops=557)
                                Sort Key: cte_b_2.cid, cte_b_2.pid
                                Sort Method: quicksort  Memory: 51kB
                                ->  CTE Scan on cte_b cte_b_2
(cost=0.00..11.14 rows=557 width=8) (actual time=0.004..0.038 rows=557
loops=1)
        ->  GroupAggregate  (cost=2840.16..2931.86 rows=200 width=16)
(actual time=0.030..3.575 rows=556 loops=557)
              Group Key: cte_b_3.cid, cte_b_3.pid
              ->  Merge Left Join  (cost=2840.16..2925.68 rows=557 width=16)
(actual time=0.030..3.413 rows=557 loops=557)
                    Merge Cond: ((cte_b_3.cid = inv_1.cid) AND (cte_b_3.pid
= inv_1.pid))
                    ->  Sort  (cost=36.54..37.94 rows=557 width=8) (actual
time=0.000..0.043 rows=557 loops=557)
                          Sort Key: cte_b_3.cid, cte_b_3.pid
                          Sort Method: quicksort  Memory: 51kB
                          ->  CTE Scan on cte_b cte_b_3  (cost=0.00..11.14
rows=557 width=8) (actual time=0.000..0.028 rows=557 loops=1)
                    ->  GroupAggregate  (cost=2803.61..2882.95 rows=133
width=16) (actual time=0.029..3.252 rows=286 loops=557)
                          Group Key: inv_1.cid, inv_1.pid
                          ->  Merge Join  (cost=2803.61..2880.62 rows=133
width=16) (actual time=0.018..2.430 rows=7313 loops=557)
                                Merge Cond: ((inv_1.cid = cte_b_4.cid) AND
(inv_1.pid = cte_b_4.pid))
                                ->  Sort  (cost=2767.07..2790.90 rows=9533
width=16) (actual time=0.017..0.439 rows=9558 loops=557)
                                      Sort Key: inv_1.cid, inv_1.pid
                                      Sort Method: quicksort  Memory:
833kB
                                      ->  Seq Scan on inv inv_1
(cost=0.00..2137.00 rows=9533 width=16) (actual time=0.026..7.346 rows=9558
loops=1)
                                            Filter: ((invd >=
'2018-01-01'::date) AND (invd <= '2018-06-30'::date))
                                            Rows Removed by Filter: 90442
                                ->  Sort  (cost=36.54..37.94 rows=557
width=8) (actual time=0.000..0.267 rows=7584 loops=557)
                                      Sort Key: cte_b_4.cid, cte_b_4.pid
                                      Sort Method: quicksort  Memory: 51kB
                                      ->  CTE Scan on cte_b cte_b_4
(cost=0.00..11.14 rows=557 width=8) (actual time=0.001..0.026 rows=557
loops=1)
Planning Time: 1.123 ms
Execution Time: 7773.390 ms

Version 11:
Hash Join  (cost=7762.21..7767.72 rows=1 width=16) (actual
time=43.094..43.352 rows=307 loops=1)
  Hash Cond: ((cte_h.cid = cte_a.cid) AND (cte_h.pid = cte_a.pid))
  CTE cte_a
    ->  HashAggregate  (cost=36.06..37.70 rows=164 width=16) (actual
time=0.246..0.305 rows=557 loops=1)
          Group Key: inv_y.cid, inv_y.pid
          ->  Seq Scan on inv_y  (cost=0.00..31.89 rows=557 width=16)
(actual time=0.024..0.141 rows=557 loops=1)
                Filter: (y = 1)
                Rows Removed by Filter: 1114
  CTE cte_b
    ->  Seq Scan on inv_y inv_y_1  (cost=0.00..31.89 rows=557 width=8)
(actual time=0.240..0.620 rows=557 loops=1)
          Filter: (y = 1)
          Rows Removed by Filter: 1114
  CTE cte_c
    ->  GroupAggregate  (cost=4443.30..4687.71 rows=424 width=16) (actual
time=14.641..23.430 rows=307 loops=1)
          Group Key: inv.cid, inv.pid
          ->  Merge Join  (cost=4443.30..4680.29 rows=424 width=16) (actual
time=14.610..21.422 rows=24765 loops=1)
                Merge Cond: ((inv.cid = cte_b.cid) AND (inv.pid =
cte_b.pid))
                ->  Sort  (cost=4406.76..4482.95 rows=30476 width=16)
(actual time=14.464..15.601 rows=31011 loops=1)
                      Sort Key: inv.cid, inv.pid
                      Sort Method: quicksort  Memory: 2222kB
                      ->  Seq Scan on inv  (cost=0.00..2137.00 rows=30476
width=16) (actual time=0.020..8.211 rows=31011 loops=1)
                            Filter: ((invd >= '2017-01-01'::date) AND (invd
<= '2018-12-31'::date))
                            Rows Removed by Filter: 68989
                ->  Sort  (cost=36.54..37.94 rows=557 width=8) (actual
time=0.141..0.814 rows=25015 loops=1)
                      Sort Key: cte_b.cid, cte_b.pid
                      Sort Method: quicksort  Memory: 51kB
                      ->  CTE Scan on cte_b  (cost=0.00..11.14 rows=557
width=8) (actual time=0.001..0.025 rows=557 loops=1)
  CTE cte_d
    ->  HashAggregate  (cost=11.66..13.66 rows=200 width=16) (actual
time=23.573..23.609 rows=307 loops=1)
          Group Key: cte_c.cid, cte_c.pid
          ->  CTE Scan on cte_c  (cost=0.00..8.48 rows=424 width=16) (actual
time=14.642..23.493 rows=307 loops=1)
  CTE cte_e
    ->  GroupAggregate  (cost=2798.92..2877.75 rows=132 width=16) (actual
time=14.762..17.212 rows=286 loops=1)
          Group Key: inv_1.cid, inv_1.pid
          ->  Merge Join  (cost=2798.92..2875.44 rows=132 width=16) (actual
time=14.530..16.416 rows=7313 loops=1)
                Merge Cond: ((inv_1.cid = cte_b_1.cid) AND (inv_1.pid =
cte_b_1.pid))
                ->  Sort  (cost=2762.38..2786.05 rows=9469 width=16) (actual
time=14.392..14.652 rows=9558 loops=1)
                      Sort Key: inv_1.cid, inv_1.pid
                      Sort Method: quicksort  Memory: 833kB
                      ->  Seq Scan on inv inv_1  (cost=0.00..2137.00
rows=9469 width=16) (actual time=0.232..12.250 rows=9558 loops=1)
                            Filter: ((invd >= '2018-01-01'::date) AND (invd
<= '2018-06-30'::date))
                            Rows Removed by Filter: 90442
                ->  Sort  (cost=36.54..37.94 rows=557 width=8) (actual
time=0.132..0.333 rows=7584 loops=1)
                      Sort Key: cte_b_1.cid, cte_b_1.pid
                      Sort Method: quicksort  Memory: 51kB
                      ->  CTE Scan on cte_b cte_b_1  (cost=0.00..11.14
rows=557 width=8) (actual time=0.001..0.026 rows=557 loops=1)
  CTE cte_f
    ->  HashAggregate  (cost=24.14..26.14 rows=200 width=16) (actual
time=17.537..17.612 rows=557 loops=1)
          Group Key: cte_b_2.cid, cte_b_2.pid
          ->  Hash Left Join  (cost=4.62..19.96 rows=557 width=16) (actual
time=17.307..17.431 rows=557 loops=1)
                Hash Cond: ((cte_b_2.cid = cte_e.cid) AND (cte_b_2.pid =
cte_e.pid))
                ->  CTE Scan on cte_b cte_b_2  (cost=0.00..11.14 rows=557
width=8) (actual time=0.000..0.025 rows=557 loops=1)
                ->  Hash  (cost=2.64..2.64 rows=132 width=16) (actual
time=17.300..17.300 rows=286 loops=1)
                      Buckets: 1024  Batches: 1  Memory Usage: 22kB
                      ->  CTE Scan on cte_e  (cost=0.00..2.64 rows=132
width=16) (actual time=14.765..17.266 rows=286 loops=1)
  CTE cte_g
    ->  GroupAggregate  (cost=49.01..49.03 rows=1 width=16) (actual
time=0.026..0.026 rows=0 loops=1)
          Group Key: expe.cid, expe.pid
          ->  Sort  (cost=49.01..49.01 rows=1 width=16) (actual
time=0.025..0.025 rows=0 loops=1)
                Sort Key: expe.cid, expe.pid
                Sort Method: quicksort  Memory: 25kB
                ->  Hash Join  (cost=33.67..49.00 rows=1 width=16) (actual
time=0.014..0.014 rows=0 loops=1)
                      Hash Cond: ((cte_b_3.cid = expe.cid) AND (cte_b_3.pid
= expe.pid))
                      ->  CTE Scan on cte_b cte_b_3  (cost=0.00..11.14
rows=557 width=8) (actual time=0.000..0.000 rows=1 loops=1)
                      ->  Hash  (cost=33.55..33.55 rows=8 width=16) (actual
time=0.008..0.008 rows=0 loops=1)
                            Buckets: 1024  Batches: 1  Memory Usage: 8kB
                            ->  Seq Scan on expe  (cost=0.00..33.55 rows=8
width=16) (actual time=0.008..0.008 rows=0 loops=1)
                                  Filter: ((invd >= '2018-07-01'::date) AND
(invd <= '2018-12-31'::date))
  CTE cte_h
    ->  HashAggregate  (cost=19.54..21.54 rows=200 width=16) (actual
time=0.951..1.024 rows=557 loops=1)
          Group Key: cte_b_4.cid, cte_b_4.pid
          ->  Hash Left Join  (cost=0.04..15.36 rows=557 width=16) (actual
time=0.283..0.823 rows=557 loops=1)
                Hash Cond: ((cte_b_4.cid = cte_g.cid) AND (cte_b_4.pid =
cte_g.pid))
                ->  CTE Scan on cte_b cte_b_4  (cost=0.00..11.14 rows=557
width=8) (actual time=0.242..0.713 rows=557 loops=1)
                ->  Hash  (cost=0.02..0.02 rows=1 width=16) (actual
time=0.027..0.027 rows=0 loops=1)
                      Buckets: 1024  Batches: 1  Memory Usage: 8kB
                      ->  CTE Scan on cte_g  (cost=0.00..0.02 rows=1
width=16) (actual time=0.026..0.026 rows=0 loops=1)
  ->  CTE Scan on cte_h  (cost=0.00..4.00 rows=200 width=16) (actual
time=0.952..1.120 rows=557 loops=1)
  ->  Hash  (cost=16.77..16.77 rows=1 width=40) (actual time=42.127..42.127
rows=307 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 30kB
        ->  Hash Join  (cost=11.27..16.77 rows=1 width=40) (actual
time=41.827..42.082 rows=307 loops=1)
              Hash Cond: ((cte_f.cid = cte_a.cid) AND (cte_f.pid =
cte_a.pid))
              ->  CTE Scan on cte_f  (cost=0.00..4.00 rows=200 width=16)
(actual time=17.538..17.705 rows=557 loops=1)
              ->  Hash  (cost=11.25..11.25 rows=1 width=24) (actual
time=24.248..24.249 rows=307 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 25kB
                    ->  Hash Join  (cost=5.74..11.25 rows=1 width=24)
(actual time=24.056..24.208 rows=307 loops=1)
                          Hash Cond: ((cte_d.cid = cte_a.cid) AND (cte_d.pid
= cte_a.pid))
                          ->  CTE Scan on cte_d  (cost=0.00..4.00 rows=200
width=16) (actual time=23.574..23.662 rows=307 loops=1)
                          ->  Hash  (cost=3.28..3.28 rows=164 width=8)
(actual time=0.470..0.470 rows=557 loops=1)
                                Buckets: 1024  Batches: 1  Memory Usage:
30kB
                                ->  CTE Scan on cte_a  (cost=0.00..3.28
rows=164 width=8) (actual time=0.247..0.416 rows=557 loops=1)
Planning Time: 5.114 ms
Execution Time: 44.345 ms

Reply | Threaded
Open this post in threaded view
|

Re: BUG #16053: Query planner performance regression in sql query with multiple CTE in v12

Andrew Gierth
>>>>> "PG" == PG Bug reporting form <[hidden email]> writes:

 PG> After upgrading to v12 from v11.4 I've noticed one of the queries
 PG> run by an application has a significant performance degradation.

 PG> I've managed to create a toy db that illustrates the issue.
 PG> Following is a link of the db and the query in question (there
 PG> seems to be no way to attach a file from the web form. If I should
 PG> upload them somewhere else let me know)
 PG> https://www.dropbox.com/s/6n3ctdn2zohoann/toy-db_and_query.zip?dl=0 

You could at least have included the query in the message.

What seems to be happening here looks to me more down to chance than to
any bug. In v11, the join order is constrained by the CTEs, whereas in
v12 it is not; the misplan is the result of a mis-estimated 1-row result
from a subpath that may not have been considered under v11.

--
Andrew (irc:RhodiumToad)


Reply | Threaded
Open this post in threaded view
|

Re: BUG #16053: Query planner performance regression in sql query with multiple CTE in v12

Federico
Sorry for the late reply and for not including the query in the first message.

Is there a way to avoid the mis-estimate other than changing the query?

Following is the query

--explain analyse
WITH cte_A AS
  (SELECT inv_y.cid AS cid,
          inv_y.pid AS pid,
          sum(inv_y.value) AS value
   FROM inv_y
   WHERE inv_y.y = 1
   GROUP BY inv_y.cid,
            inv_y.pid),
     cte_B AS
  (SELECT inv_y.cid AS cid,
          inv_y.pid AS pid
   FROM inv_y
   WHERE inv_y.y = 1),
     cte_C AS
  (SELECT sum(inv.value) AS value,
          inv.cid AS cid,
          inv.pid AS pid
   FROM inv
   JOIN cte_B ON cte_B.cid = inv.cid
   AND cte_B.pid = inv.pid
   WHERE inv.invd >= '2017-01-01'
     AND inv.invd <= '2018-12-31'
   GROUP BY inv.cid,
            inv.pid),
     cte_D AS
  (SELECT cte_C.cid AS cid,
          cte_C.pid AS pid,
          sum(cte_C.value) AS value
   FROM cte_C
   GROUP BY cte_C.cid,
            cte_C.pid),
     cte_E AS
  (SELECT sum(inv.value) AS value,
          inv.cid AS cid,
          inv.pid AS pid
   FROM inv
   JOIN cte_B ON cte_B.cid = inv.cid
   AND cte_B.pid = inv.pid
   WHERE inv.invd >= '2018-01-01'
     AND inv.invd <= '2018-06-30'
   GROUP BY inv.cid,
            inv.pid),
     cte_F AS
  (SELECT cte_B.cid AS cid,
          cte_B.pid AS pid,
          coalesce(sum(cte_E.value), 0) AS value
   FROM cte_B
   LEFT OUTER JOIN cte_E ON cte_E.cid = cte_B.cid
   AND cte_E.pid = cte_B.pid
   GROUP BY cte_B.cid,
            cte_B.pid),
     cte_G AS
  (SELECT sum(expe.value) AS value,
          expe.cid AS cid,
          expe.pid AS pid
   FROM expe
   JOIN cte_B ON cte_B.cid = expe.cid
   AND cte_B.pid = expe.pid
   WHERE expe.invd >= '2018-07-01'
     AND expe.invd <= '2018-12-31'
   GROUP BY expe.cid,
            expe.pid),
     cte_H AS
  (SELECT cte_B.cid AS cid,
          cte_B.pid AS pid,
          coalesce(sum(cte_G.value), 0) AS value
   FROM cte_B
   LEFT OUTER JOIN cte_G ON cte_G.cid = cte_B.cid
   AND cte_G.pid = cte_B.pid
   GROUP BY cte_B.cid,
            cte_B.pid)
SELECT cte_D.value AS ref_value,
       cte_D.value - (cte_F.value + cte_H.value) AS rem_value
FROM cte_A
JOIN cte_D ON cte_A.cid = cte_D.cid
AND cte_A.pid = cte_D.pid
JOIN cte_F ON cte_A.cid = cte_F.cid
AND cte_A.pid = cte_F.pid
JOIN cte_H ON cte_A.cid = cte_H.cid
AND cte_A.pid = cte_H.pid
--join cte_B ON cte_A.cid = cte_B.cid
--AND cte_A.pid = cte_B.pid

On Sat, 12 Oct 2019 at 19:17, Andrew Gierth <[hidden email]> wrote:
>>>>> "PG" == PG Bug reporting form <[hidden email]> writes:

 PG> After upgrading to v12 from v11.4 I've noticed one of the queries
 PG> run by an application has a significant performance degradation.

 PG> I've managed to create a toy db that illustrates the issue.
 PG> Following is a link of the db and the query in question (there
 PG> seems to be no way to attach a file from the web form. If I should
 PG> upload them somewhere else let me know)
 PG> https://www.dropbox.com/s/6n3ctdn2zohoann/toy-db_and_query.zip?dl=0

You could at least have included the query in the message.

What seems to be happening here looks to me more down to chance than to
any bug. In v11, the join order is constrained by the CTEs, whereas in
v12 it is not; the misplan is the result of a mis-estimated 1-row result
from a subpath that may not have been considered under v11.

--
Andrew (irc:RhodiumToad)