very slow queries when max_parallel_workers_per_gather is higher than zero

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

very slow queries when max_parallel_workers_per_gather is higher than zero

Pavel Stehule
Hi,

my customer does performance checks of PostgreSQL 9.5 and 10. Almost all queries on 10 are faster, but there are few queries (40 from 1000) where PostgreSQL 9.5 is significantly faster than. Unfortunately - pretty fast queries (about 20ms) are too slow now (5 sec).

attached execution plans

It looks like some cost issue, slow queries prefers Seq scan against bitmap heap scan

Hash Cond: (f_ticketupdate_aad5jtwal0ayaax.dt_event_id = dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id)
   ->  Parallel Seq Scan on f_ticketupdate_aad5jtwal0ayaax  (cost=0.00..1185867.47 rows=24054847 width=8) (actual time=0.020..3741.409 rows=19243863 loops=3)
   ->  Hash  (cost=27.35..27.35 rows=7 width=4) (actual time=0.089..0.089 rows=7 loops=3)
        Buckets: 1024  Batches: 1  Memory Usage: 9kB


Regards

Pavel

pg10_regression.sql (18K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: very slow queries when max_parallel_workers_per_gather is higher than zero

Tomas Vondra-4


On 04/16/2018 11:34 AM, Pavel Stehule wrote:

> Hi,
>
> my customer does performance checks of PostgreSQL 9.5 and 10. Almost all
> queries on 10 are faster, but there are few queries (40 from 1000) where
> PostgreSQL 9.5 is significantly faster than. Unfortunately - pretty fast
> queries (about 20ms) are too slow now (5 sec).
>
> attached execution plans
>
> It looks like some cost issue, slow queries prefers Seq scan against
> bitmap heap scan
>
> Hash Cond: (f_ticketupdate_aad5jtwal0ayaax.dt_event_id =
> dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id
> <http://dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id>)
>    ->  Parallel Seq Scan on f_ticketupdate_aad5jtwal0ayaax 
> (cost=0.00..1185867.47 rows=24054847 width=8) (actual
> time=0.020..3741.409 rows=19243863 loops=3)
>    ->  Hash  (cost=27.35..27.35 rows=7 width=4) (actual
> time=0.089..0.089 rows=7 loops=3)
>         Buckets: 1024  Batches: 1  Memory Usage: 9kB
>
>

What happens when you disable sequential scans on pg10?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: very slow queries when max_parallel_workers_per_gather is higher than zero

Pavel Stehule


2018-04-16 14:00 GMT+02:00 Tomas Vondra <[hidden email]>:


On 04/16/2018 11:34 AM, Pavel Stehule wrote:
> Hi,
>
> my customer does performance checks of PostgreSQL 9.5 and 10. Almost all
> queries on 10 are faster, but there are few queries (40 from 1000) where
> PostgreSQL 9.5 is significantly faster than. Unfortunately - pretty fast
> queries (about 20ms) are too slow now (5 sec).
>
> attached execution plans
>
> It looks like some cost issue, slow queries prefers Seq scan against
> bitmap heap scan
>
> Hash Cond: (f_ticketupdate_aad5jtwal0ayaax.dt_event_id =
> dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id
> <http://dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id>)
>    ->  Parallel Seq Scan on f_ticketupdate_aad5jtwal0ayaax 
> (cost=0.00..1185867.47 rows=24054847 width=8) (actual
> time=0.020..3741.409 rows=19243863 loops=3)
>    ->  Hash  (cost=27.35..27.35 rows=7 width=4) (actual
> time=0.089..0.089 rows=7 loops=3)
>         Buckets: 1024  Batches: 1  Memory Usage: 9kB
>
>

What happens when you disable sequential scans on pg10?

set enable_seqscan=off;
set  max_parallel_workers_per_gather=2;

Query Performs nicely, but no parallel workers are used:
GroupAggregate  (cost=2611148.87..2611152.89 rows=31 width=22) (actual time=0.084..0.084 rows=0 loops=1)
   Group Key: f_zendesktickets_aaeljtllr5at3el.cstm_custom_38746665_primary_column
   ->  Sort  (cost=2611148.87..2611149.11 rows=99 width=28) (actual time=0.082..0.082 rows=0 loops=1)
         Sort Key: f_zendesktickets_aaeljtllr5at3el.cstm_custom_38746665_primary_column
         Sort Method: quicksort  Memory: 25kB
         ->  Nested Loop  (cost=1639.25..2611145.59 rows=99 width=28) (actual time=0.076..0.076 rows=0 loops=1)
               Join Filter: (((f_ticketattributeshistory_aajzjp98uraszb6.attrnewvalue_id = ANY ('{4757,4758,4759}'::integer[])) AND (4754 = f_ticketattributeshistory_aajzjp98uraszb6.attroldvalue_id) AND (4790 = f_ticketattributeshistory_aajzjp98uraszb6.ticketfield_id)) OR (f_zendesktickets_aaeljtllr5at3el.dt_createda
t_id = f_ticketupdate_aad5jtwal0ayaax.dt_event_id))
               ->  Nested Loop  (cost=1638.81..1809540.39 rows=350270 width=20) (actual time=0.075..0.075 rows=0 loops=1)
                     ->  Nested Loop  (cost=1638.24..1508474.08 rows=69140 width=8) (actual time=0.075..0.075 rows=0 loops=1)
                           ->  Bitmap Heap Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia  (cost=4.34..27.35 rows=7 width=4) (actual time=0.026..0.038 rows=7 loops=1)
                                 Recheck Cond: (6171 = id_euweek)
                                 Heap Blocks: exact=7
                                 ->  Bitmap Index Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_a_id_euweek_idx  (cost=0.00..4.33 rows=7 width=0) (actual time=0.019..0.019 rows=7 loops=1)
                                       Index Cond: (6171 = id_euweek)
                           ->  Bitmap Heap Scan on f_ticketupdate_aad5jtwal0ayaax  (cost=1633.90..214617.67 rows=87472 width=8) (actual time=0.004..0.004 rows=0 loops=7)
                                 Recheck Cond: (dt_event_id = dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id)
                                 ->  Bitmap Index Scan on f_ticketupdate_aad5jtwal0ayaax_dt_event_id_idx  (cost=0.00..1612.03 rows=87472 width=0) (actual time=0.003..0.003 rows=0 loops=7)
                                       Index Cond: (dt_event_id = dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id)
                     ->  Index Scan using f_ticketattributeshistory_aajzjp98uraszb6_ticketupdate_id_idx on f_ticketattributeshistory_aajzjp98uraszb6  (cost=0.57..4.12 rows=23 width=20) (never executed)
                           Index Cond: (ticketupdate_id = f_ticketupdate_aad5jtwal0ayaax.id)
               ->  Index Scan using f_zendesktickets_aaeljtllr5at3el_pkey on f_zendesktickets_aaeljtllr5at3el  (cost=0.43..2.27 rows=1 width=12) (never executed)
                     Index Cond: (id = f_ticketattributeshistory_aajzjp98uraszb6.zendesktickets_id)
                     Filter: ((4765 <> status_id) AND (group_id = 17429))
 Planning time: 8.516 ms
 Execution time: 1.895 ms

the speed is back

 

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: very slow queries when max_parallel_workers_per_gather is higher than zero

Tomas Vondra-4
> Query Performs nicely, but no parallel workers are used:
> GroupAggregate  (cost=2611148.87..2611152.89 rows=31 width=22) (actual
> time=0.084..0.084 rows=0 loops=1)
>    Group Key:
> f_zendesktickets_aaeljtllr5at3el.cstm_custom_38746665_primary_column
>    ->  Sort  (cost=2611148.87..2611149.11 rows=99 width=28) (actual
> time=0.082..0.082 rows=0 loops=1)
>          Sort Key:
> f_zendesktickets_aaeljtllr5at3el.cstm_custom_38746665_primary_column
>          Sort Method: quicksort  Memory: 25kB
>          ->  Nested Loop  (cost=1639.25..2611145.59 rows=99 width=28)
> (actual time=0.076..0.076 rows=0 loops=1)
>                Join Filter:
> (((f_ticketattributeshistory_aajzjp98uraszb6.attrnewvalue_id = ANY
> ('{4757,4758,4759}'::integer[])) AND (4754 =
> f_ticketattributeshistory_aajzjp98uraszb6.attroldvalue_id) AND (4790 =
> f_ticketattributeshistory_aajzjp98uraszb6.ticketfield_id)) OR
> (f_zendesktickets_aaeljtllr5at3el.dt_createda
> t_id = f_ticketupdate_aad5jtwal0ayaax.dt_event_id))
>                ->  Nested Loop  (cost=1638.81..1809540.39 rows=350270
> width=20) (actual time=0.075..0.075 rows=0 loops=1)
>                      ->  Nested Loop  (cost=1638.24..1508474.08
> rows=69140 width=8) (actual time=0.075..0.075 rows=0 loops=1)
>                            ->  Bitmap Heap Scan on
> dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z
> dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia 
> (cost=4.34..27.35 rows=7 width=4) (actual time=0.026..0.038 rows=7 loops=1)
>                                  Recheck Cond: (6171 = id_euweek)
>                                  Heap Blocks: exact=7
>                                  ->  Bitmap Index Scan on
> dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_a_id_euweek_idx 
> (cost=0.00..4.33 rows=7 width=0) (actual time=0.019..0.019 rows=7 loops=1)
>                                        Index Cond: (6171 = id_euweek)
>                            ->  Bitmap Heap Scan on
> f_ticketupdate_aad5jtwal0ayaax  (cost=1633.90..214617.67 rows=87472
> width=8) (actual time=0.004..0.004 rows=0 loops=7)
>                                  Recheck Cond: (dt_event_id =
> dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id
> <http://dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id>)
>                                  ->  Bitmap Index Scan on
> f_ticketupdate_aad5jtwal0ayaax_dt_event_id_idx  (cost=0.00..1612.03
> rows=87472 width=0) (actual time=0.003..0.003 rows=0 loops=7)
>                                        Index Cond: (dt_event_id =
> dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id
> <http://dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id>)
>                      ->  Index Scan using
> f_ticketattributeshistory_aajzjp98uraszb6_ticketupdate_id_idx on
> f_ticketattributeshistory_aajzjp98uraszb6  (cost=0.57..4.12 rows=23
> width=20) (never executed)
>                            Index Cond: (ticketupdate_id =
> f_ticketupdate_aad5jtwal0ayaax.id
> <http://f_ticketupdate_aad5jtwal0ayaax.id>)
>                ->  Index Scan using
> f_zendesktickets_aaeljtllr5at3el_pkey on
> f_zendesktickets_aaeljtllr5at3el  (cost=0.43..2.27 rows=1 width=12)
> (never executed)
>                      Index Cond: (id =
> f_ticketattributeshistory_aajzjp98uraszb6.zendesktickets_id)
>                      Filter: ((4765 <> status_id) AND (group_id = 17429))
>  Planning time: 8.516 ms
>  Execution time: 1.895 ms
>
> the speed is back
>

Yeah, but the cost is higher (2611152 vs. 1949508). So clearly, the
database believes it's going to be cheaper. I suspect a part of the
issue might be that the join is misestimated - it's expected to produce
~29k rows, but produces 0.

Can you check if this query has the same issue? It's just the
problematic join, and it should be simpler to investigate:

SELECT count(*)
  FROM f_ticketupdate_aad5jtwal0ayaax AS f
  INNER JOIN
  dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z AS d
  ON (f.dt_event_id = d.id)


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: very slow queries when max_parallel_workers_per_gather is higher than zero

Pavel Stehule


2018-04-16 15:52 GMT+02:00 Tomas Vondra <[hidden email]>:
> Query Performs nicely, but no parallel workers are used:
> GroupAggregate  (cost=2611148.87..2611152.89 rows=31 width=22) (actual
> time=0.084..0.084 rows=0 loops=1)
>    Group Key:
> f_zendesktickets_aaeljtllr5at3el.cstm_custom_38746665_primary_column
>    ->  Sort  (cost=2611148.87..2611149.11 rows=99 width=28) (actual
> time=0.082..0.082 rows=0 loops=1)
>          Sort Key:
> f_zendesktickets_aaeljtllr5at3el.cstm_custom_38746665_primary_column
>          Sort Method: quicksort  Memory: 25kB
>          ->  Nested Loop  (cost=1639.25..2611145.59 rows=99 width=28)
> (actual time=0.076..0.076 rows=0 loops=1)
>                Join Filter:
> (((f_ticketattributeshistory_aajzjp98uraszb6.attrnewvalue_id = ANY
> ('{4757,4758,4759}'::integer[])) AND (4754 =
> f_ticketattributeshistory_aajzjp98uraszb6.attroldvalue_id) AND (4790 =
> f_ticketattributeshistory_aajzjp98uraszb6.ticketfield_id)) OR
> (f_zendesktickets_aaeljtllr5at3el.dt_createda
> t_id = f_ticketupdate_aad5jtwal0ayaax.dt_event_id))
>                ->  Nested Loop  (cost=1638.81..1809540.39 rows=350270
> width=20) (actual time=0.075..0.075 rows=0 loops=1)
>                      ->  Nested Loop  (cost=1638.24..1508474.08
> rows=69140 width=8) (actual time=0.075..0.075 rows=0 loops=1)
>                            ->  Bitmap Heap Scan on
> dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z
> dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia 
> (cost=4.34..27.35 rows=7 width=4) (actual time=0.026..0.038 rows=7 loops=1)
>                                  Recheck Cond: (6171 = id_euweek)
>                                  Heap Blocks: exact=7
>                                  ->  Bitmap Index Scan on
> dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_a_id_euweek_idx 
> (cost=0.00..4.33 rows=7 width=0) (actual time=0.019..0.019 rows=7 loops=1)
>                                        Index Cond: (6171 = id_euweek)
>                            ->  Bitmap Heap Scan on
> f_ticketupdate_aad5jtwal0ayaax  (cost=1633.90..214617.67 rows=87472
> width=8) (actual time=0.004..0.004 rows=0 loops=7)
>                                  Recheck Cond: (dt_event_id =
> dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id
> <http://dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id>)
>                                  ->  Bitmap Index Scan on
> f_ticketupdate_aad5jtwal0ayaax_dt_event_id_idx  (cost=0.00..1612.03
> rows=87472 width=0) (actual time=0.003..0.003 rows=0 loops=7)
>                                        Index Cond: (dt_event_id =
> dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id
> <http://dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61ia.id>)
>                      ->  Index Scan using
> f_ticketattributeshistory_aajzjp98uraszb6_ticketupdate_id_idx on
> f_ticketattributeshistory_aajzjp98uraszb6  (cost=0.57..4.12 rows=23
> width=20) (never executed)
>                            Index Cond: (ticketupdate_id =
> f_ticketupdate_aad5jtwal0ayaax.id
> <http://f_ticketupdate_aad5jtwal0ayaax.id>)
>                ->  Index Scan using
> f_zendesktickets_aaeljtllr5at3el_pkey on
> f_zendesktickets_aaeljtllr5at3el  (cost=0.43..2.27 rows=1 width=12)
> (never executed)
>                      Index Cond: (id =
> f_ticketattributeshistory_aajzjp98uraszb6.zendesktickets_id)
>                      Filter: ((4765 <> status_id) AND (group_id = 17429))
>  Planning time: 8.516 ms
>  Execution time: 1.895 ms
>
> the speed is back
>

Yeah, but the cost is higher (2611152 vs. 1949508). So clearly, the
database believes it's going to be cheaper. I suspect a part of the
issue might be that the join is misestimated - it's expected to produce
~29k rows, but produces 0.

Can you check if this query has the same issue? It's just the
problematic join, and it should be simpler to investigate:

SELECT count(*)
  FROM f_ticketupdate_aad5jtwal0ayaax AS f
  INNER JOIN
  dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z AS d
  ON (f.dt_event_id = d.id)



db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=#  set  max_parallel_workers_per_gather=2;
SET
db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=# explain analyze SELECT count(*)
db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw-#   FROM f_ticketupdate_aad5jtwal0ayaax AS f
db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw-#   INNER JOIN
db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw-#   dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z AS d
db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw-#   ON (f.dt_event_id = d.id);
                                                                                            QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=1550912.23..1550912.24 rows=1 width=8) (actual time=13102.458..13102.458 rows=1 loops=1)
   ->  Gather  (cost=1550912.02..1550912.23 rows=2 width=8) (actual time=13102.374..13102.453 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=1549912.02..1549912.03 rows=1 width=8) (actual time=13098.537..13098.537 rows=1 loops=3)
               ->  Hash Join  (cost=251.51..1489774.90 rows=24054847 width=0) (actual time=3.037..11128.097 rows=19243863 loops=3)
                     Hash Cond: (f.dt_event_id = d.id)
                     ->  Parallel Seq Scan on f_ticketupdate_aad5jtwal0ayaax f  (cost=0.00..1185867.47 rows=24054847 width=4) (actual time=0.051..3724.233 rows=19243863 loops=3)
                     ->  Hash  (cost=178.45..178.45 rows=5845 width=4) (actual time=2.806..2.806 rows=5845 loops=3)
                           Buckets: 8192  Batches: 1  Memory Usage: 270kB
                           ->  Seq Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z d  (cost=0.00..178.45 rows=5845 width=4) (actual time=0.015..1.741 rows=5845 loops=3)
 Planning time: 0.463 ms
 Execution time: 13111.044 ms


db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=#  set  max_parallel_workers_per_gather=0;
SET
db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=# explain analyze SELECT count(*)
  FROM f_ticketupdate_aad5jtwal0ayaax AS f
  INNER JOIN
  dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z AS d
  ON (f.dt_event_id = d.id);
                                                                                      QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=2395990.11..2395990.12 rows=1 width=8) (actual time=37321.462..37321.462 rows=1 loops=1)
   ->  Hash Join  (cost=251.51..2251661.03 rows=57731632 width=0) (actual time=3.118..31649.524 rows=57731589 loops=1)
         Hash Cond: (f.dt_event_id = d.id)
         ->  Seq Scan on f_ticketupdate_aad5jtwal0ayaax f  (cost=0.00..1522635.32 rows=57731632 width=4) (actual time=0.784..10186.896 rows=57731589 loops=1)
         ->  Hash  (cost=178.45..178.45 rows=5845 width=4) (actual time=2.316..2.316 rows=5845 loops=1)
               Buckets: 8192  Batches: 1  Memory Usage: 270kB
               ->  Seq Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z d  (cost=0.00..178.45 rows=5845 width=4) (actual time=0.006..1.359 rows=5845 loops=1)
 Planning time: 0.433 ms
 Execution time: 37321.653 ms

 

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: very slow queries when max_parallel_workers_per_gather is higher than zero

Tomas Vondra-4
Apologies, the reduced query was missing a where condition on id_week:

SELECT count(*)
FROM f_ticketupdate_aad5jtwal0ayaax AS f
INNER JOIN
  dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z AS d
  ON (f.dt_event_id = d.id)
WHERE ( 6171 = d."id_euweek" )


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: very slow queries when max_parallel_workers_per_gather is higher than zero

Guilherme Pereira
Hope it's fine to jump in.

db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=#  set  max_parallel_workers_per_gather=0;
SET
db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=# explain analyze SELECT count(*)
FROM f_ticketupdate_aad5jtwal0ayaax AS f
INNER JOIN
  dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z AS d
  ON (f.dt_event_id = d.id)
WHERE ( 6171 = d."id_euweek" );
                                                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1508646.93..1508646.94 rows=1 width=8) (actual time=0.145..0.145 rows=1 loops=1)
   ->  Nested Loop  (cost=1638.24..1508474.08 rows=69140 width=0) (actual time=0.142..0.142 rows=0 loops=1)
         ->  Bitmap Heap Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z d  (cost=4.34..27.35 rows=7 width=4) (actual time=0.043..0.103 rows=7 loops=1)
               Recheck Cond: (6171 = id_euweek)
               Heap Blocks: exact=7
               ->  Bitmap Index Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_a_id_euweek_idx  (cost=0.00..4.33 rows=7 width=0) (actual time=0.036..0.036 rows=7 loops=1)
                     Index Cond: (6171 = id_euweek)
         ->  Bitmap Heap Scan on f_ticketupdate_aad5jtwal0ayaax f  (cost=1633.90..214617.67 rows=87472 width=4) (actual time=0.003..0.003 rows=0 loops=7)
               Recheck Cond: (dt_event_id = d.id)
               ->  Bitmap Index Scan on f_ticketupdate_aad5jtwal0ayaax_dt_event_id_idx  (cost=0.00..1612.03 rows=87472 width=0) (actual time=0.003..0.003 rows=0 loops=7)
                     Index Cond: (dt_event_id = d.id)
 Planning time: 0.496 ms
 Execution time: 0.227 ms
(13 rows)

db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=#  set  max_parallel_workers_per_gather=2;
SET
db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=# explain analyze SELECT count(*)
FROM f_ticketupdate_aad5jtwal0ayaax AS f
INNER JOIN
  dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z AS d
  ON (f.dt_event_id = d.id)
WHERE ( 6171 = d."id_euweek" );
                                                                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=1490623.06..1490623.07 rows=1 width=8) (actual time=9604.745..9604.745 rows=1 loops=1)
   ->  Gather  (cost=1490622.85..1490623.06 rows=2 width=8) (actual time=9604.707..9604.739 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=1489622.85..1489622.86 rows=1 width=8) (actual time=9600.255..9600.255 rows=1 loops=3)
               ->  Hash Join  (cost=27.44..1489550.83 rows=28808 width=0) (actual time=9600.249..9600.249 rows=0 loops=3)
                     Hash Cond: (f.dt_event_id = d.id)
                     ->  Parallel Seq Scan on f_ticketupdate_aad5jtwal0ayaax f  (cost=0.00..1185867.47 rows=24054847 width=4) (actual time=0.076..4955.525 rows=19243863 loops=3)
                     ->  Hash  (cost=27.35..27.35 rows=7 width=4) (actual time=0.099..0.099 rows=7 loops=3)
                           Buckets: 1024  Batches: 1  Memory Usage: 9kB
                           ->  Bitmap Heap Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z d  (cost=4.34..27.35 rows=7 width=4) (actual time=0.045..0.085 rows=7 loops=3)
                                 Recheck Cond: (6171 = id_euweek)
                                 Heap Blocks: exact=7
                                 ->  Bitmap Index Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_a_id_euweek_idx  (cost=0.00..4.33 rows=7 width=0) (actual time=0.032..0.032 rows=7 loops=3)
                                       Index Cond: (6171 = id_euweek)
 Planning time: 0.616 ms
 Execution time: 9611.924 ms
(17 rows)

On Mon, Apr 16, 2018 at 4:53 PM Pavel Stehule <[hidden email]> wrote:

---------- Forwarded message ---------
From: Tomas Vondra <[hidden email]>
Date: po 16. 4. 2018 16:14
Subject: Re: very slow queries when max_parallel_workers_per_gather is higher than zero
To: Pavel Stehule <[hidden email]>
Cc: PostgreSQL Hackers <[hidden email]>


Apologies, the reduced query was missing a where condition on id_week:

SELECT count(*)
FROM f_ticketupdate_aad5jtwal0ayaax AS f
INNER JOIN
  dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z AS d
  ON (f.dt_event_id = d.id)
WHERE ( 6171 = d."id_euweek" )


regards


--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Reply | Threaded
Open this post in threaded view
|

Re: very slow queries when max_parallel_workers_per_gather is higher than zero

Guilherme Pereira-2
Some extra info, which might help, increasing the cost of the parallel_setup_cost to a value of 4500, Postgres doesn't choose the parallel query anymore, making the query faster again.

db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=# set parallel_setup_cost = 4500;
SET
db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=# explain analyze SELECT count(*)
FROM f_ticketupdate_aad5jtwal0ayaax AS f
INNER JOIN
  dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z AS d
  ON (f.dt_event_id = d.id)
WHERE ( 6171 = d."id_euweek" );
                                                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1508646.93..1508646.94 rows=1 width=8) (actual time=0.067..0.067 rows=1 loops=1)
   ->  Nested Loop  (cost=1638.24..1508474.08 rows=69140 width=0) (actual time=0.064..0.064 rows=0 loops=1)
         ->  Bitmap Heap Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z d  (cost=4.34..27.35 rows=7 width=4) (actual time=0.016..0.040 rows=7 loops=1)
               Recheck Cond: (6171 = id_euweek)
               Heap Blocks: exact=7
               ->  Bitmap Index Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_a_id_euweek_idx  (cost=0.00..4.33 rows=7 width=0) (actual time=0.010..0.010 rows=7 loops=1)
                     Index Cond: (6171 = id_euweek)
         ->  Bitmap Heap Scan on f_ticketupdate_aad5jtwal0ayaax f  (cost=1633.90..214617.67 rows=87472 width=4) (actual time=0.002..0.002 rows=0 loops=7)
               Recheck Cond: (dt_event_id = d.id)
               ->  Bitmap Index Scan on f_ticketupdate_aad5jtwal0ayaax_dt_event_id_idx  (cost=0.00..1612.03 rows=87472 width=0) (actual time=0.002..0.002 rows=0 loops=7)
                     Index Cond: (dt_event_id = d.id)
 Planning time: 0.528 ms
 Execution time: 0.144 ms

On Mon, 16 Apr 2018 at 19:16 Guilherme Pereira <[hidden email]> wrote:
Hope it's fine to jump in.

db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=#  set  max_parallel_workers_per_gather=0;
SET
db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=# explain analyze SELECT count(*)
FROM f_ticketupdate_aad5jtwal0ayaax AS f
INNER JOIN
  dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z AS d
  ON (f.dt_event_id = d.id)
WHERE ( 6171 = d."id_euweek" );
                                                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1508646.93..1508646.94 rows=1 width=8) (actual time=0.145..0.145 rows=1 loops=1)
   ->  Nested Loop  (cost=1638.24..1508474.08 rows=69140 width=0) (actual time=0.142..0.142 rows=0 loops=1)
         ->  Bitmap Heap Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z d  (cost=4.34..27.35 rows=7 width=4) (actual time=0.043..0.103 rows=7 loops=1)
               Recheck Cond: (6171 = id_euweek)
               Heap Blocks: exact=7
               ->  Bitmap Index Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_a_id_euweek_idx  (cost=0.00..4.33 rows=7 width=0) (actual time=0.036..0.036 rows=7 loops=1)
                     Index Cond: (6171 = id_euweek)
         ->  Bitmap Heap Scan on f_ticketupdate_aad5jtwal0ayaax f  (cost=1633.90..214617.67 rows=87472 width=4) (actual time=0.003..0.003 rows=0 loops=7)
               Recheck Cond: (dt_event_id = d.id)
               ->  Bitmap Index Scan on f_ticketupdate_aad5jtwal0ayaax_dt_event_id_idx  (cost=0.00..1612.03 rows=87472 width=0) (actual time=0.003..0.003 rows=0 loops=7)
                     Index Cond: (dt_event_id = d.id)
 Planning time: 0.496 ms
 Execution time: 0.227 ms
(13 rows)

db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=#  set  max_parallel_workers_per_gather=2;
SET
db_sq3rjf5b7p309lq9wuqrh3qhk4gy9fbw=# explain analyze SELECT count(*)
FROM f_ticketupdate_aad5jtwal0ayaax AS f
INNER JOIN
  dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z AS d
  ON (f.dt_event_id = d.id)
WHERE ( 6171 = d."id_euweek" );
                                                                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=1490623.06..1490623.07 rows=1 width=8) (actual time=9604.745..9604.745 rows=1 loops=1)
   ->  Gather  (cost=1490622.85..1490623.06 rows=2 width=8) (actual time=9604.707..9604.739 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=1489622.85..1489622.86 rows=1 width=8) (actual time=9600.255..9600.255 rows=1 loops=3)
               ->  Hash Join  (cost=27.44..1489550.83 rows=28808 width=0) (actual time=9600.249..9600.249 rows=0 loops=3)
                     Hash Cond: (f.dt_event_id = d.id)
                     ->  Parallel Seq Scan on f_ticketupdate_aad5jtwal0ayaax f  (cost=0.00..1185867.47 rows=24054847 width=4) (actual time=0.076..4955.525 rows=19243863 loops=3)
                     ->  Hash  (cost=27.35..27.35 rows=7 width=4) (actual time=0.099..0.099 rows=7 loops=3)
                           Buckets: 1024  Batches: 1  Memory Usage: 9kB
                           ->  Bitmap Heap Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z d  (cost=4.34..27.35 rows=7 width=4) (actual time=0.045..0.085 rows=7 loops=3)
                                 Recheck Cond: (6171 = id_euweek)
                                 Heap Blocks: exact=7
                                 ->  Bitmap Index Scan on dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_a_id_euweek_idx  (cost=0.00..4.33 rows=7 width=0) (actual time=0.032..0.032 rows=7 loops=3)
                                       Index Cond: (6171 = id_euweek)
 Planning time: 0.616 ms
 Execution time: 9611.924 ms
(17 rows)

On Mon, Apr 16, 2018 at 4:53 PM Pavel Stehule <[hidden email]> wrote:

---------- Forwarded message ---------
From: Tomas Vondra <[hidden email]>
Date: po 16. 4. 2018 16:14
Subject: Re: very slow queries when max_parallel_workers_per_gather is higher than zero
To: Pavel Stehule <[hidden email]>
Cc: PostgreSQL Hackers <[hidden email]>


Apologies, the reduced query was missing a where condition on id_week:

SELECT count(*)
FROM f_ticketupdate_aad5jtwal0ayaax AS f
INNER JOIN
  dwh_dm_aabv5kk9rxac4lz_aaonw7nchsan2n1_aad8xhr0m_aaewg8j61iagl1z AS d
  ON (f.dt_event_id = d.id)
WHERE ( 6171 = d."id_euweek" )


regards


--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Reply | Threaded
Open this post in threaded view
|

Re: very slow queries when max_parallel_workers_per_gather is higher than zero

Tomas Vondra-4
Thanks. So we now have a trivial query demonstrating the issue. IMHO
this is not really a costing issue, but due to a misestimate.

Essentially, the problem is that the two sides of the join mismatch,
causing this:

    ->  Bitmap Heap Scan on dwh_dm ... d  (... rows=7 width=4) (...)

    ->  Bitmap Heap Scan on f_ticketupdate_aad5jtwal0ayaax f
          (cost=1633.90..214617.67 rows=87472 width=4)
          (actual time=0.003..0.003 rows=0 loops=7)
        Recheck Cond: (dt_event_id = d.id)

        ->  Bitmap Index Scan on f_ticketupdate_aad5jtwal0ayaa ...
              (cost=0.00..1612.03 rows=87472 width=0)
              (actual time=0.003..0.003 rows=0 loops=7)
            Index Cond: (dt_event_id = d.id)

I.e. the database believes the bitmap index scan will match 87k rows.
But in fact it matches 0, which makes the bitmap heap scan entirely
unnecessary (thus costing nothing, because it's skipped).

Of course, the parallel plan is structured slightly differently, and
does not allow this skipping because it places the f_ table on the outer
side of the join (and scans it using sequential scan).

Now, try changing the parameters (particularly id_euweek) so that the
bitmap index scan actually matches something. I'm pretty sure that will
make the non-parallel case much more expensive.

Increasing the parallel_setup_cost makes the parallel plan a bit more
expensive, enough to switch to the non-parallel plan. But that's mostly
a fluke and not particularly principled way to fix this - if the cost
difference gets a bit larger (or if you increase the number of parallel
workers) it's probably going to use the parallel plan again.

Obviously, PostgreSQL 9.5 doesn't have parallel queries, so it does not
have a chance of making this mistake.

regards
--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services