BUG #15618: Duplicating a join clause breaks index use

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

BUG #15618: Duplicating a join clause breaks index use

apt.postgresql.org Repository Update
The following bug has been logged on the website:

Bug reference:      15618
Logged by:          cd a
Email address:      [hidden email]
PostgreSQL version: 10.3
Operating system:   Windows 10
Description:        

Adding a unused join clause to the query, modifies the run time from 50ms to
4seconds.
No columns from the join clause are returned and the joined table is not
used in other places.
The join clause is a duplicate of another one that is used for further
joining.

The joined clause should have been automatically removed as it has no side
effects, thus using indices and optimizing query runtime.

Reply | Threaded
Open this post in threaded view
|

Re: BUG #15618: Duplicating a join clause breaks index use

Pavel Stehule


pá 1. 2. 2019 v 19:03 odesílatel PG Bug reporting form <[hidden email]> napsal:
The following bug has been logged on the website:

Bug reference:      15618
Logged by:          cd a
Email address:      [hidden email]
PostgreSQL version: 10.3
Operating system:   Windows 10
Description:       

Adding a unused join clause to the query, modifies the run time from 50ms to
4seconds.
No columns from the join clause are returned and the joined table is not
used in other places.
The join clause is a duplicate of another one that is used for further
joining.

The joined clause should have been automatically removed as it has no side
effects, thus using indices and optimizing query runtime.

Please, can you send a example. Generally, you can remove only some outer join clauses. unused inner join cannot be removed, because it can has impact on result.

Regards

Pavel Stehule
Reply | Threaded
Open this post in threaded view
|

Re: BUG #15618: Duplicating a join clause breaks index use

Alex-10
Query:

select
 pp.id as id1_13_0_,
 ppv.id as id2_14_1_,
 de.id as id1_3_2_,
 dev.id as id1_5_3_,
 dep.id as id1_3_4_,
 depv.id as id1_5_5_,
 pp.description_source as descript5_13_0_,
 pp.meaning as meaning2_13_0_,
 pp.secondary_meaning as secondar3_13_0_,
 pp.source as source4_13_0_,
 pp.value_id as value_id6_13_0_,
 ppv.first as first3_14_1_,
 ppv.third as third4_14_1_,
 ppv.second as second5_14_1_,
 ppv.text as text6_14_1_,
 ppv.type as type1_14_1_,
 de.parent as parent3_3_2_,
 de.pos as pos2_3_2_,
 de.description_value_id as descript4_3_2_,
 de.product as product5_3_2_,
 dev.language as language2_5_3_,
 dev.lexeme as lexeme3_5_3_,
 dev.value as value4_5_3_,
 dep.parent as parent3_3_4_,
 dep.pos as pos2_3_4_,
 dep.description_value_id as descript4_3_4_,
 dep.product as product5_3_4_,
 depv.language as language2_5_5_,
 depv.lexeme as lexeme3_5_5_,
 depv.value as value4_5_5_
from
    product p
    left outer join product_property pp on p.id=pp.product_id
    left outer join product_property_value ppv on pp.value_id=ppv.id
    left outer join description_element de on pp.description_source=de.id
    left outer join description_value dev on de.description_value_id=dev.id
    left outer join description_element dep on de.parent=dep.id
    left outer join description_value depv on dep.description_value_id=depv.id
    left outer join shop_category sc_dup on p.shop_category=sc_dup.id           --duplicate join
    left outer join shop_category sc on p.shop_category=sc.id
    left outer join category c on sc.category=c.id
where
    c.id=8511


Explain analyze with duplicate join:

"Gather  (cost=134471.60..285324.15 rows=48 width=320) (actual time=1746.142..3854.899 rows=90 loops=1)"
"  Workers Planned: 2"
"  Workers Launched: 2"
"  ->  Hash Join  (cost=133471.60..284319.35 rows=20 width=320) (actual time=1753.615..3702.835 rows=30 loops=3)"
"        Hash Cond: (p.shop_category = sc.id)"
"        ->  Hash Left Join  (cost=133454.98..283698.84 rows=160986 width=324) (actual time=1661.551..3691.094 rows=131278 loops=3)"
"              Hash Cond: (dep.description_value_id = depv.id)"
"              ->  Nested Loop Left Join  (cost=119237.57..253506.99 rows=160986 width=218) (actual time=1465.635..3282.286 rows=131278 loops=3)"
"                    ->  Hash Left Join  (cost=119237.14..175609.97 rows=160986 width=198) (actual time=1465.625..2561.546 rows=131278 loops=3)"
"                          Hash Cond: (de.description_value_id = dev.id)"
"                          ->  Hash Left Join  (cost=105019.73..150448.12 rows=160986 width=92) (actual time=1260.801..2160.415 rows=131278 loops=3)"
"                                Hash Cond: (pp.value_id = ppv.id)"
"                                ->  Hash Left Join  (cost=90691.88..127921.82 rows=160986 width=52) (actual time=1096.581..1811.725 rows=131278 loops=3)"
"                                      Hash Cond: (pp.description_source = de.id)"
"                                      ->  Hash Left Join  (cost=14036.24..34090.73 rows=160986 width=32) (actual time=187.524..370.095 rows=131278 loops=3)"
"                                            Hash Cond: (p.id = pp.product_id)"
"                                            ->  Parallel Seq Scan on product p  (cost=0.00..11757.46 rows=37946 width=8) (actual time=0.118..55.751 rows=30357 loops=3)"
"                                            ->  Hash  (cost=6564.66..6564.66 rows=386366 width=32) (actual time=185.622..185.622 rows=386366 loops=3)"
"                                                  Buckets: 65536  Batches: 8  Memory Usage: 3390kB"
"                                                  ->  Seq Scan on product_property pp  (cost=0.00..6564.66 rows=386366 width=32) (actual time=0.103..80.719 rows=386366 loops=3)"
"                                      ->  Hash  (cost=36108.73..36108.73 rows=2208473 width=20) (actual time=905.265..905.265 rows=2208473 loops=3)"
"                                            Buckets: 65536  Batches: 64  Memory Usage: 2254kB"
"                                            ->  Seq Scan on description_element de  (cost=0.00..36108.73 rows=2208473 width=20) (actual time=0.101..359.710 rows=2208473 loops=3)"
"                                ->  Hash  (cost=6477.38..6477.38 rows=386438 width=40) (actual time=160.004..160.004 rows=386438 loops=3)"
"                                      Buckets: 65536  Batches: 8  Memory Usage: 3050kB"
"                                      ->  Seq Scan on product_property_value ppv  (cost=0.00..6477.38 rows=386438 width=40) (actual time=0.108..62.499 rows=386438 loops=3)"
"                          ->  Hash  (cost=6865.96..6865.96 rows=252596 width=106) (actual time=204.657..204.657 rows=252596 loops=3)"
"                                Buckets: 32768  Batches: 16  Memory Usage: 2449kB"
"                                ->  Seq Scan on description_value dev  (cost=0.00..6865.96 rows=252596 width=106) (actual time=0.102..80.803 rows=252596 loops=3)"
"                    ->  Index Scan using description_element_pkey on description_element dep  (cost=0.43..0.48 rows=1 width=20) (actual time=0.005..0.005 rows=1 loops=393835)"
"                          Index Cond: (de.parent = id)"
"              ->  Hash  (cost=6865.96..6865.96 rows=252596 width=106) (actual time=195.776..195.776 rows=252596 loops=3)"
"                    Buckets: 32768  Batches: 16  Memory Usage: 2449kB"
"                    ->  Seq Scan on description_value depv  (cost=0.00..6865.96 rows=252596 width=106) (actual time=0.019..74.589 rows=252596 loops=3)"
"        ->  Hash  (cost=16.61..16.61 rows=1 width=4) (actual time=0.232..0.232 rows=1 loops=3)"
"              Buckets: 1024  Batches: 1  Memory Usage: 9kB"
"              ->  Nested Loop  (cost=0.56..16.61 rows=1 width=4) (actual time=0.224..0.226 rows=1 loops=3)"
"                    ->  Index Scan using shop_category_category_idx on shop_category sc  (cost=0.28..8.30 rows=1 width=8) (actual time=0.095..0.095 rows=1 loops=3)"
"                          Index Cond: (category = 8511)"
"                    ->  Index Only Scan using category_pkey on category c  (cost=0.28..8.29 rows=1 width=4) (actual time=0.125..0.126 rows=1 loops=3)"
"                          Index Cond: (id = 8511)"
"                          Heap Fetches: 1"
"Planning time: 2.543 ms"
"Execution time: 3857.492 ms"

Explain analyze without duplicate join:

"Nested Loop  (cost=3.40..216.69 rows=48 width=320) (actual time=0.203..1.624 rows=90 loops=1)"
"  ->  Index Only Scan using category_pkey on category c  (cost=0.28..8.29 rows=1 width=4) (actual time=0.049..0.049 rows=1 loops=1)"
"        Index Cond: (id = 8511)"
"        Heap Fetches: 1"
"  ->  Nested Loop Left Join  (cost=3.12..207.91 rows=48 width=324) (actual time=0.152..1.556 rows=90 loops=1)"
"        ->  Nested Loop Left Join  (cost=2.70..186.48 rows=48 width=218) (actual time=0.149..1.416 rows=90 loops=1)"
"              ->  Nested Loop Left Join  (cost=2.27..163.25 rows=48 width=198) (actual time=0.141..1.322 rows=90 loops=1)"
"                    ->  Nested Loop Left Join  (cost=1.85..141.81 rows=48 width=92) (actual time=0.138..1.084 rows=90 loops=1)"
"                          ->  Nested Loop Left Join  (cost=1.42..110.35 rows=48 width=72) (actual time=0.133..0.967 rows=90 loops=1)"
"                                ->  Nested Loop Left Join  (cost=1.00..87.41 rows=48 width=32) (actual time=0.101..0.651 rows=90 loops=1)"
"                                      ->  Nested Loop  (cost=0.57..78.82 rows=11 width=8) (actual time=0.069..0.432 rows=27 loops=1)"
"                                            ->  Index Scan using shop_category_category_idx on shop_category sc  (cost=0.28..8.30 rows=1 width=8) (actual time=0.036..0.036 rows=1 loops=1)"
"                                                  Index Cond: (category = 8511)"
"                                            ->  Index Scan using product_shop_category_idx on product p  (cost=0.29..70.34 rows=18 width=8) (actual time=0.031..0.390 rows=27 loops=1)"
"                                                  Index Cond: (shop_category = sc.id)"
"                                      ->  Index Scan using product_property_product_id_idx on product_property pp  (cost=0.42..0.71 rows=7 width=32) (actual time=0.006..0.007 rows=3 loops=27)"
"                                            Index Cond: (p.id = product_id)"
"                                ->  Index Scan using product_property_value_pkey on product_property_value ppv  (cost=0.42..0.48 rows=1 width=40) (actual time=0.003..0.003 rows=1 loops=90)"
"                                      Index Cond: (pp.value_id = id)"
"                          ->  Index Scan using description_element_pkey on description_element de  (cost=0.43..0.66 rows=1 width=20) (actual time=0.001..0.001 rows=0 loops=90)"
"                                Index Cond: (pp.description_source = id)"
"                    ->  Index Scan using description_value_pkey on description_value dev  (cost=0.42..0.45 rows=1 width=106) (actual time=0.002..0.002 rows=0 loops=90)"
"                          Index Cond: (de.description_value_id = id)"
"              ->  Index Scan using description_element_pkey on description_element dep  (cost=0.43..0.48 rows=1 width=20) (actual time=0.001..0.001 rows=0 loops=90)"
"                    Index Cond: (de.parent = id)"
"        ->  Index Scan using description_value_pkey on description_value depv  (cost=0.42..0.45 rows=1 width=106) (actual time=0.001..0.001 rows=0 loops=90)"
"              Index Cond: (dep.description_value_id = id)"
"Planning time: 2.689 ms"
"Execution time: 1.746 ms"

Note that both query results are the same. Using order by all columns in select results in same content.
On Friday, February 1, 2019, 8:25:44 PM GMT+2, Pavel Stehule <[hidden email]> wrote:




pá 1. 2. 2019 v 19:03 odesílatel PG Bug reporting form <[hidden email]> napsal:
The following bug has been logged on the website:

Bug reference:      15618
Logged by:          cd a
Email address:      [hidden email]
PostgreSQL version: 10.3
Operating system:   Windows 10
Description:       

Adding a unused join clause to the query, modifies the run time from 50ms to
4seconds.
No columns from the join clause are returned and the joined table is not
used in other places.
The join clause is a duplicate of another one that is used for further
joining.

The joined clause should have been automatically removed as it has no side
effects, thus using indices and optimizing query runtime.

Please, can you send a example. Generally, you can remove only some outer join clauses. unused inner join cannot be removed, because it can has impact on result.

Regards

Pavel Stehule
Reply | Threaded
Open this post in threaded view
|

Re: BUG #15618: Duplicating a join clause breaks index use

Pantelis Theodosiou
Is there a foreign key from product (shop_category) that references shop_category (id) ?

On Sat, Feb 2, 2019 at 12:17 PM Alex <[hidden email]> wrote:
Query:

select
 pp.id as id1_13_0_,
 ppv.id as id2_14_1_,
 de.id as id1_3_2_,
 dev.id as id1_5_3_,
 dep.id as id1_3_4_,
 depv.id as id1_5_5_,
 pp.description_source as descript5_13_0_,
 pp.meaning as meaning2_13_0_,
 pp.secondary_meaning as secondar3_13_0_,
 pp.source as source4_13_0_,
 pp.value_id as value_id6_13_0_,
 ppv.first as first3_14_1_,
 ppv.third as third4_14_1_,
 ppv.second as second5_14_1_,
 ppv.text as text6_14_1_,
 ppv.type as type1_14_1_,
 de.parent as parent3_3_2_,
 de.pos as pos2_3_2_,
 de.description_value_id as descript4_3_2_,
 de.product as product5_3_2_,
 dev.language as language2_5_3_,
 dev.lexeme as lexeme3_5_3_,
 dev.value as value4_5_3_,
 dep.parent as parent3_3_4_,
 dep.pos as pos2_3_4_,
 dep.description_value_id as descript4_3_4_,
 dep.product as product5_3_4_,
 depv.language as language2_5_5_,
 depv.lexeme as lexeme3_5_5_,
 depv.value as value4_5_5_
from
    product p
    left outer join product_property pp on p.id=pp.product_id
    left outer join product_property_value ppv on pp.value_id=ppv.id
    left outer join description_element de on pp.description_source=de.id
    left outer join description_value dev on de.description_value_id=dev.id
    left outer join description_element dep on de.parent=dep.id
    left outer join description_value depv on dep.description_value_id=depv.id
    left outer join shop_category sc_dup on p.shop_category=sc_dup.id           --duplicate join
    left outer join shop_category sc on p.shop_category=sc.id
    left outer join category c on sc.category=c.id
where
    c.id=8511


Explain analyze with duplicate join:

"Gather  (cost=134471.60..285324.15 rows=48 width=320) (actual time=1746.142..3854.899 rows=90 loops=1)"
"  Workers Planned: 2"
"  Workers Launched: 2"
"  ->  Hash Join  (cost=133471.60..284319.35 rows=20 width=320) (actual time=1753.615..3702.835 rows=30 loops=3)"
"        Hash Cond: (p.shop_category = sc.id)"
"        ->  Hash Left Join  (cost=133454.98..283698.84 rows=160986 width=324) (actual time=1661.551..3691.094 rows=131278 loops=3)"
"              Hash Cond: (dep.description_value_id = depv.id)"
"              ->  Nested Loop Left Join  (cost=119237.57..253506.99 rows=160986 width=218) (actual time=1465.635..3282.286 rows=131278 loops=3)"
"                    ->  Hash Left Join  (cost=119237.14..175609.97 rows=160986 width=198) (actual time=1465.625..2561.546 rows=131278 loops=3)"
"                          Hash Cond: (de.description_value_id = dev.id)"
"                          ->  Hash Left Join  (cost=105019.73..150448.12 rows=160986 width=92) (actual time=1260.801..2160.415 rows=131278 loops=3)"
"                                Hash Cond: (pp.value_id = ppv.id)"
"                                ->  Hash Left Join  (cost=90691.88..127921.82 rows=160986 width=52) (actual time=1096.581..1811.725 rows=131278 loops=3)"
"                                      Hash Cond: (pp.description_source = de.id)"
"                                      ->  Hash Left Join  (cost=14036.24..34090.73 rows=160986 width=32) (actual time=187.524..370.095 rows=131278 loops=3)"
"                                            Hash Cond: (p.id = pp.product_id)"
"                                            ->  Parallel Seq Scan on product p  (cost=0.00..11757.46 rows=37946 width=8) (actual time=0.118..55.751 rows=30357 loops=3)"
"                                            ->  Hash  (cost=6564.66..6564.66 rows=386366 width=32) (actual time=185.622..185.622 rows=386366 loops=3)"
"                                                  Buckets: 65536  Batches: 8  Memory Usage: 3390kB"
"                                                  ->  Seq Scan on product_property pp  (cost=0.00..6564.66 rows=386366 width=32) (actual time=0.103..80.719 rows=386366 loops=3)"
"                                      ->  Hash  (cost=36108.73..36108.73 rows=2208473 width=20) (actual time=905.265..905.265 rows=2208473 loops=3)"
"                                            Buckets: 65536  Batches: 64  Memory Usage: 2254kB"
"                                            ->  Seq Scan on description_element de  (cost=0.00..36108.73 rows=2208473 width=20) (actual time=0.101..359.710 rows=2208473 loops=3)"
"                                ->  Hash  (cost=6477.38..6477.38 rows=386438 width=40) (actual time=160.004..160.004 rows=386438 loops=3)"
"                                      Buckets: 65536  Batches: 8  Memory Usage: 3050kB"
"                                      ->  Seq Scan on product_property_value ppv  (cost=0.00..6477.38 rows=386438 width=40) (actual time=0.108..62.499 rows=386438 loops=3)"
"                          ->  Hash  (cost=6865.96..6865.96 rows=252596 width=106) (actual time=204.657..204.657 rows=252596 loops=3)"
"                                Buckets: 32768  Batches: 16  Memory Usage: 2449kB"
"                                ->  Seq Scan on description_value dev  (cost=0.00..6865.96 rows=252596 width=106) (actual time=0.102..80.803 rows=252596 loops=3)"
"                    ->  Index Scan using description_element_pkey on description_element dep  (cost=0.43..0.48 rows=1 width=20) (actual time=0.005..0.005 rows=1 loops=393835)"
"                          Index Cond: (de.parent = id)"
"              ->  Hash  (cost=6865.96..6865.96 rows=252596 width=106) (actual time=195.776..195.776 rows=252596 loops=3)"
"                    Buckets: 32768  Batches: 16  Memory Usage: 2449kB"
"                    ->  Seq Scan on description_value depv  (cost=0.00..6865.96 rows=252596 width=106) (actual time=0.019..74.589 rows=252596 loops=3)"
"        ->  Hash  (cost=16.61..16.61 rows=1 width=4) (actual time=0.232..0.232 rows=1 loops=3)"
"              Buckets: 1024  Batches: 1  Memory Usage: 9kB"
"              ->  Nested Loop  (cost=0.56..16.61 rows=1 width=4) (actual time=0.224..0.226 rows=1 loops=3)"
"                    ->  Index Scan using shop_category_category_idx on shop_category sc  (cost=0.28..8.30 rows=1 width=8) (actual time=0.095..0.095 rows=1 loops=3)"
"                          Index Cond: (category = 8511)"
"                    ->  Index Only Scan using category_pkey on category c  (cost=0.28..8.29 rows=1 width=4) (actual time=0.125..0.126 rows=1 loops=3)"
"                          Index Cond: (id = 8511)"
"                          Heap Fetches: 1"
"Planning time: 2.543 ms"
"Execution time: 3857.492 ms"

Explain analyze without duplicate join:

"Nested Loop  (cost=3.40..216.69 rows=48 width=320) (actual time=0.203..1.624 rows=90 loops=1)"
"  ->  Index Only Scan using category_pkey on category c  (cost=0.28..8.29 rows=1 width=4) (actual time=0.049..0.049 rows=1 loops=1)"
"        Index Cond: (id = 8511)"
"        Heap Fetches: 1"
"  ->  Nested Loop Left Join  (cost=3.12..207.91 rows=48 width=324) (actual time=0.152..1.556 rows=90 loops=1)"
"        ->  Nested Loop Left Join  (cost=2.70..186.48 rows=48 width=218) (actual time=0.149..1.416 rows=90 loops=1)"
"              ->  Nested Loop Left Join  (cost=2.27..163.25 rows=48 width=198) (actual time=0.141..1.322 rows=90 loops=1)"
"                    ->  Nested Loop Left Join  (cost=1.85..141.81 rows=48 width=92) (actual time=0.138..1.084 rows=90 loops=1)"
"                          ->  Nested Loop Left Join  (cost=1.42..110.35 rows=48 width=72) (actual time=0.133..0.967 rows=90 loops=1)"
"                                ->  Nested Loop Left Join  (cost=1.00..87.41 rows=48 width=32) (actual time=0.101..0.651 rows=90 loops=1)"
"                                      ->  Nested Loop  (cost=0.57..78.82 rows=11 width=8) (actual time=0.069..0.432 rows=27 loops=1)"
"                                            ->  Index Scan using shop_category_category_idx on shop_category sc  (cost=0.28..8.30 rows=1 width=8) (actual time=0.036..0.036 rows=1 loops=1)"
"                                                  Index Cond: (category = 8511)"
"                                            ->  Index Scan using product_shop_category_idx on product p  (cost=0.29..70.34 rows=18 width=8) (actual time=0.031..0.390 rows=27 loops=1)"
"                                                  Index Cond: (shop_category = sc.id)"
"                                      ->  Index Scan using product_property_product_id_idx on product_property pp  (cost=0.42..0.71 rows=7 width=32) (actual time=0.006..0.007 rows=3 loops=27)"
"                                            Index Cond: (p.id = product_id)"
"                                ->  Index Scan using product_property_value_pkey on product_property_value ppv  (cost=0.42..0.48 rows=1 width=40) (actual time=0.003..0.003 rows=1 loops=90)"
"                                      Index Cond: (pp.value_id = id)"
"                          ->  Index Scan using description_element_pkey on description_element de  (cost=0.43..0.66 rows=1 width=20) (actual time=0.001..0.001 rows=0 loops=90)"
"                                Index Cond: (pp.description_source = id)"
"                    ->  Index Scan using description_value_pkey on description_value dev  (cost=0.42..0.45 rows=1 width=106) (actual time=0.002..0.002 rows=0 loops=90)"
"                          Index Cond: (de.description_value_id = id)"
"              ->  Index Scan using description_element_pkey on description_element dep  (cost=0.43..0.48 rows=1 width=20) (actual time=0.001..0.001 rows=0 loops=90)"
"                    Index Cond: (de.parent = id)"
"        ->  Index Scan using description_value_pkey on description_value depv  (cost=0.42..0.45 rows=1 width=106) (actual time=0.001..0.001 rows=0 loops=90)"
"              Index Cond: (dep.description_value_id = id)"
"Planning time: 2.689 ms"
"Execution time: 1.746 ms"

Note that both query results are the same. Using order by all columns in select results in same content.
On Friday, February 1, 2019, 8:25:44 PM GMT+2, Pavel Stehule <[hidden email]> wrote:




pá 1. 2. 2019 v 19:03 odesílatel PG Bug reporting form <[hidden email]> napsal:
The following bug has been logged on the website:

Bug reference:      15618
Logged by:          cd a
Email address:      [hidden email]
PostgreSQL version: 10.3
Operating system:   Windows 10
Description:       

Adding a unused join clause to the query, modifies the run time from 50ms to
4seconds.
No columns from the join clause are returned and the joined table is not
used in other places.
The join clause is a duplicate of another one that is used for further
joining.

The joined clause should have been automatically removed as it has no side
effects, thus using indices and optimizing query runtime.

Please, can you send a example. Generally, you can remove only some outer join clauses. unused inner join cannot be removed, because it can has impact on result.

Regards

Pavel Stehule
Reply | Threaded
Open this post in threaded view
|

Re: BUG #15618: Duplicating a join clause breaks index use

Alex-10
Yes.

On Saturday, February 2, 2019, 2:31:23 PM GMT+2, Pantelis Theodosiou <[hidden email]> wrote:


Is there a foreign key from product (shop_category) that references shop_category (id) ?

On Sat, Feb 2, 2019 at 12:17 PM Alex <[hidden email]> wrote:
Query:

select
 pp.id as id1_13_0_,
 ppv.id as id2_14_1_,
 de.id as id1_3_2_,
 dev.id as id1_5_3_,
 dep.id as id1_3_4_,
 depv.id as id1_5_5_,
 pp.description_source as descript5_13_0_,
 pp.meaning as meaning2_13_0_,
 pp.secondary_meaning as secondar3_13_0_,
 pp.source as source4_13_0_,
 pp.value_id as value_id6_13_0_,
 ppv.first as first3_14_1_,
 ppv.third as third4_14_1_,
 ppv.second as second5_14_1_,
 ppv.text as text6_14_1_,
 ppv.type as type1_14_1_,
 de.parent as parent3_3_2_,
 de.pos as pos2_3_2_,
 de.description_value_id as descript4_3_2_,
 de.product as product5_3_2_,
 dev.language as language2_5_3_,
 dev.lexeme as lexeme3_5_3_,
 dev.value as value4_5_3_,
 dep.parent as parent3_3_4_,
 dep.pos as pos2_3_4_,
 dep.description_value_id as descript4_3_4_,
 dep.product as product5_3_4_,
 depv.language as language2_5_5_,
 depv.lexeme as lexeme3_5_5_,
 depv.value as value4_5_5_
from
    product p
    left outer join product_property pp on p.id=pp.product_id
    left outer join product_property_value ppv on pp.value_id=ppv.id
    left outer join description_element de on pp.description_source=de.id
    left outer join description_value dev on de.description_value_id=dev.id
    left outer join description_element dep on de.parent=dep.id
    left outer join description_value depv on dep.description_value_id=depv.id
    left outer join shop_category sc_dup on p.shop_category=sc_dup.id           --duplicate join
    left outer join shop_category sc on p.shop_category=sc.id
    left outer join category c on sc.category=c.id
where
    c.id=8511


Explain analyze with duplicate join:

"Gather  (cost=134471.60..285324.15 rows=48 width=320) (actual time=1746.142..3854.899 rows=90 loops=1)"
"  Workers Planned: 2"
"  Workers Launched: 2"
"  ->  Hash Join  (cost=133471.60..284319.35 rows=20 width=320) (actual time=1753.615..3702.835 rows=30 loops=3)"
"        Hash Cond: (p.shop_category = sc.id)"
"        ->  Hash Left Join  (cost=133454.98..283698.84 rows=160986 width=324) (actual time=1661.551..3691.094 rows=131278 loops=3)"
"              Hash Cond: (dep.description_value_id = depv.id)"
"              ->  Nested Loop Left Join  (cost=119237.57..253506.99 rows=160986 width=218) (actual time=1465.635..3282.286 rows=131278 loops=3)"
"                    ->  Hash Left Join  (cost=119237.14..175609.97 rows=160986 width=198) (actual time=1465.625..2561.546 rows=131278 loops=3)"
"                          Hash Cond: (de.description_value_id = dev.id)"
"                          ->  Hash Left Join  (cost=105019.73..150448.12 rows=160986 width=92) (actual time=1260.801..2160.415 rows=131278 loops=3)"
"                                Hash Cond: (pp.value_id = ppv.id)"
"                                ->  Hash Left Join  (cost=90691.88..127921.82 rows=160986 width=52) (actual time=1096.581..1811.725 rows=131278 loops=3)"
"                                      Hash Cond: (pp.description_source = de.id)"
"                                      ->  Hash Left Join  (cost=14036.24..34090.73 rows=160986 width=32) (actual time=187.524..370.095 rows=131278 loops=3)"
"                                            Hash Cond: (p.id = pp.product_id)"
"                                            ->  Parallel Seq Scan on product p  (cost=0.00..11757.46 rows=37946 width=8) (actual time=0.118..55.751 rows=30357 loops=3)"
"                                            ->  Hash  (cost=6564.66..6564.66 rows=386366 width=32) (actual time=185.622..185.622 rows=386366 loops=3)"
"                                                  Buckets: 65536  Batches: 8  Memory Usage: 3390kB"
"                                                  ->  Seq Scan on product_property pp  (cost=0.00..6564.66 rows=386366 width=32) (actual time=0.103..80.719 rows=386366 loops=3)"
"                                      ->  Hash  (cost=36108.73..36108.73 rows=2208473 width=20) (actual time=905.265..905.265 rows=2208473 loops=3)"
"                                            Buckets: 65536  Batches: 64  Memory Usage: 2254kB"
"                                            ->  Seq Scan on description_element de  (cost=0.00..36108.73 rows=2208473 width=20) (actual time=0.101..359.710 rows=2208473 loops=3)"
"                                ->  Hash  (cost=6477.38..6477.38 rows=386438 width=40) (actual time=160.004..160.004 rows=386438 loops=3)"
"                                      Buckets: 65536  Batches: 8  Memory Usage: 3050kB"
"                                      ->  Seq Scan on product_property_value ppv  (cost=0.00..6477.38 rows=386438 width=40) (actual time=0.108..62.499 rows=386438 loops=3)"
"                          ->  Hash  (cost=6865.96..6865.96 rows=252596 width=106) (actual time=204.657..204.657 rows=252596 loops=3)"
"                                Buckets: 32768  Batches: 16  Memory Usage: 2449kB"
"                                ->  Seq Scan on description_value dev  (cost=0.00..6865.96 rows=252596 width=106) (actual time=0.102..80.803 rows=252596 loops=3)"
"                    ->  Index Scan using description_element_pkey on description_element dep  (cost=0.43..0.48 rows=1 width=20) (actual time=0.005..0.005 rows=1 loops=393835)"
"                          Index Cond: (de.parent = id)"
"              ->  Hash  (cost=6865.96..6865.96 rows=252596 width=106) (actual time=195.776..195.776 rows=252596 loops=3)"
"                    Buckets: 32768  Batches: 16  Memory Usage: 2449kB"
"                    ->  Seq Scan on description_value depv  (cost=0.00..6865.96 rows=252596 width=106) (actual time=0.019..74.589 rows=252596 loops=3)"
"        ->  Hash  (cost=16.61..16.61 rows=1 width=4) (actual time=0.232..0.232 rows=1 loops=3)"
"              Buckets: 1024  Batches: 1  Memory Usage: 9kB"
"              ->  Nested Loop  (cost=0.56..16.61 rows=1 width=4) (actual time=0.224..0.226 rows=1 loops=3)"
"                    ->  Index Scan using shop_category_category_idx on shop_category sc  (cost=0.28..8.30 rows=1 width=8) (actual time=0.095..0.095 rows=1 loops=3)"
"                          Index Cond: (category = 8511)"
"                    ->  Index Only Scan using category_pkey on category c  (cost=0.28..8.29 rows=1 width=4) (actual time=0.125..0.126 rows=1 loops=3)"
"                          Index Cond: (id = 8511)"
"                          Heap Fetches: 1"
"Planning time: 2.543 ms"
"Execution time: 3857.492 ms"

Explain analyze without duplicate join:

"Nested Loop  (cost=3.40..216.69 rows=48 width=320) (actual time=0.203..1.624 rows=90 loops=1)"
"  ->  Index Only Scan using category_pkey on category c  (cost=0.28..8.29 rows=1 width=4) (actual time=0.049..0.049 rows=1 loops=1)"
"        Index Cond: (id = 8511)"
"        Heap Fetches: 1"
"  ->  Nested Loop Left Join  (cost=3.12..207.91 rows=48 width=324) (actual time=0.152..1.556 rows=90 loops=1)"
"        ->  Nested Loop Left Join  (cost=2.70..186.48 rows=48 width=218) (actual time=0.149..1.416 rows=90 loops=1)"
"              ->  Nested Loop Left Join  (cost=2.27..163.25 rows=48 width=198) (actual time=0.141..1.322 rows=90 loops=1)"
"                    ->  Nested Loop Left Join  (cost=1.85..141.81 rows=48 width=92) (actual time=0.138..1.084 rows=90 loops=1)"
"                          ->  Nested Loop Left Join  (cost=1.42..110.35 rows=48 width=72) (actual time=0.133..0.967 rows=90 loops=1)"
"                                ->  Nested Loop Left Join  (cost=1.00..87.41 rows=48 width=32) (actual time=0.101..0.651 rows=90 loops=1)"
"                                      ->  Nested Loop  (cost=0.57..78.82 rows=11 width=8) (actual time=0.069..0.432 rows=27 loops=1)"
"                                            ->  Index Scan using shop_category_category_idx on shop_category sc  (cost=0.28..8.30 rows=1 width=8) (actual time=0.036..0.036 rows=1 loops=1)"
"                                                  Index Cond: (category = 8511)"
"                                            ->  Index Scan using product_shop_category_idx on product p  (cost=0.29..70.34 rows=18 width=8) (actual time=0.031..0.390 rows=27 loops=1)"
"                                                  Index Cond: (shop_category = sc.id)"
"                                      ->  Index Scan using product_property_product_id_idx on product_property pp  (cost=0.42..0.71 rows=7 width=32) (actual time=0.006..0.007 rows=3 loops=27)"
"                                            Index Cond: (p.id = product_id)"
"                                ->  Index Scan using product_property_value_pkey on product_property_value ppv  (cost=0.42..0.48 rows=1 width=40) (actual time=0.003..0.003 rows=1 loops=90)"
"                                      Index Cond: (pp.value_id = id)"
"                          ->  Index Scan using description_element_pkey on description_element de  (cost=0.43..0.66 rows=1 width=20) (actual time=0.001..0.001 rows=0 loops=90)"
"                                Index Cond: (pp.description_source = id)"
"                    ->  Index Scan using description_value_pkey on description_value dev  (cost=0.42..0.45 rows=1 width=106) (actual time=0.002..0.002 rows=0 loops=90)"
"                          Index Cond: (de.description_value_id = id)"
"              ->  Index Scan using description_element_pkey on description_element dep  (cost=0.43..0.48 rows=1 width=20) (actual time=0.001..0.001 rows=0 loops=90)"
"                    Index Cond: (de.parent = id)"
"        ->  Index Scan using description_value_pkey on description_value depv  (cost=0.42..0.45 rows=1 width=106) (actual time=0.001..0.001 rows=0 loops=90)"
"              Index Cond: (dep.description_value_id = id)"
"Planning time: 2.689 ms"
"Execution time: 1.746 ms"

Note that both query results are the same. Using order by all columns in select results in same content.
On Friday, February 1, 2019, 8:25:44 PM GMT+2, Pavel Stehule <[hidden email]> wrote:




pá 1. 2. 2019 v 19:03 odesílatel PG Bug reporting form <[hidden email]> napsal:
The following bug has been logged on the website:

Bug reference:      15618
Logged by:          cd a
Email address:      [hidden email]
PostgreSQL version: 10.3
Operating system:   Windows 10
Description:       

Adding a unused join clause to the query, modifies the run time from 50ms to
4seconds.
No columns from the join clause are returned and the joined table is not
used in other places.
The join clause is a duplicate of another one that is used for further
joining.

The joined clause should have been automatically removed as it has no side
effects, thus using indices and optimizing query runtime.

Please, can you send a example. Generally, you can remove only some outer join clauses. unused inner join cannot be removed, because it can has impact on result.

Regards

Pavel Stehule
Reply | Threaded
Open this post in threaded view
|

Re: BUG #15618: Duplicating a join clause breaks index use

Alex-10
Also, note that if I remove any of the following joins and their columns from select, while keeping the duplicate join, the issue doesn't reproduce anymore (query is fast):
left outer join product_property_value ppv on pp.value_id=ppv.id
left outer join description_value depv on dep.description_value_id=depv.id
left outer join description_value dev on de.description_value_id=dev.id
On Saturday, February 2, 2019, 3:07:57 PM GMT+2, Alex <[hidden email]> wrote:


Yes.

On Saturday, February 2, 2019, 2:31:23 PM GMT+2, Pantelis Theodosiou <[hidden email]> wrote:


Is there a foreign key from product (shop_category) that references shop_category (id) ?

On Sat, Feb 2, 2019 at 12:17 PM Alex <[hidden email]> wrote:
Query:

select
 pp.id as id1_13_0_,
 ppv.id as id2_14_1_,
 de.id as id1_3_2_,
 dev.id as id1_5_3_,
 dep.id as id1_3_4_,
 depv.id as id1_5_5_,
 pp.description_source as descript5_13_0_,
 pp.meaning as meaning2_13_0_,
 pp.secondary_meaning as secondar3_13_0_,
 pp.source as source4_13_0_,
 pp.value_id as value_id6_13_0_,
 ppv.first as first3_14_1_,
 ppv.third as third4_14_1_,
 ppv.second as second5_14_1_,
 ppv.text as text6_14_1_,
 ppv.type as type1_14_1_,
 de.parent as parent3_3_2_,
 de.pos as pos2_3_2_,
 de.description_value_id as descript4_3_2_,
 de.product as product5_3_2_,
 dev.language as language2_5_3_,
 dev.lexeme as lexeme3_5_3_,
 dev.value as value4_5_3_,
 dep.parent as parent3_3_4_,
 dep.pos as pos2_3_4_,
 dep.description_value_id as descript4_3_4_,
 dep.product as product5_3_4_,
 depv.language as language2_5_5_,
 depv.lexeme as lexeme3_5_5_,
 depv.value as value4_5_5_
from
    product p
    left outer join product_property pp on p.id=pp.product_id
    left outer join product_property_value ppv on pp.value_id=ppv.id
    left outer join description_element de on pp.description_source=de.id
    left outer join description_value dev on de.description_value_id=dev.id
    left outer join description_element dep on de.parent=dep.id
    left outer join description_value depv on dep.description_value_id=depv.id
    left outer join shop_category sc_dup on p.shop_category=sc_dup.id           --duplicate join
    left outer join shop_category sc on p.shop_category=sc.id
    left outer join category c on sc.category=c.id
where
    c.id=8511


Explain analyze with duplicate join:

"Gather  (cost=134471.60..285324.15 rows=48 width=320) (actual time=1746.142..3854.899 rows=90 loops=1)"
"  Workers Planned: 2"
"  Workers Launched: 2"
"  ->  Hash Join  (cost=133471.60..284319.35 rows=20 width=320) (actual time=1753.615..3702.835 rows=30 loops=3)"
"        Hash Cond: (p.shop_category = sc.id)"
"        ->  Hash Left Join  (cost=133454.98..283698.84 rows=160986 width=324) (actual time=1661.551..3691.094 rows=131278 loops=3)"
"              Hash Cond: (dep.description_value_id = depv.id)"
"              ->  Nested Loop Left Join  (cost=119237.57..253506.99 rows=160986 width=218) (actual time=1465.635..3282.286 rows=131278 loops=3)"
"                    ->  Hash Left Join  (cost=119237.14..175609.97 rows=160986 width=198) (actual time=1465.625..2561.546 rows=131278 loops=3)"
"                          Hash Cond: (de.description_value_id = dev.id)"
"                          ->  Hash Left Join  (cost=105019.73..150448.12 rows=160986 width=92) (actual time=1260.801..2160.415 rows=131278 loops=3)"
"                                Hash Cond: (pp.value_id = ppv.id)"
"                                ->  Hash Left Join  (cost=90691.88..127921.82 rows=160986 width=52) (actual time=1096.581..1811.725 rows=131278 loops=3)"
"                                      Hash Cond: (pp.description_source = de.id)"
"                                      ->  Hash Left Join  (cost=14036.24..34090.73 rows=160986 width=32) (actual time=187.524..370.095 rows=131278 loops=3)"
"                                            Hash Cond: (p.id = pp.product_id)"
"                                            ->  Parallel Seq Scan on product p  (cost=0.00..11757.46 rows=37946 width=8) (actual time=0.118..55.751 rows=30357 loops=3)"
"                                            ->  Hash  (cost=6564.66..6564.66 rows=386366 width=32) (actual time=185.622..185.622 rows=386366 loops=3)"
"                                                  Buckets: 65536  Batches: 8  Memory Usage: 3390kB"
"                                                  ->  Seq Scan on product_property pp  (cost=0.00..6564.66 rows=386366 width=32) (actual time=0.103..80.719 rows=386366 loops=3)"
"                                      ->  Hash  (cost=36108.73..36108.73 rows=2208473 width=20) (actual time=905.265..905.265 rows=2208473 loops=3)"
"                                            Buckets: 65536  Batches: 64  Memory Usage: 2254kB"
"                                            ->  Seq Scan on description_element de  (cost=0.00..36108.73 rows=2208473 width=20) (actual time=0.101..359.710 rows=2208473 loops=3)"
"                                ->  Hash  (cost=6477.38..6477.38 rows=386438 width=40) (actual time=160.004..160.004 rows=386438 loops=3)"
"                                      Buckets: 65536  Batches: 8  Memory Usage: 3050kB"
"                                      ->  Seq Scan on product_property_value ppv  (cost=0.00..6477.38 rows=386438 width=40) (actual time=0.108..62.499 rows=386438 loops=3)"
"                          ->  Hash  (cost=6865.96..6865.96 rows=252596 width=106) (actual time=204.657..204.657 rows=252596 loops=3)"
"                                Buckets: 32768  Batches: 16  Memory Usage: 2449kB"
"                                ->  Seq Scan on description_value dev  (cost=0.00..6865.96 rows=252596 width=106) (actual time=0.102..80.803 rows=252596 loops=3)"
"                    ->  Index Scan using description_element_pkey on description_element dep  (cost=0.43..0.48 rows=1 width=20) (actual time=0.005..0.005 rows=1 loops=393835)"
"                          Index Cond: (de.parent = id)"
"              ->  Hash  (cost=6865.96..6865.96 rows=252596 width=106) (actual time=195.776..195.776 rows=252596 loops=3)"
"                    Buckets: 32768  Batches: 16  Memory Usage: 2449kB"
"                    ->  Seq Scan on description_value depv  (cost=0.00..6865.96 rows=252596 width=106) (actual time=0.019..74.589 rows=252596 loops=3)"
"        ->  Hash  (cost=16.61..16.61 rows=1 width=4) (actual time=0.232..0.232 rows=1 loops=3)"
"              Buckets: 1024  Batches: 1  Memory Usage: 9kB"
"              ->  Nested Loop  (cost=0.56..16.61 rows=1 width=4) (actual time=0.224..0.226 rows=1 loops=3)"
"                    ->  Index Scan using shop_category_category_idx on shop_category sc  (cost=0.28..8.30 rows=1 width=8) (actual time=0.095..0.095 rows=1 loops=3)"
"                          Index Cond: (category = 8511)"
"                    ->  Index Only Scan using category_pkey on category c  (cost=0.28..8.29 rows=1 width=4) (actual time=0.125..0.126 rows=1 loops=3)"
"                          Index Cond: (id = 8511)"
"                          Heap Fetches: 1"
"Planning time: 2.543 ms"
"Execution time: 3857.492 ms"

Explain analyze without duplicate join:

"Nested Loop  (cost=3.40..216.69 rows=48 width=320) (actual time=0.203..1.624 rows=90 loops=1)"
"  ->  Index Only Scan using category_pkey on category c  (cost=0.28..8.29 rows=1 width=4) (actual time=0.049..0.049 rows=1 loops=1)"
"        Index Cond: (id = 8511)"
"        Heap Fetches: 1"
"  ->  Nested Loop Left Join  (cost=3.12..207.91 rows=48 width=324) (actual time=0.152..1.556 rows=90 loops=1)"
"        ->  Nested Loop Left Join  (cost=2.70..186.48 rows=48 width=218) (actual time=0.149..1.416 rows=90 loops=1)"
"              ->  Nested Loop Left Join  (cost=2.27..163.25 rows=48 width=198) (actual time=0.141..1.322 rows=90 loops=1)"
"                    ->  Nested Loop Left Join  (cost=1.85..141.81 rows=48 width=92) (actual time=0.138..1.084 rows=90 loops=1)"
"                          ->  Nested Loop Left Join  (cost=1.42..110.35 rows=48 width=72) (actual time=0.133..0.967 rows=90 loops=1)"
"                                ->  Nested Loop Left Join  (cost=1.00..87.41 rows=48 width=32) (actual time=0.101..0.651 rows=90 loops=1)"
"                                      ->  Nested Loop  (cost=0.57..78.82 rows=11 width=8) (actual time=0.069..0.432 rows=27 loops=1)"
"                                            ->  Index Scan using shop_category_category_idx on shop_category sc  (cost=0.28..8.30 rows=1 width=8) (actual time=0.036..0.036 rows=1 loops=1)"
"                                                  Index Cond: (category = 8511)"
"                                            ->  Index Scan using product_shop_category_idx on product p  (cost=0.29..70.34 rows=18 width=8) (actual time=0.031..0.390 rows=27 loops=1)"
"                                                  Index Cond: (shop_category = sc.id)"
"                                      ->  Index Scan using product_property_product_id_idx on product_property pp  (cost=0.42..0.71 rows=7 width=32) (actual time=0.006..0.007 rows=3 loops=27)"
"                                            Index Cond: (p.id = product_id)"
"                                ->  Index Scan using product_property_value_pkey on product_property_value ppv  (cost=0.42..0.48 rows=1 width=40) (actual time=0.003..0.003 rows=1 loops=90)"
"                                      Index Cond: (pp.value_id = id)"
"                          ->  Index Scan using description_element_pkey on description_element de  (cost=0.43..0.66 rows=1 width=20) (actual time=0.001..0.001 rows=0 loops=90)"
"                                Index Cond: (pp.description_source = id)"
"                    ->  Index Scan using description_value_pkey on description_value dev  (cost=0.42..0.45 rows=1 width=106) (actual time=0.002..0.002 rows=0 loops=90)"
"                          Index Cond: (de.description_value_id = id)"
"              ->  Index Scan using description_element_pkey on description_element dep  (cost=0.43..0.48 rows=1 width=20) (actual time=0.001..0.001 rows=0 loops=90)"
"                    Index Cond: (de.parent = id)"
"        ->  Index Scan using description_value_pkey on description_value depv  (cost=0.42..0.45 rows=1 width=106) (actual time=0.001..0.001 rows=0 loops=90)"
"              Index Cond: (dep.description_value_id = id)"
"Planning time: 2.689 ms"
"Execution time: 1.746 ms"

Note that both query results are the same. Using order by all columns in select results in same content.
On Friday, February 1, 2019, 8:25:44 PM GMT+2, Pavel Stehule <[hidden email]> wrote:




pá 1. 2. 2019 v 19:03 odesílatel PG Bug reporting form <[hidden email]> napsal:
The following bug has been logged on the website:

Bug reference:      15618
Logged by:          cd a
Email address:      [hidden email]
PostgreSQL version: 10.3
Operating system:   Windows 10
Description:       

Adding a unused join clause to the query, modifies the run time from 50ms to
4seconds.
No columns from the join clause are returned and the joined table is not
used in other places.
The join clause is a duplicate of another one that is used for further
joining.

The joined clause should have been automatically removed as it has no side
effects, thus using indices and optimizing query runtime.

Please, can you send a example. Generally, you can remove only some outer join clauses. unused inner join cannot be removed, because it can has impact on result.

Regards

Pavel Stehule
Reply | Threaded
Open this post in threaded view
|

Re: BUG #15618: Duplicating a join clause breaks index use

Tom Lane-2
In reply to this post by Alex-10
Alex <[hidden email]> writes:

> select
> ...
> from
>     product p
>     left outer join product_property pp on p.id=pp.product_id
>     left outer join product_property_value ppv on pp.value_id=ppv.id
>     left outer join description_element de on pp.description_source=de.id
>     left outer join description_value dev on de.description_value_id=dev.id
>     left outer join description_element dep on de.parent=dep.id
>     left outer join description_value depv on dep.description_value_id=depv.id
>     left outer join shop_category sc_dup on p.shop_category=sc_dup.id           --duplicate join
>     left outer join shop_category sc on p.shop_category=sc.id
>     left outer join category c on sc.category=c.id
> where
>     c.id=8511

I suspect the real issue here has little to do with the "duplicate" nature
of the extra join, and a lot to do with the fact that your query exceeds
join_collapse_limit.  That causes the planner to break it up in a
semi-arbitrary way to limit planning time.  Adding one more join moves
where the break happens, and evidently is preventing the planner from
discovering a good join order.

You want join_collapse_limit to be at least 10 to handle this query well,
since 10 tables are mentioned.  The default value, 8, is kinda tuned for
older and slower machines than most of us use today.  I wouldn't make it
enormous, because it does prevent exponential planning time growth for
many tables, but you could likely get away with 15 or so.  You should
likely keep from_collapse_limit in sync with that too.

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: BUG #15618: Duplicating a join clause breaks index use

Alex-10
Indeed, increasing those two parameter values results in a fast query.

With default values, the planning time is ~2ms but execution time is ~4600ms.
With increased values, the planning time increases by 0.5ms (2.5ms) but execution time drops to 0.7ms (aprox x6000 faster).

As the impact on the execution time is very high, shouldn't another mechanism be used instead of parameters join_collapse_limit or from_collapse_limit, such as some maximum planning time?
With a maximum planning time, faster machines will benefit from better query planning.

Another suggestion is to run a benchmark on the machine to automatically determine some good values, and attach hardware info to the results, to recompute in case hardware changes (although hdd performance may degrade with fragmentation and ssd with space occupied).
On Saturday, February 2, 2019, 5:08:15 PM GMT+2, Tom Lane <[hidden email]> wrote:


Alex <[hidden email]> writes:

> select
> ...
> from
>     product p
>     left outer join product_property pp on p.id=pp.product_id
>     left outer join product_property_value ppv on pp.value_id=ppv.id
>     left outer join description_element de on pp.description_source=de.id
>     left outer join description_value dev on de.description_value_id=dev.id
>     left outer join description_element dep on de.parent=dep.id
>     left outer join description_value depv on dep.description_value_id=depv.id
>     left outer join shop_category sc_dup on p.shop_category=sc_dup.id           --duplicate join
>     left outer join shop_category sc on p.shop_category=sc.id
>     left outer join category c on sc.category=c.id
> where
>     c.id=8511

I suspect the real issue here has little to do with the "duplicate" nature
of the extra join, and a lot to do with the fact that your query exceeds
join_collapse_limit.  That causes the planner to break it up in a
semi-arbitrary way to limit planning time.  Adding one more join moves
where the break happens, and evidently is preventing the planner from
discovering a good join order.

You want join_collapse_limit to be at least 10 to handle this query well,
since 10 tables are mentioned.  The default value, 8, is kinda tuned for
older and slower machines than most of us use today.  I wouldn't make it
enormous, because it does prevent exponential planning time growth for
many tables, but you could likely get away with 15 or so.  You should
likely keep from_collapse_limit in sync with that too.

            regards, tom lane