BUG #15984: order of where in() query affects query planer

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

BUG #15984: order of where in() query affects query planer

PG Doc comments form
The following bug has been logged on the website:

Bug reference:      15984
Logged by:          Richard
Email address:      [hidden email]
PostgreSQL version: 11.5
Operating system:   Debian Sid
Description:        

i have a partial index like in the following example and when reorder the
elements of the in() statement,
i get sometimes a bitmap indexscan instead of the expected index only scan.
if i remove an element, i still get the index only,
but with the wrong order, i get a bitmap heap scan. is this expected?



drop table temp;
create table temp ( i int );
insert into temp
select (random()*20)::int from generate_series(1,1000000,1);
create index "full" on temp( i );
create index "partial" on temp( i ) where i in ( 1,2,3,4,5,6,7,8,9 );
vacuum full temp;  

explain analyse select count(*) from temp where i in ( 1,2,3,4,5,6,7,8,9);

┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                              QUERY PLAN  
                                                           │
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Aggregate  (cost=5336.92..5336.93 rows=1 width=8) (actual
time=164.105..164.105 rows=1 loops=1)                                      

│   ->  Index Only Scan using partial on temp  (cost=0.42..5224.42
rows=45000 width=0) (actual time=0.035..138.494 rows=450415 loops=1) │
│         Heap Fetches: 450415                                              
                                                           │
│ Planning Time: 0.953 ms                                                  
                                                           │
│ Execution Time: 164.121 ms                                                
                                                           │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘


explain analyse select count(*) from temp where i in ( 1,2,3,4,5,6,7,8);

┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                              QUERY PLAN  
                                                           │
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Aggregate  (cost=5336.38..5336.39 rows=1 width=8) (actual
time=170.707..170.707 rows=1 loops=1)                                      

│   ->  Index Only Scan using partial on temp  (cost=0.42..5236.38
rows=40000 width=0) (actual time=0.017..144.923 rows=400509 loops=1) │
│         Index Cond: (i = ANY ('{1,2,3,4,5,6,7,8}'::integer[]))            
                                                           │
│         Heap Fetches: 400509                                              
                                                           │
│ Planning Time: 0.153 ms                                                  
                                                           │
│ Execution Time: 170.722 ms                                                
                                                           │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘


explain analyse select count(*) from temp where i in ( 1,2,3,4,5,6,7,9,8);

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                                QUERY PLAN
                                                              │
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Aggregate  (cost=2641.03..2641.04 rows=1 width=8) (actual
time=313.834..313.835 rows=1 loops=1)                                      
  │
│   ->  Bitmap Heap Scan on temp  (cost=837.50..2528.53 rows=45000 width=0)
(actual time=150.929..262.355 rows=450415 loops=1)             │
│         Recheck Cond: ((i = ANY ('{1,2,3,4,5,6,7,8,9}'::integer[])) AND (i
= ANY ('{1,2,3,4,5,6,7,9,8}'::integer[])))                    │
│         Heap Blocks: exact=4425                                          
                                                              │
│         ->  BitmapAnd  (cost=837.50..837.50 rows=2025 width=0) (actual
time=150.465..150.465 rows=0 loops=1)                             │
│               ->  Bitmap Index Scan on partial  (cost=0.00..349.42
rows=45000 width=0) (actual time=46.848..46.848 rows=450415 loops=1)  │
│               ->  Bitmap Index Scan on "full"  (cost=0.00..465.32
rows=45000 width=0) (actual time=103.481..103.482 rows=450415 loops=1) │
│                     Index Cond: (i = ANY
('{1,2,3,4,5,6,7,9,8}'::integer[]))                                        
                   │
│ Planning Time: 0.121 ms                                                  
                                                              │
│ Execution Time: 313.859 ms                                                
                                                              │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Reply | Threaded
Open this post in threaded view
|

Re: BUG #15984: order of where in() query affects query planer

Tom Lane-2
PG Bug reporting form <[hidden email]> writes:
> i have a partial index like in the following example and when reorder the
> elements of the in() statement,
> i get sometimes a bitmap indexscan instead of the expected index only scan.
> if i remove an element, i still get the index only,
> but with the wrong order, i get a bitmap heap scan. is this expected?

FWIW, I get the same plan shape with either order of the IN elements.

However, your example is probably going to be subject to plan instability
because

(1) you used vacuum full not plain vacuum.  That doesn't leave the table
in the all-visible condition that would favor an index-only scan.

(2) you didn't analyze the table.  At some point, autovacuum will come
along and rectify that oversight, likely causing the plan choice to
change underneath you.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: BUG #15984: order of where in() query affects query planer

easteregg
In reply to this post by PG Doc comments form
Thank you for your reply,

i used the given example only because in my original case, i have a 12gb table totaling to 43gb with indices and an quite similar query along with some other clauses in the wherecase and still see this behavior. that table has constantly updates ( around 5-100 updates per second ) written to it.

i confirm, with the example table after the analyse, the result is consistent.
but my real table still have different plans with different order of the array, i attach an explain. the partial index is defined as followed ( i have to blank out some fieldnames and tablenames )

create index sn_<table>_detail_all on <table> (districtid,datepublished) where districtid in (3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19) and statusid in (2,3,4,5);

and the other index from the bitmap query is defined like this:

(districtid, <othertableid>, dateplaced)


 🐘 08:42:38 » [hidden email]:5432/user > analyse verbose <table>;
INFO:  analyzing "public.<table>"
INFO:  "<table>": scanned 300000 of 1608676 pages, containing 1369182 live rows and 120789 dead rows; 300000 rows in sample, 7341901 estimated total rows
Time: 93397.647 ms (01:33.398)


 🐘 08:47:06 » [hidden email]:5432/user > explain (verbose, analyse)  select count(*) from <table> where districtid in (3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19) and statusid in (2,3,4,5);
                                                                          QUERY PLAN
═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
Aggregate  (cost=15644.10..15644.11 rows=1 width=8) (actual time=68.883..68.883 rows=1 loops=1)
  Output: count(*)
  ->  Index Only Scan using sn_<table>r_detail_all on public.<table>  (cost=0.41..15487.10 rows=62803 width=0) (actual time=0.014..61.728 rows=60915 loops=1)
        Output: districtid, datepublished
        Heap Fetches: 11373
Planning Time: 1.636 ms
Execution Time: 68.910 ms
(7 rows)

Time: 94.375 ms


 🐘 08:47:28 » [hidden email]:5432/user > explain (verbose, analyse)  select count(*) from <table> where districtid in (4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,3) and statusid in (2,3,4,5);
                                                                                                                             QUERY PLAN
════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
Aggregate  (cost=3885.65..3885.66 rows=1 width=8) (actual time=1023.805..1023.805 rows=1 loops=1)
  Output: count(*)
  ->  Bitmap Heap Scan on public.<table>  (cost=3164.92..3728.64 rows=62803 width=0) (actual time=258.616..999.205 rows=60915 loops=1)
        Recheck Cond: ((<table>.districtid = ANY ('{3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19}'::integer[])) AND (<table>.statusid = ANY ('{2,3,4,5}'::integer[])) AND (<table>.districtid = ANY ('{4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,3}'::integer[])))
        Heap Blocks: exact=52047
        ->  BitmapAnd  (cost=3164.92..3164.92 rows=544 width=0) (actual time=224.005..224.005 rows=0 loops=1)
              ->  Bitmap Index Scan on sn_<table>r_detail_all  (cost=0.00..564.43 rows=62803 width=0) (actual time=75.655..75.655 rows=60941 loops=1)
              ->  Bitmap Index Scan on <table>_districtid_<table>rid_dateplaced_idx  (cost=0.00..2568.84 rows=63650 width=0) (actual time=143.565..143.565 rows=61161 loops=1)
                    Index Cond: (<table>.districtid = ANY ('{4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,3}'::integer[]))
Planning Time: 1.538 ms
Execution Time: 1023.846 ms
(11 rows)

Time: 1054.731 ms (00:01.055)


From: Tom Lane [mailto:[hidden email]]
To: [hidden email]
Cc: [hidden email]
Sent: Fri, 30 Aug 2019 00:16:45 +0200
Subject: Re: BUG #15984: order of where in() query affects query planer

PG Bug reporting form <[hidden email]> writes:
> i have a partial index like in the following example and when reorder the
> elements of the in() statement,
> i get sometimes a bitmap indexscan instead of the expected index only scan.
> if i remove an element, i still get the index only,
> but with the wrong order, i get a bitmap heap scan. is this expected?

FWIW, I get the same plan shape with either order of the IN elements.

However, your example is probably going to be subject to plan instability
because

(1) you used vacuum full not plain vacuum. That doesn't leave the table
in the all-visible condition that would favor an index-only scan.

(2) you didn't analyze the table. At some point, autovacuum will come
along and rectify that oversight, likely causing the plan choice to
change underneath you.

regards, tom lane
Reply | Threaded
Open this post in threaded view
|

Re: BUG #15984: order of where in() query affects query planer

easteregg
In reply to this post by PG Doc comments form
i have created another example for you, were i get consistent results after the analyse with the different plans:

create table test( a int, b int, c int );
insert into test
select random()*100, random()*10, random()*5 FROM generate_series(1,1000000);
create index "full" on test( a, c );
create index "partial" on test( a, c ) where ( a in ( 3,4,5,6,7,8,9,10 ) and b in ( 2,3 ) );
analyse test;

explain analyse select * from test where a in ( 3,4,5,6,7,8,9,10 ) and b in ( 2,3 );

explain analyse select * from test where a in ( 3,4,5,6,7,8,10,9 ) and b in ( 2,3 );

do i have wrong expections from theryplanner, that those are in fact two different paths to take? currently i work around this issue by sorting the elements for the in statements, but i think that must be the wrong path.

with kind regards, richard