BUG #16280: dead tuples (probably) effect plan and query performance

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

BUG #16280: dead tuples (probably) effect plan and query performance

apt.postgresql.org Repository Update
The following bug has been logged on the website:

Bug reference:      16280
Logged by:          Ilya Serbin
Email address:      [hidden email]
PostgreSQL version: 11.6
Operating system:   Centos 7.4
Description:        

Hello all,
Faced some strange plan changes with a query (query is quite bad, however I
can't understand why the plan changes).
Context:

db1=> \d+ table1
                                                            Table
"db1.table1"
   Column   |            Type             | Collation | Nullable |          
    Default               | Storage  | Stats target | Description
------------+-----------------------------+-----------+----------+-------------------------------------+----------+--------------+-------------
 id         | bigint                      |           | not null |
nextval('table1_id_seq'::regclass)  | plain    |              |
 table1_id  | character varying(255)      |           | not null |          
                          | extended |              |
 content    | jsonb                       |           |          |          
                          | extended |              |
 created_at | timestamp without time zone |           | not null | now()    
                          | plain    |              |
 updated_at | timestamp without time zone |           | not null | now()    
                          | plain    |              |
 queued_at  | timestamp with time zone    |           |          |          
                          | plain    |              |
Indexes:
    "table1_pkey" PRIMARY KEY, btree (id)
    "uk_table1_id" UNIQUE CONSTRAINT, btree (table1_id)
    "content_idx" gin (content jsonb_path_ops)
Referenced by:
    TABLE "collection_table1s" CONSTRAINT "fk_collection_table1s_table1_id"
FOREIGN KEY (table1_id) REFERENCES table1(id)
    TABLE "db1_table1s" CONSTRAINT "fk_db1_table1s_table1_id" FOREIGN KEY
(table1s_id) REFERENCES table1(id)
    TABLE "table1_sort" CONSTRAINT "fk_table1_sort_table1_id" FOREIGN KEY
(table1_id) REFERENCES table1(table1_id)
              name              |  setting
--------------------------------+-----------
 constraint_exclusion           | partition
 cpu_index_tuple_cost           | 0.005
 cpu_operator_cost              | 0.0025
 cpu_tuple_cost                 | 0.01
 default_statistics_target      | 200
 effective_cache_size           | 1572864
 enable_bitmapscan              | on
 enable_hashagg                 | on
 enable_hashjoin                | on
 enable_indexonlyscan           | on
 enable_indexscan               | on
 enable_material                | on
 enable_mergejoin               | on
 enable_nestloop                | on
 enable_parallel_append         | on
 enable_parallel_hash           | on
 enable_partition_pruning       | on
 enable_partitionwise_aggregate | off
 enable_partitionwise_join      | off
 enable_seqscan                 | on
 enable_sort                    | on
 enable_tidscan                 | on
 random_page_cost               | 1.1
 seq_page_cost                  | 1
 shared_buffers                 | 524288
(25 rows)


First plan: Good plan and problematic query:
https://explain.tensor.ru/archive/explain/1cf3c0181a9574bf2fd06d6fd07dc201:0:2020-02-27#context
Second plan: Same query and it's bad plan:
https://explain.tensor.ru/archive/explain/3770d39f786135e38bebeb1a8a4b1da9:0:2020-02-27#context

Issue description:
After some minimal activity in database first plan changes to the second.
Analyze on table1 do not help (tried with various default_statistics_target
values).
content_idx index recreation helps for some time, but several minutes later
plan degrades back to second one.
The only thing helped (surprisingly) is vacuum. It also helps for some time,
but once number of dead tuples reaches something  like 300-500 - plan
changes back to second one.

Table details with bad plan:

db1=> select * from pg_stat_user_tables where relname='table1';
-[ RECORD 1 ]-------+------------------------------
relid               | 74813598
schemaname          | db1
relname             | table1
seq_scan            | 1167
seq_tup_read        | 315158718
idx_scan            | 23116536
idx_tup_fetch       | 42353778
n_tup_ins           | 409288
n_tup_upd           | 313114963
n_tup_del           | 0
n_tup_hot_upd       | 4683
n_live_tup          | 409288
n_dead_tup          | 379
n_mod_since_analyze | 1169
last_vacuum         | 2020-02-12 08:58:06.147247+01
last_autovacuum     | 2020-02-20 12:29:31.427082+01
last_analyze        |
last_autoanalyze    | 2020-02-20 10:12:11.494305+01
vacuum_count        | 2
autovacuum_count    | 702
analyze_count       | 0
autoanalyze_count   | 20

Table details with good plan after vacuum:
-[ RECORD 1 ]-------+------------------------------
relid               | 74813598
schemaname          | db1
relname             | table1
seq_scan            | 1167
seq_tup_read        | 315158718
idx_scan            | 23116662
idx_tup_fetch       | 42360694
n_tup_ins           | 409288
n_tup_upd           | 313114963
n_tup_del           | 0
n_tup_hot_upd       | 4683
n_live_tup          | 409288
n_dead_tup          | 0
n_mod_since_analyze | 0
last_vacuum         | 2020-02-26 08:10:49.884625+01
last_autovacuum     | 2020-02-20 12:29:31.427082+01
last_analyze        | 2020-02-26 08:13:07.253307+01
last_autoanalyze    | 2020-02-20 10:12:11.494305+01
vacuum_count        | 3
autovacuum_count    | 702
analyze_count       | 1
autoanalyze_count   | 20

I can confirm that issue exists on 11.4 and 11.6 (updated to 11.6 since I
thought it was a bu that may have been fixed in 11.5 "Fix possible failure
of planner's index endpoint probes (Tom Lane)"), it is always reproducible,
and can be reproducted on prod and all lower environments.
I can't understand why it happens. As far as I understand there is something
to do with visibility map, but as per documention it should effect
index-only scans, not my case.
Main two questions are:
1) Is it a bug?
2) If it is expected behaviour - can someone please explain why it happens
and if there is any way  to keep the good plan (without installing
extensions to force pin plans)

Best regards,
Ilya

Reply | Threaded
Open this post in threaded view
|

Re: BUG #16280: dead tuples (probably) effect plan and query performance

alexey.ermakov-2
On 2/27/20 17:28, PG Bug reporting form wrote:

> I can confirm that issue exists on 11.4 and 11.6 (updated to 11.6 since I
> thought it was a bu that may have been fixed in 11.5 "Fix possible failure
> of planner's index endpoint probes (Tom Lane)"), it is always reproducible,
> and can be reproducted on prod and all lower environments.
> I can't understand why it happens. As far as I understand there is something
> to do with visibility map, but as per documention it should effect
> index-only scans, not my case.
> Main two questions are:
> 1) Is it a bug?
> 2) If it is expected behaviour - can someone please explain why it happens
> and if there is any way  to keep the good plan (without installing
> extensions to force pin plans)
>
> Best regards,
> Ilya
>
Hello,

I don't think that's a bug, comparing these lines from both plans

->  Bitmap Index Scan on content_idx  (cost=0.00..155.07 rows=409
width=0) (actual time=4.932..4.932 rows=21952 loops=1)
       Index Cond: (content @> '{"anotherjsonkey": {"values":
["13"]}}'::jsonb)
       Buffers: shared hit=48

->  Bitmap Index Scan on content_idx  (cost=0.00..27.11 rows=414
width=0) (actual time=4.287..4.287 rows=21952 loops=1)
       Index Cond: (content @> '{"anotherjsonkey": {"values":
["13"]}}'::jsonb)
       Buffers: shared hit=16


we can see that cost of bitmapscan is more in bad plan but actual time
is about the same.
I think that could be fixed by changing seq_page_cost/random_page cost
to something like 1/10th of their current value (0.1/0.11) if your database
is on SSD (or fits in memory). Change in cost possibly could be
explained due
to change in table size (if table size changed after vacuum).

It's also possible to trigger autovacuum on that table more often by
changing
autovacuum_vacuum_scale_factor/autovacuum_vacuum_threshold
(it's possible to do so on selected table only if necessary).

In given case main reason of problem I think is that planner underestimate
cost of rechecking condition on large jsonb value, AFAIK planner doesn't
account width of the column but I might be wrong.

--
Alexey Ermakov


Reply | Threaded
Open this post in threaded view
|

Re: BUG #16280: dead tuples (probably) effect plan and query performance

Tom Lane-2
In reply to this post by apt.postgresql.org Repository Update
PG Bug reporting form <[hidden email]> writes:
> Issue description:
> After some minimal activity in database first plan changes to the second.
> Analyze on table1 do not help (tried with various default_statistics_target
> values).
> content_idx index recreation helps for some time, but several minutes later
> plan degrades back to second one.
> The only thing helped (surprisingly) is vacuum. It also helps for some time,
> but once number of dead tuples reaches something  like 300-500 - plan
> changes back to second one.

Surely it is *not* a bug that dead tuples affect the plan choice.
The density of live tuples is an important factor in the relative
costs of different table scan techniques.

In the case at hand, I wonder why your rowcount estimate is off
by a factor of 50:

->  Bitmap Index Scan on content_idx  (cost=0.00..155.07 rows=409
width=0) (actual time=4.932..4.932 rows=21952 loops=1)
       Index Cond: (content @> '{"anotherjsonkey": {"values": ["13"]}}'::jsonb)
       Buffers: shared hit=48

If you can't improve that you're not likely to get a good plan, and
futzing around with cost factors to make this particular query do
"the right thing" anyway is inevitably going to make things worse
for other queries.  Maybe a larger stats target for the content column
would help, but I fear that this @> condition is just beyond
the ability of the planner to estimate.  You might need to redesign
the data representation to make it a bit more SQL-friendly.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: BUG #16280: dead tuples (probably) effect plan and query performance

Serbin, Ilya
Hello, Tom!
Thanks for your answer. My concern is that plan changes after a relatively small number of dead tuples. Bad plan is being generated when table1 contain 300-400 dead tuples. It is only 0.07%-0.08% of the whole table (409k+ entries).
In addition, table is growing and currently there are 425k of entries. However even on 425k size table plan stays good until number of dead tuples reaches 300-400, аfter that plan changes to the bad one.
As I said, I tried analyzing table with various default_statistics_target (100-1000 with step of 100) - plan stays bad. Tried setting random_page_cost=0.1 and seq_page_cost=1 (2, 3, 4, etc). Plan changed to good one only starting from   random_page_cost=0.1 and  seq_page_cost=8. However, once I ran vacuum - plan changed to good one and stayed the same even when I set random_page_cost=30;set seq_page_cost=1;
I realize that I can set autovacuum thresholds for this table to trigger it once dead tuples reach 300, but it doesn't seem right to me (this number of changes happens in something like 5 minutes and tables is ~2GB size as of now).

Why does such a small amount (0.07%) of dead tuples changes cost estimations so dramatically? Or am I missing something and dead tuples has nothing to do with it?

Thanks in advance,
Ilya

чт, 27 февр. 2020 г. в 18:41, Tom Lane <[hidden email]>:
PG Bug reporting form <[hidden email]> writes:
> Issue description:
> After some minimal activity in database first plan changes to the second.
> Analyze on table1 do not help (tried with various default_statistics_target
> values).
> content_idx index recreation helps for some time, but several minutes later
> plan degrades back to second one.
> The only thing helped (surprisingly) is vacuum. It also helps for some time,
> but once number of dead tuples reaches something  like 300-500 - plan
> changes back to second one.

Surely it is *not* a bug that dead tuples affect the plan choice.
The density of live tuples is an important factor in the relative
costs of different table scan techniques.

In the case at hand, I wonder why your rowcount estimate is off
by a factor of 50:

->  Bitmap Index Scan on content_idx  (cost=0.00..155.07 rows=409
width=0) (actual time=4.932..4.932 rows=21952 loops=1)
       Index Cond: (content @> '{"anotherjsonkey": {"values": ["13"]}}'::jsonb)
       Buffers: shared hit=48

If you can't improve that you're not likely to get a good plan, and
futzing around with cost factors to make this particular query do
"the right thing" anyway is inevitably going to make things worse
for other queries.  Maybe a larger stats target for the content column
would help, but I fear that this @> condition is just beyond
the ability of the planner to estimate.  You might need to redesign
the data representation to make it a bit more SQL-friendly.

                        regards, tom lane


--
Илья Сербин | Старший Администратор БД
[hidden email]
Мобильный : 8-918-895-05-96 | Telegram : @iserbin_61
DBI 24/7 контакт : 8-800-333-65-79 | Сайт : http://dbi.ru/

Reply | Threaded
Open this post in threaded view
|

Re: BUG #16280: dead tuples (probably) effect plan and query performance

Serbin, Ilya
Hello,
Asking again just in case someone can help to find out why is that happening in my case and if it is a bug.

пт, 28 февр. 2020 г. в 12:25, Serbin, Ilya <[hidden email]>:
Hello, Tom!
Thanks for your answer. My concern is that plan changes after a relatively small number of dead tuples. Bad plan is being generated when table1 contain 300-400 dead tuples. It is only 0.07%-0.08% of the whole table (409k+ entries).
In addition, table is growing and currently there are 425k of entries. However even on 425k size table plan stays good until number of dead tuples reaches 300-400, аfter that plan changes to the bad one.
As I said, I tried analyzing table with various default_statistics_target (100-1000 with step of 100) - plan stays bad. Tried setting random_page_cost=0.1 and seq_page_cost=1 (2, 3, 4, etc). Plan changed to good one only starting from   random_page_cost=0.1 and  seq_page_cost=8. However, once I ran vacuum - plan changed to good one and stayed the same even when I set random_page_cost=30;set seq_page_cost=1;
I realize that I can set autovacuum thresholds for this table to trigger it once dead tuples reach 300, but it doesn't seem right to me (this number of changes happens in something like 5 minutes and tables is ~2GB size as of now).

Why does such a small amount (0.07%) of dead tuples changes cost estimations so dramatically? Or am I missing something and dead tuples has nothing to do with it?


Reply | Threaded
Open this post in threaded view
|

Re: BUG #16280: dead tuples (probably) effect plan and query performance

James Coleman
On Wed, Mar 11, 2020 at 6:51 AM Serbin, Ilya <[hidden email]> wrote:

>
> Hello,
> Asking again just in case someone can help to find out why is that happening in my case and if it is a bug.
>
> пт, 28 февр. 2020 г. в 12:25, Serbin, Ilya <[hidden email]>:
>>
>> Hello, Tom!
>> Thanks for your answer. My concern is that plan changes after a relatively small number of dead tuples. Bad plan is being generated when table1 contain 300-400 dead tuples. It is only 0.07%-0.08% of the whole table (409k+ entries).
>> In addition, table is growing and currently there are 425k of entries. However even on 425k size table plan stays good until number of dead tuples reaches 300-400, аfter that plan changes to the bad one.
>> As I said, I tried analyzing table with various default_statistics_target (100-1000 with step of 100) - plan stays bad. Tried setting random_page_cost=0.1 and seq_page_cost=1 (2, 3, 4, etc). Plan changed to good one only starting from   random_page_cost=0.1 and  seq_page_cost=8. However, once I ran vacuum - plan changed to good one and stayed the same even when I set random_page_cost=30;set seq_page_cost=1;
>> I realize that I can set autovacuum thresholds for this table to trigger it once dead tuples reach 300, but it doesn't seem right to me (this number of changes happens in something like 5 minutes and tables is ~2GB size as of now).
>>
>> Why does such a small amount (0.07%) of dead tuples changes cost estimations so dramatically? Or am I missing something and dead tuples has nothing to do with it?

That looks like an almost impossible query to plan correctly -- the
planner would need some kind of stats on fields internal to the json
blob, but it can't have that since the blob is necessarily schema less
(at least from PG's perspective). That's why Tom is wondering about
making the representation map more closely to a SQL/relational table
model with discrete columns.

Alternatively have you tried adding function indexes on the (it looks
like) ~2 paths you're querying in the JSON field?

It's not impossible there's a bug in the planner here, but the
opaqueness of this query to the planner is why that's not anyone's
first assumption. To show otherwise someone would have to step through
the execution and show that it's not just reaching some cutoff (that
makes somewhat reasonable sense for how hard this query is to plan) or
that the cutoff is being calculated incorrectly.

James


Reply | Threaded
Open this post in threaded view
|

Re: BUG #16280: dead tuples (probably) effect plan and query performance

Serbin, Ilya
It seems like I found the reason. I've recreated the "content_idx" gin (content jsonb_path_ops) with fastupdate=false, and good plan stayed even on deadtuples = 4000+ ( as I said previously good plan changed to bad starting on ~300 dead tuples in table).
So, as far as I understand the issue raises once gin pending-entry list reaches ~300 entries. AFAIK gin pending-entry list is stored unsorted, so optimizer may decide to choose another plan to be used since cost estimations for scanning unsorted gin pending-entry list may be to high.
Once I disabled fastupdate mechanism I've raised overhead for write operations, but scanning uses only index body, without  gin pending-entry (since it's not presented anymore).
Would much appreciate if someone can confirm or disprove my conclusions.