autovac issue with large number of tables


Jim Nasby

A database with a very large number of tables eligible for autovacuum can result in autovacuum workers “stuck” in a tight loop of table_recheck_autovac() constantly reporting nothing to do for the table. This is because, with a very large number of tables, it takes a while to search the statistics hash to verify that a table still needs to be processed[1]. If a worker spends some time processing a table, when it’s done it can spend a significant amount of time rechecking each table that it identified at launch (I’ve seen a worker in this state for over an hour). A simple work-around in this scenario is to kill the worker; the launcher will quickly fire up a new worker on the same database, and that worker will build a new list of tables.

 

That’s not a complete solution though… if the database contains a large number of very small tables, you can end up in a state where one or two workers are busy chugging through those small tables so quickly that any additional workers spend all their time in table_recheck_autovac(), because that takes long enough that the additional workers are never able to “leapfrog” the workers that are doing useful work.
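
As a rough illustration (a sketch only, with invented helper names — the real do_autovacuum()/table_recheck_autovac() in autovacuum.c take considerably more arguments), the worker’s control flow has this shape: the candidate list is snapshotted once at launch, and each entry is then re-verified against freshly re-read stats before any work happens:

/*
 * Sketch of the worker loop.  collect_candidate_tables() and
 * vacuum_one_table() are invented names for this illustration.
 */
List       *table_oids = collect_candidate_tables();   /* once, at launch */
ListCell   *cell;

foreach(cell, table_oids)
{
    Oid         relid = lfirst_oid(cell);

    /*
     * Each recheck forces a pgstats re-read plus hash probes.  Once other
     * workers have handled most of the list, this loop is almost pure
     * recheck overhead.
     */
    if (table_recheck_autovac(relid) == NULL)
        continue;               /* nothing to do for this table anymore */

    vacuum_one_table(relid);
}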

 

PoC patch attached.

 

1: top hits from `perf top -p xxx` on an affected worker

Samples: 72K of event 'cycles', Event count (approx.): 17131910436
Overhead  Shared Object     Symbol
  42.62%  postgres          [.] hash_search_with_hash_value
  10.34%  libc-2.17.so      [.] __memcpy_sse2
   6.99%  [kernel]          [k] copy_user_enhanced_fast_string
   4.73%  libc-2.17.so      [.] _IO_fread
   3.91%  postgres          [.] 0x00000000002d6478
   2.95%  libc-2.17.so      [.] _IO_getc
   2.44%  libc-2.17.so      [.] _IO_file_xsgetn
   1.73%  postgres          [.] hash_search
   1.65%  [kernel]          [k] find_get_entry
   1.10%  postgres          [.] hash_uint32
   0.99%  libc-2.17.so      [.] __memcpy_ssse3_back


autovac.patch (4K)

Re: autovac issue with large number of tables

Masahiko Sawada-2
On Mon, 27 Jul 2020 at 06:43, Nasby, Jim <[hidden email]> wrote:
>
> A database with a very large number of tables eligible for autovacuum can result in autovacuum workers “stuck” in a tight loop of table_recheck_autovac() constantly reporting nothing to do for the table. This is because, with a very large number of tables, it takes a while to search the statistics hash to verify that a table still needs to be processed[1]. If a worker spends some time processing a table, when it’s done it can spend a significant amount of time rechecking each table that it identified at launch (I’ve seen a worker in this state for over an hour). A simple work-around in this scenario is to kill the worker; the launcher will quickly fire up a new worker on the same database, and that worker will build a new list of tables.
>
>
>
> That’s not a complete solution though… if the database contains a large number of very small tables, you can end up in a state where one or two workers are busy chugging through those small tables so quickly that any additional workers spend all their time in table_recheck_autovac(), because that takes long enough that the additional workers are never able to “leapfrog” the workers that are doing useful work.
>

As another solution, I've been considering adding a queue of table OIDs
that need to be vacuumed/analyzed in shared memory (e.g. on DSA). Since
all autovacuum workers running on the same database would see a
consistent queue, the issue explained above wouldn't happen, and it
would probably also make it easier to implement prioritization of the
tables being vacuumed, which is sometimes discussed on pgsql-hackers. I
guess it might be worth discussing this idea.
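
For illustration only (none of these names exist in the current code),
the shared structure could be as simple as an atomically consumed array
of OIDs built once per database:

/* Hypothetical sketch of a shared work queue for one database. */
typedef struct AutoVacWorkQueue
{
    pg_atomic_uint32 next;      /* index of next unclaimed entry */
    uint32      ntables;        /* number of entries in oids[] */
    Oid         oids[FLEXIBLE_ARRAY_MEMBER];    /* filled at queue build */
} AutoVacWorkQueue;

/* Claim the next table to process; InvalidOid means the queue is drained. */
static Oid
autovac_claim_next(AutoVacWorkQueue *queue)
{
    uint32      idx = pg_atomic_fetch_add_u32(&queue->next, 1);

    return (idx < queue->ntables) ? queue->oids[idx] : InvalidOid;
}

Because each OID is handed out exactly once, workers would never spend
time rechecking a table that another worker has already claimed.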

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: autovac issue with large number of tables

Jim Nasby
In reply to this post by Masahiko Sawada-2
On 7/27/20 1:51 AM, Masahiko Sawada wrote:

> On Mon, 27 Jul 2020 at 06:43, Nasby, Jim <[hidden email]> wrote:
>> A database with a very large number of tables eligible for autovacuum can result in autovacuum workers “stuck” in a tight loop of table_recheck_autovac() constantly reporting nothing to do for the table. This is because, with a very large number of tables, it takes a while to search the statistics hash to verify that a table still needs to be processed[1]. If a worker spends some time processing a table, when it’s done it can spend a significant amount of time rechecking each table that it identified at launch (I’ve seen a worker in this state for over an hour). A simple work-around in this scenario is to kill the worker; the launcher will quickly fire up a new worker on the same database, and that worker will build a new list of tables.
>>
>> That’s not a complete solution though… if the database contains a large number of very small tables, you can end up in a state where one or two workers are busy chugging through those small tables so quickly that any additional workers spend all their time in table_recheck_autovac(), because that takes long enough that the additional workers are never able to “leapfrog” the workers that are doing useful work.
>>
> As another solution, I've been considering adding a queue of table OIDs
> that need to be vacuumed/analyzed in shared memory (e.g. on DSA). Since
> all autovacuum workers running on the same database would see a
> consistent queue, the issue explained above wouldn't happen, and it
> would probably also make it easier to implement prioritization of the
> tables being vacuumed, which is sometimes discussed on pgsql-hackers. I
> guess it might be worth discussing this idea.
I'm in favor of trying to improve scheduling (especially allowing users
to control how things are scheduled), but that's a far more invasive
patch. I'd like to get something like this patch in without waiting on a
significantly larger effort.



Re: autovac issue with large number of tables

Kasahara Tatsuhito
Hi,

On Tue, Jul 28, 2020 at 3:49 AM Jim Nasby <[hidden email]> wrote:
> I'm in favor of trying to improve scheduling (especially allowing users
> to control how things are scheduled), but that's a far more invasive
> patch. I'd like to get something like this patch in without waiting on a
> significantly larger effort.

BTW, have you tried the patch suggested in the thread below?

https://www.postgresql.org/message-id/20180629.173418.190173462.horiguchi.kyotaro%40lab.ntt.co.jp

That thread proposes managing statistics in shared memory rather than
in a file, and I think this feature may mitigate your problem. The
feature has performance challenges of its own, but it might be worth a
try. That patch will also require a great deal of effort to get into
PostgreSQL core, but I'm curious to see how well it works for this
problem.

Best regards,

--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com



Re: autovac issue with large number of tables

Jim Nasby

On 7/31/20 1:26 AM, Kasahara Tatsuhito wrote:

> On Tue, Jul 28, 2020 at 3:49 AM Jim Nasby <[hidden email]> wrote:
>> I'm in favor of trying to improve scheduling (especially allowing users
>> to control how things are scheduled), but that's a far more invasive
>> patch. I'd like to get something like this patch in without waiting on a
>> significantly larger effort.
>
> BTW, have you tried the patch suggested in the thread below?
>
> https://www.postgresql.org/message-id/20180629.173418.190173462.horiguchi.kyotaro%40lab.ntt.co.jp
>
> That thread proposes managing statistics in shared memory rather than
> in a file, and I think this feature may mitigate your problem. The
> feature has performance challenges of its own, but it might be worth a
> try. That patch will also require a great deal of effort to get into
> PostgreSQL core, but I'm curious to see how well it works for this
> problem.

Without reading the 100+ emails or the 260k patch, I'm guessing that it won't help, because the problem I observed was spending most of its time in

  42.62%  postgres          [.] hash_search_with_hash_value

I don't see how moving things to shared memory would help that at all.

BTW, when it comes to getting away from using files to store stats, IMHO the best first pass on that is to put hooks in place to allow an extension to replace/supplement different parts of the existing stats infrastructure.
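
For example (purely hypothetical — no such hook exists today; only
pgstat_fetch_stat_tabentry() itself is an existing function), the usual
PostgreSQL hook pattern would look something like:

/* Hypothetical hook letting an extension override stats lookups. */
typedef PgStat_StatTabEntry *(*pgstat_fetch_tabentry_hook_type) (Oid relid);
pgstat_fetch_tabentry_hook_type pgstat_fetch_tabentry_hook = NULL;

PgStat_StatTabEntry *
pgstat_fetch_stat_tabentry(Oid relid)
{
    /* An extension can serve the entry from its own store... */
    if (pgstat_fetch_tabentry_hook != NULL)
        return pgstat_fetch_tabentry_hook(relid);

    /*
     * ...otherwise fall back to the existing file-backed lookup,
     * imagined here as a split-out helper.
     */
    return pgstat_fetch_stat_tabentry_file(relid);
}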


Re: autovac issue with large number of tables

Tom Lane-2
Jim Nasby <[hidden email]> writes:
> Without reading the 100+ emails or the 260k patch, I'm guessing that it
> won't help because the problem I observed was spending most of its time in
>    42.62% postgres          [.] hash_search_with_hash_value
> I don't see how moving things to shared memory would help that at all.

So I'm a bit mystified as to why that would show up as the primary cost.
It looks to me like we force a re-read of the pgstats data each time
through table_recheck_autovac(), and it seems like the costs associated
with that would swamp everything else in the case you're worried about.

I suspect that the bulk of the hash_search_with_hash_value costs are
HASH_ENTER calls caused by repopulating the pgstats hash table, rather
than the single read probe that table_recheck_autovac itself will do.
It's still surprising that that would dominate the other costs of reading
the data, but maybe those costs just aren't as well localized in the code.
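
For reference, the reader side has roughly this shape (a simplified
sketch of the table-entry loop in pgstat.c, not the literal code), which
also lines up with the _IO_fread/_IO_getc/memcpy entries in the profile:

/*
 * Simplified sketch: each forced re-read rebuilds the per-database table
 * hash, doing one HASH_ENTER probe plus one memcpy per entry in the file.
 */
for (;;)
{
    PgStat_StatTabEntry tabbuf;
    PgStat_StatTabEntry *entry;
    bool        found;

    if (fgetc(fpin) != 'T')     /* 'T' marks a table entry */
        break;                  /* anything else ends the list */
    if (fread(&tabbuf, 1, sizeof(tabbuf), fpin) != sizeof(tabbuf))
        break;

    entry = (PgStat_StatTabEntry *) hash_search(tabhash,
                                                &tabbuf.tableid,
                                                HASH_ENTER, &found);
    memcpy(entry, &tabbuf, sizeof(tabbuf));
}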

So I think Kasahara-san's point is that the shared memory stats collector
might wipe out those costs, depending on how it's implemented.  (I've not
looked at that patch in a long time either, so I don't know how much it'd
cut the reader-side costs.  But maybe it'd be substantial.)

In the meantime, though, do we want to do something else to alleviate
the issue?  I realize you only described your patch as a PoC, but I
can't say I like it much:

* Giving up after we've wasted 1000 pgstats re-reads seems like locking
the barn door only after the horse is well across the state line.

* I'm not convinced that the business with skipping N entries at a time
buys anything.  You'd have to make pretty strong assumptions about the
workers all processing tables at about the same rate to believe it will
help.  In the worst case, it might lead to all the workers ignoring the
same table(s).

I think the real issue here is autovac_refresh_stats's insistence that it
shouldn't throttle pgstats re-reads in workers.  I see the point about not
wanting to repeat vacuum work on the basis of stale data, but still ...
I wonder if we could have table_recheck_autovac do two probes of the stats
data.  First probe the existing stats data, and if it shows the table to
be already vacuumed, return immediately.  If not, *then* force a stats
re-read, and check a second time.
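
In rough code (recheck_one_relation() is an invented helper here, and
the real table_recheck_autovac() signature carries more arguments):

/* Sketch of the two-probe recheck. */
autovac_table *
table_recheck_autovac(Oid relid)
{
    autovac_table *tab;

    /* Probe 1: use the stats snapshot we already have, however stale. */
    tab = recheck_one_relation(relid);
    if (tab == NULL)
        return NULL;            /* already vacuumed; skip the re-read */

    /* Probe 2: only now pay for a forced pgstats re-read, and re-decide. */
    autovac_refresh_stats();
    return recheck_one_relation(relid);
}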

BTW, can you provide a test script that reproduces the problem you're
looking at?  The rest of us are kind of guessing at what's happening.

                        regards, tom lane



Re: autovac issue with large number of tables

Kasahara Tatsuhito
Hi,

On Wed, Aug 12, 2020 at 2:46 AM Tom Lane <[hidden email]> wrote:
> So I think Kasahara-san's point is that the shared memory stats collector
> might wipe out those costs, depending on how it's implemented.  (I've not
> looked at that patch in a long time either, so I don't know how much it'd
> cut the reader-side costs.  But maybe it'd be substantial.)
Thanks for your clarification, that's what I wanted to say.
Sorry for the lack of explanation.

> I think the real issue here is autovac_refresh_stats's insistence that it
> shouldn't throttle pgstats re-reads in workers.
I agree with that.

> I wonder if we could have table_recheck_autovac do two probes of the stats
> data.  First probe the existing stats data, and if it shows the table to
> be already vacuumed, return immediately.  If not, *then* force a stats
> re-read, and check a second time.
Does the above mean that the second and subsequent calls to
table_recheck_autovac() would be improved to first check using the
previously refreshed statistics? I think that would certainly work.

If that's correct, I'll try to create a patch for the PoC.

Best regards,

--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com



Re: autovac issue with large number of tables

Kasahara Tatsuhito
Hi,

On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
<[hidden email]> wrote:
> > I wonder if we could have table_recheck_autovac do two probes of the stats
> > data.  First probe the existing stats data, and if it shows the table to
> > be already vacuumed, return immediately.  If not, *then* force a stats
> > re-read, and check a second time.
> Does the above mean that the second and subsequent calls to
> table_recheck_autovac() would be improved to first check using the
> previously refreshed statistics? I think that would certainly work.
>
> If that's correct, I'll try to create a patch for the PoC.

I still don't know how to reproduce Jim's troubles, but I was able to reproduce
what was probably a very similar problem.

This problem seems more likely to occur when you have a large number of
tables (i.e., a large amount of stats) and many small tables need
VACUUM at the same time.

So I followed Tom's advice and created a PoC patch. This patch enables
a flag in table_recheck_autovac(): once a recheck performed right after
a stats refresh finds that VACUUM (or ANALYZE) has already been done by
another worker, subsequent rechecks first check against the existing
stats. If a table still appears to need VACUUM based on the existing
stats, a refresh is forced before deciding, instead of relying on the
existing statistics. A rough sketch of the control flow is below.
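
To be concrete (illustration only — recheck_one_relation() is an
invented name, and the real function carries more arguments and state
than shown; this is not the patch verbatim):

/* Sketch of the PoC's flag-gated recheck. */
static bool av_reuse_stats = false; /* per-worker "reuse existing stats" flag */

autovac_table *
table_recheck_autovac(Oid relid)
{
    autovac_table *tab;

    if (av_reuse_stats)
    {
        /* Cheap first check against the existing stats snapshot. */
        tab = recheck_one_relation(relid);
        if (tab == NULL)
            return NULL;        /* another worker already handled it */
    }

    /* Table still looks like it needs work: force a stats re-read. */
    autovac_refresh_stats();
    tab = recheck_one_relation(relid);
    if (tab == NULL)
        av_reuse_stats = true;  /* fresh stats showed the work was done */
    return tab;
}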

I did a simple test with HEAD and with HEAD + this PoC patch. The tests
were conducted for two cases.
(I changed a few configuration settings; see the attached scripts.)

1. Normal VACUUM case
  - SET autovacuum = off
  - CREATE tables with 100 rows each
  - DELETE 90 rows from each table
  - SET autovacuum = on and restart PostgreSQL
  - Measure the time it takes for all tables to be VACUUMed

2. Anti-wraparound VACUUM case
  - CREATE blank tables
  - SELECT from all of these tables (to generate stats)
  - SET autovacuum_freeze_max_age to a low value and restart PostgreSQL
  - Consume a lot of XIDs using txid_current()
  - Measure the time it takes for all tables to be VACUUMed

For each test case, the following results were obtained by varying
autovacuum_max_workers across 1, 2, 3 (default), 5, and 10, and the
number of tables across 1000, 5000, 10000, and 20000.

Due to the limited VM environment (2 vCPU/4 GB), the results are a
little unstable, but I think they are enough to show the trend.

===========================================================================
[1. Normal VACUUM case]
 tables:1000
  autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
  autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
  autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
  autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
  autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec

 tables:5000
  autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
  autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
  autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
  autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
  autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec

 tables:10000
  autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
  autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
  autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
  autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
  autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec

 tables:20000
  autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
  autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  229 sec
  autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  191 sec
  autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  147 sec
  autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  113 sec

[2. Anti-wraparound VACUUM case]
 tables:1000
  autovacuum_max_workers 1:   (HEAD) 19 sec VS (with patch) 18 sec
  autovacuum_max_workers 2:   (HEAD) 14 sec VS (with patch) 15 sec
  autovacuum_max_workers 3:   (HEAD) 14 sec VS (with patch) 14 sec
  autovacuum_max_workers 5:   (HEAD) 14 sec VS (with patch) 16 sec
  autovacuum_max_workers 10:  (HEAD) 16 sec VS (with patch) 14 sec

 tables:5000
  autovacuum_max_workers 1:   (HEAD) 69 sec VS (with patch) 69 sec
  autovacuum_max_workers 2:   (HEAD) 66 sec VS (with patch) 47 sec
  autovacuum_max_workers 3:   (HEAD) 59 sec VS (with patch) 37 sec
  autovacuum_max_workers 5:   (HEAD) 39 sec VS (with patch) 28 sec
  autovacuum_max_workers 10:  (HEAD) 39 sec VS (with patch) 29 sec

 tables:10000
  autovacuum_max_workers 1:   (HEAD) 139 sec VS (with patch) 138 sec
  autovacuum_max_workers 2:   (HEAD) 130 sec VS (with patch)  86 sec
  autovacuum_max_workers 3:   (HEAD) 120 sec VS (with patch)  68 sec
  autovacuum_max_workers 5:   (HEAD)  96 sec VS (with patch)  41 sec
  autovacuum_max_workers 10:  (HEAD)  90 sec VS (with patch)  39 sec

 tables:20000
  autovacuum_max_workers 1:   (HEAD) 313 sec VS (with patch) 331 sec
  autovacuum_max_workers 2:   (HEAD) 209 sec VS (with patch) 201 sec
  autovacuum_max_workers 3:   (HEAD) 227 sec VS (with patch) 141 sec
  autovacuum_max_workers 5:   (HEAD) 236 sec VS (with patch)  88 sec
  autovacuum_max_workers 10:  (HEAD) 309 sec VS (with patch)  74 sec
===========================================================================

In the cases without the patch, worker scalability decreases as the
number of tables increases; in fact, with many tables, adding workers
makes it take longer to complete VACUUM of all tables.
The cases with the patch show good scalability with respect to the
number of workers.

Note that perf top results showed that hash_search_with_hash_value,
hash_seq_search and pgstat_read_statsfiles are dominant during VACUUM
in all patterns, with or without the patch.

Therefore, there is still a need to find ways to optimize the reading
of large amounts of stats. However, this patch is effective in its own
right, and since it modifies only a few parts, I think it could be
applied to current (and preferably pre-v13) PostgreSQL.

The patch and reproduction scripts are attached.

Thoughts?

Best regards,

--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

v1_mod_table_recheck_autovac.patch (4K)
normal_vacuum_case_sample.sh (2K)
wrap-round_vacuum_case_sample.sh (2K)

Re: autovac issue with large number of tables

Kasahara Tatsuhito
Hi.

I ran the same test with a patch [1] that manages the statistics in
shared memory. This patch is expected to reduce the burden of
refreshing large amounts of stats.

The following results were obtained. (The results for HEAD are the
same as in my previous post.)

========================================================================================
[1. Normal VACUUM case]
 tables:1000
  autovacuum_max_workers 1:   (HEAD) 20 sec VS (with shared_base_stats patch)  8 sec
  autovacuum_max_workers 2:   (HEAD) 18 sec VS (with shared_base_stats patch)  8 sec
  autovacuum_max_workers 3:   (HEAD) 18 sec VS (with shared_base_stats patch)  8 sec
  autovacuum_max_workers 5:   (HEAD) 19 sec VS (with shared_base_stats patch)  9 sec
  autovacuum_max_workers 10:  (HEAD) 19 sec VS (with shared_base_stats patch)  9 sec

 tables:5000
  autovacuum_max_workers 1:   (HEAD) 77 sec VS (with shared_base_stats patch) 13 sec
  autovacuum_max_workers 2:   (HEAD) 61 sec VS (with shared_base_stats patch) 12 sec
  autovacuum_max_workers 3:   (HEAD) 38 sec VS (with shared_base_stats patch) 13 sec
  autovacuum_max_workers 5:   (HEAD) 45 sec VS (with shared_base_stats patch) 12 sec
  autovacuum_max_workers 10:  (HEAD) 43 sec VS (with shared_base_stats patch) 12 sec

 tables:10000
  autovacuum_max_workers 1:   (HEAD) 152 sec VS (with shared_base_stats patch) 18 sec
  autovacuum_max_workers 2:   (HEAD) 119 sec VS (with shared_base_stats patch) 25 sec
  autovacuum_max_workers 3:   (HEAD)  87 sec VS (with shared_base_stats patch) 28 sec
  autovacuum_max_workers 5:   (HEAD) 100 sec VS (with shared_base_stats patch) 28 sec
  autovacuum_max_workers 10:  (HEAD)  97 sec VS (with shared_base_stats patch) 29 sec

 tables:20000
  autovacuum_max_workers 1:   (HEAD) 338 sec VS (with shared_base_stats patch) 27 sec
  autovacuum_max_workers 2:   (HEAD) 231 sec VS (with shared_base_stats patch) 54 sec
  autovacuum_max_workers 3:   (HEAD) 220 sec VS (with shared_base_stats patch) 67 sec
  autovacuum_max_workers 5:   (HEAD) 234 sec VS (with shared_base_stats patch) 75 sec
  autovacuum_max_workers 10:  (HEAD) 320 sec VS (with shared_base_stats patch) 83 sec

[2. Anti-wraparound VACUUM case]
 tables:1000
  autovacuum_max_workers 1:   (HEAD) 19 sec VS (with shared_base_stats patch)  6 sec
  autovacuum_max_workers 2:   (HEAD) 14 sec VS (with shared_base_stats patch)  7 sec
  autovacuum_max_workers 3:   (HEAD) 14 sec VS (with shared_base_stats patch)  6 sec
  autovacuum_max_workers 5:   (HEAD) 14 sec VS (with shared_base_stats patch)  6 sec
  autovacuum_max_workers 10:  (HEAD) 16 sec VS (with shared_base_stats patch)  7 sec

 tables:5000
  autovacuum_max_workers 1:   (HEAD) 69 sec VS (with shared_base_stats patch)  8 sec
  autovacuum_max_workers 2:   (HEAD) 66 sec VS (with shared_base_stats patch)  8 sec
  autovacuum_max_workers 3:   (HEAD) 59 sec VS (with shared_base_stats patch)  8 sec
  autovacuum_max_workers 5:   (HEAD) 39 sec VS (with shared_base_stats patch)  9 sec
  autovacuum_max_workers 10:  (HEAD) 39 sec VS (with shared_base_stats patch)  8 sec

 tables:10000
  autovacuum_max_workers 1:   (HEAD) 139 sec VS (with shared_base_stats patch)  9 sec
  autovacuum_max_workers 2:   (HEAD) 130 sec VS (with shared_base_stats patch)  9 sec
  autovacuum_max_workers 3:   (HEAD) 120 sec VS (with shared_base_stats patch)  9 sec
  autovacuum_max_workers 5:   (HEAD)  96 sec VS (with shared_base_stats patch)  8 sec
  autovacuum_max_workers 10:  (HEAD)  90 sec VS (with shared_base_stats patch)  9 sec

 tables:20000
  autovacuum_max_workers 1:   (HEAD) 313 sec VS (with shared_base_stats patch) 12 sec
  autovacuum_max_workers 2:   (HEAD) 209 sec VS (with shared_base_stats patch) 12 sec
  autovacuum_max_workers 3:   (HEAD) 227 sec VS (with shared_base_stats patch) 12 sec
  autovacuum_max_workers 5:   (HEAD) 236 sec VS (with shared_base_stats patch) 11 sec
  autovacuum_max_workers 10:  (HEAD) 309 sec VS (with shared_base_stats patch) 12 sec
========================================================================================

This patch provided a very nice speedup in both cases. However, in
case 1, when the number of tables is large, the time required increases
as the number of workers increases. Whether this is due to CPU and I/O
contention or to characteristics of the patch is not yet known.
Nevertheless, at least the problems associated with
table_recheck_autovac() appear to have been resolved.

So I hope this patch [1] will be committed, both for its original
purpose and to improve autovacuum in v14 and later.

The other patch I submitted (v1_mod_table_recheck_autovac.patch) is
useful for slightly improving autovacuum in PostgreSQL 13 and earlier.
Is it worth backporting it to PostgreSQL 13 and earlier?

Best regards,

[1] https://www.postgresql.org/message-id/20200908.175557.617150409868541587.horikyota.ntt%40gmail.com

--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com