autovac issue with large number of tables


autovac issue with large number of tables

Jim Nasby

A database with a very large number of tables eligible for autovacuum can result in autovacuum workers “stuck” in a tight loop of table_recheck_autovac() constantly reporting nothing to do on the table. This happens because with so many tables it takes a while to search the statistics hash to verify that a table still needs to be processed[1]. If a worker spends some time processing a table, when it’s done it can spend a significant amount of time rechecking each table that it identified at launch (I’ve seen a worker in this state for over an hour). A simple work-around in this scenario is to kill the worker; the launcher will quickly fire up a new worker on the same database, and that worker will build a new list of tables.

That’s not a complete solution though… if the database contains a large number of very small tables, you can end up in a state where one or two workers are busy chugging through those small tables so quickly that any additional workers spend all their time in table_recheck_autovac(), because the recheck takes long enough that the additional workers are never able to “leapfrog” the workers that are doing useful work.

PoC patch attached.
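
For anyone who doesn't have the code handy, here is a minimal standalone
sketch of the failure mode (hypothetical names throughout; in PostgreSQL
itself the loop lives in do_autovacuum() and the recheck in
table_recheck_autovac()). To be clear, this illustrates the problem; it
is not the attached patch:

/*
 * Standalone sketch of the failure mode; all names are hypothetical
 * stand-ins for do_autovacuum() and table_recheck_autovac().
 */
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

/*
 * Stand-in for re-reading the stats snapshot and probing the statistics
 * hash.  With millions of tables this probe dominates the profile below
 * (hash_search_with_hash_value plus the fread/getc of the stats file),
 * and it usually answers "nothing to do".
 */
static bool
recheck_table_needs_vacuum(Oid relid)
{
    (void) relid;
    return false;               /* stub: another worker got there first */
}

static void
vacuum_one_table(Oid relid)
{
    (void) relid;               /* stub: the actual useful work */
}

static void
worker_main(const Oid *table_oids, size_t ntables)
{
    /*
     * The candidate list is built ONCE, at worker launch.  If other
     * workers vacuum most of these tables in the meantime...
     */
    for (size_t i = 0; i < ntables; i++)
    {
        /*
         * ...every remaining entry still costs a full recheck, so a
         * worker can sit in this loop for a long time doing nothing but
         * rechecks that report "nothing to do".
         */
        if (!recheck_table_needs_vacuum(table_oids[i]))
            continue;
        vacuum_one_table(table_oids[i]);
    }
}

int
main(void)
{
    static Oid table_oids[1000000];     /* a very large number of tables */

    worker_main(table_oids, sizeof(table_oids) / sizeof(table_oids[0]));
    return 0;
}

The work-around works because the replacement worker builds its candidate
list from current statistics, so it skips everything that was vacuumed in
the meantime.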

 

1: top hits from `perf top -p xxx` on an affected worker

Samples: 72K of event 'cycles', Event count (approx.): 17131910436

Overhead  Shared Object     Symbol
  42.62%  postgres          [.] hash_search_with_hash_value
  10.34%  libc-2.17.so      [.] __memcpy_sse2
   6.99%  [kernel]          [k] copy_user_enhanced_fast_string
   4.73%  libc-2.17.so      [.] _IO_fread
   3.91%  postgres          [.] 0x00000000002d6478
   2.95%  libc-2.17.so      [.] _IO_getc
   2.44%  libc-2.17.so      [.] _IO_file_xsgetn
   1.73%  postgres          [.] hash_search
   1.65%  [kernel]          [k] find_get_entry
   1.10%  postgres          [.] hash_uint32
   0.99%  libc-2.17.so      [.] __memcpy_ssse3_back

autovac.patch (4K)

Re: autovac issue with large number of tables

Masahiko Sawada
On Mon, 27 Jul 2020 at 06:43, Nasby, Jim <[hidden email]> wrote:
>
> A database with a very large number of tables eligible for autovacuum can result in autovacuum workers “stuck” in a tight loop of table_recheck_autovac() constantly reporting nothing to do on the table. This happens because with so many tables it takes a while to search the statistics hash to verify that a table still needs to be processed[1]. If a worker spends some time processing a table, when it’s done it can spend a significant amount of time rechecking each table that it identified at launch (I’ve seen a worker in this state for over an hour). A simple work-around in this scenario is to kill the worker; the launcher will quickly fire up a new worker on the same database, and that worker will build a new list of tables.
>
> That’s not a complete solution though… if the database contains a large number of very small tables, you can end up in a state where one or two workers are busy chugging through those small tables so quickly that any additional workers spend all their time in table_recheck_autovac(), because the recheck takes long enough that the additional workers are never able to “leapfrog” the workers that are doing useful work.
>

As another solution, I've been considering adding a queue of table
OIDs that need to be vacuumed/analyzed in shared memory (i.e. in
DSA). Since all autovacuum workers running on the same database would
see a consistent queue, the issue explained above wouldn't happen,
and it would probably make it easier to implement prioritization of
the tables being vacuumed, which is sometimes discussed on
pgsql-hackers. I think this idea might be worth discussing as well.
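
A rough sketch of what I mean, with a pthread mutex and a fixed array
standing in for an LWLock and DSA memory (all names are hypothetical,
not a proposed API):

#include <pthread.h>
#include <stdbool.h>

typedef unsigned int Oid;

#define QUEUE_CAPACITY 65536

typedef struct AutovacQueue
{
    pthread_mutex_t lock;       /* an LWLock in a real implementation */
    unsigned int    head;       /* next entry to hand out */
    unsigned int    tail;       /* one past the last entry enqueued */
    Oid             oids[QUEUE_CAPACITY];
} AutovacQueue;

/* Filled once per database, e.g. by the first worker to arrive. */
static bool
queue_push(AutovacQueue *q, Oid relid)
{
    bool        ok = false;

    pthread_mutex_lock(&q->lock);
    if (q->tail < QUEUE_CAPACITY)
    {
        q->oids[q->tail++] = relid;
        ok = true;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/*
 * Pop the next table to process.  Because every worker consumes from the
 * same queue, no table is handed to two workers and nobody re-scans
 * tables that are already done, so the per-table recheck loop goes away.
 */
static bool
queue_pop(AutovacQueue *q, Oid *relid)
{
    bool        found = false;

    pthread_mutex_lock(&q->lock);
    if (q->head != q->tail)
    {
        *relid = q->oids[q->head++];
        found = true;
    }
    pthread_mutex_unlock(&q->lock);
    return found;
}

int
main(void)
{
    static AutovacQueue q = { .lock = PTHREAD_MUTEX_INITIALIZER };
    Oid         relid;

    for (Oid oid = 16384; oid < 16484; oid++)
        queue_push(&q, oid);

    while (queue_pop(&q, &relid))
        ;                       /* a worker would vacuum relid here */
    return 0;
}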

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: autovac issue with large number of tables

Jim Nasby
On 7/27/20 1:51 AM, Masahiko Sawada wrote:

> On Mon, 27 Jul 2020 at 06:43, Nasby, Jim <[hidden email]> wrote:
>> A database with a very large number of tables eligible for autovacuum can result in autovacuum workers “stuck” in a tight loop of table_recheck_autovac() constantly reporting nothing to do on the table. This happens because with so many tables it takes a while to search the statistics hash to verify that a table still needs to be processed[1]. If a worker spends some time processing a table, when it’s done it can spend a significant amount of time rechecking each table that it identified at launch (I’ve seen a worker in this state for over an hour). A simple work-around in this scenario is to kill the worker; the launcher will quickly fire up a new worker on the same database, and that worker will build a new list of tables.
>>
>> That’s not a complete solution though… if the database contains a large number of very small tables, you can end up in a state where one or two workers are busy chugging through those small tables so quickly that any additional workers spend all their time in table_recheck_autovac(), because the recheck takes long enough that the additional workers are never able to “leapfrog” the workers that are doing useful work.
>>
> As another solution, I've been considering adding a queue of table
> OIDs that need to be vacuumed/analyzed in shared memory (i.e. in
> DSA). Since all autovacuum workers running on the same database would
> see a consistent queue, the issue explained above wouldn't happen,
> and it would probably make it easier to implement prioritization of
> the tables being vacuumed, which is sometimes discussed on
> pgsql-hackers. I think this idea might be worth discussing as well.
I'm in favor of trying to improve scheduling (especially allowing users
to control how things are scheduled), but that's a far more invasive
patch. I'd like to get something like this patch in without waiting on a
significantly larger effort.



Re: autovac issue with large number of tables

Kasahara Tatsuhito
Hi,

On Tue, Jul 28, 2020 at 3:49 AM Jim Nasby <[hidden email]> wrote:
> I'm in favor of trying to improve scheduling (especially allowing users
> to control how things are scheduled), but that's a far more invasive
> patch. I'd like to get something like this patch in without waiting on a
> significantly larger effort.

BTW, have you tried the patch suggested in the thread below?

https://www.postgresql.org/message-id/20180629.173418.190173462.horiguchi.kyotaro%40lab.ntt.co.jp

That patch proposes managing statistics in shared memory rather than
in a file, and I think it may mitigate your problem. The feature has
performance challenges of its own, and getting it into PostgreSQL
core will also require a great deal of effort, but I'm curious to see
how well it works for this problem.
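
To make the contrast concrete, here is a toy standalone comparison
(hypothetical names and file format; this is not the API of the patch
above). With file-based stats, a recheck that wants fresh numbers has
to re-read the whole snapshot, while shared-memory stats make it a
single probe:

#include <stdio.h>

typedef unsigned int Oid;

typedef struct StatEntry
{
    Oid     relid;
    long    dead_tuples;
} StatEntry;

/*
 * File-based stats: O(total tables) work per fresh snapshot.  This is
 * the _IO_fread/_IO_getc plus hash_search_with_hash_value cost in the
 * perf profile upthread.
 */
static int
recheck_via_file(const char *path, Oid relid, StatEntry *out)
{
    FILE       *fp = fopen(path, "rb");
    StatEntry   e;
    int         found = 0;

    if (fp == NULL)
        return 0;
    while (fread(&e, sizeof(e), 1, fp) == 1)    /* every entry, every time */
    {
        if (e.relid == relid)
        {
            *out = e;
            found = 1;
        }
    }
    fclose(fp);
    return found;
}

/*
 * Shared-memory stats: the table is already resident and indexed, so a
 * recheck is a single O(1) probe.
 */
static int
recheck_via_shmem(const StatEntry *table, size_t nbuckets,
                  Oid relid, StatEntry *out)
{
    const StatEntry *e = &table[relid % nbuckets];  /* toy "hash" probe */

    if (e->relid == relid)
    {
        *out = *e;
        return 1;
    }
    return 0;
}

int
main(void)
{
    static StatEntry shmem[1 << 20];    /* toy resident stats table */
    StatEntry   result;
    Oid         relid = 16384;

    shmem[relid % (1 << 20)] = (StatEntry) {relid, 42};
    if (recheck_via_shmem(shmem, 1 << 20, relid, &result))
        printf("dead tuples for %u: %ld\n", result.relid, result.dead_tuples);

    if (!recheck_via_file("stats.snapshot", relid, &result))
        printf("(no stats.snapshot file in this toy example)\n");
    return 0;
}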

Best regards,

--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com