autovacuum big table taking hours and sometimes seconds

classic Classic list List threaded Threaded
21 messages Options
12
Reply | Threaded
Open this post in threaded view
|

autovacuum big table taking hours and sometimes seconds

Mariel Cherkassky
Hi,
I have a table with a bytea column and its size is huge and thats why postgres created a toasted table for that column. The original table contains about 1K-10K rows but the toasted can contain up to 20M rows. I assigned the next two settings for the toasted table : 

 alter table orig_table set (toast.autovacuum_vacuum_scale_factor = 0);

  alter table orig_table set (toast.autovacuum_vacuum_threshold  =10000);


Therefore I hoped that after deletion of 10K rows from the toasted table autovacuum will launch vacuum on the toasted table.

From the logs I see that sometimes the autovacuum is running once in a few hours (3-4 hours) and sometimes it runs even every few minutes.

Now I wanted to check if only depends on the thresholds and on the frequency of the deletes/updates on the table ? In some cases the autovacuum is taking a few hours (4+) it finishes and then immediatly is starting to run vacuum again on  the table : 

2019-01-29 07:10:58 EST  14083  LOG:  automatic vacuum of table "db.pg_toast.pg_toast_14430": index scans: 3

pages: 1672 removed, 7085056 remain

tuples: 6706885 removed, 2023847 remain

buffer usage: 4808221 hits, 6404148 misses, 6152603 dirtied

avg read rate: 2.617 MiB/s, avg write rate: 2.514 MiB/s

system usage: CPU 148.65s/70.06u sec elapsed 19119.55 sec 

This run took 19119 sec ~ 5 hours


2019-01-29 10:05:45 EST  11985  LOG:  automatic vacuum of table "db.pg_toast.pg_toast_14430": index scans: 2

pages: 2752 removed, 7082304 remain

tuples: 3621620 removed, 1339540 remain

buffer usage: 2655076 hits, 3506964 misses, 3333423 dirtied

avg read rate: 2.638 MiB/s, avg write rate: 2.508 MiB/s

system usage: CPU 71.22s/37.65u sec elapsed 10384.93 sec


this run took 10384 sec ~ 2.88 hours.


the diff between the summaries is 3 hours and the second run took 2.88 hours which means that the autovacuum launched vacuum on the table a few minutes after the first vacuum has finished.


In addition, as I said sometimes if runs very often : 

2019-02-04 09:26:23 EST  14735  LOG:  automatic vacuum of table "db.pg_toast.pg_toast_14430": index scans: 1

pages: 1760 removed, 11149568 remain

tuples: 47870 removed, 4929452 remain

buffer usage: 200575 hits, 197394 misses, 24264 dirtied

avg read rate: 5.798 MiB/s, avg write rate: 0.713 MiB/s

system usage: CPU 1.55s/1.38u sec elapsed 265.96 sec


2019-02-04 09:32:57 EST  26171  LOG:  automatic vacuum of table "db.pg_toast.pg_toast_14430": index scans: 1

pages: 2144 removed, 11147424 remain

tuples: 55484 removed, 4921526 remain

buffer usage: 196811 hits, 209267 misses, 34471 dirtied

avg read rate: 5.459 MiB/s, avg write rate: 0.899 MiB/s

system usage: CPU 1.73s/1.54u sec elapsed 299.50 sec


Now the question is how to handle or tune it ? Is there any change that I need to increase the cost_limit / cost_delay ?

Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Laurenz Albe
On Wed, 2019-02-06 at 12:29 +0200, Mariel Cherkassky wrote:

> Hi,
> I have a table with a bytea column and its size is huge and thats why postgres created a toasted table for that column.
> The original table contains about 1K-10K rows but the toasted can contain up to 20M rows.
> I assigned the next two settings for the toasted table :
>  alter table orig_table set (toast.autovacuum_vacuum_scale_factor = 0);
>   alter table orig_table set (toast.autovacuum_vacuum_threshold  =10000);
>
> Therefore I hoped that after deletion of 10K rows from the toasted table autovacuum will launch vacuum on the toasted table.
> From the logs I see that sometimes the autovacuum is running once in a few hours (3-4 hours) and sometimes it runs even every few minutes.
> Now I wanted to check if only depends on the thresholds and on the frequency of the deletes/updates on the table ?
> In some cases the autovacuum is taking a few hours (4+) it finishes and then immediatly is starting to run vacuum again on  the table :
>
> Now the question is how to handle or tune it ? Is there any change that I need to increase the cost_limit / cost_delay ?

Maybe configuring autovacuum to run faster will help:

alter table orig_table set (toast.autovacuum_vacuum_cost_limit = 2000);

Or, more extreme:

alter table orig_table set (toast.autovacuum_vacuum_cost_delay = 0);

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com


Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

David Rowley-3
On Thu, 7 Feb 2019 at 00:17, Laurenz Albe <[hidden email]> wrote:

>
> On Wed, 2019-02-06 at 12:29 +0200, Mariel Cherkassky wrote:
> > Now the question is how to handle or tune it ? Is there any change that I need to increase the cost_limit / cost_delay ?
>
> Maybe configuring autovacuum to run faster will help:
>
> alter table orig_table set (toast.autovacuum_vacuum_cost_limit = 2000);
>
> Or, more extreme:
>
> alter table orig_table set (toast.autovacuum_vacuum_cost_delay = 0);

Going by the block hits/misses/dirtied and the mentioned vacuum times,
it looks like auto-vacuum is set to the standard settings and if so it
spent about 100% of its time sleeping on the job.

It might be a better idea to consider changing the vacuum settings
globally rather than just for one table.

Running a vacuum_cost_limit of 200 is likely something you'd not want
to ever do with modern hardware... well maybe unless you just bought
the latest Raspberry PI, or something.  You should be tuning that
value to something that runs your vacuums to a speed you're happy with
but leaves enough IO and CPU for queries running on the database.

If you see that all auto-vacuum workers are busy more often than not,
then they're likely running too slowly and should be set to run more
quickly.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Mariel Cherkassky
Hey,
As I said, I set the next settings for the toasted table : 

 alter table orig_table set (toast.autovacuum_vacuum_scale_factor = 0);

  alter table orig_table set (toast.autovacuum_vacuum_threshold  =10000);

Can you explain a little bit more why you decided that the autovacuum spent it time on sleeping ?

I see the autovacuum statistics from the logs, how can I check that the workers are busy very often ?

My vacuum limit is 200 right now, basically If vacuum runs on my toasted table and reached 200 but it didnt finish to clean all the dead tuples, after the nap, should it continue cleaning it or wait until the vacuum_threshold hit again ?


‫בתאריך יום ד׳, 6 בפבר׳ 2019 ב-15:05 מאת ‪David Rowley‬‏ <‪[hidden email]‬‏>:‬
On Thu, 7 Feb 2019 at 00:17, Laurenz Albe <[hidden email]> wrote:
>
> On Wed, 2019-02-06 at 12:29 +0200, Mariel Cherkassky wrote:
> > Now the question is how to handle or tune it ? Is there any change that I need to increase the cost_limit / cost_delay ?
>
> Maybe configuring autovacuum to run faster will help:
>
> alter table orig_table set (toast.autovacuum_vacuum_cost_limit = 2000);
>
> Or, more extreme:
>
> alter table orig_table set (toast.autovacuum_vacuum_cost_delay = 0);

Going by the block hits/misses/dirtied and the mentioned vacuum times,
it looks like auto-vacuum is set to the standard settings and if so it
spent about 100% of its time sleeping on the job.

It might be a better idea to consider changing the vacuum settings
globally rather than just for one table.

Running a vacuum_cost_limit of 200 is likely something you'd not want
to ever do with modern hardware... well maybe unless you just bought
the latest Raspberry PI, or something.  You should be tuning that
value to something that runs your vacuums to a speed you're happy with
but leaves enough IO and CPU for queries running on the database.

If you see that all auto-vacuum workers are busy more often than not,
then they're likely running too slowly and should be set to run more
quickly.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

dangal
In reply to this post by David Rowley-3
Would it be nice to start changing those values ​​found in the default
postgres.conf so low?



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html

Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Mariel Cherkassky
which one you mean ? I changed the threshold and the scale for the specific table...

‫בתאריך יום ד׳, 6 בפבר׳ 2019 ב-15:36 מאת ‪dangal‬‏ <‪[hidden email]‬‏>:‬
Would it be nice to start changing those values ​​found in the default
postgres.conf so low?



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html

Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Michaeldba@sqlexec.com
Hi all,

In the myriad of articles written about autovacuum tuning, I really like this article by Tomas Vondra of 2ndQuadrant:
https://blog.2ndquadrant.com/autovacuum-tuning-basics/

It is a concise article that touches on all the major aspects of autovacuuming tuning: thresholds, scale factors, throttling, etc.

Regards and happy vacuuming to yas!
Michael Vitale

Wednesday, February 6, 2019 8:41 AM
which one you mean ? I changed the threshold and the scale for the specific table...

Wednesday, February 6, 2019 8:36 AM
Would it be nice to start changing those values ​​found in the default
postgres.conf so low?



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html

Wednesday, February 6, 2019 8:05 AM

Going by the block hits/misses/dirtied and the mentioned vacuum times,
it looks like auto-vacuum is set to the standard settings and if so it
spent about 100% of its time sleeping on the job.

It might be a better idea to consider changing the vacuum settings
globally rather than just for one table.

Running a vacuum_cost_limit of 200 is likely something you'd not want
to ever do with modern hardware... well maybe unless you just bought
the latest Raspberry PI, or something. You should be tuning that
value to something that runs your vacuums to a speed you're happy with
but leaves enough IO and CPU for queries running on the database.

If you see that all auto-vacuum workers are busy more often than not,
then they're likely running too slowly and should be set to run more
quickly.


Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Jeff Janes
In reply to this post by Mariel Cherkassky
On Wed, Feb 6, 2019 at 5:29 AM Mariel Cherkassky <[hidden email]> wrote:
 
Now the question is how to handle or tune it ? Is there any change that I need to increase the cost_limit / cost_delay ?

Sometimes vacuum has more work to do, so it takes more time to do it.

There is no indication of a problem. Or at least, you haven't described one. So, there is nothing to handle or to tune.

If there is a problem, those log entries might help identify it.  But in the absence of a problem, they are just log spam.

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Mariel Cherkassky
Well, basically I'm trying to tune it because the table still keep growing. I thought that by setting the scale and the threshold it will be enough but its seems that it wasnt. I attached some of the logs output to hear what you guys think about it ..

‫בתאריך יום ד׳, 6 בפבר׳ 2019 ב-16:12 מאת ‪Jeff Janes‬‏ <‪[hidden email]‬‏>:‬
On Wed, Feb 6, 2019 at 5:29 AM Mariel Cherkassky <[hidden email]> wrote:
 
Now the question is how to handle or tune it ? Is there any change that I need to increase the cost_limit / cost_delay ?

Sometimes vacuum has more work to do, so it takes more time to do it.

There is no indication of a problem. Or at least, you haven't described one. So, there is nothing to handle or to tune.

If there is a problem, those log entries might help identify it.  But in the absence of a problem, they are just log spam.

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Jeff Janes
On Wed, Feb 6, 2019 at 9:42 AM Mariel Cherkassky <[hidden email]> wrote:
Well, basically I'm trying to tune it because the table still keep growing. I thought that by setting the scale and the threshold it will be enough but its seems that it wasnt. I attached some of the logs output to hear what you guys think about it ..

Are all four log entries from well after you made the change?  My first inclination is to think that the first 2 are from either before the change, or just after the change when it is still settling into the new regime.  Also, is the table still continuing to grow, or is at a new steady-state of bloat which isn't growing but also isn't shrinking back to where you want it to be?  More aggressive vacuuming alone should stop the bloat, but is not likely to reverse it.

I habitually set vacuum_cost_page_hit and vacuum_cost_page_miss to zero.  Page reads are self-limiting (vacuum is single threaded, so you can't have more than one read (times autovacuum_max_workers) going on at a time) so I don't see a need to throttle them intentionally as well--unless your entire db is sitting on one spindle.  Based on the high ratio of read rates to write rates in the last two log entries, this change alone should be enough greatly speed up the run time of the vacuum.

If you need to speed it up beyond that, I don't think it matters much whether you decrease cost_delay or increase cost_limit, it is the ratio that mostly matters.

And if these latter measures do work, you should consider undoing changes to autovacuum_vacuum_scale_factor.  Reading the entire index just to remove 10,000 rows from the table is a lot of extra work that might be unnecessary. Although that extra work might not be on anyone's critical path.
Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

David Rowley-3
In reply to this post by Mariel Cherkassky
On Thu, 7 Feb 2019 at 02:34, Mariel Cherkassky
<[hidden email]> wrote:
> As I said, I set the next settings for the toasted table :
>
>  alter table orig_table set (toast.autovacuum_vacuum_scale_factor = 0);
>
>   alter table orig_table set (toast.autovacuum_vacuum_threshold  =10000);

These settings don't control how fast auto-vacuum runs, just when it should run.

> Can you explain a little bit more why you decided that the autovacuum spent it time on sleeping ?

Yeah, if you look at the following settings.

 vacuum_cost_limit                   | 200
 vacuum_cost_page_dirty              | 20
 vacuum_cost_page_hit                | 1
 vacuum_cost_page_miss               | 10
 autovacuum_vacuum_cost_delay        | 20ms

I've tagged on the default setting for each of these.  Both vacuum and
auto-vacuum keep score of how many points they've accumulated while
running. 20 points for dirtying a page, 10 for a read that's not found
to be in shared_buffers, 1 for reading a buffer from shared buffers.
When vacuum_cost_limit points is reached (or
autovacuum_vacuum_cost_limit if not -1) auto-vacuum sleeps for
autovacuum_vacuum_cost_delay, normal manual vacuums sleep for
vacuum_cost_delay.

In one of the log entries you saw:

> buffer usage: 4808221 hits, 6404148 misses, 6152603 dirtied
> avg read rate: 2.617 MiB/s, avg write rate: 2.514 MiB/s
> system usage: CPU 148.65s/70.06u sec elapsed 19119.55 sec

Doing a bit of maths to see how much time that vacuum should have slept for:

postgres=# select (4808221 * 1 + 6404148 * 10 + 6152603 * 20) / 200.0
* 20 / 1000;
      ?column?
--------------------
 19190.176100000000

That's remarkably close to the actual time of 19119.55 sec. If you do
the same for the other 3 vacuums then you'll see the same close match.

> I see the autovacuum statistics from the logs, how can I check that the workers are busy very often ?

It would be nice if there was something better, but periodically doing:

SELECT count(*) FROM pg_stat_activity where query like 'autovacuum%';

will work.

> My vacuum limit is 200 right now, basically If vacuum runs on my toasted table and reached 200 but it didnt finish to clean all the dead tuples, after the nap, should it continue cleaning it or wait until the vacuum_threshold hit again ?

You're confusing nap time is something else, Maybe you're confusing
that with speed of vacuum?  Napping is just the time auto-vacuum will
wait between checking for new tables to work on.  Having the
auto-vacuum run so slowly is a probable cause of still having dead
tuples after the vacuum... likely because they became dead after
vacuum started.

I'd recommend reading the manual or Tomas Vondra's blog about vacuum
costs. It's not overly complex, once you understand what each of the
vacuum settings does.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Mariel Cherkassky
Just to make sure that I understood : 
-By increasing the cost_limit or decreasing the cost of the page_cost we can decrease the time it takes the autovacuum process to vacuum a specific table. 
-The vacuum threshold/scale are used to decide how often the table will be vacuum and not how long it should take.

I have 3 questions : 
1)To what value do you recommend to increase the vacuum cost_limit ? 2000 seems reasonable ? Or maybe its better to leave it as default and assign a specific value for big tables ?
2)When the autovacuum reaches the cost_limit while trying to vacuum a specific table, it wait nap_time seconds and then it continue to work on the same table ? 
3)So in case I have a table that keeps growing (not fast because I set the vacuum_scale_factor to 0 and the autovacuum_vacuum_threshold to 10000). If the table keep growing it means I should try to increase the cost right ? Do you see any other option ? The table represent sessions of my system so basically from my point of view I should have almost the same amount of sessions every day and the table shouldn't grow dramatically but before changing the vacuum threshold/factor it happened. As I mentioned in my first comment there is a byte column and therefore the toasted table is the problematic here.

‫בתאריך יום ה׳, 7 בפבר׳ 2019 ב-0:34 מאת ‪David Rowley‬‏ <‪[hidden email]‬‏>:‬
On Thu, 7 Feb 2019 at 02:34, Mariel Cherkassky
<[hidden email]> wrote:
> As I said, I set the next settings for the toasted table :
>
>  alter table orig_table set (toast.autovacuum_vacuum_scale_factor = 0);
>
>   alter table orig_table set (toast.autovacuum_vacuum_threshold  =10000);

These settings don't control how fast auto-vacuum runs, just when it should run.

> Can you explain a little bit more why you decided that the autovacuum spent it time on sleeping ?

Yeah, if you look at the following settings.

 vacuum_cost_limit                   | 200
 vacuum_cost_page_dirty              | 20
 vacuum_cost_page_hit                | 1
 vacuum_cost_page_miss               | 10
 autovacuum_vacuum_cost_delay        | 20ms

I've tagged on the default setting for each of these.  Both vacuum and
auto-vacuum keep score of how many points they've accumulated while
running. 20 points for dirtying a page, 10 for a read that's not found
to be in shared_buffers, 1 for reading a buffer from shared buffers.
When vacuum_cost_limit points is reached (or
autovacuum_vacuum_cost_limit if not -1) auto-vacuum sleeps for
autovacuum_vacuum_cost_delay, normal manual vacuums sleep for
vacuum_cost_delay.

In one of the log entries you saw:

> buffer usage: 4808221 hits, 6404148 misses, 6152603 dirtied
> avg read rate: 2.617 MiB/s, avg write rate: 2.514 MiB/s
> system usage: CPU 148.65s/70.06u sec elapsed 19119.55 sec

Doing a bit of maths to see how much time that vacuum should have slept for:

postgres=# select (4808221 * 1 + 6404148 * 10 + 6152603 * 20) / 200.0
* 20 / 1000;
      ?column?
--------------------
 19190.176100000000

That's remarkably close to the actual time of 19119.55 sec. If you do
the same for the other 3 vacuums then you'll see the same close match.

> I see the autovacuum statistics from the logs, how can I check that the workers are busy very often ?

It would be nice if there was something better, but periodically doing:

SELECT count(*) FROM pg_stat_activity where query like 'autovacuum%';

will work.

> My vacuum limit is 200 right now, basically If vacuum runs on my toasted table and reached 200 but it didnt finish to clean all the dead tuples, after the nap, should it continue cleaning it or wait until the vacuum_threshold hit again ?

You're confusing nap time is something else, Maybe you're confusing
that with speed of vacuum?  Napping is just the time auto-vacuum will
wait between checking for new tables to work on.  Having the
auto-vacuum run so slowly is a probable cause of still having dead
tuples after the vacuum... likely because they became dead after
vacuum started.

I'd recommend reading the manual or Tomas Vondra's blog about vacuum
costs. It's not overly complex, once you understand what each of the
vacuum settings does.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Jeff Janes
On Thu, Feb 7, 2019 at 6:55 AM Mariel Cherkassky <[hidden email]> wrote:

I have 3 questions : 
1)To what value do you recommend to increase the vacuum cost_limit ? 2000 seems reasonable ? Or maybe its better to leave it as default and assign a specific value for big tables ?

That depends on your IO hardware, and your workload.  You wouldn't want background vacuum to use so much of your available IO that it starves your other processes.

 
2)When the autovacuum reaches the cost_limit while trying to vacuum a specific table, it wait nap_time seconds and then it continue to work on the same table ? 

No, it waits for autovacuum_vacuum_cost_delay before resuming within the same table. During this delay, the table is still open and it still holds a lock on it, and holds the transaction open, etc.  Naptime is entirely different, it controls how often the vacuum scheduler checks to see which tables need to be vacuumed again.

 
3)So in case I have a table that keeps growing (not fast because I set the vacuum_scale_factor to 0 and the autovacuum_vacuum_threshold to 10000). If the table keep growing it means I should try to increase the cost right ? Do you see any other option ?

 You can use pg_freespacemap to see if the free space is spread evenly throughout the table, or clustered together.  That might help figure out what is going on.  And, is it the table itself that is growing, or the index on it?

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Mariel Cherkassky
I checked in the logs when the autovacuum vacuum my big toasted table during the week and I wanted to confirm with you what I think : 
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s
--
postgresql-Mon.log:2019-02-11 01:11:46 EST  8426  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 23
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 62269200 removed, 82958 remain
postgresql-Mon.log- buffer usage: 28290538 hits, 46323736 misses, 38950869 dirtied
postgresql-Mon.log- avg read rate: 2.850 MiB/s, avg write rate: 2.396 MiB/s
--
postgresql-Mon.log:2019-02-11 21:43:19 EST  24323  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 1
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 114573 removed, 57785 remain
postgresql-Mon.log- buffer usage: 15877931 hits, 15972119 misses, 15626466 dirtied
postgresql-Mon.log- avg read rate: 2.525 MiB/s, avg write rate: 2.470 MiB/s
--
postgresql-Sat.log:2019-02-09 04:54:50 EST  1793  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Sat.log- pages: 0 removed, 13737828 remain
postgresql-Sat.log- tuples: 34457593 removed, 15871942 remain
postgresql-Sat.log- buffer usage: 15552642 hits, 26130334 misses, 22473776 dirtied
postgresql-Sat.log- avg read rate: 2.802 MiB/s, avg write rate: 2.410 MiB/s
--
postgresql-Thu.log:2019-02-07 12:08:50 EST  29630  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Thu.log- pages: 0 removed, 10290976 remain
postgresql-Thu.log- tuples: 35357057 removed, 3436237 remain
postgresql-Thu.log- buffer usage: 11854053 hits, 21346342 misses, 19232835 dirtied
postgresql-Thu.log- avg read rate: 2.705 MiB/s, avg write rate: 2.437 MiB/s
--
postgresql-Tue.log:2019-02-12 20:54:44 EST  21464  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 10
postgresql-Tue.log- pages: 0 removed, 23176876 remain
postgresql-Tue.log- tuples: 26011446 removed, 49426774 remain
postgresql-Tue.log- buffer usage: 21863057 hits, 28668178 misses, 25472137 dirtied
postgresql-Tue.log- avg read rate: 2.684 MiB/s, avg write rate: 2.385 MiB/s
--


Lets focus for example on one of the outputs :
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s

The cost_limit is set to 200 (default) and the cost_delay is set to 20ms. 
The calculation I did : (1*15031267+10*21081633+20*19274530)/200*20/1000 = 61133.8197 seconds ~ 17H
So autovacuum was laying down for 17h ? I think that I should increase the cost_limit to max specifically on the toasted table. What do you think ? Am I wrong here ?


‫בתאריך יום ה׳, 7 בפבר׳ 2019 ב-18:26 מאת ‪Jeff Janes‬‏ <‪[hidden email]‬‏>:‬
On Thu, Feb 7, 2019 at 6:55 AM Mariel Cherkassky <[hidden email]> wrote:

I have 3 questions : 
1)To what value do you recommend to increase the vacuum cost_limit ? 2000 seems reasonable ? Or maybe its better to leave it as default and assign a specific value for big tables ?

That depends on your IO hardware, and your workload.  You wouldn't want background vacuum to use so much of your available IO that it starves your other processes.

 
2)When the autovacuum reaches the cost_limit while trying to vacuum a specific table, it wait nap_time seconds and then it continue to work on the same table ? 

No, it waits for autovacuum_vacuum_cost_delay before resuming within the same table. During this delay, the table is still open and it still holds a lock on it, and holds the transaction open, etc.  Naptime is entirely different, it controls how often the vacuum scheduler checks to see which tables need to be vacuumed again.

 
3)So in case I have a table that keeps growing (not fast because I set the vacuum_scale_factor to 0 and the autovacuum_vacuum_threshold to 10000). If the table keep growing it means I should try to increase the cost right ? Do you see any other option ?

 You can use pg_freespacemap to see if the free space is spread evenly throughout the table, or clustered together.  That might help figure out what is going on.  And, is it the table itself that is growing, or the index on it?

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Michael Lewis
It is curious to me that the tuples remaining count varies so wildly. Is this expected?


Michael Lewis

On Thu, Feb 14, 2019 at 9:09 AM Mariel Cherkassky <[hidden email]> wrote:
I checked in the logs when the autovacuum vacuum my big toasted table during the week and I wanted to confirm with you what I think : 
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s
--
postgresql-Mon.log:2019-02-11 01:11:46 EST  8426  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 23
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 62269200 removed, 82958 remain
postgresql-Mon.log- buffer usage: 28290538 hits, 46323736 misses, 38950869 dirtied
postgresql-Mon.log- avg read rate: 2.850 MiB/s, avg write rate: 2.396 MiB/s
--
postgresql-Mon.log:2019-02-11 21:43:19 EST  24323  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 1
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 114573 removed, 57785 remain
postgresql-Mon.log- buffer usage: 15877931 hits, 15972119 misses, 15626466 dirtied
postgresql-Mon.log- avg read rate: 2.525 MiB/s, avg write rate: 2.470 MiB/s
--
postgresql-Sat.log:2019-02-09 04:54:50 EST  1793  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Sat.log- pages: 0 removed, 13737828 remain
postgresql-Sat.log- tuples: 34457593 removed, 15871942 remain
postgresql-Sat.log- buffer usage: 15552642 hits, 26130334 misses, 22473776 dirtied
postgresql-Sat.log- avg read rate: 2.802 MiB/s, avg write rate: 2.410 MiB/s
--
postgresql-Thu.log:2019-02-07 12:08:50 EST  29630  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Thu.log- pages: 0 removed, 10290976 remain
postgresql-Thu.log- tuples: 35357057 removed, 3436237 remain
postgresql-Thu.log- buffer usage: 11854053 hits, 21346342 misses, 19232835 dirtied
postgresql-Thu.log- avg read rate: 2.705 MiB/s, avg write rate: 2.437 MiB/s
--
postgresql-Tue.log:2019-02-12 20:54:44 EST  21464  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 10
postgresql-Tue.log- pages: 0 removed, 23176876 remain
postgresql-Tue.log- tuples: 26011446 removed, 49426774 remain
postgresql-Tue.log- buffer usage: 21863057 hits, 28668178 misses, 25472137 dirtied
postgresql-Tue.log- avg read rate: 2.684 MiB/s, avg write rate: 2.385 MiB/s
--


Lets focus for example on one of the outputs :
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s

The cost_limit is set to 200 (default) and the cost_delay is set to 20ms. 
The calculation I did : (1*15031267+10*21081633+20*19274530)/200*20/1000 = 61133.8197 seconds ~ 17H
So autovacuum was laying down for 17h ? I think that I should increase the cost_limit to max specifically on the toasted table. What do you think ? Am I wrong here ?


‫בתאריך יום ה׳, 7 בפבר׳ 2019 ב-18:26 מאת ‪Jeff Janes‬‏ <‪[hidden email]‬‏>:‬
On Thu, Feb 7, 2019 at 6:55 AM Mariel Cherkassky <[hidden email]> wrote:

I have 3 questions : 
1)To what value do you recommend to increase the vacuum cost_limit ? 2000 seems reasonable ? Or maybe its better to leave it as default and assign a specific value for big tables ?

That depends on your IO hardware, and your workload.  You wouldn't want background vacuum to use so much of your available IO that it starves your other processes.

 
2)When the autovacuum reaches the cost_limit while trying to vacuum a specific table, it wait nap_time seconds and then it continue to work on the same table ? 

No, it waits for autovacuum_vacuum_cost_delay before resuming within the same table. During this delay, the table is still open and it still holds a lock on it, and holds the transaction open, etc.  Naptime is entirely different, it controls how often the vacuum scheduler checks to see which tables need to be vacuumed again.

 
3)So in case I have a table that keeps growing (not fast because I set the vacuum_scale_factor to 0 and the autovacuum_vacuum_threshold to 10000). If the table keep growing it means I should try to increase the cost right ? Do you see any other option ?

 You can use pg_freespacemap to see if the free space is spread evenly throughout the table, or clustered together.  That might help figure out what is going on.  And, is it the table itself that is growing, or the index on it?

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Mariel Cherkassky
Maybe by explaining the tables purpose it will be cleaner. The original table contains rows for sessions in my app. Every session saves for itself some raw data which is saved in the toasted table. We clean old sessions (3+ days) every night. During the day sessions are created so the size of the table should grow during the day and freed in the night after the autovacuum run.However, the autovacuums sleeps for alot of time and during that time more sessions are created so maybe this can explain the big size ? Do you think that by increasing the cost limit and decreasing the cost delay I can solve the issue ?

On Thu, Feb 14, 2019, 8:38 PM Michael Lewis <[hidden email] wrote:
It is curious to me that the tuples remaining count varies so wildly. Is this expected?


Michael Lewis

On Thu, Feb 14, 2019 at 9:09 AM Mariel Cherkassky <[hidden email]> wrote:
I checked in the logs when the autovacuum vacuum my big toasted table during the week and I wanted to confirm with you what I think : 
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s
--
postgresql-Mon.log:2019-02-11 01:11:46 EST  8426  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 23
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 62269200 removed, 82958 remain
postgresql-Mon.log- buffer usage: 28290538 hits, 46323736 misses, 38950869 dirtied
postgresql-Mon.log- avg read rate: 2.850 MiB/s, avg write rate: 2.396 MiB/s
--
postgresql-Mon.log:2019-02-11 21:43:19 EST  24323  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 1
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 114573 removed, 57785 remain
postgresql-Mon.log- buffer usage: 15877931 hits, 15972119 misses, 15626466 dirtied
postgresql-Mon.log- avg read rate: 2.525 MiB/s, avg write rate: 2.470 MiB/s
--
postgresql-Sat.log:2019-02-09 04:54:50 EST  1793  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Sat.log- pages: 0 removed, 13737828 remain
postgresql-Sat.log- tuples: 34457593 removed, 15871942 remain
postgresql-Sat.log- buffer usage: 15552642 hits, 26130334 misses, 22473776 dirtied
postgresql-Sat.log- avg read rate: 2.802 MiB/s, avg write rate: 2.410 MiB/s
--
postgresql-Thu.log:2019-02-07 12:08:50 EST  29630  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Thu.log- pages: 0 removed, 10290976 remain
postgresql-Thu.log- tuples: 35357057 removed, 3436237 remain
postgresql-Thu.log- buffer usage: 11854053 hits, 21346342 misses, 19232835 dirtied
postgresql-Thu.log- avg read rate: 2.705 MiB/s, avg write rate: 2.437 MiB/s
--
postgresql-Tue.log:2019-02-12 20:54:44 EST  21464  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 10
postgresql-Tue.log- pages: 0 removed, 23176876 remain
postgresql-Tue.log- tuples: 26011446 removed, 49426774 remain
postgresql-Tue.log- buffer usage: 21863057 hits, 28668178 misses, 25472137 dirtied
postgresql-Tue.log- avg read rate: 2.684 MiB/s, avg write rate: 2.385 MiB/s
--


Lets focus for example on one of the outputs :
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s

The cost_limit is set to 200 (default) and the cost_delay is set to 20ms. 
The calculation I did : (1*15031267+10*21081633+20*19274530)/200*20/1000 = 61133.8197 seconds ~ 17H
So autovacuum was laying down for 17h ? I think that I should increase the cost_limit to max specifically on the toasted table. What do you think ? Am I wrong here ?


‫בתאריך יום ה׳, 7 בפבר׳ 2019 ב-18:26 מאת ‪Jeff Janes‬‏ <‪[hidden email]‬‏>:‬
On Thu, Feb 7, 2019 at 6:55 AM Mariel Cherkassky <[hidden email]> wrote:

I have 3 questions : 
1)To what value do you recommend to increase the vacuum cost_limit ? 2000 seems reasonable ? Or maybe its better to leave it as default and assign a specific value for big tables ?

That depends on your IO hardware, and your workload.  You wouldn't want background vacuum to use so much of your available IO that it starves your other processes.

 
2)When the autovacuum reaches the cost_limit while trying to vacuum a specific table, it wait nap_time seconds and then it continue to work on the same table ? 

No, it waits for autovacuum_vacuum_cost_delay before resuming within the same table. During this delay, the table is still open and it still holds a lock on it, and holds the transaction open, etc.  Naptime is entirely different, it controls how often the vacuum scheduler checks to see which tables need to be vacuumed again.

 
3)So in case I have a table that keeps growing (not fast because I set the vacuum_scale_factor to 0 and the autovacuum_vacuum_threshold to 10000). If the table keep growing it means I should try to increase the cost right ? Do you see any other option ?

 You can use pg_freespacemap to see if the free space is spread evenly throughout the table, or clustered together.  That might help figure out what is going on.  And, is it the table itself that is growing, or the index on it?

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Michael Lewis
Thanks, that context is very enlightening. Do you manually vacuum after doing the big purge of old session data? Is bloat causing issues for you? Why is it a concern that autovacuum's behavior varies?


Michael Lewis

On Thu, Feb 14, 2019 at 12:41 PM Mariel Cherkassky <[hidden email]> wrote:
Maybe by explaining the tables purpose it will be cleaner. The original table contains rows for sessions in my app. Every session saves for itself some raw data which is saved in the toasted table. We clean old sessions (3+ days) every night. During the day sessions are created so the size of the table should grow during the day and freed in the night after the autovacuum run.However, the autovacuums sleeps for alot of time and during that time more sessions are created so maybe this can explain the big size ? Do you think that by increasing the cost limit and decreasing the cost delay I can solve the issue ?

On Thu, Feb 14, 2019, 8:38 PM Michael Lewis <[hidden email] wrote:
It is curious to me that the tuples remaining count varies so wildly. Is this expected?


Michael Lewis

On Thu, Feb 14, 2019 at 9:09 AM Mariel Cherkassky <[hidden email]> wrote:
I checked in the logs when the autovacuum vacuum my big toasted table during the week and I wanted to confirm with you what I think : 
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s
--
postgresql-Mon.log:2019-02-11 01:11:46 EST  8426  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 23
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 62269200 removed, 82958 remain
postgresql-Mon.log- buffer usage: 28290538 hits, 46323736 misses, 38950869 dirtied
postgresql-Mon.log- avg read rate: 2.850 MiB/s, avg write rate: 2.396 MiB/s
--
postgresql-Mon.log:2019-02-11 21:43:19 EST  24323  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 1
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 114573 removed, 57785 remain
postgresql-Mon.log- buffer usage: 15877931 hits, 15972119 misses, 15626466 dirtied
postgresql-Mon.log- avg read rate: 2.525 MiB/s, avg write rate: 2.470 MiB/s
--
postgresql-Sat.log:2019-02-09 04:54:50 EST  1793  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Sat.log- pages: 0 removed, 13737828 remain
postgresql-Sat.log- tuples: 34457593 removed, 15871942 remain
postgresql-Sat.log- buffer usage: 15552642 hits, 26130334 misses, 22473776 dirtied
postgresql-Sat.log- avg read rate: 2.802 MiB/s, avg write rate: 2.410 MiB/s
--
postgresql-Thu.log:2019-02-07 12:08:50 EST  29630  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Thu.log- pages: 0 removed, 10290976 remain
postgresql-Thu.log- tuples: 35357057 removed, 3436237 remain
postgresql-Thu.log- buffer usage: 11854053 hits, 21346342 misses, 19232835 dirtied
postgresql-Thu.log- avg read rate: 2.705 MiB/s, avg write rate: 2.437 MiB/s
--
postgresql-Tue.log:2019-02-12 20:54:44 EST  21464  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 10
postgresql-Tue.log- pages: 0 removed, 23176876 remain
postgresql-Tue.log- tuples: 26011446 removed, 49426774 remain
postgresql-Tue.log- buffer usage: 21863057 hits, 28668178 misses, 25472137 dirtied
postgresql-Tue.log- avg read rate: 2.684 MiB/s, avg write rate: 2.385 MiB/s
--


Lets focus for example on one of the outputs :
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s

The cost_limit is set to 200 (default) and the cost_delay is set to 20ms. 
The calculation I did : (1*15031267+10*21081633+20*19274530)/200*20/1000 = 61133.8197 seconds ~ 17H
So autovacuum was laying down for 17h ? I think that I should increase the cost_limit to max specifically on the toasted table. What do you think ? Am I wrong here ?


‫בתאריך יום ה׳, 7 בפבר׳ 2019 ב-18:26 מאת ‪Jeff Janes‬‏ <‪[hidden email]‬‏>:‬
On Thu, Feb 7, 2019 at 6:55 AM Mariel Cherkassky <[hidden email]> wrote:

I have 3 questions : 
1)To what value do you recommend to increase the vacuum cost_limit ? 2000 seems reasonable ? Or maybe its better to leave it as default and assign a specific value for big tables ?

That depends on your IO hardware, and your workload.  You wouldn't want background vacuum to use so much of your available IO that it starves your other processes.

 
2)When the autovacuum reaches the cost_limit while trying to vacuum a specific table, it wait nap_time seconds and then it continue to work on the same table ? 

No, it waits for autovacuum_vacuum_cost_delay before resuming within the same table. During this delay, the table is still open and it still holds a lock on it, and holds the transaction open, etc.  Naptime is entirely different, it controls how often the vacuum scheduler checks to see which tables need to be vacuumed again.

 
3)So in case I have a table that keeps growing (not fast because I set the vacuum_scale_factor to 0 and the autovacuum_vacuum_threshold to 10000). If the table keep growing it means I should try to increase the cost right ? Do you see any other option ?

 You can use pg_freespacemap to see if the free space is spread evenly throughout the table, or clustered together.  That might help figure out what is going on.  And, is it the table itself that is growing, or the index on it?

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Mariel Cherkassky
No I don't run vacuum manually afterwards because the autovacuum should run. This process happens every night. Yes , bloating is an issue because the table grow and take a lot of space on disk.  Regarding the autovacuum, I think that it sleeps too much time (17h) during it's work, don't you think so? 

On Thu, Feb 14, 2019, 9:52 PM Michael Lewis <[hidden email] wrote:
Thanks, that context is very enlightening. Do you manually vacuum after doing the big purge of old session data? Is bloat causing issues for you? Why is it a concern that autovacuum's behavior varies?


Michael Lewis

On Thu, Feb 14, 2019 at 12:41 PM Mariel Cherkassky <[hidden email]> wrote:
Maybe by explaining the tables purpose it will be cleaner. The original table contains rows for sessions in my app. Every session saves for itself some raw data which is saved in the toasted table. We clean old sessions (3+ days) every night. During the day sessions are created so the size of the table should grow during the day and freed in the night after the autovacuum run.However, the autovacuums sleeps for alot of time and during that time more sessions are created so maybe this can explain the big size ? Do you think that by increasing the cost limit and decreasing the cost delay I can solve the issue ?

On Thu, Feb 14, 2019, 8:38 PM Michael Lewis <[hidden email] wrote:
It is curious to me that the tuples remaining count varies so wildly. Is this expected?


Michael Lewis

On Thu, Feb 14, 2019 at 9:09 AM Mariel Cherkassky <[hidden email]> wrote:
I checked in the logs when the autovacuum vacuum my big toasted table during the week and I wanted to confirm with you what I think : 
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s
--
postgresql-Mon.log:2019-02-11 01:11:46 EST  8426  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 23
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 62269200 removed, 82958 remain
postgresql-Mon.log- buffer usage: 28290538 hits, 46323736 misses, 38950869 dirtied
postgresql-Mon.log- avg read rate: 2.850 MiB/s, avg write rate: 2.396 MiB/s
--
postgresql-Mon.log:2019-02-11 21:43:19 EST  24323  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 1
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 114573 removed, 57785 remain
postgresql-Mon.log- buffer usage: 15877931 hits, 15972119 misses, 15626466 dirtied
postgresql-Mon.log- avg read rate: 2.525 MiB/s, avg write rate: 2.470 MiB/s
--
postgresql-Sat.log:2019-02-09 04:54:50 EST  1793  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Sat.log- pages: 0 removed, 13737828 remain
postgresql-Sat.log- tuples: 34457593 removed, 15871942 remain
postgresql-Sat.log- buffer usage: 15552642 hits, 26130334 misses, 22473776 dirtied
postgresql-Sat.log- avg read rate: 2.802 MiB/s, avg write rate: 2.410 MiB/s
--
postgresql-Thu.log:2019-02-07 12:08:50 EST  29630  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Thu.log- pages: 0 removed, 10290976 remain
postgresql-Thu.log- tuples: 35357057 removed, 3436237 remain
postgresql-Thu.log- buffer usage: 11854053 hits, 21346342 misses, 19232835 dirtied
postgresql-Thu.log- avg read rate: 2.705 MiB/s, avg write rate: 2.437 MiB/s
--
postgresql-Tue.log:2019-02-12 20:54:44 EST  21464  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 10
postgresql-Tue.log- pages: 0 removed, 23176876 remain
postgresql-Tue.log- tuples: 26011446 removed, 49426774 remain
postgresql-Tue.log- buffer usage: 21863057 hits, 28668178 misses, 25472137 dirtied
postgresql-Tue.log- avg read rate: 2.684 MiB/s, avg write rate: 2.385 MiB/s
--


Lets focus for example on one of the outputs :
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s

The cost_limit is set to 200 (default) and the cost_delay is set to 20ms. 
The calculation I did : (1*15031267+10*21081633+20*19274530)/200*20/1000 = 61133.8197 seconds ~ 17H
So autovacuum was laying down for 17h ? I think that I should increase the cost_limit to max specifically on the toasted table. What do you think ? Am I wrong here ?


‫בתאריך יום ה׳, 7 בפבר׳ 2019 ב-18:26 מאת ‪Jeff Janes‬‏ <‪[hidden email]‬‏>:‬
On Thu, Feb 7, 2019 at 6:55 AM Mariel Cherkassky <[hidden email]> wrote:

I have 3 questions : 
1)To what value do you recommend to increase the vacuum cost_limit ? 2000 seems reasonable ? Or maybe its better to leave it as default and assign a specific value for big tables ?

That depends on your IO hardware, and your workload.  You wouldn't want background vacuum to use so much of your available IO that it starves your other processes.

 
2)When the autovacuum reaches the cost_limit while trying to vacuum a specific table, it wait nap_time seconds and then it continue to work on the same table ? 

No, it waits for autovacuum_vacuum_cost_delay before resuming within the same table. During this delay, the table is still open and it still holds a lock on it, and holds the transaction open, etc.  Naptime is entirely different, it controls how often the vacuum scheduler checks to see which tables need to be vacuumed again.

 
3)So in case I have a table that keeps growing (not fast because I set the vacuum_scale_factor to 0 and the autovacuum_vacuum_threshold to 10000). If the table keep growing it means I should try to increase the cost right ? Do you see any other option ?

 You can use pg_freespacemap to see if the free space is spread evenly throughout the table, or clustered together.  That might help figure out what is going on.  And, is it the table itself that is growing, or the index on it?

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Michael Lewis
If there are high number of updates during normal daytime processes, then yes you need to ensure autovacuum is handling this table as needed. If the nightly delete is the only major source of bloat on this table, then perhaps running a manual vacuum keeps things tidy after the big delete. Granted, if you are manually going to vacuum, don't use vacuum full as there is not much sense in recovering that disk space if the table is going to expected to be similarly sized again by the end of the day.

Do you have a proper number of workers and maintenance_work_mem to get the job done?

As you proposed, it seems likely to be good to significantly increase autovacuum_vacuum_cost_limit on this table, and perhaps decrease autovacuum_vacuum_scale_factor if it is not being picked up as a candidate for vacuum very frequently.


Michael Lewis


On Thu, Feb 14, 2019 at 1:08 PM Mariel Cherkassky <[hidden email]> wrote:
No I don't run vacuum manually afterwards because the autovacuum should run. This process happens every night. Yes , bloating is an issue because the table grow and take a lot of space on disk.  Regarding the autovacuum, I think that it sleeps too much time (17h) during it's work, don't you think so? 

On Thu, Feb 14, 2019, 9:52 PM Michael Lewis <[hidden email] wrote:
Thanks, that context is very enlightening. Do you manually vacuum after doing the big purge of old session data? Is bloat causing issues for you? Why is it a concern that autovacuum's behavior varies?


Michael Lewis

On Thu, Feb 14, 2019 at 12:41 PM Mariel Cherkassky <[hidden email]> wrote:
Maybe by explaining the tables purpose it will be cleaner. The original table contains rows for sessions in my app. Every session saves for itself some raw data which is saved in the toasted table. We clean old sessions (3+ days) every night. During the day sessions are created so the size of the table should grow during the day and freed in the night after the autovacuum run.However, the autovacuums sleeps for alot of time and during that time more sessions are created so maybe this can explain the big size ? Do you think that by increasing the cost limit and decreasing the cost delay I can solve the issue ?

On Thu, Feb 14, 2019, 8:38 PM Michael Lewis <[hidden email] wrote:
It is curious to me that the tuples remaining count varies so wildly. Is this expected?


Michael Lewis

On Thu, Feb 14, 2019 at 9:09 AM Mariel Cherkassky <[hidden email]> wrote:
I checked in the logs when the autovacuum vacuum my big toasted table during the week and I wanted to confirm with you what I think : 
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s
--
postgresql-Mon.log:2019-02-11 01:11:46 EST  8426  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 23
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 62269200 removed, 82958 remain
postgresql-Mon.log- buffer usage: 28290538 hits, 46323736 misses, 38950869 dirtied
postgresql-Mon.log- avg read rate: 2.850 MiB/s, avg write rate: 2.396 MiB/s
--
postgresql-Mon.log:2019-02-11 21:43:19 EST  24323  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 1
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 114573 removed, 57785 remain
postgresql-Mon.log- buffer usage: 15877931 hits, 15972119 misses, 15626466 dirtied
postgresql-Mon.log- avg read rate: 2.525 MiB/s, avg write rate: 2.470 MiB/s
--
postgresql-Sat.log:2019-02-09 04:54:50 EST  1793  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Sat.log- pages: 0 removed, 13737828 remain
postgresql-Sat.log- tuples: 34457593 removed, 15871942 remain
postgresql-Sat.log- buffer usage: 15552642 hits, 26130334 misses, 22473776 dirtied
postgresql-Sat.log- avg read rate: 2.802 MiB/s, avg write rate: 2.410 MiB/s
--
postgresql-Thu.log:2019-02-07 12:08:50 EST  29630  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Thu.log- pages: 0 removed, 10290976 remain
postgresql-Thu.log- tuples: 35357057 removed, 3436237 remain
postgresql-Thu.log- buffer usage: 11854053 hits, 21346342 misses, 19232835 dirtied
postgresql-Thu.log- avg read rate: 2.705 MiB/s, avg write rate: 2.437 MiB/s
--
postgresql-Tue.log:2019-02-12 20:54:44 EST  21464  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 10
postgresql-Tue.log- pages: 0 removed, 23176876 remain
postgresql-Tue.log- tuples: 26011446 removed, 49426774 remain
postgresql-Tue.log- buffer usage: 21863057 hits, 28668178 misses, 25472137 dirtied
postgresql-Tue.log- avg read rate: 2.684 MiB/s, avg write rate: 2.385 MiB/s
--


Lets focus for example on one of the outputs :
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s

The cost_limit is set to 200 (default) and the cost_delay is set to 20ms. 
The calculation I did : (1*15031267+10*21081633+20*19274530)/200*20/1000 = 61133.8197 seconds ~ 17H
So autovacuum was laying down for 17h ? I think that I should increase the cost_limit to max specifically on the toasted table. What do you think ? Am I wrong here ?


‫בתאריך יום ה׳, 7 בפבר׳ 2019 ב-18:26 מאת ‪Jeff Janes‬‏ <‪[hidden email]‬‏>:‬
On Thu, Feb 7, 2019 at 6:55 AM Mariel Cherkassky <[hidden email]> wrote:

I have 3 questions : 
1)To what value do you recommend to increase the vacuum cost_limit ? 2000 seems reasonable ? Or maybe its better to leave it as default and assign a specific value for big tables ?

That depends on your IO hardware, and your workload.  You wouldn't want background vacuum to use so much of your available IO that it starves your other processes.

 
2)When the autovacuum reaches the cost_limit while trying to vacuum a specific table, it wait nap_time seconds and then it continue to work on the same table ? 

No, it waits for autovacuum_vacuum_cost_delay before resuming within the same table. During this delay, the table is still open and it still holds a lock on it, and holds the transaction open, etc.  Naptime is entirely different, it controls how often the vacuum scheduler checks to see which tables need to be vacuumed again.

 
3)So in case I have a table that keeps growing (not fast because I set the vacuum_scale_factor to 0 and the autovacuum_vacuum_threshold to 10000). If the table keep growing it means I should try to increase the cost right ? Do you see any other option ?

 You can use pg_freespacemap to see if the free space is spread evenly throughout the table, or clustered together.  That might help figure out what is going on.  And, is it the table itself that is growing, or the index on it?

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: autovacuum big table taking hours and sometimes seconds

Mariel Cherkassky
I set the toast.autovacuum_vacuum_scale_factor to 0 and the toast.autovacuum_vacuum threshold to 10000 so it should be enough to force a vacuum after the nightly deletes. Now , I changed the cost limit and the cost delay, my question is if I have anything else to do ? My maintenance_work_mem is about 1gb and I didn't change the default value of the workers. Is there a way to calc what size the maintenance_work_mem should be in order to clean the table ? And what exactly is saved in the maintenance_work_mem ? I mean how it used by the autovacuum..

On Thu, Feb 14, 2019, 11:45 PM Michael Lewis <[hidden email] wrote:
If there are high number of updates during normal daytime processes, then yes you need to ensure autovacuum is handling this table as needed. If the nightly delete is the only major source of bloat on this table, then perhaps running a manual vacuum keeps things tidy after the big delete. Granted, if you are manually going to vacuum, don't use vacuum full as there is not much sense in recovering that disk space if the table is going to expected to be similarly sized again by the end of the day.

Do you have a proper number of workers and maintenance_work_mem to get the job done?

As you proposed, it seems likely to be good to significantly increase autovacuum_vacuum_cost_limit on this table, and perhaps decrease autovacuum_vacuum_scale_factor if it is not being picked up as a candidate for vacuum very frequently.


Michael Lewis


On Thu, Feb 14, 2019 at 1:08 PM Mariel Cherkassky <[hidden email]> wrote:
No I don't run vacuum manually afterwards because the autovacuum should run. This process happens every night. Yes , bloating is an issue because the table grow and take a lot of space on disk.  Regarding the autovacuum, I think that it sleeps too much time (17h) during it's work, don't you think so? 

On Thu, Feb 14, 2019, 9:52 PM Michael Lewis <[hidden email] wrote:
Thanks, that context is very enlightening. Do you manually vacuum after doing the big purge of old session data? Is bloat causing issues for you? Why is it a concern that autovacuum's behavior varies?


Michael Lewis

On Thu, Feb 14, 2019 at 12:41 PM Mariel Cherkassky <[hidden email]> wrote:
Maybe by explaining the tables purpose it will be cleaner. The original table contains rows for sessions in my app. Every session saves for itself some raw data which is saved in the toasted table. We clean old sessions (3+ days) every night. During the day sessions are created so the size of the table should grow during the day and freed in the night after the autovacuum run.However, the autovacuums sleeps for alot of time and during that time more sessions are created so maybe this can explain the big size ? Do you think that by increasing the cost limit and decreasing the cost delay I can solve the issue ?

On Thu, Feb 14, 2019, 8:38 PM Michael Lewis <[hidden email] wrote:
It is curious to me that the tuples remaining count varies so wildly. Is this expected?


Michael Lewis

On Thu, Feb 14, 2019 at 9:09 AM Mariel Cherkassky <[hidden email]> wrote:
I checked in the logs when the autovacuum vacuum my big toasted table during the week and I wanted to confirm with you what I think : 
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s
--
postgresql-Mon.log:2019-02-11 01:11:46 EST  8426  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 23
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 62269200 removed, 82958 remain
postgresql-Mon.log- buffer usage: 28290538 hits, 46323736 misses, 38950869 dirtied
postgresql-Mon.log- avg read rate: 2.850 MiB/s, avg write rate: 2.396 MiB/s
--
postgresql-Mon.log:2019-02-11 21:43:19 EST  24323  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 1
postgresql-Mon.log- pages: 0 removed, 23176876 remain
postgresql-Mon.log- tuples: 114573 removed, 57785 remain
postgresql-Mon.log- buffer usage: 15877931 hits, 15972119 misses, 15626466 dirtied
postgresql-Mon.log- avg read rate: 2.525 MiB/s, avg write rate: 2.470 MiB/s
--
postgresql-Sat.log:2019-02-09 04:54:50 EST  1793  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Sat.log- pages: 0 removed, 13737828 remain
postgresql-Sat.log- tuples: 34457593 removed, 15871942 remain
postgresql-Sat.log- buffer usage: 15552642 hits, 26130334 misses, 22473776 dirtied
postgresql-Sat.log- avg read rate: 2.802 MiB/s, avg write rate: 2.410 MiB/s
--
postgresql-Thu.log:2019-02-07 12:08:50 EST  29630  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
postgresql-Thu.log- pages: 0 removed, 10290976 remain
postgresql-Thu.log- tuples: 35357057 removed, 3436237 remain
postgresql-Thu.log- buffer usage: 11854053 hits, 21346342 misses, 19232835 dirtied
postgresql-Thu.log- avg read rate: 2.705 MiB/s, avg write rate: 2.437 MiB/s
--
postgresql-Tue.log:2019-02-12 20:54:44 EST  21464  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 10
postgresql-Tue.log- pages: 0 removed, 23176876 remain
postgresql-Tue.log- tuples: 26011446 removed, 49426774 remain
postgresql-Tue.log- buffer usage: 21863057 hits, 28668178 misses, 25472137 dirtied
postgresql-Tue.log- avg read rate: 2.684 MiB/s, avg write rate: 2.385 MiB/s
--


Lets focus for example on one of the outputs :
postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
postgresql-Fri.log- pages: 2253 removed, 13737828 remain
postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 dirtied
postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s

The cost_limit is set to 200 (default) and the cost_delay is set to 20ms. 
The calculation I did : (1*15031267+10*21081633+20*19274530)/200*20/1000 = 61133.8197 seconds ~ 17H
So autovacuum was laying down for 17h ? I think that I should increase the cost_limit to max specifically on the toasted table. What do you think ? Am I wrong here ?


‫בתאריך יום ה׳, 7 בפבר׳ 2019 ב-18:26 מאת ‪Jeff Janes‬‏ <‪[hidden email]‬‏>:‬
On Thu, Feb 7, 2019 at 6:55 AM Mariel Cherkassky <[hidden email]> wrote:

I have 3 questions : 
1)To what value do you recommend to increase the vacuum cost_limit ? 2000 seems reasonable ? Or maybe its better to leave it as default and assign a specific value for big tables ?

That depends on your IO hardware, and your workload.  You wouldn't want background vacuum to use so much of your available IO that it starves your other processes.

 
2)When the autovacuum reaches the cost_limit while trying to vacuum a specific table, it wait nap_time seconds and then it continue to work on the same table ? 

No, it waits for autovacuum_vacuum_cost_delay before resuming within the same table. During this delay, the table is still open and it still holds a lock on it, and holds the transaction open, etc.  Naptime is entirely different, it controls how often the vacuum scheduler checks to see which tables need to be vacuumed again.

 
3)So in case I have a table that keeps growing (not fast because I set the vacuum_scale_factor to 0 and the autovacuum_vacuum_threshold to 10000). If the table keep growing it means I should try to increase the cost right ? Do you see any other option ?

 You can use pg_freespacemap to see if the free space is spread evenly throughout the table, or clustered together.  That might help figure out what is going on.  And, is it the table itself that is growing, or the index on it?

Cheers,

Jeff
12