When do vacuumed pages/tuples become available for reuse?

classic Classic list List threaded Threaded
30 messages Options
12
Reply | Threaded
Open this post in threaded view
|

When do vacuumed pages/tuples become available for reuse?

rihad
If an autovacuum job on a huge table runs for 5-6 hours, do its freed
pages/tuples become available for reuse immediately when they are marked
as free, or only at the end of the multi-hour vacuum operation?



Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

Tom Lane-2
rihad <[hidden email]> writes:
> If an autovacuum job on a huge table runs for 5-6 hours, do its freed
> pages/tuples become available for reuse immediately when they are marked
> as free, or only at the end of the multi-hour vacuum operation?

They'll be freed in batches, where the size of a batch depends on the
autovacuum_work_mem or maintenance_work_mem setting.  The basic
work cycle is

* scan table to find dead tuples, save their TIDs in working memory;
  continue until end of table or working memory full
* scan indexes to find index entries matching those TIDs, remove 'em
* go back to table and remove the previously-found tuples
* if not end of table, repeat

So a larger work-mem setting means fewer passes over the indexes,
but a longer time until space is reclaimed.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

rihad
On 04/11/2019 05:48 PM, Tom Lane wrote:

> rihad <[hidden email]> writes:
>> If an autovacuum job on a huge table runs for 5-6 hours, do its freed
>> pages/tuples become available for reuse immediately when they are marked
>> as free, or only at the end of the multi-hour vacuum operation?
> They'll be freed in batches, where the size of a batch depends on the
> autovacuum_work_mem or maintenance_work_mem setting.  The basic
> work cycle is
>
> * scan table to find dead tuples, save their TIDs in working memory;
>    continue until end of table or working memory full
> * scan indexes to find index entries matching those TIDs, remove 'em
> * go back to table and remove the previously-found tuples
> * if not end of table, repeat
>
> So a larger work-mem setting means fewer passes over the indexes,
> but a longer time until space is reclaimed.
>
> regards, tom lane
> .
>
Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
would be available for reuse only at the end of the vacuum? Are there
any downsides in decreasing it to, say, 64MB? I see only pluses )



Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

Alvaro Herrera-9
On 2019-Apr-11, rihad wrote:

> Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
> would be available for reuse only at the end of the vacuum? Are there any
> downsides in decreasing it to, say, 64MB? I see only pluses )

Yes, each vacuum will take longer and will use much more I/O.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

rihad
On 04/11/2019 06:09 PM, Alvaro Herrera wrote:
> On 2019-Apr-11, rihad wrote:
>
>> Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
>> would be available for reuse only at the end of the vacuum? Are there any
>> downsides in decreasing it to, say, 64MB? I see only pluses )
> Yes, each vacuum will take longer and will use much more I/O.
>
I see, thanks.



Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

Tom Lane-2
In reply to this post by rihad
rihad <[hidden email]> writes:
> Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
> would be available for reuse only at the end of the vacuum?

It's six bytes per dead tuple, last I checked ... you do the math.

> Are there
> any downsides in decreasing it to, say, 64MB? I see only pluses )

Well, usually people prefer to minimize the number of passes over
the indexes.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

rihad
On 04/11/2019 06:20 PM, Tom Lane wrote:

> rihad <[hidden email]> writes:
>> Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
>> would be available for reuse only at the end of the vacuum?
> It's six bytes per dead tuple, last I checked ... you do the math.
>
>> Are there
>> any downsides in decreasing it to, say, 64MB? I see only pluses )
> Well, usually people prefer to minimize the number of passes over
> the indexes.
>
> regards, tom lane
> .
>
Yup, it's just that n_dead_tuples grows by several hundred thousand (the
table sees much much more updates than inserts) and disk usage grows
constantly between several hour long vacuum runs. Running vacuum full
isn't an option.



Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

Alvaro Herrera-9
On 2019-Apr-11, rihad wrote:

> On 04/11/2019 06:20 PM, Tom Lane wrote:
> > rihad <[hidden email]> writes:
> > > Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
> > > would be available for reuse only at the end of the vacuum?
> > It's six bytes per dead tuple, last I checked ... you do the math.
> >
> > > Are there
> > > any downsides in decreasing it to, say, 64MB? I see only pluses )
> > Well, usually people prefer to minimize the number of passes over
> > the indexes.

> Yup, it's just that n_dead_tuples grows by several hundred thousand (the
> table sees much much more updates than inserts) and disk usage grows
> constantly between several hour long vacuum runs. Running vacuum full isn't
> an option.

Perhaps it'd be better to vacuum this table much more often.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

rihad
On 04/11/2019 06:41 PM, Alvaro Herrera wrote:

> On 2019-Apr-11, rihad wrote:
>
>> On 04/11/2019 06:20 PM, Tom Lane wrote:
>>> rihad <[hidden email]> writes:
>>>> Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
>>>> would be available for reuse only at the end of the vacuum?
>>> It's six bytes per dead tuple, last I checked ... you do the math.
>>>
>>>> Are there
>>>> any downsides in decreasing it to, say, 64MB? I see only pluses )
>>> Well, usually people prefer to minimize the number of passes over
>>> the indexes.
>> Yup, it's just that n_dead_tuples grows by several hundred thousand (the
>> table sees much much more updates than inserts) and disk usage grows
>> constantly between several hour long vacuum runs. Running vacuum full isn't
>> an option.
> Perhaps it'd be better to vacuum this table much more often.
>
Each run took 5-6 hours, now it takes 2-3 hours after I've tweaked some
cost-based vacuum knobs.



Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

Alvaro Herrera-9
On 2019-Apr-11, rihad wrote:

> On 04/11/2019 06:41 PM, Alvaro Herrera wrote:
>
> > Perhaps it'd be better to vacuum this table much more often.
> >
> Each run took 5-6 hours, now it takes 2-3 hours after I've tweaked some
> cost-based vacuum knobs.

But how often does it run?

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

rihad
On 04/11/2019 07:04 PM, Alvaro Herrera wrote:

> On 2019-Apr-11, rihad wrote:
>
>> On 04/11/2019 06:41 PM, Alvaro Herrera wrote:
>>
>>> Perhaps it'd be better to vacuum this table much more often.
>>>
>> Each run took 5-6 hours, now it takes 2-3 hours after I've tweaked some
>> cost-based vacuum knobs.
> But how often does it run?
>
One after another. Enough n_dead_tuples accumulate in between runs to
easily trigger that.

autovacuum_vacuum_scale_factor = 0.01

utovacuum_vacuum_threshold = 50

which means to run autovac when 1% of table size + 50 rows have been
updated or deleted.

But we can't make each autovacuum run run faster )

Currently I lowered this from 20ms:

autovacuum_vacuum_cost_delay = 10ms

And increased this from 200:

autovacuum_vacuum_cost_limit = 400

to make it finish in 1 hour rather than 3 hours.

Anything more than that and we risk impacting the performance of user
queries.



Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

Jeff Janes
In reply to this post by rihad
On Thu, Apr 11, 2019 at 10:28 AM rihad <[hidden email]> wrote:

Yup, it's just that n_dead_tuples grows by several hundred thousand (the
table sees much much more updates than inserts) and disk usage grows
constantly between several hour long vacuum runs. Running vacuum full
isn't an option.

The disk usage doesn't reach a steady state after one or two autovacs?  Or it does, but you are just unhappy about the ratio between the steady state size and the theoretical fully packed size?

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

rihad
On 04/11/2019 07:40 PM, Jeff Janes wrote:
On Thu, Apr 11, 2019 at 10:28 AM rihad <[hidden email]> wrote:

Yup, it's just that n_dead_tuples grows by several hundred thousand (the
table sees much much more updates than inserts) and disk usage grows
constantly between several hour long vacuum runs. Running vacuum full
isn't an option.

The disk usage doesn't reach a steady state after one or two autovacs?  Or it does, but you are just unhappy about the ratio between the steady state size and the theoretical fully packed size?

Cheers,

Jeff


Since we dump&restore production DB daily into staging environment, the difference in size (as reported by psql's \l+) is 11GB in a freshly restored DB as opposed to 70GB in production.

Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

Jeff Janes
In reply to this post by rihad
On Thu, Apr 11, 2019 at 11:14 AM rihad <[hidden email]> wrote:

> autovacuum_vacuum_scale_factor = 0.01
> autovacuum_vacuum_threshold = 50  

This seems counterproductive.  You need to make the vacuum more efficient, not more frantic.


autovacuum_vacuum_cost_delay = 10ms
autovacuum_vacuum_cost_limit = 400
 
Anything more than that and we risk impacting the performance of user
queries.

Well, unbounded bloat will also impact the user queries--eventually.  Was this an experimental determination?  Can you tell what about the autovac most impacts the user queries, the reading or the writing?

You might just have more workload than your hardware can handle.  There is always going to be some fundamental limit, and while you can tune your way up to that limit, you can't tune your way past it.

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

Jeff Janes
In reply to this post by rihad
On Thu, Apr 11, 2019 at 11:44 AM rihad <[hidden email]> wrote:
On 04/11/2019 07:40 PM, Jeff Janes wrote:

The disk usage doesn't reach a steady state after one or two autovacs?  Or it does, but you are just unhappy about the ratio between the steady state size and the theoretical fully packed size?

Cheers,

Jeff


Since we dump&restore production DB daily into staging environment, the difference in size (as reported by psql's \l+) is 11GB in a freshly restored DB as opposed to 70GB in production.


Yeah, that seems like a problem.  Do you have long lived transactions/snapshots that are preventing vacuuming from removing dead tuples?  You can run a manual "vacuum verbose" and see how many dead but nonremovable tuples there were, or set log_autovacuum_min_duration to some non-negative value less than the autovac takes, and do the same.

(Indeed, those dumps you take daily might be the source of those long-lived snapshots.  How long does a dump take?)

Also, what does pg_freespace (https://www.postgresql.org/docs/current/pgfreespacemap.html) show about the available of space in the table?  How about pgstattuple (https://www.postgresql.org/docs/current/pgstattuple.html)

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

rihad
On 04/11/2019 08:09 PM, Jeff Janes wrote:
On Thu, Apr 11, 2019 at 11:44 AM rihad <[hidden email]> wrote:
On 04/11/2019 07:40 PM, Jeff Janes wrote:

The disk usage doesn't reach a steady state after one or two autovacs?  Or it does, but you are just unhappy about the ratio between the steady state size and the theoretical fully packed size?

Cheers,

Jeff


Since we dump&restore production DB daily into staging environment, the difference in size (as reported by psql's \l+) is 11GB in a freshly restored DB as opposed to 70GB in production.


Yeah, that seems like a problem.  Do you have long lived transactions/snapshots that are preventing vacuuming from removing dead tuples?  You can run a manual "vacuum verbose" and see how many dead but nonremovable tuples there were, or set log_autovacuum_min_duration to some non-negative value less than the autovac takes, and do the same.

vacuum frees tuples just fine. It's just that by the time each run finishes many more accumulate due to table update activity, ad nauseum. So this unused space constantly grows. Here's a sample autovacuum run:

2019-04-11 19:39:44.450841500 [] LOG:  automatic vacuum of table "foo.public.bar": index scans: 1
2019-04-11 19:39:44.450843500   pages: 0 removed, 472095 remain, 4 skipped due to pins, 39075 skipped frozen
2019-04-11 19:39:44.450844500   tuples: 19150 removed, 2725811 remain, 465 are dead but not yet removable
2019-04-11 19:39:44.450845500   buffer usage: 62407557 hits, 6984769 misses, 116409 dirtied
2019-04-11 19:39:44.450846500   avg read rate: 16.263 MB/s, avg write rate: 0.271 MB/s
2019-04-11 19:39:44.450847500   system usage: CPU 59.05s/115.26u sec elapsed 3355.28 sec


(Indeed, those dumps you take daily might be the source of those long-lived snapshots.  How long does a dump take?)

The daily dumps are taken daily from the slave server as part of stock FreeBSD postgres port activity.
I don't think it impacts the master server.


Also, what does pg_freespace (https://www.postgresql.org/docs/current/pgfreespacemap.html) show about the available of space in the table?  How about pgstattuple (https://www.postgresql.org/docs/current/pgstattuple.html)


Thanks, I'll try those. But as I said freshly restored DB is only 11GB in size, not 70 (only public schema is used).

Cheers,

Jeff


Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

Michael Lewis
vacuum frees tuples just fine. It's just that by the time each run finishes many more accumulate due to table update activity, ad nauseum. So this unused space constantly grows. Here's a sample autovacuum run:

2019-04-11 19:39:44.450841500 [] LOG:  automatic vacuum of table "foo.public.bar": index scans: 1
2019-04-11 19:39:44.450843500   pages: 0 removed, 472095 remain, 4 skipped due to pins, 39075 skipped frozen
2019-04-11 19:39:44.450844500   tuples: 19150 removed, 2725811 remain, 465 are dead but not yet removable
2019-04-11 19:39:44.450845500   buffer usage: 62407557 hits, 6984769 misses, 116409 dirtied
2019-04-11 19:39:44.450846500   avg read rate: 16.263 MB/s, avg write rate: 0.271 MB/s
2019-04-11 19:39:44.450847500   system usage: CPU 59.05s/115.26u sec elapsed 3355.28 sec

Maybe I am off base, but those read/write rates seem very low. Is this running on spinny disks? Also, less than half a million rows remain and 2.7 million dead but not removed in this auto vacuum. It seems to indicate that auto vacuum is not as aggressive as it needs to be. Have you verified that it blocks normal activity when autovacuum does more work each pass? If you are hardware bound, could you use partitioning to allow auto vacuum to work on the partitions separately and perhaps keep up?
Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

Alvaro Herrera-9
In reply to this post by rihad
On 2019-Apr-11, rihad wrote:

> 2019-04-11 19:39:44.450844500   tuples: 19150 removed, 2725811 remain, 465 are dead but not yet removable

What Jeff said.  This vacuum spent a lot of time, only to remove miserly
19k tuples, but 2.7M dead tuples remained ... probably because you have
long-running transactions preventing vacuum from removing them.  Maybe
that was pg_dump.

I suspect what you really need is keep an eye on the age of your oldest
transactions.  Don't leave anything "idle in transaction".

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

Michael Lewis
> 2019-04-11 19:39:44.450844500   tuples: 19150 removed, 2725811 remain, 465 are dead but not yet removable

What Jeff said.  This vacuum spent a lot of time, only to remove miserly
19k tuples, but 2.7M dead tuples remained ... probably because you have
long-running transactions preventing vacuum from removing them.  Maybe
that was pg_dump.

Wouldn't "dead but not yet removable" be high if there were long running transactions holding onto old row versions?
Reply | Threaded
Open this post in threaded view
|

Re: When do vacuumed pages/tuples become available for reuse?

Tom Lane-2
In reply to this post by Alvaro Herrera-9
Alvaro Herrera <[hidden email]> writes:
> On 2019-Apr-11, rihad wrote:
>> 2019-04-11 19:39:44.450844500   tuples: 19150 removed, 2725811 remain, 465 are dead but not yet removable

> What Jeff said.  This vacuum spent a lot of time, only to remove miserly
> 19k tuples, but 2.7M dead tuples remained ... probably because you have
> long-running transactions preventing vacuum from removing them.

I think you misread it --- I'm pretty sure "N remain" is referring
to live tuples.  Maybe we should adjust the wording to make that
clearer?

                        regards, tom lane


12