Unable to Vacuum Large Defragmented Table

Unable to Vacuum Large Defragmented Table

Igal Sapir
I have a table for which pg_relation_size() shows only 31MB, but pg_total_relation_size() shows a whopping 84GB.

The database engine is running inside a Docker container, with the data mounted as a volume from a partition on the host's file system.

When I try to run `VACUUM FULL`, the disk usage goes up until it reaches the full capacity of the partition (about 27GB of free space), at which point it fails.

How can I reclaim the disk space here other than write the data to a new table and drop the old one?

Thank you,

Igal




Re: Unable to Vacuum Large Defragmented Table

Adrian Klaver-4
On 4/7/19 3:08 PM, Igal Sapir wrote:
> I have a table for which pg_relation_size() shows only 31MB, but
> pg_total_relation_size() shows a whopping 84GB.

What does:

pg_indexes_size()

show?

Also how many indexes are on the table?
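For example, something like this breaks the total down ('my_table' here is a placeholder; substitute the real table name):

```sql
-- Break pg_total_relation_size() into its components.
SELECT pg_size_pretty(pg_relation_size('my_table'))       AS heap,
       pg_size_pretty(pg_indexes_size('my_table'))        AS indexes,
       pg_size_pretty(pg_total_relation_size('my_table')
                      - pg_relation_size('my_table')
                      - pg_indexes_size('my_table'))      AS toast_and_other,
       pg_size_pretty(pg_total_relation_size('my_table')) AS total;
```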

>
> The database engine is running inside a Docker container, with the data
> mounted as a volume from a partition on the host's file system.
>
> When I try to run `VACUUM FULL`, the disk usage goes up until it reaches
> the full capacity of the partition (about 27GB of free space), at which
> point it fails.

Yeah it would:

https://www.postgresql.org/docs/11/sql-vacuum.html
"Selects “full” vacuum, which can reclaim more space, but takes much
longer and exclusively locks the table. This method also requires extra
disk space, since it writes a new copy of the table and doesn't release
the old copy until the operation is complete. ..."

>
> How can I reclaim the disk space here other than write the data to a new
> table and drop the old one?
>
> Thank you,
>
> Igal
>
>
>


--
Adrian Klaver
[hidden email]



Re: Unable to Vacuum Large Defragmented Table

David Rowley-3
In reply to this post by Igal Sapir
On Mon, 8 Apr 2019 at 10:09, Igal Sapir <[hidden email]> wrote:
>
> I have a table for which pg_relation_size() shows only 31MB, but pg_total_relation_size() shows a whopping 84GB.
>
> The database engine is running inside a Docker container, with the data mounted as a volume from a partition on the host's file system.
>
> When I try to run `VACUUM FULL`, the disk usage goes up until it reaches the full capacity of the partition (about 27GB of free space), at which point it fails.

That sort of indicates that the table might not be as bloated as you
seem to think it is.  Remember that variable length attributes can be
toasted and stored in the relation's toast table.
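To check, you can look up the relation's toast table and its size, e.g. (table name is a placeholder):

```sql
-- Resolve the toast table backing 'my_table' and report its size.
SELECT c.reltoastrelid::regclass AS toast_table,
       pg_size_pretty(pg_relation_size(c.reltoastrelid)) AS toast_size
FROM pg_class c
WHERE c.relname = 'my_table';
```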

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Unable to Vacuum Large Defragmented Table

Igal Sapir
David,

On Sun, Apr 7, 2019 at 6:20 PM David Rowley <[hidden email]> wrote:
On Mon, 8 Apr 2019 at 10:09, Igal Sapir <[hidden email]> wrote:
>
> I have a table for which pg_relation_size() shows only 31MB, but pg_total_relation_size() shows a whopping 84GB.
>
> The database engine is running inside a Docker container, with the data mounted as a volume from a partition on the host's file system.
>
> When I try to run `VACUUM FULL`, the disk usage goes up until it reaches the full capacity of the partition (about 27GB of free space), at which point it fails.

That sort of indicates that the table might not be as bloated as you
seem to think it is.  Remember that variable length attributes can be
toasted and stored in the relation's toast table.

I think that you're on to something here.  The table has a JSONB column which has possibly been toasted.

I have deleted many rows from the table itself though, and still fail to reclaim disk space.  Is there something else I should do to delete the toasted data?

Thanks,

Igal

Re: Unable to Vacuum Large Defragmented Table

David Rowley-3
On Mon, 8 Apr 2019 at 14:19, Igal Sapir <[hidden email]> wrote:

>
> On Sun, Apr 7, 2019 at 6:20 PM David Rowley <[hidden email]> wrote:
>>
>> On Mon, 8 Apr 2019 at 10:09, Igal Sapir <[hidden email]> wrote:
>> >
>> > I have a table for which pg_relation_size() shows only 31MB, but pg_total_relation_size() shows a whopping 84GB.
>> >
>> > The database engine is running inside a Docker container, with the data mounted as a volume from a partition on the host's file system.
>> >
>> > When I try to run `VACUUM FULL`, the disk usage goes up until it reaches the full capacity of the partition (about 27GB of free space), at which point it fails.
>>
>> That sort of indicates that the table might not be as bloated as you
>> seem to think it is.  Remember that variable length attributes can be
>> toasted and stored in the relation's toast table.
>
>
> I think that you're on to something here.  The table has a JSONB column which has possibly toasted.
>
> I have deleted many rows from the table itself though, and still fail to reclaim disk space.  Is there something else I should do to delete the toasted data?

The toast data is part of the data. It's just stored out of line since
there's a hard limit of just under 8k per tuple and since tuples
cannot span multiple pages, PostgreSQL internally breaks them into
chunks, possibly compresses them and stores them in the toast table.
This can occur for any variable length type.

This means if you want to remove the toast data, then you'll need to
remove the data from the main table, either in the form of deleting rows
or updating them to remove the toasted values.
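A rough sketch, assuming the wide values live in a JSONB column called data (both names are placeholders):

```sql
-- Dropping the wide value dereferences its toast chunks; the space
-- only becomes reusable after a subsequent VACUUM.
UPDATE my_table SET data = NULL WHERE id < 100000;
VACUUM my_table;
```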

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Unable to Vacuum Large Defragmented Table

Igal Sapir
David,

On Sun, Apr 7, 2019 at 7:28 PM David Rowley <[hidden email]> wrote:
On Mon, 8 Apr 2019 at 14:19, Igal Sapir <[hidden email]> wrote:
>
> On Sun, Apr 7, 2019 at 6:20 PM David Rowley <[hidden email]> wrote:
>>
>> On Mon, 8 Apr 2019 at 10:09, Igal Sapir <[hidden email]> wrote:
>> >
>> > I have a table for which pg_relation_size() shows only 31MB, but pg_total_relation_size() shows a whopping 84GB.
>> >
>> > The database engine is running inside a Docker container, with the data mounted as a volume from a partition on the host's file system.
>> >
>> > When I try to run `VACUUM FULL`, the disk usage goes up until it reaches the full capacity of the partition (about 27GB of free space), at which point it fails.
>>
>> That sort of indicates that the table might not be as bloated as you
>> seem to think it is.  Remember that variable length attributes can be
>> toasted and stored in the relation's toast table.
>
>
> I think that you're on to something here.  The table has a JSONB column which has possibly toasted.
>
> I have deleted many rows from the table itself though, and still fail to reclaim disk space.  Is there something else I should do to delete the toasted data?

The toast data is part of the data. It's just stored out of line since
there's a hard limit of just under 8k per tuple and since tuples
cannot span multiple pages, PostgreSQL internally breaks them into
chunks, possibly compresses them and stores them in the toast table.
This can occur for any variable length type.

This means if you want to remove the toast data, then you'll need to
remove the data from the main table, either the form of deleting rows
or updating them to remove the toasted values.

However, I have now deleted about 50,000 rows more and the table has only 119,688 rows.  The pg_relation_size() still shows 31MB and pg_total_relation_size() still shows 84GB.

It doesn't make sense that after deleting about 30% of the rows the values here do not change.

Attempting to copy the data to a different table results in the out of disk error as well, so that is in line with your assessment.  But it actually demonstrates the problem: the new table to which the data was copied (though the copy failed due to running out of disk) shows 0 rows, but pg_total_relation_size() for that table shows 27GB.  So now I have an "empty" table that takes 27GB of disk space.

This is mostly transient data, so I don't mind deleting rows, but if some day this could happen in production then I have to know how to deal with it without losing all of the data.

Thanks,

Igal

Re: Unable to Vacuum Large Defragmented Table

David Rowley-3
On Mon, 8 Apr 2019 at 14:57, Igal Sapir <[hidden email]> wrote:
> However, I have now deleted about 50,000 rows more and the table has only 119,688 rows.  The pg_relation_size() still shows 31MB and pg_total_relation_size() still shows 84GB.
>
> It doesn't make sense that after deleting about 30% of the rows the values here do not change.

deleting rows does not make the table any smaller, it just creates
dead rows in the table.  VACUUM tries to release the space used by
those dead rows and turns it back into free space.  Normal vacuum (not
FULL) can only shrink the table if completely empty pages are found at
the end of the table.

> Attempting to copy the data to a different table results in the out of disk error as well, so that is in line with your assessment.  But it actually just shows the problem.  The new table to which the data was copied (though failed due to out of disk) shows 0 rows, but  pg_total_relation_size() for that table shows 27GB.  So now I have an "empty" table that takes 27GB of disk space.

I think the best thing to do is perform a normal VACUUM on the table,
then CREATE EXTENSION IF NOT EXISTS pgstattuple; then do: SELECT *
FROM pgstattuple('<tablename>'); and the same again on the toast table.
If your table still contains many dead rows then perhaps an open
transaction is stopping rows from being turned into free space. Once
pgstattuple reports that "tuple_len" for the table, its toast table
and all its indexes has been reduced to an acceptable value then you
should try a VACUUM FULL.  Remember that VACUUM FULL must also write
WAL, so if WAL is on the same volume, then you'll need to consider
space required for that when deciding how much data to remove from the
table.
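Spelled out (with 'my_table' standing in for the real name), that's something like:

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Stats for the main table...
SELECT * FROM pgstattuple('my_table');

-- ...and for its toast table, resolved via pg_class.
SELECT * FROM pgstattuple((SELECT reltoastrelid
                           FROM pg_class
                           WHERE relname = 'my_table'));
```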

> This is mostly transient data, so I don't mind deleting rows, but if some day this could happen in production then I have to know how to deal with it without losing all of the data.

For the future, it would be better to delete more often than waiting
until the table grows too large.  A normal VACUUM will turn space used
by dead tuples back into free space, so if done often enough there
won't be a need to vacuum full.
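For example, a periodic job along these lines (column name and cutoff are hypothetical):

```sql
-- Trim expired rows regularly; plain VACUUM then recycles the space
-- without taking an exclusive lock or rewriting the table.
DELETE FROM my_table WHERE created_at < now() - interval '30 days';
VACUUM my_table;
```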

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Unable to Vacuum Large Defragmented Table

Igal Sapir
David,

On Sun, Apr 7, 2019 at 8:11 PM David Rowley <[hidden email]> wrote:
On Mon, 8 Apr 2019 at 14:57, Igal Sapir <[hidden email]> wrote:
> However, I have now deleted about 50,000 rows more and the table has only 119,688 rows.  The pg_relation_size() still shows 31MB and pg_total_relation_size() still shows 84GB.
>
> It doesn't make sense that after deleting about 30% of the rows the values here do not change.

deleting rows does not make the table any smaller, it just creates
dead rows in the table.  VACUUM tries to release the space used by
those dead rows and turns it back into free space.  Normal vacuum (not
FULL) can only shrink the table if completely empty pages are found at
the end of the table.

ACK
 

> Attempting to copy the data to a different table results in the out of disk error as well, so that is in line with your assessment.  But it actually just shows the problem.  The new table to which the data was copied (though failed due to out of disk) shows 0 rows, but  pg_total_relation_size() for that table shows 27GB.  So now I have an "empty" table that takes 27GB of disk space.

I think the best thing to do is perform a normal VACUUM on the table

Running VACUUM on the newly created table cleared the 27GB so that's good (I was planning to run normal VACUUM but ran FULL).
 
then CREATE EXTENSION IF NOT EXISTS pgstattuple; then do; SELECT *
FROM pgstattuple('<tablename>); and the same again on the toast table.
If your table still contains many dead rows then perhaps an open
transaction is stopping rows from being turned into free space.

I am not sure how to read the below.  I see a lot of "free_space" but not "dead":

-[ RECORD 1 ]------+------------
?column?           | primary
table_len          | 32055296
tuple_count        | 120764
tuple_len          | 9470882
tuple_percent      | 29.55
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 20713580
free_percent       | 64.62
-[ RECORD 2 ]------+------------
?column?           | toast
table_len          | 88802156544
tuple_count        | 15126830
tuple_len          | 30658625743
tuple_percent      | 34.52
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 57653329312
free_percent       | 64.92

 
Once pgstattuples reports that "tuple_len" from the table, its toast table
and all its indexes has been reduced to an acceptable value then you
should try a VACUUM FULL.  Remember that VACUUM FULL must also write
WAL, so if WAL is on the same volume, then you'll need to consider
space required for that when deciding how much data to remove from the
table.

WAL is on the same volume.  The PGDATA directory is mounted in a Docker container.

Isn't there any way to do an in-place VACUUM or pause the WAL at the risk of losing some data if recovery is required?

There is a catch-22 here.  I can't reclaim the disk space because that requires disk space.  Surely I'm not the first one to have encountered that problem with Postgres.
 

> This is mostly transient data, so I don't mind deleting rows, but if some day this could happen in production then I have to know how to deal with it without losing all of the data.

For the future, it would be better to delete more often than waiting
until the table grows too large.  A normal VACUUM will turn space used
by dead tuples back into free space, so if done often enough there
won't be a need to vacuum full.

ACK.  This issue came up while implementing a retention policy that will be enforced regularly.

Thank you for all of your help,

Igal
 
Reply | Threaded
Open this post in threaded view
|

Re: Unable to Vacuum Large Defragmented Table

Pavel Stehule


On Mon, Apr 8, 2019 at 7:57 AM Igal Sapir <[hidden email]> wrote:
David,

On Sun, Apr 7, 2019 at 8:11 PM David Rowley <[hidden email]> wrote:
On Mon, 8 Apr 2019 at 14:57, Igal Sapir <[hidden email]> wrote:
> However, I have now deleted about 50,000 rows more and the table has only 119,688 rows.  The pg_relation_size() still shows 31MB and pg_total_relation_size() still shows 84GB.
>
> It doesn't make sense that after deleting about 30% of the rows the values here do not change.

deleting rows does not make the table any smaller, it just creates
dead rows in the table.  VACUUM tries to release the space used by
those dead rows and turns it back into free space.  Normal vacuum (not
FULL) can only shrink the table if completely empty pages are found at
the end of the table.

ACK
 

> Attempting to copy the data to a different table results in the out of disk error as well, so that is in line with your assessment.  But it actually just shows the problem.  The new table to which the data was copied (though failed due to out of disk) shows 0 rows, but  pg_total_relation_size() for that table shows 27GB.  So now I have an "empty" table that takes 27GB of disk space.

I think the best thing to do is perform a normal VACUUM on the table

Running VACUUM on the newly created table cleared the 27GB so that's good (I was planning to run normal VACUUM but ran FULL).

you can drop some indexes, then you can run vacuum full, and create dropped indexes again.


 
then CREATE EXTENSION IF NOT EXISTS pgstattuple; then do; SELECT *
FROM pgstattuple('<tablename>); and the same again on the toast table.
If your table still contains many dead rows then perhaps an open
transaction is stopping rows from being turned into free space.

I am not sure how to read the below.  I see a lot of "free_space" but not "dead":

-[ RECORD 1 ]------+------------
?column?           | primary
table_len          | 32055296
tuple_count        | 120764
tuple_len          | 9470882
tuple_percent      | 29.55
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 20713580
free_percent       | 64.62
-[ RECORD 2 ]------+------------
?column?           | toast
table_len          | 88802156544
tuple_count        | 15126830
tuple_len          | 30658625743
tuple_percent      | 34.52
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 57653329312
free_percent       | 64.92


It says your table can be reduced by about 60%.


 
Once pgstattuples reports that "tuple_len" from the table, its toast table
and all its indexes has been reduced to an acceptable value then you
should try a VACUUM FULL.  Remember that VACUUM FULL must also write
WAL, so if WAL is on the same volume, then you'll need to consider
space required for that when deciding how much data to remove from the
table.

WAL is on the same volume.  The PGDATA directory is mounted in a Docker container.

Isn't there any way to do an in-place VACUUM or pause the WAL at the risk of losing some data if recovery is required?

There is a catch-22 here.  I can't reclaim the disk space because that requires disk space.  Surely I'm not the first one to have encountered that problem with Postgres.
 

> This is mostly transient data, so I don't mind deleting rows, but if some day this could happen in production then I have to know how to deal with it without losing all of the data.

For the future, it would be better to delete more often than waiting
until the table grows too large.  A normal VACUUM will turn space used
by dead tuples back into free space, so if done often enough there
won't be a need to vacuum full.

ACK.  This issue came up while implementing a retention policy that will be enforced regularly.

Thank you for all of your help,

Igal
 

Re: Unable to Vacuum Large Defragmented Table

Igal Sapir
Pavel,

On Sun, Apr 7, 2019 at 11:22 PM Pavel Stehule <[hidden email]> wrote:

On Mon, Apr 8, 2019 at 7:57 AM Igal Sapir <[hidden email]> wrote:
David,

On Sun, Apr 7, 2019 at 8:11 PM David Rowley <[hidden email]> wrote:
On Mon, 8 Apr 2019 at 14:57, Igal Sapir <[hidden email]> wrote:
> However, I have now deleted about 50,000 rows more and the table has only 119,688 rows.  The pg_relation_size() still shows 31MB and pg_total_relation_size() still shows 84GB.
>
> It doesn't make sense that after deleting about 30% of the rows the values here do not change.

deleting rows does not make the table any smaller, it just creates
dead rows in the table.  VACUUM tries to release the space used by
those dead rows and turns it back into free space.  Normal vacuum (not
FULL) can only shrink the table if completely empty pages are found at
the end of the table.

ACK
 

> Attempting to copy the data to a different table results in the out of disk error as well, so that is in line with your assessment.  But it actually just shows the problem.  The new table to which the data was copied (though failed due to out of disk) shows 0 rows, but  pg_total_relation_size() for that table shows 27GB.  So now I have an "empty" table that takes 27GB of disk space.

I think the best thing to do is perform a normal VACUUM on the table

Running VACUUM on the newly created table cleared the 27GB so that's good (I was planning to run normal VACUUM but ran FULL).

you can drop some indexes, then you can run vacuum full, and create dropped indexes again.

The table does not have any indexes.  It is mostly an append-only table.
 


 
then CREATE EXTENSION IF NOT EXISTS pgstattuple; then do; SELECT *
FROM pgstattuple('<tablename>); and the same again on the toast table.
If your table still contains many dead rows then perhaps an open
transaction is stopping rows from being turned into free space.

I am not sure how to read the below.  I see a lot of "free_space" but not "dead":

-[ RECORD 1 ]------+------------
?column?           | primary
table_len          | 32055296
tuple_count        | 120764
tuple_len          | 9470882
tuple_percent      | 29.55
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 20713580
free_percent       | 64.62
-[ RECORD 2 ]------+------------
?column?           | toast
table_len          | 88802156544
tuple_count        | 15126830
tuple_len          | 30658625743
tuple_percent      | 34.52
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 57653329312
free_percent       | 64.92


it say, so your table can be reduced about 60%

That's what I thought, and releasing 65% of 84GB would be major here, but unfortunately I am unable to release it because VACUUM FULL requires more space than I currently have available.

Perhaps disabling the WAL, if possible, could help VACUUM FULL complete.  Or some way to do an in-place VACUUM so that it doesn't write all the data to a new table.

Thank you,

Igal

 


 

Re: Unable to Vacuum Large Defragmented Table

Pavel Stehule


On Mon, Apr 8, 2019 at 5:22 PM Igal Sapir <[hidden email]> wrote:
Pavel,

On Sun, Apr 7, 2019 at 11:22 PM Pavel Stehule <[hidden email]> wrote:

On Mon, Apr 8, 2019 at 7:57 AM Igal Sapir <[hidden email]> wrote:
David,

On Sun, Apr 7, 2019 at 8:11 PM David Rowley <[hidden email]> wrote:
On Mon, 8 Apr 2019 at 14:57, Igal Sapir <[hidden email]> wrote:
> However, I have now deleted about 50,000 rows more and the table has only 119,688 rows.  The pg_relation_size() still shows 31MB and pg_total_relation_size() still shows 84GB.
>
> It doesn't make sense that after deleting about 30% of the rows the values here do not change.

deleting rows does not make the table any smaller, it just creates
dead rows in the table.  VACUUM tries to release the space used by
those dead rows and turns it back into free space.  Normal vacuum (not
FULL) can only shrink the table if completely empty pages are found at
the end of the table.

ACK
 

> Attempting to copy the data to a different table results in the out of disk error as well, so that is in line with your assessment.  But it actually just shows the problem.  The new table to which the data was copied (though failed due to out of disk) shows 0 rows, but  pg_total_relation_size() for that table shows 27GB.  So now I have an "empty" table that takes 27GB of disk space.

I think the best thing to do is perform a normal VACUUM on the table

Running VACUUM on the newly created table cleared the 27GB so that's good (I was planning to run normal VACUUM but ran FULL).

you can drop some indexes, then you can run vacuum full, and create dropped indexes again.

The table does not have any indexes.  It is mostly an append-only table.
 


 
then CREATE EXTENSION IF NOT EXISTS pgstattuple; then do; SELECT *
FROM pgstattuple('<tablename>); and the same again on the toast table.
If your table still contains many dead rows then perhaps an open
transaction is stopping rows from being turned into free space.

I am not sure how to read the below.  I see a lot of "free_space" but not "dead":

-[ RECORD 1 ]------+------------
?column?           | primary
table_len          | 32055296
tuple_count        | 120764
tuple_len          | 9470882
tuple_percent      | 29.55
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 20713580
free_percent       | 64.62
-[ RECORD 2 ]------+------------
?column?           | toast
table_len          | 88802156544
tuple_count        | 15126830
tuple_len          | 30658625743
tuple_percent      | 34.52
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 57653329312
free_percent       | 64.92


it say, so your table can be reduced about 60%

That's what I thought, and releasing 65% of 84GB would be major here, but unfortunately I am unable to release it because VACUUM FULL requires more space than I currently have available.

Perhaps disabling the WAL, if possible, could help VACUUM FULL complete.  Or some way to do an in-place VACUUM so that it doesn't write all the data to a new table.

maybe this article can be interesting for you







Re: Unable to Vacuum Large Defragmented Table

Igal Sapir
Pavel,

On Mon, Apr 8, 2019 at 8:29 AM Pavel Stehule <[hidden email]> wrote:


On Mon, Apr 8, 2019 at 5:22 PM Igal Sapir <[hidden email]> wrote:
Pavel,

On Sun, Apr 7, 2019 at 11:22 PM Pavel Stehule <[hidden email]> wrote:

On Mon, Apr 8, 2019 at 7:57 AM Igal Sapir <[hidden email]> wrote:
David,

On Sun, Apr 7, 2019 at 8:11 PM David Rowley <[hidden email]> wrote:
On Mon, 8 Apr 2019 at 14:57, Igal Sapir <[hidden email]> wrote:
> However, I have now deleted about 50,000 rows more and the table has only 119,688 rows.  The pg_relation_size() still shows 31MB and pg_total_relation_size() still shows 84GB.
>
> It doesn't make sense that after deleting about 30% of the rows the values here do not change.

deleting rows does not make the table any smaller, it just creates
dead rows in the table.  VACUUM tries to release the space used by
those dead rows and turns it back into free space.  Normal vacuum (not
FULL) can only shrink the table if completely empty pages are found at
the end of the table.

ACK
 

> Attempting to copy the data to a different table results in the out of disk error as well, so that is in line with your assessment.  But it actually just shows the problem.  The new table to which the data was copied (though failed due to out of disk) shows 0 rows, but  pg_total_relation_size() for that table shows 27GB.  So now I have an "empty" table that takes 27GB of disk space.

I think the best thing to do is perform a normal VACUUM on the table

Running VACUUM on the newly created table cleared the 27GB so that's good (I was planning to run normal VACUUM but ran FULL).

you can drop some indexes, then you can run vacuum full, and create dropped indexes again.

The table does not have any indexes.  It is mostly an append-only table.
 


 
then CREATE EXTENSION IF NOT EXISTS pgstattuple; then do; SELECT *
FROM pgstattuple('<tablename>); and the same again on the toast table.
If your table still contains many dead rows then perhaps an open
transaction is stopping rows from being turned into free space.

I am not sure how to read the below.  I see a lot of "free_space" but not "dead":

-[ RECORD 1 ]------+------------
?column?           | primary
table_len          | 32055296
tuple_count        | 120764
tuple_len          | 9470882
tuple_percent      | 29.55
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 20713580
free_percent       | 64.62
-[ RECORD 2 ]------+------------
?column?           | toast
table_len          | 88802156544
tuple_count        | 15126830
tuple_len          | 30658625743
tuple_percent      | 34.52
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 57653329312
free_percent       | 64.92


it say, so your table can be reduced about 60%

That's what I thought, and releasing 65% of 84GB would be major here, but unfortunately I am unable to release it because VACUUM FULL requires more space than I currently have available.

Perhaps disabling the WAL, if possible, could help VACUUM FULL complete.  Or some way to do an in-place VACUUM so that it doesn't write all the data to a new table.

maybe this article can be interesting for you



It's very interesting, thank you.

I'm trying now to figure out the pointer on the primary table and the TOAST table.  The TOAST table has fields like chunk_id and chunk_seq, so I'm looking for the data in each row of the primary table that points to those in each toast-ed row.
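For reference, the chunks themselves can be listed directly; the toast table name below is a placeholder (it can be resolved via reltoastrelid in pg_class):

```sql
-- Largest toasted values by number of chunks.
SELECT chunk_id, count(*) AS chunks
FROM pg_toast.pg_toast_12345
GROUP BY chunk_id
ORDER BY chunks DESC
LIMIT 10;
```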

Any ideas?

Thanks,

Igal

p.s. Unfortunately, pg_repack and pgcompact did not work in my case.  They report that there is nothing to do, even though I know that there is 2/3 bloat in the TOAST table.
 



