pg_wal fills up on big update query

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

pg_wal fills up on big update query

Daniel Fink (PDF)

Hi all,

 

I have a migration where I

·         Add a new nullable column to a table

·         update almost every row in this big table (8 million rows) from another table where I set this new column

 

I have also a replication setup running.

The database has a size of around 20GB.

While the migration is running, it more than doubles is size and fills up all space.

Then the migration fails and is rolled back.

 

What is the best way of keeping this from happening?

My current idea is to lock both tables completely from access (the queried and the updated one) so that postgresql does not have to ensure isolation for concurrent queries by keeping a copy of each row.

Is my thinking here correct?

 

Thanks in advance and Best Regards,

Daniel


This message may contain confidential and privileged information. If it has been sent to you in error, please reply to advise the sender of the error and then immediately permanently delete it and all attachments to it from your systems. If you are not the intended recipient, do not read, copy, disclose or otherwise use this message or any attachments to it. The sender disclaims any liability for such unauthorized use. PLEASE NOTE that all incoming e-mails sent to PDF e-mail accounts will be archived and may be scanned by us and/or by external service providers to detect and prevent threats to our systems, investigate illegal or inappropriate behavior, and/or eliminate unsolicited promotional e-mails (“spam”). If you have any concerns about this process, please contact us at [hidden email].
lup
Reply | Threaded
Open this post in threaded view
|

Re: pg_wal fills up on big update query

lup


On Aug 7, 2019, at 7:34 AM, Daniel Fink (PDF) <[hidden email]> wrote:

Hi all,

 

I have a migration where I

·         Add a new nullable column to a table

·         update almost every row in this big table (8 million rows) from another table where I set this new column

 

I have also a replication setup running.

The database has a size of around 20GB.

While the migration is running, it more than doubles is size and fills up all space.

Then the migration fails and is rolled back.

 

What is the best way of keeping this from happening?

My current idea is to lock both tables completely from access (the queried and the updated one) so that postgresql does not have to ensure isolation for concurrent queries by keeping a copy of each row.

Is my thinking here correct?

 

Thanks in advance and Best Regards,


Do the update in small chunks
Reply | Threaded
Open this post in threaded view
|

Re: pg_wal fills up on big update query

Luca Ferrari-2
In reply to this post by Daniel Fink (PDF)
On Wed, Aug 7, 2019 at 3:34 PM Daniel Fink (PDF) <[hidden email]> wrote:
> My current idea is to lock both tables completely from access (the queried and the updated one) so that postgresql does not have to ensure isolation for concurrent queries by keeping a copy of each row.

I'm not sure that locking will prevent the snapshotting and the WAL
machinery, but someone more expert on the are could clarify this.
Since the column is nullable, I would apply it outside of the
transaction, and then do the update. If that still fails, I would try
to split the update on small chunks (after all, it's an update, so it
is smething you can line up data).

Luca


Reply | Threaded
Open this post in threaded view
|

RE: pg_wal fills up on big update query

Daniel Fink (PDF)
In reply to this post by lup

Hi Rob,

 

Thanks, I will try.

It’s a bit of a  bummer though, because I just started to use flywaydb to manage migrations, and it wraps all migrations into a single transaction.

So I have to do this outside of the tool.

 

Best Regards,

Daniel

 

From: Rob Sargent [mailto:[hidden email]]
Sent: Wednesday, August 7, 2019 4:22 PM
To: Daniel Fink (PDF) <[hidden email]>
Cc: [hidden email]
Subject: Re: pg_wal fills up on big update query

 

 


On Aug 7, 2019, at 7:34 AM, Daniel Fink (PDF) <[hidden email]> wrote:

Hi all,

 

I have a migration where I

·       Add a new nullable column to a table

·       update almost every row in this big table (8 million rows) from another table where I set this new column

 

I have also a replication setup running.

The database has a size of around 20GB.

While the migration is running, it more than doubles is size and fills up all space.

Then the migration fails and is rolled back.

 

What is the best way of keeping this from happening?

My current idea is to lock both tables completely from access (the queried and the updated one) so that postgresql does not have to ensure isolation for concurrent queries by keeping a copy of each row.

Is my thinking here correct?

 

Thanks in advance and Best Regards,

 

Do the update in small chunks


This message may contain confidential and privileged information. If it has been sent to you in error, please reply to advise the sender of the error and then immediately permanently delete it and all attachments to it from your systems. If you are not the intended recipient, do not read, copy, disclose or otherwise use this message or any attachments to it. The sender disclaims any liability for such unauthorized use. PLEASE NOTE that all incoming e-mails sent to PDF e-mail accounts will be archived and may be scanned by us and/or by external service providers to detect and prevent threats to our systems, investigate illegal or inappropriate behavior, and/or eliminate unsolicited promotional e-mails (“spam”). If you have any concerns about this process, please contact us at [hidden email].
Reply | Threaded
Open this post in threaded view
|

RE: pg_wal fills up on big update query

Daniel Fink (PDF)
In reply to this post by Luca Ferrari-2
Hi Rob,

Thanks, I will try.
It’s a bit of a  bummer though, because I just started to use flywaydb to
manage migrations, and it wraps all migrations into a single transaction.
So I have to do this outside of the tool.
I will still try to evaluate if locking has any effect, to have a better
understanding of how postgres works under the hood.

Best Regards,
Daniel

-----Original Message-----
From: Luca Ferrari [mailto:[hidden email]]
Sent: Friday, August 9, 2019 3:58 PM
To: Daniel Fink (PDF) <[hidden email]>
Cc: pgsql-general <[hidden email]>
Subject: Re: pg_wal fills up on big update query

On Wed, Aug 7, 2019 at 3:34 PM Daniel Fink (PDF) <[hidden email]>
wrote:
> My current idea is to lock both tables completely from access (the queried
> and the updated one) so that postgresql does not have to ensure isolation
> for concurrent queries by keeping a copy of each row.

I'm not sure that locking will prevent the snapshotting and the WAL
machinery, but someone more expert on the are could clarify this.
Since the column is nullable, I would apply it outside of the transaction,
and then do the update. If that still fails, I would try to split the update
on small chunks (after all, it's an update, so it is smething you can line
up data).

Luca

--
This message may contain confidential and privileged information. If it has
been sent to you in error, please reply to advise the sender of the error
and then immediately permanently delete it and all attachments to it from
your systems. If you are not the intended recipient, do not read, copy,
disclose or otherwise use this message or any attachments to it. The sender
disclaims any liability for such unauthorized use.  PLEASE NOTE that all
incoming e-mails sent to PDF e-mail accounts will be archived and may be
scanned by us and/or by external service providers to detect and prevent
threats to our systems, investigate illegal or inappropriate behavior,
and/or eliminate unsolicited promotional e-mails (“spam”).  If you have any
concerns about this process, please contact us at [hidden email]
<mailto:[hidden email]>.


lup
Reply | Threaded
Open this post in threaded view
|

Re: pg_wal fills up on big update query

lup
In reply to this post by Daniel Fink (PDF)

flyway is an excellent tool, I use it too.  Just make a series of migrations.

On 8/9/19 9:03 AM, Daniel Fink (PDF) wrote:

Hi Rob,

 

Thanks, I will try.

It’s a bit of a  bummer though, because I just started to use flywaydb to manage migrations, and it wraps all migrations into a single transaction.

So I have to do this outside of the tool.

 

Best Regards,

Daniel

 

From: Rob Sargent [mailto:[hidden email]]
Sent: Wednesday, August 7, 2019 4:22 PM
To: Daniel Fink (PDF) <[hidden email]>
Cc: [hidden email]
Subject: Re: pg_wal fills up on big update query

 

 


On Aug 7, 2019, at 7:34 AM, Daniel Fink (PDF) <[hidden email]> wrote:

Hi all,

 

I have a migration where I

·       Add a new nullable column to a table

·       update almost every row in this big table (8 million rows) from another table where I set this new column

 

I have also a replication setup running.

The database has a size of around 20GB.

While the migration is running, it more than doubles is size and fills up all space.

Then the migration fails and is rolled back.

 

What is the best way of keeping this from happening?

My current idea is to lock both tables completely from access (the queried and the updated one) so that postgresql does not have to ensure isolation for concurrent queries by keeping a copy of each row.

Is my thinking here correct?

 

Thanks in advance and Best Regards,

 

Do the update in small chunks


This message may contain confidential and privileged information. If it has been sent to you in error, please reply to advise the sender of the error and then immediately permanently delete it and all attachments to it from your systems. If you are not the intended recipient, do not read, copy, disclose or otherwise use this message or any attachments to it. The sender disclaims any liability for such unauthorized use. PLEASE NOTE that all incoming e-mails sent to PDF e-mail accounts will be archived and may be scanned by us and/or by external service providers to detect and prevent threats to our systems, investigate illegal or inappropriate behavior, and/or eliminate unsolicited promotional e-mails (“spam”). If you have any concerns about this process, please contact us at [hidden email].