pg_upgrade: Pass -j down to vacuumdb

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
44 messages Options
123
Reply | Threaded
Open this post in threaded view
|

pg_upgrade: Pass -j down to vacuumdb

Jesper Pedersen
Hi Hackers,

Here is a patch that passes the -j option from pg_upgrade down to
vacuumdb if supported.

I'll add it to the January 'Fest.

Thanks for considering !

Best regards,
  Jesper

v1_0001-Pass-the-j-option-down-to-vacuumdb-if-supported.patch (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Peter Eisentraut-6
On 21/12/2018 11:12, Jesper Pedersen wrote:
> Here is a patch that passes the -j option from pg_upgrade down to
> vacuumdb if supported.

It's not clear to me that that is what is usually wanted.

The point of the post-upgrade analyze script is specifically to do the
analyze in a gentle fashion.  Mixing in parallelism would seem to defeat
that a bit.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Jesper Pedersen
Hi,

On 12/29/18 10:03 AM, Peter Eisentraut wrote:

> On 21/12/2018 11:12, Jesper Pedersen wrote:
>> Here is a patch that passes the -j option from pg_upgrade down to
>> vacuumdb if supported.
>
> It's not clear to me that that is what is usually wanted.
>
> The point of the post-upgrade analyze script is specifically to do the
> analyze in a gentle fashion.  Mixing in parallelism would seem to defeat
> that a bit.
>

Well, that really depends. The user passed -j to pg_upgrade in order for
the upgrade to happen faster, so maybe they would expect, as I would,
that the ANALYZE phase would happen in parallel too.

At least there should be a "notice" about what the script will do in
this case.

Best regards,
  Jesper

Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Peter Eisentraut-6
On 02/01/2019 20:47, Jesper Pedersen wrote:
> Well, that really depends. The user passed -j to pg_upgrade in order for
> the upgrade to happen faster, so maybe they would expect, as I would,
> that the ANALYZE phase would happen in parallel too.

pg_upgrade -j reduces the *downtime* caused by pg_upgrade during the
upgrade process.  Also, during said downtime, nothing else is happening,
so you can use all the resources of the machine.

Once the system is back up, you don't necessarily want to use all the
resources.  The analyze script is specifically written to run while
production traffic is active.  If you just want to run the analyze as
fast as possible, you can just run vacuumdb -j directly, without using
the script.

Moreover, it's not clear that pg_upgrade and vacuumdb are bound the same
way, so it's not a given that the same -j number should be used.

Perhaps more documentation would be useful here.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Jerry Sievers-3
Peter Eisentraut <[hidden email]> writes:

> On 02/01/2019 20:47, Jesper Pedersen wrote:
>
>> Well, that really depends. The user passed -j to pg_upgrade in order for
>> the upgrade to happen faster, so maybe they would expect, as I would,
>> that the ANALYZE phase would happen in parallel too.
>
> pg_upgrade -j reduces the *downtime* caused by pg_upgrade during the
> upgrade process.  Also, during said downtime, nothing else is happening,
> so you can use all the resources of the machine.
>
> Once the system is back up, you don't necessarily want to use all the
> resources.  The analyze script is specifically written to run while
> production traffic is active.  If you just want to run the analyze as
> fast as possible, you can just run vacuumdb -j directly, without using
> the script.

Peter, I'm skeptical here.

I might permit a connection to a just pg_upgraded DB prior to any
analyze being known finished only for the most trivial case.  At my site
however, *trivial* systems are a small minority.

In fact, our automated upgrade workflow uses our home-built parallel
analyzer which predates vacuumdb -j.  Apps are not allowed into the DB
until a fast 1st pass has been done.

We run it in 2 phases...

$all preceeding upgrade steps w/system locked out
analyze-lite (reduced stats target)
open DB for application traffic
analyze-full

Of course we are increasing downtime by disallowing app traffic till
finish of analyze-lite however the assumption is that many queries would
be too slow to attempt without full analyzer coverage, albiet at a
reduced stats target.

IMO this is certainly a case of no 1-size-fits-all solution so perhaps a
--analyze-jobs option :-)

FWIW

Thanks

> Moreover, it's not clear that pg_upgrade and vacuumdb are bound the same
> way, so it's not a given that the same -j number should be used.
>
> Perhaps more documentation would be useful here.

--
Jerry Sievers
Postgres DBA/Development Consulting
e: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Jesper Pedersen
In reply to this post by Peter Eisentraut-6
Hi Peter,

On 1/3/19 7:03 AM, Peter Eisentraut wrote:
> Perhaps more documentation would be useful here.
>

Here is v2 that just notes that the -j option isn't passed down.

Best regards,
  Jesper

v2_0001-Highlight-that-the-j-option-isn-t-passed-down-to-vac.patch (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

RE: pg_upgrade: Pass -j down to vacuumdb

Jamison, Kirk
Hi,

According to CF app, this patch needs review so I took a look at it.
Currently, your patch applies and builds cleanly, and all tests are also successful
based from the CF bot patch tester.

I'm not sure if I have understood the argument raised by Peter correctly.
Quoting Peter, "it's not clear that pg_upgrade and vacuumdb are bound the same way, so it's not a given that the same -j number should be used."
I think it's whether the # jobs for pg_upgrade is used the same way for parallel vacuum.

According to the official docs, the recommended setting for pg_upgrade --j option is the maximum of the number of CPU cores and tablespaces. [1]
As for vacuumdb, parallel vacuum's (-j) recommended setting is based from your max_connections [2], which is the max # of concurrent connections to your db server.

[1] https://www.postgresql.org/docs/current/pgupgrade.html
[2] https://www.postgresql.org/docs/current/app-vacuumdb.html


Regards,
Kirk Jamison
Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Michael Paquier-2
On Fri, Jan 25, 2019 at 02:31:57AM +0000, Jamison, Kirk wrote:
> I'm not sure if I have understood the argument raised by Peter
> correctly.  Quoting Peter, "it's not clear that pg_upgrade and
> vacuumdb are bound the same way, so it's not a given that the same
> -j number should be used."  I think it's whether the # jobs for
> pg_upgrade is used the same way for parallel vacuum.

vacuumdb and pg_upgrade are designed for different purposes and have
different properties, hence using a value of -j for one does not
necessarily mean that the same value should be used for the other.
An ANALYZE is nice because it is non-intrusive for a live production
system, and if you begin to pass down a -j argument you may finish by
eating more resources that would have been necessary for the same
job.  For some users perhaps that matters, for some it does not.  So
based on that linking the value used by pg_upgrade and vacuumdb is a
bad concept in my opinion, and the patch should be rejected.  More
documentation on pg_upgrade side to explain that a bit better could be
a good idea though, as it is perfectly possible to use your own
post-upgrade script or rewrite partially the generated one.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Peter Eisentraut-6
On 25/01/2019 11:28, Michael Paquier wrote:
> based on that linking the value used by pg_upgrade and vacuumdb is a
> bad concept in my opinion, and the patch should be rejected.  More
> documentation on pg_upgrade side to explain that a bit better could be
> a good idea though, as it is perfectly possible to use your own
> post-upgrade script or rewrite partially the generated one.

Right.  pg_upgrade doesn't actually call vacuumdb.  It creates a script
that you may use.  The script itself contains a comment that says, if
you want to do this as fast as possible, don't use this script.  That
comment could be enhanced to suggest the use of the -j option.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Jesper Pedersen
In reply to this post by Jamison, Kirk
Hi Kirk,

On 1/24/19 9:31 PM, Jamison, Kirk wrote:

> According to CF app, this patch needs review so I took a look at it.
> Currently, your patch applies and builds cleanly, and all tests are also successful
> based from the CF bot patch tester.
>
> I'm not sure if I have understood the argument raised by Peter correctly.
> Quoting Peter, "it's not clear that pg_upgrade and vacuumdb are bound the same way, so it's not a given that the same -j number should be used."
> I think it's whether the # jobs for pg_upgrade is used the same way for parallel vacuum.
>
> According to the official docs, the recommended setting for pg_upgrade --j option is the maximum of the number of CPU cores and tablespaces. [1]
> As for vacuumdb, parallel vacuum's (-j) recommended setting is based from your max_connections [2], which is the max # of concurrent connections to your db server.
>

Thanks for your feedback !

As per Peter's comments I have changed the patch (v2) to not pass down
the -j option to vacuumdb.

Only an update to the documentation and console output is made in order
to make it more clear.

Best regards,
  Jesper

Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Alvaro Herrera-9
In reply to this post by Peter Eisentraut-6
On 2019-Jan-25, Peter Eisentraut wrote:

> On 25/01/2019 11:28, Michael Paquier wrote:
> > based on that linking the value used by pg_upgrade and vacuumdb is a
> > bad concept in my opinion, and the patch should be rejected.  More
> > documentation on pg_upgrade side to explain that a bit better could be
> > a good idea though, as it is perfectly possible to use your own
> > post-upgrade script or rewrite partially the generated one.
>
> Right.  pg_upgrade doesn't actually call vacuumdb.  It creates a script
> that you may use.  The script itself contains a comment that says, if
> you want to do this as fast as possible, don't use this script.  That
> comment could be enhanced to suggest the use of the -j option.

So let's have it write with a $VACUUMDB_OPTS variable, which is by
default defined as empty but with a comment suggesting that maybe the
user wants to add the -j option.  This way, if they have to edit it,
they only have to edit the VACUUMDB_OPTS line instead of each of the two
vacuumdb lines.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Tom Lane-2
Alvaro Herrera <[hidden email]> writes:
> So let's have it write with a $VACUUMDB_OPTS variable, which is by
> default defined as empty but with a comment suggesting that maybe the
> user wants to add the -j option.  This way, if they have to edit it,
> they only have to edit the VACUUMDB_OPTS line instead of each of the two
> vacuumdb lines.

Even better, allow the script to absorb a value from the environment,
so that it needn't be edited at all.

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Michael Paquier-2
On Fri, Jan 25, 2019 at 12:16:49PM -0500, Tom Lane wrote:
> Alvaro Herrera <[hidden email]> writes:
>> So let's have it write with a $VACUUMDB_OPTS variable, which is by
>> default defined as empty but with a comment suggesting that maybe the
>> user wants to add the -j option.  This way, if they have to edit it,
>> they only have to edit the VACUUMDB_OPTS line instead of each of the two
>> vacuumdb lines.
>
> Even better, allow the script to absorb a value from the environment,
> so that it needn't be edited at all.

+1.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

RE: pg_upgrade: Pass -j down to vacuumdb

Jamison, Kirk
In reply to this post by Jesper Pedersen
Hi Jesper,

On Friday, January 25, 2019, Jesper Pedersen <[hidden email]>
> Thanks for your feedback !
>
> As per Peter's comments I have changed the patch (v2) to not pass down the -j option to vacuumdb.
>
> Only an update to the documentation and console output is made in order to make it more clear.

Yeah, I understood from the references that the parallel option use
for vacuumdb and pg_upgrade are different. I was also getting
clarification whether pg_upgrade gets to pass it down to vacuumdb.
Based from other Michael's and Peter's replies, I now understand it.

There were also helpful comments from the developers above,
pointing it to the right direction.
In addition to Peter's comment, quoting "...if you want to do this
as fast as possible, don't use this script.  That comment could be
enhanced to suggest the use of the -j option.", so I think you
should also update the following script in create_script_for_cluster_analyze():
     fprintf(script, "echo %sIf you would like default statistics as quickly as possible, cancel%s\n",
             ECHO_QUOTE, ECHO_QUOTE);

There were also advice from Tom and Alvaro that you can incorporate
further in your next patch.

>Alvaro Herrera <[hidden email]> writes:
>> So let's have it write with a $VACUUMDB_OPTS variable, which is by
>> default defined as empty but with a comment suggesting that maybe the
>> user wants to add the -j option.  This way, if they have to edit it,
>> they only have to edit the VACUUMDB_OPTS line instead of each of the
>> two vacuumdb lines.
>
Tom Lane wrote:
>Even better, allow the script to absorb a value from the environment, so that it needn't be edited at all.

Regards,
Kirk Jamison
Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Jesper Pedersen
Hi,

On 1/27/19 7:50 PM, Jamison, Kirk wrote:
> There were also helpful comments from the developers above,
> pointing it to the right direction.
> In addition to Peter's comment, quoting "...if you want to do this
> as fast as possible, don't use this script.  That comment could be
> enhanced to suggest the use of the -j option.", so I think you
> should also update the following script in create_script_for_cluster_analyze():
>       fprintf(script, "echo %sIf you would like default statistics as quickly as possible, cancel%s\n",
>               ECHO_QUOTE, ECHO_QUOTE);
>

Yes, it will echo the -j option if passed, and the server supports it.

>> Alvaro Herrera <[hidden email]> writes:
>>> So let's have it write with a $VACUUMDB_OPTS variable, which is by
>>> default defined as empty but with a comment suggesting that maybe the
>>> user wants to add the -j option.  This way, if they have to edit it,
>>> they only have to edit the VACUUMDB_OPTS line instead of each of the
>>> two vacuumdb lines.
>>
> Tom Lane wrote:
>> Even better, allow the script to absorb a value from the environment, so that it needn't be edited at all.
>
Done.

Attached is v3.

Best regards,
  Jesper

v3_0001-Highlight-that-the-j-option-isn-t-passed-down-to-vac.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

fabriziomello

On Mon, Jan 28, 2019 at 1:15 PM Jesper Pedersen <[hidden email]> wrote:
>
> Done.
>
> Attached is v3.
>

IMHO you should use long option name '--jobs' like others.

Regards,

--
   Fabrízio de Royes Mello         Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Jesper Pedersen
On 1/28/19 11:30 AM, Fabrízio de Royes Mello wrote:
> IMHO you should use long option name '--jobs' like others.
>

Thanks for your feedback !

Done, and v4 attached.

Best regards,
  Jesper

v4_0001-Highlight-that-the-jobs-option-isn-t-passed-down-to-.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Peter Eisentraut-6
On 28/01/2019 17:47, Jesper Pedersen wrote:
> On 1/28/19 11:30 AM, Fabrízio de Royes Mello wrote:
>> IMHO you should use long option name '--jobs' like others.
>>
>
> Thanks for your feedback !
>
> Done, and v4 attached.

I would drop the changes in pgupgrade.sgml.  We don't need to explain
what doesn't happen, when nobody before now ever thought that it would
happen.

Also, we don't need the versioning stuff.  The new cluster in pg_upgrade
is always of the current version, and we know what that one supports.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: pg_upgrade: Pass -j down to vacuumdb

Jesper Pedersen
On 1/28/19 3:50 PM, Peter Eisentraut wrote:
>> Done, and v4 attached.
>
> I would drop the changes in pgupgrade.sgml.  We don't need to explain
> what doesn't happen, when nobody before now ever thought that it would
> happen.
>
> Also, we don't need the versioning stuff.  The new cluster in pg_upgrade
> is always of the current version, and we know what that one supports.
>

Done.

Best regards,
  Jesper

v5_0001-Highlight-that-the-jobs-option-isn-t-passed-down-to-.patch (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

RE: pg_upgrade: Pass -j down to vacuumdb

Jamison, Kirk
Hi Jesper,

Thanks for updating your patches quickly.

>On 1/28/19 3:50 PM, Peter Eisentraut wrote:
>>> Done, and v4 attached.
>>
>> I would drop the changes in pgupgrade.sgml.  We don't need to explain
>> what doesn't happen, when nobody before now ever thought that it would
>> happen.
>>
>> Also, we don't need the versioning stuff.  The new cluster in
>> pg_upgrade is always of the current version, and we know what that one supports.
>>

>Done.

I just checked the patch.
As per advice, you removed the versioning and specified --jobs.
The patch still applies, builds and passed the tests successfully.

I'd argue though that it's helpful to find a short documentation to clarify
that it's not pass down by to vacuumdb default. I think the initial doc
change from the previous patch would be good.
+     machine. Note, that this option isn't passed to the
+     <application>vacuumdb</application> application by default.
Just remove the comma symbol from "Note, that..."


Regards,
Kirk Jamison
123