expose parallel leader in CSV and log_line_prefix

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
37 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Justin Pryzby
On Fri, Jul 17, 2020 at 05:27:21PM -0400, Alvaro Herrera wrote:

> On 2020-Jul-17, Justin Pryzby wrote:
> > Ok, but should we then consider changing pg_stat_activity for consistency ?
> > Probably in v13 to avoid changing it a year later.
> > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b025f32e0b5d7668daec9bfa957edf3599f4baa8
> >
> > I think the story is that we're exposing to the user a "leader pid" what's
> > internally called (and used as) the "lock group leader", which for the leader
> > process is set to its own PID.  But I think what we're exposing as leader_pid
> > will seem like an implementation artifact to users.
>
> IMO it *is* an implementation artifact if, as you say, the leader PID
> remains set after the parallel query is done.  I mentioned the pgbouncer
> case before: if you run a single parallel query, then the process
> remains a "parallel leader" for days or weeks afterwards even if it
> hasn't run a parallel query ever since.  That doesn't sound great to me.
>
> I think it's understandable and OK if there's a small race condition
> that means you report a process as a leader shortly before or shortly
> after a parallel query is actually executed.  But doing so until backend
> termination seems confusing as well as useless.

I'm not sure that connection pooling is the strongest argument against the
current behavior, but we could change it as suggested to show as NULL the
leader_pid for the leader's own process.  I think that's the intuitive behavior
a user expects.  Parallel processes are those with leader_pid IS NOT NULL.  If
we ever used lockGroupLeader for something else, you'd also have to say AND
backend_type='parallel worker'.

We should talk about doing that for PSA and for v13 as well.  Here or on the
other thread or a new thread ?  It's a simple enough change, but the question
is if we want to provide a more "cooked" view whcih hides the internals, and if
so, is this really enough.

--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -737,3 +737,4 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
                                leader = proc->lockGroupLeader;
-                               if (leader)
+                               if (leader && leader->pid != beentry->st_procpid)
                                {
                                        values[29] = Int32GetDatum(leader->pid);

--
Justin


Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Justin Pryzby
On Fri, Jul 17, 2020 at 05:32:36PM -0500, Justin Pryzby wrote:

> On Fri, Jul 17, 2020 at 05:27:21PM -0400, Alvaro Herrera wrote:
> > On 2020-Jul-17, Justin Pryzby wrote:
> > > Ok, but should we then consider changing pg_stat_activity for consistency ?
> > > Probably in v13 to avoid changing it a year later.
> > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b025f32e0b5d7668daec9bfa957edf3599f4baa8
> > >
> > > I think the story is that we're exposing to the user a "leader pid" what's
> > > internally called (and used as) the "lock group leader", which for the leader
> > > process is set to its own PID.  But I think what we're exposing as leader_pid
> > > will seem like an implementation artifact to users.
> >
> > IMO it *is* an implementation artifact if, as you say, the leader PID
> > remains set after the parallel query is done.  I mentioned the pgbouncer
> > case before: if you run a single parallel query, then the process
> > remains a "parallel leader" for days or weeks afterwards even if it
> > hasn't run a parallel query ever since.  That doesn't sound great to me.
> >
> > I think it's understandable and OK if there's a small race condition
> > that means you report a process as a leader shortly before or shortly
> > after a parallel query is actually executed.  But doing so until backend
> > termination seems confusing as well as useless.
>
> I'm not sure that connection pooling is the strongest argument against the
> current behavior, but we could change it as suggested to show as NULL the
> leader_pid for the leader's own process.  I think that's the intuitive behavior
> a user expects.  Parallel processes are those with leader_pid IS NOT NULL.  If
> we ever used lockGroupLeader for something else, you'd also have to say AND
> backend_type='parallel worker'.
>
> We should talk about doing that for PSA and for v13 as well.  Here or on the
> other thread or a new thread ?  It's a simple enough change, but the question
> is if we want to provide a more "cooked" view whcih hides the internals, and if
> so, is this really enough.
>
> --- a/src/backend/utils/adt/pgstatfuncs.c
> +++ b/src/backend/utils/adt/pgstatfuncs.c
> @@ -737,3 +737,4 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
>                                 leader = proc->lockGroupLeader;
> -                               if (leader)
> +                               if (leader && leader->pid != beentry->st_procpid)
>                                 {
>                                         values[29] = Int32GetDatum(leader->pid);

This thread is about a new feature that I proposed which isn't yet committed
(logging leader_pid).  But it raises a question which is immediately relevant
to pg_stat_activity.leader_pid, which is committed for v13.  So feel free to
move to a new thread or to the thread for commit b025f3.

I added this to the Opened Items list so it's not lost.

I see a couple options:

- Update the documentation only, saying something like "leader_pid: the lock
  group leader.  For a process involved in parallel query, this is the parallel
  leader.  In particular, for the leader process itself, leader_pid = pid, and
  it is not reset until the leader terminates (it does not change when parallel
  workers exit).  This leaves in place the "raw" view of the data structure,
  which can be desirable, but can be perceived as exposing unfriendly
  implementation details.

- Functional change to show leader_pid = NULL for the leader itself.  Maybe
  the columns should only be not-NULL when st_backendType == B_BG_WORKER &&
  bgw_type='parallel worker'.  Update documentation to say: "leader_pid: for
  parallel workers, the PID of their leader process".  (not a raw view of the
  "lock group leader").

- ??

--
Justin


Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Michael Paquier-2
On Mon, Jul 20, 2020 at 06:30:48PM -0500, Justin Pryzby wrote:
> This thread is about a new feature that I proposed which isn't yet committed
> (logging leader_pid).  But it raises a question which is immediately relevant
> to pg_stat_activity.leader_pid, which is committed for v13.  So feel free to
> move to a new thread or to the thread for commit b025f3.

For a change of this size, with everybody involved in the past
discussion already on this thread, and knowing that you already
created an open item pointing to this part of the thread, I am not
sure that I would bother spawning a new thread now :)

> I see a couple options:
>
> - Update the documentation only, saying something like "leader_pid: the lock
>   group leader.  For a process involved in parallel query, this is the parallel
>   leader.  In particular, for the leader process itself, leader_pid = pid, and
>   it is not reset until the leader terminates (it does not change when parallel
>   workers exit).  This leaves in place the "raw" view of the data structure,
>   which can be desirable, but can be perceived as exposing unfriendly
>   implementation details.
>
> - Functional change to show leader_pid = NULL for the leader itself.  Maybe
>   the columns should only be not-NULL when st_backendType == B_BG_WORKER &&
>   bgw_type='parallel worker'.  Update documentation to say: "leader_pid: for
>   parallel workers, the PID of their leader process".  (not a raw view of the
>   "lock group leader").
Yeah, I don't mind revisiting that per the connection pooler argument.
And I'd rather keep the simple suggestion of upthread to leave the
field as NULL for the parallel group leader with a PID match but not a
backend type check so as this could be useful for other types of
processes.  This leads me to the attached with the docs updated
(tested with read-only pgbench spawning parallel workers with
pg_stat_activity queried in parallel), to be applied down to 13.
Thoughts are welcome.
--
Michael

pgstat-leader-pid.patch (1K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Justin Pryzby
On Tue, Jul 21, 2020 at 12:51:45PM +0900, Michael Paquier wrote:
> And I'd rather keep the simple suggestion of upthread to leave the
> field as NULL for the parallel group leader with a PID match but not a
> backend type check so as this could be useful for other types of
> processes.

The documentation could talk about either:

1) "lock group leader" - low-level, raw view of the internal data structure
(with a secondary mention that "for a parallel process, this is its parallel
leader).
2) "parallel leaders" high-level, user-facing, "cooked" view;

Right now it doesn't matter, but it seems that if we document the high-level
"parallel leader", then we don't need to accomodate future uses (at least until
the future happens).

> diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
> index dc49177c78..15c598b2a5 100644
> --- a/doc/src/sgml/monitoring.sgml
> +++ b/doc/src/sgml/monitoring.sgml
> @@ -687,12 +687,9 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
>         <structfield>leader_pid</structfield> <type>integer</type>
>        </para>
>        <para>
> -       Process ID of the parallel group leader if this process is or
> -       has been involved in parallel query, or null. This field is set
> -       when a process wants to cooperate with parallel workers, and
> -       remains set as long as the process exists. For a parallel group leader,
> -       this field is set to its own process ID. For a parallel worker,
> -       this field is set to the process ID of the parallel group leader.
> +       Process ID of the parallel group leader if this process is involved
> +       in parallel query, or null.  For a parallel group leader, this field
> +       is <literal>NULL</literal>.
>        </para></entry>
>       </row>

FWIW , I prefer something like my earlier phrase:

| For a parallel worker, this is the Process ID of its leader process.  Null
| for processes which are not parallel workers.

--
Justin


Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Michael Paquier-2
On Mon, Jul 20, 2020 at 11:12:31PM -0500, Justin Pryzby wrote:

> On Tue, Jul 21, 2020 at 12:51:45PM +0900, Michael Paquier wrote:
> The documentation could talk about either:
>
> 1) "lock group leader" - low-level, raw view of the internal data structure
> (with a secondary mention that "for a parallel process, this is its parallel
> leader).
> 2) "parallel leaders" high-level, user-facing, "cooked" view;
>
> Right now it doesn't matter, but it seems that if we document the high-level
> "parallel leader", then we don't need to accomodate future uses (at least until
> the future happens).
Hmm.  Not sure.  This sounds like material for a separate and larger
patch.

>>        <para>
>> -       Process ID of the parallel group leader if this process is or
>> -       has been involved in parallel query, or null. This field is set
>> -       when a process wants to cooperate with parallel workers, and
>> -       remains set as long as the process exists. For a parallel group leader,
>> -       this field is set to its own process ID. For a parallel worker,
>> -       this field is set to the process ID of the parallel group leader.
>> +       Process ID of the parallel group leader if this process is involved
>> +       in parallel query, or null.  For a parallel group leader, this field
>> +       is <literal>NULL</literal>.
>>        </para></entry>
>
> FWIW , I prefer something like my earlier phrase:
>
> | For a parallel worker, this is the Process ID of its leader process.  Null
> | for processes which are not parallel workers.
I preferred mine, and it seems to me that the first sentence of the
previous patch covers already both things mentioned in your sentence.
It also seems to me that it is an important thing to directly outline
that this field remains NULL for group leaders.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Michael Paquier-2
In reply to this post by Julien Rouhaud
On Fri, Jul 17, 2020 at 07:34:54AM +0200, Julien Rouhaud wrote:
> I had the same concern and was thinking about this approach too.
> Another argument is that IIUC any log emitted due to
> log_min_duration_statement wouldn't see the backend as executing a
> parallel query, since the workers would already have been shut down.

Not sure that it is worth bothering about this case.  You could also
have a backend killed by log_min_duration_statement on a query that
did not involve parallel query, where you would still report the PID
if it got involved at least once in parallel query for a query before
that.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Julien Rouhaud
In reply to this post by Michael Paquier-2
On Tue, Jul 21, 2020 at 6:33 AM Michael Paquier <[hidden email]> wrote:

>
> On Mon, Jul 20, 2020 at 11:12:31PM -0500, Justin Pryzby wrote:
> > On Tue, Jul 21, 2020 at 12:51:45PM +0900, Michael Paquier wrote:
> > The documentation could talk about either:
> >
> > 1) "lock group leader" - low-level, raw view of the internal data structure
> > (with a secondary mention that "for a parallel process, this is its parallel
> > leader).
> > 2) "parallel leaders" high-level, user-facing, "cooked" view;
> >
> > Right now it doesn't matter, but it seems that if we document the high-level
> > "parallel leader", then we don't need to accomodate future uses (at least until
> > the future happens).
>
> Hmm.  Not sure.  This sounds like material for a separate and larger
> patch.
>
> >>        <para>
> >> -       Process ID of the parallel group leader if this process is or
> >> -       has been involved in parallel query, or null. This field is set
> >> -       when a process wants to cooperate with parallel workers, and
> >> -       remains set as long as the process exists. For a parallel group leader,
> >> -       this field is set to its own process ID. For a parallel worker,
> >> -       this field is set to the process ID of the parallel group leader.
> >> +       Process ID of the parallel group leader if this process is involved
> >> +       in parallel query, or null.  For a parallel group leader, this field
> >> +       is <literal>NULL</literal>.
> >>        </para></entry>
> >
> > FWIW , I prefer something like my earlier phrase:
> >
> > | For a parallel worker, this is the Process ID of its leader process.  Null
> > | for processes which are not parallel workers.
>
> I preferred mine, and it seems to me that the first sentence of the
> previous patch covers already both things mentioned in your sentence.
> It also seems to me that it is an important thing to directly outline
> that this field remains NULL for group leaders.

I agree that Michael's version seems less error prone and makes
everything crystal clear, so +1 for it.


Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Alvaro Herrera-9
In reply to this post by Michael Paquier-2
On 2020-Jul-21, Michael Paquier wrote:

> On Mon, Jul 20, 2020 at 11:12:31PM -0500, Justin Pryzby wrote:

> >> +       Process ID of the parallel group leader if this process is involved
> >> +       in parallel query, or null.  For a parallel group leader, this field
> >> +       is <literal>NULL</literal>.
> >>        </para></entry>

> > | For a parallel worker, this is the Process ID of its leader process.  Null
> > | for processes which are not parallel workers.

How about we combine both.  "Process ID of the parallel group leader, if
this process is a parallel query worker.  NULL if this process is a
parallel group leader or does not participate in parallel query".

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Michael Paquier-2
On Wed, Jul 22, 2020 at 11:36:05AM -0400, Alvaro Herrera wrote:
> How about we combine both.  "Process ID of the parallel group leader, if
> this process is a parallel query worker.  NULL if this process is a
> parallel group leader or does not participate in parallel query".

Sounds fine to me.  Thanks.

Do others have any objections with this wording?
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Tom Lane-2
Michael Paquier <[hidden email]> writes:
> On Wed, Jul 22, 2020 at 11:36:05AM -0400, Alvaro Herrera wrote:
>> How about we combine both.  "Process ID of the parallel group leader, if
>> this process is a parallel query worker.  NULL if this process is a
>> parallel group leader or does not participate in parallel query".

> Sounds fine to me.  Thanks.
> Do others have any objections with this wording?

Is "NULL" really le mot juste here?  If we're talking about text strings,
as the thread title implies (I've not read the patch), then I think you
should say "empty string", because the SQL concept of null doesn't apply.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Michael Paquier-2
On Wed, Jul 22, 2020 at 08:59:04PM -0400, Tom Lane wrote:
> Is "NULL" really le mot juste here?  If we're talking about text strings,
> as the thread title implies (I've not read the patch), then I think you
> should say "empty string", because the SQL concept of null doesn't apply.

Sorry for the confusion.  This part of the thread applies to the open
item for v13 related to pg_stat_activity's leader_pid.  A different
thread should have been spawned for this specific topic, but things
are as they are..
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Michael Paquier-2
In reply to this post by Michael Paquier-2
On Thu, Jul 23, 2020 at 09:52:14AM +0900, Michael Paquier wrote:
> Sounds fine to me.  Thanks.
>
> Do others have any objections with this wording?

I have used the wording suggested by Alvaro, and applied the patch
down to 13.  Now let's see about the original item of this thread..
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Justin Pryzby
On Sun, Jul 26, 2020 at 04:42:06PM +0900, Michael Paquier wrote:
> On Thu, Jul 23, 2020 at 09:52:14AM +0900, Michael Paquier wrote:
> > Sounds fine to me.  Thanks.
> >
> > Do others have any objections with this wording?
>
> I have used the wording suggested by Alvaro, and applied the patch
> down to 13.  Now let's see about the original item of this thread..

Updated with updated wording to avoid "null", per Tom.

--
Justin

v3-0001-Include-the-leader-PID-in-logfile.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Michael Paquier-2
On Sun, Jul 26, 2020 at 01:54:27PM -0500, Justin Pryzby wrote:
> +            <row>
> +             <entry><literal>%P</literal></entry>
> +             <entry>For a parallel worker, this is the Process ID of its leader
> +             process.
> +             </entry>
> +             <entry>no</entry>
> +            </row>

Let's be a maximum simple and consistent with surrounding descriptions
here and what we have for pg_stat_activity, say:
"Process ID of the parallel group leader, if this process is a
parallel query worker."

> +            case 'P':
> +                if (MyProc)
> +                {
> +                    PGPROC *leader = MyProc->lockGroupLeader;
> +                    if (leader == NULL || leader->pid == MyProcPid)
> +                        /* padding only */
> +                        appendStringInfoSpaces(buf,
> +                                padding > 0 ? padding : -padding);
> +                    else if (padding != 0)
> +                        appendStringInfo(buf, "%*d", padding, leader->pid);
> +                    else
> +                        appendStringInfo(buf, "%d", leader->pid);
It seems to me we should document here that the check on MyProcPid
ensures that this only prints the leader PID only for parallel workers
and discards the leader.

> +    appendStringInfoChar(&buf, ',');
> +
> +    /* leader PID */
> +    if (MyProc)
> +    {
> +        PGPROC *leader = MyProc->lockGroupLeader;
> +        if (leader && leader->pid != MyProcPid)
> +            appendStringInfo(&buf, "%d", leader->pid);
> +    }
> +
Same here.

Except for those nits, I have tested the patch and things behave as we
want (including padding and docs), so this looks good to me.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Justin Pryzby
On Tue, Jul 28, 2020 at 10:10:33AM +0900, Michael Paquier wrote:
> Except for those nits, I have tested the patch and things behave as we
> want (including padding and docs), so this looks good to me.

Revised with your suggestions.

--
Justin

v4-0001-Include-the-leader-PID-in-logfile.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Justin Pryzby
On Fri, Jul 31, 2020 at 03:04:48PM -0500, Justin Pryzby wrote:
> On Tue, Jul 28, 2020 at 10:10:33AM +0900, Michael Paquier wrote:
> > Except for those nits, I have tested the patch and things behave as we
> > want (including padding and docs), so this looks good to me.
>
> Revised with your suggestions.

Uh, wrong patch.  2nd attempt.

Also, I was reminded by Tom's c410af098 about this comment:

                 * Further note: At least on some platforms, passing %*s rather than
                 * %s to appendStringInfo() is substantially slower, so many of the
                 * cases below avoid doing that unless non-zero padding is in fact
                 * specified.

It seems we can remove that hack and avoid its spiriling conditionals.
It's cleaner to make that 0001.

--
Justin

v5-0001-Remove-performance-hack-for-s-format-strings.patch (9K) Download Attachment
v5-0002-Include-the-leader-PID-in-logfile.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: expose parallel leader in CSV and log_line_prefix

Michael Paquier-2
On Fri, Jul 31, 2020 at 10:31:13PM -0500, Justin Pryzby wrote:
> Also, I was reminded by Tom's c410af098 about this comment:
>
>                  * Further note: At least on some platforms, passing %*s rather than
>                  * %s to appendStringInfo() is substantially slower, so many of the
>                  * cases below avoid doing that unless non-zero padding is in fact
>                  * specified.
>
> It seems we can remove that hack and avoid its spiriling conditionals.
> It's cleaner to make that 0001.

Not sure what 0001 is doing on this thread, so I would suggest to
create a new thread for that to attract the correct audience.  It is
true that we should not need that anymore as we use our own
implementation of sprintf now.

For now, I have taken 0002 as a base, fixed a couple of things (doc
tweaks, removed unnecessary header inclusion, etc.), and committed it,
meaning that we are done here.
--
Michael

signature.asc (849 bytes) Download Attachment
12