Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
46 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Yun Li
Hey pg developers,

Do you think if we can add queryId into the pg_stat_get_activity function and ultimatly expose it in the view? It would be easier to track "similar" query's performance over time easier.

Thanks a lot!
Yun
Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Tom Lane-2
Yun Li <[hidden email]> writes:
> Do you think if we can add queryId into the pg_stat_get_activity function
> and ultimatly expose it in the view? It would be easier to track "similar"
> query's performance over time easier.

No, we're not likely to do that, because it would mean (1) baking one
single definition of "query ID" into the core system and (2) paying
the cost to calculate that ID all the time.

pg_stat_statements has a notion of query ID, but that notion might be
quite inappropriate for other usages, which is why it's an extension
and not core.

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Robert Haas
On Fri, Mar 15, 2019 at 9:50 PM Tom Lane <[hidden email]> wrote:

> Yun Li <[hidden email]> writes:
> > Do you think if we can add queryId into the pg_stat_get_activity function
> > and ultimatly expose it in the view? It would be easier to track "similar"
> > query's performance over time easier.
>
> No, we're not likely to do that, because it would mean (1) baking one
> single definition of "query ID" into the core system and (2) paying
> the cost to calculate that ID all the time.
>
> pg_stat_statements has a notion of query ID, but that notion might be
> quite inappropriate for other usages, which is why it's an extension
> and not core.

Having written an extension that also wanted a query ID, I disagree
with this position.  There's only one query ID field available, and
you can't use two extensions that care about query ID unless they
compute it the same way, and replicating all the code that computes
the query ID into each new extension that wants one sucks.  I think we
should actually bite the bullet and move all of that code into core,
and then just let extensions say whether they care about it getting
set.

Also, I think this is now the third independent request to expose
query ID in pg_stat_statements.  I think we should give the people
what they want.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Tom Lane-2
Robert Haas <[hidden email]> writes:
> On Fri, Mar 15, 2019 at 9:50 PM Tom Lane <[hidden email]> wrote:
>> pg_stat_statements has a notion of query ID, but that notion might be
>> quite inappropriate for other usages, which is why it's an extension
>> and not core.

> Having written an extension that also wanted a query ID, I disagree
> with this position.

[ shrug... ]  The fact remains that pg_stat_statements's definition is
pretty lame.  There's a lot of judgment calls in which query fields
it chooses to examine or ignore, and there's been no attempt at all
to make the ID PG-version-independent, and I rather doubt that it's
platform-independent either.  Nor will the IDs survive a dump/reload
even on the same server, since object OIDs will likely change.

These things are OK, or at least mostly tolerable, for pg_stat_statements'
usage ... but I don't think it's a good idea to have the core code
dictating that definition to all extensions.  Right now, if you have
an extension that needs some other query-ID definition, you can do it,
you just can't run that extension alongside pg_stat_statements.
But you'll be out of luck if the core code starts filling that field.

I'd be happier about having the core code compute a query ID if we
had a definition that was not so obviously slapped together.

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Julien Rouhaud
On Sat, Mar 16, 2019 at 5:20 PM Tom Lane <[hidden email]> wrote:

>
> Robert Haas <[hidden email]> writes:
> > On Fri, Mar 15, 2019 at 9:50 PM Tom Lane <[hidden email]> wrote:
> >> pg_stat_statements has a notion of query ID, but that notion might be
> >> quite inappropriate for other usages, which is why it's an extension
> >> and not core.
>
> > Having written an extension that also wanted a query ID, I disagree
> > with this position.
>
> [ shrug... ]  The fact remains that pg_stat_statements's definition is
> pretty lame.  There's a lot of judgment calls in which query fields
> it chooses to examine or ignore, and there's been no attempt at all
> to make the ID PG-version-independent, and I rather doubt that it's
> platform-independent either.  Nor will the IDs survive a dump/reload
> even on the same server, since object OIDs will likely change.
>
> These things are OK, or at least mostly tolerable, for pg_stat_statements'
> usage ... but I don't think it's a good idea to have the core code
> dictating that definition to all extensions.  Right now, if you have
> an extension that needs some other query-ID definition, you can do it,
> you just can't run that extension alongside pg_stat_statements.
> But you'll be out of luck if the core code starts filling that field.
>
> I'd be happier about having the core code compute a query ID if we
> had a definition that was not so obviously slapped together.

But the queryId itself is stored in core.  Exposing it in
pg_stat_activity or log_line_prefix would still allow users to choose
the implementation of their choice, or none.  That seems like a
different complaint from asking pgss integration in core to have all
its metrics available by default (or at least without a restart).

Maybe we could add a GUC for pg_stat_statements to choose whether it
should set the queryid itself and not, if anyone wants to have its
metrics but with different queryid semantics?

Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

legrand legrand
In reply to this post by Robert Haas
Hello,

This is available in https://github.com/legrandlegrand/pg_stat_sql_plans
extension with a specific function
pgssp_backend_queryid(pid) that permits to join pg_stat_activity with
pg_stat_sql_plans (that is similar to pg_stat_statements) and also permits
to collect samples of wait events per query id.

This extension computes its own queryid based on a normalized query text
(that doesn't change after table
drop/create).

Maybe that queryid calculation should stay in a dedicated extension,
permiting to users to choose their queryid definition.

Regards
PAscal



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Yun Li
In reply to this post by Julien Rouhaud
Thanks a lot for really good points!! I did not expected I will get this many points of view. :P

I have identical experience with Robert when other extension calculate the id different as PGSS, PGSS will overwritten that id when it is on. But Tom got a point that if we centralize the logic that pgss has, then other extension will have no way to change it unless we have some new config to toggle pointed out by Julien. Also Tom got the concern  about the current PGSS jumble query logic is not bullet proof and may take time then impact the perf.

Let's take one step back. Since queryId is stored in core as Julien pointed out, can we just add that global to the pg_stat_get_activity and ultimately exposed in pg_stat_activity view? Then no matter whether PGSS  is on or off, or however the customer extensions are updating that filed, we expose that field in that view then enable user to leverage that id to join with pgss or their extension. Will this sounds a good idea?

Thanks again,
Yun

On Sat, Mar 16, 2019 at 11:01 AM Julien Rouhaud <[hidden email]> wrote:
On Sat, Mar 16, 2019 at 5:20 PM Tom Lane <[hidden email]> wrote:
>
> Robert Haas <[hidden email]> writes:
> > On Fri, Mar 15, 2019 at 9:50 PM Tom Lane <[hidden email]> wrote:
> >> pg_stat_statements has a notion of query ID, but that notion might be
> >> quite inappropriate for other usages, which is why it's an extension
> >> and not core.
>
> > Having written an extension that also wanted a query ID, I disagree
> > with this position.
>
> [ shrug... ]  The fact remains that pg_stat_statements's definition is
> pretty lame.  There's a lot of judgment calls in which query fields
> it chooses to examine or ignore, and there's been no attempt at all
> to make the ID PG-version-independent, and I rather doubt that it's
> platform-independent either.  Nor will the IDs survive a dump/reload
> even on the same server, since object OIDs will likely change.
>
> These things are OK, or at least mostly tolerable, for pg_stat_statements'
> usage ... but I don't think it's a good idea to have the core code
> dictating that definition to all extensions.  Right now, if you have
> an extension that needs some other query-ID definition, you can do it,
> you just can't run that extension alongside pg_stat_statements.
> But you'll be out of luck if the core code starts filling that field.
>
> I'd be happier about having the core code compute a query ID if we
> had a definition that was not so obviously slapped together.

But the queryId itself is stored in core.  Exposing it in
pg_stat_activity or log_line_prefix would still allow users to choose
the implementation of their choice, or none.  That seems like a
different complaint from asking pgss integration in core to have all
its metrics available by default (or at least without a restart).

Maybe we could add a GUC for pg_stat_statements to choose whether it
should set the queryid itself and not, if anyone wants to have its
metrics but with different queryid semantics?
Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Nikolay Samokhvalov
In reply to this post by Robert Haas
Hello

On Sat, Mar 16, 2019 at 7:32 AM Robert Haas <[hidden email]> wrote:
Also, I think this is now the third independent request to expose
query ID in pg_stat_statements.  I think we should give the people
what they want.

Count me as the 4th.

This would be a very important feature for automated query analysis.
pg_stat_statements lacks query examples, and the only way to get them is from the logs.
Where we don't have queryid as well. So people end up either doing it manually or writing
yet another set of nasty regular expressions.

Routing query analysis s a crucial for any large project. If there are chances to implement
queryid for pg_stat_activity (or anything that will allow to automate query analysis)
in Postgres 12 or later -- this would be a great news and huge support for engineers.
Same level as recently implemented sampling for statement logging.

By the way, if queryid goes to the core someday, I'm sure it is worth to consider using
it in logs as well.

Thanks,
Nik
Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Julien Rouhaud
In reply to this post by Yun Li
On Mon, Mar 18, 2019 at 6:23 PM Yun Li <[hidden email]> wrote:
>
> Let's take one step back. Since queryId is stored in core as Julien pointed out, can we just add that global to the pg_stat_get_activity and ultimately exposed in pg_stat_activity view? Then no matter whether PGSS  is on or off, or however the customer extensions are updating that filed, we expose that field in that view then enable user to leverage that id to join with pgss or their extension. Will this sounds a good idea?

I'd greatly welcome expose queryid exposure in pg_stat_activity, and
also in log_line_prefix.  I'm afraid that it's too late for pg12
inclusion, but I'll be happy to provide a patch for that for pg13.

Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Maksim Milyutin-2
In reply to this post by Robert Haas
On 3/16/19 5:32 PM, Robert Haas wrote:

> There's only one query ID field available, and
> you can't use two extensions that care about query ID unless they
> compute it the same way, and replicating all the code that computes
> the query ID into each new extension that wants one sucks.  I think we
> should actually bite the bullet and move all of that code into core,
> and then just let extensions say whether they care about it getting
> set.


+1.

But I think that enough to integrate into core the query normalization
routine and store generalized query strings (from which the queryId is
produced) in shared memory (for example, hashtable that maps queryId to
the text representation of generalized query). And activate
normalization routine and filling the table of generalized queries by
specified GUC.

This allows to unbind extensions that require queryId from using
pg_stat_statements and consider such computing of queryId as canonical.


--
Regards,
Maksim Milyutin


Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Julien Rouhaud
On Tue, Mar 19, 2019 at 2:45 PM Maksim Milyutin <[hidden email]> wrote:
>
> But I think that enough to integrate into core the query normalization
> routine and store generalized query strings (from which the queryId is
> produced) in shared memory (for example, hashtable that maps queryId to
> the text representation of generalized query).

That's more or less how pg_stat_statements was previously behaving,
and it had too many problems.  Current implementation, with an
external file, is a better alternative.

> And activate
> normalization routine and filling the table of generalized queries by
> specified GUC.
>
> This allows to unbind extensions that require queryId from using
> pg_stat_statements and consider such computing of queryId as canonical.

The problem I see with this approach is that if you want a different
implementation, you'll have to reimplement the in-core normalised
queries saving and retrieval, but with a different set of SQL-visible
functions.  I don't think that's it's acceptable, unless we add a
specific hook for query normalisation and queryid computing.  But it
isn't ideal either, as it would be a total mess if someone changes the
implementation without resetting the previously saved normalised
queries.

Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Julien Rouhaud
In reply to this post by Julien Rouhaud
On Mon, Mar 18, 2019 at 7:33 PM Julien Rouhaud <[hidden email]> wrote:
>
> On Mon, Mar 18, 2019 at 6:23 PM Yun Li <[hidden email]> wrote:
> >
> > Let's take one step back. Since queryId is stored in core as Julien pointed out, can we just add that global to the pg_stat_get_activity and ultimately exposed in pg_stat_activity view? Then no matter whether PGSS  is on or off, or however the customer extensions are updating that filed, we expose that field in that view then enable user to leverage that id to join with pgss or their extension. Will this sounds a good idea?
>
> I'd greatly welcome expose queryid exposure in pg_stat_activity, and
> also in log_line_prefix.  I'm afraid that it's too late for pg12
> inclusion, but I'll be happy to provide a patch for that for pg13.

Here's a prototype patch for queryid exposure in pg_stat_activity and
log_line prefix.

queryid_exposure-v1.diff (23K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Jim Finnerty
The queryId depends on oids, so it is not stable enough for some purposes.
For example, to create a SQL identifier that survives across a server
upgrade, or that can be shipped to another database, the queryId isn't
usable.

The apg_plan_mgmt extensions keeps both its own stable SQL identifier as
well as the queryId, so it can be used to join to pg_stat_statements if
desired.  If we were to standardize on one SQL identifier, it should be
stable enough to survive a major version upgrade or to be the same in
different databases.





-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

Jim Finnerty, AWS, Amazon Aurora PostgreSQL
Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Robert Haas
On Tue, Mar 19, 2019 at 1:24 PM Jim Finnerty <[hidden email]> wrote:

> The queryId depends on oids, so it is not stable enough for some purposes.
> For example, to create a SQL identifier that survives across a server
> upgrade, or that can be shipped to another database, the queryId isn't
> usable.
>
> The apg_plan_mgmt extensions keeps both its own stable SQL identifier as
> well as the queryId, so it can be used to join to pg_stat_statements if
> desired.  If we were to standardize on one SQL identifier, it should be
> stable enough to survive a major version upgrade or to be the same in
> different databases.

If Amazon would like to open-source its (AIUI) proprietary technology
for computing query IDs and propose it for inclusion in PostgreSQL,
cool, but I think that is a separate question from whether people
would like more convenient access to the query ID technology that we
have today.  I think it's 100% clear that they would like that, even
as things stand, and therefore it does not make sense to block that
behind Amazon deciding to share what it already has or somebody else
trying to reimplement it.

If we need to have a space for both a core-standard query ID and
another query ID that is available for extension use, adding one more
field to struct Query, so we can have both coreQueryId and
extensionQueryId or whatever, would be easy to do.  It appears that
there's more use case than I would have guessed for custom query IDs.
On the other hand, it also appears that a lot of people would be very,
very happy to just be able to see the query ID field that already
exists, both in pg_stat_statements in pg_stat_activity, and we
shouldn't throw up unnecessary impediments in the way of making that
happen, at least IMHO.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

legrand legrand
In reply to this post by Julien Rouhaud
Great,
thank you Julien !

Would it make sense to add it in auto explain ?
I don't know for explain itself, but maybe ...

Regards
PAscal




--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Julien Rouhaud
On Tue, Mar 19, 2019 at 8:38 PM legrand legrand
<[hidden email]> wrote:
>
> Would it make sense to add it in auto explain ?
> I don't know for explain itself, but maybe ...

I'd think that people interested in getting the queryid in the logs
would configure the log_line_prefix to display it consistently rather
than having it in only a subset of cases, so that's probably not
really needed.

Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

legrand legrand
In reply to this post by Robert Haas
Hi Jim, Robert,

As this is a distinct subject from adding QueryId to pg_stat_activity,
would it be possible to continue the discussion "new QueryId definition"
(for postgres open source software) here:

https://www.postgresql.org/message-id/1553029215728-0.post@...

Thanks in advance.
Regards
PAscal




--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

Reply | Threaded
Open this post in threaded view
|

RE: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

legrand legrand
In reply to this post by Julien Rouhaud

>> Would it make sense to add it in auto explain ?
>> I don't know for explain itself, but maybe ...

> I'd think that people interested in getting the queryid in the logs
> would configure the log_line_prefix to display it consistently rather
> than having it in only a subset of cases, so that's probably not
> really needed.

Ok.
Shoudn't you add this to commitfest ?

Reply | Threaded
Open this post in threaded view
|

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

Julien Rouhaud
On Mon, Mar 25, 2019 at 12:36 PM legrand legrand
<[hidden email]> wrote:

>
> >> Would it make sense to add it in auto explain ?
> >> I don't know for explain itself, but maybe ...
>
> > I'd think that people interested in getting the queryid in the logs
> > would configure the log_line_prefix to display it consistently rather
> > than having it in only a subset of cases, so that's probably not
> > really needed.
>
> Ok.
> Shoudn't you add this to commitfest ?

I added it last week, see https://commitfest.postgresql.org/23/2069/

Reply | Threaded
Open this post in threaded view
|

RE: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

legrand legrand
>> Shoudn't you add this to commitfest ?

> I added it last week, see https://commitfest.postgresql.org/23/2069/

Oups, sorry for the noise
123