[Proposal] Add accumulated statistics for wait event


[Proposal] Add accumulated statistics for wait event

myungkyu.lim

Hello hackers,

 

This proposal is about recording additional statistics of wait events.

 

 

PostgreSQL statistics Issue

----------------------------------------

The pg_stat_activity view is very useful for analyzing performance issues.

But it is difficult to get detailed information about wait events when you need to dig deeper into a performance analysis, because pg_stat_activity only shows the current wait status of each backend.

 

If PostgreSQL provided additional statistics about wait events, it would be easier to pinpoint the exact cause of a throughput issue.

 

 

Proposal

----------------------------------------

Record additional statistics per wait event for every backend:

    - total elapsed time spent waiting
    - maximum elapsed time of a single wait
    - number of times waited

 

I suggest storing the above statistics in the PgBackendStatus structure.

 

typedef struct PgBackendStatus
{
...
    /*
     * Additional wait-event information for this proc:
     * per-wait-event elapsed time and count.
     */
    TimestampTz st_wait_event_start_timestamp;
    uint64      st_wait_event_total_elapsed[NUM_WAIT_EVENT];
    uint64      st_wait_event_max_elapsed[NUM_WAIT_EVENT];
    uint32      st_wait_event_counting[NUM_WAIT_EVENT];
} PgBackendStatus;
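
To make the intent concrete, here is a minimal standalone sketch of the accumulation logic (it is not the patch itself). The real code would run inside pgstat_report_wait_start()/pgstat_report_wait_end() and use TimestampTz/GetCurrentTimestamp(); clock_gettime(), the tiny NUM_WAIT_EVENT and the helper names below are stand-ins so the sketch compiles and runs on its own.

/*
 * Standalone sketch of the per-backend accumulation (not the actual patch).
 */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define NUM_WAIT_EVENT 4    /* stand-in; the patch derives this from the wait-event lists */

static uint64_t st_wait_event_start_us;   /* plays the role of st_wait_event_start_timestamp */
static uint64_t st_wait_event_total_elapsed[NUM_WAIT_EVENT];
static uint64_t st_wait_event_max_elapsed[NUM_WAIT_EVENT];
static uint32_t st_wait_event_counting[NUM_WAIT_EVENT];

static uint64_t
now_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t) ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* called where pgstat_report_wait_start() would be */
static void
wait_start(void)
{
    st_wait_event_start_us = now_us();
}

/* called where pgstat_report_wait_end() would be; 'idx' maps the wait event to an array slot */
static void
wait_end(int idx)
{
    uint64_t elapsed = now_us() - st_wait_event_start_us;

    st_wait_event_total_elapsed[idx] += elapsed;
    if (elapsed > st_wait_event_max_elapsed[idx])
        st_wait_event_max_elapsed[idx] = elapsed;
    st_wait_event_counting[idx]++;
}

int
main(void)
{
    /* simulate two waits on event slot 0 */
    wait_start();
    wait_end(0);
    wait_start();
    wait_end(0);

    printf("event 0: total=%llu us, max=%llu us, count=%u\n",
           (unsigned long long) st_wait_event_total_elapsed[0],
           (unsigned long long) st_wait_event_max_elapsed[0],
           st_wait_event_counting[0]);
    return 0;
}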

 

 

PoC test

----------------------------------------

I wrote a prototype patch.

With this patch, you can get additional wait event statistics via the new
function 'pg_stat_get_wait_events()'.

 

You can test it with the following steps.

    1. apply the patch
        - patch -p0 < wait_event_stat_patchfile.diff
    2. make distclean
    3. ./configure --with-wait-event-detail
    4. make && make install
    5. start PostgreSQL and execute psql
    6. call the pg_stat_get_wait_events() function
        - the input parameter is a pid; passing null returns all backends
          (as in the example below)

 

display example>

postgres=# select * from pg_stat_get_wait_events(null) where counting > 0;

  pid  | wait_event_type |      wait_event       | total_elapsed | max_elapsed | counting

-------+-----------------+-----------------------+---------------+-------------+----------

 25291 | LWLock          | WALBufMappingLock     |          1359 |         376 |        6

 25291 | LWLock          | WALWriteLock          |        679733 |      113803 |        8

 25291 | IO              | BufFileRead           |           780 |           7 |      172

 25291 | IO              | BufFileWrite          |          1247 |          19 |      171

 25291 | IO              | DataFileExtend        |         44703 |          53 |     3395

 25291 | IO              | DataFileImmediateSync |        268798 |       72286 |       12

 25291 | IO              | DataFileRead          |         91763 |       22149 |       30

 25291 | IO              | WALSync               |        441139 |       60456 |       28

 25291 | IO              | WALWrite              |          9567 |         637 |      737

 24251 | LWLock          | WALBufMappingLock     |          1256 |         350 |        6

 24251 | LWLock          | WALWriteLock          |        649140 |      153994 |        7

 24251 | IO              | BufFileRead           |           620 |           9 |      172

 24251 | IO              | BufFileWrite          |          1228 |          20 |      171

 24251 | IO              | DataFileExtend        |         26884 |          51 |     3395

 24251 | IO              | DataFileImmediateSync |        208630 |       21067 |       12

 24251 | IO              | DataFileRead          |        426278 |       17327 |      128

 24251 | IO              | WALSync               |        307055 |       70853 |       24

 24251 | IO              | WALWrite              |         17935 |         961 |     2720

(18 rows)

     

 

etc. concept proposal

------------------------------------------

1. I allocated arrays for additional statistics per wait event.
   Normally a backend doesn't use all wait events, so the memory used for
   recording statistics could be reduced by allocating a single hash table
   as a memory pool for wait-event statistics (see the sketch after this list).

2. This feature could be implemented as an extension if hooks were provided
   in the following functions:
 - pgstat_report_wait_start
 - pgstat_report_wait_end
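
To illustrate point 1, here is a rough sketch of such a pool. It is not part of the patch; the slot count, the hashing and all names are illustrative, and a real implementation would use PostgreSQL's memory and hash facilities instead.

/*
 * Sketch of concept 1: instead of full per-event arrays, keep a small
 * per-backend pool and allocate a slot only for wait events actually seen.
 * Open addressing on the 32-bit wait_event_info; 0 means "not waiting",
 * so it can mark unused slots.
 */
#include <stdint.h>

#define WAIT_STAT_SLOTS 64              /* illustrative pool size */

typedef struct WaitEventStat
{
    uint32_t    wait_event_info;        /* 0 = slot unused */
    uint64_t    total_elapsed;
    uint64_t    max_elapsed;
    uint32_t    count;
} WaitEventStat;

static WaitEventStat wait_stat_pool[WAIT_STAT_SLOTS];

/* find or allocate the slot for a given wait event */
static WaitEventStat *
wait_stat_lookup(uint32_t wait_event_info)
{
    uint32_t    h = wait_event_info * 2654435761u;  /* multiplicative hash */

    for (int i = 0; i < WAIT_STAT_SLOTS; i++)
    {
        WaitEventStat *slot = &wait_stat_pool[(h + i) % WAIT_STAT_SLOTS];

        if (slot->wait_event_info == wait_event_info)
            return slot;
        if (slot->wait_event_info == 0)
        {
            slot->wait_event_info = wait_event_info;    /* first use: claim it */
            return slot;
        }
    }
    return NULL;                        /* pool full; caller could drop the sample */
}

static void
wait_stat_accumulate(uint32_t wait_event_info, uint64_t elapsed)
{
    WaitEventStat *slot = wait_stat_lookup(wait_event_info);

    if (slot == NULL)
        return;
    slot->total_elapsed += elapsed;
    if (elapsed > slot->max_elapsed)
        slot->max_elapsed = elapsed;
    slot->count++;
}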

 

 

Feedback and suggestions are very welcome.

Thanks!

 


Attachment: wait_event_stat_patchfile.diff (25K)

Re: [Proposal] Add accumulated statistics for wait event

Michael Paquier-2
On Mon, Jul 23, 2018 at 04:04:42PM +0900, 임명규 wrote:
> This proposal is about recording additional statistics of wait events.

You should avoid sending things in html format, text format being
recommended on those mailing lists...  The patch applies after using
patch -p0 by the way.

I would recommend that you generate your patches using "git
format-patch".  Here are general guidelines on the matter:
https://wiki.postgresql.org/wiki/Submitting_a_Patch
Please study those guidelines; they are helpful if you want to get
familiar with the community process.

I have comments about your patch.  First, I don't think that you need to
count precisely the number of wait events triggered: usually, when it
comes to analyzing a workload's bottleneck, what counts is a periodic
*sampling* of events, patterns which can already be fetched from
pg_stat_activity and stored, say, in a different place.  This can be
achieved easily by using a cron job with an INSERT ... SELECT query which
adds data to a relation storing the event counts.  I took the time to
look at your patch, and here is some feedback.

This does not need a configure switch.  I assume that what you did is
good to learn the internals of ./configure though.

There is no documentation.  What does the patch do?  What is it useful
for?

+       case PG_WAIT_IPC:
+           arrayIndex = NUM_WAIT_LWLOCK +
+               NUM_WAIT_LOCK + NUM_WAIT_BUFFER_PIN +
+               NUM_WAIT_ACTIVITY + NUM_WAIT_CLIENT +
+               NUM_WAIT_EXTENSION + eventId;
+           break;
This is ugly and unmaintainable style.  You could perhaps have
considered an enum instead.
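
For illustration, an enum could let the compiler chain the per-class offsets once instead of re-spelling the sums in every switch branch. This is only a sketch: the WAIT_OFFSET_* names are made up, and it relies on the NUM_WAIT_* constants defined by the patch and quoted above.

/* Illustrative alternative, not from the patch. */
typedef enum WaitEventOffset
{
    WAIT_OFFSET_LWLOCK = 0,
    WAIT_OFFSET_LOCK = WAIT_OFFSET_LWLOCK + NUM_WAIT_LWLOCK,
    WAIT_OFFSET_BUFFER_PIN = WAIT_OFFSET_LOCK + NUM_WAIT_LOCK,
    WAIT_OFFSET_ACTIVITY = WAIT_OFFSET_BUFFER_PIN + NUM_WAIT_BUFFER_PIN,
    WAIT_OFFSET_CLIENT = WAIT_OFFSET_ACTIVITY + NUM_WAIT_ACTIVITY,
    WAIT_OFFSET_EXTENSION = WAIT_OFFSET_CLIENT + NUM_WAIT_CLIENT,
    WAIT_OFFSET_IPC = WAIT_OFFSET_EXTENSION + NUM_WAIT_EXTENSION
    /* ... and so on for the remaining wait-event classes ... */
} WaitEventOffset;

/* each branch of the switch then reduces to, e.g.:
 *     arrayIndex = WAIT_OFFSET_IPC + eventId;
 */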

pg_stat_get_wait_events should be a set-returning function, which you
could filter at SQL level using a PID, so no need of it as an argument.

What's the performance penalty?  I am pretty sure that this is
measurable as wait events are stored for a backend for each I/O
operation as well, and you are calling a C routine within an inlined
function which is designed to be light-weight, doing only a four-byte
atomic operation.
--
Michael


Re: [Proposal] Add accumulated statistics for wait event

Thomas Kellerer
In reply to this post by myungkyu.lim
> This proposal is about recording additional statistics of wait events.
> The pg_stat_activity view is very useful in analysis for performance
> issues.
> But it is difficult to get information of wait events in detail,
> when you need to deep dive into analysis of performance.
> It is because pg_stat_activity just shows the current wait status of
> backend.

There is an extension that samples the information from pg_stat_activity
(similar to Oracle's ASH).

https://github.com/postgrespro/pg_wait_sampling

Maybe it's worthwhile to combine the efforts?






Re: [Proposal] Add accumulated statistics for wait event

Egor Rogov
In reply to this post by Michael Paquier-2
Hi,

that will be a great feature.


On 23.07.2018 10:53, Michael Paquier wrote:
> I have comments about your patch.  First, I don't think that you need to
> count precisely the number of wait events triggered as usually when it
> comes to analyzing a workload's bottleneck what counts is a periodic
> *sampling* of events

If I get it right, this patch is not about sampling. It gathers
cumulative statistics, which are regrettably missing in PostgreSQL.
That's why it should not only count the exact number of wait events;
min time and stddev would also be helpful (as in pg_stat_statements).
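
For illustration, min/max/mean and stddev can be accumulated without storing individual durations by keeping running aggregates, similar to the online (Welford-style) method pg_stat_statements uses for its timing columns. The struct and names below are only a sketch, not anything from the patch.

/* Sketch only: running aggregates per wait event. */
#include <math.h>
#include <stdint.h>

typedef struct WaitEventTiming
{
    uint64_t    count;
    double      min_us;
    double      max_us;
    double      mean_us;
    double      sum_var;        /* sum of squared deviations from the mean */
} WaitEventTiming;

static void
wait_timing_accumulate(WaitEventTiming *t, double elapsed_us)
{
    if (t->count == 0)
    {
        t->count = 1;
        t->min_us = t->max_us = t->mean_us = elapsed_us;
        t->sum_var = 0.0;
    }
    else
    {
        double  old_mean = t->mean_us;

        t->count++;
        t->mean_us += (elapsed_us - old_mean) / t->count;
        t->sum_var += (elapsed_us - old_mean) * (elapsed_us - t->mean_us);
        if (elapsed_us < t->min_us)
            t->min_us = elapsed_us;
        if (elapsed_us > t->max_us)
            t->max_us = elapsed_us;
    }
}

/* population stddev of the observed waits */
static double
wait_timing_stddev(const WaitEventTiming *t)
{
    return (t->count > 1) ? sqrt(t->sum_var / t->count) : 0.0;
}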

--
Egor Rogov
Postgres Professional http://www.postgrespro.com



Re: [Proposal] Add accumulated statistics for wait event

Alexander Korotkov
In reply to this post by Michael Paquier-2
On Mon, Jul 23, 2018 at 10:53 AM Michael Paquier <[hidden email]> wrote:
> What's the performance penalty?  I am pretty sure that this is
> measurable as wait events are stored for a backend for each I/O
> operation as well, and you are calling a C routine within an inlined
> function which is designed to be light-weight, doing only a four-byte
> atomic operation.

Yes, the question is the overhead of measuring the durations of
individual wait events.  It has been proposed before, and there have
been heated debates about that (see threads [1-3]), without any
conclusion being reached about this feature.  The one thing that can be
said for sure is that the performance penalty heavily depends on
OS/hardware/workload.  In some cases the overhead is negligible, but in
other cases it appears to be huge.

1. https://www.postgresql.org/message-id/flat/559D4729.9080704%40postgrespro.ru
2. https://www.postgresql.org/message-id/flat/CA%2BTgmoYd3GTz2_mJfUHF%2BRPe-bCy75ytJeKVv9x-o%2BSonCGApw%40mail.gmail.com
3. https://www.postgresql.org/message-id/flat/CAG95seUAQVj09KzLwU%2Bz1B-GqdMqerzEkPFR3hn0q88XzMq-PA%40mail.gmail.com

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [Proposal] Add accumulated statistics for wait event

Tom Lane-2
In reply to this post by Michael Paquier-2
Michael Paquier <[hidden email]> writes:
> This does not need a configure switch.

It probably is there because the OP realizes that most people wouldn't
accept having this code compiled in.

> What's the performance penalty?  I am pretty sure that this is
> measurable as wait events are stored for a backend for each I/O
> operation as well, and you are calling a C routine within an inlined
> function which is designed to be light-weight, doing only a four-byte
> atomic operation.

On machines with slow gettimeofday(), I suspect the cost of this
patch would be staggering.  Even with relatively fast gettimeofday,
it doesn't look acceptable for calls in hot code paths (for instance,
lwlock.c).

A bigger problem is that it breaks stuff.  There are countless
calls to pgstat_report_wait_start/pgstat_report_wait_end that
assume they have no side-effects (for example, on errno) and
can never fail.  I wouldn't trust GetCurrentTimestamp() for either.
If the report_wait calls can't be dropped into code with *complete*
certainty that they're safe, that's a big cost.

Why exactly is this insisting on logging timestamps and not,
say, just incrementing a counter?  I think doing it like this
is almost certain to end in rejection.

                        regards, tom lane


RE: Re: [Proposal] Add accumulated statistics for wait event

myungkyu.lim
In reply to this post by Michael Paquier-2
 2018-07-23 16:53 (GMT+9), Michael Paquier wrote:
> On Mon, Jul 23, 2018 at 04:04:42PM +0900, 임명규 wrote:
>> This proposal is about recording additional statistics of wait events.
 
> I have comments about your patch.  First, I don't think that you need to
> count precisely the number of wait events triggered as usually when it
> comes to analyzing a workload's bottleneck what counts is a periodic
> *sampling* of events, patterns which can be fetched already from
> pg_stat_activity and stored say in a different place.

Thanks for your feedback.

This proposal is not about *sampling*.
Accumulated statistics of wait events are useful for solving issues
because they measure accurate data.

In some cases, sampling of events cannot find the cause of an issue
because it loses detailed data. For example, a throughput issue may
occur (e.g. disk I/O) while each wait point lasts only a few
milliseconds. In such a case, sampling is very likely to miss the cause.

> This is ugly and unmaintainable style.

I'm sorry, you're right. Please think of it as a PoC.
 
> What's the performance penalty?

I have the same worries. I just ran pgbench several times.
Please let me know a good method for checking the performance impact.

Best regards,
MyungKyu, Lim


RE: Re: [Proposal] Add accumulated statistics for wait event

myungkyu.lim
In reply to this post by Alexander Korotkov
> On Mon, Jul 23, 2018 at 10:53 AM Michael Paquier <[hidden email]> wrote:
>> What's the performance penalty?  I am pretty sure that this is
>> measurable as wait events are stored for a backend for each I/O
>> operation as well, and you are calling a C routine within an inlined
>> function which is designed to be light-weight, doing only a four-byte
>> atomic operation.

> Yes, the question is overhead of measuring durations of individual wait events.  It has been proposed before, and there been heated debates about that (see threads [1-3]).  It doesn't seem
> to be a conclusion about this feature.  The thing to be said for sure:
> performance penalty heavily depends on OS/hardware/workload.  In some cases overhead is negligible, but in other cases it appears to be huge.

Thanks for the good information.
I agree that a performance penalty exists.
But wait statistics are in demand and useful; in some cases it is worth sacrificing some performance to have them.

So, what do you think about developing this as an extension? That was the other concept proposal in my first mail:
this feature could be implemented as an extension if hooks were provided in the following functions:
 - pgstat_report_wait_start
 - pgstat_report_wait_end
The feature could then be turned on/off by online configuration when necessary.
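
To make the hook idea concrete, here is a sketch of what it might look like. The hook and GUC names are purely illustrative and no such hooks exist in core today; the comments show where the core and extension pieces would live.

/* Sketch only: hypothetical hook points and an extension-side implementation. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* --- what core could expose near pgstat_report_wait_start/_end --- */
typedef void (*wait_event_report_hook_type) (uint32_t wait_event_info);

wait_event_report_hook_type wait_event_start_hook = NULL;  /* hypothetical */
wait_event_report_hook_type wait_event_end_hook = NULL;    /* hypothetical */

/*
 * pgstat_report_wait_start()/pgstat_report_wait_end() would then add:
 *
 *     if (wait_event_start_hook)
 *         wait_event_start_hook(wait_event_info);
 */

/* --- what the extension could install --- */
static bool wait_event_stats_enabled = false;   /* driven by an online GUC */

static void
my_wait_start(uint32_t wait_event_info)
{
    if (!wait_event_stats_enabled)
        return;                 /* feature switched off: near-zero overhead */
    /* record the start timestamp for this backend */
    (void) wait_event_info;
}

static void
my_wait_end(uint32_t wait_event_info)
{
    if (!wait_event_stats_enabled)
        return;
    /* compute elapsed time and accumulate total/max/count for the event */
    (void) wait_event_info;
}

/* in _PG_init() the extension would set:
 *     wait_event_start_hook = my_wait_start;
 *     wait_event_end_hook = my_wait_end;
 */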

Best regards,
MyungKyu, Lim
 


RE: Re: [Proposal] Add accumulated statistics for wait event

Phil Florent

Hi,

I am skeptical about accumulated statistics.

pg_stat_activity now gives the necessary information about wait events. It can easily be used with a polling system that sleeps most of the time to limit the overhead. Measuring the duration of individual wait events is not necessary to know how the load is distributed.

You can aggregate the results of the polling by application, query, wait event or whatever you want.

I wrote a script for that which can be used interactively or in batch mode to produce reports, but many solutions exist.

Best regards

Phil







Re: [Proposal] Add accumulated statistics for wait event

Tomas Vondra-4
In reply to this post by Tom Lane-2


On 07/23/2018 03:57 PM, Tom Lane wrote:

> Michael Paquier <[hidden email]> writes:
>> This does not need a configure switch.
>
> It probably is there because the OP realizes that most people wouldn't
> accept having this code compiled in.
>
>> What's the performance penalty?  I am pretty sure that this is
>> measurable as wait events are stored for a backend for each I/O
>> operation as well, and you are calling a C routine within an inlined
>> function which is designed to be light-weight, doing only a four-byte
>> atomic operation.
>
> On machines with slow gettimeofday(), I suspect the cost of this
> patch would be staggering.  Even with relatively fast gettimeofday,
> it doesn't look acceptable for calls in hot code paths (for instance,
> lwlock.c).
>

Yeah. I wonder if we could measure the time for a small fraction of the
wait events, and estimate the actual duration from that.
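
For illustration, that could look roughly like the following: count every wait cheaply, read the clock for only one wait in N, and extrapolate. The 1-in-N rate and all names are made up for this sketch.

/* Sketch of "time only a fraction of the waits and extrapolate". */
#include <stdint.h>

#define WAIT_TIMING_SAMPLE_RATE 16      /* time 1 wait out of 16 */

typedef struct SampledWaitStat
{
    uint64_t    count;                  /* all waits, cheap to maintain */
    uint64_t    timed_count;            /* waits we actually timed */
    uint64_t    timed_elapsed_us;       /* elapsed time of the timed waits */
} SampledWaitStat;

/* decided at wait start, before the count is bumped at wait end */
static inline int
wait_should_time(const SampledWaitStat *s)
{
    /* deterministic 1-in-N; a cheap PRNG would avoid aliasing with periodic workloads */
    return (s->count % WAIT_TIMING_SAMPLE_RATE) == 0;
}

static inline void
wait_record(SampledWaitStat *s, uint64_t elapsed_us, int timed)
{
    s->count++;
    if (timed)
    {
        s->timed_count++;
        s->timed_elapsed_us += elapsed_us;
    }
}

/* estimated total wait time: average of the timed waits times the full count */
static inline uint64_t
wait_estimated_total_us(const SampledWaitStat *s)
{
    if (s->timed_count == 0)
        return 0;
    return s->timed_elapsed_us / s->timed_count * s->count;
}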

> A bigger problem is that it breaks stuff.  There are countless
> calls to pgstat_report_wait_start/pgstat_report_wait_end that
> assume they have no side-effects (for example, on errno) and
> can never fail.  I wouldn't trust GetCurrentTimestamp() for either.
> If the report_wait calls can't be dropped into code with *complete*
> certainty that they're safe, that's a big cost.
>
> Why exactly is this insisting on logging timestamps and not,
> say, just incrementing a counter?  I think doing it like this
> is almost certain to end in rejection.
>

Because the number of times you hit a wait event may not correlate with
the time you spent waiting on it, a simple counter is not the most
useful thing.

regards


--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [Proposal] Add accumulated statistics for wait event

Tomas Vondra-4
In reply to this post by myungkyu.lim


On 07/24/2018 12:06 PM, MyungKyu LIM wrote:

>   2018-07-23 16:53 (GMT+9), Michael Paquier wrote:
>> On Mon, Jul 23, 2018 at 04:04:42PM +0900, 임명규 wrote:
>>> This proposal is about recording additional statistics of wait events.
>  
>> I have comments about your patch.  First, I don't think that you need to
>> count precisely the number of wait events triggered as usually when it
>> comes to analyzing a workload's bottleneck what counts is a periodic
>> *sampling* of events, patterns which can be fetched already from
>> pg_stat_activity and stored say in a different place.
>
> Thanks for your feedback.
>
> This proposal is not about *sampling*.
> Accumulated statistics of wait events information is useful for solving
> issue. It can measure accurate data.
>
> Some case, sampling of events can not find the cause of issue. It lose detail data.
> For example, some throughput issue occur(ex : disk io), but each wait point
> occurs only a few milliseconds.
> In this case, it is highly likely that will not find the cause.
>

I think it's highly likely that it will find the cause. The idea of
sampling is that while you don't measure the timing directly, you can
infer it from the frequency of the wait events in the samples. So if you
see that a backend reports a particular wait event in 75% of samples, it
probably spent about 75% of its time waiting on it.

I'm not saying sampling is perfect and it certainly is less convenient
than what you propose.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


RE: [Proposal] Add accumulated statistics for wait event

Phil Florent

Hi,


> Some case, sampling of events can not find the cause of issue. It lose detail data.
> For example, some throughput issue occur(ex : disk io), but each wait point
> occurs only a few milliseconds.


It loses non-meaningful details, and that is in fact a good point. In this example, sampling will definitely find the cause and won't cost resources.

Being as precise as possible in defining a wait event is very useful, but knowing the precise duration of each event is less useful in terms of tuning.


Example of sampling + group by/order by percentage of activity :


./t -d 5 -o "application_name, wait_event_type" -o "application_name, wait_event, wait_event_type"              
traqueur 2.05.00 - performance tool for PostgreSQL 9.3 => 11
INFORMATION, no connection parameters provided, connecting to dedicated database ...
INFORMATION, connected to dedicated database traqueur
INFORMATION, PostgreSQL version : 110000
INFORMATION, sql preparation ...
INFORMATION, sql execution ...
 busy_pc | distinct_exe | application_name | wait_event_type
---------+--------------+------------------+-----------------
     206 | 8 / 103      | mperf            |
      62 | 2 / 31       | mperf            | LWLock
      20 | 3 / 10       | mperf            | IO
      12 | 1 / 6        | mperf            | Client
(4 rows)

 busy_pc | distinct_exe | application_name |      wait_event       | wait_event_type
---------+--------------+------------------+-----------------------+-----------------
     206 | 8 / 103      | mperf            |                       |
      62 | 2 / 31       | mperf            | WALWriteLock          | LWLock
      14 | 1 / 7        | mperf            | DataFileImmediateSync | IO
      12 | 1 / 6        | mperf            | ClientRead            | Client
       2 | 1 / 1        | mperf            | DataFileWrite         | IO
       2 | 1 / 1        | mperf            | DataFileRead          | IO
       2 | 1 / 1        | mperf            | WALInitWrite          | IO

No need to know the exact duration of each event to identify the bottleneck(s)... 


Best regards

Phil






Re: [Proposal] Add accumulated statistics for wait event

Michael Paquier-2
On Tue, Jul 24, 2018 at 04:23:03PM +0000, Phil Florent wrote:
> It loses non meaningful details and it's in fact a good point. In this
> example, sampling will definitely find the cause and won't cost
> resources.

The higher the sampling frequency, the more details you get, and the
more load you put on the instance.  So if you were able to take an
infinity of samples, where registering the same event multiple times for
the same backend also matters because its overall weight gets higher and
it shows up higher in profiles, then you would converge to the set of
results that this patch adds.  The sampling method, especially its
frequency, is something controlled by the client and not the server.
Approaches like the one proposed here push the load onto the server
side, unconditionally, for *all* backends, and this has its cost.

Even if you have spiky workloads, sampling may miss those, but even with
adding counters for each event you would need to query the table holding
the counters at an insane frequency to be able to perhaps get something
out of it as you need to do sampling of the counters as well to extract
deltas.

As Tomas has mentioned up-thread, sampling is light-weight, as-is the
current design for wait events.  Even if it is not perfect because it
cannot give exact numbers, it would find bottlenecks in really most
cases, and that's what matters.  If not, increasing the sampling
frequency makes things easier to detect as well.  What would be the
point of taking only one sample every checkpoint for example?

There may be a benefit in having counters, I don't know the answer to
that, though the point would be to make sure that there is a specific
set of workloads where it makes sense, still my gut feeling is that
sampling would be able to detect those anyway.

(I am not a computer scientist by default but a physicist, think fluid
dynamics and turbulence, and I had my load of random events and signal
analysis as well.  All that is just statistics with attempts to approach
reality, where sampling is a life-saver over exactitude of
measurements.)

Adding hooks is not acceptable to me either, those have a cost, and it
is not clear what's the benefit we can get into putting hooks in such a
place for cases other than sampling and counters...
--
Michael


Re: [Proposal] Add accumulated statistics for wait event

Bertrand DROUVOT
Hello Guys,

As you mentioned Oracle-like active session history sampling in this thread, I just want to let you know that I am working on a brand-new extension to provide this feature.

You can find the extension here: https://github.com/pgsentinel/pgsentinel

Basically, you could see it as sampling of pg_stat_activity (one-second interval by default) that currently provides more information:

    ash_time: the sampling time
    blockers: the number of blockers
    blockerpid: the pid of the blocker (if blockers = 1), or the pid of one blocker (if blockers > 1)
    top_level_query: the top-level statement (in case PL/pgSQL is used)
    query: the statement being executed (not normalised as it is in pg_stat_statements, meaning you see the values)
    cmdtype: the statement type (SELECT, UPDATE, INSERT, DELETE, UTILITY, UNKNOWN, NOTHING)
    queryid: the queryid of the statement (the one coming from pg_stat_statements)

Thanks to the queryid field you are able to link the session activity with the SQL activity.

It's implemented as an in-memory ring buffer where samples are written at a given (configurable) period, so users can see a number of recent samples depending on the (configurable) history size.

Current caveat: in case of a high query rate per pid (I saw it at more than 1000 queries per second), you could see top_level_query and query not "correlated" (query, queryid and cmdtype are still well linked together). This is because those two pieces of information are currently collected independently.

If you want to have a look and give your thoughts, you are welcome.

Bertrand



RE: [Proposal] Add accumulated statistics for wait event

Phil Florent
In reply to this post by Michael Paquier-2

Hi,

I agree with that. PostgreSQL 10 is really great; tuning tools based on sampling of pg_stat_activity became accurate without any modification.

Best regards

Phil





RE: [Proposal] Add accumulated statistics for wait event

Yotsunaga, Naoki
In reply to this post by Michael Paquier-2
On Thu, July 26, 2018 at 1:25 AM, Michael Paquier wrote:
> Even if you have spiky workloads, sampling may miss those, but even with adding counters for each event
> you would need to query the table holding the counters at an insane frequency to be able to perhaps get
> something out of it as you need to do sampling of the counters as well to extract deltas.

Hi, I was wondering why PostgreSQL does not have the number of wait events and the wait time as a feature, as other databases such as Oracle do, and while looking for related threads I found this one.

I am a DB beginner, so please tell me: it is said that you can find events that are bottlenecks by sampling, but as noted above, you cannot find events shorter than the sampling interval, right?
If such a short event occurred quite a number of times and added up to a considerable amount of time in total, could sampling still solve this problem?
# I am asking because I could not understand much of the discussion above, and I do not know whether such a case can exist.
Also, I think it could be solved by raising the sampling frequency, but then the load would be high, right? That method does not seem very practical.

Moreover, I think this is not implemented because sampling is considered good enough, as described above, and because it affects performance.
How about making the feature switchable on/off, with the default off?
Do we really care about performance degradation during a bottleneck investigation?
When investigating a bottleneck, I think it is better to find the cause even at the expense of performance.
# If it can be detected with high-frequency sampling, I think it depends on whether the sampling or the feature (the number of wait events and wait time) causes the higher load.

Since I am a DB beginner, I may be saying strange things.
I would be glad if you could tell me.

-----
Naoki Yotsunaga



RE: [Proposal] Add accumulated statistics for wait event

Phil Florent
Hi,
> I am a DB beginner, so please tell me. It says that you can find events that are bottlenecks in sampling, but as you saw above, you can not find events shorter than the sampling interval, right?

If an event occurs frequently and is reported in pg_stat_activity, you will catch it again and again while sampling, no matter its duration.
Hence you just need to
  • sample the sessions and consider the active ones; you need to know whether they are waiting (PostgreSQL now provides detailed wait events) or on the CPU;
  • optionally collect information on the system context at the time of sampling (CPU, memory, ...); it can be provided by many tools, such as the psutil Python library, for example.
If the client application itself provides information it's even more interesting. With something like program/module/action/client_info/sofar/totalwork in application_name you are able to focus directly on different kinds of activity. It can give you information like "I/O waits are meaningful for my batch activity but not for my OLTP activity; if my goal is to improve response time for end users I have to consider that."
 
Best regards
Phil




Re: [Proposal] Add accumulated statistics for wait event

Michael Paquier-2
On Thu, Oct 04, 2018 at 09:32:37AM +0000, Phil Florent wrote:
> I am a DB beginner, so please tell me. It says that you can find
> events that are bottlenecks in sampling, but as you saw above, you can
> not find events shorter than the sampling interval, right?

Yes, which is why it would be as simple as making the interval shorter,
though not so short that it bloats the amount of information fetched,
which needs to be stored and afterwards (perhaps) processed for analysis.
This gets rather close to signal processing.  A simple image: assuming
that event A happens 100 times in an interval of 1s, and event B only
once in the same interval of 1s, then if the snapshot interval is 1s, in
the worst case A would be treated as an equal of B, which would be
wrong.
--
Michael


RE: [Proposal] Add accumulated statistics for wait event

Phil Florent
Hi,

It's the same logic as with any polling system. An integration calculated with the Monte Carlo method from only a few points won't be accurate enough and can even be completely wrong, etc.
Polling is OK to troubleshoot a problem on the fly, but 2 points are not enough. A few seconds are needed to obtain good enough data, e.g. 5-10 seconds of polling with a 0.1 => 0.01s interval between 2 queries of the activity.
Polling for a few seconds while the user is waiting is normally enough to say whether a significant part of the waits is on the database. It's very important to know that. With 1 hour of accumulated statistics, a DBA will always see something to fix. But if the user waits 10 seconds on a particular screen and 1 second is spent on the database, it often won't directly help.
Polling gives great information with PostgreSQL 10, but it was already useful to catch top queries etc. in older versions.
I always check whether activity is adequately reported by my tool using known cases. I want to be sure it will report things adequately in real-world troubleshooting sessions. Sometimes there are bugs in my tool; once there was an issue with Postgres (pgstat_report_activity() was not called by workers during parallel index creation).

Best regards
Phil


RE: [Proposal] Add accumulated statistics for wait event

Yotsunaga, Naoki

On Thu, Oct 4, 2018 at 0:54 AM, Phil Florent wrote:

 

Phil, Michael, I appreciate your polite comments.

I understand it as follows: we can find such events if we shorten the sampling interval, but then a lot of information comes out.
# The balance is important.
Also, it is not good unless we have enough samples, and we have to do various other things besides.
Is my understanding correct?

If my understanding is right, that seems difficult to me.
Is a DBA really able to solve bottlenecks with sampling?
# Since I am a beginner, I feel that way; other people may not find it difficult.

What I would like to say is that if we had information on the number of wait events and the wait time (like other DBs), we could investigate more easily.
Of course, I understand that it also affects performance. So, I suggest a way to switch it on and off, with the default off.

-----
Naoki, Yotsunaga.
