Planning counters in pg_stat_statements (using pgss_store)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
41 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Julien Rouhaud
On Thu, Mar 28, 2019 at 9:48 AM Julien Rouhaud <[hidden email]> wrote:

>
> On Thu, Mar 28, 2019 at 8:45 AM Sergei Kornilov <[hidden email]> wrote:
> >
> > >>  Ok, but keep in mind that this is the last commitfest for pg12, and
> > >>  there are only 4 days left. Will you have time to take care of it, or
> > >>  do you need help on it?
> > >
> > > Oups, sorry, I won't have time nor knowledge to finish in time ;o(
> > > Any help is welcome !
> >
> > No need to rush, this patch has is unlikely to get committed in pg12 even a month earlier. We have a general policy that we don't like complex patches that first show up for the last commitfest of a dev cycle. Current commitfest is last one before feature freeze.
>
> yes, but this patch first showed up years ago:
> https://commitfest.postgresql.org/16/1373/.  Since nothing happened
> since, it would be nice to have feedback on whether deeper changes on
> the planning functions are required (so for pg13), or if current
> approach is ok (and then I hope it'd be acceptable for pg12).
If that's helpful I attach the updated patches.  I split in two
commits, so if the query_text passing is not wanted it's quite easy to
ignore this part.

0001-Pass-query-string-to-the-planner.patch (14K) Download Attachment
0002-Add-planning-counters-to-pg_stat_statements.patch (39K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

legrand legrand
Hi,

I have played with this patch, it works fine.

rem the last position of the "new" total_time column is confusing
+CREATE VIEW pg_stat_statements AS
+  SELECT *, total_plan_time + total_exec_time AS total_time
+    FROM pg_stat_statements(true);

I wanted to perform some benchmark between those 4 cases:
0 - no pgss,
1 - original pgss (no planning counter and 1 access to pgss hash),
2 - pggs reading querytext in planner hook (* 2 accesses to pgss hash),
3 - pggs reading querytext in parse hook (* 3 accesses to pgss hash)

It seems that the difference is so tiny, that there was no other way than
running
minimal "Select 1;" statement ...

./pgbench -c 10 -j 5 -t 500000 -f select1stmt.sql postgres

case  avg_tps   pct_diff
0        89 278   --
1        88 745   0,6%
2        88 282   1,1%
3        86 660   2,9%

This means that even in this extrem test case, the worst degradation is less
than 3%
(this overhead can be removed using pg_stat_statements.track_planning guc)

notes:
- PostgreSQL 12devel on x86_64-w64-mingw32, compiled by gcc.exe
(x86_64-win32-sehrev1, Built by MinGW-W64 project) 7.2.0, 64-bit,
- cpu usage was less that 95%,
- avg_tps is based on 3 runs,
- there was a wait of arround 1 minute between each run to keep
  computer temperature and fan usage low.

Regards
PAscal



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html


Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Julien Rouhaud
On Mon, Apr 1, 2019 at 10:35 PM legrand legrand
<[hidden email]> wrote:
>
> I have played with this patch, it works fine.

Thanks for testing!

> rem the last position of the "new" total_time column is confusing
> +CREATE VIEW pg_stat_statements AS
> +  SELECT *, total_plan_time + total_exec_time AS total_time
> +    FROM pg_stat_statements(true);

Yes, there are quite a lot of fields in pg_stat_statements(), so I
added the total_time as the last field to avoid enumerating all of
them.  I can change that if needed.


> I wanted to perform some benchmark between those 4 cases:
> 0 - no pgss,
> 1 - original pgss (no planning counter and 1 access to pgss hash),
> 2 - pggs reading querytext in planner hook (* 2 accesses to pgss hash),
> 3 - pggs reading querytext in parse hook (* 3 accesses to pgss hash)
>
> It seems that the difference is so tiny, that there was no other way than
> running
> minimal "Select 1;" statement ...
>
> ./pgbench -c 10 -j 5 -t 500000 -f select1stmt.sql postgres
>
> case  avg_tps   pct_diff
> 0        89 278   --
> 1        88 745   0,6%
> 2        88 282   1,1%
> 3        86 660   2,9%
>
> This means that even in this extrem test case, the worst degradation is less
> than 3%
> (this overhead can be removed using pg_stat_statements.track_planning guc)

Is the difference between 2 and 3 the extraneous pgss_store call to
always store the query text if planner hook doesn't have access to the
query text?


Reply | Threaded
Open this post in threaded view
|

RE: Planning counters in pg_stat_statements (using pgss_store)

legrand legrand
Hi,

>>
>> case  avg_tps   pct_diff
>> 0        89 278   --
>> 1        88 745   0,6%
>> 2        88 282   1,1%
>> 3        86 660   2,9%
>>
>> This means that even in this extrem test case, the worst degradation is less
>> than 3%
>> (this overhead can be removed using pg_stat_statements.track_planning guc)

> Is the difference between 2 and 3 the extraneous pgss_store call to
> always store the query text if planner hook doesn't have access to the
> query text?

Yes it is,
but I agree it seems a big gap (1,8%) compared to the difference between 1 and 2 (0,5%).
Maybe this is just mesure "noise" ...

Regards
PAscal
Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Julien Rouhaud
On Tue, Apr 2, 2019 at 9:22 AM legrand legrand
<[hidden email]> wrote:

>
> >> case  avg_tps   pct_diff
> >> 0        89 278   --
> >> 1        88 745   0,6%
> >> 2        88 282   1,1%
> >> 3        86 660   2,9%
> >>
> >> This means that even in this extrem test case, the worst degradation is less
> >> than 3%
> >> (this overhead can be removed using pg_stat_statements.track_planning guc)
>
> > Is the difference between 2 and 3 the extraneous pgss_store call to
> > always store the query text if planner hook doesn't have access to the
> > query text?
>
> Yes it is,
> but I agree it seems a big gap (1,8%) compared to the difference between 1 and 2 (0,5%).
> Maybe this is just mesure "noise" ...
Rebased patches attached.

0002-Add-planning-counters-to-pg_stat_statements.patch (38K) Download Attachment
0001-Pass-query-string-to-the-planner.patch (14K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Sergei Kornilov
Hello

I think the most important question for this topic is performance penalty.
It was a long story, first test on my desktop was too volatile. I setup separate PC with DB only and test few cases.

PC spec: 2-core Intel Core 2 Duo E6550, 4GB ram, mechanical HDD
All tests on top 7dedfd22b79822b7f4210e6255b672ea82db6678 commit, build via ./configure  --prefix=/home/melkij/tmp/ --enable-tap-tests
DB settings:
  listen_addresses = '*'
  log_line_prefix = '%m %p %u@%d from %h [vxid:%v txid:%x] [%i] '
  lc_messages = 'C'
  shared_buffers = 512MB

pgbench runned from different host, in same L2 network.
Database was generated by: pgbench -s 10 -i -h hostname postgres
After database start I run:
  create extension if not exists pg_prewarm;
  select count(*), sum(pg_prewarm) from pg_tables join pg_prewarm(tablename::regclass) on true where schemaname= 'public';
  select count(*), sum(pg_prewarm) from pg_indexes join pg_prewarm(indexname::regclass) on true where schemaname= 'public';
So all data was in buffers.

Load generated by command: pgbench --builtin=select-only --time=300 -n -c 10 -h hostname postgres -M (vary)

Tests are:
head_no_pgss - unpatched version, empty shared_preload_libraries
head_track_none - unpatched version with:
  shared_preload_libraries = 'pg_stat_statements'
  pg_stat_statements.max = 5000
  pg_stat_statements.track = none
  pg_stat_statements.save = off
  pg_stat_statements.track_utility = off
head_track_top - the same but with pg_stat_statements.track=top
5-times runned in every mode -M: simple, extended, prepared

patch_not_loaded - build with latest published patches, empty shared_preload_libraries
patch_track_none - patched build with
  shared_preload_libraries = 'pg_stat_statements'
  pg_stat_statements.max = 5000
  pg_stat_statements.track = none
  pg_stat_statements.save = off
  pg_stat_statements.track_utility = off
  pg_stat_statements.track_planning = off
patch_track_top - the same but with pg_stat_statements.track=top
patch_track_planning - with:
  shared_preload_libraries = 'pg_stat_statements'
  pg_stat_statements.max = 5000
  pg_stat_statements.track = top
  pg_stat_statements.save = off
  pg_stat_statements.track_utility = off
  pg_stat_statements.track_planning = on

10-times runned in every mode -M: simple, extended, prepared

Results:

         test         |   mode   | average_tps | degradation_perc
----------------------+----------+-------------+------------------
 head_no_pgss         | extended |       13816 |            1.000
 patch_not_loaded     | extended |       13755 |            0.996
 head_track_none      | extended |       13607 |            0.985
 patch_track_none     | extended |       13560 |            0.981
 head_track_top       | extended |       13277 |            0.961
 patch_track_top      | extended |       13189 |            0.955
 patch_track_planning | extended |       12983 |            0.940
 head_no_pgss         | prepared |       29101 |            1.000
 head_track_none      | prepared |       28510 |            0.980
 patch_track_none     | prepared |       28481 |            0.979
 patch_not_loaded     | prepared |       28382 |            0.975
 patch_track_planning | prepared |       28046 |            0.964
 head_track_top       | prepared |       28035 |            0.963
 patch_track_top      | prepared |       27973 |            0.961
 head_no_pgss         | simple   |       16733 |            1.000
 patch_not_loaded     | simple   |       16552 |            0.989
 head_track_none      | simple   |       16452 |            0.983
 patch_track_none     | simple   |       16365 |            0.978
 head_track_top       | simple   |       15867 |            0.948
 patch_track_top      | simple   |       15820 |            0.945
 patch_track_planning | simple   |       15739 |            0.941

So I found slight slowdown with track_planning = off compared to HEAD. Possibly just at the level of measurement error. I think this is ok.
track_planning = on also has no dramatic impact. In my opinion proposed design with pgss_store call is acceptable.

regards, Sergei


Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Tomas Vondra-4
On Wed, Sep 04, 2019 at 07:19:47PM +0300, Sergei Kornilov wrote:

>
> ...
>
>Results:
>
>         test         |   mode   | average_tps | degradation_perc
>----------------------+----------+-------------+------------------
> head_no_pgss         | extended |       13816 |            1.000
> patch_not_loaded     | extended |       13755 |            0.996
> head_track_none      | extended |       13607 |            0.985
> patch_track_none     | extended |       13560 |            0.981
> head_track_top       | extended |       13277 |            0.961
> patch_track_top      | extended |       13189 |            0.955
> patch_track_planning | extended |       12983 |            0.940
> head_no_pgss         | prepared |       29101 |            1.000
> head_track_none      | prepared |       28510 |            0.980
> patch_track_none     | prepared |       28481 |            0.979
> patch_not_loaded     | prepared |       28382 |            0.975
> patch_track_planning | prepared |       28046 |            0.964
> head_track_top       | prepared |       28035 |            0.963
> patch_track_top      | prepared |       27973 |            0.961
> head_no_pgss         | simple   |       16733 |            1.000
> patch_not_loaded     | simple   |       16552 |            0.989
> head_track_none      | simple   |       16452 |            0.983
> patch_track_none     | simple   |       16365 |            0.978
> head_track_top       | simple   |       15867 |            0.948
> patch_track_top      | simple   |       15820 |            0.945
> patch_track_planning | simple   |       15739 |            0.941
>
>So I found slight slowdown with track_planning = off compared to HEAD. Possibly just at the level of measurement error. I think this is ok.
>track_planning = on also has no dramatic impact. In my opinion proposed design with pgss_store call is acceptable.
>

FWIW I've done some benchmarking on this too, with a single pgbench client
running select-only test on a tiny database, in different modes (simple,
extended, prepared). I've done that on two systems with different CPUs
(spreadsheet with results attached).

I don't see any performance regression - there are some small variations
in both directions (say, ~1%) but that's well within the noise. So I think
the patch is fine in this regard.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Tomas Vondra-4
On Fri, Sep 06, 2019 at 04:19:16PM +0200, Tomas Vondra wrote:
>
>FWIW I've done some benchmarking on this too, with a single pgbench client
>running select-only test on a tiny database, in different modes (simple,
>extended, prepared). I've done that on two systems with different CPUs
>(spreadsheet with results attached).
>

And of course, I forgot to attach the spreadsheet, so here it is.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


planning.ods (71K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Yoshikazu Imai
In reply to this post by Tomas Vondra-4
Hello

On 2019/09/06 23:19, Tomas Vondra wrote:

> On Wed, Sep 04, 2019 at 07:19:47PM +0300, Sergei Kornilov wrote:
>>
>> ...
>>
>> Results:
>>
>>         test         |   mode   | average_tps | degradation_perc
>> ----------------------+----------+-------------+------------------
>> head_no_pgss         | extended |       13816 |            1.000
>> patch_not_loaded     | extended |       13755 |            0.996
>> head_track_none      | extended |       13607 |            0.985
>> patch_track_none     | extended |       13560 |            0.981
>> head_track_top       | extended |       13277 |            0.961
>> patch_track_top      | extended |       13189 |            0.955
>> patch_track_planning | extended |       12983 |            0.940
>> head_no_pgss         | prepared |       29101 |            1.000
>> head_track_none      | prepared |       28510 |            0.980
>> patch_track_none     | prepared |       28481 |            0.979
>> patch_not_loaded     | prepared |       28382 |            0.975
>> patch_track_planning | prepared |       28046 |            0.964
>> head_track_top       | prepared |       28035 |            0.963
>> patch_track_top      | prepared |       27973 |            0.961
>> head_no_pgss         | simple   |       16733 |            1.000
>> patch_not_loaded     | simple   |       16552 |            0.989
>> head_track_none      | simple   |       16452 |            0.983
>> patch_track_none     | simple   |       16365 |            0.978
>> head_track_top       | simple   |       15867 |            0.948
>> patch_track_top      | simple   |       15820 |            0.945
>> patch_track_planning | simple   |       15739 |            0.941
>>
>> So I found slight slowdown with track_planning = off compared to HEAD.
>> Possibly just at the level of measurement error. I think this is ok.
>> track_planning = on also has no dramatic impact. In my opinion
>> proposed design with pgss_store call is acceptable.
>>
>
> FWIW I've done some benchmarking on this too, with a single pgbench client
> running select-only test on a tiny database, in different modes (simple,
> extended, prepared). I've done that on two systems with different CPUs
> (spreadsheet with results attached).

Refering to Sergei's results, if a user, currently using pgss with
tracking execute time, uses the new feature, a user will see 0~2.2%
performance regression as below.

 >> head_track_top       | extended |       13277 |            0.961
 >> patch_track_planning | extended |       12983 |            0.940
 >> patch_track_planning | prepared |       28046 |            0.964
 >> head_track_top       | prepared |       28035 |            0.963
 >> head_track_top       | simple   |       15867 |            0.948
 >> patch_track_planning | simple   |       15739 |            0.941

If a user will not turn on the track_planning, a user will see 0.2-0.6%
performance regression as below.

 >> head_track_top       | extended |       13277 |            0.961
 >> patch_track_top      | extended |       13189 |            0.955
 >> head_track_top       | prepared |       28035 |            0.963
 >> patch_track_top      | prepared |       27973 |            0.961
 >> head_track_top       | simple   |       15867 |            0.948
 >> patch_track_top      | simple   |       15820 |            0.945

>
> I don't see any performance regression - there are some small variations
> in both directions (say, ~1%) but that's well within the noise. So I think
> the patch is fine in this regard.

+1


I also saw the codes and have one comment.

[0002 patch]
In pgss_planner_hook:

+ /* calc differences of buffer counters. */
+ bufusage = compute_buffer_counters(bufusage_start, pgBufferUsage);
+
+ /*
+ * we only store planning duration, query text has been initialized
+ * during previous pgss_post_parse_analyze as it not available inside
+ * pgss_planner_hook.
+ */
+ pgss_store(query_text,

Do we need to calculate bufusage in here?
We only store planning duration in the following pgss_store.

--
Yoshikazu Imai
Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Julien Rouhaud
In reply to this post by Tomas Vondra-4
On Fri, Sep 6, 2019 at 4:19 PM Tomas Vondra
<[hidden email]> wrote:

>
> On Wed, Sep 04, 2019 at 07:19:47PM +0300, Sergei Kornilov wrote:
> >
> > ...
> >
> >Results:
> >
> >         test         |   mode   | average_tps | degradation_perc
> >----------------------+----------+-------------+------------------
> > head_no_pgss         | extended |       13816 |            1.000
> > patch_not_loaded     | extended |       13755 |            0.996
> > head_track_none      | extended |       13607 |            0.985
> > patch_track_none     | extended |       13560 |            0.981
> > head_track_top       | extended |       13277 |            0.961
> > patch_track_top      | extended |       13189 |            0.955
> > patch_track_planning | extended |       12983 |            0.940
> > head_no_pgss         | prepared |       29101 |            1.000
> > head_track_none      | prepared |       28510 |            0.980
> > patch_track_none     | prepared |       28481 |            0.979
> > patch_not_loaded     | prepared |       28382 |            0.975
> > patch_track_planning | prepared |       28046 |            0.964
> > head_track_top       | prepared |       28035 |            0.963
> > patch_track_top      | prepared |       27973 |            0.961
> > head_no_pgss         | simple   |       16733 |            1.000
> > patch_not_loaded     | simple   |       16552 |            0.989
> > head_track_none      | simple   |       16452 |            0.983
> > patch_track_none     | simple   |       16365 |            0.978
> > head_track_top       | simple   |       15867 |            0.948
> > patch_track_top      | simple   |       15820 |            0.945
> > patch_track_planning | simple   |       15739 |            0.941
> >
> >So I found slight slowdown with track_planning = off compared to HEAD. Possibly just at the level of measurement error. I think this is ok.
> >track_planning = on also has no dramatic impact. In my opinion proposed design with pgss_store call is acceptable.
> >
>
> FWIW I've done some benchmarking on this too, with a single pgbench client
> running select-only test on a tiny database, in different modes (simple,
> extended, prepared). I've done that on two systems with different CPUs
> (spreadsheet with results attached).
>
> I don't see any performance regression - there are some small variations
> in both directions (say, ~1%) but that's well within the noise. So I think
> the patch is fine in this regard.

Thanks a lot Sergei and Tomas!  It's good to know that this patch
doesn't add significant overhead.


Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Julien Rouhaud
In reply to this post by Yoshikazu Imai
Hello,

On Sun, Sep 8, 2019 at 11:45 AM Imai Yoshikazu <[hidden email]> wrote:
>
> I also saw the codes and have one comment.

Thanks for looking at this patch!

> [0002 patch]
> In pgss_planner_hook:
>
> +               /* calc differences of buffer counters. */
> +               bufusage = compute_buffer_counters(bufusage_start, pgBufferUsage);
> +
> +               /*
> +                * we only store planning duration, query text has been initialized
> +                * during previous pgss_post_parse_analyze as it not available inside
> +                * pgss_planner_hook.
> +                */
> +               pgss_store(query_text,
>
> Do we need to calculate bufusage in here?
> We only store planning duration in the following pgss_store.

Good point!  Postgres can definitely access some buffers while
planning a query (the most obvious example would be
get_actual_variable_range()), but as far as I can tell those were
previously not accounted for with pg_stat_statements as
queryDesc->totaltime->bufusage is only accumulating buffer usage in
the  executor, and indeed current patch also ignore such computed
counters.

I think it would be better to keep this bufusage calculation during
planning and fix pgss_store() to process them, but this would add
slightly more overhead.


> We only store planning duration in the following pgss_store.
>
> --
> Yoshikazu Imai


Reply | Threaded
Open this post in threaded view
|

RE: Planning counters in pg_stat_statements (using pgss_store)

imai.yoshikazu@fujitsu.com
On Tue, Sept 10, 2019 at 11:27 PM, Julien Rouhaud wrote:

> > [0002 patch]
> > In pgss_planner_hook:
> >
> > +               /* calc differences of buffer counters. */
> > +               bufusage = compute_buffer_counters(bufusage_start, pgBufferUsage);
> > +
> > +               /*
> > +                * we only store planning duration, query text has been initialized
> > +                * during previous pgss_post_parse_analyze as it not available inside
> > +                * pgss_planner_hook.
> > +                */
> > +               pgss_store(query_text,
> >
> > Do we need to calculate bufusage in here?
> > We only store planning duration in the following pgss_store.
>
> Good point!  Postgres can definitely access some buffers while
> planning a query (the most obvious example would be
> get_actual_variable_range()), but as far as I can tell those were
> previously not accounted for with pg_stat_statements as
> queryDesc->totaltime->bufusage is only accumulating buffer usage in
> the  executor, and indeed current patch also ignore such computed
> counters.
>
> I think it would be better to keep this bufusage calculation during
> planning and fix pgss_store() to process them, but this would add
slightly more overhead.

Sorry for my late reply.
I think overhead would be trivial and we can include bufusage of planning from
the POV of overhead, but yeah, it will be backward incompatibility if we
include them.


BTW, ISTM it is good for including {min,max,mean,stddev}_plan_time to
pg_stat_statements. Generally plan_time would be almost the same time in each
execution for the same query, but there are some exceptions. For example, if we
use prepare statements which uses partition tables, time differs largely
between creating a general plan and creating a custom plan.

1. Create partition table which has 1024 partitions.
2. Prepare select and update statements.
  sel) prepare sel(int) as select * from pt where a = $1;
  upd) prepare upd(int, int) as update pt set a = $2 where a = $1;
3. Execute each statement for 8 times.
  3-1. Select from pg_stat_statements view after every execution.
    select query, plans, total_plan_time, calls, total_exec_time from pg_stat_statements where query like 'prepare%';


Results of pg_stat_statements of sel) are
query                       | plans | total_plan_time | calls | total_exec_time
---------------------------------------------------+-------+-----------------+-------+-----------------
 prepare sel(int) as select * from pt where a = $1 |     1 |        0.164361 |     1 |        0.004613
 prepare sel(int) as select * from pt where a = $1 |     2 | 0.27715500000000004 |     2 |        0.009447
 prepare sel(int) as select * from pt where a = $1 |     3 | 0.39100100000000004 |     3 |        0.014281
 prepare sel(int) as select * from pt where a = $1 |     4 |        0.504004 |     4 |        0.019265
 prepare sel(int) as select * from pt where a = $1 |     5 |        0.628242 |     5 |        0.024091
 prepare sel(int) as select * from pt where a = $1 |     7 | 24.213586000000003 |     6 |        0.029144
 prepare sel(int) as select * from pt where a = $1 |     8 | 24.368900000000004 |     7 |        0.034099
 prepare sel(int) as select * from pt where a = $1 |     9 | 24.527956000000003 |     8 |        0.046152


Results of pg_stat_statements of upd) are
 prepare upd(int, int) as update pt set a = $2 where a = $1 |     1 |        0.280099 |     1 |        0.013138
 prepare upd(int, int) as update pt set a = $2 where a = $1 |     2 |        0.405416 |     2 |         0.01894
 prepare upd(int, int) as update pt set a = $2 where a = $1 |     3 |        0.532361 |     3 |        0.040716
 prepare upd(int, int) as update pt set a = $2 where a = $1 |     4 |        0.671445 |     4 |        0.046566
 prepare upd(int, int) as update pt set a = $2 where a = $1 |     5 |        0.798531 |     5 | 0.052729000000000005
 prepare upd(int, int) as update pt set a = $2 where a = $1 |     7 |      896.915458 |     6 | 0.05888600000000001
 prepare upd(int, int) as update pt set a = $2 where a = $1 |     8 |      897.043512 |     7 |        0.064446
 prepare upd(int, int) as update pt set a = $2 where a = $1 |     9 |      897.169711 |     8 |        0.070644


How do you think about that?


--
Yoshikazu Imai

Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Julien Rouhaud
On Fri, Nov 8, 2019 at 5:35 AM [hidden email]
<[hidden email]> wrote:

>
> On Tue, Sept 10, 2019 at 11:27 PM, Julien Rouhaud wrote:
> > > [0002 patch]
> > > In pgss_planner_hook:
> > >
> > > +               /* calc differences of buffer counters. */
> > > +               bufusage = compute_buffer_counters(bufusage_start, pgBufferUsage);
> > > +
> > > +               /*
> > > +                * we only store planning duration, query text has been initialized
> > > +                * during previous pgss_post_parse_analyze as it not available inside
> > > +                * pgss_planner_hook.
> > > +                */
> > > +               pgss_store(query_text,
> > >
> > > Do we need to calculate bufusage in here?
> > > We only store planning duration in the following pgss_store.
> >
> > Good point!  Postgres can definitely access some buffers while
> > planning a query (the most obvious example would be
> > get_actual_variable_range()), but as far as I can tell those were
> > previously not accounted for with pg_stat_statements as
> > queryDesc->totaltime->bufusage is only accumulating buffer usage in
> > the  executor, and indeed current patch also ignore such computed
> > counters.
> >
> > I think it would be better to keep this bufusage calculation during
> > planning and fix pgss_store() to process them, but this would add
> slightly more overhead.
>
> Sorry for my late reply.
> I think overhead would be trivial and we can include bufusage of planning from
> the POV of overhead, but yeah, it will be backward incompatibility if we
> include them.

Ok, let's keep planning's bufusage then.

> BTW, ISTM it is good for including {min,max,mean,stddev}_plan_time to
> pg_stat_statements. Generally plan_time would be almost the same time in each
> execution for the same query, but there are some exceptions. For example, if we
> use prepare statements which uses partition tables, time differs largely
> between creating a general plan and creating a custom plan.
>
> 1. Create partition table which has 1024 partitions.
> 2. Prepare select and update statements.
>   sel) prepare sel(int) as select * from pt where a = $1;
>   upd) prepare upd(int, int) as update pt set a = $2 where a = $1;
> 3. Execute each statement for 8 times.
>   3-1. Select from pg_stat_statements view after every execution.
>     select query, plans, total_plan_time, calls, total_exec_time from pg_stat_statements where query like 'prepare%';
>
>
> Results of pg_stat_statements of sel) are
> query                       | plans | total_plan_time | calls | total_exec_time
> ---------------------------------------------------+-------+-----------------+-------+-----------------
>  prepare sel(int) as select * from pt where a = $1 |     1 |        0.164361 |     1 |        0.004613
>  prepare sel(int) as select * from pt where a = $1 |     2 | 0.27715500000000004 |     2 |        0.009447
>  prepare sel(int) as select * from pt where a = $1 |     3 | 0.39100100000000004 |     3 |        0.014281
>  prepare sel(int) as select * from pt where a = $1 |     4 |        0.504004 |     4 |        0.019265
>  prepare sel(int) as select * from pt where a = $1 |     5 |        0.628242 |     5 |        0.024091
>  prepare sel(int) as select * from pt where a = $1 |     7 | 24.213586000000003 |     6 |        0.029144
>  prepare sel(int) as select * from pt where a = $1 |     8 | 24.368900000000004 |     7 |        0.034099
>  prepare sel(int) as select * from pt where a = $1 |     9 | 24.527956000000003 |     8 |        0.046152
>
>
> Results of pg_stat_statements of upd) are
>  prepare upd(int, int) as update pt set a = $2 where a = $1 |     1 |        0.280099 |     1 |        0.013138
>  prepare upd(int, int) as update pt set a = $2 where a = $1 |     2 |        0.405416 |     2 |         0.01894
>  prepare upd(int, int) as update pt set a = $2 where a = $1 |     3 |        0.532361 |     3 |        0.040716
>  prepare upd(int, int) as update pt set a = $2 where a = $1 |     4 |        0.671445 |     4 |        0.046566
>  prepare upd(int, int) as update pt set a = $2 where a = $1 |     5 |        0.798531 |     5 | 0.052729000000000005
>  prepare upd(int, int) as update pt set a = $2 where a = $1 |     7 |      896.915458 |     6 | 0.05888600000000001
>  prepare upd(int, int) as update pt set a = $2 where a = $1 |     8 |      897.043512 |     7 |        0.064446
>  prepare upd(int, int) as update pt set a = $2 where a = $1 |     9 |      897.169711 |     8 |        0.070644
>
>
> How do you think about that?

That's indeed a very valid point and something we should help user to
investigate.  I'll submit an updated patch with support for
min/max/mean/stddev plan time shortly.


Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Julien Rouhaud
On Fri, Nov 8, 2019 at 3:31 PM Julien Rouhaud <[hidden email]> wrote:

>
> On Fri, Nov 8, 2019 at 5:35 AM [hidden email]
> <[hidden email]> wrote:
> >
> > On Tue, Sept 10, 2019 at 11:27 PM, Julien Rouhaud wrote:
> > > > [0002 patch]
> > > > In pgss_planner_hook:
> > > >
> > > > +               /* calc differences of buffer counters. */
> > > > +               bufusage = compute_buffer_counters(bufusage_start, pgBufferUsage);
> > > > +
> > > > +               /*
> > > > +                * we only store planning duration, query text has been initialized
> > > > +                * during previous pgss_post_parse_analyze as it not available inside
> > > > +                * pgss_planner_hook.
> > > > +                */
> > > > +               pgss_store(query_text,
> > > >
> > > > Do we need to calculate bufusage in here?
> > > > We only store planning duration in the following pgss_store.
> > >
> > > Good point!  Postgres can definitely access some buffers while
> > > planning a query (the most obvious example would be
> > > get_actual_variable_range()), but as far as I can tell those were
> > > previously not accounted for with pg_stat_statements as
> > > queryDesc->totaltime->bufusage is only accumulating buffer usage in
> > > the  executor, and indeed current patch also ignore such computed
> > > counters.
> > >
> > > I think it would be better to keep this bufusage calculation during
> > > planning and fix pgss_store() to process them, but this would add
> > slightly more overhead.
> >
> > Sorry for my late reply.
> > I think overhead would be trivial and we can include bufusage of planning from
> > the POV of overhead, but yeah, it will be backward incompatibility if we
> > include them.
>
> Ok, let's keep planning's bufusage then.
>
> > BTW, ISTM it is good for including {min,max,mean,stddev}_plan_time to
> > pg_stat_statements. Generally plan_time would be almost the same time in each
> > execution for the same query, but there are some exceptions. For example, if we
> > use prepare statements which uses partition tables, time differs largely
> > between creating a general plan and creating a custom plan.
> >
> > 1. Create partition table which has 1024 partitions.
> > 2. Prepare select and update statements.
> >   sel) prepare sel(int) as select * from pt where a = $1;
> >   upd) prepare upd(int, int) as update pt set a = $2 where a = $1;
> > 3. Execute each statement for 8 times.
> >   3-1. Select from pg_stat_statements view after every execution.
> >     select query, plans, total_plan_time, calls, total_exec_time from pg_stat_statements where query like 'prepare%';
> >
> >
> > Results of pg_stat_statements of sel) are
> > query                       | plans | total_plan_time | calls | total_exec_time
> > ---------------------------------------------------+-------+-----------------+-------+-----------------
> >  prepare sel(int) as select * from pt where a = $1 |     1 |        0.164361 |     1 |        0.004613
> >  prepare sel(int) as select * from pt where a = $1 |     2 | 0.27715500000000004 |     2 |        0.009447
> >  prepare sel(int) as select * from pt where a = $1 |     3 | 0.39100100000000004 |     3 |        0.014281
> >  prepare sel(int) as select * from pt where a = $1 |     4 |        0.504004 |     4 |        0.019265
> >  prepare sel(int) as select * from pt where a = $1 |     5 |        0.628242 |     5 |        0.024091
> >  prepare sel(int) as select * from pt where a = $1 |     7 | 24.213586000000003 |     6 |        0.029144
> >  prepare sel(int) as select * from pt where a = $1 |     8 | 24.368900000000004 |     7 |        0.034099
> >  prepare sel(int) as select * from pt where a = $1 |     9 | 24.527956000000003 |     8 |        0.046152
> >
> >
> > Results of pg_stat_statements of upd) are
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     1 |        0.280099 |     1 |        0.013138
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     2 |        0.405416 |     2 |         0.01894
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     3 |        0.532361 |     3 |        0.040716
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     4 |        0.671445 |     4 |        0.046566
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     5 |        0.798531 |     5 | 0.052729000000000005
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     7 |      896.915458 |     6 | 0.05888600000000001
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     8 |      897.043512 |     7 |        0.064446
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     9 |      897.169711 |     8 |        0.070644
> >
> >
> > How do you think about that?
>
> That's indeed a very valid point and something we should help user to
> investigate.  I'll submit an updated patch with support for
> min/max/mean/stddev plan time shortly.
I attach v3 patches implementing those counters.  Note that to avoid
duplicating some code (related to Welford's method), I switched some
of the counters to arrays rather than scalar variables.  It
unfortunately makes pg_stat_statements_internal() a little bit messy,
but I hope that it's still acceptable.  While doing this refactoring I
saw that previous patches were failing to accumulate the buffers used
during planning, which is now fixed.

0002-Add-planning-counters-to-pg_stat_statements-v3.patch (62K) Download Attachment
0001-Pass-query-string-to-the-planner-v3.patch (14K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

RE: Planning counters in pg_stat_statements (using pgss_store)

imai.yoshikazu@fujitsu.com
On Sat, Nov 9, 2019 at 1:36 PM, Julien Rouhaud wrote:

> On Fri, Nov 8, 2019 at 3:31 PM Julien Rouhaud <[hidden email]> wrote:
> >
> > On Fri, Nov 8, 2019 at 5:35 AM [hidden email]
> > <[hidden email]> wrote:
> > >
> > > On Tue, Sept 10, 2019 at 11:27 PM, Julien Rouhaud wrote:
> > > > > [0002 patch]
> > > > > In pgss_planner_hook:
> > > > >
> > > > > +               /* calc differences of buffer counters. */
> > > > > +               bufusage =
> > > > > + compute_buffer_counters(bufusage_start, pgBufferUsage);
> > > > > +
> > > > > +               /*
> > > > > +                * we only store planning duration, query text has been initialized
> > > > > +                * during previous pgss_post_parse_analyze as it not available inside
> > > > > +                * pgss_planner_hook.
> > > > > +                */
> > > > > +               pgss_store(query_text,
> > > > >
> > > > > Do we need to calculate bufusage in here?
> > > > > We only store planning duration in the following pgss_store.
> > > >
> > > > Good point!  Postgres can definitely access some buffers while
> > > > planning a query (the most obvious example would be
> > > > get_actual_variable_range()), but as far as I can tell those were
> > > > previously not accounted for with pg_stat_statements as
> > > > queryDesc->totaltime->bufusage is only accumulating buffer usage
> > > > queryDesc->totaltime->in
> > > > the  executor, and indeed current patch also ignore such computed
> > > > counters.
> > > >
> > > > I think it would be better to keep this bufusage calculation
> > > > during planning and fix pgss_store() to process them, but this
> > > > would add
> > > slightly more overhead.
> > >
> > > Sorry for my late reply.
> > > I think overhead would be trivial and we can include bufusage of
> > > planning from the POV of overhead, but yeah, it will be backward
> > > incompatibility if we include them.
> >
> > Ok, let's keep planning's bufusage then.
> >
> > > BTW, ISTM it is good for including {min,max,mean,stddev}_plan_time
> > > to pg_stat_statements. Generally plan_time would be almost the same
> > > time in each execution for the same query, but there are some
> > > exceptions. For example, if we use prepare statements which uses
> > > partition tables, time differs largely between creating a general plan and creating a custom plan.
> > >
> > > 1. Create partition table which has 1024 partitions.
> > > 2. Prepare select and update statements.
> > >   sel) prepare sel(int) as select * from pt where a = $1;
> > >   upd) prepare upd(int, int) as update pt set a = $2 where a = $1;
> > > 3. Execute each statement for 8 times.
> > >   3-1. Select from pg_stat_statements view after every execution.
> > >     select query, plans, total_plan_time, calls, total_exec_time
> > > from pg_stat_statements where query like 'prepare%';
> > >
> > >
> > > Results of pg_stat_statements of sel) are
> > > query                       | plans | total_plan_time | calls | total_exec_time
> > > ---------------------------------------------------+-------+-----------------+-------+-----------------
> > >  prepare sel(int) as select * from pt where a = $1 |     1 |        0.164361 |     1 |        0.004613
> > >  prepare sel(int) as select * from pt where a = $1 |     2 | 0.27715500000000004 |     2 |        0.009447
> > >  prepare sel(int) as select * from pt where a = $1 |     3 | 0.39100100000000004 |     3 |        0.014281
> > >  prepare sel(int) as select * from pt where a = $1 |     4 |        0.504004 |     4 |        0.019265
> > >  prepare sel(int) as select * from pt where a = $1 |     5 |        0.628242 |     5 |        0.024091
> > >  prepare sel(int) as select * from pt where a = $1 |     7 | 24.213586000000003 |     6 |        0.029144
> > >  prepare sel(int) as select * from pt where a = $1 |     8 | 24.368900000000004 |     7 |        0.034099
> > >  prepare sel(int) as select * from pt where a = $1 |     9 | 24.527956000000003 |     8 |        0.046152
> > >
> > >
> > > Results of pg_stat_statements of upd) are
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     1 |        0.280099 |     1 |        0.013138
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     2 |        0.405416 |     2 |         0.01894
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     3 |        0.532361 |     3 |        0.040716
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     4 |        0.671445 |     4 |        0.046566
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     5 |        0.798531 |     5 | 0.052729000000000005
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     7 |      896.915458 |     6 | 0.05888600000000001
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     8 |      897.043512 |     7 |        0.064446
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 |     9 |      897.169711 |     8 |        0.070644
> > >
> > >
> > > How do you think about that?
> >
> > That's indeed a very valid point and something we should help user to
> > investigate.  I'll submit an updated patch with support for
> > min/max/mean/stddev plan time shortly.
>
> I attach v3 patches implementing those counters.  

Thanks for updating the patch! Now I can see min/max/mean/stddev plan time.


> Note that to avoid duplicating some code (related to Welford's method),
> I switched some of the counters to arrays rather than scalar variables.  It unfortunately makes
> pg_stat_statements_internal() a little bit messy, but I hope that it's still acceptable.  

Yeah, I also think it's acceptable, but I think the codes like below one is more
understandable than using for loop in pg_stat_statements_internal(),
although some codes will be duplicated.

pg_stat_statements_internal():

if (api_version >= PGSS_V1_8)
{
    kind = PGSS_PLAN;
    values[i++] = Int64GetDatumFast(tmp.calls[kind]);
    values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
    values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
    values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
    values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
    values[i++] = Float8GetDatumFast(stddev(tmp));
}

kind = PGSS_EXEC;
values[i++] = Int64GetDatumFast(tmp.calls[kind]);
values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
if (api_version >= PGSS_V1_3)
{
    values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
    values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
    values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
    values[i++] = Float8GetDatumFast(stddev(tmp));
}


stddev(Counters counters)
{
    /*
     * Note we are calculating the population variance here, not the
     * sample variance, as we have data for the whole population, so
     * Bessel's correction is not used, and we don't divide by
     * tmp.calls - 1.
     */
    if (counters.calls[kind] > 1)
        return stddev = sqrt(counters.sum_var_time[kind] / counters.calls[kind]);

    return 0.0;
}


> While doing this refactoring
> I saw that previous patches were failing to accumulate the buffers used during planning, which is now fixed.

Checked.
Now buffer usage columns include buffer usage during planning and executing,
but if we turn off track_planning, buffer usage during planning is also not
tracked which is good for users who don't want to take into account of that.


What I'm concerned about is column names will not be backward-compatible.
{total, min, max, mean, stddev}_{plan, exec}_time are the best names which
correctly show the meaning of its value, but we can't use
{total, min, max, mean, stddev}_time anymore and it might be harmful for
some users.
I don't come up with any good idea for that...

--
Yoshikazu Imai
Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Julien Rouhaud
On Tue, Nov 12, 2019 at 5:41 AM [hidden email]
<[hidden email]> wrote:

>
> On Sat, Nov 9, 2019 at 1:36 PM, Julien Rouhaud wrote:
> >
> > I attach v3 patches implementing those counters.
>
> Thanks for updating the patch! Now I can see min/max/mean/stddev plan time.
>
>
> > Note that to avoid duplicating some code (related to Welford's method),
> > I switched some of the counters to arrays rather than scalar variables.  It unfortunately makes
> > pg_stat_statements_internal() a little bit messy, but I hope that it's still acceptable.
>
> Yeah, I also think it's acceptable, but I think the codes like below one is more
> understandable than using for loop in pg_stat_statements_internal(),
> although some codes will be duplicated.
>
> pg_stat_statements_internal():
>
> if (api_version >= PGSS_V1_8)
> {
>     kind = PGSS_PLAN;
>     values[i++] = Int64GetDatumFast(tmp.calls[kind]);
>     values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
>     values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
>     values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
>     values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
>     values[i++] = Float8GetDatumFast(stddev(tmp));
> }
>
> kind = PGSS_EXEC;
> values[i++] = Int64GetDatumFast(tmp.calls[kind]);
> values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
> if (api_version >= PGSS_V1_3)
> {
>     values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
>     values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
>     values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
>     values[i++] = Float8GetDatumFast(stddev(tmp));
> }
>
>
> stddev(Counters counters)
> {
>     /*
>      * Note we are calculating the population variance here, not the
>      * sample variance, as we have data for the whole population, so
>      * Bessel's correction is not used, and we don't divide by
>      * tmp.calls - 1.
>      */
>     if (counters.calls[kind] > 1)
>         return stddev = sqrt(counters.sum_var_time[kind] / counters.calls[kind]);
>
>     return 0.0;
> }

Yes, that's also a possibility (though this should also take the
"kind" as parameter).  I wanted to avoid adding a new function and
save some redundant code, but I can change it in the next version of
the patch if needed.

> > While doing this refactoring
> > I saw that previous patches were failing to accumulate the buffers used during planning, which is now fixed.
>
> Checked.
> Now buffer usage columns include buffer usage during planning and executing,
> but if we turn off track_planning, buffer usage during planning is also not
> tracked which is good for users who don't want to take into account of that.

Indeed.  Note that there's a similar discussion on adding planning
buffer counters to explain in [1].  I'm unsure if merging planning and
execution counters in the same columns is ok or not.

> What I'm concerned about is column names will not be backward-compatible.
> {total, min, max, mean, stddev}_{plan, exec}_time are the best names which
> correctly show the meaning of its value, but we can't use
> {total, min, max, mean, stddev}_time anymore and it might be harmful for
> some users.
> I don't come up with any good idea for that...

Well, perhaps keeping the old {total, min, max, mean, stddev}_time
would be ok, as those historically meant "execution".  I don't have a
strong opinion here.

[1] https://www.postgresql.org/message-id/20191112205506.rvadbx2dnku3paaw@...


Reply | Threaded
Open this post in threaded view
|

RE: Planning counters in pg_stat_statements (using pgss_store)

imai.yoshikazu@fujitsu.com
On Wed, Nov 13, 2019 at 10:50 AM, Julien Rouhaud wrote:

> On Tue, Nov 12, 2019 at 5:41 AM [hidden email] <[hidden email]> wrote:
> >
> > On Sat, Nov 9, 2019 at 1:36 PM, Julien Rouhaud wrote:
> > >
> > > I attach v3 patches implementing those counters.
> >
> > Thanks for updating the patch! Now I can see min/max/mean/stddev plan time.
> >
> >
> > > Note that to avoid duplicating some code (related to Welford's
> > > method), I switched some of the counters to arrays rather than
> > > scalar variables.  It unfortunately makes
> > > pg_stat_statements_internal() a little bit messy, but I hope that it's still acceptable.
> >
> > Yeah, I also think it's acceptable, but I think the codes like below
> > one is more understandable than using for loop in
> > pg_stat_statements_internal(), although some codes will be duplicated.
> >
> > pg_stat_statements_internal():
> >
> > if (api_version >= PGSS_V1_8)
> > {
> >     kind = PGSS_PLAN;
> >     values[i++] = Int64GetDatumFast(tmp.calls[kind]);
> >     values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
> >     values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
> >     values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
> >     values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
> >     values[i++] = Float8GetDatumFast(stddev(tmp)); }
> >
> > kind = PGSS_EXEC;
> > values[i++] = Int64GetDatumFast(tmp.calls[kind]);
> > values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
> > if (api_version >= PGSS_V1_3)
> > {
> >     values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
> >     values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
> >     values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
> >     values[i++] = Float8GetDatumFast(stddev(tmp)); }
> >
> >
> > stddev(Counters counters)
> > {
> >     /*
> >      * Note we are calculating the population variance here, not the
> >      * sample variance, as we have data for the whole population, so
> >      * Bessel's correction is not used, and we don't divide by
> >      * tmp.calls - 1.
> >      */
> >     if (counters.calls[kind] > 1)
> >         return stddev = sqrt(counters.sum_var_time[kind] / counters.calls[kind]);
> >
> >     return 0.0;
> > }
>
> Yes, that's also a possibility (though this should also take the
> "kind" as parameter).  I wanted to avoid adding a new function and
> save some redundant code, but I can change it in the next version of
> the patch if needed.

Okay. It's not much a serious problem, so we can leave it as it is.


> > > While doing this refactoring
> > > I saw that previous patches were failing to accumulate the buffers used during planning, which is now fixed.
> >
> > Checked.
> > Now buffer usage columns include buffer usage during planning and executing,
> > but if we turn off track_planning, buffer usage during planning is also not
> > tracked which is good for users who don't want to take into account of that.
>
> Indeed.  Note that there's a similar discussion on adding planning
> buffer counters to explain in [1].  I'm unsure if merging planning and
> execution counters in the same columns is ok or not.

Storing buffer usage to different columns is useful to detect the cause of the problems if there are the cases many buffers are used during planning, but I'm also unsure those cases actually exist.


> > What I'm concerned about is column names will not be backward-compatible.
> > {total, min, max, mean, stddev}_{plan, exec}_time are the best names which
> > correctly show the meaning of its value, but we can't use
> > {total, min, max, mean, stddev}_time anymore and it might be harmful for
> > some users.
> > I don't come up with any good idea for that...
>
> Well, perhaps keeping the old {total, min, max, mean, stddev}_time
> would be ok, as those historically meant "execution".  I don't have a
> strong opinion here.

Actually I also don't have strong opinion but I thought someone would complain about renaming of those columns and also some tools like monitoring which use those columns will not work. If we use {total, min, max, mean, stddev}_time, someone might mistakenly understand {total, min, max, mean, stddev}_time mean {total, min, max, mean, stddev} of planning and execution.
If I need to choose {total, min, max, mean, stddev}_time or {total, min, max, mean, stddev}_exec_time, I choose former one because choosing best name is not worth destructing the existing scripts or tools.

Thanks.
--
Yoshikazu Imai
Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Julien Rouhaud
On Fri, Nov 15, 2019 at 2:00 AM [hidden email]
<[hidden email]> wrote:
>
> Actually I also don't have strong opinion but I thought someone would complain about renaming of those columns and also some tools like monitoring which use those columns will not work. If we use {total, min, max, mean, stddev}_time, someone might mistakenly understand {total, min, max, mean, stddev}_time mean {total, min, max, mean, stddev} of planning and execution.
> If I need to choose {total, min, max, mean, stddev}_time or {total, min, max, mean, stddev}_exec_time, I choose former one because choosing best name is not worth destructing the existing scripts or tools.

We could definitely keep (plan|exec)_time for the SRF, and have the
{total, min, max, mean, stddev}_time created by the view to be a sum
of planning + execution for each counter, and it doesn't sound like a
bad idea if having even more columns in the view is not an issue.


Reply | Threaded
Open this post in threaded view
|

RE: Planning counters in pg_stat_statements (using pgss_store)

imai.yoshikazu@fujitsu.com
On Tue, Nov 19, 2019 at 2:27 PM, Julien Rouhaud wrote:

> On Fri, Nov 15, 2019 at 2:00 AM [hidden email] <[hidden email]> wrote:
> >
> > Actually I also don't have strong opinion but I thought someone would complain about renaming of those columns and
> also some tools like monitoring which use those columns will not work. If we use {total, min, max, mean, stddev}_time,
> someone might mistakenly understand {total, min, max, mean, stddev}_time mean {total, min, max, mean, stddev} of planning
> and execution.
> > If I need to choose {total, min, max, mean, stddev}_time or {total, min, max, mean, stddev}_exec_time, I choose former
> one because choosing best name is not worth destructing the existing scripts or tools.
>
> We could definitely keep (plan|exec)_time for the SRF, and have the {total, min, max, mean, stddev}_time created by
> the view to be a sum of planning + execution for each counter

I might misunderstand but if we define {total, min, max, mean, stddev}_time is
just a sum of planning + execution for each counter like
"select total_plan_time + total_exec_time as total_time from pg_stat_statements",
I wonder we can calculate stddev_time correctly. If we prepare variables in
the codes to calculate those values, yes, we can correctly calculate those
values even for the total_stddev.

> and it doesn't sound like a bad idea if having even
> more columns in the view is not an issue.

I also wondered having many columns in the view is ok, but if it's ok, I agree
all of those columns are in the view. Only problem I can come up with is the
view will look bad with many columns, but it already looks bad because query
column values tend to be long and each row can't fit in the one row in the
console.

--
Yoshikazu Imai
Reply | Threaded
Open this post in threaded view
|

Re: Planning counters in pg_stat_statements (using pgss_store)

Julien Rouhaud
On Wed, Nov 20, 2019 at 2:06 AM [hidden email]
<[hidden email]> wrote:

>
> On Tue, Nov 19, 2019 at 2:27 PM, Julien Rouhaud wrote:
> > On Fri, Nov 15, 2019 at 2:00 AM [hidden email] <[hidden email]> wrote:
> > >
> > > Actually I also don't have strong opinion but I thought someone would complain about renaming of those columns and
> > also some tools like monitoring which use those columns will not work. If we use {total, min, max, mean, stddev}_time,
> > someone might mistakenly understand {total, min, max, mean, stddev}_time mean {total, min, max, mean, stddev} of planning
> > and execution.
> > > If I need to choose {total, min, max, mean, stddev}_time or {total, min, max, mean, stddev}_exec_time, I choose former
> > one because choosing best name is not worth destructing the existing scripts or tools.
> >
> > We could definitely keep (plan|exec)_time for the SRF, and have the {total, min, max, mean, stddev}_time created by
> > the view to be a sum of planning + execution for each counter
>
> I might misunderstand but if we define {total, min, max, mean, stddev}_time is
> just a sum of planning + execution for each counter like
> "select total_plan_time + total_exec_time as total_time from pg_stat_statements",
> I wonder we can calculate stddev_time correctly. If we prepare variables in
> the codes to calculate those values, yes, we can correctly calculate those
> values even for the total_stddev.

Yes you're right, this can't possibly work for most of the counters.
And also, since there's no guarantee that each execution will follow a
planning, providing such global counters for min/max/mean and stddev
wouldn't make much sense.


123