Transaction commits VS Transaction commits (with parallel) VS query mean time

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Transaction commits VS Transaction commits (with parallel) VS query mean time

Haribabu Kommi-2
Hi Hackers,

Does increase in Transaction commits per second means good query performance?
Why I asked this question is, many monitoring tools display that number of transactions
per second in the dashboard (including pgadmin).

During the testing of bunch of queries with different set of configurations, I observed that
TPS of some particular configuration has increased compared to default server configuration, but the overall query execution performance is decreased after comparing all queries run time. 

This is because of larger xact_commit value than default configuration. With the changed server configuration, that leads to generate more parallel workers and every parallel worker operation is treated as an extra commit, because of this reason, the total number of commits increased, but the overall query performance is decreased.

Is there any relation of transaction commits to performance? 
 
Is there any specific reason to consider the parallel worker activity also as a transaction commit? Especially in my observation, if we didn't consider the parallel worker activity as separate commits, the test doesn't show an increase in transaction commits.

Suggestions?

Regards,
Haribabu Kommi
Fujitsu Australia
Reply | Threaded
Open this post in threaded view
|

Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

Haribabu Kommi-2

On Thu, Feb 7, 2019 at 9:31 PM Haribabu Kommi <[hidden email]> wrote:
Hi Hackers,

Does increase in Transaction commits per second means good query performance?
Why I asked this question is, many monitoring tools display that number of transactions
per second in the dashboard (including pgadmin).

During the testing of bunch of queries with different set of configurations, I observed that
TPS of some particular configuration has increased compared to default server configuration, but the overall query execution performance is decreased after comparing all queries run time. 

This is because of larger xact_commit value than default configuration. With the changed server configuration, that leads to generate more parallel workers and every parallel worker operation is treated as an extra commit, because of this reason, the total number of commits increased, but the overall query performance is decreased.

Is there any relation of transaction commits to performance? 
 
Is there any specific reason to consider the parallel worker activity also as a transaction commit? Especially in my observation, if we didn't consider the parallel worker activity as separate commits, the test doesn't show an increase in transaction commits.

The following statements shows the increase in the xact_commit value with
parallel workers. I can understand that workers updating the seq_scan stats
as they performed the seq scan. Is the same applied to parallel worker transaction
commits also?

The transaction commit counter is updated with all the internal operations like
autovacuum, checkpoint and etc. The increase in counters with these operations
are not that visible compared to parallel workers.

The spike of TPS with parallel workers is fine to consider?

postgres=# select relname, seq_scan from pg_stat_user_tables where relname = 'tbl';
 relname | seq_scan 
---------+----------
 tbl     |       16
(1 row)

postgres=# begin;
BEGIN
postgres=# select xact_commit from pg_stat_database where datname = 'postgres';
 xact_commit 
-------------
         524
(1 row)

postgres=# explain analyze select * from tbl where f1 = 1000;
                                                    QUERY PLAN                                                     
-------------------------------------------------------------------------------------------------------------------
 Gather  (cost=0.00..3645.83 rows=1 width=214) (actual time=1.703..79.736 rows=1 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on tbl  (cost=0.00..3645.83 rows=1 width=214) (actual time=28.180..51.672 rows=0 loops=3)
         Filter: (f1 = 1000)
         Rows Removed by Filter: 33333
 Planning Time: 0.090 ms
 Execution Time: 79.776 ms
(8 rows)

postgres=# commit;
COMMIT
postgres=# select xact_commit from pg_stat_database where datname = 'postgres';
 xact_commit 
-------------
         531
(1 row)

postgres=# select relname, seq_scan from pg_stat_user_tables where relname = 'tbl';
 relname | seq_scan 
---------+----------
 tbl     |       19
(1 row)

Regards,
Haribabu Kommi
Fujitsu Australia
Reply | Threaded
Open this post in threaded view
|

Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

akapila
On Fri, Feb 8, 2019 at 6:55 AM Haribabu Kommi <[hidden email]> wrote:

>
> On Thu, Feb 7, 2019 at 9:31 PM Haribabu Kommi <[hidden email]> wrote:
>>
>>
>> This is because of larger xact_commit value than default configuration. With the changed server configuration, that leads to generate more parallel workers and every parallel worker operation is treated as an extra commit, because of this reason, the total number of commits increased, but the overall query performance is decreased.
>>
>> Is there any relation of transaction commits to performance?
>>
>> Is there any specific reason to consider the parallel worker activity also as a transaction commit? Especially in my observation, if we didn't consider the parallel worker activity as separate commits, the test doesn't show an increase in transaction commits.
>
>
> The following statements shows the increase in the xact_commit value with
> parallel workers. I can understand that workers updating the seq_scan stats
> as they performed the seq scan.
>

Yeah, that seems okay, however, one can say that for the scan they
want to consider it as a single scan even if part of the scan is
accomplished by workers or may be a separate counter for parallel
workers scan.

> Is the same applied to parallel worker transaction
> commits also?
>

I don't think so.  It seems to me that we should consider it as a
single transaction.  Do you want to do the leg work for this and try
to come up with a patch?  On a quick look, I think we might want to
change AtEOXact_PgStat so that the commits for parallel workers are
not considered.  I think the caller has that information.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Reply | Threaded
Open this post in threaded view
|

Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

Haribabu Kommi-2

On Sat, Feb 9, 2019 at 4:07 PM Amit Kapila <[hidden email]> wrote:
On Fri, Feb 8, 2019 at 6:55 AM Haribabu Kommi <[hidden email]> wrote:
>
> On Thu, Feb 7, 2019 at 9:31 PM Haribabu Kommi <[hidden email]> wrote:
>>
>>
>> This is because of larger xact_commit value than default configuration. With the changed server configuration, that leads to generate more parallel workers and every parallel worker operation is treated as an extra commit, because of this reason, the total number of commits increased, but the overall query performance is decreased.
>>
>> Is there any relation of transaction commits to performance?
>>
>> Is there any specific reason to consider the parallel worker activity also as a transaction commit? Especially in my observation, if we didn't consider the parallel worker activity as separate commits, the test doesn't show an increase in transaction commits.
>
>
> The following statements shows the increase in the xact_commit value with
> parallel workers. I can understand that workers updating the seq_scan stats
> as they performed the seq scan.
>

Thanks for your opinion.
 
Yeah, that seems okay, however, one can say that for the scan they
want to consider it as a single scan even if part of the scan is
accomplished by workers or may be a separate counter for parallel
workers scan.

OK.
 
> Is the same applied to parallel worker transaction
> commits also?
>

I don't think so.  It seems to me that we should consider it as a
single transaction.  Do you want to do the leg work for this and try
to come up with a patch?  On a quick look, I think we might want to
change AtEOXact_PgStat so that the commits for parallel workers are
not considered.  I think the caller has that information.

I try to fix it by adding a check for parallel worker or not and based on it
count them into stats. Patch attached. 

With this patch, currently it doesn't count parallel worker transactions, and
rest of the stats like seqscan and etc are still get counted. IMO they still 
may need to be counted as those stats represent the number of tuples
returned and etc.

Comments?

Regards,
Haribabu Kommi
Fujitsu Australia

0001-Avoid-counting-parallel-worker-transactions-stats.patch (2K) Download Attachment