Commit(?) overhead


Commit(?) overhead

Duncan Kinnear
We have a very simple table, whose DDL is as follows:

    CREATE TABLE public.next_id (
       id varchar(255) NOT NULL,
       next_value int8 NOT NULL,
       CONSTRAINT next_id_pk PRIMARY KEY (id)
    );

The table only has about 125 rows, and there are no indexes apart from the primary key constraint.

In DBeaver I am executing the following UPDATE query:

    UPDATE next_id SET next_value=next_value+1 WHERE id='Session';

If I point DBeaver to a server (localhost) running version:
   11.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20181127, 64-bit
it executes on average in about 50ms.

The EXPLAIN (ANALYSE, TIMING TRUE) of this query gives:

 Update on next_id  (cost=0.14..8.16 rows=1 width=36) (actual time=0.057..0.057 rows=0 loops=1)
   ->  Index Scan using next_id_pk on next_id  (cost=0.14..8.16 rows=1 width=36) (actual time=0.039..0.040 rows=1 loops=1)
         Index Cond: ((id)::text = 'Session'::text)
 Planning Time: 0.083 ms
 Execution Time: 0.089 ms

which is significantly less than 50ms.

Now, if I point DBeaver to a VM server on the same gigabit network switch, running version:
   9.5.3 on i386-pc-solaris2.11, compiled by cc: Sun C 5.10 SunOS_i386 Patch 142363-07 2010/12/09, 64-bit
then the same query executes in about 2-3ms.

The EXPLAIN output when executing the query on this server is:

 Update on next_id  (cost=0.27..8.29 rows=1 width=36) (actual time=0.062..0.062 rows=0 loops=1)
   ->  Index Scan using next_id_pkey on next_id  (cost=0.27..8.29 rows=1 width=36) (actual time=0.025..0.026 rows=1 loops=1)
         Index Cond: ((id)::text = 'Session'::text)
 Planning time: 0.083 ms
 Execution time: 0.096 ms

which, as you can see, is virtually identical to the output from the slower server.

Why is the query taking so much longer on the localhost server?

Note that the localhost machine is significantly faster in other metrics (CPU, file system, etc.).

I have also tried the query on another server on the same network switch running version:
   10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.0, 64-bit
and the timings are very similar to those for 'localhost'. That is, approx 50ms on average.

Now, if I run the following FOR LOOP query:

    do $$
    begin
        for i in 1..10000 loop
            update NEXT_ID set next_value=next_value+1 where id='Session';
        end loop;
    end;
    $$;

Then this completes in about the same time on ALL of the servers - approximately 1.7s - which makes sense as 10,000 times the above plan/execute times is approx 1.7s.

So, to me this feels like some kind of COMMIT overhead of approx 50ms that the version 10 and version 11 servers are experiencing. But I have no idea where to look to try and find where this time is being spent.
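One way to check this theory directly from psql (a sketch, using the table above; \timing makes psql report the client-observed elapsed time of each statement separately, so any commit overhead should show up on the COMMIT line rather than the UPDATE):

    \timing on
    BEGIN;
    UPDATE next_id SET next_value = next_value + 1 WHERE id = 'Session';
    COMMIT;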

Note that the schemas of the databases on the 3 servers involved are virtually identical. The schema for this table is exactly the same.

Hoping that someone can give me an idea about where to go looking.


Regards,

Duncan Kinnear
       
Floor 1, 100 McLeod St, Hastings 4120, New Zealand
PO Box 2006, Hastings 4153, New Zealand
P: +64 6 871 5700 F: +64 6 871 5709 E: [hidden email]



Re: Commit(?) overhead

Jeff Janes
On Thu, Apr 4, 2019 at 3:42 AM Duncan Kinnear <[hidden email]> wrote:

the EXPLAIN (ANALYSE, TIMING TRUE) of this query gives:

 Update on next_id  (cost=0.14..8.16 rows=1 width=36) (actual time=0.057..0.057 rows=0 loops=1)
   ->  Index Scan using next_id_pk on next_id  (cost=0.14..8.16 rows=1 width=36) (actual time=0.039..0.040 rows=1 loops=1)
         Index Cond: ((id)::text = 'Session'::text)
 Planning Time: 0.083 ms
 Execution Time: 0.089 ms

which is significantly less than 50ms.

The EXPLAIN ANALYZE doesn't include the time needed to fsync the transaction logs.  It measures only the update itself, not the implicit commit at the end.  DBeaver is seeing the fsync-inclusive time.  50ms is pretty long, but some file systems and OSes seem to be pretty inefficient at this and take several disk revolutions to get the data down.
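The raw fsync latency of the filesystem holding the WAL can be measured with pg_test_fsync, which ships with PostgreSQL (a sketch; the data directory path below is an assumption — point -f at a file on the same filesystem as pg_wal):

    pg_test_fsync -s 5 -f /var/lib/pgsql/11/data/pg_wal/fsync_test.out

If the fastest sync method it reports takes tens of milliseconds per operation, that would account for the ~50ms commits on its own.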
 

Now, if I point DBeaver to a VM server on the same gigabit network switch, running version:
   9.5.3 on i386-pc-solaris2.11, compiled by cc: Sun C 5.10 SunOS_i386 Patch 142363-07 2010/12/09, 64-bit
then the same query executes in about 2-3ms

That machine probably has hardware to do a fast fsync, has fsync turned off, or is lying about the safety of its data.

Cheers,

Jeff

Re: Commit(?) overhead

Duncan Kinnear
----- On 5 Apr, 2019, at 4:14 AM, Jeff Janes <[hidden email]> wrote:

> On Thu, Apr 4, 2019 at 3:42 AM Duncan Kinnear <[hidden email]> wrote:

>> the EXPLAIN (ANALYSE, TIMING TRUE) of this query gives:

>> Update on next_id (cost=0.14..8.16 rows=1 width=36) (actual time=0.057..0.057
>> rows=0 loops=1)
>> -> Index Scan using next_id_pk on next_id (cost=0.14..8.16 rows=1 width=36)
>> (actual time=0.039..0.040 rows=1 loops=1)
>> Index Cond: ((id)::text = 'Session'::text)
>> Planning Time: 0.083 ms
>> Execution Time: 0.089 ms

>> which is significantly less than 50ms.

> The EXPLAIN ANALYZE doesn't include the time needed to fsync the transaction
> logs. It measures only the update itself, not the implicit commit at the end.
> DBeaver is seeing the fsync-inclusive time. 50ms is pretty long, but some file
> systems and OSes seem to be pretty inefficient at this and take several disk
> revolutions to get the data down.

>> Now, if I point DBeaver to a VM server on the same gigabit network switch,
>> running version:
>> 9.5.3 on i386-pc-solaris2.11, compiled by cc: Sun C 5.10 SunOS_i386 Patch
>> 142363-07 2010/12/09, 64-bit
>> then the same query executes in about 2-3ms

> That machine probably has hardware to do a fast fsync, has fsync turned off, or
> is lying about the safety of its data.

Just a quick update. I tried performing a sequence of BEGIN; UPDATE ...; COMMIT; and I got the following log entries:

Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.639 NZST [29887] LOG:  duration: 0.025 ms  parse <unnamed>: begin
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.639 NZST [29887] LOG:  duration: 0.014 ms  bind <unnamed>: begin
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.639 NZST [29887] LOG:  duration: 0.003 ms  execute <unnamed>: begin
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.639 NZST [29887] LOG:  duration: 0.045 ms  parse <unnamed>:  update NEXT_ID set next_value=next_value+1 where id='Session'
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.640 NZST [29887] LOG:  duration: 0.055 ms  bind <unnamed>:  update NEXT_ID set next_value=next_value+1 where id='Session'
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.640 NZST [29887] LOG:  duration: 0.059 ms  execute <unnamed>:  update NEXT_ID set next_value=next_value+1 where id='Session'
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.640 NZST [29887] LOG:  duration: 0.004 ms  parse <unnamed>:  commit
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.640 NZST [29887] LOG:  duration: 0.003 ms  bind <unnamed>:  commit
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.690 NZST [29887] LOG:  duration: 50.237 ms  execute <unnamed>:  commit

So this confirms that the overhead is indeed happening in the COMMIT part. But how do I get more detailed logging to see what it is doing?

Note, in a previous reply to Jeff (which I forgot to CC to the list) I explained that the slow machines are both using BTRFS as the filesystem, and a bit of googling has revealed that using PostgreSQL on BTRFS filesystems is (don't cross the streams) bad.

Jeff, I will try adding the wait event stuff to see if that is what it is doing.


Cheers, Duncan



Re: Commit(?) overhead

Justin Pryzby
On Wed, Apr 10, 2019 at 09:26:22AM +1200, Duncan Kinnear wrote:

> ----- On 5 Apr, 2019, at 4:14 AM, Jeff Janes <[hidden email]> wrote:
>
> > On Thu, Apr 4, 2019 at 3:42 AM Duncan Kinnear <[hidden email]> wrote:
>
> >> the EXPLAIN (ANALYSE, TIMING TRUE) of this query gives:
>
> >> Update on next_id (cost=0.14..8.16 rows=1 width=36) (actual time=0.057..0.057
> >> rows=0 loops=1)
> >> -> Index Scan using next_id_pk on next_id (cost=0.14..8.16 rows=1 width=36)
> >> (actual time=0.039..0.040 rows=1 loops=1)
> >> Index Cond: ((id)::text = 'Session'::text)
> >> Planning Time: 0.083 ms
> >> Execution Time: 0.089 ms
>
> >> which is significantly less than 50ms.
>
> > The EXPLAIN ANALYZE doesn't include the time needed to fsync the transaction
> > logs. It measures only the update itself, not the implicit commit at the end.
> > DBeaver is seeing the fsync-inclusive time. 50ms is pretty long, but some file
> > systems and OSes seem to be pretty inefficient at this and take several disk
> > revolutions to get the data down.
>
> >> Now, if I point DBeaver to a VM server on the same gigabit network switch,
> >> running version:
> >> 9.5.3 on i386-pc-solaris2.11, compiled by cc: Sun C 5.10 SunOS_i386 Patch
> >> 142363-07 2010/12/09, 64-bit
> >> then the same query executes in about 2-3ms
>
> > That machine probably has hardware to do a fast fsync, has fsync turned off, or
> > is lying about the safety of its data.
>
> Just a quick update. I tried performing a sequence of BEGIN; UPDATE ...; COMMIT; and I got the following log entries:

> Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.640 NZST [29887] LOG:  duration: 0.003 ms  bind <unnamed>:  commit
> Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.690 NZST [29887] LOG:  duration: 50.237 ms  execute <unnamed>:  commit
>
> So this confirms that the overhead is indeed happening in the COMMIT part. But how do I get more detailed logging to see what it is doing?

commit is causing the fsync() Jeff mentioned.

You could test that's the issue by comparing with fsync=off (please read what
that means and don't run your production cluster like that).
https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-FSYNC

You could also put your XLOG on a separate FS (as a test or otherwise).
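A minimal sketch of that test (fsync is reloadable without a restart; again, only do this on a disposable cluster, and revert afterwards):

    ALTER SYSTEM SET fsync = off;
    SELECT pg_reload_conf();
    -- ... re-run the UPDATE and compare timings ...
    ALTER SYSTEM RESET fsync;
    SELECT pg_reload_conf();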

Justin



Re: Commit(?) overhead

Andres Freund
Hi,

On 2019-04-09 17:12:27 -0500, Justin Pryzby wrote:
> You could test that's the issue by comparing with fsync=off (please read what
> that means and don't run your production cluster like that).
> https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-FSYNC

I suggest testing it with synchronous_commit=off instead. That's about
as fast for this type of workload, doesn't have cluster corruption
issues, the window of a transaction not persisting in case of a crash is
very small, and it can just be set by any user in individual sessions.
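Applied to the original UPDATE, that might look like the following sketch (SET LOCAL affects only the current transaction, so nothing needs to be reset afterwards):

    BEGIN;
    SET LOCAL synchronous_commit TO off;
    UPDATE next_id SET next_value = next_value + 1 WHERE id = 'Session';
    COMMIT;  -- returns without waiting for the WAL flush to disk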

Greetings,

Andres Freund



Re: Commit(?) overhead

Duncan Kinnear
----- On 10 Apr, 2019, at 10:23 AM, Andres Freund <[hidden email]> wrote:

> On 2019-04-09 17:12:27 -0500, Justin Pryzby wrote:
>> You could test that's the issue by comparing with fsync=off (please read what
>> that means and don't run your production cluster like that).
>> https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-FSYNC
>
> I suggest testing it with synchronous_commit=off instead. That's about
> as fast for this type of workload, doesn't have cluster corruption
> issues, the window of a transaction not persisting in case of a crash is
> very small, and it can just be set by any user in individual sessions.

Bingo! Adding 'SET LOCAL synchronous_commit TO OFF;' to my 'BEGIN; UPDATE ....; COMMIT;' block has given me sub-1ms timings! Thanks Andres.

I'll probably leave the setting like that on my local machine. The option appears to be relatively safe, and my machine is just a dev test machine anyway.


Regards,

Duncan Kinnear



Re: Commit(?) overhead

Laurenz Albe
Duncan Kinnear wrote:
> Bingo! Adding 'SET LOCAL synchronous_commit TO OFF;' to my 'BEGIN; UPDATE ....; COMMIT;'
> block has given me sub-1ms timings! Thanks Andres.

That's a pretty clear indication that your I/O subsystem was overloaded.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com