Performance comparison between Pgsql 10.5 and Pgsql 11.2

classic Classic list List threaded Threaded
18 messages Options
Reply | Threaded
Open this post in threaded view
|

Performance comparison between Pgsql 10.5 and Pgsql 11.2

ncontu1
Hello,
is there any reason why I am getting worse results using pgsql11.2 in writing comparing it with pgsql 10.6?

I have two Instances, both just restored, so no bloats.
Running read queries I have pretty much same results, a little bit better on pg11- Running writes the difference is in favour of 10.

I am expecting pg11 to be better.

Running pgbench :

PG11
[root@STAGING-CMD1 ~]#  /usr/local/pgsql11.2/bin/pgbench -t 1000 -c 20 -C -f stress_service_order.sql cmdstaging -U admin
transaction type: stress_service_order.sql
scaling factor: 1
query mode: simple
number of clients: 20
number of threads: 1
number of transactions per client: 1000
number of transactions actually processed: 20000/20000
latency average = 45.322 ms
tps = 441.283336 (including connections establishing)
tps = 463.731537 (excluding connections establishing)

PG10
[root@STAGING-CMD1 ~]#  pgbench -t 1000 -c 20 -C -f stress_service_order.sql cmdstaging -U admin
transaction type: stress_service_order.sql
scaling factor: 1
query mode: simple
number of clients: 20
number of threads: 1
number of transactions per client: 1000
number of transactions actually processed: 20000/20000
latency average = 44.686 ms
tps = 447.565403 (including connections establishing)
tps = 470.285561 (excluding connections establishing)

This is making a really big difference with longer queries.
Here I am updating a field in a random record.

With more transactions the difference is bigger

WITH POSTGRES 10
 
[root@STAGING-CMD1 ~]#  pgbench -t 100000 -c 20 -C -f stress_service_order_read.sql cmdstaging -U postgres
transaction type: stress_service_order_read.sql
scaling factor: 1
query mode: simple
number of clients: 20
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 2000000/2000000
latency average = 55.291 ms
tps = 442.1490778 (including connections establishing)
tps = 454.846844 (excluding connections establishing)

WITH POSTGRES 11 
[root@STAGING-CMD1 ~]#  pgbench -t 100000 -c 20 -C -f stress_service_order_read.sql cmdstaging -U postgres
transaction type: stress_service_order_read.sql
scaling factor: 1
query mode: simple
number of clients: 20
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 2000000/2000000
latency average = 53.291 ms
tps = 375.297748 (including connections establishing)
tps = 392.316057 (excluding connections establishing)


The postgres.conf file are the same.

max_connections = 220
shared_buffers = 10GB
effective_cache_size = 120GB
work_mem = 600MB
maintenance_work_mem = 2GB
min_wal_size = 1GB
max_wal_size = 2GB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
#default_statistics_target = 100

Using data_sync_retry=on doesn't make any difference.

Is there anything else changed in the default values?

Any trick?
I don't want to go live and loose performances.

Thanks a lot,
Nicola
Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

Ray O'Donnell
On 01/03/2019 15:01, Nicola Contu wrote:
> Hello,
> is there any reason why I am getting worse results using pgsql11.2 in
> writing comparing it with pgsql 10.6?
>
> I have two Instances, both just restored, so no bloats.
> Running read queries I have pretty much same results, a little bit
> better on pg11- Running writes the difference is in favour of 10.

Did you run ANALYZE on the databases after restoring?

Ray.



--
Raymond O'Donnell // Galway // Ireland
[hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

Thomas Munro-5
On Sat, Mar 2, 2019 at 5:02 AM Ray O'Donnell <[hidden email]> wrote:

> On 01/03/2019 15:01, Nicola Contu wrote:
> > Hello,
> > is there any reason why I am getting worse results using pgsql11.2 in
> > writing comparing it with pgsql 10.6?
> >
> > I have two Instances, both just restored, so no bloats.
> > Running read queries I have pretty much same results, a little bit
> > better on pg11- Running writes the difference is in favour of 10.
>
> Did you run ANALYZE on the databases after restoring?

If you can rule out different query plans, and if you compiled them
both with the same compiler and optimisation levels and without
cassert enabled (it's a long shot but I mentioned that because you
showed a path in /usr/local so perhaps you're hand-compiling 11, but
10 came from a package?), then the next step might be to use a
profiler like "perf" (or something equivalent on your OS) to figure
out where 11 is spending more time in the write test?

--
Thomas Munro
https://enterprisedb.com

Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

ncontu1
I did a analyze in stages on both.
And Yes both are compiled.
This is the configure command (change 10.6 for PG10)

./configure --prefix=/usr/local/pgsql11.2

See attached perf report. The difference seems to be all in this line, but not sure :

+   26.80%     0.00%           222  postmaster       [kernel.kallsyms]                    [k] system_call_fastpath



I am using CentOS 7
With Centos I am using this profile for tuned-adm
[root@STAGING-CMD1 ~]#  tuned-adm active
Current active profile: latency-performance


Il giorno sab 2 mar 2019 alle ore 20:41 Thomas Munro <[hidden email]> ha scritto:
On Sat, Mar 2, 2019 at 5:02 AM Ray O'Donnell <[hidden email]> wrote:
> On 01/03/2019 15:01, Nicola Contu wrote:
> > Hello,
> > is there any reason why I am getting worse results using pgsql11.2 in
> > writing comparing it with pgsql 10.6?
> >
> > I have two Instances, both just restored, so no bloats.
> > Running read queries I have pretty much same results, a little bit
> > better on pg11- Running writes the difference is in favour of 10.
>
> Did you run ANALYZE on the databases after restoring?

If you can rule out different query plans, and if you compiled them
both with the same compiler and optimisation levels and without
cassert enabled (it's a long shot but I mentioned that because you
showed a path in /usr/local so perhaps you're hand-compiling 11, but
10 came from a package?), then the next step might be to use a
profiler like "perf" (or something equivalent on your OS) to figure
out where 11 is spending more time in the write test?

--
Thomas Munro
https://enterprisedb.com

perf.txt (19K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

Imre Samu
> is there any reason why I am getting worse results using pgsql11.2 in writing comparing it with pgsql 10.6?
>... And Yes both are compiled.

Why 10.6?

according to release notes
"14th February 2019: PostgreSQL 11.2, 10.7, 9.6.12, 9.5.16, and 9.4.21 Released!"  https://www.postgresql.org/about/news/1920/
imho: it would be better to compare PG11.2  with  PG10.7  (  similar bug Fixes and Improvements + same fsync()  behavior )

"This release changes the behavior in how PostgreSQL interfaces with fsync() and includes fixes for partitioning and over 70 other bugs that were reported over the past three months"

Imre



Nicola Contu <[hidden email]> ezt írta (időpont: 2019. márc. 4., H, 13:14):
I did a analyze in stages on both.
And Yes both are compiled.
This is the configure command (change 10.6 for PG10)

./configure --prefix=/usr/local/pgsql11.2

See attached perf report. The difference seems to be all in this line, but not sure :

+   26.80%     0.00%           222  postmaster       [kernel.kallsyms]                    [k] system_call_fastpath



I am using CentOS 7
With Centos I am using this profile for tuned-adm
[root@STAGING-CMD1 ~]#  tuned-adm active
Current active profile: latency-performance


Il giorno sab 2 mar 2019 alle ore 20:41 Thomas Munro <[hidden email]> ha scritto:
On Sat, Mar 2, 2019 at 5:02 AM Ray O'Donnell <[hidden email]> wrote:
> On 01/03/2019 15:01, Nicola Contu wrote:
> > Hello,
> > is there any reason why I am getting worse results using pgsql11.2 in
> > writing comparing it with pgsql 10.6?
> >
> > I have two Instances, both just restored, so no bloats.
> > Running read queries I have pretty much same results, a little bit
> > better on pg11- Running writes the difference is in favour of 10.
>
> Did you run ANALYZE on the databases after restoring?

If you can rule out different query plans, and if you compiled them
both with the same compiler and optimisation levels and without
cassert enabled (it's a long shot but I mentioned that because you
showed a path in /usr/local so perhaps you're hand-compiling 11, but
10 came from a package?), then the next step might be to use a
profiler like "perf" (or something equivalent on your OS) to figure
out where 11 is spending more time in the write test?

--
Thomas Munro
https://enterprisedb.com
Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

ncontu1
Because I have 10.6 in production :) and I am comparing with what I will be loosing.
And I read that in the release notes but as said in my first email, even with data_sync_retry=on (going back to previous behavior) doesn't make any difference.

So I am looking for something that will keep my performances but still allows me to upgrade to 11 in production.
Also, trying with 11.1, the problem seems still there.

Il giorno lun 4 mar 2019 alle ore 14:45 Imre Samu <[hidden email]> ha scritto:
> is there any reason why I am getting worse results using pgsql11.2 in writing comparing it with pgsql 10.6?
>... And Yes both are compiled.

Why 10.6?

according to release notes
"14th February 2019: PostgreSQL 11.2, 10.7, 9.6.12, 9.5.16, and 9.4.21 Released!"  https://www.postgresql.org/about/news/1920/
imho: it would be better to compare PG11.2  with  PG10.7  (  similar bug Fixes and Improvements + same fsync()  behavior )

"This release changes the behavior in how PostgreSQL interfaces with fsync() and includes fixes for partitioning and over 70 other bugs that were reported over the past three months"

Imre



Nicola Contu <[hidden email]> ezt írta (időpont: 2019. márc. 4., H, 13:14):
I did a analyze in stages on both.
And Yes both are compiled.
This is the configure command (change 10.6 for PG10)

./configure --prefix=/usr/local/pgsql11.2

See attached perf report. The difference seems to be all in this line, but not sure :

+   26.80%     0.00%           222  postmaster       [kernel.kallsyms]                    [k] system_call_fastpath



I am using CentOS 7
With Centos I am using this profile for tuned-adm
[root@STAGING-CMD1 ~]#  tuned-adm active
Current active profile: latency-performance


Il giorno sab 2 mar 2019 alle ore 20:41 Thomas Munro <[hidden email]> ha scritto:
On Sat, Mar 2, 2019 at 5:02 AM Ray O'Donnell <[hidden email]> wrote:
> On 01/03/2019 15:01, Nicola Contu wrote:
> > Hello,
> > is there any reason why I am getting worse results using pgsql11.2 in
> > writing comparing it with pgsql 10.6?
> >
> > I have two Instances, both just restored, so no bloats.
> > Running read queries I have pretty much same results, a little bit
> > better on pg11- Running writes the difference is in favour of 10.
>
> Did you run ANALYZE on the databases after restoring?

If you can rule out different query plans, and if you compiled them
both with the same compiler and optimisation levels and without
cassert enabled (it's a long shot but I mentioned that because you
showed a path in /usr/local so perhaps you're hand-compiling 11, but
10 came from a package?), then the next step might be to use a
profiler like "perf" (or something equivalent on your OS) to figure
out where 11 is spending more time in the write test?

--
Thomas Munro
https://enterprisedb.com
Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

Tom Lane-2
In reply to this post by ncontu1
Nicola Contu <[hidden email]> writes:
> See attached perf report. The difference seems to be all in this line, but
> not sure :
> +   26.80%     0.00%           222  postmaster       [kernel.kallsyms]
>               [k] system_call_fastpath

That would suggest that many more kernel calls are happening, which is
something you could usefully investigate with strace, perhaps.

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

ncontu1
Attached a part of the strace running the pgbench command for pg11
Also attached strace_10 for pg10.6.

Thanks

Il giorno lun 4 mar 2019 alle ore 15:22 Tom Lane <[hidden email]> ha scritto:
Nicola Contu <[hidden email]> writes:
> See attached perf report. The difference seems to be all in this line, but
> not sure :
> +   26.80%     0.00%           222  postmaster       [kernel.kallsyms]
>               [k] system_call_fastpath

That would suggest that many more kernel calls are happening, which is
something you could usefully investigate with strace, perhaps.

                        regards, tom lane

strace.txt (88K) Download Attachment
strace_10.txt (86K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

Thomas Munro-5
On Tue, Mar 5, 2019 at 5:01 AM Nicola Contu <[hidden email]> wrote:
> Attached a part of the strace running the pgbench command for pg11
> Also attached strace_10 for pg10.6.

That looks like strace output from pgbench, and I don't see any
interesting differences between v10 and v11 (though I'm surprised to
see it using poll() instead of ppoll(), and opening a new connection
for every transaction).

How about syscalls on the server side?  You could start it with
something like "strace -f path/to/postgres -D path/to/pgdata" (-f for
follow children), and perhaps also use -c so that it shows aggregated
data (up until you ^C it) instead of every syscall?

--
Thomas Munro
https://enterprisedb.com

Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

ncontu1
Not sure what you are requesting exactly but here is the strace for the start of the pg_ctl



Il giorno lun 4 mar 2019 alle ore 21:55 Thomas Munro <[hidden email]> ha scritto:
On Tue, Mar 5, 2019 at 5:01 AM Nicola Contu <[hidden email]> wrote:
> Attached a part of the strace running the pgbench command for pg11
> Also attached strace_10 for pg10.6.

That looks like strace output from pgbench, and I don't see any
interesting differences between v10 and v11 (though I'm surprised to
see it using poll() instead of ppoll(), and opening a new connection
for every transaction).

How about syscalls on the server side?  You could start it with
something like "strace -f path/to/postgres -D path/to/pgdata" (-f for
follow children), and perhaps also use -c so that it shows aggregated
data (up until you ^C it) instead of every syscall?

--
Thomas Munro
https://enterprisedb.com

strace_start.txt (11K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

Thomas Munro-5
On Wed, Mar 6, 2019 at 4:22 AM Nicola Contu <[hidden email]> wrote:
>
> Not sure what you are requesting exactly but here is the strace for the start of the pg_ctl

I meant that you could run the server itself in the foreground under
strace, like so:

$ strace -f -c /usr/local/pgsql11.2/bin/postgres -D /db/pgsql11/data/

Then perform your testing, and finally stop it with pg_ctl from
another window (or hit ^C in this window) and strace should spit out a
table of system calls with some counters.  We might be able to see why
v11 is spending so much more time executing system calls than v10 for
your workload, or at least which systems calls they are, assuming you
run the same transactions against both versions.

--
Thomas Munro
https://enterprisedb.com

Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

ncontu1
Here is the strace as requested for pg11

Thanks

Il giorno mar 5 mar 2019 alle ore 17:47 Thomas Munro <[hidden email]> ha scritto:
On Wed, Mar 6, 2019 at 4:22 AM Nicola Contu <[hidden email]> wrote:
>
> Not sure what you are requesting exactly but here is the strace for the start of the pg_ctl

I meant that you could run the server itself in the foreground under
strace, like so:

$ strace -f -c /usr/local/pgsql11.2/bin/postgres -D /db/pgsql11/data/

Then perform your testing, and finally stop it with pg_ctl from
another window (or hit ^C in this window) and strace should spit out a
table of system calls with some counters.  We might be able to see why
v11 is spending so much more time executing system calls than v10 for
your workload, or at least which systems calls they are, assuming you
run the same transactions against both versions.

--
Thomas Munro
https://enterprisedb.com

strace_direct.txt (11K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

ncontu1
This is instead the strace of another server running the same version compiled  but that is even slower.



Il giorno mer 6 mar 2019 alle ore 11:14 Nicola Contu <[hidden email]> ha scritto:
Here is the strace as requested for pg11

Thanks

Il giorno mar 5 mar 2019 alle ore 17:47 Thomas Munro <[hidden email]> ha scritto:
On Wed, Mar 6, 2019 at 4:22 AM Nicola Contu <[hidden email]> wrote:
>
> Not sure what you are requesting exactly but here is the strace for the start of the pg_ctl

I meant that you could run the server itself in the foreground under
strace, like so:

$ strace -f -c /usr/local/pgsql11.2/bin/postgres -D /db/pgsql11/data/

Then perform your testing, and finally stop it with pg_ctl from
another window (or hit ^C in this window) and strace should spit out a
table of system calls with some counters.  We might be able to see why
v11 is spending so much more time executing system calls than v10 for
your workload, or at least which systems calls they are, assuming you
run the same transactions against both versions.

--
Thomas Munro
https://enterprisedb.com

strace_another_server.txt (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

Thomas Munro-5
In reply to this post by ncontu1
On Wed, Mar 6, 2019 at 11:14 PM Nicola Contu <[hidden email]> wrote:
> Here is the strace as requested for pg11

How does it compare to v10 running the same test?

--
Thomas Munro
https://enterprisedb.com

Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

Thomas Munro-5
In reply to this post by ncontu1
On Thu, Mar 7, 2019 at 12:12 AM Nicola Contu <[hidden email]> wrote:
> This is instead the strace of another server running the same version compiled  but that is even slower.

Huh.  That's a lot of lseek().  Some of these will be for random
reads/writes and will go way in v12, and some will be for probing the
size of relations while planning, and some while executing scans.  I
bet you could make some of them go away by using prepared statements.
Does the query in your test involve many partitions/tables?

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 32.50  143.010306           7  21044095           lseek
 26.21  115.354045          14   8144577           read
  6.18   27.185578          16   1669889        10 sendto
  5.29   23.300584          57    407528           fdatasync
  4.93   21.709522           9   2313529    824174 recvfrom
  3.31   14.547568          19    765897           write
  2.73   12.007486          14    867088     14494 epoll_wait
  2.18    9.597460          15    659871     84097 futex
  1.85    8.147759          14    567414           close
  1.77    7.767832          18    437656     11319 open

The other results had 1 usec lseek(), and much fewer of them relative
to the number of reads and writes.  BTW, are you comparing v10 and v11
on the same hardware, kernel, filesystem?  Just wondering if there
could be some change in syscall overhead on different kernel patch
levels or something like that: we see 7 usec vs 1 usec in those two
files (though I have no idea how reliable these times are) and if
we're going to call it 21 million times at some point it might
matter...

--
Thomas Munro
https://enterprisedb.com

Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

ncontu1
So the first file is on Postgres11.2 on a test server (and where I compare 10 vs 11)
The second file, is our preprod machine running Postgres 11.2 (different hardware etc, it is a VM). I know that could be confusing, but I just wanted to compare that too because if you see the two files there's a lot of difference between the two machines.
And they are both running CentOS 7.

So at this point I have two problems. One inside the machine between Postgres 10 and 11 and another problem on the preprod (similar to prod) with a lot of lseek.

Sorry if this is confusing, hope it is clear now.

Regarding partitions/tables. The first file involves just one table. The second file (with a huge lseek) was running the test on a single table, but meanwhile it was accessible by the preprod web application. So it was maybe hit by some user and some other table.


Question:
1) Is it possible that pgbench could not be really a good tool for testing the performances? If I use a sql script of thousands of insert records and compare on the same server between pg10 and pg11 I get pretty much the same result (maybe better on pg11)
2) regarding preprod, is there any way to reduce those lseek()?  Just to let you know, comparing the same insert script between the first server, the first server takes 2m the second one takes 5-7m.

Thanks a lot,





Il giorno gio 7 mar 2019 alle ore 04:47 Thomas Munro <[hidden email]> ha scritto:
On Thu, Mar 7, 2019 at 12:12 AM Nicola Contu <[hidden email]> wrote:
> This is instead the strace of another server running the same version compiled  but that is even slower.

Huh.  That's a lot of lseek().  Some of these will be for random
reads/writes and will go way in v12, and some will be for probing the
size of relations while planning, and some while executing scans.  I
bet you could make some of them go away by using prepared statements.
Does the query in your test involve many partitions/tables?

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 32.50  143.010306           7  21044095           lseek
 26.21  115.354045          14   8144577           read
  6.18   27.185578          16   1669889        10 sendto
  5.29   23.300584          57    407528           fdatasync
  4.93   21.709522           9   2313529    824174 recvfrom
  3.31   14.547568          19    765897           write
  2.73   12.007486          14    867088     14494 epoll_wait
  2.18    9.597460          15    659871     84097 futex
  1.85    8.147759          14    567414           close
  1.77    7.767832          18    437656     11319 open

The other results had 1 usec lseek(), and much fewer of them relative
to the number of reads and writes.  BTW, are you comparing v10 and v11
on the same hardware, kernel, filesystem?  Just wondering if there
could be some change in syscall overhead on different kernel patch
levels or something like that: we see 7 usec vs 1 usec in those two
files (though I have no idea how reliable these times are) and if
we're going to call it 21 million times at some point it might
matter...

--
Thomas Munro
https://enterprisedb.com
Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

ncontu1
Hello,
do you have any advice on this?

Thanks a lot in advance

Il giorno gio 7 mar 2019 alle ore 09:39 Nicola Contu <[hidden email]> ha scritto:
So the first file is on Postgres11.2 on a test server (and where I compare 10 vs 11)
The second file, is our preprod machine running Postgres 11.2 (different hardware etc, it is a VM). I know that could be confusing, but I just wanted to compare that too because if you see the two files there's a lot of difference between the two machines.
And they are both running CentOS 7.

So at this point I have two problems. One inside the machine between Postgres 10 and 11 and another problem on the preprod (similar to prod) with a lot of lseek.

Sorry if this is confusing, hope it is clear now.

Regarding partitions/tables. The first file involves just one table. The second file (with a huge lseek) was running the test on a single table, but meanwhile it was accessible by the preprod web application. So it was maybe hit by some user and some other table.


Question:
1) Is it possible that pgbench could not be really a good tool for testing the performances? If I use a sql script of thousands of insert records and compare on the same server between pg10 and pg11 I get pretty much the same result (maybe better on pg11)
2) regarding preprod, is there any way to reduce those lseek()?  Just to let you know, comparing the same insert script between the first server, the first server takes 2m the second one takes 5-7m.

Thanks a lot,





Il giorno gio 7 mar 2019 alle ore 04:47 Thomas Munro <[hidden email]> ha scritto:
On Thu, Mar 7, 2019 at 12:12 AM Nicola Contu <[hidden email]> wrote:
> This is instead the strace of another server running the same version compiled  but that is even slower.

Huh.  That's a lot of lseek().  Some of these will be for random
reads/writes and will go way in v12, and some will be for probing the
size of relations while planning, and some while executing scans.  I
bet you could make some of them go away by using prepared statements.
Does the query in your test involve many partitions/tables?

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 32.50  143.010306           7  21044095           lseek
 26.21  115.354045          14   8144577           read
  6.18   27.185578          16   1669889        10 sendto
  5.29   23.300584          57    407528           fdatasync
  4.93   21.709522           9   2313529    824174 recvfrom
  3.31   14.547568          19    765897           write
  2.73   12.007486          14    867088     14494 epoll_wait
  2.18    9.597460          15    659871     84097 futex
  1.85    8.147759          14    567414           close
  1.77    7.767832          18    437656     11319 open

The other results had 1 usec lseek(), and much fewer of them relative
to the number of reads and writes.  BTW, are you comparing v10 and v11
on the same hardware, kernel, filesystem?  Just wondering if there
could be some change in syscall overhead on different kernel patch
levels or something like that: we see 7 usec vs 1 usec in those two
files (though I have no idea how reliable these times are) and if
we're going to call it 21 million times at some point it might
matter...

--
Thomas Munro
https://enterprisedb.com
Reply | Threaded
Open this post in threaded view
|

Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

Thomas Munro-5
On Tue, Mar 12, 2019 at 10:49 PM Nicola Contu <[hidden email]> wrote:
> Il giorno gio 7 mar 2019 alle ore 09:39 Nicola Contu <[hidden email]> ha scritto:
>> So the first file is on Postgres11.2 on a test server (and where I compare 10 vs 11)
>> The second file, is our preprod machine running Postgres 11.2 (different hardware etc, it is a VM). I know that could be confusing, but I just wanted to compare that too because if you see the two files there's a lot of difference between the two machines.
>> And they are both running CentOS 7.
>>
>> So at this point I have two problems. One inside the machine between Postgres 10 and 11 and another problem on the preprod (similar to prod) with a lot of lseek.

So we still don't know what is different between 10 and 11.  We'll
need some more evidence to understand that.  We know that perf said 11
was spending more time making syscalls, but we haven't seen any
syscall measurement from the 10 system so we don't know what those
supposed extra syscalls are.  Also, backing up a bit, have you
compared the query plan to see if it's the same, and can we please see
it?

As for the preprod/lots-of-lseek system, I don't know, it seems there
are even more unknown variables there...  if you can say more about
the virtualisation technology you're using, perhaps someone who knows
more about that would have some ideas.  Besides apparently slower
syscalls, one factor that is unexplained is why that system is calling
lseek() more times per query (but you said there may be other work
happening on the server, so who knows).

--
Thomas Munro
https://enterprisedb.com