Centos 6.9 and centos 7


Centos 6.9 and centos 7

ncontu1
Hello,
we recently moved from CentOS 6.9 to a new server running CentOS 7.

The CentOS 6.9 server has now become the preproduction server.
We are running postgres 9.6.6 on both servers.


Both are on SSD disks; these are the only differences:

- DB partition on centos 7 is on a RAID 10
- file system is xfs on centos 7 (ext4 in centos 6.9)
- more memory on the centos 7 (so the parameters in postgresql.conf are higher)
 max_connections = 220
 shared_buffers = 10GB
 effective_cache_size = 120GB 
 work_mem = 349525kB
 maintenance_work_mem = 2GB
 min_wal_size = 1GB
 max_wal_size = 2GB
 checkpoint_completion_target = 0.7 
 wal_buffers = 16MB
 default_statistics_target = 100
- we have two replicas on the centos 7: one is async, one is sync
 synchronous_standby_names = '1 ( "****" )'
 synchronous_commit = on

They have the same database inside, with the same data.
Running the same script on the two servers gives different results. Even a SELECT query is faster on the CentOS 6.9 server: it takes half the time on the preprod server.

centos 7 : 

dbname=# \timing
Timing is on.
cmdv3=# SELECT id FROM client_billing_account WHERE name = 'name';
  id
-------
 *****
(1 row)

Time: 3.884 ms

centos 6.9

dbname=# SELECT id FROM client_billing_account WHERE name = 'name';
  id
-------
 *****
(1 row)

Time: 1.620 ms

This table has 32148 records.

Do you think we can modify anything to achieve the same performance?

I read about a few kernel parameters:

kernel.sched_migration_cost_ns = 5000000
kernel.sched_autogroup_enabled = 0
vm.dirty_background_bytes = 67108864
vm.dirty_bytes = 1073741824
vm.zone_reclaim_mode = 0
vm.swappiness = 1.1

Is there anything you can advise to solve or identify the problem?

Thanks a lot,
Nicola


Re: Centos 6.9 and centos 7

Christian Mair
> centos 7 :
> Time: 3.884 ms
>
> centos 6.9
> Time: 1.620 ms
>
>
> Is there anything you can advice to solve or identify the problem?

Can you run this query 10 times on each server and note the timings?

I'd like to see the reproducibility of this.

Also: are both machines otherwise idle (check with top or uptime)?

Bye,
Chris.
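
A minimal sketch of how the repeated measurement asked for above could be scripted. The function names time_query and summarize_timings are made up for illustration, and the sketch assumes psql 9.6 or later (which accepts repeated -c options) and a database reachable without a password prompt. Each iteration opens a new connection, so the figures include per-backend cache warm-up; repeating the statement by hand inside one psql session, as done in this thread, avoids that.

```shell
# Run one statement several times and print only psql's "Time: ..." lines.
# Assumption: psql 9.6+ (multiple -c options) and a reachable database.
time_query() {
    # $1 = database name, $2 = SQL statement, $3 = repetitions (default 10)
    i=0
    while [ "$i" -lt "${3:-10}" ]; do
        psql -X -d "$1" -c '\timing on' -c "$2" | grep '^Time:'
        i=$((i + 1))
    done
}

# Read "Time: <number> ms" lines on stdin and print count, min, average, max.
# Lines that do not start with "Time:" are ignored.
summarize_timings() {
    awk '/^Time:/ {
             t = $2 + 0; n++; sum += t
             if (n == 1 || t < min) min = t
             if (n == 1 || t > max) max = t
         }
         END { if (n) printf "n=%d min=%.3f avg=%.3f max=%.3f\n", n, min, sum / n, max }'
}
```

Possible usage on each server: time_query dbname "SELECT id FROM client_billing_account WHERE name = 'name'" | summarize_timings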




Re: Centos 6.9 and centos 7

ncontu1
These are the timings in centos 7 :

Time: 4.248 ms
Time: 2.983 ms
Time: 3.027 ms
Time: 3.298 ms
Time: 4.420 ms
Time: 2.599 ms
Time: 2.555 ms
Time: 3.008 ms
Time: 6.220 ms
Time: 4.275 ms
Time: 2.841 ms
Time: 3.699 ms
Time: 3.387 ms


These are the timings in centos 6:
Time: 1.722 ms
Time: 1.670 ms
Time: 1.843 ms
Time: 1.823 ms
Time: 1.723 ms
Time: 1.724 ms
Time: 1.747 ms
Time: 1.734 ms
Time: 1.764 ms
Time: 1.622 ms


This is top on centos 6 :

[root@****]# top
top - 14:33:32 up 577 days, 23:08,  1 user,  load average: 0.16, 0.11, 0.15
Tasks: 1119 total,   1 running, 1118 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.1%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  132040132k total, 129530504k used,  2509628k free,   108084k buffers
Swap: 11665404k total,   331404k used, 11334000k free, 124508916k cached

This is top on centos 7:

top - 14:35:38 up 73 days, 19:00,  6 users,  load average: 22.46, 20.89, 20.54
Tasks: 821 total,  13 running, 807 sleeping,   0 stopped,   1 zombie
%Cpu(s): 14.2 us,  5.0 sy,  0.0 ni, 77.5 id,  3.1 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem : 26383592+total,  4301464 free,  6250384 used, 25328406+buff/cache
KiB Swap: 16777212 total, 11798876 free,  4978336 used. 24497036+avail Mem


The production machine is obviously under more load, but that does not seem to be the problem: running the same query on the replica of the production machine (same configuration as the master, but not accessed by anyone) gives the same bad result:
Time: 6.366 ms






Re: Centos 6.9 and centos 7

ncontu1
To test this better, I used a third server.
It is identical to the CentOS 7 machine and is not part of the replica cluster.

Nobody is accessing this machine; this is top:

top - 14:48:36 up 73 days, 17:39,  3 users,  load average: 0.00, 0.01, 0.05
Tasks: 686 total,   1 running, 685 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 26383592+total,  1782196 free,  2731144 used, 25932257+buff/cache
KiB Swap: 16777212 total, 16298536 free,   478676 used. 21693456+avail Mem

These are timings :
Time: 2.841 ms
Time: 1.980 ms
Time: 2.240 ms
Time: 2.947 ms
Time: 2.828 ms
Time: 2.227 ms
Time: 1.998 ms
Time: 1.990 ms
Time: 2.643 ms
Time: 2.143 ms
Time: 2.919 ms
Time: 2.246 ms

I never got the same results as on the CentOS 6.9 machine.








Re: Centos 6.9 and centos 7

Tomas Vondra-4
In reply to this post by ncontu1

On 12/04/2017 02:19 PM, Nicola Contu wrote:
> ...

> centos 7 : 
>
> dbname=# \timing Timing is on. cmdv3=# SELECT id FROM
> client_billing_account WHERE name = 'name'; id ------- ***** (1 row)
> Time: 3.884 ms
>
> centos 6.9
>
> dbname=# SELECT id FROM client_billing_account WHERE name = 'name'; id
> ------- ***** (1 row) Time: 1.620 ms
>

We need to see EXPLAIN (ANALYZE,BUFFERS) for the queries.
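
One way to collect that output from both servers, as a sketch only: the helper name explain_sql is hypothetical, and the statement is the one from the original post. Note that EXPLAIN with ANALYZE really executes the statement, which is harmless for this SELECT.

```shell
# Print an EXPLAIN (ANALYZE, BUFFERS) wrapper around one statement.
explain_sql() {
    # $1 = statement to profile, without a trailing semicolon
    printf 'EXPLAIN (ANALYZE, BUFFERS)\n%s;\n' "$1"
}
```

Possible usage on each server: explain_sql "SELECT id FROM client_billing_account WHERE name = 'name'" | psql -X -d dbname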

Are those VMs or bare metal? What CPUs and RAM are there? Have you
checked that power management is disabled / cpufreq uses the same
policy? That typically affects short CPU-bound queries.
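
A quick way to compare the cpufreq policy on two Linux machines is to read the scaling governor of every CPU from sysfs. The sketch below assumes the standard sysfs layout; the function name list_governors is made up for illustration. On VMs or machines without a cpufreq driver the files do not exist and nothing is printed.

```shell
# Count how many CPUs use each cpufreq scaling governor.
list_governors() {
    # $1 = sysfs CPU directory (defaults to the standard Linux location)
    root=${1:-/sys/devices/system/cpu}
    cat "$root"/cpu[0-9]*/cpufreq/scaling_governor 2>/dev/null | sort | uniq -c
}
```

Run list_governors on both machines and compare the output; on CentOS 7, tuned-adm active additionally shows which tuned profile is in effect.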

Other than that, I recommend performing basic system benchmarks (CPU,
memory, ...) and only if those machines perform equally should you look
for issues in PostgreSQL. Chances are the root cause is in hw or OS, in
which case you need to address that first.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Centos 6.9 and centos 7

Alban Hertroys-4
Did you run ANALYZE on your tables before the test?




--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.


Re: Centos 6.9 and centos 7

ncontu1
No, I did not run a VACUUM ANALYZE. Do you want me to try that first?

@Tomas:
Talking about power management, I changed the tuned-adm profile to latency-performance instead of balanced (which is the default).

That improves performance for now, and the timings are similar to CentOS 6.9.

Time: 2.121 ms
Time: 2.026 ms
Time: 1.664 ms
Time: 1.749 ms
Time: 1.656 ms
Time: 1.675 ms

Do you think this can be easily done in production as well? 
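
For reference, the profile switch described above could be scripted roughly as follows. This is only a sketch: the function name switch_tuned_profile is made up for illustration, it assumes tuned is installed and the caller is root, and whether the change is acceptable in production depends on local policy.

```shell
# Show the active tuned profile, switch to another one, then confirm.
switch_tuned_profile() {
    # $1 = target profile (defaults to the one used in this thread)
    if ! command -v tuned-adm >/dev/null 2>&1; then
        echo "tuned-adm not found" >&2
        return 127
    fi
    tuned-adm active                                # profile before the change
    tuned-adm profile "${1:-latency-performance}"   # switch profile (needs root)
    tuned-adm active                                # profile after the change
}
```

tuned-adm list prints the profiles available on the machine, which is worth checking before switching.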



Re: Centos 6.9 and centos 7

Tomas Vondra-4
On 12/04/2017 04:57 PM, Nicola Contu wrote:

> No I did not run a vacuum analyze. Do you want me to try with that first?
>
> @Tomas:
> Talking abut power management, I changed the profile for tuned-adm
> to latency-performance instead of balanced (that is the default)
>
> that is increasing performances for now and they are similar to centos 6.9.
>
> Time: 2.121 ms
> Time: 2.026 ms
> Time: 1.664 ms
> Time: 1.749 ms
> Time: 1.656 ms
> Time: 1.675 ms
>
> Do you think this can be easily done in production as well? 
>

How am I supposed to know? Not only does that depend on your internal
deployment policies, but it's also much more a CentOS/RedHat question
than a PostgreSQL one.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Centos 6.9 and centos 7

Alban Hertroys-4
In reply to this post by ncontu1

> On 4 Dec 2017, at 16:57, Nicola Contu <[hidden email]> wrote:
>
> No I did not run a vacuum analyze. Do you want me to try with that first?

That means your statistics may not be up to date, although by now autovacuum should have done the job (you didn't turn that off or anything, did you?). Bad statistics result in non-optimal query plans and therefore could very well cause your timing differences.

An easy way to verify, since you still have access to both versions of the database, is to compare the statistics of the relevant tables between the two. They should be similar.
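
A sketch of such a comparison, assuming the standard pg_stats and pg_stat_user_tables views; the helper name stats_sql and the choice of columns are illustrative only. The table name is pasted straight into the SQL text, so this is for trusted input only.

```shell
# Print SQL that dumps planner statistics and analyze timestamps for a table.
stats_sql() {
    # $1 = table name (trusted input only)
    cat <<SQL
SELECT attname, null_frac, n_distinct, correlation
FROM pg_stats
WHERE tablename = '$1'
ORDER BY attname;
SELECT last_analyze, last_autoanalyze, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = '$1';
SQL
}
```

Possible usage: run stats_sql client_billing_account | psql -X -At -d dbname > stats.txt on each server and diff the two files. The analyze timestamps will differ, but the per-column figures should be close if both databases hold the same data and have been analyzed.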

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.