Logical replication monitoring

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Logical replication monitoring

AYahorau
Hello PostgreSQL Community!

I configured logical replication for PostgreSQL 10.4 on 2 machines, set wal_level to logical, created a publication on master node and created a subscription on standby node according to the PostgreSQL documentation.
Could you please suggest an approach for replication state monitoring.

According to my experience the monitoring of pg_stat_subscription and pg_publication, pg_replication_slots unfortunately is not enough for this aim. Moreover standby database does not prohibit write operations by default and it can lead to some inconsistency between these databases.

For example a chain of queries as
SELECT pg_is_is_recovery(),
SELECT * FROM pg_stat_replication and
SELECT * FROM pg_stat_wal_receiver
provide insight into replication state for hot_standby replication.

So is there a reliable way of replication state monitoring for logical replication?

Best regards,
Andrei Yahorau
Reply | Threaded
Open this post in threaded view
|

Re: Logical replication monitoring

Achilleas Mantzios
On 10/08/2018 13:05, [hidden email] wrote:
Hello PostgreSQL Community!

I configured logical replication for PostgreSQL 10.4 on 2 machines, set wal_level to logical, created a publication on master node and created a subscription on standby node according to the PostgreSQL documentation.
Could you please suggest an approach for replication state monitoring.

According to my experience the monitoring of pg_stat_subscription and pg_publication, pg_replication_slots unfortunately is not enough for this aim. Moreover standby database does not prohibit write operations by default and it can lead to some inconsistency between these databases.

For example a chain of queries as
SELECT pg_is_is_recovery(),
SELECT * FROM pg_stat_replication and
SELECT * FROM pg_stat_wal_receiver
provide insight into replication state for hot_standby replication.

So is there a reliable way of replication state monitoring for logical replication?
tail_n_mail and look for relevant errors (logical, logical replication worker ... exited with ..., ERROR)


Best regards,
Andrei Yahorau


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt
Reply | Threaded
Open this post in threaded view
|

Re: Logical replication monitoring

AYahorau
In reply to this post by AYahorau
Hello!

Thank you for your suggestion.
I  afraid this approach is not suitable for me. As a rule my postgresql log  on subscriber side contains a  bunch of the following entries:

ERROR:  terminating logical replication worker due to timeout

00000 LOG:  worker process: logical replication worker for subscription 24578 (PID 6217) exited with exit code 1

How should I handle this situation?
As I understand this is quite normal situation. But why is severity for it  an ERROR ?

I have another assumption. Could you correct me if I am wrong.
I found out in the source code that logical replication worker termination depends on wal_receiver_timeout paramer.
So I propose setting wal_receiver_timeout  to 0.
In this case I think that monitoring of the following views  pg_stat_subscription, pg_publication and pg_stat_replication  is enough.

In case if there  is some problem with connection or with replication pg_stat_replication  will show nothing because wal sender will not be working otherwise it will give some information.
Am I right? Are there any vulnerabilities in this approach ?


Best regards,
Andrei Yahorau



From:        Andrei Yahorau/IBA
To:        [hidden email],
Cc:        Mikalai Keida/IBA@IBA
Date:        10/08/2018 13:05
Subject:        Logical replication monitoring




Hello PostgreSQL Community!

I configured logical replication for PostgreSQL 10.4 on 2 machines, set wal_level to logical, created a publication on master node and created a subscription on standby node according to the PostgreSQL documentation.
Could you please suggest an approach for replication state monitoring.

According to my experience the monitoring of pg_stat_subscription and pg_publication, pg_replication_slots unfortunately is not enough for this aim. Moreover standby database does not prohibit write operations by default and it can lead to some inconsistency between these databases.

For example a chain of queries as
SELECT pg_is_is_recovery(),
SELECT * FROM pg_stat_replication and
SELECT * FROM pg_stat_wal_receiver
provide insight into replication state for hot_standby replication.

So is there a reliable way of replication state monitoring for logical replication?

Best regards,
Andrei Yahorau
Reply | Threaded
Open this post in threaded view
|

Re: Logical replication monitoring

Rajni Baliyan-2

On Mon, Aug 13, 2018 at 8:16 PM, <[hidden email]> wrote:
Hello!

Thank you for your suggestion.
I  afraid this approach is not suitable for me. As a rule my postgresql log  on subscriber side contains a  bunch of the following entries:

ERROR:  terminating logical replication worker due to timeout

00000 LOG:  worker process: logical replication worker for subscription 24578 (PID 6217) exited with exit code 1

How should I handle this situation?
As I understand this is quite normal situation. But why is severity for it  an ERROR ?

I have another assumption. Could you correct me if I am wrong.
I found out in the source code that logical replication worker termination depends on wal_receiver_timeout paramer.
So I propose setting wal_receiver_timeout  to 0.
In this case I think that monitoring of the following views  pg_stat_subscription, pg_publication and pg_stat_replication  is enough.

In case if there  is some problem with connection or with replication pg_stat_replication  will show nothing because wal sender will not be working otherwise it will give some information.
Am I right? Are there any vulnerabilities in this approach ?


Best regards,
Andrei Yahorau



From:        Andrei Yahorau/IBA
To:        [hidden email],
Cc:        Mikalai Keida/IBA@IBA
Date:        10/08/2018 13:05
Subject:        Logical replication monitoring




Hello PostgreSQL Community!

I configured logical replication for PostgreSQL 10.4 on 2 machines, set wal_level to logical, created a publication on master node and created a subscription on standby node according to the PostgreSQL documentation.
Could you please suggest an approach for replication state monitoring.

According to my experience the monitoring of pg_stat_subscription and pg_publication, pg_replication_slots unfortunately is not enough for this aim. Moreover standby database does not prohibit write operations by default and it can lead to some inconsistency between these databases.

For example a chain of queries as
SELECT pg_is_is_recovery(),
SELECT * FROM pg_stat_replication and
SELECT * FROM pg_stat_wal_receiver
provide insight into replication state for hot_standby replication.

So is there a reliable way of replication state monitoring for logical replication?

Best regards,
Andrei Yahorau

Reply | Threaded
Open this post in threaded view
|

Re: Logical replication monitoring

AYahorau
Hello,
Sorry for being persistent,

Could you please explain how to use these commands? Could you please provide an example?


Best regards,
Andrei Yahorau



From:        Rajni Baliyan <[hidden email]>
To:        [hidden email],
Cc:        pgsql-admin <[hidden email]>, [hidden email]
Date:        14/08/2018 06:24
Subject:        Re: Logical replication monitoring




Check below link. Might help you.

https://www.postgresql.org/docs/10/static/replication-origins.html

On Mon, Aug 13, 2018 at 8:16 PM, <AYahorau@...> wrote:
Hello!

Thank you for your suggestion.
I  afraid this approach is not suitable for me. As a rule my postgresql log  on subscriber side contains a  bunch of the following entries:

ERROR:  terminating logical replication worker due to timeout

00000 LOG:  worker process: logical replication worker for subscription 24578 (PID 6217) exited with exit code 1


How should I handle this situation?
As I understand this is quite normal situation. But why is severity for it  an ERROR ?

I have another assumption. Could you correct me if I am wrong.
I found out in the source code that logical replication worker termination depends on wal_receiver_timeout paramer.
So I propose setting wal_receiver_timeout  to 0.
In this case I think that monitoring of the following views  pg_stat_subscription, pg_publication and pg_stat_replication  is enough.
In case if there  is some problem with connection or with replication pg_stat_replication  will show nothing because wal sender will not be working otherwise it will give some information.
Am I right? Are there any vulnerabilities in this approach ?


Best regards,
Andrei Yahorau




From:        
Andrei Yahorau/IBA
To:        
[hidden email],
Cc:        
Mikalai Keida/IBA@IBA
Date:        
10/08/2018 13:05
Subject:        
Logical replication monitoring




Hello PostgreSQL Community!


I configured logical replication for PostgreSQL 10.4 on 2 machines, set wal_level to logical, created a publication on master node and created a subscription on standby node according to the PostgreSQL documentation.

Could you please suggest an approach for replication state monitoring.


According to my experience the monitoring of pg_stat_subscription and pg_publication, pg_replication_slots unfortunately is not enough for this aim. Moreover standby database does not prohibit write operations by default and it can lead to some inconsistency between these databases.


For example a chain of queries as
SELECT pg_is_is_recovery()
,

SELECT * FROM pg_stat_replication
and

SELECT * FROM pg_stat_wal_receiver

provide insight into replication state for hot_standby replication.


So is there a reliable way of replication state monitoring for logical replication?


Best regards,
Andrei Yahorau


Reply | Threaded
Open this post in threaded view
|

Re: Logical replication monitoring

Achilleas Mantzios
In reply to this post by AYahorau
On 13/08/2018 13:16, [hidden email] wrote:
Hello!

Thank you for your suggestion.
I  afraid this approach is not suitable for me. As a rule my postgresql log  on subscriber side contains a  bunch of the following entries:

ERROR:  terminating logical replication worker due to timeout

00000 LOG:  worker process: logical replication worker for subscription 24578 (PID 6217) exited with exit code 1

How should I handle this situation?
As I understand this is quite normal situation. But why is severity for it  an ERROR ?
In my short experience with logical replication, those ERRORs should be avoided, you better increase the relevant timeouts: wal_receiver_timeout , wal_sender_timeout . One hand tool we use here is tail_n_mail , it is tailored specifically for parsing PostgreSQL logs and sending emails with alerts.

I have another assumption. Could you correct me if I am wrong.
I found out in the source code that logical replication worker termination depends on wal_receiver_timeout paramer.
So I propose setting wal_receiver_timeout  to 0.
In this case I think that monitoring of the following views  pg_stat_subscription, pg_publication and pg_stat_replication  is enough.

In case if there  is some problem with connection or with replication pg_stat_replication  will show nothing because wal sender will not be working otherwise it will give some information.
Am I right? Are there any vulnerabilities in this approach ?


Best regards,
Andrei Yahorau



From:        Andrei Yahorau/IBA
To:        [hidden email],
Cc:        Mikalai Keida/IBA@IBA
Date:        10/08/2018 13:05
Subject:        Logical replication monitoring




Hello PostgreSQL Community!

I configured logical replication for PostgreSQL 10.4 on 2 machines, set wal_level to logical, created a publication on master node and created a subscription on standby node according to the PostgreSQL documentation.
Could you please suggest an approach for replication state monitoring.

According to my experience the monitoring of pg_stat_subscription and pg_publication, pg_replication_slots unfortunately is not enough for this aim. Moreover standby database does not prohibit write operations by default and it can lead to some inconsistency between these databases.

For example a chain of queries as
SELECT pg_is_is_recovery(),
SELECT * FROM pg_stat_replication and
SELECT * FROM pg_stat_wal_receiver
provide insight into replication state for hot_standby replication.

So is there a reliable way of replication state monitoring for logical replication?

Best regards,
Andrei Yahorau


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt
Previous Thread Next Thread