Disallow cancellation of waiting for synchronous replication

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
25 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Disallow cancellation of waiting for synchronous replication

Andrey Borodin-2
Hi, hackers!

This is continuation of thread [0] in pgsql-general with proposed changes. As Maksim pointed out, this topic was raised before here [1].

Currently, we can have split brain with the combination of following steps:
0. Setup cluster with synchronous replication. Isolate primary from standbys.
1. Issue upsert query INSERT .. ON CONFLICT DO NOTHING
2. CANCEL 1 during wait for synchronous replication
3. Retry 1. Idempotent query will succeed and client have confirmation of written data, while it is not present on any standby.

Thread [0] contain reproduction from psql.

In certain situations we cannot avoid cancelation of timed out queries. Yes, we can interpret warnings and thread them as errors, but warning is emitted on step 1, not on step 3.

I think proper solution here would be to add GUC to disallow cancellation of synchronous replication. Retry step 3 will wait on locks after hanging 1 and data will be consistent.
Three is still a problem when backend is not canceled, but terminated [2]. Ideal solution would be to keep locks on changed data. Some well known databases threat termination of synchronous replication as system failure and refuse to operate until standbys appear (see Maximum Protection mode). From my point of view it's enough to PANIC once so that HA tool be informed that something is going wrong.
Anyway situation with cancelation is more dangerous. We've observed it in some user cases.

Please find attached draft of proposed change.

Best regards, Andrey Borodin.

[0] https://www.postgresql.org/message-id/flat/B70260F9-D0EC-438D-9A59-31CB996B320A%40yandex-team.ru
[1] https://www.postgresql.org/message-id/flat/CAEET0ZHG5oFF7iEcbY6TZadh1mosLmfz1HLm311P9VOt7Z%2Bjeg%40mail.gmail.com
[2] https://www.postgresql.org/docs/current/warm-standby.html#SYNCHRONOUS-REPLICATION-HA


0001-Disallow-cancelation-of-syncronous-commit-V1.patch (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Marco Slot-2
On Fri, Dec 20, 2019 at 6:04 AM Andrey Borodin <[hidden email]> wrote:
> I think proper solution here would be to add GUC to disallow cancellation of synchronous replication. Retry step 3 will wait on locks after hanging 1 and data will be consistent.
> Three is still a problem when backend is not canceled, but terminated [2]. Ideal solution would be to keep locks on changed data. Some well known databases threat termination of synchronous replication as system failure and refuse to operate until standbys appear (see Maximum Protection mode). From my point of view it's enough to PANIC once so that HA tool be informed that something is going wrong.

Sending a cancellation is currently the only way to resume after
disabling synchronous replication. Some HA solutions (e.g.
pg_auto_failover) rely on this behaviour. Would it be worth checking
whether synchronous replication is still required?

Marco


Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Andrey Borodin-2


> 20 дек. 2019 г., в 12:23, Marco Slot <[hidden email]> написал(а):
>
> On Fri, Dec 20, 2019 at 6:04 AM Andrey Borodin <[hidden email]> wrote:
>> I think proper solution here would be to add GUC to disallow cancellation of synchronous replication. Retry step 3 will wait on locks after hanging 1 and data will be consistent.
>> Three is still a problem when backend is not canceled, but terminated [2]. Ideal solution would be to keep locks on changed data. Some well known databases threat termination of synchronous replication as system failure and refuse to operate until standbys appear (see Maximum Protection mode). From my point of view it's enough to PANIC once so that HA tool be informed that something is going wrong.
>
> Sending a cancellation is currently the only way to resume after
> disabling synchronous replication. Some HA solutions (e.g.
> pg_auto_failover) rely on this behaviour. Would it be worth checking
> whether synchronous replication is still required?

I think changing synchronous_standby_names to some available standbys will resume all backends waiting for synchronous replication.
Do we need to check necessity of synchronous replication in any other case?

Best regards, Andrey Borodin.

Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Tom Lane-2
In reply to this post by Andrey Borodin-2
Andrey Borodin <[hidden email]> writes:
> I think proper solution here would be to add GUC to disallow cancellation of synchronous replication.

This sounds entirely insane to me.  There is no possibility that you
can prevent a failure from occurring at this step.

> Three is still a problem when backend is not canceled, but terminated [2].

Exactly.  If you don't have a fix that handles that case, you don't have
anything.  In fact, you've arguably made things worse, by increasing the
temptation to terminate or "kill -9" the nonresponsive session.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Michael Paquier-2
In reply to this post by Andrey Borodin-2
On Fri, Dec 20, 2019 at 03:07:26PM +0500, Andrey Borodin wrote:
>> Sending a cancellation is currently the only way to resume after
>> disabling synchronous replication. Some HA solutions (e.g.
>> pg_auto_failover) rely on this behaviour. Would it be worth checking
>> whether synchronous replication is still required?
>
> I think changing synchronous_standby_names to some available
> standbys will resume all backends waiting for synchronous
> replication.  Do we need to check necessity of synchronous
> replication in any other case?

Yeah, I am not on board with the concept of this thread.  Depending
on your HA configuration you can also reset synchronous_standby_names
after a certain small-ish threshold has been reached in WAL to get at
the same result by disabling synchronous replication, though your
cluster cannot perform safely a failover so you need to keep track of
that state.  Something which would be useful is to improve some cases
where you still want to use synchronous replication by switching to a
different standby.  I recall that sometimes that can be rather slow..
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Marco Slot-2
In reply to this post by Andrey Borodin-2
On Fri, Dec 20, 2019 at 11:07 AM Andrey Borodin <[hidden email]> wrote:
> I think changing synchronous_standby_names to some available standbys will resume all backends waiting for synchronous replication.
> Do we need to check necessity of synchronous replication in any other case?

The GUCs are not re-checked in the main loop in SyncRepWaitForLSN, so
backends will remain stuck there even if synchronous replication has
been (temporarily) disabled while they were waiting.

I do agree with the general sentiment that terminating the connection
is preferable over sending a response to the client (except when
synchronous replication was already disabled). Synchronous replication
does not guarantee that a committed write is actually on any replica,
but it does in general guarantee that a commit has been replicated
before sending a response to the client. That's arguably more
important because the rest of what the application might depend on the
transaction completing and replicating successfully. I don't know of
cases other than cancellation in which a response is sent to the
client without replication when synchronous replication is enabled.

The error level should be FATAL instead of PANIC, since PANIC restarts
the database and I don't think there is a reason to do that.

Marco


Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Andrey Borodin-2


> 21 дек. 2019 г., в 2:19, Tom Lane <[hidden email]> написал(а):
>
> Andrey Borodin <[hidden email]> writes:
>> I think proper solution here would be to add GUC to disallow cancellation of synchronous replication.
>
> This sounds entirely insane to me.  There is no possibility that you
> can prevent a failure from occurring at this step.
Yes, we cannot prevent failure. If we wait here for a long time and someone cancels query, probably, node is failed. Database already lives in some other Availability Zone.
All we should do - refuse to commit anything here. Any committed data will be lost.

>> Three is still a problem when backend is not canceled, but terminated [2].
>
> Exactly.  If you don't have a fix that handles that case, you don't have
> anything.  In fact, you've arguably made things worse, by increasing the
> temptation to terminate or "kill -9" the nonresponsive session.
Currently, any Postgres HA solution can loose data when application issues INSERT ... ON CONFLICT DO NOTHING with retry. There is no need for any DBA mistake. Just a driver capable of issuing cancel on timeout.

Administrator issuing kill -9 is OK, database must shutdown to prevent splitbrain. Preferably, database should refuse to start after shutdown.
I'm not proposing this behavior as default. If administrator (or HA tool) configured DB in this mode - probably they know what they are doing.

> 21 дек. 2019 г., в 15:34, Marco Slot <[hidden email]> написал(а):
>
> On Fri, Dec 20, 2019 at 11:07 AM Andrey Borodin <[hidden email]> wrote:
>> I think changing synchronous_standby_names to some available standbys will resume all backends waiting for synchronous replication.
>> Do we need to check necessity of synchronous replication in any other case?
>
> The GUCs are not re-checked in the main loop in SyncRepWaitForLSN, so
> backends will remain stuck there even if synchronous replication has
> been (temporarily) disabled while they were waiting.
SyncRepInitConfig() will be called on SIGHUP. Backends waiting for synchronous replication will wake up on WAIT_EVENT_SYNC_REP and happily succeed.

> I do agree with the general sentiment that terminating the connection
> is preferable over sending a response to the client (except when
> synchronous replication was already disabled). Synchronous replication
> does not guarantee that a committed write is actually on any replica,
> but it does in general guarantee that a commit has been replicated
> before sending a response to the client. That's arguably more
> important because the rest of what the application might depend on the
> transaction completing and replicating successfully. I don't know of
> cases other than cancellation in which a response is sent to the
> client without replication when synchronous replication is enabled.
>
> The error level should be FATAL instead of PANIC, since PANIC restarts
> the database and I don't think there is a reason to do that.

Terminating connection is unacceptable, actually. Client will retry idempotent query. This query now do not need to write anything and will be committed.
We need to shutdown database and prevent it from starting. We should not acknowledge any data before synchronous replication configuration allows us.

When client tries to cancel his query - we refuse to do so and hold his write locks. If anyone terminate connection - locks will be released. It is better to shut down whole DB, then release these locks and continue to receive queries.


All this does not apply to simple cases when user accidentally enabled synchronous replication. This is a setup for quite sophisticated HA tool, which will rewind local database, when transient network partition will be over and old timeline is archived, and attach it to new primary.


Best regards, Andrey Borodin.

Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Maksim Milyutin-2
In reply to this post by Tom Lane-2
On 21.12.2019 00:19, Tom Lane wrote:

>> Three is still a problem when backend is not canceled, but terminated [2].
> Exactly.  If you don't have a fix that handles that case, you don't have
> anything.  In fact, you've arguably made things worse, by increasing the
> temptation to terminate or "kill -9" the nonresponsive session.


I assume that the termination of backend that causes termination of
PostgreSQL instance in Andrey's patch proposal have to be resolved by
external HA agents that could interrupt such terminations as parent
process of postmaster and make appropriate decisions e.g., restart
PostgreSQL node in closed from external users state (via pg_hba.conf
manipulation) until all sync replicas synchronize changes from master.
Stolon HA tool implements this strategy  [1]. This logic (waiting for
all replicas declared in synchronous_standby_names replicate all WAL
from master) could be implemented inside PostgreSQL kernel after start
recovery process before database is opened to users and this can be done
separately later.

Another approach is to implement two-phase commit over master and sync
replicas (as it did Oracle in old versions [2]) where the risk to get
local committed data under instance restarting and query canceling is
minimal (after starting of final commitment phase). But this approach
has latency penalty and complexity to resolve partial (prepared but not
committed) transactions under coordinator (in this case master node)
failure in automatic mode. Nicely if this approach will be implemented
later as option of synchronous commit.


1.
https://github.com/sorintlab/stolon/blob/master/doc/syncrepl.md#handling-postgresql-sync-repl-limits-under-such-circumstances

2.
https://docs.oracle.com/cd/B28359_01/server.111/b28326/repmaster.htm#i33607

--
Best regards,
Maksim Milyutin



Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Maksim Milyutin-2
In reply to this post by Marco Slot-2
On 21.12.2019 13:34, Marco Slot wrote:

> I do agree with the general sentiment that terminating the connection
> is preferable over sending a response to the client (except when
> synchronous replication was already disabled).


But in this case locally committed data becomes visible to new incoming
transactions that is bad side-effect of this issue. Under failover those
changes potentially undo.


> Synchronous replication
> does not guarantee that a committed write is actually on any replica,
> but it does in general guarantee that a commit has been replicated
> before sending a response to the client. That's arguably more
> important because the rest of what the application might depend on the
> transaction completing and replicating successfully. I don't know of
> cases other than cancellation in which a response is sent to the
> client without replication when synchronous replication is enabled.


Yes, at query canceling (e.g. by timeout from client driver) client
receives response about completed transaction (though with warning which
not all client drivers can handle properly) and the guarantee about
successfully replicated transaction *violates*.


--
Best regards,
Maksim Milyutin



Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Andrey Borodin-2


> 25 дек. 2019 г., в 15:28, Maksim Milyutin <[hidden email]> написал(а):
>
>> Synchronous replication
>> does not guarantee that a committed write is actually on any replica,
>> but it does in general guarantee that a commit has been replicated
>> before sending a response to the client. That's arguably more
>> important because the rest of what the application might depend on the
>> transaction completing and replicating successfully. I don't know of
>> cases other than cancellation in which a response is sent to the
>> client without replication when synchronous replication is enabled.
>
>
> Yes, at query canceling (e.g. by timeout from client driver) client receives response about completed transaction (though with warning which not all client drivers can handle properly) and the guarantee about successfully replicated transaction *violates*.

We obviously need a design discussion here to address the issue. But the immediate question is should we add this topic to January CF items?

Best regards, Andrey Borodin.

Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Marco Slot-2
In reply to this post by Maksim Milyutin-2


On Wed, Dec 25, 2019, 11:28 Maksim Milyutin <[hidden email]> wrote:
But in this case locally committed data becomes visible to new incoming
transactions that is bad side-effect of this issue. 

Your application should be prepared for that in any case.

At the point where synchronous replication waits, the commit has already been written to disk on the primary. If postgres restarts while waiting for replication then the write becomes immediately visible regardless of whether it was replicated. I don't think throwing a PANIC actually prevents that and if it does it's coincidental. Best you can do is signal to the client that the commit status is unknown.

That's far from ideal, but fixing it requires a much bigger change to streaming replication. The write should be replicated prior to commit on the primary, but applied after in a way where unapplied writes on the secondary can be overwritten/discarded if it turns out they did not commit on the primary.

Marco
Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Maksim Milyutin-2

On 25.12.2019 14:27, Marco Slot wrote:



On Wed, Dec 25, 2019, 11:28 Maksim Milyutin <[hidden email]> wrote:
But in this case locally committed data becomes visible to new incoming
transactions that is bad side-effect of this issue. 

Your application should be prepared for that in any case.

At the point where synchronous replication waits, the commit has already been written to disk on the primary. If postgres restarts while waiting for replication then the write becomes immediately visible regardless of whether it was replicated.


Yes, this write is recovered after starting of instance. At first, this case I want to delegate to external HA tool around PostgreSQL. It can handle instance stopping and take switchover to sync replica or start current instance with closed connections from external users until all writes replicate to sync replicas. Later, arguably closing connection after recovery process could be implemented inside the kernel.


I don't think throwing a PANIC actually prevents that and if it does it's coincidental.


PANIC lets down instance and doesn't provide clients to read locally committed data. HA tool takes further steps to close access to these data as described above.


That's far from ideal, but fixing it requires a much bigger change to streaming replication. The write should be replicated prior to commit on the primary, but applied after in a way where unapplied writes on the secondary can be overwritten/discarded if it turns out they did not commit on the primary.


Thanks for sharing your opinion about enhancement of synchronous commit protocol. Here [1] my position is listed. It would like to see positions of other members of community.


1. https://www.postgresql.org/message-id/f3ffc220-e601-cc43-3784-f9bba66dc382%40gmail.com

-- 
Best regards,
Maksim Milyutin
Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Maksim Milyutin-2
In reply to this post by Andrey Borodin-2

On 25.12.2019 13:45, Andrey Borodin wrote:

>> 25 дек. 2019 г., в 15:28, Maksim Milyutin <[hidden email]> написал(а):
>>
>>> Synchronous replication
>>> does not guarantee that a committed write is actually on any replica,
>>> but it does in general guarantee that a commit has been replicated
>>> before sending a response to the client. That's arguably more
>>> important because the rest of what the application might depend on the
>>> transaction completing and replicating successfully. I don't know of
>>> cases other than cancellation in which a response is sent to the
>>> client without replication when synchronous replication is enabled.
>>
>> Yes, at query canceling (e.g. by timeout from client driver) client receives response about completed transaction (though with warning which not all client drivers can handle properly) and the guarantee about successfully replicated transaction *violates*.
> We obviously need a design discussion here to address the issue. But the immediate question is should we add this topic to January CF items?

+1 on posting this topic to January CF.

Andrey, some fixes from me:

1) pulled out the cancelling of QueryCancelPending from internal branch
where synchronous_commit_cancelation is set so that to avoid dummy
iterations with printing message "canceling the wait for ..."

2) rewrote errdetail message under cancelling query: I hold in this case
we cannot assert that transaction committed locally because its changes
are not visible as yet so I propose to write about locally flushed
commit wal record.

Updated patch is attached.

--
Best regards,
Maksim Milyutin


0001-Disallow-cancelation-of-syncronous-commit-V2.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Robert Haas
In reply to this post by Andrey Borodin-2
On Fri, Dec 20, 2019 at 12:04 AM Andrey Borodin <[hidden email]> wrote:
> Currently, we can have split brain with the combination of following steps:
> 0. Setup cluster with synchronous replication. Isolate primary from standbys.
> 1. Issue upsert query INSERT .. ON CONFLICT DO NOTHING
> 2. CANCEL 1 during wait for synchronous replication
> 3. Retry 1. Idempotent query will succeed and client have confirmation of written data, while it is not present on any standby.

It seems to me that in order for synchronous replication to work
reliably, you've got to be very careful about any situation where a
commit might or might not have completed, and this is one such
situation. When the client sends the query cancel, it does not and
cannot know whether the INSERT statement has not yet completed, has
already completed but not yet replicated, or has completed and
replicated but not yet sent back a response. However, the server's
response will be different in each of those cases, because in the
second case, there will be a WARNING about synchronous replication
having been interrupted. If the client ignores that WARNING, there are
going to be problems.

Now, even if you do pay attention to the warning, things are not
totally great here, because if you have inadvertently interrupted a
replication wait, how do you recover? You can't send a command that
means "oh, I want to wait after all." You would have to check the
standbys yourself, from the application code, and see whether the
changes that the query made have shown up there, or check the LSN on
the master and wait for the standbys to advance to that LSN. That's
not great, but might be doable for some applications.

One idea that I had during the initial discussion around synchronous
replication was that maybe there ought to be a distinction between
trying to cancel the query and trying to cancel the replication wait.
Imagine that you could send a cancel that would only cancel
replication waits but not queries, or only queries but not replication
waits. Then you could solve this problem by asking the server to
PQcancelWithAdvancedMagic(conn, PQ_CANCEL_TYPE_QUERY). I wasn't sure
that people would want this, and it didn't seem essential for the
version of this feature, but maybe this example shows that it would be
worthwhile. I don't really have any idea how you'd integrate such a
feature with psql, but maybe it would be good enough to have it
available through the C interface. Also, it's a wire-protocol change,
so there are compatibility issues to think about.

All that being said, like Tom and Michael, I don't think teaching the
backend to ignore cancels is the right approach. We have had
innumerable problems over the years that were caused by the backend
failing to respond to cancels when we would really have liked it to do
so, and users REALLY hate it when you tell them that they have to shut
down and restart (with crash recovery) the entire database because of
a single stuck backend.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Maksim Milyutin-2
On 29.12.2019 00:55, Robert Haas wrote:

> On Fri, Dec 20, 2019 at 12:04 AM Andrey Borodin <[hidden email]> wrote:
>> Currently, we can have split brain with the combination of following steps:
>> 0. Setup cluster with synchronous replication. Isolate primary from standbys.
>> 1. Issue upsert query INSERT .. ON CONFLICT DO NOTHING
>> 2. CANCEL 1 during wait for synchronous replication
>> 3. Retry 1. Idempotent query will succeed and client have confirmation of written data, while it is not present on any standby.
> All that being said, like Tom and Michael, I don't think teaching the
> backend to ignore cancels is the right approach. We have had
> innumerable problems over the years that were caused by the backend
> failing to respond to cancels when we would really have liked it to do
> so, and users REALLY hate it when you tell them that they have to shut
> down and restart (with crash recovery) the entire database because of
> a single stuck backend.
>

The stuckness of backend is not deadlock here. To cancel waiting of
backend fluently, client is enough to turn off synchronous replication
(change synchronous_standby_names through server reload) or change
synchronous replica to another livable one (again through changing of
synchronous_standby_names). In first case he explicitly agrees with
existence of local (not replicated) commits in master.


--
Best regards,
Maksim Milyutin



Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Robert Haas
On Sat, Dec 28, 2019 at 6:19 PM Maksim Milyutin <[hidden email]> wrote:
> The stuckness of backend is not deadlock here. To cancel waiting of
> backend fluently, client is enough to turn off synchronous replication
> (change synchronous_standby_names through server reload) or change
> synchronous replica to another livable one (again through changing of
> synchronous_standby_names). In first case he explicitly agrees with
> existence of local (not replicated) commits in master.

Sure, that's true. But I still maintain that responding to ^C is an
important property of the system. If you have to do some more
complicated set of steps like the ones you propose here, a decent
number of people aren't going to figure it out and will end up
unhappy. Now, as it is, you're unhappy, so I guess you can't please
everyone, but you asked for opinions so I'm giving you mine.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Andrey Borodin-2


> 29 дек. 2019 г., в 4:54, Robert Haas <[hidden email]> написал(а):
>
> On Sat, Dec 28, 2019 at 6:19 PM Maksim Milyutin <[hidden email]> wrote:
>> The stuckness of backend is not deadlock here. To cancel waiting of
>> backend fluently, client is enough to turn off synchronous replication
>> (change synchronous_standby_names through server reload) or change
>> synchronous replica to another livable one (again through changing of
>> synchronous_standby_names). In first case he explicitly agrees with
>> existence of local (not replicated) commits in master.
>
> Sure, that's true. But I still maintain that responding to ^C is an
> important property of the system.
Not loosing data - is a nice property of the database either.
Currently, synchronous replication fails to provide its guaranty - no data will be acknowledged until it is replicated.
We want to create a mode where this guaranty is provided.

When user issued CANCEL we could return him his warning or error, but we should not drop data locks. Other transactions should not get acknowledged on basis of non-replicated data.

> If you have to do some more
> complicated set of steps like the ones you propose here, a decent
> number of people aren't going to figure it out and will end up
> unhappy. Now, as it is, you're unhappy, so I guess you can't please
> everyone, but you asked for opinions so I'm giving you mine.

There are many cases when we do not allow user to shoot into his foot. For example, anti-wraparound vacuum. Single-user vacuum freeze is much less pain than split-brain. In case of wraparound protection, there is deterministic steps to take to get your database back consistently. But in case of split-brain there is no single plan for cure.

Best regards, Andrey Borodin.

Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Bruce Momjian
In reply to this post by Robert Haas
On Sat, Dec 28, 2019 at 04:55:55PM -0500, Robert Haas wrote:

> On Fri, Dec 20, 2019 at 12:04 AM Andrey Borodin <[hidden email]> wrote:
> > Currently, we can have split brain with the combination of following steps:
> > 0. Setup cluster with synchronous replication. Isolate primary from standbys.
> > 1. Issue upsert query INSERT .. ON CONFLICT DO NOTHING
> > 2. CANCEL 1 during wait for synchronous replication
> > 3. Retry 1. Idempotent query will succeed and client have confirmation of written data, while it is not present on any standby.
>
> It seems to me that in order for synchronous replication to work
> reliably, you've got to be very careful about any situation where a
> commit might or might not have completed, and this is one such
> situation. When the client sends the query cancel, it does not and
> cannot know whether the INSERT statement has not yet completed, has
> already completed but not yet replicated, or has completed and
> replicated but not yet sent back a response. However, the server's
> response will be different in each of those cases, because in the
> second case, there will be a WARNING about synchronous replication
> having been interrupted. If the client ignores that WARNING, there are
> going to be problems.

This gets to the heart of something I was hoping to discuss.  When is
something committed?  You would think it is when the client receives the
commit message, but Postgres can commit something, and try to inform the
client but fail to inform, perhaps due to network problems.  In Robert's
case above, we send a "success", but it is only a success on the primary
and not on the synchronous standby.

In the first case I mentioned, we commit without guaranteeing the client
knows, but in the second case, we tell the client success with a warning
that the synchronous standby didn't get the commit.  Are clients even
checking warning messages?  You see it in psql, but what about
applications that use Postgres.  Do they even check for warnings?
Should administrators be informed via email or some command when this
happens?

--
  Bruce Momjian  <[hidden email]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Robert Haas
In reply to this post by Andrey Borodin-2
On Sun, Dec 29, 2019 at 4:13 AM Andrey Borodin <[hidden email]> wrote:
> Not loosing data - is a nice property of the database either.

Sure, but there's more than one way to fix that problem, as I pointed
out in my first response.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Reply | Threaded
Open this post in threaded view
|

Re: Disallow cancellation of waiting for synchronous replication

Robert Haas
In reply to this post by Bruce Momjian
On Mon, Dec 30, 2019 at 9:39 AM Bruce Momjian <[hidden email]> wrote:
> This gets to the heart of something I was hoping to discuss.  When is
> something committed?  You would think it is when the client receives the
> commit message, but Postgres can commit something, and try to inform the
> client but fail to inform, perhaps due to network problems.

This kind of problem can happen even without synchronous replication.
I've alluded to this problem in a couple of blog posts I've done on
sync rep.

If an application is connected to the database and sends a COMMIT
command (or a data-modifying command outside a transaction that will
commit implicitly) and the connection is closed before it receives a
response, it does not know whether the COMMIT actually happened. It
will have to wait until the database is back up and running and then
go examine the state of the database with SELECT statements and try to
figure out whether the changes it wanted actually got made. Otherwise
it doesn't know whether the failure that resulted in a loss of network
connectivity occurred before or after the commit.

I imagine that most applications are way too dumb to do this properly
and just report errors to the user and let the user decide what to do
to try to recover. And I imagine that most users are not terribly
careful about it and such events cause minor data loss/corruption on a
regular basis. But there are also probably some applications where
people are really fanatical about it.

>  In Robert's
> case above, we send a "success", but it is only a success on the primary
> and not on the synchronous standby.
>
> In the first case I mentioned, we commit without guaranteeing the client
> knows, but in the second case, we tell the client success with a warning
> that the synchronous standby didn't get the commit.  Are clients even
> checking warning messages?  You see it in psql, but what about
> applications that use Postgres.  Do they even check for warnings?
> Should administrators be informed via email or some command when this
> happens?

I suspect a lot of clients are not checking warning messages, but
whether that's really the server's problem is arguable. I think we've
developed a general practice over the years of trying to avoid warning
messages as a way of telling users about problems, and that's a good
idea in general precisely because they might just get ignored, but
there are cases where it is really the only possible way forward. It
would certainly be pretty bad to have the COMMIT succeed on the local
node but produce an ERROR; that would doubtless be much more confusing
than what it's doing now. There's nothing at all to prevent
administrators from watching the logs for such warnings and taking
whatever action they deem appropriate.

I continue to think that the root cause of this issue is that we can't
distinguish between cancelling the query and cancelling the sync rep
wait. The client in this case is asking for both when it really only
wants the former, and then ignoring the warning that the latter is
what actually occurred.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


12