Race condition in InvalidateObsoleteReplicationSlots()

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

Race condition in InvalidateObsoleteReplicationSlots()

Andres Freund
CONTENTS DELETED
The author has deleted this message.
Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Andres Freund
CONTENTS DELETED
The author has deleted this message.
Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

akapila
On Thu, Apr 8, 2021 at 7:39 AM Andres Freund <[hidden email]> wrote:

>
> On 2021-04-07 17:10:37 -0700, Andres Freund wrote:
> > I think this can be solved in two different ways:
> >
> > 1) Hold ReplicationSlotAllocationLock with LW_SHARED across most of
> >    InvalidateObsoleteReplicationSlots(). That way nobody could re-create a new
> >    slot in the to-be-obsoleted-slot's place.
> >
> > 2) Atomically check whether the slot needs to be invalidated and try to
> >    acquire if needed. Don't release ReplicationSlotControlLock between those
> >    two steps. Signal the owner to release the slot iff we couldn't acquire the
> >    slot. In the latter case wait and then recheck if the slot still needs to
> >    be dropped.
> >
> > To me 2) seems better, because we then can also be sure that the slot still
> > needs to be obsoleted, rather than potentially doing so unnecessarily.
> >

+1.

> >
> > It looks to me like several of the problems here stem from trying to reuse
> > code from ReplicationSlotAcquireInternal() (which before this was just named
> > ReplicationSlotAcquire()).  I don't think that makes sense, because cases like
> > this want to check if a condition is true, and acquire it only if so.
> >
> > IOW, I think this basically needs to look like ReplicationSlotsDropDBSlots(),
> > except that a different condition is checked, and the if (active_pid) case
> > needs to prepare a condition variable, signal the owner and then wait on the
> > condition variable, to restart after.
>
> I'm also confused by the use of ConditionVariableTimedSleep(timeout =
> 10). Why do we need a timed sleep here in the first place? And why with
> such a short sleep?
>
> I also noticed that the code is careful to use CHECK_FOR_INTERRUPTS(); -
> but is aware it's running in checkpointer. I don't think CFI does much
> there? If we are worried about needing to check for interrupts, more
> work is needed.
>
>
> Sketch for a fix attached. I did leave the odd
> ConditionVariableTimedSleep(10ms) in, because I wasn't sure why it's
> there...
>

I haven't tested the patch but I couldn't spot any problems while
reading it. A minor point, don't we need to use
ConditionVariableCancelSleep() at some point after doing
ConditionVariableTimedSleep?

--
With Regards,
Amit Kapila.


Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Álvaro Herrera
In reply to this post by Andres Freund
On 2021-Apr-07, Andres Freund wrote:

> I'm also confused by the use of ConditionVariableTimedSleep(timeout =
> 10). Why do we need a timed sleep here in the first place? And why with
> such a short sleep?

I was scared of the possibility that a process would not set the CV for
whatever reason, causing checkpointing to become stuck.  Maybe that's
misguided thinking if CVs are reliable enough.

> I also noticed that the code is careful to use CHECK_FOR_INTERRUPTS(); -
> but is aware it's running in checkpointer. I don't think CFI does much
> there? If we are worried about needing to check for interrupts, more
> work is needed.

Hmm .. yeah, doing CFI seems pretty useless.  I think that should just
be removed.  If checkpointer gets USR2 (request for shutdown) it's not
going to affect the behavior of CFI anyway.

I attach a couple of changes to your 0001.  It's all cosmetic; what
looks not so cosmetic is the change of "continue" to "break" in helper
routine; if !s->in_use, we'd loop infinitely.  The other routine
already checks that before calling the helper; since you hold
ReplicationSlotControlLock at that point, it should not be possible to
drop it in between.  Anyway, it's a trivial change to make, so it should
be correct.

I also added a "continue" at the bottom of one block; currently that
doesn't change any behavior, but if we add code at the other block, it
might not be what's intended.

> After this I don't see a reason to have SAB_Inquire - as far as I can
> tell it's practically impossible to use without race conditions? Except
> for raising an error - which is "builtin"...

Hmm, interesting ... If not needed, yeah let's get rid of that.


Are you getting this set pushed, or would you like me to handle it?
(There seems to be some minor conflict in 13)

--
Álvaro Herrera       Valdivia, Chile
"Pido que me den el Nobel por razones humanitarias" (Nicanor Parra)

0001-Alvaro-s-edits.patch (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Andres Freund
In reply to this post by akapila
CONTENTS DELETED
The author has deleted this message.
Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Andres Freund
In reply to this post by Álvaro Herrera
CONTENTS DELETED
The author has deleted this message.
Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Álvaro Herrera
Here's a version that I feel is committable (0001).  There was an issue
when returning from the inner loop, if in a previous iteration we had
released the lock.  In that case we need to return with the lock not
held, so that the caller can acquire it again, but weren't.  This seems
pretty hard to hit in practice (I suppose somebody needs to destroy the
slot just as checkpointer killed the walsender, but before checkpointer
marks it as its own) ... but if it did happen, I think checkpointer
would block with no recourse.  Also added some comments and slightly
restructured the code.

There are plenty of conflicts in pg13, but it's all easy to handle.

I wrote a test (0002) to cover the case of signalling a walsender, which
is currently not covered (we only deal with the case of a standby that's
not running).  There are some sharp edges in this code -- I had to make
it use background_psql() to send a CHECKPOINT, which hangs, because I
previously send a SIGSTOP to the walreceiver.  Maybe there's a better
way to achieve a walreceiver that remains connected but doesn't consume
input from the primary, but I don't know what it is.  Anyway, the code
becomes covered with this.  I would like to at least see it in master,
to gather some reactions from buildfarm.

--
Álvaro Herrera       Valdivia, Chile
<Schwern> It does it in a really, really complicated way
<crab> why does it need to be complicated?
<Schwern> Because it's MakeMaker.

0001-the-code-fix.patch (8K) Download Attachment
0002-the-test.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Álvaro Herrera
On 2021-Jun-10, Álvaro Herrera wrote:

> I wrote a test (0002) to cover the case of signalling a walsender, which
> is currently not covered (we only deal with the case of a standby that's
> not running).  There are some sharp edges in this code -- I had to make
> it use background_psql() to send a CHECKPOINT, which hangs, because I
> previously send a SIGSTOP to the walreceiver.  Maybe there's a better
> way to achieve a walreceiver that remains connected but doesn't consume
> input from the primary, but I don't know what it is.  Anyway, the code
> becomes covered with this.  I would like to at least see it in master,
> to gather some reactions from buildfarm.

Small fixup to the test one, so that skipping it on Windows works
correctly.

--
Álvaro Herrera                            39°49'30"S 73°17'W
Voy a acabar con todos los humanos / con los humanos yo acabaré
voy a acabar con todos (bis) / con todos los humanos acabaré ¡acabaré! (Bender)

v2-0001-the-code-fix.patch (8K) Download Attachment
v2-0002-the-test.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Álvaro Herrera
On 2021-Jun-10, Álvaro Herrera wrote:

> Here's a version that I feel is committable (0001).  There was an issue
> when returning from the inner loop, if in a previous iteration we had
> released the lock.  In that case we need to return with the lock not
> held, so that the caller can acquire it again, but weren't.  This seems
> pretty hard to hit in practice (I suppose somebody needs to destroy the
> slot just as checkpointer killed the walsender, but before checkpointer
> marks it as its own) ... but if it did happen, I think checkpointer
> would block with no recourse.  Also added some comments and slightly
> restructured the code.
>
> There are plenty of conflicts in pg13, but it's all easy to handle.
Pushed, with additional minor changes.

> I wrote a test (0002) to cover the case of signalling a walsender, which
> is currently not covered (we only deal with the case of a standby that's
> not running).  There are some sharp edges in this code -- I had to make
> it use background_psql() to send a CHECKPOINT, which hangs, because I
> previously send a SIGSTOP to the walreceiver.  Maybe there's a better
> way to achieve a walreceiver that remains connected but doesn't consume
> input from the primary, but I don't know what it is.  Anyway, the code
> becomes covered with this.  I would like to at least see it in master,
> to gather some reactions from buildfarm.

I tried hard to make this stable, but it just isn't (it works fine one
thousand runs, then I grab some coffee and run it once more and that one
fails.  Why?  that's not clear to me).  Attached is the last one I have,
in case somebody wants to make it better.  Maybe there's some completely
different approach that works better, but I'm out of ideas for now.

--
Álvaro Herrera       Valdivia, Chile
"La experiencia nos dice que el hombre peló millones de veces las patatas,
pero era forzoso admitir la posibilidad de que en un caso entre millones,
las patatas pelarían al hombre" (Ijon Tichy)

v3-0001-Add-test-case-for-invalidating-an-active-replicat.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Álvaro Herrera
In reply to this post by Andres Freund
On 2021-Apr-07, Andres Freund wrote:

> After this I don't see a reason to have SAB_Inquire - as far as I can
> tell it's practically impossible to use without race conditions? Except
> for raising an error - which is "builtin"...

Pushed 0002.

Thanks!

--
Álvaro Herrera                            39°49'30"S 73°17'W
"La persona que no quería pecar / estaba obligada a sentarse
 en duras y empinadas sillas    / desprovistas, por cierto
 de blandos atenuantes"                          (Patricio Vogel)


Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Andres Freund
CONTENTS DELETED
The author has deleted this message.
Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Álvaro Herrera
In reply to this post by Álvaro Herrera
On 2021-Jun-11, Álvaro Herrera wrote:

> I tried hard to make this stable, but it just isn't (it works fine one
> thousand runs, then I grab some coffee and run it once more and that one
> fails.  Why?  that's not clear to me).  Attached is the last one I have,
> in case somebody wants to make it better.  Maybe there's some completely
> different approach that works better, but I'm out of ideas for now.

It occurred to me that this could be made better by sigstopping both
walreceiver and walsender, then letting only the latter run; AFAICS this
makes the test stable.  I'll register this on the upcoming commitfest to
let cfbot run it, and if it looks good there I'll get it pushed to
master.  If there's any problem I'll just remove it before beta2 is
stamped.

--
Álvaro Herrera       Valdivia, Chile

v3-0001-Add-test-case-for-obsoleting-slot-with-active-wal.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Álvaro Herrera
Apologies, I inadvertently sent the version before I added a maximum
number of iterations in the final loop.

--
Álvaro Herrera       Valdivia, Chile
"La fuerza no está en los medios físicos
sino que reside en una voluntad indomable" (Gandhi)

v4-0001-Add-test-case-for-obsoleting-slot-with-active-wal.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Tom Lane-2
In reply to this post by Álvaro Herrera
CONTENTS DELETED
The author has deleted this message.
Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Álvaro Herrera
In reply to this post by Álvaro Herrera
Hah, desmoxytes failed once:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2021-06-19%2003%3A06%3A04
I'll revert it and investigate later.  There have been no other
failures.

--
Álvaro Herrera                            39°49'30"S 73°17'W
"Hay que recordar que la existencia en el cosmos, y particularmente la
elaboración de civilizaciones dentro de él no son, por desgracia,
nada idílicas" (Ijon Tichy)


Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Tom Lane-2
In reply to this post by Tom Lane-2
CONTENTS DELETED
The author has deleted this message.
Reply | Threaded
Open this post in threaded view
|

Re: Race condition in InvalidateObsoleteReplicationSlots()

Tom Lane-2
CONTENTS DELETED
The author has deleted this message.