Weaker shmem interlock w/o postmaster.pid

Weaker shmem interlock w/o postmaster.pid

Noah Misch-2
If a starting postmaster's CreateLockFile() finds an existing postmaster.pid,
it subjects the shared memory segment named therein to the careful scrutiny of
PGSharedMemoryIsInUse().  If that segment matches the current data directory
and has any attached processes, we bail with the "pre-existing shared memory
block ... is still in use" error.  When the postmaster.pid file is missing,
there's inherently less we can do to reliably detect this situation; in
particular, an old postmaster could have chosen an unusual key due to the
usual 1+(port*1000) key being in use.  That being said, PGSharedMemoryCreate()
typically will stumble upon the old segment, and it (its sysv variant, anyway)
applies checks much weaker than those of PGSharedMemoryIsInUse().  If the
segment has a PGShmemHeader and the postmaster PID named in that header is not
alive, PGSharedMemoryCreate() will delete the segment and proceed.  Shouldn't
it instead check the same things as PGSharedMemoryIsInUse()?

The concrete situation in which I encountered this involved PostgreSQL 9.2 and
an immediate shutdown with a backend that had blocked SIGQUIT.  The backend
survived the immediate shutdown as one would expect.  The postmaster
nonetheless removed postmaster.pid before exiting, and I could immediately
restart PostgreSQL despite the survival of the SIGQUIT-blocked backend.  If I
instead SIGKILL the postmaster, postmaster.pid remains, and I must kill stray
backends before restarting.  The postmaster should not remove postmaster.pid
unless it has verified that its children have exited.  Concretely, that means
not removing postmaster.pid on immediate shutdown in 9.3 and earlier.  That's
consistent with the rough nature of an immediate shutdown, anyway.

I'm thinking to preserve postmaster.pid at immediate shutdown in all released
versions, but I'm less sure about back-patching a change to make
PGSharedMemoryCreate() pickier.  On the one hand, allowing startup to proceed
with backends still active in the same data directory is a corruption hazard.
On the other hand, it could break weird shutdown/restart patterns that permit
trivial lifespan overlap between backends of different postmasters.  Opinions?

Thanks,
nm

--
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Weaker shmem interlock w/o postmaster.pid

Stephen Frost
* Noah Misch ([hidden email]) wrote:
> Shouldn't it instead check the same things as PGSharedMemoryIsInUse()?

Offhand, I tend to agree that we should really be doing a very careful
job of looking at if an existing segment is still in use.

> The concrete situation in which I encountered this involved PostgreSQL 9.2 and
> an immediate shutdown with a backend that had blocked SIGQUIT.  The backend
> survived the immediate shutdown as one would expect.  

Well..  We expect this now because of the analysis you did in the
adjacent thread showing how it can happen.

> The postmaster
> nonetheless removed postmaster.pid before exiting, and I could immediately
> restart PostgreSQL despite the survival of the SIGQUIT-blocked backend.  If I
> instead SIGKILL the postmaster, postmaster.pid remains, and I must kill stray
> backends before restarting.  The postmaster should not remove postmaster.pid
> unless it has verified that its children have exited.

This makes sense, however..

> Concretely, that means
> not removing postmaster.pid on immediate shutdown in 9.3 and earlier.  That's
> consistent with the rough nature of an immediate shutdown, anyway.

I don't like leaving the postmaster.pid file around, even on an
immediate shutdown.  I don't have any great suggestions regarding what
to do, given what we try to do wrt 'immediate', so perhaps it's
acceptable for future releases.

> I'm thinking to preserve postmaster.pid at immediate shutdown in all released
> versions, but I'm less sure about back-patching a change to make
> PGSharedMemoryCreate() pickier.  On the one hand, allowing startup to proceed
> with backends still active in the same data directory is a corruption hazard.

The corruption risk, imv anyway, is sufficient to backpatch the change
and overrides the concerns around very fast shutdown/restarts.

        Thanks,

                Stephen


Re: Weaker shmem interlock w/o postmaster.pid

Robert Haas
In reply to this post by Noah Misch-2
On Tue, Sep 10, 2013 at 11:33 PM, Noah Misch <[hidden email]> wrote:
> I'm thinking to preserve postmaster.pid at immediate shutdown in all released
> versions, but I'm less sure about back-patching a change to make
> PGSharedMemoryCreate() pickier.  On the one hand, allowing startup to proceed
> with backends still active in the same data directory is a corruption hazard.
> On the other hand, it could break weird shutdown/restart patterns that permit
> trivial lifespan overlap between backends of different postmasters.  Opinions?

I'm more sanguine about the second change than the first.  Leaving
postmaster.pid around seems like a clear user-visible behavior change
that could break user scripts or have other consequences that we don't
foresee; thus, I would vote against back-patching it.  Indeed, I'm not
sure it's a good idea to do that even in master.  On the other hand,
tightening the checks in PGSharedMemoryCreate() seems very much worth
doing, and I think it might also be safe enough to back-patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Weaker shmem interlock w/o postmaster.pid

Noah Misch-2
In reply to this post by Stephen Frost
On Wed, Sep 11, 2013 at 11:32:01AM -0400, Stephen Frost wrote:
> * Noah Misch ([hidden email]) wrote:
> > The concrete situation in which I encountered this involved PostgreSQL 9.2 and
> > an immediate shutdown with a backend that had blocked SIGQUIT.  The backend
> > survived the immediate shutdown as one would expect.  
>
> Well..  We expect this now because of the analysis you did in the
> adjacent thread showing how it can happen.

That was a surprising way for it to happen, but there have long been known
vectors like a SIGSTOP'd backend or a backend that has blocked SIGQUIT.

> > Concretely, that means
> > not removing postmaster.pid on immediate shutdown in 9.3 and earlier.  That's
> > consistent with the rough nature of an immediate shutdown, anyway.
>
> I don't like leaving the postmaster.pid file around, even on an
> immediate shutdown.  I don't have any great suggestions regarding what
> to do, given what we try to do wrt 'immediate', so perhaps it's
> acceptable for future releases.

Fair enough.

> > I'm thinking to preserve postmaster.pid at immediate shutdown in all released
> > versions, but I'm less sure about back-patching a change to make
> > PGSharedMemoryCreate() pickier.  On the one hand, allowing startup to proceed
> > with backends still active in the same data directory is a corruption hazard.
>
> The corruption risk, imv anyway, is sufficient to backpatch the change
> and overrides the concerns around very fast shutdown/restarts.

Making PGSharedMemoryCreate() pickier in all branches will greatly diminish
the marginal value of preserving postmaster.pid, so I'm fine with dropping the
postmaster.pid side of the proposal.

Thanks,
nm

--
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com



Re: Weaker shmem interlock w/o postmaster.pid

David Johnston
Noah Misch-2 wrote
> > I'm thinking to preserve postmaster.pid at immediate shutdown in all released
> > versions, but I'm less sure about back-patching a change to make
> > PGSharedMemoryCreate() pickier.  On the one hand, allowing startup to proceed
> > with backends still active in the same data directory is a corruption hazard.
>
> The corruption risk, imv anyway, is sufficient to backpatch the change
> and overrides the concerns around very fast shutdown/restarts.

> Making PGSharedMemoryCreate() pickier in all branches will greatly diminish
> the marginal value of preserving postmaster.pid, so I'm fine with dropping the
> postmaster.pid side of the proposal.

It's probably still worth a fresh look at the immediate shutdown process to see whether the current location where postmaster.pid is removed is acceptable.  It may not be necessary to always leave it in place, but:

1) if there is a section of shared memory that can only be reached/found if one knows the pid, and
2) postmaster.pid is removed before that area is "secured from future clobbering"

then there may be a risk that can still be mitigated by moving its removal without having to go to the extreme.  

David J.

Re: Weaker shmem interlock w/o postmaster.pid

Noah Misch-2
On Wed, Sep 11, 2013 at 07:57:22PM -0700, David Johnston wrote:

> Noah Misch-2 wrote
> > Making PGSharedMemoryCreate() pickier in all branches will greatly
> > diminish
> > the marginal value of preserving postmaster.pid, so I'm fine with dropping
> > the
> > postmaster.pid side of the proposal.
>
> It's probably still worth a fresh look at the immediate shutdown process to
> see whether the current location where postmaster.pid is removed is
> acceptable.  It may not be necessary to leave it in place always but:
>
> 1) if there is a section of shared memory that can only be reached/found if
> one knows the pid, and

Similar: one needs a sysv shared memory key to find the segment, and
postmaster.pid records that key.  The chosen key is almost always the same
from run to run, so a new postmaster typically does find the old segment even
without postmaster.pid.

> 2) postmaster.pid is removed before that area is "secured from future
> clobbering"

Clobbering shared memory is not the actual threat here.  We use the shared
memory segment as a witness to the fact that PostgreSQL processes are active
in a data directory.  If we let children of different postmasters operate in
the same directory simultaneously, they can corrupt data.

> then there may be a risk that can still be mitigated by moving its removal
> without having to go to the extreme.  

--
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com



Re: Weaker shmem interlock w/o postmaster.pid

Bruce Momjian
In reply to this post by Robert Haas
On Wed, Sep 11, 2013 at 02:10:45PM -0400, Robert Haas wrote:

> On Tue, Sep 10, 2013 at 11:33 PM, Noah Misch <[hidden email]> wrote:
> > I'm thinking to preserve postmaster.pid at immediate shutdown in all released
> > versions, but I'm less sure about back-patching a change to make
> > PGSharedMemoryCreate() pickier.  On the one hand, allowing startup to proceed
> > with backends still active in the same data directory is a corruption hazard.
> > On the other hand, it could break weird shutdown/restart patterns that permit
> > trivial lifespan overlap between backends of different postmasters.  Opinions?
>
> I'm more sanguine about the second change than the first.  Leaving
> postmaster.pid around seems like a clear user-visible behavior change
> that could break user scripts or have other consequences that we don't
> foresee; thus, I would vote against back-patching it.  Indeed, I'm not
> sure it's a good idea to do that even in master.  On the other hand,
> tightening the checks in PGSharedMemoryCreate() seems very much worth
> doing, and I think it might also be safe enough to back-patch.

Were these changes ever applied?  I don't see them.

--
  Bruce Momjian  <[hidden email]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +



Re: Weaker shmem interlock w/o postmaster.pid

Noah Misch-2
On Wed, Feb 12, 2014 at 06:06:40PM -0500, Bruce Momjian wrote:

> On Wed, Sep 11, 2013 at 02:10:45PM -0400, Robert Haas wrote:
> > On Tue, Sep 10, 2013 at 11:33 PM, Noah Misch <[hidden email]> wrote:
> > > I'm thinking to preserve postmaster.pid at immediate shutdown in all released
> > > versions, but I'm less sure about back-patching a change to make
> > > PGSharedMemoryCreate() pickier.  On the one hand, allowing startup to proceed
> > > with backends still active in the same data directory is a corruption hazard.
> > > On the other hand, it could break weird shutdown/restart patterns that permit
> > > trivial lifespan overlap between backends of different postmasters.  Opinions?
> >
> > I'm more sanguine about the second change than the first.  Leaving
> > postmaster.pid around seems like a clear user-visible behavior change
> > that could break user scripts or have other consequences that we don't
> > foresee; thus, I would vote against back-patching it.  Indeed, I'm not
> > sure it's a good idea to do that even in master.  On the other hand,
> > tightening the checks in PGSharedMemoryCreate() seems very much worth
> > doing, and I think it might also be safe enough to back-patch.
>
> Were these changes ever applied?  I don't see them.

No, I haven't gotten around to writing them.

--
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com



Re: Weaker shmem interlock w/o postmaster.pid

Noah Misch-2
In reply to this post by Noah Misch-2
On Wed, Sep 11, 2013 at 10:28:00PM -0400, Noah Misch wrote:
> On Wed, Sep 11, 2013 at 11:32:01AM -0400, Stephen Frost wrote:
> > * Noah Misch ([hidden email]) wrote:
> > > Concretely, that means
> > > not removing postmaster.pid on immediate shutdown in 9.3 and earlier.  That's
> > > consistent with the rough nature of an immediate shutdown, anyway.

With 9.3 having a few months left, that's less interesting, but ...

> > I don't like leaving the postmaster.pid file around, even on an
> > immediate shutdown.  I don't have any great suggestions regarding what
> > to do, given what we try to do wrt 'immediate', so perhaps it's
> > acceptable for future releases.
>
> Fair enough.

... we could still worry about folks doing "kill -9 `head -n1 postmaster.pid`
&& rm postmaster.pid".  A possible defense is to record a copy of the shm
identifiers in a second file ("sysv_shmem_key"?) within the data directory.
Since users would have no expectations about a new file's lifecycle, we could
remove it when and only when we verify a segment doesn't exist.  However, a
DBA determined to remove postmaster.pid would likely remove both files.

> > > I'm thinking to preserve postmaster.pid at immediate shutdown in all released
> > > versions, but I'm less sure about back-patching a change to make
> > > PGSharedMemoryCreate() pickier.  On the one hand, allowing startup to proceed
> > > with backends still active in the same data directory is a corruption hazard.
> >
> > The corruption risk, imv anyway, is sufficient to backpatch the change
> > and overrides the concerns around very fast shutdown/restarts.
>
> Making PGSharedMemoryCreate() pickier in all branches will greatly diminish
> the marginal value of preserving postmaster.pid, so I'm fine with dropping the
> postmaster.pid side of the proposal.

Here's an implementation.  The first, small patch replaces an obsolete
errhint() about manually removing shared memory blocks.  In its place,
recommend killing processes.  Today's text, added in commit 4d14fe0, is too
rarely pertinent to justify a HINT.  (You might use ipcrm if a PostgreSQL
process is stuck in D state and has SIGKILL pending, so it will never become
runnable again.)  Removal of the ipcclean script ten years ago (39627b1) and
its non-replacement corroborate that manual shm removal is now a niche goal.

In the main patch, one of the tests covers an unsafe case that sysv_shmem.c
still doesn't detect.  I could pursue a fix via the aforementioned
sysv_shmem_key file, modulo the possibility of a DBA removing it.  I could
also, when postmaster.pid is missing, make sysv_shmem.c check the first N
(N=100?) keys applicable to the selected port.  My gut feeling is that neither
thing is worthwhile, but I'm interested to hear other opinions.

This removes the creatorPID test, which commit c715fde added before commit
4d14fe0 added the shm_nattch test.  The pair of tests is not materially
stronger than the shm_nattch test by itself.  For reasons explained in the
comment at "enum IpcMemoryState", this patch has us cease recycling segments
pertaining to data directories other than our own.  This isn't ideal for the
buildfarm, where a postmaster crash bug would permanently leak a shm segment.
I think the benefits for production use cases eclipse this drawback.  Did I
miss a reason to like the old behavior?

Single-user mode has been protected even less than normal execution, because
it selected a shm key as though using port zero.  Thus, starting single-user
mode after an incomplete shutdown would never detect the pre-shutdown shm.
The patch makes single-user mode work like normal execution.  Until commit
de98a7e, single-user mode used malloc(), not a shm segment.  That commit also
restricted single-user mode to not recycle old segments.  I didn't find a
rationale for that restriction, but a contributing factor may have been the
fact that every single-user process uses the same port-0 shm key space.  Also,
initdb starts various single-user backends, and reaping stale segments would
be an odd side effect for initdb.  With single-user mode using a real port
number and PostgreSQL no longer recycling segments of other data directories,
I removed that restriction.  By the way, as of this writing, my regular
development system has 41 stale segments, all in the single-user key space
(0x00000001 to 0x0000002d with gaps).  It's clear how they persisted once
leaked, but I don't know how I leaked them in the first place.

I ran 9327 iterations of the test file, and it did not leak a shm segment in
that time.  I tested on Red Hat and on Windows Server 2016; I won't be shocked
if the test (not the code under test) breaks on other Windows configurations.
The patch adds PostgresNode features to enable that test coverage.

The comment referring to a CreateDataDirLockFile() bug refers to this one:
https://postgr.es/m/flat/20120803145635.GE9683%40tornado.leadboat.com

Thanks,
nm

Attachments: ipcrm-hint-v1.patch (836 bytes), PGSharedMemoryCreate-safety-v1.patch (33K)

Re: Weaker shmem interlock w/o postmaster.pid

Dmitry Dolgov
> On Sun, Aug 12, 2018 at 8:48 AM Noah Misch <[hidden email]> wrote:
>
> On Wed, Sep 11, 2013 at 10:28:00PM -0400, Noah Misch wrote:
> > On Wed, Sep 11, 2013 at 11:32:01AM -0400, Stephen Frost wrote:
> > > * Noah Misch ([hidden email]) wrote:
> > > > Concretely, that means
> > > > not removing postmaster.pid on immediate shutdown in 9.3 and earlier.  That's
> > > > consistent with the rough nature of an immediate shutdown, anyway.
>
> With 9.3 having a few months left, that's less interesting, but ...

Thanks for the patch!

I'm a bit out of context, but given that 9.3 is already beyond EOL, is this
still interesting? I'm just wondering whether it's worth moving it to the next
CF or marking it rejected, since no one has reviewed the patch so far.

As a side note, with this patch recovery tests are failing now on 016_shm.pl

#   Failed test 'detected live backend via shared memory'
#   at t/016_shm.pl line 87.
#                   '2018-11-28 13:08:08.504 UTC [21924] LOG:
listening on Unix socket "/tmp/yV2oDNcG8e/gnat/.s.PGSQL.512"
# 2018-11-28 13:08:08.512 UTC [21925] LOG:  database system was
interrupted; last known up at 2018-11-28 13:08:08 UTC
# 2018-11-28 13:08:08.512 UTC [21925] LOG:  database system was not
properly shut down; automatic recovery in progress
# 2018-11-28 13:08:08.512 UTC [21925] LOG:  invalid record length at
0/165FEF8: wanted 24, got 0
# 2018-11-28 13:08:08.512 UTC [21925] LOG:  redo is not required
# 2018-11-28 13:08:08.516 UTC [21924] LOG:  database system is ready
to accept connections
# '
#     doesn't match '(?^:pre-existing shared memory block)'


Re: Weaker shmem interlock w/o postmaster.pid

Noah Misch-2
On Thu, Nov 29, 2018 at 01:10:57PM +0100, Dmitry Dolgov wrote:
> On Sun, Aug 12, 2018 at 8:48 AM Noah Misch <[hidden email]> wrote:
> > With 9.3 having a few months left, that's less interesting, but ...

> I'm a bit out of context, but taking into account that 9.3 is already beyond
> EOL, is it still interesting?

Yes.

> As a side note, with this patch recovery tests are failing now on 016_shm.pl
>
> #   Failed test 'detected live backend via shared memory'
> #   at t/016_shm.pl line 87.
> #                   '2018-11-28 13:08:08.504 UTC [21924] LOG:
> listening on Unix socket "/tmp/yV2oDNcG8e/gnat/.s.PGSQL.512"
> # 2018-11-28 13:08:08.512 UTC [21925] LOG:  database system was
> interrupted; last known up at 2018-11-28 13:08:08 UTC
> # 2018-11-28 13:08:08.512 UTC [21925] LOG:  database system was not
> properly shut down; automatic recovery in progress
> # 2018-11-28 13:08:08.512 UTC [21925] LOG:  invalid record length at
> 0/165FEF8: wanted 24, got 0
> # 2018-11-28 13:08:08.512 UTC [21925] LOG:  redo is not required
> # 2018-11-28 13:08:08.516 UTC [21924] LOG:  database system is ready
> to accept connections
> # '
> #     doesn't match '(?^:pre-existing shared memory block)'

Thanks for the report.  Since commit cfdf4dc made pg_sleep() react to
postmaster death, the test will need a different way to stall a backend.  This
doesn't affect non-test code, and the test still passes against cfdf4dc^ and
against REL_11_STABLE.  I've queued a task to update the test code, but review
can proceed in parallel.


Re: Weaker shmem interlock w/o postmaster.pid

Noah Misch-2
On Thu, Nov 29, 2018 at 10:51:40PM -0800, Noah Misch wrote:

> On Thu, Nov 29, 2018 at 01:10:57PM +0100, Dmitry Dolgov wrote:
> > As a side note, with this patch recovery tests are failing now on 016_shm.pl
> >
> > #   Failed test 'detected live backend via shared memory'
> > #   at t/016_shm.pl line 87.
> > #                   '2018-11-28 13:08:08.504 UTC [21924] LOG:
> > listening on Unix socket "/tmp/yV2oDNcG8e/gnat/.s.PGSQL.512"
> > # 2018-11-28 13:08:08.512 UTC [21925] LOG:  database system was
> > interrupted; last known up at 2018-11-28 13:08:08 UTC
> > # 2018-11-28 13:08:08.512 UTC [21925] LOG:  database system was not
> > properly shut down; automatic recovery in progress
> > # 2018-11-28 13:08:08.512 UTC [21925] LOG:  invalid record length at
> > 0/165FEF8: wanted 24, got 0
> > # 2018-11-28 13:08:08.512 UTC [21925] LOG:  redo is not required
> > # 2018-11-28 13:08:08.516 UTC [21924] LOG:  database system is ready
> > to accept connections
> > # '
> > #     doesn't match '(?^:pre-existing shared memory block)'
>
> Thanks for the report.  Since commit cfdf4dc made pg_sleep() react to
> postmaster death, the test will need a different way to stall a backend.  This
> doesn't affect non-test code, and the test still passes against cfdf4dc^ and
> against REL_11_STABLE.  I've queued a task to update the test code, but review
> can proceed in parallel.

Here's a version fixing that test for post-cfdf4dc backends.

Attachments: ipcrm-hint-v1.patch (836 bytes), PGSharedMemoryCreate-safety-v2.patch (35K)

Re: Weaker shmem interlock w/o postmaster.pid

Michael Paquier-2
On Sun, Dec 02, 2018 at 04:41:06PM -0800, Noah Misch wrote:
> Here's a version fixing that test for post-cfdf4dc backends.

This patch set still applies and needs review, so moved to next CF.
--
Michael


Re: Weaker shmem interlock w/o postmaster.pid

Kyotaro HORIGUCHI-2
In reply to this post by Noah Misch-2
Hello.

At Sun, 2 Dec 2018 16:41:06 -0800, Noah Misch <[hidden email]> wrote in <[hidden email]>

> On Thu, Nov 29, 2018 at 10:51:40PM -0800, Noah Misch wrote:
> > On Thu, Nov 29, 2018 at 01:10:57PM +0100, Dmitry Dolgov wrote:
> > > As a side note, with this patch recovery tests are failing now on 016_shm.pl
> > >
> > > #   Failed test 'detected live backend via shared memory'
> > > #   at t/016_shm.pl line 87.
> > > #                   '2018-11-28 13:08:08.504 UTC [21924] LOG:
> > > listening on Unix socket "/tmp/yV2oDNcG8e/gnat/.s.PGSQL.512"
> > > # 2018-11-28 13:08:08.512 UTC [21925] LOG:  database system was
> > > interrupted; last known up at 2018-11-28 13:08:08 UTC
> > > # 2018-11-28 13:08:08.512 UTC [21925] LOG:  database system was not
> > > properly shut down; automatic recovery in progress
> > > # 2018-11-28 13:08:08.512 UTC [21925] LOG:  invalid record length at
> > > 0/165FEF8: wanted 24, got 0
> > > # 2018-11-28 13:08:08.512 UTC [21925] LOG:  redo is not required
> > > # 2018-11-28 13:08:08.516 UTC [21924] LOG:  database system is ready
> > > to accept connections
> > > # '
> > > #     doesn't match '(?^:pre-existing shared memory block)'
> >
> > Thanks for the report.  Since commit cfdf4dc made pg_sleep() react to
> > postmaster death, the test will need a different way to stall a backend.  This
> > doesn't affect non-test code, and the test still passes against cfdf4dc^ and
> > against REL_11_STABLE.  I've queued a task to update the test code, but review
> > can proceed in parallel.

I found that I have 65(h) segments left behind on my environment :p

At Sat, 11 Aug 2018 23:48:15 -0700, Noah Misch <[hidden email]> wrote in <[hidden email]>
> still doesn't detect.  I could pursue a fix via the aforementioned
> sysv_shmem_key file, modulo the possibility of a DBA removing it.  I could
> also, when postmaster.pid is missing, make sysv_shmem.c check the first N
> (N=100?) keys applicable to the selected port.  My gut feeling is that neither
> thing is worthwhile, but I'm interested to hear other opinions.

# Though I don't get the meaning of the "modulo" there..

I think the only thing we must avoid here is sharing the same
shmem segment with a living-dead server. If we can do that
without the pid file, it would be better than relying on it. We
could remove orphaned segments automatically, but I don't think
we should go so far as to rely on a dedicated file for that.
Also, I don't think it's worth stopping the shmem id scan at a
certain number, since I can't come up with an appropriate value
for it. But looking at "port * 1000", one might expect that a
usable segment id will be found while scanning that many ids
(1000).

> Here's a version fixing that test for post-cfdf4dc backends.

This moves what was in PGSharedMemoryIsInUse into a new function,
IpcMemoryAnalyze, for reuse, then distinguishes EEXISTS/FOREIGN
among the not-in-use cases and ATTACHED/ANALYSIS_FAILURE among
the in-use cases. UNATTACHED is changed to a not-in-use case by
this patch.  As a result, PGSharedMemoryIsInUse() now returns
"not in use" for the UNATTACHED case. It looks fine.

PGSharedMemoryCreate is changed to choose a usable shmem id using
IpcMemoryAnalyze().  But some of the statuses from
IpcMemoryAnalyze() are concealed by failure of
PGSharedMemoryAttach() and ignored silently, contrary to what the
code intends to do. (By the way, SHMSTATE_EEXISTS seems to
suggest the opposite of EEXIST, which would be confusing.)

PGSharedMemoryCreate() repeats shmat/shmdt twice in every
iteration. It does little harm, but it would be better if we
could get rid of that.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Weaker shmem interlock w/o postmaster.pid

Noah Misch-2
On Mon, Mar 04, 2019 at 06:04:20PM +0900, Kyotaro HORIGUCHI wrote:
> I found that I have 65(h) segments left alone on my environment:p

Did patched PostgreSQL create those, or did unpatched PostgreSQL create them?

> At Sat, 11 Aug 2018 23:48:15 -0700, Noah Misch <[hidden email]> wrote in <[hidden email]>
> > still doesn't detect.  I could pursue a fix via the aforementioned
> > sysv_shmem_key file, modulo the possibility of a DBA removing it.  I could
> > also, when postmaster.pid is missing, make sysv_shmem.c check the first N
> > (N=100?) keys applicable to the selected port.  My gut feeling is that neither
> > thing is worthwhile, but I'm interested to hear other opinions.
>
> # Though I don't get the meaning of the "modulo" there..

It wasn't clear.  Adding a separate sysv_shmem_key file would not protect a
cluster if the DBA does "rm -f postmaster.pid sysv_shmem_key".  It would,
however, fix this test case:

+# Fail to reject startup if shm key N has become available and we crash while
+# using key N+1.  This is unwanted, but expected.  Windows is immune, because
+# its GetSharedMemName() uses DataDir strings, not numeric keys.
+$flea->stop;    # release first key
+is( $gnat->start(fail_ok => 1),
+       $TestLib::windows_os ? 0 : 1,
+       'key turnover fools only sysv_shmem.c');

> I think the only thing we must avoid here is sharing the same
> shmem segment with a living-dead server. If we can do that
> without the pid file, it would be better than relying on it. We
> could remove orphaned segments automatically, but I don't think
> we should do that going so far as relying on a dedicated
> file.

There are two things to avoid.  First, during postmaster startup, avoid
sharing a segment with any other process.  Second, avoid sharing a data
directory with a running PostgreSQL process that was a child of some other
postmaster.  I think that's what you mean by "living-dead server".  The first
case is already prevented thoroughly, because we only proceed with startup
after creating a new segment with IPC_CREAT | IPC_EXCL.  The second case is
what this patch seeks to improve.

> Also, I don't think it's worth stopping the shmem id scan
> at a certain number, since I can't come up with an appropriate
> value for it. But looking at "port * 1000", one might expect
> that a usable segment id will be found while scanning that many
> ids (1000).

If we find thousands of unusable segments, we do keep going until we find a
usable segment.  I was referring to a different situation.  Today, if there's
no postmaster.pid file and we're using port 5432, we first try segment ID
5432000.  If segment ID 5432000 is usable, we use it without examining any
other segment ID.  If segment ID 5432010 were in use by a "living-dead
server", we won't discover that fact.  We could increase the chance of
discovering that fact by checking every segment ID from 5432000 to 5432999.
(Once finished with that scan, we'd still create and use 5432000.)  I didn't
implement that idea in the patch, but I shared it to see if anyone thought it
was worth adding.

> PGSharedMemoryCreate was changed to choose a usable shmem id using
> IpcMemoryAnalyze().  But some of the statuses from
> IpcMemoryAnalyze() are concealed by failure of
> PGSharedMemoryAttach() and silently ignored, contrary to what the
> code intends to do.

SHMSTATE_FOREIGN is ignored silently.  The patch modifies the
PGSharedMemoryAttach() header comment to direct callers to treat
PGSharedMemoryAttach() failure like SHMSTATE_FOREIGN.  I don't think the code
had an unintentional outcome due to the situation you describe.  Perhaps I
could have made the situation clearer.

> (By the way, SHMSTATE_EEXISTS seems to
> suggest the opposite of EEXIST, which would be confusing.)

Good catch.  Is SHMSTATE_ENOENT better?

> PGSharedMemoryCreate() repeats shmat/shmdt twice in every
> iteration. It does little harm, but it would be better if we
> could get rid of that.

I'll try to achieve that and see if the code turns out cleaner.

Thanks,
nm


Re: Weaker shmem interlock w/o postmaster.pid

Noah Misch-2
On Wed, Mar 06, 2019 at 07:24:22PM -0800, Noah Misch wrote:

> On Mon, Mar 04, 2019 at 06:04:20PM +0900, Kyotaro HORIGUCHI wrote:
> > PGSharedMemoryCreate was changed to choose a usable shmem id using
> > IpcMemoryAnalyze().  But some of the statuses from
> > IpcMemoryAnalyze() are concealed by failure of
> > PGSharedMemoryAttach() and silently ignored, contrary to what the
> > code intends to do.
>
> SHMSTATE_FOREIGN is ignored silently.  The patch modifies the
> PGSharedMemoryAttach() header comment to direct callers to treat
> PGSharedMemoryAttach() failure like SHMSTATE_FOREIGN.  I don't think the code
> had an unintentional outcome due to the situation you describe.  Perhaps I
> could have made the situation clearer.
I think v3, attached, avoids this appearance of unintended behavior.

> > (By the way, SHMSTATE_EEXISTS seems to
> > suggest the opposite of EEXIST, which would be confusing.)
>
> Good catch.  Is SHMSTATE_ENOENT better?

I did s/SHMSTATE_EEXISTS/SHMSTATE_ENOENT/.

> > PGSharedMemoryCreate() repeats shmat/shmdt twice in every
> > iteration. It does little harm, but it would be better if we
> > could get rid of that.
>
> I'll try to achieve that and see if the code turns out cleaner.

I renamed IpcMemoryAnalyze() to PGSharedMemoryAttach() and deleted the old
function of that name.  Now, this function never calls shmdt(); the caller is
responsible for that.  I do like things better this way.  What do you think?

Attachments: ipcrm-hint-v1.patch (836 bytes), PGSharedMemoryCreate-safety-v3.patch (37K)

Re: Weaker shmem interlock w/o postmaster.pid

Daniel Gustafsson
On Saturday, March 9, 2019 8:16 AM, Noah Misch <[hidden email]> wrote:

This patch is not really in my wheelhouse, so I might very well be testing it
in the wrong way, but what's the fun in staying in the shallow end?  Below is
my attempt at reviewing this patchset.

Both patches apply with a little bit of fuzz, and pass check-world.  No new
perlcritic warnings are introduced, but 016_shm.pl needs to be renamed to 017
since commit b0825d28ea83e44139bd319e6d1db2c499c claimed 016.  They absolutely
seem to do what they say on the tin, and to my understanding this seems like
an improvement we definitely want.

> I renamed IpcMemoryAnalyze() to PGSharedMemoryAttach() and deleted the old
> function of that name. Now, this function never calls shmdt(); the caller is
> responsible for that. I do like things better this way. What do you think?

I think it makes for a good API for the caller to be responsible, but it does
warrant a comment on the function to explicitly state that.

A few other small comments:

+   state = PGSharedMemoryAttach((IpcMemoryId) id2, &memAddress);
+   if (memAddress)
+       shmdt(memAddress);

This seems like a case where it would be useful to log a shmdt() error or do
an Assert() around the success of the operation perhaps?

+    * Loop till we find a free IPC key.  Trust CreateDataDirLockFile() to
+    * ensure no more than one postmaster per data directory can enter this
+    * loop simultaneously.  (CreateDataDirLockFile() does not ensure that,
+    * but prefer fixing it over coping here.)

This comment makes it seem like there is a fix to CreateLockFile() missing
from this patch; is that correct?  If so, do you have an idea for that patch?

> I tested on Red Hat and on Windows Server 2016; I won't be shocked
> if the test (not the code under test) breaks on other Windows configurations.

IIRC there are Windows versions where Win32::Process::KillProcess is required
for this, but that's admittedly somewhat dated knowledge.  If the buildfarm
goes red on older Windows animals, that might be something to look at.

Switching this to Ready for Committer since I can't see anything but tiny things.

cheers ./daniel



Re: Weaker shmem interlock w/o postmaster.pid

Noah Misch-2
On Fri, Mar 29, 2019 at 09:53:51AM +0000, Daniel Gustafsson wrote:
> On Saturday, March 9, 2019 8:16 AM, Noah Misch <[hidden email]> wrote:
> > I renamed IpcMemoryAnalyze() to PGSharedMemoryAttach() and deleted the old
> > function of that name. Now, this function never calls shmdt(); the caller is
> > responsible for that. I do like things better this way. What do you think?
>
> I think it makes for a good API for the caller to be responsible, but it does
> warrant a comment on the function to explicitly state that.

The name "PGSharedMemoryAttach" makes that fact sufficiently obvious, I think.

> A few other small comments:
>
> +   state = PGSharedMemoryAttach((IpcMemoryId) id2, &memAddress);
> +   if (memAddress)
> +       shmdt(memAddress);
>
> This seems like a case where it would be useful to log a shmdt() error or do
> an Assert() around the success of the operation perhaps?

I'll add the same elog(LOG) we have at other shmdt() sites.  I can't think of
a site where we Assert() about the results of a system call.  While shmdt()
might be a justified exception, elog(LOG) seems reasonable.

> +    * Loop till we find a free IPC key.  Trust CreateDataDirLockFile() to
> +    * ensure no more than one postmaster per data directory can enter this
> +    * loop simultaneously.  (CreateDataDirLockFile() does not ensure that,
> +    * but prefer fixing it over coping here.)
>
> This comment makes it seem like there is a fix to CreateLockFile() missing
> from this patch; is that correct?  If so, do you have an idea for that patch?

That comment refers to
https://postgr.es/m/flat/20120803145635.GE9683%40tornado.leadboat.com

> Switching this to Ready for Committer since I can't see anything but tiny things.

Thanks.



Re: Weaker shmem interlock w/o postmaster.pid

Daniel Gustafsson
On Monday, April 1, 2019 12:42 AM, Noah Misch <[hidden email]> wrote:
> On Fri, Mar 29, 2019 at 09:53:51AM +0000, Daniel Gustafsson wrote:

> > This seems like a case where it would be useful to log a shmdt() error or do
> > an Assert() around the success of the operation perhaps?
>
> I'll add the same elog(LOG) we have at other shmdt() sites. I can't think of
> a site where we Assert() about the results of a system call. While shmdt()
> might be a justified exception, elog(LOG) seems reasonable.

Agreed, seems reasonable.

> > +    * Loop till we find a free IPC key.  Trust CreateDataDirLockFile() to
> > +    * ensure no more than one postmaster per data directory can enter this
> > +    * loop simultaneously.  (CreateDataDirLockFile() does not ensure that,
> > +    * but prefer fixing it over coping here.)
> >
> > This comment makes it seem like there is a fix to CreateLockFile() missing
> > from this patch; is that correct?  If so, do you have an idea for that patch?
>
> That comment refers to
> https://postgr.es/m/flat/20120803145635.GE9683%40tornado.leadboat.com

Gotcha, thanks for the clarification.

cheers ./daniel



Re: Weaker shmem interlock w/o postmaster.pid

Noah Misch-2
On Mon, Apr 01, 2019 at 08:19:56AM +0000, Daniel Gustafsson wrote:

> On Monday, April 1, 2019 12:42 AM, Noah Misch <[hidden email]> wrote:
> > On Fri, Mar 29, 2019 at 09:53:51AM +0000, Daniel Gustafsson wrote:
>
> > > This seems like a case where it would be useful to log a shmdt() error or do
> > > an Assert() around the success of the operation perhaps?
> >
> > I'll add the same elog(LOG) we have at other shmdt() sites. I can't think of
> > a site where we Assert() about the results of a system call. While shmdt()
> > might be a justified exception, elog(LOG) seems reasonable.
>
> Agreed, seems reasonable.

Pushed, but that broke two buildfarm members:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2019-04-04%2000%3A33%3A14
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2019-04-04%2000%3A33%3A13

I think the problem arose because these animals run on the same machine, and
their test execution was synchronized to the second.  Two copies of the new
test ran concurrently.  It doesn't tolerate that, owing to expectations about
which shared memory keys are in use.  My initial thought is to fix this by
having a third postmaster that runs throughout the test and represents
ownership of a given port.  If that postmaster gets something other than the
first shm key pertaining to its port, switch ports and try again.

I'll also include fixes for the warnings Andres reported on the
pgsql-committers thread.

