pgsql: Consistently test for in-use shared memory.

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

pgsql: Consistently test for in-use shared memory.

Noah Misch-2
Consistently test for in-use shared memory.

postmaster startup scrutinizes any shared memory segment recorded in
postmaster.pid, exiting if that segment matches the current data
directory and has an attached process.  When the postmaster.pid file was
missing, a starting postmaster used weaker checks.  Change to use the
same checks in both scenarios.  This increases the chance of a startup
failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1
postmaster.pid` && rm postmaster.pid && pg_ctl -w start".  A postmaster
will no longer stop if shmat() of an old segment fails with EACCES.  A
postmaster will no longer recycle segments pertaining to other data
directories.  That's good for production, but it's bad for integration
tests that crash a postmaster and immediately delete its data directory.
Such a test now leaks a segment indefinitely.  No "make check-world"
test does that.  win32_shmem.c already avoided all these problems.  In
9.6 and later, enhance PostgresNode to facilitate testing.  Back-patch
to 9.4 (all supported versions).

Reviewed (in earlier versions) by Daniel Gustafsson and Kyotaro HORIGUCHI.

Discussion: https://postgr.es/m/20190408064141.GA2016666@...

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/c098509927f9a49ebceb301a2cb6a477ecd4ac3c

Modified Files
--------------
src/Makefile.global.in              |   4 +-
src/backend/port/sysv_shmem.c       | 269 +++++++++++++++++++++---------------
src/backend/port/win32_shmem.c      |   7 +-
src/backend/postmaster/postmaster.c |  12 +-
src/backend/storage/ipc/ipci.c      |  14 +-
src/backend/utils/init/postinit.c   |   6 +-
src/include/storage/ipc.h           |   2 +-
src/include/storage/pg_shmem.h      |   6 +-
src/test/perl/PostgresNode.pm       | 184 +++++++++++++++++++-----
src/test/recovery/t/017_shm.pl      | 200 +++++++++++++++++++++++++++
src/tools/msvc/vcregress.pl         |   1 +
11 files changed, 525 insertions(+), 180 deletions(-)

Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Consistently test for in-use shared memory.

Tom Lane-2
Noah Misch <[hidden email]> writes:
> Consistently test for in-use shared memory.

Hmm ... conchuela has just found another problem with this test case:

ok 2 - detected live backend via shared memory
# Running: postgres --single -D /home/pgbf/buildroot/HEAD/pgsql.build/src/test/recovery/tmp_check/t_017_shm_gnat_data/pgdata template1
ack Broken pipe: write( 16, 'SELECT 1 + 1' ) at /usr/local/lib/perl5/site_perl/IPC/Run/IO.pm line 549.

I interpret this as "the single-user backend exited before we could stuff
'SELECT 1 + 1' down the pipe to it, and the Perl script is not expecting
to get a write failure there".

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Consistently test for in-use shared memory.

Noah Misch-2
On Mon, Apr 15, 2019 at 08:08:48PM -0400, Tom Lane wrote:

> Noah Misch <[hidden email]> writes:
> > Consistently test for in-use shared memory.
>
> Hmm ... conchuela has just found another problem with this test case:
>
> ok 2 - detected live backend via shared memory
> # Running: postgres --single -D /home/pgbf/buildroot/HEAD/pgsql.build/src/test/recovery/tmp_check/t_017_shm_gnat_data/pgdata template1
> ack Broken pipe: write( 16, 'SELECT 1 + 1' ) at /usr/local/lib/perl5/site_perl/IPC/Run/IO.pm line 549.
>
> I interpret this as "the single-user backend exited before we could stuff
> 'SELECT 1 + 1' down the pipe to it, and the Perl script is not expecting
> to get a write failure there".

Perhaps I should write the 'SELECT 1 + 1' to a regular file and redirect input
from that file.


Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Consistently test for in-use shared memory.

Tom Lane-2
Noah Misch <[hidden email]> writes:
> On Mon, Apr 15, 2019 at 08:08:48PM -0400, Tom Lane wrote:
>> I interpret this as "the single-user backend exited before we could stuff
>> 'SELECT 1 + 1' down the pipe to it, and the Perl script is not expecting
>> to get a write failure there".

> Perhaps I should write the 'SELECT 1 + 1' to a regular file and redirect input
> from that file.

Yeah, that was the first idea that occurred to me as well.
Kinda grotty, but ...

Another idea is to shove the problem off on a sub-shell, that is run

        echo SELECT 1 + 1 | postgres --single ...

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Consistently test for in-use shared memory.

Tom Lane-2
Noah Misch <[hidden email]> writes:
>> Perhaps I should write the 'SELECT 1 + 1' to a regular file and redirect input
>> from that file.

Actually ... we don't need to run a query at all do we?

Although this would only help if Perl will optimize away an attempt
to write zero bytes to the pipe, which seems likely but not certain.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Consistently test for in-use shared memory.

Noah Misch-2
On Mon, Apr 15, 2019 at 09:01:45PM -0400, Tom Lane wrote:
> Noah Misch <[hidden email]> writes:
> >> Perhaps I should write the 'SELECT 1 + 1' to a regular file and redirect input
> >> from that file.
>
> Actually ... we don't need to run a query at all do we?

No.  We expect never to run one; it was there for the unexpected case of
"postgres --single" startup succeeding.  I pushed a change to close the stdin
of "postgres --single" instead of writing to it.  I probably worried that
"postgres --single" would not tolerate that, but it seems to.


Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Consistently test for in-use shared memory.

Tom Lane-2
Noah Misch <[hidden email]> writes:
> No.  We expect never to run one; it was there for the unexpected case of
> "postgres --single" startup succeeding.  I pushed a change to close the stdin
> of "postgres --single" instead of writing to it.

Ah, good, that's safer than what I was imagining.

> I probably worried that
> "postgres --single" would not tolerate that, but it seems to.

Sure.  EOF-on-stdin is the standard way to end a standalone-backend
session.  This fix means it'd have zero queries to execute rather
than one, but I'd be entirely willing to call it a backend bug if that
didn't work.

The more important point of course is that we must correctly detect
test failure if the backend doesn't complain.  But since the test
explicitly looks for a particular error message text, that seems
fine.

                        regards, tom lane