A test for replay of regression tests

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

A test for replay of regression tests

Thomas Munro-5
Hi,

We have automated tests for many specific replication and recovery
scenarios, but nothing that tests replay of a wide range of records.
People working on recovery code often use installcheck (presumably
along with other custom tests) to exercise it, sometimes with
wal_consistency_check enabled.  So, why don't we automate that?  Aside
from exercising the WAL decoding machinery (which brought me here),
that'd hopefully provide some decent improvements in coverage of the
various redo routines, many of which are not currently exercised at
all.

I'm not quite sure where it belongs, though.  The attached initial
sketch patch puts it under rc/test/recovery near other similar things,
but I'm not sure if it's really OK to invoke make -C ../regress from
here.  I copied pg_update/test.sh's trick of using a different
outputdir to avoid clobbering a concurrent run under src/test/regress,
and I also needed to invent a way to stop it from running the cursed
tablespace test (deferring startup of the standby also works but eats
way too much space, which I learned after blowing out a smallish
development VM's disk).   Alternatively, we could put it under
src/test/regress, which makes some kind of logical sense, but it would
make a quick "make check" take more than twice as long.  Perhaps it
could use a different target, one that check-world somehow reaches but
not check, and that also doesn't seem great.  Ideas on how to
structure this or improve the perl welcome.

0001-Test-replay-of-regression-tests.patch (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

RE: A test for replay of regression tests

tsunakawa.takay@fujitsu.com
From: Thomas Munro <[hidden email]>

> We have automated tests for many specific replication and recovery scenarios,
> but nothing that tests replay of a wide range of records.
> People working on recovery code often use installcheck (presumably along
> with other custom tests) to exercise it, sometimes with
> wal_consistency_check enabled.  So, why don't we automate that?  Aside
> from exercising the WAL decoding machinery (which brought me here), that'd
> hopefully provide some decent improvements in coverage of the various redo
> routines, many of which are not currently exercised at all.
>
> I'm not quite sure where it belongs, though.  The attached initial sketch patch

I think that's a good attempt.  It'd be better to confirm that the database contents are identical on the primary and standby.  But... I remember when I ran make installcheck with a standby connected, then ran pg_dumpall on both the primary and standby and compare the two output files, about 40 lines of difference were observed.  Those differences were all about the sequence values.  I didn't think about whether I should/can remove the differences.


+# XXX The tablespace tests don't currently work when the standby shares a
+# filesystem with the primary due to colliding absolute paths.  We'll skip
+# that for now.

Maybe we had better have a hidden feature that creates tablespace contents in

/tablespace_location/PG_..._<some_name>/

<some_name> is the value of cluster_name or application_name.

Or, we may provide a visible feature that allows users to store tablespace contents in a specified directory regardless of the LOCATION value in CREATE TABLESPACE.  For instance, add a GUC like

    table_space_directory = '/some_dir'

Then, the tablespace contents are stored in /some_dir/<tablespace_name>/.  This may be useful if a DBaaS provider wants to offer some tablespace-based feature, say tablespace transparent data encryption, but doesn't want users to specify filesystem directories.


Regards
Takayuki Tsunakawa


Reply | Threaded
Open this post in threaded view
|

Re: A test for replay of regression tests

Thomas Munro-5
On Fri, Apr 23, 2021 at 6:27 PM [hidden email]
<[hidden email]> wrote:
> From: Thomas Munro <[hidden email]>
> > I'm not quite sure where it belongs, though.  The attached initial sketch patch
>
> I think that's a good attempt.  It'd be better to confirm that the database contents are identical on the primary and standby.  But... I remember when I ran make installcheck with a standby connected, then ran pg_dumpall on both the primary and standby and compare the two output files, about 40 lines of difference were observed.  Those differences were all about the sequence values.  I didn't think about whether I should/can remove the differences.

Interesting idea.  I hadn't noticed the thing with sequences before.

> +# XXX The tablespace tests don't currently work when the standby shares a
> +# filesystem with the primary due to colliding absolute paths.  We'll skip
> +# that for now.
>
> Maybe we had better have a hidden feature that creates tablespace contents in
>
> /tablespace_location/PG_..._<some_name>/
>
> <some_name> is the value of cluster_name or application_name.
>
> Or, we may provide a visible feature that allows users to store tablespace contents in a specified directory regardless of the LOCATION value in CREATE TABLESPACE.  For instance, add a GUC like
>
>     table_space_directory = '/some_dir'
>
> Then, the tablespace contents are stored in /some_dir/<tablespace_name>/.  This may be useful if a DBaaS provider wants to offer some tablespace-based feature, say tablespace transparent data encryption, but doesn't want users to specify filesystem directories.

Yeah, a few similar ideas have been put forward before, for example in
this thread:

https://www.postgresql.org/message-id/flat/CALfoeisEF92F5nJ-aAcuWTvF_Aogxq_1bHLem_kVfM_tHc2mfg%40mail.gmail.com

... but also others.  I would definitely like to fix that problem too
(having worked on several things that interact with tablespaces, I
definitely want to be able to extend those tests in future patches,
and get coverage on the build farm and CI), but with --skip-tests that
could be done independently.

Apparently the invocation of make failed for some reason on CI (not
sure why, works for me), but in any case, that feels a bit clunky and
might not ever work on Windows, so perhaps we should figure out how to
invoke the pg_regress[.exe] program directly.


Reply | Threaded
Open this post in threaded view
|

Re: A test for replay of regression tests

Andres Freund
In reply to this post by Thomas Munro-5
Hi,

On 2021-04-23 17:37:48 +1200, Thomas Munro wrote:
> We have automated tests for many specific replication and recovery
> scenarios, but nothing that tests replay of a wide range of records.
> People working on recovery code often use installcheck (presumably
> along with other custom tests) to exercise it, sometimes with
> wal_consistency_check enabled.  So, why don't we automate that?  Aside
> from exercising the WAL decoding machinery (which brought me here),
> that'd hopefully provide some decent improvements in coverage of the
> various redo routines, many of which are not currently exercised at
> all.

Yay.


> I'm not quite sure where it belongs, though.  The attached initial
> sketch patch puts it under rc/test/recovery near other similar things,
> but I'm not sure if it's really OK to invoke make -C ../regress from
> here.

I'd say it's not ok, and we should just invoke pg_regress without make.


> Add a new TAP test under src/test/recovery that runs the regression
> tests with wal_consistency_checking=all.

Hm. I wonder if running with wal_consistency_checking=all doesn't also
reduce coverage of some aspects of recovery, by increasing record sizes
etc.


> I copied pg_update/test.sh's trick of using a different
> outputdir to avoid clobbering a concurrent run under src/test/regress,
> and I also needed to invent a way to stop it from running the cursed
> tablespace test (deferring startup of the standby also works but eats
> way too much space, which I learned after blowing out a smallish
> development VM's disk).

That's because you are using wal_consistency_checking=all, right?
Because IIRC we don't generate that much WAL otherwise?


> +# Create some content on primary and check its presence in standby 1
> +$node_primary->safe_psql('postgres',
> + "CREATE TABLE tab_int AS SELECT generate_series(1,1002) AS a");
> +
> +# Wait for standby to catch up
> +$node_primary->wait_for_catchup($node_standby_1, 'replay',
> + $node_primary->lsn('insert'));

> +my $result =
> +  $node_standby_1->safe_psql('postgres', "SELECT count(*) FROM tab_int");
> +print "standby 1: $result\n";
> +is($result, qq(1002), 'check streamed content on standby 1');

Why is this needed?



Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: A test for replay of regression tests

Andrew Dunstan
In reply to this post by Thomas Munro-5

On 4/23/21 1:37 AM, Thomas Munro wrote:

> Hi,
>
> We have automated tests for many specific replication and recovery
> scenarios, but nothing that tests replay of a wide range of records.
> People working on recovery code often use installcheck (presumably
> along with other custom tests) to exercise it, sometimes with
> wal_consistency_check enabled.  So, why don't we automate that?  Aside
> from exercising the WAL decoding machinery (which brought me here),
> that'd hopefully provide some decent improvements in coverage of the
> various redo routines, many of which are not currently exercised at
> all.
>
> I'm not quite sure where it belongs, though.  The attached initial
> sketch patch puts it under rc/test/recovery near other similar things,
> but I'm not sure if it's really OK to invoke make -C ../regress from
> here.  I copied pg_update/test.sh's trick of using a different
> outputdir to avoid clobbering a concurrent run under src/test/regress,
> and I also needed to invent a way to stop it from running the cursed
> tablespace test (deferring startup of the standby also works but eats
> way too much space, which I learned after blowing out a smallish
> development VM's disk).   Alternatively, we could put it under
> src/test/regress, which makes some kind of logical sense, but it would
> make a quick "make check" take more than twice as long.  Perhaps it
> could use a different target, one that check-world somehow reaches but
> not check, and that also doesn't seem great.  Ideas on how to
> structure this or improve the perl welcome.



Nice, I like adding a skip-tests option to pg_regress.

The perl looks ok - I assume those

    print "standby 1: $result\n";  

lines are there for debugging. Normally you would just process $result
using the Test::More functions


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com



Reply | Threaded
Open this post in threaded view
|

Re: A test for replay of regression tests

Tom Lane-2
In reply to this post by Andres Freund
Andres Freund <[hidden email]> writes:
> On 2021-04-23 17:37:48 +1200, Thomas Munro wrote:
>> We have automated tests for many specific replication and recovery
>> scenarios, but nothing that tests replay of a wide range of records.

> Yay.

+1

>> Add a new TAP test under src/test/recovery that runs the regression
>> tests with wal_consistency_checking=all.

> Hm. I wonder if running with wal_consistency_checking=all doesn't also
> reduce coverage of some aspects of recovery, by increasing record sizes
> etc.

Yeah.  I found out earlier that wal_consistency_checking=all is an
absolute PIG.  Running the regression tests that way requires tens of
gigabytes of disk space, and a significant amount of time if your
disk is not very speedy.  If we put this into the buildfarm at all,
it would have to be opt-in, not run-by-default, because a lot of BF
animals simply don't have the horsepower.  I think I'd vote against
adding it to check-world, too; the cost/benefit ratio is not good
unless you are specifically working on replay functions.

And as you say, it alters the behavior enough to make it a little
questionable whether we're exercising the "normal" code paths.

The two things I'd say about this are (1) Whether to use
wal_consistency_checking, and with what value, needs to be
easily adjustable. (2) People will want to run other test suites
than the core regression tests, eg contrib modules.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: A test for replay of regression tests

Andres Freund
Hi,

On 2021-04-23 11:53:35 -0400, Tom Lane wrote:

> Andres Freund <[hidden email]> writes:
> > Hm. I wonder if running with wal_consistency_checking=all doesn't also
> > reduce coverage of some aspects of recovery, by increasing record sizes
> > etc.
>
> Yeah.  I found out earlier that wal_consistency_checking=all is an
> absolute PIG.  Running the regression tests that way requires tens of
> gigabytes of disk space, and a significant amount of time if your
> disk is not very speedy.  If we put this into the buildfarm at all,
> it would have to be opt-in, not run-by-default, because a lot of BF
> animals simply don't have the horsepower.  I think I'd vote against
> adding it to check-world, too; the cost/benefit ratio is not good
> unless you are specifically working on replay functions.

I think it'd be a huge improvement to test recovery during check-world
by default - it's a huge swath of crucial code that practically has no
test coverage.  I agree that testing by default with
wal_consistency_checking=all isn't feasible due to the time & space
overhead, so I think we should not do that.


> The two things I'd say about this are (1) Whether to use
> wal_consistency_checking, and with what value, needs to be
> easily adjustable. (2) People will want to run other test suites
> than the core regression tests, eg contrib modules.

I'm not really excited about tackling 2) initially. I think it's a
significant issue that we don't have test coverage for most redo
routines and that we should change that with some urgency - even if we
back out these WAL prefetch related changes, there've been sufficiently
many changes affecting WAL that I think it's worth doing the minimal
thing soon.

I don't think there's actually that much need to test contrib modules
through recovery - most of them don't seem like they'd add meaningful
coverage? The exception is contrib/bloom, but perhaps that'd be better
tackled with a dedicated test?

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: A test for replay of regression tests

Tom Lane-2
Andres Freund <[hidden email]> writes:
> On 2021-04-23 11:53:35 -0400, Tom Lane wrote:
>> Yeah.  I found out earlier that wal_consistency_checking=all is an
>> absolute PIG.  Running the regression tests that way requires tens of
>> gigabytes of disk space, and a significant amount of time if your
>> disk is not very speedy.  If we put this into the buildfarm at all,
>> it would have to be opt-in, not run-by-default, because a lot of BF
>> animals simply don't have the horsepower.  I think I'd vote against
>> adding it to check-world, too; the cost/benefit ratio is not good
>> unless you are specifically working on replay functions.

> I think it'd be a huge improvement to test recovery during check-world
> by default - it's a huge swath of crucial code that practically has no
> test coverage.  I agree that testing by default with
> wal_consistency_checking=all isn't feasible due to the time & space
> overhead, so I think we should not do that.

I was mainly objecting to enabling wal_consistency_checking by default.
I agree it's bad that we have so little routine test coverage on WAL
replay code.

>> The two things I'd say about this are (1) Whether to use
>> wal_consistency_checking, and with what value, needs to be
>> easily adjustable. (2) People will want to run other test suites
>> than the core regression tests, eg contrib modules.

> I don't think there's actually that much need to test contrib modules
> through recovery - most of them don't seem like they'd add meaningful
> coverage? The exception is contrib/bloom, but perhaps that'd be better
> tackled with a dedicated test?

contrib/bloom is indeed the only(?) case within contrib, but in my mind
that's a proxy for what people will be needing to test out-of-core AMs.
It might not be a test to run by default, but there needs to be a way
to do it.

Also, I suspect that there are bits of GIST/GIN/SPGIST that are not
well exercised if you don't run the contrib modules that add opclasses
for those.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: A test for replay of regression tests

Andres Freund
Hi,

On 2021-04-23 13:13:15 -0400, Tom Lane wrote:
> contrib/bloom is indeed the only(?) case within contrib, but in my mind
> that's a proxy for what people will be needing to test out-of-core AMs.
> It might not be a test to run by default, but there needs to be a way
> to do it.

Hm. My experience in the past was that the best way to test an external
AM is to run the core regression tests with a different
default_table_access_method. That does require some work of ensuring the
AM is installed and the relevant extension created, which in turn
requires a different test schedule, or hacking template1.  So likely a
different test script anyway?


> Also, I suspect that there are bits of GIST/GIN/SPGIST that are not
> well exercised if you don't run the contrib modules that add opclasses
> for those.

Possible - still think it'd be best to get the minimal thing in asap,
and then try to extend further afterwards... Perfect being the enemy of
good and all that.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: A test for replay of regression tests

Thomas Munro-5
Ok, here's a new version incorporating feedback so far.

1.  Invoke pg_regress directly (no make).

2.  Use PG_TEST_EXTRA="wal_consistency_checking" as a way to opt in to
the more expensive test.

3.  Use parallel schedule rather than serial.  It's faster but also
the non-determinism might discover more things.  This required
changing the TAP test max_connections setting from 10 to 25.

4.  Remove some extraneous print statements and
check-if-data-is-replicated-using-SELECT tests that are technically
not needed (I had copied those from 001_stream_rep.pl).

v2-0001-Test-replay-of-regression-tests.patch (9K) Download Attachment