Re: [GENERAL] Testing of MVCC

27 messages

Re: [GENERAL] Testing of MVCC

Matt Miller
On Mon, 2005-08-08 at 16:59 -0400, Tom Lane wrote:
> Matt Miller <[hidden email]> writes:
> > I want to write some regression tests that confirm the behavior of
> > multiple connections simultaneously going at the same tables/rows.  Is
> > there something like this already, e.g. in src/test/regress?
>
> No. ... but surely we need one.

It seems to me that contrib/dblink could greatly simplify the design and
coding of multi-user regression tests.  Is there objection to a portion
of src/test/regress depending on contrib/dblink?  I'm not sure yet how
that dependency would look, but I'm mainly wondering if there are
objections in principle to depending on contrib/.

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Re: [GENERAL] Testing of MVCC

Tom Lane-2
Matt Miller <[hidden email]> writes:
> It seems to me that contrib/dblink could greatly simplify the design and
> coding of multi-user regression tests.  Is there objection to a portion
> of src/test/regress depending on contrib/dblink?

Yes.  Given the difficulties we had in getting the contrib/dblink
regression tests to pass in the buildfarm, and the environmental
sensitivity it has, I don't think making the core tests depend on it
is a hot idea.  In any case I doubt it would be very useful, since
a script based on that still doesn't let you issue concurrent queries.

                        regards, tom lane


Re: [GENERAL] Testing of MVCC

Matt Miller
On Wed, 2005-08-10 at 16:41 -0400, Tom Lane wrote:
> Matt Miller <[hidden email]> writes:
> > It seems to me that contrib/dblink could greatly simplify the design and
> > coding of multi-user regression tests.
>
> I doubt it would be very useful, since
> a script based on that still doesn't let you issue concurrent queries.

I think it would be useful to allow a test script to first create a set
of committed and uncommitted transactions, and to then issue some
queries on another connection to confirm that the other connection has a
proper view of the database at that point.  This type of test is
serialized, but I think it would be a useful multi-user test.  Also, the
output from such a test is probably pretty easy to fit into the
diff-based validation of "make check."
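
As a rough illustration of the serialized test described above, here is a minimal Python sketch. `ToyStore` and `Session` are hypothetical in-memory stand-ins, not PostgreSQL and not any real driver; they model only the rule that other sessions see committed rows and nothing else. The point is the shape of the test: the steps run in a fixed order, so the output is deterministic and can be compared against an expected file.

```python
class ToyStore:
    """Hypothetical in-memory stand-in for a database (not PostgreSQL)."""

    def __init__(self):
        self.committed = {}  # key -> value visible to every session


class Session:
    """One 'connection'; writes stay private until commit()."""

    def __init__(self, store):
        self.store = store
        self.pending = {}

    def write(self, key, value):
        self.pending[key] = value

    def read(self, key):
        # A session sees its own uncommitted writes, then committed data.
        if key in self.pending:
            return self.pending[key]
        return self.store.committed.get(key)

    def commit(self):
        self.store.committed.update(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()


def run_serialized_test():
    """Run the steps in a fixed order and return diff-able output lines."""
    store = ToyStore()
    writer_a, writer_b, reader = Session(store), Session(store), Session(store)
    out = []

    writer_a.write("row1", "committed value")
    writer_a.commit()
    writer_b.write("row2", "uncommitted value")  # left open on purpose

    out.append(f"reader row1: {reader.read('row1')}")
    out.append(f"reader row2: {reader.read('row2')}")
    out.append(f"writer_b row2: {writer_b.read('row2')}")

    writer_b.rollback()
    out.append(f"reader row2 after rollback: {reader.read('row2')}")
    return out


if __name__ == "__main__":
    print("\n".join(run_serialized_test()))
```

In a real test the `Session` objects would be actual connections, and the returned lines would be what "make check" diffs against the expected output.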

I realize that we also need to have tests that spawn several connections
and run scripts concurrently across those connections.  I agree that
this type of test would probably not benefit fundamentally from
contrib/dblink.  However, I was grasping a bit to see how the output
from such a concurrent test would be diff'ed with an expected output in
a meaningful way.  So, to continue to progress on this problem, I
figured that a contrib/dblink dependency would at least allow me to
start coding something...

> > Is there objection to a portion
> > of src/test/regress depending on contrib/dblink?
>
> Yes.

Understood.


Re: Testing of MVCC

Matt Miller
In reply to this post by Matt Miller
On Mon, 2005-08-08 at 16:59 -0400, Tom Lane wrote:
> Matt Miller <[hidden email]> writes:
> > I want to write some regression tests that confirm the behavior of
> > multiple connections simultaneously going at the same tables/rows.  Is
> > there something like this already, e.g. in src/test/regress?
>
> No. ... but surely we need one.

The attached patch allows src/test/regress/pg_regress.sh to recognize
lines that begin with "curr_test:" in the schedule file.  Tests named on
such a line are run concurrently across multiple connections.  To make
use of this facility each test in the group must begin with the line:

select * from concurrency_test where key = '<test_name>' for update;

where <test_name> is replaced by the name of that test.  This will enable
pg_regress to start this test at the same time as the other tests in the
group.
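
To make the schedule-file idea concrete, here is a small Python sketch of how such lines might be classified. It is not the attached patch (which modifies the pg_regress.sh shell script). The function names are hypothetical, and the sketch assumes that test names after the directive are separated by whitespace and that lines starting with "#" are comments.

```python
def parse_schedule(text):
    """Split schedule lines into (directive, [test names]) pairs.

    Assumes 'directive: name1 name2 ...' with whitespace-separated names;
    blank lines and '#' comments are skipped.
    """
    groups = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        directive, _, rest = line.partition(":")
        groups.append((directive.strip(), rest.split()))
    return groups


def required_first_line(test_name):
    """The line each test in a curr_test group must begin with."""
    return (
        "select * from concurrency_test "
        f"where key = '{test_name}' for update;"
    )


def concurrent_groups(text):
    """Return only the groups that should run across multiple connections."""
    return [names for d, names in parse_schedule(text) if d == "curr_test"]
```

Given a schedule containing the line "curr_test: lock_a lock_b", `concurrent_groups` returns one group with both names, and `required_first_line("lock_a")` gives the statement that test would have to start with.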

Is this a reasonable starting point for a concurrent testing framework?

This does not address the issue of how to interpret the test output.
Maybe the simplest solution is to force test writers to generate output
that does not depend on the relative progress of any concurrent tests.
Or, maybe the "ignore:" directive in the schedule file could be employed
somehow.



Attachment: curr_test.patch (6K)

Re: Testing of MVCC

Tom Lane-2
Matt Miller <[hidden email]> writes:
> The attached patch allows src/test/regress/pg_regress.sh to recognize
> lines that begin with "curr_test:" in the schedule file.  Tests named on
> such a line are run concurrently across multiple connections.

This doesn't seem like any advance over the existing parallel-test
facility.  Synchronizing the test starts slightly more closely
isn't really going to buy anything: you still can't control or even
predict relative progress.

> Maybe the simplest solution is to force test writers to generate output
> that does not depend on the relative progress of any concurrent tests.

Well, that's exactly the situation we have now, and it's not really
adequate.

What we really need is a test program that can issue a command on one
connection (perhaps waiting for it to finish, perhaps not) and then
issue other commands on other connections, all according to a script.
I am unsure that the existing pg_regress infrastructure is the right
place to start from.  Perhaps we should look at Expect or something
similar.
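
One way to picture such a script-driven test program is the Python sketch below. Everything in it is an assumption made for illustration: the step syntax ("s1: SQL" to run and wait, "s2&: SQL" to send without waiting, "wait s2" to collect), the `run`/`submit`/`collect` method names, and the `RecordingSession` stand-in, which only records what it was asked to do and talks to no server.

```python
from typing import NamedTuple


class Step(NamedTuple):
    session: str  # which connection to use
    action: str   # "run" (wait), "submit" (don't wait) or "collect"
    sql: str      # empty for "collect"


def parse_script(text):
    """Parse the hypothetical step format described above."""
    steps = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("wait "):
            steps.append(Step(line[5:].strip(), "collect", ""))
            continue
        head, sep, sql = line.partition(":")
        if not sep:
            raise ValueError(f"cannot parse step: {line!r}")
        if head.endswith("&"):
            steps.append(Step(head[:-1].strip(), "submit", sql.strip()))
        else:
            steps.append(Step(head.strip(), "run", sql.strip()))
    return steps


class RecordingSession:
    """Stand-in session that pretends every command succeeds."""

    def __init__(self):
        self.pending = None

    def run(self, sql):
        return "ok " + sql

    def submit(self, sql):
        self.pending = sql  # remember it; the result is fetched later
        return "sent"

    def collect(self):
        sql, self.pending = self.pending, None
        return "ok " + sql


def run_script(steps, sessions):
    """Dispatch each step to its session, in script order.

    Returns a transcript of one line per step, suitable for diffing.
    """
    transcript = []
    for step in steps:
        session = sessions[step.session]
        if step.action == "collect":
            result = session.collect()
        else:
            result = getattr(session, step.action)(step.sql)
        transcript.append(f"{step.session} {step.action}: {result}")
    return transcript
```

Because the script fixes the order in which commands are sent and collected, the transcript is the same on every run, which is what the existing parallel-test facility cannot guarantee.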

                        regards, tom lane


Re: Testing of MVCC

Matt Miller
> What we really need is a test program that can issue a command on one
> connection (perhaps waiting for it to finish, perhaps not) and then
> issue other commands on other connections, all according to a script.

It seems to me that this is what contrib/dblink could allow, but when I
presented that idea earlier you replied:

> I doubt it would be very useful, since a script based on that
> still doesn't let you issue concurrent queries.

So, I guess I'm not clear on what you're thinking.

> Perhaps we should look at Expect or something similar.

Where can I get more info on Expect?


Re: Testing of MVCC

Matt Miller
> > Perhaps we should look at Expect or something similar.
>
> Where can I get more info on Expect?

I think I found it:

http://expect.nist.gov/


Re: Testing of MVCC

Michael Fuhr
In reply to this post by Matt Miller
On Mon, Aug 15, 2005 at 10:37:06PM +0000, Matt Miller wrote:
> > Perhaps we should look at Expect or something similar.
>
> Where can I get more info on Expect?

http://www.google.com/

:-)

Or here:

http://expect.nist.gov/

--
Michael Fuhr


Re: Testing of MVCC

Andrew Dunstan
In reply to this post by Tom Lane-2


Tom Lane wrote:

>What we really need is a test program that can issue a command on one
>connection (perhaps waiting for it to finish, perhaps not) and then
>issue other commands on other connections, all according to a script.
>I am unsure that the existing pg_regress infrastructure is the right
>place to start from.  Perhaps we should look at Expect or something
>similar.

Or else a harness that operates at the library/connection level rather
than trying to control a tty app.

Expect is very cool, but it would impose an extra dependency on tcl that
we don't now have for building and testing, and I am not sure how easy
or even possible it is to get it to work in a satisfactory way on
Windows. The NIST site says it's in AS Tcl, but in the docs that
accompany my copy of same, it says "Unix only" on the Expect manual page.

Just some words of caution.

One other note: please be very careful in changing pg_regress.sh -
getting it right especially on Windows was very time consuming, and it
is horribly fragile.

cheers

andrew


Re: Testing of MVCC

Tom Lane-2
Andrew Dunstan <[hidden email]> writes:
> Or else a harness that operates at the library/connection level rather
> than trying to control a tty app.

Right.  What is sort of in the back of my mind is a C program that can
open more than one connection, and it reads a script that tells it
"fire this command out on this connection".  The question at hand is
whether we can avoid re-inventing the wheel.

> Expect is very cool, but it would impose an extra dependency on tcl that
> we don't now have for building and testing,

True.  I was pointing to it more as an example of the sorts of tools
people have built for this type of problem.

I'm pretty sure there are re-implementations of Expect out there that
don't use Tcl; would you be happier with, say, a perl-based tool?

                        regards, tom lane


Re: Testing of MVCC

Andrew Dunstan


Tom Lane wrote:

>>Expect is very cool, but it would impose an extra dependency on tcl that
>>we don't now have for building and testing,
>
>True.  I was pointing to it more as an example of the sorts of tools
>people have built for this type of problem.
>
>I'm pretty sure there are re-implementations of Expect out there that
>don't use Tcl; would you be happier with, say, a perl-based tool?

Yes, because we already have a dependency on perl. But don't be
surprised if we can't find such a beast, especially one that runs under
the weird MSys DTK perl - I won't even begin to tell you the nightmares
that caused with getting buildfarm to work on Windows.

BTW, further reading indicates that AS Expect does exist for Windows,
but it's a commercial offering, not a free one. Others appear to be
somewhat limited in value, but I could be wrong.

cheers

andrew


Re: Testing of MVCC

Tom Lane-2
Andrew Dunstan <[hidden email]> writes:
> Tom Lane wrote:
>> I'm pretty sure there are re-implementations of Expect out there that
>> don't use Tcl; would you be happier with, say, a perl-based tool?

> Yes, because we already have a dependency on perl. But don't be
> surprised if we can't find such a beast, especially one that runs under
> the weird MSys DTK perl -

[ digs... ]  It looks like what I was remembering is
http://search.cpan.org/~lbrocard/Test-Expect-0.29/lib/Test/Expect.pm
which seems to leave all the interesting problems (like driving more
than one program-under-test) to the user's own devices.  Sigh.

                        regards, tom lane


Re: Testing of MVCC

Greg Stark-3

Tom Lane <[hidden email]> writes:

> [ digs... ]  It looks like what I was remembering is
> http://search.cpan.org/~lbrocard/Test-Expect-0.29/lib/Test/Expect.pm
> which seems to leave all the interesting problems (like driving more
> than one program-under-test) to the user's own devices.  Sigh.

The goal here is to find race conditions in the server, right? There's no real
chance of any race condition errors in psql as far as I can see, perhaps in
the \commands but I don't think that's what you're worried about here.

So why bother with driving multiple invocations of psql under Expect? Just use
DBD::Pg to open as many connections as you want and issue whatever queries you
want.

The driver program would be really simple. I'm not sure if you would specify
the series of queries with a perl data structure or define a text file format
that it would parse. Either seems pretty straightforward.

If you're worried about adding a dependency on DBD::Pg which would create a
circular dependency, well, it's just the test harness, it would just mean
someone would have to go build DBD::Pg before running the tests. (Personally
my inclination would be to break the cycle by including DBD::Pg in core but
that seems to be an uphill battle these days.)

--
greg



Re: Testing of MVCC

Tom Lane-2
Greg Stark <[hidden email]> writes:
> So why bother with driving multiple invocations of psql under
> Expect. Just use DBD::Pg to open as many connections as you want and
> issue whatever queries you want.

The bit that I think is missing in DBI is "issue a command and don't
wait for the result just yet".  Without that, you cannot for instance
stack up several waiters for the same lock, as you might wish to do to
verify that they get released in the correct order once the original
lock holder goes away.  Or stack up some conflicting waiters and check
to see if deadlock is detected when it should be ... or contrariwise,
not signalled when it should not be.  There's lots of stuff you can
do that isn't exactly probing for race conditions, yet would be awfully
nice to check for in a routine test suite.
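
The two checks mentioned here can be pictured with a toy model in Python. `ToyLock` is a hypothetical stand-in, not PostgreSQL's lock manager; it assumes FIFO hand-off as the "correct order" purely for illustration. `find_deadlock` assumes each session waits on at most one other session. The key property is that `request` returns immediately, which is exactly what a driver needs in order to stack up several waiters.

```python
from collections import deque


class ToyLock:
    """Hypothetical exclusive lock with a FIFO wait queue."""

    def __init__(self):
        self.holder = None
        self.waiters = deque()
        self.grant_log = []  # order in which sessions obtained the lock

    def request(self, session):
        """Non-blocking: returns True if granted now, False if queued."""
        if self.holder is None:
            self.holder = session
            self.grant_log.append(session)
            return True
        self.waiters.append(session)
        return False

    def release(self, session):
        """Release the lock and hand it to the longest-waiting session."""
        if self.holder != session:
            raise RuntimeError(f"{session} does not hold the lock")
        self.holder = None
        if self.waiters:
            self.holder = self.waiters.popleft()
            self.grant_log.append(self.holder)


def stacked_waiters_test():
    """Queue three waiters behind one holder, then release in turn."""
    lock = ToyLock()
    lock.request("holder")
    # Each request returns at once, so the driver can keep issuing more.
    queued = [lock.request(name) for name in ("w1", "w2", "w3")]
    for name in ("holder", "w1", "w2", "w3"):
        lock.release(name)
    return queued, lock.grant_log


def find_deadlock(waits_for):
    """Return True if the wait-for map (session -> session it waits on)
    contains a cycle."""
    for start in waits_for:
        seen = set()
        node = start
        while node in waits_for:
            if node in seen:
                return True
            seen.add(node)
            node = waits_for[node]
    return False
```

A test suite would assert on the grant order for the first case, and for the second would check both that a cycle such as `{"a": "b", "b": "a"}` is reported and that a plain chain of waiters is not.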

I might be wrong though, not being exactly a DBI guru ... can this
sort of thing be done?

                        regards, tom lane


Re: Testing of MVCC

Tino Wildenhain
Tom Lane wrote:

> Greg Stark <[hidden email]> writes:
>
>>So why bother with driving multiple invocations of psql under
>>Expect. Just use DBD::Pg to open as many connections as you want and
>>issue whatever queries you want.
>
>
> The bit that I think is missing in DBI is "issue a command and don't
> wait for the result just yet".  Without that, you cannot for instance
> stack up several waiters for the same lock, as you might wish to do to
> verify that they get released in the correct order once the original
> lock holder goes away.  Or stack up some conflicting waiters and check
> to see if deadlock is detected when it should be ... or contrariwise,
> not signalled when it should not be.  There's lots of stuff you can
> do that isn't exactly probing for race conditions, yet would be awfully
> nice to check for in a routine test suite.
>
> I might be wrong though, not being exactly a DBI guru ... can this
> sort of thing be done?
>
I wonder if you don't have a wrapper around libpq you can use like that?


Re: Testing of MVCC

Andrew Piskorski
In reply to this post by Tom Lane-2
On Mon, Aug 15, 2005 at 06:01:20PM -0400, Tom Lane wrote:

> What we really need is a test program that can issue a command on one
> connection (perhaps waiting for it to finish, perhaps not) and then
> issue other commands on other connections, all according to a script.

Well, using Tcl with its Tcl Threads Extension should certainly let
you easily control multiple concurrent PostgreSQL connections.  (The
Thread Extension's APIs are particularly nice for multi-threaded
programming.)  Its docs are here:

  http://cvs.sourceforge.net/viewcvs.py/tcl/thread/doc/html/

> I am unsure that the existing pg_regress infrastructure is the right
> place to start from.  Perhaps we should look at Expect or something
> similar.

I don't have any clear idea of what sort of tests you want to run
"according to a script" though, so I'm not sure whether the Tcl
Threads Extension, or Expect, or some other tool would best meet your
needs.

--
Andrew Piskorski <[hidden email]>
http://www.piskorski.com/


Re: Testing of MVCC

Tom Lane-2
In reply to this post by Tino Wildenhain
Tino Wildenhain <[hidden email]> writes:
> Tom Lane wrote:
>> The bit that I think is missing in DBI is "issue a command and don't
>> wait for the result just yet". ...
>> I might be wrong though, not being exactly a DBI guru ... can this
>> sort of thing be done?
>>
> I wonder if you don't have a wrapper around libpq you can use like that?

Sure, it wouldn't take much to create a minimal C+libpq program that
would do the basics.  But the history of testing tools teaches that
you soon find yourself wanting a whole lot more functionality, like
conditional tests, looping, etc, in the test-driver mechanism.
That's the wheel that I don't want to re-invent.  And it's a big part
of the reason why stuff like Expect and the Perl Test modules have
become so popular: you have a full scripting language right there at
your command.

Maybe the right answer is just to hack up Pg.pm or DBD::Pg to provide
the needed asynchronous-command-submission facility, and go forward
from there using the Perl Test framework.

                        regards, tom lane


Re: Testing of MVCC

Tino Wildenhain
Tom Lane wrote:

> Tino Wildenhain <[hidden email]> writes:
>
>>Tom Lane wrote:
>>
>>>The bit that I think is missing in DBI is "issue a command and don't
>>>wait for the result just yet". ...
>>>I might be wrong though, not being exactly a DBI guru ... can this
>>>sort of thing be done?
>>
>>I wonder if you don't have a wrapper around libpq you can use like that?
>
> Sure, it wouldn't take much to create a minimal C+libpq program that
> would do the basics.  But the history of testing tools teaches that

Well, no. I was just thinking Perl might have something similar to
Python's pyPgSQL module, which has both a DB-API 2 interface and
low-level access to libpq, all nicely accessible from the scripting
language. I'm using it for NOTIFY/LISTEN, for example.

> you soon find yourself wanting a whole lot more functionality, like
> conditional tests, looping, etc, in the test-driver mechanism.
> That's the wheel that I don't want to re-invent.  And it's a big part
> of the reason why stuff like Expect and the Perl Test modules have
> become so popular: you have a full scripting language right there at
> your command.

Sure, see above :)

> Maybe the right answer is just to hack up Pg.pm or DBD::Pg to provide
> the needed asynchronous-command-submission facility, and go forward
> from there using the Perl Test framework.

Is there nothing on CPAN, or whatever it's called?



Re: Testing of MVCC

Greg Stark-3
In reply to this post by Tom Lane-2
Tom Lane <[hidden email]> writes:

> Greg Stark <[hidden email]> writes:
> > So why bother with driving multiple invocations of psql under
> > Expect. Just use DBD::Pg to open as many connections as you want and
> > issue whatever queries you want.
>
> The bit that I think is missing in DBI is "issue a command and don't
> wait for the result just yet".  Without that, you cannot for instance
> stack up several waiters for the same lock, as you might wish to do to
> verify that they get released in the correct order once the original
> lock holder goes away.  Or stack up some conflicting waiters and check
> to see if deadlock is detected when it should be ... or contrariwise,
> not signalled when it should not be.  There's lots of stuff you can
> do that isn't exactly probing for race conditions, yet would be awfully
> nice to check for in a routine test suite.
>
> I might be wrong though, not being exactly a DBI guru ... can this
> sort of thing be done?

Hm.

The API is designed like that. You issue a query in one call and retrieve the
results in a separate call (or series of calls). I don't know the DBD::Pg
implementation enough to be sure it's a 1:1 mapping to Postgres wire protocol
messages but I would expect you could get pretty fine control at that level.

I doubt it's using asynchronous I/O though. Which would mean, for example,
that you can't arrange to send a message while another connection is in the
middle of receiving a large message.
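
When a driver only offers blocking calls, one common workaround is to give each connection its own worker thread, so the test driver can submit a command and collect the result later. The Python sketch below shows that pattern only; it is not how DBD::Pg works. `AsyncSession` and the `execute` callables are hypothetical, and a `threading.Event` stands in for a row lock held by another session.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AsyncSession:
    """Wrap a blocking `execute` callable so calls can be submitted
    without waiting. One worker thread per session keeps that session's
    commands in order, as a single connection would.
    """

    def __init__(self, execute):
        self._execute = execute
        self._pool = ThreadPoolExecutor(max_workers=1)

    def submit(self, sql):
        """Start the command and return a Future immediately."""
        return self._pool.submit(self._execute, sql)

    def close(self):
        self._pool.shutdown(wait=True)


def demo():
    """Stack up a blocked waiter, act on another session, then collect."""
    lock_released = threading.Event()

    def blocked_execute(sql):
        # Stand-in for a driver call that blocks on a row lock.
        lock_released.wait(timeout=5)
        return f"done: {sql}"

    def holder_execute(sql):
        if sql == "COMMIT":
            lock_released.set()  # committing releases the pretend lock
        return f"done: {sql}"

    waiter = AsyncSession(blocked_execute)
    holder = AsyncSession(holder_execute)
    try:
        pending = waiter.submit("UPDATE t SET x = 2")
        still_waiting = not pending.done()  # the driver itself did not block
        holder.submit("COMMIT").result(timeout=5)
        result = pending.result(timeout=5)
        return still_waiting, result
    finally:
        waiter.close()
        holder.close()
```

The waiter's command cannot finish until the holder commits, yet `submit` returns straight away, so the driver stays free to issue the commit on the other session.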

I think part of this boils down to a deficiency in the Postgres wire protocol
though. It doesn't allow for interleaving calls in the middle of downloading a
large results block. That means DBD::Pg would be in bad shape if it returned
control to the user while in the process of downloading query results. If the
user issued any calls to the driver in that state it would have to return some
sort of error.

By comparison DBD::Oracle can stream results to the user while still
continuing to download more results. It tries to adjust the number of records
read whenever the buffer empties to keep the network pipeline full. This
allows the user to process records while the database is still working on
executing the query and the network is still working on shipping the results.
(Obviously this works better with some plans than others.) And the driver can
cancel or issue other queries between any of these block reads.

--
greg



Re: Testing of MVCC

Andrew Dunstan
In reply to this post by Tom Lane-2


Tom Lane wrote:

>Sure, it wouldn't take much to create a minimal C+libpq program that
>would do the basics.  But the history of testing tools teaches that
>you soon find yourself wanting a whole lot more functionality, like
>conditional tests, looping, etc, in the test-driver mechanism.
>That's the wheel that I don't want to re-invent.  And it's a big part
>of the reason why stuff like Expect and the Perl Test modules have
>become so popular: you have a full scripting language right there at
>your command.
>
>Maybe the right answer is just to hack up Pg.pm or DBD::Pg to provide
>the needed asynchronous-command-submission facility, and go forward
>from there using the Perl Test framework.

How will we make sure it's consistent? People have widely varying
versions of DBD::Pg and DBI installed, not to mention the bewildering
array of Test::Foo modules out there (just try installing Template
Toolkit on a less than very modern perl and see yourself get into module
hell). The only way I can see of working on this path would be to keep
and make our own copies of the needed modules, and point PERL5LIB at
that collection. But that would constitute a large extra buildtime burden.

A better solution might be to hack something out of the pure perl DBD
driver and use that. It's known to have some problems, but maybe this
would be a good impetus to iron those out, and this would reduce us to
carrying a single non-compiled perl module (plus whatever test framework
we need).

cheers

andrew
