Serialization errors on single threaded request stream

Serialization errors on single threaded request stream

Kevin Grittner
I have an odd one here.  I was unable to find it with a search of the mailing lists.  I've spent a few hours trying to create a simple test case, but so far these simple cases aren't showing the problem.  I want to make sure this isn't a known problem before investing more time trying to come up with a test case sufficiently complex to expose the problem.
 
The problem is this:  a single thread is submitting database updates through a middle tier which has a pool of connections.  There are no guarantees of which connection will be used for any request.  Each request is committed as its own database transaction before the middle tier responds to the requester, which then immediately submits the next request.  Nothing else is hitting the database.  We are getting serialization errors.
 
If we add a 1 ms delay on the client side between requests to the middle tier, the frequency of these errors drops by about two orders of magnitude.  With a 100 ms delay, we haven't seen any.
 
The pattern of activity which causes the problem involves a single database transaction with inserts and updates to many tables, including one with a potentially large blob, followed by an update to a numeric column in a row which tracks progress.  The serialization errors are happening on this final update.  My simple test cases use a single thread on two JDBC connections emulating just this final update, and the problem does not show up.
 
We have the same behavior on 8.0.3 and the development snapshot from yesterday.  (I haven't gotten a test run from today's beta release yet -- I need to coordinate the test with someone else who's not here right now.  I'll follow up if the beta release changes this behavior.)
 
The server is SuSE 9.3 with dual Xeons and XFS on a SAN.  The client and middle tier for these tests have been on Windows XP.  The requests are going through JDBC.
 
Does this behavior sound familiar to anyone?
 
-Kevin
 



---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Re: Serialization errors on single threaded request stream

Tom Lane-2
"Kevin Grittner" <[hidden email]> writes:
> The problem is this:  a single thread is submitting database updates through a middle tier which has a pool of connections.  There are no guarantees of which connection will be used for any request.  Each request is committed as its own database transaction before the middle tier responds to the requester, which then immediately submits the next request.  Nothing else is hitting the database.  We are getting serialization errors.

Hm.  Are you sure your middle tier is actually waiting for the commit
to come back before it claims the transaction is done?

                        regards, tom lane


Re: Serialization errors on single threaded request stream

Tom Lane-2
In reply to this post by Kevin Grittner
"Kevin Grittner" <[hidden email]> writes:
> What happens if the timestamp of the commit is an exact match for the
> timestamp of the next transaction start?  What is the resolution of
> the time sampling?

It's not done via timestamps: rather, each transaction takes a census
of the transaction XIDs that are running in other backends when it
starts (there is an array in shared memory that lets it get this
information cheaply).  Reliability of the system clock is not a factor.
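
As a rough illustration of the census idea, here is a toy model in Python. It is not PostgreSQL source code: the names SharedState, begin, commit and visible_to are hypothetical, aborts are ignored, and the visibility rule is deliberately simplified. The point it tries to show is only that what a transaction can see depends on which XIDs were running when it started, not on any clock reading.

```python
# Toy model only: hypothetical names, not PostgreSQL internals.
from dataclasses import dataclass, field


@dataclass
class SharedState:
    """Stands in for the shared-memory array of running transaction XIDs."""
    next_xid: int = 100
    running: set = field(default_factory=set)

    def begin(self):
        xid = self.next_xid
        self.next_xid += 1
        # Census: remember which XIDs are in progress in other backends now.
        snapshot = frozenset(self.running)
        self.running.add(xid)
        return xid, snapshot

    def commit(self, xid):
        self.running.discard(xid)


def visible_to(writer_xid, reader_xid, snapshot):
    """Simplified rule: the writer started earlier and was not still
    running when the reader took its census (aborts are not modelled)."""
    return writer_xid < reader_xid and writer_xid not in snapshot


state = SharedState()
a, _ = state.begin()            # transaction A starts
b, snap_b = state.begin()       # B starts while A is still running
state.commit(a)                 # A commits
c, snap_c = state.begin()       # C starts after A's commit

print(visible_to(a, b, snap_b))  # A was in B's census, so B does not see it
print(visible_to(a, c, snap_c))  # A had finished before C's census
```

In this sketch B never sees A's work, however close in time A's commit and B's start were, because A's XID was in the census B took.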

Are you sure the server is 8.0.3?  There was a bug in prior releases
that might possibly be related:

2005-05-07 17:22  tgl

        * src/backend/utils/time/: tqual.c (REL7_3_STABLE), tqual.c
        (REL7_4_STABLE), tqual.c (REL7_2_STABLE), tqual.c (REL8_0_STABLE),
        tqual.c: Adjust time qual checking code so that we always check
        TransactionIdIsInProgress before we check commit/abort status.
        Formerly this was done in some paths but not all, with the result
        that a transaction might be considered committed for some purposes
        before it became committed for others. Per example found by Jan
        Wieck.

My recollection though is that this only affected applications that were
using SELECT FOR UPDATE.  In any case, it's pretty hard to see how this
would affect an application that is in fact waiting for the backend to
report commit-done before it launches the next transaction; the
race-condition window we were concerned about no longer exists by the
time the backend sends CommandComplete.  So my suspicion remains fixed
on that point.  Do you have any way of sniffing the network traffic of
the middle-tier to confirm that it's doing what it's supposed to?
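
To make the suspicion concrete, the following toy simulation in Python models a middle tier that replies either after or before the commit has finished. Everything in it is an assumption made for illustration (ToyServer, run_requests and reply_before_commit are hypothetical names, and no real PostgreSQL or JDBC behavior is reproduced). In particular, a real backend would first wait for the other transaction to finish before raising the error; the model skips the waiting.

```python
# Toy model only: hypothetical names, not PostgreSQL or JDBC code.
class SerializationError(Exception):
    pass


class ToyServer:
    def __init__(self):
        self.next_xid = 1
        self.running = set()     # XIDs currently in progress
        self.row_writer = None   # XID that last updated the progress row

    def begin(self):
        xid = self.next_xid
        self.next_xid += 1
        snapshot = frozenset(self.running)  # census taken at start
        self.running.add(xid)
        return xid, snapshot

    def update_progress(self, xid, snapshot):
        # The update fails if the row's latest version was written by a
        # transaction that was still running when we took our census.
        if self.row_writer is not None and self.row_writer in snapshot:
            self.running.discard(xid)       # this transaction aborts
            raise SerializationError("concurrent update")
        self.row_writer = xid

    def commit(self, xid):
        self.running.discard(xid)


def run_requests(n, reply_before_commit):
    """Submit n back-to-back requests; return the number of failures."""
    server = ToyServer()
    errors = 0
    pending = None  # XID whose commit has not finished yet
    for _ in range(n):
        xid, snapshot = server.begin()  # next request starts immediately
        if pending is not None:
            server.commit(pending)      # late commit lands after that begin
            pending = None
        try:
            server.update_progress(xid, snapshot)
        except SerializationError:
            errors += 1
            continue
        if reply_before_commit:
            pending = xid               # reply sent, commit still in flight
        else:
            server.commit(xid)          # commit finishes before the reply
    return errors


print(run_requests(5, reply_before_commit=False))
print(run_requests(5, reply_before_commit=True))
```

In this model no request fails when each commit completes before the reply, while failures appear as soon as the reply can overtake the commit. That matches the shape of the reported symptom, but it is only a sketch of the hypothesis, not evidence for it.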

                        regards, tom lane
