Re: Serialization errors on single threaded request

Re: Serialization errors on single threaded request

Kevin Grittner
I am absolutely sure that the database transaction is always terminated by invoking commit or rollback, and waiting for the method to come back, before the middle tier returns control to the client.
 
A couple of other potentially relevant facts: these connections do all of this work in the SERIALIZABLE transaction isolation mode, and the updates are done through ResultSet objects from prepared statements which SELECT * on the appropriate rows.
 
I read through the documentation of the error message, and of the way PostgreSQL handles the isolation levels.  This is behaving as though the time the PostgreSQL server assigns to the commit is sometimes later than the time of the subsequent transaction start, so I totally understand why you would ask the question you did.  It is also why I checked this very carefully before posting.

What happens if the timestamp of the commit is an exact match for the timestamp of the next transaction start?  What is the resolution of the time sampling?  It may be possible that we could submit several of these, on different connections, within the space of a millisecond.  Could that be a problem?  (It doesn't appear to be in my simple test cases.)
 
I don't trust the clock on the Windows client, but I wouldn't think that has anything to do with the issue.
 
-Kevin
 
 
>>> Tom Lane <[hidden email]> 08/26/05 11:10 AM >>>
"Kevin Grittner" <[hidden email]> writes:
> The problem is this:  a single thread is submitting database updates through a middle tier which has a pool of connections.  There are no guarantees of which connection will be used for any request.  Each request is committed as its own database transaction before the middle tier responds to the requester, which then immediately submits the next request.  Nothing else is hitting the database.  We are getting serialization errors.

Hm.  Are you sure your middle tier is actually waiting for the commit
to come back before it claims the transaction is done?

                        regards, tom lane


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

Re: Serialization errors on single threaded request

Kevin Grittner
Unfortunately, the original test environment has been blown away in favor of testing the 8.1 beta release.  I can confirm that the problem exists on a build of the 8.1 beta.  If it would be helpful I could set it up again on 8.0.3 to confirm.  I THINK it was actually the tip of the 8.0 stable branch as opposed to the 8.0.3 release proper.
 
We have a little more information about the failure pattern -- when we get these, it is always after there has been a rollback on the thread which eventually generates the serialization error.  So I think the pattern is:
 
ConnectionA:
  -  A series of insert/update/deletes (on tables OTHER than the progress table).
  -  Update the progress table.
  -  Commit the transaction.
ConnectionB:
  -  A series of insert/update/deletes (on tables OTHER than the progress table) fails.
  -  Roll back the transaction.
  -  Attempt each insert/update/delete individually, committing or rolling back each as we go.
  -  Attempt to update the progress table -- fail on serialization error.
 
To avoid any ambiguity in my former posts -- introducing even a very small delay between the operations on ConnectionA and ConnectionB makes the serialization error very infrequent; introducing a larger delay seems to make it go away.  I hate to consider that as a solution, however.
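
As an alternative to a fixed delay, a common way to cope with this class of error is to rerun the whole transaction when the server reports SQLSTATE 40001 (serialization_failure). The following is only a minimal Java sketch of that idea and is not taken from our middle tier: the names SerializationRetry, TxBody, and withRetry are hypothetical, and it assumes the supplied body begins, commits, and (on failure) rolls back its own transaction. It works around the symptom and says nothing about why the error occurs here.

```java
import java.sql.SQLException;

// Hypothetical helper: rerun a unit of work when it fails with SQLSTATE 40001.
final class SerializationRetry {
    // Standard SQLSTATE for serialization_failure.
    static final String SERIALIZATION_FAILURE = "40001";

    // One complete transaction attempt; assumed to roll back before throwing.
    @FunctionalInterface
    interface TxBody<T> {
        T run() throws SQLException;
    }

    static <T> T withRetry(int maxAttempts, TxBody<T> body) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return body.run();
            } catch (SQLException e) {
                // Anything other than a serialization failure is not retried.
                if (!SERIALIZATION_FAILURE.equals(e.getSQLState())) {
                    throw e;
                }
                last = e;
            }
        }
        // Every attempt failed with 40001; surface the most recent failure.
        throw last;
    }

    // Simulated usage: the first two attempts fail, the third succeeds.
    public static void main(String[] args) throws SQLException {
        final int[] attempts = {0};
        String outcome = withRetry(5, () -> {
            attempts[0]++;
            if (attempts[0] < 3) {
                throw new SQLException("could not serialize access", SERIALIZATION_FAILURE);
            }
            return "committed after " + attempts[0] + " attempts";
        });
        System.out.println(outcome); // prints "committed after 3 attempts"
    }
}
```

In real use the body would wrap the progress-table update and its commit on whichever pooled connection was handed out.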
 
I'm afraid I'm not familiar with a good way to capture the stream of communications with the database server.  If you could point me in the right direction, I'll give it my best shot.
 
I did just have a thought, though -- is there any chance that the JDBC Connection.commit is returning once the command is written to the TCP buffer, and I'm getting hurt by some network latency issues -- the Nagle algorithm or some such?  (I assume that the driver is waiting for a response from the server before returning, so this shouldn't be the issue.)  At the point that the commit confirmation is sent by the server, I assume the shared memory changes are visible to the other processes?
 
-Kevin
 
 
>>> Tom Lane <[hidden email]> 08/26/05 12:16 PM >>>
"Kevin Grittner" <[hidden email]> writes:
> What happens if the timestamp of the commit is an exact match for the
> timestamp of the next transaction start?  What is the resolution of
> the time sampling?

It's not done via timestamps: rather, each transaction takes a census
of the transaction XIDs that are running in other backends when it
starts (there is an array in shared memory that lets it get this
information cheaply).  Reliability of the system clock is not a factor.
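
To make the census idea concrete, here is a toy Java model of it. This is a sketch under simplifying assumptions and not the backend's actual code: the class SnapshotSketch and its methods are hypothetical, the census is taken in begin(), and a plain set stands in for the shared-memory array. It shows that a writer still running at census time stays invisible to that snapshot, while a writer that committed before the census is visible, with no clock involved.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of visibility by XID census; names are hypothetical.
final class SnapshotSketch {
    // Stand-in for the shared-memory array of running transaction IDs.
    private final Set<Long> running = new HashSet<>();
    private final Set<Long> committed = new HashSet<>();
    private long nextXid = 1;

    // What a transaction remembers from its start: its own XID and the census.
    static final class Snapshot {
        final long xid;
        final Set<Long> inProgressAtStart;

        Snapshot(long xid, Set<Long> inProgressAtStart) {
            this.xid = xid;
            this.inProgressAtStart = inProgressAtStart;
        }
    }

    synchronized Snapshot begin() {
        long xid = nextXid++;
        // Copy the set of other running XIDs before registering ourselves.
        Snapshot snapshot = new Snapshot(xid, new HashSet<>(running));
        running.add(xid);
        return snapshot;
    }

    synchronized void commit(Snapshot s) {
        committed.add(s.xid);
        running.remove(s.xid);
    }

    synchronized void rollback(Snapshot s) {
        running.remove(s.xid);
    }

    // True if changes made by writerXid are visible to the given snapshot.
    synchronized boolean isVisible(long writerXid, Snapshot s) {
        if (writerXid == s.xid) return true;                       // own changes
        if (writerXid > s.xid) return false;                       // started later
        if (s.inProgressAtStart.contains(writerXid)) return false; // still running at census
        return committed.contains(writerXid);                      // aborted writers stay invisible
    }

    public static void main(String[] args) {
        SnapshotSketch db = new SnapshotSketch();
        Snapshot a = db.begin();
        Snapshot overlapping = db.begin(); // starts while A is still running
        db.commit(a);
        Snapshot later = db.begin();       // starts after A has committed
        System.out.println("overlapping sees A: " + db.isVisible(a.xid, overlapping)); // false
        System.out.println("later sees A: " + db.isVisible(a.xid, later));             // true
    }
}
```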

Are you sure the server is 8.0.3?  There was a bug in prior releases
that might possibly be related:

2005-05-07 17:22  tgl

        * src/backend/utils/time/: tqual.c (REL7_3_STABLE), tqual.c
        (REL7_4_STABLE), tqual.c (REL7_2_STABLE), tqual.c (REL8_0_STABLE),
        tqual.c: Adjust time qual checking code so that we always check
        TransactionIdIsInProgress before we check commit/abort status.
        Formerly this was done in some paths but not all, with the result
        that a transaction might be considered committed for some purposes
        before it became committed for others. Per example found by Jan
        Wieck.

My recollection though is that this only affected applications that were
using SELECT FOR UPDATE.  In any case, it's pretty hard to see how this
would affect an application that is in fact waiting for the backend to
report commit-done before it launches the next transaction; the
race-condition window we were concerned about no longer exists by the
time the backend sends CommandComplete.  So my suspicion remains fixed
on that point.  Do you have any way of sniffing the network traffic of
the middle-tier to confirm that it's doing what it's supposed to?

                        regards, tom lane



Re: Serialization errors on single threaded request

Oliver Jowett
Kevin Grittner wrote:

> I'm afraid I'm not familiar with a good way to capture the stream of communications with the database server.  If you could point me in the right direction, I'll give it my best shot.

tcpdump will do the trick, something like:

    tcpdump -n -w some.output.file -s 1514 -i any tcp port 5432

Or you can pass '&loglevel=2' as part of the JDBC connection URL to have
the JDBC driver generate a log of all the messages it sends/receives (in
less detail than a full network-level capture would give you, though)
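
For illustration, a small Java helper that appends the loglevel parameter to an existing connection URL might look like the sketch below. The class name LoggingUrl, the method withLogLevel, and the host and database names are all hypothetical; only the 'loglevel=2' parameter itself comes from the suggestion above.

```java
// Hypothetical helper that adds the driver's loglevel parameter to a JDBC URL.
final class LoggingUrl {
    static String withLogLevel(String baseUrl, int level) {
        // Use '?' for the first parameter, '&' for any subsequent one.
        String separator = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + separator + "loglevel=" + level;
    }

    public static void main(String[] args) {
        // Placeholder host, database, and user; substitute your own.
        String url = withLogLevel("jdbc:postgresql://dbhost:5432/mydb?user=app", 2);
        System.out.println(url); // prints "jdbc:postgresql://dbhost:5432/mydb?user=app&loglevel=2"
    }
}
```

The resulting string is what the middle tier would pass to DriverManager.getConnection when it builds its pool.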

> I did just have a thought, though -- is there any chance that the JDBC Connection.commit is returning once the command is written to the TCP buffer, and I'm getting hurt by some network latency issues

No, the JDBC driver waits for ReadyForQuery from the backend before
returning.

-O
