Clients disconnect but query still runs

Robert James
Hi.  I noticed that when clients (both psql and pgAdmin) disconnect or cancel, queries are often still running on the server.  A few questions:
1) Is there a way to reconnect and get the results?
2) Is there a way to tell postgres to automatically stop all queries when the client who queried them disconnects?
3) Is there a way to see all queries whose clients have disconnected?
4) And finally: Why is this the behavior? Doesn't this keep some very long queries running which drain performance but don't seem to benefit anyone?

Re: Clients disconnect but query still runs

Tom Lane-2
Robert James <[hidden email]> writes:
> Hi.  I noticed that when clients (both psql and pgAdmin) disconnect or
> cancel, queries are often still running on the server.  A few questions:
> 1) Is there a way to reconnect and get the results?

No.

> 2) Is there a way to tell postgres to automatically stop all queries when
> the client who queried them disconnects?

No.

> 3) Is there a way to see all queries whose clients have disconnected?

No.

> 4) And finally: Why is this the behavior?

It's not easy to tell whether a client has disconnected (particularly if
the network stack is unhelpful, which is depressingly often true).
Postgres will cancel a query if it gets told that the connection's been
dropped, but it will only discover this when an attempt to output to the
client fails.  It does not spend cycles looking aside to see if the
connection has dropped when it is doing something that doesn't involve
output to the client.

If your client code is polite enough to send a cancel request before
disconnecting, that should terminate the query reasonably promptly.
But just "yanking the plug" doesn't do that.

                        regards, tom lane

--
Sent via pgsql-general mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Clients disconnect but query still runs

Robert James
I see - thanks, Tom, for the informative explanation.
In my experience administering high volume servers, I found this to be a major failure pattern: Client tries query which seems to go on forever (either due to contention or resource exhaustion or some other problem), client gives up / fails / gets shut down or rebooted, yet the database is left hanging, working on the sloooow query, which is probably consuming all of its resources.  Perhaps the client restarts and tries again, now making the problem much worse, and the vicious cycle continues until the server is rebooted.
Is there no way to have the OS interrupt the postgres process when a TCP/IP disconnect happens? Or is the OS also in the dark that the TCP/IP connection was dropped? I believe that there is a way to monitor this using TCP/IP keepalives.
Or perhaps Postgres could check once every minute? Either way, in my experience, solving this would be a major boon to high volume servers, at least in the usage patterns I've worked with.


Re: Clients disconnect but query still runs

Albe Laurenz *EXTERN*
Robert James wrote:
> Is there no way to have the OS interrupt the postgres process
> when a TCP/IP disconnect happens? Or is the OS also in the
> dark that the TCP/IP connection was dropped? I believe that
> there is a way to monitor this using TCP/IP keep alives.
> Or perhaps Postgres could check once every minute? Either
> way, in my experience, solving this would be a major boon to
> high volume servers, at least in the usage patterns I've worked with.

The server machine has no way of knowing that the client died
unless the client closes the connection gracefully.

There are server configuration parameters "tcp_keepalives_idle",
"tcp_keepalives_interval" and "tcp_keepalives_count" which, when
used, will make the operating system check idle connections
regularly.
They are not supported on all operating systems (only on those
whose socket options include TCP_KEEPIDLE, TCP_KEEPINTVL and
TCP_KEEPCNT).

Maybe they can help you.
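
For readers who have not met these socket options, here is a minimal Python sketch of turning on keepalive probing for one socket. The helper name and the timing values are made up for illustration, and this is not a description of how the PostgreSQL server applies its settings. On a platform that lacks TCP_KEEPIDLE, TCP_KEEPINTVL or TCP_KEEPCNT, the sketch simply skips the missing option.

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive probing for one socket (hypothetical helper).

    idle, interval and count are example values, not recommendations.
    Returns the names of the per-socket options that were actually set.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = []
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)  # not defined on every platform
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            applied.append(name)
    return applied

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(enable_keepalive(s))
    s.close()
```

The returned list makes it easy to log which of the three options the local operating system actually accepted.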

Yours,
Laurenz Albe


Re: Clients disconnect but query still runs

Craig Ringer
In reply to this post by Robert James
Robert James wrote:
> I see - thanks, Tom, for the informative explanation.
> In my experience administering high volume servers, I found this to be a major
> failure pattern: Client tries query which seems to go on forever (either
> due to contention or resource exhaustion or some other problem), client
> gives up / fails / gets shut down or rebooted

The client should always make its best effort to notify the server if
it's disconnecting. How it's done depends on client OS, client program
language, etc, but it generally ends up meaning AT LEAST that the client
sends a TCP RST to the server to close the client <-> server socket.

I don't know off the top of my head if the server backend will
immediately notice an RST on the socket and terminate. If it doesn't,
then that's certainly something that'd be desirable.

If the client doesn't send an RST and just "vanishes" then of course the
server has no way to know anything's changed. As you say, you'd need to
have tcp keepalives in use to find out.

> , yet the database is left
> hanging working on the sloooow query, which is probably consuming all of
> its resources.  Perhaps the client restarts and tries again, now making
> the problem much worse, and the vicious cycle continues until the server
> is rebooted.

The server should never need to be rebooted. What about
pg_cancel_backend() ? What about killing the backends with SIGTERM (not
SIGKILL, -9) or similar?

--
Craig Ringer


Re: Clients disconnect but query still runs

Jasen Betts-5
In reply to this post by Tom Lane-2
On 2009-07-28, Tom Lane <[hidden email]> wrote:

> Robert James <[hidden email]> writes:
>> Hi.  I noticed that when clients (both psql and pgAdmin) disconnect or
>> cancel, queries are often still running on the server.  A few questions:
>> 1) Is there a way to reconnect and get the results?
>
> No.
>
>> 2) Is there a way to tell postgres to automatically stop all queries when
>> the client who queried them disconnects?
>
> No.
>
>> 3) Is there a way to see all queries whose clients have disconnected?
>
> No.
>
>> 4) And finally: Why is this the behavior?
>
> It's not easy to tell whether a client has disconnected (particularly if
> the network stack is unhelpful, which is depressingly often true).
> Postgres will cancel a query if it gets told that the connection's been
> dropped, but it will only discover this when an attempt to output to the
> client fails.  It does not spend cycles looking aside to see if the
> connection has dropped when it is doing something that doesn't involve
> output to the client.
>
> If your client code is polite enough to send a cancel request before
> disconnecting, that should terminate the query reasonably promptly.
> But just "yanking the plug" doesn't do that.

can't coerce a signal from the network stack? the linux socket(2)
manpage is full of promise (SIGPIPE, SIGURG, SIGIO)





Re: Clients disconnect but query still runs

Greg Stark-3
On Wed, Jul 29, 2009 at 1:58 PM, Jasen Betts<[hidden email]> wrote:
> can't coerce a signal from the network stack? the linux socket(2)
> manpage is full of promise (SIGPIPE, SIGURG, SIGIO)

[please don't quote the entire message back, just the part you're responding to]

Well SIGPIPE is no help since it would only fire if we tried to write
to the socket anyways.

SIGIO on the other hand looks like exactly what we would need. I'm not
sure if it can be set to fire a signal only when the connection is
disconnected and not for other state changes but if so it would be
interesting.

SIGURG might be useful but it would be more complex to use and less
widely useful since it would only work if the client disconnects
gracefully (though it might be worth checking into as an alternative
to our existing query cancel method).

--
greg
http://mit.edu/~gsstark/resume.pdf


Re: Clients disconnect but query still runs

Sam Mason
In reply to this post by Tom Lane-2
On Mon, Jul 27, 2009 at 09:49:04PM -0400, Tom Lane wrote:
> It does not spend cycles looking aside to see if the
> connection has dropped when it is doing something that doesn't involve
> output to the client.

Is this ever an interesting case?  It would seem possible for something
to test the client connections every once in a while to see if they're
still valid.

The postmaster seems like a reasonable place to do this to me; it has
all the descriptors, it just discards them at the moment.

--
  Sam  http://samason.me.uk/


Re: Clients disconnect but query still runs

Tom Lane-2
In reply to this post by Greg Stark-3
Greg Stark <[hidden email]> writes:
> On Wed, Jul 29, 2009 at 1:58 PM, Jasen Betts<[hidden email]> wrote:
>> can't coerce a signal from the network stack? the linux socket(2)
>> manpage is full of promise (SIGPIPE, SIGURG, SIGIO)

> SIGIO on the other hand looks like exactly what we would need. I'm not
> sure if it can be set to fire a signal only when the connection is
> disconnected and not for other state changes but if so it would be
> interesting.

And the other question is how much of what you read in the Linux manpage
is portable to any other system...

                        regards, tom lane


Re: Clients disconnect but query still runs

Greg Stark-3
On Wed, Jul 29, 2009 at 3:17 PM, Tom Lane<[hidden email]> wrote:
> Greg Stark <[hidden email]> writes:
>> On Wed, Jul 29, 2009 at 1:58 PM, Jasen Betts<[hidden email]> wrote:
>>> can't coerce a signal from the network stack? the linux socket(2)
>>> manpage is full of promise (SIGPIPE, SIGURG, SIGIO)
>
>
> And the other question is how much of what you read in the Linux manpage
> is portable to any other system...

That is a question. But actually I think SIGIO might be fairly
portable -- at least the first hit I found was for someone complaining
that it wasn't working on Linux (due to a bug) and this broke their
app which worked everywhere else.

In any case this would be a feature which if it didn't work would
leave us just where we are today. That's another advantage over trying
to do something with SIGURG which would be far more likely to cause
headaches if it behaved incorrectly.


--
greg
http://mit.edu/~gsstark/resume.pdf


Re: Clients disconnect but query still runs

Tom Lane-2
Greg Stark <[hidden email]> writes:
> That is a question. But actually I think SIGIO might be fairly
> portable -- at least the first hit I found was for someone complaining
> that it wasn't working on Linux (due to a bug) and this broke their
> app which worked everywhere else.

> In any case this would be a feature which if it didn't work would
> leave us just where we are today. That's another advantage over trying
> to do something with SIGURG which would be far more likely to cause
> headaches if it behaved incorrectly.

[ reads man pages for awhile... ]  It looks to me like SIGIO is sent
whenever the socket comes ready for either reading or writing, which
makes it pretty nearly useless for detecting a broken-connection
condition.  You'd be too busy filtering out uninteresting signals ---
and the signal handler itself can't do very much of that work.

                        regards, tom lane


Re: Clients disconnect but query still runs

Craig Ringer
In reply to this post by Greg Stark-3
On Wed, 2009-07-29 at 14:56 +0100, Greg Stark wrote:

> SIGURG might be useful but it would be more complex to use and less
> widely useful since it would only work if the client disconnects
> gracefully (though it might be worth checking into as an alternative
> to our existing query cancel method).

Might it not also fire if the client disconnects without notice, but tcp
keepalives are enabled?

I might have to write a little test program and see.

[much later] My test program did not appear to receive SIGURG, even
after registering for it with fcntl(sockfd, F_SETOWN, ...) and setting a
signal handler for it. This was the case whether the connection was
dropped due to a tcp keepalive failure, the dropping of a network
interface, or a normal disconnect. The next read() or recv() returned
zero bytes read but no asynchronous notification appeared to occur. I'm
under the impression it's really for use with asynchronous sockets, but
haven't tested this yet.

What does work well is occasionally poking the socket with recv(...,
MSG_DONTWAIT) while doing other work. Program attached. TCP keepalives
seem to work very well at least on my Linux test system, and it's easy
to test for a dud connection using recv(...) with the MSG_DONTWAIT and
(if desired) MSG_PEEK flags. If the connection has exited cleanly it'll
return a zero-size read; if the connection has dropped due to keepalive
failure it'll return ETIMEDOUT.
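
The probe described above can be sketched in a few lines of Python. This is not the attached test program, just a minimal illustration for Unix-like systems where MSG_DONTWAIT is available; the function name and the three return labels are invented for the example.

```python
import socket

def probe_connection(sock):
    """Peek at the socket without blocking and without consuming data.

    Returns "open", "closed" (the peer shut down cleanly) or "broken"
    (the OS reported an error such as ETIMEDOUT or ECONNRESET).
    """
    try:
        data = sock.recv(1, socket.MSG_DONTWAIT | socket.MSG_PEEK)
    except BlockingIOError:
        # Nothing to read right now and no error pending on the socket.
        return "open"
    except OSError:
        # For example ETIMEDOUT once keepalive probes went unanswered.
        return "broken"
    # A zero-length read means the peer closed its end of the connection.
    return "open" if data else "closed"
```

Because of MSG_PEEK, any byte that is seen stays in the receive buffer, so a later ordinary read still gets it.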

Pg's backend code already supports keepalives. I guess what'd be helpful
would be a periodic recv(..., MSG_DONTWAIT) on the client<->server
socket while the backend is working hard on a query. A SIGALRM would be
handy for that, though I guess Pg isn't used to having to test for EINTR
on syscalls...

--
Craig Ringer



Attachment: sockbreak3.c (5K)

Re: Clients disconnect but query still runs

Tatsuo Ishii-4
In reply to this post by Greg Stark-3
> Well SIGPIPE is no help since it would only fire if we tried to write
> to the socket anyways.

Right. For this purpose, pgpool periodically sends a param packet to the
client while waiting for a reply from the backend, to detect whether the
connection to the client is broken. If it is broken, pgpool sends a cancel
packet to the backend so as not to waste the backend machine's CPU cycles.
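
The general idea of probing by writing can be sketched as follows. This is a simplified Python illustration, not pgpool's code: the one-byte probe, `backend_ready` and `cancel_backend` are placeholders the caller would have to supply. Note also that over TCP the first write after the peer has gone may appear to succeed, with the error only surfacing on a later write.

```python
import socket
import time

def client_is_reachable(client_sock, probe=b"\0"):
    """Write a small probe to the client; a failed write means it is gone.

    The one-byte probe is a placeholder. A real proxy would send a message
    that the client protocol tolerates.
    """
    try:
        client_sock.sendall(probe)
    except (ConnectionError, TimeoutError):
        return False
    return True

def wait_for_backend(client_sock, backend_ready, cancel_backend, interval=1.0):
    """Poll until the backend has a reply, probing the client in between.

    backend_ready and cancel_backend are hypothetical callables supplied by
    the caller. Returns True if the backend answered, False if the client
    vanished and the query was cancelled.
    """
    while not backend_ready():
        if not client_is_reachable(client_sock):
            cancel_backend()
            return False
        time.sleep(interval)
    return True
```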
--
Tatsuo Ishii
SRA OSS, Inc. Japan


Re: Clients disconnect but query still runs

Greg Stark-3
On Thu, Jul 30, 2009 at 8:41 AM, Tatsuo Ishii<[hidden email]> wrote:
>> Well SIGPIPE is no help since it would only fire if we tried to write
>> to the socket anyways.
>
> Right. For this purpose, pgpool periodically sends a param packet to the
> client while waiting for a reply from the backend, to detect whether the
> connection to the client is broken. If it is broken, pgpool sends a cancel
> packet to the backend so as not to waste the backend machine's CPU cycles.

The downside to this is that it will cause spurious failures for
transient network failures even if the network comes back before it's
actually needed.

--
greg
http://mit.edu/~gsstark/resume.pdf


Re: Clients disconnect but query still runs

Greg Stark-3
In reply to this post by Craig Ringer
On Thu, Jul 30, 2009 at 7:43 AM, Craig Ringer<[hidden email]> wrote:

> On Wed, 2009-07-29 at 14:56 +0100, Greg Stark wrote:
>
>> SIGURG might be useful but it would be more complex to use and less
>> widely useful since it would only work if the client disconnects
>> gracefully (though it might be worth checking into as an alternative
>> to our existing query cancel method).
>
> Might it not also fire if the client disconnects without notice, but tcp
> keepalives are enabled?
>
> I might have to write a little test program and see.
>
> [much later] My test program did not appear to receive SIGURG, even
> after registering for it with fcntl(sockfd, F_SETOWN, ...) and setting a
> signal handler for it. This was the case whether the connection was
> dropped due to a tcp keepalive failure, the dropping of a network
> interface, or a normal disconnect. The next read() or recv() returned
> zero bytes read but no asynchronous notification appeared to occur. I'm
> under the impression it's really for use with asynchronous sockets, but
> haven't tested this yet.

Right, you'll only get SIGURG if there's actually any urgent data
received. The client would have to actively send such data
periodically. That would make this a portability headache since it
wouldn't just be an add-on which would fail gracefully if it's
unsupported. The server and client would both have to be sure they
understood whether they both supported this feature.

>
> What does work well is occasionally poking the socket with recv(...,
> MSG_DONTWAIT) while doing other work. Program attached. TCP keepalives
> seem to work very well at least on my Linux test system, and it's easy
> to test for a dud connection using recv(...) with the MSG_DONTWAIT and
> (if desired) MSG_PEEK flags. If the connection has exited cleanly it'll
> return a zero-size read; if the connection has dropped due to keepalive
> failure it'll return ETIMEDOUT.


The problem with this is that it introduces spurious failures for
transient network failures. Also it requires the server to
periodically take time out from processing the query to do this. I
think we want a zero-cost method which will interrupt processing if
the client actively disconnects. If there's a network failure we'll
find out about it in the normal course of events.


--
greg
http://mit.edu/~gsstark/resume.pdf


Re: Clients disconnect but query still runs

Csaba Nagy
Hi all,

On Thu, 2009-07-30 at 11:02 +0200, Greg Stark wrote:

> On Thu, Jul 30, 2009 at 7:43 AM, Craig Ringer<[hidden email]> wrote:
> > On Wed, 2009-07-29 at 14:56 +0100, Greg Stark wrote:
> > What does work well is occasionally poking the socket with recv(...,
> > MSG_DONTWAIT) while doing other work. Program attached. TCP keepalives
> > seem to work very well at least on my Linux test system, and it's easy
> > to test for a dud connection using recv(...) with the MSG_DONTWAIT and
> > (if desired) MSG_PEEK flags. If the connection has exited cleanly it'll
> > return a zero-size read; if the connection has dropped due to keepalive
> > failure it'll return ETIMEDOUT.
>
>
> The problem with this is that it introduces spurious failures for
> transient network failures. Also it requires the server to
> periodically take time out from processing the query to do this. I
> think we want a zero-cost method which will interrupt processing if
> the client actively disconnects. If there's a network failure we'll
> find out about it in the normal course of events.

Sorry, I have to disagree here. If there's a spurious network error, you
usually have bigger problems. I prefer to have the connection killed
even if the network recovers than to risk an idle-in-transaction connection
living forever when the client/network crashes for any reason. In case
of network failure the connection will probably be cleaned up eventually,
but it did happen to me that a client machine crashed in the middle of a
transaction while not executing any SQL, and that connection stayed
until I killed it manually. A simple ping to the client would have
made it clear that the client is not there anymore. I would also be
happy to pay the cost of pinging the clients, let's say once a minute
(or at a configurable interval). Considering that the connections are one
to one with a client, it's enough to have a single timer which periodically
signals each backend to ping its client, but these are implementation
details and I have no clue how it would best be done. The main thing
is: I would love to have this functionality. It's extremely hard to
secure all clients against crashes, and a crash of one of the clients in
the middle of a transaction can have very bad consequences (think of an
open transaction stuck indefinitely).

Cheers,
Csaba.




Re: Clients disconnect but query still runs

Greg Stark-3
On Thu, Jul 30, 2009 at 10:27 AM, Csaba Nagy<[hidden email]> wrote:
>
> Sorry, I have to disagree here. If there's a spurious network error, you
> have usually bigger problems. I prefer to have the connection killed
> even if the network recovers

I know this is a popular feeling. But you're throwing away decades of
work in making TCP reliable. You would change feelings quickly if you
ever faced this scenario too. All it takes is some bad memory or a bad
wire and you would be turning a performance drain into random
connection drops.


> than risk an idle in transaction connection
> to live forever when the client/network crashes for any reason. In case
> of network failure the connection will probably be cleaned eventually,
> but it did happen to me that a client machine crashed in the middle of a
> transaction while not executing any SQL, and that connection stayed
> until I killed it manually.

Well it ought to have eventually died. Your patience may have run out
before the keep-alive timeouts fired though.

--
greg
http://mit.edu/~gsstark/resume.pdf


Re: Clients disconnect but query still runs

Csaba Nagy
On Thu, 2009-07-30 at 11:41 +0200, Greg Stark wrote:
> I know this is a popular feeling. But you're throwing away decades of
> work in making TCP reliable. You would change feelings quickly if you
> ever faced this scenario too. All it takes is some bad memory or a bad
> wire and you would be turning a performance drain into random
> connection drops.

But if I get bad memory or bad wire I'll get much worse problems
already, and don't tell me it will work more reliably if you don't kill
the connection. It's a lot better to find out sooner that you have those
problems and fix them than having spurious errors which you'll get even
if you don't kill the connection in case of such problems.

> Well it ought to have eventually died. Your patience may have run out
> before the keep-alive timeouts fired though.

Well it lived for at least one hour (could be more, I don't remember for
sure) keeping vacuum from doing its job on a heavily updated DB. It was
not so much about my patience as about starting to have abysmal
performance, AFTER we fixed the initial cause of the crash, and without
any warning, except of course I did find out immediately that bloat
happens and found the idle transactions and killed them, but I imagine
the hair-pulling for a less experienced postgres DBA. I would have also
preferred that postgres solves this issue on its own - the network
stack is clearly not fast enough in resolving it.

Cheers,
Csaba.




Re: Clients disconnect but query still runs

Greg Stark-3
On Thu, Jul 30, 2009 at 10:59 AM, Csaba Nagy<[hidden email]> wrote:
> But if I get bad memory or bad wire I'll get much worse problems
> already, and don't tell me it will work more reliably if you don't kill
> the connection. It's a lot better to find out sooner that you have those
> problems and fix them than having spurious errors which you'll get even
> if you don't kill the connection in case of such problems.

Are you sure? Do you know how many times you haven't even found out
you had a problem because TCP just silently kept working despite the
problem?

Having had to use protocols which imposed their own timeouts on lame
hotel networks, buggy wireless drivers, and bad DSL connections and
found my connections dying every few minutes I can say it's
maddeningly frustrating. Especially knowing that TCP was *supposed* to
work in this scenario and they had broken it by trying to be clever.

>> Well it ought to have eventually died. Your patience may have run out
>> before the keep-alive timeouts fired though.
>
> Well it lived for at least one hour (could be more, I don't remember for
> sure) keeping vacuum from doing its job on a heavily updated DB. It was
> not so much about my patience as about starting to have abysmal
> performance, AFTER we fixed the initial cause of the crash, and without
> any warning, except of course I did find out immediately that bloat
> happens and found the idle transactions and killed them, but I imagine
> the hair-pulling for a less experienced postgres DBA. I would have also
> preferred that postgres solves this issue on its own - the network
> stack is clearly not fast enough in resolving it.

Indeed, properly set TCP keepalives don't time out for over 2 hours.
But that's configurable in postgresql.conf.
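
To make the arithmetic concrete, here is a small Python sketch of roughly how long keepalives take to give up on a dead peer. The function name is invented for the example, and the figures 7200 / 75 / 9 are the commonly cited Linux defaults, which is an assumption about the system in question; the three arguments correspond to the tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count settings mentioned earlier in the thread.

```python
def keepalive_detection_seconds(idle, interval, count):
    """Approximate time for the OS to declare an idle, dead peer gone.

    The connection first sits idle for `idle` seconds; then `count` probes
    are sent `interval` seconds apart, all of which go unanswered.
    """
    return idle + interval * count

if __name__ == "__main__":
    # Assumed Linux defaults: 2 hours idle, then 9 probes 75 seconds apart.
    print(keepalive_detection_seconds(7200, 75, 9))  # 7875 seconds
    # Example of much shorter, hand-picked settings.
    print(keepalive_detection_seconds(60, 10, 5))    # 110 seconds
```

With the assumed defaults the total is 7875 seconds, a little over 2 hours and 11 minutes, which is consistent with a connection that lingers for more than an hour.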

--
greg
http://mit.edu/~gsstark/resume.pdf


Re: Clients disconnect but query still runs

Craig Ringer
In reply to this post by Greg Stark-3
Greg Stark wrote:

> Right, you'll only get SIGURG if there's actually any urgent data
> received. The client would have to actively send such data
> periodically. That would make this a portability headache since it
> wouldn't just be an add-on which would fail gracefully if it's
> unsupported.

It'd also have the same issue as relying on keepalives - ie transient
network drop-outs would be much more likely to cause an unnecessary
query cancel. Worse, the client wouldn't even know about it, because it
was unreachable at the time the server sent it an RST, so it'd be waiting
for an answer from the server that'd never come...

>> What does work well is occasionally poking the socket with recv(...,
>> MSG_DONTWAIT) while doing other work.

> The problem with this is that it introduces spurious failures for
> transient network failures.

Yep, and often failures where only one side notices (unless _both_ sides
are relying on keepalives and are checking the connection status). Ick.

> Also it requires the server to
> periodically take time out from processing the query to do this.

This aspect I'm not too bothered about. I doubt it'd cost anything
detectable if done a few times a minute - unless it required
restructuring of query processing to accommodate it. In that case, no way.

> I
> think we want a zero-cost method which will interrupt processing if
> the client actively disconnects. If there's a network failure we'll
> find out about it in the normal course of events.

Personally, I'm with you. I think the _real_ problem here is clients
that're just giving up and vanishing without issuing a query cancel or,
apparently, even closing the connection.

It's _not_ hard to time out on a query and use another connection to
cancel the backend. If you've set a server-side timeout, of course,
there's no need to do anything, but if you're timing out client-side or
your user has cancelled the request, it's still not exactly hard to
clean up after yourself. It was perhaps twenty minutes' work to
implement a generic query cancel command with Java/JDBC - just spawn
another thread or grab one from a worker pool, open a new Pg connection,
issue the backend cancel, and close the connection. In the thread that
just had its backend cancelled you receive an informative SQLException.
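
The same pattern can be sketched in Python rather than Java. This is only an outline of the idea, not the JDBC code referred to above: `run_query` and `cancel_query` are hypothetical callables, where `cancel_query` would open a second connection, issue the backend cancel and close that connection again.

```python
import threading

def run_with_cancel_timer(run_query, cancel_query, timeout):
    """Run run_query() in the calling thread with a client-side timeout.

    If the query is still running after `timeout` seconds, a timer thread
    calls cancel_query(). The cancelled query then fails in the calling
    thread with whatever exception the database driver raises.
    """
    timer = threading.Timer(timeout, cancel_query)
    timer.daemon = True
    timer.start()
    try:
        return run_query()
    finally:
        # Stop the timer if the query finished (or failed) in time.
        timer.cancel()
```

A real implementation would also need to guard against the timer firing just as the query completes, so that a late cancel does not hit the next statement issued on the same connection.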

In fact, I'm not even sure _how_ one goes about exiting without sending
an RST. A quick check shows that when I `kill -9' a process with an open
client socket (ssh, in this case) the OS sends a FIN, and responds to
the server's FIN,ACK with its own ACK. So the OS is closing the socket
for the dead process. If I try this with a `psql' process, the server
cleans up the orphaned backend promptly.

So, barring network breaks (wifi down / out of range, ethernet cable
fell out, etc etc) how is the OP managing to leave backends running
queries? Hard-resetting the machine?

--
Craig Ringer
