Query on pg_stat_activity table got stuck


Query on pg_stat_activity table got stuck

neeraj kumar
Hello,

We are using PG 10.6. We have one cron job that queries the pg_stat_activity table to find out how many queries have been running longer than X minutes and generates metrics.

The query looks like this:
SELECT * FROM pg_stat_activity WHERE state='active'

After some days, this query gets stuck and doesn't finish. We tried to run the query manually and got the same result.

We looked into the pg_locks table and this query is not blocked on any lock:
https://justpaste.it/48rpe 

Now we don't know where to look next. Our expectation was that a query on pg_stat_activity shouldn't block on anything, and pg_locks shows the same result.
Google searches also haven't returned anything useful, which is why we are reaching out to this mailing list.
We have isolated that DB and the issue is still reproducible there, in case more information is needed.

-------------------------------------
Thanks
Neeraj


Re: Query on pg_stat_activity table got stuck

neeraj kumar
It took some time to get stack traces as we didn't have root permission. Attaching stack traces of two processes (out of many) stuck on the same query below [1][2].

The top frame of every stack trace points somewhere inside this loop (https://github.com/postgres/postgres/blob/master/src/backend/postmaster/pgstat.c#L3361-L3400), so the calls are not blocked on anything; they just never manage to exit the loop.
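
For reference, the reader loop in pgstat_read_current_status() does roughly the following (a simplified sketch, not the exact 10.6 source; it assumes backend context where PgBackendStatus, pg_read_barrier(), and memcpy are available, and snapshot_one_backend is a made-up name):

    /*
     * Simplified sketch of how pgstat_read_current_status() copies one
     * backend's shared-memory entry.  The reader re-reads st_changecount
     * before and after the copy and only accepts the copy when the two
     * values match and are even.  If a writer ever leaves st_changecount
     * odd, no copy of that slot is ever accepted and the loop spins
     * forever -- which matches the stack traces below.
     */
    static void
    snapshot_one_backend(volatile PgBackendStatus *beentry,
                         PgBackendStatus *localentry)
    {
        for (;;)
        {
            int     before_changecount;
            int     after_changecount;

            before_changecount = beentry->st_changecount;
            pg_read_barrier();

            memcpy(localentry, (PgBackendStatus *) beentry,
                   sizeof(PgBackendStatus));

            pg_read_barrier();
            after_changecount = beentry->st_changecount;

            if (before_changecount == after_changecount &&
                (before_changecount & 1) == 0)
                break;          /* consistent snapshot obtained */

            /* otherwise a writer was (or appears to be) mid-update; retry */
        }
    }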

[1] 
[12:43:30][root][~]$ pstack 6283 
#0 pgstat_read_current_status () at pgstat.c:3495 
#1 0x0000000000732381 in pgstat_read_current_status () at pgstat.c:2566 
#2 pgstat_fetch_stat_numbackends () at pgstat.c:2567 
#3 0x000000000083bfef in pg_stat_get_activity (fcinfo=0x7ffd26955b80) at pgstatfuncs.c:581 
#4 0x00000000006832a1 in ExecMakeTableFunctionResult (setexpr=0x14ea907dcc60, econtext=0x14ea907dca50, argContext=<optimized out>, expectedDesc=0x14ea907df048, randomAccess=0 '\000') at execSRF.c:231 
#5 0x000000000068e7b3 in FunctionNext (node=node@entry=0x14ea907dc298) at nodeFunctionscan.c:94 
#6 0x000000000068275a in ExecScanFetch (recheckMtd=0x68e4e0 <FunctionRecheck>, accessMtd=0x68e500 <FunctionNext>, node=0x14ea907dc298) at execScan.c:97 
#7 ExecScan (node=0x14ea907dc298, accessMtd=0x68e500 <FunctionNext>, recheckMtd=0x68e4e0 <FunctionRecheck>) at execScan.c:147 
#8 0x0000000000688009 in ExecProcNode (node=0x14ea907dc298) at ../../../src/include/executor/executor.h:250 
#9 fetch_input_tuple (aggstate=aggstate@entry=0x14ea907dc4f8) at nodeAgg.c:695 
#10 0x000000000068a0af in agg_retrieve_direct (aggstate=0x14ea907dc4f8) at nodeAgg.c:2347 
#11 ExecAgg (pstate=0x14ea907dc4f8) at nodeAgg.c:2158 
#12 0x000000000067cce2 in ExecProcNode (node=0x14ea907dc4f8) at ../../../src/include/executor/executor.h:250 
#13 ExecutePlan (execute_once=<optimized out>, dest=0x14ea907a1190, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x14ea907dc4f8, estate=0x14ea907dc038) at execMain.c:1723 
#14 standard_ExecutorRun (queryDesc=0x14ea906cc038, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:364 
#15 0x000014ed56cee425 in pgss_ExecutorRun (queryDesc=0x14ea906cc038, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:891 
#16 0x000014ed56cd760e in explain_ExecutorRun (queryDesc=0x14ea906cc038, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at auto_explain.c:267 
#17 0x00000000007b328c in PortalRunSelect (portal=portal@entry=0x14ea907da038, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x14ea907a1190) at pquery.c:932 
#18 0x00000000007b4630 in PortalRun (portal=portal@entry=0x14ea907da038, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', run_once=run_once@entry=1 '\001', dest=dest@entry=0x14ea907a1190, altdest=altdest@entry=0x14ea907a1190, completionTag=0x7ffd26956530 "") at pquery.c:773 
#19 0x00000000007b0223 in exec_simple_query (query_string=0x14ed51d1f038 "select count(*) from pg_stat_activity;") at postgres.c:1145 
#20 0x00000000007b2388 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x14ed51dd42a8, dbname=0x14ed51dd4158 "db_name", username=<optimized out>) at postgres.c:4235 
#21 0x00000000004cf2ae in BackendRun (port=0x14ed51dfa380) at postmaster.c:4791 
#22 BackendStartup (port=0x14ed51dfa380) at postmaster.c:4458 
#23 ServerLoop () at postmaster.c:1930 
#24 0x0000000000739d58 in PostmasterMain (argc=argc@entry=9, argv=argv@entry=0x14ed51c246f0) at postmaster.c:1557 
#25 0x00000000004d1594 in main (argc=9, argv=0x14ed51c246f0) at main.c:228


[2]
[14:53:36][root][~]$ pstack 82504 
#0 0x000000000072e053 in pgstat_read_current_status () at pgstat.c:3467 
#1 0x0000000000732381 in pgstat_read_current_status () at pgstat.c:2566 
#2 pgstat_fetch_stat_numbackends () at pgstat.c:2567 
#3 0x000000000083bfef in pg_stat_get_activity (fcinfo=0x7ffd26955c30) at pgstatfuncs.c:581 
#4 0x00000000006832a1 in ExecMakeTableFunctionResult (setexpr=0x14ea906e95b0, econtext=0x14ea906e8a50, argContext=<optimized out>, expectedDesc=0x14ea906eb958, randomAccess=0 '\000') at execSRF.c:231 
#5 0x000000000068e7b3 in FunctionNext (node=node@entry=0x14ea906e8298) at nodeFunctionscan.c:94 
#6 0x00000000006826e7 in ExecScanFetch (recheckMtd=0x68e4e0 <FunctionRecheck>, accessMtd=0x68e500 <FunctionNext>, node=0x14ea906e8298) at execScan.c:97 
#7 ExecScan (node=0x14ea906e8298, accessMtd=0x68e500 <FunctionNext>, recheckMtd=0x68e4e0 <FunctionRecheck>) at execScan.c:164 
#8 0x0000000000688009 in ExecProcNode (node=0x14ea906e8298) at ../../../src/include/executor/executor.h:250 
#9 fetch_input_tuple (aggstate=aggstate@entry=0x14ea906e84f8) at nodeAgg.c:695 
#10 0x000000000068a0af in agg_retrieve_direct (aggstate=0x14ea906e84f8) at nodeAgg.c:2347 
#11 ExecAgg (pstate=0x14ea906e84f8) at nodeAgg.c:2158 
#12 0x000000000067cce2 in ExecProcNode (node=0x14ea906e84f8) at ../../../src/include/executor/executor.h:250 
#13 ExecutePlan (execute_once=<optimized out>, dest=0x14ed51d1f448, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x14ea906e84f8, estate=0x14ea906e8038) at execMain.c:1723 
#14 standard_ExecutorRun (queryDesc=0x14ed51d854b8, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:364 
#15 0x000014ed56cee425 in pgss_ExecutorRun (queryDesc=0x14ed51d854b8, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:891 
#16 0x000014ed56cd760e in explain_ExecutorRun (queryDesc=0x14ed51d854b8, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at auto_explain.c:267 
#17 0x00000000007b328c in PortalRunSelect (portal=portal@entry=0x14ed51e08038, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x14ed51d1f448) at pquery.c:932 
#18 0x00000000007b4630 in PortalRun (portal=portal@entry=0x14ed51e08038, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', run_once=<optimized out>, dest=dest@entry=0x14ed51d1f448, altdest=altdest@entry=0x14ed51d1f448, completionTag=0x7ffd26956730 "") at pquery.c:773 
#19 0x00000000007b200c in exec_execute_message (max_rows=9223372036854775807, portal_name=0x14ed51d1f038 "") at postgres.c:2030 
#20 PostgresMain (argc=<optimized out>, argv=argv@entry=0x14ed51d6d088, dbname=0x14ed51d6d068 "db_name", username=<optimized out>) at postgres.c:4298 
#21 0x00000000004cf2ae in BackendRun (port=0x14ed51dfa380) at postmaster.c:4791 
#22 BackendStartup (port=0x14ed51dfa380) at postmaster.c:4458 
#23 ServerLoop () at postmaster.c:1930 
#24 0x0000000000739d58 in PostmasterMain (argc=argc@entry=9, argv=argv@entry=0x14ed51c246f0) at postmaster.c:1557 
#25 0x00000000004d1594 in main (argc=9, argv=0x14ed51c246f0) at main.c:228 
[14:53:43][root][~]$




On Mon, May 6, 2019 at 2:33 PM Tom Lane <[hidden email]> wrote:
neeraj kumar <[hidden email]> writes:
> We are using PG 10.6. We have one cron job that queries the pg_stat_activity
> table to find out how many queries have been running longer than X minutes
> and generates metrics.

> The query looks like this:
> SELECT * FROM pg_stat_activity WHERE state='active'

> After some days, this query gets stuck and doesn't finish. We tried to run
> the query manually and got the same result.

> We looked into the pg_locks table and this query is not blocked on any
> lock:
> https://justpaste.it/48rpe

Interesting.  Can you get a stack trace to show where in the code it's
stuck?

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

                        regards, tom lane


--
-------------------------------------
Thanks
Neeraj Kumar,
+1  (206) 427-7267

Re: Query on pg_stat_activity table got stuck

neeraj kumar
Yes we use SSL to connect to DB.


From the comments it seems like each backend should have its own copy of PgBackendStatus, which means st_changecount should be separate for each process. If st_changecount was corrupted for one or two processes, how can it impact a newly created process? Could you please help us understand why a new query run from a new console (i.e. a new process) also gets stuck?


On Wed, May 8, 2019 at 11:12 AM Tom Lane <[hidden email]> wrote:
neeraj kumar <[hidden email]> writes:
> It took some time to get stack traces as we didn't have root permission.
> Attaching stack traces of two processes (out of many) stuck on the same
> query below [1][2].

Hmm, the line numbers in your stack traces don't agree with either v10
or HEAD branches for me.  But assuming that you've correctly identified
where it's stuck:

> Seems like call is unable to come out of this loop :
> https://github.com/postgres/postgres/blob/master/src/backend/postmaster/pgstat.c#L3361-L3400

the only really obvious theory is that some process left its
st_changecount odd, which would more or less have to imply that
something threw an error between pgstat_increment_changecount_before
and pgstat_increment_changecount_after.  There's only one place
where that seems very plausible, namely pgstat_bestart, which is
doing a rather scary amount of stuff in between.  Are you using
either SSL or GSS?

                        regards, tom lane
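
For reference, the writer side of that protocol amounts to the following pair of steps (a paraphrase of the pgstat_increment_changecount_before/after macros Tom mentions; the macro names and exact barrier placement below are illustrative, not the actual source):

    /*
     * Paraphrase of the writer-side changecount protocol.  The writer bumps
     * the counter to an odd value ("update in progress"), modifies its
     * fields, then bumps it back to an even value ("consistent").  If an
     * error is thrown between the two steps -- e.g. somewhere inside
     * pgstat_bestart(), as suspected above -- the counter is left odd and
     * every reader of that slot spins.  Macro names here are made up.
     */
    #define BEENTRY_BEGIN_WRITE(beentry) \
        do { \
            (beentry)->st_changecount++;        /* even -> odd */ \
            pg_write_barrier(); \
        } while (0)

    #define BEENTRY_END_WRITE(beentry) \
        do { \
            pg_write_barrier(); \
            (beentry)->st_changecount++;        /* odd -> even */ \
            Assert(((beentry)->st_changecount & 1) == 0); \
        } while (0)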


--
-------------------------------------
Thanks
Neeraj Kumar,
+1  (206) 427-7267

Re: Query on pg_stat_activity table got stuck

Jeremy Schneider-2
On 5/8/19 14:58, Tom Lane wrote:

> neeraj kumar <[hidden email]> writes:
>> Yes we use SSL to connect to DB.
>
> Hm.  I'm suspicious that one of the functions that fetch data for
> an SSL connection threw an error.  In particular, it doesn't look
> to be hard at all to make X509_NAME_to_cstring fall over --- an
> encoding conversion failure would do it, even without any stretchy
> assumptions about OOM this early in backend start.  Have you got
> any SSL certificates floating around with non-ASCII subject name
> or issuer name?

Crazy timing. We just had a report come in from a database in the RDS
fleet that's hitting this same issue.  It was one of the Aurora systems,
but there wasn't anything Aurora-specific that I could see in the
relevant bits of code.

Seems to me that at a minimum, this loop shouldn't go on forever. Even
having an arbitrary, crazy high, hard-coded number of attempts before
failure (like a million) would be better than spinning on the CPU
forever - which is what we are seeing.
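
A sketch of that idea, applied to the copy loop (the limit, names, and return convention are arbitrary; this is an illustration, not a proposed patch):

    /*
     * Bounded version of the reader loop: give up after a very large number
     * of retries instead of spinning forever.  The caller can then report
     * the slot as unreadable rather than hanging the whole query.
     */
    #define BEENTRY_MAX_READ_RETRIES    1000000

    static bool
    snapshot_one_backend_bounded(volatile PgBackendStatus *beentry,
                                 PgBackendStatus *localentry)
    {
        int     retries;

        for (retries = 0; retries < BEENTRY_MAX_READ_RETRIES; retries++)
        {
            int     before_changecount = beentry->st_changecount;

            pg_read_barrier();
            memcpy(localentry, (PgBackendStatus *) beentry,
                   sizeof(PgBackendStatus));
            pg_read_barrier();

            if (before_changecount == beentry->st_changecount &&
                (before_changecount & 1) == 0)
                return true;    /* consistent copy obtained */
        }
        return false;           /* writer never released the slot */
    }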

Would be even cooler to detect and correct a broken slot in
PgBackendStatus... if I have a good idea I'll post/try it.  :)

-Jeremy

--
Jeremy Schneider
Database Engineer
Amazon Web Services



Re: Query on pg_stat_activity table got stuck

neeraj kumar
We got more information about this issue. There is one backend entry still present in beentry that has an odd changecount value; however, that process is long gone/terminated. This means that when the process was killed/terminated, its entry was not cleaned up in the beentry list. Is there some shutdown hook that cleans up beentry when a process is killed/terminated which somehow was not invoked?

These are some of the fields of the corrupted beentry entry that is still hanging around:
st_changecount = 1407, st_procpid = 0, st_backendType = B_BACKEND, st_proc_start_timestamp = 610236763633421, st_xact_start_timestamp = 0, st_clienthostname = 0x9000023d480 "", st_ssl = 1 '\001', st_sslstatus = 0x90000c60f80, st_state = STATE_IDLEINTRANSACTION_ABORTED,




On Thu, May 9, 2019 at 1:00 PM Tom Lane <[hidden email]> wrote:
Jeremy Schneider <[hidden email]> writes:
> Seems to me that at a minimum, this loop shouldn't go on forever. Even
> having an arbitrary, crazy high, hard-coded number of attempts before
> failure (like a million) would be better than spinning on the CPU
> forever - which is what we are seeing.

I don't think it's the readers' fault.  The problem is that the
writer is violating the protocol.  If we put an upper limit on
the number of spin cycles on the reader side, we'll just be creating
a new failure mode when a writer gets swapped out at the wrong moment.

IMO we need to (a) get the failure-prone code out of the critical
section, and then (b) fix the pgstat_increment_changecount macros
so that the critical sections around these shmem changes really are
critical sections (ie bump CritSectionCount).  That way, if somebody
makes the same mistake again, at least there'll be a pretty obvious
failure rather than a lot of stuck readers.

                        regards, tom lane
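
A rough illustration of point (b): wrap the update window in a real critical section, so an ERROR thrown mid-update is promoted to a PANIC instead of silently leaving st_changecount odd (macro names below are made up; this is a sketch of the idea, not the committed change):

    /*
     * Sketch only: START_CRIT_SECTION()/END_CRIT_SECTION() bump
     * CritSectionCount, and any ERROR raised while it is nonzero is turned
     * into a PANIC.  That makes a protocol violation loudly visible instead
     * of producing stuck readers.
     */
    #define BEENTRY_BEGIN_WRITE_CRIT(beentry) \
        do { \
            START_CRIT_SECTION(); \
            (beentry)->st_changecount++; \
            pg_write_barrier(); \
        } while (0)

    #define BEENTRY_END_WRITE_CRIT(beentry) \
        do { \
            pg_write_barrier(); \
            (beentry)->st_changecount++; \
            Assert(((beentry)->st_changecount & 1) == 0); \
            END_CRIT_SECTION(); \
        } while (0)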


--
-------------------------------------
Thanks
Neeraj Kumar,
+1  (206) 427-7267

Re: Query on pg_stat_activity table got stuck

neeraj kumar
Tom, maybe I didn't make my point clear.
There are two issues:
1) Why was this value left odd?
2) Why is the backend entry still present in beentry for a backend process even after it was killed/terminated?

I am talking about the 2nd issue. My understanding is that a query on pg_stat_activity walks all backend entries via beentry and finds this corrupted entry. When a process terminates, its backend entry in beentry should ideally have been cleaned up as well. Why is it still there? Whose responsibility is it to remove the entry from beentry when a process terminates? Maybe that part is broken?

Ideally this issue should have persisted only while the process was alive. If the entry had been removed from beentry after the process was killed, the system would have auto-recovered.



On Thu, May 9, 2019 at 8:25 PM Tom Lane <[hidden email]> wrote:
neeraj kumar <[hidden email]> writes:
> We got more information about this issue. There is one backend entry
> still present in beentry that has an odd changecount value; however, that
> process is long gone/terminated. This means that when the process was
> killed/terminated, its entry was not cleaned up in the beentry list. Is
> there some shutdown hook that cleans up beentry when a process is
> killed/terminated which somehow was not invoked?

No, that's the point: there isn't any such cleanup code.  The design
idea here was that we'd only have short straight-line code segments
with no possible way to throw an error while changecount is odd.
Then somebody abused the heck out of that ...

                        regards, tom lane


--
-------------------------------------
Thanks
Neeraj Kumar,
+1  (206) 427-7267

Re: Query on pg_stat_activity table got stuck

neeraj kumar
There are multiple ways to see this problem. The way I am looking at it is: how will the system auto-recover from this particular state?

Ideally, if st_procpid is set to zero it means the process has already terminated, although it might have left some corrupted information in memory. So when other components try to read beentry, they should also check whether st_procpid is already zero; if it is, the process is gone and there is no need to consider it any more.

I agree this only addresses this particular pg_stat_activity issue, but I don't see any harm in adding that check.
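
Concretely, the check I have in mind would be something like this inside the reader's per-entry loop (sketch only):

    /*
     * Sketch of the suggested reader-side check: if st_procpid is already
     * zero, treat the slot as belonging to a dead backend and skip it
     * rather than retrying until its changecount settles.  (As noted in the
     * quoted message below, readers normally aren't supposed to trust any
     * field while the changecount is odd, so this is illustration only.)
     */
    if (beentry->st_procpid == 0)
        continue;               /* backend is gone; don't wait on its slot */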


On Thu, May 9, 2019 at 9:29 PM Tom Lane <[hidden email]> wrote:
neeraj kumar <[hidden email]> writes:
> Tom, maybe I didn't make my point clear.
> There are two issues:
> 1) Why was this value left odd?

Because a function called by pgstat_bestart threw an error, is what
I'm guessing.

> 2) Why is the backend entry still present in beentry for a backend process
> even after it was killed/terminated?
> I am talking about the 2nd issue. My understanding is that a query on
> pg_stat_activity walks all backend entries via beentry and finds this
> corrupted entry. When a process terminates, its backend entry in beentry
> should ideally have been cleaned up as well. Why is it still there? Whose
> responsibility is it to remove the entry from beentry when a process
> terminates?

If you're imagining that something takes an electronic hacksaw to shared
memory and physically removes that array entry, you're mistaken.  It's
still going to be there.  Now, there *is* cleanup code --- this bit of
pgstat_beshutdown_hook is supposed to mark the process's entry as
not-in-use:

        /*
         * Clear my status entry, following the protocol of bumping st_changecount
         * before and after.  We use a volatile pointer here to ensure the
         * compiler doesn't try to get cute.
         */
        pgstat_increment_changecount_before(beentry);

        beentry->st_procpid = 0;        /* mark invalid */

        pgstat_increment_changecount_after(beentry);

However, if something had left st_changecount odd before we got to this,
it'd still be odd afterwards.  Sure, st_procpid is now zero, but that
doesn't help readers because they're not supposed to believe st_procpid
is valid unless the changecount is even.

You could maybe argue that this cleanup code should be trying to ensure
that the changecount is left even, but I don't think that's an appropriate
response.  If it's not even when we get here, we already screwed up very
badly, because we were breaking the protocol for (potentially) as long as
the process has been in existence.  Moreover, if you're worried about
corner cases where we did mess that up, what of corner cases where process
exit fails before getting here?  I think the right answer is to bring the
hammer down as soon as we mess up, which is what the critical-section
mechanism is for.

                        regards, tom lane


--
-------------------------------------
Thanks
Neeraj Kumar,
+1  (206) 427-7267

Re: Query on pg_stat_activity table got stuck

neeraj kumar
Also curious: why does the query on pg_stat_activity consider terminated processes at all? Corrupted state or not, ideally a query on pg_stat_activity should ignore already-terminated processes.
My 2 cents.


--
-------------------------------------
Thanks
Neeraj Kumar,
+1  (206) 427-7267

Re: Query on pg_stat_activity table got stuck

neeraj kumar
> But the right answer is to fix it on the writing side.
Yes, I agree with this part. Even though the probability is very low, a process can still be killed in the middle of writing. So what is your suggestion on how to recover from this automatically?

On Fri, May 10, 2019 at 9:14 AM Tom Lane <[hidden email]> wrote:
neeraj kumar <[hidden email]> writes:
> Also curious: why does the query on pg_stat_activity consider terminated
> processes at all?

The short answer to that is that this bug leaves shared memory in
a corrupt state.  It's not really useful to worry about whether
readers should react differently to that --- you could spend endless
amounts of time hypothesizing different kinds of corruption and
endless amounts of code/cycles trying to work around it.  But the
right answer is to fix it on the writing side.

                        regards, tom lane


--
-------------------------------------
Thanks
Neeraj Kumar,
+1  (206) 427-7267