BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query


PG Bug reporting form
The following bug has been logged on the website:

Bug reference:      15585
Logged by:          Sergei Kornilov
Email address:      [hidden email]
PostgreSQL version: 10.6
Operating system:   Ubuntu 16.04.2 LTS
Description:        

Hello
I have encountered an issue with an infinitely hanging parallel query: many
processes are waiting on the LWLock DynamicSharedMemoryControlLock (a client
backend and several background workers), and one client backend has been
waiting on IPC BgWorkerShutdown forever (for hours) and is unkillable.
Initially this issue was spotted on 1 Jan on version 10.5; I then shut down
the database (immediate mode, since I was unable to shut down cleanly),
installed gdb and the dbg packages, and upgraded to 10.6. Yesterday I hit this
issue again and saved some backtraces (the query had been hanging for about
half an hour).
It seems a bit similar to BUG #15290 [1], but this server was already on 10.5
and the LWLock is different.

Parallel leader:
datid            | 16400
datname          | db
pid              | 28927
usesysid         | 852050794
usename          | user
application_name | ...
client_addr      | 127.0.0.1
client_hostname  |
client_port      | 58326
backend_start    | 2019-01-09 10:54:02.50649+00
xact_start       | 2019-01-09 10:56:55.802265+00
query_start      | 2019-01-09 10:56:55.80524+00
state_change     | 2019-01-09 10:56:55.805243+00
wait_event_type  | IPC
wait_event       | BgWorkerShutdown
state            | active
backend_xid      |
backend_xmin     | 2047764791
query            | SELECT ...
backend_type     | client backend

#0  0x00007f198e58b9f3 in __epoll_wait_nocancel () at
../sysdeps/unix/syscall-template.S:84
#1  0x0000555d7fe393d1 in WaitEventSetWaitBlock (nevents=1,
occurred_events=0x7ffe64b3c320, cur_timeout=-1, set=0x555d8108fec8) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/latch.c:1048
#2  WaitEventSetWait (set=set@entry=0x555d8108fec8,
timeout=timeout@entry=-1,
occurred_events=occurred_events@entry=0x7ffe64b3c320,
nevents=nevents@entry=1, wait_event_info=wait_event_info@entry=134217728)
    at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/latch.c:1000
#3  0x0000555d7fe39844 in WaitLatchOrSocket (latch=0x7f15ab5a80b4,
wakeEvents=wakeEvents@entry=17, sock=sock@entry=-1, timeout=-1,
timeout@entry=0, wait_event_info=wait_event_info@entry=134217728)
    at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/latch.c:385
#4  0x0000555d7fe398f5 in WaitLatch (latch=<optimized out>,
wakeEvents=wakeEvents@entry=17, timeout=timeout@entry=0,
wait_event_info=wait_event_info@entry=134217728)
    at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/latch.c:339
#5  0x0000555d7fddce3b in WaitForBackgroundWorkerShutdown
(handle=0x555d810b2db0) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/bgworker.c:1147
#6  0x0000555d7fc0211d in WaitForParallelWorkersToExit (pcxt=0x555d810b2938,
pcxt=0x555d810b2938) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/access/transam/parallel.c:659
#7  0x0000555d7fc02e47 in DestroyParallelContext (pcxt=0x555d810b2938) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/access/transam/parallel.c:741
#8  0x0000555d7fc03458 in AtEOXact_Parallel (isCommit=isCommit@entry=0
'\000') at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/access/transam/parallel.c:1010
#9  0x0000555d7fc0e807 in AbortTransaction () at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/access/transam/xact.c:2566
#10 0x0000555d7fc0ef65 in AbortCurrentTransaction () at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/access/transam/xact.c:3125
#11 0x0000555d7fe5c7cd in PostgresMain (argc=1,
argv=argv@entry=0x555d81009a08, dbname=0x555d810370f8 "shard_sw_production",
username=0x555d81037128 "startwire_background")
    at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/tcop/postgres.c:3908
#12 0x0000555d7fb93509 in BackendRun (port=0x555d81032d80) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/postmaster.c:4405
#13 BackendStartup (port=0x555d81032d80) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/postmaster.c:4077
#14 ServerLoop () at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/postmaster.c:1755
#15 0x0000555d7fdeb6bb in PostmasterMain (argc=5, argv=<optimized out>) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/postmaster.c:1363
#16 0x0000555d7fb94905 in main (argc=5, argv=0x555d81007850) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/main/main.c:228

One another locked process:

datid            | 16400
datname          | db
pid              | 22070
usesysid         | 852050794
usename          | user
application_name | ...
client_addr      | 127.0.0.1
client_hostname  |
client_port      | 57762
backend_start    | 2019-01-09 10:16:45.773862+00
xact_start       | 2019-01-09 10:57:38.902604+00
query_start      | 2019-01-09 10:57:38.90572+00
state_change     | 2019-01-09 10:57:38.905724+00
wait_event_type  | LWLock
wait_event       | DynamicSharedMemoryControlLock
state            | active
backend_xid      |
backend_xmin     | 2047768195
query            | SELECT ...
backend_type     | client backend

#0  0x00007f199087a827 in futex_abstimed_wait_cancelable (private=128,
abstime=0x0, expected=0, futex_word=0x7f11d620c438) at
../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  do_futex_wait (sem=sem@entry=0x7f11d620c438, abstime=0x0) at
sem_waitcommon.c:111
#2  0x00007f199087a8d4 in __new_sem_wait_slow (sem=0x7f11d620c438,
abstime=0x0) at sem_waitcommon.c:181
#3  0x00007f199087a97a in __new_sem_wait (sem=sem@entry=0x7f11d620c438) at
sem_wait.c:29
#4  0x0000555d7fdd9782 in PGSemaphoreLock (sema=0x7f11d620c438) at
pg_sema.c:316
#5  0x0000555d7fe4ca0c in LWLockAcquire (lock=0x7f11d620e600,
mode=mode@entry=LW_EXCLUSIVE) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/lmgr/lwlock.c:1233
#6  0x0000555d7fe37717 in dsm_create (size=size@entry=380920,
flags=flags@entry=1) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/dsm.c:493
#7  0x0000555d7fc0248f in InitializeParallelDSM
(pcxt=pcxt@entry=0x555d81388958) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/access/transam/parallel.c:268
#8  0x0000555d7fd266c7 in ExecInitParallelPlan (planstate=0x555d81105440,
estate=0x555d811047e8, nworkers=4) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/execParallel.c:470
#9  0x0000555d7fd36260 in ExecGather (pstate=0x555d81105120) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/nodeGather.c:158
#10 0x0000555d7fd2e70c in ExecProcNode (node=0x555d81105120) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/include/executor/executor.h:250
#11 fetch_input_tuple (aggstate=aggstate@entry=0x555d81104a78) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/nodeAgg.c:695
#12 0x0000555d7fd309cb in agg_retrieve_direct (aggstate=0x555d81104a78) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/nodeAgg.c:2347
#13 ExecAgg (pstate=0x555d81104a78) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/nodeAgg.c:2158
#14 0x0000555d7fd2250b in ExecProcNode (node=0x555d81104a78) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/include/executor/executor.h:250
#15 ExecutePlan (execute_once=<optimized out>, dest=0x555d810b5fc8,
direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>,
operation=CMD_SELECT, use_parallel_mode=<optimized out>,
planstate=0x555d81104a78,
    estate=0x555d811047e8) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/execMain.c:1723
#16 standard_ExecutorRun (queryDesc=0x555d81111338, direction=<optimized
out>, count=0, execute_once=<optimized out>) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/execMain.c:364
#17 0x00007f1987338ea5 in pgss_ExecutorRun (queryDesc=0x555d81111338,
direction=ForwardScanDirection, count=0, execute_once=<optimized out>)
    at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../contrib/pg_stat_statements/pg_stat_statements.c:889
#18 0x0000555d7fe5ebfb in PortalRunSelect
(portal=portal@entry=0x555d81026d48, forward=forward@entry=1 '\001',
count=0, count@entry=9223372036854775807, dest=dest@entry=0x555d810b5fc8)
    at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/tcop/pquery.c:932
#19 0x0000555d7fe60210 in PortalRun (portal=portal@entry=0x555d81026d48,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001',
run_once=<optimized out>, dest=dest@entry=0x555d810b5fc8,
    altdest=altdest@entry=0x555d810b5fc8, completionTag=0x7ffe64b3c780 "")
at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/tcop/pquery.c:773
#20 0x0000555d7fe5cd89 in exec_execute_message
(max_rows=9223372036854775807, portal_name=0x555d810b5bb8 "") at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/tcop/postgres.c:2007
#21 PostgresMain (argc=<optimized out>, argv=argv@entry=0x555d81009a08,
dbname=0x555d810370f8 "shard_sw_production", username=<optimized out>)
    at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/tcop/postgres.c:4180
#22 0x0000555d7fb93509 in BackendRun (port=0x555d81033e30) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/postmaster.c:4405
#23 BackendStartup (port=0x555d81033e30) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/postmaster.c:4077
#24 ServerLoop () at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/postmaster.c:1755
#25 0x0000555d7fdeb6bb in PostmasterMain (argc=5, argv=<optimized out>) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/postmaster.c:1363
#26 0x0000555d7fb94905 in main (argc=5, argv=0x555d81007850) at
/build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/main/main.c:228

I also found relevant (and strange) messages in the logs:

 2019-01-09 10:56:55.947 UTC 31156 @ from  [vxid:16/1632 txid:0] [] ERROR:
dsa_area could not attach to segment
 2019-01-09 10:56:55.947 UTC 31156 @ from  [vxid:16/1632 txid:0] []
STATEMENT:  SELECT ...
 2019-01-09 10:56:55.950 UTC 14433 @ from  [vxid: txid:0] [] LOG:  worker
process: parallel worker for PID 28927 (PID 31156) exited with exit code 1
 2019-01-09 10:56:55.950 UTC 28927 user@db from 127.0.0.1 [vxid:102/913
txid:0] [SELECT] ERROR:  dsa_area could not attach to segment
 2019-01-09 10:56:55.950 UTC 28927 user@db from 127.0.0.1 [vxid:102/913
txid:0] [SELECT] STATEMENT:  SELECT ...
 2019-01-09 10:56:55.954 UTC 31157 @ from  [vxid:32/1740 txid:0] [] FATAL:
terminating connection due to administrator command
 2019-01-09 10:56:55.954 UTC 31157 @ from  [vxid:32/1740 txid:0] []
STATEMENT:  SELECT ...
 2019-01-09 10:56:55.954 UTC 31157 @ from  [vxid:32/0 txid:0] [] FATAL:
cannot unpin a segment that is not pinned

log_line_prefix = "%m %p %u@%d from %h [vxid:%v txid:%x] [%i]"; there were no
other log messages within about 5 minutes of these.

Pid 28927 is the parallel leader stuck in BgWorkerShutdown, 31156 is a
parallel worker for this query, and 31157 appears to be another parallel
worker for the same query. Unfortunately I noticed that one too late and have
no backtrace for it, but I do have its pg_stat_activity row:

datid            | 16400
datname          | db
pid              | 22070
usesysid         | 852050794
usename          | user
application_name | ...
client_addr      |
client_hostname  |
client_port      |
backend_start    | 2019-01-09 10:56:55.839719+00
xact_start       |
query_start      | 2019-01-09 10:56:55.80524+00
state_change     | 2019-01-09 10:56:55.840737+00
wait_event_type  | LWLock
wait_event       | DynamicSharedMemoryControlLock
state            | active
backend_xid      |
backend_xmin     |
query            | SELECT ... (same query as stuck leader)
backend_type     | background worker

PostgreSQL is installed from the PGDG repo; no external modules are installed.
The queries use one native partitioned table with Parallel Index Scan or
Parallel Bitmap Heap Scan. The query itself works fine; I cannot reproduce
this issue by hand.
The same log output ("dsa_area could not attach to segment" + "cannot unpin a
segment that is not pinned") appeared on 1 Jan.

Also, is it expected to have CHECK_FOR_INTERRUPTS in
WaitForBackgroundWorkerShutdown when its only caller,
DestroyParallelContext, holds interrupts across the wait:
        /*
         * We can't finish transaction commit or abort until all of the workers
         * have exited.  This means, in particular, that we can't respond to
         * interrupts at this stage.
         */
        HOLD_INTERRUPTS();
        WaitForParallelWorkersToExit(pcxt);
        RESUME_INTERRUPTS();

Can a parallel worker be waiting for some response from the leader?

Thanks in advance!

regards, Sergei

[1]
https://www.postgresql.org/message-id/flat/153228422922.1395.1746424054206154747%40wrigleys.postgresql.org


Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query

Thomas Munro
Hi Sergei,

Thanks for the report and backtraces.  If this is a production system,
the workaround would be to disable parallel query for the problem
queries (max_parallel_workers_per_gather = 0).

On Thu, Jan 10, 2019 at 11:14 PM PG Bug reporting form
<[hidden email]> wrote:
> It seems a bit similar to BUG #15290 [1], but this server was already on 10.5
> and the LWLock is different.

Although the symptoms were a little similar, and you did mention
further down that you are using Parallel Index Scan, that bug was
fixed, and the stack looked rather different (you could see the index
scan itself was blocked).

> dsa_area could not attach to segment

This was also reported recently on 11.1 [1] and seems to imply a
problem with the recycling of DSM segment slots in DSA areas, or some
other kind of concurrency problem.  I am trying to reproduce it.

> PostgreSQL is installed from the PGDG repo; no external modules are installed.
> The queries use one native partitioned table with Parallel Index Scan or
> Parallel Bitmap Heap Scan. The query itself works fine; I cannot reproduce
> this issue by hand.
> The same log output ("dsa_area could not attach to segment" + "cannot unpin a
> segment that is not pinned") appeared on 1 Jan.

Yeah, it does seem to be rare; Justin mentioned off-list that he did
manage to reproduce the problem he reported in the other thread by
running his query in a tight loop for 8 hours.

> Also, is it expected to have CHECK_FOR_INTERRUPTS in
> WaitForBackgroundWorkerShutdown when its only caller,
> DestroyParallelContext, holds interrupts across the wait:
>         /*
>          * We can't finish transaction commit or abort until all of the workers
>          * have exited.  This means, in particular, that we can't respond to
>          * interrupts at this stage.
>          */
>         HOLD_INTERRUPTS();
>         WaitForParallelWorkersToExit(pcxt);
>         RESUME_INTERRUPTS();
>
> Can a parallel worker be waiting for some response from the leader?

The fact that the leader waits without interrupts here is a separate
topic, and I doubt we should change that; the real problem is that
something went wrong in a worker and I need to figure out what and fix
that.

[1] https://www.postgresql.org/message-id/flat/20181231221734.GB25379%40telsasoft.com


--
Thomas Munro
http://www.enterprisedb.com


Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query

Thomas Munro
On Fri, Jan 11, 2019 at 10:04 AM Thomas Munro
<[hidden email]> wrote:
> I am trying to reproduce it.

I ran squillions of concurrent parallel queries today, making sure
they would allocate and free entire segments a lot (based on the
assumption that something along those lines must be required).  I made
sure to use Linux, GCC, -O2, no asserts (since both reports came from
that environment, and I'd previously failed to reproduce this on my
usual tool chain/platforms), and I used a multi-socket box (in case
some cache coherency effect was not occurring on my development
laptops).  I did learn some interesting things about parallel query
performance that I plan to follow up on, but I had no luck in
reproducing this error.  Rats.

One observation is that the other report involved a fairly simple
Parallel Hash Join query (on 11), and this report involves Parallel
Index Scan and Parallel Bitmap Index Scan (on 10), so that suggests
that it's probably not a bug in the parallel executor code (for
example an access-after-free, whose undefined behaviour could in
theory look like this with unlucky timing, I speculate) but rather
something lower level.

--
Thomas Munro
http://www.enterprisedb.com


Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query

Maksim Milyutin


On 1/11/19 8:16 AM, Thomas Munro wrote:
> On Fri, Jan 11, 2019 at 10:04 AM Thomas Munro
> <[hidden email]> wrote:
>> I am trying to reproduce it.
>
> I ran squillions of concurrent parallel queries today, making sure
> they would allocate and free entire segments a lot (based on the
> assumption that something along those lines must be required).  I made
> sure to use Linux, GCC, -O2, no asserts (since both reports came from
> that environment, and I'd previously failed to reproduce this on my
> usual tool chain/platforms), and I used a multi-socket box (in case
> some cache coherency effect was not occurring on my development
> laptops).  I did learn some interesting things about parallel query
> performance that I plan to follow up on, but I had no luck in
> reproducing this error.  Rats.
>
> One observation is that the other report involved a fairly simple
> Parallel Hash Join query (on 11), and this report involves Parallel
> Index Scan and Parallel Bitmap Index Scan (on 10), so that suggests
> that it's probably not a bug in the parallel executor code (for
> example an access-after-free, whose undefined behaviour could in
> theory look like this with unlucky timing, I speculate) but rather
> something lower level.


Recently one of our customers encountered the same problem on PG 11.1. The query structure and error messages are similar to those described in [1].


OS/PG settings:

Ubuntu 16.04

mem 128 GB
cpu 16 cores
huge_pages is turned off

# sudo cat /proc/sys/kernel/shmmax
18446744073692774399

max_connections = 200
shared_buffers = 33GB
dynamic_shared_memory_type = posix
max_worker_processes = 16
max_parallel_maintenance_workers = 6
max_parallel_workers_per_gather = 6
max_parallel_workers = 12


The query was:

SELECT
  ev.date,
  r.indicator_id,
  r.service_reg_id,
  count(r.value),
  count(CASE WHEN r.value::int > 3 THEN 1 ELSE NULL END),
  avg(r.value::int)
FROM base.hershel_events AS ev
  INNER JOIN base.hershel_rates AS r ON r.event_id = ev.id
WHERE ev.date between '2018-09-15' AND '2018-12-15'
GROUP BY ev.date, r.indicator_id, r.service_reg_id
ORDER BY ev.date ASC

Its plan with record statistics:

 Finalize GroupAggregate (actual rows=81332937 loops=1)
   Group Key: ev.date, r.indicator_id, r.service_reg_id
   ->  Gather Merge (actual rows=81332937 loops=1)
         Workers Planned: 7
         Workers Launched: 7
         ->  Partial GroupAggregate (actual rows=10166617 loops=8)
               Group Key: ev.date, r.indicator_id, r.service_reg_id
               ->  Sort (actual rows=10166617 loops=8)
                     Sort Key: ev.date, r.indicator_id, r.service_reg_id
                     Sort Method: external merge  Disk: 549392kB
                     Worker 0:  Sort Method: external merge  Disk: 516104kB
                     Worker 1:  Sort Method: external merge  Disk: 533200kB
                     Worker 2:  Sort Method: external merge  Disk: 540680kB
                     Worker 3:  Sort Method: external merge  Disk: 540760kB
                     Worker 4:  Sort Method: external merge  Disk: 530512kB
                     Worker 5:  Sort Method: external merge  Disk: 472240kB
                     Worker 6:  Sort Method: external merge  Disk: 539256kB
                     ->  Parallel Hash Join (actual rows=10166617 loops=8)
                           Hash Cond: (r.event_id = ev.id)
                           ->  Parallel Seq Scan on hershel_rates r (actual rows=15000000 loops=8)
                           ->  Parallel Hash (actual rows=9319175 loops=8)
                                 Buckets: 4194304  Batches: 32  Memory Usage: 124064kB
                                 ->  Parallel Seq Scan on hershel_events ev (actual rows=9319175 loops=8)
                                       Filter: ((date >= '2018-09-15'::date) AND (date <= '2018-12-15'::date))
                                       Rows Removed by Filter: 4430825


At some moment one of the parallel workers emitted the error message "dsa_area could not attach to segment" to the log. After the workers terminated (the leader also hit the same error), the last one began to emit log messages of the form: LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp5027.0.sharedfileset/i55of128.p1.0", size 458752. The total size of the temp files mentioned in the log was about 14GB. Eventually that last worker got the fatal error "cannot unpin a segment that is not pinned". After that, DynamicSharedMemoryControlLock was left unreleased, and soon new backends hung on it.

On another occasion some backend (most likely a utility worker) hit the error "could not open shared memory segment "/PostgreSQL.1948558832": No such file or directory". After that, other backends blocked on DynamicSharedMemoryControlLock. Parallel aggregate queries had been running before this issue.


Unfortunately our customer doesn't want to risk hitting this error on the production system again and has turned off query parallelism, and the issue doesn't reproduce in the test environment. AFAIK this is a heisenbug that only reproduces under heavy load. I'll try to run some stress tests to capture a stack trace.


[1] https://www.postgresql.org/message-id/flat/20181231221734.GB25379%40telsasoft.com

-- 
Regards, Maksim Milyutin
Postgres Professional: http://www.postgrespro.ru
The Russian Postgres Company

Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query

Thomas Munro
On Thu, Jan 17, 2019 at 3:23 AM Maksim Milyutin <[hidden email]> wrote:
> Recently one of our customers encountered the same problem on PG 11.1. The query structure and error messages are similar to one that was described in [1].

Hi Maksim,

Thanks for this report, and apologies for not responding sooner (I was
out of the office).  Indeed, it seems quite similar to the other 2
reports.  I'm still trying to reproduce the root problem here.

--
Thomas Munro
http://www.enterprisedb.com


Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query

Sergei Kornilov
Hello

Today I got this error again on the same host (I had not yet disabled parallel query): same symptoms, same errors in the logs, same backtrace on the parallel leader. But now I also have a backtrace from the last stuck parallel worker for this query, and it looks very strange to me:

#0  0x00007eff6dada827 in futex_abstimed_wait_cancelable (private=128, abstime=0x0, expected=0, futex_word=0x7efb8be0ceb8) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  do_futex_wait (sem=sem@entry=0x7efb8be0ceb8, abstime=0x0) at sem_waitcommon.c:111
#2  0x00007eff6dada8d4 in __new_sem_wait_slow (sem=0x7efb8be0ceb8, abstime=0x0) at sem_waitcommon.c:181
#3  0x00007eff6dada97a in __new_sem_wait (sem=sem@entry=0x7efb8be0ceb8) at sem_wait.c:29
#4  0x00005624561c5782 in PGSemaphoreLock (sema=0x7efb8be0ceb8) at pg_sema.c:316
#5  0x0000562456238a0c in LWLockAcquire (lock=0x7efb8be0e600, mode=mode@entry=LW_EXCLUSIVE) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/lmgr/lwlock.c:1233
#6  0x00005624562239f6 in dsm_detach (seg=0x5624578710c8) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/dsm.c:772
#7  0x0000562456223c52 in dsm_backend_shutdown () at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/dsm.c:655
#8  0x000056245622426a in shmem_exit (code=code@entry=1) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/ipc.c:248
#9  0x0000562456224363 in proc_exit_prepare (code=code@entry=1) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/ipc.c:185
#10 0x0000562456224400 in proc_exit (code=code@entry=1) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/ipc.c:102
#11 0x0000562456359e06 in errfinish (dummy=dummy@entry=0) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/utils/error/elog.c:543
#12 0x000056245635e384 in elog_finish (elevel=elevel@entry=20, fmt=fmt@entry=0x5624564be7e0 "cannot unpin a segment that is not pinned")
    at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/utils/error/elog.c:1378
#13 0x0000562456224031 in dsm_unpin_segment (handle=1322526153) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/dsm.c:929
#14 0x000056245637e747 in dsa_release_in_place (place=0x7eff6de081f8) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/utils/mmgr/dsa.c:618
#15 0x0000562456223980 in dsm_detach (seg=0x5624578710c8) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/dsm.c:746
#16 0x0000562456223c52 in dsm_backend_shutdown () at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/dsm.c:655
#17 0x000056245622426a in shmem_exit (code=code@entry=1) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/ipc.c:248
#18 0x0000562456224363 in proc_exit_prepare (code=code@entry=1) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/ipc.c:185
#19 0x0000562456224400 in proc_exit (code=code@entry=1) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/storage/ipc/ipc.c:102
#20 0x0000562456359e06 in errfinish (dummy=<optimized out>) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/utils/error/elog.c:543
#21 0x0000562456245dbb in ProcessInterrupts () at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/tcop/postgres.c:3034
#22 0x000056245611472d in ExecScanFetch (recheckMtd=0x56245611f800 <BitmapHeapRecheck>, accessMtd=0x56245611f890 <BitmapHeapNext>, node=0x562457958fa8)
    at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/execScan.c:43
#23 ExecScan (node=0x562457958fa8, accessMtd=0x56245611f890 <BitmapHeapNext>, recheckMtd=0x56245611f800 <BitmapHeapRecheck>) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/execScan.c:164
#24 0x000056245611a3b3 in ExecProcNode (node=0x562457958fa8) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/include/executor/executor.h:250
#25 ExecAppend (pstate=0x5624579191f8) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/nodeAppend.c:222
#26 0x000056245611a70c in ExecProcNode (node=0x5624579191f8) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/include/executor/executor.h:250
#27 fetch_input_tuple (aggstate=aggstate@entry=0x562457919430) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/nodeAgg.c:695
#28 0x000056245611c866 in agg_retrieve_direct (aggstate=0x562457919430) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/nodeAgg.c:2448
#29 ExecAgg (pstate=0x562457919430) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/nodeAgg.c:2158
#30 0x000056245610e50b in ExecProcNode (node=0x562457919430) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/include/executor/executor.h:250
#31 ExecutePlan (execute_once=<optimized out>, dest=0x5624578eacf8, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x562457919430,
    estate=0x5624579188f8) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/execMain.c:1723
#32 standard_ExecutorRun (queryDesc=0x56245794d2d8, direction=<optimized out>, count=0, execute_once=<optimized out>) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/execMain.c:364
#33 0x00007eff64598ea5 in pgss_ExecutorRun (queryDesc=0x56245794d2d8, direction=ForwardScanDirection, count=0, execute_once=<optimized out>)
    at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../contrib/pg_stat_statements/pg_stat_statements.c:889
#34 0x0000562456112dfd in ParallelQueryMain (seg=<optimized out>, toc=0x7eff6de08000) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/executor/execParallel.c:996
#35 0x0000562455fef844 in ParallelWorkerMain (main_arg=<optimized out>) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/access/transam/parallel.c:1214
#36 0x00005624561c87b8 in StartBackgroundWorker () at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/bgworker.c:828
#37 0x00005624561d55fb in do_start_bgworker (rw=<optimized out>) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/postmaster.c:5728
#38 maybe_start_bgworkers () at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/postmaster.c:5941
#39 0x00005624561d62b5 in sigusr1_handler (postgres_signal_arg=<optimized out>) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/postmaster.c:5121
#40 <signal handler called>
#41 0x00007eff6b7e15b3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:84
#42 0x0000562455f7eabc in ServerLoop () at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/postmaster.c:1719
#43 0x00005624561d76bb in PostmasterMain (argc=5, argv=<optimized out>) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/postmaster/postmaster.c:1363
#44 0x0000562455f80905 in main (argc=5, argv=0x56245783f850) at /build/postgresql-10-4UiiBA/postgresql-10-10.6/build/../src/backend/main/main.c:228

We should not call dsm_backend_shutdown twice in the same process, right? Yet here we called dsm_detach on the same segment 0x5624578710c8 twice, which is unexpected behavior and leaves the refcnt incorrect. And it seems we took the lock with LWLockAcquire and then called LWLockAcquire on the same lock again without releasing it, which gives us the infinite wait.

regards, Sergei


Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query

Thomas Munro
On Thu, Jan 24, 2019 at 11:56 PM Sergei Kornilov <[hidden email]> wrote:
> We should not call dsm_backend_shutdown twice in the same process, right? Yet here we called dsm_detach on the same segment 0x5624578710c8 twice, which is unexpected behavior and leaves the refcnt incorrect. And it seems we took the lock with LWLockAcquire and then called LWLockAcquire on the same lock again without releasing it, which gives us the infinite wait.

Yeah, I think your analysis is right.  It shouldn't raise an error
while holding the lock.  dsm_unpin_segment() should perhaps release it
before it raises an error, something like:

diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 36904d2676..b989c0b94a 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -924,9 +924,15 @@ dsm_unpin_segment(dsm_handle handle)
         * called on a segment which is pinned.
         */
        if (control_slot == INVALID_CONTROL_SLOT)
+       {
+               LWLockRelease(DynamicSharedMemoryControlLock);
                elog(ERROR, "cannot unpin unknown segment handle");
+       }
        if (!dsm_control->item[control_slot].pinned)
+       {
+               LWLockRelease(DynamicSharedMemoryControlLock);
                elog(ERROR, "cannot unpin a segment that is not pinned");
+       }
        Assert(dsm_control->item[control_slot].refcnt > 1);

        /*

I have contemplated that before, but not done it because I'm not sure
about the state of the system after that; we just shouldn't be in this
situation, and if we are, erroring out means that later segments (in
the array dsa_release_in_place() loops through) remain pinned forever,
so we'll leak memory and run out of DSM slots.
Segment pinning is opting out of resource owner control, which means
the client code is responsible for not screwing it up.  Perhaps that
suggests we should PANIC, or perhaps just LOG and continue, but I'm
not sure.

I think the root cause is earlier and in a different process (see
ProcessInterrupts() in the stack).  Presumably the one that reported
"dsa_area could not attach to segment" is closer to the point where
things go wrong.  If you are in a position to reproduce this on a
modified source tree, it'd be good to see the back trace for that, to
figure out which of a couple of possible code paths reach it.  Perhaps
you could do that by enabling core files and changing this:

-                       elog(ERROR, "dsa_area could not attach to segment");
+                       elog(PANIC, "dsa_area could not attach to segment");

I have so far not succeeded in reaching that condition.
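For completeness, one typical way to enable core files before restarting the server; the binary path and data directory below are placeholders for your installation:

```shell
# Raise the core-file limit in the shell that starts the server, then verify.
ulimit -c unlimited
ulimit -c                     # should print "unlimited"

# Restart the server from this shell so the limit applies
# (placeholder data directory):
# pg_ctl -D "$PGDATA" restart

# After a PANIC, get a backtrace from the core file
# (placeholder paths for an Ubuntu PostgreSQL 10 install):
# gdb /usr/lib/postgresql/10/bin/postgres /path/to/core -ex bt -batch
```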

--
Thomas Munro
http://www.enterprisedb.com


Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query

Sergei Kornilov
Hi

> I think the root cause is earlier and in a different process (see
> ProcessInterrupt() in the stack). Presumably one that reported
> "dsa_area could not attach to segment" is closer to the point where
> things go wrong.

I agree.
Log was:
 2019-01-23 20:31:33.508 UTC 2593 @ from  [vxid:121/1810816 txid:0] [] ERROR:  dsa_area could not attach to segment
 2019-01-23 20:31:33.508 UTC 2593 @ from  [vxid:121/1810816 txid:0] [] STATEMENT:  SELECT ...
 2019-01-23 20:31:33.510 UTC 22941 @ from  [vxid: txid:0] [] LOG:  worker process: parallel worker for PID 31224 (PID 2593) exited with exit code 1
 2019-01-23 20:31:33.511 UTC 31224 user@db from 127.0.0.1 [vxid:20/25544096 txid:0] [SELECT] ERROR:  dsa_area could not attach to segment
 2019-01-23 20:31:33.511 UTC 31224 user@db from 127.0.0.1 [vxid:20/25544096 txid:0] [SELECT] STATEMENT:  SELECT ...
 2019-01-23 20:31:33.511 UTC 2595 @ from  [vxid:138/277023 txid:0] [] FATAL:  terminating connection due to administrator command
 2019-01-23 20:31:33.511 UTC 2595 @ from  [vxid:138/277023 txid:0] [] STATEMENT:  SELECT ...
 2019-01-23 20:31:33.511 UTC 2594 @ from  [vxid:136/536348 txid:0] [] ERROR:  dsa_area could not attach to segment
 2019-01-23 20:31:33.511 UTC 2594 @ from  [vxid:136/536348 txid:0] [] STATEMENT:  SELECT ...
 2019-01-23 20:31:33.512 UTC 2595 @ from  [vxid:138/0 txid:0] [] FATAL:  cannot unpin a segment that is not pinned
 2019-01-23 20:31:33.513 UTC 22941 @ from  [vxid: txid:0] [] LOG:  worker process: parallel worker for PID 31224 (PID 2594) exited with exit code 1

22941 is the postmaster, 31224 the parallel leader, 2595 a parallel worker (the backtrace above is from this process); 2593 and 2594 seem to be other parallel workers for the same query.

But I was surprised by the possible double call of dsm_backend_shutdown. Could the first process (2593 in my case) have been terminated for some reason (and therefore called shmem_exit) while only the second error made it into the log?

> If you are in a position to reproduce this

Unfortunately no, I cannot run this database with a modified source tree. And I still cannot reproduce this bug by hand: the modified source ran on my local host for a few days, without any luck...

regards, Sergei


Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query

Sergei Kornilov
Hello

Just for archive purposes:

This bug is already fixed in the stable branches and will ship in the next minor releases (11.3 and 10.8). Until the fix is released, the simplest workaround, if you're affected, is to disable parallel query with SET max_parallel_workers_per_gather = 0.
It's a rare timing bug, but some workloads seem more prone to it, so unfortunately it went undetected for a long time.

The related discussion and patch are here: https://www.postgresql.org/message-id/flat/20190207014719.GJ29720%40telsasoft.com

regards, Sergei