SegFault on 9.6.14

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
46 messages Options
123
Reply | Threaded
Open this post in threaded view
|

SegFault on 9.6.14

Jerry Sievers-3
Greetings Hackers.

We have a reproduceable case of $subject that issues a backtrace such as
seen below.

The query that I'd prefer to sanitize before sending is <30 lines of at
a glance, not terribly complex logic.

It nonetheless dies hard after a few seconds of running and as expected,
results in an automatic all-backend restart.

Please advise on how to proceed.  Thanks!

bt
#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, keep_startblock=keep_startblock@entry=1 '\001')
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
#1  0x000055d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
#2  0x000055d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
#3  0x000055d7a742d36e in ExecReScan (node=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
#4  0x000055d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
#5  0x000055d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
#6  0x000055d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
#7  0x000055d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226
<about 30 lines omitted>

--
Jerry Sievers
Postgres DBA/Development Consulting
e: [hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Tomas Vondra-4
On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:

>Greetings Hackers.
>
>We have a reproduceable case of $subject that issues a backtrace such as
>seen below.
>
>The query that I'd prefer to sanitize before sending is <30 lines of at
>a glance, not terribly complex logic.
>
>It nonetheless dies hard after a few seconds of running and as expected,
>results in an automatic all-backend restart.
>
>Please advise on how to proceed.  Thanks!
>
>bt
>#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, keep_startblock=keep_startblock@entry=1 '\001')
>    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
>#1  0x000055d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
>#2  0x000055d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
>#3  0x000055d7a742d36e in ExecReScan (node=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
>#4  0x000055d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
>#5  0x000055d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
>#6  0x000055d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
>#7  0x000055d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226
><about 30 lines omitted>
>

Hmmm, that means it's crashing here:

    if (scan->rs_parallel != NULL)
        scan->rs_nblocks = scan->rs_parallel->phs_nblocks;     <--- here
    else
        scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);

But clearly, scan is valid (otherwise it'd crash on the if condition),
and scan->rs_parallel must me non-NULL. Which probably means the pointer
is (no longer) valid.

Could it be that the rs_parallel DSM disappears on rescan, or something
like that?


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Jerry Sievers-3
Tomas Vondra <[hidden email]> writes:

> On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:
>
>>Greetings Hackers.
>>
>>We have a reproduceable case of $subject that issues a backtrace such as
>>seen below.
>>
>>The query that I'd prefer to sanitize before sending is <30 lines of at
>>a glance, not terribly complex logic.
>>
>>It nonetheless dies hard after a few seconds of running and as expected,
>>results in an automatic all-backend restart.
>>
>>Please advise on how to proceed.  Thanks!
>>
>>bt
>>#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, keep_startblock=keep_startblock@entry=1 '\001')
>>    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
>>#1  0x000055d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
>>#2  0x000055d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
>>#3  0x000055d7a742d36e in ExecReScan (node=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
>>#4  0x000055d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
>>#5  0x000055d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
>>#6  0x000055d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
>>#7  0x000055d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226
>><about 30 lines omitted>
>>
>
> Hmmm, that means it's crashing here:
>
>    if (scan->rs_parallel != NULL)
>        scan->rs_nblocks = scan->rs_parallel->phs_nblocks;     <--- here
>    else
>        scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);
>
> But clearly, scan is valid (otherwise it'd crash on the if condition),
> and scan->rs_parallel must me non-NULL. Which probably means the pointer
> is (no longer) valid.
>
> Could it be that the rs_parallel DSM disappears on rescan, or something
> like that?

No clue but something I just tried was to disable parallelism by setting
max_parallel_workers_per_gather to 0 and however the query has not
finished after a few minutes, there is no crash.

Please advise.

Thx

>
>
> regards

--
Jerry Sievers
Postgres DBA/Development Consulting
e: [hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Tomas Vondra-4
On Mon, Jul 15, 2019 at 07:22:55PM -0500, Jerry Sievers wrote:

>Tomas Vondra <[hidden email]> writes:
>
>> On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:
>>
>>>Greetings Hackers.
>>>
>>>We have a reproduceable case of $subject that issues a backtrace such as
>>>seen below.
>>>
>>>The query that I'd prefer to sanitize before sending is <30 lines of at
>>>a glance, not terribly complex logic.
>>>
>>>It nonetheless dies hard after a few seconds of running and as expected,
>>>results in an automatic all-backend restart.
>>>
>>>Please advise on how to proceed.  Thanks!
>>>
>>>bt
>>>#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, keep_startblock=keep_startblock@entry=1 '\001')
>>>    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
>>>#1  0x000055d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
>>>#2  0x000055d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
>>>#3  0x000055d7a742d36e in ExecReScan (node=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
>>>#4  0x000055d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
>>>#5  0x000055d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
>>>#6  0x000055d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
>>>#7  0x000055d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226
>>><about 30 lines omitted>
>>>
>>
>> Hmmm, that means it's crashing here:
>>
>>    if (scan->rs_parallel != NULL)
>>        scan->rs_nblocks = scan->rs_parallel->phs_nblocks;     <--- here
>>    else
>>        scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);
>>
>> But clearly, scan is valid (otherwise it'd crash on the if condition),
>> and scan->rs_parallel must me non-NULL. Which probably means the pointer
>> is (no longer) valid.
>>
>> Could it be that the rs_parallel DSM disappears on rescan, or something
>> like that?
>
>No clue but something I just tried was to disable parallelism by setting
>max_parallel_workers_per_gather to 0 and however the query has not
>finished after a few minutes, there is no crash.
>

That might be a hint my rough analysis was somewhat correct. The
question is whether the non-parallel plan does the same thing. Maybe it
picks a plan that does not require rescans, or something like that.

>Please advise.
>

It would be useful to see (a) exacution plan of the query, (b) full
backtrace and (c) a bit of context for the place where it crashed.

Something like (in gdb):

    bt full
    list
    p *scan



regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Jerry Sievers-3
Tomas Vondra <[hidden email]> writes:

> On Mon, Jul 15, 2019 at 07:22:55PM -0500, Jerry Sievers wrote:
>
>>Tomas Vondra <[hidden email]> writes:
>>
>>> On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:
>>>
>>>>Greetings Hackers.
>>>>
>>>>We have a reproduceable case of $subject that issues a backtrace such as
>>>>seen below.
>>>>
>>>>The query that I'd prefer to sanitize before sending is <30 lines of at
>>>>a glance, not terribly complex logic.
>>>>
>>>>It nonetheless dies hard after a few seconds of running and as expected,
>>>>results in an automatic all-backend restart.
>>>>
>>>>Please advise on how to proceed.  Thanks!
>>>>
>>>>bt
>>>>#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, keep_startblock=keep_startblock@entry=1 '\001')
>>>>    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
>>>>#1  0x000055d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
>>>>#2  0x000055d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
>>>>#3  0x000055d7a742d36e in ExecReScan (node=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
>>>>#4  0x000055d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
>>>>#5  0x000055d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
>>>>#6  0x000055d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
>>>>#7  0x000055d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226
>>>><about 30 lines omitted>
>>>>
>>>
>>> Hmmm, that means it's crashing here:
>>>
>>>    if (scan->rs_parallel != NULL)
>>>        scan->rs_nblocks = scan->rs_parallel->phs_nblocks;     <--- here
>>>    else
>>>        scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);
>>>
>>> But clearly, scan is valid (otherwise it'd crash on the if condition),
>>> and scan->rs_parallel must me non-NULL. Which probably means the pointer
>>> is (no longer) valid.
>>>
>>> Could it be that the rs_parallel DSM disappears on rescan, or something
>>> like that?
>>
>>No clue but something I just tried was to disable parallelism by setting
>>max_parallel_workers_per_gather to 0 and however the query has not
>>finished after a few minutes, there is no crash.
>>
>
> That might be a hint my rough analysis was somewhat correct. The
> question is whether the non-parallel plan does the same thing. Maybe it
> picks a plan that does not require rescans, or something like that.
>
>>Please advise.
>>
>
> It would be useful to see (a) exacution plan of the query, (b) full
> backtrace and (c) a bit of context for the place where it crashed.
>
> Something like (in gdb):
>
>    bt full
>    list
>    p *scan

The p *scan did nothing unless I ran it first however my gdb $foo isn't
strong presently.

I'll need to sanitize the explain output but can do so ASAP and send it
along.

Thx!


$ gdb /usr/lib/postgresql/9.6/bin/postgres core
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/postgresql/9.6/bin/postgres...Reading symbols from /usr/lib/debug/.build-id/04/6f55a5ce6ce05064edfc8feee61c6cb039d296.debug...done.
done.
[New LWP 31654]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: foo_eis_segfault: jsievers staging 10.220.22.26(57948) SELECT       '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, keep_startblock=keep_startblock@entry=1 '\001')
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
233 /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c: No such file or directory.
(gdb) p *scan
$1 = {rs_rd = 0x7fa6c6935a08, rs_snapshot = 0x55d7a7c2e630, rs_nkeys = 0, rs_key = 0x0, rs_bitmapscan = 0 '\000', rs_samplescan = 0 '\000', rs_pageatatime = 1 '\001',
  rs_allow_strat = 1 '\001', rs_allow_sync = 1 '\001', rs_temp_snap = 1 '\001', rs_nblocks = 198714, rs_startblock = 1920300133, rs_numblocks = 4294967295, rs_strategy = 0x55d7a7daa6a0,
  rs_syncscan = 1 '\001', rs_inited = 0 '\000', rs_ctup = {t_len = 114, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 62879}, ip_posid = 77}, t_tableOid = 994804890, t_data = 0x0},
  rs_cblock = 4294967295, rs_cbuf = 0, rs_parallel = 0x7fa673a54108, rs_cindex = 76, rs_ntuples = 77, rs_vistuples = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
    21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66,
    67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 28255, 25711, 24421, 25705, 12576, 8247, 29754, 29281, 25959, 27764, 29545, 8308, 31528, 16724, 18258, 21573, 20037, 21076,
    8281, 25914, 28792, 8306, 22139, 21057, 14880, 24950, 28274, 8303, 12337, 14880, 24950, 24946, 29812, 28526, 13088, 14880, 24950, 29810, 28793, 8293, 13106, 14880, 24950, 29810, 28793,
    28525, 8292, 12589, 14880, 24950, 25458, 27759, 26988, 8292, 8240, 30266, 29281, 25964, 25974, 29548, 28789, 12320, 14880, 24950, 28274, 28527, 25708, 12576, 8240, 30266, 29281, 24943,
    29812, 28526, 13088, 14880, 28524, 24931, 26996, 28271, 13344, 13110, 8317, 29242, 29541, 28526, 12576, 14880, 25970, 28275, 28001, 8293, 15932, 14880, 25970, 29555, 29295, 26484,
    28530, 28789, 25970, 8294, 8240, 29242, 29541, 29295, 26473, 25204, 8300, 8240, 29242, 29541, 29295, 26473, 28515, 8300, 8240, 29242, 29541...}}
(gdb) bt full
#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, keep_startblock=keep_startblock@entry=1 '\001')
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
        allow_strat = <optimized out>
        allow_sync = <optimized out>
#1  0x000055d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
        __func__ = "heap_rescan"
#2  0x000055d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
        scan = <optimized out>
#3  0x000055d7a742d36e in ExecReScan (node=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
        __func__ = "ExecReScan"
#4  0x000055d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
No locals.
#5  0x000055d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
        __func__ = "ExecReScan"
#6  0x000055d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
No locals.
#7  0x000055d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226
        __func__ = "ExecReScan"
#8  0x000055d7a7433ce7 in ExecProcNode (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:381
        result = <optimized out>
        __func__ = "ExecProcNode"
#9  0x000055d7a7452989 in ExecSort (node=node@entry=0x55d7a7d83ea0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSort.c:103
        plannode = <optimized out>
        outerNode = 0x55d7a7d84110
        tupDesc = <optimized out>
        estate = 0x55d7a7d5fee8
        dir = ForwardScanDirection
        tuplesortstate = 0x55d7a7dd2448
        slot = <optimized out>
#10 0x000055d7a7433de8 in ExecProcNode (node=0x55d7a7d83ea0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:495
        result = <optimized out>
        __func__ = "ExecProcNode"
#11 0x000055d7a743ffe9 in fetch_input_tuple (aggstate=aggstate@entry=0x55d7a7d83528) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeAgg.c:598
        slot = <optimized out>
#12 0x000055d7a7441bb3 in agg_retrieve_direct (aggstate=0x55d7a7d83528) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeAgg.c:2078
        econtext = 0x55d7a7d838b0
        pergroup = 0x55d7a7d8e758
        firstSlot = 0x55d7a7d83960
        numGroupingSets = 1
        node = 0x7fa6c68a5da8
        tmpcontext = 0x55d7a7d83750
        peragg = 0x55d7a7d8d6b8
        outerslot = <optimized out>
        nextSetSize = <optimized out>
        result = <optimized out>
        hasGroupingSets = 0 '\000'
        currentSet = <optimized out>
        numReset = 1
        i = <optimized out>
---Type <return> to continue, or q <return> to quit---
#13 ExecAgg (node=node@entry=0x55d7a7d83528) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeAgg.c:1903
        result = <optimized out>
#14 0x000055d7a7433dc8 in ExecProcNode (node=node@entry=0x55d7a7d83528) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:503
        result = <optimized out>
        __func__ = "ExecProcNode"
#15 0x000055d7a744af74 in ExecLimit (node=node@entry=0x55d7a7d83288) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeLimit.c:91
        direction = ForwardScanDirection
        slot = <optimized out>
        outerPlan = 0x55d7a7d83528
        __func__ = "ExecLimit"
#16 0x000055d7a7433d28 in ExecProcNode (node=node@entry=0x55d7a7d83288) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:531
        result = <optimized out>
        __func__ = "ExecProcNode"
#17 0x000055d7a744ff69 in ExecNestLoop (node=node@entry=0x55d7a7d60cd0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeNestloop.c:174
        nl = 0x7fa6c68a6048
        innerPlan = 0x55d7a7d83288
        outerPlan = 0x55d7a7d610c0
        outerTupleSlot = <optimized out>
        innerTupleSlot = <optimized out>
        joinqual = 0x0
        otherqual = 0x0
        econtext = 0x55d7a7d60de0
        lc = <optimized out>
#18 0x000055d7a7433e28 in ExecProcNode (node=node@entry=0x55d7a7d60cd0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:476
        result = <optimized out>
        __func__ = "ExecProcNode"
#19 0x000055d7a7452989 in ExecSort (node=node@entry=0x55d7a7d60a60) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSort.c:103
        plannode = <optimized out>
        outerNode = 0x55d7a7d60cd0
        tupDesc = <optimized out>
        estate = 0x55d7a7d5fee8
        dir = ForwardScanDirection
        tuplesortstate = 0x55d7a7d98398
        slot = <optimized out>
#20 0x000055d7a7433de8 in ExecProcNode (node=0x55d7a7d60a60) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:495
        result = <optimized out>
        __func__ = "ExecProcNode"
#21 0x000055d7a743ffe9 in fetch_input_tuple (aggstate=aggstate@entry=0x55d7a7d60088) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeAgg.c:598
        slot = <optimized out>
#22 0x000055d7a7441bb3 in agg_retrieve_direct (aggstate=0x55d7a7d60088) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeAgg.c:2078
        econtext = 0x55d7a7d60440
        pergroup = 0x55d7a7d91230
        firstSlot = 0x55d7a7d604f0
        numGroupingSets = 1
        node = 0x7fa6c68a6328
        tmpcontext = 0x55d7a7d602b0
        peragg = 0x55d7a7d90190
        outerslot = <optimized out>
        nextSetSize = <optimized out>
---Type <return> to continue, or q <return> to quit---
        result = <optimized out>
        hasGroupingSets = 0 '\000'
        currentSet = <optimized out>
        numReset = 1
        i = <optimized out>
#23 ExecAgg (node=node@entry=0x55d7a7d60088) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeAgg.c:1903
        result = <optimized out>
#24 0x000055d7a7433dc8 in ExecProcNode (node=node@entry=0x55d7a7d60088) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:503
        result = <optimized out>
        __func__ = "ExecProcNode"
#25 0x000055d7a742ff2e in ExecutePlan (dest=0x7fa673a96308, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>,
    planstate=0x55d7a7d60088, estate=0x55d7a7d5fee8) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execMain.c:1567
        slot = <optimized out>
        current_tuple_count = 0
#26 standard_ExecutorRun (queryDesc=0x55d7a7d54718, direction=<optimized out>, count=0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execMain.c:339
        estate = 0x55d7a7d5fee8
        operation = CMD_SELECT
        dest = 0x7fa673a96308
        sendTuples = <optimized out>
#27 0x00007fa6c7027515 in explain_ExecutorRun (queryDesc=0x55d7a7d54718, direction=ForwardScanDirection, count=0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../contrib/auto_explain/auto_explain.c:281
        save_exception_stack = 0x7fff4aeeaa80
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {94384722106264, 8229023444991490729, 0, 94384722102040, 0, 1, 8229023444890827433, 8250672449167702697}, __mask_was_saved = 0, __saved_mask = {
              __val = {94384721739856, 140734450543072, 94384714940022, 140354273004312, 140354273004312, 140734450543104, 94384714691234, 2, 2, 140734450543200, 94384711690034, 2,
                3462443396, 8388608, 3547611511646930944, 140734450543200}}}}
#28 0x00007fa6c6e1fdb0 in pgss_ExecutorRun (queryDesc=0x55d7a7d54718, direction=ForwardScanDirection, count=0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../contrib/pg_stat_statements/pg_stat_statements.c:875
        save_exception_stack = 0x7fff4aeeac20
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {94384722106264, 8229023444960033449, 0, 94384722102040, 0, 1, 8229023444993587881, 8250670555334589097}, __mask_was_saved = 0, __saved_mask = {
              __val = {4294967296, 140354272256808, 94384714928429, 16, 94384719269552, 24, 94384720895528, 94384722102040, 0, 140734450543408, 94384714928429, 94384722106264,
                94384720895528, 140734450543440, 94384714994982, 94384722106264}}}}
#29 0x000055d7a7553167 in PortalRunSelect (portal=portal@entry=0x55d7a7d55798, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x7fa673a96308)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/tcop/pquery.c:948
        queryDesc = 0x55d7a7d54718
        direction = <optimized out>
        nprocessed = <optimized out>
        __func__ = "PortalRunSelect"
#30 0x000055d7a75547a0 in PortalRun (portal=portal@entry=0x55d7a7d55798, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x7fa673a96308,
    altdest=altdest@entry=0x7fa673a96308, completionTag=completionTag@entry=0x7fff4aeeb050 "") at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/tcop/pquery.c:789
        save_exception_stack = 0x7fff4aeeaf00
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {94384721085312, 8229023445033433769, 94384722106264, 140352881779464, 94384721085584, 2, 8229023444955839145, 2765920793019169449},
            __mask_was_saved = 0, __saved_mask = {__val = {0, 12099560782865280144, 0, 8, 8, 140734450544226, 1, 88, 94384722106264, 94384715935530, 94384721085584, 140734450543840,
                94384714930017, 2, 94384722106264, 140734450543872}}}}
        result = <optimized out>
        nprocessed = <optimized out>
        saveTopTransactionResourceOwner = 0x55d7a7c118e8
---Type <return> to continue, or q <return> to quit---
        saveTopTransactionContext = 0x55d7a7c10eb8
        saveActivePortal = 0x0
        saveResourceOwner = 0x55d7a7c118e8
        savePortalContext = 0x0
        saveMemoryContext = 0x55d7a7c10eb8
        __func__ = "PortalRun"
#31 0x000055d7a75512d6 in exec_simple_query (
    query_string=0x55d7a7ce6b38 "select v.account_id, COUNT(cnt.clicks), te.description,\nl.product_id\nfrom nbox_nc_ah.tracking_events te\njoin nbox_nc_ah.page_views pv on pv.page_view_id = te.page_view_id\njoin nbox_nc_ah.visits v on v"...) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/tcop/postgres.c:1109
        parsetree = 0x55d7a7c5c380
        portal = 0x55d7a7d55798
        snapshot_set = <optimized out>
        commandTag = <optimized out>
        completionTag = "\000\370\253\247\327U\000\000\240F\252\247\327U\000\000\200\260\356J\377\177\000\000\215\326g\247\327U\000\000\300\260\356J\377\177\000\000d\261\356J\377\177\000\000\240\260\356J\377\177\000\000v\031F\247\327U\000"
        querytree_list = <optimized out>
        plantree_list = 0x7fa673a962d8
        receiver = 0x7fa673a96308
        format = 0
        dest = DestRemote
        parsetree_list = 0x55d7a7c5c4b0
        save_log_statement_stats = 0 '\000'
        was_logged = 0 '\000'
        msec_str = "\020\261\356J\377\177\000\000(\002", '\000' <repeats 14 times>, "\340?\256\247\327U\000"
        parsetree_item = 0x55d7a7c5c490
        isTopLevel = 1 '\001'
#32 PostgresMain (argc=<optimized out>, argv=argv@entry=0x55d7a7c56830, dbname=0x55d7a7c11b88 "staging", username=<optimized out>)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/tcop/postgres.c:4101
        query_string = 0x55d7a7ce6b38 "select v.account_id, COUNT(cnt.clicks), te.description,\nl.product_id\nfrom nbox_nc_ah.tracking_events te\njoin nbox_nc_ah.page_views pv on pv.page_view_id = te.page_view_id\njoin nbox_nc_ah.visits v on v"...
        firstchar = -1479190632
        input_message = {
          data = 0x55d7a7ce6b38 "select v.account_id, COUNT(cnt.clicks), te.description,\nl.product_id\nfrom nbox_nc_ah.tracking_events te\njoin nbox_nc_ah.page_views pv on pv.page_view_id = te.page_view_id\njoin nbox_nc_ah.visits v on v"..., len = 1042, maxlen = 2048, cursor = 1042}
        local_sigjmp_buf = {{__jmpbuf = {140734450544288, 8229023445169748649, 94384721061936, 1, 94384721061720, 94384721052928, 8229023445035530921, 2765920790734322345},
            __mask_was_saved = 1, __saved_mask = {__val = {0, 94386201296895, 94384713689589, 18446603339259007057, 140354407146656, 0, 1305670059009, 32, 4, 489626271867, 0, 0,
                532575944823, 140734450544608, 0, 140734450544704}}}}
        send_ready_for_query = 0 '\000'
        disable_idle_in_transaction_timeout = <optimized out>
        __func__ = "PostgresMain"
#33 0x000055d7a72c6a1b in BackendRun (port=0x55d7a7c54500) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/postmaster/postmaster.c:4339
        ac = 1
        secs = 616545808
        usecs = 503344
        i = 1
        av = 0x55d7a7c56830
        maxac = <optimized out>
#34 BackendStartup (port=0x55d7a7c54500) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/postmaster/postmaster.c:4013
        bn = <optimized out>
---Type <return> to continue, or q <return> to quit---
        pid = <optimized out>
#35 ServerLoop () at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/postmaster/postmaster.c:1722
        rmask = {fds_bits = {16, 0 <repeats 15 times>}}
        selres = <optimized out>
        now = <optimized out>
        readmask = {fds_bits = {48, 0 <repeats 15 times>}}
        last_lockfile_recheck_time = 1563230588
        last_touch_time = 1563230588
        __func__ = "ServerLoop"
#36 0x000055d7a74ed281 in PostmasterMain (argc=13, argv=<optimized out>) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/postmaster/postmaster.c:1330
        opt = <optimized out>
        status = <optimized out>
        userDoption = <optimized out>
        listen_addr_saved = 1 '\001'
        i = <optimized out>
        output_config_variable = <optimized out>
        __func__ = "PostmasterMain"
#37 0x000055d7a72c7bf1 in main (argc=13, argv=0x55d7a7c0f840) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/main/main.c:228
No locals.
(gdb) list
228 in /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c
(gdb)

>
>
>
> regards

--
Jerry Sievers
Postgres DBA/Development Consulting
e: [hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Tomas Vondra-4
On Mon, Jul 15, 2019 at 08:20:00PM -0500, Jerry Sievers wrote:

>Tomas Vondra <[hidden email]> writes:
>
>> On Mon, Jul 15, 2019 at 07:22:55PM -0500, Jerry Sievers wrote:
>>
>>>Tomas Vondra <[hidden email]> writes:
>>>
>>>> On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:
>>>>
>>>>>Greetings Hackers.
>>>>>
>>>>>We have a reproduceable case of $subject that issues a backtrace such as
>>>>>seen below.
>>>>>
>>>>>The query that I'd prefer to sanitize before sending is <30 lines of at
>>>>>a glance, not terribly complex logic.
>>>>>
>>>>>It nonetheless dies hard after a few seconds of running and as expected,
>>>>>results in an automatic all-backend restart.
>>>>>
>>>>>Please advise on how to proceed.  Thanks!
>>>>>
>>>>>bt
>>>>>#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, keep_startblock=keep_startblock@entry=1 '\001')
>>>>>    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
>>>>>#1  0x000055d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
>>>>>#2  0x000055d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
>>>>>#3  0x000055d7a742d36e in ExecReScan (node=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
>>>>>#4  0x000055d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
>>>>>#5  0x000055d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
>>>>>#6  0x000055d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
>>>>>#7  0x000055d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226
>>>>><about 30 lines omitted>
>>>>>
>>>>
>>>> Hmmm, that means it's crashing here:
>>>>
>>>>    if (scan->rs_parallel != NULL)
>>>>        scan->rs_nblocks = scan->rs_parallel->phs_nblocks;     <--- here
>>>>    else
>>>>        scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);
>>>>
>>>> But clearly, scan is valid (otherwise it'd crash on the if condition),
>>>> and scan->rs_parallel must me non-NULL. Which probably means the pointer
>>>> is (no longer) valid.
>>>>
>>>> Could it be that the rs_parallel DSM disappears on rescan, or something
>>>> like that?
>>>
>>>No clue but something I just tried was to disable parallelism by setting
>>>max_parallel_workers_per_gather to 0 and however the query has not
>>>finished after a few minutes, there is no crash.
>>>
>>
>> That might be a hint my rough analysis was somewhat correct. The
>> question is whether the non-parallel plan does the same thing. Maybe it
>> picks a plan that does not require rescans, or something like that.
>>
>>>Please advise.
>>>
>>
>> It would be useful to see (a) exacution plan of the query, (b) full
>> backtrace and (c) a bit of context for the place where it crashed.
>>
>> Something like (in gdb):
>>
>>    bt full
>>    list
>>    p *scan
>
>The p *scan did nothing unless I ran it first however my gdb $foo isn't
>strong presently.

Hmm, the rs_parallel pointer looks sane (it's not obvious garbage). Can
you try this?

   p *scan->rs_parallel

Another question - are you sure this is not an OOM issue? That might
sometimes look like SIGSEGV due to overcommit. What's the memory
consumption / is there anything in dmesg?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Thomas Munro-5
On Tue, Jul 16, 2019 at 8:22 PM Tomas Vondra
<[hidden email]> wrote:
> On Mon, Jul 15, 2019 at 08:20:00PM -0500, Jerry Sievers wrote:
> >>>>>We have a reproduceable case of $subject that issues a backtrace such as
> >>>>>seen below.

> >>>>>#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, keep_startblock=keep_startblock@entry=1 '\001')
> >>>>>    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
> >>>>>#1  0x000055d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
> >>>>>#2  0x000055d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
> >>>>>#3  0x000055d7a742d36e in ExecReScan (node=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
> >>>>>#4  0x000055d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
> >>>>>#5  0x000055d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166

Here's a query that rescans a gather node repeatedly on 9.6 in case it
helps someone build a repro, but it works fine here.

--
Thomas Munro
https://enterprisedb.com

test.sql (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Jerry Sievers-3
In reply to this post by Tomas Vondra-4
Tomas Vondra <[hidden email]> writes:

> On Mon, Jul 15, 2019 at 08:20:00PM -0500, Jerry Sievers wrote:
>
>>Tomas Vondra <[hidden email]> writes:
>>
>>> On Mon, Jul 15, 2019 at 07:22:55PM -0500, Jerry Sievers wrote:
>>>
>>>>Tomas Vondra <[hidden email]> writes:
>>>>
>>>>> On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:
>>>>>
>>>>>>Greetings Hackers.
>>>>>>
>>>>>>We have a reproduceable case of $subject that issues a backtrace such as
>>>>>>seen below.
>>>>>>
>>>>>>The query that I'd prefer to sanitize before sending is <30 lines of at
>>>>>>a glance, not terribly complex logic.
>>>>>>
>>>>>>It nonetheless dies hard after a few seconds of running and as expected,
>>>>>>results in an automatic all-backend restart.
>>>>>>
>>>>>>Please advise on how to proceed.  Thanks!
>>>>>>
>>>>>>bt
>>>>>>#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, keep_startblock=keep_startblock@entry=1 '\001')
>>>>>>    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
>>>>>>#1  0x000055d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key@entry=0x0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
>>>>>>#2  0x000055d7a7451fef in ExecReScanSeqScan (node=node@entry=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
>>>>>>#3  0x000055d7a742d36e in ExecReScan (node=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
>>>>>>#4  0x000055d7a7445d38 in ExecReScanGather (node=node@entry=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
>>>>>>#5  0x000055d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
>>>>>>#6  0x000055d7a7448673 in ExecReScanHashJoin (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
>>>>>>#7  0x000055d7a742d29e in ExecReScan (node=node@entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226
>>>>>><about 30 lines omitted>
>>>>>>
>>>>>
>>>>> Hmmm, that means it's crashing here:
>>>>>
>>>>>    if (scan->rs_parallel != NULL)
>>>>>        scan->rs_nblocks = scan->rs_parallel->phs_nblocks;     <--- here
>>>>>    else
>>>>>        scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);
>>>>>
>>>>> But clearly, scan is valid (otherwise it'd crash on the if condition),
>>>>> and scan->rs_parallel must me non-NULL. Which probably means the pointer
>>>>> is (no longer) valid.
>>>>>
>>>>> Could it be that the rs_parallel DSM disappears on rescan, or something
>>>>> like that?
>>>>
>>>>No clue but something I just tried was to disable parallelism by setting
>>>>max_parallel_workers_per_gather to 0 and however the query has not
>>>>finished after a few minutes, there is no crash.
>>>>
>>>
>>> That might be a hint my rough analysis was somewhat correct. The
>>> question is whether the non-parallel plan does the same thing. Maybe it
>>> picks a plan that does not require rescans, or something like that.
>>>
>>>>Please advise.
>>>>
>>>
>>> It would be useful to see (a) exacution plan of the query, (b) full
>>> backtrace and (c) a bit of context for the place where it crashed.
>>>
>>> Something like (in gdb):
>>>
>>>    bt full
>>>    list
>>>    p *scan
>>
>>The p *scan did nothing unless I ran it first however my gdb $foo isn't
>>strong presently.
>
> Hmm, the rs_parallel pointer looks sane (it's not obvious garbage). Can
> you try this?
>
>   p *scan->rs_parallel


$ gdb /usr/lib/postgresql/9.6/bin/postgres core
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/postgresql/9.6/bin/postgres...Reading symbols from /usr/lib/debug/.build-id/04/6f55a5ce6ce05064edfc8feee61c6cb039d296.debug...done.
done.
[New LWP 31654]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: foo_eis_segfault: jsievers staging 10.220.22.26(57948) SELECT       '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  initscan (scan=scan@entry=0x55d7a7daa0b0, key=0x0, keep_startblock=keep_startblock@entry=1 '\001')
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
233 /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c: No such file or directory.
(gdb) p *scan->rs_parallel
Cannot access memory at address 0x7fa673a54108
(gdb)

>
> Another question - are you sure this is not an OOM issue? That might
> sometimes look like SIGSEGV due to overcommit. What's the memory
> consumption / is there anything in dmesg?

Below is all I got after a prior dmesg -c...

dmesg -c
[5441294.442062] postgres[12033]: segfault at 7f3d011d2110 ip 000055666def9a31 sp 00007ffc37be9a70 error 4 in postgres[55666de23000+653000]

Thanks!

>
> regards

--
Jerry Sievers
Postgres DBA/Development Consulting
e: [hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Thomas Munro-5
On Wed, Jul 17, 2019 at 11:06 AM Jerry Sievers <[hidden email]> wrote:
> (gdb) p *scan->rs_parallel
> Cannot access memory at address 0x7fa673a54108

So I guess one question is: was it a valid address that's been
unexpectedly unmapped, or is the pointer corrupted?  Any chance you
can strace the backend and pull out the map, unmap calls?

--
Thomas Munro
https://enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Thomas Munro-5
On Wed, Jul 17, 2019 at 11:11 AM Thomas Munro <[hidden email]> wrote:
> map, unmap

mmap, munmap

--
Thomas Munro
https://enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Jerry Sievers-3
In reply to this post by Thomas Munro-5
Thomas Munro <[hidden email]> writes:

> On Wed, Jul 17, 2019 at 11:06 AM Jerry Sievers <[hidden email]> wrote:
>
>> (gdb) p *scan->rs_parallel
>> Cannot access memory at address 0x7fa673a54108
>
> So I guess one question is: was it a valid address that's been
> unexpectedly unmapped, or is the pointer corrupted?  Any chance you
> can strace the backend and pull out the map, unmap calls?

I'll dig further.

Here is a sanitized look at the query and explain plan...

The segfault happens $immediately upon issuance of the query.





begin;

-- This setting makes the segfault go away
--set local max_parallel_workers_per_gather to 0;

explain
select v.account_id, COUNT(cnt.clicks), te.description,
l.product_id
from thing3.thing10 te
join thing3.thing9 pv on pv.page_view_id = te.page_view_id
join thing3.thing11 v on v.visit_id = pv.visit_id
left join thing6.thing12 l on v.account_id=l.account_id
  left join lateral (
    select MAX(v.visit_id)
         ,COUNT(*) as clicks
         from thing3.thing10 te
         join thing3.thing9 pv on pv.page_view_id =
te.page_view_id
         join thing3.thing11 v on v.visit_id = pv.visit_id
         where te.description in ('thing7',
'thing8')
           and v.account_id=l.account_id
         GROUP BY v.account_id, v.visit_id
         order by v.account_id, v.visit_id desc
         limit 1
    )cnt on true
where (te.description in ('thing4',
'thing5')
         or te.description like'%auto%')
  and te.created_at > '2019-06-24 00:00:00'
--and l.loan_status_id in (5,6)
group by v.account_id, te.description,
l.product_id;

abort;
BEGIN
                                                                                                                            QUERY PLAN                                                                                                                            
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=12300178.71..12300179.79 rows=48 width=44)
   Group Key: v.account_id, te.description, l.product_id
   ->  Sort  (cost=12300178.71..12300178.83 rows=48 width=44)
         Sort Key: v.account_id, te.description, l.product_id
         ->  Nested Loop Left Join  (cost=251621.81..12300177.37 rows=48 width=44)
               ->  Gather  (cost=1001.55..270403.27 rows=48 width=40)
                     Workers Planned: 3
                     ->  Nested Loop Left Join  (cost=1.56..269398.47 rows=15 width=40)
                           ->  Nested Loop  (cost=1.13..269391.71 rows=14 width=32)
                                 ->  Nested Loop  (cost=0.57..269368.66 rows=39 width=32)
                                       ->  Parallel Seq Scan on thing10 te  (cost=0.00..269228.36 rows=39 width=32)
                                             Filter: ((created_at > '2019-06-24 00:00:00'::timestamp without time zone) AND (((description)::text = ANY ('{thing4,thing5}'::text[])) OR ((description)::text ~~ '%auto%'::text)))
                                       ->  Index Scan using page_views_pkey on thing9 pv  (cost=0.57..3.59 rows=1 width=8)
                                             Index Cond: (page_view_id = te.page_view_id)
                                 ->  Index Scan using visits_pkey on thing11 v  (cost=0.56..0.58 rows=1 width=8)
                                       Index Cond: (visit_id = pv.visit_id)
                           ->  Index Scan using index_loans_on_account_id on thing12 l  (cost=0.42..0.46 rows=2 width=8)
                                 Index Cond: (v.account_id = account_id)
               ->  Limit  (cost=250620.25..250620.27 rows=1 width=20)
                     ->  GroupAggregate  (cost=250620.25..250620.27 rows=1 width=20)
                           Group Key: v_1.visit_id
                           ->  Sort  (cost=250620.25..250620.26 rows=1 width=8)
                                 Sort Key: v_1.visit_id DESC
                                 ->  Hash Join  (cost=1154.34..250620.24 rows=1 width=8)
                                       Hash Cond: (te_1.page_view_id = pv_1.page_view_id)
                                       ->  Gather  (cost=1000.00..250452.00 rows=3706 width=4)
                                             Workers Planned: 3
                                             ->  Parallel Seq Scan on thing10 te_1  (cost=0.00..249081.40 rows=1195 width=4)
                                                   Filter: ((description)::text = ANY ('{thing7,thing8}'::text[]))
                                       ->  Hash  (cost=152.85..152.85 rows=119 width=12)
                                             ->  Nested Loop  (cost=1.01..152.85 rows=119 width=12)
                                                   ->  Index Scan using index_visits_on_account_id on thing11 v_1  (cost=0.43..15.63 rows=18 width=8)
                                                         Index Cond: (account_id = l.account_id)
                                                   ->  Index Scan using index_pv_on_visit on thing9 pv_1  (cost=0.57..7.55 rows=7 width=8)
                                                         Index Cond: (visit_id = v_1.visit_id)
(35 rows)

ROLLBACK


--
Jerry Sievers
Postgres DBA/Development Consulting
e: [hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Jerry Sievers-3
In reply to this post by Thomas Munro-5
Thomas Munro <[hidden email]> writes:

> On Wed, Jul 17, 2019 at 11:06 AM Jerry Sievers <[hidden email]> wrote:
>
>> (gdb) p *scan->rs_parallel
>> Cannot access memory at address 0x7fa673a54108
>
> So I guess one question is: was it a valid address that's been
> unexpectedly unmapped, or is the pointer corrupted?  Any chance you
> can strace the backend and pull out the map, unmap calls?

There were about 60k lines from strace including these few...


mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3d0127a000
mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3d01239000
mmap(NULL, 287624, PROT_READ|PROT_WRITE, MAP_SHARED, 124, 0) = 0x7f3d011f2000
mmap(NULL, 262504, PROT_READ|PROT_WRITE, MAP_SHARED, 124, 0) = 0x7f3d011b1000
munmap(0x7f3d011b1000, 262504)          = 0

Thx





--
Jerry Sievers
Postgres DBA/Development Consulting
e: [hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Thomas Munro-5
On Wed, Jul 17, 2019 at 11:33 AM Jerry Sievers <[hidden email]> wrote:
>          ->  Nested Loop Left Join  (cost=251621.81..12300177.37 rows=48 width=44)
>                ->  Gather  (cost=1001.55..270403.27 rows=48 width=40)

>                ->  Limit  (cost=250620.25..250620.27 rows=1 width=20)

>                                        ->  Gather  (cost=1000.00..250452.00 rows=3706 width=4)

One observation is that it's a rescan a bit like the one in the
unsuccessful repro attempt I posted, but it has *two* Gather nodes in
it (and thus two parallel query DSM segments), and only one of them
should be rescanned, and from the backtrace we see that it is indeed
the expected one, the one under the Limit operator.  Neither of them
should be getting unmapped in the leader though and AFAIK nothing
happening in the workers could cause this effect, the leader would
have to explicitly unmap the thing AFAIK.

On Wed, Jul 17, 2019 at 11:42 AM Jerry Sievers <[hidden email]> wrote:
> mmap(NULL, 287624, PROT_READ|PROT_WRITE, MAP_SHARED, 124, 0) = 0x7f3d011f2000
> mmap(NULL, 262504, PROT_READ|PROT_WRITE, MAP_SHARED, 124, 0) = 0x7f3d011b1000
> munmap(0x7f3d011b1000, 262504)          = 0

Ok, there go our two parallel query DSM segments, and there it is
being unmapped.  Hmm.  Any chance you could attach a debugger, and
"break munmap", "cont", and then show us the backtrace "bt" when that
is reached?




--
Thomas Munro
https://enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Jerry Sievers-3
Thomas Munro <[hidden email]> writes:

> On Wed, Jul 17, 2019 at 11:33 AM Jerry Sievers <[hidden email]> wrote:
>
>>          ->  Nested Loop Left Join  (cost=251621.81..12300177.37 rows=48 width=44)
>>                ->  Gather  (cost=1001.55..270403.27 rows=48 width=40)
>
>>                ->  Limit  (cost=250620.25..250620.27 rows=1 width=20)
>
>>                                        ->  Gather  (cost=1000.00..250452.00 rows=3706 width=4)
>
> One observation is that it's a rescan a bit like the one in the
> unsuccessful repro attempt I posted, but it has *two* Gather nodes in
> it (and thus two parallel query DSM segments), and only one of them
> should be rescanned, and from the backtrace we see that it is indeed
> the expected one, the one under the Limit operator.  Neither of them
> should be getting unmapped in the leader though and AFAIK nothing
> happening in the workers could cause this effect, the leader would
> have to explicitly unmap the thing AFAIK.
>
> On Wed, Jul 17, 2019 at 11:42 AM Jerry Sievers <[hidden email]> wrote:
>> mmap(NULL, 287624, PROT_READ|PROT_WRITE, MAP_SHARED, 124, 0) = 0x7f3d011f2000
>> mmap(NULL, 262504, PROT_READ|PROT_WRITE, MAP_SHARED, 124, 0) = 0x7f3d011b1000
>> munmap(0x7f3d011b1000, 262504)          = 0
>
> Ok, there go our two parallel query DSM segments, and there it is
> being unmapped.  Hmm.  Any chance you could attach a debugger, and
> "break munmap", "cont", and then show us the backtrace "bt" when that
> is reached?

gdb /usr/lib/postgresql/9.6/bin/postgres 21640
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/postgresql/9.6/bin/postgres...Reading symbols from /usr/lib/debug/.build-id/04/6f55a5ce6ce05064edfc8feee61c6cb039d296.debug...done.
done.
Attaching to program: /usr/lib/postgresql/9.6/bin/postgres, process 21640
Reading symbols from /usr/lib/x86_64-linux-gnu/libxml2.so.2...Reading symbols from /usr/lib/debug/.build-id/d3/57ce1dba1fab803eddf48922123ffd0a303676.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libpam.so.0...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libssl.so.1.0.0...Reading symbols from /usr/lib/debug/.build-id/ff/69ea60ebe05f2dd689d2b26fc85a73e5fbc3a0.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0...Reading symbols from /usr/lib/debug/.build-id/15/ffeb43278726b025f020862bf51302822a40ec.debug...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/librt-2.23.so...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libdl-2.23.so...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libm.so.6...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libm-2.23.so...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2...Reading symbols from /usr/lib/debug/.build-id/38/90d33727391e4a85dc0f819ab0aa29bb5dfc86.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libsystemd.so.0...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libc-2.23.so...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libicuuc.so.55...Reading symbols from /usr/lib/debug/.build-id/46/3d8b610702d64ae0803c7dfcaa02cfb4c6477b.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libz.so.1...Reading symbols from /usr/lib/debug/.build-id/8d/9bd4ce26e45ef16075c67d5f5eeafd8b562832.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/liblzma.so.5...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libaudit.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libkrb5.so.3...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libk5crypto.so.3...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libcom_err.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libcom_err.so.2.1...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libkrb5support.so.0...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...Reading symbols from /usr/lib/debug/.build-id/b1/7c21299099640a6d863e423d99265824e7bb16.debug...done.
done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/ld-2.23.so...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2...Reading symbols from /usr/lib/debug/.build-id/8e/613d0b8d8e3537785637424782be8502ababd2.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libresolv.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libresolv-2.23.so...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libsasl2.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libgssapi.so.3...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libgnutls.so.30...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libselinux.so.1...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libgcrypt.so.20...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libicudata.so.55...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libstdc++.so.6...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libgcc_s.so.1...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libgcc_s.so.1...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libkeyutils.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libheimntlm.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libkrb5.so.26...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libasn1.so.8...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libhcrypto.so.4...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libroken.so.18...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libp11-kit.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libidn.so.11...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libtasn1.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libnettle.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libhogweed.so.4...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libgmp.so.10...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libpcre.so.3...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libpcre.so.3.13.2...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libgpg-error.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libwind.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libheimbase.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libhx509.so.5...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libsqlite3.so.0...Reading symbols from /usr/lib/debug/.build-id/3b/0454e57467057071f7ad49651e0fa7b01cf5c7.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libcrypt.so.1...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libcrypt-2.23.so...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libffi.so.6...Reading symbols from /usr/lib/debug/.build-id/9d/9c958f1f4894afef6aecd90d1c430ea29ac34f.debug...done.
done.
Reading symbols from /usr/lib/postgresql/9.6/lib/auto_explain.so...Reading symbols from /usr/lib/debug/.build-id/94/ab76178c50b0e098f2bd0f3501d9cb6562c743.debug...done.
done.
Reading symbols from /usr/lib/postgresql/9.6/lib/pg_stat_statements.so...Reading symbols from /usr/lib/debug/.build-id/cf/f288800c22fd97059aaf8e425ae17e29fb88fb.debug...done.
done.
Reading symbols from /usr/lib/postgresql/9.6/lib/pglogical.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libpq.so.5...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libnss_files.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libnss_files-2.23.so...done.
done.
0x00007f3d093379f3 in __epoll_wait_nocancel () at ../sysdeps/unix/syscall-template.S:84
84 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) break munmap
Breakpoint 1 at 0x7f3d09331740: file ../sysdeps/unix/syscall-template.S, line 84.
(gdb) cont
Continuing.

Program received signal SIGUSR1, User defined signal 1.
hash_search_with_hash_value (hashp=0x5566701baa68, keyPtr=keyPtr@entry=0x7ffc37be9790,
    hashvalue=hashvalue@entry=1634369601, action=action@entry=HASH_FIND,
    foundPtr=foundPtr@entry=0x0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/utils/hash/dynahash.c:959
959 /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/utils/hash/dynahash.c: No such file or directory.
(gdb) bt
#0  hash_search_with_hash_value (hashp=0x5566701baa68, keyPtr=keyPtr@entry=0x7ffc37be9790,
    hashvalue=hashvalue@entry=1634369601, action=action@entry=HASH_FIND,
    foundPtr=foundPtr@entry=0x0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/utils/hash/dynahash.c:959
#1  0x000055666e1224ca in BufTableLookup (tagPtr=tagPtr@entry=0x7ffc37be9790,
    hashcode=hashcode@entry=1634369601)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/storage/buffer/buf_table.c:96
#2  0x000055666e12527d in BufferAlloc (foundPtr=0x7ffc37be978b "", strategy=0x556670360418,
    blockNum=53, forkNum=MAIN_FORKNUM, relpersistence=112 'p', smgr=0x5566702a5990)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/storage/buffer/bufmgr.c:1013
#3  ReadBuffer_common (smgr=0x5566702a5990, relpersistence=<optimized out>,
    forkNum=forkNum@entry=MAIN_FORKNUM, blockNum=blockNum@entry=53, mode=RBM_NORMAL,
    strategy=0x556670360418, hit=0x7ffc37be9837 "")
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/storage/buffer/bufmgr.c:745
#4  0x000055666e125b15 in ReadBufferExtended (reln=0x7f3d015e2670,
    forkNum=forkNum@entry=MAIN_FORKNUM, blockNum=blockNum@entry=53,
    mode=mode@entry=RBM_NORMAL, strategy=<optimized out>)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/storage/buffer/bufmgr.c:664
#5  0x000055666defc036 in heapgetpage (scan=scan@entry=0x5566703484f8, page=page@entry=53)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:375
#6  0x000055666defd5c2 in heapgettup_pagemode (key=0x0, nkeys=0, dir=ForwardScanDirection,
    scan=0x5566703484f8)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1036
#7  heap_getnext (scan=scan@entry=0x5566703484f8,
    direction=direction@entry=ForwardScanDirection)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1787
#8  0x000055666e053e21 in SeqNext (node=node@entry=0x556670328c48)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:80
#9  0x000055666e03d711 in ExecScanFetch (recheckMtd=0x55666e053de0 <SeqRecheck>,
    accessMtd=0x55666e053df0 <SeqNext>, node=0x556670328c48)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execScan.c:95
#10 ExecScan (node=node@entry=0x556670328c48,
    accessMtd=accessMtd@entry=0x55666e053df0 <SeqNext>,
    recheckMtd=recheckMtd@entry=0x55666e053de0 <SeqRecheck>)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execScan.c:180
#11 0x000055666e053ea8 in ExecSeqScan (node=node@entry=0x556670328c48)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:127
---Type <return> to continue, or q <return> to quit---


>
>
>
>
> --
> Thomas Munro
> https://enterprisedb.com
>
>
>

--
Jerry Sievers
Postgres DBA/Development Consulting
e: [hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Thomas Munro-5
On Wed, Jul 17, 2019 at 12:05 PM Jerry Sievers <[hidden email]> wrote:
> Program received signal SIGUSR1, User defined signal 1.

Oh, we need to ignore those pesky signals with "handle SIGUSR1 noprint nostop".

--
Thomas Munro
https://enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Jerry Sievers-3
Thomas Munro <[hidden email]> writes:

> On Wed, Jul 17, 2019 at 12:05 PM Jerry Sievers <[hidden email]> wrote:
>
>> Program received signal SIGUSR1, User defined signal 1.
>
> Oh, we need to ignore those pesky signals with "handle SIGUSR1 noprint nostop".

Is this the right sequencing?

1. Start client and get backend pid
2. GDB;  handle SIGUSR1, break, cont
3. Run query
4. bt

Thanks

Don't think I am doing this correctly.  Please advise.

handle SIGUSR1 noprint nostop
Signal        Stop Print Pass to program Description
SIGUSR1       No No Yes User defined signal 1
(gdb) break munmap
Breakpoint 1 at 0x7f3d09331740: file ../sysdeps/unix/syscall-template.S, line 84.
(gdb) cont
Continuing.

Breakpoint 1, munmap () at ../sysdeps/unix/syscall-template.S:84
84 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0  munmap () at ../sysdeps/unix/syscall-template.S:84
#1  0x000055666e12d7f4 in dsm_impl_posix (impl_private=0x22, elevel=19,
    mapped_size=0x556670205890, mapped_address=0x556670205888, request_size=0,
    handle=<optimized out>, op=DSM_OP_DETACH)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/storage/ipc/dsm_impl.c:259
#2  dsm_impl_op (op=op@entry=DSM_OP_DETACH, handle=<optimized out>,
    request_size=request_size@entry=0, impl_private=impl_private@entry=0x556670205880,
    mapped_address=mapped_address@entry=0x556670205888,
    mapped_size=mapped_size@entry=0x556670205890, elevel=19)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/storage/ipc/dsm_impl.c:176
#3  0x000055666e12efb1 in dsm_detach (seg=0x556670205860)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/storage/ipc/dsm.c:738
#4  0x000055666df31369 in DestroyParallelContext (pcxt=0x556670219b68)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/transam/parallel.c:750
#5  0x000055666e0357bb in ExecParallelCleanup (pei=0x7f3d012218b0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execParallel.c:575
#6  0x000055666e047ca2 in ExecShutdownGather (node=node@entry=0x55667033bed0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:443
#7  0x000055666e0359f5 in ExecShutdownNode (node=0x55667033bed0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:820
#8  0x000055666e0777e1 in planstate_tree_walker (planstate=0x55667033b2b0,
    walker=0x55666e0359a0 <ExecShutdownNode>, context=0x0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/nodes/nodeFuncs.c:3636
#9  0x000055666e0777e1 in planstate_tree_walker (planstate=0x55667033b040,
    walker=0x55666e0359a0 <ExecShutdownNode>, context=0x0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/nodes/nodeFuncs.c:3636
#10 0x000055666e0777e1 in planstate_tree_walker (planstate=planstate@entry=0x55667033a6c8,
    walker=walker@entry=0x55666e0359a0 <ExecShutdownNode>, context=context@entry=0x0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/nodes/nodeFuncs.c:3636
#11 0x000055666e0359df in ExecShutdownNode (node=node@entry=0x55667033a6c8)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:830
#12 0x000055666e04d0ff in ExecLimit (node=node@entry=0x55667033a428)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeLimit.c:139
#13 0x000055666e035d28 in ExecProcNode (node=node@entry=0x55667033a428)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:531
#14 0x000055666e051f69 in ExecNestLoop (node=node@entry=0x55667031c660)
---Type <return> to continue, or q <return> to quit---    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeNestloop.c:174
#15 0x000055666e035e28 in ExecProcNode (node=node@entry=0x55667031c660)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:476
#16 0x000055666e054989 in ExecSort (node=node@entry=0x55667031c3f0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSort.c:103
#17 0x000055666e035de8 in ExecProcNode (node=0x55667031c3f0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:495
#18 0x000055666e041fe9 in fetch_input_tuple (aggstate=aggstate@entry=0x55667031ba18)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeAgg.c:598
#19 0x000055666e043bb3 in agg_retrieve_direct (aggstate=0x55667031ba18)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeAgg.c:2078
#20 ExecAgg (node=node@entry=0x55667031ba18)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeAgg.c:1903
#21 0x000055666e035dc8 in ExecProcNode (node=node@entry=0x55667031ba18)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:503
#22 0x000055666e031f2e in ExecutePlan (dest=0x7f3d01277aa8, direction=<optimized out>,
    numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT,
    use_parallel_mode=<optimized out>, planstate=0x55667031ba18, estate=0x55667031b878)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execMain.c:1567
#23 standard_ExecutorRun (queryDesc=0x556670320a78, direction=<optimized out>, count=0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execMain.c:339
#24 0x00007f3d01cd0515 in explain_ExecutorRun (queryDesc=0x556670320a78,
    direction=ForwardScanDirection, count=0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../contrib/auto_explain/auto_explain.c:281
#25 0x00007f3d01ac8db0 in pgss_ExecutorRun (queryDesc=0x556670320a78,
    direction=ForwardScanDirection, count=0)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../contrib/pg_stat_statements/pg_stat_statements.c:875
#26 0x000055666e155167 in PortalRunSelect (portal=portal@entry=0x5566701d6df8,
    forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807,
    dest=dest@entry=0x7f3d01277aa8)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/tcop/pquery.c:948
#27 0x000055666e1567a0 in PortalRun (portal=portal@entry=0x5566701d6df8,
    count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001',
    dest=dest@entry=0x7f3d01277aa8, altdest=altdest@entry=0x7f3d01277aa8,
    completionTag=completionTag@entry=0x7ffc37bea670 "")
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/tcop/pquery.c:---Type <return> to continue, or q <return> to quit---
789
#28 0x000055666e1532d6 in exec_simple_query (
    query_string=0x5566702a4c68 "select v.account_id, COUNT(cnt.clicks), te.description,\nl.product_id\nfrom nbox_nc_ah.tracking_events te\njoin nbox_nc_ah.page_views pv on pv.page_view_id = te.page_view_id\njoin nbox_nc_ah.visits v on v"...)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/tcop/postgres.c:1109
#29 PostgresMain (argc=<optimized out>, argv=argv@entry=0x556670204630,
    dbname=0x5566701bab88 "staging", username=<optimized out>)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/tcop/postgres.c:4101
#30 0x000055666dec8a1b in BackendRun (port=0x5566701fd500)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/postmaster/postmaster.c:4339
#31 BackendStartup (port=0x5566701fd500)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/postmaster/postmaster.c:4013
#32 ServerLoop ()
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/postmaster/postmaster.c:1722
#33 0x000055666e0ef281 in PostmasterMain (argc=13, argv=<optimized out>)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/postmaster/postmaster.c:1330
#34 0x000055666dec9bf1 in main (argc=13, argv=0x5566701b8840)
    at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/main/main.c:228
(gdb)
(gdb) quit
A debugging session is active.

        Inferior 1 [process 32291] will be detached.

Quit anyway? (y or n) y
Detaching from program: /usr/lib/postgresql/9.6/bin/postgres, process 32291
root@pgdev01:/home/jsievers#



--
Jerry Sievers
Postgres DBA/Development Consulting
e: [hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Thomas Munro-5
On Wed, Jul 17, 2019 at 12:26 PM Jerry Sievers <[hidden email]> wrote:
> Is this the right sequencing?
>
> 1. Start client and get backend pid
> 2. GDB;  handle SIGUSR1, break, cont
> 3. Run query
> 4. bt

Perfect, thanks.  I think I just spotted something:

> #11 0x000055666e0359df in ExecShutdownNode (node=node@entry=0x55667033a6c8)
>     at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:830
> #12 0x000055666e04d0ff in ExecLimit (node=node@entry=0x55667033a428)
>     at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeLimit.c:139

https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/backend/executor/nodeLimit.c#L139

Limit thinks it's OK to "shut down" the subtree, but if you shut down a
Gather node you can't rescan it later because it destroys its shared
memory.  Oops.  Not sure what to do about that yet.


--
Thomas Munro
https://enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Thomas Munro-5
On Wed, Jul 17, 2019 at 12:44 PM Thomas Munro <[hidden email]> wrote:

> > #11 0x000055666e0359df in ExecShutdownNode (node=node@entry=0x55667033a6c8)
> >     at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:830
> > #12 0x000055666e04d0ff in ExecLimit (node=node@entry=0x55667033a428)
> >     at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeLimit.c:139
>
> https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/backend/executor/nodeLimit.c#L139
>
> Limit thinks it's OK to "shut down" the subtree, but if you shut down a
> Gather node you can't rescan it later because it destroys its shared
> memory.  Oops.  Not sure what to do about that yet.

CCing Amit and Robert, authors of commits 19df1702 and 69de1718.

--
Thomas Munro
https://enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Jerry Sievers-3
In reply to this post by Thomas Munro-5
Thomas Munro <[hidden email]> writes:

> On Wed, Jul 17, 2019 at 12:26 PM Jerry Sievers <[hidden email]> wrote:
>
>> Is this the right sequencing?
>>
>> 1. Start client and get backend pid
>> 2. GDB;  handle SIGUSR1, break, cont
>> 3. Run query
>> 4. bt
>
> Perfect, thanks.  I think I just spotted something:

Dig that!  Great big thanks to you and Tomas, et al for jumping on this.

Please let know if there's anything else I can submit that would be
helpful.


>
>> #11 0x000055666e0359df in ExecShutdownNode (node=node@entry=0x55667033a6c8)
>>     at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:830
>> #12 0x000055666e04d0ff in ExecLimit (node=node@entry=0x55667033a428)
>>     at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeLimit.c:139
>
> https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/backend/executor/nodeLimit.c#L139
>
> Limit thinks it's OK to "shut down" the subtree, but if you shut down a
> Gather node you can't rescan it later because it destroys its shared
> memory.  Oops.  Not sure what to do about that yet.
>
>
> --
> Thomas Munro
> https://enterprisedb.com
>
>
>

--
Jerry Sievers
Postgres DBA/Development Consulting
e: [hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: SegFault on 9.6.14

Thomas Munro-5
In reply to this post by Thomas Munro-5
On Wed, Jul 17, 2019 at 12:57 PM Thomas Munro <[hidden email]> wrote:

> On Wed, Jul 17, 2019 at 12:44 PM Thomas Munro <[hidden email]> wrote:
> > > #11 0x000055666e0359df in ExecShutdownNode (node=node@entry=0x55667033a6c8)
> > >     at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execProcnode.c:830
> > > #12 0x000055666e04d0ff in ExecLimit (node=node@entry=0x55667033a428)
> > >     at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeLimit.c:139
> >
> > https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/backend/executor/nodeLimit.c#L139
> >
> > Limit thinks it's OK to "shut down" the subtree, but if you shut down a
> > Gather node you can't rescan it later because it destroys its shared
> > memory.  Oops.  Not sure what to do about that yet.
>
> CCing Amit and Robert, authors of commits 19df1702 and 69de1718.

Here's a repro (I'm sure you can find a shorter one, this one's hacked
up from join_hash.sql, basically just adding LIMIT):

create table join_foo as select generate_series(1, 3000) as id,
'xxxxx'::text as t;
alter table join_foo set (parallel_workers = 0);
create table join_bar as select generate_series(0, 10000) as id,
'xxxxx'::text as t;
alter table join_bar set (parallel_workers = 2);

set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
set max_parallel_workers_per_gather = 2;
set enable_material = off;
set enable_mergejoin = off;
set work_mem = '1GB';

select count(*) from join_foo
    left join (select b1.id, b1.t from join_bar b1 join join_bar b2
using (id) limit 1000) ss
    on join_foo.id < ss.id + 1 and join_foo.id > ss.id - 1;

--
Thomas Munro
https://enterprisedb.com


123