Deadlock in XLogInsert at AIX

Deadlock in XLogInsert at AIX

konstantin knizhnik
Hi Hackers,

We are running Postgres on AIX and have encountered two strange problems: active zombie processes, and a deadlock in the XLOG writer.
I will describe the first problem in a separate mail; here I am mostly concerned with the deadlock.
It reproduces irregularly with a standard pgbench run using 100 connections.

It sometimes happens with the 9.6 stable version of Postgres, but only when it is compiled with the xlc compiler.
We failed to reproduce the problem with GCC. So it looks like a bug in the compiler or in the xlc-specific atomics implementation...
But there are a few observations that contradict this hypothesis:

1. The problem is reproduced with Postgres built without optimization. Compiler bugs usually affect only optimized code.
2. Disabling atomics doesn't help.
3. Without optimization and with LOCK_DEBUG defined, the time needed to reproduce the problem increases significantly. With optimized code it is almost always reproduced within a few minutes;
with the debug build it usually takes much longer.

But the most confusing thing is the stack trace:

(dbx) where
semop(??, ??, ??) at 0x9000000001f5790
PGSemaphoreLock(sema = 0x0a00000044b95928), line 387 in "pg_sema.c"
unnamed block in LWLockWaitForVar(lock = 0x0a0000000000d980, valptr = 0x0a0000000000d9a8, oldval = 102067874256, newval = 0x0fffffffffff9c10), line 1666 in "lwlock.c"
LWLockWaitForVar(lock = 0x0a0000000000d980, valptr = 0x0a0000000000d9a8, oldval = 102067874256, newval = 0x0fffffffffff9c10), line 1666 in "lwlock.c"
unnamed block in WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"
WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"
AdvanceXLInsertBuffer(upto = 102067874256, opportunistic = '\0'), line 1916 in "xlog.c"
unnamed block in GetXLogBuffer(ptr = 102067874256), line 1697 in "xlog.c"
GetXLogBuffer(ptr = 102067874256), line 1697 in "xlog.c"
CopyXLogRecordToWAL(write_len = 70, isLogSwitch = '\0', rdata = 0x000000011007ce10, StartPos = 102067874256, EndPos = 102067874328), line 1279 in "xlog.c"
XLogInsertRecord(rdata = 0x000000011007ce10, fpw_lsn = 102067718328), line 1011 in "xlog.c"
unnamed block in XLogInsert(rmid = '\n', info = '@'), line 453 in "xloginsert.c"
XLogInsert(rmid = '\n', info = '@'), line 453 in "xloginsert.c"
log_heap_update(reln = 0x0000000110273540, oldbuf = 40544, newbuf = 40544, oldtup = 0x0fffffffffffa2a0, newtup = 0x00000001102bb958, old_key_tuple = (nil), all_visible_cleared = '\0', new_all_visible_cleared = '\0'), line 7708 in "heapam.c"
unnamed block in heap_update(relation = 0x0000000110273540, otid = 0x0fffffffffffa6f8, newtup = 0x00000001102bb958, cid = 1, crosscheck = (nil), wait = '^A', hufd = 0x0fffffffffffa5b0, lockmode = 0x0fffffffffffa5c8), line 4212 in "heapam.c"
heap_update(relation = 0x0000000110273540, otid = 0x0fffffffffffa6f8, newtup = 0x00000001102bb958, cid = 1, crosscheck = (nil), wait = '^A', hufd = 0x0fffffffffffa5b0, lockmode = 0x0fffffffffffa5c8), line 4212 in "heapam.c"
unnamed block in ExecUpdate(tupleid = 0x0fffffffffffa6f8, oldtuple = (nil), slot = 0x00000001102bb308, planSlot = 0x00000001102b4630, epqstate = 0x00000001102b2cd8, estate = 0x00000001102b29e0, canSetTag = '^A'), line 937 in "nodeModifyTable.c"
ExecUpdate(tupleid = 0x0fffffffffffa6f8, oldtuple = (nil), slot = 0x00000001102bb308, planSlot = 0x00000001102b4630, epqstate = 0x00000001102b2cd8, estate = 0x00000001102b29e0, canSetTag = '^A'), line 937 in "nodeModifyTable.c"
ExecModifyTable(node = 0x00000001102b2c30), line 1516 in "nodeModifyTable.c"
ExecProcNode(node = 0x00000001102b2c30), line 396 in "execProcnode.c"
ExecutePlan(estate = 0x00000001102b29e0, planstate = 0x00000001102b2c30, use_parallel_mode = '\0', operation = CMD_UPDATE, sendTuples = '\0', numberTuples = 0, direction = ForwardScanDirection, dest = 0x00000001102b7520), line 1569 in "execMain.c"
standard_ExecutorRun(queryDesc = 0x00000001102b25c0, direction = ForwardScanDirection, count = 0), line 338 in "execMain.c"
ExecutorRun(queryDesc = 0x00000001102b25c0, direction = ForwardScanDirection, count = 0), line 286 in "execMain.c"
ProcessQuery(plan = 0x00000001102b1510, sourceText = "UPDATE pgbench_tellers SET tbalance = tbalance + 4019 WHERE tid = 6409;", params = (nil), dest = 0x00000001102b7520, completionTag = ""), line 187 in "pquery.c"
unnamed block in PortalRunMulti(portal = 0x0000000110115e20, isTopLevel = '^A', setHoldSnapshot = '\0', dest = 0x00000001102b7520, altdest = 0x00000001102b7520, completionTag = ""), line 1303 in "pquery.c"
unnamed block in PortalRunMulti(portal = 0x0000000110115e20, isTopLevel = '^A', setHoldSnapshot = '\0', dest = 0x00000001102b7520, altdest = 0x00000001102b7520, completionTag = ""), line 1303 in "pquery.c"
PortalRunMulti(portal = 0x0000000110115e20, isTopLevel = '^A', setHoldSnapshot = '\0', dest = 0x00000001102b7520, altdest = 0x00000001102b7520, completionTag = ""), line 1303 in "pquery.c"
unnamed block in PortalRun(portal = 0x0000000110115e20, count = 9223372036854775807, isTopLevel = '^A', dest = 0x00000001102b7520, altdest = 0x00000001102b7520, completionTag = ""), line 815 in "pquery.c"
PortalRun(portal = 0x0000000110115e20, count = 9223372036854775807, isTopLevel = '^A', dest = 0x00000001102b7520, altdest = 0x00000001102b7520, completionTag = ""), line 815 in "pquery.c"
unnamed block in exec_simple_query(query_string = "UPDATE pgbench_tellers SET tbalance = tbalance + 4019 WHERE tid = 6409;"), line 1094 in "postgres.c"
exec_simple_query(query_string = "UPDATE pgbench_tellers SET tbalance = tbalance + 4019 WHERE tid = 6409;"), line 1094 in "postgres.c"
unnamed block in PostgresMain(argc = 1, argv = 0x0000000110119f68, dbname = "postgres", username = "postgres"), line 4076 in "postgres.c"
PostgresMain(argc = 1, argv = 0x0000000110119f68, dbname = "postgres", username = "postgres"), line 4076 in "postgres.c"
BackendRun(port = 0x0000000110114290), line 4279 in "postmaster.c"
BackendStartup(port = 0x0000000110114290), line 3953 in "postmaster.c"
unnamed block in ServerLoop(), line 1701 in "postmaster.c"
unnamed block in ServerLoop(), line 1701 in "postmaster.c"
unnamed block in ServerLoop(), line 1701 in "postmaster.c"
ServerLoop(), line 1701 in "postmaster.c"
PostmasterMain(argc = 3, argv = 0x00000001100c6190), line 1309 in "postmaster.c"
main(argc = 3, argv = 0x00000001100c6190), line 228 in "main.c"


As I already mentioned, we built Postgres with LOCK_DEBUG, so we can inspect the lock owner. The backend is waiting for itself!
Now please look at two frames in this stack trace: XLogInsertRecord and WaitXLogInsertionsToFinish.
XLogInsertRecord takes the WAL insert locks at the beginning of the function:

    if (isLogSwitch)
        WALInsertLockAcquireExclusive();
    else
        WALInsertLockAcquire();

WALInsertLockAcquire just selects one item from the WALInsertLocks array (semi-randomly, based on the backend's pgprocno) and locks it exclusively:

    if (lockToTry == -1)
        lockToTry = MyProc->pgprocno % NUM_XLOGINSERT_LOCKS;
    MyLockNo = lockToTry;
    immed = LWLockAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE);

Then, following the stack trace, AdvanceXLInsertBuffer calls WaitXLogInsertionsToFinish:

            /*
             * Now that we have an up-to-date LogwrtResult value, see if we
             * still need to write it or if someone else already did.
             */
            if (LogwrtResult.Write < OldPageRqstPtr)
            {
                /*
                 * Must acquire write lock. Release WALBufMappingLock first,
                 * to make sure that all insertions that we need to wait for
                 * can finish (up to this same position). Otherwise we risk
                 * deadlock.
                 */
                LWLockRelease(WALBufMappingLock);

                WaitXLogInsertionsToFinish(OldPageRqstPtr);

                LWLockAcquire(WALWriteLock, LW_EXCLUSIVE);


It releases WALBufMappingLock, but not the WAL insert locks!
Finally, WaitXLogInsertionsToFinish tries to wait on all the insert locks:

    for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
    {
        XLogRecPtr    insertingat = InvalidXLogRecPtr;

        do
        {
            /*
             * See if this insertion is in progress. LWLockWait will wait for
             * the lock to be released, or for the 'value' to be set by a
             * LWLockUpdateVar call.  When a lock is initially acquired, its
             * value is 0 (InvalidXLogRecPtr), which means that we don't know
             * where it's inserting yet.  We will have to wait for it.  If
             * it's a small insertion, the record will most likely fit on the
             * same page and the inserter will release the lock without ever
             * calling LWLockUpdateVar.  But if it has to sleep, it will
             * advertise the insertion point with LWLockUpdateVar before
             * sleeping.
             */
            if (LWLockWaitForVar(&WALInsertLocks[i].l.lock,
                                 &WALInsertLocks[i].l.insertingAt,
                                 insertingat, &insertingat))

And here we get stuck!
The comment on WaitXLogInsertionsToFinish says:

 * Note: When you are about to write out WAL, you must call this function
 * *before* acquiring WALWriteLock, to avoid deadlocks. This function might
 * need to wait for an insertion to finish (or at least advance to next
 * uninitialized page), and the inserter might need to evict an old WAL buffer
 * to make room for a new one, which in turn requires WALWriteLock.

This contradicts the observed stack trace.

I wonder if this is really a synchronization bug in xlog.c, or whether something is wrong with this stack trace and it cannot happen during normal operation?

Thanks in advance,
-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Deadlock in XLogInsert at AIX

konstantin knizhnik
More information about the problem - the Postgres log contains several records:

2017-01-24 19:15:20.272 MSK [19270462] LOG:  request to flush past end of generated WAL; request 6/AAEBE000, currpos 6/AAEBC2B0

and they correspond to the times when the deadlock happens.
There is the following comment in xlog.c concerning this message:

    /*
     * No-one should request to flush a piece of WAL that hasn't even been
     * reserved yet. However, it can happen if there is a block with a bogus
     * LSN on disk, for example. XLogFlush checks for that situation and
     * complains, but only after the flush. Here we just assume that to mean
     * that all WAL that has been reserved needs to be finished. In this
     * corner-case, the return value can be smaller than 'upto' argument.
     */

So it looks like this should not happen.
The first thing to suspect is the spinlock implementation, which is different for GCC and XLC.
But... if I rebuild Postgres without spinlocks, the problem is still reproduced.

On 24.01.2017 17:47, Konstantin Knizhnik wrote:
> [...]

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Deadlock in XLogInsert at AIX

Bernd Helmle
Hi Konstantin,

We have observed exactly the same issue on a customer system with the
same environment and PostgreSQL 9.5.5. Additionally, we've tested on
Linux with XL/C 12 and 13, with exactly the same deadlock behavior.

So we assumed that this is somehow a compiler issue.

On Tuesday, 24.01.2017 at 19:26 +0300, Konstantin Knizhnik wrote:
> More information about the problem - Postgres log contains several
> records:
>
> 2017-01-24 19:15:20.272 MSK [19270462] LOG:  request to flush past end
> of generated WAL; request 6/AAEBE000, currpos 6/AAEBC2B0
>
> and they correspond to the times when the deadlock happens.

Yeah, the same logs here:

LOG:  request to flush past end of generated WAL; request 1/1F4C6000,
currpos 1/1F4C40E0
STATEMENT:  UPDATE pgbench_accounts SET abalance = abalance + -2653
WHERE aid = 3662494;


> There is the following comment in xlog.c concerning this message:
>
>      /*
>       * No-one should request to flush a piece of WAL that hasn't even been
>       * reserved yet. However, it can happen if there is a block with a bogus
>       * LSN on disk, for example. XLogFlush checks for that situation and
>       * complains, but only after the flush. Here we just assume that to mean
>       * that all WAL that has been reserved needs to be finished. In this
>       * corner-case, the return value can be smaller than 'upto' argument.
>       */
>
> So looks like it should not happen.
> The first thing to suspect is spinlock implementation which is
> different 
> for GCC and XLC.
> But ... if I rebuild Postgres without spinlocks, then the problem is 
> still reproduced.

Before we got the results from XLC on Linux (where Postgres shows the
same behavior) I had a look into the spinlock implementation. If I got
it right, XLC doesn't use the ppc64-specific barriers, but the fallback
implementation (system monitoring on AIX has also shown massive numbers
of signal(0) calls...). So I tried the following patch:

diff --git a/src/include/port/atomics/arch-ppc.h b/src/include/port/atomics/arch-ppc.h
new file mode 100644
index f901a0c..028cced
*** a/src/include/port/atomics/arch-ppc.h
--- b/src/include/port/atomics/arch-ppc.h
***************
*** 23,26 ****
--- 23,33 ----
  #define pg_memory_barrier_impl()	__asm__ __volatile__ ("sync" : : : "memory")
  #define pg_read_barrier_impl()		__asm__ __volatile__ ("lwsync" : : : "memory")
  #define pg_write_barrier_impl()		__asm__ __volatile__ ("lwsync" : : : "memory")
+ 
+ #elif defined(__IBMC__) || defined(__IBMCPP__)
+ 
+ #define pg_memory_barrier_impl()	__asm__ __volatile__ (" sync \n" ::: "memory")
+ #define pg_read_barrier_impl()		__asm__ __volatile__ (" lwsync \n" ::: "memory")
+ #define pg_write_barrier_impl()	__asm__ __volatile__ (" lwsync \n" ::: "memory")
+ 
  #endif

This didn't change the picture, though.



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Deadlock in XLogInsert at AIX

Heikki Linnakangas
In reply to this post by konstantin knizhnik
On 01/24/2017 04:47 PM, Konstantin Knizhnik wrote:

> As I already mentioned, we built Postgres with LOCK_DEBUG , so we can
> inspect lock owner. Backend is waiting for itself!
> Now please look at two frames in this stack trace marked with red.
> XLogInsertRecord is setting WALInsert locks at the beginning of the
> function:
>
>      if (isLogSwitch)
>          WALInsertLockAcquireExclusive();
>      else
>          WALInsertLockAcquire();
>
> WALInsertLockAcquire just selects random item from WALInsertLocks array
> and exclusively locks:
>
>      if (lockToTry == -1)
>          lockToTry = MyProc->pgprocno % NUM_XLOGINSERT_LOCKS;
>      MyLockNo = lockToTry;
>      immed = LWLockAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE);
>
> Then, following the stack trace, AdvanceXLInsertBuffer calls
> WaitXLogInsertionsToFinish:
>
>              /*
>               * Now that we have an up-to-date LogwrtResult value, see if we
>               * still need to write it or if someone else already did.
>               */
>              if (LogwrtResult.Write < OldPageRqstPtr)
>              {
>                  /*
>                   * Must acquire write lock. Release WALBufMappingLock
> first,
>                   * to make sure that all insertions that we need to
> wait for
>                   * can finish (up to this same position). Otherwise we risk
>                   * deadlock.
>                   */
>                  LWLockRelease(WALBufMappingLock);
>
> WaitXLogInsertionsToFinish(OldPageRqstPtr);
>
>                  LWLockAcquire(WALWriteLock, LW_EXCLUSIVE);
>
>
> It releases WALBufMappingLock but not WAL insert locks!
> Finally in WaitXLogInsertionsToFinish tries to wait for all locks:
>
>      for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
>      {
>          XLogRecPtr    insertingat = InvalidXLogRecPtr;
>
>          do
>          {
>              /*
>               * See if this insertion is in progress. LWLockWait will
> wait for
>               * the lock to be released, or for the 'value' to be set by a
>               * LWLockUpdateVar call.  When a lock is initially
> acquired, its
>               * value is 0 (InvalidXLogRecPtr), which means that we
> don't know
>               * where it's inserting yet.  We will have to wait for it.  If
>               * it's a small insertion, the record will most likely fit
> on the
>               * same page and the inserter will release the lock without
> ever
>               * calling LWLockUpdateVar.  But if it has to sleep, it will
>               * advertise the insertion point with LWLockUpdateVar before
>               * sleeping.
>               */
>              if (LWLockWaitForVar(&WALInsertLocks[i].l.lock,
>   &WALInsertLocks[i].l.insertingAt,
>                                   insertingat, &insertingat))
>
> And here we stuck!
Interesting.. What should happen here is that for the backend's own
insertion slot, the "insertingat" value should be greater than the
requested flush point ('upto' variable). That's because before
GetXLogBuffer() calls AdvanceXLInsertBuffer(), it updates the backend's
insertingat value, to the position that it wants to insert to. And
AdvanceXLInsertBuffer() only calls WaitXLogInsertionsToFinish() with
value smaller than what was passed as the 'upto' argument.

> The comment to WaitXLogInsertionsToFinish says:
>
>   * Note: When you are about to write out WAL, you must call this function
>   * *before* acquiring WALWriteLock, to avoid deadlocks. This function might
>   * need to wait for an insertion to finish (or at least advance to next
>   * uninitialized page), and the inserter might need to evict an old WAL
> buffer
>   * to make room for a new one, which in turn requires WALWriteLock.
>
> Which contradicts to the observed stack trace.
Not AFAICS. In the stack trace you showed, the backend is not holding
WALWriteLock. It would only acquire it after the
WaitXLogInsertionsToFinish() call finished.

> I wonder if it is really synchronization bug in xlog.c or there is
> something wrong in this stack trace and it can not happen in case of
> normal work?

Yeah, hard to tell. Something is clearly wrong..

This line in the stack trace is suspicious:

> WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"

AdvanceXLInsertBuffer() should only ever call
WaitXLogInsertionsToFinish() with an xlog position that points to a page
boundary, but that upto value points to the middle of a page.

Perhaps the value stored in the stack trace is not what the caller
passed, but was updated because it was past the 'reserveUpto' value?
That would explain the "request to flush past end of generated WAL"
notices you saw in the log. Now, why that would happen, I have no idea.

If you can and want to provide me access to the system, I could have a
look myself. I'd also like to see if the attached additional Assertions
will fire.

- Heikki




extra-asserts-in-AdvanceXLInsertBuffer.patch (1K)

Re: Deadlock in XLogInsert at AIX

konstantin knizhnik

On 30.01.2017 19:21, Heikki Linnakangas wrote:

> On 01/24/2017 04:47 PM, Konstantin Knizhnik wrote:
> Interesting.. What should happen here is that for the backend's own
> insertion slot, the "insertingat" value should be greater than the
> requested flush point ('upto' variable). That's because before
> GetXLogBuffer() calls AdvanceXLInsertBuffer(), it updates the
> backend's insertingat value, to the position that it wants to insert
> to. And AdvanceXLInsertBuffer() only calls
> WaitXLogInsertionsToFinish() with value smaller than what was passed
> as the 'upto' argument.
>
>> The comment to WaitXLogInsertionsToFinish says:
>>
>>   * Note: When you are about to write out WAL, you must call this
>> function
>>   * *before* acquiring WALWriteLock, to avoid deadlocks. This
>> function might
>>   * need to wait for an insertion to finish (or at least advance to next
>>   * uninitialized page), and the inserter might need to evict an old WAL
>> buffer
>>   * to make room for a new one, which in turn requires WALWriteLock.
>>
>> Which contradicts to the observed stack trace.
>
> Not AFAICS. In the stack trace you showed, the backend is not holding
> WALWriteLock. It would only acquire it after the
> WaitXLogInsertionsToFinish() call finished.
>
>

Hmmm, maybe I missed something.
I am not talking about WALBufMappingLock, which is re-acquired after
returning from WaitXLogInsertionsToFinish, but about the lock obtained
by WALInsertLockAcquire at line 946 in XLogInsertRecord.
It is released at line 1021 by WALInsertLockRelease(), so
CopyXLogRecordToWAL is invoked with this lock held.


> This line in the stack trace is suspicious:
>
>> WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"
>
> AdvanceXLInsertBuffer() should only ever call
> WaitXLogInsertionsToFinish() with an xlog position that points to a
> page bounary, but that upto value points to the middle of a page.
>
> Perhaps the value stored in the stack trace is not what the caller
> passed, but it was updated because it was past the 'reserveUpto'
> value? That would explain the "request to flush past end
> of generated WAL" notices you saw in the log. Now, why would that
> happen, I have no idea.
>
> If you can and want to provide me access to the system, I could have a
> look myself. I'd also like to see if the attached additional
> Assertions will fire.

I really do get this assertion failure:

ExceptionalCondition(conditionName = "!(OldPageRqstPtr <= upto ||
opportunistic)", errorType = "FailedAssertion", fileName = "xlog.c",
lineNumber = 1917), line 54 in "assert.c"
(dbx) up
unnamed block in AdvanceXLInsertBuffer(upto = 147439056632,
opportunistic = '\0'), line 1917 in "xlog.c"
(dbx) p OldPageRqstPtr
147439058944
(dbx) p upto
147439056632
(dbx) p opportunistic
'\0'

Also , in another run, I encountered yet another assertion failure:

ExceptionalCondition(conditionName = "!((((NewPageBeginPtr) / 8192) %
(XLogCtl->XLogCacheBlck + 1)) == nextidx)", errorType =
"FailedAssertion", fileName = "xlog.c", lineNumber = 1950), line 54 in
"assert.c"

nextidx equals 1456, while the expected value is 1457.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Deadlock in XLogInsert at AIX

konstantin knizhnik
One more assertion failure:


ExceptionalCondition(conditionName = "!(OldPageRqstPtr <= XLogCtl->InitializedUpTo)", errorType = "FailedAssertion", fileName = "xlog.c", lineNumber = 1887), line 54 in "assert.c"

(dbx) p OldPageRqstPtr
153551667200
(dbx) p XLogCtl->InitializedUpTo
153551667200
(dbx) p InitializedUpTo
153551659008

I slightly modified the xlog.c code to store the value of XLogCtl->InitializedUpTo in a local variable:


 1870         LWLockAcquire(WALBufMappingLock, LW_EXCLUSIVE);
 1871
 1872         /*
 1873          * Now that we have the lock, check if someone initialized the page
 1874          * already.
 1875          */
 1876         while (upto >= XLogCtl->InitializedUpTo || opportunistic)
 1877         {
 1878                 XLogRecPtr InitializedUpTo = XLogCtl->InitializedUpTo;
 1879                 nextidx = XLogRecPtrToBufIdx(InitializedUpTo);
 1880
 1881                 /*
 1882                  * Get ending-offset of the buffer page we need to replace (this may
 1883                  * be zero if the buffer hasn't been used yet).  Fall through if it's
 1884                  * already written out.
 1885                  */
 1886                 OldPageRqstPtr = XLogCtl->xlblocks[nextidx];
 1887                 Assert(OldPageRqstPtr <= XLogCtl->InitializedUpTo);


And, as you can see, XLogCtl->InitializedUpTo is not equal to the saved value InitializedUpTo.
But we are under the exclusive WALBufMappingLock, and InitializedUpTo is updated only under this lock.
So it means that LW-locks don't work!
I inspected the code of pg_atomic_compare_exchange_u32_impl and didn't find sync in the prologue:

(dbx) listi pg_atomic_compare_exchange_u32_impl
0x1000817bc (pg_atomic_compare_exchange_u32_impl+0x1c)  e88100b0             ld   r4,0xb0(r1)
0x1000817c0 (pg_atomic_compare_exchange_u32_impl+0x20)  e86100b8             ld   r3,0xb8(r1)
0x1000817c4 (pg_atomic_compare_exchange_u32_impl+0x24)  800100c0            lwz   r0,0xc0(r1)
0x1000817c8 (pg_atomic_compare_exchange_u32_impl+0x28)  7c0007b4          extsw   r0,r0
0x1000817cc (pg_atomic_compare_exchange_u32_impl+0x2c)  e8a30002            lwa   r5,0x0(r3)
0x1000817d0 (pg_atomic_compare_exchange_u32_impl+0x30)  7cc02028          lwarx   r6,r0,r4,0x0
0x1000817d4 (pg_atomic_compare_exchange_u32_impl+0x34)  7c053040           cmpl   cr0,0x0,r5,r6
0x1000817d8 (pg_atomic_compare_exchange_u32_impl+0x38)  4082000c            bne   0x1000817e4 (pg_atomic_compare_exchange_u32_impl+0x44)
0x1000817dc (pg_atomic_compare_exchange_u32_impl+0x3c)  7c00212d         stwcx.   r0,r0,r4
0x1000817e0 (pg_atomic_compare_exchange_u32_impl+0x40)  40e2fff0           bne+   0x1000817d0 (pg_atomic_compare_exchange_u32_impl+0x30)
0x1000817e4 (pg_atomic_compare_exchange_u32_impl+0x44)  60c00000            ori   r0,r6,0x0
0x1000817e8 (pg_atomic_compare_exchange_u32_impl+0x48)  90030000            stw   r0,0x0(r3)
0x1000817ec (pg_atomic_compare_exchange_u32_impl+0x4c)  7c000026           mfcr   r0
0x1000817f0 (pg_atomic_compare_exchange_u32_impl+0x50)  54001ffe         rlwinm   r0,r0,0x3,0x1f,0x1f
0x1000817f4 (pg_atomic_compare_exchange_u32_impl+0x54)  78000620         rldicl   r0,r0,0x0,0x19
0x1000817f8 (pg_atomic_compare_exchange_u32_impl+0x58)  98010070            stb   r0,0x70(r1)
0x1000817fc (pg_atomic_compare_exchange_u32_impl+0x5c)  4c00012c          isync
0x100081800 (pg_atomic_compare_exchange_u32_impl+0x60)  88610070            lbz   r3,0x70(r1)
0x100081804 (pg_atomic_compare_exchange_u32_impl+0x64)  48000004              b   0x100081808 (pg_atomic_compare_exchange_u32_impl+0x68)
0x100081808 (pg_atomic_compare_exchange_u32_impl+0x68)  38210080           addi   r1,0x80(r1)
0x10008180c (pg_atomic_compare_exchange_u32_impl+0x6c)  4e800020            blr


The source code of pg_atomic_compare_exchange_u32_impl is the following:

static inline bool
pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
                                    uint32 *expected, uint32 newval)
{
    bool        ret;

    /*
     * atomics.h specifies sequential consistency ("full barrier semantics")
     * for this interface.  Since "lwsync" provides acquire/release
     * consistency only, do not use it here.  GCC atomics observe the same
     * restriction; see its rs6000_pre_atomic_barrier().
     */
    __asm__ __volatile__ ("    sync \n" ::: "memory");

    /*
     * XXX: __compare_and_swap is defined to take signed parameters, but that
     * shouldn't matter since we don't perform any arithmetic operations.
     */
    ret = __compare_and_swap((volatile int*)&ptr->value,
                             (int *)expected, (int)newval);

    /*
     * xlc's documentation tells us:
     * "If __compare_and_swap is used as a locking primitive, insert a call to
     * the __isync built-in function at the start of any critical sections."
     *
     * The critical section begins immediately after __compare_and_swap().
     */
    __isync();

    return ret;
}

and if I compile this function standalone, I get the following assembler code:

.pg_atomic_compare_exchange_u32_impl:   # 0x0000000000000000 (H.4.NO_SYMBOL)
        stdu       SP,-128(SP)
        std        r3,176(SP)
        std        r4,184(SP)
        std        r5,192(SP)
        ld         r0,192(SP)
        stw        r0,192(SP)
        sync      
        ld         r4,176(SP)
        ld         r3,184(SP)
        lwz        r0,192(SP)
        extsw      r0,r0
        lwa        r5,0(r3)
__L30:                                  # 0x0000000000000030 (H.4.NO_SYMBOL+0x030)
        lwarx      r6,r0,r4
        cmpl       0,0,r5,r6
        bc         BO_IF_NOT,CR0_EQ,__L44
        stwcx.     r0,r0,r4
        .machine        "any"
        bc         BO_IF_NOT_3,CR0_EQ,__L30
__L44:                                  # 0x0000000000000044 (H.4.NO_SYMBOL+0x044)
        ori        r0,r6,0x0000
        stw        r0,0(r3)
        mfcr       r0
        rlwinm     r0,r0,3,31,31
        rldicl     r0,r0,0,56
        stb        r0,112(SP)
        isync     
        lbz        r3,112(SP)
        addi       SP,SP,128
        bclr       BO_ALWAYS,CR0_LT

sync is here!


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Reply | Threaded
Open this post in threaded view
|

Re: Deadlock in XLogInsert at AIX

Heikki Linnakangas
On 01/31/2017 05:03 PM, Konstantin Knizhnik wrote:

> One more assertion failure:
>
>
> ExceptionalCondition(conditionName = "!(OldPageRqstPtr <=
> XLogCtl->InitializedUpTo)", errorType = "FailedAssertion", fileName =
> "xlog.c", lineNumber = 1887), line 54 in "assert.c"
>
> (dbx) p OldPageRqstPtr
> 153551667200
> (dbx) p XLogCtl->InitializedUpTo
> 153551667200
> (dbx) p InitializedUpTo
> 153551659008
>
> I slightly modify xlog.c code - store value of XLogCtl->InitializedUpTo
> in local variable:
>
>
>   1870         LWLockAcquire(WALBufMappingLock, LW_EXCLUSIVE);
>   1871
>   1872         /*
>   1873          * Now that we have the lock, check if someone
> initialized the page
>   1874          * already.
>   1875          */
>   1876         while (upto >= XLogCtl->InitializedUpTo || opportunistic)
>   1877         {
>   1878                 XLogRecPtr InitializedUpTo =
> XLogCtl->InitializedUpTo;
>   1879                 nextidx = XLogRecPtrToBufIdx(InitializedUpTo);
>   1880
>   1881                 /*
>   1882                  * Get ending-offset of the buffer page we need
> to replace (this may
>   1883                  * be zero if the buffer hasn't been used yet).
> Fall through if it's
>   1884                  * already written out.
>   1885                  */
>   1886                 OldPageRqstPtr = XLogCtl->xlblocks[nextidx];
>   1887                 Assert(OldPageRqstPtr <= XLogCtl->InitializedUpTo);
>
>
> And, as you can see,  XLogCtl->InitializedUpTo is not equal to saved
> value InitializedUpTo.
> But we are under exclusive WALBufMappingLock and InitializedUpTo is
> updated only under this lock.
> So it means that LW-locks doesn't work!

Yeah, so it seems. XLogCtl->InitializeUpTo is quite clearly protected by
the WALBufMappingLock. All references to it (after StartupXLog) happen
while holding the lock.

Can you get the assembly output of the AdvanceXLInsertBuffer() function?
I wonder if the compiler is rearranging things so that
XLogCtl->InitializedUpTo is fetched before the LWLockAcquire call. Or
should there be a memory barrier instruction somewhere in LWLockAcquire?

- Heikki




Re: Deadlock in XLogInsert at AIX

Heikki Linnakangas
In reply to this post by konstantin knizhnik
Oh, you were one step ahead of me, I didn't understand it on first read
of your email. Need more coffee..

On 01/31/2017 05:03 PM, Konstantin Knizhnik wrote:
> I inspected the code of pg_atomic_compare_exchange_u32_impl and didn't find
> sync in the prologue:
>
> (dbx) listi pg_atomic_compare_exchange_u32_impl
 > [no sync instruction]

> and if I compile this fuctions standalone, I get the following assembler
> code:
>
> .pg_atomic_compare_exchange_u32_impl:   # 0x0000000000000000 (H.4.NO_SYMBOL)
>          stdu       SP,-128(SP)
>          std        r3,176(SP)
>          std        r4,184(SP)
>          std        r5,192(SP)
>          ld         r0,192(SP)
>          stw        r0,192(SP)
>         sync
>          ld         r4,176(SP)
>          ld         r3,184(SP)
>          lwz        r0,192(SP)
>          extsw      r0,r0
>          lwa        r5,0(r3)
 > ...
>
> sync is here!

Ok, so, the 'sync' instruction gets lost somehow. That "standalone"
assembly version looks slightly different in other ways too; you perhaps
used different optimization levels, or it looks different when it's
inlined into the caller. Not sure which version of the function gdb
would show, when it's a "static inline" function. Would be good to check
the disassembly of LWLockAttemptLock(), to see if the 'sync' is there.

Certainly seems like a compiler bug, though.

- Heikki




Re: Deadlock in XLogInsert at AIX

konstantin knizhnik
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync() to pg_atomic_fetch_add_u32_impl.
The comment in this file says:

       * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
       * providing sequential consistency.  This is undocumented.

But it is not true any more (I checked the generated assembler code in the
debugger). This is why I have added __sync() to this function. Now pgbench
works normally.

Also, the assembler section with the sync instruction mysteriously
disappears from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using the __sync() built-in function instead.


Thanks to everybody who helped me to locate and fix this problem.

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




xlc.patch (3K) Download Attachment

Re: Deadlock in XLogInsert at AIX

Heikki Linnakangas
In reply to this post by konstantin knizhnik
On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:

> Attached please find my patch for XLC/AIX.
> The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
> The comment in this file says that:
>
>        * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
>        * providing sequential consistency.  This is undocumented.
>
> But it is not true any more (I checked generated assembler code in
> debugger).
> This is why I have added __sync() to this function. Now pgbench working
> normally.
Seems like it was not so much undocumented, but an implementation detail
that was not guaranteed after all..

Does __fetch_and_add emit a trailing isync there either? Seems odd if
__compare_and_swap requires it, but __fetch_and_add does not. Unless we
can find conclusive documentation on that, I think we should assume that
an __isync() is required, too.

There was a long thread on these things the last time this was changed:
https://www.postgresql.org/message-id/20160425185204.jrvlghn3jxulsb7i%40alap3.anarazel.de.
I couldn't find an explanation there of why we thought that
fetch_and_add implicitly performs sync and isync.

> Also there is mysterious disappearance of assembler section function
> with sync instruction from pg_atomic_compare_exchange_u32_impl.
> I have fixed it by using __sync() built-in function instead.

__sync() seems more appropriate there, anyway. We're using intrinsics
for all the other things in generic-xlc.h. But it sure is scary that the
"asm" sections just disappeared.

In arch-ppc.h, shouldn't we have #ifdef __IBMC__ guards for the __sync()
and __lwsync() intrinsics? Those are an xlc compiler-specific thing,
right? Or if they are expected to work on any ppc compiler, then we
should probably use them always, instead of the asm sections.

In summary, I came up with the attached. It's essentially your patch,
with tweaks for the above-mentioned things. I don't have a powerpc
system to test on, so there are probably some silly typos there.

- Heikki




xlc-heikki-1.patch (3K) Download Attachment

Re: Deadlock in XLogInsert at AIX

REIX, Tony
In reply to this post by konstantin knizhnik

Hi,

I'm now working on the port of PostgreSQL on AIX.
(RPMs can be found, as free OpenSource work, at http://bullfreeware.com/ .
 
http://bullfreeware.com/search.php?package=postgresql )

I was not aware of any issue with XLC v12 on AIX for atomic operations.
(XLC v13 generates at least 2 tests failures)

For now, with version 9.6.1, all tests "check-world", plus numeric_big test, are OK, in both 32 & 64bit versions.

Am I missing something ?

I configure the build of PostgreSQL with (in 64bits):

 ./configure
        --prefix=/opt/freeware
        --libdir=/opt/freeware/lib64
        --mandir=/opt/freeware/man
        --with-perl
        --with-tcl
        --with-tclconfig=/opt/freeware/lib
        --with-python
        --with-ldap
        --with-openssl
        --with-libxml
        --with-libxslt
        --enable-nls
        --enable-thread-safety
        --sysconfdir=/etc/sysconfig/postgresql

Am I missing some option for more optimization on AIX ?

Thanks

Regards,

Tony


Le 01/02/2017 à 12:07, Konstantin Knizhnik a écrit :
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

      * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
      * providing sequential consistency.  This is undocumented.

But it is not true any more (I checked generated assembler code in
debugger).
This is why I have added __sync() to this function. Now pgbench working
normally.

Also there is mysterious disappearance of assembler section function
with sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using __sync() built-in function instead.


Thanks to everybody who helped me to locate and fix this problem.

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


ATOS WARNING !
This message contains attachments that could potentially harm your computer.
Please make sure you open ONLY attachments from senders you know, trust and is in an e-mail that you are expecting.

AVERTISSEMENT ATOS !
Ce message contient des pièces jointes qui peuvent potentiellement endommager votre ordinateur.
Merci de vous assurer que vous ouvrez uniquement les pièces jointes provenant d’emails que vous attendez et dont vous connaissez les expéditeurs et leur faites confiance.

AVISO DE ATOS !
Este mensaje contiene datos adjuntos que pudiera ser que dañaran su ordenador.
Asegúrese de abrir SOLO datos adjuntos enviados desde remitentes de confianza y que procedan de un correo esperado.

ATOS WARNUNG !
Diese E-Mail enthält Anlagen, welche möglicherweise ihren Computer beschädigen könnten.
Bitte beachten Sie, daß Sie NUR Anlagen öffnen, von einem Absender den Sie kennen, vertrauen und vom dem Sie vor allem auch E-Mails mit Anlagen erwarten.





Re: Deadlock in XLogInsert at AIX

konstantin knizhnik
Hi,

We are using version 13.1.3 of XLC. All tests pass.
Please note that it is a synchronization bug which can be reproduced only under heavy load.
Our server has 64 cores, and it is necessary to run pgbench with 100 connections for several minutes to reproduce the problem.
So maybe you just didn't notice it ;)



On 01.02.2017 16:29, REIX, Tony wrote:

Hi,

I'm now working on the port of PostgreSQL on AIX.
(RPMs can be found, as free OpenSource work, at http://bullfreeware.com/ .
 
http://bullfreeware.com/search.php?package=postgresql )

I was not aware of any issue with XLC v12 on AIX for atomic operations.
(XLC v13 generates at least 2 tests failures)

For now, with version 9.6.1, all tests "check-world", plus numeric_big test, are OK, in both 32 & 64bit versions.

Am I missing something ?

I configure the build of PostgreSQL with (in 64bits):

 ./configure
        --prefix=/opt/freeware
        --libdir=/opt/freeware/lib64
        --mandir=/opt/freeware/man
        --with-perl
        --with-tcl
        --with-tclconfig=/opt/freeware/lib
        --with-python
        --with-ldap
        --with-openssl
        --with-libxml
        --with-libxslt
        --enable-nls
        --enable-thread-safety
        --sysconfdir=/etc/sysconfig/postgresql

Am I missing some option for more optimization on AIX ?

Thanks

Regards,

Tony


Le 01/02/2017 à 12:07, Konstantin Knizhnik a écrit :
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

      * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
      * providing sequential consistency.  This is undocumented.

But it is not true any more (I checked generated assembler code in
debugger).
This is why I have added __sync() to this function. Now pgbench working
normally.

Also there is mysterious disappearance of assembler section function
with sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using __sync() built-in function instead.


Thanks to everybody who helped me to locate and fix this problem.

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company






-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Deadlock in XLogInsert at AIX

konstantin knizhnik
In reply to this post by Heikki Linnakangas
On 01.02.2017 15:39, Heikki Linnakangas wrote:
On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

       * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
       * providing sequential consistency.  This is undocumented.

But it is not true any more (I checked generated assembler code in
debugger).
This is why I have added __sync() to this function. Now pgbench working
normally.

Seems like it was not so much undocumented, but an implementation detail that was not guaranteed after all..

Does __fetch_and_add emit a trailing isync there either? Seems odd if __compare_and_swap requires it, but __fetch_and_add does not. Unless we can find conclusive documentation on that, I think we should assume that an __isync() is required, too.

There was a long thread on these things the last time this was changed: https://www.postgresql.org/message-id/20160425185204.jrvlghn3jxulsb7i%40alap3.anarazel.de. I couldn't find an explanation there of why we thought that fetch_and_add implicitly performs sync and isync.

Also there is mysterious disappearance of assembler section function
with sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using __sync() built-in function instead.

__sync() seems more appropriate there, anyway. We're using intrinsics for all the other things in generic-xlc.h. But it sure is scary that the "asm" sections just disappeared.

In arch-ppc.h, shouldn't we have #ifdef __IBMC__ guards for the __sync() and __lwsync() intrinsics? Those are an xlc compiler-specific thing, right? Or if they are expected to work on any ppc compiler, then we should probably use them always, instead of the asm sections.

In summary, I came up with the attached. It's essentially your patch, with tweaks for the above-mentioned things. I don't have a powerpc system to test on, so there are probably some silly typos there.

Why do you prefer to use _check_lock instead of __check_lock_mp?
The first one is not even mentioned in the XLC compiler manual:
http://www-01.ibm.com/support/docview.wss?uid=swg27046906&aid=7
or
http://scv.bu.edu/computation/bluegene/IBMdocs/compiler/xlc-8.0/html/compiler/ref/bif_sync.htm


- Heikki




    

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Deadlock in XLogInsert at AIX

konstantin knizhnik
In reply to this post by Heikki Linnakangas


On 01.02.2017 15:39, Heikki Linnakangas wrote:
>
> In summary, I came up with the attached. It's essentially your patch,
> with tweaks for the above-mentioned things. I don't have a powerpc
> system to test on, so there are probably some silly typos there.
>

Attached please find the fixed version of your patch.
I verified that it applies correctly, builds, and postgres works
normally with it.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




xlc-heikki-2.patch (3K) Download Attachment

Re: Deadlock in XLogInsert at AIX

REIX, Tony
In reply to this post by konstantin knizhnik

Hi Konstantin

XLC.

I'm on AIX 7.1 for now.

I'm using this version of XLC v13:

# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003

With this version, I have (at least, since I tested with "check" and not "check-world" at that time) 2 failing tests: create_aggregate, aggregates.


With the following XLC v12 version, I have NO test failure:

# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01.0000.0016


So maybe you are not using XLC v13.1.3.3 but another sub-version, unless you are passing more options to configure?


Configure.

What are the options that you give to the configure ?


Heavy load & 64 cores? OK. That clearly explains why I do not see this issue.


pgbench? I wanted to run it. However, I'm still looking for where to get it, plus a guide for using it for testing. I would add such tests when building my PostgreSQL RPMs on AIX. So any help is welcome!


Performance.

- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any idea how I should proceed ? Any PostgreSQL performance benchmark that I could find and use ? pgbench ?

- I'm interested in any information for improving the performance & quality of my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) has sold IBM Power machines under the Escala brand for ages (25 years this year).)


How to help ?

How could I help improve the quality and performance of PostgreSQL on AIX?
I may have access to very big machines for even deeper testing of PostgreSQL. I just need to know how to run the tests.


Thanks!

Regards,

Tony



Le 01/02/2017 à 14:48, Konstantin Knizhnik a écrit :
Hi,

We are using version 13.1.3 of XLC. All tests pass.
Please note that it is a synchronization bug which can be reproduced only under heavy load.
Our server has 64 cores, and it is necessary to run pgbench with 100 connections for several minutes to reproduce the problem.
So maybe you just didn't notice it ;)



On 01.02.2017 16:29, REIX, Tony wrote:

Hi,

I'm now working on the port of PostgreSQL on AIX.
(RPMs can be found, as free OpenSource work, at http://bullfreeware.com/ .
 
http://bullfreeware.com/search.php?package=postgresql )

I was not aware of any issue with XLC v12 on AIX for atomic operations.
(XLC v13 generates at least 2 tests failures)

For now, with version 9.6.1, all tests "check-world", plus numeric_big test, are OK, in both 32 & 64bit versions.

Am I missing something ?

I configure the build of PostgreSQL with (in 64bits):

 ./configure
        --prefix=/opt/freeware
        --libdir=/opt/freeware/lib64
        --mandir=/opt/freeware/man
        --with-perl
        --with-tcl
        --with-tclconfig=/opt/freeware/lib
        --with-python
        --with-ldap
        --with-openssl
        --with-libxml
        --with-libxslt
        --enable-nls
        --enable-thread-safety
        --sysconfdir=/etc/sysconfig/postgresql

Am I missing some option for more optimization on AIX ?

Thanks

Regards,

Tony


Le 01/02/2017 à 12:07, Konstantin Knizhnik a écrit :
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

      * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
      * providing sequential consistency.  This is undocumented.

But it is not true any more (I checked generated assembler code in
debugger).
This is why I have added __sync() to this function. Now pgbench working
normally.

Also there is mysterious disappearance of assembler section function
with sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using __sync() built-in function instead.


Thanks to everybody who helped me to locate and fix this problem.

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company






-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 


Re: Deadlock in XLogInsert at AIX

konstantin knizhnik
Hi Tony,

On 01.02.2017 18:42, REIX, Tony wrote:

Hi Konstantin

XLC.

I'm on AIX 7.1 for now.

I'm using this version of XLC v13:

# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003

With this version, I have two failing tests (at least as of when I tested with "check" rather than "check-world"): create_aggregate and aggregates.


With the following XLC v12 version, I have NO test failure:

# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01.0000.0016


So maybe you are not using XLC v13.1.3.3 but another sub-version. Or are you passing more options to configure?


Configure.

What options do you give to configure?


export CC="/opt/IBM/xlc/13.1.3/bin/xlc"
export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 "
export LDFLAGS="-Wl,-bbigtoc,-b64"
export AR="/usr/bin/ar -X64"
export LD="/usr/bin/ld -b64 "
export NM="/usr/bin/nm -X64"
./configure --prefix="/opt/postgresql/xlc-debug/9.6"


Hard load & 64 cores ? OK. That clearly explains why I do not see this issue.


pgbench? I wanted to run it. However, I'm still looking for where to get it, plus a guide for using it for testing.


pgbench is part of the Postgres distribution (src/bin/pgbench).


I would add such tests when building my PostgreSQL RPMs on AIX. So any help is welcome !


Performance.

- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any idea how I should proceed ? Any PostgreSQL performance benchmark that I could find and use ? pgbench ?

pgbench is the most widely used tool for simulating an OLTP workload. Certainly it is quite primitive and its results are rather artificial; TPC-C seems to be a better choice.
But the best approach is to implement your own benchmark simulating the actual workload of your real application.

- I'm interested in any information for improving the performance & quality of my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) has sold IBM Power machines under the Escala brand for ages (25 years this year).)


How to help ?

How could I help for improving the quality and performance of PostgreSQL on AIX ?


We still have one open issue at AIX: see https://www.mail-archive.com/pgsql-hackers@.../msg303094.html
It would be great if you could somehow help fix this problem.



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Deadlock in XLogInsert at AIX

Heikki Linnakangas
In reply to this post by konstantin knizhnik
On 02/01/2017 04:12 PM, Konstantin Knizhnik wrote:

> On 01.02.2017 15:39, Heikki Linnakangas wrote:
>> On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:
>>> Attached please find my patch for XLC/AIX.
>>> The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
>>> The comment in this file says that:
>>>
>>>        * __fetch_and_add() emits a leading "sync" and trailing "isync",
>>> thereby
>>>        * providing sequential consistency.  This is undocumented.
>>>
>>> But it is not true any more (I checked generated assembler code in
>>> debugger).
>>> This is why I have added __sync() to this function. Now pgbench working
>>> normally.
>>
>> Seems like it was not so much undocumented, but an implementation
>> detail that was not guaranteed after all..
>>
>> Does __fetch_and_add emit a trailing isync there either? Seems odd if
>> __compare_and_swap requires it, but __fetch_and_add does not. Unless
>> we can find conclusive documentation on that, I think we should assume
>> that an __isync() is required, too.
>>
>> There was a long thread on these things the last time this was
>> changed:
>> https://www.postgresql.org/message-id/20160425185204.jrvlghn3jxulsb7i%40alap3.anarazel.de.
>> I couldn't find an explanation there of why we thought that
>> fetch_and_add implicitly performs sync and isync.
>>
>>> Also there is mysterious disappearance of assembler section function
>>> with sync instruction from pg_atomic_compare_exchange_u32_impl.
>>> I have fixed it by using __sync() built-in function instead.
>>
>> __sync() seems more appropriate there, anyway. We're using intrinsics
>> for all the other things in generic-xlc.h. But it sure is scary that
>> the "asm" sections just disappeared.
>>
>> In arch-ppc.h, shouldn't we have #ifdef __IBMC__ guards for the
>> __sync() and __lwsync() intrinsics? Those are an xlc compiler-specific
>> thing, right? Or if they are expected to work on any ppc compiler,
>> then we should probably use them always, instead of the asm sections.
>>
>> In summary, I came up with the attached. It's essentially your patch,
>> with tweaks for the above-mentioned things. I don't have a powerpc
>> system to test on, so there are probably some silly typos there.
>
> Why do you prefer to use _check_lock instead of __check_lock_mp ?
> First one is even not mentioned in XLC compiler manual:
> http://www-01.ibm.com/support/docview.wss?uid=swg27046906&aid=7
> or
> http://scv.bu.edu/computation/bluegene/IBMdocs/compiler/xlc-8.0/html/compiler/ref/bif_sync.htm

Googling around, it seems that they do more or less the same thing. I
would guess that they actually produce the same assembly code, but I
have no machine to test on. If I understand correctly, the difference is
that __check_lock_mp() is an xlc compiler intrinsic, while _check_lock()
is a libc function. The libc function presumably does __check_lock_mp()
or __check_lock_up() depending on whether the system is a multi- or
uni-processor system.

So I think if we're going to change this, the use of __check_lock_mp()
needs to be in an #ifdef block to check that you're on the XLC compiler,
as it's a *compiler* intrinsic, while the current code that uses
_check_lock() are in an "#ifdef _AIX" block, which is correct for
_check_lock() because it's defined in libc, not by the compiler.

But if there's no pressing reason to change it, let's leave it alone.
It's not related to the problem at hand, right?

- Heikki



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Deadlock in XLogInsert at AIX

REIX, Tony
In reply to this post by konstantin knizhnik

Hi Konstantin,

Please run: /opt/IBM/xlc/13.1.3/bin/xlc -qversion  so that I know your exact XLC v13 version.

I'm building on Power7 and not giving any architecture flag to XLC.

I'm not using -qalign=natural . Thus, by default, XLC uses -qalign=power, which is close to natural, as explained at:
         https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/compiler_ref/opt_align.html
Why are you using this flag ?

Thanks for the info about pgbench. The PostgreSQL web site contains a lot of old information...

If you could share scripts or instructions about the tests you are doing with pgbench, I would reproduce here.
I have no "real" application. My job consists of porting open-source packages to AIX. Many packages: Erlang and Go, these days. I just want to make the PostgreSQL RPMs as good as possible... within the limited amount of time I can give to this package before moving to another one.

About the zombie issue, I've discussed it with my colleagues. It looks like the process stays a zombie until the parent reads its exit status. However, though I have dealt with that several times, I do not remember the details well. And that should not be specific to AIX. I'll discuss it tomorrow with another colleague, who should understand this better than I do.

Patch for Large Files: When building PostgreSQL, I found it necessary to use the following patch so that PostgreSQL works with large files. I do not remember the details. Do you agree with such a patch? The first version (new-...) shows the exact places where #define _LARGE_FILES 1 is required; the second version (new2-...) is simpler.

I'm now experimenting with your patch for the deadlock. However, that should be invisible to the "check-world" tests, I guess.

Regards,

Tony






Attachments: postgresql-9.6.1-new2-LARGE_FILES.patch (418 bytes), postgresql-9.6.1-new-LARGE_FILES.patch (5K)

Re: Deadlock in XLogInsert at AIX

konstantin knizhnik
On 02/01/2017 08:30 PM, REIX, Tony wrote:

Hi Konstantin,

Please run: /opt/IBM/xlc/13.1.3/bin/xlc -qversion  so that I know your exact XLC v13 version.

IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)

I'm building on Power7 and not giving any architecture flag to XLC.

I'm not using -qalign=natural . Thus, by default, XLC use -qalign=power, which is close to natural, as explained at:
         https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/compiler_ref/opt_align.html
Why are you using this flag ?


Because otherwise the double type is aligned on 4 bytes.

Thanks for info about pgbench. PostgreSQL web-site contains a lot of old information...

If you could share scripts or instructions about the tests you are doing with pgbench, I would reproduce here.


You do not need any script, just two simple commands.
One to initialize the database:

pgbench -i -s 1000

And another to run the benchmark itself:

pgbench -c 100 -j 20 -P 1 -T 1000000000


I have no "real" application. My job consists in porting OpenSource packages on AIX. Many packages. Erlang, Go, these days. I just want to make PostgreSQL RPMs as good as possible... within the limited amount of time I can give to this package, before moving to another one.

About the zombie issue, I've discussed with my colleagues. Looks like the process keeps zombie till the father looks at its status. However, though I did that several times, I  do not remember well the details. And that should be not specific to AIX. I'll discuss with another colleague, tomorrow, who should understand this better than me.


1. The process is not in a zombie state (according to ps). It is in the <exiting> state... It is something AIX-specific; I have not seen processes in this state on Linux.
2. I have implemented a simple test - a fork bomb. It creates 1000 children and then waits for them. It is about ten times slower than on Intel/Linux, but still much faster than 100 seconds. So there is some difference between a Postgres backend and a dummy process doing nothing - just terminating immediately after returning from fork().

Patch for Large Files: When building PostgreSQL, I found required to use the following patch so that PostgreSQL works with large files. I do not remember the details. Do you agree with such a patch ? 1rst version (new-...) shows the exact places where   define _LARGE_FILES 1  is required.  2nd version (new2-...) is simpler.

I'm now experimenting with your patch for dead lock. However, that should be invisible with the  "check-world" tests I guess.

Regards,

Tony





-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company