Problem with pg_atomic_compare_exchange_u64 at 32-bit platformwd

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Problem with pg_atomic_compare_exchange_u64 at 32-bit platformwd

konstantin knizhnik
Definition of pg_atomic_compare_exchange_u64 requires alignment of
expected pointer on 8-byte boundary.

pg_atomic_compare_exchange_u64(volatile pg_atomic_uint64 *ptr,
                                uint64 *expected, uint64 newval)
{
#ifndef PG_HAVE_ATOMIC_U64_SIMULATION
     AssertPointerAlignment(ptr, 8);
     AssertPointerAlignment(expected, 8);
#endif


I wonder if there are platforms  where such restriction is actually needed.
And if so, looks like our ./src/test/regress/regress.c is working only
occasionally:

static void
test_atomic_uint64(void)
{
     pg_atomic_uint64 var;
     uint64        expected;
     ...
         if (!pg_atomic_compare_exchange_u64(&var, &expected, 1))

because there is no warranty that "expected" variable will be aligned on
stack at 8 byte boundary (at least at Win32).


Reply | Threaded
Open this post in threaded view
|

Re: Problem with pg_atomic_compare_exchange_u64 at 32-bit platformwd

Noah Misch-2
On Tue, May 19, 2020 at 04:07:29PM +0300, Konstantin Knizhnik wrote:

> Definition of pg_atomic_compare_exchange_u64 requires alignment of expected
> pointer on 8-byte boundary.
>
> pg_atomic_compare_exchange_u64(volatile pg_atomic_uint64 *ptr,
>                                uint64 *expected, uint64 newval)
> {
> #ifndef PG_HAVE_ATOMIC_U64_SIMULATION
>     AssertPointerAlignment(ptr, 8);
>     AssertPointerAlignment(expected, 8);
> #endif
>
>
> I wonder if there are platforms  where such restriction is actually needed.

In general, sparc Linux does SIGBUS on unaligned access.  Other platforms
function but suffer performance penalties.

> And if so, looks like our ./src/test/regress/regress.c is working only
> occasionally:
>
> static void
> test_atomic_uint64(void)
> {
>     pg_atomic_uint64 var;
>     uint64        expected;
>     ...
>         if (!pg_atomic_compare_exchange_u64(&var, &expected, 1))
>
> because there is no warranty that "expected" variable will be aligned on
> stack at 8 byte boundary (at least at Win32).

src/tools/msvc sets ALIGNOF_LONG_LONG_INT=8, so it believes that win32 does
guarantee 8-byte alignment of both automatic variables.  Is it wrong?


Reply | Threaded
Open this post in threaded view
|

Re: Problem with pg_atomic_compare_exchange_u64 at 32-bit platformwd

Andres Freund
Hi,

On May 19, 2020 8:05:00 PM PDT, Noah Misch <[hidden email]> wrote:

>On Tue, May 19, 2020 at 04:07:29PM +0300, Konstantin Knizhnik wrote:
>> Definition of pg_atomic_compare_exchange_u64 requires alignment of
>expected
>> pointer on 8-byte boundary.
>>
>> pg_atomic_compare_exchange_u64(volatile pg_atomic_uint64 *ptr,
>>                                uint64 *expected, uint64 newval)
>> {
>> #ifndef PG_HAVE_ATOMIC_U64_SIMULATION
>>     AssertPointerAlignment(ptr, 8);
>>     AssertPointerAlignment(expected, 8);
>> #endif
>>
>>
>> I wonder if there are platforms  where such restriction is actually
>needed.
>
>In general, sparc Linux does SIGBUS on unaligned access.  Other
>platforms
>function but suffer performance penalties.

Indeed. Cross cacheline atomics are e.g. really expensive on x86. Essentially requiring a full blown bus lock iirc.


>> And if so, looks like our ./src/test/regress/regress.c is working
>only
>> occasionally:
>>
>> static void
>> test_atomic_uint64(void)
>> {
>>     pg_atomic_uint64 var;
>>     uint64        expected;
>>     ...
>>         if (!pg_atomic_compare_exchange_u64(&var, &expected, 1))
>>
>> because there is no warranty that "expected" variable will be aligned
>on
>> stack at 8 byte boundary (at least at Win32).
>
>src/tools/msvc sets ALIGNOF_LONG_LONG_INT=8, so it believes that win32
>does
>guarantee 8-byte alignment of both automatic variables.  Is it wrong?

Generally the definition of the atomics should ensure the required alignment. E.g. using alignment attributes to the struct.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Reply | Threaded
Open this post in threaded view
|

Re: Problem with pg_atomic_compare_exchange_u64 at 32-bit platforms

konstantin knizhnik
In reply to this post by Noah Misch-2


On 20.05.2020 06:05, Noah Misch wrote:

> On Tue, May 19, 2020 at 04:07:29PM +0300, Konstantin Knizhnik wrote:
>> Definition of pg_atomic_compare_exchange_u64 requires alignment of expected
>> pointer on 8-byte boundary.
>>
>> pg_atomic_compare_exchange_u64(volatile pg_atomic_uint64 *ptr,
>>                                 uint64 *expected, uint64 newval)
>> {
>> #ifndef PG_HAVE_ATOMIC_U64_SIMULATION
>>      AssertPointerAlignment(ptr, 8);
>>      AssertPointerAlignment(expected, 8);
>> #endif
>>
>>
>> I wonder if there are platforms  where such restriction is actually needed.
> In general, sparc Linux does SIGBUS on unaligned access.  Other platforms
> function but suffer performance penalties.
Well, if platform enforces strict alignment, then addressed value should
be properly aligned in any case, shouldn't it?
So my question is whether there are platforms which allows unaligned
access for normal (non-atomic) memory operations
but requires them for atomic operations.

>
>> And if so, looks like our ./src/test/regress/regress.c is working only
>> occasionally:
>>
>> static void
>> test_atomic_uint64(void)
>> {
>>      pg_atomic_uint64 var;
>>      uint64        expected;
>>      ...
>>          if (!pg_atomic_compare_exchange_u64(&var, &expected, 1))
>>
>> because there is no warranty that "expected" variable will be aligned on
>> stack at 8 byte boundary (at least at Win32).
> src/tools/msvc sets ALIGNOF_LONG_LONG_INT=8, so it believes that win32 does
> guarantee 8-byte alignment of both automatic variables.  Is it wrong?

Yes, by default "long long" and "double" types are aligned on 8-byte
boundary at 32-bit Windows (but not at 32-bit Linux).
Bu it is only about alignment of fields inside struct.
So if you define structure:

typedef struct {
      int x;
      long long y;
} foo;

then sizeof(foo) will be really 16 at Win32.'
But Win32 doesn't enforce alignment of stack frames on 8-byte boundary.
It means that if you define local variable "y":

void f() {
      int x;
      long long y;
      printf("%p\n", &y);
}

then its address must not be aligned on 8 at 32-bit platform.
This is why "expected" in test_atomic_uint64 may not be aligned on
8-byte boundary and we can get assertion failure.



Reply | Threaded
Open this post in threaded view
|

Re: Problem with pg_atomic_compare_exchange_u64 at 32-bit platforms

konstantin knizhnik
In reply to this post by Andres Freund


On 20.05.2020 08:10, Andres Freund wrote:

> Hi,
>
> On May 19, 2020 8:05:00 PM PDT, Noah Misch <[hidden email]> wrote:
>> On Tue, May 19, 2020 at 04:07:29PM +0300, Konstantin Knizhnik wrote:
>>> Definition of pg_atomic_compare_exchange_u64 requires alignment of
>> expected
>>> pointer on 8-byte boundary.
>>>
>>> pg_atomic_compare_exchange_u64(volatile pg_atomic_uint64 *ptr,
>>>                                 uint64 *expected, uint64 newval)
>>> {
>>> #ifndef PG_HAVE_ATOMIC_U64_SIMULATION
>>>      AssertPointerAlignment(ptr, 8);
>>>      AssertPointerAlignment(expected, 8);
>>> #endif
>>>
>>>
>>> I wonder if there are platforms  where such restriction is actually
>> needed.
>>
>> In general, sparc Linux does SIGBUS on unaligned access.  Other
>> platforms
>> function but suffer performance penalties.
> Indeed. Cross cacheline atomics are e.g. really expensive on x86. Essentially requiring a full blown bus lock iirc.
>
Please notice that here we talk about alignment not of atomic pointer
itself, but of pointer to the expected value.
At Intel CMPXCHG instruction read and write expected value throw AX
register.
So alignment of pointer to expected value in
pg_atomic_compare_exchange_u64 is not needed in this case.

And my question was whether there are some platforms where
implementation of compare-exchange 64-bit primitive
requires stronger alignment of "expected" pointer than one enforced by
original alignment rules for this platform.


>
> Generally the definition of the atomics should ensure the required alignment. E.g. using alignment attributes to the struct.

Once again, we are speaking not about alignment of "pg_atomic_uint64 *ptr"
which is really enforced by alignment of pg_atomic_uint64 struct, but
about alignment of "uint64 *expected"
which is not guaranteed.

Actually, If you allocate pg_atomic_uint64 on stack at 32-bt platform,
then it my be also not properly aligned!
But since there is completely no sense in local atomic variables, it is
not a problem.




Reply | Threaded
Open this post in threaded view
|

Re: Problem with pg_atomic_compare_exchange_u64 at 32-bit platforms

Noah Misch-2
In reply to this post by konstantin knizhnik
On Wed, May 20, 2020 at 10:23:37AM +0300, Konstantin Knizhnik wrote:

> On 20.05.2020 06:05, Noah Misch wrote:
> >On Tue, May 19, 2020 at 04:07:29PM +0300, Konstantin Knizhnik wrote:
> >>Definition of pg_atomic_compare_exchange_u64 requires alignment of expected
> >>pointer on 8-byte boundary.
> >>
> >>pg_atomic_compare_exchange_u64(volatile pg_atomic_uint64 *ptr,
> >>                                uint64 *expected, uint64 newval)
> >>{
> >>#ifndef PG_HAVE_ATOMIC_U64_SIMULATION
> >>     AssertPointerAlignment(ptr, 8);
> >>     AssertPointerAlignment(expected, 8);
> >>#endif
> >>
> >>
> >>I wonder if there are platforms  where such restriction is actually needed.
> >In general, sparc Linux does SIGBUS on unaligned access.  Other platforms
> >function but suffer performance penalties.
> Well, if platform enforces strict alignment, then addressed value should be
> properly aligned in any case, shouldn't it?

No.  One can always cast a char* to a uint64* and get a misaligned read when
dereferencing the resulting pointer.

> >>And if so, looks like our ./src/test/regress/regress.c is working only
> >>occasionally:
> >>
> >>static void
> >>test_atomic_uint64(void)
> >>{
> >>     pg_atomic_uint64 var;
> >>     uint64        expected;
> >>     ...
> >>         if (!pg_atomic_compare_exchange_u64(&var, &expected, 1))
> >>
> >>because there is no warranty that "expected" variable will be aligned on
> >>stack at 8 byte boundary (at least at Win32).
> >src/tools/msvc sets ALIGNOF_LONG_LONG_INT=8, so it believes that win32 does
> >guarantee 8-byte alignment of both automatic variables.  Is it wrong?
>
> Yes, by default "long long" and "double" types are aligned on 8-byte
> boundary at 32-bit Windows (but not at 32-bit Linux).
> Bu it is only about alignment of fields inside struct.
> So if you define structure:
>
> typedef struct {
>      int x;
>      long long y;
> } foo;
>
> then sizeof(foo) will be really 16 at Win32.'
> But Win32 doesn't enforce alignment of stack frames on 8-byte boundary.
> It means that if you define local variable "y":
>
> void f() {
>      int x;
>      long long y;
>      printf("%p\n", &y);
> }
>
> then its address must not be aligned on 8 at 32-bit platform.
> This is why "expected" in test_atomic_uint64 may not be aligned on 8-byte
> boundary and we can get assertion failure.

Can you construct a patch that adds some automatic variables to a regress.c
function and causes an assertion inside pg_atomic_compare_exchange_u64() to
fail on some machine you have?  I don't think win32 behaves as you say.  If
you can make a test actually fail using the technique you describe, that would
remove all doubt.


Reply | Threaded
Open this post in threaded view
|

Re: Problem with pg_atomic_compare_exchange_u64 at 32-bit platforms

konstantin knizhnik


On 20.05.2020 10:36, Noah Misch wrote:

> On Wed, May 20, 2020 at 10:23:37AM +0300, Konstantin Knizhnik wrote:
>> On 20.05.2020 06:05, Noah Misch wrote:
>>> On Tue, May 19, 2020 at 04:07:29PM +0300, Konstantin Knizhnik wrote:
>>>> Definition of pg_atomic_compare_exchange_u64 requires alignment of expected
>>>> pointer on 8-byte boundary.
>>>>
>>>> pg_atomic_compare_exchange_u64(volatile pg_atomic_uint64 *ptr,
>>>>                                 uint64 *expected, uint64 newval)
>>>> {
>>>> #ifndef PG_HAVE_ATOMIC_U64_SIMULATION
>>>>      AssertPointerAlignment(ptr, 8);
>>>>      AssertPointerAlignment(expected, 8);
>>>> #endif
>>>>
>>>>
>>>> I wonder if there are platforms  where such restriction is actually needed.
>>> In general, sparc Linux does SIGBUS on unaligned access.  Other platforms
>>> function but suffer performance penalties.
>> Well, if platform enforces strict alignment, then addressed value should be
>> properly aligned in any case, shouldn't it?
> No.  One can always cast a char* to a uint64* and get a misaligned read when
> dereferencing the resulting pointer.
Yes, certainly we can "fool" compiler using type casts:

char buf[8];
*(int64_t*)buf = 1;

But I am speaking about normal (safe) access to variables:

long long x;

In this case "x" compiler enforces proper alignment of "x" for the
target platform.
We are not adding AssertPointerAlignment to any function which has
pointer arguments, aren' we?
I understand we do we require struct alignment pointer to atomic
variables even at the platforms which do not require it
(as Andreas explained, if value cross cacheline, it will cause
significant slowdown).
But my question was whether we need string alignment of expected value?


>> void f() {
>>       int x;
>>       long long y;
>>       printf("%p\n", &y);
>> }
>>
>> then its address must not be aligned on 8 at 32-bit platform.
>> This is why "expected" in test_atomic_uint64 may not be aligned on 8-byte
>> boundary and we can get assertion failure.
> Can you construct a patch that adds some automatic variables to a regress.c
> function and causes an assertion inside pg_atomic_compare_exchange_u64() to
> fail on some machine you have?  I don't think win32 behaves as you say.  If
> you can make a test actually fail using the technique you describe, that would
> remove all doubt.
I do not have access to Win32.
But I think that if you just add some 4-byte variable before "expected"
definition, then you will get this  assertion failure (proposed patch is
attached).
Please notice that PG_HAVE_ATOMIC_U64_SIMULATION should not be defined
and Postgres is build with --enable-cassert and CLAGS=-O0

Also please notice that my report is not caused just by hypothetical
problem which I found out looking at Postgres code.
We actually get this assertion failure in pg_atomic_compare_exchange_u64
at Win32 (not in regress.c).


regress.patch (402 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Problem with pg_atomic_compare_exchange_u64 at 32-bit platforms

Noah Misch-2
On Wed, May 20, 2020 at 10:59:44AM +0300, Konstantin Knizhnik wrote:

> On 20.05.2020 10:36, Noah Misch wrote:
> >On Wed, May 20, 2020 at 10:23:37AM +0300, Konstantin Knizhnik wrote:
> >>On 20.05.2020 06:05, Noah Misch wrote:
> >>>On Tue, May 19, 2020 at 04:07:29PM +0300, Konstantin Knizhnik wrote:
> >>>>Definition of pg_atomic_compare_exchange_u64 requires alignment of expected
> >>>>pointer on 8-byte boundary.
> >>>>
> >>>>pg_atomic_compare_exchange_u64(volatile pg_atomic_uint64 *ptr,
> >>>>                                uint64 *expected, uint64 newval)
> >>>>{
> >>>>#ifndef PG_HAVE_ATOMIC_U64_SIMULATION
> >>>>     AssertPointerAlignment(ptr, 8);
> >>>>     AssertPointerAlignment(expected, 8);
> >>>>#endif
> >>>>
> >>>>
> >>>>I wonder if there are platforms  where such restriction is actually needed.
> >>>In general, sparc Linux does SIGBUS on unaligned access.  Other platforms
> >>>function but suffer performance penalties.
> >>Well, if platform enforces strict alignment, then addressed value should be
> >>properly aligned in any case, shouldn't it?
> >No.  One can always cast a char* to a uint64* and get a misaligned read when
> >dereferencing the resulting pointer.
>
> Yes, certainly we can "fool" compiler using type casts:
>
> char buf[8];
> *(int64_t*)buf = 1;

PostgreSQL does things like that, so the assertions aren't frivolous.

> But I am speaking about normal (safe) access to variables:
>
> long long x;
>
> In this case "x" compiler enforces proper alignment of "x" for the target
> platform.
> We are not adding AssertPointerAlignment to any function which has pointer
> arguments, aren' we?

Most functions don't have such assertions.  That doesn't make it wrong for
this function to have them.

> I understand we do we require struct alignment pointer to atomic variables
> even at the platforms which do not require it
> (as Andreas explained, if value cross cacheline, it will cause significant
> slowdown).
> But my question was whether we need string alignment of expected value?

I expect at least some platforms need strict alignment, though I haven't tried
to prove it.

> >>void f() {
> >>      int x;
> >>      long long y;
> >>      printf("%p\n", &y);
> >>}
> >>
> >>then its address must not be aligned on 8 at 32-bit platform.
> >>This is why "expected" in test_atomic_uint64 may not be aligned on 8-byte
> >>boundary and we can get assertion failure.
> >Can you construct a patch that adds some automatic variables to a regress.c
> >function and causes an assertion inside pg_atomic_compare_exchange_u64() to
> >fail on some machine you have?  I don't think win32 behaves as you say.  If
> >you can make a test actually fail using the technique you describe, that would
> >remove all doubt.
> I do not have access to Win32.
> But I think that if you just add some 4-byte variable before "expected"
> definition, then you will get this  assertion failure (proposed patch is
> attached).
> Please notice that PG_HAVE_ATOMIC_U64_SIMULATION should not be defined and
> Postgres is build with --enable-cassert and CLAGS=-O0
>
> Also please notice that my report is not caused just by hypothetical problem
> which I found out looking at Postgres code.
> We actually get this assertion failure in pg_atomic_compare_exchange_u64 at
> Win32 (not in regress.c).

Given https://postgr.es/m/flat/20150108204635.GK6299%40alap3.anarazel.de was
necessary, that is plausible.  Again, if you can provide a test case that you
have confirmed reproduces it, that will remove all doubt.  You refer to a "we"
that has access to a system that reproduces it.


Reply | Threaded
Open this post in threaded view
|

Re: Problem with pg_atomic_compare_exchange_u64 at 32-bit platforms

Andres Freund
In reply to this post by konstantin knizhnik
Hi,

On 2020-05-20 10:32:18 +0300, Konstantin Knizhnik wrote:

> On 20.05.2020 08:10, Andres Freund wrote:
> > On May 19, 2020 8:05:00 PM PDT, Noah Misch <[hidden email]> wrote:
> > > On Tue, May 19, 2020 at 04:07:29PM +0300, Konstantin Knizhnik wrote:
> > > > Definition of pg_atomic_compare_exchange_u64 requires alignment of
> > > expected
> > > > pointer on 8-byte boundary.
> > > >
> > > > pg_atomic_compare_exchange_u64(volatile pg_atomic_uint64 *ptr,
> > > >                                 uint64 *expected, uint64 newval)
> > > > {
> > > > #ifndef PG_HAVE_ATOMIC_U64_SIMULATION
> > > >      AssertPointerAlignment(ptr, 8);
> > > >      AssertPointerAlignment(expected, 8);
> > > > #endif
> > > >
> > > >
> > > > I wonder if there are platforms  where such restriction is actually
> > > needed.
> > >
> > > In general, sparc Linux does SIGBUS on unaligned access.  Other
> > > platforms
> > > function but suffer performance penalties.
> > Indeed. Cross cacheline atomics are e.g. really expensive on x86. Essentially requiring a full blown bus lock iirc.
> >
> Please notice that here we talk about alignment not of atomic pointer
> itself, but of pointer to the expected value.

That wasn't particularly clear in your first email... In hindsight I
can see that you meant that, but I'm not surprised to not have
understood that the on the first read either.


> At Intel CMPXCHG instruction read and write expected value throw AX
> register.
> So alignment of pointer to expected value in pg_atomic_compare_exchange_u64
> is not needed in this case.

x86 also supports doing a CMPXCHG crossing a cacheline boundary, it's
just terrifyingly expensive...


I can imagine this being a problem on a 32bit platforms, but on 64bit
platforms, it seems only an insane platform ABI would only have 4 byte
alignment on 64bit integers. That'd cause so much unnecessarily split
cachlines... That's separate from the ISA actually supporting doing such
reads efficiently, of course.


But that still leaves the alignment check on expected to be too strong
on 32 bit platforms where 64bit alignment is only 4 bytes. I'm doubtful
that's it's a good idea to use a comparison value potentially split
across cachelines for an atomic operation that's potentially
contended. But also, I don't particularly care about 32 bit performance.

I think we should probably just drop the second assert down to
ALIGNOF_INT64. Would require a new configure stanza, but that's easy
enough to do. It's just adding
AC_CHECK_ALIGNOF(PG_INT64_TYPE)

Doing that change made me think about replace the conditional long long
int alignof logic in configure.in, and just unconditionally do the a
check for PG_INT64_TYPE, seems nicer. That made me look at Solution.pm
due to ALIGNOF_LONG_LONG, and it's interesting:
        # Every symbol in pg_config.h.in must be accounted for here.  Set
        # to undef if the symbol should not be defined.
        my %define = (
...
                ALIGNOF_LONG_LONG_INT      => 8,
...
                PG_INT64_TYPE       => 'long long int',

so currently our msvc build actually claims that the alignment
requirements are what the code tests. And that's not just since
pg_config.h is autogenerated, it was that way before too:

/* The alignment requirement of a `long long int'. */
#define ALIGNOF_LONG_LONG_INT 8
/* Define to the name of a signed 64-bit integer type. */
#define PG_INT64_TYPE long long int

and has been for a while.


> And my question was whether there are some platforms where implementation of
> compare-exchange 64-bit primitive
> requires stronger alignment of "expected" pointer than one enforced by
> original alignment rules for this platform.

IIRC there's a few older platforms that have single-copy-atomicity for 8
byte values, but do *not* have it for ones not aligned to 8 byte
platforms. Despite not having such an ABI alignment.

It's not impossible to come up with a case where that could matter (if
expected pointed into some shared memory that could be read by others),
but it's hard to take them serious.

Greetings,

Andres Freund