master make check fails on Solaris 10

Marina Polyakova
Hello, hackers! I got a permanent failure of master (commit
ca454b9bd34c75995eda4d07c9858f7c22890c2b) make check on Solaris 10.
Regression output and diffs are attached.

I used the following commands:
./configure CC="ccache gcc" CFLAGS="-m64 -I/opt/csw/include"
LDFLAGS="-L/opt/csw/lib/sparcv9 -L/usr/local/lib/64" --enable-cassert
--enable-debug --enable-nls --with-perl --with-tcl --with-python
--with-gssapi --with-openssl --with-ldap --with-libxml --with-libxslt
gmake > make_results.txt
gmake check

About the system: SunOS, Release 5.10, KernelID Generic_141444-09.

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

regression.diffs (41K) Download Attachment
regression.out (11K) Download Attachment

Re: master make check fails on Solaris 10

Tom Lane-2
Marina Polyakova <[hidden email]> writes:
> Hello, hackers! I got a permanent failure of master (commit
> ca454b9bd34c75995eda4d07c9858f7c22890c2b) make check on Solaris 10.
> Regression output and diffs are attached.

Hm, buildfarm member protosciurus is running a similar configuration
without problems.  Looking at its configuration, maybe you need to
fool with LD_LIBRARY_PATH and/or LDFLAGS_SL?

                        regards, tom lane

Re: master make check fails on Solaris 10

Andres Freund
In reply to this post by Marina Polyakova
Hi,

On 2018-01-11 20:21:11 +0300, Marina Polyakova wrote:
> Hello, hackers! I got a permanent failure of master (commit
> ca454b9bd34c75995eda4d07c9858f7c22890c2b) make check on Solaris 10.

Did this use to work? If so, could you check whether it worked before
69c3936a1499b772a749ae629fc59b2d72722332?

Greetings,

Andres Freund

Re: master make check fails on Solaris 10

Marina Polyakova
In reply to this post by Tom Lane-2
On 11-01-2018 20:34, Tom Lane wrote:
> Marina Polyakova <[hidden email]> writes:
>> Hello, hackers! I got a permanent failure of master (commit
>> ca454b9bd34c75995eda4d07c9858f7c22890c2b) make check on Solaris 10.
>> Regression output and diffs are attached.
>
> Hm, buildfarm member protosciurus is running a similar configuration
> without problems.  Looking at its configuration, maybe you need to
> fool with LD_LIBRARY_PATH and/or LDFLAGS_SL?

I added these parameters to configure with the same values
(LDFLAGS_SL="-m64"
LD_LIBRARY_PATH="/lib/64:/usr/lib/64:/usr/sfw/lib/64:/usr/local/lib"),
but the same failures occur :( (see the attached regression diffs and
output)

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

other_config_regression.diffs (42K) Download Attachment
other_config_regression.out (11K) Download Attachment

Re: master make check fails on Solaris 10

Marina Polyakova
In reply to this post by Andres Freund
On 11-01-2018 20:39, Andres Freund wrote:
> Hi,
>
> On 2018-01-11 20:21:11 +0300, Marina Polyakova wrote:
>> Hello, hackers! I got a permanent failure of master (commit
>> ca454b9bd34c75995eda4d07c9858f7c22890c2b) make check on Solaris 10.
>
> Did this use to work?

It fails every time, if that is what you are asking.

> If so, could you check whether it worked before
> 69c3936a1499b772a749ae629fc59b2d72722332?

- on the previous commit (272c2ab9fd0a604e3200030b1ea26fd464c44935) the
same failures occur (see the attached regression diffs and output);
- on commit bf54c0f05c0a58db17627724a83e1b6d4ec2712c make check-world
passes.
I'll try to find out from what commit it started.. Don't you have any
suspicions?)

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

272c2ab_regression.diffs (41K) Download Attachment
272c2ab_regression.out (11K) Download Attachment

Re: master make check fails on Solaris 10

Álvaro Herrera
Marina Polyakova wrote:

> - on the previous commit (272c2ab9fd0a604e3200030b1ea26fd464c44935) the same
> failures occur (see the attached regression diffs and output);
> - on commit bf54c0f05c0a58db17627724a83e1b6d4ec2712c make check-world
> passes.
> I'll try to find out from what commit it started.. Don't you have any
> suspicions?)

Perhaps you can use "git bisect".

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: master make check fails on Solaris 10

Marina Polyakova
On 12-01-2018 18:12, Alvaro Herrera wrote:

> Marina Polyakova wrote:
>
>> - on the previous commit (272c2ab9fd0a604e3200030b1ea26fd464c44935)
>> the same
>> failures occur (see the attached regression diffs and output);
>> - on commit bf54c0f05c0a58db17627724a83e1b6d4ec2712c make check-world
>> passes.
>> I'll try to find out from what commit it started.. Don't you have any
>> suspicions?)
>
> Perhaps you can use "git bisect".

Thanks, I'm doing the same thing :)

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: master make check fails on Solaris 10

Marina Polyakova
In reply to this post by Marina Polyakova
On 12-01-2018 14:05, Marina Polyakova wrote:
> - on the previous commit (272c2ab9fd0a604e3200030b1ea26fd464c44935)
> the same failures occur (see the attached regression diffs and
> output);
> - on commit bf54c0f05c0a58db17627724a83e1b6d4ec2712c make check-world
> passes.
> I'll try to find out from what commit it started..

Binary search has shown that all these failures begin with commit
7518049980be1d90264addab003476ae105f70d4 (Prevent int128 from requiring
more than MAXALIGN alignment.). On the previous commit
(91aec93e6089a5ba49cce0aca3bf7f7022d62ea4) make check-world passes.

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: master make check fails on Solaris 10

Tom Lane-2
Marina Polyakova <[hidden email]> writes:
> On 12-01-2018 14:05, Marina Polyakova wrote:
>> - on the previous commit (272c2ab9fd0a604e3200030b1ea26fd464c44935)
>> the same failures occur (see the attached regression diffs and
>> output);
>> - on commit bf54c0f05c0a58db17627724a83e1b6d4ec2712c make check-world
>> passes.
>> I'll try to find out from what commit it started..

> Binary search has shown that all these failures begin with commit
> 7518049980be1d90264addab003476ae105f70d4 (Prevent int128 from requiring
> more than MAXALIGN alignment.).

Hm ... so apparently, that compiler has bugs in handling nondefault
alignment specs.  You said upthread it was gcc, but what version
exactly?

                        regards, tom lane

Re: master make check fails on Solaris 10

Marina Polyakova
On 12-01-2018 21:00, Tom Lane wrote:

> Marina Polyakova <[hidden email]> writes:
>> ...
>> Binary search has shown that all these failures begin with commit
>> 7518049980be1d90264addab003476ae105f70d4 (Prevent int128 from
>> requiring
>> more than MAXALIGN alignment.).
>
> Hm ... so apparently, that compiler has bugs in handling nondefault
> alignment specs.  You said upthread it was gcc, but what version
> exactly?

This is 5.2.0:

$ gcc -v
Reading specs from /opt/csw/lib/gcc/sparc-sun-solaris2.10/5.2.0/specs
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/csw/libexec/gcc/sparc-sun-solaris2.10/5.2.0/lto-wrapper
Target: sparc-sun-solaris2.10
Configured with:
/home/dam/mgar/pkg/gcc5/trunk/work/solaris10-sparc/build-isa-sparcv8plus/gcc-5.2.0/configure
--prefix=/opt/csw --exec_prefix=/opt/csw --bindir=/opt/csw/bin
--sbindir=/opt/csw/sbin --libexecdir=/opt/csw/libexec
--datadir=/opt/csw/share --sysconfdir=/etc/opt/csw
--sharedstatedir=/opt/csw/share --localstatedir=/var/opt/csw
--libdir=/opt/csw/lib --infodir=/opt/csw/share/info
--includedir=/opt/csw/include --mandir=/opt/csw/share/man
--enable-cloog-backend=isl --enable-java-awt=xlib
--enable-languages=ada,c,c++,fortran,go,java,objc --enable-libada
--enable-libssp --enable-nls --enable-objc-gc --enable-threads=posix
--program-suffix=-5.2 --with-cloog=/opt/csw --with-gmp=/opt/csw
--with-included-gettext --with-ld=/usr/ccs/bin/ld --without-gnu-ld
--with-libiconv-prefix=/opt/csw --with-mpfr=/opt/csw --with-ppl=/opt/csw
--with-system-zlib=/opt/csw --with-as=/usr/ccs/bin/as --without-gnu-as
Thread model: posix
gcc version 5.2.0 (GCC)

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: master make check fails on Solaris 10

Tom Lane-2
Marina Polyakova <[hidden email]> writes:
> On 12-01-2018 21:00, Tom Lane wrote:
>> Hm ... so apparently, that compiler has bugs in handling nondefault
>> alignment specs.  You said upthread it was gcc, but what version
>> exactly?

> This is 5.2.0:

Ugh ... protosciurus has 3.4.3, but I see that configure detects that
as *not* having __int128.  Probably what's happening on your machine
is that gcc knows __int128 but generates buggy code for it when an
alignment spec is given.  So that's unfortunate, but it's not really
a regression from 3.4.3.

I'm not sure there's much we can do about this.  Dropping the use
of the alignment spec isn't a workable option.  If there were a
simple way for configure to detect that the compiler generates bad
code for that, we could have it do so and reject use of __int128,
but it'd be up to you to come up with a workable test.

In the end this might just be an instance of the old saw about
avoiding dot-zero releases.  Have you tried a newer gcc?
(Digging in their bugzilla finds quite a number of __int128 bugs
fixed in 5.4.x, though none look to be specifically about
misaligned data.)

Also, if it still happens with current gcc on that hardware,
there'd be grounds for a new bug report to them.

                        regards, tom lane

Re: master make check fails on Solaris 10

Marina Polyakova
Thank you very much!

On 13-01-2018 21:10, Tom Lane wrote:

> Marina Polyakova <[hidden email]> writes:
>> On 12-01-2018 21:00, Tom Lane wrote:
>>> Hm ... so apparently, that compiler has bugs in handling nondefault
>>> alignment specs.  You said upthread it was gcc, but what version
>>> exactly?
>
>> This is 5.2.0:
>
> Ugh ... protosciurus has 3.4.3, but I see that configure detects that
> as *not* having __int128.  Probably what's happening on your machine
> is that gcc knows __int128 but generates buggy code for it when an
> alignment spec is given.  So that's unfortunate, but it's not really
> a regression from 3.4.3.
>
> I'm not sure there's much we can do about this.  Dropping the use
> of the alignment spec isn't a workable option.  If there were a
> simple way for configure to detect that the compiler generates bad
> code for that, we could have it do so and reject use of __int128,
> but it'd be up to you to come up with a workable test.

I'll think about it..

> In the end this might just be an instance of the old saw about
> avoiding dot-zero releases.  Have you tried a newer gcc?
> (Digging in their bugzilla finds quite a number of __int128 bugs
> fixed in 5.4.x, though none look to be specifically about
> misaligned data.)

As I was told offlist, 5.2.0 is already a fairly new version of gcc for
this system..

> Also, if it still happens with current gcc on that hardware,
> there'd be grounds for a new bug report to them.

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: master make check fails on Solaris 10

Tom Lane-2
Marina Polyakova <[hidden email]> writes:
> On 13-01-2018 21:10, Tom Lane wrote:
>> I'm not sure there's much we can do about this.  Dropping the use
>> of the alignment spec isn't a workable option.  If there were a
>> simple way for configure to detect that the compiler generates bad
>> code for that, we could have it do so and reject use of __int128,
>> but it'd be up to you to come up with a workable test.

> I'll think about it..

Attached is a possible test program.  I can confirm it passes on a
machine with working __int128, but I have no idea whether it will
detect the problem on yours.  If not, maybe you can tweak it?

                        regards, tom lane


#include <stddef.h>
#include <stdio.h>

/* GCC, Sunpro and XLC support aligned */
#if defined(__GNUC__) || defined(__SUNPRO_C) || defined(__IBMC__)
#define pg_attribute_aligned(a) __attribute__((aligned(a)))
#endif

typedef __int128 int128a
#if defined(pg_attribute_aligned)
pg_attribute_aligned(8)
#endif
;

/*
 * These are globals to discourage the compiler from folding all the
 * arithmetic tests down to compile-time constants.  We do not have
 * convenient support for 128bit literals at this point...
 */
struct glob128
{
        __int128 start;
        char pad;
        int128a a;
        int128a b;
        int128a c;
        int128a d;
} g = {0, 'p', 48828125, 97656255, 0, 0};

int
main()
{
        if (offsetof(struct glob128, a) < 17 ||
                offsetof(struct glob128, a) > 24)
        {
                printf("wrong alignment, %d\n", (int) offsetof(struct glob128, a));
                return 1;
        }
        g.a = (g.a << 12) + 1; /* 200000000001 */
        g.b = (g.b << 12) + 5; /* 400000000005 */
        /* use the most relevant arithmetic ops */
        g.c = g.a * g.b;
        g.d = (g.c + g.b) / g.b;
        /* return different values, to prevent optimizations */
        if (g.d != g.a + 1)
        {
                printf("wrong arithmetic result\n");
                return 1;
        }
        printf("A-OK!\n");
        return 0;
}

Re: master make check fails on Solaris 10

Marina Polyakova
[I added Victor Wagner as co-researcher of this problem]

On 13-01-2018 21:10, Tom Lane wrote:
> In the end this might just be an instance of the old saw about
> avoiding dot-zero releases.  Have you tried a newer gcc?
> (Digging in their bugzilla finds quite a number of __int128 bugs
> fixed in 5.4.x, though none look to be specifically about
> misaligned data.)

gcc 5.5.0 (from [1]) did not fix the problem..

On 16-01-2018 2:41, Tom Lane wrote:

> Marina Polyakova <[hidden email]> writes:
>> On 13-01-2018 21:10, Tom Lane wrote:
>>> I'm not sure there's much we can do about this.  Dropping the use
>>> of the alignment spec isn't a workable option.  If there were a
>>> simple way for configure to detect that the compiler generates bad
>>> code for that, we could have it do so and reject use of __int128,
>>> but it'd be up to you to come up with a workable test.
>
>> I'll think about it..
>
> Attached is a possible test program.  I can confirm it passes on a
> machine with working __int128, but I have no idea whether it will
> detect the problem on yours.  If not, maybe you can tweak it?

Thank you! Using gcc 5.5.0 it prints that everything is ok. But,
investigating the regression diffs, we found out that the error occurs
when we pass int128 as not the first argument to the function (perhaps
its value is replaced by the value of some address):

-- Use queries from random.sql
SELECT count(*) FROM onek; -- Everything is ok
...
SELECT random, count(random) FROM RANDOM_TBL
   GROUP BY random HAVING count(random) > 3; -- Everything is ok

postgres=# SELECT * FROM RANDOM_TBL ORDER BY random; -- Print current
data
  random
--------
      78
      86
      98
      98
(4 rows)

postgres=# SELECT AVG(random) FROM RANDOM_TBL
postgres-#   HAVING AVG(random) NOT BETWEEN 80 AND 120; -- Oops!
               avg
-------------------------------
  79446934848446476698976780288
(1 row)

Debug output from the last query (see attached diff.patch, it is based
on commit 9c7d06d60680c7f00d931233873dee81fdb311c6 of master):

makeInt128AggState
int8_avg_accum val 98
int8_avg_accum val_int128 as 2 x int64: 0 98
int8_avg_accum val_int128 bytes: 00000000000000000000000000000062
int8_avg_accum state 100e648d8
int8_avg_accum 1007f2e94
do_int128_accum int128 newval as 2 x int64: 4306826968 0
do_int128_accum int128 newval bytes: 0000000100B4F6D80000000000000000
do_int128_accum state 100e648d8
do_int128_accum 1007f1e30
int8_avg_accum val 86
int8_avg_accum val_int128 as 2 x int64: 0 86
int8_avg_accum val_int128 bytes: 00000000000000000000000000000056
int8_avg_accum state 100e648d8
int8_avg_accum 1007f2e94
do_int128_accum int128 newval as 2 x int64: 4306826968 0
do_int128_accum int128 newval bytes: 0000000100B4F6D80000000000000000
do_int128_accum state 100e648d8
do_int128_accum 1007f1e30
int8_avg_accum val 98
int8_avg_accum val_int128 as 2 x int64: 0 98
int8_avg_accum val_int128 bytes: 00000000000000000000000000000062
int8_avg_accum state 100e648d8
int8_avg_accum 1007f2e94
do_int128_accum int128 newval as 2 x int64: 4306826968 0
do_int128_accum int128 newval bytes: 0000000100B4F6D80000000000000000
do_int128_accum state 100e648d8
do_int128_accum 1007f1e30
int8_avg_accum val 78
int8_avg_accum val_int128 as 2 x int64: 0 78
int8_avg_accum val_int128 bytes: 0000000000000000000000000000004E
int8_avg_accum state 100e648d8
int8_avg_accum 1007f2e94
do_int128_accum int128 newval as 2 x int64: 4306826968 0
do_int128_accum int128 newval bytes: 0000000100B4F6D80000000000000000
do_int128_accum state 100e648d8
do_int128_accum 1007f1e30
numeric_poly_avg
int128_to_numericvar
int128_to_numericvar int128 val as 2 x int64: 17227307872 0
int128_to_numericvar int128 val bytes: 0000000402D3DB600000000000000000

(val_int128 in the function int8_avg_accum is correct, but newval in the
function do_int128_accum is not equal to it. val in the function
int128_to_numericvar is (4 * 4306826968).)

Based on this, we modified the test program (see attached). Here is its
output on Solaris 10 for different alignments requirements for int128
(on my machine where make check-world passes everything is OK)
(ALIGNOF_PG_INT128_TYPE is 16 on Solaris 10):

$ gcc -D PG_ALIGN_128=16 -m64 -o int128test2 int128test2.c
$ ./int128test2
basic aritmetic OK
pass int 16 OK
pass uint 16 OK
pass int 32 OK
pass int 64 OK
pass int 128 OK
$ gcc -D PG_ALIGN_128=8 -m64 -o int128test2 int128test2.c
$ ./int128test2
basic aritmetic OK
pass int 16 FAILED
pass uint 16 FAILED
pass int 32 FAILED
pass int 64 FAILED
pass int 128 OK

Maybe some pass test from int128test2.c can be used to test __int128?

P.S. I suppose, g.b should be 97656250 to get 400000000005:

> struct glob128
> {
> __int128 start;
> char pad;
> int128a a;
> int128a b;
> int128a c;
> int128a d;
> } g = {0, 'p', 48828125, 97656255, 0, 0};
> ...
> g.b = (g.b << 12) + 5; /* 400000000005 */

[1] https://www.opencsw.org

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

int128test2.c (3K) Download Attachment

Re: master make check fails on Solaris 10

Tom Lane-2
Marina Polyakova <[hidden email]> writes:
> investigating the regression diffs, we found out that the error occurs
> when we pass int128 as not the first argument to the function (perhaps
> its value is replaced by the value of some address):
> ...
> Based on this, we modified the test program (see attached). Here is its
> output on Solaris 10 for different alignments requirements for int128
> (on my machine where make check-world passes everything is OK)
> (ALIGNOF_PG_INT128_TYPE is 16 on Solaris 10):

Excellent.  This fails the same way on gcc 5.2.0 and 5.5.0?

> Maybe some pass test from int128test2.c can be used to test __int128?

Yeah, I can work with this.  What I propose to do is use a somewhat
stripped-down version of this test as an AC_RUN_IFELSE test normally,
but if cross-compiling, fall back to just seeing if we can link.

Thanks for investigating!

                        regards, tom lane

Re: master make check fails on Solaris 10

Marina Polyakova
In reply to this post by Marina Polyakova
Sorry, diff.patch is attached now.

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff.patch (3K) Download Attachment

Re: master make check fails on Solaris 10

Victor Wagner
In reply to this post by Marina Polyakova
On Wed, 17 Jan 2018 18:02:26 +0300
Marina Polyakova <[hidden email]> wrote:


> > Attached is a possible test program.  I can confirm it passes on a
> > machine with working __int128, but I have no idea whether it will
> > detect the problem on yours.  If not, maybe you can tweak it?
>
> Thank you! Using gcc 5.5.0 it prints that everything is ok. But,
> investigating the regression diffs, we found out that the error
> occurs when we pass int128 as not the first argument to the function
> (perhaps its value is replaced by the value of some address):

I'm attaching a stripped-down version of the test program that demonstrates
the problem, along with two assembler listings produced from this C source
using alignment 8 and 16. Maybe this stripped-down version can be used as
a base for the configure test.

As it turns out, Sparc GCC passes function arguments via the register ring,
which is referenced as %on in the calling code and as %in in the function.

And somehow the alignment attribute of the typedef affects how the
arguments are accessed in the function, but doesn't affect how the register
ring is filled before the call. This looks like a bug in GCC.

Unfortunately, we have only one Sparc machine, and we started our
investigation by upgrading GCC 5.2.0 to GCC 5.5.0, so it is hard to
downgrade and test with the older GCC.

--

align_test.c (702 bytes) Download Attachment
align_test8.s (6K) Download Attachment
align_test16.s (6K) Download Attachment

Re: master make check fails on Solaris 10

Tom Lane-2
In reply to this post by Marina Polyakova
BTW, now that you've demonstrated that the bug exists in a current
gcc release, you should definitely file a bug at
https://gcc.gnu.org/bugzilla/
I think you can just give them int128test2.c as-is as a test case.

Please do that and let me know the PR number --- I think it would be
good to cite the bug specifically in the comments for our configure code.

                        regards, tom lane

Re: master make check fails on Solaris 10

Victor Wagner
In reply to this post by Tom Lane-2
On Wed, 17 Jan 2018 10:07:37 -0500
Tom Lane <[hidden email]> wrote:

> Marina Polyakova <[hidden email]> writes:

> Yeah, I can work with this.  What I propose to do is use a somewhat
> stripped-down version of this test as an AC_RUN_IFELSE test normally,
> but if cross-compiling, fall back to just seeing if we can link.

I'd suggest to add a configure option to switch off 128-bit support
(--disable-int128), especially for these cross-compile cases where link
test cannot give us enough information to decide automatically.

--

Re: master make check fails on Solaris 10

Tom Lane-2
Victor Wagner <[hidden email]> writes:
> On Wed, 17 Jan 2018 10:07:37 -0500
> Tom Lane <[hidden email]> wrote:
>> Yeah, I can work with this.  What I propose to do is use a somewhat
>> stripped-down version of this test as an AC_RUN_IFELSE test normally,
>> but if cross-compiling, fall back to just seeing if we can link.

> I'd suggest to add a configure option to switch off 128-bit support
> (--disable-int128), especially for these cross-compile cases where link
> test cannot give us enough information to decide automatically.

I don't want to go there without some evidence that the problem is much
more widespread than it appears now.  A disable switch will be a permanent
documentation and maintenance overhead, plus anyone who puts it into
their build scripts will probably never remember to remove it :-(.
And how many people will be cross-compiling to Solaris/SPARC anyway?
(If there are any, they can always manually change pg_config.h ...)

                        regards, tom lane
