BUG #14897: Segfault on statitics SQL request

BUG #14897: Segfault on statitics SQL request

Vincent Lachenal
The following bug has been logged on the website:

Bug reference:      14897
Logged by:          Vincent Lachenal
Email address:      [hidden email]
PostgreSQL version: 10.1
Operating system:   Linux (Archlinux)
Description:        

Hi,

First of all, thanks for your work. PostgreSQL is my favorite SQL
database.

I am contacting you because I just upgraded PostgreSQL from 9.6.5 to 10.1.
The migration went fine and my databases seem to work.

But one of the queries I use causes a segmentation fault in the server process.

The query is:
SELECT
        s.protocol,
        s.mapper,
        c.method,
        s.nb_threads,
        avg(c.client_end - c.client_start) / 1000000 AS total,
        avg(c.server_end - c.server_start) / 1000000 AS server,
        avg(c.server_start - c.client_start) / 1000000 AS client_to_server,
        avg(c.client_end - c.server_end) / 1000000 AS server_to_client
FROM testsuite s
INNER JOIN testcall c ON s.id = c.test_suite_id
GROUP BY (s.protocol, s.mapper, c.method, s.nb_threads)
ORDER BY s.nb_threads, c.method, s.mapper, s.protocol;


In the systemctl logs I have the following (originally in French, translated here):
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]: 2017-11-10 19:27:18.705 CET [5462] LOG:  server process (PID 5601) was terminated by signal 11: Segmentation fault
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]: 2017-11-10 19:27:18.705 CET [5462] DETAIL:  Failed process was running: SELECT s.protocol,s.mapper,c.method,s.nb_threads,avg(c.client_end - c.client_start) / 1000000 AS total,avg(c.server_end - c.server_start) / 1000000 AS server,avg(c.server_start - c.client_start) / 1000000 AS client_to_server,avg(c.client_end - c.server_end) / 1000000 AS server_to_client FROM testsuite s INNER JOIN testcall c ON s.id = c.test_suite_id GROUP BY (s.protocol, s.mapper, c.method, s.nb_threads) ORDER BY s.nb_threads, c.method, s.mapper, s.protocol;
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]: 2017-11-10 19:27:18.705 CET [5462] LOG:  terminating any other active server processes
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]: 2017-11-10 19:27:18.707 CET [5551] WARNING:  terminating connection because of crash of another server process
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]: 2017-11-10 19:27:18.707 CET [5551] DETAIL:  The postmaster has commanded this server process to roll back the current transaction
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]:         and exit, because another server process exited abnormally
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]:         and possibly corrupted shared memory.
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]: 2017-11-10 19:27:18.707 CET [5551] HINT:  In a moment you should be able to reconnect to the database
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]:         and repeat your command.
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]: 2017-11-10 19:27:18.708 CET [5462] LOG:  all server processes terminated; reinitializing
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]: 2017-11-10 19:27:18.729 CET [5605] LOG:  database system was interrupted; last known up at 2017-11-10 19:26:23 CET
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]: 2017-11-10 19:27:18.810 CET [5605] LOG:  database system was not properly shut down; automatic recovery
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]:         in progress
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]: 2017-11-10 19:27:18.813 CET [5605] LOG:  redo starts at 0/3160FD0
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]: 2017-11-10 19:27:18.813 CET [5605] LOG:  invalid record length at 0/3161008: wanted 24, got 0
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]: 2017-11-10 19:27:18.813 CET [5605] LOG:  redo done at 0/3160FD0
Nov 10 19:27:18 daidoji.rokugan.local postgres[5460]: 2017-11-10 19:27:18.830 CET [5462] LOG:  database system is ready to accept connections


--
Sent via pgsql-bugs mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: BUG #14897: Segfault on statitics SQL request

Tom Lane-2
[hidden email] writes:
> But one of the queries I use causes a segmentation fault in the server process.

> The query is:
> SELECT
> s.protocol,
> s.mapper,
> c.method,
> s.nb_threads,
> avg(c.client_end - c.client_start) / 1000000 AS total,
> avg(c.server_end - c.server_start) / 1000000 AS server,
> avg(c.server_start - c.client_start) / 1000000 AS client_to_server,
> avg(c.client_end - c.server_end) / 1000000 AS server_to_client
> FROM testsuite s
> INNER JOIN testcall c ON s.id = c.test_suite_id
> GROUP BY (s.protocol, s.mapper, c.method, s.nb_threads)
> ORDER BY s.nb_threads, c.method, s.mapper, s.protocol;

The query alone doesn't help us much.  Can you put together a
self-contained test case, including table declarations and some
sample data?  Also, are you using any non-default configuration
settings?

Alternatively, if you can get a stack trace from the crash,
that might be enough info to diagnose it (no promises though).

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

                        regards, tom lane



Re: BUG #14897: Segfault on statitics SQL request

Vincent Lachenal
Sorry, I could not find a way to attach a file in the web interface.
The database dump is attached to this message.

I have not tweaked the database settings. I use the default Arch Linux configuration (which should be the default PostgreSQL configuration).

I will try to get a stack trace.

Regards.

Vincent


Attachment: dump.tar.xz (2M)

Re: BUG #14897: Segfault on statitics SQL request

Vincent Lachenal
Sorry, I could not find a way to attach a file in the web interface.
The database dump is available here: https://github.com/vlachenal/webservices-bench (it is the dump.tar.xz file).

I have not tweaked the database settings. I use the default Arch Linux configuration (which should be the default PostgreSQL configuration).

I will try to get a stack trace.

Regards.

Vincent


Re: BUG #14897: Segfault on statitics SQL request

Tom Lane-2
In reply to this post by Vincent Lachenal
Vincent Lachenal <[hidden email]> writes:
> Sorry, I could not find a way to attach a file in the web interface.
> The database dump is attached to this message.

Thanks for sending the data!  However, the given query doesn't crash for
me, so there must be something non-default about your parameter settings.
Could you send the result of

select name,setting,source from pg_settings where source != 'default';

                        regards, tom lane



Re: BUG #14897: Segfault on statitics SQL request

Vincent Lachenal
The query returns this result on my database:
select name,setting,source from pg_settings where source != 'default';
           name            |      setting      |        source         
----------------------------+-------------------+----------------------
application_name           | psql              | client
client_encoding            | UTF8              | client
data_checksums             | off               | override
DateStyle                  | ISO, DMY          | configuration file
default_text_search_config | pg_catalog.french | configuration file
dynamic_shared_memory_type | posix             | configuration file
lc_collate                 | fr_FR.utf8        | override
lc_ctype                   | fr_FR.utf8        | override
lc_messages                | fr_FR.utf8        | configuration file
lc_monetary                | fr_FR.utf8        | configuration file
lc_numeric                 | fr_FR.utf8        | configuration file
lc_time                    | fr_FR.utf8        | configuration file
log_timezone               | Europe/Paris      | configuration file
max_connections            | 100               | configuration file
max_stack_depth            | 2048              | environment variable
server_encoding            | UTF8              | override
shared_buffers             | 16384             | configuration file
TimeZone                   | Europe/Paris      | configuration file
transaction_deferrable     | off               | override
transaction_isolation      | read committed    | override
transaction_read_only      | off               | override
wal_buffers                | 512               | override

On the postgres database, it returns:
select name,setting,source from pg_settings where source != 'default';
           name            |                setting                 |        source         
----------------------------+----------------------------------------+----------------------
application_name           | psql                                   | client
client_encoding            | UTF8                                   | client
config_file                | /var/lib/postgres/data/postgresql.conf | override
data_checksums             | off                                    | override
data_directory             | /var/lib/postgres/data                 | override
DateStyle                  | ISO, DMY                               | configuration file
default_text_search_config | pg_catalog.french                      | configuration file
dynamic_shared_memory_type | posix                                  | configuration file
hba_file                   | /var/lib/postgres/data/pg_hba.conf     | override
ident_file                 | /var/lib/postgres/data/pg_ident.conf   | override
lc_collate                 | fr_FR.utf8                             | override
lc_ctype                   | fr_FR.utf8                             | override
lc_messages                | fr_FR.utf8                             | configuration file
lc_monetary                | fr_FR.utf8                             | configuration file
lc_numeric                 | fr_FR.utf8                             | configuration file
lc_time                    | fr_FR.utf8                             | configuration file
log_timezone               | Europe/Paris                           | configuration file
max_connections            | 100                                    | configuration file
max_stack_depth            | 2048                                   | environment variable
server_encoding            | UTF8                                   | override
shared_buffers             | 16384                                  | configuration file
TimeZone                   | Europe/Paris                           | configuration file
transaction_deferrable     | off                                    | override
transaction_isolation      | read committed                         | override
transaction_read_only      | off                                    | override
wal_buffers                | 512                                    | override

Regards.

Vincent


Re: BUG #14897: Segfault on statitics SQL request

Tom Lane-2
Vincent Lachenal <[hidden email]> writes:
> The query returns this result on my database:
> select name,setting,source from pg_settings where source != 'default';

Hmm .... not much help there.  I wonder if this is specific to ArchLinux.
We do have an Arch machine in the buildfarm, but it's an ARM, so that
might not prove much about Arch on other hardware.

One other thing that might be useful: could you send the output of
"pg_config" (that's a command-line program, not a SQL command)?

                        regards, tom lane



Re: BUG #14897: Segfault on statitics SQL request

Vincent Lachenal
Here is the result of pg_config:
$ pg_config  
BINDIR = /usr/bin
DOCDIR = /usr/share/doc/postgresql
HTMLDIR = /usr/share/doc/postgresql
INCLUDEDIR = /usr/include
PKGINCLUDEDIR = /usr/include/postgresql
INCLUDEDIR-SERVER = /usr/include/postgresql/server
LIBDIR = /usr/lib
PKGLIBDIR = /usr/lib/postgresql
LOCALEDIR = /usr/share/locale
MANDIR = /usr/share/man
SHAREDIR = /usr/share/postgresql
SYSCONFDIR = /etc/postgresql
PGXS = /usr/lib/postgresql/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--prefix=/usr' '--mandir=/usr/share/man' '--datadir=/usr/share/postgresql' '--sysconfdir=/etc' '--with-gssapi' '--with-libxml' '--with-openssl' '--with-perl' '--with-python' 'PYTHON=/usr/bin/python2' '--with-tcl' '--with-pam' '--with-system-tzdata=/usr/share/zoneinfo' '--with-uuid=e2fs' '--enable-nls' '--enable-thread-safety' 'CFLAGS=-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt' 'LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now' 'CPPFLAGS=-D_FORTIFY_SOURCE=2'
CC = gcc
CPPFLAGS = -DFRONTEND -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2
CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt
CFLAGS_SL = -fPIC
LDFLAGS = -L../../src/common -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -Wl,--as-needed -Wl,-rpath,'/usr/lib',--enable-new-dtags
LDFLAGS_EX =  
LDFLAGS_SL =  
LIBS = -lpgcommon -lpgport -lpthread -lxml2 -lpam -lssl -lcrypto -lgssapi_krb5 -lz -lreadline -lrt -lcrypt -ldl -lm   
VERSION = PostgreSQL 10.1

If this does not help, I will recreate the database and load the dump into it. It could be a migration bug.
Arch Linux does not provide debugging symbol packages, and my other computer runs Gentoo (where it will take a long time to get a stable release). If the bug persists, I will open a bug on the Arch Linux bug tracker.

Regards.

Vincent


Re: BUG #14897: Segfault on statitics SQL request

Tom Lane-2
Vincent Lachenal <[hidden email]> writes:
> Here is the result of pg_config:

Hmm ... there are some nondefault compiler flags that they're sticking in,
but nothing that isn't used by other distros AFAIK.

> If this does not help, I will recreate the database and load the dump into it.
> It could be a migration bug.

It would certainly be good to see if you can reproduce the crash from a
freshly-loaded copy of that dump file.

> Arch Linux does not provide debugging symbol packages, and my other computer
> runs Gentoo (where it will take a long time to get a stable release). If the
> bug persists, I will open a bug on the Arch Linux bug tracker.

Ugh.  With no symbols we couldn't learn anything very useful from a core
dump.

                        regards, tom lane



Re: BUG #14897: Segfault on statitics SQL request

Vincent Lachenal
Tom,

With a fresh database, the bug is still there.
I will try to compile PostgreSQL on my computer before opening a bug with Arch Linux (I will also try the testing Gentoo package). If I find something interesting, I will reply in this thread.

Thanks for your help and your responsiveness. The bug seems to be limited to my specific environment, which is good news (this Arch installation is almost 4 years old, and Arch is not known for its stability).

Regards.

Vincent


Re: BUG #14897: Segfault on statitics SQL request

Tom Lane-2
Vincent Lachenal <[hidden email]> writes:
> With a fresh database, the bug is still there.
> I will try to compile PostgreSQL on my computer before opening a bug with
> Arch Linux (I will also try the testing Gentoo package). If I find something
> interesting, I will reply in this thread.

> Thanks for your help and your responsiveness. The bug seems to be limited to my
> specific environment, which is good news (this Arch installation is almost
> 4 years old, and Arch is not known for its stability).

FWIW, I tried a build on Fedora 25 using the same compiler flags you
showed.  Works OK there too.  So this suggests possibly an Archlinux-
specific compiler bug, though I wouldn't want to pin blame without
more evidence.

It'd definitely be interesting to see a stack trace, if you manage
to reproduce the problem in a debug-enabled build.

                        regards, tom lane



Re: BUG #14897: Segfault on statitics SQL request

Dmitry Dolgov
> On 10 November 2017 at 23:16, Tom Lane <[hidden email]> wrote:
>
> It'd definitely be interesting to see a stack trace, if you manage
> to reproduce the problem in a debug-enabled build.

Looks like I can reproduce something close to this issue on my Gentoo
installation using the provided dataset, but it looks quite weird for me:

at `numeric.c:4468` we generate `PolyNumAggState *result` as `Int128AggState`

4468    result = makePolyNumAggStateCurrentContext(false);

3972 static Int128AggState *
3973 makeInt128AggStateCurrentContext(bool calcSumX2)
3974 {
3975    Int128AggState *state;
3976
3977    state = (Int128AggState *) palloc0(sizeof(Int128AggState));
3978    state->calcSumX2 = calcSumX2;
3979
3980    return state;
3981 }

And as the result of this function we've got:

>>> p *state
>>> $2 = {
>>>   calcSumX2 = 0 '\000',
>>>   N = 0,
>>>   sumX = 0x00000000000000000000000000000000,
>>>   sumX2 = 0x00000000000000000000000000000000
>>> }

And after that `result->sumX` is passed to `numericvar_to_int128`

4479 #ifdef HAVE_INT128
4480    numericvar_to_int128(&num, &result->sumX);
4481 #else
4482    accum_sum_add(&result->sumX, &num);
4483 #endif

At `numeric.c:6348` we've got a segmentation fault

6348    *result = neg ? -val : val;
>>> p *result
$3 = 0x00000000000000000000000000000000

Re: BUG #14897: Segfault on statitics SQL request

Tom Lane-2
Dmitry Dolgov <[hidden email]> writes:
> Looks like I can reproduce something close to this issue on my Gentoo
> installation using the provided dataset, but it looks quite weird for me:

Interesting.  I wonder whether __int128 has an alignment requirement that
is more than MAXALIGN.  Intel chips generally don't enforce alignment
requirements, but maybe there's an exception here?

My Fedora box thinks __alignof__(__int128) is 16, which is suspicious,
but it's not crashing.

                        regards, tom lane



Re: BUG #14897: Segfault on statitics SQL request

Andres Freund
On 2017-11-10 18:06:55 -0500, Tom Lane wrote:
> Interesting.  I wonder whether __int128 has an alignment requirement that
> is more than MAXALIGN.  Intel chips generally don't enforce alignment
> requirements, but maybe there's an exception here?

As long as no SIMD instructions are used... Which'd be compiler version
and flag specific.

Greetings,

Andres Freund



Re: BUG #14897: Segfault on statitics SQL request

Michael Paquier
In reply to this post by Tom Lane-2
On Sat, Nov 11, 2017 at 8:06 AM, Tom Lane <[hidden email]> wrote:

> Dmitry Dolgov <[hidden email]> writes:
>> Looks like I can reproduce something close to this issue on my Gentoo
>> installation using the provided dataset, but it looks quite weird for me:
>
> Interesting.  I wonder whether __int128 has an alignment requirement that
> is more than MAXALIGN.  Intel chips generally don't enforce alignment
> requirements, but maybe there's an exception here?
>
> My Fedora box thinks __alignof__(__int128) is 16, which is suspicious,
> but it's not crashing.

My laptop uses Arch, and I can see the crash easily when compiling
with gcc 7.2 which is the one bundled in the core package set:
#0  int8_avg_combine (fcinfo=0x55e290767d50) at numeric.c:4355
#1  0x000055e28e571ae3 in advance_combine_function (pergroupstate=0x55e290764ac0, pertrans=0x55e290767c28, aggstate=0x55e290756d78) at nodeAgg.c:1264
#2  combine_aggregates (aggstate=0x55e290756d78, pergroup=<optimized out>) at nodeAgg.c:1198
#3  0x000055e28e5727ad in agg_retrieve_direct (aggstate=0x55e290756d78) at nodeAgg.c:2438
#4  ExecAgg (pstate=0x55e290756d78) at nodeAgg.c:2155
#5  0x000055e28e5649da in ExecProcNode (node=0x55e290756d78) at ../../../src/include/executor/executor.h:251
#6  ExecutePlan (execute_once=<optimized out>, dest=0x7ff0cec60d98, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x55e290756d78, estate=0x55e290756b38) at execMain.c:1720
#7  standard_ExecutorRun (queryDesc=0x55e290756728, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:363
#8  0x000055e28e69166d in PortalRunSelect (portal=portal@entry=0x55e290754718, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x7ff0cec60d98) at pquery.c:932
#9  0x000055e28e692b4e in PortalRun (portal=portal@entry=0x55e290754718, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', run_once=run_once@entry=1 '\001', dest=dest@entry=0x7ff0cec60d98, altdest=altdest@entry=0x7ff0cec60d98, completionTag=0x7ffd6d2dfd50 "") at pquery.c:773
#10 0x000055e28e68e882 in exec_simple_query (query_string=0x55e290686398 "SELECT\n        s.protocol,\n        s.mapper,\n        c.method,\n        s.nb_threads,\n        avg(c.client_end - c.client_start) / 1000000 AS total,\n        avg(c.server_end - c.server_start) / 1000000"...) at postgres.c:1120
#11 0x000055e28e6907f0 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x55e290698dd8, dbname=<optimized out>, username=<optimized out>) at postgres.c:4139
#12 0x000055e28e3e531c in BackendRun (port=0x55e290690500) at postmaster.c:4364
#13 BackendStartup (port=0x55e290690500) at postmaster.c:4036
#14 ServerLoop () at postmaster.c:1755
#15 0x000055e28e622fe4 in PostmasterMain (argc=3, argv=0x55e290666760) at postmaster.c:1363
#16 0x000055e28e3e68e8 in main (argc=3, argv=0x55e290666760) at main.c:228
(gdb) p state1->sumX
$1 = 0x00000000000000000000000000000000
(gdb) p state2->sumX
$2 = 0x0000000000000000000000004c170e30
--
Michael



Re: BUG #14897: Segfault on statitics SQL request

Dmitry Dolgov
> On 11 November 2017 at 09:39, Michael Paquier <[hidden email]> wrote:
>
> My laptop uses Arch, and I can see the crash easily when compiling
> with gcc 7.2 which is the one bundled in the core package set

Yes, I forgot to mention that I also used a quite recent version of gcc, (GCC) 8.0.0 20170805 (experimental).



Re: BUG #14897: Segfault on statitics SQL request

Vincent Lachenal
I retried the database migration using another ... and the bug no longer occurs.
The first time, I used the "Manual dump and reload" migration method from the Arch Linux wiki (https://wiki.archlinux.org/index.php/PostgreSQL#Upgrading_PostgreSQL):
# systemctl stop postgresql.service
# mv /var/lib/postgres/data /var/lib/postgres/olddata
# mkdir /var/lib/postgres/data
# chown postgres:postgres /var/lib/postgres/data
[postgres]$ initdb --locale $LANG -E UTF8 -D '/var/lib/postgres/data'
# /opt/pgsql-9.6/bin/pg_ctl -D /var/lib/postgres/olddata/ start
# /opt/pgsql-9.6/bin/pg_dumpall >> old_backup.sql
# /opt/pgsql-9.6/bin/pg_ctl -D /var/lib/postgres/olddata/ stop
# systemctl start postgresql.service
# psql -f old_backup.sql postgres
The second time, I used the other method:
# systemctl stop postgresql.service
# mv /var/lib/postgres/data /var/lib/postgres/olddata
# mkdir /var/lib/postgres/data
# chown postgres:postgres /var/lib/postgres/data
[postgres]$ initdb --locale $LANG -E UTF8 -D '/var/lib/postgres/data'
[postgres]$ cd /tmp
[postgres]$ pg_upgrade -b /opt/pgsql-9.6/bin -B /usr/bin -d /var/lib/postgres/olddata -D /var/lib/postgres/data

I have kept the data from both migrations, so I will compile PostgreSQL with debugging symbols to get a usable stack trace.

Regards.

Vincent



Re: BUG #14897: Segfault on statitics SQL request

Tom Lane-2
In reply to this post by Dmitry Dolgov
Dmitry Dolgov <[hidden email]> writes:
>> On 11 November 2017 at 09:39, Michael Paquier <[hidden email]> wrote:
>> My laptop uses Arch, and I can see the crash easily when compiling
>> with gcc 7.2 which is the one bundled in the core package set

> Yes, I forgot to mention that I also used a quite recent version of gcc,
> (GCC) 8.0.0 20170805 (experimental).

Would you guys who are seeing the problem note whether the address of
the int128 field is 16-aligned, or only 8-aligned?

Also, it'd be real useful to see some disassembly around the point of
the crash, so that we can check whether the compiler is using SIMD
instructions.  (Or just get the compiler to generate a numeric.s file
with -S.)

I'm pretty suspicious that this is where the problem is, but it would
be good to have confirmation before we put effort into a fix.

                        regards, tom lane



Re: BUG #14897: Segfault on statitics SQL request

Dmitry Dolgov
> On 11 November 2017 at 17:36, Tom Lane <[hidden email]> wrote:
>
> Would you guys who are seeing the problem note whether the address of
> the int128 field is 16-aligned, or only 8-aligned?

__alignof__(__int128) returns 16 on my machine.

> Also, it'd be real useful to see some disassembly around the point of
> the crash, so that we can check whether the compiler is using SIMD
> instructions.  (Or just get the compiler to generate a numeric.s file
> with -S.)

Here is the disassembly section I've got in my case:

    0x00000000007fc474 numericvar_to_int128+458 je     0x7fc48c <numericvar_to_int128+482>
    0x00000000007fc476 numericvar_to_int128+460 mov    -0x88(%rbp),%rax
    0x00000000007fc47d numericvar_to_int128+467 movdqa -0x60(%rbp),%xmm0
--> 0x00000000007fc482 numericvar_to_int128+472 movaps %xmm0,(%rax)
    0x00000000007fc485 numericvar_to_int128+475 mov    $0x1,%eax
    0x00000000007fc48a numericvar_to_int128+480 jmp    0x7fc455 <numericvar_to_int128+427>
    0x00000000007fc48c numericvar_to_int128+482 negq   -0x60(%rbp)

Re: BUG #14897: Segfault on statitics SQL request

Andres Freund
On 2017-11-11 17:52:34 +0100, Dmitry Dolgov wrote:

> > On 11 November 2017 at 17:36, Tom Lane <[hidden email]> wrote:
> >
> > Would you guys who are seeing the problem note whether the address of
> > the int128 field is 16-aligned, or only 8-aligned?
>
> __alignof__(__int128) returns 16 on my machine.
>
> > Also, it'd be real useful to see some disassembly around the point of
> > the crash, so that we can check whether the compiler is using SIMD
> > instructions.  (Or just get the compiler to generate a numeric.s file
> > with -S.)
>
> Here is the disassembly section I've got in my case:
>
>     0x00000000007fc474 numericvar_to_int128+458 je     0x7fc48c <numericvar_to_int128+482>
>     0x00000000007fc476 numericvar_to_int128+460 mov    -0x88(%rbp),%rax
>     0x00000000007fc47d numericvar_to_int128+467 movdqa -0x60(%rbp),%xmm0
> --> 0x00000000007fc482 numericvar_to_int128+472 movaps %xmm0,(%rax)
>     0x00000000007fc485 numericvar_to_int128+475 mov    $0x1,%eax
>     0x00000000007fc48a numericvar_to_int128+480 jmp    0x7fc455 <numericvar_to_int128+427>
>     0x00000000007fc48c numericvar_to_int128+482 negq   -0x60(%rbp)

That's using SSE, which requires 16-byte alignment IIRC.  I think we need
a function that properly allocates int128 vars with the right alignment -
don't think we want to go for full 16-byte alignment for everything.

Greetings,

Andres Freund

