BUG #16833: postgresql 13.1 process crash every hour

PG Bug reporting form
The following bug has been logged on the website:

Bug reference:      16833
Logged by:          Alex F
Email address:      [hidden email]
PostgreSQL version: 13.1
Operating system:   4.14.209-160.339.amzn2.x86_64 #1 SMP Wed Dec 16 22
Description:        

PostgreSQL runs inside the official Docker container
https://hub.docker.com/_/postgres
psql (PostgreSQL) 13.1 (Debian 13.1-1.pgdg100+1)
on Amazon Linux 2

A server process crashes inside the Docker container 2-3 times per hour without any
additional information:
./postgresql-Thu-00.log:2021-01-21 00:11:34 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 20071) was terminated by signal 11: Segmentation fault
./postgresql-Thu-01.log:2021-01-21 01:11:50 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 23827) was terminated by signal 11: Segmentation fault
./postgresql-Thu-02.log:2021-01-21 02:11:31 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 27974) was terminated by signal 11: Segmentation fault
./postgresql-Thu-03.log:2021-01-21 03:11:56 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 31389) was terminated by signal 11: Segmentation fault
./postgresql-Thu-04.log:2021-01-21 04:11:23 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 2544) was terminated by signal 11: Segmentation fault
./postgresql-Thu-05.log:2021-01-21 05:11:52 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 5962) was terminated by signal 11: Segmentation fault
./postgresql-Thu-06.log:2021-01-21 06:12:41 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 60) was terminated by signal 11: Segmentation fault
./postgresql-Thu-07.log:2021-01-21 07:12:59 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 3810) was terminated by signal 11: Segmentation fault
./postgresql-Thu-08.log:2021-01-21 08:12:37 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 7730) was terminated by signal 11: Segmentation fault
./postgresql-Thu-09.log:2021-01-21 09:13:26 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 11257) was terminated by signal 11: Segmentation fault
./postgresql-Thu-10.log:2021-01-21 10:13:05 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 14982) was terminated by signal 11: Segmentation fault
./postgresql-Thu-11.log:2021-01-21 11:13:18 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 18503) was terminated by signal 11: Segmentation fault
./postgresql-Thu-12.log:2021-01-21 12:13:04 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 22037) was terminated by signal 11: Segmentation fault
./postgresql-Thu-13.log:2021-01-21 13:13:57 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 25484) was terminated by signal 11: Segmentation fault
./postgresql-Thu-14.log:2021-01-21 14:13:15 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 29072) was terminated by signal 11: Segmentation fault
./postgresql-Thu-15.log:2021-01-21 15:13:22 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 32657) was terminated by signal 11: Segmentation fault
./postgresql-Thu-16.log:2021-01-21 16:13:15 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 3788) was terminated by signal 11: Segmentation fault
./postgresql-Thu-17.log:2021-01-21 17:13:26 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 7306) was terminated by signal 11: Segmentation fault
./postgresql-Thu-18.log:2021-01-21 18:12:58 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 11885) was terminated by signal 11: Segmentation fault
./postgresql-Thu-19.log:2021-01-21 19:13:21 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 14442) was terminated by signal 11: Segmentation fault
./postgresql-Thu-20.log:2021-01-21 20:13:36 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 17975) was terminated by signal 11: Segmentation fault
./postgresql-Thu-21.log:2021-01-21 21:13:55 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 21631) was terminated by signal 11: Segmentation fault
./postgresql-Thu-22.log:2021-01-21 22:14:45 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 25251) was terminated by signal 11: Segmentation fault
./postgresql-Thu-23.log:2021-01-21 23:15:01 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 28851) was terminated by signal 11: Segmentation fault

I can share postgresql.conf and the crash core dumps for analysis.
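
Editorial sketch: the crash lines above can be turned into timestamps to check how regular the crashes are. The Python below is only an illustration, assuming each log entry sits on a single line in the actual log file; the names `crash_events` and `gaps_in_minutes` are hypothetical helpers, not part of PostgreSQL.

```python
import re
from datetime import datetime

# Matches the postmaster line that reports a backend killed by a signal.
CRASH_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC .*?"
    r"server process \(PID (?P<pid>\d+)\) was terminated by signal (?P<sig>\d+)"
)


def crash_events(lines):
    """Return (timestamp, pid, signal) for every crash line found."""
    events = []
    for line in lines:
        m = CRASH_RE.search(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
            events.append((ts, int(m.group("pid")), int(m.group("sig"))))
    return events


def gaps_in_minutes(events):
    """Minutes elapsed between consecutive crashes."""
    times = [ts for ts, _, _ in events]
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


# First three entries from the report, each joined onto one line.
sample = [
    "2021-01-21 00:11:34 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 20071) was terminated by signal 11: Segmentation fault",
    "2021-01-21 01:11:50 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 23827) was terminated by signal 11: Segmentation fault",
    "2021-01-21 02:11:31 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 27974) was terminated by signal 11: Segmentation fault",
]
events = crash_events(sample)
gaps = gaps_in_minutes(events)
print(gaps)
```

For these three entries the gaps come out at roughly 60 minutes each, which would be consistent with a periodic job or scheduled query triggering the crash; that is a hypothesis to check, not a conclusion.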

Re: BUG #16833: postgresql 13.1 process crash every hour

Tom Lane-2
PG Bug reporting form <[hidden email]> writes:
> Process crash inside docker container 2-3 times per hour without any
> additional information
> ./postgresql-Thu-00.log:2021-01-21 00:11:34 UTC [1]: user=,db=,app=,client=
> LOG:  server process (PID 20071) was terminated by signal 11: Segmentation
> fault

Hm, please see if you can get a stack trace:

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

Also try to figure out what query(s) are causing the crash.
(It's unlikely that the postmaster log doesn't provide more
information than you've shared here.)
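
Editorial sketch: one way to look for the extra information mentioned above is to scan the full log for the DETAIL line that the postmaster normally writes right after a crash line ("Failed process was running: ..."), when it knows the backend's last activity. The Python below is a minimal illustration under that assumption; `crashing_queries` is a hypothetical helper and the sample log lines, including the query text, are made up.

```python
import re

CRASH_RE = re.compile(r"server process \(PID (\d+)\) was terminated by signal (\d+)")
DETAIL_RE = re.compile(r"DETAIL:\s+Failed process was running: (.*)")


def crashing_queries(lines):
    """Pair each crashed PID with the query on the DETAIL line that follows it.

    A crash with no DETAIL line before the next crash is reported with None.
    """
    results = []
    pending_pid = None
    for line in lines:
        crash = CRASH_RE.search(line)
        if crash:
            if pending_pid is not None:
                results.append((pending_pid, None))
            pending_pid = int(crash.group(1))
            continue
        detail = DETAIL_RE.search(line)
        if detail and pending_pid is not None:
            results.append((pending_pid, detail.group(1).strip()))
            pending_pid = None
    if pending_pid is not None:
        results.append((pending_pid, None))
    return results


# Made-up sample: the query text is a placeholder, not taken from the report.
sample_log = [
    "2021-01-21 00:11:34 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 20071) was terminated by signal 11: Segmentation fault",
    "2021-01-21 00:11:34 UTC [1]: user=,db=,app=,client= DETAIL:  Failed process was running: SELECT 1 /* placeholder */",
    "2021-01-21 00:11:34 UTC [1]: user=,db=,app=,client= LOG:  terminating any other active server processes",
    "2021-01-21 01:11:50 UTC [1]: user=,db=,app=,client= LOG:  server process (PID 23827) was terminated by signal 11: Segmentation fault",
]
found = crashing_queries(sample_log)
print(found)
```

Note that the grep output in the report only shows the lines that matched the crash message, so any DETAIL lines would have to be read from the log files themselves.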

> I can share postgresql.conf, process crash core dumps for analysis

Core dumps are unlikely to help anyone else; they are too
machine-specific.  Not to mention that they might contain
sensitive data.  You'll need to examine them yourself.

                        regards, tom lane


Re: BUG #16833: postgresql 13.1 process crash every hour

Alex F
Thank you, Tom, for the detailed instructions!
My understanding is that some specific query led to database block corruption, which causes the segfault.
I reviewed the sequence of events before the segfault:
1. the database worked fine on pg13.0 for a few months
2. the issue appeared right after the upgrade to the pg13.1 binaries
3. (this was actually the wrong thing to do) I rolled back to the pg13.0 binaries, tried to start the database on them, and now get an endless segfault on startup:
LOG:  database system was interrupted while in recovery at 2021-01-22 14:27:49 UTC
HINT:  This probably means that some data is corrupted and you will have to use the last backup for recovery.
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 49D/364DD528
LOG:  startup process (PID 26) was terminated by signal 11: Segmentation fault
LOG:  aborting startup due to startup process failure
LOG:  database system is shut down

The most disappointing part is that the slave node also received the corrupted WAL block and is also unable to start.
So my only chance to recover the database is initdb + pg_dump.

Anyway, I will try to compile the pg13.1 binaries with --enable-debug and enable logging of all queries. I hope this will help with the investigation.
Thanks for your support!

On Fri, Jan 22, 2021 at 20:23, Tom Lane <[hidden email]> wrote:
PG Bug reporting form <[hidden email]> writes:
> Process crash inside docker container 2-3 times per hour without any
> additional information
> ./postgresql-Thu-00.log:2021-01-21 00:11:34 UTC [1]: user=,db=,app=,client=
> LOG:  server process (PID 20071) was terminated by signal 11: Segmentation
> fault

Hm, please see if you can get a stack trace:

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

Also try to figure out what query(s) are causing the crash.
(It's unlikely that the postmaster log doesn't provide more
information than you've shared here.)

> I can share postgresql.conf, process crash core dumps for analysis

Core dumps are unlikely to help anyone else; they are too
machine-specific.  Not to mention that they might contain
sensitive data.  You'll need to examine them yourself.

                        regards, tom lane

Re: BUG #16833: postgresql 13.1 process crash every hour

Andres Freund
In reply to this post by PG Bug reporting form
Hi,

On 2021-01-22 05:37:56 +0000, PG Bug reporting form wrote:
> Postgresql run inside of  official docker container
> https://hub.docker.com/_/postgres
> psql (PostgreSQL) 13.1 (Debian 13.1-1.pgdg100+1)
> on Amazon Linux 2
>
> Process crash inside docker container 2-3 times per hour without any
> additional information

What kind of resource limits have you set up with docker?

Are there any kernel messages?
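
Editorial sketch: on Linux x86_64 the kernel usually logs a line such as `postgres[PID]: segfault at ADDR ip ... sp ... error N in OBJECT[...]` when a process dies with signal 11, so the host's kernel log (for example saved `dmesg` output) can be matched against the PIDs in the PostgreSQL log. The Python below is a minimal illustration, assuming that line format; `kernel_segfaults` is a hypothetical helper and the sample line, including all addresses, is made up.

```python
import re

# Kernel segfault report, e.g.
# "postgres[20071]: segfault at 0 ip ... sp ... error 4 in postgres[...]"
SEGV_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+?)\[)?"
)


def kernel_segfaults(lines):
    """Extract process name, PID, fault address and faulting object per line."""
    hits = []
    for line in lines:
        m = SEGV_RE.search(line)
        if m:
            hits.append({
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
                "fault_addr": int(m.group("addr"), 16),
                "object": m.group("obj"),  # None if the kernel did not report it
            })
    return hits


# Made-up kernel log line; the addresses are placeholders.
sample_dmesg = [
    "[123456.789012] postgres[20071]: segfault at 0 ip 000055d5c8a1b2c3 "
    "sp 00007ffd3c1a2b40 error 4 in postgres[55d5c8600000+7a1000]",
    "[123457.000000] some unrelated kernel message",
]
hits = kernel_segfaults(sample_dmesg)
print(hits)
```

The faulting object name can hint at whether the crash happened in the `postgres` binary itself or in a shared library, which complements a stack trace but does not replace it.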

Greetings,

Andres Freund