OOM Killer kills PostgreSQL

OOM Killer kills PostgreSQL

piotrwlodarczyk89
Hi folks,

We experienced an unexpected PostgreSQL shutdown. After a little investigation, we discovered that the problem is the OOM killer, which kills our PostgreSQL. Unfortunately, we can't find the query on the DB causing this problem. The log is below:

May 05 09:05:33 HOST kernel: postgres invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=-1000
May 05 09:05:34 HOST kernel: postgres cpuset=/ mems_allowed=0
May 05 09:05:34 HOST kernel: CPU: 0 PID: 28286 Comm: postgres Not tainted 3.10.0-1127.el7.x86_64 #1
May 05 09:05:34 HOST kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
May 05 09:05:34 HOST kernel: Call Trace:
May 05 09:05:34 HOST kernel:  [<ffffffffa097ff85>] dump_stack+0x19/0x1b
May 05 09:05:34 HOST kernel:  [<ffffffffa097a8a3>] dump_header+0x90/0x229
May 05 09:05:34 HOST kernel:  [<ffffffffa050da5b>] ? cred_has_capability+0x6b/0x120
May 05 09:05:34 HOST kernel:  [<ffffffffa03c246e>] oom_kill_process+0x25e/0x3f0
May 05 09:05:35 HOST kernel:  [<ffffffffa0333a41>] ? cpuset_mems_allowed_intersects+0x21/0x30
May 05 09:05:40 HOST kernel:  [<ffffffffa03c1ecd>] ? oom_unkillable_task+0xcd/0x120
May 05 09:05:42 HOST kernel:  [<ffffffffa03c1f76>] ? find_lock_task_mm+0x56/0xc0
May 05 09:05:42 HOST kernel:  [<ffffffffa03c2cc6>] out_of_memory+0x4b6/0x4f0
May 05 09:05:42 HOST kernel:  [<ffffffffa097b3c0>] __alloc_pages_slowpath+0x5db/0x729
May 05 09:05:42 HOST kernel:  [<ffffffffa03c9146>] __alloc_pages_nodemask+0x436/0x450
May 05 09:05:42 HOST kernel:  [<ffffffffa0418e18>] alloc_pages_current+0x98/0x110
May 05 09:05:42 HOST kernel:  [<ffffffffa03be377>] __page_cache_alloc+0x97/0xb0
May 05 09:05:42 HOST kernel:  [<ffffffffa03c0f30>] filemap_fault+0x270/0x420
May 05 09:05:42 HOST kernel:  [<ffffffffc03c07d6>] ext4_filemap_fault+0x36/0x50 [ext4]
May 05 09:05:42 HOST kernel:  [<ffffffffa03edeea>] __do_fault.isra.61+0x8a/0x100
May 05 09:05:42 HOST kernel:  [<ffffffffa03ee49c>] do_read_fault.isra.63+0x4c/0x1b0
May 05 09:05:42 HOST kernel:  [<ffffffffa03f5d00>] handle_mm_fault+0xa20/0xfb0
May 05 09:05:42 HOST kernel:  [<ffffffffa098d653>] __do_page_fault+0x213/0x500
May 05 09:05:42 HOST kernel:  [<ffffffffa098da26>] trace_do_page_fault+0x56/0x150
May 05 09:05:42 HOST kernel:  [<ffffffffa098cfa2>] do_async_page_fault+0x22/0xf0
May 05 09:05:42 HOST kernel:  [<ffffffffa09897a8>] async_page_fault+0x28/0x30
May 05 09:05:42 HOST kernel: Mem-Info:
May 05 09:05:42 HOST kernel: active_anon:5382083 inactive_anon:514069 isolated_anon:0
 active_file:653 inactive_file:412 isolated_file:75
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:120624 slab_unreclaimable:14538
 mapped:814755 shmem:816586 pagetables:60496 bounce:0
 free:30218 free_pcp:562 free_cma:0

Can you tell me how to find the problematic query? Or how to tune the configuration so the database stays alive and lets us find the problematic query?

-- 

Best regards,
Piotr Włodarczyk

Re: OOM Killer kills PostgreSQL

Laurenz Albe
On Wed, 2020-05-20 at 09:30 +0200, Piotr Włodarczyk wrote:
> We experienced an unexpected PostgreSQL shutdown. After a little investigation,
> we discovered that the problem is the OOM killer, which kills our PostgreSQL.
> Unfortunately, we can't find the query on the DB causing this problem.

Is there nothing in the PostgreSQL log?

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com




Re: OOM Killer kills PostgreSQL

piotrwlodarczyk89
Nothing special. I'll check it again after the next crash.

On Wed, May 20, 2020 at 10:22 AM Laurenz Albe <[hidden email]> wrote:
> Is there nothing in the PostgreSQL log?



--

Best regards,
Piotr Włodarczyk

Re: OOM Killer kills PostgreSQL

Fabio Pardi
In reply to this post by piotrwlodarczyk89
Maybe your memory budget does not fit in the RAM on the machine?

The problem is not the query you are looking for, but the settings you are using for Postgres.
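
As a rough worst-case sketch (the parameter names are real GUCs, the numbers here are purely illustrative), the budget is roughly:

    shared_buffers + max_connections * work_mem (+ maintenance_work_mem)

For example, shared_buffers = 8GB, max_connections = 400 and work_mem = 64MB already budgets 8GB + 400 * 64MB, about 33GB, and a single query can use work_mem once per sort or hash node, so the real peak can be higher. If that sum exceeds physical RAM, an OOM kill is only a matter of time.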

regards,

fabio pardi



On 20/05/2020 09:30, Piotr Włodarczyk wrote:
> We experienced an unexpected PostgreSQL shutdown. After a little investigation,
> we discovered that the problem is the OOM killer, which kills our PostgreSQL.
> Unfortunately, we can't find the query on the DB causing this problem.
> [kernel OOM log trimmed]
> Can you tell me how to find the problematic query? Or how to tune the
> configuration so the database stays alive and lets us find the problematic query?


Re: OOM Killer kills PostgreSQL

Justin Pryzby
In reply to this post by piotrwlodarczyk89
What Postgres version? What environment (RAM) and config?
https://wiki.postgresql.org/wiki/Server_Configuration

I think you can probably find more info in dmesg/syslog; probably a
line saying "OOM killed ..." showing which PID and its vsz.
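
For example (standard Linux tools; the grep patterns are guesses, since
the exact wording varies by kernel version):

dmesg -T | grep -iE 'out of memory|oom-killer|killed process'
journalctl -k | grep -i 'killed process'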

Are you able to see some particular process continuously growing (like
in top or ps)?
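
For example, something like this (a sketch; adjust the interval and
column list to taste):

watch -n 5 'ps -o pid,vsz,rss,args -C postgres --sort=-rss | head -n 20'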

Do you have full query logs enabled to help determine which pid/query
was involved?
log_statement=all log_min_messages=info log_checkpoints=on
log_lock_waits=on log_temp_files=0
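
If you prefer not to edit postgresql.conf by hand, something like this
should work (assuming PostgreSQL 9.4+ for ALTER SYSTEM; run as a
superuser, and note that log_statement=all can be very verbose):

psql -c "ALTER SYSTEM SET log_statement = 'all';"
psql -c "ALTER SYSTEM SET log_min_messages = 'info';"
psql -c "ALTER SYSTEM SET log_checkpoints = on;"
psql -c "ALTER SYSTEM SET log_lock_waits = on;"
psql -c "ALTER SYSTEM SET log_temp_files = 0;"
psql -c "SELECT pg_reload_conf();"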



On Wed, May 20, 2020 at 2:31 AM Piotr Włodarczyk
<[hidden email]> wrote:
> We experienced an unexpected PostgreSQL shutdown. After a little investigation,
> we discovered that the problem is the OOM killer, which kills our PostgreSQL.
> Unfortunately, we can't find the query on the DB causing this problem.
> [kernel OOM log trimmed]



Re: OOM Killer kills PostgreSQL

Stephen Frost
In reply to this post by piotrwlodarczyk89
Greetings,

* Piotr Włodarczyk ([hidden email]) wrote:
> We experienced an unexpected PostgreSQL shutdown. After a little investigation,
> we discovered that the problem is the OOM killer, which kills our PostgreSQL.

You need to configure your system to not overcommit.

Read up on overcommit_ratio and overcommit_memory Linux settings.
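
For example (the sysctls are real; the ratio of 80 is illustrative and
should be sized from your RAM and swap, since with overcommit_memory=2
the commit limit is swap + RAM * overcommit_ratio/100):

sysctl -w vm.overcommit_memory=2
sysctl -w vm.overcommit_ratio=80

# To persist across reboots, add to /etc/sysctl.conf:
#   vm.overcommit_memory = 2
#   vm.overcommit_ratio = 80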

Thanks,

Stephen
