debugging intermittent slow updates under higher load

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

debugging intermittent slow updates under higher load

Chris Withers-2
Hi All,

This is on postgres 9.4.16, same table as the last question I asked,
here's an abbreviated desc:

# \d alerts_alert
               Table "public.alerts_alert"
      Column      |           Type           | Modifiers
-----------------+--------------------------+-----------
  tags            | jsonb                    | not null
  id              | character varying(86)    | not null
...
Indexes:
     "alerts_alert_pkey" PRIMARY KEY, btree (id)

The table has around 1.5M rows which have been updated/inserted around
121M times, the distribution of updates to row in alerts_alert will be
quite uneven, from 1 insert up to 1 insert and 0.5M updates.

Under high load (200-300 inserts/updates per second) we see occasional
(~10 per hour) updates taking excessively long times (2-10s). These
updates are always of the form:

UPDATE "alerts_alert" SET ...bunch of fields... WHERE
"alerts_alert"."id" = '...sha1 hash...';

Here's a sample explain:

https://explain.depesz.com/s/Fjq8

What could be causing this? What could we do to debug? What config
changes could we make to alleviate this?

cheers,

Chris

Reply | Threaded
Open this post in threaded view
|

Re: debugging intermittent slow updates under higher load

bashtanov

>
> The table has around 1.5M rows which have been updated/inserted around
> 121M times, the distribution of updates to row in alerts_alert will be
> quite uneven, from 1 insert up to 1 insert and 0.5M updates.
>
> Under high load (200-300 inserts/updates per second) we see occasional
> (~10 per hour) updates taking excessively long times (2-10s). These
> updates are always of the form:
>
> UPDATE "alerts_alert" SET ...bunch of fields... WHERE
> "alerts_alert"."id" = '...sha1 hash...';
>
> Here's a sample explain:
>
> https://explain.depesz.com/s/Fjq8
>
> What could be causing this? What could we do to debug? What config
> changes could we make to alleviate this?
>

Hello Chris,

One of the reasons could be the row already locked by another backend,
doing the same kind of an update or something different.
Are these updates performed in a longer transactions?
Can they hit the same row from two clients at the same time?
Is there any other write or select-for-update/share load on the table?

Have you tried periodical logging of the non-granted locks?
Try querying pg_stat_activity and pg_locks (possibly joined and maybe
repeatedly self-joined, google for it)
to get the backends that wait one for another while competing for to
lock the same row or object.

Best,
  Alex

Reply | Threaded
Open this post in threaded view
|

Re: debugging intermittent slow updates under higher load

ichbinrene
Also read about hot updates and the storage parameter named "fill_factor", so, data blocks can be recycled instead of creating new ones if the updated fields don't update also indexes.

Am Mi., 5. Dez. 2018 um 09:39 Uhr schrieb Alexey Bashtanov <[hidden email]>:

>
> The table has around 1.5M rows which have been updated/inserted around
> 121M times, the distribution of updates to row in alerts_alert will be
> quite uneven, from 1 insert up to 1 insert and 0.5M updates.
>
> Under high load (200-300 inserts/updates per second) we see occasional
> (~10 per hour) updates taking excessively long times (2-10s). These
> updates are always of the form:
>
> UPDATE "alerts_alert" SET ...bunch of fields... WHERE
> "alerts_alert"."id" = '...sha1 hash...';
>
> Here's a sample explain:
>
> https://explain.depesz.com/s/Fjq8
>
> What could be causing this? What could we do to debug? What config
> changes could we make to alleviate this?
>

Hello Chris,

One of the reasons could be the row already locked by another backend,
doing the same kind of an update or something different.
Are these updates performed in a longer transactions?
Can they hit the same row from two clients at the same time?
Is there any other write or select-for-update/share load on the table?

Have you tried periodical logging of the non-granted locks?
Try querying pg_stat_activity and pg_locks (possibly joined and maybe
repeatedly self-joined, google for it)
to get the backends that wait one for another while competing for to
lock the same row or object.

Best,
  Alex



--
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/

Reply | Threaded
Open this post in threaded view
|

Re: debugging intermittent slow updates under higher load

ichbinrene
This parameter can be updated on a "per table" basis.

Am Mi., 5. Dez. 2018 um 09:47 Uhr schrieb Rene Romero Benavides <[hidden email]>:
Also read about hot updates and the storage parameter named "fill_factor", so, data blocks can be recycled instead of creating new ones if the updated fields don't update also indexes.

Am Mi., 5. Dez. 2018 um 09:39 Uhr schrieb Alexey Bashtanov <[hidden email]>:

>
> The table has around 1.5M rows which have been updated/inserted around
> 121M times, the distribution of updates to row in alerts_alert will be
> quite uneven, from 1 insert up to 1 insert and 0.5M updates.
>
> Under high load (200-300 inserts/updates per second) we see occasional
> (~10 per hour) updates taking excessively long times (2-10s). These
> updates are always of the form:
>
> UPDATE "alerts_alert" SET ...bunch of fields... WHERE
> "alerts_alert"."id" = '...sha1 hash...';
>
> Here's a sample explain:
>
> https://explain.depesz.com/s/Fjq8
>
> What could be causing this? What could we do to debug? What config
> changes could we make to alleviate this?
>

Hello Chris,

One of the reasons could be the row already locked by another backend,
doing the same kind of an update or something different.
Are these updates performed in a longer transactions?
Can they hit the same row from two clients at the same time?
Is there any other write or select-for-update/share load on the table?

Have you tried periodical logging of the non-granted locks?
Try querying pg_stat_activity and pg_locks (possibly joined and maybe
repeatedly self-joined, google for it)
to get the backends that wait one for another while competing for to
lock the same row or object.

Best,
  Alex



--
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/



--
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/

Reply | Threaded
Open this post in threaded view
|

Re: debugging intermittent slow updates under higher load

Chris Withers-2
In reply to this post by bashtanov
On 05/12/2018 15:40, Alexey Bashtanov wrote:
>
>>
> One of the reasons could be the row already locked by another backend,
> doing the same kind of an update or something different.
> Are these updates performed in a longer transactions?

Nope, the transaction will just be updating one row at a time.

> Can they hit the same row from two clients at the same time?

I've looked for evidence of this, but can't find any. Certainly nothing
running for 2-10s, queries against this table are normally a few hundred ms.

> Is there any other write or select-for-update/share load on the table?

Not that I'm aware of. How would I go about getting metrics on problems
like these?

> Have you tried periodical logging of the non-granted locks?
> Try querying pg_stat_activity and pg_locks (possibly joined and maybe
> repeatedly self-joined, google for it)
> to get the backends that wait one for another while competing for to
> lock the same row or object.

Is there any existing tooling that does this? I'm loath to start hacking
something up when I'd hope others have done a better job already...

Chris

Reply | Threaded
Open this post in threaded view
|

Re: debugging intermittent slow updates under higher load

Chris Withers-2
In reply to this post by ichbinrene
On 05/12/2018 15:47, Rene Romero Benavides wrote:
> Also read about hot updates and the storage parameter named
> "fill_factor", so, data blocks can be recycled instead of creating new
> ones if the updated fields don't update also indexes.

I have read about these, but I'd prefer not to be making
opportunistic/guessing changes on this.

How can I collect metrics/logging/etc evidence to confirm what the
problem actually is?

cheers,

Chris

Reply | Threaded
Open this post in threaded view
|

Re: debugging intermittent slow updates under higher load

bashtanov
In reply to this post by Chris Withers-2

> Is there any existing tooling that does this?
There must be some, google for queries involving pg_locks

> I'm loath to start hacking something up when I'd hope others have done
> a better job already...
If you log all queries that take more than a second to complete, is your
update the only one logged, or something (the would-be blocker) gets
logged down together with it?

Reply | Threaded
Open this post in threaded view
|

Re: debugging intermittent slow updates under higher load

Chris Withers-2
On 06/12/2018 11:00, Alexey Bashtanov wrote:
>
>> I'm loath to start hacking something up when I'd hope others have done
>> a better job already...
> If you log all queries that take more than a second to complete, is your
> update the only one logged, or something (the would-be blocker) gets
> logged down together with it?

Nope, only ones logged are these updates.

Chris

Reply | Threaded
Open this post in threaded view
|

Re: debugging intermittent slow updates under higher load

Pavel Stehule
Hi

čt 6. 12. 2018 v 12:18 odesílatel Chris Withers <[hidden email]> napsal:
On 06/12/2018 11:00, Alexey Bashtanov wrote:
>
>> I'm loath to start hacking something up when I'd hope others have done
>> a better job already...
> If you log all queries that take more than a second to complete, is your
> update the only one logged, or something (the would-be blocker) gets
> logged down together with it?

Nope, only ones logged are these updates.

Can you check latency on file system? Some latencies can be enforced by overloaded file system due wrong configuration of file system cache.


Regards

Pavel


Chris