Is there a known bug with SKIP LOCKED and "tuple to be locked was already moved to another partition due to concurrent update"?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Is there a known bug with SKIP LOCKED and "tuple to be locked was already moved to another partition due to concurrent update"?

Gunther Schadow
Hi all,

long time ago I devised with your help a task queuing system which uses
SELECT ... FOR UPDATE SKIP LOCKED for many parallel workers to find
tasks in the queue, and it used a partitioned table where the hot part
of the queue is short and so the query for a job is quick and the skip
locked locking makes sure that one job is only assigned to one worker.
And this works pretty well for me, except that when we run many workers
we find a lot of these failures occurring:

"tuple to be locked was already moved to another partition due to
concurrent update"

This would not exactly look like a bug, because the message says "to be
locked", so at least it's not allowing two workers to lock the same
tuple. But it seems that the skip-locked mode should not make an error
out of this, but treat it as the tuple was already locked. Why would it
want to lock the tuple (representing the job) if another worker has
already finished his UPDATE of the job to mark it as "done" (which is
what makes the tuple move to the "completed" partition.)

Either the SELECT for jobs to do returned a wrong tuple, which was
already update, or there is some lapse in the locking.

Either way it would seem to be a waste of time throwing all these errors
when the tuple should not even have been selected for update and locking.

I wonder if anybody knows anything about that issue? Of course you'll
want to see the DDL and SQL queries, etc. but you can't really try it
out unless you do some massively parallel magic. So I figured I just ask.

regards,
-Gunther



Reply | Threaded
Open this post in threaded view
|

Re: Is there a known bug with SKIP LOCKED and "tuple to be locked was already moved to another partition due to concurrent update"?

Jim Jarvie

Hi Gunther & List,

I think I have an extremely similar issue and things point in the same direction of a potential issue for skip locked on partitioned tables.

Background is I had a queue table on v9.6 with fairly high volume (>50M items, growth in the 1+M/daily).

Processing the queue with FOR UPDATE SKIP LOCKED was reliable but traffic volumes on v9.6 and the fact v12 is current let to migrating to v12 and using a partitioned table.

Queue has distinct categories of items, so the table is partitioned by list on each category.  Processing in 1 category results in it being updated to the next logical category (i.e. it will migrate partition once it is processed).

Within each category, there can be 10'sM rows, so the list partition is hash partitioned as well.  I don't think this is the issue but is mentioned for completeness.

Now, when processing the queue, there are regular transaction aborts with "tuple to be locked was already moved to another partition due to concurrent update".

From everything I can trace, it really does look like this is caused by rows which should be locked/skipped as they are processed by a different thread.

I tried switching 'for update' to 'for key share' and that created a cascade of deadlock aborts, so was worse for my situation.

For now, I roll back and repeat the select for update skip locked until it succeeds - which it eventually does.

However, it really feels like these should just have been skipped by PostgreSQL without the rollback/retry until success.

So, am I missing something/doing it wrong?  Or could there be a potential issue that needs raised?

Thanks

Jim



On 30-Jun.-2020 12:10, Gunther Schadow wrote:
Hi all,

long time ago I devised with your help a task queuing system which uses SELECT ... FOR UPDATE SKIP LOCKED for many parallel workers to find tasks in the queue, and it used a partitioned table where the hot part of the queue is short and so the query for a job is quick and the skip locked locking makes sure that one job is only assigned to one worker. And this works pretty well for me, except that when we run many workers we find a lot of these failures occurring:

"tuple to be locked was already moved to another partition due to concurrent update"

This would not exactly look like a bug, because the message says "to be locked", so at least it's not allowing two workers to lock the same tuple. But it seems that the skip-locked mode should not make an error out of this, but treat it as the tuple was already locked. Why would it want to lock the tuple (representing the job) if another worker has already finished his UPDATE of the job to mark it as "done" (which is what makes the tuple move to the "completed" partition.)

Either the SELECT for jobs to do returned a wrong tuple, which was already update, or there is some lapse in the locking.

Either way it would seem to be a waste of time throwing all these errors when the tuple should not even have been selected for update and locking.

I wonder if anybody knows anything about that issue? Of course you'll want to see the DDL and SQL queries, etc. but you can't really try it out unless you do some massively parallel magic. So I figured I just ask.

regards,
-Gunther