pgsql: Fix freezing of a dead HOT-updated tuple

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
53 messages Options
123
Reply | Threaded
Open this post in threaded view
|

pgsql: Fix freezing of a dead HOT-updated tuple

Álvaro Herrera
Fix freezing of a dead HOT-updated tuple

Vacuum calls page-level HOT prune to remove dead HOT tuples before doing
liveness checks (HeapTupleSatisfiesVacuum) on the remaining tuples.  But
concurrent transaction commit/abort may turn DEAD some of the HOT tuples
that survived the prune, before HeapTupleSatisfiesVacuum tests them.
This happens to activate the code that decides to freeze the tuple ...
which resuscitates it, duplicating data.

(This is especially bad if there's any unique constraints, because those
are now internally violated due to the duplicate entries, though you
won't know until you try to REINDEX or dump/restore the table.)

One possible fix would be to simply skip doing anything to the tuple,
and hope that the next HOT prune would remove it.  But there is a
problem: if the tuple is older than freeze horizon, this would leave an
unfrozen XID behind, and if no HOT prune happens to clean it up before
the containing pg_clog segment is truncated away, it'd later cause an
error when the XID is looked up.

Fix the problem by having the tuple freezing routines cope with the
situation: don't freeze the tuple (and keep it dead).  In the cases that
the XID is older than the freeze age, set the HEAP_XMAX_COMMITTED flag
so that there is no need to look up the XID in pg_clog later on.

An isolation test is included, authored by Michael Paquier, loosely
based on Daniel Wood's original reproducer.  It only tests one
particular scenario, though, not all the possible ways for this problem
to surface; it be good to have a more reliable way to test this more
fully, but it'd require more work.
In message https://postgr.es/m/20170911140103.5akxptyrwgpc25bw@...
I outlined another test case (more closely matching Dan Wood's) that
exposed a few more ways for the problem to occur.

Backpatch all the way back to 9.3, where this problem was introduced by
multixact juggling.  In branches 9.3 and 9.4, this includes a backpatch
of commit e5ff9fefcd50 (of 9.5 era), since the original is not
correctable without matching the coding pattern in 9.5 up.

Reported-by: Daniel Wood
Diagnosed-by: Daniel Wood
Reviewed-by: Yi Wen Wong, Michaël Paquier
Discussion: https://postgr.es/m/E5711E62-8FDF-4DCA-A888-C200BF6B5742@...

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/20b655224249e6d2daf7ef0595995228baddb381

Modified Files
--------------
src/backend/access/heap/heapam.c                |  57 +++++++++----
src/backend/commands/vacuumlazy.c               |  20 ++---
src/test/isolation/expected/freeze-the-dead.out | 101 ++++++++++++++++++++++++
src/test/isolation/isolation_schedule           |   1 +
src/test/isolation/specs/freeze-the-dead.spec   |  27 +++++++
5 files changed, 179 insertions(+), 27 deletions(-)


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Fix freezing of a dead HOT-updated tuple

Peter Geoghegan-4
On Thu, Sep 28, 2017 at 7:47 AM, Alvaro Herrera <[hidden email]> wrote:
> Fix freezing of a dead HOT-updated tuple

If I run Dan Wood's test case again, the obvious symptom (spurious
duplicates) goes away. However, the enhanced amcheck, and thus CREATE
INDEX/REINDEX, still isn't happy about this:

postgres=# select bt_index_check('t_pkey', true);
DEBUG:  00000: verifying presence of required tuples in index "t_pkey"
LOCATION:  bt_check_every_level, verify_nbtree.c:424
ERROR:  XX000: failed to find parent tuple for heap-only tuple at
(0,6) in table "t"
LOCATION:  IndexBuildHeapRangeScan, index.c:2597
Time: 3.699 ms

--
Peter Geoghegan


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Fix freezing of a dead HOT-updated tuple

Andres Freund
In reply to this post by Álvaro Herrera
Hi,

On 2017-09-28 14:47:53 +0000, Alvaro Herrera wrote:

> Fix freezing of a dead HOT-updated tuple
>
> Vacuum calls page-level HOT prune to remove dead HOT tuples before doing
> liveness checks (HeapTupleSatisfiesVacuum) on the remaining tuples.  But
> concurrent transaction commit/abort may turn DEAD some of the HOT tuples
> that survived the prune, before HeapTupleSatisfiesVacuum tests them.
> This happens to activate the code that decides to freeze the tuple ...
> which resuscitates it, duplicating data.
>
> (This is especially bad if there's any unique constraints, because those
> are now internally violated due to the duplicate entries, though you
> won't know until you try to REINDEX or dump/restore the table.)
>
> One possible fix would be to simply skip doing anything to the tuple,
> and hope that the next HOT prune would remove it.  But there is a
> problem: if the tuple is older than freeze horizon, this would leave an
> unfrozen XID behind, and if no HOT prune happens to clean it up before
> the containing pg_clog segment is truncated away, it'd later cause an
> error when the XID is looked up.
>
> Fix the problem by having the tuple freezing routines cope with the
> situation: don't freeze the tuple (and keep it dead).  In the cases that
> the XID is older than the freeze age, set the HEAP_XMAX_COMMITTED flag
> so that there is no need to look up the XID in pg_clog later on.

I think this is the wrong fix - the assumption that ctid chains can be
validated based on the prev-xmax = cur-xmin is fairly ingrained into the
system, and we shouldn't just be breaking it. The need to later
lobotomize the checks, in a5736bf754, is some evidence of that.

I spent some time discussing this with Robert today (with both of us
alternating between feeling the other and ourselves as stupid), and the
conclusion I think is that the problem is on the pruning, rather than
the freezing side.

FWIW, I don't think the explanation in the commit message of how the
problem triggers is actually correct - the testcase you added doesn't
have any transactions concurrently committing / aborting when
crashing. Rather the problem is that the liveliness checks for freezing
is different from the ones for pruning. HTSV considers xmin
RECENTLY_DEAD when there's a multi xmax with at least one alive locker,
whereas pruning thinks it has to do something because there's the member
xid below the cutoff. No concurrent activity is needed to trigger that.

I think the problem is on the pruning, rather than the freezing side. We
can't freeze a tuple if it has an alive predecessor - rather than
weakining this, we should be fixing the pruning to not have the alive
predecessor.

The relevant logic in HTSV is:

        if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
        {
                TransactionId xmax;

                if (MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), false))
                {
                        /* already checked above */
                        Assert(!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask));

                        xmax = HeapTupleGetUpdateXid(tuple);

                        /* not LOCKED_ONLY, so it has to have an xmax */
                        Assert(TransactionIdIsValid(xmax));

                        if (TransactionIdIsInProgress(xmax))
                                return HEAPTUPLE_DELETE_IN_PROGRESS;
                        else if (TransactionIdDidCommit(xmax))
                                /* there are still lockers around -- can't return DEAD here */
                                return HEAPTUPLE_RECENTLY_DEAD;
                        /* updating transaction aborted */
                        return HEAPTUPLE_LIVE;

with the problematic branch being the TransactionIdDidCommit()
case. Rather than unconditionally returning HEAPTUPLE_RECENTLY_DEAD, we
should test the update xid against OldestXmin and return DEAD /
RECENTLY_DEAD according to that.

If the update xmin is actually below the cutoff we can remove the tuple
even if there's live lockers - the lockers will also be present in the
newer version of the tuple.  I verified that for me that fixes the
problem. Obviously that'd require some comment work and more careful
diagnosis.


I think a5736bf754c82d8b86674e199e232096c679201d might be dangerous in
the face of previously corrupted tuple chains and pg_upgraded clusters -
it can lead to tuples being considered related, even though they they're
from entirely independent hot chains. Especially when upgrading 9.3 post
your fix, to current releases.


In short, I think the two commits should be reverted, and replaced with
a fix along what I'm outlining above.

There'll be some trouble for people that upgraded to an unreleased
version, but I don't really see what we could do about that.

I could be entirely wrong - I've been travelling for the last two weeks
and my brain is somewhat more fried than usual.

Regards,

Andres


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Fix freezing of a dead HOT-updated tuple

Robert Haas
On Thu, Nov 2, 2017 at 4:50 PM, Andres Freund <[hidden email]> wrote:
> I think a5736bf754c82d8b86674e199e232096c679201d might be dangerous in
> the face of previously corrupted tuple chains and pg_upgraded clusters -
> it can lead to tuples being considered related, even though they they're
> from entirely independent hot chains. Especially when upgrading 9.3 post
> your fix, to current releases.

I think this is a key point.  If the new behavior were merely not
entirely correct, we could perhaps refine it later.  But it's not only
not correct - it actually has the potential to create new problems
that didn't exist before those commits.  And if we release without
reverting those commits then we can't change our mind later.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Fix freezing of a dead HOT-updated tuple

Álvaro Herrera
In reply to this post by Andres Freund
Andres Freund wrote:

> I spent some time discussing this with Robert today (with both of us
> alternating between feeling the other and ourselves as stupid), and the
> conclusion I think is that the problem is on the pruning, rather than
> the freezing side.

Thanks both for spending some more time on this.

> I think the problem is on the pruning, rather than the freezing side. We
> can't freeze a tuple if it has an alive predecessor - rather than
> weakining this, we should be fixing the pruning to not have the alive
> predecessor.

I gave a look at HTSV back then, but I didn't find what the right tweak
was, but then I only tried changing the return value to DEAD and
DELETE_IN_PROGRESS; the thought of selecting DEAD or RECENTLY_DEAD based
on OldestXmin didn't occur to me ... I was thinking that the fact that
there were live lockers meant that the tuple could not be removed,
obviously failing to notice that the subsequent versions of the tuple
would be good enough.

> If the update xmin is actually below the cutoff we can remove the tuple
> even if there's live lockers - the lockers will also be present in the
> newer version of the tuple.  I verified that for me that fixes the
> problem. Obviously that'd require some comment work and more careful
> diagnosis.

Sounds good.

I'll revert those commits then, keeping the test, and then you can
commit your change.  OK?

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Andres Freund
Hi,

On 2017-11-02 13:49:47 +0100, Alvaro Herrera wrote:

> Andres Freund wrote:
> > I think the problem is on the pruning, rather than the freezing side. We
> > can't freeze a tuple if it has an alive predecessor - rather than
> > weakining this, we should be fixing the pruning to not have the alive
> > predecessor.
>
> I gave a look at HTSV back then, but I didn't find what the right tweak
> was, but then I only tried changing the return value to DEAD and
> DELETE_IN_PROGRESS; the thought of selecting DEAD or RECENTLY_DEAD based
> on OldestXmin didn't occur to me ... I was thinking that the fact that
> there were live lockers meant that the tuple could not be removed,
> obviously failing to notice that the subsequent versions of the tuple
> would be good enough.

I'll try to write up a commit based on that idea. I think there's some
comment work needed too, Robert and I were both confused by a few
things.
I'm unfortunately travelling atm - it's evening here, and I'll flying
back to the US all Saturday. I'm fairly sure I'll be able to come up
with a decent patch tomorrow, but I'll need review and testing by
others.


> > If the update xmin is actually below the cutoff we can remove the tuple
> > even if there's live lockers - the lockers will also be present in the
> > newer version of the tuple.  I verified that for me that fixes the
> > problem. Obviously that'd require some comment work and more careful
> > diagnosis.
>
> Sounds good.
>
> I'll revert those commits then, keeping the test, and then you can
> commit your change.  OK?

Generally that sounds good - but you can't keep the testcase in without
the new fix - the buildfarm would immediately turn red. I guess the best
thing would be to temporarily remove it from the schedule?


Do we care about people upgrading to unreleased versions? We could do
nothing, document it in the release notes, or ???


Greetings,

Andres Freund


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Tom Lane-2
Andres Freund <[hidden email]> writes:
> Do we care about people upgrading to unreleased versions? We could do
> nothing, document it in the release notes, or ???

Do nothing.

                        regards, tom lane


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Stephen Frost
* Tom Lane ([hidden email]) wrote:
> Andres Freund <[hidden email]> writes:
> > Do we care about people upgrading to unreleased versions? We could do
> > nothing, document it in the release notes, or ???
>
> Do nothing.

Agreed.  Not much we can do there.

Thanks!

Stephen

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Álvaro Herrera
Stephen Frost wrote:
> * Tom Lane ([hidden email]) wrote:
> > Andres Freund <[hidden email]> writes:
> > > Do we care about people upgrading to unreleased versions? We could do
> > > nothing, document it in the release notes, or ???
> >
> > Do nothing.
>
> Agreed.  Not much we can do there.

Pushed the reverts.

I noticed while doing so that REL_10_STABLE contains the bogus commits.  
Does that change our opinion regarding what to do for people upgrading
to a version containing the broken commits?  I don't think so, because

  1) we hope that not many people will trust their data to 10.0
     immediately after release
  2) the bug is very low probability
  3) it doesn't look like we can do a lot about it anyway.

I'll experiment with Andres' proposed fix now.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Robert Haas
On Thu, Nov 2, 2017 at 8:25 PM, Alvaro Herrera <[hidden email]> wrote:

> Pushed the reverts.
>
> I noticed while doing so that REL_10_STABLE contains the bogus commits.
> Does that change our opinion regarding what to do for people upgrading
> to a version containing the broken commits?  I don't think so, because
>
>   1) we hope that not many people will trust their data to 10.0
>      immediately after release
>   2) the bug is very low probability
>   3) it doesn't look like we can do a lot about it anyway.

Just to be clear, it looks like "Fix freezing of a dead HOT-updated
tuple" (46c35116ae1acc8826705ef2a7b5d9110f9d6e84) went in before 10.0
was stamped, but "Fix traversal of half-frozen update chains"
(22576734b805fb0952f9be841ca8f643694ee868) went in afterwards and is
therefore unreleased at present.

Users of 10.0 who hit the code introduced by
46c35116ae1acc8826705ef2a7b5d9110f9d6e84 will have XIDs stored in the
xmax fields of tuples that predate relfrozenxid.  Those tuples will be
hinted-committed.  That's not good, but it might not really have much
in the way of consequences.  *IF* the next VACUUM doesn't get confused
by the old XID, then it will prune the tuple then and I think we'll be
OK.  And I think it won't, because it should just call
HeapTupleSatisfiesVacuum() and that should see that
HEAP_XMAX_COMMITTED is set and not actually try to consult the old
CLOG.  If that hint bit can ever get lost - or fail to propagate to a
standby - then we have more trouble, but the fact that it's set by a
logged operation makes me hope that can't happen. Furthermore, that
follow-on VACUUM should indeed arrive in due time, because we will not
have marked the page all-visible -- HeapTupleSatisfiesVacuum() will
NOT have returned HEAPTUPLE_LIVE when called from lazy_scan_heap(),
and therefore we will have set all_visible = false.

The second commit (22576734b805fb0952f9be841ca8f643694ee868) is where
I think things get a lot more dangerous.  The problem (as Andres
pointed out to me this afternoon) is that it seems possible to end up
with a situation where there should be two HOT chains on a page, and
because of the weakened xmin/xmax checking rules, we end up thinking
that the second one is a continuation of the first one, which will be
all kinds of bad news.  That would be particularly likely to happen in
releases from before we invented HEAP_XMIN_FROZEN, when there's no
xmin/xmax matching at all, but could happen on later releases if we
are extraordinarily unlucky (i.e. xmin of the first tuple in the
second chain happens to be the same as the pre-freeze xmax in the old
chain, probably because the same XID was used to update the page in
two consecutive epochs).  Fortunately, that commit is (I think) not
released anywhere.

Personally, I think it would be best to push the release out a week.
I think we understand this well enough now that we can fix it
relatively easily, but haste makes bugs, and (I know you're all tired
of hearing me say this) patches that implicate the on-disk format are
scary.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Fix freezing of a dead HOT-updated tuple

Peter Geoghegan-4
In reply to this post by Andres Freund
On Thu, Nov 2, 2017 at 4:20 AM, Andres Freund <[hidden email]> wrote:
> I think the problem is on the pruning, rather than the freezing side. We
> can't freeze a tuple if it has an alive predecessor - rather than
> weakining this, we should be fixing the pruning to not have the alive
> predecessor.

Excellent catch.

> If the update xmin is actually below the cutoff we can remove the tuple
> even if there's live lockers - the lockers will also be present in the
> newer version of the tuple.  I verified that for me that fixes the
> problem. Obviously that'd require some comment work and more careful
> diagnosis.

I didn't even know that that was safe.

> I think a5736bf754c82d8b86674e199e232096c679201d might be dangerous in
> the face of previously corrupted tuple chains and pg_upgraded clusters -
> it can lead to tuples being considered related, even though they they're
> from entirely independent hot chains. Especially when upgrading 9.3 post
> your fix, to current releases.

Frankly, I'm relieved that you got to this. I was highly suspicious of
a5736bf754c82d8b86674e199e232096c679201d, even beyond my specific,
actionable concern about how it failed to handle the
9.3/FrozenTransactionId xmin case as special. As I went into in the
"heap/SLRU verification, relfrozenxid cut-off, and freeze-the-dead
bug" thread, these commits left us with a situation where there didn't
seem to be a reliable way of knowing whether or not it is safe to
interrogate clog for a given heap tuple using a tool like amcheck.
And, it wasn't obvious that you couldn't have a codepath that failed
to account for pre-cutoff non-frozen tuples -- codepaths that call
TransactionIdDidCommit() despite it actually being unsafe.

If I'm not mistaken, your proposed fix restores sanity there.

--
Peter Geoghegan


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Peter Geoghegan-4
In reply to this post by Robert Haas
On Thu, Nov 2, 2017 at 9:44 AM, Robert Haas <[hidden email]> wrote:

> The second commit (22576734b805fb0952f9be841ca8f643694ee868) is where
> I think things get a lot more dangerous.  The problem (as Andres
> pointed out to me this afternoon) is that it seems possible to end up
> with a situation where there should be two HOT chains on a page, and
> because of the weakened xmin/xmax checking rules, we end up thinking
> that the second one is a continuation of the first one, which will be
> all kinds of bad news.  That would be particularly likely to happen in
> releases from before we invented HEAP_XMIN_FROZEN, when there's no
> xmin/xmax matching at all, but could happen on later releases if we
> are extraordinarily unlucky (i.e. xmin of the first tuple in the
> second chain happens to be the same as the pre-freeze xmax in the old
> chain, probably because the same XID was used to update the page in
> two consecutive epochs).  Fortunately, that commit is (I think) not
> released anywhere.

FWIW, if you look at the second commit
(22576734b805fb0952f9be841ca8f643694ee868) carefully, you'll realize
that it doesn't even treat those two cases differently. It was buggy
even on its own terms. The FrozenTransactionId test used an xmin from
HeapTupleHeaderGetXmin(), not HeapTupleHeaderGetRawXmin().

--
Peter Geoghegan


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Tom Lane-2
In reply to this post by Robert Haas
Robert Haas <[hidden email]> writes:
> Just to be clear, it looks like "Fix freezing of a dead HOT-updated
> tuple" (46c35116ae1acc8826705ef2a7b5d9110f9d6e84) went in before 10.0
> was stamped, but "Fix traversal of half-frozen update chains"
> (22576734b805fb0952f9be841ca8f643694ee868) went in afterwards and is
> therefore unreleased at present.

Thanks for doing this analysis of the actual effects in 10.0.

> Personally, I think it would be best to push the release out a week.

I would only be in favor of that if there were some reason to think that
the bug is worse now than it's been in the four years since 9.3 was
released.  Otherwise, we should ship the bug fixes we have on-schedule.
I think it's a very very safe bet that there are other data-loss-causing
bugs in there, so I see no good reason for panicking over this one.

> ... and (I know you're all tired of hearing me say this) patches that
> implicate the on-disk format are scary.

Agreed, but how is that relevant now that the bogus patches are reverted?

                        regards, tom lane


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Robert Haas
In reply to this post by Peter Geoghegan-4
On Thu, Nov 2, 2017 at 10:26 PM, Peter Geoghegan <[hidden email]> wrote:

> On Thu, Nov 2, 2017 at 9:44 AM, Robert Haas <[hidden email]> wrote:
>> The second commit (22576734b805fb0952f9be841ca8f643694ee868) is where
>> I think things get a lot more dangerous.  The problem (as Andres
>> pointed out to me this afternoon) is that it seems possible to end up
>> with a situation where there should be two HOT chains on a page, and
>> because of the weakened xmin/xmax checking rules, we end up thinking
>> that the second one is a continuation of the first one, which will be
>> all kinds of bad news.  That would be particularly likely to happen in
>> releases from before we invented HEAP_XMIN_FROZEN, when there's no
>> xmin/xmax matching at all, but could happen on later releases if we
>> are extraordinarily unlucky (i.e. xmin of the first tuple in the
>> second chain happens to be the same as the pre-freeze xmax in the old
>> chain, probably because the same XID was used to update the page in
>> two consecutive epochs).  Fortunately, that commit is (I think) not
>> released anywhere.
>
> FWIW, if you look at the second commit
> (22576734b805fb0952f9be841ca8f643694ee868) carefully, you'll realize
> that it doesn't even treat those two cases differently. It was buggy
> even on its own terms. The FrozenTransactionId test used an xmin from
> HeapTupleHeaderGetXmin(), not HeapTupleHeaderGetRawXmin().

Oh, wow.  You seem to be correct.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Robert Haas
In reply to this post by Tom Lane-2
On Thu, Nov 2, 2017 at 10:38 PM, Tom Lane <[hidden email]> wrote:
>> Personally, I think it would be best to push the release out a week.
>
> I would only be in favor of that if there were some reason to think that
> the bug is worse now than it's been in the four years since 9.3 was
> released.  Otherwise, we should ship the bug fixes we have on-schedule.
> I think it's a very very safe bet that there are other data-loss-causing
> bugs in there, so I see no good reason for panicking over this one.

Well, my thought was that delaying this release for a week would be
better than either (a) doing an extra minor release just to get this
fix out or (b) waiting another three months to release this fix.  The
former seems like fairly unnecessary work, and the latter doesn't seem
particularly responsible.  Users can't reasonably expect us to fix
data-loss-causing bugs that we don't know about yet, but they can
reasonably expect us to issue fixes promptly for ones that we do know
about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Tom Lane-2
Robert Haas <[hidden email]> writes:
> Well, my thought was that delaying this release for a week would be
> better than either (a) doing an extra minor release just to get this
> fix out or (b) waiting another three months to release this fix.  The
> former seems like fairly unnecessary work, and the latter doesn't seem
> particularly responsible.  Users can't reasonably expect us to fix
> data-loss-causing bugs that we don't know about yet, but they can
> reasonably expect us to issue fixes promptly for ones that we do know
> about.

Our experience with "hold the release waiting for a fix for bug X"
decisions has been consistently bad.  Furthermore, if we can't produce
a patch we trust by Monday, I would much rather that we not do it
in a rushed fashion at all.  I think it would be entirely reasonable
to consider making off-cadence releases in, perhaps, a month, once
the dust is entirely settled.

                        regards, tom lane


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Andres Freund
In reply to this post by Andres Freund
On 2017-11-02 06:05:51 -0700, Andres Freund wrote:

> Hi,
>
> On 2017-11-02 13:49:47 +0100, Alvaro Herrera wrote:
> > Andres Freund wrote:
> > > I think the problem is on the pruning, rather than the freezing side. We
> > > can't freeze a tuple if it has an alive predecessor - rather than
> > > weakining this, we should be fixing the pruning to not have the alive
> > > predecessor.
> >
> > I gave a look at HTSV back then, but I didn't find what the right tweak
> > was, but then I only tried changing the return value to DEAD and
> > DELETE_IN_PROGRESS; the thought of selecting DEAD or RECENTLY_DEAD based
> > on OldestXmin didn't occur to me ... I was thinking that the fact that
> > there were live lockers meant that the tuple could not be removed,
> > obviously failing to notice that the subsequent versions of the tuple
> > would be good enough.
>
> I'll try to write up a commit based on that idea. I think there's some
> comment work needed too, Robert and I were both confused by a few
> things.
> I'm unfortunately travelling atm - it's evening here, and I'll flying
> back to the US all Saturday. I'm fairly sure I'll be able to come up
> with a decent patch tomorrow, but I'll need review and testing by
> others.
Here's that patch.  I've stared at this some, and Robert did too. Robert
mentioned that the commit message might need some polish and I'm not
100% sure about the error message texts yet.

I'm not yet convinced that the new elog in vacuumlazy can never trigger
- but I also don't think we want to actually freeze the tuple in that
case.

Staring at the vacuumlazy hunk I think I might have found a related bug:
heap_update_tuple() just copies the old xmax to the new tuple's xmax if
a multixact and still running.  It does so without verifying liveliness
of members.  Isn't that buggy? Consider what happens if we have three
blocks: 1 has free space, two is being vacuumed and is locked, three is
full and has a tuple that's key share locked by a live tuple and is
updated by a dead xmax from before the xmin horizon. In that case afaict
the multi will be copied from the third page to the first one.  Which is
quite bad, because vacuum already processed it, and we'll set
relfrozenxid accordingly.  I hope I'm missing something here?

Greetings,

Andres Freund


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

0001-Fix-pruning-of-locked-and-updated-tuples.patch (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Peter Geoghegan-4
Andres Freund <[hidden email]> wrote:
>Here's that patch.  I've stared at this some, and Robert did too. Robert
>mentioned that the commit message might need some polish and I'm not
>100% sure about the error message texts yet.

The commit message should probably say that the bug involves the
resurrection of previously dead tuples, which is different to there
being duplicates because a constraint is not enforced because HOT chains
are broken (that's a separate, arguably less serious problem).

>Staring at the vacuumlazy hunk I think I might have found a related bug:
>heap_update_tuple() just copies the old xmax to the new tuple's xmax if
>a multixact and still running.  It does so without verifying liveliness
>of members.  Isn't that buggy? Consider what happens if we have three
>blocks: 1 has free space, two is being vacuumed and is locked, three is
>full and has a tuple that's key share locked by a live tuple and is
>updated by a dead xmax from before the xmin horizon. In that case afaict
>the multi will be copied from the third page to the first one.  Which is
>quite bad, because vacuum already processed it, and we'll set
>relfrozenxid accordingly.  I hope I'm missing something here?

Can you be more specific about what you mean here? I think that I
understand where you're going with this, but I'm not sure.

--
Peter Geoghegan


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Álvaro Herrera
Peter Geoghegan wrote:
> Andres Freund <[hidden email]> wrote:

> > Staring at the vacuumlazy hunk I think I might have found a related bug:
> > heap_update_tuple() just copies the old xmax to the new tuple's xmax if
> > a multixact and still running.  It does so without verifying liveliness
> > of members.  Isn't that buggy? Consider what happens if we have three
> > blocks: 1 has free space, two is being vacuumed and is locked, three is
> > full and has a tuple that's key share locked by a live tuple and is
> > updated by a dead xmax from before the xmin horizon. In that case afaict
> > the multi will be copied from the third page to the first one.  Which is
> > quite bad, because vacuum already processed it, and we'll set
> > relfrozenxid accordingly.  I hope I'm missing something here?
>
> Can you be more specific about what you mean here? I think that I
> understand where you're going with this, but I'm not sure.

He means that the tuple that heap_update moves to page 1 (which will no
longer be processed by vacuum) will contain a multixact that's older
than relminmxid -- because it is copied unchanged by heap_update instead
of properly checking against age limit.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Reply | Threaded
Open this post in threaded view
|

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Peter Geoghegan-4
Alvaro Herrera <[hidden email]> wrote:
>He means that the tuple that heap_update moves to page 1 (which will no
>longer be processed by vacuum) will contain a multixact that's older
>than relminmxid -- because it is copied unchanged by heap_update instead
>of properly checking against age limit.

I see. The problem is more or less with this heap_update() code:

    /*
     * And also prepare an Xmax value for the new copy of the tuple.  If there
     * was no xmax previously, or there was one but all lockers are now gone,
     * then use InvalidXid; otherwise, get the xmax from the old tuple.  (In
     * rare cases that might also be InvalidXid and yet not have the
     * HEAP_XMAX_INVALID bit set; that's fine.)
     */
    if ((oldtup.t_data->t_infomask & HEAP_XMAX_INVALID) ||
        HEAP_LOCKED_UPGRADED(oldtup.t_data->t_infomask) ||
        (checked_lockers && !locker_remains))
        xmax_new_tuple = InvalidTransactionId;
    else
        xmax_new_tuple = HeapTupleHeaderGetRawXmax(oldtup.t_data);

My naive guess is that we have to create a new MultiXactId here in at
least some cases, just like FreezeMultiXactId() sometimes does.

--
Peter Geoghegan


--
Sent via pgsql-committers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
123
Previous Thread Next Thread