COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
46 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Ibrar Ahmed-4


On Tue, Mar 24, 2020 at 10:06 PM Ibrar Ahmed <[hidden email]> wrote:


On Fri, Mar 13, 2020 at 6:58 AM Justin Pryzby <[hidden email]> wrote:

Thanks for picking up this patch.  There's a minor typo:

+                        * readable outside of this sessoin. Therefore doing IO here isn't

=> session

--
Justin

Thanks, please see the updated and rebased patch. (master 17a28b03645e27d73bf69a95d7569b61e58f06eb)

--
Ibrar Ahmed

Andres while fixing the one FIXME in the patch

"            visibilitymap_pin(relation, BufferGetBlockNumber(buffer), &vmbuffer);


            /*

             * FIXME: setting recptr here is a dirty dirty hack, to prevent

             * visibilitymap_set() from WAL logging.

             *
"
I am not able to see any scenario where recptr is not set before reaching to that statement. Can you clarify why you think recptr will not be set at that statement?


--
Ibrar Ahmed
Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Daniel Gustafsson
This patch incurs a compiler warning, which probably is quite simple to fix:

heapam.c: In function ‘heap_multi_insert’:
heapam.c:2349:4: error: ‘recptr’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
    visibilitymap_set(relation, BufferGetBlockNumber(buffer), buffer,
    ^
heapam.c:2136:14: note: ‘recptr’ was declared here
   XLogRecPtr recptr;
              ^

Please fix and submit a new version, I'm marking the entry Waiting on Author in
the meantime.

cheers ./daniel

Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Anastasia Lubennikova
On 01.07.2020 12:38, Daniel Gustafsson wrote:

> This patch incurs a compiler warning, which probably is quite simple to fix:
>
> heapam.c: In function ‘heap_multi_insert’:
> heapam.c:2349:4: error: ‘recptr’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>      visibilitymap_set(relation, BufferGetBlockNumber(buffer), buffer,
>      ^
> heapam.c:2136:14: note: ‘recptr’ was declared here
>     XLogRecPtr recptr;
>                ^
>
> Please fix and submit a new version, I'm marking the entry Waiting on Author in
> the meantime.
>
> cheers ./daniel
>
This patch looks very useful to me, so I want to pick it up.


The patch that fixes the compiler warning is in the attachment. Though,
I'm not
entirely satisfied with this fix. Also, the patch contains some FIXME
comments.
I'll test it more and send fixes this week.

Questions from the first review pass:

1) Do we need XLH_INSERT_ALL_VISIBLE_SET ? IIUC, in the patch it is always
implied by XLH_INSERT_ALL_FROZEN_SET.

2) What does this comment mean? Does XXX refers to the lsn comparison?
Since it
is definitely necessary to update the VM.

+             * XXX: This seems entirely unnecessary?
+             *
+             * FIXME: Theoretically we should only do this after we've
+             * modified the heap - but it's safe to do it here I think,
+             * because this means that the page previously was empty.
+             */
+            if (lsn > PageGetLSN(vmpage))
+                visibilitymap_set(reln, blkno, InvalidBuffer, lsn,
vmbuffer,
+                                  InvalidTransactionId, vmbits);

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0005-copy-freeze-should-actually-freeze-right.patch (13K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Robert Haas
On Tue, Jul 14, 2020 at 1:51 PM Anastasia Lubennikova
<[hidden email]> wrote:
> Questions from the first review pass:
>
> 1) Do we need XLH_INSERT_ALL_VISIBLE_SET ? IIUC, in the patch it is always
> implied by XLH_INSERT_ALL_FROZEN_SET.

I agree that it looks unnecessary to have two separate bits.

> 2) What does this comment mean? Does XXX refers to the lsn comparison?
> Since it
> is definitely necessary to update the VM.
>
> +             * XXX: This seems entirely unnecessary?
> +             *
> +             * FIXME: Theoretically we should only do this after we've
> +             * modified the heap - but it's safe to do it here I think,
> +             * because this means that the page previously was empty.
> +             */
> +            if (lsn > PageGetLSN(vmpage))
> +                visibilitymap_set(reln, blkno, InvalidBuffer, lsn,
> vmbuffer,
> +                                  InvalidTransactionId, vmbits);

I wondered about that too. The comment which precedes it was, I
believe, originally written by me, and copied here from
heap_xlog_visible(). But it's not clear very good practice to just
copy the comment like this. If the same logic applies, the code should
say that we're doing the same thing here as in heap_xlog_visible() for
consistency, or some such thing; after all, that's the primary place
where that happens. But it looks like the XXX might have been added by
a second person who thought that we didn't need this logic at all, and
the FIXME by a third person who thought it was in the wrong place, so
the whole thing is really confusing at this point.

I'm pretty worried about this, too:

+             * FIXME: setting recptr here is a dirty dirty hack, to prevent
+             * visibilitymap_set() from WAL logging.

That is indeed a dirty hack, and something needs to be done about it.

I wonder if it was really all that smart to try to make the
HEAP2_MULTI_INSERT do this instead of just issuing separate WAL
records to mark it all-visible afterwards, but I don't see any reason
why this can't be made to work. It needs substantially more polishing
than it's had, though, I think.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Anastasia Lubennikova
On 31.07.2020 23:28, Robert Haas wrote:

> On Tue, Jul 14, 2020 at 1:51 PM Anastasia Lubennikova
> <[hidden email]> wrote:
>> Questions from the first review pass:
>>
>> 1) Do we need XLH_INSERT_ALL_VISIBLE_SET ? IIUC, in the patch it is always
>> implied by XLH_INSERT_ALL_FROZEN_SET.
> I agree that it looks unnecessary to have two separate bits.
>
>> 2) What does this comment mean? Does XXX refers to the lsn comparison?
>> Since it
>> is definitely necessary to update the VM.
>>
>> +             * XXX: This seems entirely unnecessary?
>> +             *
>> +             * FIXME: Theoretically we should only do this after we've
>> +             * modified the heap - but it's safe to do it here I think,
>> +             * because this means that the page previously was empty.
>> +             */
>> +            if (lsn > PageGetLSN(vmpage))
>> +                visibilitymap_set(reln, blkno, InvalidBuffer, lsn,
>> vmbuffer,
>> +                                  InvalidTransactionId, vmbits);
> I wondered about that too. The comment which precedes it was, I
> believe, originally written by me, and copied here from
> heap_xlog_visible(). But it's not clear very good practice to just
> copy the comment like this. If the same logic applies, the code should
> say that we're doing the same thing here as in heap_xlog_visible() for
> consistency, or some such thing; after all, that's the primary place
> where that happens. But it looks like the XXX might have been added by
> a second person who thought that we didn't need this logic at all, and
> the FIXME by a third person who thought it was in the wrong place, so
> the whole thing is really confusing at this point.
>
> I'm pretty worried about this, too:
>
> +             * FIXME: setting recptr here is a dirty dirty hack, to prevent
> +             * visibilitymap_set() from WAL logging.
>
> That is indeed a dirty hack, and something needs to be done about it.
>
> I wonder if it was really all that smart to try to make the
> HEAP2_MULTI_INSERT do this instead of just issuing separate WAL
> records to mark it all-visible afterwards, but I don't see any reason
> why this can't be made to work. It needs substantially more polishing
> than it's had, though, I think.
>
New version of the patch is in the attachment.

New design is more conservative and simpler:
- pin the visibility map page in advance;
- set PageAllVisible;
- call visibilitymap_set() with its own XlogRecord, just like in other
places.

It allows to remove most of the "hacks" and keep code clean.
The patch passes tests added in previous versions.

I haven't tested performance yet, though. Maybe after tests, I'll bring
some optimizations back.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


copy-freeze-vm_freeze_v1.patch (12K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Ibrar Ahmed-4


On Mon, Aug 3, 2020 at 2:29 PM Anastasia Lubennikova <[hidden email]> wrote:
On 31.07.2020 23:28, Robert Haas wrote:
> On Tue, Jul 14, 2020 at 1:51 PM Anastasia Lubennikova
> <[hidden email]> wrote:
>> Questions from the first review pass:
>>
>> 1) Do we need XLH_INSERT_ALL_VISIBLE_SET ? IIUC, in the patch it is always
>> implied by XLH_INSERT_ALL_FROZEN_SET.
> I agree that it looks unnecessary to have two separate bits.
>
>> 2) What does this comment mean? Does XXX refers to the lsn comparison?
>> Since it
>> is definitely necessary to update the VM.
>>
>> +             * XXX: This seems entirely unnecessary?
>> +             *
>> +             * FIXME: Theoretically we should only do this after we've
>> +             * modified the heap - but it's safe to do it here I think,
>> +             * because this means that the page previously was empty.
>> +             */
>> +            if (lsn > PageGetLSN(vmpage))
>> +                visibilitymap_set(reln, blkno, InvalidBuffer, lsn,
>> vmbuffer,
>> +                                  InvalidTransactionId, vmbits);
> I wondered about that too. The comment which precedes it was, I
> believe, originally written by me, and copied here from
> heap_xlog_visible(). But it's not clear very good practice to just
> copy the comment like this. If the same logic applies, the code should
> say that we're doing the same thing here as in heap_xlog_visible() for
> consistency, or some such thing; after all, that's the primary place
> where that happens. But it looks like the XXX might have been added by
> a second person who thought that we didn't need this logic at all, and
> the FIXME by a third person who thought it was in the wrong place, so
> the whole thing is really confusing at this point.
>
> I'm pretty worried about this, too:
>
> +             * FIXME: setting recptr here is a dirty dirty hack, to prevent
> +             * visibilitymap_set() from WAL logging.
>
> That is indeed a dirty hack, and something needs to be done about it.
>
> I wonder if it was really all that smart to try to make the
> HEAP2_MULTI_INSERT do this instead of just issuing separate WAL
> records to mark it all-visible afterwards, but I don't see any reason
> why this can't be made to work. It needs substantially more polishing
> than it's had, though, I think.
>
New version of the patch is in the attachment.

New design is more conservative and simpler:
- pin the visibility map page in advance;
- set PageAllVisible;
- call visibilitymap_set() with its own XlogRecord, just like in other
places.

It allows to remove most of the "hacks" and keep code clean.
The patch passes tests added in previous versions.

I haven't tested performance yet, though. Maybe after tests, I'll bring
some optimizations back.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Here are some performance results with a patched and unpatched master branch. 
The table used for the test contains three columns (integer, text, varchar). 
The total number of rows is 10000000 in total. 

Unpatched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
COPY: 9069.432 ms vacuum; 2567.961ms
COPY: 9004.533 ms vacuum: 2553.075ms
COPY: 8832.422 ms vacuum: 2540.742ms

Patched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
COPY: 10031.723 ms vacuum: 127.524 ms
COPY: 9985.109  ms vacuum: 39.953 ms
COPY: 9283.373  ms vacuum: 37.137 ms

Time to take the copy slightly increased but the vacuum time significantly decrease.  

--
Ibrar Ahmed
123