COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
46 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Alvaro Herrera-9
On 2019-Apr-04, Andres Freund wrote:

> I'm totally not OK with this from a layering
> POV. CheckAndSetAllVisibleBulkInsertState is entirely heap specific
> (without being named such), whereas all the heap specific bits are
> getting moved below tableam.

This is a fair complaint, but on the other hand the COPY changes for
table AM are still being developed, so there's no ground on which to
rebase this patch.  Do you have a timeline on getting the COPY one
committed?

I think it's fair to ask the RMT for an exceptional extension of a
couple of working days for this patch.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Andres Freund
Hi,

On 2019-04-04 16:15:54 -0300, Alvaro Herrera wrote:

> On 2019-Apr-04, Andres Freund wrote:
>
> > I'm totally not OK with this from a layering
> > POV. CheckAndSetAllVisibleBulkInsertState is entirely heap specific
> > (without being named such), whereas all the heap specific bits are
> > getting moved below tableam.
>
> This is a fair complaint, but on the other hand the COPY changes for
> table AM are still being developed, so there's no ground on which to
> rebase this patch.  Do you have a timeline on getting the COPY one
> committed?

~2h. Just pondering the naming of some functions etc.  Don't think
there's a large interdependency though.

But even if tableam weren't committed, I'd still argue that it's
structurally done wrong in the patch right now.  FWIW, I actually think
this whole approach isn't quite right - this shouldn't be done as a
secondary action after we'd already inserted, with a separate
lock-unlock cycle etc.

Also, how is this code even close to correct?
CheckAndSetPageAllVisible() modifies the buffer in a crucial way, and
there's no WAL logging? Without even a comment arguing why that's OK (I
don't think it is)?

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Andres Freund
Hi,

On 2019-04-04 12:23:08 -0700, Andres Freund wrote:
> Also, how is this code even close to correct?
> CheckAndSetPageAllVisible() modifies the buffer in a crucial way, and
> there's no WAL logging? Without even a comment arguing why that's OK (I
> don't think it is)?

Peter Geoghegan just reminded me over IM that there's actually logging
inside log_heap_visible(), called from visibilitymap_set(). Still lacks
a critical section though.

I still think this is the wrong architecture.

Greetings,

Andres Freund

PS: We're going to have to revamp visibilitymap_set() soon-ish - the
fact that it directly calls heap routines inside is bad, means that
additional AMs e.g. zheap has to reimplement that routine.


Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Alvaro Herrera-9
On 2019-Apr-04, Andres Freund wrote:

> On 2019-04-04 12:23:08 -0700, Andres Freund wrote:
> > Also, how is this code even close to correct?
> > CheckAndSetPageAllVisible() modifies the buffer in a crucial way, and
> > there's no WAL logging? Without even a comment arguing why that's OK (I
> > don't think it is)?
>
> Peter Geoghegan just reminded me over IM that there's actually logging
> inside log_heap_visible(), called from visibilitymap_set(). Still lacks
> a critical section though.

Hmm, isn't there already a critical section in visibilitymap_set itself?

> I still think this is the wrong architecture.

Hmm.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Andres Freund
Hi,

On 2019-04-05 00:06:04 -0300, Alvaro Herrera wrote:

> On 2019-Apr-04, Andres Freund wrote:
>
> > On 2019-04-04 12:23:08 -0700, Andres Freund wrote:
> > > Also, how is this code even close to correct?
> > > CheckAndSetPageAllVisible() modifies the buffer in a crucial way, and
> > > there's no WAL logging? Without even a comment arguing why that's OK (I
> > > don't think it is)?
> >
> > Peter Geoghegan just reminded me over IM that there's actually logging
> > inside log_heap_visible(), called from visibilitymap_set(). Still lacks
> > a critical section though.
>
> Hmm, isn't there already a critical section in visibilitymap_set itself?

There is, but the proposed code sets all visible on the page, and marks
the buffer dirty, before calling visibilitymap_set.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Andres Freund
In reply to this post by Alvaro Herrera-9
Hi,

On 2019-04-05 00:06:04 -0300, Alvaro Herrera wrote:
> On 2019-Apr-04, Andres Freund wrote:
> > I still think this is the wrong architecture.
>
> Hmm.

I think the right approach would be to do all of this in heap_insert and
heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
is specified, remember whether it is either currently empty, or is
already marked as all-visible. If previously empty, mark it as all
visible at the end. If already all visible, there's no need to change
that. If not yet all-visible, no need to do anything, since it can't
have been inserted with COPY FREEZE.  Do you see any problem doing it
that way?

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Pavan Deolasee
Hi,

On Fri, Apr 5, 2019 at 9:05 AM Andres Freund <[hidden email]> wrote:


I think the right approach would be to do all of this in heap_insert and
heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
is specified, remember whether it is either currently empty, or is
already marked as all-visible. If previously empty, mark it as all
visible at the end. If already all visible, there's no need to change
that. If not yet all-visible, no need to do anything, since it can't
have been inserted with COPY FREEZE. 

We're doing roughly the same. If we are running INSERT_FROZEN, whenever we're about to switch to a new page, we check if the previous page should be marked all-frozen and do it that way. The special code in copy.c is necessary to take care of the last page which we don't get to handle in the regular code path.

Or are you suggesting that we don't even rescan the page for all-frozen tuples at the end and just simply mark it all-frozen at the start, when the first tuple is inserted and then don't touch the PD_ALL_VISIBLE/visibility map bit as we go on inserting more tuples in the page?

Anyways, if major architectural changes are required then it's probably too late to consider this for PG12, even though it's more of a bug fix and a candidate for back-patching too.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Pavan Deolasee
In reply to this post by Andres Freund


On Fri, Apr 5, 2019 at 8:37 AM Andres Freund <[hidden email]> wrote:
Hi,

On 2019-04-05 00:06:04 -0300, Alvaro Herrera wrote:

>
> Hmm, isn't there already a critical section in visibilitymap_set itself?

There is, but the proposed code sets all visible on the page, and marks
the buffer dirty, before calling visibilitymap_set.

How's it any different than what we're doing at vacuumlazy.c:1322? We set the page-level bit, mark the buffer dirty and then call visibilitymap_set(), all outside a critical section.

1300         /* mark page all-visible, if appropriate */
1301         if (all_visible && !all_visible_according_to_vm)
1302         {
1303             uint8       flags = VISIBILITYMAP_ALL_VISIBLE;
1304 
1305             if (all_frozen)
1306                 flags |= VISIBILITYMAP_ALL_FROZEN;
1307 
1308             /*
1309              * It should never be the case that the visibility map page is set
1310              * while the page-level bit is clear, but the reverse is allowed
1311              * (if checksums are not enabled).  Regardless, set the both bits
1312              * so that we get back in sync.
1313              *
1314              * NB: If the heap page is all-visible but the VM bit is not set,
1315              * we don't need to dirty the heap page.  However, if checksums
1316              * are enabled, we do need to make sure that the heap page is
1317              * dirtied before passing it to visibilitymap_set(), because it
1318              * may be logged.  Given that this situation should only happen in
1319              * rare cases after a crash, it is not worth optimizing.
1320              */
1321             PageSetAllVisible(page);
1322             MarkBufferDirty(buf);
1323             visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr,
1324                               vmbuffer, visibility_cutoff_xid, flags);
1325         } 

As the first para in that comment says, I thought it's ok for page-level bit to be set but the visibility bit to be clear, but not the vice versa. The proposed code does not introduce any  new behaviour AFAICS. But I might be missing something.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Tom Lane-2
In reply to this post by Andres Freund
Andres Freund <[hidden email]> writes:
> I think the right approach would be to do all of this in heap_insert and
> heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
> is specified, remember whether it is either currently empty, or is
> already marked as all-visible. If previously empty, mark it as all
> visible at the end. If already all visible, there's no need to change
> that. If not yet all-visible, no need to do anything, since it can't
> have been inserted with COPY FREEZE.  Do you see any problem doing it
> that way?

Do we want to add overhead to these hot-spot routines for this purpose?

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Andres Freund
In reply to this post by Pavan Deolasee
Hi,

On 2019-04-05 09:20:36 +0530, Pavan Deolasee wrote:

> On Fri, Apr 5, 2019 at 9:05 AM Andres Freund <[hidden email]> wrote:
> > I think the right approach would be to do all of this in heap_insert and
> > heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
> > is specified, remember whether it is either currently empty, or is
> > already marked as all-visible. If previously empty, mark it as all
> > visible at the end. If already all visible, there's no need to change
> > that. If not yet all-visible, no need to do anything, since it can't
> > have been inserted with COPY FREEZE.
>
>
> We're doing roughly the same. If we are running INSERT_FROZEN, whenever
> we're about to switch to a new page, we check if the previous page should
> be marked all-frozen and do it that way. The special code in copy.c is
> necessary to take care of the last page which we don't get to handle in the
> regular code path.

Well, it's not the same, because you need extra code from copy.c, extra
lock cycles, and extra WAL logging.


> Or are you suggesting that we don't even rescan the page for all-frozen
> tuples at the end and just simply mark it all-frozen at the start, when the
> first tuple is inserted and then don't touch the PD_ALL_VISIBLE/visibility
> map bit as we go on inserting more tuples in the page?

Correct. If done right that should be cheaper (no extra scans, less WAL
logging), without requiring some new dispatch logic from copy.c.


> Anyways, if major architectural changes are required then it's probably too
> late to consider this for PG12, even though it's more of a bug fix and a
> candidate for back-patching too.

Let's just see how bad it looks? I don't feel like we'd need to be super
strict about it. If it looks simple enough I'd e.g. be ok to merge this
soon after freeze, and backpatch around maybe 12.1 or such.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Andres Freund
In reply to this post by Tom Lane-2
Hi,

On 2019-04-04 23:57:58 -0400, Tom Lane wrote:

> Andres Freund <[hidden email]> writes:
> > I think the right approach would be to do all of this in heap_insert and
> > heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
> > is specified, remember whether it is either currently empty, or is
> > already marked as all-visible. If previously empty, mark it as all
> > visible at the end. If already all visible, there's no need to change
> > that. If not yet all-visible, no need to do anything, since it can't
> > have been inserted with COPY FREEZE.  Do you see any problem doing it
> > that way?
>
> Do we want to add overhead to these hot-spot routines for this purpose?

For heap_multi_insert I can't see it being a problem - it's only used
from copy.c, and the cost should be "smeared" over many tuples.  I'd
assume that compared with locking a page, WAL logging, etc, it'd not
even meaningfully show up for heap_insert. Especially because we already
have codepaths for options & HEAP_INSERT_FROZEN in
heap_prepare_insert(), and I'd assume those could be combined.

I think we should measure it, but I don't think that one or two
additional, very well predictd, branches are going to be measurable in
in those routines.

The patch, as implemented, has modifications in
RelationGetBufferForTuple(), that seems like they'd be more expensive.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Darafei "Komяpa" Praliaskouski
In reply to this post by Tom Lane-2


On Fri, Apr 5, 2019 at 6:58 AM Tom Lane <[hidden email]> wrote:
Andres Freund <[hidden email]> writes:
> I think the right approach would be to do all of this in heap_insert and
> heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
> is specified, remember whether it is either currently empty, or is
> already marked as all-visible. If previously empty, mark it as all
> visible at the end. If already all visible, there's no need to change
> that. If not yet all-visible, no need to do anything, since it can't
> have been inserted with COPY FREEZE.  Do you see any problem doing it
> that way?

Do we want to add overhead to these hot-spot routines for this purpose?

Sizing the overhead: workflows right now don't end with COPY FREEZE - you need another VACUUM to set maps. 
Anything that lets you skip that VACUUM (and faster than that VACUUM itself) is helping. You specifically asked for it to be skippable with FREEZE anyway.


--
Darafei Praliaskouski
Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Andres Freund
Hi,

On 2019-04-05 08:38:34 +0300, Darafei "Komяpa" Praliaskouski wrote:

> On Fri, Apr 5, 2019 at 6:58 AM Tom Lane <[hidden email]> wrote:
>
> > Andres Freund <[hidden email]> writes:
> > > I think the right approach would be to do all of this in heap_insert and
> > > heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
> > > is specified, remember whether it is either currently empty, or is
> > > already marked as all-visible. If previously empty, mark it as all
> > > visible at the end. If already all visible, there's no need to change
> > > that. If not yet all-visible, no need to do anything, since it can't
> > > have been inserted with COPY FREEZE.  Do you see any problem doing it
> > > that way?
> >
> > Do we want to add overhead to these hot-spot routines for this purpose?
> >
>
> Sizing the overhead: workflows right now don't end with COPY FREEZE - you
> need another VACUUM to set maps.
> Anything that lets you skip that VACUUM (and faster than that VACUUM
> itself) is helping. You specifically asked for it to be skippable with
> FREEZE anyway.

Tom's point was that the routines I was suggesting to adapt above aren't
just used for COPY FREEZE.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Andres Freund
In reply to this post by Andres Freund
Hi,

On 2019-04-04 21:04:49 -0700, Andres Freund wrote:

> On 2019-04-04 23:57:58 -0400, Tom Lane wrote:
> > Andres Freund <[hidden email]> writes:
> > > I think the right approach would be to do all of this in heap_insert and
> > > heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
> > > is specified, remember whether it is either currently empty, or is
> > > already marked as all-visible. If previously empty, mark it as all
> > > visible at the end. If already all visible, there's no need to change
> > > that. If not yet all-visible, no need to do anything, since it can't
> > > have been inserted with COPY FREEZE.  Do you see any problem doing it
> > > that way?
> >
> > Do we want to add overhead to these hot-spot routines for this purpose?
>
> For heap_multi_insert I can't see it being a problem - it's only used
> from copy.c, and the cost should be "smeared" over many tuples.  I'd
> assume that compared with locking a page, WAL logging, etc, it'd not
> even meaningfully show up for heap_insert. Especially because we already
> have codepaths for options & HEAP_INSERT_FROZEN in
> heap_prepare_insert(), and I'd assume those could be combined.
>
> I think we should measure it, but I don't think that one or two
> additional, very well predictd, branches are going to be measurable in
> in those routines.
>
> The patch, as implemented, has modifications in
> RelationGetBufferForTuple(), that seems like they'd be more expensive.
Here's a *prototype* patch for this.  It only implements what I
described for heap_multi_insert, not for plain inserts. I wanted to see
what others think before investing additional time.

I don't think it's possible to see the overhead of the changed code in
heap_multi_insert(), and probably - with less confidence - that it's
also going to be ok for heap_insert(). But we gotta measure that.


This avoids an extra WAL record for setting empty pages to all visible,
by adding XLH_INSERT_ALL_VISIBLE_SET & XLH_INSERT_ALL_FROZEN_SET, and
setting those when appropriate in heap_multi_insert.  Unfortunately
currently visibilitymap_set() doesn't really properly allow to do this,
as it has embedded WAL logging for heap.

I think we should remove the WAL logging from visibilitymap_set(), and
move it to a separate, heap specific, function. Right now different
tableams e.g. would have to reimplement visibilitymap_set(), so that's a
second need to separate that functionality.  Let me try to come up with
a proposal.


The patch currently does a vmbuffer_pin() while holding an exclusive
lwlock on the page. That's something we normally try to avoid - but I
think it's probably OK here, because INSERT_FROZEN can only be used to
insert into a new relfilenode (which thus no other session can see). I
think it's preferrable to have this logic in specific to the
INSERT_FROZEN path, rather than adding nontrivial complications to
RelationGetBufferForTuple().

I noticed that, before this patch, we do a
        if (vmbuffer != InvalidBuffer)
                ReleaseBuffer(vmbuffer);
after every filled page - that doesn't strike me as particularly smart -
it's pretty likely that the next heap page to be filled is going to be
on the same vm page as the previous iteration.


I noticed one small oddity that I think is common to all the approaches
presented in this thread so far. After

BEGIN;
TRUNCATE foo;
COPY foo(i) FROM '/tmp/foo' WITH FREEZE;
COPY foo(i) FROM '/tmp/foo' WITH FREEZE;
COPY foo(i) FROM '/tmp/foo' WITH FREEZE;
COMMIT;

we currently end up with pages like:
┌───────┬───────────┬──────────┬───────┬───────┬───────┬─────────┬──────────┬─────────┬───────────┐
│ blkno │    lsn    │ checksum │ flags │ lower │ upper │ special │ pagesize │ version │ prune_xid │
├───────┼───────────┼──────────┼───────┼───────┼───────┼─────────┼──────────┼─────────┼───────────┤
│     0 │ 0/50B5488 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     1 │ 0/50B6360 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     2 │ 0/50B71B8 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     3 │ 0/50B8028 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     4 │ 0/50B8660 │        0 │     4 │   408 │  5120 │    8192 │     8192 │       4 │         0 │
│     5 │ 0/50B94B8 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     6 │ 0/50BA328 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     7 │ 0/50BB180 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     8 │ 0/50BBFD8 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     9 │ 0/50BCF88 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│    10 │ 0/50BDDE0 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│    11 │ 0/50BEC50 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│    12 │ 0/50BFAA8 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│    13 │ 0/50C06F8 │        0 │     4 │   792 │  2048 │    8192 │     8192 │       4 │         0 │
└───────┴───────────┴──────────┴───────┴───────┴───────┴─────────┴──────────┴─────────┴───────────┘
(14 rows)

Note how block 4 has more space available. That's because the
visibilitymap_pin() called in the first COPY has to vm_extend(), which
in turn does:

        /*
         * Send a shared-inval message to force other backends to close any smgr
         * references they may have for this rel, which we are about to change.
         * This is a useful optimization because it means that backends don't have
         * to keep checking for creation or extension of the file, which happens
         * infrequently.
         */
        CacheInvalidateSmgr(rel->rd_smgr->smgr_rnode);

which invalidates ->rd_smgr->smgr_targblock *after* the first COPY,
because that's when the pending smgr invalidations are sent out.  That's
far from great, but it doesn't seem to be this patch's fault.

It seems to me we need a separate invalidation that doesn't close the
whole smgr relation, but just invalidates the VM specific fields.

Greetings,

Andres Freund

0002-WIP-copy-freeze-should-actually-freeze-right.patch (9K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Andres Freund
Hi,

On 2019-04-07 18:04:27 -0700, Andres Freund wrote:
> Here's a *prototype* patch for this.  It only implements what I
> described for heap_multi_insert, not for plain inserts. I wanted to see
> what others think before investing additional time.

Pavan, are you planning to work on this for v13 CF1? Or have you lost
interest on the topic?

- Andres


Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Pavan Deolasee


On Tue, 28 May 2019 at 4:36 PM, Andres Freund <[hidden email]> wrote:
Hi,

On 2019-04-07 18:04:27 -0700, Andres Freund wrote:
> Here's a *prototype* patch for this.  It only implements what I
> described for heap_multi_insert, not for plain inserts. I wanted to see
> what others think before investing additional time.

Pavan, are you planning to work on this for v13 CF1? Or have you lost
interest on the topic?

Yes, I plan to work on it.

Thanks,
Pavan
--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Pavan Deolasee
Hi Andres,

On Wed, May 29, 2019 at 1:50 PM Pavan Deolasee <[hidden email]> wrote:


On Tue, 28 May 2019 at 4:36 PM, Andres Freund <[hidden email]> wrote:
Hi,

On 2019-04-07 18:04:27 -0700, Andres Freund wrote:
> Here's a *prototype* patch for this.  It only implements what I
> described for heap_multi_insert, not for plain inserts. I wanted to see
> what others think before investing additional time.

Pavan, are you planning to work on this for v13 CF1? Or have you lost
interest on the topic?

Yes, I plan to work on it.


I am sorry, but I am not able to find time to get back to this because of other high priority items. If it still remains unaddressed in the next few weeks, I will pick it up again. But for now, I am happy if someone wants to pick and finish the work. 

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

akapila
On Thu, Jun 27, 2019 at 11:02 AM Pavan Deolasee
<[hidden email]> wrote:

>
>>> On 2019-04-07 18:04:27 -0700, Andres Freund wrote:
>>> > Here's a *prototype* patch for this.  It only implements what I
>>> > described for heap_multi_insert, not for plain inserts. I wanted to see
>>> > what others think before investing additional time.
>>>
>>> Pavan, are you planning to work on this for v13 CF1? Or have you lost
>>> interest on the topic?
>>
>>
>> Yes, I plan to work on it.
>>
>
> I am sorry, but I am not able to find time to get back to this because of other high priority items. If it still remains unaddressed in the next few weeks, I will pick it up again. But for now, I am happy if someone wants to pick and finish the work.
>

Fair enough, I have marked the entry [1] in the coming CF as "Returned
with Feedback".  I hope that is okay with you.

[1] - https://commitfest.postgresql.org/23/2009/

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Ibrar Ahmed-4


On Sat, Jun 29, 2019 at 12:56 AM Amit Kapila <[hidden email]> wrote:
On Thu, Jun 27, 2019 at 11:02 AM Pavan Deolasee
<[hidden email]> wrote:
>
>>> On 2019-04-07 18:04:27 -0700, Andres Freund wrote:
>>> > Here's a *prototype* patch for this.  It only implements what I
>>> > described for heap_multi_insert, not for plain inserts. I wanted to see
>>> > what others think before investing additional time.
>>>
>>> Pavan, are you planning to work on this for v13 CF1? Or have you lost
>>> interest on the topic?
>>
>>
>> Yes, I plan to work on it.
>>
>
> I am sorry, but I am not able to find time to get back to this because of other high priority items. If it still remains unaddressed in the next few weeks, I will pick it up again. But for now, I am happy if someone wants to pick and finish the work.
>

Fair enough, I have marked the entry [1] in the coming CF as "Returned
with Feedback".  I hope that is okay with you.

[1] - https://commitfest.postgresql.org/23/2009/



Hi,

As Pavan mentioned in the last email that he is no longer interested in that, so I want to
take the lead and want to finish that. It is a bug and needs to be fixed.
I have rebased and the patch with the latest master and added some test
cases (borrowed from Pavan's patch), and did some performance testing with a table size of
700MB (10Millions rows)

COPY WITH FREEZE took 21406.692ms and VACUUM took 2478.666ms
COPY WITH FREEZE took 23095.985ms and VACUUM took 26.309ms

The performance decrease in copy with the patch is only 7%, but we get
quite adequate performance in VACUUM. In any case, this is a bug fix, so we can ignore
the performance hit.

There are two issues left to address.

1 - Andres: It only implements what I described for heap_multi_insert, not for plain inserts.
I wanted to see what others think before investing additional time.

In which condition we need that for plain inserts?

2 - Andres:  I think we should remove the WAL logging from visibilitymap_set(), and
move it to a separate, heap specific, function. Right now different
tableams e.g. would have to reimplement visibilitymap_set(), so that's a
second need to separate that functionality.  Let me try to come up with
a proposal.




--
Ibrar Ahmed

0003-copy-freeze-should-actually-freeze-right.patch (17K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

Ibrar Ahmed-4


On Fri, Mar 13, 2020 at 6:58 AM Justin Pryzby <[hidden email]> wrote:

Thanks for picking up this patch.  There's a minor typo:

+                        * readable outside of this sessoin. Therefore doing IO here isn't

=> session

--
Justin

Thanks, please see the updated and rebased patch. (master 17a28b03645e27d73bf69a95d7569b61e58f06eb)

--
Ibrar Ahmed

0004-copy-freeze-should-actually-freeze-right.patch (17K) Download Attachment
123