Why does ExecComputeStoredGenerated() form a heap tuple

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Why does ExecComputeStoredGenerated() form a heap tuple

Andres Freund
Hi,

while rebasing the remaining tableam patches (luckily a pretty small set
now!), I had a few conflicts with ExecComputeStoredGenerated(). While
resolving I noticed:

        oldtuple = ExecFetchSlotHeapTuple(slot, true, &should_free);
        newtuple = heap_modify_tuple(oldtuple, tupdesc, values, nulls, replaces);
        ExecForceStoreHeapTuple(newtuple, slot);
        if (should_free)
                heap_freetuple(oldtuple);

        MemoryContextSwitchTo(oldContext);

First off, I'm not convinced this is correct:

ISTM you'd need at least an ExecMaterializeSlot() before the
MemoryContextSwitchTo() in ExecComputeStoredGenerated().

But what actually brought me to reply was that it seems like it'll cause
unnecessary slowdowns for !heap AMs. First, it'll form a heaptuple if
the slot isn't in that form, and then it'll cause a conversion by
storing a heap tuple even if the target doesn't use heap representation.

ISTM the above would be much more efficiently - even more efficient if
only heap is used - implemented as something roughly akin to:

    slot_getallattrs(slot);
    memcpy(values, slot->tts_values, ...);
    memcpy(nulls, slot->tts_isnull, ...);

    for (int i = 0; i < natts; i++)
    {
        if (TupleDescAttr(tupdesc, i)->attgenerated == ATTRIBUTE_GENERATED_STORED)
        {
            values[i] = ...
        }
        else
            values[i] = datumCopy(...);
    }

    ExecClearTuple(slot);
    memcpy(slot->tts_values, values, ...);
    memcpy(slot->tts_isnull, nulls, ...);
    ExecStoreVirtualTuple(slot);
    ExecMaterializeSlot(slot);

that's not perfect, but more efficient than your version...

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: Why does ExecComputeStoredGenerated() form a heap tuple

Andres Freund
Hi,

On 2019-03-30 19:57:44 -0700, Andres Freund wrote:

> while rebasing the remaining tableam patches (luckily a pretty small set
> now!), I had a few conflicts with ExecComputeStoredGenerated(). While
> resolving I noticed:
>
> oldtuple = ExecFetchSlotHeapTuple(slot, true, &should_free);
> newtuple = heap_modify_tuple(oldtuple, tupdesc, values, nulls, replaces);
> ExecForceStoreHeapTuple(newtuple, slot);
> if (should_free)
> heap_freetuple(oldtuple);
>
> MemoryContextSwitchTo(oldContext);
>
> First off, I'm not convinced this is correct:
>
> ISTM you'd need at least an ExecMaterializeSlot() before the
> MemoryContextSwitchTo() in ExecComputeStoredGenerated().
>
> But what actually brought me to reply was that it seems like it'll cause
> unnecessary slowdowns for !heap AMs. First, it'll form a heaptuple if
> the slot isn't in that form, and then it'll cause a conversion by
> storing a heap tuple even if the target doesn't use heap representation.
>
> ISTM the above would be much more efficiently - even more efficient if
> only heap is used - implemented as something roughly akin to:
>
>     slot_getallattrs(slot);
>     memcpy(values, slot->tts_values, ...);
>     memcpy(nulls, slot->tts_isnull, ...);
>
>     for (int i = 0; i < natts; i++)
>     {
>         if (TupleDescAttr(tupdesc, i)->attgenerated == ATTRIBUTE_GENERATED_STORED)
>         {
>             values[i] = ...
>         }
>         else
>             values[i] = datumCopy(...);
>     }
>
>     ExecClearTuple(slot);
>     memcpy(slot->tts_values, values, ...);
>     memcpy(slot->tts_isnull, nulls, ...);
>     ExecStoreVirtualTuple(slot);
>     ExecMaterializeSlot(slot);
>
> that's not perfect, but more efficient than your version...

Also, have you actually benchmarked this code? ISTM that adding a
stored generated column would cause quite noticable slowdowns in the
COPY path based on this code.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: Why does ExecComputeStoredGenerated() form a heap tuple

Peter Eisentraut-6
In reply to this post by Andres Freund
On 2019-03-31 04:57, Andres Freund wrote:
> while rebasing the remaining tableam patches (luckily a pretty small set
> now!), I had a few conflicts with ExecComputeStoredGenerated(). While
> resolving I noticed:

The core of that code was written a long time ago and perhaps hasn't
caught up with all the refactoring going on.  I'll look through your
proposal and update the code.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Why does ExecComputeStoredGenerated() form a heap tuple

Peter Eisentraut-6
In reply to this post by Andres Freund
On 2019-03-31 05:00, Andres Freund wrote:
> Also, have you actually benchmarked this code? ISTM that adding a
> stored generated column would cause quite noticable slowdowns in the
> COPY path based on this code.

Yes, it'll be slower than not having it, but it's much faster than the
equivalent trigger.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Why does ExecComputeStoredGenerated() form a heap tuple

Andres Freund
Hi,

On 2019-04-01 11:25:46 +0200, Peter Eisentraut wrote:
> On 2019-03-31 05:00, Andres Freund wrote:
> > Also, have you actually benchmarked this code? ISTM that adding a
> > stored generated column would cause quite noticable slowdowns in the
> > COPY path based on this code.
>
> Yes, it'll be slower than not having it, but it's much faster than the
> equivalent trigger.

It at the moment is quite noticably slower than directly inserting the
generated column.

postgres[11993][1]=# CREATE TABLE foo_without_generated(id int, copy_of_int int);
CREATE TABLE
Time: 0.625 ms
postgres[11993][1]=# CREATE TABLE foo_with_generated(id int, copy_of_int int generated always as (id) stored);
CREATE TABLE
Time: 0.771 ms
postgres[11993][1]=# INSERT INTO foo_without_generated SELECT g.i, g.i FROM generate_series(1, 1000000) g(i);
INSERT 0 1000000
Time: 691.533 ms
postgres[11993][1]=# INSERT INTO foo_with_generated SELECT g.i FROM generate_series(1, 1000000) g(i);
INSERT 0 1000000
Time: 825.471 ms
postgres[11993][1]=# COPY foo_without_generated TO '/tmp/foo_without_generated';
COPY 1000000
Time: 194.051 ms
postgres[11993][1]=# COPY foo_with_generated TO '/tmp/foo_with_generated';
COPY 1000000
Time: 153.146 ms
postgres[11993][1]=# ;TRUNCATE foo_without_generated ;COPY foo_without_generated FROM '/tmp/foo_without_generated';
Time: 0.178 ms
TRUNCATE TABLE
Time: 8.456 ms
COPY 1000000
Time: 394.990 ms
postgres[11993][1]=# ;TRUNCATE foo_with_generated ;COPY foo_with_generated FROM '/tmp/foo_with_generated';
Time: 0.147 ms
TRUNCATE TABLE
Time: 8.043 ms
COPY 1000000
Time: 508.918 ms

From a quick profile that's indeed largely because
ExecComputeStoredGenerated() is really inefficient - and it seems
largely unnecessarily so.  I think this should at least be roughly as
efficient as getting the additional data from the client.


Minor other point: I'm not a fan of defining more general infrastructure
like ExecComputedStoredGenerated() in nodeModifyTable.c - it's already
large and confusing, and it's not obvious that e.g. COPY would call into
it.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: Why does ExecComputeStoredGenerated() form a heap tuple

Peter Eisentraut-6
In reply to this post by Peter Eisentraut-6
On 2019-04-01 11:23, Peter Eisentraut wrote:
> On 2019-03-31 04:57, Andres Freund wrote:
>> while rebasing the remaining tableam patches (luckily a pretty small set
>> now!), I had a few conflicts with ExecComputeStoredGenerated(). While
>> resolving I noticed:
>
> The core of that code was written a long time ago and perhaps hasn't
> caught up with all the refactoring going on.  I'll look through your
> proposal and update the code.

The attached patch is based on your sketch.  It's clearly better in the
long term not to rely on heap tuples here.  But in testing this change
seems to make it slightly slower, certainly not a speedup as you were
apparently hoping for.


Test setup:

create table t0 (a int, b int);
insert into t0 select generate_series (1, 10000000);  -- 10 million
\copy t0 (a) to 'test.dat';

-- for comparison, without generated column
truncate t0;
\copy t0 (a) from 'test.dat';

-- master
create table t1 (a int, b int generated always as (a * 2) stored);
truncate t1;
\copy t1 (a) from 'test.dat';

-- patched
truncate t1;
\copy t1 (a) from 'test.dat';

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

0001-Don-t-form-heap-tuple-in-ExecComputeStoredGenerated.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Why does ExecComputeStoredGenerated() form a heap tuple

David Rowley-3
On Tue, 23 Apr 2019 at 20:23, Peter Eisentraut
<[hidden email]> wrote:

> The attached patch is based on your sketch.  It's clearly better in the
> long term not to rely on heap tuples here.  But in testing this change
> seems to make it slightly slower, certainly not a speedup as you were
> apparently hoping for.
>
>
> Test setup:
>
> create table t0 (a int, b int);
> insert into t0 select generate_series (1, 10000000);  -- 10 million
> \copy t0 (a) to 'test.dat';
>
> -- for comparison, without generated column
> truncate t0;
> \copy t0 (a) from 'test.dat';
>
> -- master
> create table t1 (a int, b int generated always as (a * 2) stored);
> truncate t1;
> \copy t1 (a) from 'test.dat';
>
> -- patched
> truncate t1;
> \copy t1 (a) from 'test.dat';

I didn't do the exact same test, but if I use COPY instead of \copy,
then for me patched is faster.

Normal table:


postgres=# copy t0 (a) from '/home/drowley/test.dat';
COPY 10000000
Time: 5437.768 ms (00:05.438)
postgres=# truncate t0;
TRUNCATE TABLE
Time: 20.775 ms
postgres=# copy t0 (a) from '/home/drowley/test.dat';
COPY 10000000
Time: 5272.228 ms (00:05.272)

Master:

postgres=# copy t1 (a) from '/home/drowley/test.dat';
COPY 10000000
Time: 6570.031 ms (00:06.570)
postgres=# truncate t1;
TRUNCATE TABLE
Time: 17.813 ms
postgres=# copy t1 (a) from '/home/drowley/test.dat';
COPY 10000000
Time: 6486.253 ms (00:06.486)

Patched:

postgres=# copy t1 (a) from '/home/drowley/test.dat';
COPY 10000000
Time: 5359.338 ms (00:05.359)
postgres=# truncate table t1;
TRUNCATE TABLE
Time: 25.551 ms
postgres=# copy t1 (a) from '/home/drowley/test.dat';
COPY 10000000
Time: 5347.596 ms (00:05.348)


For the patch, I wonder if you need this line:

+ memcpy(values, slot->tts_values, sizeof(*values) * natts);

If you got rid of that and changed the datumCopy to use
slot->tts_values[i] instead.

Maybe it's also worth getting rid of the first memcpy for the null
array and just assign the element in the else clause.

It might also be cleaner to assign TupleDescAttr(tupdesc, i) to a
variable instead of using the macro 3 times. It'd make that datumCopy
line shorter too.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Why does ExecComputeStoredGenerated() form a heap tuple

Peter Eisentraut-6
On 2019-04-24 00:26, David Rowley wrote:
> I didn't do the exact same test, but if I use COPY instead of \copy,
> then for me patched is faster.

OK, confirmed that way, too.

> For the patch, I wonder if you need this line:
>
> + memcpy(values, slot->tts_values, sizeof(*values) * natts);
>
> If you got rid of that and changed the datumCopy to use
> slot->tts_values[i] instead.

done

> Maybe it's also worth getting rid of the first memcpy for the null
> array and just assign the element in the else clause.

Tried that, seems to be slower.  So I left it as is.

> It might also be cleaner to assign TupleDescAttr(tupdesc, i) to a
> variable instead of using the macro 3 times. It'd make that datumCopy
> line shorter too.

Also done.

Updated patch attached.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v2-0001-Convert-ExecComputeStoredGenerated-to-use-tuple-s.patch (7K) Download Attachment