Pluggable Storage - Andres's take

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
179 messages Options
1234567 ... 9
Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Dmitry Dolgov
> On Sat, Dec 15, 2018 at 8:37 PM Andres Freund <[hidden email]> wrote:
>
> We need to dump the table access method at dump time, otherwise we loose
> that information.

Oh, right. So, something like in the attached patch?

> > As a side note, in a table description I haven't found any mention of which
> > access method is used for this table, probably it's useful to show that with \d+
> > (see the attached patch).
>
> I'm not convinced that's really worth the cost of including it in \d
> (rather than \d+ or such).

Maybe I'm missing the point, but I've meant exactly the same and the patch,
suggested in the previous email, add this info to \d+

pg_dump_access_method.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Peter Geoghegan-4
In reply to this post by Dmitry Dolgov
On Mon, Dec 10, 2018 at 8:13 AM Dmitry Dolgov <[hidden email]> wrote:
> Just out of curiosity I've also tried tpc-c from oltpbench (in the very same
> simple environment), it doesn't show any significant difference from master as
> well.

FWIW, I have found BenchmarkSQL to be significantly better than
oltpbench, having used both quite a bit now:

https://bitbucket.org/openscg/benchmarksql

For example, oltpbench requires a max_connections setting that far
exceeds the number of terminals/clients used by the benchmark, because
the number of connections used during bulk loading far exceeds what is
truly required. BenchmarkSQL also makes it easy to generate useful
html reports, complete with graphs.

--
Peter Geoghegan

Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Dmitry Dolgov
> On Sat, Dec 15, 2018 at 8:37 PM Andres Freund <[hidden email]> wrote:
>
> We need to dump the table access method at dump time, otherwise we loose
> that information.

As a result of the discussion in [1] (btw, thanks for starting it), here is
proposed solution with tracking current default_table_access_method. Next I'll
tackle similar issue for psql and probably add some tests for both patches.

[1]: https://www.postgresql.org/message-id/flat/20190107235616.6lur25ph22u5u5av%40alap3.anarazel.de

pg_dump_access_method.patch (37K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Andres Freund
Hi,

On 2019-01-12 01:35:06 +0100, Dmitry Dolgov wrote:
> > On Sat, Dec 15, 2018 at 8:37 PM Andres Freund <[hidden email]> wrote:
> >
> > We need to dump the table access method at dump time, otherwise we loose
> > that information.
>
> As a result of the discussion in [1] (btw, thanks for starting it), here is
> proposed solution with tracking current default_table_access_method. Next I'll
> tackle similar issue for psql and probably add some tests for both patches.

Thanks!

> +/*
> + * Set the proper default_table_access_method value for the table.
> + */
> +static void
> +_selectTableAccessMethod(ArchiveHandle *AH, const char *tableam)
> +{
> + PQExpBuffer cmd = createPQExpBuffer();
> + const char *want, *have;
> +
> + have = AH->currTableAm;
> + want = tableam;
> +
> + if (!want)
> + return;
> +
> + if (have && strcmp(want, have) == 0)
> + return;
> +
> +
> + appendPQExpBuffer(cmd, "SET default_table_access_method = %s;", tableam);

This needs escaping, at the very least with "", but better with proper
routines for dealing with identifiers.



> @@ -5914,7 +5922,7 @@ getTables(Archive *fout, int *numTables)
>    "tc.relfrozenxid AS tfrozenxid, "
>    "tc.relminmxid AS tminmxid, "
>    "c.relpersistence, c.relispopulated, "
> -  "c.relreplident, c.relpages, "
> +  "c.relreplident, c.relpages, am.amname AS amname, "

That AS doesn't do anything, does it?


>   /* other fields were zeroed above */
>  
> @@ -9355,7 +9370,7 @@ dumpComment(Archive *fout, const char *type, const char *name,
>   * post-data.
>   */
>   ArchiveEntry(fout, nilCatalogId, createDumpId(),
> - tag->data, namespace, NULL, owner,
> + tag->data, namespace, NULL, owner, NULL,
>   "COMMENT", SECTION_NONE,
>   query->data, "", NULL,
>   &(dumpId), 1,

We really ought to move the arguments to a struct, so we don't generate
quite as much useless diffs whenever we do a change around one of
these...

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Dmitry Dolgov
> On Sat, Jan 12, 2019 at 1:44 AM Andres Freund <[hidden email]> wrote:
>
> > +     appendPQExpBuffer(cmd, "SET default_table_access_method = %s;", tableam);
>
> This needs escaping, at the very least with "", but better with proper
> routines for dealing with identifiers.

Thanks for noticing, fixed.

> > @@ -5914,7 +5922,7 @@ getTables(Archive *fout, int *numTables)
> >                                                 "tc.relfrozenxid AS tfrozenxid, "
> >                                                 "tc.relminmxid AS tminmxid, "
> >                                                 "c.relpersistence, c.relispopulated, "
> > -                                               "c.relreplident, c.relpages, "
> > +                                               "c.relreplident, c.relpages, am.amname AS amname, "
>
> That AS doesn't do anything, does it?

Rigth, I've renamed it few times and forgot to get rid of it. Removed.

>
> >               /* other fields were zeroed above */
> >
> > @@ -9355,7 +9370,7 @@ dumpComment(Archive *fout, const char *type, const char *name,
> >                * post-data.
> >                */
> >               ArchiveEntry(fout, nilCatalogId, createDumpId(),
> > -                                      tag->data, namespace, NULL, owner,
> > +                                      tag->data, namespace, NULL, owner, NULL,
> >                                        "COMMENT", SECTION_NONE,
> >                                        query->data, "", NULL,
> >                                        &(dumpId), 1,
>
> We really ought to move the arguments to a struct, so we don't generate
> quite as much useless diffs whenever we do a change around one of
> these...
That's what I though too. Maybe then I'll suggest a mini-patch to the master to
refactor these arguments out into a separate struct, so we can leverage it here.

pg_dump_access_method_v2.patch (37K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Amit Khandekar-2
Thanks for the patch updates.

A few comments so far from me :

+static void _selectTableAccessMethod(ArchiveHandle *AH, const char
*tablespace);
tablespace => tableam

+_selectTableAccessMethod(ArchiveHandle *AH, const char *tableam)
+{
+       PQExpBuffer cmd = createPQExpBuffer();
createPQExpBuffer() should be moved after the below statement, so that
it does not leak memory :
if (have && strcmp(want, have) == 0)
return;

char    *tableam; /* table access method, onlyt for TABLE tags */
Indentation is a bit misaligned. onlyt=> only



@@ -2696,6 +2701,7 @@ ReadToc(ArchiveHandle *AH)
                        te->tablespace = ReadStr(AH);

te->owner = ReadStr(AH);
+  te->tableam = ReadStr(AH);

Above, I am not sure about the this, but possibly we may require to
have archive-version check like how it is done for tablespace :
if (AH->version >= K_VERS_1_10)
   te->tablespace = ReadStr(AH);

So how about bumping up the archive version and doing these checks ?
Otherwise, if we run pg_restore using old version, we may read some
junk into te->tableam, or possibly crash. As I said, I am not sure
about this due to lack of clear understanding of archive versioning,
but let me know if you indeed find this issue to be true.

Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Dmitry Dolgov
> On Mon, Jan 14, 2019 at 2:07 PM Amit Khandekar <[hidden email]> wrote:
>
> createPQExpBuffer() should be moved after the below statement, so that
> it does not leak memory

Thanks for noticing, fixed.

> So how about bumping up the archive version and doing these checks ?

Yeah, you're right, I've added this check.

pg_dump_access_method_v3.patch (38K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Haribabu Kommi-2
In reply to this post by Andres Freund

On Tue, Dec 11, 2018 at 1:13 PM Andres Freund <[hidden email]> wrote:
Hi,

On 2018-11-26 17:55:57 -0800, Andres Freund wrote:
Further tasks I'm not yet planning to tackle, that I'd welcome help on:
- pg_upgrade testing

I did the pg_upgrade testing from older version with some tables and views
exists,  and all of them are properly transformed into new server with heap
as the default access method.

I will add the dimitry pg_dump patch and test the pg_upgrade to confirm
the proper access method is retained on the upgraded database.

 
- I think we should consider removing HeapTuple->t_tableOid, it should
  imo live entirely in the slot

I removed the t_tableOid from HeapTuple and during testing I found some
problems with triggers, will post the patch once it is fixed.

Regards,
Haribabu Kommi
Fujitsu Australia
Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Andres Freund
Hi,

On 2019-01-15 18:02:38 +1100, Haribabu Kommi wrote:

> On Tue, Dec 11, 2018 at 1:13 PM Andres Freund <[hidden email]> wrote:
>
> > Hi,
> >
> > On 2018-11-26 17:55:57 -0800, Andres Freund wrote:
> > Further tasks I'm not yet planning to tackle, that I'd welcome help on:
> > - pg_upgrade testing
> >
>
> I did the pg_upgrade testing from older version with some tables and views
> exists,  and all of them are properly transformed into new server with heap
> as the default access method.
>
> I will add the dimitry pg_dump patch and test the pg_upgrade to confirm
> the proper access method is retained on the upgraded database.
>
>
>
> > - I think we should consider removing HeapTuple->t_tableOid, it should
> >   imo live entirely in the slot
> >
>
> I removed the t_tableOid from HeapTuple and during testing I found some
> problems with triggers, will post the patch once it is fixed.


Please note that I'm working on a heavily revised version of the patch
right now, trying to clean up a lot of things (you might have seen some
of the threads I started). I hope to post it ~Thursday.  Local-ish
patches shouldn't be a problem though.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Amit Khandekar-2
In reply to this post by Dmitry Dolgov
On Sat, 12 Jan 2019 at 18:11, Dmitry Dolgov <[hidden email]> wrote:

> > On Sat, Jan 12, 2019 at 1:44 AM Andres Freund <[hidden email]> wrote:
> > >               /* other fields were zeroed above */
> > >
> > > @@ -9355,7 +9370,7 @@ dumpComment(Archive *fout, const char *type, const char *name,
> > >                * post-data.
> > >                */
> > >               ArchiveEntry(fout, nilCatalogId, createDumpId(),
> > > -                                      tag->data, namespace, NULL, owner,
> > > +                                      tag->data, namespace, NULL, owner, NULL,
> > >                                        "COMMENT", SECTION_NONE,
> > >                                        query->data, "", NULL,
> > >                                        &(dumpId), 1,
> >
> > We really ought to move the arguments to a struct, so we don't generate
> > quite as much useless diffs whenever we do a change around one of
> > these...
>
> That's what I thought too. Maybe then I'll suggest a mini-patch to the master to
> refactor these arguments out into a separate struct, so we can leverage it here.

Then for each of the calls, we would need to declare that structure
variable (with = {0}) and assign required fields in that structure
before passing it to ArchiveEntry(). But a major reason of
ArchiveEntry() is to avoid doing this and instead conveniently pass
those fields as parameters. This will cause unnecessary more lines of
code. I think better way is to have an ArchiveEntry() function with
limited number of parameters, and have an ArchiveEntryEx() with those
extra parameters which are not needed in usual cases. E.g. we can have
tablespace, tableam, dumpFn and dumpArg as those extra arguments of
ArchiveEntryEx(), because most of the places these are passed as NULL.
All future arguments would go in ArchiveEntryEx().
Comments ?



--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Amit Khandekar-2
In reply to this post by Dmitry Dolgov
On Tue, 15 Jan 2019 at 12:27, Dmitry Dolgov <[hidden email]> wrote:
>
> > On Mon, Jan 14, 2019 at 2:07 PM Amit Khandekar <[hidden email]> wrote:
> >
> > createPQExpBuffer() should be moved after the below statement, so that
> > it does not leak memory
>
> Thanks for noticing, fixed.

Looks good.

>
> > So how about bumping up the archive version and doing these checks ?
>
> Yeah, you're right, I've added this check.

Need to bump K_VERS_MINOR as well.

On Mon, 14 Jan 2019 at 18:36, Amit Khandekar <[hidden email]> wrote:
> +static void _selectTableAccessMethod(ArchiveHandle *AH, const char
> *tablespace);
> tablespace => tableam

This is yet to be addressed.


--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Dmitry Dolgov
> On Tue, Jan 15, 2019 at 10:52 AM Amit Khandekar <[hidden email]> wrote:
>
> Need to bump K_VERS_MINOR as well.

I've bumped it up, but somehow this change escaped the previous version. Now
should be there, thanks!

> On Mon, 14 Jan 2019 at 18:36, Amit Khandekar <[hidden email]> wrote:
> > +static void _selectTableAccessMethod(ArchiveHandle *AH, const char
> > *tablespace);
> > tablespace => tableam
>
> This is yet to be addressed.

Fixed.

Also I guess another attached patch should address the psql part, namely
displaying a table access method with \d+ and possibility to hide it with a
psql variable (HIDE_TABLEAM, but I'm open for suggestion about the name).

pg_dump_access_method_v4.patch (38K) Download Attachment
psql_describe_am.patch (15K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Andres Freund
In reply to this post by Amit Khandekar-2
Hi,

On 2019-01-15 14:37:36 +0530, Amit Khandekar wrote:
> Then for each of the calls, we would need to declare that structure
> variable (with = {0}) and assign required fields in that structure
> before passing it to ArchiveEntry(). But a major reason of
> ArchiveEntry() is to avoid doing this and instead conveniently pass
> those fields as parameters. This will cause unnecessary more lines of
> code. I think better way is to have an ArchiveEntry() function with
> limited number of parameters, and have an ArchiveEntryEx() with those
> extra parameters which are not needed in usual cases.

I don't think that'll really solve the problem. I think it might be more
reasonable to rely on structs. Now that we can rely on designated
initializers for structs we can do something like

    ArchiveEntry((ArchiveArgs){.tablespace = 3,
                               .dumpFn = somefunc,
                               ...});

and unused arguments will automatically initialized to zero.  Or we
could pass the struct as a pointer, might be more efficient (although I
doubt it matters here):

    ArchiveEntry(&(ArchiveArgs){.tablespace = 3,
                                .dumpFn = somefunc,
                                ...});

What do others think?  It'd probably be a good idea to start a new
thread about this.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Amit Khandekar-2
In reply to this post by Dmitry Dolgov
On Tue, 15 Jan 2019 at 17:58, Dmitry Dolgov <[hidden email]> wrote:

>
> > On Tue, Jan 15, 2019 at 10:52 AM Amit Khandekar <[hidden email]> wrote:
> >
> > Need to bump K_VERS_MINOR as well.
>
> I've bumped it up, but somehow this change escaped the previous version. Now
> should be there, thanks!
>
> > On Mon, 14 Jan 2019 at 18:36, Amit Khandekar <[hidden email]> wrote:
> > > +static void _selectTableAccessMethod(ArchiveHandle *AH, const char
> > > *tablespace);
> > > tablespace => tableam
> >
> > This is yet to be addressed.
>
> Fixed.

Thanks, the patch looks good to me. Of course there's the other thread
about ArchiveEntry arguments which may alter this patch, but
otherwise, I have no more comments on this patch.

>
> Also I guess another attached patch should address the psql part, namely
> displaying a table access method with \d+ and possibility to hide it with a
> psql variable (HIDE_TABLEAM, but I'm open for suggestion about the name).

Will have a look at this one.


--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Amit Khandekar-2
On Fri, 18 Jan 2019 at 10:13, Amit Khandekar <[hidden email]> wrote:
> On Tue, 15 Jan 2019 at 17:58, Dmitry Dolgov <[hidden email]> wrote:

> > Also I guess another attached patch should address the psql part, namely
> > displaying a table access method with \d+ and possibility to hide it with a
> > psql variable (HIDE_TABLEAM, but I'm open for suggestion about the name).

I am ok with the name.

>
> Will have a look at this one.

--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -1,3 +1,4 @@
+\set HIDE_TABLEAM on
 CREATE TEMP TABLE x (

I thought we wanted to avoid having to add this setting in individual
regression tests. Can't we do this in pg_regress as a common setting ?

+ /* Access method info */
+ if (pset.sversion >= 120000 && verbose && tableinfo.relam != NULL &&
+    !(pset.hide_tableam && tableinfo.relam_is_default))
+ {
+         printfPQExpBuffer(&buf, _("Access method: %s"),
fmtId(tableinfo.relam));

So this will make psql hide the access method if it's same as the
default. I understand that this was kind of concluded in the other
thread "Displaying and dumping of table access methods". But IMHO, if
the hide_tableam is false, we should *always* show the access method,
regardless of the default value. I mean, we can make it simple : off
means never show table-access, on means always show table-access,
regardless of the default access method. And this also will work with
regression tests. If some regression test wants specifically to output
the access method, it can have a "\SET HIDE_TABLEAM off" command.

If we hide the method if it's default, then for a regression test that
wants to forcibly show the table access method of all tables, it won't
show up for tables that have default access method.

------------

+ if (pset.sversion >= 120000 && verbose && tableinfo.relam != NULL &&

If the server does not support relam, tableinfo.relam will be NULL
anyways. So I think sversion check is not needed.

------------

+ printfPQExpBuffer(&buf, _("Access method: %s"), fmtId(tableinfo.relam));
fmtId is not required. In fact, we should display the access method
name as-is. fmtId is required only for identifiers present in SQL
queries.

-----------

+      printfPQExpBuffer(&buf, _("Access method: %s"), fmtId(tableinfo.relam));
+      printTableAddFooter(&cont, buf.data);
+   }
+
+
 }

Last two blank lines are not needed.

-----------

+ bool            hide_tableam;
 } PsqlSettings;

These variables, it seems, are supposed to be grouped together by type.

-----------

I believe you are going to add a new regression testcase for the change ?


--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Dmitry Dolgov
> On Fri, Jan 18, 2019 at 11:22 AM Amit Khandekar <[hidden email]> wrote:
>
> --- a/src/test/regress/expected/copy2.out
> +++ b/src/test/regress/expected/copy2.out
> @@ -1,3 +1,4 @@
> +\set HIDE_TABLEAM on
>
> I thought we wanted to avoid having to add this setting in individual
> regression tests. Can't we do this in pg_regress as a common setting ?

Yeah, you're probably right. Actually, I couldn't find anything that looks like
"common settings", and so far I've placed it into psql_start_test as a psql
argument. But not sure, maybe there is a better place.

> + /* Access method info */
> + if (pset.sversion >= 120000 && verbose && tableinfo.relam != NULL &&
> +    !(pset.hide_tableam && tableinfo.relam_is_default))
> + {
> +         printfPQExpBuffer(&buf, _("Access method: %s"),
> fmtId(tableinfo.relam));
>
> So this will make psql hide the access method if it's same as the
> default. I understand that this was kind of concluded in the other
> thread "Displaying and dumping of table access methods". But IMHO, if
> the hide_tableam is false, we should *always* show the access method,
> regardless of the default value. I mean, we can make it simple : off
> means never show table-access, on means always show table-access,
> regardless of the default access method. And this also will work with
> regression tests. If some regression test wants specifically to output
> the access method, it can have a "\SET HIDE_TABLEAM off" command.
>
> If we hide the method if it's default, then for a regression test that
> wants to forcibly show the table access method of all tables, it won't
> show up for tables that have default access method.
I can't imagine, what kind of test would need to forcibly show the table access
method of all the tables? Even if you need to verify tableam for something,
maybe it's even easier just to select it from pg_am?

> + if (pset.sversion >= 120000 && verbose && tableinfo.relam != NULL &&
>
> If the server does not support relam, tableinfo.relam will be NULL
> anyways. So I think sversion check is not needed.
> ------------
>
> + printfPQExpBuffer(&buf, _("Access method: %s"), fmtId(tableinfo.relam));
> fmtId is not required.
> -----------
>
> +      printfPQExpBuffer(&buf, _("Access method: %s"), fmtId(tableinfo.relam));
> +      printTableAddFooter(&cont, buf.data);
> +   }
> +
> +
>  }
>
> Last two blank lines are not needed.
Right, fixed.

> + bool            hide_tableam;
>  } PsqlSettings;
>
> These variables, it seems, are supposed to be grouped together by type.

Well, this grouping looks strange for me. But since I don't have a strong
opinion, I moved the variable.

> I believe you are going to add a new regression testcase for the change ?

Yep.

psql_describe_am_v2.patch (5K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Haribabu Kommi-2
In reply to this post by Andres Freund
On Tue, Jan 15, 2019 at 6:05 PM Andres Freund <[hidden email]> wrote:
Hi,

On 2019-01-15 18:02:38 +1100, Haribabu Kommi wrote:
> On Tue, Dec 11, 2018 at 1:13 PM Andres Freund <[hidden email]> wrote:
>
> > Hi,
> >
> > On 2018-11-26 17:55:57 -0800, Andres Freund wrote:
> > Further tasks I'm not yet planning to tackle, that I'd welcome help on:
> > - pg_upgrade testing
> >
>
> I did the pg_upgrade testing from older version with some tables and views
> exists,  and all of them are properly transformed into new server with heap
> as the default access method.
>
> I will add the dimitry pg_dump patch and test the pg_upgrade to confirm
> the proper access method is retained on the upgraded database.
>
>
>
> > - I think we should consider removing HeapTuple->t_tableOid, it should
> >   imo live entirely in the slot
> >
>
> I removed the t_tableOid from HeapTuple and during testing I found some
> problems with triggers, will post the patch once it is fixed.


Please note that I'm working on a heavily revised version of the patch
right now, trying to clean up a lot of things (you might have seen some
of the threads I started). I hope to post it ~Thursday.  Local-ish
patches shouldn't be a problem though.

Yes, I am checking you other threads of refactoring and cleanups.
I will rebase this patch once the revised code is available.

I am not able to remove the complete t_tableOid from HeapTuple,
because of its use in triggers, as the slot is not available in triggers
and I need to store the tableOid also as part of the tuple.

Currently setting of t_tableOid is done only when the tuple is formed
from the slot, and it is use is replaced with slot member.

comments?

Regards,
Haribabu Kommi
Fujitsu Australia

0001-Reduce-the-use-of-HeapTuple-t_tableOid.patch (75K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Andres Freund
In reply to this post by Andres Freund
Hi,

(resending with compressed attachements, perhaps that'll go through)

On 2018-12-10 18:13:40 -0800, Andres Freund wrote:
> On 2018-11-26 17:55:57 -0800, Andres Freund wrote:
> > FWIW, now that oids are removed, and the tuple table slot abstraction
> > got in, I'm working on rebasing the pluggable storage patchset ontop of
> > that.
>
> I've pushed a version to that to the git tree, including a rebased
> version of zheap:
> https://github.com/anarazel/postgres-pluggable-storage
> https://github.com/anarazel/postgres-pluggable-zheap

I've pushed the newest, substantially revised, version to the same
repository. Note, that while the newest pluggable-zheap version is newer
than my last email, it's not based on the latest version, and the
pluggable-zheap development is now happening in the main zheap
repository.


> My next steps are:
> - make relation creation properly pluggable
> - remove the typedefs from tableam.h, instead move them into the
>   TableAmRoutine struct.
> - Move rs_{nblocks, startblock, numblocks} out of TableScanDescData
> - Move HeapScanDesc and IndexFetchHeapData out of relscan.h
> - remove ExecSlotCompare(), it's entirely unrelated to these changes imo
>   (and in the wrong place)

These are done.


> - split pluggable storage patchset, to commit earlier:
>   - EvalPlanQual slotification
>   - trigger slotification
>   - split of IndexBuildHeapScan out of index.c

The patchset is now pretty granularly split into individual pieces.
There's two commits that might be worthwhile to split up further:

1) The commit introducing table_beginscan et al, currently also
   introduces indexscans through tableam.
2) The commit introducing table_(insert|delete|update) also includes
   table_lock_tuple(), which in turn changes a bunch of EPQ related
   code. It's probably worthwhile to break that out.

I tried to make each individual commit make some sense, and pass all
tests on its own. That requires some changes that are then obsolted
in a later commit, but it's not as much as I feared.


> - rename HeapUpdateFailureData et al to not reference Heap

I've not done that, I decided it's best to do that after all the work
has gone in.


> - See if the slot in SysScanDescData can be avoided, it's not exactly
>   free of overhead.

After reconsidering, I don't think it's worth doing so.


There's pretty substantial changes in this series, besides the things
mentioned above:

- I re-introduced parallel scan into pluggable storage, but added a set
  of helper functions to avoid having to duplicate the current block
  based logic from heap. That way it can be shared between most/all
  block based AMs
- latestRemovedXid handling is moved into the table-AM, that's required
  for correct replay on Hot-Standby, where we do not know the AM of the
  current
- the whole truncation and relation creation code has been overhauled
- the order of functions in tableam.h, heapam_handler.c etc has been
  made more sensible
- a number of callbacks have been obsoleted (relation_sync,
  relation_create_init_fork, scansetlimits)
- A bunch of prerequisite work has been merged
- (heap|relation)_(open|openrv|close) have been split into their own
  files
- To avoid having to care about the bulk-insert flags code that uses a
  bulk-insert now unconditionally calls table_finish_bulk_insert(). The
  AM then internally can decide what it needs to do in case of
  e.g. HEAP_INSERT_SKIP_WAL.  Zheap currently for example doesn't
  implement that (because UNDO handling is complicated), and this way it
  can just ignore the option, without needing call-site code for that.
- A *lot* of cleanups

Todo:
- merge psql / pg_dump support by Dmitry
- consider removing scan_update_snapshot
- consider removing table_gimmegimmeslot()
- add substantial docs for every callback
- consider revising the current table_lock_tuple() API, I'm not quite
  convinced that's right
- reconsider heap_fetch() API changes, causes unnecessary pain
- polish the split out trigger and EPQ changes, so they can be merged
  soon-ish


I plan to merge the first few commits pretty soon (as largely announced
in related threads).


While I saw an initial attempt at writing smgl docs for the table AM
API, I'm not convinced that's the best approach.  I think it might make
more sense to have high-level docs in sgml, but then do all the
per-callback docs in tableam.h.

Greetings,

Andres Freund

v12-0001-WIP-Introduce-access-table.h-access-relation.h.patch.gz (7K) Download Attachment
v12-0002-Replace-heapam.h-includes-with-relation.h-table..patch.gz (9K) Download Attachment
v12-0003-Replace-uses-of-heap_open-et-al-with-table_open-.patch.gz (101K) Download Attachment
v12-0004-Remove-superfluous-tqual.h-includes.patch.gz (6K) Download Attachment
v12-0005-WIP-rename-tqual.c-to-heapam_visibility.c.patch.gz (5K) Download Attachment
v12-0006-WIP-change-snapshot-type-to-enum-rather-than-cal.patch.gz (5K) Download Attachment
v12-0007-Move-generic-snapshot-related-code-from-tqual.h-.patch.gz (6K) Download Attachment
v12-0008-Move-heap-visibility-routines-to-heapam.h.patch.gz (5K) Download Attachment
v12-0009-Rephrase-references-to-time-qualification.patch.gz (2K) Download Attachment
v12-0010-WIP-Extend-tuples-when-getting-them-from-a-slot.patch.gz (1K) Download Attachment
v12-0011-WIP-Move-page-initialization-from-RelationAddExt.patch.gz (3K) Download Attachment
v12-0012-WIP-ForceStore-HeapTupleDatum.patch.gz (1K) Download Attachment
v12-0013-slot-type-fixes.patch.gz (1K) Download Attachment
v12-0014-Rename-RelationData.rd_amroutine-to-rd_indam.patch.gz (6K) Download Attachment
v12-0015-Add-ExecStorePinnedBufferHeapTuple.patch.gz (2K) Download Attachment
v12-0016-Store-HeapTupleData-in-Buffer-HeapTupleTableSlot.patch.gz (1K) Download Attachment
v12-0017-Buffer-tuples-may-be-virtualized.patch.gz (1K) Download Attachment
v12-0018-slot-tableoid-tid-support.patch.gz (1K) Download Attachment
v12-0019-Set-tableoid-in-a-bunch-of-places.patch.gz (2K) Download Attachment
v12-0020-WIP-Slotified-triggers.patch.gz (26K) Download Attachment
v12-0021-WIP-Slotify-EPQ.patch.gz (10K) Download Attachment
v12-0022-tableam-introduce-minimal-infrastructure.patch.gz (19K) Download Attachment
v12-0023-tableam-Introduce-and-use-begin-endscan-and-do-i.patch.gz (61K) Download Attachment
v12-0024-tableam-Inquire-slot-type-from-AM-rather-than-ha.patch.gz (8K) Download Attachment
v12-0025-tableam-introduce-slot-based-table-getnext-and-u.patch.gz (9K) Download Attachment
v12-0026-tableam-Add-insert-delete-update-lock_tuple.patch.gz (27K) Download Attachment
v12-0027-tableam-Add-fetch_row_version.patch.gz (4K) Download Attachment
v12-0028-tableam-Add-use-tableam_fetch_follow_check.patch.gz (2K) Download Attachment
v12-0029-tableam-Add-table_get_latest_tid.patch.gz (2K) Download Attachment
v12-0030-tableam-multi_insert-and-slotify-COPY.patch.gz (8K) Download Attachment
v12-0031-tableam-finish_bulk_insert.patch.gz (2K) Download Attachment
v12-0032-tableam-slotify-CREATE-TABLE-AS-and-CREATE-MATER.patch.gz (2K) Download Attachment
v12-0033-tableam-index-builds.patch.gz (21K) Download Attachment
v12-0034-tableam-relation-creation-VACUUM-FULL-CLUSTER-SE.patch.gz (21K) Download Attachment
v12-0035-tableam-VACUUM-and-ANALYZE.patch.gz (7K) Download Attachment
v12-0036-tableam-planner-size-estimation.patch.gz (5K) Download Attachment
v12-0037-tableam-Sample-Scan-Support.patch.gz (9K) Download Attachment
v12-0038-tableam-bitmap-heap-scan.patch.gz (6K) Download Attachment
v12-0039-tableam-remaining-stuff.patch.gz (5K) Download Attachment
v12-0040-WIP-Move-xid-horizon-computation-for-page-level-.patch.gz (8K) Download Attachment
v12-0041-tableam-Add-function-to-determine-newest-xid-amo.patch.gz (1K) Download Attachment
v12-0042-tableam-Fetch-tuples-for-triggers-EPQ-using-a-pr.patch.gz (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Amit Khandekar-2
In reply to this post by Dmitry Dolgov
On Sun, 20 Jan 2019 at 22:46, Dmitry Dolgov <[hidden email]> wrote:

>
> > On Fri, Jan 18, 2019 at 11:22 AM Amit Khandekar <[hidden email]> wrote:
> >
> > --- a/src/test/regress/expected/copy2.out
> > +++ b/src/test/regress/expected/copy2.out
> > @@ -1,3 +1,4 @@
> > +\set HIDE_TABLEAM on
> >
> > I thought we wanted to avoid having to add this setting in individual
> > regression tests. Can't we do this in pg_regress as a common setting ?
>
> Yeah, you're probably right. Actually, I couldn't find anything that looks like
> "common settings", and so far I've placed it into psql_start_test as a psql
> argument. But not sure, maybe there is a better place.

Yeah, psql_start_test() looks good to me. pg_regress does not seem to
have it's own psqlrc file where we could have put this variable. May
be later on if we want to have more such variables, we could device
this infrastructure.

>
> > + /* Access method info */
> > + if (pset.sversion >= 120000 && verbose && tableinfo.relam != NULL &&
> > +    !(pset.hide_tableam && tableinfo.relam_is_default))
> > + {
> > +         printfPQExpBuffer(&buf, _("Access method: %s"),
> > fmtId(tableinfo.relam));
> >
> > So this will make psql hide the access method if it's same as the
> > default. I understand that this was kind of concluded in the other
> > thread "Displaying and dumping of table access methods". But IMHO, if
> > the hide_tableam is false, we should *always* show the access method,
> > regardless of the default value. I mean, we can make it simple : off
> > means never show table-access, on means always show table-access,
> > regardless of the default access method. And this also will work with
> > regression tests. If some regression test wants specifically to output
> > the access method, it can have a "\SET HIDE_TABLEAM off" command.
> >
> > If we hide the method if it's default, then for a regression test that
> > wants to forcibly show the table access method of all tables, it won't
> > show up for tables that have default access method.
>
> I can't imagine, what kind of test would need to forcibly show the table access
> method of all the tables? Even if you need to verify tableam for something,
> maybe it's even easier just to select it from pg_am?

Actually my statement is wrong, sorry. For a regression test that
wants to forcibly show table access for all tables, it just needs to
SET HIDE_TABLEAM to OFF. With your patch, if we set HIDE_TABLEAM to
OFF, it will *always* show the table access regardless of default
access method.

It is with HIDE_TABLEAM=ON that your patch hides the table access
conditionally (i.e. it shows when default value does not match). It's
in this case, that I feel we should *unconditionally* hide the table
access. Regression tests that use \d+ to show the table details might
not be interested specifically in table access method. But these will
fail if run with a modified default access method.

Besides, my general inclination is : keep the GUC behaviour simple;
and also,  it looks like we can keep the regression test output
consistent without having to have this conditional behaviour.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Reply | Threaded
Open this post in threaded view
|

Re: Pluggable Storage - Andres's take

Haribabu Kommi-2
In reply to this post by Andres Freund

On Mon, Jan 21, 2019 at 1:01 PM Andres Freund <[hidden email]> wrote:
Hi,

On 2018-12-10 18:13:40 -0800, Andres Freund wrote:
> On 2018-11-26 17:55:57 -0800, Andres Freund wrote:
> > FWIW, now that oids are removed, and the tuple table slot abstraction
> > got in, I'm working on rebasing the pluggable storage patchset ontop of
> > that.
>
> I've pushed a version to that to the git tree, including a rebased
> version of zheap:
> https://github.com/anarazel/postgres-pluggable-storage
> https://github.com/anarazel/postgres-pluggable-zheap

I've pushed the newest, substantially revised, version to the same
repository. Note, that while the newest pluggable-zheap version is newer
than my last email, it's not based on the latest version, and the
pluggable-zheap development is now happening in the main zheap
repository.

Thanks for the new version of patches and changes.
 
Todo:
- consider removing scan_update_snapshot

Attached the patch for removal of scan_update_snapshot
and also the rebased patch of reduction in use of t_tableOid.
 
- consider removing table_gimmegimmeslot()
- add substantial docs for every callback

Will work on the above two.

While I saw an initial attempt at writing smgl docs for the table AM
API, I'm not convinced that's the best approach.  I think it might make
more sense to have high-level docs in sgml, but then do all the
per-callback docs in tableam.h.

OK, I will update the sgml docs accordingly. 
Index AM has per callback docs in the sgml, refactor them also?

Regards,
Haribabu Kommi
Fujitsu Australia

0002-Removal-of-scan_update_snapshot.patch (5K) Download Attachment
0001-Reduce-the-use-of-HeapTuple-t_tableOid.patch (73K) Download Attachment
1234567 ... 9