Table AM and DDLs

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Table AM and DDLs

Mats Kindahl
Hi all,

I am quite new to PostgreSQL so forgive me if my understanding of the code below is wrong and please clarify what I have misunderstood.

I started to experiment with the table access method interface to see if it can be used for some ideas I have.

For the experiment, I am using a simple in-memory table access method that stores all the data as shared memory pages instead of disk pages. I know there are other ways to get good performance, but this implementation is good enough for my experiments since it tests a few things with the Table AM interface that I am wondering about.

Now, the first question I was looking at is if it is possible to
handle DDL properly if you have a non-normal storage. Both create new storage blocks on table creation, clean up on dropping a table as well as handling schema changes on alter table?

Creating new blocks for a table is straightforward to implement by using the `relation_set_new_filenode` callback where you can create new memory blocks for a relation, but I cannot find a way to clean up those blocks when the table is dropped nor a way to handle a change of the schema for a table.

The `relation_set_new_filenode` is indirectly called from
`heap_create_with_catalog`, but there is no corresponding callback from `heap_drop_with_catalog`. It also seems like the intention is that the callback should call `RelationCreateStorage` itself (makes sense, since the access method knows about how to use the storage), so it seems natural to add a `relation_reset_filenode` to the table AM that is called from `heap_drop_with_catalog` for tables and add that to the heap implementation (see the attached patch).

Altering the schema does not seem to be covered at all, but this is something that table access methods need to know about since it might want to optimize the internal storage when the schema changes. I have not been able to find any discussions around this, but it seems like a natural thing to do with a table. Have I misunderstood how this works?

Best wishes,
Mats Kindahl
Timescale

0001-Add-callback-to-reset-filenode-to-table-access-metho.patch (5K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Table AM and DDLs

Andres Freund
Hi,

On 2021-02-22 08:33:21 +0100, Mats Kindahl wrote:
> I started to experiment with the table access method interface to see if it
> can be used for some ideas I have.

Cool.


> The `relation_set_new_filenode` is indirectly called from
> `heap_create_with_catalog`, but there is no corresponding callback from
> `heap_drop_with_catalog`. It also seems like the intention is that the
> callback should call `RelationCreateStorage` itself (makes sense, since the
> access method knows about how to use the storage), so it seems natural to
> add a `relation_reset_filenode` to the table AM that is called from
> `heap_drop_with_catalog` for tables and add that to the heap implementation
> (see the attached patch).

I don't think that's quite right. It's not exactly obvious from the
name, but RelationDropStorage() does not actually drop storage. Instead
it *schedules* the storage to be dropped upon commit.

The reason for deferring the dropping of table storage is that DDL in
postgres is transactional. Therefore we cannot remove the storage at the
moment the DROP TABLE is executed - only when the transaction that
performed the DDL commits. Therefore just providing you with a callback
that runs in heap_drop_with_catalog() doesn't really achieve much -
you'd not have a way to execute the "actual" dropping of the relation at
the later stage.


> Creating new blocks for a table is straightforward to implement by using
> the `relation_set_new_filenode` callback where you can create new memory
> blocks for a relation, but I cannot find a way to clean up those blocks
> when the table is dropped nor a way to handle a change of the schema for a
> table.

What precisely do you mean with the "handle a change of the schema" bit?
I.e. what would you like to do, and what do you think is preventing you
from it? But before you answer see my next point below.


> Altering the schema does not seem to be covered at all, but this is
> something that table access methods need to know about since it might want
> to optimize the internal storage when the schema changes. I have not been
> able to find any discussions around this, but it seems like a natural thing
> to do with a table. Have I misunderstood how this works?

Due to postgres' transactional DDL you cannot really change the storage
layout of *existing data* when that DDL command is executed - the data
still needs to be interpretable in case the DDL is rolled back
(including when crashing).

Before I explain some more: Could you describe in a bit more detail what
kind of optimization you'd like to make?

Back to schema change handling:

For some schema changes postgres assumes that they can be done
"in-place", e.g. adding a column to a table.

Other changes, e.g. changing the type of a column "sufficiently", will
cause a so called table rewrite. Which means that a new relation will be
created (including a call to relation_set_new_filenode()), then that new
relation will get all the new data inserted, and then
pg_class->relfilenode for the "original" relation will be changed to the
"rewritten" table (there's two variants of this, once for rewrites due
to ALTER TABLE and a separate one for VACUUM FULL/CLUSTER).

When the transaction containing such a rewrite commits that
->relfilenode change becomes visible for everyone, and the old
relfilenode will be deleted.


This means that right now there's no easy way to store the data anywhere
but in the file referenced by pg_class.relfilenode. I don't think
anybody would object on principle to making the necessary infrastructure
changes to support storing data elsewhere - but I think it'll also not
quite as simple as the change you suggested :(.

Greetings,

Andres Freund