Hi all,

I am quite new to PostgreSQL, so forgive me if my understanding of the code below is wrong, and please clarify what I have misunderstood.

I started to experiment with the table access method interface to see if it can be used for some ideas I have. For the experiment, I am using a simple in-memory table access method that stores all the data as shared memory pages instead of disk pages. I know there are other ways to get good performance, but this implementation is good enough for my experiments since it exercises a few things in the Table AM interface that I am wondering about.

Now, the first question I was looking at is whether it is possible to handle DDL properly if you have non-normal storage: creating new storage blocks on table creation, cleaning up when a table is dropped, and handling schema changes on ALTER TABLE.

Creating new blocks for a table is straightforward to implement using the `relation_set_new_filenode` callback, where you can create new memory blocks for a relation, but I cannot find a way to clean up those blocks when the table is dropped, nor a way to handle a change of the schema for a table.

The `relation_set_new_filenode` callback is indirectly called from `heap_create_with_catalog`, but there is no corresponding callback from `heap_drop_with_catalog`. It also seems like the intention is that the callback should call `RelationCreateStorage` itself (which makes sense, since the access method knows how to use its storage), so it seems natural to add a `relation_reset_filenode` callback to the table AM that is called from `heap_drop_with_catalog` for tables, and add that to the heap implementation (see the attached patch; a rough sketch of the idea is also included at the end of this message).

Altering the schema does not seem to be covered at all, but this is something that table access methods need to know about, since they might want to optimize the internal storage when the schema changes. I have not been able to find any discussions around this, but it seems like a natural thing to do with a table. Have I misunderstood how this works?

Best wishes,
Mats Kindahl
Timescale
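For illustration, a rough sketch of what the existing callback and the proposed one might look like for such an in-memory AM. The `memam_*` and `inmem_*` names are made up for this example, `relation_reset_filenode` does not exist in the current `TableAmRoutine`, and the actual attached patch may differ; only `relation_set_new_filenode` and `RelationCreateStorage` are existing interfaces.

```c
/*
 * Sketch only: an in-memory table AM ("memam") implementing the existing
 * relation_set_new_filenode callback plus the proposed (hypothetical)
 * relation_reset_filenode callback.  The inmem_* helpers are made up.
 */
#include "postgres.h"

#include "access/multixact.h"
#include "access/tableam.h"
#include "access/transam.h"
#include "utils/rel.h"

/* Hypothetical helpers that manage the shared-memory blocks. */
extern void inmem_alloc_blocks(const RelFileNode *rnode);
extern void inmem_free_blocks(const RelFileNode *rnode);

/* Existing callback: set up storage for a new (or rewritten) relation. */
static void
memam_relation_set_new_filenode(Relation rel,
                                const RelFileNode *newrnode,
                                char persistence,
                                TransactionId *freezeXid,
                                MultiXactId *minmulti)
{
    /* No frozen-xid bookkeeping for this toy AM. */
    *freezeXid = InvalidTransactionId;
    *minmulti = InvalidMultiXactId;

    /* Allocate the shared-memory pages that back the relation. */
    inmem_alloc_blocks(newrnode);
}

/*
 * Proposed callback (does not exist today): release the blocks when the
 * table is dropped, called from heap_drop_with_catalog().
 */
static void
memam_relation_reset_filenode(Relation rel)
{
    inmem_free_blocks(&rel->rd_node);
}
```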
Hi,
On 2021-02-22 08:33:21 +0100, Mats Kindahl wrote:
> I started to experiment with the table access method interface to see
> if it can be used for some ideas I have.

Cool.

> The `relation_set_new_filenode` is indirectly called from
> `heap_create_with_catalog`, but there is no corresponding callback from
> `heap_drop_with_catalog`. It also seems like the intention is that the
> callback should call `RelationCreateStorage` itself (makes sense, since
> the access method knows about how to use the storage), so it seems
> natural to add a `relation_reset_filenode` to the table AM that is
> called from `heap_drop_with_catalog` for tables and add that to the
> heap implementation (see the attached patch).

I don't think that's quite right. It's not exactly obvious from the name, but RelationDropStorage() does not actually drop the storage. Instead it *schedules* the storage to be dropped upon commit.

The reason for deferring the drop of table storage is that DDL in postgres is transactional. Therefore we cannot remove the storage at the moment DROP TABLE is executed - only when the transaction that performed the DDL commits.

So just providing you with a callback that runs in heap_drop_with_catalog() doesn't really achieve much - you'd have no way to execute the "actual" dropping of the relation at that later stage. (A simplified sketch of the deferred-drop bookkeeping is included below.)

> Creating new blocks for a table is straightforward to implement by
> using the `relation_set_new_filenode` callback where you can create new
> memory blocks for a relation, but I cannot find a way to clean up those
> blocks when the table is dropped nor a way to handle a change of the
> schema for a table.

What precisely do you mean by "handle a change of the schema"? I.e. what would you like to do, and what do you think is preventing you from doing it? But before you answer, see my next point below.

> Altering the schema does not seem to be covered at all, but this is
> something that table access methods need to know about since it might
> want to optimize the internal storage when the schema changes. I have
> not been able to find any discussions around this, but it seems like a
> natural thing to do with a table. Have I misunderstood how this works?

Due to postgres' transactional DDL you cannot really change the storage layout of *existing data* when that DDL command is executed - the data still needs to be interpretable in case the DDL is rolled back (including after a crash). Before I explain some more: could you describe in a bit more detail what kind of optimization you'd like to make?

Back to schema change handling: for some schema changes postgres assumes that they can be done "in place", e.g. adding a column to a table. Other changes, e.g. changing the type of a column "sufficiently", cause a so-called table rewrite. That means a new relation is created (including a call to relation_set_new_filenode()), all the data is inserted into that new relation, and then pg_class->relfilenode for the "original" relation is changed to point to the "rewritten" table (there are two variants of this: one for rewrites due to ALTER TABLE and a separate one for VACUUM FULL/CLUSTER). When the transaction containing such a rewrite commits, that relfilenode change becomes visible to everyone and the old relfilenode is deleted.

This means that right now there's no easy way to store the data anywhere but in the file referenced by pg_class.relfilenode.
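To make the scheduling concrete, here is a much-simplified sketch of the bookkeeping in src/backend/catalog/storage.c. It is abbreviated and paraphrased, not the verbatim source: the real code also handles subtransactions and temp relations, batches the unlinks, and frees the list entries. The struct, function, and field names are the real ones; the bodies are condensed.

```c
/* Simplified sketch of the deferred-drop machinery in catalog/storage.c. */
#include "postgres.h"

#include "access/xact.h"
#include "storage/smgr.h"
#include "utils/memutils.h"
#include "utils/rel.h"

typedef struct PendingRelDelete
{
    RelFileNode relnode;        /* relation whose storage may be dropped */
    BackendId   backend;        /* InvalidBackendId unless a temp rel */
    bool        atCommit;       /* true = drop at commit, false = at abort */
    int         nestLevel;      /* xact nesting level of the request */
    struct PendingRelDelete *next;
} PendingRelDelete;

static PendingRelDelete *pendingDeletes = NULL;

/* Called from e.g. heap_drop_with_catalog(): only *schedules* the unlink. */
void
RelationDropStorage(Relation rel)
{
    PendingRelDelete *pending;

    pending = (PendingRelDelete *)
        MemoryContextAlloc(TopMemoryContext, sizeof(PendingRelDelete));
    pending->relnode = rel->rd_node;
    pending->backend = rel->rd_backend;
    pending->atCommit = true;   /* remove the files only if we commit */
    pending->nestLevel = GetCurrentTransactionNestLevel();
    pending->next = pendingDeletes;
    pendingDeletes = pending;
}

/*
 * Called at the end of the top-level transaction: only now is storage
 * physically removed, and only for entries whose atCommit flag matches
 * the transaction outcome.
 */
void
smgrDoPendingDeletes(bool isCommit)
{
    PendingRelDelete *pending;

    for (pending = pendingDeletes; pending != NULL; pending = pending->next)
    {
        if (pending->atCommit == isCommit)
        {
            SMgrRelation srel = smgropen(pending->relnode, pending->backend);

            smgrdounlinkall(&srel, 1, false);   /* physically unlink the files */
            smgrclose(srel);
        }
    }
    pendingDeletes = NULL;      /* real code also pfrees the entries */
}
```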
I don't think anybody would object on principle to making the necessary infrastructure changes to support storing data elsewhere - but I think it'll also not be quite as simple as the change you suggested :(.

Greetings,

Andres Freund