Fallback table AM for relkinds without storage

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Fallback table AM for relkinds without storage

Michael Paquier-2
Hi all,

In recent history, we have had two bugs causing a crash of the backend
because of the default behavior of rd_tableam to be NULL for a
relcache entry for relkinds that have no storage:
1) Sequential attempt for a view:
https://www.postgresql.org/message-id/16856-0363e05c6e1612fd@...
2) currtid() and currtid2():
https://postgr.es/m/CAJGNTeO93u-5APMga6WH41eTZ3Uee9f3s8dCpA-GSSqNs1b=Ug@...

Any hole in the code that allows a relation without storage to attempt
to access a table AM is able to take the server down.  Of course, any
code doing that would be wrong, but it seems to me that we had better
put in place better defenses so as any mistake does not result in a
server going down.  Looking at the code, we would need to do a couple
of things, mainly:
- Create a new table AM for relations without storage to plug into.
The idea would be a simple wrapper for all the AM functions that
triggers a elog(ERROR) for each one of them.  If possible, provide
some details based on the arguments given by the caller of the
function.  Here are some ideas of names: no_storage_table_am,
no_storage_am, error_table_am, error_am, fallback_am (this one sounds
wrong).  This requires an extra row in pg_am.
- Tweak the area around RelationInitTableAccessMethod(), with rd_am so
as rd_amhandler is never NULL.

Putting sanity checks within all the table_* functions of tableam.h
would not be a good idea, as nothing prevents the call of what's
stored in rel->rd_tableam.  

Thoughts?
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Fallback table AM for relkinds without storage

Michael Paquier-2
On Tue, Feb 09, 2021 at 04:27:34PM +0900, Michael Paquier wrote:
> Putting sanity checks within all the table_* functions of tableam.h
> would not be a good idea, as nothing prevents the call of what's
> stored in rel->rd_tableam.  

I have been playing with this idea, and finished with the attached,
which is not the sexiest patch around.  The table AM used as fallback
for tables without storage is called no_storage (this could be called
virtual_am?).  Reverting e786be5 or dd705a0 leads to an error coming
from no_storage instead of a crash.

One thing to note is that this simplifies a bit slot_callbacks as
views, foreign tables and partitioned tables can grab their slot type
directly from this new table AM.
--
Michael

0001-Add-no_storage-fallback-table-AM-for-relations-witho.patch (31K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Fallback table AM for relkinds without storage

Justin Pryzby
On Mon, Feb 15, 2021 at 04:21:38PM +0900, Michael Paquier wrote:

> On Tue, Feb 09, 2021 at 04:27:34PM +0900, Michael Paquier wrote:
> > Putting sanity checks within all the table_* functions of tableam.h
> > would not be a good idea, as nothing prevents the call of what's
> > stored in rel->rd_tableam.  
>
> I have been playing with this idea, and finished with the attached,
> which is not the sexiest patch around.  The table AM used as fallback
> for tables without storage is called no_storage (this could be called
> virtual_am?).  Reverting e786be5 or dd705a0 leads to an error coming
> from no_storage instead of a crash.

If you apply this patch, will you want to actually revert those earlier changes?

> One thing to note is that this simplifies a bit slot_callbacks as
> views, foreign tables and partitioned tables can grab their slot type
> directly from this new table AM.

Also (related), this still crashes if methods are omitted from the initializer,
like:

// .slot_callbacks = no_storage_slot_callbacks,

I'm not sure if there's any better way to enforce that's updated when callbacks
are added.

Most of the methods have Assert( != NULL), so maybe this one is missing?

src/backend/access/table/tableamapi.c
GetTableAmRoutine(Oid amhandler)
...
        Assert(routine->slot_callbacks != NULL);

See also
https://www.postgresql.org/message-id/CALfoeisgdZhYDrJOukaBzvXfJOK2FQ0szVMK7dzmcy6w93iDUA%40mail.gmail.com

--
Justin


Reply | Threaded
Open this post in threaded view
|

Re: Fallback table AM for relkinds without storage

Michael Paquier-2
On Sun, Feb 21, 2021 at 09:43:59AM -0600, Justin Pryzby wrote:
> If you apply this patch, will you want to actually revert those
> earlier changes?

That's not in the plan.

> Also (related), this still crashes if methods are omitted from the initializer,
> like:
>
> // .slot_callbacks = no_storage_slot_callbacks,
>
> I'm not sure if there's any better way to enforce that's updated when callbacks
> are added.
>
> Most of the methods have Assert( != NULL), so maybe this one is missing?
>
> src/backend/access/table/tableamapi.c
> GetTableAmRoutine(Oid amhandler)
> ...
> Assert(routine->slot_callbacks != NULL);
Good point, that looks like an omission.  Even if the code tries to
look after the slot type for a view, foreign table or partitioned
table, this cannot be NULL.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Fallback table AM for relkinds without storage

Andres Freund
In reply to this post by Michael Paquier-2
Hi,

On 2021-02-15 16:21:38 +0900, Michael Paquier wrote:
> I have been playing with this idea, and finished with the attached,
> which is not the sexiest patch around.  The table AM used as fallback
> for tables without storage is called no_storage (this could be called
> virtual_am?).


> One thing to note is that this simplifies a bit slot_callbacks as
> views, foreign tables and partitioned tables can grab their slot type
> directly from this new table AM.

This doesn't seem like an advantage to me. Isn't this just pushing logic
away from a fairly obvious point into an AM that one would expect to
never actually get called?


> +static const TupleTableSlotOps *
> +no_storage_slot_callbacks(Relation relation)
> +{
> + if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
> + {
> + /*
> + * Historically FDWs expect to store heap tuples in slots. Continue
> + * handing them one, to make it less painful to adapt FDWs to new
> + * versions. The cost of a heap slot over a virtual slot is pretty
> + * small.
> + */
> + return &TTSOpsHeapTuple;
> + }
> +
> + /*
> + * These need to be supported, as some parts of the code (like COPY) need
> + * to create slots for such relations too. It seems better to centralize
> + * the knowledge that a heap slot is the right thing in that case here.
> + */
> + if (relation->rd_rel->relkind != RELKIND_VIEW &&
> + relation->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
> + elog(ERROR, "no_storage_slot_callbacks() failed on relation \"%s\"",
> + RelationGetRelationName(relation));
> + return &TTSOpsVirtual;
> +}

If we want to go down this path what's the justification for have
relkind awareness inside the AM? If we want it, we should give FDWs
their own tableam.  And we should do the same for sequences (that'd imo
be a much nicer improvement than this change in itself).

If we were to go for separate AMs I think we could consider implementing
most of their functionality in one file, to avoid needing to duplicate
the functions that error out.

And I'd vote for not naming it no_storage - to me that sounds like a
name you'd give "blackhole_am". This concept kinda reminds me of
pseudotypes - so maybe we should just name it pseudo_am.c or such?

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: Fallback table AM for relkinds without storage

Michael Paquier-2
On Mon, Feb 22, 2021 at 05:19:37PM -0800, Andres Freund wrote:
> This doesn't seem like an advantage to me. Isn't this just pushing logic
> away from a fairly obvious point into an AM that one would expect to
> never actually get called?
>
> If we want to go down this path what's the justification for have
> relkind awareness inside the AM? If we want it, we should give FDWs
> their own tableam.

Agreed, I am not completely comfortable with passing down any
knowledge of the relkind down to the AM itself.

> And we should do the same for sequences (that'd imo be a much nicer
> improvement than this change in itself).

Sequences just use the existing heap AM, so you mean to drop from
relcache.c anything specific to sequences when initializing the
relation cache and set pg_class.relam accordingly, right?  That makes
sense for consistency with the rest.

> If we were to go for separate AMs I think we could consider implementing
> most of their functionality in one file, to avoid needing to duplicate
> the functions that error out.

Yep, definitely.  No issues with that.

> And I'd vote for not naming it no_storage - to me that sounds like a
> name you'd give "blackhole_am". This concept kinda reminds me of
> pseudotypes - so maybe we should just name it pseudo_am.c or such?

For the file name, using something like pseudo_handler.c or similar
would be fine, I guess.  However, if we go down the path of one AM per
relkind for the slot callback, then why not just calling the AMs
foreign_table_am, view_am and part_table_am?  This could be coupled
with sanity checks to make sure that AMs assigned to those relations
are the expected ones.

blackhole_am is not the best fit for that IMO.  It already exists, but
I would be fine to change this code, of course:
https://github.com/michaelpq/pg_plugins/tree/master/blackhole_am
--
Michael

signature.asc (849 bytes) Download Attachment