pg_partition_tree crashes for a non-defined relation

classic Classic list List threaded Threaded
20 messages Options
Reply | Threaded
Open this post in threaded view
|

pg_partition_tree crashes for a non-defined relation

Michael Paquier-2
Hi all,

While testing another patch, I have bumped into the issue of
$subject...  I should have put some more negative testing from the start
on this stuff, here is a culprit query when passing directly an OID:
select pg_partition_tree(0);

I think that we should make the function return NULL if the relation
defined does not exist, as we usually do for system-facing functions.
It is also easier for the caller to know that the relation does not
exist instead of having a plpgsql try/catch wrapper or such.

Thoughts?
--
Michael

partition-tree-invalid.patch (1K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Michael Paquier-2
On Fri, Dec 07, 2018 at 10:04:06AM +0900, Michael Paquier wrote:

> While testing another patch, I have bumped into the issue of
> $subject...  I should have put some more negative testing from the start
> on this stuff, here is a culprit query when passing directly an OID:
> select pg_partition_tree(0);
>
> I think that we should make the function return NULL if the relation
> defined does not exist, as we usually do for system-facing functions.
> It is also easier for the caller to know that the relation does not
> exist instead of having a plpgsql try/catch wrapper or such.
>
> Thoughts?
Are there any objections about fixing this issue?  I would rather fix it
sonner than later.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Amit Langote
Hi,

Sorry for not replying sooner.

On Sat, Dec 8, 2018 at 8:06 Michael Paquier <[hidden email]> wrote:
On Fri, Dec 07, 2018 at 10:04:06AM +0900, Michael Paquier wrote:
> While testing another patch, I have bumped into the issue of
> $subject...  I should have put some more negative testing from the start
> on this stuff, here is a culprit query when passing directly an OID:
> select pg_partition_tree(0);
>
> I think that we should make the function return NULL if the relation
> defined does not exist, as we usually do for system-facing functions.
> It is also easier for the caller to know that the relation does not
> exist instead of having a plpgsql try/catch wrapper or such.
>
> Thoughts?

Are there any objections about fixing this issue?  I would rather fix it
sonner than later.

Thanks for noticing it and creating the patch.  The fix makes sense.

Regards,
Amit
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Michael Paquier-2
On Sat, Dec 08, 2018 at 12:28:53PM +0900, Amit Langote wrote:
> Thanks for noticing it and creating the patch.  The fix makes sense.

Thanks a lot for looking at it!
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Tom Lane-2
In reply to this post by Michael Paquier-2
Michael Paquier <[hidden email]> writes:
> On Fri, Dec 07, 2018 at 10:04:06AM +0900, Michael Paquier wrote:
>> I think that we should make the function return NULL if the relation
>> defined does not exist, as we usually do for system-facing functions.
>> It is also easier for the caller to know that the relation does not
>> exist instead of having a plpgsql try/catch wrapper or such.

> Are there any objections about fixing this issue?  I would rather fix it
> sonner than later.

Return NULL seems a reasonable behavior.

How about cases where the relation OID exists but it's the wrong
kind of relation?

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Michael Paquier-2
On Fri, Dec 07, 2018 at 11:33:32PM -0500, Tom Lane wrote:
> How about cases where the relation OID exists but it's the wrong
> kind of relation?

Such cases already return an error:
=# create sequence popo;
CREATE SEQUENCE
=# select pg_partition_tree('popo');
ERROR:  42809: "popo" is not a table, a foreign table, or an index
LOCATION:  pg_partition_tree, partitionfuncs.c:54

I think that's a sensible choice, because it makes no sense to look for
the inheritors of unsupported relkinds.  Perhaps you have a different
view on the matter?
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Stephen Frost
Greetings,

* Michael Paquier ([hidden email]) wrote:

> On Fri, Dec 07, 2018 at 11:33:32PM -0500, Tom Lane wrote:
> > How about cases where the relation OID exists but it's the wrong
> > kind of relation?
>
> Such cases already return an error:
> =# create sequence popo;
> CREATE SEQUENCE
> =# select pg_partition_tree('popo');
> ERROR:  42809: "popo" is not a table, a foreign table, or an index
> LOCATION:  pg_partition_tree, partitionfuncs.c:54
>
> I think that's a sensible choice, because it makes no sense to look for
> the inheritors of unsupported relkinds.  Perhaps you have a different
> view on the matter?
We should really have a more clearly defined policy around this, but my
recollection is that we often prefer to return NULL rather than throwing
an error for the convenience of people doing things like querying
pg_class using similar functions.

I wonder if we maybe should have a regression test for every such
function which just queries the catalog in a way to force the function
to be called for every relation defined in the regression tests, to
ensure that it doesn't segfault or throw an error..

Thanks!

Stephen

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Michael Paquier-2
On Sat, Dec 08, 2018 at 08:46:08AM -0500, Stephen Frost wrote:
> We should really have a more clearly defined policy around this, but my
> recollection is that we often prefer to return NULL rather than throwing
> an error for the convenience of people doing things like querying
> pg_class using similar functions.

Yes, that's visibly right.  At least that's what I can see from the
various pg_get_*def and pg_*_is_visible.  Returning NULL would indeed
be more consistent.

> I wonder if we maybe should have a regression test for every such
> function which just queries the catalog in a way to force the function
> to be called for every relation defined in the regression tests, to
> ensure that it doesn't segfault or throw an error..

Like sqlsmith?  It looks hard to me to make something like that part of
the main regression test suite, as that's going to be costly and hard to
scale with.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Michael Paquier-2
On Sun, Dec 09, 2018 at 08:15:07AM +0900, Michael Paquier wrote:
> On Sat, Dec 08, 2018 at 08:46:08AM -0500, Stephen Frost wrote:
>> We should really have a more clearly defined policy around this, but my
>> recollection is that we often prefer to return NULL rather than throwing
>> an error for the convenience of people doing things like querying
>> pg_class using similar functions.
>
> Yes, that's visibly right.  At least that's what I can see from the
> various pg_get_*def and pg_*_is_visible.  Returning NULL would indeed
> be more consistent.

Thinking more about your argument, scanning fully pg_class is quite
sensible as well because there is no need to apply an extra qual on
relkind, so let's change the function as you suggest, by returning NULL
on invalid relation type.  Any opinions about the attached then which
does the switch?
--
Michael

partition-tree-invalid-v2.patch (2K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Stephen Frost
In reply to this post by Michael Paquier-2
Greetings,

* Michael Paquier ([hidden email]) wrote:
> On Sat, Dec 08, 2018 at 08:46:08AM -0500, Stephen Frost wrote:
> > I wonder if we maybe should have a regression test for every such
> > function which just queries the catalog in a way to force the function
> > to be called for every relation defined in the regression tests, to
> > ensure that it doesn't segfault or throw an error..
>
> Like sqlsmith?  It looks hard to me to make something like that part of
> the main regression test suite, as that's going to be costly and hard to
> scale with.

No, I mean something like:

with x as (select pg_partition_tree(relname) from pg_class)
select 1 from x limit 1;

or whatever it takes to make sure that the function is run against every
entry in pg_class (or whatever is appropriate) while not returning the
results (since we don't actually care about the output, we just want to
make sure it doesn't ERROR or crash).

sqlsmith, as I recall, doesn't care about ERROR cases, it's just looking
for crashes, so it's not quite the same thing.

Thanks!

Stephen

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Stephen Frost
In reply to this post by Michael Paquier-2
Greetings,

* Michael Paquier ([hidden email]) wrote:

> On Sun, Dec 09, 2018 at 08:15:07AM +0900, Michael Paquier wrote:
> > On Sat, Dec 08, 2018 at 08:46:08AM -0500, Stephen Frost wrote:
> >> We should really have a more clearly defined policy around this, but my
> >> recollection is that we often prefer to return NULL rather than throwing
> >> an error for the convenience of people doing things like querying
> >> pg_class using similar functions.
> >
> > Yes, that's visibly right.  At least that's what I can see from the
> > various pg_get_*def and pg_*_is_visible.  Returning NULL would indeed
> > be more consistent.
>
> Thinking more about your argument, scanning fully pg_class is quite
> sensible as well because there is no need to apply an extra qual on
> relkind, so let's change the function as you suggest, by returning NULL
> on invalid relation type.  Any opinions about the attached then which
> does the switch?
Looks alright on a quick glance, but shouldn't you also update the
comment..?

Thanks!

Stephen

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Tom Lane-2
In reply to this post by Stephen Frost
Stephen Frost <[hidden email]> writes:
> * Michael Paquier ([hidden email]) wrote:
>> On Sat, Dec 08, 2018 at 08:46:08AM -0500, Stephen Frost wrote:
>>> I wonder if we maybe should have a regression test for every such
>>> function which just queries the catalog in a way to force the function
>>> to be called for every relation defined in the regression tests, to
>>> ensure that it doesn't segfault or throw an error..

> No, I mean something like:
> with x as (select pg_partition_tree(relname) from pg_class)
> select 1 from x limit 1;
> or whatever it takes to make sure that the function is run against every
> entry in pg_class (or whatever is appropriate) while not returning the
> results (since we don't actually care about the output, we just want to
> make sure it doesn't ERROR or crash).

I'm going to object to that on cost grounds.  It is not reasonable to
run moderately-expensive functions like this on more than a thousand
pg_class entries in order to test what are just a few distinct cases,
especially in code that's highly unlikely to break once written.
Or at least, it's not reasonable to do that every time anybody runs
the regression tests for the rest of our lives.

Moreover, this would only help check a specific new function if the
author or reviewer remembered to add a test case that does it.
Since the whole problem here is "I forgot to consider this", I don't
have much faith in that happening.

Maybe we should have some sort of checklist of things to think about
when adding new SQL-visible functions, rather than trying to
memorialize every such consideration as a regression test no matter
how expensive.  Admittedly, "I forgot to go over the checklist" is
still a problem; but at least it's not penalizing every developer and
every buildfarm run till kingdom come.

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Stephen Frost
Greetings,

* Tom Lane ([hidden email]) wrote:

> Stephen Frost <[hidden email]> writes:
> > * Michael Paquier ([hidden email]) wrote:
> >> On Sat, Dec 08, 2018 at 08:46:08AM -0500, Stephen Frost wrote:
> >>> I wonder if we maybe should have a regression test for every such
> >>> function which just queries the catalog in a way to force the function
> >>> to be called for every relation defined in the regression tests, to
> >>> ensure that it doesn't segfault or throw an error..
>
> > No, I mean something like:
> > with x as (select pg_partition_tree(relname) from pg_class)
> > select 1 from x limit 1;
> > or whatever it takes to make sure that the function is run against every
> > entry in pg_class (or whatever is appropriate) while not returning the
> > results (since we don't actually care about the output, we just want to
> > make sure it doesn't ERROR or crash).
>
> I'm going to object to that on cost grounds.  It is not reasonable to
> run moderately-expensive functions like this on more than a thousand
> pg_class entries in order to test what are just a few distinct cases,
> especially in code that's highly unlikely to break once written.
Yeah, I had been wondering if it would be too expensive also.

I don't entirely buy off on the argument that it's code that's 'highly
unlikely to break once written' though- we do add new relkinds from time
to time, for example.  Perhaps we could have these functions run just
once per relkind.

> Moreover, this would only help check a specific new function if the
> author or reviewer remembered to add a test case that does it.

We could possibly write a test which would run every function that
accepts the typical data types (oid/regclass/name/etc) instead of
depending on the author to remember to add it.

> Since the whole problem here is "I forgot to consider this", I don't
> have much faith in that happening.

Yeah, I agree that we'd want something more automated as, otherwise, it
would definitely be likely to be forgotten.

> Maybe we should have some sort of checklist of things to think about
> when adding new SQL-visible functions, rather than trying to
> memorialize every such consideration as a regression test no matter
> how expensive.  Admittedly, "I forgot to go over the checklist" is
> still a problem; but at least it's not penalizing every developer and
> every buildfarm run till kingdom come.

This seems like something we should do regardless.  Moreover, I'd
suggest that we start documenting some of these policies in the way that
we have a style guide for errors and such- we need a "system function
style guide" that could start with something like "prefer to return NULL
instead of ERROR on unexpected but otherwise valid inputs, and test that
the code doesn't segfault when given a variety of inputs".

Thanks!

Stephen

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Tom Lane-2
Stephen Frost <[hidden email]> writes:
> * Tom Lane ([hidden email]) wrote:
>> ... especially in code that's highly unlikely to break once written.

> I don't entirely buy off on the argument that it's code that's 'highly
> unlikely to break once written' though- we do add new relkinds from time
> to time, for example.  Perhaps we could have these functions run just
> once per relkind.

Well, the relevant code is likely to be "if relkind is not x, y, or z,
then PG_RETURN_NULL".  If we add a new relkind and forget to consider the
function, the outcome is a NULL result that perhaps should not have been
NULL ... but a test like this won't help us notice that.

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Michael Paquier-2
On Sun, Dec 09, 2018 at 02:07:29PM -0500, Tom Lane wrote:

> Stephen Frost <[hidden email]> writes:
>> I don't entirely buy off on the argument that it's code that's 'highly
>> unlikely to break once written' though- we do add new relkinds from time
>> to time, for example.  Perhaps we could have these functions run just
>> once per relkind.
>
> Well, the relevant code is likely to be "if relkind is not x, y, or z,
> then PG_RETURN_NULL".  If we add a new relkind and forget to consider the
> function, the outcome is a NULL result that perhaps should not have been
> NULL ... but a test like this won't help us notice that.
Yes, in order to prevent problems with newly-introduced relkinds, I
think that the checks within functions should be careful to check only
for relkinds that they support, and not list those they do not support.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Michael Paquier-2
In reply to this post by Stephen Frost
On Sun, Dec 09, 2018 at 12:49:18PM -0500, Stephen Frost wrote:
> Looks alright on a quick glance, but shouldn't you also update the
> comment..?

The comment on HEAD or with the patch is that:
    /* Only allow relation types that can appear in partition trees. */

This still looks adapted to me.  Or would you reword it because "allow"
rather implies that an ERROR is returned?  Would you prefer changing it
something like that perhaps:
"Return NULL for relation types that do not appear in partition trees."
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Stephen Frost
In reply to this post by Michael Paquier-2
Greetings,

* Michael Paquier ([hidden email]) wrote:

> On Sun, Dec 09, 2018 at 02:07:29PM -0500, Tom Lane wrote:
> > Stephen Frost <[hidden email]> writes:
> >> I don't entirely buy off on the argument that it's code that's 'highly
> >> unlikely to break once written' though- we do add new relkinds from time
> >> to time, for example.  Perhaps we could have these functions run just
> >> once per relkind.
> >
> > Well, the relevant code is likely to be "if relkind is not x, y, or z,
> > then PG_RETURN_NULL".  If we add a new relkind and forget to consider the
> > function, the outcome is a NULL result that perhaps should not have been
> > NULL ... but a test like this won't help us notice that.
>
> Yes, in order to prevent problems with newly-introduced relkinds, I
> think that the checks within functions should be careful to check only
> for relkinds that they support, and not list those they do not support.
Yes, but it's certainly possible for someone to add another relkind to
that list without looking through the rest of the function carefully
enough.  Perhaps not the best example.  I still don't like the
assumption that the code won't be broken once it's written correctly the
first time though, and I continue to feel that it would be good to have
regression tests which run these functions with interesting arguments.

I also agree that we don't want to make the regression tests take a lot
of additional time.

Thanks!

Stephen

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Stephen Frost
In reply to this post by Michael Paquier-2
Greetings,

* Michael Paquier ([hidden email]) wrote:

> On Sun, Dec 09, 2018 at 12:49:18PM -0500, Stephen Frost wrote:
> > Looks alright on a quick glance, but shouldn't you also update the
> > comment..?
>
> The comment on HEAD or with the patch is that:
>     /* Only allow relation types that can appear in partition trees. */
>
> This still looks adapted to me.  Or would you reword it because "allow"
> rather implies that an ERROR is returned?  Would you prefer changing it
> something like that perhaps:
> "Return NULL for relation types that do not appear in partition trees."
Yes.

Thanks!

Stephen

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Michael Paquier-2
On Mon, Dec 10, 2018 at 08:21:43AM -0500, Stephen Frost wrote:
> * Michael Paquier ([hidden email]) wrote:
>> This still looks adapted to me.  Or would you reword it because "allow"
>> rather implies that an ERROR is returned?  Would you prefer changing it
>> something like that perhaps:
>> "Return NULL for relation types that do not appear in partition trees."
>
> Yes.

OK.  Sure, let's do as you suggest then.  I'll wait a couple of days
before committing the patch so as if someone has extra comments they
have the time to post.  So please feel free to comment!
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_partition_tree crashes for a non-defined relation

Michael Paquier-2
On Mon, Dec 10, 2018 at 10:52:37PM +0900, Michael Paquier wrote:
> OK.  Sure, let's do as you suggest then.  I'll wait a couple of days
> before committing the patch so as if someone has extra comments they
> have the time to post.  So please feel free to comment!

And done this way.  Thanks for the input, Stephen and others!
--
Michael

signature.asc (849 bytes) Download Attachment