BUG #15623: Inconsistent use of default for updatable view

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

BUG #15623: Inconsistent use of default for updatable view

apt.postgresql.org Repository Update
The following bug has been logged on the website:

Bug reference:      15623
Logged by:          Roger Curley
Email address:      [hidden email]
PostgreSQL version: 11.1
Operating system:   Ubuntu 11.1
Description:        

Steps to reproduce (run in psql shell):
```
DROP TABLE IF EXISTS test CASCADE;
CREATE TABLE test (
  id int PRIMARY KEY,
  value int DEFAULT 0
);
CREATE VIEW test_view AS (SELECT * FROM test);

INSERT INTO test_view VALUES (1, DEFAULT), (2, DEFAULT);
INSERT INTO test VALUES (3, DEFAULT), (4, DEFAULT);
INSERT INTO test_view VALUES (5, DEFAULT);
SELECT * FROM test;
```

Result:
```
 id | value
----+-------
  1 |      
  2 |      
  3 |     0
  4 |     0
  5 |     0
```

Expected Result:
```
 id | value
----+-------
  1 |     0
  2 |     0
  3 |     0
  4 |     0
  5 |     0
```
In particular, it's surprising that inserting 1 row into an updatable view
uses the table default, while inserting 2 uses null.

Reply | Threaded
Open this post in threaded view
|

Re: BUG #15623: Inconsistent use of default for updatable view

Amit Langote-2
Hi,

On 2019/02/08 6:42, PG Bug reporting form wrote:

> The following bug has been logged on the website:
>
> Bug reference:      15623
> Logged by:          Roger Curley
> Email address:      [hidden email]
> PostgreSQL version: 11.1
> Operating system:   Ubuntu 11.1
> Description:        
>
> Steps to reproduce (run in psql shell):
> ```
> DROP TABLE IF EXISTS test CASCADE;
> CREATE TABLE test (
>   id int PRIMARY KEY,
>   value int DEFAULT 0
> );
> CREATE VIEW test_view AS (SELECT * FROM test);
>
> INSERT INTO test_view VALUES (1, DEFAULT), (2, DEFAULT);
> INSERT INTO test VALUES (3, DEFAULT), (4, DEFAULT);
> INSERT INTO test_view VALUES (5, DEFAULT);
> SELECT * FROM test;
> ```
>
> Result:
> ```
>  id | value
> ----+-------
>   1 |      
>   2 |      
>   3 |     0
>   4 |     0
>   5 |     0
> ```
>
> Expected Result:
> ```
>  id | value
> ----+-------
>   1 |     0
>   2 |     0
>   3 |     0
>   4 |     0
>   5 |     0
> ```
> In particular, it's surprising that inserting 1 row into an updatable view
> uses the table default, while inserting 2 uses null.
Thanks for the report.  Seems odd indeed.

Looking into this, the reason it works when inserting just one row vs.
more than one row is that those two cases are handled by nearby but
different pieces of code.  The code that handles multiple rows seems buggy
as seen in the above example.  Specifically, I think the bug is in
rewriteValuesRTE() which is a function to replace the default placeholders
in the input rows by the default values as defined for the target
relation.  It is called twice when inserting via the view -- first for the
view relation and then again for the underlying table.  This arrangement
seems to work correctly if the view specifies its own defaults for columns
(assuming that it's okay for the view's defaults to override the
underlying base table's).  If there are no view-specified defaults, then
rewriteValuesRTE replaces the default placeholders in the input row by
NULL constants when called for the first time with the view as target
relation and the next invocation for the underlying table finds that it
has no work to do, so its defaults are not filled.

Attached find a patch that adjusts rewriteValuesRTE to not replace the
default placeholder if the view has no default value for a given column.
Also, adds a test in updatable_views.sql.

Thanks,
Amit

view-insert-null-default-fix.patch (5K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: BUG #15623: Inconsistent use of default for updatable view

Dean Rasheed-3
On Fri, 8 Feb 2019 at 05:07, Amit Langote <[hidden email]> wrote:
> Thanks for the report.  Seems odd indeed.

Hmm, indeed. That seems to have been broken ever since updatable views
were added.

> Looking into this, the reason it works when inserting just one row vs.
> more than one row is that those two cases are handled by nearby but
> different pieces of code.  The code that handles multiple rows seems buggy
> as seen in the above example.  Specifically, I think the bug is in
> rewriteValuesRTE() which is a function to replace the default placeholders
> in the input rows by the default values as defined for the target
> relation.  It is called twice when inserting via the view -- first for the
> view relation and then again for the underlying table.

Right, except when the view is trigger-updatable. In that case, we do
have to explicitly set the column value to NULL when
rewriteValuesRTE() is called for the view, because it won't be called
again for the underlying table -- it is the trigger's responsibility
to work how (or indeed if) to update the underlying table. IOW, you
need to also use view_has_instead_trigger() to check the view,
otherwise your patch breaks this case:

DROP TABLE IF EXISTS test CASCADE;
CREATE TABLE test (
  id int PRIMARY KEY,
  value int DEFAULT 0
);
CREATE VIEW test_view AS (SELECT * FROM test);

CREATE OR REPLACE FUNCTION test_view_ins() RETURNS trigger
AS
$$
BEGIN
  INSERT INTO test VALUES (NEW.id, NEW.value);
  RETURN NEW;
END;
$$
LANGUAGE plpgsql;

CREATE TRIGGER test_view_trig INSTEAD OF INSERT ON test_view
  FOR EACH ROW EXECUTE FUNCTION test_view_ins();

INSERT INTO test_view VALUES (1, DEFAULT), (2, DEFAULT);

ERROR:  unrecognized node type: 142


While playing around with this, I noticed a related bug affecting the
new identity columns feature. I've not investigated it fully, but It
looks almost the same -- if the column is an identity column, and
we're inserting a multi-row VALUES set containing DEFAULTS, they will
get rewritten to NULLs which will then lead to an error if overriding
the generated value isn't allowed:

DROP TABLE IF EXISTS foo CASCADE;
CREATE TABLE foo
(
  a int,
  b int GENERATED ALWAYS AS IDENTITY
);

INSERT INTO foo VALUES (1,DEFAULT); -- OK
INSERT INTO foo VALUES (2,DEFAULT),(3,DEFAULT); -- Fails

I think fixing that should be tackled separately, because it may turn
out to be subtly different, but it definitely looks like another bug.

Regards,
Dean

Reply | Threaded
Open this post in threaded view
|

Re: BUG #15623: Inconsistent use of default for updatable view

Amit Langote
Thanks for looking at this.

On Fri, Feb 8, 2019 at 8:01 PM Dean Rasheed <[hidden email]> wrote:

>
> On Fri, 8 Feb 2019 at 05:07, Amit Langote <[hidden email]> wrote:
> > Thanks for the report.  Seems odd indeed.
>
> Hmm, indeed. That seems to have been broken ever since updatable views
> were added.
>
> > Looking into this, the reason it works when inserting just one row vs.
> > more than one row is that those two cases are handled by nearby but
> > different pieces of code.  The code that handles multiple rows seems buggy
> > as seen in the above example.  Specifically, I think the bug is in
> > rewriteValuesRTE() which is a function to replace the default placeholders
> > in the input rows by the default values as defined for the target
> > relation.  It is called twice when inserting via the view -- first for the
> > view relation and then again for the underlying table.
>
> Right, except when the view is trigger-updatable. In that case, we do
> have to explicitly set the column value to NULL when
> rewriteValuesRTE() is called for the view, because it won't be called
> again for the underlying table -- it is the trigger's responsibility
> to work how (or indeed if) to update the underlying table. IOW, you
> need to also use view_has_instead_trigger() to check the view,
> otherwise your patch breaks this case:
>
> DROP TABLE IF EXISTS test CASCADE;
> CREATE TABLE test (
>   id int PRIMARY KEY,
>   value int DEFAULT 0
> );
> CREATE VIEW test_view AS (SELECT * FROM test);
>
> CREATE OR REPLACE FUNCTION test_view_ins() RETURNS trigger
> AS
> $$
> BEGIN
>   INSERT INTO test VALUES (NEW.id, NEW.value);
>   RETURN NEW;
> END;
> $$
> LANGUAGE plpgsql;
>
> CREATE TRIGGER test_view_trig INSTEAD OF INSERT ON test_view
>   FOR EACH ROW EXECUTE FUNCTION test_view_ins();
>
> INSERT INTO test_view VALUES (1, DEFAULT), (2, DEFAULT);
>
> ERROR:  unrecognized node type: 142
Oops, I missed this bit.  Updated the patch per your suggestion and
expanded the test case to exercise this.

> While playing around with this, I noticed a related bug affecting the
> new identity columns feature. I've not investigated it fully, but It
> looks almost the same -- if the column is an identity column, and
> we're inserting a multi-row VALUES set containing DEFAULTS, they will
> get rewritten to NULLs which will then lead to an error if overriding
> the generated value isn't allowed:
>
> DROP TABLE IF EXISTS foo CASCADE;
> CREATE TABLE foo
> (
>   a int,
>   b int GENERATED ALWAYS AS IDENTITY
> );
>
> INSERT INTO foo VALUES (1,DEFAULT); -- OK
> INSERT INTO foo VALUES (2,DEFAULT),(3,DEFAULT); -- Fails
>
> I think fixing that should be tackled separately, because it may turn
> out to be subtly different, but it definitely looks like another bug.
I haven't looked into the details of this, but agree about raising a
thread on -hackers about it.

Thanks,
Amit

view-insert-null-default-fix-v2.patch (7K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: BUG #15623: Inconsistent use of default for updatable view

Dean Rasheed-3
On Sat, 9 Feb 2019 at 16:57, Amit Langote <[hidden email]> wrote:

> On Fri, Feb 8, 2019 at 8:01 PM Dean Rasheed <[hidden email]> wrote:
> > On Fri, 8 Feb 2019 at 05:07, Amit Langote <[hidden email]> wrote:
> > > Looking into this, the reason it works when inserting just one row vs.
> > > more than one row is that those two cases are handled by nearby but
> > > different pieces of code.  The code that handles multiple rows seems buggy
> > > as seen in the above example.  Specifically, I think the bug is in
> > > rewriteValuesRTE() which is a function to replace the default placeholders
> > > in the input rows by the default values as defined for the target
> > > relation.  It is called twice when inserting via the view -- first for the
> > > view relation and then again for the underlying table.
> >
> > Right, except when the view is trigger-updatable...
>
> Oops, I missed this bit.  Updated the patch per your suggestion and
> expanded the test case to exercise this.
>
Unfortunately, that's still not quite right because it makes the
behaviour of single- and multi-row inserts inconsistent for
rule-updatable views. Attached is an updated patch that fixes that. I
adjusted the tests a bit to try to make it clearer which defaults get
applied, and test all possibilities.

However, this is still not the end of the story, because it doesn't
fix the fact that, if the view has a DO ALSO rule on it, single-row
inserts behave differently from multi-row inserts. In that case, each
insert becomes 2 inserts, and defaults need to be treated differently
in each of the 2 queries. That's going to need a little more thought.

Regards,
Dean

view-insert-null-default-fix-v3.patch (18K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: BUG #15623: Inconsistent use of default for updatable view

Dean Rasheed-3
On Sun, 10 Feb 2019 at 00:48, Dean Rasheed <[hidden email]> wrote:
> However, this is still not the end of the story, because it doesn't
> fix the fact that, if the view has a DO ALSO rule on it, single-row
> inserts behave differently from multi-row inserts. In that case, each
> insert becomes 2 inserts, and defaults need to be treated differently
> in each of the 2 queries. That's going to need a little more thought.
>

Here's an updated patch to handle that case.

In case it's not obvious, I'm not intending to try to get this into
next week's updates -- more time is needed to be sure of this fix, and
more pairs of eyes would definitely be helpful, once those updates
have been shipped.

Regards,
Dean

view-insert-null-default-fix-v4.patch (30K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: BUG #15623: Inconsistent use of default for updatable view

Dean Rasheed-3
On Sun, 10 Feb 2019 at 11:18, Dean Rasheed <[hidden email]> wrote:

>
> On Sun, 10 Feb 2019 at 00:48, Dean Rasheed <[hidden email]> wrote:
> > However, this is still not the end of the story, because it doesn't
> > fix the fact that, if the view has a DO ALSO rule on it, single-row
> > inserts behave differently from multi-row inserts. In that case, each
> > insert becomes 2 inserts, and defaults need to be treated differently
> > in each of the 2 queries. That's going to need a little more thought.
> >
>
> Here's an updated patch to handle that case.
>
> In case it's not obvious, I'm not intending to try to get this into
> next week's updates -- more time is needed to be sure of this fix.
So I did some more testing of this and I'm reasonably happy that this
now fixes the originally reported issue of inconsistent handling of
DEFAULTS in multi-row VALUES lists vs single-row ones. I tested
various other scenarios involving conditional/unconditional
also/instead rules, and I didn't find any other surprises. Attached is
an updated patch with improved comments, and a little less code
duplication.

Regards,
Dean

view-insert-null-default-fix-v5.patch (29K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: BUG #15623: Inconsistent use of default for updatable view

Amit Langote-2
Hi Dean,

On 2019/02/12 19:33, Dean Rasheed wrote:

> On Sun, 10 Feb 2019 at 11:18, Dean Rasheed <[hidden email]> wrote:
>> On Sun, 10 Feb 2019 at 00:48, Dean Rasheed <[hidden email]> wrote:
>>> However, this is still not the end of the story, because it doesn't
>>> fix the fact that, if the view has a DO ALSO rule on it, single-row
>>> inserts behave differently from multi-row inserts. In that case, each
>>> insert becomes 2 inserts, and defaults need to be treated differently
>>> in each of the 2 queries. That's going to need a little more thought.
>>>
>>
>> Here's an updated patch to handle that case.
>>
>> In case it's not obvious, I'm not intending to try to get this into
>> next week's updates -- more time is needed to be sure of this fix.
>
> So I did some more testing of this and I'm reasonably happy that this
> now fixes the originally reported issue of inconsistent handling of
> DEFAULTS in multi-row VALUES lists vs single-row ones. I tested
> various other scenarios involving conditional/unconditional
> also/instead rules, and I didn't find any other surprises. Attached is
> an updated patch with improved comments, and a little less code
> duplication.

Thanks for updating the patch.

I can't really comment on all of the changes that that you made
considering various cases, but became curious if the single-row and
multi-row inserts cases could share the code that determines if the
DEFAULT item be replaced by the target-relation-specified default or NULL?
 IOW, is there some reason why we can't avoid the special handling for the
multi-row (RTE_VALUES) case?

Thanks,
Amit


Reply | Threaded
Open this post in threaded view
|

Re: BUG #15623: Inconsistent use of default for updatable view

Dean Rasheed-3
On Wed, 13 Feb 2019 at 03:02, Amit Langote
<[hidden email]> wrote:
>
> I can't really comment on all of the changes that that you made
> considering various cases, but became curious if the single-row and
> multi-row inserts cases could share the code that determines if the
> DEFAULT item be replaced by the target-relation-specified default or NULL?
>  IOW, is there some reason why we can't avoid the special handling for the
> multi-row (RTE_VALUES) case?
>

No, not as far as I can see.

The tricky part for a multi-row insert is working out what to do when
it sees a DEFAULT, and there is no column default on the target
relation. For an auto-updatable view, it needs to leave the DEFAULT
untouched, so that it can later apply the base relation's column
default when it recurses. For all other kinds of relation, it needs to
turn the DEFAULT into a NULL.

For a single-row insert, that's all much easier. If it sees a DEFAULT,
and there is no column default, it simply omits that entry from the
targetlist. If it then recurses to the base relation, it will put the
targetlist entry back in, if the base relation has a column default.
So it doesn't need to know locally whether it's an auto-updatable
view, and the logic is much simpler. The multi-row case can't easily
do that (add and remove columns) because it's working with a
fixed-width table structure.

Actually, that's not quite the end of it. So far, this has only been
considering INSERT's. I think there are more issues with UPDATE's, but
that's a whole other can of worms. I think I'll commit this first, and
start a thread on -hackers to discuss that.

Regards,
Dean

Reply | Threaded
Open this post in threaded view
|

Re: BUG #15623: Inconsistent use of default for updatable view

Amit Langote-2
On 2019/02/13 18:04, Dean Rasheed wrote:

> On Wed, 13 Feb 2019 at 03:02, Amit Langote
> <[hidden email]> wrote:
>>
>> I can't really comment on all of the changes that that you made
>> considering various cases, but became curious if the single-row and
>> multi-row inserts cases could share the code that determines if the
>> DEFAULT item be replaced by the target-relation-specified default or NULL?
>>  IOW, is there some reason why we can't avoid the special handling for the
>> multi-row (RTE_VALUES) case?
>>
>
> No, not as far as I can see.
>
> The tricky part for a multi-row insert is working out what to do when
> it sees a DEFAULT, and there is no column default on the target
> relation. For an auto-updatable view, it needs to leave the DEFAULT
> untouched, so that it can later apply the base relation's column
> default when it recurses. For all other kinds of relation, it needs to
> turn the DEFAULT into a NULL.
>
> For a single-row insert, that's all much easier. If it sees a DEFAULT,
> and there is no column default, it simply omits that entry from the
> targetlist. If it then recurses to the base relation, it will put the
> targetlist entry back in, if the base relation has a column default.
> So it doesn't need to know locally whether it's an auto-updatable
> view, and the logic is much simpler. The multi-row case can't easily
> do that (add and remove columns) because it's working with a
> fixed-width table structure.

Hmm yeah, column sets must be the same in in all value-lists.

> Actually, that's not quite the end of it. So far, this has only been
> considering INSERT's. I think there are more issues with UPDATE's, but
> that's a whole other can of worms. I think I'll commit this first, and
> start a thread on -hackers to discuss that.

Sure, thanks for the explanation.

Regards,
Amit