[PATCH] Generic type subscripting

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
139 messages Options
1234567
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Robert Haas
On Thu, Jan 11, 2018 at 1:15 PM, Tom Lane <[hidden email]> wrote:

> I think you missed the point.  The question is whether the existence of a
> subscripting function means that we need to treat the subscriptable type
> as physically containing the subscript result type.  For example, if the
> subscript result type is composite, do we need to do something about a
> column of the subscriptable type when somebody does an ALTER TYPE
> ... ALTER ATTRIBUTE TYPE on the result type?  The dependency mechanism
> doesn't have enough information to answer that.  It's fairly easy to
> imagine cases where it wouldn't be true --- for instance, if you had
> a subscripting conversion from JSONB to my_composite_type, changing
> my_composite_type would likely change the set of JSONB values for which
> the subscripting function would succeed, but it wouldn't create a need
> to physically rewrite any JSONB columns.

I don't think I missed the point at all -- this is the exact same set
of issues that arise with respect to functions.  Indeed, I gave an
example of a function that needs to be updated if a column of the
input type is altered.  In the case of functions, we've decided that
it's not our problem.  If the user updates the composite type and
fails to update the function definitions as needed, things might
break, so they should do that.  If they don't, it's not our bug.

> After further thought, I think I'm prepared to say (for the moment) that
> only true arrays need be deemed to be containers in this sense.  If you
> make a subscripting function for anything else, we'll treat it as just a
> function that happens to yield the result type but doesn't imply that that
> is what is physically stored.  Perhaps at some point that will need to
> change, but I'm failing to think of near-term use cases where it would be
> important to have such a property.

In other words, we're vigorously agreeing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Tom Lane-2
Robert Haas <[hidden email]> writes:
> On Thu, Jan 11, 2018 at 1:15 PM, Tom Lane <[hidden email]> wrote:
>> I think you missed the point.  The question is whether the existence of a
>> subscripting function means that we need to treat the subscriptable type
>> as physically containing the subscript result type.

> I don't think I missed the point at all -- this is the exact same set
> of issues that arise with respect to functions.  Indeed, I gave an
> example of a function that needs to be updated if a column of the
> input type is altered.  In the case of functions, we've decided that
> it's not our problem.

Right, but in the case of stored arrays, we've decided that it *is*
our problem (as indeed it must be, because the user has no tools with
which they could fix a representation change for stored data).  The
question is to what extent that need would propagate to pseudo array
types.

> In other words, we're vigorously agreeing.

I think we're agreed on what should be in the v1 version of the patch.
I'm not 100% convinced that the problem won't come up eventually.

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Robert Haas
On Thu, Jan 11, 2018 at 1:37 PM, Tom Lane <[hidden email]> wrote:

>> I don't think I missed the point at all -- this is the exact same set
>> of issues that arise with respect to functions.  Indeed, I gave an
>> example of a function that needs to be updated if a column of the
>> input type is altered.  In the case of functions, we've decided that
>> it's not our problem.
>
> Right, but in the case of stored arrays, we've decided that it *is*
> our problem (as indeed it must be, because the user has no tools with
> which they could fix a representation change for stored data).  The
> question is to what extent that need would propagate to pseudo array
> types.

I think I view the rationale a bit differently.  Let's say that a user
defines a composite type as (a int, b text) and uses that composite
type as a column type.  Then, somebody tries to change column a to
have type text, and suppose we don't throw an error but simply permit
the operation.  If the user now tries to select from the offending
column, the server will very likely crash.  In contrast, in the case
where the user has defined an SQL function that selects $1.a and
returns it as an int, they will get a runtime error when they try to
use the function.  In my mind, that is the critical difference.  An
operation that can cause the server to crash or emit internal errors
must be prohibited, whereas an operation that might cause stuff to
fail with suitable error messages in the future can be allowed.

In other words, the problem isn't that the user has no tools to fix
the problem; it's that, with certain exceptions like superusers
indulging in random catalog-hackery, unprivileged users shouldn't be
allowed to break the world.  You might point out that the chances of
break-the-world behavior for type subscripting are pretty high, since
we're slinging around arguments of type internal.  But C functions are
always an exception to the notion that we'll trap and report errors.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Tom Lane-2
Robert Haas <[hidden email]> writes:
> On Thu, Jan 11, 2018 at 1:37 PM, Tom Lane <[hidden email]> wrote:
>> Right, but in the case of stored arrays, we've decided that it *is*
>> our problem (as indeed it must be, because the user has no tools with
>> which they could fix a representation change for stored data).  The
>> question is to what extent that need would propagate to pseudo array
>> types.

> I think I view the rationale a bit differently.  Let's say that a user
> defines a composite type as (a int, b text) and uses that composite
> type as a column type.  Then, somebody tries to change column a to
> have type text, and suppose we don't throw an error but simply permit
> the operation.  If the user now tries to select from the offending
> column, the server will very likely crash.  In contrast, in the case
> where the user has defined an SQL function that selects $1.a and
> returns it as an int, they will get a runtime error when they try to
> use the function.  In my mind, that is the critical difference.

There are two critical differences --- that's one, and the other is
that there are SQL-level ways to fix the problem, ie change the function
text with CREATE OR REPLACE FUNCTION.  We don't have a SQL command that
says "now go update the representation of table T column C".

But I think we've probably beaten this topic to death ...

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Robert Haas
On Thu, Jan 11, 2018 at 2:20 PM, Tom Lane <[hidden email]> wrote:
> But I think we've probably beaten this topic to death ...

Yep.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Dmitry Dolgov
Sorry for late reply. I've attached a new version of the patch with following
changes:

> On 7 January 2018 at 23:39, Tom Lane <[hidden email]> wrote:
> Part of the reason why I'm insistent on that is that I think it will
> expose that the division of labor between the core parser and the
> datatype-specific parse function is still a mess.

I moved all logic that busy with determining required input data type into the
datatype specific code (except everything domain related). Unfortunately, we
need some of those types before the custom code can perform validation - to
address this `parse` function now splitted into two phases, initial preparation
(when the custom code can provide all required information), and actual
validation (when everything, e.g. rhs is processed). In general I hope this
approach makes separation of concerns more clear.

> On the executor side of things, I suspect Andres will be unhappy that
> you are making ExprEvalStep part of the API for datatypes --- he
> objected to my exposing it to plpgsql param eval in
> https://postgr.es/m/20171220174243.n4y3hgzf7xd3mm5e@...
> and there was a lot more reason to do so there than there is here, IMO.

I've changes fetch/assign functions so that ExprEvalStep isn't exposed anymore.

> Now on the other hand, maybe the right way to go is to embrace a similar
> approach to what I did for plpgsql param eval, and let the datatype
> control what gets generated as the expression execution step.  The main
> point here would be to let the datatype provide the address of a callback
> function that gets executed for a subscripting step, rather than having it
> specify the OID of a pg_proc entry to call.  There would be two big wins
> from that:
>
> * The callback function would have a plain C call signature, so we would
> not have to go through FunctionCallN, saving a few cycles.  This is
> attractive because it would pretty much eliminate any concern about this
> patch making array access slower at execution time.
>
> * There would no longer be a wired-in restriction that there be two and
> only two subscripting execution functions per datatype, since there would
> not be any need for those functions to be identified in pg_type.
>
> The two disadvantages I can see of approaching things this way are:
>
> * There'd be at least some connection of subscriptable types to
> expression compilation, which is what Andres was objecting to in the
> message I cited above.  Possibly we could alleviate that a bit by
> providing helper functions that mask exactly what goes into the
> expression step structs, but I'm not sure that that gets us far.
>
> * We'd not have OIDs of execution functions in the parse trees for
> subscripting operations, which would mean that we would not have
> a convenient way to identify subscripting operations that are
> mutable, parallel-unsafe, or leaky.  Probably it'd be fine to assume
> that subscripting is always immutable and parallel-safe, although
> I'm slightly more concerned about whether people would want the
> option to label it leaky vs leakproof.  As against that, the approach
> that's there right now adds planning overhead that wasn't there before
> for exactly those function property lookups, and again I'm a bit worried
> about the performance impact.  (I did some crude performance tests
> today that indicated that the existing patch has small but measurable
> penalties, maybe on the order of 10 percent; and it'd be more by the
> time we're done because I'm pretty sure you've missed some places that
> ought to check these function properties if we're going to have them.
> So I'm afraid that we'll get pushback from people who don't care about
> extensible subscripts and do care about array performance.)
I tired to apply this approach with callback functions for fetch/assign logic
(these functions I mentioned above for prepare/validate are also implemented
like that). This part is actually can be done more or less independently from
changes above. Is it close to what you suggesting?

As a side note, I'm not sure why this main function should have a signature
with opcode:

    subscript_support(int opcode, internal other_info) returns internal

since instead of calling it with different opcode we can just return a set of
callback and execute what's necessary at this particular point.

I haven't evaluated performance of this implementation yet, will do that soon.
But in the meantime I want to align on what can be accepted as the best solution
here.

> Yeah, I'm beginning to wonder if we should do the renaming at all.
> It's useful for being sure we've found everyplace that needs to change
> ... but if lots of those places don't actually need more than the
> name changes, maybe it's just make-work and code thrashing.

I'm strongly in favor of renaming, just because I don't feel comfortable
when a name of a concept is not exactly deliver the meaning. In the current
version I kept the name "container" so far due lack of feasible alternatives.

> While I'm on the topic, I am not really happy with s/array/container/
> as you've done in some of this code.  To my mind, "container type"
> includes composite types.  Particularly in the parse_target code, where
> we're currently dealing with either composites or arrays, making it say
> that we're dealing with either composites or containers is just a recipe
> for confusion.  Unfortunately I can't think of a better word offhand,
> but some attention to this is needed.  As far as the comments go,
> we might be able to use the term "subscriptable type", but not sure if
> that will work for function/variable names.
>
> After further thought, I think I'm prepared to say (for the moment) that
> only true arrays need be deemed to be containers in this sense.  If you
> make a subscripting function for anything else, we'll treat it as just a
> function that happens to yield the result type but doesn't imply that that
> is what is physically stored.  Perhaps at some point that will need to
> change, but I'm failing to think of near-term use cases where it would be
> important to have such a property.
>
> This is, however, a good reason why I don't like the use of "container"
> terminology in the patch.  I think we want to reserve "container" for
> types where physical containment is assumed.
I see the point. But to my understanding all other datatypes are "containers"
too, since a subscripting function most likely will return some data from it,
maybe in some transformed form. Arrays are just more strict containers, so
maybe reserve "typed/strict_container" for them?

One more note. In the current version of the patch I haven't updated a tutorial
part, since as I said I want to discuss and have an agreement on details stated
above. Also, this idea about separating renaming stuff from everything else is
actually paid off - I found out few more places where I forgot to revert some
remnants of the implementation with two separated node types. I'm going to fix
them within few days in the next version.

0001-Renaming-for-new-subscripting-mechanism-v5.patch (64K) Download Attachment
0002-Base-implementation-of-subscripting-mechanism-v5.patch (173K) Download Attachment
0003-Subscripting-for-array-v5.patch (17K) Download Attachment
0004-Subscripting-for-jsonb-v5.patch (45K) Download Attachment
0005-Subscripting-documentation-v5.patch (25K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Dmitry Dolgov
> On 22 January 2018 at 23:38, Dmitry Dolgov <[hidden email]> wrote:
> Sorry for late reply. I've attached a new version of the patch with following
> changes:
>
> I found out few more places where I forgot to revert some remnants of the
> implementation with two separated node types. I'm going to fix them within
> few days in the next version.

Here is a new version of the patch:

* rebased to the latest master

* fixed issues I mentioned above

* updated an example from the tutorial part

0001-Renaming-for-new-subscripting-mechanism-v6.patch (63K) Download Attachment
0002-Base-implementation-of-subscripting-mechanism-v6.patch (173K) Download Attachment
0003-Subscripting-for-array-v6.patch (17K) Download Attachment
0004-Subscripting-for-jsonb-v6.patch (45K) Download Attachment
0005-Subscripting-documentation-v6.patch (25K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

a.zakirov
Hi,

On Sun, Jan 28, 2018 at 06:26:56PM +0100, Dmitry Dolgov wrote:
>
> Here is a new version of the patch:
>
> * rebased to the latest master
>
> * fixed issues I mentioned above
>
> * updated an example from the tutorial part

I have a few comments.

0002-Base-implementation-of-subscripting-mechanism-v6.patch:

> - if (op->d.arrayref_subscript.isupper)
> - indexes = arefstate->upperindex;
> + if (op->d.sbsref_subscript.isupper)
> + indexes = sbsrefstate->upper;

I think upperindex is better here. There was no need to rename it. Same
for lowerindex/lower.

There are a couple changes which unrelated to the patch. For example:

> - * subscripting.  Adjacent A_Indices nodes have to be treated as a single
> + * subscripting. Adjacent A_Indices nodes have to be treated as a single

It is better to avoid it for the sake of decrease size of the patch.

> - * typmod to be applied to the base type.  Subscripting a domain is an
> + * typmod to be applied to the base type. Subscripting a domain is an

Same here.

> +/* Non-inline data for container operations */
> +typedef struct SubscriptingRefState
> +{
> + bool isassignment; /* is it assignment, or just fetch? */
> ...
> +} SubscriptingRefState;

It is not good to move up SubscriptingRefState, because it is hard to
see changes between SubscriptingRefState and ArrayRefState.

> + FmgrInfo   *eval_finfo; /* function to evaluate subscript */
> + FmgrInfo   *nested_finfo; /* function to handle nested assignment */

I think eval_finfo and nested_finfo are not needed anymore.

> +typedef Datum (*SubscriptingFetch) (Datum source, struct SubscriptingRefState *sbsefstate);
> +
> +typedef Datum (*SubscriptingAssign) (Datum source, struct SubscriptingRefState *sbsefstate);

Typo here? Did you mean sbsrefstate in the second argument?

> +typedef struct SbsRoutines
> +{
> + SubscriptingPrepare prepare;
> + SubscriptingValidate validate;
> + SubscriptingFetch fetch;
> + SubscriptingAssign assign;
> +
> +} SbsRoutines;

SbsRoutines is not good name for me. SubscriptRoutines or
SubscriptingRoutines sound better and it is consistent with other
structures.

0005-Subscripting-documentation-v6.patch:

> +   <replaceable class="parameter">type_modifier_output_function</replaceable>,
> +   <replaceable class="parameter">analyze_function</replaceable>,
> +   <replaceable class="parameter">subscripting_handler_function</replaceable>,
>    are optional.  Generally these functions have to be coded in C

Extra comma here.

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Dmitry Dolgov
> On 29 January 2018 at 14:41, Arthur Zakirov <[hidden email]> wrote:
>
> I have a few comments.

Thanks for suggestions, I've incorporated all of them in a new version of the
patch.

> SbsRoutines is not good name for me. SubscriptRoutines or
> SubscriptingRoutines sound better and it is consistent with other
> structures.

In general I agree, and I thought about that while implementing this structure.
But my concern is that some parts of subscripting infrastructure are annoyingly
verbose, maybe it makes sense to come up with a short abbreviation and use it
everywhere across this code.

0005-Subscripting-documentation-v7.patch (25K) Download Attachment
0004-Subscripting-for-jsonb-v7.patch (45K) Download Attachment
0003-Subscripting-for-array-v7.patch (17K) Download Attachment
0002-Base-implementation-of-subscripting-mechanism-v7.patch (173K) Download Attachment
0001-Renaming-for-new-subscripting-mechanism-v7.patch (61K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Dmitry Dolgov
> On 30 January 2018 at 16:47, Dmitry Dolgov <[hidden email]> wrote:
> > On 29 January 2018 at 14:41, Arthur Zakirov <[hidden email]> wrote:
> >
> > I have a few comments.
>
> Thanks for suggestions, I've incorporated all of them in a new version of the
> patch.

Few more updates. I've attached a new version with some minor changes, mostly
about moving a subscripting depth check to type related logic. Also I've made
some performance tests for arrays using pgbench:

    pgbench -c 10 -j 2 -T 600 -f test.sql -r

with queries like:

    select (ARRAY[1, 2, 3])[0];
    select (ARRAY[1, 2, 3, ...,  98, 99, 100])[0];
    select (ARRAY[[[[[[1]]]]]])[1][1][1][1][1][1];
    select (ARRAY[[[[[[1, 2, 3]]]]]])[1][1][1][1][1:2];

and the difference in average latency was about 2%:

* with the patch

    number of transactions actually processed: 349211
    latency average = 1.718 ms
    tps = 5820.048783 (including connections establishing)
    tps = 5820.264433 (excluding connections establishing)

* without the patch

    number of transactions actually processed: 356024
    latency average = 1.685 ms
    tps = 5933.538195 (including connections establishing)
    tps = 5934.124599 (excluding connections establishing)


0005-Subscripting-documentation-v8.patch (25K) Download Attachment
0004-Subscripting-for-jsonb-v8.patch (45K) Download Attachment
0003-Subscripting-for-array-v8.patch (18K) Download Attachment
0002-Base-implementation-of-subscripting-mechanism-v8.patch (173K) Download Attachment
0001-Renaming-for-new-subscripting-mechanism-v8.patch (60K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Dmitry Dolgov
> On 22 February 2018 at 18:30, Dmitry Dolgov <[hidden email]> wrote:

>
> Few more updates. I've attached a new version with some minor changes,
> mostly
> about moving a subscripting depth check to type related logic. Also I've
> made
> some performance tests for arrays using pgbench:
>
>     pgbench -c 10 -j 2 -T 600 -f test.sql -r
>
> with queries like:
>
>     select (ARRAY[1, 2, 3])[0];
>     select (ARRAY[1, 2, 3, ...,  98, 99, 100])[0];
>     select (ARRAY[[[[[[1]]]]]])[1][1][1][1][1][1];
>     select (ARRAY[[[[[[1, 2, 3]]]]]])[1][1][1][1][1:2];
>
> and the difference in average latency was about 2%:
>
> * with the patch
>
>     number of transactions actually processed: 349211
>     latency average = 1.718 ms
>     tps = 5820.048783 (including connections establishing)
>     tps = 5820.264433 (excluding connections establishing)
>
> * without the patch
>
>     number of transactions actually processed: 356024
>     latency average = 1.685 ms
>     tps = 5933.538195 (including connections establishing)
>     tps = 5934.124599 (excluding connections establishing)
>
One more small update after fd1a421fe6 in attachments.

0001-Renaming-for-new-subscripting-mechanism-v9.patch (60K) Download Attachment
0002-Base-implementation-of-subscripting-mechanism-v9.patch (173K) Download Attachment
0003-Subscripting-for-array-v9.patch (18K) Download Attachment
0004-Subscripting-for-jsonb-v9.patch (45K) Download Attachment
0005-Subscripting-documentation-v9.patch (25K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Shulgin, Oleksandr
On Tue, Mar 6, 2018 at 6:21 PM, Dmitry Dolgov <[hidden email]> wrote:

One more small update after fd1a421fe6 in attachments.

Before looking at the code I have a few comments about documentation:

in json.sgml:

+-- Extract value by key
+SELECT ('{"a": 1}'::jsonb)['a'];

What is the result of running this query?  What is the resulting data type?

+-- Extract nested value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract element by index
+SELECT ('[1, "2", null]'::jsonb)['1'];

What is the result here?  Why subscript is a string and not a number?  Are subscription indexes 0- or 1-based?

+-- Update value by key
+UPDATE table_name set jsonb_field['key'] = 1;
+
+-- Select records using where clause with subscripting
+SELECT * from table_name where jsonb_field['key'] = '"value"';

Please capitalize: SET, FROM, WHERE.

Use of double quotes around "value" requires some explanation, I think.
Should the user expect that a suitable index is used by the query planner for this query?

In other words, I would like to see this part of documentation to be extended beyond just showcasing the syntax.

Regards,
--
Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176 127-59-707

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Dmitry Dolgov
> On 20 March 2018 at 11:09, Oleksandr Shulgin <[hidden email]> wrote:

>> On Tue, Mar 6, 2018 at 6:21 PM, Dmitry Dolgov <[hidden email]> wrote:
>>
>>
>> One more small update after fd1a421fe6 in attachments.
>
>
> Before looking at the code I have a few comments about documentation:
>
> ...
>
> In other words, I would like to see this part of documentation to be
> extended beyond just showcasing the syntax.
Good point, thanks for noticing. The thing is that the implementation of
subscripting for jsonb data type in this patch relies on the `setPath` function
and follows the same rules as e.g. `jsonb_set`, but I need to mention this
explicitly in the documentation. Speaking about your questions:

> +-- Extract value by key
> +SELECT ('{"a": 1}'::jsonb)['a'];
>
> What is the result of running this query?  What is the resulting data type?
>

Jsonb subscripting expression always returns another jsonb

> +-- Extract element by index
> +SELECT ('[1, "2", null]'::jsonb)['1'];
>
> What is the result here?  Why subscript is a string and not a number?  Are
> subscription indexes 0- or 1-based?
>

For jsonb arrays an index is 0 based. It's also not necessary to have an index
as a string in this situation (so `data['1']` and `data[1]` are actually equal)

> +-- Select records using where clause with subscripting
> +SELECT * from table_name where jsonb_field['key'] = '"value"';
>
> Use of double quotes around "value" requires some explanation, I think.

In case of comparison, since a subscripting expression returns something of
jsonb data type, we're going to compare two objects of type jsonb. Which means
we need to convert 'value' to a jsonb scalar, and for that purpose it should be
in double quotes.

> Should the user expect that a suitable index is used by the query planner
> for this query?

There is no specific indexing support for subscripting expressions, so if you
need you can create a functional index using it.

Here is the updated version of patch, rebased after recent conflicts and with
suggested documentation improvements.

0001-Renaming-for-new-subscripting-mechanism-v10.patch (61K) Download Attachment
0002-Base-implementation-of-subscripting-mechanism-v10.patch (174K) Download Attachment
0003-Subscripting-for-array-v10.patch (18K) Download Attachment
0004-Subscripting-for-jsonb-v10.patch (45K) Download Attachment
0005-Subscripting-documentation-v10.patch (26K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Dmitry Dolgov
> On 22 March 2018 at 23:25, Dmitry Dolgov <[hidden email]> wrote:
>
> Here is the updated version of patch, rebased after recent conflicts and with
> suggested documentation improvements.

Another rebased version of the patch.

0005-Subscripting-documentation-v11.patch (26K) Download Attachment
0004-Subscripting-for-jsonb-v11.patch (44K) Download Attachment
0003-Subscripting-for-array-v11.patch (68K) Download Attachment
0002-Base-implementation-of-subscripting-mechanism-v11.patch (79K) Download Attachment
0001-Renaming-for-new-subscripting-mechanism-v11.patch (64K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Dmitry Dolgov
> On Thu, 26 Apr 2018 at 16:44, Dmitry Dolgov <[hidden email]> wrote:
>
> > On 22 March 2018 at 23:25, Dmitry Dolgov <[hidden email]> wrote:
> >
> > Here is the updated version of patch, rebased after recent conflicts and with
> > suggested documentation improvements.
>
> Another rebased version of the patch.

I've noticed, that I never updated llvmjit code for the arrayref expressions,
and it's important to do so, since the patch introduces another layer of
flexibility. Hence here is the new version.

0003-Subscripting-for-array-v12.patch (17K) Download Attachment
0005-Subscripting-documentation-v12.patch (26K) Download Attachment
0004-Subscripting-for-jsonb-v12.patch (44K) Download Attachment
0002-Base-implementation-of-subscripting-mechanism-v12.patch (128K) Download Attachment
0001-Renaming-for-new-subscripting-mechanism-v12.patch (74K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Dmitry Dolgov
> On Fri, 20 Jul 2018 at 23:32, Dmitry Dolgov <[hidden email]> wrote:

>
> > On Thu, 26 Apr 2018 at 16:44, Dmitry Dolgov <[hidden email]> wrote:
> >
> > > On 22 March 2018 at 23:25, Dmitry Dolgov <[hidden email]> wrote:
> > >
> > > Here is the updated version of patch, rebased after recent conflicts and with
> > > suggested documentation improvements.
> >
> > Another rebased version of the patch.
>
> I've noticed, that I never updated llvmjit code for the arrayref expressions,
> and it's important to do so, since the patch introduces another layer of
> flexibility. Hence here is the new version.
Here is another rebased version, and a bit of history: the first prototypes of
this patch were sent more than 3 years ago. Of course the patch evolved
significantly over this period, and I take it as a good sign that it wasn't
rejected and keeps moving through the commitfests. At the same time the lack of
attention makes things a bit frustrating. I have an impression that it's sort
of regular situation and wonder if there are any ideas (besides the well known
advice of putting some efforts into review patches from other people, since I'm
already doing my best and enjoying this) how to make progress in such cases?

0002-Base-implementation-of-subscripting-mechanism-v13.patch (85K) Download Attachment
0003-Subscripting-for-array-v13.patch (18K) Download Attachment
0005-Subscripting-documentation-v13.patch (26K) Download Attachment
0004-Subscripting-for-jsonb-v13.patch (45K) Download Attachment
0001-Renaming-for-new-subscripting-mechanism-v13.patch (74K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Pavel Stehule


ne 30. 9. 2018 v 0:21 odesílatel Dmitry Dolgov <[hidden email]> napsal:
> On Fri, 20 Jul 2018 at 23:32, Dmitry Dolgov <[hidden email]> wrote:
>
> > On Thu, 26 Apr 2018 at 16:44, Dmitry Dolgov <[hidden email]> wrote:
> >
> > > On 22 March 2018 at 23:25, Dmitry Dolgov <[hidden email]> wrote:
> > >
> > > Here is the updated version of patch, rebased after recent conflicts and with
> > > suggested documentation improvements.
> >
> > Another rebased version of the patch.
>
> I've noticed, that I never updated llvmjit code for the arrayref expressions,
> and it's important to do so, since the patch introduces another layer of
> flexibility. Hence here is the new version.

Here is another rebased version, and a bit of history: the first prototypes of
this patch were sent more than 3 years ago. Of course the patch evolved
significantly over this period, and I take it as a good sign that it wasn't
rejected and keeps moving through the commitfests. At the same time the lack of
attention makes things a bit frustrating. I have an impression that it's sort
of regular situation and wonder if there are any ideas (besides the well known
advice of putting some efforts into review patches from other people, since I'm
already doing my best and enjoying this) how to make progress in such cases?

This feature looks nice, and it can be great when some values of some not atomic type should be updated.

Regards

Pavel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Pavel Stehule
Hi

ne 30. 9. 2018 v 8:23 odesílatel Pavel Stehule <[hidden email]> napsal:


ne 30. 9. 2018 v 0:21 odesílatel Dmitry Dolgov <[hidden email]> napsal:
> On Fri, 20 Jul 2018 at 23:32, Dmitry Dolgov <[hidden email]> wrote:
>
> > On Thu, 26 Apr 2018 at 16:44, Dmitry Dolgov <[hidden email]> wrote:
> >
> > > On 22 March 2018 at 23:25, Dmitry Dolgov <[hidden email]> wrote:
> > >
> > > Here is the updated version of patch, rebased after recent conflicts and with
> > > suggested documentation improvements.
> >
> > Another rebased version of the patch.
>
> I've noticed, that I never updated llvmjit code for the arrayref expressions,
> and it's important to do so, since the patch introduces another layer of
> flexibility. Hence here is the new version.

Here is another rebased version, and a bit of history: the first prototypes of
this patch were sent more than 3 years ago. Of course the patch evolved
significantly over this period, and I take it as a good sign that it wasn't
rejected and keeps moving through the commitfests. At the same time the lack of
attention makes things a bit frustrating. I have an impression that it's sort
of regular situation and wonder if there are any ideas (besides the well known
advice of putting some efforts into review patches from other people, since I'm
already doing my best and enjoying this) how to make progress in such cases?

This feature looks nice, and it can be great when some values of some not atomic type should be updated.

I am playing with this feature little bit

I have one idea - can be possible to use integer subscript for record fields? It can helps with iteration over record.

example:

select ('{"a":{"a":[10,20]}}'::jsonb)[0];--> NULL, but can be more practical if it returns same like select ('{"a":{"a":[10,"20"]}}'::jsonb)['a'];

I don't like quite ignoring bad subsript in update

postgres=# insert into test(v) values( '[]');
INSERT 0 1
postgres=# update test set v[1000] = 'a';
UPDATE 1
postgres=# update test set v[1000] = 'a';
UPDATE 1
postgres=# update test set v[1000] = 'a';
UPDATE 1
postgres=# select * from test;
┌────┬─────────────────┐
│ id │        v        │
╞════╪═════════════════╡
│    │ ["a", "a", "a"] │
└────┴─────────────────┘
(1 row)


It should to raise exception in this case. Current behave allows append simply, but can be source of errors. For this case we can introduce some special symbol - some like -0 :)

It is maybe strange, but I prefer less magic syntax like

update test set v['a']['a'] =  v['a']['a'] || '1000';

more readable than

update test set v['a']['a'][1000000] = 1000;

My first impression is very good - update jsonb, xml documents can be very friendly.

Regards

Pavel




 

Regards

Pavel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Dmitry Dolgov
> On Wed, 10 Oct 2018 at 14:26, Pavel Stehule <[hidden email]> wrote:
>
> I am playing with this feature little bit

Thanks a lot!

> I have one idea - can be possible to use integer subscript for record fields? It can helps with iteration over record.
>
> example:
>
> select ('{"a":{"a":[10,20]}}'::jsonb)[0];--> NULL, but can be more practical if it returns same like select ('{"a":{"a":[10,"20"]}}'::jsonb)['a'];

Sounds interesting, but I'm not sure how consistent it would be with the rest
of jsonb functionality, and someone may want to get an error in this case. At
the same time I believe that this can be achieved quite nicely with json_query
or json_table from SQL/JSON patch (see examples here [1]). What do you think
about this approach?

> I don't like quite ignoring bad subsript in update

Can you show an example of such ignoring of a bad subsript in an update?

> postgres=# insert into test(v) values( '[]');
> INSERT 0 1
> postgres=# update test set v[1000] = 'a';
> UPDATE 1
> postgres=# update test set v[1000] = 'a';
> UPDATE 1
> postgres=# update test set v[1000] = 'a';
> UPDATE 1
> postgres=# select * from test;
> ┌────┬─────────────────┐
> │ id │        v        │
> ╞════╪═════════════════╡
> │    │ ["a", "a", "a"] │
> └────┴─────────────────┘
> (1 row)
>
> It should to raise exception in this case. Current behave allows append simply, but can be source of errors. For this case we can introduce some special symbol - some like -0 :)

Yeah, it may look strange, but there is a reason behind it. I tried to keep the
behaviour of this feature consistent with jsonb_set function (and in fact
they're sharing the same functionality). And for jsonb_set it's documented:

    If the item (of a path, in our case an index) is out of the range
    -array_length .. array_length -1, and create_missing is true, the new value
    is added at the beginning of the array if the item is negative, and at the
    end of the array if it is positive.

So, the index 1000 is way above the end of the array v, and every new item has
being appended at the end.

Of course no one said that they should behave similarly, but I believe it's
quite nice to have consistency here. Any other opinions?

> It is maybe strange, but I prefer less magic syntax like
>
> update test set v['a']['a'] =  v['a']['a'] || '1000';
>
> more readable than
>
> update test set v['a']['a'][1000000] = 1000;

Yep, with this patch it's possible to use both ways:

    =# table test;
    v
    -------------------------
     {"a": {"a": [1, 2, 3]}}
    (1 row)

    =# update test set v['a']['a'] = v['a']['a'] || '1000';
    UPDATE 1

    =# table test;
       v
    -------------------------------
     {"a": {"a": [1, 2, 3, 1000]}}
    (1 row)

> My first impression is very good - update jsonb, xml documents can be very friendly.

Thanks!

1: https://www.postgresql.org/message-id/flat/732208d3-56c3-25a4-8f08-3be1d54ad51b@...

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Generic type subscripting

Pavel Stehule


čt 11. 10. 2018 v 22:48 odesílatel Dmitry Dolgov <[hidden email]> napsal:
> On Wed, 10 Oct 2018 at 14:26, Pavel Stehule <[hidden email]> wrote:
>
> I am playing with this feature little bit

Thanks a lot!

> I have one idea - can be possible to use integer subscript for record fields? It can helps with iteration over record.
>
> example:
>
> select ('{"a":{"a":[10,20]}}'::jsonb)[0];--> NULL, but can be more practical if it returns same like select ('{"a":{"a":[10,"20"]}}'::jsonb)['a'];

Sounds interesting, but I'm not sure how consistent it would be with the rest
of jsonb functionality, and someone may want to get an error in this case. At
the same time I believe that this can be achieved quite nicely with json_query
or json_table from SQL/JSON patch (see examples here [1]). What do you think
about this approach?

In this case, I don't see any problem - the array or multidimensional array can be indexed by numbers or by special keys. But numbers are natural every time.

For me, SQL/JSON, JSONPath support is different topic. More - the generic support can be used for other types than Jsonb. I can imagine integrated dictionary type - and the SQL/JSON support doesn't help here.

This is not too strong theme for me - just I don't see a reason for strong restrictivity there.


> I don't like quite ignoring bad subsript in update

Can you show an example of such ignoring of a bad subsript in an update?

> postgres=# insert into test(v) values( '[]');
> INSERT 0 1
> postgres=# update test set v[1000] = 'a';
> UPDATE 1
> postgres=# update test set v[1000] = 'a';
> UPDATE 1
> postgres=# update test set v[1000] = 'a';
> UPDATE 1
> postgres=# select * from test;
> ┌────┬─────────────────┐
> │ id │        v        │
> ╞════╪═════════════════╡
> │    │ ["a", "a", "a"] │
> └────┴─────────────────┘
> (1 row)
>
> It should to raise exception in this case. Current behave allows append simply, but can be source of errors. For this case we can introduce some special symbol - some like -0 :)

Yeah, it may look strange, but there is a reason behind it. I tried to keep the
behaviour of this feature consistent with jsonb_set function (and in fact
they're sharing the same functionality). And for jsonb_set it's documented:

    If the item (of a path, in our case an index) is out of the range
    -array_length .. array_length -1, and create_missing is true, the new value
    is added at the beginning of the array if the item is negative, and at the
    end of the array if it is positive.

So, the index 1000 is way above the end of the array v, and every new item has
being appended at the end.

Of course no one said that they should behave similarly, but I believe it's
quite nice to have consistency here. Any other opinions?

Aha - although I understand to your motivation, I am think so it is bad design - and jsonb_set behave is not happy.

I am think so it is wrong idea, because you lost some information - field position - I push value on index 10, but it will be stored on second position.

Regards

Pavel


> It is maybe strange, but I prefer less magic syntax like
>
> update test set v['a']['a'] =  v['a']['a'] || '1000';
>
> more readable than
>
> update test set v['a']['a'][1000000] = 1000;

Yep, with this patch it's possible to use both ways:

    =# table test;
    v
    -------------------------
     {"a": {"a": [1, 2, 3]}}
    (1 row)

    =# update test set v['a']['a'] = v['a']['a'] || '1000';
    UPDATE 1

    =# table test;
       v
    -------------------------------
     {"a": {"a": [1, 2, 3, 1000]}}
    (1 row)

> My first impression is very good - update jsonb, xml documents can be very friendly.

Thanks!

1: https://www.postgresql.org/message-id/flat/732208d3-56c3-25a4-8f08-3be1d54ad51b@...
1234567