jsonpath

classic Classic list List threaded Threaded
198 messages Options
1234 ... 10
Reply | Threaded
Open this post in threaded view
|

jsonpath

Andrew Dunstan-8

Next up in the proposed SQL/JSON feature set I will review the three
jsonpath patches. These are attached, extracted from the patch set
Nikita posted on Jan 2nd.


Note that they depend on the TZH/TZM patch (see separate email thread).
I'd like to be able to commit that patch pretty soon.


cheers


andrew


--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


0004-jsonpath-json-v07.patch (96K) Download Attachment
0003-jsonpath-gin-v07.patch (39K) Download Attachment
0002-jsonpath-v07.patch (243K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Pavel Stehule


2018-01-07 20:31 GMT+01:00 Andrew Dunstan <[hidden email]>:

Next up in the proposed SQL/JSON feature set I will review the three
jsonpath patches. These are attached, extracted from the patch set
Nikita posted on Jan 2nd.


Note that they depend on the TZH/TZM patch (see separate email thread).
I'd like to be able to commit that patch pretty soon.

I did few tests - and it is working very well.

regards

Pavel



cheers


andrew


--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Andrew Dunstan-8


On 01/07/2018 03:00 PM, Pavel Stehule wrote:

>
>
> 2018-01-07 20:31 GMT+01:00 Andrew Dunstan
> <[hidden email] <mailto:[hidden email]>>:
>
>
>     Next up in the proposed SQL/JSON feature set I will review the three
>     jsonpath patches. These are attached, extracted from the patch set
>     Nikita posted on Jan 2nd.
>
>
>     Note that they depend on the TZH/TZM patch (see separate email
>     thread).
>     I'd like to be able to commit that patch pretty soon.
>
>
> I did few tests - and it is working very well.
>
>


You mean on the revised TZH/TZM patch that I posted?

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Pavel Stehule


2018-01-07 21:14 GMT+01:00 Andrew Dunstan <[hidden email]>:


On 01/07/2018 03:00 PM, Pavel Stehule wrote:
>
>
> 2018-01-07 20:31 GMT+01:00 Andrew Dunstan
> <[hidden email] <mailto:[hidden email]>>:
>
>
>     Next up in the proposed SQL/JSON feature set I will review the three
>     jsonpath patches. These are attached, extracted from the patch set
>     Nikita posted on Jan 2nd.
>
>
>     Note that they depend on the TZH/TZM patch (see separate email
>     thread).
>     I'd like to be able to commit that patch pretty soon.
>
>
> I did few tests - and it is working very well.
>
>


You mean on the revised TZH/TZM patch that I posted?

no -  I tested jsonpath implementation

Pavel

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Andrew Dunstan-8


On 01/07/2018 03:16 PM, Pavel Stehule wrote:

>
>
> 2018-01-07 21:14 GMT+01:00 Andrew Dunstan
> <[hidden email] <mailto:[hidden email]>>:
>
>
>
>     On 01/07/2018 03:00 PM, Pavel Stehule wrote:
>     >
>     >
>     > 2018-01-07 20:31 GMT+01:00 Andrew Dunstan
>     > <[hidden email]
>     <mailto:[hidden email]>
>     <mailto:[hidden email]
>     <mailto:[hidden email]>>>:
>     >
>     >
>     >     Next up in the proposed SQL/JSON feature set I will review
>     the three
>     >     jsonpath patches. These are attached, extracted from the
>     patch set
>     >     Nikita posted on Jan 2nd.
>     >
>     >
>     >     Note that they depend on the TZH/TZM patch (see separate email
>     >     thread).
>     >     I'd like to be able to commit that patch pretty soon.
>     >
>     >
>     > I did few tests - and it is working very well.
>     >
>     >
>
>
>     You mean on the revised TZH/TZM patch that I posted?
>
>
> no -  I tested jsonpath implementation
>
>


OK. You can help by also reviewing and testing the small patch at
<https://www.postgresql.org/message-id/f16a6408-45fe-b299-02b0-41e50a1c3023%402ndQuadrant.com>

thanks

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Pavel Stehule


2018-01-07 21:20 GMT+01:00 Andrew Dunstan <[hidden email]>:


On 01/07/2018 03:16 PM, Pavel Stehule wrote:
>
>
> 2018-01-07 21:14 GMT+01:00 Andrew Dunstan
> <[hidden email] <mailto:[hidden email]>>:
>
>
>
>     On 01/07/2018 03:00 PM, Pavel Stehule wrote:
>     >
>     >
>     > 2018-01-07 20:31 GMT+01:00 Andrew Dunstan
>     > <[hidden email]
>     <mailto:[hidden email]>
>     <mailto:[hidden email]
>     <mailto:[hidden email]>>>:
>     >
>     >
>     >     Next up in the proposed SQL/JSON feature set I will review
>     the three
>     >     jsonpath patches. These are attached, extracted from the
>     patch set
>     >     Nikita posted on Jan 2nd.
>     >
>     >
>     >     Note that they depend on the TZH/TZM patch (see separate email
>     >     thread).
>     >     I'd like to be able to commit that patch pretty soon.
>     >
>     >
>     > I did few tests - and it is working very well.
>     >
>     >
>
>
>     You mean on the revised TZH/TZM patch that I posted?
>
>
> no -  I tested jsonpath implementation
>
>


OK. You can help by also reviewing and testing the small patch at
<https://www.postgresql.org/message-id/f16a6408-45fe-b299-02b0-41e50a1c3023%402ndQuadrant.com>

I'll look there

Regards

Pavel


thanks

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Nikita Glukhov
Attached new 8th version of jsonpath related patches. Complete
documentation is still missing.

The first 4 small patches are necessary datetime handling in jsonpath:
1. simple refactoring, extracted function that will be used later in
jsonpath
2. throw an error when the input or format string contains trailing elements
3. avoid unnecessary cstring to text conversions
4. add function for automatic datetime type recognition by the presence
of formatting components

Should they be posted in a separate thread?

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

0001-extract-JsonEncodeDateTime-v08.patch (10K) Download Attachment
0002-strict-do_to_timestamp-v08.patch (3K) Download Attachment
0003-pass-cstring-to-do_to_timestamp-v08.patch (2K) Download Attachment
0004-add-to_datetime-v08.patch (12K) Download Attachment
0005-jsonpath-v08.patch (219K) Download Attachment
0006-jsonpath-gin-v08.patch (39K) Download Attachment
0007-jsonpath-json-v08.patch (96K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Andrew Dunstan-8


On 01/10/2018 05:42 PM, Nikita Glukhov wrote:

> Attached new 8th version of jsonpath related patches. Complete
> documentation is still missing.
>
> The first 4 small patches are necessary datetime handling in jsonpath:
> 1. simple refactoring, extracted function that will be used later in
> jsonpath
> 2. throw an error when the input or format string contains trailing
> elements
> 3. avoid unnecessary cstring to text conversions
> 4. add function for automatic datetime type recognition by the
> presence of formatting components
>
> Should they be posted in a separate thread?
>


The first of these refactors the json/jsonb timestamp formatting into a
single function, removing a lot of code duplication. The involves
exposing time2tm() and timetz2tm(). I don't think that's a tragedy, so
unless there is any objection I propose to commit it shortly.

The next three expose a bit more of the date/time API. I'm still
reviewing those.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Andrew Dunstan-8


On 01/15/2018 07:24 PM, Andrew Dunstan wrote:

>
> On 01/10/2018 05:42 PM, Nikita Glukhov wrote:
>> Attached new 8th version of jsonpath related patches. Complete
>> documentation is still missing.
>>
>> The first 4 small patches are necessary datetime handling in jsonpath:
>> 1. simple refactoring, extracted function that will be used later in
>> jsonpath
>> 2. throw an error when the input or format string contains trailing
>> elements
>> 3. avoid unnecessary cstring to text conversions
>> 4. add function for automatic datetime type recognition by the
>> presence of formatting components
>>
>> Should they be posted in a separate thread?
>>
>
> The first of these refactors the json/jsonb timestamp formatting into a
> single function, removing a lot of code duplication. The involves
> exposing time2tm() and timetz2tm(). I don't think that's a tragedy, so
> unless there is any objection I propose to commit it shortly.
>
> The next three expose a bit more of the date/time API. I'm still
> reviewing those.
>


I have committed the first of these patches.

I have reviewed the next three, and I think they are generally good.
There is no real point in committing them ahead of the jsonpath patch
since there would be no point in having them at all but for that patch.

Note that these do export the following hitherto internal bits of the
datetime functionality:

    tm2time
    tm2timetz
    AdjustTimeForTypmod
    AdjustTimestampForTypmod

Moving on to review the main jsonpath patch.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Andrew Dunstan-8


On 01/17/2018 04:01 PM, Andrew Dunstan wrote:

>
> On 01/15/2018 07:24 PM, Andrew Dunstan wrote:
>> On 01/10/2018 05:42 PM, Nikita Glukhov wrote:
>>> Attached new 8th version of jsonpath related patches. Complete
>>> documentation is still missing.
>>>
>>> The first 4 small patches are necessary datetime handling in jsonpath:
>>> 1. simple refactoring, extracted function that will be used later in
>>> jsonpath
>>> 2. throw an error when the input or format string contains trailing
>>> elements
>>> 3. avoid unnecessary cstring to text conversions
>>> 4. add function for automatic datetime type recognition by the
>>> presence of formatting components
>>>
>>> Should they be posted in a separate thread?
>>>
>> The first of these refactors the json/jsonb timestamp formatting into a
>> single function, removing a lot of code duplication. The involves
>> exposing time2tm() and timetz2tm(). I don't think that's a tragedy, so
>> unless there is any objection I propose to commit it shortly.
>>
>> The next three expose a bit more of the date/time API. I'm still
>> reviewing those.
>>
>
> I have committed the first of these patches.
>
> I have reviewed the next three, and I think they are generally good.
> There is no real point in committing them ahead of the jsonpath patch
> since there would be no point in having them at all but for that patch.
>
> Note that these do export the following hitherto internal bits of the
> datetime functionality:
>
>     tm2time
>     tm2timetz
>     AdjustTimeForTypmod
>     AdjustTimestampForTypmod
>
> Moving on to review the main jsonpath patch.
>




OK, I understand a good deal more than I did about how the jsopnpath
code works, but the commenting is abysmal.

I got quite nervous about adding a new datetime variant to JsonbValue.
However, my understanding from the code is that this will only ever be
used in an intermediate jsonpath processing result, and it won't be used
in a stored or build jsonb object. Is that right? If it is we need to
say so, and moreover we need to warn coders in the header file about the
restricted use of this variant.  I'm not sure we can enforce it with an
Assert, but If we can we should. I'm not 100% sure that doing it this
way, i.e. by extending and resuing jsonbValue, is desirable, I'd like to
know some of the thinking behind the design.

The encoding of a jsonpath value into a binary string is quite nifty,
but it needs to be documented. Once I understood it better some of my
fears about parser overhead were alleviated.

The use of a function pointer inside JsonPathVariable seems unnecessary
and baroque. It would be much more straightforward to set an isnull flag
in the struct and process the value in an "if" statement accordingly,
avoiding the function pointer altogether.

I am going to be travelling for a few days, then back online for a day
or two, then gone for a week. I hope we can make progress on these
features, but this needs lots more eyeballs, reviewing code as well as
testing, and lots more responsiveness. The whole suite is enormous.

cheers

andrew




--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Nikita Glukhov
On 23.01.2018 01:41, Andrew Dunstan wrote:

> On 01/17/2018 04:01 PM, Andrew Dunstan wrote:
>> On 01/15/2018 07:24 PM, Andrew Dunstan wrote:
>>> On 01/10/2018 05:42 PM, Nikita Glukhov wrote:
>>>> Attached new 8th version of jsonpath related patches. Complete
>>>> documentation is still missing.
>>>>
>>>> The first 4 small patches are necessary datetime handling in jsonpath:
>>>> 1. simple refactoring, extracted function that will be used later in
>>>> jsonpath
>>>> 2. throw an error when the input or format string contains trailing
>>>> elements
>>>> 3. avoid unnecessary cstring to text conversions
>>>> 4. add function for automatic datetime type recognition by the
>>>> presence of formatting components
>>>>
>>>> Should they be posted in a separate thread?
>>>>
>>> The first of these refactors the json/jsonb timestamp formatting into a
>>> single function, removing a lot of code duplication. The involves
>>> exposing time2tm() and timetz2tm(). I don't think that's a tragedy, so
>>> unless there is any objection I propose to commit it shortly.
>>>
>>> The next three expose a bit more of the date/time API. I'm still
>>> reviewing those.
>> I have committed the first of these patches.
>>
>> I have reviewed the next three, and I think they are generally good.
>> There is no real point in committing them ahead of the jsonpath patch
>> since there would be no point in having them at all but for that patch.
>>
>> Note that these do export the following hitherto internal bits of the
>> datetime functionality:
>>
>>      tm2time
>>      tm2timetz
>>      AdjustTimeForTypmod
>>      AdjustTimestampForTypmod
>>
>> Moving on to review the main jsonpath patch.
>
> OK, I understand a good deal more than I did about how the jsopnpath
> code works, but the commenting is abysmal.
Thank you for reviewing.

> I got quite nervous about adding a new datetime variant to JsonbValue.
> However, my understanding from the code is that this will only ever be
> used in an intermediate jsonpath processing result, and it won't be used
> in a stored or build jsonb object. Is that right?
Yes, you are right. Datetime JsonbValues are used only for for in-memory
representation of SQL/JSON datetime items, they converted into ordinary
JSON strings in ISO format when json/jsonb encoded into a datum.

> If it is we need to say so, and moreover we need to warn coders in the
> header file about the restricted use of this variant.  I'm not sure we
> can enforce it with an Assert, but If we can we should. I'm not 100%
> sure that doing it this way, i.e. by extending and resuing jsonbValue,
> is desirable, I'd like to know some of the thinking behind the design.
Datetime support was added to our jsonpath implementation when there was
already a lot of code using plain JsonbValue. So, the simplest are most
effective solution I found was JsonbValue extension. We could also
introduce extended struct SqlJsonItem, but it seems that there will be a
lot of unnecessary conversions between SqlJsonItem and JsonbValue.

> The encoding of a jsonpath value into a binary string is quite nifty,
> but it needs to be documented. Once I understood it better some of my
> fears about parser overhead were alleviated.
> The use of a function pointer inside JsonPathVariable seems unnecessary
> and baroque. It would be much more straightforward to set an isnull flag
> in the struct and process the value in an "if" statement accordingly,
> avoiding the function pointer altogether.
Callback in JsonPathVariable is used for on-demand evaluation of
SQL/JSON PASSING parameters (see EvalJsonPathVar() from patch
0010-sqljson-v08.patch). For jsonpath itself it is really unnecessary.

> I am going to be travelling for a few days, then back online for a day
> or two, then gone for a week. I hope we can make progress on these
> features, but this needs lots more eyeballs, reviewing code as well as
> testing, and lots more responsiveness. The whole suite is enormous.
>
> cheers
>
> andrew

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Nikita Glukhov
Attached 9th version of jsonpath patches rebased onto the latest master.

Jsonpath grammar for parenthesized expressions and predicates was fixed.

Documentation drafts for jsonpath written by Oleg Bartunov:
https://github.com/obartunov/sqljsondoc

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

0001-strict-do_to_timestamp-v09.patch (3K) Download Attachment
0002-pass-cstring-to-do_to_timestamp-v09.patch (2K) Download Attachment
0003-add-to_datetime-v09.patch (12K) Download Attachment
0004-jsonpath-v09.patch (218K) Download Attachment
0005-jsonpath-gin-v09.patch (39K) Download Attachment
0006-jsonpath-json-v09.patch (96K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Oleg Bartunov-2
On Wed, Feb 14, 2018 at 2:04 AM, Nikita Glukhov <[hidden email]> wrote:
> Attached 9th version of jsonpath patches rebased onto the latest master.
>
> Jsonpath grammar for parenthesized expressions and predicates was fixed.
>
> Documentation drafts for jsonpath written by Oleg Bartunov:
> https://github.com/obartunov/sqljsondoc

Direct link is https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md
Please consider it as WIP, I will update it in my spare time.

>
>
> --
> Nikita Glukhov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Nikita Glukhov
Attached 10th version of the jsonpath patches.

1. Fixed error handling in arithmetic operators.

    Now run-time errors in arithmetic operators are catched (added
    PG_TRY/PG_CATCH around operator's functions calls) and converted into
    Unknown values in predicates as it is required by the standard:

    =# SELECT jsonb '[1,0,2]' @* '$[*] ? (1 / @ > 0)';
     ?column?
    ----------
     1
     2
    (2 rows)

2. Fixed grammar for parenthesized expressions.

3. Refactored GIN support for jsonpath operators.

4. Added one more operator json[b] @# jsonpath returning singleton json[b] with
    automatic conditional wrapping of sequences with more than one element into
    arrays:

    =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 2)';
     ?column?
    -----------
     [3, 4, 5]
    (1 row)

    =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 4)';
     ?column?
    ----------
     5
    (1 row)

    =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 5)';
     ?column?
    ----------
     (null)
    (1 row)

    Existing set-returning operator json[b] @* jsonpath is also very userful but
    can't be used in functional indices like new operator @#.

    Note that conditional wrapping of @# differs from the wrapping in
    JSON_QUERY(... WITH [ARRAY] WRAPPER), where only singleton objects and
    arrays are not wrapped.  Unconditional wrapping can be emulated with our
    array construction feature (see below).

5. Changed time zone behavior in .datetime() item method.

    In the previous version of the patch timestamptz SQL/JSON items were
    serialized into JSON string items using session time zone.  This behavior
    did not allow jsonpath operators to be marked as immutable, and therefore
    they could not be used in functional indices.  Also, when the time zone was
    not specified in the input string, but TZM or TZH format fields were present
    in the format string, session time zone was used as a default for
    timestamptz items.

    To make jsonpath operators immutable we decided to save input time zone for
    timestamptz items and disallow automatic time zone assignment.  Also
    additional parameter was added to .datetime() for default time zone
    specification:

    =# SET timezone = '+03';
    SET

    =# SELECT jsonb '"10-03-2017 12:34:56"' @*
                    '$.datetime("DD-MM-YYYY HH24:MI:SS TZH")';
    ERROR:  Invalid argument for SQL/JSON datetime function

    =# SELECT jsonb '"10-03-2017 12:34:56"' @*
                    '$.datetime("DD-MM-YYYY HH24:MI:SS TZH", "+05")';
              ?column?
    -----------------------------
     "2017-03-10T12:34:56+05:00"
    (1 row)

    =# SELECT jsonb '"10-03-2017 12:34:56 +05"' @*
                    '$.datetime("DD-MM-YYYY HH24:MI:SS TZH")';
              ?column?
    -----------------------------
     "2017-03-10T12:34:56+05:00"
    (1 row)

    Please note that our .datetime() behavior is not standard now: by the
    standard, input and format strings must match exactly, i.e. they both should
    not contain trailing unmatched elements, so automatic time zone assignment
    is impossible.  But it too restrictive for PostgreSQL users, so we decided
    to preserve usual PostgreSQL behavior here:

    =# SELECT jsonb '"10-03-2017"' @* '$.datetime("DD-MM-YYYY HH24:MI:SS")';
           ?column?
    -----------------------
     "2017-03-10T00:00:00"
    (1 row)


    Also PostgreSQL is able to automatically recognize format of the input
    string for the specified datetime type, but we can only bring this behavior
    into jsonpath by introducing separate item methods .date(), .time(),
    .timetz(), .timestamp() and .timestamptz().   Also we can use here our
    unfinished feature that gives us ability to work with PostresSQL types in
    jsonpath using cast operator :: (see sqljson_ext branch in our github repo):

    =# SELECT jsonb '"10/03/2017 12:34"' @* '$::timestamptz';
              ?column?
    -----------------------------
     "2017-03-10T12:34:00+03:00"
    (1 row)



A brief description of the extra jsonpath syntax features contained in the
patch #7:

   * Sequence construction by joining path expressions with comma:

     =# SELECT jsonb '[1, 2, 3]' @* '$[*], 4, 5';
      ?column?
     ----------
      1
      2
      3
      4
      5
     (5 rows)

   * Array construction by placing sequence into brackets (equivalent to
     JSON_QUERY(... WITH UNCONDITIONAL WRAPPER)):

     =# SELECT jsonb '[1, 2, 3]' @* '[$[*], 4, 5]';
         ?column?
     -----------------
      [1, 2, 3, 4, 5]
     (1 row)

   * Object construction by placing sequences of key-value pairs into braces:

     =# SELECT jsonb '{"a" : [1, 2, 3]}' @* '{a: [$.a[*], 4, 5], "b c": "dddd"}';
                    ?column?
     ---------------------------------------
      {"a": [1, 2, 3, 4, 5], "b c": "dddd"}
     (1 row)

   * Object subscripting with string-valued expressions:

     =# SELECT jsonb '{"a" : "aaa", "b": "a", "c": "ccc"}' @* '$[$.b, "c"]';
      ?column?
     ----------
      "aaa"
      "ccc"
     (2 rows)

   * Support of UNIX epoch time in .datetime() item method:

     =# SELECT jsonb '1519649957.37' @* '$.datetime()';
                 ?column?
     --------------------------------
      "2018-02-26T12:59:17.37+00:00"
     (1 row)

--
Nikita Glukhov
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


0001-strict-do_to_timestamp-v10.patch (3K) Download Attachment
0002-pass-cstring-to-do_to_timestamp-v10.patch (2K) Download Attachment
0003-add-to_datetime-v10.patch (12K) Download Attachment
0004-jsonpath-v10.patch (232K) Download Attachment
0005-jsonpath-gin-v10.patch (44K) Download Attachment
0006-jsonpath-json-v10.patch (99K) Download Attachment
0007-jsonpath-extras-v10.patch (38K) Download Attachment
0008-jsonpath-extras-tests-for-json-v10.patch (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Oleg Bartunov-2
On Mon, Feb 26, 2018 at 6:34 PM, Nikita Glukhov <[hidden email]> wrote:

> Attached 10th version of the jsonpath patches.
>
> 1. Fixed error handling in arithmetic operators.
>
>    Now run-time errors in arithmetic operators are catched (added
>    PG_TRY/PG_CATCH around operator's functions calls) and converted into
>    Unknown values in predicates as it is required by the standard:
>
>    =# SELECT jsonb '[1,0,2]' @* '$[*] ? (1 / @ > 0)';
>     ?column?
>    ----------
>     1
>     2
>    (2 rows)
>
> 2. Fixed grammar for parenthesized expressions.
>
> 3. Refactored GIN support for jsonpath operators.
>
> 4. Added one more operator json[b] @# jsonpath returning singleton json[b]
> with
>    automatic conditional wrapping of sequences with more than one element
> into
>    arrays:
>
>    =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 2)';
>     ?column?
>    -----------
>     [3, 4, 5]
>    (1 row)
>
>    =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 4)';
>     ?column?
>    ----------
>     5
>    (1 row)
>
>    =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 5)';
>     ?column?
>    ----------
>     (null)
>    (1 row)
>
>    Existing set-returning operator json[b] @* jsonpath is also very userful
> but
>    can't be used in functional indices like new operator @#.
>
>    Note that conditional wrapping of @# differs from the wrapping in
>    JSON_QUERY(... WITH [ARRAY] WRAPPER), where only singleton objects and
>    arrays are not wrapped.  Unconditional wrapping can be emulated with our
>    array construction feature (see below).
>
> 5. Changed time zone behavior in .datetime() item method.
>
>    In the previous version of the patch timestamptz SQL/JSON items were
>    serialized into JSON string items using session time zone.  This behavior
>    did not allow jsonpath operators to be marked as immutable, and therefore
>    they could not be used in functional indices.  Also, when the time zone
> was
>    not specified in the input string, but TZM or TZH format fields were
> present
>    in the format string, session time zone was used as a default for
>    timestamptz items.
>
>    To make jsonpath operators immutable we decided to save input time zone
> for
>    timestamptz items and disallow automatic time zone assignment.  Also
>    additional parameter was added to .datetime() for default time zone
>    specification:
>
>    =# SET timezone = '+03';
>    SET
>
>    =# SELECT jsonb '"10-03-2017 12:34:56"' @*
>                    '$.datetime("DD-MM-YYYY HH24:MI:SS TZH")';
>    ERROR:  Invalid argument for SQL/JSON datetime function
>
>    =# SELECT jsonb '"10-03-2017 12:34:56"' @*
>                    '$.datetime("DD-MM-YYYY HH24:MI:SS TZH", "+05")';
>              ?column?
>    -----------------------------
>     "2017-03-10T12:34:56+05:00"
>    (1 row)
>
>    =# SELECT jsonb '"10-03-2017 12:34:56 +05"' @*
>                    '$.datetime("DD-MM-YYYY HH24:MI:SS TZH")';
>              ?column?
>    -----------------------------
>     "2017-03-10T12:34:56+05:00"
>    (1 row)
>
>    Please note that our .datetime() behavior is not standard now: by the
>    standard, input and format strings must match exactly, i.e. they both
> should
>    not contain trailing unmatched elements, so automatic time zone
> assignment
>    is impossible.  But it too restrictive for PostgreSQL users, so we
> decided
>    to preserve usual PostgreSQL behavior here:
>
>    =# SELECT jsonb '"10-03-2017"' @* '$.datetime("DD-MM-YYYY HH24:MI:SS")';
>           ?column?
>    -----------------------
>     "2017-03-10T00:00:00"
>    (1 row)

I think someday we should consider adding support for sql standard
conforming datetime. Since it
breaks postgres behaviour we will need 'standard_conforming_datetime' guc.


>
>
>    Also PostgreSQL is able to automatically recognize format of the input
>    string for the specified datetime type, but we can only bring this
> behavior
>    into jsonpath by introducing separate item methods .date(), .time(),
>    .timetz(), .timestamp() and .timestamptz().   Also we can use here our
>    unfinished feature that gives us ability to work with PostresSQL types in
>    jsonpath using cast operator :: (see sqljson_ext branch in our github
> repo):
>
>    =# SELECT jsonb '"10/03/2017 12:34"' @* '$::timestamptz';
>              ?column?
>    -----------------------------
>     "2017-03-10T12:34:00+03:00"
>    (1 row)

Another note.
We decided to preserve TZ in JSON_QUERY function and follow standard
Postgres behaviour  in JSON_VALUE,
since JSON_QUERY returns JSON object and JSON_VALUE returns SQL value.

SELECT JSON_QUERY(jsonb '"2018-02-21 17:01:23 +05"',
'$.datetime("YYYY-MM-DD HH24:MI:SS TZH")');
         json_query
-----------------------------
 "2018-02-21T17:01:23+05:00"
(1 row)

show timezone;
 TimeZone
----------
 W-SU
(1 row)

SELECT JSON_VALUE(jsonb '"2018-02-21 17:01:23 +05"',
'$.datetime("YYYY-MM-DD HH24:MI:SS TZH")');
       json_value
------------------------
 2018-02-21 15:01:23+03
(1 row)

>
>
>
> A brief description of the extra jsonpath syntax features contained in the
> patch #7:
>
>   * Sequence construction by joining path expressions with comma:
>
>     =# SELECT jsonb '[1, 2, 3]' @* '$[*], 4, 5';
>      ?column?
>     ----------
>      1
>      2
>      3
>      4
>      5
>     (5 rows)
>
>   * Array construction by placing sequence into brackets (equivalent to
>     JSON_QUERY(... WITH UNCONDITIONAL WRAPPER)):
>
>     =# SELECT jsonb '[1, 2, 3]' @* '[$[*], 4, 5]';
>         ?column?
>     -----------------
>      [1, 2, 3, 4, 5]
>     (1 row)
>
>   * Object construction by placing sequences of key-value pairs into braces:
>
>     =# SELECT jsonb '{"a" : [1, 2, 3]}' @* '{a: [$.a[*], 4, 5], "b c":
> "dddd"}';
>                    ?column?
>     ---------------------------------------
>      {"a": [1, 2, 3, 4, 5], "b c": "dddd"}
>     (1 row)
>
>   * Object subscripting with string-valued expressions:
>
>     =# SELECT jsonb '{"a" : "aaa", "b": "a", "c": "ccc"}' @* '$[$.b, "c"]';
>      ?column?
>     ----------
>      "aaa"
>      "ccc"
>     (2 rows)
>
>   * Support of UNIX epoch time in .datetime() item method:
>
>     =# SELECT jsonb '1519649957.37' @* '$.datetime()';
>                 ?column?
>     --------------------------------
>      "2018-02-26T12:59:17.37+00:00"
>     (1 row)
>

Documentation in user-friendly format (it will be convered to xml, of
course) is available
https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md
We are permanently working on it.



>
> --
> Nikita Glukhov
> Postgres Professional:http://www.postgrespro.com
> The Russian Postgres Company
>

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Robert Haas
In reply to this post by Nikita Glukhov
On Mon, Feb 26, 2018 at 10:34 AM, Nikita Glukhov
<[hidden email]> wrote:
> Attached 10th version of the jsonpath patches.
>
> 1. Fixed error handling in arithmetic operators.
>
>    Now run-time errors in arithmetic operators are catched (added
>    PG_TRY/PG_CATCH around operator's functions calls) and converted into
>    Unknown values in predicates as it is required by the standard:

I think we really need to rename PG_TRY and PG_CATCH or rethink this
whole interface so that people stop thinking they can use it to
prevent errors from being thrown.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Nikita Glukhov
On 28.02.2018 06:55, Robert Haas wrote:

> On Mon, Feb 26, 2018 at 10:34 AM, Nikita Glukhov
> <[hidden email]> wrote:
>> Attached 10th version of the jsonpath patches.
>>
>> 1. Fixed error handling in arithmetic operators.
>>
>>     Now run-time errors in arithmetic operators are catched (added
>>     PG_TRY/PG_CATCH around operator's functions calls) and converted into
>>     Unknown values in predicates as it is required by the standard:
> I think we really need to rename PG_TRY and PG_CATCH or rethink this
> whole interface so that people stop thinking they can use it to
> prevent errors from being thrown.

I understand that it is unsafe to call arbitrary function inside PG_TRY without
rethrowing of caught errors in PG_CATCH, but in jsonpath only the following
numeric and datetime functions with known behavior are called inside PG_TRY
and only errors of category ERRCODE_DATA_EXCEPTION are caught:

numeric_add()
numeric_mul()
numeric_div()
numeric_mod()
numeric_float8()
float8in()
float8_numeric()
to_datetime()

SQL/JSON standard requires us to handle errors and then perform the specified
ON ERROR behavior.  In the next SQL/JSON patch I had to use subtransactions for
catching errors in JSON_VALUE, JSON_QUERY and JSON_EXISTS where an arbitrary
user-defined typecast function can be called.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Alexander Korotkov
On Fri, Mar 2, 2018 at 12:40 AM, Nikita Glukhov <[hidden email]> wrote:
On 28.02.2018 06:55, Robert Haas wrote:

On Mon, Feb 26, 2018 at 10:34 AM, Nikita Glukhov
<[hidden email]> wrote:
Attached 10th version of the jsonpath patches.

1. Fixed error handling in arithmetic operators.

    Now run-time errors in arithmetic operators are catched (added
    PG_TRY/PG_CATCH around operator's functions calls) and converted into
    Unknown values in predicates as it is required by the standard:
I think we really need to rename PG_TRY and PG_CATCH or rethink this
whole interface so that people stop thinking they can use it to
prevent errors from being thrown.

I understand that it is unsafe to call arbitrary function inside PG_TRY without
rethrowing of caught errors in PG_CATCH, but in jsonpath only the following
numeric and datetime functions with known behavior are called inside PG_TRY
and only errors of category ERRCODE_DATA_EXCEPTION are caught:

numeric_add()
numeric_mul()
numeric_div()
numeric_mod()
numeric_float8()
float8in()
float8_numeric()
to_datetime()

That seems like a quite limited list of functions.  What about reworking them
providing a way of calling them without risk of exception?  For example, we can
have numeric_add_internal() function which fill given data structure with
error information instead of throwing the error.  numeric_add() would be a
wrapper over numeric_add_internal(), which throws an error if corresponding
data structure is filled.  In jsonpath we can call numeric_add_internal() and
interpret errors in another way.  That seems to be better than use of PG_TRY
and PG_CATCH.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Tomas Vondra-4
In reply to this post by Nikita Glukhov
Hi,

The patch no longer applies - it got broken by fd1a421fe66 which changed
columns in pg_proc. A rebase is needed.

Fixing it is pretty simle, so I've done that locally and tried to run
'make check' under valgrind. And I got a bunch of reports about
uninitialised values. Full report attached, but in general there seem to
be two types of failures:

  Conditional jump or move depends on uninitialised value(s)
     at 0x57BB47A: vfprintf (in /usr/lib64/libc-2.24.so)
     by 0x57E5478: vsnprintf (in /usr/lib64/libc-2.24.so)
     by 0xA926D3: pvsnprintf (psprintf.c:121)
     by 0x723E03: appendStringInfoVA (stringinfo.c:130)
     by 0x723D58: appendStringInfo (stringinfo.c:87)
     by 0x76BEFF: _outCoerceViaIO (outfuncs.c:1413)
     by 0x776F99: outNode (outfuncs.c:3978)
     by 0x76D7E7: _outJsonCoercion (outfuncs.c:1779)
     by 0x777CB9: outNode (outfuncs.c:4398)
     by 0x76D507: _outJsonExpr (outfuncs.c:1752)
     by 0x777CA1: outNode (outfuncs.c:4395)
     by 0x767000: _outList (outfuncs.c:187)
     by 0x776874: outNode (outfuncs.c:3753)
     by 0x76A4D2: _outTableFunc (outfuncs.c:1068)
     by 0x776D89: outNode (outfuncs.c:3912)
     by 0x7744FD: _outRangeTblEntry (outfuncs.c:3209)
     by 0x777959: outNode (outfuncs.c:4290)
     by 0x767000: _outList (outfuncs.c:187)
     by 0x776874: outNode (outfuncs.c:3753)
     by 0x773713: _outQuery (outfuncs.c:3049)
   Uninitialised value was created by a stack allocation
     at 0x5B0C19: base_yyparse (gram.c:26287)

This happens when _outCoerceViaIO tries to output 'location' field
(that's line 1413), so I guess it's not set/copied somewhere.

The second failure looks like this:

  Conditional jump or move depends on uninitialised value(s)
     at 0x49E58B: ginFillScanEntry (ginscan.c:72)
     by 0x49EB56: ginFillScanKey (ginscan.c:221)
     by 0x49EF72: ginNewScanKey (ginscan.c:369)
     by 0x4A3875: gingetbitmap (ginget.c:1807)
     by 0x4F620B: index_getbitmap (indexam.c:727)
     by 0x6EE342: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:104)
     by 0x6DA8F8: MultiExecProcNode (execProcnode.c:506)
     by 0x6EC53D: BitmapHeapNext (nodeBitmapHeapscan.c:118)
     by 0x6DC26D: ExecScanFetch (execScan.c:95)
     by 0x6DC308: ExecScan (execScan.c:162)
     by 0x6ED7E5: ExecBitmapHeapScan (nodeBitmapHeapscan.c:730)
     by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
     by 0x6E5961: ExecProcNode (executor.h:239)
     by 0x6E5E25: fetch_input_tuple (nodeAgg.c:406)
     by 0x6E8091: agg_retrieve_direct (nodeAgg.c:1736)
     by 0x6E7C84: ExecAgg (nodeAgg.c:1551)
     by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
     by 0x6D1361: ExecProcNode (executor.h:239)
     by 0x6D3BB7: ExecutePlan (execMain.c:1721)
     by 0x6D1917: standard_ExecutorRun (execMain.c:361)
   Uninitialised value was created by a heap allocation
     at 0xA64FDC: palloc (mcxt.c:858)
     by 0x938636: gin_extract_jsonpath_query (jsonb_gin.c:630)
     by 0x938AB6: gin_extract_jsonb_query (jsonb_gin.c:746)
     by 0xA340C0: FunctionCall7Coll (fmgr.c:1201)
     by 0x49EE7F: ginNewScanKey (ginscan.c:313)
     by 0x4A3875: gingetbitmap (ginget.c:1807)
     by 0x4F620B: index_getbitmap (indexam.c:727)
     by 0x6EE342: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:104)
     by 0x6DA8F8: MultiExecProcNode (execProcnode.c:506)
     by 0x6EC53D: BitmapHeapNext (nodeBitmapHeapscan.c:118)
     by 0x6DC26D: ExecScanFetch (execScan.c:95)
     by 0x6DC308: ExecScan (execScan.c:162)
     by 0x6ED7E5: ExecBitmapHeapScan (nodeBitmapHeapscan.c:730)
     by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
     by 0x6E5961: ExecProcNode (executor.h:239)
     by 0x6E5E25: fetch_input_tuple (nodeAgg.c:406)
     by 0x6E8091: agg_retrieve_direct (nodeAgg.c:1736)
     by 0x6E7C84: ExecAgg (nodeAgg.c:1551)
     by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
     by 0x6D1361: ExecProcNode (executor.h:239)

So the extra_data allocated in gin_extract_jsonpath_query() get to
ginFillScanEntry() uninitialised.


Both seem like a valid issues, I think.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

valgrind.log (24K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Oleg Bartunov-2
Thanks, Tomas !

Will publish a new version really soon !

Regards,
Oleg

On Tue, Mar 6, 2018 at 12:29 AM, Tomas Vondra
<[hidden email]> wrote:

> Hi,
>
> The patch no longer applies - it got broken by fd1a421fe66 which changed
> columns in pg_proc. A rebase is needed.
>
> Fixing it is pretty simle, so I've done that locally and tried to run
> 'make check' under valgrind. And I got a bunch of reports about
> uninitialised values. Full report attached, but in general there seem to
> be two types of failures:
>
>   Conditional jump or move depends on uninitialised value(s)
>      at 0x57BB47A: vfprintf (in /usr/lib64/libc-2.24.so)
>      by 0x57E5478: vsnprintf (in /usr/lib64/libc-2.24.so)
>      by 0xA926D3: pvsnprintf (psprintf.c:121)
>      by 0x723E03: appendStringInfoVA (stringinfo.c:130)
>      by 0x723D58: appendStringInfo (stringinfo.c:87)
>      by 0x76BEFF: _outCoerceViaIO (outfuncs.c:1413)
>      by 0x776F99: outNode (outfuncs.c:3978)
>      by 0x76D7E7: _outJsonCoercion (outfuncs.c:1779)
>      by 0x777CB9: outNode (outfuncs.c:4398)
>      by 0x76D507: _outJsonExpr (outfuncs.c:1752)
>      by 0x777CA1: outNode (outfuncs.c:4395)
>      by 0x767000: _outList (outfuncs.c:187)
>      by 0x776874: outNode (outfuncs.c:3753)
>      by 0x76A4D2: _outTableFunc (outfuncs.c:1068)
>      by 0x776D89: outNode (outfuncs.c:3912)
>      by 0x7744FD: _outRangeTblEntry (outfuncs.c:3209)
>      by 0x777959: outNode (outfuncs.c:4290)
>      by 0x767000: _outList (outfuncs.c:187)
>      by 0x776874: outNode (outfuncs.c:3753)
>      by 0x773713: _outQuery (outfuncs.c:3049)
>    Uninitialised value was created by a stack allocation
>      at 0x5B0C19: base_yyparse (gram.c:26287)
>
> This happens when _outCoerceViaIO tries to output 'location' field
> (that's line 1413), so I guess it's not set/copied somewhere.
>
> The second failure looks like this:
>
>   Conditional jump or move depends on uninitialised value(s)
>      at 0x49E58B: ginFillScanEntry (ginscan.c:72)
>      by 0x49EB56: ginFillScanKey (ginscan.c:221)
>      by 0x49EF72: ginNewScanKey (ginscan.c:369)
>      by 0x4A3875: gingetbitmap (ginget.c:1807)
>      by 0x4F620B: index_getbitmap (indexam.c:727)
>      by 0x6EE342: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:104)
>      by 0x6DA8F8: MultiExecProcNode (execProcnode.c:506)
>      by 0x6EC53D: BitmapHeapNext (nodeBitmapHeapscan.c:118)
>      by 0x6DC26D: ExecScanFetch (execScan.c:95)
>      by 0x6DC308: ExecScan (execScan.c:162)
>      by 0x6ED7E5: ExecBitmapHeapScan (nodeBitmapHeapscan.c:730)
>      by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
>      by 0x6E5961: ExecProcNode (executor.h:239)
>      by 0x6E5E25: fetch_input_tuple (nodeAgg.c:406)
>      by 0x6E8091: agg_retrieve_direct (nodeAgg.c:1736)
>      by 0x6E7C84: ExecAgg (nodeAgg.c:1551)
>      by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
>      by 0x6D1361: ExecProcNode (executor.h:239)
>      by 0x6D3BB7: ExecutePlan (execMain.c:1721)
>      by 0x6D1917: standard_ExecutorRun (execMain.c:361)
>    Uninitialised value was created by a heap allocation
>      at 0xA64FDC: palloc (mcxt.c:858)
>      by 0x938636: gin_extract_jsonpath_query (jsonb_gin.c:630)
>      by 0x938AB6: gin_extract_jsonb_query (jsonb_gin.c:746)
>      by 0xA340C0: FunctionCall7Coll (fmgr.c:1201)
>      by 0x49EE7F: ginNewScanKey (ginscan.c:313)
>      by 0x4A3875: gingetbitmap (ginget.c:1807)
>      by 0x4F620B: index_getbitmap (indexam.c:727)
>      by 0x6EE342: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:104)
>      by 0x6DA8F8: MultiExecProcNode (execProcnode.c:506)
>      by 0x6EC53D: BitmapHeapNext (nodeBitmapHeapscan.c:118)
>      by 0x6DC26D: ExecScanFetch (execScan.c:95)
>      by 0x6DC308: ExecScan (execScan.c:162)
>      by 0x6ED7E5: ExecBitmapHeapScan (nodeBitmapHeapscan.c:730)
>      by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
>      by 0x6E5961: ExecProcNode (executor.h:239)
>      by 0x6E5E25: fetch_input_tuple (nodeAgg.c:406)
>      by 0x6E8091: agg_retrieve_direct (nodeAgg.c:1736)
>      by 0x6E7C84: ExecAgg (nodeAgg.c:1551)
>      by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
>      by 0x6D1361: ExecProcNode (executor.h:239)
>
> So the extra_data allocated in gin_extract_jsonpath_query() get to
> ginFillScanEntry() uninitialised.
>
>
> Both seem like a valid issues, I think.
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

1234 ... 10