jsonpath

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
198 messages Options
123456 ... 10
Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Tomas Vondra-4
BTW is the v19 really just a rebase of the preceding version?

I'm asking because v18 was adding two types into pg_type.dat, namely
jsonpath (6050) and _jsonpath (6051), while v19 only adds jsonpath
(6050). I've noticed because v18 was missing a comma between the two
entries, which I'd say is a bug - interestingly enough, both types were
created correctly, so it seems to be resilient to such typos.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Tom Lane-2
Tomas Vondra <[hidden email]> writes:
> BTW is the v19 really just a rebase of the preceding version?
> I'm asking because v18 was adding two types into pg_type.dat, namely
> jsonpath (6050) and _jsonpath (6051), while v19 only adds jsonpath
> (6050).

I haven't looked at this patch, but manual generation of array type entries
is obsolete since 3dc820c43e427371b66d217f2bd5481fc9ef2e2d.  There should
still be a mention of the OID to use for the array type though ...

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Tomas Vondra-4



On 11/8/18 6:12 AM, Tom Lane wrote:

> Tomas Vondra <[hidden email]> writes:
>> BTW is the v19 really just a rebase of the preceding version?
>> I'm asking because v18 was adding two types into pg_type.dat, namely
>> jsonpath (6050) and _jsonpath (6051), while v19 only adds jsonpath
>> (6050).
>
> I haven't looked at this patch, but manual generation of array type entries
> is obsolete since 3dc820c43e427371b66d217f2bd5481fc9ef2e2d.  There should
> still be a mention of the OID to use for the array type though ...
>

Ah, I missed this improvement somehow. (Yes, the array type OID is still
mentioned as part of the type definition.)

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Nikita Glukhov
In reply to this post by Tomas Vondra-4
Attached 20th version the jsonpath patches.


On 08.11.2018 4:52, Tomas Vondra wrote:
On 11/6/18 4:48 PM, Tomas Vondra wrote:
On 11/6/18 3:31 PM, Nikita Glukhov wrote:
On 29.10.2018 2:20, Tomas Vondra wrote:>
9) It's generally a good idea to make the individual pieces committable
separately, but that means e.g. the regression tests have to pass after
each patch. At the moment that does not seem to be the case for 0002,
see the attached file. I'm running with -DRANDOMIZE_ALLOCATED_MEMORY,
not sure if that's related.
This should definitely be a bug in json support, but I can't reproduce
it simply by defining -DRANDOMIZE_ALLOCATED_MEMORY.  Could you provide
a stack trace at least?
I'll try.
Not sure why you can't reproduce the failures, it's perfectly
reproducible for me. For the record, I'm doing this:

./configure --prefix=/home/user/pg-jsonpath --enable-debug
--enable-cassert CFLAGS="-O0 -DRANDOMIZE_ALLOCATED_MEMORY" && make -s
clean && make -s -j4 && make check

After sticking Assert(false) to JsonEncodeJsonbValue (to the default
case), I get a failure like this:

  select json '{}' @* 'lax $[0]';
! WARNING:  unknown jsonb value type: 20938064
! server closed the connection unexpectedly
! 	This probably means the server terminated abnormally
! 	before or while processing the request.
! connection to server was lost

The backtrace is attached. My guess is JsonValueListGetList in
jsonb_jsonpath_query only does shallow copy instead of copying the
pieces into funcctx->multi_call_memory_ctx, so it gets corrupted on
subsequent calls.

I also attach valgrind report, but I suppose the reported issues are a
consequence of the same bug.
Than you for your help, I finally have managed to reproduce this bug.

The problem was really related to copying values but it was not located in the
jsonb_jsonpath_query() where the whole value list should already be allocated in 
funcctx->multi_call_memory_ctx (executeJsonPath() is called in that context and
also input json/jsonb is copied into it).  Stack-allocated root JsonbValue was
not copied to the heap due to wrong copy condition in recursiveExecuteNoUnwrap()
(jpiIndexArray case):

@@ -1754,1 +1754,1 @@ recursiveExecuteNoUnwrap(JsonPathExecContext *cxt, ...
-   res = recursiveExecuteNext(cxt, jsp, &elem, v, found, !binary);
+   res = recursiveExecuteNext(cxt, jsp, &elem, v, found, singleton || !binary);

I have refactored a bit this place extracting explicit 'copy' flag.

Also I found a similar problem in makePassingVars() where Numerics were not
copied into 'multi_call_memory_ctx' from the input json object. I decided to use
datumCopy() inside makePassingVars() instead of copying the whole input json in
jsonb_jsonpath_query() using PG_GETARG_JSONB_P_COPY().  But I think that
processing of the input json with variables should ideally be bone lazily,
just don't know if this refactoring is worth it because 3rd argument of
json[b]_jsonpath_xxx() functions is used only for testing of variables passing,
SQL/JSON functions use the different mechanism.

On 06.11.2018 18:48, Tomas Vondra wrote:
But FF7-FF9 are questionable since the maximal supported precision is only 6.
They are optional by the standard:
   95) Specifications for Feature F555, “Enhanced seconds precision”:
     d) Subclause 9.44, “Datetime templates”:
       i) Without Feature F555, “Enhanced seconds precision”,
          a <datetime template fraction> shall not be FF7, FF8 or FF9.
So I decided to allow FF7-FF9 only on the output in to_char().
Hmmmm, isn't that against the standard then? I mean, if our precision is 
only 6, that probably means we don't have the F555 feature, which means 
FF7-9 should not be available.
  
It also seems like a bit surprising behavior that we only allow those 
fields in to_char() and not in other places.
Ok, now error is thrown if FF7-FF9 are found in the format string.


On 06.11.2018 18:48, Tomas Vondra wrote:
4) There probably should be .gitignore rule for jsonpath_gram.h, just
like for other generated header files.
I see 3 rules in /src/backend/utils/adt/.gitignore:

/jsonpath_gram.h
/jsonpath_gram.c
/jsonpath_scan.c
But there's a symlink in src/include/utils/jsonpath_gram.h and that's 
not in .gitignore.
Added jsonpath_gram.h to /src/include/utils/.gitignore.
-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

0001-Preliminary-datetime-infrastructure-v20.patch (33K) Download Attachment
0002-Jsonpath-engine-and-operators-v20.patch (356K) Download Attachment
0003-Jsonpath-GIN-support-v20.patch (44K) Download Attachment
0004-Jsonpath-syntax-extensions-v20.patch (48K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Tomas Vondra-4
Hi,

I've done another round of reviews on v20, assuming the patch is almost
ready to commit, but unfortunately I ran into a bunch of issues that
need to be resolved. None of this is a huge issue, but it's also above
the threshold of what could be tweaked by a committer IMHO.

(Which brings the question who plans to commit this. The patch does not
have a committer in the CF app, but I see both Teodor and Alexander are
listed as it's authors, so I'd expect it to be one of those. Or I might
do that, of course.)


0001
----

1) to_timestamp() does this:

  do_to_timestamp(date_txt, VARDATA(fmt), VARSIZE_ANY_EXHDR(fmt), false,
                  &tm, &fsec, &fprec, NULL);

Shouldn't it really do VARDATA_ANY() instead of VARDATA()? It's what the
function did before (well, it called text_to_cstring, but that does
VARDATA_ANY). The same thing applies to to_date(), BTW.

I also find it a bit inconvenient that we expand the fmt like this in
all do_to_timestamp() calls, although it's just to_datetime() that needs
to do it this way. I realize we can't change to_datetime() because it's
external API, but maybe we should make it construct a varlena and pass
it to do_to_timestamp().


2) We define both DCH_FF# and DCH_ff#, but we never ever use the
lower-case version. Heck, it's not mentioned even in DCH_keywords, which
does this:

  ...
  {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
  ...
  {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
  ...

Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
"day". Also, the comment for "ff1" is wrong, it should be "f" I guess.

And of course, there are regression tests for FF# variants, but not the
lower-case ones.


3) Of course, these new formatting patterns should be added to SGML
docs, for example to "Template Patterns for Date/Time Formatting" in
func.sgml (and maybe to other places, I haven't looked for them very
thoroughly).


4) The last block in DCH_from_char() does this

    while (*s == ' ')
        s++;

but I suppose it should use isspace() just like the code immediately
before it.


5) I might be missing something, but why is there the "D" at the end of
the return flags from DCH_from_char?

    /* Return flags for DCH_from_char() */
    #define DCH_DATED    0x01
    #define DCH_TIMED    0x02
    #define DCH_ZONED    0x04


0002
----

1) There are some unnecessary changes to to_datetime signature (date_txt
renamed to vs. datetxt), which is mostly minor but unnecessary churn.


2) There are some extra changes to to_datetime (extra parameters, etc.).
I wonder why those are not included in 0001, as part of the supporting
datetime infrastructure.


3) I'm not sure whether the _safe functions are a good or bad idea, but
at the very least the comments should be updated to explain what it does
(as the API has changed, obviously).


4) the json.c changes are under-documented, e.g. there are no comments
for lex_peek_value, JsonEncodeDateTime doesn't say what tzp is for and
whether it needs to be specified all the time, half of the functions at
the end don't have comments (some of them are really simple, but then
other simple functions do have comments).

I don't know what the right balance here is (I'm certainly not asking
for bogus comments just to have comments) and I agree that the existing
code is not exactly abundant in comments. But perhaps having at least
some would be nice.

The same thing applies to jsonpath.c and jsonpath_exec.c, I think. There
are pretty much no comments whatsoever (at least at the function level,
explaining what the functions do). It would be good to have a file-level
comment too, explaining how jsonpath works, probably.


5) I see uniqueifyJsonbObject now does this:

        if (!object->val.object.uniquified)
                return;

That seems somewhat strange, considering the function says it'll
uniqueify objects, but then exits when noticing the passed object is not
uniquified?


6) Why do we make make_result inline? (numeric.c) If this needs to be
marked with "inline" then perhaps all the _internal functions should be
marked like that too? I have my doubts about the need for this.


7) The other places add _safe to functions that don't throw errors
directly and instead update edata. Why are set_var_from_str, div_var,
mod_var excluded from this convention?


8) I wonder if the changes in numeric can have negative impact on
performance. Has anyone done any performance tests of this part?


0003
----

1) json_gin.c should probably add comments briefly explaining what
JsonPathNode, GinEntries, ExtractedPathEntry, ExtractedJsonPath and
JsonPathExtractionContext are for.


2) I find it a bit suspicious that there are no asserts in json_gin.c
(well, there are 3 in the existing code, but nothing in the new code,
and the patch essentially doubles the amount of code here).


No comments for 0004 at this point.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Alexander Korotkov
Hi!

On Sat, Nov 24, 2018 at 11:03 PM Tomas Vondra
<[hidden email]> wrote:
> I've done another round of reviews on v20, assuming the patch is almost
> ready to commit, but unfortunately I ran into a bunch of issues that
> need to be resolved. None of this is a huge issue, but it's also above
> the threshold of what could be tweaked by a committer IMHO.
>
> (Which brings the question who plans to commit this. The patch does not
> have a committer in the CF app, but I see both Teodor and Alexander are
> listed as it's authors, so I'd expect it to be one of those. Or I might
> do that, of course.)

Thanks a lot for your efforts on this patch!  I was planning to work
on this patch during this commitfest (and probably commit).  But I
didn't manage to do this due to family circumstances (I got my baby
born and hardly could allocate time to work in November).  But I'm
still planning to commit this patch.  So, I'm going to set myself as
committer in commitfest application.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Dmitry Dolgov
> On Wed, Nov 28, 2018 at 9:26 PM Alexander Korotkov <[hidden email]> wrote:
>
> On Sat, Nov 24, 2018 at 11:03 PM Tomas Vondra
> <[hidden email]> wrote:
> > I've done another round of reviews on v20, assuming the patch is almost
> > ready to commit, but unfortunately I ran into a bunch of issues that
> > need to be resolved. None of this is a huge issue, but it's also above
> > the threshold of what could be tweaked by a committer IMHO.
> >
> > (Which brings the question who plans to commit this. The patch does not
> > have a committer in the CF app, but I see both Teodor and Alexander are
> > listed as it's authors, so I'd expect it to be one of those. Or I might
> > do that, of course.)
>
> Thanks a lot for your efforts on this patch!  I was planning to work
> on this patch during this commitfest (and probably commit).  But I
> didn't manage to do this due to family circumstances (I got my baby
> born and hardly could allocate time to work in November).  But I'm
> still planning to commit this patch.  So, I'm going to set myself as
> committer in commitfest application.

Great news, thank you! Just for the information, cfbot doesn't like the patch
on windows:

src/backend/utils/adt/jsonpath_exec.c(1539): error C2146: syntax error
: missing ')' before identifier 'id'
[C:\projects\postgresql\postgres.vcxproj]
src/backend/utils/adt/jsonpath_exec.c(1539): error C2081: 'int32_t' :
name in formal parameter list illegal
[C:\projects\postgresql\postgres.vcxproj]
src/backend/utils/adt/jsonpath_exec.c(1539): error C2061: syntax error
: identifier 'id' [C:\projects\postgresql\postgres.vcxproj]
src/backend/utils/adt/jsonpath_exec.c(1539): error C2059: syntax error
: ';' [C:\projects\postgresql\postgres.vcxproj]
src/backend/utils/adt/jsonpath_exec.c(1539): error C2059: syntax error
: ')' [C:\projects\postgresql\postgres.vcxproj]
src/backend/utils/adt/jsonpath_exec.c(1540): error C2449: found '{' at
file scope (missing function header?)
[C:\projects\postgresql\postgres.vcxproj]
src/backend/utils/adt/jsonpath_exec.c(1548): error C2059: syntax error
: '}' [C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1539):
error C2146: syntax error : missing ')' before identifier 'id'
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1539):
error C2081: 'int32_t' : name in formal parameter list illegal
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1539):
error C2061: syntax error : identifier 'id'
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1539):
error C2059: syntax error : ';'
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1539):
error C2059: syntax error : ')'
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1540):
error C2449: found '{' at file scope (missing function header?)
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1548):
error C2059: syntax error : '}'
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
10 Warning(s)
14 Error(s)

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Nikita Glukhov
In reply to this post by Tomas Vondra-4
Attached 21-st version of the patches rebased onto the current master.

Added new patch 0004 with documentation written by Liudmila Mantrova, and
optional patch 0003 that was separated from 0002.


On 24.11.2018 23:03, Tomas Vondra wrote:

Hi,

I've done another round of reviews on v20, assuming the patch is almost
ready to commit, but unfortunately I ran into a bunch of issues that
need to be resolved. None of this is a huge issue, but it's also above
the threshold of what could be tweaked by a committer IMHO.
Thank your for your review.
(Which brings the question who plans to commit this. The patch does not
have a committer in the CF app, but I see both Teodor and Alexander are
listed as it's authors, so I'd expect it to be one of those. Or I might
do that, of course.)
0001
----

1) to_timestamp() does this:

  do_to_timestamp(date_txt, VARDATA(fmt), VARSIZE_ANY_EXHDR(fmt), false,
                  &tm, &fsec, &fprec, NULL);

Shouldn't it really do VARDATA_ANY() instead of VARDATA()? It's what the
function did before (well, it called text_to_cstring, but that does
VARDATA_ANY). The same thing applies to to_date(), BTW.

I also find it a bit inconvenient that we expand the fmt like this in
all do_to_timestamp() calls, although it's just to_datetime() that needs
to do it this way. I realize we can't change to_datetime() because it's
external API, but maybe we should make it construct a varlena and pass
it to do_to_timestamp().
Of course, there must be VARDATA_ANY() instead of of VARDATA().  But I decided
to construct varlena and pass it to do_to_timestamp() and to to_datetime().

2) We define both DCH_FF# and DCH_ff#, but we never ever use the
lower-case version. Heck, it's not mentioned even in DCH_keywords, which
does this:

  ...
  {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
  ...
  {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
  ...

Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
"day". 
Yes, "ff#" are mapped to DCH_FF# like "mi" is mapped DCH_MI.

"Day", "day" are not mapped to DCH_DAY because they determine letter case in the
output, but "ff1" and "FF#" output contains only digits.

Also, the comment for "ff1" is wrong, it should be "f" I guess.
Fixed.
And of course, there are regression tests for FF# variants, but not the
lower-case ones.

Tests were added

3) Of course, these new formatting patterns should be added to SGML
docs, for example to "Template Patterns for Date/Time Formatting" in
func.sgml (and maybe to other places, I haven't looked for them very
thoroughly).

Fixed.

4) The last block in DCH_from_char() does this

    while (*s == ' ')
        s++;

but I suppose it should use isspace() just like the code immediately
before it.

Fixed.

5) I might be missing something, but why is there the "D" at the end of
the return flags from DCH_from_char?

    /* Return flags for DCH_from_char() */
    #define DCH_DATED    0x01
    #define DCH_TIMED    0x02
    #define DCH_ZONED    0x04
These terms "dated", "timed", and "zoned" are taken from the standard:

9.39.10) If <JSON datetime template> JDT is specified, ... :
  a) If JDT contains <datetime template year>, ... then JDT is dated.
  b) If JDT contains <datetime template 12-hour>, ... then JDT is timed.
  c) If JDT contains <datetime template time zone hour>, ... then JDT is zoned.


0002
----

1) There are some unnecessary changes to to_datetime signature (date_txt
renamed to vs. datetxt), which is mostly minor but unnecessary churn.

2) There are some extra changes to to_datetime (extra parameters, etc.).
I wonder why those are not included in 0001, as part of the supporting
datetime infrastructure.
Extra changes to to_datetime() were moved into patch 0001.

3) I'm not sure whether the _safe functions are a good or bad idea, but
at the very least the comments should be updated to explain what it does
(as the API has changed, obviously).
 
This functions were introduced for elimination of PG_TRY/PG_CATCH in jsonpath
code.  But this error handling approach has a lot of issues (see below), so I
decided separate again them into optional patch 0004.

4) the json.c changes are under-documented, e.g. there are no comments
for lex_peek_value, JsonEncodeDateTime doesn't say what tzp is for and
whether it needs to be specified all the time, half of the functions at
the end don't have comments (some of them are really simple, but then
other simple functions do have comments).

I don't know what the right balance here is (I'm certainly not asking
for bogus comments just to have comments) and I agree that the existing
code is not exactly abundant in comments. But perhaps having at least
some would be nice.
Some new comments were added to json.c.
The same thing applies to jsonpath.c and jsonpath_exec.c, I think. There
are pretty much no comments whatsoever (at least at the function level,
explaining what the functions do). It would be good to have a file-level
comment too, explaining how jsonpath works, probably.

I hope to add detailed comments for jsonpath in the next version of the patches.

5) I see uniqueifyJsonbObject now does this:

	if (!object->val.object.uniquified)
		return;

That seems somewhat strange, considering the function says it'll
uniqueify objects, but then exits when noticing the passed object is not
uniquified?

'uniquified' field was rename to 'uniquify'.

6) Why do we make make_result inline? (numeric.c) If this needs to be
marked with "inline" then perhaps all the _internal functions should be
marked like that too? I have my doubts about the need for this.


7) The other places add _safe to functions that don't throw errors
directly and instead update edata. Why are set_var_from_str, div_var,
mod_var excluded from this convention?
I added "_safe" postfix only to functions having the both variants: safe (error
info is placed into *edata) and unsafe (errors are always thrown).

One-line make_result() is called from many unsafe places, so I decided to leave
it as it was.  It can also be defined as a simple macro.

New "_internal" functions are called from jsonpath_exec.c, so they can't be
"static inline", but "extern inline" seems to be non-portable.

8) I wonder if the changes in numeric can have negative impact on
performance. Has anyone done any performance tests of this part?
I've tried to measure this impact by evaluation of expression with dozens of '+'
or even by adding a loop into numeric_add(), but failed to get any measurable
difference.

0003
----

1) jsonb_gin.c should probably add comments briefly explaining what
JsonPathNode, GinEntries, ExtractedPathEntry, ExtractedJsonPath and
JsonPathExtractionContext are for.
A lot of comments were added. Also major refactoring of jsonb_gin.c was done.

2) I find it a bit suspicious that there are no asserts in json_gin.c
(well, there are 3 in the existing code, but nothing in the new code,
and the patch essentially doubles the amount of code here).
I have managed to find only 4 similar places suitable for asserts.

No comments for 0004 at this point.

regards

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

0001-Preliminary-datetime-infrastructure-v21.patch (36K) Download Attachment
0002-Jsonpath-engine-and-operators-v21.patch (331K) Download Attachment
0003-Remove-PG_TRY-in-jsonpath-arithmetics-v21.patch (29K) Download Attachment
0004-Jsonpath-docs-v21.patch (41K) Download Attachment
0005-Jsonpath-GIN-support-v21.patch (48K) Download Attachment
0006-Jsonpath-syntax-extensions-v21.patch (49K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Alexander Korotkov
On Tue, Dec 4, 2018 at 2:23 AM Nikita Glukhov <[hidden email]> wrote:

> 2) We define both DCH_FF# and DCH_ff#, but we never ever use the
> lower-case version. Heck, it's not mentioned even in DCH_keywords, which
> does this:
>
>   ...
>   {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>   ...
>   {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>   ...
>
> Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
> "day".
>
> Yes, "ff#" are mapped to DCH_FF# like "mi" is mapped DCH_MI.
>
> "Day", "day" are not mapped to DCH_DAY because they determine letter case in the
> output, but "ff1" and "FF#" output contains only digits.

Right, DCH_poz is also offset in DCH_keywords array.  So, if array has
an entry for "ff1" then enum should have a DCH_ff1 member in the same
position.

I got some other questions regarding this patchset.

1) Why do we parse FF7-FF9 if we're not supporting them anyway?
Without defining FF7-FF9 we can also don't throw errors for them
everywhere.  That would save us some code lines.

2) + DCH_to_char_fsec("%01d", in->fsec / INT64CONST(100000));
Why do we use INT64CONST() here and in the similar places assuming
that fsec is only uint32?

3) wrapItem() is unused in
0002-Jsonpath-engine-and-operators-v21.patch, but used in
0006-Jsonpath-syntax-extensions-v21.patch.  Please, move it to
0006-Jsonpath-syntax-extensions-v21.patch?

4) I also got these couple of warning during compilation.

jsonpath_exec.c:1485:1: warning: unused function
'recursiveExecuteNested' [-Wunused-function]
recursiveExecuteNested(JsonPathExecContext *cxt, JsonPathItem *jsp,
^
1 warning generated.
jsonpath_scan.l:444:6: warning: implicit declaration of function
'jsonpath_yyparse' is invalid in C99 [-Wimplicit-function-declaration]
        if (jsonpath_yyparse((void*)&parseresult) != 0)
            ^
1 warning generated.

Perhaps recursiveExecuteNested() is unsed in this patchset.  It's
probably used by some subsequent SQL/JSON-related patchset.  So,
please, move it there.

5) I think each usage of PG_TRY()/PG_CATCH() deserves comment
describing why it's safe to use without subtransaction.  In fact we
should just state that no function called inside performs data
modification.

------
Alexander Korotkov

Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Tomas Vondra-4
On 1/5/19 1:11 AM, Alexander Korotkov wrote:

> On Tue, Dec 4, 2018 at 2:23 AM Nikita Glukhov <[hidden email]> wrote:
>> 2) We define both DCH_FF# and DCH_ff#, but we never ever use the
>> lower-case version. Heck, it's not mentioned even in DCH_keywords, which
>> does this:
>>
>>   ...
>>   {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>>   ...
>>   {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>>   ...
>>
>> Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
>> "day".
>>
>> Yes, "ff#" are mapped to DCH_FF# like "mi" is mapped DCH_MI.
>>
>> "Day", "day" are not mapped to DCH_DAY because they determine letter case in the
>> output, but "ff1" and "FF#" output contains only digits.
>
> Right, DCH_poz is also offset in DCH_keywords array.  So, if array has
> an entry for "ff1" then enum should have a DCH_ff1 member in the same
> position.
>

I guess my question is why we define DCH_ff# at all, when it's not
mapped in DCH_keywords? ISTM we could simply leave them out of all the
arrays, no? Of course, this is not specific to this patch, as it applies
to pre-existing items (like DCH_mi).

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Alexander Korotkov
On Sat, Jan 5, 2019 at 5:21 PM Tomas Vondra
<[hidden email]> wrote:

> On 1/5/19 1:11 AM, Alexander Korotkov wrote:
> > On Tue, Dec 4, 2018 at 2:23 AM Nikita Glukhov <[hidden email]> wrote:
> >> 2) We define both DCH_FF# and DCH_ff#, but we never ever use the
> >> lower-case version. Heck, it's not mentioned even in DCH_keywords, which
> >> does this:
> >>
> >>   ...
> >>   {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
> >>   ...
> >>   {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
> >>   ...
> >>
> >> Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
> >> "day".
> >>
> >> Yes, "ff#" are mapped to DCH_FF# like "mi" is mapped DCH_MI.
> >>
> >> "Day", "day" are not mapped to DCH_DAY because they determine letter case in the
> >> output, but "ff1" and "FF#" output contains only digits.
> >
> > Right, DCH_poz is also offset in DCH_keywords array.  So, if array has
> > an entry for "ff1" then enum should have a DCH_ff1 member in the same
> > position.
> >
>
> I guess my question is why we define DCH_ff# at all, when it's not
> mapped in DCH_keywords? ISTM we could simply leave them out of all the
> arrays, no?

I think we still need separate array entries for "FF1" and "ff1".  At
least, as long as we don't allow "fF1" and "Ff1".  In order to get rid
of DCH_ff#, I think we should decouple DCH_keywords from array
indexes.

> Of course, this is not specific to this patch, as it applies
> to pre-existing items (like DCH_mi).

Right, that should be a subject of separate patch.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Nikita Glukhov
In reply to this post by Alexander Korotkov
Attached 23rd version of the patches.


On 05.01.2019 3:11, Alexander Korotkov wrote:

> On Tue, Dec 4, 2018 at 2:23 AM Nikita Glukhov <[hidden email]> wrote:
>> 2) We define both DCH_FF# and DCH_ff#, but we never ever use the
>> lower-case version. Heck, it's not mentioned even in DCH_keywords, which
>> does this:
>>
>>    ...
>>    {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>>    ...
>>    {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>>    ...
>>
>> Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
>> "day".
>>
>> Yes, "ff#" are mapped to DCH_FF# like "mi" is mapped DCH_MI.
>>
>> "Day", "day" are not mapped to DCH_DAY because they determine letter case in the
>> output, but "ff1" and "FF#" output contains only digits.
> Right, DCH_poz is also offset in DCH_keywords array.  So, if array has
> an entry for "ff1" then enum should have a DCH_ff1 member in the same
> position.
>
> I got some other questions regarding this patchset.
>
> 1) Why do we parse FF7-FF9 if we're not supporting them anyway?
> Without defining FF7-FF9 we can also don't throw errors for them
> everywhere.  That would save us some code lines.
FF7-FF9 were removed.
> 2) + DCH_to_char_fsec("%01d", in->fsec / INT64CONST(100000));
> Why do we use INT64CONST() here and in the similar places assuming
> that fsec is only uint32?
Fixed.

> 3) wrapItem() is unused in
> 0002-Jsonpath-engine-and-operators-v21.patch, but used in
> 0006-Jsonpath-syntax-extensions-v21.patch.  Please, move it to
> 0006-Jsonpath-syntax-extensions-v21.patch?
wraptItem() was moved into patch #6.

> 4) I also got these couple of warning during compilation.
>
> jsonpath_exec.c:1485:1: warning: unused function
> 'recursiveExecuteNested' [-Wunused-function]
> recursiveExecuteNested(JsonPathExecContext *cxt, JsonPathItem *jsp,
> ^
> 1 warning generated.
> jsonpath_scan.l:444:6: warning: implicit declaration of function
> 'jsonpath_yyparse' is invalid in C99 [-Wimplicit-function-declaration]
>          if (jsonpath_yyparse((void*)&parseresult) != 0)
>              ^
> 1 warning generated.
>
> Perhaps recursiveExecuteNested() is unsed in this patchset.  It's
> probably used by some subsequent SQL/JSON-related patchset.  So,
> please, move it there.
recursiveExecuteNested() was removed from this patch set.

Prototype for jsonpath_yyparse() should be in the Bison-generated file
src/include/adt/jsonpath_gram.h.

> 5) I think each usage of PG_TRY()/PG_CATCH() deserves comment
> describing why it's safe to use without subtransaction.  In fact we
> should just state that no function called inside performs data
> modification.
Comments were added.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

0001-Preliminary-datetime-infrastructure-v23.patch (32K) Download Attachment
0002-Jsonpath-engine-and-operators-v23.patch (330K) Download Attachment
0003-Remove-PG_TRY-in-jsonpath-arithmetics-v23.patch (29K) Download Attachment
0004-Jsonpath-docs-v23.patch (41K) Download Attachment
0005-Jsonpath-GIN-support-v23.patch (48K) Download Attachment
0006-Jsonpath-syntax-extensions-v23.patch (50K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Alexander Korotkov
Nikita, thank you!

I noticed another thing.  3-arguments version of functions
jsonpath_exists(), jsonpath_predicate(), jsonpath_query(),
jsonpath_query_wrapped() are:

1) Not documented
2) Not used in operator definitions
3) Not used in subsequent patches

So, it looks like we can just remove then.  But I think we're very
likely can commit jsonpath patch to PostgreSQL 12, but I'm not sure
about other patches.  So, having those functions exposed to user can
be extremely useful until we have separate nodes for SQL/JSON.  So, I
vote to document these functions.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Tomas Vondra-4
On 1/18/19 6:58 PM, Alexander Korotkov wrote:

> Nikita, thank you!
>
> I noticed another thing.  3-arguments version of functions
> jsonpath_exists(), jsonpath_predicate(), jsonpath_query(),
> jsonpath_query_wrapped() are:
>
> 1) Not documented
> 2) Not used in operator definitions
> 3) Not used in subsequent patches
>
> So, it looks like we can just remove then.  But I think we're very
> likely can commit jsonpath patch to PostgreSQL 12, but I'm not sure
> about other patches.  So, having those functions exposed to user can
> be extremely useful until we have separate nodes for SQL/JSON.  So, I
> vote to document these functions.
>

That seems a bit strange. If those functions are meant to be used by
other patches (which ones?) then why should not we make them part of
those future patches?

But it seems more like those functions are actually meant to be used by
users in the first place, in cases when we need to provide a third
parameter (which operators can't do). In that case yes - we should have
them documented properly, but also tested. Which is not the case
currently, because the regression tests only use the operators.

Two more comments:

1) I'm wondering why we even need separate functions for the different
numbers of arguments at the C level, as both simply call to the same
function anyway with a PG_NARGS() condition in it. Can't we ditch those
wrappers and reference that target function directly?

2) I once again ran into the jsonb->json business, which essentially
means that when you do this:

    select json '{"a": { "b" : 10}}' @? '$.a.b';

it ends up calling jsonb_jsonpath_exists(), which then does this:

    Jsonb *jb = PG_GETARG_JSONB_P(0);

I and am not sure why/how it seems to work, but I find it confusing
because the stack still looks like this:

#0  jsonb_jsonpath_exists (fcinfo=0x162f800) at jsonpath_exec.c:2857
#1  0x000000000096d721 in json_jsonpath_exists2 (fcinfo=0x162f800) at
jsonpath_exec.c:2882
#2  0x00000000006c790a in ExecInterpExpr (state=0x162f300,
econtext=0x162ee18, isnull=0x7ffcea4c3857) at execExprInterp.c:648
...

and of course, the fcinfo->flinfo still points to the json-specific
function, which say the first parameter type is 'json'.

(gdb) print *fcinfo->flinfo
$23 = {fn_addr = 0x96d709 <json_jsonpath_exists2>, fn_oid = 6043,
fn_nargs = 2, fn_strict = true, fn_retset = false, fn_stats = 2 '\002',
fn_extra = 0x0, fn_mcxt = 0x162e990, fn_expr = 0x15db378}


test=# select proname, prosrc, proargtypes from pg_proc where oid = 6043;
     proname     |        prosrc         | proargtypes
-----------------+-----------------------+-------------
 jsonpath_exists | json_jsonpath_exists2 | 114 6050
(1 row)

test=# select typname from pg_type where oid = 114;
 typname
---------
 json
(1 row)


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Alexander Korotkov
On Sat, Jan 19, 2019 at 1:32 AM Tomas Vondra
<[hidden email]> wrote:

> On 1/18/19 6:58 PM, Alexander Korotkov wrote:
> > So, it looks like we can just remove then.  But I think we're very
> > likely can commit jsonpath patch to PostgreSQL 12, but I'm not sure
> > about other patches.  So, having those functions exposed to user can
> > be extremely useful until we have separate nodes for SQL/JSON.  So, I
> > vote to document these functions.
>
> That seems a bit strange. If those functions are meant to be used by
> other patches (which ones?) then why should not we make them part of
> those future patches?

No, these functions aren't used by other patches (it was my original
wrong idea).  Other patches provides SQL expressions making these
functions not that necessary.  But I think we should keep these
functions in jsonpath patch.

> But it seems more like those functions are actually meant to be used by
> users in the first place, in cases when we need to provide a third
> parameter (which operators can't do). In that case yes - we should have
> them documented properly, but also tested. Which is not the case
> currently, because the regression tests only use the operators.

+1
Thank you for noticing.  Tests should be provided as well.

> Two more comments:
>
> 1) I'm wondering why we even need separate functions for the different
> numbers of arguments at the C level, as both simply call to the same
> function anyway with a PG_NARGS() condition in it. Can't we ditch those
> wrappers and reference that target function directly?

That was surprising for me too.  Technically, it's OK to do this.  And
we do this for extensions.  But for in-core functions we have
following sanity check.

-- Considering only built-in procs (prolang = 12), look for multiple uses
-- of the same internal function (ie, matching prosrc fields).  It's OK to
-- have several entries with different pronames for the same internal function,
-- but conflicts in the number of arguments and other critical items should
-- be complained of.  (We don't check data types here; see next query.)
-- Note: ignore aggregate functions here, since they all point to the same
-- dummy built-in function.

SELECT p1.oid, p1.proname, p2.oid, p2.proname
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid < p2.oid AND
    p1.prosrc = p2.prosrc AND
    p1.prolang = 12 AND p2.prolang = 12 AND
    (p1.prokind != 'a' OR p2.prokind != 'a') AND
    (p1.prolang != p2.prolang OR
     p1.prokind != p2.prokind OR
     p1.prosecdef != p2.prosecdef OR
     p1.proleakproof != p2.proleakproof OR
     p1.proisstrict != p2.proisstrict OR
     p1.proretset != p2.proretset OR
     p1.provolatile != p2.provolatile OR
     p1.pronargs != p2.pronargs);

And we already have some code written especially to make this check happy.

/* This is separate to keep the opr_sanity regression test from complaining */
Datum
regexp_split_to_table_no_flags(PG_FUNCTION_ARGS)
{
    return regexp_split_to_table(fcinfo);
}

Personally I'm not fan of this approach, and I would rather relax this
sanity check.  But that doesn't seem to be a subject of this patch.
For jsonpath, it's OK to just keep this tradition.

> 2) I once again ran into the jsonb->json business, which essentially
> means that when you do this:
>
>     select json '{"a": { "b" : 10}}' @? '$.a.b';
>
> it ends up calling jsonb_jsonpath_exists(), which then does this:
>
>     Jsonb *jb = PG_GETARG_JSONB_P(0);
>
> I and am not sure why/how it seems to work, but I find it confusing
> because the stack still looks like this:
>
> #0  jsonb_jsonpath_exists (fcinfo=0x162f800) at jsonpath_exec.c:2857
> #1  0x000000000096d721 in json_jsonpath_exists2 (fcinfo=0x162f800) at
> jsonpath_exec.c:2882
> #2  0x00000000006c790a in ExecInterpExpr (state=0x162f300,
> econtext=0x162ee18, isnull=0x7ffcea4c3857) at execExprInterp.c:648
> ...
>
> and of course, the fcinfo->flinfo still points to the json-specific
> function, which say the first parameter type is 'json'.
>
> (gdb) print *fcinfo->flinfo
> $23 = {fn_addr = 0x96d709 <json_jsonpath_exists2>, fn_oid = 6043,
> fn_nargs = 2, fn_strict = true, fn_retset = false, fn_stats = 2 '\002',
> fn_extra = 0x0, fn_mcxt = 0x162e990, fn_expr = 0x15db378}
>
>
> test=# select proname, prosrc, proargtypes from pg_proc where oid = 6043;
>      proname     |        prosrc         | proargtypes
> -----------------+-----------------------+-------------
>  jsonpath_exists | json_jsonpath_exists2 | 114 6050
> (1 row)
>
> test=# select typname from pg_type where oid = 114;
>  typname
> ---------
>  json
> (1 row)

It's tricky.  There is jsonpath_json.c, which includes jsonpath_exec.c
with set of macro definitions making that code work with json type.
I'm going to refactor that.  But general approach to include same
functions more than once seems OK for me.

Some more notes:

1) It seems that @* and @# are not going to be supported by any
indexes.  I think we should remove these operators and let users use
functions instead.
2) I propose to rename @~ operator to @@.  We already use @@ as
"satisfies" in multiple places, and I thinks this case fits too.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Alexander Korotkov
On Sat, Jan 19, 2019 at 2:54 AM Alexander Korotkov
<[hidden email]> wrote:
> 1) It seems that @* and @# are not going to be supported by any
> indexes.  I think we should remove these operators and let users use
> functions instead.
> 2) I propose to rename @~ operator to @@.  We already use @@ as
> "satisfies" in multiple places, and I thinks this case fits too.

3) How do we calculate the "id" property returned by keyvalue()
function?  It's not documented.  Even presence of "id" columns isn't
documented.  Standard stands that it's implementation-depended
indetifier of object holding key-value pair.  The way of its
calculation is also not clear from the code.  Why do we need constant
of 10000000000?

                id = jb->type != jbvBinary ? 0 :
                    (int64)((char *) jb->val.binary.data -
                            (char *) cxt->baseObject.jbc);
                id += (int64) cxt->baseObject.id * INT64CONST(10000000000);

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Alexander Korotkov
On Sun, Jan 20, 2019 at 2:45 AM Alexander Korotkov
<[hidden email]> wrote:

> 3) How do we calculate the "id" property returned by keyvalue()
> function?  It's not documented.  Even presence of "id" columns isn't
> documented.  Standard stands that it's implementation-depended
> indetifier of object holding key-value pair.  The way of its
> calculation is also not clear from the code.  Why do we need constant
> of 10000000000?
>
>                 id = jb->type != jbvBinary ? 0 :
>                     (int64)((char *) jb->val.binary.data -
>                             (char *) cxt->baseObject.jbc);
>                 id += (int64) cxt->baseObject.id * INT64CONST(10000000000);
I've revising patchset bringing it to commitable shape.  The
intermediate result is attached.

0001-Preliminary-datetime-infrastructure-v24.patch

* Commit message & minor cleanup

0002-Jsonpath-engine-and-docs-v24.patch

 * Support of json is removed.  Current implementation is tricky and
cumbersome.  We need to design a suitable abstraction layer for that.
It should be done by separate patch applying not only to jsonpath, but
also to other jsonb functions lacking of json analogues.
* jsonpath_exists(jsonb, jsonpath[, jsonb]), jsonpath_predicate(jsonb,
jsonpath[, jsonb]), jsonpath_query(jsonb, jsonpath[, jsonb]),
jsonpath_wrapped(jsonb, jsonpath[, jsonb]) are documented.
* @* and @# operators are removed, @~ operator is renamed to @@.
* Commit message & minor cleanup.

I'll continue revising this patchset.  Nikita, could you please write
tests for 3-argument versions of functions?  Also, please answer the
question regarding "id" property.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

0001-Preliminary-datetime-infrastructure-v24.patch (44K) Download Attachment
0002-Jsonpath-engine-and-docs-v24.patch (384K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Alexander Korotkov
On Sun, Jan 20, 2019 at 6:30 AM Alexander Korotkov
<[hidden email]> wrote:
> I'll continue revising this patchset.  Nikita, could you please write
> tests for 3-argument versions of functions?  Also, please answer the
> question regarding "id" property.

I've some more notes regarding function set provided in jsonpath patch:
1) Function names don't follow the convention we have.  All our
functions dealing with jsonb have "jsonb_" prefix.  Exclusions have
"jsonb" in other part of name, for example, "to_jsonb(anyelement)".
We could name functions at SQL level in the same way they are named in
C.  So, they would be jsonb_jsonpath_exists() etc.  But it's probably
too long.  What about jsonb_path_exists() etc?
2) jsonpath_query_wrapped() name seems counter intuitive for me.  What
about jsonpath_query_array()?
3) jsonpath_query_wrapped() behavior looks tricky to me.  It returns
NULL for no results.  When result item is one, it is returned "as is".
When there are two or more result items, they are wrapped into array.
This representation of result seems extremely inconvenient for me.  It
becomes hard to solve even trivial tasks with that: for instance,
iterate all items found.  Without extra assumptions on what is going
to be returned it's also impossible for caller to distinguish single
array item found from multiple items found wrapped into array.  And
that seems very bad.  I know this behavior is inspired by SQL/JSON
standard.  But since these functions are anyway our extension, we can
implement them as we like.  So, I propose to make this function always
return array of items regardless on count of those items (even for 1
or 0 items).
4) If we change behavior of jsonpath_query_wrapped() as above, we can
also implement jsonpath_query_one() function, which would return first
result item or NULL for no items.
Any thoughts?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Nikita Glukhov
In reply to this post by Alexander Korotkov

On 20.01.2019 2:45, Alexander Korotkov wrote:

3) How do we calculate the "id" property returned by keyvalue()
function?  It's not documented.  Even presence of "id" columns isn't
documented.  Standard stands that it's implementation-depended
indetifier of object holding key-value pair.  The way of its
calculation is also not clear from the code.  Why do we need constant
of 10000000000?

                id = jb->type != jbvBinary ? 0 :
                    (int64)((char *) jb->val.binary.data -
                            (char *) cxt->baseObject.jbc);
                id += (int64) cxt->baseObject.id * INT64CONST(10000000000);
I decided to construct object id from the two parts: base object id and its
binary offset in its base object's jsonb:

object_id = 10000000000 * base_object_id + object_offset_in_base_object


10000000000 (10^10) -- is a first round decimal number greater than 2^32
(maximal offset in jsonb).  Decimal multiplier is used here to improve the
readability of identifiers.


Base object is usually a root object of the path: context item '$' or path
variable '$var', literals can't produce objects for now.  But if the path
contains generated objects (.keyvalue() itself, for example), then they become
base object for the subsequent .keyvalue().  See example:

'$.a.b.keyvalue().value.keyvalue()' :
 - base for the first .keyvalue() is '$'
 - base for the second .keyvalue() is '$.a.b.keyvalue()'


Id of '$' is 0.
Id of '$var' is its ordinal (positive) number in the list of variables.
Ids for generated objects are assigned using global counter
'JsonPathExecContext.generatedObjectId' starting from 'number_of_vars + 1'.


Corresponding comments will be added in the upcoming version of the patches.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Reply | Threaded
Open this post in threaded view
|

Re: jsonpath

Oleg Bartunov-3
In reply to this post by Alexander Korotkov
On Mon, Jan 21, 2019 at 1:40 AM Alexander Korotkov
<[hidden email]> wrote:

>
> On Sun, Jan 20, 2019 at 6:30 AM Alexander Korotkov
> <[hidden email]> wrote:
> > I'll continue revising this patchset.  Nikita, could you please write
> > tests for 3-argument versions of functions?  Also, please answer the
> > question regarding "id" property.
>
> I've some more notes regarding function set provided in jsonpath patch:
> 1) Function names don't follow the convention we have.  All our
> functions dealing with jsonb have "jsonb_" prefix.  Exclusions have
> "jsonb" in other part of name, for example, "to_jsonb(anyelement)".
> We could name functions at SQL level in the same way they are named in
> C.  So, they would be jsonb_jsonpath_exists() etc.  But it's probably
> too long.  What about jsonb_path_exists() etc?

jsonpath_exists is similar to xpath_exists.
Actually, we could use jsonb_exists(jsonb, jsonpath, json), but then
we should specify the type of the second argument.

> 2) jsonpath_query_wrapped() name seems counter intuitive for me.  What
> about jsonpath_query_array()?

The question is should we try to provide a functional interface for
all options of
JSON_QUERY clause ?  The same is for  other SQL/JSON clauses.
Currently,  patch contains very limited subset of JSON_QUERY
functionality,  mostly for jsonpath testing.

> 3) jsonpath_query_wrapped() behavior looks tricky to me.  It returns
> NULL for no results.  When result item is one, it is returned "as is".
> When there are two or more result items, they are wrapped into array.
> This representation of result seems extremely inconvenient for me.  It
> becomes hard to solve even trivial tasks with that: for instance,
> iterate all items found.  Without extra assumptions on what is going
> to be returned it's also impossible for caller to distinguish single
> array item found from multiple items found wrapped into array.  And
> that seems very bad.  I know this behavior is inspired by SQL/JSON
> standard.  But since these functions are anyway our extension, we can
> implement them as we like.  So, I propose to make this function always
> return array of items regardless on count of those items (even for 1
> or 0 items).

Fair enough, but if we agree, that we provide an exact functionality of
SQL clauses, then better to follow the standard to avoid problems.


> 4) If we change behavior of jsonpath_query_wrapped() as above, we can
> also implement jsonpath_query_one() function, which would return first
> result item or NULL for no items.
> Any thoughts?

I think, that we should postpone this functional interface, which could be
added later.

>
> ------
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company



--
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

123456 ... 10