SEARCH and CYCLE clauses

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

SEARCH and CYCLE clauses

Peter Eisentraut-6
I have implemented the SEARCH and CYCLE clauses.

This is standard SQL syntax attached to a recursive CTE to compute a
depth- or breadth-first ordering and cycle detection, respectively.
This is just convenience syntax for what you can already do manually.
The original discussion about recursive CTEs briefly mentioned these as
something to do later but then it was never mentioned again.

SQL specifies these in terms of syntactic transformations, and so that's
how I have implemented them also, mainly in the rewriter.

I have successfully tested this against examples I found online that
were aimed at DB2.

The contained documentation and the code comment in rewriteHandler.c
explain the details.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v1-0001-SEARCH-and-CYCLE-clauses.patch (107K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: SEARCH and CYCLE clauses

Vik Fearing-6
On 5/20/20 1:46 PM, Peter Eisentraut wrote:
> I have implemented the SEARCH and CYCLE clauses.

YES!

> This is standard SQL syntax attached to a recursive CTE to compute a
> depth- or breadth-first ordering and cycle detection, respectively. This
> is just convenience syntax for what you can already do manually. The
> original discussion about recursive CTEs briefly mentioned these as
> something to do later but then it was never mentioned again.
>
> SQL specifies these in terms of syntactic transformations, and so that's
> how I have implemented them also, mainly in the rewriter.

I've attempted to do this several times but didn't get anywhere with it.
 I'm looking forward to reviewing this.

(And maybe it will put me on the right path for implementing <unique
predicate>.)
--
Vik Fearing


Reply | Threaded
Open this post in threaded view
|

Re: SEARCH and CYCLE clauses

Vik Fearing-6
On 5/20/20 3:04 PM, Vik Fearing wrote:

> I'm looking forward to reviewing this.
A few quick things I've noticed so far:

1)
There are some smart quotes in the comments that should be converted to
single quotes.


2)
This query is an infinite loop, as expected:

  with recursive a as (select 1 as b union all select b from a)
  table a;

But it becomes an error when you add a cycle clause to it:

  with recursive a as (select 1 as b union all table a)
    cycle b set c to true default false using p
  table a;

  ERROR:  each UNION query must have the same number of columns

The same error occurs with a search clause.


3)
If I take the same infinite loop query but replace the TABLE syntax with
a SELECT and add a cycle clause, it's not an infinite loop anymore.

  with recursive a as (select 1 as b union all select b from a)
    cycle b set c to true default false using p
  table a;

 b | c |     p
---+---+-----------
 1 | f | {(1)}
 1 | t | {(1),(1)}
(2 rows)

Why does it stop?  It should still be an infinite loop.


4)
If I use NULL instead of false, I only get one row back.

  with recursive a as (select 1 as b union all select b from a)
    cycle b set c to true default false using p
  table a;

 b | c |   p
---+---+-------
 1 |   | {(1)}
(1 row)


5)
I can set both states to the same value.

  with recursive a as (select 1 as b union all select b from a)
    cycle b set c to true default true using p
  table a;

 b | c |   p
---+---+-------
 1 | t | {(1)}
(1 row)

This is a direct violation of 7.18 SR 2.b.ii.3 as well as common sense.
 BTW, I applaud your decision to violate the other part of that rule and
allowing any data type here.


5)
The same rule as above says that the value and the default value must be
literals but not everything that a human might consider a literal is
accepted.  In particular:

  with recursive a as (select 1 as b union all select b from a)
    cycle b set c to 1 default -1 using p
  table a;

  ERROR:  syntax error at or near "-"

Can we just accept a full a_expr here instead of AexprConst?  Both
DEFAULT and USING are fully reserved keywords.



That's all for now; will test more later.
Thanks for working on this!
--
Vik Fearing


Reply | Threaded
Open this post in threaded view
|

Re: SEARCH and CYCLE clauses

Peter Eisentraut-6
On 2020-05-20 21:28, Vik Fearing wrote:
> 1)
> There are some smart quotes in the comments that should be converted to
> single quotes.

ok, fixing that

> 2)
> This query is an infinite loop, as expected:
>
>    with recursive a as (select 1 as b union all select b from a)
>    table a;
>
> But it becomes an error when you add a cycle clause to it:
>
>    with recursive a as (select 1 as b union all table a)
>      cycle b set c to true default false using p
>    table a;
>
>    ERROR:  each UNION query must have the same number of columns

table a expands to select * from a, and if you have a cycle clause, then
a has three columns, but the other branch of the union only has one, so
that won't work anymore, will it?

> 3)
> If I take the same infinite loop query but replace the TABLE syntax with
> a SELECT and add a cycle clause, it's not an infinite loop anymore.
>
>    with recursive a as (select 1 as b union all select b from a)
>      cycle b set c to true default false using p
>    table a;
>
>   b | c |     p
> ---+---+-----------
>   1 | f | {(1)}
>   1 | t | {(1),(1)}
> (2 rows)
>
> Why does it stop?  It should still be an infinite loop.

If you specify the cycle clause, then the processing will stop if it
sees the same row more than once, which it did here.

> 4)
> If I use NULL instead of false, I only get one row back.
>
>    with recursive a as (select 1 as b union all select b from a)
>      cycle b set c to true default false using p
>    table a;
>
>   b | c |   p
> ---+---+-------
>   1 |   | {(1)}
> (1 row)

If you specify null, then the cycle check expression will always fail,
so it will abort after the first row.  (We should perhaps prohibit
specifying null, but see below.)

> 5)
> I can set both states to the same value.
>
>    with recursive a as (select 1 as b union all select b from a)
>      cycle b set c to true default true using p
>    table a;

> This is a direct violation of 7.18 SR 2.b.ii.3 as well as common sense.
>   BTW, I applaud your decision to violate the other part of that rule and
> allowing any data type here.
>
>
> 5)
> The same rule as above says that the value and the default value must be
> literals but not everything that a human might consider a literal is
> accepted.  In particular:
>
>    with recursive a as (select 1 as b union all select b from a)
>      cycle b set c to 1 default -1 using p
>    table a;
>
>    ERROR:  syntax error at or near "-"
>
> Can we just accept a full a_expr here instead of AexprConst?  Both
> DEFAULT and USING are fully reserved keywords.

This is something we need to think about.  If we want to check at parse
time whether the two values are not the same (and perhaps not null),
then we either need to restrict the generality of what we can specify,
or we need to be prepared to do full expression evaluation in the
parser.  A simple and practical way might be to only allow string and
boolean literal.  I don't have a strong opinion here.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: SEARCH and CYCLE clauses

Pavel Stehule


pá 22. 5. 2020 v 11:25 odesílatel Peter Eisentraut <[hidden email]> napsal:
On 2020-05-20 21:28, Vik Fearing wrote:
> 1)
> There are some smart quotes in the comments that should be converted to
> single quotes.

ok, fixing that

> 2)
> This query is an infinite loop, as expected:
>
>    with recursive a as (select 1 as b union all select b from a)
>    table a;
>
> But it becomes an error when you add a cycle clause to it:
>
>    with recursive a as (select 1 as b union all table a)
>      cycle b set c to true default false using p
>    table a;
>
>    ERROR:  each UNION query must have the same number of columns

table a expands to select * from a, and if you have a cycle clause, then
a has three columns, but the other branch of the union only has one, so
that won't work anymore, will it?

> 3)
> If I take the same infinite loop query but replace the TABLE syntax with
> a SELECT and add a cycle clause, it's not an infinite loop anymore.
>
>    with recursive a as (select 1 as b union all select b from a)
>      cycle b set c to true default false using p
>    table a;
>
>   b | c |     p
> ---+---+-----------
>   1 | f | {(1)}
>   1 | t | {(1),(1)}
> (2 rows)
>
> Why does it stop?  It should still be an infinite loop.

If you specify the cycle clause, then the processing will stop if it
sees the same row more than once, which it did here.

> 4)
> If I use NULL instead of false, I only get one row back.
>
>    with recursive a as (select 1 as b union all select b from a)
>      cycle b set c to true default false using p
>    table a;
>
>   b | c |   p
> ---+---+-------
>   1 |   | {(1)}
> (1 row)

If you specify null, then the cycle check expression will always fail,
so it will abort after the first row.  (We should perhaps prohibit
specifying null, but see below.)

> 5)
> I can set both states to the same value.
>
>    with recursive a as (select 1 as b union all select b from a)
>      cycle b set c to true default true using p
>    table a;

> This is a direct violation of 7.18 SR 2.b.ii.3 as well as common sense.
>   BTW, I applaud your decision to violate the other part of that rule and
> allowing any data type here.
>
>
> 5)
> The same rule as above says that the value and the default value must be
> literals but not everything that a human might consider a literal is
> accepted.  In particular:
>
>    with recursive a as (select 1 as b union all select b from a)
>      cycle b set c to 1 default -1 using p
>    table a;
>
>    ERROR:  syntax error at or near "-"
>
> Can we just accept a full a_expr here instead of AexprConst?  Both
> DEFAULT and USING are fully reserved keywords.

This is something we need to think about.  If we want to check at parse
time whether the two values are not the same (and perhaps not null),
then we either need to restrict the generality of what we can specify,
or we need to be prepared to do full expression evaluation in the
parser.  A simple and practical way might be to only allow string and
boolean literal.  I don't have a strong opinion here.

if you check it in parse time, then you disallow parametrization there.

Is any reason to do this check in parse time?


--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: SEARCH and CYCLE clauses

Vik Fearing-6
In reply to this post by Peter Eisentraut-6
On 5/22/20 11:24 AM, Peter Eisentraut wrote:

> On 2020-05-20 21:28, Vik Fearing wrote:
>> 1)
>> There are some smart quotes in the comments that should be converted to
>> single quotes.
>
> ok, fixing that
>
>> 2)
>> This query is an infinite loop, as expected:
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>    table a;
>>
>> But it becomes an error when you add a cycle clause to it:
>>
>>    with recursive a as (select 1 as b union all table a)
>>      cycle b set c to true default false using p
>>    table a;
>>
>>    ERROR:  each UNION query must have the same number of columns
>
> table a expands to select * from a, and if you have a cycle clause, then
> a has three columns, but the other branch of the union only has one, so
> that won't work anymore, will it?

It seems there was a copy/paste error here.  The first query should have
been the same as the second but without the cycle clause.

It seems strange to me that adding a <search or cycle clause> would
break a previously working query.  I would rather see the * expanded
before adding the new columns.  This is a user's opinion, I don't know
how hard that would be to implement.

>> 3)
>> If I take the same infinite loop query but replace the TABLE syntax with
>> a SELECT and add a cycle clause, it's not an infinite loop anymore.
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>      cycle b set c to true default false using p
>>    table a;
>>
>>   b | c |     p
>> ---+---+-----------
>>   1 | f | {(1)}
>>   1 | t | {(1),(1)}
>> (2 rows)
>>
>> Why does it stop?  It should still be an infinite loop.
>
> If you specify the cycle clause, then the processing will stop if it
> sees the same row more than once, which it did here.

Yes, this was a misplaced expectation on my part.

>> 4)
>> If I use NULL instead of false, I only get one row back.
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>      cycle b set c to true default false using p
>>    table a;
>>
>>   b | c |   p
>> ---+---+-------
>>   1 |   | {(1)}
>> (1 row)
>
> If you specify null, then the cycle check expression will always fail,
> so it will abort after the first row.  (We should perhaps prohibit
> specifying null, but see below.)

I would rather make the cycle check expression work with null.

>> 5)
>> I can set both states to the same value.
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>      cycle b set c to true default true using p
>>    table a;
>
>> This is a direct violation of 7.18 SR 2.b.ii.3 as well as common sense.
>>   BTW, I applaud your decision to violate the other part of that rule and
>> allowing any data type here.
>>
>>
>> 5)
>> The same rule as above says that the value and the default value must be
>> literals but not everything that a human might consider a literal is
>> accepted.  In particular:
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>      cycle b set c to 1 default -1 using p
>>    table a;
>>
>>    ERROR:  syntax error at or near "-"
>>
>> Can we just accept a full a_expr here instead of AexprConst?  Both
>> DEFAULT and USING are fully reserved keywords.
>
> This is something we need to think about.  If we want to check at parse
> time whether the two values are not the same (and perhaps not null),
> then we either need to restrict the generality of what we can specify,
> or we need to be prepared to do full expression evaluation in the
> parser.  A simple and practical way might be to only allow string and
> boolean literal.  I don't have a strong opinion here.


I'm with Pavel on this one.  Why does it have to be a parse-time error?
 Also, I regularly see people write functions as sort of pauper's
variables, but a function call isn't allowed here.

----

Another bug I found.  If I try to do your regression test as an
autonomous query, I get this:

    with recursive

    graph (f, t, label) as (
        values (1, 2, 'arc 1 -> 2'),
               (1, 3, 'arc 1 -> 3'),
               (2, 3, 'arc 2 -> 3'),
               (1, 4, 'arc 1 -> 4'),
               (4, 5, 'arc 4 -> 5'),
               (5, 1, 'arc 5 -> 1')
    ),

    search_graph (f, t, label) as (
        select * from graph g
        union all
        select g.*
        from graph g, search_graph sg
        where g.f = sg.t
    )
    cycle f, t set is_cycle to true default false using path

    select * from search_graph;

    ERROR:  could not find CTE "graph"

----

As an improvement over the spec, I think the vast majority of people
will be using simple true/false values.  Can we make that optional?

    CYCLE f, t SET is_cycle USING path

would be the same as

    CYCLE f, t SET is_cycle TO true DEFAULT false USING path
--
Vik Fearing


Reply | Threaded
Open this post in threaded view
|

Re: SEARCH and CYCLE clauses

Peter Eisentraut-6
On 2020-05-22 12:40, Vik Fearing wrote:
>> If you specify null, then the cycle check expression will always fail,
>> so it will abort after the first row.  (We should perhaps prohibit
>> specifying null, but see below.)
>
> I would rather make the cycle check expression work with null.

It works correctly AFAICT.  What is your expectation?

>> This is something we need to think about.  If we want to check at parse
>> time whether the two values are not the same (and perhaps not null),
>> then we either need to restrict the generality of what we can specify,
>> or we need to be prepared to do full expression evaluation in the
>> parser.  A simple and practical way might be to only allow string and
>> boolean literal.  I don't have a strong opinion here.
>
>
> I'm with Pavel on this one.  Why does it have to be a parse-time error?
>   Also, I regularly see people write functions as sort of pauper's
> variables, but a function call isn't allowed here.

If not parse-time error, at what time do you want to check it?

> As an improvement over the spec, I think the vast majority of people
> will be using simple true/false values.  Can we make that optional?
>
>      CYCLE f, t SET is_cycle USING path
>
> would be the same as
>
>      CYCLE f, t SET is_cycle TO true DEFAULT false USING path

I was also considering that.  It would be an easy change to make.

(Apparently, in DB2 you can omit the USING path clause.  Not sure how to
make that work, however.)

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services