Removing [Merge]Append nodes which contain a single subpath

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
22 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Removing [Merge]Append nodes which contain a single subpath

David Rowley-3
It seems like a good idea for the planner not to generate Append or
MergeAppend paths when the node contains just a single subpath. If we
were able to remove these then the planner would have more flexibility
to build a better plan.

As of today, because we include this needless [Merge]Append node, we
cannot parallelise scans below the Append. We must also include a
Materialize node in Merge Joins since both MergeAppend and Append
don't support mark and restore. Also, as shown in [1], there's an
overhead to pulling tuples through nodes.

I've been looking into resolving this issue but the ways I've explored
so far seems to be bending the current planner a bit out of shape.

Method 1:

In set_append_rel_size() detect when just a single subpath would be
added to the Append path. Set a promotion_child RelOptInfo field in
the base rel's RelOptInfo, and during make_one_rel, after
set_base_rel_sizes() modify the joinlist to get rid of the parent and
use the child instead.

This pretty much breaks the assumption we have that the finalrel will
have all the relids to root->all_baserels. We have an Assert in
make_one_rel() which checks this is true.

There are also complications around set_rel_size() generating paths
for subqueries which may be parameterized paths with the Append parent
required for the parameterization to work.

Method 2:

Invent a ProxyPath concept that allows us to add Paths belonging to
one relation to another relation. The ProxyPaths can have
translation_vars to translate targetlists to reference the correct
Vars. These ProxyPaths could exist up as far as createplan, where we
could perform the translation and build the ProxyPath's subnode
instead.

This method is not exactly very clean either as there are various
places around the planner we'd need to give special treatment to these
ProxyPaths, such as is_projection_capable_path() would need to return
the projection capability of the subpath within the ProxyPath since we
never actually create a "Proxy" node.

Probably either of these two methods could be made to work. I prefer
method 2, since that infrastructure could one day be put towards
scanning a Materialized View instead of the relation. in order to
satisfy some query. However, method 2 appears to also require some Var
translation in Path targetlists which contain this Proxy path, either
that or some global Var replacement would need to be done during
setrefs.

I'm posting here in the hope that it will steer my development of this
in a direction that's acceptable to the community.

Perhaps there is also a "Method 3" which I've not thought about which
would be much cleaner.

[1] https://www.postgresql.org/message-id/CAKJS1f9UXdk6ZYyqbJnjFO9a9hyHKGW7B%3DZRh-rxy9qxfPA5Gw%40mail.gmail.com

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

Ashutosh Bapat
On Thu, Oct 26, 2017 at 3:29 AM, David Rowley
<[hidden email]> wrote:
> However, method 2 appears to also require some Var
> translation in Path targetlists which contain this Proxy path, either
> that or some global Var replacement would need to be done during
> setrefs.
>

For inheritance and partitioning, we translate parent expressions to
child expression by applying Var translations. The translated
expressions differ only in the Var nodes (at least for partitioning
this true, and I believe it to be true for inheritance). I have been
toying with an idea of having some kind of a (polymorphic?) Var node
which can represent parent and children. That will greatly reduce the
need to translate the expression trees and will also help in this
implementation. I am wondering, if ProxyPath could be helpful in
avoiding the cost of reparameterize_path_by_child() introduced for
partition-wise join.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

Robert Haas
In reply to this post by David Rowley-3
On Wed, Oct 25, 2017 at 11:59 PM, David Rowley
<[hidden email]> wrote:
> As of today, because we include this needless [Merge]Append node, we
> cannot parallelise scans below the Append.

Without disputing your general notion that singe-child Append or
MergeAppend nodes are a pointless waste, how does a such a needless
node prevent parallelizing scans beneath it?

> Invent a ProxyPath concept that allows us to add Paths belonging to
> one relation to another relation. The ProxyPaths can have
> translation_vars to translate targetlists to reference the correct
> Vars. These ProxyPaths could exist up as far as createplan, where we
> could perform the translation and build the ProxyPath's subnode
> instead.

This idea sounds like it might have some legs, although I'm not sure
"proxy" is really the right terminology.

Like both you and Ashutosh, I suspect that there might be some other
applications for this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

Antonin Houska-2
In reply to this post by David Rowley-3
David Rowley <[hidden email]> wrote:

> Method 1:
>
> In set_append_rel_size() detect when just a single subpath would be
> added to the Append path.

I spent some time reviewing the partition-wise join patch during the last CF
and have an impression that set_append_rel_size() is called too early to find
out how many child paths will eventually exist if Append represents a join of
a partitioned table. I think the partition matching takes place at later stage
and that determines the actual number of partitions, and therefore the number
of Append subpaths.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

David Rowley-3
In reply to this post by Robert Haas
On 26 October 2017 at 23:30, Robert Haas <[hidden email]> wrote:
> On Wed, Oct 25, 2017 at 11:59 PM, David Rowley
> <[hidden email]> wrote:
>> As of today, because we include this needless [Merge]Append node, we
>> cannot parallelise scans below the Append.
>
> Without disputing your general notion that singe-child Append or
> MergeAppend nodes are a pointless waste, how does a such a needless
> node prevent parallelizing scans beneath it?

hmm, I think I was wrong about that now. I had been looking over the
regression test diffs after having made tenk1 a partitioned table with
a single partition containing all the rows, but it looks like I read
the diff backwards. The planner actually parallelized the Append
version, not the non-Append version, like I had previously thought.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

David Rowley-3
In reply to this post by Antonin Houska-2
On 26 October 2017 at 23:42, Antonin Houska <[hidden email]> wrote:

> David Rowley <[hidden email]> wrote:
>
>> Method 1:
>>
>> In set_append_rel_size() detect when just a single subpath would be
>> added to the Append path.
>
> I spent some time reviewing the partition-wise join patch during the last CF
> and have an impression that set_append_rel_size() is called too early to find
> out how many child paths will eventually exist if Append represents a join of
> a partitioned table. I think the partition matching takes place at later stage
> and that determines the actual number of partitions, and therefore the number
> of Append subpaths.

I understand that we must wait until set_append_rel_size() is being
called on the RELOPT_BASEREL since partitions may themselves be
partitioned tables and we'll only know what's left after we've
recursively checked all partitions of the baserel.

I've not studied the partition-wise join code yet, but if we've
eliminated all but one partition in set_append_rel_size() then I
imagine we'd just need to ensure partition-wise join is not attempted
since we'd be joining directly to the only partition anyway.


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

Ashutosh Bapat
On Thu, Oct 26, 2017 at 5:07 PM, David Rowley
<[hidden email]> wrote:

> On 26 October 2017 at 23:42, Antonin Houska <[hidden email]> wrote:
>> David Rowley <[hidden email]> wrote:
>>
>>> Method 1:
>>>
>>> In set_append_rel_size() detect when just a single subpath would be
>>> added to the Append path.
>>
>> I spent some time reviewing the partition-wise join patch during the last CF
>> and have an impression that set_append_rel_size() is called too early to find
>> out how many child paths will eventually exist if Append represents a join of
>> a partitioned table. I think the partition matching takes place at later stage
>> and that determines the actual number of partitions, and therefore the number
>> of Append subpaths.
>
> I understand that we must wait until set_append_rel_size() is being
> called on the RELOPT_BASEREL since partitions may themselves be
> partitioned tables and we'll only know what's left after we've
> recursively checked all partitions of the baserel.
>
> I've not studied the partition-wise join code yet, but if we've
> eliminated all but one partition in set_append_rel_size() then I
> imagine we'd just need to ensure partition-wise join is not attempted
> since we'd be joining directly to the only partition anyway.
>

I think Antonin has a point. The join processing may deem some base
relations dummy (See populate_joinrel_with_paths()). So an Append
relation which had multiple child alive at the end of
set_append_rel_size() might ultimately have only one child after
partition-wise join has worked on it.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

Tom Lane-2
In reply to this post by David Rowley-3
David Rowley <[hidden email]> writes:
> It seems like a good idea for the planner not to generate Append or
> MergeAppend paths when the node contains just a single subpath.
> ...
> Perhaps there is also a "Method 3" which I've not thought about which
> would be much cleaner.

Have you considered turning AppendPath with a single child into more
of a special case?  I think this is morally equivalent to your ProxyPath
proposal, but it would take less new boilerplate code to support.
If you taught create_append_path to inherit the child's pathkeys
when there's only one child, and taught createplan.c to skip making the
useless Append node, I think you might be nearly done.

This might seem a bit ugly, but considering that zero-child AppendPaths
are already a special case (and a much bigger one), I don't have much
of a problem with decreeing that one-child is also a special case.

As an example of why this might be better than a separate ProxyPath
node type, consider this call in add_paths_to_append_rel:

    /*
     * If we found unparameterized paths for all children, build an unordered,
     * unparameterized Append path for the rel.  (Note: this is correct even
     * if we have zero or one live subpath due to constraint exclusion.)
     */
    if (subpaths_valid)
        add_path(rel, (Path *) create_append_path(rel, subpaths, NULL, 0,
                                                  partitioned_rels));

That comment could stand a bit of adjustment with this approach, but
the code itself requires no extra work.

Now, you would have to do something about this in create_append_plan:

    /*
     * XXX ideally, if there's just one child, we'd not bother to generate an
     * Append node but just return the single child.  At the moment this does
     * not work because the varno of the child scan plan won't match the
     * parent-rel Vars it'll be asked to emit.
     */

but that's going to come out in the wash someplace, no matter what you do.
Leaving it for the last minute is likely to reduce the number of places
you have to teach about this.

                        regards, tom lane


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

David Rowley-3
In reply to this post by Ashutosh Bapat
On 27 October 2017 at 01:44, Ashutosh Bapat
<[hidden email]> wrote:
> I think Antonin has a point. The join processing may deem some base
> relations dummy (See populate_joinrel_with_paths()). So an Append
> relation which had multiple child alive at the end of
> set_append_rel_size() might ultimately have only one child after
> partition-wise join has worked on it.

hmm, I see. partition-wise join is able to remove eliminate partitions
on the other side of the join that can't be matched to:

# set enable_partition_wise_join=on;
SET
# explain select count(*) from ab ab1 inner join ab ab2 on ab1.a=ab2.a
where ab1.a between 1 and 2 and ab2.a between 100000000 and 100000001;
                   QUERY PLAN
------------------------------------------------
 Aggregate  (cost=0.00..0.01 rows=1 width=8)
   ->  Result  (cost=0.00..0.00 rows=0 width=0)
         One-Time Filter: false
(3 rows)

# \d+ ab
                                    Table "public.ab"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats
target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 a      | integer |           |          |         | plain   |              |
 b      | integer |           |          |         | plain   |              |
Partition key: RANGE (a)
Partitions: ab_p1 FOR VALUES FROM (1) TO (100000000),
            ab_p2 FOR VALUES FROM (100000000) TO (200000000)



--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

Tom Lane-2
David Rowley <[hidden email]> writes:
> On 27 October 2017 at 01:44, Ashutosh Bapat
> <[hidden email]> wrote:
>> I think Antonin has a point. The join processing may deem some base
>> relations dummy (See populate_joinrel_with_paths()). So an Append
>> relation which had multiple child alive at the end of
>> set_append_rel_size() might ultimately have only one child after
>> partition-wise join has worked on it.

TBH, that means partition-wise join planning is broken and needs to be
redesigned.  It's impossible that we're going to make sane planning
choices if the sizes of relations change after we've already done some
planning work.

                        regards, tom lane


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

Ashutosh Bapat
In reply to this post by David Rowley-3
On Fri, Oct 27, 2017 at 12:49 AM, David Rowley
<[hidden email]> wrote:

> On 27 October 2017 at 01:44, Ashutosh Bapat
> <[hidden email]> wrote:
>> I think Antonin has a point. The join processing may deem some base
>> relations dummy (See populate_joinrel_with_paths()). So an Append
>> relation which had multiple child alive at the end of
>> set_append_rel_size() might ultimately have only one child after
>> partition-wise join has worked on it.
>
> hmm, I see. partition-wise join is able to remove eliminate partitions
> on the other side of the join that can't be matched to:
>
> # set enable_partition_wise_join=on;
> SET
> # explain select count(*) from ab ab1 inner join ab ab2 on ab1.a=ab2.a
> where ab1.a between 1 and 2 and ab2.a between 100000000 and 100000001;
>                    QUERY PLAN
> ------------------------------------------------
>  Aggregate  (cost=0.00..0.01 rows=1 width=8)
>    ->  Result  (cost=0.00..0.00 rows=0 width=0)
>          One-Time Filter: false
> (3 rows)
>
> # \d+ ab
>                                     Table "public.ab"
>  Column |  Type   | Collation | Nullable | Default | Storage | Stats
> target | Description
> --------+---------+-----------+----------+---------+---------+--------------+-------------
>  a      | integer |           |          |         | plain   |              |
>  b      | integer |           |          |         | plain   |              |
> Partition key: RANGE (a)
> Partitions: ab_p1 FOR VALUES FROM (1) TO (100000000),
>             ab_p2 FOR VALUES FROM (100000000) TO (200000000)
>

IIUC, ab1_p2 and ab2_p1 are marked dummy by constraint exclusion.
Because of partition-wise join when the corresponding partitions are
joined, their inner joins are marked dummy resulting in an empty join.
This isn't the case of a join processing marking one of the joining
relations dummy, that I was referring to. But I think it's relevant
here since partition-wise join might create single child joins this
way.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

Ashutosh Bapat
In reply to this post by Tom Lane-2
On Fri, Oct 27, 2017 at 1:36 AM, Tom Lane <[hidden email]> wrote:

> David Rowley <[hidden email]> writes:
>> On 27 October 2017 at 01:44, Ashutosh Bapat
>> <[hidden email]> wrote:
>>> I think Antonin has a point. The join processing may deem some base
>>> relations dummy (See populate_joinrel_with_paths()). So an Append
>>> relation which had multiple child alive at the end of
>>> set_append_rel_size() might ultimately have only one child after
>>> partition-wise join has worked on it.
>
> TBH, that means partition-wise join planning is broken and needs to be
> redesigned.  It's impossible that we're going to make sane planning
> choices if the sizes of relations change after we've already done some
> planning work.

The mark_dummy_rel() call that marks a joining relation dummy was
added by 719012e013f10f72938520c46889c52df40fa329. The joining
relations are planned before the joins they participate in are
planned. Further some of the joins in which it participates might have
been planned before the joining pair, which causes it to be marked
dummy, is processed by populate_joinrel_with_paths(). So we already
have places where the sizes of relation changes during planning.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

David Rowley-3
In reply to this post by Tom Lane-2
On 27 October 2017 at 04:47, Tom Lane <[hidden email]> wrote:

> David Rowley <[hidden email]> writes:
>> It seems like a good idea for the planner not to generate Append or
>> MergeAppend paths when the node contains just a single subpath.
>> ...
>> Perhaps there is also a "Method 3" which I've not thought about which
>> would be much cleaner.
>
> Have you considered turning AppendPath with a single child into more
> of a special case?  I think this is morally equivalent to your ProxyPath
> proposal, but it would take less new boilerplate code to support.
> If you taught create_append_path to inherit the child's pathkeys
> when there's only one child, and taught createplan.c to skip making the
> useless Append node, I think you might be nearly done.
That seems reasonable, although after going and implementing this, it
does not seem quite as convenient as dummy paths are. I had to make
the  AppendPath carries a bit more weight than I originally thought.
It now stores translate to/from expressions and also a single bool
(isproxy) to mark if it's a proxy type Append. I tried to do without
this bool and just check if there's a single subpath and some
translation expressions, but that was getting fooled by Paths which
had empty targetlists which meant there were no translations, and
going on the single subpath alone is no good as do we create append
paths more places than just add_paths_to_append_rel().

I've attached a patch which goes and implements all this. I think it
still needs a bit of work. I'm not 100% confident I'm translating
expressions in all the required places in createplan.c. I did modify
tenk1 to become the parent of a single partition and ran the
regression tests. There appear to be 4 subtle differences to querying
a parent with an only-child vs querying the child directly.

1. EXPLAIN VERBOSE shows the columns prefixed by the table name. This
is caused by "useprefix = list_length(es->rtable) > 1;" in explain.c.
2. There's a small change in an error message in portals.out. "ERROR:
cursor "c" is not a simply updatable scan of table "tenk1a"" now
mentions the child table instead of the parent. Nothing to worry about
as it appears to already do this with native partitioning.
3. Some plans are parallelized that were not before (see join.out).
This is caused by compute_parallel_worker() having a special case for
non-baserels.
4. Some gating plans don't work the same as they used to, this is
highlighted in aggregates.out and is caused by child rel Const TRUE
baserestrictinfo clauses being dropped in set_append_rel_size.

For differences 2-4, I've attached another file which modifies tenk1
to become a partitioned table with a single partition. I've modified
all the expected results for #1 to reduce the noise, so 3 tests are
still failing for reasons 2-4. I'm not at all concerned with #2. #3
might not be worth fixing, I've just no idea how yet and #4 seems like
it's on purpose.

Other things I'm not 100% happy with are; in prepunion.c I had to
invent a new function named find_all_appinfos_for_child, which is a
bit similar to adjust_appendrel_attrs_multilevel, only it modifies the
attributes, but the use I have for find_all_appinfos_for_child
includes also calling add_child_rel_equivalences for the children at
each level. Probably a bit more thinking will come up with some way to
modify the existing function sets to be more reusable, I've just not
done that yet.

I decided that in createplan.c to have a generic expression translator
rather than something that only understands AppendRelInfos. The reason
for this is two-fold:

1. There may be multiple AppendRelInfos required for a single
translation between a parent and child if there are multiple other
children in between. Seems wasteful to have intermediate translations.
2. I have other uses in mind for this feature that are not Append
child relations.

A sanity check on what I have would be most welcome if anyone has time.

I'm going to add this to the November commitfest.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

remove_singleton_appends_2017-10-31.patch (65K) Download Attachment
remove_singleton_appends_examples_of_differences_2017-10-31.patch (113K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

David Rowley-3
On 1 November 2017 at 02:47, David Rowley <[hidden email]> wrote:
> On 27 October 2017 at 04:47, Tom Lane <[hidden email]> wrote:
>> Have you considered turning AppendPath with a single child into more
>> of a special case?  I think this is morally equivalent to your ProxyPath
>> proposal, but it would take less new boilerplate code to support.
>> If you taught create_append_path to inherit the child's pathkeys
>> when there's only one child, and taught createplan.c to skip making the
>> useless Append node, I think you might be nearly done.

I've made another pass over this patch today to try to ensure it's in
a better state. I'm away on leave until the end of the month starting
from this Friday, so likely I won't get any time to resolve any issues
found until maybe the last day of the month.

I've attached an updated version of the patch which fixes a bug that I
discovered today where I had neglected to translate the quals to a
grouping set. I noticed this when I made all the main regression
tables partitioned tables containing a single child partition with
MINVALUE to MAXVALUE range. The fact that all plans now appear to work
as planned after having removed the Append has given me a bit more
confidence that I've not missed any places to translate Vars, but I'm
still not 100% certain. I just can't think of any other way or testing
that at the moment. Any that I have missed should result in an error
like "variable not found in subplan target list"

I've also modified the patch to make use of some existing functions in
prepunion.c which gather up AppendRelInfos for a childrel. I'd
previously invented my own function because the existing ones were not
quite exactly what I needed. I'd planned to come back to it, but only
just got around to fixing that.

Apart from that, the changes are just in the code comments.

The remove_singleton_appends_examples_of_differences_2017-11-15.patch
which I've attached applies changes to the regression tests to make
many of the major tables partitioned tables with a single partition.
It also changes the expected results of the small insignificant plan
changes and leaves only the actual true differences. This file is not
intended for commit, but more just a demo of it working for a larger
variety of plan shapes. There is actually no additional tests with the
patch. It relies on the places in the existing tests which have
changed.  I didn't add any extra as I can't think of any particular
area to target given that it's such a generic thing that can apply to
so many different cases.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

remove_singleton_appends_2017-11-15.patch (67K) Download Attachment
remove_singleton_appends_examples_of_differences_2017-11-15.patch (138K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

Michael Paquier
On Wed, Nov 15, 2017 at 3:17 PM, David Rowley
<[hidden email]> wrote:

> The remove_singleton_appends_examples_of_differences_2017-11-15.patch
> which I've attached applies changes to the regression tests to make
> many of the major tables partitioned tables with a single partition.
> It also changes the expected results of the small insignificant plan
> changes and leaves only the actual true differences. This file is not
> intended for commit, but more just a demo of it working for a larger
> variety of plan shapes. There is actually no additional tests with the
> patch. It relies on the places in the existing tests which have
> changed.  I didn't add any extra as I can't think of any particular
> area to target given that it's such a generic thing that can apply to
> so many different cases.

The last patch set does not apply and did not get any reviews. So I am
moving this item to the next with waiting on author as status.
--
Michael

Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

David Rowley-3
On 30 November 2017 at 15:34, Michael Paquier <[hidden email]> wrote:
On Wed, Nov 15, 2017 at 3:17 PM, David Rowley
<[hidden email]> wrote:
> The remove_singleton_appends_examples_of_differences_2017-11-15.patch
> which I've attached applies changes to the regression tests to make
> many of the major tables partitioned tables with a single partition.
> It also changes the expected results of the small insignificant plan
> changes and leaves only the actual true differences. This file is not
> intended for commit, but more just a demo of it working for a larger
> variety of plan shapes. There is actually no additional tests with the
> patch. It relies on the places in the existing tests which have
> changed.  I didn't add any extra as I can't think of any particular
> area to target given that it's such a generic thing that can apply to
> so many different cases.

The last patch set does not apply and did not get any reviews. So I am
moving this item to the next with waiting on author as status.

(returning from 2 weeks leave)

Thanks for managing the commitfest.

I've attached a patch which fixes the conflict with the regression test expected output in inherits.out. I've also made changes to the expected output in the new partition_prune test's expected output.

I'll set this back to waiting on review now.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

remove_append_rel_2017-11-30.patch (80K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

David Rowley-3
On 30 November 2017 at 16:04, David Rowley <[hidden email]> wrote:
> I've attached a patch which fixes the conflict with the regression test
> expected output in inherits.out. I've also made changes to the expected
> output in the new partition_prune test's expected output.

I've attached an updated patch which fixes the conflicts caused by ab727167.

Also, I've attached a patch which is not intended for commit which
shows the subtle differences between directly querying a partitioned
table with a single remaining leaf partition to scan vs directly
querying the leaf partition itself. The regression test failures seen
with both patches applied highlight the differences.

While rebasing this today I also noticed that we won't properly detect
unique joins in add_paths_to_joinrel() as we're still testing for
uniqueness against the partitioned parent rather than the only child.
This is likely not a huge problem since we'll always get a false
negative and never a false positive, but it is a missing optimisation.
I've not thought of how to solve it yet, it's perhaps not worth going
to too much trouble over.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

remove_singleton_appends_2017-12-07.patch (81K) Download Attachment
remove_singleton_appends_examples_of_differences_2017-12-07.patch (148K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

Ashutosh Bapat
On Thu, Dec 7, 2017 at 5:11 AM, David Rowley
<[hidden email]> wrote:
>
> While rebasing this today I also noticed that we won't properly detect
> unique joins in add_paths_to_joinrel() as we're still testing for
> uniqueness against the partitioned parent rather than the only child.
> This is likely not a huge problem since we'll always get a false
> negative and never a false positive, but it is a missing optimisation.
> I've not thought of how to solve it yet, it's perhaps not worth going
> to too much trouble over.
>

It's only the jointrelids that are parent's relids, but all the
following tests for unique join use RelOptInfo::relids which are child
relids.

        case JOIN_UNIQUE_INNER:
            extra.inner_unique = bms_is_subset(sjinfo->min_lefthand,
                                               outerrel->relids);
            break;
        case JOIN_UNIQUE_OUTER:
            extra.inner_unique = innerrel_is_unique(root,
                                                    outerrel->relids,
                                                    innerrel,
                                                    JOIN_INNER,
                                                    restrictlist,
                                                    false);
            break;
        default:
            extra.inner_unique = innerrel_is_unique(root,
                                                    outerrel->relids,
                                                    innerrel,
                                                    jointype,
                                                    restrictlist,
                                                    false);
            break;

Do you have a testcase, which shows the problem?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

David Rowley-3
On 11 December 2017 at 21:18, Ashutosh Bapat
<[hidden email]> wrote:

> On Thu, Dec 7, 2017 at 5:11 AM, David Rowley
> <[hidden email]> wrote:
>> While rebasing this today I also noticed that we won't properly detect
>> unique joins in add_paths_to_joinrel() as we're still testing for
>> uniqueness against the partitioned parent rather than the only child.
>> This is likely not a huge problem since we'll always get a false
>> negative and never a false positive, but it is a missing optimisation.
>> I've not thought of how to solve it yet, it's perhaps not worth going
>> to too much trouble over.
>>
>
[...]

> Do you have a testcase, which shows the problem?

I do indeed:

create table p (a int not null) partition by range (a);
create table p1 partition of p for values from (minvalue) to (maxvalue);
create unique index on p1 (a);
create table t (a int not null);

insert into p values(1),(2);
insert into t values(1);

analyze p;
analyze t;

explain (verbose, costs off) select * from p inner join t on p.a = t.a;
         QUERY PLAN
-----------------------------
 Nested Loop
   Output: p1.a, t.a
   Join Filter: (p1.a = t.a)
   ->  Seq Scan on public.t
         Output: t.a
   ->  Seq Scan on public.p1
         Output: p1.a
(7 rows)

explain (verbose, costs off) select * from p1 inner join t on p1.a = t.a;
         QUERY PLAN
-----------------------------
 Nested Loop
   Output: p1.a, t.a
   Inner Unique: true
   Join Filter: (p1.a = t.a)
   ->  Seq Scan on public.t
         Output: t.a
   ->  Seq Scan on public.p1
         Output: p1.a
(8 rows)

Notice that when we join to the partitioned table directly and the
Append is removed, we don't get the "Inner Unique: true"

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Removing [Merge]Append nodes which contain a single subpath

Ashutosh Bapat
On Mon, Dec 11, 2017 at 2:42 PM, David Rowley
<[hidden email]> wrote:

> On 11 December 2017 at 21:18, Ashutosh Bapat
> <[hidden email]> wrote:
>> On Thu, Dec 7, 2017 at 5:11 AM, David Rowley
>> <[hidden email]> wrote:
>>> While rebasing this today I also noticed that we won't properly detect
>>> unique joins in add_paths_to_joinrel() as we're still testing for
>>> uniqueness against the partitioned parent rather than the only child.
>>> This is likely not a huge problem since we'll always get a false
>>> negative and never a false positive, but it is a missing optimisation.
>>> I've not thought of how to solve it yet, it's perhaps not worth going
>>> to too much trouble over.
>>>
>>
> [...]
>
>> Do you have a testcase, which shows the problem?
>
> I do indeed:
>
> create table p (a int not null) partition by range (a);
> create table p1 partition of p for values from (minvalue) to (maxvalue);
> create unique index on p1 (a);
> create table t (a int not null);
>
> insert into p values(1),(2);
> insert into t values(1);
>
> analyze p;
> analyze t;
>
> explain (verbose, costs off) select * from p inner join t on p.a = t.a;
>          QUERY PLAN
> -----------------------------
>  Nested Loop
>    Output: p1.a, t.a
>    Join Filter: (p1.a = t.a)
>    ->  Seq Scan on public.t
>          Output: t.a
>    ->  Seq Scan on public.p1
>          Output: p1.a
> (7 rows)
>
> explain (verbose, costs off) select * from p1 inner join t on p1.a = t.a;
>          QUERY PLAN
> -----------------------------
>  Nested Loop
>    Output: p1.a, t.a
>    Inner Unique: true
>    Join Filter: (p1.a = t.a)
>    ->  Seq Scan on public.t
>          Output: t.a
>    ->  Seq Scan on public.p1
>          Output: p1.a
> (8 rows)
>
> Notice that when we join to the partitioned table directly and the
> Append is removed, we don't get the "Inner Unique: true"

Ok. I thought we don't use unique joins in partition-wise joins. Sorry
for the misunderstanding.

I think this didn't work with inheritance before partition-wise join
as well for the same reason.

Probably, we could do it if unique paths (sorted paths which can be
unique-ified) bubble up the Append hierarchy. We already have code in
add_paths_to_append_rel() to create sorted paths when even a single
child has an index on it.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

12
Previous Thread Next Thread