advanced partition matching algorithm for partition-wise join

classic Classic list List threaded Threaded
76 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

advanced partition matching algorithm for partition-wise join

Ashutosh Bapat
The patch-set in [1] supports partition-wise join when the partition bounds and
partition keys of the joining tables exactly match. The last two patches in the
last few patch-sets in that thread implement more
advanced partition matching code. In order to avoid mixing reviews for advanced
partition matching and the basic partition-wise join implementation, I am
starting a new thread to discuss the same. I am attaching the last two
patches from that patch set here.

The new partition matching algorithm handles following cases when a
given partition
on one side has at most one matching partition matching on the other side.

1. When the ranges of the joining tables do not match exactly E.g. partition
table t1 has partitions t1p1 (0 - 100), t1p2 (150 - 200) and partition table t2
has partitions t2p1 (0 - 50), t2p2 (100 - 175). In this case (t1p1, t2p1) and
(t1p2, t2p2) form the matching partition pairs, which can be joined. While
matching the pairs, we also compute the partition bounds for the resulting
join. An INNER join between t1 and t2 will have ranges (0 - 50) since no row
with 50 <= key < 100 from t1p1 is going to find a matching row in t2p1 and (150
- 175) since no row with 100 <= key < 150 from t2p2 is going to find a matching
row in t1p2 and no row with 175 <= key < 200 in t1p2 is going to find a
matching row in t1p2. A t1 LEFT join t2 on the other hand will have ranges same
as the outer relation i.e. t1, (0 - 100), (150 - 200) since all rows from t1
will be part of the join. Thus depending upon the type of join the partition
bounds of the resultant join relation change. Similarly for list partitioned
table, when the lists do not match exactly, the algorithm finds matching pairs
of partitions and the lists of resultant join relation. E.g.  t1 has
partitions t1p1 ('a',
'b', 'c'), t1p2 ('e', 'f') and t2 has partitions t2p1 ('a', 'b'), t2p2 ('d',
'e', 'f'). In this case (t1p1, t2p1) and (t2p1, t2p2) form the matching
pairs which are joined. Inner join will have bounds ('a','b'), ('e', 'f') and
t1 LEFT JOIN t2 will have bounds same as t1.

2. When one or both side have at least one partition that does not have
matching partition on the other side. E.g. t1 has partitions t1p1 ('a','b'),
t1p2 ('c','d') and t2 has only one partition t2p1 ('a','b') OR t1 has
partitions t1p1 (0 - 100), t1p2 (100 - 200) and t2 has only one partition t2p1
(0 - 100). In this case as well different types of joins will have different
partition bounds for the result using similar rules described above.

3. A combination of 1 and 2 e.g. t1 has partitions t1p1('a','b','c'),
t1p2('d','e','f') and t2 has a single partition t2p1 ('a','b', 'z').

Algorithm
---------
The pairs of matching partitions and the partition bounds of the join are
calculated by an algorithm similar to merge join.

In such a join, it can be observed that every partition on either side,
contributes to at most one partition of the resultant join relation. Thus for
every partition on either side, we keep track of the partition of resultant
join (if any), which it contributes to.  If multiple partitions from any of the
joining relations map to a single partition of the resultant join, we need to
gang those partitions together before joining the partition/s from the other
side. Since we do not have infrastructure for ganging multiple arbitrary
RelOptInfos together in a parent RelOptInfo, we do not support such a
partitionw-wise join right now. We stop merging the bounds immediately when we
detect such a case.

For list partitioned tables, we compare list values from both the sides,
starting with the lowest. If the two list values being compared match,
corresponding partitions from both sides form a pair of partitions to be
joined. We record this mapping and also include the list value in join bounds.
If the two list values do not match and the lower of those two comes from the
outer side of the join, we include it in the join bounds. We advance to the
next list value on side with the lower list value continuing the process of
merging till list values on at least one side are exhausted. If the remaining
values are from the outer side, we include those in the join partition bounds.
Every list value included in the join bounds, and its originating partition/s
are associated with appropriate partition of the resultant join. For more
details please see partition_list_bounds_merge() in the attached patch.

In case of range partitioned tables, we compare the ranges of the partitions in
increasing order of their bounds. If two ranges being compared overlap,
corresponding partitions from both sides form a pair of partitions to be
joined. We record this mapping and also include the merged range in the bounds
of resultant join. The overlapping ranges are merged based on the type of join
as described above. If either of the ranges completely precedes the other, and
it's on the outer side, we include that range in the bounds of resultant join.
We advance to the next range on the side with lower upper bound till ranges on
at least one side are exhausted.  If the remaining ranges are from the outer
side, we include those in the partition bounds of resultant join. While
including a range in the partition bounds of the resultant join if its lower
bound precedes the upper bound of the last included range, it indicates that
multiple partitions on that side map to one partition on the other side, so we
bail out. Notice that in this method, we always include the ranges in the
partition bounds of the resultant join in the increasing order of their bounds.
Every range included in the join's partition bounds and it's corresponding
partition/s from joining relations are associated with appropriate partition of
the resultant join. For more details please see partition_range_bounds_merge()
in the attached patch.

The partitions from both sides (one partition from each side) which map to the
same partition of the resultant join are joined to form child-joins. The case
when an outer partition may not have a matching partition from the inner side
will be discussed in the next section. Except for the above algorithm to find
the pairs of matching partitions and calculating bounds of the resultant join,
the rest of the partition-wise join algorithm remains the same.

Unsupported case: When a partition from outer side doesn't have matching
partition on the inner side.
--------------------------------------------------------------------------
Consider a join t1 LEFT JOIN t2 where t1 has partitions t1p1 (0 - 100), t1p2
(100 - 200) and t2 has a single partition t2p1(0 - 100). The rows in t1p2 won't
have a matching row in t2 since there is no partition matching t1p2. The result
of the join will have rows in t1p2 with columns from t2 NULLed. In order to
execute this join as a partition-wise join, we need a dummy relation in place
of the missing partition, which we can join with t1p2. We need this placeholder
dummy relation (its targetlist, relids etc.), so that rest of the planner can
work with the resulting child-join.

We notice the missing partitions only while planning the join (during the
execution of make_one_rel()), by which time we have frozen the number of base
relations.  Introducing a base relation during join planning is not supported
in current planner. Similarly, a partition can be missing from a partitioned
join relation, in which case we have to add a dummy join relation. This might
need adding corresponding base relations as well. I have not spent time looking
for what it takes to support these cases. For now the patch does not support
partition-wise join in such cases.

TODOs
-----------
1. Add tests for advanced partition matching algorithm
2. Improve code quality, commenting, function names etc.

[1] https://www.postgresql.org/message-id/CAFjFpRd9Vqh_=-Ldv-XqWY006d07TJ+VXuhXCbdj=P1jukYBrw@...

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

0010-Modify-bound-comparision-functions-to-accept-members.patch (10K) Download Attachment
0011-WIP-Partition-wise-join-for-1-1-1-0-0-1-partition-ma.patch (65K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

rajkumar.raghuwanshi
On Mon, Aug 21, 2017 at 12:43 PM, Ashutosh Bapat <[hidden email]> wrote:
TODOs
-----------
1. Add tests for advanced partition matching algorithm

Hi Ashutosh,

I have applied all partition-wise-join patches (v26) and tested feature. I have modified partition_join.sql file and added extra test cases to test partition matching.

Attaching WIP test case patch which as of now have some server crashes and a data corruptions issue which is commented in the file itself and need to be removed once issue got solved. Also some of queries is not picking or picking partition-wise-join as per expectation which may need some adjustment.

 



 


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

advanced_partition_matching_test.patch (290K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Ashutosh Bapat
PFA the patches rebased on the latest sources. There are also fixes
for some of the crashes and bugs reported. I haven't yet included the
testcase patch in the main patchset.

On Mon, Aug 28, 2017 at 12:44 PM, Rajkumar Raghuwanshi
<[hidden email]> wrote:

> On Mon, Aug 21, 2017 at 12:43 PM, Ashutosh Bapat
> <[hidden email]> wrote:
>>
>> TODOs
>> -----------
>> 1. Add tests for advanced partition matching algorithm
>
>
> Hi Ashutosh,
>
> I have applied all partition-wise-join patches (v26) and tested feature. I
> have modified partition_join.sql file and added extra test cases to test
> partition matching.
>
> Attaching WIP test case patch which as of now have some server crashes and a
> data corruptions issue which is commented in the file itself and need to be
> removed once issue got solved. Also some of queries is not picking or
> picking partition-wise-join as per expectation which may need some
> adjustment.
>
>
>
>
>
>


--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

0011-Modify-bound-comparision-functions-to-accept-members.patch (10K) Download Attachment
0012-WIP-Partition-wise-join-for-1-1-1-0-0-1-partition-ma.patch (67K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

rajkumar.raghuwanshi

On Sat, Sep 2, 2017 at 12:42 AM, Ashutosh Bapat <[hidden email]> wrote:
PFA the patches rebased on the latest sources. There are also fixes
for some of the crashes and bugs reported. I haven't yet included the
testcase patch in the main patchset.

On Mon, Aug 28, 2017 at 12:44 PM, Rajkumar Raghuwanshi
<[hidden email]> wrote:
> On Mon, Aug 21, 2017 at 12:43 PM, Ashutosh Bapat
> <[hidden email]> wrote:
>>
>> TODOs
>> -----------
>> 1. Add tests for advanced partition matching algorithm
>
>
> Hi Ashutosh,
>
> I have applied all partition-wise-join patches (v26) and tested feature. I
> have modified partition_join.sql file and added extra test cases to test
> partition matching.
>
> Attaching WIP test case patch which as of now have some server crashes and a
> data corruptions issue which is commented in the file itself and need to be
> removed once issue got solved. Also some of queries is not picking or
> picking partition-wise-join as per expectation which may need some
> adjustment.
>

I have applied v27 patches and tested feature. Also tried to reduce regression diff with the
existing partition_join.sql by adding new partition instead of changing original partition bounds.

Attached WIP patch have a server crash and some wrong output which need to be fixed. I have
commented these issue in patch itself, Please take a look and let me know if it need more
changes.


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

advance_partition_matching_test_v1.patch (297K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Ashutosh Bapat
>

> I have applied v27 patches and tested feature. Also tried to reduce
> regression diff with the
> existing partition_join.sql by adding new partition instead of changing
> original partition bounds.
>
> Attached WIP patch have a server crash and some wrong output which need to
> be fixed. I have
> commented these issue in patch itself, Please take a look and let me know if
> it need more
> changes.
I have fixed the issues which were marked as TODOs in the attached
patches. Also, I have included your test change patch in my series of
patches. Are there any other issues you have commented out?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

0011-Modify-bound-comparision-functions-to-accept-members.patch (10K) Download Attachment
0012-WIP-Partition-wise-join-for-1-1-1-0-0-1-partition-ma.patch (67K) Download Attachment
0013-Tests-for-0-1-1-1-and-1-0-partition-matching.patch (298K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

rajkumar.raghuwanshi
On Tue, Sep 5, 2017 at 4:34 PM, Ashutosh Bapat <[hidden email]> wrote:
I have fixed the issues which were marked as TODOs in the attached
patches. Also, I have included your test change patch in my series of
patches. Are there any other issues you have commented out?

Thanks Ashutosh, All commented issue got fixed. I am working on some combinations of N-way joins
to test partition matching, will send those as well once done.

 
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Antonin Houska-2
In reply to this post by Ashutosh Bapat
Ashutosh Bapat <[hidden email]> wrote:

> I have fixed the issues which were marked as TODOs in the attached
> patches. Also, I have included your test change patch in my series of
> patches.

I've noticed that partition_bounds_merge() is called twice from
make_join_rel():

 * build_join_rel -> build_joinrel_partition_info -> partition_bounds_merge

 * try_partition_wise_join -> partition_bounds_merge

Is this intentional, or just a thinko?

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Ashutosh Bapat
On Thu, Sep 7, 2017 at 7:34 PM, Antonin Houska <[hidden email]> wrote:

> Ashutosh Bapat <[hidden email]> wrote:
>
>> I have fixed the issues which were marked as TODOs in the attached
>> patches. Also, I have included your test change patch in my series of
>> patches.
>
> I've noticed that partition_bounds_merge() is called twice from
> make_join_rel():
>
>  * build_join_rel -> build_joinrel_partition_info -> partition_bounds_merge
>
>  * try_partition_wise_join -> partition_bounds_merge
>
> Is this intentional, or just a thinko?
>

This is expected. partition_bounds_merge() also returns the pairs of
matching partitions. So, we have to call that function for every pair
of joining relations.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Ashutosh Bapat
Here's updated patch set based on the basic partition-wise join
committed. The patchset applies on top of the patch to optimize the
case of dummy partitioned tables [1].

Right now, the advanced partition matching algorithm bails out when
either of the joining relations has a default partition.

[1] https://www.postgresql.org/message-id/CAFjFpRcPvT5ay9_p3e-k2Cwu4bW_rypON7ceJVWhsU3Uk4Nmmg@...

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

0002-Modify-bound-comparision-functions-to-accept-members.patch (10K) Download Attachment
0003-WIP-Partition-wise-join-for-1-1-1-0-0-1-partition-ma.patch (69K) Download Attachment
0004-Tests-for-0-1-1-1-and-1-0-partition-matching.patch (288K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Robert Haas
On Wed, Oct 11, 2017 at 7:08 AM, Ashutosh Bapat
<[hidden email]> wrote:
> Here's updated patch set based on the basic partition-wise join
> committed. The patchset applies on top of the patch to optimize the
> case of dummy partitioned tables [1].
>
> Right now, the advanced partition matching algorithm bails out when
> either of the joining relations has a default partition.

So is that something you are going to fix?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Ashutosh Bapat
On Thu, Oct 12, 2017 at 9:46 PM, Robert Haas <[hidden email]> wrote:

> On Wed, Oct 11, 2017 at 7:08 AM, Ashutosh Bapat
> <[hidden email]> wrote:
>> Here's updated patch set based on the basic partition-wise join
>> committed. The patchset applies on top of the patch to optimize the
>> case of dummy partitioned tables [1].
>>
>> Right now, the advanced partition matching algorithm bails out when
>> either of the joining relations has a default partition.
>
> So is that something you are going to fix?
>

Yes, if time permits. I had left the patch unattended while basic
partition-wise join was getting committed. Now that it's committed, I
rebased it. It still has TODOs and some work is required to improve
it. But for the patch to be really complete, we have to deal with the
problem of missing partitions described before. I am fine
collaborating if someone else wants to pick it up.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Ashutosh Bapat
On Fri, Oct 13, 2017 at 7:59 AM, Ashutosh Bapat
<[hidden email]> wrote:

> On Thu, Oct 12, 2017 at 9:46 PM, Robert Haas <[hidden email]> wrote:
>> On Wed, Oct 11, 2017 at 7:08 AM, Ashutosh Bapat
>> <[hidden email]> wrote:
>>> Here's updated patch set based on the basic partition-wise join
>>> committed. The patchset applies on top of the patch to optimize the
>>> case of dummy partitioned tables [1].
>>>
>>> Right now, the advanced partition matching algorithm bails out when
>>> either of the joining relations has a default partition.
>>
>> So is that something you are going to fix?
>>
>
> Yes, if time permits. I had left the patch unattended while basic
> partition-wise join was getting committed. Now that it's committed, I
> rebased it. It still has TODOs and some work is required to improve
> it. But for the patch to be really complete, we have to deal with the
> problem of missing partitions described before. I am fine
> collaborating if someone else wants to pick it up.
>
Here's patchset which support advanced partition matching for
partition bounds with default partition. The patchset is rebased on
the latest head.

When a list value is present in one of the joining relations and not
the other, and the other relation has default partition, match (join)
the partition containing that list value with the default partition,
since the default partition may contain rows with that list value. If
the default partition happens to be on the outer side of the join, the
resulting join partition acts as a default partition as it will
contain all the values from the default partition. If the partition
containing the list value happens to be on the outer side of the join,
the resulting join partition is associated with the list value, since
no other partition key value from the default partition makes it to
the join result.

When a range is present (completely or partly) in one of the joining
relations and not the other, and the other relation has default
partition, match (join) the partition corresponding to that range with
the default partition. If the default partition happens to be on the
outer side of the join, the resulting join partition acts as a default
partition as it will contain all the values from the default
partition. If the non-partition corresponding to the range happens to
be on the outer side of the join, the resulting join partition is
associated with that range, since partition key values from the
default partition outside that range won't make it to the join result.

If both the relations have default partition, match (join) the default
partition with each other and deem the resulting join partition as
default partition. If one of the relations has default partition but
not the other, and the default partition happens to be on the outer
side of the join, all its rows will make it to the join.  Such a
default partition may get joined to a non-default partition from the
inner side, if inner side has a range missing in the outer side.

If any of the above causes multiple partitions from one side to match
with one or more partitions on the other side, we won't use
partition-wise join as discussed in the first mail of this thread.

I have tested the patches for two-way join, but haven't added any test
involving default partitions to the patch itself. It needs to be
tested for N-way join as well. So, for now I have kept the two patches
supporting the default partition in case of range and list resp.
separate. Also, some of the code duplication in partition matching
functions can be avoided using macros. I will merge those patches into
the main patch and add macros once they are tested appropriately.

For hash partitioned table, we haven't implemented the advanced
partition matching, since it would be rare that somebody has hash
partitioned tables with holes (even if they are allowed).

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

pg_adv_dp_join_patches_v2.tar.gz (58K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Ashutosh Bapat
Here's a new patchset with following changes

1. Rebased on the latest head taking care of partition bound
comparison function changes
2. Refactored the code to avoid duplication.
3. There's an extensive test (provided by Rajkumar) set added, which
is not meant to be committed. That testset has testcases which crash
or reveal a bug. I will fix those crashes and add corresponding
testcases to partition_join.sql.

TODO
1. FIx crashes/bugs in the testcases.



On Sun, Dec 3, 2017 at 4:53 PM, Ashutosh Bapat
<[hidden email]> wrote:

> On Fri, Oct 13, 2017 at 7:59 AM, Ashutosh Bapat
> <[hidden email]> wrote:
>> On Thu, Oct 12, 2017 at 9:46 PM, Robert Haas <[hidden email]> wrote:
>>> On Wed, Oct 11, 2017 at 7:08 AM, Ashutosh Bapat
>>> <[hidden email]> wrote:
>>>> Here's updated patch set based on the basic partition-wise join
>>>> committed. The patchset applies on top of the patch to optimize the
>>>> case of dummy partitioned tables [1].
>>>>
>>>> Right now, the advanced partition matching algorithm bails out when
>>>> either of the joining relations has a default partition.
>>>
>>> So is that something you are going to fix?
>>>
>>
>> Yes, if time permits. I had left the patch unattended while basic
>> partition-wise join was getting committed. Now that it's committed, I
>> rebased it. It still has TODOs and some work is required to improve
>> it. But for the patch to be really complete, we have to deal with the
>> problem of missing partitions described before. I am fine
>> collaborating if someone else wants to pick it up.
>>
>
> Here's patchset which support advanced partition matching for
> partition bounds with default partition. The patchset is rebased on
> the latest head.
>
> When a list value is present in one of the joining relations and not
> the other, and the other relation has default partition, match (join)
> the partition containing that list value with the default partition,
> since the default partition may contain rows with that list value. If
> the default partition happens to be on the outer side of the join, the
> resulting join partition acts as a default partition as it will
> contain all the values from the default partition. If the partition
> containing the list value happens to be on the outer side of the join,
> the resulting join partition is associated with the list value, since
> no other partition key value from the default partition makes it to
> the join result.
>
> When a range is present (completely or partly) in one of the joining
> relations and not the other, and the other relation has default
> partition, match (join) the partition corresponding to that range with
> the default partition. If the default partition happens to be on the
> outer side of the join, the resulting join partition acts as a default
> partition as it will contain all the values from the default
> partition. If the non-partition corresponding to the range happens to
> be on the outer side of the join, the resulting join partition is
> associated with that range, since partition key values from the
> default partition outside that range won't make it to the join result.
>
> If both the relations have default partition, match (join) the default
> partition with each other and deem the resulting join partition as
> default partition. If one of the relations has default partition but
> not the other, and the default partition happens to be on the outer
> side of the join, all its rows will make it to the join.  Such a
> default partition may get joined to a non-default partition from the
> inner side, if inner side has a range missing in the outer side.
>
> If any of the above causes multiple partitions from one side to match
> with one or more partitions on the other side, we won't use
> partition-wise join as discussed in the first mail of this thread.
>
> I have tested the patches for two-way join, but haven't added any test
> involving default partitions to the patch itself. It needs to be
> tested for N-way join as well. So, for now I have kept the two patches
> supporting the default partition in case of range and list resp.
> separate. Also, some of the code duplication in partition matching
> functions can be avoided using macros. I will merge those patches into
> the main patch and add macros once they are tested appropriately.
>
> For hash partitioned table, we haven't implemented the advanced
> partition matching, since it would be rare that somebody has hash
> partitioned tables with holes (even if they are allowed).
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company


--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

pg_adv_dp_join_patches_v3.tar.gz (204K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Amit Langote-2
Hi Ashutosh.

On 2018/02/07 13:51, Ashutosh Bapat wrote:
> Here's a new patchset with following changes
>
> 1. Rebased on the latest head taking care of partition bound
> comparison function changes

I was about to make these changes myself while revising the fast pruning
patch.  Instead, I decided to take a look at your patch and try to use it
in my tree.

I looked at the patch 0001 and noticed that git diff --check says:

src/backend/catalog/partition.c:2900: trailing whitespace.
+partition_rbound_datum_cmp(FmgrInfo *partsupfunc, Oid *partcollation,

Also, might be a good idea to write briefly about the new arguments in the
header comment.  Something like that they are PartitionKey elements.

Thanks,
Amit


Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Amit Langote-2
On 2018/02/08 11:55, Amit Langote wrote:

> Hi Ashutosh.
>
> On 2018/02/07 13:51, Ashutosh Bapat wrote:
>> Here's a new patchset with following changes
>>
>> 1. Rebased on the latest head taking care of partition bound
>> comparison function changes
>
> I was about to make these changes myself while revising the fast pruning
> patch.  Instead, I decided to take a look at your patch and try to use it
> in my tree.

I also noticed that a later patch adds partsupfunc to PartitionScheme,
which the pruning patch needs too.  So, perhaps would be nice to take out
that portion of the patch.  That is, the changes to PartitionScheme struct
definition and those to find_partition_scheme().

Regarding the latter, wouldn't be nice to have a comment before the code
that does the copying about why we don't compare the partsupfunc field to
decide if we have a match or not.  I understand it's because the
partsupfunc array contains pointers, not OIDs.  But maybe, that's too
obvious to warrant a comment.

Thanks,
Amit


Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Ashutosh Bapat
In reply to this post by Amit Langote-2
On Thu, Feb 8, 2018 at 8:25 AM, Amit Langote
<[hidden email]> wrote:

> Hi Ashutosh.
>
> On 2018/02/07 13:51, Ashutosh Bapat wrote:
>> Here's a new patchset with following changes
>>
>> 1. Rebased on the latest head taking care of partition bound
>> comparison function changes
>
> I was about to make these changes myself while revising the fast pruning
> patch.  Instead, I decided to take a look at your patch and try to use it
> in my tree.
>
> I looked at the patch 0001 and noticed that git diff --check says:
>
> src/backend/catalog/partition.c:2900: trailing whitespace.
> +partition_rbound_datum_cmp(FmgrInfo *partsupfunc, Oid *partcollation,
Thanks. Fixed.

>
> Also, might be a good idea to write briefly about the new arguments in the
> header comment.  Something like that they are PartitionKey elements.

Here's updated patch set with those comments added.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

pg_adv_dp_join_patches_v4.tar.gz (204K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Ashutosh Bapat
In reply to this post by Amit Langote-2
On Thu, Feb 8, 2018 at 10:41 AM, Amit Langote
<[hidden email]> wrote:

> On 2018/02/08 11:55, Amit Langote wrote:
>> Hi Ashutosh.
>>
>> On 2018/02/07 13:51, Ashutosh Bapat wrote:
>>> Here's a new patchset with following changes
>>>
>>> 1. Rebased on the latest head taking care of partition bound
>>> comparison function changes
>>
>> I was about to make these changes myself while revising the fast pruning
>> patch.  Instead, I decided to take a look at your patch and try to use it
>> in my tree.
>
> I also noticed that a later patch adds partsupfunc to PartitionScheme,
> which the pruning patch needs too.  So, perhaps would be nice to take out
> that portion of the patch.  That is, the changes to PartitionScheme struct
> definition and those to find_partition_scheme().

I am not sure whether a patch with just that change and without any
changes to use that member will be acceptable. So leaving this aside.

>
> Regarding the latter, wouldn't be nice to have a comment before the code
> that does the copying about why we don't compare the partsupfunc field to
> decide if we have a match or not.  I understand it's because the
> partsupfunc array contains pointers, not OIDs.  But maybe, that's too
> obvious to warrant a comment.

It's because partsupfuncs should point to the information of the same
function when partopfamily matches and partopcintype matches. I would
have added an assertion for that with a comment, but with the pointer
that would be risky. Or we can just assert that the oids match.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Amit Langote-2
On 2018/02/09 14:31, Ashutosh Bapat wrote:
>> I also noticed that a later patch adds partsupfunc to PartitionScheme,
>> which the pruning patch needs too.  So, perhaps would be nice to take out
>> that portion of the patch.  That is, the changes to PartitionScheme struct
>> definition and those to find_partition_scheme().
>
> I am not sure whether a patch with just that change and without any
> changes to use that member will be acceptable. So leaving this aside.

I asked, because with everything that I have now changed in the partition
pruning patch, one would need to pass these FmgrInfo pointers down to
partition bound searching functions from the optimizer.  If the changes to
add partsupfunc to the optimizer were taken out from your main patch, the
pruning patch could just start using it.  For now, I'm making those
changes part of the pruning patch.

Thanks,
Amit


Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Ashutosh Bapat
On Fri, Feb 9, 2018 at 11:26 AM, Amit Langote
<[hidden email]> wrote:

> On 2018/02/09 14:31, Ashutosh Bapat wrote:
>>> I also noticed that a later patch adds partsupfunc to PartitionScheme,
>>> which the pruning patch needs too.  So, perhaps would be nice to take out
>>> that portion of the patch.  That is, the changes to PartitionScheme struct
>>> definition and those to find_partition_scheme().
>>
>> I am not sure whether a patch with just that change and without any
>> changes to use that member will be acceptable. So leaving this aside.
>
> I asked, because with everything that I have now changed in the partition
> pruning patch, one would need to pass these FmgrInfo pointers down to
> partition bound searching functions from the optimizer.  If the changes to
> add partsupfunc to the optimizer were taken out from your main patch, the
> pruning patch could just start using it.  For now, I'm making those
> changes part of the pruning patch.

That's fine. Someone's patch will be committed first and the other
will just take out those changes. But I am open to separate those
changes into other patch if a committer feels so.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Reply | Threaded
Open this post in threaded view
|

Re: advanced partition matching algorithm for partition-wise join

Amit Langote-2
In reply to this post by Ashutosh Bapat
Hi Ashutosh.

On 2018/02/09 14:27, Ashutosh Bapat wrote:
> Here's updated patch set with those comments added.

I looked at patches 0002 and 0003.

In 0002:

+ * In case of hash partition we store modulus and remainder in datums array

In case of hash partitioned table?

+ * which has the same data type irrespective of the number of partition keys
+ * and their data types. Hence we can compare the hash bound collection
without
+ * any partition key specific information.

"has the same data type" sounds like it means a Postgres data type,
whereas I think you mean that they are simple int32 values, so we don't
need any PartitionKey information to compare them.

In 0003:

A portion of code in both partition_range_bounds_merge(),
partition_list_bounds_merge(), and merge_null_partitions() has an extra
semi-colon at the end of a line starting with else if:

                   if (default_index < 0)
                        default_index = merged_index;
                    else if(default_index != merged_index);
                    {

which emits warnings like this:

partition.c: In function ‘partition_range_bounds_merge’:
partition.c:4192:11: warning: this ‘if’ clause does not guard...
[-Wmisleading-indentation]
      else if(default_index != merged_index);

           ^~
partition.c: In function ‘partition_list_bounds_merge’:
partition.c:4261:11: warning: this ‘if’ clause does not guard...
[-Wmisleading-indentation]
      else if(default_index != merged_index);
           ^~
Also, get this warning.

partition.c:3955:1: warning: ‘is_next_range_continuous’ defined but not
used [-Wunused-function]

I'm trying to understand the optimizer side of this patch.  In your commit
message, I read:

    This commit allows partition-wise join to be applied under
    following conditions

    1. the partition bounds of joining relations are such that rows from
    given partition on one side join can join with rows from maximum one
    partition on the other side i.e. bounds of a given partition on one
    side match/overlap with those of maximum one partition on the other
    side. If the mapping happens to be m:n where m > 1 or n > 1, we have
    to gang multiple partition relations together into a single relation.
    This means that we have to add simple relations during join
    processing, something which is not supported right now.  ALso, in such
    a case, different pairs of joining relations can produce different
    partition bounds for the same join relation, which again is not
    supported right now.


So, is the currently allowed case (partition bounds on both sides match
exactly) a special case of this new feature which tries to match
partitions in a more generalized manner?  I see that this patch removes
the partition_bound_equal(outer_rel->boundinfo, inner_rel->boundinfo)
check in build_joinrel_partition_info() in favor of reconciling any
differences in the representation of the partition bounds by calling
partition_bounds_merge() from try_partition_wise_join().

    2. For every partition on outer side that can contribute to the result
    of an OUTER side, there exists at least one (taken along with item 1,
    it means exactly one)  matching partition on the inner side. To
    support partition-wise join when the inner matching partition doesn't
    exist, we have to add a dummy base relation corresponding to the
    non-existent inner partition. We don't have support add base relations
    during join processing.

Sorry but I'm a bit confused by the last sentence; does it mean we're not
able to allow partition-wise join to happen in this case?  But this is in
the list of the new cases that the patch makes partition-wise join to
happen for.


Looking at the code changes under src/backend/optimizer:

+    else
+    {
+        Assert(partition_bounds_equal(part_scheme->partnatts,
+                                      part_scheme->parttyplen,
+                                      part_scheme->parttypbyval,
+                                      join_boundinfo, joinrel->boundinfo));

IIUC, this else block would run when try_partition_wise_join() is called
again for the same pair of relations.

+        /*
+         * Every pair of joining relations should result in the same number
+         * of child-joins.
+         */

Sorry if I'm misreading this, but does it mean: a given pair of joining
relations should always result in the same number of (set of, too?)
child-joins?

In the new comment in build_joinrel_partition_info():

+     * Because of restrictions in partition_bounds_merge(), not every pair of
+     * joining relation

joining relations


I will try to hop into partition_bounds_merge() from now...

Thanks,
Amit


1234