why partition pruning doesn't work?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
64 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

why partition pruning doesn't work?

Pavel Stehule
Hi

CREATE TABLE data(a text, vlozeno date) PARTITION BY RANGE(vlozeno);
CREATE TABLE data_2016 PARTITION OF data FOR VALUES FROM ('2016-01-01') TO ('2016-12-31');
CREATE TABLE data_2017 PARTITION OF data FOR VALUES FROM ('2017-01-01') TO ('2017-12-31');
CREATE TABLE data_other PARTITION OF DATA DEFAULT;

insert into data select 'ahoj', '2016-01-01'::date + (random() * 900)::int from generate_series(1,1000000);
analyze data;

postgres=# explain analyze select * from data where vlozeno > '2018-06-01';
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Append (cost=0.00..3519.83 rows=20001 width=9) (actual time=0.042..27.750 rows=19428 loops=1) │
│ -> Seq Scan on data_other (cost=0.00..3419.83 rows=20001 width=9) (actual time=0.040..25.895 rows=19428 loops=1) │
│ Filter: (vlozeno > '2018-06-01'::date) │
│ Rows Removed by Filter: 171518 │
│ Planning Time: 0.766 ms │
│ Execution Time: 28.718 ms │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(6 rows)

postgres=# explain analyze select * from data where vlozeno > current_date;
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Gather (cost=1000.00..17281.36 rows=20080 width=9) (actual time=0.749..95.389 rows=19428 loops=1) │
│ Workers Planned: 2 │
│ Workers Launched: 2 │
│ -> Parallel Append (cost=0.00..14273.36 rows=8367 width=9) (actual time=59.141..89.458 rows=6476 loops=3) │
│ -> Parallel Seq Scan on data_2016 (cost=0.00..5768.69 rows=24 width=9) (actual time=34.847..34.847 rows=0 loops=3) │
│ Filter: (vlozeno > CURRENT_DATE) │
│ Rows Removed by Filter: 135119 │
│ -> Parallel Seq Scan on data_2017 (cost=0.00..5745.02 rows=23 width=9) (actual time=53.269..53.269 rows=0 loops=2) │
│ Filter: (vlozeno > CURRENT_DATE) │
│ Rows Removed by Filter: 201848 │
│ -> Parallel Seq Scan on data_other (cost=0.00..2717.82 rows=11765 width=9) (actual time=0.044..55.502 rows=19428 loops=1) │
│ Filter: (vlozeno > CURRENT_DATE) │
│ Rows Removed by Filter: 171518 │
│ Planning Time: 0.677 ms │
│ Execution Time: 98.349 ms │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(15 rows)

but

postgres=# explain analyze select * from data where vlozeno > (select current_date);
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                      QUERY PLAN                                                      │
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Append  (cost=0.01..19574.68 rows=333333 width=9) (actual time=0.095..31.945 rows=19428 loops=1)                     │
│   InitPlan 1 (returns $0)                                                                                            │
│     ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=0.010..0.010 rows=1 loops=1)                           │
│   ->  Seq Scan on data_2016  (cost=0.00..7258.98 rows=135119 width=9) (never executed)                               │
│         Filter: (vlozeno > $0)                                                                                       │
│   ->  Seq Scan on data_2017  (cost=0.00..7229.20 rows=134565 width=9) (never executed)                               │
│         Filter: (vlozeno > $0)                                                                                       │
│   ->  Seq Scan on data_other  (cost=0.00..3419.83 rows=63649 width=9) (actual time=0.069..29.856 rows=19428 loops=1) │
│         Filter: (vlozeno > $0)                                                                                       │
│         Rows Removed by Filter: 171518                                                                               │
│ Planning Time: 0.418 ms                                                                                              │
│ Execution Time: 33.019 ms                                                                                            │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(12 rows)


Partition pruning is working now.

Is it expected? Tested on fresh master.


Regards

Pavel





Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Dmitry Dolgov
> On 1 June 2018 at 07:19, Pavel Stehule <[hidden email]> wrote:
>
> Partition pruning is working now.
>
> Is it expected? Tested on fresh master.

That's interesting. So there are two cases:

* vlozeno > (select current_date) (pruning works)

* vlozeno > current_date (pruning doesn't work)

In pull_partkey_params when we need to extract Params matching partition key in
the first case everything is fine, we've got an expr of type Param. In the
second case we've got a SQLValueFunction, which is ignored in the code - so
eventually we think that there is nothing matching a partition key and we don't
need to apply pruning.

With the attached hacky patch it would be taken into account (although I assume
in reality SQLValueFunction should be treated somehow differently) and pruning
is happening:

=# explain analyze select * from data where vlozeno > current_date;
                                                             QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..17223.38 rows=19512 width=9) (actual
time=0.456..32.952 rows=19340 loop s=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Append  (cost=0.00..14272.18 rows=8130 width=9)
(actual time=0.042..26.616 rows =6447 loops=3)
         ->  Parallel Seq Scan on data_2016  (cost=0.00..5771.19
rows=24 width=9) (never executed)
               Filter: (vlozeno > CURRENT_DATE)
         ->  Parallel Seq Scan on data_2017  (cost=0.00..5747.65
rows=23 width=9) (never executed)
               Filter: (vlozeno > CURRENT_DATE)
         ->  Parallel Seq Scan on data_other  (cost=0.00..2712.69
rows=11431 width=9) (actual time=0.032..26.031 rows=6447 loops=3)
               Filter: (vlozeno > CURRENT_DATE)
               Rows Removed by Filter: 57084
 Planning Time: 1.256 ms
 Execution Time: 35.327 ms
(13 rows)

Time: 40.291 ms

partpruning_sql_value_function.patch (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Ashutosh Bapat
On Fri, Jun 1, 2018 at 9:47 AM, Dmitry Dolgov <[hidden email]> wrote:

>> On 1 June 2018 at 07:19, Pavel Stehule <[hidden email]> wrote:
>>
>> Partition pruning is working now.
>>
>> Is it expected? Tested on fresh master.
>
> That's interesting. So there are two cases:
>
> * vlozeno > (select current_date) (pruning works)
>
> * vlozeno > current_date (pruning doesn't work)
>
> In pull_partkey_params when we need to extract Params matching partition key in
> the first case everything is fine, we've got an expr of type Param. In the
> second case we've got a SQLValueFunction, which is ignored in the code - so
> eventually we think that there is nothing matching a partition key and we don't
> need to apply pruning.
>
> With the attached hacky patch it would be taken into account (although I assume
> in reality SQLValueFunction should be treated somehow differently) and pruning
> is happening:

I think the patch is right if we were to handle only SQLValueFunction,
but the bigger picture here is that we aren't evaluating stable
functions before run-time partition pruning happens. I was under the
impression that the stable functions/expressions get evaluated and
folded into a constant just before the execution begins since a stable
function produces the same output for same input during one execution
invocation. But I am not able to find where we do that and probably we
don't do that at all. If we could do that then it's matter of using
same methods as plan-time partition pruning to prune the partitions.

If we go ahead with this patch, we should at least update it to handle
stable functions for the sake of completeness.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Tom Lane-2
Ashutosh Bapat <[hidden email]> writes:
> I think the patch is right if we were to handle only SQLValueFunction,
> but the bigger picture here is that we aren't evaluating stable
> functions before run-time partition pruning happens. I was under the
> impression that the stable functions/expressions get evaluated and
> folded into a constant just before the execution begins since a stable
> function produces the same output for same input during one execution
> invocation. But I am not able to find where we do that and probably we
> don't do that at all.

We don't; there was a patch floating around to make that happen, but
it hasn't been updated lately.

I agree though that it seems strange to special-case SQLValueFunction
rather than any-stable-expression.  As long as the evaluation happens
at executor start (i.e. with the query's run-time snapshot) it should
be reasonable to simplify any stable expression.

It's worth questioning whether this is a bug fix or an improvement.
If the latter, it probably ought to wait for v12.

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Pavel Stehule


2018-06-01 17:53 GMT+02:00 Tom Lane <[hidden email]>:
Ashutosh Bapat <[hidden email]> writes:
> I think the patch is right if we were to handle only SQLValueFunction,
> but the bigger picture here is that we aren't evaluating stable
> functions before run-time partition pruning happens. I was under the
> impression that the stable functions/expressions get evaluated and
> folded into a constant just before the execution begins since a stable
> function produces the same output for same input during one execution
> invocation. But I am not able to find where we do that and probably we
> don't do that at all.

We don't; there was a patch floating around to make that happen, but
it hasn't been updated lately.

I agree though that it seems strange to special-case SQLValueFunction
rather than any-stable-expression.  As long as the evaluation happens
at executor start (i.e. with the query's run-time snapshot) it should
be reasonable to simplify any stable expression.

It's worth questioning whether this is a bug fix or an improvement.
If the latter, it probably ought to wait for v12.

The result is correct - but it was unpleasant surprise. I searched the most simple demo for this feature, and it doesn't work. Filtering based on CURRENT_DATE is often.

Regards

Pavel


                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Jeff Janes
In reply to this post by Tom Lane-2
On Fri, Jun 1, 2018 at 11:53 AM, Tom Lane <[hidden email]> wrote:

I agree though that it seems strange to special-case SQLValueFunction
rather than any-stable-expression.  As long as the evaluation happens
at executor start (i.e. with the query's run-time snapshot) it should
be reasonable to simplify any stable expression.

It's worth questioning whether this is a bug fix or an improvement.
If the latter, it probably ought to wait for v12.

If explaining the change requires reference to tokens from the source code, rather than something an end user could understand, I'd argue it is a bug fix rather than an improvement.

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Ashutosh Bapat
On Sat, Jun 2, 2018 at 5:16 PM, Jeff Janes <[hidden email]> wrote:

> On Fri, Jun 1, 2018 at 11:53 AM, Tom Lane <[hidden email]> wrote:
>>
>>
>> I agree though that it seems strange to special-case SQLValueFunction
>> rather than any-stable-expression.  As long as the evaluation happens
>> at executor start (i.e. with the query's run-time snapshot) it should
>> be reasonable to simplify any stable expression.
>>
>> It's worth questioning whether this is a bug fix or an improvement.
>> If the latter, it probably ought to wait for v12.
>
>
> If explaining the change requires reference to tokens from the source code,
> rather than something an end user could understand, I'd argue it is a bug
> fix rather than an improvement.

If we going to implement stable expression folding before the actual
execution starts, that's a feature in itself. So, it's V12 material.
Partition pruning will use that feature. I don't think we should make
partition pruning work with stable expressions in some ad-hoc way in
V11 and the some future release (mostly V12) implements it on top of
stable expression folding feature. So my vote for making it work in
V12.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Tom Lane-2
In reply to this post by Jeff Janes
Jeff Janes <[hidden email]> writes:
> On Fri, Jun 1, 2018 at 11:53 AM, Tom Lane <[hidden email]> wrote:
>> It's worth questioning whether this is a bug fix or an improvement.
>> If the latter, it probably ought to wait for v12.

> If explaining the change requires reference to tokens from the source code,
> rather than something an end user could understand, I'd argue it is a bug
> fix rather than an improvement.

Well, the difference between volatile, stable and immutable functions is
well-documented, so I don't think that's a great argument.  If there's
some aspect of this behavior that's not predictable from understanding
which class the partition key expression falls into, then I could agree
that's a bug.

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Dmitry Dolgov
> On 1 June 2018 at 17:53, Tom Lane <[hidden email]> wrote:
> Ashutosh Bapat <[hidden email]> writes:
>> I think the patch is right if we were to handle only SQLValueFunction,
>> but the bigger picture here is that we aren't evaluating stable
>> functions before run-time partition pruning happens.
>
> I agree though that it seems strange to special-case SQLValueFunction
> rather than any-stable-expression.  As long as the evaluation happens
> at executor start (i.e. with the query's run-time snapshot) it should
> be reasonable to simplify any stable expression.

Just to clarify for myself, for evaluating any stable function here would it be
enough to handle all function-like expressions (FuncExpr / OpExpr /
DistinctExpr / NullIfExpr) and check a corresponding function for provolatile,
like in the attached patch?

partpruning_stable_func.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Tom Lane-2
Dmitry Dolgov <[hidden email]> writes:
> Just to clarify for myself, for evaluating any stable function here would it be
> enough to handle all function-like expressions (FuncExpr / OpExpr /
> DistinctExpr / NullIfExpr) and check a corresponding function for provolatile,
> like in the attached patch?

I think the entire approach is wrong here.  Rather than concerning
yourself with Params, or any other specific expression type, you
should be using !contain_volatile_functions() to decide whether
an expression is run-time-constant.  If it is, use the regular
expression evaluation machinery to extract the value.

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Dmitry Dolgov
> On 3 June 2018 at 19:11, Tom Lane <[hidden email]> wrote:

> Dmitry Dolgov <[hidden email]> writes:
>> Just to clarify for myself, for evaluating any stable function here would it be
>> enough to handle all function-like expressions (FuncExpr / OpExpr /
>> DistinctExpr / NullIfExpr) and check a corresponding function for provolatile,
>> like in the attached patch?
>
> I think the entire approach is wrong here.  Rather than concerning
> yourself with Params, or any other specific expression type, you
> should be using !contain_volatile_functions() to decide whether
> an expression is run-time-constant.  If it is, use the regular
> expression evaluation machinery to extract the value.
Yes, it makes sense. Then, to my understanding, the attached code is close to
what was described above (although I'm not sure about the Const part).

partpruning_stable_func_v2.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Amit Langote-2
Hi Dmitry,

Thanks for creating the patch.  I looked at it and have some comments.

On 2018/06/04 22:30, Dmitry Dolgov wrote:

>> On 3 June 2018 at 19:11, Tom Lane <[hidden email]> wrote:
>> Dmitry Dolgov <[hidden email]> writes:
>>> Just to clarify for myself, for evaluating any stable function here would it be
>>> enough to handle all function-like expressions (FuncExpr / OpExpr /
>>> DistinctExpr / NullIfExpr) and check a corresponding function for provolatile,
>>> like in the attached patch?
>>
>> I think the entire approach is wrong here.  Rather than concerning
>> yourself with Params, or any other specific expression type, you
>> should be using !contain_volatile_functions() to decide whether
>> an expression is run-time-constant.  If it is, use the regular
>> expression evaluation machinery to extract the value.
>
> Yes, it makes sense. Then, to my understanding, the attached code is close to
> what was described above (although I'm not sure about the Const part).

This:

@@ -2727,6 +2730,13 @@ pull_partkey_params(PartitionPruneInfo *pinfo, List
*steps)
                 }
                 gotone = true;
             }
+            else if (!IsA(expr, Const))
+            {
+                Param   *param = (Param *) expr;
+                pinfo->execparams = bms_add_member(pinfo->execparams,
+                                                   param->paramid);
+                gotone = true;
+            }

doesn't look quite right.  What says expr is really a Param?  The patch
appears to work because, by setting pinfo->execparams to *something*, it
triggers execution-time pruning to run; its contents aren't necessarily
used during execution pruning.  In fact, it would've crashed if the
execution-time pruning code had required execparams to contain *valid*
param id, but currently it doesn't.

What I think we'd need to do to make this work is to make execution-time
pruning be invoked even if there aren't any Params involved.  IOW, let's
try to teach make_partition_pruneinfo that it can go ahead also in the
cases where there are expressions being compared with the partition key
that contain (only) stable functions.  Then, go and fix the
execution-pruning code to not *always* expect there to be Params to prune
with.

Maybe, David (added to cc) has something to say about all this, especially
whether he considers this a feature and not a bug fix.

Thanks,
Amit


Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Dmitry Dolgov
> On 5 June 2018 at 12:31, Amit Langote <[hidden email]> wrote:
>
> doesn't look quite right.  What says expr is really a Param?  The patch
> appears to work because, by setting pinfo->execparams to *something*, it
> triggers execution-time pruning to run; its contents aren't necessarily
> used during execution pruning.  In fact, it would've crashed if the
> execution-time pruning code had required execparams to contain *valid*
> param id, but currently it doesn't.
>
> What I think we'd need to do to make this work is to make execution-time
> pruning be invoked even if there aren't any Params involved.  IOW, let's
> try to teach make_partition_pruneinfo that it can go ahead also in the
> cases where there are expressions being compared with the partition key
> that contain (only) stable functions.  Then, go and fix the
> execution-pruning code to not *always* expect there to be Params to prune
> with.

Yeah, I agree - I copied this approach mindlessly from the original hacky
patch. So, looks like it's necessary to have something like got_stable_expr
together with gotparam. And after that the only place where I see Params
are in use is partkey_datum_from_expr where all the stuff is actually
evaluated. So apparently this part about "fix the execution-pruning code to not
*always* expect there to be Params to prune with" will be only about this
function - am I correct or there is something else that I missed?

Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Ashutosh Bapat
On Tue, Jun 5, 2018 at 6:24 PM, Dmitry Dolgov <[hidden email]> wrote:

>> On 5 June 2018 at 12:31, Amit Langote <[hidden email]> wrote:
>>
>> doesn't look quite right.  What says expr is really a Param?  The patch
>> appears to work because, by setting pinfo->execparams to *something*, it
>> triggers execution-time pruning to run; its contents aren't necessarily
>> used during execution pruning.  In fact, it would've crashed if the
>> execution-time pruning code had required execparams to contain *valid*
>> param id, but currently it doesn't.
>>
>> What I think we'd need to do to make this work is to make execution-time
>> pruning be invoked even if there aren't any Params involved.  IOW, let's
>> try to teach make_partition_pruneinfo that it can go ahead also in the
>> cases where there are expressions being compared with the partition key
>> that contain (only) stable functions.  Then, go and fix the
>> execution-pruning code to not *always* expect there to be Params to prune
>> with.
>
> Yeah, I agree - I copied this approach mindlessly from the original hacky
> patch. So, looks like it's necessary to have something like got_stable_expr
> together with gotparam.

I think the current code is heavily relying on Params to be present
for partition pruning, which isn't true. Runtime partition pruning is
possible when there are comparison conditions with partition key
expressions on one side and "execution time constant" expressions on
the other side. By "execution time constant" expression, I mean any
expression that evaluates to a constant at the time of execution like
a stable expressions (not just functions) or a Param expression. I can
think of only these two at this time, but there can be more. So,
gotparam should be renamed as "gotprunable_cond" to be generic.
pull_partkey_params() should be renamed as "pull_partkey_conds" or
something generic. That function would return true if there exists an
expression in steps which can be evaluated to a constant at runtime,
otherwise it returns false. My guess is there will be false-positives
which need to be dealt with later, but there will be no
false-negatives.

> And after that the only place where I see Params
> are in use is partkey_datum_from_expr where all the stuff is actually
> evaluated. So apparently this part about "fix the execution-pruning code to not
> *always* expect there to be Params to prune with" will be only about this
> function - am I correct or there is something else that I missed?

Yes. But I think trying to evaluate parameters in this function is not
good. The approach of folding constant expressions before or
immediately after the execution starts doesn't require the expressions
to be evaluated in partkey_datum_from_expr and might benefit other
places where stable expressions or params can appear.

Other problem with partkey_datum_from_expr() seems to be that it
evaluated only param nodes but not the expressions involving
parameters which can folded into constants at runtime. Take for
example following queries on table t1 with two partitions (0, 100) and
(100, 200), populated using "insert into t1 select i, i from
generate_series(0, 199) i;". There's an index on t1(a).

explain analyze select * from t1 x left join t1 y on x.a = y.b where y.a = 5;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..6.78 rows=1 width=16) (actual
time=0.033..0.066 rows=1 loops=1)
   ->  Append  (cost=0.00..2.25 rows=1 width=8) (actual
time=0.019..0.035 rows=1 loops=1)
         ->  Seq Scan on t1p1 y  (cost=0.00..2.25 rows=1 width=8)
(actual time=0.018..0.035 rows=1 loops=1)
               Filter: (a = 5)
               Rows Removed by Filter: 99
   ->  Append  (cost=0.00..4.51 rows=2 width=8) (actual
time=0.011..0.027 rows=1 loops=1)
         ->  Seq Scan on t1p1 x  (cost=0.00..2.25 rows=1 width=8)
(actual time=0.006..0.022 rows=1 loops=1)
               Filter: (y.b = a)
               Rows Removed by Filter: 99
         ->  Seq Scan on t1p2 x_1  (cost=0.00..2.25 rows=1 width=8)
(never executed)
               Filter: (y.b = a)
 Planning Time: 0.644 ms
 Execution Time: 0.115 ms
(13 rows)

t1p2 x_1 is never scanned indicating that run time partition pruning
happened. But then see the following query

postgres:17889=#explain analyze select * from t1 x left join t1 y on
x.a = y.b + 100 where y.a = 5;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..7.28 rows=1 width=16) (actual
time=0.055..0.093 rows=1 loops=1)
   ->  Append  (cost=0.00..2.25 rows=1 width=8) (actual
time=0.017..0.034 rows=1 loops=1)
         ->  Seq Scan on t1p1 y  (cost=0.00..2.25 rows=1 width=8)
(actual time=0.016..0.033 rows=1 loops=1)
               Filter: (a = 5)
               Rows Removed by Filter: 99
   ->  Append  (cost=0.00..5.01 rows=2 width=8) (actual
time=0.034..0.054 rows=1 loops=1)
         ->  Seq Scan on t1p1 x  (cost=0.00..2.50 rows=1 width=8)
(actual time=0.026..0.026 rows=0 loops=1)
               Filter: ((y.b + 100) = a)
               Rows Removed by Filter: 100
         ->  Seq Scan on t1p2 x_1  (cost=0.00..2.50 rows=1 width=8)
(actual time=0.007..0.027 rows=1 loops=1)
               Filter: ((y.b + 100) = a)
               Rows Removed by Filter: 99
 Planning Time: 0.424 ms
 Execution Time: 0.139 ms
(14 rows)

The scan on t1p1 x returns no rows and should have been pruned since
y.b + 100 is constant for a given y.b.

But for this to work, folding constant expressions doesn't help since
y.b changes with every rescan of t1 x. So may be we need some way to
constant fold expression during ExecutorRewind() as well.

This is digression from the original report, but it's still within the
scope of "why partition pruning doesn't work?"

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

David Rowley-3
In reply to this post by Amit Langote-2
On 5 June 2018 at 22:31, Amit Langote <[hidden email]> wrote:
> Maybe, David (added to cc) has something to say about all this, especially
> whether he considers this a feature and not a bug fix.

Thanks, Amit. I had missed this thread.

Yeah. I admit if I'd thought about this case when I wrote the code,
then I'd have made any non-volatile Expr work, but I didn't :-(

It was pointed out to be a few months ago in a comment in [1]. I
initially thought that this was v12 material, but it seems there are a
few people here that are pretty unhappy about it.

I was going to describe what such a patch should look like here, but
that seemed like about as much work as writing it, so:

Please see the attached patch. I've only just finished with it and
it's not fully done yet as there's still an XXX comment where I've not
quite thought about Exprs with Vars from higher levels. These might
always be converted to Params, so the code might be okay as is, but
I've not checked this yet, hence the comment remains.

I'm slightly leaning towards this being included in v11. Without this
people are forced into hacks like WHERE partkey = (SELECT
stablefunc()); to get pruning working at all. If that SQL remains
after this patch then pruning can only take place during actual
execution. With the attached patch the pruning can take place during
the initialization of the executor, which in cases with many
partitions can be significantly faster, providing actual execution is
short.  I'd rather people didn't get into bad habits like that if we
can avoid it.

[1] https://blog.2ndquadrant.com/partition-elimination-postgresql-11/

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

run-time_pruning_for_exprs.patch (32K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Pavel Stehule


2018-06-05 17:07 GMT+02:00 David Rowley <[hidden email]>:
On 5 June 2018 at 22:31, Amit Langote <[hidden email]> wrote:
> Maybe, David (added to cc) has something to say about all this, especially
> whether he considers this a feature and not a bug fix.

Thanks, Amit. I had missed this thread.

Yeah. I admit if I'd thought about this case when I wrote the code,
then I'd have made any non-volatile Expr work, but I didn't :-(

It was pointed out to be a few months ago in a comment in [1]. I
initially thought that this was v12 material, but it seems there are a
few people here that are pretty unhappy about it.

I was going to describe what such a patch should look like here, but
that seemed like about as much work as writing it, so:

Please see the attached patch. I've only just finished with it and
it's not fully done yet as there's still an XXX comment where I've not
quite thought about Exprs with Vars from higher levels. These might
always be converted to Params, so the code might be okay as is, but
I've not checked this yet, hence the comment remains.

I'm slightly leaning towards this being included in v11. Without this
people are forced into hacks like WHERE partkey = (SELECT
stablefunc()); to get pruning working at all. If that SQL remains
after this patch then pruning can only take place during actual
execution. With the attached patch the pruning can take place during
the initialization of the executor, which in cases with many
partitions can be significantly faster, providing actual execution is
short.  I'd rather people didn't get into bad habits like that if we
can avoid it.

This is really great

Regards

Pavel


[1] https://blog.2ndquadrant.com/partition-elimination-postgresql-11/

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

David Rowley-3
In reply to this post by David Rowley-3
On 6 June 2018 at 03:07, David Rowley <[hidden email]> wrote:
> Please see the attached patch. I've only just finished with it and
> it's not fully done yet as there's still an XXX comment where I've not
> quite thought about Exprs with Vars from higher levels. These might
> always be converted to Params, so the code might be okay as is, but
> I've not checked this yet, hence the comment remains.

I looked at this again today and decided that the XXX comment could
just be removed. I also changed contains_only_safeparams into
contains_unsafeparams and reversed the condition. I then decided that
I didn't like the way we need to check which params are in the Expr
each time we call partkey_datum_from_expr. It seems better to prepare
this in advance when building the pruning steps. I started work on
that, but soon realised that I'd need to pass a List of Bitmapsets to
the executor. This is a problem as Bitmapset is not a Node type and
cannot be copied with COPY_NODE_FIELD(). Probably this could be
refactored to instead of passing 3 Lists in the PartitionPruneStepOp
we could invent a new node type that just has 3 fields and store a
single List.

Anyway, I didn't do that because I'm not sure what the fate of this
patch is going to be. I'd offer to change things around to add a new
Node type, but I don't really want to work on that now if this is v12
material.

I've attached a cleaned up version of the patch I posted yesterday.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

run-time_pruning_for_exprs_v2.patch (32K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Amit Langote-2
On 2018/06/06 14:10, David Rowley wrote:

> I then decided that
> I didn't like the way we need to check which params are in the Expr
> each time we call partkey_datum_from_expr. It seems better to prepare
> this in advance when building the pruning steps. I started work on
> that, but soon realised that I'd need to pass a List of Bitmapsets to
> the executor. This is a problem as Bitmapset is not a Node type and
> cannot be copied with COPY_NODE_FIELD(). Probably this could be
> refactored to instead of passing 3 Lists in the PartitionPruneStepOp
> we could invent a new node type that just has 3 fields and store a
> single List.

I wonder why we need to create those Bitmapsets in the planner?  Why not
in ExecSetupPartitionPruneState()?  For example, like how
context->exprstates is initialized.

Thanks,
Amit


Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

David Rowley-3
On 6 June 2018 at 18:05, Amit Langote <[hidden email]> wrote:

> On 2018/06/06 14:10, David Rowley wrote:
>> I then decided that
>> I didn't like the way we need to check which params are in the Expr
>> each time we call partkey_datum_from_expr. It seems better to prepare
>> this in advance when building the pruning steps. I started work on
>> that, but soon realised that I'd need to pass a List of Bitmapsets to
>> the executor. This is a problem as Bitmapset is not a Node type and
>> cannot be copied with COPY_NODE_FIELD(). Probably this could be
>> refactored to instead of passing 3 Lists in the PartitionPruneStepOp
>> we could invent a new node type that just has 3 fields and store a
>> single List.
>
> I wonder why we need to create those Bitmapsets in the planner?  Why not
> in ExecSetupPartitionPruneState()?  For example, like how
> context->exprstates is initialized.
That seems like a good idea.  Certainly much better than working them
out each time we prune.

v3 patch attached.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

run-time_pruning_for_exprs_v3.patch (31K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: why partition pruning doesn't work?

Amit Langote-2
On 2018/06/06 18:52, David Rowley wrote:

> On 6 June 2018 at 18:05, Amit Langote <[hidden email]> wrote:
>> On 2018/06/06 14:10, David Rowley wrote:
>>> I then decided that
>>> I didn't like the way we need to check which params are in the Expr
>>> each time we call partkey_datum_from_expr. It seems better to prepare
>>> this in advance when building the pruning steps. I started work on
>>> that, but soon realised that I'd need to pass a List of Bitmapsets to
>>> the executor. This is a problem as Bitmapset is not a Node type and
>>> cannot be copied with COPY_NODE_FIELD(). Probably this could be
>>> refactored to instead of passing 3 Lists in the PartitionPruneStepOp
>>> we could invent a new node type that just has 3 fields and store a
>>> single List.
>>
>> I wonder why we need to create those Bitmapsets in the planner?  Why not
>> in ExecSetupPartitionPruneState()?  For example, like how
>> context->exprstates is initialized.
>
> That seems like a good idea.  Certainly much better than working them
> out each time we prune.
>
> v3 patch attached.

Thanks David.  This one looks good.  I also like it that hasparamlessexprs
is no longer determined and set in the planner.

I checked what happens with the cases that Ashutosh complained about
upthread and seems that the pruning works as expected.

create table t1 (a int, b int) partition by range (a);
create table t1p1 partition of t1 for values from (0) to (100);
create table t1p2 partition of t1 for values from (100) to (200);
create index on t1 (a);
insert into t1 select i, i from generate_series(0, 199) i;

explain (costs off, analyze) select * from t1 x left join t1 y on x.a =
y.b + 100 where y.a = 5;
                                          QUERY PLAN

-----------------------------------------------------------------------------------------------
 Nested Loop (actual time=0.294..0.371 rows=1 loops=1)
   ->  Append (actual time=0.067..0.092 rows=1 loops=1)
         ->  Bitmap Heap Scan on t1p1 y (actual time=0.049..0.059 rows=1
loops=1)
               Recheck Cond: (a = 5)
               Heap Blocks: exact=1
               ->  Bitmap Index Scan on t1p1_a_idx (actual
time=0.022..0.022 rows=1 loops=1)
                     Index Cond: (a = 5)
   ->  Append (actual time=0.192..0.219 rows=1 loops=1)
         ->  Index Scan using t1p1_a_idx on t1p1 x (never executed)
               Index Cond: (a = (y.b + 100))
         ->  Index Scan using t1p2_a_idx on t1p2 x_1 (actual
time=0.134..0.145 rows=1 loops=1)
               Index Cond: (a = (y.b + 100))
 Planning Time: 5.314 ms
 Execution Time: 0.938 ms
(14 rows)

Note that the condition x.a = y.b + 100 is able to prune t1p1, whereas on
HEAD it isn't.

Thanks,
Amit


1234
Previous Thread Next Thread