Calculate total_table_pages after set_base_rel_sizes()

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Calculate total_table_pages after set_base_rel_sizes()

David Rowley-3
I believe that we should be delaying the PlannerInfo's
total_table_pages calculation until after constraint exclusion and
partition pruning have taken place. Doing this calculation before we
determine which relations we don't need to scan can lead to
incorrectly applying random_page_cost to too many pages processed
during an Index Scan.

We already don't count relations removed by join removals from this
calculation, so counting pruned partitions seems like an omission.

The attached patch moves the calculation to after set_base_rel_sizes()
is called and before set_base_rel_pathlists() is called, where the
information is actually used.

I am considering this a bug fix, but I'm proposing this for PG12 only
as I don't think destabilising plans in the back branches is a good
idea. I'll add this to the September commitfest.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

v1-0001-Calculate-total_table_pages-after-set_base_rel_si.patch (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Calculate total_table_pages after set_base_rel_sizes()

Edmund Horner
David Rowley said:

> I believe that we should be delaying the PlannerInfo's
> total_table_pages calculation until after constraint exclusion and
> partition pruning have taken place. Doing this calculation before we
> determine which relations we don't need to scan can lead to
> incorrectly applying random_page_cost to too many pages processed
> during an Index Scan.
>
> We already don't count relations removed by join removals from this
> calculation, so counting pruned partitions seems like an omission.
>
> The attached patch moves the calculation to after set_base_rel_sizes()
> is called and before set_base_rel_pathlists() is called, where the
> information is actually used.
>
> I am considering this a bug fix, but I'm proposing this for PG12 only
> as I don't think destabilising plans in the back branches is a good
> idea. I'll add this to the September commitfest.

Hi David, I had a quick look at this.  (I haven't tested it so this isn't a full review.)

It looks like a fairly straightforward code move.  And I think it's correct to exclude the pages from partitions that won't be read.

I have a small tweak.  In make_one_rel, we currently have:

    /*
     * Compute size estimates and consider_parallel flags for each base rel,
     * then generate access paths.
     */
    set_base_rel_sizes(root);
    set_base_rel_pathlists(root);

Your patch inserts code between the two lines.  I think the comment should be split too.

    /* Compute size estimates and consider_parallel flags for each base rel. */
    set_base_rel_sizes(root);

    // NEW CODE

    /* Generate access paths. */
    set_base_rel_pathlists(root);

Cheers,
Edmund
Reply | Threaded
Open this post in threaded view
|

Re: Calculate total_table_pages after set_base_rel_sizes()

David Rowley-3
On Tue, 25 Sep 2018 at 10:23, Edmund Horner <[hidden email]> wrote:

> I have a small tweak.  In make_one_rel, we currently have:
>
>     /*
>      * Compute size estimates and consider_parallel flags for each base rel,
>      * then generate access paths.
>      */
>     set_base_rel_sizes(root);
>     set_base_rel_pathlists(root);
>
> Your patch inserts code between the two lines.  I think the comment should be split too.
>
>     /* Compute size estimates and consider_parallel flags for each base rel. */
>     set_base_rel_sizes(root);
>
>     // NEW CODE
>
>     /* Generate access paths. */
>     set_base_rel_pathlists(root);
Thanks for looking at this.

I've changed that in the attached updated patch.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

v2-0001-Calculate-total_table_pages-after-set_base_rel_si.patch (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Calculate total_table_pages after set_base_rel_sizes()

Edmund Horner
David Rowley said:

> I believe that we should be delaying the PlannerInfo's
> total_table_pages calculation until after constraint exclusion and
> partition pruning have taken place. Doing this calculation before we
> determine which relations we don't need to scan can lead to
> incorrectly applying random_page_cost to too many pages processed
> during an Index Scan.
>
> We already don't count relations removed by join removals from this
> calculation, so counting pruned partitions seems like an omission.
>
> The attached patch moves the calculation to after set_base_rel_sizes()
> is called and before set_base_rel_pathlists() is called, where the
> information is actually used.
>
> I am considering this a bug fix, but I'm proposing this for PG12 only
> as I don't think destabilising plans in the back branches is a good
> idea. I'll add this to the September commitfest.

I played with the new patched today with a set of large partitions.
It seems to work, though the effect is subtle.  The expected behaviour
of index_pages_fetched is a bit hard to fathom when the cache share
pushes it between cases A,B and C of the formula, but that's not
really down to this patch.

Basically I think it's ready for a committer to look at.  Should I
update the CF entry?

Edmund

Reply | Threaded
Open this post in threaded view
|

Re: Calculate total_table_pages after set_base_rel_sizes()

David Rowley-3
On 6 October 2018 at 18:20, Edmund Horner <[hidden email]> wrote:

> David Rowley said:
>> I am considering this a bug fix, but I'm proposing this for PG12 only
>> as I don't think destabilising plans in the back branches is a good
>> idea. I'll add this to the September commitfest.
>
> I played with the new patched today with a set of large partitions.
> It seems to work, though the effect is subtle.  The expected behaviour
> of index_pages_fetched is a bit hard to fathom when the cache share
> pushes it between cases A,B and C of the formula, but that's not
> really down to this patch.

Thanks for looking at this and testing it too.

The primary purpose of this patch was step 1 in delaying the creation
of RangeTblEntrys for partitions until after partition pruning has
taken place.  The code I had to do this caused problems around the
total_table_pages calculation due to the lack of RangeTblEntry for the
partitions at the time it was being calculated. But regardless of
that, I still believe where we currently calculate this number is
subtlely broken as it counts all partitions, even ones that will later
be pruned, thus decreasing the likelihood of an index being used as
random_page_cost would be applied to a higher number of index pages.

Amit Langote has since posted a patch to delay the RangeTblEntry
creation until after pruning. His patch happens to also move the
total_table_pages calculation, but I believe this change should be
made as an independent commit to anything else.  I've kept it in the
commitfest for that reason.

> Basically I think it's ready for a committer to look at.  Should I
> update the CF entry?

That sounds good, please do.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Calculate total_table_pages after set_base_rel_sizes()

Amit Langote-2
On 2018/10/07 17:43, David Rowley wrote:

> On 6 October 2018 at 18:20, Edmund Horner <[hidden email]> wrote:
>> David Rowley said:
>>> I am considering this a bug fix, but I'm proposing this for PG12 only
>>> as I don't think destabilising plans in the back branches is a good
>>> idea. I'll add this to the September commitfest.
>>
>> I played with the new patched today with a set of large partitions.
>> It seems to work, though the effect is subtle.  The expected behaviour
>> of index_pages_fetched is a bit hard to fathom when the cache share
>> pushes it between cases A,B and C of the formula, but that's not
>> really down to this patch.
>
> Thanks for looking at this and testing it too.
>
> The primary purpose of this patch was step 1 in delaying the creation
> of RangeTblEntrys for partitions until after partition pruning has
> taken place.  The code I had to do this caused problems around the
> total_table_pages calculation due to the lack of RangeTblEntry for the
> partitions at the time it was being calculated. But regardless of
> that, I still believe where we currently calculate this number is
> subtlely broken as it counts all partitions, even ones that will later
> be pruned, thus decreasing the likelihood of an index being used as
> random_page_cost would be applied to a higher number of index pages.
>
> Amit Langote has since posted a patch to delay the RangeTblEntry
> creation until after pruning. His patch happens to also move the
> total_table_pages calculation, but I believe this change should be
> made as an independent commit to anything else.  I've kept it in the
> commitfest for that reason.

Yeah, if this patch is a win independent of the other project of delaying
partition RTE creation, which seems to be true, I think we should go ahead
with applying this patch.

>> Basically I think it's ready for a committer to look at.  Should I
>> update the CF entry?
>
> That sounds good, please do.

+1

Thanks,
Amit


Reply | Threaded
Open this post in threaded view
|

Re: Calculate total_table_pages after set_base_rel_sizes()

Amit Langote-2
On 2018/10/11 13:45, Amit Langote wrote:

> On 2018/10/07 17:43, David Rowley wrote:
>> Amit Langote has since posted a patch to delay the RangeTblEntry
>> creation until after pruning. His patch happens to also move the
>> total_table_pages calculation, but I believe this change should be
>> made as an independent commit to anything else.  I've kept it in the
>> commitfest for that reason.
>
> Yeah, if this patch is a win independent of the other project of delaying
> partition RTE creation, which seems to be true, I think we should go ahead
> with applying this patch.

Fwiw, I've incorporated David's patch in my own series, so that one of my
patches no longer has the code movement hunks that are in his.  I will
post the new version of my patch series like that.

Thanks,
Amit


Reply | Threaded
Open this post in threaded view
|

Re: Calculate total_table_pages after set_base_rel_sizes()

David Rowley-3
On 12 October 2018 at 23:35, Amit Langote <[hidden email]> wrote:

> On 2018/10/11 13:45, Amit Langote wrote:
>> On 2018/10/07 17:43, David Rowley wrote:
>>> Amit Langote has since posted a patch to delay the RangeTblEntry
>>> creation until after pruning. His patch happens to also move the
>>> total_table_pages calculation, but I believe this change should be
>>> made as an independent commit to anything else.  I've kept it in the
>>> commitfest for that reason.
>>
>> Yeah, if this patch is a win independent of the other project of delaying
>> partition RTE creation, which seems to be true, I think we should go ahead
>> with applying this patch.
>
> Fwiw, I've incorporated David's patch in my own series, so that one of my
> patches no longer has the code movement hunks that are in his.  I will
> post the new version of my patch series like that.

Thanks. I'll keep this open here anyway so the change can be
considered independently. I think counting pages of pruned partitions
is a bug that should be fixed. You can just drop that patch from your
set if this gets committed.


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Calculate total_table_pages after set_base_rel_sizes()

Amit Langote
On Sat, Oct 13, 2018 at 7:36 David Rowley <[hidden email]> wrote:
On 12 October 2018 at 23:35, Amit Langote <[hidden email]> wrote:
> On 2018/10/11 13:45, Amit Langote wrote:
>> On 2018/10/07 17:43, David Rowley wrote:
>>> Amit Langote has since posted a patch to delay the RangeTblEntry
>>> creation until after pruning. His patch happens to also move the
>>> total_table_pages calculation, but I believe this change should be
>>> made as an independent commit to anything else.  I've kept it in the
>>> commitfest for that reason.
>>
>> Yeah, if this patch is a win independent of the other project of delaying
>> partition RTE creation, which seems to be true, I think we should go ahead
>> with applying this patch.
>
> Fwiw, I've incorporated David's patch in my own series, so that one of my
> patches no longer has the code movement hunks that are in his.  I will
> post the new version of my patch series like that.

Thanks. I'll keep this open here anyway so the change can be
considered independently. I think counting pages of pruned partitions
is a bug that should be fixed. You can just drop that patch from your
set if this gets committed.

Yeah, that was my plan anyway.

Thanks,
Amit