Planner can't seem to use partial function indexes with parameter from join

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Planner can't seem to use partial function indexes with parameter from join

Alastair McKinley
Hello all,

I am having quite an interesting problem trying to get the planner to use my indexes as intended in my setup.
I am using partial functional indexes that have a different function parameter based on the record type.

A dynamically generated query using unions that are more explicit about index usage is significantly faster (10-50x) that my preferred plain sql approach using joins.

I have reduced my scenario to a minimal test case with inline comments to illustrate the issue here https://gist.github.com/a-mckinley/1b0e95142789cbc09121b71a83d03f45

Is there something that I am missing to allow the planner to use the underlying indexes?  Or is the scenario too complex and should I stick with dynamic sql?

Best regards,

Alastair

Reply | Threaded
Open this post in threaded view
|

Re: Planner can't seem to use partial function indexes with parameter from join

David Rowley-3
On Sun, 14 Apr 2019 at 21:55, Alastair McKinley
<[hidden email]> wrote:
> I have reduced my scenario to a minimal test case with inline comments to illustrate the issue here https://gist.github.com/a-mckinley/1b0e95142789cbc09121b71a83d03f45
>
> Is there something that I am missing to allow the planner to use the underlying indexes?  Or is the scenario too complex and should I stick with dynamic sql?

I'd say the biggest part of the problem is that "record" is not in
first normal form. However, it's worse than that as you're having to
perform a join to determine how to fetch the value you want.  If
"record" was designed to have a "record_prefix" column and then store
the remainder of the record_text over in the column by that name, then
with an index on record (record_prefix, type_id) you could just do:

explain analyze
select r.type_id,count(*)
from record r
INNER JOIN (VALUES(1,'aa'),(2,'aab')) v(type_id, record_prefix) ON
r.type_id = v.type_id AND r.record_prefix = v.record_prefix
group by r.type_id;

If you're lucky, and there's a good chance you would be, then you'd
get a parameterised nested loop join.

As for why the partial indexes cannot be used; partial indexes can
only be used when the planner is able to match the index up to quals
that will be evaluated in the table level quals (i.e not join quals).
There's no such thing as dynamic index selection at execution time.
The closest thing we have to that is run-time partition pruning in
PG11, but that can't help you either since the partition key cannot
contain values from other tables. You'd still need to normalise the
record table. With that and a partitioned table, there might be
further advantages of partition-wise aggregation, but that might not
buy you much more than just normalising the table.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Planner can't seem to use partial function indexes with parameter from join

Alastair McKinley
Hi David,

Thanks for having a look at this. You are right about the normalisation of record, unfortunately there are other constraints with the application that suit the denormalised approach for this table very well.

Although my test case is quite simple compared to the real application (the prefix index is just for illustration of the function index), I have considered the record_prefix column idea you mentioned and will try that.

It's good to clarify that runtime index selection won't work here (I had hoped it might be possible), I will almost certainly resort to the dynamically generated query approach in the interim.

Best regards,

Alastair

From: David Rowley
Sent: Sunday, 14 April, 14:57
Subject: Re: Planner can't seem to use partial function indexes with parameter from join
To: Alastair McKinley
On Sun, 14 Apr 2019 at 21:55, Alastair McKinley
<[hidden email]> wrote:
> I have reduced my scenario to a minimal test case with inline comments to illustrate the issue here https://gist.github.com/a-mckinley/1b0e95142789cbc09121b71a83d03f45
>
> Is there something that I am missing to allow the planner to use the underlying indexes?  Or is the scenario too complex and should I stick with dynamic sql?

I'd say the biggest part of the problem is that "record" is not in
first normal form. However, it's worse than that as you're having to
perform a join to determine how to fetch the value you want.  If
"record" was designed to have a "record_prefix" column and then store
the remainder of the record_text over in the column by that name, then
with an index on record (record_prefix, type_id) you could just do:

explain analyze
select r.type_id,count(*)
from record r
INNER JOIN (VALUES(1,'aa'),(2,'aab')) v(type_id, record_prefix) ON
r.type_id = v.type_id AND r.record_prefix = v.record_prefix
group by r.type_id;

If you're lucky, and there's a good chance you would be, then you'd
get a parameterised nested loop join.

As for why the partial indexes cannot be used; partial indexes can
only be used when the planner is able to match the index up to quals
that will be evaluated in the table level quals (i.e not join quals).
There's no such thing as dynamic index selection at execution time.
The closest thing we have to that is run-time partition pruning in
PG11, but that can't help you either since the partition key cannot
contain values from other tables. You'd still need to normalise the
record table. With that and a partitioned table, there might be
further advantages of partition-wise aggregation, but that might not
buy you much more than just normalising the table.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services