WIP Join Removal

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
26 messages Options
12
Reply | Threaded
Open this post in threaded view
|

WIP Join Removal

Simon Riggs
As discussed on 26 June, "Join Removal/Vertical Partitioning", here's a
patch to remove joins in certain circumstances.

Tested and blind reviewed, but this is complex and subtle enough I
welcome and expect your comments on corner cases and missed
complications. (Lord knows, I've been down a few blind alleys writing it
to date...)

Patch works, but there's a bit I haven't finished yet - checking unique
indexes. So patch is marked WIP for now. Depending upon how long
commitfest lasts I may have a fully working version. (Yes, I know its
only 1 hours work, but haven't worked out how yet...)

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches

join_removal.v4.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Heikki Linnakangas-3
Simon Riggs wrote:
> Patch works, but there's a bit I haven't finished yet - checking unique
> indexes.

Did plan invalidation make it safe to rely on the presence of a unique
index for planning decisions?

Couldn't we also do join removal for inner joins, when there's a foreign
key reference that enforces that there's one and only one matching tuple
in the removed table:

SELECT child.data FROM child, parent WHERE child.fkey = parent.pkey

?


> + /*
> + * We can now remove join by pulling up child plan from the keeprel.
> + * This needs to be done considering costs, since its possible for
> + * a nested inner indexscan plan to be cheaper. So it isn't
> + * always desirable to remove the join.

Can you elaborate that a bit? I can't imagine a case where we wouldn't
want to remove a join, when we know we can.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Simon Riggs

On Mon, 2008-09-01 at 22:23 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > Patch works, but there's a bit I haven't finished yet - checking unique
> > indexes.
>
> Did plan invalidation make it safe to rely on the presence of a unique
> index for planning decisions?

My understanding was "Yes" and this case was the specific reason I
originally wanted to pursue plan invalidation back in 2006.

> Couldn't we also do join removal for inner joins, when there's a foreign
> key reference that enforces that there's one and only one matching tuple
> in the removed table:
>
> SELECT child.data FROM child, parent WHERE child.fkey = parent.pkey

Hmm, I had thought this was the same case, but the inner join
possibility wasn't something I'd seen. Guess that flaw shows this is all
original thought - I'll go back and read that optimizer blog again...

I agree it will work.

We would need to replace the join condition with an alteration of the
original quals on child so that we add "AND child.fkey is not null".
Which would mean we would need to re-plan the access to that base
relation so we picked up the new qual and potentially used an index for
it as well. That would be possible only if the join condition exactly
matches the FK constraint.

Hmm, will think about that, its certainly not an easy addition, for me.
I'll concentrate on getting this patch finished and committed first.

> > + /*
> > + * We can now remove join by pulling up child plan from the keeprel.
> > + * This needs to be done considering costs, since its possible for
> > + * a nested inner indexscan plan to be cheaper. So it isn't
> > + * always desirable to remove the join.
>
> Can you elaborate that a bit? I can't imagine a case where we wouldn't
> want to remove a join, when we know we can.

Neither could I when I first looked at this.

It turns out that a join like this

select a.col2
from a left outer join b on a.col1 = b.col1
where b.col2 = 1;

can be cheaper if we don't remove the join, when there is an index on
a.col1 and b.col2, because the presence of b allows the values returned
from b to be used for an index scan on a.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Heikki Linnakangas-3
Simon Riggs wrote:
> It turns out that a join like this
>
> select a.col2
> from a left outer join b on a.col1 = b.col1
> where b.col2 = 1;
>
> can be cheaper if we don't remove the join, when there is an index on
> a.col1 and b.col2, because the presence of b allows the values returned
> from b to be used for an index scan on a.

Umm, you *can't* remove that join. Because of the condition "b.col2 =
1", which implies that "b.col1 IS NOT NULL", that's actually equal to:

select a.col2
from a inner join b on a.col1 = b.col1
where b.col2 = 1;

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Simon Riggs

On Tue, 2008-09-02 at 13:20 +0300, Heikki Linnakangas wrote:

> Simon Riggs wrote:
> > It turns out that a join like this
> >
> > select a.col2
> > from a left outer join b on a.col1 = b.col1
> > where b.col2 = 1;
> >
> > can be cheaper if we don't remove the join, when there is an index on
> > a.col1 and b.col2, because the presence of b allows the values returned
> > from b to be used for an index scan on a.
>
> Umm, you *can't* remove that join.

Yes, you can. The presence or absence of rows in b is not important to
the result of the query because of the "left outer join".

I spent nearly a whole day going down that deadend also.

> Because of the condition "b.col2 =
> 1", which implies that "b.col1 IS NOT NULL",

No it doesn't, but as above, it is irrelevant anyway.

> that's actually equal to:

> select a.col2
> from a inner join b on a.col1 = b.col1
> where b.col2 = 1;

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Heikki Linnakangas-3
Simon Riggs wrote:

> On Tue, 2008-09-02 at 13:20 +0300, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> It turns out that a join like this
>>>
>>> select a.col2
>>> from a left outer join b on a.col1 = b.col1
>>> where b.col2 = 1;
>>>
>>> can be cheaper if we don't remove the join, when there is an index on
>>> a.col1 and b.col2, because the presence of b allows the values returned
>>> from b to be used for an index scan on a.
>> Umm, you *can't* remove that join.
>
> Yes, you can. The presence or absence of rows in b is not important to
> the result of the query because of the "left outer join".
>
> I spent nearly a whole day going down that deadend also.

Oh. How does the query look like after removing the join, then?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Simon Riggs

On Tue, 2008-09-02 at 13:41 +0300, Heikki Linnakangas wrote:

> Simon Riggs wrote:
> > On Tue, 2008-09-02 at 13:20 +0300, Heikki Linnakangas wrote:
> >> Simon Riggs wrote:
> >>> It turns out that a join like this
> >>>
> >>> select a.col2
> >>> from a left outer join b on a.col1 = b.col1
> >>> where b.col2 = 1;
> >>>
> >>> can be cheaper if we don't remove the join, when there is an index on
> >>> a.col1 and b.col2, because the presence of b allows the values returned
> >>> from b to be used for an index scan on a.
> >> Umm, you *can't* remove that join.
> >
> > Yes, you can. The presence or absence of rows in b is not important to
> > the result of the query because of the "left outer join".
> >
> > I spent nearly a whole day going down that deadend also.
>
> Oh. How does the query look like after removing the join, then?

Same answer, just slower. Removing the join makes the access to a into a
SeqScan, whereas it was a two-table index plan when both tables present.
The two table plan is added by the immediately preceding call add_... -
i.e. that plan is only added during join time not during planning of
base relations.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Heikki Linnakangas-3
Simon Riggs wrote:

> On Tue, 2008-09-02 at 13:41 +0300, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> On Tue, 2008-09-02 at 13:20 +0300, Heikki Linnakangas wrote:
>>>> Simon Riggs wrote:
>>>>> It turns out that a join like this
>>>>>
>>>>> select a.col2
>>>>> from a left outer join b on a.col1 = b.col1
>>>>> where b.col2 = 1;
>>>>>
>>>>> can be cheaper if we don't remove the join, when there is an index on
>>>>> a.col1 and b.col2, because the presence of b allows the values returned
>>>>> from b to be used for an index scan on a.
>>>> Umm, you *can't* remove that join.
>>> Yes, you can. The presence or absence of rows in b is not important to
>>> the result of the query because of the "left outer join".
>>>
>>> I spent nearly a whole day going down that deadend also.
>> Oh. How does the query look like after removing the join, then?
>
> Same answer, just slower. Removing the join makes the access to a into a
> SeqScan, whereas it was a two-table index plan when both tables present.
> The two table plan is added by the immediately preceding call add_... -
> i.e. that plan is only added during join time not during planning of
> base relations.

I mean, can you how me an SQL query of what's left after removing the
join? Certainly just removing the join and the WHERE clause doesn't give
the same answer. Or is it something that can't be expressed with SQL?
What's the filter in the SeqScan?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Gregory Stark-2
In reply to this post by Simon Riggs
Simon Riggs <[hidden email]> writes:

> Same answer, just slower. Removing the join makes the access to a into a
> SeqScan, whereas it was a two-table index plan when both tables present.
> The two table plan is added by the immediately preceding call add_... -
> i.e. that plan is only added during join time not during planning of
> base relations.

Perhaps it would clearer to discuss a non-outer join here:

select invoices.*
  from customer join invoices using (company_id,customer_id)
 where customer_id = ?

where there's a foreign key relation guaranteeing that every invoice has a
matching <company_id, customer_id>.

If there's an index on customer(customer_id) but not on invoices(customer_id)
then conceivably it would be faster to use that than scan all of the invoices.

I wonder if it would be more worthwhile to remove them and have a subsequent
phase where we look for possible joins to *add*. So even if the user writes
"select * from invoices where customer_id=?" the planner might be able to
discover that it can find those records quicker by scanning customer, finding
the matching <company_id,customer_id> and then using an index to look them up
in invoices.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Simon Riggs
In reply to this post by Heikki Linnakangas-3

On Tue, 2008-09-02 at 14:03 +0300, Heikki Linnakangas wrote:

> Simon Riggs wrote:
> > On Tue, 2008-09-02 at 13:41 +0300, Heikki Linnakangas wrote:
> >> Simon Riggs wrote:
> >>> On Tue, 2008-09-02 at 13:20 +0300, Heikki Linnakangas wrote:
> >>>> Simon Riggs wrote:
> >>>>> It turns out that a join like this
> >>>>>
> >>>>> select a.col2
> >>>>> from a left outer join b on a.col1 = b.col1
> >>>>> where b.col2 = 1;
> >>>>>
> >>>>> can be cheaper if we don't remove the join, when there is an index on
> >>>>> a.col1 and b.col2, because the presence of b allows the values returned
> >>>>> from b to be used for an index scan on a.
> >>>> Umm, you *can't* remove that join.
> >>> Yes, you can. The presence or absence of rows in b is not important to
> >>> the result of the query because of the "left outer join".
> >>>
> >>> I spent nearly a whole day going down that deadend also.
> >> Oh. How does the query look like after removing the join, then?
> >
> > Same answer, just slower. Removing the join makes the access to a into a
> > SeqScan, whereas it was a two-table index plan when both tables present.
> > The two table plan is added by the immediately preceding call add_... -
> > i.e. that plan is only added during join time not during planning of
> > base relations.
>
> I mean, can you how me an SQL query of what's left after removing the
> join? Certainly just removing the join and the WHERE clause doesn't give
> the same answer.

Yes, it does

select a.col2
from a left outer join b on a.col1 = b.col1
where b.col2 = 1;

is logically equivalent to

select a.col2
from a;

and hence removing the join produces a SeqScan plan, whereas the
equivalent join can in some circumstances be faster.

I discovered this, I didn't think of it in advance.

> Or is it something that can't be expressed with SQL?
> What's the filter in the SeqScan?

There is no filter in the SeqScan. Try some queries and you'll see what
I mean.

I've said its a dead end and that I spent hours thinking that, so please
think about this...

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Heikki Linnakangas-3
Simon Riggs wrote:
> select a.col2
> from a left outer join b on a.col1 = b.col1
> where b.col2 = 1;
>
> is logically equivalent to
>
> select a.col2
> from a;

No, it's not:

postgres=# CREATE TABLE a (col1 int4, col2 int4);
CREATE TABLE
postgres=# CREATE TABLE b (col1 int4, col2 int4);
CREATE TABLE
postgres=# INSERT INTO a VALUES (1,1);
INSERT 0 1
postgres=# select a.col2 from a;
  col2
------
     1
(1 row)

postgres=# select a.col2 from a left outer join b on a.col1 = b.col1
where b.col2 = 1;
  col2
------
(0 rows)

But anyway, Greg's example looks valid, and proves the point that
removing a join isn't always a win.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Simon Riggs
In reply to this post by Simon Riggs

On Tue, 2008-09-02 at 10:41 +0100, Simon Riggs wrote:
> On Mon, 2008-09-01 at 22:23 +0300, Heikki Linnakangas wrote:

> > Couldn't we also do join removal for inner joins, when there's a foreign
> > key reference that enforces that there's one and only one matching tuple
> > in the removed table:
> >
> > SELECT child.data FROM child, parent WHERE child.fkey = parent.pkey
>
> Hmm, I had thought this was the same case, but the inner join
> possibility wasn't something I'd seen. Guess that flaw shows this is all
> original thought - I'll go back and read that optimizer blog again...
>
> I agree it will work.
>
> We would need to replace the join condition with an alteration of the
> original quals on child so that we add "AND child.fkey is not null".
> Which would mean we would need to re-plan the access to that base
> relation so we picked up the new qual and potentially used an index for
> it as well. That would be possible only if the join condition exactly
> matches the FK constraint.

Also, note that when we do an inner join we must not have any
qualification of the rows on the checkrel. Any qualification that
removes rows will alter the answer.

This is a direct contrast to the left outer join case where the presence
or absence of a single table qualification on the checkrel has *no
effect* on the results of the query (as long as the qual is immutable).

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Simon Riggs
In reply to this post by Gregory Stark-2

On Tue, 2008-09-02 at 12:05 +0100, Gregory Stark wrote:

> I wonder if it would be more worthwhile to remove them and have a
> subsequent phase where we look for possible joins to *add*. So even if
> the user writes
> "select * from invoices where customer_id=?" the planner might be able
> to discover that it can find those records quicker by scanning
> customer, finding the matching <company_id,customer_id> and then using
> an index to look them up in invoices.

That's a good idea.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Simon Riggs
In reply to this post by Heikki Linnakangas-3

On Tue, 2008-09-02 at 14:20 +0300, Heikki Linnakangas wrote:

> Simon Riggs wrote:
> > select a.col2
> > from a left outer join b on a.col1 = b.col1
> > where b.col2 = 1;
> >
> > is logically equivalent to
> >
> > select a.col2
> > from a;
>
> No, it's not:
>
> postgres=# CREATE TABLE a (col1 int4, col2 int4);
> CREATE TABLE
> postgres=# CREATE TABLE b (col1 int4, col2 int4);
> CREATE TABLE
> postgres=# INSERT INTO a VALUES (1,1);
> INSERT 0 1
> postgres=# select a.col2 from a;
>   col2
> ------
>      1
> (1 row)
>
> postgres=# select a.col2 from a left outer join b on a.col1 = b.col1
> where b.col2 = 1;
>   col2
> ------
> (0 rows)

You raise an interesting and important point that shows an error of
mine. Notice that

 select a.col2 from a left outer join b on a.col1 = b.col1
 *and* b.col2 = 1;

can be re-written as

 select a.col2 from a;

whereas

 select a.col2 from a left outer join b on a.col1 = b.col1
 where b.col2 = 1;

cannot, as you show.

It seems I wrote my original tests using "and" instead of "where" and
hadn't noticed the distinction. Thanks for helping me catch that error.

I will put back the code that looks for an empty filter condition on the
checkrel. That day was not wasted after all.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Simon Riggs
In reply to this post by Gregory Stark-2

On Tue, 2008-09-02 at 12:05 +0100, Gregory Stark wrote:

> I wonder if it would be more worthwhile to remove them and have a subsequent
> phase where we look for possible joins to *add*. So even if the user writes
> "select * from invoices where customer_id=?" the planner might be able to
> discover that it can find those records quicker by scanning customer, finding
> the matching <company_id,customer_id> and then using an index to look them up
> in invoices.

This seems a less useful idea now just simply because it is such a
special case.

We would need to have a case where we have a table A that does not have
an index on a specific column, yet table B does have an index on the
specific column. But also when A references B as a foreign key and where
the column is a subset of the columns of the primary key of B.

That means only queries like

select ...
from a
where a.col2 = x;

can be transformed into

select ...
from a join b on (foreign key cols)
where a.col2 = x;

and then because a.col2 is a subset of foreign key columns we can infer
that b.col2 = x.

So the pre-conditions for this to be useful are:
* constraint on subset of a FK
* subset of FK is indexed on B
* subset of FK is not indexed on A

Which doesn't seem that likely to occur.


Thanks both to Heikki and Greg for good, fast input on this patch.
Nothing more needed now while I rework patch.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Heikki Linnakangas-3
In reply to this post by Gregory Stark-2
Gregory Stark wrote:
> I wonder if it would be more worthwhile to remove them and have a subsequent
> phase where we look for possible joins to *add*. So even if the user writes
> "select * from invoices where customer_id=?" the planner might be able to
> discover that it can find those records quicker by scanning customer, finding
> the matching <company_id,customer_id> and then using an index to look them up
> in invoices.

Yeah, that would be cool. The question is whether it's worth the
additional overhead in planner, compared to the gain in the rare case
that it's applicable. That's always the thing with planner tricks like
this. I think we'll eventually need some sort of tuning knob to control
how hard the planner tries to apply different optimizations like that.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Heikki Linnakangas-3
In reply to this post by Simon Riggs
Simon Riggs wrote:
> It seems I wrote my original tests using "and" instead of "where" and
> hadn't noticed the distinction. Thanks for helping me catch that error.

Ah, yeah, that's a big difference. Proving correctness is hard, but to
refute something you need just one test case that fails ;-).

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Tom Lane-2
In reply to this post by Simon Riggs
Simon Riggs <[hidden email]> writes:
> On Mon, 2008-09-01 at 22:23 +0300, Heikki Linnakangas wrote:
>> Did plan invalidation make it safe to rely on the presence of a unique
>> index for planning decisions?

> My understanding was "Yes" and this case was the specific reason I
> originally wanted to pursue plan invalidation back in 2006.

Yeah, it should work.  The theory is that any schema change that could
affect planning should result in broadcasting a relcache inval message
for the table (not just the index, note).  I'm pretty confident that
that works for index addition and removal (cf index_update_stats and
index_drop).  There might be some situations where we need to force a
relcache inval but don't currently do so --- constraint addition/removal
for instance I'm not too sure about.  But that would represent an easily
fixable bug.

                        regards, tom lane

--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Tom Lane-2
In reply to this post by Simon Riggs
Simon Riggs <[hidden email]> writes:
>> Oh. How does the query look like after removing the join, then?

> Same answer, just slower. Removing the join makes the access to a into a
> SeqScan, whereas it was a two-table index plan when both tables present.

I don't really believe this: please show an actual case where the join
would be faster.

AFAICS, in the outer-join examples, it is not possible for a join to
enable some kind of indexscan on the outer table, because by definition
an outer join excludes none of the left-hand rows.  So a seqscan on the
outer is optimal.

I also find all the worry about generating other plans for the inner
relation to be off the mark.  You're not going to *use* any plan for the
inner rel, so who cares what plans it has?

                        regards, tom lane

--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Reply | Threaded
Open this post in threaded view
|

Re: WIP Join Removal

Gregory Stark-2
In reply to this post by Tom Lane-2
Tom Lane <[hidden email]> writes:

> Simon Riggs <[hidden email]> writes:
>> On Mon, 2008-09-01 at 22:23 +0300, Heikki Linnakangas wrote:
>>> Did plan invalidation make it safe to rely on the presence of a unique
>>> index for planning decisions?
>
>> My understanding was "Yes" and this case was the specific reason I
>> originally wanted to pursue plan invalidation back in 2006.

It may be worth considering what other cases might need this info and taking
them into account to be sure the solution is usable for them too. I suspect
we'll probably need a generic function for determining whether a PathKey list
can be proved unique.

Other cases off the top of three other cases where this could be useful -- but
generally anywhere the planner introduces a Unique node could benefit from
looking at this.

a) Turn a UNION into UNION ALL if there are unique indexes for any column in each
side and at least one column is a constant in each side and none of the
constants are equal.


b) Remove the aggregate on IN subqueries when there's a unique constraint so
that:

  SELECT * from a where a.fk IN (select pk FROM b)

Can do a semijoin without taking care to avoid duplicating records in "a" if
there should be duplicate values of "pk" in "b".


c) Turn bad mysqlish queries which are really semijoins (used to work around
their historic lack of subqueries) such as:

 SELECT DISTINCT a.pk FROM a JOIN b USING (x)

into

 SELECT a.pk FROM a WHERE x IN (SELECT x FROM b)



--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

--
Sent via pgsql-patches mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
12