Performance improvement for queries with IN clause

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Performance improvement for queries with IN clause

Rafia Sabih
Hello all,

I would like to direct your attention to the queries of following type,
select <some_column(s)>
from <table_name>
where <some_column> IN (<a_list_of_some_values>)

the plan for such a query uses index scan (or index-only), now in our experiments, if the provided list is sorted then query performance improves by ~10%. Which makes sense also as once we have found the required btree leaf we just keep moving in one direction, which should be expectantly less time consuming than searching the tree again.

Now, my question is shouldn't we always use this list in sorted order, in other words can there be scenarios where such a sorting will not help? I am talking about only the cases where the list consists of all constants and could fit in memory. Basically, when we are transforming the in expression and found that it consists of all constants, then sort it as well, codewise at transfromAExprIn, of course there might be better ways to accomplish this.

So, your thoughts, opinions, suggestions are more than welcome.

--
Regards,
Rafia Sabih
Reply | Threaded
Open this post in threaded view
|

Re: Performance improvement for queries with IN clause

Andreas Karlsson
On 11/8/19 2:52 PM, Rafia Sabih wrote:
> Now, my question is shouldn't we always use this list in sorted order,
> in other words can there be scenarios where such a sorting will not
> help? I am talking about only the cases where the list consists of all
> constants and could fit in memory. Basically, when we are
> transforming the in expression and found that it consists of all
> constants, then sort it as well, codewise at transfromAExprIn, of course
> there might be better ways to accomplish this.
>
> So, your thoughts, opinions, suggestions are more than welcome.

If it is worth sorting them should depend on the index, e.g. for hash
indexes sorting would just be a waste of time.

Andreas


Reply | Threaded
Open this post in threaded view
|

Re: Performance improvement for queries with IN clause

akapila
On Sat, Nov 9, 2019 at 5:22 PM Andreas Karlsson <[hidden email]> wrote:

>
> On 11/8/19 2:52 PM, Rafia Sabih wrote:
> > Now, my question is shouldn't we always use this list in sorted order,
> > in other words can there be scenarios where such a sorting will not
> > help? I am talking about only the cases where the list consists of all
> > constants and could fit in memory. Basically, when we are
> > transforming the in expression and found that it consists of all
> > constants, then sort it as well, codewise at transfromAExprIn, of course
> > there might be better ways to accomplish this.
> >
> > So, your thoughts, opinions, suggestions are more than welcome.
>
> If it is worth sorting them should depend on the index, e.g. for hash
> indexes sorting would just be a waste of time.
>

I think we also need to be careful that this might lead to regression
for cases where the list is already sorted.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Performance improvement for queries with IN clause

Rafia Sabih
In reply to this post by Andreas Karlsson
On Sat, 9 Nov 2019 at 12:52, Andreas Karlsson <[hidden email]> wrote:
On 11/8/19 2:52 PM, Rafia Sabih wrote:
> Now, my question is shouldn't we always use this list in sorted order,
> in other words can there be scenarios where such a sorting will not
> help? I am talking about only the cases where the list consists of all
> constants and could fit in memory. Basically, when we are
> transforming the in expression and found that it consists of all
> constants, then sort it as well, codewise at transfromAExprIn, of course
> there might be better ways to accomplish this.
>
> So, your thoughts, opinions, suggestions are more than welcome.

If it is worth sorting them should depend on the index, e.g. for hash
indexes sorting would just be a waste of time.

Hi Andreas, 
Thanks for your response. Here, I meant this list sorting only for Btree, as you well pointed out that for other indexes like hash this wouldn't really make sense. 


--
Regards,
Rafia Sabih