Measuring the Query Optimizer Effect: Turning off the QO?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

Measuring the Query Optimizer Effect: Turning off the QO?

Tom Mercha
Hi All

As we know, a query goes through number of stages before it is executed.
One of these stages is query optimization (QO).

There are various parameters to try and influence optimizer decisions
and costs. But I wanted to measure the effect of such a stage by turning
it off completely and I can't find such a parameter which explicitly
does that. Then I could execute a query to get the effect of "QO active
and "QO inactive" and compare.

Obviously, I know well what the results would generally look like but I
am just interested in measuring the differences for various types of
queries. I am also aware that this is a simple comparison - there are
more interesting comparisons to perform with QO tweaks, but right now I
am interested in something basic.

So how would one shut down QO? Or at least, obtaining the guarantee of
generating the worst plan possible, ideally without touching many
parameters?

Best,
Tom
lup
Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

lup


> On Jul 7, 2019, at 5:22 PM, Tom Mercha <[hidden email]> wrote:
>
> Hi All
>
> As we know, a query goes through number of stages before it is executed.
> One of these stages is query optimization (QO).
>
> There are various parameters to try and influence optimizer decisions
> and costs. But I wanted to measure the effect of such a stage by turning
> it off completely and I can't find such a parameter which explicitly
> does that. Then I could execute a query to get the effect of "QO active
> and "QO inactive" and compare.
>
> Obviously, I know well what the results would generally look like but I
> am just interested in measuring the differences for various types of
> queries. I am also aware that this is a simple comparison - there are
> more interesting comparisons to perform with QO tweaks, but right now I
> am interested in something basic.
>
> So how would one shut down QO? Or at least, obtaining the guarantee of
> generating the worst plan possible, ideally without touching many
> parameters?
>
> Best,
> Tom

Drop all indices?



Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

Tom Mercha
On 08/07/2019 01:46, Rob Sargent wrote:

>
>
>> On Jul 7, 2019, at 5:22 PM, Tom Mercha <[hidden email]> wrote:
>>
>> Hi All
>>
>> As we know, a query goes through number of stages before it is executed.
>> One of these stages is query optimization (QO).
>>
>> There are various parameters to try and influence optimizer decisions
>> and costs. But I wanted to measure the effect of such a stage by turning
>> it off completely and I can't find such a parameter which explicitly
>> does that. Then I could execute a query to get the effect of "QO active
>> and "QO inactive" and compare.
>>
>> Obviously, I know well what the results would generally look like but I
>> am just interested in measuring the differences for various types of
>> queries. I am also aware that this is a simple comparison - there are
>> more interesting comparisons to perform with QO tweaks, but right now I
>> am interested in something basic.
>>
>> So how would one shut down QO? Or at least, obtaining the guarantee of
>> generating the worst plan possible, ideally without touching many
>> parameters?
>>
>> Best,
>> Tom
>
> Drop all indices?
>

Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of
relational algebra operators, as well as their parallelisation. I am
talking about turning off this query optimization, so I am already
assuming that indexes aren't present.
lup
Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

lup


On Jul 7, 2019, at 5:49 PM, Tom Mercha <[hidden email]> wrote:

On 08/07/2019 01:46, Rob Sargent wrote:


On Jul 7, 2019, at 5:22 PM, Tom Mercha <[hidden email]> wrote:

Hi All

As we know, a query goes through number of stages before it is executed.
One of these stages is query optimization (QO).

There are various parameters to try and influence optimizer decisions
and costs. But I wanted to measure the effect of such a stage by turning
it off completely and I can't find such a parameter which explicitly
does that. Then I could execute a query to get the effect of "QO active
and "QO inactive" and compare.

Obviously, I know well what the results would generally look like but I
am just interested in measuring the differences for various types of
queries. I am also aware that this is a simple comparison - there are
https://gitlab.com/camplab/jpsgcs interesting comparisons to perform with QO tweaks, but right now I
am interested in something basic.

So how would one shut down QO? Or at least, obtaining the guarantee of
generating the worst plan possible, ideally without touching many
parameters?

Best,
Tom

Drop all indices?


Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of
relational algebra operators, as well as their parallelisation. I am
talking about turning off this query optimization, so I am already
assuming that indexes aren't present.

Have you played with any of these settings?

postgres=# select version();
                                                 version                                                 
---------------------------------------------------------------------------------------------------------
 PostgreSQL 10.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
(1 row)

postgres=# select name, setting, unit,short_desc from pg_settings where name ~ 'para';
              name               | setting | unit |                                             short_desc                                             
---------------------------------+---------+------+----------------------------------------------------------------------------------------------------
 force_parallel_mode             | off     |      | Forces use of parallel query facilities.
 max_parallel_workers            | 16      |      | Sets the maximum number of parallel workers that can be active at one time.
 max_parallel_workers_per_gather | 8       |      | Sets the maximum number of parallel processes per executor node.
 min_parallel_index_scan_size    | 64      | 8kB  | Sets the minimum amount of index data for a parallel scan.
 min_parallel_table_scan_size    | 1024    | 8kB  | Sets the minimum amount of table data for a parallel scan.
 parallel_setup_cost             | 1000    |      | Sets the planner's estimate of the cost of starting up worker processes for parallel query.
 parallel_tuple_cost             | 0.1     |      | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend.
 ssl_dh_params_file              |         |      | Location of the SSL DH parameters file.
(8 rows)

lup
Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

lup


On Jul 7, 2019, at 6:01 PM, Rob Sargent <[hidden email]> wrote:



On Jul 7, 2019, at 5:49 PM, Tom Mercha <[hidden email]> wrote:

On 08/07/2019 01:46, Rob Sargent wrote:


On Jul 7, 2019, at 5:22 PM, Tom Mercha <[hidden email]> wrote:

Hi All

As we know, a query goes through number of stages before it is executed.
One of these stages is query optimization (QO).

There are various parameters to try and influence optimizer decisions
and costs. But I wanted to measure the effect of such a stage by turning
it off completely and I can't find such a parameter which explicitly
does that. Then I could execute a query to get the effect of "QO active
and "QO inactive" and compare.

Obviously, I know well what the results would generally look like but I
am just interested in measuring the differences for various types of
queries. I am also aware that this is a simple comparison - there are
https://gitlab.com/camplab/jpsgcs interesting comparisons to perform with QO tweaks, but right now I
am interested in something basic.

So how would one shut down QO? Or at least, obtaining the guarantee of
generating the worst plan possible, ideally without touching many
parameters?

Best,
Tom

Drop all indices?


Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of 
relational algebra operators, as well as their parallelisation. I am 
talking about turning off this query optimization, so I am already 
assuming that indexes aren't present.

Have you played with any of these settings?

postgres=# select version();
                                                 version                                                 
---------------------------------------------------------------------------------------------------------
 PostgreSQL 10.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
(1 row)

postgres=# select name, setting, unit,short_desc from pg_settings where name ~ 'para';
              name               | setting | unit |                                             short_desc                                             
---------------------------------+---------+------+----------------------------------------------------------------------------------------------------
 force_parallel_mode             | off     |      | Forces use of parallel query facilities.
 max_parallel_workers            | 16      |      | Sets the maximum number of parallel workers that can be active at one time.
 max_parallel_workers_per_gather | 8       |      | Sets the maximum number of parallel processes per executor node.
 min_parallel_index_scan_size    | 64      | 8kB  | Sets the minimum amount of index data for a parallel scan.
 min_parallel_table_scan_size    | 1024    | 8kB  | Sets the minimum amount of table data for a parallel scan.
 parallel_setup_cost             | 1000    |      | Sets the planner's estimate of the cost of starting up worker processes for parallel query.
 parallel_tuple_cost             | 0.1     |      | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend.
 ssl_dh_params_file              |         |      | Location of the SSL DH parameters file.
(8 rows)

Well not the last one of course.
lup
Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

lup


On Jul 7, 2019, at 6:02 PM, Rob Sargent <[hidden email]> wrote:



On Jul 7, 2019, at 6:01 PM, Rob Sargent <[hidden email]> wrote:



On Jul 7, 2019, at 5:49 PM, Tom Mercha <[hidden email]> wrote:

On 08/07/2019 01:46, Rob Sargent wrote:


On Jul 7, 2019, at 5:22 PM, Tom Mercha <[hidden email]> wrote:

Hi All

As we know, a query goes through number of stages before it is executed.
One of these stages is query optimization (QO).

There are various parameters to try and influence optimizer decisions
and costs. But I wanted to measure the effect of such a stage by turning
it off completely and I can't find such a parameter which explicitly
does that. Then I could execute a query to get the effect of "QO active
and "QO inactive" and compare.

Obviously, I know well what the results would generally look like but I
am just interested in measuring the differences for various types of
queries. I am also aware that this is a simple comparison - there are
https://gitlab.com/camplab/jpsgcs interesting comparisons to perform with QO tweaks, but right now I
am interested in something basic.

So how would one shut down QO? Or at least, obtaining the guarantee of
generating the worst plan possible, ideally without touching many
parameters?

Best,
Tom

Drop all indices?


Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of 
relational algebra operators, as well as their parallelisation. I am 
talking about turning off this query optimization, so I am already 
assuming that indexes aren't present.

Have you played with any of these settings?

postgres=# select version();
                                                 version                                                 
---------------------------------------------------------------------------------------------------------
 PostgreSQL 10.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
(1 row)

postgres=# select name, setting, unit,short_desc from pg_settings where name ~ 'para';
              name               | setting | unit |                                             short_desc                                             
---------------------------------+---------+------+----------------------------------------------------------------------------------------------------
 force_parallel_mode             | off     |      | Forces use of parallel query facilities.
 max_parallel_workers            | 16      |      | Sets the maximum number of parallel workers that can be active at one time.
 max_parallel_workers_per_gather | 8       |      | Sets the maximum number of parallel processes per executor node.
 min_parallel_index_scan_size    | 64      | 8kB  | Sets the minimum amount of index data for a parallel scan.
 min_parallel_table_scan_size    | 1024    | 8kB  | Sets the minimum amount of table data for a parallel scan.
 parallel_setup_cost             | 1000    |      | Sets the planner's estimate of the cost of starting up worker processes for parallel query.
 parallel_tuple_cost             | 0.1     |      | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend.
 ssl_dh_params_file              |         |      | Location of the SSL DH parameters file.
(8 rows)

Well not the last one of course.

Better yet, “where category ~* ‘planner’"
Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

Tom Mercha
On 08/07/2019 02:04, Rob Sargent wrote:

>
>
>> On Jul 7, 2019, at 6:02 PM, Rob Sargent <[hidden email]> wrote:
>>
>>
>>
>>> On Jul 7, 2019, at 6:01 PM, Rob Sargent <[hidden email] <mailto:[hidden email]>> wrote:
>>>
>>>
>>>
>>>> On Jul 7, 2019, at 5:49 PM, Tom Mercha <[hidden email] <mailto:[hidden email]>> wrote:
>>>>
>>>> On 08/07/2019 01:46, Rob Sargent wrote:
>>>>>
>>>>>
>>>>>> On Jul 7, 2019, at 5:22 PM, Tom Mercha <[hidden email] <mailto:[hidden email]>> wrote:
>>>>>>
>>>>>> Hi All
>>>>>>
>>>>>> As we know, a query goes through number of stages before it is executed.
>>>>>> One of these stages is query optimization (QO).
>>>>>>
>>>>>> There are various parameters to try and influence optimizer decisions
>>>>>> and costs. But I wanted to measure the effect of such a stage by turning
>>>>>> it off completely and I can't find such a parameter which explicitly
>>>>>> does that. Then I could execute a query to get the effect of "QO active
>>>>>> and "QO inactive" and compare.
>>>>>>
>>>>>> Obviously, I know well what the results would generally look like but I
>>>>>> am just interested in measuring the differences for various types of
>>>>>> queries. I am also aware that this is a simple comparison - there are
>>>>>> https://gitlab.com/camplab/jpsgcs <https://gitlab.com/camplab/jpsgcs> interesting comparisons to perform with QO tweaks, but right now I
>>>>>> am interested in something basic.
>>>>>>
>>>>>> So how would one shut down QO? Or at least, obtaining the guarantee of
>>>>>> generating the worst plan possible, ideally without touching many
>>>>>> parameters?
>>>>>>
>>>>>> Best,
>>>>>> Tom
>>>>>
>>>>> Drop all indices?
>>>>>
>>>>
>>>> Sorry, maybe my question wasn't clear enough.
>>>>
>>>> A query can be rewritten in various ways by applying rules and costs of
>>>> relational algebra operators, as well as their parallelisation. I am
>>>> talking about turning off this query optimization, so I am already
>>>> assuming that indexes aren't present.
>>>
>>> Have you played with any of these settings?
>>>
>>> postgres=# select version();
>>>                                                   version
>>> ---------------------------------------------------------------------------------------------------------
>>>   PostgreSQL 10.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
>>> (1 row)
>>>
>>> postgres=# select name, setting, unit,short_desc from pg_settings where name ~ 'para';
>>>                name               | setting | unit |                                             short_desc
>>> ---------------------------------+---------+------+----------------------------------------------------------------------------------------------------
>>>   force_parallel_mode             | off     |      | Forces use of parallel query facilities.
>>>   max_parallel_workers            | 16      |      | Sets the maximum number of parallel workers that can be active at one time.
>>>   max_parallel_workers_per_gather | 8       |      | Sets the maximum number of parallel processes per executor node.
>>>   min_parallel_index_scan_size    | 64      | 8kB  | Sets the minimum amount of index data for a parallel scan.
>>>   min_parallel_table_scan_size    | 1024    | 8kB  | Sets the minimum amount of table data for a parallel scan.
>>>   parallel_setup_cost             | 1000    |      | Sets the planner's estimate of the cost of starting up worker processes for parallel query.
>>>   parallel_tuple_cost             | 0.1     |      | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend.
>>>   ssl_dh_params_file              |         |      | Location of the SSL DH parameters file.
>>> (8 rows)
>>>
>> Well not the last one of course.
>
> Better yet, “where category ~* ‘planner’"
>

Hi

Yes, I've taken a look. I'm just interested in turning off the whole
module and there is no parameter to do just that - an on and off switch
- from what I can understand. What I can do instead is to go over the
entire list of parameters and configure them each to generate a bad
plan, but I'm not sure in general how to make a configuration which
gives me the worst plan possible under all situations.

I was hoping that someone has the on/off switch I'm looking for or a
parameter template which has been used for the same purpose or something
along these lines...

This document describes the module I'm talking about:
https://www.postgresql.org/docs/current/planner-optimizer.html

lup
Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

lup


O
Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of
relational algebra operators, as well as their parallelisation. I am
talking about turning off this query optimization, so I am already
assuming that indexes aren't present.

Have you played with any of these settings?

postgres=# select version();
                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 10.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
(1 row)

postgres=# select name, setting, unit,short_desc from pg_settings where name ~ 'para';
              name               | setting | unit |                                             short_desc
---------------------------------+---------+------+----------------------------------------------------------------------------------------------------
 force_parallel_mode             | off     |      | Forces use of parallel query facilities.
 max_parallel_workers            | 16      |      | Sets the maximum number of parallel workers that can be active at one time.
 max_parallel_workers_per_gather | 8       |      | Sets the maximum number of parallel processes per executor node.
 min_parallel_index_scan_size    | 64      | 8kB  | Sets the minimum amount of index data for a parallel scan.
 min_parallel_table_scan_size    | 1024    | 8kB  | Sets the minimum amount of table data for a parallel scan.
 parallel_setup_cost             | 1000    |      | Sets the planner's estimate of the cost of starting up worker processes for parallel query.
 parallel_tuple_cost             | 0.1     |      | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend.
 ssl_dh_params_file              |         |      | Location of the SSL DH parameters file.
(8 rows)

Well not the last one of course.

Better yet, “where category ~* ‘planner’"


Hi

Yes, I've taken a look. I'm just interested in turning off the whole 
module and there is no parameter to do just that - an on and off switch 
- from what I can understand. What I can do instead is to go over the 
entire list of parameters and configure them each to generate a bad 
plan, but I'm not sure in general how to make a configuration which 
gives me the worst plan possible under all situations.

I was hoping that someone has the on/off switch I'm looking for or a 
parameter template which has been used for the same purpose or something 
along these lines...

This document describes the module I'm talking about: 
https://www.postgresql.org/docs/current/planner-optimizer.html

Yeah, I guess I read that to say postgres will optimize generically (after geqo_threshold). Now maybe that’s the worst plan? Per force a lot of seq-scans in the absence of any indices.  And if it isn’t the worst possible plan, so what, you’re using postgres, you won’t get a worse plan without lying to the optimizer.
But I’m clearly in over my head.  The pros will be along shortly, I’m sure.


lup
Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

lup


On Jul 7, 2019, at 6:29 PM, Rob Sargent <[hidden email]> wrote:



O
Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of
relational algebra operators, as well as their parallelisation. I am
talking about turning off this query optimization, so I am already
assuming that indexes aren't present.

Have you played with any of these settings?

postgres=# select version();
                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 10.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
(1 row)

postgres=# select name, setting, unit,short_desc from pg_settings where name ~ 'para';
              name               | setting | unit |                                             short_desc
---------------------------------+---------+------+----------------------------------------------------------------------------------------------------
 force_parallel_mode             | off     |      | Forces use of parallel query facilities.
 max_parallel_workers            | 16      |      | Sets the maximum number of parallel workers that can be active at one time.
 max_parallel_workers_per_gather | 8       |      | Sets the maximum number of parallel processes per executor node.
 min_parallel_index_scan_size    | 64      | 8kB  | Sets the minimum amount of index data for a parallel scan.
 min_parallel_table_scan_size    | 1024    | 8kB  | Sets the minimum amount of table data for a parallel scan.
 parallel_setup_cost             | 1000    |      | Sets the planner's estimate of the cost of starting up worker processes for parallel query.
 parallel_tuple_cost             | 0.1     |      | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend.
 ssl_dh_params_file              |         |      | Location of the SSL DH parameters file.
(8 rows)

Well not the last one of course.

Better yet, “where category ~* ‘planner’"


Hi

Yes, I've taken a look. I'm just interested in turning off the whole 
module and there is no parameter to do just that - an on and off switch 
- from what I can understand. What I can do instead is to go over the 
entire list of parameters and configure them each to generate a bad 
plan, but I'm not sure in general how to make a configuration which 
gives me the worst plan possible under all situations.

I was hoping that someone has the on/off switch I'm looking for or a 
parameter template which has been used for the same purpose or something 
along these lines...

This document describes the module I'm talking about: 
https://www.postgresql.org/docs/current/planner-optimizer.html

Yeah, I guess I read that to say postgres will optimize generically (after geqo_threshold). Now maybe that’s the worst plan? Per force a lot of seq-scans in the absence of any indices.  And if it isn’t the worst possible plan, so what, you’re using postgres, you won’t get a worse plan without lying to the optimizer.
But I’m clearly in over my head.  The pros will be along shortly, I’m sure.

And I didn’t notice I had pasted a url to a git project. Totally irrelevant to the issue at hand.  My apologies.


Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

Ron-2
In reply to this post by Tom Mercha
On 7/7/19 6:49 PM, Tom Mercha wrote:
On 08/07/2019 01:46, Rob Sargent wrote:
On Jul 7, 2019, at 5:22 PM, Tom Mercha [hidden email] wrote:

Hi All

As we know, a query goes through number of stages before it is executed.
One of these stages is query optimization (QO).

There are various parameters to try and influence optimizer decisions
and costs. But I wanted to measure the effect of such a stage by turning
it off completely and I can't find such a parameter which explicitly
does that. Then I could execute a query to get the effect of "QO active
and "QO inactive" and compare.

Obviously, I know well what the results would generally look like but I
am just interested in measuring the differences for various types of
queries. I am also aware that this is a simple comparison - there are
more interesting comparisons to perform with QO tweaks, but right now I
am interested in something basic.

So how would one shut down QO? Or at least, obtaining the guarantee of
generating the worst plan possible, ideally without touching many
parameters?

Best,
Tom
Drop all indices?

Sorry, maybe my question wasn't clear enough.

A query can be rewritten in various ways by applying rules and costs of 
relational algebra operators, as well as their parallelisation. I am 
talking about turning off this query optimization, so I am already 
assuming that indexes aren't present.

It seems as though you're asking what the "First Approximation" plan is, before it tries to get Too Clever.  However, I don't think there's a separation between Query Planner and Query Optimizerhttps://www.postgresql.org/docs/9.6/planner-optimizer.html

--
Angular momentum makes the world go 'round.
Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

Tom Lane-2
In reply to this post by Tom Mercha
Tom Mercha <[hidden email]> writes:
> As we know, a query goes through number of stages before it is executed.
> One of these stages is query optimization (QO).
> There are various parameters to try and influence optimizer decisions
> and costs. But I wanted to measure the effect of such a stage by turning
> it off completely and I can't find such a parameter which explicitly
> does that. Then I could execute a query to get the effect of "QO active
> and "QO inactive" and compare.

There is no such parameter because the code can't support that.  For
efficiency reasons, query optimization is bound up pretty tightly with
essential plan-preparation activities.  As an example, you can't turn
off constant-folding because eval_const_expressions also takes care of
some non-optional activities like filling in default parameter values
in function calls.

However, there are some knobs you can twiddle, as others have already
pointed out.  Two I'd particularly draw your attention to are
join_collapse_limit and from_collapse_limit --- if you set both to 1,
that'll effectively disable searching for a good join order, causing
the join order to match the syntactic structure of the FROM clause.
For instance "FROM a,b,c" will always be done by joining a to b first
then joining to c.  The code will still consider all possible ways
to do each of those joins, though you can shut off consideration of
some possibilities with parameters like enable_hashjoin.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

Andrew Gierth
In reply to this post by Tom Mercha
>>>>> "Tom" == Tom Mercha <[hidden email]> writes:

 Tom> Hi All

 Tom> As we know, a query goes through number of stages before it is
 Tom> executed. One of these stages is query optimization (QO).

That's not really true at all. One of the stages is query _planning_,
which takes the (rewritten) query as input and converts it to something
that the executor can take action on. There isn't actually any separate
"optimization" phase.

--
Andrew (irc:RhodiumToad)


Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

Andrew Gierth
In reply to this post by Tom Lane-2
>>>>> "Tom" == Tom Lane <[hidden email]> writes:

 Tom> Two I'd particularly draw your attention to are
 Tom> join_collapse_limit and from_collapse_limit --- if you set both to
 Tom> 1, that'll effectively disable searching for a good join order,
 Tom> causing the join order to match the syntactic structure of the
 Tom> FROM clause. For instance "FROM a,b,c" will always be done by
 Tom> joining a to b first

FROM a,b,c can always be planned in any join order. If you want to force
the join order you have to set join_collapse_limit=1 AND write it as
FROM a JOIN b ON ... JOIN c ON ...

For an example, try:

explain select * from onek o1, tenk1 t, onek o2
         where o1.unique1=t.unique1 and t.unique1=o2.unique1
           and o1.unique2<10 and o2.unique2<10;

which (at least for me) joins o1 and o2 together first even with the
collapse limits set to 1.

--
Andrew (irc:RhodiumToad)


Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

Tom Lane-2
Andrew Gierth <[hidden email]> writes:
> "Tom" == Tom Lane <[hidden email]> writes:
>  Tom> Two I'd particularly draw your attention to are
>  Tom> join_collapse_limit and from_collapse_limit --- if you set both to
>  Tom> 1, that'll effectively disable searching for a good join order,
>  Tom> causing the join order to match the syntactic structure of the
>  Tom> FROM clause. For instance "FROM a,b,c" will always be done by
>  Tom> joining a to b first

> FROM a,b,c can always be planned in any join order.

Ah, right, -ENOCAFFEINE.  What from_collapse_limit really does is
prevent flattening sub-SELECTs when doing so would add more join-order
freedom in the parent query.  But ignoring sub-SELECTs, using explicit
JOIN syntax with join_collapse_limit=1 will let you control the
join order.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

Tom Mercha
On 08/07/2019 18:29, Tom Lane wrote:

> Andrew Gierth <[hidden email]> writes:
>> "Tom" == Tom Lane <[hidden email]> writes:
>>   Tom> Two I'd particularly draw your attention to are
>>   Tom> join_collapse_limit and from_collapse_limit --- if you set both to
>>   Tom> 1, that'll effectively disable searching for a good join order,
>>   Tom> causing the join order to match the syntactic structure of the
>>   Tom> FROM clause. For instance "FROM a,b,c" will always be done by
>>   Tom> joining a to b first
>
>> FROM a,b,c can always be planned in any join order.
>
> Ah, right, -ENOCAFFEINE.  What from_collapse_limit really does is
> prevent flattening sub-SELECTs when doing so would add more join-order
> freedom in the parent query.  But ignoring sub-SELECTs, using explicit
> JOIN syntax with join_collapse_limit=1 will let you control the
> join order.
>
> regards, tom lane
>

Thanks for the clarification. I really appreciate taking the time to
provide such in depth info about my query!
Reply | Threaded
Open this post in threaded view
|

Re: Measuring the Query Optimizer Effect: Turning off the QO?

Tom Mercha
In reply to this post by Andrew Gierth
On 08/07/2019 16:23, Andrew Gierth wrote:

>>>>>> "Tom" == Tom Mercha <[hidden email]> writes:
>
>   Tom> Hi All
>
>   Tom> As we know, a query goes through number of stages before it is
>   Tom> executed. One of these stages is query optimization (QO).
>
> That's not really true at all. One of the stages is query _planning_,
> which takes the (rewritten) query as input and converts it to something
> that the executor can take action on. There isn't actually any separate
> "optimization" phase.
>

Hi

I was just loosely speaking - I am merely referring to the concept that
a query can be optimized vs unoptimized. Of course it follows that we
have a phase or a subset of stages which are responsible for this purpose.