Explain says 8 workers planned, only 1 executed

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Explain says 8 workers planned, only 1 executed

Alastair McKinley
Hi all,

I have a long running query that I have tweaked along with config (e.g. min_parallel_table_scan_size) to execute nicely and very fast in parallel which works as expected executed directly from psql client.  The query is then embedded in a psql function like "return query select * from function_that_executes_the_query()".

I am checking the explain output (using execute explain $query) just before executing inside my function and it the plan is identical to what I would expect, planning 8 workers.  However, this query actually only uses 1 worker and takes many times longer than when ran directly on the psql command line with the same server configuration parameters.

Why would the explain output be different from the executed plan? Is this a limitation of plpgsql functions? Is there any way to debug this further?

If it is meaningful during parallel execution I notice lots of "postgres: parallel worker" proceses in top and when executing from my function just a single "postgres: $user $db $host($pid) SELECT" processes.

Best regards,

Alastair


Reply | Threaded
Open this post in threaded view
|

Re: Explain says 8 workers planned, only 1 executed

Adrian Klaver-4
On 3/21/20 10:25 AM, Alastair McKinley wrote:
> Hi all,
>
> I have a long running query that I have tweaked along with config (e.g.
> min_parallel_table_scan_size) to execute nicely and very fast in
> parallel which works as expected executed directly from psql client.  
> The query is then embedded in a psql function like "return query select
> * from function_that_executes_the_query()".

Postgres version?

What is happening in function_that_executes_the_query()?

You might want to take a look at below to see any of the conditions apply:

https://www.postgresql.org/docs/12/when-can-parallel-query-be-used.html

>
> I am checking the explain output (using execute explain $query) just
> before executing inside my function and it the plan is identical to what
> I would expect, planning 8 workers.  However, this query actually only
> uses 1 worker and takes many times longer than when ran directly on the
> psql command line with the same server configuration parameters.
>
> Why would the explain output be different from the executed plan? Is
> this a limitation of plpgsql functions? Is there any way to debug this
> further?
>
> If it is meaningful during parallel execution I notice lots of
> "postgres: parallel worker" proceses in top and when executing from my
> function just a single "postgres: $user $db $host($pid) SELECT" processes.
>
> Best regards,
>
> Alastair
>
>


--
Adrian Klaver
[hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: Explain says 8 workers planned, only 1 executed

Alastair McKinley
Hi Adrian,

Thanks for getting back to me.

Postgres version is:

PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit

I simplified it to the following structure:

create function f() returns setof my_type as
$$
declare
q text;
output text;
begin
    select generate_query1() into q; -- the query q executes in parallel with 8 workers if executed standalone
    for output in execute('explain ' || q) loop
        raise notice '%',output;  -- this plan says 8 workers will be launched
    end loop;
    return query execute q; -- this launches one worker

   select generate_query2() into q;
    for output in execute('explain ' || q) loop
        raise notice '%',output;  -- this plan says 8 workers will be launched
    end loop;
    return query execute q; -- this also launches one worker
end;
language plpgsql;

Should this work in principle or am I missing something subtle about parallel dynamic queries in plpgsql functions?  Does the outer function need to be parallel safe?
Might a stored proc work better?

Best regards,

Alastair



From: Adrian Klaver <[hidden email]>
Sent: 21 March 2020 17:38
To: Alastair McKinley <[hidden email]>; [hidden email] <[hidden email]>
Subject: Re: Explain says 8 workers planned, only 1 executed
 
On 3/21/20 10:25 AM, Alastair McKinley wrote:
> Hi all,
>
> I have a long running query that I have tweaked along with config (e.g.
> min_parallel_table_scan_size) to execute nicely and very fast in
> parallel which works as expected executed directly from psql client. 
> The query is then embedded in a psql function like "return query select
> * from function_that_executes_the_query()".

Postgres version?

What is happening in function_that_executes_the_query()?

You might want to take a look at below to see any of the conditions apply:

https://www.postgresql.org/docs/12/when-can-parallel-query-be-used.html

>
> I am checking the explain output (using execute explain $query) just
> before executing inside my function and it the plan is identical to what
> I would expect, planning 8 workers.  However, this query actually only
> uses 1 worker and takes many times longer than when ran directly on the
> psql command line with the same server configuration parameters.
>
> Why would the explain output be different from the executed plan? Is
> this a limitation of plpgsql functions? Is there any way to debug this
> further?
>
> If it is meaningful during parallel execution I notice lots of
> "postgres: parallel worker" proceses in top and when executing from my
> function just a single "postgres: $user $db $host($pid) SELECT" processes.
>
> Best regards,
>
> Alastair
>
>


--
Adrian Klaver
[hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: Explain says 8 workers planned, only 1 executed

Jeremy Smith
Unfortunately, return query will never use parallel workers.  See: https://stackoverflow.com/q/58079898/895640 and https://www.postgresql.org/message-id/16040-eaacad11fecfb198@...

On Sat, Mar 21, 2020 at 1:59 PM Alastair McKinley <[hidden email]> wrote:
Hi Adrian,

Thanks for getting back to me.

Postgres version is:

PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit

I simplified it to the following structure:

create function f() returns setof my_type as
$$
declare
q text;
output text;
begin
    select generate_query1() into q; -- the query q executes in parallel with 8 workers if executed standalone
    for output in execute('explain ' || q) loop
        raise notice '%',output;  -- this plan says 8 workers will be launched
    end loop;
    return query execute q; -- this launches one worker

   select generate_query2() into q;
    for output in execute('explain ' || q) loop
        raise notice '%',output;  -- this plan says 8 workers will be launched
    end loop;
    return query execute q; -- this also launches one worker
end;
language plpgsql;

Should this work in principle or am I missing something subtle about parallel dynamic queries in plpgsql functions?  Does the outer function need to be parallel safe?
Might a stored proc work better?

Best regards,

Alastair



From: Adrian Klaver <[hidden email]>
Sent: 21 March 2020 17:38
To: Alastair McKinley <[hidden email]>; [hidden email] <[hidden email]>
Subject: Re: Explain says 8 workers planned, only 1 executed
 
On 3/21/20 10:25 AM, Alastair McKinley wrote:
> Hi all,
>
> I have a long running query that I have tweaked along with config (e.g.
> min_parallel_table_scan_size) to execute nicely and very fast in
> parallel which works as expected executed directly from psql client. 
> The query is then embedded in a psql function like "return query select
> * from function_that_executes_the_query()".

Postgres version?

What is happening in function_that_executes_the_query()?

You might want to take a look at below to see any of the conditions apply:

https://www.postgresql.org/docs/12/when-can-parallel-query-be-used.html

>
> I am checking the explain output (using execute explain $query) just
> before executing inside my function and it the plan is identical to what
> I would expect, planning 8 workers.  However, this query actually only
> uses 1 worker and takes many times longer than when ran directly on the
> psql command line with the same server configuration parameters.
>
> Why would the explain output be different from the executed plan? Is
> this a limitation of plpgsql functions? Is there any way to debug this
> further?
>
> If it is meaningful during parallel execution I notice lots of
> "postgres: parallel worker" proceses in top and when executing from my
> function just a single "postgres: $user $db $host($pid) SELECT" processes.
>
> Best regards,
>
> Alastair
>
>


--
Adrian Klaver
[hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: Explain says 8 workers planned, only 1 executed

Alastair McKinley
Hi Jeremy,

Thanks for solving the mystery.  I think this might be a missing point in section 15.2 in the docs.

I wonder will this ever be improved or should I just write to temporary tables instead of return query?

Best regards,

Alastair

From: Jeremy Smith <[hidden email]>
Sent: 21 March 2020 20:50
To: Alastair McKinley <[hidden email]>
Cc: Adrian Klaver <[hidden email]>; [hidden email] <[hidden email]>
Subject: Re: Explain says 8 workers planned, only 1 executed
 
Unfortunately, return query will never use parallel workers.  See: https://stackoverflow.com/q/58079898/895640 and https://www.postgresql.org/message-id/16040-eaacad11fecfb198@...

On Sat, Mar 21, 2020 at 1:59 PM Alastair McKinley <[hidden email]> wrote:
Hi Adrian,

Thanks for getting back to me.

Postgres version is:

PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit

I simplified it to the following structure:

create function f() returns setof my_type as
$$
declare
q text;
output text;
begin
    select generate_query1() into q; -- the query q executes in parallel with 8 workers if executed standalone
    for output in execute('explain ' || q) loop
        raise notice '%',output;  -- this plan says 8 workers will be launched
    end loop;
    return query execute q; -- this launches one worker

   select generate_query2() into q;
    for output in execute('explain ' || q) loop
        raise notice '%',output;  -- this plan says 8 workers will be launched
    end loop;
    return query execute q; -- this also launches one worker
end;
language plpgsql;

Should this work in principle or am I missing something subtle about parallel dynamic queries in plpgsql functions?  Does the outer function need to be parallel safe?
Might a stored proc work better?

Best regards,

Alastair



From: Adrian Klaver <[hidden email]>
Sent: 21 March 2020 17:38
To: Alastair McKinley <[hidden email]>; [hidden email] <[hidden email]>
Subject: Re: Explain says 8 workers planned, only 1 executed
 
On 3/21/20 10:25 AM, Alastair McKinley wrote:
> Hi all,
>
> I have a long running query that I have tweaked along with config (e.g.
> min_parallel_table_scan_size) to execute nicely and very fast in
> parallel which works as expected executed directly from psql client. 
> The query is then embedded in a psql function like "return query select
> * from function_that_executes_the_query()".

Postgres version?

What is happening in function_that_executes_the_query()?

You might want to take a look at below to see any of the conditions apply:

https://www.postgresql.org/docs/12/when-can-parallel-query-be-used.html

>
> I am checking the explain output (using execute explain $query) just
> before executing inside my function and it the plan is identical to what
> I would expect, planning 8 workers.  However, this query actually only
> uses 1 worker and takes many times longer than when ran directly on the
> psql command line with the same server configuration parameters.
>
> Why would the explain output be different from the executed plan? Is
> this a limitation of plpgsql functions? Is there any way to debug this
> further?
>
> If it is meaningful during parallel execution I notice lots of
> "postgres: parallel worker" proceses in top and when executing from my
> function just a single "postgres: $user $db $host($pid) SELECT" processes.
>
> Best regards,
>
> Alastair
>
>


--
Adrian Klaver
[hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: Explain says 8 workers planned, only 1 executed

Tom Lane-2
Alastair McKinley <[hidden email]> writes:
> Thanks for solving the mystery.  I think this might be a missing point in section 15.2 in the docs.
> I wonder will this ever be improved or should I just write to temporary tables instead of return query?

I just posted a patch to improve that [1], but it's not something we'd be
likely to back-patch into existing releases.

                        regards, tom lane

[1] https://www.postgresql.org/message-id/1741.1584847383%40sss.pgh.pa.us