es_query_dsa is broken

es_query_dsa is broken

Thomas Munro
Hi hackers,

While reviewing commit c6755e23 I realised that es_query_dsa is
broken.  It might have made some kind of sense as a name and a concept
in an earlier version of the proposal to add a DSA area for parallel
query's use, when the DSA area was going to be independent of parallel
query DSM segments and could have been used for the whole query.  But
as it stands, each Gather [Merge] node has its own DSA area in its own
DSM segment, and that needs to be the one available to the nodes of
the child plan when executed in the leader process and it needs to be
not available to any other nodes in the tree.  It's OK for the workers
to have just one es_query_dsa set up at the beginning and used for the
whole lifetime of the executor, but it's not OK for the leader to
install it and forget about it as it does now.

The attached draft patch (for discussion only) shows one way to fix
that: only install it for the duration of Gather[Merge]'s
ExecProcNode(child), instead of doing it in ExecInitParallelPlan().
The same applies to the [Re]InitializeDSM invocations.

This type of install-call-clear coding isn't ideal, but unfortunately
the area doesn't exist until ExecGather runs for the first time, and
we don't seem to have another suitable scope (i.e. some object easily
accessible to all children of a given Gather) to put it in right now.

Better ideas?

--
Thomas Munro
http://www.enterprisedb.com

Attachment: fix-es_query_dsa-v1.patch (5K)

Re: es_query_dsa is broken

Robert Haas
On Wed, Nov 29, 2017 at 7:30 AM, Thomas Munro
<[hidden email]> wrote:

> While reviewing commit c6755e23 I realised that es_query_dsa is
> broken.  It might have made some kind of sense as a name and a concept
> in an earlier version of the proposal to add a DSA area for parallel
> query's use, when the DSA area was going to be independent of parallel
> query DSM segments and could have been used for the whole query.  But
> as it stands, each Gather [Merge] node has its own DSA area in its own
> DSM segment, and that needs to be the one available to the nodes of
> the child plan when executed in the leader process and it needs to be
> not available to any other nodes in the tree.  It's OK for the workers
> to have just one es_query_dsa set up at the beginning and used for the
> whole lifetime of the executor, but it's not OK for the leader to
> install it and forget about it as it does now.

Ugh.

> Better ideas?

How about this:

1. Remove es_query_dsa altogether.
2. Add a dsa_area * to ExecParallelInitializeDSMContext.
3. In ExecParallelInitializeDSM, pass the dsa_area * as a separate
argument to the per-node-type function.
4. If the per-node-type function cares about the dsa_area *, it is
responsible for saving a pointer to it in the PlanState node.
5. Also pass it down via the ParallelWorkerContext.

In v10 we might need to go with a solution like what you've sketched
here, because Tom will complain about breaking binary compatibility
with EState (and maybe other PlanState nodes) in a released branch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: es_query_dsa is broken

Thomas Munro
On Thu, Nov 30, 2017 at 4:01 AM, Robert Haas <[hidden email]> wrote:

>> Better ideas?
>
> How about this:
>
> 1. Remove es_query_dsa altogether.
> 2. Add a dsa_area * to ExecParallelInitializeDSMContext.
> 3. In ExecParallelInitializeDSM, pass the dsa_area * as a separate
> argument to the per-node-type function.
> 4. If the per-node-type function cares about the dsa_area *, it is
> responsible for saving a pointer to it in the PlanState node.
> 5. Also pass it down via the ParallelWorkerContext.
>
> In v10 we might need to go with a solution like what you've sketched
> here, because Tom will complain about breaking binary compatibility
> with EState (and maybe other PlanState nodes) in a released branch.

I will post both versions.  I've been stuck for a while now trying to
come up with a query that actually breaks, so I can show that it's
fixed...

--
Thomas Munro
http://www.enterprisedb.com


Re: es_query_dsa is broken

Thomas Munro
On Thu, Nov 30, 2017 at 11:19 PM, Thomas Munro
<[hidden email]> wrote:

> On Thu, Nov 30, 2017 at 4:01 AM, Robert Haas <[hidden email]> wrote:
>>> Better ideas?
>>
>> How about this:
>>
>> 1. Remove es_query_dsa altogether.
>> 2. Add a dsa_area * to ExecParallelInitializeDSMContext.
>> 3. In ExecParallelInitializeDSM, pass the dsa_area * as a separate
>> argument to the per-node-type function.
>> 4. If the per-node-type function cares about the dsa_area *, it is
>> responsible for saving a pointer to it in the PlanState node.
>> 5. Also pass it down via the ParallelWorkerContext.
>>
>> In v10 we might need to go with a solution like what you've sketched
>> here, because Tom will complain about breaking binary compatibility
>> with EState (and maybe other PlanState nodes) in a released branch.

Here's a pair of versions of that patch tidied up a bit, for
REL_10_STABLE and master.  Unfortunately I've been unable to come up
with a reproducer that shows misbehaviour (something like what was
reported on -bugs[1] which appears to be a case of calling
dsa_get_address(wrong_area, some_dsa_pointer)).

I don't yet have a patch to remove es_query_dsa.  Instead of adding a
dsa_area * parameter to a ton of per-node functions as you suggested,
I'm wondering if we shouldn't just pass the whole ParallelExecutorInfo
object into all ExecXXXInitializeDSM() and ExecXXXInitializeWorker()
functions so they can hold onto it if they want to.  In the worker
case it'd be only partially filled out: just area and pcxt, and pcxt
would also be only partially filled out.  Alternatively I could create
a new struct ParallelExecutorWorkerInfo which has just the bit needed
for workers (that'd replace ParallelWorkerContext, which I now realise
was put in the wrong place -- it doesn't belong in access/parallel.h,
it belongs in an executor header with an executor-sounding name).  I'd
like to get a couple of other things done before I come back to this.

[1] https://www.postgresql.org/message-id/CAEepm=1V-ZjAAyG-xj6s7ESXFhzLsRBpyX=BLz-w2DS=ObNu4A@...

--
Thomas Munro
http://www.enterprisedb.com

Attachment: fix-es_query_dsa-pg10-v2.patch (5K)
Attachment: fix-es_query_dsa-pg11-v2.patch (7K)

Re: es_query_dsa is broken

Amit Kapila
On Tue, Dec 5, 2017 at 6:45 AM, Thomas Munro
<[hidden email]> wrote:

> On Thu, Nov 30, 2017 at 11:19 PM, Thomas Munro
> <[hidden email]> wrote:
>> On Thu, Nov 30, 2017 at 4:01 AM, Robert Haas <[hidden email]> wrote:
>>>
>>> In v10 we might need to go with a solution like what you've sketched
>>> here, because Tom will complain about breaking binary compatibility
>>> with EState (and maybe other PlanState nodes) in a released branch.
>
> Here's a pair of versions of that patch tidied up a bit, for
> REL_10_STABLE and master.  Unfortunately I've been unable to come up
> with a reproducer that shows misbehaviour (something like what was
> reported on -bugs[1] which appears to be a case of calling
> dsa_get_address(wrong_area, some_dsa_pointer)).
>

+ EState *estate = gatherstate->ps.state;
+
+ /* Install our DSA area while executing the plan. */
+ estate->es_query_dsa = gatherstate->pei->area;
  outerTupleSlot = ExecProcNode(outerPlan);
+ estate->es_query_dsa = NULL;

Won't the above coding pattern create a problem, if ExecProcNode
throws an error and outer block catches it and continues execution
(consider the case of execution inside PL blocks)?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: es_query_dsa is broken

Robert Haas
On Tue, Dec 5, 2017 at 7:24 AM, Amit Kapila <[hidden email]> wrote:

> + EState *estate = gatherstate->ps.state;
> +
> + /* Install our DSA area while executing the plan. */
> + estate->es_query_dsa = gatherstate->pei->area;
>   outerTupleSlot = ExecProcNode(outerPlan);
> + estate->es_query_dsa = NULL;
>
> Won't the above coding pattern create a problem, if ExecProcNode
> throws an error and outer block catches it and continues execution
> (consider the case of execution inside PL blocks)?

I don't see what the problem is.  The query that got aborted by the
error wouldn't be sharing an EState with one that didn't.  Control
would have to return to someplace inside the same ExecProcNode and it
would have to return normally.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: es_query_dsa is broken

Amit Kapila
On Wed, Dec 6, 2017 at 1:14 AM, Robert Haas <[hidden email]> wrote:

> On Tue, Dec 5, 2017 at 7:24 AM, Amit Kapila <[hidden email]> wrote:
>> + EState *estate = gatherstate->ps.state;
>> +
>> + /* Install our DSA area while executing the plan. */
>> + estate->es_query_dsa = gatherstate->pei->area;
>>   outerTupleSlot = ExecProcNode(outerPlan);
>> + estate->es_query_dsa = NULL;
>>
>> Won't the above coding pattern create a problem, if ExecProcNode
>> throws an error and outer block catches it and continues execution
>> (consider the case of execution inside PL blocks)?
>
> I don't see what the problem is.  The query that got aborted by the
> error wouldn't be sharing an EState with one that didn't.
>

That's right.  Ignore my comment, I got confused.  Other than that, I
don't see any problem with the code as such, apart from the fact that
it looks slightly hacky.  I think Thomas or someone needs to develop a
patch along the lines you have mentioned, or what Thomas was trying to
describe in his email, and see how it comes out.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: es_query_dsa is broken

Thomas Munro
On Wed, Dec 6, 2017 at 10:16 PM, Amit Kapila <[hidden email]> wrote:

> On Wed, Dec 6, 2017 at 1:14 AM, Robert Haas <[hidden email]> wrote:
>> On Tue, Dec 5, 2017 at 7:24 AM, Amit Kapila <[hidden email]> wrote:
>>> + EState *estate = gatherstate->ps.state;
>>> +
>>> + /* Install our DSA area while executing the plan. */
>>> + estate->es_query_dsa = gatherstate->pei->area;
>>>   outerTupleSlot = ExecProcNode(outerPlan);
>>> + estate->es_query_dsa = NULL;
>>>
>>> Won't the above coding pattern create a problem, if ExecProcNode
>>> throws an error and outer block catches it and continues execution
>>> (consider the case of execution inside PL blocks)?
>>
>> I don't see what the problem is.  The query that got aborted by the
>> error wouldn't be sharing an EState with one that didn't.
>
> That's right.  Ignore my comment, I got confused.  Other than that, I
> don't see any problem with the code as such, apart from the fact that
> it looks slightly hacky.  I think Thomas or someone needs to develop a
> patch along the lines you have mentioned, or what Thomas was trying to
> describe in his email, and see how it comes out.

Yeah, it is a bit hacky, but I can't see another way to fix it without
changing released APIs; it's only for one release and will certainly
be unhackified in v11.  For v11 I think we need to decide between:

1.  Removing es_query_dsa and injecting the right context into the
executor tree as discussed.

2.  Another idea mentioned by Robert in an off-list chat: we could
consolidate all DSM segments in a multi-gather plan into one.  See the
nearby thread where someone had over a hundred Gather nodes and had to
crank up max_connections to reserve enough DSM slots.  Of course,
optimising for that case doesn't make too much sense (I suspect
multi-gather plans will become less likely with Parallel Append and
Parallel Hash in the picture anyway), but it would reduce a bunch of
duplicated work we do when it happens, as well as scarce DSM slot
consumption.  If we did that, then all of a sudden es_query_dsa would
make sense again (i.e. it'd be whole-query scoped), and we could
revert that hacky change.

Or we could do both things anyway.

--
Thomas Munro
http://www.enterprisedb.com
