pgsql: Avoid sharing PARAM_EXEC slots between different levels of NestLoop


Tom Lane
Avoid sharing PARAM_EXEC slots between different levels of NestLoop.

Up to now, createplan.c attempted to share PARAM_EXEC slots for
NestLoopParams across different plan levels, if the same underlying Var
was being fed down to different righthand-side subplan trees by different
NestLoops.  This was, I think, more of an artifact of using subselect.c's
PlannerParamItem infrastructure than an explicit design goal, but anyway
that was the end result.

This works well enough as long as the plan tree is executing synchronously,
but the feature whereby Gather can execute the parallelized subplan locally
breaks it.  An upper NestLoop node might execute for a row retrieved from
a parallel worker, and assign a value for a PARAM_EXEC slot from that row,
while the leader's copy of the parallelized subplan is suspended with a
different active value of the row the Var comes from.  When control
eventually returns to the leader's subplan, it gets the wrong answers if
the same PARAM_EXEC slot is being used within the subplan, as reported
in bug #15577 from Bartosz Polnik.
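A hypothetical plan shape for this failure (a sketch of the scenario described above, not taken from the bug report) might look like:

```
Nested Loop                         -- assigns $0 from the row Gather returns
  ->  Gather
        ->  Nested Loop             -- leader's copy may be suspended here;
              ->  Parallel Seq Scan --   pre-fix it used the same slot $0
              ->  Index Scan        --   index cond referencing $0
  ->  Index Scan                    -- uses $0 set by the upper NestLoop
```

When the leader resumes its copy of the parallelized subplan, the inner Index Scan sees whatever value the upper NestLoop last stored in $0, not the value current when the subplan was suspended.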

This is pretty reminiscent of the problem fixed in commit 46c508fbc, and
the proper fix seems to be the same: don't try to share PARAM_EXEC slots
across different levels of controlling NestLoop nodes.

This requires decoupling NestLoopParam handling from PlannerParamItem
handling, although the logic remains somewhat similar.  To avoid bizarre
division of labor between subselect.c and createplan.c, I decided to move
all the param-slot-assignment logic for both cases out of those files
and put it into a new file paramassign.c.  Hopefully it's a bit better
documented now, too.

A regression test case for this might be nice, but we don't know of a
test case that triggers the problem with a suitably small amount
of data.

Back-patch to 9.6, where we added Gather nodes.  It's conceivable that
related problems exist in older branches; but without some evidence
for that, I'll leave the older branches alone.

Discussion: https://postgr.es/m/15577-ca61ab18904af852@...

Branch
------
REL_11_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/05eb923eae46c1698088d555ae590a73d4fc7070

Modified Files
--------------
src/backend/optimizer/plan/createplan.c  | 182 +---------
src/backend/optimizer/plan/planner.c     |   9 +-
src/backend/optimizer/plan/subselect.c   | 387 ++------------------
src/backend/optimizer/util/Makefile      |   3 +-
src/backend/optimizer/util/paramassign.c | 599 +++++++++++++++++++++++++++++++
src/include/optimizer/paramassign.h      |  34 ++
src/include/optimizer/subselect.h        |   6 +-
7 files changed, 683 insertions(+), 537 deletions(-)