BUG #16329: Valgrind detects an invalid read when building a gist index with buffering

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

BUG #16329: Valgrind detects an invalid read when building a gist index with buffering

PG Doc comments form
The following bug has been logged on the website:

Bug reference:      16329
Logged by:          Alexander Lakhin
Email address:      [hidden email]
PostgreSQL version: 12.2
Operating system:   Ubuntu 18.04
Description:        

Running the following query under valgrind:
create table gist_point_tbl(id int4, p point);
insert into gist_point_tbl (id, p)
select g, point(g*10+1, g*10+1) from generate_series(1, 30000) g;
create index gist_pointidx2 on gist_point_tbl using gist(p) with (buffering
= on, fillfactor=50);

leads to a memory access error:
==00:00:00:08.755 18585== Invalid read of size 2
==00:00:00:08.756 18585==    at 0x1F94CA: gistBuildCallback
(gistbuild.c:495)
==00:00:00:08.756 18585==    by 0x21F556: heapam_index_build_range_scan
(heapam_handler.c:1664)
==00:00:00:08.756 18585==    by 0x1F972F: table_index_build_scan
(tableam.h:1522)
==00:00:00:08.756 18585==    by 0x1F972F: gistbuild (gistbuild.c:196)
==00:00:00:08.756 18585==    by 0x29D1FE: index_build (index.c:2909)
==00:00:00:08.756 18585==    by 0x29EAC8: index_create (index.c:1221)
==00:00:00:08.756 18585==    by 0x348C60: DefineIndex (indexcmds.c:1005)
==00:00:00:08.756 18585==    by 0x52B518: ProcessUtilitySlow
(utility.c:1372)
==00:00:00:08.756 18585==    by 0x52AAC5: standard_ProcessUtility
(utility.c:927)
==00:00:00:08.756 18585==    by 0x52AB73: ProcessUtility (utility.c:360)
==00:00:00:08.756 18585==    by 0x527025: PortalRunUtility (pquery.c:1175)
==00:00:00:08.756 18585==    by 0x527C6F: PortalRunMulti (pquery.c:1321)
==00:00:00:08.756 18585==    by 0x5289DE: PortalRun (pquery.c:796)
==00:00:00:08.756 18585==  Address 0x8a4ca1e is 398 bytes inside a block of
size 8,192 alloc'd
==00:00:00:08.756 18585==    at 0x4C2FBFF: malloc (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==00:00:00:08.756 18585==    by 0x6781E6: AllocSetContextCreateInternal
(aset.c:477)
==00:00:00:08.756 18585==    by 0x3BF065: ExecInitFunctionScan
(nodeFunctionscan.c:508)
==00:00:00:08.756 18585==    by 0x3ACE6F: ExecInitNode
(execProcnode.c:247)
==00:00:00:08.756 18585==    by 0x3D03E1: ExecInitModifyTable
(nodeModifyTable.c:2363)
==00:00:00:08.756 18585==    by 0x3ACD4C: ExecInitNode
(execProcnode.c:174)
==00:00:00:08.756 18585==    by 0x3A70F3: InitPlan (execMain.c:1020)
==00:00:00:08.756 18585==    by 0x3A7299: standard_ExecutorStart
(execMain.c:266)
==00:00:00:08.756 18585==    by 0x3A739B: ExecutorStart (execMain.c:148)
==00:00:00:08.756 18585==    by 0x527936: ProcessQuery (pquery.c:156)
==00:00:00:08.756 18585==    by 0x527B34: PortalRunMulti (pquery.c:1283)
==00:00:00:08.756 18585==    by 0x5289DE: PortalRun (pquery.c:796)

Reproduced at least on REL_10_STABLE..master.

Reply | Threaded
Open this post in threaded view
|

Re: BUG #16329: Valgrind detects an invalid read when building a gist index with buffering

Alexander Lakhin-2
30.03.2020 00:00, PG Bug reporting form wrote:

> The following bug has been logged on the website:
>
> Bug reference:      16329
> Logged by:          Alexander Lakhin
> Email address:      [hidden email]
> PostgreSQL version: 12.2
> Operating system:   Ubuntu 18.04
> Description:        
>
> Running the following query under valgrind:
> create table gist_point_tbl(id int4, p point);
> insert into gist_point_tbl (id, p)
> select g, point(g*10+1, g*10+1) from generate_series(1, 30000) g;
> create index gist_pointidx2 on gist_point_tbl using gist(p) with (buffering
> = on, fillfactor=50);
>
> leads to a memory access error:
> ==00:00:00:08.755 18585== Invalid read of size 2
> ==00:00:00:08.756 18585==    at 0x1F94CA: gistBuildCallback
> (gistbuild.c:495)
I've found that the memory context where the itup is created:
    oldCtx = MemoryContextSwitchTo(buildstate->giststate->tempCxt);

    /* form an index tuple and point it at the heap tuple */
    itup = gistFormTuple(buildstate->giststate, index, values, isnull,
true);

can be reset in gistBufferingBuildInsert->gistProcessEmptyingQueue:
            /* Free all the memory allocated during index tuple
processing */
            MemoryContextReset(buildstate->giststate->tempCxt);

Probably, the offending commit is d22a09dc.



Reply | Threaded
Open this post in threaded view
|

Re: BUG #16329: Valgrind detects an invalid read when building a gist index with buffering

Alexander Lakhin-2
Hello hackers,
30.03.2020 08:00, Alexander Lakhin wrote:

> 30.03.2020 00:00, PG Bug reporting form wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference:      16329
>> Logged by:          Alexander Lakhin
>> Email address:      [hidden email]
>> PostgreSQL version: 12.2
>> Operating system:   Ubuntu 18.04
>> Description:        
>>
>> Running the following query under valgrind:
>> create table gist_point_tbl(id int4, p point);
>> insert into gist_point_tbl (id, p)
>> select g, point(g*10+1, g*10+1) from generate_series(1, 30000) g;
>> create index gist_pointidx2 on gist_point_tbl using gist(p) with (buffering
>> = on, fillfactor=50);
>>
>> leads to a memory access error:
>> ==00:00:00:08.755 18585== Invalid read of size 2
>> ==00:00:00:08.756 18585==    at 0x1F94CA: gistBuildCallback
>> (gistbuild.c:495)
>>
Please look at the patch that modifies the gist regression test to make
the issue visible and fixes it by avoiding access to the memory context
that can be reset in gistProcessEmptyingQueue().

Best regards,
Alexander

fix-gistbuild.patch (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: BUG #16329: Valgrind detects an invalid read when building a gist index with buffering

Alexander Lakhin-2
06.04.2020 09:00, Alexander Lakhin wrote:

> 30.03.2020 00:00, PG Bug reporting form wrote:
>>> Running the following query under valgrind:
>>> create table gist_point_tbl(id int4, p point);
>>> insert into gist_point_tbl (id, p)
>>> select g, point(g*10+1, g*10+1) from generate_series(1, 30000) g;
>>> create index gist_pointidx2 on gist_point_tbl using gist(p) with (buffering
>>> = on, fillfactor=50);
>>>
>>> leads to a memory access error:
>>> ==00:00:00:08.755 18585== Invalid read of size 2
>>> ==00:00:00:08.756 18585==    at 0x1F94CA: gistBuildCallback
>>> (gistbuild.c:495)
> Please look at the patch that modifies the gist regression test to make
> the issue visible and fixes it by avoiding access to the memory context
> that can be reset in gistProcessEmptyingQueue().
Could you please let me know if the fix is incorrect (or not elaborated
enough to be included in the upcoming releases) or this issue is false
positive?

Best regards,
Alexander




Reply | Threaded
Open this post in threaded view
|

Re: BUG #16329: Valgrind detects an invalid read when building a gist index with buffering

Tom Lane-2
Alexander Lakhin <[hidden email]> writes:
>> Please look at the patch that modifies the gist regression test to make
>> the issue visible and fixes it by avoiding access to the memory context
>> that can be reset in gistProcessEmptyingQueue().

> Could you please let me know if the fix is incorrect (or not elaborated
> enough to be included in the upcoming releases) or this issue is false
> positive?

I took a look at this.  I see the hazard, but I'm not sure that
I like your proposed quick-fix.  Essentially, the problem is that
gistBuildCallback is supposing that the tuple it creates will survive
the execution of its subroutines, which in fact it won't always.
But that means that at some point the tuple pointer it's passed down to
those subroutines becomes a dangling pointer.  What guarantee would we
have that the subroutines don't access the tuple after that point?

I think the real issue here is confusion about which level of function
"owns" the tempCxt and gets to decide when to reset it.  We can't have
both gistBuildCallback and gistProcessEmptyingQueue doing that.  Maybe
we need more than one temporary context, or maybe we could just skip
the context reset in gistProcessEmptyingQueue and let the storage
accumulate till we get back to gistBuildCallback.  But in any case
it's going to require more analysis.

Or, if we do just hack it as you suggest, there had better be a couple
of fat comments in gistBufferingBuildInsert warning about the hazards.

I was somewhat dismayed to realize from looking at the code coverage
report that we have exactly zero test coverage of the gist buffering
build code paths today.  That's just awful; how did the feature get
committed with no test coverage?  Your proposed test additions raise the
coverage in gistbuild.c and gistbuildbuffers.c to something respectable,
at least.  But it looks like there are still some potentially-delicate
paths that aren't tested, notably the "have to stop buffering" business.
Can we do better at reasonable cost, or are those paths just never
reached without huge data volume?  (Could we make them more reachable
by turning down maintenance_work_mem or some other setting?)

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: BUG #16329: Valgrind detects an invalid read when building a gist index with buffering

Alexander Lakhin-2
Hello Tom,
05.05.2020 21:59, Tom Lane wrote:
Could you please let me know if the fix is incorrect (or not elaborated
enough to be included in the upcoming releases) or this issue is false
positive?
I took a look at this.  I see the hazard, but I'm not sure that
I like your proposed quick-fix.  Essentially, the problem is that
gistBuildCallback is supposing that the tuple it creates will survive
the execution of its subroutines, which in fact it won't always.
But that means that at some point the tuple pointer it's passed down to
those subroutines becomes a dangling pointer.  What guarantee would we
have that the subroutines don't access the tuple after that point?
Thanks for your review!
Yes, I'm not excited by this fix too, so I'll try to find another not so quick way to fix it.
Or, if we do just hack it as you suggest, there had better be a couple
of fat comments in gistBufferingBuildInsert warning about the hazards.

I was somewhat dismayed to realize from looking at the code coverage
report that we have exactly zero test coverage of the gist buffering
build code paths today.  That's just awful; how did the feature get
committed with no test coverage?  Your proposed test additions raise the
coverage in gistbuild.c and gistbuildbuffers.c to something respectable,
at least.  But it looks like there are still some potentially-delicate
paths that aren't tested, notably the "have to stop buffering" business.
Can we do better at reasonable cost, or are those paths just never
reached without huge data volume?  (Could we make them more reachable
by turning down maintenance_work_mem or some other setting?)
Please look at the improved test that makes the code coverage for gistbuildbuffers.c almost 100%.

Best regards,
Alexander

gist-buffering-test.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: BUG #16329: Valgrind detects an invalid read when building a gist index with buffering

Alexander Lakhin-2
11.05.2020 23:00, Alexander Lakhin wrote:
Please look at the improved test that makes the code coverage for gistbuildbuffers.c almost 100%.
That improved test exposes another bug (when running under valgrind):

==00:00:00:34.861 1439277== Conditional jump or move depends on uninitialised value(s)
==00:00:00:34.861 1439277==    at 0x1FFE6A: gistpenalty (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F86E0: gistRelocateBuildBuffersOnSplit (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F67B7: gistbufferinginserttuples (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F6B6F: gistbufferinginserttuples (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F6E0E: gistProcessItup (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F701F: gistProcessEmptyingQueue (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F706D: gistBufferingBuildInsert (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F7817: gistBuildCallback (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x2293F3: heapam_index_build_range_scan (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F7A23: gistbuild (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x2B8372: index_build (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x2B9C8D: index_create (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==  Uninitialised value was created by a heap allocation
==00:00:00:34.861 1439277==    at 0x6B646A: palloc (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F5A8A: initGISTstate (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F7913: gistbuild (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x2B8372: index_build (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x2B9C8D: index_create (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x361677: DefineIndex (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x563FE3: ProcessUtilitySlow (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x56358F: standard_ProcessUtility (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x56366E: ProcessUtility (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x55FAED: PortalRunUtility (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x560640: PortalRunMulti (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x5612F8: PortalRun (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==
{
   <insert_a_suppression_name_here>
   Memcheck:Cond
   fun:gistpenalty
   fun:gistRelocateBuildBuffersOnSplit
   fun:gistbufferinginserttuples
   fun:gistbufferinginserttuples
   fun:gistProcessItup
   fun:gistProcessEmptyingQueue
   fun:gistBufferingBuildInsert
   fun:gistBuildCallback
   fun:heapam_index_build_range_scan
   fun:gistbuild
   fun:index_build
   fun:index_create
}
==00:00:00:34.861 1439277== Use of uninitialised value of size 8
==00:00:00:34.861 1439277==    at 0x68E92D: FunctionCall3Coll (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==  Uninitialised value was created by a heap allocation
==00:00:00:34.861 1439277==    at 0x6B646A: palloc (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F5A8A: initGISTstate (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F7913: gistbuild (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x2B8372: index_build (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x2B9C8D: index_create (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x361677: DefineIndex (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x563FE3: ProcessUtilitySlow (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x56358F: standard_ProcessUtility (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x56366E: ProcessUtility (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x55FAED: PortalRunUtility (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x560640: PortalRunMulti (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x5612F8: PortalRun (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==
{
   <insert_a_suppression_name_here>
   Memcheck:Value8
   fun:FunctionCall3Coll
}
==00:00:00:34.861 1439277== Jump to the invalid address stated on the next line
==00:00:00:34.861 1439277==    at 0x0: ???
==00:00:00:34.861 1439277==    by 0x68E92E: FunctionCall3Coll (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1FFE9F: gistpenalty (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F86E0: gistRelocateBuildBuffersOnSplit (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F67B7: gistbufferinginserttuples (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F6B6F: gistbufferinginserttuples (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F6E0E: gistProcessItup (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F701F: gistProcessEmptyingQueue (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F706D: gistBufferingBuildInsert (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F7817: gistBuildCallback (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x2293F3: heapam_index_build_range_scan (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F7A23: gistbuild (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==00:00:00:34.861 1439277==
{
   <insert_a_suppression_name_here>
   Memcheck:Jump
   obj:*
   fun:FunctionCall3Coll
   fun:gistpenalty
   fun:gistRelocateBuildBuffersOnSplit
   fun:gistbufferinginserttuples
   fun:gistbufferinginserttuples
   fun:gistProcessItup
   fun:gistProcessEmptyingQueue
   fun:gistBufferingBuildInsert
   fun:gistBuildCallback
   fun:heapam_index_build_range_scan
   fun:gistbuild
}
==00:00:00:34.861 1439277==
==00:00:00:34.861 1439277== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==00:00:00:34.861 1439277==  Bad permissions for mapped region at address 0x0
==00:00:00:34.861 1439277==    at 0x0: ???
==00:00:00:34.861 1439277==    by 0x68E92E: FunctionCall3Coll (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1FFE9F: gistpenalty (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F86E0: gistRelocateBuildBuffersOnSplit (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F67B7: gistbufferinginserttuples (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F6B6F: gistbufferinginserttuples (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F6E0E: gistProcessItup (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F701F: gistProcessEmptyingQueue (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F706D: gistBufferingBuildInsert (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F7817: gistBuildCallback (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x2293F3: heapam_index_build_range_scan (in .../tmp_install/usr/local/pgsql/bin/postgres)
==00:00:00:34.861 1439277==    by 0x1F7A23: gistbuild (in .../tmp_install/usr/local/pgsql/bin/postgres)
2020-05-12 06:51:30.764 MSK postmaster[1439231] LOG:  server process (PID 1439277) was terminated by signal 11: Segmentation fault
2020-05-12 06:51:30.764 MSK postmaster[1439231] DETAIL:  Failed process was running: create index gist_pointidx6 on gist_point_tbl using gist(p) include(t) with (buffering = on, fillfactor=50);

Best regards,
Alexander