Strange coding in mvdistinct.c

Strange coding in mvdistinct.c

Tom Lane-2
In the wake of the discussion at [1] I went looking for structs that
should be using FLEXIBLE_ARRAY_MEMBER and are not, by dint of grepping
for size calculations of the form "offsetof(struct,fld) + n * sizeof(...)"
and then seeing how "fld" is declared.  I haven't yet found anything
like that that I want to change, but I did come across this bit in
mvdistinct.c's statext_ndistinct_serialize():

    len = VARHDRSZ + SizeOfMVNDistinct +
        ndistinct->nitems * (offsetof(MVNDistinctItem, attrs) + sizeof(int));

Given the way that the subsequent code looks, I would argue that
offsetof(MVNDistinctItem, attrs) has got basically nothing to do with
this calculation, and that the right way to phrase it is just

    len = VARHDRSZ + SizeOfMVNDistinct +
        ndistinct->nitems * (sizeof(double) + sizeof(int));

Consider if there happened to be alignment padding in MVNDistinctItem:
as the code stands it'd overestimate the space needed.  (There won't be
padding on any machine we support, I believe, so this isn't a live bug ---
but it's overcomplicated code, and could become buggy if any
less-than-double-width fields get added to MVNDistinctItem.)
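
To make the padding concern concrete, here is a minimal sketch. It assumes a hypothetical variant of the item struct with one narrow field added; DemoItem, its "flag" field, and the demo_* helpers are invented for illustration and are not PostgreSQL definitions. It compares a per-item size taken from the struct layout with one taken from the fields that would actually be written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical variant of MVNDistinctItem with a narrow field added.
 * DemoItem and "flag" are illustrative only, not PostgreSQL definitions.
 */
typedef struct DemoItem
{
    double      ndistinct;
    int16_t     flag;       /* hypothetical less-than-double-width field */
    void       *attrs;      /* stand-in for the attribute set pointer */
} DemoItem;

/* Per-item size derived from the in-memory layout (padding included). */
static size_t
demo_item_len_by_layout(void)
{
    return offsetof(DemoItem, attrs) + sizeof(int);
}

/* Per-item size of a packed form: each written field counted once. */
static size_t
demo_item_len_by_fields(void)
{
    return sizeof(double) + sizeof(int16_t) + sizeof(int);
}

/* Bytes by which the layout-based estimate exceeds the packed size. */
static size_t
demo_item_overestimate(void)
{
    size_t      by_layout = demo_item_len_by_layout();
    size_t      by_fields = demo_item_len_by_fields();

    /* Padding can only add bytes before "attrs", never remove them. */
    assert(by_layout >= by_fields);
    return by_layout - by_fields;
}
```

On a typical 64-bit platform the pointer would be aligned to 8 bytes, so demo_item_overestimate() would likely report 6 wasted bytes per item; the exact figure depends on the ABI.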

For largely the same reason, I do not think that SizeOfMVNDistinct is
a helpful way to compute the space needed for those fields --- any
alignment padding that might be included is irrelevant for this purpose.
In short I'd be inclined to phrase this just as

    len = VARHDRSZ + 3 * sizeof(uint32) +
        ndistinct->nitems * (sizeof(double) + sizeof(int));

It looks to me actually like all the uses of both SizeOfMVNDistinctItem
and SizeOfMVNDistinct are wrong, because the code using those symbols
is really thinking about the size of this serialized representation,
which is guaranteed not to have any inter-field padding, unlike the
structs.
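
As a rough illustration of why the serialized size is a sum of field sizes, the sketch below packs a header and items field by field with memcpy. It is not the code in mvdistinct.c: the demo_* names are hypothetical, and the varlena header (VARHDRSZ) and the per-item attribute numbers are left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Packed size: three uint32 header fields, then a double and an int per item. */
static size_t
demo_packed_len(uint32_t nitems)
{
    return 3 * sizeof(uint32_t) +
        (size_t) nitems * (sizeof(double) + sizeof(int));
}

/*
 * Write the header and items into buf with no padding between fields.
 * buf must have room for demo_packed_len(nitems) bytes.
 * Returns the number of bytes written.
 */
static size_t
demo_serialize(char *buf, uint32_t magic, uint32_t type, uint32_t nitems,
               const double *ndistinct, const int *nattrs)
{
    char       *p = buf;
    uint32_t    i;

    memcpy(p, &magic, sizeof(uint32_t));
    p += sizeof(uint32_t);
    memcpy(p, &type, sizeof(uint32_t));
    p += sizeof(uint32_t);
    memcpy(p, &nitems, sizeof(uint32_t));
    p += sizeof(uint32_t);

    for (i = 0; i < nitems; i++)
    {
        memcpy(p, &ndistinct[i], sizeof(double));
        p += sizeof(double);
        memcpy(p, &nattrs[i], sizeof(int));
        p += sizeof(int);
    }

    /* The bytes written must match the field-by-field length formula. */
    assert((size_t) (p - buf) == demo_packed_len(nitems));
    return (size_t) (p - buf);
}
```

Because every field is copied individually, the struct layout never enters into the byte count, which is the reason offsetof-based macros are a poor fit here.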

Thoughts?

                        regards, tom lane

[1] https://postgr.es/m/a620f85a-42ab-e0f3-3337-b04b97e2e2f5@...



Re: Strange coding in mvdistinct.c

Tom Lane-2
Oh, and as I continue to grep, I found this in dependencies.c:

            dependencies = (MVDependencies *) repalloc(dependencies,
                                                       offsetof(MVDependencies, deps)
                                                       + dependencies->ndeps * sizeof(MVDependency));

I'm pretty sure this is an actual bug: the calculation should be

                       offsetof(MVDependencies, deps)
                       + dependencies->ndeps * sizeof(MVDependency *));

because deps is an array of MVDependency* not MVDependency.

This would lead to an overallocation not underallocation, and it's
probably pretty harmless because ndeps can't get too large (I hope;
if it could, this would have O(N^2) performance problems).  Still,
you oughta fix it.

(There's a similar calculation later in the file that gets it right.)
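
For readers following along, a small sketch of the difference between the two size calculations. The Demo* structs are simplified stand-ins rather than the definitions in the PostgreSQL headers, and realloc is used in place of repalloc so the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins for MVDependency and MVDependencies. */
typedef struct DemoDependency
{
    double      degree;
    int         nattributes;
    int16_t     attributes[8];
} DemoDependency;

typedef struct DemoDependencies
{
    uint32_t    magic;
    uint32_t    type;
    uint32_t    ndeps;
    DemoDependency *deps[];     /* array of pointers, not of structs */
} DemoDependencies;

/* Correct: the flexible array holds ndeps pointers. */
static size_t
demo_size_correct(uint32_t ndeps)
{
    return offsetof(DemoDependencies, deps) +
        (size_t) ndeps * sizeof(DemoDependency *);
}

/* Overallocating: counts a whole struct for every pointer slot. */
static size_t
demo_size_overalloc(uint32_t ndeps)
{
    return offsetof(DemoDependencies, deps) +
        (size_t) ndeps * sizeof(DemoDependency);
}

/*
 * Append one pointer, growing the array by exactly one slot.
 * Returns NULL on allocation failure, leaving the original untouched.
 */
static DemoDependencies *
demo_append(DemoDependencies *deps, DemoDependency *d)
{
    uint32_t    n = deps ? deps->ndeps : 0;
    DemoDependencies *grown;

    assert(d != NULL);
    grown = realloc(deps, demo_size_correct(n + 1));
    if (grown == NULL)
        return NULL;
    if (deps == NULL)
    {
        grown->magic = 0;
        grown->type = 0;
    }
    grown->deps[n] = d;
    grown->ndeps = n + 1;
    return grown;
}
```

Growing by one slot per call mirrors the pattern under discussion; it keeps the sketch short but reallocates on every append.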

                        regards, tom lane



Re: Strange coding in mvdistinct.c

Tomas Vondra-4
On Mon, Apr 15, 2019 at 06:12:24PM -0400, Tom Lane wrote:

>Oh, and as I continue to grep, I found this in dependencies.c:
>
>            dependencies = (MVDependencies *) repalloc(dependencies,
>                                                       offsetof(MVDependencies, deps)
>                                                       + dependencies->ndeps * sizeof(MVDependency));
>
>I'm pretty sure this is an actual bug: the calculation should be
>
>                       offsetof(MVDependencies, deps)
>                       + dependencies->ndeps * sizeof(MVDependency *));
>
>because deps is an array of MVDependency* not MVDependency.
>
>This would lead to an overallocation not underallocation, and it's
>probably pretty harmless because ndeps can't get too large (I hope;
>if it could, this would have O(N^2) performance problems).  Still,
>you oughta fix it.
>
>(There's a similar calculation later in the file that gets it right.)
>

Thanks. I noticed some of the bugs while investigating the recent MCV
serialization, and I plan to fix them soon. This week, hopefully.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services