Yet another fast GiST build

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
134 messages Options
1234567
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Tom Lane-2
Heikki Linnakangas <[hidden email]> writes:
> On 21/09/2020 02:06, Tom Lane wrote:
>> Another interesting point is that all the other index AMs seem to WAL-log
>> the new page before the smgrextend call, whereas this code is doing it
>> in the other order.  I strongly doubt that both patterns are equally
>> correct.  Could be that the other AMs are in the wrong though.

> My thinking was that it's better to call smgrextend() first, so that if
> you run out of disk space, you get the error before WAL-logging it. That
> reduces the chance that WAL replay will run out of disk space. A lot of
> things are different during WAL replay, so it's quite likely that WAL
> replay runs out of disk space anyway if you're living on the edge, but
> still.

Yeah.  access/transam/README points out that such failures need to be
planned for, and explains what we do for heap pages;

    1. Adding a disk page to an existing table.

    This action isn't WAL-logged at all.  We extend a table by writing a page
    of zeroes at its end.  We must actually do this write so that we are sure
    the filesystem has allocated the space.  If the write fails we can just
    error out normally.  Once the space is known allocated, we can initialize
    and fill the page via one or more normal WAL-logged actions.  Because it's
    possible that we crash between extending the file and writing out the WAL
    entries, we have to treat discovery of an all-zeroes page in a table or
    index as being a non-error condition.  In such cases we can just reclaim
    the space for re-use.

So GIST seems to be acting according to that design.  (Someday we need
to update this para to acknowledge that not all filesystems behave as
it's assuming.)

> I didn't notice that the other callers are doing it the other way round,
> though. I think they need to, so that they can stamp the page with the
> LSN of the WAL record. But GiST build is special in that regard, because
> it stamps all pages with GistBuildLSN.

Kind of unpleasant; that means they risk what the README points out:

    In all of these cases, if WAL replay fails to redo the original action
    we must panic and abort recovery.  The DBA will have to manually clean up
    (for instance, free up some disk space or fix directory permissions) and
    then restart recovery.  This is part of the reason for not writing a WAL
    entry until we've successfully done the original action.

I'm not sufficiently motivated to go and change it right now, though.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Heikki Linnakangas
In reply to this post by Andrey Borodin-2

On 21/09/2020 16:29, Andrey M. Borodin wrote:
> But thing that bothers me now: when we vacuum leaf page, we bump it's
> NSN. But we do not bump internal page LSN. Does this means we will
> follow rightlinks after vacuum? It seems superflous.

Sorry, I did not understand what you said above. Vacuum doesn't update
any NSNs, only LSNs. Can you elaborate?

> And btw we did not adjust internal page tuples after vacuum...
What do you mean by that?

- Heikki


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Heikki Linnakangas
In reply to this post by Andrey Borodin-2
On 21/09/2020 17:19, Andrey M. Borodin wrote:
>> 21 сент. 2020 г., в 18:29, Andrey M. Borodin <[hidden email]> написал(а):
>>
>> It was a conscious decision with incorrect motivation. I was thinking that it will help to reduce number of "false positive" inspecting right pages. But now I see that:
>> 1. There should be no such "false positives" that we can avoid
>> 2. Valid rightlinks could help to do amcheck verification in future
>
> Well, point number 2 here is invalid. There exist one leaf page p, so that if we start traversing rightlink from p we will reach all leaf pages. But we practically have no means to find this page. This makes rightlinks not very helpful in amcheck for GiST.

Well, if you store all the right links in a hash table or something, you
can "connect the dots" after you have scanned all the pages to see that
the chain is unbroken. Probably would not be worth the trouble, since
the rightlinks are not actually needed after concurrent scans have
completed.

> But for consistency I think it worth to install them.

I agree. I did some testing with your patch. It seems that the
rightlinks are still not always set. I didn't try to debug why.

I wrote a couple of 'pageinspect' function to inspect GiST pages for
this. See attached. I then created a test table and index like this:

create table points (p point);
insert into points select point(x,y) from generate_series(-2000, 2000)
x, generate_series(-2000, 2000) y;
create index points_idx on points using gist (p);

And this is what the root page looks like:

postgres=# select * from gist_page_items(get_raw_page('points_idx', 0));
  itemoffset |     ctid      | itemlen
------------+---------------+---------
           1 | (27891,65535) |      40
           2 | (55614,65535) |      40
           3 | (83337,65535) |      40
           4 | (97019,65535) |      40
(4 rows)

And the right links on the next level:

postgres=# select * from (VALUES (27891), (55614), (83337), (97019)) b
(blkno), lateral gist_page_opaque_info(get_raw_page('points_idx', blkno));
  blkno | lsn | nsn | rightlink  | flags
-------+-----+-----+------------+-------
  27891 | 0/1 | 0/0 | 4294967295 | {}
  55614 | 0/1 | 0/0 | 4294967295 | {}
  83337 | 0/1 | 0/0 |      27891 | {}
  97019 | 0/1 | 0/0 |      55614 | {}
(4 rows)

I expected there to be only one page with invalid right link, but there
are two.

- Heikki

0001-Add-functions-to-pageinspect-to-inspect-GiST-indexes.patch (7K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Andrey Borodin-2


> 28 сент. 2020 г., в 13:12, Heikki Linnakangas <[hidden email]> написал(а):
>
> On 21/09/2020 17:19, Andrey M. Borodin wrote:
>>> 21 сент. 2020 г., в 18:29, Andrey M. Borodin <[hidden email]> написал(а):
>>>
>>> It was a conscious decision with incorrect motivation. I was thinking that it will help to reduce number of "false positive" inspecting right pages. But now I see that:
>>> 1. There should be no such "false positives" that we can avoid
>>> 2. Valid rightlinks could help to do amcheck verification in future
>> Well, point number 2 here is invalid. There exist one leaf page p, so that if we start traversing rightlink from p we will reach all leaf pages. But we practically have no means to find this page. This makes rightlinks not very helpful in amcheck for GiST.
>
> Well, if you store all the right links in a hash table or something, you can "connect the dots" after you have scanned all the pages to see that the chain is unbroken. Probably would not be worth the trouble, since the rightlinks are not actually needed after concurrent scans have completed.
>
>> But for consistency I think it worth to install them.
>
> I agree. I did some testing with your patch. It seems that the rightlinks are still not always set. I didn't try to debug why.
>
> I wrote a couple of 'pageinspect' function to inspect GiST pages for this. See attached. I then created a test table and index like this:
>
> create table points (p point);
> insert into points select point(x,y) from generate_series(-2000, 2000) x, generate_series(-2000, 2000) y;
> create index points_idx on points using gist (p);
>
> And this is what the root page looks like:
>
> postgres=# select * from gist_page_items(get_raw_page('points_idx', 0));
> itemoffset |     ctid      | itemlen
> ------------+---------------+---------
>          1 | (27891,65535) |      40
>          2 | (55614,65535) |      40
>          3 | (83337,65535) |      40
>          4 | (97019,65535) |      40
> (4 rows)
>
> And the right links on the next level:
>
> postgres=# select * from (VALUES (27891), (55614), (83337), (97019)) b (blkno), lateral gist_page_opaque_info(get_raw_page('points_idx', blkno));
> blkno | lsn | nsn | rightlink  | flags
> -------+-----+-----+------------+-------
> 27891 | 0/1 | 0/0 | 4294967295 | {}
> 55614 | 0/1 | 0/0 | 4294967295 | {}
> 83337 | 0/1 | 0/0 |      27891 | {}
> 97019 | 0/1 | 0/0 |      55614 | {}
> (4 rows)
>
> I expected there to be only one page with invalid right link, but there are two.
Yes, there is a bug. Now it seems to me so obvious, yet it took some time to understand that links were shifted by one extra jump. PFA fixed rightlinks installation.

BTW some one more small thing: we initialise page buffers with palloc() and palloc0(), while first one is sufficient for gistinitpage().
Also I'm working on btree_gist opclasses and found out that new sortsupport function is not mentioned in gistadjustmembers(). I think this can be fixed with gist_btree patch.

Your pageinspect patch seems very useful. How do you think, should we provide a way to find invalid tuples in GiST within gist_page_items()? At some point we will have to ask user to reindex GiSTs with invalid tuples.


> 28 сент. 2020 г., в 11:53, Heikki Linnakangas <[hidden email]> написал(а):
>
> On 21/09/2020 16:29, Andrey M. Borodin wrote:
>> But thing that bothers me now: when we vacuum leaf page, we bump it's
>> NSN. But we do not bump internal page LSN. Does this means we will
>> follow rightlinks after vacuum? It seems superflous.
>
> Sorry, I did not understand what you said above. Vacuum doesn't update any NSNs, only LSNs. Can you elaborate?
I've misunderstood difference between NSN and LSN. Seems like everything is fine.

>
>> And btw we did not adjust internal page tuples after vacuum...
> What do you mean by that?

When we delete rows from table internal tuples in GiST stay wide.


Thanks!

Best regards, Andrey Borodin.


install_rightlinks_v2.diff (530 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Heikki Linnakangas
On 29/09/2020 21:04, Andrey M. Borodin wrote:
> 28 сент. 2020 г., в 13:12, Heikki Linnakangas <[hidden email]> написал(а):
>> I did some testing with your patch. It seems that the rightlinks
>> are still not always set. I didn't try to debug why.
>
> Yes, there is a bug. Now it seems to me so obvious, yet it took some
> time to understand that links were shifted by one extra jump. PFA
> fixed rightlinks installation.
Ah, that was simple. I propose adding a comment on it:

--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -540,6 +540,19 @@ gist_indexsortbuild_pagestate_flush(GISTBuildState
*state,
  /* Re-initialize the page buffer for next page on this level. */
  pagestate->page = palloc(BLCKSZ);
  gistinitpage(pagestate->page, isleaf ? F_LEAF : 0);
+
+ /*
+ * Set the right link to point to the previous page. This is just for
+ * debugging purposes: GiST only follows the right link if a page is split
+ * concurrently to a scan, and that cannot happen during index build.
+ *
+ * It's a bit counterintuitive that we set the right link on the new page
+ * to point to the previous page, and not the other way round. But GiST
+ * pages are not ordered like B-tree pages are, so as long as the
+ * right-links form a chain through all the pages in the same level, the
+ * order doesn't matter.
+ */
+ GistPageGetOpaque(pagestate->page)->rightlink = blkno;
  }

> BTW some one more small thing: we initialise page buffers with
> palloc() and palloc0(), while first one is sufficient for
> gistinitpage().
Hmm. Only the first one, in gist_indexsortbuild(), but that needs to be
palloc0, because it's used to write an all-zeros placeholder for the
root page.

> Also I'm working on btree_gist opclasses and found out that new
> sortsupport function is not mentioned in gistadjustmembers(). I think
> this can be fixed with gist_btree patch.

Thanks!

> Your pageinspect patch seems very useful. How do you think, should we
> provide a way to find invalid tuples in GiST within
> gist_page_items()? At some point we will have to ask user to reindex
> GiSTs with invalid tuples.
You mean invalid tuples created by crash on PostgreSQL version 9.0 or
below, and pg_upgraded? I doubt there are any of those still around in
the wild. We have to keep the code to detect them, though.

It would be nice to improve gist_page_items() to display more
information about the items, although I wouldn't worry much about
invalid tuples. The 'gevel' extension that Oleg mentioned upthread does
more, it would be nice to incorporate that into pageinspect somehow.

- Heikki


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Pavel Borisov
I've been making tests with memory sanitizer and got one another error in regression test create_index:
CREATE INDEX gpointind ON point_tbl USING gist (f1);
server closed the connection unexpectedly

with logfile:
gistproc.c:1714:28: runtime error: 1e+300 is outside the range of representable values of type 'float'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior gistproc.c:1714:28

gistproc.c:
1714     z = point_zorder_internal(p->x, p->y);

Consider this a minor issue but unrelated to the other issues discussed. It is reproduced on the last master commit 0a3c864c32751fd29d021929cf70af421fd27370 after all changes into Gist committed.

cflags="-DUSE_VALGRIND -Og -O0 -fsanitize=address -fsanitize=undefined -fno-sanitize-recover=all -fno-sanitize=alignment -fstack-protector"


--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Heikki Linnakangas
On 06/10/2020 14:05, Pavel Borisov wrote:
> I've been making tests with memory sanitizer

Thanks!

> and got one another error
> in regression test create_index:
> CREATE INDEX gpointind ON point_tbl USING gist (f1);
> server closed the connection unexpectedly
>
> with logfile:
> gistproc.c:1714:28: runtime error: 1e+300 is outside the range of
> representable values of type 'float'
> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior gistproc.c:1714:28
>
> gistproc.c:
> 1714     z = point_zorder_internal(p->x, p->y);
>
> Consider this a minor issue but unrelated to the other issues discussed.
> It is reproduced on the last master
> commit 0a3c864c32751fd29d021929cf70af421fd27370 after all changes into
> Gist committed.
>
> cflags="-DUSE_VALGRIND -Og -O0 -fsanitize=address -fsanitize=undefined
> -fno-sanitize-recover=all -fno-sanitize=alignment -fstack-protector"

You get the same error with:

select (float8 '1e+300')::float4;

float.c:1204:11: runtime error: 1e+300 is outside the range of
representable values of type 'float'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior float.c:1204:11 in

It boils down to casting a C double to float, when the value doesn't fit
in float. I'm surprised that's undefined behavior, but I'm no C99
lawyer. The code in dtof() expects it to yield Inf.

I'm inclined to shrug this off and say that the sanitizer is being
over-zealous. Is there some compiler flag we should be setting, to tell
it that we require specific behavior? Any other ideas?

- Heikki


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Tom Lane-2
Heikki Linnakangas <[hidden email]> writes:
> You get the same error with:
> select (float8 '1e+300')::float4;
> float.c:1204:11: runtime error: 1e+300 is outside the range of
> representable values of type 'float'
> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior float.c:1204:11 in

> It boils down to casting a C double to float, when the value doesn't fit
> in float. I'm surprised that's undefined behavior, but I'm no C99
> lawyer. The code in dtof() expects it to yield Inf.

I think UBSan read C99 6.3.1.5:

       [#2]  When  a double is demoted to float or a long double to
       double or float, if the value being converted is outside the
       range  of  values  that  can be represented, the behavior is
       undefined.

and stopped reading at that point, which they should not have.
If you go on to read the portions around, particularly, <fenv.h>,
you get a different picture of affairs.  If we're relying on IEEE
float semantics in other places, which we are, we're perfectly
entitled to assume that the cast will yield Inf (and a floating
point exception flag, which we ignore).  I think the "undefined"
here is just meant to say that there's no single behavior promised
across all possible C implementations.  They'd have been better to
write "implementation-defined", though.

> I'm inclined to shrug this off and say that the sanitizer is being
> over-zealous. Is there some compiler flag we should be setting, to tell
> it that we require specific behavior? Any other ideas?

If UBSan doesn't have a flag to tell it to assume IEEE math,
I'd say that makes it next door to worthless for our purposes.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Pavel Borisov
It became normal with 
-fsanitize=signed-integer-overflow,null,alignment
instead of
-fsanitize=undefined
(which is strictly a 'default' list of needed and unnecessary things to check, can be overridden anyway but needed some reading for it)

Thanks!

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Andrey Borodin-2
In reply to this post by Andrey Borodin-2


> 29 сент. 2020 г., в 23:04, Andrey M. Borodin <[hidden email]> написал(а):
>
>
> Also I'm working on btree_gist opclasses and found out that new sortsupport function is not mentioned in gistadjustmembers(). I think this can be fixed with gist_btree patch.

Here's draft patch with implementation of sortsupport for ints and floats.

Thanks!

Best regards, Andrey Borodin.

0001-Sortsupport-for-sorting-GiST-build-for-ints-and-floa.patch (19K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Heikki Linnakangas
On 07/10/2020 15:27, Andrey Borodin wrote:
> Here's draft patch with implementation of sortsupport for ints and floats.

> +static int
> +gbt_int4_cmp(Datum a, Datum b, SortSupport ssup)
> +{
> + int32KEY   *ia = (int32KEY *) DatumGetPointer(a);
> + int32KEY   *ib = (int32KEY *) DatumGetPointer(b);
> +
> + if (ia->lower == ib->lower)
> + {
> + if (ia->upper == ib->upper)
> + return 0;
> +
> + return (ia->upper > ib->upper) ? 1 : -1;
> + }
> +
> + return (ia->lower > ib->lower) ? 1 : -1;
> +}

We're only dealing with leaf items during index build, so the 'upper'
and 'lower' should always be equal here, right? Maybe add a comment and
an assertion on that.

(It's pretty sad that the on-disk representation is identical for leaf
and internal items, because that wastes a lot of disk space, but that's
way out of scope for this patch.)

- Heikki


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Andrey Borodin-2


> 7 окт. 2020 г., в 17:38, Heikki Linnakangas <[hidden email]> написал(а):
>
> On 07/10/2020 15:27, Andrey Borodin wrote:
>> Here's draft patch with implementation of sortsupport for ints and floats.
>
>> +static int
>> +gbt_int4_cmp(Datum a, Datum b, SortSupport ssup)
>> +{
>> + int32KEY   *ia = (int32KEY *) DatumGetPointer(a);
>> + int32KEY   *ib = (int32KEY *) DatumGetPointer(b);
>> +
>> + if (ia->lower == ib->lower)
>> + {
>> + if (ia->upper == ib->upper)
>> + return 0;
>> +
>> + return (ia->upper > ib->upper) ? 1 : -1;
>> + }
>> +
>> + return (ia->lower > ib->lower) ? 1 : -1;
>> +}
>
> We're only dealing with leaf items during index build, so the 'upper' and 'lower' should always be equal here, right? Maybe add a comment and an assertion on that.
>
> (It's pretty sad that the on-disk representation is identical for leaf and internal items, because that wastes a lot of disk space, but that's way out of scope for this patch.)
Thanks, I've added assert() where is was easy to test equalty.

PFA patch with all types.

I had a plan to implement and test one type each day. I did not quite understood how rich our type system is.

Thanks!

Best regards, Andrey Borodin.

0001-Sortsupport-for-sorting-GiST-build-for-gist_btree-ty.patch (60K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Pavel Borisov

>> +    return (ia->lower > ib->lower) ? 1 : -1;
>> +}
>
> We're only dealing with leaf items during index build, so the 'upper' and 'lower' should always be equal here, right? Maybe add a comment and an assertion on that.
>
> (It's pretty sad that the on-disk representation is identical for leaf and internal items, because that wastes a lot of disk space, but that's way out of scope for this patch.)

Thanks, I've added assert() where is was easy to test equalty.

PFA patch with all types.

I had a plan to implement and test one type each day. I did not quite understood how rich our type system is.

Cool, thanks!

BTW there is a somewhat parallel discussion on this gist patch in pgsql-bugs which you may miss 

The main point is that buffering build is better to be enforced with buffering=on as otherwise buffering build becomes hard to test in the presence of sort support.
 
--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Heikki Linnakangas
In reply to this post by Andrey Borodin-2
On 21/10/2020 20:13, Andrey Borodin wrote:

>> 7 окт. 2020 г., в 17:38, Heikki Linnakangas <[hidden email]> написал(а):
>>
>> On 07/10/2020 15:27, Andrey Borodin wrote:
>>> Here's draft patch with implementation of sortsupport for ints and floats.
>>
>>> +static int
>>> +gbt_int4_cmp(Datum a, Datum b, SortSupport ssup)
>>> +{
>>> + int32KEY   *ia = (int32KEY *) DatumGetPointer(a);
>>> + int32KEY   *ib = (int32KEY *) DatumGetPointer(b);
>>> +
>>> + if (ia->lower == ib->lower)
>>> + {
>>> + if (ia->upper == ib->upper)
>>> + return 0;
>>> +
>>> + return (ia->upper > ib->upper) ? 1 : -1;
>>> + }
>>> +
>>> + return (ia->lower > ib->lower) ? 1 : -1;
>>> +}
>>
>> We're only dealing with leaf items during index build, so the 'upper' and 'lower' should always be equal here, right? Maybe add a comment and an assertion on that.
>>
>> (It's pretty sad that the on-disk representation is identical for leaf and internal items, because that wastes a lot of disk space, but that's way out of scope for this patch.)
>
> Thanks, I've added assert() where is was easy to test equalty.
>
> PFA patch with all types.

gbt_ts_cmp(), gbt_time_cmp_sort() and gbt_date_cmp_sort() still have the
above issue, they still compare "upper" for no good reason.

> +static Datum
> +gbt_bit_abbrev_convert(Datum original, SortSupport ssup)
> +{
> +       return (Datum) 0;
> +}
> +
> +static int
> +gbt_bit_cmp_abbrev(Datum z1, Datum z2, SortSupport ssup)
> +{
> +       return 0;
> +}

If an abbreviated key is not useful, just don't define abbrev functions
and don't set SortSupport->abbrev_converter in the first place.

> static bool
> gbt_inet_abbrev_abort(int memtupcount, SortSupport ssup)
> {
> #if SIZEOF_DATUM == 8
> return false;
> #else
> return true;
> #endif
> }

Better to not set the 'abbrev_converter' function in the first place. Or
would it be better to cast the float8 to float4 if SIZEOF_DATUM == 4?

> I had a plan to implement and test one type each day. I did not quite understood how rich our type system is.

:-)

- Heikki


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Andrey Borodin-2
Thanks!

> 27 окт. 2020 г., в 16:43, Heikki Linnakangas <[hidden email]> написал(а):
>
> gbt_ts_cmp(), gbt_time_cmp_sort() and gbt_date_cmp_sort() still have the above issue, they still compare "upper" for no good reason.
Fixed.

>> +static Datum
>> +gbt_bit_abbrev_convert(Datum original, SortSupport ssup)
>> +{
>> +       return (Datum) 0;
>> +}
>> +
>> +static int
>> +gbt_bit_cmp_abbrev(Datum z1, Datum z2, SortSupport ssup)
>> +{
>> +       return 0;
>> +}
>
> If an abbreviated key is not useful, just don't define abbrev functions and don't set SortSupport->abbrev_converter in the first place.
Fixed.

>
>> static bool
>> gbt_inet_abbrev_abort(int memtupcount, SortSupport ssup)
>> {
>> #if SIZEOF_DATUM == 8
>> return false;
>> #else
>> return true;
>> #endif
>> }
>
> Better to not set the 'abbrev_converter' function in the first place. Or would it be better to cast the float8 to float4 if SIZEOF_DATUM == 4?
Ok, now for 4 bytes Datum we do return (Datum) Float4GetDatum((float) z);

How do you think, should this patch and patch with pageinspect GiST functions be registered on commitfest?

Thanks!

Best regards, Andrey Borodin.

v3-0001-Sortsupport-for-sorting-GiST-build-for-gist_btree.patch (54K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Heikki Linnakangas
On 30/10/2020 20:20, Andrey Borodin wrote:

>> 27 окт. 2020 г., в 16:43, Heikki Linnakangas <[hidden email]> написал(а):
>>> static bool
>>> gbt_inet_abbrev_abort(int memtupcount, SortSupport ssup)
>>> {
>>> #if SIZEOF_DATUM == 8
>>> return false;
>>> #else
>>> return true;
>>> #endif
>>> }
>>
>> Better to not set the 'abbrev_converter' function in the first place. Or would it be better to cast the float8 to float4 if SIZEOF_DATUM == 4?
> Ok, now for 4 bytes Datum we do return (Datum) Float4GetDatum((float) z);

A few quick comments:

* Currently with float8, you immediately abort abbreviation if
SIZEOF_DATUM == 4. Like in the 'inet' above, you could convert the
float8 to float4 instead.

* Some of the ALTER OPERATOR FAMILY commands in btree_gist--1.6--1.7.sql
are copy-pasted wrong. Here's one example:

ALTER OPERATOR FAMILY gist_timestamptz_ops USING gist ADD
        FUNCTION 11 (text, text) gbt_text_sortsupport;

* It's easy to confuse the new comparison functions with the existing
comparisons used by the picksplit functions. Some comments and/or naming
changes would be good to clarify how they're different.

* It would be straightforward to have abbreviated functions for macaddr
and macaddr8 too.

> How do you think, should this patch and patch with pageinspect GiST functions be registered on commitfest?

Yeah, that'd be good.

- Heikki


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Andrey Borodin-2


> 2 нояб. 2020 г., в 19:45, Heikki Linnakangas <[hidden email]> написал(а):
>
>> How do you think, should this patch and patch with pageinspect GiST functions be registered on commitfest?
>
> Yeah, that'd be good.
I've registered both patches on January CF.
pageinspect patch's code looks goot to me (besides TODO item there), but it lacks docs and tests. I can add some info and calls in future reviews.

>
> On 30/10/2020 20:20, Andrey Borodin wrote:
> A few quick comments:
>
> * Currently with float8, you immediately abort abbreviation if SIZEOF_DATUM == 4. Like in the 'inet' above, you could convert the float8 to float4 instead.
>
> * Some of the ALTER OPERATOR FAMILY commands in btree_gist--1.6--1.7.sql are copy-pasted wrong. Here's one example:
>
> ALTER OPERATOR FAMILY gist_timestamptz_ops USING gist ADD
> FUNCTION 11 (text, text) gbt_text_sortsupport;
>
> * It's easy to confuse the new comparison functions with the existing comparisons used by the picksplit functions. Some comments and/or naming changes would be good to clarify how they're different.
>
> * It would be straightforward to have abbreviated functions for macaddr and macaddr8 too.

I'll fix these issues soon. But things like this bother me a lot:
> ALTER OPERATOR FAMILY gist_timestamptz_ops USING gist ADD
> FUNCTION 11 (text, text) gbt_text_sortsupport;

To test that functions are actually called for sorting build we should support directive sorting build like "CREATE INDEX ON A USING GIST(B) WITH(sorting=surely,and fail if not)".
If we have unconditional sorting build and unconditional buffered build we can check opclasses code better.
The problem is current reloption is called "buffering". It would be strange if we gave this enum possible value like "not buffering, but very much like buffering, just another way".
How do you think, is it ok to add reloption "buffering=sorting" to enhance tests?

Best regards, Andrey Borodin.

Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Justin Pryzby
On Thu, Nov 05, 2020 at 10:11:52PM +0500, Andrey Borodin wrote:
> To test that functions are actually called for sorting build we should support directive sorting build like "CREATE INDEX ON A USING GIST(B) WITH(sorting=surely,and fail if not)".

Maybe you could add a DEBUG1 message for that, and include that in regression
tests, which would then fail if sorting wasn't used.

Maybe you'd need to make it consistent by setting GUCs like work_mem /
max_parallel_maintenance_workers / ??

Similar to this

postgres=# SET client_min_messages =debug;
postgres=# CREATE INDEX ON A USING GIST(i) WITH(buffering=off);
DEBUG:  building index "a_i_idx2" on table "a" serially
CREATE INDEX

> If we have unconditional sorting build and unconditional buffered build we can check opclasses code better.
> The problem is current reloption is called "buffering". It would be strange if we gave this enum possible value like "not buffering, but very much like buffering, just another way".
> How do you think, is it ok to add reloption "buffering=sorting" to enhance tests?
>
> Best regards, Andrey Borodin.


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Andrey Borodin-2


> 5 нояб. 2020 г., в 22:20, Justin Pryzby <[hidden email]> написал(а):
>
> On Thu, Nov 05, 2020 at 10:11:52PM +0500, Andrey Borodin wrote:
>> To test that functions are actually called for sorting build we should support directive sorting build like "CREATE INDEX ON A USING GIST(B) WITH(sorting=surely,and fail if not)".
>
> Maybe you could add a DEBUG1 message for that, and include that in regression
> tests, which would then fail if sorting wasn't used.

That's a good idea. Thanks!

>
> Maybe you'd need to make it consistent by setting GUCs like work_mem /
> max_parallel_maintenance_workers / ??
>
> Similar to this
>
> postgres=# SET client_min_messages =debug;
> postgres=# CREATE INDEX ON A USING GIST(i) WITH(buffering=off);
> DEBUG:  building index "a_i_idx2" on table "a" serially
> CREATE INDEX
Currently, only B-tree uses parallel build, so no need to tweak GUCs except client_min_messages.
Before these tests, actually, ~20% of opclasses were not working as expected. Despite I've checked each one by hand. I have

PFA patch with fixed comments from Heikki.

Thanks!

Best regards, Andrey Borodin.

v4-0001-Sortsupport-for-sorting-GiST-build-for-gist_btree.patch (110K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Andrey Borodin-2
In reply to this post by Heikki Linnakangas


> 28 сент. 2020 г., в 13:12, Heikki Linnakangas <[hidden email]> написал(а):
>
> I wrote a couple of 'pageinspect' function to inspect GiST pages for this. See attached.
> <0001-Add-functions-to-pageinspect-to-inspect-GiST-indexes.patch>

Here's version with tests and docs. I still have no idea how to print some useful information about tuples keys.

Thanks!

Best regards, Andrey Borodin.




v2-0001-Add-functions-to-pageinspect-to-inspect-GiST-inde.patch (19K) Download Attachment
1234567