Yet another fast GiST build

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
69 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

Yet another fast GiST build

Andrey M. Borodin
Hi!

In many cases GiST index can be build fast using z-order sorting.

I've looked into proof of concept by Nikita Glukhov [0] and it looks very interesting.
So, I've implemented yet another version of B-tree-like GiST build.
It's main use case and benefits can be summarized with small example:

postgres=# create table x as select point (random(),random()) from generate_series(1,3000000,1);
SELECT 3000000
Time: 5061,967 ms (00:05,062)
postgres=# create index ON x using gist (point ) with (fast_build_sort_function=gist_point_sortsupport);
CREATE INDEX
Time: 6140,227 ms (00:06,140)
postgres=# create index ON x using gist (point );
CREATE INDEX
Time: 32061,200 ms (00:32,061)

As you can see, Z-order build is on order of magnitude faster. Select performance is roughly the same. Also, index is significantly smaller.

Nikita's PoC is faster because it uses parallel build, but it intervenes into B-tree code a lot (for reuse). This patchset is GiST-isolated.
My biggest concern is that passing function to relation option seems a bit hacky. You can pass there any function matching sort support signature.
Embedding this function into opclass makes no sense: it does not affect scan anyhow.

In current version, docs and tests are not implemented. I want to discuss overall design. Do we really want yet another GiST build, if it is 3-10 times faster?

Thanks!

Best regards, Andrey Borodin.

[0] https://github.com/postgres/postgres/compare/master...glukhovn:gist_btree_build


0001-function-relopt-for-gist-build.patch (6K) Download Attachment
0003-Implement-GiST-build-using-sort-support.patch (21K) Download Attachment
0002-Add-sort-support-for-point-gist_point_sortsupport.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Darafei "Komяpa" Praliaskouski
Hello,

This is very interesting. In my pipeline currently GiST index rebuild is the biggest time consuming step.

I believe introducing optional concept of order in the GiST opclass will be beneficial not only for fast build, but for other tasks later:
 - CLUSTER can order the table using that notion, in parallel way.
 - btree_gist can be even closer to btree by getting the tuples sorted inside page.
 - tree descend on insertion in future can traverse the list in more opportunistic way, calculating penalty for siblings-by-order first.

I believe everywhere the idea of ordering is needed it's provided by giving a btree opclass.

How about giving a link to btree opclass inside a gist opclass?


On Mon, Aug 26, 2019 at 10:59 AM Andrey Borodin <[hidden email]> wrote:
Hi!

In many cases GiST index can be build fast using z-order sorting.

I've looked into proof of concept by Nikita Glukhov [0] and it looks very interesting.
So, I've implemented yet another version of B-tree-like GiST build.
It's main use case and benefits can be summarized with small example:

postgres=# create table x as select point (random(),random()) from generate_series(1,3000000,1);
SELECT 3000000
Time: 5061,967 ms (00:05,062)
postgres=# create index ON x using gist (point ) with (fast_build_sort_function=gist_point_sortsupport);
CREATE INDEX
Time: 6140,227 ms (00:06,140)
postgres=# create index ON x using gist (point );
CREATE INDEX
Time: 32061,200 ms (00:32,061)

As you can see, Z-order build is on order of magnitude faster. Select performance is roughly the same. Also, index is significantly smaller.

Nikita's PoC is faster because it uses parallel build, but it intervenes into B-tree code a lot (for reuse). This patchset is GiST-isolated.
My biggest concern is that passing function to relation option seems a bit hacky. You can pass there any function matching sort support signature.
Embedding this function into opclass makes no sense: it does not affect scan anyhow.

In current version, docs and tests are not implemented. I want to discuss overall design. Do we really want yet another GiST build, if it is 3-10 times faster?

Thanks!

Best regards, Andrey Borodin.

[0] https://github.com/postgres/postgres/compare/master...glukhovn:gist_btree_build



--
Darafei Praliaskouski
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Heikki Linnakangas
In reply to this post by Andrey M. Borodin
On 26/08/2019 10:59, Andrey Borodin wrote:
> Hi!
>
> In many cases GiST index can be build fast using z-order sorting.
>
> I've looked into proof of concept by Nikita Glukhov [0] and it looks very interesting.
> So, I've implemented yet another version of B-tree-like GiST build.

Cool!

> My biggest concern is that passing function to relation option seems
> a bit hacky. You can pass there any function matching sort support
> signature. Embedding this function into opclass makes no sense: it
> does not affect scan anyhow.

I think it should be a new, optional, GiST "AM support function", in
pg_amproc. That's how the sort support functions are defined for B-tree
opfamilies, too.

- Heikki


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Alexander Korotkov
In reply to this post by Andrey M. Borodin
On Mon, Aug 26, 2019 at 10:59 AM Andrey Borodin <[hidden email]> wrote:

> In many cases GiST index can be build fast using z-order sorting.
>
> I've looked into proof of concept by Nikita Glukhov [0] and it looks very interesting.
> So, I've implemented yet another version of B-tree-like GiST build.
> It's main use case and benefits can be summarized with small example:
>
> postgres=# create table x as select point (random(),random()) from generate_series(1,3000000,1);
> SELECT 3000000
> Time: 5061,967 ms (00:05,062)
> postgres=# create index ON x using gist (point ) with (fast_build_sort_function=gist_point_sortsupport);
> CREATE INDEX
> Time: 6140,227 ms (00:06,140)
> postgres=# create index ON x using gist (point );
> CREATE INDEX
> Time: 32061,200 ms (00:32,061)
>
> As you can see, Z-order build is on order of magnitude faster. Select performance is roughly the same. Also, index is significantly smaller.

Cool!  These experiments bring me to following thoughts.  Can we not
only build, but also maintain GiST indexes in B-tree-like manner?  If
we put Z-value together with MBR to the non-leaf keys, that should be
possible.  Maintaining it in B-tree-like manner would have a lot of
advantages.
1) Binary search in non-leaf pages instead of probing each key is much faster.
2) Split algorithm is also simpler and faster.
3) We can refind existing index tuple in predictable time of O(log N).
And this doesn't depend on MBR overlapping degree.  This is valuable
for future table AMs, which would have notion of deleting individual
index tuples (for instance, zheap promises to eventually support
delete-marking indexes).

Eventually, we may come to the idea of B-tree indexes with
user-defined additional keys in non-leaf tuples, which could be used
for scanning.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Peter Geoghegan-4
On Thu, Aug 29, 2019 at 3:48 PM Alexander Korotkov
<[hidden email]> wrote:
> > As you can see, Z-order build is on order of magnitude faster. Select performance is roughly the same. Also, index is significantly smaller.
>
> Cool!  These experiments bring me to following thoughts.  Can we not
> only build, but also maintain GiST indexes in B-tree-like manner?  If
> we put Z-value together with MBR to the non-leaf keys, that should be
> possible.  Maintaining it in B-tree-like manner would have a lot of
> advantages.

I'm not an expert on GiST, but that seems like it would have a lot of
advantages in the long term. It is certainly theoretically appealing.

Could this make it easier to use merge join with containment
operators? I'm thinking of things like geospatial joins, which can
generally only be performed as nested loop joins at the moment. This
is often wildly inefficient.

--
Peter Geoghegan


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Alexander Korotkov
On Fri, Aug 30, 2019 at 2:34 AM Peter Geoghegan <[hidden email]> wrote:

> On Thu, Aug 29, 2019 at 3:48 PM Alexander Korotkov
> <[hidden email]> wrote:
> > > As you can see, Z-order build is on order of magnitude faster. Select performance is roughly the same. Also, index is significantly smaller.
> >
> > Cool!  These experiments bring me to following thoughts.  Can we not
> > only build, but also maintain GiST indexes in B-tree-like manner?  If
> > we put Z-value together with MBR to the non-leaf keys, that should be
> > possible.  Maintaining it in B-tree-like manner would have a lot of
> > advantages.
>
> I'm not an expert on GiST, but that seems like it would have a lot of
> advantages in the long term. It is certainly theoretically appealing.
>
> Could this make it easier to use merge join with containment
> operators? I'm thinking of things like geospatial joins, which can
> generally only be performed as nested loop joins at the moment. This
> is often wildly inefficient.

AFAICS, spatial joins aren't going to be as easy as just merge joins
on Z-value.  When searching for close points in two datasets (closer
than given threshold) we can scan some ranges of Z-value in one
dataset while iterating on another.  But dealing with prolonged
spatial objects in not that easy.  In order to determine if there are
matching values in given Z-range you also need to be aware on size of
objects inside that Z-range.  So, before merge-like join you need to
not only sort, but build some index-like structure.

Alternatively you can encode size in Z-value.  But this increases
dimensionality of space and decreases efficiency of join.  Also,
spatial join can be made using two indexes, even just current GiST
without Z-values.  We've prototyped that, see [1].

Links
1. https://github.com/pgsphere/pgsphere/blob/crossmatch_cnode/crossmatch.c

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Peter Geoghegan-4
On Thu, Aug 29, 2019 at 8:22 PM Alexander Korotkov
<[hidden email]> wrote:
> Alternatively you can encode size in Z-value.  But this increases
> dimensionality of space and decreases efficiency of join.  Also,
> spatial join can be made using two indexes, even just current GiST
> without Z-values.  We've prototyped that, see [1].

I'm pretty sure that spatial joins generally need two spatial indexes
(usually R-Trees). There seems to have been quite a lot of research in
it in the 1990s.

--
Peter Geoghegan


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Andrey M. Borodin
In reply to this post by Alexander Korotkov


30 авг. 2019 г., в 3:47, Alexander Korotkov <[hidden email]> написал(а):

1) Binary search in non-leaf pages instead of probing each key is much faster.

That's a neat idea, but key union breaks ordering, even for z-order.
for two sets of tuples X and Y
if for any i,o from N, Xi < Yo
does not guaranty union(X) < union (Y) 

For example consider this z-ordered keyspace (picture attached)

union(5, 9) is z-order-smaller than union(4,4)

I'm not even sure we can use sorted search for choosing subtree for insertion.

How do you think, should I supply GiST-build patch with docs and tests and add it to CF? Or do we need more design discussion before?


Best regards, Andrey Borodin.
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Andrey M. Borodin


> 30 авг. 2019 г., в 16:44, Andrey Borodin <[hidden email]> написал(а):
>
>
> How do you think, should I supply GiST-build patch with docs and tests and add it to CF? Or do we need more design discussion before?

PFA v2: now sort support is part of opclass.
There's a problem with Z-ordering NaN which causes regression tests to fail. So I decided not to add patch to CF (despite having few minutes to do so).
How to correctly Z-order NaN? So that it would be consistent with semantics of union() and consistent() functions. If one of values is NaN, then we consider all it's bits to be 1?

BTW patch uses
union {
        float f;
        uint32 i;
    }
I hope it's OK, because AFAIK we do not have non-IEEE-754 platforms now.

Thanks!

Best regards, Andrey Borodin.


v2-0001-Add-sort-support-for-point-gist_point_sortsupport.patch (3K) Download Attachment
v2-0002-Implement-GiST-build-using-sort-support.patch (23K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Alexander Korotkov
In reply to this post by Andrey M. Borodin
On Fri, Aug 30, 2019 at 2:44 PM Andrey Borodin <[hidden email]> wrote:

>
> 30 авг. 2019 г., в 3:47, Alexander Korotkov <[hidden email]> написал(а):
>
> 1) Binary search in non-leaf pages instead of probing each key is much faster.
>
>
> That's a neat idea, but key union breaks ordering, even for z-order.
> for two sets of tuples X and Y
> if for any i,o from N, Xi < Yo
> does not guaranty union(X) < union (Y)
>
> For example consider this z-ordered keyspace (picture attached)
>
> union(5, 9) is z-order-smaller than union(4,4)
>
> I'm not even sure we can use sorted search for choosing subtree for insertion.


Sorry, I didn't explain my proposal in enough details.  I didn't mean
B-tree separator keys would be the same as union key (MBR).  I mean
B-tree on Z-values, which maintains union key in addition to separator
keys.  So, you select downlink to insert using separator Z-values and
then also extend union key (if needed).  It's probably still not
enough detail yet.  I'll try to spend more time for more detailed
description later.

>
> How do you think, should I supply GiST-build patch with docs and tests and add it to CF? Or do we need more design discussion before?


+1 for adding to CF.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Alexander Korotkov
In reply to this post by Peter Geoghegan-4
On Fri, Aug 30, 2019 at 6:28 AM Peter Geoghegan <[hidden email]> wrote:

> On Thu, Aug 29, 2019 at 8:22 PM Alexander Korotkov
> <[hidden email]> wrote:
> > Alternatively you can encode size in Z-value.  But this increases
> > dimensionality of space and decreases efficiency of join.  Also,
> > spatial join can be made using two indexes, even just current GiST
> > without Z-values.  We've prototyped that, see [1].
>
> I'm pretty sure that spatial joins generally need two spatial indexes
> (usually R-Trees). There seems to have been quite a lot of research in
> it in the 1990s.

Sure, our prototype was an implementation of one of such papers.  My
point is that advantages of Z-value ordered GiST for spatial joins are
not yet clear for me.  Except faster build and smaller index, which
are general advantages.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Andrey M. Borodin
In reply to this post by Andrey M. Borodin


> 1 сент. 2019 г., в 15:53, Andrey Borodin <[hidden email]> написал(а):
>
>
> <v2-0001-Add-sort-support-for-point-gist_point_sortsupport.patch><v2-0002-Implement-GiST-build-using-sort-support.patch>

Here's V3 of the patch set.
Changes:
1. Added some documentation of new sort support routines
2. Fixed bug with dirty pages

I did not add sort support procs to built-in boxes, circles and polys, since it may be not optimal index for them. However, for points Z-order is quite good as a default.

Tests only pass with fixes for GiST KNN from Alexander in other thread.

Thanks!

Best regards, Andrey Borodin.


v3-0002-Implement-GiST-build-using-sort-support.patch (29K) Download Attachment
v3-0001-Add-sort-support-for-point-gist_point_sortsupport.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Michael Paquier-2
On Sun, Sep 08, 2019 at 01:54:35PM +0500, Andrey Borodin wrote:
> Here's V3 of the patch set.
> Changes:
> 1. Added some documentation of new sort support routines
> 2. Fixed bug with dirty pages
>
> I did not add sort support procs to built-in boxes, circles and
> polys, since it may be not optimal index for them. However, for
> points Z-order is quite good as a defaul t.

The latest patch does not apply.  Could you send a rebase?  I have
moved the patch to next CF, waiting on author for now.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Andrey M. Borodin


> 1 дек. 2019 г., в 7:06, Michael Paquier <[hidden email]> написал(а):
>
> On Sun, Sep 08, 2019 at 01:54:35PM +0500, Andrey Borodin wrote:
>> Here's V3 of the patch set.
>> Changes:
>> 1. Added some documentation of new sort support routines
>> 2. Fixed bug with dirty pages
>>
>> I did not add sort support procs to built-in boxes, circles and
>> polys, since it may be not optimal index for them. However, for
>> points Z-order is quite good as a defaul t.
>
> The latest patch does not apply.  Could you send a rebase?  I have
> moved the patch to next CF, waiting on author for now.
Thanks, Michael!

PFA rebased patch.

Best regards, Andrey Borodin.




v4-0002-Implement-GiST-build-using-sort-support.patch (29K) Download Attachment
v4-0001-Add-sort-support-for-point-gist_point_sortsupport.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Thomas Munro-5
On Mon, Dec 30, 2019 at 7:43 PM Andrey Borodin <[hidden email]> wrote:
> PFA rebased patch.

Hi Andrey,

This looks really interesting, and I am sure there are a lot of GIS
people who would love to see dramatically faster and smaller indexes
in PG13.  I don't know enough to comment on the details, but here are
some superficial comments:

+   method is also optional and is used diring fast GiST build.

-> during

+    /* esteblish order between x and y */

-> establish

+/* Compute Z-oder for point */
 static inline uint64
 point_zorder_internal(Point *p)

-> order

Could this function please have a comment that explains why it works?
I mean, just a breadcrumb... the name of the technique or something...
so that uninitiated hackers can google their way to a clue (is it
"Morton encoding"?)

MSVC says:

src/backend/access/gist/gistproc.c(1582): error C2065: 'INT32_MAX' :
undeclared identifier

GCC says:

gistbuild.c: In function ‘gist_indexsortbuild’:
gistbuild.c:256:4: error: ISO C90 forbids mixed declarations and code
[-Werror=declaration-after-statement]
    IndexTuple *itvec = gistextractpage(lower_page, &vect_len);
    ^


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Thomas Munro-5
On Wed, Feb 19, 2020 at 8:00 PM Thomas Munro <[hidden email]> wrote:
> Could this function please have a comment that explains why it works?
> I mean, just a breadcrumb... the name of the technique or something...
> so that uninitiated hackers can google their way to a clue (is it
> "Morton encoding"?)

Ok I think I get it now after doing some homework.

1.  We expect floats to be in IEEE format, and the sort order of IEEE
floats is mostly correlated to the binary sort order of the bits
reinterpreted as an int.  It isn't in some special cases, but for this
use case we don't really care about that, we're just trying to
encourage locality.
2.  We generate a Morton code that interleaves the bits of N integers
to produce a single integer that preserves locality: things that were
close in the N dimensional space are close in the resulting integer.

Cool.

+static int
+my_fastcmp(Datum x, Datum y, SortSupport ssup)
+{
+    /* esteblish order between x and y */
+
+    return z1 == z2 ? 0 : z1 > z2 ? 1 : -1;
+}

This example code from the documentation looks wrong, probably missing
eg int64 z1 = DatumGetInt64(x).


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Thomas Munro-5
On Thu, Feb 20, 2020 at 10:14 AM Thomas Munro <[hidden email]> wrote:
> 1.  We expect floats to be in IEEE format, and the sort order of IEEE
> floats is mostly correlated to the binary sort order of the bits
> reinterpreted as an int.  It isn't in some special cases, but for this
> use case we don't really care about that, we're just trying to
> encourage locality.

I suppose there is a big jump in integer value (whether signed or
unsigned) as you cross from positive to negative floats, and then the
sort order is reversed.  I have no idea if either of those things is a
problem worth fixing.  That made me wonder if there might also be an
endianness problem.  It seems from some quick googling that all
current architectures have integers and floats of the same endianness.
Apparently this wasn't always the case, and some ARMs have a weird
half-flipped arrangement for 64 bit floats, but not 32 bit floats as
you are using here.


Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Andrey M. Borodin
Hi Thomas!

Thanks for looking into this! I’ll fix your notices asap.

> On 24 февр. 2020 г., at 01:58, Thomas Munro <[hidden email]> wrote:
>
> On Thu, Feb 20, 2020 at 10:14 AM Thomas Munro <[hidden email]> wrote:
>> 1.  We expect floats to be in IEEE format, and the sort order of IEEE
>> floats is mostly correlated to the binary sort order of the bits
>> reinterpreted as an int.  It isn't in some special cases, but for this
>> use case we don't really care about that, we're just trying to
>> encourage locality.
>
> I suppose there is a big jump in integer value (whether signed or
> unsigned) as you cross from positive to negative floats, and then the
> sort order is reversed.  I have no idea if either of those things is a
> problem worth fixing.  That made me wonder if there might also be an
> endianness problem.  It seems from some quick googling that all
> current architectures have integers and floats of the same endianness.
> Apparently this wasn't always the case, and some ARMs have a weird
> half-flipped arrangement for 64 bit floats, but not 32 bit floats as
> you are using here.

Yes, this leap is a problem for point as generic data type. And I do not know
how to fix it. It can cause inefficient Index Scans when searching near (0,0) and query
window touches simultaneously all quadrants (4x slower).
But everything will be just fine when all data is in 2nd quadrant.

Actually, we do not need to add this hacky code to core: we can provide colum-wise
ordering or something similar as an example.
This feature is aimed at PostGIS and they already possess bit tricks tricks [0].
I’ve taken this union code from PostGIS.

Thanks!

Best regards, Andrey Borodin.


[0] https://github.com/postgis/postgis/blob/master/postgis/gserialized_gist_nd.c#L1150

Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build

Andrey M. Borodin
Hi!

> On 24 февр. 2020 г., at 13:50, Andrey M. Borodin <[hidden email]> wrote:
>
> Hi Thomas!
>
> Thanks for looking into this! I’ll fix your notices asap.

PFA v5.
Thomas, I've used your wording almost exactly with explanation how
point_zorder_internal() works. It has more explanation power than my attempts
to compose good comment.

There is one design decision that worries me most:
should we use opclass function or index option to provide this sorting information?
It is needed only during index creation, actually. And having extra i-class only for fast build
seems excessive.
I think we can provide both ways and let opclass developers decide?

Thanks!

Best regards, Andrey Borodin.


v5-0001-Add-sort-support-for-point-gist_point_sortsupport.patch (3K) Download Attachment
v5-0002-Implement-GiST-build-using-sort-support.patch (31K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Yet another fast GiST build (typo)

Erik Rijkers
On 2020-02-29 13:13, Andrey M. Borodin wrote:

> Hi!
>
>> On 24 февр. 2020 г., at 13:50, Andrey M. Borodin
>> <[hidden email]> wrote:
>>
>> Hi Thomas!
>>
>> Thanks for looking into this! I’ll fix your notices asap.
>
> PFA v5.
> Thomas, I've used your wording almost exactly with explanation how
> point_zorder_internal() works. It has more explanation power than my
> attempts
> to compose good comment.

Small typo alert:
In v5-0002-Implement-GiST-build-using-sort-support.patch there is:

+   method is also optional and is used diring fast GiST build.

'diring' should be 'during'



1234