Yet another fast GiST build


Re: Yet another fast GiST build (typo)

Andrey Borodin-2


> On 6 Sep 2020, at 18:26, Heikki Linnakangas <[hidden email]> wrote:
>
> On 05/09/2020 14:53, Andrey M. Borodin wrote:
>> Thanks for ideas, Heikki. Please see v13 with proposed changes.
>
> Thanks, that was quick!
>
>> But I've found out that logging page-by-page slows down GiST build by
>> approximately 15% (when CPU constrained). Though I think that this
>> is IO-wise.
> Hmm, any ideas why that is? log_newpage_range() writes one WAL record for 32 pages, while now you're writing one record per page, so you'll have a little bit more overhead from that. But 15% seems like a lot.
Hmm, this works for B-tree too. This index creation
psql -c '\timing' -c "create table x as select random() from generate_series(1,10000000,1);" -c "create index ON x (random);"
takes 7 seconds on my machine, but with one weird patch it takes only 6 :)

Maybe I'm missing something? Like forgetting to log 10% of the pages, or something like that...

Best regards, Andrey Borodin.

Attachment: 0001-nbtree-faster-logging.patch (2K)

Re: Yet another fast GiST build (typo)

Andrey Borodin-2


> On 7 Sep 2020, at 12:14, Andrey M. Borodin <[hidden email]> wrote:
>
> Maybe I'm missing something? Like forgetting to log 10% of the pages, or something like that...

Indeed, there was a bug. I've fixed it, and I still observe the same performance gain.

Best regards, Andrey Borodin.


Attachment: v2-0001-nbtree-faster-logging.patch (2K)

Re: Yet another fast GiST build

Heikki Linnakangas
In reply to this post by Andrey Borodin-2
On 24/02/2020 10:50, Andrey M. Borodin wrote:

>> On 24 Feb 2020, at 01:58, Thomas Munro <[hidden email]> wrote:
>>
>> On Thu, Feb 20, 2020 at 10:14 AM Thomas Munro <[hidden email]> wrote:
>>> 1.  We expect floats to be in IEEE format, and the sort order of IEEE
>>> floats is mostly correlated to the binary sort order of the bits
>>> reinterpreted as an int.  It isn't in some special cases, but for this
>>> use case we don't really care about that, we're just trying to
>>> encourage locality.
>>
>> I suppose there is a big jump in integer value (whether signed or
>> unsigned) as you cross from positive to negative floats, and then the
>> sort order is reversed.  I have no idea if either of those things is a
>> problem worth fixing.  That made me wonder if there might also be an
>> endianness problem.  It seems from some quick googling that all
>> current architectures have integers and floats of the same endianness.
>> Apparently this wasn't always the case, and some ARMs have a weird
>> half-flipped arrangement for 64 bit floats, but not 32 bit floats as
>> you are using here.
>
> Yes, this leap is a problem for point as a generic data type, and I do not know
> how to fix it. It can cause inefficient Index Scans when searching near (0,0) and the query
> window touches all four quadrants simultaneously (4x slower).
I took a stab at fixing this, see attached patch (applies on top of your
patch v14).

To evaluate this, I used the other attached patch to expose the zorder
function to SQL, and plotted points around zero with gnuplot. See the
attached two images, one with patch v14, and the other one with this patch.
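The idea behind the fix can be sketched in a few lines of Python (this is only an illustration, not the patch's C code, and the names here are made up): map each float coordinate to an unsigned integer that sorts in the same order as the float, then interleave the bits of the two coordinates into a single Z-order key.

import struct

def float_to_sortable_uint32(f):
    # Reinterpret an IEEE-754 single-precision float as a uint32 whose
    # unsigned order matches the float's numeric order (NaNs ignored).
    (u,) = struct.unpack('>I', struct.pack('>f', f))
    if u & 0x80000000:
        return u ^ 0xFFFFFFFF      # negative: flip all bits
    return u | 0x80000000          # non-negative: set the sign bit

def interleave_bits(x, y):
    # Interleave two 32-bit values into one 64-bit Morton (Z-order) key.
    z = 0
    for i in range(32):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z

def point_zorder(px, py):
    # Z-order key for a 2-D point; nearby points tend to get nearby keys.
    return interleave_bits(float_to_sortable_uint32(px),
                           float_to_sortable_uint32(py))

# The order is preserved across the negative/positive boundary:
xs = [-2.5, -1e-30, 0.0, 1e-30, 3.5]
assert sorted(xs) == sorted(xs, key=float_to_sortable_uint32)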

I'll continue looking at these patches as a whole tomorrow. I think it's
getting close to a committable state.

> But everything will be just fine when all data is in 2nd quadrant.

Simon Riggs and friends would agree :-)

- Heikki

Attachments:
v15-0003-Expose-point_zorder-to-SQL.patch (1K)
v15-0004-Map-negative-values-better.patch (7K)
zorder-v14.png (62K)
zorder-v15-heikki.png (57K)
gist-point-test.sql (670 bytes)

Re: Yet another fast GiST build

Pavel Borisov

>> I suppose there is a big jump in integer value (whether signed or
>> unsigned) as you cross from positive to negative floats, and then the
>> sort order is reversed.  I have no idea if either of those things is a
>> problem worth fixing.  That made me wonder if there might also be an

> I took a stab at fixing this, see attached patch (applies on top of your
> patch v14).
>
> To evaluate this, I used the other attached patch to expose the zorder
> function to SQL, and plotted points around zero with gnuplot. See the
> attached two images, one with patch v14, and the other one with this patch.

I've done some testing of the sorted GiST build with points distributed only in the 2nd quadrant and with points in all 4 quadrants, and it appears that this anomaly doesn't hurt as much as Andrey supposed. But Heikki's patch is a really nice way to avoid what can be avoided, and I'd like it to be included together with Andrey's patch.

Pavel.

Re: Yet another fast GiST build

Heikki Linnakangas
On 07/09/2020 13:59, Pavel Borisov wrote:

>>>> I suppose there is a big jump in integer value (whether signed or
>>>> unsigned) as you cross from positive to negative floats, and then the
>>>> sort order is reversed.  I have no idea if either of those things is a
>>>> problem worth fixing.  That made me wonder if there might also be an
>>
>> I took a stab at fixing this, see attached patch (applies on top of your
>> patch v14).
>>
>> To evaluate this, I used the other attached patch to expose the zorder
>> function to SQL, and plotted points around zero with gnuplot. See the
>> attached two images, one with patch v14, and the other one with this patch.
>
> I've done some testing of the sorted GiST build with points distributed only
> in the 2nd quadrant and with points in all 4 quadrants, and it appears that this
> anomaly doesn't hurt as much as Andrey supposed. But Heikki's patch
> is a really nice way to avoid what can be avoided, and I'd like it to be included
> together with Andrey's patch.
Thanks! Did you measure the quality of the built index somehow? The
ordering shouldn't make any difference to the build speed, but it
affects the shape of the resulting index and the speed of queries
against it.

I played with some simple queries like this:

explain (analyze, buffers) select count(*) from points_good where p <@
box(point(50, 50), point(75, 75));

and looking at the "Buffers" line for how many pages were accessed.
There doesn't seem to be any consistent difference between v14 and my
fix. So I concur it doesn't seem to matter much.


I played some more with plotting the curve. I wrote a little python
program to make an animation of it, and also simulated how the points
would be divided into pages, assuming that each GiST page can hold 200
tuples (I think the real number is around 150 with default page size).
In the animation, the leaf pages appear as rectangles as it walks
through the Z-order curve. This is just a simulation by splitting all
the points into batches of 200 and drawing a bounding box around each
batch. I haven't checked the actual pages that the GiST build creates, but I
think this gives a good idea of how it works.
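The simulation itself is simple enough to sketch. Roughly something like this (a Python approximation with random integer coordinates, not the attached gist-point-test.py):

import random

def zkey(x, y, bits=16):
    # Interleave the low 'bits' bits of two non-negative ints (Z-order).
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z

def simulated_leaf_pages(points, tuples_per_page=200):
    # Sort by Z-order, chop into page-sized batches, and return the
    # bounding box of each batch -- the rectangles shown in the animation.
    pts = sorted(points, key=lambda p: zkey(p[0], p[1]))
    pages = []
    for i in range(0, len(pts), tuples_per_page):
        batch = pts[i:i + tuples_per_page]
        xs = [p[0] for p in batch]
        ys = [p[1] for p in batch]
        pages.append((min(xs), min(ys), max(xs), max(ys)))
    return pages

pts = [(random.randrange(1 << 16), random.randrange(1 << 16)) for _ in range(10000)]
print(len(simulated_leaf_pages(pts)), "simulated leaf pages")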

The animation shows that there's quite a lot of overlap between the
pages. It's not necessarily this patch's business to try to improve
that, and the non-sorting index build isn't perfect either. But it
occurs to me that there's maybe one pretty simple trick we could do:
instead of blindly filling the leaf pages in Z-order, collect tuples
into a larger buffer, in Z-order. I'm thinking 32 pages worth of tuples,
or something in that ballpark, or maybe go all the way up to work_mem.
When the buffer fills up, call the picksplit code to divide the buffer
into the actual pages, and flush them to disk. If you look at the
animation and imagine that you would take a handful of pages in the
order they're created, and re-divide the points with the split
algorithm, there would be much less overlap.
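That buffering idea can be bolted onto the sketch above (zkey() and pts as defined there). The recursive bisection below is only a crude stand-in for the real picksplit code, just to show the shape of the approach:

def crude_picksplit(batch, tuples_per_page, out):
    # Very rough stand-in for GiST picksplit: recursively halve the batch
    # along its longer side until each piece fits on one page.
    xs = [p[0] for p in batch]
    ys = [p[1] for p in batch]
    if len(batch) <= tuples_per_page:
        out.append((min(xs), min(ys), max(xs), max(ys)))
        return
    axis = 0 if max(xs) - min(xs) >= max(ys) - min(ys) else 1
    batch = sorted(batch, key=lambda p: p[axis])
    mid = len(batch) // 2
    crude_picksplit(batch[:mid], tuples_per_page, out)
    crude_picksplit(batch[mid:], tuples_per_page, out)

def simulated_buffered_build(points, tuples_per_page=200, buffer_pages=32):
    # Fill a buffer of 32 pages' worth of tuples in Z-order, then let the
    # split code re-divide the buffer into actual pages before flushing.
    pts = sorted(points, key=lambda p: zkey(p[0], p[1]))
    pages = []
    step = tuples_per_page * buffer_pages
    for i in range(0, len(pts), step):
        crude_picksplit(pts[i:i + step], tuples_per_page, pages)
    return pages

print(len(simulated_buffered_build(pts)), "pages with buffered re-splitting")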

- Heikki

Attachments:
gist-point-test.py (1K)
zorder.mp4 (4M)

Re: Yet another fast GiST build

Andrey Borodin-2


> On 7 Sep 2020, at 19:10, Heikki Linnakangas <[hidden email]> wrote:
>
> On 07/09/2020 13:59, Pavel Borisov wrote:
>>>>> I suppose there is a big jump in integer value (whether signed or
>>>>> unsigned) as you cross from positive to negative floats, and then the
>>>>> sort order is reversed.  I have no idea if either of those things is a
>>>>> problem worth fixing.  That made me wonder if there might also be an
>>>
>>> I took a stab at fixing this, see attached patch (applies on top of your
>>> patch v14).
>>>
>>> To evaluate this, I used the other attached patch to expose the zorder
>>> function to SQL, and plotted points around zero with gnuplot. See the
>>> attached two images, one with patch v14, and the other one with this patch.
>>
>> I've done some testing of the sorted GiST build with points distributed only
>> in the 2nd quadrant and with points in all 4 quadrants, and it appears that this
>> anomaly doesn't hurt as much as Andrey supposed. But Heikki's patch
>> is a really nice way to avoid what can be avoided, and I'd like it to be included
>> together with Andrey's patch.
>
> Thanks! Did you measure the quality of the built index somehow? The
> ordering shouldn't make any difference to the build speed, but it
> affects the shape of the resulting index and the speed of queries
> against it.
I've tried to benchmark the difference in build time between v14 and v15. v15 seems to be slightly slower, but the difference is negligible.

> I played with some simple queries like this:
>
> explain (analyze, buffers) select count(*) from points_good where p <@
> box(point(50, 50), point(75, 75));
To observe an Index Scan difference the query should touch all 4 quadrants, i.e. search within box(point(-25,-25), point(25,25)).

> and looking at the "Buffers" line for how many pages were accessed.
> There doesn't seem to be any consistent difference between v14 and my
> fix. So I concur it doesn't seem to matter much.
>
>
> I played some more with plotting the curve. I wrote a little python
> program to make an animation of it, and also simulated how the points
> would be divided into pages, assuming that each GiST page can hold 200
> tuples (I think the real number is around 150 with default page size).
> In the animation, the leaf pages appear as rectangles as it walks
> through the Z-order curve. This is just a simulation by splitting all
> the points into batches of 200 and drawing a bounding box around each
> batch. I haven't checked the actual pages as the GiST creates, but I
> think this gives a good idea of how it works.
> The animation shows that there's quite a lot of overlap between the
> pages. It's not necessarily this patch's business to try to improve
> that, and the non-sorting index build isn't perfect either. But it
> occurs to me that there's maybe one pretty simple trick we could do:
> instead of blindly filling the leaf pages in Z-order, collect tuples
> into a larger buffer, in Z-order. I'm thinking 32 pages worth of tuples,
> or something in that ballpark, or maybe go all the way up to work_mem.
> When the buffer fills up, call the picksplit code to divide the buffer
> into the actual pages, and flush them to disk. If you look at the
> animation and imagine that you would take a handful of pages in the
> order they're created, and re-divide the points with the split
> algorithm, there would be much less overlap.

The animation looks cool! It really highlights the inefficiency of the resulting MBRs.
But in the R*-tree one of Beckmann's points was that overlap optimisation is worth doing on higher levels, not lower ones.
Still, we could do this for splits on each level, I think. We don't know the tree depth in advance, so we can't divide maintenance_work_mem among the levels... But probably we don't need to: allocate half to the first level, a quarter to the second, 1/8 to the third, and so on, until we're down to one page. Should we take allocations inside picksplit() into account?
The more I think about it, the cooler this idea seems to me.
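For illustration, that halving scheme works out like this (a toy sketch; numbers and units are made up):

def per_level_budget_kb(maintenance_work_mem_kb, page_kb=8):
    # Give half the budget to the first level, a quarter to the next one,
    # an eighth to the one above that, and so on, down to a single page.
    budgets = []
    budget = maintenance_work_mem_kb // 2
    while budget >= page_kb:
        budgets.append(budget)
        budget //= 2
    return budgets

print(per_level_budget_kb(64 * 1024))   # 64 MB -> 32768, 16384, ..., 8 kB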

BTW, I've found one more bug in the patch: it writes WAL even for unlogged tables. I'm not sending a patch because the changes are trivial and we already have a lengthy patch set spread across different messages.
Also, to avoid a critical section we can use log_newpage() instead of log_buffer().


Thanks!

Best regards, Andrey Borodin.


Re: Yet another fast GiST build

Pavel Borisov
> Thanks! Did you measure the quality of the built index somehow? The
> ordering shouldn't make any difference to the build speed, but it
> affects the shape of the resulting index and the speed of queries
> against it.
 
Again, I've tried random select tests near the axes and haven't noticed any performance difference between an ordinary GiST build and a z-ordered one. The same holds for selects far from the axes. Theoretically, there may be a slowdown for particular points inside an MBR that crosses an axis, but I haven't dug that deep and haven't tested performance as a function of the coordinates.

So I feel this patch isn't about optimizing select speed.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: Yet another fast GiST build

Heikki Linnakangas
On 08/09/2020 21:33, Pavel Borisov wrote:

>      > Thanks! Did you measure the quality of the built index somehow? The
>      > ordering shouldn't make any difference to the build speed, but it
>      > affects the shape of the resulting index and the speed of queries
>      > against it.
>
> Again, I've tried random select tests near the axes and haven't noticed any
> performance difference between an ordinary GiST build and a z-ordered one.
> The same holds for selects far from the axes. Theoretically, there may be a
> slowdown for particular points inside an MBR that crosses an axis, but I
> haven't dug that deep and haven't tested performance as a function of the
> coordinates.
>
> So I feel this patch isn't about optimizing select speed.
Ok, thanks for confirming.

I've been reviewing the patch today. The biggest changes I've made have
been in restructuring the code in gistbuild.c for readability, but there
are a bunch of smaller changes throughout. Attached is what I've got so
far, squashed into one patch. I'm continuing to review it, but a couple
of questions so far:

In the gistBuildCallback(), you're skipping the tuple if 'tupleIsAlive
== false'. That seems fishy, surely we need to index recently-dead
tuples, too. The normal index build path isn't skipping them either.

How does the 'sortsupport' routine interact with
'compress'/'decompress'? Which representation is passed to the
comparator routine: the original value from the table, the compressed
representation, or the decompressed representation? Do the
comparetup_index_btree() and readtup_index() routines agree with that?

- Heikki

Attachment: v16-0001-Add-sort-support-for-point-gist_point_sortsuppor.patch (42K)

Re: Yet another fast GiST build

Andrey Borodin-2


> On 9 Sep 2020, at 00:05, Heikki Linnakangas <[hidden email]> wrote:
>
> I've been reviewing the patch today. The biggest changes I've made have been in restructuring the code in gistbuild.c for readability, but there are a bunch of smaller changes throughout. Attached is what I've got so far, squashed into one patch.
Thanks!

> I'm continuing to review it, but a couple of questions so far:
>
> In the gistBuildCallback(), you're skipping the tuple if 'tupleIsAlive == false'. That seems fishy, surely we need to index recently-dead tuples, too. The normal index build path isn't skipping them either.
That's an oversight.
>
> How does the 'sortsupport' routine interact with 'compress'/'decompress'? Which representation is passed to the comparator routine: the original value from the table, the compressed representation, or the decompressed representation? Do the comparetup_index_btree() and readtup_index() routines agree with that?

Currently we pass compressed values, which doesn't seem very good.
But there was a request from the PostGIS maintainers to pass values before decompression.
Darafei, please correct me if I'm wrong. Also, can you please provide a link to the PostGIS B-tree sorting functions?

Thanks!

Best regards, Andrey Borodin.


Re: Yet another fast GiST build

Darafei "Komяpa" Praliaskouski
Hi,


On Wed, Sep 9, 2020 at 9:43 AM Andrey M. Borodin <[hidden email]> wrote:


> On 9 Sep 2020, at 00:05, Heikki Linnakangas <[hidden email]> wrote:
>
> I've been reviewing the patch today. The biggest changes I've made have been in restructuring the code in gistbuild.c for readability, but there are a bunch of smaller changes throughout. Attached is what I've got so far, squashed into one patch.
Thanks!

> I'm continuing to review it, but a couple of questions so far:
>
> In the gistBuildCallback(), you're skipping the tuple if 'tupleIsAlive == false'. That seems fishy, surely we need to index recently-dead tuples, too. The normal index build path isn't skipping them either.
That's an oversight.
>
> How does the 'sortsupport' routine interact with 'compress'/'decompress'? Which representation is passed to the comparator routine: the original value from the table, the compressed representation, or the decompressed representation? Do the comparetup_index_btree() and readtup_index() routines agree with that?

Currently we pass compressed values, which doesn't seem very good.
But there was a request from the PostGIS maintainers to pass values before decompression.
Darafei, please correct me if I'm wrong. Also, can you please provide a link to the PostGIS B-tree sorting functions?

We were expecting to reuse the btree opclass for this. This way the btree_gist extension will become a lot thinner. :)

The core routine for the current sorting implementation is a Hilbert curve, which is based on the 2D center of a box and is used for the abbreviated sort:
https://github.com/postgis/postgis/blob/2a7ebd0111b02aed3aa24752aad0ba89aef5d431/liblwgeom/gbox.c#L893 

All the btree functions are wrappers around gserialized_cmp, which just adds a bunch of tiebreakers that don't matter in practice:
https://github.com/postgis/postgis/blob/2a7ebd0111b02aed3aa24752aad0ba89aef5d431/liblwgeom/gserialized.c#L313

The base representation for the index's compressed datatype is GIDX, which is also a box. We can make it work on top of that instead of the original representation.
Unfortunately there is no such thing as a "decompressed representation", as the compression is lossy.


Re: Yet another fast GiST build

Andrey Borodin-2
Thanks Darafei!

> On 9 Sep 2020, at 12:05, Darafei Komяpa Praliaskouski <[hidden email]> wrote:
>
> > How does the 'sortsupport' routine interact with 'compress'/'decompress'? Which representation is passed to the comparator routine: the original value from the table, the compressed representation, or the decompressed representation? Do the comparetup_index_btree() and readtup_index() routines agree with that?
>
> Currently we pass compressed values, which doesn't seem very good.
> But there was a request from the PostGIS maintainers to pass values before decompression.
> Darafei, please correct me if I'm wrong. Also, can you please provide a link to the PostGIS B-tree sorting functions?
>
> We were expecting to reuse btree opclass for this thing. This way btree_gist extension will become a lot thinner. :)

I think if we aim at reusing the B-tree sort support functions, we have to pass uncompressed values. They can be a lot bigger and slower in the case of PostGIS: we would be sorting actual geometries instead of MBRs.

In my view it's better to implement GiST-specific sort support in btree_gist rather than trying to reuse the existing sort support functions.

Best regards, Andrey Borodin.


Re: Yet another fast GiST build

Heikki Linnakangas
On 09/09/2020 13:28, Andrey M. Borodin wrote:

> Thanks Darafei!
>
>> On 9 Sep 2020, at 12:05, Darafei Komяpa Praliaskouski
>> <[hidden email]> wrote:
>>
>>> How does the 'sortsupport' routine interact with
>>> 'compress'/'decompress'? Which representation is passed to the
>>> comparator routine: the original value from the table, the
>>> compressed representation, or the decompressed representation? Do
>>> the comparetup_index_btree() and readtup_index() routines agree
>>> with that?
>>
>> Currently we pass compressed values, which seems not very good. But
>> there was a request from PostGIS maintainers to pass values before
>> decompression. Darafei, please, correct me if I'm wrong. Also can
>> you please provide link on PostGIS B-tree sorting functions?
>>
>> We were expecting to reuse btree opclass for this thing. This way
>> btree_gist extension will become a lot thinner. :)
>
> I think if we aim at reusing B-tree sort support functions we have to
> pass uncompressed values. They can be a lot bigger and slower in case
> of PostGIS. We will be sorting actual geometries instead of MBRs.
>
> In my view it's better to implement GiST-specific sort support in
> btree_gist, rather than trying to reuse existing sort supports.

Yeah, I don't think reusing existing sortsupport functions directly is
important. The comparison function should be short anyway for
performance reasons, so it won't be a lot of code to copy-paste. And if
there are some common subroutines, you can put them in separate
internal functions for reuse.

Using the 'compressed' format seems reasonable to me. It's natural to
the gistbuild.c code, and the comparison routine can 'decompress' itself
if it wishes. If the decompression is somewhat expensive, it's
unfortunate if you need to do it repeatedly in the comparator, but
tuplesort.c would need pretty big changes to keep around a separate
in-memory representation for comparisons. However, you could use the sort
"abbreviation" functionality to mitigate that.

Come to think of it, the point z-order comparator could benefit a lot
from key abbreviation, too. You could do the point -> zorder conversion
in the abbreviation routine.
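To make that concrete, the abbreviation pattern is roughly the following (a generic Python sketch, not the tuplesort.c API; for points the abbreviated key would be the Z-order value, and the full comparator would look at the complete "compressed" key):

import functools

def abbreviated_cmp(a, b, abbrev, full_cmp):
    # Compare cheap fixed-size abbreviated keys first and fall back to the
    # authoritative comparator only when the abbreviations tie.
    ka, kb = abbrev(a), abbrev(b)
    if ka != kb:
        return -1 if ka < kb else 1
    return full_cmp(a, b)

# Toy demo with strings abbreviated to their first 8 characters:
abbrev = lambda s: s[:8]
full_cmp = lambda a, b: (a > b) - (a < b)
data = ["abcdefgh-zzz", "abcdefgh-aaa", "xyz"]
cmp = lambda a, b: abbreviated_cmp(a, b, abbrev, full_cmp)
print(sorted(data, key=functools.cmp_to_key(cmp)))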

- Heikki



Re: Yet another fast GiST build

Darafei "Komяpa" Praliaskouski


On Wed, Sep 9, 2020 at 3:09 PM Heikki Linnakangas <[hidden email]> wrote:
On 09/09/2020 13:28, Andrey M. Borodin wrote:
> Thanks Darafei!
>
>> On 9 Sep 2020, at 12:05, Darafei Komяpa Praliaskouski
>> <[hidden email]> wrote:
>>
>>> How does the 'sortsupport' routine interact with
>>> 'compress'/'decompress'? Which representation is passed to the
>>> comparator routine: the original value from the table, the
>>> compressed representation, or the decompressed representation? Do
>>> the comparetup_index_btree() and readtup_index() routines agree
>>> with that?
>>
....

Come to think of it, the point z-order comparator could benefit a lot
from key abbreviation, too. You could do the point -> zorder conversion
in the abbreviation routine.

That's how it works in PostGIS, only we moved to a more efficient Hilbert curve:
https://github.com/postgis/postgis/blob/54399b9f6b0f02e8db9444f9f042b8d4ca6d4fa4/postgis/lwgeom_btree.c#L171



 

Re: Yet another fast GiST build

Heikki Linnakangas
In reply to this post by Darafei "Komяpa" Praliaskouski
On 09/09/2020 15:20, Darafei "Komяpa" Praliaskouski wrote:
> On Wed, Sep 9, 2020 at 3:09 PM Heikki Linnakangas <[hidden email]> wrote:
>
>> Come to think of it, the point z-order comparator could benefit a lot
>> from key abbreviation, too. You could do the point -> zorder conversion
>> in the abbreviation routine.
>
> That's how it works in PostGIS, only we moved to a more
> efficient Hilbert curve:
> https://github.com/postgis/postgis/blob/54399b9f6b0f02e8db9444f9f042b8d4ca6d4fa4/postgis/lwgeom_btree.c#L171

Thanks, that's interesting.

I implemented the abbreviated keys for the point opclass, too, and
noticed that the patch, as it was, never used it. I reworked the patch so
that tuplesort_begin_index_gist() is responsible for looking up the
sortsupport function, like tuplesort_begin_index_btree() does, and uses
abbreviation when possible.

I think this is pretty much ready for commit now. I'll do a bit more
testing (do we have regression test coverage for this?), also on a
SIZEOF_DATUM==4 system since the abbreviation works differently with
that, and push if nothing new comes up. And clarify the documentation
and/or comments that the sortsupport function sees "compressed" values.

I wonder if we could use sorting to also speed up building tsvector
indexes? The values stored there are bit signatures, what would be a
good sort order for those?

- Heikki

Attachment: v17-0001-Add-sort-support-for-point-gist_point_sortsuppor.patch (46K)

Re: Yet another fast GiST build

Andrey Borodin-2


> On 9 Sep 2020, at 20:39, Heikki Linnakangas <[hidden email]> wrote:
>
> On 09/09/2020 15:20, Darafei "Komяpa" Praliaskouski wrote:
>> On Wed, Sep 9, 2020 at 3:09 PM Heikki Linnakangas <[hidden email]> wrote:
>>> Come to think of it, the point z-order comparator could benefit a lot
>>> from key abbreviation, too. You could do the point -> zorder conversion
>>> in the abbreviation routine.
>> That's how it works in PostGIS, only we moved to a more
>> efficient Hilbert curve:
>> https://github.com/postgis/postgis/blob/54399b9f6b0f02e8db9444f9f042b8d4ca6d4fa4/postgis/lwgeom_btree.c#L171
>
> Thanks, that's interesting.
>
> I implemented the abbreviated keys for the point opclass, too, and noticed that the patch as it was never used it. I reworked the patch so that tuplesort_begin_index_gist() is responsible for looking up the sortsupport function, like tuplesort_begin_index_btree() does, and uses abbreviation when possible.
Wow, the abbreviated sort made GiST index construction for points another 1.5x faster!
BTW, there is a small typo in the argument names in gist_bbox_zorder_cmp_abbrev(): z1, z2 -> a, b

> do we have regression test coverage for this?
Yes, the sorting build for points is tested in point.sql, but with a small dataset. index_including_gist.sql seems to work with boxes, but it exercises the point paths too.

> , also on a SIZEOF_DATUM==4 system since the abbreviation works differently with that, and push if nothing new comes up. And clarify the documentation and/or comments that the sortsupport function sees "compressed" values.
>
> I wonder if we could use sorting to also speed up building tsvector indexes? The values stored there are bit signatures, what would be a good sort order for those?
We need an order such that nearby values have a lot of bits in common.
What is the length of this signature?
For each 4 bytes we can compute the number of 1s in its binary representation, then z-order these dwords as values 0-32.

This will be a very inefficient grouping, but it will tend to keep empty and dense 4-byte regions apart.
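For illustration, the per-dword popcount step would look something like this (a rough sketch; the signature bytes are just an example):

def signature_popcounts(signature):
    # Number of set bits in each 4-byte chunk of a bit signature; each
    # count is a value in 0..32 that could then be fed into a Z-order key.
    counts = []
    for i in range(0, len(signature), 4):
        word = int.from_bytes(signature[i:i + 4], 'big')
        counts.append(bin(word).count('1'))
    return counts

print(signature_popcounts(bytes.fromhex('ffffffff00000000f0f0f0f0')))  # [32, 0, 16]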

Thanks for working on this!


Best regards, Andrey Borodin.



Re: Yet another fast GiST build

Oleg Bartunov-3
In reply to this post by Heikki Linnakangas
On Mon, Sep 7, 2020 at 7:50 PM Heikki Linnakangas <[hidden email]> wrote:

>
> On 07/09/2020 13:59, Pavel Borisov wrote:
> >>>> I suppose there is a big jump in integer value (whether signed or
> >>>> unsigned) as you cross from positive to negative floats, and then the
> >>>> sort order is reversed.  I have no idea if either of those things is a
> >>>> problem worth fixing.  That made me wonder if there might also be an
> >>
> >> I took a stab at fixing this, see attached patch (applies on top of your
> >> patch v14).
> >>
> >> To evaluate this, I used the other attached patch to expose the zorder
> >> function to SQL, and plotted points around zero with gnuplot. See the
> >> attached two images, one with patch v14, and the other one with this patch.
> >
> > I've done some testing of the sorted GiST build with points distributed only
> > in the 2nd quadrant and with points in all 4 quadrants, and it appears that this
> > anomaly doesn't hurt as much as Andrey supposed. But Heikki's patch
> > is a really nice way to avoid what can be avoided, and I'd like it to be included
> > together with Andrey's patch.
>
> Thanks! Did you measure the quality of the built index somehow? The
> ordering shouldn't make any difference to the build speed, but it
> affects the shape of the resulting index and the speed of queries
> against it.
>
> I played with some simple queries like this:
>
> explain (analyze, buffers) select count(*) from points_good where p <@
> box(point(50, 50), point(75, 75));
>
> and looking at the "Buffers" line for how many pages were accessed.
> There doesn't seem to be any consistent difference between v14 and my
> fix. So I concur it doesn't seem to matter much.
>
>
> I played some more with plotting the curve. I wrote a little python
> program to make an animation of it, and also simulated how the points
> would be divided into pages, assuming that each GiST page can hold 200
> tuples (I think the real number is around 150 with default page size).
> In the animation, the leaf pages appear as rectangles as it walks
> through the Z-order curve. This is just a simulation by splitting all
> the points into batches of 200 and drawing a bounding box around each
> batch. I haven't checked the actual pages as the GiST creates, but I
> think this gives a good idea of how it works.

Heikki, you may use our gevel extension to visualize the index tree:
http://www.sai.msu.su/~megera/wiki/Gevel
I used it to investigate the rtree index:
http://www.sai.msu.su/~megera/wiki/Rtree_Index


>
> The animation shows that there's quite a lot of overlap between the
> pages. It's not necessarily this patch's business to try to improve
> that, and the non-sorting index build isn't perfect either. But it
> occurs to me that there's maybe one pretty simple trick we could do:
> instead of blindly filling the leaf pages in Z-order, collect tuples
> into a larger buffer, in Z-order. I'm thinking 32 pages worth of tuples,
> or something in that ballpark, or maybe go all the way up to work_mem.
> When the buffer fills up, call the picksplit code to divide the buffer
> into the actual pages, and flush them to disk. If you look at the
> animation and imagine that you would take a handful of pages in the
> order they're created, and re-divide the points with the split
> algorithm, there would be much less overlap.

It would also be interesting to see the size of the index; it should be several times smaller.

>
> - Heikki



--
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Yet another fast GiST build

Pavel Borisov
> It would also be interesting to see the size of the index; it should be several times smaller.

I've tested this above in the CF thread and got an ordered GiST index ~1.7 times smaller than the non-ordered one for a single column of real points.


--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: Yet another fast GiST build

Heikki Linnakangas
In reply to this post by Andrey Borodin-2
On 09/09/2020 19:50, Andrey M. Borodin wrote:

>> On 9 Sep 2020, at 20:39, Heikki Linnakangas <[hidden email]> wrote:
>>
>> On 09/09/2020 15:20, Darafei "Komяpa" Praliaskouski wrote:
>>> On Wed, Sep 9, 2020 at 3:09 PM Heikki Linnakangas <[hidden email]> wrote:
>>>> Come to think of it, the point z-order comparator could benefit a lot
>>>> from key abbreviation, too. You could do the point -> zorder conversion
>>>> in the abbreviation routine.
>>> That's how it works in PostGIS, only we moved to a more
>>> efficient Hilbert curve:
>>> https://github.com/postgis/postgis/blob/54399b9f6b0f02e8db9444f9f042b8d4ca6d4fa4/postgis/lwgeom_btree.c#L171
>>
>> Thanks, that's interesting.
>>
>> I implemented the abbreviated keys for the point opclass, too, and noticed that the patch as it was never used it. I reworked the patch so that tuplesort_begin_index_gist() is responsible for looking up the sortsupport function, like tuplesort_begin_index_btree() does, and uses abbreviation when possible.
> Wow, the abbreviated sort made GiST index construction for points another 1.5x faster!
> BTW, there is a small typo in the argument names in gist_bbox_zorder_cmp_abbrev(): z1, z2 -> a, b
One more patch version attached. I fixed some memory leaks, fixed
the abbreviation on 32-bit systems, and made a bunch of small comment changes
and such.

- Heikki

Attachment: v18-0001-Add-support-for-building-GiST-index-by-sorting.patch (47K)

Re: Yet another fast GiST build

Andrey Borodin-2


> On 10 Sep 2020, at 17:43, Heikki Linnakangas <[hidden email]> wrote:
>
> One more patch version attached. I fixed some memory leaks, fixed the abbreviation on 32-bit systems, and made a bunch of small comment changes and such.
>
> - Heikki
> <v18-0001-Add-support-for-building-GiST-index-by-sorting.patch>

The patch looks fine to me. On my machine the GiST index for points is built 10x faster than before the patch.

Future action items:
1. Sort support for gist_btree data types
2. Better page borders with split and fillfactor
3. Consider sort build for tsvector

I'll certainly do 1 before the next CF, and most probably 2.
Item 1 is basically a lot of similar code for many, many different types.
For item 2 I plan to use Oleg's gevel to evaluate the possibilities for MBR overlap reduction.

Item 3 seems tricky and needs deeper evaluation: chances are a sort build will decrease Index Scan performance in this case.

Thanks, Heikki!

Best regards, Andrey Borodin.
