Index Skip Scan (new UniqueKeys)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Index Skip Scan (new UniqueKeys)

Dmitry Dolgov
Hi,

Here is a new version of index skip scan patch, based on v8 patch for
UniqueKeys implementation from [1]. I want to start a new thread to
simplify navigation, hopefully I didn't forget anyone who actively
participated in the discussion.

To simplify reviewing I've split it into several parts:

* First two are taken from [1] just for the reference and to make cfbot happy.

* Extend-UniqueKeys consists changes that needs to be done for
  UniqueKeys to be used in skip scan. Essentially this is a reduced
  version of previous implementation from Jesper & David, based on the
  new UniqueKeys infrastructure, as it does the very same thing.

* Index-Skip-Scan contains not am specific code and the overall
  infrastructure, including configuration elements and so on.

* Btree-implementation contains btree specific code to implement amskip,
  introduced in the previous patch.

* The last one contains just documentation bits.

Interesting enough with a new UniqueKey implementation skipping is
applied in some tests unexpectedly. For those I've disabled
indexskipscan to avoid confusion.

[1]: https://www.postgresql.org/message-id/flat/CAKU4AWrwZMAL%3DuaFUDMf4WGOVkEL3ONbatqju9nSXTUucpp_pw%40mail.gmail.com

v8-0001-Introduce-RelOptInfo-notnullattrs-attribute.patch (2K) Download Attachment
v8-0002-Introuduce-RelOptInfo.uniquekeys-attribute.patch (49K) Download Attachment
v35-0001-Extend-UniqueKeys.patch (13K) Download Attachment
v35-0002-Index-skip-scan.patch (39K) Download Attachment
v35-0003-Btree-implementation-of-skipping.patch (41K) Download Attachment
v35-0004-Index-skip-scan-documentation.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Index Skip Scan (new UniqueKeys)

Andy Fan


On Tue, Jun 9, 2020 at 6:20 PM Dmitry Dolgov <[hidden email]> wrote:
Hi,

Here is a new version of index skip scan patch, based on v8 patch for
UniqueKeys implementation from [1]. I want to start a new thread to
simplify navigation, hopefully I didn't forget anyone who actively
participated in the discussion.

To simplify reviewing I've split it into several parts:

* First two are taken from [1] just for the reference and to make cfbot happy.

* Extend-UniqueKeys consists changes that needs to be done for
  UniqueKeys to be used in skip scan. Essentially this is a reduced
  version of previous implementation from Jesper & David, based on the
  new UniqueKeys infrastructure, as it does the very same thing.

* Index-Skip-Scan contains not am specific code and the overall
  infrastructure, including configuration elements and so on.

* Btree-implementation contains btree specific code to implement amskip,
  introduced in the previous patch.

* The last one contains just documentation bits.

Interesting enough with a new UniqueKey implementation skipping is
applied in some tests unexpectedly. For those I've disabled
indexskipscan to avoid confusion.

[1]: https://www.postgresql.org/message-id/flat/CAKU4AWrwZMAL%3DuaFUDMf4WGOVkEL3ONbatqju9nSXTUucpp_pw%40mail.gmail.com

Thanks for the patch.

I just get the rough idea of patch, looks we have to narrow down the user cases
where we can use this method.  Consider the below example:

create table j1(i int, im5 int,  im100 int, im1000 int);
insert into j1 select i, i%5, i%100, i%1000 from generate_series(1, 10000000)i;
create index on j1(im5, i);
insert into j1 values (1, 1, 0, 0);
analyze j1;

demo=# select distinct * from j1 where i < 2;
 i | im5 | im100 | im1000
---+-----+-------+--------
 1 |   1 |     1 |      1
(1 row)

demo=# set enable_indexskipscan to off;
SET
demo=# select distinct * from j1 where i < 2;
 i | im5 | im100 | im1000
---+-----+-------+--------
 1 |   1 |     0 |      0
 1 |   1 |     1 |      1
(2 rows)

drop index j1_im5_i_idx;

create index on j1(im5, i, im100);
demo=# select distinct im5, i, im100 from j1 where i < 2;
 im5 | i | im100
-----+---+-------
   1 | 1 |     0
   1 | 1 |     1
(2 rows)
demo=# set enable_indexskipscan to on;
SET
demo=# select distinct im5, i, im100 from j1 where i < 2;
 im5 | i | im100
-----+---+-------
   1 | 1 |     0
(1 row)


--
Best Regards
Andy Fan
Reply | Threaded
Open this post in threaded view
|

Re: Index Skip Scan (new UniqueKeys)

Dmitry Dolgov
> On Thu, Jun 11, 2020 at 04:14:07PM +0800, Andy Fan wrote:
>
> I just get the rough idea of patch, looks we have to narrow down the
> user cases where we can use this method. Consider the below example:

Hi

Not exactly narrow down, but rather get rid of wrong usage of skipping
for index scan. Since skipping for it was added later than for index
only scan I can imagine there are still blind spots, so good that you've
looked. In this particular case, when index expressions do not fully
cover those expressionse result need to be distinct on, skipping just
doesn't have enough information and should not be used. I'll add it to
the next version, thanks!


Reply | Threaded
Open this post in threaded view
|

Re: Index Skip Scan (new UniqueKeys)

Peter Geoghegan-4
In reply to this post by Dmitry Dolgov
On Tue, Jun 9, 2020 at 3:20 AM Dmitry Dolgov <[hidden email]> wrote:
> * Btree-implementation contains btree specific code to implement amskip,
>   introduced in the previous patch.

The way that you're dealing with B-Tree tuples here needs to account
for posting list tuples:

> +               currItem = &so->currPos.items[so->currPos.lastItem];
> +               itup = (IndexTuple) (so->currTuples + currItem->tupleOffset);
> +               nextOffset = ItemPointerGetOffsetNumber(&itup->t_tid);

But I wonder more generally what the idea here is. The following
comments that immediately follow provide some hints:

> +               /*
> +                * To check if we returned the same tuple, try to find a
> +                * startItup on the current page. For that we need to update
> +                * scankey to match the whole tuple and set nextkey to return
> +                * an exact tuple, not the next one. If the nextOffset is the
> +                * same as before, it means we are in the loop, return offnum
> +                * to the original position and jump further
> +                */

Why does it make sense to use the offset number like this? It isn't
stable or reliable. The patch goes on to do this:

> +                   startOffset = _bt_binsrch(scan->indexRelation,
> +                                             so->skipScanKey,
> +                                             so->currPos.buf);
> +
> +                   page = BufferGetPage(so->currPos.buf);
> +                   maxoff = PageGetMaxOffsetNumber(page);
> +
> +                   if (nextOffset <= startOffset)
> +                   {

Why compare a heap TID's offset number (an offset number for a heap
page) to another offset number for a B-Tree leaf page? They're
fundamentally different things.

--
Peter Geoghegan