Tid scan improvements


Re: Tid scan improvements

Andres Freund
Hi,

On 2019-01-19 17:04:13 +1300, Edmund Horner wrote:

> On Sat, 19 Jan 2019 at 05:35, Tom Lane <[hidden email]> wrote:
>
> > Edmund Horner <[hidden email]> writes:
> > > My patch uses the same path type and executor for all extractable
> > tidquals.
> >
> > > This worked pretty well, but I am finding it difficult to reimplement it
> > in
> > > the new tidpath.c code.
> >
> > I didn't like that approach to begin with, and would suggest that you go
> > over to using a separate path type and executor node.  I don't think the
> > amount of commonality for the two cases is all that large, and doing it
> > as you had it required some ugly ad-hoc conventions about the semantics
> > of the tidquals list.  Where I think this should go is that the tidquals
> > list still has OR semantics in the existing path type, but you use AND
> > semantics in the new path type, so that "ctid > ? AND ctid < ?" is just
> > represented as an implicit-AND list of two simple RestrictInfos.
> >
>
> Thanks for the advice.  This approach resembles my first draft, which had a
> separate executor type.  However, it did have a combined path type, with an
> enum TidPathMethod to determine how tidquals was interpreted.  At this
> point, I think a different path type is clearer, though generation of both
> types can live in tidpath.c (just as indxpath.c generates different index
> path types).
>
>
> > Now admittedly, this wouldn't give us an efficient way to execute
> > queries with conditions like "WHERE ctid = X OR (ctid > Y AND ctid < Z)",
> > but I find myself quite unable to get excited about supporting that.
> > I see no reason for the new code to worry about any cases more complex
> > than one or two TID inequalities at top level of the restriction list.
> >
>
> I'm a bit sad to see support for multiple ranges go, though I never saw
> such queries as ever being particularly common.  (And there was always a
> nagging feeling that tidpath.c was beginning to perform feats of boolean
> acrobatics out of proportion to its importance.  Perhaps in some distant
> future, TID quals will become another way of supplying TIDs to a bitmap
> heap scan, which would enable complicated boolean queries using both
> indexes and TID scans.  But that's just musing, not a proposal.)
>
> > In the query information given to the path generator, there is no existing
> > > RestrictInfo relating to the whole expression "ctid > ? AND ctid < ?".  I
> > > am still learning about RestrictInfos, but my understanding is it doesn't
> > > make sense to have a RestrictInfo for an AND clause, anyway; you're
> > > supposed to have them for the sub-expressions of it.
> >
> > FWIW, the actual data structure for cases like that is that there's
> > a RestrictInfo for the whole clause ctid = X OR (ctid > Y AND ctid < Z),
> > and if you look into its "orclause" field, you will find RestrictInfos
> > attached to the primitive clauses ctid = X, ctid > Y, ctid < Z.  (The
> > old code in tidpath.c didn't know that, because it'd never been rewritten
> > since RestrictInfos were invented.)  However, I think this new code should
> > not worry about OR cases at all, but just pull out top-level TID
> > comparison clauses.
> >
>
> Thanks for the explanation.
>
> > And it doesn't seem a good idea to try to create new RestrictInfos in the
> > > path generation just to pass the tidquals back to plan creation.
> >
> > No, you should avoid that.  There are places that assume there's only
> > one RestrictInfo for any given original clause (or sub-clause).


The commitfest has ended, and you've not updated the patch to address
the feedback yet. Are you planning to do so soon? Otherwise I think we
ought to mark the patch as returned with feedback?

Greetings,

Andres Freund


Re: Tid scan improvements

Edmund Horner
Hi, my apologies for the delay.

I've finished rebasing and rewriting it for Tom's changes to tidpath.c and his recommendations for tid range scans, but I then found a bug with cursor interaction.  Specifically, FETCH LAST scans through the whole range, and then proceeds to scan backwards to get the last row.  It worked in both my very first draft, and in the most recent draft before the changes to tidpath, but I haven't got it working yet for the new version.
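
For example, with a hypothetical table t, the kind of cursor interaction I mean is:

  BEGIN;
  DECLARE c SCROLL CURSOR FOR
      SELECT ctid, * FROM t WHERE ctid >= '(0,1)' AND ctid < '(100,1)';
  FETCH LAST FROM c;    -- scans forward to the end of the range, then steps back to the last row
  FETCH PRIOR FROM c;   -- continues fetching backwards from there
  COMMIT;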

I'm hoping to get that fixed in the next 24 hours, and I'll then post the new patch.

Edmund



Re: Tid scan improvements

Edmund Horner
In reply to this post by Edmund Horner
On Sat, 19 Jan 2019 at 17:04, Edmund Horner <[hidden email]> wrote:
> On Sat, 19 Jan 2019 at 05:35, Tom Lane <[hidden email]> wrote:
> > Edmund Horner <[hidden email]> writes:
> > > My patch uses the same path type and executor for all extractable tidquals.
> > >
> > > This worked pretty well, but I am finding it difficult to reimplement it in
> > > the new tidpath.c code.
> >
> > I didn't like that approach to begin with, and would suggest that you go
> > over to using a separate path type and executor node.  I don't think the
> > amount of commonality for the two cases is all that large, and doing it
> > as you had it required some ugly ad-hoc conventions about the semantics
> > of the tidquals list.  Where I think this should go is that the tidquals
> > list still has OR semantics in the existing path type, but you use AND
> > semantics in the new path type, so that "ctid > ? AND ctid < ?" is just
> > represented as an implicit-AND list of two simple RestrictInfos.
>
> Thanks for the advice.  This approach resembles my first draft, which had a
> separate executor type.  However, it did have a combined path type, with an
> enum TidPathMethod to determine how tidquals was interpreted.  At this
> point, I think a different path type is clearer, though generation of both
> types can live in tidpath.c (just as indxpath.c generates different index
> path types).

Hi, here's a new set of patches.  This one adds a new path type called TidRangePath and a new execution node called TidRangeScan.  I haven't included any of the patches for adding pathkeys to TidPaths or TidRangePaths.
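
For reference, the sort of queries the new TidRangeScan node is meant to pick up, using a hypothetical table t, look like:

  SELECT * FROM t WHERE ctid >= '(1000,1)' AND ctid < '(2000,1)';
  SELECT * FROM t WHERE ctid > '(5000,1)';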

1. v6-0001-Add-selectivity-estimate-for-CTID-system-variables.patch
2. v6-0002-Support-backward-scans-over-restricted-ranges-in-hea.patch
3. v6-0003-Support-range-quals-in-Tid-Scan.patch
4. v6-0004-TID-selectivity-reduce-the-density-of-the-last-page-.patch

Patches 1, 2, and 4 are basically unchanged from my previous post.  Patch 4 is an optional tweak to the CTID selectivity estimates.

Patch 3 is a substantial rewrite from what I had before.  I've checked David's most recent review and tried to make sure the new code meets his suggestions where applicable, although there is one spot where I left the code as "if (tidrangequals) ..." instead of the preferred "if (tidrangequals != NIL) ...", just for consistency with the surrounding code.

Questions --

1. Tid Range Paths are costed as random_page_cost for the first page, and sequential page cost for the remaining pages.  It made sense when there could be multiple non-overlapping ranges.  Now that there's only one range, it might not, but it has the benefit of making Tid Range Scans a little bit more expensive than Sequential Scans, so that they are less likely to be picked when a Seq Scan will do just as well.  Is there a better cost formula to use?  (A rough worked example of the current formula is sketched below, after question 4.)

2. Is it worth trying to get rid of some of the code duplication between the TidPath and TidRangePath handling, such as in costsize.c or createplan.c?

3. TidRangeRecheck (copied from TidRecheck) has an existing comment asking whether it should actually be performing a check on the returned tuple.  It seems to me that as long as TidRangeNext doesn't return a tuple outside the requested range, then the check shouldn't be necessary (and we'd simplify the comment to "nothing to check").  If a range key can change at runtime, it should never have been included in the TidRangePath.  Is my understanding correct?

4. I'm a little uncomfortable with the way heapam.c changes the scan limits ("--scan->rs_numblocks") as it progresses through the pages.  I have the executor node reset the scan limits after scanning all the tuples, which seems to work for the tests I have, but I'm using the heap_setscanlimits feature in a slightly different way from the only existing use, which is for the one-off scans when building a BRIN index.  I have added some tests for cursor fetches which seems to exercise the code, but I'd still appreciate close review of how I'm using heapam.
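
(For question 1, here is a rough worked example of the current costing, using the default cost settings random_page_cost = 4.0, seq_page_cost = 1.0 and cpu_tuple_cost = 0.01, for a hypothetical range spanning 100 pages and 10,000 tuples.  The per-tuple CPU charge is my assumption, since every scan type includes one.)

  SELECT 4.0              -- random_page_cost for the first page of the range
       + 1.0 * (100 - 1)  -- seq_page_cost for the remaining pages
       + 0.01 * 10000     -- cpu_tuple_cost for each tuple returned
       AS tid_range_scan_cost;  -- => 203.0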

Edmund


Attachments:
v6-0001-Add-selectivity-estimate-for-CTID-system-variables.patch (3K)
v6-0002-Support-backward-scans-over-restricted-ranges-in-hea.patch (3K)
v6-0003-Support-range-quals-in-Tid-Scan.patch (77K)
v6-0004-TID-selectivity-reduce-the-density-of-the-last-page-.patch (2K)

Re: Tid scan improvements

David Rowley-3
On Mon, 4 Feb 2019 at 18:37, Edmund Horner <[hidden email]> wrote:
> 1. v6-0001-Add-selectivity-estimate-for-CTID-system-variables.patch

I think 0001 is good to go. It's a clear improvement over what we do today.

(t1 = 1 million row table with a single int column.)

Patched:
# explain (analyze, timing off) select * from t1 where ctid < '(1, 90)';
 Seq Scan on t1  (cost=0.00..16925.00 rows=315 width=4) (actual
rows=315 loops=1)

# explain (analyze, timing off) select * from t1 where ctid <= '(1, 90)';
 Seq Scan on t1  (cost=0.00..16925.00 rows=316 width=4) (actual
rows=316 loops=1)

Master:
# explain (analyze, timing off) select * from t1 where ctid < '(1, 90)';
 Seq Scan on t1  (cost=0.00..16925.00 rows=333333 width=4) (actual
rows=315 loops=1)

# explain (analyze, timing off) select * from t1 where ctid <= '(1, 90)';
 Seq Scan on t1  (cost=0.00..16925.00 rows=333333 width=4) (actual
rows=316 loops=1)

The only possible risk I can foresee is that it may be more likely we
underestimate the selectivity and that causes something like a nested
loop join due to the estimation being, say 1 row.

It could happen in a case like:

SELECT * FROM bloated_table WHERE ctid >= <last ctid that would exist
without bloat>

but I don't think we should keep using DEFAULT_INEQ_SEL just in case
this happens. We could probably fix 90% of those cases by returning 2
rows instead of 1.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Tid scan improvements

David Rowley-3
In reply to this post by Edmund Horner
On Mon, 4 Feb 2019 at 18:37, Edmund Horner <[hidden email]> wrote:
> 2. v6-0002-Support-backward-scans-over-restricted-ranges-in-hea.patch
> 3. v6-0003-Support-range-quals-in-Tid-Scan.patch
> 4. v6-0004-TID-selectivity-reduce-the-density-of-the-last-page-.patch

These ones need a rebase.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Tid scan improvements

Edmund Horner
In reply to this post by David Rowley-3
On Thu, 14 Mar 2019 at 16:46, David Rowley <[hidden email]> wrote:

>
> The only possible risk I can foresee is that it may be more likely we
> underestimate the selectivity and that causes something like a nested
> loop join due to the estimation being, say 1 row.
>
> It could happen in a case like:
>
> SELECT * FROM bloated_table WHERE ctid >= <last ctid that would exist
> without bloat>
>
> but I don't think we should keep using DEFAULT_INEQ_SEL just in case
> this happens. We could probably fix 90% of those cases by returning 2
> rows instead of 1.

Thanks for looking at the patch David.

I'm not sure how an unreasonable underestimation would occur here.  If
you have a table bloated to say 10x its minimal size, the estimator
still assumes an even distribution of tuples (I don't think we can do
much better than that).  So the selectivity of "ctid >= <last ctid
that would exist without bloat>" is still going to be 0.9.
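
As a rough worked example with hypothetical numbers: if the rows would fit
in 1,000 pages but the table has bloated to 10,000 pages, the qual covers
the last 9,000 of those 10,000 pages, so with an even distribution assumed:

  SELECT (10000 - 1000)::float8 / 10000 AS estimated_selectivity;  -- 0.9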

Edmund


Re: Tid scan improvements

David Rowley-3
On Thu, 14 Mar 2019 at 21:12, Edmund Horner <[hidden email]> wrote:

> I'm not sure how an unreasonable underestimation would occur here.  If
> you have a table bloated to say 10x its minimal size, the estimator
> still assumes an even distribution of tuples (I don't think we can do
> much better than that).  So the selectivity of "ctid >= <last ctid
> that would exist without bloat>" is still going to be 0.9.

Okay, I think you're right there.  I guess the only risk there is just
varying tuple density per page, and that seems no greater risk than we
have with the existing stats.

Just looking again, I think the block of code starting:

+ if (density > 0.0)

needs a comment to mention what it's doing. Perhaps:

+ /*
+ * Using the average tuples per page, calculate how far into
+ * the page the itemptr is likely to be and adjust block
+ * accordingly.
+ */
+ if (density > 0.0)

Or some better choice of words.  With that done, I think 0001 is good to go.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Tid scan improvements

Edmund Horner
On Thu, 14 Mar 2019 at 23:06, David Rowley <[hidden email]> wrote:

> On Thu, 14 Mar 2019 at 21:12, Edmund Horner <[hidden email]> wrote:
> > I'm not sure how an unreasonable underestimation would occur here.  If
> > you have a table bloated to say 10x its minimal size, the estimator
> > still assumes an even distribution of tuples (I don't think we can do
> > much better than that).  So the selectivity of "ctid >= <last ctid
> > that would exist without bloat>" is still going to be 0.9.
>
> Okay, think you're right there.  I guess the only risk there is just
> varying tuple density per page, and that seems no greater risk than we
> have with the existing stats.

Yeah that is a risk, and will probably come up in practice.  But at
least we're not just picking a hardcoded selectivity any more.

> Just looking again, I think the block of code starting:
>
> + if (density > 0.0)
>
> needs a comment to mention what it's doing. Perhaps:
>
> + /*
> + * Using the average tuples per page, calculate how far into
> + * the page the itemptr is likely to be and adjust block
> + * accordingly.
> + */
> + if (density > 0.0)
>
> Or some better choice of words.  With that done, I think 0001 is good to go.

Ok, I'll look at it and hopefully get a new patch up soon.

Edmund


Re: Tid scan improvements

Edmund Horner
On Thu, 14 Mar 2019 at 23:37, Edmund Horner <[hidden email]> wrote:

> On Thu, 14 Mar 2019 at 23:06, David Rowley <[hidden email]> wrote:
> > Just looking again, I think the block of code starting:
> >
> > + if (density > 0.0)
> >
> > needs a comment to mention what it's doing. Perhaps:
> >
> > + /*
> > + * Using the average tuples per page, calculate how far into
> > + * the page the itemptr is likely to be and adjust block
> > + * accordingly.
> > + */
> > + if (density > 0.0)
> >
> > Or some better choice of words.  With that done, I think 0001 is good to go.
>
> Ok, I'll look at it and hopefully get a new patch up soon.
Hullo,

Here's a new set of patches.

It includes new versions of the other patches, which needed to be
rebased because of the introduction of the "tableam" API by
c2fe139c20.

I've had to adapt it to use the table scan API.  I've got it compiling
and passing tests, but I'm uneasy about some things that still use the
heapam API.

1. I call heap_setscanlimits as I'm not sure there is a tableam equivalent.
2. I'm not sure whether non-heap tableam implementations can also be
supported by my TID Range Scan: we need to be able to set the scan
limits.  There may not be any other implementations yet, but when
there are, how do we stop the planner using a TID Range Scan for
non-heap relations?
3. When fetching tuples, I see that nodeSeqscan.c uses
table_scan_getnextslot, which saves dealing with HeapTuples.  But
nodeTidrangescan wants to do some checking of the block and offset
before returning the slot.  So I have it using heap_getnext and
ExecStoreBufferHeapTuple.  Apart from being heapam-specific, it's just
not as clean as the new API calls.

Ideally, we can get it to support general tableam implementations
rather than using heapam-specific calls.  Any advice on how to do
this?

Thanks
Edmund

Attachments:
v7-0001-Add-selectivity-estimate-for-CTID-system-variables.patch (4K)
v7-0002-Support-backward-scans-over-restricted-ranges-in-hea.patch (3K)
v7-0003-Support-range-quals-in-Tid-Scan.patch (78K)
v7-0004-TID-selectivity-reduce-the-density-of-the-last-page-.patch (2K)

Re: Tid scan improvements

David Rowley-3
On Fri, 15 Mar 2019 at 18:42, Edmund Horner <[hidden email]> wrote:

> I've had to adapt it to use the table scan API.  I've got it compiling
> and passing tests, but I'm uneasy about some things that still use the
> heapam API.
>
> 1. I call heap_setscanlimits as I'm not sure there is a tableam equivalent.
> 2. I'm not sure whether non-heap tableam implementations can also be
> supported by my TID Range Scan: we need to be able to set the scan
> limits.  There may not be any other implementations yet, but when
> there are, how do we stop the planner using a TID Range Scan for
> non-heap relations?
> 3. When fetching tuples, I see that nodeSeqscan.c uses
> table_scan_getnextslot, which saves dealing with HeapTuples.  But
> nodeTidrangescan wants to do some checking of the block and offset
> before returning the slot.  So I have it using heap_getnext and
> ExecStoreBufferHeapTuple.  Apart from being heapam-specific, it's just
> not as clean as the new API calls.
>
> Ideally, we can get it to support general tableam implementations
> rather than using heapam-specific calls.  Any advice on how to do
> this?

The commit message in 8586bf7ed mentions:

> Subsequent commits will incrementally abstract table access
> functionality to be routed through table access methods. That change
> is too large to be reviewed & committed at once, so it'll be done
> incrementally.

and looking at [1] I see patch 0004 introduces some changes in
nodeTidscan.c to call a new tableam API function named
heapam_fetch_row_version. I see this function does have a ItemPointer
argument, so I guess we must be keeping those as unique row
identifiers in the API.

Patch 0001 does change the signature of heap_setscanlimits() (appears
to be committed already), and then in 0010 the only code that calls
heap_setscanlimits() (IndexBuildHeapRangeScan()) is moved and renamed
to heapam_index_build_range_scan() and set to be called via the
index_build_range_scan TableAmRoutine method.  So it looks like out of
that patch series nothing is there to allow you to access
heap_setscanlimits() directly via the TableAmRoutine API; so perhaps
for this to work, heap_setscanlimits will need to be interfaced.
However, I'm unsure if that'll violate any assumptions that Andres
wants to keep out of the API...  Andres?

[1] https://www.postgresql.org/message-id/20190311193746.hhv4e4e62nxtq3k6@...

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Re: Tid scan improvements

David Steele
On 3/18/19 1:35 PM, David Rowley wrote:

> On Fri, 15 Mar 2019 at 18:42, Edmund Horner <[hidden email]> wrote:
>
>> Subsequent commits will incrementally abstract table access
>> functionality to be routed through table access methods. That change
>> is too large to be reviewed & committed at once, so it'll be done
>> incrementally.
>
> and looking at [1] I see patch 0004 introduces some changes in
> nodeTidscan.c to call a new tableam API function named
> heapam_fetch_row_version. I see this function does have a ItemPointer
> argument, so I guess we must be keeping those as unique row
> identifiers in the API.
>
> Patch 0001 does change the signature of heap_setscanlimits() (appears
> to be committed already), and then in 0010 the only code that calls
> heap_setscanlimits() (IndexBuildHeapRangeScan()) is moved and renamed
> to heapam_index_build_range_scan() and set to be called via the
> index_build_range_scan TableAmRoutine method.  So it looks like out of
> that patch series nothing is there to allow you to access
> heap_setscanlimits() directly via the TableAmRoutine API, so perhaps
> for this to work heap_setscanlimits will need to be interfaced,
> however, I'm unsure if that'll violate any assumptions that Andres
> wants to keep out of the API...  Andres?

Thoughts on this, Andres?

Regards,
--
-David
[hidden email]


Re: Tid scan improvements

Andres Freund
In reply to this post by Edmund Horner
Hi,

On 2019-03-15 18:42:40 +1300, Edmund Horner wrote:
> I've had to adapt it to use the table scan API.  I've got it compiling
> and passing tests, but I'm uneasy about some things that still use the
> heapam API.
>
> 1. I call heap_setscanlimits as I'm not sure there is a tableam
> equivalent.

There used to be, but it wasn't clear that it was useful. In core pg the
only callers are index range scans, and those are - in a later patch in
the series - moved into the AM as well, as they need to deal with things
like HOT.


> 2. I'm not sure whether non-heap tableam implementations can also be
> supported by my TID Range Scan: we need to be able to set the scan
> limits.  There may not be any other implementations yet, but when
> there are, how do we stop the planner using a TID Range Scan for
> non-heap relations?

I've not yet looked through your code, but if required we'd probably
need to add a new tableam callback. It'd be marked optional, and the
planner could just check for its presence. A later part of the pluggable
storage series does that for bitmap scans, perhaps it's worth looking at
that?


> 3. When fetching tuples, I see that nodeSeqscan.c uses
> table_scan_getnextslot, which saves dealing with HeapTuples.  But
> nodeTidrangescan wants to do some checking of the block and offset
> before returning the slot.  So I have it using heap_getnext and
> ExecStoreBufferHeapTuple.  Apart from being heapam-specific, it's just
> not as clean as the new API calls.

Yea, that's not ok.  Note that, since yesterday, nodeTidscan doesn't
call heap_fetch() anymore (there's still a heap dependency, but that's
just for heap_get_latest_tid(), which I'll move into execMain or such).


> Ideally, we can get it to support general tableam implementations
> rather than using heapam-specific calls.  Any advice on how to do
> this?

Not yet - could you perhaps look at the bitmap scan patch in the tableam
queue, and see if that gives you inspiration?

- Andres


Re: Tid scan improvements

Andres Freund
In reply to this post by David Rowley-3
Hi,

On 2019-03-18 22:35:05 +1300, David Rowley wrote:

> The commit message in 8586bf7ed mentions:
>
> > Subsequent commits will incrementally abstract table access
> > functionality to be routed through table access methods. That change
> > is too large to be reviewed & committed at once, so it'll be done
> > incrementally.
>
> and looking at [1] I see patch 0004 introduces some changes in
> nodeTidscan.c to call a new tableam API function named
> heapam_fetch_row_version. I see this function does have a ItemPointer
> argument, so I guess we must be keeping those as unique row
> identifiers in the API.

Right, we are. At least for now - there are some discussions around
allowing different formats for TIDs, to allow things like index organized
tables, but that's for later.


> Patch 0001 does change the signature of heap_setscanlimits() (appears
> to be committed already), and then in 0010 the only code that calls
> heap_setscanlimits() (IndexBuildHeapRangeScan()) is moved and renamed
> to heapam_index_build_range_scan() and set to be called via the
> index_build_range_scan TableAmRoutine method.  So it looks like out of
> that patch series nothing is there to allow you to access
> heap_setscanlimits() directly via the TableAmRoutine API, so perhaps
> for this to work heap_setscanlimits will need to be interfaced,
> however, I'm unsure if that'll violate any assumptions that Andres
> wants to keep out of the API...

I was kinda hoping to keep block numbers out of the "main" APIs, to
avoid assuming everything is BLCKSZ based. I don't have a particular
problem allowing an optional setscanlimits type callback that works with
block numbers. The planner could check its presence and just not build
tid range scans if not present.  Alternatively a bespoke scan API for
tid range scans, like the later patches in the tableam series for
bitmap, sample, analyze scans, might be an option.

Greetings,

Andres Freund


Re: Tid scan improvements

Tom Lane-2
Andres Freund <[hidden email]> writes:
> I was kinda hoping to keep block numbers out of the "main" APIs, to
> avoid assuming everything is BLCKSZ based. I don't have a particular
> problem allowing an optional setscanlimits type callback that works with
> block numbers. The planner could check its presence and just not build
> tid range scans if not present.  Alternatively a bespoke scan API for
> tid range scans, like the later patches in the tableam series for
> bitmap, sample, analyze scans, might be an option.

Given Andres' API concerns, and the short amount of time remaining in
this CF, I'm not sure how much of this patch set we can expect to land
in v12.  It seems like it might be a good idea to scale back our ambitions
and see whether there's a useful subset we can push in easily.

With that in mind, I went ahead and pushed 0001+0004, since improving
the planner's selectivity estimate for a "ctid vs constant" qual is
likely to be helpful whether the executor is smart about it or not.

FWIW, I don't really see the point of treating 0002 as a separate patch.
If it had some utility on its own, then it'd be sensible, but what
would that be?  Also, it looks from 0002 like you are trying to overload
rs_startblock with a different meaning than it has for syncscans, and
I think that might be a bad idea.

                        regards, tom lane


Re: Tid scan improvements

Edmund Horner
On Tue, 26 Mar 2019 at 11:54, Tom Lane <[hidden email]> wrote:

>
> Andres Freund <[hidden email]> writes:
> > I was kinda hoping to keep block numbers out of the "main" APIs, to
> > avoid assuming everything is BLCKSZ based. I don't have a particular
> > problem allowing an optional setscanlimits type callback that works with
> > block numbers. The planner could check its presence and just not build
> > tid range scans if not present.  Alternatively a bespoke scan API for
> > tid range scans, like the later patches in the tableam series for
> > bitmap, sample, analyze scans, might be an option.
>
> Given Andres' API concerns, and the short amount of time remaining in
> this CF, I'm not sure how much of this patch set we can expect to land
> in v12.  It seems like it might be a good idea to scale back our ambitions
> and see whether there's a useful subset we can push in easily.

I agree.  It'll take some time to digest Andres' advice and write a
better patch.

Should I update the CF app to a) set the target version to 13, and/or
b) move it to the next commitfest?

> With that in mind, I went ahead and pushed 0001+0004, since improving
> the planner's selectivity estimate for a "ctid vs constant" qual is
> likely to be helpful whether the executor is smart about it or not.

Cool.

> FWIW, I don't really see the point of treating 0002 as a separate patch.
> If it had some utility on its own, then it'd be sensible, but what
> would that be?  Also, it looks from 0002 like you are trying to overload
> rs_startblock with a different meaning than it has for syncscans, and
> I think that might be a bad idea.

Yeah I don't think either patch is useful without the other.  They
were separate because, initially, only some of the TidRangeScan
functionality depended on it, and I was particularly uncomfortable
with what I was doing to heapam.c.

The changes in heapam.c were required for backward scan support, as
used by ORDER BY ctid DESC and MAX(ctid); and also for FETCH LAST and
FETCH PRIOR.  I have removed the backward scans functionality from the
current set of patches, but support for backward cursor fetches
remains.

I guess to brutally simplify the patch further, we could give up
backward cursor fetches entirely?  This means such cursors that end up
using a TidRangeScan will require SCROLL to go backwards (which is a
small pain for user experience), but TBH I don't think backwards-going
cursors on CTID will be hugely common.
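
To illustrate with a hypothetical table t, that would mean:

  BEGIN;
  DECLARE c SCROLL CURSOR FOR
      SELECT * FROM t WHERE ctid < '(100,1)';
  FETCH 10 FROM c;
  FETCH PRIOR FROM c;   -- allowed because of SCROLL
  COMMIT;
  -- the same cursor declared without SCROLL, if planned as a TidRangeScan
  -- with no backward support, would instead fail the FETCH PRIOR with
  -- something like "cursor can only scan forward"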

I'm still not familiar enough with heapam.c to have any better ideas
on how to support backward scanning a limited range.


Re: Tid scan improvements

David Steele
On 3/26/19 8:11 AM, Edmund Horner wrote:

> On Tue, 26 Mar 2019 at 11:54, Tom Lane <[hidden email]> wrote:
>>
>> Andres Freund <[hidden email]> writes:
>>> I was kinda hoping to keep block numbers out of the "main" APIs, to
>>> avoid assuming everything is BLCKSZ based. I don't have a particular
>>> problem allowing an optional setscanlimits type callback that works with
>>> block numbers. The planner could check its presence and just not build
>>> tid range scans if not present.  Alternatively a bespoke scan API for
>>> tid range scans, like the later patches in the tableam series for
>>> bitmap, sample, analyze scans, might be an option.
>>
>> Given Andres' API concerns, and the short amount of time remaining in
>> this CF, I'm not sure how much of this patch set we can expect to land
>> in v12.  It seems like it might be a good idea to scale back our ambitions
>> and see whether there's a useful subset we can push in easily.
>
> I agree.  It'll take some time to digest Andres' advice and write a
> better patch.
>
> Should I update the CF app to a) set the target version to 13, and/or
> b) move it to the next commitfest?

If you plan to continue working on it in this CF then you can just
change the target to PG13.  If you plan to take a break and pick up the
work later then go ahead and push it to the next CF.

Regards,
--
-David
[hidden email]


Re: Tid scan improvements

Andres Freund
In reply to this post by Edmund Horner
Hi,

On 2019-03-26 19:11:13 +1300, Edmund Horner wrote:

> The changes in heapam.c were required for backward scan support, as
> used by ORDER BY ctid DESC and MAX(ctid); and also for FETCH LAST and
> FETCH PRIOR.  I have removed the backward scans functionality from the
> current set of patches, but support for backward cursor fetches
> remains.
>
> I guess to brutally simplify the patch further, we could give up
> backward cursor fetches entirely?  This means such cursors that end up
> using a TidRangeScan will require SCROLL to go backwards (which is a
> small pain for user experience), but TBH I don't think backwards-going
> cursors on CTID will be hugely common.

FWIW, I think it'd be entirely reasonable to remove support for backward
scans without SCROLL. In fact, I think it'd be wasted effort to maintain
code for it, without a pretty clear reason why we need it (unless it
were trivial to support, which it isn't).

Greetings,

Andres Freund


Re: Tid scan improvements

Thomas Munro-5
In reply to this post by David Steele
On Tue, Mar 26, 2019 at 7:25 PM David Steele <[hidden email]> wrote:
> On 3/26/19 8:11 AM, Edmund Horner wrote:
> > Should I update the CF app to a) set the target version to 13, and/or
> > b) move it to the next commitfest?
>
> If you plan to continue working on it in this CF then you can just
> change the target to PG13.  If you plan to take a break and pick up the
> work later then go ahead and push it to the next CF.

Hi Edmund,

The new CF is here.  I'm going through and poking threads for submissions
that don't apply, but it sounds like this needs more than a rebase?
Perhaps this belongs in the next CF?

--
Thomas Munro
https://enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Tid scan improvements

David Rowley-3
On Mon, 1 Jul 2019 at 23:29, Thomas Munro <[hidden email]> wrote:
> The new CF is here.  I'm going through poking threads for submissions
> that don't apply, but it sounds like this needs more than a rebase?
> Perhaps this belongs in the next CF?

0001 and 0004 of v7 got pushed in PG12. The CFbot will be trying to
apply 0001 still, but on testing 0002, no joy there either.

It would be good to see this back in PG13. For now, I'll mark it as
waiting on author.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Tid scan improvements

Edmund Horner
On Thu, 4 Jul 2019 at 15:43, David Rowley <[hidden email]> wrote:

> On Mon, 1 Jul 2019 at 23:29, Thomas Munro <[hidden email]> wrote:
> > The new CF is here.  I'm going through poking threads for submissions
> > that don't apply, but it sounds like this needs more than a rebase?
> > Perhaps this belongs in the next CF?
>
> 0001 and 0004 of v7 got pushed in PG12. The CFbot will be trying to
> apply 0001 still, but on testing 0002, no joy there either.
>
> It would be good to see this back in PG13. For now, I'll mark it as
> waiting on author.

Hi,

I'm not really sure how to proceed.  I started with a fairly pragmatic
solution to "WHERE ctid > ? AND ctid < ?" for tables, and then tableam
came along.

The options I see are:

A.  Continue to target only heapam tables, making the bare minimum
changes necessary for the new tableam api.
B.  Try to do something more general that works on all tableam
implementations for which it may be useful.

There may not be much difference between them, but B. means a bit more
research into zheap, zstore and other possible tableams.

Next question, how will the executor access the table?

1. Continue to use the seqscan tableam methods, by setting limits.
2. Use the bitmap scan methods, for instance by faking a BitmapIteratorResult.
3. Add new tableam methods specially for scanning a range of TIDs.

Any thoughts?

