New vacuum option to do only freezing

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
118 messages Options
123456
Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Masahiko Sawada
On Tue, Mar 5, 2019 at 8:01 PM Kyotaro HORIGUCHI
<[hidden email]> wrote:
>
> Hello, I have some other comments.
>

Thank you for the comment!

On Tue, Mar 5, 2019 at 8:01 PM Kyotaro HORIGUCHI
<[hidden email]> wrote:

>
>
> +                   nleft;            /* item pointers we left */
>
> The name seems to be something other, and the comment doesn't
> makes sense at least.. for me.. Looking below,
>
> +                                    "%.0f tuples are left as dead.\n",
> +                                    nleft),
> +                     nleft);
>
> How about "nleft_dead; /* iterm pointers left as dead */"?
Fixed.

>
>
>
> In this block:
>
> -        if (nindexes == 0 &&
> +        if ((nindexes == 0 || skip_index_vacuum) &&
>              vacrelstats->num_dead_tuples > 0)
>          {
>
> Is it right that vacuumed_pages is incremented and FSM is updated
> while the page is not actually vacuumed?
Good catch. I think the FSM stuff is right because we actually did HOT
pruning but the increment of vacuumed_page seems wrong to me. I think
we should not increment it and not report "removed XX row version in
YY pages" message.

>
>
>          tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin, false,
> -                                         &vacrelstats->latestRemovedXid);
> +                                         &vacrelstats->latestRemovedXid,
> +                                         &tups_pruned);
>
> tups_pruned looks as "HOT pruned tuples". It is named "unused" in
> the function's parameters. (But I think it is useless. Please see
> the details below.)
>
>
> I tested it with a simple data set.
>
> (autovacuum = off)
> drop table if exists t;
> create table t with (fillfactor=50) as select a, a % 3 as b from generate_series(1, 9) a;
> create index on t(a);
> update t set a = -a where b = 0;
> update t set b = b + 1 where b = 1;
>
> We now have 9 tuples, 15 versions and 3 out of 6 "old" tuples are
> to be "left dead" by DISABLE_INDEX_CLEANUP vacuum. It means,
> three tuples ends with "left dead", three tuples are removed and
> 12 tuples will survive the vacuum below.
>
> vacuum (verbose, freeze ,disable_index_cleanup, disable_page_skipping) t;
>
> > INFO:  "t": removed 0 row versions in 1 pages
> > INFO:  "t": found 0 removable, 9 nonremovable row versions in 1 out of 1 pages
> > DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 925
> > There were 0 unused item pointers.
> > Skipped 0 pages due to buffer pins, 0 frozen pages.
> > 0 pages are entirely empty.
> > 3 tuples are left as dead.
>
> Three tuple versions have vanished. Actually they were removed
> but not shown in the message.
>
> heap_prune_chain() doesn't count a live root entry of a chain as
> "unused (line pointer)" since it is marked as "redierected". As
> the result the vanished tuples are counted in tups_vacuumed, not
> in tups_pruned. Maybe the name tups_vacuumed is confusing.  After
> removing tups_pruned code it works correctly.
>
> > INFO:  "t": removed 6 row versions in 1 pages
> > INFO:  "t": found 6 removable, 9 nonremovable row versions in 1 out of 1 pages
> > DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 932
> > There were 0 unused item pointers.
> > Skipped 0 pages due to buffer pins, 0 frozen pages.
> > 0 pages are entirely empty.
> > 3 tuples are left as dead.
>
> I see two choices of the second line above.
>
> 1> "t": found 6 removable, 9 nonremovable row versions in 1 out of 1 pages
>
>   "removable" includes "left dead" tuples.
>
> 2> "t": found 3 removable, 12 nonremovable row versions in 1 out of 1 pages
>
>   "removable" excludes "left dead" tuples.
>
> If you prefer the latter, removable and nonremoveable need to be
> corrected using nleft.
I think that the first vacuum should report the former message because
it's true that the table has 6 removable tuples. We remove 6 tuples
but leave 3 item pointers. So in the second vacuum, it should be
"found 0 removable, 9 nonremovable row versions ..." and "3 tuples are
left as dead". But to report more precisely it'd be better to report
"0 tuples and 3 item identifiers are left as dead" here.

Attached updated patch incorporated all of comments. Also I've added
new reloption vacuum_index_cleanup as per discussion on the "reloption
to prevent VACUUM from truncating empty pages at the end of relation"
thread. Autovacuums also can skip index cleanup when the reloption is
set to false. Since the setting this to false might lead some problems
I've made autovacuums report the number of dead tuples and dead
itemids we left.

Regards,

Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

v8-0002-Add-diable-index-cleanup-option-to-vacuumdb.patch (7K) Download Attachment
v8-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patch (27K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Robert Haas
On Tue, Mar 5, 2019 at 11:29 PM Masahiko Sawada <[hidden email]> wrote:
> Attached updated patch incorporated all of comments. Also I've added
> new reloption vacuum_index_cleanup as per discussion on the "reloption
> to prevent VACUUM from truncating empty pages at the end of relation"
> thread. Autovacuums also can skip index cleanup when the reloption is
> set to false. Since the setting this to false might lead some problems
> I've made autovacuums report the number of dead tuples and dead
> itemids we left.

It seems to me that the disable_index_cleanup should be renamed
index_cleanup and the default should be changed to true, for
consistency with the reloption (and, perhaps, other patches).

- num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
+ num_tuples = live_tuples = tups_vacuumed  = nkeep = nunused =
+ nleft_dead_itemids = nleft_dead_tuples = 0;

I would suggest leaving the existing line alone (and not adding an
extra space to it as the patch does now) and just adding a second
initialization on the next line as a separate statement. a = b = c = d
= e = 0 isn't such great coding style that we should stick to it
rigorously even when it ends up having to be broken across lines.

+ /* Index vacuum must be enabled in two-pass vacuum */
+ Assert(!skip_index_vacuum);

I am a big believer in naming consistency.  Please, let's use the same
name everywhere!  If it's going to be index_cleanup, then call the
reloption vacuum_index_cleanup, and call the option index_cleanup, and
call the variable index_cleanup.  Picking a different subset of
cleanup, index, vacuum, skip, and disable for each new name makes it
harder to understand.

- * If there are no indexes then we can vacuum the page right now
- * instead of doing a second scan.
+ * If there are no indexes or index vacuum is disabled we can
+ * vacuum the page right now instead of doing a second scan.

This comment is wrong.  That wouldn't be safe.  And that's probably
why it's not what the code does.

- /* If no indexes, make log report that lazy_vacuum_heap would've made */
+ /*
+ * If no index or index vacuum is disabled, make log report that
+ * lazy_vacuum_heap would've make.
+ */
  if (vacuumed_pages)

Hmm, does this really do what the comment claims?  It looks to me like
we only increment vacuumed_pages when we call lazy_vacuum_page(), and
we (correctly) don't do that when index cleanup is disabled, but then
here this claims that if (vacuumed_pages) will be true in that case.

I wonder if it would be cleaner to rename vacrelstate->hasindex to
'useindex' and set it to false if there are no indexes or index
cleanup is disabled.  But that might actually be worse, not sure.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Masahiko Sawada
On Thu, Mar 7, 2019 at 3:55 AM Robert Haas <[hidden email]> wrote:

>
> On Tue, Mar 5, 2019 at 11:29 PM Masahiko Sawada <[hidden email]> wrote:
> > Attached updated patch incorporated all of comments. Also I've added
> > new reloption vacuum_index_cleanup as per discussion on the "reloption
> > to prevent VACUUM from truncating empty pages at the end of relation"
> > thread. Autovacuums also can skip index cleanup when the reloption is
> > set to false. Since the setting this to false might lead some problems
> > I've made autovacuums report the number of dead tuples and dead
> > itemids we left.
>
> It seems to me that the disable_index_cleanup should be renamed
> index_cleanup and the default should be changed to true, for
> consistency with the reloption (and, perhaps, other patches).
Hmm, the patch already has new reloption vacuum_index_cleanup and
default value is true but you meant I should change its name to
index_cleanup?

>
> - num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
> + num_tuples = live_tuples = tups_vacuumed  = nkeep = nunused =
> + nleft_dead_itemids = nleft_dead_tuples = 0;
>
> I would suggest leaving the existing line alone (and not adding an
> extra space to it as the patch does now) and just adding a second
> initialization on the next line as a separate statement. a = b = c = d
> = e = 0 isn't such great coding style that we should stick to it
> rigorously even when it ends up having to be broken across lines.
Fixed.

>
> + /* Index vacuum must be enabled in two-pass vacuum */
> + Assert(!skip_index_vacuum);
>
> I am a big believer in naming consistency.  Please, let's use the same
> name everywhere!  If it's going to be index_cleanup, then call the
> reloption vacuum_index_cleanup, and call the option index_cleanup, and
> call the variable index_cleanup.  Picking a different subset of
> cleanup, index, vacuum, skip, and disable for each new name makes it
> harder to understand.
Fixed.

>
> - * If there are no indexes then we can vacuum the page right now
> - * instead of doing a second scan.
> + * If there are no indexes or index vacuum is disabled we can
> + * vacuum the page right now instead of doing a second scan.
>
> This comment is wrong.  That wouldn't be safe.  And that's probably
> why it's not what the code does.

Fixed.

>
> - /* If no indexes, make log report that lazy_vacuum_heap would've made */
> + /*
> + * If no index or index vacuum is disabled, make log report that
> + * lazy_vacuum_heap would've make.
> + */
>   if (vacuumed_pages)
>
> Hmm, does this really do what the comment claims?  It looks to me like
> we only increment vacuumed_pages when we call lazy_vacuum_page(), and
> we (correctly) don't do that when index cleanup is disabled, but then
> here this claims that if (vacuumed_pages) will be true in that case.
You're right, vacuumed_pages never be > 0 in disable_index_cleanup case. Fixed.

>
> I wonder if it would be cleaner to rename vacrelstate->hasindex to
> 'useindex' and set it to false if there are no indexes or index
> cleanup is disabled.  But that might actually be worse, not sure.
>

I tried the changes and it seems good idea to me. Fixed.

Attached the updated version patches.

Regards,

--


Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

v9-0001-Add-DISABLE_INDEX_CLEANUP-option-to-VACUUM-comman.patch (24K) Download Attachment
v9-0002-Add-diable-index-cleanup-option-to-vacuumdb.patch (7K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Robert Haas
On Thu, Mar 7, 2019 at 1:03 AM Masahiko Sawada <[hidden email]> wrote:
> Hmm, the patch already has new reloption vacuum_index_cleanup and
> default value is true but you meant I should change its name to
> index_cleanup?

No, I mean that you should make it so that someone writes VACUUM
(INDEX_CLEANUP false) instead of VACUUM (DISABLE_INDEX_CLEANUP).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Masahiko Sawada
On Fri, Mar 8, 2019 at 12:04 AM Robert Haas <[hidden email]> wrote:
>
> On Thu, Mar 7, 2019 at 1:03 AM Masahiko Sawada <[hidden email]> wrote:
> > Hmm, the patch already has new reloption vacuum_index_cleanup and
> > default value is true but you meant I should change its name to
> > index_cleanup?
>
> No, I mean that you should make it so that someone writes VACUUM
> (INDEX_CLEANUP false) instead of VACUUM (DISABLE_INDEX_CLEANUP).
>

IIUC we've discussed the field-and-value style vacuum option. I
suggested that since we have already the disable_page_skipping option
the disable_page_skipping option would be more natural style and
consistent. I think "VACUUM (INDEX_CLEANUP false)" seems consistent
with its reloption but not with other vacuum options. So why does only
this option (and probably up-coming new options) need to support new
style? Do we need the same change to the existing options?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Robert Haas
On Fri, Mar 8, 2019 at 12:14 AM Masahiko Sawada <[hidden email]> wrote:
> IIUC we've discussed the field-and-value style vacuum option. I
> suggested that since we have already the disable_page_skipping option
> the disable_page_skipping option would be more natural style and
> consistent. I think "VACUUM (INDEX_CLEANUP false)" seems consistent
> with its reloption but not with other vacuum options. So why does only
> this option (and probably up-coming new options) need to support new
> style? Do we need the same change to the existing options?

Well, it's too late to change to change DISABLE_PAGE_SKIPPING to work
some other way; it's been released, and we're stuck with it at this
point.  However, I generally believe that it is preferable to phrase
options positively then negatively, so that for example one writes
EXPLAIN (ANALYZE, TIMING OFF) not EXPLAIN (ANALYZE, NO_TIMING).  So
I'd like to do it that way for the new options that we're proposing to
add.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Masahiko Sawada
On Sat, Mar 23, 2019 at 3:25 AM Robert Haas <[hidden email]> wrote:

>
> On Fri, Mar 8, 2019 at 12:14 AM Masahiko Sawada <[hidden email]> wrote:
> > IIUC we've discussed the field-and-value style vacuum option. I
> > suggested that since we have already the disable_page_skipping option
> > the disable_page_skipping option would be more natural style and
> > consistent. I think "VACUUM (INDEX_CLEANUP false)" seems consistent
> > with its reloption but not with other vacuum options. So why does only
> > this option (and probably up-coming new options) need to support new
> > style? Do we need the same change to the existing options?
>
> Well, it's too late to change to change DISABLE_PAGE_SKIPPING to work
> some other way; it's been released, and we're stuck with it at this
> point.
Agreed.

> However, I generally believe that it is preferable to phrase
> options positively then negatively, so that for example one writes
> EXPLAIN (ANALYZE, TIMING OFF) not EXPLAIN (ANALYZE, NO_TIMING).  So
> I'd like to do it that way for the new options that we're proposing to
> add.

Agreed with using phrase options positively than negatively. Since
DISABLE_PAGE_SKIPPING is an option for emergency we might be able to
rename for consistency in a future release.

Attached updated version patches. 0001 patch can be applied on top of
the patch that allows the all existing options have one boolean
argument, which I've attached on another thread[1]. So please apply
them in following order.

1. v20-0001-All-VACUUM-command-options-allow-an-argument.patch
(attached on [1] thread)
2. v10-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patch
3. v10-0002-Add-disable-index-cleanup-option-to-vacuumdb.patch

I kept the --disable-index-cleanup option of vacuumdb command since
perhaps it would be understandable to specify this option rather than
setting true/false as a command line option.

Please review the patches.

[1] https://www.postgresql.org/message-id/CAD21AoBg8CBf1OAse6ESKJmNBon14h3nAR67nJhZ%3DyujA%2BLk4A%40mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

v10-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patch (27K) Download Attachment
v10-0002-Add-disable-index-cleanup-option-to-vacuumdb.patch (7K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Masahiko Sawada
On Wed, Mar 27, 2019 at 12:12 AM Masahiko Sawada <[hidden email]> wrote:

>
> On Sat, Mar 23, 2019 at 3:25 AM Robert Haas <[hidden email]> wrote:
> >
> > On Fri, Mar 8, 2019 at 12:14 AM Masahiko Sawada <[hidden email]> wrote:
> > > IIUC we've discussed the field-and-value style vacuum option. I
> > > suggested that since we have already the disable_page_skipping option
> > > the disable_page_skipping option would be more natural style and
> > > consistent. I think "VACUUM (INDEX_CLEANUP false)" seems consistent
> > > with its reloption but not with other vacuum options. So why does only
> > > this option (and probably up-coming new options) need to support new
> > > style? Do we need the same change to the existing options?
> >
> > Well, it's too late to change to change DISABLE_PAGE_SKIPPING to work
> > some other way; it's been released, and we're stuck with it at this
> > point.
>
> Agreed.
>
> > However, I generally believe that it is preferable to phrase
> > options positively then negatively, so that for example one writes
> > EXPLAIN (ANALYZE, TIMING OFF) not EXPLAIN (ANALYZE, NO_TIMING).  So
> > I'd like to do it that way for the new options that we're proposing to
> > add.
>
> Agreed with using phrase options positively than negatively. Since
> DISABLE_PAGE_SKIPPING is an option for emergency we might be able to
> rename for consistency in a future release.
>
> Attached updated version patches.

The patch adds the basic functionality to disable index cleanup but
one possible argument could be whether we should always disable it
when anti-wraparound vacuum. As discussed on another thread[1]
anti-wraparound vacuum still could lead the I/O burst problem and take
a long time, especially for append-only large table. Originally the
purpose of this feature is to resolve the problem that vacuum takes a
long time even if the table has just a few dead tuples, which is a
quite common situation of anti-wraparound vacuum. It might be too late
to discuss but if we always disable it when anti-wraparound vacuum
then users don't need to do "VACUUM (INDEX_CLEANUP false)" manually on
PostgreSQL 12. Dose anyone have opinions?

[1] https://www.postgresql.org/message-id/CAC8Q8t%2Bj36G_bLF%3D%2B0iMo6jGNWnLnWb1tujXuJr-%2Bx8ZCCTqoQ%40mail.gmail.com



Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Robert Haas
On Thu, Mar 28, 2019 at 2:00 AM Masahiko Sawada <[hidden email]> wrote:

> The patch adds the basic functionality to disable index cleanup but
> one possible argument could be whether we should always disable it
> when anti-wraparound vacuum. As discussed on another thread[1]
> anti-wraparound vacuum still could lead the I/O burst problem and take
> a long time, especially for append-only large table. Originally the
> purpose of this feature is to resolve the problem that vacuum takes a
> long time even if the table has just a few dead tuples, which is a
> quite common situation of anti-wraparound vacuum. It might be too late
> to discuss but if we always disable it when anti-wraparound vacuum
> then users don't need to do "VACUUM (INDEX_CLEANUP false)" manually on
> PostgreSQL 12. Dose anyone have opinions?

I think we can respect the configured value of the option even for
aggressive vacuums, but I don't think we should change aggressive
vacuums to work that way by default.  You are correct that the table
might have only a few dead tuples, but it might also have a lot of
dead tuples; I have heard rumors of a PostgreSQL installation that had
autovacuum = off and non-stop wraparound autovacuums desperately
trying to forestall shutdown.  That's probably a lot less likely now
that we have the freeze map and such a system would almost surely have
a nasty bloat problem, but disabling index cleanup by default would
make it worse.

I think the solution in the long run here is to (1) allow the
index_cleanup option (or the corresponding reloption) to override the
default behavior and (2) eventually change the default behavior from
'always yes' to 'depends on how many dead tuples we found'.  But I
think that the second of those things is not appropriate to consider
changing in PG 12 at this point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Masahiko Sawada
On Fri, Mar 29, 2019 at 4:39 AM Robert Haas <[hidden email]> wrote:

>
> On Thu, Mar 28, 2019 at 2:00 AM Masahiko Sawada <[hidden email]> wrote:
> > The patch adds the basic functionality to disable index cleanup but
> > one possible argument could be whether we should always disable it
> > when anti-wraparound vacuum. As discussed on another thread[1]
> > anti-wraparound vacuum still could lead the I/O burst problem and take
> > a long time, especially for append-only large table. Originally the
> > purpose of this feature is to resolve the problem that vacuum takes a
> > long time even if the table has just a few dead tuples, which is a
> > quite common situation of anti-wraparound vacuum. It might be too late
> > to discuss but if we always disable it when anti-wraparound vacuum
> > then users don't need to do "VACUUM (INDEX_CLEANUP false)" manually on
> > PostgreSQL 12. Dose anyone have opinions?
>
> I think we can respect the configured value of the option even for
> aggressive vacuums, but I don't think we should change aggressive
> vacuums to work that way by default.  You are correct that the table
> might have only a few dead tuples, but it might also have a lot of
> dead tuples; I have heard rumors of a PostgreSQL installation that had
> autovacuum = off and non-stop wraparound autovacuums desperately
> trying to forestall shutdown.  That's probably a lot less likely now
> that we have the freeze map and such a system would almost surely have
> a nasty bloat problem, but disabling index cleanup by default would
> make it worse.
Understood and agreed. Always setting it to false would affect much
and there are users who expect anti-wraparound vacuums to reclaim
garbage.

>
> I think the solution in the long run here is to (1) allow the
> index_cleanup option (or the corresponding reloption) to override the
> default behavior and (2) eventually change the default behavior from
> 'always yes' to 'depends on how many dead tuples we found'.  But I
> think that the second of those things is not appropriate to consider
> changing in PG 12 at this point.

The current patch already takes (1) and I agreed with you that (2)
would be for PG 13 or later. So the patch would be helpful for such
users as well.

Attached updated patches. These patches are applied on top of 0001
patch on parallel vacuum thread[1].

[1] https://www.postgresql.org/message-id/CAD21AoBaFcKBAeL5_%2B%2Bj%2BVzir2vBBcF4juW7qH8b3HsQY%3DQ6%2Bw%40mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

v11-0002-Add-disable-index-cleanup-option-to-vacuumdb.patch (7K) Download Attachment
v11-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patch (27K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Robert Haas
On Fri, Mar 29, 2019 at 2:16 AM Masahiko Sawada <[hidden email]> wrote:
> Attached updated patches. These patches are applied on top of 0001
> patch on parallel vacuum thread[1].

+    bool index_cleanup = true;  /* by default */

I think we should instead initialize index_cleanup to the reloption
value, if there is one, or true if none, and then let it be overridden
by the loop that follows, where whatever the user specifies in the SQL
command is processed.  That way, any explicitly-specified option
within the command itself wins, and the reloption sets the default.
As you have it, index cleanup is disabled when the reloption is set to
false even if the user wrote VACUUM (INDEX_CLEANUP TRUE).

+            appendStringInfo(&buf,
+                             _("%.0f tuples and %.0f item identifiers are left
as dead.\n"),
+                             vacrelstats->nleft_dead_tuples,
+                             vacrelstats->nleft_dead_itemids);

I tend to think we should omit this line entirely if both values are
0, as will very often be the case.

+    if ((params->options & VACOPT_FULL) != 0 &&
+        (params->options & VACOPT_INDEX_CLEANUP) == 0)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("VACUUM option INDEX_CLEANUP cannot be set to
false with FULL")));

I think it would be better to just ignore the INDEX_CLEANUP option
when FULL is specified.

I wasn't all that happy with the documentation changes you proposed.
Please find attached a proposed set of doc changes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

vacuum-index-cleanup-doc-rmh.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Masahiko Sawada
On Fri, Mar 29, 2019 at 10:46 PM Robert Haas <[hidden email]> wrote:

>
> On Fri, Mar 29, 2019 at 2:16 AM Masahiko Sawada <[hidden email]> wrote:
> > Attached updated patches. These patches are applied on top of 0001
> > patch on parallel vacuum thread[1].
>
> +    bool index_cleanup = true;  /* by default */
>
> I think we should instead initialize index_cleanup to the reloption
> value, if there is one, or true if none, and then let it be overridden
> by the loop that follows, where whatever the user specifies in the SQL
> command is processed.  That way, any explicitly-specified option
> within the command itself wins, and the reloption sets the default.
> As you have it, index cleanup is disabled when the reloption is set to
> false even if the user wrote VACUUM (INDEX_CLEANUP TRUE).
>

Yeah, but since multiple relations might be specified in VACUUM
command we need to process index_cleanup option after opened each
relations. Maybe we need to process all options except for
INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel()
and process it  in manner of you suggested after opened the relation.
Is that right?

> +            appendStringInfo(&buf,
> +                             _("%.0f tuples and %.0f item identifiers are left
> as dead.\n"),
> +                             vacrelstats->nleft_dead_tuples,
> +                             vacrelstats->nleft_dead_itemids);
>
> I tend to think we should omit this line entirely if both values are
> 0, as will very often be the case.

Fixed.

>
> +    if ((params->options & VACOPT_FULL) != 0 &&
> +        (params->options & VACOPT_INDEX_CLEANUP) == 0)
> +        ereport(ERROR,
> +                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> +                 errmsg("VACUUM option INDEX_CLEANUP cannot be set to
> false with FULL")));
>
> I think it would be better to just ignore the INDEX_CLEANUP option
> when FULL is specified.

Okay, but why do we ignore that in this case while we complain in the
case of FULL and DISABLE_PAGE_SKIPPING?

>
> I wasn't all that happy with the documentation changes you proposed.
> Please find attached a proposed set of doc changes.

Thank you! I've incorporated these changes.
Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Robert Haas
On Fri, Mar 29, 2019 at 12:27 PM Masahiko Sawada <[hidden email]> wrote:
> Yeah, but since multiple relations might be specified in VACUUM
> command we need to process index_cleanup option after opened each
> relations. Maybe we need to process all options except for
> INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel()
> and process it  in manner of you suggested after opened the relation.
> Is that right?

Blech, no, let's not do that.  We'd better use some method that can
indicate yes/no/default.  Something like psql's trivalue thingy, but
probably not exactly that.  We can define an enum for this purpose,
for example - VACUUM_INDEX_CLEANUP_{ENABLED,DISABLED,DEFAULT}.  Or
maybe there's some other way.  But let's not pass bits of the parse
tree around any more than really required.

> > I think it would be better to just ignore the INDEX_CLEANUP option
> > when FULL is specified.
>
> Okay, but why do we ignore that in this case while we complain in the
> case of FULL and DISABLE_PAGE_SKIPPING?

Well, that's a fair point, I guess.  If we go that that route, we'll
need to make sure that setting the reloption doesn't prevent VACUUM
FULL from working -- the complaint must only affect an explicit option
on the VACUUM command line.  I think we should have a regression test
for that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Masahiko Sawada
On Sat, Mar 30, 2019 at 5:04 AM Robert Haas <[hidden email]> wrote:

>
> On Fri, Mar 29, 2019 at 12:27 PM Masahiko Sawada <[hidden email]> wrote:
> > Yeah, but since multiple relations might be specified in VACUUM
> > command we need to process index_cleanup option after opened each
> > relations. Maybe we need to process all options except for
> > INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel()
> > and process it  in manner of you suggested after opened the relation.
> > Is that right?
>
> Blech, no, let's not do that.  We'd better use some method that can
> indicate yes/no/default.  Something like psql's trivalue thingy, but
> probably not exactly that.  We can define an enum for this purpose,
> for example - VACUUM_INDEX_CLEANUP_{ENABLED,DISABLED,DEFAULT}.  Or
> maybe there's some other way.  But let's not pass bits of the parse
> tree around any more than really required.
I've defined an enum VacOptTernaryValue representing
enabled/disabled/default and added index_cleanup variable as the new
enum type to VacuumParams. The vacuum options that uses the reloption
value as the default value such as index cleanup and new truncation
option can use this enum and set either enabled or disabled after
opened the relation when it’s set to default. Please find the attached
patches.

>
> > > I think it would be better to just ignore the INDEX_CLEANUP option
> > > when FULL is specified.
> >
> > Okay, but why do we ignore that in this case while we complain in the
> > case of FULL and DISABLE_PAGE_SKIPPING?
>
> Well, that's a fair point, I guess.  If we go that that route, we'll
> need to make sure that setting the reloption doesn't prevent VACUUM
> FULL from working -- the complaint must only affect an explicit option
> on the VACUUM command line.  I think we should have a regression test
> for that.
I've added regression tests. Since we check it before setting
index_cleanup based on reloptions we doesn't prevent VAUCUM FULL from
working even when the vacuum_index_cleanup is false.


Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

v12-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patch (29K) Download Attachment
v12-0002-Add-disable-index-cleanup-option-to-vacuumdb.patch (7K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Darafei "Komяpa" Praliaskouski
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:       not tested
Spec compliant:           not tested
Documentation:            not tested

I have read this patch. I like the concept and would like it to get committed.

Question I have after reading the patch is around this construct:

  /*
- * If there are no indexes then we can vacuum the page right now
- * instead of doing a second scan.
+ * If there are no indexes we can vacuum the page right now instead of
+ * doing a second scan. Also we don't do that but forget dead tuples
+ * when index cleanup is disabled.
  */

This seems to change behavior on heap tuples, even though the option itself is documented to be about "Indexes" only. This needs either better explanation what "forget dead tuples" means and that it does not lead to some kind of internal inconsistency, or documentation on what is the effect on heap tuples.

This same block raises a question on "after I enable this option, do a vacuum, decide I don't like it, what do I need to do to disable it back?" - just set it back, or set and perform a vacuum, or set and perform a VACUUM FULL as something was "forgotten"?

It may happen this concept of "forgetting" is documented somewhere in the near comments but I'd prefer it to be stated explicitly.

The new status of this patch is: Waiting on Author
Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Kyotaro HORIGUCHI-2
In reply to this post by Masahiko Sawada
Hello.

At Mon, 1 Apr 2019 14:26:15 +0900, Masahiko Sawada <[hidden email]> wrote in <CAD21AoCKKwvgQgWxKwPPmVFJjQVj6c=[hidden email]>

> On Sat, Mar 30, 2019 at 5:04 AM Robert Haas <[hidden email]> wrote:
> >
> > On Fri, Mar 29, 2019 at 12:27 PM Masahiko Sawada <[hidden email]> wrote:
> > > Yeah, but since multiple relations might be specified in VACUUM
> > > command we need to process index_cleanup option after opened each
> > > relations. Maybe we need to process all options except for
> > > INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel()
> > > and process it  in manner of you suggested after opened the relation.
> > > Is that right?
> >
> > Blech, no, let's not do that.  We'd better use some method that can
> > indicate yes/no/default.  Something like psql's trivalue thingy, but
> > probably not exactly that.  We can define an enum for this purpose,
> > for example - VACUUM_INDEX_CLEANUP_{ENABLED,DISABLED,DEFAULT}.  Or
> > maybe there's some other way.  But let's not pass bits of the parse
> > tree around any more than really required.
>
> I've defined an enum VacOptTernaryValue representing
> enabled/disabled/default and added index_cleanup variable as the new

It is defined as ENABLED=0, DISABLED=1, DEFAULT=2. At leat
ENABLED=0 and DISABLED=1 are misleading.

> enum type to VacuumParams. The vacuum options that uses the reloption
> value as the default value such as index cleanup and new truncation
> option can use this enum and set either enabled or disabled after
> opened the relation when it’s set to default. Please find the attached
> patches.

+static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);

This is actually a type converter of boolean. It is used to read
VACUUM option but not used to read reloption. It seems useless.


Finally the ternary value is determined to true or false before
use. It is simple that index_cleanup finaly be read as bool. We
could add another boolean to indicate that the value is set or
not, but I think it would be better that the ternary type is a
straightfoward expansion of bool.{DEFAULT = -1, DISABLED = 0,
ENABLED = 1} and make sure that index_cleanup is not DEFAULT at a
certain point.

So, how about this?

#define VACOPT_TERNARY_DEFAULT -1
typedef int VacOptTernaryValue;  /* -1 is undecided, 0 is false, 1 is true */

/* No longer the value mustn't be left DEFAULT */
Assert (params->index_cleanup != VACOPT_TERNARY_DEFAULT);


> > > > I think it would be better to just ignore the INDEX_CLEANUP option
> > > > when FULL is specified.
> > >
> > > Okay, but why do we ignore that in this case while we complain in the
> > > case of FULL and DISABLE_PAGE_SKIPPING?
> >
> > Well, that's a fair point, I guess.  If we go that that route, we'll
> > need to make sure that setting the reloption doesn't prevent VACUUM
> > FULL from working -- the complaint must only affect an explicit option
> > on the VACUUM command line.  I think we should have a regression test
> > for that.
>
> I've added regression tests. Since we check it before setting
> index_cleanup based on reloptions we doesn't prevent VAUCUM FULL from
> working even when the vacuum_index_cleanup is false.

+  errmsg("VACUUM option INDEX_CLEANUP cannot be set to false with FULL")));

I'm not one to talk on this, but this seems somewhat confused.

"VACUUM option INDEX_CLEANUP cannot be set to false with FULL being specified"

or

"INDEX_CLEANUP cannot be disabled for VACUUM FULL"?


And in the following part:

+ /* Set index cleanup option based on reloptions */
+ if (params->index_cleanup == VACUUM_OPTION_DEFAULT)
+ {
+ if (onerel->rd_options == NULL ||
+ ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
+ params->index_cleanup = VACUUM_OPTION_ENABLED;
+ else
+ params->index_cleanup = VACUUM_OPTION_DISABLED;
+ }
+

The option should not be false while VACUUM FULL, and maybe we
should complain in WARNING or NOTICE that the relopt is ignored.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Masahiko Sawada
On Wed, Apr 3, 2019 at 10:56 AM Kyotaro HORIGUCHI
<[hidden email]> wrote:

>
> Hello.
>
> At Mon, 1 Apr 2019 14:26:15 +0900, Masahiko Sawada <[hidden email]> wrote in <CAD21AoCKKwvgQgWxKwPPmVFJjQVj6c=[hidden email]>
> > On Sat, Mar 30, 2019 at 5:04 AM Robert Haas <[hidden email]> wrote:
> > >
> > > On Fri, Mar 29, 2019 at 12:27 PM Masahiko Sawada <[hidden email]> wrote:
> > > > Yeah, but since multiple relations might be specified in VACUUM
> > > > command we need to process index_cleanup option after opened each
> > > > relations. Maybe we need to process all options except for
> > > > INDEX_CLEANUP in ExecVacuum() and pass VacuumStmt down to vacuum_rel()
> > > > and process it  in manner of you suggested after opened the relation.
> > > > Is that right?
> > >
> > > Blech, no, let's not do that.  We'd better use some method that can
> > > indicate yes/no/default.  Something like psql's trivalue thingy, but
> > > probably not exactly that.  We can define an enum for this purpose,
> > > for example - VACUUM_INDEX_CLEANUP_{ENABLED,DISABLED,DEFAULT}.  Or
> > > maybe there's some other way.  But let's not pass bits of the parse
> > > tree around any more than really required.
> >
> > I've defined an enum VacOptTernaryValue representing
> > enabled/disabled/default and added index_cleanup variable as the new
>

Thank you for reviewing the patch!

> It is defined as ENABLED=0, DISABLED=1, DEFAULT=2. At leat
> ENABLED=0 and DISABLED=1 are misleading.

Indeed, will fix.

>
> > enum type to VacuumParams. The vacuum options that uses the reloption
> > value as the default value such as index cleanup and new truncation
> > option can use this enum and set either enabled or disabled after
> > opened the relation when it’s set to default. Please find the attached
> > patches.
>
> +static VacOptTernaryValue get_vacopt_ternary_value(DefElem *def);
>
> This is actually a type converter of boolean. It is used to read
> VACUUM option but not used to read reloption. It seems useless.
>
>
> Finally the ternary value is determined to true or false before
> use. It is simple that index_cleanup finaly be read as bool. We
> could add another boolean to indicate that the value is set or
> not, but I think it would be better that the ternary type is a
> straightfoward expansion of bool.{DEFAULT = -1, DISABLED = 0,
> ENABLED = 1} and make sure that index_cleanup is not DEFAULT at a
> certain point.
>
> So, how about this?
>
> #define VACOPT_TERNARY_DEFAULT -1
> typedef int VacOptTernaryValue;  /* -1 is undecided, 0 is false, 1 is true */

Hmm, if we do that we set either VAOPT_TERNARY_DEFAULT, true or false
to index_cleanup, but I'm not sure this is a good approach. I think we
would want VACOPT_TERNARY_TRUE and VACOPT_TERNARY_FALSE as we defined
new type as a ternary value and already have VACOPT_TERNARY_DEFAULT.

>
> /* No longer the value mustn't be left DEFAULT */
> Assert (params->index_cleanup != VACOPT_TERNARY_DEFAULT);

Agreed, will add it.

>
>
> > > > > I think it would be better to just ignore the INDEX_CLEANUP option
> > > > > when FULL is specified.
> > > >
> > > > Okay, but why do we ignore that in this case while we complain in the
> > > > case of FULL and DISABLE_PAGE_SKIPPING?
> > >
> > > Well, that's a fair point, I guess.  If we go that that route, we'll
> > > need to make sure that setting the reloption doesn't prevent VACUUM
> > > FULL from working -- the complaint must only affect an explicit option
> > > on the VACUUM command line.  I think we should have a regression test
> > > for that.
> >
> > I've added regression tests. Since we check it before setting
> > index_cleanup based on reloptions we doesn't prevent VAUCUM FULL from
> > working even when the vacuum_index_cleanup is false.
>
> +  errmsg("VACUUM option INDEX_CLEANUP cannot be set to false with FULL")));
>
> I'm not one to talk on this, but this seems somewhat confused.
>
> "VACUUM option INDEX_CLEANUP cannot be set to false with FULL being specified"
>
> or
>
> "INDEX_CLEANUP cannot be disabled for VACUUM FULL"?

I prefer the former, will fix.

>
>
> And in the following part:
>
> +       /* Set index cleanup option based on reloptions */
> +       if (params->index_cleanup == VACUUM_OPTION_DEFAULT)
> +       {
> +               if (onerel->rd_options == NULL ||
> +                       ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
> +                       params->index_cleanup = VACUUM_OPTION_ENABLED;
> +               else
> +                       params->index_cleanup = VACUUM_OPTION_DISABLED;
> +       }
> +
>
> The option should not be false while VACUUM FULL,

I think that we need to complain only when INDEX_CLEANUP option is
disabled by an explicit option on the VACUUM command and FULL option
is specified. It's no problem when vacuum_index_cleanup is false and
FULL option is true. Since internally we don't use index cleanup when
vacuum full I guess that we don't need to require index_cleanup being
always true even when full option is specified.

> and maybe we
> should complain in WARNING or NOTICE that the relopt is ignored.

I think when users want to control index cleanup behavior manually
they specify INDEX_CLEANUP option on the VACUUM command. So it seems
to me that overwriting a reloption by an explicit option would be a
natural behavior. I'm concerned that these message would rather
confuse users.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Kyotaro HORIGUCHI-2
At Wed, 3 Apr 2019 12:10:03 +0900, Masahiko Sawada <[hidden email]> wrote in <[hidden email]>

> > And in the following part:
> >
> > +       /* Set index cleanup option based on reloptions */
> > +       if (params->index_cleanup == VACUUM_OPTION_DEFAULT)
> > +       {
> > +               if (onerel->rd_options == NULL ||
> > +                       ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
> > +                       params->index_cleanup = VACUUM_OPTION_ENABLED;
> > +               else
> > +                       params->index_cleanup = VACUUM_OPTION_DISABLED;
> > +       }
> > +
> >
> > The option should not be false while VACUUM FULL,
>
> I think that we need to complain only when INDEX_CLEANUP option is
> disabled by an explicit option on the VACUUM command and FULL option
> is specified. It's no problem when vacuum_index_cleanup is false and
> FULL option is true. Since internally we don't use index cleanup when
> vacuum full I guess that we don't need to require index_cleanup being
> always true even when full option is specified.

I know it's safe. It's just about integrity of option values. So
I don't insist on that.

> > and maybe we
> > should complain in WARNING or NOTICE that the relopt is ignored.
>
> I think when users want to control index cleanup behavior manually
> they specify INDEX_CLEANUP option on the VACUUM command. So it seems
> to me that overwriting a reloption by an explicit option would be a
> natural behavior. I'm concerned that these message would rather
> confuse users.

If it "cannot be specified with FULL", it seems strange that it's
safe being specified by reloptions.

I'm rather thinking that INDEX_CLEANUP = false is ignorable even
being specified with FULL option, aand DISABLE_PAGE_SKIPPING for
VACUUM FULL shuld be ignored since VACUUM FULL doesn't skip pages
in the first place.

Couldn't we silence the DISABLE_PAGE_SKIPPING & FULL case instead
of complaining about INDEX_CLEANUP & FULL? If so, I feel just
ignoring the relopt cases is reasonable.

Yeah, perhaps I'm warrying too much.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Masahiko Sawada
On Wed, Apr 3, 2019 at 1:17 PM Kyotaro HORIGUCHI
<[hidden email]> wrote:

>
> At Wed, 3 Apr 2019 12:10:03 +0900, Masahiko Sawada <[hidden email]> wrote in <[hidden email]>
> > > And in the following part:
> > >
> > > +       /* Set index cleanup option based on reloptions */
> > > +       if (params->index_cleanup == VACUUM_OPTION_DEFAULT)
> > > +       {
> > > +               if (onerel->rd_options == NULL ||
> > > +                       ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup)
> > > +                       params->index_cleanup = VACUUM_OPTION_ENABLED;
> > > +               else
> > > +                       params->index_cleanup = VACUUM_OPTION_DISABLED;
> > > +       }
> > > +
> > >
> > > The option should not be false while VACUUM FULL,
> >
> > I think that we need to complain only when INDEX_CLEANUP option is
> > disabled by an explicit option on the VACUUM command and FULL option
> > is specified. It's no problem when vacuum_index_cleanup is false and
> > FULL option is true. Since internally we don't use index cleanup when
> > vacuum full I guess that we don't need to require index_cleanup being
> > always true even when full option is specified.
>
> I know it's safe. It's just about integrity of option values. So
> I don't insist on that.
>
> > > and maybe we
> > > should complain in WARNING or NOTICE that the relopt is ignored.
> >
> > I think when users want to control index cleanup behavior manually
> > they specify INDEX_CLEANUP option on the VACUUM command. So it seems
> > to me that overwriting a reloption by an explicit option would be a
> > natural behavior. I'm concerned that these message would rather
> > confuse users.
>
> If it "cannot be specified with FULL", it seems strange that it's
> safe being specified by reloptions.
>
> I'm rather thinking that INDEX_CLEANUP = false is ignorable even
> being specified with FULL option, aand DISABLE_PAGE_SKIPPING for
> VACUUM FULL shuld be ignored since VACUUM FULL doesn't skip pages
> in the first place.
>
> Couldn't we silence the DISABLE_PAGE_SKIPPING & FULL case instead
> of complaining about INDEX_CLEANUP & FULL? If so, I feel just
> ignoring the relopt cases is reasonable.
Agreed with being silent even when INDEX_CLEANUP/vacuum_index_cleanup
= false and FULL = true. For  DISABLE_PAGE_SKIPPING, it should be a
separate patch and changes the existing behavior. Maybe need other
discussion.

For VacOptTernaryValue part, I've incorporated the review comments but
left the new enum type since it seems to be more straightforward for
now. I might change that if there are other way.

Attached the updated version patches including the
DISABLE_PAGE_SKIPPING part (0003).

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

v13-0001-Add-INDEX_CLEANUP-option-to-VACUUM-command.patch (28K) Download Attachment
v13-0002-Add-disable-index-cleanup-option-to-vacuumdb.patch (7K) Download Attachment
v13-0003-Do-not-complain-even-when-DISABLE_PAGE_SKIPPING-.patch (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: New vacuum option to do only freezing

Robert Haas
On Wed, Apr 3, 2019 at 1:32 AM Masahiko Sawada <[hidden email]> wrote:
> Attached the updated version patches including the
> DISABLE_PAGE_SKIPPING part (0003).

I am confused about nleft_dead_tuples.  It looks like it gets
incremented whenever we set tupgone = true, regardless of whether we
are doing index cleanup.  But if we ARE doing index cleanup then the
dead tuple will not be left.  And if we are not doing index vacuum
then we still don't need this for anything, because tups_vacuumed is
counting the same thing.  I may be confused.  But if I'm not, then I
think this should just be ripped out, and we should only keep
nleft_dead_itemids.

As far as VacOptTernaryValue, I think it would be safer to change this
so that VACOPT_TERNARY_DEFAULT = 0.  That way palloc0 will fill in the
value that people are likely to want by default, which makes it less
likely that people will accidentally write future code that doesn't
clean up indexes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


123456