On Sat, Apr 24, 2021 at 11:16 AM Peter Geoghegan <[hidden email]> wrote:
> On Fri, Apr 23, 2021 at 5:29 PM Andres Freund <[hidden email]> wrote:
> > On 2021-04-23 16:12:33 -0700, Peter Geoghegan wrote:
> > > The only reason that I chose 4GB for FAILSAFE_MIN_PAGES is because the
> > > related VACUUM_FSM_EVERY_PAGES constant was 8GB -- the latter limits
> > > how often we'll consider the failsafe in the single-pass/no-indexes
> > > case.
> > I don't really understand why it makes sense to tie FAILSAFE_MIN_PAGES
> > and VACUUM_FSM_EVERY_PAGES together? They seem pretty independent to me?
> VACUUM_FSM_EVERY_PAGES controls how often VACUUM does work that
> usually takes place right after the two pass case finishes a round of
> index and heap vacuuming. This is work that we certainly don't want to
> do every time we process a single heap page in the one-pass/no-indexes
> case. Initially this just meant FSM vacuuming, but it now includes a
> failsafe check.
> Of course all of the precise details here are fairly arbitrary
> (including VACUUM_FSM_EVERY_PAGES, which has been around for a couple
> of releases now). The overall goal that I had in mind was to make the
> one-pass case's use of the failsafe have analogous behavior to the
> two-pass/has-indexes case -- a goal which was itself somewhat
> arbitrary.
> > The failsafe mode affects the table scan itself by disabling cost
> > limiting. As far as I can see the ways it triggers for the table scan (vs
> > truncation or index processing) are:
> > 1) Before vacuuming starts, for heap phases and indexes, if already
> > necessary at that point
> > 2) For a table with indexes, before/after each index vacuum, if now
> > necessary
> > 3) On a table without indexes, every 8GB, iff there are dead tuples, if now necessary
> > Why would we want to trigger the failsafe mode during a scan of a table
> > with dead tuples and no indexes, but not on a table without dead tuples
> > or with indexes but fewer than m_w_m dead tuples? That makes little
> > sense to me.
> What alternative does make sense to you?
> It seemed important to put the failsafe check at points where we do
> other analogous work in all cases. We made a pragmatic trade-off. In
> theory almost any scheme might not check often enough, and/or might
> check too frequently.
> > It seems that for the no-index case the warning message is quite off?
> I'll fix that up some point soon. FWIW this happened because the
> support for one-pass VACUUM was added quite late, at Robert's request.
+1 to fix this. Are you already working on fixing this? If not, I'll
post a patch.
> Another issue with the failsafe commit is that we haven't considered
> the autovacuum_multixact_freeze_max_age table reloption -- we only
> check the GUC. That might have accidentally been the right thing to
> do, though, since the reloption is interpreted as lower than the GUC
> in all cases anyway -- arguably the
> autovacuum_multixact_freeze_max_age GUC should be all we care about
> anyway. I will need to think about this question some more, though.
FWIW, I intentionally ignored the reloption there, since it's
interpreted as lower than the GUC as you mentioned, and the situation
where we need to enter failsafe mode is not a table-specific problem
but a system-wide one.
On Tue, May 18, 2021 at 2:42 PM Peter Geoghegan <[hidden email]> wrote:
> On Mon, May 17, 2021 at 10:29 PM Masahiko Sawada <[hidden email]> wrote:
> > +1 to fix this. Are you already working on fixing this? If not, I'll
> > post a patch.
> I posted a patch recently (last Thursday my time). Perhaps you can review it?
Oh, I missed that the patch includes that fix. I'll review the patch.
On Tue, May 18, 2021 at 2:46 PM Masahiko Sawada <[hidden email]> wrote:
> On Tue, May 18, 2021 at 2:42 PM Peter Geoghegan <[hidden email]> wrote:
> > On Mon, May 17, 2021 at 10:29 PM Masahiko Sawada <[hidden email]> wrote:
> > > +1 to fix this. Are you already working on fixing this? If not, I'll
> > > post a patch.
> > I posted a patch recently (last Thursday my time). Perhaps you can review it?
> Oh, I missed that the patch includes that fix. I'll review the patch.
Since there is the condition "vacrel->num_index_scans == 0", we could
enter failsafe mode even if the table is less than 4GB, if we reach
lazy_check_wraparound_failsafe() after executing at least one index
scan. Whereas a vacuum on a table that is less than 4GB and has no
indexes never enters failsafe mode. I think we can remove this
condition, since I don't see a reason why we should disallow entering
failsafe mode only before the first index scan for such tables. What
do you think?
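The inconsistency is easier to see in a tiny sketch (Python purely as
illustration; the real logic lives in C in vacuumlazy.c, and the exact
comparison details here are assumed from the description above):

```python
# Illustrative sketch of the failsafe skip condition discussed above.
# Names loosely mirror the thread's C identifiers; this is not the
# actual PostgreSQL source.
FAILSAFE_MIN_PAGES = (4 * 1024**3) // 8192  # 4GB worth of 8KB heap pages

def failsafe_check_skipped(rel_pages, num_index_scans):
    """True when the wraparound failsafe check is skipped entirely."""
    return rel_pages < FAILSAFE_MIN_PAGES and num_index_scans == 0

small = FAILSAFE_MIN_PAGES - 1  # a table just under 4GB

# Before the first index scan, a small table skips the check...
assert failsafe_check_skipped(small, num_index_scans=0)
# ...but after one index scan the same small table is checked, while a
# small table with no indexes never reaches num_index_scans > 0 at all.
assert not failsafe_check_skipped(small, num_index_scans=1)
```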
On Thu, Jun 10, 2021 at 10:52 AM Andres Freund <[hidden email]> wrote:
I started to write a test for $Subject, which I think we sorely need.
Currently my approach is to:
- start a cluster, create a few tables with test data
- acquire SHARE UPDATE EXCLUSIVE in a prepared transaction, to prevent
autovacuum from doing anything
- cause dead tuples to exist
- run pg_resetwal -x 2000027648
- do things like acquiring pins on pages that block vacuum from progressing
- commit prepared transaction
- wait for template0, template1 datfrozenxid to increase
- wait for relfrozenxid for most relations in postgres to increase
- release buffer pin
- wait for postgres datfrozenxid to increase
Cool. Thank you for working on that!
Could you please share a WIP patch for the $subj? I'd be happy to help with it.
So far so good. But I've encountered a few things that stand in the way of
enabling such a test by default:
1) During startup StartupSUBTRANS() zeroes out all pages between
oldestActiveXID and nextXid. That takes 8s on my workstation, but only
because I have plenty of memory - pg_subtrans ends up 14GB as I currently do
the test. Clearly not something we could do on the BF. ....
3) pg_resetwal -x requires carefully choosing an xid: It needs to be the
first xid on a clog page. It's not hard to determine which xids qualify, but
it depends on BLCKSZ and a few constants in clog.c. For now I've hardcoded a
value appropriate for 8KB, but ...
Maybe we can add a new pg_resetwal option? Something like pg_resetwal --xid-near-wraparound, which would ask pg_resetwal to calculate the exact xid value using values from pg_control and the clog macros?
I think it might come in handy for manual testing too.
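A sketch of the arithmetic such an option would perform (the
--xid-near-wraparound name is only the suggestion above, not an existing
flag; the constants mirror their clog.c counterparts for the default 8KB
BLCKSZ):

```python
# Sketch of the xid alignment a hypothetical --xid-near-wraparound
# option would compute. Constants mirror clog.c for 2-bit commit status.
BLCKSZ = 8192
CLOG_BITS_PER_XACT = 2
CLOG_XACTS_PER_BYTE = 8 // CLOG_BITS_PER_XACT       # 4 xids per byte
CLOG_XACTS_PER_PAGE = BLCKSZ * CLOG_XACTS_PER_BYTE  # 32768 xids per page

def first_xid_on_clog_page(target_xid):
    """Round target_xid down to the first xid of its clog page,
    which is what pg_resetwal -x requires."""
    return (target_xid // CLOG_XACTS_PER_PAGE) * CLOG_XACTS_PER_PAGE

# The value hardcoded in the test above is already page-aligned:
assert 2000027648 % CLOG_XACTS_PER_PAGE == 0
assert first_xid_on_clog_page(2000027648) == 2000027648
```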
I have 2 1/2 ideas about addressing 1):
- We could expose functionality to advance nextXid to a future value at
  runtime, without filling in clog/subtrans pages. It would probably have to
  live in varsup.c and be exposed via regress.so or such?
This option looks scary to me. Several functions rely on the fact that StartupSUBTRANS() has zeroed the pages.
And if we make it conditional just for tests, it means that we won't be testing the real code path.
- The only reason StartupSUBTRANS() does that work is because of the prepared
transaction holding back oldestActiveXID. That transaction in turn exists to
  prevent autovacuum from doing anything before we do test setup.
Perhaps it'd be sufficient to set autovacuum_naptime really high initially,
perform the test setup, set naptime to something lower, and reload the config.
But I'm worried that might not be reliable: If something ends up allocating an
xid we'd potentially reach the path in GetNewTransactionId() that wakes up the
launcher? But probably there wouldn't be anything doing so?
Another aspect that might not make this a good choice is that it actually
seems relevant to be able to test cases where there are very old,
still-running transactions.
Maybe this exact scenario could be covered by a separate long-running test, not included in the buildfarm test suite?