[PATCH] Disable bgworkers during servers start in pg_upgrade

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH] Disable bgworkers during servers start in pg_upgrade

Denis Laxalde
Hello,

We found an issue in pg_upgrade on a cluster with a third-party
background worker. The upgrade goes fine, but the new cluster is then in
an inconsistent state. The background worker comes from the PoWA
extension but the issue does not appear to related to this particular
code.

Here is a shell script to reproduce the issue (error at the end):

  OLDBINDIR=/usr/lib/postgresql/11/bin
  NEWBINDIR=/usr/lib/postgresql/13/bin
 
  OLDDATADIR=$(mktemp -d)
  NEWDATADIR=$(mktemp -d)
 
  $OLDBINDIR/initdb -D $OLDDATADIR
  echo "unix_socket_directories = '/tmp'" >> $OLDDATADIR/postgresql.auto.conf
  echo "shared_preload_libraries = 'pg_stat_statements, powa'" >> $OLDDATADIR/postgresql.auto.conf
  $OLDBINDIR/pg_ctl -D $OLDDATADIR -l $OLDDATADIR/pgsql.log start
  $OLDBINDIR/createdb -h /tmp powa
  $OLDBINDIR/psql -h /tmp -d powa -c "CREATE EXTENSION powa CASCADE"
  $OLDBINDIR/pg_ctl -D $OLDDATADIR -m fast stop
 
  $NEWBINDIR/initdb -D $NEWDATADIR
  cp $OLDDATADIR/postgresql.auto.conf $NEWDATADIR/postgresql.auto.conf
 
  $NEWBINDIR/pg_upgrade --old-datadir $OLDDATADIR --new-datadir $NEWDATADIR --old-bindir $OLDBINDIR
 
  $NEWBINDIR/pg_ctl -D $NEWDATADIR -l $NEWDATADIR/pgsql.log start
  $NEWBINDIR/psql -h /tmp -d powa -c "SELECT 1 FROM powa_snapshot_metas"
  # ERROR:  MultiXactId 1 has not been created yet -- apparent wraparound

(This needs PoWA to be installed; packages are available on pgdg
repositories as postgresql-<pgversion>-powa on Debian or
powa_<pgversion> on RedHat or directly from source code at
https://github.com/powa-team/powa-archivist).

As far as I currently understand, this is due to the data to be migrated
being somewhat inconsistent (from the perspective of pg_upgrade) when
the old cluster and its background workers get started in pg_upgrade
during the "checks" step. (The old cluster remains sane, still.)

As a solution, it seems that, for similar reasons that we restrict
socket access to prevent accidental connections (from commit
f763b77193), we should also prevent background workers to start at this
step.

Please find attached a patch implementing this.

Thanks for considering,
Denis

0001-Disable-background-workers-during-servers-start-in-p.patch (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Disable bgworkers during servers start in pg_upgrade

Andres Freund
Hi,

On 2021-01-21 16:23:58 +0100, Denis Laxalde wrote:
> We found an issue in pg_upgrade on a cluster with a third-party
> background worker. The upgrade goes fine, but the new cluster is then in
> an inconsistent state. The background worker comes from the PoWA
> extension but the issue does not appear to related to this particular
> code.

Well, it does imply that that backgrounder did something, as the pure
existence of a bgworker shouldn't affect

anything. Presumably the issue is that the bgworker actually does
transactional writes, which causes problems because the xids /
multixacts from early during pg_upgrade won't actually be valid after we
do pg_resetxlog etc.


> As a solution, it seems that, for similar reasons that we restrict
> socket access to prevent accidental connections (from commit
> f763b77193), we should also prevent background workers to start at this
> step.

I think that'd have quite the potential for negative impact - imagine
extensions that refuse to be loaded outside of shared_preload_libraries
(e.g. because they need to allocate shared memory) but that are required
during the course of pg_upgrade (e.g. because it's a tableam, a PL or
such). Those libraries will then tried to be loaded during the upgrade
(due to the _PG_init() hook being called when functions from the
extension are needed, e.g. the tableam or PL handler).

Nor is it clear to me that the only way this would be problematic is
with shared_preload_libraries. A library in local_preload_libraries, or
a demand loaded library can trigger bgworkers (or database writes in
some other form) as well.


I wonder if we could

a) set default_transaction_read_only to true, and explicitly change it
   in the sessions that need that.
b) when in binary upgrade mode / -b, error out on all wal writes in
   sessions that don't explicitly set a session-level GUC to allow
   writes.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Disable bgworkers during servers start in pg_upgrade

Denis Laxalde
Hi,

Andres Freund a écrit :

> On 2021-01-21 16:23:58 +0100, Denis Laxalde wrote:
> > We found an issue in pg_upgrade on a cluster with a third-party
> > background worker. The upgrade goes fine, but the new cluster is then in
> > an inconsistent state. The background worker comes from the PoWA
> > extension but the issue does not appear to related to this particular
> > code.
>
> Well, it does imply that that backgrounder did something, as the pure
> existence of a bgworker shouldn't affect
>
> anything. Presumably the issue is that the bgworker actually does
> transactional writes, which causes problems because the xids /
> multixacts from early during pg_upgrade won't actually be valid after we
> do pg_resetxlog etc.
>
>
> > As a solution, it seems that, for similar reasons that we restrict
> > socket access to prevent accidental connections (from commit
> > f763b77193), we should also prevent background workers to start at this
> > step.
>
> I think that'd have quite the potential for negative impact - imagine
> extensions that refuse to be loaded outside of shared_preload_libraries
> (e.g. because they need to allocate shared memory) but that are required
> during the course of pg_upgrade (e.g. because it's a tableam, a PL or
> such). Those libraries will then tried to be loaded during the upgrade
> (due to the _PG_init() hook being called when functions from the
> extension are needed, e.g. the tableam or PL handler).
>
> Nor is it clear to me that the only way this would be problematic is
> with shared_preload_libraries. A library in local_preload_libraries, or
> a demand loaded library can trigger bgworkers (or database writes in
> some other form) as well.
Thank you for those insights!

> I wonder if we could
>
> a) set default_transaction_read_only to true, and explicitly change it
>    in the sessions that need that.
> b) when in binary upgrade mode / -b, error out on all wal writes in
>    sessions that don't explicitly set a session-level GUC to allow
>    writes.

Solution "a" appears to be enough to solve the problem described in my
initial email. See attached patch implementing this.

Cheers,
Denis

0001-Set-default-transactions-to-read-only-at-servers-sta.patch (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Disable bgworkers during servers start in pg_upgrade

Jehan-Guillaume de Rorthais
Hi,

On Wed, 27 Jan 2021 11:25:11 +0100
Denis Laxalde <[hidden email]> wrote:

> Andres Freund a écrit :
> > On 2021-01-21 16:23:58 +0100, Denis Laxalde wrote:  
> > > We found an issue in pg_upgrade on a cluster with a third-party
> > > background worker. The upgrade goes fine, but the new cluster is then in
> > > an inconsistent state. The background worker comes from the PoWA
> > > extension but the issue does not appear to related to this particular
> > > code.  
> >
> > Well, it does imply that that backgrounder did something, as the pure
> > existence of a bgworker shouldn't affect anything. Presumably the issue is
> > that the bgworker actually does transactional writes, which causes problems
> > because the xids / multixacts from early during pg_upgrade won't actually
> > be valid after we do pg_resetxlog etc.

Indeed, it does some writes. As soon as the powa bgworker starts, it takes
"snapshots" of pg_stat_statements state and record them in a local table. To
avoid concurrent run, it takes a lock on some of its local rows using SELECT FOR
UPDATE, hence the mxid consumption.

The inconsistency occurs at least at two place:

* the datminmxid and relminmxid fields pg_dump(all)'ed and restored in the new
  cluster.
* the multixid fields in the controlfile read during the check phase and
  restored later using pg_resetxlog.

> > > As a solution, it seems that, for similar reasons that we restrict
> > > socket access to prevent accidental connections (from commit
> > > f763b77193), we should also prevent background workers to start at this
> > > step.  
> >
> > I think that'd have quite the potential for negative impact - [...]
>
> Thank you for those insights!

+1

> > I wonder if we could
> >
> > a) set default_transaction_read_only to true, and explicitly change it
> >    in the sessions that need that.

According to Denis' tests discussed off-list, it works fine in regard with the
powa bgworker, albeit some complaints in logs. However, I wonder how fragile it
could be as bgworker could easily manipulate either the GUC or "BEGIN READ
WRITE". I realize this is really uncommon practices, but bgworker code from
third parties might be surprising.

> > b) when in binary upgrade mode / -b, error out on all wal writes in
> >    sessions that don't explicitly set a session-level GUC to allow
> >    writes.

It feels safer because more specific to the subject. But I wonder if the
benefice worth adding some (limited?) complexity and a small/quick conditional
block in a very hot code path for a very limited use case.

What about c) where the bgworker are not loaded by default during binary upgrade
mode / -b unless they explicitly set a bgw_flags (BGWORKER_BINARY_UPGRADE ?)
when they are required during pg_upgrade?

Regards,


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Disable bgworkers during servers start in pg_upgrade

Jehan-Guillaume de Rorthais
Oh, I forgot another point before sending my previous email.

Maybe it might worth adding some final safety checks in pg_upgrade itself?
Eg. checking controldata and mxid files coherency between old and new
cluster would have catch the inconsistency here.