Transactions involving multiple postgres foreign servers, take 2

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
112 messages Options
123456
Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
On Wed, Jun 3, 2020 at 12:02 PM Masahiko Sawada
<[hidden email]> wrote:

>
> On Wed, 3 Jun 2020 at 14:50, Amit Kapila <[hidden email]> wrote:
> >
> >
> > If the intention is to keep the first version simple, then why do we
> > want to support any mode other than 'required'?  I think it will limit
> > its usage for the cases where 2PC can be used only when all FDWs
> > involved support Prepare API but if that helps to keep the design and
> > patch simpler then why not just do that for the first version and then
> > extend it later.  OTOH, if you think it will be really useful to keep
> > other modes, then also we could try to keep those in separate patches
> > to facilitate the review and discussion of the core feature.
>
> ‘disabled’ is the fundamental mode. We also need 'disabled' mode,
> otherwise existing FDW won't work.
>

IIUC, if foreign_twophase_commit is 'disabled', we don't use a
two-phase protocol to commit distributed transactions, right?  So, do
we check this at the time of Prepare or Commit whether we need to use
a two-phase protocol?  I think this should be checked at prepare time.

+        <para>
+         This parameter can be changed at any time; the behavior for any one
+         transaction is determined by the setting in effect when it commits.
+        </para>

This is written w.r.t foreign_twophase_commit.  If one changes this
between prepare and commit, will it have any impact?

>  I was concerned that many FDW
> plugins don't implement FDW transaction APIs yet when users start
> using this feature. But it seems to be a good idea to move 'prefer'
> mode to a separate patch while leaving 'required'. I'll do that in the
> next version patch.
>

Okay, thanks.  Please, see if you can separate out the documentation
for that as well.

Few other comments on v21-0003-Documentation-update:
----------------------------------------------------
1.
+      <entry></entry>
+      <entry>
+       Numeric transaction identifier with that this foreign transaction
+       associates
+      </entry>

/with that this/with which this

2.
+      <entry>
+       The OID of the foreign server on that the foreign transaction
is prepared
+      </entry>

/on that the/on which the

3.
+      <entry><structfield>status</structfield></entry>
+      <entry><type>text</type></entry>
+      <entry></entry>
+      <entry>
+       Status of foreign transaction. Possible values are:
+       <itemizedlist>
+        <listitem>
+         <para>
+          <literal>initial</literal> : Initial status.
+         </para>

What exactly "Initial status" means?

4.
+      <entry><structfield>in_doubt</structfield></entry>
+      <entry><type>boolean</type></entry>
+      <entry></entry>
+      <entry>
+       If <literal>true</literal> this foreign transaction is
in-doubt status and
+       needs to be resolved by calling <function>pg_resolve_fdwxact</function>
+       function.
+      </entry>

It would be better if you can add an additional sentence to say when
and or how can foreign transactions reach in-doubt state.

5.
If <literal>N</literal> local transactions each
+         across <literal>K</literal> foreign server this value need to be set

This part of the sentence can be improved by saying something like:
"If a user expects N local transactions and each of those involves K
foreign servers, this value..".

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiko Sawada-2
On Thu, 4 Jun 2020 at 12:46, Amit Kapila <[hidden email]> wrote:

>
> On Wed, Jun 3, 2020 at 12:02 PM Masahiko Sawada
> <[hidden email]> wrote:
> >
> > On Wed, 3 Jun 2020 at 14:50, Amit Kapila <[hidden email]> wrote:
> > >
> > >
> > > If the intention is to keep the first version simple, then why do we
> > > want to support any mode other than 'required'?  I think it will limit
> > > its usage for the cases where 2PC can be used only when all FDWs
> > > involved support Prepare API but if that helps to keep the design and
> > > patch simpler then why not just do that for the first version and then
> > > extend it later.  OTOH, if you think it will be really useful to keep
> > > other modes, then also we could try to keep those in separate patches
> > > to facilitate the review and discussion of the core feature.
> >
> > ‘disabled’ is the fundamental mode.
Oops, I wanted to say 'required' is the fundamental mode.

> > We also need 'disabled' mode,
> > otherwise existing FDW won't work.
> >
>
> IIUC, if foreign_twophase_commit is 'disabled', we don't use a
> two-phase protocol to commit distributed transactions, right?  So, do
> we check this at the time of Prepare or Commit whether we need to use
> a two-phase protocol?  I think this should be checked at prepare time.

When a client executes COMMIT to a distributed transaction, 2pc is
automatically, transparently used. In ‘required’ case, all involved
(and modified) foreign server needs to support 2pc. So if a
distributed transaction modifies data on a foreign server connected
via an existing FDW which doesn’t support 2pc, the transaction cannot
proceed commit, fails at pre-commit phase. So there should be two
modes: ‘disabled’ and ‘required’, and should be ‘disabled’ by default.

>
> +        <para>
> +         This parameter can be changed at any time; the behavior for any one
> +         transaction is determined by the setting in effect when it commits.
> +        </para>
>
> This is written w.r.t foreign_twophase_commit.  If one changes this
> between prepare and commit, will it have any impact?

Since the distributed transaction commit automatically uses 2pc when
executing COMMIT, it's not possible to change foreign_twophase_commit
between prepare and commit. So I'd like to explain the case where a
user executes PREPARE and then COMMIT PREPARED while changing
foreign_twophase_commit.

PREPARE can run only when foreign_twophase_commit is 'required' (or
'prefer') and all foreign servers involved with the transaction
support 2pc. We prepare all foreign transactions no matter what the
number of servers and modified or not. If either
foreign_twophase_commit is 'disabled' or the transaction modifies data
on a foreign server that doesn't support 2pc, it raises an error. At
COMMIT (or ROLLBACK) PREPARED, similarly foreign_twophase_commit needs
to be set to 'required'. It raises an error if the distributed
transaction has a foreign transaction and foreign_twophase_commit is
'disabled'.

>
> >  I was concerned that many FDW
> > plugins don't implement FDW transaction APIs yet when users start
> > using this feature. But it seems to be a good idea to move 'prefer'
> > mode to a separate patch while leaving 'required'. I'll do that in the
> > next version patch.
> >
>
> Okay, thanks.  Please, see if you can separate out the documentation
> for that as well.
>
> Few other comments on v21-0003-Documentation-update:
> ----------------------------------------------------
> 1.
> +      <entry></entry>
> +      <entry>
> +       Numeric transaction identifier with that this foreign transaction
> +       associates
> +      </entry>
>
> /with that this/with which this
>
> 2.
> +      <entry>
> +       The OID of the foreign server on that the foreign transaction
> is prepared
> +      </entry>
>
> /on that the/on which the
>
> 3.
> +      <entry><structfield>status</structfield></entry>
> +      <entry><type>text</type></entry>
> +      <entry></entry>
> +      <entry>
> +       Status of foreign transaction. Possible values are:
> +       <itemizedlist>
> +        <listitem>
> +         <para>
> +          <literal>initial</literal> : Initial status.
> +         </para>
>
> What exactly "Initial status" means?
This part is out-of-date. Fixed.

>
> 4.
> +      <entry><structfield>in_doubt</structfield></entry>
> +      <entry><type>boolean</type></entry>
> +      <entry></entry>
> +      <entry>
> +       If <literal>true</literal> this foreign transaction is
> in-doubt status and
> +       needs to be resolved by calling <function>pg_resolve_fdwxact</function>
> +       function.
> +      </entry>
>
> It would be better if you can add an additional sentence to say when
> and or how can foreign transactions reach in-doubt state.
>
> 5.
> If <literal>N</literal> local transactions each
> +         across <literal>K</literal> foreign server this value need to be set
>
> This part of the sentence can be improved by saying something like:
> "If a user expects N local transactions and each of those involves K
> foreign servers, this value..".
Thanks. I've incorporated all your comments.

I've attached the new version patch set. 0006 is a separate patch
which introduces 'prefer' mode to foreign_twophase_commit.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v22-0006-Add-prefer-mode-to-foreign_twophase_commit.patch (16K) Download Attachment
v22-0005-Add-regression-tests-for-foreign-twophase-commit.patch (61K) Download Attachment
v22-0004-postgres_fdw-supports-atomic-commit-APIs.patch (65K) Download Attachment
v22-0002-Support-atomic-commit-among-multiple-foreign-ser.patch (254K) Download Attachment
v22-0003-Documentation-update.patch (53K) Download Attachment
v22-0001-Keep-track-of-writing-on-non-temporary-relation.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
On Fri, Jun 5, 2020 at 3:16 PM Masahiko Sawada
<[hidden email]> wrote:

>
> On Thu, 4 Jun 2020 at 12:46, Amit Kapila <[hidden email]> wrote:
> >
> >
> > +        <para>
> > +         This parameter can be changed at any time; the behavior for any one
> > +         transaction is determined by the setting in effect when it commits.
> > +        </para>
> >
> > This is written w.r.t foreign_twophase_commit.  If one changes this
> > between prepare and commit, will it have any impact?
>
> Since the distributed transaction commit automatically uses 2pc when
> executing COMMIT, it's not possible to change foreign_twophase_commit
> between prepare and commit. So I'd like to explain the case where a
> user executes PREPARE and then COMMIT PREPARED while changing
> foreign_twophase_commit.
>
> PREPARE can run only when foreign_twophase_commit is 'required' (or
> 'prefer') and all foreign servers involved with the transaction
> support 2pc. We prepare all foreign transactions no matter what the
> number of servers and modified or not. If either
> foreign_twophase_commit is 'disabled' or the transaction modifies data
> on a foreign server that doesn't support 2pc, it raises an error. At
> COMMIT (or ROLLBACK) PREPARED, similarly foreign_twophase_commit needs
> to be set to 'required'. It raises an error if the distributed
> transaction has a foreign transaction and foreign_twophase_commit is
> 'disabled'.
>

So, IIUC, it will raise an error if foreign_twophase_commit is
'disabled' (or one of the foreign server involved doesn't support 2PC)
and the error can be raised both when user issues PREPARE or COMMIT
(or ROLLBACK) PREPARED.  If so, isn't it strange that we raise such an
error after PREPARE?  What kind of use-case required this?

>
> >
> > 4.
> > +      <entry><structfield>in_doubt</structfield></entry>
> > +      <entry><type>boolean</type></entry>
> > +      <entry></entry>
> > +      <entry>
> > +       If <literal>true</literal> this foreign transaction is
> > in-doubt status and
> > +       needs to be resolved by calling <function>pg_resolve_fdwxact</function>
> > +       function.
> > +      </entry>
> >
> > It would be better if you can add an additional sentence to say when
> > and or how can foreign transactions reach in-doubt state.
> >

+       If <literal>true</literal> this foreign transaction is in-doubt status.
+       A foreign transaction becomes in-doubt status when user canceled the
+       query during transaction commit or the server crashed during transaction
+       commit.

Can we reword the second sentence as: "A foreign transaction can have
this status when the user has cancelled the statement or the server
crashes during transaction commit."?   I have another question about
this field, why can't it be one of the status ('preparing',
'prepared', 'committing', 'aborting', 'in-doubt') rather than having a
separate field?  Also, isn't it more suitable to name 'status' field
as 'state' because these appear to be more like different states of
transaction?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiko Sawada-2
On Thu, 11 Jun 2020 at 22:21, Amit Kapila <[hidden email]> wrote:

>
> On Fri, Jun 5, 2020 at 3:16 PM Masahiko Sawada
> <[hidden email]> wrote:
> >
> > On Thu, 4 Jun 2020 at 12:46, Amit Kapila <[hidden email]> wrote:
> > >
> > >
> > > +        <para>
> > > +         This parameter can be changed at any time; the behavior for any one
> > > +         transaction is determined by the setting in effect when it commits.
> > > +        </para>
> > >
> > > This is written w.r.t foreign_twophase_commit.  If one changes this
> > > between prepare and commit, will it have any impact?
> >
> > Since the distributed transaction commit automatically uses 2pc when
> > executing COMMIT, it's not possible to change foreign_twophase_commit
> > between prepare and commit. So I'd like to explain the case where a
> > user executes PREPARE and then COMMIT PREPARED while changing
> > foreign_twophase_commit.
> >
> > PREPARE can run only when foreign_twophase_commit is 'required' (or
> > 'prefer') and all foreign servers involved with the transaction
> > support 2pc. We prepare all foreign transactions no matter what the
> > number of servers and modified or not. If either
> > foreign_twophase_commit is 'disabled' or the transaction modifies data
> > on a foreign server that doesn't support 2pc, it raises an error. At
> > COMMIT (or ROLLBACK) PREPARED, similarly foreign_twophase_commit needs
> > to be set to 'required'. It raises an error if the distributed
> > transaction has a foreign transaction and foreign_twophase_commit is
> > 'disabled'.
> >
>
> So, IIUC, it will raise an error if foreign_twophase_commit is
> 'disabled' (or one of the foreign server involved doesn't support 2PC)
> and the error can be raised both when user issues PREPARE or COMMIT
> (or ROLLBACK) PREPARED.  If so, isn't it strange that we raise such an
> error after PREPARE?  What kind of use-case required this?
>

I don’t concrete use-case but the reason why it raises an error when a
user setting foreign_twophase_commit to 'disabled' executes COMMIT (or
ROLLBACK) PREPARED within the transaction involving at least one
foreign server is that I wanted to make it behaves in a similar way of
COMMIT case. I mean, if a user executes just COMMIT, the distributed
transaction is committed in two phases but the value of
foreign_twophase_commit is not changed during these two phases. So I
wanted to require user to set foreign_twophase_commit to ‘required’
both when executing PREPARE and executing COMMIT (or ROLLBACK)
PREPARED. Implementation also can become simple because we can assume
that foreign_twophase_commit is always enabled when a transaction
requires foreign transaction preparation and resolution.

> >
> > >
> > > 4.
> > > +      <entry><structfield>in_doubt</structfield></entry>
> > > +      <entry><type>boolean</type></entry>
> > > +      <entry></entry>
> > > +      <entry>
> > > +       If <literal>true</literal> this foreign transaction is
> > > in-doubt status and
> > > +       needs to be resolved by calling <function>pg_resolve_fdwxact</function>
> > > +       function.
> > > +      </entry>
> > >
> > > It would be better if you can add an additional sentence to say when
> > > and or how can foreign transactions reach in-doubt state.
> > >
>
> +       If <literal>true</literal> this foreign transaction is in-doubt status.
> +       A foreign transaction becomes in-doubt status when user canceled the
> +       query during transaction commit or the server crashed during transaction
> +       commit.
>
> Can we reword the second sentence as: "A foreign transaction can have
> this status when the user has cancelled the statement or the server
> crashes during transaction commit."?

Agreed. Updated in my local branch.

>  I have another question about
> this field, why can't it be one of the status ('preparing',
> 'prepared', 'committing', 'aborting', 'in-doubt') rather than having a
> separate field?

Because I'm using in-doubt field also for checking if the foreign
transaction entry can also be resolved manually, i.g.
pg_resolve_foreign_xact(). For instance, a foreign transaction which
status = 'prepared' and in-doubt = 'true' can be resolved either
foreign transaction resolver or pg_resolve_foreign_xact(). When a user
execute pg_resolve_foreign_xact() against the foreign transaction, it
sets status = 'committing' (or 'rollbacking') by checking transaction
status in clog. The user might cancel pg_resolve_foreign_xact() during
resolution. In this case, the foreign transaction is still status =
'committing' and in-doubt = 'true'. Then if a foreign transaction
resolver process processes the foreign transaction, it can commit it
without clog looking.

> Also, isn't it more suitable to name 'status' field
> as 'state' because these appear to be more like different states of
> transaction?

Agreed.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
On Fri, Jun 12, 2020 at 7:59 AM Masahiko Sawada
<[hidden email]> wrote:

>
> On Thu, 11 Jun 2020 at 22:21, Amit Kapila <[hidden email]> wrote:
> >
>
> >  I have another question about
> > this field, why can't it be one of the status ('preparing',
> > 'prepared', 'committing', 'aborting', 'in-doubt') rather than having a
> > separate field?
>
> Because I'm using in-doubt field also for checking if the foreign
> transaction entry can also be resolved manually, i.g.
> pg_resolve_foreign_xact(). For instance, a foreign transaction which
> status = 'prepared' and in-doubt = 'true' can be resolved either
> foreign transaction resolver or pg_resolve_foreign_xact(). When a user
> execute pg_resolve_foreign_xact() against the foreign transaction, it
> sets status = 'committing' (or 'rollbacking') by checking transaction
> status in clog. The user might cancel pg_resolve_foreign_xact() during
> resolution. In this case, the foreign transaction is still status =
> 'committing' and in-doubt = 'true'. Then if a foreign transaction
> resolver process processes the foreign transaction, it can commit it
> without clog looking.
>

I think this is a corner case and it is better to simplify the state
recording of foreign transactions then to save a CLOG lookup.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiko Sawada-2
On Fri, 12 Jun 2020 at 12:40, Amit Kapila <[hidden email]> wrote:

>
> On Fri, Jun 12, 2020 at 7:59 AM Masahiko Sawada
> <[hidden email]> wrote:
> >
> > On Thu, 11 Jun 2020 at 22:21, Amit Kapila <[hidden email]> wrote:
> > >
> >
> > >  I have another question about
> > > this field, why can't it be one of the status ('preparing',
> > > 'prepared', 'committing', 'aborting', 'in-doubt') rather than having a
> > > separate field?
> >
> > Because I'm using in-doubt field also for checking if the foreign
> > transaction entry can also be resolved manually, i.g.
> > pg_resolve_foreign_xact(). For instance, a foreign transaction which
> > status = 'prepared' and in-doubt = 'true' can be resolved either
> > foreign transaction resolver or pg_resolve_foreign_xact(). When a user
> > execute pg_resolve_foreign_xact() against the foreign transaction, it
> > sets status = 'committing' (or 'rollbacking') by checking transaction
> > status in clog. The user might cancel pg_resolve_foreign_xact() during
> > resolution. In this case, the foreign transaction is still status =
> > 'committing' and in-doubt = 'true'. Then if a foreign transaction
> > resolver process processes the foreign transaction, it can commit it
> > without clog looking.
> >
>
> I think this is a corner case and it is better to simplify the state
> recording of foreign transactions then to save a CLOG lookup.
>

The main usage of in-doubt flag is to distinguish between in-doubt
transactions and other transactions that have their waiter (I call
on-line transactions).  If one foreign server downs for a long time
after the server crash during distributed transaction commit, foreign
transaction resolver tries to resolve the foreign transaction but
fails because the foreign server doesn’t respond. We’d like to avoid
the situation where a resolver process always picks up that foreign
transaction and other on-online transactions waiting to be resolved
cannot move forward. Therefore, a resolver process prioritizes online
transactions. Once the shmem queue having on-line transactions becomes
empty, a resolver process looks at the array of foreign transaction
state to get in-doubt transactions to resolve. I think we should not
process both in-doubt transactions and on-line transactions in the
same way.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
On Fri, Jun 12, 2020 at 9:54 AM Masahiko Sawada
<[hidden email]> wrote:

>
> On Fri, 12 Jun 2020 at 12:40, Amit Kapila <[hidden email]> wrote:
> >
> > On Fri, Jun 12, 2020 at 7:59 AM Masahiko Sawada
> > <[hidden email]> wrote:
> > >
> > > On Thu, 11 Jun 2020 at 22:21, Amit Kapila <[hidden email]> wrote:
> > > >
> > >
> > > >  I have another question about
> > > > this field, why can't it be one of the status ('preparing',
> > > > 'prepared', 'committing', 'aborting', 'in-doubt') rather than having a
> > > > separate field?
> > >
> > > Because I'm using in-doubt field also for checking if the foreign
> > > transaction entry can also be resolved manually, i.g.
> > > pg_resolve_foreign_xact(). For instance, a foreign transaction which
> > > status = 'prepared' and in-doubt = 'true' can be resolved either
> > > foreign transaction resolver or pg_resolve_foreign_xact(). When a user
> > > execute pg_resolve_foreign_xact() against the foreign transaction, it
> > > sets status = 'committing' (or 'rollbacking') by checking transaction
> > > status in clog. The user might cancel pg_resolve_foreign_xact() during
> > > resolution. In this case, the foreign transaction is still status =
> > > 'committing' and in-doubt = 'true'. Then if a foreign transaction
> > > resolver process processes the foreign transaction, it can commit it
> > > without clog looking.
> > >
> >
> > I think this is a corner case and it is better to simplify the state
> > recording of foreign transactions then to save a CLOG lookup.
> >
>
> The main usage of in-doubt flag is to distinguish between in-doubt
> transactions and other transactions that have their waiter (I call
> on-line transactions).
>

Which are these other online transactions?  I had assumed that foreign
transaction resolver process is to resolve in-doubt transactions but
it seems it is also used for some other purpose which anyway was the
next question I had while reviewing other sections of docs but let's
clarify as it came up now.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiko Sawada-2
On Fri, 12 Jun 2020 at 15:37, Amit Kapila <[hidden email]> wrote:

>
> On Fri, Jun 12, 2020 at 9:54 AM Masahiko Sawada
> <[hidden email]> wrote:
> >
> > On Fri, 12 Jun 2020 at 12:40, Amit Kapila <[hidden email]> wrote:
> > >
> > > On Fri, Jun 12, 2020 at 7:59 AM Masahiko Sawada
> > > <[hidden email]> wrote:
> > > >
> > > > On Thu, 11 Jun 2020 at 22:21, Amit Kapila <[hidden email]> wrote:
> > > > >
> > > >
> > > > >  I have another question about
> > > > > this field, why can't it be one of the status ('preparing',
> > > > > 'prepared', 'committing', 'aborting', 'in-doubt') rather than having a
> > > > > separate field?
> > > >
> > > > Because I'm using in-doubt field also for checking if the foreign
> > > > transaction entry can also be resolved manually, i.g.
> > > > pg_resolve_foreign_xact(). For instance, a foreign transaction which
> > > > status = 'prepared' and in-doubt = 'true' can be resolved either
> > > > foreign transaction resolver or pg_resolve_foreign_xact(). When a user
> > > > execute pg_resolve_foreign_xact() against the foreign transaction, it
> > > > sets status = 'committing' (or 'rollbacking') by checking transaction
> > > > status in clog. The user might cancel pg_resolve_foreign_xact() during
> > > > resolution. In this case, the foreign transaction is still status =
> > > > 'committing' and in-doubt = 'true'. Then if a foreign transaction
> > > > resolver process processes the foreign transaction, it can commit it
> > > > without clog looking.
> > > >
> > >
> > > I think this is a corner case and it is better to simplify the state
> > > recording of foreign transactions then to save a CLOG lookup.
> > >
> >
> > The main usage of in-doubt flag is to distinguish between in-doubt
> > transactions and other transactions that have their waiter (I call
> > on-line transactions).
> >
>
> Which are these other online transactions?  I had assumed that foreign
> transaction resolver process is to resolve in-doubt transactions but
> it seems it is also used for some other purpose which anyway was the
> next question I had while reviewing other sections of docs but let's
> clarify as it came up now.

When a distributed transaction is committed by COMMIT command, the
postgres backend process prepare all foreign transaction and commit
the local transaction. Then the backend enqueue itself to the shmem
queue, asks a resolver process for committing the prepared foreign
transaction, and wait. That is, these prepared foreign transactions
are committed by the resolver process, not backend process. Once the
resolver process committed all prepared foreign transactions, it wakes
the waiting backend process. I meant this kind of transaction is
on-line transactions. This procedure is similar to what synchronous
replication does.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
On Fri, Jun 12, 2020 at 2:10 PM Masahiko Sawada
<[hidden email]> wrote:

>
> On Fri, 12 Jun 2020 at 15:37, Amit Kapila <[hidden email]> wrote:
> >
> > > >
> > > > I think this is a corner case and it is better to simplify the state
> > > > recording of foreign transactions then to save a CLOG lookup.
> > > >
> > >
> > > The main usage of in-doubt flag is to distinguish between in-doubt
> > > transactions and other transactions that have their waiter (I call
> > > on-line transactions).
> > >
> >
> > Which are these other online transactions?  I had assumed that foreign
> > transaction resolver process is to resolve in-doubt transactions but
> > it seems it is also used for some other purpose which anyway was the
> > next question I had while reviewing other sections of docs but let's
> > clarify as it came up now.
>
> When a distributed transaction is committed by COMMIT command, the
> postgres backend process prepare all foreign transaction and commit
> the local transaction.
>

Does this mean that we will mark the xid as committed in CLOG of the
local server?  If so, why is this okay till we commit transactions in
all the foreign servers, what if we fail to commit on one of the
servers?

Few more comments on v22-0003-Documentation-update
--------------------------------------------------------------------------------------
1.
+          When <literal>disabled</literal> there can be risk of database
+          consistency among all servers that involved in the distributed
+          transaction when some foreign server crashes during committing the
+          distributed transaction.

Will it read better if rephrase above to something like: "When
<literal>disabled</literal> there can be a risk of database
consistency if one or more foreign servers crashes while committing
the distributed transaction."?

2.
+      <varlistentry
id="guc-foreign-transaction-resolution-rety-interval"
xreflabel="foreign_transaction_resolution_retry_interval">
+       <term><varname>foreign_transaction_resolution_retry_interval</varname>
(<type>integer</type>)
+        <indexterm>
+         <primary><varname>foreign_transaction_resolution_interval</varname>
configuration parameter</primary>
+        </indexterm>
+       </term>
+       <listitem>
+        <para>
+         Specify how long the foreign transaction resolver should
wait when the last resolution
+         fails before retrying to resolve foreign transaction. This
parameter can only be set in the
+         <filename>postgresql.conf</filename> file or on the server
command line.
+        </para>
+        <para>
+         The default value is 10 seconds.
+        </para>
+       </listitem>
+      </varlistentry>

Typo.  <varlistentry
id="guc-foreign-transaction-resolution-rety-interval", spelling of
retry is wrong.  Do we really need such a guc parameter?  I think we
can come up with some simple algorithm to retry after a few seconds
and then increase that interval of retry if we fail again or something
like that.  I don't know how users can come up with some non-default
value for this variable.

3
+      <varlistentry id="guc-foreign-transaction-resolver-timeout"
xreflabel="foreign_transaction_resolver_timeout">
+       <term><varname>foreign_transaction_resolver_timeout</varname>
(<type>integer</type>)
+        <indexterm>
+         <primary><varname>foreign_transaction_resolver_timeout</varname>
configuration parameter</primary>
+        </indexterm>
+       </term>
+       <listitem>
+        <para>
+         Terminate foreign transaction resolver processes that don't
have any foreign
+         transactions to resolve longer than the specified number of
milliseconds.
+         A value of zero disables the timeout mechanism, meaning it
connects to one
+         database until stopping manually.

Can we mention the function name using which one can stop the resolver process?

4.
+   Using the <productname>PostgreSQL</productname>'s atomic commit ensures that
+   all changes on foreign servers end in either commit or rollback using the
+   transaction callback routines

Can we slightly rephase this "Using the PostgreSQL's atomic commit
ensures that all the changes on foreign servers are either committed
or rolled back using the transaction callback routines"?

5.
+       Prepare all transactions on foreign servers.
+       <productname>PostgreSQL</productname> distributed transaction manager
+       prepares all transaction on the foreign servers if two-phase commit is
+       required. Two-phase commit is required when the transaction modifies
+       data on two or more servers including the local server itself and
+       <xref linkend="guc-foreign-twophase-commit"/> is
+       <literal>required</literal>.

/PostgreSQL/PostgreSQL's.

 If all preparations on foreign servers got
+       successful go to the next step.

How about "If the prepare on all foreign servers is successful then go
to the next step"?

 Any failure happens in this step,
+       the server changes to rollback, then rollback all transactions on both
+       local and foreign servers.

Can we rephrase this line to something like: "If there is any failure
in the prepare phase, the server will rollback all the transactions on
both local and foreign servers."?

What if the issued Rollback also failed, say due to network breakdown
between local and one of foreign servers?  Shouldn't such a
transaction be 'in-doubt' state?

6.
+      <para>
+       Commit locally. The server commits transaction locally.  Any
failure happens
+       in this step the server changes to rollback, then rollback all
transactions
+       on both local and foreign servers.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Resolve all prepared transaction on foreign servers. Pprepared
transactions
+       are committed or rolled back according to the result of the
local transaction.
+       This step is normally performed by a foreign transaction
resolver process.
+      </para>

When (in which step) do we commit on foreign servers?  Do Resolver
processes commit on foreign servers, if so, how can we commit locally
without committing on foreign servers, what if the commit on one of
the servers fails? It is not very clear to me from the steps mentioned
here?  Typo, /Pprepared/Prepared

7.
However, foreign transactions
+    become <firstterm>in-doubt</firstterm> in three cases: where the foreign
+    server crashed or lost the connectibility to it during preparing foreign
+    transaction, where the local node crashed during either preparing or
+    resolving foreign transaction and where user canceled the query.

Here the three cases are not very clear.  You might want to use (a)
..., (b) .. ,(c)..  Also, I think the state will be in-doubt even when
we lost connection to server during commit or rollback.

8.
+    One foreign transaction resolver is responsible for transaction resolutions
+    on which one database connecting.

Can we rephrase it to: "One foreign transaction resolver is
responsible for transaction resolutions on the database to which it is
connected."?

9.
+    Note that other <productname>PostgreSQL</productname> feature
such as parallel
+    queries, logical replication, etc., also take worker slots from
+    <varname>max_worker_processes</varname>.

/feature/features

10.
+   <para>
+    Atomic commit requires several configuration options to be set.
+    On the local node, <xref
linkend="guc-max-prepared-foreign-transactions"/> and
+    <xref linkend="guc-max-foreign-transaction-resolvers"/> must be
non-zero value.
+    Additionally the <varname>max_worker_processes</varname> may need
to be adjusted to
+    accommodate for foreign transaction resolver workers, at least
+    (<varname>max_foreign_transaction_resolvers</varname> +
<literal>1</literal>).
+    Note that other <productname>PostgreSQL</productname> feature
such as parallel
+    queries, logical replication, etc., also take worker slots from
+    <varname>max_worker_processes</varname>.
+   </para>

Don't we need to mention foreign_twophase_commit GUC here?

11.
+   <sect2 id="fdw-callbacks-transaction-managements">
+    <title>FDW Routines For Transaction Managements</title>

Managements/Management?

12.
+     Transaction management callbacks are used for doing commit, rollback and
+     prepare the foreign transaction.

Lets write the above sentence as: "Transaction management callbacks
are used to commit, rollback and prepare the foreign transaction."

13.
+    <para>
+     Transaction management callbacks are used for doing commit, rollback and
+     prepare the foreign transaction. If an FDW wishes that its foreign
+     transaction is managed by <productname>PostgreSQL</productname>'s global
+     transaction manager it must provide both
+     <function>CommitForeignTransaction</function> and
+     <function>RollbackForeignTransaction</function>. In addition, if an FDW
+     wishes to support <firstterm>atomic commit</firstterm> (as described in
+     <xref linkend="fdw-transaction-managements"/>), it must provide
+     <function>PrepareForeignTransaction</function> as well and can provide
+     <function>GetPrepareId</function> callback optionally.
+    </para>

What exact functionality a FDW can accomplish if it just supports
CommitForeignTransaction and RollbackForeignTransaction?  It seems it
doesn't care for 2PC, if so, is there any special functionality we can
achieve with this which we can't do without these APIs?

14.
+PrepareForeignTransaction(FdwXactRslvState *frstate);
+</programlisting>
+    Prepare the transaction on the foreign server. This function is
called at the
+    pre-commit phase of the local transactions if foreign twophase commit is
+    required. This function is used only for distribute transaction management
+    (see <xref linkend="distributed-transaction"/>).
+    </para>

/distribute/distributed

15.
+   <sect2 id="fdw-transaction-commit-rollback">
+    <title>Commit And Rollback Single Foreign Transaction</title>
+    <para>
+     The FDW callback function <literal>CommitForeignTransaction</literal>
+     and <literal>RollbackForeignTransaction</literal> can be used to commit
+     and rollback the foreign transaction. During transaction commit, the core
+     transaction manager calls
<literal>CommitForeignTransaction</literal> function
+     in the pre-commit phase and calls
+     <literal>RollbackForeignTransaction</literal> function in the
post-rollback
+     phase.
+    </para>

There is no reasoning mentioned as to why CommitForeignTransaction has
to be called in pre-commit phase and RollbackForeignTransaction in
post-rollback phase?  Basically why one in pre phase and other in post
phase?

16.
+       <entry>
+        <literal><function>pg_remove_foreign_xact(<parameter>transaction</parameter>
<type>xid</type>, <parameter>serverid</parameter> <type>oid</type>,
<parameter>userid</parameter> <type>oid</type>)</function></literal>
+       </entry>
+       <entry><type>void</type></entry>
+       <entry>
+        This function works the same as
<function>pg_resolve_foreign_xact</function>
+        except that this removes the foreign transcation entry
without resolution.
+       </entry>

Can we write why and when such a function can be used?  Typo,
/trasnaction/transaction

17.
+     <row>
+      <entry><literal>FdwXactResolutionLock</literal></entry>
+      <entry>Waiting to read or update information of foreign trasnaction
+       resolution.</entry>
+     </row>

/trasnaction/transaction


--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiko Sawada-2
On Fri, 12 Jun 2020 at 19:24, Amit Kapila <[hidden email]> wrote:

>
> On Fri, Jun 12, 2020 at 2:10 PM Masahiko Sawada
> <[hidden email]> wrote:
> >
> > On Fri, 12 Jun 2020 at 15:37, Amit Kapila <[hidden email]> wrote:
> > >
> > > > >
> > > > > I think this is a corner case and it is better to simplify the state
> > > > > recording of foreign transactions then to save a CLOG lookup.
> > > > >
> > > >
> > > > The main usage of in-doubt flag is to distinguish between in-doubt
> > > > transactions and other transactions that have their waiter (I call
> > > > on-line transactions).
> > > >
> > >
> > > Which are these other online transactions?  I had assumed that foreign
> > > transaction resolver process is to resolve in-doubt transactions but
> > > it seems it is also used for some other purpose which anyway was the
> > > next question I had while reviewing other sections of docs but let's
> > > clarify as it came up now.
> >
> > When a distributed transaction is committed by COMMIT command, the
> > postgres backend process prepare all foreign transaction and commit
> > the local transaction.
> >

Thank you for your review comments! Let me answer your question first.
I'll see the review comments.

>
> Does this mean that we will mark the xid as committed in CLOG of the
> local server?

Well what I meant is that when the client executes COMMIT command, the
backend executes PREPARE TRANSACTION command on all involved foreign
servers and then marks the xid as committed in clog in the local
server.

>   If so, why is this okay till we commit transactions in
> all the foreign servers, what if we fail to commit on one of the
> servers?

Once the local transaction is committed, all involved foreign
transactions never be rolled back. The backend already prepared all
foreign transaction before local commit, committing prepared foreign
transaction basically doesn't fail. But even if it fails for whatever
reason, we never rollback the all prepared foreign transactions. A
resolver tries to commit foreign transactions at certain intervals.
Does it answer your question?

Regard,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
On Fri, Jun 12, 2020 at 6:24 PM Masahiko Sawada
<[hidden email]> wrote:

>
> On Fri, 12 Jun 2020 at 19:24, Amit Kapila <[hidden email]> wrote:
> >
> > > > Which are these other online transactions?  I had assumed that foreign
> > > > transaction resolver process is to resolve in-doubt transactions but
> > > > it seems it is also used for some other purpose which anyway was the
> > > > next question I had while reviewing other sections of docs but let's
> > > > clarify as it came up now.
> > >
> > > When a distributed transaction is committed by COMMIT command, the
> > > postgres backend process prepare all foreign transaction and commit
> > > the local transaction.
> > >
>
> Thank you for your review comments! Let me answer your question first.
> I'll see the review comments.
>
> >
> > Does this mean that we will mark the xid as committed in CLOG of the
> > local server?
>
> Well what I meant is that when the client executes COMMIT command, the
> backend executes PREPARE TRANSACTION command on all involved foreign
> servers and then marks the xid as committed in clog in the local
> server.
>

Won't it create an inconsistency in viewing the data from the
different servers?  Say, such a transaction inserts one row into a
local server and another into the foreign server.  Now, if we follow
the above protocol, the user will be able to see the row from the
local server but not from the foreign server.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiko Sawada-2
On Sat, 13 Jun 2020 at 14:02, Amit Kapila <[hidden email]> wrote:

>
> On Fri, Jun 12, 2020 at 6:24 PM Masahiko Sawada
> <[hidden email]> wrote:
> >
> > On Fri, 12 Jun 2020 at 19:24, Amit Kapila <[hidden email]> wrote:
> > >
> > > > > Which are these other online transactions?  I had assumed that foreign
> > > > > transaction resolver process is to resolve in-doubt transactions but
> > > > > it seems it is also used for some other purpose which anyway was the
> > > > > next question I had while reviewing other sections of docs but let's
> > > > > clarify as it came up now.
> > > >
> > > > When a distributed transaction is committed by COMMIT command, the
> > > > postgres backend process prepare all foreign transaction and commit
> > > > the local transaction.
> > > >
> >
> > Thank you for your review comments! Let me answer your question first.
> > I'll see the review comments.
> >
> > >
> > > Does this mean that we will mark the xid as committed in CLOG of the
> > > local server?
> >
> > Well what I meant is that when the client executes COMMIT command, the
> > backend executes PREPARE TRANSACTION command on all involved foreign
> > servers and then marks the xid as committed in clog in the local
> > server.
> >
>
> Won't it create an inconsistency in viewing the data from the
> different servers?  Say, such a transaction inserts one row into a
> local server and another into the foreign server.  Now, if we follow
> the above protocol, the user will be able to see the row from the
> local server but not from the foreign server.

Yes, you're right. This atomic commit feature doesn't guarantee such
consistent visibility so-called atomic visibility. Even the local
server is not modified, since a resolver process commits prepared
foreign transactions one by one another user could see an inconsistent
result. Providing globally consistent snapshots to transactions
involving foreign servers is one of the solutions.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Tatsuo Ishii-3
>> Won't it create an inconsistency in viewing the data from the
>> different servers?  Say, such a transaction inserts one row into a
>> local server and another into the foreign server.  Now, if we follow
>> the above protocol, the user will be able to see the row from the
>> local server but not from the foreign server.
>
> Yes, you're right. This atomic commit feature doesn't guarantee such
> consistent visibility so-called atomic visibility. Even the local
> server is not modified, since a resolver process commits prepared
> foreign transactions one by one another user could see an inconsistent
> result. Providing globally consistent snapshots to transactions
> involving foreign servers is one of the solutions.

Another approach to the atomic visibility problem is to control
snapshot acquisition timing and commit timing (plus using REPEATABLE
READ). In the REPEATABLE READ transaction isolation level, PostgreSQL
assigns a snapshot at the time when the first command is executed in a
transaction. If we could prevent any commit while any transaction is
acquiring snapshot, and we could prevent any snapshot acquisition while
committing, visibility inconsistency which Amit explained can be
avoided.

This approach was proposed in a academic paper [1].

Good point with the approach is, we don't need to modify PostgreSQL at
all.

Downside of the approach is, we need someone who controls the timings
(in [1], a middleware called "Pangea" was proposed). Also we need to
limit the transaction isolation level to REPEATABLE READ.

[1] http://www.vldb.org/pvldb/vol2/vldb09-694.pdf

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
On Sun, Jun 14, 2020 at 2:21 PM Tatsuo Ishii <[hidden email]> wrote:
>
> >> Won't it create an inconsistency in viewing the data from the
> >> different servers?  Say, such a transaction inserts one row into a
> >> local server and another into the foreign server.  Now, if we follow
> >> the above protocol, the user will be able to see the row from the
> >> local server but not from the foreign server.
> >
> > Yes, you're right. This atomic commit feature doesn't guarantee such
> > consistent visibility so-called atomic visibility.

Okay, I understand that the purpose of this feature is to provide
atomic commit which means the transaction on all servers involved will
either commit or rollback.  However, I think we should at least see at
a high level how the visibility will work because it might influence
the implementation of this feature.

> > Even the local
> > server is not modified, since a resolver process commits prepared
> > foreign transactions one by one another user could see an inconsistent
> > result. Providing globally consistent snapshots to transactions
> > involving foreign servers is one of the solutions.

How would it be able to do that?  Say, when it decides to take a
snapshot the transaction on the foreign server appears to be committed
but the transaction on the local server won't appear to be committed,
so the consistent data visibility problem as mentioned above could
still arise.

>
> Another approach to the atomic visibility problem is to control
> snapshot acquisition timing and commit timing (plus using REPEATABLE
> READ). In the REPEATABLE READ transaction isolation level, PostgreSQL
> assigns a snapshot at the time when the first command is executed in a
> transaction. If we could prevent any commit while any transaction is
> acquiring snapshot, and we could prevent any snapshot acquisition while
> committing, visibility inconsistency which Amit explained can be
> avoided.
>

I think the problem mentioned above can occur with this as well or if
I am missing something then can you explain in further detail how it
won't create problem in the scenario I have used above?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Tatsuo Ishii-3
>> Another approach to the atomic visibility problem is to control
>> snapshot acquisition timing and commit timing (plus using REPEATABLE
>> READ). In the REPEATABLE READ transaction isolation level, PostgreSQL
>> assigns a snapshot at the time when the first command is executed in a
>> transaction. If we could prevent any commit while any transaction is
>> acquiring snapshot, and we could prevent any snapshot acquisition while
>> committing, visibility inconsistency which Amit explained can be
>> avoided.
>>
>
> I think the problem mentioned above can occur with this as well or if
> I am missing something then can you explain in further detail how it
> won't create problem in the scenario I have used above?

So the problem you mentioned above is like this? (S1/S2 denotes
transactions (sessions), N1/N2 is the postgreSQL servers).  Since S1
already committed on N1, S2 sees the row on N1.  However S2 does not
see the row on N2 since S1 has not committed on N2 yet.

S1/N1: DROP TABLE t1;
DROP TABLE
S1/N1: CREATE TABLE t1(i int);
CREATE TABLE
S1/N2: DROP TABLE t1;
DROP TABLE
S1/N2: CREATE TABLE t1(i int);
CREATE TABLE
S1/N1: BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN
S1/N2: BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN
S2/N1: BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN
S1/N1: INSERT INTO t1 VALUES (1);
INSERT 0 1
S1/N2: INSERT INTO t1 VALUES (1);
INSERT 0 1
S1/N1: PREPARE TRANSACTION 's1n1';
PREPARE TRANSACTION
S1/N2: PREPARE TRANSACTION 's1n2';
PREPARE TRANSACTION
S2/N1: PREPARE TRANSACTION 's2n1';
PREPARE TRANSACTION
S1/N1: COMMIT PREPARED 's1n1';
COMMIT PREPARED
S2/N1: SELECT * FROM t1; -- see the row
 i
---
 1
(1 row)

S2/N2: SELECT * FROM t1; -- doesn't see the row
 i
---
(0 rows)

S1/N2: COMMIT PREPARED 's1n2';
COMMIT PREPARED
S2/N1: COMMIT PREPARED 's2n1';
COMMIT PREPARED

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
On Mon, Jun 15, 2020 at 12:30 PM Tatsuo Ishii <[hidden email]> wrote:

>
> >> Another approach to the atomic visibility problem is to control
> >> snapshot acquisition timing and commit timing (plus using REPEATABLE
> >> READ). In the REPEATABLE READ transaction isolation level, PostgreSQL
> >> assigns a snapshot at the time when the first command is executed in a
> >> transaction. If we could prevent any commit while any transaction is
> >> acquiring snapshot, and we could prevent any snapshot acquisition while
> >> committing, visibility inconsistency which Amit explained can be
> >> avoided.
> >>
> >
> > I think the problem mentioned above can occur with this as well or if
> > I am missing something then can you explain in further detail how it
> > won't create problem in the scenario I have used above?
>
> So the problem you mentioned above is like this? (S1/S2 denotes
> transactions (sessions), N1/N2 is the postgreSQL servers).  Since S1
> already committed on N1, S2 sees the row on N1.  However S2 does not
> see the row on N2 since S1 has not committed on N2 yet.
>

Yeah, something on these lines but S2 can execute the query on N1
directly which should fetch the data from both N1 and N2.  Even if
there is a solution using REPEATABLE READ isolation level we might not
prefer to use that as the only level for distributed transactions, it
might be too costly but let us first see how does it solve the
problem?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiko Sawada-2
In reply to this post by akapila
On Mon, 15 Jun 2020 at 15:20, Amit Kapila <[hidden email]> wrote:

>
> On Sun, Jun 14, 2020 at 2:21 PM Tatsuo Ishii <[hidden email]> wrote:
> >
> > >> Won't it create an inconsistency in viewing the data from the
> > >> different servers?  Say, such a transaction inserts one row into a
> > >> local server and another into the foreign server.  Now, if we follow
> > >> the above protocol, the user will be able to see the row from the
> > >> local server but not from the foreign server.
> > >
> > > Yes, you're right. This atomic commit feature doesn't guarantee such
> > > consistent visibility so-called atomic visibility.
>
> Okay, I understand that the purpose of this feature is to provide
> atomic commit which means the transaction on all servers involved will
> either commit or rollback.  However, I think we should at least see at
> a high level how the visibility will work because it might influence
> the implementation of this feature.
>
> > > Even the local
> > > server is not modified, since a resolver process commits prepared
> > > foreign transactions one by one another user could see an inconsistent
> > > result. Providing globally consistent snapshots to transactions
> > > involving foreign servers is one of the solutions.
>
> How would it be able to do that?  Say, when it decides to take a
> snapshot the transaction on the foreign server appears to be committed
> but the transaction on the local server won't appear to be committed,
> so the consistent data visibility problem as mentioned above could
> still arise.

There are many solutions. For instance, in Postgres-XC/X2 (and maybe
XL), there is a GTM node that is responsible for providing global
transaction IDs (GXID) and globally consistent snapshots. All
transactions need to access GTM when checking the distributed
transaction status as well as starting transactions and ending
transactions. IIUC if a global transaction accesses a tuple whose GXID
is included in its global snapshot it waits for that transaction to be
committed or rolled back.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
On Mon, Jun 15, 2020 at 7:06 PM Masahiko Sawada
<[hidden email]> wrote:

>
> On Mon, 15 Jun 2020 at 15:20, Amit Kapila <[hidden email]> wrote:
> >
> >
> > > > Even the local
> > > > server is not modified, since a resolver process commits prepared
> > > > foreign transactions one by one another user could see an inconsistent
> > > > result. Providing globally consistent snapshots to transactions
> > > > involving foreign servers is one of the solutions.
> >
> > How would it be able to do that?  Say, when it decides to take a
> > snapshot the transaction on the foreign server appears to be committed
> > but the transaction on the local server won't appear to be committed,
> > so the consistent data visibility problem as mentioned above could
> > still arise.
>
> There are many solutions. For instance, in Postgres-XC/X2 (and maybe
> XL), there is a GTM node that is responsible for providing global
> transaction IDs (GXID) and globally consistent snapshots. All
> transactions need to access GTM when checking the distributed
> transaction status as well as starting transactions and ending
> transactions. IIUC if a global transaction accesses a tuple whose GXID
> is included in its global snapshot it waits for that transaction to be
> committed or rolled back.
>

Is there some mapping between GXID and XIDs allocated for each node or
will each node use the GXID as XID to modify the data?   Are we fine
with parking the work for global snapshots and atomic visibility to a
separate patch and just proceed with the design proposed by this
patch?  I am asking because I thought there might be some impact on
the design of this patch based on what we decide for that work.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Ashutosh Bapat-2
On Tue, Jun 16, 2020 at 3:40 PM Amit Kapila <[hidden email]> wrote:

>
> On Mon, Jun 15, 2020 at 7:06 PM Masahiko Sawada
> <[hidden email]> wrote:
> >
> > On Mon, 15 Jun 2020 at 15:20, Amit Kapila <[hidden email]> wrote:
> > >
> > >
> > > > > Even the local
> > > > > server is not modified, since a resolver process commits prepared
> > > > > foreign transactions one by one another user could see an inconsistent
> > > > > result. Providing globally consistent snapshots to transactions
> > > > > involving foreign servers is one of the solutions.
> > >
> > > How would it be able to do that?  Say, when it decides to take a
> > > snapshot the transaction on the foreign server appears to be committed
> > > but the transaction on the local server won't appear to be committed,
> > > so the consistent data visibility problem as mentioned above could
> > > still arise.
> >
> > There are many solutions. For instance, in Postgres-XC/X2 (and maybe
> > XL), there is a GTM node that is responsible for providing global
> > transaction IDs (GXID) and globally consistent snapshots. All
> > transactions need to access GTM when checking the distributed
> > transaction status as well as starting transactions and ending
> > transactions. IIUC if a global transaction accesses a tuple whose GXID
> > is included in its global snapshot it waits for that transaction to be
> > committed or rolled back.
> >
>
> Is there some mapping between GXID and XIDs allocated for each node or
> will each node use the GXID as XID to modify the data?   Are we fine
> with parking the work for global snapshots and atomic visibility to a
> separate patch and just proceed with the design proposed by this
> patch?

Distributed transaction involves, atomic commit,  atomic visibility
and global consistency. 2PC is the only practical solution for atomic
commit. There are some improvements over 2PC but those are add ons to
the basic 2PC, which is what this patch provides. Atomic visibility
and global consistency however have alternative solutions but all of
those solutions require 2PC to be supported. Each of those are large
pieces of work and trying to get everything in may not work. Once we
have basic 2PC in place, there will be a ground to experiment with
solutions for global consistency and atomic visibility. If we manage
to do it right, we could make it pluggable as well. So, I think we
should concentrate on supporting basic 2PC work now.

> I am asking because I thought there might be some impact on
> the design of this patch based on what we decide for that work.
>

Since 2PC is at the heart of any distributed transaction system, the
impact will be low. Figuring all of that, without having basic 2PC,
will be very hard.

--
Best Wishes,
Ashutosh Bapat


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiko Sawada-2
In reply to this post by akapila
On Fri, 12 Jun 2020 at 19:24, Amit Kapila <[hidden email]> wrote:
>

Thank you for your reviews on 0003 patch. I've incorporated your
comments. I'll submit the latest version patch later as the design or
scope might change as a result of the discussion.

>
> Few more comments on v22-0003-Documentation-update
> --------------------------------------------------------------------------------------
> 1.
> +          When <literal>disabled</literal> there can be risk of database
> +          consistency among all servers that involved in the distributed
> +          transaction when some foreign server crashes during committing the
> +          distributed transaction.
>
> Will it read better if rephrase above to something like: "When
> <literal>disabled</literal> there can be a risk of database
> consistency if one or more foreign servers crashes while committing
> the distributed transaction."?

Fixed.

>
> 2.
> +      <varlistentry
> id="guc-foreign-transaction-resolution-rety-interval"
> xreflabel="foreign_transaction_resolution_retry_interval">
> +       <term><varname>foreign_transaction_resolution_retry_interval</varname>
> (<type>integer</type>)
> +        <indexterm>
> +         <primary><varname>foreign_transaction_resolution_interval</varname>
> configuration parameter</primary>
> +        </indexterm>
> +       </term>
> +       <listitem>
> +        <para>
> +         Specify how long the foreign transaction resolver should
> wait when the last resolution
> +         fails before retrying to resolve foreign transaction. This
> parameter can only be set in the
> +         <filename>postgresql.conf</filename> file or on the server
> command line.
> +        </para>
> +        <para>
> +         The default value is 10 seconds.
> +        </para>
> +       </listitem>
> +      </varlistentry>
>
> Typo.  <varlistentry
> id="guc-foreign-transaction-resolution-rety-interval", spelling of
> retry is wrong.  Do we really need such a guc parameter?  I think we
> can come up with some simple algorithm to retry after a few seconds
> and then increase that interval of retry if we fail again or something
> like that.  I don't know how users can come up with some non-default
> value for this variable.

For example, in a low-reliable network environment, setting lower
value would help to minimize the backend wait time in case of
connection lost. But I also agree with your point. In terms of
implementation, having backends wait for the fixed time is more simple
but we can do such incremental interval by remembering the retry count
for each foreign transaction.

An open question regarding retrying foreign transaction resolution is
how we process the case where an involved foreign server is down for a
very long. If an online transaction is waiting to be resolved, there
is no way to exit from the wait loop other than either the user sends
a cancel request or the crashed server is restored. But if the foreign
server has to be down for a long time, I think it’s not practical to
send a cancel request because the client would need something like a
timeout mechanism. So I think it might be better to provide a way to
cancel the waiting without the user sending a cancel. For example,
having a timeout or having the limit of the retry count. If an
in-doubt transaction is waiting to be resolved, we keep trying to
resolve the foreign transaction at an interval. But I wonder if the
user might want to disable the automatic in-doubt foreign transaction
in some cases, for example, where the user knows the crashed server
will not be restored for a long time. I’m thinking that we can provide
a way to disable automatic foreign transaction resolution or disable
it for the particular foreign transaction.

>
> 3
> +      <varlistentry id="guc-foreign-transaction-resolver-timeout"
> xreflabel="foreign_transaction_resolver_timeout">
> +       <term><varname>foreign_transaction_resolver_timeout</varname>
> (<type>integer</type>)
> +        <indexterm>
> +         <primary><varname>foreign_transaction_resolver_timeout</varname>
> configuration parameter</primary>
> +        </indexterm>
> +       </term>
> +       <listitem>
> +        <para>
> +         Terminate foreign transaction resolver processes that don't
> have any foreign
> +         transactions to resolve longer than the specified number of
> milliseconds.
> +         A value of zero disables the timeout mechanism, meaning it
> connects to one
> +         database until stopping manually.
>
> Can we mention the function name using which one can stop the resolver process?

Fixed.

>
> 4.
> +   Using the <productname>PostgreSQL</productname>'s atomic commit ensures that
> +   all changes on foreign servers end in either commit or rollback using the
> +   transaction callback routines
>
> Can we slightly rephase this "Using the PostgreSQL's atomic commit
> ensures that all the changes on foreign servers are either committed
> or rolled back using the transaction callback routines"?

Fixed.

>
> 5.
> +       Prepare all transactions on foreign servers.
> +       <productname>PostgreSQL</productname> distributed transaction manager
> +       prepares all transaction on the foreign servers if two-phase commit is
> +       required. Two-phase commit is required when the transaction modifies
> +       data on two or more servers including the local server itself and
> +       <xref linkend="guc-foreign-twophase-commit"/> is
> +       <literal>required</literal>.
>
> /PostgreSQL/PostgreSQL's.

Fixed.

>
>  If all preparations on foreign servers got
> +       successful go to the next step.
>
> How about "If the prepare on all foreign servers is successful then go
> to the next step"?

Fixed.

>
>  Any failure happens in this step,
> +       the server changes to rollback, then rollback all transactions on both
> +       local and foreign servers.
>
> Can we rephrase this line to something like: "If there is any failure
> in the prepare phase, the server will rollback all the transactions on
> both local and foreign servers."?

Fixed.

>
> What if the issued Rollback also failed, say due to network breakdown
> between local and one of foreign servers?  Shouldn't such a
> transaction be 'in-doubt' state?

Rollback API to rollback transaction in one-phase can be called
recursively. So FDWs have to tolerate recursive calling.

In the current patch, all transaction operations are performed
synchronously. That is, foreign transaction never becomes in-doubt
state without explicit cancel by the user or the local node crash.
That way, subsequent transactions can assume that precedent
distributed transactions are already resolved unless the user
canceled.

Let me explain the details:

If the transaction turns rollback due to failure before the local
commit, we attempt to do both ROLLBACK and ROLLBACK PREPARED against
foreign transactions whose status is PREPARING. That is, we end the
foreign transactions by doing ROLLBACK. And since we're not sure
preparation has been completed on the foreign server the backend asks
the resolver process for doing ROLLBACK PREPARED on the foreign
servers. Therefore FDWs have to tolerate OBJECT_NOT_FOUND error in
abort case. Since the backend process returns an acknowledgment to the
client only after rolling back all foreign transactions, these foreign
transactional don't remain as in-doubt state.

If rolling back failed after the local commit (i.g., the client does
ROLLBACK and the resolver failed to do ROLLBACK PREPARED), a resolver
process will relaunch and retry to do ROLLBACK PREPARED. The backend
process waits until ROLLBACK PREPARED is successfully done or the user
cancels. So the foreign transactions don't become in-doubt
transactions.

Synchronousness is also an open question. If we want to support atomic
commit in an asynchronous manner it might be better to implement it
first in terms of complexity. The backend returns an acknowledgment to
the client immediately after asking the resolver process. It’s known
as the early acknowledgment technique. The downside is that the user
who wants to see the result of precedent transaction needs to make
sure the precedent transaction is committed on all foreign servers. We
will also need to think about how to control it by GUC parameter when
we have synchronous distributed transaction commit. Perhaps it’s
better to control it independent of synchronous replication.

>
> 6.
> +      <para>
> +       Commit locally. The server commits transaction locally.  Any
> failure happens
> +       in this step the server changes to rollback, then rollback all
> transactions
> +       on both local and foreign servers.
> +      </para>
> +     </listitem>
> +     <listitem>
> +      <para>
> +       Resolve all prepared transaction on foreign servers. Pprepared
> transactions
> +       are committed or rolled back according to the result of the
> local transaction.
> +       This step is normally performed by a foreign transaction
> resolver process.
> +      </para>
>
> When (in which step) do we commit on foreign servers?  Do Resolver
> processes commit on foreign servers, if so, how can we commit locally
> without committing on foreign servers, what if the commit on one of
> the servers fails? It is not very clear to me from the steps mentioned
> here?

In case 2pc is required, we commit transactions on foreign servers at
the final step by the resolver process. If the committing a prepared
transaction on one of the servers fails, a resolver process relaunches
after an interval and retry to commit.

In case 2pc is not required, we commit transactions on foreign servers
at pre-commit phase by the backend.

> Typo, /Pprepared/Prepared

Fixed.

>
> 7.
> However, foreign transactions
> +    become <firstterm>in-doubt</firstterm> in three cases: where the foreign
> +    server crashed or lost the connectibility to it during preparing foreign
> +    transaction, where the local node crashed during either preparing or
> +    resolving foreign transaction and where user canceled the query.
>
> Here the three cases are not very clear.  You might want to use (a)
> ..., (b) .. ,(c)..

Fixed. I change it to itemizedlist.

> Also, I think the state will be in-doubt even when
> we lost connection to server during commit or rollback.

Let me correct the cases of the foreign transactions remain as
in-doubt state. There are two cases:

* The local node crashed
* The user canceled the transaction commit or rollback.

Even when we lost connection to the server during commit or rollback
prepared transaction, a backend doesn’t return an acknowledgment to
the client until either transaction is successfully resolved, the user
cancels the transaction, or the local node crashes.

>
> 8.
> +    One foreign transaction resolver is responsible for transaction resolutions
> +    on which one database connecting.
>
> Can we rephrase it to: "One foreign transaction resolver is
> responsible for transaction resolutions on the database to which it is
> connected."?

Fixed.

>
> 9.
> +    Note that other <productname>PostgreSQL</productname> feature
> such as parallel
> +    queries, logical replication, etc., also take worker slots from
> +    <varname>max_worker_processes</varname>.
>
> /feature/features

Fixed.

>
> 10.
> +   <para>
> +    Atomic commit requires several configuration options to be set.
> +    On the local node, <xref
> linkend="guc-max-prepared-foreign-transactions"/> and
> +    <xref linkend="guc-max-foreign-transaction-resolvers"/> must be
> non-zero value.
> +    Additionally the <varname>max_worker_processes</varname> may need
> to be adjusted to
> +    accommodate for foreign transaction resolver workers, at least
> +    (<varname>max_foreign_transaction_resolvers</varname> +
> <literal>1</literal>).
> +    Note that other <productname>PostgreSQL</productname> feature
> such as parallel
> +    queries, logical replication, etc., also take worker slots from
> +    <varname>max_worker_processes</varname>.
> +   </para>
>
> Don't we need to mention foreign_twophase_commit GUC here?

Fixed.

>
> 11.
> +   <sect2 id="fdw-callbacks-transaction-managements">
> +    <title>FDW Routines For Transaction Managements</title>
>
> Managements/Management?

Fixed.

>
> 12.
> +     Transaction management callbacks are used for doing commit, rollback and
> +     prepare the foreign transaction.
>
> Lets write the above sentence as: "Transaction management callbacks
> are used to commit, rollback and prepare the foreign transaction."

Fixed.

>
> 13.
> +    <para>
> +     Transaction management callbacks are used for doing commit, rollback and
> +     prepare the foreign transaction. If an FDW wishes that its foreign
> +     transaction is managed by <productname>PostgreSQL</productname>'s global
> +     transaction manager it must provide both
> +     <function>CommitForeignTransaction</function> and
> +     <function>RollbackForeignTransaction</function>. In addition, if an FDW
> +     wishes to support <firstterm>atomic commit</firstterm> (as described in
> +     <xref linkend="fdw-transaction-managements"/>), it must provide
> +     <function>PrepareForeignTransaction</function> as well and can provide
> +     <function>GetPrepareId</function> callback optionally.
> +    </para>
>
> What exact functionality a FDW can accomplish if it just supports
> CommitForeignTransaction and RollbackForeignTransaction?  It seems it
> doesn't care for 2PC, if so, is there any special functionality we can
> achieve with this which we can't do without these APIs?

There is no special functionality even if an FDW implements
CommitForeignTrasnaction and RollbackForeignTransaction. Currently,
since there is no transaction API in FDW APIs, FDW developer has to
use XactCallback to control transactions but there is no
documentation. The idea of allowing an FDW to support only
CommitForeignTrasnaction and RollbackForeignTransaction is that FDW
developers can implement transaction management easily. But in the
first patch, we also can disallow it to make the implementation
simple.

>
> 14.
> +PrepareForeignTransaction(FdwXactRslvState *frstate);
> +</programlisting>
> +    Prepare the transaction on the foreign server. This function is
> called at the
> +    pre-commit phase of the local transactions if foreign twophase commit is
> +    required. This function is used only for distribute transaction management
> +    (see <xref linkend="distributed-transaction"/>).
> +    </para>
>
> /distribute/distributed

Fixed.

>
> 15.
> +   <sect2 id="fdw-transaction-commit-rollback">
> +    <title>Commit And Rollback Single Foreign Transaction</title>
> +    <para>
> +     The FDW callback function <literal>CommitForeignTransaction</literal>
> +     and <literal>RollbackForeignTransaction</literal> can be used to commit
> +     and rollback the foreign transaction. During transaction commit, the core
> +     transaction manager calls
> <literal>CommitForeignTransaction</literal> function
> +     in the pre-commit phase and calls
> +     <literal>RollbackForeignTransaction</literal> function in the
> post-rollback
> +     phase.
> +    </para>
>
> There is no reasoning mentioned as to why CommitForeignTransaction has
> to be called in pre-commit phase and RollbackForeignTransaction in
> post-rollback phase?  Basically why one in pre phase and other in post
> phase?

Good point. This behavior just follows what postgres_fdw does. I'm not
sure the exact reason why postgres_fdw commit the transaction in
pre-commit phase but I guess the committing a foreign transaction is
likely to abort comparing to the local commit, it might be better to
do first.

>
> 16.
> +       <entry>
> +        <literal><function>pg_remove_foreign_xact(<parameter>transaction</parameter>
> <type>xid</type>, <parameter>serverid</parameter> <type>oid</type>,
> <parameter>userid</parameter> <type>oid</type>)</function></literal>
> +       </entry>
> +       <entry><type>void</type></entry>
> +       <entry>
> +        This function works the same as
> <function>pg_resolve_foreign_xact</function>
> +        except that this removes the foreign transcation entry
> without resolution.
> +       </entry>
>
> Can we write why and when such a function can be used?  Typo,
> /trasnaction/transaction

Fixed.

>
> 17.
> +     <row>
> +      <entry><literal>FdwXactResolutionLock</literal></entry>
> +      <entry>Waiting to read or update information of foreign trasnaction
> +       resolution.</entry>
> +     </row>
>
> /trasnaction/transaction

Fixed.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


123456