Transactions involving multiple postgres foreign servers, take 2

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
112 messages Options
123456
Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Tatsuo Ishii-3
>> > I think the problem mentioned above can occur with this as well or if
>> > I am missing something then can you explain in further detail how it
>> > won't create problem in the scenario I have used above?
>>
>> So the problem you mentioned above is like this? (S1/S2 denotes
>> transactions (sessions), N1/N2 is the postgreSQL servers).  Since S1
>> already committed on N1, S2 sees the row on N1.  However S2 does not
>> see the row on N2 since S1 has not committed on N2 yet.
>>
>
> Yeah, something on these lines but S2 can execute the query on N1
> directly which should fetch the data from both N1 and N2.

The algorythm assumes that any client should access database through a
middle ware. Such direct access is prohibited.

> Even if
> there is a solution using REPEATABLE READ isolation level we might not
> prefer to use that as the only level for distributed transactions, it
> might be too costly but let us first see how does it solve the
> problem?

The paper extends Snapshot Isolation (SI, which is same as our
REPEATABLE READ isolation level) to "Global Snapshot Isolation", GSI).
I think GSI will solve the problem (atomic visibility) we are
discussing.

Unlike READ COMMITTED, REPEATABLE READ acquires snapshot at the time
when the first command is executed in a transaction (READ COMMITTED
acquires a snapshot at each command in a transaction). Pangea controls
the timing of the snapshot acquisition on pair of transactions
(S1/N1,N2 or S2/N1,N2) so that each pair acquires the same
snapshot. To achieve this, while some transactions are trying to
acquire snapshot, any commit operation should be postponed. Likewise
any snapshot acquisition should wait until any in progress commit
operations are finished (see Algorithm I to III in the paper for more
details). With this rule, the previous example now looks like this:
you can see SELECT on S2/N1 and S2/N2 give the same result.

S1/N1: DROP TABLE t1;
DROP TABLE
S1/N1: CREATE TABLE t1(i int);
CREATE TABLE
S1/N2: DROP TABLE t1;
DROP TABLE
S1/N2: CREATE TABLE t1(i int);
CREATE TABLE
S1/N1: BEGIN;
BEGIN
S1/N2: BEGIN;
BEGIN
S2/N1: BEGIN;
BEGIN
S1/N1: SET transaction_isolation TO 'repeatable read';
SET
S1/N2: SET transaction_isolation TO 'repeatable read';
SET
S2/N1: SET transaction_isolation TO 'repeatable read';
SET
S1/N1: INSERT INTO t1 VALUES (1);
INSERT 0 1
S1/N2: INSERT INTO t1 VALUES (1);
INSERT 0 1
S2/N1: SELECT * FROM t1;
 i
---
(0 rows)

S2/N2: SELECT * FROM t1;
 i
---
(0 rows)

S1/N1: PREPARE TRANSACTION 's1n1';
PREPARE TRANSACTION
S1/N2: PREPARE TRANSACTION 's1n2';
PREPARE TRANSACTION
S2/N1: PREPARE TRANSACTION 's2n1';
PREPARE TRANSACTION
S1/N1: COMMIT PREPARED 's1n1';
COMMIT PREPARED
S1/N2: COMMIT PREPARED 's1n2';
COMMIT PREPARED
S2/N1: COMMIT PREPARED 's2n1';
COMMIT PREPARED

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Bruce Momjian
In reply to this post by Ashutosh Bapat-2
On Tue, Jun 16, 2020 at 06:42:52PM +0530, Ashutosh Bapat wrote:

> > Is there some mapping between GXID and XIDs allocated for each node or
> > will each node use the GXID as XID to modify the data?   Are we fine
> > with parking the work for global snapshots and atomic visibility to a
> > separate patch and just proceed with the design proposed by this
> > patch?
>
> Distributed transaction involves, atomic commit,  atomic visibility
> and global consistency. 2PC is the only practical solution for atomic
> commit. There are some improvements over 2PC but those are add ons to
> the basic 2PC, which is what this patch provides. Atomic visibility
> and global consistency however have alternative solutions but all of
> those solutions require 2PC to be supported. Each of those are large
> pieces of work and trying to get everything in may not work. Once we
> have basic 2PC in place, there will be a ground to experiment with
> solutions for global consistency and atomic visibility. If we manage
> to do it right, we could make it pluggable as well. So, I think we
> should concentrate on supporting basic 2PC work now.

Very good summary, thank you.

--
  Bruce Momjian  <[hidden email]>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiro Ikeda
> I've attached the new version patch set. 0006 is a separate patch
> which introduces 'prefer' mode to foreign_twophase_commit.

I hope we can use this feature. Thank you for making patches and
discussions.
I'm currently understanding the logic and found some minor points to be
fixed.

I'm sorry if my understanding is wrong.

* The v22 patches need rebase as they can't apply to the current master.

* FdwXactAtomicCommitParticipants said in
src/backend/access/fdwxact/README
   is not implemented. Is FdwXactParticipants right?

* A following comment says that this code is for "One-phase",
   but second argument of FdwXactParticipantEndTransaction() describes
   this code is not "onephase".

AtEOXact_FdwXact() in fdwxact.c
        /* One-phase rollback foreign transaction */
        FdwXactParticipantEndTransaction(fdw_part, false, false);

static void
FdwXactParticipantEndTransaction(FdwXactParticipant *fdw_part, bool
onephase,
        bool for_commit)

* "two_phase_commit" option is mentioned in postgres-fdw.sgml,
    but I can't find related code.

* resolver.c comments have the sentence
   containing two blanks.(Emergency  Termination)

* There are some inconsistency with PostgreSQL wiki.
https://wiki.postgresql.org/wiki/Atomic_Commit_of_Distributed_Transactions

   I understand it's difficult to keep consistency, I think it's ok to
fix later
   when these patches almost be able to be committed.

   - I can't find "two_phase_commit" option in the source code.
     But 2PC is work if the remote server's "max_prepared_transactions"
is set
     to non zero value. It is correct work, isn't it?

   - some parameters are renamed or added in latest patches.
     max_prepared_foreign_transaction, max_prepared_transactions and so
on.

   - typo: froeign_transaction_resolver_timeout

Regards,

--
Masahiro Ikeda
NTT DATA CORPORATION


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
In reply to this post by Tatsuo Ishii-3
On Tue, Jun 16, 2020 at 8:06 PM Tatsuo Ishii <[hidden email]> wrote:

>
> >> > I think the problem mentioned above can occur with this as well or if
> >> > I am missing something then can you explain in further detail how it
> >> > won't create problem in the scenario I have used above?
> >>
> >> So the problem you mentioned above is like this? (S1/S2 denotes
> >> transactions (sessions), N1/N2 is the postgreSQL servers).  Since S1
> >> already committed on N1, S2 sees the row on N1.  However S2 does not
> >> see the row on N2 since S1 has not committed on N2 yet.
> >>
> >
> > Yeah, something on these lines but S2 can execute the query on N1
> > directly which should fetch the data from both N1 and N2.
>
> The algorythm assumes that any client should access database through a
> middle ware. Such direct access is prohibited.
>

okay, so it seems we need few things which middleware (Pangea) expects
if we have to follow the design of paper.

> > Even if
> > there is a solution using REPEATABLE READ isolation level we might not
> > prefer to use that as the only level for distributed transactions, it
> > might be too costly but let us first see how does it solve the
> > problem?
>
> The paper extends Snapshot Isolation (SI, which is same as our
> REPEATABLE READ isolation level) to "Global Snapshot Isolation", GSI).
> I think GSI will solve the problem (atomic visibility) we are
> discussing.
>
> Unlike READ COMMITTED, REPEATABLE READ acquires snapshot at the time
> when the first command is executed in a transaction (READ COMMITTED
> acquires a snapshot at each command in a transaction). Pangea controls
> the timing of the snapshot acquisition on pair of transactions
> (S1/N1,N2 or S2/N1,N2) so that each pair acquires the same
> snapshot. To achieve this, while some transactions are trying to
> acquire snapshot, any commit operation should be postponed. Likewise
> any snapshot acquisition should wait until any in progress commit
> operations are finished (see Algorithm I to III in the paper for more
> details).
>

I haven't read the paper completely but it sounds quite restrictive
(like both commits and snapshots need to wait).  Another point is that
do we want some middleware involved in the solution?   The main thing
I was looking into at this stage is do we think that the current
implementation proposed by the patch for 2PC is generic enough that we
would be later able to integrate the solution for atomic visibility?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiko Sawada-2
In reply to this post by Masahiro Ikeda
On Wed, 17 Jun 2020 at 09:01, Masahiro Ikeda <[hidden email]> wrote:

>
> > I've attached the new version patch set. 0006 is a separate patch
> > which introduces 'prefer' mode to foreign_twophase_commit.
>
> I hope we can use this feature. Thank you for making patches and
> discussions.
> I'm currently understanding the logic and found some minor points to be
> fixed.
>
> I'm sorry if my understanding is wrong.
>
> * The v22 patches need rebase as they can't apply to the current master.
>
> * FdwXactAtomicCommitParticipants said in
> src/backend/access/fdwxact/README
>    is not implemented. Is FdwXactParticipants right?

Right.

>
> * A following comment says that this code is for "One-phase",
>    but second argument of FdwXactParticipantEndTransaction() describes
>    this code is not "onephase".
>
> AtEOXact_FdwXact() in fdwxact.c
>         /* One-phase rollback foreign transaction */
>         FdwXactParticipantEndTransaction(fdw_part, false, false);
>
> static void
> FdwXactParticipantEndTransaction(FdwXactParticipant *fdw_part, bool
> onephase,
>         bool for_commit)
>
> * "two_phase_commit" option is mentioned in postgres-fdw.sgml,
>     but I can't find related code.
>
> * resolver.c comments have the sentence
>    containing two blanks.(Emergency  Termination)
>
> * There are some inconsistency with PostgreSQL wiki.
> https://wiki.postgresql.org/wiki/Atomic_Commit_of_Distributed_Transactions
>
>    I understand it's difficult to keep consistency, I think it's ok to
> fix later
>    when these patches almost be able to be committed.
>
>    - I can't find "two_phase_commit" option in the source code.
>      But 2PC is work if the remote server's "max_prepared_transactions"
> is set
>      to non zero value. It is correct work, isn't it?

Yes. I had removed two_phase_commit option from postgres_fdw.
Currently, postgres_fdw uses 2pc when 2pc is required. Therefore,
max_prepared_transactions needs to be set to more than one, as you
mentioned.

>
>    - some parameters are renamed or added in latest patches.
>      max_prepared_foreign_transaction, max_prepared_transactions and so
> on.
>
>    - typo: froeign_transaction_resolver_timeout
>

Thank you for your review! I've incorporated your comments on the
local branch. I'll share the latest version patch.

Also, I've updated the wiki page. I'll try to keep the wiki page up-to-date.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
In reply to this post by Ashutosh Bapat-2
On Tue, Jun 16, 2020 at 6:43 PM Ashutosh Bapat
<[hidden email]> wrote:

>
> On Tue, Jun 16, 2020 at 3:40 PM Amit Kapila <[hidden email]> wrote:
> >
> >
> > Is there some mapping between GXID and XIDs allocated for each node or
> > will each node use the GXID as XID to modify the data?   Are we fine
> > with parking the work for global snapshots and atomic visibility to a
> > separate patch and just proceed with the design proposed by this
> > patch?
>
> Distributed transaction involves, atomic commit,  atomic visibility
> and global consistency. 2PC is the only practical solution for atomic
> commit. There are some improvements over 2PC but those are add ons to
> the basic 2PC, which is what this patch provides. Atomic visibility
> and global consistency however have alternative solutions but all of
> those solutions require 2PC to be supported. Each of those are large
> pieces of work and trying to get everything in may not work. Once we
> have basic 2PC in place, there will be a ground to experiment with
> solutions for global consistency and atomic visibility. If we manage
> to do it right, we could make it pluggable as well.
>

I think it is easier said than done. If you want to make it pluggable
or want alternative solutions to adapt the 2PC support provided by us
we should have some idea how those alternative solutions look like.  I
am not telling we have to figure out each and every detail of those
solutions but without paying any attention to the high-level picture
we might end up doing something for 2PC here which either needs a lot
of modifications or might need a design change which would be bad.
Basically, if we later decide to use something like Global Xid to
achieve other features then what we are doing here might not work.

I think it is a good idea to complete the work in pieces where each
piece is useful on its own but without having clarity on the overall
solution that could be a recipe for disaster.  It is possible that you
have some idea in your mind where you can see clearly how this piece
of work can fit in the bigger picture but it is not very apparent to
others or doesn't seem to be documented anywhere.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Tatsuo Ishii-3
In reply to this post by akapila
> okay, so it seems we need few things which middleware (Pangea) expects
> if we have to follow the design of paper.

Yes.

> I haven't read the paper completely but it sounds quite restrictive
> (like both commits and snapshots need to wait).

Maybe. There is a performance evaluation in the paper. You might want
to take a look at it.

> Another point is that
> do we want some middleware involved in the solution?   The main thing
> I was looking into at this stage is do we think that the current
> implementation proposed by the patch for 2PC is generic enough that we
> would be later able to integrate the solution for atomic visibility?

My concern is, FDW+2PC without atomic visibility could lead to data
inconsistency among servers in some cases. If my understanding is
correct, FDW+2PC (without atomic visibility) cannot prevent data
inconsistency in the case below. Initially table t1 has only one row
with i = 0 on both N1 and N2. By executing S1 and S2 concurrently, t1
now has different value of i, 0 and 1.

S1/N1: DROP TABLE t1;
DROP TABLE
S1/N1: CREATE TABLE t1(i int);
CREATE TABLE
S1/N1: INSERT INTO t1 VALUES(0);
INSERT 0 1
S1/N2: DROP TABLE t1;
DROP TABLE
S1/N2: CREATE TABLE t1(i int);
CREATE TABLE
S1/N2: INSERT INTO t1 VALUES(0);
INSERT 0 1
S1/N1: BEGIN;
BEGIN
S1/N2: BEGIN;
BEGIN
S1/N1: UPDATE t1 SET i = i + 1; -- i = 1
UPDATE 1
S1/N2: UPDATE t1 SET i = i + 1; -- i = 1
UPDATE 1
S1/N1: PREPARE TRANSACTION 's1n1';
PREPARE TRANSACTION
S1/N1: COMMIT PREPARED 's1n1';
COMMIT PREPARED
S2/N1: BEGIN;
BEGIN
S2/N2: BEGIN;
BEGIN
S2/N2: DELETE FROM t1 WHERE i = 1;
DELETE 0
S2/N1: DELETE FROM t1 WHERE i = 1;
DELETE 1
S1/N2: PREPARE TRANSACTION 's1n2';
PREPARE TRANSACTION
S2/N1: PREPARE TRANSACTION 's2n1';
PREPARE TRANSACTION
S2/N2: PREPARE TRANSACTION 's2n2';
PREPARE TRANSACTION
S1/N2: COMMIT PREPARED 's1n2';
COMMIT PREPARED
S2/N1: COMMIT PREPARED 's2n1';
COMMIT PREPARED
S2/N2: COMMIT PREPARED 's2n2';
COMMIT PREPARED
S2/N1: SELECT * FROM t1;
 i
---
(0 rows)

S2/N2: SELECT * FROM t1;
 i
---
 1
(1 row)

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiko Sawada-2
On Thu, 18 Jun 2020 at 08:31, Tatsuo Ishii <[hidden email]> wrote:

>
> > okay, so it seems we need few things which middleware (Pangea) expects
> > if we have to follow the design of paper.
>
> Yes.
>
> > I haven't read the paper completely but it sounds quite restrictive
> > (like both commits and snapshots need to wait).
>
> Maybe. There is a performance evaluation in the paper. You might want
> to take a look at it.
>
> > Another point is that
> > do we want some middleware involved in the solution?   The main thing
> > I was looking into at this stage is do we think that the current
> > implementation proposed by the patch for 2PC is generic enough that we
> > would be later able to integrate the solution for atomic visibility?
>
> My concern is, FDW+2PC without atomic visibility could lead to data
> inconsistency among servers in some cases. If my understanding is
> correct, FDW+2PC (without atomic visibility) cannot prevent data
> inconsistency in the case below. Initially table t1 has only one row
> with i = 0 on both N1 and N2. By executing S1 and S2 concurrently, t1
> now has different value of i, 0 and 1.

IIUC the following sequence won't happen because COMMIT PREPARED
's1n1' cannot be executed before PREPARE TRANSACTION 's1n2'. But as
you mentioned, we cannot prevent data inconsistency even with FDW+2PC
e.g., when S2 starts a transaction between COMMIT PREPARED on N1 and
COMMIT PREPARED on N2 by S1. The point is this data inconsistency is
lead by an inconsistent read but not by an inconsistent commit
results. I think there are kinds of possibilities causing data
inconsistency but atomic commit and atomic visibility eliminate
different possibilities. We can eliminate all possibilities of data
inconsistency only after we support 2PC and globally MVCC.

>
> S1/N1: DROP TABLE t1;
> DROP TABLE
> S1/N1: CREATE TABLE t1(i int);
> CREATE TABLE
> S1/N1: INSERT INTO t1 VALUES(0);
> INSERT 0 1
> S1/N2: DROP TABLE t1;
> DROP TABLE
> S1/N2: CREATE TABLE t1(i int);
> CREATE TABLE
> S1/N2: INSERT INTO t1 VALUES(0);
> INSERT 0 1
> S1/N1: BEGIN;
> BEGIN
> S1/N2: BEGIN;
> BEGIN
> S1/N1: UPDATE t1 SET i = i + 1; -- i = 1
> UPDATE 1
> S1/N2: UPDATE t1 SET i = i + 1; -- i = 1
> UPDATE 1
> S1/N1: PREPARE TRANSACTION 's1n1';
> PREPARE TRANSACTION
> S1/N1: COMMIT PREPARED 's1n1';
> COMMIT PREPARED
> S2/N1: BEGIN;
> BEGIN
> S2/N2: BEGIN;
> BEGIN
> S2/N2: DELETE FROM t1 WHERE i = 1;
> DELETE 0
> S2/N1: DELETE FROM t1 WHERE i = 1;
> DELETE 1
> S1/N2: PREPARE TRANSACTION 's1n2';
> PREPARE TRANSACTION
> S2/N1: PREPARE TRANSACTION 's2n1';
> PREPARE TRANSACTION
> S2/N2: PREPARE TRANSACTION 's2n2';
> PREPARE TRANSACTION
> S1/N2: COMMIT PREPARED 's1n2';
> COMMIT PREPARED
> S2/N1: COMMIT PREPARED 's2n1';
> COMMIT PREPARED
> S2/N2: COMMIT PREPARED 's2n2';
> COMMIT PREPARED
> S2/N1: SELECT * FROM t1;
>  i
> ---
> (0 rows)
>
> S2/N2: SELECT * FROM t1;
>  i
> ---
>  1
> (1 row)
>

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Tatsuo Ishii-3
>> My concern is, FDW+2PC without atomic visibility could lead to data
>> inconsistency among servers in some cases. If my understanding is
>> correct, FDW+2PC (without atomic visibility) cannot prevent data
>> inconsistency in the case below. Initially table t1 has only one row
>> with i = 0 on both N1 and N2. By executing S1 and S2 concurrently, t1
>> now has different value of i, 0 and 1.
>
> IIUC the following sequence won't happen because COMMIT PREPARED
> 's1n1' cannot be executed before PREPARE TRANSACTION 's1n2'.

You are right.

> But as
> you mentioned, we cannot prevent data inconsistency even with FDW+2PC
> e.g., when S2 starts a transaction between COMMIT PREPARED on N1 and
> COMMIT PREPARED on N2 by S1.

Ok, example updated.

S1/N1: DROP TABLE t1;
DROP TABLE
S1/N1: CREATE TABLE t1(i int);
CREATE TABLE
S1/N1: INSERT INTO t1 VALUES(0);
INSERT 0 1
S1/N2: DROP TABLE t1;
DROP TABLE
S1/N2: CREATE TABLE t1(i int);
CREATE TABLE
S1/N2: INSERT INTO t1 VALUES(0);
INSERT 0 1
S1/N1: BEGIN;
BEGIN
S1/N2: BEGIN;
BEGIN
S1/N1: UPDATE t1 SET i = i + 1; -- i = 1
UPDATE 1
S1/N2: UPDATE t1 SET i = i + 1; -- i = 1
UPDATE 1
S2/N1: BEGIN;
BEGIN
S2/N2: BEGIN;
BEGIN
S1/N1: PREPARE TRANSACTION 's1n1';
PREPARE TRANSACTION
S1/N2: PREPARE TRANSACTION 's1n2';
PREPARE TRANSACTION
S2/N1: PREPARE TRANSACTION 's2n1';
PREPARE TRANSACTION
S2/N2: PREPARE TRANSACTION 's2n2';
PREPARE TRANSACTION
S1/N1: COMMIT PREPARED 's1n1';
COMMIT PREPARED
S2/N1: DELETE FROM t1 WHERE i = 1;
DELETE 1
S2/N2: DELETE FROM t1 WHERE i = 1;
DELETE 0
S1/N2: COMMIT PREPARED 's1n2';
COMMIT PREPARED
S2/N1: COMMIT PREPARED 's2n1';
COMMIT PREPARED
S2/N2: COMMIT PREPARED 's2n2';
COMMIT PREPARED
S2/N1: SELECT * FROM t1;
 i
---
(0 rows)

S2/N2: SELECT * FROM t1;
 i
---
 1
(1 row)

> The point is this data inconsistency is
> lead by an inconsistent read but not by an inconsistent commit
> results. I think there are kinds of possibilities causing data
> inconsistency but atomic commit and atomic visibility eliminate
> different possibilities. We can eliminate all possibilities of data
> inconsistency only after we support 2PC and globally MVCC.

IMO any permanent data inconsistency is a serious problem for users no
matter what the technical reasons are.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
In reply to this post by Tatsuo Ishii-3
On Thu, Jun 18, 2020 at 5:01 AM Tatsuo Ishii <[hidden email]> wrote:

>
> > Another point is that
> > do we want some middleware involved in the solution?   The main thing
> > I was looking into at this stage is do we think that the current
> > implementation proposed by the patch for 2PC is generic enough that we
> > would be later able to integrate the solution for atomic visibility?
>
> My concern is, FDW+2PC without atomic visibility could lead to data
> inconsistency among servers in some cases. If my understanding is
> correct, FDW+2PC (without atomic visibility) cannot prevent data
> inconsistency in the case below.
>

You are right and we are not going to claim that after this feature is
committed.  This feature has independent use cases like it can allow
parallel copy when foreign tables are involved once we have parallel
copy and surely there will be more.  I think it is clear that we need
atomic visibility (some way to ensure global consistency) to avoid the
data inconsistency problems you and I are worried about and we can do
that as a separate patch but at this stage, it would be good if we can
have some high-level design of that as well so that if we need some
adjustments in the design/implementation of this patch then we can do
it now.  I think there is some discussion on the other threads (like
[1]) about the kind of stuff we are worried about which I need to
follow up on to study the impact.

Having said that, I don't think that is a reason to stop reviewing or
working on this patch.

[1] - https://www.postgresql.org/message-id/flat/21BC916B-80A1-43BF-8650-3363CCDAE09C%40postgrespro.ru

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Bruce Momjian
On Thu, Jun 18, 2020 at 04:09:56PM +0530, Amit Kapila wrote:

> You are right and we are not going to claim that after this feature is
> committed.  This feature has independent use cases like it can allow
> parallel copy when foreign tables are involved once we have parallel
> copy and surely there will be more.  I think it is clear that we need
> atomic visibility (some way to ensure global consistency) to avoid the
> data inconsistency problems you and I are worried about and we can do
> that as a separate patch but at this stage, it would be good if we can
> have some high-level design of that as well so that if we need some
> adjustments in the design/implementation of this patch then we can do
> it now.  I think there is some discussion on the other threads (like
> [1]) about the kind of stuff we are worried about which I need to
> follow up on to study the impact.
>
> Having said that, I don't think that is a reason to stop reviewing or
> working on this patch.

I think our first step is to allow sharding to work on read-only
databases, e.g. data warehousing.  Read/write will require global
snapshots.  It is true that 2PC is limited usefulness without global
snapshots, because, by definition, systems using 2PC are read-write
systems.   However, I can see cases where you are loading data into a
data warehouse but want 2PC so the systems remain consistent even if
there is a crash during loading.

--
  Bruce Momjian  <[hidden email]>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Ashutosh Bapat-2
On Thu, Jun 18, 2020 at 6:49 PM Bruce Momjian <[hidden email]> wrote:

>
> On Thu, Jun 18, 2020 at 04:09:56PM +0530, Amit Kapila wrote:
> > You are right and we are not going to claim that after this feature is
> > committed.  This feature has independent use cases like it can allow
> > parallel copy when foreign tables are involved once we have parallel
> > copy and surely there will be more.  I think it is clear that we need
> > atomic visibility (some way to ensure global consistency) to avoid the
> > data inconsistency problems you and I are worried about and we can do
> > that as a separate patch but at this stage, it would be good if we can
> > have some high-level design of that as well so that if we need some
> > adjustments in the design/implementation of this patch then we can do
> > it now.  I think there is some discussion on the other threads (like
> > [1]) about the kind of stuff we are worried about which I need to
> > follow up on to study the impact.
> >
> > Having said that, I don't think that is a reason to stop reviewing or
> > working on this patch.
>
> I think our first step is to allow sharding to work on read-only
> databases, e.g. data warehousing.  Read/write will require global
> snapshots.  It is true that 2PC is limited usefulness without global
> snapshots, because, by definition, systems using 2PC are read-write
> systems.   However, I can see cases where you are loading data into a
> data warehouse but want 2PC so the systems remain consistent even if
> there is a crash during loading.
>

For sharding, just implementing 2PC without global consistency
provides limited functionality. But for general purpose federated
databases 2PC serves an important functionality - atomic visibility.
When PostgreSQL is used as one of the coordinators in a heterogeneous
federated database system, it's not expected to have global
consistency or even atomic visibility. But it needs a guarantee that
once a transaction commit, all its legs are committed. 2PC provides
that guarantee as long as the other databases keep their promise that
prepared transactions will always get committed when requested so.
Subtle to this is HA requirement from these databases as well. So the
functionality provided by this patch is important outside the sharding
case as well.

As you said, even for a data warehousing application, there is some
write in the form of loading/merging data. If that write happens
across multiple servers, we need  atomic commit to be guaranteed. Some
of these applications can work even if global consistency and atomic
visibility is guaranteed eventually.

--
Best Wishes,
Ashutosh Bapat


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiko Sawada-2
In reply to this post by Masahiko Sawada-2
On Wed, 17 Jun 2020 at 14:07, Masahiko Sawada
<[hidden email]> wrote:

>
> On Wed, 17 Jun 2020 at 09:01, Masahiro Ikeda <[hidden email]> wrote:
> >
> > > I've attached the new version patch set. 0006 is a separate patch
> > > which introduces 'prefer' mode to foreign_twophase_commit.
> >
> > I hope we can use this feature. Thank you for making patches and
> > discussions.
> > I'm currently understanding the logic and found some minor points to be
> > fixed.
> >
> > I'm sorry if my understanding is wrong.
> >
> > * The v22 patches need rebase as they can't apply to the current master.
> >
> > * FdwXactAtomicCommitParticipants said in
> > src/backend/access/fdwxact/README
> >    is not implemented. Is FdwXactParticipants right?
>
> Right.
>
> >
> > * A following comment says that this code is for "One-phase",
> >    but second argument of FdwXactParticipantEndTransaction() describes
> >    this code is not "onephase".
> >
> > AtEOXact_FdwXact() in fdwxact.c
> >         /* One-phase rollback foreign transaction */
> >         FdwXactParticipantEndTransaction(fdw_part, false, false);
> >
> > static void
> > FdwXactParticipantEndTransaction(FdwXactParticipant *fdw_part, bool
> > onephase,
> >         bool for_commit)
> >
> > * "two_phase_commit" option is mentioned in postgres-fdw.sgml,
> >     but I can't find related code.
> >
> > * resolver.c comments have the sentence
> >    containing two blanks.(Emergency  Termination)
> >
> > * There are some inconsistency with PostgreSQL wiki.
> > https://wiki.postgresql.org/wiki/Atomic_Commit_of_Distributed_Transactions
> >
> >    I understand it's difficult to keep consistency, I think it's ok to
> > fix later
> >    when these patches almost be able to be committed.
> >
> >    - I can't find "two_phase_commit" option in the source code.
> >      But 2PC is work if the remote server's "max_prepared_transactions"
> > is set
> >      to non zero value. It is correct work, isn't it?
>
> Yes. I had removed two_phase_commit option from postgres_fdw.
> Currently, postgres_fdw uses 2pc when 2pc is required. Therefore,
> max_prepared_transactions needs to be set to more than one, as you
> mentioned.
>
> >
> >    - some parameters are renamed or added in latest patches.
> >      max_prepared_foreign_transaction, max_prepared_transactions and so
> > on.
> >
> >    - typo: froeign_transaction_resolver_timeout
> >
>
> Thank you for your review! I've incorporated your comments on the
> local branch. I'll share the latest version patch.
>
> Also, I've updated the wiki page. I'll try to keep the wiki page up-to-date.
>
I've attached the latest version patches. I've incorporated the review
comments I got so far and improved locking strategy.

Please review it.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v23-0007-Add-prefer-mode-to-foreign_twophase_commit.patch (16K) Download Attachment
v23-0006-Add-regression-tests-for-foreign-twophase-commit.patch (61K) Download Attachment
v23-0005-postgres_fdw-supports-atomic-commit-APIs.patch (62K) Download Attachment
v23-0004-Documentation-update.patch (54K) Download Attachment
v23-0003-Support-atomic-commit-among-multiple-foreign-ser.patch (254K) Download Attachment
v23-0002-Recreate-RemoveForeignServerById.patch (3K) Download Attachment
v23-0001-Keep-track-of-writing-on-non-temporary-relation.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
On Tue, Jun 23, 2020 at 9:03 AM Masahiko Sawada
<[hidden email]> wrote:
>
>
> I've attached the latest version patches. I've incorporated the review
> comments I got so far and improved locking strategy.
>

Thanks for updating the patch.

> Please review it.
>

I think at this stage it is important that we do some study of various
approaches to achieve this work and come up with a comparison of the
pros and cons of each approach (a) what this patch provides, (b) what
is implemented in Global Snapshots patch [1], (c) if possible, what is
implemented in Postgres-XL.  I fear that if go too far in spending
effort on this and later discovered that it can be better done via
some other available patch/work (maybe due to a reasons like that
approach can easily extended to provide atomic visibility or the
design is more robust, etc.) then it can lead to a lot of rework.

[1] - https://www.postgresql.org/message-id/20200622150636.GB28999%40momjian.us

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiko Sawada-2
On Tue, 23 Jun 2020 at 13:26, Amit Kapila <[hidden email]> wrote:

>
> On Tue, Jun 23, 2020 at 9:03 AM Masahiko Sawada
> <[hidden email]> wrote:
> >
> >
> > I've attached the latest version patches. I've incorporated the review
> > comments I got so far and improved locking strategy.
> >
>
> Thanks for updating the patch.
>
> > Please review it.
> >
>
> I think at this stage it is important that we do some study of various
> approaches to achieve this work and come up with a comparison of the
> pros and cons of each approach (a) what this patch provides, (b) what
> is implemented in Global Snapshots patch [1], (c) if possible, what is
> implemented in Postgres-XL.  I fear that if go too far in spending
> effort on this and later discovered that it can be better done via
> some other available patch/work (maybe due to a reasons like that
> approach can easily extended to provide atomic visibility or the
> design is more robust, etc.) then it can lead to a lot of rework.

Yeah, I have no objection to that plan but I think we also need to
keep in mind that (b), (c), and whatever we are thinking about global
consistency are talking about only PostgreSQL (and postgres_fdw). On
the other hand, this patch needs to implement the feature that can
resolve the atomic commit problem more generically, because the
foreign server might be using oracle_fdw, mysql_fdw, or other FDWs
connecting database systems supporting 2PC.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

akapila
On Fri, Jun 26, 2020 at 10:50 AM Masahiko Sawada
<[hidden email]> wrote:

>
> On Tue, 23 Jun 2020 at 13:26, Amit Kapila <[hidden email]> wrote:
> >
> >
> > I think at this stage it is important that we do some study of various
> > approaches to achieve this work and come up with a comparison of the
> > pros and cons of each approach (a) what this patch provides, (b) what
> > is implemented in Global Snapshots patch [1], (c) if possible, what is
> > implemented in Postgres-XL.  I fear that if go too far in spending
> > effort on this and later discovered that it can be better done via
> > some other available patch/work (maybe due to a reasons like that
> > approach can easily extended to provide atomic visibility or the
> > design is more robust, etc.) then it can lead to a lot of rework.
>
> Yeah, I have no objection to that plan but I think we also need to
> keep in mind that (b), (c), and whatever we are thinking about global
> consistency are talking about only PostgreSQL (and postgres_fdw).
>

I think we should explore if those approaches could be extended for
FDWs and if not then that could be considered as a disadvantage of
that approach.


--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Tatsuo Ishii-3
In reply to this post by Tatsuo Ishii-3
>> The point is this data inconsistency is
>> lead by an inconsistent read but not by an inconsistent commit
>> results. I think there are kinds of possibilities causing data
>> inconsistency but atomic commit and atomic visibility eliminate
>> different possibilities. We can eliminate all possibilities of data
>> inconsistency only after we support 2PC and globally MVCC.
>
> IMO any permanent data inconsistency is a serious problem for users no
> matter what the technical reasons are.

I have incorporated "Pangea" algorithm into Pgpool-II to implement the
atomic visibility. In a test below I have two PostgreSQL servers
(stock v12), server0 (port 11002) and server1 (port
11003). default_transaction_isolation was set to 'repeatable read' on
both PostgreSQL, this is required by Pangea. Pgpool-II replicates
write queries and send them to both server0 and server1. There are two
tables "t1" (having only 1 integer column "i") and "log" (having only
1 integer c column "i"). I have run following script
(inconsistency1.sql) via pgbench:

BEGIN;
UPDATE t1 SET i = i + 1;
END;

like: pgbench -n -c 1 -T 30 -f inconsistency1.sql

In the moment I have run another session from pgbench concurrently:

BEGIN;
INSERT INTO log SELECT * FROM t1;
END;

pgbench -n -c 1 -T 30 -f inconsistency2.sql

After finishing those two pgbench runs, I ran following COPY to see if
contents of table "log" are identical in server0 and server1:
psql -p 11002 -c "\copy log to '11002.txt'"
psql -p 11003 -c "\copy log to '11003.txt'"
cmp 11002.txt 11003.txt

The new Pgpool-II incorporating Pangea showed that 11002.txt and
11003.txt are identical as expected. This indicates that the atomic
visibility are kept.

On the other hand Pgpool-II which does not implement Pangea showed
differences in those files.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiro Ikeda
> I've attached the latest version patches. I've incorporated the review
> comments I got so far and improved locking strategy.

Thanks for updating the patch!
I have three questions about the v23 patches.


1. messages related to user canceling

In my understanding, there are two messages
which can be output when a user cancels the COMMIT command.

A. When prepare is failed, the output shows that
    committed locally but some error is occurred.

```
postgres=*# COMMIT;
^CCancel request sent
WARNING:  canceling wait for resolving foreign transaction due to user
request
DETAIL:  The transaction has already committed locally, but might not
have been committed on the foreign server.
ERROR:  server closed the connection unexpectedly
         This probably means the server terminated abnormally
         before or while processing the request.
CONTEXT:  remote SQL command: PREPARE TRANSACTION
'fx_1020791818_519_16399_10'
```

B. When prepare is succeeded,
    the output show that committed locally.

```
postgres=*# COMMIT;
^CCancel request sent
WARNING:  canceling wait for resolving foreign transaction due to user
request
DETAIL:  The transaction has already committed locally, but might not
have been committed on the foreign server.
COMMIT
```

In case of A, I think that "committed locally" message can confuse user.
Because although messages show committed but the transaction is
"ABORTED".

I think "committed" message means that "ABORT" is committed locally.
But is there a possibility of misunderstanding?

In case of A, it's better to change message for user friendly, isn't it?


2. typo

Is trasnactions in fdwxact.c typo?


3. FdwXactGetWaiter in fdwxact.c return unused value

FdwXactGetWaiter is called in FXRslvLoop function.
It returns *waitXid_p, but FXRslvloop doesn't seem to
use *waitXid_p. Do we need to return it?


Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Fujii Masao-4


On 2020/07/14 9:08, Masahiro Ikeda wrote:
>> I've attached the latest version patches. I've incorporated the review
>> comments I got so far and improved locking strategy.
>
> Thanks for updating the patch!

+1
I'm interested in these patches and now studying them. While checking
the behaviors of the patched PostgreSQL, I got three comments.

1. We can access to the foreign table even during recovery in the HEAD.
But in the patched version, when I did that, I got the following error.
Is this intentional?

ERROR:  cannot assign TransactionIds during recovery

2. With the patch, when INSERT/UPDATE/DELETE are executed both in
local and remote servers, 2PC is executed at the commit phase. But
when write SQL (e.g., TRUNCATE) except INSERT/UPDATE/DELETE are
executed in local and INSERT/UPDATE/DELETE are executed in remote,
2PC is NOT executed. Is this safe?

3. XACT_FLAGS_WROTENONTEMPREL is set when INSERT/UPDATE/DELETE
are executed. But it's not reset even when those queries are canceled by
ROLLBACK TO SAVEPOINT. This may cause unnecessary 2PC at the commit phase.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


Reply | Threaded
Open this post in threaded view
|

Re: Transactions involving multiple postgres foreign servers, take 2

Masahiro Ikeda
> I've attached the latest version patches. I've incorporated the review
> comments I got so far and improved locking strategy.

I want to ask a question about streaming replication with 2PC.
Are you going to support 2PC with streaming replication?

I tried streaming replication using v23 patches.
I confirm that 2PC works with streaming replication,
which there are primary/standby coordinator.

But, in my understanding, the WAL of "PREPARE" and
"COMMIT/ABORT PREPARED" can't be replicated to the standby server in
sync.

If this is right, the unresolved transaction can be occurred.

For example,

1. PREPARE is done
2. crash primary before the WAL related to PREPARE is
    replicated to the standby server
3. promote standby server // but can't execute "ABORT PREPARED"

In above case, the remote server has the unresolved transaction.
Can we solve this problem to support in-sync replication?

But, I think some users use async replication for performance.
Do we need to document the limitation or make another solution?

Regards,

--
Masahiro Ikeda
NTT DATA CORPORATION


123456