Transactions involving multiple postgres foreign servers


Transactions involving multiple postgres foreign servers

Ashutosh Bapat
Hi All,
While looking at the patch for supporting inheritance on foreign tables, I noticed that if a transaction makes changes on more than one foreign server, the current implementation in postgres_fdw doesn't make sure that either all of them roll back or all of them commit their changes; IOW there is a possibility that some of them commit their changes while others roll back theirs.

PFA a patch which uses 2PC to solve this problem. In pgfdw_xact_callback(), at the XACT_EVENT_PRE_COMMIT event it prepares the transaction on all the foreign PostgreSQL servers involved, and at the XACT_EVENT_COMMIT or XACT_EVENT_ABORT event it commits or aborts those prepared transactions respectively.

The logic for crafting the prepared transaction ids is rudimentary and I am open to suggestions there. I have the following goals in mind while crafting the transaction ids:
1. Minimize the chances of crafting a transaction id which would conflict with a concurrent transaction id on that foreign server.
2. Because of a limitation described later, the DBA/user should be able to identify the server which originated a remote transaction.
More can be found in comments above function pgfdw_get_prep_xact_id() in the patch.
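
To illustrate, the net effect on each foreign server is roughly the following sequence of commands (the GID and the statement shown are only examples; the real identifier is crafted by pgfdw_get_prep_xact_id()):

BEGIN;
UPDATE remote_table SET val = val + 1 WHERE id = 1;   -- whatever was pushed down to this server
-- at XACT_EVENT_PRE_COMMIT:
PREPARE TRANSACTION 'pgfdw_node1_16384_739';
-- at XACT_EVENT_COMMIT:
COMMIT PREPARED 'pgfdw_node1_16384_739';
-- or, at XACT_EVENT_ABORT:
-- ROLLBACK PREPARED 'pgfdw_node1_16384_739';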

Limitations
---------------
1. After a transaction has been prepared on a foreign server, if the connection to that server is lost before the transaction is rolled back or committed there, the transaction remains in the prepared state forever. Manual intervention is needed to clean up such a transaction (hence goal 2 above). Automating this cleanup would require significant changes to the transaction manager, so I have left it out of this patch for now; if required, I can work on that part in this patch itself.
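
For reference, the manual cleanup on a foreign server amounts to something like the following (again, the GID is just an example of the crafted format):

-- on the affected foreign server, list transactions stuck in the prepared state
SELECT gid, prepared, owner, database FROM pg_prepared_xacts;
-- then resolve each one according to what happened on the originating server
ROLLBACK PREPARED 'pgfdw_node1_16384_739';
-- or: COMMIT PREPARED 'pgfdw_node1_16384_739';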

2. 2PC is needed only when more than one foreign server is involved in a transaction; transactions on a single foreign server are handled well right now. So, ideally, the code should detect whether more than one foreign server is involved in the transaction and only then use 2PC. But I couldn't find a way to detect that without changing the transaction manager.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Attachment: pg_fdw_transact.patch (25K)

Re: Transactions involving multiple postgres foreign servers

Tom Lane-2
Ashutosh Bapat <[hidden email]> writes:
> While looking at the patch for supporting inheritance on foreign tables, I
> noticed that if a transaction makes changes to more than two foreign
> servers the current implementation in postgres_fdw doesn't make sure that
> either all of them rollback or all of them commit their changes, IOW there
> is a possibility that some of them commit their changes while others
> rollback theirs.

> PFA patch which uses 2PC to solve this problem. In pgfdw_xact_callback() at
> XACT_EVENT_PRE_COMMIT event, it sends prepares the transaction at all the
> foreign postgresql servers and at XACT_EVENT_COMMIT or XACT_EVENT_ABORT
> event it commits or aborts those transactions resp.

TBH, I think this is a pretty awful idea.

In the first place, this does little to improve the actual reliability
of a commit occurring across multiple foreign servers; and in the second
place it creates a bunch of brand new failure modes, many of which would
require manual DBA cleanup.

The core of the problem is that this doesn't have anything to do with
2PC as it's commonly understood: for that, you need a genuine external
transaction manager that is aware of all the servers involved in a
transaction, and has its own persistent state (or at least a way to
reconstruct its own state by examining the per-server states).
This patch is not that; in particular it treats the local transaction
asymmetrically from the remote ones, which doesn't seem like a great
idea --- ie, the local transaction could still abort after committing
all the remote ones, leaving you no better off in terms of cross-server
consistency.

As far as failure modes go, one basic reason why this cannot work as
presented is that the remote servers may not even have prepared
transaction support enabled (in fact max_prepared_transactions = 0
is the default in all supported PG versions).  So this would absolutely
have to be a not-on-by-default option.  But the bigger issue is that
leaving it to the DBA to clean up after failures is not a production
grade solution, *especially* not for prepared transactions, which are
performance killers if not closed out promptly.  So I can't imagine
anyone wanting to turn this on without a more robust answer than that.
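
For reference, enabling prepared transactions on a remote server means something along these lines on every participating server, plus a restart, since the parameter cannot be changed on the fly:

SHOW max_prepared_transactions;
-- 0 means any PREPARE TRANSACTION will fail; set max_prepared_transactions
-- to a nonzero value in postgresql.conf and restart the server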

Basically I think what you'd need for this to be a credible patch would be
for it to work by changing the behavior only in the PREPARE TRANSACTION
path: rather than punting as we do now, prepare the remote transactions,
and report their server identities and gids to an external transaction
manager, which would then be responsible for issuing the actual commits
(along with the actual commit of the local transaction).  I have no idea
whether it's feasible to do that without having to assume a particular
2PC transaction manager API/implementation.

It'd be interesting to hear from people who are using 2PC in production
to find out if this would solve any real-world problems for them, and
what the details of the TM interface would need to look like to make it
work in practice.

In short, you can't force 2PC technology on people who aren't using it
already; while for those who are using it already, this isn't nearly
good enough as-is.

                        regards, tom lane



Re: Transactions involving multiple postgres foreign servers

Robert Haas
On Fri, Jan 2, 2015 at 3:45 PM, Tom Lane <[hidden email]> wrote:
> In short, you can't force 2PC technology on people who aren't using it
> already; while for those who are using it already, this isn't nearly
> good enough as-is.

I was involved in some internal discussions related to this patch, so
I have some opinions on it.  The long-term, high-level goal here is to
facilitate sharding.  If we've got a bunch of PostgreSQL servers
interlinked via postgres_fdw, it should be possible to perform
transactions on the cluster in such a way that transactions are just
as atomic, consistent, isolated, and durable as they would be with
just one server.  As far as I know, there is no way to achieve this
goal through the use of an external transaction manager, because even
if that external transaction manager guarantees, for every
transaction, that the transaction either commits on all nodes or rolls
back on all nodes, there's no way for it to guarantee that other
transactions won't see some intermediate state where the commit has
been completed on some nodes but not others.  To get that, you need
some kind of integration that reaches down to the way snapshots are taken.

I think, though, that it might be worthwhile to first solve the
simpler problem of figuring out how to ensure that a transaction
commits everywhere or rolls back everywhere, even if intermediate
states might still be transiently visible.   I don't think this patch,
as currently designed, is equal to that challenge, because
XACT_EVENT_PRE_COMMIT fires before the transaction is certain to
commit - PreCommit_CheckForSerializationFailure or PreCommit_Notify
could still error out.  We could have a hook that fires after that,
but that doesn't solve the problem if a user of that hook can itself
throw an error.  Even if part of the API contract is that it's not
allowed to do so, the actual attempt to commit the change on the
remote side can fail due to - e.g. - a network interruption, and
that's got to be dealt with somehow.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Transactions involving multiple postgres foreign servers

Tom Lane-2
Robert Haas <[hidden email]> writes:

> I was involved in some internal discussions related to this patch, so
> I have some opinions on it.  The long-term, high-level goal here is to
> facilitate sharding.  If we've got a bunch of PostgreSQL servers
> interlinked via postgres_fdw, it should be possible to perform
> transactions on the cluster in such a way that transactions are just
> as atomic, consistent, isolated, and durable as they would be with
> just one server.  As far as I know, there is no way to achieve this
> goal through the use of an external transaction manager, because even
> if that external transaction manager guarantees, for every
> transaction, that the transaction either commits on all nodes or rolls
> back on all nodes, there's no way for it to guarantee that other
> transactions won't see some intermediate state where the commit has
> been completed on some nodes but not others.  To get that, you need
> some of integration that reaches down to the way snapshots are taken.

That's a laudable goal, but I would bet that nothing built on the FDW
infrastructure will ever get there.  Certainly the proposed patch doesn't
look like it moves us very far towards that set of goalposts.

> I think, though, that it might be worthwhile to first solve the
> simpler problem of figuring out how to ensure that a transaction
> commits everywhere or rolls back everywhere, even if intermediate
> states might still be transiently visible.

Perhaps.  I suspect that it might still be a dead end if the ultimate
goal is cross-system atomic commit ... but likely it would teach us
some useful things anyway.

                        regards, tom lane



Re: Transactions involving multiple postgres foreign servers

Robert Haas
On Mon, Jan 5, 2015 at 2:47 PM, Tom Lane <[hidden email]> wrote:
> That's a laudable goal, but I would bet that nothing built on the FDW
> infrastructure will ever get there.

Why?

It would be surprising to me if, given that we have gone to some pains
to create a system that allows cross-system queries, and hopefully
eventually pushdown of quals, joins, and aggregates, we then made
sharding work in some completely different way that reuses none of
that infrastructure.  But maybe I am looking at this the wrong way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Transactions involving multiple postgres foreign servers

Tom Lane-2
Robert Haas <[hidden email]> writes:
> On Mon, Jan 5, 2015 at 2:47 PM, Tom Lane <[hidden email]> wrote:
>> That's a laudable goal, but I would bet that nothing built on the FDW
>> infrastructure will ever get there.

> Why?

> It would be surprising to me if, given that we have gone to some pains
> to create a system that allows cross-system queries, and hopefully
> eventually pushdown of quals, joins, and aggregates, we then made
> sharding work in some completely different way that reuses none of
> that infrastructure.  But maybe I am looking at this the wrong way.

Well, we intentionally didn't couple the FDW stuff closely into
transaction commit, because of the thought that the "far end" would not
necessarily have Postgres-like transactional behavior, and even if it did
there would be about zero chance of having atomic commit with a
non-Postgres remote server.  postgres_fdw is a seriously bad starting
point as far as that goes, because it encourages one to make assumptions
that can't possibly work for any other wrapper.

I think the idea I sketched upthread of supporting an external transaction
manager might be worth pursuing, in that it would potentially lead to
having at least an approximation of atomic commit across heterogeneous
servers.

Independently of that, I think what you are talking about would be better
addressed outside the constraints of the FDW mechanism.  That's not to say
that we couldn't possibly make postgres_fdw use some additional non-FDW
infrastructure to manage commits; just that solving this in terms of the
FDW infrastructure seems wrongheaded to me.

                        regards, tom lane



Re: Transactions involving multiple postgres foreign servers

Robert Haas
On Mon, Jan 5, 2015 at 3:23 PM, Tom Lane <[hidden email]> wrote:
> Well, we intentionally didn't couple the FDW stuff closely into
> transaction commit, because of the thought that the "far end" would not
> necessarily have Postgres-like transactional behavior, and even if it did
> there would be about zero chance of having atomic commit with a
> non-Postgres remote server.  postgres_fdw is a seriously bad starting
> point as far as that goes, because it encourages one to make assumptions
> that can't possibly work for any other wrapper.

Atomic commit is something that can potentially be supported by many
different FDWs, as long as the thing on the other end supports 2PC.
If you're talking to Oracle or DB2 or SQL Server, and it supports 2PC,
then you can PREPARE the transaction and then go back and COMMIT the
transaction once it's committed locally.  Getting a cluster-wide
*snapshot* is probably a PostgreSQL-only thing requiring much deeper
integration, but I think it would be sensible to leave that as a
future project and solve the simpler problem first.

> I think the idea I sketched upthread of supporting an external transaction
> manager might be worth pursuing, in that it would potentially lead to
> having at least an approximation of atomic commit across heterogeneous
> servers.

An important threshold question here is whether we want to rely on an
external transaction manager, or build one into PostgreSQL.  As far as
this particular project goes, there's nothing that can't be done
inside PostgreSQL.  You need a durable registry of which transactions
you prepared on which servers, and which XIDs they correlate to.  If
you have that, then you can use background workers or similar to go
retry commits or rollbacks of prepared transactions until it works,
even if there's been a local crash meanwhile.

Alternatively, you could rely on an external transaction manager to do
all that stuff.  I don't have a clear sense of what that would entail,
or how it might be better or worse than rolling our own.  I suspect,
though, that it might amount to little more than adding a middle man.
I mean, a third-party transaction manager isn't going to automatically
know how to commit a transaction prepared on some foreign server using
some foreign data wrapper.  It's going to have to be taught that if
postgres_fdw leaves a transaction in-medias-res on server OID 1234,
you've got to connect to the target machine using that foreign
server's connection parameters, speak libpq, and issue the appropriate
COMMIT TRANSACTION command.  And similarly, you're going to need to
arrange to notify it before preparing that transaction so that it
knows that it needs to request the COMMIT or ABORT later on.  Once
you've got all of that infrastructure for that in place, what are you
really gaining over just doing it in PostgreSQL (or, say, a contrib
module thereto)?

(I'm also concerned that an external transaction manager might need
the PostgreSQL client to be aware of it, whereas what we'd really like
here is for the client to just speak PostgreSQL and be happy that its
commits no longer end up half-done.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Transactions involving multiple postgres foreign servers

Ashutosh Bapat
In reply to this post by Robert Haas


On Mon, Jan 5, 2015 at 11:55 PM, Robert Haas <[hidden email]> wrote:
On Fri, Jan 2, 2015 at 3:45 PM, Tom Lane <[hidden email]> wrote:
> In short, you can't force 2PC technology on people who aren't using it
> already; while for those who are using it already, this isn't nearly
> good enough as-is.

I was involved in some internal discussions related to this patch, so
I have some opinions on it.  The long-term, high-level goal here is to
facilitate sharding.  If we've got a bunch of PostgreSQL servers
interlinked via postgres_fdw, it should be possible to perform
transactions on the cluster in such a way that transactions are just
as atomic, consistent, isolated, and durable as they would be with
just one server.  As far as I know, there is no way to achieve this
goal through the use of an external transaction manager, because even
if that external transaction manager guarantees, for every
transaction, that the transaction either commits on all nodes or rolls
back on all nodes, there's no way for it to guarantee that other
transactions won't see some intermediate state where the commit has
been completed on some nodes but not others.  To get that, you need
some of integration that reaches down to the way snapshots are taken.

I think, though, that it might be worthwhile to first solve the
simpler problem of figuring out how to ensure that a transaction
commits everywhere or rolls back everywhere, even if intermediate
states might still be transiently visible.

Agreed.
 
I don't think this patch,
as currently designed, is equal to that challenge, because
XACT_EVENT_PRE_COMMIT fires before the transaction is certain to
commit - PreCommit_CheckForSerializationFailure or PreCommit_Notify
could still error out.  We could have a hook that fires after that,
but that doesn't solve the problem if a user of that hook can itself
throw an error.  Even if part of the API contract is that it's not
allowed to do so, the actual attempt to commit the change on the
remote side can fail due to - e.g. - a network interruption, and
that's go to be dealt with somehow.


Tom mentioned
--
in particular it treats the local transaction
asymmetrically from the remote ones, which doesn't seem like a great
idea --- ie, the local transaction could still abort after committing
all the remote ones, leaving you no better off in terms of cross-server
consistency.
--
You have given a specific example of this case. So, let me do a dry run through CommitTransaction() after applying my patch.
1899     CallXactCallbacks(XACT_EVENT_PRE_COMMIT);

While processing this event, postgres_fdw's callback pgfdw_xact_callback() sends a PREPARE TRANSACTION to all the foreign servers involved. These servers return their success or failure. If even one of them fails, the local transaction is aborted along with all the prepared transactions. Only if all the foreign servers succeed do we proceed further.

1925     PreCommit_CheckForSerializationFailure();
1926
1932     PreCommit_Notify();
1933

If any of these functions (as you mentioned above) throws an error, the local transaction will be aborted, as will the remote prepared transactions. Note that we haven't yet committed the local transaction (that happens below), nor the remote transactions, which are still in the PREPAREd state there. Since all the transactions, local as well as remote, are aborted in case of error, the data is still consistent. If these steps succeed, we proceed ahead.

1934     /* Prevent cancel/die interrupt while cleaning up */
1935     HOLD_INTERRUPTS();
1936
1937     /* Commit updates to the relation map --- do this as late as possible */
1938     AtEOXact_RelationMap(true);
1939
1940     /*
1941      * set the current transaction state information appropriately during
1942      * commit processing
1943      */
1944     s->state = TRANS_COMMIT;
1945
1946     /*
1947      * Here is where we really truly commit.
1948      */
1949     latestXid = RecordTransactionCommit();
1950
1951     TRACE_POSTGRESQL_TRANSACTION_COMMIT(MyProc->lxid);
1952
1953     /*
1954      * Let others know about no transaction in progress by me. Note that this
1955      * must be done _before_ releasing locks we hold and _after_
1956      * RecordTransactionCommit.
1957      */
1958     ProcArrayEndTransaction(MyProc, latestXid);
1959

The local transaction is now committed, while the remote transactions are still in the PREPAREd state. If any server (including the local one) crashes or a link failure happens here, we leave the remote transactions dangling in the PREPAREd state and manual cleanup will be required.

1975
1976     CallXactCallbacks(XACT_EVENT_COMMIT);

With my patch, the postgres_fdw callback pgfdw_xact_callback() commits the PREPAREd transactions by sending COMMIT PREPARED to each remote server. So, I don't see why my patch would cause inconsistencies. It can cause dangling PREPAREd transactions, and I have already acknowledged that fact.

Am I missing something?

 
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Transactions involving multiple postgres foreign servers

Ashutosh Bapat
In reply to this post by Tom Lane-2

On Sat, Jan 3, 2015 at 2:15 AM, Tom Lane <[hidden email]> wrote:
Ashutosh Bapat <[hidden email]> writes:
> While looking at the patch for supporting inheritance on foreign tables, I
> noticed that if a transaction makes changes to more than two foreign
> servers the current implementation in postgres_fdw doesn't make sure that
> either all of them rollback or all of them commit their changes, IOW there
> is a possibility that some of them commit their changes while others
> rollback theirs.

> PFA patch which uses 2PC to solve this problem. In pgfdw_xact_callback() at
> XACT_EVENT_PRE_COMMIT event, it sends prepares the transaction at all the
> foreign postgresql servers and at XACT_EVENT_COMMIT or XACT_EVENT_ABORT
> event it commits or aborts those transactions resp.

TBH, I think this is a pretty awful idea.

In the first place, this does little to improve the actual reliability
of a commit occurring across multiple foreign servers; and in the second
place it creates a bunch of brand new failure modes, many of which would
require manual DBA cleanup.

The core of the problem is that this doesn't have anything to do with
2PC as it's commonly understood: for that, you need a genuine external
transaction manager that is aware of all the servers involved in a
transaction, and has its own persistent state (or at least a way to
reconstruct its own state by examining the per-server states).
This patch is not that; in particular it treats the local transaction
asymmetrically from the remote ones, which doesn't seem like a great
idea --- ie, the local transaction could still abort after committing
all the remote ones, leaving you no better off in terms of cross-server
consistency.

As far as failure modes go, one basic reason why this cannot work as
presented is that the remote servers may not even have prepared
transaction support enabled (in fact max_prepared_transactions = 0
is the default in all supported PG versions).  So this would absolutely
have to be a not-on-by-default option.

Agreed. We can have a per-foreign-server option which says whether the corresponding server can participate in 2PC. A transaction spanning multiple foreign servers, with at least one of them not capable of participating in 2PC, will need to be aborted.
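
For example (the option name here is purely illustrative; postgres_fdw's validator would have to be taught to accept whatever we settle on):

ALTER SERVER foreign_server_1 OPTIONS (ADD two_phase_commit 'true');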
 
But the bigger issue is that
leaving it to the DBA to clean up after failures is not a production
grade solution, *especially* not for prepared transactions, which are
performance killers if not closed out promptly.  So I can't imagine
anyone wanting to turn this on without a more robust answer than that.

I purposely left that outside this patch, since it involves significant changes in core. If that's necessary for the first cut, I will work on it.
 

Basically I think what you'd need for this to be a credible patch would be
for it to work by changing the behavior only in the PREPARE TRANSACTION
path: rather than punting as we do now, prepare the remote transactions,
and report their server identities and gids to an external transaction
manager, which would then be responsible for issuing the actual commits
(along with the actual commit of the local transaction).  I have no idea
whether it's feasible to do that without having to assume a particular
2PC transaction manager API/implementation.

I doubt a TM would expect a bunch of GIDs in response to a PREPARE TRANSACTION command. Per X/Open, xa_prepare() expects an integer return value specifying whether the PREPARE succeeded or not, plus some piggybacked statuses.

In the context of foreign tables under an inheritance tree, a single DML statement can span multiple foreign servers. All such DMLs would then need to be handled by an external TM. An external TM or application may not have an exact idea of which foreign servers are going to be affected by a given DML statement. Users may not want to set up an external TM in such cases. Instead, they would expect PostgreSQL to manage such DMLs and transactions all by itself.

As Robert has suggested in his responses, it would be better to enable PostgreSQL to manage distributed transactions itself.


It'd be interesting to hear from people who are using 2PC in production
to find out if this would solve any real-world problems for them, and
what the details of the TM interface would need to look like to make it
work in practice.

In short, you can't force 2PC technology on people who aren't using it
already; while for those who are using it already, this isn't nearly
good enough as-is.

                        regards, tom lane



--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Transactions involving multiple postgres foreign servers

Ashutosh Bapat
In reply to this post by Robert Haas


On Tue, Jan 6, 2015 at 11:55 PM, Robert Haas <[hidden email]> wrote:
On Mon, Jan 5, 2015 at 3:23 PM, Tom Lane <[hidden email]> wrote:
> Well, we intentionally didn't couple the FDW stuff closely into
> transaction commit, because of the thought that the "far end" would not
> necessarily have Postgres-like transactional behavior, and even if it did
> there would be about zero chance of having atomic commit with a
> non-Postgres remote server.  postgres_fdw is a seriously bad starting
> point as far as that goes, because it encourages one to make assumptions
> that can't possibly work for any other wrapper.

Atomic commit is something that can potentially be supported by many
different FDWs, as long as the thing on the other end supports 2PC.
If you're talking to Oracle or DB2 or SQL Server, and it supports 2PC,
then you can PREPARE the transaction and then go back and COMMIT the
transaction once it's committed locally.
Getting a cluster-wide
*snapshot* is probably a PostgreSQL-only thing requiring much deeper
integration, but I think it would be sensible to leave that as a
future project and solve the simpler problem first.

> I think the idea I sketched upthread of supporting an external transaction
> manager might be worth pursuing, in that it would potentially lead to
> having at least an approximation of atomic commit across heterogeneous
> servers.

An important threshold question here is whether we want to rely on an
external transaction manager, or build one into PostgreSQL.  As far as
this particular project goes, there's nothing that can't be done
inside PostgreSQL.  You need a durable registry of which transactions
you prepared on which servers, and which XIDs they correlate to.  If
you have that, then you can use background workers or similar to go
retry commits or rollbacks of prepared transactions until it works,
even if there's been a local crash meanwhile.

Alternatively, you could rely on an external transaction manager to do
all that stuff.  I don't have a clear sense of what that would entail,
or how it might be better or worse than rolling our own.  I suspect,
though, that it might amount to little more than adding a middle man.
I mean, a third-party transaction manager isn't going to automatically
know how to commit a transaction prepared on some foreign server using
some foreign data wrapper.  It's going to be have to be taught that if
postgres_fdw leaves a transaction in-medias-res on server OID 1234,
you've got to connect to the target machine using that foreign
server's connection parameters, speak libpq, and issue the appropriate
COMMIT TRANSACTION command.  And similarly, you're going to need to
arrange to notify it before preparing that transaction so that it
knows that it needs to request the COMMIT or ABORT later on.  Once
you've got all of that infrastructure for that in place, what are you
really gaining over just doing it in PostgreSQL (or, say, a contrib
module thereto)?

Thanks, Robert, for giving a high-level view of the system needed for PostgreSQL to be a transaction manager by itself. Agreed completely.
 

(I'm also concerned that an external transaction manager might need
the PostgreSQL client to be aware of it, whereas what we'd really like
here is for the client to just speak PostgreSQL and be happy that its
commits no longer end up half-done.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Transactions involving multiple postgres foreign servers

Kevin Grittner-5
In reply to this post by Ashutosh Bapat
Ashutosh Bapat <[hidden email]> wrote:

> I don't see why would my patch cause inconsistencies. It can
> cause dangling PREPAREd transactions and I have already
> acknowledged that fact.
>
> Am I missing something?

To me that is the big problem.  Where I have run into ad hoc
distributed transaction managers it has usually been because a
crash left prepared transactions dangling, without cleaning them up
when the transaction manager was restarted.  This tends to wreak
havoc one way or another.

If we are going to include a distributed transaction manager with
PostgreSQL, it *must* persist enough information about the
transaction ID and where it is used in a way that will survive a
subsequent crash before beginning the PREPARE on any of the
systems.  After all nodes are PREPAREd it must flag that persisted
data to indicate that it is now at a point where ROLLBACK is no
longer an option.  Only then can it start committing the prepared
transactions.  After the last node is committed it can clear this
information.  On start-up the distributed transaction manager must
check for any distributed transactions left "in progress" and
commit or rollback based on the preceding; doing retries
indefinitely until it succeeds or is told to stop.

Doing this incompletely (i.e., not identifying and correctly
handling the various failure modes) is IMO far worse than not
attempting it.  If we could build in something that did this
completely and well, that would be a cool selling point; but let's
not gloss over the difficulties.  We must recognize how big a
problem it would be to include a low-quality implementation.

Also, as previously mentioned, it must behave in some reasonable
way if a database is not configured to support 2PC, especially
since 2PC is off by default in PostgreSQL.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Transactions involving multiple postgres foreign servers

Ashutosh Bapat


On Wed, Jan 7, 2015 at 9:50 PM, Kevin Grittner <[hidden email]> wrote:
Ashutosh Bapat <[hidden email]> wrote:

> I don't see why would my patch cause inconsistencies. It can
> cause dangling PREPAREd transactions and I have already
> acknowledged that fact.
>
> Am I missing something?

To me that is the big problem.  Where I have run into ad hoc
distributed transaction managers it has usually been because a
crash left prepared transactions dangling, without cleaning them up
when the transaction manager was restarted.  This tends to wreak
havoc one way or another.

If we are going to include a distributed transaction manager with
PostgreSQL, it *must* persist enough information about the
transaction ID and where it is used in a way that will survive a
subsequent crash before beginning the PREPARE on any of the
systems.

Thanks a lot. I hadn't thought of this.
 
After all nodes are PREPAREd it must flag that persisted
data to indicate that it is now at a point where ROLLBACK is no
longer an option.  Only then can it start committing the prepared
transactions.  After the last node is committed it can clear this
information.  On start-up the distributed transaction manager must
check for any distributed transactions left "in progress" and
commit or rollback based on the preceding; doing retries
indefinitely until it succeeds or is told to stop.


Agreed.
 
Doing this incompletely (i.e., not identifying and correctly
handling the various failure modes) is IMO far worse than not
attempting it.  If we could build in something that did this
completely and well, that would be a cool selling point; but let's
not gloss over the difficulties.  We must recognize how big a
problem it would be to include a low-quality implementation.

Also, as previously mentioned, it must behave in some reasonable
way if a database is not configured to support 2PC, especially
since 2PC is off by default in PostgreSQL.


I described one possibility in my reply to Tom's mail. Let me repeat it here.

We can have a per-foreign-server option which says whether the corresponding server is able to participate in 2PC. A transaction spanning multiple foreign servers, with at least one of them not capable of participating in 2PC, will be aborted.

Will that work?

In case a user incorrectly flags a foreign server as capable of 2PC, I expect the corresponding FDW would raise an error (either because the PREPARE fails or because the FDW doesn't handle that case) and the transaction will be aborted anyway.
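
For instance, if prepared transactions are disabled on the foreign server, the PREPARE step fails roughly like this (GID again just an example), aborting the whole distributed transaction:

PREPARE TRANSACTION 'pgfdw_node1_16384_739';
-- ERROR:  prepared transactions are disabled
-- HINT:  Set max_prepared_transactions to a nonzero value.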
 
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Transactions involving multiple postgres foreign servers

Kevin Grittner-5
Ashutosh Bapat <[hidden email]> wrote:
> On Wed, Jan 7, 2015 at 9:50 PM, Kevin Grittner <[hidden email]> wrote:

>> Also, as previously mentioned, it must behave in some reasonable
>> way if a database is not configured to support 2PC, especially
>> since 2PC is off by default in PostgreSQL.

> We can have a per foreign server option, which says whether the
> corresponding server is able to participate in 2PC. A transaction
> spanning multiple foreign server with at least one of them not
> capable of participating in 2PC will be aborted.
>
> Will that work?
>
> In case a user flags a foreign server as capable to 2PC
> incorrectly, I expect the corresponding FDW would raise error
> (either because PREPARE fails or FDW doesn't handle that case)
> and the transaction will be aborted anyway.

That sounds like one way to handle it.  I'm not clear on how you
plan to determine whether 2PC is required for a transaction.
(Apologies if it was previously mentioned and I've forgotten it.)

I don't mean to suggest that these problems are insurmountable; I
just think that people often underestimate the difficulty of
writing a distributed transaction manager and don't always
recognize the problems that it will cause if all of the failure
modes are not considered and handled.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Transactions involving multiple postgres foreign servers

Robert Haas
In reply to this post by Kevin Grittner-5
On Wed, Jan 7, 2015 at 11:20 AM, Kevin Grittner <[hidden email]> wrote:

> If we are going to include a distributed transaction manager with
> PostgreSQL, it *must* persist enough information about the
> transaction ID and where it is used in a way that will survive a
> subsequent crash before beginning the PREPARE on any of the
> systems.  After all nodes are PREPAREd it must flag that persisted
> data to indicate that it is now at a point where ROLLBACK is no
> longer an option.  Only then can it start committing the prepared
> transactions.  After the last node is committed it can clear this
> information.  On start-up the distributed transaction manager must
> check for any distributed transactions left "in progress" and
> commit or rollback based on the preceding; doing retries
> indefinitely until it succeeds or is told to stop.

I think one key question here is whether all of this should be handled
in PostgreSQL core or whether some of it should be handled in other
ways.  Is the goal to make postgres_fdw (and FDWs for other databases
that support 2PC) to persist enough information that someone *could*
write a transaction manager for PostgreSQL, or is the goal to actually
write that transaction manager?

Just figuring out how to persist the necessary information is a
non-trivial problem by itself.  You might think that you could just
insert a row into a local table saying, hey, I'm about to prepare a
transaction remotely, but of course that doesn't work: if you then go
on to PREPARE before writing and flushing the local commit record,
then a crash before that's done leaves a dangling prepared transaction
on the remote node.  You might think to write the record, then write
and flush the local commit record, and only then do the PREPARE.  But you
can't do that either, because now if the PREPARE fails you've already
committed locally.

I guess what you need to do is something like:

1. Write and flush a WAL record indicating an intent to prepare, with
a list of foreign server OIDs and GIDs.
2. Prepare the remote transaction on each node.  If any of those
operations fail, roll back any prepared nodes and error out.
3. Commit locally (i.e. RecordTransactionCommit, writing and flushing WAL).
4. Try to commit the remote transactions.
5. Write a WAL record indicating that you committed the remote transactions OK.

If you fail after step 1, you can straighten things out by looking at
the status of the transaction: if the transaction committed, any
transactions we intended-to-prepare need to be checked.  If they are
still prepared, we need to commit them or roll them back according to
what happened to our XID.

(Andres is talking in my other ear suggesting that we ought to reuse
the 2PC infrastructure to do all this.  I'm not convinced that's a
good idea, but I'll let him present his own ideas here if he wants to
rather than trying to explain them myself.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Transactions involving multiple postgres foreign servers

Kevin Grittner-5
Robert Haas <[hidden email]> wrote:

> Andres is talking in my other ear suggesting that we ought to
> reuse the 2PC infrastructure to do all this.

If you mean that the primary transaction and all FDWs in the
transaction must use 2PC, that is what I was saying, although
apparently not clearly enough.  All nodes *including the local one*
must be prepared and committed with data about the nodes saved
safely off somewhere that it can be read in the event of a failure
of any of the nodes *including the local one*.  Without that, I see
this whole approach as a train wreck just waiting to happen.

I'm not really clear on the mechanism that is being proposed for
doing this, but one way would be to have the PREPARE of the local
transaction be requested explicitly and to have that cause all FDWs
participating in the transaction to also be prepared.  (That might
be what Andres meant; I don't know.)  That doesn't strike me as the
only possible mechanism to drive this, but it might well be the
simplest and cleanest.  The trickiest bit might be to find a good
way to persist the distributed transaction information in a way
that survives the failure of the main transaction -- or even the
abrupt loss of the machine it's running on.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Transactions involving multiple postgres foreign servers

Robert Haas
On Thu, Jan 8, 2015 at 10:19 AM, Kevin Grittner <[hidden email]> wrote:

> Robert Haas <[hidden email]> wrote:
>> Andres is talking in my other ear suggesting that we ought to
>> reuse the 2PC infrastructure to do all this.
>
> If you mean that the primary transaction and all FDWs in the
> transaction must use 2PC, that is what I was saying, although
> apparently not clearly enough.  All nodes *including the local one*
> must be prepared and committed with data about the nodes saved
> safely off somewhere that it can be read in the event of a failure
> of any of the nodes *including the local one*.  Without that, I see
> this whole approach as a train wreck just waiting to happen.

Clearly, all the nodes other than the local one need to use 2PC.  I am
unconvinced that the local node must write a 2PC state file only to
turn around and remove it again almost immediately thereafter.

> I'm not really clear on the mechanism that is being proposed for
> doing this, but one way would be to have the PREPARE of the local
> transaction be requested explicitly and to have that cause all FDWs
> participating in the transaction to also be prepared.  (That might
> be what Andres meant; I don't know.)

We want this to be client-transparent, so that the client just says
COMMIT and everything Just Works.

> That doesn't strike me as the
> only possible mechanism to drive this, but it might well be the
> simplest and cleanest.  The trickiest bit might be to find a good
> way to persist the distributed transaction information in a way
> that survives the failure of the main transaction -- or even the
> abrupt loss of the machine it's running on.

I'd be willing to punt on surviving a loss of the entire machine.  But
I'd like to be able to survive an abrupt reboot.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Transactions involving multiple postgres foreign servers

Kevin Grittner-5
Robert Haas <[hidden email]> wrote:

> On Thu, Jan 8, 2015 at 10:19 AM, Kevin Grittner <[hidden email]> wrote:
>> Robert Haas <[hidden email]> wrote:
>>> Andres is talking in my other ear suggesting that we ought to
>>> reuse the 2PC infrastructure to do all this.
>>
>> If you mean that the primary transaction and all FDWs in the
>> transaction must use 2PC, that is what I was saying, although
>> apparently not clearly enough.  All nodes *including the local one*
>> must be prepared and committed with data about the nodes saved
>> safely off somewhere that it can be read in the event of a failure
>> of any of the nodes *including the local one*.  Without that, I see
>> this whole approach as a train wreck just waiting to happen.
>
> Clearly, all the nodes other than the local one need to use 2PC.  I am
> unconvinced that the local node must write a 2PC state file only to
> turn around and remove it again almost immediately thereafter.

The key point is that the distributed transaction data must be
flagged as needing to commit rather than roll back between the
prepare phase and the final commit.  If you try to avoid the
PREPARE, flagging, COMMIT PREPARED sequence by building the
flagging of the distributed transaction metadata into the COMMIT
process, you still have the problem of what to do on crash
recovery.  You really need to use 2PC to keep that clean, I think.

>> I'm not really clear on the mechanism that is being proposed for
>> doing this, but one way would be to have the PREPARE of the local
>> transaction be requested explicitly and to have that cause all FDWs
>> participating in the transaction to also be prepared.  (That might
>> be what Andres meant; I don't know.)
>
> We want this to be client-transparent, so that the client just says
> COMMIT and everything Just Works.

What about the case where one or more nodes don't support 2PC?
Do we silently make the choice, without the client really knowing?

>> That doesn't strike me as the
>> only possible mechanism to drive this, but it might well be the
>> simplest and cleanest.  The trickiest bit might be to find a good
>> way to persist the distributed transaction information in a way
>> that survives the failure of the main transaction -- or even the
>> abrupt loss of the machine it's running on.
>
> I'd be willing to punt on surviving a loss of the entire machine.  But
> I'd like to be able to survive an abrupt reboot.

As long as people are aware that there is an urgent need to find
and fix all data stores to which clusters on the failed machine
were connected via FDW when there is a hard machine failure, I
guess it is OK.  In essence we just document it and declare it to
be somebody else's problem.  In general I would expect a
distributed transaction manager to behave well in the face of any
single-machine failure, but if there is one aspect of a
full-featured distributed transaction manager we could give up, I
guess that would be it.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Transactions involving multiple postgres foreign servers

Ashutosh Bapat
In reply to this post by Kevin Grittner-5


On Thu, Jan 8, 2015 at 7:02 PM, Kevin Grittner <[hidden email]> wrote:
Ashutosh Bapat <[hidden email]> wrote:
> On Wed, Jan 7, 2015 at 9:50 PM, Kevin Grittner <[hidden email]> wrote:

>> Also, as previously mentioned, it must behave in some reasonable
>> way if a database is not configured to support 2PC, especially
>> since 2PC is off by default in PostgreSQL.

> We can have a per foreign server option, which says whether the
> corresponding server is able to participate in 2PC. A transaction
> spanning multiple foreign server with at least one of them not
> capable of participating in 2PC will be aborted.
>
> Will that work?
>
> In case a user flags a foreign server as capable to 2PC
> incorrectly, I expect the corresponding FDW would raise error
> (either because PREPARE fails or FDW doesn't handle that case)
> and the transaction will be aborted anyway.

That sounds like one way to handle it.  I'm not clear on how you
plan to determine whether 2PC is required for a transaction.
(Apologies if it was previously mentioned and I've forgotten it.)

Any transaction involving more than one server (including the local one, I guess) will require 2PC. A transaction may modify and access a remote database but not the local one; in such a case, the state of the local transaction doesn't matter once the remote transaction is committed or rolled back.
 

I don't mean to suggest that these problems are insurmountable; I
just think that people often underestimate the difficulty of
writing a distributed transaction manager and don't always
recognize the problems that it will cause if all of the failure
modes are not considered and handled.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Transactions involving multiple postgres foreign servers

Ashutosh Bapat
In reply to this post by Robert Haas


On Thu, Jan 8, 2015 at 8:24 PM, Robert Haas <[hidden email]> wrote:
On Wed, Jan 7, 2015 at 11:20 AM, Kevin Grittner <[hidden email]> wrote:
> If we are going to include a distributed transaction manager with
> PostgreSQL, it *must* persist enough information about the
> transaction ID and where it is used in a way that will survive a
> subsequent crash before beginning the PREPARE on any of the
> systems.  After all nodes are PREPAREd it must flag that persisted
> data to indicate that it is now at a point where ROLLBACK is no
> longer an option.  Only then can it start committing the prepared
> transactions.  After the last node is committed it can clear this
> information.  On start-up the distributed transaction manager must
> check for any distributed transactions left "in progress" and
> commit or rollback based on the preceding; doing retries
> indefinitely until it succeeds or is told to stop.

I think one key question here is whether all of this should be handled
in PostgreSQL core or whether some of it should be handled in other
ways.  Is the goal to make postgres_fdw (and FDWs for other databases
that support 2PC) to persist enough information that someone *could*
write a transaction manager for PostgreSQL, or is the goal to actually
write that transaction manager?

Just figuring out how to persist the necessary information is a
non-trivial problem by itself.  You might think that you could just
insert a row into a local table saying, hey, I'm about to prepare a
transaction remotely, but of course that doesn't work: if you then go
on to PREPARE before writing and flushing the local commit record,
then a crash before that's done leaves a dangling prepared transaction
on the remote note.  You might think to write the record, then after
writing and flush the local commit record do the PREPARE.  But you
can't do that either, because now if the PREPARE fails you've already
committed locally.

I guess what you need to do is something like:

1. Write and flush a WAL record indicating an intent to prepare, with
a list of foreign server OIDs and GUIDs.
2. Prepare the remote transaction on each node.  If any of those
operations fail, roll back any prepared nodes and error out.
3. Commit locally (i.e. RecordTransactionCommit, writing and flushing WAL).
4. Try to commit the remote transactions.
5. Write a WAL record indicating that you committed the remote transactions OK.

If you fail after step 1, you can straighten things out by looking at
the status of the transaction: if the transaction committed, any
transactions we intended-to-prepare need to be checked.  If they are
still prepared, we need to commit them or roll them back according to
what happened to our XID.

When you want to straighten things out and commit, the foreign server may not be available to do that. As Kevin pointed out above, we need to keep retrying to resolve the PREPAREd transactions on the foreign servers (commit or rollback based on the status of the local transaction) until they are resolved. So, we will have to persist the information somewhere other than the WAL, or else keep the WAL around even after the corresponding local transaction has been committed or aborted, which I don't think is a good idea, since that will have an impact on replication and VACUUM, especially because it affects the oldest transaction in the WAL.
 
That's where Andres's suggestion might help.

(Andres is talking in my other ear suggesting that we ought to reuse
the 2PC infrastructure to do all this.  I'm not convinced that's a
good idea, but I'll let him present his own ideas here if he wants to
rather than trying to explain them myself.)


We can persist the information about distributed transactions (especially those which require 2PC) similarly to the way the 2PC infrastructure persists its state in the pg_twophase directory. I am still investigating whether we can reuse the existing 2PC infrastructure or not. My initial reaction is no, since 2PC persists information about the local transaction, including locked objects and WAL (?), in the pg_twophase directory, which is not required for a distributed transaction. But the rest of the mechanism, like the manner of processing the records during normal processing and recovery, looks very useful.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Transactions involving multiple postgres foreign servers

Jim Nasby-5
In reply to this post by Kevin Grittner-5
On 1/8/15, 12:00 PM, Kevin Grittner wrote:

> Robert Haas <[hidden email]> wrote:
>> On Thu, Jan 8, 2015 at 10:19 AM, Kevin Grittner <[hidden email]> wrote:
>>> Robert Haas <[hidden email]> wrote:
>>>> Andres is talking in my other ear suggesting that we ought to
>>>> reuse the 2PC infrastructure to do all this.
>>>
>>> If you mean that the primary transaction and all FDWs in the
>>> transaction must use 2PC, that is what I was saying, although
>>> apparently not clearly enough.  All nodes *including the local one*
>>> must be prepared and committed with data about the nodes saved
>>> safely off somewhere that it can be read in the event of a failure
>>> of any of the nodes *including the local one*.  Without that, I see
>>> this whole approach as a train wreck just waiting to happen.
>>
>> Clearly, all the nodes other than the local one need to use 2PC.  I am
>> unconvinced that the local node must write a 2PC state file only to
>> turn around and remove it again almost immediately thereafter.
>
> The key point is that the distributed transaction data must be
> flagged as needing to commit rather than roll back between the
> prepare phase and the final commit.  If you try to avoid the
> PREPARE, flagging, COMMIT PREPARED sequence by building the
> flagging of the distributed transaction metadata into the COMMIT
> process, you still have the problem of what to do on crash
> recovery.  You really need to use 2PC to keep that clean, I think.

If we had an independent transaction coordinator then I would agree with you, Kevin. I think Robert is proposing that if we are controlling one of the participating nodes as well as coordinating the overall transaction, we can take some shortcuts. AIUI a PREPARE means you are completely ready to commit; in essence you're just waiting to write and fsync the commit record. That is in fact the state a coordinating PG node would be in by the time everyone else has done their PREPARE. So from that standpoint we're OK.

Now, as soon as ANY of the nodes commit, our coordinating node MUST be able to commit as well! That would require it to have created a real prepared transaction of its own. However, as long as there is zero chance of any other prepared transactions committing before our local transaction, that step isn't actually needed. Our local transaction will either commit or abort, and that will determine what needs to happen on all other nodes.

I'm ignoring the question of how the local node needs to store info about the other nodes in case of a crash, but AFAICT you could reliably recover manually from what I just described.

I think the question is: are we OK with "going under the skirt" in this fashion? Presumably it would provide better performance, whereas forcing ourselves to eat our own 2PC dogfood would make it easier for someone to plug in an external coordinator instead of using our own. I think there's also a lot to be said for getting a partial implementation of this available today (requiring manual recovery), so long as it's not in core.

BTW, I found https://www.cs.rutgers.edu/~pxk/417/notes/content/transactions.html a useful read, specifically the 2PC portion.

>>> I'm not really clear on the mechanism that is being proposed for
>>> doing this, but one way would be to have the PREPARE of the local
>>> transaction be requested explicitly and to have that cause all FDWs
>>> participating in the transaction to also be prepared.  (That might
>>> be what Andres meant; I don't know.)
>>
>> We want this to be client-transparent, so that the client just says
>> COMMIT and everything Just Works.
>
> What about the case where one or more nodes doesn't support 2PC.
> Do we silently make the choice, without the client really knowing?

We abort. (Unless we want to have a running_with_scissors GUC...)

>>> That doesn't strike me as the
>>> only possible mechanism to drive this, but it might well be the
>>> simplest and cleanest.  The trickiest bit might be to find a good
>>> way to persist the distributed transaction information in a way
>>> that survives the failure of the main transaction -- or even the
>>> abrupt loss of the machine it's running on.
>>
>> I'd be willing to punt on surviving a loss of the entire machine.  But
>> I'd like to be able to survive an abrupt reboot.
>
> As long as people are aware that there is an urgent need to find
> and fix all data stores to which clusters on the failed machine
> were connected via FDW when there is a hard machine failure, I
> guess it is OK.  In essence we just document it and declare it to
> be somebody else's problem.  In general I would expect a
> distributed transaction manager to behave well in the face of any
> single-machine failure, but if there is one aspect of a
> full-featured distributed transaction manager we could give up, I
> guess that would be it.

ISTM that one option here would be to "simply" write and sync WAL record(s) of all externally prepared transactions. That would be enough for a hot standby to find all the other servers and tell them to either commit or abort, based on whether our local transaction committed or aborted. If you wanted, you could even have the standby be responsible for telling all the other participants to commit...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

