Merging two database dumps

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Merging two database dumps

Alex O'Ree
I have a situation with multiple postgres servers running all with the same databases and table structure. I need to periodically export the data from each of there then merge them all into a single server. On  occasion, it's feasible for the same record (primary key) to be stored in two or more servers

I was using pgdump without the --insert option however I just noticed that pgrestore will stop inserting into a table when the conflict occurs, leaving me with an incomplete set.

Question is what are my other options to skip over the conflicting record when merging? 

From the docs, it appears that making dumps with the --insert option may be the only way to go however performance is an issue. In this case would dropping all indexes help?
Reply | Threaded
Open this post in threaded view
|

Re: Merging two database dumps

pavan95
Hi Alex,
For storing duplicate rows, dropping primary and unique indexes is the only way.

One alternative is create a column with timestamp which updates on every insert/update so that timestamp will be primary. Hope it helps.

Regards,
Pavan

On Wed, Jun 13, 2018, 4:47 PM Alex O'Ree <[hidden email]> wrote:
I have a situation with multiple postgres servers running all with the same databases and table structure. I need to periodically export the data from each of there then merge them all into a single server. On  occasion, it's feasible for the same record (primary key) to be stored in two or more servers

I was using pgdump without the --insert option however I just noticed that pgrestore will stop inserting into a table when the conflict occurs, leaving me with an incomplete set.

Question is what are my other options to skip over the conflicting record when merging? 

From the docs, it appears that making dumps with the --insert option may be the only way to go however performance is an issue. In this case would dropping all indexes help?
Reply | Threaded
Open this post in threaded view
|

Re: [External] Merging two database dumps

Vijaykumar Jain
In reply to this post by Alex O'Ree

You can try one option, although just a thought in the air 😊

Use postgres FDW  ex. https://robots.thoughtbot.com/postgres-foreign-data-wrapper

 

Create foreign tables in the relevant server schema

 

And then union/union all  😊 or your custom constraint on the destination table where you dump the rows.

 

For ex.

You have server1, server2, server3

 

And you have server4 as your new single server.

 

You create FDW of server1, server2, server3 on server 4 and then import table into respective server schema.

 

server1.table1, server2.table1, server3.table1

 

and then

insert into server4.table1  select * from( select * from server1.table1 union select * from server2.table1 union select * from server3.table1) a;

 

something 😊

 

 

Thanks,

Vijay

 

From: Alex O'Ree <[hidden email]>
Date: Wednesday, June 13, 2018 at 4:47 PM
To: "[hidden email]" <[hidden email]>
Subject: [External] Merging two database dumps

 

I have a situation with multiple postgres servers running all with the same databases and table structure. I need to periodically export the data from each of there then merge them all into a single server. On  occasion, it's feasible for the same record (primary key) to be stored in two or more servers

 

I was using pgdump without the --insert option however I just noticed that pgrestore will stop inserting into a table when the conflict occurs, leaving me with an incomplete set.

 

Question is what are my other options to skip over the conflicting record when merging? 

 

From the docs, it appears that making dumps with the --insert option may be the only way to go however performance is an issue. In this case would dropping all indexes help?

Reply | Threaded
Open this post in threaded view
|

Re: Merging two database dumps

Andreas Kretschmer-3
In reply to this post by Alex O'Ree


Am 13.06.2018 um 13:17 schrieb Alex O'Ree:
> I have a situation with multiple postgres servers running all with the
> same databases and table structure. I need to periodically export the
> data from each of there then merge them all into a single server. On 
> occasion, it's feasible for the same record (primary key) to be stored
> in two or more servers

what should happen in this case?

>
> I was using pgdump without the --insert option however I just noticed
> that pgrestore will stop inserting into a table when the conflict
> occurs, leaving me with an incomplete set.
>

Other solution:

* create the tables on the destination server without the PK or with an
other, new PK (maybe SERIAL)
* use logical replication to replicate the table from all your
source-db's to the destination table, see more here:
https://www.2ndquadrant.com/en/resources/pglogical/

your problem seems as a typical task for logical replication to me. You
needs 9.4 at least.


Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com


Reply | Threaded
Open this post in threaded view
|

Re: [External] Merging two database dumps

pavan95
In reply to this post by Vijaykumar Jain
Yes Vijay,

It might work, but I'm thinking it will be a performance overhead in case of
complex data.

Regards,
Pavan



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

Reply | Threaded
Open this post in threaded view
|

Re: Merging two database dumps

Alex O'Ree
In reply to this post by Andreas Kretschmer-3
Desired behavior is to just log the error and continue the import using pgdump based copy commands 

The servers are not on the same network.  Sneaker net is the only way

On Wed, Jun 13, 2018, 7:42 AM Andreas Kretschmer <[hidden email]> wrote:


Am 13.06.2018 um 13:17 schrieb Alex O'Ree:
> I have a situation with multiple postgres servers running all with the
> same databases and table structure. I need to periodically export the
> data from each of there then merge them all into a single server. On 
> occasion, it's feasible for the same record (primary key) to be stored
> in two or more servers

what should happen in this case?

>
> I was using pgdump without the --insert option however I just noticed
> that pgrestore will stop inserting into a table when the conflict
> occurs, leaving me with an incomplete set.
>

Other solution:

* create the tables on the destination server without the PK or with an
other, new PK (maybe SERIAL)
* use logical replication to replicate the table from all your
source-db's to the destination table, see more here:
https://www.2ndquadrant.com/en/resources/pglogical/

your problem seems as a typical task for logical replication to me. You
needs 9.4 at least.


Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com


Reply | Threaded
Open this post in threaded view
|

Re: Merging two database dumps

Adrian Klaver-4
On 06/13/2018 06:21 AM, Alex O'Ree wrote:
> Desired behavior is to just log the error and continue the import using
> pgdump based copy commands

Each COPY is atomic so if any part of it fails the whole thing fails, so
you will not be able to achieve what you want that way.

>
> The servers are not on the same network.  Sneaker net is the only way
>
> On Wed, Jun 13, 2018, 7:42 AM Andreas Kretschmer
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>
>
>     Am 13.06.2018 um 13:17 schrieb Alex O'Ree:
>      > I have a situation with multiple postgres servers running all
>     with the
>      > same databases and table structure. I need to periodically export
>     the
>      > data from each of there then merge them all into a single server. On
>      > occasion, it's feasible for the same record (primary key) to be
>     stored
>      > in two or more servers
>
>     what should happen in this case?
>
>      >
>      > I was using pgdump without the --insert option however I just
>     noticed
>      > that pgrestore will stop inserting into a table when the conflict
>      > occurs, leaving me with an incomplete set.
>      >
>
>     Other solution:
>
>     * create the tables on the destination server without the PK or with an
>     other, new PK (maybe SERIAL)
>     * use logical replication to replicate the table from all your
>     source-db's to the destination table, see more here:
>     https://www.2ndquadrant.com/en/resources/pglogical/
>
>     your problem seems as a typical task for logical replication to me. You
>     needs 9.4 at least.
>
>
>     Regards, Andreas
>
>     --
>     2ndQuadrant - The PostgreSQL Support Company.
>     www.2ndQuadrant.com <http://www.2ndQuadrant.com>
>
>


--
Adrian Klaver
[hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Merging two database dumps

Alex O'Ree
Thanks for the clarification

On Wed, Jun 13, 2018 at 9:32 AM, Adrian Klaver <[hidden email]> wrote:
On 06/13/2018 06:21 AM, Alex O'Ree wrote:
Desired behavior is to just log the error and continue the import using pgdump based copy commands

Each COPY is atomic so if any part of it fails the whole thing fails, so you will not be able to achieve what you want that way.


The servers are not on the same network.  Sneaker net is the only way

On Wed, Jun 13, 2018, 7:42 AM Andreas Kretschmer <[hidden email] <mailto:[hidden email]>> wrote:



    Am 13.06.2018 um 13:17 schrieb Alex O'Ree:
     > I have a situation with multiple postgres servers running all
    with the
     > same databases and table structure. I need to periodically export
    the
     > data from each of there then merge them all into a single server. On
     > occasion, it's feasible for the same record (primary key) to be
    stored
     > in two or more servers

    what should happen in this case?

     >
     > I was using pgdump without the --insert option however I just
    noticed
     > that pgrestore will stop inserting into a table when the conflict
     > occurs, leaving me with an incomplete set.
     >

    Other solution:

    * create the tables on the destination server without the PK or with an
    other, new PK (maybe SERIAL)
    * use logical replication to replicate the table from all your
    source-db's to the destination table, see more here:
    https://www.2ndquadrant.com/en/resources/pglogical/

    your problem seems as a typical task for logical replication to me. You
    needs 9.4 at least.


    Regards, Andreas

    --     2ndQuadrant - The PostgreSQL Support Company.
    www.2ndQuadrant.com <http://www.2ndQuadrant.com>




--
Adrian Klaver
[hidden email]

Previous Thread Next Thread