Programmatically duplicating a schema

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Programmatically duplicating a schema

matt.figg
Hi all,

What is a reliable way to programmatically & generically populate an empty schema with all the objects in the public schema as a template?

We are using the multi tenancy ruby gem Apartment ( https://github.com/influitive/apartment ), which was recently broken by the changes made to pg_dump to address CVE-2018-1058 https://nvd.nist.gov/vuln/detail/CVE-2018-1058

Apartment attempts to duplicate the public schema whenever creating a new schema by running:

pg_dump -s -x -0 -n public

to get the SQL statements needed to recreate the public schema & then executes the pg_dump's sql output after creating & switching to the new schema ( via set search_path to <new schema>; )

After the fix to CVE-2018-1058, all table references in pg_dump's output (including within SQL of stored procedures) are prefixed by the public. schema, which means you cannot just reuse this output in a different schema context without first manually changing the sql.
As a temporary fix so we can handle new customers in production, we are using a regex search/replace for public. in the pg_dump output, but clearly this is not a reliable solution for a generic gem such as Apartment.

Is there a different way this gem could now be achieving this?
Without a flag on pg_dump to force the old format, or a CREATE SCHEMA <schema_name> WITH TEMPLATE <other_schema> kind of statement, we can't see any alternative more reliable than the regex search/replace. 

The open issue on the Apartment gem for context: https://github.com/influitive/apartment/issues/532

Thanks in advance.
Reply | Threaded
Open this post in threaded view
|

Re: Programmatically duplicating a schema

Adrian Klaver-4
On 03/12/2018 08:23 PM, [hidden email] wrote:

> Hi all,
>
> What is a reliable way to programmatically & generically populate an
> empty schema with all the objects in the public schema as a template?
>
> We are using the multi tenancy ruby gem Apartment (
> https://github.com/influitive/apartment ), which was recently broken by
> the changes made to pg_dump to address CVE-2018-1058
> https://nvd.nist.gov/vuln/detail/CVE-2018-1058
>
> Apartment attempts to duplicate the public schema whenever creating a
> new schema by running:
>
> pg_dump -s -x -0 -n public
>
> to get the SQL statements needed to recreate the public schema & then
> executes the pg_dump's sql output after creating & switching to the new
> schema ( via set search_path to <new schema>; )
>
> After the fix to CVE-2018-1058, all table references in pg_dump's output
> (including within SQL of stored procedures) are prefixed by the public.
> schema, which means you cannot just reuse this output in a different
> schema context without first manually changing the sql.
> As a temporary fix so we can handle new customers in production, we are
> using a regex search/replace for public. in the pg_dump output, but
> clearly this is not a reliable solution for a generic gem such as Apartment.
>
> Is there a different way this gem could now be achieving this?
> Without a flag on pg_dump to force the old format, or a CREATE SCHEMA
> <schema_name> WITH TEMPLATE <other_schema> kind of statement, we can't
> see any alternative more reliable than the regex search/replace.

Wild idea:

1) Roll back to:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=tree;h=c987089c48afba99df0c3c2dc45fd69238b52705;hb=2840d201c6a62beb86d9671a66eeec56183d261b

2) Use that to build pre-10.3 version of pg_dump and use that to dump
your schema. Basically long way to to force old format. Is fragile
though as that version will lose contact with changes.


Less wild idea and previously suggested upstream.

1) Check your schema objects into scripts that are checked into version
control w/o schema qualifications.

2) Use scripts to populate new schema.

>
> The open issue on the Apartment gem for
> context: https://github.com/influitive/apartment/issues/532
>
> Thanks in advance.


--
Adrian Klaver
[hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Programmatically duplicating a schema

matt.figg

Thanks Adrian,

 

Really appreciate the suggestions.

 

The objective when trying to solve this for the Apartment library itself is to keep it generic (works for any/all database objects - tables, views, stored procedures, 3rd party extension objects, etc.) & to require minimal configuration (preferably not having to tell the library your schema).

 

I believe pg_dump was the only way to satisfy the second point.  The reason we'd want to infer the create statements via pg_dump is, so we don't need to keep database migration files in sync with a 'create new schema' SQL script.  It adds risk that they get out of sync, causing inconsistencies in new schemas created by the library.

 

Assuming there's no other way to infer the create statements from the public schema, Ruby on Rails' structure.sql could probably be used as a starting point for the 'create new schema' SQL file.  It's similar already, however it's also generated via pg_dump (having the same issues as Apartment library). http://guides.rubyonrails.org/v3.2.9/migrations.html#types-of-schema-dumps  This is outside the realms of this mail group though.

 

Cheers,

Matt.

 



----- Original Message -----
From:
"Adrian Klaver" <[hidden email]>

To:
<[hidden email]>, <[hidden email]>
Cc:

Sent:
Mon, 12 Mar 2018 21:22:01 -0700
Subject:
Re: Programmatically duplicating a schema


On 03/12/2018 08:23 PM, [hidden email] wrote:
> Hi all,
>
> What is a reliable way to programmatically & generically populate an
> empty schema with all the objects in the public schema as a template?
>
> We are using the multi tenancy ruby gem Apartment (
> https://github.com/influitive/apartment ), which was recently broken by
> the changes made to pg_dump to address CVE-2018-1058
> https://nvd.nist.gov/vuln/detail/CVE-2018-1058
>
> Apartment attempts to duplicate the public schema whenever creating a
> new schema by running:
>
> pg_dump -s -x -0 -n public
>
> to get the SQL statements needed to recreate the public schema & then
> executes the pg_dump's sql output after creating & switching to the new
> schema ( via set search_path to <new schema>; )
>
> After the fix to CVE-2018-1058, all table references in pg_dump's output
> (including within SQL of stored procedures) are prefixed by the public.
> schema, which means you cannot just reuse this output in a different
> schema context without first manually changing the sql.
> As a temporary fix so we can handle new customers in production, we are
> using a regex search/replace for public. in the pg_dump output, but
> clearly this is not a reliable solution for a generic gem such as Apartment.
>
> Is there a different way this gem could now be achieving this?
> Without a flag on pg_dump to force the old format, or a CREATE SCHEMA
> <schema_name> WITH TEMPLATE <other_schema> kind of statement, we can't
> see any alternative more reliable than the regex search/replace.

Wild idea:

1) Roll back to:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=tree;h=c987089c48afba99df0c3c2dc45fd69238b52705;hb=2840d201c6a62beb86d9671a66eeec56183d261b

2) Use that to build pre-10.3 version of pg_dump and use that to dump
your schema. Basically long way to to force old format. Is fragile
though as that version will lose contact with changes.


Less wild idea and previously suggested upstream.

1) Check your schema objects into scripts that are checked into version
control w/o schema qualifications.

2) Use scripts to populate new schema.

>
> The open issue on the Apartment gem for
> context: https://github.com/influitive/apartment/issues/532
>
> Thanks in advance.


--
Adrian Klaver
[hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: Programmatically duplicating a schema

Alban Hertroys-4
In reply to this post by matt.figg

> On 13 Mar 2018, at 4:23, [hidden email] wrote:
>
> Hi all,
>
> What is a reliable way to programmatically & generically populate an empty schema with all the objects in the public schema as a template?
>
> We are using the multi tenancy ruby gem Apartment ( https://github.com/influitive/apartment ), which was recently broken by the changes made to pg_dump to address CVE-2018-1058 https://nvd.nist.gov/vuln/detail/CVE-2018-1058
>
> Apartment attempts to duplicate the public schema whenever creating a new schema by running:
>
> pg_dump -s -x -0 -n public
>
> to get the SQL statements needed to recreate the public schema & then executes the pg_dump's sql output after creating & switching to the new schema ( via set search_path to <new schema>; )
>
> After the fix to CVE-2018-1058, all table references in pg_dump's output (including within SQL of stored procedures) are prefixed by the public. schema, which means you cannot just reuse this output in a different schema context without first manually changing the sql.
> As a temporary fix so we can handle new customers in production, we are using a regex search/replace for public. in the pg_dump output, but clearly this is not a reliable solution for a generic gem such as Apartment.

In my opinion, it makes sense that if you have the option of dumping the contents of a specific schema, it should be possible to restore that dump into a different schema. Unfortunately, looking at pg_restore, there does not appear to be such an option (yet).

I'd even go so far to suggest that every single object type that can be dumped with pg_dump (single database, single schema, single table, single function, etc) should be restorable under a different name. I realise that this could make pg_restore options potentially more confusing.

I suppose people currently manually edit the dumps to this effect, but that risks silent corruption of data when for example a data value contains a string such as 'The building is now open to public.'. Regular expressions don't know the difference between data and identifiers in a dump file - pg_restore does.

Whether psql needs the same treatment? I'd qualify this as "advanced" use and limit it to pg_restore.

But then, I'm just a list-lurker, I currently have but the option of voicing my opinion.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.


Reply | Threaded
Open this post in threaded view
|

Re: Programmatically duplicating a schema

Melvin Davidson-5
 > What is a reliable way to programmatically & generically populate an empty schema with all the objects in the public schema as a template?

The simplest way is just to load the attached clone_schema function. It was originally created by Emanuel '3manuek', which I enhanced. It
now copies all sequences, tables, indexes, rules, triggers, data(optional), views & functions from any existing schema to a new schema.

Then just call the function.

SAMPLE CALL:
SELECT clone_schema('public', 'new_schema', TRUE);

Use FALSE if you do not want to copy data.


--
Melvin Davidson
Maj. Database & Exploration Specialist

Universe Exploration Command – UXC

Employment by invitation only!

clone_schema.sql (18K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Programmatically duplicating a schema

David G Johnston
In reply to this post by matt.figg
On Mon, Mar 12, 2018 at 11:05 PM, <[hidden email]> wrote:

The reason we'd want to infer the create statements via pg_dump is, so we don't need to keep database migration files in sync with a 'create new schema' SQL script.  It adds risk that they get out of sync, causing inconsistencies in new schemas created by the library.

​Then don't have a "create new schema" script (aside from the very first one) run your migrations even when creating new installations.

​David J.
Reply | Threaded
Open this post in threaded view
|

Re: Programmatically duplicating a schema

Adrian Klaver-4
In reply to this post by matt.figg
On 03/12/2018 11:05 PM, [hidden email] wrote:

> Thanks Adrian,
>
> Really appreciate the suggestions.
>
> The objective when trying to solve this for the Apartment library itself
> is to keep it generic (works for any/all database objects - tables,
> views, stored procedures, 3rd party extension objects, etc.) & to
> require minimal configuration (preferably not having to tell the library
> your schema).
>
> I believe pg_dump was the only way to satisfy the second point.  The
> reason we'd want to infer the create statements via pg_dump is, so we
> don't need to keep database migration files in sync with a 'create new
> schema' SQL script.  It adds risk that they get out of sync, causing
> inconsistencies in new schemas created by the library.

Seems to me your best hope is to make a plea on --hackers for a flag
that turns off schema qualification of object names.

>
> Assuming there's no other way to infer the create statements from the
> public schema, Ruby on Rails' structure.sql could probably be used as a
> starting point for the 'create new schema' SQL file.  It's similar
> already, however it's also generated via pg_dump (having the same issues
> as Apartment library).
> http://guides.rubyonrails.org/v3.2.9/migrations.html#types-of-schema-dumps 
> This is outside the realms of this mail group though.
>
> Cheers,
>
> Matt.
>
>


--
Adrian Klaver
[hidden email]

Previous Thread Next Thread