pg_dump, ATTACH, and independently restorable child partitions

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

pg_dump, ATTACH, and independently restorable child partitions

Justin Pryzby
Since this commit, pg_dump CREATEs tables and then ATTACHes them:

|commit 33a53130a89447e171a8268ae0b221bb48af6468
|Author: Alvaro Herrera <[hidden email]>
|Date:   Mon Jun 10 18:56:23 2019 -0400
|
|    Make pg_dump emit ATTACH PARTITION instead of PARTITION OF (reprise)
|...
|    This change also has the advantage that the partition is restorable from
|    the dump (as a standalone table) even if its parent table isn't
|    restored.

I like the idea of child tables being independently restorable, but it doesn't
seem to work.

|psql postgres -c 'DROP TABLE IF EXISTS t' -c 'CREATE TABLE t(i int) PARTITION BY RANGE(i)' -c 'CREATE TABLE t1 PARTITION OF t FOR VALUES FROM (1)TO(2)'
|pg_dump postgres -Fc -t t1 >dump.t1
|psql postgres -c 'DROP TABLE t'
|pg_restore -d postgres ./dump.t1
|pg_restore: while PROCESSING TOC:
|pg_restore: from TOC entry 457; 1259 405311409 TABLE t1 pryzbyj
|pg_restore: error: could not execute query: ERROR:  relation "public.t" does not exist
|Command was: CREATE TABLE public.t1 (
|    i integer
|);
|ALTER TABLE ONLY public.t ATTACH PARTITION public.t1 FOR VALUES FROM (1) TO (2);
|
|pg_restore: error: could not execute query: ERROR:  relation "public.t1" does not exist
|Command was: ALTER TABLE public.t1 OWNER TO pryzbyj;
|
|pg_restore: from TOC entry 4728; 0 405311409 TABLE DATA t1 pryzbyj
|pg_restore: error: could not execute query: ERROR:  relation "public.t1" does not exist
|Command was: COPY public.t1 (i) FROM stdin;
|pg_restore: warning: errors ignored on restore: 3

Now that I look, it seems like this is calling PQexec(), which sends a single,
"simple" libpq message with:
|CREATE TABLE ..; ALTER TABLE .. ATTACH PARTITION;
..which is transactional, so when the 2nd command fails, the CREATE is rolled back.
https://www.postgresql.org/docs/9.5/libpq-exec.html#LIBPQ-EXEC-MAIN

Telsasoft does a lot of dynamic DDL, so this happens sometimes due to columns
added or promoted.  Up to now, when this has come up, I've run:
pg_restore |grep -v 'ATTACH PARTITION' |psql.  Am I missing something ?

The idea of being independently restorable maybe originated with Tom's comment
here: https://www.postgresql.org/message-id/30049.1555537881%40sss.pgh.pa.us

--
Justin


Reply | Threaded
Open this post in threaded view
|

Re: pg_dump, ATTACH, and independently restorable child partitions

Justin Pryzby
On Fri, Oct 23, 2020 at 12:29:40AM -0500, Justin Pryzby wrote:

> Since this commit, pg_dump CREATEs tables and then ATTACHes them:
>
> |commit 33a53130a89447e171a8268ae0b221bb48af6468
> |Author: Alvaro Herrera <[hidden email]>
> |Date:   Mon Jun 10 18:56:23 2019 -0400
> |
> |    Make pg_dump emit ATTACH PARTITION instead of PARTITION OF (reprise)
> |...
> |    This change also has the advantage that the partition is restorable from
> |    the dump (as a standalone table) even if its parent table isn't
> |    restored.
>
> I like the idea of child tables being independently restorable, but it doesn't
> seem to work.
...
> Now that I look, it seems like this is calling PQexec(), which sends a single,
> "simple" libpq message with:
> |CREATE TABLE ..; ALTER TABLE .. ATTACH PARTITION;
> ..which is transactional, so when the 2nd command fails, the CREATE is rolled back.
> https://www.postgresql.org/docs/9.5/libpq-exec.html#LIBPQ-EXEC-MAIN

The easy fix is to add an explicit begin/commit.

--
Justin

0001-pg_dump-Allow-child-partitions-to-be-independently-r.patch (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_dump, ATTACH, and independently restorable child partitions

Justin Pryzby
On Sat, Oct 24, 2020 at 02:59:49PM -0500, Justin Pryzby wrote:

> On Fri, Oct 23, 2020 at 12:29:40AM -0500, Justin Pryzby wrote:
> > Since this commit, pg_dump CREATEs tables and then ATTACHes them:
> >
> > |commit 33a53130a89447e171a8268ae0b221bb48af6468
> > |Author: Alvaro Herrera <[hidden email]>
> > |Date:   Mon Jun 10 18:56:23 2019 -0400
> > |
> > |    Make pg_dump emit ATTACH PARTITION instead of PARTITION OF (reprise)
> > |...
> > |    This change also has the advantage that the partition is restorable from
> > |    the dump (as a standalone table) even if its parent table isn't
> > |    restored.
> >
> > I like the idea of child tables being independently restorable, but it doesn't
> > seem to work.
> ...
> > Now that I look, it seems like this is calling PQexec(), which sends a single,
> > "simple" libpq message with:
> > |CREATE TABLE ..; ALTER TABLE .. ATTACH PARTITION;
> > ..which is transactional, so when the 2nd command fails, the CREATE is rolled back.
> > https://www.postgresql.org/docs/9.5/libpq-exec.html#LIBPQ-EXEC-MAIN
>
> The easy fix is to add an explicit begin/commit.
Now with updated test script.

--
Justin

v2-0001-pg_dump-Allow-child-partitions-to-be-independentl.patch (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_dump, ATTACH, and independently restorable child partitions

Álvaro Herrera
In reply to this post by Justin Pryzby
On 2020-Oct-24, Justin Pryzby wrote:

> On Fri, Oct 23, 2020 at 12:29:40AM -0500, Justin Pryzby wrote:

> > Now that I look, it seems like this is calling PQexec(), which sends a single,
> > "simple" libpq message with:
> > |CREATE TABLE ..; ALTER TABLE .. ATTACH PARTITION;
> > ..which is transactional, so when the 2nd command fails, the CREATE is rolled back.
> > https://www.postgresql.org/docs/9.5/libpq-exec.html#LIBPQ-EXEC-MAIN
>
> The easy fix is to add an explicit begin/commit.

Hmm, I think this throws a warning when used with "pg_restore -1",
right?  I don't think that's sufficient reason to discard the idea, but
it be better to find some other way.

I have no ideas ATM :-(



Reply | Threaded
Open this post in threaded view
|

Re: pg_dump, ATTACH, and independently restorable child partitions

Justin Pryzby
On Fri, Nov 06, 2020 at 11:18:35PM -0300, Alvaro Herrera wrote:

> On 2020-Oct-24, Justin Pryzby wrote:
>
> > On Fri, Oct 23, 2020 at 12:29:40AM -0500, Justin Pryzby wrote:
>
> > > Now that I look, it seems like this is calling PQexec(), which sends a single,
> > > "simple" libpq message with:
> > > |CREATE TABLE ..; ALTER TABLE .. ATTACH PARTITION;
> > > ..which is transactional, so when the 2nd command fails, the CREATE is rolled back.
> > > https://www.postgresql.org/docs/9.5/libpq-exec.html#LIBPQ-EXEC-MAIN
> >
> > The easy fix is to add an explicit begin/commit.
>
> Hmm, I think this throws a warning when used with "pg_restore -1",
> right?  I don't think that's sufficient reason to discard the idea, but
> it be better to find some other way.
Worse, right ?  It'd commit in the middle and then continue outside of a txn.
I guess there's no test case for this :(

> I have no ideas ATM :-(

1. Maybe pg_restore ExecuteSqlCommandBuf() should (always?) call
ExecuteSimpleCommands() instead of ExecuteSqlCommand().  It doesn't seem to
break anything (although that surprised me).

2. Otherwise, the createStmt would need to be split into a createStmt2 or a
char *createStmt[], which I think would then require changing the output
format.  It seems clearly better to keep the sql commands split up initially
than to reverse engineer them during restore.

I tried using \x01 to separate commands, and strtok to split them to run them
individually.  But that breaks the pg_dumpall tests.  As an experiment, I used
\x00, which is somewhat invasive but actually works.

Obviously patching pg_dump will affect only future backups, and the pg_restore
patch allows independently restoring parent tables in existing dumps.

--
Justin

v2-0001-pg_restore-parse-and-run-separately-SQL-commands.patch (1K) Download Attachment
v2-0002-pg_dump-Allow-child-partitions-to-be-independentl.patch (27K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_dump, ATTACH, and independently restorable child partitions

Tom Lane-2
Justin Pryzby <[hidden email]> writes:
> 1. Maybe pg_restore ExecuteSqlCommandBuf() should (always?) call
> ExecuteSimpleCommands() instead of ExecuteSqlCommand().  It doesn't seem to
> break anything (although that surprised me).

That certainly does break everything, which I imagine is the reason
why the cfbot shows that this patch is failing the pg_upgrade tests.
Note the comments for ExecuteSimpleCommands:

 * We have to lex the data to the extent of identifying literals and quoted
 * identifiers, so that we can recognize statement-terminating semicolons.
 * We assume that INSERT data will not contain SQL comments, E'' literals,
 * or dollar-quoted strings, so this is much simpler than a full SQL lexer.

IOW, where that says "Simple", it means *simple* --- in practice,
we only risk using it on commands that we know pg_dump itself built
earlier.  There is no reasonable prospect of getting pg_restore to
split arbitrary SQL at command boundaries.  We'd need something
comparable to psql's lexer, which is huge, and from a future-proofing
standpoint it would be just awful.  (The worst that happens if psql
misparses your string is that it won't send the command when you
expected.  If pg_restore misparses stuff, your best case is that the
restore fails cleanly; the worst case could easily result in
SQL-injection compromises.)  So I think we cannot follow this
approach.

What we'd need to do if we want this to work with direct-to-DB restore
is to split off the ATTACH PARTITION command as a separate TOC entry.
That doesn't seem amazingly difficult, and it would even offer the
possibility that you could extract the partition standalone without
having to ignore errors.  (You'd use -l/-L to select the CREATE TABLE,
the data, etc, but not the ATTACH object.)

That would possibly come out as a larger patch than you have here,
but maybe not by much.  I don't think there's too much more involved
than setting up the proper command strings and calling ArchiveEntry().
You'd need to do some testing to verify that cases like --clean
work sanely.

Also, I read the 0002 patch briefly and couldn't make heads or tails
of it, except that it seemed to be abusing the PQExpBuffer abstraction
well beyond anything I'd consider acceptable.  If you want separate
strings, make a PQExpBuffer for each one.

                        regards, tom lane