pg_dump multi VALUES INSERT

classic Classic list List threaded Threaded
54 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

Alvaro Herrera-9
On 2019-Jan-23, David Rowley wrote:

> On Wed, 23 Jan 2019 at 04:08, Alvaro Herrera <[hidden email]> wrote:
> > Is it possible to avoid the special case for 0 columns by using the
> > UNION ALL syntax I showed?
>
> It would be possible, but my thoughts are that we're moving away from
> the SQL standard by doing so.

Ah, that's a good point that I missed -- I agree with your reasoning.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

David Rowley-3
In reply to this post by Fabien COELHO-3
On Thu, 24 Jan 2019 at 04:45, Fabien COELHO <[hidden email]> wrote:
> >> The feature is not tested anywhere. I still think that there should be a
> >> test on empty/small/larger-than-rows-per-insert tables, possibly added to
> >> existing TAP-tests.
> >
> > I was hoping to get away with not having to do that... mainly because
> > I've no idea how.
>
> Hmmm. That is another question! Maybe someone will help.

Here's another version, same as before but with tests this time.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pg_dump-rows-per-insert-option_v13.patch (20K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

David Rowley-3
On Thu, 31 Jan 2019 at 11:49, David Rowley <[hidden email]> wrote:
> Here's another version, same as before but with tests this time.

Hi Fabien,

Wondering if you have anything else here? I'm happy for the v13
version to be marked as ready for committer.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

Fabien COELHO-3

Hello David,

> Wondering if you have anything else here? I'm happy for the v13
> version to be marked as ready for committer.

I still have a few comments.

Patch applies cleanly, compiles, global & local make check are ok.

Typos and style in the doc:

  "However, since, by default this option generates ..."
  "However, since this option, by default, generates ..."

I'd suggest a more straightforward to my mind and ear: "However, since by
default the option generates ..., ....", although beware that I'm not a
native English speaker.

I do not understand why dump_inserts declaration has left the "flags for
options" section.

I'd suggest not to rely on "atoi" because it does not check the argument
syntax, so basically anything is accepted, eg "1O" is 1;

On "if (!dopt->do_nothing) $1 else $2;", I'd rather use a straight
condition "if (dopt->do_nothing) $2 else $1;" (two instances).

There is a test, that is good! Charater "." should be backslashed in the
regexpr. I'd consider also introducing limit cases: empty table, empty
columns by creating corresponding tables and using -t repeatedly.

--
Fabien.

Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

David Rowley-3
On Sat, 2 Feb 2019 at 21:26, Fabien COELHO <[hidden email]> wrote:
> I do not understand why dump_inserts declaration has left the "flags for
> options" section.

I moved that because it's no longer just a flag. It now stores an int value.

> I'd suggest not to rely on "atoi" because it does not check the argument
> syntax, so basically anything is accepted, eg "1O" is 1;

Seems like it's good enough for --jobs and --compress.   Do you think
those should be changed too? or what's the reason to hold
--rows-per-insert to a different standard?

> There is a test, that is good! Charater "." should be backslashed in the
> regexpr.

Yeah, you're right.   I wonder if we should fix the test of them in
another patch.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

Fabien COELHO-3

Hello David,

>> I do not understand why dump_inserts declaration has left the "flags for
>> options" section.
>
> I moved that because it's no longer just a flag. It now stores an int value.

Hmmm. Indeed, all th "int"s of this section should be "bool" instead. Now,
some "flags" do not appear although the culd (clear, createdb, blobs), so
the logic is kinda fuzzy anyway. Do as you wish.

>> I'd suggest not to rely on "atoi" because it does not check the argument
>> syntax, so basically anything is accepted, eg "1O" is 1;
>
> Seems like it's good enough for --jobs and --compress.   Do you think
> those should be changed too? or what's the reason to hold
> --rows-per-insert to a different standard?

I think that there is a case for avoiding sloppy "good enough" programming
practices:-) Alas, as you point out, "atoi" is widely used. I'm campaining
to avoid adding more of them. There has been some push to actually remove
"atoi" when not appropriate, eg from "libpq". I'd suggest to consider
starting doing the right thing, and left fixing old patterns to another
patch.

>> There is a test, that is good! Charater "." should be backslashed in the
>> regexpr.
>
> Yeah, you're right.   I wonder if we should fix the test of them in
> another patch.

From a software engineering perspective, I'd say that a feature and its
tests really belong to the same patch.

--
Fabien.

Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

David Rowley-3
On Sun, 3 Feb 2019 at 21:00, Fabien COELHO <[hidden email]> wrote:
> >> There is a test, that is good! Charater "." should be backslashed in the
> >> regexpr.
> >
> > Yeah, you're right.   I wonder if we should fix the test of them in
> > another patch.
>
> From a software engineering perspective, I'd say that a feature and its
> tests really belong to the same patch.

I meant to say "fix the rest" if them, not "the test of them".


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

Surafel Temesgen
In reply to this post by Fabien COELHO-3


On Sun, Feb 3, 2019 at 11:00 AM Fabien COELHO <[hidden email]> wrote:

Hello David,

>> I do not understand why dump_inserts declaration has left the "flags for
>> options" section.
>
> I moved that because it's no longer just a flag. It now stores an int value.

Hmmm. Indeed, all th "int"s of this section should be "bool" instead. Now,
some "flags" do not appear although the culd (clear, createdb, blobs), so
the logic is kinda fuzzy anyway. Do as you wish.

>> I'd suggest not to rely on "atoi" because it does not check the argument
>> syntax, so basically anything is accepted, eg "1O" is 1;
>
> Seems like it's good enough for --jobs and --compress.   Do you think
> those should be changed too? or what's the reason to hold
> --rows-per-insert to a different standard?

I think that there is a case for avoiding sloppy "good enough" programming
practices:-) Alas, as you point out, "atoi" is widely used. I'm campaining
to avoid adding more of them. There has been some push to actually remove
"atoi" when not appropriate, eg from "libpq". I'd suggest to consider
starting doing the right thing, and left fixing old patterns to another
patch.


 
at least for processing user argument i think it is better to use strtol or other
function that have better error handling. i can make a patch that change usage
of atoi for user argument processing after getting feedback from here or i will do
simultaneously 

regards
Surafel
 
Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

Michael Paquier-2
On Sun, Feb 03, 2019 at 01:21:45PM +0300, Surafel Temesgen wrote:
> at least for processing user argument i think it is better to use strtol or
> other
> function that have better error handling. i can make a patch that change
> usage
> of atoi for user argument processing after getting feedback from here or i
> will do
> simultaneously

Moved the patch to next CF for now, the discussion is going on.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

Surafel Temesgen
In reply to this post by Fabien COELHO-3


On Sat, Feb 2, 2019 at 11:26 AM Fabien COELHO <[hidden email]> wrote:

Hello David,

> Wondering if you have anything else here? I'm happy for the v13
> version to be marked as ready for committer.

I still have a few comments.

Patch applies cleanly, compiles, global & local make check are ok.

Typos and style in the doc:

        "However, since, by default this option generates ..."
        "However, since this option, by default, generates ..."

I'd suggest a more straightforward to my mind and ear: "However, since by
default the option generates ..., ....", although beware that I'm not a
native English speaker.


fixed

I'd suggest not to rely on "atoi" because it does not check the argument
syntax, so basically anything is accepted, eg "1O" is 1;

i change it to strtol

On "if (!dopt->do_nothing) $1 else $2;", I'd rather use a straight
condition "if (dopt->do_nothing) $2 else $1;" (two instances).

fixed

regards
Surafel

pg_dump-rows-per-insert-option_v14.patch (21K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

David Rowley-3
Reviewing pg_dump-rows-per-insert-option-v14.

Mostly going back over things that Fabien mentioned:

On Sat, 2 Feb 2019 at 21:26, Fabien COELHO <[hidden email]> wrote:
> There is a test, that is good! Charater "." should be backslashed in the
> regexpr. I'd consider also introducing limit cases: empty table, empty
> columns by creating corresponding tables and using -t repeatedly.

+ (?:INSERT\ INTO\ dump_test.test_table\ VALUES\ \(\d,\ NULL,\ NULL,\
NULL\),\ \(\d,\ NULL,\ NULL,\ NULL\),\ \(\d,\ NULL,\ NULL,\
NULL\);\n){3}

the . here before the table name needs to be escaped. The ones missing
in the existing tests should have been fixed by d07fb6810e.

There's also the additional tests that Fabien mentions.

Also, maybe one for Fabien (because he seems keen on keeping the
--rows-per-insert validation code)

strtol() returns a long. dump_inserts is an int, so on machines where
sizeof(long) == 8 and sizeof(int) == 4 (most machines, these days) the
validation is not bulletproof.  This could lead to:

$ pg_dump --rows-per-insert=2147483648
pg_dump: rows-per-insert must be a positive number

For me, I was fine with the atoi() code that the other options use,
but maybe Fabien has a problem with the long vs int?

It would be simple to workaround by assigning the strtol() to a long
and making the ERANGE test check for ERANGE or  ... > PG_INT_MAX

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

Surafel Temesgen


On Mon, Feb 11, 2019 at 10:20 AM David Rowley <[hidden email]> wrote:
Reviewing pg_dump-rows-per-insert-option-v14.
Also, maybe one for Fabien (because he seems keen on keeping the
--rows-per-insert validation code)

strtol() returns a long. dump_inserts is an int, so on machines where
sizeof(long) == 8 and sizeof(int) == 4 (most machines, these days) the
validation is not bulletproof.  This could lead to:

$ pg_dump --rows-per-insert=2147483648
pg_dump: rows-per-insert must be a positive number

fixed


For me, I was fine with the atoi() code that the other options use,
but maybe Fabien has a problem with the long vs int?


The main issue with atoi() is it does not detect errors and return 0 for
both invalid input and input value 0 but in our case it doesn't case a
problem because it error out for value 0. but for example in compress Level
if invalid input supplied it silently precede as input value 0 is supplied.
 
regards
Surafel

pg_dump-rows-per-insert-option-v15.patch (21K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

David Rowley-3
On Wed, 13 Feb 2019 at 19:36, Surafel Temesgen <[hidden email]> wrote:

> On Mon, Feb 11, 2019 at 10:20 AM David Rowley <[hidden email]> wrote:
>>
>> Reviewing pg_dump-rows-per-insert-option-v14.
>>
>> Also, maybe one for Fabien (because he seems keen on keeping the
>> --rows-per-insert validation code)
>>
>> strtol() returns a long. dump_inserts is an int, so on machines where
>> sizeof(long) == 8 and sizeof(int) == 4 (most machines, these days) the
>> validation is not bulletproof.  This could lead to:
>>
>> $ pg_dump --rows-per-insert=2147483648
>> pg_dump: rows-per-insert must be a positive number
>
>
> fixed

Thanks.

I see you didn't touch the tests yet, so I'll set this back to waiting
on author.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: pg_dump multi VALUES INSERT

Surafel Temesgen
In reply to this post by Fabien COELHO-3


On Sat, Feb 2, 2019 at 11:26 AM Fabien COELHO <[hidden email]> wrote:
 
There is a test, that is good! Charater "." should be backslashed in the
regexpr. I'd consider also introducing limit cases: empty table, empty
columns by creating corresponding tables and using -t repeatedly

I see that there are already a test for zero column table in test_fourth_table_zero_col
and if am not wrong table_index_stats is empty table

regards
Surafel
123