BUG #16256: Checking deferred FK constraints when updating PK is ordrs of magnitude slower than when updating FK

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

BUG #16256: Checking deferred FK constraints when updating PK is ordrs of magnitude slower than when updating FK

apt.postgresql.org Repository Update
The following bug has been logged on the website:

Bug reference:      16256
Logged by:          Valentin Kovalenko
Email address:      [hidden email]
PostgreSQL version: 12.1
Operating system:   Ubuntu 18.04.2 LTS
Description:        

Environment:
OS:
  Ubuntu 18.04.2 LTS
PostgreSQL:
  select version(); returns 'PostgreSQL 12.1 (Ubuntu 12.1-1.pgdg18.04+1) on
x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0,
64-bit'.
  PG is setup as specified at
https://www.postgresql.org/download/linux/ubuntu/.

Setup:
1) Install PG, start PG, start psql (the session is in auto-commit mode).
2) Run the following commands to create three functions which simplify
reproducing of the problem:

drop function if exists init_test;
create function init_test(in pk_holder_size bigint, in fk_holder_size
bigint) returns void
language plpgsql
as $$
  begin
    perform setseed(0);
    <<pk_holder>>begin
      drop table if exists pk_holder cascade;
      create unlogged table pk_holder (pk bigint primary key);
      insert into pk_holder select * from generate_series(0, pk_holder_size
- 1);
      analyze pk_holder;
    end;
    <<fk_holder>>begin
      drop table if exists fk_holder;
      create unlogged table fk_holder (
        fk bigint not null,
        serial bigint unique not null,
        constraint fk_holder_fk_fkey foreign key (fk) references pk_holder
(pk) deferrable initially immediate);
      insert into fk_holder (fk, serial) select floor(random() *
pk_holder_size), gs.v from generate_series(0, fk_holder_size - 1) as gs
(v);
      analyze fk_holder;
    end;
  end;
$$;

drop function if exists update_pk;
create function update_pk(in size bigint) returns void
language plpgsql
as $$
  declare
    min_free_pk bigint;
  begin
    select max(pk) + 1 into min_free_pk from pk_holder;
    set constraints fk_holder_fk_fkey deferred;
    --Simply doing update pk_holder set pk = pk where pk < size; does not do
the trick
    update pk_holder set pk = pk + min_free_pk where pk < size;
    update pk_holder set pk = pk - min_free_pk where pk >= min_free_pk;
  end;
$$;

drop function if exists update_fk;
create function update_fk(in size bigint, in deferred boolean) returns
void
language plpgsql
as $$
  declare
    max_pk bigint;
  begin
    if deferred then set constraints fk_holder_fk_fkey deferred; end if;
    update fk_holder set fk = min + floor(random() * (max - min + 1))
    from (select min(pk) as min, max(pk) as max from pk_holder) as
boundaries
    where serial < size;
  end;
$$;

Reproduce:
1) Run "select init_test(1000000, 1000000);" to init the test state (it also
analyzes the created and populated tables). This step creates two unlogged
tables pk_holder and fk_holder with a single FK constraint checking that
fk_holder.fk values are among pk_holder.pk values.
2) Run "\timing" to enable timing.
3.1) Run "begin; select update_pk(2000);" to defer the FK constraint and
update 2000 rows in the pk_holder by changing pk_holder.pk values back and
forth so that at the end their values stay the same, but the rows are
updated and FK constraints will be checked on commit (simply assigning
pk_holder.pk = pk_holder.pk is not enough for this test).
3.2) Run "commit;" to initiate validation of the deferred FK constraint for
the rows updated in the previous step. Take a note on how long it takes to
be executed as reported by psql (about 2 minutes for me).
4.1) Run "begin; select update_fk(2000, true);" to defer the FK constraint
and update 2000 rows in the fk_holder by randomly changing fk_holder.fk
values.
4.2) Run "commit;" to initiate validation of the deferred FK constraint for
the rows updated in the previous step. Take a note on how long it takes to
be executed as reported by psql (about 30 milliseconds for me).

Facts:
In the proposed experiment committing a transaction that updates PK values
on 2000 rows in the pk_holder table takes 2 minutes / 30 milliseconds = 4000
times more time than committing a transaction that updated FK values in 2000
rows in the fk_holder table. Both transactions use the same deferred FK
constraint, both tables are unlogged.

Problem:
Latency of the commit command that validates deferred FK constraints as a
result of primary key values being updated is so large that it makes the
functionality virtually unusable. If the performance in such scenarios
cannot be improved, then the documentation should at least explicitly state
that such scenarios are discouraged as the implementation is not supposed to
cope with them; otherwise it would be great to implement the validation in a
way that allows using such scenarios in practice.

Reply | Threaded
Open this post in threaded view
|

Re: BUG #16256: Checking deferred FK constraints when updating PK is ordrs of magnitude slower than when updating FK

David G Johnston
On Wednesday, February 12, 2020, PG Bug reporting form <[hidden email]> wrote:
The following bug has been logged on the website:

Bug reference:      16256
Logged by:          Valentin Kovalenko
Email address:      [hidden email]
PostgreSQL version: 12.1
Operating system:   Ubuntu 18.04.2 LTS
Description:       

      create unlogged table fk_holder (
        fk bigint not null,
        serial bigint unique not null,
        constraint fk_holder_fk_fkey foreign key (fk) references pk_holder
(pk) deferrable initially immediate);

In the proposed experiment committing a transaction that updates PK values
on 2000 rows in the pk_holder table takes 2 minutes / 30 milliseconds = 4000
times more time than committing a transaction that updated FK values in 2000
rows in the fk_holder table.

As noted on the CREATE TABLE page FK section:

If the referenced column(s) are changed frequently, it might be wise to add an index to the referencing column(s) so that referential actions associated with the foreign key constraint can be performed more efficiently.”

With the omission of such an index in your test fixture you’ve demonstrated why that advice is provided.

David J.

Reply | Threaded
Open this post in threaded view
|

Re: BUG #16256: Checking deferred FK constraints when updating PK is ordrs of magnitude slower than when updating FK

Valentin Kovalenko
Sigh, this is so obvious once you mentioned it, but it did not cross my mind... Embarrassing. Thanks, David!

Valentin