FK violation in partitioned table after truncating a referenced partition

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

FK violation in partitioned table after truncating a referenced partition

Jehan-Guillaume de Rorthais
Hello,

A colleague found the bug described in $subject while testing partitioning.

Consider two partitioned tables with a FK between them. Issuing a TRUNCATE
CASCADE on a *partition* of the referenced side does not propagate to the
referencing side. Eg.:

  CREATE TABLE trunc_a (a INT PRIMARY KEY) PARTITION BY RANGE (a);
  CREATE TABLE trunc_a1 PARTITION OF trunc_a FOR VALUES FROM (0) TO (10);
  CREATE TABLE trunc_a2 PARTITION OF trunc_a FOR VALUES FROM (10) TO (20);
  INSERT INTO trunc_a VALUES (0), (5), (10), (15);

  CREATE TABLE ref_b (
    b INT PRIMARY KEY,
    a INT REFERENCES trunc_a(a) ON DELETE CASCADE
  ) PARTITION BY RANGE (b);
  CREATE TABLE ref_b1 PARTITION OF ref_b FOR VALUES FROM (0) TO (100);
  INSERT INTO ref_b VALUES (10, 0), (50, 5);

  TRUNCATE TABLE trunc_a1 CASCADE;
  -- NOTICE:  truncate cascades to table "ref_b"

  SELECT a FROM trunc_a;
  --  a  
  -- ----
  --  10
  --  15
  -- (2 rows)

  SELECT a FROM ref_b;
  --  a
  -- ---
  --  0
  --  5
  -- (2 rows)

heap_truncate_find_FKs returns only relations that are directly referencing
the given referenced part of the FK. However, when considering two partitioned
relation with a FK between them, there's no child to child relation in
pg_constraint. They only point toward parent tables. Cascading FK is indirectly
done through the parent table.

Unfortunately, in ExecuteTruncateGuts, when relations are actually
truncated, parents are ignored as they are empty relations:

        /*
         * OK, truncate each table.
         */
        mySubid = GetCurrentSubTransactionId();

        foreach(cell, rels)
        {
                Relation rel = (Relation) lfirst(cell);

                /* Skip partitioned tables as there is nothing to do */
                if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
                        continue;

Please, find in attachment a bug fix proposal where a truncate on a
partition cascade to the referencing table and truncate all its
partitions if applicable.

The patch make sure heap_truncate_find_FKs find all referencing relations,
directly **and indirectly** through their parent table.

When considering the various way of fixing this, I thought about calling
find_all_inheritors on all relations returned by heap_truncate_find_FKs to add
them to the list or relation to truncate (I have a working patch for this as
well). However, I felt like heap_truncate_find_FKs was the real suspect here
and was responsible to find all referencing relations.

Regards,

0001-v1-Fix-TRUNCATE-on-a-partition-to-cascade-correctly-to-.patch (7K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Alvaro Herrera-9
On 2020-Feb-04, Jehan-Guillaume de Rorthais wrote:


> Please, find in attachment a bug fix proposal where a truncate on a
> partition cascade to the referencing table and truncate all its
> partitions if applicable.
>
> The patch make sure heap_truncate_find_FKs find all referencing relations,
> directly **and indirectly** through their parent table.
>
> When considering the various way of fixing this, I thought about calling
> find_all_inheritors on all relations returned by heap_truncate_find_FKs to add
> them to the list or relation to truncate (I have a working patch for this as
> well). However, I felt like heap_truncate_find_FKs was the real suspect here
> and was responsible to find all referencing relations.
Hello, thanks for reporting and patching.

I agree that patching heap_truncate_find_FKs is a reasonable way to fix.
I propose a slightly different formulation: instead of the loop that you
have, we can just use the second loop, and add more parent constraints
to the list if any constraint we scan in turn has a parent constraint.
So we don't repeat the whole thing, but only that second loop.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v2-0001-Fix-TRUNCATE-on-a-partition-to-apply-CASCADE-to-p.patch (7K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Alvaro Herrera-9
Also, I think we should patch the documentation like this, because this
effect is not 100% obvious.  (For example, if the partition strategy is
identical in both referenced and referencing tables, you would wish that
only the equivalent partition is truncated.  However, I doubt this can
be really implemented.)

I do wonder how do other RDBMSs behave.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Alvaro Herrera-9
On 2020-Feb-06, Alvaro Herrera wrote:

> I do wonder how do other RDBMSs behave.

I think this is relevant,
https://docs.oracle.com/database/121/VLDBG/GUID-FEAAC43A-1066-4697-8474-863181FE4F38.htm
but offers no useful guidance.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Alvaro Herrera-9
In reply to this post by Alvaro Herrera-9
On 2020-Feb-06, Alvaro Herrera wrote:

> Also, I think we should patch the documentation like this, because this
> effect is not 100% obvious.  (For example, if the partition strategy is
> identical in both referenced and referencing tables, you would wish that
> only the equivalent partition is truncated.  However, I doubt this can
> be really implemented.)

And of course, I forgot to attach the file.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

truncate-docs.patch (755 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Alvaro Herrera-9
In reply to this post by Alvaro Herrera-9
On 2020-Feb-06, Alvaro Herrera wrote:

> I agree that patching heap_truncate_find_FKs is a reasonable way to fix.
> I propose a slightly different formulation: instead of the loop that you
> have, we can just use the second loop, and add more parent constraints
> to the list if any constraint we scan in turn has a parent constraint.
> So we don't repeat the whole thing, but only that second loop.

Hmm, this doesn't actually work; I modified your test case and I see
that my code fails to do the right thing.

-- Truncate a single partition cascading to another partitioned table.
CREATE TABLE trunc_a (a INT PRIMARY KEY) PARTITION BY RANGE (a);
CREATE TABLE trunc_a1 PARTITION OF trunc_a FOR VALUES FROM (0) TO (10);
CREATE TABLE trunc_a2 PARTITION OF trunc_a FOR VALUES FROM (10) TO (20)
  PARTITION BY RANGE (a);
CREATE TABLE trunc_a21 PARTITION OF trunc_a2 FOR VALUES FROM (10) TO (12);
CREATE TABLE trunc_a22 PARTITION OF trunc_a2 FOR VALUES FROM (12) TO (16);
CREATE TABLE trunc_a2d PARTITION OF trunc_a2 DEFAULT;
CREATE TABLE trunc_a3 PARTITION OF trunc_a FOR VALUES FROM (20) TO (30);
INSERT INTO trunc_a VALUES (10), (15), (20), (25);

CREATE TABLE ref_c (
    c INT PRIMARY KEY,
    a INT REFERENCES trunc_a(a) ON DELETE CASCADE
) PARTITION BY RANGE (c);
CREATE TABLE ref_c1 PARTITION OF ref_c FOR VALUES FROM (100) TO (200);
CREATE TABLE ref_c2 PARTITION OF ref_c FOR VALUES FROM (200) TO (300);
INSERT INTO ref_c VALUES (100, 10), (150, 15), (200, 20), (250, 25);

TRUNCATE TABLE trunc_a21 CASCADE;
SELECT a as "from table ref_c" FROM ref_c;
SELECT a as "from table trunc_a" FROM trunc_a ORDER BY a;

DROP TABLE trunc_a, ref_c;

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Alvaro Herrera-9
On 2020-Feb-06, Alvaro Herrera wrote:

> On 2020-Feb-06, Alvaro Herrera wrote:
>
> > I agree that patching heap_truncate_find_FKs is a reasonable way to fix.
> > I propose a slightly different formulation: instead of the loop that you
> > have, we can just use the second loop, and add more parent constraints
> > to the list if any constraint we scan in turn has a parent constraint.
> > So we don't repeat the whole thing, but only that second loop.
>
> Hmm, this doesn't actually work; I modified your test case and I see
> that my code fails to do the right thing.

Yeah, AFAICS both algorithms posted so far (yours and mine) are wrong.
Maybe there's another way to fix it, but I think we're going to need the
find_all_inheritors call you didn't want; here's a rough sketch of what
I'm thinking:

        while (HeapTupleIsValid(tuple = systable_getnext(fkeyScan)))
        {
                Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);

                /* Not referencing one of our list of tables */
                if (!list_member_oid(oids, con->confrelid))
                        continue;

                /*
                 * If the constraint has a parent, climb up the partition hierarchy
                 * all the way to the top.  We need to process all the partitions
                 * covered by the topmost constraint.
                 */
                while (OidIsValid(con->conparentid))
                {
                        scan2 = systable_beginscan(fkeyRel, ConstraintParentIndexId,
                                                                           true, NULL, 1, &key);
                        tup2 = heap_copytuple(systable_getnext(scan2)); /* XXX leaks memory */
                        con = (Form_pg_constraint) GETSTRUCT(tup2);
                        systable_endscan(scan2);
                }

                /* Add referencer to result, unless present in input list */
                if (!list_member_oid(relationIds, con->conrelid))
                        result = lappend_oid(result, con->conrelid);
                if (conrelid is partitioned)
                {
                        add each partition to result list;
                }
        }

ENOTIME to complete it now, though ... also: I'm not sure about having
heap_truncate_find_FKs() acquire the locks on partitions; but what
happens if there's a concurrent detach?

This is a larger can of worms than I imagined.  Maybe a simpler solution
is to say that you cannot truncate a partition; if you want that,
truncate the topmost relation.  No functionality seems lost with that
restriction, or is it?  And the semantics seem better defined anyway.
(AFAICS this is implemented easily: if we see a non-invalid conparentid,
raise an error).

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Jehan-Guillaume de Rorthais
Thank you for the review and thoughts Alvaro.

On Thu, 6 Feb 2020 20:01:29 -0300
Alvaro Herrera <[hidden email]> wrote:

> On 2020-Feb-06, Alvaro Herrera wrote:
>
> > On 2020-Feb-06, Alvaro Herrera wrote:
> >  
> > > I agree that patching heap_truncate_find_FKs is a reasonable way to fix.
> > > I propose a slightly different formulation: instead of the loop that you
> > > have, we can just use the second loop, and add more parent constraints
> > > to the list if any constraint we scan in turn has a parent constraint.
> > > So we don't repeat the whole thing, but only that second loop.  
> >
> > Hmm, this doesn't actually work; I modified your test case and I see
> > that my code fails to do the right thing.  
>
> Yeah, AFAICS both algorithms posted so far (yours and mine) are wrong.
Well, when reading myself, I found a bug in my algorithm. When looking for
parent constraints harvested during the first loop, I wasn't looking on
pg_contraint.oid, but on conparentid again. So instead of gathering parent
constraints to add the parent relation to the list of oids, I was only adding
siblings constraints. Here the fix:

    ScanKeyInit(&key,
  - Anum_pg_constraint_conparentid,
  + Anum_pg_constraint_oid
    BTEqualStrategyNumber, F_OIDEQ,
    ObjectIdGetDatum(parent));
   
  -  fkeyScan = systable_beginscan(fkeyRel, ConstraintParentIndexId,
  +  fkeyScan = systable_beginscan(fkeyRel, ConstraintOidIndexId,
      true, NULL, 1,
    &key);

[...]
> ENOTIME to complete it now, though ... also: I'm not sure about having
> heap_truncate_find_FKs() acquire the locks on partitions;

ExecuteTruncate and ExecuteTruncateGuts are responsible to open and
lock relations. It might be messy or racy between those and
heap_truncate_find_FKs if the later open/lock or open/nolock while looking for
relations.

> but what happens if there's a concurrent detach?

Not sure. Are you talking about the referenced or referencing side?

> This is a larger can of worms than I imagined.  Maybe a simpler solution
> is to say that you cannot truncate a partition; if you want that,
> truncate the topmost relation.

I thought about this as well, but it might be a feature regression in a minor
version.

> No functionality seems lost with that restriction, or is it?

It does. When truncating a partition, you left untouched other siblings. You
did not truncate the whole partioned table. this is the last query in my
original test.

I added some more words to the doc about this. Please, find in attachment a new
version of bug fix proposal.

Regards,

0001-v2-Fix-TRUNCATE-on-a-partition-to-apply-CASCADE-to-part.patch (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Alvaro Herrera-9
On 2020-Feb-07, Jehan-Guillaume de Rorthais wrote:

> Well, when reading myself, I found a bug in my algorithm. When looking for
> parent constraints harvested during the first loop, I wasn't looking on
> pg_contraint.oid, but on conparentid again. So instead of gathering parent
> constraints to add the parent relation to the list of oids, I was only adding
> siblings constraints. Here the fix:
>
>     ScanKeyInit(&key,
>   - Anum_pg_constraint_conparentid,
>   + Anum_pg_constraint_oid
>     BTEqualStrategyNumber, F_OIDEQ,
>     ObjectIdGetDatum(parent));
>    
>   -  fkeyScan = systable_beginscan(fkeyRel, ConstraintParentIndexId,
>   +  fkeyScan = systable_beginscan(fkeyRel, ConstraintOidIndexId,
>       true, NULL, 1,
>     &key);
Doh, of course.  I should have seen that.

Here's another take at the formulation; IMO the loop is more obvious
this way, with a flag to restart from the top rather than keeping track
of the list length.  But essentially this is your algorithm.

I couldn't find any fault in this.  It would be nice if the cascaded
truncation was more precise, ie. only truncate the referencing
partitions that overlap the ranges covered by the referenced partition
being truncated.  But that seems more difficult to achieve, as well as
less clearly defined; if you really want something like that, I think
you can detach the referenced partition.



--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v3-0001-Fix-TRUNCATE-on-a-partition-to-apply-CASCADE-to-p.patch (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Jehan-Guillaume de Rorthais
On Fri, 7 Feb 2020 12:04:32 -0300
Alvaro Herrera <[hidden email]> wrote:

> On 2020-Feb-07, Jehan-Guillaume de Rorthais wrote:
>
> > Well, when reading myself, I found a bug in my algorithm. When looking for
> > parent constraints harvested during the first loop, I wasn't looking on
> > pg_contraint.oid, but on conparentid again. So instead of gathering parent
> > constraints to add the parent relation to the list of oids, I was only
> > adding siblings constraints. Here the fix:
> >
> >     ScanKeyInit(&key,
> >   - Anum_pg_constraint_conparentid,
> >   + Anum_pg_constraint_oid
> >     BTEqualStrategyNumber, F_OIDEQ,
> >     ObjectIdGetDatum(parent));
> >    
> >   -  fkeyScan = systable_beginscan(fkeyRel, ConstraintParentIndexId,
> >   +  fkeyScan = systable_beginscan(fkeyRel, ConstraintOidIndexId,
> >       true, NULL,
> > 1, &key);  
>
> Doh, of course.  I should have seen that.
>
> Here's another take at the formulation; IMO the loop is more obvious
> this way, with a flag to restart from the top rather than keeping track
> of the list length.  But essentially this is your algorithm.

Yes, I recognize my algo with some cosmetic improvements, this obvious restart
flag I should have thought about and some welcomed code comments. I agree this
is more clear. Thanks!

Maybe I would just add:

 /*
  * If this constraint has a parent constraint which we have not seen
  * yet, keep track of it for the second loop, below.
+ * Tracking parent constraint allows to climb up to the top-level
+ * level constraint and look for all possible relation referencing
+ * the partioned table.
  */

> I couldn't find any fault in this.

great!

> It would be nice if the cascaded truncation was more precise, ie. only
> truncate the referencing partitions that overlap the ranges covered by the
> referenced partition being truncated.

Yes, I was wondering about that when I was working on the first version of the
patch. It seems like a dedicated partitioning syntax when looking at other
RDBMSs. Eg. "PARTITION BY REFERENCE (col)" and "TRUNCATE PARTITION":

https://oracle-base.com/articles/12c/cascade-functionality-for-truncate-partition-and-exchange-partition-12cr1

> But that seems more difficult to achieve, as well as less clearly defined; if
> you really want something like that, I think you can detach the referenced
> partition.

This is out of the scope of this bug fix in my humble opinion. This would be a
whole new feature, even if it could be done without a new syntax.

Regards,


Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Alvaro Herrera-9
On 2020-Feb-07, Jehan-Guillaume de Rorthais wrote:

> Maybe I would just add:
>
>  /*
>   * If this constraint has a parent constraint which we have not seen
>   * yet, keep track of it for the second loop, below.
> + * Tracking parent constraint allows to climb up to the top-level
> + * level constraint and look for all possible relation referencing
> + * the partioned table.
>   */

LGTM.

BTW I was thinking that perhaps it would make sense to go up all levels
at once when we see a "parented" constraint; this would avoid having to
restart several times when there's N-levels partitioning.  It might be
an issue if pg_constraint is large, because, you see, there's a seqscan
there!  (Maybe now's the time to add an index to confrelid, but of
course only in master).  This probably doesn't matter much normally
because nobody uses that many partition levels ...

> This is out of the scope of this bug fix in my humble opinion. This would be a
> whole new feature, even if it could be done without a new syntax.

Sure.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Jehan-Guillaume de Rorthais
On Fri, 7 Feb 2020 14:27:51 -0300
Alvaro Herrera <[hidden email]> wrote:

> On 2020-Feb-07, Jehan-Guillaume de Rorthais wrote:
>
> > Maybe I would just add:
> >
> >  /*
> >   * If this constraint has a parent constraint which we have not seen
> >   * yet, keep track of it for the second loop, below.
> > + * Tracking parent constraint allows to climb up to the top-level
> > + * level constraint and look for all possible relation referencing
> > + * the partioned table.
> >   */  
>
> LGTM.
Added.

> BTW I was thinking that perhaps it would make sense to go up all levels
> at once when we see a "parented" constraint; this would avoid having to
> restart several times when there's N-levels partitioning.  It might be
> an issue if pg_constraint is large, because, you see, there's a seqscan
> there!

Indeed. See v4 in attachment. It saves 3 seqscans during the whole tests we
added.

> (Maybe now's the time to add an index to confrelid, but of
> course only in master).  This probably doesn't matter much normally
> because nobody uses that many partition levels ...

I have a colleague that enjoys experimenting with limits. But I'm not sure I'll
have feedback from him before next minor release (next week?).

Regards,

0001-v4-Fix-TRUNCATE-on-a-partition-to-apply-CASCADE-to-part.patch (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Alvaro Herrera-9
There's another key point I forgot -- which is that we only need to
search for constraints on the topmost partitioned table, not each of its
partitions.  The reason is that pg_constraint rows exist on the other
side that reference that relation, for each partition on the other side.
So we can do this:

+       if (HeapTupleIsValid(tuple))
+       {
+           Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);
+
+           /*
+            * pg_constraint rows always appear for partitioned hierarchies
+            * this way: on the each side of the constraint, one row appears
+            * for each partition that points to the top-most table on the
+            * other side.
+            *
+            * Because of this arrangement, we can correctly catch all
+            * relevant relations by adding to 'parent_cons' all rows with
+            * valid conparentid, and to the 'oids' list all rows with a
+            * zero conparentid.  If any oids are added to 'oids', redo the
+            * first loop above by setting 'restart'.
+            */
+           if (OidIsValid(con->conparentid))
+               parent_cons = list_append_unique_oid(parent_cons,
+                                                    con->conparentid);
+           else if (!list_member_oid(oids, con->confrelid))
+           {
+               oids = lappend_oid(oids, con->confrelid);
+               restart = true;
+           }
+       }

that is, keep appending to the parent_cons list, and not touch the oids
list, until we get to the top of the hierarchy.  Then when we redo the
first loop, we'll get all partitions on the other side because they all
have pg_constraint rows that reference the topmost rel.  (That is to
say, all the intermediate-partition OIDs should be useless in the 'oids'
list anyway.)

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Alvaro Herrera-9
Pushed.  Thanks for testing and fixing.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: FK violation in partitioned table after truncating a referenced partition

Jehan-Guillaume de Rorthais
In reply to this post by Alvaro Herrera-9
On Fri, 7 Feb 2020 17:19:48 -0300
Alvaro Herrera <[hidden email]> wrote:

> There's another key point I forgot -- which is that we only need to
> search for constraints on the topmost partitioned table, not each of its
> partitions.  The reason is that pg_constraint rows exist on the other
> side that reference that relation, for each partition on the other side.

Yes, I figured this as well when drawing things during debug time.

By this time, I kept it this way because I wasn't sure about potential
complications with sub-partitioning and FK to sub-partition only.


> So we can do this:
[...]

> that is, keep appending to the parent_cons list, and not touch the oids
> list, until we get to the top of the hierarchy.  Then when we redo the
> first loop, we'll get all partitions on the other side because they all
> have pg_constraint rows that reference the topmost rel.  (That is to
> say, all the intermediate-partition OIDs should be useless in the 'oids'
> list anyway.)

It makes the oids list smaller (depending on the partitioning depth). As it is
scanned for each FK in pg_constraint, it surely squeeze some more time.

I'll stick around irw the other FK violation thread. Please, keep me in the
loop.

Thank you for the discussion and commit.

Regards,