Statement-level Triggers For Uniqueness Checks

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Statement-level Triggers For Uniqueness Checks

Corey Huinker
In digging around the codebase (see thread: Referential Integrity Checks with Statement-level Triggers), I noticed that unique constraints are similarly enforced with a per-row trigger.

The situation with unique indexes is fairly similar to the situation with RI checks: there is some overhead to using a transition table, but that overhead may be less than the cost of firing a trigger once per row inserted/updated.

However, there are some significant differences (apologies to everyone already familiar with this part of the code, it's new to me).

For one, there is no analog to RI_Initial_Check(). Instead the constraint is initially checked via building/finding the unique index that would enforce the uniqueness check.

Then, the actual lookup done in unique_key_recheck has to contend with the intricacies of HOT updates, so I don't know if that can be expressed in an SPI query. Even if not, I think it should be possible to iterate over the EphemeralNamedRelation and that would result itself have a payoff in reduced trigger calls.

I'm going to be working on this as a POC patch separate from the RI work, hence the separate thread, but there's obviously a lot of overlap.

All advice is appreciated.

Reply | Threaded
Open this post in threaded view
|

Re: Statement-level Triggers For Uniqueness Checks

Corey Huinker
So I took a first pass at this, and I got stuck.

The basic design is that instead of creating one row-level trigger per deferrable unique constraint, we instead create one insert-statement level trigger and one update-statement level trigger. Both call the function unique_key_recheck(), which now attempts to walk the inserted transition table, doing basically the same checks that were done in the per-row trigger. I'm hoping for some performance advantage for large row inserts/updates due to N-1 fewer triggers firing and N-1 attempts to lock the unique index.

The regression diff (attached) seems to imply that the triggers simply are not firing, though. I have verified that the triggers are created:

test=# CREATE TEMPORARY TABLE test ( x integer PRIMARY KEY DEFERRABLE INITIALLY DEFERRED );

CREATE TABLE

test=# SELECT * FROM pg_trigger WHERE 

oid             tgconstraint    tgdeferrable    tginitdeferred  tgnargs         tgqual          

tgargs          tgconstrindid   tgenabled       tgisinternal    tgnewtable      tgrelid         

tgattr          tgconstrrelid   tgfoid          tgname          tgoldtable      tgtype          

test=# SELECT * FROM pg_trigger WHERE tgrelid = 'test'::regclass;

  oid  | tgrelid |            tgname            | tgfoid | tgtype | tgenabled | tgisinternal | tgconstrrelid | tgconstrindid | tgconstraint | t

gdeferrable | tginitdeferred | tgnargs | tgattr | tgargs | tgqual | tgoldtable |          tgnewtable          

-------+---------+------------------------------+--------+--------+-----------+--------------+---------------+---------------+--------------+--

------------+----------------+---------+--------+--------+--------+------------+------------------------------

 16392 |   16387 | PK_ConstraintTrigger_i_16392 |   1250 |      4 | O         | t            |             0 |         16390 |        16391 | t

            | t              |       0 |        | \x     |        |            | pg_inserted_transition_table

 16393 |   16387 | PK_ConstraintTrigger_u_16393 |   1250 |     16 | O         | t            |             0 |         16390 |        16391 | t

            | t              |       0 |        | \x     |        |            | pg_inserted_transition_table

(2 rows)


Any idea where I went wrong?

On Mon, Dec 17, 2018 at 9:56 AM Corey Huinker <[hidden email]> wrote:
In digging around the codebase (see thread: Referential Integrity Checks with Statement-level Triggers), I noticed that unique constraints are similarly enforced with a per-row trigger.

The situation with unique indexes is fairly similar to the situation with RI checks: there is some overhead to using a transition table, but that overhead may be less than the cost of firing a trigger once per row inserted/updated.

However, there are some significant differences (apologies to everyone already familiar with this part of the code, it's new to me).

For one, there is no analog to RI_Initial_Check(). Instead the constraint is initially checked via building/finding the unique index that would enforce the uniqueness check.

Then, the actual lookup done in unique_key_recheck has to contend with the intricacies of HOT updates, so I don't know if that can be expressed in an SPI query. Even if not, I think it should be possible to iterate over the EphemeralNamedRelation and that would result itself have a payoff in reduced trigger calls.

I'm going to be working on this as a POC patch separate from the RI work, hence the separate thread, but there's obviously a lot of overlap.

All advice is appreciated.


regression.diffs (8K) Download Attachment
0001-Refactor-per-row-unique-key-deferred-constraint-trig.patch (17K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Statement-level Triggers For Uniqueness Checks

Dean Rasheed-3
On Mon, 24 Dec 2018 at 23:57, Corey Huinker <[hidden email]> wrote:
>
> So I took a first pass at this, and I got stuck.
>
> [snip]
>
> Any idea where I went wrong?

Take a look at this code in AfterTriggerSaveEvent():

        /*
         * If the trigger is a deferred unique constraint check trigger, only
         * queue it if the unique constraint was potentially violated, which
         * we know from index insertion time.
         */
        if (trigger->tgfoid == F_UNIQUE_KEY_RECHECK)
        {
            if (!list_member_oid(recheckIndexes, trigger->tgconstrindid))
                continue;       /* Uniqueness definitely not violated */
        }

If you trace it back, you'll see that for a statement-level trigger,
recheckIndexes will always be empty.

Regards,
Dean

Reply | Threaded
Open this post in threaded view
|

Re: Statement-level Triggers For Uniqueness Checks

Dean Rasheed-3
On Tue, 25 Dec 2018 at 08:04, Dean Rasheed <[hidden email]> wrote:
> Take a look at this code in AfterTriggerSaveEvent():
>

Note that the intention behind that code is that in the (fairly
common) case where an insert or update operation is known to not lead
to any potential PK/UNIQUE index violations, the overhead for the
deferred recheck is minimal. For example, consider a bulk insert where
IDs come from a sequence known to not overlap with existing IDs. In
that case, each new ID is determined at index insertion time to not
possibly conflict with any existing ID, recheckIndexes remains empty
for each row, and no recheck triggers are ever queued.

One of the tricky things about replacing the current rechecks with
statement level triggers will be working out how to deal with such
cases (or cases with just a tiny fraction of keys to be rechecked)
without introducing a large overhead.

Regards,
Dean

Reply | Threaded
Open this post in threaded view
|

Re: Statement-level Triggers For Uniqueness Checks

Peter Eisentraut-6
In reply to this post by Corey Huinker
On 25/12/2018 00:56, Corey Huinker wrote:
> The regression diff (attached) seems to imply that the triggers simply
> are not firing, though.

The reason for this was explained by Dean.  If you take out the check
that he mentioned, then your trigger fires but crashes.  In your changed
unique_key_recheck(), "slot" is not initialized before use (or ever).

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Statement-level Triggers For Uniqueness Checks

Corey Huinker
On Fri, Jan 4, 2019 at 7:49 AM Peter Eisentraut
<[hidden email]> wrote:
>
> On 25/12/2018 00:56, Corey Huinker wrote:
> > The regression diff (attached) seems to imply that the triggers simply
> > are not firing, though.
>
> The reason for this was explained by Dean.  If you take out the check
> that he mentioned, then your trigger fires but crashes.  In your changed
> unique_key_recheck(), "slot" is not initialized before use (or ever).

Thanks. I'll be revisiting this shortly. Dean's information made me
think the potential for a gain is smaller than initially imagined.

Reply | Threaded
Open this post in threaded view
|

Re: Statement-level Triggers For Uniqueness Checks

Peter Eisentraut-6
On 08/01/2019 23:26, Corey Huinker wrote:

> On Fri, Jan 4, 2019 at 7:49 AM Peter Eisentraut
> <[hidden email]> wrote:
>>
>> On 25/12/2018 00:56, Corey Huinker wrote:
>>> The regression diff (attached) seems to imply that the triggers simply
>>> are not firing, though.
>>
>> The reason for this was explained by Dean.  If you take out the check
>> that he mentioned, then your trigger fires but crashes.  In your changed
>> unique_key_recheck(), "slot" is not initialized before use (or ever).
>
> Thanks. I'll be revisiting this shortly. Dean's information made me
> think the potential for a gain is smaller than initially imagined.

I think those are independent considerations.  The "recheckIndexes"
logic just tracks what indexes have potential conflicts to check later.
Whether that checking later happens in a row or statement trigger should
not matter.  What you need to do is adapt the "recheckIndexes" logic
from row triggers to statement triggers, e.g., expand
ExecASInsertTriggers() to look more like ExecARInsertTriggers().

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services