Duplicate key violation on upsert

Matt Magoffin-3
Hello,

I am experiencing a duplicate key violation in Postgres 9.6 on occasion for one particular query, and I’m wondering where I’m going wrong. My table looks like this:

                      Table "solardatum.da_datum"
  Column   |           Type           | Collation | Nullable | Default 
-----------+--------------------------+-----------+----------+---------
 ts        | timestamp with time zone |           | not null | 
 node_id   | bigint                   |           | not null | 
 source_id | character varying(64)    |           | not null | 
 posted    | timestamp with time zone |           | not null | 
 jdata_i   | jsonb                    |           |          | 
 jdata_a   | jsonb                    |           |          | 
 jdata_s   | jsonb                    |           |          | 
 jdata_t   | text[]                   |           |          | 
Indexes:
    "da_datum_pkey" UNIQUE, btree (node_id, ts, source_id) CLUSTER, tablespace "solarindex"
    "da_datum_reverse_pkey" UNIQUE, btree (node_id, ts DESC, source_id), tablespace "solarindex"
    "da_datum_x_acc_idx" UNIQUE, btree (node_id, source_id, ts DESC, jdata_a) WHERE jdata_a IS NOT NULL, tablespace "solarindex"
Triggers:
    aa_agg_stale_datum BEFORE INSERT OR DELETE OR UPDATE ON solardatum.da_datum FOR EACH ROW EXECUTE PROCEDURE solardatum.trigger_agg_stale_datum()
    ts_insert_blocker BEFORE INSERT ON solardatum.da_datum FOR EACH ROW EXECUTE PROCEDURE _timescaledb_internal.insert_blocker()

The error/query looks like:

ERROR: duplicate key value violates unique constraint "_hyper_1_1931_chunk_da_datum_x_acc_idx"
  Where: SQL statement "INSERT INTO solardatum.da_datum(ts, node_id, source_id, posted, jdata_i, jdata_a, jdata_s, jdata_t)
VALUES (…)
ON CONFLICT (node_id, ts, source_id) DO UPDATE
SET jdata_i = EXCLUDED.jdata_i,
jdata_a = EXCLUDED.jdata_a,
jdata_s = EXCLUDED.jdata_s,
jdata_t = EXCLUDED.jdata_t,
posted = EXCLUDED.posted
RETURNING (xmax = 0)"

I am using the TimescaleDB extension, so there are child tables inheriting from this main table, and that’s why the reported index name differs from the definition shown above. I’m not sure if the extension is the problem, so I thought I’d start here to see if I’ve configured something wrong or my expectations of how the upsert should work are wrong. My expectation was that the insert would basically never fail from a duplicate key violation.

The error always references the da_datum_x_acc_idx index, which is a partial index with jdata_a added as a covering column… that is, it’s only in the index so I can get some index-only results with that column. Is the partial index possibly an issue in this configuration?

Thanks for any insight,
Matt
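For what it’s worth, the upsert shape above can be exercised in miniature. This is only a sketch, with SQLite standing in for Postgres (the table and index names follow the ones above, trimmed to the relevant columns; SQLite has no xmax, so the RETURNING (xmax = 0) inserted-vs-updated trick is not reproduced):

```python
import sqlite3

# Miniature of the upsert pattern from the thread, SQLite standing in for
# Postgres. Same conflict target (node_id, ts, source_id) and an analogous
# partial unique "covering" index on (node_id, source_id, ts DESC, jdata_a).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE da_datum (
    ts        TEXT    NOT NULL,
    node_id   INTEGER NOT NULL,
    source_id TEXT    NOT NULL,
    jdata_a   TEXT,
    PRIMARY KEY (node_id, ts, source_id)
);
-- partial unique index, analogous to da_datum_x_acc_idx
CREATE UNIQUE INDEX da_datum_x_acc_idx
    ON da_datum (node_id, source_id, ts DESC, jdata_a)
    WHERE jdata_a IS NOT NULL;
""")

upsert = """
INSERT INTO da_datum (ts, node_id, source_id, jdata_a)
VALUES (?, ?, ?, ?)
ON CONFLICT (node_id, ts, source_id) DO UPDATE
SET jdata_a = excluded.jdata_a
"""
conn.execute(upsert, ("2020-03-20T00:00:00Z", 1, "meter/1", '{"watts": 100}'))
# Same key again: the DO UPDATE branch fires instead of a duplicate key error.
conn.execute(upsert, ("2020-03-20T00:00:00Z", 1, "meter/1", '{"watts": 200}'))

count, jdata_a = conn.execute(
    "SELECT COUNT(*), MAX(jdata_a) FROM da_datum").fetchone()
print(count, jdata_a)  # 1 {"watts": 200}
```

That is the behavior I expect from the real table: one row, updated in place, never a duplicate key violation on the conflict-target columns.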

Re: Duplicate key violation on upsert

Adrian Klaver-4
On 3/20/20 2:17 AM, Matt Magoffin wrote:

> Hello,
>
> I am experiencing a duplicate key violation in Postgres 9.6 on occasion
> for one particular query, and I’m wondering where I’m going wrong. My
> table looks like this:
>
>                        Table "solardatum.da_datum"
>    Column   |           Type           | Collation | Nullable | Default
> -----------+--------------------------+-----------+----------+---------
>   ts        | timestamp with time zone |           | not null |
>   node_id   | bigint                   |           | not null |
>   source_id | character varying(64)    |           | not null |
>   posted    | timestamp with time zone |           | not null |
>   jdata_i   | jsonb                    |           |          |
>   jdata_a   | jsonb                    |           |          |
>   jdata_s   | jsonb                    |           |          |
>   jdata_t   | text[]                   |           |          |
> Indexes:
>      "da_datum_pkey" UNIQUE, btree (node_id, ts, source_id) CLUSTER,
> tablespace "solarindex"
>      "da_datum_reverse_pkey" UNIQUE, btree (node_id, ts DESC,
> source_id), tablespace "solarindex"
>      "da_datum_x_acc_idx" UNIQUE, btree (node_id, source_id, ts DESC,
> jdata_a) WHERE jdata_a IS NOT NULL, tablespace "solarindex"
> Triggers:
>      aa_agg_stale_datum BEFORE INSERT OR DELETE OR UPDATE ON
> solardatum.da_datum FOR EACH ROW EXECUTE
> PROCEDURE solardatum.trigger_agg_stale_datum()
>      ts_insert_blocker BEFORE INSERT ON solardatum.da_datum FOR EACH ROW
> EXECUTE PROCEDURE _timescaledb_internal.insert_blocker()
>
> The error/query looks like:
>
> ERROR: duplicate key value violates unique constraint
> "_hyper_1_1931_chunk_da_datum_x_acc_idx"

What is the above index UNIQUE over?

What is da_datum_x_acc_idx index below indexed over?


>    Where: SQL statement "INSERT INTO solardatum.da_datum(ts, node_id,
> source_id, posted, jdata_i, jdata_a, jdata_s, jdata_t)
> VALUES (…)
> ON CONFLICT (node_id, ts, source_id) DO UPDATE
> SET jdata_i = EXCLUDED.jdata_i,
> jdata_a = EXCLUDED.jdata_a,
> jdata_s = EXCLUDED.jdata_s,
> jdata_t = EXCLUDED.jdata_t,
> posted = EXCLUDED.posted
> RETURNING (xmax = 0)"
>
> I am using the TimescaleDB extension so there are child tables
> inheriting from this main table and that’s why the reported index name
> differs from the definition shown above. I’m not sure if the extension
> is the problem, so I thought I’d start here to see if I’ve configured
> something wrong or my expectations on how the upsert should work is
> wrong. My expectation was that basically the insert would never fail
> from a duplicate key violation.
>
> The error always references the da_datum_x_acc_idx index, which is a
> partial index with jdata_a added as a covering column… that is, it’s
> only in the index so I can get some index-only results with that column.
> Is the partial index possibly an issue in this configuration?
>
> Thanks for any insight,
> Matt


--
Adrian Klaver
[hidden email]



Re: Duplicate key violation on upsert

Matt Magoffin-3


On 21/03/2020, at 4:00 AM, Adrian Klaver <[hidden email]> wrote:

On 3/20/20 2:17 AM, Matt Magoffin wrote:
Hello,
Indexes:
    "da_datum_pkey" UNIQUE, btree (node_id, ts, source_id) CLUSTER, tablespace "solarindex"
    "da_datum_reverse_pkey" UNIQUE, btree (node_id, ts DESC, source_id), tablespace "solarindex"
    "da_datum_x_acc_idx" UNIQUE, btree (node_id, source_id, ts DESC, jdata_a) WHERE jdata_a IS NOT NULL, tablespace "solarindex"
The error/query looks like:
ERROR: duplicate key value violates unique constraint "_hyper_1_1931_chunk_da_datum_x_acc_idx"

What is the above index UNIQUE over?

What is da_datum_x_acc_idx index below indexed over?

The _hyper_1_1931_chunk_da_datum_x_acc_idx index has the same definition as the da_datum_x_acc_idx above (it is defined on a child table). That is, they are both essentially:

UNIQUE, btree (node_id, source_id, ts DESC, jdata_a) WHERE jdata_a IS NOT NULL

The da_datum_pkey index is what the ON CONFLICT clause refers to, so (node_id, ts, source_id) is UNIQUE as well.

— m@


Re: Duplicate key violation on upsert

Adrian Klaver-4
On 3/20/20 11:34 AM, Matt Magoffin wrote:

>
>
>> On 21/03/2020, at 4:00 AM, Adrian Klaver <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>> On 3/20/20 2:17 AM, Matt Magoffin wrote:
>>> Hello,
>>> Indexes:
>>>     "da_datum_pkey" UNIQUE, btree (node_id, ts, source_id) CLUSTER,
>>> tablespace "solarindex"
>>>     "da_datum_reverse_pkey" UNIQUE, btree (node_id, ts DESC,
>>> source_id), tablespace "solarindex"
>>>     "da_datum_x_acc_idx" UNIQUE, btree (node_id, source_id, ts DESC,
>>> jdata_a) WHERE jdata_a IS NOT NULL, tablespace "solarindex"
>>> The error/query looks like:
>>> ERROR: duplicate key value violates unique constraint
>>> "_hyper_1_1931_chunk_da_datum_x_acc_idx"
>>
>> What is the above index UNIQUE over?
>>
>> What is da_datum_x_acc_idx index below indexed over?
>
> The _hyper_1_1931_chunk_da_datum_x_acc_idx index has the same definition
> as the da_datum_x_acc_idx above (it is defined on a child table). That
> is, they are both essentially:
>
> UNIQUE, btree (node_id, source_id, ts DESC, jdata_a) WHERE jdata_a IS
> NOT NULL
>
> The da_datum_pkey index is what the ON CONFLICT cause refers to, so
> (node_id, ts, source_id) is UNIQUE as well.

Hmm, wonder if you are getting bit by this?:

https://www.postgresql.org/docs/12/sql-insert.html#SQL-ON-CONFLICT

"INSERT with an ON CONFLICT DO UPDATE clause is a “deterministic”
statement. This means that the command will not be allowed to affect any
single existing row more than once; a cardinality violation error will
be raised when this situation arises. Rows proposed for insertion should
not duplicate each other in terms of attributes constrained by an
arbiter index or constraint."
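One common way to stay inside that rule with multi-row upserts is to deduplicate the proposed rows on the arbiter key before sending the statement, so no two rows in one VALUES list share (node_id, ts, source_id). A minimal sketch; the helper name is hypothetical, not from your code:

```python
# Client-side dedupe on the arbiter key, so a multi-row
# INSERT ... ON CONFLICT DO UPDATE never proposes two rows
# with the same (node_id, ts, source_id).
def dedupe_on_key(rows, key=("node_id", "ts", "source_id")):
    seen = {}
    for row in rows:
        seen[tuple(row[k] for k in key)] = row  # later duplicates win
    return list(seen.values())

batch = [
    {"node_id": 1, "ts": "t1", "source_id": "a", "jdata_a": '{"wh": 1}'},
    {"node_id": 1, "ts": "t1", "source_id": "a", "jdata_a": '{"wh": 2}'},
    {"node_id": 1, "ts": "t2", "source_id": "a", "jdata_a": '{"wh": 3}'},
]
deduped = dedupe_on_key(batch)
print(len(deduped))  # the first two rows collapse to one
```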


>
> — m@
>


--
Adrian Klaver
[hidden email]



Re: Duplicate key violation on upsert

Matt Magoffin-3

On 21/03/2020, at 8:10 AM, Adrian Klaver <[hidden email]> wrote:

The _hyper_1_1931_chunk_da_datum_x_acc_idx index has the same definition as the da_datum_x_acc_idx above (it is defined on a child table). That is, they are both essentially:
UNIQUE, btree (node_id, source_id, ts DESC, jdata_a) WHERE jdata_a IS NOT NULL
The da_datum_pkey index is what the ON CONFLICT cause refers to, so (node_id, ts, source_id) is UNIQUE as well.

Hmm, wonder if you are getting bit by this?:

https://www.postgresql.org/docs/12/sql-insert.html#SQL-ON-CONFLICT

"INSERT with an ON CONFLICT DO UPDATE clause is a “deterministic” statement. This means that the command will not be allowed to affect any single existing row more than once; a cardinality violation error will be raised when this situation arises. Rows proposed for insertion should not duplicate each other in terms of attributes constrained by an arbiter index or constraint.”

I’m not sure I’m wrapping my head around this. The INSERT affects 1 row as the unique values (node_id, ts, source_id) are specified in the statement. Is it possible that da_datum_x_acc_idx is used as the arbiter index in this situation, rather than da_datum_pkey (that I intended), and you’re saying that the jdata_a column is getting updated twice, first in the INSERT and second in the DO UPDATE, triggering the duplicate key violation?

— m@


Re: Duplicate key violation on upsert

Adrian Klaver-4
On 3/20/20 1:32 PM, Matt Magoffin wrote:

>
>> On 21/03/2020, at 8:10 AM, Adrian Klaver <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>>> The _hyper_1_1931_chunk_da_datum_x_acc_idx index has the same
>>> definition as the da_datum_x_acc_idx above (it is defined on a child
>>> table). That is, they are both essentially:
>>> UNIQUE, btree (node_id, source_id, ts DESC, jdata_a) WHERE jdata_a IS
>>> NOT NULL
>>> The da_datum_pkey index is what the ON CONFLICT cause refers to, so
>>> (node_id, ts, source_id) is UNIQUE as well.
>>
>> Hmm, wonder if you are getting bit by this?:
>>
>> https://www.postgresql.org/docs/12/sql-insert.html#SQL-ON-CONFLICT
>>
>> "INSERT with an ON CONFLICT DO UPDATE clause is a “deterministic”
>> statement. This means that the command will not be allowed to affect
>> any single existing row more than once; a cardinality violation error
>> will be raised when this situation arises. Rows proposed for insertion
>> should not duplicate each other in terms of attributes constrained by
>> an arbiter index or constraint.”
>
> I’m not sure I’m wrapping my head around this. The INSERT affects 1 row
> as the unique values (node_id, ts, source_id) are specified in the
> statement. Is it possible that da_datum_x_acc_idx is used as the arbiter
> index in this situation, rather than da_datum_pkey (that I intended),
> and you’re saying that the jdata_a column is getting updated twice,
> first in the INSERT and second in the DO UPDATE, triggering the
> duplicate key violation?

I was thinking more about this:

"INSERT INTO solardatum.da_datum(ts, node_id, source_id, posted,
jdata_i, jdata_a, jdata_s, jdata_t)
        VALUES (…) ..."

from your OP. Namely whether it was:

VALUES (), (), (), ...

and if so there were values in the (),(),() that duplicated each other.

As to the second part of your response, ON CONFLICT does one of either
INSERT or UPDATE. If:

1) There is no conflict for ON CONFLICT (node_id, ts, source_id) then
the INSERT proceeds.

2) If there is a conflict then an UPDATE occurs using the SET values.

Now just me working through this:

da_datum_pkey       = (node_id, ts, source_id)
da_datum_x_acc_idx  = (node_id, source_id, ts DESC, jdata_a)

If 1) from above applies then da_datum_x_acc_idx will not be tripped as
the only way that could happen is if the node_id, ts, source_id was the
same as an existing row and that can't be true because the PK over the
same values passed.

If 2) from above happened then you are trying to UPDATE a row with
matching PK values (node_id, ts, source_id). Now it is entirely possible
that, since you are not testing for a constraint violation on (node_id,
source_id, ts DESC, jdata_a), your SET jdata_a = EXCLUDED.jdata_a could
use a value that would trip da_datum_x_acc_idx.

>
> — m@
>


--
Adrian Klaver
[hidden email]



Re: Duplicate key violation on upsert

Adrian Klaver-4
On 3/21/20 11:05 AM, Adrian Klaver wrote:

> On 3/20/20 1:32 PM, Matt Magoffin wrote:
>>
>>> On 21/03/2020, at 8:10 AM, Adrian Klaver <[hidden email]
>>> <mailto:[hidden email]>> wrote:
>>>
>>>> The _hyper_1_1931_chunk_da_datum_x_acc_idx index has the same
>>>> definition as the da_datum_x_acc_idx above (it is defined on a child
>>>> table). That is, they are both essentially:
>>>> UNIQUE, btree (node_id, source_id, ts DESC, jdata_a) WHERE jdata_a
>>>> IS NOT NULL
>>>> The da_datum_pkey index is what the ON CONFLICT cause refers to, so
>>>> (node_id, ts, source_id) is UNIQUE as well.
>>>
>>> Hmm, wonder if you are getting bit by this?:
>>>
>>> https://www.postgresql.org/docs/12/sql-insert.html#SQL-ON-CONFLICT
>>>
>>> "INSERT with an ON CONFLICT DO UPDATE clause is a “deterministic”
>>> statement. This means that the command will not be allowed to affect
>>> any single existing row more than once; a cardinality violation error
>>> will be raised when this situation arises. Rows proposed for
>>> insertion should not duplicate each other in terms of attributes
>>> constrained by an arbiter index or constraint.”
>>
>> I’m not sure I’m wrapping my head around this. The INSERT affects 1
>> row as the unique values (node_id, ts, source_id) are specified in the
>> statement. Is it possible that da_datum_x_acc_idx is used as the
>> arbiter index in this situation, rather than da_datum_pkey (that I
>> intended), and you’re saying that the jdata_a column is getting
>> updated twice, first in the INSERT and second in the DO UPDATE,
>> triggering the duplicate key violation?
>
> I was thinking more about this:
>
> "INSERT INTO solardatum.da_datum(ts, node_id, source_id, posted,
> jdata_i, jdata_a, jdata_s, jdata_t)
>      VALUES (…) ..."
>
> from your OP. Namely whether it was:
>
> VALUES (), (), (), ...
>
> and if so there were values in the (),(),() that duplicated each other.
>
> As to the second part of your response, ON CONFLICT does one of either
> INSERT or UPDATE. If:
>
> 1) There is no conflict for ON CONFLICT (node_id, ts, source_id) then
> the INSERT proceeds.
>
> 2) If there is a conflict then an UPDATE occurs using the SET values.
>
> Now just me working through this:
>
> da_datum_pkey       = (node_id, ts, source_id)
> da_datum_x_acc_idx  = (node_id, source_id, ts DESC, jdata_a)
>
> If 1) from above applies then da_datum_x_acc_idx will not be tripped as
> the only way that could happen is if the node_id, ts, source_id was the
> same as an existing row and that can't be true because the PK over the
> same values passed.

Well the below is complete rot. If you are UPDATEing then you are not
creating a duplicate row, just overwriting a value with itself.

>
> If 2) from above happened then you are trying to UPDATE a row with
> matching PK values(node_id, ts, source_id). Now it is entirely possible
> that since you are not testing for constraint violation on (node_id,
> source_id, ts DESC, jdata_a) that you be doing SET jdata_a =
> EXCLUDED.jdata_a, using a value that would trip da_datum_x_acc_idx
>
>>
>> — m@
>>
>
>


--
Adrian Klaver
[hidden email]



Re: Duplicate key violation on upsert

Matt Magoffin-3

On 22/03/2020, at 8:11 AM, Adrian Klaver <[hidden email]> wrote:

I was thinking more about this:
"INSERT INTO solardatum.da_datum(ts, node_id, source_id, posted, jdata_i, jdata_a, jdata_s, jdata_t)
    VALUES (…) ..."
from your OP. Namely whether it was:
VALUES (), (), (), ...
and if so there were values in the (),(),() that duplicated each other.
As to the second part of your response, ON CONFLICT does one of either INSERT or UPDATE. If:
1) There is no conflict for ON CONFLICT (node_id, ts, source_id) then the INSERT proceeds.
2) If there is a conflict then an UPDATE occurs using the SET values.
Now just me working through this:
da_datum_pkey       = (node_id, ts, source_id)
da_datum_x_acc_idx  = (node_id, source_id, ts DESC, jdata_a)
If 1) from above applies then da_datum_x_acc_idx will not be tripped as the only way that could happen is if the node_id, ts, source_id was the same as an existing row and that can't be true because the PK over the same values passed.

Well the below is complete rot. If you are UPDATEing then you are not creating a duplicate row, just overwriting a value with itself.

If 2) from above happened then you are trying to UPDATE a row with matching PK values(node_id, ts, source_id). Now it is entirely possible that since you are not testing for constraint violation on (node_id, source_id, ts DESC, jdata_a) that you be doing SET jdata_a = EXCLUDED.jdata_a, using a value that would trip da_datum_x_acc_idx

Sorry for the vagueness in my OP, I was trying to make it easier to read. The VALUES clause contains individual column values for a single row, so there is only a single possible row to insert/update.

So what you’ve outlined is basically what I thought should be happening. Namely, there can be only one row that will be inserted/updated. I am wondering if I should re-create the da_datum_x_acc_idx index without UNIQUE? I had it as UNIQUE to optimise the type of queries that make use of that index… but I did a little bit of testing using a non-UNIQUE index and those queries appear to execute around the same time as with the UNIQUE index. I just wasn’t sure if that would just be masking some other problem in my setup.
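To illustrate the kind of comparison I mean, here is a sketch with SQLite standing in for Postgres (on the real table EXPLAIN/EXPLAIN ANALYZE is the proper tool): the same columns indexed without UNIQUE can still be chosen for the read queries, which is why dropping UNIQUE need not cost the index-only behavior.

```python
import sqlite3

# Illustration only: a non-UNIQUE partial index over the same columns can
# still serve the lookup. SQLite stands in for Postgres here; on the real
# table you would compare EXPLAIN (ANALYZE) output before and after
# recreating the index without UNIQUE.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE da_datum (
    ts        TEXT    NOT NULL,
    node_id   INTEGER NOT NULL,
    source_id TEXT    NOT NULL,
    jdata_a   TEXT,
    PRIMARY KEY (node_id, ts, source_id)
);
-- same columns as da_datum_x_acc_idx, but without UNIQUE
CREATE INDEX da_datum_x_acc_idx
    ON da_datum (node_id, source_id, ts DESC, jdata_a)
    WHERE jdata_a IS NOT NULL;
""")

plan = conn.execute("""
EXPLAIN QUERY PLAN
SELECT jdata_a FROM da_datum
WHERE node_id = 1 AND source_id = 'meter/1' AND jdata_a IS NOT NULL
ORDER BY ts DESC
""").fetchall()
detail = " ".join(row[-1] for row in plan)
print(detail)  # the non-unique partial index is still used for the search
```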

— m@

Re: Duplicate key violation on upsert

Adrian Klaver-4
On 3/21/20 2:45 PM, Matt Magoffin wrote:

>
>> On 22/03/2020, at 8:11 AM, Adrian Klaver <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>>> I was thinking more about this:
>>> "INSERT INTO solardatum.da_datum(ts, node_id, source_id, posted,
>>> jdata_i, jdata_a, jdata_s, jdata_t)
>>>     VALUES (…) ..."
>>> from your OP. Namely whether it was:
>>> VALUES (), (), (), ...
>>> and if so there were values in the (),(),() that duplicated each other.
>>> As to the second part of your response, ON CONFLICT does one of
>>> either INSERT or UPDATE. If:
>>> 1) There is no conflict for ON CONFLICT (node_id, ts, source_id) then
>>> the INSERT proceeds.
>>> 2) If there is a conflict then an UPDATE occurs using the SET values.
>>> Now just me working through this:
>>> da_datum_pkey       = (node_id, ts, source_id)
>>> da_datum_x_acc_idx  = (node_id, source_id, ts DESC, jdata_a)
>>> If 1) from above applies then da_datum_x_acc_idx will not be tripped
>>> as the only way that could happen is if the node_id, ts, source_id
>>> was the same as an existing row and that can't be true because the PK
>>> over the same values passed.
>>
>> Well the below is complete rot. If you are UPDATEing then you are not
>> creating a duplicate row, just overwriting a value with itself.
>>
>>> If 2) from above happened then you are trying to UPDATE a row with
>>> matching PK values(node_id, ts, source_id). Now it is entirely
>>> possible that since you are not testing for constraint violation on
>>> (node_id, source_id, ts DESC, jdata_a) that you be doing SET jdata_a
>>> = EXCLUDED.jdata_a, using a value that would trip da_datum_x_acc_idx
>
> Sorry for the vagueness in my OP, I was trying to make it easier to
> read. The VALUES are for individual single column values, so a single
> possible row to insert/update.
>
> So what you’ve outlined is basically what I thought should be happening.
> Namely, there can be only one row that will be inserted/updated. I am
> wondering if I should re-create the da_datum_x_acc_idx index without
> UNIQUE? I had it as UNIQUE to optimise the type of queries that make use
> of that index… but I did a little bit of testing using a non-UNIQUE
> index and those queries appear to execute around the same time as with
> the UNIQUE index. I just wasn’t sure if that would just be masking some
> other problem in my setup.

Well if the combination of values does not need to be UNIQUE then imposing
UNIQUE on it, in my mind, is an unnecessary constraint. Though it
would be good to know why the constraint is being tripped.

Is there a chance the BEFORE trigger functions are doing something that
could be leading to the error?

In the error log is there a line with the actual values that failed?

>
> — m@


--
Adrian Klaver
[hidden email]



Re: Duplicate key violation on upsert

Matt Magoffin-3

On 23/03/2020, at 9:44 AM, Adrian Klaver <[hidden email]> wrote:
Is there a chance the BEFORE trigger functions are doing something that could be leading to the error?

In the error log is there a line with the actual values that failed?

The error log does not show the literal values, no. Here is a literal example from the logs:

2020-03-20 19:51:11 NZDT [15165]: [6-1] ERROR:  duplicate key value violates unique constraint "_hyper_1_1931_chunk_da_datum_x_acc_idx"
2020-03-20 19:51:11 NZDT [15165]: [7-1] CONTEXT:  SQL statement "INSERT INTO solardatum.da_datum(ts, node_id, source_id, posted, jdata_i, jdata_a, jdata_s, jdata_t)
                VALUES (ts_crea, node, src, ts_post, jdata_json->'i', jdata_json->'a', jdata_json->'s', solarcommon.json_array_to_text_array(jdata_json->'t'))
                ON CONFLICT (node_id, ts, source_id) DO UPDATE
                SET jdata_i = EXCLUDED.jdata_i,
                        jdata_a = EXCLUDED.jdata_a,
                        jdata_s = EXCLUDED.jdata_s,
                        jdata_t = EXCLUDED.jdata_t,
                        posted = EXCLUDED.posted
                RETURNING (xmax = 0)"
        PL/pgSQL function solardatum.store_datum(timestamp with time zone,bigint,text,timestamp with time zone,text,boolean) line 10 at SQL statement
2020-03-20 19:51:11 NZDT [15165]: [8-1] STATEMENT:  select * from solardatum.store_datum($1, $2, $3, $4, $5) as result

As for the BEFORE triggers, the solardatum.trigger_agg_stale_datum one does an INSERT into a different table and a SELECT from this same table. The _timescaledb_internal.insert_blocker one is part of the TimescaleDB extension, which looks like it wouldn’t have an impact on this issue, but the source of that is

https://github.com/timescale/timescaledb/blob/91fe723d3aaaf88b53ffffebf8adc3e16a68ec45/src/hypertable.c#L1359

— m@

Re: Duplicate key violation on upsert

Adrian Klaver-4
On 3/22/20 2:48 PM, Matt Magoffin wrote:

>
>> On 23/03/2020, at 9:44 AM, Adrian Klaver <[hidden email]
>> <mailto:[hidden email]>> wrote:
>> Is there a chance the BEFORE trigger functions are doing something
>> that could be leading to the error?
>>
>> In the error log is there a line with the actual values that failed?
>
> The error log does not show the literal values, no. Here is a literal
> example from the logs:
>
> 2020-03-20 19:51:11 NZDT [15165]: [6-1] ERROR:  duplicate key value
> violates unique constraint "_hyper_1_1931_chunk_da_datum_x_acc_idx"
> 2020-03-20 19:51:11 NZDT [15165]: [7-1] CONTEXT:  SQL statement "INSERT
> INTO solardatum.da_datum(ts, node_id, source_id, posted, jdata_i,
> jdata_a, jdata_s, jdata_t)
>                  VALUES (ts_crea, node, src, ts_post, jdata_json->'i',
> jdata_json->'a', jdata_json->'s',
> solarcommon.json_array_to_text_array(jdata_json->'t'))
>                  ON CONFLICT (node_id, ts, source_id) DO UPDATE
>                  SET jdata_i = EXCLUDED.jdata_i,
>                          jdata_a = EXCLUDED.jdata_a,
>                          jdata_s = EXCLUDED.jdata_s,
>                          jdata_t = EXCLUDED.jdata_t,
>                          posted = EXCLUDED.posted
>                  RETURNING (xmax = 0)"
>          PL/pgSQL function solardatum.store_datum(timestamp with time
> zone,bigint,text,timestamp with time zone,text,boolean) line 10 at SQL
> statement
> 2020-03-20 19:51:11 NZDT [15165]: [8-1] STATEMENT:  select * from
> solardatum.store_datum($1, $2, $3, $4, $5) as result

So the query is in the function solardatum.store_datum()?

If so what is it doing?

And could you capture the values and pass them to a RAISE NOTICE?

>
> As for the BEFORE triggers, the solardatum.trigger_agg_stale_datum one
> does an INSERT into a different table and a SELECT from this same table.
> The _timescaledb_internal.insert_blocker one is part of the TimescaleDB
> extension which looks like it wouldn’t have an impact to this issue, but
> the source of that is
>
> https://github.com/timescale/timescaledb/blob/91fe723d3aaaf88b53ffffebf8adc3e16a68ec45/src/hypertable.c#L1359
>
> — m@


--
Adrian Klaver
[hidden email]



Re: Duplicate key violation on upsert

Matt Magoffin-3

On 23/03/2020, at 1:10 PM, Adrian Klaver <[hidden email]> wrote:

So the query is in the function solardatum.store_datum()?

If so what is it doing?

Yes. This function first performs the INSERT INTO the solardatum.da_datum table that we’re discussing here; then it inserts into two different tables. If it helps, the actual SQL is available here:

https://github.com/SolarNetwork/solarnetwork-central/blob/4fa585929a5526187ade0e842c809837647c6a41/solarnet-db-setup/postgres/postgres-init-generic-datum-x-functions.sql#L203-L242

And could you capture the values and pass them to a RAISE NOTICE?

It would take me some time to get that change deployed. If I was able to, what information do you think would be helpful here, e.g. that jdata_a is NULL or not, or something else?

The duplicate key violation occurs infrequently, and it does seem appropriate to drop the UNIQUE constraint on the da_datum_x_acc_idx given uniqueness is really only wanted on (node_id, ts, source_id). As long as I can confirm that query performance doesn’t decrease, I’d like to recreate the index without UNIQUE. Then I’m hoping this problem, whatever the cause, goes away.

— m@

Re: Duplicate key violation on upsert

Adrian Klaver-4
On 3/25/20 5:23 PM, Matt Magoffin wrote:

>
>> On 23/03/2020, at 1:10 PM, Adrian Klaver <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>> So the query is in the function solardatum.store_datum()?
>>
>> If so what is it doing?
>
> Yes. This function first performs the INSERT INTO the
> solardatum.da_datum table that we’re discussing here; then it inserts
> into two different tables. If it helps, the actual SQL is available here:
>
> https://github.com/SolarNetwork/solarnetwork-central/blob/4fa585929a5526187ade0e842c809837647c6a41/solarnet-db-setup/postgres/postgres-init-generic-datum-x-functions.sql#L203-L242

I don't see anything wrong at first glance, but it is getting late here. I
will take another look in the morning.

>
>> And could you capture the values and pass them to a RAISE NOTICE?
>
> It would take me some time to get that change deployed. If I was able
> to, what information do you think would be helpful here, e.g. that
> jdata_a is NULL or not, or something else?

The values for (node_id, ts, source_id, jdata_a), as they comprise the
UNIQUE values for da_datum_pkey and da_datum_x_acc_idx.

>
> The duplicate key violation occurs infrequently, and it does seem
> appropriate to drop the UNIQUE constraint on the da_datum_x_acc_idx
> given uniqueness is really only wanted on (node_id, ts, source_id). As
> long as I can confirm that query performance doesn’t decrease, I’d like
> to recreate the index without UNIQUE. Then I’m hoping this problem,
> whatever the cause, goes away.
>
> — m@


--
Adrian Klaver
[hidden email]



Re: Duplicate key violation on upsert

Adrian Klaver-4
In reply to this post by Matt Magoffin-3
On 3/25/20 5:23 PM, Matt Magoffin wrote:

>
>> On 23/03/2020, at 1:10 PM, Adrian Klaver <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>> So the query is in the function solardatum.store_datum()?
>>
>> If so what is it doing?
>
> Yes. This function first performs the INSERT INTO the
> solardatum.da_datum table that we’re discussing here; then it inserts
> into two different tables. If it helps, the actual SQL is available here:
>
> https://github.com/SolarNetwork/solarnetwork-central/blob/4fa585929a5526187ade0e842c809837647c6a41/solarnet-db-setup/postgres/postgres-init-generic-datum-x-functions.sql#L203-L242

Well, morning and coffee helped some, but not enough to offer blinding
insight. Reviewing the function above, the TimescaleDB insert_blocker
function, and the overview of the TimescaleDB hypertable architecture
leads me to believe there is some sort of conflict between the
solarnetwork functions and the TimescaleDB hypertable actions. It is a
wishy-washy answer, as I do not understand the TimescaleDB architecture
well enough. You might want to reach out to the TimescaleDB community to
see if they can offer any further insight.

>
>> And could you capture the values and pass them to a RAISE NOTICE?
>
> It would take me some time to get that change deployed. If I was able
> to, what information do you think would be helpful here, e.g. that
> jdata_a is NULL or not, or something else?
>
> The duplicate key violation occurs infrequently, and it does seem
> appropriate to drop the UNIQUE constraint on the da_datum_x_acc_idx
> given uniqueness is really only wanted on (node_id, ts, source_id). As
> long as I can confirm that query performance doesn’t decrease, I’d like
> to recreate the index without UNIQUE. Then I’m hoping this problem,
> whatever the cause, goes away.
>
> — m@


--
Adrian Klaver
[hidden email]



Re: Duplicate key violation on upsert

Matt Magoffin-3

On 27/03/2020, at 5:26 AM, Adrian Klaver <[hidden email]> wrote:

Well morning and coffee helped some, but not enough to offer blinding insight. Reviewing the function above, the TimescaleDB insert block function and the overview of the TimescaleDB hypertable architecture leads me to believe there is some sort of conflict between the solarnetwork functions and the TimescaleDB hypertable actions. It is a wishy-washy answer as I do not understand the TimescaleDB architecture well enough. You might want to reach to the TimescaleDB community to see if they can offer any further insight.

Fair enough! Thank you for taking the time to look through this issue with me, I really appreciate it. I’ll reach out to the TimescaleDB folks and see what they think.

— m@