Could postgres12 support millions of sequences? (like 10 million)


pabloa98
Hello,

My schema requires a counter for each combination of 2 values. Something like:

CREATE TABLE counter(
"group" INT NOT NULL,   -- "group" is a reserved word, so it must be quoted
element INT NOT NULL,
seq_number INT NOT NULL DEFAULT 0,
PRIMARY KEY ("group", element)
);

For each entry in counter, i.e. for each (group, element) pair, the model requires a seq_number.

If I use a table "counter", I could still have counter collisions between 2 transactions. I need true sequence behavior. Is that possible using a table like "counter", where the counter could be incremented outside the transaction so it performs like a sequence, without race conditions between concurrent transactions?
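
A single-statement upsert on that table would at least serialize writers on the pair's row; a rough sketch (with "group" quoted because it is a reserved word):

INSERT INTO counter ("group", element, seq_number)
VALUES (1, 1, 1)
ON CONFLICT ("group", element)
DO UPDATE SET seq_number = counter.seq_number + 1
RETURNING seq_number;

But the increment is still part of the transaction, so concurrent transactions on the same pair block until the first one commits, which is not quite sequence behavior.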

The other option is to create a sequence for each new (group, element) pair using triggers. There are millions of pairs, so this approach will generate millions of sequences.

How would a PostgreSQL database behave with millions of sequences in a schema? Would it degrade performance? Is there any negative impact?

Regards

Pablo




Re: Could postgres12 support millions of sequences? (like 10 million)

lup


> On Mar 19, 2020, at 3:36 PM, pabloa98 <[hidden email]> wrote:
> [...]

To clarify, are you hoping for consecutive numbers as each row is added to the table, i.e. “serial”?

What is the intention of “seq_number”?




Re: Could postgres12 support millions of sequences? (like 10 million)

pabloa98


On Thu, Mar 19, 2020 at 2:50 PM Rob Sargent <[hidden email]> wrote:


> [...]

> To clarify, are you hoping for consecutive numbers as each row is added to the table, i.e. “serial”?

> What is the intention of “seq_number”?


The idea is to have something like a serial sequence, but one for each (group, element) pair.

So that when we insert rows into another table, we could have something like:


group, element, event_id, ...
1, 1, 1
1, 1, 2
1, 1, 3
2, 1, 1
1, 1, 4
1, 3, 1
1, 1, 5
1, 3, 2
2, 1, 2
2, 1, 3

The 3rd column is the sequence number we get from the appropriate sequence created by the trigger.
 
I want to implement a variation of https://stackoverflow.com/a/30204854 and that will generate millions of sequences.
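
For illustration, consuming the per-pair sequence at insert time could look roughly like this (the table and the seq_<group>_<element> naming convention here are assumptions; the explicit ::regclass cast is needed because the sequence name is built at run time):

-- pick the pair's own sequence by building its name dynamically
INSERT INTO some_table ("group", element, event_id)
VALUES (1, 1, nextval(format('seq_%s_%s', 1, 1)::regclass));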



Re: Could postgres12 support millions of sequences? (like 10 million)

lup


On Mar 19, 2020, at 4:13 PM, pabloa98 <[hidden email]> wrote:
> [...]


Then I don’t think group/element can be a PRIMARY KEY.



Re: Could postgres12 support millions of sequences? (like 10 million)

pabloa98


On Thu, Mar 19, 2020 at 3:17 PM Rob Sargent <[hidden email]> wrote:
> [...]
> Then I don’t think group/element can be a PRIMARY KEY.


Thank you for your answer. Let me explain:

Table "counter" will have triggers that will create a sequence with a new row is inserted called counter_1_1_seq, counter_2_1_seq, counter_1_3_seq, etc that will be used to insert values in **another** table. It will be used for nothing else. When we insert millions of pairs group/element, the trigger in that table will generate millions of sequences.

My question is how PostgreSQL will behave. Could it deal with millions of sequences? What about maintenance operations such as VACUUM?

Pablo

Re: Could postgres12 support millions of sequences? (like 10 million)

David G Johnston
On Thu, Mar 19, 2020 at 3:33 PM pabloa98 <[hidden email]> wrote:
Table "counter" will have triggers that will create a sequence with a new row is inserted called counter_1_1_seq, counter_2_1_seq, counter_1_3_seq, etc that will be used to insert values in **another** table. It will be used for nothing else. When we insert millions of pairs group/element, the trigger in that table will generate millions of sequences. 

My question is how PostgreSQL will behave. Could it deal with millions of sequences? What about system operations as vacuum, etc?


First, it sounds like you care about there being no gaps in the records you end up saving.  If that is the case then sequences will not work for you.

.
.
.

If you are still reading because you can live with some gaps - then does having one sequence per pair really matter?

Regardless, at this scale you probably should setup a performance test as part of your continuous integration/test infrastructure, and let it answer the "which performs better" question.
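
For example, seeding such a test with a large batch of throwaway sequences is cheap to script (the names and counts below are arbitrary placeholders):

DO $$
BEGIN
  FOR i IN 1..100000 LOOP
    EXECUTE format('CREATE SEQUENCE IF NOT EXISTS bench_seq_%s', i);
  END LOOP;
END
$$;

-- then exercise them roughly the way the application would
SELECT nextval(format('bench_seq_%s', (random() * 99999)::int + 1)::regclass)
FROM generate_series(1, 10000);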

However, one other consideration with sequences: do you care that PostgreSQL will cache/pin (i.e., no release) every single sequence you touch for the lifetime of the session? (I do not think DISCARD matters here but I'm just guessing)

A big determinant would seem to be how your sessions would go about using the sequences. You've described the model requirement but haven't described the process by which the model will be used; without that information, useful comments pertaining to alternative implementations are difficult to formulate.

David J.


Re: Could postgres12 support millions of sequences? (like 10 million)

Adrian Klaver-4
In reply to this post by pabloa98
On 3/19/20 3:32 PM, pabloa98 wrote:

> [...]
> Thank you for your answer. Let me explain:
>
> Table "counter" will have a trigger that creates a sequence when a new
> row is inserted, named counter_1_1_seq, counter_2_1_seq,
> counter_1_3_seq, etc. Each sequence will be used to insert values in
> **another** table and for nothing else. When we insert millions of
> group/element pairs, the trigger on that table will generate millions of
> sequences.

This has train wreck written all over it. If you are going to have a
trigger for each combination of (group, element) I gotta believe the
table will fall over before you run into sequence issues. Not sure why
there can't be a serial column that supplies the sequence numbers and
therefore only one sequence in play. From what I see all you care about
is that:

group, element, event_id

has an increasing event_id for (group, element) pairs.
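
Roughly (column names assumed), one bigserial backed by a single sequence gives every pair an always-increasing id, and per-pair order is just a sort:

CREATE TABLE event(
group_id INT NOT NULL,   -- group is a reserved word
element INT NOT NULL,
event_id BIGSERIAL,      -- one global sequence behind every pair
PRIMARY KEY (group_id, element, event_id)
);

SELECT group_id, element, event_id
FROM event
WHERE group_id = 1 AND element = 1
ORDER BY event_id;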




--
Adrian Klaver
[hidden email]



Re: Could postgres12 support millions of sequences? (like 10 million)

pabloa98


On Thu, Mar 19, 2020 at 5:13 PM Adrian Klaver <[hidden email]> wrote:
> [...]

> This has train wreck written all over it. If you are going to have a
> trigger for each combination of (group, element) I gotta believe the
> table will fall over before you run into sequence issues. Not sure why
> there can't be a serial column that supplies the sequence numbers and
> therefore only one sequence in play. From what I see all you care about
> is that:
>
> group, element, event_id
>
> has an increasing event_id for (group, element) pairs.



I need a separate sequence per pair because it is a business requirement.
Please forget about the first example; I was not clear. It will be something like this:

I have a table called "pair":

CREATE TABLE pair(
"group" INT NOT NULL,   -- "group" is a reserved word, so it must be quoted
element INT NOT NULL,
PRIMARY KEY ("group", element)
);

I must add entries to the table "event". This table will have a "code" column whose value is generated from a sequence that is a function of (group, element).
CREATE TABLE event(
"group" INT NOT NULL,
element INT NOT NULL,
code INT NOT NULL,
PRIMARY KEY (code, element, "group")
);

I could make the code column SERIAL and be done with it. HOWEVER, having "code" start from 1 for each (group, element) pair is a business requirement. The business domain dictates that for each (group, element) pair the codes run from 1 to 99999999. It is OK to have gaps in the sequence.

So this table "event" will contain entries like:

     1, 1, 1
     1, 1, 2
     1, 1, 3
     2, 1, 1
     1, 1, 4
     1, 3, 1
     1, 1, 5
     1, 3, 2
     2, 1, 2
     2, 1, 3

These triplets will be inserted by several processes, so several concurrent transactions can happen at the same time. Because of that concurrency, a trivial calculation of the "code" column (something like SELECT MAX(code) + 1 FROM event WHERE "group" = 1 AND element = 1;) can generate collisions, because two transactions could come up with the same value of "code" at the same time.

So I need something like a sequence for each (group, element) pair.

So I added a trigger to the table "pair" to create/drop these sequences each time a row is inserted or deleted.

Something like:

    CREATE FUNCTION make_pair_sequence() RETURNS trigger
      LANGUAGE plpgsql
    AS $make_pair_sequence$
      BEGIN
        IF (TG_OP = 'INSERT' OR TG_OP = 'UPDATE') THEN
          EXECUTE format('
            CREATE SEQUENCE IF NOT EXISTS code_%s_%s_seq
              INCREMENT 1
              START 1
              CACHE 1
              MINVALUE 1
              MAXVALUE 99999999  -- 8 digits
          ', NEW."group", NEW.element);
        ELSIF (TG_OP = 'DELETE') THEN
          -- NEW is not defined for DELETE triggers; the deleted key is in OLD
          EXECUTE format('
            DROP SEQUENCE IF EXISTS code_%s_%s_seq
          ', OLD."group", OLD.element);
        -- ELSIF (TG_OP = 'TRUNCATE') THEN ...
        END IF;
        RETURN NULL;  -- the return value is ignored for an AFTER trigger
      END
    $make_pair_sequence$;

    DROP TRIGGER IF EXISTS make_pair_sequence
      ON pair CASCADE;
    CREATE TRIGGER make_pair_sequence
      AFTER INSERT OR UPDATE OR DELETE ON pair
      FOR EACH ROW EXECUTE FUNCTION make_pair_sequence();

Each time I insert a pair with something like:

INSERT INTO pair("group", element) VALUES (1, 1);   -- trigger creates sequence code_1_1_seq
INSERT INTO pair("group", element) VALUES (1, 3);   -- trigger creates sequence code_1_3_seq
INSERT INTO pair("group", element) VALUES (2, 1);   -- trigger creates sequence code_2_1_seq
 
Later I will insert values into the "event" table with values like:

INSERT INTO event("group", element, code) VALUES (1, 1, nextval('code_1_1_seq'));
INSERT INTO event("group", element, code) VALUES (1, 3, nextval('code_1_3_seq'));
INSERT INTO event("group", element, code) VALUES (2, 1, nextval('code_2_1_seq'));
etc

Since there are millions of possible pairs, the database will end up with millions of sequences. The generation of the "code" value in table "event" will have no race conditions, and everything will work from the business point of view.

Now the questions are:

- Could PostgreSQL deal with millions of sequences? What will happen with maintenance commands (like VACUUM)? Will they suffer performance issues?
- Is there a better method of having independent, transaction-safe code counters without locks?

Sorry about the first email. I did not explain clearly what the problem was.





Re: Could postgres12 support millions of sequences? (like 10 million)

lup


On Mar 19, 2020, at 6:45 PM, pabloa98 <[hidden email]> wrote:
> [...]

Unless the event table is searched by code more often than by group, you probably want to maintain the key order from the pair table's primary key.

If gaps are OK, do you still need multiple sequences?



Re: Could postgres12 support millions of sequences? (like 10 million)

pabloa98


On Thu, Mar 19, 2020 at 6:16 PM Rob Sargent <[hidden email]> wrote:
> [...]

> Unless the event table is searched by code more often than by group, you probably want to maintain the key order from the pair table's primary key.

OK. I will do that.

> If gaps are OK, do you still need multiple sequences?

I need each "code" value to start from 1 for each (group, element) pair. This is because we only have 99999999 codes per pair, and we do not want to waste numbers. A gap is OK, but not a gap of millions of numbers.



Re: Could postgres12 support millions of sequences? (like 10 million)

Michael Lewis
In reply to this post by David G Johnston


On Thu, Mar 19, 2020, 5:48 PM David G. Johnston <[hidden email]> wrote:
> However, one other consideration with sequences: do you care that PostgreSQL will cache/pin (i.e., no release) every single sequence you touch for the lifetime of the session? (I do not think DISCARD matters here but I'm just guessing)


Would you expand on this point or is there someplace specific in the documentation on this?

Re: Could postgres12 support millions of sequences? (like 10 million)

Tom Lane-2
Michael Lewis <[hidden email]> writes:
> On Thu, Mar 19, 2020, 5:48 PM David G. Johnston <[hidden email]>
> wrote:
>> However, one other consideration with sequences: do you care that
>> PostgreSQL will cache/pin (i.e., no release) every single sequence you
>> touch for the lifetime of the session? (I do not think DISCARD matters here
>> but I'm just guessing)

> Would you expand on this point or is there someplace specific in the
> documentation on this?

I think what David is worried about is that a sequence object is a
one-row table in PG's implementation.  Thus

(1) each sequence requires a dozen or two rows in assorted system
catalogs (not sure exactly how many offhand).

(2) each sequence uses up 8KB on disk for its table file.

(3) each sequence you actually access within a particular session
results in creation of relcache and catcache entries in that
session's local memory.  I'm not sure offhand how big those are
either, but a few KB per sequence would be a reasonable guess.

(4) each sequence competes for space in the shared-buffer arena,
since its 8K block has to be swapped into there whenever you try
to access/increment the sequence's value.

This is all mighty inefficient of course, and there's been talk
of trying to reduce the per-sequence overhead; but I don't know
of anyone actively working on that.  As things stand, I think
having millions of sequences would be quite painful performance-
wise, especially if your workload were actively hitting a lot
of them concurrently.  It would work, for some value of "work",
but it wouldn't perform very well.
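
For what it's worth, that footprint is easy to inspect: sequences are ordinary relations with relkind = 'S' in pg_class, so a query along these lines shows how many a database holds and how much disk their 8K files occupy:

SELECT count(*) AS sequence_count,
       pg_size_pretty(sum(pg_total_relation_size(oid))) AS total_on_disk
FROM pg_class
WHERE relkind = 'S';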

Also, as multiple people mentioned already, this still wouldn't
guarantee gap-free sequences of ID values.

                        regards, tom lane



Re: Could postgres12 support millions of sequences? (like 10 million)

Adrian Klaver-4
In reply to this post by Michael Lewis
On 3/19/20 7:38 PM, Michael Lewis wrote:

> [...]
>
> Would you expand on this point or is there someplace specific in the
> documentation on this?

See the section starting here:

https://www.postgresql.org/docs/12/sql-createsequence.html

Notes

"Unexpected results might be obtained if a cache setting greater than
one is used for a sequence object that will be used concurrently by
multiple sessions. Each session will allocate and cache successive
sequence values during one access to the sequence object and increase
the sequence object's last_value accordingly. Then, the next cache-1
uses of nextval within that session simply return the preallocated
values without touching the sequence object. So, any numbers allocated
but not used within a session will be lost when that session ends,
resulting in “holes” in the sequence.

...
"


--
Adrian Klaver
[hidden email]



Re: Could postgres12 support millions of sequences? (like 10 million)

pabloa98


On Thu, Mar 19, 2020 at 9:12 PM Adrian Klaver <[hidden email]> wrote:
> [...]

We will use CACHE 1. This is because when nextval('seq') is invoked we are hitting 3 or 4 other tables, so the sequence will not be the performance blocker (compared with all the operations in the transaction).



Re: Could postgres12 support millions of sequences? (like 10 million)

pabloa98
In reply to this post by Tom Lane-2
I see. 

Any suggestions? It should behave like a sequence in the sense that concurrent transactions will get different numbers from this alternative sequence-like solution.

In our case, we will need to call nextval('some_seq') (or similar) from different processes no more than twice every minute.


It would be nice to have a sequence data type. Correct me if I am wrong, but it seems to me that a sequence data type would cost the same or less than the current sequence implementation.

The schema would be clearer too. We could have a table like:

CREATE TABLE pair(
"group" INT NOT NULL,
element INT NOT NULL,
seq SEQUENCE INCREMENT 1
             START 1
             CACHE 1
             MINVALUE 1
             MAXVALUE 99999999
      NOT NULL,
PRIMARY KEY ("group", element)
);

And then:

INSERT INTO event("group", element, code)
VALUES (
         1,
         1,
         ( SELECT seq.nextval('code_1_1_seq') FROM pair p WHERE p."group" = 1 AND p.element = 1 )
);

Or perhaps storing all the sequences in the same table as rows will have the same behavior.

Pablo



Re: Could postgres12 support millions of sequences? (like 10 million)

David G Johnston
In reply to this post by Tom Lane-2
On Thursday, March 19, 2020, Tom Lane <[hidden email]> wrote:
> [...]
>
> I think what David is worried about is that a sequence object is a
> one-row table in PG's implementation.  Thus
>
> (1) each sequence requires a dozen or two rows in assorted system
> catalogs (not sure exactly how many offhand).
Actually I seem to have missed that dynamic. I was referring to the SeqTable hash table specified here:


I wouldn’t think there would be much spreading of data throughout the catalog if the sequences are unowned (by tables).

David J.


Re: Could postgres12 support millions of sequences? (like 10 million)

David G Johnston
In reply to this post by pabloa98
On Thursday, March 19, 2020, pabloa98 <[hidden email]> wrote:

> We will use CACHE 1. This is because when nextval('seq') is invoked we are hitting 3 or 4 other tables, so the sequence will not be the performance blocker (compared with all the operations in the transaction).

The other implementation detail to remember is:

“Because nextval and setval calls are never rolled back, sequence objects cannot be used if gapless assignment of sequence numbers is needed.”

David J.

 

Re: Could postgres12 support millions of sequences? (like 10 million)

rob stone-2
In reply to this post by pabloa98
Hello,

On Thu, 2020-03-19 at 14:36 -0700, pabloa98 wrote:

> [...]


Have you tried the following:

BEGIN;
SELECT (seq_number + 1) FROM counter WHERE "group" = $1
AND element = $2 FOR UPDATE;

If that finds no row (NOT FOUND), then you need to insert into table counter with seq_number = 1; otherwise update it with (seq_number + 1). Then do the insert into whatever table needs the seq_number value.

If no errors are detected, then COMMIT, else ROLLBACK.
It is all in one transaction block.
We use this approach to assign invoice numbers.

You could pre-seed table counter when a group/element pair is created
for the first time.
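
As a sketch, that flow can be wrapped in a plpgsql function (the function name is an assumption, and the counter table is the one from the first post with "group" quoted); the FOR UPDATE row lock makes concurrent callers for the same pair queue up:

CREATE OR REPLACE FUNCTION next_seq_number(p_group INT, p_element INT)
RETURNS INT
LANGUAGE plpgsql
AS $$
DECLARE
  v_next INT;
BEGIN
  -- lock the pair's counter row so concurrent callers serialize
  SELECT seq_number + 1 INTO v_next
  FROM counter
  WHERE "group" = p_group AND element = p_element
  FOR UPDATE;

  IF NOT FOUND THEN
    -- first use of this pair; a concurrent first insert is resolved by the PK
    INSERT INTO counter ("group", element, seq_number)
    VALUES (p_group, p_element, 1)
    ON CONFLICT ("group", element)
    DO UPDATE SET seq_number = counter.seq_number + 1
    RETURNING seq_number INTO v_next;
  ELSE
    UPDATE counter
    SET seq_number = v_next
    WHERE "group" = p_group AND element = p_element;
  END IF;

  RETURN v_next;
END;
$$;

It could then supply the value for "code" at insert time, e.g. next_seq_number(1, 1).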

HTH,
Robert





Re: Could postgres12 support millions of sequences? (like 10 million)

Adrian Klaver-4
In reply to this post by pabloa98
On 3/19/20 10:31 PM, pabloa98 wrote:

> [...]

If code is just something to show the order of insertion for (group, element) combinations, then maybe something like below:

CREATE TABLE event(
group_id INT NOT NULL, --changed as group is reserved word
element INT NOT NULL,
insert_ts timestamptz NOT NULL DEFAULT clock_timestamp(),
PRIMARY KEY(group_id, element, insert_ts)
);


insert into event(group_id, element) VALUES
(1, 1),
(1, 1),
(1, 1),
(2, 1),
(1, 1),
(1, 3),
(1, 1),
(1, 3),
(2, 1),
(2, 1);


select * from event ;
  group_id | element |           insert_ts
----------+---------+--------------------------------
         1 |       1 | 03/20/2020 09:51:12.675926 PDT
         1 |       1 | 03/20/2020 09:51:12.675985 PDT
         1 |       1 | 03/20/2020 09:51:12.675991 PDT
         2 |       1 | 03/20/2020 09:51:12.675996 PDT
         1 |       1 | 03/20/2020 09:51:12.676 PDT
         1 |       3 | 03/20/2020 09:51:12.676004 PDT
         1 |       1 | 03/20/2020 09:51:12.676008 PDT
         1 |       3 | 03/20/2020 09:51:12.676012 PDT
         2 |       1 | 03/20/2020 09:51:12.676016 PDT
         2 |       1 | 03/20/2020 09:51:12.67602 PDT
(10 rows)


select group_id, element, row_number() OVER (partition by group_id,
element order by insert_ts) AS code from event;
  group_id | element | code
----------+---------+------
         1 |       1 |    1
         1 |       1 |    2
         1 |       1 |    3
         1 |       1 |    4
         1 |       1 |    5
         1 |       3 |    1
         1 |       3 |    2
         2 |       1 |    1
         2 |       1 |    2
         2 |       1 |    3
(10 rows)



--
Adrian Klaver
[hidden email]

