xmin and very high number of concurrent transactions

Vijaykumar Jain
I was asked this question in one of my demos, and it was an interesting one.

We update xmin for new inserts with the current txid.
Now, in a highly concurrent scenario where more than 2000
concurrent users are trying to insert new data,
will updating the xmin value be a bottleneck?

I know we should use pooling solutions to reduce concurrent
connections, but assume we have enough resources to take care of
spawning a new process for each new connection.

Regards,
Vijay

Re: xmin and very high number of concurrent transactions

Adrian Klaver-4
On 3/12/19 12:19 PM, Vijaykumar Jain wrote:
> I was asked this question in one of my demos, and it was interesting one.
>
> we update xmin for new inserts with the current txid.

Why?

--
Adrian Klaver
[hidden email]

Re: [External] Re: xmin and very high number of concurrent transactions

Vijaykumar Jain
No, I mean not we end users; Postgres does it (?) via the xmin and xmax
fields from inherited tables :) if that is what you meant by "why".
Or are you asking whether Postgres even updates those rows, and I am wrong
to assume it does?

Since the values need to be atomic,
consider the analogy below.
Assume I (Postgres) am a person giving out tokens to
people (connections/transactions) in a queue.
If there is a single line (sequential), it is easy for me to
simply give each person one token, incrementing the value each time.
But if there are thousands of users in parallel lines and I am the only
person handing out tokens, I still operate sequentially, and each other
person is "blocked" for some time before getting the token with the
required value.
So with thousands of users, could that delay impact my
performance, because I need to maintain the value of the token to
know what value to give to the next person?

I do not know if I am explaining it correctly; pardon my analogy.


Regards,
Vijay

Re: [External] Re: xmin and very high number of concurrent transactions

Adrian Klaver-4
On 3/12/19 1:02 PM, Vijaykumar Jain wrote:
> no i mean not we end users, postgres does it (?) via the xmin and xmax
> fields  from inherited tables :) if that is what you wanted in a why
> or are you asking, does postgres even update those rows and i am wrong
> assuming it that way?

Not sure where the inherited tables come in?

See below for more info:
https://www.postgresql.org/docs/11/storage-page-layout.html

AFAIK xmin and xmax are just set as part of the insert or delete
operations, so there is no separate updating involved.

I would say the impact on performance would come from the overhead of
each connection rather than from maintaining xmin/xmax.

--
Adrian Klaver
[hidden email]

Re: xmin and very high number of concurrent transactions

reg_pg_stefanz
In reply to this post by Vijaykumar Jain
I may have misunderstood the documentation or your question, but my
understanding is that xmin is not updated; it is only set on insert
(and yes, also on update, but for Postgres an update is also an insert,
as updates are executed as delete/insert).

From https://www.postgresql.org/docs/10/ddl-system-columns.html
 > xmin
 > The identity (transaction ID) of the inserting transaction for this
 > row version. (A row version is an individual state of a row; each update
 > of a row creates a new row version for the same logical row.)

Therefore I assume there are no actual updates of xmin values.
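
To make that concrete, here is a toy Python model of row versions. It is only a sketch of the idea in the documentation quoted above, not PostgreSQL code; the names `RowVersion` and `ToyTable` and the txid values are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowVersion:
    value: str
    xmin: int                    # txid of the transaction that created this version
    xmax: Optional[int] = None   # txid that deleted or superseded it, if any


class ToyTable:
    """Hypothetical append-only heap holding every version of a logical row."""

    def __init__(self) -> None:
        self.versions: List[RowVersion] = []

    def insert(self, value: str, txid: int) -> RowVersion:
        # xmin is written once, when the version is created.
        version = RowVersion(value=value, xmin=txid)
        self.versions.append(version)
        return version

    def update(self, old: RowVersion, value: str, txid: int) -> RowVersion:
        # The old version is only marked via xmax; its xmin stays untouched.
        old.xmax = txid
        # The new value lands in a brand-new version with its own xmin.
        return self.insert(value, txid)


table = ToyTable()
v1 = table.insert("a", txid=100)
v2 = table.update(v1, "b", txid=101)
print([(v.value, v.xmin, v.xmax) for v in table.versions])
# prints [('a', 100, 101), ('b', 101, None)]
```

In this model the xmin of an existing version is never rewritten; an update adds a second version instead.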

Stefan

Re: xmin and very high number of concurrent transactions

Laurenz Albe
In reply to this post by Vijaykumar Jain
Vijaykumar Jain wrote:

> I was asked this question in one of my demos, and it was interesting one.
>
> we update xmin for new inserts with the current txid.
> now in a very high concurrent scenario where there are more than 2000
> concurrent users trying to insert new data,
> will updating xmin value be a bottleneck?
>
> i know we should use pooling solutions to reduce concurrent
> connections but given we have enough resources to take care of
> spawning a new process for a new connection,

You can read the function GetNewTransactionId in
src/backend/access/transam/varsup.c for details.

Transaction ID creation is serialized with a "light-weight lock",
so it could potentially be a bottleneck.

Often that is dwarfed by the I/O requirements from many concurrent
commits, but if most of your transactions are rolled back or you
use "synchronous_commit = off", I can imagine that it could matter.

It is not a matter of how many clients there are, but of how
often a new writing transaction is started.
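
As a rough illustration of both points (serialized ID creation, and cost tied to writing transactions rather than to clients), here is a small Python sketch. It is not the logic in varsup.c; `ToyXidAllocator` and `ToyTransaction` are hypothetical names, and an ordinary mutex stands in for the light-weight lock. The sketch assumes an ID is only requested on a transaction's first write.

```python
import threading
from typing import Optional


class ToyXidAllocator:
    """Hypothetical shared counter; one lock serializes all ID creation."""

    def __init__(self, first_xid: int = 1) -> None:
        self._lock = threading.Lock()
        self._next_xid = first_xid

    def allocate(self) -> int:
        with self._lock:  # only one caller at a time gets past this point
            xid = self._next_xid
            self._next_xid += 1
            return xid


class ToyTransaction:
    def __init__(self, allocator: ToyXidAllocator) -> None:
        self._allocator = allocator
        self.xid: Optional[int] = None

    def read(self) -> None:
        pass  # reading never asks the allocator for an ID

    def write(self) -> None:
        if self.xid is None:  # the ID is requested once, on the first write
            self.xid = self._allocator.allocate()


def run_client(txn: ToyTransaction, writes: bool) -> None:
    txn.read()
    if writes:
        txn.write()
        txn.write()  # a second write reuses the same ID


allocator = ToyXidAllocator()
transactions = [ToyTransaction(allocator) for _ in range(200)]
threads = [
    threading.Thread(target=run_client, args=(txn, i % 10 == 0))
    for i, txn in enumerate(transactions)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

xids = sorted(txn.xid for txn in transactions if txn.xid is not None)
print(len(xids), xids == list(range(1, 21)))
# prints 20 True
```

With 200 clients of which only 20 write, the lock is taken 20 times, so contention follows the rate of new writing transactions.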

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com


Re: xmin and very high number of concurrent transactions

Julien Rouhaud
On Wed, Mar 13, 2019 at 9:50 AM Laurenz Albe <[hidden email]> wrote:

> You can read the function GetNewTransactionId in
> src/backend/access/transam/varsup.c for details.
>
> Transaction ID creation is serialized with a "light-weight lock",
> so it could potentially be a bottleneck.

Also I think that GetSnapshotData() would be the major bottleneck way
before GetNewTransactionId() becomes problematic.  Especially with
such a high number of active backends.
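
A simplified Python sketch of why the number of backends matters here: taking a snapshot means looking at the state of every backend, so its cost grows with the connection count even when most backends are idle. This is only a model under that assumption, not the real GetSnapshotData(); `ToySnapshot` and `take_snapshot` are made-up names.

```python
from typing import List, NamedTuple, Optional


class ToySnapshot(NamedTuple):
    xmin: int            # oldest transaction ID still running
    xmax: int            # first transaction ID not yet assigned
    running: List[int]   # IDs in progress when the snapshot was taken


def take_snapshot(backend_xids: List[Optional[int]], next_xid: int) -> ToySnapshot:
    # One slot per backend; None means that backend holds no transaction ID.
    # Every slot is visited, so the work is proportional to the backend count.
    running = sorted(xid for xid in backend_xids if xid is not None)
    xmin = running[0] if running else next_xid
    return ToySnapshot(xmin=xmin, xmax=next_xid, running=running)


backends: List[Optional[int]] = [None] * 2000
backends[3] = 105
backends[1500] = 103
snapshot = take_snapshot(backends, next_xid=110)
print(snapshot.xmin, snapshot.xmax, snapshot.running)
# prints 103 110 [103, 105]
```

Only two backends hold an ID in this example, yet all 2000 slots are scanned for each snapshot.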

Re: [External] Re: xmin and very high number of concurrent transactions

Vijaykumar Jain
Thank you everyone for responding.
Appreciate your help.

Looks like I need to understand the concepts in a little more detail to be able to ask the right questions, but at least now I can look at the relevant docs.


Regards,
Vijay