Improving connection scalability: GetSnapshotData()

Re: Improving connection scalability: GetSnapshotData()

Ranier Vilela-2
On Fri, Jul 24, 2020 at 14:16, Andres Freund <[hidden email]> wrote:
> On 2020-07-24 14:05:04 -0300, Ranier Vilela wrote:
> > Latest Postgres
> > Windows 64 bits
> > msvc 2019 64 bits
> >
> > Patches applied v12-0001 to v12-0007:
> >
> >  C:\dll\postgres\contrib\pgstattuple\pgstatapprox.c(74,28): warning C4013:
> > 'GetOldestXmin' undefined; assuming extern returning int
> > [C:\dll\postgres
> > C:\dll\postgres\contrib\pg_visibility\pg_visibility.c(569,29): warning
> > C4013: 'GetOldestXmin' undefined; assuming extern returning int
> > [C:\dll\postgres\pg_visibility.vcxproj]
> >  C:\dll\postgres\contrib\pgstattuple\pgstatapprox.c(74,56): error C2065:
> > 'PROCARRAY_FLAGS_VACUUM': undeclared identifier
> > [C:\dll\postgres\pgstattuple.vcxproj]
> >   C:\dll\postgres\contrib\pg_visibility\pg_visibility.c(569,58): error
> > C2065: 'PROCARRAY_FLAGS_VACUUM': undeclared identifier
> > [C:\dll\postgres\pg_visibility.vcxproj]
> >   C:\dll\postgres\contrib\pg_visibility\pg_visibility.c(686,70): error
> > C2065: 'PROCARRAY_FLAGS_VACUUM': undeclared identifier
> > [C:\dll\postgres\pg_visibility.vcxproj]
>
> I don't know what that's about - there's no call to GetOldestXmin() in
> pgstatapprox and pg_visibility after patch 0002? And similarly, the
> PROCARRAY_* references are also removed in the same patch?

Maybe they need to be removed from these places, no?
C:\dll\postgres\contrib>grep -d GetOldestXmin *.c
File pgstattuple\pgstatapprox.c:
        OldestXmin = GetOldestXmin(rel, PROCARRAY_FLAGS_VACUUM);
File pg_visibility\pg_visibility.c:
                OldestXmin = GetOldestXmin(NULL, PROCARRAY_FLAGS_VACUUM);
                                 * deadlocks, because surely GetOldestXmin() should never take
                                RecomputedOldestXmin = GetOldestXmin(NULL, PROCARRAY_FLAGS_VACUUM);

regards,
Ranier Vilela

Re: Improving connection scalability: GetSnapshotData()

Andres Freund
On 2020-07-24 18:15:15 -0300, Ranier Vilela wrote:

> On Fri, Jul 24, 2020 at 14:16, Andres Freund <[hidden email]> wrote:
>
> > On 2020-07-24 14:05:04 -0300, Ranier Vilela wrote:
> > > Latest Postgres
> > > Windows 64 bits
> > > msvc 2019 64 bits
> > >
> > > Patches applied v12-0001 to v12-0007:
> > >
> > >  C:\dll\postgres\contrib\pgstattuple\pgstatapprox.c(74,28): warning C4013:
> > > 'GetOldestXmin' undefined; assuming extern returning int
> > > [C:\dll\postgres
> > > C:\dll\postgres\contrib\pg_visibility\pg_visibility.c(569,29): warning
> > > C4013: 'GetOldestXmin' undefined; assuming extern returning int
> > > [C:\dll\postgres\pg_visibility.vcxproj]
> > >  C:\dll\postgres\contrib\pgstattuple\pgstatapprox.c(74,56): error C2065:
> > > 'PROCARRAY_FLAGS_VACUUM': undeclared identifier
> > > [C:\dll\postgres\pgstattuple.vcxproj]
> > >   C:\dll\postgres\contrib\pg_visibility\pg_visibility.c(569,58): error
> > > C2065: 'PROCARRAY_FLAGS_VACUUM': undeclared identifier
> > > [C:\dll\postgres\pg_visibility.vcxproj]
> > >   C:\dll\postgres\contrib\pg_visibility\pg_visibility.c(686,70): error
> > > C2065: 'PROCARRAY_FLAGS_VACUUM': undeclared identifier
> > > [C:\dll\postgres\pg_visibility.vcxproj]
> >
> > I don't know what that's about - there's no call to GetOldestXmin() in
> > pgstatapprox and pg_visibility after patch 0002? And similarly, the
> > PROCARRAY_* references are also removed in the same patch?
> >
> Maybe they need to be removed from these places, no?
> C:\dll\postgres\contrib>grep -d GetOldestXmin *.c
> File pgstattuple\pgstatapprox.c:
>         OldestXmin = GetOldestXmin(rel, PROCARRAY_FLAGS_VACUUM);
> File pg_visibility\pg_visibility.c:
>                 OldestXmin = GetOldestXmin(NULL, PROCARRAY_FLAGS_VACUUM);
>                                  * deadlocks, because surely
> GetOldestXmin() should never take
>                                 RecomputedOldestXmin = GetOldestXmin(NULL,
> PROCARRAY_FLAGS_VACUUM);

The 0002 patch changed those files:

diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 68d580ed1e0..37206c50a21 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -563,17 +563,14 @@ collect_corrupt_items(Oid relid, bool all_visible, bool all_frozen)
        BufferAccessStrategy bstrategy = GetAccessStrategy(BAS_BULKREAD);
        TransactionId OldestXmin = InvalidTransactionId;
 
-       if (all_visible)
-       {
-               /* Don't pass rel; that will fail in recovery. */
-               OldestXmin = GetOldestXmin(NULL, PROCARRAY_FLAGS_VACUUM);
-       }
-
        rel = relation_open(relid, AccessShareLock);
 
        /* Only some relkinds have a visibility map */
        check_relation_relkind(rel);
 
+       if (all_visible)
+               OldestXmin = GetOldestNonRemovableTransactionId(rel);
+
        nblocks = RelationGetNumberOfBlocks(rel);
 
        /*
@@ -679,11 +676,12 @@ collect_corrupt_items(Oid relid, bool all_visible, bool all_frozen)
                                 * From a concurrency point of view, it sort of sucks to
                                 * retake ProcArrayLock here while we're holding the buffer
                                 * exclusively locked, but it should be safe against
-                                * deadlocks, because surely GetOldestXmin() should never take
-                                * a buffer lock. And this shouldn't happen often, so it's
-                                * worth being careful so as to avoid false positives.
+                                * deadlocks, because surely GetOldestNonRemovableTransactionId()
+                                * should never take a buffer lock. And this shouldn't happen
+                                * often, so it's worth being careful so as to avoid false
+                                * positives.
                                 */
-                               RecomputedOldestXmin = GetOldestXmin(NULL, PROCARRAY_FLAGS_VACUUM);
+                               RecomputedOldestXmin = GetOldestNonRemovableTransactionId(rel);
 
                                if (!TransactionIdPrecedes(OldestXmin, RecomputedOldestXmin))
                                        record_corrupt_item(items, &tuple.t_self);

diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c
index dbc0fa11f61..3a99333d443 100644
--- a/contrib/pgstattuple/pgstatapprox.c
+++ b/contrib/pgstattuple/pgstatapprox.c
@@ -71,7 +71,7 @@ statapprox_heap(Relation rel, output_type *stat)
        BufferAccessStrategy bstrategy;
        TransactionId OldestXmin;
 
-       OldestXmin = GetOldestXmin(rel, PROCARRAY_FLAGS_VACUUM);
+       OldestXmin = GetOldestNonRemovableTransactionId(rel);
        bstrategy = GetAccessStrategy(BAS_BULKREAD);
 
        nblocks = RelationGetNumberOfBlocks(rel);


Greetings,

Andres Freund



Re: Improving connection scalability: GetSnapshotData()

Ranier Vilela-2
On Fri, Jul 24, 2020 at 21:00, Andres Freund <[hidden email]> wrote:
> [...]
> The 0002 patch changed those files:
>
> diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
> [...]
> diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c
> [...]

Obviously, the v12-0002-snapshot-scalability-Don-t-compute-global-horizo.patch patch needs to be rebased.
https://github.com/postgres/postgres/blob/master/contrib/pg_visibility/pg_visibility.c

1:
if (all_visible)
{
/* Don't pass rel; that will fail in recovery. */
OldestXmin = GetOldestXmin(NULL, PROCARRAY_FLAGS_VACUUM);
}
This is on line 566 in the current git master, while in the patch it is on line 563.

2:
* deadlocks, because surely GetOldestXmin() should never take
* a buffer lock. And this shouldn't happen often, so it's
* worth being careful so as to avoid false positives.
*/
This is currently on line 682, while in the patch it is on line 679.
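
A note on reading those numbers, offered as one possible explanation and not something confirmed in this thread: a unified-diff hunk header such as `@@ -563,17 +563,14 @@` gives the old-file line where the hunk's leading context begins, not the line of the first change. The sketch below counts forward from the header to a given hunk line. The helper names `hunk_old_start` and `old_line_number` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Old-file start line from a unified-diff hunk header, or -1 if unparsable. */
static long
hunk_old_start(const char *header)
{
    long        start;

    if (sscanf(header, "@@ -%ld", &start) != 1)
        return -1;
    return start;
}

/*
 * Old-file line number of body[index] (0-based) within a hunk.
 * Lines beginning with '+' exist only in the new file, so they do not
 * advance the old-file line counter.
 */
static long
old_line_number(const char *header, const char *const body[], size_t index)
{
    long        line = hunk_old_start(header);

    assert(body != NULL);
    if (line < 0)
        return -1;
    for (size_t i = 0; i < index; i++)
    {
        if (body[i][0] != '+')
            line++;
    }
    return line;
}
```

In the first pg_visibility.c hunk, three context lines precede the removed `if (all_visible)`, so its old-file line is 563 + 3 = 566. In the second hunk, three comment lines precede the removed "deadlocks" line, giving 679 + 3 = 682. Under this reading, both values equal the line numbers reported for the current tree.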

regards,
Ranier Vilela

Re: Improving connection scalability: GetSnapshotData()

Thomas Munro-5
In reply to this post by Andres Freund
On Fri, Jul 24, 2020 at 1:11 PM Andres Freund <[hidden email]> wrote:
> On 2020-07-15 21:33:06 -0400, Alvaro Herrera wrote:
> > On 2020-Jul-15, Andres Freund wrote:
> > > It could make sense to split the conversion of
> > > VariableCacheData->latestCompletedXid to FullTransactionId out from 0001
> > > into its own commit. Not sure...
> >
> > +1, the commit is large enough and that change can be had in advance.
>
> I've done that in the attached.

+     * pair with the memory barrier below.  We do however accept xid to be <=
+     * to next_xid, instead of just <, as xid could be from the procarray,
+     * before we see the updated nextFullXid value.

Tricky.  Right, that makes sense.  I like the range assertion.

+static inline FullTransactionId
+FullXidViaRelative(FullTransactionId rel, TransactionId xid)

I'm struggling to find a better word for this than "relative".

+    return FullTransactionIdFromU64(U64FromFullTransactionId(rel)
+                                    + (int32) (xid - rel_xid));

I like your branch-free code for this.
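
The branch-free widening in the quoted `return` statement can be sketched in isolation. This is a minimal illustration, not the patch's code: `full_xid_t`, `xid_t`, and `full_xid_relative_to` are hypothetical stand-ins that use plain integers instead of PostgreSQL's `FullTransactionId` struct, and it assumes the 32-bit xid lies within 2^31 of the 64-bit reference value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-ins for illustration only: a 64-bit "full" xid is (epoch << 32) | xid.
 * These are not PostgreSQL's FullTransactionId / TransactionId definitions.
 */
typedef uint64_t full_xid_t;
typedef uint32_t xid_t;

/*
 * Widen a 32-bit xid to 64 bits using a 64-bit reference value `rel`
 * that is assumed to be within 2^31 of the xid's true 64-bit value.
 */
static inline full_xid_t
full_xid_relative_to(full_xid_t rel, xid_t xid)
{
    xid_t       rel_xid = (xid_t) rel;  /* low 32 bits of the reference */

    /*
     * The modulo-2^32 difference, reinterpreted as signed, is the distance
     * from rel to xid: negative if xid is older, positive if newer.  (The
     * unsigned-to-signed cast assumes the usual two's-complement wraparound.)
     */
    int32_t     delta = (int32_t) (uint32_t) (xid - rel_xid);
    full_xid_t  result = rel + (uint64_t) (int64_t) delta;

    /* The low 32 bits of the result always reproduce the input xid. */
    assert((xid_t) result == xid);
    return result;
}
```

With `rel = 0x0000000500000010`, an xid of `0xFFFFFFF0` is 32 behind the reference and maps to `0x00000004FFFFFFF0`, i.e. the previous epoch. The single signed addition handles the epoch boundary, so no comparison or branch is needed.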

> I wonder if somebody has an opinion on renaming latestCompletedXid to
> latestCompletedFullXid. That's the pattern we already had (cf
> nextFullXid), but it also leads to pretty long lines and quite a few
> comment etc changes.
>
> I'm somewhat inclined to remove the "Full" out of the variable, and to
> also do that for nextFullXid. I feel like including it in the variable
> name is basically a poor copy of the (also not great) C type system.  If
> we hadn't made FullTransactionId a struct I'd see it differently (and
> thus incompatible with TransactionId), but we have ...

Yeah, I'm OK with dropping the "Full".  I've found it rather clumsy too.



Re: Improving connection scalability: GetSnapshotData()

Thomas Munro-5
On Wed, Jul 29, 2020 at 6:15 PM Thomas Munro <[hidden email]> wrote:
> +static inline FullTransactionId
> +FullXidViaRelative(FullTransactionId rel, TransactionId xid)
>
> I'm struggling to find a better word for this than "relative".

The best I've got is "anchor" xid.  It is an xid that is known to
limit nextFullXid's range while the receiving function runs.



Re: Improving connection scalability: GetSnapshotData()

Daniel Gustafsson
In reply to this post by Andres Freund
> On 24 Jul 2020, at 03:11, Andres Freund <[hidden email]> wrote:

> I've done that in the attached.

As this is actively being reviewed but time is running short, I'm moving this
to the next CF.

cheers ./daniel



Re: Improving connection scalability: GetSnapshotData()

Andres Freund
In reply to this post by Thomas Munro-5
Hi,

On 2020-07-29 19:20:04 +1200, Thomas Munro wrote:
> On Wed, Jul 29, 2020 at 6:15 PM Thomas Munro <[hidden email]> wrote:
> > +static inline FullTransactionId
> > +FullXidViaRelative(FullTransactionId rel, TransactionId xid)
> >
> > I'm struggling to find a better word for this than "relative".
>
> The best I've got is "anchor" xid.  It is an xid that is known to
> limit nextFullXid's range while the receiving function runs.

Thinking about it, I think that relative is a good descriptor. It's just
that 'via' is weird. How about: FullXidRelativeTo?

Greetings,

Andres Freund



Re: Improving connection scalability: GetSnapshotData()

Thomas Munro-5
On Wed, Aug 12, 2020 at 12:19 PM Andres Freund <[hidden email]> wrote:

> On 2020-07-29 19:20:04 +1200, Thomas Munro wrote:
> > On Wed, Jul 29, 2020 at 6:15 PM Thomas Munro <[hidden email]> wrote:
> > > +static inline FullTransactionId
> > > +FullXidViaRelative(FullTransactionId rel, TransactionId xid)
> > >
> > > I'm struggling to find a better word for this than "relative".
> >
> > The best I've got is "anchor" xid.  It is an xid that is known to
> > limit nextFullXid's range while the receiving function runs.
>
> Thinking about it, I think that relative is a good descriptor. It's just
> that 'via' is weird. How about: FullXidRelativeTo?

WFM.

