PATCH: Attempt to make dbsize a bit more consistent

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
21 messages Options
12
Reply | Threaded
Open this post in threaded view
|

PATCH: Attempt to make dbsize a bit more consistent

gkokolatos
Hi all,

this minor patch is attempting to force the use of the tableam api in dbsize where ever it is required.

Apparently something similar was introduced for toast relations only. Intuitively it seems that the distinction between a table and a toast table is not needed. This patch treats tables, toast tables and materialized views equally.

Regards,
//Georgios

0001-Attempt-to-make-dbsize-a-bit-more-consistent.patch (9K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

John Naylor-2
On Thu, Aug 27, 2020 at 9:39 AM <[hidden email]> wrote:
>
> Hi all,
>
> this minor patch is attempting to force the use of the tableam api in dbsize where ever it is required.
>
> Apparently something similar was introduced for toast relations only. Intuitively it seems that the distinction between a table and a toast table is not needed.

I suspect the reason is found in the comment for table_block_relation_size():

 * If a table AM uses the various relation forks as the sole place where data
 * is stored, and if it uses them in the expected manner (e.g. the actual data
 * is in the main fork rather than some other), it can use this implementation
 * of the relation_size callback rather than implementing its own.

--
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

David Zhang
In reply to this post by gkokolatos

I found the function "table_relation_size" is only used by buffer
manager for "RELKIND_RELATION", "RELKIND_TOASTVALUE" and
"RELKIND_MATVIEW", i.e.

         case RELKIND_RELATION:
         case RELKIND_TOASTVALUE:
         case RELKIND_MATVIEW:
             {
                 /*
                  * Not every table AM uses BLCKSZ wide fixed size blocks.
                  * Therefore tableam returns the size in bytes - but
for the
                  * purpose of this routine, we want the number of blocks.
                  * Therefore divide, rounding up.
                  */
                 uint64        szbytes;

                 szbytes = table_relation_size(relation, forkNum);

                 return (szbytes + (BLCKSZ - 1)) / BLCKSZ;
             }

So using "calculate_relation_size" and "calculate_toast_table_size" in
"calculate_table_size" is easy to understand and the original logic is
simple.


On 2020-08-27 6:38 a.m., [hidden email] wrote:
> Hi all,
>
> this minor patch is attempting to force the use of the tableam api in dbsize where ever it is required.
>
> Apparently something similar was introduced for toast relations only. Intuitively it seems that the distinction between a table and a toast table is not needed. This patch treats tables, toast tables and materialized views equally.
>
> Regards,
> //Georgios
Best regards,
--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca



Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

gkokolatos
In reply to this post by John Naylor-2





‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, 8 September 2020 16:49, John Naylor <[hidden email]> wrote:

> On Thu, Aug 27, 2020 at 9:39 AM [hidden email] wrote:
>
> > Hi all,
> > this minor patch is attempting to force the use of the tableam api in dbsize where ever it is required.
> > Apparently something similar was introduced for toast relations only. Intuitively it seems that the distinction between a table and a toast table is not needed.
>
> I suspect the reason is found in the comment for table_block_relation_size():
>
> -   If a table AM uses the various relation forks as the sole place where data
> -   is stored, and if it uses them in the expected manner (e.g. the actual data
> -   is in the main fork rather than some other), it can use this implementation
> -   of the relation_size callback rather than implementing its own.


Thank you for your answer and interest at the patch.

I agree with the comment above. However I do not see why it is relevant here. When issuing:

SELECT pg_table_size('foo'::regclass);

I should not have to care about the on disk layout of the relation 'foo'.
Without this patch, one will get a correct result only when 'foo' is a heap table.
For custom layouts the result can potentially be wrong.



>
>     --
>     John Naylor https://www.2ndQuadrant.com/
>     PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>




Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

gkokolatos
In reply to this post by David Zhang





‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, 8 September 2020 22:26, David Zhang <[hidden email]> wrote:

>
>
> I found the function "table_relation_size" is only used by buffer
> manager for "RELKIND_RELATION", "RELKIND_TOASTVALUE" and
> "RELKIND_MATVIEW", i.e.
>
>         case RELKIND_RELATION:
>         case RELKIND_TOASTVALUE:
>         case RELKIND_MATVIEW:
>             {
>                 /*
>                  * Not every table AM uses BLCKSZ wide fixed size blocks.
>                  * Therefore tableam returns the size in bytes - but
> for the
>                  * purpose of this routine, we want the number of blocks.
>                  * Therefore divide, rounding up.
>                  */
>                 uint64        szbytes;
>
>                 szbytes = table_relation_size(relation, forkNum);
>
>                 return (szbytes + (BLCKSZ - 1)) / BLCKSZ;
>             }
>
> So using "calculate_relation_size" and "calculate_toast_table_size" in
> "calculate_table_size" is easy to understand and the original logic is
> simple.
>

You are correct. This is the logic that is attempted to be applied
in dbsize.c in this patch.

So what do you think of the patch?


Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

David Zhang

On 2020-09-09 12:41 a.m., [hidden email] wrote:

> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Tuesday, 8 September 2020 22:26, David Zhang <[hidden email]> wrote:
>
>>
>> I found the function "table_relation_size" is only used by buffer
>> manager for "RELKIND_RELATION", "RELKIND_TOASTVALUE" and
>> "RELKIND_MATVIEW", i.e.
>>
>>          case RELKIND_RELATION:
>>          case RELKIND_TOASTVALUE:
>>          case RELKIND_MATVIEW:
>>              {
>>                  /*
>>                   * Not every table AM uses BLCKSZ wide fixed size blocks.
>>                   * Therefore tableam returns the size in bytes - but
>> for the
>>                   * purpose of this routine, we want the number of blocks.
>>                   * Therefore divide, rounding up.
>>                   */
>>                  uint64        szbytes;
>>
>>                  szbytes = table_relation_size(relation, forkNum);
>>
>>                  return (szbytes + (BLCKSZ - 1)) / BLCKSZ;
>>              }
>>
>> So using "calculate_relation_size" and "calculate_toast_table_size" in
>> "calculate_table_size" is easy to understand and the original logic is
>> simple.
>>
> You are correct. This is the logic that is attempted to be applied
> in dbsize.c in this patch.
>
> So what do you think of the patch?

I would suggest to keep the original logic unless there is a bug.

--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca



Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

Daniel Gustafsson
>> So what do you think of the patch?
>
> I would suggest to keep the original logic unless there is a bug.

IIUC, the premise of this path submission is that this codepath is in fact
buggy as it may lead to incorrect results for non-heap relations?

Since we have introduced the table AM api I support going throug it for all
table accesses, so +1 on the overall idea.

Some comments on the patch:

- * Note that this also behaves sanely if applied to an index or toast table;
+ * Note that this also behaves sanely if applied to a toast table;
  * those won't have attached toast tables, but they can have multiple forks.
This comment reads a bit odd now and should probably be reworded.


-   return size;
+   Assert(size < PG_INT64_MAX);
+
+   return (int64)size;
I assume that this change, and the other ones like that, aim to handle int64
overflow?  Using the extra legroom of uint64 can still lead to an overflow,
however theoretical it may be.  Wouldn't it be better to check for overflow
before adding to size rather than after the fact?  Something along the lines of
checking for headroom left:

  rel_size = table_relation_size(..);
  if (rel_size > (PG_INT64_MAX - total_size))
    < error codepath >
  total_size += rel_size;


+   if (rel->rd_rel->relkind != RELKIND_INDEX)
+   {
+       relation_close(rel, AccessShareLock);
+       PG_RETURN_NULL();
+   }
pg_indexes_size is defined as returning the size of the indexes attached to the
specified relation, so this hunk is wrong as it instead requires rel to be an
index?

cheers ./daniel

Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

Michael Paquier-2
On Thu, Sep 10, 2020 at 11:51:30AM +0200, Daniel Gustafsson wrote:
> Some comments on the patch:

Extra comment for this patch: regression tests are failing.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

gkokolatos
In reply to this post by Daniel Gustafsson





‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, September 10, 2020 12:51 PM, Daniel Gustafsson <[hidden email]> wrote:

[snip]

> Since we have introduced the table AM api I support going throug it for all
> table accesses, so +1 on the overall idea.
>

Thank you for reviewing! Please find v2 of the patch attached.

In addition to addressing the comments, this patch contains a slightly opinionated approach during describe. In short, only relations that have storage, are returning non null size during when \d*+ commands are emitted. Such an example is a view which can be found in the psql tests. If a view was returning a size of 0 Bytes, it would indicate that it can potentially be non zero, which is of course wrong.


> Some comments on the patch:
>
> -   -   Note that this also behaves sanely if applied to an index or toast table;
>
> -   -   Note that this also behaves sanely if applied to a toast table;
>     -   those won't have attached toast tables, but they can have multiple forks.
>         This comment reads a bit odd now and should probably be reworded.
>

Agreed and amended.

>
> -   return size;
>
> -   Assert(size < PG_INT64_MAX);
>
> -
> -   return (int64)size;
>     I assume that this change, and the other ones like that, aim to handle int64
>     overflow? Using the extra legroom of uint64 can still lead to an overflow,
>     however theoretical it may be. Wouldn't it be better to check for overflow
>     before adding to size rather than after the fact? Something along the lines of
>     checking for headroom left:
>
>     rel_size = table_relation_size(..);
>     if (rel_size > (PG_INT64_MAX - total_size))
>     < error codepath >
>
>
> total_size += rel_size;
Actually not, the intention is not to handle overflow. The table_relation_size() returns a uint64 and the calling function returns int64.

The Assert() has been placed in order to denote that it is acknowledged that the two functions return different types. I was of the opinion that a run time check will not be needed as even the smaller type can cover more than 9200 PetaByte tables.

If we were to change anything, then I would prefer to change the signature of the pg_*_size family of functions to return uint64 instead.


>
> -   if (rel->rd_rel->relkind != RELKIND_INDEX)
>
> -   {
>
> -         relation_close(rel, AccessShareLock);
>
>
> -         PG_RETURN_NULL();
>
>
> -   }
>     pg_indexes_size is defined as returning the size of the indexes attached to the
>     specified relation, so this hunk is wrong as it instead requires rel to be an
>     index?
You are absolutely correct, amended.

>
>     cheers ./daniel
>


v2-0001-Attempt-to-make-dbsize-a-bit-more-consistent.patch (13K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

Soumyadeep Chakraborty
Hey Georgios,

Thanks for looking for more avenues to invoke tableAM APIS! Please find
my review below:

On Tue, Oct 13, 2020 at 6:28 AM <[hidden email]> wrote:

1.

>  /*
> - * heap size, including FSM and VM
> + * table size, including FSM and VM
>  */

We should not mention FSM and VM in dbsize.c at all as these are
heap AM specific. We can say:
table size, excluding TOAST relation

2.

>  /*
>  * Size of toast relation
>  */
>  if (OidIsValid(rel->rd_rel->reltoastrelid))
> - size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
> + {
> + Relation toastRel;
> +
> + toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);

We can replace the OidIsValid check with relation_needs_toast_table()
and then have the OidIsValid() check in an Assert. Perhaps, that will
make things more readable.

3.

> + if (rel->rd_rel->relkind == RELKIND_RELATION ||
> + rel->rd_rel->relkind == RELKIND_TOASTVALUE ||
> + rel->rd_rel->relkind == RELKIND_MATVIEW)
> + size = calculate_table_size(rel);
> + else
> + {
> + relation_close(rel, AccessShareLock);
> + PG_RETURN_NULL();
> + }

This leads to behavioral changes:

I am talking about the case where one calls pg_table_size() on an index.
W/o your patch, it returns the size of the index. W/ your patch it
returns NULL. Judging by the documentation, this function should not
ideally apply to indexes but it does! I have a sinking feeling that lots
of users use it for this purpose, as there is no function to calculate
the size of a single specific index (except pg_relation_size()).
The same argument I have made above applies to sequences. Users may have
trial-and-errored their way into finding that pg_table_size() can tell
them the size of a specific index/sequence! I don't know how widespread
the use is in the user community, so IMO maybe we should be conservative
and not introduce this change? Alternatively, we could call out that
pg_table_size() is only for tables by throwing an error if anything
other than a table is passed in.

If we decide to preserve the existing behavior of the pg_table_size():
It seems that for things not backed by the tableAM (indexes and
sequences), they should still go through calculate_relation_size().
We can call table_relation_size() based on if relkind is
RELKIND_RELATION, RELKIND_TOASTVALUE or RELKIND_MATVIEW. Perhaps it
might be worthwhile making a new macro RELKIND_HAS_TABLE_STORAGE to
capture these three cases (See RELKIND_HAS_STORAGE). This also ensures
that we return 0 for things that don't qualify as RELKIND_HAS_STORAGE,
such as a partitioned table (Currently w/ the patch applied, we return
NULL for those cases, which is another behavior change)

4.

> @@ -3776,10 +3776,24 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
>                                gettext_noop("Access Method"));
>
>          /*
> +         * As of PostgreSQL 14, do not use pg_table_size() for indexes and
> +         * sequences as it does not behave sanely for those.
> +         *
>           * As of PostgreSQL 9.0, use pg_table_size() to show a more accurate
>           * size of a table, including FSM, VM and TOAST tables.
>           */
> -        if (pset.sversion >= 90000)
> +        if (pset.sversion >= 140000)
> +            appendPQExpBuffer(&buf,
> +                              ",\n CASE"
> +                              " WHEN c.relkind in ("CppAsString2(RELKIND_INDEX)", "
> +                                                    CppAsString2(RELKIND_PARTITIONED_INDEX)", "
> +                                                    CppAsString2(RELKIND_SEQUENCE)") THEN"
> +                              "     pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid))"
> +                              " ELSE"
> +                              "     pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid))"
> +                              " END as \"%s\"",
> +                              gettext_noop("Size"));
> +        else if (pset.sversion >= 90000)
>              appendPQExpBuffer(&buf,
>                                ",\n  pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as \"%s\"",
>                                gettext_noop("Size"));

Following on from point 3, if we decide to preserve the existing behavior,
we wouldn't need this change, as it would be internalized within
pg_table_size().


4.

> >
> > -   return size;
> >
> > -   Assert(size < PG_INT64_MAX);
> >
> > -
> > -   return (int64)size;
> >     I assume that this change, and the other ones like that, aim to handle int64
> >     overflow? Using the extra legroom of uint64 can still lead to an overflow,
> >     however theoretical it may be. Wouldn't it be better to check for overflow
> >     before adding to size rather than after the fact? Something along the lines of
> >     checking for headroom left:
> >
> >     rel_size = table_relation_size(..);
> >     if (rel_size > (PG_INT64_MAX - total_size))
> >     < error codepath >
> >
> >
> > total_size += rel_size;
>
>
>
> Actually not, the intention is not to handle overflow. The table_relation_size() returns a uint64 and the calling function returns int64.
>
>
>
> The Assert() has been placed in order to denote that it is acknowledged that the two functions return different types. I was of the opinion that a run time check will not be needed as even the > smaller type can cover more than 9200 PetaByte tables.
>
>
>
> If we were to change anything, then I would prefer to change the signature of the pg_*_size family of functions to return uint64 instead.

Changing the signature would be the ideal change for all of this here.
But Postgres does not have support for an unsigned 64bit integer (bigint
is signed). One would need to turn to extensions such as [1]. Thus, +1
to what Daniel said above.


5.

> @@ -415,7 +384,7 @@ calculate_table_size(Relation rel)
>  static int64
>  calculate_indexes_size(Relation rel)
>  {
> -    int64       size = 0;
> +    uint64      size = 0;
>
>      /*
>       * Aggregate all indexes on the given relation
> @@ -444,7 +413,9 @@ calculate_indexes_size(Relation rel)
>          list_free(index_oids);
>      }
>
> -    return size;
> +    Assert(size < PG_INT64_MAX);
> +
> +    return (int64)size;
>  }

I don't think we would need these changes as nothing changed in this
function.

Regards,
Soumyadeep

[1] https://github.com/petere/pguint


Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

gkokolatos





‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, November 9, 2020 7:50 PM, Soumyadeep Chakraborty <[hidden email]> wrote:

> Hey Georgios,
>
> Thanks for looking for more avenues to invoke tableAM APIS! Please find
> my review below:

A great review Soumyadeep, it is much appreciated.
Please remember to add yourself as a reviewer in the commitfest
[https://commitfest.postgresql.org/30/2701/]

>
> On Tue, Oct 13, 2020 at 6:28 AM [hidden email] wrote:
>
> 1.
>
> > /*
> >
> > -   -   heap size, including FSM and VM
> >
> > -   -   table size, including FSM and VM
> >         */
> >
>
> We should not mention FSM and VM in dbsize.c at all as these are
> heap AM specific. We can say:
> table size, excluding TOAST relation

Yeah, I was thinking that the notion that FSM and VM are still taken
into account should be stated. We are iterating over ForkNumber
after all.

How about phrasing it as:

+ table size, including all implemented forks from the AM (e.g. FSM, VM)
+ excluding TOAST relations

Thoughts?

>
> 2.
>
> > /*
> >
> > -   Size of toast relation
> >     */
> >     if (OidIsValid(rel->rd_rel->reltoastrelid))
> >
> >
> > -   size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
> >
> > -   {
> > -   Relation toastRel;
> > -
> > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
>
> We can replace the OidIsValid check with relation_needs_toast_table()
> and then have the OidIsValid() check in an Assert. Perhaps, that will
> make things more readable.

Please, allow me to kindly disagree.

Relation is already open at this stage. Even create_toast_table(), the
internal workhorse for creating toast relations, does check reltoastrelid
to test if the relation is already toasted.

Furthermore, we do call:

+ toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);

and in order to avoid elog() errors underneath us, we ought to have
verified the validity of reltoastrelid.

In short, I think that the code in the proposal is not too unreadable
nor that it breaks the coding patterns throughout the codebase.

Am I too wrong?

>
> 3.
>
> > -   if (rel->rd_rel->relkind == RELKIND_RELATION ||
> > -   rel->rd_rel->relkind == RELKIND_TOASTVALUE ||
> > -   rel->rd_rel->relkind == RELKIND_MATVIEW)
> > -   size = calculate_table_size(rel);
> > -   else
> > -   {
> > -   relation_close(rel, AccessShareLock);
> > -   PG_RETURN_NULL();
> > -   }
>
> This leads to behavioral changes:
>
> I am talking about the case where one calls pg_table_size() on an index.
> W/o your patch, it returns the size of the index. W/ your patch it
> returns NULL. Judging by the documentation, this function should not
> ideally apply to indexes but it does! I have a sinking feeling that lots
> of users use it for this purpose, as there is no function to calculate
> the size of a single specific index (except pg_relation_size()).
> The same argument I have made above applies to sequences. Users may have
> trial-and-errored their way into finding that pg_table_size() can tell
> them the size of a specific index/sequence! I don't know how widespread
> the use is in the user community, so IMO maybe we should be conservative
> and not introduce this change? Alternatively, we could call out that
> pg_table_size() is only for tables by throwing an error if anything
> other than a table is passed in.
>
> If we decide to preserve the existing behavior of the pg_table_size():
> It seems that for things not backed by the tableAM (indexes and
> sequences), they should still go through calculate_relation_size().
> We can call table_relation_size() based on if relkind is
> RELKIND_RELATION, RELKIND_TOASTVALUE or RELKIND_MATVIEW. Perhaps it
> might be worthwhile making a new macro RELKIND_HAS_TABLE_STORAGE to
> capture these three cases (See RELKIND_HAS_STORAGE). This also ensures
> that we return 0 for things that don't qualify as RELKIND_HAS_STORAGE,
> such as a partitioned table (Currently w/ the patch applied, we return
> NULL for those cases, which is another behavior change)


Excellent point. This is the discussion I was longing to have.

I stand by the decision coded in the patch, that pg_table_size() should
return NULL for other kinds of relations, such as indexes, sequences
etc.

It is a conscious decision based on the following:

 * Supported by the documentation, pg_table_size() applies to tables only.
For other uses the higher-level functions pg_total_relation_size() or
pg_relation_size() should be used.
 * Commit fa352d662e taught pg_relation_size() and friends to return NULL if the object doesn't exist. This makes perfect sense for the
scenarios described in the commit:

   That avoids errors when the functions are used in queries like
      "SELECT pg_relation_size(oid) FROM pg_class",
   and a table is dropped concurrently.

IMHO: It is more consistent to return NULL when the relation does exist
OR it is not a table kind.
* Returning 0 for things that do not have storage, is nonsensical because
it implies that it can be NON zero at some point. Things that do not
have storage have an unknown size.


As far as for the argument that users might have trialed and errored
their way into undocumented behaviour, I do not think it is strong
enough to stop us from implementing the documented behaviour.

>
> 4.
>
> > @@ -3776,10 +3776,24 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
> > gettext_noop("Access Method"));
> >
> >          /*
> >
> >
> > -           * As of PostgreSQL 14, do not use pg_table_size() for indexes and
> >
> >
> > -           * sequences as it does not behave sanely for those.
> >
> >
> > -           *
> >             * As of PostgreSQL 9.0, use pg_table_size() to show a more accurate
> >             * size of a table, including FSM, VM and TOAST tables.
> >             */
> >
> >
> >
> > -          if (pset.sversion >= 90000)
> >
> >
> >
> > -          if (pset.sversion >= 140000)
> >
> >
> > -              appendPQExpBuffer(&buf,
> >
> >
> > -                                ",\\n CASE"
> >
> >
> > -                                " WHEN c.relkind in ("CppAsString2(RELKIND_INDEX)", "
> >
> >
> > -                                                      CppAsString2(RELKIND_PARTITIONED_INDEX)", "
> >
> >
> > -                                                      CppAsString2(RELKIND_SEQUENCE)") THEN"
> >
> >
> > -                                "     pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid))"
> >
> >
> > -                                " ELSE"
> >
> >
> > -                                "     pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid))"
> >
> >
> > -                                " END as \\"%s\\"",
> >
> >
> > -                                gettext_noop("Size"));
> >
> >
> > -          else if (pset.sversion >= 90000)
> >                appendPQExpBuffer(&buf,
> >                                  ",\\n  pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as \\"%s\\"",
> >                                  gettext_noop("Size"));
> >
> >
>
> Following on from point 3, if we decide to preserve the existing behavior,
> we wouldn't need this change, as it would be internalized within
> pg_table_size().

We really should not decide to preserve the existing behaviour.

I will reiterate my point: Returning 0 for things that do not have
storage, implies that it can be NON zero at some point. We should not
treat pg_table_size() as an alias for pg_relation_size().

>
> 4.
>
> > > -   return size;
> > >
> > > -   Assert(size < PG_INT64_MAX);
> > >
> > > -
> > > -   return (int64)size;
> > >     I assume that this change, and the other ones like that, aim to handle int64
> > >     overflow? Using the extra legroom of uint64 can still lead to an overflow,
> > >     however theoretical it may be. Wouldn't it be better to check for overflow
> > >     before adding to size rather than after the fact? Something along the lines of
> > >     checking for headroom left:
> > >     rel_size = table_relation_size(..);
> > >     if (rel_size > (PG_INT64_MAX - total_size))
> > >     < error codepath >
> > >
> > >
> > > total_size += rel_size;
> >
> > Actually not, the intention is not to handle overflow. The table_relation_size() returns a uint64 and the calling function returns int64.
> > The Assert() has been placed in order to denote that it is acknowledged that the two functions return different types. I was of the opinion that a run time check will not be needed as even the > smaller type can cover more than 9200 PetaByte tables.
> > If we were to change anything, then I would prefer to change the signature of the pg_*_size family of functions to return uint64 instead.
>
> Changing the signature would be the ideal change for all of this here.
> But Postgres does not have support for an unsigned 64bit integer (bigint
> is signed). One would need to turn to extensions such as [1]. Thus, +1
> to what Daniel said above.

Apologies, I do not follow. Are you suggesting that we should
introduce overflow tests?

>
> 5.
>
> > @@ -415,7 +384,7 @@ calculate_table_size(Relation rel)
> > static int64
> > calculate_indexes_size(Relation rel)
> > {
> >
> > -   int64 size = 0;
> >
> > -   uint64 size = 0;
> >     /*
> >     -   Aggregate all indexes on the given relation
> >         @@ -444,7 +413,9 @@ calculate_indexes_size(Relation rel)
> >         list_free(index_oids);
> >         }
> >
> >
> > -   return size;
> >
> > -   Assert(size < PG_INT64_MAX);
> > -
> > -   return (int64)size;
> >     }
> >
>
> I don't think we would need these changes as nothing changed in this
> function.

This change intended to keep the calculate family of functions
homogenous in style. From the point above I understand it confuses
more than helps, I will remove.

Cheers,
//Georgios

>
> Regards,
> Soumyadeep
>
> [1] https://github.com/petere/pguint




Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

Soumyadeep Chakraborty
Hey Georgios,

On Tue, Nov 10, 2020 at 6:20 AM <[hidden email]> wrote:

>
>
>
>
>
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Monday, November 9, 2020 7:50 PM, Soumyadeep Chakraborty <[hidden email]> wrote:
>
> > Hey Georgios,
> >
> > Thanks for looking for more avenues to invoke tableAM APIS! Please find
> > my review below:
>
> A great review Soumyadeep, it is much appreciated.
> Please remember to add yourself as a reviewer in the commitfest
> [https://commitfest.postgresql.org/30/2701/]

Ah yes. Sorry, I forgot that!

> >
> > On Tue, Oct 13, 2020 at 6:28 AM [hidden email] wrote:
> >
> > 1.
> >
> > > /*
> > >
> > > -   -   heap size, including FSM and VM
> > >
> > > -   -   table size, including FSM and VM
> > >         */
> > >
> >
> > We should not mention FSM and VM in dbsize.c at all as these are
> > heap AM specific. We can say:
> > table size, excluding TOAST relation
>
> Yeah, I was thinking that the notion that FSM and VM are still taken
> into account should be stated. We are iterating over ForkNumber
> after all.
>
> How about phrasing it as:
>
> + table size, including all implemented forks from the AM (e.g. FSM, VM)
> + excluding TOAST relations
>
> Thoughts?

Yes, I was thinking along the same lines. The concept of a "fork" forced
should not be forced down into the tableAM. But that is a discussion for
another day. We can probably say:

+ table size, including all implemented forks from the AM (e.g. FSM, VM
+ for the heap AM) excluding TOAST relations

> >
> > 2.
> >
> > > /*
> > >
> > > -   Size of toast relation
> > >     */
> > >     if (OidIsValid(rel->rd_rel->reltoastrelid))
> > >
> > >
> > > -   size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
> > >
> > > -   {
> > > -   Relation toastRel;
> > > -
> > > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> >
> > We can replace the OidIsValid check with relation_needs_toast_table()
> > and then have the OidIsValid() check in an Assert. Perhaps, that will
> > make things more readable.
>
> Please, allow me to kindly disagree.
>
> Relation is already open at this stage. Even create_toast_table(), the
> internal workhorse for creating toast relations, does check reltoastrelid
> to test if the relation is already toasted.
>
> Furthermore, we do call:
>
> + toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
>
> and in order to avoid elog() errors underneath us, we ought to have
> verified the validity of reltoastrelid.
>
> In short, I think that the code in the proposal is not too unreadable
> nor that it breaks the coding patterns throughout the codebase.
>
> Am I too wrong?

No not at all. The code in the proposal is indeed adhering to the
codebase. What I was going for here was to increase the usage of
relation_needs_toast_table(). What I meant was:

if (table_relation_needs_toast_table(rel))
{
if (!OidIsValid(rel->rd_rel->reltoastrelid))
{
elog(ERROR, <errmsg that toast table wasn't found>);
}
//size calculation here..
}

We want to error out if the toast table can't be found and the relation
is expected to have one, which the existing code doesn't handle.


>
> >
> > 3.
> >
> > > -   if (rel->rd_rel->relkind == RELKIND_RELATION ||
> > > -   rel->rd_rel->relkind == RELKIND_TOASTVALUE ||
> > > -   rel->rd_rel->relkind == RELKIND_MATVIEW)
> > > -   size = calculate_table_size(rel);
> > > -   else
> > > -   {
> > > -   relation_close(rel, AccessShareLock);
> > > -   PG_RETURN_NULL();
> > > -   }
> >
> > This leads to behavioral changes:
> >
> > I am talking about the case where one calls pg_table_size() on an index.
> > W/o your patch, it returns the size of the index. W/ your patch it
> > returns NULL. Judging by the documentation, this function should not
> > ideally apply to indexes but it does! I have a sinking feeling that lots
> > of users use it for this purpose, as there is no function to calculate
> > the size of a single specific index (except pg_relation_size()).
> > The same argument I have made above applies to sequences. Users may have
> > trial-and-errored their way into finding that pg_table_size() can tell
> > them the size of a specific index/sequence! I don't know how widespread
> > the use is in the user community, so IMO maybe we should be conservative
> > and not introduce this change? Alternatively, we could call out that
> > pg_table_size() is only for tables by throwing an error if anything
> > other than a table is passed in.
> >
> > If we decide to preserve the existing behavior of the pg_table_size():
> > It seems that for things not backed by the tableAM (indexes and
> > sequences), they should still go through calculate_relation_size().
> > We can call table_relation_size() based on if relkind is
> > RELKIND_RELATION, RELKIND_TOASTVALUE or RELKIND_MATVIEW. Perhaps it
> > might be worthwhile making a new macro RELKIND_HAS_TABLE_STORAGE to
> > capture these three cases (See RELKIND_HAS_STORAGE). This also ensures
> > that we return 0 for things that don't qualify as RELKIND_HAS_STORAGE,
> > such as a partitioned table (Currently w/ the patch applied, we return
> > NULL for those cases, which is another behavior change)
>
>
> Excellent point. This is the discussion I was longing to have.
>
> I stand by the decision coded in the patch, that pg_table_size() should
> return NULL for other kinds of relations, such as indexes, sequences
> etc.
>
> It is a conscious decision based on the following:
>
>  * Supported by the documentation, pg_table_size() applies to tables only.
> For other uses the higher-level functions pg_total_relation_size() or
> pg_relation_size() should be used.
>  * Commit fa352d662e taught pg_relation_size() and friends to return NULL if the object doesn't exist. This makes perfect sense for the
> scenarios described in the commit:
>
>    That avoids errors when the functions are used in queries like
>       "SELECT pg_relation_size(oid) FROM pg_class",
>    and a table is dropped concurrently.
>
> IMHO: It is more consistent to return NULL when the relation does exist
> OR it is not a table kind.
> * Returning 0 for things that do not have storage, is nonsensical because
> it implies that it can be NON zero at some point. Things that do not
> have storage have an unknown size.

Fair. We will have to document the behavior change.

>
> As far as for the argument that users might have trialed and errored
> their way into undocumented behaviour, I do not think it is strong
> enough to stop us from implementing the documented behaviour.

Fair. I would strongly vote for having two additional functions
(pg_index_size() and pg_sequence_size()) to strongly signal our intent
of banning that kind of use of pg_table_size(). I think it would help
users a lot. It is not easy to find what function to call when you want
the size of a single index/sequence.

> >
> > 4.
> >
> > > @@ -3776,10 +3776,24 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
> > > gettext_noop("Access Method"));
> > >
> > >          /*
> > >
> > >
> > > -           * As of PostgreSQL 14, do not use pg_table_size() for indexes and
> > >
> > >
> > > -           * sequences as it does not behave sanely for those.
> > >
> > >
> > > -           *
> > >             * As of PostgreSQL 9.0, use pg_table_size() to show a more accurate
> > >             * size of a table, including FSM, VM and TOAST tables.
> > >             */
> > >
> > >
> > >
> > > -          if (pset.sversion >= 90000)
> > >
> > >
> > >
> > > -          if (pset.sversion >= 140000)
> > >
> > >
> > > -              appendPQExpBuffer(&buf,
> > >
> > >
> > > -                                ",\\n CASE"
> > >
> > >
> > > -                                " WHEN c.relkind in ("CppAsString2(RELKIND_INDEX)", "
> > >
> > >
> > > -                                                      CppAsString2(RELKIND_PARTITIONED_INDEX)", "
> > >
> > >
> > > -                                                      CppAsString2(RELKIND_SEQUENCE)") THEN"
> > >
> > >
> > > -                                "     pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid))"
> > >
> > >
> > > -                                " ELSE"
> > >
> > >
> > > -                                "     pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid))"
> > >
> > >
> > > -                                " END as \\"%s\\"",
> > >
> > >
> > > -                                gettext_noop("Size"));
> > >
> > >
> > > -          else if (pset.sversion >= 90000)
> > >                appendPQExpBuffer(&buf,
> > >                                  ",\\n  pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as \\"%s\\"",
> > >                                  gettext_noop("Size"));
> > >
> > >
> >
> > Following on from point 3, if we decide to preserve the existing behavior,
> > we wouldn't need this change, as it would be internalized within
> > pg_table_size().
>
> We really should not decide to preserve the existing behaviour.
>
> I will reiterate my point: Returning 0 for things that do not have
> storage, implies that it can be NON zero at some point. We should not
> treat pg_table_size() as an alias for pg_relation_size().

+1

> >
> > 4.
> >
> > > > -   return size;
> > > >
> > > > -   Assert(size < PG_INT64_MAX);
> > > >
> > > > -
> > > > -   return (int64)size;
> > > >     I assume that this change, and the other ones like that, aim to handle int64
> > > >     overflow? Using the extra legroom of uint64 can still lead to an overflow,
> > > >     however theoretical it may be. Wouldn't it be better to check for overflow
> > > >     before adding to size rather than after the fact? Something along the lines of
> > > >     checking for headroom left:
> > > >     rel_size = table_relation_size(..);
> > > >     if (rel_size > (PG_INT64_MAX - total_size))
> > > >     < error codepath >
> > > >
> > > >
> > > > total_size += rel_size;
> > >
> > > Actually not, the intention is not to handle overflow. The table_relation_size() returns a uint64 and the calling function returns int64.
> > > The Assert() has been placed in order to denote that it is acknowledged that the two functions return different types. I was of the opinion that a run time check will not be needed as even the > smaller type can cover more than 9200 PetaByte tables.
> > > If we were to change anything, then I would prefer to change the signature of the pg_*_size family of functions to return uint64 instead.
> >
> > Changing the signature would be the ideal change for all of this here.
> > But Postgres does not have support for an unsigned 64bit integer (bigint
> > is signed). One would need to turn to extensions such as [1]. Thus, +1
> > to what Daniel said above.
>
> Apologies, I do not follow. Are you suggesting that we should
> introduce overflow tests?

Yes, to introduce the same overflow test that Daniel suggested above as
returning a uint64 in PG does not really return a uint64 AFAIU (since
the pg_**size() functions all return bigint which is signed and there
is no uint64 user-facing type).


Regards,
Soumyadeep (VMware)


Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

Masahiko Sawada
On Thu, Nov 12, 2020 at 2:54 AM Soumyadeep Chakraborty
<[hidden email]> wrote:

>
> Hey Georgios,
>
> On Tue, Nov 10, 2020 at 6:20 AM <[hidden email]> wrote:
> >
> >
> >
> >
> >
> >
> > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > On Monday, November 9, 2020 7:50 PM, Soumyadeep Chakraborty <[hidden email]> wrote:
> >
> > > Hey Georgios,
> > >
> > > Thanks for looking for more avenues to invoke tableAM APIS! Please find
> > > my review below:
> >
> > A great review Soumyadeep, it is much appreciated.
> > Please remember to add yourself as a reviewer in the commitfest
> > [https://commitfest.postgresql.org/30/2701/]
>
> Ah yes. Sorry, I forgot that!
>
> > >
> > > On Tue, Oct 13, 2020 at 6:28 AM [hidden email] wrote:
> > >
> > > 1.
> > >
> > > > /*
> > > >
> > > > -   -   heap size, including FSM and VM
> > > >
> > > > -   -   table size, including FSM and VM
> > > >         */
> > > >
> > >
> > > We should not mention FSM and VM in dbsize.c at all as these are
> > > heap AM specific. We can say:
> > > table size, excluding TOAST relation
> >
> > Yeah, I was thinking that the notion that FSM and VM are still taken
> > into account should be stated. We are iterating over ForkNumber
> > after all.
> >
> > How about phrasing it as:
> >
> > + table size, including all implemented forks from the AM (e.g. FSM, VM)
> > + excluding TOAST relations
> >
> > Thoughts?
>
> Yes, I was thinking along the same lines. The concept of a "fork" forced
> should not be forced down into the tableAM. But that is a discussion for
> another day. We can probably say:
>
> + table size, including all implemented forks from the AM (e.g. FSM, VM
> + for the heap AM) excluding TOAST relations
>
> > >
> > > 2.
> > >
> > > > /*
> > > >
> > > > -   Size of toast relation
> > > >     */
> > > >     if (OidIsValid(rel->rd_rel->reltoastrelid))
> > > >
> > > >
> > > > -   size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
> > > >
> > > > -   {
> > > > -   Relation toastRel;
> > > > -
> > > > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> > >
> > > We can replace the OidIsValid check with relation_needs_toast_table()
> > > and then have the OidIsValid() check in an Assert. Perhaps, that will
> > > make things more readable.
> >
> > Please, allow me to kindly disagree.
> >
> > Relation is already open at this stage. Even create_toast_table(), the
> > internal workhorse for creating toast relations, does check reltoastrelid
> > to test if the relation is already toasted.
> >
> > Furthermore, we do call:
> >
> > + toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> >
> > and in order to avoid elog() errors underneath us, we ought to have
> > verified the validity of reltoastrelid.
> >
> > In short, I think that the code in the proposal is not too unreadable
> > nor that it breaks the coding patterns throughout the codebase.
> >
> > Am I too wrong?
>
> No not at all. The code in the proposal is indeed adhering to the
> codebase. What I was going for here was to increase the usage of
> relation_needs_toast_table(). What I meant was:
>
> if (table_relation_needs_toast_table(rel))
> {
> if (!OidIsValid(rel->rd_rel->reltoastrelid))
> {
> elog(ERROR, <errmsg that toast table wasn't found>);
> }
> //size calculation here..
> }
>
> We want to error out if the toast table can't be found and the relation
> is expected to have one, which the existing code doesn't handle.
>
>
> >
> > >
> > > 3.
> > >
> > > > -   if (rel->rd_rel->relkind == RELKIND_RELATION ||
> > > > -   rel->rd_rel->relkind == RELKIND_TOASTVALUE ||
> > > > -   rel->rd_rel->relkind == RELKIND_MATVIEW)
> > > > -   size = calculate_table_size(rel);
> > > > -   else
> > > > -   {
> > > > -   relation_close(rel, AccessShareLock);
> > > > -   PG_RETURN_NULL();
> > > > -   }
> > >
> > > This leads to behavioral changes:
> > >
> > > I am talking about the case where one calls pg_table_size() on an index.
> > > W/o your patch, it returns the size of the index. W/ your patch it
> > > returns NULL. Judging by the documentation, this function should not
> > > ideally apply to indexes but it does! I have a sinking feeling that lots
> > > of users use it for this purpose, as there is no function to calculate
> > > the size of a single specific index (except pg_relation_size()).
> > > The same argument I have made above applies to sequences. Users may have
> > > trial-and-errored their way into finding that pg_table_size() can tell
> > > them the size of a specific index/sequence! I don't know how widespread
> > > the use is in the user community, so IMO maybe we should be conservative
> > > and not introduce this change? Alternatively, we could call out that
> > > pg_table_size() is only for tables by throwing an error if anything
> > > other than a table is passed in.
> > >
> > > If we decide to preserve the existing behavior of the pg_table_size():
> > > It seems that for things not backed by the tableAM (indexes and
> > > sequences), they should still go through calculate_relation_size().
> > > We can call table_relation_size() based on if relkind is
> > > RELKIND_RELATION, RELKIND_TOASTVALUE or RELKIND_MATVIEW. Perhaps it
> > > might be worthwhile making a new macro RELKIND_HAS_TABLE_STORAGE to
> > > capture these three cases (See RELKIND_HAS_STORAGE). This also ensures
> > > that we return 0 for things that don't qualify as RELKIND_HAS_STORAGE,
> > > such as a partitioned table (Currently w/ the patch applied, we return
> > > NULL for those cases, which is another behavior change)
> >
> >
> > Excellent point. This is the discussion I was longing to have.
> >
> > I stand by the decision coded in the patch, that pg_table_size() should
> > return NULL for other kinds of relations, such as indexes, sequences
> > etc.
> >
> > It is a conscious decision based on the following:
> >
> >  * Supported by the documentation, pg_table_size() applies to tables only.
> > For other uses the higher-level functions pg_total_relation_size() or
> > pg_relation_size() should be used.
> >  * Commit fa352d662e taught pg_relation_size() and friends to return NULL if the object doesn't exist. This makes perfect sense for the
> > scenarios described in the commit:
> >
> >    That avoids errors when the functions are used in queries like
> >       "SELECT pg_relation_size(oid) FROM pg_class",
> >    and a table is dropped concurrently.
> >
> > IMHO: It is more consistent to return NULL when the relation does exist
> > OR it is not a table kind.
> > * Returning 0 for things that do not have storage, is nonsensical because
> > it implies that it can be NON zero at some point. Things that do not
> > have storage have an unknown size.
>
> Fair. We will have to document the behavior change.
>
> >
> > As far as for the argument that users might have trialed and errored
> > their way into undocumented behaviour, I do not think it is strong
> > enough to stop us from implementing the documented behaviour.
>
> Fair. I would strongly vote for having two additional functions
> (pg_index_size() and pg_sequence_size()) to strongly signal our intent
> of banning that kind of use of pg_table_size(). I think it would help
> users a lot. It is not easy to find what function to call when you want
> the size of a single index/sequence.
>
> > >
> > > 4.
> > >
> > > > @@ -3776,10 +3776,24 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
> > > > gettext_noop("Access Method"));
> > > >
> > > >          /*
> > > >
> > > >
> > > > -           * As of PostgreSQL 14, do not use pg_table_size() for indexes and
> > > >
> > > >
> > > > -           * sequences as it does not behave sanely for those.
> > > >
> > > >
> > > > -           *
> > > >             * As of PostgreSQL 9.0, use pg_table_size() to show a more accurate
> > > >             * size of a table, including FSM, VM and TOAST tables.
> > > >             */
> > > >
> > > >
> > > >
> > > > -          if (pset.sversion >= 90000)
> > > >
> > > >
> > > >
> > > > -          if (pset.sversion >= 140000)
> > > >
> > > >
> > > > -              appendPQExpBuffer(&buf,
> > > >
> > > >
> > > > -                                ",\\n CASE"
> > > >
> > > >
> > > > -                                " WHEN c.relkind in ("CppAsString2(RELKIND_INDEX)", "
> > > >
> > > >
> > > > -                                                      CppAsString2(RELKIND_PARTITIONED_INDEX)", "
> > > >
> > > >
> > > > -                                                      CppAsString2(RELKIND_SEQUENCE)") THEN"
> > > >
> > > >
> > > > -                                "     pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid))"
> > > >
> > > >
> > > > -                                " ELSE"
> > > >
> > > >
> > > > -                                "     pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid))"
> > > >
> > > >
> > > > -                                " END as \\"%s\\"",
> > > >
> > > >
> > > > -                                gettext_noop("Size"));
> > > >
> > > >
> > > > -          else if (pset.sversion >= 90000)
> > > >                appendPQExpBuffer(&buf,
> > > >                                  ",\\n  pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as \\"%s\\"",
> > > >                                  gettext_noop("Size"));
> > > >
> > > >
> > >
> > > Following on from point 3, if we decide to preserve the existing behavior,
> > > we wouldn't need this change, as it would be internalized within
> > > pg_table_size().
> >
> > We really should not decide to preserve the existing behaviour.
> >
> > I will reiterate my point: Returning 0 for things that do not have
> > storage, implies that it can be NON zero at some point. We should not
> > treat pg_table_size() as an alias for pg_relation_size().
>
> +1
>
> > >
> > > 4.
> > >
> > > > > -   return size;
> > > > >
> > > > > -   Assert(size < PG_INT64_MAX);
> > > > >
> > > > > -
> > > > > -   return (int64)size;
> > > > >     I assume that this change, and the other ones like that, aim to handle int64
> > > > >     overflow? Using the extra legroom of uint64 can still lead to an overflow,
> > > > >     however theoretical it may be. Wouldn't it be better to check for overflow
> > > > >     before adding to size rather than after the fact? Something along the lines of
> > > > >     checking for headroom left:
> > > > >     rel_size = table_relation_size(..);
> > > > >     if (rel_size > (PG_INT64_MAX - total_size))
> > > > >     < error codepath >
> > > > >
> > > > >
> > > > > total_size += rel_size;
> > > >
> > > > Actually not, the intention is not to handle overflow. The table_relation_size() returns a uint64 and the calling function returns int64.
> > > > The Assert() has been placed in order to denote that it is acknowledged that the two functions return different types. I was of the opinion that a run time check will not be needed as even the > smaller type can cover more than 9200 PetaByte tables.
> > > > If we were to change anything, then I would prefer to change the signature of the pg_*_size family of functions to return uint64 instead.
> > >
> > > Changing the signature would be the ideal change for all of this here.
> > > But Postgres does not have support for an unsigned 64bit integer (bigint
> > > is signed). One would need to turn to extensions such as [1]. Thus, +1
> > > to what Daniel said above.
> >
> > Apologies, I do not follow. Are you suggesting that we should
> > introduce overflow tests?
>
> Yes, to introduce the same overflow test that Daniel suggested above as
> returning a uint64 in PG does not really return a uint64 AFAIU (since
> the pg_**size() functions all return bigint which is signed and there
> is no uint64 user-facing type).

Status update for a commitfest entry.

This patch gets review comments and there was some discussion. It
seems we're waiting for the patch update. So I've moved this patch to
the next commitfest and set it to "Waiting on Author".

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

gkokolatos





‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, February 1, 2021 1:18 PM, Masahiko Sawada <[hidden email]> wrote:

> On Thu, Nov 12, 2020 at 2:54 AM Soumyadeep Chakraborty
> [hidden email] wrote:
>
> > Hey Georgios,
> > On Tue, Nov 10, 2020 at 6:20 AM [hidden email] wrote:
> >
> > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > > On Monday, November 9, 2020 7:50 PM, Soumyadeep Chakraborty [hidden email] wrote:
> > >
> > > > Hey Georgios,
> > > > Thanks for looking for more avenues to invoke tableAM APIS! Please find
> > > > my review below:
> > >
> > > A great review Soumyadeep, it is much appreciated.
> > > Please remember to add yourself as a reviewer in the commitfest
> > > [https://commitfest.postgresql.org/30/2701/]
> >
> > Ah yes. Sorry, I forgot that!
> >
> > > > On Tue, Oct 13, 2020 at 6:28 AM [hidden email] wrote:
> > > >
> > > > 1.
> > > >
> > > > > /*
> > > > >
> > > > > -   -   heap size, including FSM and VM
> > > > > -   -   table size, including FSM and VM
> > > > >         */
> > > > >
> > > >
> > > > We should not mention FSM and VM in dbsize.c at all as these are
> > > > heap AM specific. We can say:
> > > > table size, excluding TOAST relation
> > >
> > > Yeah, I was thinking that the notion that FSM and VM are still taken
> > > into account should be stated. We are iterating over ForkNumber
> > > after all.
> > > How about phrasing it as:
> > >
> > > -   table size, including all implemented forks from the AM (e.g. FSM, VM)
> > > -   excluding TOAST relations
> > >
> > > Thoughts?
> >
> > Yes, I was thinking along the same lines. The concept of a "fork" forced
> > should not be forced down into the tableAM. But that is a discussion for
> > another day. We can probably say:
> >
> > -   table size, including all implemented forks from the AM (e.g. FSM, VM
> > -   for the heap AM) excluding TOAST relations
> >
> > > > 2.
> > > >
> > > > > /*
> > > > >
> > > > > -   Size of toast relation
> > > > >     */
> > > > >     if (OidIsValid(rel->rd_rel->reltoastrelid))
> > > > >
> > > > > -   size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
> > > > >
> > > > > -   {
> > > > >
> > > > > -   Relation toastRel;
> > > > >
> > > > > -
> > > > > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> > > > >
> > > >
> > > > We can replace the OidIsValid check with relation_needs_toast_table()
> > > > and then have the OidIsValid() check in an Assert. Perhaps, that will
> > > > make things more readable.
> > >
> > > Please, allow me to kindly disagree.
> > > Relation is already open at this stage. Even create_toast_table(), the
> > > internal workhorse for creating toast relations, does check reltoastrelid
> > > to test if the relation is already toasted.
> > > Furthermore, we do call:
> > >
> > > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> > >
> > > and in order to avoid elog() errors underneath us, we ought to have
> > > verified the validity of reltoastrelid.
> > > In short, I think that the code in the proposal is not too unreadable
> > > nor that it breaks the coding patterns throughout the codebase.
> > > Am I too wrong?
> >
> > No not at all. The code in the proposal is indeed adhering to the
> > codebase. What I was going for here was to increase the usage of
> > relation_needs_toast_table(). What I meant was:
> > if (table_relation_needs_toast_table(rel))
> > {
> > if (!OidIsValid(rel->rd_rel->reltoastrelid))
> > {
> > elog(ERROR, <errmsg that toast table wasn't found>);
> > }
> > //size calculation here..
> > }
> > We want to error out if the toast table can't be found and the relation
> > is expected to have one, which the existing code doesn't handle.
> >
> > > > 3.
> > > >
> > > > > -   if (rel->rd_rel->relkind == RELKIND_RELATION ||
> > > > > -   rel->rd_rel->relkind == RELKIND_TOASTVALUE ||
> > > > > -   rel->rd_rel->relkind == RELKIND_MATVIEW)
> > > > > -   size = calculate_table_size(rel);
> > > > > -   else
> > > > > -   {
> > > > > -   relation_close(rel, AccessShareLock);
> > > > > -   PG_RETURN_NULL();
> > > > > -   }
> > > >
> > > > This leads to behavioral changes:
> > > > I am talking about the case where one calls pg_table_size() on an index.
> > > > W/o your patch, it returns the size of the index. W/ your patch it
> > > > returns NULL. Judging by the documentation, this function should not
> > > > ideally apply to indexes but it does! I have a sinking feeling that lots
> > > > of users use it for this purpose, as there is no function to calculate
> > > > the size of a single specific index (except pg_relation_size()).
> > > > The same argument I have made above applies to sequences. Users may have
> > > > trial-and-errored their way into finding that pg_table_size() can tell
> > > > them the size of a specific index/sequence! I don't know how widespread
> > > > the use is in the user community, so IMO maybe we should be conservative
> > > > and not introduce this change? Alternatively, we could call out that
> > > > pg_table_size() is only for tables by throwing an error if anything
> > > > other than a table is passed in.
> > > > If we decide to preserve the existing behavior of the pg_table_size():
> > > > It seems that for things not backed by the tableAM (indexes and
> > > > sequences), they should still go through calculate_relation_size().
> > > > We can call table_relation_size() based on if relkind is
> > > > RELKIND_RELATION, RELKIND_TOASTVALUE or RELKIND_MATVIEW. Perhaps it
> > > > might be worthwhile making a new macro RELKIND_HAS_TABLE_STORAGE to
> > > > capture these three cases (See RELKIND_HAS_STORAGE). This also ensures
> > > > that we return 0 for things that don't qualify as RELKIND_HAS_STORAGE,
> > > > such as a partitioned table (Currently w/ the patch applied, we return
> > > > NULL for those cases, which is another behavior change)
> > >
> > > Excellent point. This is the discussion I was longing to have.
> > > I stand by the decision coded in the patch, that pg_table_size() should
> > > return NULL for other kinds of relations, such as indexes, sequences
> > > etc.
> > > It is a conscious decision based on the following:
> > >
> > > -   Supported by the documentation, pg_table_size() applies to tables only.
> > >     For other uses the higher-level functions pg_total_relation_size() or
> > >     pg_relation_size() should be used.
> > >
> > > -   Commit fa352d662e taught pg_relation_size() and friends to return NULL if the object doesn't exist. This makes perfect sense for the
> > >     scenarios described in the commit:
> > >     That avoids errors when the functions are used in queries like
> > >     "SELECT pg_relation_size(oid) FROM pg_class",
> > >     and a table is dropped concurrently.
> > >
> > >
> > > IMHO: It is more consistent to return NULL when the relation does exist
> > > OR it is not a table kind.
> > >
> > > -   Returning 0 for things that do not have storage, is nonsensical because
> > >     it implies that it can be NON zero at some point. Things that do not
> > >     have storage have an unknown size.
> > >
> >
> > Fair. We will have to document the behavior change.
> >
> > > As far as for the argument that users might have trialed and errored
> > > their way into undocumented behaviour, I do not think it is strong
> > > enough to stop us from implementing the documented behaviour.
> >
> > Fair. I would strongly vote for having two additional functions
> > (pg_index_size() and pg_sequence_size()) to strongly signal our intent
> > of banning that kind of use of pg_table_size(). I think it would help
> > users a lot. It is not easy to find what function to call when you want
> > the size of a single index/sequence.
> >
> > > > 4.
> > > >
> > > > > @@ -3776,10 +3776,24 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
> > > > > gettext_noop("Access Method"));
> > > > >
> > > > >          /*
> > > > >
> > > > >
> > > > > -             * As of PostgreSQL 14, do not use pg_table_size() for indexes and
> > > > >
> > > > >
> > > > > -             * sequences as it does not behave sanely for those.
> > > > >
> > > > >
> > > > > -             *
> > > > >               * As of PostgreSQL 9.0, use pg_table_size() to show a more accurate
> > > > >               * size of a table, including FSM, VM and TOAST tables.
> > > > >               */
> > > > >
> > > > >
> > > > > -            if (pset.sversion >= 90000)
> > > > >
> > > > >
> > > > > -            if (pset.sversion >= 140000)
> > > > >
> > > > >
> > > > > -                appendPQExpBuffer(&buf,
> > > > >
> > > > >
> > > > > -                                  ",\\\\n CASE"
> > > > >
> > > > >
> > > > > -                                  " WHEN c.relkind in ("CppAsString2(RELKIND_INDEX)", "
> > > > >
> > > > >
> > > > > -                                                        CppAsString2(RELKIND_PARTITIONED_INDEX)", "
> > > > >
> > > > >
> > > > > -                                                        CppAsString2(RELKIND_SEQUENCE)") THEN"
> > > > >
> > > > >
> > > > > -                                  "     pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid))"
> > > > >
> > > > >
> > > > > -                                  " ELSE"
> > > > >
> > > > >
> > > > > -                                  "     pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid))"
> > > > >
> > > > >
> > > > > -                                  " END as \\\\"%s\\\\"",
> > > > >
> > > > >
> > > > > -                                  gettext_noop("Size"));
> > > > >
> > > > >
> > > > > -            else if (pset.sversion >= 90000)
> > > > >                  appendPQExpBuffer(&buf,
> > > > >                                    ",\\\\n  pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as \\\\"%s\\\\"",
> > > > >                                    gettext_noop("Size"));
> > > > >
> > > > >
> > > >
> > > > Following on from point 3, if we decide to preserve the existing behavior,
> > > > we wouldn't need this change, as it would be internalized within
> > > > pg_table_size().
> > >
> > > We really should not decide to preserve the existing behaviour.
> > > I will reiterate my point: Returning 0 for things that do not have
> > > storage, implies that it can be NON zero at some point. We should not
> > > treat pg_table_size() as an alias for pg_relation_size().
> >
> > +1
> >
> > > > 4.
> > > >
> > > > > > -   return size;
> > > > > >
> > > > > > -   Assert(size < PG_INT64_MAX);
> > > > > >
> > > > > > -
> > > > > > -   return (int64)size;
> > > > > >     I assume that this change, and the other ones like that, aim to handle int64
> > > > > >     overflow? Using the extra legroom of uint64 can still lead to an overflow,
> > > > > >     however theoretical it may be. Wouldn't it be better to check for overflow
> > > > > >     before adding to size rather than after the fact? Something along the lines of
> > > > > >     checking for headroom left:
> > > > > >     rel_size = table_relation_size(..);
> > > > > >     if (rel_size > (PG_INT64_MAX - total_size))
> > > > > >     < error codepath >
> > > > > >
> > > > > >
> > > > > > total_size += rel_size;
> > > > >
> > > > > Actually not, the intention is not to handle overflow. The table_relation_size() returns a uint64 and the calling function returns int64.
> > > > > The Assert() has been placed in order to denote that it is acknowledged that the two functions return different types. I was of the opinion that a run time check will not be needed as even the > smaller type can cover more than 9200 PetaByte tables.
> > > > > If we were to change anything, then I would prefer to change the signature of the pg_*_size family of functions to return uint64 instead.
> > > >
> > > > Changing the signature would be the ideal change for all of this here.
> > > > But Postgres does not have support for an unsigned 64bit integer (bigint
> > > > is signed). One would need to turn to extensions such as [1]. Thus, +1
> > > > to what Daniel said above.
> > >
> > > Apologies, I do not follow. Are you suggesting that we should
> > > introduce overflow tests?
> >
> > Yes, to introduce the same overflow test that Daniel suggested above as
> > returning a uint64 in PG does not really return a uint64 AFAIU (since
> > the pg_**size() functions all return bigint which is signed and there
> > is no uint64 user-facing type).
>
> Status update for a commitfest entry.
>
> This patch gets review comments and there was some discussion. It
> seems we're waiting for the patch update. So I've moved this patch to
> the next commitfest and set it to "Waiting on Author".
Thank you for your patience.

Please find v3 attached. I will reset the status to 'Ready for review'.

>
> Regards,
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Masahiko Sawada
> EDB: https://www.enterprisedb.com/


v3-0001-Attempt-to-make-dbsize-a-bit-more-consistent.patch (20K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

gkokolatos





‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, February 19, 2021 4:57 PM, <[hidden email]> wrote:

>
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Monday, February 1, 2021 1:18 PM, Masahiko Sawada [hidden email] wrote:
>
> > On Thu, Nov 12, 2020 at 2:54 AM Soumyadeep Chakraborty
> > [hidden email] wrote:
> >
> > > Hey Georgios,
> > > On Tue, Nov 10, 2020 at 6:20 AM [hidden email] wrote:
> > >
> > > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > > > On Monday, November 9, 2020 7:50 PM, Soumyadeep Chakraborty [hidden email] wrote:
> > > >
> > > > > Hey Georgios,
> > > > > Thanks for looking for more avenues to invoke tableAM APIS! Please find
> > > > > my review below:
> > > >
> > > > A great review Soumyadeep, it is much appreciated.
> > > > Please remember to add yourself as a reviewer in the commitfest
> > > > [https://commitfest.postgresql.org/30/2701/]
> > >
> > > Ah yes. Sorry, I forgot that!
> > >
> > > > > On Tue, Oct 13, 2020 at 6:28 AM [hidden email] wrote:
> > > > >
> > > > > 1.
> > > > >
> > > > > > /*
> > > > > >
> > > > > > -   -   heap size, including FSM and VM
> > > > > > -   -   table size, including FSM and VM
> > > > > >         */
> > > > > >
> > > > >
> > > > > We should not mention FSM and VM in dbsize.c at all as these are
> > > > > heap AM specific. We can say:
> > > > > table size, excluding TOAST relation
> > > >
> > > > Yeah, I was thinking that the notion that FSM and VM are still taken
> > > > into account should be stated. We are iterating over ForkNumber
> > > > after all.
> > > > How about phrasing it as:
> > > >
> > > > -   table size, including all implemented forks from the AM (e.g. FSM, VM)
> > > > -   excluding TOAST relations
> > > >
> > > > Thoughts?
> > >
> > > Yes, I was thinking along the same lines. The concept of a "fork" forced
> > > should not be forced down into the tableAM. But that is a discussion for
> > > another day. We can probably say:
> > >
> > > -   table size, including all implemented forks from the AM (e.g. FSM, VM
> > > -   for the heap AM) excluding TOAST relations
> > >
> > > > > 2.
> > > > >
> > > > > > /*
> > > > > >
> > > > > > -   Size of toast relation
> > > > > >     */
> > > > > >     if (OidIsValid(rel->rd_rel->reltoastrelid))
> > > > > >
> > > > > > -   size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
> > > > > >
> > > > > > -   {
> > > > > >
> > > > > > -   Relation toastRel;
> > > > > >
> > > > > > -
> > > > > > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> > > > > >
> > > > >
> > > > > We can replace the OidIsValid check with relation_needs_toast_table()
> > > > > and then have the OidIsValid() check in an Assert. Perhaps, that will
> > > > > make things more readable.
> > > >
> > > > Please, allow me to kindly disagree.
> > > > Relation is already open at this stage. Even create_toast_table(), the
> > > > internal workhorse for creating toast relations, does check reltoastrelid
> > > > to test if the relation is already toasted.
> > > > Furthermore, we do call:
> > > >
> > > > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> > > >
> > > > and in order to avoid elog() errors underneath us, we ought to have
> > > > verified the validity of reltoastrelid.
> > > > In short, I think that the code in the proposal is not too unreadable
> > > > nor that it breaks the coding patterns throughout the codebase.
> > > > Am I too wrong?
> > >
> > > No not at all. The code in the proposal is indeed adhering to the
> > > codebase. What I was going for here was to increase the usage of
> > > relation_needs_toast_table(). What I meant was:
> > > if (table_relation_needs_toast_table(rel))
> > > {
> > > if (!OidIsValid(rel->rd_rel->reltoastrelid))
> > > {
> > > elog(ERROR, <errmsg that toast table wasn't found>);
> > > }
> > > //size calculation here..
> > > }
> > > We want to error out if the toast table can't be found and the relation
> > > is expected to have one, which the existing code doesn't handle.
> > >
> > > > > 3.
> > > > >
> > > > > > -   if (rel->rd_rel->relkind == RELKIND_RELATION ||
> > > > > > -   rel->rd_rel->relkind == RELKIND_TOASTVALUE ||
> > > > > > -   rel->rd_rel->relkind == RELKIND_MATVIEW)
> > > > > > -   size = calculate_table_size(rel);
> > > > > > -   else
> > > > > > -   {
> > > > > > -   relation_close(rel, AccessShareLock);
> > > > > > -   PG_RETURN_NULL();
> > > > > > -   }
> > > > >
> > > > > This leads to behavioral changes:
> > > > > I am talking about the case where one calls pg_table_size() on an index.
> > > > > W/o your patch, it returns the size of the index. W/ your patch it
> > > > > returns NULL. Judging by the documentation, this function should not
> > > > > ideally apply to indexes but it does! I have a sinking feeling that lots
> > > > > of users use it for this purpose, as there is no function to calculate
> > > > > the size of a single specific index (except pg_relation_size()).
> > > > > The same argument I have made above applies to sequences. Users may have
> > > > > trial-and-errored their way into finding that pg_table_size() can tell
> > > > > them the size of a specific index/sequence! I don't know how widespread
> > > > > the use is in the user community, so IMO maybe we should be conservative
> > > > > and not introduce this change? Alternatively, we could call out that
> > > > > pg_table_size() is only for tables by throwing an error if anything
> > > > > other than a table is passed in.
> > > > > If we decide to preserve the existing behavior of the pg_table_size():
> > > > > It seems that for things not backed by the tableAM (indexes and
> > > > > sequences), they should still go through calculate_relation_size().
> > > > > We can call table_relation_size() based on if relkind is
> > > > > RELKIND_RELATION, RELKIND_TOASTVALUE or RELKIND_MATVIEW. Perhaps it
> > > > > might be worthwhile making a new macro RELKIND_HAS_TABLE_STORAGE to
> > > > > capture these three cases (See RELKIND_HAS_STORAGE). This also ensures
> > > > > that we return 0 for things that don't qualify as RELKIND_HAS_STORAGE,
> > > > > such as a partitioned table (Currently w/ the patch applied, we return
> > > > > NULL for those cases, which is another behavior change)
> > > >
> > > > Excellent point. This is the discussion I was longing to have.
> > > > I stand by the decision coded in the patch, that pg_table_size() should
> > > > return NULL for other kinds of relations, such as indexes, sequences
> > > > etc.
> > > > It is a conscious decision based on the following:
> > > >
> > > > -   Supported by the documentation, pg_table_size() applies to tables only.
> > > >     For other uses the higher-level functions pg_total_relation_size() or
> > > >     pg_relation_size() should be used.
> > > >
> > > > -   Commit fa352d662e taught pg_relation_size() and friends to return NULL if the object doesn't exist. This makes perfect sense for the
> > > >     scenarios described in the commit:
> > > >     That avoids errors when the functions are used in queries like
> > > >     "SELECT pg_relation_size(oid) FROM pg_class",
> > > >     and a table is dropped concurrently.
> > > >
> > > >
> > > > IMHO: It is more consistent to return NULL when the relation does exist
> > > > OR it is not a table kind.
> > > >
> > > > -   Returning 0 for things that do not have storage, is nonsensical because
> > > >     it implies that it can be NON zero at some point. Things that do not
> > > >     have storage have an unknown size.
> > > >
> > >
> > > Fair. We will have to document the behavior change.
> > >
> > > > As far as for the argument that users might have trialed and errored
> > > > their way into undocumented behaviour, I do not think it is strong
> > > > enough to stop us from implementing the documented behaviour.
> > >
> > > Fair. I would strongly vote for having two additional functions
> > > (pg_index_size() and pg_sequence_size()) to strongly signal our intent
> > > of banning that kind of use of pg_table_size(). I think it would help
> > > users a lot. It is not easy to find what function to call when you want
> > > the size of a single index/sequence.
> > >
> > > > > 4.
> > > > >
> > > > > > @@ -3776,10 +3776,24 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
> > > > > > gettext_noop("Access Method"));
> > > > > >
> > > > > >          /*
> > > > > >
> > > > > >
> > > > > > -               * As of PostgreSQL 14, do not use pg_table_size() for indexes and
> > > > > >
> > > > > >
> > > > > > -               * sequences as it does not behave sanely for those.
> > > > > >
> > > > > >
> > > > > > -               *
> > > > > >                 * As of PostgreSQL 9.0, use pg_table_size() to show a more accurate
> > > > > >                 * size of a table, including FSM, VM and TOAST tables.
> > > > > >                 */
> > > > > >
> > > > > >
> > > > > > -              if (pset.sversion >= 90000)
> > > > > >
> > > > > >
> > > > > > -              if (pset.sversion >= 140000)
> > > > > >
> > > > > >
> > > > > > -                  appendPQExpBuffer(&buf,
> > > > > >
> > > > > >
> > > > > > -                                    ",\\\\\\\\n CASE"
> > > > > >
> > > > > >
> > > > > > -                                    " WHEN c.relkind in ("CppAsString2(RELKIND_INDEX)", "
> > > > > >
> > > > > >
> > > > > > -                                                          CppAsString2(RELKIND_PARTITIONED_INDEX)", "
> > > > > >
> > > > > >
> > > > > > -                                                          CppAsString2(RELKIND_SEQUENCE)") THEN"
> > > > > >
> > > > > >
> > > > > > -                                    "     pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid))"
> > > > > >
> > > > > >
> > > > > > -                                    " ELSE"
> > > > > >
> > > > > >
> > > > > > -                                    "     pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid))"
> > > > > >
> > > > > >
> > > > > > -                                    " END as \\\\\\\\"%s\\\\\\\\"",
> > > > > >
> > > > > >
> > > > > > -                                    gettext_noop("Size"));
> > > > > >
> > > > > >
> > > > > > -              else if (pset.sversion >= 90000)
> > > > > >                    appendPQExpBuffer(&buf,
> > > > > >                                      ",\\\\\\\\n  pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as \\\\\\\\"%s\\\\\\\\"",
> > > > > >                                      gettext_noop("Size"));
> > > > > >
> > > > > >
> > > > >
> > > > > Following on from point 3, if we decide to preserve the existing behavior,
> > > > > we wouldn't need this change, as it would be internalized within
> > > > > pg_table_size().
> > > >
> > > > We really should not decide to preserve the existing behaviour.
> > > > I will reiterate my point: Returning 0 for things that do not have
> > > > storage, implies that it can be NON zero at some point. We should not
> > > > treat pg_table_size() as an alias for pg_relation_size().
> > >
> > > +1
> > >
> > > > > 4.
> > > > >
> > > > > > > -   return size;
> > > > > > >
> > > > > > > -   Assert(size < PG_INT64_MAX);
> > > > > > >
> > > > > > > -
> > > > > > > -   return (int64)size;
> > > > > > >     I assume that this change, and the other ones like that, aim to handle int64
> > > > > > >     overflow? Using the extra legroom of uint64 can still lead to an overflow,
> > > > > > >     however theoretical it may be. Wouldn't it be better to check for overflow
> > > > > > >     before adding to size rather than after the fact? Something along the lines of
> > > > > > >     checking for headroom left:
> > > > > > >     rel_size = table_relation_size(..);
> > > > > > >     if (rel_size > (PG_INT64_MAX - total_size))
> > > > > > >     < error codepath >
> > > > > > >
> > > > > > >
> > > > > > > total_size += rel_size;
> > > > > >
> > > > > > Actually not, the intention is not to handle overflow. The table_relation_size() returns a uint64 and the calling function returns int64.
> > > > > > The Assert() has been placed in order to denote that it is acknowledged that the two functions return different types. I was of the opinion that a run time check will not be needed as even the > smaller type can cover more than 9200 PetaByte tables.
> > > > > > If we were to change anything, then I would prefer to change the signature of the pg_*_size family of functions to return uint64 instead.
> > > > >
> > > > > Changing the signature would be the ideal change for all of this here.
> > > > > But Postgres does not have support for an unsigned 64bit integer (bigint
> > > > > is signed). One would need to turn to extensions such as [1]. Thus, +1
> > > > > to what Daniel said above.
> > > >
> > > > Apologies, I do not follow. Are you suggesting that we should
> > > > introduce overflow tests?
> > >
> > > Yes, to introduce the same overflow test that Daniel suggested above as
> > > returning a uint64 in PG does not really return a uint64 AFAIU (since
> > > the pg_**size() functions all return bigint which is signed and there
> > > is no uint64 user-facing type).
> >
> > Status update for a commitfest entry.
> > This patch gets review comments and there was some discussion. It
> > seems we're waiting for the patch update. So I've moved this patch to
> > the next commitfest and set it to "Waiting on Author".
>
> Thank you for your patience.
>
> Please find v3 attached. I will reset the status to 'Ready for review'.


Version 3 of the patch broke the cfbot. Please find v4 attached.

This patch requires a catalog version bump.

Cheers,
//Georgios -- https://www.vmware.com


>
> > Regards,
> >
> > Masahiko Sawada
> > EDB: https://www.enterprisedb.com/




Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

gkokolatos





‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, February 24, 2021 3:34 PM, <[hidden email]> wrote:

>
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Friday, February 19, 2021 4:57 PM, [hidden email] wrote:
>
> > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > On Monday, February 1, 2021 1:18 PM, Masahiko Sawada [hidden email] wrote:
> >
> > > On Thu, Nov 12, 2020 at 2:54 AM Soumyadeep Chakraborty
> > > [hidden email] wrote:
> > >
> > > > Hey Georgios,
> > > > On Tue, Nov 10, 2020 at 6:20 AM [hidden email] wrote:
> > > >
> > > > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > > > > On Monday, November 9, 2020 7:50 PM, Soumyadeep Chakraborty [hidden email] wrote:
> > > > >
> > > > > > Hey Georgios,
> > > > > > Thanks for looking for more avenues to invoke tableAM APIS! Please find
> > > > > > my review below:
> > > > >
> > > > > A great review Soumyadeep, it is much appreciated.
> > > > > Please remember to add yourself as a reviewer in the commitfest
> > > > > [https://commitfest.postgresql.org/30/2701/]
> > > >
> > > > Ah yes. Sorry, I forgot that!
> > > >
> > > > > > On Tue, Oct 13, 2020 at 6:28 AM [hidden email] wrote:
> > > > > >
> > > > > > 1.
> > > > > >
> > > > > > > /*
> > > > > > >
> > > > > > > -   -   heap size, including FSM and VM
> > > > > > > -   -   table size, including FSM and VM
> > > > > > >         */
> > > > > > >
> > > > > >
> > > > > > We should not mention FSM and VM in dbsize.c at all as these are
> > > > > > heap AM specific. We can say:
> > > > > > table size, excluding TOAST relation
> > > > >
> > > > > Yeah, I was thinking that the notion that FSM and VM are still taken
> > > > > into account should be stated. We are iterating over ForkNumber
> > > > > after all.
> > > > > How about phrasing it as:
> > > > >
> > > > > -   table size, including all implemented forks from the AM (e.g. FSM, VM)
> > > > > -   excluding TOAST relations
> > > > >
> > > > > Thoughts?
> > > >
> > > > Yes, I was thinking along the same lines. The concept of a "fork" forced
> > > > should not be forced down into the tableAM. But that is a discussion for
> > > > another day. We can probably say:
> > > >
> > > > -   table size, including all implemented forks from the AM (e.g. FSM, VM
> > > > -   for the heap AM) excluding TOAST relations
> > > >
> > > > > > 2.
> > > > > >
> > > > > > > /*
> > > > > > >
> > > > > > > -   Size of toast relation
> > > > > > >     */
> > > > > > >     if (OidIsValid(rel->rd_rel->reltoastrelid))
> > > > > > >
> > > > > > > -   size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
> > > > > > >
> > > > > > > -   {
> > > > > > >
> > > > > > > -   Relation toastRel;
> > > > > > >
> > > > > > > -
> > > > > > > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> > > > > > >
> > > > > >
> > > > > > We can replace the OidIsValid check with relation_needs_toast_table()
> > > > > > and then have the OidIsValid() check in an Assert. Perhaps, that will
> > > > > > make things more readable.
> > > > >
> > > > > Please, allow me to kindly disagree.
> > > > > Relation is already open at this stage. Even create_toast_table(), the
> > > > > internal workhorse for creating toast relations, does check reltoastrelid
> > > > > to test if the relation is already toasted.
> > > > > Furthermore, we do call:
> > > > >
> > > > > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> > > > >
> > > > > and in order to avoid elog() errors underneath us, we ought to have
> > > > > verified the validity of reltoastrelid.
> > > > > In short, I think that the code in the proposal is not too unreadable
> > > > > nor that it breaks the coding patterns throughout the codebase.
> > > > > Am I too wrong?
> > > >
> > > > No not at all. The code in the proposal is indeed adhering to the
> > > > codebase. What I was going for here was to increase the usage of
> > > > relation_needs_toast_table(). What I meant was:
> > > > if (table_relation_needs_toast_table(rel))
> > > > {
> > > > if (!OidIsValid(rel->rd_rel->reltoastrelid))
> > > > {
> > > > elog(ERROR, <errmsg that toast table wasn't found>);
> > > > }
> > > > //size calculation here..
> > > > }
> > > > We want to error out if the toast table can't be found and the relation
> > > > is expected to have one, which the existing code doesn't handle.
> > > >
> > > > > > 3.
> > > > > >
> > > > > > > -   if (rel->rd_rel->relkind == RELKIND_RELATION ||
> > > > > > > -   rel->rd_rel->relkind == RELKIND_TOASTVALUE ||
> > > > > > > -   rel->rd_rel->relkind == RELKIND_MATVIEW)
> > > > > > > -   size = calculate_table_size(rel);
> > > > > > > -   else
> > > > > > > -   {
> > > > > > > -   relation_close(rel, AccessShareLock);
> > > > > > > -   PG_RETURN_NULL();
> > > > > > > -   }
> > > > > >
> > > > > > This leads to behavioral changes:
> > > > > > I am talking about the case where one calls pg_table_size() on an index.
> > > > > > W/o your patch, it returns the size of the index. W/ your patch it
> > > > > > returns NULL. Judging by the documentation, this function should not
> > > > > > ideally apply to indexes but it does! I have a sinking feeling that lots
> > > > > > of users use it for this purpose, as there is no function to calculate
> > > > > > the size of a single specific index (except pg_relation_size()).
> > > > > > The same argument I have made above applies to sequences. Users may have
> > > > > > trial-and-errored their way into finding that pg_table_size() can tell
> > > > > > them the size of a specific index/sequence! I don't know how widespread
> > > > > > the use is in the user community, so IMO maybe we should be conservative
> > > > > > and not introduce this change? Alternatively, we could call out that
> > > > > > pg_table_size() is only for tables by throwing an error if anything
> > > > > > other than a table is passed in.
> > > > > > If we decide to preserve the existing behavior of the pg_table_size():
> > > > > > It seems that for things not backed by the tableAM (indexes and
> > > > > > sequences), they should still go through calculate_relation_size().
> > > > > > We can call table_relation_size() based on if relkind is
> > > > > > RELKIND_RELATION, RELKIND_TOASTVALUE or RELKIND_MATVIEW. Perhaps it
> > > > > > might be worthwhile making a new macro RELKIND_HAS_TABLE_STORAGE to
> > > > > > capture these three cases (See RELKIND_HAS_STORAGE). This also ensures
> > > > > > that we return 0 for things that don't qualify as RELKIND_HAS_STORAGE,
> > > > > > such as a partitioned table (Currently w/ the patch applied, we return
> > > > > > NULL for those cases, which is another behavior change)
> > > > >
> > > > > Excellent point. This is the discussion I was longing to have.
> > > > > I stand by the decision coded in the patch, that pg_table_size() should
> > > > > return NULL for other kinds of relations, such as indexes, sequences
> > > > > etc.
> > > > > It is a conscious decision based on the following:
> > > > >
> > > > > -   Supported by the documentation, pg_table_size() applies to tables only.
> > > > >     For other uses the higher-level functions pg_total_relation_size() or
> > > > >     pg_relation_size() should be used.
> > > > >
> > > > > -   Commit fa352d662e taught pg_relation_size() and friends to return NULL if the object doesn't exist. This makes perfect sense for the
> > > > >     scenarios described in the commit:
> > > > >     That avoids errors when the functions are used in queries like
> > > > >     "SELECT pg_relation_size(oid) FROM pg_class",
> > > > >     and a table is dropped concurrently.
> > > > >
> > > > >
> > > > > IMHO: It is more consistent to return NULL when the relation does exist
> > > > > OR it is not a table kind.
> > > > >
> > > > > -   Returning 0 for things that do not have storage, is nonsensical because
> > > > >     it implies that it can be NON zero at some point. Things that do not
> > > > >     have storage have an unknown size.
> > > > >
> > > >
> > > > Fair. We will have to document the behavior change.
> > > >
> > > > > As far as for the argument that users might have trialed and errored
> > > > > their way into undocumented behaviour, I do not think it is strong
> > > > > enough to stop us from implementing the documented behaviour.
> > > >
> > > > Fair. I would strongly vote for having two additional functions
> > > > (pg_index_size() and pg_sequence_size()) to strongly signal our intent
> > > > of banning that kind of use of pg_table_size(). I think it would help
> > > > users a lot. It is not easy to find what function to call when you want
> > > > the size of a single index/sequence.
> > > >
> > > > > > 4.
> > > > > >
> > > > > > > @@ -3776,10 +3776,24 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
> > > > > > > gettext_noop("Access Method"));
> > > > > > >
> > > > > > >          /*
> > > > > > >
> > > > > > >
> > > > > > > -                 * As of PostgreSQL 14, do not use pg_table_size() for indexes and
> > > > > > >
> > > > > > >
> > > > > > > -                 * sequences as it does not behave sanely for those.
> > > > > > >
> > > > > > >
> > > > > > > -                 *
> > > > > > >                   * As of PostgreSQL 9.0, use pg_table_size() to show a more accurate
> > > > > > >                   * size of a table, including FSM, VM and TOAST tables.
> > > > > > >                   */
> > > > > > >
> > > > > > >
> > > > > > > -                if (pset.sversion >= 90000)
> > > > > > >
> > > > > > >
> > > > > > > -                if (pset.sversion >= 140000)
> > > > > > >
> > > > > > >
> > > > > > > -                    appendPQExpBuffer(&buf,
> > > > > > >
> > > > > > >
> > > > > > > -                                      ",\\\\\\\\\\\\\\\\n CASE"
> > > > > > >
> > > > > > >
> > > > > > > -                                      " WHEN c.relkind in ("CppAsString2(RELKIND_INDEX)", "
> > > > > > >
> > > > > > >
> > > > > > > -                                                            CppAsString2(RELKIND_PARTITIONED_INDEX)", "
> > > > > > >
> > > > > > >
> > > > > > > -                                                            CppAsString2(RELKIND_SEQUENCE)") THEN"
> > > > > > >
> > > > > > >
> > > > > > > -                                      "     pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid))"
> > > > > > >
> > > > > > >
> > > > > > > -                                      " ELSE"
> > > > > > >
> > > > > > >
> > > > > > > -                                      "     pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid))"
> > > > > > >
> > > > > > >
> > > > > > > -                                      " END as \\\\\\\\\\\\\\\\"%s\\\\\\\\\\\\\\\\"",
> > > > > > >
> > > > > > >
> > > > > > > -                                      gettext_noop("Size"));
> > > > > > >
> > > > > > >
> > > > > > > -                else if (pset.sversion >= 90000)
> > > > > > >                      appendPQExpBuffer(&buf,
> > > > > > >                                        ",\\\\\\\\\\\\\\\\n  pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as \\\\\\\\\\\\\\\\"%s\\\\\\\\\\\\\\\\"",
> > > > > > >                                        gettext_noop("Size"));
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > Following on from point 3, if we decide to preserve the existing behavior,
> > > > > > we wouldn't need this change, as it would be internalized within
> > > > > > pg_table_size().
> > > > >
> > > > > We really should not decide to preserve the existing behaviour.
> > > > > I will reiterate my point: Returning 0 for things that do not have
> > > > > storage, implies that it can be NON zero at some point. We should not
> > > > > treat pg_table_size() as an alias for pg_relation_size().
> > > >
> > > > +1
> > > >
> > > > > > 4.
> > > > > >
> > > > > > > > -   return size;
> > > > > > > >
> > > > > > > > -   Assert(size < PG_INT64_MAX);
> > > > > > > >
> > > > > > > > -
> > > > > > > > -   return (int64)size;
> > > > > > > >     I assume that this change, and the other ones like that, aim to handle int64
> > > > > > > >     overflow? Using the extra legroom of uint64 can still lead to an overflow,
> > > > > > > >     however theoretical it may be. Wouldn't it be better to check for overflow
> > > > > > > >     before adding to size rather than after the fact? Something along the lines of
> > > > > > > >     checking for headroom left:
> > > > > > > >     rel_size = table_relation_size(..);
> > > > > > > >     if (rel_size > (PG_INT64_MAX - total_size))
> > > > > > > >     < error codepath >
> > > > > > > >
> > > > > > > >
> > > > > > > > total_size += rel_size;
> > > > > > >
> > > > > > > Actually not, the intention is not to handle overflow. The table_relation_size() returns a uint64 and the calling function returns int64.
> > > > > > > The Assert() has been placed in order to denote that it is acknowledged that the two functions return different types. I was of the opinion that a run time check will not be needed as even the > smaller type can cover more than 9200 PetaByte tables.
> > > > > > > If we were to change anything, then I would prefer to change the signature of the pg_*_size family of functions to return uint64 instead.
> > > > > >
> > > > > > Changing the signature would be the ideal change for all of this here.
> > > > > > But Postgres does not have support for an unsigned 64bit integer (bigint
> > > > > > is signed). One would need to turn to extensions such as [1]. Thus, +1
> > > > > > to what Daniel said above.
> > > > >
> > > > > Apologies, I do not follow. Are you suggesting that we should
> > > > > introduce overflow tests?
> > > >
> > > > Yes, to introduce the same overflow test that Daniel suggested above as
> > > > returning a uint64 in PG does not really return a uint64 AFAIU (since
> > > > the pg_**size() functions all return bigint which is signed and there
> > > > is no uint64 user-facing type).
> > >
> > > Status update for a commitfest entry.
> > > This patch gets review comments and there was some discussion. It
> > > seems we're waiting for the patch update. So I've moved this patch to
> > > the next commitfest and set it to "Waiting on Author".
> >
> > Thank you for your patience.
> > Please find v3 attached. I will reset the status to 'Ready for review'.
>
> Version 3 of the patch broke the cfbot. Please find v4 attached.
>
> This patch requires a catalog version bump.
Now with attachment. Apologies for the chatter.

>
> Cheers,
> //Georgios -- https://www.vmware.com
>
> > > Regards,
> > > Masahiko Sawada
> > > EDB: https://www.enterprisedb.com/


v4-0001-Attempt-to-make-dbsize-a-bit-more-consistent.patch (20K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

Michael Paquier-2
On Wed, Feb 24, 2021 at 02:35:51PM +0000, [hidden email] wrote:
> Now with attachment. Apologies for the chatter.

The patch has no documentation for the two new functions, so it is a
bit hard to understand what is the value brought here, and what is the
goal wanted just by reading the patch, except that this switches the
size reported for views to NULL instead of zero bytes by reading the
regression tests.

Please note that reporting zero is not wrong for views IMO as these
have no physical storage, so, while it is possible to argue that a
size of zero could imply that this relation size could not be zero, it
will never change, so I'd rather let this behavior as it is.  I
would bet, additionally, that this could break existing use cases.
Reporting NULL, on the contrary, as your patch does, makes things
worse in some ways because that's what pg_relation_size would report
when a relation does not exist anymore.  Imagine for example the case
of a DROP TABLE run in parallel of a scan of pg_class using
pg_relation_size().  So it becomes impossible to know if a relation
has been removed or not.  This joins some points raised by Soumyadeep
upthread.

Anyway, as mentioned by other people upthread, I am not really
convinced either that we should have more flavors of size functions,
particularly depending on the relkind as this would be confusing for
the end-user.  pg_relation_size() can already do this job for all
relkinds that use segment files, so it should still be able to hold
its ground and maintain a consistent behavior with what it does
currently.

+static int64
+calculate_table_fork_size(Relation rel, ForkNumber forkNum)
+{
+   return (int64)table_relation_size(rel, forkNum);
+}
Why isn't that just unsigned, like table_relation_size()?  This is an
internal function so it does not matter to changes its signature, but
I think that you had better do a cast at a higher level instead.

        for (int i = 0; i < MAX_FORKNUM; i++)
-           nblocks += smgrnblocks(rel->rd_smgr, i);
+           nblocks += smgrexists(rel->rd_smgr, i)
+                       ? smgrnblocks(rel->rd_smgr, i)
+                       : 0;
Is that actually the part for views?  Why is that done this way?

+   if (rel->rd_rel->relkind == RELKIND_RELATION ||
+       rel->rd_rel->relkind == RELKIND_TOASTVALUE ||
+       rel->rd_rel->relkind == RELKIND_MATVIEW)
+       size = calculate_table_fork_size(rel,
+                forkname_to_number(text_to_cstring(forkName)));
+   else if (rel->rd_rel->relkind == RELKIND_INDEX)
+       size = calculate_relation_size(&(rel->rd_node), rel->rd_backend,
+                forkname_to_number(text_to_cstring(forkName)));
+   else
+   {
+       relation_close(rel, AccessShareLock);
+       PG_RETURN_NULL();
+   }
The approach you are taking to bring some consistency in all that by
making the size estimations go through table AMs via
table_relation_size() is promising.  However, this code breaks the
size estimation for sequences, which is not good.  If attempting to
use an evaluation that's based on a table AM, shouldn't this code use
a check based on rd_tableam rather than the relkind then?  We assume
that rd_tableam is set depending on the relkind in relcache.c, for
one.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

gkokolatos





‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, March 15, 2021 7:10 AM, Michael Paquier <[hidden email]> wrote:

&gt; On Wed, Feb 24, 2021 at 02:35:51PM +0000, [hidden email] wrote:
&gt;
&gt; &gt; Now with attachment. Apologies for the chatter.
&gt;
&gt; The patch has no documentation for the two new functions, so it is a
&gt; bit hard to understand what is the value brought here, and what is the
&gt; goal wanted just by reading the patch, except that this switches the
&gt; size reported for views to NULL instead of zero bytes by reading the
&gt; regression tests.

Understood and agreed. Thank you very much for looking.

&gt;
&gt; Please note that reporting zero is not wrong for views IMO as these
&gt; have no physical storage, so, while it is possible to argue that a
&gt; size of zero could imply that this relation size could not be zero, it
&gt; will never change, so I'd rather let this behavior as it is. I
&gt; would bet, additionally, that this could break existing use cases.
&gt; Reporting NULL, on the contrary, as your patch does, makes things
&gt; worse in some ways because that's what pg_relation_size would report
&gt; when a relation does not exist anymore. Imagine for example the case
&gt; of a DROP TABLE run in parallel of a scan of pg_class using
&gt; pg_relation_size(). So it becomes impossible to know if a relation
&gt; has been removed or not. This joins some points raised by Soumyadeep
&gt; upthread.

Yeah, I would have agreed with you before actually looking a bit closer.
I think that the gist of it is that the intention of the functions and
the actual usage/documentation of those have diverged. I honestly thought
that pg_relation_size() was a general function intended for any kind of
relation, whereas pg_table_size() was a function intended only for tables
(toast and mat views included).

Then I read the comment describing pg_relation_size() which reads
'disk space usage for the main fork of the specified table or index' and
'disk space usage for the specified fork of a table or index', found
initially in commit 358a897fa1d and maintained since.

But I digress.


&gt;
&gt; Anyway, as mentioned by other people upthread, I am not really
&gt; convinced either that we should have more flavors of size functions,
&gt; particularly depending on the relkind as this would be confusing for
&gt; the end-user. pg_relation_size() can already do this job for all
&gt; relkinds that use segment files, so it should still be able to hold
&gt; its ground and maintain a consistent behavior with what it does
&gt; currently.

I must have missunderstood the other people upthread and I thought
that new flavours were requested. Thank you for clarifying and
correcting me.

&gt;
&gt; +static int64
&gt; +calculate_table_fork_size(Relation rel, ForkNumber forkNum)
&gt; +{
&gt;
&gt; -   return (int64)table_relation_size(rel, forkNum);
&gt;     +}
&gt;     Why isn't that just unsigned, like table_relation_size()? This is an
&gt;     internal function so it does not matter to changes its signature, but
&gt;     I think that you had better do a cast at a higher level instead.
&gt;
&gt;     for (int i = 0; i &lt; MAX_FORKNUM; i++)
&gt;
&gt;
&gt; -             nblocks += smgrnblocks(rel-&gt;rd_smgr, i);
&gt;
&gt;
&gt;
&gt; -             nblocks += smgrexists(rel-&gt;rd_smgr, i)
&gt;
&gt;
&gt; -                         ? smgrnblocks(rel-&gt;rd_smgr, i)
&gt;
&gt;
&gt; -                         : 0;
&gt;
&gt;
&gt;
&gt; Is that actually the part for views? Why is that done this way?

No, it is not for views. The codebase should never reach this function
for a view or it would be a serious bug elsewhere.

This is addressing a bug in table_relation_size(). This function is correctly
not requiring for its caller to know any specifics about the forks. A heap
table is not required to have created any fsm, or vm, forks neither. Finally
smgrnblocks() assumes that the fork actually exists or errors out.

This change, makes certain that calling table_relation_size() with a
non existing fork will not generate an error.

&gt;
&gt; -   if (rel-&gt;rd_rel-&gt;relkind == RELKIND_RELATION ||
&gt;
&gt; -         rel-&gt;rd_rel-&gt;relkind == RELKIND_TOASTVALUE ||
&gt;
&gt;
&gt; -         rel-&gt;rd_rel-&gt;relkind == RELKIND_MATVIEW)
&gt;
&gt;
&gt; -         size = calculate_table_fork_size(rel,
&gt;
&gt;
&gt; -                  forkname_to_number(text_to_cstring(forkName)));
&gt;
&gt;
&gt; -   else if (rel-&gt;rd_rel-&gt;relkind == RELKIND_INDEX)
&gt;
&gt; -         size = calculate_relation_size(&amp;(rel-&gt;rd_node), rel-&gt;rd_backend,
&gt;
&gt;
&gt; -                  forkname_to_number(text_to_cstring(forkName)));
&gt;
&gt;
&gt; -   else
&gt;
&gt; -   {
&gt;
&gt; -         relation_close(rel, AccessShareLock);
&gt;
&gt;
&gt; -         PG_RETURN_NULL();
&gt;
&gt;
&gt; -   }
&gt;     The approach you are taking to bring some consistency in all that by
&gt;     making the size estimations go through table AMs via
&gt;     table_relation_size() is promising. However, this code breaks the
&gt;     size estimation for sequences, which is not good. If attempting to
&gt;     use an evaluation that's based on a table AM, shouldn't this code use
&gt;     a check based on rd_tableam rather than the relkind then? We assume
&gt;     that rd_tableam is set depending on the relkind in relcache.c, for
&gt;     one.


Thank you for the remarks. Please find v5 attached which is a minimal
patch to try to use the table am api if the relation is using the table
am api. Otherwise all else remains the same.

Cheers,
Georgios


&gt;     --
&gt;     Michael
&gt;

</[hidden email]>

v5-0001-Attempt-to-make-dbsize-a-bit-more-consistent.patch (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

Michael Paquier-2
In reply to this post by Michael Paquier-2
On Mon, Mar 15, 2021 at 03:10:59PM +0900, Michael Paquier wrote:
> Anyway, as mentioned by other people upthread, I am not really
> convinced either that we should have more flavors of size functions,
> particularly depending on the relkind as this would be confusing for
> the end-user.  pg_relation_size() can already do this job for all
> relkinds that use segment files, so it should still be able to hold
> its ground and maintain a consistent behavior with what it does
> currently.

On top of the rest of my notes, there are two more things that would
face a behavior change if making the existing functions go through
table AMs, which would scan the data in the smgr:
- After a VACUUM, the relation would be reported with a size of 0,
while that may not be the case of on-disk files yet.
- Temporary tables of other sessions would be accessible.

So we would likely want a separate function.  Another possibility,
which I find tempting, would be to push down the calculation logic
relying on physical files down to the table AM themselves with a new
dedicated callback (relation_size_physical?), as it seems to me that
the most important thing users want to know with this set of functions
is how much physical space is being consumed at one given point in
time.  Attached is a small prototype I was just playing with.
--
Michael

dbsize_play.patch (6K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: PATCH: Attempt to make dbsize a bit more consistent

gkokolatos





‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, March 17, 2021 6:35 AM, Michael Paquier <[hidden email]> wrote:

> On Mon, Mar 15, 2021 at 03:10:59PM +0900, Michael Paquier wrote:
>
> > Anyway, as mentioned by other people upthread, I am not really
> > convinced either that we should have more flavors of size functions,
> > particularly depending on the relkind as this would be confusing for
> > the end-user. pg_relation_size() can already do this job for all
> > relkinds that use segment files, so it should still be able to hold
> > its ground and maintain a consistent behavior with what it does
> > currently.
>
> On top of the rest of my notes, there are two more things that would
> face a behavior change if making the existing functions go through
> table AMs, which would scan the data in the smgr:

I am not certain if you are referring to v5 (sent earlier than your mail)
or v4. Can you please verify?

>
> -   After a VACUUM, the relation would be reported with a size of 0,
>     while that may not be the case of on-disk files yet.

I am not really following. Apologies for it. The table AM may or may not
choose to go through smgr, depending on the implementation. The only
currently known implementation, heap, does invalidate smgr, based on
what I can see, after a VACUUM. I have not been able to create a test
case both with or without v5 of the patch where not the same result
would be returned.

What have I missed?

>
> -   Temporary tables of other sessions would be accessible.

I am not really certain I am following. Commit 6919b7e3294 from 2012
notes that calculate_relation_size can be safely applied to temp tables
of other sessions. v5 of the patch does not change that behaviour. Nor
did previous versions, but those are already obsolete.

>
>     So we would likely want a separate function. Another possibility,
>     which I find tempting, would be to push down the calculation logic
>     relying on physical files down to the table AM themselves with a new
>     dedicated callback (relation_size_physical?), as it seems to me that
>     the most important thing users want to know with this set of functions
>     is how much physical space is being consumed at one given point in
>     time. Attached is a small prototype I was just playing with.
>     --
>     Michael
>




12