Monitoring disk space from within the server

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
22 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Monitoring disk space from within the server

Christoph Berg-2
Monitoring the available disk space is the topmost thing on the
priority for PostgreSQL operation, yet this metric is not available
from the SQL level.

The attached patch implements a function pg_tablespace_statfs(tblspc)
to report disk space numbers per tablespace:

# select * from pg_tablespace_statfs('pg_default');
  blocks   │  bfree   │  bavail  │  files   │  ffree
───────────┼──────────┼──────────┼──────────┼──────────
 103179564 │ 20829222 │ 20815126 │ 26214400 │ 24426295

Open points:
* should these numbers be converted to bytes?
* the column names currently mirror the statfs() names and should
  certainly be improved
* which of these columns add to \db+ output?
* possibly extend this (and \db) to pg_wal

Christoph

0001-Add-pg_tablespace_statfs-functions.patch (5K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Julien Rouhaud
On Fri, Nov 8, 2019 at 2:24 PM Christoph Berg <[hidden email]> wrote:

>
> Monitoring the available disk space is the topmost thing on the
> priority for PostgreSQL operation, yet this metric is not available
> from the SQL level.
>
> The attached patch implements a function pg_tablespace_statfs(tblspc)
> to report disk space numbers per tablespace:
>
> # select * from pg_tablespace_statfs('pg_default');
>   blocks   │  bfree   │  bavail  │  files   │  ffree
> ───────────┼──────────┼──────────┼──────────┼──────────
>  103179564 │ 20829222 │ 20815126 │ 26214400 │ 24426295
>
> Open points:
> * should these numbers be converted to bytes?
> * the column names currently mirror the statfs() names and should
>   certainly be improved
> * which of these columns add to \db+ output?
> * possibly extend this (and \db) to pg_wal

Shouldn't we have something more generic, in hope that this eventually
get implemented on Windows?  I'm also wondering if getting the fs
information is enough, as there might be quota.


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Christoph Berg-2
Re: Julien Rouhaud 2019-11-08 <[hidden email]>
> Shouldn't we have something more generic, in hope that this eventually
> get implemented on Windows?  I'm also wondering if getting the fs
> information is enough, as there might be quota.

The name is certainly not a good pick, it's not meant to be a raw
statfs() wrapper but something more high-level. I just went with that
to have something working to start with.

How about these?
pg_tablespace_stats()
pg_tablespace_space()
pg_tablespace_disk_space()

Christoph


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Christoph Berg-2
In reply to this post by Julien Rouhaud
Re: Julien Rouhaud 2019-11-08 <[hidden email]>
> I'm also wondering if getting the fs
> information is enough, as there might be quota.

We could append the quotactl(Q_GETQUOTA) information as well, but I'm
not sure this has a sensible actual-users-to-noise ratio.

Christoph


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Julien Rouhaud
In reply to this post by Christoph Berg-2
On Fri, Nov 8, 2019 at 2:35 PM Christoph Berg <[hidden email]> wrote:

>
> Re: Julien Rouhaud 2019-11-08 <[hidden email]>
> > Shouldn't we have something more generic, in hope that this eventually
> > get implemented on Windows?  I'm also wondering if getting the fs
> > information is enough, as there might be quota.
>
> The name is certainly not a good pick, it's not meant to be a raw
> statfs() wrapper but something more high-level. I just went with that
> to have something working to start with.
>
> How about these?
> pg_tablespace_stats()
> pg_tablespace_space()
> pg_tablespace_disk_space()

The related function on Windows is apparently GetDiskFreeSpaceA [1].
It'll probably be quite hard to get something consistent for most of
counters, so probably pg_tablespace_(disk_)space is the best name,
providing only total size and free size?

[1] https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getdiskfreespacea


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Julien Rouhaud
In reply to this post by Christoph Berg-2
On Fri, Nov 8, 2019 at 2:40 PM Christoph Berg <[hidden email]> wrote:
>
> Re: Julien Rouhaud 2019-11-08 <[hidden email]>
> > I'm also wondering if getting the fs
> > information is enough, as there might be quota.
>
> We could append the quotactl(Q_GETQUOTA) information as well, but I'm
> not sure this has a sensible actual-users-to-noise ratio.

Well, having a quota is one of the few real reason to create a
tablespace so it's probably worth it, although  I have to agree that I
seldom saw quota in production.


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Christoph Berg-2
In reply to this post by Julien Rouhaud
Re: Julien Rouhaud 2019-11-08 <[hidden email]>
> The related function on Windows is apparently GetDiskFreeSpaceA [1].

There's a link to GetDiskFreeSpaceExA() which seems much easier to use
because it accepts any directory on the drive in question:

https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getdiskfreespaceexa

> It'll probably be quite hard to get something consistent for most of
> counters, so probably pg_tablespace_(disk_)space is the best name,
> providing only total size and free size?

So, how about:

pg_tablespace_disk_space -> total_bytes | free_bytes

The inode numbers are probably not very interesting in a PG tablespace
as we don't create that many files. Or do we think including these
counters (on UNIX) makes sense?

There's precedents for leaving fields NULL where not supported by the
OS, for example pg_stat_file() returns "change" on UNIX only, and
"creation" on Windows only.

Christoph


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Christoph Berg-2
In reply to this post by Julien Rouhaud
Re: Julien Rouhaud 2019-11-08 <[hidden email]>
> > We could append the quotactl(Q_GETQUOTA) information as well, but I'm
> > not sure this has a sensible actual-users-to-noise ratio.
>
> Well, having a quota is one of the few real reason to create a
> tablespace so it's probably worth it, although  I have to agree that I
> seldom saw quota in production.

Given that PG deals badly with one tablespace being full (might work
in production, but if it's full during recovery, the whole server will
stop), I've never seen quotas being used.

Christoph


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Julien Rouhaud
In reply to this post by Christoph Berg-2
On Fri, Nov 8, 2019 at 2:58 PM Christoph Berg <[hidden email]> wrote:

>
> Re: Julien Rouhaud 2019-11-08 <[hidden email]>
> > The related function on Windows is apparently GetDiskFreeSpaceA [1].
>
> There's a link to GetDiskFreeSpaceExA() which seems much easier to use
> because it accepts any directory on the drive in question:
>
> https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getdiskfreespaceexa
>
> > It'll probably be quite hard to get something consistent for most of
> > counters, so probably pg_tablespace_(disk_)space is the best name,
> > providing only total size and free size?
>
> So, how about:
>
> pg_tablespace_disk_space -> total_bytes | free_bytes
>
> The inode numbers are probably not very interesting in a PG tablespace
> as we don't create that many files. Or do we think including these
> counters (on UNIX) makes sense?

Agreed, inodes are probably not very useful there.

> There's precedents for leaving fields NULL where not supported by the
> OS, for example pg_stat_file() returns "change" on UNIX only, and
> "creation" on Windows only.
> [...]
> Given that PG deals badly with one tablespace being full (might work
> in production, but if it's full during recovery, the whole server will
> stop), I've never seen quotas being used.

I'm +1 on "pg_tablespace_disk_space -> total_bytes | free_bytes"


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Tomas Vondra-4
In reply to this post by Christoph Berg-2
On Fri, Nov 08, 2019 at 02:24:19PM +0100, Christoph Berg wrote:
>Monitoring the available disk space is the topmost thing on the
>priority for PostgreSQL operation, yet this metric is not available
>from the SQL level.
>

While I agree monitoring disk space is important, I think pretty much
every deployment already does that using some other monitoring tool
(which also monitors million other things).

Also, I wonder how universal / reliable this actually is, considering
the range of filesystems and related stuff (thin provisioning, quotas,
...) people use in production. I do recall a number of cases when "df"
was showing a plenty of free space, but one of the internal resources
for that particular filesystem was exhausted. I doubt it's desirable to
add all this knowledge into PostgreSQL.

It's not clear to me what issue this is actually meant to solve - it
provides data, which is nice, but it still needs to be fed to some
motinoring and alerting system. And every monitoring system has a good
plugin to collect this type of data, so why not to use that?

Surely, we can't rely on this for any internal logic - so why not to
provide this as an extension?


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Christoph Berg-2
Re: Tomas Vondra 2019-11-08 <20191108145025.d7pfcip6plufxiah@development>
> While I agree monitoring disk space is important, I think pretty much
> every deployment already does that using some other monitoring tool
> (which also monitors million other things).

There are plenty of deployments where that isn't true, either because
they aren't doing any monitoring, or (probably more commonly) because
the OS monitoring is done by the OS department and the DB department
doesn't have good access to the figures, and possibly not even any
shell access.

Offering the numbers on the database level would make monitoring
easier for these users, and also provide the numbers on the level
where they might be useful. ("Do I have enough disk space to load this
5GB dump now?")

> Also, I wonder how universal / reliable this actually is, considering
> the range of filesystems and related stuff (thin provisioning, quotas,
> ...) people use in production. I do recall a number of cases when "df"
> was showing a plenty of free space, but one of the internal resources
> for that particular filesystem was exhausted. I doubt it's desirable to
> add all this knowledge into PostgreSQL.

That might be partly true, e.g. btrfs traditionally didn't support
"df" but only "btrfs df". But this got fixed in the meantime, and just
because there are weird filesystems doesn't mean we shouldn't try to
support the normal case where statfs() just works.

> It's not clear to me what issue this is actually meant to solve - it
> provides data, which is nice, but it still needs to be fed to some
> motinoring and alerting system. And every monitoring system has a good
> plugin to collect this type of data, so why not to use that?

What's wrong with providing nice data? It doesn't hurt to have it.
And the cost of the implementation is low.

> Surely, we can't rely on this for any internal logic - so why not to
> provide this as an extension?

By the same argument you could also argue that \l+ should be an
extension because database size is optional to know.

I think this should be directly in core because it's useful to a wide
range of users.

Christoph


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Christoph Berg-2
Re: To Tomas Vondra 2019-11-08 <[hidden email]>
> I think this should be directly in core because it's useful to a wide
> range of users.

Also, I want to have it in \db+ in psql where users would actually be
looking for it.

Christoph


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Tomas Vondra-4
In reply to this post by Christoph Berg-2
On Fri, Nov 08, 2019 at 04:06:21PM +0100, Christoph Berg wrote:

>Re: Tomas Vondra 2019-11-08 <20191108145025.d7pfcip6plufxiah@development>
>> While I agree monitoring disk space is important, I think pretty much
>> every deployment already does that using some other monitoring tool
>> (which also monitors million other things).
>
>There are plenty of deployments where that isn't true, either because
>they aren't doing any monitoring, or (probably more commonly) because
>the OS monitoring is done by the OS department and the DB department
>doesn't have good access to the figures, and possibly not even any
>shell access.
>

It might sound a bit annoying, but I suspect deployments where the DBAs
does not have access to such basic system metrics have bigger issues and
giving them one particular piece of information is just a bandaid.

>Offering the numbers on the database level would make monitoring
>easier for these users, and also provide the numbers on the level
>where they might be useful. ("Do I have enough disk space to load this
>5GB dump now?")
>
>> Also, I wonder how universal / reliable this actually is, considering
>> the range of filesystems and related stuff (thin provisioning, quotas,
>> ...) people use in production. I do recall a number of cases when "df"
>> was showing a plenty of free space, but one of the internal resources
>> for that particular filesystem was exhausted. I doubt it's desirable to
>> add all this knowledge into PostgreSQL.
>
>That might be partly true, e.g. btrfs traditionally didn't support
>"df" but only "btrfs df". But this got fixed in the meantime, and just
>because there are weird filesystems doesn't mean we shouldn't try to
>support the normal case where statfs() just works.
>

Maybe, but if you suggest to show the information in \dn+ then we should
at least know how common / likely those issues are. Because if they are
anything but extremely uncommon, we'll get plenty of bogus bug reports
complaining about inaccurate information.

>> It's not clear to me what issue this is actually meant to solve - it
>> provides data, which is nice, but it still needs to be fed to some
>> motinoring and alerting system. And every monitoring system has a good
>> plugin to collect this type of data, so why not to use that?
>
>What's wrong with providing nice data? It doesn't hurt to have it.
>And the cost of the implementation is low.
>

My point was that there are other (better, already existing) ways to get
the data, and getting them through database means extra complexity (I do
agree it's not a lot of code at this point, though).

Of course, if your assumption is that using those other ways to get the
data is impossible, then sure - adding this makes sense.

>> Surely, we can't rely on this for any internal logic - so why not to
>> provide this as an extension?
>
>By the same argument you could also argue that \l+ should be an
>extension because database size is optional to know.
>

That's not really my argument, though. I'm not suggesting everything
"optional" should be in an extension, but that there's nothing forcing
us to make this directly part of the core.

And packaging stuff into extension has advantages too (independent dev
cycle, and so on).

>I think this should be directly in core because it's useful to a wide
>range of users.
>

I'm not convinced that's actually true. It might be, but I don't have
any data to support it (or vice versa).


cheers

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Michael Paquier-2
In reply to this post by Julien Rouhaud
On Fri, Nov 08, 2019 at 03:10:57PM +0100, Julien Rouhaud wrote:
> Agreed, inodes are probably not very useful there.

Total bytes and free bytes looks like a good first cut.  Have you
looked at the portability of statfs() on other BSD flavors and
Solaris?  I recall from a lookup at statvfs() that these may not be
present everywhere.  I'd think that we can live with a configure
switch and complain with an error or a warning if we are out of
options on a given platform.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Christoph Moench-Tegeder
## Michael Paquier ([hidden email]):

> Total bytes and free bytes looks like a good first cut.  Have you
> looked at the portability of statfs() on other BSD flavors and
> Solaris?

"The statfs() system call first appeared in 4.4BSD." (from the statfs(2)
manpage on FreeBSD). struct statfs differs between Linux and BSD, but
is "close enough" for this, the fields from the original patch are
present in both implementations.
Solaris does not have statfs() anymore. Instead, it has a statvfs()
which is "more or less equivalent" to the Linux statvfs(). On FreeBSD,
using statvfs() (it's available) is rather not recommended, from the
man page:
  The statvfs() and fstatvfs() functions fill the structure pointed
  to by buf with garbage. This garbage will occasionally bear resemblance
  to file system statistics, but portable applications must not depend on
  this.
That's funny, as statvfs() is in our beloved POSIX.1 since at least
2001 - current specs:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/fstatvfs.html

Regards,
Christoph

--
Spare Space


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Laurenz Albe
In reply to this post by Christoph Berg-2
On Fri, 2019-11-08 at 14:24 +0100, Christoph Berg wrote:

> Monitoring the available disk space is the topmost thing on the
> priority for PostgreSQL operation, yet this metric is not available
> from the SQL level.
>
> The attached patch implements a function pg_tablespace_statfs(tblspc)
> to report disk space numbers per tablespace:
>
> # select * from pg_tablespace_statfs('pg_default');
>   blocks   │  bfree   │  bavail  │  files   │  ffree
> ───────────┼──────────┼──────────┼──────────┼──────────
>  103179564 │ 20829222 │ 20815126 │ 26214400 │ 24426295
>
> Open points:
> * should these numbers be converted to bytes?
> * the column names currently mirror the statfs() names and should
>   certainly be improved
> * which of these columns add to \db+ output?
> * possibly extend this (and \db) to pg_wal

Will this work on Windows?
A quick web search seems to indicate that Windows has no statfs(2).

What's more is that the Linux man page says that statfs(2) is
Linux-specific.

I think that if we have such a feature (which I think would be useful)
should be available for all operating systems supported by PostgreSQL.

Yours,
Laurenz Albe



Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Michael Paquier-2
In reply to this post by Christoph Moench-Tegeder
On Sat, Nov 09, 2019 at 02:33:49PM +0100, Christoph Moench-Tegeder wrote:

> "The statfs() system call first appeared in 4.4BSD." (from the statfs(2)
> manpage on FreeBSD). struct statfs differs between Linux and BSD, but
> is "close enough" for this, the fields from the original patch are
> present in both implementations.
> Solaris does not have statfs() anymore. Instead, it has a statvfs()
> which is "more or less equivalent" to the Linux statvfs(). On FreeBSD,
> using statvfs() (it's available) is rather not recommended, from the
> man page:
>   The statvfs() and fstatvfs() functions fill the structure pointed
>   to by buf with garbage. This garbage will occasionally bear resemblance
>   to file system statistics, but portable applications must not depend on
>   this.
> That's funny, as statvfs() is in our beloved POSIX.1 since at least
> 2001 - current specs:
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/fstatvfs.html
Thanks for looking at that.  The point of FreeBSD is interesting to
know.  So this basically would leave us with the following hierarchy
to grab the data:
1) statfs()
2) statvfs()
3) Windows-specific implementation
4) Complain if nothing is present

For the free space, then we just need (f_bsize * f_bfree), and the
total is (f_blocks * f_bsize).

Any opinions?
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Michael Paquier-2
In reply to this post by Laurenz Albe
On Mon, Nov 11, 2019 at 09:11:35PM +0100, Laurenz Albe wrote:
> Will this work on Windows?
> A quick web search seems to indicate that Windows has no statfs(2).

It won't.  We are actually discussing the compatibility aspects and
the minimal data set we could grab in a different part of the thread.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Daniel Gustafsson
In reply to this post by Michael Paquier-2
> On 12 Nov 2019, at 02:46, Michael Paquier <[hidden email]> wrote:

> Any opinions?

I agree with Tomas upthread that it's unclear whether this needs to be in core.
There are many system parameters a database admin is likely to be interested
in, diskspace being just one of them (albeit a very important one for many
reasons), and there is nothing that makes the SQL interface (or postgres core
for that matter) particularly more suited for this job than other existing
tools.

Why is SQL level crucial for this?

cheers ./daniel

Reply | Threaded
Open this post in threaded view
|

Re: Monitoring disk space from within the server

Michael Paquier-2
On Tue, Nov 12, 2019 at 09:47:47AM +0100, Daniel Gustafsson wrote:
> I agree with Tomas upthread that it's unclear whether this needs to be in core.
> There are many system parameters a database admin is likely to be interested
> in, diskspace being just one of them (albeit a very important one for many
> reasons), and there is nothing that makes the SQL interface (or postgres core
> for that matter) particularly more suited for this job than other existing
> tools.
>
> Why is SQL level crucial for this?

Because this makes the monitoring experience easier from a remote
perspective.  FWIW, I have cases it would have been useful to monitor
out the physical amount of space available with the amount of space
covered by bloated tables for a given set of tablespaces through a
single source.
--
Michael

signature.asc (849 bytes) Download Attachment
12