file cloning in pg_upgrade and CREATE DATABASE

Peter Eisentraut-6
Here is another attempt at implementing file cloning for pg_upgrade and
CREATE DATABASE.  The idea is to take advantage of file systems that can
make copy-on-write clones, which would make the copy run much faster.
For pg_upgrade, this will give the performance of --link mode without
the associated drawbacks.

There have been patches proposed previously [0][1].  The concerns there
were mainly that they required a Linux-specific ioctl() call and only
worked for Btrfs.

Some new things have happened since then:

- XFS has (optional) reflink support.  This file system is probably more
widely used than Btrfs.

- Linux and glibc have a proper function to do this now.

- APFS on macOS supports file cloning.

So altogether this feature will be more widely usable and less ugly to
implement.  Note, however, that you will currently need literally the
latest glibc release, so it probably won't be accessible right now
unless you are using Fedora 28 for example.  (This is the
copy_file_range() function that had us recently rename the same function
in pg_rewind.)
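
For readers who have not used the call, here is a minimal sketch of a
whole-file copy built on copy_file_range(), assuming Linux with glibc
2.27 or later.  The helper name copy_whole_file() is hypothetical and is
not taken from the patch; error handling is reduced to returning -1.
Whether the kernel actually shares blocks (reflink) or just copies
in-kernel depends on the file system.

```c
#define _GNU_SOURCE             /* exposes copy_file_range() in glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (the name is not from the patch): copy all of "from"
 * into a newly created file "to" using copy_file_range().
 * Returns 0 on success, or -1 with errno set.
 */
int
copy_whole_file(const char *from, const char *to)
{
    struct stat st;
    off_t       remaining;
    int         src;
    int         dst;
    int         save_errno;

    assert(from != NULL && to != NULL);

    if ((src = open(from, O_RDONLY)) < 0)
        return -1;
    if (fstat(src, &st) < 0 ||
        (dst = open(to, O_WRONLY | O_CREAT | O_EXCL, 0600)) < 0)
    {
        save_errno = errno;
        close(src);
        errno = save_errno;
        return -1;
    }

    /* The kernel may copy fewer bytes than requested, so loop. */
    remaining = st.st_size;
    while (remaining > 0)
    {
        ssize_t     n = copy_file_range(src, NULL, dst, NULL,
                                        (size_t) remaining, 0);

        if (n < 0)
        {
            save_errno = errno;
            close(src);
            close(dst);
            unlink(to);         /* do not leave a partial file behind */
            errno = save_errno;
            return -1;
        }
        if (n == 0)
            break;              /* source shrank while we were copying */
        remaining -= n;
    }

    close(src);
    return close(dst);
}
```

Passing NULL for both offsets makes the call use and advance the file
positions of the two descriptors, which keeps the loop short.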

Some example measurements:

6 GB database, pg_upgrade unpatched 30 seconds, patched 3 seconds (XFS
and APFS)

similar for a CREATE DATABASE from a large template

Even if you don't have a file system with cloning support, the special
library calls make copying faster.  For example, on APFS, in this
example, an unpatched CREATE DATABASE takes 30 seconds, with the library
call (but without cloning) it takes 10 seconds.

For amusement/bewilderment, without the recent flush optimization on
APFS, this takes 2 minutes 30 seconds.  I suppose this optimization will
now actually be obsolete, since macOS will no longer hit that code.


[0]:
https://www.postgresql.org/message-id/flat/513C0E7C.5080606%40socialserve.com

[1]:
https://www.postgresql.org/message-id/flat/20140213030731.GE4831%40momjian.us
--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

0001-Use-file-cloning-in-pg_upgrade-and-CREATE-DATABASE.patch (12K) Download Attachment

Re: file cloning in pg_upgrade and CREATE DATABASE

Robert Haas
On Tue, Feb 20, 2018 at 10:00 PM, Peter Eisentraut
<[hidden email]> wrote:

> Some example measurements:
>
> 6 GB database, pg_upgrade unpatched 30 seconds, patched 3 seconds (XFS
> and APFS)
>
> similar for a CREATE DATABASE from a large template
>
> Even if you don't have a file system with cloning support, the special
> library calls make copying faster.  For example, on APFS, in this
> example, an unpatched CREATE DATABASE takes 30 seconds, with the library
> call (but without cloning) it takes 10 seconds.

Nice results.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: file cloning in pg_upgrade and CREATE DATABASE

Tomas Vondra-4
In reply to this post by Peter Eisentraut-6
On 02/21/2018 04:00 AM, Peter Eisentraut wrote:
> ...
>
> Some example measurements:
>
> 6 GB database, pg_upgrade unpatched 30 seconds, patched 3 seconds (XFS
> and APFS)
>
> similar for a CREATE DATABASE from a large template
>

Nice improvement, of course. How does that affect performance on the
cloned database? If I understand this correctly, it essentially enables
CoW on the files, so what's the overhead on that? It'd be unfortunate to
speed up CREATE DATABASE only to get degraded performance later.

In any case, I find this interesting mainly for the pg_upgrade use case.  On
running systems I think the main issue with CREATE DATABASE is that it
forces a checkpoint.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: file cloning in pg_upgrade and CREATE DATABASE

Peter Eisentraut-6
On 2/21/18 18:57, Tomas Vondra wrote:
> Nice improvement, of course. How does that affect performance on the
> cloned database? If I understand this correctly, it essentially enables
> CoW on the files, so what's the overhead on that? It'd be unfortunate to
> speed up CREATE DATABASE only to get degraded performance later.

I ran a little test (on APFS and XFS): Create a large (unlogged) table,
copy the database, then delete everything from the table in the copy.
That should need to CoW all the blocks.  It has about the same
performance with cloning, possibly slightly faster.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: file cloning in pg_upgrade and CREATE DATABASE

Michael Paquier-2
In reply to this post by Peter Eisentraut-6
On Tue, Feb 20, 2018 at 10:00:04PM -0500, Peter Eisentraut wrote:
> Some new things have happened since then:
>
> - XFS has (optional) reflink support.  This file system is probably more
> widely used than Btrfs.

Btrfs is still in development, and I think not many people would use
it in production.

> - Linux and glibc have a proper function to do this now.
>
> - APFS on macOS supports file cloning.

So copyfile() is only part of macOS?  I am not able to find references
in FreeBSD, NetBSD or OpenBSD, but I may be missing something.

> So altogether this feature will be more widely usable and less ugly to
> implement.  Note, however, that you will currently need literally the
> latest glibc release, so it probably won't be accessible right now
> unless you are using Fedora 28 for example.  (This is the
> copy_file_range() function that had us recently rename the same function
> in pg_rewind.)

For reference, Debian SID is using glibc 2.27.  ArchLinux is still on
2.26.

> Some example measurements:
>
> 6 GB database, pg_upgrade unpatched 30 seconds, patched 3 seconds (XFS
> and APFS)

Interesting.  I'll try to test that on an XFS partition and see if I can
see a difference.  For now I have just read through the patch.

+#ifdef HAVE_COPYFILE
+ if (copyfile(fromfile, tofile, NULL,
+#ifdef COPYFILE_CLONE
+ COPYFILE_CLONE
+#else
+               COPYFILE_DATA
+#endif
+                       ) < 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not copy file \"%s\" to \"%s\": %m", fromfile, tofile)));
+#else
        copy_file(fromfile, tofile);
+#endif

Any backend-side callers of copy_file() would not benefit from
copyfile() on OSX.  Shouldn't all that handling be inside copy_file(),
similarly to what your patch actually does for pg_upgrade?  I think that
you should also consider fcopyfile() instead of copyfile() as it works
directly on the file descriptors and shares the same error handling as
the others.
--
Michael


Re: file cloning in pg_upgrade and CREATE DATABASE

Michael Paquier-2
On Mon, Mar 19, 2018 at 04:06:36PM +0900, Michael Paquier wrote:
> Any backend-side callers of copy_file() would not benefit from
> copyfile() on OSX.  Shouldn't all that handling be inside copy_file(),
> similarly to what your patch actually does for pg_upgrade?  I think that
> you should also consider fcopyfile() instead of copyfile() as it works
> directly on the file descriptors and shares the same error handling as
> the others.

Two other things I have noticed as well:
1) src/bin/pg_rewind/copy_fetch.c could benefit from similar speed-ups I
think when copying data from source to target using the local mode of
pg_rewind.  This could really improve cases where new relations are
added after a promotion.
2) XLogFileCopy() uses a copy logic as well.  For large segments things
could be improved, however we need to be careful about filling in the
end of segments with zeros.
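
To illustrate the zero-fill concern in point 2, here is a minimal
sketch of copying only the first "upto" bytes of a segment and padding
the rest with zeros.  It is loosely modeled on the description above,
not on the actual XLogFileCopy() code; the names
copy_segment_zero_padded() and write_all() are hypothetical.  A
whole-file clone would carry over the source's tail instead, so it
would need a separate zeroing step.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Write exactly "len" bytes, retrying after short writes.  Returns 0 or -1. */
static int
write_all(int fd, const char *buf, size_t len)
{
    while (len > 0)
    {
        ssize_t     n = write(fd, buf, len);

        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t) n;
    }
    return 0;
}

/*
 * Hypothetical sketch: copy the first "upto" bytes of src into dst, then
 * pad dst with zeros until exactly "segsize" bytes have been written.
 * Returns 0 on success, -1 on I/O error or if src is shorter than "upto".
 */
int
copy_segment_zero_padded(int src, int dst, size_t upto, size_t segsize)
{
    char        buf[8192];
    size_t      done = 0;

    assert(upto <= segsize);

    while (done < segsize)
    {
        size_t      chunk = segsize - done;
        size_t      filled = 0;

        if (chunk > sizeof(buf))
            chunk = sizeof(buf);
        memset(buf, 0, chunk);  /* whatever we do not read stays zero */

        /* Read from the source only while we are still below "upto". */
        while (done + filled < upto && filled < chunk)
        {
            size_t      want = upto - (done + filled);
            ssize_t     n;

            if (want > chunk - filled)
                want = chunk - filled;
            n = read(src, buf + filled, want);
            if (n <= 0)
                return -1;      /* read error, or source shorter than upto */
            filled += (size_t) n;
        }

        if (write_all(dst, buf, chunk) < 0)
            return -1;
        done += chunk;
    }
    return 0;
}
```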
--
Michael


Re: file cloning in pg_upgrade and CREATE DATABASE

Michael Paquier-2
On Mon, Mar 19, 2018 at 04:14:15PM +0900, Michael Paquier wrote:
> Two other things I have noticed as well:
> 1) src/bin/pg_rewind/copy_fetch.c could benefit from similar speed-ups I
> think when copying data from source to target using the local mode of
> pg_rewind.  This could really improve cases where new relations are
> added after a promotion.
> 2) XLogFileCopy() uses a copy logic as well.  For large segments things
> could be improved, however we need to be careful about filling in the
> end of segments with zeros.

I have been thinking about this patch over the night, and here is a list
of bullet points which would be nice to tackle:
- Remove the current diff in copydir.
- Extend copy_file so that it is able to use fcopyfile.
- Move the work done in pg_upgrade into a common API which can also be
used by pg_rewind.  One place would be to have a
frontend-only API in src/common which does the leg work.  I would
recommend working only on file descriptors as well for consistency with
copy_file_range.
- Add proper wait events for the backend calls.  Those are missing for
copy_file_range and copyfile.
- For XLogFileCopy, the problem may be trickier as the tail of a segment
is filled with zeroes, so dropping it from the first version of the
patch sounds wiser.

The patch is switched to waiting on author, and I have set myself as a
reviewer.

Thanks,
--
Michael


Re: file cloning in pg_upgrade and CREATE DATABASE

Peter Eisentraut-6
On 3/19/18 22:58, Michael Paquier wrote:
> I have been thinking about this patch over the night, and here is a list
> of bullet points which would be nice to tackle:
> - Remove the current diff in copydir.

done

> - Extend copy_file so as it is able to use fcopyfile.

fcopyfile() does not support cloning.  (This is not documented.)

> - Move the work done in pg_upgrade into a common API which can also be
> used by pg_rewind.  One place would be to have a
> frontend-only API in src/common which does the leg work.  I would
> recommend working only on file descriptors as well for consistency with
> copy_file_range.

pg_upgrade copies files, whereas pg_rewind needs to copy file ranges.
So I don't think this is going to be a good match.

We could add support for using Linux copy_file_range() in pg_rewind, but
that would work a bit differently.  I also don't have a good sense of
how to test the performance of that.

Another thing to think about is that we go through some trouble to
initialize new WAL files so that the disk space is fully allocated.  If
we used file cloning calls in pg_rewind, that would potentially
invalidate some of that.  At least, we'd have to think through this more
carefully.

> - Add proper wait events for the backend calls.  Those are missing for
> copy_file_range and copyfile.

done

> - For XLogFileCopy, the problem may be trickier as the tail of a segment
> is filled with zeroes, so dropping it from the first version of the
> patch sounds wiser.

Seems like a possible follow-on project.  But see also under pg_rewind
above.

Another oddity is that pg_upgrade uses CopyFile() on Windows, but the
backend does not.  The Git log shows that the backend used to use
CopyFile(), but that was then removed when the generic code was added,
but when pg_upgrade was imported, it came with the CopyFile() call.

I suspect the CopyFile() call can be quite a bit faster, so we should
consider adding it back in.  Or if not, remove it from pg_upgrade.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v2-0001-Use-file-cloning-in-pg_upgrade-and-CREATE-DATABAS.patch (13K) Download Attachment

Re: file cloning in pg_upgrade and CREATE DATABASE

Michael Paquier-2
On Tue, Mar 20, 2018 at 10:55:04AM -0400, Peter Eisentraut wrote:
> On 3/19/18 22:58, Michael Paquier wrote:
>> - Extend copy_file so as it is able to use fcopyfile.
>
> fcopyfile() does not support cloning.  (This is not documented.)

You are right.  I have been reading the documentation here to get an
idea as I don't have a macos system at hand:
https://www.unix.com/man-page/osx/3/fcopyfile/

However I have bumped into that:
http://www.openradar.me/30706426

Future versions will apparently be fixed.

>> - Move the work done in pg_upgrade into a common API which can also be
>> used by pg_rewind.  One place would be to have a
>> frontend-only API in src/common which does the leg work.  I would
>> recommend working only on file descriptors as well for consistency with
>> copy_file_range.
>
> pg_upgrade copies files, whereas pg_rewind needs to copy file ranges.
> So I don't think this is going to be a good match.
>
> We could add support for using Linux copy_file_range() in pg_rewind, but
> that would work a bit differently.  I also don't have a good sense of
> how to test the performance of that.

One simple way to test that would be to limit the time it takes to scan
the WAL segments on the target so that the file map is computed quickly,
and create many, say, gigabyte-size relations on the promoted source
which will need to be copied from the source to the target.

> Another thing to think about is that we go through some trouble to
> initialize new WAL files so that the disk space is fully allocated.  If
> we used file cloning calls in pg_rewind, that would potentially
> invalidate some of that.  At least, we'd have to think through this more
> carefully.

Agreed.  Let's keep such things in mind but come up with a sane first cut
of this patch based on the time remaining in this commit fest.

>> - Add proper wait events for the backend calls.  Those are missing for
>> copy_file_range and copyfile.
>
> done

+         <entry><literal>CopyFileCopy</literal></entry>
+         <entry>Waiting for a file copy operation (if the copying is done by
+         an operating system call rather than as separate read and write
+         operations).</entry>

CopyFileCopy is... redundant.  Perhaps CopyFileSystem or CopyFileRange?

>> - For XLogFileCopy, the problem may be trickier as the tail of a segment
>> is filled with zeroes, so dropping it from the first version of the
>> patch sounds wiser.
>
> Seems like a possible follow-on project.  But see also under pg_rewind
> above.

No objection to doing that in the future for both.

> Another oddity is that pg_upgrade uses CopyFile() on Windows, but the
> backend does not.  The Git log shows that the backend used to use
> CopyFile(), but that was then removed when the generic code was added,
> but when pg_upgrade was imported, it came with the CopyFile() call.

You mean 558730ac, right?

> I suspect the CopyFile() call can be quite a bit faster, so we should
> consider adding it back in.  Or if not, remove it from pg_upgrade.

Hm.  The proposed patch also removes an important property of what
happens now in copy_file: the copied files are periodically synced to
avoid spamming the cache, so for some loads wouldn't this cause a
performance regression?

At least on Linux it is possible to rely on sync_file_range which is
called via pg_flush_data, so it seems to me that we ought to roughly
keep the loop working on FLUSH_DISTANCE, and replace the calls of
read/write by copy_file_range.  copyfile is only able to do a complete
file copy, so we would also lose this property as well on Linux.  Even
for Windows using CopyFile would be a step backwards for the backend.

pg_upgrade is different though as it copies files fully, so using both
copyfile and copy_file_range makes sense.

In the end, it seems to me that using copy_file_range has some value, as
you save a set of read/write calls, but copyfile comes with its
limitations, which I think will cause side issues, so I would recommend
dropping it from a first cut of the patch for the backend.
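
A minimal sketch of the loop described above: copy_file_range() replaces
the read/write pair, while the copy still proceeds in pieces with a
writeback hint after each one.  This is an illustration, not code from
the patch or from copy_file(); the name copy_fd_in_pieces() and the
constant SKETCH_FLUSH_DISTANCE are hypothetical, and it assumes Linux
with glibc 2.27 or later and both descriptors positioned at offset 0.

```c
#define _GNU_SOURCE             /* copy_file_range(), sync_file_range() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative value only; the thread discusses sizes from 64kB to 256MB. */
#define SKETCH_FLUSH_DISTANCE ((size_t) 1024 * 1024)

/*
 * Hypothetical sketch: copy src to dst in pieces of at most
 * SKETCH_FLUSH_DISTANCE bytes.  Both descriptors must start at offset 0.
 * Returns the number of bytes copied, or -1 with errno set.
 */
off_t
copy_fd_in_pieces(int src, int dst)
{
    off_t       total = 0;

    assert(src >= 0 && dst >= 0);

    for (;;)
    {
        ssize_t     n = copy_file_range(src, NULL, dst, NULL,
                                        SKETCH_FLUSH_DISTANCE, 0);

        if (n < 0)
            return -1;
        if (n == 0)
            return total;       /* reached the end of the source */

        /* Advisory only: ask the kernel to start writing this piece out. */
        (void) sync_file_range(dst, total, n, SYNC_FILE_RANGE_WRITE);
        total += n;

        /* A backend would also check for pending interrupts here. */
    }
}
```

The failure of the flush hint is deliberately ignored, since it only
affects cache behavior and not the correctness of the copy.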
--
Michael


Re: file cloning in pg_upgrade and CREATE DATABASE

Bruce Momjian
In reply to this post by Peter Eisentraut-6
I think this documentation change:

        +   leaving the old cluster untouched.  At present, this is supported on Linux
                                    ---------

would be better with "untouched" changed to "unmodified".

Also, it would be nice if users could easily know if pg_upgrade is going
to use COW or not because it might affect whether they choose --link or
not.  Right now it seems unclear how a user would know.  Can we have
pg_upgrade --check perhaps output something?  Can we also have the
pg_upgrade status display indicate that too, e.g. change

        Copying user relation files

to

        Copying (copy-on-write) user relation files

--
  Bruce Momjian  <[hidden email]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


Re: file cloning in pg_upgrade and CREATE DATABASE

Peter Eisentraut-6
In reply to this post by Michael Paquier-2
On 3/21/18 22:38, Michael Paquier wrote:
> At least on Linux it is possible to rely on sync_file_range which is
> called via pg_flush_data, so it seems to me that we ought to roughly
> keep the loop working on FLUSH_DISTANCE, and replace the calls of
> read/write by copy_file_range.  copyfile is only able to do a complete
> file copy, so we would also lose this property as well on Linux.

I have shown earlier in the thread that copy_file_range in one go is
still better than doing it in pieces.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: file cloning in pg_upgrade and CREATE DATABASE

Peter Eisentraut-6
In reply to this post by Bruce Momjian
On 3/23/18 13:16, Bruce Momjian wrote:

> Also, it would be nice if users could easily know if pg_upgrade is going
> to use COW or not because it might affect whether they choose --link or
> not.  Right now it seems unclear how a user would know.  Can we have
> pg_upgrade --check perhaps output something.  Can we also have the
> pg_upgrade status display indicate that too, e.g. change
>
> Copying user relation files
>
> to
>
> Copying (copy-on-write) user relation files

That would be nice, but we don't have a way to tell that, AFAICT.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: file cloning in pg_upgrade and CREATE DATABASE

Michael Paquier-2
In reply to this post by Peter Eisentraut-6
On Sun, Mar 25, 2018 at 09:33:38PM -0400, Peter Eisentraut wrote:
> On 3/21/18 22:38, Michael Paquier wrote:
>> At least on Linux it is possible to rely on sync_file_range which is
>> called via pg_flush_data, so it seems to me that we ought to roughly
>> keep the loop working on FLUSH_DISTANCE, and replace the calls of
>> read/write by copy_file_range.  copyfile is only able to do a complete
>> file copy, so we would also lose this property as well on Linux.
>
> I have shown earlier in the thread that copy_file_range in one go is
> still better than doing it in pieces.

f8c183a has introduced the optimization that your patch is removing,
which was discussed on this thread:
https://www.postgresql.org/message-id/flat/4B78906A.7020309%40mark.mielke.cc
I am not much into the internals of copy_file_range, but isn't there a
risk that copying a large range of blocks discards potentially
useful blocks from the OS cache?  That's what this patch makes me worry
about.  Performance is good, but on a system where the OS cache is
heavily used for a set of hot blocks this could cause performance side
effects that I think we cannot neglect.

Another thing is that 71d6d07 made a couple of database commands more
responsive to interruptions.  With large databases used as a base
template, it seems to me that this patch would make those commands
respond to interruptions less promptly.
--
Michael


Re: file cloning in pg_upgrade and CREATE DATABASE

Peter Eisentraut-6
On 3/26/18 02:15, Michael Paquier wrote:
> f8c183a has introduced the optimization that your patch is removing,
> which was discussed on this thread:
> https://www.postgresql.org/message-id/flat/4B78906A.7020309%40mark.mielke.cc

Note that that thread is from 2010 and talks about creation of a
database from the standard template being too slow on spinning rust,
because we fsync too often.  I think we have moved well past that
problem size.

I have run some more tests on both macOS and Linux with ext4, and my
results are that the bigger the flush distance, the better.  Before we
made the adjustments for APFS, we had a flush size of 64kB, now it's 1MB
and 32MB on macOS.  In my tests, I see 256MB as the best across both
platforms, and not flushing early at all is only minimally worse.

You can measure this to death, and this obviously doesn't apply equally
on all systems and configurations, but clearly some of the old
assumptions from 8 years ago are no longer applicable.

> I am not much into the internals of copy_file_range, but isn't there a
> risk to have a large range of blocks copied to discard potentially
> useful blocks from the OS cache?  That's what this patch makes me worry
> about.  Performance is good, but on a system where the OS cache is
> heavily used for a set of hot blocks this could cause performance side
> effects that I think we cannot neglect.

How would we go about assessing that?  It's possible, but if
copy_file_range() really blows away all your in-use cache, that would be
surprising.

> Another thing is that 71d6d07 allowed a couple of database commands to
> be more sensitive to interruptions.  With large databases used as a base
> template it seems to me that this would cause the interruptions to be
> less responsive.

The maximum file size that we copy is 1GB and that nowadays takes maybe
10 seconds.  I think that would be an acceptable response time.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: file cloning in pg_upgrade and CREATE DATABASE

Peter Eisentraut-6
I think we have raised a number of interesting issues here which require
deeper consideration.  So I suggest setting this patch to Returned
with feedback.

Btw., I just learned that copy_file_range() only works on files on the
same device.  So more arrangements will need to be made for that.
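
One possible arrangement, sketched here as an assumption rather than as
what the patch does, is to fall back to an ordinary read()/write() loop
when copy_file_range() reports that it cannot be used for this pair of
files (EXDEV for different devices, and a few other "not supported"
errors).  The names copy_fd_with_fallback() and copy_fd_readwrite() are
hypothetical.

```c
#define _GNU_SOURCE             /* exposes copy_file_range() in glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Plain read()/write() copy from the current offsets.  Returns 0 or -1. */
static int
copy_fd_readwrite(int src, int dst)
{
    char        buf[65536];
    ssize_t     nread;

    while ((nread = read(src, buf, sizeof(buf))) > 0)
    {
        ssize_t     off = 0;

        while (off < nread)
        {
            ssize_t     nwritten = write(dst, buf + off,
                                         (size_t) (nread - off));

            if (nwritten < 0)
                return -1;
            off += nwritten;
        }
    }
    return nread < 0 ? -1 : 0;
}

/*
 * Hypothetical helper: copy the rest of src to dst, preferring
 * copy_file_range().  Because NULL offsets make the kernel advance both
 * file positions, the fallback simply continues where the last
 * successful call stopped.  Returns 0 on success, -1 with errno set.
 */
int
copy_fd_with_fallback(int src, int dst)
{
    assert(src >= 0 && dst >= 0);

    for (;;)
    {
        ssize_t     n = copy_file_range(src, NULL, dst, NULL,
                                        (size_t) 1 << 30, 0);

        if (n == 0)
            return 0;           /* reached the end of the source */
        if (n < 0)
        {
            if (errno == EXDEV || errno == ENOSYS ||
                errno == EINVAL || errno == EOPNOTSUPP)
                return copy_fd_readwrite(src, dst);
            return -1;          /* a real I/O error */
        }
    }
}
```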

> I have run some more tests on both macOS and Linux with ext4, and my
> results are that the bigger the flush distance, the better.  Before we
> made the adjustments for APFS, we had a flush size of 64kB, now it's 1MB
> and 32MB on macOS.  In my tests, I see 256MB as the best across both
> platforms, and not flushing early at all is only minimally worse.

Based on this, I suggest that we set the flush distance to 32MB on all
platforms.  Not only is it faster, it avoids having different settings
on some platforms.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: file cloning in pg_upgrade and CREATE DATABASE

Peter Eisentraut-6
In reply to this post by Peter Eisentraut-6
I have made a major revision of this patch.

I have removed all the changes to CREATE DATABASE.  That was too
contentious and we got lost in unrelated details there.  The real
benefit is for pg_upgrade.

Another point was that for pg_upgrade use a user would like to know
beforehand whether reflinking would be used, which was not possible with
the copy_file_range() API.  So here I have switched to using the ioctl()
call directly.

So the new interface is that pg_upgrade has a new option
--reflink={always,auto,never}.  (This option name is adapted from GNU
cp.)  From the documentation:

<para>
 The setting <literal>always</literal> requires the use of relinks.  If
 they are not supported, the <application>pg_upgrade</application> run
 will abort.  Use this in production to limit the upgrade run time.
 The setting <literal>auto</literal> uses reflinks when available,
 otherwise it falls back to a normal copy.  This is the default.  The
 setting <literal>never</literal> prevents use of reflinks and always
 uses a normal copy.  This can be useful to ensure that the upgraded
 cluster has its disk space fully allocated and not shared with the old
 cluster.
</para>

Also, pg_upgrade --check will check whether the selected option would work.
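
As a rough illustration of the three settings, here is a minimal sketch
on Linux.  It assumes the ioctl in question is FICLONE from
<linux/fs.h>; the names ReflinkMode, transfer_file() and
copy_fd_readwrite() are hypothetical and not taken from the patch.  In
this sketch "auto" falls back to a normal copy on any clone failure,
which is a simplification.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>           /* FICLONE */
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical names; they only mirror the three documented settings. */
typedef enum
{
    REFLINK_NEVER,
    REFLINK_AUTO,
    REFLINK_ALWAYS
} ReflinkMode;

/* Ordinary read()/write() copy.  Returns 0 or -1. */
static int
copy_fd_readwrite(int src, int dst)
{
    char        buf[65536];
    ssize_t     nread;

    while ((nread = read(src, buf, sizeof(buf))) > 0)
    {
        ssize_t     off = 0;

        while (off < nread)
        {
            ssize_t     nwritten = write(dst, buf + off,
                                         (size_t) (nread - off));

            if (nwritten < 0)
                return -1;
            off += nwritten;
        }
    }
    return nread < 0 ? -1 : 0;
}

/*
 * Returns 1 if the file was cloned, 0 if it was copied normally, and -1 on
 * failure (including REFLINK_ALWAYS on a file system without cloning).
 */
int
transfer_file(const char *from, const char *to, ReflinkMode mode)
{
    int         src;
    int         dst;
    int         result = 0;
    int         save_errno;

    assert(from != NULL && to != NULL);

    if ((src = open(from, O_RDONLY)) < 0)
        return -1;
    if ((dst = open(to, O_WRONLY | O_CREAT | O_EXCL, 0600)) < 0)
    {
        save_errno = errno;
        close(src);
        errno = save_errno;
        return -1;
    }

    if (mode != REFLINK_NEVER)
    {
        if (ioctl(dst, FICLONE, src) == 0)
            result = 1;         /* blocks are now shared with the source */
        else if (mode == REFLINK_ALWAYS)
            result = -1;        /* cloning was required but did not work */
        /* REFLINK_AUTO: fall through to the normal copy below */
    }
    if (result == 0 && copy_fd_readwrite(src, dst) < 0)
        result = -1;

    save_errno = errno;
    close(src);
    close(dst);
    if (result < 0)
        unlink(to);             /* do not leave a partial file behind */
    errno = save_errno;
    return result;
}
```

A --check style probe could call the same function with REFLINK_ALWAYS
on a scratch file in the new cluster's directory and report the result
before any real data is touched.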

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v3-0001-pg_upgrade-Allow-use-of-file-cloning.patch (18K) Download Attachment

Re: file cloning in pg_upgrade and CREATE DATABASE

Robert Haas
On Wed, Jun 6, 2018 at 11:58 AM, Peter Eisentraut
<[hidden email]> wrote:
> --reflink={always,auto,never}.  (This option name is adapted from GNU
...
>  The setting <literal>always</literal> requires the use of relinks.  If

Is it supposed to be relinks or reflinks?  The two lines above don't agree.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: file cloning in pg_upgrade and CREATE DATABASE

Peter Eisentraut-6
On 6/8/18 14:06, Robert Haas wrote:
> Is it supposed to be relinks or reflinks?  The two lines above don't agree.

It's supposed to be "reflinks".  I'll fix that.

I have also used the more general term "cloning" in the documentation.
We can discuss which term we should use more.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: file cloning in pg_upgrade and CREATE DATABASE

Thomas Munro-3
In reply to this post by Peter Eisentraut-6
On Wed, Feb 21, 2018 at 4:00 PM, Peter Eisentraut
<[hidden email]> wrote:
> - XFS has (optional) reflink support.  This file system is probably more
> widely used than Btrfs.
>
> - Linux and glibc have a proper function to do this now.
>
> - APFS on macOS supports file cloning.

TIL that Solaris 11.4 (closed) ZFS supports reflink() too.  Sadly,
it's not in OpenZFS though I see numerous requests and discussions...
(Of course you can just clone the whole filesystem and then pg_upgrade
the clone in-place).

--
Thomas Munro
http://www.enterprisedb.com


Re: file cloning in pg_upgrade and CREATE DATABASE

Peter Eisentraut-6
On 13.07.18 07:09, Thomas Munro wrote:

> On Wed, Feb 21, 2018 at 4:00 PM, Peter Eisentraut
> <[hidden email]> wrote:
>> - XFS has (optional) reflink support.  This file system is probably more
>> widely used than Btrfs.
>>
>> - Linux and glibc have a proper function to do this now.
>>
>> - APFS on macOS supports file cloning.
>
> TIL that Solaris 11.4 (closed) ZFS supports reflink() too.  Sadly,
> it's not in OpenZFS though I see numerous requests and discussions...

I look forward to your FreeBSD patch then. ;-)

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
