[PATCH]make pg_rewind to not copy useless WAL files

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
31 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[PATCH]make pg_rewind to not copy useless WAL files

chenhj
Hi all,

Currently, pg_rewind copies all WAL files from the source server, whether or not they are needed.
In some circumstances, will bring a lot of unnecessary network and disk IO consumption, and also increase the execution time of pg_rewind.
Such as when wal_keep_segments or max_wal_size is large.

According to pg_rewind's processing logic, only need to copy the WAL after the divergence from the source server. 
The WAL before the divergence must already exists on the target server.
Also, there is no need to copy WALs that have been recovered.

This patch optimizes the above mentioned issues, as follows:
1. In the target data directory, do not delete the WAL files before the divergence.
2. When copying files from the source server, do not copy the WAL files before the divergence and the WAL files after the current WAL insert localtion.

Note:
The "current WAL insert localtion" above is obtained before copying data files. If a runing PostgreSQL server is used as the source server, the newly generated WAL files during pg_rewind running will not be copied to 
the target data directory.
However, in this case the target server is typically used as a standby of the source server after pg_rewind is executed, so these WAL files will be copied via streaming replication later.

--
Best regards
Chen Huajun


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pg_rewind_wal_copy_reduce.patch (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

Alexander Korotkov
Hi!

On Sat, Sep 16, 2017 at 5:56 PM, chenhj <[hidden email]> wrote:
This patch optimizes the above mentioned issues, as follows:
1. In the target data directory, do not delete the WAL files before the divergence.
2. When copying files from the source server, do not copy the WAL files before the divergence and the WAL files after the current WAL insert localtion.

Looks like cool optimization for me.  Please, add this patch to the next commitfest.
Do you think this patch should modify pg_rewind tap tests too?  It would be nice to make WAL files fetching more covered by tap tests.  In particular, new tests may generate more WAL files and make sure that pg_rewind fetches only required files among them.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

Michael Paquier
On Sun, Sep 17, 2017 at 3:19 AM, Alexander Korotkov
<[hidden email]> wrote:

> Hi!
>
> On Sat, Sep 16, 2017 at 5:56 PM, chenhj <[hidden email]> wrote:
>>
>> This patch optimizes the above mentioned issues, as follows:
>> 1. In the target data directory, do not delete the WAL files before the
>> divergence.
>> 2. When copying files from the source server, do not copy the WAL files
>> before the divergence and the WAL files after the current WAL insert
>> localtion.
>
>
> Looks like cool optimization for me.  Please, add this patch to the next
> commitfest.

Agreed.

> Do you think this patch should modify pg_rewind tap tests too?  It would be
> nice to make WAL files fetching more covered by tap tests.  In particular,
> new tests may generate more WAL files and make sure that pg_rewind fetches
> only required files among them.

This looks mandatory to me. Using pg_switch_wal() and a minimum amount
of WAL generated you could just make the set of WAL segments skipped
minimal data.

I have not checked in details, but I think that the positions where
you are applying the filters are using the right approach.

!         !(strncmp(path, "pg_wal", 6) == 0 && IsXLogFileName(path + 7) &&
Please use XLOGDIR here.
--
Michael


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

chenhj

At 2017-09-17 08:33:33, "Michael Paquier" <[hidden email]> wrote: >On Sun, Sep 17, 2017 at 3:19 AM, Alexander Korotkov ><[hidden email]> wrote: >> Hi! >> >> On Sat, Sep 16, 2017 at 5:56 PM, chenhj <[hidden email]> wrote: >>> >>> This patch optimizes the above mentioned issues, as follows: >>> 1. In the target data directory, do not delete the WAL files before the >>> divergence. >>> 2. When copying files from the source server, do not copy the WAL files >>> before the divergence and the WAL files after the current WAL insert >>> localtion. >> >> >> Looks like cool optimization for me. Please, add this patch to the next >> commitfest. > >Agreed. > >> Do you think this patch should modify pg_rewind tap tests too? It would be >> nice to make WAL files fetching more covered by tap tests. In particular, >> new tests may generate more WAL files and make sure that pg_rewind fetches >> only required files among them. > >This looks mandatory to me. Using pg_switch_wal() and a minimum amount >of WAL generated you could just make the set of WAL segments skipped >minimal data. > >I have not checked in details, but I think that the positions where >you are applying the filters are using the right approach. > >! !(strncmp(path, "pg_wal", 6) == 0 && IsXLogFileName(path + 7) && >Please use XLOGDIR here. >-- >Michael >
Thanks, I will use XLOGDIR and add TAP tests later.

--
Chen Huajun
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

chenhj
Hi

This is the new pacth with TAP test and use Macro XLOGDIR.
And i had add this patch to the commitfest, 
    https://commitfest.postgresql.org/15/1302/

--
Best Regards,
Chen Huajun


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pg_rewind_wal_copy_reduce_v2.patch (14K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

Alexander Korotkov
On Fri, Sep 22, 2017 at 7:16 PM, chenhj <[hidden email]> wrote:
This is the new pacth with TAP test and use Macro XLOGDIR.

Good.  I took a quick look over the patch.
Why do you need master_query(), standby_query() and run_query() in RewindTest.pm?
You can do just $node_master->safe_psql() and $node_slave->safe_psql() instead.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

chenhj
On 2017-09-23 01:59:0, "Alexander Korotkov" <[hidden email]> wrote:
On Fri, Sep 22, 2017 at 7:16 PM, chenhj <[hidden email]> wrote:
This is the new pacth with TAP test and use Macro XLOGDIR.

Good.  I took a quick look over the patch.
Why do you need master_query(), standby_query() and run_query() in RewindTest.pm?
You can do just $node_master->safe_psql() and $node_slave->safe_psql() instead.

Ooh, i did not notice that function.Thank you for your advice!
 
---
Regards,
Chen Huajun


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pg_rewind_wal_copy_reduce_v3.patch (12K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

Alexander Korotkov
On Mon, Sep 25, 2017 at 6:26 PM, chenhj <[hidden email]> wrote:
On 2017-09-23 01:59:0, "Alexander Korotkov" <[hidden email]> wrote:
On Fri, Sep 22, 2017 at 7:16 PM, chenhj <[hidden email]> wrote:
This is the new pacth with TAP test and use Macro XLOGDIR.

Good.  I took a quick look over the patch.
Why do you need master_query(), standby_query() and run_query() in RewindTest.pm?
You can do just $node_master->safe_psql() and $node_slave->safe_psql() instead.

Ooh, i did not notice that function.Thank you for your advice!

Great.  I tried this patch.  It applies cleanly, but doesn't compile.

pg_rewind.c:310:36: error: too few arguments provided to function-like macro invocation
        XLByteToSeg(divergerec, startsegno);
                                          ^
../../../src/include/access/xlog_internal.h:118:9: note: macro 'XLByteToSeg' defined here
#define XLByteToSeg(xlrp, logSegNo, wal_segsz_bytes) \
        ^
pg_rewind.c:310:2: error: use of undeclared identifier 'XLByteToSeg'
        XLByteToSeg(divergerec, startsegno);
        ^
pg_rewind.c:311:89: error: too few arguments provided to function-like macro invocation
        XLogFileName(divergence_wal_filename, targetHistory[lastcommontliIndex].tli, startsegno);
                                                                                               ^
../../../src/include/access/xlog_internal.h:155:9: note: macro 'XLogFileName' defined here
#define XLogFileName(fname, tli, logSegNo, wal_segsz_bytes)     \
        ^
pg_rewind.c:311:2: error: use of undeclared identifier 'XLogFileName'
        XLogFileName(divergence_wal_filename, targetHistory[lastcommontliIndex].tli, startsegno);
        ^
pg_rewind.c:312:34: error: too few arguments provided to function-like macro invocation
        XLByteToPrevSeg(endrec, endsegno);
                                        ^
../../../src/include/access/xlog_internal.h:121:9: note: macro 'XLByteToPrevSeg' defined here
#define XLByteToPrevSeg(xlrp, logSegNo, wal_segsz_bytes) \
        ^
pg_rewind.c:312:2: error: use of undeclared identifier 'XLByteToPrevSeg'
        XLByteToPrevSeg(endrec, endsegno);
        ^
pg_rewind.c:313:57: error: too few arguments provided to function-like macro invocation
        XLogFileName(last_source_wal_filename, endtli, endsegno);
                                                               ^
../../../src/include/access/xlog_internal.h:155:9: note: macro 'XLogFileName' defined here
#define XLogFileName(fname, tli, logSegNo, wal_segsz_bytes)     \
        ^
pg_rewind.c:313:2: error: use of undeclared identifier 'XLogFileName'
        XLogFileName(last_source_wal_filename, endtli, endsegno);
        ^
8 errors generated.

It appears that your patch conflicts with fc49e24f.  Please, rebase it.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

chenhj
On 2017-09-28 01:29:29,"Alexander Korotkov" <[hidden email]> wrote:


It appears that your patch conflicts with fc49e24f.  Please, rebase it.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Yes, i had rebased it, Please check the new patch. 

------
Best Regards,
Chen Huajun






--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pg_rewind_wal_copy_reduce_v4.patch (12K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

Alexander Korotkov
On Thu, Sep 28, 2017 at 6:44 PM, chenhj <[hidden email]> wrote:
On 2017-09-28 01:29:29,"Alexander Korotkov" <[hidden email]> wrote:
It appears that your patch conflicts with fc49e24f.  Please, rebase it.

Yes, i had rebased it, Please check the new patch. 

Good, now it applies cleanly.

else if (strncmp(path, XLOGDIR"/", strlen(XLOGDIR"/")) == 0 &&
IsXLogFileName(path + strlen(XLOGDIR"/")) &&
(strcmp(path + strlen(XLOGDIR"/") + 8, divergence_wal_filename + 8) < 0 ||
 strcmp(path + strlen(XLOGDIR"/") + 8, last_source_wal_filename + 8) > 0))

According to our conding style, you should leave a space betwen XLOGDIF and "/".
Also, you do a trick by comparison xlog segment numbers using strcmp().  It's nice, but I would prefer seeing XLogFromFileName() here.  It would improve code readability and be less error prone during further modifications.


------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

chenhj
On 2017-09-29 00:43:18,"Alexander Korotkov" <[hidden email]> wrote:
On Thu, Sep 28, 2017 at 6:44 PM, chenhj <[hidden email]> wrote:
On 2017-09-28 01:29:29,"Alexander Korotkov" <[hidden email]> wrote:
It appears that your patch conflicts with fc49e24f.  Please, rebase it.

Yes, i had rebased it, Please check the new patch. 

Good, now it applies cleanly.

else if (strncmp(path, XLOGDIR"/", strlen(XLOGDIR"/")) == 0 &&
 IsXLogFileName(path + strlen(XLOGDIR"/")) &&
 (strcmp(path + strlen(XLOGDIR"/") + 8, divergence_wal_filename + 8) < 0 ||
  strcmp(path + strlen(XLOGDIR"/") + 8, last_source_wal_filename + 8) > 0))

According to our conding style, you should leave a space betwen XLOGDIF and "/".
Also, you do a trick by comparison xlog segment numbers using strcmp().  It's nice, but I would prefer seeing XLogFromFileName() here.  It would improve code readability and be less error prone during further modifications.


------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Thanks for advice!
I had modified it.

-----
Best Regards,
Chen Huajun





--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pg_rewind_wal_copy_reduce_v5.patch (13K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

Alexander Korotkov
On Thu, Sep 28, 2017 at 10:52 PM, chenhj <[hidden email]> wrote:
On 2017-09-29 00:43:18,"Alexander Korotkov" <[hidden email]> wrote:
On Thu, Sep 28, 2017 at 6:44 PM, chenhj <[hidden email]> wrote:
On 2017-09-28 01:29:29,"Alexander Korotkov" <[hidden email]> wrote:
It appears that your patch conflicts with fc49e24f.  Please, rebase it.

Yes, i had rebased it, Please check the new patch. 

Good, now it applies cleanly.

else if (strncmp(path, XLOGDIR"/", strlen(XLOGDIR"/")) == 0 &&
 IsXLogFileName(path + strlen(XLOGDIR"/")) &&
 (strcmp(path + strlen(XLOGDIR"/") + 8, divergence_wal_filename + 8) < 0 ||
  strcmp(path + strlen(XLOGDIR"/") + 8, last_source_wal_filename + 8) > 0))

According to our conding style, you should leave a space betwen XLOGDIF and "/".
Also, you do a trick by comparison xlog segment numbers using strcmp().  It's nice, but I would prefer seeing XLogFromFileName() here.  It would improve code readability and be less error prone during further modifications.

Thanks for advice!
I had modified it.

OK. Patch becomes better.
I also have more general question.  Why do we need upper bound for segment number (last_source_segno)?  I understand the purpose of lower bound (divergence_segno) which save us from copying extra WAL files, but what is upper bound for?  As I understood, we anyway need to replay most recent WAL records to reach consistent state after pg_rewind.  I propose to remove last_source_segno unless I'm missing something.


------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

chenhj
On 2017-09-29 05:31:51, "Alexander Korotkov" <[hidden email]> wrote:
On Thu, Sep 28, 2017 at 10:52 PM, chenhj <[hidden email]> wrote:
On 2017-09-29 00:43:18,"Alexander Korotkov" <[hidden email]> wrote:
On Thu, Sep 28, 2017 at 6:44 PM, chenhj <[hidden email]> wrote:
On 2017-09-28 01:29:29,"Alexander Korotkov" <[hidden email]> wrote:
It appears that your patch conflicts with fc49e24f.  Please, rebase it.

Yes, i had rebased it, Please check the new patch. 

Good, now it applies cleanly.

else if (strncmp(path, XLOGDIR"/", strlen(XLOGDIR"/")) == 0 &&
 IsXLogFileName(path + strlen(XLOGDIR"/")) &&
 (strcmp(path + strlen(XLOGDIR"/") + 8, divergence_wal_filename + 8) < 0 ||
  strcmp(path + strlen(XLOGDIR"/") + 8, last_source_wal_filename + 8) > 0))

According to our conding style, you should leave a space betwen XLOGDIF and "/".
Also, you do a trick by comparison xlog segment numbers using strcmp().  It's nice, but I would prefer seeing XLogFromFileName() here.  It would improve code readability and be less error prone during further modifications.

Thanks for advice!
I had modified it.

OK. Patch becomes better.
I also have more general question.  Why do we need upper bound for segment number (last_source_segno)?  I understand the purpose of lower bound (divergence_segno) which save us from copying extra WAL files, but what is upper bound for?  As I understood, we anyway need to replay most recent WAL records to reach consistent state after pg_rewind.  I propose to remove last_source_segno unless I'm missing something.

Thanks for relay!
When checkpoint occurs, some old WAL files will be renamed as future WAL files for later use.
The upper bound for segment number (last_source_segno) is used to avoid copying these extra WAL files.

When the parameter max_wal_size or max_min_size is large,these may be many renamed old WAL files for reused.

For example, I have just looked at one of our production systems (max_wal_size = 64GB, min_wal_size = 2GB), 
the total size of WALs is about 30GB, and contains about 4GB renamed old WAL files.

[postgres@hostxxx pg_xlog]$ ll
...
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF00000078
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF00000079
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007A
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007B
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007C
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007D
-rw------- 1 postgres postgres 16777216 Sep 29 11:22 0000000100000BCF0000007E //after this, there are about 4GB WALs for reuse
-rw------- 1 postgres postgres 16777216 Sep 29 11:08 0000000100000BCF0000007F
-rw------- 1 postgres postgres 16777216 Sep 29 11:06 0000000100000BCF00000080
-rw------- 1 postgres postgres 16777216 Sep 29 12:05 0000000100000BCF00000081
-rw------- 1 postgres postgres 16777216 Sep 29 11:28 0000000100000BCF00000082
-rw------- 1 postgres postgres 16777216 Sep 29 11:06 0000000100000BCF00000083
...

-----
Best Regards,
Chen Huajun
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

Alexander Korotkov
On Fri, Sep 29, 2017 at 10:07 AM, chenhj <[hidden email]> wrote:
On 2017-09-29 05:31:51, "Alexander Korotkov" <[hidden email]> wrote:
On Thu, Sep 28, 2017 at 10:52 PM, chenhj <[hidden email]> wrote:
On 2017-09-29 00:43:18,"Alexander Korotkov" <[hidden email]> wrote:
On Thu, Sep 28, 2017 at 6:44 PM, chenhj <[hidden email]> wrote:
On 2017-09-28 01:29:29,"Alexander Korotkov" <[hidden email]> wrote:
It appears that your patch conflicts with fc49e24f.  Please, rebase it.

Yes, i had rebased it, Please check the new patch. 

Good, now it applies cleanly.

else if (strncmp(path, XLOGDIR"/", strlen(XLOGDIR"/")) == 0 &&
 IsXLogFileName(path + strlen(XLOGDIR"/")) &&
 (strcmp(path + strlen(XLOGDIR"/") + 8, divergence_wal_filename + 8) < 0 ||
  strcmp(path + strlen(XLOGDIR"/") + 8, last_source_wal_filename + 8) > 0))

According to our conding style, you should leave a space betwen XLOGDIF and "/".
Also, you do a trick by comparison xlog segment numbers using strcmp().  It's nice, but I would prefer seeing XLogFromFileName() here.  It would improve code readability and be less error prone during further modifications.

Thanks for advice!
I had modified it.

OK. Patch becomes better.
I also have more general question.  Why do we need upper bound for segment number (last_source_segno)?  I understand the purpose of lower bound (divergence_segno) which save us from copying extra WAL files, but what is upper bound for?  As I understood, we anyway need to replay most recent WAL records to reach consistent state after pg_rewind.  I propose to remove last_source_segno unless I'm missing something.

Thanks for relay!
When checkpoint occurs, some old WAL files will be renamed as future WAL files for later use.
The upper bound for segment number (last_source_segno) is used to avoid copying these extra WAL files.

When the parameter max_wal_size or max_min_size is large,these may be many renamed old WAL files for reused.

For example, I have just looked at one of our production systems (max_wal_size = 64GB, min_wal_size = 2GB), 
the total size of WALs is about 30GB, and contains about 4GB renamed old WAL files.

[postgres@hostxxx pg_xlog]$ ll
...
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF00000078
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF00000079
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007A
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007B
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007C
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007D
-rw------- 1 postgres postgres 16777216 Sep 29 11:22 0000000100000BCF0000007E //after this, there are about 4GB WALs for reuse
-rw------- 1 postgres postgres 16777216 Sep 29 11:08 0000000100000BCF0000007F
-rw------- 1 postgres postgres 16777216 Sep 29 11:06 0000000100000BCF00000080
-rw------- 1 postgres postgres 16777216 Sep 29 12:05 0000000100000BCF00000081
-rw------- 1 postgres postgres 16777216 Sep 29 11:28 0000000100000BCF00000082
-rw------- 1 postgres postgres 16777216 Sep 29 11:06 0000000100000BCF00000083
...

OK.  That makes sense.  Thank you for the explanation.

I still have some minor comments.
 
    /*
+    * Save the WAL filenames of the divergence and the current WAL insert
+    * location of the source server. Later only the WAL files between those
+    * would be copied to the target data directory.

Comment is outdated.  We don't save filenames anymore, now we save segment numbers.
 
+    * Note:The later generated WAL files in the source server before the end
+    * of the copy of the data files must be made available when the target
+    * server is started. This can be done by configuring the target server as
+    * a standby of the source server.
+    */

You miss space after "Note:".  Also, it seems reasonable for me to leave empty line before "Note:".

# Setup parameter for WAL reclaim 

Parameter*s*, because you're setting up multiple of them.

# The accuracy of imodification from pg_ls_waldir() is seconds, so sleep one seconds

One second without "s".

Also, please check empty lines in 006_wal_copy.pl to be just empty lines without tabs.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

chenhj
On 2017-09-29 19:29:40,"Alexander Korotkov" <[hidden email]> wrote:
On Fri, Sep 29, 2017 at 10:07 AM, chenhj <[hidden email]> wrote:


OK.  That makes sense.  Thank you for the explanation.

I still have some minor comments.
 
    /*
+    * Save the WAL filenames of the divergence and the current WAL insert
+    * location of the source server. Later only the WAL files between those
+    * would be copied to the target data directory.

Comment is outdated.  We don't save filenames anymore, now we save segment numbers.
 
+    * Note:The later generated WAL files in the source server before the end
+    * of the copy of the data files must be made available when the target
+    * server is started. This can be done by configuring the target server as
+    * a standby of the source server.
+    */

You miss space after "Note:".  Also, it seems reasonable for me to leave empty line before "Note:".

# Setup parameter for WAL reclaim 

Parameter*s*, because you're setting up multiple of them.

# The accuracy of imodification from pg_ls_waldir() is seconds, so sleep one seconds

One second without "s".

Also, please check empty lines in 006_wal_copy.pl to be just empty lines without tabs.

Thanks for your comments, i had fix above problems.
And also add several line breaks at long line in 006_wal_copy.pl
Please check this patch again.

------
Best Regards
Chen Huajun


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pg_rewind_wal_copy_reduce_v6.patch (13K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

chenhj
On 2017-09-30 00:53:31,"chenhj" <[hidden email]> wrote:
On 2017-09-29 19:29:40,"Alexander Korotkov" <[hidden email]> wrote:
On Fri, Sep 29, 2017 at 10:07 AM, chenhj <[hidden email]> wrote:


OK.  That makes sense.  Thank you for the explanation.

I still have some minor comments.
 
    /*
+    * Save the WAL filenames of the divergence and the current WAL insert
+    * location of the source server. Later only the WAL files between those
+    * would be copied to the target data directory.

Comment is outdated.  We don't save filenames anymore, now we save segment numbers.
 
+    * Note:The later generated WAL files in the source server before the end
+    * of the copy of the data files must be made available when the target
+    * server is started. This can be done by configuring the target server as
+    * a standby of the source server.
+    */

You miss space after "Note:".  Also, it seems reasonable for me to leave empty line before "Note:".

# Setup parameter for WAL reclaim 

Parameter*s*, because you're setting up multiple of them.

# The accuracy of imodification from pg_ls_waldir() is seconds, so sleep one seconds

One second without "s".

Also, please check empty lines in 006_wal_copy.pl to be just empty lines without tabs.

Thanks for your comments, i had fix above problems.
And also add several line breaks at long line in 006_wal_copy.pl
Please check this patch again.

Sorry, patch v6 did not remove tabs in two empty lines, please use the new one.

Best Regards,
Chen Huajun


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pg_rewind_wal_copy_reduce_v7.patch (13K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

Alexander Korotkov
On Fri, Sep 29, 2017 at 8:10 PM, chenhj <[hidden email]> wrote:
On 2017-09-30 00:53:31,"chenhj" <[hidden email]> wrote:
On 2017-09-29 19:29:40,"Alexander Korotkov" <[hidden email]> wrote:
On Fri, Sep 29, 2017 at 10:07 AM, chenhj <[hidden email]> wrote:


OK.  That makes sense.  Thank you for the explanation.

I still have some minor comments.
 
    /*
+    * Save the WAL filenames of the divergence and the current WAL insert
+    * location of the source server. Later only the WAL files between those
+    * would be copied to the target data directory.

Comment is outdated.  We don't save filenames anymore, now we save segment numbers.
 
+    * Note:The later generated WAL files in the source server before the end
+    * of the copy of the data files must be made available when the target
+    * server is started. This can be done by configuring the target server as
+    * a standby of the source server.
+    */

You miss space after "Note:".  Also, it seems reasonable for me to leave empty line before "Note:".

# Setup parameter for WAL reclaim 

Parameter*s*, because you're setting up multiple of them.

# The accuracy of imodification from pg_ls_waldir() is seconds, so sleep one seconds

One second without "s".

Also, please check empty lines in 006_wal_copy.pl to be just empty lines without tabs.

Thanks for your comments, i had fix above problems.
And also add several line breaks at long line in 006_wal_copy.pl
Please check this patch again.

Sorry, patch v6 did not remove tabs in two empty lines, please use the new one.

Great.  Now code of this patch looks good for me.
However, we forgot about documentation.

  <para>
   The result is equivalent to replacing the target data directory with the
   source one. Only changed blocks from relation files are copied;
   all other files are copied in full, including configuration files. The
   advantage of <application>pg_rewind</> over taking a new base backup, or
   tools like <application>rsync</>, is that <application>pg_rewind</> does
   not require reading through unchanged blocks in the cluster. This makes
   it a lot faster when the database is large and only a small
   fraction of blocks differ between the clusters.
  </para>

At least, this paragraph need to be adjusted, because it states whose files are copied.  And probably latter paragraphs whose state about WAL files.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

chenhj
In reply to this post by chenhj
On 2017-09-30 02:17:54,"Alexander Korotkov" <[hidden email]> wrote:

Great.  Now code of this patch looks good for me.
However, we forgot about documentation.

  <para>
   The result is equivalent to replacing the target data directory with the
   source one. Only changed blocks from relation files are copied;
   all other files are copied in full, including configuration files. The
   advantage of <application>pg_rewind</> over taking a new base backup, or
   tools like <application>rsync</>, is that <application>pg_rewind</> does
   not require reading through unchanged blocks in the cluster. This makes
   it a lot faster when the database is large and only a small
   fraction of blocks differ between the clusters.
  </para>

At least, this paragraph need to be adjusted, because it states whose files are copied.  And probably latter paragraphs whose state about WAL files.



Your are rigth.
I wrote a draft as following, but i'm afraid whether the english statement is accurate.

----------------------
diff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml
index d5430d4..bcd094b 100644
--- a/doc/src/sgml/ref/pg_rewind.sgml
+++ b/doc/src/sgml/ref/pg_rewind.sgml
@@ -50,9 +50,12 @@ PostgreSQL documentation
   <para>
    The result is equivalent to replacing the target data directory with the
    source one. Only changed blocks from relation files are copied;
-   all other files are copied in full, including configuration files. The
-   advantage of <application>pg_rewind</> over taking a new base backup, or
-   tools like <application>rsync</>, is that <application>pg_rewind</> does
+   all other files except WAL are copied in full, including configuration
+   files. Only the WAL files between the point of divergence and the current
+   WAL insert location of the source server are copied, for other WAL files
+   are useless for the target server. The advantage of
+   <application>pg_rewind</> over taking a new base backup, or tools
+   like <application>rsync</>, is that <application>pg_rewind</> does
    not require reading through unchanged blocks in the cluster. This makes
    it a lot faster when the database is large and only a small
    fraction of blocks differ between the clusters.
@@ -231,7 +234,7 @@ PostgreSQL documentation
      <para>
       Copy all other files such as <filename>pg_xact</filename> and
       configuration files from the source cluster to the target cluster
-      (everything except the relation files).
+      (everything except the relation files and some WAL files).
      </para>
     </step>
     <step>

------
Best Regars,
Chen Huajun


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

Alexander Korotkov
On Sat, Sep 30, 2017 at 8:18 PM, chenhj <[hidden email]> wrote:
On 2017-09-30 02:17:54,"Alexander Korotkov" <[hidden email]wrote:

Great.  Now code of this patch looks good for me.
However, we forgot about documentation.

  <para>
   The result is equivalent to replacing the target data directory with the
   source one. Only changed blocks from relation files are copied;
   all other files are copied in full, including configuration files. The
   advantage of <application>pg_rewind</> over taking a new base backup, or
   tools like <application>rsync</>, is that <application>pg_rewind</> does
   not require reading through unchanged blocks in the cluster. This makes
   it a lot faster when the database is large and only a small
   fraction of blocks differ between the clusters.
  </para>

At least, this paragraph need to be adjusted, because it states whose files are copied.  And probably latter paragraphs whose state about WAL files.



Your are rigth.
I wrote a draft as following, but i'm afraid whether the english statement is accurate.

I'm not native english speaker too :(

Only the WAL files between the point of divergence and the current WAL insert location of the source server are copied, *for* other WAL files are useless for the target server. 

I'm not sure about this usage of word *for*.  For me, it probably should be just removed.  Rest of changes looks good for me.  Please, integrate them into the patch.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH]make pg_rewind to not copy useless WAL files

chenhj
On  2017-10-01 04:09:19,"Alexander Korotkov" <[hidden email]> wrote:
On Sat, Sep 30, 2017 at 8:18 PM, chenhj <[hidden email]> wrote:
On 2017-09-30 02:17:54,"Alexander Korotkov" <[hidden email]wrote:

Great.  Now code of this patch looks good for me.
However, we forgot about documentation.

  <para>
   The result is equivalent to replacing the target data directory with the
   source one. Only changed blocks from relation files are copied;
   all other files are copied in full, including configuration files. The
   advantage of <application>pg_rewind</> over taking a new base backup, or
   tools like <application>rsync</>, is that <application>pg_rewind</> does
   not require reading through unchanged blocks in the cluster. This makes
   it a lot faster when the database is large and only a small
   fraction of blocks differ between the clusters.
  </para>

At least, this paragraph need to be adjusted, because it states whose files are copied.  And probably latter paragraphs whose state about WAL files.



Your are rigth.
I wrote a draft as following, but i'm afraid whether the english statement is accurate.

I'm not native english speaker too :(

Only the WAL files between the point of divergence and the current WAL insert location of the source server are copied, *for* other WAL files are useless for the target server. 

I'm not sure about this usage of word *for*.  For me, it probably should be just removed.  Rest of changes looks good for me.  Please, integrate them into the patch.


I had removed the *for* , Pleae check the new patch again.

---
Best Regards,
Chen Huajun


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pg_rewind_wal_copy_reduce_v8.patch (15K) Download Attachment
12