Change pgarch_readyXlog() to return .history files first

classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

Change pgarch_readyXlog() to return .history files first

David Steele
Hackers,

The alphabetical ordering of pgarch_readyXlog() means that on promotion
000000010000000100000001.partial will be archived before 00000002.history.

This appears harmless, but the .history files are what other potential
primaries use to decide what timeline they should pick.  The additional
latency of compressing/transferring the much larger partial file means
that archiving of the .history file is delayed and greatly increases the
chance that another primary will promote to the same timeline.

Teach pgarch_readyXlog() to return .history files first (and in order)
to reduce the window where this can happen.  This won't prevent all
conflicts, but it is a simple change and should greatly reduce
real-world occurrences.

I also think we should consider back-patching this change.  It's hard to
imagine that archive commands would have trouble with this reordering
and the current ordering causes real pain in HA clusters.

Regards,
--
-David
[hidden email]

history-files-first-v1.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Change pgarch_readyXlog() to return .history files first

David Steele
On 12/13/18 11:53 AM, David Steele wrote:

> Hackers,
>
> The alphabetical ordering of pgarch_readyXlog() means that on promotion
> 000000010000000100000001.partial will be archived before 00000002.history.
>
> This appears harmless, but the .history files are what other potential
> primaries use to decide what timeline they should pick.  The additional
> latency of compressing/transferring the much larger partial file means
> that archiving of the .history file is delayed and greatly increases the
> chance that another primary will promote to the same timeline.
>
> Teach pgarch_readyXlog() to return .history files first (and in order)
> to reduce the window where this can happen.  This won't prevent all
> conflicts, but it is a simple change and should greatly reduce
> real-world occurrences.
>
> I also think we should consider back-patching this change.  It's hard to
> imagine that archive commands would have trouble with this reordering
> and the current ordering causes real pain in HA clusters.
Some gcc versions wanted more parens, so updated in attached.

--
-David
[hidden email]

history-files-first-v2.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Change pgarch_readyXlog() to return .history files first

Michael Paquier-2
In reply to this post by David Steele
On Thu, Dec 13, 2018 at 11:53:53AM -0500, David Steele wrote:
> I also think we should consider back-patching this change.  It's hard to
> imagine that archive commands would have trouble with this reordering
> and the current ordering causes real pain in HA clusters.

I would like to hear opinion from other though if we should consider
that as an improvement or an actual bug fix.  Changing the order of the
files to map with what the startup process does when promoting does not
sound like a bug fix to me, still this is not really invasive, so we
could really consider it worth back-patching to reduce common pain from
users when it comes to timeline handling.

> -    if (!found)
> +    /* Is this a history file? */
> +        bool history = basenamelen >= sizeof(".history") &&
> +            strcmp(rlde->d_name + (basenamelen - sizeof(".history") + 1),
> +                   ".history.ready") == 0;

Or you could just use IsTLHistoryFileName here?

If one wants to simply check this code, you can just create dummy orphan
files in archive_status and see in which order they get removed.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Change pgarch_readyXlog() to return .history files first

David Steele
On 12/13/18 7:15 PM, Michael Paquier wrote:

> On Thu, Dec 13, 2018 at 11:53:53AM -0500, David Steele wrote:
>> I also think we should consider back-patching this change.  It's hard to
>> imagine that archive commands would have trouble with this reordering
>> and the current ordering causes real pain in HA clusters.
>
> I would like to hear opinion from other though if we should consider
> that as an improvement or an actual bug fix.  Changing the order of the
> files to map with what the startup process does when promoting does not
> sound like a bug fix to me, still this is not really invasive, so we
> could really consider it worth back-patching to reduce common pain from
> users when it comes to timeline handling.
I think an argument can be made that it is a bug (ish).  Postgres
generates the files in one order, and they get archived in a different
order.

>> -    if (!found)
>> +    /* Is this a history file? */
>> +        bool history = basenamelen >= sizeof(".history") &&
>> +            strcmp(rlde->d_name + (basenamelen - sizeof(".history") + 1),
>> +                   ".history.ready") == 0;
>
> Or you could just use IsTLHistoryFileName here?

We'd have to truncate .ready off the string to make that work, which
seems easy enough.  Is that what you were thinking?

One thing to consider is the check above is more efficient than
IsTLHistoryFileName() and it potentially gets run a lot.

> If one wants to simply check this code, you can just create dummy orphan
> files in archive_status and see in which order they get removed.

Seems awfully racy.  Are there currently any tests like this for the
archiver that I can look at extending?

Regards,
--
-David
[hidden email]


signature.asc (891 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Change pgarch_readyXlog() to return .history files first

Michael Paquier-2
On Fri, Dec 14, 2018 at 08:43:20AM -0500, David Steele wrote:

> On 12/13/18 7:15 PM, Michael Paquier wrote:
>> I would like to hear opinion from other though if we should consider
>> that as an improvement or an actual bug fix.  Changing the order of the
>> files to map with what the startup process does when promoting does not
>> sound like a bug fix to me, still this is not really invasive, so we
>> could really consider it worth back-patching to reduce common pain from
>> users when it comes to timeline handling.
>
> I think an argument can be made that it is a bug (ish).  Postgres
> generates the files in one order, and they get archived in a different
> order.
I am not completely sure either.  In my experience, if there is any
doubt on such definitions the best answer is to not backpatch.

>> Or you could just use IsTLHistoryFileName here?
>
> We'd have to truncate .ready off the string to make that work, which
> seems easy enough.  Is that what you were thinking?

Yes, that's the idea.  pgarch_readyXlog returns the segment name which
should be archived, so you could just compute it after detecting a
.ready file.

> One thing to consider is the check above is more efficient than
> IsTLHistoryFileName() and it potentially gets run a lot.

This check misses strspn(), so any file finishing with .history would
get eaten even if that's unlikely to happen.

>> If one wants to simply check this code, you can just create dummy orphan
>> files in archive_status and see in which order they get removed.
>
> Seems awfully racy.  Are there currently any tests like this for the
> archiver that I can look at extending?

Sorry for the confusion, I was referring to manual testing here.
Thinking about it, we could have an automatic test to check for the file
order pattern by creating dummy files, starting the server with archiver
enabled, and then parse the logs as orphan .ready files would get
removed in the order their archiving is attempted with one WARNING entry
generated for each.  I am not sure if that is worth a test though.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Change pgarch_readyXlog() to return .history files first

David Steele
On 12/15/18 2:10 AM, Michael Paquier wrote:
> On Fri, Dec 14, 2018 at 08:43:20AM -0500, David Steele wrote:
>> On 12/13/18 7:15 PM, Michael Paquier wrote:

>>> Or you could just use IsTLHistoryFileName here?
>>
>> We'd have to truncate .ready off the string to make that work, which
>> seems easy enough.  Is that what you were thinking?
>
> Yes, that's the idea.  pgarch_readyXlog returns the segment name which
> should be archived, so you could just compute it after detecting a
> .ready file.
>
>> One thing to consider is the check above is more efficient than
>> IsTLHistoryFileName() and it potentially gets run a lot.
>
> This check misses strspn(), so any file finishing with .history would
> get eaten even if that's unlikely to happen.
Good point.  The new patch uses IsTLHistoryFileName().

>>> If one wants to simply check this code, you can just create dummy orphan
>>> files in archive_status and see in which order they get removed.
>>
>> Seems awfully racy.  Are there currently any tests like this for the
>> archiver that I can look at extending?
>
> Sorry for the confusion, I was referring to manual testing here.

Ah, I see.  Yes, that's exactly how I tested it, in addition to doing
real promotions.

> Thinking about it, we could have an automatic test to check for the file
> order pattern by creating dummy files, starting the server with archiver
> enabled, and then parse the logs as orphan .ready files would get
> removed in the order their archiving is attempted with one WARNING entry
> generated for each.  I am not sure if that is worth a test though.

Yes, parsing the logs was the best thing I could think of, too.

--
-David
[hidden email]

history-files-first-v3.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Change pgarch_readyXlog() to return .history files first

Michael Paquier-2
On Thu, Dec 20, 2018 at 01:57:30PM +0200, David Steele wrote:
> Good point.  The new patch uses IsTLHistoryFileName().

OK, I have been reviewing the patch and the logic is correct, still I
could not resist reducing the number of inner if's for readability.  I
also did not like the high-jacking of rlde->d_name so instead let's use
an intermediate variable to store the basename of a scanned entry.  The
format of the if/elif with comments in-between was not really consistent
with the common practice as well.  pg_indent has also been applied.

> Ah, I see.  Yes, that's exactly how I tested it, in addition to doing real
> promotions.

OK, so am I doing.

Attached is an updated patch.  Does that look fine to you?  The base
logic is unchanged, and after a promotion history files get archived
before the last partial segment.
--
Michael

history-files-first-v4.patch (3K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Change pgarch_readyXlog() to return .history files first

Kyotaro HORIGUCHI-2
Hello.

FWIW it seems to me a bug that making an inconsistent set of
files in archive directory.

At Fri, 21 Dec 2018 12:19:21 +0900, Michael Paquier <[hidden email]> wrote in <[hidden email]>
> On Thu, Dec 20, 2018 at 01:57:30PM +0200, David Steele wrote:
> > Good point.  The new patch uses IsTLHistoryFileName().
>
> OK, I have been reviewing the patch and the logic is correct, still I
> could not resist reducing the number of inner if's for readability.  I

+1 basically. But I think that tail(name, 6) != ".ready" can
happen with a certain frequency but strspn(name) < basenamelen
rarely in the normal case.

> also did not like the high-jacking of rlde->d_name so instead let's use
> an intermediate variable to store the basename of a scanned entry.  The
> format of the if/elif with comments in-between was not really consistent
> with the common practice as well.  pg_indent has also been applied.
>
> > Ah, I see.  Yes, that's exactly how I tested it, in addition to doing real
> > promotions.
>
> OK, so am I doing.
>
> Attached is an updated patch.  Does that look fine to you?  The base
> logic is unchanged, and after a promotion history files get archived
> before the last partial segment.

Renaming history to ishistory looks good.

if (!found || (ishistory && !historyFound))
{
  /* init */
  found = true;
  historyFound = ishistory;
}
else if (ishistory || (!ishstory && !historyFound))
 /* compare/replace */

In the else if condition, ishisotry must be false in the right
hand of ||. What we do here is ignoring non-history files once
history file found. (Just a logic condensing and it would be done
by compiler, though)

"else if (!historyFound || ishistory)"



> strcpy(xlog, newxlog);

The caller prepares sufficient memory for basename, and we no
longer copy ".ready" into newxlog. Douldn't we work directly on
xlog instead of allocating newxlog?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


Reply | Threaded
Open this post in threaded view
|

Re: Change pgarch_readyXlog() to return .history files first

Michael Paquier-2
On Fri, Dec 21, 2018 at 01:49:18PM +0900, Kyotaro HORIGUCHI wrote:
> FWIW it seems to me a bug that making an inconsistent set of
> files in archive directory.

Okay, point taken!  FWIW, I have no actual objections in not
back-patching that.

> At Fri, 21 Dec 2018 12:19:21 +0900, Michael Paquier <[hidden email]> wrote in <[hidden email]>
>> On Thu, Dec 20, 2018 at 01:57:30PM +0200, David Steele wrote:
>> > Good point.  The new patch uses IsTLHistoryFileName().
>>
>> OK, I have been reviewing the patch and the logic is correct, still I
>> could not resist reducing the number of inner if's for readability.  I
>
> +1 basically. But I think that tail(name, 6) != ".ready" can
> happen with a certain frequency but strspn(name) < basenamelen
> rarely in the normal case.
So that +0.5 if "basically" means a partial agreement? :p

> In the else if condition, ishisotry must be false in the right
> hand of ||. What we do here is ignoring non-history files once
> history file found. (Just a logic condensing and it would be done
> by compiler, though)

Yes, this can be simplified.  So let's do so.

> The caller prepares sufficient memory for basename, and we no
> longer copy ".ready" into newxlog. Wouldn't we work directly on
> xlog instead of allocating newxlog?

Okay, let's simplify that as you suggest.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Change pgarch_readyXlog() to return .history files first

Kyotaro HORIGUCHI-2
At Fri, 21 Dec 2018 14:17:25 +0900, Michael Paquier <[hidden email]> wrote in <[hidden email]>
> On Fri, Dec 21, 2018 at 01:49:18PM +0900, Kyotaro HORIGUCHI wrote:
> > FWIW it seems to me a bug that making an inconsistent set of
> > files in archive directory.
>
> Okay, point taken!  FWIW, I have no actual objections in not
> back-patching that.

I maybe(?) know.

> > At Fri, 21 Dec 2018 12:19:21 +0900, Michael Paquier <[hidden email]> wrote in <[hidden email]>
> >> On Thu, Dec 20, 2018 at 01:57:30PM +0200, David Steele wrote:
> >> > Good point.  The new patch uses IsTLHistoryFileName().
> >>
> >> OK, I have been reviewing the patch and the logic is correct, still I
> >> could not resist reducing the number of inner if's for readability.  I
> >
> > +1 basically. But I think that tail(name, 6) != ".ready" can
> > happen with a certain frequency but strspn(name) < basenamelen
> > rarely in the normal case.
>
> So that +0.5 if "basically" means a partial agreement? :p

Mmm. No, +0.9.

> > In the else if condition, ishisotry must be false in the right
> > hand of ||. What we do here is ignoring non-history files once
> > history file found. (Just a logic condensing and it would be done
> > by compiler, though)
>
> Yes, this can be simplified.  So let's do so.
>
> > The caller prepares sufficient memory for basename, and we no
> > longer copy ".ready" into newxlog. Wouldn't we work directly on

Sorry for silly typo, but the 'W' was 'C', I meant:p

> > xlog instead of allocating newxlog?
>
> Okay, let's simplify that as you suggest.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


Reply | Threaded
Open this post in threaded view
|

Re: Change pgarch_readyXlog() to return .history files first

David Steele
In reply to this post by Kyotaro HORIGUCHI-2
On 12/21/18 6:49 AM, Kyotaro HORIGUCHI wrote:
> "else if (!historyFound || ishistory)"
>
>
>
>> strcpy(xlog, newxlog);
>
> The caller prepares sufficient memory for basename, and we no
> longer copy ".ready" into newxlog. Douldn't we work directly on
> xlog instead of allocating newxlog?

I thought about doing that, but wanted to focus on the task at hand.  It
does save a strcpy and a bit of stack space, so seems like a win.

Overall, the patch looks good to me.  I think breaking up the if does
make the code more readable.

Regards,
--
-David
[hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Change pgarch_readyXlog() to return .history files first

Michael Paquier-2
On Fri, Dec 21, 2018 at 08:17:12AM +0200, David Steele wrote:
> I thought about doing that, but wanted to focus on the task at hand.  It
> does save a strcpy and a bit of stack space, so seems like a win.
>
> Overall, the patch looks good to me.  I think breaking up the if does make
> the code more readable.

Thanks for the lookups.  I can see that the patch applies without
conflicts down to 9.4, and based on the opinions gathered on this
thread back-patching this stuff is the consensus, based on input from
Kyotaro Horiguchi and David Steele (I don't mind much myself).  So,
any objections from others in doing so?
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Change pgarch_readyXlog() to return .history files first

Michael Paquier-2
On Sat, Dec 22, 2018 at 08:55:14AM +0900, Michael Paquier wrote:
> Thanks for the lookups.  I can see that the patch applies without
> conflicts down to 9.4, and based on the opinions gathered on this
> thread back-patching this stuff is the consensus, based on input from
> Kyotaro Horiguchi and David Steele (I don't mind much myself).  So,
> any objections from others in doing so?

On REL9_4_STABLE, IsTLHistoryFileName() goes missing, so committed
"only" down to 9.5.  Thanks David for the patch and Horiguchi-san for
the review.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Change pgarch_readyXlog() to return .history files first

David Steele
On 12/24/18 1:31 PM, Michael Paquier wrote:

> On Sat, Dec 22, 2018 at 08:55:14AM +0900, Michael Paquier wrote:
>> Thanks for the lookups.  I can see that the patch applies without
>> conflicts down to 9.4, and based on the opinions gathered on this
>> thread back-patching this stuff is the consensus, based on input from
>> Kyotaro Horiguchi and David Steele (I don't mind much myself).  So,
>> any objections from others in doing so?
>
> On REL9_4_STABLE, IsTLHistoryFileName() goes missing, so committed
> "only" down to 9.5.  Thanks David for the patch and Horiguchi-san for
> the review.

Thanks, Michael!

I'm not too worried about 9.4 since it is currently the oldest supported
version.  HA users tend to be on the leading edge of the upgrade curve
and others have the opportunity to upgrade if the reordering will help them.

--
-David
[hidden email]