[PATCH] xlogreader: do not read a file block twice

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH] xlogreader: do not read a file block twice

a.zakirov
Hello hackers,

Grigory noticed that one of our utilities has very slow performance when
xlogreader reads zlib archives. We found out that xlogreader sometimes
reads a WAL file block twice.

zlib has slow performance when you read an archive not in sequential
order. I think reading a block twice in same position isn't sequential,
because gzread() moves current position forward and next call gzseek()
to the same position moves it back.

It seems that the attached patch solves the issue. I think when reqLen
== state->readLen the requested block already is in the xlogreader's buffer.

What do you think?

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

xlogreader-reqlen.patch (558 bytes) Download Attachment
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] xlogreader: do not read a file block twice

Grigory Smolkin
Hm, looks like it could speed up PostgreSQL recovery, but is it safe?


On 02/11/2019 07:25 PM, Arthur Zakirov wrote:

> Hello hackers,
>
> Grigory noticed that one of our utilities has very slow performance
> when xlogreader reads zlib archives. We found out that xlogreader
> sometimes reads a WAL file block twice.
>
> zlib has slow performance when you read an archive not in sequential
> order. I think reading a block twice in same position isn't
> sequential, because gzread() moves current position forward and next
> call gzseek() to the same position moves it back.
>
> It seems that the attached patch solves the issue. I think when reqLen
> == state->readLen the requested block already is in the xlogreader's
> buffer.
>
> What do you think?
>

--
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] xlogreader: do not read a file block twice

Michael Paquier-2
On Mon, Feb 11, 2019 at 07:32:59PM +0300, Grigory Smolkin wrote:
> Hm, looks like it could speed up PostgreSQL recovery, but is it
> safe?

(Please avoid top-posting.)

> On 02/11/2019 07:25 PM, Arthur Zakirov wrote:
>> Grigory noticed that one of our utilities has very slow performance when
>> xlogreader reads zlib archives. We found out that xlogreader sometimes
>> reads a WAL file block twice.
>>
>> What do you think?

I think that such things, even if they look simple, need a careful
lookup, and I have not looked at the proposal yet.  Could you add it
to the next commit fest so as we don't lose track of it?
https://commitfest.postgresql.org/22/
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] xlogreader: do not read a file block twice

a.zakirov
On 12.02.2019 07:23, Michael Paquier wrote:

>> On 02/11/2019 07:25 PM, Arthur Zakirov wrote:
>>> Grigory noticed that one of our utilities has very slow performance when
>>> xlogreader reads zlib archives. We found out that xlogreader sometimes
>>> reads a WAL file block twice.
>>>
>>> What do you think?
>
> I think that such things, even if they look simple, need a careful
> lookup, and I have not looked at the proposal yet.  Could you add it
> to the next commit fest so as we don't lose track of it?
> https://commitfest.postgresql.org/22/

Of course. Agree, it may be a non trivial case. Added as a bug fix:
https://commitfest.postgresql.org/22/1994/

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] xlogreader: do not read a file block twice

Andrey Lepikhov
In reply to this post by a.zakirov


On 11.02.2019 21:25, Arthur Zakirov wrote:

> Hello hackers,
>
> Grigory noticed that one of our utilities has very slow performance when
> xlogreader reads zlib archives. We found out that xlogreader sometimes
> reads a WAL file block twice.
>
> zlib has slow performance when you read an archive not in sequential
> order. I think reading a block twice in same position isn't sequential,
> because gzread() moves current position forward and next call gzseek()
> to the same position moves it back.
>
> It seems that the attached patch solves the issue. I think when reqLen
> == state->readLen the requested block already is in the xlogreader's
> buffer.
>
> What do you think?
I looked at the history of the code changes:

---------------------------------------------------------------
7fcbf6a405f (Alvaro Herrera 2013-01-16 16:12:53 -0300 539)
reqLen < state->readLen)

1bb2558046c (Heikki Linnakangas 2010-01-27 15:27:51 +0000 9349)
        targetPageOff == readOff && targetRecOff < readLen)

eaef111396e (Tom Lane 2006-04-03 23:35:05 +0000 3842)
len = XLOG_BLCKSZ - RecPtr->xrecoff % XLOG_BLCKSZ;
4d14fe0048c (Tom Lane 2001-03-13 01:17:06 +0000 3843)
if (total_len > len)
---------------------------------------------------------------

In the original code of Tom Lane, condition (total_len > len) caused a
page reread from disk. As I understand it, this is equivalent to your
proposal.
Th code line in commit 1bb2558046c seems tantamount to the corresponding
line in commit 7fcbf6a405f but have another semantics: the targetPageOff
value can't be more or equal XLOG_BLCKSZ, but the reqLen value can be.
It may be a reason of appearance of possible mistake, introduced by
commit 7fcbf6a405f.

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] xlogreader: do not read a file block twice

a.zakirov
On 12.02.2019 20:47, Andrey Lepikhov wrote:

> I looked at the history of the code changes:
>
> ---------------------------------------------------------------
> 7fcbf6a405f (Alvaro Herrera 2013-01-16 16:12:53 -0300 539) reqLen <
> state->readLen)
>
> 1bb2558046c (Heikki Linnakangas 2010-01-27 15:27:51 +0000 9349)        
> targetPageOff == readOff && targetRecOff < readLen)
>
> eaef111396e (Tom Lane 2006-04-03 23:35:05 +0000 3842)
> len = XLOG_BLCKSZ - RecPtr->xrecoff % XLOG_BLCKSZ;
> 4d14fe0048c (Tom Lane 2001-03-13 01:17:06 +0000 3843)
> if (total_len > len)
> ---------------------------------------------------------------
>
> In the original code of Tom Lane, condition (total_len > len) caused a
> page reread from disk. As I understand it, this is equivalent to your
> proposal.
> Th code line in commit 1bb2558046c seems tantamount to the corresponding
> line in commit 7fcbf6a405f but have another semantics: the targetPageOff
> value can't be more or equal XLOG_BLCKSZ, but the reqLen value can be.
> It may be a reason of appearance of possible mistake, introduced by
> commit 7fcbf6a405f.

Thank you for your research. Indeed, it makes sense now.

In my case after reading a page both `reqLen` and `state->readLen` equal
to XLOG_BLCKSZ. This leads to a page reread, since `pageptr` is the same
as the previous read. But `targetRecOff` is different in the second case
because we want to read next record, which probably doesn't fit into the
page wholly (that's why `reqLen` is equal to XLOG_BLCKSZ).

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] xlogreader: do not read a file block twice

Michael Paquier-2
In reply to this post by a.zakirov
On Tue, Feb 12, 2019 at 11:44:14AM +0300, Arthur Zakirov wrote:
> Of course. Agree, it may be a non trivial case. Added as a bug fix:
> https://commitfest.postgresql.org/22/1994/

I have been looking at the patch, and I agree that the current coding
is a bit crazy.  If the wanted data has already been read, it makes
little sense to require reading it again if the size requested by the
caller of ReadPageInternal() exactly equals what has been read
already, and that's what the code is doing.

Now I don't actually agree that this qualifies as a bug fix.  As
things stand, a page may finish by being more than once if what has
been read previously equals what is requested, however this does not
prevent the code to work correctly.  The performance gain is also
heavily dependent on the callback reading a page and the way the WAL
reader is used.  How do you actually read WAL pages in your own
plugin with compressed data?  It begins by reading a full page once,
then it moves on to a per-record read after making sure that the page
has been read?
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] xlogreader: do not read a file block twice

a.zakirov
On 14.02.2019 09:51, Michael Paquier wrote:
> Now I don't actually agree that this qualifies as a bug fix.  As
> things stand, a page may finish by being more than once if what has
> been read previously equals what is requested, however this does not
> prevent the code to work correctly.  The performance gain is also
> heavily dependent on the callback reading a page and the way the WAL
> reader is used.  How do you actually read WAL pages in your own
> plugin with compressed data?  It begins by reading a full page once,
> then it moves on to a per-record read after making sure that the page
> has been read?

Yes, an application reads WAL pages wholly at a time. It is done within
SimpleXLogPageRead() (it is a read_page callback passed to
XLogReaderAllocate()). It returns XLOG_BLCKSZ.

Here is the part of the code, not sure that it will be useful though:

SimpleXLogPageRead(...)
{
     ...
     targetPageOff = targetPagePtr % private_data->xlog_seg_size;
     ...
     if (gzseek(private_data->gz_xlogfile, (z_off_t) targetPageOff,
                SEEK_SET) == -1)
         ...
     if (gzread(private_data->gz_xlogfile, readBuf, XLOG_BLCKSZ) !=
                XLOG_BLCKSZ)
     ...
     return XLOG_BLCKSZ;
}

So we read whole page with size XLOG_BLCKSZ. The full code:
https://github.com/postgrespro/pg_probackup/blob/c052651b8c8864733bcabbc2660c387b792229d8/src/parsexlog.c#L1074

Here is the little optimization I made. Mainly I just add a buffer to
store previous read page:
https://github.com/postgrespro/pg_probackup/blob/c052651b8c8864733bcabbc2660c387b792229d8/src/parsexlog.c#L1046

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] xlogreader: do not read a file block twice

Michael Paquier-2
On Thu, Feb 14, 2019 at 11:20:56AM +0300, Arthur Zakirov wrote:
> So we read whole page with size XLOG_BLCKSZ. The full code:
> https://github.com/postgrespro/pg_probackup/blob/c052651b8c8864733bcabbc2660c387b792229d8/src/parsexlog.c#L1074
>
> Here is the little optimization I made. Mainly I just add a buffer to store
> previous read page:
> https://github.com/postgrespro/pg_probackup/blob/c052651b8c8864733bcabbc2660c387b792229d8/src/parsexlog.c#L1046

Thanks, I see what you have done.  I cannot comment if your shortcut
is actually fully correct based on my knowledge of this code, but
things cannot be in the best conditions without having the WAL reader
handle properly the limits.  So I am planning to commit what you
propose after an extra pass on it in the next couple of days or so.
--
Michael

signature.asc (849 bytes) Download Attachment