libpq compression

libpq compression

konstantin knizhnik
Hi hackers,

One of our customers managed to improve speed about 10 times by using SSL compression for a system where the client and servers are located in different geographical regions and query results are very large because of JSON columns. They actually do not need encryption, just compression.
I expect that this is not the only case where compression of the libpq protocol can be useful. Please note that Postgres replication also uses the libpq protocol.

Taking into account that a vulnerability was found in SSL compression, so that SSL compression is now considered deprecated and insecure (http://www.postgresql-archive.org/disable-SSL-compression-td6010072.html), it would be nice to have some alternative mechanism for reducing libpq traffic.

I have implemented a prototype of it (patch is attached).
To use zstd compression, Postgres should be configured with --with-zstd. Otherwise compression will use zlib, unless it is disabled by the --without-zlib option.
I have added a compression=on/off parameter to the connection string and a -Z option to the psql and pgbench utilities.
Below are some results:

Compression ratio (raw->compressed):

                        libz (level=1)       libzstd (level=1)
pgbench -i -s 10        16997209->2536330    16997209->268077
pgbench -t 100000 -S    6289036->1523862     6288933->1777400
                        6600338<-900293      6600338<-1000318

There is no mistyping: libzstd compresses COPY data about 10 times better than libz, with a wonderful compression ratio of 63.
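
For readers who want to check the arithmetic, the ratios quoted here follow directly from the byte counts reported for "pgbench -i -s 10". A small Python sketch (the variable names are illustrative; only the numbers come from the message):

```python
# Byte counts for "pgbench -i -s 10" as reported in the message.
raw = 16997209
zlib_compressed = 2536330
zstd_compressed = 268077

zlib_ratio = raw / zlib_compressed                # how many times zlib shrinks the stream
zstd_ratio = raw / zstd_compressed                # how many times zstd shrinks the stream
zstd_vs_zlib = zlib_compressed / zstd_compressed  # zstd output relative to zlib output

print(f"zlib ratio: {zlib_ratio:.1f}")            # about 6.7
print(f"zstd ratio: {zstd_ratio:.1f}")            # about 63.4
print(f"zstd output is {zstd_vs_zlib:.1f}x smaller than zlib output")  # about 9.5
```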

The influence on execution time is minimal (I tested a local configuration where client and server are on the same host):


                        no compression    libz (level=1)    libzstd (level=1)
pgbench -i -s 10        1.552             1.572             1.611
pgbench -t 100000 -S    4.482             4.926             4.877
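
To put "minimal" into numbers, the relative slowdown can be derived from the timings reported here. The sketch below is illustrative only: the dictionary layout and the function name are assumptions, and the message does not state the time unit, which does not matter for a relative comparison.

```python
# Timings as reported in the message (unit not stated there).
timings = {
    "pgbench -i -s 10":     {"none": 1.552, "zlib": 1.572, "zstd": 1.611},
    "pgbench -t 100000 -S": {"none": 4.482, "zlib": 4.926, "zstd": 4.877},
}

def overhead_percent(baseline, measured):
    """Relative slowdown of `measured` against `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0

overheads = {
    test: {algo: overhead_percent(t["none"], t[algo]) for algo in ("zlib", "zstd")}
    for test, t in timings.items()
}

for test, per_algo in overheads.items():
    for algo, pct in per_algo.items():
        print(f"{test}: {algo} adds {pct:.1f}%")
```

On these numbers the overhead is roughly 1-4% for the COPY-heavy initialization and roughly 9-10% for the select-only run, all measured on a local connection where compression cannot save any network time.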

    
-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Attachment: libpq-compression-2.patch (35K)

Re: libpq compression

Dmitry Dolgov
> On 30 March 2018 at 14:53, Konstantin Knizhnik <[hidden email]> wrote:
> Hi hackers,
> One of our customers managed to improve speed about 10 times by using SSL compression for a system where the client and servers are located in different geographical regions and query results are very large because of JSON columns. They actually do not need encryption, just compression.
> I expect that this is not the only case where compression of the libpq protocol can be useful. Please note that Postgres replication also uses the libpq protocol.
>
> Taking into account that a vulnerability was found in SSL compression, so that SSL compression is now considered deprecated and insecure (http://www.postgresql-archive.org/disable-SSL-compression-td6010072.html), it would be nice to have some alternative mechanism for reducing libpq traffic.
>
> I have implemented a prototype of it (patch is attached).
> To use zstd compression, Postgres should be configured with --with-zstd. Otherwise compression will use zlib, unless it is disabled by the --without-zlib option.
> I have added a compression=on/off parameter to the connection string and a -Z option to the psql and pgbench utilities.

I'm a bit confused why there was no reply to this. I mean, it wasn't sent on
1st April, the patch can still be applied on top of the master branch, and it
looks like it even works.

I assume the main concern here is that it's implemented in a not very
extensible way. Also, if I understand correctly, it compresses the data stream
in both directions, server <-> client, and I'm not sure there is any value in
compressing what a client sends to a server. But still, I'm wondering why it
didn't at least start a discussion about how it could be implemented. Am I
missing something?

Re: libpq compression

konstantin knizhnik


On 15.05.2018 13:23, Dmitry Dolgov wrote:
> I'm a bit confused why there was no reply to this. I mean, it wasn't sent on
> 1st April, the patch can still be applied on top of the master branch, and it
> looks like it even works.
>
> I assume the main concern here is that it's implemented in a not very
> extensible way. Also, if I understand correctly, it compresses the data stream
> in both directions, server <-> client, and I'm not sure there is any value in
> compressing what a client sends to a server. But still, I'm wondering why it
> didn't at least start a discussion about how it could be implemented. Am I
> missing something?

An implementation of libpq compression will be included in the next release of PgProEE.
It looks like the community is not so interested in this patch. Frankly speaking, I do not understand why.
Compression of libpq traffic can significantly increase the speed of:
1. COPY
2. Replication (both streaming and logical)
3. Queries returning large result sets (for example JSON) over slow connections.

It is possible to compress libpq traffic using SSL compression. But SSL compression is an unsafe and deprecated feature.

Yes, this patch is not extensible: it can use either zlib or zstd. Unfortunately, the internal Postgres compression, pglz, doesn't provide a streaming API.
Maybe it is a good idea to combine it with Ildus's patch (custom compression methods): https://commitfest.postgresql.org/18/1294/
In that case it would be possible to use any custom compression algorithm. But we would need to design and implement a streaming API for pglz and other compressors.
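
To illustrate what "streaming API" means in this context, here is a minimal Python sketch using the standard zlib module. It is not taken from the patch: the function names and sample messages are hypothetical. The point is that the compressor keeps its state across messages and can be flushed after each one, so the peer can decode every message as soon as it arrives; a one-shot block compressor does not offer this.

```python
import zlib

# One compressor/decompressor pair per connection direction. The compression
# window is kept across messages, which is what a streaming API provides.
compressor = zlib.compressobj(level=1)
decompressor = zlib.decompressobj()

def send_compressed(message: bytes) -> bytes:
    """Compress one protocol message and flush so the peer can decode it now."""
    return compressor.compress(message) + compressor.flush(zlib.Z_SYNC_FLUSH)

def receive_compressed(chunk: bytes) -> bytes:
    """Decompress one received chunk and return the bytes it carried."""
    return decompressor.decompress(chunk)

# Hypothetical messages standing in for query results.
messages = [b'{"id": 1, "payload": "aaaaaaaaaaaaaaaa"}',
            b'{"id": 2, "payload": "aaaaaaaaaaaaaaaa"}']

wire_chunks = [send_compressed(m) for m in messages]
received = [receive_compressed(c) for c in wire_chunks]

assert received == messages
print([len(m) for m in messages], "->", [len(c) for c in wire_chunks])
```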


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: libpq compression

Andrew Dunstan-8


On 05/15/2018 08:53 AM, Konstantin Knizhnik wrote:

> An implementation of libpq compression will be included in the next
> release of PgProEE.
> It looks like the community is not so interested in this patch. Frankly
> speaking, I do not understand why.
> Compression of libpq traffic can significantly increase the speed of:
> 1. COPY
> 2. Replication (both streaming and logical)
> 3. Queries returning large result sets (for example JSON) over
> slow connections.
>
> It is possible to compress libpq traffic using SSL compression. But
> SSL compression is an unsafe and deprecated feature.
>
> Yes, this patch is not extensible: it can use either zlib or zstd.
> Unfortunately, the internal Postgres compression, pglz, doesn't provide
> a streaming API.
> Maybe it is a good idea to combine it with Ildus's patch (custom
> compression methods): https://commitfest.postgresql.org/18/1294/
> In that case it would be possible to use any custom compression
> algorithm. But we would need to design and implement a streaming API
> for pglz and other compressors.


I'm sure there is plenty of interest in this. However, you guys need to
understand where we are in the development cycle. We're trying to wrap
up Postgres 11, which was feature frozen before this patch ever landed.
So it's material for Postgres 12. That means it will probably need to
wait a little while before it gets attention. It doesn't mean nobody is
interested.

I disagree with Dmitry about compressing in both directions - I can
think of plenty of good cases where we would want to compress traffic
from the client.

cheers

andrew


--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: libpq compression

Craig Ringer-3
On 15 May 2018 at 21:36, Andrew Dunstan <[hidden email]> wrote:


> An implementation of libpq compression will be included in the next release of PgProEE.
> It looks like the community is not so interested in this patch. Frankly speaking, I do not understand why.

I'm definitely very interested, and simply missed the post.

I'll talk with some team mates as we're doing some PG12 planning now.
 
> Yes, this patch is not extensible: it can use either zlib or zstd. Unfortunately, the internal Postgres compression, pglz, doesn't provide a streaming API.
> Maybe it is a good idea to combine it with Ildus's patch (custom compression methods): https://commitfest.postgresql.org/18/1294/

Given the history of issues with attempting custom/pluggable compression for toast etc, I really wouldn't want to couple those up.

pglz wouldn't make much sense for protocol compression anyway, except maybe for fast local links where it was worth a slight compression overhead but not the cpu needed for gzip. I don't think it's too exciting. zlib/gzip is likely the sweet spot for the reasonable future for protocol compression, or a heck of a lot better than what we have, anyway.

We should make sure the protocol part is extensible, but the implementation doesn't need to be pluggable.

> In this case it will be possible to use any custom compression algorithm. But we need to design and implement streaming API for pglz and other compressors.

> I'm sure there is plenty of interest in this. However, you guys need to understand where we are in the development cycle. We're trying to wrap up Postgres 11, which was feature frozen before this patch ever landed. So it's material for Postgres 12. That means it will probably need to wait a little while before it gets attention. It doesn't mean nobody is interested.
>
> I disagree with Dmitry about compressing in both directions - I can think of plenty of good cases where we would want to compress traffic from the client.

Agreed. The most obvious case being COPY, but there's also big bytea values, etc. 


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: libpq compression

Euler Taveira
In reply to this post by konstantin knizhnik
2018-05-15 9:53 GMT-03:00 Konstantin Knizhnik <[hidden email]>:
> Looks like community is not so interested in this patch. Frankly speaking I
> do not understand why.
>
AFAICS the lack of replies is due to feature freeze. I'm pretty sure
people are interested in this topic (at least I am). Did you review a
previous discussion [1] about this?

I did a prototype a few years ago. I haven't looked at your patch yet.
I'll do so in a few weeks. Please add your patch to the next CF [2].


[1] https://www.postgresql.org/message-id/4FD9698F.2090407%40timbira.com
[2] https://commitfest.postgresql.org/18/


--
   Euler Taveira                                   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


Re: libpq compression

Grigory Smolkin
In reply to this post by konstantin knizhnik

Hello!
I have noticed that psql --help lacks the -Z|--compression option.
Also, it would be nice to have an option like --compression-level in psql and pgbench.


-- 
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: libpq compression

konstantin knizhnik


On 16.05.2018 18:09, Grigory Smolkin wrote:
>
> Hello!
> I have noticed that psql --help lacks the -Z|--compression option.
> Also, it would be nice to have an option like --compression-level in psql
> and pgbench.
>
Thank you for pointing this out.
An updated and rebased patch is attached.
Concerning specification of the compression level: I have made many
experiments with different data sets and with both zlib and zstd, and in
both cases using a compression level higher than the default does not
noticeably increase the compression ratio, but quite significantly reduces
speed. Moreover, for "pgbench -i" zstd provides a better compression ratio
(63 times!) with compression level 1 than with the largest recommended
compression level, 22! This is why I decided not to allow the user to choose
the compression level.
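
A rough way to repeat this kind of experiment is sketched below in Python. It covers only zlib, because that module ships with the standard library, and it uses made-up repetitive rows rather than the pgbench data from the measurements above, so the numbers it prints are not comparable with those results.

```python
import time
import zlib

# Synthetic, highly repetitive rows loosely resembling COPY output.
# This is invented sample data, not the data set used in the experiments.
rows = b"".join(b"%d\t%d\t0\t\n" % (i, i % 10) for i in range(100000))

sizes = {}
seconds = {}
for level in (1, 6, 9):
    start = time.perf_counter()
    compressed = zlib.compress(rows, level)
    seconds[level] = time.perf_counter() - start
    assert zlib.decompress(compressed) == rows   # round trip is lossless
    sizes[level] = len(compressed)

for level in sizes:
    print(f"level {level}: {len(rows)} -> {sizes[level]} bytes "
          f"(ratio {len(rows) / sizes[level]:.1f}, {seconds[level]:.4f} s)")
```

Comparing the ratio and time columns across levels shows, for a given data set, whether a higher level buys enough extra compression to justify its cost.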

Attachment: libpq-compression-3.patch (36K)