Custom compression methods

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
146 messages Options
1 ... 5678
Reply | Threaded
Open this post in threaded view
|

Re: Custom compression methods

Andres Freund
Hi,

On 2020-06-19 13:03:02 -0400, Robert Haas wrote:

> - I can see three possible ways of breaking our dependence on 'pglz'
> for TOAST compression. Option #1 is to pick one new algorithm which we
> think is better than 'pglz' in all relevant ways and use it as the
> default for all new compressed datums. This would be dramatically
> simpler than what this patch does, because there would be no user
> interface. It would just be out with the old and in with the new.
> Option #2 is to create a short list of new algorithms that have
> different trade-offs; e.g. one that is very fast (like lz4) and one
> that has an extremely high compression ratio, and provide an interface
> for users to choose between them. This would be moderately simpler
> than what this patch does, because we would expose to the user
> anything about how a new compression method could be added, but it
> would still require a UI for the user to choose between the available
> (and hard-coded) options. It has the further advantage that every
> PostgreSQL cluster will offer the same options (or a subset of them,
> perhaps, depending on configure flags) and so you don't have to worry
> that, say, a pg_am row gets lost and suddenly all of your toasted data
> is inaccessible and uninterpretable. Option #3 is to do what this
> patch actually does, which is to allow for the addition of any number
> of compressors, including by extensions. It has the advantage that new
> compressors can be added with core's permission, so, for example, if
> it is unclear whether some excellent compressor is free of patent
> problems, we can elect not to ship support for it in core, while at
> the same time people who are willing to accept the associated legal
> risk can add that functionality to their own copy as an extension
> without having to patch core. The legal climate may even vary by
> jurisdiction, so what might be questionable in country A might be
> clearly just fine in country B. Aside from those issues, this approach
> allows people to experiment and innovate outside of core relatively
> quickly, instead of being bound by the somewhat cumbrous development
> process which has left this patch in limbo for the last few years. My
> view is that option #1 is likely to be impractical, because getting
> people to agree is hard, and better things are likely to come along
> later, and people like options. So I prefer either #2 or #3.

I personally favor going for #2, at least initially. Then we can discuss
the runtime-extensibility of #3 separately.


> - The next question is how a datum compressed with some non-default
> method should be represented on disk. The patch handles this first of
> all by making the observation that the compressed size can't be >=1GB,
> because the uncompressed size can't be >=1GB, and we wouldn't have
> stored it compressed if it expanded. Therefore, the upper two bits of
> the compressed size should always be zero on disk, and the patch
> steals one of them to indicate whether "custom" compression is in use.
> If it is, the 4-byte varlena header is followed not only by a 4-byte
> size (now with the new flag bit also included) but also by a 4-byte
> OID, indicating the compression AM in use. I don't think this is a
> terrible approach, but I don't think it's amazing, either. 4 bytes is
> quite a bit to use for this; if I guess correctly what will be a
> typical cluster configuration, you probably would really only need
> about 2 bits. For a datum that is both stored externally and
> compressed, the overhead is likely negligible, because the length is
> probably measured in kB or MB. But for a datum that is compressed but
> not stored externally, it seems pretty expensive; the datum is
> probably short, and having an extra 4 bytes of uncompressible data
> kinda sucks. One possibility would be to allow only one byte here:
> require each compression AM that is installed to advertise a one-byte
> value that will denote its compressed datums. If more than one AM
> tries to claim the same byte value, complain. Another possibility is
> to abandon this approach and go with #2 from the previous paragraph.
> Or maybe we add 1 or 2 "privileged" built-in compressors that get
> dedicated bit-patterns in the upper 2 bits of the size field, with the
> last bit pattern being reserved for future algorithms. (e.g. 0x00 =
> pglz, 0x01 = lz4, 0x10 = zstd, 0x11 = something else - see within for
> details).

Agreed. I favor an approach roughly like I'd implemented below
https://postgr.es/m/20130605150144.GD28067%40alap2.anarazel.de

I.e. leave the vartag etc as-is, but utilize the fact that pglz
compressed datums starts with a 4 byte length header, and that due to
the 1GB limit, the first two bits currently have to be 0. That allows to
indicate 2 compression methods without any space overhead, and
additional compression methods are supported by using an additional byte
(or some variable length encoded larger amount) if both bits are 1.


> - Yet another possible approach to the on-disk format is to leave
> varatt_external.va_extsize and varattrib_4b.rawsize untouched and
> instead add new compression methods by adding new vartag_external
> values. There's quite a lot of bit-space available there: we have a
> whole byte, and we're currently only using 4 values. We could easily
> add a half-dozen new possibilities there for new compression methods
> without sweating the bit-space consumption. The main thing I don't
> like about this is that it only seems like a useful way to provide for
> out-of-line compression. Perhaps it could be generalized to allow for
> inline compression as well, but it seems like it would take some
> hacking.

One additional note: Adding additional vartag_external values does incur
some noticable cost, distributed across lots of places.


> - One thing I really don't like about the patch is that it consumes a
> bit from infomask2 for a new flag HEAP_HASCUSTOMCOMPRESSED. infomask
> bits are at a premium, and there's been no real progress in the decade
> plus that I've been hanging around here in clawing back any bit-space.
> I think we really need to avoid burning our remaining bits for
> anything other than a really critical need, and I don't think I
> understand what the need is in this case. I might be missing
> something, but I'd really strongly suggest looking for a way to get
> rid of this. It also invents the concept of a TupleDesc flag, and the
> flag invented is TD_ATTR_CUSTOM_COMPRESSED; I'm not sure I see why we
> need that, either.

+many


Small note: The current patch adds #include "postgres.h" to a few
headers - it shouldn't do so.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: Custom compression methods

Robert Haas
On Mon, Jun 22, 2020 at 4:53 PM Andres Freund <[hidden email]> wrote:

> > Or maybe we add 1 or 2 "privileged" built-in compressors that get
> > dedicated bit-patterns in the upper 2 bits of the size field, with the
> > last bit pattern being reserved for future algorithms. (e.g. 0x00 =
> > pglz, 0x01 = lz4, 0x10 = zstd, 0x11 = something else - see within for
> > details).
>
> Agreed. I favor an approach roughly like I'd implemented below
> https://postgr.es/m/20130605150144.GD28067%40alap2.anarazel.de
> I.e. leave the vartag etc as-is, but utilize the fact that pglz
> compressed datums starts with a 4 byte length header, and that due to
> the 1GB limit, the first two bits currently have to be 0. That allows to
> indicate 2 compression methods without any space overhead, and
> additional compression methods are supported by using an additional byte
> (or some variable length encoded larger amount) if both bits are 1.

I think there's essentially no difference between these two ideas,
unless the two bits we're talking about stealing are not the same in
the two cases. Am I missing something?

> One additional note: Adding additional vartag_external values does incur
> some noticable cost, distributed across lots of places.

OK.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Reply | Threaded
Open this post in threaded view
|

Re: Custom compression methods

Andres Freund
Hi,

On 2020-06-23 14:27:47 -0400, Robert Haas wrote:

> On Mon, Jun 22, 2020 at 4:53 PM Andres Freund <[hidden email]> wrote:
> > > Or maybe we add 1 or 2 "privileged" built-in compressors that get
> > > dedicated bit-patterns in the upper 2 bits of the size field, with the
> > > last bit pattern being reserved for future algorithms. (e.g. 0x00 =
> > > pglz, 0x01 = lz4, 0x10 = zstd, 0x11 = something else - see within for
> > > details).
> >
> > Agreed. I favor an approach roughly like I'd implemented below
> > https://postgr.es/m/20130605150144.GD28067%40alap2.anarazel.de
> > I.e. leave the vartag etc as-is, but utilize the fact that pglz
> > compressed datums starts with a 4 byte length header, and that due to
> > the 1GB limit, the first two bits currently have to be 0. That allows to
> > indicate 2 compression methods without any space overhead, and
> > additional compression methods are supported by using an additional byte
> > (or some variable length encoded larger amount) if both bits are 1.

https://postgr.es/m/20130621000900.GA12425%40alap2.anarazel.de is a
thread with more information / patches further along.


> I think there's essentially no difference between these two ideas,
> unless the two bits we're talking about stealing are not the same in
> the two cases. Am I missing something?

I confused this patch with the approach in
https://www.postgresql.org/message-id/d8576096-76ba-487d-515b-44fdedba8bb5%402ndquadrant.com
sorry for that.  It obviously still differs by not having lower space
overhead (by virtue of not having a 4 byte 'va_cmid', but no additional
space for two methods, and then 1 byte overhead for 256 more), but
that's not that fundamental a difference.

I do think it's nicer to hide the details of the compression inside
toast specific code as the version in the "further along" thread above
did.


The varlena stuff feels so archaic, it's hard to keep it all in my head...


I think I've pondered that elsewhere before (but perhaps just on IM with
you?), but I do think we'll need a better toast pointer format at some
point. It's pretty fundamentally based on having the 1GB limit, which I
don't think we can justify for that much longer.

Using something like https://postgr.es/m/20191210015054.5otdfuftxrqb5gum%40alap3.anarazel.de
I'd probably make it something roughly like:

1) signed varint indicating "in-place" length
1a) if positive, it's "plain" "in-place" data
1b) if negative, data type indicator follows. abs(length) includes size of metadata.
2) optional: unsigned varint metadata type indicator
3) data

Because 1) is the size of the data, toast datums can be skipped with a
relatively low amount of instructions during tuple deforming. Instead of
needing a fair number of branches, as the case right now.

So a small in-place uncompressed varlena2 would have an overhead of 1
byte up to 63 bytes, and 2 bytes otherwise (with 8 kb pages at least).

An in-place compressed datum could have an overhead as low as 3 bytes (1
byte length, 1 byte indicator for type of compression, 1 byte raw size),
although I suspect it's rarely going to be useful at that small sizes.


Anyway. I think it's probably reasonable to utilize those two bits
before going to a new toast format. But if somebody were more interested
in working on toastv2 I'd not push back either.

Regards,

Andres


Reply | Threaded
Open this post in threaded view
|

Re: Custom compression methods

Robert Haas
On Tue, Jun 23, 2020 at 4:00 PM Andres Freund <[hidden email]> wrote:
> https://postgr.es/m/20130621000900.GA12425%40alap2.anarazel.de is a
> thread with more information / patches further along.
>
> I confused this patch with the approach in
> https://www.postgresql.org/message-id/d8576096-76ba-487d-515b-44fdedba8bb5%402ndquadrant.com
> sorry for that.  It obviously still differs by not having lower space
> overhead (by virtue of not having a 4 byte 'va_cmid', but no additional
> space for two methods, and then 1 byte overhead for 256 more), but
> that's not that fundamental a difference.

Wait a minute. Are we saying there are three (3) dueling patches for
adding an alternate TOAST algorithm? It seems like there is:

This "custom compression methods" thread - vintage 2017 - Original
code by Nikita Glukhov, later work by Ildus Kurbangaliev
The "pluggable compression support" thread - vintage 2013 - Andres Freund
The "plgz performance" thread - vintage 2019 - Petr Jelinek

Anyone want to point to a FOURTH implementation of this feature?

I guess the next thing to do is figure out which one is the best basis
for further work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Reply | Threaded
Open this post in threaded view
|

Re: Custom compression methods

Dilip Kumar-2
On Wed, Jun 24, 2020 at 5:30 PM Robert Haas <[hidden email]> wrote:

>
> On Tue, Jun 23, 2020 at 4:00 PM Andres Freund <[hidden email]> wrote:
> > https://postgr.es/m/20130621000900.GA12425%40alap2.anarazel.de is a
> > thread with more information / patches further along.
> >
> > I confused this patch with the approach in
> > https://www.postgresql.org/message-id/d8576096-76ba-487d-515b-44fdedba8bb5%402ndquadrant.com
> > sorry for that.  It obviously still differs by not having lower space
> > overhead (by virtue of not having a 4 byte 'va_cmid', but no additional
> > space for two methods, and then 1 byte overhead for 256 more), but
> > that's not that fundamental a difference.
>
> Wait a minute. Are we saying there are three (3) dueling patches for
> adding an alternate TOAST algorithm? It seems like there is:
>
> This "custom compression methods" thread - vintage 2017 - Original
> code by Nikita Glukhov, later work by Ildus Kurbangaliev
> The "pluggable compression support" thread - vintage 2013 - Andres Freund
> The "plgz performance" thread - vintage 2019 - Petr Jelinek
>
> Anyone want to point to a FOURTH implementation of this feature?
>
> I guess the next thing to do is figure out which one is the best basis
> for further work.

I have gone through these 3 threads and here is a summary of what I
understand from them.  Feel free to correct me if I have missed
something.

#1. Custom compression methods:  Provide a mechanism to create/drop
compression methods by using external libraries, and it also provides
a way to set the compression method for the columns/types.  There are
a few complexities with this approach those are listed below:

a. We need to maintain the dependencies between the column and the
compression method.  And the bigger issue is, even if the compression
method is changed, we need to maintain the dependencies with the older
compression methods as we might have some older tuples that were
compressed with older methods.
b. Inside the compressed attribute, we need to maintain the
compression method so that we know how to decompress it.  For this, we
use 2 bits from the raw_size of the compressed varlena header.

#2. pglz performance:  Along with pglz this patch provides an
additional compression method using lz4.  The new compression method
can be enabled/disabled during configure time or using SIGHUP.  We use
1 bit from the raw_size of the compressed varlena header to identify
the compression method (pglz or lz4).

#3. pluggable compression:  This proposal is to replace the existing
pglz algorithm,  with the snappy or lz4 whichever is better.  As per
the performance data[1], it appeared that the lz4 is the winner in
most of the cases.
- This also provides an additional patch to plugin any compression method.
- This will also use 2 bits from the raw_size of the compressed
attribute for identifying the compression method.
- Provide an option to select the compression method using GUC,  but
the comments in the patch suggest to remove the GUC.  So it seems that
GUC was used only for the POC.
- Honestly, I did not clearly understand from this patch set that
whether it proposes to replace the existing compression method with
the best method (and the plugin is just provided for performance
testing) or it actually proposes an option to have pluggable
compression methods.

IMHO,  We can provide a solution based on #1 and #2, i.e. we can
provide a few best compression methods in the core, and on top of
that, we can also provide a mechanism to create/drop the external
compression methods.

[1] https://www.postgresql.org/message-id/20130621000900.GA12425%40alap2.anarazel.de

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Custom compression methods

Andres Freund
In reply to this post by Robert Haas
Hi,

On 2020-06-24 07:59:47 -0400, Robert Haas wrote:

> On Tue, Jun 23, 2020 at 4:00 PM Andres Freund <[hidden email]> wrote:
> > https://postgr.es/m/20130621000900.GA12425%40alap2.anarazel.de is a
> > thread with more information / patches further along.
> >
> > I confused this patch with the approach in
> > https://www.postgresql.org/message-id/d8576096-76ba-487d-515b-44fdedba8bb5%402ndquadrant.com
> > sorry for that.  It obviously still differs by not having lower space
> > overhead (by virtue of not having a 4 byte 'va_cmid', but no additional
> > space for two methods, and then 1 byte overhead for 256 more), but
> > that's not that fundamental a difference.
>
> Wait a minute. Are we saying there are three (3) dueling patches for
> adding an alternate TOAST algorithm? It seems like there is:
>
> This "custom compression methods" thread - vintage 2017 - Original
> code by Nikita Glukhov, later work by Ildus Kurbangaliev
> The "pluggable compression support" thread - vintage 2013 - Andres Freund
> The "plgz performance" thread - vintage 2019 - Petr Jelinek
>
> Anyone want to point to a FOURTH implementation of this feature?

To be clear, I don't think the 2003 patch should be considered as being
"in the running".

Greetings,

Andres Freund


1 ... 5678