Caveats from reloption toast_tuple_target

Caveats from reloption toast_tuple_target

Michael Paquier-2
Hi all,
(Adding Simon as the author of toast_tuple_target, as well Andrew and
Pavan in CC.)

toast_tuple_target was introduced in 2017 by commit c251336, as of
v11.  While reviewing Pavan's patch to allow finer control over the
compression threshold of a tuple, I bumped into some surprising code:
https://www.postgresql.org/message-id/20190403044916.GD3298@...

As far as I understand it, even with this option we don't try to toast
tuples in heap_prepare_insert() and heap_update() where
TOAST_TUPLE_THRESHOLD gets used to define if a tuple can be toasted or
not.  The same applies to raw_heap_insert() in rewriteheap.c, and
needs_toast_table() in toasting.c.
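To make the issue concrete, here is a minimal standalone sketch (not
actual PostgreSQL source; the constant value assumes the default 8kB
block size) of what those code paths effectively check versus what the
reloption promises:

```c
#include <stdbool.h>

/* Compile-time default; about 2032 bytes with 8kB pages (assumed here). */
#define TOAST_TUPLE_THRESHOLD 2032

/* What heap_prepare_insert()/heap_update() effectively do on HEAD:
 * the per-relation toast_tuple_target is never consulted. */
static bool
needs_toast_head(int tuple_len, int toast_tuple_target)
{
    (void) toast_tuple_target;   /* reloption ignored */
    return tuple_len > TOAST_TUPLE_THRESHOLD;
}

/* What one would expect: the reloption replaces the compiled constant. */
static bool
needs_toast_expected(int tuple_len, int toast_tuple_target)
{
    return tuple_len > toast_tuple_target;
}
```

With toast_tuple_target = 1024, a 1248-byte tuple would be toasted by
needs_toast_expected() but is left inline by needs_toast_head().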

Shouldn't we use the reloption instead of the compiled threshold to
determine if a tuple should be toasted or not?  Perhaps I am missing
something?  It seems to me that this is a bug that should be
back-patched, but it could also be qualified as a behavior change for
existing relations.

Thoughts?
--
Michael


Re: Caveats from reloption toast_tuple_target

Robert Haas
On Wed, Apr 3, 2019 at 2:38 AM Michael Paquier <[hidden email]> wrote:

> Hi all,
> (Adding Simon as the author of toast_tuple_target, as well Andrew and
> Pavan in CC.)
>
> toast_tuple_target has been introduced in 2017 by c251336 as of v11.
> And while reviewing Pavan's patch to have more complex control over
> the compression threshold of a tuple, I have bumped into some
> surprising code:
> https://www.postgresql.org/message-id/20190403044916.GD3298@...
>
> As far as I understand it, even with this option we don't try to toast
> tuples in heap_prepare_insert() and heap_update() where
> TOAST_TUPLE_THRESHOLD gets used to define if a tuple can be toasted or
> not.  The same applies to raw_heap_insert() in rewriteheap.c, and
> needs_toast_table() in toasting.c.
>
> Shouldn't we use the reloption instead of the compiled threshold to
> determine if a tuple should be toasted or not?  Perhaps I am missing
> something?  It seems to me that this is a bug that should be
> back-patched, but it could also be qualified as a behavior change for
> existing relations.

Could you explain a bit more clearly what you think the bug is?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Caveats from reloption toast_tuple_target

Michael Paquier-2
On Wed, Apr 03, 2019 at 12:13:51PM -0400, Robert Haas wrote:
> On Wed, Apr 3, 2019 at 2:38 AM Michael Paquier <[hidden email]> wrote:
>> Shouldn't we use the reloption instead of the compiled threshold to
>> determine if a tuple should be toasted or not?  Perhaps I am missing
>> something?  It seems to me that this is a bug that should be
>> back-patched, but it could also be qualified as a behavior change for
>> existing relations.
>
> Could you explain a bit more clearly what you think the bug is?

I mean that toast_tuple_target is broken as-is: it should be used as
the threshold to decide whether a new tuple of a relation gets
toasted, but we don't actually use the reloption value for that
decision; the default threshold TOAST_TUPLE_THRESHOLD gets used
instead all the time!  The code does not even create a toast table in
some cases.

The attached patch may make this clearer, as code is sometimes easier
to follow than words.  Here is also a simple example:
CREATE TABLE toto (b text) WITH (toast_tuple_target = 1024);
INSERT INTO toto SELECT string_agg('', md5(random()::text))
  FROM generate_series(1,10); -- 288 bytes
SELECT pg_relation_size(reltoastrelid) = 0 AS is_empty
  FROM pg_class WHERE relname = 'toto';
INSERT INTO toto SELECT string_agg('', md5(random()::text))
  FROM generate_series(1,40); -- 1248 bytes
SELECT pg_relation_size(reltoastrelid) = 0 AS is_empty
  FROM pg_class WHERE relname = 'toto';

On HEAD, the second INSERT does *not* toast the tuple (is_empty is
true).  With the patch attached, the second INSERT toasts the tuple as
I would expect (is_empty is *false*) per the custom threshold
defined.

While on it, I think that the extra argument for
RelationGetToastTupleTarget() is useless as the default value should
be TOAST_TUPLE_THRESHOLD all the time.

Does this make sense?
--
Michael

[Attachment: toast-tuple-target-fix.patch (4K)]

Re: Caveats from reloption toast_tuple_target

Pavan Deolasee


On Thu, Apr 4, 2019 at 11:36 AM Michael Paquier <[hidden email]> wrote:


> I mean that toast_tuple_target is broken as-is, because it should be
> used on the new tuples of a relation as a threshold to decide if this
> tuple should be toasted or not, but we don't actually use the
> reloption value for that decision-making: the default threshold
> TOAST_TUPLE_THRESHOLD gets used instead all the time!  The code does
> not even create a toast table in some cases.

I think the problem with the existing code is that while it allows users to set toast_tuple_target below TOAST_TUPLE_THRESHOLD, that value is not honoured when deciding whether to toast a row.  AFAICS it works fine when toast_tuple_target is greater than or equal to TOAST_TUPLE_THRESHOLD, i.e. rows are not toasted unless they are larger than toast_tuple_target.
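A standalone sketch of that asymmetry (an illustration, not the actual
heapam code; the constant assumes the default 8kB page size): the
compiled threshold gates whether toasting is attempted at all, and the
reloption only matters once that gate has been passed.

```c
#include <stdbool.h>

/* Compile-time default; about 2032 bytes with 8kB pages (assumed here). */
#define TOAST_TUPLE_THRESHOLD 2032

/* Effective behaviour on HEAD: the compiled threshold decides whether
 * toasting is attempted at all; only then does the reloption matter. */
static bool
row_gets_toasted(int tuple_len, int toast_tuple_target)
{
    if (tuple_len <= TOAST_TUPLE_THRESHOLD)
        return false;   /* a target below the threshold is dead code */
    return tuple_len > toast_tuple_target;
}
```

With toast_tuple_target = 4096, a 3000-byte row stays inline, as
documented; with toast_tuple_target = 1024, a 1248-byte row also stays
inline, which is the surprise.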

IMV it makes sense to simply cap the lower limit of toast_tuple_target at the compile-time default and update the docs to reflect that.  Otherwise, we need to deal with the possibility of dynamically creating the toast table if the relation option is lowered after the table is created.  Your proposed patch handles the case where toast_tuple_target is specified at CREATE TABLE, but we would still have a problem with ALTER TABLE, no?  There might also be side effects of changing the lower limit for pg_dump/pg_restore, so we would need to think about that too.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Caveats from reloption toast_tuple_target

David Rowley-3
On Fri, 5 Apr 2019 at 17:31, Pavan Deolasee <[hidden email]> wrote:
> IMV it makes sense to simply cap the lower limit of toast_tuple_target to the compile time default and update docs to reflect that. Otherwise, we need to deal with the possibility of dynamically creating the toast table if the relation option is lowered after creating the table. Your proposed patch handles the case when the toast_tuple_target is specified at CREATE TABLE, but we would still have problem with ALTER TABLE, no? But there might be side effects of changing the lower limit for pg_dump/pg_restore. So we would need to think about that too.

FWIW I independently stumbled upon this problem today and concluded
the same thing: we can only make the lower limit of the
toast_tuple_target reloption the same as TOAST_TUPLE_THRESHOLD.  (I
was unaware of this thread, so I reported it in [1].)

I only had a quick look at Michael's patch, and it does not seem to do
anything for the case where no toast table exists and the user lowers
the reloption; nothing is there to build a new toast table.

I mentioned over in [1] that:
> It does not seem possible to add/remove the toast table when the
> reloption is changed either, as we're only obtaining a
> ShareUpdateExclusiveLock to set it. We'd likely need to upgrade that
> to an AccessExclusiveLock to do that.

Reading over the original discussion in [2], Simon seemed more
interested in delaying toasting for tuples larger than 2040 bytes,
not making it happen sooner.  This makes sense, since the smaller a
datum is, the less likely it is to compress well.

The question is whether we can increase the lower limit.  We don't
want pg_upgrade or pg_dump reloads from older versions failing.
Perhaps we can just silently set the reloption to
TOAST_TUPLE_THRESHOLD when the user gives us a lower value.  At least
then lower values would disappear over time.
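A sketch of that silent-clamp idea (illustration only; wiring it into
reloptions would need something like an assign hook, which does not
exist today):

```c
/* Values below the compiled threshold are accepted but quietly raised
 * to it, so dumps from older versions restore cleanly and out-of-range
 * settings age out over time.  2032 assumes the default 8kB page size. */
#define TOAST_TUPLE_THRESHOLD 2032

static int
clamp_toast_tuple_target(int requested)
{
    return (requested < TOAST_TUPLE_THRESHOLD)
        ? TOAST_TUPLE_THRESHOLD
        : requested;
}
```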

[1] https://www.postgresql.org/message-id/CAKJS1f9vrJ55oYe7un+rakTzwaGh3my5MA0RBfyNngAXu7eVeQ@...
[2] https://postgr.es/m/CANP8+jKsVmw6CX6YP9z7zqkTzcKV1+Uzr3XjKcZW=2Ya00OyQQ@...

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Caveats from reloption toast_tuple_target

David Rowley-3
In reply to this post by Pavan Deolasee
On Fri, 5 Apr 2019 at 17:31, Pavan Deolasee <[hidden email]> wrote:
> IMV it makes sense to simply cap the lower limit of toast_tuple_target to the compile time default and update docs to reflect that. Otherwise, we need to deal with the possibility of dynamically creating the toast table if the relation option is lowered after creating the table. Your proposed patch handles the case when the toast_tuple_target is specified at CREATE TABLE, but we would still have problem with ALTER TABLE, no? But there might be side effects of changing the lower limit for pg_dump/pg_restore. So we would need to think about that too.

I've attached a patch which raises the lower limit to
TOAST_TUPLE_TARGET.  Unfortunately, reloptions don't have an
assign_hook like GUCs do; unless we add one, we have no way to keep
accepting lower values without an error.  Does anyone think that's
worth adding for this?  Without it, it's possible that pg_restore /
pg_upgrade could fail if someone happened to have toast_tuple_target
set lower than 2032 bytes.

I didn't bother capping RelationGetToastTupleTarget() to ensure it
never returns less than TOAST_TUPLE_TARGET since the code that
currently uses it can't trigger if it's lower than that value.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

[Attachment: toast_tuple_target_fix_david.patch (3K)]

Re: Caveats from reloption toast_tuple_target

David Rowley-3
On Tue, 16 Apr 2019 at 23:30, David Rowley <[hidden email]> wrote:
> I've attached a patch which increases the lower limit up to
> TOAST_TUPLE_TARGET.  Unfortunately, reloptions don't have an
> assign_hook like GUCs do. Unless we add those we've no way to still
> accept lower values without an error.  Does anyone think that's worth
> adding for this?  Without it, it's possible that pg_restore /
> pg_upgrade could fail if someone happened to have toast_tuple_target
> set lower than 2032 bytes.

Hi Michael,

I'm just wondering if you've had any thoughts on this?

To recap, Pavan and I think it's a problem to allow
toast_tuple_target to be reduced, as the relation in question may not
have a toast table defined.  It does not seem possible to call
create_toast_table() when toast_tuple_target is changed, since we
only obtain a ShareUpdateExclusiveLock when doing that.

The options seem to be:
1. Make the lower limit of toast_tuple_target the same as
TOAST_TUPLE_THRESHOLD; or
2. Require an AccessExclusiveLock when setting toast_tuple_target and
call create_toast_table() to ensure we get a toast table when the
setting is reduced to a level that means the target table may require
a toast relation.

I sent a patch for #1, but maybe someone thinks we should do #2?  The
reason I've not explored #2 further is that, reading the original
discussion when this feature was developed, the main interest seemed
to be keeping values inline longer.  Moving them out of line sooner
seems to make less sense, since smaller values are less likely to
compress as well as larger ones.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Caveats from reloption toast_tuple_target

Michael Paquier-2
On Tue, Apr 30, 2019 at 02:20:27PM +1200, David Rowley wrote:
> The options seem to be:
> 1. Make the lower limit of toast_tuple_target  the same as
> TOAST_TUPLE_THRESHOLD; or
> 2. Require an AccessExclusiveLock when setting toast_tuple_target and
> call create_toast_table() to ensure we get a toast table when the
> setting is reduced to a level that means the target table may require
> a toast relation.

Actually, the patch I sent upthread does call create_toast_table() to
attempt to create a toast table.  However, it fails lamentably because
it lacks an exclusive lock when setting toast_tuple_target, as you
outlined:
CREATE TABLE ab (a char(300));
ALTER TABLE ab SET (toast_tuple_target = 128);
INSERT INTO ab SELECT string_agg('', md5(random()::text))
  FROM generate_series(1,10); -- 288 bytes

The ALTER TABLE step fails like this:
ERROR:  XX000: AccessExclusiveLock required to add toast table.

Now, if I upgrade to an AccessExclusiveLock, tuples are toasted
according to the lower threshold as expected.  Here is the stack
trace if you are interested:
#0  create_toast_table (rel=0x7f8af648d178, toastOid=0,
toastIndexOid=0, reloptions=0, lockmode=8, check=true) at
toasting.c:131
#1  0x00005627da7a4eca in CheckAndCreateToastTable (relOid=16385,
reloptions=0, lockmode=8, check=true) at toasting.c:86
#2  0x00005627da7a4e1e in AlterTableCreateToastTable (relOid=16385,
reloptions=0, lockmode=8) at toasting.c:63
#3  0x00005627da87f479 in ATRewriteCatalogs (wqueue=0x7fffb77cfae8,
lockmode=8) at tablecmds.c:4185

> I sent a patch for #1, but maybe someone thinks we should do #2?   The
> reason I've not explored #2 more is that after reading the original
> discussion when this patch was being developed, the main interest
> seemed to be keeping the values inline longer.  Moving them out of
> line sooner seems to make less sense since smaller values are less
> likely to compress as well as larger values.

Yes, I have been chewing on the original thread from Simon, and it
really seems that he was interested in larger values when working on
this patch.  And anyway, on HEAD we currently allow a toast table to
be created only if the threshold is at least TOAST_TUPLE_THRESHOLD,
so we have an inconsistency between reloptions.c and
needs_toast_table().

There could be an argument for allowing lower thresholds, but let's
see if somebody has a good use case for it.  That would require
upgrading the lock needed to set toast_tuple_target, and thinking
about it more, I don't actually have an argument in favor of it.

Now, can we really increase the minimum value as you and Pavan
propose?  For now, anything between 128 and TOAST_TUPLE_TARGET gets
silently ignored, but if we increase the minimum as you propose we
could prevent some dumps from being restored, and as storage
parameters are defined as part of a WITH clause in CREATE TABLE, this
could break restores for a lot of users.  We could make pg_dump clamp
any value between 128 and TOAST_TUPLE_THRESHOLD to
TOAST_TUPLE_THRESHOLD, but that is a lot of complication just to take
care of one inconsistency.

Hence, based on those arguments, there is option #3: do nothing.
Perhaps the surrounding comments could make the current behavior less
confusing, though.
--
Michael


Re: Caveats from reloption toast_tuple_target

David Rowley-3
On Tue, 14 May 2019 at 18:49, Michael Paquier <[hidden email]> wrote:
> Now, can we really increase the minimum value as you and Pavan
> propose?  For now anything between 128 and TOAST_TUPLE_TARGET gets
> silently ignored, but if we increase the threshold as you propose we
> could prevent some dumps to be restored, and as storage parameters are
> defined as part of a WITH clause in CREATE TABLE, this could break
> restores for a lot of users. We could tell pg_dump to enforce any
> values between 128 and TOAST_TUPLE_THRESHOLD to be enforced to
> TOAST_TUPLE_THRESHOLD, still that's a lot of complication just to take
> care of one inconsistency.

If we had reloption validation functions then we could, but we don't,
so it seems we'd have no choice but to report a hard ERROR.

I guess it's not impossible for pg_dump to fail on this even without
this change.  If someone had increased the limit on an instance with,
say, a 16kB page size to something over what TOAST_TUPLE_TARGET_MAIN
would be on a standard instance, then restoring onto an 8kB page
instance would fail.  Of course, that's less likely since it involves
a whole other factor in the equation, but it's still not impossible,
so maybe we need to think about it harder.

> Hence, based on that those arguments, there is option #3 to do
> nothing.  Perhaps the surrounding comments could make the current
> behavior less confusing though.

I see this item has been moved to the "Nothing to do" section of the
open items list.  I'd really like to see a few other people comment
before we ignore this.  We only get one opportunity per year to
release a fix like this, and it would be good to get further
confirmation if we want to leave it as-is.

In my view, someone who has to go to the trouble of changing this
setting probably has quite a bit of data in their database and is
quite unlikely to be using pg_dump because of that.  Does that mean
we can make this an ERROR?  I don't know, but it would be good to
hear what others think.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Caveats from reloption toast_tuple_target

Michael Paquier-2
On Tue, May 21, 2019 at 12:33:54PM +1200, David Rowley wrote:
> I guess it's not impossible for pg_dump to fail on this even without
> this change. If someone had increased the limit on an instance with
> say 16k page to something over what TOAST_TUPLE_TARGET_MAIN would be
> on a standard instance, then restoring onto the 8k page instance will
> fail.   Of course, that's less likely since it's a whole other factor
> in the equation, and it's still not impossible, so maybe we need to
> think about it harder.

Sure, this one would be possible as well, though much less likely I
guess, as I don't imagine much of our user base performs upgrades to
new instances while also changing the page size.  One way around that
would be to define the option as a ratio of the page size instead.  I
would imagine that changing compile-time constants when moving to a
new version becomes more common now that we have logical replication,
since you can move things with close to zero downtime without relying
on the physical page size.

> I see this item has been moved to the "Nothing to do" section of the
> open items list.  I'd really like to see a few other people comment
> before we go and ignore this.  We only get 1 opportunity to release a
> fix like this per year and it would be good to get further
> confirmation if we want to leave this.

Yes, I moved this item without seeing any replies.  Anyway, it's
really the kind of thing I'd rather not touch post-beta, and I see
disadvantages in doing what you and Pavan propose as well.  There is
also the argument that toast_tuple_target is so specialized that
close to zero people are using it, hence changing its lower bound
would impact nobody.

> In my view, someone who has to go to the trouble of changing this
> setting is probably going to have quite a bit of data in their
> database and is quite unlikely to be using pg_dump due to that.  Does
> that mean we can make this cause an ERROR?... I don't know, but would
> be good to hear what others think.

Sure.  Other opinions are welcome.  Perhaps I lack insight and user
stories on the matter, but unfortunately I see downsides in
everything discussed.  I am a rather pessimistic guy by nature.
--
Michael
