Hot Standby Feedback should default to on in 9.3+

Andres Freund-3
Hi,

The subject says it all.

There are workloads where it's detrimental, but in general having it
default to on improves the experience tremendously because getting conflicts
because of vacuum is rather confusing.

In the workloads where it might not be a good idea (very long queries on
the standby, many dead tuples on the primary) you need to think very
carefully about the strategy of avoiding conflicts anyway, and explicit
configuration is required as well.

Does anybody have an argument against changing the default value?

Greetings,

Andres Freund

--
 Andres Freund                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Hot Standby Feedback should default to on in 9.3+

Simon Riggs
On 30 November 2012 19:02, Andres Freund <[hidden email]> wrote:

> The subject says it all.
>
> There are workloads where it's detrimental, but in general having it
> default to on improves the experience tremendously because getting conflicts
> because of vacuum is rather confusing.
>
> In the workloads where it might not be a good idea (very long queries on
> the standby, many dead tuples on the primary) you need to think very
> carefully about the strategy of avoiding conflicts anyway, and explicit
> configuration is required as well.
>
> Does anybody have an argument against changing the default value?

I don't see a technical objection, perhaps others do.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Hot Standby Feedback should default to on in 9.3+

Robert Haas
In reply to this post by Andres Freund-3
On Fri, Nov 30, 2012 at 2:02 PM, Andres Freund <[hidden email]> wrote:
> Does anybody have an argument against changing the default value?

Well, the disadvantage of it is that the standby can bloat the master,
which might be surprising to some people, too.  But I don't really
have a lot of skin in this game.

While we're talking about changing defaults, how about changing the
default value of the recovery.conf parameter 'standby_mode' to on?
Not sure about anybody else, but I never want it any other way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Hot Standby Feedback should default to on in 9.3+

Josh Berkus
In reply to this post by Andres Freund-3

> In the workloads where it might not be a good idea (very long queries on
> the standby, many dead tuples on the primary) you need to think very
> carefully about the strategy of avoiding conflicts anyway, and explicit
> configuration is required as well.
>
> Does anybody have an argument against changing the default value?

On balance, I think it's a good idea.  It's easier for new users,
conceptually, to deal with table bloat than query cancel.

Have we done testing on how much query cancel it actually eliminates?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Hot Standby Feedback should default to on in 9.3+

Heikki Linnakangas-6
In reply to this post by Andres Freund-3
On 30.11.2012 21:02, Andres Freund wrote:

> Hi,
>
> The subject says it all.
>
> There are workloads where it's detrimental, but in general having it
> default to on improves the experience tremendously because getting conflicts
> because of vacuum is rather confusing.
>
> In the workloads where it might not be a good idea (very long queries on
> the standby, many dead tuples on the primary) you need to think very
> carefully about the strategy of avoiding conflicts anyway, and explicit
> configuration is required as well.
>
> Does anybody have an argument against changing the default value?

-1. By default, I would expect a standby server to not have any
meaningful impact on the performance of the master. With hot standby
feedback, you can bloat the master very badly if you're not careful.

Think of someone setting up a test server, by setting it up as a standby
from the master. Now, when someone holds a transaction open in the test
server, you get bloat in the master. Or if you set up a standby for
reporting purposes - a very common use case - you would not expect a
long running ad-hoc query in the standby to bloat the master. That's
precisely why you set up such a standby in the first place.

You could of course still turn it off, but you would have to know about
it in the first place. I think it's a reasonable assumption that a
standby does *not* affect the master (aside from the bandwidth and disk
space required to retain/ship the WAL). If you have to remember to
explicitly set a GUC to get that behavior, that's a pretty big gotcha.

- Heikki



Re: Hot Standby Feedback should default to on in 9.3+

Claudio Freire
On Fri, Nov 30, 2012 at 5:46 PM, Heikki Linnakangas
<[hidden email]> wrote:
>
> Think of someone setting up a test server, by setting it up as a standby
> from the master. Now, when someone holds a transaction open in the test
> server, you get bloat in the master. Or if you set up a standby for
> reporting purposes - a very common use case - you would not expect a long
> running ad-hoc query in the standby to bloat the master. That's precisely
> why you set up such a standby in the first place.

Without hot standby feedback, reporting queries are impossible. I've
experienced it. Cancellations make it impossible to finish any
decently complex reporting query.



Re: Hot Standby Feedback should default to on in 9.3+

Tom Lane-2
In reply to this post by Robert Haas
Robert Haas <[hidden email]> writes:
> While we're talking about changing defaults, how about changing the
> default value of the recovery.conf parameter 'standby_mode' to on?
> Not sure about anybody else, but I never want it any other way.

Dunno, it's been only a couple of days since there was a thread about
somebody who had turned it on and not gotten the results he wanted
(because he was only trying to do a point-in-time recovery not create
a standby).  There's enough other configuration needed to set up a
standby node that I'm not sure flipping this default helps the case
much.

But having said that, would it be practical to get rid of the explicit
standby_mode parameter altogether?  I'm thinking we could assume standby
mode is wanted if primary_conninfo has a nonempty value.

There remains the case of a standby being fed solely from WAL archive
without primary_conninfo, but that's a pretty darn corner-y corner case,
and I doubt it has to be easy to set up.  One possibility for it is to
allow primary_conninfo to be set to "none", which would still trigger
standby mode but could be coded to not enable connection attempts.

Mind you, I'm not sure that such a design is easier to understand or
document.  But it would be one less parameter.
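
The rule sketched above can be written down as a small decision function.
This is only an illustration in Python, not PostgreSQL code: the function
name infer_standby_mode and the keys of the returned dictionary are
hypothetical, and the special value "none" is the possibility floated in
this message, not an existing setting.

```python
from typing import Optional


def infer_standby_mode(primary_conninfo: Optional[str]) -> dict:
    """Hypothetical model of the proposal: derive standby mode from primary_conninfo."""
    value = (primary_conninfo or "").strip()
    if not value:
        # Empty or unset: plain archive / point-in-time recovery, no standby.
        return {"standby": False, "connect": False}
    if value.lower() == "none":
        # Proposed special value: standby fed solely from the WAL archive.
        return {"standby": True, "connect": False}
    # Any other nonempty value: standby that connects to the primary.
    return {"standby": True, "connect": True}
```

Under this model a point-in-time recovery needs no extra parameter, while
the archive-only standby is the one case that has to be spelled out.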

                        regards, tom lane



Re: Hot Standby Feedback should default to on in 9.3+

Heikki Linnakangas-6
In reply to this post by Claudio Freire
On 30.11.2012 22:49, Claudio Freire wrote:

> On Fri, Nov 30, 2012 at 5:46 PM, Heikki Linnakangas
> <[hidden email]>  wrote:
>>
>> Think of someone setting up a test server, by setting it up as a standby
>> from the master. Now, when someone holds a transaction open in the test
>> server, you get bloat in the master. Or if you set up a standby for
>> reporting purposes - a very common use case - you would not expect a long
>> running ad-hoc query in the standby to bloat the master. That's precisely
>> why you set up such a standby in the first place.
>
> Without hot standby feedback, reporting queries are impossible. I've
> experienced it. Cancellations make it impossible to finish any
> decently complex reporting query.

Maybe so, but I'd rather get cancellations in the standby, and then read
up on feedback and the other options and figure out how to make it work,
than get severe bloat in the master and scratch my head wondering what's
causing it.

- Heikki



Re: Hot Standby Feedback should default to on in 9.3+

Kevin Grittner-4
In reply to this post by Andres Freund-3
Claudio Freire wrote:

> Without hot standby feedback, reporting queries are impossible.
> I've experienced it. Cancellations make it impossible to finish
> any decently complex reporting query.

With what setting of max_standby_streaming_delay? I would rather
default that to -1 than default hot_standby_feedback on. That way
what you do on the standby only affects the standby.

A default that allows anyone who has a read-only login to a standby
to bloat the master, which may require hours of down time to
correct, seems dangerous to me.

-Kevin



Re: Hot Standby Feedback should default to on in 9.3+

Claudio Freire
On Fri, Nov 30, 2012 at 6:06 PM, Kevin Grittner <[hidden email]> wrote:
>
>> Without hot standby feedback, reporting queries are impossible.
>> I've experienced it. Cancellations make it impossible to finish
>> any decently complex reporting query.
>
> With what setting of max_standby_streaming_delay? I would rather
> default that to -1 than default hot_standby_feedback on. That way
> what you do on the standby only affects the standby.

1d



Re: Hot Standby Feedback should default to on in 9.3+

Tom Lane-2
In reply to this post by Claudio Freire
Claudio Freire <[hidden email]> writes:
> Without hot standby feedback, reporting queries are impossible. I've
> experienced it. Cancellations make it impossible to finish any
> decently complex reporting query.

The original expectation was that slave-side cancels would be
infrequent.  Maybe there's some fixing/tuning to be done there.

                        regards, tom lane



Re: Hot Standby Feedback should default to on in 9.3+

Kevin Grittner-4
In reply to this post by Andres Freund-3
Claudio Freire wrote:

>> With what setting of max_standby_streaming_delay? I would rather
>> default that to -1 than default hot_standby_feedback on. That
>> way what you do on the standby only affects the standby.
>
> 1d

Was there actually a transaction hanging open for an entire day on
the standby? Was it a query which actually ran that long, or an
ill-behaved user or piece of software?

I have most certainly managed databases where holding up vacuuming
on the source would cripple performance to the point that users
would have demanded that any other process causing it must be
immediately canceled. And canceling it wouldn't be enough at that
point -- the bloat would still need to be fixed before they could
work efficiently.

-Kevin



Re: Hot Standby Feedback should default to on in 9.3+

Claudio Freire
On Fri, Nov 30, 2012 at 6:20 PM, Kevin Grittner <[hidden email]> wrote:

> Claudio Freire wrote:
>
>>> With what setting of max_standby_streaming_delay? I would rather
>>> default that to -1 than default hot_standby_feedback on. That
>>> way what you do on the standby only affects the standby.
>>
>> 1d
>
> Was there actually a transaction hanging open for an entire day on
> the standby? Was it a query which actually ran that long, or an
> ill-behaved user or piece of software?

No, and if there was, I wouldn't care for it to be cancelled.

Queries were being cancelled way before that timeout was reached,
probably something to do with wal_keep_segments on the master side
being unable to keep up for that long.

> I have most certainly managed databases where holding up vacuuming
> on the source would cripple performance to the point that users
> would have demanded that any other process causing it must be
> immediately canceled. And canceling it wouldn't be enough at that
> point -- the bloat would still need to be fixed before they could
> work efficiently.

I wouldn't mind occasional cancels, but these were recurring. When a
query ran long enough, there was no way for it to finish, no matter
how many times you tried. The master never stops being busy, that's
probably a factor.



Re: Hot Standby Feedback should default to on in 9.3+

Heikki Linnakangas-6
On 30.11.2012 23:40, Claudio Freire wrote:

> On Fri, Nov 30, 2012 at 6:20 PM, Kevin Grittner<[hidden email]>  wrote:
>> Claudio Freire wrote:
>>
>>>> With what setting of max_standby_streaming_delay? I would rather
>>>> default that to -1 than default hot_standby_feedback on. That
>>>> way what you do on the standby only affects the standby.
>>>
>>> 1d
>>
>> Was there actually a transaction hanging open for an entire day on
>> the standby? Was it a query which actually ran that long, or an
>> ill-behaved user or piece of software?
>
> No, and if there was, I wouldn't care for it to be cancelled.
>
> Queries were being cancelled way before that timeout was reached,
> probably something to do with wal_keep_segments on the master side
> being unable to keep up for that long.

Running out of wal_keep_segments would produce a different error,
requiring a new base backup.

>> I have most certainly managed databases where holding up vacuuming
>> on the source would cripple performance to the point that users
>> would have demanded that any other process causing it must be
>> immediately canceled. And canceling it wouldn't be enough at that
>> point -- the bloat would still need to be fixed before they could
>> work efficiently.
>
> I wouldn't mind occasional cancels, but these were recurring. When a
> query ran long enough, there was no way for it to finish, no matter
> how many times you tried. The master never stops being busy, that's
> probably a factor.

Hmm, it sounds like max_standby_streaming_delay=1d didn't work as
intended for some reason. It should've given the query one day to run
before canceling it. Unless the standby was running one day behind the
master already, but that seems unlikely. Any chance you could reproduce
that?
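
One way to picture the intended behaviour: as the PostgreSQL documentation
describes it, max_standby_streaming_delay bounds the total time allowed to
apply WAL once it has been received, so a standby that is already lagging
leaves a conflicting query less time. The Python below is a toy model under
that reading; grace_seconds and its arguments are made-up names, and the
real server logic is more involved.

```python
def grace_seconds(max_delay_s: float, already_behind_s: float) -> float:
    """Toy model: how long a conflicting standby query may still run.

    max_delay_s      -- max_standby_streaming_delay in seconds, -1 = wait forever
    already_behind_s -- how long the conflicting WAL has waited since receipt
    """
    if max_delay_s < 0:
        # -1 means replay waits indefinitely for conflicting queries.
        return float("inf")
    # The budget is shared with any lag the standby has already built up.
    return max(0.0, max_delay_s - already_behind_s)
```

With a setting of 1d and no lag the model gives the query a full day; only
a standby already a day behind would leave it no time at all.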

- Heikki



Re: Hot Standby Feedback should default to on in 9.3+

Claudio Freire
On Fri, Nov 30, 2012 at 6:49 PM, Heikki Linnakangas
<[hidden email]> wrote:

>>> I have most certainly managed databases where holding up vacuuming
>>> on the source would cripple performance to the point that users
>>> would have demanded that any other process causing it must be
>>> immediately canceled. And canceling it wouldn't be enough at that
>>> point -- the bloat would still need to be fixed before they could
>>> work efficiently.
>>
>>
>> I wouldn't mind occasional cancels, but these were recurring. When a
>> query ran long enough, there was no way for it to finish, no matter
>> how many times you tried. The master never stops being busy, that's
>> probably a factor.
>
>
> Hmm, it sounds like max_standby_streaming_delay=1d didn't work as intended
> for some reason. It should've given the query one day to run before
> canceling it. Unless the standby was running one day behind the master
> already, but that seems unlikely. Any chance you could reproduce that?

I have a pre-production server with replication for these tests. I
could create a fake stream of writes on it, disable feedback, and see
what happens.



Re: Hot Standby Feedback should default to on in 9.3+

Andres Freund-3
In reply to this post by Robert Haas
On 2012-11-30 14:35:37 -0500, Robert Haas wrote:
> On Fri, Nov 30, 2012 at 2:02 PM, Andres Freund <[hidden email]> wrote:
> > Does anybody have an argument against changing the default value?
>
> Well, the disadvantage of it is that the standby can bloat the master,
> which might be surprising to some people, too.  But I don't really
> have a lot of skin in this game.

Sure, that's a problem. But ISTM that it's a problem everyone running
postgres has to know about anyway from running the master itself.

> While we're talking about changing defaults, how about changing the
> default value of the recovery.conf parameter 'standby_mode' to on?
> Not sure about anybody else, but I never want it any other way.

Hm. But only if there is a recovery.conf I guess?

Andres

--
 Andres Freund                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Hot Standby Feedback should default to on in 9.3+

Daniel Farina-4
In reply to this post by Robert Haas
On Fri, Nov 30, 2012 at 11:35 AM, Robert Haas <[hidden email]> wrote:
> On Fri, Nov 30, 2012 at 2:02 PM, Andres Freund <[hidden email]> wrote:
>> Does anybody have an argument against changing the default value?
>
> Well, the disadvantage of it is that the standby can bloat the master,
> which might be surprising to some people, too.  But I don't really
> have a lot of skin in this game.

Under this precept, we used to not enable hot standby feedback and
instead allowed more or less unbounded staleness of the standby
through very long cancellation times. Although not immediate,
eventually we decided that enough people were getting confused by
sufficiently long standby delay caused by bad queries and idle in xact
backends, so now we have enabled feedback for new database replicants,
along with some fairly un-aggressive cancellation timeouts. It's all
rather messy and not very satisfying.  We have yet to know if feedback
causes or solves problems, on average.

In very early versions we tried the default cancellation settings, and
query cancellation confused everyone a *lot*.  That went away in a
hurry as a result, so I suppose it's not entirely unreasonable to say
in retrospect that the defaults can be considered kind of bad.

Longer term, I think I'd be keen to switch all our user-controlled
replication to logical except for use cases where the workload of the
standby is under our (and not the user's) control, such as for
failover.

Unfortunately, our experience with the feature and its use suggests
that the contract granted by the mechanisms seen in hot standby is
too complex for full-stack developers to keep in careful consideration
along with all the other things they want to do with their application
and/or have to remember about Postgres to get by.

--
fdr



Re: Hot Standby Feedback should default to on in 9.3+

Magnus Hagander-2
In reply to this post by Heikki Linnakangas-6
On Fri, Nov 30, 2012 at 9:46 PM, Heikki Linnakangas
<[hidden email]> wrote:

> On 30.11.2012 21:02, Andres Freund wrote:
>>
>> Hi,
>>
>> The subject says it all.
>>
>> There are workloads where it's detrimental, but in general having it
>> default to on improves the experience tremendously because getting conflicts
>> because of vacuum is rather confusing.
>>
>> In the workloads where it might not be a good idea (very long queries on
>> the standby, many dead tuples on the primary) you need to think very
>> carefully about the strategy of avoiding conflicts anyway, and explicit
>> configuration is required as well.
>>
>> Does anybody have an argument against changing the default value?
>
>
> -1. By default, I would expect a standby server to not have any meaningful
> impact on the performance of the master. With hot standby feedback, you can
> bloat the master very badly if you're not careful.

I'm with Heikki on the -1 on this. It's certainly unexpected to have
the slave affect the master by default - people will expect the master
to be independent.

Also, it doesn't IMHO actually *help*. The big thing that makes it
harder for people to set up replication that way is wal_level=minimal
by default, and in a smaller sense max_wal_senders (but
wal_level=minimal also has the interesting property that it's not
enough to change it to wal_level=hot_standby if you figure it out too
late - you have to turn off hot standby on the slave, start it, have
it catch up, shut it down, and reenable hot standby). And they
require a *restart* of the master, which is a lot worse than a small
change to the config of the *slave*. So unless you're suggesting to
change the default of those two values as well, I'm not sure it really
helps that much...
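
The master-side prerequisites mentioned above can be checked mechanically.
A rough Python sketch, assuming the settings have already been read into a
dictionary (for example from SHOW output); the helper name
hot_standby_blockers is hypothetical, and the fallback values are the
defaults as described in this thread.

```python
def hot_standby_blockers(settings: dict) -> list:
    """Return reasons why this master cannot yet feed a hot standby."""
    problems = []
    # Assumed default when unset: wal_level=minimal, as noted in the thread.
    if settings.get("wal_level", "minimal") != "hot_standby":
        problems.append("wal_level must be hot_standby (restart required)")
    # Streaming replication needs at least one WAL sender; assumed default 0.
    if int(settings.get("max_wal_senders", 0)) < 1:
        problems.append("max_wal_senders must be at least 1 (restart required)")
    return problems
```

An empty dictionary, standing in for an untouched configuration, yields
both problems, which is the point being made: neither is fixed by changing
a default on the slave.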


> Think of someone setting up a test server, by setting it up as a standby
> from the master. Now, when someone holds a transaction open in the test
> server, you get bloat in the master. Or if you set up a standby for
> reporting purposes - a very common use case - you would not expect a long
> running ad-hoc query in the standby to bloat the master. That's precisely
> why you set up such a standby in the first place.
>
> You could of course still turn it off, but you would have to know about it
> in the first place. I think it's a reasonable assumption that a standby does
> *not* affect the master (aside from the bandwidth and disk space required to
> retain/ship the WAL). If you have to remember to explicitly set a GUC to get
> that behavior, that's a pretty big gotcha.

+1. Having your reporting query time out *shows you* the problem.
Having the master bloat for you won't show the problem until later -
when it's much bigger, and it's much more pain to recover from.


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: Hot Standby Feedback should default to on in 9.3+

Magnus Hagander-2
In reply to this post by Tom Lane-2
On Fri, Nov 30, 2012 at 10:09 PM, Tom Lane <[hidden email]> wrote:
> Claudio Freire <[hidden email]> writes:
>> Without hot standby feedback, reporting queries are impossible. I've
>> experienced it. Cancellations make it impossible to finish any
>> decently complex reporting query.
>
> The original expectation was that slave-side cancels would be
> infrequent.  Maybe there's some fixing/tuning to be done there.

It depends completely on the query pattern on the master. Saying that
cancellations make it "impossible to finish any decently complex
reporting query" is completely incorrect - it depends on the queries
on the *master*, not on the complexity of the query on the slave. I
know a lot of scenarios where query cancels pretty much never happen
at all.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: Hot Standby Feedback should default to on in 9.3+

Andres Freund-3
In reply to this post by Heikki Linnakangas-6
On 2012-11-30 22:46:06 +0200, Heikki Linnakangas wrote:

> On 30.11.2012 21:02, Andres Freund wrote:
> >Hi,
> >
> >The subject says it all.
> >
> >There are workloads where it's detrimental, but in general having it
> >default to on improves the experience tremendously because getting conflicts
> >because of vacuum is rather confusing.
> >
> >In the workloads where it might not be a good idea (very long queries on
> >the standby, many dead tuples on the primary) you need to think very
> >carefully about the strategy of avoiding conflicts anyway, and explicit
> >configuration is required as well.
> >
> >Does anybody have an argument against changing the default value?
>
> -1. By default, I would expect a standby server to not have any meaningful
> impact on the performance of the master. With hot standby feedback, you can
> bloat the master very badly if you're not careful.

True. But everyone running postgres hopefully knows the problem
already. So that effect is relatively easy to explain.

The other control possibilities we have are rather hard to understand
and to set up in my experience.

> Think of someone setting up a test server, by setting it up as a standby
> from the master. Now, when someone holds a transaction open in the test
> server, you get bloat in the master. Or if you set up a standby for
> reporting purposes - a very common use case - you would not expect a long
> running ad-hoc query in the standby to bloat the master. That's precisely
> why you set up such a standby in the first place.

But you can't do any meaningful reporting without changing the current
variables around this anyway. If you have any writes on the master,
barely any significant query ever completes.
The two basic choices we give people suck more imo:
* you set up a large delay: It possibly takes a very long time to catch
up if the primary dies, you don't see any up-to-date data in later queries
* you abort queries: You can't do any reporting queries

Both are unusable for most scenarios and getting the former just right
is hard.

Imo a default of on works in far more scenarios than the contrary.

Greetings,

Andres Freund

--
 Andres Freund                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

