Why we lost Uber as a user


Why we lost Uber as a user

Joshua Drake-2
Hello,

The following article is a very good look at some of our limitations and
highlights some of the pains many of us have been working "around" since
we started using the software.

https://eng.uber.com/mysql-migration/

Specifically:

* Inefficient architecture for writes
* Inefficient data replication
* Issues with table corruption
* Poor replica MVCC support
* Difficulty upgrading to newer releases

It is a very good read and I encourage our hackers to do so with an open
mind.

Sincerely,

JD

--
Command Prompt, Inc.                  http://the.postgres.company/
                         +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.



Re: Why we lost Uber as a user

Josh berkus
On 07/26/2016 09:54 AM, Joshua D. Drake wrote:
> Hello,
>
> The following article is a very good look at some of our limitations and
> highlights some of the pains many of us have been working "around" since
> we started using the software.

They also had other reasons to switch to MySQL, particularly around
changes of staffing (the switch happened after they got a new CTO).  And
they encountered that 9.2 bug literally the week we released a fix, per
one of the mailing lists.  Even though they switched away, it's still a
nice testimonial that they once ran their entire worldwide fleet off a
single Postgres cluster.

However, the issues they cite as limitations of our current replication
system are real, or we wouldn't have so many people working on
alternatives.  We could really use pglogical in 10.0, as well as
OLTP-friendly MM replication.

The write amplification issue, and its corollary in VACUUM, certainly
continues to plague some users, and doesn't have any easy solutions.

I do find it interesting that they mention schema changes in passing,
without actually saying anything about them -- given that schema changes
have been one of MySQL's major limitations.  I'll also note that they
don't mention any of MySQL's corresponding weak spots, such as
limitations on table size due to primary key sorting.

One wonders what would have happened if they'd adopted a sharding model
on top of Postgres?

I would like to see someone blog about our testing for replication
corruption issues now, in response to this.

--
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)



Re: Why we lost Uber as a user

Josh berkus
On 07/26/2016 01:53 PM, Josh Berkus wrote:
> The write amplification issue, and its corollary in VACUUM, certainly
> continues to plague some users, and doesn't have any easy solutions.

To explain this in concrete terms, which the blog post does not:

1. Create a small table, but one with enough rows that indexes make
sense (say 50,000 rows).

2. Make this table used in JOINs all over your database.

3. To support these JOINs, index most of the columns in the small table.

4. Now, update that small table 500 times per second.

That's a recipe for runaway table bloat; VACUUM can't do much because
there's always some minutes-old transaction hanging around (and SNAPSHOT
TOO OLD doesn't really help, we're talking about minutes here), and
because of all of the indexes HOT isn't effective.  Removing the indexes
is equally painful because it means less efficient JOINs.

The Uber guy is right that InnoDB handles this better as long as you
don't touch the primary key (primary key updates in InnoDB are really bad).

This is a common problem case we don't have an answer for yet.
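
A minimal SQL sketch of that recipe (all names and numbers are made up
for illustration; this is not Uber's schema):

    -- Steps 1-3: a small, heavily indexed lookup table used in JOINs
    -- all over the database.
    CREATE TABLE lookup (
        id         int PRIMARY KEY,
        name       text,
        region     text,
        status     text,
        updated_at timestamptz
    );
    CREATE INDEX ON lookup (name);
    CREATE INDEX ON lookup (region);
    CREATE INDEX ON lookup (status);
    CREATE INDEX ON lookup (updated_at);

    INSERT INTO lookup
    SELECT i, 'name ' || i, 'region ' || (i % 10), 'ok', now()
    FROM generate_series(1, 50000) AS i;

    -- Step 4: hammer it with updates (run something like this
    -- concurrently, ~500 times per second).  Because updated_at is
    -- indexed, the update is not HOT-eligible, so every index gets a new
    -- entry for every new row version, and VACUUM has to clean them all.
    UPDATE lookup
       SET status = 'ok', updated_at = now()
     WHERE id = 1 + (random() * 49999)::int;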

--
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)



Re: Why we lost Uber as a user

Bruce Momjian
On Tue, Jul 26, 2016 at 02:26:57PM -0700, Josh Berkus wrote:

> On 07/26/2016 01:53 PM, Josh Berkus wrote:
> > The write amplification issue, and its corollary in VACUUM, certainly
> > continues to plague some users, and doesn't have any easy solutions.
>
> To explain this in concrete terms, which the blog post does not:
>
> 1. Create a small table, but one with enough rows that indexes make
> sense (say 50,000 rows).
>
> 2. Make this table used in JOINs all over your database.
>
> 3. To support these JOINs, index most of the columns in the small table.
>
> 4. Now, update that small table 500 times per second.
>
> That's a recipe for runaway table bloat; VACUUM can't do much because
> there's always some minutes-old transaction hanging around (and SNAPSHOT
> TOO OLD doesn't really help, we're talking about minutes here), and
> because of all of the indexes HOT isn't effective.  Removing the indexes
> is equally painful because it means less efficient JOINs.
>
> The Uber guy is right that InnoDB handles this better as long as you
> don't touch the primary key (primary key updates in InnoDB are really bad).
>
> This is a common problem case we don't have an answer for yet.

Or, basically, we don't have an answer for it without making something
else worse.

--
  Bruce Momjian  <[hidden email]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +



Re: Why we lost Uber as a user

Robert Haas
In reply to this post by Josh berkus
On Tue, Jul 26, 2016 at 5:26 PM, Josh Berkus <[hidden email]> wrote:

> On 07/26/2016 01:53 PM, Josh Berkus wrote:
>> The write amplification issue, and its corollary in VACUUM, certainly
>> continues to plague some users, and doesn't have any easy solutions.
>
> To explain this in concrete terms, which the blog post does not:
>
> 1. Create a small table, but one with enough rows that indexes make
> sense (say 50,000 rows).
>
> 2. Make this table used in JOINs all over your database.
>
> 3. To support these JOINs, index most of the columns in the small table.
>
> 4. Now, update that small table 500 times per second.
>
> That's a recipe for runaway table bloat; VACUUM can't do much because
> there's always some minutes-old transaction hanging around (and SNAPSHOT
> TOO OLD doesn't really help, we're talking about minutes here), and
> because of all of the indexes HOT isn't effective.  Removing the indexes
> is equally painful because it means less efficient JOINs.
>
> The Uber guy is right that InnoDB handles this better as long as you
> don't touch the primary key (primary key updates in InnoDB are really bad).
>
> This is a common problem case we don't have an answer for yet.

This is why I think we need a pluggable heap storage layer, which
could be done either by rebranding foreign data wrappers as data
wrappers (as I have previously proposed) or by using the access method
interface (as proposed by Alexander Korotkov at PGCon).  We're
reaching the limits of what can be done using our current heap format,
and we need to enable developers to experiment with new things.  Aside
from the possibility of eventually coming up with something that's
good enough to completely (or mostly) replace our current heap storage
format, we need to support specialized data storage formats that are
optimized for particular use cases (columnar, memory-optimized, WORM).
I know that people are worried about ending up with too many heap
storage formats, but I think we should be a lot more worried about not
having enough heap storage formats.  Anybody who thinks that the
current design is working for all of our users is not paying very
close attention.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Why we lost Uber as a user

Tom Lane-2
In reply to this post by Josh berkus
Josh Berkus <[hidden email]> writes:
> To explain this in concrete terms, which the blog post does not:

> 1. Create a small table, but one with enough rows that indexes make
> sense (say 50,000 rows).

> 2. Make this table used in JOINs all over your database.

> 3. To support these JOINs, index most of the columns in the small table.

> 4. Now, update that small table 500 times per second.

> That's a recipe for runaway table bloat; VACUUM can't do much because
> there's always some minutes-old transaction hanging around (and SNAPSHOT
> TOO OLD doesn't really help, we're talking about minutes here), and
> because of all of the indexes HOT isn't effective.

Hm, I'm not following why this is a disaster.  OK, you have circa 100%
turnover of the table in the lifespan of the slower transactions, but I'd
still expect vacuuming to be able to hold the bloat to some small integer
multiple of the minimum possible table size.  (And if the table is small,
that's still small.)  I suppose really long transactions (pg_dump?) could
be pretty disastrous, but there are ways around that, like doing pg_dump
on a slave.

Or in short, this seems like an annoyance, not a time-for-a-new-database
kind of problem.
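
Whether vacuuming does keep up is easy to watch with the standard
statistics views; a quick sketch against the hypothetical table from
upthread:

    -- Live vs. dead tuples and total on-disk size for the small table.
    SELECT relname, n_live_tup, n_dead_tup,
           pg_size_pretty(pg_total_relation_size(relid)) AS total_size
      FROM pg_stat_user_tables
     WHERE relname = 'lookup';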

                        regards, tom lane



Re: Why we lost Uber as a user

Josh berkus
On 07/26/2016 03:07 PM, Tom Lane wrote:
> Josh Berkus <[hidden email]> writes:

>> That's a recipe for runaway table bloat; VACUUM can't do much because
>> there's always some minutes-old transaction hanging around (and SNAPSHOT
>> TOO OLD doesn't really help, we're talking about minutes here), and
>> because of all of the indexes HOT isn't effective.
>
> Hm, I'm not following why this is a disaster.  OK, you have circa 100%
> turnover of the table in the lifespan of the slower transactions, but I'd
> still expect vacuuming to be able to hold the bloat to some small integer
> multiple of the minimum possible table size.

Not in practice.  Don't forget that you have bloat of the indexes as
well.  I encountered multiple instances of this particular failure, and
often bloat ended up at something like 100X the clean table/index size,
with no stable size (that is, it always kept growing).  This was the
original impetus for wanting REINDEX CONCURRENTLY, but really that's
kind of a workaround.
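
Since REINDEX CONCURRENTLY doesn't exist yet, the manual equivalent
people reach for looks roughly like this (index names are illustrative,
and indexes backing PRIMARY KEY or UNIQUE constraints need a different
dance):

    -- Build a fresh copy of the bloated index without blocking writes,
    -- then swap it in.
    CREATE INDEX CONCURRENTLY lookup_status_idx_new ON lookup (status);
    DROP INDEX CONCURRENTLY lookup_status_idx;
    ALTER INDEX lookup_status_idx_new RENAME TO lookup_status_idx;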

> (And if the table is small,
> that's still small.)  I suppose really long transactions (pg_dump?) could
> be pretty disastrous, but there are ways around that, like doing pg_dump
> on a slave.

You'd need a dedicated slave for the pg_dump, otherwise you'd hit query
cancel.

> Or in short, this seems like an annoyance, not a time-for-a-new-database
> kind of problem.

It's considerably more than an annoyance for the people who suffer from
it; for some databases I dealt with, this one issue was responsible for
80% of administrative overhead (cron jobs, reindexing, timeouts ...).

But no, it's not a database-switcher *by itself*.  It is, however, a
chronic and serious problem.  I don't have even a suggestion of a real
solution for it that wouldn't break something else, though.

--
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)



Re: Why we lost Uber as a user

Robert Haas
In reply to this post by Tom Lane-2
On Tue, Jul 26, 2016 at 6:07 PM, Tom Lane <[hidden email]> wrote:

> Josh Berkus <[hidden email]> writes:
>> To explain this in concrete terms, which the blog post does not:
>
>> 1. Create a small table, but one with enough rows that indexes make
>> sense (say 50,000 rows).
>
>> 2. Make this table used in JOINs all over your database.
>
>> 3. To support these JOINs, index most of the columns in the small table.
>
>> 4. Now, update that small table 500 times per second.
>
>> That's a recipe for runaway table bloat; VACUUM can't do much because
>> there's always some minutes-old transaction hanging around (and SNAPSHOT
>> TOO OLD doesn't really help, we're talking about minutes here), and
>> because of all of the indexes HOT isn't effective.
>
> Hm, I'm not following why this is a disaster.  OK, you have circa 100%
> turnover of the table in the lifespan of the slower transactions, but I'd
> still expect vacuuming to be able to hold the bloat to some small integer
> multiple of the minimum possible table size.  (And if the table is small,
> that's still small.)  I suppose really long transactions (pg_dump?) could
> be pretty disastrous, but there are ways around that, like doing pg_dump
> on a slave.
>
> Or in short, this seems like an annoyance, not a time-for-a-new-database
> kind of problem.

I've seen multiple cases where this kind of thing causes a
sufficiently large performance regression that the system just can't
keep up.  Things are OK when the table is freshly-loaded, but as soon
as somebody runs a query on any table in the cluster that lasts for a
minute or two, so much bloat accumulates that the performance drops to
an unacceptable level.  This kind of thing certainly doesn't happen to
everybody, but equally certainly, this isn't the first time I've heard
of it being a problem.  Sometimes, with careful tending and a very
aggressive autovacuum configuration, you can live with it, but it's
never a lot of fun.
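
To give a concrete, purely illustrative idea of what "very aggressive"
can look like, people typically end up with per-table settings along
these lines (using the hypothetical table from upthread; the thresholds
are examples, not recommendations):

    -- Make autovacuum visit this table far more often than the defaults
    -- would, and don't throttle it once it's running.
    ALTER TABLE lookup SET (
        autovacuum_vacuum_scale_factor = 0.0,   -- don't scale with table size
        autovacuum_vacuum_threshold    = 1000,  -- vacuum after ~1000 dead rows
        autovacuum_vacuum_cost_delay   = 0      -- no cost-based sleeping
    );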

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Why we lost Uber as a user

Michael Paquier
On Wed, Jul 27, 2016 at 7:19 AM, Robert Haas <[hidden email]> wrote:

> I've seen multiple cases where this kind of thing causes a
> sufficiently large performance regression that the system just can't
> keep up.  Things are OK when the table is freshly-loaded, but as soon
> as somebody runs a query on any table in the cluster that lasts for a
> minute or two, so much bloat accumulates that the performance drops to
> an unacceptable level.  This kind of thing certainly doesn't happen to
> everybody, but equally certainly, this isn't the first time I've heard
> of it being a problem.  Sometimes, with careful tending and a very
> aggressive autovacuum configuration, you can live with it, but it's
> never a lot of fun.

Yes... that's not fun at all.  And it takes days to do this tuning
properly if you run that kind of test against a product that is supposed
to just work the way its spec says, out of the box, to ease the customer
experience.

The post itself is an interesting read, and the comments on HN are worth
reading as well:
https://news.ycombinator.com/item?id=12166585
Some commenters argue that the "flaws" mentioned in the post are
actually advantages.  But I guess that depends on how you want to run
your business via your application layer.
--
Michael



Re: Why we lost Uber as a user

Stephen Frost
In reply to this post by Joshua Drake-2
* Joshua D. Drake ([hidden email]) wrote:

> Hello,
>
> The following article is a very good look at some of our limitations
> and highlights some of the pains many of us have been working
> "around" since we started using the software.
>
> https://eng.uber.com/mysql-migration/
>
> Specifically:
>
> * Inefficient architecture for writes
> * Inefficient data replication
The above are related and there are serious downsides to having an extra
mapping in the middle between the indexes and the heap.

What makes me doubt just how well they understood the issues or what is
happening is the lack of any mention of hint bits or tuple freezing
(requiring additional writes).

> * Issues with table corruption

That was a bug that was fixed quite quickly once it was detected.  The
implication that MySQL doesn't have similar bugs is entirely incorrect,
as is the idea that logical replication would avoid data corruption
issues (in practice, it actually tends to be quite a bit worse).

> * Poor replica MVCC support

Solved through the hot standby feedback system.
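
For reference, turning it on is just a configuration change on the
standby (this is the hot_standby_feedback setting; the trade-off is some
extra bloat on the primary):

    -- On the standby: ask the primary not to clean up row versions that
    -- the standby's queries can still see, so long-running replica
    -- queries aren't cancelled.
    ALTER SYSTEM SET hot_standby_feedback = on;
    SELECT pg_reload_conf();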

> * Difficulty upgrading to newer releases

Their specific issue with these upgrades was solved, years ago, by me
(and it wasn't particularly difficult to do...) through the use of
pg_upgrade's --link option and rsync's ability to construct hard link
trees.  Making major release upgrades easier with less downtime is
certainly a good goal, but there's been a solution to the specific issue
they had here for quite a while.
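
The rough shape of that procedure, for anyone who hasn't seen it (paths
and version numbers are purely illustrative; the pg_upgrade docs describe
the full standby procedure, including shutting everything down first):

    # On the primary: upgrade in place, using hard links instead of copies.
    pg_upgrade --link \
        --old-bindir=/usr/lib/postgresql/9.2/bin \
        --new-bindir=/usr/lib/postgresql/9.5/bin \
        --old-datadir=/var/lib/postgresql/9.2/main \
        --new-datadir=/var/lib/postgresql/9.5/main

    # For each standby: ship the hard-link tree rather than re-copying
    # the data, so very little actually goes over the wire.
    rsync --archive --delete --hard-links --size-only --no-inc-recursive \
        /var/lib/postgresql/9.2/main /var/lib/postgresql/9.5/main \
        standby:/var/lib/postgresql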

Thanks!

Stephen


Re: Why we lost Uber as a user

Robert Haas
On Tue, Jul 26, 2016 at 8:27 PM, Stephen Frost <[hidden email]> wrote:

> * Joshua D. Drake ([hidden email]) wrote:
>> Hello,
>>
>> The following article is a very good look at some of our limitations
>> and highlights some of the pains many of us have been working
>> "around" since we started using the software.
>>
>> https://eng.uber.com/mysql-migration/
>>
>> Specifically:
>>
>> * Inefficient architecture for writes
>> * Inefficient data replication
>
> The above are related and there are serious downsides to having an extra
> mapping in the middle between the indexes and the heap.
>
> What makes me doubt just how well they understood the issues or what is
> happening is the lack of any mention of hint bits or tuple freezing
> (requiring additional writes).

Yeah.  A surprising amount of that post seemed to be devoted to
describing how our MVCC architecture works rather than what problem
they had with it.  I'm not saying we shouldn't take their bad
experience seriously - we clearly should - but I don't feel like it's
as clear as it could be about exactly where the breakdowns happened.
That's why I found Josh's restatement useful - I am assuming without
proof that his restatement is accurate....

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Why we lost Uber as a user

Vik Fearing-3
On 27/07/16 05:45, Robert Haas wrote:

> On Tue, Jul 26, 2016 at 8:27 PM, Stephen Frost <[hidden email]> wrote:
>> * Joshua D. Drake ([hidden email]) wrote:
>>> Hello,
>>>
>>> The following article is a very good look at some of our limitations
>>> and highlights some of the pains many of us have been working
>>> "around" since we started using the software.
>>>
>>> https://eng.uber.com/mysql-migration/
>>>
>>> Specifically:
>>>
>>> * Inefficient architecture for writes
>>> * Inefficient data replication
>>
>> The above are related and there are serious downsides to having an extra
>> mapping in the middle between the indexes and the heap.
>>
>> What makes me doubt just how well they understood the issues or what is
>> happening is the lack of any mention of hint bits or tuple freezing
>> (requiring additional writes).
>
> Yeah.  A surprising amount of that post seemed to be devoted to
> describing how our MVCC architecture works rather than what problem
> they had with it.  I'm not saying we shouldn't take their bad
> experience seriously - we clearly should - but I don't feel like it's
> as clear as it could be about exactly where the breakdowns happened.

There is some more detailed information in this 30-minute talk:
https://vimeo.com/145842299
--
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: Why we lost Uber as a user

Merlin Moncure-2
In reply to this post by Tom Lane-2
On Tue, Jul 26, 2016 at 5:07 PM, Tom Lane <[hidden email]> wrote:

> Josh Berkus <[hidden email]> writes:
>> To explain this in concrete terms, which the blog post does not:
>
>> 1. Create a small table, but one with enough rows that indexes make
>> sense (say 50,000 rows).
>
>> 2. Make this table used in JOINs all over your database.
>
>> 3. To support these JOINs, index most of the columns in the small table.
>
>> 4. Now, update that small table 500 times per second.
>
>> That's a recipe for runaway table bloat; VACUUM can't do much because
>> there's always some minutes-old transaction hanging around (and SNAPSHOT
>> TOO OLD doesn't really help, we're talking about minutes here), and
>> because of all of the indexes HOT isn't effective.
>
> Hm, I'm not following why this is a disaster.  OK, you have circa 100%
> turnover of the table in the lifespan of the slower transactions, but I'd
> still expect vacuuming to be able to hold the bloat to some small integer
> multiple of the minimum possible table size.  (And if the table is small,
> that's still small.)  I suppose really long transactions (pg_dump?) could
> be pretty disastrous, but there are ways around that, like doing pg_dump
> on a slave.
>
> Or in short, this seems like an annoyance, not a time-for-a-new-database
> kind of problem.

Well, the real annoyance as I understand it is the raw volume of bytes
of WAL traffic a single update of a field can cause.  They switched to
statement level replication(!).
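
For the curious, a rough way to see that volume from psql (these are the
9.x function names; run it on an otherwise idle instance, since any
concurrent activity inflates the number, and the table is the
hypothetical one sketched earlier in the thread):

    -- Record the WAL position, do one small update, and measure how
    -- many bytes of WAL it generated.
    SELECT pg_current_xlog_location() AS wal_before \gset
    UPDATE lookup SET updated_at = now() WHERE id = 42;
    SELECT pg_xlog_location_diff(pg_current_xlog_location(), :'wal_before')
           AS wal_bytes_for_one_update;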

merlin



Re: Why we lost Uber as a user

Bruce Momjian
On Wed, Jul 27, 2016 at 08:33:52AM -0500, Merlin Moncure wrote:
> > Or in short, this seems like an annoyance, not a time-for-a-new-database
> > kind of problem.
>
> Well, the real annoyance as I understand it is the raw volume of bytes
> of WAL traffic a single update of a field can cause.  They switched to
> statement level replication(!).

Well, their big complaint about binary replication is that a bug can
spread from a master to all slaves, which doesn't happen with statement
level replication.  If that type of corruption is your primary worry,
and you can ignore the worries about statement level replication, then
it makes sense.  Of course, the big tragedy is that statement level
replication has known unfixable(?) failures, while binary replication
failures are caused by developer-introduced bugs.

In some ways, people worry about the bugs they have seen, not the bugs
they haven't seen.

--
  Bruce Momjian  <[hidden email]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +



Re: Why we lost Uber as a user

Josh berkus
In reply to this post by Robert Haas
On 07/26/2016 08:45 PM, Robert Haas wrote:
> That's why I found Josh's restatement useful - I am assuming without
> proof that his restatement is accurate....

FWIW, my restatement was based on some other sites rather than Uber.
Including folks who didn't abandon Postgres.

--
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)



Re: Why we lost Uber as a user

Geoff Winkless
In reply to this post by Bruce Momjian
On 27 July 2016 at 17:04, Bruce Momjian <[hidden email]> wrote:
Well, their big complaint about binary replication is that a bug can
spread from a master to all slaves, which doesn't happen with statement
level replication.  

​I'm not sure that that makes sense to me. If there's a database bug that occurs when you run a statement on the master, it seems there's a decent chance that that same bug is going to occur when you run the same statement on the slave.

Obviously it depends on the type of bug and how identical the slave is, but statement-level replication certainly doesn't preclude such a bug from propagating.​
 

​Geoff​
Reply | Threaded
Open this post in threaded view
|

Re: Why we lost Uber as a user

Vitaly Burovoy
On 7/28/16, Geoff Winkless <[hidden email]> wrote:

> On 27 July 2016 at 17:04, Bruce Momjian <[hidden email]> wrote:
>
>> Well, their big complaint about binary replication is that a bug can
>> spread from a master to all slaves, which doesn't happen with statement
>> level replication.
>
>
> I'm not sure that that makes sense to me. If there's a database bug that
> occurs when you run a statement on the master, it seems there's a decent
> chance that that same bug is going to occur when you run the same statement
> on the slave.
>
> Obviously it depends on the type of bug and how identical the slave is, but
> statement-level replication certainly doesn't preclude such a bug from
> propagating.
>
> ​Geoff

Please read the article first! The bug is about wrong visibility of
tuples after applying WAL on slaves.
For example, you could see two different records when selecting from a
table by primary key (moreover, their PKs are the same, but other
columns differ).

From the article (emphasis mine):
The following query illustrates how this bug would affect our users
table example:
SELECT * FROM users WHERE id = 4;
This query would return *TWO* records: ...


And it affected the slaves, not the master.
Slaves exist to take load off the master; if you run all queries (even
read-only ones) on the master, why would you (or anyone) have so many
slaves?

--
Best regards,
Vitaly Burovoy



Re: Why we lost Uber as a user

Geoff Winkless

On 28 Jul 2016 12:19, "Vitaly Burovoy" <[hidden email]> wrote:
>
> On 7/28/16, Geoff Winkless <[hidden email]> wrote:
> > On 27 July 2016 at 17:04, Bruce Momjian <[hidden email]> wrote:
> >
> >> Well, their big complaint about binary replication is that a bug can
> >> spread from a master to all slaves, which doesn't happen with statement
> >> level replication.
> >
> >
> > I'm not sure that that makes sense to me. If there's a database bug that
> > occurs when you run a statement on the master, it seems there's a decent
> > chance that that same bug is going to occur when you run the same statement
> > on the slave.
> >
> > Obviously it depends on the type of bug and how identical the slave is, but
> > statement-level replication certainly doesn't preclude such a bug from
> > propagating.
> >
> > Geoff
>
> Please, read the article first! The bug is about wrong visibility of
> tuples after applying WAL at slaves.
> For example, you can see two different records selecting from a table
> by a primary key (moreover, their PKs are the same, but other columns
> differ).

I read the article. It affected slaves as well as the master.

I quote:
"because of the way replication works, this issue has the potential to spread into all of the databases in a replication hierarchy"

I maintain that this is a nonsense argument, especially since (as you pointed out, and as I missed the first time around) the bug actually affected different records on different slaves, so he invalidates his own point.

Geoff


Re: Why we lost Uber as a user

pgwhatever
In reply to this post by Geoff Winkless
Statement-based replication has a lot of problems, such as nondeterministic UDFs.  Here is a link describing them all:
https://dev.mysql.com/doc/refman/5.7/en/replication-sbr-rbr.html#replication-sbr-rbr-sbr-disadvantages
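
The classic illustration is any statement whose result depends on where
it runs; under statement-based replication the statement itself is
re-executed on each replica (table and column names here are made up):

    -- random() and now() are re-evaluated on every server that replays
    -- this statement, so the replicas silently diverge from the master.
    UPDATE sessions
       SET token = md5(random()::text),
           updated_at = now()
     WHERE last_seen < now() - interval '1 hour';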

Re: Why we lost Uber as a user

Merlin Moncure-2
On Thu, Jul 28, 2016 at 8:16 AM, pgwhatever <[hidden email]> wrote:
> Statement-Based replication has a lot of problems with it like indeterminate
> UDFs.  Here is a link to see them all:
> https://dev.mysql.com/doc/refman/5.7/en/replication-sbr-rbr.html#replication-sbr-rbr-sbr-disadvantages

Sure.  It's also incredibly efficient with respect to bandwidth -- so,
if your application was engineered to work around those problems it's a
huge win.  They could have used pgpool, but I guess the fix was already
in.

Taking a step back, from the outside, it looks like uber:
*) has a very thick middleware, very thin database with respect to
logic and complexity
*) has a very high priority on quick and cheap (in terms of bandwidth)
replication
*) has decided the database needs to be interchangeable
*) is not afraid to make weak or erroneous technical justifications as
a basis of stack selection (the futex vs ipc argument I felt was
particularly awful -- it ignored the fact we use spinlocks)

The very fact that they swapped it out so easily suggests that they
were not utilizing the database as they could have, and a different
technical team might have come to a different result.   Postgres is a
very general system and rewards deep knowledge such that it can
outperform even specialty systems in the hands of a capable developer
(for example, myself).  I'm just now hammering in the final coffin
nails that will get Solr swapped out for jsonb-backed Postgres.

I guess it's fair to say that they felt MySQL is closer to what they
think a database should do out of the box.  That's disappointing, but
life moves on.  The takeaways are:

*) people like different choices of replication mechanics -- statement
level sucks a lot of the time, but not all the time
*) hs/sr simplicity of configuration and operation is a big issue.
It has continually gotten better and still needs to.
*) bad QC can cost you customers.  How much regression coverage do we
have of hs/sr?
*) postgres may not be the ideal choice for those who want a thin and
simple database

merlin

