Query is over 2x slower with jit=on

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
38 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Query is over 2x slower with jit=on

Andreas Joseph Krogh-2
Hi all.
 
I don't know whether this is expected or not but I'm experiencing over 2x slowdown on a large query in PG-11 with JIT=on.
 
(query is a prepared statement executed with "explain analyze execute myprepared(arg1, arg2, ..., argn)")
 
After 10 executions these are the results (the first 5 executed in > 16s, then the plan changed)
 
With jit=on:
https://explain.depesz.com/s/vYB
Planning Time: 0.336 ms
JIT:
 Functions: 716
 Generation Time: 78.404 ms
 Inlining: false
 Inlining Time: 0.000 ms
 Optimization: false
 Optimization Time: 43.916 ms
 Emission Time: 600.031 ms
Execution Time: 2035.150 ms
(385 rows)

 
With jit=off:
https://explain.depesz.com/s/X6mA
Planning Time: 0.371 ms
Execution Time: 833.941 ms
(377 rows)

 
Both are master as of 55d26ff638f063fbccf57843f2c27f9795895a5c
 
The query largely consists of CTEs with aggregation which are FULL OUTER JOIN'ed.
 
On v10 the query executes in:
Execution time: 1159.628 ms
 
So v11 (with jit=off) is about 25% faster (due to parallel hash-join I think), which is nice!
 
What's the deal with jit making it slower?
 
--
Andreas Joseph Krogh
Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Andres Freund
On 2018-04-18 17:35:31 +0200, Andreas Joseph Krogh wrote:

> With jit=on:
> https://explain.depesz.com/s/vYB
> Planning Time: 0.336 ms
>  JIT:
>   Functions: 716
>   Generation Time: 78.404 ms
>   Inlining: false
>   Inlining Time: 0.000 ms
>   Optimization: false
>   Optimization Time: 43.916 ms
>   Emission Time: 600.031 ms

Any chance this is a debug LLVM build?


> What's the deal with jit making it slower?

JIT has cost, and sometimes it's not beneficial. Here our heuristics
when to JIT appear to be a bit off. In the parallel world it's worse
because the JITing is duplicated for parallel workers atm.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Sv: Re: Query is over 2x slower with jit=on

Andreas Joseph Krogh-2
På onsdag 18. april 2018 kl. 17:50:55, skrev Andres Freund <[hidden email]>:
On 2018-04-18 17:35:31 +0200, Andreas Joseph Krogh wrote:
> With jit=on:
> https://explain.depesz.com/s/vYB
> Planning Time: 0.336 ms
>  JIT:
>   Functions: 716
>   Generation Time: 78.404 ms
>   Inlining: false
>   Inlining Time: 0.000 ms
>   Optimization: false
>   Optimization Time: 43.916 ms
>   Emission Time: 600.031 ms

Any chance this is a debug LLVM build?


> What's the deal with jit making it slower?

JIT has cost, and sometimes it's not beneficial. Here our heuristics
when to JIT appear to be a bit off. In the parallel world it's worse
because the JITing is duplicated for parallel workers atm.
 
PostgreSQL is built with "--enable-debug --with-llvm". LLVM is the one which comes with Ubuntu-17.10.
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
 
Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Simon Riggs
In reply to this post by Andres Freund
On 18 April 2018 at 16:50, Andres Freund <[hidden email]> wrote:

> On 2018-04-18 17:35:31 +0200, Andreas Joseph Krogh wrote:
>> With jit=on:
>> https://explain.depesz.com/s/vYB
>> Planning Time: 0.336 ms
>>  JIT:
>>   Functions: 716
>>   Generation Time: 78.404 ms
>>   Inlining: false
>>   Inlining Time: 0.000 ms
>>   Optimization: false
>>   Optimization Time: 43.916 ms
>>   Emission Time: 600.031 ms
>
> Any chance this is a debug LLVM build?
>
>
>> What's the deal with jit making it slower?
>
> JIT has cost, and sometimes it's not beneficial. Here our heuristics
> when to JIT appear to be a bit off. In the parallel world it's worse
> because the JITing is duplicated for parallel workers atm.

Please change the name of the "JIT" parameter to something meaningful
to humans before this gets too far into the wild.

SSL is somewhat understandable because its not a Postgres-private term.

geqo is regrettable and we really don't want any more too
short/abbreviated parameter names.

Think of our EOU if every GUC was a TLA.

Thanks

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Chapman Flack
On 04/18/2018 12:27 PM, Simon Riggs wrote:

> Please change the name of the "JIT" parameter to something meaningful
> to humans before this gets too far into the wild.
>
> SSL is somewhat understandable because its not a Postgres-private term.

JIT is hardly a Postgres-private term. It's a familiar term in a
widespread community, though I might cede the point that the
community in question is more compiler wonks than DBAs.

-Chap

Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Andres Freund


On April 18, 2018 9:50:48 AM PDT, Chapman Flack <[hidden email]> wrote:

>On 04/18/2018 12:27 PM, Simon Riggs wrote:
>
>> Please change the name of the "JIT" parameter to something meaningful
>> to humans before this gets too far into the wild.
>>
>> SSL is somewhat understandable because its not a Postgres-private
>term.
>
>JIT is hardly a Postgres-private term. It's a familiar term in a
>widespread community, though I might cede the point that the
>community in question is more compiler wonks than DBAs.

You're right. There's another thread where we discussed this. Also due to concern of Simon's. Can we not do so separately here?

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Robert Haas
In reply to this post by Andres Freund
On Wed, Apr 18, 2018 at 11:50 AM, Andres Freund <[hidden email]> wrote:
> JIT has cost, and sometimes it's not beneficial. Here our heuristics
> when to JIT appear to be a bit off. In the parallel world it's worse
> because the JITing is duplicated for parallel workers atm.

It seems like you're describing it as if the JIT just didn't produce
gains sufficient to make up for the cost of doing it, but that's not
really the issue here AFAICS.  Here the JIT actually made code that
run slower than the un-JIT-ted code.  That seems like a different sort
of problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Andres Freund


On April 18, 2018 12:16:35 PM PDT, Robert Haas <[hidden email]> wrote:

>On Wed, Apr 18, 2018 at 11:50 AM, Andres Freund <[hidden email]>
>wrote:
>> JIT has cost, and sometimes it's not beneficial. Here our heuristics
>> when to JIT appear to be a bit off. In the parallel world it's worse
>> because the JITing is duplicated for parallel workers atm.
>
>It seems like you're describing it as if the JIT just didn't produce
>gains sufficient to make up for the cost of doing it, but that's not
>really the issue here AFAICS.  Here the JIT actually made code that
>run slower than the un-JIT-ted code.  That seems like a different sort
>of problem.

Not convinced that that is true - the issue is more likely that JIT work in workers is counted as execute time... Gotta add that somehow, not sure what the best way would be.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Robert Haas
On Wed, Apr 18, 2018 at 3:29 PM, Andres Freund <[hidden email]> wrote:
> Not convinced that that is true - the issue is more likely that JIT work in workers is counted as execute time... Gotta add that somehow, not sure what the best way would be.

Oh, that does seem like something that should be fixed.  If that's
what is happening here, it's bound to confuse a lot of people.
Probably you need to add some code to
ExecParallelRetrieveInstrumentation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Reply | Threaded
Open this post in threaded view
|

Sv: Sv: Re: Query is over 2x slower with jit=on

Andreas Joseph Krogh-2
In reply to this post by Andreas Joseph Krogh-2
På onsdag 18. april 2018 kl. 18:26:03, skrev Andreas Joseph Krogh <[hidden email]>:
På onsdag 18. april 2018 kl. 17:50:55, skrev Andres Freund <[hidden email]>:
On 2018-04-18 17:35:31 +0200, Andreas Joseph Krogh wrote:
> With jit=on:
> https://explain.depesz.com/s/vYB
> Planning Time: 0.336 ms
>  JIT:
>   Functions: 716
>   Generation Time: 78.404 ms
>   Inlining: false
>   Inlining Time: 0.000 ms
>   Optimization: false
>   Optimization Time: 43.916 ms
>   Emission Time: 600.031 ms

Any chance this is a debug LLVM build?


> What's the deal with jit making it slower?

JIT has cost, and sometimes it's not beneficial. Here our heuristics
when to JIT appear to be a bit off. In the parallel world it's worse
because the JITing is duplicated for parallel workers atm.
 
PostgreSQL is built with "--enable-debug --with-llvm". LLVM is the one which comes with Ubuntu-17.10.
 
Some more info;
 
Without --enable-debug (seems this didn't impact performance mutch):
 
First 5 executions (prepared-statement issued from psql)
 Planning Time: 47.634 ms
 JIT:
   Functions: 725
   Generation Time: 74.748 ms
   Inlining: true
   Inlining Time: 90.763 ms
   Optimization: true
   Optimization Time: 5822.516 ms
   Emission Time: 3089.127 ms
 Execution Time: 16375.996 ms
 
 
After 5 executions (prepared-statement issued from psql)
Planning Time: 0.385 ms
JIT:
  Functions: 716
  Generation Time: 76.382 ms
  Inlining: false
  Inlining Time: 0.000 ms
  Optimization: false
  Optimization Time: 41.709 ms
  Emission Time: 613.074 ms
Execution Time: 2031.830 ms
 
jit=off:
 
Planning Time: 0.171 ms
Execution Time: 832.489 ms
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
 
Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Andres Freund
In reply to this post by Robert Haas
Hi,

On 2018-04-18 18:37:30 -0400, Robert Haas wrote:
> On Wed, Apr 18, 2018 at 3:29 PM, Andres Freund <[hidden email]> wrote:
> > Not convinced that that is true - the issue is more likely that JIT work in workers is counted as execute time... Gotta add that somehow, not sure what the best way would be.
>
> Oh, that does seem like something that should be fixed.  If that's
> what is happening here, it's bound to confuse a lot of people.
> Probably you need to add some code to
> ExecParallelRetrieveInstrumentation.

I had lost track of this, and we unfortunately hadn't added an open item
back then.  I think we should add it now?

RMT (with me recused), do you think we should accept the new code fixing
this would entail? And thus that this should be an open item? It's
arguably a new feature, although I don't find that a terribly convincing
position.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Jonathan S. Katz-3

> On Aug 22, 2018, at 12:12 PM, Andres Freund <[hidden email]> wrote:
>
> Hi,
>
> On 2018-04-18 18:37:30 -0400, Robert Haas wrote:
>> On Wed, Apr 18, 2018 at 3:29 PM, Andres Freund <[hidden email]> wrote:
>>> Not convinced that that is true - the issue is more likely that JIT work in workers is counted as execute time... Gotta add that somehow, not sure what the best way would be.
>>
>> Oh, that does seem like something that should be fixed.  If that's
>> what is happening here, it's bound to confuse a lot of people.
>> Probably you need to add some code to
>> ExecParallelRetrieveInstrumentation.
>
> I had lost track of this, and we unfortunately hadn't added an open item
> back then.  I think we should add it now?
>
> RMT (with me recused), do you think we should accept the new code fixing
> this would entail? And thus that this should be an open item? It's
> arguably a new feature, although I don't find that a terribly convincing
> position.
Reviewed the earlier discussion. I can argue this both ways.

As it stands right now it seems like we will be defaulting “jit = off” which
means there will only be so many people who will be using jit in PG11,
and as such we could provide the patch for 12.

However, for the people who do enable JIT in PG11, this is a scenario
that would cause the query to perform in an unexpected way that is no
fault of their own, which one would argue is a bug. I have not tried to
reproduce it myself, but from looking at the explain plans from Andreas,
I’m confident I could craft a moderately complex query that could
demonstrate the performance degradation with jit = on.

While adding the code may constitute being a new feature, it is a feature
that would prevent user frustration on something that we are highlighting as a
“major feature” of PG11, even if it’s not enabled by default.

What I would you Andres is how complex will the patch be and how much time
will it take? As such I would +1 it for open items right now unless it seems like
this could take a significant of time to come up with a proper patch, in which
case I would reconsider, but at least with it on open items we would be able
to continuously review.

Thanks,

Jonathan



signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Sv: Re: Query is over 2x slower with jit=on

Andreas Joseph Krogh-2
In reply to this post by Andres Freund
På onsdag 22. august 2018 kl. 18:12:41, skrev Andres Freund <[hidden email]>:
Hi,

On 2018-04-18 18:37:30 -0400, Robert Haas wrote:
> On Wed, Apr 18, 2018 at 3:29 PM, Andres Freund <[hidden email]> wrote:
> > Not convinced that that is true - the issue is more likely that JIT work in workers is counted as execute time... Gotta add that somehow, not sure what the best way would be.
>
> Oh, that does seem like something that should be fixed.  If that's
> what is happening here, it's bound to confuse a lot of people.
> Probably you need to add some code to
> ExecParallelRetrieveInstrumentation.

I had lost track of this, and we unfortunately hadn't added an open item
back then.  I think we should add it now?

RMT (with me recused), do you think we should accept the new code fixing
this would entail? And thus that this should be an open item? It's
arguably a new feature, although I don't find that a terribly convincing
position.

Greetings,

Andres Freund
 
Just to be clear; The query really runs slower (wall-clock time), it's not just the timing.
 
--
Andreas Joseph Krogh
Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Andres Freund
On 2018-08-22 18:39:18 +0200, Andreas Joseph Krogh wrote:
> Just to be clear; The query really runs slower (wall-clock time), it's not
> just the timing.

I bet it's not actually running slower, it "just" takes longer to start
up due to the JITing in each worker. I suspect what we should do is to
multiple the cost limits by the number of workers, to model that.  But
without the fixed instrumentation that's harder to see...

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Sv: Re: Query is over 2x slower with jit=on

Andreas Joseph Krogh-2
På onsdag 22. august 2018 kl. 18:51:55, skrev Andres Freund <[hidden email]>:
On 2018-08-22 18:39:18 +0200, Andreas Joseph Krogh wrote:
> Just to be clear; The query really runs slower (wall-clock time), it's not
> just the timing.

I bet it's not actually running slower, it "just" takes longer to start
up due to the JITing in each worker. I suspect what we should do is to
multiple the cost limits by the number of workers, to model that.  But
without the fixed instrumentation that's harder to see...
 
Well, yes, that might be. By "runs" I meant from me hitting ENTER in psql to the time the query finishes...
 
I thought JITing of prepared queries happended once (in "prepare") so it didn't have to do the JITing every time the query is executed. Isn't the previously generated bytecode usable for subsequent queries?
 
--
Andreas Joseph Krogh
Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Pierre Ducroquet
In reply to this post by Andres Freund
On Wednesday, August 22, 2018 6:51:55 PM CEST Andres Freund wrote:
> On 2018-08-22 18:39:18 +0200, Andreas Joseph Krogh wrote:
> > Just to be clear; The query really runs slower (wall-clock time), it's not
> > just the timing.
>
> I bet it's not actually running slower, it "just" takes longer to start
> up due to the JITing in each worker. I suspect what we should do is to
> multiple the cost limits by the number of workers, to model that.  But
> without the fixed instrumentation that's harder to see...

It depends on the query. It has been shown in other threads that query can
indeed take longer to run because of JITing : if the cost is too low to fire
LLVM optimizer, the generated code can be so bad it will be slower than the
non-JIT executor.
Cf for instance a previous discussion here : http://www.postgresql-archive.org/PATCH-LLVM-tuple-deforming-improvements-td6029385.html

I think it would be interesting to try the query from this thread with a patch
forcing the LLVM codegen to O1 (I found no PassManager there to play with, it
seems to be an off/on/extreme switch ; patch 0001-LLVM-Use-the-O1-CodeGen-
level.patch from thread mentioned above).




Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Andres Freund
In reply to this post by Andreas Joseph Krogh-2
On 2018-08-22 19:51:12 +0200, Andreas Joseph Krogh wrote:
> I thought JITing of prepared queries happended once (in "prepare")

No, it happens when the first JITed function is executed.


>  so it didn't have to do the JITing every time the query is
> executed. Isn't the previously generated bytecode usable for
> subsequent queries?

No, not currently. There's some reasons preventing that (primarily that
we currently rely on addresses of certain things not to change during
execution). There's ongoing work to change that, but that's certainly
not going to be ready for v11.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Andreas Joseph Krogh-2
På onsdag 22. august 2018 kl. 20:52:05, skrev Andres Freund <[hidden email]>:
On 2018-08-22 19:51:12 +0200, Andreas Joseph Krogh wrote:
> I thought JITing of prepared queries happended once (in "prepare")

No, it happens when the first JITed function is executed.


>  so it didn't have to do the JITing every time the query is
> executed. Isn't the previously generated bytecode usable for
> subsequent queries?

No, not currently. There's some reasons preventing that (primarily that
we currently rely on addresses of certain things not to change during
execution). There's ongoing work to change that, but that's certainly
not going to be ready for v11.

Greetings,

Andres Freund
 
 
Ok, thanks for clarifying.
 
--
Andreas Joseph Krogh
Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Andres Freund
In reply to this post by Pierre Ducroquet
On 2018-08-22 20:48:48 +0200, Pierre Ducroquet wrote:

> On Wednesday, August 22, 2018 6:51:55 PM CEST Andres Freund wrote:
> > On 2018-08-22 18:39:18 +0200, Andreas Joseph Krogh wrote:
> > > Just to be clear; The query really runs slower (wall-clock time), it's not
> > > just the timing.
> >
> > I bet it's not actually running slower, it "just" takes longer to start
> > up due to the JITing in each worker. I suspect what we should do is to
> > multiple the cost limits by the number of workers, to model that.  But
> > without the fixed instrumentation that's harder to see...
>
> It depends on the query. It has been shown in other threads that query can
> indeed take longer to run because of JITing : if the cost is too low to fire
> LLVM optimizer, the generated code can be so bad it will be slower than the
> non-JIT executor.

This largely seems to be orthogonal to what I'm talking about.


> Cf for instance a previous discussion here : http://www.postgresql-archive.org/PATCH-LLVM-tuple-deforming-improvements-td6029385.html

I'd wish people stopped using www.postgresql-archive.org. It's *NOT*
postgresql.org maintained, in fact I do not know who does. It does shows
ads when downloading links, which I'm personally not ok with.


Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: Query is over 2x slower with jit=on

Jonathan S. Katz-3
In reply to this post by Andreas Joseph Krogh-2

On Aug 22, 2018, at 2:58 PM, Andreas Joseph Krogh <[hidden email]> wrote:

På onsdag 22. august 2018 kl. 20:52:05, skrev Andres Freund <[hidden email]>:
On 2018-08-22 19:51:12 +0200, Andreas Joseph Krogh wrote:
> I thought JITing of prepared queries happended once (in "prepare")

No, it happens when the first JITed function is executed.


>  so it didn't have to do the JITing every time the query is
> executed. Isn't the previously generated bytecode usable for
> subsequent queries?

No, not currently. There's some reasons preventing that (primarily that
we currently rely on addresses of certain things not to change during
execution). There's ongoing work to change that, but that's certainly
not going to be ready for v11.

Greetings,

Andres Freund
 
 
Ok, thanks for clarifying.

Per earlier note[1] I was able to reproduce this issue with similar results to
Andreas while running 11 Beta 3.

jit = on

Planning Time: 0.921 ms
JIT:
  Functions: 193
  Generation Time: 121.595 ms
  Inlining: false
  Inlining Time: 0.000 ms
  Optimization: false
  Optimization Time: 58.045 ms
  Emission Time: 1201.100 ms
Execution Time: 1628.017 ms

jit = off

Planning Time: 1.398 ms
Execution Time: 256.473 ms

I increased the the search range I used in the query by 3x, and got these numbers:

jit=on
Planning Time: 0.931 ms
JIT:
  Functions: 184
  Generation Time: 126.587 ms
  Inlining: true
  Inlining Time: 98.865 ms
  Optimization: true
  Optimization Time: 20518.982 ms
  Emission Time: 7270.963 ms
Execution Time: 28772.576 ms

jit=off
Planning Time: 1.527 ms
Execution Time: 959.160 ms

So, I would +1 this for open items.

Jonathan



signature.asc (849 bytes) Download Attachment
12