Excessive memory usage in multi-statement queries w/ partitioning


Excessive memory usage in multi-statement queries w/ partitioning

Andreas Seltenreich
Hi,

a customer reported excessive memory usage and out-of-memory ERRORs
after introducing native partitioning in one of their databases.  We
could narrow it down to the overhead introduced by the partitioning when
issuing multiple statements in a single query.  I could reduce the
problem to the following recipe:

--8<---------------cut here---------------start------------->8---
#!/bin/bash

# create 100 partitions
psql -c 'create table t(c int primary key) partition by range(c)'
for i in {1..100}; do
    # range upper bounds are exclusive, so use i*100 to avoid gaps
    psql -e -c "create table t$i partition of t for values
         from ($(((i-1)*100))) to ($((i*100))) "
done

# artificially limit per-process memory by setting a resource limit for
# the postmaster to 256MB

prlimit -d$((256*1024*1024)) -p $POSTMASTER_PID
--8<---------------cut here---------------end--------------->8---

Now, updates to a partition are fine with 4000 update statements:

,----
| $ psql -c "$(yes update t2 set c=c where c=6 \; | head -n 4000)"
| UPDATE 0
`----

…but when doing it on the parent relation, even 100 statements are
enough to exceed the limit:

,----
| $ psql -c "$(yes update t set c=c where c=6 \; | head -n 100)"
| FEHLER:  Speicher aufgebraucht
| DETAIL:  Failed on request of size 200 in memory context "MessageContext".
`----

(The German-locale "FEHLER:  Speicher aufgebraucht" is "ERROR:  out of memory".)

The memory context dump shows plausible values except for the MessageContext:

TopMemoryContext: 124336 total in 8 blocks; 18456 free (11 chunks); 105880 used
  [...]
  MessageContext: 264241152 total in 42 blocks; 264 free (0 chunks); 264240888 used
  [...]

Maybe some tactically placed pfrees or avoiding putting redundant stuff
into MessageContext can relax the situation?

regards,
Andreas



Re: Excessive memory usage in multi-statement queries w/ partitioning

David Rowley
On Thu, 23 May 2019 at 17:55, Andreas Seltenreich
<[hidden email]> wrote:
> a customer reported excessive memory usage and out-of-memory ERRORs
> after introducing native partitioning in one of their databases.  We
> could narrow it down to the overhead introduced by the partitioning when
> issuing multiple statements in a single query.

"multiple statements in a single query", did you mean to write session
or maybe transaction there?

Which version?

I tried your test case with REL_11_STABLE and I see nowhere near as
much memory used in MessageContext.

After repeating the query twice, I see:

MessageContext: 8388608 total in 11 blocks; 3776960 free (1 chunks);
4611648 used
Grand total: 8388608 bytes in 11 blocks; 3776960 free (1 chunks); 4611648 used
MessageContext: 8388608 total in 11 blocks; 3776960 free (1 chunks);
4611648 used
Grand total: 8388608 bytes in 11 blocks; 3776960 free (1 chunks); 4611648 used

which is quite a long way off the 252MB you're getting.

Perhaps I'm not testing with the same version as you are.


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment: dump_MessageContext_stats.diff (628 bytes)

Re: Excessive memory usage in multi-statement queries w/ partitioning

Julian Schauder
Hey David,


> "multiple statements in a single query", did you mean to write
> session
> or maybe transaction there?

Maybe the wording isn't perfect. It is required that the queries are
sent as a single batch. Try the exact bash script Andreas used for
updating the parent.

> Which version?

Tested up to and including 11.2; initially found on 11.1. Memory
consumption scales roughly linearly with the number of existing
partitions and the number of ';'-delimited queries per batch.


regards
--
Julian Schauder




Re: Excessive memory usage in multi-statement queries w/ partitioning

David Rowley
On Thu, 23 May 2019 at 21:19, Julian Schauder
<[hidden email]> wrote:
> > "multiple statements in a single query", did you mean to write
> > session
> > or maybe transaction there?
>
> Maybe the wording isn't perfect. It is required that the queries are
> sent as a single batch. Try the exact bash script Andreas used for
> updating the parent.

Thanks for explaining.

> > Which version?
>
> Tested up to and including 11.2; initially found on 11.1. Memory
> consumption scales roughly linearly with the number of existing
> partitions and the number of ';'-delimited queries per batch.

Yeah, unfortunately, if the batch contains 100 of those statements
then the planner is going to eat 100 times the memory since it stores
all 100 plans at once.

Since you're pruning all but one partition, the situation should be
much better for you once you can upgrade to v12. Unfortunately, that's
still about 5 months away.

The best you can do for now is either reduce the number of partitions
or reduce the number of statements in the batch... or install more
memory.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Excessive memory usage in multi-statement queries w/ partitioning

Amit Langote
In reply to this post by Andreas Seltenreich
Hi,

On 2019/05/23 4:15, Andreas Seltenreich wrote:

> …but when doing it on the parent relation, even 100 statements are
> enough to exceed the limit:
>
> ,----
> | $ psql -c "$(yes update t set c=c where c=6 \; | head -n 100)"
> | FEHLER:  Speicher aufgebraucht
> | DETAIL:  Failed on request of size 200 in memory context "MessageContext".
> `----
>
> The memory context dump shows plausible values except for the MessageContext:
>
> TopMemoryContext: 124336 total in 8 blocks; 18456 free (11 chunks); 105880 used
>   [...]
>   MessageContext: 264241152 total in 42 blocks; 264 free (0 chunks); 264240888 used
>   [...]

As David Rowley said, planning that query hundreds of times under a single
MessageContext is not something that will end well on 11.3, because even a
single instance takes up tons of memory that's only released when
MessageContext is reset.

> Maybe some tactically placed pfrees or avoiding putting redundant stuff
> into MessageContext can relax the situation?

I too have had similar thoughts on the matter.  If the planner had built
all its subsidiary data structures in its own private context (or tree of
contexts) which is reset once a plan for a given query is built and passed
on, then there wouldn't be an issue of all of that subsidiary memory
leaking into MessageContext.  However, the problem may really be that
we're subjecting the planner to use cases that it wasn't perhaps designed
to perform equally well under -- running it many times while handling the
same message.  It is worsened by the fact that the query in question is
something that ought to have been documented as not well supported by the
planner; David has posted a documentation patch for that [1].  PG 12 has
alleviated the situation to a large degree, so you won't see the OOM
occurring for this query, but not for all queries unfortunately.

With that said, we may want to look into the planner's tendency to hoard
memory, especially when planning complex queries involving partitions.
AFAIK, one of the reasons partition-wise join and aggregate are turned
off by default is that their planning consumes a lot of CPU and memory,
partly because the planner doesn't actively release the memory of its
subsidiary structures, and maybe also because of inferior ways in which
partitions and partitioning properties are represented in the planner.
Though if there's evidence that it's the latter, maybe we should fix
that before pondering any sophisticated planner memory management.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CAKJS1f-2rx%2BE9mG3xrCVHupefMjAp1%2BtpczQa9SEOZWyU7fjEA%40mail.gmail.com




Re: Excessive memory usage in multi-statement queries w/ partitioning

Joe Conway
On 5/24/19 1:47 AM, Amit Langote wrote:

> On 2019/05/23 4:15, Andreas Seltenreich wrote:
>> …but when doing it on the parent relation, even 100 statements are
>> enough to exceed the limit:
>>
>> ,----
>> | $ psql -c "$(yes update t set c=c where c=6 \; | head -n 100)"
>> | FEHLER:  Speicher aufgebraucht
>> | DETAIL:  Failed on request of size 200 in memory context "MessageContext".
>> `----
>>
>> The memory context dump shows plausible values except for the MessageContext:
>>
>> TopMemoryContext: 124336 total in 8 blocks; 18456 free (11 chunks); 105880 used
>>   [...]
>>   MessageContext: 264241152 total in 42 blocks; 264 free (0 chunks); 264240888 used
>>   [...]
>
> As David Rowley said, planning that query hundreds of times under a single
> MessageContext is not something that will end well on 11.3, because even a
> single instance takes up tons of memory that's only released when
> MessageContext is reset.
>
>> Maybe some tactically placed pfrees or avoiding putting redundant stuff
>> into MessageContext can relax the situation?
>
> I too have had similar thoughts on the matter.  If the planner had built
> all its subsidiary data structures in its own private context (or tree of
> contexts) which is reset once a plan for a given query is built and passed
> on, then there wouldn't be an issue of all of that subsidiary memory
> leaking into MessageContext.  However, the problem may really be that
> we're subjecting the planner to use cases that it wasn't perhaps designed
> to perform equally well under -- running it many times while handling the
> same message.  It is worsened by the fact that the query in question is
> something that ought to have been documented as not well supported by the
> planner; David has posted a documentation patch for that [1].  PG 12 has
> alleviated the situation to a large degree, so you won't see the OOM
> occurring for this query, but not for all queries unfortunately.

I admittedly haven't followed this thread too closely, but if having 100
partitions causes out of memory on pg11, that sounds like a massive
regression to me.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



Re: Excessive memory usage in multi-statement queries w/ partitioning

David Rowley
On Sat, 25 May 2019 at 00:18, Joe Conway <[hidden email]> wrote:
> I admittedly haven't followed this thread too closely, but if having 100
> partitions causes out of memory on pg11, that sounds like a massive
> regression to me.

For it to have regressed it would have had to once have been better,
but where was that mentioned?  The only thing I saw was
non-partitioned tables compared to partitioned tables, but you can't
really say it's a regression if you're comparing apples to oranges.

I think the only regression here is in the documents from bebc46931a1
having removed the warning about too many partitions in a partitioned
table at the end of ddl.sgml. As Amit mentions, we'd like to put
something back about that.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Excessive memory usage in multi-statement queries w/ partitioning

Joe Conway
On 5/24/19 9:33 AM, David Rowley wrote:
> On Sat, 25 May 2019 at 00:18, Joe Conway <[hidden email]> wrote:
>> I admittedly haven't followed this thread too closely, but if having 100
>> partitions causes out of memory on pg11, that sounds like a massive
>> regression to me.
>
> For it to have regressed it would have had to once have been better,
> but where was that mentioned?  The only thing I saw was
> non-partitioned tables compared to partitioned tables, but you can't
> really say it's a regression if you're comparing apples to oranges.


I have very successfully used multiple hundreds and even low thousands
of partitions without running out of memory under the older inheritance
based "partitioning", and declarative partitioning is supposed to be
(and we have advertised it to be) better, not worse, isn't it?

At least from my point of view if 100 partitions is unusable due to
memory leaks it is a regression. Perhaps not *technically* a regression
assuming it behaves this way in pg10 also, but I bet lots of users would
see it that way.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



Re: Excessive memory usage in multi-statement queries w/ partitioning

Tom Lane
Joe Conway <[hidden email]> writes:
> On 5/24/19 9:33 AM, David Rowley wrote:
>> For it to have regressed it would have had to once have been better,
>> but where was that mentioned?  The only thing I saw was
>> non-partitioned tables compared to partitioned tables, but you can't
>> really say it's a regression if you're comparing apples to oranges.

> I have very successfully used multiple hundreds and even low thousands
> of partitions without running out of memory under the older inheritance
> based "partitioning", and declarative partitioning is supposed to be
> (and we have advertised it to be) better, not worse, isn't it?

Have you done the exact thing described in the test case?  I think
that's going to be quite unpleasantly memory-intensive in any version.

The real issue here is that we have designed around the assumption
that MessageContext will be used to parse and plan one single statement
before being reset.  The described usage breaks that assumption.
No matter how memory-efficient any one statement is or isn't,
if you throw enough of them at the backend without giving it a chance
to reset MessageContext, it won't end well.

So my thought, if we want to do something about this, is not "find
some things we can pfree at the end of planning" but "find a way
to use a separate context for each statement in the query string".
Maybe multi-query strings could be handled by setting up a child
context of MessageContext (after we've done the raw parsing there
and determined that indeed there are multiple queries), running
parse analysis and planning in that context, and resetting that
context after each query.

                        regards, tom lane



Re: Excessive memory usage in multi-statement queries w/ partitioning

Joe Conway
On 5/24/19 10:28 AM, Tom Lane wrote:

> Joe Conway <[hidden email]> writes:
>> On 5/24/19 9:33 AM, David Rowley wrote:
>>> For it to have regressed it would have had to once have been better,
>>> but where was that mentioned?  The only thing I saw was
>>> non-partitioned tables compared to partitioned tables, but you can't
>>> really say it's a regression if you're comparing apples to oranges.
>
>> I have very successfully used multiple hundreds and even low thousands
>> of partitions without running out of memory under the older inheritance
>> based "partitioning", and declarative partitioning is supposed to be
>> (and we have advertised it to be) better, not worse, isn't it?
>
> Have you done the exact thing described in the test case?  I think
> that's going to be quite unpleasantly memory-intensive in any version.

Ok, fair point. Will test and report back.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



Re: Excessive memory usage in multi-statement queries w/ partitioning

Amit Langote
In reply to this post by Tom Lane
On 2019/05/24 23:28, Tom Lane wrote:
> So my thought, if we want to do something about this, is not "find
> some things we can pfree at the end of planning" but "find a way
> to use a separate context for each statement in the query string".
> Maybe multi-query strings could be handled by setting up a child
> context of MessageContext (after we've done the raw parsing there
> and determined that indeed there are multiple queries), running
> parse analysis and planning in that context, and resetting that
> context after each query.

Maybe like the attached?  I'm not sure if we need to likewise be concerned
about exec_sql_string() being handed multi-query strings.

Thanks,
Amit

Attachment: parse-plan-memcxt.patch (2K)

Re: Excessive memory usage in multi-statement queries w/ partitioning

Amit Langote
In reply to this post by Joe Conway
Hi,

On 2019/05/24 21:18, Joe Conway wrote:

> On 5/24/19 1:47 AM, Amit Langote wrote:
>> I too have had similar thoughts on the matter.  If the planner had built
>> all its subsidiary data structures in its own private context (or tree of
>> contexts) which is reset once a plan for a given query is built and passed
>> on, then there wouldn't be an issue of all of that subsidiary memory
>> leaking into MessageContext.  However, the problem may really be that
>> we're subjecting the planner to use cases that it wasn't perhaps designed
>> to perform equally well under -- running it many times while handling the
>> same message.  It is worsened by the fact that the query in question is
>> something that ought to have been documented as not well supported by the
>> planner; David has posted a documentation patch for that [1].  PG 12 has
>> alleviated the situation to a large degree, so you won't see the OOM
>> occurring for this query, but not for all queries unfortunately.
>
> I admittedly haven't followed this thread too closely, but if having 100
> partitions causes out of memory on pg11, that sounds like a massive
> regression to me.

You won't run out of memory if you are running just one query per message,
but that's not the case in this discussion.  With multi-query submissions
like this one, the memory taken up by parsing and planning *all* of the
queries adds up in a single MessageContext, which can lead to OOM if there
are enough queries to push MessageContext past the limit.  The only point
I was trying to make in what I wrote is that reaching OOM of this sort is
easier with partitioning, because of the age-old behavior that planning
UPDATE/DELETE queries on inherited tables (and so partitioned tables)
needs tons of memory that grows as the number of child tables / partitions
increases.

We fixed things in PG 12, at least for partitioning, so that as long as a
query needs to affect only a small number of partitions of the total
present, its planning will use only a fixed amount of CPU and memory, so
increasing the number of partitions won't lead to explosive growth in
memory used.  You might be able to tell, however, that that effort had
nothing to do with improving the situation with multi-query submissions.

Thanks,
Amit




Re: Excessive memory usage in multi-statement queries w/ partitioning

Tom Lane
In reply to this post by Amit Langote
Amit Langote <[hidden email]> writes:
> On 2019/05/24 23:28, Tom Lane wrote:
>> So my thought, if we want to do something about this, is not "find
>> some things we can pfree at the end of planning" but "find a way
>> to use a separate context for each statement in the query string".

> Maybe like the attached?  I'm not sure if we need to likewise be concerned
> about exec_sql_string() being handed multi-query strings.

Please add this to the upcoming CF so we don't forget about it.
(I don't think there's anything very new about this behavior, so
I don't feel that we should consider it an open item for v12 ---
anyone think differently?)

                        regards, tom lane



Re: Excessive memory usage in multi-statement queries w/ partitioning

Amit Langote
On 2019/05/27 21:56, Tom Lane wrote:

> Amit Langote <[hidden email]> writes:
>> On 2019/05/24 23:28, Tom Lane wrote:
>>> So my thought, if we want to do something about this, is not "find
>>> some things we can pfree at the end of planning" but "find a way
>>> to use a separate context for each statement in the query string".
>
>> Maybe like the attached?  I'm not sure if we need to likewise be concerned
>> about exec_sql_string() being handed multi-query strings.
>
> Please add this to the upcoming CF so we don't forget about it.

Done; added to Performance for lack of a better topic for this.

https://commitfest.postgresql.org/23/2131/

> (I don't think there's anything very new about this behavior, so
> I don't feel that we should consider it an open item for v12 ---
> anyone think differently?)

Agree that there's nothing new about the behavior itself.  What may be new
though is people getting increasingly bitten by it if they query tables
containing large numbers of partitions most of which need to be scanned
[1].  That is, provided they have use cases where a single client request
contains hundreds of such queries to begin with.

Thanks,
Amit


[1] AFAICT, that's the only class of queries where the planner needs to
keep a lot of stuff around, the memory cost of which increases with the
number of partitions.  I was thinking that planning complex queries
involving tons of indexes, joins, etc. also hoards tons of memory, but
apparently not, because the planner seems fairly good at cleaning up
after itself as it does the work.




Re: Excessive memory usage in multi-statement queries w/ partitioning

Julien Rouhaud
On Tue, May 28, 2019 at 6:57 AM Amit Langote
<[hidden email]> wrote:

>
> On 2019/05/27 21:56, Tom Lane wrote:
> > Amit Langote <[hidden email]> writes:
> >> On 2019/05/24 23:28, Tom Lane wrote:
> >>> So my thought, if we want to do something about this, is not "find
> >>> some things we can pfree at the end of planning" but "find a way
> >>> to use a separate context for each statement in the query string".
> >
> >> Maybe like the attached?  I'm not sure if we need to likewise be concerned
> >> about exec_sql_string() being handed multi-query strings.

The whole extension SQL script is passed to execute_sql_string(), so I
think it's a good thing to have a similar workaround there.

About the patch:

 -        * Switch to appropriate context for constructing querytrees (again,
-        * these must outlive the execution context).
+        * Switch to appropriate context for constructing querytrees.
+        * Memory allocated during this construction is released before
+        * the generated plan is executed.

The comment should mention query and plan trees, everything else seems ok to me.



Re: Excessive memory usage in multi-statement queries w/ partitioning

Amit Langote
Hi Julien,

Thanks for taking a look at this.

On Thu, Jul 4, 2019 at 6:52 PM Julien Rouhaud <[hidden email]> wrote:
> On Tue, May 28, 2019 at 6:57 AM Amit Langote wrote:
> > >> Maybe like the attached?  I'm not sure if we need to likewise be concerned
> > >> about exec_sql_string() being handed multi-query strings.
>
> the whole extension sql script is passed to execute_sql_string(), so I
> think that it's a good thing to have similar workaround there.

That makes sense, although a memory usage explosion is perhaps much less
likely to occur in execute_sql_string(), because the scripts passed to
it mostly contain utility statements and rarely anything whose planning
will explode in memory usage.

Anyway, I've added similar handling in execute_sql_string() for consistency.

Now I wonder if we'll need to consider another path which calls
pg_plan_queries() on a possibly multi-statement query --
BuildCachedPlan()...

> About the patch:
>
>  -        * Switch to appropriate context for constructing querytrees (again,
> -        * these must outlive the execution context).
> +        * Switch to appropriate context for constructing querytrees.
> +        * Memory allocated during this construction is released before
> +        * the generated plan is executed.
>
> The comment should mention query and plan trees, everything else seems ok to me.

Okay, fixed.

Attached updated patch.  Thanks again.

Regards,
Amit

Attachment: parse-plan-memcxt_v2.patch (4K)

Re: Excessive memory usage in multi-statement queries w/ partitioning

Julien Rouhaud
Hi Amit,

On Mon, Jul 8, 2019 at 10:52 AM Amit Langote <[hidden email]> wrote:

>
> On Thu, Jul 4, 2019 at 6:52 PM Julien Rouhaud <[hidden email]> wrote:
> > On Tue, May 28, 2019 at 6:57 AM Amit Langote wrote:
> > > >> Maybe like the attached?  I'm not sure if we need to likewise be concerned
> > > >> about exec_sql_string() being handed multi-query strings.
> >
> > the whole extension sql script is passed to execute_sql_string(), so I
> > think that it's a good thing to have similar workaround there.
>
> That makes sense, although it is perhaps much less likely for memory
> usage explosion to occur in execute_sql_strings(), because the scripts
> passed to execute_sql_strings() mostly contain utility statements and
> rarely anything whose planning will explode in memory usage.
>
> Anyway, I've added similar handling in execute_sql_strings() for consistency.

Thanks!

> Now I wonder if we'll need to consider another path which calls
> pg_plan_queries() on a possibly multi-statement query --
> BuildCachedPlan()...

I also thought about this when reviewing the patch, but AFAICS you
can't provide a multi-statement query to BuildCachedPlan() using
prepared statements and I'm not sure that SPI is worth the trouble.
I'll mark this patch as ready for committer.

>
> > About the patch:
> >
> >  -        * Switch to appropriate context for constructing querytrees (again,
> > -        * these must outlive the execution context).
> > +        * Switch to appropriate context for constructing querytrees.
> > +        * Memory allocated during this construction is released before
> > +        * the generated plan is executed.
> >
> > The comment should mention query and plan trees, everything else seems ok to me.
>
> Okay, fixed.
>
> Attached updated patch.  Thanks again.
>
> Regards,
> Amit



Re: Excessive memory usage in multi-statement queries w/ partitioning

Tom Lane
In reply to this post by Amit Langote
Amit Langote <[hidden email]> writes:
> [ parse-plan-memcxt_v2.patch ]

I got around to looking at this finally.  I'm not at all happy with
the fact that it's added a plantree copy step to the only execution
path through exec_simple_query.  That's a very significant overhead,
in many use-cases, to solve something that nobody had complained
about for a couple of decades before now.  I don't see the need for
any added copy step anyway.  The only reason you're doing it AFAICS
is so you can release the per-statement context a bit earlier, which
is a completely unnecessary optimization.  Just wait to release it
till the bottom of the loop.

Also, creating/deleting the sub-context is in itself an added overhead
that accomplishes exactly nothing in the typical case where there's
not multiple statements.  I thought the idea was to do that only if
there was more than one raw parsetree (or, maybe better, do it for
all but the last parsetree).

To show that this isn't an empty concern, I did a quick pgbench
test.  Using a single-client select-only test ("pgbench -S -T 60"
in an -s 10 database), I got these numbers in three trials with HEAD:

tps = 9593.818478 (excluding connections establishing)
tps = 9570.189163 (excluding connections establishing)
tps = 9596.579038 (excluding connections establishing)

and these numbers after applying the patch:

tps = 9411.918165 (excluding connections establishing)
tps = 9389.279079 (excluding connections establishing)
tps = 9409.350175 (excluding connections establishing)

That's about a 2% dropoff.  Now it's possible that that can be
explained away as random variations from a slightly different layout
of critical loops vs cacheline boundaries ... but I don't believe it.

                        regards, tom lane



Re: Excessive memory usage in multi-statement queries w/ partitioning

Amit Langote
On Tue, Jul 9, 2019 at 6:21 AM Tom Lane <[hidden email]> wrote:
>
> Amit Langote <[hidden email]> writes:
> > [ parse-plan-memcxt_v2.patch ]
>
> I got around to looking at this finally.

Thanks for the review.

> I'm not at all happy with
> the fact that it's added a plantree copy step to the only execution
> path through exec_simple_query.  That's a very significant overhead,
> in many use-cases, to solve something that nobody had complained
> about for a couple of decades before now.  I don't see the need for
> any added copy step anyway.  The only reason you're doing it AFAICS
> is so you can release the per-statement context a bit earlier, which
> is a completely unnecessary optimization.  Just wait to release it
> till the bottom of the loop.

Ah, that makes sense.  I've removed the copying of plan tree and also
moved the temporary context deletion to the bottom of the loop.

> Also, creating/deleting the sub-context is in itself an added overhead
> that accomplishes exactly nothing in the typical case where there's
> not multiple statements.  I thought the idea was to do that only if
> there was more than one raw parsetree (or, maybe better, do it for
> all but the last parsetree).

That makes sense too.  I've made the creation/deletion of the child
context conditional on whether there is more than one query to plan.

> To show that this isn't an empty concern, I did a quick pgbench
> test.  Using a single-client select-only test ("pgbench -S -T 60"
> in an -s 10 database), I got these numbers in three trials with HEAD:
>
> tps = 9593.818478 (excluding connections establishing)
> tps = 9570.189163 (excluding connections establishing)
> tps = 9596.579038 (excluding connections establishing)
>
> and these numbers after applying the patch:
>
> tps = 9411.918165 (excluding connections establishing)
> tps = 9389.279079 (excluding connections establishing)
> tps = 9409.350175 (excluding connections establishing)
>
> That's about a 2% dropoff.

With the updated patch, here are the numbers on my machine (HEAD vs. patched):

HEAD:

tps = 3586.233815 (excluding connections establishing)
tps = 3569.252542 (excluding connections establishing)
tps = 3559.027733 (excluding connections establishing)

Patched:

tps = 3586.988057 (excluding connections establishing)
tps = 3585.169589 (excluding connections establishing)
tps = 3526.437968 (excluding connections establishing)

A bit noisy but not much degradation.

Attached updated patch.  Thanks again.

Regards,
Amit

Attachment: parse-plan-memcxt_v3.patch (5K)

Re: Excessive memory usage in multi-statement queries w/ partitioning

Kyotaro Horiguchi
Hi,

At Wed, 10 Jul 2019 16:35:18 +0900, Amit Langote <[hidden email]> wrote in <[hidden email]>

> On Tue, Jul 9, 2019 at 6:21 AM Tom Lane <[hidden email]> wrote:
> >
> > Amit Langote <[hidden email]> writes:
> > > [ parse-plan-memcxt_v2.patch ]
> >
> > I got around to looking at this finally.
>
> Thanks for the review.
>
> > I'm not at all happy with
> > the fact that it's added a plantree copy step to the only execution
> > path through exec_simple_query.  That's a very significant overhead,
> > in many use-cases, to solve something that nobody had complained
> > about for a couple of decades before now.  I don't see the need for
> > any added copy step anyway.  The only reason you're doing it AFAICS
> > is so you can release the per-statement context a bit earlier, which
> > is a completely unnecessary optimization.  Just wait to release it
> > till the bottom of the loop.
>
> Ah, that makes sense.  I've removed the copying of plan tree and also
> moved the temporary context deletion to the bottom of the loop.

-     * Switch to appropriate context for constructing querytrees (again,
-     * these must outlive the execution context).
+     * Switch to appropriate context for constructing query and plan trees
+     * (again, these must outlive the execution context).  Normally, it's
+     * MessageContext, but if there are more queries to plan, we use a
+     * temporary child context that will be reset after executing this
+     * query.  We avoid that overhead of setting up a separate context
+     * for the common case of having just a single query.

This might be stupid, but I feel uneasy about "usually it must live in
MessageContext, but not necessarily if there is a succeeding query".
*I* need more explanation of why it is safe to use that short-lived
context.

> > Also, creating/deleting the sub-context is in itself an added overhead
> > that accomplishes exactly nothing in the typical case where there's
> > not multiple statements.  I thought the idea was to do that only if
> > there was more than one raw parsetree (or, maybe better, do it for
> > all but the last parsetree).
>
> That makes sense too.  I've made it (creation/deletion of the child
> context) conditional on whether there are more than one queries to
> plan.
>
> > To show that this isn't an empty concern, I did a quick pgbench
> > test.  Using a single-client select-only test ("pgbench -S -T 60"
> > in an -s 10 database), I got these numbers in three trials with HEAD:
> >
> > tps = 9593.818478 (excluding connections establishing)
> > tps = 9570.189163 (excluding connections establishing)
> > tps = 9596.579038 (excluding connections establishing)
> >
> > and these numbers after applying the patch:
> >
> > tps = 9411.918165 (excluding connections establishing)
> > tps = 9389.279079 (excluding connections establishing)
> > tps = 9409.350175 (excluding connections establishing)
> >
> > That's about a 2% dropoff.
>
> With the updated patch, here are the numbers on my machine (HEAD vs patch)
>
> HEAD:
>
> tps = 3586.233815 (excluding connections establishing)
> tps = 3569.252542 (excluding connections establishing)
> tps = 3559.027733 (excluding connections establishing)
>
> Patched:
>
> tps = 3586.988057 (excluding connections establishing)
> tps = 3585.169589 (excluding connections establishing)
> tps = 3526.437968 (excluding connections establishing)
>
> A bit noisy but not much degradation.
>
> Attached updated patch.  Thanks again.
>
> Regards,
> Amit

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

