Re: Block level parallel vacuum


Re: Block level parallel vacuum

Masahiko Sawada
On Thu, Nov 30, 2017 at 11:09 AM, Michael Paquier
<[hidden email]> wrote:

> On Tue, Oct 24, 2017 at 5:54 AM, Masahiko Sawada <[hidden email]> wrote:
>> Yeah, I was thinking the commit is relevant with this issue but as
>> Amit mentioned this error is emitted by DROP SCHEMA CASCASE.
>> I don't find out the cause of this issue yet. With the previous
>> version patch, autovacuum workers were woking with one parallel worker
>> but it never drops relations. So it's possible that the error might
>> not have been relevant with the patch but anywayI'll continue to work
>> on that.
>
> This depends on the extension lock patch from
> https://www.postgresql.org/message-id/flat/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@.../
> if I am following correctly. So I propose to mark this patch as
> returned with feedback for now, and come back to it once the root
> problems are addressed. Feel free to correct me if you think that's
> not adapted.
I've re-designed the parallel vacuum patch. Attached is the latest
version. As discussed so far, this patch depends on the extension lock
patch[1]. However, I think we can discuss the design of parallel vacuum
independently from that patch, which is why I'm proposing the new patch.
In this patch, I restructured and refined lazy_scan_heap() because it is
a single big function and not suitable for parallelization.

The parallel vacuum worker processes keep waiting for commands from
the parallel vacuum leader process. Before entering each phase of lazy
vacuum, such as scanning the heap, vacuuming indexes, or vacuuming the
heap, the leader process changes the state of all workers to the next
state. Vacuum worker processes do their job according to their state and
wait for the next command once finished. Also, before entering the next
phase, the leader process does some preparation work while the vacuum
workers are sleeping; for example, clearing the shared dead tuple space
before entering the 'scanning heap' phase. The status of the vacuum
workers is stored in a DSM area pointed to by WorkerState variables and
controlled by the leader process. For the basic design and performance
improvements, please refer to my presentation at PGCon 2018[2].
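
Roughly, the handshake between the leader and the workers looks like the
following simplified sketch. All names here (LVWorkerState, VACUUM_STATE_*,
leader_advance_phase) are only for illustration and are not the actual
identifiers used in the patch:

#include <stdbool.h>

typedef enum
{
    VACUUM_STATE_IDLE,
    VACUUM_STATE_SCAN_HEAP,
    VACUUM_STATE_VACUUM_INDEX,
    VACUUM_STATE_VACUUM_HEAP,
    VACUUM_STATE_EXIT
} VacuumPhase;

typedef struct LVWorkerState
{
    VacuumPhase phase;        /* current phase, set by the leader */
    bool        finished;     /* set by the worker when its phase is done */
} LVWorkerState;              /* lives in the DSM area */

/* Leader side: move all workers to the next phase. */
static void
leader_advance_phase(LVWorkerState *workers, int nworkers, VacuumPhase next)
{
    int         i;

    /* wait until every worker has finished the previous phase */
    for (i = 0; i < nworkers; i++)
    {
        while (!workers[i].finished)
            ;   /* in the real patch: sleep and wait to be signalled */
    }

    /* phase-specific preparation, e.g. clear the shared dead tuple space */

    for (i = 0; i < nworkers; i++)
    {
        workers[i].finished = false;
        workers[i].phase = next;    /* then wake the worker up */
    }
}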

The number of parallel vacuum workers is determined according to
either the table size or the PARALLEL option of the VACUUM command. The
maximum number of parallel workers is max_parallel_maintenance_workers.

I've separated the code for the vacuum worker process into
backends/commands/vacuumworker.c, and created the
includes/commands/vacuum_internal.h file to declare the definitions
for lazy vacuum.

For autovacuum, this patch allows an autovacuum worker process to use
the parallel option according to the relation size or the reloption.
Regarding the autovacuum delay, however, since there are no slots for
parallel autovacuum workers in AutoVacuumShmem, this patch doesn't
support changing the autovacuum delay configuration while vacuum is
running.

Please apply this patch together with the extension lock patch[1] when
testing, as this patch can try to extend visibility map pages concurrently.

[1] https://www.postgresql.org/message-id/CAD21AoBn8WbOt21MFfj1mQmL2ZD8KVgMHYrOe1F5ozsQC4Z_hw%40mail.gmail.com
[2] https://www.pgcon.org/2018/schedule/events/1202.en.html

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

v7-0001-Publish-some-parallel-heap-scan-functions.patch (3K)
v7-0002-Add-parallel-option-to-lazy-vacuum.patch (196K)

Re: Block level parallel vacuum

Masahiko Sawada
On Tue, Aug 14, 2018 at 9:31 AM Masahiko Sawada <[hidden email]> wrote:


Attached rebased version patch to the current HEAD.

> Please apply this patch with the extension lock patch[1] when testing
> as this patch can try to extend visibility map pages concurrently.
>

Because the patch leads to performance degradation in the case of
bulk-loading into a partitioned table, I think that the original
proposal, which makes group locking conflict for relation extension
locks, is the more realistic approach. So I worked on this with the
simple patch instead of [1]. Attached are three patches:

* The 0001 patch publishes some static functions such as
heap_parallelscan_startblock_init so that the parallel vacuum code can
use them.
* The 0002 patch makes group locking conflict for relation extension locks.
* The 0003 patch adds a parallel option to lazy vacuum.

Please review them.

[1] https://www.postgresql.org/message-id/CAD21AoBn8WbOt21MFfj1mQmL2ZD8KVgMHYrOe1F5ozsQC4Z_hw%40mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Block level parallel vacuum

Masahiko Sawada
On Tue, Oct 30, 2018 at 5:30 PM Masahiko Sawada <[hidden email]> wrote:

Oops, forgot to attach patches.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

v8-0001-Publish-some-parallel-heap-scan-functions.patch (3K)
v8-0002-Make-group-locking-conflict-when-relation-exntesi.patch (1K)
v8-0003-Add-parallel-option-to-lazy-vacuum.patch (199K)

Re: Block level parallel vacuum

Юрий Соколов
Excuse me for being noisy.

Increasing vacuum's ring buffer improves vacuum up to 6 times.
https://www.postgresql.org/message-id/flat/20170720190405.GM1769%40tamriel.snowman.net
This is a one-line change.

How much improvement does parallel vacuum give?




Re: Block level parallel vacuum

Masahiko Sawada
Hi,

On Thu, Nov 1, 2018 at 2:28 PM Yura Sokolov <[hidden email]> wrote:
>
> Excuse me for being noisy.
>
> Increasing vacuum's ring buffer improves vacuum upto 6 times.
> https://www.postgresql.org/message-id/flat/20170720190405.GM1769%40tamriel.snowman.net
> This is one-line change.
>
> How much improvement parallel vacuum gives?

It depends on the hardware resources you can use.

In the current design, scanning the heap and vacuuming the heap are
processed by parallel workers at the block level (using a parallel
sequential scan), and vacuuming indexes is also processed by parallel
workers at the index level. So even if a table is not large, the more
indexes it has, the better performance you can get. The performance test
result I did before (attached) shows that parallel vacuum is up to
almost 10 times faster than single-process vacuum in one case. The test
used a not-so-large table (4GB) with many indexes, but it would be
interesting to test with a large table.
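
To make the index-level distribution concrete, the idea is roughly the
following loop, run by the leader and by every worker. This is only an
illustrative sketch (LVShared and vacuum_one_index are made-up names,
not the actual patch code); each process atomically claims the next
unprocessed index:

static void
parallel_vacuum_indexes(Relation *indrels, int nindexes, LVShared *lvshared)
{
    for (;;)
    {
        uint32      idx;

        /* atomically claim the next index so each index is processed once */
        idx = pg_atomic_fetch_add_u32(&lvshared->nprocessed, 1);

        if (idx >= (uint32) nindexes)
            break;              /* all indexes have been claimed */

        /* bulk-delete (or cleanup) this index using the shared dead tuples */
        vacuum_one_index(indrels[idx], lvshared);
    }
}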

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

parallel_vacuum.png (116K)

Re: Block level parallel vacuum

akapila
On Tue, Oct 30, 2018 at 2:04 PM Masahiko Sawada <[hidden email]> wrote:

>
> Attached rebased version patch to the current HEAD.
>
> > Please apply this patch with the extension lock patch[1] when testing
> > as this patch can try to extend visibility map pages concurrently.
> >
>
> Because the patch leads performance degradation in the case where
> bulk-loading to a partitioned table I think that the original
> proposal, which makes group locking conflict when relation extension
> locks, is more realistic approach. So I worked on this with the simple
> patch instead of [1]. Attached three patches:
>
> * 0001 patch publishes some static functions such as
> heap_paralellscan_startblock_init so that the parallel vacuum code can
> use them.
> * 0002 patch makes the group locking conflict when relation extension locks.
> * 0003 patch add paralel option to lazy vacuum.
>
> Please review them.
>

I could see that you have put a lot of effort into this patch and still
we are not able to make much progress, mainly, I guess, because of the
relation extension lock problem.  I think we can park that problem for
some time (as we have already invested quite some time in it), discuss
the actual parallel vacuum patch a bit, and then come back to it.  I
don't know if that is right or not.  I am not sure we can make this
ready in the PG12 timeframe, but I feel this patch deserves some
attention.  I have started reading the main parallel vacuum patch and
below are some assorted comments.

+     <para>
+      Execute <command>VACUUM</command> in parallel by <replaceable
class="parameter">N
+      </replaceable>a background workers. Collecting garbage on table
is processed
+      in block-level parallel. For tables with indexes, parallel
vacuum assigns each
+      index to each parallel vacuum worker and all garbages on a
index are processed
+      by particular parallel vacuum worker. The maximum nunber of
parallel workers
+      is <xref linkend="guc-max-parallel-workers-maintenance"/>. This
option can not
+      use with <literal>FULL</literal> option.
+     </para>

There are a couple of mistakes in above para:
(a) "..a background workers." a seems redundant.
(b) "Collecting garbage on table is processed in block-level
parallel."/"Collecting garbage on table is processed at block-level in
parallel."
(c) "For tables with indexes, parallel vacuum assigns each index to
each parallel vacuum worker and all garbages on a index are processed
by particular parallel vacuum worker."
We can rephrase it as:
"For tables with indexes, parallel vacuum assigns a worker to each
index and all garbages on a index are processed by particular that
parallel vacuum worker."
(d) Typo: nunber/number
(e) Typo: can not/cannot

I have glanced through part of the patch, but didn't find any README or
doc containing the design of this patch. I think without having the
design in place, it is difficult to review a patch of this size and
complexity. To start with, at least explain how the work is distributed
among workers; say there are two workers which need to vacuum a table
with four indexes, how does it work?  How does the leader participate
and coordinate the work?  Another part you can explain is how the state
is maintained during parallel vacuum, something like you are trying to
do in the below function:

+ * lazy_prepare_next_state
+ *
+ * Before enter the next state prepare the next state. In parallel lazy vacuum,
+ * we must wait for the all vacuum workers to finish the previous state before
+ * preparation. Also, after prepared we change the state ot all vacuum workers
+ * and wake up them.
+ */
+static void
+lazy_prepare_next_state(LVState *lvstate, LVLeader *lvleader, int next_state)

Other things to cover are how the stats are shared between the leader
and the workers.  I can understand a few things in bits and pieces while
glancing through the patch, but it would be easier to understand if you
document it in one place.  That can help reviewers to understand it.

Can you consider splitting the patch so that the refactoring you have
done in the current code to make it usable by parallel vacuum is a
separate patch?

+/*
+ * Vacuum all indexes. In parallel vacuum, each workers take indexes
+ * one by one. Also after vacuumed index they mark it as done. This marking
+ * is necessary to guarantee that all indexes are vacuumed based on
+ * the current collected dead tuples. The leader process continues to
+ * vacuum even if any indexes is not vacuumed completely due to failure of
+ * parallel worker for whatever reason. The mark will be checked
before entering
+ * the next state.
+ */
+void
+lazy_vacuum_all_indexes(LVState *lvstate)

I didn't understand what you want to say here.  Do you mean that the
leader can continue collecting more dead tuple TIDs while workers are
vacuuming the indexes?  How does it deal with errors, if any, during
index vacuum?

+ * plan_lazy_vacuum_workers_index_workers
+ * Use the planner to decide how many parallel worker processes
+ * VACUUM and autovacuum should request for use
+ *
+ * tableOid is the table begin vacuumed which must not be non-tables or
+ * special system tables.
..
+ plan_lazy_vacuum_workers(Oid tableOid, int nworkers_requested)

The comment starting from tableOid is not clear.  The actual function
name (plan_lazy_vacuum_workers) and the name in the comments
(plan_lazy_vacuum_workers_index_workers) don't match.  Can you take a
relation as the input parameter instead of tableOid, as that can save a
lot of code in this function?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: Block level parallel vacuum

akapila
On Sat, Nov 24, 2018 at 5:47 PM Amit Kapila <[hidden email]> wrote:
> On Tue, Oct 30, 2018 at 2:04 PM Masahiko Sawada <[hidden email]> wrote:
> >
>
> I could see that you have put a lot of effort on this patch and still
> we are not able to make much progress mainly I guess because of
> relation extension lock problem.  I think we can park that problem for
> some time (as already we have invested quite some time on it), discuss
> a bit about actual parallel vacuum patch and then come back to it.
>

Today, I was reading this and the previous related thread [1], and it
seems to me that multiple people, Andres [2] and Simon [3], have pointed
out that parallelization of the index portion is more valuable.  Also,
some of the results [4] indicate the same.  Now, when there are no
indexes, parallelizing heap scans also has a benefit, but I think in
practice we will see more cases where the user wants to vacuum tables
with indexes.  So how about if we break this problem up in the following
way, where each piece gives a benefit of its own:
(a) Parallelize index scans wherein the workers will be launched only
to vacuum indexes.  Only one worker per index will be spawned.
(b) Parallelize per-index vacuum.  Each index can be vacuumed by
multiple workers.
(c) Parallelize heap scans where multiple workers will scan the heap,
collect dead TIDs and then launch multiple workers for indexes.

I think if we break this problem into multiple patches, it will reduce
the scope of each patch and help us in making progress.  Now, it's been
more than 2 years that we have been trying to solve this problem, but we
still haven't made much progress.  I understand there are various
genuine reasons, and all of that work will help us in solving all the
problems in this area.  How about if we first target problem (a), and
once we are done with that we can see which of (b) or (c) we want to do
first?


[1] - https://www.postgresql.org/message-id/CAD21AoD1xAqp4zK-Vi1cuY3feq2oO8HcpJiz32UDUfe0BE31Xw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/20160823164836.naody2ht6cutioiz%40alap3.anarazel.de
[3] - https://www.postgresql.org/message-id/CANP8%2BjKWOw6AAorFOjdynxUKqs6XRReOcNy-VXRFFU_4bBT8ww%40mail.gmail.com
[4] - https://www.postgresql.org/message-id/CAGTBQpbU3R_VgyWk6jaD%3D6v-Wwrm8%2B6CbrzQxQocH0fmedWRkw%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: Block level parallel vacuum

Masahiko Sawada
On Sun, Nov 25, 2018 at 2:35 PM Amit Kapila <[hidden email]> wrote:
>
> On Sat, Nov 24, 2018 at 5:47 PM Amit Kapila <[hidden email]> wrote:
> > On Tue, Oct 30, 2018 at 2:04 PM Masahiko Sawada <[hidden email]> wrote:
> > >
> >

Thank you for the comment.

> > I could see that you have put a lot of effort on this patch and still
> > we are not able to make much progress mainly I guess because of
> > relation extension lock problem.  I think we can park that problem for
> > some time (as already we have invested quite some time on it), discuss
> > a bit about actual parallel vacuum patch and then come back to it.
> >
>
> Today, I was reading this and previous related thread [1] and it seems
> to me multiple people Andres [2], Simon [3] have pointed out that
> parallelization for index portion is more valuable.  Also, some of the
> results [4] indicate the same.  Now, when there are no indexes,
> parallelizing heap scans also have benefit, but I think in practice we
> will see more cases where the user wants to vacuum tables with
> indexes.  So how about if we break this problem in the following way
> where each piece give the benefit of its own:
> (a) Parallelize index scans wherein the workers will be launched only
> to vacuum indexes.  Only one worker per index will be spawned.
> (b) Parallelize per-index vacuum.  Each index can be vacuumed by
> multiple workers.
> (c) Parallelize heap scans where multiple workers will scan the heap,
> collect dead TIDs and then launch multiple workers for indexes.
>
> I think if we break this problem into multiple patches, it will reduce
> the scope of each patch and help us in making progress.   Now, it's
> been more than 2 years that we are trying to solve this problem, but
> still didn't make much progress.  I understand there are various
> genuine reasons and all of that work will help us in solving all the
> problems in this area.  How about if we first target problem (a) and
> once we are done with that we can see which of (b) or (c) we want to
> do first?

Thank you for the suggestion. It seems good to me. We would get nice
performance scalability even with only (a), and vacuum will get more
powerful with (b) or (c). Also, (a) would not require resolving the
relation extension lock issue, IIUC. I'll change the patch and submit it
to the next CF.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Block level parallel vacuum

akapila
On Mon, Nov 26, 2018 at 2:08 PM Masahiko Sawada <[hidden email]> wrote:

>
> Thank you for suggestion. It seems good to me. We would get a nice
> performance scalability even by only (a), and vacuum will get more
> powerful by (b) or (c). Also, (a) would not require to resovle the
> relation extension lock issue IIUC.
>

Yes, I also think so.  We do acquire the 'relation extension lock' during
index vacuum, but as part of (a), we are talking about one worker per
index, so there shouldn't be a problem with respect to deadlocks.

> I'll change the patch and submit
> to the next CF.
>

Okay.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: Block level parallel vacuum

Masahiko Sawada
On Tue, Nov 27, 2018 at 11:26 AM Amit Kapila <[hidden email]> wrote:

>
> Yes, I also think so.  We do acquire 'relation extension lock' during
> index vacuum, but as part of (a), we are talking one worker per-index,
> so there shouldn't be a problem with respect to deadlocks.
>
> > I'll change the patch and submit
> > to the next CF.
> >
>
> Okay.
>
Attached are the updated patches. I scaled back the scope of this
patch. The patch now includes only feature (a), that is, it executes
both index vacuum and index cleanup in parallel. It also doesn't include
autovacuum support for now.

The PARALLEL option works almost the same as in the previous patch. In
the VACUUM command, we can specify the 'PARALLEL n' option where n is
the number of parallel workers to request. If n is omitted, the number
of parallel workers would be the number of indexes - 1. We can also
specify the parallel degree with the parallel_workers reloption. The
number of parallel workers is capped by Min(# of indexes - 1,
max_parallel_maintenance_workers). That is, parallel vacuum can be
executed on a table if it has more than one index.
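
In other words, the worker count is decided roughly as in the following
sketch (illustrative only, not the actual patch code; Min and
max_parallel_maintenance_workers are the existing macro and GUC):

static int
compute_parallel_vacuum_workers(int nindexes, int nrequested,
                                int reloption_workers)
{
    int         nworkers;

    if (nindexes <= 1)
        return 0;           /* parallel vacuum needs at least two indexes */

    if (nrequested > 0)
        nworkers = nrequested;          /* explicit PARALLEL n */
    else if (reloption_workers > 0)
        nworkers = reloption_workers;   /* parallel_workers reloption */
    else
        nworkers = nindexes - 1;        /* one worker per index, minus leader */

    nworkers = Min(nworkers, nindexes - 1);
    return Min(nworkers, max_parallel_maintenance_workers);
}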

As for the internal design, the details are written in the comment at
the top of the vacuumlazy.c file. In parallel vacuum mode, we allocate a
DSM segment at the beginning of lazy vacuum which stores shared
information as well as dead tuples. When starting either index vacuum or
index cleanup we launch parallel workers. The parallel workers perform
either index vacuum or index cleanup for each index, and then exit after
all indexes are done. The leader process then re-initializes the DSM and
re-launches workers the next time, rather than destroying the parallel
context here. After lazy vacuum is done, the leader process exits
parallel mode and updates the index statistics, since we are not allowed
any writes during parallel mode.
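
The leader-side control flow is roughly the following, using the core
parallel infrastructure from access/parallel.h (a simplified sketch with
arguments and the surrounding lazy vacuum logic elided; not the actual
patch code):

EnterParallelMode();
pcxt = CreateParallelContext(...);      /* worker entry point and nworkers */
InitializeParallelDSM(pcxt);            /* shared info and dead tuple space */

/* one round per index vacuum pass, plus a final round for index cleanup */
while (another_index_vacuum_or_cleanup_round())     /* hypothetical */
{
    ReinitializeParallelDSM(pcxt);      /* reset per-round state, keep context */
    LaunchParallelWorkers(pcxt);
    /* the leader also processes indexes itself while workers are running */
    WaitForParallelWorkersToFinish(pcxt);
}

DestroyParallelContext(pcxt);
ExitParallelMode();
/* only now update index statistics, since writes are not allowed above */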

I've also attached the 0002 patch to support parallel lazy vacuum in
the vacuumdb command.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

v9-0002-Add-P-option-to-vacuumdb-command.patch (7K)
v9-0001-Add-parallel-option-to-VACUUM-command.patch (96K)

Re: Block level parallel vacuum

akapila
On Tue, Dec 18, 2018 at 1:29 PM Masahiko Sawada <[hidden email]> wrote:

>
> Attached the updated patches. I scaled back the scope of this patch.
> The patch now includes only feature (a), that is it execute both index
> vacuum and cleanup index in parallel. It also doesn't include
> autovacuum support for now.
>
> The PARALLEL option works alomst same as before patch. In VACUUM
> command, we can specify 'PARALLEL n' option where n is the number of
> parallel workers to request. If the n is omitted the number of
> parallel worekrs would be # of indexes -1.
>

I think for now this is okay, but I guess in the future when we make
heap scans also parallel or maybe allow more than one worker per index
vacuum, then this won't hold good. So, I am not sure if the below text
in the docs is the most appropriate.

+    <term><literal>PARALLEL <replaceable
class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute index vacuum and cleanup index in parallel with
+      <replaceable class="parameter">N</replaceable> background
workers. If the parallel
+      degree <replaceable class="parameter">N</replaceable> is omitted,
+      <command>VACUUM</command> requests the number of indexes - 1
processes, which is the
+      maximum number of parallel vacuum workers since individual
indexes is processed by
+      one process. The actual number of parallel vacuum workers may
be less due to the
+      setting of <xref linkend="guc-max-parallel-workers-maintenance"/>.
+      This option can not use with  <literal>FULL</literal> option.

It might be better to use some generic language in the docs, something
like "If the parallel degree N is omitted, then vacuum decides the
number of workers based on number of indexes on the relation which is
further limited by max-parallel-workers-maintenance".  I think you
also need to mention in some way that you consider the storage option
parallel_workers.

Few assorted comments:
1.
+lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
{
..
+
+ LaunchParallelWorkers(lvstate->pcxt);
+
+ /*
+ * if no workers launched, we vacuum all indexes by the leader process
+ * alone. Since there is hope that we can launch workers in the next
+ * execution time we don't want to end the parallel mode yet.
+ */
+ if (lvstate->pcxt->nworkers_launched == 0)
+ return;

It is quite possible that the workers are not launched because we fail
to allocate memory, basically when pcxt->nworkers is zero.  I think in
such cases there is no use for being in parallel mode.  You can even
detect that before calling lazy_begin_parallel_vacuum_index.

2.
static void
+lazy_vacuum_all_indexes_for_leader(LVState *lvstate,
IndexBulkDeleteResult **stats,
+    LVTidMap *dead_tuples, bool do_parallel,
+    bool for_cleanup)
{
..
+ if (do_parallel)
+ lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
+
+ for (;;)
+ {
+ IndexBulkDeleteResult *r = NULL;
+
+ /*
+ * Get the next index number to vacuum and set index statistics. In parallel
+ * lazy vacuum, index bulk-deletion results are stored in the shared memory
+ * segment. If it's already updated we use it rather than setting it to NULL.
+ * In single vacuum, we can always use an element of the 'stats'.
+ */
+ if (do_parallel)
+ {
+ idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+ if (lvshared->indstats[idx].updated)
+ r = &(lvshared->indstats[idx].stats);
+ }

It is quite possible that we are not able to launch any workers in
lazy_begin_parallel_vacuum_index; in such cases, we should not use the
parallel mode path.  Basically, as written, we can't rely on the
'do_parallel' flag.


--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: Block level parallel vacuum

Masahiko Sawada
On Thu, Dec 20, 2018 at 3:38 PM Amit Kapila <[hidden email]> wrote:

>
> On Tue, Dec 18, 2018 at 1:29 PM Masahiko Sawada <[hidden email]> wrote:
> >
> > Attached the updated patches. I scaled back the scope of this patch.
> > The patch now includes only feature (a), that is it execute both index
> > vacuum and cleanup index in parallel. It also doesn't include
> > autovacuum support for now.
> >
> > The PARALLEL option works alomst same as before patch. In VACUUM
> > command, we can specify 'PARALLEL n' option where n is the number of
> > parallel workers to request. If the n is omitted the number of
> > parallel worekrs would be # of indexes -1.
> >
>
> I think for now this is okay, but I guess in furture when we make
> heapscans also parallel or maybe allow more than one worker per-index
> vacuum, then this won't hold good. So, I am not sure if below text in
> docs is most appropriate.
>
> +    <term><literal>PARALLEL <replaceable
> class="parameter">N</replaceable></literal></term>
> +    <listitem>
> +     <para>
> +      Execute index vacuum and cleanup index in parallel with
> +      <replaceable class="parameter">N</replaceable> background
> workers. If the parallel
> +      degree <replaceable class="parameter">N</replaceable> is omitted,
> +      <command>VACUUM</command> requests the number of indexes - 1
> processes, which is the
> +      maximum number of parallel vacuum workers since individual
> indexes is processed by
> +      one process. The actual number of parallel vacuum workers may
> be less due to the
> +      setting of <xref linkend="guc-max-parallel-workers-maintenance"/>.
> +      This option can not use with  <literal>FULL</literal> option.
>
> It might be better to use some generic language in docs, something
> like "If the parallel degree N is omitted, then vacuum decides the
> number of workers based on number of indexes on the relation which is
> further limited by max-parallel-workers-maintenance".
Thank you for the review.

I agree with your concern and the text you suggested.

>  I think you
> also need to mention in some way that you consider storage option
> parallel_workers.

Added.

>
> Few assorted comments:
> 1.
> +lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
> {
> ..
> +
> + LaunchParallelWorkers(lvstate->pcxt);
> +
> + /*
> + * if no workers launched, we vacuum all indexes by the leader process
> + * alone. Since there is hope that we can launch workers in the next
> + * execution time we don't want to end the parallel mode yet.
> + */
> + if (lvstate->pcxt->nworkers_launched == 0)
> + return;
>
> It is quite possible that the workers are not launched because we fail
> to allocate memory, basically when pcxt->nworkers is zero.  I think in
> such cases there is no use for being in parallel mode.  You can even
> detect that before calling lazy_begin_parallel_vacuum_index.
Agreed. We can stop the preparation and exit parallel mode if
pcxt->nworkers is 0 after InitializeParallelDSM().
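
Something like the following sketch (just an illustration; the serial
fallback helper name is hypothetical):

InitializeParallelDSM(pcxt);
if (pcxt->nworkers == 0)
{
    /* no workers could be set up: give up on parallel mode entirely */
    DestroyParallelContext(pcxt);
    ExitParallelMode();
    lazy_vacuum_all_indexes_serially(lvstate);  /* hypothetical fallback */
    return;
}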

>
> 2.
> static void
> +lazy_vacuum_all_indexes_for_leader(LVState *lvstate,
> IndexBulkDeleteResult **stats,
> +    LVTidMap *dead_tuples, bool do_parallel,
> +    bool for_cleanup)
> {
> ..
> + if (do_parallel)
> + lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
> +
> + for (;;)
> + {
> + IndexBulkDeleteResult *r = NULL;
> +
> + /*
> + * Get the next index number to vacuum and set index statistics. In parallel
> + * lazy vacuum, index bulk-deletion results are stored in the shared memory
> + * segment. If it's already updated we use it rather than setting it to NULL.
> + * In single vacuum, we can always use an element of the 'stats'.
> + */
> + if (do_parallel)
> + {
> + idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
> +
> + if (lvshared->indstats[idx].updated)
> + r = &(lvshared->indstats[idx].stats);
> + }
>
> It is quite possible that we are not able to launch any workers in
> lazy_begin_parallel_vacuum_index, in such cases, we should not use
> parallel mode path, basically as written we can't rely on
> 'do_parallel' flag.
>
Fixed.

Attached new version patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

v10-0002-Add-P-option-to-vacuumdb-command.patch (7K)
v10-0001-Add-parallel-option-to-VACUUM-command.patch (100K)

Re: Block level parallel vacuum

Masahiko Sawada
On Fri, Dec 28, 2018 at 11:43 AM Masahiko Sawada <[hidden email]> wrote:

Rebased.


Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

v11-0001-Add-parallel-option-to-VACUUM-command.patch (99K)
v11-0002-Add-P-option-to-vacuumdb-command.patch (7K)

Re: Block level parallel vacuum

Haribabu Kommi-2

On Tue, Jan 15, 2019 at 6:00 PM Masahiko Sawada <[hidden email]> wrote:

Rebased.

I started reviewing the patch, but I haven't finished my review yet.
Following are some of the comments.

+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute index vacuum and cleanup index in parallel with

I doubt that users can understand the terms index vacuum and cleanup index.
Maybe it needs some more detailed information.


- VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7 /* don't skip any pages */
+ VACOPT_PARALLEL = 1 << 7, /* do lazy VACUUM in parallel */
+ VACOPT_DISABLE_PAGE_SKIPPING = 1 << 8 /* don't skip any pages */
+} VacuumOptionFlag;

Any specific reason behind not adding it as the last member of the enum?


-typedef enum VacuumOption
+typedef enum VacuumOptionFlag
 {

I don't find the new name quite good, how about VacuumFlags?


+typedef struct VacuumOption
+{

How about VacuumOptions? Because this structure can contain all the
options provided to the vacuum operation.



+ vacopt1->flags |= vacopt2->flags;
+ if (vacopt2->flags == VACOPT_PARALLEL)
+ vacopt1->nworkers = vacopt2->nworkers;
+ pfree(vacopt2);
+ $$ = vacopt1;
+ }

As the above statement indicates that the last specified number of parallel
workers is the one taken into account, can we explain it in the docs?


postgres=# vacuum (parallel 2, verbose) tbl;

With verbose, no parallel-worker-related information is available.
I feel that giving that information is required even when it is not a
parallel vacuum.


Regards,
Haribabu Kommi
Fujitsu Australia

Re: Block level parallel vacuum

Masahiko Sawada
On Fri, Jan 18, 2019 at 10:38 AM Haribabu Kommi
<[hidden email]> wrote:

>
>
> On Tue, Jan 15, 2019 at 6:00 PM Masahiko Sawada <[hidden email]> wrote:
>>
>>
>> Rebased.
>
>
> I started reviewing the patch, I didn't finish my review yet.
> Following are some of the comments.
Thank you for reviewing the patch.

>
> +    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
> +    <listitem>
> +     <para>
> +      Execute index vacuum and cleanup index in parallel with
>
> I doubt that user can understand the terms index vacuum and cleanup index.
> May be it needs some more detailed information.
>

Agreed. Table 27.22 "Vacuum phases" has a good description of the vacuum
phases, so maybe adding a reference to it would work.

>
> - VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7 /* don't skip any pages */
> + VACOPT_PARALLEL = 1 << 7, /* do lazy VACUUM in parallel */
> + VACOPT_DISABLE_PAGE_SKIPPING = 1 << 8 /* don't skip any pages */
> +} VacuumOptionFlag;
>
> Any specific reason behind not adding it as last member of the enum?
>

My mistake, fixed it.

>
> -typedef enum VacuumOption
> +typedef enum VacuumOptionFlag
>  {
>
> I don't find the new name quite good, how about VacuumFlags?
>

I agree with removing "Option" from the name, but I think VacuumFlag
would be better because this enum represents only one flag. Thoughts?

>
> +typedef struct VacuumOption
> +{
>
> How about VacuumOptions? Because this structure can contains all the
> options provided to vacuum operation.
>

Agreed.

>
>
> + vacopt1->flags |= vacopt2->flags;
> + if (vacopt2->flags == VACOPT_PARALLEL)
> + vacopt1->nworkers = vacopt2->nworkers;
> + pfree(vacopt2);
> + $$ = vacopt1;
> + }
>
> As the above statement indicates the the last parallel number of workers
> is considered into the account, can we explain it in docs?
>
Agreed.

>
> postgres=# vacuum (parallel 2, verbose) tbl;
>
> With verbose, no parallel workers related information is available.
> I feel giving that information is required even when it is not parallel
> vacuum also.
>

Agreed. How about the following verbose output? I've added the number
of launched, planned, and requested vacuum workers and the purpose
(vacuum or cleanup).

postgres(1:91536)=# vacuum (verbose, parallel 30) test; -- table
'test' has 3 indexes
INFO:  vacuuming "public.test"
INFO:  launched 2 parallel vacuum workers for index vacuum (planned:
2, requested: 30)
INFO:  scanned index "test_idx1" to remove 2000 row versions
DETAIL:  CPU: user: 0.12 s, system: 0.00 s, elapsed: 0.12 s
INFO:  scanned index "test_idx2" to remove 2000 row versions by
parallel vacuum worker
DETAIL:  CPU: user: 0.07 s, system: 0.05 s, elapsed: 0.12 s
INFO:  scanned index "test_idx3" to remove 2000 row versions by
parallel vacuum worker
DETAIL:  CPU: user: 0.09 s, system: 0.05 s, elapsed: 0.14 s
INFO:  "test": removed 2000 row versions in 10 pages
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  launched 2 parallel vacuum workers for index cleanup (planned:
2, requested: 30)
INFO:  index "test_idx1" now contains 991151 row versions in 2745 pages
DETAIL:  2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "test_idx2" now contains 991151 row versions in 2745 pages
DETAIL:  2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "test_idx3" now contains 991151 row versions in 2745 pages
DETAIL:  2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "test": found 2000 removable, 367 nonremovable row versions in
41 out of 4425 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 500
There were 6849 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.12 s, system: 0.01 s, elapsed: 0.17 s.
VACUUM
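
For reference, the first "launched ..." INFO line above is produced with
an ereport() call roughly along these lines (a simplified sketch, not
the exact patch code; the variable names are only illustrative):

ereport(INFO,
        (errmsg("launched %d parallel vacuum workers for index vacuum (planned: %d, requested: %d)",
                nworkers_launched, nworkers_planned, nworkers_requested)));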

Since the previous patch conflicts with 285d8e12, I've attached the
latest version patch that incorporates the review comments I got.




Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

v12-0002-Add-P-option-to-vacuumdb-command.patch (9K)
v12-0001-Add-parallel-option-to-VACUUM-command.patch (104K)

Re: Block level parallel vacuum

Haribabu Kommi-2

On Fri, Jan 18, 2019 at 11:42 PM Masahiko Sawada <[hidden email]> wrote:
On Fri, Jan 18, 2019 at 10:38 AM Haribabu Kommi
<[hidden email]> wrote:
>
>
> On Tue, Jan 15, 2019 at 6:00 PM Masahiko Sawada <[hidden email]> wrote:
>>
>>
>> Rebased.
>
>
> I started reviewing the patch, I didn't finish my review yet.
> Following are some of the comments.

Thank you for reviewing the patch.

>
> +    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
> +    <listitem>
> +     <para>
> +      Execute index vacuum and cleanup index in parallel with
>
> I doubt that user can understand the terms index vacuum and cleanup index.
> May be it needs some more detailed information.
>

Agreed. Table 27.22 "Vacuum phases" has a good description of vacuum
phases. So maybe adding the referencint to it would work.

OK.
 
>
> -typedef enum VacuumOption
> +typedef enum VacuumOptionFlag
>  {
>
> I don't find the new name quite good, how about VacuumFlags?
>

Agreed with removing "Option" from the name but I think VacuumFlag
would be better because this enum represents only one flag. Thoughts?

OK.
 

> postgres=# vacuum (parallel 2, verbose) tbl;
>
> With verbose, no parallel workers related information is available.
> I feel giving that information is required even when it is not parallel
> vacuum also.
>

Agreed. How about the following verbose output? I've added the number
of launched, planned, and requested vacuum workers and the purpose
(vacuum or cleanup).

postgres(1:91536)=# vacuum (verbose, parallel 30) test; -- table
'test' has 3 indexes
INFO:  vacuuming "public.test"
INFO:  launched 2 parallel vacuum workers for index vacuum (planned:
2, requested: 30)
INFO:  scanned index "test_idx1" to remove 2000 row versions
DETAIL:  CPU: user: 0.12 s, system: 0.00 s, elapsed: 0.12 s
INFO:  scanned index "test_idx2" to remove 2000 row versions by
parallel vacuum worker
DETAIL:  CPU: user: 0.07 s, system: 0.05 s, elapsed: 0.12 s
INFO:  scanned index "test_idx3" to remove 2000 row versions by
parallel vacuum worker
DETAIL:  CPU: user: 0.09 s, system: 0.05 s, elapsed: 0.14 s
INFO:  "test": removed 2000 row versions in 10 pages
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  launched 2 parallel vacuum workers for index cleanup (planned:
2, requested: 30)
INFO:  index "test_idx1" now contains 991151 row versions in 2745 pages
DETAIL:  2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "test_idx2" now contains 991151 row versions in 2745 pages
DETAIL:  2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "test_idx3" now contains 991151 row versions in 2745 pages
DETAIL:  2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "test": found 2000 removable, 367 nonremovable row versions in
41 out of 4425 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 500
There were 6849 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.12 s, system: 0.01 s, elapsed: 0.17 s.
VACUUM
 
The verbose output is good.

Since the previous patch conflicts with 285d8e12 I've attached the
latest version patch that incorporated the review comment I got.

Thanks for the latest patch. I have some more minor comments.

+      Execute index vacuum and cleanup index in parallel with

Better to use "vacuum index" and "cleanup index"? That is in line with
the description of the vacuum phases; it is better to follow the same
notation throughout the patch.


+ dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);

With this change, lazy_space_alloc takes care of initializing the
parallel vacuum; can we mention that in its comments?


+ initprog_val[2] = dead_tuples->max_dead_tuples;

The dead_tuples variable may need a rename for better readability?



+ if (lvshared->indstats[idx].updated)
+ result = &(lvshared->indstats[idx].stats);
+ else
+ copy_result = true;


I don't see a need for the copy_result variable; how about directly
using the updated flag to decide whether to copy or not? Once the
result is copied, update the flag.


+use Test::More tests => 34;

I don't see any new tests added in this patch.

I am wondering about the performance penalty if we use the parallel
option of vacuum on a small table.

Regards,
Haribabu Kommi
Fujitsu Australia

Re: Block level parallel vacuum

Masahiko Sawada
On Tue, Jan 22, 2019 at 9:59 PM Haribabu Kommi <[hidden email]> wrote:
>
>
> Thanks for the latest patch. I have some more minor comments.

Thank you for reviewing the patch.

>
> +      Execute index vacuum and cleanup index in parallel with
>
> Better to use vacuum index and cleanup index? This is in same with
> the description of vacuum phases. It is better to follow same notation
> in the patch.

Agreed. I've changed it to "Vacuum index and cleanup index in parallel
with ...".

>
>
> + dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);
>
> With the change, the lazy_space_alloc takes care of initializing the
> parallel vacuum, can we write something related to that in the comments.
>

Agreed.

>
> + initprog_val[2] = dead_tuples->max_dead_tuples;
>
> dead_tuples variable may need rename for better reading?
>

I might not have understood your comment correctly, but I've tried to
fix it. Please review it.

>
>
> + if (lvshared->indstats[idx].updated)
> + result = &(lvshared->indstats[idx].stats);
> + else
> + copy_result = true;
>
>
> I don't see a need for copy_result variable, how about directly using
> the updated flag to decide whether to copy or not? Once the result is
> copied update the flag.
>
You're right. Fixed.

>
> +use Test::More tests => 34;
>
> I don't find any new tetst are added in this patch.

Fixed.

>
> I am thinking of performance penalty if we use the parallel option of
> vacuum on a small sized table?

Hmm, unlike parallel operations other than Parallel Append, parallel
vacuum executes multiple index vacuums simultaneously, so it can avoid
contention. I think there is a performance penalty, but it would not
be big.

Attached the latest patches.




Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

v13-0001-Add-parallel-option-to-VACUUM-command.patch (105K)
v13-0002-Add-P-option-to-vacuumdb-command.patch (8K)

Re: Block level parallel vacuum

Haribabu Kommi-2

On Thu, Jan 24, 2019 at 1:16 PM Masahiko Sawada <[hidden email]> wrote:

Attached the latest patches.

Thanks for the updated patches.
Some more code review comments.

+         started by a single utility command.  Currently, the parallel
+         utility commands that support the use of parallel workers are
+         <command>CREATE INDEX</command> and <command>VACUUM</command>
+         without <literal>FULL</literal> option, and only when building
+         a B-tree index.  Parallel workers are taken from the pool of


I feel the above sentence may not give the proper picture; how about
the following modification?

<command>CREATE INDEX</command> only when building a B-tree index 
and <command>VACUUM</command> without <literal>FULL</literal> option.



+ * parallel vacuum, we perform both index vacuum and index cleanup in parallel.
+ * Individual indexes is processed by one vacuum process. At beginning of

How about "vacuum index" and "cleanup index", similar to other places?


+ * memory space for dead tuples. When starting either index vacuum or cleanup
+ * vacuum, we launch parallel worker processes. Once all indexes are processed

same here as well?


+ * Before starting parallel index vacuum and parallel cleanup index we launch
+ * parallel workers. All parallel workers will exit after processed all indexes

parallel vacuum index and parallel cleanup index?


+ /*
+ * If there is already-updated result in the shared memory we
+ * use it. Otherwise we pass NULL to index AMs and copy the
+ * result to the shared memory segment.
+ */
+ if (lvshared->indstats[idx].updated)
+ result = &(lvshared->indstats[idx].stats);

I didn't really find a need for the flag to differentiate the stats pointer
between the first run and the second run. I don't see any problem in passing
the stats directly, since the same stats are updated on both the worker side
and the leader side. Anyway, no two processes will vacuum the same index at
the same time. Am I missing something?

Even if this flag is to identify whether the stats have been updated before
writing them, I don't see a need for it compared to normal vacuum.


+ * Enter the parallel mode, allocate and initialize a DSM segment. Return
+ * the memory space for storing dead tuples or NULL if no workers are prepared.
+ */

+ pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+ request, true);

But we are passing the serializable_okay flag as true, which means it doesn't
return NULL. Is that expected?


+ initStringInfo(&buf);
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker %s (planned: %d",
+   "launched %d parallel vacuum workers %s (planned: %d",
+   lvstate->pcxt->nworkers_launched),
+ lvstate->pcxt->nworkers_launched,
+ for_cleanup ? "for index cleanup" : "for index vacuum",
+ lvstate->pcxt->nworkers);
+ if (lvstate->options.nworkers > 0)
+ appendStringInfo(&buf, ", requested %d", lvstate->options.nworkers);

What is the difference between planned workers and requested workers? Aren't
they the same?


- COMPARE_SCALAR_FIELD(options);
- COMPARE_NODE_FIELD(rels);
+ if (a->options.flags != b->options.flags)
+ return false;
+ if (a->options.nworkers != b->options.nworkers)
+ return false;

The options comparison is changed from COMPARE_SCALAR_FIELD to explicit checks,
but why is the rels check removed? Since options has changed from an int to a
structure, using SCALAR may not work in other functions like _copyVacuumStmt
either.


+typedef struct VacuumOptions
+{
+ VacuumFlag flags; /* OR of VacuumFlag */
+ int nworkers; /* # of parallel vacuum workers */
+} VacuumOptions;


Do we need to add a NodeTag to the above structure, since it is part of the
VacuumStmt structure?


+        <application>vacuumdb</application> will require background workers,
+        so make sure your <xref linkend="guc-max-parallel-workers-maintenance"/>
+        setting is more than one.

How about removing vacuumdb and changing it to "This option will ..."?

I will continue the testing of this patch and share the details. 

Regards,
Haribabu Kommi
Fujitsu Australia

Re: Block level parallel vacuum

Masahiko Sawada
On Wed, Jan 30, 2019 at 2:06 AM Haribabu Kommi <[hidden email]> wrote:

>
>
> On Thu, Jan 24, 2019 at 1:16 PM Masahiko Sawada <[hidden email]> wrote:
>>
>>
>> Attached the latest patches.
>
>
> Thanks for the updated patches.
> Some more code review comments.
>

Thank you!

> +         started by a single utility command.  Currently, the parallel
> +         utility commands that support the use of parallel workers are
> +         <command>CREATE INDEX</command> and <command>VACUUM</command>
> +         without <literal>FULL</literal> option, and only when building
> +         a B-tree index.  Parallel workers are taken from the pool of
>
>
> I feel the above sentence may not give the proper picture, how about the
> adding following modification?
>
> <command>CREATE INDEX</command> only when building a B-tree index
> and <command>VACUUM</command> without <literal>FULL</literal> option.
>
>

Agreed.

>
> + * parallel vacuum, we perform both index vacuum and index cleanup in parallel.
> + * Individual indexes is processed by one vacuum process. At beginning of
>
> How about vacuum index and cleanup index similar like other places?
>
>
> + * memory space for dead tuples. When starting either index vacuum or cleanup
> + * vacuum, we launch parallel worker processes. Once all indexes are processed
>
> same here as well?
>
>
> + * Before starting parallel index vacuum and parallel cleanup index we launch
> + * parallel workers. All parallel workers will exit after processed all indexes
>
> parallel vacuum index and parallel cleanup index?
>
>

ISTM we're using terms like "index vacuuming", "index cleanup" and "FSM
vacuuming" in vacuumlazy.c, so maybe "parallel index vacuuming" and
"parallel index cleanup" would be better?

> + /*
> + * If there is already-updated result in the shared memory we
> + * use it. Otherwise we pass NULL to index AMs and copy the
> + * result to the shared memory segment.
> + */
> + if (lvshared->indstats[idx].updated)
> + result = &(lvshared->indstats[idx].stats);
>
> I didn't really find a need of the flag to differentiate the stats pointer from
> first run to second run? I don't see any problem in passing directing the stats
> and the same stats are updated in the worker side and leader side. Anyway no two
> processes will do the index vacuum at same time. Am I missing something?
>
> Even if this flag is to identify whether the stats are updated or not before
> writing them, I don't see a need of it compared to normal vacuum.
>

Passing stats = NULL to amvacuumcleanup and ambulkdelete means it is the
first execution. For example, btvacuumcleanup skips the cleanup if stats
is not NULL. In a normal vacuum we pass NULL to ambulkdelete or
amvacuumcleanup on the first call, and they store the result stats in
locally allocated memory. Therefore, in a parallel vacuum I think both
workers and the leader need to move the result to shared memory and mark
it as updated, since a different worker could vacuum that index the next
time.
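
As a minimal sketch of that flow (the helper names such as ivinfo and the
callback are assumptions based on vacuumlazy.c, not the patch's actual code):

    IndexBulkDeleteResult *stats;

    /*
     * If this index already has stats in the shared slot, let the AM
     * update them in place; otherwise pass NULL to signal the first call.
     */
    if (lvshared->indstats[idx].updated)
        stats = &(lvshared->indstats[idx].stats);
    else
        stats = NULL;

    stats = index_bulk_delete(&ivinfo, stats, lazy_tid_reaped, dead_tuples);

    /*
     * First call: the AM returned a locally palloc'd result, so copy it
     * into the shared slot and mark it as updated, because a different
     * process may vacuum this index next time.
     */
    if (!lvshared->indstats[idx].updated && stats != NULL)
    {
        memcpy(&(lvshared->indstats[idx].stats), stats,
               sizeof(IndexBulkDeleteResult));
        lvshared->indstats[idx].updated = true;
    }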

>
> + * Enter the parallel mode, allocate and initialize a DSM segment. Return
> + * the memory space for storing dead tuples or NULL if no workers are prepared.
> + */
>
> + pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
> + request, true);
>
> But we are passing as serializable_okay flag as true, means it doesn't return
> NULL. Is it expected?
>
>

I think you're right. Since the request can never be 0 and serializable_okay
is true, it should not return NULL. Will fix.

> + initStringInfo(&buf);
> + appendStringInfo(&buf,
> + ngettext("launched %d parallel vacuum worker %s (planned: %d",
> +   "launched %d parallel vacuum workers %s (planned: %d",
> +   lvstate->pcxt->nworkers_launched),
> + lvstate->pcxt->nworkers_launched,
> + for_cleanup ? "for index cleanup" : "for index vacuum",
> + lvstate->pcxt->nworkers);
> + if (lvstate->options.nworkers > 0)
> + appendStringInfo(&buf, ", requested %d", lvstate->options.nworkers);
>
> what is the difference between planned workers and requested workers, aren't both
> are same?

The requested value is the parallel degree specified explicitly by the
user, whereas the planned value is the actual number we planned based on
the number of indexes the table has. For example, if we run 'VACUUM
(PARALLEL 3000) tbl' where tbl has 4 indexes, the requested value is 3000
and the planned value is 4. Also, if max_parallel_maintenance_workers is 2,
the planned value is 2.
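
As a rough sketch of how those numbers relate (the function name and exact
capping order are assumptions, not the patch's actual code):

    /* Decide the planned number of parallel vacuum workers. */
    static int
    compute_planned_workers(int nindexes, int nrequested)
    {
        int     planned = nindexes;     /* at most one worker per index */

        /* An explicit PARALLEL N request caps the degree. */
        if (nrequested > 0)
            planned = Min(planned, nrequested);

        /* Never exceed max_parallel_maintenance_workers. */
        planned = Min(planned, max_parallel_maintenance_workers);

        return planned;
    }

So with 4 indexes and a request of 3000 this plans 4 workers, and lowering
max_parallel_maintenance_workers to 2 brings the planned number down to 2.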

>
>
> - COMPARE_SCALAR_FIELD(options);
> - COMPARE_NODE_FIELD(rels);
> + if (a->options.flags != b->options.flags)
> + return false;
> + if (a->options.nworkers != b->options.nworkers)
> + return false;
>
> Options is changed from SCALAR to check, but why the rels check is removed?
> The options is changed from int to a structure so using SCALAR may not work
> in other function like _copyVacuumStmt and etc?

Agreed and will fix.
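
For illustration, one way the fix could look in _equalVacuumStmt() (a sketch
only, assuming the VacuumOptions structure from the quoted patch, with the
dropped rels comparison restored):

    static bool
    _equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
    {
        /* compare the embedded option fields explicitly */
        if (a->options.flags != b->options.flags)
            return false;
        if (a->options.nworkers != b->options.nworkers)
            return false;

        /* keep the rels comparison that was dropped */
        COMPARE_NODE_FIELD(rels);

        return true;
    }

If options were instead made a node and stored as a pointer field, it could
again be compared with COMPARE_NODE_FIELD, which would also cover
_copyVacuumStmt.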

>
> +typedef struct VacuumOptions
> +{
> + VacuumFlag flags; /* OR of VacuumFlag */
> + int nworkers; /* # of parallel vacuum workers */
> +} VacuumOptions;
>
>
> Do we need to add NodeTag for the above structure? Because this structure is
> part of VacuumStmt structure.

Yes, I will add it.
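
A sketch of the structure with a tag added (the tag name T_VacuumOptions is
an assumption and would need a matching entry in the NodeTag enum):

    typedef struct VacuumOptions
    {
        NodeTag     type;       /* == T_VacuumOptions */
        VacuumFlag  flags;      /* OR of VacuumFlag values */
        int         nworkers;   /* # of parallel vacuum workers */
    } VacuumOptions;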

>
>
> +        <application>vacuumdb</application> will require background workers,
> +        so make sure your <xref linkend="guc-max-parallel-workers-maintenance"/>
> +        setting is more than one.
>
> removing vacuumdb and changing it as "This option will ..."?
>
Agreed.

> I will continue the testing of this patch and share the details.
>

Thank you. I'll submit the updated patch set.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Block level parallel vacuum

akapila
On Fri, Feb 1, 2019 at 2:49 AM Masahiko Sawada <[hidden email]> wrote:
>
> On Wed, Jan 30, 2019 at 2:06 AM Haribabu Kommi <[hidden email]> wrote:
>
> Thank you. I'll submit the updated patch set.
>

I don't see any chance of getting this committed in the next few days,
so I have moved it to the next CF. Thanks for working on this, and I
hope you will continue to work on this project.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
