Track statistics for streaming of in-progress transactions

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

Track statistics for streaming of in-progress transactions

akapila
Commit 464824323e has added the support of the streaming of
in-progress transactions into the built-in logical replication. The
attached patch adds the statistics about transactions streamed to the
decoding output plugin from ReorderBuffer. Users can query the
pg_stat_replication_slots view to check these stats and call
pg_stat_reset_replication_slot to reset the stats of a particular
slot. Users can pass NULL in pg_stat_reset_replication_slot to reset
stats of all the slots.

Commit 9868167500 has added the basic infrastructure to capture the
stats of slot and this commit extends the statistics collector to
track additional information about slots.

This patch was originally written by Ajin Cherian [1]. I have fixed
bugs and modified some comments in the code.

Thoughts?

[1] - https://www.postgresql.org/message-id/CAFPTHDZ8RnOovefzB%2BOMoRxLSD404WRLqWBUHe6bWqM5ew1bNA%40mail.gmail.com

--
With Regards,
Amit Kapila

v2-0001-Track-statistics-for-streaming-of-changes-from-Re.patch (18K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Track statistics for streaming of in-progress transactions

Ajin Cherian
On Wed, Oct 14, 2020 at 2:39 PM Amit Kapila <[hidden email]> wrote:

>
> Commit 464824323e has added the support of the streaming of
> in-progress transactions into the built-in logical replication. The
> attached patch adds the statistics about transactions streamed to the
> decoding output plugin from ReorderBuffer. Users can query the
> pg_stat_replication_slots view to check these stats and call
> pg_stat_reset_replication_slot to reset the stats of a particular
> slot. Users can pass NULL in pg_stat_reset_replication_slot to reset
> stats of all the slots.
>
> Commit 9868167500 has added the basic infrastructure to capture the
> stats of slot and this commit extends the statistics collector to
> track additional information about slots.
>
> This patch was originally written by Ajin Cherian [1]. I have fixed
> bugs and modified some comments in the code.
>
> Thoughts?
>
> [1] - https://www.postgresql.org/message-id/CAFPTHDZ8RnOovefzB%2BOMoRxLSD404WRLqWBUHe6bWqM5ew1bNA%40mail.gmail.com

I've applied the patch. It applies cleanly. I've reviewed the patch
and have no comments to report.
I have also run some tests to get streaming stats as well as reset the
stats counter, everything seems to be working as expected.
I am fine with the changes.

regards,
Ajin Cherian
Fujitsu Australia


Reply | Threaded
Open this post in threaded view
|

Re: Track statistics for streaming of in-progress transactions

akapila
On Mon, Oct 19, 2020 at 1:52 PM Ajin Cherian <[hidden email]> wrote:

>
> On Wed, Oct 14, 2020 at 2:39 PM Amit Kapila <[hidden email]> wrote:
> >
> > Commit 464824323e has added the support of the streaming of
> > in-progress transactions into the built-in logical replication. The
> > attached patch adds the statistics about transactions streamed to the
> > decoding output plugin from ReorderBuffer. Users can query the
> > pg_stat_replication_slots view to check these stats and call
> > pg_stat_reset_replication_slot to reset the stats of a particular
> > slot. Users can pass NULL in pg_stat_reset_replication_slot to reset
> > stats of all the slots.
> >
> > Commit 9868167500 has added the basic infrastructure to capture the
> > stats of slot and this commit extends the statistics collector to
> > track additional information about slots.
> >
> > This patch was originally written by Ajin Cherian [1]. I have fixed
> > bugs and modified some comments in the code.
> >
> > Thoughts?
> >
> > [1] - https://www.postgresql.org/message-id/CAFPTHDZ8RnOovefzB%2BOMoRxLSD404WRLqWBUHe6bWqM5ew1bNA%40mail.gmail.com
>
> I've applied the patch. It applies cleanly. I've reviewed the patch
> and have no comments to report.
> I have also run some tests to get streaming stats as well as reset the
> stats counter, everything seems to be working as expected.
> I am fine with the changes.
>

Thanks. One thing I have considered while updating this patch was to
write a test case similar to what we have for spilled stats in
test_decoding/sql/stats.sql but I decided not to do it as that doesn't
seem to add much value for the streaming case because we already have
some tests in test_decoding/sql/stream.sql which indicates that the
streaming is happening. If we could have a way to get the exact
streaming stats then it would have been better but while writing tests
for spilled stats we found that it is not possible because some
background transactions (like autovacuum) might send the stats earlier
making the actual number inconsistent. What do you think?

Sawada-San, do you have any thoughts on this matter?

--
With Regards,
Amit Kapila.


Reply | Threaded
Open this post in threaded view
|

Re: Track statistics for streaming of in-progress transactions

Masahiko Sawada-2
On Tue, 20 Oct 2020 at 14:29, Amit Kapila <[hidden email]> wrote:

>
> On Mon, Oct 19, 2020 at 1:52 PM Ajin Cherian <[hidden email]> wrote:
> >
> > On Wed, Oct 14, 2020 at 2:39 PM Amit Kapila <[hidden email]> wrote:
> > >
> > > Commit 464824323e has added the support of the streaming of
> > > in-progress transactions into the built-in logical replication. The
> > > attached patch adds the statistics about transactions streamed to the
> > > decoding output plugin from ReorderBuffer. Users can query the
> > > pg_stat_replication_slots view to check these stats and call
> > > pg_stat_reset_replication_slot to reset the stats of a particular
> > > slot. Users can pass NULL in pg_stat_reset_replication_slot to reset
> > > stats of all the slots.
> > >
> > > Commit 9868167500 has added the basic infrastructure to capture the
> > > stats of slot and this commit extends the statistics collector to
> > > track additional information about slots.
> > >
> > > This patch was originally written by Ajin Cherian [1]. I have fixed
> > > bugs and modified some comments in the code.
> > >
> > > Thoughts?
> > >
> > > [1] - https://www.postgresql.org/message-id/CAFPTHDZ8RnOovefzB%2BOMoRxLSD404WRLqWBUHe6bWqM5ew1bNA%40mail.gmail.com
> >
> > I've applied the patch. It applies cleanly. I've reviewed the patch
> > and have no comments to report.
> > I have also run some tests to get streaming stats as well as reset the
> > stats counter, everything seems to be working as expected.
> > I am fine with the changes.
> >
>
> Thanks. One thing I have considered while updating this patch was to
> write a test case similar to what we have for spilled stats in
> test_decoding/sql/stats.sql but I decided not to do it as that doesn't
> seem to add much value for the streaming case because we already have
> some tests in test_decoding/sql/stream.sql which indicates that the
> streaming is happening. If we could have a way to get the exact
> streaming stats then it would have been better but while writing tests
> for spilled stats we found that it is not possible because some
> background transactions (like autovacuum) might send the stats earlier
> making the actual number inconsistent. What do you think?
>
> Sawada-San, do you have any thoughts on this matter?

I basically agree with that. Reading the patch, I have a question that
might be relevant to this matter:

The patch has the following code:

+   /*
+    * Remember this information to be used later to update stats. We can't
+    * update the stats here as an error while processing the changes would
+    * lead to the accumulation of stats even though we haven't streamed all
+    * the changes.
+    */
+   txn_is_streamed = rbtxn_is_streamed(txn);
+   stream_bytes = txn->total_size;

The commend seems to mention only about when an error happened while
processing the changes but I wonder if the same is true for the
aborted transaction. That is, if we catch an error due to concurrent
transaction abort while processing the changes, we stop to stream the
changes. But the patch accumulates the stats even in this case. If we
don’t want to accumulate the stats of the abort transaction and it’s
easily reproducible, it might be better to add a test checking if we
don’t accumulate in that case.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Track statistics for streaming of in-progress transactions

akapila
On Wed, Oct 21, 2020 at 8:15 AM Masahiko Sawada
<[hidden email]> wrote:

>
> On Tue, 20 Oct 2020 at 14:29, Amit Kapila <[hidden email]> wrote:
> >
> >
> > Thanks. One thing I have considered while updating this patch was to
> > write a test case similar to what we have for spilled stats in
> > test_decoding/sql/stats.sql but I decided not to do it as that doesn't
> > seem to add much value for the streaming case because we already have
> > some tests in test_decoding/sql/stream.sql which indicates that the
> > streaming is happening. If we could have a way to get the exact
> > streaming stats then it would have been better but while writing tests
> > for spilled stats we found that it is not possible because some
> > background transactions (like autovacuum) might send the stats earlier
> > making the actual number inconsistent. What do you think?
> >
> > Sawada-San, do you have any thoughts on this matter?
>
> I basically agree with that. Reading the patch, I have a question that
> might be relevant to this matter:
>
> The patch has the following code:
>
> +   /*
> +    * Remember this information to be used later to update stats. We can't
> +    * update the stats here as an error while processing the changes would
> +    * lead to the accumulation of stats even though we haven't streamed all
> +    * the changes.
> +    */
> +   txn_is_streamed = rbtxn_is_streamed(txn);
> +   stream_bytes = txn->total_size;
>
> The commend seems to mention only about when an error happened while
> processing the changes but I wonder if the same is true for the
> aborted transaction. That is, if we catch an error due to concurrent
> transaction abort while processing the changes, we stop to stream the
> changes. But the patch accumulates the stats even in this case.
>

It would only add for the current stream and I don't think that is
wrong because we would have sent some data (at least the start
message) for which we send the stream_stop message later while
decoding Abort message. I had thought to avoid this we can update the
stats in ReorderBufferProcessTXN at the end when we know streaming is
complete but again that would miss the counter update for the data we
have sent before an error has occurred and also updating the streaming
counters in ReorderBufferStreamTXN seems more logical to me.

> If we
> don’t want to accumulate the stats of the abort transaction and it’s
> easily reproducible, it might be better to add a test checking if we
> don’t accumulate in that case.
>

But as explained above, I think we count it as we would have sent at
least one message (could be more) before we encounter this error.


--
With Regards,
Amit Kapila.


Reply | Threaded
Open this post in threaded view
|

Re: Track statistics for streaming of in-progress transactions

Ajin Cherian
In reply to this post by akapila
On Tue, Oct 20, 2020 at 4:29 PM Amit Kapila <[hidden email]> wrote:

> Thanks. One thing I have considered while updating this patch was to
> write a test case similar to what we have for spilled stats in
> test_decoding/sql/stats.sql but I decided not to do it as that doesn't
> seem to add much value for the streaming case because we already have
> some tests in test_decoding/sql/stream.sql which indicates that the
> streaming is happening. If we could have a way to get the exact
> streaming stats then it would have been better but while writing tests
> for spilled stats we found that it is not possible because some
> background transactions (like autovacuum) might send the stats earlier
> making the actual number inconsistent. What do you think?
>

I agree. If the stat numbers can't be guaranteed to be consistent it's
not worth writing specific tests for this.

regards,
Ajin Cherian
Fujitsu Australia


Reply | Threaded
Open this post in threaded view
|

Re: Track statistics for streaming of in-progress transactions

Dilip Kumar-2
In reply to this post by akapila
On Wed, Oct 14, 2020 at 9:09 AM Amit Kapila <[hidden email]> wrote:
>
> Commit 464824323e has added the support of the streaming of
> in-progress transactions into the built-in logical replication. The
> attached patch adds the statistics about transactions streamed to the
> decoding output plugin from ReorderBuffer.

I have reviewed the attached patch, I have one comment

+ int64 streamTxns; /* number of transactions streamed to the decoding
output plugin */
+ int64 streamCount; /* streaming invocation counter */
+ int64 streamBytes; /* amount of data streamed to subscriber */

I think instead of saying "amount of data streamed to subscriber"  it
should be " amount of data streamed to the decoding output plugin"

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Track statistics for streaming of in-progress transactions

akapila
On Thu, Oct 22, 2020 at 11:52 AM Dilip Kumar <[hidden email]> wrote:

>
> On Wed, Oct 14, 2020 at 9:09 AM Amit Kapila <[hidden email]> wrote:
> >
> > Commit 464824323e has added the support of the streaming of
> > in-progress transactions into the built-in logical replication. The
> > attached patch adds the statistics about transactions streamed to the
> > decoding output plugin from ReorderBuffer.
>
> I have reviewed the attached patch, I have one comment
>
> + int64 streamTxns; /* number of transactions streamed to the decoding
> output plugin */
> + int64 streamCount; /* streaming invocation counter */
> + int64 streamBytes; /* amount of data streamed to subscriber */
>
> I think instead of saying "amount of data streamed to subscriber"  it
> should be " amount of data streamed to the decoding output plugin"
>

Thanks, I think a similar change is required in docs as well. One more
thing I was considering whether to change docs to explain stream_count
and stream_txns somewhat more clearly based on what I have posted for
spilled_count and spilled_txns in the other thread [1]? Do you think
that patch is an improvement over what we have now? If yes, we can
adapt the similar changes here as well, otherwise, we can leave it as
it is.

[1] - https://www.postgresql.org/message-id/CAA4eK1LdPQucvp9St2D6NhO9aQ2KKr3U0yAbKDox2UC86Q%2B_zg%40mail.gmail.com

--
With Regards,
Amit Kapila.


Reply | Threaded
Open this post in threaded view
|

Re: Track statistics for streaming of in-progress transactions

akapila
On Thu, Oct 22, 2020 at 2:09 PM Amit Kapila <[hidden email]> wrote:

>
> On Thu, Oct 22, 2020 at 11:52 AM Dilip Kumar <[hidden email]> wrote:
> >
> > On Wed, Oct 14, 2020 at 9:09 AM Amit Kapila <[hidden email]> wrote:
> > >
> > > Commit 464824323e has added the support of the streaming of
> > > in-progress transactions into the built-in logical replication. The
> > > attached patch adds the statistics about transactions streamed to the
> > > decoding output plugin from ReorderBuffer.
> >
> > I have reviewed the attached patch, I have one comment
> >
> > + int64 streamTxns; /* number of transactions streamed to the decoding
> > output plugin */
> > + int64 streamCount; /* streaming invocation counter */
> > + int64 streamBytes; /* amount of data streamed to subscriber */
> >
> > I think instead of saying "amount of data streamed to subscriber"  it
> > should be " amount of data streamed to the decoding output plugin"
> >
>
> Thanks, I think a similar change is required in docs as well.
>
I have fixed the above comment and rebased the patch. I have changed
the docs a bit to add more explanation about the counters. Let me know
if you have any more comments. Thanks Dilip and Sawada-San for
reviewing this patch.

--
With Regards,
Amit Kapila.

v3-0001-Track-statistics-for-streaming-of-changes-from-Re.patch (19K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Track statistics for streaming of in-progress transactions

akapila
On Fri, Oct 23, 2020 at 10:24 AM Amit Kapila <[hidden email]> wrote:
>
> On Thu, Oct 22, 2020 at 2:09 PM Amit Kapila <[hidden email]> wrote:
> >
>
> I have fixed the above comment and rebased the patch. I have changed
> the docs a bit to add more explanation about the counters. Let me know
> if you have any more comments. Thanks Dilip and Sawada-San for
> reviewing this patch.
>

Attached is an updated patch with minor changes in docs and cosmetic
changes. I am planning to push this patch tomorrow unless there are
any more comments/suggestions.

--
With Regards,
Amit Kapila.

v4-0001-Track-statistics-for-streaming-of-changes-from-Re.patch (19K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Track statistics for streaming of in-progress transactions

Tomas Vondra-4
On Wed, Oct 28, 2020 at 08:54:53AM +0530, Amit Kapila wrote:

>On Fri, Oct 23, 2020 at 10:24 AM Amit Kapila <[hidden email]> wrote:
>>
>> On Thu, Oct 22, 2020 at 2:09 PM Amit Kapila <[hidden email]> wrote:
>> >
>>
>> I have fixed the above comment and rebased the patch. I have changed
>> the docs a bit to add more explanation about the counters. Let me know
>> if you have any more comments. Thanks Dilip and Sawada-San for
>> reviewing this patch.
>>
>
>Attached is an updated patch with minor changes in docs and cosmetic
>changes. I am planning to push this patch tomorrow unless there are
>any more comments/suggestions.
>

+1 and thanks for working on this


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Track statistics for streaming of in-progress transactions

akapila
On Thu, Oct 29, 2020 at 5:16 AM Tomas Vondra
<[hidden email]> wrote:

>
> On Wed, Oct 28, 2020 at 08:54:53AM +0530, Amit Kapila wrote:
> >On Fri, Oct 23, 2020 at 10:24 AM Amit Kapila <[hidden email]> wrote:
> >>
> >> On Thu, Oct 22, 2020 at 2:09 PM Amit Kapila <[hidden email]> wrote:
> >> >
> >>
> >> I have fixed the above comment and rebased the patch. I have changed
> >> the docs a bit to add more explanation about the counters. Let me know
> >> if you have any more comments. Thanks Dilip and Sawada-San for
> >> reviewing this patch.
> >>
> >
> >Attached is an updated patch with minor changes in docs and cosmetic
> >changes. I am planning to push this patch tomorrow unless there are
> >any more comments/suggestions.
> >
>
> +1 and thanks for working on this
>

Pushed!

--
With Regards,
Amit Kapila.