Introduce timeout capability for ConditionVariableSleep

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Introduce timeout capability for ConditionVariableSleep

Shawn Debnath
Hello,

Postgres today doesn't support waiting for a condition variable with a
timeout, although the framework it relies upon, does. This change wraps
the existing ConditionVariableSleep functionality and introduces a new
API, ConditionVariableTimedSleep, to allow callers to specify a timeout
value.

A scenario that highlights this use case is a backend is waiting on
status update from multiple workers but needs to time out if that signal
doesn't arrive within a certain period. There was a workaround prior
to aced5a92, but with that change, the semantics are now different.

I chose to go with -1 instead of 0 for the return from
ConditionVariableTimedSleep to indicate timeout error  as it seems
cleaner for this API. WaitEventSetWaitBlock returns -1 for timeout but
WaitEventSetWait treats timeout as 0 (to represent 0 events indicating
timeout).

If there's an alternative, cleaner way to achieve this outcome, I am all
ears.

Thanks.

--
Shawn Debnath
Amazon Web Services (AWS)


0001-Introduce-timeout-capability-for-ConditionVariableSleep-v1.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Introduce timeout capability for ConditionVariableSleep

Thomas Munro-5
Hi Shawn,

On Wed, Mar 13, 2019 at 12:25 PM Shawn Debnath <[hidden email]> wrote:
> Postgres today doesn't support waiting for a condition variable with a
> timeout, although the framework it relies upon, does. This change wraps
> the existing ConditionVariableSleep functionality and introduces a new
> API, ConditionVariableTimedSleep, to allow callers to specify a timeout
> value.

Seems reasonable, I think, and should be familiar to anyone used to
well known multithreading libraries.

+/*
+ * Wait for the given condition variable to be signaled or till timeout.
+ * This should be called in a predicate loop that tests for a specific exit
+ * condition and otherwise sleeps, like so:
+ *
+ *     ConditionVariablePrepareToSleep(cv);  // optional
+ *     while (condition for which we are waiting is not true)
+ *         ConditionVariableSleep(cv, wait_event_info);
+ *     ConditionVariableCancelSleep();
+ *
+ * wait_event_info should be a value from one of the WaitEventXXX enums
+ * defined in pgstat.h.  This controls the contents of pg_stat_activity's
+ * wait_event_type and wait_event columns while waiting.
+ *
+ * Returns 0 or -1 if timed out.
+ */
+int
+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+                            uint32 wait_event_info)


Can we just refer to the other function's documentation for this?  I
don't want two copies of this blurb (and this copy-paste already
failed to include "Timed" in the example function name).

One difference compared to pthread_cond_timedwait() is that pthread
uses an absolute time and here you use a relative time (as we do in
WaitEventSet API).  The first question is which makes a better API,
and the second is what the semantics of a relative timeout should be:
start again every time we get a spurious wake-up, or track time
already waited?  Let's see...

-        (void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
+        ret = WaitEventSetWait(cv_wait_event_set, timeout, &event, 1,
                                 wait_event_info);

Here you're restarting the timeout clock every time through the loop
without adjustment, and I think that's not a good choice: spurious
wake-ups cause bonus waiting.

--
Thomas Munro
https://enterprisedb.com

Reply | Threaded
Open this post in threaded view
|

Re: Introduce timeout capability for ConditionVariableSleep

Shawn Debnath
Hi Thomas,

Thanks for reviewing!

On Wed, Mar 13, 2019 at 12:40:57PM +1300, Thomas Munro wrote:
> Can we just refer to the other function's documentation for this?  I
> don't want two copies of this blurb (and this copy-paste already
> failed to include "Timed" in the example function name).

Hah - point well taken. Fixed.

> One difference compared to pthread_cond_timedwait() is that pthread
> uses an absolute time and here you use a relative time (as we do in
> WaitEventSet API).  The first question is which makes a better API,
> and the second is what the semantics of a relative timeout should be:
> start again every time we get a spurious wake-up, or track time
> already waited?  Let's see...
>
> -        (void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
> +        ret = WaitEventSetWait(cv_wait_event_set, timeout, &event, 1,
>                                  wait_event_info);
>
> Here you're restarting the timeout clock every time through the loop
> without adjustment, and I think that's not a good choice: spurious
> wake-ups cause bonus waiting.
Agree. In my testing WaitEventSetWait did the work for calculating the
right timeout remaining. It's a bummer that we have to repeat the same
pattern at the ConditionVariableTimedSleep() but I guess anyone who
loops in such cases will have to adjust their values accordingly.

BTW, I am curious why Andres in 98a64d0bd71 didn't just create an
artificial event with WL_TIMEOUT and return that from
WaitEventSetWait(). Would have made it cleaner than checking checking
return values for -1.

Updated v2 patch attached.

--
Shawn Debnath
Amazon Web Services (AWS)

0001-Introduce-timeout-capability-for-ConditionVariableSleep-v2.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Introduce timeout capability for ConditionVariableSleep

Kyotaro HORIGUCHI-2
Hello.

At Tue, 12 Mar 2019 17:53:43 -0700, Shawn Debnath <[hidden email]> wrote in <[hidden email]>

> Hi Thomas,
>
> Thanks for reviewing!
>
> On Wed, Mar 13, 2019 at 12:40:57PM +1300, Thomas Munro wrote:
> > Can we just refer to the other function's documentation for this?  I
> > don't want two copies of this blurb (and this copy-paste already
> > failed to include "Timed" in the example function name).
>
> Hah - point well taken. Fixed.
>
> > One difference compared to pthread_cond_timedwait() is that pthread
> > uses an absolute time and here you use a relative time (as we do in
> > WaitEventSet API).  The first question is which makes a better API,
> > and the second is what the semantics of a relative timeout should be:
> > start again every time we get a spurious wake-up, or track time
> > already waited?  Let's see...
> >
> > -        (void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
> > +        ret = WaitEventSetWait(cv_wait_event_set, timeout, &event, 1,
> >                                  wait_event_info);
> >
> > Here you're restarting the timeout clock every time through the loop
> > without adjustment, and I think that's not a good choice: spurious
> > wake-ups cause bonus waiting.
>
> Agree. In my testing WaitEventSetWait did the work for calculating the
> right timeout remaining. It's a bummer that we have to repeat the same
> pattern at the ConditionVariableTimedSleep() but I guess anyone who
> loops in such cases will have to adjust their values accordingly.
I think so, too. And actually the duplicate timeout calculation
doesn't seem good. We could eliminate the duplicate by allowing
WaitEventSetWait to exit by unwanted events, something like the
attached.

> BTW, I am curious why Andres in 98a64d0bd71 didn't just create an
> artificial event with WL_TIMEOUT and return that from
> WaitEventSetWait(). Would have made it cleaner than checking checking
> return values for -1.

Maybe because it is not a kind of WaitEvent, so it naturally
cannot be a part of occurred_events.

# By the way, you can obtain a short hash of a commit by git
#  rev-parse --short <full hash>.

> Updated v2 patch attached.

I'd like to comment on the patch.

+ * In the event of a timeout, we simply return and the caller
+ * calls ConditionVariableCancelSleep to remove themselves from the
+ * wait queue. See ConditionVariableSleep() for notes on how to correctly check
+ * for the exit condition.
+ *
+ * Returns 0, or -1 if timed out.

Maybe this could be more simpler, that like:

* ConditionVariableTimedSleep - allows us to specify timeout
*
* If timeout = =1, block until the condition is satisfied.
*
* Returns -1 when timeout expires, otherwise returns 0.
*
* See ConditionVariableSleep() for general behavior and usage.


+int
+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,

Counldn't the two-state return value be a boolean?


+ int ret = 0;

As a general coding convention, we are not to give such a generic
name for a variable with such a long life if avoidable. In the
case the 'ret' could be 'timeout_fired' or something and it would
be far verbose.


+ if (rc == 0 && timeout >= 0)

WaitEventSetWait returns 0 only in the case of timeout
expiration, so the second term is useless.  Just setting ret to
-1 and break seems to me almost the same with "goto".  The reason
why the existing ConditionVariableSleep uses do {} while(done) is
that it is straightforwad. Timeout added everal exit point in the
loop so it's make the loop rather complex by going around with
the variable. Whole the loop could be in the following more flat
shape.

   while (true)
   {
      CHECK_FOR_INTERRUPTS();
      rc = WaitEventSetWait();
      ResetLatch();

      /* timeout expired, return */
      if (rc == 0) return -1;
      SpinLockAcquire();
      if (!proclist...)
      {
         done = true;
      }
      SpinLockRelease();

      /* condition satisfied, return */
      if (done) return 0;

      /* if we're here, we should wait for the remaining time */
      INSTR_TIME_SET_CURRENT()
      ...
  }


+ Assert(ret == 0);

I don't see a point in the assertion so much.


regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 59fa917ae0..abead3294e 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -950,7 +950,7 @@ WaitEventAdjustWin32(WaitEventSet *set, WaitEvent *event)
 int
 WaitEventSetWait(WaitEventSet *set, long timeout,
  WaitEvent *occurred_events, int nevents,
- uint32 wait_event_info)
+ uint32 wait_event_info, bool wait_for_timeout)
 {
  int returned_events = 0;
  instr_time start_time;
@@ -965,7 +965,8 @@ WaitEventSetWait(WaitEventSet *set, long timeout,
  */
  if (timeout >= 0)
  {
- INSTR_TIME_SET_CURRENT(start_time);
+ if (wait_for_timeout)
+ INSTR_TIME_SET_CURRENT(start_time);
  Assert(timeout >= 0 && timeout <= INT_MAX);
  cur_timeout = timeout;
  }
@@ -1038,6 +1039,9 @@ WaitEventSetWait(WaitEventSet *set, long timeout,
  /* If we're not done, update cur_timeout for next iteration */
  if (returned_events == 0 && timeout >= 0)
  {
+ if (!wait_for_timeout)
+ return 0;
+
  INSTR_TIME_SET_CURRENT(cur_time);
  INSTR_TIME_SUBTRACT(cur_time, start_time);
  cur_timeout = timeout - (long) INSTR_TIME_GET_MILLISEC(cur_time);
Reply | Threaded
Open this post in threaded view
|

Re: Introduce timeout capability for ConditionVariableSleep

Shawn Debnath
Thank you reviewing. Comments inline.

On Wed, Mar 13, 2019 at 05:24:15PM +0900, Kyotaro HORIGUCHI wrote:

> > Agree. In my testing WaitEventSetWait did the work for calculating
> > the right timeout remaining. It's a bummer that we have to repeat
> > the same pattern at the ConditionVariableTimedSleep() but I guess
> > anyone who loops in such cases will have to adjust their values
> > accordingly.
>
> I think so, too. And actually the duplicate timeout calculation
> doesn't seem good. We could eliminate the duplicate by allowing
> WaitEventSetWait to exit by unwanted events, something like the
> attached.
After thinking about this more, I see WaitEventSetWait()'s contract as
wait for an event or timeout if no events are received in that time
frame. Although ConditionVariableTimedSleep() is also using the same
word, I believe the semantics are different. The timeout period in
ConditionVariableTimedSleep() is intended to limit the time we wait
until removal from the wait queue. Whereas, in the case of
WaitEventSetWait, the timeout period is intended to limit the time the
caller waits till the first event.

Hence, I believe the code is correct as is and we shouldn't change the
contract for WaitEventSetWait.

> > BTW, I am curious why Andres in 98a64d0bd71 didn't just create an
> > artificial event with WL_TIMEOUT and return that from
> > WaitEventSetWait(). Would have made it cleaner than checking checking
> > return values for -1.
>
> Maybe because it is not a kind of WaitEvent, so it naturally
> cannot be a part of occurred_events.

Hmm, I don't agree with that completely. One could argue that the
backend is waiting for any event in order to be woken up, including a
timeout event.

> # By the way, you can obtain a short hash of a commit by git
> #  rev-parse --short <full hash>.

Good to know! :-) Luckily git is smart enough to still match it to the
correct commit.

> > Updated v2 patch attached.
>
> I'd like to comment on the patch.
>
> + * In the event of a timeout, we simply return and the caller
> + * calls ConditionVariableCancelSleep to remove themselves from the
> + * wait queue. See ConditionVariableSleep() for notes on how to correctly check
> + * for the exit condition.
> + *
> + * Returns 0, or -1 if timed out.
>
> Maybe this could be more simpler, that like:
>
> * ConditionVariableTimedSleep - allows us to specify timeout
> *
> * If timeout = =1, block until the condition is satisfied.
> *
> * Returns -1 when timeout expires, otherwise returns 0.
> *
> * See ConditionVariableSleep() for general behavior and usage.
Agree. Changed to:

  * Wait for the given condition variable to be signaled or till timeout.
  *
  * Returns -1 when timeout expires, otherwise returns 0.
  *
  * See ConditionVariableSleep() for general usage.

> +ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
>
> Counldn't the two-state return value be a boolean?

I wanted to leave the option open to use the positive integers for other
purposes but you are correct, bool suffices for now. If needed, we can
change it in the future.

> + int ret = 0;
>
> As a general coding convention, we are not to give such a generic
> name for a variable with such a long life if avoidable. In the
> case the 'ret' could be 'timeout_fired' or something and it would
> be far verbose.
>
>
> + if (rc == 0 && timeout >= 0)
>
> WaitEventSetWait returns 0 only in the case of timeout
> expiration, so the second term is useless.  Just setting ret to
> -1 and break seems to me almost the same with "goto".  The reason
> why the existing ConditionVariableSleep uses do {} while(done) is
> that it is straightforwad. Timeout added everal exit point in the
> loop so it's make the loop rather complex by going around with
> the variable. Whole the loop could be in the following more flat
> shape.
>
>    while (true)
>    {
>       CHECK_FOR_INTERRUPTS();
>       rc = WaitEventSetWait();
>       ResetLatch();
>
>       /* timeout expired, return */
>       if (rc == 0) return -1;
>       SpinLockAcquire();
>       if (!proclist...)
>       {
>          done = true;
>       }
>       SpinLockRelease();
>
>       /* condition satisfied, return */
>       if (done) return 0;
>
>       /* if we're here, we should wait for the remaining time */
>       INSTR_TIME_SET_CURRENT()
>       ...
>   }
Agree. The timeout did complicate the logic for a single variable to
track the exit condition. Adopted the approach above.

> + Assert(ret == 0);
>
> I don't see a point in the assertion so much.

Being overly verbose. Removed.

--
Shawn Debnath
Amazon Web Services (AWS)

0001-Introduce-timeout-capability-for-ConditionVariableSleep-v3.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Introduce timeout capability for ConditionVariableSleep

Kyotaro HORIGUCHI-2
Hello.

At Thu, 14 Mar 2019 17:26:11 -0700, Shawn Debnath <[hidden email]> wrote in <[hidden email]>

> Thank you reviewing. Comments inline.
>
> On Wed, Mar 13, 2019 at 05:24:15PM +0900, Kyotaro HORIGUCHI wrote:
> > > Agree. In my testing WaitEventSetWait did the work for calculating
> > > the right timeout remaining. It's a bummer that we have to repeat
> > > the same pattern at the ConditionVariableTimedSleep() but I guess
> > > anyone who loops in such cases will have to adjust their values
> > > accordingly.
> >
> > I think so, too. And actually the duplicate timeout calculation
> > doesn't seem good. We could eliminate the duplicate by allowing
> > WaitEventSetWait to exit by unwanted events, something like the
> > attached.
>
> After thinking about this more, I see WaitEventSetWait()'s contract as
> wait for an event or timeout if no events are received in that time

Sure.

> frame. Although ConditionVariableTimedSleep() is also using the same
> word, I believe the semantics are different. The timeout period in
> ConditionVariableTimedSleep() is intended to limit the time we wait
> until removal from the wait queue. Whereas, in the case of
> WaitEventSetWait, the timeout period is intended to limit the time the
> caller waits till the first event.

Mmm. The two look the same to me.. Timeout means for both that
"Wait for one of these events or timeout expiration to
occur". Removal from waiting queue is just a subtask of exiting
from waiting state.

The "don't exit until timeout expires unless any expected events
occur" part is to be done in the uppermost layer so it is enough
that the lower layer does just "exit when something
happened". This is the behavior of WaitEventSetWaitBlock for
WaitEventSetWait. My proposal is giving callers capabliy to tell
WaitEventSetWait not to perform useless timeout contination.

> Hence, I believe the code is correct as is and we shouldn't change the
> contract for WaitEventSetWait.
>
> > > BTW, I am curious why Andres in 98a64d0bd71 didn't just create an
> > > artificial event with WL_TIMEOUT and return that from
> > > WaitEventSetWait(). Would have made it cleaner than checking checking
> > > return values for -1.
> >
> > Maybe because it is not a kind of WaitEvent, so it naturally
> > cannot be a part of occurred_events.
>
> Hmm, I don't agree with that completely. One could argue that the
> backend is waiting for any event in order to be woken up, including a
> timeout event.

Right, I understand that. I didn't mean that it is the right
design for everyone. Just meant that it is in that shape. (And I
rather like it.)

latch.h:127
#define WL_TIMEOUT             (1 << 3)    /* not for WaitEventSetWait() */

We can make it one of the events for WaitEventSetWait, but I
don't see such a big point on that, and also that doesn't this
patch better in any means.


> > # By the way, you can obtain a short hash of a commit by git
> > #  rev-parse --short <full hash>.
>
> Good to know! :-) Luckily git is smart enough to still match it to the
> correct commit.

And too complex so that infrequent usage easily get out from my
brain:(


> > > Updated v2 patch attached.

Thank you . It looks fine execpt the above point.  But still I
have some questions on it. (the reason for they not being
comments is that they are about wordings..).

+     * Track the current time so that we can calculate the remaining timeout
+     * if we are woken up spuriously.

I think tha "track" means chasing a moving objects. So it might
be bettter that it is record or something?

>   * Wait for the given condition variable to be signaled or till timeout.
>   *
>   * Returns -1 when timeout expires, otherwise returns 0.
>   *
>   * See ConditionVariableSleep() for general usage.
>
> > +ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
> >
> > Counldn't the two-state return value be a boolean?
>
> I wanted to leave the option open to use the positive integers for other
> purposes but you are correct, bool suffices for now. If needed, we can
> change it in the future.

Yes, we can do that after we found it needed.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


Reply | Threaded
Open this post in threaded view
|

Re: Introduce timeout capability for ConditionVariableSleep

Shawn Debnath
On Fri, Mar 15, 2019 at 02:15:17PM +0900, Kyotaro HORIGUCHI wrote:

> > After thinking about this more, I see WaitEventSetWait()'s contract
> > as wait for an event or timeout if no events are received in that
> > time
>
> Sure.
>
> > frame. Although ConditionVariableTimedSleep() is also using the same
> > word, I believe the semantics are different. The timeout period in
> > ConditionVariableTimedSleep() is intended to limit the time we wait
> > until removal from the wait queue. Whereas, in the case of
> > WaitEventSetWait, the timeout period is intended to limit the time the
> > caller waits till the first event.
>
> Mmm. The two look the same to me.. Timeout means for both that
> "Wait for one of these events or timeout expiration to
> occur". Removal from waiting queue is just a subtask of exiting
> from waiting state.
>
> The "don't exit until timeout expires unless any expected events
> occur" part is to be done in the uppermost layer so it is enough
> that the lower layer does just "exit when something
> happened".

Agree with the fact that lower layers should return and let the upper
layer determine and filter events as needed.

> This is the behavior of WaitEventSetWaitBlock for
> WaitEventSetWait. My proposal is giving callers capabliy to tell
> WaitEventSetWait not to perform useless timeout contination.

This is where I disagree. WaitEventSetWait needs its own loop and
timeout calculation because WaitEventSetWaitBlock can return when EINTR
is received. This gets filtered in WaitEventSetWait and doesn't bubble
up by design. Since it's involved in filtering events, it now also has
to manage the timeout value. ConditionVariableTimedSleep being at a
higher level, and waitng for certain events that the lower layers are
unaware of, also shares the timeout management reponsibility.

Do note that there is no performance impact of having multiple timeout
loops. The current design allows for each layer to filter events and
hence per layer timeout management seems fine. If one would want to
avoid this, perhaps we need to introduce a non-static version of
WaitEventSetWaitBlock and call that directly. But that of course is
beyond this patch.

> Thank you . It looks fine execpt the above point.  But still I
> have some questions on it. (the reason for they not being
> comments is that they are about wordings..).
>
> +     * Track the current time so that we can calculate the remaining timeout
> +     * if we are woken up spuriously.
>
> I think tha "track" means chasing a moving objects. So it might
> be bettter that it is record or something?
>
> >   * Wait for the given condition variable to be signaled or till timeout.
> >   *
> >   * Returns -1 when timeout expires, otherwise returns 0.
> >   *
> >   * See ConditionVariableSleep() for general usage.
> >
> > > +ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
> > >
> > > Counldn't the two-state return value be a boolean?

I will change it to Record in the next iteration of the patch.
 

--
Shawn Debnath
Amazon Web Services (AWS)

Reply | Threaded
Open this post in threaded view
|

Re: Introduce timeout capability for ConditionVariableSleep

Shawn Debnath
On Sat, Mar 16, 2019 at 03:27:17PM -0700, Shawn Debnath wrote:

> > +     * Track the current time so that we can calculate the
> > remaining timeout
> > +     * if we are woken up spuriously.
> >
> > I think tha "track" means chasing a moving objects. So it might
> > be bettter that it is record or something?
> >
> > >   * Wait for the given condition variable to be signaled or till timeout.
> > >   *
> > >   * Returns -1 when timeout expires, otherwise returns 0.
> > >   *
> > >   * See ConditionVariableSleep() for general usage.
> > >
> > > > +ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
> > > >
> > > > Counldn't the two-state return value be a boolean?
>
> I will change it to Record in the next iteration of the patch.
Posting rebased and updated patch. Changed the word 'Track' to 'Record'
and also changed variable name rem_timeout to cur_timeout to match
naming in other use cases.


--
Shawn Debnath
Amazon Web Services (AWS)

0001-Introduce-timeout-capability-for-ConditionVariableSleep-v4.patch (4K) Download Attachment