Windows buildfarm members vs. new async-notify isolation test

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
25 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Windows buildfarm members vs. new async-notify isolation test

Tom Lane-2
I see from the buildfarm status page that since commits 6b802cfc7
et al went in a week ago, frogmouth and currawong have failed that
new test case every time, with the symptom

================== pgsql.build/src/test/isolation/regression.diffs ===================
*** c:/prog/bf/root/REL_10_STABLE/pgsql.build/src/test/isolation/expected/async-notify.out Mon Nov 25 00:30:49 2019
--- c:/prog/bf/root/REL_10_STABLE/pgsql.build/src/test/isolation/results/async-notify.out Mon Dec  2 00:54:26 2019
***************
*** 93,99 ****
  step llisten: LISTEN c1; LISTEN c2;
  step lcommit: COMMIT;
  step l2commit: COMMIT;
- listener2: NOTIFY "c1" with payload "" from notifier
  step l2stop: UNLISTEN *;
 
  starting permutation: llisten lbegin usage bignotify usage
--- 93,98 ----

(Note that these two critters don't run branches v11 and up, which
is why they're only showing this failure in 10 and 9.6.)

drongo showed the same failure once in v10, and fairywren showed
it once in v12.  Every other buildfarm animal seems happy.

I'm a little baffled as to what this might be --- some sort of
timing problem in our Windows signal emulation, perhaps?  But
if so, why haven't we found it years ago?

I don't have any ability to test this myself, so would appreciate
help or ideas.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

Andrew Dunstan-8

On 12/2/19 11:23 AM, Tom Lane wrote:

> I see from the buildfarm status page that since commits 6b802cfc7
> et al went in a week ago, frogmouth and currawong have failed that
> new test case every time, with the symptom
>
> ================== pgsql.build/src/test/isolation/regression.diffs ===================
> *** c:/prog/bf/root/REL_10_STABLE/pgsql.build/src/test/isolation/expected/async-notify.out Mon Nov 25 00:30:49 2019
> --- c:/prog/bf/root/REL_10_STABLE/pgsql.build/src/test/isolation/results/async-notify.out Mon Dec  2 00:54:26 2019
> ***************
> *** 93,99 ****
>   step llisten: LISTEN c1; LISTEN c2;
>   step lcommit: COMMIT;
>   step l2commit: COMMIT;
> - listener2: NOTIFY "c1" with payload "" from notifier
>   step l2stop: UNLISTEN *;
>  
>   starting permutation: llisten lbegin usage bignotify usage
> --- 93,98 ----
>
> (Note that these two critters don't run branches v11 and up, which
> is why they're only showing this failure in 10 and 9.6.)
>
> drongo showed the same failure once in v10, and fairywren showed
> it once in v12.  Every other buildfarm animal seems happy.
>
> I'm a little baffled as to what this might be --- some sort of
> timing problem in our Windows signal emulation, perhaps?  But
> if so, why haven't we found it years ago?
>
> I don't have any ability to test this myself, so would appreciate
> help or ideas.



I can test things, but I don't really know what to test. FYI frogmouth
and currawong run on virtualized XP. drongo anf fairywrne run on
virtualized WS2019. Neither VM is heavily resourced.


Note that my other windows critters jacana and bowerbird which run on
bare metal W10Pro haven't seen this problem.


(BTW, from next week I'm going to be down under for 3 months, and my
ability to test Windows things will be somewhat reduced.)


cheers


andrew


--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

Mark Dilger-4


On 12/2/19 11:42 AM, Andrew Dunstan wrote:

>
> On 12/2/19 11:23 AM, Tom Lane wrote:
>> I see from the buildfarm status page that since commits 6b802cfc7
>> et al went in a week ago, frogmouth and currawong have failed that
>> new test case every time, with the symptom
>>
>> ================== pgsql.build/src/test/isolation/regression.diffs ===================
>> *** c:/prog/bf/root/REL_10_STABLE/pgsql.build/src/test/isolation/expected/async-notify.out Mon Nov 25 00:30:49 2019
>> --- c:/prog/bf/root/REL_10_STABLE/pgsql.build/src/test/isolation/results/async-notify.out Mon Dec  2 00:54:26 2019
>> ***************
>> *** 93,99 ****
>>    step llisten: LISTEN c1; LISTEN c2;
>>    step lcommit: COMMIT;
>>    step l2commit: COMMIT;
>> - listener2: NOTIFY "c1" with payload "" from notifier
>>    step l2stop: UNLISTEN *;
>>    
>>    starting permutation: llisten lbegin usage bignotify usage
>> --- 93,98 ----
>>
>> (Note that these two critters don't run branches v11 and up, which
>> is why they're only showing this failure in 10 and 9.6.)
>>
>> drongo showed the same failure once in v10, and fairywren showed
>> it once in v12.  Every other buildfarm animal seems happy.
>>
>> I'm a little baffled as to what this might be --- some sort of
>> timing problem in our Windows signal emulation, perhaps?  But
>> if so, why haven't we found it years ago?
>>
>> I don't have any ability to test this myself, so would appreciate
>> help or ideas.
>
>
>
> I can test things, but I don't really know what to test. FYI frogmouth
> and currawong run on virtualized XP. drongo anf fairywrne run on
> virtualized WS2019. Neither VM is heavily resourced.

Hi Andrew, if you have time you could perhaps check the
isolation test structure itself.  Like Tom, I don't have a
Windows box to test this.

I would be curious to see if there is a race condition in
src/test/isolation/isolationtester.c between the loop starting
on line 820:

   while ((res = PQgetResult(conn)))
   {
      ...
   }

and the attempt to consume input that might include NOTIFY
messages on line 861:

   PQconsumeInput(conn);

If the first loop consumes the commit message, gets no
further PGresult from PQgetResult, and finishes, and execution
proceeds to PQconsumeInput before the NOTIFY has arrived
over the socket, there won't be anything for PQnotifies to
return, and hence for try_complete_step to print before
returning.

I'm not sure if it is possible for the commit message to
arrive before the notify message in the fashion I am describing,
but that's something you might easily check by having
isolationtester sleep before PQconsumeInput on line 861.


--
Mark Dilger


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

Tom Lane-2
Mark Dilger <[hidden email]> writes:
> On 12/2/19 11:42 AM, Andrew Dunstan wrote:
>> On 12/2/19 11:23 AM, Tom Lane wrote:
>>> I'm a little baffled as to what this might be --- some sort of
>>> timing problem in our Windows signal emulation, perhaps?  But
>>> if so, why haven't we found it years ago?

> I would be curious to see if there is a race condition in
> src/test/isolation/isolationtester.c between the loop starting
> on line 820:
>    while ((res = PQgetResult(conn)))
>    {
>       ...
>    }
> and the attempt to consume input that might include NOTIFY
> messages on line 861:
>    PQconsumeInput(conn);

In principle, the issue should not be there, because commits
790026972 et al should have ensured that the NOTIFY protocol
message comes out before ReadyForQuery (and thus, libpq will
absorb it before PQgetResult will return NULL).  I think the
timing problem --- if that's what it is --- must be on the
backend side; somehow the backend is not processing the
inbound notify queue before it goes idle.

Hmm ... just looking at the code again, could it be that there's
no well-placed CHECK_FOR_INTERRUPTS?  Andrew, could you see if
injecting one in what 790026972 added to postgres.c helps?
That is,

                /*
                 * Also process incoming notifies, if any.  This is mostly to
                 * ensure stable behavior in tests: if any notifies were
                 * received during the just-finished transaction, they'll be
                 * seen by the client before ReadyForQuery is.
                 */
+               CHECK_FOR_INTERRUPTS();
                if (notifyInterruptPending)
                    ProcessNotifyInterrupt();


                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

akapila
On Tue, Dec 3, 2019 at 10:10 PM Tom Lane <[hidden email]> wrote:

>
> In principle, the issue should not be there, because commits
> 790026972 et al should have ensured that the NOTIFY protocol
> message comes out before ReadyForQuery (and thus, libpq will
> absorb it before PQgetResult will return NULL).  I think the
> timing problem --- if that's what it is --- must be on the
> backend side; somehow the backend is not processing the
> inbound notify queue before it goes idle.
>
> Hmm ... just looking at the code again, could it be that there's
> no well-placed CHECK_FOR_INTERRUPTS?  Andrew, could you see if
> injecting one in what 790026972 added to postgres.c helps?
> That is,
>
>                 /*
>                  * Also process incoming notifies, if any.  This is mostly to
>                  * ensure stable behavior in tests: if any notifies were
>                  * received during the just-finished transaction, they'll be
>                  * seen by the client before ReadyForQuery is.
>                  */
> +               CHECK_FOR_INTERRUPTS();
>                 if (notifyInterruptPending)
>                     ProcessNotifyInterrupt();
>

I also tried to analyze this failure and it seems this is a good bet,
but I am also wondering why we have never seen such a timing issue in
other somewhat similar tests.  For ex.,  one with comment (#
Cross-backend notification delivery.).  Do they have a better way of
ensuring that the notification will be received or is it purely
coincidental that they haven't seen such a symptom?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

Tom Lane-2
Amit Kapila <[hidden email]> writes:
> On Tue, Dec 3, 2019 at 10:10 PM Tom Lane <[hidden email]> wrote:
>> Hmm ... just looking at the code again, could it be that there's
>> no well-placed CHECK_FOR_INTERRUPTS?  Andrew, could you see if
>> injecting one in what 790026972 added to postgres.c helps?

> I also tried to analyze this failure and it seems this is a good bet,
> but I am also wondering why we have never seen such a timing issue in
> other somewhat similar tests.  For ex.,  one with comment (#
> Cross-backend notification delivery.).  Do they have a better way of
> ensuring that the notification will be received or is it purely
> coincidental that they haven't seen such a symptom?

TBH, my bet is that this *won't* fix it, but it seemed like an easy
thing to test.  For this to fix it, you'd have to suppose that we
never do a CHECK_FOR_INTERRUPTS during a COMMIT command, which is
improbable at best.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

Andrew Dunstan-8
On Wed, Dec 4, 2019 at 12:12 AM Tom Lane <[hidden email]> wrote:

>
> Amit Kapila <[hidden email]> writes:
> > On Tue, Dec 3, 2019 at 10:10 PM Tom Lane <[hidden email]> wrote:
> >> Hmm ... just looking at the code again, could it be that there's
> >> no well-placed CHECK_FOR_INTERRUPTS?  Andrew, could you see if
> >> injecting one in what 790026972 added to postgres.c helps?
>
> > I also tried to analyze this failure and it seems this is a good bet,
> > but I am also wondering why we have never seen such a timing issue in
> > other somewhat similar tests.  For ex.,  one with comment (#
> > Cross-backend notification delivery.).  Do they have a better way of
> > ensuring that the notification will be received or is it purely
> > coincidental that they haven't seen such a symptom?
>
> TBH, my bet is that this *won't* fix it, but it seemed like an easy
> thing to test.  For this to fix it, you'd have to suppose that we
> never do a CHECK_FOR_INTERRUPTS during a COMMIT command, which is
> improbable at best.
>


You win your bet. Tried this on frogmouth and it still failed.

cheers

andrew


--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

akapila
On Wed, Dec 4, 2019 at 9:51 PM Andrew Dunstan
<[hidden email]> wrote:

>
> On Wed, Dec 4, 2019 at 12:12 AM Tom Lane <[hidden email]> wrote:
> >
> > Amit Kapila <[hidden email]> writes:
> > > On Tue, Dec 3, 2019 at 10:10 PM Tom Lane <[hidden email]> wrote:
> > >> Hmm ... just looking at the code again, could it be that there's
> > >> no well-placed CHECK_FOR_INTERRUPTS?  Andrew, could you see if
> > >> injecting one in what 790026972 added to postgres.c helps?
> >
> > > I also tried to analyze this failure and it seems this is a good bet,
> > > but I am also wondering why we have never seen such a timing issue in
> > > other somewhat similar tests.  For ex.,  one with comment (#
> > > Cross-backend notification delivery.).  Do they have a better way of
> > > ensuring that the notification will be received or is it purely
> > > coincidental that they haven't seen such a symptom?
> >
> > TBH, my bet is that this *won't* fix it, but it seemed like an easy
> > thing to test.  For this to fix it, you'd have to suppose that we
> > never do a CHECK_FOR_INTERRUPTS during a COMMIT command, which is
> > improbable at best.
> >
>
>
> You win your bet. Tried this on frogmouth and it still failed.
>

IIUC, this means that commit (step l2commit) is finishing before the
notify signal is reached that session.  If so, can we at least confirm
that by adding something like select pg_sleep(1) in that step?  So,
l2commit will be: step "l2commit" { SELECT pg_sleep(1); COMMIT; }.  I
think we can try by increasing sleep time as well to confirm the
behavior if required.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

Andrew Dunstan-8

On 12/5/19 4:37 AM, Amit Kapila wrote:

> On Wed, Dec 4, 2019 at 9:51 PM Andrew Dunstan
> <[hidden email]> wrote:
>> On Wed, Dec 4, 2019 at 12:12 AM Tom Lane <[hidden email]> wrote:
>>> Amit Kapila <[hidden email]> writes:
>>>> On Tue, Dec 3, 2019 at 10:10 PM Tom Lane <[hidden email]> wrote:
>>>>> Hmm ... just looking at the code again, could it be that there's
>>>>> no well-placed CHECK_FOR_INTERRUPTS?  Andrew, could you see if
>>>>> injecting one in what 790026972 added to postgres.c helps?
>>>> I also tried to analyze this failure and it seems this is a good bet,
>>>> but I am also wondering why we have never seen such a timing issue in
>>>> other somewhat similar tests.  For ex.,  one with comment (#
>>>> Cross-backend notification delivery.).  Do they have a better way of
>>>> ensuring that the notification will be received or is it purely
>>>> coincidental that they haven't seen such a symptom?
>>> TBH, my bet is that this *won't* fix it, but it seemed like an easy
>>> thing to test.  For this to fix it, you'd have to suppose that we
>>> never do a CHECK_FOR_INTERRUPTS during a COMMIT command, which is
>>> improbable at best.
>>>
>>
>> You win your bet. Tried this on frogmouth and it still failed.
>>
> IIUC, this means that commit (step l2commit) is finishing before the
> notify signal is reached that session.  If so, can we at least confirm
> that by adding something like select pg_sleep(1) in that step?  So,
> l2commit will be: step "l2commit" { SELECT pg_sleep(1); COMMIT; }.  I
> think we can try by increasing sleep time as well to confirm the
> behavior if required.
>

Yeah, with the sleep in there the NOTIFY is seen.


cheers


andrew




--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

Tom Lane-2
Andrew Dunstan <[hidden email]> writes:
> On 12/5/19 4:37 AM, Amit Kapila wrote:
>> IIUC, this means that commit (step l2commit) is finishing before the
>> notify signal is reached that session.  If so, can we at least confirm
>> that by adding something like select pg_sleep(1) in that step?  So,
>> l2commit will be: step "l2commit" { SELECT pg_sleep(1); COMMIT; }.  I
>> think we can try by increasing sleep time as well to confirm the
>> behavior if required.

> Yeah, with the sleep in there the NOTIFY is seen.

Well, that is *really* interesting, because I was fairly sure that
everything was adequately interlocked.  The signal must have been
sent before step notify1 finishes, and then we do several other
things, so how could the listener2 process not have gotten it by
the time we run the l2commit step?  I still think this is showing
us some sort of deficiency in our Windows signal mechanism.

A possible theory as to what's happening is that the kernel scheduler
is discriminating against listener2's signal management thread(s)
and not running them until everything else goes idle for a moment.
(If true, even a very short sleep ought to be enough to fix the test.)
If that's what's happening, though, I think we ought to look into
whether we can raise the priority of the signal threads compared to
the main thread.  I don't think we want this much variation between
the way signals work on Windows and the way they work elsewhere.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

akapila
On Sat, Dec 7, 2019 at 5:01 AM Tom Lane <[hidden email]> wrote:

>
> Andrew Dunstan <[hidden email]> writes:
> > On 12/5/19 4:37 AM, Amit Kapila wrote:
> >> IIUC, this means that commit (step l2commit) is finishing before the
> >> notify signal is reached that session.  If so, can we at least confirm
> >> that by adding something like select pg_sleep(1) in that step?  So,
> >> l2commit will be: step "l2commit" { SELECT pg_sleep(1); COMMIT; }.  I
> >> think we can try by increasing sleep time as well to confirm the
> >> behavior if required.
>
> > Yeah, with the sleep in there the NOTIFY is seen.
>
> Well, that is *really* interesting, because I was fairly sure that
> everything was adequately interlocked.  The signal must have been
> sent before step notify1 finishes, and then we do several other
> things, so how could the listener2 process not have gotten it by
> the time we run the l2commit step?  I still think this is showing
> us some sort of deficiency in our Windows signal mechanism.
>
> A possible theory as to what's happening is that the kernel scheduler
> is discriminating against listener2's signal management thread(s)
> and not running them until everything else goes idle for a moment.
>

If we have to believe that theory then why the other similar test is
not showing the problem.  The other test, I am talking about is below:
# Cross-backend notification delivery.
permutation "llisten" "notify1" "notify2" "notify3" "notifyf" "lcheck"

In this test also, one session is listening and the other session sent
a few notifications and then when the first session executes another
command (lcheck), it receives the notifications.  I have also debugged
it in the Windows box that as soon as the notify sends the signal, the
signal thread receives it and comes out of ConnectNamedPipe and does
the processing to dispatch the signal.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

Tom Lane-2
Amit Kapila <[hidden email]> writes:
> On Sat, Dec 7, 2019 at 5:01 AM Tom Lane <[hidden email]> wrote:
>> A possible theory as to what's happening is that the kernel scheduler
>> is discriminating against listener2's signal management thread(s)
>> and not running them until everything else goes idle for a moment.

> If we have to believe that theory then why the other similar test is
> not showing the problem.

There are fewer processes involved in that case, so I don't think
it disproves the theory that this is a scheduler glitch.

> I have also debugged
> it in the Windows box that as soon as the notify sends the signal, the
> signal thread receives it and comes out of ConnectNamedPipe and does
> the processing to dispatch the signal.

Have you done that debugging on a machine that's showing the failure?
Since only some of our Windows critters are showing it, it's evidently
dependent on environment or Windows version somehow.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

Tom Lane-2
I wrote:
> Amit Kapila <[hidden email]> writes:
>> On Sat, Dec 7, 2019 at 5:01 AM Tom Lane <[hidden email]> wrote:
>>> A possible theory as to what's happening is that the kernel scheduler
>>> is discriminating against listener2's signal management thread(s)
>>> and not running them until everything else goes idle for a moment.

>> If we have to believe that theory then why the other similar test is
>> not showing the problem.

> There are fewer processes involved in that case, so I don't think
> it disproves the theory that this is a scheduler glitch.

So, just idly looking at the code in src/backend/port/win32/signal.c
and src/port/kill.c, I have to wonder why we have this baroque-looking
design of using *two* signal management threads.  And, if I'm
reading it right, we create an entire new pipe object and an entire
new instance of the second thread for each incoming signal.  Plus, the
signal senders use CallNamedPipe (hence, underneath, TransactNamedPipe)
which means they in effect wait for the recipient's signal-handling
thread to ack receipt of the signal.  Maybe there's a good reason for
all this but it sure seems like a lot of wasted cycles from here.

I have to wonder why we don't have a single named pipe that lasts as
long as the recipient process does, and a signal sender just writes
one byte to it, and considers the signal delivered if it is able to
do that.  The "message" semantics seem like overkill for that.

I dug around in the contemporaneous archives and could only find
https://www.postgresql.org/message-id/303E00EBDD07B943924382E153890E5434AA47%40cuthbert.rcsinc.local
which describes the existing approach but fails to explain why we
should do it like that.

This might or might not have much to do with the immediate problem,
but I can't help wondering if there's some race-condition-ish behavior
in there that's contributing to what we're seeing.  We already had to
fix a couple of race conditions from doing it like this, cf commits
2e371183e, 04a4413c2, f27a4696f.  Perhaps 0ea1f2a3a is relevant
as well.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

akapila
On Sun, Dec 8, 2019 at 1:26 AM Tom Lane <[hidden email]> wrote:

>
> So, just idly looking at the code in src/backend/port/win32/signal.c
> and src/port/kill.c, I have to wonder why we have this baroque-looking
> design of using *two* signal management threads.  And, if I'm
> reading it right, we create an entire new pipe object and an entire
> new instance of the second thread for each incoming signal.  Plus, the
> signal senders use CallNamedPipe (hence, underneath, TransactNamedPipe)
> which means they in effect wait for the recipient's signal-handling
> thread to ack receipt of the signal.  Maybe there's a good reason for
> all this but it sure seems like a lot of wasted cycles from here.
>
> I have to wonder why we don't have a single named pipe that lasts as
> long as the recipient process does, and a signal sender just writes
> one byte to it, and considers the signal delivered if it is able to
> do that.  The "message" semantics seem like overkill for that.
>
> I dug around in the contemporaneous archives and could only find
> https://www.postgresql.org/message-id/303E00EBDD07B943924382E153890E5434AA47%40cuthbert.rcsinc.local
> which describes the existing approach but fails to explain why we
> should do it like that.
>
> This might or might not have much to do with the immediate problem,
> but I can't help wondering if there's some race-condition-ish behavior
> in there that's contributing to what we're seeing.
>

On the receiving side, the work we do after the 'notify' is finished
(or before CallNamedPipe gets control back) is as follows:

pg_signal_dispatch_thread()
{
..
FlushFileBuffers(pipe);
DisconnectNamedPipe(pipe);
CloseHandle(pipe);

pg_queue_signal(sigNum);
}

It seems most of these are the system calls which makes me think that
they might be slow enough on some Windows version that it could lead
to such race condition.

Now, coming back to the other theory the scheduler is not able to
schedule these signal management threads.  I think if that would be
the case, then notify could not have finished, because CallNamedPipe
returns only when dispatch thread writes back to the pipe.   Now, if
somehow after writing back on the pipe if the scheduler kicks this
thread out, it is possible that we see such behavior, however, I am
not sure if we can do anything about that.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

akapila
In reply to this post by Tom Lane-2
On Sat, Dec 7, 2019 at 10:50 PM Tom Lane <[hidden email]> wrote:

>
> Amit Kapila <[hidden email]> writes:
> > On Sat, Dec 7, 2019 at 5:01 AM Tom Lane <[hidden email]> wrote:
> >> A possible theory as to what's happening is that the kernel scheduler
> >> is discriminating against listener2's signal management thread(s)
> >> and not running them until everything else goes idle for a moment.
>
> > If we have to believe that theory then why the other similar test is
> > not showing the problem.
>
> There are fewer processes involved in that case, so I don't think
> it disproves the theory that this is a scheduler glitch.
>
> > I have also debugged
> > it in the Windows box that as soon as the notify sends the signal, the
> > signal thread receives it and comes out of ConnectNamedPipe and does
> > the processing to dispatch the signal.
>
> Have you done that debugging on a machine that's showing the failure?
>

No, it is on my local Win-7 setup.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

akapila
In reply to this post by akapila
On Sun, Dec 8, 2019 at 10:44 AM Amit Kapila <[hidden email]> wrote:

>
> On Sun, Dec 8, 2019 at 1:26 AM Tom Lane <[hidden email]> wrote:
> >
> > So, just idly looking at the code in src/backend/port/win32/signal.c
> > and src/port/kill.c, I have to wonder why we have this baroque-looking
> > design of using *two* signal management threads.  And, if I'm
> > reading it right, we create an entire new pipe object and an entire
> > new instance of the second thread for each incoming signal.  Plus, the
> > signal senders use CallNamedPipe (hence, underneath, TransactNamedPipe)
> > which means they in effect wait for the recipient's signal-handling
> > thread to ack receipt of the signal.  Maybe there's a good reason for
> > all this but it sure seems like a lot of wasted cycles from here.
> >
> > I have to wonder why we don't have a single named pipe that lasts as
> > long as the recipient process does, and a signal sender just writes
> > one byte to it, and considers the signal delivered if it is able to
> > do that.  The "message" semantics seem like overkill for that.
> >
> > I dug around in the contemporaneous archives and could only find
> > https://www.postgresql.org/message-id/303E00EBDD07B943924382E153890E5434AA47%40cuthbert.rcsinc.local
> > which describes the existing approach but fails to explain why we
> > should do it like that.
> >
> > This might or might not have much to do with the immediate problem,
> > but I can't help wondering if there's some race-condition-ish behavior
> > in there that's contributing to what we're seeing.
> >
>
> On the receiving side, the work we do after the 'notify' is finished
> (or before CallNamedPipe gets control back) is as follows:
>
> pg_signal_dispatch_thread()
> {
> ..
> FlushFileBuffers(pipe);
> DisconnectNamedPipe(pipe);
> CloseHandle(pipe);
>
> pg_queue_signal(sigNum);
> }
>
> It seems most of these are the system calls which makes me think that
> they might be slow enough on some Windows version that it could lead
> to such race condition.
>

IIUC, once the dispatch thread has queued the signal
(pg_queue_signal), the next CHECK_FOR_INTERRUPTS by the main thread
will execute the signal.  So, if we move pg_queue_signal() before we
do WriteFile in pg_signal_dispatch_thread(), this race condition will
be closed.  Now, we might not want to do this as that will add some
more time (even though very less) before notify on the other side can
finish or maybe there is some technical problem with this idea which I
am not able to see immediately.


--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

Tom Lane-2
Amit Kapila <[hidden email]> writes:
> IIUC, once the dispatch thread has queued the signal
> (pg_queue_signal), the next CHECK_FOR_INTERRUPTS by the main thread
> will execute the signal.  So, if we move pg_queue_signal() before we
> do WriteFile in pg_signal_dispatch_thread(), this race condition will
> be closed.  Now, we might not want to do this as that will add some
> more time (even though very less) before notify on the other side can
> finish or maybe there is some technical problem with this idea which I
> am not able to see immediately.

Hmm.  Certainly worth trying to see if it resolves the failure on
Andrew's machines.

It's not real hard to believe that TransactNamedPipe could be
"optimized" so that it preferentially schedules the client thread
once the handshake is done, not the server thread (based on some
heuristic claim that the former is probably an interactive process
and the latter less so).  In that situation, we'd proceed on with
the signal not really delivered, and there is nothing guaranteeing
that it will be delivered anytime soon --- the rest of the test
can make progress regardless of whether that thread ever gets
scheduled again.  So, as long as we've got this handshake mechanism,
it seems like it'd be a good thing for the ack to indicate that
the signal was *actually* delivered (by setting the recipient's
flag bit) and not just that it'll probably get delivered eventually.

I remain a bit unsure that we actually need the handshaking business
at all --- I doubt that Unix signals provide any guarantee of synchronous
delivery on most platforms.  (If I'm reading the POSIX spec correctly,
it only requires synchronous delivery when a thread signals itself.)
But the existence of this unsynchronized thread in the Windows
implementation sure seems like a dubious thing, now that you
point it out.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

Tom Lane-2
I wrote:
> Amit Kapila <[hidden email]> writes:
>> IIUC, once the dispatch thread has queued the signal
>> (pg_queue_signal), the next CHECK_FOR_INTERRUPTS by the main thread
>> will execute the signal.  So, if we move pg_queue_signal() before we
>> do WriteFile in pg_signal_dispatch_thread(), this race condition will
>> be closed.  Now, we might not want to do this as that will add some
>> more time (even though very less) before notify on the other side can
>> finish or maybe there is some technical problem with this idea which I
>> am not able to see immediately.

> Hmm.  Certainly worth trying to see if it resolves the failure on
> Andrew's machines.

For Andrew's convenience, here's a draft patch for that.  I took the
liberty of improving the rather thin comments in this area, too.

                        regards, tom lane


diff --git a/src/backend/port/win32/signal.c b/src/backend/port/win32/signal.c
index 9b5e544..f1e35fd 100644
--- a/src/backend/port/win32/signal.c
+++ b/src/backend/port/win32/signal.c
@@ -208,12 +208,15 @@ pgwin32_create_signal_listener(pid_t pid)
  */
 
 
+/*
+ * Queue a signal for the main thread, by setting the flag bit and event.
+ */
 void
 pg_queue_signal(int signum)
 {
  Assert(pgwin32_signal_event != NULL);
  if (signum >= PG_SIGNAL_COUNT || signum <= 0)
- return;
+ return; /* ignore any bad signal number */
 
  EnterCriticalSection(&pg_signal_crit_sec);
  pg_signal_queue |= sigmask(signum);
@@ -222,7 +225,11 @@ pg_queue_signal(int signum)
  SetEvent(pgwin32_signal_event);
 }
 
-/* Signal dispatching thread */
+/*
+ * Signal dispatching thread.  This runs after we have received a named
+ * pipe connection from a client (signal sender).   Process the request,
+ * close the pipe, and exit.
+ */
 static DWORD WINAPI
 pg_signal_dispatch_thread(LPVOID param)
 {
@@ -242,13 +249,37 @@ pg_signal_dispatch_thread(LPVOID param)
  CloseHandle(pipe);
  return 0;
  }
+
+ /*
+ * Queue the signal before responding to the client.  In this way, it's
+ * guaranteed that once kill() has returned in the signal sender, the next
+ * CHECK_FOR_INTERRUPTS() in the signal recipient will see the signal.
+ * (This is a stronger guarantee than POSIX makes; maybe we don't need it?
+ * But without it, we've seen timing bugs on Windows that do not manifest
+ * on any known Unix.)
+ */
+ pg_queue_signal(sigNum);
+
+ /*
+ * Write something back to the client, allowing its CallNamedPipe() call
+ * to terminate.
+ */
  WriteFile(pipe, &sigNum, 1, &bytes, NULL); /* Don't care if it works or
  * not.. */
+
+ /*
+ * We must wait for the client to read the data before we can close the
+ * pipe, else the data will be lost.  (If the WriteFile call failed,
+ * there'll be nothing in the buffer, so this shouldn't block.)
+ */
  FlushFileBuffers(pipe);
+
+ /* This is a formality, since we're about to close the pipe anyway. */
  DisconnectNamedPipe(pipe);
+
+ /* And we're done. */
  CloseHandle(pipe);
 
- pg_queue_signal(sigNum);
  return 0;
 }
 
@@ -266,6 +297,7 @@ pg_signal_thread(LPVOID param)
  BOOL fConnected;
  HANDLE hThread;
 
+ /* Create a new pipe instance if we don't have one. */
  if (pipe == INVALID_HANDLE_VALUE)
  {
  pipe = CreateNamedPipe(pipename, PIPE_ACCESS_DUPLEX,
@@ -280,6 +312,11 @@ pg_signal_thread(LPVOID param)
  }
  }
 
+ /*
+ * Wait for a client to connect.  If something connects before we
+ * reach here, we'll get back a "failure" with ERROR_PIPE_CONNECTED,
+ * which is actually a success (way to go, Microsoft).
+ */
  fConnected = ConnectNamedPipe(pipe, NULL) ? TRUE : (GetLastError() == ERROR_PIPE_CONNECTED);
  if (fConnected)
  {
Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

Tom Lane-2
In reply to this post by Tom Lane-2
I wrote:
> So, just idly looking at the code in src/backend/port/win32/signal.c
> and src/port/kill.c, I have to wonder why we have this baroque-looking
> design of using *two* signal management threads.  And, if I'm
> reading it right, we create an entire new pipe object and an entire
> new instance of the second thread for each incoming signal.  Plus, the
> signal senders use CallNamedPipe (hence, underneath, TransactNamedPipe)
> which means they in effect wait for the recipient's signal-handling
> thread to ack receipt of the signal.  Maybe there's a good reason for
> all this but it sure seems like a lot of wasted cycles from here.

Here's a possible patch (untested by me) to get rid of the second thread
and the new-pipe-for-every-request behavior.  I believe that the existing
logic may be based on Microsoft's "Multithreaded Pipe Server" example [1]
or something similar, but that's based on an assumption that servicing
a client request may take a substantial amount of time and it's worth
handling requests concurrently.  Neither point applies in this context.

Doing it like this seems attractive to me because it gets rid of two
different failure modes: inability to create a new thread and inability
to create a new pipe handle.  Now on the other hand, it means that
inability to complete the read/write transaction with a client right
away will delay processing of other signals.  But we know that the
client is engaged in a CallNamedPipe operation, so how realistic is
that concern?

This is to be applied on top of the other patch I just sent.

                        regards, tom lane

[1] https://docs.microsoft.com/en-us/windows/win32/ipc/multithreaded-pipe-server


diff --git a/src/backend/port/win32/signal.c b/src/backend/port/win32/signal.c
index f1e35fd..aa762b9 100644
--- a/src/backend/port/win32/signal.c
+++ b/src/backend/port/win32/signal.c
@@ -38,7 +38,7 @@ static pqsigfunc pg_signal_array[PG_SIGNAL_COUNT];
 static pqsigfunc pg_signal_defaults[PG_SIGNAL_COUNT];
 
 
-/* Signal handling thread function */
+/* Signal handling thread functions */
 static DWORD WINAPI pg_signal_thread(LPVOID param);
 static BOOL WINAPI pg_console_handler(DWORD dwCtrlType);
 
@@ -225,64 +225,6 @@ pg_queue_signal(int signum)
  SetEvent(pgwin32_signal_event);
 }
 
-/*
- * Signal dispatching thread.  This runs after we have received a named
- * pipe connection from a client (signal sender).   Process the request,
- * close the pipe, and exit.
- */
-static DWORD WINAPI
-pg_signal_dispatch_thread(LPVOID param)
-{
- HANDLE pipe = (HANDLE) param;
- BYTE sigNum;
- DWORD bytes;
-
- if (!ReadFile(pipe, &sigNum, 1, &bytes, NULL))
- {
- /* Client died before sending */
- CloseHandle(pipe);
- return 0;
- }
- if (bytes != 1)
- {
- /* Received <bytes> bytes over signal pipe (should be 1) */
- CloseHandle(pipe);
- return 0;
- }
-
- /*
- * Queue the signal before responding to the client.  In this way, it's
- * guaranteed that once kill() has returned in the signal sender, the next
- * CHECK_FOR_INTERRUPTS() in the signal recipient will see the signal.
- * (This is a stronger guarantee than POSIX makes; maybe we don't need it?
- * But without it, we've seen timing bugs on Windows that do not manifest
- * on any known Unix.)
- */
- pg_queue_signal(sigNum);
-
- /*
- * Write something back to the client, allowing its CallNamedPipe() call
- * to terminate.
- */
- WriteFile(pipe, &sigNum, 1, &bytes, NULL); /* Don't care if it works or
- * not.. */
-
- /*
- * We must wait for the client to read the data before we can close the
- * pipe, else the data will be lost.  (If the WriteFile call failed,
- * there'll be nothing in the buffer, so this shouldn't block.)
- */
- FlushFileBuffers(pipe);
-
- /* This is a formality, since we're about to close the pipe anyway. */
- DisconnectNamedPipe(pipe);
-
- /* And we're done. */
- CloseHandle(pipe);
-
- return 0;
-}
-
 /* Signal handling thread */
 static DWORD WINAPI
 pg_signal_thread(LPVOID param)
@@ -290,12 +232,12 @@ pg_signal_thread(LPVOID param)
  char pipename[128];
  HANDLE pipe = pgwin32_initial_signal_pipe;
 
+ /* Set up pipe name, in case we have to re-create the pipe. */
  snprintf(pipename, sizeof(pipename), "\\\\.\\pipe\\pgsignal_%lu", GetCurrentProcessId());
 
  for (;;)
  {
  BOOL fConnected;
- HANDLE hThread;
 
  /* Create a new pipe instance if we don't have one. */
  if (pipe == INVALID_HANDLE_VALUE)
@@ -320,51 +262,52 @@ pg_signal_thread(LPVOID param)
  fConnected = ConnectNamedPipe(pipe, NULL) ? TRUE : (GetLastError() == ERROR_PIPE_CONNECTED);
  if (fConnected)
  {
- HANDLE newpipe;
-
  /*
- * We have a connected pipe. Pass this off to a separate thread
- * that will do the actual processing of the pipe.
- *
- * We must also create a new instance of the pipe *before* we
- * start running the new thread. If we don't, there is a race
- * condition whereby the dispatch thread might run CloseHandle()
- * before we have created a new instance, thereby causing a small
- * window of time where we will miss incoming requests.
+ * We have a connection from a would-be signal sender. Process it.
  */
- newpipe = CreateNamedPipe(pipename, PIPE_ACCESS_DUPLEX,
-  PIPE_TYPE_MESSAGE | PIPE_READMODE_MESSAGE | PIPE_WAIT,
-  PIPE_UNLIMITED_INSTANCES, 16, 16, 1000, NULL);
- if (newpipe == INVALID_HANDLE_VALUE)
+ BYTE sigNum;
+ DWORD bytes;
+
+ if (ReadFile(pipe, &sigNum, 1, &bytes, NULL) &&
+ bytes == 1)
  {
  /*
- * This really should never fail. Just retry in case it does,
- * even though we have a small race window in that case. There
- * is nothing else we can do other than abort the whole
- * process which will be even worse.
+ * Queue the signal before responding to the client.  In this
+ * way, it's guaranteed that once kill() has returned in the
+ * signal sender, the next CHECK_FOR_INTERRUPTS() in the
+ * signal recipient will see the signal.  (This is a stronger
+ * guarantee than POSIX makes; maybe we don't need it?  But
+ * without it, we've seen timing bugs on Windows that do not
+ * manifest on any known Unix.)
  */
- write_stderr("could not create signal listener pipe: error code %lu; retrying\n", GetLastError());
+ pg_queue_signal(sigNum);
 
  /*
- * Keep going so we at least dispatch this signal. Hopefully,
- * the call will succeed when retried in the loop soon after.
+ * Write something back to the client, allowing its
+ * CallNamedPipe() call to terminate.
  */
+ WriteFile(pipe, &sigNum, 1, &bytes, NULL); /* Don't care if it
+ * works or not */
+
+ /*
+ * We must wait for the client to read the data before we can
+ * disconnect, else the data will be lost.  (If the WriteFile
+ * call failed, there'll be nothing in the buffer, so this
+ * shouldn't block.)
+ */
+ FlushFileBuffers(pipe);
  }
- hThread = CreateThread(NULL, 0,
-   (LPTHREAD_START_ROUTINE) pg_signal_dispatch_thread,
-   (LPVOID) pipe, 0, NULL);
- if (hThread == INVALID_HANDLE_VALUE)
- write_stderr("could not create signal dispatch thread: error code %lu\n",
- GetLastError());
  else
- CloseHandle(hThread);
+ {
+ /*
+ * If we fail to read a byte from the client, assume it's the
+ * client's problem and do nothing.  Perhaps it'd be better to
+ * force a pipe close and reopen?
+ */
+ }
 
- /*
- * Background thread is running with our instance of the pipe. So
- * replace our reference with the newly created one and loop back
- * up for another run.
- */
- pipe = newpipe;
+ /* Disconnect from client so that we can re-use the pipe. */
+ DisconnectNamedPipe(pipe);
  }
  else
  {
Reply | Threaded
Open this post in threaded view
|

Re: Windows buildfarm members vs. new async-notify isolation test

Andrew Dunstan-8

On 12/8/19 11:57 AM, Tom Lane wrote:

> I wrote:
>> So, just idly looking at the code in src/backend/port/win32/signal.c
>> and src/port/kill.c, I have to wonder why we have this baroque-looking
>> design of using *two* signal management threads.  And, if I'm
>> reading it right, we create an entire new pipe object and an entire
>> new instance of the second thread for each incoming signal.  Plus, the
>> signal senders use CallNamedPipe (hence, underneath, TransactNamedPipe)
>> which means they in effect wait for the recipient's signal-handling
>> thread to ack receipt of the signal.  Maybe there's a good reason for
>> all this but it sure seems like a lot of wasted cycles from here.
> Here's a possible patch (untested by me) to get rid of the second thread
> and the new-pipe-for-every-request behavior.  I believe that the existing
> logic may be based on Microsoft's "Multithreaded Pipe Server" example [1]
> or something similar, but that's based on an assumption that servicing
> a client request may take a substantial amount of time and it's worth
> handling requests concurrently.  Neither point applies in this context.
>
> Doing it like this seems attractive to me because it gets rid of two
> different failure modes: inability to create a new thread and inability
> to create a new pipe handle.  Now on the other hand, it means that
> inability to complete the read/write transaction with a client right
> away will delay processing of other signals.  But we know that the
> client is engaged in a CallNamedPipe operation, so how realistic is
> that concern?
>
> This is to be applied on top of the other patch I just sent.
>
> regards, tom lane
>
> [1] https://docs.microsoft.com/en-us/windows/win32/ipc/multithreaded-pipe-server



Patch 1 fixed the problems on frogmouth.


Patch 2 also ran without incident.


tests run on REL_10_STABLE


cheers


andrew


--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



12