503 Backend fetch failed errors

Rémi Zara-2
Hi,

I’m getting a lot of these errors with coypu (several per day), but not systematically.
Is this a problem on my end, or is it on the server end?

Query for: stage=OK&animal=coypu&ts=1541573277
Target: https://buildfarm.postgresql.org/cgi-bin/pgstatus.pl/53b137e7c765b781699bbe73e3aec7751a8c4ab7
Status Line: 503 Backend fetch failed
Web txn failed with status: 1
Query for: stage=OK&animal=coypu&ts=1541575423
Target: https://buildfarm.postgresql.org/cgi-bin/pgstatus.pl/5e8e0913dc9f9a580a4125264d74fff95f26c926
Status Line: 503 Backend fetch failed
Web txn failed with status: 1

Regards,

Rémi Zara

Re: 503 Backend fetch failed errors

Stefan Kaltenbrunner
On 11/7/18 6:40 PM, Rémi Zara wrote:
> Hi,

Hi Rémi!

>
> I’m getting a lot of these errors with coypu (several per day), but not systematically.
> Is this a problem on my end, or is this on the sever end ?
>
> Query for: stage=OK&animal=coypu&ts=1541573277
> Target: https://buildfarm.postgresql.org/cgi-bin/pgstatus.pl/53b137e7c765b781699bbe73e3aec7751a8c4ab7
> Status Line: 503 Backend fetch failed
> Web txn failed with status: 1
> Query for: stage=OK&animal=coypu&ts=1541575423
> Target: https://buildfarm.postgresql.org/cgi-bin/pgstatus.pl/5e8e0913dc9f9a580a4125264d74fff95f26c926
> Status Line: 503 Backend fetch failed
> Web txn failed with status: 1

given the error this is something that is created by the varnish
instance that is in front of the buildfarm. On a quick look I could not
immediately figure out what the problem is - but it looks like you (or
somebody else) at least tried to click one of the links above using his
desktop browser and got an error about a missing branch specification ;)


We will investigate further...


Stefan

Re: 503 Backend fetch failed errors

Magnus Hagander-2


On Thu, Nov 8, 2018 at 9:01 AM Stefan Kaltenbrunner <[hidden email]> wrote:
> given the error this is something that is created by the varnish
> instance that is in front of the buildfarm. [...]

AFAICT:

A quick look in the logs indicates that the buildfarm is responding:
-   RespHeader     Status: 492 bad branch parameter

However, 492 is not a valid http status code, so Varnish can't handle it and thus returns 503 failure to the client.
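
To illustrate the point about 492, here is a minimal Python sketch that checks whether a code is one of the status codes known to the standard library's `http.HTTPStatus` enum. The helper name `is_registered_status` is made up for this example, and the sketch says nothing about how Varnish itself decides what to do with such a response; that part is only the reading of the logs given above.

```python
from http import HTTPStatus


def is_registered_status(code: int) -> bool:
    """Return True if `code` is a status code known to http.HTTPStatus.

    Hypothetical helper for illustration only; it reflects Python's list
    of registered codes, not Varnish's internal validation.
    """
    try:
        HTTPStatus(code)  # raises ValueError for unknown codes
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    for code in (200, 492, 503):
        print(code, is_registered_status(code))
```

Here 492 is reported as unknown, while 200 and 503 are known codes.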


Re: 503 Backend fetch failed errors

Stefan Kaltenbrunner
On 11/8/18 9:19 AM, Magnus Hagander wrote:

> AFAICT:
>
> A quick look in the logs indicates that the buildfarm is responding:
> -   RespHeader     Status: 492 bad branch parameter
>
> However, 492 is not a valid http status code, so Varnish can't handle it
> and thus returns 503 failure to the client.

I think that is not the actual error that Rémi is experiencing. The 492
case (which is indeed not a valid HTTP status code) only happens when one
actually clicks the link in the mail above (which I guess somebody did, and
which you found in the logs), because the actual BF client adds a parameter
to the "Target" URL.

The actual "errors" don't seem to show up in the lighttpd logs, AFAICS.

Re: 503 Backend fetch failed errors

Magnus Hagander-2


On Thu, Nov 8, 2018 at 9:31 AM Stefan Kaltenbrunner <[hidden email]> wrote:
> I think that is not the actual error that Rémi is experiencing. The 492
> case (which is indeed not a valid HTTP status code) only happens when one
> actually clicks the link in the mail above (which I guess somebody did, and
> which you found in the logs), because the actual BF client adds a parameter
> to the "Target" URL.
>
> The actual "errors" don't seem to show up in the lighttpd logs, AFAICS.

Oh, sorry. I was checking the one called "Target"; I assumed that was the URL that failed.

Assuming that for the original ones the ts is part of the URL, none of that is still in the logs. Or are they POST parameters? Do we know exactly which URL is actually failing, and exactly when this happened?


Re: 503 Backend fetch failed errors

Stefan Kaltenbrunner
On 11/8/18 9:35 AM, Magnus Hagander wrote:

> Oh, sorry. I was checking the one called "target", I assumed that was
> the URL that failed.
>
> Assuming for the original ones the ts is part of the URL, none of that
> is still in the logs. Or are they post parameters? Do we know exactly
> which URL is actually failing, and when (exactly) this happened?

Well, most of the parameters to each URL are in the error report (e.g.
"Query for: stage=OK&animal=coypu&ts=1541575423"). I don't know whether Rémi
knows which branch that was for. The report also has a Unix timestamp,
though I think that is the timestamp from when the build started on
the BF client, not the time when the request was made.

AFAICS the two requests are not in the lighty log at all, so only Varnish
might have seen them (though it's unclear what error it got while
connecting to lighty).
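
For anyone trying to match the report against server logs, the `ts` value can be turned into a readable UTC time. The following is a small sketch using only the Python standard library; the function name `decode_report_query` is invented for this example, and it assumes `ts` is a Unix timestamp in seconds (which, as noted above, is probably the build start time rather than the time of the request).

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs


def decode_report_query(query: str) -> dict:
    """Split a 'Query for:' string into its parameters and decode `ts`.

    Hypothetical helper; assumes `ts` is a Unix timestamp in seconds.
    """
    # parse_qs returns a list per key; each key appears once here.
    params = {key: values[0] for key, values in parse_qs(query).items()}
    params["ts_utc"] = datetime.fromtimestamp(int(params["ts"]), tz=timezone.utc)
    return params


if __name__ == "__main__":
    report = decode_report_query("stage=OK&animal=coypu&ts=1541575423")
    print(report["animal"], report["ts_utc"].isoformat())
```

For the two reports in this thread, the timestamps decode to 2018-11-07 06:47:57 UTC and 2018-11-07 07:23:43 UTC.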



Stefan

Re: 503 Backend fetch failed errors

Magnus Hagander-2


On Thu, Nov 8, 2018 at 9:42 AM Stefan Kaltenbrunner <[hidden email]> wrote:
> Well, most of the parameters to each URL are in the error report (e.g.
> "Query for: stage=OK&animal=coypu&ts=1541575423"). I don't know whether Rémi
> knows which branch that was for. The report also has a Unix timestamp,
> though I think that is the timestamp from when the build started on
> the BF client, not the time when the request was made.
>
> AFAICS the two requests are not in the lighty log at all, so only Varnish
> might have seen them (though it's unclear what error it got while
> connecting to lighty).

They'll be hard to find in the Varnish log without actually having the URL. There is nothing in the Varnish log with 1541575423 in it at all. And there is nothing with "coypu" and an HTTP 503 in it either. And the log goes back to Nov 6...

So my guess is that it might be a POST which doesn't actually have the animal name or the timestamp in the URL.

I do see some general POSTs returning 503. They all seem to be going to pgstatus.pl like the ones above, so maybe that is the URL after all? If I look at just the POSTs there, I see a single one, and it has:

-   FetchError     Resource temporarily unavailable
-   FetchError     straight insufficient bytes

"Straight insufficient bytes" means there is a mismatch between Content-Length and the actual amount of data sent/read.
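
As a rough illustration of what such a mismatch means, the sketch below compares a declared Content-Length with the number of bytes that actually arrived. This is not Varnish's code; the function name `body_is_complete` and the chunked input are assumptions made for the example.

```python
from typing import Iterable


def body_is_complete(declared_length: int, chunks: Iterable[bytes]) -> bool:
    """Return True if the received chunks add up to the declared length.

    Hypothetical check for illustration; a proxy that is promised
    `declared_length` bytes but receives fewer ends up in the
    "insufficient bytes" situation described above.
    """
    received = sum(len(chunk) for chunk in chunks)
    return received == declared_length


if __name__ == "__main__":
    # Sender promised 4160573 bytes but stopped after 1000.
    print(body_is_complete(4160573, [b"x" * 1000]))
    # Sender promised 5 bytes and delivered 5.
    print(body_is_complete(5, [b"he", b"llo"]))
```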

And on the backend side:

--  FetchError     req.body read error: 11 (Resource temporarily unavailable)

I believe this means that varnish is actually failing to read the request body from the *client*, in order to pass it on to the server. In that case, it could be that the client sends the wrong length. It does send a content-length header of 4160573 bytes -- perhaps it stops sending data before it gets there. Is that a "reasonable size" package being sent? It's quite a big POST.

The error occurred 7.25 seconds after Varnish started talking to lighttpd. So it at least did something first. Perhaps if it actually is bigger than 4 MB it hit some sort of limit and lighttpd killed the request?
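
To put the 4160573-byte figure in perspective, here is a short Python calculation. The two thresholds (4,000,000 bytes and 4 MiB) are only candidate readings of "4MB" chosen for this example; nothing in this thread confirms that lighttpd or Varnish is configured with either limit.

```python
CONTENT_LENGTH = 4160573  # bytes, from the Content-Length header mentioned above


def describe_size(n_bytes: int) -> dict:
    """Express a byte count in MB and MiB and compare it with two
    hypothetical 4 MB style limits (decimal and binary)."""
    return {
        "megabytes": n_bytes / 1_000_000,
        "mebibytes": n_bytes / (1024 * 1024),
        "over_4_MB": n_bytes > 4_000_000,
        "over_4_MiB": n_bytes > 4 * 1024 * 1024,
    }


if __name__ == "__main__":
    for name, value in describe_size(CONTENT_LENGTH).items():
        print(name, value)
```

The body is larger than 4,000,000 bytes but smaller than 4 MiB (4,194,304 bytes), so whether it would trip a "4 MB" limit depends on how such a limit is defined.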
