[HACKERS] postmaster.pid disappeared


Junaili Lie
Hi,
I was redirected to this mailing list when I asked questions on IRC. I
hope this is the right list.
I am running PostgreSQL 7.4.8 on Solaris 10 (I also compiled and
installed Slony). Every time I try to reload the configuration
using pg_ctl reload -D $PGDATA, it deletes postmaster.pid and
doesn't create a new one. So, after a reload, the only way I can restart
the server is to kill -9 it and then start it again. I checked the
log; nothing is meaningful except the last line:
LOG:  received SIGHUP, reloading configuration files
Does anybody have any idea what is going on?

I also noticed that pg_ctl stop -D $PGDATA with -m fast or -m smart takes
forever. When I run ps -ef, I see several <defunct> processes. I
have to kill -9 almost every time to shut down the server.

Thank you in advance,

J
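
A quick way to confirm the symptom is to check postmaster.pid before and after the reload. The sketch below is only an illustration in Python, not part of PostgreSQL: the helper name `postmaster_status` and its return strings are made up here. It assumes the first line of postmaster.pid holds the postmaster's PID and uses signal 0 to probe whether that process still exists.

```python
import os
from pathlib import Path


def postmaster_status(data_dir):
    """Report whether postmaster.pid exists and whether the PID it names is alive.

    Hypothetical helper for illustration; the return strings are arbitrary labels.
    """
    pid_file = Path(data_dir) / "postmaster.pid"
    if not pid_file.exists():
        return "no pid file"
    # Assumption: the first whitespace-separated token in the file is the PID.
    pid = int(pid_file.read_text().split()[0])
    try:
        # Signal 0 only checks that the process exists; nothing is delivered.
        os.kill(pid, 0)
    except ProcessLookupError:
        return "stale pid file"
    except PermissionError:
        # The process exists but is owned by another user.
        return "running"
    return "running"
```

Calling it with the data directory before and after pg_ctl reload would show whether the file vanished while a postmaster process was still around.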

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match

Re: [HACKERS] postmaster.pid disappeared

Josh berkus
Junaili,

> I am running postgresql 7.4.8 on solaris 10 (and I compile and
> installed slony). Everytime I am trying to reload the configuration
> using pg_ctl reload -D $PGDATA, it deleted the postmaster.pid and
> didn't create a new one. So, after reload, the only way I can restart
> the server is by kill -9 and then start the server again. I check the
> log, nothing is meaningful except the last line:
> LOG:  received SIGHUP, reloading configuration files
> I am wondering if anybody has any idea?

Hmmm ... you didn't answer my question on IRC: are you using an alternate
database location defined in postgresql.conf?

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to [hidden email] so that your
      message can get through to the mailing list cleanly

Re: [HACKERS] postmaster.pid disappeared

Tom Lane-2
In reply to this post by Junaili Lie
Junaili Lie <[hidden email]> writes:
> I am running postgresql 7.4.8 on solaris 10 (and I compile and
> installed slony). Everytime I am trying to reload the configuration
> using pg_ctl reload -D $PGDATA, it deleted the postmaster.pid and
> didn't create a new one.

That's very strange.  The pg_ctl script itself doesn't delete
the postmaster.pid file under any circumstances (unless maybe
you are using a locally modified version?), and the postmaster
shouldn't delete it either unless exiting.  Can you determine
exactly where the unlink call is coming from?  strace or the local
equivalent (truss on Solaris) may help.

                        regards, tom lane
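
For readers unsure how to follow the tracing suggestion above: attach a system-call tracer to the running postmaster, trigger the reload from another terminal, and watch for unlink calls on postmaster.pid. The Python sketch below only builds the tracer command line. `build_trace_command` is a hypothetical helper, and the choice of flags (follow child processes, filter on unlink) is an assumption about what is useful here rather than something stated in the thread. truss is the Solaris counterpart of strace.

```python
import sys


def build_trace_command(pid, platform=sys.platform):
    """Build an argv that attaches a syscall tracer to `pid` and reports unlink calls.

    Hypothetical helper; the flag selection is an assumption for illustration.
    """
    if platform.startswith("sunos"):
        # Solaris: -f follows child processes, -t restricts the traced syscalls,
        # -p attaches to an already running process.
        return ["truss", "-f", "-t", "unlink", "-p", str(pid)]
    # Linux: -f follows child processes, -e trace= restricts the traced syscalls,
    # -p attaches to an already running process.
    return ["strace", "-f", "-e", "trace=unlink,unlinkat", "-p", str(pid)]
```

The resulting list can be passed to `subprocess.run` by a user with permission to trace the postmaster; the PID itself is the first line of postmaster.pid while that file still exists.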

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Re: [HACKERS] postmaster.pid disappeared

Josh berkus
In reply to this post by Josh berkus
Folks,

> > > I am running postgresql 7.4.8 on solaris 10 (and I compile and
> > > installed slony). Everytime I am trying to reload the configuration
> > > using pg_ctl reload -D $PGDATA, it deleted the postmaster.pid and
> > > didn't create a new one. So, after reload, the only way I can restart
> > > the server is by kill -9 and then start the server again. I check the
> > > log, nothing is meaningful except the last line:
> > > LOG: received SIGHUP, reloading configuration files
> > > I am wondering if anybody has any idea?

Looking at his report, what's happening is that the postmaster is shutting
down, but the other backends are not ... they're hanging around as zombies.  
Not sure why, but I'm chatting with him on IRC.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to [hidden email])

Re: [HACKERS] postmaster.pid disappeared

Tom Lane-2
Josh Berkus <[hidden email]> writes:
> Looking at his report, what's happening is that the postmaster is shutting
> down, but the other backends are not ... they're hanging around as
> zombies.

The zombies couldn't be dead backends if the postmaster has gone away:
in every Unix I know, a zombie process disappears instantly if its
parent dies (since the only reason for a zombie in the first place
is to hold the process' exit status until the parent reads it with
wait()).

> Not sure why, but I'm chatting with him on IRC.

Find out what the parent process of the zombies actually is.

                        regards, tom lane
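
The zombie mechanics described above can be reproduced in a few lines. This is a minimal Python sketch for a Unix-like system, with `reap_child` as an illustrative name: a child exits at once, sits in the process table as a zombie while the parent sleeps, and disappears once the parent collects its status with waitpid().

```python
import os
import time


def reap_child(exit_code):
    """Fork a child that exits immediately, then collect its status with waitpid()."""
    pid = os.fork()
    if pid == 0:
        # Child: exit at once, skipping the parent's cleanup handlers.
        os._exit(exit_code)
    # Parent: until waitpid() runs, the exited child lingers as a zombie
    # (ps shows it as <defunct>) so the kernel can keep its exit status.
    time.sleep(0.1)
    _, status = os.waitpid(pid, 0)
    # Once waitpid() has returned, the child's process-table entry is gone.
    return os.WEXITSTATUS(status)
```

Running ps in another terminal during the sleep (lengthened as needed) shows the <defunct> entry with the Python process as its parent.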

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

               http://archives.postgresql.org

Re: [HACKERS] postmaster.pid disappeared

Josh berkus
Tom,

> The zombies couldn't be dead backends if the postmaster has gone away:
> in every Unix I know, a zombie process disappears instantly if its
> parent dies (since the only reason for a zombie in the first place
> is to hold the process' exit status until the parent reads it with
> wait()).

Yeah, I think I spoke too soon.  What it looks like is that pg_ctl is
reporting success while actually failing to shut down the postmaster.
Solaris makes it a little hard to read; parent-process relationships aren't
as clear as they are in Linux.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
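
Parent-process relationships can be read directly by listing process state and parent PID together, for example with `ps -eo pid,ppid,s,comm`, and picking out the rows in state Z. The Python sketch below parses output of that shape. `find_zombie_parents` is a hypothetical helper, and it assumes a header row followed by columns in the order PID, PPID, state, command.

```python
def find_zombie_parents(ps_output):
    """Map each zombie PID to its parent PID.

    Assumes text shaped like the output of `ps -eo pid,ppid,s,comm`:
    one header row, then PID, PPID, state and command columns.
    """
    zombies = {}
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # ignore rows that do not have the expected columns
        pid, ppid, state = fields[0], fields[1], fields[2]
        if state.startswith("Z"):  # Z marks a zombie (defunct) process
            zombies[int(pid)] = int(ppid)
    return zombies
```

Feeding it the captured ps output gives a dictionary whose values are the PIDs to look up next, which shows whether the <defunct> entries belong to the postmaster or to some other process.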


Re: [HACKERS] postmaster.pid disappeared

Junaili Lie
Hi,
Thank you all for the responses.
I should probably have mentioned that Postgres is managed by SMF, the
Service Management Facility in Solaris 10.
I asked our sysadmin to remove Postgres from SMF management, and
he did. But right now he is having a problem: the system
cannot start because of some mounting problems.
I will report back on any progress.
In the meantime, any ideas, suggestions, or things I can do to
provide more information would be greatly appreciated.
Thanks,


J


On 5/24/05, Josh Berkus <[hidden email]> wrote:

> Tom,
>
> > The zombies couldn't be dead backends if the postmaster has gone away:
> > in every Unix I know, a zombie process disappears instantly if its
> > parent dies (since the only reason for a zombie in the first place
> > is to hold the process' exit status until the parent reads it with
> > wait()).
>
> yeah, I think I spoke too soon.  What it looks like is that pg_ctl is
> reporting success while actually failing to shut down the postmaster.
> Solaris makes it a little hard to read; parent-process relationships aren't
> as clear as they are in Linux.

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] postmaster.pid disappeared

Junaili Lie
In reply to this post by Tom Lane-2
Tom,
I am not sure how to determine where the unlink call comes from.
Can you provide more information or instructions?

In my case, pg_ctl reload -D /usr/local/pgsql deleted
postmaster.pid without creating a new one. I am not sure whether this
is normal.

J


On 5/24/05, Tom Lane <[hidden email]> wrote:

> Junaili Lie <[hidden email]> writes:
> > I am running postgresql 7.4.8 on solaris 10 (and I compile and
> > installed slony). Everytime I am trying to reload the configuration
> > using pg_ctl reload -D $PGDATA, it deleted the postmaster.pid and
> > didn't create a new one.
>
> That's very strange.  The pg_ctl script itself doesn't delete
> the postmaster.pid file under any circumstances (unless maybe
> you are using a locally modified version?), and the postmaster
> shouldn't delete it either unless exiting.  Can you determine
> exactly where the unlink call is coming from?  strace or local
> equivalent may help.
>
>                        regards, tom lane
>


Re: [HACKERS] postmaster.pid disappeared

Junaili Lie
Hi,
I reinstalled PostgreSQL 7.4.6 instead of 7.4.8 (still on Solaris 10)
and did not put PostgreSQL under SMF management, and
it has worked fine so far. I should also mention that I configured
PostgreSQL 7.4.6 with the --enable-thread-safety option; I don't know
whether this has anything to do with the issue.
Thanks for all the help,

J
