PostgreSQL lock file

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

PostgreSQL lock file

Олег Самойлов
Hi all.

I have a task to create HA PostgreSQL cross datacenter cluster. I created a test bed on VirtualBox and I simulate different failures and disastrous in loops. And got one of funny bug of the Postgresql (11.4). One of the clusters has two instances of PostgreSQL on port 5433 and 5434. After simulating unexpected power off of the node the lock files of PostgreSQL still exist in /tmp directory. And on recovery the first instance on port 5433 starts without problem, but the second instance sometimes not. The reason is that the PID in /tmp/.s.PGSQL.5434.lock sometimes point to the one of process of PostgereSQL on port 5433, for instance the logger.

 1408 ?        S      0:00 /usr/pgsql-11/bin/postgres -D /var/lib/pgsql/krogan0a
 1415 ?        Ss     0:00  \_ postgres: logger
 1416 ?        Ss     0:02  \_ postgres: startup   recovering 000000030000000000000007
 1417 ?        Ss     0:00  \_ postgres: checkpointer
 1418 ?        Ss     0:00  \_ postgres: background writer
 1419 ?        Ss     0:00  \_ postgres: stats collector
 1420 ?        Ss     0:10  \_ postgres: walreceiver   streaming 0/7269290

# cat /tmp/.s.PGSQL.5434.lock
1415
/var/lib/pgsql/krogan0b
1565355860
5434
/tmp

# sudo -u postgres /usr/pgsql-11/bin/pg_ctl -D /var/lib/pgsql/krogan0b start
waiting for server to start....2019-08-09 16:52:25.685 MSK [6673] LOG:  listening on IPv4 address "0.0.0.0", port 5434
2019-08-09 16:52:25.685 MSK [6673] LOG:  listening on IPv6 address "::", port 5434
2019-08-09 16:52:25.686 MSK [6673] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5434"
2019-08-09 16:52:25.686 MSK [6673] FATAL:  lock file "/tmp/.s.PGSQL.5434.lock" already exists
2019-08-09 16:52:25.686 MSK [6673] HINT:  Is another postmaster (PID 1415) using socket file "/tmp/.s.PGSQL.5434"?
2019-08-09 16:52:25.686 MSK [6673] LOG:  database system is shut down
 stopped waiting
pg_ctl: could not start server
Examine the log output.

So it will be good if postgres will check not only PID, but also that the different postgres is running on different or same port. And is there some workaround?

Reply | Threaded
Open this post in threaded view
|

Re: PostgreSQL lock file

Tom Lane-2
=?utf-8?B?0J7Qu9C10LMg0KHQsNC80L7QudC70L7Qsg==?= <[hidden email]> writes:
> I have a task to create HA PostgreSQL cross datacenter cluster. I created a test bed on VirtualBox and I simulate different failures and disastrous in loops. And got one of funny bug of the Postgresql (11.4). One of the clusters has two instances of PostgreSQL on port 5433 and 5434. After simulating unexpected power off of the node the lock files of PostgreSQL still exist in /tmp directory. And on recovery the first instance on port 5433 starts without problem, but the second instance sometimes not. The reason is that the PID in /tmp/.s.PGSQL.5434.lock sometimes point to the one of process of PostgereSQL on port 5433, for instance the logger.

Best practice for this sort of thing is considered to be to run the two
Postgres instances under different user IDs.  That avoids them thinking
that the other one's lock files are relevant, and it provides an extra
layer of security against possible conflicts between the two instances.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: PostgreSQL lock file

Олег Самойлов

> Best practice for this sort of thing is considered to be to run the two
> Postgres instances under different user IDs.  That avoids them thinking
> that the other one's lock files are relevant, and it provides an extra
> layer of security against possible conflicts between the two instances.
>
> regards, tom lane

Thanks, an interesting  notice. I already did changes in start script to check instance by pg_ctl status and delete lock file if status<>0. But your recommendation is looked better.