initdb - creating clusters

PG Doc comments form
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/10/creating-cluster.html
Description:

I'm searching for what a cluster is and how to create one.  The
documentation tells me to use initdb -D path/to/cluster.  I am told that
this is installed when I installed postgresql.  I try to run it with no
success.  Searching for an answer I find that I'm supposed to use
pg_createcluster because initdb is version dependent and not made
executable.  It seems like there is an omission here, as the documentation on
this page also mentions pg_ctl, which my system (Ubuntu 18.04) cannot find
with the 'which' command.  I can imagine that someone might argue
that this is system dependent - I don't know whether that is true or not.  I
have generally found the documentation excellent and certainly not inward
looking.  For instance the documentation on replication strategies includes
proprietary solutions.

I would suggest that you include a paragraph stating that various operating
systems use other commands to avoid version conflict and suggest the reader
search for '<user-system> pg_ctl'.  I can understand why you might not want
to link to external sites in your documentation.  (While writing this I have
searched to make sure I'm not writing rubbish and already understand that
pg_createcluster is a Debian solution/variant.)

Thanks for all you do

Gary
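
For readers on Debian or Ubuntu, the following is a minimal sketch of how one might look for initdb or pg_ctl when 'which' finds nothing.  It assumes the Debian-style layout in which each major version keeps its programs under /usr/lib/postgresql/<version>/bin.  The function name find_pg_tool and its optional second argument are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# find_pg_tool is a hypothetical helper, not part of PostgreSQL or Debian.
# Assumption: version-specific programs live in <root>/<version>/bin,
# where <root> defaults to /usr/lib/postgresql (the Debian/Ubuntu layout).

find_pg_tool() {
    tool=$1
    root=${2:-/usr/lib/postgresql}

    # Prefer whatever is already reachable through PATH.
    if command -v "$tool" >/dev/null 2>&1; then
        command -v "$tool"
        return 0
    fi

    # Otherwise list every per-version copy that is executable.
    status=1
    for candidate in "$root"/*/bin/"$tool"; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            status=0
        fi
    done
    return "$status"
}

# Example: find_pg_tool initdb || echo "initdb not found; check your package's docs"
```

On Debian-based systems the packaged wrappers (pg_createcluster, pg_lsclusters, pg_ctlcluster) are normally a better route than calling the per-version programs directly.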

Re: initdb - creating clusters

Laurenz Albe
On Thu, 2020-07-09 at 15:25 +0000, PG Doc comments form wrote:

> I'm searching for what a cluster is and how to create one.  The
> documentation tells me to use initdb -D path/to/cluster.  I am told that
> this is installed when I installed postgresql.  I try to run it with no
> success.  Searching for an answer I find that I'm supposed to use
> pg_createcluster because initdb is version dependent and not made
> executable.  It seems like there is an omission here, as the documentation on
> this page also mentions pg_ctl, which my system (Ubuntu 18.04) cannot find
> with the 'which' command.  I can imagine that someone might argue
> that this is system dependent - I don't know whether that is true or not.  I
> have generally found the documentation excellent and certainly not inward
> looking.  For instance the documentation on replication strategies includes
> proprietary solutions.
>
> I would suggest that you include a paragraph stating that various operating
> systems use other commands to avoid version conflict and suggest the reader
> search for '<user-system> pg_ctl'.  I can understand why you might not want
> to link to external sites in your documentation.  (While writing this I have
> searched to make sure I'm not writing rubbish and already understand that
> pg_createcluster is a Debian solution/variant.)
Something like the attached?

Yours,
Laurenz Albe

Attachment: 0001-Document-that-installation-packages-may-provide-othe.patch (1K)

Re: initdb - creating clusters

Tom Lane-2
Laurenz Albe <[hidden email]> writes:
> On Thu, 2020-07-09 at 15:25 +0000, PG Doc comments form wrote:
>> I would suggest that you include a paragraph stating that various operating
>> systems use other commands to avoid version conflict and suggest the reader
>> search for '<user-system> pg_ctl'.  I can understand why you might not want
>> to link to external sites in your documentation.  (While writing this I have
>> searched to make sure I'm not writing rubbish and already understand that
>> pg_createcluster is a Debian solution/variant.)

> Something like the attached?

I think the problem is more general than that.  The packager might
well provide a substitute or wrapper for initdb, but it's even more
likely that there's some other way to start and stop the server than
what we describe.

I experimented with putting a disclaimer at the very top of the chapter,
as attached.  I like that from a wording standpoint, but from a usability
standpoint it's still got the question of whether users will see it at
all.  (This is not helped any by the fact that our current docs toolchain
insists on putting a chapter TOC in front of the chapter head material,
so that what ought to be the most important information becomes something
you don't see at all unless you think to scroll down.)

Another approach would be to put something along this line at the heads
of each of the relevant sections, which'd be 18.1, 18.2, 18.3, 18.5,
and 18.6 by my count.  That seems very repetitive; but it would have
the advantage that people could hardly miss it.

I do agree that we ought to do something here.  I think only a small
minority of users build their own Postgres installations anymore.

                        regards, tom lane


diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 937bb2e8ac..8cfc266799 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -8,6 +8,21 @@
   and its interactions with the operating system.
  </para>
 
+ <para>
+  The discussions in this chapter assume that you are working with
+  an unmodified version of <productname>PostgreSQL</productname>,
+  for example one that you built from source according to the directions
+  in the preceding chapters.  If you are working with a pre-packaged
+  version of <productname>PostgreSQL</productname>, it is likely that
+  the packager has made special provisions for installing and starting
+  the database server according to your system's conventions.
+  For example, there may be special scripts for creating a database
+  cluster.  There almost certainly will be a mechanism for starting
+  the server, which you should prefer over constructing your own start
+  script as described in <xref linkend="server-start"/>.
+  Consult the package-level documentation for details.
+ </para>
+
  <sect1 id="postgres-user">
   <title>The <productname>PostgreSQL</productname> User Account</title>
 

Re: initdb - creating clusters

Daniel Gustafsson
> On 11 Jul 2020, at 23:36, Tom Lane <[hidden email]> wrote:

> +  For example, there may be special scripts for creating a database
> +  cluster.  There almost certainly will be a mechanism for starting
> +  the server,

Aren't we really talking about "running the server as a service" and not just
starting it?  Perhaps that's hair-splitting territory?


cheers ./daniel



Re: initdb - creating clusters

Tom Lane-2
Daniel Gustafsson <[hidden email]> writes:
>> On 11 Jul 2020, at 23:36, Tom Lane <[hidden email]> wrote:
>> +  For example, there may be special scripts for creating a database
>> +  cluster.  There almost certainly will be a mechanism for starting
>> +  the server,

> Aren't we really talking about "running the server as a service" and not just
> starting it?  Perhaps that's hair-splitting territory?

Yeah, but that terminology might itself be a bit platform-specific.
I considered giving specific examples, like systemd unit files,
but was afraid that that'd just confuse people on other platforms.
Not sure what the best way to approach this is.

                        regards, tom lane



Re: initdb - creating clusters

David G Johnston
In reply to this post by Tom Lane-2
On Sat, Jul 11, 2020 at 2:37 PM Tom Lane <[hidden email]> wrote:
> Another approach would be to put something along this line at the heads
> of each of the relevant sections, which'd be 18.1, 18.2, 18.3, 18.5,
> and 18.6 by my count.  That seems very repetitive; but it would have
> the advantage that people could hardly miss it.
>
> I do agree that we ought to do something here.  I think only a small
> minority of users build their own Postgres installations anymore.

Taken to an extreme...

Presently 16 and 17 explicitly describe source installation.  18 extends upon how to go about using a source installation but then also dives into topics that are relevant regardless of the architecture of the server binaries, and to some extent more accurately represents "server configuration" or maybe "server-os integration" if you want to keep that stuff in its own chapter.  So 18 gets split with 18 retaining the material that pertains to source installed binaries.  Then add a new chapter, 18.3 (overall numbering will shift), making mention of package installers and maybe even allow for some detail to be covered in that chapter before handing the user off to the distro's documentation.  Then 18.6 gets "server-os integration/configuration" while 19 remains "server configuration".

David J.

p.s. This thread started on the 9th and Laurenz responded on the 10th but this email (11th) from Tom is the first I've seen of this thread.  As I write this I haven't seen Daniel's response but I do see Tom's reply to Daniel's response.  This "I see responses but not originals" is quite common for me.  Using GMail.



Re: initdb - creating clusters

Daniel Gustafsson
In reply to this post by Tom Lane-2
> On 12 Jul 2020, at 00:24, Tom Lane <[hidden email]> wrote:
>
> Daniel Gustafsson <[hidden email]> writes:
>>> On 11 Jul 2020, at 23:36, Tom Lane <[hidden email]> wrote:
>>> +  For example, there may be special scripts for creating a database
>>> +  cluster.  There almost certainly will be a mechanism for starting
>>> +  the server,
>
>> Aren't we really talking about "running the server as a service" and not just
>> starting it?  Perhaps that's hair-splitting territory?
>
> Yeah, but that terminology might itself be a bit platform-specific.

I guess that's a good point.

> I considered giving specific examples, like systemd unit files,
> but was afraid that that'd just confuse people on other platforms.
> Not sure what the best way to approach this is.

Hmm, since the section is aimed at reducing confusion for inexperienced users I
agree that adding more detail might be detrimental to the point.

Re-reading it with bug reports etc. in mind, the only thing I would
propose is to expand the terminology for what a package is to
"pre-packaged or vendor-supplied".

cheers ./daniel


Re: initdb - creating clusters

Bruce Momjian
In reply to this post by PG Doc comments form
On Thu, Jul  9, 2020 at 03:25:14PM +0000, PG Doc comments form wrote:
> looking.  For instance the documentation on replication strategies includes
> proprietary solutions.

Uh, what proprietary solutions are listed in our documentation?

--
  Bruce Momjian  <[hidden email]>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: initdb - creating clusters

Daniel Gustafsson
> On 21 Jul 2020, at 02:25, Bruce Momjian <[hidden email]> wrote:
>
> On Thu, Jul  9, 2020 at 03:25:14PM +0000, PG Doc comments form wrote:
>> looking.  For instance the documentation on replication strategies includes
>> proprietary solutions.
>
> Uh, what proprietary solutions are listed in our documentation?

I think "proprietary" here implies outside-of-core, and we have a few of those
listed in the "Comparison of Different Solutions" section.

cheers ./daniel


Re: initdb - creating clusters

Bruce Momjian
On Tue, Jul 21, 2020 at 10:40:59AM +0200, Daniel Gustafsson wrote:

> > On 21 Jul 2020, at 02:25, Bruce Momjian <[hidden email]> wrote:
> >
> > On Thu, Jul  9, 2020 at 03:25:14PM +0000, PG Doc comments form wrote:
> >> looking.  For instance the documentation on replication strategies includes
> >> proprietary solutions.
> >
> > Uh, what proprietary solutions are listed in our documentation?
>
> I think "proprietary" here implies outside-of-core, and we have a few of those
> listed in the "Comparison of Different Solutions" section.

Oh, OK, those seem fine to me.

--
  Bruce Momjian  <[hidden email]>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: initdb - creating clusters

Daniel Gustafsson
> On 22 Jul 2020, at 18:34, Bruce Momjian <[hidden email]> wrote:
>
> On Tue, Jul 21, 2020 at 10:40:59AM +0200, Daniel Gustafsson wrote:
>>> On 21 Jul 2020, at 02:25, Bruce Momjian <[hidden email]> wrote:
>>>
>>> On Thu, Jul  9, 2020 at 03:25:14PM +0000, PG Doc comments form wrote:
>>>> looking.  For instance the documentation on replication strategies includes
>>>> proprietary solutions.
>>>
>>> Uh, what proprietary solutions are listed in our documentation?
>>
>> I think "proprietary" here implies outside-of-core, and we have a few of those
>> listed in the "Comparison of Different Solutions" section.
>
> Oh, OK, those seem fine to me.

I took the liberty of adding the proposed patch upthread to the next commitfest
to make sure it's not forgotten about, as I do think it will improve the docs.

cheers ./daniel


Re: initdb - creating clusters

Thomas Munro-5
On Wed, Aug 26, 2020 at 12:05 AM Daniel Gustafsson <[hidden email]> wrote:
> I took the liberty of adding the proposed patch upthread to the next commitfest
> to make sure it's not forgotten about, as I do think it will improve the docs.

+  The discussions in this chapter assume that you are working with
+  an unmodified version of <productname>PostgreSQL</productname>,
+  for example one that you built from source according to the directions
+  in the preceding chapters.  If you are working with a pre-packaged

Rather than "unmodified", would it be better to say something more
like  "without any extra supporting infrastructure"?

My point is that packagers don't typically *modify* PG, rather they
supply a bunch of wrappers (eg Debian postgresql-common), service
management scripting (eg systemd gloopity-gloop), post-install
scripting (eg Debian's policy of automatically starting any service
when you install it, implying that it must also run initdb for you).



Re: initdb - creating clusters

Tom Lane-2
Thomas Munro <[hidden email]> writes:
> +  The discussions in this chapter assume that you are working with
> +  an unmodified version of <productname>PostgreSQL</productname>,
> +  for example one that you built from source according to the directions
> +  in the preceding chapters.  If you are working with a pre-packaged

> Rather than "unmodified", would it be better to say something more
> like  "without any extra supporting infrastructure"?

So maybe "... you are working with plain
<productname>PostgreSQL</productname> without any additional
infrastructure, for example a copy that you built from source
according to the directions in the preceding chapters." ?

Do you have a feeling one way or the other about whether to repeat
some of this text in each of the relevant sub-sections?  I initially
didn't want to do that, but thinking about how people consume the
HTML docs, I'm afraid that anything not appearing on the same page
won't get seen.

                        regards, tom lane



Re: initdb - creating clusters

Jürgen Purtz
On 30.08.20 17:21, Tom Lane wrote:
> Do you have a feeling one way or the other about whether to repeat
> some of this text in each of the relevant sub-sections?  I initially
> didn't want to do that, but thinking about how people consume the
> HTML docs, I'm afraid that anything not appearing on the same page
> won't get seen.

If we do so but avoid redundant text parts, we can use the entity
mechanism or the more modern XInclude mechanism. The attached patch uses
both techniques in an example file: brin.sgml includes lorem.sgml two times.

(In both cases we should avoid files with multiple root elements, eg.
multiple <para> or <sect1> without a parent element, because this would
violate the well-formed-ness of the included XML document.)

--

J. Purtz


Attachment: 0001-lorem.patch (2K)

Re: initdb - creating clusters

Daniel Gustafsson
In reply to this post by Tom Lane-2
> On 30 Aug 2020, at 17:21, Tom Lane <[hidden email]> wrote:
>
> Thomas Munro <[hidden email]> writes:
>> +  The discussions in this chapter assume that you are working with
>> +  an unmodified version of <productname>PostgreSQL</productname>,
>> +  for example one that you built from source according to the directions
>> +  in the preceding chapters.  If you are working with a pre-packaged
>
>> Rather than "unmodified", would it be better to say something more
>> like  "without any extra supporting infrastructure"?
>
> So maybe "... you are working with plain
> <productname>PostgreSQL</productname> without any additional
> infrastructure, for example a copy that you built from source
> according to the directions in the preceding chapters." ?

That seems pretty clearly worded to me.

> Do you have a feeling one way or the other about whether to repeat
> some of this text in each of the relevant sub-sections?  I initially
> didn't want to do that, but thinking about how people consume the
> HTML docs, I'm afraid that anything not appearing on the same page
> won't get seen.

I think you're right here, duplicating the content is probably required for it
to be useful.

cheers ./daniel


Re: initdb - creating clusters

Tom Lane-2
Daniel Gustafsson <[hidden email]> writes:
> On 30 Aug 2020, at 17:21, Tom Lane <[hidden email]> wrote:
>> Do you have a feeling one way or the other about whether to repeat
>> some of this text in each of the relevant sub-sections?  I initially
>> didn't want to do that, but thinking about how people consume the
>> HTML docs, I'm afraid that anything not appearing on the same page
>> won't get seen.

> I think you're right here, duplicating the content is probably required for it
> to be useful.

I took a stab at doing it that way, as attached.  (I couldn't resist
the temptation to do some minor editing on adjacent material, too.)

                        regards, tom lane


diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 6cda39f3ab..f584231935 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -4,10 +4,22 @@
  <title>Server Setup and Operation</title>
 
  <para>
-  This chapter discusses how to set up and run the database server
+  This chapter discusses how to set up and run the database server,
   and its interactions with the operating system.
  </para>
 
+ <para>
+  The directions in this chapter assume that you are working with
+  plain <productname>PostgreSQL</productname> without any additional
+  infrastructure, for example a copy that you built from source
+  according to the directions in the preceding chapters.
+  If you are working with a pre-packaged or vendor-supplied
+  version of <productname>PostgreSQL</productname>, it is likely that
+  the packager has made special provisions for installing and starting
+  the database server according to your system's conventions.
+  Consult the package-level documentation for details.
+ </para>
+
  <sect1 id="postgres-user">
   <title>The <productname>PostgreSQL</productname> User Account</title>
 
@@ -21,9 +33,15 @@
    separate user account. This user account should only own the data
    that is managed by the server, and should not be shared with other
    daemons. (For example, using the user <literal>nobody</literal> is a bad
-   idea.) It is not advisable to install executables owned by this
-   user because compromised systems could then modify their own
-   binaries.
+   idea.) In particular, it is advisable that this user account not own
+   the <productname>PostgreSQL</productname> executable files, to ensure
+   that a compromised server process could not modify those executables.
+  </para>
+
+  <para>
+   Pre-packaged versions of <productname>PostgreSQL</productname> will
+   typically create a suitable user account automatically during
+   package installation.
   </para>
 
   <para>
@@ -71,11 +89,26 @@
    completely up to you where you choose to store your data.  There is no
    default, although locations such as
    <filename>/usr/local/pgsql/data</filename> or
-   <filename>/var/lib/pgsql/data</filename> are popular. To initialize a
-   database cluster, use the command <xref
-   linkend="app-initdb"/>,<indexterm><primary>initdb</primary></indexterm> which is
-   installed with <productname>PostgreSQL</productname>. The desired
-   file system location of your database cluster is indicated by the
+   <filename>/var/lib/pgsql/data</filename> are popular.
+   The data directory must be initialized before being used, using the program
+   <xref linkend="app-initdb"/><indexterm><primary>initdb</primary></indexterm>
+   which is installed with <productname>PostgreSQL</productname>.
+  </para>
+
+  <para>
+   If you are using a pre-packaged version
+   of <productname>PostgreSQL</productname>, it may well have a specific
+   convention for where to place the data directory, and it may also
+   provide a script for creating the data directory.  In that case you
+   should use that script in preference to
+   running <command>initdb</command> directly.
+   Consult the package-level documentation for details.
+  </para>
+
+  <para>
+   To initialize a database cluster manually,
+   run <command>initdb</command> and specify the desired
+   file system location of the database cluster with the
    <option>-D</option> option, for example:
 <screen>
 <prompt>$</prompt> <userinput>initdb -D /usr/local/pgsql/data</userinput>
@@ -309,10 +342,22 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
    Before anyone can access the database, you must start the database
    server. The database server program is called
    <command>postgres</command>.<indexterm><primary>postgres</primary></indexterm>
-   The <command>postgres</command> program must know where to
-   find the data it is supposed to use. This is done with the
-   <option>-D</option> option. Thus, the simplest way to start the
-   server is:
+  </para>
+
+  <para>
+   If you are using a pre-packaged version
+   of <productname>PostgreSQL</productname>, it almost certainly includes
+   provisions for running the server as a background task according to the
+   conventions of your operating system.  Using the package's
+   infrastructure to start the server will be much less work than figuring
+   out how to do this yourself.  Consult the package-level documentation
+   for details.
+  </para>
+
+  <para>
+   The bare-bones way to start the server manually is just to invoke
+   <command>postgres</command> directly, specifying the location of the
+   data directory with the <option>-D</option> option, for example:
 <screen>
 $ <userinput>postgres -D /usr/local/pgsql/data</userinput>
 </screen>
@@ -364,7 +409,7 @@ pg_ctl start -l logfile
      <secondary>starting the server during</secondary>
    </indexterm>
    Autostart scripts are operating-system-specific.
-   There are a few distributed with
+   There are a few example scripts distributed with
    <productname>PostgreSQL</productname> in the
    <filename>contrib/start-scripts</filename> directory. Installing one will require
    root privileges.
@@ -1481,9 +1526,23 @@ $ <userinput>cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages</userinp
   </indexterm>
 
   <para>
-   There are several ways to shut down the database server. You control
-   the type of shutdown by sending different signals to the supervisor
+   There are several ways to shut down the database server.
+   Under the hood, they all reduce to sending a signal to the supervisor
    <command>postgres</command> process.
+  </para>
+
+  <para>
+   If you are using a pre-packaged version
+   of <productname>PostgreSQL</productname>, and you used its provisions
+   for starting the server, then you should also use its provisions for
+   stopping the server.  Consult the package-level documentation for
+   details.
+  </para>
+
+  <para>
+   When managing the server directly, you can control the type of shutdown
+   by sending different signals to the <command>postgres</command>
+   process:
 
    <variablelist>
     <varlistentry>
@@ -1620,6 +1679,10 @@ $ <userinput>kill -INT `head -1 /usr/local/pgsql/data/postmaster.pid`</userinput
    is to dump and reload the database, though this can be slow.  A
    faster method is <xref linkend="pgupgrade"/>.  Replication methods are
    also available, as discussed below.
+   (If you are using a pre-packaged version
+   of <productname>PostgreSQL</productname>, it may provide scripts to
+   assist with major version upgrades.  Consult the package-level
+   documentation for details.)
   </para>
 
   <para>
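
As a rough illustration of the manual shutdown path that the patch above describes, here is a small shell sketch.  The signal mapping (SIGTERM for a smart shutdown, SIGINT for fast, SIGQUIT for immediate) follows the PostgreSQL documentation on shutting down the server.  The function names shutdown_signal, server_pid, and stop_server are hypothetical, and the sketch assumes the server PID is on the first line of postmaster.pid in the data directory, as in the kill example quoted in the patch.

```shell
#!/bin/sh
# Hypothetical helpers for stopping a manually managed server.
# They are illustrative only and are not shipped with PostgreSQL.

# Map a shutdown mode to the signal name the postgres process expects.
shutdown_signal() {
    case $1 in
        smart)     echo TERM ;;
        fast)      echo INT ;;
        immediate) echo QUIT ;;
        *)         echo "unknown shutdown mode: $1" >&2; return 1 ;;
    esac
}

# Print the server PID, taken from the first line of postmaster.pid.
server_pid() {
    head -n 1 "$1/postmaster.pid"
}

# Send the signal for the requested mode (default: fast) to the server.
stop_server() {
    datadir=$1
    mode=${2:-fast}
    sig=$(shutdown_signal "$mode") || return 1
    pid=$(server_pid "$datadir") || return 1
    kill -s "$sig" "$pid"
}

# Example: stop_server /usr/local/pgsql/data fast
```

If the server was started through a package's own mechanism, that mechanism should be used to stop it as well; pg_ctl stop with its -m option covers the same three modes for a manually started server.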

Re: initdb - creating clusters

Daniel Gustafsson
> On 2 Sep 2020, at 18:43, Tom Lane <[hidden email]> wrote:

> I took a stab at doing it that way, as attached.  (I couldn't resist
> the temptation to do some minor editing on adjacent material, too.)

LGTM.  I didn't try to build the docs with this applied, but reading it I can't
see anything odd about the markup.

cheers ./daniel


Re: initdb - creating clusters

Tom Lane-2
Daniel Gustafsson <[hidden email]> writes:
>> On 2 Sep 2020, at 18:43, Tom Lane <[hidden email]> wrote:
>> I took a stab at doing it that way, as attached.  (I couldn't resist
>> the temptation to do some minor editing on adjacent material, too.)

> LGTM.  I didn't try to build the docs with this applied, but reading it I can't
> see anything odd about the markup.

Hearing no other comments, pushed.

                        regards, tom lane