Probable documentation errors or improvements

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Probable documentation errors or improvements

Yaroslav
Disclaimer: I'm not a native speaker, so not sure those are actually
incorrect, and can't offer non-trivial suggestions.

General ones:
. "boolean" vs "Boolean" --- usage seems to be inconsistent, even in the
same context.

. Transaction isolation levels are sometimes called "transaction isolation
modes", is that correct?

. In https://www.postgresql.org/docs/current/tableam.html, links to source
code
  are also hyperlinks into git, like (from SGML source):

  For details, please refer to the <ulink
url="https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/include/access/tableam.h;hb=HEAD">
  <filename>src/include/access/tableam.h</filename></ulink> file.

  Perhaps, other similar links in documentation should also be made into
  hyperlinks?

--------------------------------------------------------------------------------
Specific ones:

--
https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS

4.1.2.4 Dollar-quoted String Constants

While the standard syntax for specifying string constants is usually
convenient, it can be difficult to understand when the desired string
contains
many single quotes or backslashes, since each of those must be doubled.
  -- Not so for backslashes (perhaps, this sentence is from
     pre-standard_conforming_strings ages).

----- <in the same section> ------

Notice that inside the dollar-quoted string, single quotes can be used
without
needing to be escaped. Indeed, no characters inside a dollar-quoted string
are
ever escaped: the string content is always written literally. Backslashes
are
not special, and neither are dollar signs, unless they are part of a
sequence
matching the opening tag.
  -- Backslashes, again. Though here in may be justified, not sure.

----- <in the same section> ------

$function$
BEGIN
    RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);
END;
$function$
  -- While it's understandable what the example is trying to demonstrate,
     single-quoted string would work here, too (so, no actual advantage, in
     this case).

----- <in the same section> ------

With single-quote syntax, each backslash in the above example would have to
be
written as four backslashes, which would be reduced to two backslashes in
parsing the original string constant, and then to one when the inner string
constant is re-parsed during function execution.
  -- Nothing needs to be changed about backslashes, yet again.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/ddl-basics.html
5.1. Table Basics

A table in a relational database is much like a table on paper: It consists
of
rows and columns.
  -- Why "I" in It after ":" is capitalized?

----- <in the same section> ------
Some of the frequently used data types are integer for whole numbers,
numeric
for possibly fractional numbers, text for character strings, date for dates,
time for time-of-day values, and timestamp for values containing both date
and
time.
  -- Perhaps, add (or replace with) timestamptz for storing moments in time
(or
     something like that)?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-EXCLUSION

5.4.6. Exclusion Constraints

Exclusion constraints ensure that if any two rows are compared on the
specified
columns or expressions using the specified operators, at least one of these
operator comparisons will return false or null. The syntax is:

CREATE TABLE circles (
    c circle,
    EXCLUDE USING gist (c WITH &&)
);
  -- Not only the definition is hard to grasp, but the example doesn't
clarify
     a lot, as it's not explained what it actually achieves (especially
given
     geometric data types and operators are described several chapters
later).

--------------------------------------------------------------------------------
https://www.postgresql.org/docs/current/ddl-alter.html#DDL-ALTER-REMOVING-A-CONSTRAINT

5.6.4. Removing a Constraint

(If you are dealing with a generated constraint name like $2, don't forget
that
you'll need to double-quote it to make it a valid identifier.)
  -- I don't think current releases generate names like that anymore?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/ddl-rowsecurity.html

5.8. Row Security Policies
  -- A general note: interaction of row security with search_path is not
     documented at all, but it may be important is some cases, like using
     functions in row security policies.

----- <in the same section> ------

We can then see that an administrator connecting over a network will not see
any records, due to the restrictive policy:

=> SELECT current_user;
 current_user
--------------
 admin
(1 row)

=> select inet_client_addr();
 inet_client_addr
------------------
 127.0.0.1
(1 row)

=> SELECT current_user;
 current_user
--------------
 admin
  -- "SELECT current_user;" twice (I guess first one should have been psql
-h).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/dml-update.html

6.2. Updating Data

Only if you have a primary key in the table (independent of whether you
declared it or not) can you reliably address individual rows by choosing a
condition that matches the primary key.
  -- If it's about relational theory here, it's candidate key, actually (and
     there could be more that one).

Graphical database access tools rely on this fact to allow you to update
rows
individually.
  -- Is it certain there's no tool out there that uses "WHERE CURRENT OF
cursor"
     for that, for instance? So, perhaps, change it to "usually rely" or
"Many
     graphical..."?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/queries-table-expressions.html

7.2. Table Expressions

Trivial table expressions simply refer to a table on disk, a so-called base
table, but more complex expressions can be used to modify or combine base
tables in various ways.
  -- Formally, referenced "tables" may be [materialized] views, etc.
     And the same, below:

A table reference can be a table name (possibly schema-qualified), or a
derived
table such as a subquery, a JOIN construct, or complex combinations of
these.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-FROM

7.2.1.5. LATERAL Subqueries
  -- This section is yet another place where SQL style could use
unification.

For example, if get_product_names() returns the names of products made by a
manufacturer, but some manufacturers in our table currently produce no
products, we could find out which ones those are like this:

SELECT m.name
  FROM manufacturers m LEFT JOIN LATERAL get_product_names(m.id) pname ON
true
WHERE pname IS NULL;
  -- Is it really a good example? A "groupwise maximum" would be much better
     one, IMHO. Besides, the query could have been (more naturally) written
as:

     SELECT m.name
       FROM manufacturers
      WHERE EXISTS (
            SELECT 1
              FROM get_product_names(m.id)
            )

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/datatype-character.html

8.3. Character Types

SQL defines two primary character types: character varying(n) and
character(n),
where n is a positive integer.
  -- SQL standard, perhaps?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/functions-math.html#FUNCTIONS-MATH-FUNC-TABLE

Table 9.5. Mathematical Functions

In div(y numeric, x numeric) example:
div(9,4)
  -- No space after the comma.

----- <in the same section> ------

mod(9,4)
  -- The same as above.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE

Table 9.55. General-Purpose Aggregate Functions

bit_and -- the bitwise AND of all non-null input values, or null if none
bit_or -- the bitwise OR of all non-null input values, or null if none
bool_and -- true if all input values are true, otherwise false
  -- Descriptions of NULL handling are inconsistent: either the one in bit_*
     should be removed, or it should be added everywhere except in
non-strict
     functions?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/functions-aggregate.html
9.20. Aggregate Functions

Users accustomed to working with other SQL database management systems might
be
disappointed by the performance of the count aggregate when it is applied to
the
entire table.
  -- Which modern DBMS(s) is that about? Perhaps, the note could be removed?

----- <in the same section> ------

This ordering is unspecified by default, but can be controlled by writing an
ORDER BY clause within the aggregate call, as shown in Section 4.2.7.
Alternatively, supplying the input values from a sorted subquery will
usually
work. For example:

SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;

Beware that this approach can fail if the outer query level contains
additional
processing, such as a join, because that might cause the subquery's output
to be
reordered before the aggregate is computed.
  -- Perhaps, as the reliable way exists, the "dangerous" one could be
removed?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/functions-srf.html

9.24. Set Returning Functions

In the example:
-- basic usage
SELECT generate_subscripts('{NULL,1,NULL,2}'::int[], 1) AS s;
  -- Why not:
     SELECT n
       FROM generate_subscripts('{NULL,1,NULL,2}'::int[], 1) AS s(n);

----- <in the same section> ------

SELECT a AS array, s AS subscript, a[s] AS value
FROM (SELECT generate_subscripts(a, 1) AS s, a FROM arrays) foo;
  -- IMO, bad style: SRF call in the select list + implicit cross join + no
     AS before "foo", but used in every field alias; besides, "array" and
     "value" are keywords.

----- <in the same section> ------

CREATE OR REPLACE FUNCTION unnest2(anyarray)
  -- The same bad style as above.

----- <in the same section> ------

SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
  -- Nobody but a superuser can try this example (besides, the output is
quite
     long). Why not come up with a better one?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-INFO-ACCESS-TABLE

After Table 9.64. Access Privilege Inquiry Functions, in has_table_privilege
description:

(Thus, there are actually six variants of has_table_privilege, which can be
distinguished by the number and types of their arguments.)
  -- This note (describing an implementation detail) is present here, but
not
     for the other has_* functions. Perhaps, it could be removed or
generalized?

----- <in the same section> ------

The desired access privilege type is specified by a text string, which must
evaluate to one of the values SELECT, INSERT, UPDATE, DELETE, TRUNCATE,
REFERENCES, or TRIGGER. Optionally, WITH GRANT OPTION can be added to a
privilege type to test whether the privilege is held with grant option.
Also,
multiple privilege types can be listed separated by commas, in which case
the
result will be true if any of the listed privileges is held.
  -- The third sentence contradicts the first one ("must evaluate to one" vs
"
     "multiple privilege types can be listed")?

----- <in the same section> ------

has_sequence_privilege checks whether a user can access a sequence in a
particular way. The possibilities for its arguments are analogous to
has_table_privilege. The desired access privilege type must evaluate to one
of
USAGE, SELECT, or UPDATE.
  -- It also supports combinations of privileges. Besides, the support of
     optional WITH GRANT OPTION (as well as in the other similar functions)
     is not entirely clear from the description, IMHO.

----- <in the same section> ------
  -- A note on style here --- in that section, there are several paragraphs
of
     the common form:

"XXX checks whether a user can access a YYY in a particular way. Its
argument
possibilities are analogous to has_table_privilege. The desired access
privilege
type must evaluate to ZZZ."
  -- Could all of these be rewritten in a better way?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-TXID-SNAPSHOT-PARTS

Table 9.75. Snapshot Components

In the xip_list description:

The list includes only those active txids between xmin and xmax;
  -- Is this correct English?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-BACKUP

9.26.3. Backup Control Functions

pg_current_wal_lsn displays the current write-ahead log write location in
the
same format used by the above functions.
  -- "in the same format _as_ used", perhaps?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/functions-event-triggers.html#PG-EVENT-TRIGGER-SQL-DROP-FUNCTIONS

9.28.2. Processing Objects Dropped by a DDL Command

In example of "CREATE FUNCTION test_event_trigger_for_drops()":
    END LOOP;
END
$$;
  -- Style nitpick: no ";" after the last END, unlike in other examples.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/typeconv-func.html

Example 10.7. Variadic Function Resolution

CREATE FUNCTION public.variadic_example(VARIADIC numeric[]) RETURNS int
  LANGUAGE sql AS 'SELECT 1';
CREATE FUNCTION
  -- "CREATE FUNCTION" is both input SQL and psql output, hardly
distinguishable.
     Besides, the function body is not dollar-quoted (bad style). The same
goes
     for the following functions in the example.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/indexes-bitmap-scans.html

11.5. Combining Multiple Indexes

For example, given an index on (a, b) a query condition like WHERE a = 5 AND
b =
6 could use the index, but a query like WHERE a = 5 OR b = 6 could not
directly
use the index.
  -- "could not directly use" is vague, IMO (i.e. PostgreSQL actually can
and
      sometimes will use an index like that for the query).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/indexes-partial.html

11.8. Partial Indexes

Matching takes place at query planning time, not at run time. As a result,
parameterized query clauses do not work with a partial index. For example a
prepared query with a parameter might specify “x < ?” which will never imply
“x < 2” for all possible values of the parameter.
  -- First of all, parameters syntax in postgres is "$n", not "?"; then,
partial
     indexes work just fine with parameterization or prepared statements?
E.g.:
         CREATE TABLE a(b int, c int);
         CREATE INDEX ON a(b) WHERE c > 2;
         PREPARE foo(int) AS SELECT b FROM a WHERE c > $1 AND b = 15;
         EXPLAIN EXECUTE foo(3); -- Can use the index

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/textsearch-intro.html

12.1. Introduction

Textual search operators have existed in databases for years. PostgreSQL has
~,
~*, LIKE, and ILIKE operators for textual data types, but they lack many
essential properties required by modern information systems:
  -- Honestly, the sentence reads like an ad to me. ;)

It is possible to use OR to search for multiple derived forms, but this is
tedious and error-prone (some words can have several thousand derivatives).
  -- It's also possible to use "|" in regexps for the purpose (and much
easier
     to list thousands of derivatives this way).

They tend to be slow because there is no index support, so they must process
all
documents for every search.
  -- There's pg_trgm, now.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/textsearch-intro.html#TEXTSEARCH-MATCHING

12.1.2. Basic Text Matching

There are functions to_tsquery, plainto_tsquery, and phraseto_tsquery that
are
helpful in converting user-written text into a proper tsquery, primarily by
normalizing words appearing in the text.
  -- There's also websearch_to_tsquery now, is it relevant here?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/textsearch-tables.html

12.2.1. Searching a Table

A more complex example is to select the ten most recent documents that
contain
create and table in the title or body:

SELECT title
FROM pgweb
WHERE to_tsvector(title || ' ' || body) @@ to_tsquery('create & table')
ORDER BY last_mod_date DESC
LIMIT 10;
  -- It's rather create and table in the title and/or body?
     E.g. it matches with:
       WITH pgweb(title, body, last_mod_date) AS (VALUES ('create', 'table',
'2020-01-01'))

----- <in the same section> ------

For clarity we omitted the coalesce function calls which would be needed to
find
rows that contain NULL in one of the two fields.
  -- "title || ' ' || body" above could be replaced with, say:
     "concat_ws(' ', title, body)" to avoid the problem, and this paragraph
     removed.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES

12.3.2. Parsing Queries

SELECT websearch_to_tsquery('english', 'The fat rats');
 websearch_to_tsquery
----------------------
 'fat' & 'rat'
(1 row)
-- All the examples of websearch_to_tsquery have "(1 row)" after each one,
while
   the ones for previous functions lack it.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/textsearch-features.html#TEXTSEARCH-MANIPULATE-TSQUERY

12.4.2. Manipulating Queries

tsquery && tsquery
    Returns the AND-combination of the two given queries.
      -- Is this "given" a good English (there are several similar sentences
         below)?

----- <in the same section> ------

SELECT to_tsquery('fat') <-> to_tsquery('cat | rat');
             ?column?
-----------------------------------
 'fat' <-> 'cat' | 'fat' <-> 'rat'
  -- This returns "'fat' <-> ( 'cat' | 'rat' )" now.

Returns a query that searches for a match to the first given query followed
by a
match to the second given query at a distance of at distance lexemes, using
the
<N> tsquery operator.
  -- Seems like "at a distance of at distance lexemes" has an extra "at"?

----- <in the same section> ------

SELECT numnode(plainto_tsquery('the any'));
NOTICE:  query contains only stopword(s) or doesn't contain lexeme(s),
ignored
 numnode
---------
       0
  -- The actual NOTICE spelling now is:
     "text-search query contains only stop words or doesn't contain lexemes,
     ignored"

----- <in the same section> ------

SELECT querytree(to_tsquery('!defined'));
 querytree
-----------

  -- This returns "T" now.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/textsearch-dictionaries.html#TEXTSEARCH-SYNONYM-DICTIONARY

12.6.3. Synonym Dictionary

mydb=# CREATE TEXT SEARCH DICTIONARY syn (template=synonym,
synonyms='synonym_sample');
  -- Why is this "mydb=#" prefix in all the examples here (previous ones
don't
     have it)?

SELECT ts_lexize('syn','indices');
  -- No space after the comma in the function argument list (there are
several
     like this below).

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/textsearch-dictionaries.html#TEXTSEARCH-THESAURUS

12.6.4.1. Thesaurus Configuration

CREATE TEXT SEARCH DICTIONARY thesaurus_simple (
    TEMPLATE = thesaurus,
    DictFile = mythesaurus,
    Dictionary = pg_catalog.english_stem
);
  -- Why TEMPLATE is spelled all-CAPS here, while DictFile is camel-case
(seems
     inconsistent style)?

----- <in the same section> ------

Thesauruses are used during indexing so any change in the thesaurus
dictionary's
parameters requires reindexing. For most other dictionary types, small
changes
such as adding or removing stopwords does not force reindexing.
  -- Is it "stop words" or "stopwords", or "stop-words"? Spelling seems to
be
     inconsistent in the documentation (and error messages).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/textsearch-indexes.html

12.9. GIN and GiST Index Types

A GiST index can be covering, i.e. use the INCLUDE clause. Included columns
can
have data types without any GiST operator class. Included attributes will be
stored uncompressed.
  -- Is this really relevant here (it's the same for any GiST index)?

----- <in the same section> ------

Since random access to table records is slow, this limits the usefulness of
GiST
indexes.
  -- Or it might not be --- if it's SSD, or the index pages are in RAM.

----- <in the same section> ------

Note that GIN index build time can often be improved by increasing
maintenance_work_mem, while GiST index build time is not sensitive to that
parameter.
  -- Buffered GiST build does depend on maintenance_work_mem value.

----- <in the same section> ------

Partitioning can be done at the database level using table inheritance, or
by
distributing documents over servers and collecting external search results,
e.g.
via Foreign Data access.
  -- Why is it "table inheritance" here, not declarative partitioning?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/textsearch-psql.html

12.10. psql Support

-- All of this is documented in psql reference, and examples here are
lengthy.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/transaction-iso.html#XACT-REPEATABLE-READ

13.2.2. Repeatable Read Isolation Level

The Repeatable Read mode provides a rigorous guarantee that each transaction
sees a completely stable view of the database. However, this view will not
necessarily always be consistent with some serial (one at a time) execution
of
concurrent transactions of the same level. For example, even a read only
transaction at this level may see a control record updated to show that a
batch
has been completed but not see one of the detail records which is logically
part
of the batch because it read an earlier revision of the control record.
  -- The example is very vague by itself, IMO. I suppose it's a reference to
     https://wiki.postgresql.org/wiki/SSI#Deposit_Report , but is it really
     expected a reader will understand that from the above paragraph?!

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/transaction-iso.html#XACT-SERIALIZABLE

13.2.3. Serializable Isolation Level

Suppose that serializable transaction A computes:

<skip>

Concurrently, serializable transaction B computes:

<skip>

but since there is no serial order of execution consistent with the result,
using Serializable transactions will allow one transaction to commit and
will
roll the other back with this message:
  -- Inconsistent capitalization of Serializable?

----- <in the same section> ------

If you explicitly request a SERIALIZABLE READ ONLY DEFERRABLE transaction,
it
will block until it can establish this fact.
  -- Actually, the first query in such a transaction will block. Perhaps, it
     should be clarified?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-ROWS

13.3.2. Row-Level Locks

The FOR UPDATE lock mode is also acquired by any DELETE on a row, and also
by an
UPDATE that modifies the values on certain columns.
  -- on certain -> of certain?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/explicit-locking.html#ADVISORY-LOCKS

13.3.5. Advisory Locks

Like all locks in PostgreSQL, a complete list of advisory locks currently
held
by any session can be found in the pg_locks system view.
   -- Row-level locks are not there --- perhaps, "Like all locks ..." part
is
      just unnecessary?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/using-explain.html

14.1. Using EXPLAIN

Examples in this section are drawn from the regression test database after
doing
a VACUUM ANALYZE, using 9.3 development sources.
  -- Which is EOL by now. Isn't it a time to update?

----- <in the same section> ------

You should be able to get similar results if you try the examples yourself,
but
your estimated costs and row counts might vary slightly because ANALYZE's
statistics are random samples rather than exact, and because costs are
inherently somewhat platform-dependent.
  -- What does it mean (that some planner cost constants are
platform-dependent)?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/using-explain.html#USING-EXPLAIN-BASICS

14.1.1. EXPLAIN Basics

(Every correct plan will output the same row set, we trust.)
  -- Which is outright wrong:
      SELECT *
        FROM any_table
       WHERE random() < 0.1;

----- <in the same section> ------
Returning to our example:
  -- Is it really worth repeating it in full (it's not that far above)?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/using-explain.html#USING-EXPLAIN-ANALYZE

14.1.2. EXPLAIN ANALYZE

The numbers provided by BUFFERS help to identify which parts of the query
are
the most I/O-intensive.
  -- No mention of what these numbers actually *are* (pages / blocks).

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/planner-stats.html#PLANNER-STATS-EXTENDED

14.2.2.1. Functional Dependencies

In a fully normalized database, functional dependencies should exist only on
primary keys and superkeys. However, in practice many data sets are not
fully
normalized for various reasons; intentional denormalization for performance
reasons is a common example.
   -- Confusing terminology: IIRC, modern normalization theory doesn't use
PKs
      (but "candidate keys"). Besides, "denormalization" is confusing by
itself.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/parallel-safety.html

15.4.1. Parallel Labeling for Functions and Aggregates

The following operations are always parallel restricted.
  -- Should end with ":" instead of "."?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/install-procedure.html

16.4. Installation Procedure

The standard installation provides all the header files needed for client
application development as well as for server-side program development, such
as
custom functions or data types written in C. (Prior to PostgreSQL 8.0, a
separate make install-all-headers command was needed for the latter, but
this
step has been folded into the standard install.)
  -- I guess the last sentence is not relevant anymore.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/installation-platform-notes.html#INSTALLATION-NOTES-CYGWIN

16.7.2. Cygwin

The adduser command is not supported; use the appropriate user management
application on Windows NT, 2000, or XP. Otherwise, skip this step.

The su command is not supported; use ssh to simulate su on Windows NT, 2000,
or
XP. Otherwise, skip this step.
  -- AFAIK, all these operating systems are dead.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/install-windows-full.html#id-1.6.4.8.8

17.1.1. Requirements

ActiveState TCL

  Required for building PL/Tcl (Note: version 8.4 is required, the free
Standard
  Distribution is sufficient).
    -- version 8.4 or later, perhaps?

----- <in the same section> ------
Note
   The obsolete winflex binaries distributed on the PostgreSQL FTP site and
   referenced in older documentation will fail with “flex: fatal internal
error,
   exec failed” on 64-bit Windows hosts. Use Flex from MSYS instead.
     -- Is this note still relevant?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/creating-cluster.html

18.2. Creating a Database Cluster

After initialization, a database cluster will contain a database named
postgres,
which is meant as a default database for use by utilities, users and third
party
applications. The database server itself does not require the postgres
database
to exist, but many external utility programs assume it exists. Another
database
created within each cluster during initialization is called template1.
  -- Why template0 is not mentioned at all in this subchapter?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/kernel-resources.html#SYSVIPC

18.4.1. Shared Memory and Semaphores

The maximum number of semaphores in the system is set by SEMMNS, which
consequently must be at least as high as max_connections plus
autovacuum_max_workers plus max_wal_senders, plus max_worker_processes, plus
one
extra for each 16 allowed connections plus workers (see the formula in Table
18.1).
  -- Why duplicate formulas (in the table and here)?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/upgrading.html#UPGRADING-VIA-PGDUMPALL

18.6.1. Upgrading Data via pg_dumpall

One upgrade method is to dump data from one major version of PostgreSQL and
reload it in another — to do this, you must use a logical backup tool like
pg_dumpall; file system level backup methods will not work.
  -- IMO, it would be nice to call dumps dumps, not "logical backups".

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/runtime-config-developer.html

19.17. Developer Options

trace_lock_oidmin (integer)

 If set, do not trace locks for tables below this OID. (use to avoid output
 on system tables)
   -- Incorrect "." position, should be moved to the end of the sentence.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/role-attributes.html

21.2. Role Attributes

(except for superusers, since those bypass all permission checks)
  -- Repeated several times in this subchapter (could it be rephrased?).

----- <in the same section> ------

The password and md5 authentication methods make use of passwords. Database
passwords are separate from operating system passwords.
  -- What about SCRAM (scram-sha-256)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/manage-ag-dropdb.html

22.5. Destroying a Database

template1 would be the only option for dropping the last user database of a
given cluster.
  -- What about using "postgres" (or even template0, if one allows
connections
     to it) database?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/manage-ag-tablespaces.html

22.6. Tablespaces

Note that in PostgreSQL 9.1 and earlier you will also need to update the
pg_tablespace catalog with the new locations. (If you do not, pg_dump will
continue to output the old tablespace locations.)
  -- Outdated comment (9.1 is EOL).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/maintenance.html

Chapter 24. Routine Database Maintenance Tasks

PostgreSQL, like any database software, requires that certain tasks be
performed
regularly to achieve optimum performance.
  -- Why not remove "like any database software"? I really doubt anyone
checked
     every "database software" in the world (and, say, sqlite comes to
mind).
     Besides, how is that relevant to the point made here?

----- <in the same section> ------

The other main category of maintenance task is periodic “vacuuming” of the
database.
  -- "of maintenance tasks" (plural), perhaps?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

24.1.5. Preventing Transaction ID Wraparound Failures

vacuum_freeze_min_age controls how old an XID value has to be before rows
bearing that XID will be frozen. Increasing this setting may avoid
unnecessary
work if the rows that would otherwise be frozen will soon be modified again,
but
decreasing this setting increases the number of transactions that can elapse
before the table must be vacuumed again.
  -- Isn't it "decreasing this setting _decreases_ the number of
     transactions..."?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/backup.html

Chapter 25. Backup and Restore

As with everything that contains valuable data, PostgreSQL databases should
be
backed up regularly.
  -- IMNSHO, the whole point of the chapter is backwards. If the data is
     valuable, one wants to prevent (or recover) from its loss. Backups are
just
     a *mean* for that, while data loss avoidance is the *goal*. Enters
disaster
     recovery (and more broadly business continuity). Two metrics are
     fundamental to that: RPO and RTO. Using pg_dump, one cannot even get
     predictable values for those in any non-trivial case. Therefore, "25.1.
SQL
     Dump" should not be first and foremost in the chapter (if mentioned at
     all); and the whole chapter better be written with the above in mind.
At
     least, it would be nice if dumps were called dumps, not backups, as
well as
     the process of making those was called dumping in "25.1. SQL Dump".

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-BASE-BACKUP

25.3.2. Making a Base Backup

To aid you in doing this, the base backup process creates a backup history
file
that is immediately stored into the WAL archive area.
  -- The sentence isn't clear --- what's meant by "WAL archive area" here?

----- <in the same section> ------

If you used the label to identify the associated dump file, then the
archived
history file is enough to tell you which dump file to restore.
  -- What is meant by "dump file" here?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP

25.3.3.1. Making a Non-Exclusive Low-Level Backup

On a primary, if archive_mode is enabled and the wait_for_archive parameter
is
true, pg_stop_backup does not return until the last segment has been
archived.
  -- It's not entirely clear that wait_for_archive is pg_stop_backup
parameter,
     and even if yes, which one is it (first or second).

----- <in the same section> ------

25.3.3.3. Backing Up the Data Directory

This is easy to arrange if pg_wal/ is a symbolic link pointing to someplace
outside the cluster directory, which is a common setup anyway for
performance
reasons.
  -- Perhaps, this "common" setup should be also mentioned / recommended
     elsewhere, then?

----- <in the same section> ------

In case of confusion it is therefore possible to look inside a backup file
and
determine exactly which backup session the dump file came from.
  -- "inside a backup _label_ file", perhaps? And what's meant by "dump"
here?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-PITR-RECOVERY

25.3.4. Recovering Using a Continuous Archive Backup

It is important that the command return nonzero exit status on failure.
  -- "returns", perhaps?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-TIPS

25.3.6.1. Standalone Hot Backups

With this preparation, a backup can be taken using a script like the
following:
  -- The following is the recipe for an exclusive backup, which is
deprecated.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/high-availability.html

Chapter 26. High Availability, Load Balancing, and Replication

Some solutions are synchronous, meaning that a data-modifying transaction is
not
considered committed until all servers have committed the transaction. This
guarantees that a failover will not lose any data and that all load-balanced
servers will return consistent results no matter which server is queried.
  -- The last sentence confuses distributed durability ("guarantees that a
     failover will not lose any data") with distributed atomic visibility
("all
     load-balanced servers will return consistent results") --- the latter
is
     impossible to guarantee by any method of replication provided by
     PostgreSQL.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/log-shipping-alternative.html

26.4. Alternative Method for Log Shipping

Note that in this mode, the server will apply WAL one file at a time, so if
you
use the standby server for queries (see Hot Standby), there is a delay
between
an action in the master and when the action becomes visible in the standby,
corresponding the time it takes to fill up the WAL file.
  -- Is it "corresponding the time" or "corresponding to the time"?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/log-shipping-alternative.html#WARM-STANDBY-RECORD

26.4.2. Record-Based Log Shipping

Starting with PostgreSQL version 9.0, you can use streaming replication (see
Section 26.2.5) to achieve the same benefits with less effort.
  -- The described method seems to be quite non-trivial, and, it seems,
provides
     no benefits over streaming replication. Perhaps, this section should be
     removed altogether?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-ADMIN

26.5.3. Administrator's Overview

WAL file control commands will not work during recovery, e.g.
pg_start_backup,
pg_switch_wal etc.
  -- pg_start_backup does work during recovery (for non-exclusive backups).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/wal-reliability.html

29.1. Reliability

While forcing data to the disk platters periodically might seem like a
simple
operation, it is not. Because disk drives are dramatically slower than main
memory and CPUs, several layers of caching exist between the computer's main
memory and the disk platters.
  -- Perhaps, "disk platters" (used twice here) could be improved (SSDs
don't
     have any)?

----- <in the same section> ------

These commands are not directly accessible to PostgreSQL, but some file
systems
(e.g., ZFS, ext4) can use them to flush data to the platters on
write-back-enabled drives.
  -- Perhaps, "platters" could be improved (SSDs don't have any)?

----- <in the same section> ------

Another risk of data loss is posed by the disk platter write operations
themselves. Disk platters are divided into sectors, commonly 512 bytes each.
  -- Perhaps, "platters" / "Disk platters" could be improved (SSDs don't
have
     any)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/logical-replication.html

Chapter 30. Logical Replication

The subscriber applies the data in the same order as the publisher so that
transactional consistency is guaranteed for publications within a single
subscription.
  -- "transactional consistency" is vague. Perhaps, a warning analogous
     to the one in
https://www.postgresql.org/docs/current/applevel-consistency.html#SERIALIZABLE-CONSISTENCY
     could be added here for clarification?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/logical-replication-architecture.html

30.5.1. Initial Snapshot

Logical replication starts by copying a snapshot of the data on the
publisher
database.
  -- "from the publisher", perhaps?

----- <in the same section> ------

The subscriber applies data in the order in which commits were made on the
publisher so that transactional consistency is guaranteed for the
publications
within any single subscription.
  -- "transactional consistency" is vague (not clear if it's atomic, here)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/libpq-status.html

33.2. Connection Status Functions

(server_encoding, TimeZone, and integer_datetimes were not reported by
releases
before 8.0; standard_conforming_strings was not reported by releases before
8.1;
IntervalStyle was not reported by releases before 8.4; application_name was
not
reported by releases before 9.0.)
  -- All of these releases are very old --- perhaps, this could be removed?

Pre-3.0-protocol servers do not report parameter settings, but libpq
includes
logic to obtain values for server_version and client_encoding anyway.
  -- Pre-3.0-protocol servers are very old --- perhaps, this could be
removed?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/libpq-async.html

33.4. Asynchronous Command Processing

Like PQprepare, it will not work on 2.0-protocol connections.
  -- The chapter has several warnings like this. Perhaps (as protocol 2.0 is
     very old) these could be removed?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/lo-interfaces.html#LO-CREATE

34.3.1. Creating a Large Object

lo_create is new as of PostgreSQL 8.1; if this function is run against an
older
server version, it will fail and return InvalidOid.
  -- PostgreSQL 8.1 is very old --- perhaps, the warning could be removed?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/xfunc-sql.html

37.5. Query Language (SQL) Functions

Alternatively, an SQL function can be declared to return a set (that is,
multiple rows) by specifying the function's return type as SETOF sometype,
or
equivalently by declaring it as RETURNS TABLE(columns).
  -- "an SQL function" here...

Alternatively, if you want to define a SQL function that performs actions
but
has no useful value to return, you can define it as returning void.
  -- But "a SQL function" here. Is it correct?

----- <in the same section> ------

CREATE FUNCTION clean_emp() RETURNS void AS '
    DELETE FROM emp
        WHERE salary < 0;
' LANGUAGE SQL;
   -- Why not use dollar-quoting here (as recommended in the text below)?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/xfunc-sql.html#XFUNC-SQL-FUNCTION-ARGUMENTS

37.5.1. Arguments for SQL Functions

The ability to use names to reference SQL function arguments was added in
PostgreSQL 9.2. Functions to be used in older servers must use the $n
notation.
  -- Perhaps (as 9.2 is EOL), this note could be removed?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/xfunc-sql.html#XFUNC-SQL-BASE-FUNCTIONS

37.5.2. SQL Functions on Base Types

-- Alternative syntax for string literal:
CREATE FUNCTION one() RETURNS integer AS '
    SELECT 1 AS result;
' LANGUAGE SQL;
  -- What's the point of the "alternative syntax" example (besides
thickening
     the manual)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/xfunc-c.html#id-1.8.3.13.13

37.10.9. Polymorphic Arguments and Return Types

There are two routines provided in fmgr.h to allow a version-1 C function to
discover the actual data types of its arguments and the type it is expected
to
return.
  -- Remove "version-1", perhaps?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/extend-extensions.html#id-1.8.3.20.11

37.17.1. Extension Files

The CREATE EXTENSION command relies on a control file for each extension,
which
must be named the same as the extension with a suffix of .control, and must
be
placed in the installation's SHAREDIR/extension directory.
  -- In the other places in documentation, it's $SHAREDIR (with dollar).

By default, the script file(s) are also placed in the SHAREDIR/extension
directory; but the control file can specify a different directory for the
script
file(s).
  -- In the other places in documentation, it's $SHAREDIR (with dollar).

Unless an absolute path is given, the name is relative to the installation's
SHAREDIR directory.
  -- In the other places in documentation, it's $SHAREDIR (with dollar).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/rules-views.html#id-1.8.6.7.7

40.2.3. The Power of Views in PostgreSQL

The benefit of implementing views with the rule system is, that the planner
has
all the information about which tables have to be scanned plus the
relationships
between these tables plus the restrictive qualifications from the views plus
the
qualifications from the original query in one single query tree.
  -- The comma is not needed here?

----- <in the same section> ------

And the rule system as implemented in PostgreSQL ensures, that this is all
information available about the query up to that point.
  -- The comma is not needed here?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/rules-triggers.html

40.7. Rules Versus Triggers

So if many rows are affected in one statement, a rule issuing one extra
command
is likely to be faster than a trigger that is called for every single row
and
must re-determine what to do many times.
  -- There are triggers with transition relations now --- perhaps, this
should
     be corrected?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/plpgsql-trigger.html#PLPGSQL-TRIGGER-EXAMPLE

Example 42.3. A PL/pgSQL Trigger Function

last_date timestamp,
  -- Why not timestamptz?

NEW.last_date := current_timestamp;
  -- current_timestamp returns timestamptz (which should be used here).

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/plpgsql-trigger.html#PLPGSQL-TRIGGER-AUDIT-EXAMPLE

Example 42.4. A PL/pgSQL Trigger Function for Auditing

CREATE TABLE emp_audit(
    operation         char(1)   NOT NULL,
    stamp             timestamp NOT NULL,
  -- Should be timestamptz (BTW, now() is used later to fill it).

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/plpgsql-trigger.html#PLPGSQL-VIEW-TRIGGER-AUDIT-EXAMPLE

Example 42.5. A PL/pgSQL View Trigger Function for Auditing

CREATE TABLE emp_audit(
    ...
    stamp             timestamp NOT NULL
  -- Should be timestamptz

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/plpgsql-trigger.html#PLPGSQL-TRIGGER-SUMMARY-EXAMPLE

Example 42.6. A PL/pgSQL Trigger Function for Maintaining a Summary Table

        <<insert_update>>
        LOOP
            UPDATE sales_summary_bytime
                SET amount_sold = amount_sold + delta_amount_sold,
                    units_sold = units_sold + delta_units_sold,
                    amount_cost = amount_cost + delta_amount_cost
                WHERE time_key = delta_time_key;

            EXIT insert_update WHEN found;

            BEGIN
                INSERT INTO sales_summary_bytime (
                            time_key,
                            amount_sold,
                            units_sold,
                            amount_cost)
                    VALUES (
                            delta_time_key,
                            delta_amount_sold,
                            delta_units_sold,
                            delta_amount_cost
                           );

                EXIT insert_update;

            EXCEPTION
                WHEN UNIQUE_VIOLATION THEN
                    -- do nothing
            END;
        END LOOP insert_update;
  -- The above works only in READ COMMITTED --- perhaps, it should be
mentioned?
     Or just replaced with INSERT ... ON CONFLICT ... DO UPDATE?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/plpgsql-trigger.html#PLPGSQL-TRIGGER-AUDIT-TRANSITION-EXAMPLE

Example 42.7. Auditing with Transition Tables

CREATE TABLE emp_audit(
    operation         char(1)   NOT NULL,
    stamp             timestamp NOT NULL,
  -- Should be timestamptz

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING

42.11.2. Plan Caching

CREATE FUNCTION logfunc2(logtxt text) RETURNS void AS $$
    DECLARE
        curtime timestamp;
    BEGIN
        curtime := 'now';
        INSERT INTO logtable VALUES (logtxt, curtime);
    END;
$$ LANGUAGE plpgsql;
  -- Why not timestamptz in the above?

----- <in the same section> ------

In the case of logfunc1, the PostgreSQL main parser knows when analyzing the
INSERT that the string 'now' should be interpreted as timestamp, because the
target column of logtable is of that type. Thus, 'now' will be converted to
a
timestamp constant when the INSERT is analyzed, and then used in all
invocations
of logfunc1 during the lifetime of the session.
  -- timestamp -> timestamptz?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/pltcl-functions.html

43.2. PL/Tcl Functions and Arguments

In a nonstrict function, if the actual value of an argument is null, the
corresponding $n variable will be set to an empty string.
  -- It's "n variable" (without dollar).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/pltcl-dbaccess.html

43.5. Database Access from PL/Tcl

For example:
   spi_exec "SELECT count(*) AS cnt FROM pg_proc"
will set the Tcl variable $cnt to the number of rows in the pg_proc system
catalog.
  -- It's "variable cnt" (without dollar).

----- <in the same section> ------

We need backslashes inside the query string given to spi_prepare to ensure
that
the $n markers will be passed through to spi_prepare as-is, and not replaced
by
Tcl variable substitution.
  -- Perhaps, add that:
     {SELECT count(*) AS cnt FROM t1 WHERE num >= $1 AND num <= $2}
     could have been used instead (or use it instead and remove the
paragraph)?

----- <in the same section> ------

quote string
   Doubles all occurrences of single quote and backslash characters in the
given
   string.
  -- And indeed it does, so:
      set x {don't \add \ }
      elog NOTICE "$x = [quote $x]"
     Results in:
        NOTICE:  don't \add \  = don''t \\add \\
   Which is not going to work correctly with default
standard_conforming_strings
   setting, and the following example is actually wrong:

"SELECT '[ quote $val ]' AS ret"
  -- Should be "SELECT E'[ quote $val ]' AS ret" (or, perhaps, it's a bug in
     "quote" function?)

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/pltcl-trigger.html

43.6. Trigger Functions in PL/Tcl

The information from the trigger manager is passed to the function body in
the
following variables:

$TG_name
  -- It's "TG_name" (without dollar). The same goes for all the following
     variables.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/pltcl-event-trigger.html

43.7. Event Trigger Functions in PL/Tcl

$TG_event
  -- It's "TG_event" (without dollar). The same goes for the following
variable.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/plperl-funcs.html

44.1. PL/Perl Functions and Arguments

It is usually most convenient to use dollar quoting (see Section 4.1.2.4)
for
the string constant. If you choose to use escape string syntax E'', you must
double any single quote marks (') and backslashes (\) used in the body of
the
function (see Section 4.1.2.1).
  -- Perhaps, the last sentence (or both) could be removed (it's
sufficiently
     described elsewhere; PL/Tcl and PL/Python chapters have no similar text
     about quoting)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-copy.html

filename

    The path name of the input or output file. An input file name can be an
    absolute or relative path, but an output file name must be an absolute
path.
    Windows users might need to use an E'' string and double any backslashes
    used in the path name.
      -- But why they might need to use an E'' string (this is quite vague)?

FREEZE
    This is intended as a performance option for initial data loading.
     -- But actually it's almost useless (as visibility map is not written
by
        COPY FREEZE, therefore index-only scans don't work as intended, and
the
        next VACUUM will re-scan the whole table). Perhaps, add a note about
        that?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-createconversion.html

The privileges required to create a conversion might be changed in a future
release.
  -- But are still the same (looked back as far as 7.3). What's the use of
this
     warning (perhaps, it could be removed)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-createextension.html

PostgreSQL will create the extension using details from the file
SHAREDIR/extension/extension_name.control.
  -- In the other places in documentation, it's $SHAREDIR (with dollar).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-createforeigntable.html

CREATE FOREIGN TABLE films (
    code        char(5) NOT NULL,
    title       varchar(40) NOT NULL,
    did         integer NOT NULL,
    date_prod   date,
    kind        varchar(10),
  -- Perhaps, replace all the char types here with text + CHECKs (would be
nice
     if the documentation adhered to
     https://wiki.postgresql.org/wiki/Don%27t_Do_This)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-createtable.html
  -- For all the "Examples" section: perhaps, replace all the char types
here
     with text + CHECKs (would be nice if the documentation adhered to
     https://wiki.postgresql.org/wiki/Don%27t_Do_This)?

Also:
    modtime   timestamp DEFAULT current_timestamp
  -- Replace with timestamptz.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-createtransform.html

Synopsis
    FROM SQL WITH FUNCTION from_sql_function_name [ (argument_type [, ...])
],
    TO SQL WITH FUNCTION to_sql_function_name [ (argument_type [, ...]) ]
  -- It seems that a transform function might accept many arguments, but
it's
     not documented like that below?

from_sql_function_name[(argument_type [, ...])]
    It must take one argument of type internal and return type internal.
    -- [, ...] contradicts the description?

to_sql_function_name[(argument_type [, ...])]
    It must take one argument of type internal and return the type that is
the
    type for the transform.
    -- [, ...] contradicts the description?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-createview.html

CREATE VIEW comedies AS
    SELECT f.*,
           country_code_to_name(f.country_code) AS country,
           (SELECT avg(r.rating)
            FROM user_ratings r
            WHERE r.film_id = f.id) AS avg_rating
    FROM films f
    WHERE f.kind = 'Comedy';
    -- Using AS for aliases (user_ratings AS r, films AS f) would be better
       style, IMO.
--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-dropfunction.html

* The standard only allows one function to be dropped per command.
  -- The following items don't have points at the end.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-rollback-to.html

The SQL standard specifies that the key word SAVEPOINT is mandatory, but
PostgreSQL and Oracle allow it to be omitted.
  -- Why do we need to document Oracle behavior here?

----- <in the same section> ------

Also, SQL has an optional clause AND [ NO ] CHAIN which is not currently
supported by PostgreSQL.
  -- ISO SQL (from a draft of 2011) states: "If AND CHAIN is specified, then
     <savepoint clause> shall not be specified." So, the sentence is
incorrect?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-select.html

    [ HAVING condition [, ...] ]
  -- There could be only one condition in HAVING, "[, ...]" is incorrect.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/sql-select.html#SQL-FOR-UPDATE-SHARE

Previous releases failed to preserve a lock which is upgraded by a later
savepoint. For example, this code:

<skipped>

would fail to preserve the FOR UPDATE lock after the ROLLBACK TO. This has
been
fixed in release 9.3.
  -- 9.3 is already unsupported --- perhaps, the warning could be removed
now?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-select.html#id-1.9.3.171.9

Examples

To join the table films with the table distributors:

SELECT f.title, f.did, d.name, f.date_prod, f.kind
    FROM distributors d, films f
    WHERE f.did = d.did
  -- Why not use explicit JOIN ... ON here? Also, there's no ";" at the end,
     unlike in the following examples.

----- <in the same section> ------

SELECT kind, sum(len) AS total FROM films GROUP BY kind;
  -- Would be nice if examples were formatted in the same style (the
following
     ones, too).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-select.html#id-1.9.3.171.10

Note that if a FROM clause is not specified, the query cannot reference any
database tables. For example, the following query is invalid:

SELECT distributors.* WHERE distributors.name = 'Westward';

PostgreSQL releases prior to 8.1 would accept queries of this form, and add
an
implicit entry to the query's FROM clause for each table referenced by the
query. This is no longer allowed.
  -- 8.1 is unsupported for years. Perhaps, this warning could be removed?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-selectinto.html

    [ HAVING condition [, ...] ]
  -- There could be only one condition in HAVING, "[, ...]" is incorrect.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-set.html

SET TIME ZONE
 <skip>
LOCAL
DEFAULT
    Set the time zone to your local time zone (that is, the server's default
    value of timezone).
      -- Seems like it resets to the database / role default (if any of
those is
         set), instead.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-show.html

These variables can be set using the SET statement, by editing the
postgresql.conf configuration file, through the PGOPTIONS environmental
variable
(when using libpq or a libpq-based application), or through command-line
flags
when starting the postgres server. See Chapter 19 for details.
  -- The list of methods is not exhaustive --- perhaps, add "etc.", or just
     remove / rephrase these sentences?

----- <in the same section> ------

Show all settings:
<skip>
(196 rows)
  -- It's 299 rows as of PostgreSQL 12.4. Perhaps, remove the "rows" line?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-truncate.html

When RESTART IDENTITY is specified, the implied ALTER SEQUENCE RESTART
operations are also done transactionally; that is, they will be rolled back
if
the surrounding transaction does not commit. This is unlike the normal
behavior
of ALTER SEQUENCE RESTART.
  -- But ALTER SEQUENCE RESTART is also transactional... what is this about?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/app-pgbasebackup.html

Especially if the option --checkpoint=fast is not used, this can take some
time
during which pg_basebackup will be appear to be idle.
  -- "will be appear to be" -> "will appear to be"?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/pgarchivecleanup.html

pg_archivecleanup is designed to work with PostgreSQL 8.0 and later when
used as
a standalone utility, or with PostgreSQL 9.0 and later when used as an
archive
cleanup command.
  -- Isn't this information useless / deprecated by now?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/app-postgres.html

The -- options will not work on FreeBSD or OpenBSD. Use -c instead. This is
a
bug in the affected operating systems; a future release of PostgreSQL will
provide a workaround if this is not fixed.
  -- This warning appeared here in PostgreSQL 8.2. Is it still relevant?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/view-pg-locks.html

51.74. pg_locks

The waiting process will sleep until the other lock is released (or a
deadlock
situation is detected).
  -- There's also lock timeout which seems relevant here.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/catalog-pg-authid.html

Table 51.8. pg_authid Columns
  -- Some rows in the description have point at the end, some don't
(seemingly
     at random).

--------------------------------------------------------------------------------
https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-ASYNC

52.2.6. Asynchronous Operations

(server_encoding, TimeZone, and integer_datetimes were not reported by
releases
before 8.0; standard_conforming_strings was not reported by releases before
8.1;
IntervalStyle was not reported by releases before 8.4; application_name was
not
reported by releases before 9.0.)
  -- All these versions are unsupported --- perhaps, the sentence could be
     removed?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/protocol-logical-replication.html#PROTOCOL-LOGICAL-MESSAGES-FLOW

52.5.3. Logical Replication Protocol Message Flow

The origin message indicated that the transaction originated on different
replication node.
  -- origin -> Origin (as spelled in the other cases in the paragraph),
     indicated -> indicates?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/error-message-reporting.html

As of PostgreSQL 9.3, complete coverage exists only for errors in SQLSTATE
class 23 (integrity constraint violation), but this is likely to be expanded
in
future.
  -- Perhaps, this information could be updated?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/nls-translator.html#id-1.10.7.2.6

54.1.4. Editing the PO Files

There is (unsurprisingly) a PO mode for Emacs, which I find quite useful.
  -- Who is "I"? Perhaps, this should be rephrased? Also, why only Emacs is
     mentioned, not any other editors? Perhaps, something like
   
https://www.gnu.org/software/trans-coord/manual/web-trans/html_node/PO-Editors.html
     should be referenced, instead?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/index-functions.html

61.2. Index Access Method Functions

False means it is certain that the index entry matches the scan keys. true
means
this is not certain, and the conditions represented by the scan keys must be
rechecked against the heap tuple after fetching it.
  -- "False" is capitalized, "true" isn't --- this can't be correct?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/gin-implementation.html

66.4. Implementation

As of PostgreSQL 9.1, null key values can be included in the index.
  -- As 9.1 is already unsupported, perhaps "As of PostgreSQL 9.1" could be
     removed?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/gin-implementation.html#GIN-FAST-UPDATE

66.4.1. GIN Fast Update Technique

As of PostgreSQL 8.4, GIN is capable of postponing much of this work by
inserting new tuples into a temporary, unsorted list of pending entries.
  -- As 8.4 is already unsupported, perhaps "As of PostgreSQL 8.4" could be
     removed?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/gin-tips.html

66.5. GIN Tips and Tricks

    Insertion into a GIN index can be slow due to the likelihood of many
keys
    being inserted for each item. So, for bulk insertions into a table it is
    advisable to drop the GIN index and recreate it after finishing bulk
    insertion.

    As of PostgreSQL 8.4, this advice is less necessary since delayed
indexing
    is used (see Section 66.4.1 for details). But for very large updates it
may
    still be best to drop and recreate the index.
      -- As 8.4 is already unsupported, perhaps it's better to rephrase both
         paragraphs?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/gin-tips.html

66.5. GIN Tips and Tricks

gin_pending_list_limit can be overridden for individual GIN indexes by
changing
storage parameters, and which allows each GIN index to have its own cleanup
threshold.
  -- "and which allows" -> "and it allows", perhaps?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/row-estimation-examples.html

70.1. Row Estimation Examples

The outputs shown are taken from version 8.3. The behavior of earlier (or
later)
versions might vary.
  -- 8.3 is very old --- perhaps, it's time to update the chapter?

----- <in the same section> ------

Note also that since ANALYZE uses random sampling while producing
statistics,
the results will change slightly after any new ANALYZE.
  -- Strictly speaking, it's "might change", not "will change", as small
tables
     are scanned in full anyway (which could be relevant here), no?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/errcodes-appendix.html

Appendix A. PostgreSQL Error Codes

As of PostgreSQL 9.3, complete coverage for this feature exists only for
errors
in SQLSTATE class 23 (integrity constraint violation), but this is likely to
be
expanded in future.
  -- Perhaps, this information could be updated?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/release-12-3.html

E.2. Release 12.3

The previous coding could allow the file to be created with permissions that
wouldn't allow the postmaster to write on it.
  -- Is "write on" correct?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/cube.html#id-1.11.7.18.7

F.9.4. Defaults

I believe this union:
  -- "I" disagrees with the general style of the documentation?

----- <in the same section> ------

In all binary operations on differently-dimensioned cubes, I assume the
lower-dimensional one to be a Cartesian projection, i. e., having zeroes in
place of coordinates omitted in the string representation.
  -- "I" disagrees with the general style of the documentation?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/isn.html

F.19. isn

It is hoped that a future version of this module will obtained the prefix
list
from one or more tables that can be easily updated by users as needed;
however,
at present, the list can only be updated by modifying the source code and
recompiling.
  -- "will obtained" -> "will obtain"?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/pgcrypto.html#PGCRYPTO-HASH-SPEED-TABLE

[In the description of the Table F.18. Hash Algorithm Speeds]

That way I can show the speed with different numbers of iterations.
  -- "I" disagrees with the general style of the documentation?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/seg.html#id-1.11.7.43.5

F.34.2. Syntax

Because ... is widely used in data sources, it is allowed as an alternative
spelling of ...
  -- While the first of those is <...>, and the second is <..>., it's a
little
     confusing / not very apparent, IMO --- perhaps, the sentence could be
     rephrased so the operator is not at the end of it?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sepgsql.html#SEPGSQL-INSTALLATION

F.35.2. Installation

Be sure that the libselinux-devel RPM is installed at build time.
  -- Why RPM, specifically (debian(-based) distributions support SELinux,
too)?



-----
WBR, Yaroslav Schekin.
--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html


WBR, Yaroslav Schekin.
Reply | Threaded
Open this post in threaded view
|

Re: Probable documentation errors or improvements

David G Johnston
On Thu, Sep 10, 2020 at 12:20 PM Yaroslav <[hidden email]> wrote:
Disclaimer: I'm not a native speaker, so not sure those are actually
incorrect, and can't offer non-trivial suggestions.

I skimmed about 2/3rds of these and while I agree on the surface that probably 3/4ths of them are improvements they aren't clear enough wins to methodically go through and write up one or more patches.

There are a few areas that seem outdated that could use a good once-over in the sgml sources to clean up.  For those I'd expect that I or someone else may go and write up actual patches.

This run-on presentation is off-putting.  Even if actual patches are not forthcoming I would at least suggest one email per topic with the suggestions broken down into at least bugs (at the top) and non-bugs/improvements.

David J.

Reply | Threaded
Open this post in threaded view
|

Re: Probable documentation errors or improvements

Justin Pryzby
In reply to this post by Yaroslav
On Thu, Sep 10, 2020 at 12:19:55PM -0700, Yaroslav wrote:
> Disclaimer: I'm not a native speaker, so not sure those are actually
> incorrect, and can't offer non-trivial suggestions.

https://www.postgresql.org/message-id/flat/CAKFQuwZh3k_CX2-%2BNcZ%3DFZss4bX6ASxDFEXJTY6u4wTH%2BG8%2BKA%40mail.gmail.com#9f9eba0cbbf9b57455503537575f5339

Most of these appear to be reasonable corrections or improvements.

Would you want to submit a patch against the master branch ?
It'll be easier for people to read when it's in a consistent format.
And less work to read it than to write it.

I suggest to first handle the 10+ changes which are most important and in need
of fixing.  After a couple rounds, then see if what's left is worth patching.

--
Justin


Reply | Threaded
Open this post in threaded view
|

Re: Probable documentation errors or improvements

Justin Pryzby
On Sat, Sep 19, 2020 at 12:55:58PM -0500, Justin Pryzby wrote:

> On Thu, Sep 10, 2020 at 12:19:55PM -0700, Yaroslav wrote:
> > Disclaimer: I'm not a native speaker, so not sure those are actually
> > incorrect, and can't offer non-trivial suggestions.
>
> https://www.postgresql.org/message-id/flat/CAKFQuwZh3k_CX2-%2BNcZ%3DFZss4bX6ASxDFEXJTY6u4wTH%2BG8%2BKA%40mail.gmail.com#9f9eba0cbbf9b57455503537575f5339
>
> Most of these appear to be reasonable corrections or improvements.
>
> Would you want to submit a patch against the master branch ?
> It'll be easier for people to read when it's in a consistent format.
> And less work to read it than to write it.
>
> I suggest to first handle the 10+ changes which are most important and in need
> of fixing.  After a couple rounds, then see if what's left is worth patching.
A couple of these were already fixed (querytree and "be appear").

I provide a patch for a few others.

Copying Teodor et al regarding tsquery.

wdiff to follow.

commit 6c5a4e6fde2841f6a2ad00f3cc4530e7fcf9a9f3
Author: Justin Pryzby <[hidden email]>
Date:   Tue Sep 22 23:34:54 2020 -0500

    fix tsquery example
   
    broken since bb140506df605fab58f48926ee1db1f80bdafb59

diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 8af6ac01d3..7daf292468 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -1623,9 +1623,9 @@ occurrences to display in the result.',

<screen>
SELECT to_tsquery('fat') &lt;-&gt; to_tsquery('cat | rat');
          ?column?
[-------------------------------------]{+----------------------------+}
 'fat' &lt;-&gt; {+(+} 'cat' |[-'fat' &lt;-&gt;-] 'rat' {+)+}
</screen>
      </para>
     </listitem>

commit 096a9097d58de8e2d21e42fab528b27a5213a699
Author: Justin Pryzby <[hidden email]>
Date:   Tue Sep 22 23:13:00 2020 -0500

    HAVING conditions cannot be repeated

diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index b93e4ca208..94889b66b4 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -38,7 +38,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
    [ FROM <replaceable class="parameter">from_item</replaceable> [, ...] ]
    [ WHERE <replaceable class="parameter">condition</replaceable> ]
    [ GROUP BY <replaceable class="parameter">grouping_element</replaceable> [, ...] ]
    [ HAVING <replaceable class="parameter">condition</replaceable>[-[, ...]-] ]
    [ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
    [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] <replaceable class="parameter">select</replaceable> ]
    [ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
diff --git a/doc/src/sgml/ref/select_into.sgml b/doc/src/sgml/ref/select_into.sgml
index b1af52a4da..e82e416d60 100644
--- a/doc/src/sgml/ref/select_into.sgml
+++ b/doc/src/sgml/ref/select_into.sgml
@@ -28,7 +28,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
    [ FROM <replaceable class="parameter">from_item</replaceable> [, ...] ]
    [ WHERE <replaceable class="parameter">condition</replaceable> ]
    [ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ]
    [ HAVING <replaceable class="parameter">condition</replaceable>[-[, ...]-] ]
    [ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
    [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] <replaceable class="parameter">select</replaceable> ]
    [ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]

commit 1e16102f906d9ead33ebdfc0b6f5d862e851131b
Author: Justin Pryzby <[hidden email]>
Date:   Tue Sep 22 22:24:14 2020 -0500

    style and consistency

diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 390c49eb6a..649020b7da 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -612,7 +612,7 @@ amgettuple (IndexScanDesc scan,
   will pass the caller's snapshot test.  On success, <function>amgettuple</function>
   must also set <literal>scan-&gt;xs_recheck</literal> to true or false.
   False means it is certain that the index entry matches the scan keys.
   [-true-]{+True+} means this is not certain, and the conditions represented by the
   scan keys must be rechecked against the heap tuple after fetching it.
   This provision supports <quote>lossy</quote> index operators.
   Note that rechecking will extend only to the scan conditions; a partial

commit 44c7efab0499644060c19868d4c431007e8cccaa
Author: Justin Pryzby <[hidden email]>
Date:   Tue Sep 22 22:55:59 2020 -0500

    distro agnostic

diff --git a/doc/src/sgml/sepgsql.sgml b/doc/src/sgml/sepgsql.sgml
index 9961569afc..15417bf19d 100644
--- a/doc/src/sgml/sepgsql.sgml
+++ b/doc/src/sgml/sepgsql.sgml
@@ -88,7 +88,7 @@ Policy from config file:        targeted
  <para>
   To build this module, include the option <literal>--with-selinux</literal> in
   your PostgreSQL <literal>configure</literal> command.  Be sure that the
   <filename>libselinux-devel</filename> [-RPM-]{+package+} is installed at build time.
  </para>

  <para>
@@ -177,7 +177,7 @@ $ for DBNAME in template0 template1 postgres; do
   Makefile on your system; the path shown below is only an example.
   (This Makefile is usually supplied by the
   <filename>selinux-policy-devel</filename> or
   <filename>selinux-policy</filename> [-RPM.)-]{+package.)+}
   Once built, install this policy package using the
   <command>semodule</command> command, which loads supplied policy packages
   into the kernel.  If the package is correctly installed,

commit 1584c0dd6a5183b915811c89af08f135479509ab
Author: Justin Pryzby <[hidden email]>
Date:   Tue Sep 22 22:55:48 2020 -0500

    grammar

diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml
index 5c8d4d5275..67754f52f6 100644
--- a/doc/src/sgml/gin.sgml
+++ b/doc/src/sgml/gin.sgml
@@ -612,7 +612,7 @@
    </para>
    <para>
     <varname>gin_pending_list_limit</varname> can be overridden for individual
     GIN indexes by changing storage parameters,[-and-] which allows each
     GIN index to have its own cleanup threshold.
     For example, it's possible to increase the threshold only for the GIN
     index which can be updated heavily, and decrease it otherwise.
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index beb309e668..a9b7e8fc99 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1503,7 +1503,7 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
    Note that in this mode, the server will apply WAL one file at a
    time, so if you use the standby server for queries (see Hot Standby),
    there is a delay between an action in the primary and when the
    action becomes visible in the standby, corresponding {+to+} the time it takes
    to fill up the WAL file. <varname>archive_timeout</varname> can be used to make that delay
    shorter. Also note that you can't combine streaming replication with
    this method.
diff --git a/doc/src/sgml/isn.sgml b/doc/src/sgml/isn.sgml
index e55ed07312..709bc8345c 100644
--- a/doc/src/sgml/isn.sgml
+++ b/doc/src/sgml/isn.sgml
@@ -14,7 +14,7 @@
  hard-coded list of prefixes; this list of prefixes is also used to hyphenate
  numbers on output.  Since new prefixes are assigned from time to time, the
  list of prefixes may be out of date.  It is hoped that a future version of
  this module will [-obtained-]{+obtain+} the prefix list from one or more tables that
  can be easily updated by users as needed; however, at present, the
  list can only be updated by modifying the source code and recompiling.
  Alternatively, prefix validation and hyphenation support may be
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 6920913a26..92861e5cfd 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -1246,7 +1246,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
        <para>
         The <literal>FOR UPDATE</literal> lock mode
         is also acquired by any <command>DELETE</command> on a row, and also by an
         <command>UPDATE</command> that modifies the values [-on-]{+of+} certain columns.  Currently,
         the set of columns considered for the <command>UPDATE</command> case are those that
         have a unique index on them that can be used in a foreign key (so partial
         indexes and expressional indexes are not considered), but this may change
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index f5e3318106..06068f87a4 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2837,7 +2837,7 @@ The commands accepted in replication mode are:
  <para>
   Every sent transaction contains zero or more DML messages (Insert,
   Update, Delete). In case of a cascaded setup it can also contain Origin
   messages. The origin message [-indicated-]{+indicates+} that the transaction originated on
   different replication node. Since a replication node in the scope of logical
   replication protocol can be pretty much anything, the only identifier
   is the origin name. It's downstream's responsibility to handle this as
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 2ebdf02bfa..8af6ac01d3 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -1645,7 +1645,7 @@ SELECT to_tsquery('fat') &lt;-&gt; to_tsquery('cat | rat');
     <listitem>
      <para>
       Returns a query that searches for a match to the first given query
       followed by a match to the second given query at a distance of [-at-]{+exactly+}
       <replaceable>distance</replaceable> lexemes, using
       the <literal>&lt;<replaceable>N</replaceable>&gt;</literal>
       <type>tsquery</type> operator.  For example:

commit fe18f111c589378eb6bc0ed0ac6f8efb317c25b4
Author: Justin Pryzby <[hidden email]>
Date:   Tue Sep 22 22:48:52 2020 -0500

    Avoid ending a sentence about "..." with ".." ..
   
    .. which then looks just the same as "...".

diff --git a/doc/src/sgml/seg.sgml b/doc/src/sgml/seg.sgml
index e86142d885..e0dfbc76cf 100644
--- a/doc/src/sgml/seg.sgml
+++ b/doc/src/sgml/seg.sgml
@@ -205,8 +205,8 @@ test=&gt; select '6.25 .. 6.50'::seg as "pH";
  </table>

  <para>
   Because {+the+} <literal>...</literal> {+operator+} is widely used in data sources, it is allowed
   as an alternative spelling of [-<literal>..</literal>.-]{+the <literal>..</literal> operator.+}  Unfortunately, this
   creates a parsing ambiguity: it is not clear whether the upper bound
   in <literal>0...23</literal> is meant to be <literal>23</literal> or <literal>0.23</literal>.
   This is resolved by requiring at least one digit before the decimal

commit fabaf97865d46377ed4dbbbb1109299eb037394f
Author: Justin Pryzby <[hidden email]>
Date:   Tue Sep 22 22:51:47 2020 -0500

    punctuation

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index de9bacd34f..d1b8fc8010 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1525,7 +1525,7 @@
      </para>
      <para>
       Role can log in. That is, this role can be given as the initial
       session authorization [-identifier-]{+identifier.+}
      </para></entry>
     </row>

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8eabf93834..1c753ccb7e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -10084,8 +10084,8 @@ LOG:  CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1)
      </term>
      <listitem>
       <para>
        If set, do not trace locks for tables below this [-OID. (use-]{+OID (used+} to avoid
        output on system [-tables)-]{+tables).+}
       </para>
       <para>
        This parameter is only available if the <symbol>LOCK_DEBUG</symbol>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 461b748d89..8ec58c8577 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -26656,7 +26656,7 @@ BEGIN
  RAISE NOTICE 'rewriting table % for reason %',
                pg_event_trigger_table_rewrite_oid()::regclass,
                pg_event_trigger_table_rewrite_reason();
[-END;-]{+END+}
$$;

CREATE EVENT TRIGGER test_table_rewrite_oid
diff --git a/doc/src/sgml/parallel.sgml b/doc/src/sgml/parallel.sgml
index e31bd9d3ce..c81abff48d 100644
--- a/doc/src/sgml/parallel.sgml
+++ b/doc/src/sgml/parallel.sgml
@@ -471,7 +471,7 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  </para>

  <para>
    The following operations are always parallel [-restricted.-]{+restricted:+}
  </para>

  <itemizedlist>

commit adf050ac6cc7d0905fc1613dce1a04f76a892609
Author: Justin Pryzby <[hidden email]>
Date:   Tue Sep 22 21:40:17 2020 -0500

    extraneous comma

diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index bcf860b68b..d6e3463ac2 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -769,7 +769,7 @@ SELECT t1.a, t2.b, t1.ctid FROM t1, t2 WHERE t1.a = t2.a;
</para>

<para>
    The benefit of implementing views with the rule system [-is,-]{+is+}
    that the planner has all
    the information about which tables have to be scanned plus the
    relationships between these tables plus the restrictive
@@ -781,7 +781,7 @@ SELECT t1.a, t2.b, t1.ctid FROM t1, t2 WHERE t1.a = t2.a;
    the best path to execute the query, and the more information
    the planner has, the better this decision can be. And
    the rule system as implemented in <productname>PostgreSQL</productname>
    [-ensures,-]{+ensures+} that this is all information available about the query
    up to that point.
</para>
</sect2>

v1-0001-extraneous-comma.patch (1K) Download Attachment
v1-0002-punctuation.patch (2K) Download Attachment
v1-0003-Avoid-ending-a-sentence-about-.-with.patch (1K) Download Attachment
v1-0004-grammar.patch (5K) Download Attachment
v1-0005-distro-agnostic.patch (1K) Download Attachment
v1-0006-style-and-consistency.patch (1K) Download Attachment
v1-0007-HAVING-conditions-cannot-be-repeated.patch (2K) Download Attachment
v1-0008-fix-tsquery-example.patch (941 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Probable documentation errors or improvements

Justin Pryzby
Split one patch about text search, added another one (sequences), added some
info to commit messages, and added here.
https://commitfest.postgresql.org/30/2744/

--
Justin

v2-0001-extraneous-comma.patch (1K) Download Attachment
v2-0002-semicolons-after-END-in-programlisting.patch (2K) Download Attachment
v2-0003-punctuation.patch (1K) Download Attachment
v2-0004-Avoid-ending-a-sentence-about-.-with.patch (1K) Download Attachment
v2-0005-grammar.patch (4K) Download Attachment
v2-0006-distro-agnostic.patch (1K) Download Attachment
v2-0007-style-and-consistency.patch (1K) Download Attachment
v2-0008-HAVING-conditions-cannot-be-repeated.patch (2K) Download Attachment
v2-0009-doc-textsearch-grammar-exactly.patch (1K) Download Attachment
v2-0010-fix-tsquery-example.patch (942 bytes) Download Attachment
v2-0011-Fix-omission-regarding-ALTER-SEQUENCE.patch (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Probable documentation errors or improvements

Justin Pryzby
On Fri, Sep 25, 2020 at 09:30:00AM -0500, Justin Pryzby wrote:
> Split one patch about text search, added another one (sequences), added some
> info to commit messages, and added here.
> https://commitfest.postgresql.org/30/2744/

Added an additional patch regarding spaces between function arguments.

--
Justin

v3-0001-extraneous-comma.patch (1K) Download Attachment
v3-0002-semicolons-after-END-in-programlisting.patch (2K) Download Attachment
v3-0003-punctuation.patch (1K) Download Attachment
v3-0004-Add-spaces-after-commas-in-function-calls.patch (17K) Download Attachment
v3-0005-Avoid-ending-a-sentence-about-.-with.patch (1K) Download Attachment
v3-0006-grammar.patch (4K) Download Attachment
v3-0007-distro-agnostic.patch (1K) Download Attachment
v3-0008-style-and-consistency.patch (1K) Download Attachment
v3-0009-HAVING-conditions-cannot-be-repeated.patch (2K) Download Attachment
v3-0010-doc-textsearch-grammar-exactly.patch (1K) Download Attachment
v3-0011-fix-tsquery-example.patch (942 bytes) Download Attachment
v3-0012-Fix-omission-regarding-ALTER-SEQUENCE.patch (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Probable documentation errors or improvements

Heikki Linnakangas
On 05/10/2020 22:19, Justin Pryzby wrote:
> On Fri, Sep 25, 2020 at 09:30:00AM -0500, Justin Pryzby wrote:
>> Split one patch about text search, added another one (sequences), added some
>> info to commit messages, and added here.
>> https://commitfest.postgresql.org/30/2744/
>
> Added an additional patch regarding spaces between function arguments.

Pushed most of these.

I left out these changes in sepgsql docs:

> --- a/doc/src/sgml/sepgsql.sgml
> +++ b/doc/src/sgml/sepgsql.sgml
> @@ -88,7 +88,7 @@ Policy from config file:        targeted
>    <para>
>     To build this module, include the option <literal>--with-selinux</literal> in
>     your PostgreSQL <literal>configure</literal> command.  Be sure that the
> -   <filename>libselinux-devel</filename> RPM is installed at build time.
> +   <filename>libselinux-devel</filename> package is installed at build time.
>    </para>

It's true that the sentence is unusually distro-specific, but I think
second instance of this becomes ambiguous if we just change RPM to package:

>   <para>
>    Second, build and install the policy package for the regression test.
>    The <filename>sepgsql-regtest</filename> policy is a special purpose policy package
>    which provides a set of rules to be allowed during the regression tests.
>    It should be built from the policy source file
>    <filename>sepgsql-regtest.te</filename>, which is done using
>    <command>make</command> with a Makefile supplied by SELinux.
>    You will need to locate the appropriate
>    Makefile on your system; the path shown below is only an example.
>    (This Makefile is usually supplied by the
>    <filename>selinux-policy-devel</filename> or
>    <filename>selinux-policy</filename> RPM.)
>    Once built, install this policy package using the
>    <command>semodule</command> command, which loads supplied policy packages
>    into the kernel.  If the package is correctly installed,
>    <literal><command>semodule</command> -l</literal> should list <literal>sepgsql-regtest</literal> as an
>    available policy package:
>   </para>

The paragraph talks about "policy package", so using just "package" to
refer to a .rpm/.deb file would be confusing. Suggestions are welcome.

- Heikki


Reply | Threaded
Open this post in threaded view
|

Re: Probable documentation errors or improvements

Justin Pryzby
On Mon, Oct 19, 2020 at 07:36:29PM +0300, Heikki Linnakangas wrote:
> On 05/10/2020 22:19, Justin Pryzby wrote:
> > On Fri, Sep 25, 2020 at 09:30:00AM -0500, Justin Pryzby wrote:
> > > Split one patch about text search, added another one (sequences), added some
> > > info to commit messages, and added here.
> > > https://commitfest.postgresql.org/30/2744/
> >
> > Added an additional patch regarding spaces between function arguments.
>
> Pushed most of these.

Thanks.  

Yaroslav, could you prepare a patch for your remaining suggestions ?
I suggest to include just the handful of items which are most important (most
wrong/ambiguous/confusing).

--
Justin


Reply | Threaded
Open this post in threaded view
|

Re: Probable documentation errors or improvements

Tom Lane-2
In reply to this post by Heikki Linnakangas
Heikki Linnakangas <[hidden email]> writes:
> I left out these changes in sepgsql docs:

>> --- a/doc/src/sgml/sepgsql.sgml
>> +++ b/doc/src/sgml/sepgsql.sgml
>> @@ -88,7 +88,7 @@ Policy from config file:        targeted
>> <para>
>> To build this module, include the option <literal>--with-selinux</literal> in
>> your PostgreSQL <literal>configure</literal> command.  Be sure that the
>> -   <filename>libselinux-devel</filename> RPM is installed at build time.
>> +   <filename>libselinux-devel</filename> package is installed at build time.
>> </para>

> It's true that the sentence is unusually distro-specific, but I think
> second instance of this becomes ambiguous if we just change RPM to package:

There's also the fact that the package name "libselinux-devel" is itself
Red Hat specific --- Debian-based distros would call it something else,
even assuming that they have an exact equivalent which they might not.
The other advice you mention is also pretty RH-specific.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Probable documentation errors or improvements

Yaroslav
In reply to this post by Justin Pryzby
Thank you for preparing the patches, and Heikki Linnakangas for pushing those!

Yaroslav, could you prepare a patch for your remaining suggestions ?
I suggest to include just the handful of items which are most important (most
wrong/ambiguous/confusing).

Let me write this again: I'm not a native speaker, so not sure those are actually incorrect, and can't offer non-trivial suggestions.
 
Anyway, I've tried to remove the already fixed things (also by other patches mentioned in the thread) from my initial list, and attach it.
 
Also, I've looked over the pushed patches, and spotted these:
 
-- https://www.postgresql.org/docs/current/catalog-pg-authid.html
 
Table 51.8. pg_authid Columns
-- Some rows in the description have point at the end, some don't (seemingly at random).
 
The applied patch fixed only one of those cases... also seemingly at random. Would be nice if someone read the whole table and made a decision about how it should be. If (just guessing here) the rule is "there should be no point if it's just one sentence", then "Role bypasses every row level security policy, see Section 5.8 for more information." is still wrong.
 
Then, while ";" after "END" were added in doc/src/sgml/plpgsql.sgml, but in doc/src/sgml/func.sgml they still were not.
 
Finally, while the patches fix a lot of whitespace in function arguments, the original complaint about div() and mod() is still not fixed.
 
-- 
WBR, Yaroslav Schekin.

remaining_problems.txt (87K) Download Attachment
WBR, Yaroslav Schekin.