PostgreSQL Weekly News is brought to you this week by David Fetter
Submit news and announcements by Sunday at 3:00pm PST8PDT to [hidden email].
Amit Kapila pushed:
Allow decoding at prepare time in ReorderBuffer. This patch allows
PREPARE-time decoding of two-phase transactions (if the output plugin supports
this capability), in which case the transactions are replayed at PREPARE and
then committed later when COMMIT PREPARED arrives. Now that we decode the
changes before the commit, the concurrent aborts may cause failures when the
output plugin consults catalogs (both system and user-defined). We detect
such failures with a special sqlerrcode ERRCODE_TRANSACTION_ROLLBACK
introduced by commit 7259736a6e and stop decoding the remaining changes. Then
we roll back the changes when ROLLBACK PREPARED is encountered. Author: Ajin
Cherian and Amit Kapila based on previous work by Nikhil Sontakke and Stas
Kelvich Reviewed-by: Amit Kapila, Peter Smith, Sawada Masahiko, Arseny Sher,
and Dilip Kumar Tested-by: Takamichi Osumi Discussion:
Fix allocation logic of cryptohash context data with OpenSSL. The allocation
of the cryptohash context data when building with OpenSSL was happening in the
memory context of the caller of pg_cryptohash_create(), which could lead to
issues with resowner cleanup if cascading resources are cleaned up on an
error. Like other facilities using resowners, move the base allocation to
TopMemoryContext to ensure a correct cleanup on failure. The resulting code
gets simpler with this commit as the context data is now held by a unique
opaque pointer, so there is only a single allocation done in
TopMemoryContext. After discussion, also change the cryptohash subroutines to
return an error if the caller provides NULL for the context data to ease error
detection on OOM. Author: Heikki Linnakangas Discussion:
Add the ability for the core grammar to have more than one parse target. This
patch essentially allows gram.y to implement a family of related syntax trees,
rather than necessarily always parsing a list of SQL statements. raw_parser()
gains a new argument, enum RawParseMode, to say what to do. As proof of
concept, add a mode that just parses a TypeName without any other decoration,
and use that to greatly simplify typeStringToTypeName(). In addition, invent
a new SPI entry point SPI_prepare_extended() to allow SPI users (particularly
plpgsql) to get at this new functionality. In hopes of making this the last
variant of SPI_prepare(), set up its additional arguments as a struct rather
than direct arguments, and promise that future additions to the struct can
default to zero. SPI_prepare_cursor() and SPI_prepare_params() can perhaps go
away at some point. Discussion:
Re-implement pl/pgsql's expression and assignment parsing. Invent new
RawParseModes that allow the core grammar to handle pl/pgsql expressions and
assignments directly, and thereby get rid of a lot of hackery in pl/pgsql's
parser. This moves a good deal of knowledge about pl/pgsql into the core
code: notably, we have to invent a CoercionContext that matches pl/pgsql's
(rather dubious) historical behavior for assignment coercions. That's getting
away from the original idea of pl/pgsql as an arm's-length extension of the
core, but really we crossed that bridge a long time ago. The main advantage
of doing this is that we can now use the core parser to generate FieldStore
and/or SubscriptingRef nodes to handle assignments to pl/pgsql variables that
are records or arrays. That fixes a number of cases that had never been
implemented in pl/pgsql assignment, such as nested records and array slicing,
and it allows pl/pgsql assignment to support the datatype-specific
subscripting behaviors introduced in commit c7aba7c14. There are cosmetic
benefits too: when a syntax error occurs in a pl/pgsql expression, the error
report no longer includes the confusing "SELECT" keyword that used to get
prefixed to the expression text. Also, there seem to be some small speed
Remove PLPGSQL_DTYPE_ARRAYELEM datum type within pl/pgsql. In the wake of the
previous commit, we don't really need this anymore, since array assignment is
primarily handled by the core code. The only way that that code could still
be reached is that a GET DIAGNOSTICS target variable could be an array
element. But that doesn't seem like a particularly essential feature. I'd
added it in commit 55caaaeba, but just because it was easy not because anyone
had actually asked for it. Hence, revert that patch and then remove the
now-unreachable stuff. (If we really had to, we could probably reimplement
GET DIAGNOSTICS using the new assignment machinery; but the cost/benefit ratio
looks very poor, and it'd likely be a bit slower.) Note that
PLPGSQL_DTYPE_RECFIELD remains. It's possible that we could get rid of that
too, but maintaining the existing behaviors for RECORD-type variables seems
like it might be difficult. Since there's not any functional limitation in
those code paths as there was in the ARRAYELEM code, I've not pursued the
Rethink the "read/write parameter" mechanism in pl/pgsql. Performance issues
with the preceding patch to re-implement array element assignment within
pl/pgsql led me to realize that the read/write parameter mechanism is
misdesigned. Instead of requiring the assignment source expression to be such
that all its references to the target variable could be passed as R/W, we
really want to identify one reference to the target variable to be passed as
R/W, allowing any other ones to be passed read/only as they would be by
default. As long as the R/W reference is a direct argument to the top-level
(hence last to be executed) function in the expression, there is no harm in
R/O references being passed to other lower parts of the expression. Nor is
there any use-case for more than one argument of the top-level function being
R/W. Hence, rewrite that logic to identify one single Param that references
the target variable, and make only that Param pass a read/write reference, not
any other Params referencing the target variable. Discussion:
Fix integer-overflow corner cases in substring() functions. If the substring
start index and length overflow when added together, substring() misbehaved,
either throwing a bogus "negative substring length" error on a case that
should succeed, or failing to complain that a negative length is negative (and
instead returning the whole string, in most cases). Unsurprisingly, the text,
bytea, and bit variants of the function all had this issue. Rearrange the
logic to ensure that negative lengths are always rejected, and add an overflow
check to handle the other case. Also install similar guards into
detoast_attr_slice() (nee heap_tuple_untoast_attr_slice()), since it's far
from clear that no other code paths leading to that function could pass it
values that would overflow. Patch by myself and Pavel Stehule, per bug #16804
from Rafi Shamim. Back-patch to v11. While these bugs are old, the
common/int.h infrastructure for overflow-detecting arithmetic didn't exist
before commit 4d6ad3125, and it doesn't seem like these misbehaviors are bad
enough to justify developing a standalone fix for the older branches.
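
The overflow case described above can be sketched in a few lines. This is a minimal illustration, not the PostgreSQL code: the helper names checked_add_int32() and substring() are hypothetical, Python integers do not overflow so the 32-bit range is checked by hand, and treating an overflowing end position as "to the end of the string" is an assumption drawn from the description that such calls should succeed.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def checked_add_int32(a, b):
    """Return (overflowed, total), mimicking an overflow-detecting int32 add."""
    total = a + b
    if total < INT32_MIN or total > INT32_MAX:
        return True, 0
    return False, total


def substring(s, start, length):
    """SQL-style substring with a 1-based start (hypothetical helper)."""
    # Reject negative lengths first, so they can never be masked by overflow.
    if length < 0:
        raise ValueError("negative substring length not allowed")
    first = max(start, 1)
    overflowed, end = checked_add_int32(start, length)
    if overflowed:
        # start + length does not fit in int32: take everything from `first`.
        return s[first - 1:]
    if end < 1:
        # The requested range lies entirely before the string.
        return ""
    return s[first - 1:end - 1]
```

With this ordering, substring("hello", 2, INT32_MAX) returns the tail of the string instead of raising a bogus error, while any negative length is rejected whatever the start value is.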
Allow psql's \dt and \di to show TOAST tables and their indexes. Formerly,
TOAST objects were unconditionally suppressed, but since \d is able to print
them it's not very clear why these variants should not. Instead, use the same
rules as for system catalogs: they can be seen if you write the 'S' modifier
or a table name pattern. (In practice, since hardly anybody would keep
pg_toast in their search_path, it's really down to whether you use a pattern
that can match pg_toast.*.) No docs change seems necessary because the docs
already say that this happens for "system objects"; we're just classifying
TOAST tables as being that. Justin Pryzby, reviewed by Laurenz Albe
Add a test module for the regular expression package. This module provides a
function test_regex() that is functionally rather like regexp_matches(), but
with additional debugging-oriented options and additional output. The debug
options are somewhat obscure; they are chosen to match the API of the test
harness that Henry Spencer wrote way-back-when for use in Tcl. With this, we
can import all the test cases that Spencer wrote originally, even for regex
functionality that we don't currently expose in Postgres. This seems
necessary because we can no longer rely on Tcl to act as upstream and verify
any fixes or improvements that we make. In addition to Spencer's tests, I
added a few for lookbehind constraints (which we added in 2015, and Tcl still
hasn't absorbed) that are modeled on his tests for lookahead constraints.
After looking at code coverage reports, I also threw in a couple of tests to
more fully exercise our "high colormap" logic. According to my testing, this
brings the check-world coverage for src/backend/regex/ from 71.1% to 86.7% of
lines. (coverage.postgresql.org shows a slightly different number, which I
think is because it measures a non-assert build.) Discussion:
Improve timeout.c's handling of repeated timeout set/cancel. A very common
usage pattern is that we set a timeout that we don't expect to reach, cancel
it after a little bit, and later repeat. With the original implementation of
timeout.c, this results in one setitimer() call per timeout set or cancel. We
can do a lot better by being lazy about changing the timeout interrupt
request, namely: (1) never cancel the outstanding interrupt, even when we have
no active timeout events; (2) if we need to set an interrupt, but there
already is one pending at or before the required time, leave it alone. When
the interrupt happens, the signal handler will reschedule it at whatever time
is then needed. For example, with a one-second setting for statement_timeout,
this method results in having to interact with the kernel only a little more
than once a second, no matter how many statements we execute in between. The
mainline code might never call setitimer() at all after the first time, while
each time the signal handler fires, it sees that the then-pending request is
most of a second away, and sets the next interrupt request for that time.
Each mainline timeout-set request after that will observe that the time it
wants is past the pending interrupt request time, and do nothing. This also
works pretty well for cases where a few different timeout lengths are in use,
as long as none of them are very short. But that describes our usage well.
Idea and original patch by Thomas Munro; I fixed a race condition and improved
the comments. Discussion:
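
The two laziness rules above can be modelled with a small simulation. This is only a sketch under stated assumptions: the class LazyTimeouts, its methods, and run_workload() are hypothetical names, time is a plain integer count of milliseconds, and the kernel_calls counter stands in for setitimer() calls. It does not reproduce timeout.c or the race-condition handling mentioned in the commit message.

```python
class LazyTimeouts:
    """Toy model of lazy timer rescheduling; times are integer milliseconds."""

    def __init__(self):
        self.deadlines = {}            # active timeout name -> deadline
        self.pending_interrupt = None  # when the simulated kernel timer fires
        self.kernel_calls = 0          # stand-in for the number of setitimer() calls

    def _arm(self, when):
        self.kernel_calls += 1
        self.pending_interrupt = when

    def set_timeout(self, name, now, delay):
        deadline = now + delay
        self.deadlines[name] = deadline
        # Rule (2): an interrupt pending at or before the deadline is left alone.
        if self.pending_interrupt is None or self.pending_interrupt > deadline:
            self._arm(deadline)

    def cancel_timeout(self, name):
        # Rule (1): forget the event but never cancel the outstanding interrupt.
        self.deadlines.pop(name, None)

    def deliver_due(self, now):
        """Run the simulated signal handler if the interrupt is due."""
        if self.pending_interrupt is None or self.pending_interrupt > now:
            return []
        self.pending_interrupt = None
        fired = sorted(n for n, d in self.deadlines.items() if d <= now)
        for name in fired:
            del self.deadlines[name]
        if self.deadlines:
            # Reschedule for whatever time is needed now.
            self._arm(min(self.deadlines.values()))
        return fired


def run_workload(statements, timeout_ms=1000, gap_ms=10, duration_ms=5):
    """Set and cancel a statement timeout repeatedly; return kernel call count."""
    timers = LazyTimeouts()
    now = 0
    for _ in range(statements):
        timers.deliver_due(now)
        timers.set_timeout("statement_timeout", now, timeout_ms)
        now += duration_ms
        timers.deliver_due(now)
        timers.cancel_timeout("statement_timeout")
        now += gap_ms - duration_ms
    return timers.kernel_calls
```

In this model, run_workload(1000) covers ten simulated seconds of short statements and arms the timer 10 times, where an eager scheme would touch it on every one of the 2000 set and cancel operations.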
Further second thoughts about idle_session_timeout patch. On reflection, the
order of operations in PostgresMain() is wrong. These timeouts ought to be
shut down before, not after, we do the post-command-read CHECK_FOR_INTERRUPTS,
to guarantee that any timeout error will be detected there rather than at some
ill-defined later point (possibly after having wasted a lot of work). This is
really an error in the original idle_in_transaction_timeout patch, so
back-patch to 9.6 where that was introduced.
Adjust createdb TAP tests to work on recent OpenBSD. We found last February
that the error-case tests added by commit 008cf0409 failed on OpenBSD, because
that platform doesn't really check locale names. At the time it seemed that
that was only an issue for LC_CTYPE, but testing on a more recent version of
OpenBSD shows that it's now equally lax about LC_COLLATE. Rather than
dropping the LC_COLLATE test too, put back LC_CTYPE (reverting c4b0edb07), and
adjust these tests to accept the different error message that we get if
setlocale() doesn't reject a bogus locale name. The point of these tests is
not really what the backend does with the locale name, but to show that
createdb quotes funny locale names safely; so we're not losing test
reliability this way. Back-patch as appropriate. Discussion:
Fix ancient bug in parsing of BRE-mode regular expressions. brenext(), when
parsing a '*' quantifier, forgot to return any "value" for the token; per the
equivalent case in next(), it should return value 1 to indicate that greedy
rather than non-greedy behavior is wanted. The result is that the compiled
regexp could behave like 'x*?' rather than the intended 'x*', if we were
unlucky enough to have a zero in v->nextvalue at this point. That seems to
happen with some reliability if we have '.*' at the beginning of a BRE-mode
regexp, although that depends on the initial contents of a stack-allocated
struct, so it's not guaranteed to fail. Found by Alexander Lakhin using
valgrind testing. This bug seems to be aboriginal in Spencer's code, so
back-patch all the way. Discussion:
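
The difference between the intended 'x*' and the accidental 'x*?' can be seen with any engine that supports both quantifiers. The sketch below uses Python's re module purely as an illustration: it is a different engine with different syntax from PostgreSQL's BRE mode, and the helper name first_match() is hypothetical.

```python
import re


def first_match(pattern, text):
    """Return the text matched at the start of `text`, or None if no match."""
    m = re.match(pattern, text)
    return m.group(0) if m else None


# Greedy: the quantifier takes as many characters as it can.
greedy = first_match(r"x*", "xxxy")
# Non-greedy: the quantifier takes as few characters as it can.
lazy = first_match(r"x*?", "xxxy")

# A leading '.*' shows the same split, which is where the bug tended to appear.
greedy_prefix = first_match(r".*b", "abab")
lazy_prefix = first_match(r".*?b", "abab")
```

Here greedy is "xxx" and lazy is the empty string, while greedy_prefix is "abab" and lazy_prefix is "ab". A pattern silently compiled as non-greedy can therefore return a shorter match than the one the user asked for.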
Fix plpgsql tests for debug_invalidate_system_caches_always. Commit c9d529848
resulted in having a couple more places where the error context stack for a
failure varies depending on debug_invalidate_system_caches_always (nee
CLOBBER_CACHE_ALWAYS). This is not very surprising, since we have to re-parse
cached plans if the plan cache is clobbered. Stabilize the expected test
output by hiding the context stack in these places, as we've done elsewhere in
this test script. (Another idea worth considering, now that we have
debug_invalidate_system_caches_always, is to force it to zero for these test
cases. That seems like it'd risk reducing the coverage of cache-clobber
testing, which might or might not be worth being able to verify that we get
the expected error output in normal cases. For the moment I just stuck with
the existing technique.) In passing, update comments that referred to
CLOBBER_CACHE_ALWAYS. Per buildfarm member hyrax.
Replace CLOBBER_CACHE_ALWAYS with run-time GUC. Forced cache invalidation
(CLOBBER_CACHE_ALWAYS) has been impractical to use for testing in PostgreSQL
because it's so slow and because it's toggled on/off only at build time. It
is helpful when hunting bugs in any code that uses the syscache/relcache
because it causes cache invalidations to be injected whenever it would be
possible for an invalidation to occur, whether or not one was really pending.
Address this by providing run-time control over cache clobber behaviour using
the new debug_invalidate_system_caches_always GUC. Support is not compiled in
at all unless assertions are enabled or CLOBBER_CACHE_ENABLED is explicitly
defined at compile time. It defaults to 0 if compiled in, so it has
negligible effect on assert build performance by default. When support is
compiled in, test code can now set debug_invalidate_system_caches_always=1
locally to a backend to test specific queries, functions, extensions, etc. Or
tests can toggle it globally for a specific test case while retaining normal
performance during test setup and teardown. For backwards compatibility with
existing test harnesses and scripts, debug_invalidate_system_caches_always
defaults to 1 if CLOBBER_CACHE_ALWAYS is defined, and to 3 if
CLOBBER_CACHE_RECURSIVE is defined. CLOBBER_CACHE_ENABLED is now visible in
pg_config_manual.h, as is the related RECOVER_RELATION_BUILD_MEMORY setting
for the relcache. Author: Craig Ringer email@example.com
Detect the deadlocks between backends and the startup process. Deadlocks
involving a recovery conflict on a lock can happen between hot-standby
backends and the startup process. If a backend takes an access exclusive lock
on a table and that finally triggers the deadlock, the deadlock is detected
as expected. Previously, however, if the startup process took an access
exclusive lock and that finally triggered the deadlock, the deadlock could not
be detected and could remain even after deadlock_timeout passed. This is a
bug. The cause was that the code for handling the recovery conflict on lock
didn't handle the deadlock case at all. It assumed that deadlocks involving
the startup process and backends could be detected by the deadlock detector
invoked within backends, but this assumption was incorrect: the startup
process should also have invoked the deadlock detector when necessary. To fix
this bug, this commit makes the startup process invoke the deadlock detector
if deadlock_timeout is reached while handling the recovery conflict on lock.
Specifically, in that case, the startup process requests all the backends
holding the conflicting locks to check themselves for deadlocks. Back-patch to
v9.6. v9.5 also has this bug, but per discussion we decided not to back-patch
the fix to v9.5, because v9.5 lacks some infrastructure code (e.g.,
37c54863cf) that this fix depends on. We could apply that code as part of the
back-patch, but since the next minor release is the final one for v9.5, doing
so is risky: if we unexpectedly introduced a new bug into v9.5 with the
back-patch, there would be no chance to fix it. We determined that the
back-patch to v9.5 would carry more risk than gain. Author: Fujii Masao
Reviewed-by: Bertrand Drouvot, Masahiko Sawada, Kyotaro Horiguchi Discussion:
Add GUC to log long wait times on recovery conflicts. This commit adds GUC
log_recovery_conflict_waits that controls whether a log message is produced
when the startup process is waiting longer than deadlock_timeout for recovery
conflicts. This is useful in determining if recovery conflicts prevent the
recovery from applying WAL. Note that currently a log message is produced
only when recovery conflict has not been resolved yet even after
deadlock_timeout passes, i.e., only when the startup process is still waiting
for recovery conflict even after deadlock_timeout. Author: Bertrand Drouvot,
Masahiko Sawada Reviewed-by: Alvaro Herrera, Kyotaro Horiguchi, Fujii Masao
Atsushi Torikoshi sent in another revision of a patch to implement
pg_get_target_backend_memory_contexts() and make it possible to collect memory
contexts of the specified process.
Atsushi Torikoshi sent in another revision of a patch to add a wait_start column
to the pg_locks view.
Mark Zhao sent in a patch intended to fix a bug that manifested as logical
replication on partitioned tables being very slow and consuming a lot of CPU by
adding a missing RelationClose after RelationIdGetRelation in pgoutput.c.
Önder Kalacı sent in another revision of a patch to implement row filtering for
Justin Pryzby sent in a patch to allow errors in parameter values to be reported
during the BIND phase.
Pavel Stěhule sent in another revision of a patch to make it possible to write
window functions in PLs, along with an implementation of same
Bharath Rupireddy sent in three more revisions of a patch to make it possible to
use parallel inserts in CTAS.
Kyotaro HORIGUCHI sent in four more revisions of a patch intended to fix a bug
that manifested as failure of a standby to follow a timeline switch by ensuring
that the Walsender tracks timeline switches while sending a historic timeline.
Peter Smith sent in four more revisions of a patch to make it possible to use
multiple tablesync workers.
Dilip Kumar sent in another revision of a patch to add options for custom table
Dmitry Dolgov sent in three more revisions of a patch to use the generic
subscripting infrastructure for JSONB operations.
Justin Pryzby sent in another revision of a patch to support multiple
compression methods and options for same in pg_dump.
Masahiko Sawada sent in a patch to introduce an IndexAM API for choosing index
vacuum strategy, use same to choose index vacuum strategy, and skip btree
bulkdelete if the index doesn't grow.
Thomas Munro sent in another revision of a patch to reduce the WaitEventSet
Pavel Stěhule sent in a patch to add an option to use a shorthand for argument
and local variable references in PL/pgsql.
Dmitry Dolgov sent in another revision of a patch to prevent jumbling of every
element in ArrayExpr in order to keep pg_stat_statements from producing
different entries for what are essentially similar queries.
Tom Lane sent in a PoC patch to deal with MacOS's SIP infrastructure works for
Amit Kapila sent in a patch to track replication origin progress for rollbacks
for some cases the patch for tracking 2PC in logical replication missed.
Paul Martinez sent in a patch to add partial foreign key updates in referential
Bruce Momjian sent in two more revisions of a patch to consolidate more of the
hex functions in /common.
Shinya Kato, Masahiko Sawada, and Fujii Masao traded patches to fill out the
implementation of CLOSE, FETCH, and MOVE tab completion in psql.
Daniel Gustafsson sent in two more revisions of a patch to support enabling and
disabling checksums on running clusters.
Tsutomu Yamada and Tomáš Vondra traded patches to add a family of functions
starting with \dX to psql which deal with extended statistics.
Bharath Rupireddy sent in two more revisions of a patch to add a postgres_fdw
function to discard cached connections, add a postgres_fdw.keep_connections GUC
to control whether connections are cached, and add a similar server-level
Ryo Matsumura sent in a patch atop the libpq tracing patch to fix some
oversights in same.
Kyotaro HORIGUCHI sent in another revision of a patch intended to fix a bug
that manifested as corruption during WAL replay by delaying checkpoint
completion until after truncation succeeds.
Greg Sabino Mullane sent in another revision of a patch to enable psql's \df to
choose functions by input type.
Movead Li sent in another revision of a patch to fix the waldump size for wal
Kirk Jamison sent in another revision of a patch to make dropping relation
buffers more efficient with dlist.
Michaël Paquier sent in another revision of a patch to add SHA1 to the
Julien Rouhaud sent in another revision of a patch to move pg_stat_statements
query jumbling to core, expose queryid in pg_stat_activity and log_line_prefix,
and expose query identifier in verbose explain.
Laurenz Albe sent in two more revisions of a patch to add session statistics to
Zeng Wenjing sent in a PoC patch to implement global indexes.
Bharath Rupireddy sent in three more revisions of a patch to implement EXPLAIN
[ANALYZE] for REFRESH MATERIALIZED VIEW.
Masahiko Sawada sent in a patch intended to fix a bug that manifested as a
logical replication worker accessing catalogs in its error context callback,
by storing both the local and the remote type names in SlotErrCallbackArg so
that the error callback can set the names without a system cache lookup.
Vigneshwaran C sent in a patch to add schema level support for PUBLICATIONs.
Mark Dilger sent in two more revisions of a patch to add a new pg_amcheck
contrib module, which is a command line interface for running amcheck's
verifications against tables and indexes.
Thomas Munro sent in a patch to add FreeBSD to the list of platforms that have
Kyotaro HORIGUCHI sent in another revision of a patch to make the stats
collector more efficient by replacing the files it used for temporary storage
with shared memory.
Michaël Paquier sent in another revision of a patch to refactor HMAC
implementations to reduce duplication.
Pavel Stěhule sent in another revision of a patch to reduce the overhead of
execution of the CALL statement in non-atomic mode from PL/pgSQL.
Kyotaro HORIGUCHI sent in another revision of a patch to make ALTER TABLE SET
[UN]LOGGED avoid a heap rewrite, change SET LOGGED when wal_level > minimal so
it emits WAL using XLOG_FPI instead of a massive number of HEAP_INSERTs, and
allows for the cleanup of files left behind in the crash of the transaction that
Pavel Stěhule sent in a patch to add a way to return the text value of variable
content to the PL/pgsql debugging API.
Pavel Stěhule sent in a patch to make it possible to use a special pager for
psql's \watch command.
Tomáš Vondra sent in another revision of a patch to make it possible to create
extended statistics on expressions.
Simon Riggs sent in four more revisions of a patch to implement system-versioned
Peter Eisentraut sent in another revision of a patch to pageinspect which
changes the type of block number arguments to bigint in order to avert overflow.
Bruce Momjian sent in four more revisions of a patch to add tests for key
Álvaro Herrera and Tomáš Vondra traded patches to implement MERGE.
Pavel Stěhule and Erik Rijkers traded patches to implement schema variables.
Álvaro Herrera and Justin Pryzby traded patches to implement ALTER TABLE ...
DETACH PARTITION CONCURRENTLY.
Noah Misch sent in a patch to fix pg_dump for GRANT OPTION among initial
Krasiyan Andreev sent in another revision of a patch to implement NULL treatment
for window functions.
Michael Banck sent in a patch to fix an issue where psql's \watch is not working
correctly in the case where the query in question doesn't return rows.
Thomas Munro sent in a patch to use pg_pwrite() in pg_test_fsync to maintain
consistency with what PostgreSQL now does.
Justin Pryzby sent in another revision of a patch to fix some documentation and
comments in the patch that implements pluggable compression in libpq.
Noah Misch sent in another revision of a patch intended to fix a bug that
manifested as spurious "apparent wraparound" via SimpleLruTruncate() rounding.
Shenhao Wang sent in a patch intended to fix a bug that manifested as invalid
data in the backup_label file on Windows by setting text mode when reading
backup_label and tablespace_map.
Tatsuo Ishii sent in a patch to fix a missing acronym label in the
Tomáš Vondra sent in another revision of a patch to set PD_ALL_VISIBLE and
visibility map bits in COPY FREEZE, making good the lack of page-level flag
Tom Lane sent in a patch intended to fix a bug that manifested as a connection
string with multiple hosts failing to fail over in non-hot standby mode.