libpq environment variables in the server

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

libpq environment variables in the server

Peter Eisentraut-6
When libpq is loaded in the server (libpqwalreceiver, dblink,
postgres_fdw), it may use libpq environment variables set in the
postmaster environment for connection parameter defaults.  I have
noticed that this has some confusing effects in our test suites.  I
wonder if this is a good idea in general.

For example, the TAP test infrastructure sets PGAPPNAME to allow
identifying clients in the server log.  But this environment variable is
also inherited by temporary servers started with pg_ctl and is then in
turn used by libpqwalreceiver as the application_name for connecting to
remote servers where it then shows up in pg_stat_replication and is
relevant for things like synchronous_standby_names.

Also, the pg_rewind tests appear to rely implicitly on PGHOST and PGPORT
being set in the server environment to find the other node via
primary_conninfo.  That is easy to fix, but it shows that this kind of
thing can creep in unintentionally.

I was thinking that maybe we should clear all libpq environment
variables in the server, or perhaps have a mode in libpq to ignore all
environment variables.  Then again, maybe setting something like
PGSSLMODE globally in the server could be useful, so just removing
everything might not be the right answer.

Maybe pg_ctl should have some functionality to clear the environment,
and maybe there could be a facility in postgresql.conf to set
environment variables.

Thoughts?

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: libpq environment variables in the server

Dmitry Igrishin
пн, 21 янв. 2019 г. в 13:42, Peter Eisentraut
<[hidden email]>:

>
> When libpq is loaded in the server (libpqwalreceiver, dblink,
> postgres_fdw), it may use libpq environment variables set in the
> postmaster environment for connection parameter defaults.  I have
> noticed that this has some confusing effects in our test suites.  I
> wonder if this is a good idea in general.
>
> For example, the TAP test infrastructure sets PGAPPNAME to allow
> identifying clients in the server log.  But this environment variable is
> also inherited by temporary servers started with pg_ctl and is then in
> turn used by libpqwalreceiver as the application_name for connecting to
> remote servers where it then shows up in pg_stat_replication and is
> relevant for things like synchronous_standby_names.
>
> Also, the pg_rewind tests appear to rely implicitly on PGHOST and PGPORT
> being set in the server environment to find the other node via
> primary_conninfo.  That is easy to fix, but it shows that this kind of
> thing can creep in unintentionally.
>
> I was thinking that maybe we should clear all libpq environment
> variables in the server, or perhaps have a mode in libpq to ignore all
> environment variables.  Then again, maybe setting something like
> PGSSLMODE globally in the server could be useful, so just removing
> everything might not be the right answer.
As an author of the C++ client API (Pgfe) that currently wraps libpq I
would like
to ignore all these environment variables that affects libpq's behavior, because
libpq is an implementation detail and the Pgfe API hides it
completely. So +1 from
me for such a mode in libpq.

Reply | Threaded
Open this post in threaded view
|

Re: libpq environment variables in the server

Peter Eisentraut-6
In reply to this post by Peter Eisentraut-6
On 2019-01-21 11:42, Peter Eisentraut wrote:
> Maybe pg_ctl should have some functionality to clear the environment,
> and maybe there could be a facility in postgresql.conf to set
> environment variables.

After pondering this, this seemed to be the best solution.

Many init systems already clear the environment by default, so that the
started services don't have random environment settings inherited from
the user.  However, you can't easily do this when using pg_ctl directly,
so having an option to clear the environment in pg_ctl seems generally
useful.  Then we can use it in the test suites and thus avoid
environment variables leaking in in unintended ways.

That revealed that we do need the second mechanism to set environment
variables after everything has been cleared.  One example in the test
suite is LDAPCONF.  But there are other cases that could be useful, such
as any environment settings that would support archive_command or
restore_command.  I think this would also be a nice feature in general,
so that you can keep all the configuration together in the configuration
files and don't have to rely on external mechanisms to inject these
environment variables.

Attached are two patches that implement these features.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v1-0001-Add-GUC-parameter-environment.patch (6K) Download Attachment
v1-0002-pg_ctl-Add-clear-environment-option.patch (9K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: libpq environment variables in the server

Noah Misch-2
On Mon, Jan 21, 2019 at 11:42:16AM +0100, Peter Eisentraut wrote:
> For example, the TAP test infrastructure sets PGAPPNAME to allow
> identifying clients in the server log.  But this environment variable is
> also inherited by temporary servers started with pg_ctl and is then in
> turn used by libpqwalreceiver as the application_name for connecting to
> remote servers where it then shows up in pg_stat_replication and is
> relevant for things like synchronous_standby_names.

+1 for clearing PGAPPNAME before starting a test postmaster.

> Also, the pg_rewind tests appear to rely implicitly on PGHOST and PGPORT
> being set in the server environment to find the other node via
> primary_conninfo.  That is easy to fix, but it shows that this kind of
> thing can creep in unintentionally.

That's not a defect, and I wouldn't assume it's unintentional.

On Mon, Feb 18, 2019 at 02:58:09PM +0100, Peter Eisentraut wrote:

> On 2019-01-21 11:42, Peter Eisentraut wrote:
> > Maybe pg_ctl should have some functionality to clear the environment,
> > and maybe there could be a facility in postgresql.conf to set
> > environment variables.
>
> After pondering this, this seemed to be the best solution.
>
> Many init systems already clear the environment by default, so that the
> started services don't have random environment settings inherited from
> the user.  However, you can't easily do this when using pg_ctl directly,

I think this qualifies:
env -i LANG=C "$(type -p pg_ctl)" -w restart -D "$PGDATA"

> so having an option to clear the environment in pg_ctl seems generally
> useful.

An option would offer value on Windows, where "env" is not installed by
default.  (Your current patch makes the flag a no-op on Windows, but I suppose
that would have been part of finishing the patch.)  However, I'd also expect
more unintended consequences from wiping the environment on Windows.  Module
loads will need %PATH% to find dependent libraries, and I'd hesitate to remove
a ubiquitous variable like %SYSTEMROOT%.  Overall, I'm -1 on having this
--clear-environment option.

> Then we can use it in the test suites and thus avoid
> environment variables leaking in in unintended ways.

Some of my buildfarm animals[1] do need LD_LIBRARY_PATH preserved.  Once the
TAP suites support[2] TEMP_CONFIG, I suppose testers could use that to mutate
the "environment" GUC.  This test suite change has only cosmetic benefits,
which one can instead achieve by clearing just PGAPPNAME.  Since clearing all
environment variables in test suite postmasters would break existing setups
until they adjust, -1 on that even if we end up gaining a --clear-environment
option.  Testing a bare environment even reduces our test coverage somewhat.

> That revealed that we do need the second mechanism to set environment
> variables after everything has been cleared.  One example in the test
> suite is LDAPCONF.  But there are other cases that could be useful, such
> as any environment settings that would support archive_command or
> restore_command.  I think this would also be a nice feature in general,
> so that you can keep all the configuration together in the configuration
> files and don't have to rely on external mechanisms to inject these
> environment variables.

That's slightly nice.

> +        line.  Note that changing this parameter at run time will not unset
> +        previously set environment variables.

I prefer to see that fixed or the setting made PGC_POSTMASTER.

PGC_SIGHUP makes this an interesting feature.  Suppose you install a new
loadable module that relies on an LD_LIBRARY_PATH setting.  I've never needed
it myself, but I can see value in being able to mutate LD_LIBRARY_PATH without
a postmaster restart.  At PGC_POSTMASTER, the value is limited to mild init
script simplification.

When I was committing pgwin32_putenv() fixes 54aa6cc, 202dbdb and 95b9b8a, I
found that Perl, Python (https://bugs.python.org/issue1159) and Tcl all cache
the environment and ignore environment changes subsequent to cache
initialization.  Hence, a post-startup GUC change would not affect Perl %ENV
of backends that initialized Perl before the config reload.  That's a minor
factor in favor of PGC_POSTMASTER.  In general, expect little when changing
the environment after startup.

> + {"environment", PGC_SIGHUP, CLIENT_CONN_OTHER,
> + gettext_noop("Environment variables to be set."),
> + NULL,
> + GUC_LIST_INPUT | GUC_LIST_QUOTE

This requires a variable_is_guc_list_quote() update.

> +static void
> +assign_environment_settings(const char *newval, void *extra)
> +{
> + char *rawstring = pstrdup(newval);
> + List *elemlist;
> + ListCell *lc;
> +
> + if (!SplitGUCList(rawstring, ',', &elemlist))

All other list GUCs use SplitIdentifierString(); is this right?

> + {
> + /* syntax error in list */
> + list_free_deep(elemlist);
> + pfree(rawstring);
> + ereport(ERROR,
> + (errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("invalid list syntax in parameter \"%s\"",
> + "parameter")));

The error-throwing part belongs in a check_ function, not an assign_ function.

> + }
> +
> + foreach(lc, elemlist)
> + {
> + char   *setting = lfirst(lc);
> +
> + putenv(guc_strdup(ERROR, setting));
> + }

This leaks a fresh copy on every config reload.  That's too much, but I think
it's okay to leak memory when the before-reload value differs from the latest
value.

nm

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2019-03-02%2009%3A22%3A14
[2] https://www.postgresql.org/message-id/flat/20181229021950.GA3302966%40rfd.leadboat.com

Reply | Threaded
Open this post in threaded view
|

Re: libpq environment variables in the server

Peter Eisentraut-6
On 2019-03-03 09:06, Noah Misch wrote:
> +1 for clearing PGAPPNAME before starting a test postmaster.

> I think this qualifies:
> env -i LANG=C "$(type -p pg_ctl)" -w restart -D "$PGDATA"

OK, let's make this simpler then.  Here is an updated patch that just
unsets PGAPPNAME around pg_ctl invocations in the testing library, and
then adjusts the tests to remove no longer needed application_name
overrides.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

0001-Don-t-propagate-PGAPPNAME-through-pg_ctl-in-tests.patch (31K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: libpq environment variables in the server

Noah Misch-2
Looks good.

On Thu, Mar 14, 2019 at 12:06:45PM +0100, Peter Eisentraut wrote:

> + # Temporarily unset PGAPPNAME so that the server doesn't inherit
> + # it.  Otherwise this could affect libpqwalreceiver connections in
> + # confusing ways.
> + my $save_pgappname = $ENV{PGAPPNAME};
> + delete $ENV{PGAPPNAME};
> +
>   # Note: We set the cluster_name here, not in postgresql.conf (in
>   # sub init) so that it does not get copied to standbys.
>   my $ret = TestLib::system_log('pg_ctl', '-D', $self->data_dir, '-l',
>   $self->logfile, '-o', "--cluster-name=$name", 'start');
>  
> + $ENV{PGAPPNAME} = $save_pgappname;
> +

I consider the following style more idiomatic:

 {
     local %ENV;
     delete $ENV{PGAPPNAME};
     ...
 }

I'm okay with the way you've written it, though.

Reply | Threaded
Open this post in threaded view
|

Re: libpq environment variables in the server

Peter Eisentraut-6
On 2019-03-15 05:00, Noah Misch wrote:
> I consider the following style more idiomatic:
>
>  {
>      local %ENV;
>      delete $ENV{PGAPPNAME};
>      ...
>  }

That doesn't work because the first line clears the entire environment.

What does work is

{
    delete local $ENV{PGAPPNAME};
    ...
}

But that is documented as new in Perl 5.12.0, so we might not be able to
use it.  It appears to work in the 5.8.9 I have lying around, so I'm
confused.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: libpq environment variables in the server

Dagfinn Ilmari Mannsåker
Peter Eisentraut <[hidden email]> writes:

> On 2019-03-15 05:00, Noah Misch wrote:
>> I consider the following style more idiomatic:
>>
>>  {
>>      local %ENV;
>>      delete $ENV{PGAPPNAME};
>>      ...
>>  }
>
> That doesn't work because the first line clears the entire environment.

The solution to that is to do 'local %ENV = %ENV;', to assign a copy of
the original to the localised variable.  This doesn't work on VMS,
because its concept of environment variables is quite different from
UNIX, but PostgreSQL doesn't support that anyway.

> What does work is
>
> {
>     delete local $ENV{PGAPPNAME};
>     ...
> }
>
> But that is documented as new in Perl 5.12.0, so we might not be able to
> use it.  It appears to work in the 5.8.9 I have lying around, so I'm
> confused.

It "works" as in it's not a syntax error, but it doesn't actually
localise the deletion. The following program:

    use strict;
    use warnings;
    use feature 'say';

    our %env = qw(foo bar baz bat);
    say "original:  ", join(", ", sort keys %env);
    {
        delete local $env{foo};
        say "localised: ", join(", ", sort keys %env);
    }
    say "restored?  ", join(", ", sort keys %env);

on 5.12 prints:

    original:  baz, foo
    localised: baz
    restored?  baz, foo

while on 5.10 it prints:

    original:  baz, foo
    localised: baz
    restored?  baz

BTW, https://perl.bot/ is handy for testing things like this on various
Perl versions you don't have lying around.

- ilmari
--
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.               - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.                      - Calle Dybedahl

Reply | Threaded
Open this post in threaded view
|

Re: libpq environment variables in the server

Noah Misch-2
On Fri, Mar 15, 2019 at 10:06:29AM +0000, Dagfinn Ilmari Mannsåker wrote:

> Peter Eisentraut <[hidden email]> writes:
>
> > On 2019-03-15 05:00, Noah Misch wrote:
> >> I consider the following style more idiomatic:
> >>
> >>  {
> >>      local %ENV;
> >>      delete $ENV{PGAPPNAME};
> >>      ...
> >>  }
> >
> > That doesn't work because the first line clears the entire environment.
>
> The solution to that is to do 'local %ENV = %ENV;', to assign a copy of
> the original to the localised variable.

That's the right thing, not what I wrote.  We use that in
src/bin/initdb/t/001_initdb.pl.

Reply | Threaded
Open this post in threaded view
|

Re: libpq environment variables in the server

Peter Eisentraut-6
On 2019-03-15 16:01, Noah Misch wrote:

>>>>  {
>>>>      local %ENV;
>>>>      delete $ENV{PGAPPNAME};
>>>>      ...
>>>>  }
>>>
>>> That doesn't work because the first line clears the entire environment.
>>
>> The solution to that is to do 'local %ENV = %ENV;', to assign a copy of
>> the original to the localised variable.
>
> That's the right thing, not what I wrote.  We use that in
> src/bin/initdb/t/001_initdb.pl.

Great.  Committed that way.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services