LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes


Mike Yeap
Hi all, I have encountered a problem with LDAP-authenticated sessions that use the Postgres foreign data wrapper (postgres_fdw).

The server crashed with the following errors, and other active server processes were terminated as well:
2019-02-20 14:53:30.496 SGT [PID=1353 application="" user_name= database= host(port)=] LOG:  server process (PID 26306) was terminated by signal 11: Segmentation fault

2019-02-20 14:53:30.496 SGT [PID=1353 application="" user_name= database= host(port)=] LOG:  terminating any other active server processes

I can reproduce it on a test server with many other sessions connected:

1. Log in as a non-LDAP-authenticated user, query local and foreign tables - OK
2. Log in as an LDAP-authenticated user, query a local table - OK
3. Log in as an LDAP-authenticated user, query a foreign table - ERROR, the server crashes with "signal 11: Segmentation fault" when I quit the psql session

The problem seems to occur only when an LDAP-authenticated session that has queried a foreign table is terminated. In the dmesg log, I can see the following:

[16385512.182231] traps: postmaster[26306] general protection ip:7f1e758b638c sp:7ffef7ed8858 error:0 in libc-2.17.so[7f1e75836000+1b6000]

Has anyone encountered a similar issue?

######################
PostgreSQL version: 10.6
Platform: CentOS Linux
######################

Thank you.

Regards,
Mike Yeap
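
For readers who want to take the dmesg line in the report further: subtracting the mapping base from the instruction pointer gives the offset of the faulting instruction inside libc. The following is a minimal sketch, assuming the x86 "traps:" line format shown in the report; the name fault_offset is made up for illustration and is not part of any tool.

```python
import re

# Matches the "ip:<hex> ... in <lib>[<base>+<size>]" part of a traps: line.
TRAP_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+) .* in (?P<lib>[^\[\s]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def fault_offset(dmesg_line):
    """Return (library, offset of faulting instruction), or None if no match."""
    m = TRAP_RE.search(dmesg_line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base
    # The instruction pointer must fall inside the reported mapping.
    if not 0 <= offset < size:
        return None
    return m.group("lib"), offset

line = ("[16385512.182231] traps: postmaster[26306] general protection "
        "ip:7f1e758b638c sp:7ffef7ed8858 error:0 in "
        "libc-2.17.so[7f1e75836000+1b6000]")
lib, offset = fault_offset(line)
print(lib, hex(offset))  # libc-2.17.so 0x8038c
```

With debug symbols for that exact libc build, the offset could then be handed to a symbolizer such as addr2line or gdb; that step is not shown in the thread.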

Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Laurenz Albe
Mike Yeap wrote:

> I have encountered a problem related to LDAP authenticated session with Postgres foreign data wrapper (postgres_fdw).
>
> The server crashed with following errors and other active server processes are terminated as well:
> 2019-02-20 14:53:30.496 SGT [PID=1353 application="" user_name= database= host(port)=] LOG:  server process (PID 26306) was terminated by signal 11: Segmentation fault
>
> 2019-02-20 14:53:30.496 SGT [PID=1353 application="" user_name= database= host(port)=] LOG:  terminating any other active server processes
>
> I can reproduce it in a test server with many other sessions connected:
>
> 1. login using non-LDAP-authenticated user, query local & foreign tables - OK
> 2. login using LDAP-authenticated user, query local table - OK
> 3. login using LDAP-authenticated user, query foreign table - ERROR, server crashes with signal 11: Segmentation fault error when I quit the psql session

Are the "postgres" executable and libpq linked with the same version of OpenLDAP?

Any other extensions installed?

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com



Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Tom Lane-2
Laurenz Albe <[hidden email]> writes:
> Mike Yeap wrote:
>> I have encountered a problem related to LDAP authenticated session with Postgres foreign data wrapper (postgres_fdw).

> Are the "postgres" executable and libpq linked with the same version of OpenLDAP?

And which version is that?  (And which version of Postgres?)

Digging around in our git history, I came across this:

Author: Noah Misch <[hidden email]>
Branch: master Release: REL9_5_BR [d7cdf6ee3] 2014-07-22 11:01:03 -0400

    Diagnose incompatible OpenLDAP versions during build and test.
   
    With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, PostgreSQL
    backends can crash at exit.  Raise a warning during "configure" based on
    the compile-time OpenLDAP version number, and test the crash scenario in
    the dblink test suite.  Back-patch to 9.0 (all supported versions).

which sounds a fair bit like what you are describing.

                        regards, tom lane
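
The range named in the commit message can be checked mechanically against a package version string. This is a minimal sketch, assuming a yum-style version such as "2.4.44-21.el7_6" (optionally with an epoch prefix like "1:"). The helper names are hypothetical, and the real configure check looks at the compile-time OpenLDAP version rather than at an installed package.

```python
import re

# Inclusive range quoted in the commit message.
FLAGGED_MIN = (2, 4, 24)
FLAGGED_MAX = (2, 4, 31)

def parse_openldap_version(text):
    """Extract (major, minor, patch) from e.g. '2.4.44-21.el7_6' or '1:2.3.43-5.el7'."""
    m = re.match(r"(?:\d+:)?(\d+)\.(\d+)\.(\d+)", text)
    if m is None:
        raise ValueError(f"unrecognized version: {text!r}")
    return tuple(int(part) for part in m.groups())

def in_flagged_range(text):
    """True if the version lies in 2.4.24 through 2.4.31, inclusive."""
    return FLAGGED_MIN <= parse_openldap_version(text) <= FLAGGED_MAX

print(in_flagged_range("2.4.31"))           # True
print(in_flagged_range("2.4.44-21.el7_6"))  # False
```

A version outside the range only means the configure warning would not fire; it says nothing about whether the crash can still happen.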


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Mike Yeap
> Are the "postgres" executable and libpq linked with the same version of OpenLDAP?
How should I check whether they are linked?

My Postgres version is 10.6 and I have this output for "yum list | grep ldap | sort":
$ yum list | grep ldap | sort

apr-util-ldap.x86_64                        1.5.2-6.el7                base
bind-dyndb-ldap.x86_64                      11.1-4.el7                 base
compat-openldap.i686                        1:2.3.43-5.el7             base
compat-openldap.x86_64                      1:2.3.43-5.el7             base
cyrus-sasl-ldap.i686                        2.1.26-23.el7              base
cyrus-sasl-ldap.x86_64                      2.1.26-23.el7              base
freeradius-ldap.x86_64                      3.0.13-9.el7_5             base
ipsilon-authldap.noarch                     1.0.0-13.el7_3             base
krb5-server-ldap.x86_64                     1.15.1-37.el7_6            updates
ldapjdk-javadoc.noarch                      4.19-5.el7                 base
ldapjdk.noarch                              4.19-5.el7                 base
mod_ldap.x86_64                             2.4.6-88.el7.centos        base
nss-pam-ldapd.i686                          0.8.13-16.el7              base
nss-pam-ldapd.x86_64                        0.8.13-16.el7              base
openldap-clients.x86_64                     2.4.44-21.el7_6            @updates
openldap-devel.i686                         2.4.44-21.el7_6            updates
openldap-devel.x86_64                       2.4.44-21.el7_6            updates
openldap.i686                               2.4.44-21.el7_6            updates
openldap-servers-sql.x86_64                 2.4.44-21.el7_6            updates
openldap-servers.x86_64                     2.4.44-21.el7_6            updates
openldap.x86_64                             2.4.44-21.el7_6            @updates
openssh-ldap.x86_64                         7.4p1-16.el7               base
php-ldap.x86_64                             5.4.16-46.el7              base
python-ldap2pg-doc.x86_64                   4.11-1.rhel7               pgdg10
python-ldap2pg.x86_64                       4.11-1.rhel7               pgdg10
python-ldap.x86_64                          2.4.15-2.el7               base
sssd-ldap.x86_64                            1.16.2-13.el7_6.5          updates

And in the database where I encountered this issue I have these extensions installed:

repdb=# \dx
                                      List of installed extensions
        Name        | Version |   Schema   |                        Description
--------------------+---------+------------+------------------------------------------------------------
 hstore             | 1.4     | public     | data type for storing sets of (key, value) pairs
 pg_stat_statements | 1.6     | repdb      | track execution statistics of all SQL statements executed
 plpgsql            | 1.0     | pg_catalog | PL/pgSQL procedural language
 postgres_fdw       | 1.0     | repdb      | foreign-data wrapper for remote PostgreSQL servers
 tablefunc          | 1.0     | repdb      | functions that manipulate whole tables, including crosstab
(5 rows)

Thank you.

Regards,
Mike Yeap


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Tom Lane-2
Mike Yeap <[hidden email]> writes:
>> Are the "postgres" executable and libpq linked with the same version of
>> OpenLDAP?

> How should I check whether they are linked?

"ldd" should show the dependencies of whatever executable or library
you point it at.

                        regards, tom lane


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Mike Yeap
Hi Tom, when I run "ldd /usr/pgsql-10/bin/postmaster" I get this output:

# ldd /usr/pgsql-10/bin/postmaster
linux-vdso.so.1 =>  (0x00007ffd4ec65000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007eff8b5d3000)
libxml2.so.2 => /lib64/libxml2.so.2 (0x00007eff8b268000)
libpam.so.0 => /lib64/libpam.so.0 (0x00007eff8b059000)
libssl.so.10 => /lib64/libssl.so.10 (0x00007eff8ade7000)
libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007eff8a985000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007eff8a738000)
librt.so.1 => /lib64/librt.so.1 (0x00007eff8a530000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007eff8a32b000)
libm.so.6 => /lib64/libm.so.6 (0x00007eff8a029000)
libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00007eff89dd4000)
libicui18n.so.50 => /lib64/libicui18n.so.50 (0x00007eff899d4000)
libicuuc.so.50 => /lib64/libicuuc.so.50 (0x00007eff8965b000)
libsystemd.so.0 => /lib64/libsystemd.so.0 (0x00007eff89633000)
libc.so.6 => /lib64/libc.so.6 (0x00007eff89271000)
/lib64/ld-linux-x86-64.so.2 (0x00007eff8b7f9000)
libz.so.1 => /lib64/libz.so.1 (0x00007eff8905b000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007eff88e35000)
libaudit.so.1 => /lib64/libaudit.so.1 (0x00007eff88c0c000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007eff88924000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007eff88720000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007eff884ec000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007eff882de000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007eff880da000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007eff87ebf000)
liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x00007eff87cb0000)
libsasl2.so.3 => /lib64/libsasl2.so.3 (0x00007eff87a93000)
libssl3.so => /lib64/libssl3.so (0x00007eff8784f000)
libsmime3.so => /lib64/libsmime3.so (0x00007eff87628000)
libnss3.so => /lib64/libnss3.so (0x00007eff87302000)
libnssutil3.so => /lib64/libnssutil3.so (0x00007eff870d5000)
libplds4.so => /lib64/libplds4.so (0x00007eff86ed1000)
libplc4.so => /lib64/libplc4.so (0x00007eff86ccc000)
libnspr4.so => /lib64/libnspr4.so (0x00007eff86a8d000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007eff86785000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007eff8656f000)
libicudata.so.50 => /lib64/libicudata.so.50 (0x00007eff84f9a000)
libcap.so.2 => /lib64/libcap.so.2 (0x00007eff84d95000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007eff84b6e000)
libgcrypt.so.11 => /lib64/libgcrypt.so.11 (0x00007eff848ec000)
libgpg-error.so.0 => /lib64/libgpg-error.so.0 (0x00007eff846e7000)
libdw.so.1 => /lib64/libdw.so.1 (0x00007eff844a0000)
libcap-ng.so.0 => /lib64/libcap-ng.so.0 (0x00007eff84299000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007eff84062000)
libattr.so.1 => /lib64/libattr.so.1 (0x00007eff83e5c000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x00007eff83bfa000)
libelf.so.1 => /lib64/libelf.so.1 (0x00007eff839e2000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x00007eff837d1000)
libfreebl3.so => /lib64/libfreebl3.so (0x00007eff835ce000)

The line that mentions ldap is:

libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00007eff89dd4000)

Sorry, but in this case, what is my libpq?

Regards,
Mike Yeap
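
Picking the LDAP lines out of long ldd output like the listing above is easy to script. This is a minimal sketch, assuming the usual "name => path (address)" ldd line format; the function names are hypothetical. The same filter could be applied to the libpq shared library that postgres_fdw loads, whose location depends on the packaging and is not given in the thread.

```python
import subprocess

def ldap_dependencies(ldd_output):
    """Return {soname: path} for the libldap/liblber lines of ldd output."""
    deps = {}
    for line in ldd_output.splitlines():
        parts = line.split()
        # Lines of interest look like: "libldap-2.4.so.2 => /lib64/... (0x...)"
        if len(parts) >= 3 and parts[1] == "=>":
            soname = parts[0]
            if soname.startswith(("libldap", "liblber")):
                # Unresolved libraries ("=> not found") are recorded as None.
                deps[soname] = parts[2] if parts[2].startswith("/") else None
    return deps

def ldd(path):
    """Run ldd on a file and return its text output (needs ldd on PATH)."""
    result = subprocess.run(["ldd", path], capture_output=True,
                            text=True, check=True)
    return result.stdout

# Three lines copied from the output above.
sample = """\
libm.so.6 => /lib64/libm.so.6 (0x00007eff8a029000)
libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00007eff89dd4000)
liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x00007eff87cb0000)
"""
print(ldap_dependencies(sample))
```

Calling ldap_dependencies(ldd(path)) once for the server binary and once for libpq would give the two sets to compare.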


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Thomas Munro-5
In reply to this post by Mike Yeap
On Thu, Feb 21, 2019 at 2:42 PM Mike Yeap <[hidden email]> wrote:
> openldap-clients.x86_64                     2.4.44-21.el7_6            @updates
> openldap-devel.i686                         2.4.44-21.el7_6            updates
> openldap-devel.x86_64                       2.4.44-21.el7_6            updates
> openldap.i686                               2.4.44-21.el7_6            updates
> openldap-servers-sql.x86_64                 2.4.44-21.el7_6            updates
> openldap-servers.x86_64                     2.4.44-21.el7_6            updates
> openldap.x86_64                             2.4.44-21.el7_6            @updates

> On Wed, Feb 20, 2019 at 10:17 PM Tom Lane <[hidden email]> wrote:
>>     With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, PostgreSQL
>>     backends can crash at exit.  Raise a warning during "configure" based on
>>     the compile-time OpenLDAP version number, and test the crash scenario in
>>     the dblink test suite.  Back-patch to 9.0 (all supported versions).

Clearly 2.4.44 is not in the range 2.4.24 through 2.4.31.  Perhaps the
dangerous range is out of date?  Hmm, so Noah's analysis[1] says this
is a clash between libldap_r.so (used by libpq) and libldap.so (used
by the server), specifically in destructor/exit code.  Curiously, in a
thread about Curl's struggles with this problem, I found a claim[2]
that Debian decided to abandon the non-"_r" variant and just use _r
always.  Sure enough, on my Debian buster VM I see a symlink
libldap-2.4.so.2 -> libldap_r-2.4.so.2.  So essentially Debian and
friends have already forced Noah's first option on users:

> 1. Link the backend with libldap_r, so we never face the mismatch. On some
> platforms, this means also linking in threading libraries.

FreeBSD and CentOS systems near me have separate libraries still.

[1] https://www.postgresql.org/message-id/flat/20140612210219.GA705509%40tornado.leadboat.com
[2] https://www.openldap.org/lists/openldap-technical/201608/msg00094.html

--
Thomas Munro
https://enterprisedb.com
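
The mismatch described in this message (libldap_r on the libpq side, libldap on the server side) can be expressed as a small check over library names. This is a minimal sketch with hypothetical helper names; the libpq link set shown is an assumed example, not output taken from the thread.

```python
def ldap_variant(soname):
    """Classify a libldap soname as 'threaded' (_r) or 'non-threaded'."""
    if not soname.startswith("libldap"):
        return None  # liblber and unrelated libraries are ignored
    return "threaded" if soname.startswith("libldap_r") else "non-threaded"

def variants(sonames):
    """Set of libldap variants present in a list of sonames."""
    return {v for v in map(ldap_variant, sonames) if v is not None}

def mixes_variants(server_sonames, libpq_sonames):
    """True if the two link sets together pull in both libldap variants."""
    return len(variants(server_sonames) | variants(libpq_sonames)) > 1

server = ["libldap-2.4.so.2", "liblber-2.4.so.2"]
libpq = ["libldap_r-2.4.so.2", "liblber-2.4.so.2"]  # assumed example link set
print(mixes_variants(server, libpq))   # True
print(mixes_variants(server, server))  # False
```

Names alone are not conclusive: on a system where libldap-2.4.so.2 is a symlink to libldap_r-2.4.so.2, as on the Debian VM mentioned above, both names resolve to the same file, so resolving symlinks first (for example with os.path.realpath) would be needed.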


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Mike Yeap
Hi Thomas, does that mean the bug is still there?

Regards,
Mike Yeap


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Thomas Munro-5
On Tue, Feb 26, 2019 at 8:17 PM Mike Yeap <[hidden email]> wrote:
> Hi Thomas, does that mean the bug is still there?

Hi Mike,

I haven't tried to repro this myself, but it certainly sounds like it.
It also sounds like it would probably go away if you switched to a
Debian-derived distro, instead of a Red Hat-derived distro, but I
doubt that's the kind of advice you were looking for.  We need to
figure out a proper solution here, though I'm not sure what.  Question
for the list: other stuff in the server needs libpthread (SSL, LLVM,
...), so why are we insisting on using non-MT LDAP?

--
Thomas Munro
https://enterprisedb.com


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Mike Yeap
Hi Thomas, I see. I guess I can't use LDAP authentication for now. :-(

Hopefully this problem will be solved in a future version. Thank you!

Regards,
Mike Yeap


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Thomas Munro-5
In reply to this post by Thomas Munro-5
On Tue, Feb 26, 2019 at 9:11 PM Thomas Munro <[hidden email]> wrote:
> On Tue, Feb 26, 2019 at 8:17 PM Mike Yeap <[hidden email]> wrote:
> > Hi Thomas, does that mean the bug is still there?

> I haven't tried to repro this myself, but it certainly sounds like it.
> It also sounds like it would probably go away if you switched to a
> Debian-derived distro, instead of a Red Hat-derived distro, but I
> doubt that's the kind of advice you were looking for.  We need to
> figure out a proper solution here, though I'm not sure what.  Question
> for the list: other stuff in the server needs libpthread (SSL, LLVM,
> ...), so why are we insisting on using non-MT LDAP?

Concretely, why don't we just kill the LDAP_LIBS_FE/LDAP_LIBS_BE
distinction and use a single LDAP_LIBS?  Then it'll always match.  It
can still be the non-MT variant if you build with
--disable-thread-safety (who does that?), but then it'll be the same
in the server too so that postgres_fdw + ldap works that way too.
Sketch patch attached.


--
Thomas Munro
https://enterprisedb.com

same-lib-ldap-everywhere.patch (7K)

Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Stephen Frost
In reply to this post by Mike Yeap
Greetings Mike,

* Mike Yeap ([hidden email]) wrote:
> Hi Thomas, I see..... guess I can't use LDAP authentication for now, :-(

If you're in an active directory environment, you should really be using
Kerberos for authentication and NOT LDAP anyway.  LDAP-based
authentication involves sending the user's password (cleartext) to the
PG server, which is really bad security.  Hopefully you're at least
connecting to PG with SSL, and from PG to LDAP with SSL, but you still
run the issue that a compromised server would expose the password of
everyone connecting to that server, and when you're using a centralized
authentication system like LDAP, that one password gets you access to
everything that account has access to.

Thanks!

Stephen


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Tom Lane-2
In reply to this post by Thomas Munro-5
Thomas Munro <[hidden email]> writes:
> Question
> for the list: other stuff in the server needs libpthread (SSL, LLVM,
> ...), so why are we insisting on using non-MT LDAP?

The traditional reason for avoiding that is the risk of a server
process becoming multi-threaded.  There are live bugs of that ilk
on Darwin, and we actually have cross-checks for the case in our
code (see HAVE_PTHREAD_IS_THREADED_NP stanzas).

If pthread_is_threaded_np(), or something equivalent, is widely available
then it might be all right to try solving this going forward by switching
to libldap_r and seeing if anyone hits those cross-checks.  I'd be afraid
to risk it in the back branches though ...

                        regards, tom lane


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Thomas Munro-5
On Wed, Feb 27, 2019 at 3:57 AM Tom Lane <[hidden email]> wrote:

> Thomas Munro <[hidden email]> writes:
> > Question
> > for the list: other stuff in the server needs libpthread (SSL, LLVM,
> > ...), so why are we insisting on using non-MT LDAP?
>
> The traditional reason for avoiding that is the risk of a server
> process becoming multi-threaded.  There are live bugs of that ilk
> on Darwin, and we actually have cross-checks for the case in our
> code (see HAVE_PTHREAD_IS_THREADED_NP stanzas).
>
> If pthread_is_threaded_np(), or something equivalent, is widely available
> then it might be all right to try solving this going forward by switching
> to libldap_r and seeing if anyone hits those cross-checks.  I'd be afraid
> to risk it in the back branches though ...

Hmm.  Well here is a new data point: it looks like the Red Hat family
of distributions is in the process of making the same decision as
Debian (namely: to expunge the non-MT variant, because it bites
various projects in the same way that it bites us), but they haven't
quite pulled the trigger yet:

https://fedoraproject.org/wiki/Changes/OpenLDAPwithoutNonthreadedLibraries

So if we do nothing at all, it seems likely that this problem will
eventually go away by itself on practically all Linux systems, leaving
this unfixed LDAP vs postgres_fdw bug to trip up the other Unix
systems.  Bleugh.

I don't see pthread_is_threaded_np() on any non-Apple systems in my
lab.  Clearly libldap_r is *capable* of creating threads: it contains a
function ldap_pvt_thread_create(), and we can see that slapd and other
OpenLDAP things use that, but AFAICT that's a private facility not
intended for end users to call, so there's no danger if you just use
the documented LDAP client API.  Since pthread_is_threaded_np() is a
Mac thing, note also that Macs aren't directly exposed to this
particular choice anyway because (at least if you use system-provided
libraries rather than MacPorts et al) libldap.dylib and
libldap_r.dylib are already symlinks to the same Apple voodoo
"/System/Library/Frameworks/LDAP.framework/Versions/A/LDAP".

--
Thomas Munro
https://enterprisedb.com


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Tom Lane-2
Thomas Munro <[hidden email]> writes:
> On Wed, Feb 27, 2019 at 3:57 AM Tom Lane <[hidden email]> wrote:
>> If pthread_is_threaded_np(), or something equivalent, is widely available
>> then it might be all right to try solving this going forward by switching
>> to libldap_r and seeing if anyone hits those cross-checks.  I'd be afraid
>> to risk it in the back branches though ...

> Hmm.  Well here is a new data point: it looks like the Red Hat family
> of distributions is in the process of making the same decision as
> Debian (namely: to expunge the non-MT variant, because it bites
> various projects in the same way that it bites us), but they haven't
> quite pulled the trigger yet:
> https://fedoraproject.org/wiki/Changes/OpenLDAPwithoutNonthreadedLibraries

Interesting, but that's going to be a very slow change.  That says they'll
pull the trigger in Fedora 30, which I think is due to be released this
spring --- but it won't show up in RHEL till the next major release (8
or maybe even 9 at this point), and the existing major releases have got
10-year support lifespans.

> I don't see pthread_is_threaded_np() on any non-Apple systems in my
> lab.

Yeah, I thought that might be a Mac thing.  I wonder if POSIX has any
usable equivalent.

> Clearly libldap_r is *capable* of creating threads: it contains a
> function ldap_pvt_thread_create(), and we can see that slapd and other
> OpenLDAP things use that, but AFAICT that's a private facility not
> intended for end users to call, so there's no danger if you just use
> the documented LDAP client API.

That seems promising, but I'd sure be happier if we could cross-check
that there's still just one thread at the completion of authentication.

                        regards, tom lane
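
On Linux, one way to see from outside how many threads a backend has is the "Threads:" field of /proc/<pid>/status. This is a minimal sketch with hypothetical function names; it is Linux-specific and is an external inspection aid only, not the in-server cross-check discussed above.

```python
def thread_count(status_text):
    """Return the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1])
    raise ValueError("no Threads: field found")

def process_thread_count(pid):
    """Read the thread count of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return thread_count(f.read())

# Abbreviated example of the file's layout.
sample = "Name:\tpostgres\nState:\tS (sleeping)\nThreads:\t1\n"
print(thread_count(sample))  # 1
```

Running process_thread_count on the PID of an LDAP-authenticated backend after login would show whether it is still single-threaded at that point.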


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Thomas Munro-5
Adding Noah to thread.

On Wed, Feb 27, 2019 at 11:28 AM Tom Lane <[hidden email]> wrote:
> Thomas Munro <[hidden email]> writes:
> > I don't see pthread_is_threaded_np() on any non-Apple systems in my
> > lab.
>
> Yeah, I thought that might be a Mac thing.  I wonder if POSIX has any
> usable equivalent.

I don't see anything like that (the concept doesn't seem very
portable).  I couldn't find a way on Glibc (but I'm not saying there
isn't one hiding somewhere).  FreeBSD has a thing much like macOS's
(and I think some more BSDs do too); it's set to true by libthr when
the first thread is created, to make libc start locking various stuff.

The macOS one probably isn't a good canary to protect us from OpenLDAP
creating threads since on typical macOS builds we're using Apple's
LDAP thing (which cybersquats libldap.dylib and libldap_r.dylib via
symlinks).  So adding a FreeBSD check seems like a good idea, because
at least one FreeBSD system in our buildfarm runs the ldap checks on
real OpenLDAP (elver).

> > Clearly libldap_r is *capable* of creating threads: it contains a
> > function ldap_pvt_thread_create(), and we can see that slapd and other
> > OpenLDAP things use that, but AFAICT that's a private facility not
> > intended for end users to call, so there's no danger if you just use
> > the documented LDAP client API.
>
> That seems promising, but I'd sure be happier if we could cross-check
> that there's still just one thread at the completion of authentication.

Ok, here's that patch again with a commit message and with the
configure version warning removed, and a make-sure-we're-not-threaded
patch for FreeBSD.

I'm not sure what to do about the LDAP test in
contrib/dblink/sql/dblink.sql.  Do we still want this?

I propose this for master only, for now.  I also think it'd be nice to
consider back-patching it after a while, especially since this
reportedly broke on CentOS/RHEL7, a pretty popular OS that'll be around
for a good while.  Hmm, I wonder if it's OK to subtly change library
dependencies in a minor release; I don't see any problem with it since
I expect both variants to be provided by the same package in every
distro but we'd certainly want to highlight this to the package
maintainers if we did it.

--
Thomas Munro
https://enterprisedb.com

0001-Test-__isthreaded-on-FreeBSD-and-friends.patch (6K)
0002-Use-the-same-libldap-variant-in-the-frontend-and-bac.patch (13K)

Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Noah Misch-2
On Thu, Mar 07, 2019 at 10:45:56AM +1300, Thomas Munro wrote:

> On Wed, Feb 27, 2019 at 11:28 AM Tom Lane <[hidden email]> wrote:
> > Thomas Munro <[hidden email]> writes:
> > > I don't see pthread_is_threaded_np() on any non-Apple systems in my
> > > lab.
> >
> > Yeah, I thought that might be a Mac thing.  I wonder if POSIX has any
> > usable equivalent.
>
> I don't see anything like that (the concept doesn't seem very
> portable).

I'm not aware of one.

> > > Clearly libldap_r is *capable* of creating threads: it contains a
> > > function ldap_pvt_thread_create(), and we can see that slapd and other
> > > OpenLDAP things use that, but AFAICT that's a private facility not
> > > intended for end users to call, so there's no danger if you just use
> > > the documented LDAP client API.
> >
> > That seems promising, but I'd sure be happier if we could cross-check
> > that there's still just one thread at the completion of authentication.
>
> Ok, here's that patch again with a commit message and with the
> configure version warning removed, and a make-sure-we're-not-threaded
> patch for FreeBSD.
>
> I'm not sure what to do about the LDAP test in
> contrib/dblink/sql/dblink.sql.  Do we still want this?

Mike, does the dblink test suite not fail on your system?  It's designed to
catch this exact problem.

Has anyone else reproduced this?

> I propose this for master only, for now.  I also think it'd be nice to
> consider back-patching it after a while, especially since this
> reportedly broke on CentOS/RHEL7, a pretty popular OS that'll be around
> for a good while.  Hmm, I wonder if it's OK to subtly change library
> dependencies in a minor release; I don't see any problem with it since
> I expect both variants to be provided by the same package in every
> distro but we'd certainly want to highlight this to the package
> maintainers if we did it.

It's not great to change library dependencies in a minor release.  If every
RHEL 7 installation can crash this way, changing the dependencies is probably
the least bad thing.


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Thomas Munro-5
On Thu, Mar 7, 2019 at 4:19 PM Noah Misch <[hidden email]> wrote:
> Has anyone else reproduced this?

I tried, but could not reproduce this problem on "CentOS Linux release
7.6.1810 (Core)" using OpenLDAP "2.4.44-21.el7_6" (same as Mike
reported, what yum install is currently serving up).  I tried "make
check" in contrib/dblink, and the only strange thing I noticed was
this FATAL error at the top of contrib/dblink/log/postmaster.log:

2019-03-14 03:51:33.058 UTC [20131] LOG:  database system is ready to accept connections
2019-03-14 03:51:33.059 UTC [20135] [unknown] FATAL:  the database system is starting up

I don't see that on other systems and don't understand it.

I also tried a test of my own which I thought corresponded directly to
what Mike described, on both master and REL_10_STABLE.  I'll record my
steps here so perhaps someone can see what's missing.

1.  Run the regression test under src/test/ldap so that you get some
canned slapd configuration files.
2.  cd into src/test/ldap/tmp_check and run "slapd -f slapd.conf -h
ldap://localhost:5555".  It should daemonize itself, and run until you
kill it with SIGINT.
3.  Put this into pg_hba.conf:
host postgres test1 127.0.0.1/32 ldap ldapserver=localhost ldapport=5555 ldapbasedn="dc=example,dc=net"
4.  Create database objects as superuser:
create user test1;
create table t (i int);
grant all on t to test1;
create extension postgres_fdw;
create server foreign_server foreign data wrapper postgres_fdw
  options (dbname 'postgres', host '127.0.0.1');
create foreign table ft (i int) server foreign_server options (table_name 't');
create user mapping for test1 server foreign_server
  options (user 'test1', password 'secret1');
grant all on ft to test1;
5.  Now you should be able to log in with "psql -h 127.0.0.1 postgres
test1" and password "secret1", and run queries like: select * from ft;

When exiting the session, I was expecting the backend to crash,
because it had executed libldap.so code during authentication, and
then it had linked in libldap_r.so via libpq.so while connecting via
postgres_fdw.  But it doesn't crash.  I wonder what is different for
Mike; am I missing something, or is there non-determinism here?
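
One way to see where each LDAP variant comes from is to compare what the server binary and libpq are linked against. The following is a minimal sketch under assumptions, not a step anyone in the thread ran: only_ldap_deps is a hypothetical helper name, and the example paths assume pg_config on the PATH belongs to the installation being examined.

```shell
# Reduce `ldd` output (read from stdin) to the LDAP-related dependencies.
# only_ldap_deps is a hypothetical helper name used for illustration.
only_ldap_deps() {
    # A typical ldd line is: "<soname> => <path> (<load address>)".
    awk '$2 == "=>" && $1 ~ /^lib(ldap|lber)/ { print $1 " -> " $3 }'
}

# Example (assumes pg_config points at the installation in question):
#   ldd "$(pg_config --bindir)/postgres" | only_ldap_deps
#   ldd "$(pg_config --libdir)/libpq.so" | only_ldap_deps
```

If the server lists libldap while libpq lists libldap_r, a backend that authenticates via LDAP and then loads libpq through postgres_fdw or dblink ends up with both variants in one process, which is the situation described above.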

> > I propose this for master only, for now.  I also think it'd be nice to
> > consider back-patching it after a while, especially since this
> > reportedly broke on CentOS/RHEL7, a pretty popular OS that'll be around
> > for a good while.  Hmm, I wonder if it's OK to subtly change library
> > dependencies in a minor release; I don't see any problem with it since
> > I expect both variants to be provided by the same package in every
> > distro but we'd certainly want to highlight this to the package
> > maintainers if we did it.
>
> It's not great to change library dependencies in a minor release.  If every
> RHEL 7 installation can crash this way, changing the dependencies is probably
> the least bad thing.

+1, once we get a repro and/or better understanding.

--
Thomas Munro
https://enterprisedb.com


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Noah Misch-2
On Thu, Mar 14, 2019 at 05:18:49PM +1300, Thomas Munro wrote:
> On Thu, Mar 7, 2019 at 4:19 PM Noah Misch <[hidden email]> wrote:
> > Has anyone else reproduced this?
>
> I tried, but could not reproduce this problem on "CentOS Linux release
> 7.6.1810 (Core)" using OpenLDAP "2.4.44-21.el7_6" (same as Mike
> reported, what yum install is currently serving up).

> When exiting the session, I was expecting the backend to crash,
> because it had executed libldap.so code during authentication, and
> then it had linked in libldap_r.so via libpq.so while connecting via
> postgres_fdw.  But it doesn't crash.  I wonder what is different for
> Mike; am I missing something, or is there non-determinism here?

The test is deterministic.  I'm guessing Mike's system is finding ldap
libraries other than the usual system ones.  Mike, would you check as follows?

$ echo "select pg_backend_pid(); load 'dblink'; select pg_sleep(100)" | psql -X &
[1] 2530123
  pg_backend_pid
----------------
        2530124
(1 row)

LOAD

$ gdb --batch --pid 2530124 -ex 'info sharedlibrary ldap'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007ffff6303463 in __epoll_wait_nocancel () from /lib64/libc.so.6
From                To                  Syms Read   Shared Object Library
0x00007ffff65e1ee0  0x00007ffff6613304  Yes (*)     /lib64/libldap-2.4.so.2
0x00007fffe998f6d0  0x00007fffe99c3ae4  Yes (*)     /lib64/libldap_r-2.4.so.2
(*): Shared library is missing debugging information.
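
Where gdb is not installed, a similar check can be sketched on Linux by reading /proc/<pid>/maps for the backend. This is an illustrative alternative rather than the check requested above: the function name list_ldap_libs is hypothetical, it relies on the Linux-specific /proc layout, and it assumes library paths contain no spaces.

```shell
# List the distinct LDAP/LBER shared objects mapped into a process (Linux only).
# list_ldap_libs is a hypothetical helper name used for illustration.
list_ldap_libs() {
    # Field 6 of each /proc/<pid>/maps line is the backing file path, if any.
    awk '$6 ~ /lib(ldap|lber)/ { print $6 }' "$1" | LC_ALL=C sort -u
}

# Example (assumes 2530124 is the backend PID printed by pg_backend_pid()):
#   list_ldap_libs /proc/2530124/maps
```

Unlike the gdb output, this shows only the paths and not the load addresses, but it is enough to tell whether the libraries come from the usual system location.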


Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Mike Yeap
Hi Noah, below is the output from one of the servers having this issue:

$ echo "select pg_backend_pid(); load 'dblink'; select pg_sleep(100)" | psql -X &
[1] 9731

$ select pg_backend_pid(); load 'dblink'; select pg_sleep(100)
 pg_backend_pid
----------------
           9732
(1 row)

LOAD

$ gdb --batch --pid 9732 -ex 'info sharedlibrary ldap'

warning: .dynamic section for "/lib64/libldap-2.4.so.2" is not at the expected address (wrong library or version mismatch?)

warning: .dynamic section for "/lib64/liblber-2.4.so.2" is not at the expected address (wrong library or version mismatch?)
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007f1e7592dcf3 in __epoll_wait_nocancel () from /lib64/libc.so.6
From                To                  Syms Read   Shared Object Library
0x00007f1e7637d0f8  0x00007f1e763ae51c  Yes (*)     /lib64/libldap-2.4.so.2
0x00007f1d9f2c16d0  0x00007f1d9f2f5ae4  Yes (*)     /lib64/libldap_r-2.4.so.2
(*): Shared library is missing debugging information.


Regards,
Mike Yeap
