Strange OSX make check-world failure


Chris Travers-7
Logs are below.  This happens on master, and on 10.  I suspect it is an issue with something regarding ecpg.  Wondering what I am doing wrong.

============== creating temporary instance            ==============
============== initializing database system           ==============
============== starting postmaster                    ==============
running on port 50853 with PID 20314
============== creating database "ecpg1_regression"   ==============
CREATE DATABASE
ALTER DATABASE
============== creating database "ecpg2_regression"   ==============
CREATE DATABASE
ALTER DATABASE
============== creating role "regress_ecpg_user1"     ==============
CREATE ROLE
GRANT
GRANT
============== creating role "regress_ecpg_user2"     ==============
CREATE ROLE
GRANT
GRANT
============== running regression test queries        ==============
test compat_informix/dec_test ... ok
test compat_informix/charfuncs ... ok
test compat_informix/rfmtdate ... ok
test compat_informix/rfmtlong ... ok
test compat_informix/rnull    ... ok
test compat_informix/sqlda    ... ok
test compat_informix/describe ... ok
test compat_informix/test_informix ... ok
test compat_informix/test_informix2 ... ok
test connect/test2            ... ok
test connect/test3            ... ok
test connect/test4            ... ok
test connect/test5            ... ok
test pgtypeslib/dt_test       ... stdout stderr FAILED (test process was terminated by signal 6: Abort trap)
test pgtypeslib/dt_test2      ... stdout stderr FAILED (test process was terminated by signal 6: Abort trap)
test pgtypeslib/num_test      ... stdout stderr FAILED (test process was terminated by signal 6: Abort trap)
test pgtypeslib/num_test2     ... stdout stderr FAILED (test process was terminated by signal 6: Abort trap)
test pgtypeslib/nan_test      ... ok
test preproc/array_of_struct  ... ok
test preproc/pointer_to_struct ... ok
test preproc/autoprep         ... ok
test preproc/comment          ... ok
test preproc/cursor           ... ok
test preproc/define           ... ok
test preproc/init             ... ok
test preproc/strings          ... ok
test preproc/type             ... ok
test preproc/variable         ... ok
test preproc/outofscope       ... ok
test preproc/whenever         ... ok
test sql/array                ... ok
test sql/binary               ... ok
test sql/code100              ... ok
test sql/copystdout           ... ok
test sql/define               ... ok
test sql/desc                 ... ok
test sql/sqlda                ... stdout stderr FAILED (test process was terminated by signal 6: Abort trap)
test sql/describe             ... ok
test sql/dynalloc             ... ok
test sql/dynalloc2            ... ok
test sql/dyntest              ... ok
test sql/execute              ... ok
test sql/fetch                ... ok
test sql/func                 ... ok
test sql/indicators           ... ok
test sql/oldexec              ... ok
test sql/quote                ... ok
test sql/show                 ... ok
test sql/insupd               ... ok
test sql/parser               ... ok
test thread/thread            ... ok
test thread/thread_implicit   ... ok
test thread/prep              ... ok
test thread/alloc             ... ok
test thread/descriptor        ... ok
test sql/twophase             ... stderr FAILED
============== shutting down postmaster               ==============

=======================
 6 of 56 tests failed.
=======================


--
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin


Re: Strange OSX make check-world failure

Tom Lane-2
Chris Travers <[hidden email]> writes:
> Logs are below.  This happens on master, and on 10.  I suspect it is an
> issue with something regarding ecpg.  Wondering what I am doing wrong.

"make check" generally won't work on OSX unless you've disabled SIP:

https://www.howtogeek.com/230424/how-to-disable-system-integrity-protection-on-a-mac-and-why-you-shouldnt/

That might not be the issue --- I'd have rather expected a failure
sooner --- but it's worth checking.

The reason why it doesn't work is basically that Apple sabotages
the DYLD_LIBRARY_PATH mechanism, causing the tests to load whatever
version of libpq.dylib (and the ecpg libraries) might exist in your
system library directories or the install target directory, rather
than the files in the build tree.  Possibly the reason you got this
far is that your install target is already reasonably up to date
for libpq, but not so much for ecpg.

                        regards, tom lane
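
One way to see the effect described above on a given machine is to set DYLD_LIBRARY_PATH and ask a child /bin/sh whether it still sees the variable. This is only a minimal sketch and not part of the PostgreSQL build; the function name probe_dyld_passthrough and the marker path are made up for illustration. Going by the explanation above, a Mac with SIP enabled would be expected to print "stripped", while other systems would be expected to print "kept".

```shell
# Report whether DYLD_LIBRARY_PATH survives being passed through /bin/sh.
# Prints "kept" if the child shell still sees the variable, "stripped" if not.
# probe_dyld_passthrough is a hypothetical helper name, not a PostgreSQL tool.
probe_dyld_passthrough() {
    # Arbitrary marker value; it only needs to be recognizable in the child.
    probe_value=${1:-/tmp/dyld-probe}
    seen=$(DYLD_LIBRARY_PATH="$probe_value" /bin/sh -c 'printf %s "${DYLD_LIBRARY_PATH-}"')
    if [ "$seen" = "$probe_value" ]; then
        echo kept
    else
        echo stripped
    fi
}
```

Calling probe_dyld_passthrough with no arguments is enough; the optional argument only changes the marker value.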


Re: Strange OSX make check-world failure

Thomas Munro-3
On Tue, Sep 18, 2018 at 2:14 AM Tom Lane <[hidden email]> wrote:

> Chris Travers <[hidden email]> writes:
> > Logs are below.  This happens on master, and on 10.  I suspect it is an
> > issue with something regarding ecpg.  Wondering what I am doing wrong.
>
> "make check" generally won't work on OSX unless you've disabled SIP:
>
> https://www.howtogeek.com/230424/how-to-disable-system-integrity-protection-on-a-mac-and-why-you-shouldnt/
>
> That might not be the issue --- I'd have rather expected a failure
> sooner --- but it's worth checking.

$ sw_vers
ProductName: Mac OS X
ProductVersion: 10.13.4
BuildVersion: 17E199
$ csrutil status
System Integrity Protection status: enabled.
$ make -s -C src/interfaces/ecpg check
... snip ...
======================
 All 58 tests passed.
======================

Hmm... why does this work for me... let's see where it gets libraries from:

$ otool -L src/interfaces/ecpg/test/pgtypeslib/dt_test
src/interfaces/ecpg/test/pgtypeslib/dt_test:
        /Users/munro/install/postgres2/lib/libecpg.6.dylib (compatibility version 6.0.0, current version 6.12.0)
        /Users/munro/install/postgres2/lib/libpgtypes.3.dylib (compatibility version 3.0.0, current version 3.12.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.50.4)

Aha!  It looks like it was important to run "make install" before
running those tests.  Let's see what happens if I remove those
libraries and try again:

...
test pgtypeslib/dt_test           ... stdout stderr FAILED (test process was terminated by signal 6: Abort trap)
...

Same result as Chris.

--
Thomas Munro
http://www.enterprisedb.com
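
The otool check above can be turned into a small helper that lists the libraries a test binary points at and flags the ones that are not on disk, which is the situation that produced the abort. This is a rough sketch under some assumptions: the function names list_dylib_paths and missing_dylibs are invented here, paths are assumed to contain no whitespace, and system libraries are skipped because on newer macOS releases they may exist only in the dyld shared cache rather than as files.

```shell
# Read `otool -L` output on stdin and print one dependency path per line.
# The first line of otool output names the inspected file, so it is skipped.
list_dylib_paths() {
    awk 'NR > 1 { print $1 }'
}

# Print every non-system dependency of the given binary that is missing on disk.
# (Hypothetical helper; it relies on macOS's otool being in PATH.)
missing_dylibs() {
    otool -L "$1" | list_dylib_paths | while IFS= read -r lib; do
        case $lib in
            # Skip dyld tokens such as @executable_path and system libraries.
            @*|/usr/lib/*|/System/*) continue ;;
        esac
        [ -e "$lib" ] || echo "missing: $lib"
    done
}
```

For example, `missing_dylibs src/interfaces/ecpg/test/pgtypeslib/dt_test` would be expected to print the libecpg and libpgtypes paths when "make install" has not been run.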


Re: Strange OSX make check-world failure

Tom Lane-2
Thomas Munro <[hidden email]> writes:
> On Tue, Sep 18, 2018 at 2:14 AM Tom Lane <[hidden email]> wrote:
>> "make check" generally won't work on OSX unless you've disabled SIP:
>> https://www.howtogeek.com/230424/how-to-disable-system-integrity-protection-on-a-mac-and-why-you-shouldnt/

> Aha!  It looks like it was important to run "make install" before
> running those tests.

Right.  If you don't want to disable SIP, you can work around it by always
doing "make install" before "make check".  Kind of a PITA though.

                        regards, tom lane


Re: Strange OSX make check-world failure

Samuel Cochran
Hi folks 👋

Forgive me if I'm getting the mailing list etiquette wrong — first time poster.

I ended up sitting next to Thomas Munro at PGDU 2018 and talking about testing. While trying to get `make check` running on my MacBook, I think I may have fixed this issue.

System Integrity Protection strips dynamic linker (dyld) environment variables, such as DYLD_LIBRARY_PATH, during exec(2) [1], so we need to rewrite the load paths inside binaries when relocating them during make temp-install before make check on darwin. Homebrew does something similar [2]. I've attached a patch which adjusts the Makefile and gets make check working on my machine with SIP intact.

Cheers,
Sam

  [1]: https://developer.apple.com/library/archive/documentation/Security/Conceptual/System_Integrity_Protection_Guide/RuntimeProtections/RuntimeProtections.html
  [2]: https://github.com/Homebrew/brew/blob/77e6a927504c51a1393a0a6ccaf6f2611ac4a9d5/Library/Homebrew/os/mac/keg.rb#L17-L30



Attachment: installname.patch (2K)

Re: Strange OSX make check-world failure

Tom Lane-2
Samuel Cochran <[hidden email]> writes:
> System Integrity Protection strips dynamic linker (dyld) environment variables, such as DYLD_LIBRARY_PATH, during exec(2) [1]

Yeah.  I wish Apple would just fix that silliness ... I'll spare you the
rant about why it's stupid, but it is.  (BTW, last I looked, it's not
exec(2) per se that's doing the damage; the problem is that we're
invoking a sub-shell that's considered a protected program for some
reason, and it's only the use of that that causes DYLD_LIBRARY_PATH
to get removed from the process environment.)

> so we need to rewrite the load paths inside binaries when relocating them during make temp-install before make check on darwin.

Interesting proposal, but I think it needs work.

* As coded, this only fixes the problem for references to libpq, not
any of our other shared libraries.

* It's also unpleasant that it hard-wires knowledge of libpq's version
numbering in a place pretty far removed from anywhere that should know
that.

* Just to be annoying, this won't work at all on 32-bit OSX versions
unless we link everything with -headerpad_max_install_names.  (I know
Apple forgot about 32-bit machines long ago, but our buildfarm hasn't.)

* Speaking of not working, I don't think this "find" invocation will
report any failure exits from install_name_tool.

* This doesn't fix anything for executables that never get installed,
for instance isolationtester.

We could probably fix the first four problems with some more sweat,
but I'm not seeing a plausible answer to the last one.  Overwriting
isolationtester's rpath to make "make check" work would just break
it for "make installcheck".

                        regards, tom lane
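
The point about "find" not reporting failures can be illustrated in isolation. The patch itself is not shown in the thread, so this sketch assumes it used the `-exec ... \;` form. With that form, find treats the command's result only as a test predicate and its own exit status stays 0, whereas POSIX requires `-exec ... {} +` to make find exit non-zero if any invocation fails. The function names below are hypothetical.

```shell
# With `-exec ... \;` a failing command does not change find's exit status.
run_ignoring_failures() {
    dir=$1; shift
    find "$dir" -type f -exec "$@" {} \;
}

# With `-exec ... {} +` find exits non-zero if any invocation of the command fails.
# Note that the command then receives several file names per invocation.
run_reporting_failures() {
    dir=$1; shift
    find "$dir" -type f -exec "$@" {} +
}
```

Running both against a directory containing at least one file with `false` as the command shows the difference. Since install_name_tool edits one file per call, the `+` form would need a small wrapper script or an inline `sh -c` loop around it, or a plain shell loop that exits on the first failure.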


Re: Strange OSX make check-world failure

Samuel Cochran
On Fri, Dec 7, 2018, at 5:26 PM, Tom Lane wrote:
> Interesting proposal, but I think it needs work.

Absolutely! I only hacked it together to the point that it worked on my laptop and illustrated the approach. :-)

> * As coded, this only fixes the problem for references to libpq, not
> any of our other shared libraries.

None of the other shared libraries are referenced by the modified binaries:

$ for bin in tmp_install/usr/local/pgsql/bin/*; do otool -L $bin; done | grep dylib | sort -u
        .../tmp_install/usr/local/pgsql/lib/libpq.5.dylib (compatibility version 5.0.0, current version 5.12.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.200.5)
        /usr/lib/libedit.3.dylib (compatibility version 2.0.0, current version 3.0.0)
        /usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)

But I agree it would be nice to make it work in potential future cases, too.

> * It's also unpleasant that it hard-wires knowledge of libpq's version
> numbering in a place pretty far removed from anywhere that should know
> that.

Ideally it would iterate the binaries, iterate the load commands, and rewrite each.

> * Just to be annoying, this won't work at all on 32-bit OSX versions
> unless we link everything with -headerpad_max_install_names.  (I know
> Apple forgot about 32-bit machines long ago, but our buildfarm hasn't.)

We can make the references relative which would dramatically decrease the sizes.

> * Speaking of not working, I don't think this "find" invocation will
> report any failure exits from install_name_tool.

If we iterate more carefully, as above, then failures should be reported and cause an abort.

> * This doesn't fix anything for executables that never get installed,
> for instance isolationtester.
>
> We could probably fix the first four problems with some more sweat,
> but I'm not seeing a plausible answer to the last one.  Overwriting
> isolationtester's rpath to make "make check" work would just break
> it for "make installcheck".

Ah, sorry, I'm not super familiar yet with the build process so I missed this bit. But I think executable-relative paths will fix it.

I tried using this line instead and `make check` and `make installcheck` both work for me. It's awful, I'm not super fluent in Makefile so I'm sure it could be 100X better, and probably isn't quoted correctly, but the approach itself works. I couldn't quickly figure out a portable way to generate a relative path from bindir to libdir which would be a great improvement.

$(if $(filter $(PORTNAME),darwin),for binary in $(abs_top_builddir)/tmp_install$(bindir)/*; do for dylib in $$(otool -L "$$binary" | tail -n +2 | awk '{ print $$1 }' | grep '$(libdir)'); do install_name_tool -change "$$dylib" "@executable_path/../lib/$${dylib##*/}" "$$binary" || exit $$?; done; done)
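
For readability, the same loop can be written as a standalone shell function outside the Makefile. This is a sketch of the approach rather than a proposed patch: the name relocate_dylib_refs is invented, the bin and lib directories are assumed to be siblings under one prefix (as the @executable_path/../lib rewrite already assumes), and paths are assumed to contain no whitespace.

```shell
# Rewrite references to libraries under $libdir so that each binary in $bindir
# looks for them relative to its own location instead.
# relocate_dylib_refs is a hypothetical name; otool and install_name_tool are
# the macOS tools used in the one-liner above.
relocate_dylib_refs() {
    bindir=$1
    libdir=$2
    for binary in "$bindir"/*; do
        [ -f "$binary" ] || continue
        # Skip otool's first line (the file name), keep the path column, and
        # select only references that contain the libdir path.
        dylibs=$(otool -L "$binary" |
            awk -v dir="$libdir/" 'NR > 1 && index($1, dir) > 0 { print $1 }')
        for dylib in $dylibs; do
            install_name_tool -change "$dylib" \
                "@executable_path/../lib/${dylib##*/}" "$binary" || return $?
        done
    done
}
```

It would be called as `relocate_dylib_refs "$abs_top_builddir/tmp_install$bindir" "$libdir"`, and it returns the first non-zero status from install_name_tool so that a failure can stop the build.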

Cheers,
Sam
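
On the open question of generating a relative path from bindir to libdir portably: one possibility is plain POSIX parameter expansion, with no dependency on realpath or a scripting language. The sketch below assumes both arguments are absolute and already normalized (no "..", no doubled slashes, no symlink resolution), and the function name relative_path is made up for illustration.

```shell
# Print the relative path that leads from directory $1 to directory $2.
# Both arguments are assumed to be absolute, normalized paths.
relative_path() {
    from=${1%/}
    to=${2%/}
    common=$from
    up=
    # Climb from $from one component at a time until $common is an ancestor
    # of (or equal to) $to, recording one "../" per step.
    while :; do
        stripped=${to#"$common"/}
        if [ "$stripped" != "$to" ] || [ "$to" = "$common" ]; then
            break
        fi
        common=${common%/*}
        up=../$up
    done
    # Whatever remains of $to below the common ancestor is the downward part.
    rest=${to#"$common"}
    rest=${rest#/}
    result=$up$rest
    result=${result%/}
    printf '%s\n' "${result:-.}"
}
```

With the default layout, `relative_path /usr/local/pgsql/bin /usr/local/pgsql/lib` prints `../lib`, which could then replace the hard-coded `../lib` after `@executable_path/`.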