libpq host/hostaddr/conninfo inconsistencies


Fabien COELHO-3

Hello devs,

While reviewing various patches by Tom which are focussing on libpq
multi-host behavior,

  https://commitfest.postgresql.org/19/1749/
  https://commitfest.postgresql.org/19/1752/

it occurred to me that there are a few more problems with the
documentation, the host/hostaddr feature, and the consistency of both.
Namely:

* According to the documentation, either "host" or "hostaddr" can be
specified: the former for names and socket directories, the latter for IP
addresses. If both are specified, "hostaddr" supersedes "host", and "host"
may still be used for various authentication purposes.

However, the actual capability is slightly different: specifying an IP
address to "host" does work, without incurring any name or reverse name
look-ups, even if this is undocumented.  This means that the host/hostaddr
dichotomy is somewhat moot, as host can already be used for the same
purpose.

* \conninfo does not follow the implemented logic, and, as there is no
sanity check performed on the specification, it can display wrong
information, which is not going to be helpful to anyone with a problem
to solve and trying to figure out the current state:

   sh> psql "host=/tmp hostaddr=127.0.0.1"
   psql> \conninfo
   You are connected to database "fabien" as user "fabien" via socket in "/tmp" at port "5432"
   # wrong, it is really connected to 127.0.0.1 by TCP/IP

   sh> psql "host=127.0.0.2 hostaddr=127.0.0.1"
   psql> \conninfo
   You are connected to database "fabien" as user "fabien" on host "127.0.0.2" at port "5432".
   # wrong again, it is really connected to 127.0.0.1

   sh> psql "hostaddr=127.0.0.1"
   psql> \conninfo
   You are connected to database "fabien" as user "fabien" via socket in "/var/run/postgresql" at port "5432".
   # wrong again

* Another issue with \conninfo is that if a host resolves to multiple ips,
there is no way to know which was chosen and/or worked, although on errors
some messages show the failing ip.

* The host/hostaddr dichotomy worsens when several targets are specified,
because according to the documentation you should specify names &
dirs as host and IPs as hostaddr, which leads to pretty strange specs, each
being a possible source of confusion and unhelpful messages as described
above:

   sh> psql "host=localhost,127.0.0.2,, hostaddr=127.0.0.1,,127.0.0.3,"
   # attempt 1 is 127.0.0.1 identified as localhost
   # attempt 2 is 127.0.0.2
   # attempt 3 is 127.0.0.3 identified as the default, whatever it is
   # attempt 4 is really the default

* The documentation about host/hostaddr/port accepting lists is really
added as an afterthought: the features are presented for one value, and then
the list is mentioned. Moreover, there are quite a few repetitions between
the paragraphs about defaults and so on.
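
To make the multi-target example above easier to follow, here is a small
Python model of the pairing it describes. This is only an illustration of the
behavior reported in this message, not libpq code: the names plan_attempts and
DEFAULT are made up for the sketch, and the equal-length requirement is an
assumption of the sketch.

```python
# Illustrative model only; not taken from libpq.
DEFAULT = "<default>"  # placeholder for whatever the default host is


def plan_attempts(host: str, hostaddr: str) -> list[tuple[str, str]]:
    """Return (reported_as, actually_connects_to) for each attempt."""
    hosts = host.split(",")
    addrs = hostaddr.split(",")
    if len(hosts) != len(addrs):
        raise ValueError("host and hostaddr lists differ in length")
    attempts = []
    for name, addr in zip(hosts, addrs):
        reported = name or DEFAULT   # an empty entry falls back to the default
        target = addr or reported    # a non-empty hostaddr supersedes host
        attempts.append((reported, target))
    return attempts


attempts = plan_attempts("localhost,127.0.0.2,,", "127.0.0.1,,127.0.0.3,")
for number, (reported, target) in enumerate(attempts, start=1):
    print(f"attempt {number}: reported as {reported}, connects to {target}")
```

The four pairs it produces correspond to the four attempts annotated in the
example: the reported name and the real target differ in attempts 1 and 3.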


Given this state of affairs, ISTM that the situation would be clarified by:

(1) describing "host" full capability to accept names, ips and dirs.

(2) describing "hostaddr" as a look-up shortcut. Maybe the "hostaddr"
could be renamed in passing, eg "resolve" to outline that it is just a
lookup shortcut, and not a partial alternative to "host".

(3) checking that hostaddr non empty addresses are only accepted if the
corresponding host is a name. The user must use the "host=ip" syntax
to connect to an ip.

(4) teaching \conninfo to show the real connection, which probably requires
extending libpq to access the underlying IP, eg PQaddr or PQhostaddr or
whatever.

The attached patch does 1-3 (2 without renaming, though).
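
Check (3) might look roughly like the following sketch. It is a Python model
written for illustration, not the code from the attached patch: host_kind and
check_pair are hypothetical names, and treating any value starting with "/" as
a socket directory is a simplifying assumption.

```python
import ipaddress


def host_kind(host: str) -> str:
    """Classify a host value as 'socket', 'ip' or 'name'."""
    if host.startswith("/"):
        return "socket"          # assumed: a leading slash means a socket directory
    try:
        ipaddress.ip_address(host)
        return "ip"              # parses as an IPv4 or IPv6 literal
    except ValueError:
        return "name"            # anything else is treated as a host name


def check_pair(host: str, hostaddr: str) -> None:
    """Raise ValueError if a non-empty hostaddr accompanies a non-name host."""
    if hostaddr and host_kind(host) != "name":
        raise ValueError(f'host "{host}" cannot have a hostaddr "{hostaddr}"')


check_pair("localhost", "127.0.0.1")   # accepted: hostaddr is a lookup shortcut
check_pair("127.0.0.1", "")            # accepted: connect to an IP via host
```

Under this rule, both "host=/tmp hostaddr=127.0.0.1" and
"host=127.0.0.2 hostaddr=127.0.0.1" would be rejected up front.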

Thoughts?

--
Fabien.

Attachment: libpq-host-ip-1.patch (22K)

Re: libpq host/hostaddr/conninfo inconsistencies

Fabien COELHO-3

> The attached patch does 1-3 (2 without renaming, though).

Attached is a rebase after 5ca00774.

--
Fabien.

Attachment: libpq-host-ip-2.patch (21K)

Re: libpq host/hostaddr/conninfo inconsistencies

Tom Lane-2
Fabien COELHO <[hidden email]> writes:
> Attached is a rebase after 5ca00774.

I notice that the cfbot thinks that *none* of your pending patches apply
successfully.  I tried this one locally and what I get is

$ patch -p1 <~/libpq-host-ip-2.patch
(Stripping trailing CRs from patch.)
patching file doc/src/sgml/libpq.sgml
(Stripping trailing CRs from patch.)
patching file src/interfaces/libpq/fe-connect.c

as compared to the cfbot report, in which every hunk is rejected:

=== applying patch ./libpq-host-ip-2.patch
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
|index 5e7931ba90..086172d4f0 100644
|--- a/doc/src/sgml/libpq.sgml
|+++ b/doc/src/sgml/libpq.sgml
--------------------------
Patching file doc/src/sgml/libpq.sgml using Plan A...
Hunk #1 failed at 964.
Hunk #2 failed at 994.
2 out of 2 hunks failed--saving rejects to doc/src/sgml/libpq.sgml.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
|index a8048ffad2..34025ba041 100644
|--- a/src/interfaces/libpq/fe-connect.c
|+++ b/src/interfaces/libpq/fe-connect.c
--------------------------
Patching file src/interfaces/libpq/fe-connect.c using Plan A...
Hunk #1 failed at 908.
Hunk #2 failed at 930.
Hunk #3 failed at 943.
Hunk #4 failed at 974.
Hunk #5 failed at 1004.
Hunk #6 failed at 1095.
Hunk #7 failed at 2098.
Hunk #8 failed at 2158.
Hunk #9 failed at 6138.
9 out of 9 hunks failed--saving rejects to src/interfaces/libpq/fe-connect.c.rej
done

So I'm speculating that the cfbot is using a version of patch(1) that
doesn't have strip-trailing-CRs logic.  Which bemuses me, because
I thought they all did.
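
For a patch(1) that lacks such logic, one workaround is to normalize the line
endings before applying the patch. The following is a minimal Python sketch of
that idea; strip_trailing_crs and normalize_patch_file are hypothetical helper
names, and nothing here reflects how the cfbot is actually implemented.

```python
def strip_trailing_crs(patch_bytes: bytes) -> bytes:
    """Convert CRLF line endings to LF; lone CR or LF bytes are left alone."""
    return patch_bytes.replace(b"\r\n", b"\n")


def normalize_patch_file(src_path: str, dst_path: str) -> None:
    """Write an LF-only copy of the patch at src_path to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_trailing_crs(data))


sample = b"--- a/doc/src/sgml/libpq.sgml\r\n+++ b/doc/src/sgml/libpq.sgml\r\n"
print(strip_trailing_crs(sample))
```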

                        regards, tom lane


Re: libpq host/hostaddr/conninfo inconsistencies

Fabien COELHO-3

> I notice that the cfbot thinks that *none* of your pending patches apply
> successfully.  I tried this one locally and what I get is

Hmmm. :-(

I've reverted to sending MIME-conformant "text/x-diff" CRLF attachments,
as "text/plain" did the same and you rightfully complained that
"application/octet-stream" was a bad choice.

I do not know how to force my MUA to send MIME-broken text attachments
with LF only, which are indeed sent by other MUAs (eg thunderbird on macos
does it, and Tom your mailer seems to do it as well, dunno what it is,
though).

So I'm out of choices:-(

--
Fabien.


Re: libpq host/hostaddr/conninfo inconsistencies

a.zakirov
In reply to this post by Fabien COELHO-3
Hello,

On Fri, Aug 24, 2018 at 11:22:47AM +0200, Fabien COELHO wrote:
> Attached is a rebase after 5ca00774.

I looked at the patch a little bit, and I have a few notes.

> However, the actual capability is slightly different: specifying an ip
> address to "host" does work, without ensuing any name or reverse name
> look-ups, even if this is undocumented.

Agreed, the documentation could give more details here.

> sh> psql "host=/tmp hostaddr=127.0.0.1"

Yeah, this example shows that a user may be confused by the output of
\conninfo. I think it is both a psql issue and a libpq issue. psql's
exec_command_conninfo() relies only on the PQhost() result. Can we add a
function PQhostType() to solve this issue?
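
A rough Python model of why the message can mislead, assuming (as stated
above) that the text is derived from the host value alone. The function name
conninfo_message is made up for this sketch, the message wording is copied
from the outputs quoted in this thread, and the case where no host is given is
left out on purpose.

```python
def conninfo_message(dbname: str, user: str, host: str,
                     hostaddr: str, port: str) -> str:
    """Build the message from host only; hostaddr is deliberately unused."""
    if host.startswith("/"):
        where = f'via socket in "{host}"'
    else:
        where = f'on host "{host}"'
    return (f'You are connected to database "{dbname}" as user "{user}" '
            f'{where} at port "{port}".')


# hostaddr is where the connection really goes, yet it never shows up.
print(conninfo_message("fabien", "fabien", "/tmp", "127.0.0.1", "5432"))
print(conninfo_message("fabien", "fabien", "127.0.0.2", "127.0.0.1", "5432"))
```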

> sh> psql "host=127.0.0.2 hostaddr=127.0.0.1"

I'm not sure that this is an issue. The user defined the host name and psql
shows it.

> sh> psql "hostaddr=127.0.0.1"

I cannot reproduce it. It gives me the message:

You are connected to database "artur" as user "artur" on host "127.0.0.1" at port "5432".

I think it is because of the environment (I didn't define the PGHOST
variable, for example). If so, depending on the PGHOST value
("/tmp" or "127.0.0.1"), it is related to the first or second issue.

> * Another issue with \conninfo is that if a host resolves to multiple ips,
> there is no way to know which was chosen and/or worked, although on errors
> some messages show the failing ip.

Can you explain it please? You can use PQhost() to find out the chosen host.

> * The documentation about host/hostaddr/port accepting lists is really
> added as an afterthought: the features are presented for one, and then the
> list is mentionned.

I cannot agree with you. When I was learning libpq, I found the description
of the host/hostaddr rules useful, and I disagree that it is good
to remove it (as the patch does).
Of course this is only my point of view, and others may have a different opinion.

> (3) checking that hostaddr non empty addresses are only accepted if the
> corresponding host is a name. The user must use the "host=ip" syntax
> to connect to an ip.

The patch gives me an error if I specify only hostaddr:

psql -d "hostaddr=127.0.0.1"
psql: host "/tmp" cannot have an hostaddr "127.0.0.1"

It is wrong, because I didn't specify host=/tmp.

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: libpq host/hostaddr/conninfo inconsistencies

Thomas Munro-3
In reply to this post by Tom Lane-2
On Sat, Aug 25, 2018 at 7:25 AM Tom Lane <[hidden email]> wrote:

> Fabien COELHO <[hidden email]> writes:
> > Attached is a rebase after 5ca00774.
>
> I notice that the cfbot thinks that *none* of your pending patches apply
> successfully.  I tried this one locally and what I get is
>
> $ patch -p1 <~/libpq-host-ip-2.patch
> (Stripping trailing CRs from patch.)
> patching file doc/src/sgml/libpq.sgml
> (Stripping trailing CRs from patch.)
> patching file src/interfaces/libpq/fe-connect.c
>
> as compared to the cfbot report, in which every hunk is rejected:
>
> === applying patch ./libpq-host-ip-2.patch
> Hmm...  Looks like a unified diff to me...
> The text leading up to this was:
> --------------------------
> |diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
> |index 5e7931ba90..086172d4f0 100644
> |--- a/doc/src/sgml/libpq.sgml
> |+++ b/doc/src/sgml/libpq.sgml
> --------------------------
> Patching file doc/src/sgml/libpq.sgml using Plan A...
> Hunk #1 failed at 964.
> Hunk #2 failed at 994.
> 2 out of 2 hunks failed--saving rejects to doc/src/sgml/libpq.sgml.rej
> Hmm...  The next patch looks like a unified diff to me...
> The text leading up to this was:
> --------------------------
> |diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
> |index a8048ffad2..34025ba041 100644
> |--- a/src/interfaces/libpq/fe-connect.c
> |+++ b/src/interfaces/libpq/fe-connect.c
> --------------------------
> Patching file src/interfaces/libpq/fe-connect.c using Plan A...
> Hunk #1 failed at 908.
> Hunk #2 failed at 930.
> Hunk #3 failed at 943.
> Hunk #4 failed at 974.
> Hunk #5 failed at 1004.
> Hunk #6 failed at 1095.
> Hunk #7 failed at 2098.
> Hunk #8 failed at 2158.
> Hunk #9 failed at 6138.
> 9 out of 9 hunks failed--saving rejects to src/interfaces/libpq/fe-connect.c.rej
> done
>
> So I'm speculating that the cfbot is using a version of patch(1) that
> doesn't have strip-trailing-CRs logic.  Which bemuses me, because
> I thought they all did.

Huh.  Yeah.  I have now switched it over to GNU patch.  It seems to be
happier with Fabien's patches so far, but will take a few minutes to
catch up with all of them.

--
Thomas Munro
http://www.enterprisedb.com


Re: libpq host/hostaddr/conninfo inconsistencies

Fabien COELHO-3
In reply to this post by a.zakirov

Hello Arthur,

Thanks for the comments.

>> However, the actual capability is slightly different: specifying an ip
>> address to "host" does work, without ensuing any name or reverse name
>> look-ups, even if this is undocumented.
>
> Agree it may have more details within the documentation.
>
>> sh> psql "host=/tmp hostaddr=127.0.0.1"
>
> Yeah this example shows that user may be confused by output of
> \conninfo. I think it is psql issue and libpq issue.

Yep. I'd add that there is a documentation issue as well.

> psql in exec_command_conninfo() rely only on the PQhost() result. Can we
> add a function PQhostType() to solve this issue?

I did not attempt to fix "\conninfo" yet, I focussed on the host/hostaddr
documentation and consistency checks in libpq.

I agree that at least one additional PQ function is needed.

What to do with a "host type" function is unclear, because it would not
change the output of PQhost() which returns the "host" value even if it
was ignored by the connection, there is no access to "hostaddr"... it is
not enough.

I was thinking that maybe a function could return the full description as
a string, so that the connection logic choices and display are implemented
in libpq only, but this is debatable.

Otherwise a collection of functions, including a host type function, would
be necessary for the client to have full information about the actual
current connection.

>> sh> psql "host=127.0.0.2 hostaddr=127.0.0.1"
>
> I'm not sure that is is the issue. User defined the host name and psql
> show it.

The issue is that "host" is an IP: "\conninfo" will wrongly report that
you are connected to "127.0.0.2", but the actual connection is really to
"127.0.0.1". This is plainly misleading, and I consider this level of
unhelpfulness more a bug than a feature.

>> sh> psql "hostaddr=127.0.0.1"
>
> I cannot reproduce it. It gives me the message:
> You are connected to database "artur" as user "artur" on host "127.0.0.1" at port "5432".
>
> I think it is because of the environment (I didn't define PGHOST
> variable, for example). If so, depending on PGHOST variable value
> ("/tmp" or "127.0.0.1") it is related with first or second issue.

Indeed, hostaddr supersedes the default, whatever it is, so it depends on
the default, which can be overridden with PGHOST.

>> * Another issue with \conninfo is that if a host resolves to multiple ips,
>> there is no way to know which was chosen and/or worked, although on errors
>> some messages show the failing ip.
>
> Can you explain it please? You can use PQhost() to know choosed host.

Indeed PQhost will tell the name. My point is that there will be no clue
about the actual ip used among those possible.

>> * The documentation about host/hostaddr/port accepting lists is really
>> added as an afterthought: the features are presented for one, and then the
>> list is mentionned.
>
> I cannot agree with you. When I've learned libpq before I found
> host/hostaddr rules description useful. And I disagree that it is good
> to remove it (as the patch does).
> Of course it is only my point of view and others may have another opinion.

I'm not sure I understand your concern.

Do you mean that you would prefer the document to keep describing that
host/hostaddr/port accepts one value, and then have in some other place or
at the end of the option documentation a line that say, "by the way, we
really accept lists, and they must be somehow consistent between
host/hostaddr/port"?

>> (3) checking that hostaddr non empty addresses are only accepted if the
>> corresponding host is a name. The user must use the "host=ip" syntax
>> to connect to an ip.


> Patch gives me an error if I specified only hostaddr:
>
> psql -d "hostaddr=127.0.0.1"
> psql: host "/tmp" cannot have an hostaddr "127.0.0.1"

This is the expected modified behavior: hostaddr can only be specified on
a host when it is a name, which is not the case here.

Changing the name to "resolve" would maybe help the user realize that
it is not expected to be used to provide the target host; it is just a DNS
shortcut.

If the user wants to connect to 127.0.0.1, they have to use
"host=127.0.0.1".

> It is wrong, because I didn't specified host=/tmp.

You did not, but this is the default value when you do not specify "host"
explicitly, so it was specified behind your back.

--
Fabien.


Re: libpq host/hostaddr/conninfo inconsistencies

Fabien COELHO-3
In reply to this post by Thomas Munro-3

>> So I'm speculating that the cfbot is using a version of patch(1) that
>> doesn't have strip-trailing-CRs logic.  Which bemuses me, because
>> I thought they all did.
>
> Huh.  Yeah.  I have now switched it over to GNU patch.  It seems to be
> happier with Fabien's patches so far, but will take a few minutes to
> catch up with all of them.

Thanks for the fix. I gather that I'm the only one on the list who uses a
MIME-conformant MUA.

--
Fabien.


Re: libpq host/hostaddr/conninfo inconsistencies

a.zakirov
In reply to this post by Fabien COELHO-3
Sorry for the late answer.

On 9/30/18 10:21 AM, Fabien COELHO wrote:
>>> sh> psql "host=127.0.0.2 hostaddr=127.0.0.1"
>>
>> I'm not sure that is is the issue. User defined the host name and psql
>> show it.
>
> The issue is that "host" is an ip, "\conninfo" will inform wrongly that
> you are connected to "127.0.0.2", but the actual connection is really to
> "127.0.0.1", this is plain misleading, and I consider this level of
> unhelpfullness more a bug than a feature.

I didn't think that this was an issue, because I regarded "host" as
just a display name for the host when "hostaddr" is defined. So a user may
label 127.0.0.1 (hostaddr) as "happy_host", for example. It
doesn't have to be a real host.

I searched for other use cases of PQhost(). In the PostgreSQL source code
I found that it is used in pg_dump and psql to connect to some instance.

There is the following issue with PQhost() and psql (pg_dump could have it
too, see CloneArchive() in pg_backup_archiver.c and _connectDB() in
pg_backup_db.c):

$ psql "host=host_1,host_2 hostaddr=127.0.0.1,127.0.0.3 dbname=postgres"
=# \conninfo
You are connected to database "postgres" as user "artur" on host
"host_1" at port "5432".
=# \connect test
could not translate host name "host_1" to address: Name or service not known
Previous connection kept

So in the example above you cannot reuse connection string with
\connect. What do you think?
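
The failure can be pictured with a small Python model. It assumes, as
described above, that the new connection is built from the PQhost() value
without the original hostaddr; reuse_params and needs_name_lookup are
hypothetical names, and the model only covers the case where host is a name,
as in the example.

```python
def reuse_params(current: dict, dbname: str, keep_hostaddr: bool) -> dict:
    """Build parameters for a new connection from the current one."""
    keys = ["host", "port", "user"]
    if keep_hostaddr:
        keys.append("hostaddr")
    params = {key: current[key] for key in keys if key in current}
    params["dbname"] = dbname
    return params


def needs_name_lookup(params: dict) -> bool:
    """With host set to a name and no hostaddr, the name must be resolved."""
    return bool(params.get("host")) and not params.get("hostaddr")


current = {"host": "host_1", "hostaddr": "127.0.0.1",
           "port": "5432", "user": "artur", "dbname": "postgres"}

dropped = reuse_params(current, "test", keep_hostaddr=False)
kept = reuse_params(current, "test", keep_hostaddr=True)
print(needs_name_lookup(dropped), needs_name_lookup(kept))
```

When hostaddr is dropped, "host_1" has to be resolved, and since it is not a
real host name the lookup fails as shown in the transcript.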

>> I cannot agree with you. When I've learned libpq before I found
>> host/hostaddr rules description useful. And I disagree that it is good
>> to remove it (as the patch does).
>> Of course it is only my point of view and others may have another
>> opinion.
>
> I'm not sure I understand your concern.
>
> Do you mean that you would prefer the document to keep describing that
> host/hostaddr/port accepts one value, and then have in some other place
> or at the end of the option documentation a line that say, "by the way,
> we really accept lists, and they must be somehow consistent between
> host/hostaddr/port"?

I wrote about the following part of the documentation:

> -        Using <literal>hostaddr</literal> instead of <literal>host</literal> allows the
> -        application to avoid a host name look-up, which might be important
> -        in applications with time constraints. However, a host name is
> -        required for GSSAPI or SSPI authentication
> -        methods, as well as for <literal>verify-full</literal> SSL
> -        certificate verification.  The following rules are used:
> -        <itemizedlist>
> ...

So I think the description of these rules is useful here and shouldn't be
removed. Your patch removes it, and maybe it shouldn't. But I now
realise that the patch changes this behavior, so backward compatibility
is broken.

>> Patch gives me an error if I specified only hostaddr:
>>
>> psql -d "hostaddr=127.0.0.1"
>> psql: host "/tmp" cannot have an hostaddr "127.0.0.1"
>
> This is the expected modified behavior: hostaddr can only be specified
> on a host when it is a name, which is not the case here.

See the comment above about backward compatibility. psql without the
patch can connect to an instance if I specify only hostaddr.

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: libpq host/hostaddr/conninfo inconsistencies

Fabien COELHO-3

Hello Arthur,

>>>> sh> psql "host=127.0.0.2 hostaddr=127.0.0.1"
>>>
>>> I'm not sure that is is the issue. User defined the host name and psql
>>> show it.

>> The issue is that "host" is an ip, "\conninfo" will inform wrongly that you
>> are connected to "127.0.0.2", but the actual connection is really to
>> "127.0.0.1", this is plain misleading, and I consider this level of
>> unhelpfullness more a bug than a feature.
>
> I didn't think that this is an issue, because I determined "host" as just a
> host's display name when "hostaddr" is defined.

When I type "\conninfo", I do not expect to have false clues that must be
interpreted depending on a fine knowledge of the documentation and the
connection parameters possibly typed hours earlier, I would just expect to
have a direct answer describing in a self contained way what the
connection actually is.

> So user may determine 127.0.0.1 (hostaddr) as "happy_host", for example.
> It shouldn't be a real host.

They may determine it if they can access the initial connection
information, which means a careful investigation because \conninfo does not
say what it is... If they just read what is said, they just get wrong
information.

> I searched for another use cases of PQhost(). In PostgreSQL source code I
> found that it is used in pg_dump and psql to connect to some instance.

> There is the next issue with PQhost() and psql (pg_dump could have it too,
> see CloneArchive() in pg_backup_archiver.c and _connectDB() in
> pg_backup_db.c):
>
> $ psql "host=host_1,host_2 hostaddr=127.0.0.1,127.0.0.3 dbname=postgres"
> =# \conninfo
> You are connected to database "postgres" as user "artur" on host "host_1" at
> port "5432".
> =# \connect test
> could not translate host name "host_1" to address: Name or service not known
> Previous connection kept
>
> So in the example above you cannot reuse connection string with \connect.
> What do you think?

I think that this is another connection related "feature", aka bug, that
should be fixed as well:-(

>>> I cannot agree with you. When I've learned libpq before I found
>>> host/hostaddr rules description useful. And I disagree that it is good
>>> to remove it (as the patch does).


>>> Of course it is only my point of view and others may have another opinion.
>>
>> I'm not sure I understand your concern.
>>
>> Do you mean that you would prefer the document to keep describing that
>> host/hostaddr/port accepts one value, and then have in some other place or
>> at the end of the option documentation a line that say, "by the way, we
>> really accept lists, and they must be somehow consistent between
>> host/hostaddr/port"?
>
> I wrote about the following part of the documentation:
>
>> -      Using <literal>hostaddr</literal> instead of <literal>host</literal> allows the
>> -      application to avoid a host name look-up, which might be important
>> -      in applications with time constraints. However, a host name is
>> -      required for GSSAPI or SSPI authentication
>> -      methods, as well as for <literal>verify-full</literal> SSL
>> -      certificate verification.  The following rules are used:
>> -      <itemizedlist>
>> ...

> So I think description of these rules is useful here and shouldn't be
> removed.

Ok, I have put back a summary description of which rules apply, which are
somehow simpler & saner, at least this is the aim of this patch.

> Your patch removes it and maybe it shouldn't do that. But now I
> realised that the patch breaks this behavior and backward compatibility
> is broken.

Indeed. The incompatible changes are that "host" must always be provided,
instead of letting the user provide an IP either in host or hostaddr
(currently both work although undocumented), and that "hostaddr" can only
be provided for a host name, not for an IP or socket.

>>> Patch gives me an error if I specified only hostaddr:
>>>
>>> psql -d "hostaddr=127.0.0.1"
>>> psql: host "/tmp" cannot have an hostaddr "127.0.0.1"
>>
>> This is the expected modified behavior: hostaddr can only be specified on a
>> host when it is a name, which is not the case here.
>
> See the comment above about backward compatibility. psql without the patch
> can connect to an instance if I specify only hostaddr.

Yes, that is intentional and is the purpose of this patch: to provide a
simple connection model for the user: use "host" to connect to a target
server, and "hostaddr" as a lookup shortcut only.

For a reminder, my main issues with the current status are:

(1) the documentation is inconsistent with the implementation:
     "host" can be given an IP, but this is not documented.
     "hostaddr" can be provided for anything, and overshadows the initial
     specification, but:

(2) "\conninfo" does not give a clue about what the connection
     really is in such cases.

Moreover, you found another issue with psql's "\connect" which does not
work properly when both "host" & "hostaddr" are given.

In the attached patch, I tried to clarify the documentation further and
fix some rebase issues I had. ISTM that all relevant information provided
in the previous version is still there.

The backward incompatibility is clearly documented.

The patch does not address the \conninfo issue, which requires extending
libpq. I think that the \connect issue you raised is linked to the same
set of problems within libpq, which does not provide any reliable way to
know about the current connection in some cases, either for describing it
or reusing it.

--
Fabien.

Attachment: libpq-host-ip-3.patch (27K)

Re: libpq host/hostaddr/conninfo inconsistencies

Robert Haas
In reply to this post by Fabien COELHO-3
On Mon, Aug 20, 2018 at 7:32 AM Fabien COELHO <[hidden email]> wrote:
>    sh> psql "host=localhost,127.0.0.2,, hostaddr=127.0.0.1,,127.0.0.3,"
>    # attempt 1 is 127.0.0.1 identified as localhost
>    # attempt 2 is 127.0.0.2
>    # attempt 3 is 127.0.0.3 identified as the default, whatever it is
>    # attempt 4 is really the default

I think this patch is a solution in search of a problem.  It's true
that the above example is very confusing, but there's no reason for
anybody to ever do that.  It's like saying that C is a bad
programming language because people can do this:

https://www.ioccc.org/2018/anderson/prog.c

Well, no.  The fact that a programming language -- or a connection
string -- can be used to create incomprehensible constructs is an
artifact of it being powerful and flexible, not a defect.

What users should do is just use host.  If that causes name lookups
they want to avoid, they should instead use both host and hostaddr.
If they do that, they'll be fine.  If they do strange things like
specify host and hostaddr strings that don't match, then yes, it won't
work very well.  But the documentation already says that, so I don't
really see why we need to change anything here.
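
As a concrete reading of that advice, here is a minimal Python sketch that
builds a connection string with host alone, or with host plus a matching
hostaddr. The helper name build_conninfo, the host name db.example.com and the
address 192.0.2.10 are placeholders invented for the example, and values are
assumed to contain no spaces or quotes, so no quoting is done.

```python
def build_conninfo(host: str, hostaddr: str = "", **others: str) -> str:
    """Return a 'keyword=value' connection string.

    hostaddr is added only when given, as a way to skip the name lookup
    for host. It is expected to be the address that host resolves to.
    """
    params = {"host": host}
    if hostaddr:
        params["hostaddr"] = hostaddr
    params.update(others)
    return " ".join(f"{key}={value}" for key, value in params.items())


# Usual case: just host.
print(build_conninfo("db.example.com", dbname="postgres"))
# Avoiding the lookup: host and the address it is known to resolve to.
print(build_conninfo("db.example.com", "192.0.2.10", dbname="postgres"))
```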

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: libpq host/hostaddr/conninfo inconsistencies

Fabien COELHO-3

Hello Robert,

> I think this patch is a solution in search of a problem.

I take note of this negative opinion.

> [...] It's true that the above example is very confusing, but there's no
> reason for everybody to ever do that.

If you do it, even by accident, there is no way to guess what is wrong
because the reported information is inconsistent and does not reflect
the actual status.

> Well, no.  The fact that a programming language -- or a connection
> string -- can be used to create incomprehensible constructs is an
> artifact of it being powerful and flexible, not a defect.

I see at least three actual defects:

  - \conninfo output does NOT reflect the actual status of a connection
    in some cases. I do not see how this can be defended as a powerful
    feature.

  - \connect does NOT work in some trivial cases.

These two issues are linked to the fact that libpq provides no way to
know what the actual connection is, so it cannot be described correctly
nor reused to create another connection.

  - the documentation does not say that "host" accepts IPs,
    and implicitly says that hostaddr should be used for IPs.

Once it is clear that "host" accepts IPs, then the host/hostaddr duality
becomes much less clear, which is the conceptual issue I'm trying to
solve by improving the documentation.

> What users should do is just use host.  If that causes name lookups
> they want to avoid, they should instead use both host and hostaddr.

THANKS!

This is exactly the simple approach that I'm trying to promote:-) However,
this is NOT what is actually said in the documentation.

The documentation says that host should be used for host names or sockets,
hostaddr for IP addresses, and then there is a special case when both are
provided. The implementation does not really do that, as noted above.

> If they do that, they'll be fine.

Sure.

> If they do strange things like specify host and hostaddr strings that
> don't match, then yes, it won't work very well.

Indeed. I think that we should be a bit more user friendly by catching
obvious misconfigurations.

> But the documentation already says that, so I don't really see why we
> need to change anything here.

It seems that the documentation does not say what you think it says.

--
Fabien.


Re: libpq host/hostaddr/conninfo inconsistencies

Robert Haas
On Thu, Oct 25, 2018 at 1:06 PM Fabien COELHO <[hidden email]> wrote:
> If you do it, even by accident, there is no way to guess what is wrong
> because the reported informations are inconsistent and does not reflect
> the actual status.

Meh.  The reported information is fine.  If you tell the system that
foo.com has an IP of 127.0.0.1 when it really doesn't, and then you
get confused because it reports a failure to connect to foo.com when
you really failed to connect to 127.0.0.1, that's a self-inflicted
injury.  It's not that I am opposed to helping people avoid
self-inflicted injuries, but this one doesn't seem either likely or
serious.

> I see at least three actual defects:
>
>   - \conninfo output does NOT reflect the actual status of a connection
>     some cases. I do not see how this can be defended as a powerful
>     feature.

Well, again, I think you're talking about the case where host and
hostaddr don't match.  But that's not an intended use case, so I'm not
sure it matters.  Perhaps extending the \conninfo output with the
actual IP to which somebody connected wouldn't be a bad idea, but in
at least 99% of cases, it's just going to be clutter.

>   - \connect does NOT work in some trivial cases.
>
> These two above issues are linked to the fact that libpq does not allow to
> know what the actual connection is, so it cannot be described correctly
> nor reused to create another connection.

Yeah, that's not great.

>   - the documentation does not say that "host" accepts IPs,
>     and implicitely says that hostaddr should be used for IPs.
>
> Once it is clear that "host" accepts IPs, then the host/hostaddr duality
> becomes much less clear, which is the conceptual issue I'm trying to
> solve by improving the documentation.

All I can really say here is that I don't find the current
documentation very confusing, but I agree with you that some people
have been confused by it. I'm not direly opposed to making it more
clear, but I'm not sure that necessitates all of the behavior changes
you are proposing.

I mean, the ssh syntax synopsis says:

     ssh [-1246AaCfGgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
         [-D [bind_address:]port] [-E log_file] [-e escape_char]
         [-F configfile] [-I pkcs11] [-i identity_file]
         [-J [user@]host[:port]] [-L address] [-l login_name] [-m mac_spec]
         [-O ctl_cmd] [-o option] [-p port] [-Q query_option] [-R address]
         [-S ctl_path] [-W host:port] [-w local_tun[:remote_tun]]
         [user@]hostname [command]

Well, are you confused?  That host name could really be an IP address.
But I don't think that's really confusing, because I think it's pretty
widely understood that a hostname is just a proxy for an IP address,
and therefore it's expected that any place where a hostname is
requested, you could instead supply the IP address directly.
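
A small sketch of that point, in Python rather than libpq's C: the standard
resolver interface accepts a numeric address wherever it accepts a name, and
the AI_NUMERICHOST flag restricts it to literal addresses with no name
lookup. The helper name is_numeric_address is made up for this illustration
and says nothing about how libpq itself is implemented.

```python
import socket


def is_numeric_address(value: str) -> bool:
    """True if value is a literal IP address, checked without any name lookup."""
    try:
        # AI_NUMERICHOST forbids name resolution: only literal addresses succeed.
        socket.getaddrinfo(value, None, flags=socket.AI_NUMERICHOST)
        return True
    except socket.gaierror:
        return False


for value in ("127.0.0.1", "foo.com"):
    kind = "numeric address" if is_numeric_address(value) else "name needing a lookup"
    print(f"{value}: {kind}")
```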

What is, arguably, a little confusing in the case of ssh is that
'hostname' could ALSO, instead of being a name that we can find in DNS
or an IP address, correspond to a Host entry in our ~/.ssh/config
file, which could remap the hostname we gave to some other hostname
for DNS lookup purposes, or to an IP address.  But we don't have that
problem, because we picked a different keyword for that kind of
functionality -- service=whatever vs. host=whatever.

> The documentation says that host should be used for host names or sockets,
> hostaddr for IP addresses, and then there is a special case when both are
> provided. The implementation does not really do that, as noted above.

You're not the first person to think that -- I believe the pgAdmin 3
developers were confused about the same point -- so it's probably not
as clear as it could be.  But I actually do not see that in the
documentation anywhere.  It says that the value of hostaddr must be an
IP address, but I do not see that it says that if what you have is an
IP address, you should stuff that in hostaddr rather than host.  Maybe
we should explicitly say the opposite e.g.

host Name or IP address of host to connect to.

hostaddr Numeric IP address of host to connect to.  Normally not
needed, because PostgreSQL will perform a lookup on the value
specified for host if necessary.  If specified, this should be...

> > But the documentation already says that, so I don't really see why we
> > need to change anything here.
>
> It seems that the documentation does not say what you think it says.

Or maybe it doesn't say what YOU think it says.  :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: libpq host/hostaddr/conninfo inconsistencies

Fabien COELHO-3

Hello Robert,

> [...] that's a self-inflicted injury.

Sure. I'm trying to be more user friendly.

> It's not that I am opposed to helping people avoid self-inflicted
> injuries, but this one doesn't seem either likely or serious.

If I'm trying to improve something, I tend to be thorough about it.

>> I see at least three actual defects:
>>
>>   - \conninfo output does NOT reflect the actual status of a connection
>>     in some cases. I do not see how this can be defended as a powerful
>>     feature.
>
> Well, again, I think you're talking about the case where host and
> hostaddr don't match.  But that's not an intended use case,

I disagree: it is an intended use case because it is documented that you
can use both host & hostaddr. This feature has been added without telling
conninfo about it, hence the confusion when it is used.

> so I'm not sure it matters.  Perhaps extending the \conninfo output with
> the actual IP to which somebody connected wouldn't be a bad idea, but in
> at least 99% of cases, it's just going to be clutter.

It helps when both host & hostaddr are used, or if a host name resolves to
several IPs.

About clutter: if someone asks for \conninfo it is because they need it,
so they can probably deal with precise information, instead of an output
that may or may not be what the connection really is.

Moreover, ISTM more likely that I would want to look at \conninfo if the
connection parameters were complex, to know how it resolved, probably
while debugging something, and then I would really want it to reflect the
actual status.
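
To make the mismatch concrete, here is a minimal Python sketch of the rule
described in this thread: hostaddr, when given, decides where the connection
really goes, while a report that looks only at host can name something else.
The function names and the default socket directory are assumptions made for
the illustration (the real default is build-dependent, and a Unix platform is
assumed); this is not libpq or psql code.

```python
from typing import Optional, Tuple

# Assumption: default socket directory; the real default depends on the build.
DEFAULT_SOCKET_DIR = "/var/run/postgresql"


def actual_target(host: Optional[str], hostaddr: Optional[str]) -> Tuple[str, str]:
    """Where the connection really goes: hostaddr, when given, wins over host."""
    if hostaddr:
        return ("tcp", hostaddr)
    if not host:
        return ("unix", DEFAULT_SOCKET_DIR)
    if host.startswith("/"):
        return ("unix", host)
    return ("tcp", host)


def host_only_report(host: Optional[str]) -> Tuple[str, str]:
    """What a report that ignores hostaddr would claim."""
    return actual_target(host, None)


cases = [("/tmp", "127.0.0.1"), ("127.0.0.2", "127.0.0.1"), (None, "127.0.0.1")]
for host, hostaddr in cases:
    print(f"host={host} hostaddr={hostaddr}: "
          f"reported {host_only_report(host)}, actual {actual_target(host, hostaddr)}")
```

In all three cases the two functions disagree, which is the situation where a
precise \conninfo output matters.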

>>   - \connect does NOT work in some trivial cases.
>>
>> These two issues are linked to the fact that libpq provides no way to
>> know what the actual connection is, so it cannot be described correctly
>> nor reused to create another connection.
>
> Yeah, that's not great.

Indeed, I think it is a bug. Note that the patch does not address this
issue, I'm keeping it for later. It should require extending libpq, which
requires some more thinking.

> [...] ssh ... [user@]hostname [command]
>
> Well, are you confused?  That host name could really be an IP address.

Sure, but ssh does not give an alternate syntax to provide a target IP
address, whereas libpq (apparently) provides one syntax for hostnames and
one for IPs.

> What is, arguably, a little confusing in the case of ssh is that
> 'hostname' could ALSO, instead of being a name that we can find in DNS
> or an IP address, correspond to a Host entry in our ~/.ssh/config
> file, which could remap the hostname we gave to some other hostname
> for DNS lookup purposes, or to an IP address.

Sure. Now when you run "ssh -v", the output tells you that it used the
config to redefine the connection; it does not say that it is directly
connected to the target, contrary to \conninfo, which provides plainly
false information.

>> The documentation says that host should be used for host names or sockets,
>> hostaddr for IP addresses, and then there is a special case when both are
>> provided. The implementation does not really do that, as noted above.
>
> You're not the first person to think that -- I believe the pgAdmin 3
> developers were confused about the same point -- so it's probably not
> as clear as it could be.

Yep. That is my point:-)


> [...] Maybe we should explicitly say the opposite e.g. host Name or IP
> address of host to connect to. hostaddr Numeric IP address of host to
> connect to.  Normally not needed, because PostgreSQL will perform a
> lookup on the value specified for host if necessary.  If specified, this
> should be...

Well, that is one of my points: trying to improve the documentation to make
it less confusing...

>> It seems that the documentation does not say what you think it says.
>
> Or maybe it doesn't say what YOU think it says.  :-)

Hmmm. I re-read the current host/hostaddr doc before replying to your
email. I find it confusing because of what it says, what it does not say,
and what it somehow suggests. Moreover, people regularly get confused, as
you pointed out.

Probably I'm below par at understanding English technical documentation,
but I'm afraid I'm not the only average Joe around.

To sum up:

(1) you are somehow against changing the current implementation, eg
erroring out on possibly misleading configurations, because you do not
think it is really useful to help users in those cases.

(2) you are not against improving the documentation, although you find it
clear enough already, but you agree that some people could get confused.

The attached patch v4 only improves the documentation so that it reflects
what the implementation really does. I think it is too bad to leave out the
user-friendly aspects of the patch, though.

--
Fabien.

Attachment: libpq-host-ip-4.patch (12K)

Re: libpq host/hostaddr/conninfo inconsistencies

Dmitry Dolgov
> On Fri, Oct 26, 2018 at 9:22 AM Fabien COELHO <[hidden email]> wrote:
>
> To sum up:
>
> (1) you are somehow against changing the current implementation, eg
> erroring out on possibly misleading configurations, because you do not
> think it is really useful to help users in those cases.
>
> (2) you are not against improving the documentation, although you find it
> clear enough already, but you agree that some people could get confused.
>
> The attached patch v4 only improves the documentation so that it reflects
> what the implementation really does.

Thanks, it definitely makes sense to propose a documentation patch if there are
any concerns about how clear it is. For now I'm moving the patch to the next CF.

> I think it is too bad to leave out the user-friendly aspects of the patch,
> though.

Why then not split the original proposal into two patches, one to improve the
documentation, and another to make it more user friendly?


Re: libpq host/hostaddr/conninfo inconsistencies

Michael Paquier-2
On Fri, Nov 30, 2018 at 01:08:51PM +0100, Dmitry Dolgov wrote:
> Why then not split the original proposal into two patches, one to improve the
> documentation, and another to make it more user friendly?

Moved to next CF for now.  From what I can see the latest patch
manipulates the same areas of the documentation, so keeping things
grouped would reduce the global amount of diffs.
--
Michael


Re: libpq host/hostaddr/conninfo inconsistencies

Andres Freund
In reply to this post by Fabien COELHO-3
Hi,

On 2018-10-26 09:21:51 +0200, Fabien COELHO wrote:
> (1) you are somehow against changing the current implementation, eg erroring
> out on possibly misleading configurations, because you do not think it is
> really useful to help users in those cases.

I find this formulation somewhat passive aggressive.


> (2) you are not against improving the documentation, although you find it
> clear enough already, but you agree that some people could get confused.
>
> The attached patch v4 only improves the documentation so that it reflects
> what the implementation really does. I think it is too bad to leave out the
> user-friendly aspects of the patch, though.

Robert, any chance you could opine on the doc patch, given that's your
suggested direction?

- Andres


Re: libpq host/hostaddr/conninfo inconsistencies

Fabien COELHO-3

> On 2018-10-26 09:21:51 +0200, Fabien COELHO wrote:
>> (1) you are somehow against changing the current implementation, eg erroring
>> out on possibly misleading configurations, because you do not think it is
>> really useful to help users in those cases.
>
> I find this formulation somewhat passive aggressive.

I do not understand what you mean by that expression.

I was just trying to sum up Robert's opposition to erroring on misleading
configurations (eg "host=1.2.3.4 hostaddr=4.3.2.1") instead of complying
with them regardless, as is currently done. Probably my phrasing could be
improved, but I do not think that I misrepresented Robert's position.

Note that the issue is somewhat mitigated by 6e5f8d489a: \conninfo now
displays more precise information, so that at least you are not told
that you are connected to a socket when you are really connected to an IP,
or to one IP when you are really connected to another.
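
For illustration only, the kind of sanity check under discussion could look
like the Python sketch below. The function name check_host_hostaddr and the
exact set of rejected combinations are assumptions made for this example;
they do not describe libpq's current behavior or the contents of any patch.

```python
import ipaddress
from typing import Optional


def _as_ip(value: str):
    """Return an ipaddress object for a literal IP, or None for anything else."""
    try:
        return ipaddress.ip_address(value)
    except ValueError:
        return None


def check_host_hostaddr(host: Optional[str], hostaddr: Optional[str]) -> None:
    """Raise ValueError for combinations this sketch treats as misleading."""
    if not host or not hostaddr:
        return  # only one of the two is set: nothing to cross-check
    addr = _as_ip(hostaddr)
    if addr is None:
        raise ValueError(f"hostaddr must be a numeric IP address: {hostaddr!r}")
    if host.startswith("/"):
        raise ValueError("host is a socket directory but hostaddr requests TCP/IP")
    host_ip = _as_ip(host)
    if host_ip is not None and host_ip != addr:
        raise ValueError(f"host {host} and hostaddr {hostaddr} are different addresses")


for host, hostaddr in [("foo.com", "127.0.0.1"), ("1.2.3.4", "4.3.2.1")]:
    try:
        check_host_hostaddr(host, hostaddr)
        print(f"host={host} hostaddr={hostaddr}: accepted")
    except ValueError as err:
        print(f"host={host} hostaddr={hostaddr}: rejected ({err})")
```

A host name combined with a hostaddr is accepted, since that is the
documented way to skip the name lookup; only a host that is itself a
different address, or a socket directory, is rejected here.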

--
Fabien.


Re: libpq host/hostaddr/conninfo inconsistencies

Kyotaro HORIGUCHI-2
Hello.

At Thu, 14 Feb 2019 22:51:40 +0100 (CET), Fabien COELHO <[hidden email]> wrote in <alpine.DEB.2.21.1902142224380.20189@lancre>

>
> > On 2018-10-26 09:21:51 +0200, Fabien COELHO wrote:
> >> (1) you are somehow against changing the current implementation, eg
> >> erroring
> >> out on possibly misleading configurations, because you do not think it
> >> is
> >> really useful to help users in those cases.
> >
> > I find this formulation somewhat passive aggressive.
>
> I do not understand what you mean by that expression.
>
> I was just trying to sum up Robert's opposition to erroring on
> misleading configurations (eg "host=1.2.3.4 hostaddr=4.3.2.1") instead
> of complying with them regardless, as is currently done. Probably my
> phrasing could be improved, but I do not think that I misrepresented
> Robert's position.
>
> Note that the issue is somewhat mitigated by 6e5f8d489a: \conninfo now
> displays more precise information, so that at least you are not told
> that you are connected to a socket when you are really connected to an
> IP, or to one IP when you are really connected to another.

I'm rather on (maybe) Robert's side: I'm not opposed to editing it,
but the documentation should stay plain as long as it is not
misleading for average readers. From the same viewpoint,
documentation is written general-and-important-first, followed by
special cases and trivia.

From that standpoint, the first hunk in the patch caught my
eye.

       <term><literal>host</literal></term>
       <listitem>
        <para>
-        Name of host to connect to.<indexterm><primary>host name</primary></indexterm>
-        If a host name begins with a slash, it specifies Unix-domain
-        communication rather than TCP/IP communication; the value is the
-        name of the directory in which the socket file is stored.
+        Comma-separated list of hosts to connect to.<indexterm><primary>host name</primary></indexterm>
+        Each specified host will be tried in turn in the order given.
+        See <xref linkend="libpq-multiple-hosts"/> for details.
+        Each item may be a host name that will be resolved with a look-up,
+        a numeric IP address (IPv4 in the standard format, e.g.,
+        <literal>172.28.40.9</literal>, or IPv6 if supported by your machine)
+        that will be used directly, or
+        the name of a directory which contains the socket file for Unix-domain
+        communication rather than TCP/IP communication
+        (the specification must then begin with a slash);
+       </para>

I don't think this is user-friendly, since almost no users
write multiple hosts there. So I prefer the previous
organization. The description of IP addresses looks too verbose;
in particular, we don't need to explain what an IP address is here.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



Re: libpq host/hostaddr/conninfo inconsistencies

Fabien COELHO-3

Hello Kyotaro-san,

> From that standpoint, the first hunk in the patch caught my
> eye.
>
>       <term><literal>host</literal></term>
>       <listitem>
>        <para>
> -        Name of host to connect to.<indexterm><primary>host name</primary></indexterm>
> -        If a host name begins with a slash, it specifies Unix-domain
> -        communication rather than TCP/IP communication; the value is the
> -        name of the directory in which the socket file is stored.
> +       </para>
>
> I don't think this is user-friendly, since almost no users write
> multiple hosts there. So I prefer the previous organization.

ISTM that specifying the expected syntax is the first information needed?

The previous organization says "this is a host name (bla bla bla) btw I
lied at the beginning this is a list".

> The description of IP addresses looks too verbose; in particular, we
> don't need to explain what an IP address is here.

Ok.

I agree that the order is not the best possible one. Here is a simplified
and reordered version:

""" Comma-separated list of hosts to connect to. Each item may be a host
name that will be resolved with a look-up, a numeric IP address that will
be used directly, or the name of a directory which contains the socket
file for Unix-domain communication, if the specification begins with a
slash. Each specified target will be tried in turn in the order given. See
<xref linkend="libpq-multiple-hosts"/> for details. """

What do you think about that version?
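
Read as a rule, the wording above amounts to something like the following
Python sketch. The names classify and parse_host_list are made up for the
illustration, and details such as empty items or surrounding whitespace are
deliberately not handled; this is not how libpq parses the option.

```python
import ipaddress
from typing import List, Tuple


def classify(item: str) -> str:
    """Say how one item of the host list is interpreted under the proposed wording."""
    if item.startswith("/"):
        return "socket directory for Unix-domain communication"
    try:
        ipaddress.ip_address(item)
        return "numeric IP address, used directly"
    except ValueError:
        return "host name, resolved with a look-up"


def parse_host_list(value: str) -> List[Tuple[str, str]]:
    """Split the comma-separated list, keeping the order in which targets are tried."""
    return [(item, classify(item)) for item in value.split(",")]


for item, kind in parse_host_list("db1.example.com,172.28.40.9,/tmp"):
    print(f"{item}: {kind}")
```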

--
Fabien.
