BUG #16560: Strange behavior with polygon and NaN

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

BUG #16560: Strange behavior with polygon and NaN

apt.postgresql.org Repository Update
The following bug has been logged on the website:

Bug reference:      16560
Logged by:          Jesse Zhang
Email address:      [hidden email]
PostgreSQL version: 13beta2
Operating system:   Linux
Description:        

Hi bugs,

While working with Chris Hajas on merging Postgres 12 with Greenplum
Database we stumbled upon the following strange behavior in the geometry
type polygon:

------ >8 --------

CREATE TEMP TABLE foo (p point);
CREATE INDEX ON foo USING gist(p);

INSERT INTO foo VALUES ('0,0'), ('1,1'), ('NaN,NaN');

SELECT $q$
SELECT * FROM foo WHERE p <@ polygon '(0,0), (0, 100), (100, 100), (100,
0)'
$q$ AS qry \gset

BEGIN;
SAVEPOINT yolo;
SET LOCAL enable_seqscan TO off;
:qry;

ROLLBACK TO SAVEPOINT yolo;
SET LOCAL enable_indexscan TO off;
SET LOCAL enable_bitmapscan TO off;
:qry;

------ 8< --------

If you run the above repro SQL in HEAD (and 12, and likely all older
versions), you get the following output:

CREATE TABLE
CREATE INDEX
INSERT 0 3
BEGIN
SAVEPOINT
SET
   p
-------
 (0,0)
 (1,1)
(2 rows)

ROLLBACK
SET
SET
     p
-----------
 (0,0)
 (1,1)
 (NaN,NaN)
(3 rows)


At first glance, you'd think this is the gist AM's bad, but on a second
thought, something else is strange here. The following query returns
true:

SELECT point '(NaN, NaN)' <@ polygon '(0,0), (0, 100), (100, 100), (100,
0)'

The above behavior of the "contained in" operator is surprising, and
it's probably not what the GiST AM is expecting. I took a look at
point_inside() in geo_ops.c, and it doesn't seem well equipped to handle
NaN. Similary ill-equipped is dist_ppoly_internal() which underlies the
distnace operator for polygon. It gives the following interesting
output:

SELECT *, c <-> polygon '(0,0),(0,100),(100,100),(100,0)' as distance
FROM (
  SELECT circle(point(100 * i, 'NaN'), 50) AS c
  FROM generate_series(-2, 4) i
) t(c)
ORDER BY 2;

        c        | distance
-----------------+----------
 <(-200,NaN),50> |        0
 <(-100,NaN),50> |        0
 <(0,NaN),50>    |        0
 <(100,NaN),50>  |        0
 <(200,NaN),50>  |      NaN
 <(300,NaN),50>  |      NaN
 <(400,NaN),50>  |      NaN
(7 rows)

Should they all be NaN? Am I alone in thinking the index is right but
the operators are wrong? Or should we call the indexes wrong here?

Cheers,
Jesse and Chris

Reply | Threaded
Open this post in threaded view
|

Re: BUG #16560: Strange behavior with polygon and NaN

Tom Lane-2
PG Bug reporting form <[hidden email]> writes:
> While working with Chris Hajas on merging Postgres 12 with Greenplum
> Database we stumbled upon the following strange behavior in the geometry
> type polygon:

Yeah, I'd imagine that none of the geometric operators/functions take
any thought for NaNs; if they behave sanely it's purely accidental.
Not really sure what we'd do about that --- operators returning
boolean, for instance, don't have a great way to handle it.  (I doubt
returning NULL would be nice.)  Going forward it might be best
to disallow NaNs in geometric values, perhaps?

                        regards, tom lane