websearch_to_tsquery() and handling of ampersand characters inside double quotes

Alastair McKinley
Hi all,

I have recently discovered an unexpected difference in behaviour using websearch_to_tsquery() with quoted strings containing ampersands.

Without surrounding double quotes, these two queries are equivalent:

select websearch_to_tsquery('something and another') = websearch_to_tsquery('something & another');
 ?column?
----------
 t
(1 row)

select websearch_to_tsquery('something and another');
 websearch_to_tsquery
----------------------
 'someth' & 'anoth'
(1 row)


With surrounding double quotes, they produce subtly different queries whose phrase operators encode different distances between the lexemes:


select websearch_to_tsquery('"something and another"') = websearch_to_tsquery('"something & another"');
 ?column?
----------
 f
(1 row)


select websearch_to_tsquery('"something and another"');
 websearch_to_tsquery
----------------------
 'someth' <2> 'anoth'
(1 row)

select websearch_to_tsquery('"something & another"');
 websearch_to_tsquery
----------------------
 'someth' <-> 'anoth'
(1 row)

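I imagine the difference arises because to_tsvector() records different positions for the two underlying strings: the word "and" appears to be dropped as a stop word but still counts as a position, whereas "&" does not count as a position at all.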

select to_tsvector('something & another');
     to_tsvector      
----------------------
 'anoth':2 'someth':1
(1 row)

select to_tsvector('something and another');
     to_tsvector      
----------------------
 'anoth':3 'someth':1
(1 row)
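
To check how the parser treats each form, something like the following could be run. This is only a sketch: it assumes the 'english' configuration (the stemmed output above suggests an English configuration is the default here), and I have left the output out because it may vary with version and configuration.

```sql
-- Show how each token is classified and which lexemes (if any) it yields.
SELECT alias, token, lexemes FROM ts_debug('english', 'something and another');
SELECT alias, token, lexemes FROM ts_debug('english', 'something & another');

-- Cross-check: does each quoted query match the other form of the text?
SELECT to_tsvector('english', 'something & another')
         @@ websearch_to_tsquery('english', '"something and another"') AS and_query_vs_amp_text,
       to_tsvector('english', 'something and another')
         @@ websearch_to_tsquery('english', '"something & another"')   AS amp_query_vs_and_text;
```

Given the positions shown above, I would expect neither cross-check to match, since <2> requires the lexemes to be two positions apart and <-> requires them to be adjacent.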


This leads to quite different search results. My current workaround is to suggest that users run both searches and combine them with an OR. Is this the right solution?
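
For reference, the workaround could look roughly like this. It is a minimal sketch: the table documents and its column body are hypothetical names used only for illustration, and the 'english' configuration is an assumption.

```sql
-- Hypothetical table and column: documents(body text).
-- Variant 1: build both phrase queries and OR them with the tsquery || operator.
SELECT *
FROM documents
WHERE to_tsvector('english', body) @@ (
        websearch_to_tsquery('english', '"something and another"')
     || websearch_to_tsquery('english', '"something & another"')
      );

-- Variant 2: let websearch_to_tsquery() do the OR via its "or" keyword.
SELECT *
FROM documents
WHERE to_tsvector('english', body)
      @@ websearch_to_tsquery('english', '"something and another" or "something & another"');
```

Either variant should match text written with "and" as well as text written with "&", at the cost of having to generate the second form of the phrase on the application side.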

Best regards,

Alastair