CHAR LIKE VARCHAR

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

CHAR LIKE VARCHAR

Ilja Golshtein
Hi!

I came across following difference
between "LIKE" and "=" regarding
CHARs and VARCHARs

create table aa(f5 char(5), fv varchar(5));
insert into aa values('str1', 'str1');
select count(*) from aa where f5 = fv;
> 1
select count(*) from aa where f5 like fv;
> 0

I understand trailing spaces in CHAR are not
significant though I expect DBMS shows
consistent (and, ideally, clearly documented)
behavior.

From my point of view in example above
it would be nice to have the same result
for both queries regardless it is 0 or 1.

Of course, I may be wrong. Is there a clear
concept behind the difference between "LIKE"
and "="?

Thanks.

--
Best regards
Ilja Golshtein

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to [hidden email] so that your
      message can get through to the mailing list cleanly
Reply | Threaded
Open this post in threaded view
|

Re: CHAR LIKE VARCHAR

Richard Huxton
Ilja Golshtein wrote:

> Hi!
>
> I came across following difference
> between "LIKE" and "=" regarding
> CHARs and VARCHARs
>
> create table aa(f5 char(5), fv varchar(5));
> insert into aa values('str1', 'str1');
> select count(*) from aa where f5 = fv;
>
>>1
>
> select count(*) from aa where f5 like fv;
>
>>0
>
> I understand trailing spaces in CHAR are not
> significant though I expect DBMS shows
> consistent (and, ideally, clearly documented)
> behavior.

If you ask me (and it's too late to back out now :-) the whole behaviour
of CHAR(n) is wrong, broken and just a bad idea.

>>From my point of view in example above
> it would be nice to have the same result
> for both queries regardless it is 0 or 1.
>
> Of course, I may be wrong. Is there a clear
> concept behind the difference between "LIKE"
> and "="?

Well, you could argue that LIKE should ignore the trailing spaces (and
the only way to decide is to look at the SQL specs). The problem is,
that if f5=fv then presumably length(f5)=length(fv) and length(f5 ||
'+') = length(f5)+1

Of course, only the first of these is true because the whole idea of
char(n) is badly thought out. Either the value contains spaces or it
doesn't - unfortunately it's neither and both.
--
   Richard Huxton
   Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

               http://archives.postgresql.org
Reply | Threaded
Open this post in threaded view
|

Re: CHAR LIKE VARCHAR

Michael Glaesemann

On Jun 7, 2005, at 6:03 PM, Richard Huxton wrote:

> If you ask me (and it's too late to back out now :-) the whole  
> behaviour of CHAR(n) is wrong, broken and just a bad idea.

Just a quick addition: AFAIK, this bad, broken behavior (I definitely  
agree!) is per SQL spec.  :( Don't know about the LIKE operator,  
however.

Michael Glaesemann
grzm myrealbox com


---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend
Reply | Threaded
Open this post in threaded view
|

Re: CHAR LIKE VARCHAR

Tom Lane-2
Michael Glaesemann <[hidden email]> writes:
> On Jun 7, 2005, at 6:03 PM, Richard Huxton wrote:
>> If you ask me (and it's too late to back out now :-) the whole  
>> behaviour of CHAR(n) is wrong, broken and just a bad idea.

> Just a quick addition: AFAIK, this bad, broken behavior (I definitely  
> agree!) is per SQL spec.  :( Don't know about the LIKE operator,  
> however.

AFAICS the behavior of LIKE is per spec also.  I quote from SQL92:

            b) The <predicate>

                 M LIKE P

              is true if there exists a partitioning of M into substrings
              such that:

              i) A substring of M is a sequence of 0 or more contiguous
                 <character representation>s of M and each <character repre-
                 sentation> of M is part of exactly one substring.

             ii) If the i-th substring specifier of P is an arbitrary char-
                 acter specifier, the i-th substring of M is any single
                 <character representation>.

            iii) If the i-th substring specifier of P is an arbitrary string
                 specifier, then the i-th substring of M is any sequence of
                 0 or more <character representation>s.

             iv) If the i-th substring specifier of P is neither an arbi-
                 trary character specifier nor an arbitrary string speci-
                 fier, then the i-th substring of M is equal to that sub-
                 string specifier according to the collating sequence of
                 the <like predicate>, without the appending of <space>
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                 characters to M, and has the same length as that substring
                 ^^^^^^^^^^^^^^^
                 specifier.

              v) The number of substrings of M is equal to the number of
                 substring specifiers of P.

Under ordinary CHAR rules we could consider '1 ' as the substring
matching the final '1' of the pattern, so that 's' 't' 'r' '1 '
would match the pattern 's' 't' 'r' '1', but the underlined part of
rule 8.5.5.b.iv appears to specifically forbid that for LIKE.

Which is too bad, because it would be easy to "fix" this to behave
more sanely --- just remove the bpcharlike function and associated ~~
operator, so that the case would be handled by converting CHAR(n)
to text (and thereby stripping trailing spaces).

The bottom line is indeed that CHAR() is a nasty, useless, misdesigned
datatype.  Use text or varchar.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq