casting Bangla characters to NUMERIC

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

casting Bangla characters to NUMERIC

Bob Jolliffe
Hi

I have an interesting problem.  I have a string field in a table which
(sometimes) is expected to contain numeric values  ie. a cast of the
field to INTEGER is supposed to succeed.

My issue is that the application is running in Bangladesh, and
sometimes the users have entered values using Bangla characters.  eg.
"৮" rather than "8".  (Note that this shouldnt have been possible to
enter in the first place, but we can ignore that for now.  They are
there, I have to somehow deal with it.)

Is there a way with LOCALE settings to get this cast to work
correctly?  Currently I get:
postgres=# select cast('৮' as INTEGER);
ERROR:  invalid input syntax for integer: "৮"
LINE 1: select cast('৮' as INTEGER);

Regards
Bob


Reply | Threaded
Open this post in threaded view
|

Re: casting Bangla characters to NUMERIC

Tom Lane-2
Bob Jolliffe <[hidden email]> writes:
> I have an interesting problem.  I have a string field in a table which
> (sometimes) is expected to contain numeric values  ie. a cast of the
> field to INTEGER is supposed to succeed.

> My issue is that the application is running in Bangladesh, and
> sometimes the users have entered values using Bangla characters.  eg.
> "৮" rather than "8".  (Note that this shouldnt have been possible to
> enter in the first place, but we can ignore that for now.  They are
> there, I have to somehow deal with it.)

> Is there a way with LOCALE settings to get this cast to work
> correctly?

Doubt it :-(.  ISTM it's certainly outside the charter of int4in to
do that.  Ideally to_number() would handle it, but I don't think it
does at the moment.  Probably your best bet is to translate those
characters to regular ASCII digits using replace(), then cast.
It'd be a bit tedious, but fortunately there are only 10 cases
to consider, and you could wrap that up in a function.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: casting Bangla characters to NUMERIC

Bob Jolliffe
Thanks Tom.  That is what I expected to hear.  Was being hopeful ...

On Mon, 25 Nov 2019 at 16:27, Tom Lane <[hidden email]> wrote:

>
> Bob Jolliffe <[hidden email]> writes:
> > I have an interesting problem.  I have a string field in a table which
> > (sometimes) is expected to contain numeric values  ie. a cast of the
> > field to INTEGER is supposed to succeed.
>
> > My issue is that the application is running in Bangladesh, and
> > sometimes the users have entered values using Bangla characters.  eg.
> > "৮" rather than "8".  (Note that this shouldnt have been possible to
> > enter in the first place, but we can ignore that for now.  They are
> > there, I have to somehow deal with it.)
>
> > Is there a way with LOCALE settings to get this cast to work
> > correctly?
>
> Doubt it :-(.  ISTM it's certainly outside the charter of int4in to
> do that.  Ideally to_number() would handle it, but I don't think it
> does at the moment.  Probably your best bet is to translate those
> characters to regular ASCII digits using replace(), then cast.
> It'd be a bit tedious, but fortunately there are only 10 cases
> to consider, and you could wrap that up in a function.
>
>                         regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: casting Bangla characters to NUMERIC

Bob Jolliffe
 select translate(string,'০১২৩৪৫৬৭৮৯','0123456789');

seems to do the trick.

On Mon, 25 Nov 2019 at 16:38, Bob Jolliffe <[hidden email]> wrote:

>
> Thanks Tom.  That is what I expected to hear.  Was being hopeful ...
>
> On Mon, 25 Nov 2019 at 16:27, Tom Lane <[hidden email]> wrote:
> >
> > Bob Jolliffe <[hidden email]> writes:
> > > I have an interesting problem.  I have a string field in a table which
> > > (sometimes) is expected to contain numeric values  ie. a cast of the
> > > field to INTEGER is supposed to succeed.
> >
> > > My issue is that the application is running in Bangladesh, and
> > > sometimes the users have entered values using Bangla characters.  eg.
> > > "৮" rather than "8".  (Note that this shouldnt have been possible to
> > > enter in the first place, but we can ignore that for now.  They are
> > > there, I have to somehow deal with it.)
> >
> > > Is there a way with LOCALE settings to get this cast to work
> > > correctly?
> >
> > Doubt it :-(.  ISTM it's certainly outside the charter of int4in to
> > do that.  Ideally to_number() would handle it, but I don't think it
> > does at the moment.  Probably your best bet is to translate those
> > characters to regular ASCII digits using replace(), then cast.
> > It'd be a bit tedious, but fortunately there are only 10 cases
> > to consider, and you could wrap that up in a function.
> >
> >                         regards, tom lane