How to migrate column type from uuid to serial

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

How to migrate column type from uuid to serial

Hemil Ruparel
I was integrating a payment gateway for my app when I noticed its maximum length of customer id string is 32. SIze of UUID is 36 (32 characters and 4 dashes). So I want to change the type of customer id to serial. The problem is by now, the column is being used at many places. How to migrate the column to serial without dropping the data?
Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Adrian Klaver-4
On 10/7/20 5:48 AM, Hemil Ruparel wrote:
> I was integrating a payment gateway for my app when I noticed its
> maximum length of customer id string is 32. SIze of UUID is 36 (32
> characters and 4 dashes). So I want to change the type of customer id to
> serial. The problem is by now, the column is being used at many places.
> How to migrate the column to serial without dropping the data?

Changing the size of the column is not an option?

Your description of the customer id column above is somewhat confusing
to me. Is the id actually stored as a UUID?

Why is the max length an issue?

If you where to migrate I would say create an independent
serial/identity column. Then point the dependent objects at that. After
all have been converted drop the old column.





--
Adrian Klaver
[hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Adrian Klaver-4
On 10/7/20 6:58 AM, Hemil Ruparel wrote:

Please reply to list also.
Ccing list

> Yes. The id is stored as uuid. Thanks for the suggestion. Should work
>
>
> On Wed, Oct 7, 2020 at 7:27 PM Adrian Klaver <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     On 10/7/20 5:48 AM, Hemil Ruparel wrote:
>      > I was integrating a payment gateway for my app when I noticed its
>      > maximum length of customer id string is 32. SIze of UUID is 36 (32
>      > characters and 4 dashes). So I want to change the type of
>     customer id to
>      > serial. The problem is by now, the column is being used at many
>     places.
>      > How to migrate the column to serial without dropping the data?
>
>     Changing the size of the column is not an option?
>
>     Your description of the customer id column above is somewhat confusing
>     to me. Is the id actually stored as a UUID?
>
>     Why is the max length an issue?
>
>     If you where to migrate I would say create an independent
>     serial/identity column. Then point the dependent objects at that. After
>     all have been converted drop the old column.
>
>
>
>
>
>     --
>     Adrian Klaver
>     [hidden email] <mailto:[hidden email]>
>


--
Adrian Klaver
[hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Hemil Ruparel
Yes the id is stored as a uuid. Thanks for the suggestion. Should work


On Wed, Oct 7, 2020 at 7:29 PM Adrian Klaver <[hidden email]> wrote:
On 10/7/20 6:58 AM, Hemil Ruparel wrote:

Please reply to list also.
Ccing list

> Yes. The id is stored as uuid. Thanks for the suggestion. Should work
>
>
> On Wed, Oct 7, 2020 at 7:27 PM Adrian Klaver <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     On 10/7/20 5:48 AM, Hemil Ruparel wrote:
>      > I was integrating a payment gateway for my app when I noticed its
>      > maximum length of customer id string is 32. SIze of UUID is 36 (32
>      > characters and 4 dashes). So I want to change the type of
>     customer id to
>      > serial. The problem is by now, the column is being used at many
>     places.
>      > How to migrate the column to serial without dropping the data?
>
>     Changing the size of the column is not an option?
>
>     Your description of the customer id column above is somewhat confusing
>     to me. Is the id actually stored as a UUID?
>
>     Why is the max length an issue?
>
>     If you where to migrate I would say create an independent
>     serial/identity column. Then point the dependent objects at that. After
>     all have been converted drop the old column.
>
>
>
>
>
>     --
>     Adrian Klaver
>     [hidden email] <mailto:[hidden email]>
>


--
Adrian Klaver
[hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Thomas Kellerer-4
Hemil Ruparel schrieb am 07.10.2020 um 16:02:
> Yes the id is stored as a uuid.

Then it should be declared with the data type uuid, which only needs 16 bytes.





Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Hemil Ruparel
umm it is declared as uuid. But how does it occupy only 16 bytes? Even if we remove those 4 dashes thats 32 bytes of text right? I am not concerned about the size at all. How do i send it as a string below 32 bytes?

On Wed, Oct 7, 2020 at 7:37 PM Thomas Kellerer <[hidden email]> wrote:
Hemil Ruparel schrieb am 07.10.2020 um 16:02:
> Yes the id is stored as a uuid.

Then it should be declared with the data type uuid, which only needs 16 bytes.





Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Thomas Kellerer-4
Hemil Ruparel schrieb am 07.10.2020 um 16:21:
> it is declared as uuid. But how does it occupy only 16 bytes?

Because a UUID is internally simply a 128bit number - the dashes you see are just formatting.

But if you can only send the text represnation, then yes 32 characters aren't enough.


Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Hemil Ruparel
Sorry if this is silly but if it is a 128 bit number, why do we need 32 characters to represent it? Isn't 8 bits one byte?

On Wed, Oct 7, 2020 at 8:08 PM Thomas Kellerer <[hidden email]> wrote:
Hemil Ruparel schrieb am 07.10.2020 um 16:21:
> it is declared as uuid. But how does it occupy only 16 bytes?

Because a UUID is internally simply a 128bit number - the dashes you see are just formatting.

But if you can only send the text represnation, then yes 32 characters aren't enough.


Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Thomas Kellerer-4
>>> it is declared as uuid. But how does it occupy only 16 bytes?
>> Because a UUID is internally simply a 128bit number - the dashes you see are just formatting.

> Sorry if this is silly but if it is a 128 bit number, why do we need 32 characters to represent it?

The 36 (or 32 without the dashes) characters are just the default hex representation.

If you wanted to, you could convert it to a bigint (or a numeric).

See e.g. here for an example: https://stackoverflow.com/a/27286610



Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Hemil Ruparel
In reply to this post by Hemil Ruparel
Is it because they are hex characters and hence only need 4 bit to store per character but we display each of those 4 bits as a character as a hex value (0 to 9 and a-f) all of which in ASCII and UTF-8 require a byte to represent? Hence the length of 32 (or 36 with dashes)? 

On Wed, Oct 7, 2020 at 8:10 PM Hemil Ruparel <[hidden email]> wrote:
Sorry if this is silly but if it is a 128 bit number, why do we need 32 characters to represent it? Isn't 8 bits one byte?

On Wed, Oct 7, 2020 at 8:08 PM Thomas Kellerer <[hidden email]> wrote:
Hemil Ruparel schrieb am 07.10.2020 um 16:21:
> it is declared as uuid. But how does it occupy only 16 bytes?

Because a UUID is internally simply a 128bit number - the dashes you see are just formatting.

But if you can only send the text represnation, then yes 32 characters aren't enough.


Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Francisco Olarte
In reply to this post by Hemil Ruparel
Hemil:

On Wed, Oct 7, 2020 at 2:49 PM Hemil Ruparel <[hidden email]> wrote:
> I was integrating a payment gateway for my app when I noticed its maximum length of customer id string is 32. SIze of UUID is 36 (32 characters and 4 dashes). So I want to change the type of customer id to serial. The problem is by now, the column is being used at many places. How to migrate the column to serial without dropping the data?

An uuid is just a 128 bit number, as noted by many. Your problem is
the default encoding is 32 hex digits plus 4 hyphens. If your payment
gateway uses a 32 CHARs string  ( "maximum length of customer id
string is 32" ) you can just use other encodings. Passing to integer
and using decimal wont cut it ( you'll need 39 digits ), but using
just hex (without hyphens) will drop the string representation to 32
characters ( you can encode with a subst and postgres accepts it
without hyphens ).

If you want to buy a little more space for your own purposes you can
even fit 128 bits in 22 base64 chars with a couple pairs of bits to
spare, and IIRC you can do it with a creative decode/encode step after
killing the dashes.

And if your payment gateway uses unicode codepoints instead of ASCII
chars as units you could probably use more creative encodings ;-) ,
but probably using  a "drop the dashes" subst in the interface will be
your simpler option.

Francisco Olarte.


Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Hemil Ruparel
I was thinking UUID was not a very good choice for id. Serial would be a better one because I don't have a billion customers. It is more like a thousand. So when I saw the customer ID of the payment gateway cannot accept more than 32 characters, I thought UUID is overkill. So I want to migrate to using a serial int instead as the primary key.,

On Wed, Oct 7, 2020 at 10:48 PM Francisco Olarte <[hidden email]> wrote:
Hemil:

On Wed, Oct 7, 2020 at 2:49 PM Hemil Ruparel <[hidden email]> wrote:
> I was integrating a payment gateway for my app when I noticed its maximum length of customer id string is 32. SIze of UUID is 36 (32 characters and 4 dashes). So I want to change the type of customer id to serial. The problem is by now, the column is being used at many places. How to migrate the column to serial without dropping the data?

An uuid is just a 128 bit number, as noted by many. Your problem is
the default encoding is 32 hex digits plus 4 hyphens. If your payment
gateway uses a 32 CHARs string  ( "maximum length of customer id
string is 32" ) you can just use other encodings. Passing to integer
and using decimal wont cut it ( you'll need 39 digits ), but using
just hex (without hyphens) will drop the string representation to 32
characters ( you can encode with a subst and postgres accepts it
without hyphens ).

If you want to buy a little more space for your own purposes you can
even fit 128 bits in 22 base64 chars with a couple pairs of bits to
spare, and IIRC you can do it with a creative decode/encode step after
killing the dashes.

And if your payment gateway uses unicode codepoints instead of ASCII
chars as units you could probably use more creative encodings ;-) ,
but probably using  a "drop the dashes" subst in the interface will be
your simpler option.

Francisco Olarte.
Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Francisco Olarte
On Thu, Oct 8, 2020 at 6:14 AM Hemil Ruparel <[hidden email]> wrote:
> I was thinking UUID was not a very good choice for id. Serial would be a better one because I don't have a billion customers. It is more like a thousand. So when I saw the customer ID of the payment gateway cannot accept more than 32 characters, I thought UUID is overkill. So I want to migrate to using a serial int instead as the primary key.,

I do not like them. They are typically random 128 bit integers. If you
think of it, random 16 bits ints are not a good idea for keys, even if
the table has only a couple dozen keys. UUIDs are the same at a bigger
scale, they look nicer and safer, but have got problems. The version
1-2, the one with mac address with theoretically guaranteed uniqueness
were nice for their intended use, avoiding collisions without a
central authority. But the namespace hash and random version are
practically collision free, but not theoretically, which displeases
me. When I need to present an obscured name to an outside service (
i.e., in your case, presenting the user id to the gateway directly
leaks information, as they can estimate your consumer growth ) I just
encrypt ( not hash, encrypt ) them. For these you can do something
like treating a 64 bit number like an 8 byte array, encrypt it with a
"secret" key and you send something which is random enough to the
gateway, but is easy for you to manage in the database, and preserves
the counter niceties, like knowing there are approximately 1000
customers between IDs 3000 and 4000 ( some may have been deleted
/skiped ). Also, depending on your patterns, IDs typically correlate
nicely with table order, which makes index access a bit faster, you do
not jump around. And 4 byte ints make for a much denser index than 16
byte uuid.

Francisco Olarte.


Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Peter J. Holzer
In reply to this post by Hemil Ruparel
On 2020-10-07 20:10:34 +0530, Hemil Ruparel wrote:
> Sorry if this is silly but if it is a 128 bit number, why do we need 32
> characters to represent it? Isn't 8 bits one byte?

Yes, 8 bits are 1 byte. But that's 256 different values, so to display
them in 1 character you would need 256 different characters. That's not
possible in ASCII (ASCII has only 94 graphic characters), and even if
you included accented characters and other alphabets (like Greek or
Cyrillic) it would be hard to read. So the decision was to display each
byte as a pair of two hexadecimal digits (because 16 * 16 = 256). They
could also have used 3 decimal digits (000 - 255) for each byte, but
that would have wasted even more space, or they could have used base 32
or 64 for the whole number, but that would make conversion harder.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | [hidden email]         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Peter J. Holzer
On 2020-10-10 11:31:23 +0200, Peter J. Holzer wrote:
> On 2020-10-07 20:10:34 +0530, Hemil Ruparel wrote:
> > Sorry if this is silly but if it is a 128 bit number, why do we need 32
> > characters to represent it? Isn't 8 bits one byte?
>
> Yes, 8 bits are 1 byte. But that's 256 different values, so to display
> them in 1 character you would need 256 different characters. That's not
> possible in ASCII (ASCII has only 94 graphic characters), and even if
> you included accented characters and other alphabets (like Greek or
> Cyrillic) it would be hard to read.

I'm showing my European bias here.

I should have thought of Korean. The Hangul script is syllabic with a
very straightforward and easy to learn structure. Wikipedia tells me
that they have 19 consonants and 21 vowels, so you could just pick 16
consonants and 16 vowels to construct 256 syllables. That would even
make UUIDs pronounceable.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | [hidden email]         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: How to migrate column type from uuid to serial

Hemil Ruparel
oh. I get it now. Thanks


On Sat, Oct 10, 2020 at 3:41 PM Peter J. Holzer <[hidden email]> wrote:
On 2020-10-10 11:31:23 +0200, Peter J. Holzer wrote:
> On 2020-10-07 20:10:34 +0530, Hemil Ruparel wrote:
> > Sorry if this is silly but if it is a 128 bit number, why do we need 32
> > characters to represent it? Isn't 8 bits one byte?
>
> Yes, 8 bits are 1 byte. But that's 256 different values, so to display
> them in 1 character you would need 256 different characters. That's not
> possible in ASCII (ASCII has only 94 graphic characters), and even if
> you included accented characters and other alphabets (like Greek or
> Cyrillic) it would be hard to read.

I'm showing my European bias here.

I should have thought of Korean. The Hangul script is syllabic with a
very straightforward and easy to learn structure. Wikipedia tells me
that they have 19 consonants and 21 vowels, so you could just pick 16
consonants and 16 vowels to construct 256 syllables. That would even
make UUIDs pronounceable.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | [hidden email]         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"