The Windows random number generator (a port of lrand48 in random.c)
seems a little weak. It seems to only offer about 16 bits of precision.
Maybe there is a bug in the implementation?

Merlin

observe:
esp=# select count(*) from (select distinct random() from generate_series(1,1000000)) q;
 count
-------
 65559
(1 row)

esp=# select count(*) from (select distinct random() from generate_series(1,1000000)) q;
 count
-------
 65558
(1 row)

esp=# select count(*) from (select distinct random() from generate_series(1,1000000)) q;
 count
-------
 65572
(1 row)

esp=# select min(r), max(r), avg(r) from (select random() as r from generate_series(1,1000000)) q;
         min          |        max        |        avg
----------------------+-------------------+-------------------
 4.6566128752458e-010 | 0.999984742142253 | 0.499985154491819
(1 row)

esp=# select min(r), max(r), avg(r) from (select random() as r from generate_series(1,1000000)) q;
         min          |        max        |       avg
----------------------+-------------------+------------------
 4.6566128752458e-010 | 0.999984742142253 | 0.50079921773987
(1 row)

esp=#
esp=# select min(r), max(r), avg(r) from (select random() as r from generate_series(1,1000000)) q;
         min          |        max        |        avg
----------------------+-------------------+-------------------
 4.6566128752458e-010 | 0.999984742142253 | 0.499613384426336
(1 row)

"Merlin Moncure" <[hidden email]> writes:
> The Windows random number generator (a port of lrand48 in random.c)
> seems a little weak. It seems to only offer about 16 bits of precision.
> Maybe there is a bug in the implementation?

> esp=# select count(*) from (select distinct random() from
> generate_series(1,1000000)) q;
>  count
> -------
>  65559
> (1 row)

That's pretty awful, all right.  I get numbers like this on two
different Unix machines:

regression=# select count(*) from (select distinct random() from generate_series(1,1000000)) q;
 count
--------
 999769
(1 row)

postgres=# select count(*) from (select distinct random() from
postgres(# generate_series(1,1000000)) q;
 count
--------
 999787
(1 row)

Anyone care to burrow into the code and see what its problem is?

            regards, tom lane

> "Merlin Moncure" <[hidden email]> writes:
> > The Windows random number generator (a port of lrand48 in random.c)
> > seems a little weak. It seems to only offer about 16 bits of precision.
> > Maybe there is a bug in the implementation?
> >
> > esp=# select count(*) from (select distinct random() from
> > generate_series(1,1000000)) q;
> >  count
> > -------
> >  65559
> > (1 row)
>
> That's pretty awful, all right.  I get numbers like this on two
> different Unix machines:

I'll research a fix.  Here's a clearer picture of the problem:
select distinct (random()::numeric(10,8)) * 65536 from
generate_series(1,1000000);

esp=# select distinct (random() * 65536)::numeric(15,8) from
generate_series(1,1000000);
    numeric
----------------
     0.00003052
     1.00003052
     2.00003052
     3.00003052
     4.00003052
[...]

oops! :)
Merlin

>
> Anyone care to burrow into the code and see what its problem is?

got it.

Looks like this in lrand48(void):
//return ((long) _rand48_seed[2] << 15) + ((long) _rand48_seed[1] > 1);

is supposed to be this:
return (long)((unsigned long) _rand48_seed[2] << 15) + ((unsigned long) _rand48_seed[1] >> 1);

in port\rand.c

Merlin

"Merlin Moncure" <[hidden email]> writes:
> Looks like this in lrand48(void):
> //return ((long) _rand48_seed[2] << 15) + ((long) _rand48_seed[1] > 1);

> is supposed to be this:
> return (long)((unsigned long) _rand48_seed[2] << 15) + ((unsigned long)
> _rand48_seed[1] >> 1);

Hmm, _rand48_seed is unsigned short, so casting to either long or
unsigned long should zero-extend, and then it doesn't matter whether
the shifts think it's signed or not.  In short, that shouldn't change
the behavior unless your compiler is broken.

            regards, tom lane

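The zero-extension argument above can be checked mechanically. The following is a minimal sketch, not code from the PostgreSQL tree: the helper names `combine_signed`, `combine_unsigned`, and `casts_agree` are made up for illustration, and it assumes a 16-bit `unsigned short` and a `long` of at least 32 bits. Both variants use the intended `>> 1` shift, so it isolates the effect of the casts only.

```c
#include <limits.h>

/* Combine two 16-bit words using signed long casts (hypothetical helper). */
long combine_signed(unsigned short hi, unsigned short mid)
{
    return ((long) hi << 15) + ((long) mid >> 1);
}

/* Same combination using unsigned long casts, converted back to long. */
long combine_unsigned(unsigned short hi, unsigned short mid)
{
    return (long) (((unsigned long) hi << 15) + ((unsigned long) mid >> 1));
}

/*
 * Returns 1 if both variants agree on every sampled pair, 0 otherwise.
 * The high word is sampled in steps of 257 (which includes 0 and 65535
 * when unsigned short is 16 bits); the middle word is checked exhaustively.
 */
int casts_agree(void)
{
    unsigned int hi, mid;

    for (hi = 0; hi <= USHRT_MAX; hi += 257)
        for (mid = 0; mid <= USHRT_MAX; mid++)
            if (combine_signed((unsigned short) hi, (unsigned short) mid) !=
                combine_unsigned((unsigned short) hi, (unsigned short) mid))
                return 0;
    return 1;
}
```

Calling `casts_agree()` from a small test program should return 1 on a conforming compiler, since an `unsigned short` converted to either type is never negative and the largest result fits in 31 bits.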
Tom Lane writes:
> "Merlin Moncure" <[hidden email]> writes:
> > Looks like this in lrand48(void):
> > //return ((long) _rand48_seed[2] << 15) + ((long) _rand48_seed[1] > 1);
> >
> > is supposed to be this:
> > return (long)((unsigned long) _rand48_seed[2] << 15) + ((unsigned long)
> > _rand48_seed[1] >> 1);
>
> Hmm, _rand48_seed is unsigned short, so casting to either long or
> unsigned long should zero-extend, and then it doesn't matter whether
> the shifts think it's signed or not.  In short, that shouldn't change
> the behavior unless your compiler is broken.

Yes, ISTM the unsigned longs are superfluous.  The bug is that the first
version has a typo, in that greater-than is applied to _rand48_seed[1]
when it means to do a right-shift, which explains why we're seeing
"just over" 16 bits of precision.

Cheers,
Claudio

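To make the "just over 16 bits" point concrete, here is a small sketch that counts how many distinct results each expression can produce for one fixed high word. The function names are hypothetical and the two 16-bit words are passed as plain parameters rather than read from `_rand48_seed`; it assumes a 16-bit `unsigned short`.

```c
/* Expression with the typo: "> 1" is a comparison, so it adds only 0 or 1. */
long combine_typo(unsigned short hi, unsigned short mid)
{
    return ((long) hi << 15) + ((long) mid > 1);
}

/* Intended expression: ">> 1" keeps the top 15 bits of the middle word. */
long combine_shift(unsigned short hi, unsigned short mid)
{
    return ((long) hi << 15) + ((long) mid >> 1);
}

/*
 * Count distinct results for one high word over all 65536 middle words.
 * Both expressions are non-decreasing in mid, so counting the points
 * where the value changes is enough.
 */
long distinct_for_hi(long (*combine)(unsigned short, unsigned short),
                     unsigned short hi)
{
    long count = 1;
    long prev = combine(hi, 0);
    unsigned int mid;

    for (mid = 1; mid <= 65535u; mid++) {
        long cur = combine(hi, (unsigned short) mid);

        if (cur != prev) {
            count++;
            prev = cur;
        }
    }
    return count;
}
```

With the typo, each high word maps to only two results, and the lower one appears only when the middle word is 0 or 1. That caps the output at 2 * 65536 = 131072 distinct values, and in a sample almost all of them come from the 65536 "plus one" cases. With the shift, each high word yields 32768 results.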
> "Merlin Moncure" <[hidden email]> writes:
> > Looks like this in lrand48(void):
_rand48_seed[1] > 1);
> > _rand48_seed[1] >> 1);
                    ^^
The problem is the shift operator :).  Anyways I double checked the
results and it works as expected now so here's a patch.  I also removed
the spurious casts.

Merlin

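For readers who want to repeat the distinct-value check outside the database, the following is a rough sketch and not the patch mentioned above. It assumes the port uses the standard rand48 recurrence X = (0x5DEECE66D * X + 0xB) mod 2^48, keeps the state in a single `uint64_t` instead of the `_rand48_seed` array, and uses an arbitrary seed; all function names are made up for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

#define RAND48_MULT UINT64_C(0x5DEECE66D)     /* standard rand48 multiplier */
#define RAND48_ADD  UINT64_C(0xB)             /* standard rand48 increment */
#define RAND48_MASK UINT64_C(0xFFFFFFFFFFFF)  /* keep the low 48 bits */

/* Advance the 48-bit state: X = (a * X + c) mod 2^48. */
uint64_t rand48_step(uint64_t state)
{
    return (state * RAND48_MULT + RAND48_ADD) & RAND48_MASK;
}

/* Build the result from the upper two 16-bit words of the state. */
long output_from_state(uint64_t state, int use_shift)
{
    unsigned short hi = (unsigned short) (state >> 32);
    unsigned short mid = (unsigned short) ((state >> 16) & 0xFFFF);

    if (use_shift)
        return ((long) hi << 15) + ((long) mid >> 1);  /* corrected */
    return ((long) hi << 15) + ((long) mid > 1);       /* with the typo */
}

static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *) a;
    long y = *(const long *) b;

    return (x > y) - (x < y);
}

/*
 * Generate n values and return how many are distinct.
 * Returns 0 if n is 0 or the allocation fails.
 */
size_t count_distinct(size_t n, int use_shift)
{
    uint64_t state = UINT64_C(0x1234ABCD330E);  /* arbitrary seed for this sketch */
    long *vals;
    size_t i, distinct;

    if (n == 0)
        return 0;
    vals = malloc(n * sizeof *vals);
    if (vals == NULL)
        return 0;

    for (i = 0; i < n; i++) {
        state = rand48_step(state);
        vals[i] = output_from_state(state, use_shift);
    }

    /* Sort, then count positions where the value changes. */
    qsort(vals, n, sizeof *vals, cmp_long);
    distinct = 1;
    for (i = 1; i < n; i++)
        if (vals[i] != vals[i - 1])
            distinct++;

    free(vals);
    return distinct;
}
```

Comparing `count_distinct(1000000, 0)` with `count_distinct(1000000, 1)` mirrors the `select count(*) ... distinct random()` queries earlier in the thread: the typo variant can never exceed 131072 distinct values, while the corrected variant should come out close to the sample size.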
"Merlin Moncure" <[hidden email]> writes:
> "Merlin Moncure" <[hidden email]> writes:
>> Looks like this in lrand48(void):
>> _rand48_seed[1] > 1);
>> _rand48_seed[1] >> 1);
>>                 ^^

> The problem is the shift operator :).

Ah, missed that completely in looking at the casts.  Will fix.

            regards, tom lane
