Hash Functions

Jeff Davis
https://www.postgresql.org/message-id/CAMp0ubeo3fzzEfiE1vmc1AJkkRPxLnZQoOASeu6cCcco-c+9zw@...

In that thread, I pointed out some important considerations for the
hash functions themselves. This is a follow-up, after I looked more
carefully.

1. The hash functions as they exist today aren't portable -- they can
return different results on different machines. That means using these
functions for hash partitioning would yield different contents for the
same partition on different architectures (and that's bad, considering
they are logical partitions and not some internal detail).

The core hashing algorithm is based on 32-bit chunks, so it's only
portable if the input is an int32 (or array of int32s). That's not
true for varchar (etc.) or numeric (which is based on an array of
int16s). There is a hack for int8, but see issue #2 below.

We could try to mark portable vs. unportable hash functions, but it
seems quite valuable to be able to logically partition on varchar, so
I think we should have some portable answer there. Another annoyance
is that we would have to assume container types are unportable, or
make them inherit the portability of the underlying type's hash
function.

We could revert 26043592 (copied Tom because that was his patch) to
make hash_any() go back to being portable -- do we know what that
speedup actually was? Maybe the benefit is smaller on newer
processors? Another option is to try to do some combination of
byteswapping and word-at-a-time, which might be better than
byte-at-a-time if the byteswapping is done with a native instruction.
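
To illustrate the byteswapping idea, here is a rough sketch (not the
real hash_any() code; WORDS_BIGENDIAN is the macro from pg_config.h,
and the bswap relies on GCC's __builtin_bswap32):

#include <stdint.h>
#include <string.h>

/*
 * Load one 32-bit chunk in a canonical byte order.  On big-endian
 * machines the bswap is a single native instruction, so this should
 * stay much cheaper than falling back to byte-at-a-time hashing.
 */
static uint32_t
fetch32_canonical(const unsigned char *k)
{
    uint32_t chunk;

    memcpy(&chunk, k, sizeof(chunk));   /* word-at-a-time load */
#ifdef WORDS_BIGENDIAN
    chunk = __builtin_bswap32(chunk);   /* normalize to little-endian */
#endif
    return chunk;
}

The mixing stages would then consume these canonical chunks exactly as
they do today.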

2. hashint8() is a little dubious. If the input is positive, it XORs
the high bits; if negative, it XORs the complement of the high bits.
That works for compatibility with hashint2/4, but it does not produce
good hash properties[1]. I prefer that we either (a) upcast int2/4 to
int8 before hashing and then hash the whole 64 bits; or (b) if the
value is within range, downcast to int4, otherwise hash the whole 64
bits.
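
For reference, the current logic is roughly the following (a paraphrase
of the backend code; hash_uint32() is the real entry point, while
hash_uint64() in option (b) is hypothetical):

#include <stdint.h>

extern uint32_t hash_uint32(uint32_t k);    /* existing 32-bit mixer */
extern uint32_t hash_uint64(uint64_t k);    /* hypothetical 64-bit mixer */

/* Roughly today's behavior: fold the high half into the low half. */
uint32_t
hashint8_current(int64_t val)
{
    uint32_t lohalf = (uint32_t) val;
    uint32_t hihalf = (uint32_t) ((uint64_t) val >> 32);

    /* Every pair of inputs (h<<32)|l and (~h<<32)|l collapses to the
     * same 32-bit value here, before any mixing happens. */
    lohalf ^= (val >= 0) ? hihalf : ~hihalf;
    return hash_uint32(lohalf);
}

/* Option (b): downcast when in range, otherwise hash all 64 bits. */
uint32_t
hashint8_proposed(int64_t val)
{
    if (val >= INT32_MIN && val <= INT32_MAX)
        return hash_uint32((uint32_t) val);   /* agrees with hashint2/4 */
    return hash_uint64((uint64_t) val);
}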

3. We should downcast numeric to int4/8 before hashing if it's in
range, so that it can be a part of the same opfamily as int2/4/8.

4. text/varchar should use hashbpchar() so that they can be part of
the same opfamily. Trailing blanks seem unlikely to be significant for
most real-world data anyway.

5. For salts[2], I don't think it's too hard to support them in an
optional way. We just allow the function to be a two-argument function
with a default. Places that care about specifying the salt may do so
if the function has pronargs==2, otherwise it just gets the default
value. If we have salts, I don't think having 64-bit hashes is very
important. If we run out of bits, we can just salt the hash function
differently and get more hash bits. This is not urgent and I believe
we should just implement salts when and if some algorithm needs them.
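
To make the "affect the initial state" idea concrete, here is a toy
illustration using FNV-1a rather than the real hash_any() (purely a
sketch; the function name is made up):

#include <stddef.h>
#include <stdint.h>

/* The seed only perturbs the initial state, so the per-byte cost is
 * unchanged, and different seeds yield independent-ish hash bits. */
static uint32_t
hash_bytes_seeded(const unsigned char *k, size_t len, uint32_t seed)
{
    uint32_t h = 2166136261u ^ seed;   /* FNV offset basis, perturbed */

    for (size_t i = 0; i < len; i++)
    {
        h ^= k[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

A caller needing more than 32 bits could combine, say, seed 0 and
seed 1 rather than widening the result type.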

Regards,
    Jeff Davis

[1] You can see a kind of mirroring in the hash outputs, indicating bad mixing:

postgres=# select hashint8((2^32-100)::int8);
 hashint8
----------
 41320869
(1 row)

postgres=# select hashint8((-2^32-100)::int8);
 hashint8
----------
 41320869
(1 row)

postgres=# select hashint8((2^32-101)::int8);
  hashint8
-------------
 -1214482326
(1 row)

postgres=# select hashint8((-2^32-101)::int8);
  hashint8
-------------
 -1214482326
(1 row)


[2] Not sure I'm using the term "salt" properly. I really just mean a
way to affect the initial state, which I think is good enough for our
purposes.



Re: Hash Functions

Robert Haas
On Fri, May 12, 2017 at 12:08 AM, Jeff Davis <[hidden email]> wrote:
> 1. The hash functions as they exist today aren't portable -- they can
> return different results on different machines. That means using these
> functions for hash partitioning would yield different contents for the
> same partition on different architectures (and that's bad, considering
> they are logical partitions and not some internal detail).

Hmm, yeah, that is bad.

> We could revert 26043592 (copied Tom because that was his patch) to
> make hash_any() go back to being portable -- do we know what that
> speedup actually was? Maybe the benefit is smaller on newer
> processors? Another option is to try to do some combination of
> byteswapping and word-at-a-time, which might be better than
> byte-at-a-time if the byteswapping is done with a native instruction.

With regard to portability, I find that in 2009, according to Tom, we
had "already agreed" that it was dispensable:

http://postgr.es/m/23832.1234214526@...

I was not able to find where that was agreed.  On performance, I found this:

https://www.postgresql.org/message-id/20081104202655.GP18362@...

It says at the end: "The average time to reindex the table using our
current hash_any() without the separate mix()/final() was 1696ms and
1482ms with the separate mix()/final() stages giving almost 13% better
performance for this stupid metric."

> 5. For salts[2], I don't think it's too hard to support them in an
> optional way. We just allow the function to be a two-argument function
> with a default. Places that care about specifying the salt may do so
> if the function has pronargs==2, otherwise it just gets the default
> value. If we have salts, I don't think having 64-bit hashes is very
> important. If we run out of bits, we can just salt the hash function
> differently and get more hash bits. This is not urgent and I believe
> we should just implement salts when and if some algorithm needs them.

The potential problem with that is that the extra argument might slow
down the hash functions enough to matter.  Argument unpacking is not
free, and the hashing logic itself will get more complex.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Hash Functions

Andres Freund


On May 12, 2017 10:05:56 AM PDT, Robert Haas <[hidden email]> wrote:

>On Fri, May 12, 2017 at 12:08 AM, Jeff Davis <[hidden email]> wrote:
>> 1. The hash functions as they exist today aren't portable -- they can
>> return different results on different machines. That means using these
>> functions for hash partitioning would yield different contents for the
>> same partition on different architectures (and that's bad, considering
>> they are logical partitions and not some internal detail).
>
>Hmm, yeah, that is bad.

Given that a lot of data types have an architecture-dependent representation, it seems somewhat unrealistic and expensive to have a hard rule to keep them architecture-agnostic.  And if that's not guaranteed, then I'm doubtful it makes sense as a soft rule either.

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: Hash Functions

Robert Haas
On Fri, May 12, 2017 at 1:12 PM, Andres Freund <[hidden email]> wrote:
> Given that a lot of data types have an architecture-dependent representation, it seems somewhat unrealistic and expensive to have a hard rule to keep them architecture-agnostic.  And if that's not guaranteed, then I'm doubtful it makes sense as a soft rule either.

That's a good point, but the flip side is that, if we don't have such
a rule, a pg_dump of a hash-partitioned table on one architecture
might fail to restore on another architecture.  Today, I believe that,
while the actual database cluster is architecture-dependent, a pg_dump
is architecture-independent.  Is it OK to lose that property?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Hash Functions

Tom Lane
Robert Haas <[hidden email]> writes:
> On Fri, May 12, 2017 at 1:12 PM, Andres Freund <[hidden email]> wrote:
>> Given that a lot of data types have an architecture-dependent representation, it seems somewhat unrealistic and expensive to have a hard rule to keep them architecture-agnostic.  And if that's not guaranteed, then I'm doubtful it makes sense as a soft rule either.

> That's a good point, but the flip side is that, if we don't have such
> a rule, a pg_dump of a hash-partitioned table on one architecture
> might fail to restore on another architecture.  Today, I believe that,
> while the actual database cluster is architecture-dependent, a pg_dump
> is architecture-independent.  Is it OK to lose that property?

I'd vote that it's not, which means that this whole approach to hash
partitioning is unworkable.  I agree with Andres that demanding hash
functions produce architecture-independent values will not fly.

Maintaining such a property for float8 (and the types that depend on it)
might be possible if you believe that nobody ever uses anything but IEEE
floats, but we've never allowed that as a hard assumption before.

Even architecture dependence isn't the whole scope of the problem.
Consider for example dumping a LATIN1-encoded database and trying
to reload it into a UTF8-encoded database.  People will certainly
expect that to be possible, and do you want to guarantee that the
hash of a text value is encoding-independent?

                        regards, tom lane



Re: Hash Functions

Joe Conway
In reply to this post by Robert Haas
On 05/12/2017 10:17 AM, Robert Haas wrote:

> On Fri, May 12, 2017 at 1:12 PM, Andres Freund wrote:
>> Given that a lot of data types have an architecture-dependent
>> representation, it seems somewhat unrealistic and expensive to have
>> a hard rule to keep them architecture-agnostic.  And if that's not
>> guaranteed, then I'm doubtful it makes sense as a soft rule
>> either.
>
> That's a good point, but the flip side is that, if we don't have
> such a rule, a pg_dump of a hash-partitioned table on one
> architecture might fail to restore on another architecture.  Today, I
> believe that, while the actual database cluster is
> architecture-dependent, a pg_dump is architecture-independent.  Is it
> OK to lose that property?

Not from where I sit.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



Re: Hash Functions

Robert Haas
In reply to this post by Tom Lane-2
On Fri, May 12, 2017 at 1:34 PM, Tom Lane <[hidden email]> wrote:
> I'd vote that it's not, which means that this whole approach to hash
> partitioning is unworkable.  I agree with Andres that demanding hash
> functions produce architecture-independent values will not fly.

If we can't produce architecture-independent hash values, then what's
the other option?

One alternative would be to change the way that we dump and restore
the data.  Instead of dumping the data with the individual partitions,
dump it all out for the parent and let tuple routing sort it out at
restore time.  Of course, this isn't very satisfying because now
dump-and-restore hasn't really preserved the state of the database;
indeed, you could make it into a hard failure by creating a foreign
key referencing a hash partition.  After dump-and-restore,
the row ends up in some other partition and the foreign key can't be
recreated because the relationship no longer holds.  This isn't
limited to foreign keys, either; similar problems could be created
with CHECK constraints or other per-table properties that can vary
between one child and another.

I basically think it's pretty futile to suppose that we can get away
with having a dump and restore move rows around between partitions
without having that blow up in some cases.

> Maintaining such a property for float8 (and the types that depend on it)
> might be possible if you believe that nobody ever uses anything but IEEE
> floats, but we've never allowed that as a hard assumption before.

I don't know how standard that is.  Is there any hardware that
anyone's likely to be using that doesn't?  TBH, I don't really care if
support for obscure, nearly-dead platforms like VAX or whatever don't
quite work with hash-partitioned tables.  In practice, PostgreSQL only
sorta works on that kind of platform anyway; there are far bigger
problems than this.  On the other hand, if there are servers being
shipped in 2017 that don't use IEEE floats, that's another problem.

What about integers?  I think we're already assuming two's-complement
arithmetic, which I think means that the only problem with making the
hash values portable for integers is big-endian vs. little-endian.
That sounds solvable-ish.

> Even architecture dependence isn't the whole scope of the problem.
> Consider for example dumping a LATIN1-encoded database and trying
> to reload it into a UTF8-encoded database.  People will certainly
> expect that to be possible, and do you want to guarantee that the
> hash of a text value is encoding-independent?

No, I think that's expecting too much.  I'd be just fine telling
people that if you hash-partition on a text column, it may not load
into a database with another encoding.  If you care about that, don't
use hash-partitioning, or don't change the encoding, or dump out the
partitions one by one and reload all the rows into the parent table
for re-routing, solving whatever problems come up along the way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Hash Functions

Tom Lane
Robert Haas <[hidden email]> writes:
> On Fri, May 12, 2017 at 1:34 PM, Tom Lane <[hidden email]> wrote:
>> I'd vote that it's not, which means that this whole approach to hash
>> partitioning is unworkable.  I agree with Andres that demanding hash
>> functions produce architecture-independent values will not fly.

> If we can't produce architecture-independent hash values, then what's
> the other option?

Forget hash partitioning.  There's no law saying that that's a good
idea and we have to have it.  With a different set of constraints,
maybe we could do it, but I think the existing design decisions have
basically locked it out --- and I doubt that hash partitioning is so
valuable that we should jettison other desirable properties to get it.

> One alternative would be to change the way that we dump and restore
> the data.  Instead of dumping the data with the individual partitions,
> dump it all out for the parent and let tuple routing sort it out at
> restore time.  Of course, this isn't very satisfying because now
> dump-and-restore hasn't really preserved the state of the database;
> indeed, you could make it into a hard failure by creating a foreign
> key referencing a hash partition.

Yeah, that isn't really appetizing at all.  If we were doing physical
partitioning below the user-visible level, we could make it fly.
But the existing design makes the partition boundaries user-visible
which means we have to insist that the partitioning rule is immutable
(in the strongest sense of being invariant across all installations
you could possibly wish to transport data between).

Now, we already have some issues of that sort with range partitioning;
if say you try to transport data to another system with a different
sort ordering, you have probably got problems.  But that issue already
existed with the old manual partitioning approach, ie if you have a
CHECK constraint like "(col >= 'foo' && col < 'gob')" then a transfer
to such a system could fail already.  So I'm okay with that.  But it
seems like hash partitioning is going to introduce new, exciting, and
hard-to-document-or-avoid portability gotchas.  Do we really need
to go there?

                        regards, tom lane



Re: Hash Functions

ktm@rice.edu
In reply to this post by Robert Haas
On Fri, May 12, 2017 at 02:23:14PM -0400, Robert Haas wrote:
>
> What about integers?  I think we're already assuming two's-complement
> arithmetic, which I think means that the only problem with making the
> hash values portable for integers is big-endian vs. little-endian.
> That sounds solvable-ish.
>

xxhash produces identical hashes independent of endianness, on both
big-endian and little-endian machines.

https://github.com/Cyan4973/xxHash
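
For example (XXH32() is xxhash's one-shot entry point; its third
argument is a seed, which also lines up with the "salt" idea upthread):

#include <stddef.h>
#include <stdint.h>
#include "xxhash.h"

/* Same bytes in, same 32-bit hash out, regardless of endianness. */
uint32_t
hash_key(const void *key, size_t len)
{
    return XXH32(key, len, 0);   /* 0 = default seed */
}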

Regards,
Ken



Re: Hash Functions

Robert Haas
In reply to this post by Tom Lane-2
On Fri, May 12, 2017 at 2:45 PM, Tom Lane <[hidden email]> wrote:
> Yeah, that isn't really appetizing at all.  If we were doing physical
> partitioning below the user-visible level, we could make it fly.
> But the existing design makes the partition boundaries user-visible
> which means we have to insist that the partitioning rule is immutable
> (in the strongest sense of being invariant across all installations
> you could possibly wish to transport data between).

I think you're right.

> Now, we already have some issues of that sort with range partitioning;
> if say you try to transport data to another system with a different
> sort ordering, you have probably got problems.  But that issue already
> existed with the old manual partitioning approach, ie if you have a
> CHECK constraint like "(col >= 'foo' && col < 'gob')" then a transfer
> to such a system could fail already.  So I'm okay with that.  But it
> seems like hash partitioning is going to introduce new, exciting, and
> hard-to-document-or-avoid portability gotchas.  Do we really need
> to go there?

That is a good question.  I think it basically amounts to this
question: is hash partitioning useful, and if so, for what?

Range and list partitioning inherently provide management benefits.
For example, if you range-partition your data by year, then when you
want to get rid of the oldest year worth of data, you can drop the
entire partition at once, which is likely to be much faster than a
DELETE statement.  But hash partitioning provides no such benefits
because, by design, the distribution of the data among the partitions
is essentially random.  Dropping one partition will not usually be a
useful thing to do because the rows in that partition are logically
unconnected.  So, if you have a natural way of partitioning a table by
range or list, that's probably going to be superior to hash
partitioning, but sometimes there isn't an obviously useful way of
breaking up the data, but you still want to partition it in order to
reduce the size of the individual tables.

That, in turn, allows maintenance operations to be performed each
partition separately.  For example, suppose your table is big enough
that running CLUSTER or VACUUM FULL on it takes eight hours, but you
can't really afford an eight-hour maintenance window when the table
gets bloated.  However, you can afford (on Sunday nights when activity
is lowest) a window of, say, one hour.  Well, if you hash-partition
the table, you can CLUSTER or VACUUM FULL one partition a week for N
weeks until you get to them all.  Similarly, if you need to create an
index, you can build it on one partition at a time.  You can even add
the index to one partition to see how well it works -- for example,
does it turn too many HOT updates into non-HOT updates, causing bloat?
-- and try it out before you go do it across the board.  And an
unpartitioned table can only accommodate one VACUUM process at a time,
but a partitioned table can have one per partition.  Partitioning a
table also means that you don't need a single disk volume large enough
to hold the whole thing; instead, you can put each partition in a
separate tablespace or (in some exciting future world where PostgreSQL
looks more like a distributed system) perhaps even on another server.

For a table that is a few tens of gigabytes, none of this amounts to a
hill of beans, but for a table that is a few tens of terabytes, the
time it takes to perform administrative operations can become a really
big problem.  A good example is, say, a table full of orders.  Imagine
a high-velocity business like Amazon or UPS that has a constant stream
of new orders, or a mobile app that has a constant stream of new user
profiles.  If that data grows fast enough that the table needs to be
partitioned, how should it be done?  It's often desirable to create
partitions of about equal size and about equal hotness, but
range-partitioning on the creation date or order ID number means
continually creating new partitions and concentrating all of the hot
data in the newest partition.

In my experience, it is *definitely* the case that users with very
large tables are experiencing significant pain right now.  The freeze
map changes were a good step towards ameliorating some of that pain,
but I think hash partitioning is another step in that direction, and I
think there will be other applications as well.  Even for users who
don't have data that large, there are also interesting
query-performance implications of hash partitioning.  Joins between
very large tables tend to get implemented as merge joins, which often
means sorting all the data, which is O(n lg n) and not easy to
parallelize.  But if those very large tables happened to be compatibly
partitioned on the join key, you could do a partitionwise join per the
patch Ashutosh proposed, which would be cheaper and easier to do in
parallel.

Maybe a shorter argument for hash partitioning is that not one but two
different people proposed patches for it within months of the initial
partitioning patch going in.  When multiple people are thinking about
implementing the same feature almost immediately after the
prerequisite patches land, that's a good clue that it's a desirable
feature.  So I think we should try to solve the problems, rather than
giving up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Hash Functions

Peter Eisentraut
In reply to this post by Robert Haas
On 5/12/17 14:23, Robert Haas wrote:
> One alternative would be to change the way that we dump and restore
> the data.  Instead of dumping the data with the individual partitions,
> dump it all out for the parent and let tuple routing sort it out at
> restore time.

I think this could be a pg_dump option: one way, it dumps the data with
the individual partitions; the other way, it dumps everything under the
parent and lets restore recompute the partitions.  I think that could be
well within pg_dump's mandate.

(cough ... logical replication ... cough)

> Of course, this isn't very satisfying because now
> dump-and-restore hasn't really preserved the state of the database;

That depends on whether you consider the partitions to be a user-visible
feature or an internal detail.  The current approach is sort of in the middle,
so it makes sense to allow the user to interpret it either way depending
on need.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Hash Functions

Alvaro Herrera
Peter Eisentraut wrote:
> On 5/12/17 14:23, Robert Haas wrote:
> > One alternative would be to change the way that we dump and restore
> > the data.  Instead of dumping the data with the individual partitions,
> > dump it all out for the parent and let tuple routing sort it out at
> > restore time.
>
> I think this could be a pg_dump option.  One way it dumps out the
> partitions, and another way it recomputes the partitions.  I think that
> could be well within pg_dump's mandate.

I was thinking the same, but enable that option automatically for hash
partitioning.  Each partition is still dumped separately, but instead of
referencing the specific partition in the TABLE DATA object, reference
the parent table.  This way, the restore can still be parallelized, but
tuple routing is executed for each tuple.

> (cough ... logical replication ... cough)

I think for logical replication the tuple should appear as being in the
parent table, not the partition.  No?

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Hash Functions

Peter Eisentraut
On 5/12/17 18:13, Alvaro Herrera wrote:
> I think for logical replication the tuple should appear as being in the
> parent table, not the partition.  No?

Logical replication replicates base table to base table.  How those
tables are tied together into a partitioned table or an inheritance tree
is up to the system catalogs on each side.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Hash Functions

David Fetter
On Fri, May 12, 2017 at 06:38:55PM -0400, Peter Eisentraut wrote:
> On 5/12/17 18:13, Alvaro Herrera wrote:
> > I think for logical replication the tuple should appear as being in the
> > parent table, not the partition.  No?
>
> Logical replication replicates base table to base table.  How those
> tables are tied together into a partitioned table or an inheritance tree
> is up to the system catalogs on each side.

This seems like a totally reasonable approach to pg_dump, especially
in light of the fact that logical replication already (and quite
reasonably) does it this way.  Hard work has been done to make
tuple-routing cheap, and this is one of the payoffs.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Hash Functions

Robert Haas
On Fri, May 12, 2017 at 7:36 PM, David Fetter <[hidden email]> wrote:

> On Fri, May 12, 2017 at 06:38:55PM -0400, Peter Eisentraut wrote:
>> On 5/12/17 18:13, Alvaro Herrera wrote:
>> > I think for logical replication the tuple should appear as being in the
>> > parent table, not the partition.  No?
>>
>> Logical replication replicates base table to base table.  How those
>> tables are tied together into a partitioned table or an inheritance tree
>> is up to the system catalogs on each side.
>
> This seems like a totally reasonable approach to pg_dump, especially
> in light of the fact that logical replication already (and quite
> reasonably) does it this way.  Hard work has been done to make
> tuple-routing cheap, and this is one of the payoffs.

Cheap isn't free, though.  It's got a double-digit percentage overhead
rather than a large-multiple-of-the-runtime overhead as triggers do,
but people still won't want to pay it unnecessarily, I think.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Hash Functions

Andres Freund
On 2017-05-12 21:56:30 -0400, Robert Haas wrote:
> Cheap isn't free, though.  It's got a double-digit percentage overhead
> rather than a large-multiple-of-the-runtime overhead as triggers do,
> but people still won't want to pay it unnecessarily, I think.

That should be partially addressable with reasonable amounts of
engineering though.  Computing the target partition in a
hash-partitioned table can be implemented very efficiently, and adding
infrastructure for multiple bulk insert targets in COPY should be quite
doable as well.  It's also work that's generally useful, not just for
backups.

The bigger issue to me here wrt pg_dump is that partitions can be
restored in parallel, but that'd probably not work as well if they were
dumped separately -- unless we did the re-routing on load rather than
when dumping, which would also improve cache locality, since most of the
time (same architecture/encoding/etc.) one backend would keep inserting
into the same partition.
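
(As a sketch of the per-tuple routing step -- hypothetical names, and
glossing over the details a real patch would need:)

#include <stdint.h>

/* One hash plus one modulo per tuple; cheap next to per-row triggers. */
static inline int
route_tuple(uint32_t hashvalue, int nparts)
{
    return (int) (hashvalue % (uint32_t) nparts);
}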

Greetings,

Andres Freund



Re: Hash Functions

akapila
In reply to this post by Robert Haas
On Sat, May 13, 2017 at 1:08 AM, Robert Haas <[hidden email]> wrote:

> On Fri, May 12, 2017 at 2:45 PM, Tom Lane <[hidden email]> wrote:
>
> Maybe a shorter argument for hash partitioning is that not one but two
> different people proposed patches for it within months of the initial
> partitioning patch going in.  When multiple people are thinking about
> implementing the same feature almost immediately after the
> prerequisite patches land, that's a good clue that it's a desirable
> feature.  So I think we should try to solve the problems, rather than
> giving up.
>

Can we think of defining separate portable hash functions which can be
used for the purpose of hash partitioning?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Hash Functions

Robert Haas
On Sat, May 13, 2017 at 12:52 AM, Amit Kapila <[hidden email]> wrote:
> Can we think of defining separate portable hash functions which can be
> used for the purpose of hash partitioning?

I think that would be a good idea.  I think it shouldn't even be that
hard.  By data type:

- Integers.  We'd need to make sure that we get the same results for
the same value on big-endian and little-endian hardware, and that
performance is good on both systems.  That seems doable (see the sketch
below this list).

- Floats.  There may be different representations in use on different
hardware, which could be a problem.  Tom didn't answer my question
about whether any even-vaguely-modern hardware is still using non-IEEE
floats, which I suspect means that the answer is "no".  If every bit
of hardware we are likely to find uses basically the same
representation of the same float value, then this shouldn't be hard.
(Also, even if this turns out to be hard for floats, using a float as
a partitioning key would be a surprising choice because the default
output representation isn't even unambiguous; you need
extra_float_digits for that.)

- Strings.  There's basically only one representation for a string.
If we assume that the hash code only needs to be portable across
hardware and not across encodings, a position for which I already
argued upthread, then I think this should be manageable.

- Everything Else.  Basically, everything else is just a composite of
that stuff, I think.
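
Sketch for the integer case (hypothetical names; mix32() stands in for
whatever mixing function we settle on).  The key point is that as long
as the hash works on the integer's value rather than its in-memory
bytes, there is nothing endian-dependent left to normalize:

#include <stdint.h>

extern uint32_t mix32(uint32_t v);   /* stand-in for the real mixer */

/* Arithmetic on the value never observes memory byte order, so this
 * returns the same result on big- and little-endian hardware. */
uint32_t
hash_int32_portable(int32_t v)
{
    return mix32((uint32_t) v);
}

/* For int64, fold the halves by value as well. */
uint32_t
hash_int64_portable(int64_t v)
{
    uint32_t lo = (uint32_t) v;
    uint32_t hi = (uint32_t) ((uint64_t) v >> 32);

    return mix32(lo ^ mix32(hi));
}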

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Hash Functions

Jeff Davis
In reply to this post by Tom Lane-2
On Fri, May 12, 2017 at 10:34 AM, Tom Lane <[hidden email]> wrote:
> Maintaining such a property for float8 (and the types that depend on it)
> might be possible if you believe that nobody ever uses anything but IEEE
> floats, but we've never allowed that as a hard assumption before.

This is not such a big practical problem (for me at least) because
hashing of floats is of dubious value.

> Even architecture dependence isn't the whole scope of the problem.
> Consider for example dumping a LATIN1-encoded database and trying
> to reload it into a UTF8-encoded database.  People will certainly
> expect that to be possible, and do you want to guarantee that the
> hash of a text value is encoding-independent?

That is a major problem. In an ideal world, we could make that work
with something like ucol_getSortKey(), but we don't require ICU, and
we can't mix getSortKey() with strxfrm(), or even strxfrm() results
from different platforms.

I don't think it's correct to hash the code points, either, because
strings may be considered equal in a locale even if the code points
aren't identical. But I don't think postgres lives up to that standard
currently.

But hash partitioning is too valuable to give up on entirely. I think
we should consider supporting a limited subset of types for now with
something not based on the hash am.

Regards,
    Jeff Davis



Re: Hash Functions

Tom Lane
In reply to this post by Robert Haas
Robert Haas <[hidden email]> writes:
> On Sat, May 13, 2017 at 12:52 AM, Amit Kapila <[hidden email]> wrote:
>> Can we think of defining separate portable hash functions which can be
>> used for the purpose of hash partitioning?

> I think that would be a good idea.  I think it shouldn't even be that
> hard.  By data type:

> - Integers.  We'd need to make sure that we get the same results for
> the same value on big-endian and little-endian hardware, and that
> performance is good on both systems.  That seems doable.

> - Floats.  There may be different representations in use on different
> hardware, which could be a problem.  Tom didn't answer my question
> about whether any even-vaguely-modern hardware is still using non-IEEE
> floats, which I suspect means that the answer is "no".  If every bit
> of hardware we are likely to find uses basically the same
> representation of the same float value, then this shouldn't be hard.
> (Also, even if this turns out to be hard for floats, using a float as
> a partitioning key would be a surprising choice because the default
> output representation isn't even unambiguous; you need
> extra_float_digits for that.)

> - Strings.  There's basically only one representation for a string.
> If we assume that the hash code only needs to be portable across
> hardware and not across encodings, a position for which I already
> argued upthread, then I think this should be manageable.

Basically, this is simply saying that you're willing to ignore the
hard cases, which reduces the problem to one of documenting the
portability limitations.  You might as well not even bother with
worrying about the integer case, because porting between little-
and big-endian systems is surely far less common than cases you've
already said you're okay with blowing off.

That's not an unreasonable position to take, perhaps; doing better
than that is going to be a lot more work and it's not very clear
how much real-world benefit results.  But I can't follow the point
of worrying about endianness but not encoding.

                        regards, tom lane

