Re: [POC] hash partitioning


Re: [POC] hash partitioning

yangjie@highgo.com
Hello

Looking at your hash partitioning syntax, I implemented hash partitioning in a more concise way: there is no need to determine the number of sub-tables up front, and partitions can be added dynamically.

Description

The hash partitioning implementation is built on the basis of the existing range/list partitioning and uses similar syntax.

To create a partitioned table, use:

CREATE TABLE h (id int) PARTITION BY HASH(id);

The partitioning key currently supports only one column. I think the partition key could support multiple columns,
which may be difficult to handle when querying, but it is not impossible.

A partition can be created as below:

 CREATE TABLE h1 PARTITION OF h;
 CREATE TABLE h2 PARTITION OF h;
 CREATE TABLE h3 PARTITION OF h;
 
The FOR VALUES clause cannot be used; the partition bound is calculated automatically as a partition index from a single integer value.

An inserted record is stored in the partition whose index equals

DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts   /* nparts = number of partitions */

In the above example, this is DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, id)) % 3.
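
For illustration, here is a minimal C sketch of that routing computation, assuming a single-column partition key. The helper name hash_partition_index and its signature are hypothetical, not taken from the patch; only the type-cache and function-call machinery already shown in the expression above is real:

/*
 * Minimal sketch of the routing computation described above (hypothetical
 * helper, not actual patch code), assuming a single-column partition key.
 */
static int
hash_partition_index(PartitionKey key, Datum *values, int nparts)
{
    TypeCacheEntry *tce;
    uint32          hashvalue;

    /* look up the hash function for the partition key's data type */
    tce = lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC);

    /* hash the (single) partition key value */
    hashvalue = DatumGetUInt32(OidFunctionCall1(tce->hash_proc, values[0]));

    /* the remainder modulo the current partition count is the target index */
    return hashvalue % nparts;
}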

postgres=# insert into h select generate_series(1,20);
INSERT 0 20
postgres=# select tableoid::regclass,* from h;
 tableoid | id 
----------+----
 h1       |  3
 h1       |  5
 h1       | 17
 h1       | 19
 h2       |  2
 h2       |  6
 h2       |  7
 h2       | 11
 h2       | 12
 h2       | 14
 h2       | 15
 h2       | 18
 h2       | 20
 h3       |  1
 h3       |  4
 h3       |  8
 h3       |  9
 h3       | 10
 h3       | 13
 h3       | 16
(20 rows)

The number of partitions can be increased dynamically. When a new partition is created, the number of partitions changes, so the calculated target partition changes as well, and it would not make sense for the same data to live in different partitions. Therefore, when you create a new partition, the existing data must be re-hashed and inserted into its new target partitions.
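
To make the arithmetic concrete, here is a tiny standalone C illustration (not PostgreSQL code; the hash value is made up) of why the target index changes when the partition count changes:

#include <stdio.h>

/*
 * Toy illustration only: the same hash value maps to a different
 * partition index once the partition count changes, which is why
 * existing rows must be re-hashed and moved when a partition is added.
 */
int
main(void)
{
    unsigned int hashvalue = 123456789;     /* made-up hash of some key */

    printf("3 partitions -> index %u\n", hashvalue % 3);   /* prints 0 */
    printf("4 partitions -> index %u\n", hashvalue % 4);   /* prints 1 */
    return 0;
}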

postgres=# create table h4 partition of h;
CREATE TABLE
postgres=# select tableoid::regclass,* from h;
 tableoid | id 
----------+----
 h1       |  5
 h1       | 17
 h1       | 19
 h1       |  6
 h1       | 12
 h1       |  8
 h1       | 13
 h2       | 11
 h2       | 14
 h3       |  1
 h3       |  9
 h3       |  2
 h3       | 15
 h4       |  3
 h4       |  7
 h4       | 18
 h4       | 20
 h4       |  4
 h4       | 10
 h4       | 16
(20 rows)

When querying the data, hash partitioning uses the same algorithm as insertion and filters out the partitions that do not need to be scanned.

postgres=# explain analyze select * from h where id = 1;
                                             QUERY PLAN                                             
----------------------------------------------------------------------------------------------------
 Append  (cost=0.00..41.88 rows=13 width=4) (actual time=0.020..0.023 rows=1 loops=1)
   ->  Seq Scan on h3  (cost=0.00..41.88 rows=13 width=4) (actual time=0.013..0.016 rows=1 loops=1)
         Filter: (id = 1)
         Rows Removed by Filter: 3
 Planning time: 0.346 ms
 Execution time: 0.061 ms
(6 rows)

postgres=# explain analyze select * from h where id in (1,5);
                                             QUERY PLAN                                             
----------------------------------------------------------------------------------------------------
 Append  (cost=0.00..83.75 rows=52 width=4) (actual time=0.016..0.028 rows=2 loops=1)
   ->  Seq Scan on h1  (cost=0.00..41.88 rows=26 width=4) (actual time=0.015..0.018 rows=1 loops=1)
         Filter: (id = ANY ('{1,5}'::integer[]))
         Rows Removed by Filter: 6
   ->  Seq Scan on h3  (cost=0.00..41.88 rows=26 width=4) (actual time=0.005..0.007 rows=1 loops=1)
         Filter: (id = ANY ('{1,5}'::integer[]))
         Rows Removed by Filter: 3
 Planning time: 0.720 ms
 Execution time: 0.074 ms
(9 rows)

postgres=# explain analyze select * from h where id = 1 or id = 5;
                                             QUERY PLAN                                             
----------------------------------------------------------------------------------------------------
 Append  (cost=0.00..96.50 rows=50 width=4) (actual time=0.017..0.078 rows=2 loops=1)
   ->  Seq Scan on h1  (cost=0.00..48.25 rows=25 width=4) (actual time=0.015..0.019 rows=1 loops=1)
         Filter: ((id = 1) OR (id = 5))
         Rows Removed by Filter: 6
   ->  Seq Scan on h3  (cost=0.00..48.25 rows=25 width=4) (actual time=0.005..0.010 rows=1 loops=1)
         Filter: ((id = 1) OR (id = 5))
         Rows Removed by Filter: 3
 Planning time: 0.396 ms
 Execution time: 0.139 ms
(9 rows)

Detach, attach, and drop of partition tables are not yet supported.

Best regards,
young

yonj1e.github.io

[hidden email]


Re: [POC] hash partitioning

Yugo Nagata
Hi young,

On Mon, 28 Aug 2017 15:33:46 +0800
"[hidden email]" <[hidden email]> wrote:

> Hello
>
> Looking at your hash partitioning syntax, I implemented a hash partition in a more concise way, with no need to determine the number of sub-tables, and dynamically add partitions.

I think it is great work, but the current consensus about hash partitioning supports
Amul's patch[1], in which the syntax is different from my original proposal.
So you will have to read Amul's patch and join that discussion if you still want to
propose your implementation.

Regards,

[1] https://www.postgresql.org/message-id/CAAJ_b965A2oog=6eFUhELexL3RmgFssB3G7LwkVA1bw0WUJJoA@...


> [...]


--
Yugo Nagata <[hidden email]>



Re: [POC] hash partitioning

yangjie@highgo.com
Hi,

Here is my patch; I forgot to add the attachment earlier. It is also being discussed at the following address:
https://www.postgresql.org/message-id/2017082612390093777512%40highgo.com

-------

young

HighGo Database: http://www.highgo.com


On 8/28/2017 16:28, [hidden email] wrote:
> [...]

Attachment: hash_part_on_beta2_v1.patch (49K)

Re: [POC] hash partitioning

amul sul
I've updated the patch to use an extended hash function (commit 81c5e46c490e2426db243eada186995da5bb0ba7) for partitioning.

Regards,
Amul


On Thu, Jul 27, 2017 at 5:11 PM, amul sul <[hidden email]> wrote:
Attaching newer patches rebased against the latest master head. Thanks !

Regards,
Amul




Attachment: 0001-Cleanup_v6.patch (5K)
Attachment: 0002-hash-partitioning_another_design-v17.patch (115K)

Re: [POC] hash partitioning

rajkumar.raghuwanshi

On Mon, Sep 4, 2017 at 4:08 PM, amul sul <[hidden email]> wrote:
I've updated patch to use an extended hash function (​Commit # 81c5e46c490e2426db243eada186995da5bb0ba7) for the partitioning.

I have done some testing with these patches; everything looks fine. Attaching the sql and out files for reference.

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation
 



Attachment: hash_partition_test.out (69K)
Attachment: hash_partition_test.sql (27K)

Re: [POC] hash partitioning

Robert Haas
On Mon, Sep 4, 2017 at 6:38 AM, amul sul <[hidden email]> wrote:
> I've updated patch to use an extended hash function (Commit #
> 81c5e46c490e2426db243eada186995da5bb0ba7) for the partitioning.

Committed 0001 after noticing that Jeevan Ladhe also found that change
convenient for default partitioning.  I made a few minor cleanups;
hopefully I didn't break anything.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [POC] hash partitioning

amul sul
On Fri, Sep 8, 2017 at 6:45 AM, Robert Haas <[hidden email]> wrote:
On Mon, Sep 4, 2017 at 6:38 AM, amul sul <[hidden email]> wrote:
> I've updated patch to use an extended hash function (Commit #
> 81c5e46c490e2426db243eada186995da5bb0ba7) for the partitioning.

Committed 0001 after noticing that Jeevan Ladhe also found that change
convenient for default partitioning.  I made a few minor cleanups;
hopefully I didn't break anything.

​Thanks you.

Rebased 0002 against this commit & renamed to 0001, PFA. 

Regards,
Amul​



Attachment: 0001-hash-partitioning_another_design-v18.patch (115K)

Re: [POC] hash partitioning

Ashutosh Bapat
On Fri, Sep 8, 2017 at 6:10 PM, amul sul <[hidden email]> wrote:

> On Fri, Sep 8, 2017 at 6:45 AM, Robert Haas <[hidden email]> wrote:
>>
>> On Mon, Sep 4, 2017 at 6:38 AM, amul sul <[hidden email]> wrote:
>> > I've updated patch to use an extended hash function (Commit #
>> > 81c5e46c490e2426db243eada186995da5bb0ba7) for the partitioning.
>>
>> Committed 0001 after noticing that Jeevan Ladhe also found that change
>> convenient for default partitioning.  I made a few minor cleanups;
>> hopefully I didn't break anything.
>
>
> Thanks you.
>
> Rebased 0002 against this commit & renamed to 0001, PFA.

Given that we have default partition support now, I am wondering
whether hash partitioned tables also should have default partitions.
The way we have structured hash partitioning syntax, there can be
"holes" in partitions. Default partition would help plug those holes.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [POC] hash partitioning

Robert Haas
On Mon, Sep 11, 2017 at 4:17 AM, Ashutosh Bapat
<[hidden email]> wrote:
>> Rebased 0002 against this commit & renamed to 0001, PFA.
>
> Given that we have default partition support now, I am wondering
> whether hash partitioned tables also should have default partitions.
> The way we have structured hash partitioning syntax, there can be
> "holes" in partitions. Default partition would help plug those holes.

Yeah, I was thinking about that, too.  On the one hand, it seems like
it's solving the problem the wrong way: if you've set up hash
partitioning properly, you shouldn't have any holes.  On the other
hand, supporting it probably wouldn't cost anything noticeable and
might make things seem more consistent.  I'm not sure which way to
jump on this one.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [POC] hash partitioning

Álvaro Herrera
Robert Haas wrote:

> On Mon, Sep 11, 2017 at 4:17 AM, Ashutosh Bapat
> <[hidden email]> wrote:
> >> Rebased 0002 against this commit & renamed to 0001, PFA.
> >
> > Given that we have default partition support now, I am wondering
> > whether hash partitioned tables also should have default partitions.
> > The way we have structured hash partitioning syntax, there can be
> > "holes" in partitions. Default partition would help plug those holes.
>
> Yeah, I was thinking about that, too.  On the one hand, it seems like
> it's solving the problem the wrong way: if you've set up hash
> partitioning properly, you shouldn't have any holes.  On the other
> hand, supporting it probably wouldn't cost anything noticeable and
> might make things seem more consistent.  I'm not sure which way to
> jump on this one.

How difficult/tedious/troublesome would it be to install the missing
partitions if you set hash partitioning with a default partition and
only later on notice that some partitions are missing?  I think if the
answer is that you need to exclusive-lock something for a long time and
this causes a disruption in production systems, then it's better not to
allow a default partition at all and just force all the hash partitions
to be there from the start.

On the other hand, if you can get tuples out of the default partition
into their intended regular partitions without causing any disruption,
then it seems okay to allow default partitions in hash partitioning
setups.

(I, like many others, was unable to follow the default partition stuff
as closely as I would have liked.)

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [POC] hash partitioning

Robert Haas
On Mon, Sep 11, 2017 at 8:00 AM, Alvaro Herrera <[hidden email]> wrote:

> How difficult/tedious/troublesome would be to install the missing
> partitions if you set hash partitioning with a default partition and
> only later on notice that some partitions are missing?  I think if the
> answer is that you need to exclusive-lock something for a long time and
> this causes a disruption in production systems, then it's better not to
> allow a default partition at all and just force all the hash partitions
> to be there from the start.
>
> On the other hand, if you can get tuples out of the default partition
> into their intended regular partitions without causing any disruption,
> then it seems okay to allow default partitions in hash partitioning
> setups.

I think there's no real use case for default partitioning, and yeah,
you do need exclusive locks to repartition things (whether hash
partitioning or otherwise).  It would be nice to fix that eventually,
but it's hard, because the executor has to cope with the floor moving
under it, and as of today, it really can't cope with that at all - not
because of partitioning specifically, but because of existing design
decisions that will require a lot of work (and probably arguing) to
revisit.

I think the way to get around the usability issues for hash
partitioning is to eventually add some syntax that does things like
(1) automatically create the table with N properly-configured
partitions, (2) automatically split an existing partition into N
pieces, and (3) automatically rewrite the whole table using a
different partition count.

People seem to find the hash partitioning stuff a little arcane.  I
don't want to discount that confusion with some sort of high-handed "I
know better" attitude, I think the interface that users will actually
see can end up being pretty straightforward.  The complexity that is
there in the syntax is to allow pg_upgrade and pg_dump/restore to work
properly.  But users don't necessarily have to use the same syntax
that pg_dump does, just as you can say CREATE INDEX ON a (b) and let
the system specify the index name, but at dump time the index name is
specified explicitly.

> (I, like many others, was unable to follow the default partition stuff
> as closely as I would have liked.)

Uh, sorry about that.  Would it help if I wrote a blog post on it or
something?  The general idea is simple: any tuples that don't route to
any other partition get routed to the default partition.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [POC] hash partitioning

amul sul
On Mon, Sep 11, 2017 at 5:30 PM, Alvaro Herrera <[hidden email]> wrote:
Robert Haas wrote:
> On Mon, Sep 11, 2017 at 4:17 AM, Ashutosh Bapat
> <[hidden email]> wrote:
> >> Rebased 0002 against this commit & renamed to 0001, PFA.
> >
> > Given that we have default partition support now, I am wondering
> > whether hash partitioned tables also should have default partitions.
> > The way we have structured hash partitioning syntax, there can be
> > "holes" in partitions. Default partition would help plug those holes.
>
> Yeah, I was thinking about that, too.  On the one hand, it seems like
> it's solving the problem the wrong way: if you've set up hash
> partitioning properly, you shouldn't have any holes.  On the other
> hand, supporting it probably wouldn't cost anything noticeable and
> might make things seem more consistent.  I'm not sure which way to
> jump on this one.

How difficult/tedious/troublesome would be to install the missing
partitions if you set hash partitioning with a default partition and
only later on notice that some partitions are missing?  I think if the
answer is that you need to exclusive-lock something for a long time and
this causes a disruption in production systems, then it's better not to
allow a default partition at all and just force all the hash partitions
to be there from the start.


I am also leaning toward not supporting a default partition for a hash-partitioned table.

The major drawback I can see is the constraint that gets created on the default partition
table.  IIUC, the constraint on the default partition table is just the negation of the
partition constraints of all its sibling partitions.

Consider a hash-partitioned table whose partitions have the hash bounds (modulus 64, remainder 0),
..., (modulus 64, remainder 62), with partition columns col1, col2, and so on.
The constraint for the default partition would then be:

NOT (satisfies_hash_partition(64, 0, hash_fn1(col1), hash_fn2(col2), ...) OR ... OR
     satisfies_hash_partition(64, 62, hash_fn1(col1), hash_fn2(col2), ...))

This would be much more harmful to performance than for any other partitioning
strategy, because it computes the hash of the same partitioning key many times.
We could overcome this by adding another SQL function (e.g. satisfies_default_hash_partition)
which calculates the hash value once and checks the remainder, but that would be
a different path from the current default partition framework.

​Regards,
Amul​


Re: [POC] hash partitioning

Jesper Pedersen
Hi Amul,

On 09/08/2017 08:40 AM, amul sul wrote:
> Rebased 0002 against this commit & renamed to 0001, PFA.
>

This patch needs a rebase.

Best regards,
  Jesper




Re: [POC] hash partitioning

amul sul
On Wed, Sep 13, 2017 at 7:43 PM, Jesper Pedersen <[hidden email]> wrote:
Hi Amul,

On 09/08/2017 08:40 AM, amul sul wrote:
Rebased 0002 against this commit & renamed to 0001, PFA.


This patch needs a rebase.


Thanks for your note.

Attached is the patch rebased on the latest master head.
Also added an error on creating a default partition for a hash-partitioned table,
and updated the documentation and test script for the same.

​Regards,
Amul​
 



Attachment: 0001-hash-partitioning_another_design-v19.patch (118K)

Re: [POC] hash partitioning

Jesper Pedersen
Hi Amul,

On 09/14/2017 04:58 AM, amul sul wrote:

> On Wed, Sep 13, 2017 at 7:43 PM, Jesper Pedersen <[hidden email]> wrote:
>> This patch needs a rebase.
>>
>>
> Thanks for your note.
> ​ ​
> Attached is the patch rebased on the latest master head.
> Also added error on ​creating ​​default partition ​for the hash partitioned table​,
> and updated document & test script for the same.
>

Thanks !

When I do

CREATE TABLE mytab (
   a integer NOT NULL,
   b integer NOT NULL,
   c integer,
   d integer
) PARTITION BY HASH (b);

and create 64 partitions;

CREATE TABLE mytab_p00 PARTITION OF mytab FOR VALUES WITH (MODULUS 64,
REMAINDER 0);
...
CREATE TABLE mytab_p63 PARTITION OF mytab FOR VALUES WITH (MODULUS 64,
REMAINDER 63);

and associated indexes

CREATE INDEX idx_p00 ON mytab_p00 USING btree (b, a);
...
CREATE INDEX idx_p63 ON mytab_p63 USING btree (b, a);

Populate the database, and do ANALYZE.

Given

EXPLAIN (ANALYZE, VERBOSE, BUFFERS ON) SELECT a, b, c, d FROM mytab
WHERE b = 42

gives

Append
   -> Index Scan using idx_p00 (cost rows=7) (actual rows=0)
   ...
   -> Index Scan using idx_p63 (cost rows=7) (actual rows=0)

I.e. all partitions are being scanned, although of course only one
partition will contain the rows I'm looking for.

Best regards,
  Jesper



Re: [POC] hash partitioning

Robert Haas
On Thu, Sep 14, 2017 at 11:39 AM, Jesper Pedersen
<[hidden email]> wrote:

> When I do
>
> CREATE TABLE mytab (
>   a integer NOT NULL,
>   b integer NOT NULL,
>   c integer,
>   d integer
> ) PARTITION BY HASH (b);
>
> and create 64 partitions;
>
> CREATE TABLE mytab_p00 PARTITION OF mytab FOR VALUES WITH (MODULUS 64,
> REMAINDER 0);
> ...
> CREATE TABLE mytab_p63 PARTITION OF mytab FOR VALUES WITH (MODULUS 64,
> REMAINDER 63);
>
> and associated indexes
>
> CREATE INDEX idx_p00 ON mytab_p00 USING btree (b, a);
> ...
> CREATE INDEX idx_p63 ON mytab_p63 USING btree (b, a);
>
> Populate the database, and do ANALYZE.
>
> Given
>
> EXPLAIN (ANALYZE, VERBOSE, BUFFERS ON) SELECT a, b, c, d FROM mytab WHERE b
> = 42
>
> gives
>
> Append
>   -> Index Scan using idx_p00 (cost rows=7) (actual rows=0)
>   ...
>   -> Index Scan using idx_p63 (cost rows=7) (actual rows=0)
>
> E.g. all partitions are being scanned. Of course one partition will contain
> the rows I'm looking for.

Yeah, we need Amit Langote's work in
http://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@...
to land and this patch to be adapted to make use of it.  I think
that's the major thing still standing in the way of this. Concerns
were also raised about not having a way to see the hash function, but
we fixed that in 81c5e46c490e2426db243eada186995da5bb0ba7 and
hopefully this patch has been updated to use a seed (I haven't looked
yet).  And there was a concern about hash functions not being
portable, but the conclusion of that was basically that most people
think --load-via-partition-root will be a satisfactory workaround for
cases where that becomes a problem (cf. commit
23d7680d04b958de327be96ffdde8f024140d50e).  So this is the major
remaining issue that I know about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [POC] hash partitioning

Jesper Pedersen
Hi,

On 09/14/2017 12:05 PM, Robert Haas wrote:

> On Thu, Sep 14, 2017 at 11:39 AM, Jesper Pedersen
> <[hidden email]> wrote:
>> When I do
>>
>> CREATE TABLE mytab (
>>    a integer NOT NULL,
>>    b integer NOT NULL,
>>    c integer,
>>    d integer
>> ) PARTITION BY HASH (b);
>>
>> and create 64 partitions;
>>
>> CREATE TABLE mytab_p00 PARTITION OF mytab FOR VALUES WITH (MODULUS 64,
>> REMAINDER 0);
>> ...
>> CREATE TABLE mytab_p63 PARTITION OF mytab FOR VALUES WITH (MODULUS 64,
>> REMAINDER 63);
>>
>> and associated indexes
>>
>> CREATE INDEX idx_p00 ON mytab_p00 USING btree (b, a);
>> ...
>> CREATE INDEX idx_p63 ON mytab_p63 USING btree (b, a);
>>
>> Populate the database, and do ANALYZE.
>>
>> Given
>>
>> EXPLAIN (ANALYZE, VERBOSE, BUFFERS ON) SELECT a, b, c, d FROM mytab WHERE b
>> = 42
>>
>> gives
>>
>> Append
>>    -> Index Scan using idx_p00 (cost rows=7) (actual rows=0)
>>    ...
>>    -> Index Scan using idx_p63 (cost rows=7) (actual rows=0)
>>
>> E.g. all partitions are being scanned. Of course one partition will contain
>> the rows I'm looking for.
>
> Yeah, we need Amit Langote's work in
> http://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@...
> to land and this patch to be adapted to make use of it.  I think
> that's the major thing still standing in the way of this. Concerns
> were also raised about not having a way to see the hash function, but
> we fixed that in 81c5e46c490e2426db243eada186995da5bb0ba7 and
> hopefully this patch has been updated to use a seed (I haven't looked
> yet).  And there was a concern about hash functions not being
> portable, but the conclusion of that was basically that most people
> think --load-via-partition-root will be a satisfactory workaround for
> cases where that becomes a problem (cf. commit
> 23d7680d04b958de327be96ffdde8f024140d50e).  So this is the major
> remaining issue that I know about.
>

Thanks for the information, Robert !

Best regards,
  Jesper



Re: [POC] hash partitioning

David Fetter
On Mon, Sep 11, 2017 at 07:43:29AM -0400, Robert Haas wrote:

> On Mon, Sep 11, 2017 at 4:17 AM, Ashutosh Bapat
> <[hidden email]> wrote:
> >> Rebased 0002 against this commit & renamed to 0001, PFA.
> >
> > Given that we have default partition support now, I am wondering
> > whether hash partitioned tables also should have default
> > partitions.  The way we have structured hash partitioning syntax,
> > there can be "holes" in partitions. Default partition would help
> > plug those holes.
>
> Yeah, I was thinking about that, too.  On the one hand, it seems
> like it's solving the problem the wrong way: if you've set up hash
> partitioning properly, you shouldn't have any holes.

Should we be pointing the gun away from people's feet by making hash
partitions that cover the space automagically when the partitioning
scheme[1] is specified?  In other words, do we have a good reason to have
only some of the hash partitions so defined by default?

Best,
David.

[1] For now, that's just the modulus, but the PoC included specifying
hashing functions, so I assume other ways to specify the partitioning
scheme could eventually be proposed.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [POC] hash partitioning

Robert Haas
On Thu, Sep 14, 2017 at 12:54 PM, David Fetter <[hidden email]> wrote:
> Should we be pointing the gun away from people's feet by making hash
> partitions that cover the space automagically when the partitioning
> scheme[1] is specified?  In other words, do we have a good reason to have
> only some of the hash partitions so defined by default?

Sure, we can add some convenience syntax for that, but I'd like to get
the basic stuff working before doing that kind of polishing.

If nothing else, I assume Keith Fiske's pg_partman will provide a way
to magically DTRT about an hour after this goes in.  But probably we
can do better in core easily enough.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [POC] hash partitioning

Jesper Pedersen
On 09/14/2017 12:56 PM, Robert Haas wrote:

> On Thu, Sep 14, 2017 at 12:54 PM, David Fetter <[hidden email]> wrote:
>> Should we be pointing the gun away from people's feet by making hash
>> partitions that cover the space automagically when the partitioning
>> scheme[1] is specified?  In other words, do we have a good reason to have
>> only some of the hash partitions so defined by default?
>
> Sure, we can add some convenience syntax for that, but I'd like to get
> the basic stuff working before doing that kind of polishing.
>
> If nothing else, I assume Keith Fiske's pg_partman will provide a way
> to magically DTRT about an hour after this goes in.  But probably we
> can do better in core easily enough.
>

Yeah, it would be nice to have a syntax like

) PARTITION BY HASH (col) WITH (AUTO_CREATE = 64);

But then there would also need to be a way to create the 64 associated
indexes for everything to be easy.

Best regards,
  Jesper

