Proposal: Local indexes for partitioned table

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
50 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Proposal: Local indexes for partitioned table

Maksim Milyutin
Hi hackers!

As I've understood from thread [1] the main issue of creating local
indexes for partitions is supporting REINDEX and DROP INDEX operations
on parent partitioned tables. Furthermore Robert Haas mentioned the
problem of creating index on key that is represented in partitions with
single value (or primitive interval) [1] i.e. under the
list-partitioning or range-partitioning with unit interval.

I would like to propose the following solution:

1. Create index for hierarchy of partitioned tables and partitions
recursively. Don't create relfilenode for indexes on parents, only
entries in catalog (much like the partitioned table's storage
elimination in [2]). Abstract index for partitioned tables is only for
the reference on indexes of child tables to perform REINDEX and DROP
INDEX operations.

2. Specify created indexes in pg_depend table so that indexes of child
tables depend on corresponding indexes of parent tables with type of
dependency DEPENDENCY_NORMAL so that index could be removed separately
for partitions and recursively/separately for partitioned tables.

3. REINDEX on index of partitioned table would perform this operation on
existing indexes of corresponding partitions. In this case it is
necessary to consider such operations as REINDEX SCHEMA | DATABASE |
SYSTEM so that partitions' indexes wouldn't be re-indexed multiple times
in a row.

Any thoughts?

1.
https://www.postgresql.org/message-id/CA+TgmoZUwj=QYnaK+F7xEf4w_e2g3XxdMnSNZMZjuinHRcOB8A@...
2.
https://www.postgresql.org/message-id/2b0d42f2-3a53-763b-c9c2-47139e4b1c2e%40lab.ntt.co.jp

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Robert Haas
On Wed, Mar 1, 2017 at 4:23 PM, Maksim Milyutin
<[hidden email]> wrote:

> As I've understood from thread [1] the main issue of creating local indexes
> for partitions is supporting REINDEX and DROP INDEX operations on parent
> partitioned tables. Furthermore Robert Haas mentioned the problem of
> creating index on key that is represented in partitions with single value
> (or primitive interval) [1] i.e. under the list-partitioning or
> range-partitioning with unit interval.
>
> I would like to propose the following solution:
>
> 1. Create index for hierarchy of partitioned tables and partitions
> recursively. Don't create relfilenode for indexes on parents, only entries
> in catalog (much like the partitioned table's storage elimination in [2]).
> Abstract index for partitioned tables is only for the reference on indexes
> of child tables to perform REINDEX and DROP INDEX operations.
>
> 2. Specify created indexes in pg_depend table so that indexes of child
> tables depend on corresponding indexes of parent tables with type of
> dependency DEPENDENCY_NORMAL so that index could be removed separately for
> partitions and recursively/separately for partitioned tables.
>
> 3. REINDEX on index of partitioned table would perform this operation on
> existing indexes of corresponding partitions. In this case it is necessary
> to consider such operations as REINDEX SCHEMA | DATABASE | SYSTEM so that
> partitions' indexes wouldn't be re-indexed multiple times in a row.
>
> Any thoughts?

Sounds generally good.  One thing to keep in mind is that - in this
system - a UNIQUE index on the parent doesn't actually guarantee
uniqueness across the whole partitioning hierarchy unless it so
happens that the index columns or expressions are the same as the
partitioning columns or expressions.  That's a little a
counterintuitive, and people have already been complaining that a
partitioned table + partitions doesn't look enough like a plain table.
However, I'm not sure there's a better alternative, because somebody
might want partition-wise unique indexes even though that doesn't
guarantee global uniqueness.  So I think if someday we have global
indexes, then we can plan to use some other syntax for that, like
CREATE GLOBAL [ UNIQUE ] INDEX.

But, of course, that's just my opinion.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Maksim Milyutin
On 02.03.2017 11:41, Robert Haas wrote:

> Sounds generally good.  One thing to keep in mind is that - in this
> system - a UNIQUE index on the parent doesn't actually guarantee
> uniqueness across the whole partitioning hierarchy unless it so
> happens that the index columns or expressions are the same as the
> partitioning columns or expressions.  That's a little a
> counterintuitive, and people have already been complaining that a
> partitioned table + partitions doesn't look enough like a plain table.
> However, I'm not sure there's a better alternative, because somebody
> might want partition-wise unique indexes even though that doesn't
> guarantee global uniqueness.  So I think if someday we have global
> indexes, then we can plan to use some other syntax for that, like
> CREATE GLOBAL [ UNIQUE ] INDEX.

Yes, I absolutely agree with your message that cross-partition
uniqueness is guaranteed through global index on partitioned table apart
from the case when the index key are the same as partitioning key (or
index comprises partitioning key in general).

Thanks for your comment. I'll try to propose the first patches as soon
as possible.

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Maksim Milyutin
In reply to this post by Maksim Milyutin
On 01.03.2017 13:53, Maksim Milyutin wrote:

> Hi hackers!
>
> As I've understood from thread [1] the main issue of creating local
> indexes for partitions is supporting REINDEX and DROP INDEX operations
> on parent partitioned tables. Furthermore Robert Haas mentioned the
> problem of creating index on key that is represented in partitions with
> single value (or primitive interval) [1] i.e. under the
> list-partitioning or range-partitioning with unit interval.
>
> I would like to propose the following solution:
>
> 1. Create index for hierarchy of partitioned tables and partitions
> recursively. Don't create relfilenode for indexes on parents, only
> entries in catalog (much like the partitioned table's storage
> elimination in [2]). Abstract index for partitioned tables is only for
> the reference on indexes of child tables to perform REINDEX and DROP
> INDEX operations.
>
> 2. Specify created indexes in pg_depend table so that indexes of child
> tables depend on corresponding indexes of parent tables with type of
> dependency DEPENDENCY_NORMAL so that index could be removed separately
> for partitions and recursively/separately for partitioned tables.
>
> 3. REINDEX on index of partitioned table would perform this operation on
> existing indexes of corresponding partitions. In this case it is
> necessary to consider such operations as REINDEX SCHEMA | DATABASE |
> SYSTEM so that partitions' indexes wouldn't be re-indexed multiple times
> in a row.
>
> Any thoughts?
>
> 1.
> https://www.postgresql.org/message-id/CA+TgmoZUwj=QYnaK+F7xEf4w_e2g3XxdMnSNZMZjuinHRcOB8A@...
>
> 2.
> https://www.postgresql.org/message-id/2b0d42f2-3a53-763b-c9c2-47139e4b1c2e%40lab.ntt.co.jp
>
>
I want to present the first version of patches that implement local
indexes for partitioned tables and discuss some technical details of
that implementation.


1. I have added a new relkind for local indexes named
RELKIND_LOCAL_INDEX (literal 'l').

This was done because physical storage is created in the 'heap_create'
function and we need to revoke the creating storage as with partitioned
tables. Since information that this index belongs to partitioned tables
is not available in 'heap_create' function (pg_index entry on the index
is not created yet) I chose the least painful way - added a specific
relkind for index on partitioned table.
I suppose that this act will require the integrating new relkind to
different places of source code so I'm ready to consider another
proposals on this point.

2. My implementation doesn't support the concurrent creating of local
index (CREATE INDEX CONCURRENTLY). As I understand, this operation
involves nontrivial manipulation with snapshots and I don't know how to
implement concurrent creating of multiple indexes. In this point I ask
help from community.

3. As I noticed early pg_depend table is used for cascade deleting
indexes on partitioned table and its children. I also use pg_depend to
determine relationship between parent and child indexes when reindex
executes recursively on child indexes.

Perhaps, it's not good way to use pg_depend to determine the
relationship between parent and child indexes because the kind of this
relationship is not defined. I could propose to add into pg_index table
specific field of 'oidvector' type that specify oids of dependent
indexes for the current local index.


On this stage I want to discuss only technical details of local indexes'
implementation. The problems related to merging existing indexes of
partitions within local index tree, determination uniqueness of field in
global sense through local index and syntax notes I want to arise later.


CC welcome!

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

local_index_v1.patch (14K) Download Attachment
local_index.sql (2K) Download Attachment
local_index.out (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Greg Stark
On 4 April 2017 at 17:10, Maksim Milyutin <[hidden email]> wrote:

>
> 3. As I noticed early pg_depend table is used for cascade deleting indexes
> on partitioned table and its children. I also use pg_depend to determine
> relationship between parent and child indexes when reindex executes
> recursively on child indexes.
>
> Perhaps, it's not good way to use pg_depend to determine the relationship
> between parent and child indexes because the kind of this relationship is
> not defined. I could propose to add into pg_index table specific field of
> 'oidvector' type that specify oids of dependent indexes for the current
> local index.


Alternately you could have an single oid in pg_index on each of the
children that specifies which local index is its parent. That would
probably require a new index on that column so you could look up all
the children efficiently.

I think it would behave more sensibly when you're adding or removing a
partition, especially if you want to add many partitions in parallel
using multiple transactions. An oidvector of children would
effectively mean you could only be doing one partition creation or
deletion at a time.

--
greg


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Robert Haas
In reply to this post by Maksim Milyutin
On Tue, Apr 4, 2017 at 12:10 PM, Maksim Milyutin
<[hidden email]> wrote:
> 1. I have added a new relkind for local indexes named RELKIND_LOCAL_INDEX
> (literal 'l').

Seems like it should maybe be RELKIND_PARTITIONED_INDEX.  There's
nothing particularly "local" about it.  I suppose what you're going
for is that it's not global, but in a way it *is* global to the
partitioning hierarchy.  That's the point.  It's just that it's
partitioned.

> Perhaps, it's not good way to use pg_depend to determine the relationship
> between parent and child indexes because the kind of this relationship is
> not defined. I could propose to add into pg_index table specific field of
> 'oidvector' type that specify oids of dependent indexes for the current
> local index.

I agree with Greg's comment on this point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Maksim Milyutin
In reply to this post by Greg Stark
On 10.04.2017 13:46, Greg Stark wrote:

> On 4 April 2017 at 17:10, Maksim Milyutin <[hidden email]> wrote:
>>
>> 3. As I noticed early pg_depend table is used for cascade deleting indexes
>> on partitioned table and its children. I also use pg_depend to determine
>> relationship between parent and child indexes when reindex executes
>> recursively on child indexes.
>>
>> Perhaps, it's not good way to use pg_depend to determine the relationship
>> between parent and child indexes because the kind of this relationship is
>> not defined. I could propose to add into pg_index table specific field of
>> 'oidvector' type that specify oids of dependent indexes for the current
>> local index.
>
>
> Alternately you could have an single oid in pg_index on each of the
> children that specifies which local index is its parent. That would
> probably require a new index on that column so you could look up all
> the children efficiently.
>
> I think it would behave more sensibly when you're adding or removing a
> partition, especially if you want to add many partitions in parallel
> using multiple transactions. An oidvector of children would
> effectively mean you could only be doing one partition creation or
> deletion at a time.
>

Thanks for your comment. Your approach sounds better than mine. I'll try it.

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Maksim Milyutin
In reply to this post by Robert Haas
On 10.04.2017 14:20, Robert Haas wrote:

> On Tue, Apr 4, 2017 at 12:10 PM, Maksim Milyutin
> <[hidden email]> wrote:
>> 1. I have added a new relkind for local indexes named RELKIND_LOCAL_INDEX
>> (literal 'l').
>
> Seems like it should maybe be RELKIND_PARTITIONED_INDEX.  There's
> nothing particularly "local" about it.  I suppose what you're going
> for is that it's not global, but in a way it *is* global to the
> partitioning hierarchy.  That's the point.  It's just that it's
> partitioned.
>
Ok, thanks for the note.

But I want to discuss the relevancy of introduction of a new relkind for
partitioned index. I could to change the control flow in partitioned
index creation (specify conditional statement in the 'index_create'
routine in attached patch) and not enter to the 'heap_create' routine.
This case releases us from integrating new relkind into different places
of Postgres code. But we have to copy-paste some specific code from
'heap_create' function, e.g., definition of relfilenode and tablespaceid
for the new index and perhaps something more when 'heap_create' routine
will be extended.

What do you think about this way?


--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

local_index_without_new_relkind.patch (12K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Amit Langote-2
Hi,

On 2017/04/17 23:00, Maksim Milyutin wrote:

> On 10.04.2017 14:20, Robert Haas wrote:
>> On Tue, Apr 4, 2017 at 12:10 PM, Maksim Milyutin
>> <[hidden email]> wrote:
>>> 1. I have added a new relkind for local indexes named RELKIND_LOCAL_INDEX
>>> (literal 'l').
>>
>> Seems like it should maybe be RELKIND_PARTITIONED_INDEX.  There's
>> nothing particularly "local" about it.  I suppose what you're going
>> for is that it's not global, but in a way it *is* global to the
>> partitioning hierarchy.  That's the point.  It's just that it's
>> partitioned.
>>
>
> Ok, thanks for the note.
>
> But I want to discuss the relevancy of introduction of a new relkind for
> partitioned index. I could to change the control flow in partitioned index
> creation (specify conditional statement in the 'index_create' routine in
> attached patch) and not enter to the 'heap_create' routine. This case
> releases us from integrating new relkind into different places of Postgres
> code. But we have to copy-paste some specific code from 'heap_create'
> function, e.g., definition of relfilenode and tablespaceid for the new
> index and perhaps something more when 'heap_create' routine will be extended.

I may be missing something, but isn't it that a new relkind will be needed
anyway?  How does the rest of the code distinguish such index objects once
they are created?  Is it possible that some other code may try to access
the storage for an index whose indrelid is a partitioned table?

Thanks,
Amit



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Maksim Milyutin
On 18.04.2017 13:08, Amit Langote wrote:
> Hi,
>

Hi, Amit!

> On 2017/04/17 23:00, Maksim Milyutin wrote:
>>
>> Ok, thanks for the note.
>>
>> But I want to discuss the relevancy of introduction of a new relkind for
>> partitioned index. I could to change the control flow in partitioned index
>> creation (specify conditional statement in the 'index_create' routine in
>> attached patch) and not enter to the 'heap_create' routine. This case
>> releases us from integrating new relkind into different places of Postgres
>> code. But we have to copy-paste some specific code from 'heap_create'
>> function, e.g., definition of relfilenode and tablespaceid for the new
>> index and perhaps something more when 'heap_create' routine will be extended.
>
> I may be missing something, but isn't it that a new relkind will be needed
> anyway?  How does the rest of the code distinguish such index objects once
> they are created?

Local partitioned indexes can be recognized through the check on the
relkind of table to which the index refers. Something like this:

heap = relation_open(IndexGetRelation(indexid, false), heapLockmode);
if (heap->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
     /* indexid is local index on partitioned table */

> Is it possible that some other code may try to access
> the storage for an index whose indrelid is a partitioned table?
>

Thеsе cases must be caught. But as much as partitioned tables doesn't
participate in query plans their indexes are unaccessible by executor.
Reindex operation is overloaded with my patch.


--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Ashutosh Bapat
On Tue, Apr 18, 2017 at 4:43 PM, Maksim Milyutin
<[hidden email]> wrote:

> On 18.04.2017 13:08, Amit Langote wrote:
>>
>> Hi,
>>
>
> Hi, Amit!
>
>> On 2017/04/17 23:00, Maksim Milyutin wrote:
>>>
>>>
>>> Ok, thanks for the note.
>>>
>>> But I want to discuss the relevancy of introduction of a new relkind for
>>> partitioned index. I could to change the control flow in partitioned
>>> index
>>> creation (specify conditional statement in the 'index_create' routine in
>>> attached patch) and not enter to the 'heap_create' routine. This case
>>> releases us from integrating new relkind into different places of
>>> Postgres
>>> code. But we have to copy-paste some specific code from 'heap_create'
>>> function, e.g., definition of relfilenode and tablespaceid for the new
>>> index and perhaps something more when 'heap_create' routine will be
>>> extended.
>>
>>
>> I may be missing something, but isn't it that a new relkind will be needed
>> anyway?  How does the rest of the code distinguish such index objects once
>> they are created?
>
>
> Local partitioned indexes can be recognized through the check on the relkind
> of table to which the index refers. Something like this:
>
> heap = relation_open(IndexGetRelation(indexid, false), heapLockmode);
> if (heap->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
>     /* indexid is local index on partitioned table */

An index on partitioned table can be global index (yet to be
implemented) or a local index. We can not differentiate between those
just by looking at the relation on which they are built.

>
>> Is it possible that some other code may try to access
>> the storage for an index whose indrelid is a partitioned table?
>>
>
> Thеsе cases must be caught. But as much as partitioned tables doesn't
> participate in query plans their indexes are unaccessible by executor.
> Reindex operation is overloaded with my patch.
>

A global index would have storage for a partitioned table whereas a
local index wouldn't have any storage for a partitioned table.

I agree with Amit that we need new relkinds for local as well as global indexes.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Maksim Milyutin
On 19.04.2017 11:42, Ashutosh Bapat wrote:

> On Tue, Apr 18, 2017 at 4:43 PM, Maksim Milyutin
> <[hidden email]> wrote:
>>
>> Local partitioned indexes can be recognized through the check on the relkind
>> of table to which the index refers. Something like this:
>>
>> heap = relation_open(IndexGetRelation(indexid, false), heapLockmode);
>> if (heap->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
>>     /* indexid is local index on partitioned table */
>
> An index on partitioned table can be global index (yet to be
> implemented) or a local index. We can not differentiate between those
> just by looking at the relation on which they are built.
>

We could to refine the criteria for the local partitioned index later
encapsulating it in a macro, e.g., adding a new flag from pg_index that
differentiate the type of index on partitioned table.


>> Thеsе cases must be caught. But as much as partitioned tables doesn't
>> participate in query plans their indexes are unaccessible by executor.
>> Reindex operation is overloaded with my patch.
>>
>
> A global index would have storage for a partitioned table whereas a
> local index wouldn't have any storage for a partitioned table.
>
> I agree with Amit that we need new relkinds for local as well as global indexes.
>

Ok, thanks for the feedback. Then I'll use a new relkind for local
partitioned index in further development.



--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Robert Haas
On Wed, Apr 19, 2017 at 5:25 AM, Maksim Milyutin
<[hidden email]> wrote:
> Ok, thanks for the feedback. Then I'll use a new relkind for local
> partitioned index in further development.

Any update on this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Maksim Milyutin-2
10.08.17 23:01, Robert Haas wrote:

> On Wed, Apr 19, 2017 at 5:25 AM, Maksim Milyutin
> <[hidden email]> wrote:
>> Ok, thanks for the feedback. Then I'll use a new relkind for local
>> partitioned index in further development.
> Any update on this?
>

I'll continue to work soon. Sorry for so long delay.

--
Regards,
Maksim Milyutin



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Álvaro Herrera
In reply to this post by Maksim Milyutin
Hello

I've been thinking about this issue too.  I think your patch is not
ambitious enough.  Here's my brain dump on the issue, to revive
discussion.

As you propose, IMO this new feature would use the standard index
creation syntax:
   CREATE [UNIQUE] INDEX someindex ON parted_table (a, b);

This command currently throws an error.  We'd have it do two things:

1. create catalog rows in pg_class and pg_index for the main table,
   indicating the existance of this partitioned index.  These would not
   point to an actual index (since the main table itself is empty), but
   instead to an "abstract" index.  This abstract index can be used by
   various operations; see below.

2. create one index for each existing partition.  These would be
   identical to what would happen if you created the index directly on
   each partition, except that there is an additional dependency to the
   parent's abstract index.

If partitions are themselves partitioned, we would recursively apply the
indexes to the sub-partitions by doing (1) above for the partitioned
partition.

Once the index has been created for all existing partitions, the
hierarchy-wide index becomes valid and can be used normally by the
planner/executor.  I think we could use the pg_index.indisvalid property
for the abstract index for this.

I propose that an index declared UNIQUE throws an error for now.  We can
implement uniqueness (for the case where the indexed columns match the
partitioning key) later, once we sort out all the issues here first.  I
think unique indexes would be very useful even with that limitation, but
let's have it as a separate project.

I think using pg_depend as the mechanism to link partition indexes to
parent is a bad idea.  How about pg_inherits instead?  Seems more
appropriate.


Creating hierachy-wide indexes for existing partitioned tables is likely
to take a long time, so we must include CONCURRENTLY as an option.  This
will need some transaction machinery support in order to ensure that
each partition gets its index created at some point in a long chain of
transactions, and that the whole thing is marked valid only at the end.
Also, if the creation is interrupted (because of a crash or a regular
shutdown), it'll be useful to be able to continue rather than being
forced to start from scratch.

When a new partition is added, indexes satisfying the partitioned
table's abstract indexes are created automatically.

During ALTER TABLE ... ATTACH PARTITION, we check that an indexing
satisfying the abstract index exist (and we create a pg_inherit link).
If not, the command is aborted.

REINDEX is easily supported (just reindex each partition's index
individually), but that command is blocking, which is not good.  For
concurrent operation for tables not partitioned, a typical pattern to
avoid blocking the table is to suggest creation of an identical index
using CREATE INDEX CONCURRENTLY, then drop the original one.  That's not
going to work with these partitioned indexes, which is going to be a
problem.  I don't have any great ideas about this part yet.

We need to come up with some way to generate names for each partition
index.


I am going to work on this now.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Maksim Milyutin-2

Hi!


On 06.10.2017 19:37, Alvaro Herrera wrote:
As you propose, IMO this new feature would use the standard index
creation syntax:
   CREATE [UNIQUE] INDEX someindex ON parted_table (a, b);

This command currently throws an error.  We'd have it do two things:

1. create catalog rows in pg_class and pg_index for the main table,
   indicating the existance of this partitioned index.  These would not
   point to an actual index (since the main table itself is empty), but
   instead to an "abstract" index.  This abstract index can be used by
   various operations; see below.

Robert Haas proposed[1] to use the name "partitioned index" (instead of abstract) that have to be reflected in 'relkind' field of pg_class as the RELKIND_PARTITIONED_INDEX value.

I propose that an index declared UNIQUE throws an error for now.  We can
implement uniqueness (for the case where the indexed columns match the
partitioning key) later, once we sort out all the issues here first.  I
think unique indexes would be very useful even with that limitation, but
let's have it as a separate project.

Yes, global uniqueness through local unique indexes causes further work related with foreign keys on partitioned table, full support of INSERT OF CONFLICT, etc. It make sense to implement after the current stage of work.

I think using pg_depend as the mechanism to link partition indexes to
parent is a bad idea.  How about pg_inherits instead?  Seems more
appropriate.

Greg Stark proposed[2] to use new pg_index field of oid type that refers to the parent pg_index item. AFAIC pg_inherits also makes sense but semantically it deals with inheriting tables. IMHO the using of this catalog table to define relation between partitioned table and partitions looks like a hack to make use of constraint exclusion logic for partition pruning.

Creating hierachy-wide indexes for existing partitioned tables is likely
to take a long time, so we must include CONCURRENTLY as an option.  This
will need some transaction machinery support in order to ensure that
each partition gets its index created at some point in a long chain of
transactions, and that the whole thing is marked valid only at the end.
Also, if the creation is interrupted (because of a crash or a regular
shutdown), it'll be useful to be able to continue rather than being
forced to start from scratch.

This option was very difficult for me. I would be interested to see the implementation.

During ALTER TABLE ... ATTACH PARTITION, we check that an indexing
satisfying the abstract index exist (and we create a pg_inherit link).
If not, the command is aborted.

We could create necessary index for partition, not abort ALTER TABLE ... ATTACH PARTITION statement.

We need to come up with some way to generate names for each partition
index.

I think the calling 'ChooseIndexName(RelationGetRelationName(childrel), namespaceId, indexColNames, ...)' resolves this problem.

I am going to work on this now.

It will be great! I think this project is difficult for me on the part of integration described above functionality with the legacy postgres code. Also IMO this project is very important because it opens the way for such feature as global uniqueness of fields of partitioned tables. And any protraction in implementation is bad.

I would like to review and test your intermediate results.


1. https://www.postgresql.org/message-id/CA%2BTgmoY5UOUnW%3DMcwT7xUB_2W5dAkvOg5kD20Spx5gF-Ad47cA%40mail.gmail.com
2. https://www.postgresql.org/message-id/CAM-w4HOVftuv5RVi3a%2BsRV6nBpg204w7%3DL8MwPXVvYBFo1uM1Q%40mail.gmail.com
-- 
Regards,
Maksim Milyutin
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Robert Haas
In reply to this post by Álvaro Herrera
On Fri, Oct 6, 2017 at 12:37 PM, Alvaro Herrera <[hidden email]> wrote:
> 2. create one index for each existing partition.  These would be
>    identical to what would happen if you created the index directly on
>    each partition, except that there is an additional dependency to the
>    parent's abstract index.

One thing I'm a bit worried about is how to name these subordinate
indexes.  They have to have names because that's how pg_class works,
and those names can't all be the same, again because that's how
pg_class works.  There's no problem right away when you first create
the partitioned index, because you can just pick names out of a hat
using whatever name-generating algorithm seems best.  However, when
you dump-and-restore (including but not limited to the pg_upgrade
case) you've got to preserve those names.  If you just generate a new
name that may or may not be the same as the old one, then it may
collide with a user-specified name that only occurs later in the dump.
Also, you'll have trouble if the user has applied a COMMENT or a
SECURITY LABEL to the index because that command works by name, or if
the user has a reference to the index name inside a function or
whatever.

These are pretty annoying corner-case bugs because they're not likely
to come up very often.  Most people won't notice or care if the index
name changes.  But I don't think it's acceptable to just ignore the
problem.  An idea I had was to treat the abstract index - to use your
term - sort of the way we treat an extension.  Normally, when you
create an index on a partitioned table, it cascades, but for dump and
restore purpose, we tag on some syntax that says "well, don't actually
create the subordinate indexes, i'll tell you about those later".
Then for each subordinate index we issue a separate CREATE INDEX
command followed by ALTER INDEX abstract_index ATTACH PARTITION
concrete_index or something of that sort.  That means you can't
absolutely count on the parent index to have all of the children it's
supposed to have but maybe that's OK.

Another thing that would let you do is CREATE INDEX CONCURRENTLY
replacement_concrete_index; ALTER INDEX abstract_index DETACH
PARTITION old_concrete_index, ATTACH PARTITION
replacement_concrete_index; DROP INDEX CONCURRENTLY
old_concrete_index, which seems like a thing someone might want to do.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Maksim Milyutin-2
07.10.17 16:34, Robert Haas wrote:

> On Fri, Oct 6, 2017 at 12:37 PM, Alvaro Herrera <[hidden email]> wrote:
> One thing I'm a bit worried about is how to name these subordinate
> indexes.  They have to have names because that's how pg_class works,
> and those names can't all be the same, again because that's how
> pg_class works.  There's no problem right away when you first create
> the partitioned index, because you can just pick names out of a hat
> using whatever name-generating algorithm seems best.  However, when
> you dump-and-restore (including but not limited to the pg_upgrade
> case) you've got to preserve those names.  If you just generate a new
> name that may or may not be the same as the old one, then it may
> collide with a user-specified name that only occurs later in the dump.
> Also, you'll have trouble if the user has applied a COMMENT or a
> SECURITY LABEL to the index because that command works by name, or if
> the user has a reference to the index name inside a function or
> whatever.
>
> These are pretty annoying corner-case bugs because they're not likely
> to come up very often.  Most people won't notice or care if the index
> name changes.  But I don't think it's acceptable to just ignore the
> problem.  An idea I had was to treat the abstract index - to use your
> term - sort of the way we treat an extension.  Normally, when you
> create an index on a partitioned table, it cascades, but for dump and
> restore purpose, we tag on some syntax that says "well, don't actually
> create the subordinate indexes, i'll tell you about those later".
> Then for each subordinate index we issue a separate CREATE INDEX
> command followed by ALTER INDEX abstract_index ATTACH PARTITION
> concrete_index or something of that sort.  That means you can't
> absolutely count on the parent index to have all of the children it's
> supposed to have but maybe that's OK.

AFAICS, the main problem with naming is generating new unique names for
subordinate indexes on the stage of migrating data scheme (pg_dump,
pg_upgrade, etc). And we cannot specify these names in the 'CREATE INDEX
partitioned_index' statement therefore we have to regenerate their.

In this case I propose to restore index names' hierarchy *bottom-up*,
i.e. first of all create indexes for the leaf partitions and then create
ones for parents up to root explicitly specifying names. When creating
index on parent table we have to check is there exist any index on child
table that could be child index (identical criteria). If so, not
generate new index but implicitly attach that index into parent one.
If we have incomplete index hierarchy, e.g. we dropped some indexes of
partitions previously, then recreating of parent's index would
regenerate (not attach) indexes for those partitions. We could drop
those odd generated indexes after building of parent's index. This
decision is not straightforward but provides to consider 'CREATE INDEX
paritioned_table' statement as a cascade operation.
As a result, we can specify name for each concrete index while
recreating a whole hierarchy.

--
Regards,
Maksim Milyutin



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Ashutosh Bapat
In reply to this post by Robert Haas
On Sat, Oct 7, 2017 at 7:04 PM, Robert Haas <[hidden email]> wrote:

> On Fri, Oct 6, 2017 at 12:37 PM, Alvaro Herrera <[hidden email]> wrote:
>> 2. create one index for each existing partition.  These would be
>>    identical to what would happen if you created the index directly on
>>    each partition, except that there is an additional dependency to the
>>    parent's abstract index.
>
> One thing I'm a bit worried about is how to name these subordinate
> indexes.  They have to have names because that's how pg_class works,
> and those names can't all be the same, again because that's how
> pg_class works.  There's no problem right away when you first create
> the partitioned index, because you can just pick names out of a hat
> using whatever name-generating algorithm seems best.  However, when
> you dump-and-restore (including but not limited to the pg_upgrade
> case) you've got to preserve those names.  If you just generate a new
> name that may or may not be the same as the old one, then it may
> collide with a user-specified name that only occurs later in the dump.
> Also, you'll have trouble if the user has applied a COMMENT or a
> SECURITY LABEL to the index because that command works by name, or if
> the user has a reference to the index name inside a function or
> whatever.
>
> These are pretty annoying corner-case bugs because they're not likely
> to come up very often.  Most people won't notice or care if the index
> name changes.  But I don't think it's acceptable to just ignore the
> problem.  An idea I had was to treat the abstract index - to use your
> term - sort of the way we treat an extension.  Normally, when you
> create an index on a partitioned table, it cascades, but for dump and
> restore purpose, we tag on some syntax that says "well, don't actually
> create the subordinate indexes, i'll tell you about those later".
> Then for each subordinate index we issue a separate CREATE INDEX
> command followed by ALTER INDEX abstract_index ATTACH PARTITION
> concrete_index or something of that sort.  That means you can't
> absolutely count on the parent index to have all of the children it's
> supposed to have but maybe that's OK.

+1.

How about CREATE INDEX ... PARTITION OF ... FOR TABLE ...? to create
the index and attach it?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: Local indexes for partitioned table

Álvaro Herrera
In reply to this post by Robert Haas
Robert Haas wrote:

> On Fri, Oct 6, 2017 at 12:37 PM, Alvaro Herrera <[hidden email]> wrote:
> > 2. create one index for each existing partition.  These would be
> >    identical to what would happen if you created the index directly on
> >    each partition, except that there is an additional dependency to the
> >    parent's abstract index.
>
> One thing I'm a bit worried about is how to name these subordinate
> indexes.  They have to have names because that's how pg_class works,
> and those names can't all be the same, again because that's how
> pg_class works.  There's no problem right away when you first create
> the partitioned index, because you can just pick names out of a hat
> using whatever name-generating algorithm seems best.  However, when
> you dump-and-restore (including but not limited to the pg_upgrade
> case) you've got to preserve those names.  If you just generate a new
> name that may or may not be the same as the old one, then it may
> collide with a user-specified name that only occurs later in the dump.
> Also, you'll have trouble if the user has applied a COMMENT or a
> SECURITY LABEL to the index because that command works by name, or if
> the user has a reference to the index name inside a function or
> whatever.
>
> These are pretty annoying corner-case bugs because they're not likely
> to come up very often.  Most people won't notice or care if the index
> name changes.  But I don't think it's acceptable to just ignore the
> problem.

I agree it's a problem that needs to be addressed directly.

> An idea I had was to treat the abstract index - to use your
> term - sort of the way we treat an extension.  Normally, when you
> create an index on a partitioned table, it cascades, but for dump and
> restore purpose, we tag on some syntax that says "well, don't actually
> create the subordinate indexes, i'll tell you about those later".
> Then for each subordinate index we issue a separate CREATE INDEX
> command followed by ALTER INDEX abstract_index ATTACH PARTITION
> concrete_index or something of that sort.  That means you can't
> absolutely count on the parent index to have all of the children it's
> supposed to have but maybe that's OK.

Hmm ... yeah, ATTACH and DETACH sound acceptable to me.  On DETACH, the
abstract index should be marked indisvalid=false unless a substitute
index already exists; and on ATTACH when indisvalid=false we verify that
all local indexes exist, and if so we can flip indisvalid.  That way, we
can continue to rely on the parent index always having all its children
when the flag is set.

I'm not clear on a syntax that creates the main index and hopes to later
have the sub-indexes created.  Another approach is to do it the other
way around, i.e. create the children first, then once they're all in
place create the main one normally, which merely verifies that all the
requisite children exist.  This is related to what I proposed to occur
when a regular table is joined as a partition of the partitioned table:
we run a verification that an index matching the parent's abstract
indexes exists, and if not we raise an error.  (Alternatively we could
allow the case, and mark the abstract index as indisvalid=false, but
that seems to violate POLA).

> Another thing that would let you do is CREATE INDEX CONCURRENTLY
> replacement_concrete_index; ALTER INDEX abstract_index DETACH
> PARTITION old_concrete_index, ATTACH PARTITION
> replacement_concrete_index; DROP INDEX CONCURRENTLY
> old_concrete_index, which seems like a thing someone might want to do.

Yeah, this is a point I explicitly mentioned, and this proposal seems
like a good way.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
123
Previous Thread Next Thread