I saw someone ask once for a schema diagram of the system catalogs.
Things like that have occasionally been produced manually, but they are not regularly updated. That made me wonder, why can't we add primary and foreign keys to system catalogs and then have existing tools produce such a schema diagram automatically? Since we have ADD PRIMARY KEY USING INDEX, we can declare a primary key for an existing index. So this doesn't have to affect the low-level early bootstrapping. The attached patch adds those commands manually. Another option might be to have the catalog generation Perl tooling create those declarations automatically from some marker in the catalog files. That would also allow declaring unique constraints for the unique indexes that don't end up being the primary key. I'm not dealing here with how we might do foreign keys between system catalogs, but I think the idea would be to have some way to declare a foreign key "on paper" without any actual triggers. Any thoughts on this direction? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services |
On Sat, 3 Oct 2020, 14:40 Peter Eisentraut, <[hidden email]> wrote: I saw someone ask once for a schema diagram of the system catalogs. I like the general idea a lot. I've used Pg for a long time and I still struggle to march up relationships sometimes. Especially given that some things use relation oid oid keys and some use named cols as keys. So a big +1 from me for the idea. Especially if we ensure psql recognises when the relation 'oid' attribute has a declared PK and includes it in the column list. I don't object to simply declaring them without any implementing triggers. You aren't supposed to mess with the catalogs anyway. I'd actually like it to be defended against more actively by default. So the FKs are implicitly always valid, because the implementation says so. Much the same way trigger based FKs are unless you break the rules and mess with the triggers. Frankly I think we really need a way to mark FKs to be DISABLED or NOT CHECKED or something and a way to mark them as NOT VALID. Rsther than expecting uses to fiddle with the implementation triggers. But I don't think FKs on system catalogs require that, it's just be cosmetic there. |
On Sat, Oct 3, 2020 at 09:27:02PM +0800, Craig Ringer wrote:
> > > On Sat, 3 Oct 2020, 14:40 Peter Eisentraut, <[hidden email]> > wrote: > > I saw someone ask once for a schema diagram of the system catalogs. > Things like that have occasionally been produced manually, but they are > not regularly updated. That made me wonder, why can't we add primary > and foreign keys to system catalogs and then have existing tools produce > such a schema diagram automatically? > > Since we have ADD PRIMARY KEY USING INDEX, we can declare a primary key > for an existing index. So this doesn't have to affect the low-level > early bootstrapping. The attached patch adds those commands manually. > Another option might be to have the catalog generation Perl tooling > create those declarations automatically from some marker in the catalog > files. That would also allow declaring unique constraints for the > unique indexes that don't end up being the primary key. > > > Any thoughts on this direction? > > > I like the general idea a lot. I've used Pg for a long time and I still > struggle to march up relationships sometimes. Especially given that some things > use relation oid oid keys and some use named cols as keys. > > So a big +1 from me for the idea. Especially if we ensure psql recognises when > the relation 'oid' attribute has a declared PK and includes it in the column > list. +1 > I don't object to simply declaring them without any implementing triggers. You > aren't supposed to mess with the catalogs anyway. I'd actually like it to be > defended against more actively by default. So the FKs are implicitly always > valid, because the implementation says so. Much the same way trigger based FKs > are unless you break the rules and mess with the triggers. > > Frankly I think we really need a way to mark FKs to be DISABLED or NOT CHECKED > or something and a way to mark them as NOT VALID. Rsther than expecting uses to > fiddle with the implementation triggers. But I don't think FKs on system > catalogs require that, it's just be cosmetic there. +1 -- Bruce Momjian <[hidden email]> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee |
In reply to this post by Craig Ringer-5
On Sat, Oct 3, 2020 at 9:27 AM Craig Ringer <[hidden email]> wrote: > So a big +1 from me for the idea. Especially if we ensure psql recognises when the relation 'oid' attribute has a declared PK and includes it in the column list. Oid has been in the normal column list since v12 when it stopped being a system column. Or did you mean something else? |
On Sun, 4 Oct 2020, 01:32 John Naylor, <[hidden email]> wrote:
It means I've spent way too long working on extensions running mainly on pg11 lately and way too little reading -hackers. Ignore they bit. Thanks.. |
In reply to this post by Peter Eisentraut-6
Hi,
On 2020-10-03 08:40:31 +0200, Peter Eisentraut wrote: > Since we have ADD PRIMARY KEY USING INDEX, we can declare a primary key for > an existing index. So this doesn't have to affect the low-level early > bootstrapping. The attached patch adds those commands manually. Another > option might be to have the catalog generation Perl tooling create those > declarations automatically from some marker in the catalog files. That > would also allow declaring unique constraints for the unique indexes that > don't end up being the primary key. Hm. What prevents us from declaring the pkey during bootstrap? I don't at all like adding yet another place that needs manual editing when doing DDL changes. Greetings, Andres Freund |
Andres Freund <[hidden email]> writes:
> On 2020-10-03 08:40:31 +0200, Peter Eisentraut wrote: >> Since we have ADD PRIMARY KEY USING INDEX, we can declare a primary key for >> an existing index. So this doesn't have to affect the low-level early >> bootstrapping. The attached patch adds those commands manually. Another >> option might be to have the catalog generation Perl tooling create those >> declarations automatically from some marker in the catalog files. That >> would also allow declaring unique constraints for the unique indexes that >> don't end up being the primary key. > Hm. What prevents us from declaring the pkey during bootstrap? I don't > at all like adding yet another place that needs manual editing when > doing DDL changes. We don't add new catalogs often, so the cost-benefit ratio of automating this looks pretty poor. It's not like you'll be able to forget to do it, given the proposed regression test check for catalogs without pkeys. One thing I'm wondering about though is whether Peter's implementation ends up with desirable pg_depend contents for the pkey constraints. Probably we want them to just be pinned and not have any explicit dependencies on the associated indexes, but I haven't thought hard about it. regards, tom lane |
Hi,
On 2020-10-06 15:31:16 -0400, Tom Lane wrote: > Andres Freund <[hidden email]> writes: > > On 2020-10-03 08:40:31 +0200, Peter Eisentraut wrote: > >> Since we have ADD PRIMARY KEY USING INDEX, we can declare a primary key for > >> an existing index. So this doesn't have to affect the low-level early > >> bootstrapping. The attached patch adds those commands manually. Another > >> option might be to have the catalog generation Perl tooling create those > >> declarations automatically from some marker in the catalog files. That > >> would also allow declaring unique constraints for the unique indexes that > >> don't end up being the primary key. > > > Hm. What prevents us from declaring the pkey during bootstrap? I don't > > at all like adding yet another place that needs manual editing when > > doing DDL changes. > > We don't add new catalogs often, so the cost-benefit ratio of automating > this looks pretty poor. True, we don't create new ones that often. Still think that distributing such setup over fewer places is good. And it's not like there's only a handful of pkeys to start with. To me it makes more sense to add a DECLARE_PRIMARY_KEY in indexing.h, replacing the relevant DECLARE_UNIQUE_INDEX stanzas. Or, possibly preferrable, do it as part of the CATALOG() directly - which'd avoid needing the index statement in the first place. The Catalog.pm part is trivial. It's abit harder to implement actually creating the constraint - but even the hard route of adding a new field to Boot_DeclareIndexStmt or such wouldn't be particularly hard. > One thing I'm wondering about though is whether Peter's implementation > ends up with desirable pg_depend contents for the pkey constraints. > Probably we want them to just be pinned and not have any explicit > dependencies on the associated indexes, but I haven't thought hard > about it. That sounds right to me. Wonder whether it's not time to move the DECLARE bits from indexing.h to the individual catalogs, independent of what we're discussing here. With todays Catalog.pm there's really not much point in keeping them separate, imo. Greetings, Andres Freund |
On Tue, Oct 6, 2020 at 4:16 PM Andres Freund <[hidden email]> wrote: True, we don't create new ones that often. Still think that distributing That seems logical. Or, possibly preferrable, do it as part of That line is already messy in places. AFAICT, you'd still need info from the index statement (at least my shortened version below), leading to two syntaxes and two places for the same thing. Maybe I'm missing something? Wonder whether it's not time to move the DECLARE bits from indexing.h to That would look nicer, at least. A declaration could go from DECLARE_UNIQUE_INDEX(pg_subscription_rel_srrelid_srsubid_index, 6117, on pg_subscription_rel using btree(srrelid oid_ops, srsubid oid_ops)); #define SubscriptionRelSrrelidSrsubidIndexId 6117 to something like DECLARE_UNIQUE_INDEX(btree(srrelid oid_ops, srsubid oid_ops), 6117, SubscriptionRelSrrelidSrsubidIndexId)); |
In reply to this post by Peter Eisentraut-6
On 2020-10-03 08:40, Peter Eisentraut wrote:
> Since we have ADD PRIMARY KEY USING INDEX, we can declare a primary key > for an existing index. So this doesn't have to affect the low-level > early bootstrapping. The attached patch adds those commands manually. > Another option might be to have the catalog generation Perl tooling > create those declarations automatically from some marker in the catalog > files. That would also allow declaring unique constraints for the > unique indexes that don't end up being the primary key. I have reworked my patch like this. Now, the primary key is identified by using DECLARE_UNIQUE_INDEX_PKEY() instead of DECLARE_UNIQUE_INDEX() in the catalog pg_*.h files. This automatically generates the required ALTER TABLE commands in a new file system_constraints.sql. initdb is ordered so that that file is run before dependency processing, so the system constraints also end up pinned (as was requested during previous discussion). Note, this needs the get_constraint_index() patch I just posted separately. The old implementation of get_constraint_index() does not handle pinned constraints. There is something strange going on in the misc_sanity test, as seen by the test output change -NOTICE: pg_constraint contains unpinned initdb-created object(s) This previously detected some constraints from the information schema, so it was correct to point those out. But since it looks for the min(oid) of a catalog, this will now find one of the new pinned constraints, so it won't detect this case anymore. I think that test is not written correctly. -- Peter Eisentraut 2ndQuadrant, an EDB company https://www.2ndquadrant.com/ |
Peter Eisentraut <[hidden email]> writes:
> [ v2-0001-Add-primary-keys-and-unique-constraints-to-system.patch ] I've reviewed this patch. It looks pretty solid to me, with a couple trivial nits as mentioned below, and one bigger thing that's perhaps in the category of bikeshedding. Namely, do we really want to prefer using the OID indexes as the primary keys? In most cases there's some other index that seems to me to be what a user would think of as the pkey, for example pg_class_relname_nsp_index for pg_class or pg_proc_proname_args_nsp_index for pg_proc. Preferring OID where it exists is a nice simple rule, which has some attractiveness, but the OID indexes seem to me like a lookup aid rather than the "real" object identity. > There is something strange going on in the misc_sanity test, as seen by > the test output change > -NOTICE: pg_constraint contains unpinned initdb-created object(s) Formerly the lowest OID in pg_constraint was that of the information_schema.cardinal_number_domain_check CHECK constraint, which is intentionally not pinned. Now the lowest OID is that of pg_proc_oid_index, which is pinned, hence the output change. I think that's fine. The idea here is just to notice whether any catalogs have been missed by setup_depend, and we have now proven that pg_constraint wasn't missed, so it's cool. The other catalog that is listed in the expected output, but is not one of the ones intentionally excluded by setup_depend, is pg_rewrite. That's because its lowest OID is a rewrite rule for a system view, which we've intentionally made non-pinned. So that's also fine, but maybe it'd be worth explaining it in the comment above the DO block. Anyway, on to the minor nits: system_constraints.sql should be removed by the maintainer-clean target in src/backend/catalog/Makefile; perhaps also mention it in the comment for the clean target. Also I suppose src/tools/msvc/clean.bat needs to remove it, like it does postgres.bki. The contents of system_constraints.sql seem pretty randomly ordered, and I bet the order isn't stable across machines. It would be wise to sort the entries by name to ensure we don't get inconsistencies between different builds. (If nothing else, that could complicate validating tarballs.) I don't follow why the pg_seclabel, pg_shseclabel indexes aren't made into pkeys? There's a comment claiming they "use a nondefault operator class", but that's untrue AFAICS. regards, tom lane |
In reply to this post by Craig Ringer-5
On Sat, Oct 3, 2020 at 9:27 AM Craig Ringer
<[hidden email]> wrote: > Frankly I think we really need a way to mark FKs to be DISABLED or NOT CHECKED or something and a way to mark them as NOT VALID. Rsther than expecting uses to fiddle with the implementation triggers. But I don't think FKs on system catalogs require that, it's just be cosmetic there. Not really. I think the idea that users don't or shouldn't ever do manual DDL on system catalogs is not very plausible, considering that we suggest such steps in our own release notes. I don't have any complaint about labelling some of the unique indexes as primary keys, but I think installing foreign keys that don't really enforce anything may lead to confusion. -- Robert Haas EDB: http://www.enterprisedb.com |
Robert Haas <[hidden email]> writes:
> I don't have any complaint about labelling some of the unique indexes > as primary keys, but I think installing foreign keys that don't really > enforce anything may lead to confusion. I'm not sure if I buy the "confusion" argument, but after thinking about this more I realized that there's a stronger roadblock for treating catalog interrelationships as SQL foreign keys. Namely, that we always represent no-reference situations with a zero OID, whereas it'd have to be NULL to look like a valid foreign-key case. Changing to real NULLs in those columns would of course break a ton of C code, so I don't think that's gonna happen. We could overlay an FK onto OID columns that should never have zero entries, but that would miss enough of the interesting cases that I don't find it a very attractive proposal. So I think we should finish up the project of making real SQL constraints for the catalog indexes, but the idea of making FK constraints too looks like it'll end in failure. The reason I got interested in this right now is the nearby discussion [1] about why findoidjoins misses some catalog relationships and whether we should fix that and/or make its results more readily accessible. I'd thought perhaps FK constraint entries could supersede the need for that tool altogether, but now it seems like not. I still like the idea of marking OID relationships in the catalog headers though. Perhaps we should take Joel's suggestion of a new system catalog more seriously, and have genbki.pl populate such a catalog from info in the catalog header files. regards, tom lane [1] https://www.postgresql.org/message-id/flat/d1f413ff-0e50-413d-910c-bdd9ee9752c1%40www.fastmail.com |
I wrote:
> ... I still like the idea of marking OID relationships in the > catalog headers though. Perhaps we should take Joel's suggestion > of a new system catalog more seriously, and have genbki.pl populate > such a catalog from info in the catalog header files. On second thought, a catalog is overkill; it'd only be useful if the data could change after initdb, which this data surely cannot. The right way to expose such info to SQL is with a set-returning function reading a constant table in the C code, a la pg_get_keywords(). That way takes a whole lot less infrastructure and is just as useful to SQL code. [ wanders away wondering about a debug mode in which we actually validate OIDs inserted into such columns... ] regards, tom lane |
On Mon, Jan 18, 2021, at 19:33, Tom Lane wrote: > On second thought, a catalog is overkill; it'd only be useful if the data > could change after initdb, which this data surely cannot. The right way > to expose such info to SQL is with a set-returning function reading a > constant table in the C code, a la pg_get_keywords(). That way takes a > whole lot less infrastructure and is just as useful to SQL code. +1
|
In reply to this post by Tom Lane-2
On Sun, 2021-01-17 at 17:07 -0500, Tom Lane wrote:
> Peter Eisentraut <[hidden email]> writes: > > [ v2-0001-Add-primary-keys-and-unique-constraints-to-system.patch ] > > [...] do we really want to prefer > using the OID indexes as the primary keys? In most cases there's some > other index that seems to me to be what a user would think of as the > pkey, for example pg_class_relname_nsp_index for pg_class or > pg_proc_proname_args_nsp_index for pg_proc. Preferring OID where it > exists is a nice simple rule, which has some attractiveness, but the > OID indexes seem to me like a lookup aid rather than the "real" object > identity. I disagree. The OID is the real, immutable identity of an object. The "relname" of a "pg_class" encatalogtry can change any time. Since there are no foreign keys that reference catalogs, that won't cause problems, but I still think that primary keys should change as little as possible. Yours, Laurenz Albe |
In reply to this post by Tom Lane-2
On Mon, Jan 18, 2021, at 18:23, Tom Lane wrote: > I realized that there's a stronger roadblock for > treating catalog interrelationships as SQL foreign keys. Namely, > that we always represent no-reference situations with a zero OID, > whereas it'd have to be NULL to look like a valid foreign-key case. Another roadblock is perhaps the lack of foreign keys on arrays, which would be needed by the following references: SELECT * FROM oidjoins WHERE column_type ~ '(vector|\[\])$' ORDER BY 1,2,3; table_name | column_name | column_type | ref_table_name | ref_column_name ----------------------+----------------+-------------+----------------+----------------- pg_constraint | conexclop | oid[] | pg_operator | oid pg_constraint | conffeqop | oid[] | pg_operator | oid pg_constraint | confkey | int2[] | pg_attribute | attnum pg_constraint | conkey | int2[] | pg_attribute | attnum pg_constraint | conpfeqop | oid[] | pg_operator | oid pg_constraint | conppeqop | oid[] | pg_operator | oid pg_extension | extconfig | oid[] | pg_class | oid pg_index | indclass | oidvector | pg_opclass | oid pg_index | indcollation | oidvector | pg_collation | oid pg_index | indkey | int2vector | pg_attribute | attnum pg_partitioned_table | partattrs | int2vector | pg_attribute | attnum pg_partitioned_table | partclass | oidvector | pg_opclass | oid pg_partitioned_table | partcollation | oidvector | pg_collation | oid pg_policy | polroles | oid[] | pg_authid | oid pg_proc | proallargtypes | oid[] | pg_type | oid pg_proc | proargtypes | oidvector | pg_type | oid pg_statistic_ext | stxkeys | int2vector | pg_attribute | attnum pg_trigger | tgattr | int2vector | pg_attribute | attnum (18 rows) I see there is an old thread with work in this area, "Re: GSoC 2017: Foreign Key Arrays": The last message in the thread is from 2018-10-02: >On Tue, 2 Oct, 2018 at 05:13:26AM +0200, Michael Paquier wrote: >>On Sat, Aug 11, 2018 at 05:20:57AM +0200, Mark Rofail wrote: >> I am still having problems rebasing this patch. I can not figure it out on >> my own. >Okay, it's been a couple of months since this last email, and nothing >has happened, so I am marking it as returned with feedback. >-- >Michael Personally, I would absolutely *love* FKs on array columns. I always feel shameful when adding array columns to tables, when the elements are PKs in some other table. It's convenient and often the best design, but it feels dirty knowing there are no FKs to help detect application bugs. Let's hope the current FKs-on-catalog-discussion can ignite the old Foreign Key Arrays thread. /Joel
|
In reply to this post by Laurenz Albe
On 1/19/21 11:46 AM, Laurenz Albe wrote:
> On Sun, 2021-01-17 at 17:07 -0500, Tom Lane wrote: >> Peter Eisentraut <[hidden email]> writes: >>> [ v2-0001-Add-primary-keys-and-unique-constraints-to-system.patch ] >> >> [...] do we really want to prefer >> using the OID indexes as the primary keys? In most cases there's some >> other index that seems to me to be what a user would think of as the >> pkey, for example pg_class_relname_nsp_index for pg_class or >> pg_proc_proname_args_nsp_index for pg_proc. Preferring OID where it >> exists is a nice simple rule, which has some attractiveness, but the >> OID indexes seem to me like a lookup aid rather than the "real" object >> identity. > > I disagree. The OID is the real, immutable identity of an object. > The "relname" of a "pg_class" encatalogtry can change any time. > Since there are no foreign keys that reference catalogs, that won't cause > problems, but I still think that primary keys should change as little as > possible. This might be good advice for systems where the primary key defines the physical layout of the table, but for postgres's heap there is no reason not to prefer the natural key of the table. My vote is with Tom on this one. -- Vik Fearing |
In reply to this post by Joel Jacobson-3
Dear Joel, I would love to start working on this again, I tried to revive the patch again in 2019, however, I faced the same problems as last time (detailed in the thread) and I didn't get the support I needed to continue with this patch. Are you willing to help rebase and finally publish this patch? Best Regards, Mark Rofail On Tue, 19 Jan 2021 at 2:22 PM Joel Jacobson <[hidden email]> wrote:
|
In reply to this post by Laurenz Albe
Laurenz Albe <[hidden email]> writes:
> On Sun, 2021-01-17 at 17:07 -0500, Tom Lane wrote: >> [...] do we really want to prefer >> using the OID indexes as the primary keys? In most cases there's some >> other index that seems to me to be what a user would think of as the >> pkey, for example pg_class_relname_nsp_index for pg_class or >> pg_proc_proname_args_nsp_index for pg_proc. Preferring OID where it >> exists is a nice simple rule, which has some attractiveness, but the >> OID indexes seem to me like a lookup aid rather than the "real" object >> identity. > I disagree. The OID is the real, immutable identity of an object. > The "relname" of a "pg_class" encatalogtry can change any time. > Since there are no foreign keys that reference catalogs, that won't cause > problems, but I still think that primary keys should change as little as > possible. Yeah, there's also the point that the OID is what we use for "foreign key" references from other catalogs. I don't deny that there are reasons to think of OID as the pkey. But as long as these constraints are basically cosmetic, it seemed to me that we should consider the other approach. I'm not dead set on that, just wanted to discuss it. regards, tom lane |
Free forum by Nabble | Edit this page |