[Proposal] Global temporary tables


Re: [Proposal] Global temporary tables

Pavel Stehule


On Fri, Nov 1, 2019 at 17:09, Konstantin Knizhnik <[hidden email]> wrote:


On 01.11.2019 18:26, Robert Haas wrote:
> On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
> <[hidden email]> wrote:
>> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>> I wonder if there are some pitfalls of such approach?
> That sounds pretty hackish. You'd have to be very careful, for
> example, that if the tables were dropped or re-analyzed, all of the
> old entries got removed --

I have checked it:
- when table is reanalyzed, then cache entries are replaced.
- when table is dropped, then cache entries are removed.

> and then it would still fail if any code
> tried to access the statistics directly from the table, rather than
> via the caches. My assumption is that the statistics ought to be
> stored in some backend-private data structure designed for that
> purpose, and that the code that needs the data should be taught to
> look for it there when the table is a GTT.

Yes, if you do "select * from pg_statistic" then you will not see
statistic for GTT in this case.
But I do not think that it is so critical. I do not believe that anybody
is trying to manually interpret values in this table.
And optimizer is retrieving statistic through sys-cache mechanism and so
is able to build correct plan in this case.

Years ago, when I thought about this, I wrote a patch with a similar design. It works, but it is surely ugly.

I have another idea. Could pg_statistics be a view instead of a table?

Something like

SELECT * FROM pg_catalog.pg_statistics_rel
UNION ALL
SELECT * FROM pg_catalog.pg_statistics_gtt();

Internally, when the statistics cache is filled, pg_statistics_rel and pg_statistics_gtt() could be used directly. As far as I remember, it was not possible to work with queries there, only with relations.

Or a crazy idea: today we can implement our own types of heaps. Would it be possible to create an engine whose result is a combination of some shared data and some local data, so that the union is implemented at the heap level?
This implementation could be simple, just scanning pages from shared buffers and from local buffers. For these tables we don't need complex metadata. It's a crazy idea, though, and I think the union with a table function would be best.
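As a rough illustration of the idea, the catalog object could be redefined along these lines. This is only a sketch: pg_statistics_rel and pg_statistics_gtt() are the hypothetical names from the proposal above and do not exist in PostgreSQL today.

```sql
-- Hypothetical sketch of the proposal: expose pg_statistic as a view
-- combining the shared on-disk catalog relation with per-backend GTT
-- statistics served by a set-returning function.
CREATE VIEW pg_catalog.pg_statistic AS
    SELECT * FROM pg_catalog.pg_statistics_rel      -- shared, on-disk rows
    UNION ALL
    SELECT * FROM pg_catalog.pg_statistics_gtt();   -- backend-local GTT rows
```

User-facing tools would keep querying pg_statistic as before, while the set-returning function would pull the backend-local rows from wherever the GTT statistics are actually kept.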

Regards

Pavel





--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [Proposal] Global temporary tables

Julien Rouhaud
On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <[hidden email]> wrote:

>
> pá 1. 11. 2019 v 17:09 odesílatel Konstantin Knizhnik <[hidden email]> napsal:
>>
>> On 01.11.2019 18:26, Robert Haas wrote:
>> > On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
>> > <[hidden email]> wrote:
>> >> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>> >> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>> >> I wonder if there are some pitfalls of such approach?
>> > That sounds pretty hackish. You'd have to be very careful, for
>> > example, that if the tables were dropped or re-analyzed, all of the
>> > old entries got removed --
>>
>> I have checked it:
>> - when table is reanalyzed, then cache entries are replaced.
>> - when table is dropped, then cache entries are removed.
>>
>> > and then it would still fail if any code
>> > tried to access the statistics directly from the table, rather than
>> > via the caches. My assumption is that the statistics ought to be
>> > stored in some backend-private data structure designed for that
>> > purpose, and that the code that needs the data should be taught to
>> > look for it there when the table is a GTT.
>>
>> Yes, if you do "select * from pg_statistic" then you will not see
>> statistic for GTT in this case.
>> But I do not think that it is so critical. I do not believe that anybody
>> is trying to manually interpret values in this table.
>> And optimizer is retrieving statistic through sys-cache mechanism and so
>> is able to build correct plan in this case.
>
>
> Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.
>
> I have another idea. Can be pg_statistics view instead a table?
>
> Some like
>
> SELECT * FROM pg_catalog.pg_statistics_rel
> UNION ALL
> SELECT * FROM pg_catalog.pg_statistics_gtt();
>
> Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.

It'd be a loss if you lose the ability to see the statistics, as there
are valid use cases where you need to see the stats, eg. understanding
why you don't get the plan you wanted.  There's also at least one
extension [1] that allows you to backup and use restored statistics,
so there are definitely people interested in it.

[1]: https://github.com/ossc-db/pg_dbms_stats



Re: [Proposal] Global temporary tables

Pavel Stehule


On Sat, Nov 2, 2019 at 8:18, Julien Rouhaud <[hidden email]> wrote:
On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <[hidden email]> wrote:
>
> pá 1. 11. 2019 v 17:09 odesílatel Konstantin Knizhnik <[hidden email]> napsal:
>>
>> On 01.11.2019 18:26, Robert Haas wrote:
>> > On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
>> > <[hidden email]> wrote:
>> >> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>> >> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>> >> I wonder if there are some pitfalls of such approach?
>> > That sounds pretty hackish. You'd have to be very careful, for
>> > example, that if the tables were dropped or re-analyzed, all of the
>> > old entries got removed --
>>
>> I have checked it:
>> - when table is reanalyzed, then cache entries are replaced.
>> - when table is dropped, then cache entries are removed.
>>
>> > and then it would still fail if any code
>> > tried to access the statistics directly from the table, rather than
>> > via the caches. My assumption is that the statistics ought to be
>> > stored in some backend-private data structure designed for that
>> > purpose, and that the code that needs the data should be taught to
>> > look for it there when the table is a GTT.
>>
>> Yes, if you do "select * from pg_statistic" then you will not see
>> statistic for GTT in this case.
>> But I do not think that it is so critical. I do not believe that anybody
>> is trying to manually interpret values in this table.
>> And optimizer is retrieving statistic through sys-cache mechanism and so
>> is able to build correct plan in this case.
>
>
> Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.
>
> I have another idea. Can be pg_statistics view instead a table?
>
> Some like
>
> SELECT * FROM pg_catalog.pg_statistics_rel
> UNION ALL
> SELECT * FROM pg_catalog.pg_statistics_gtt();
>
> Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.

It'd be a loss if you lose the ability to see the statistics, as there
are valid use cases where you need to see the stats, eg. understanding
why you don't get the plan you wanted.  There's also at least one
extension [1] that allows you to backup and use restored statistics,
so there are definitely people interested in it.

[1]: https://github.com/ossc-db/pg_dbms_stats

I don't think so: the extension can use the UNION view, and its content will be the same as the caches used by the planner.



Re: [Proposal] Global temporary tables

Pavel Stehule


On Sat, Nov 2, 2019 at 8:23, Pavel Stehule <[hidden email]> wrote:


On Sat, Nov 2, 2019 at 8:18, Julien Rouhaud <[hidden email]> wrote:
On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <[hidden email]> wrote:
>
> pá 1. 11. 2019 v 17:09 odesílatel Konstantin Knizhnik <[hidden email]> napsal:
>>
>> On 01.11.2019 18:26, Robert Haas wrote:
>> > On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
>> > <[hidden email]> wrote:
>> >> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>> >> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>> >> I wonder if there are some pitfalls of such approach?
>> > That sounds pretty hackish. You'd have to be very careful, for
>> > example, that if the tables were dropped or re-analyzed, all of the
>> > old entries got removed --
>>
>> I have checked it:
>> - when table is reanalyzed, then cache entries are replaced.
>> - when table is dropped, then cache entries are removed.
>>
>> > and then it would still fail if any code
>> > tried to access the statistics directly from the table, rather than
>> > via the caches. My assumption is that the statistics ought to be
>> > stored in some backend-private data structure designed for that
>> > purpose, and that the code that needs the data should be taught to
>> > look for it there when the table is a GTT.
>>
>> Yes, if you do "select * from pg_statistic" then you will not see
>> statistic for GTT in this case.
>> But I do not think that it is so critical. I do not believe that anybody
>> is trying to manually interpret values in this table.
>> And optimizer is retrieving statistic through sys-cache mechanism and so
>> is able to build correct plan in this case.
>
>
> Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.
>
> I have another idea. Can be pg_statistics view instead a table?
>
> Some like
>
> SELECT * FROM pg_catalog.pg_statistics_rel
> UNION ALL
> SELECT * FROM pg_catalog.pg_statistics_gtt();
>
> Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.

It'd be a loss if you lose the ability to see the statistics, as there
are valid use cases where you need to see the stats, eg. understanding
why you don't get the plan you wanted.  There's also at least one
extension [1] that allows you to backup and use restored statistics,
so there are definitely people interested in it.

[1]: https://github.com/ossc-db/pg_dbms_stats

I don't think - the extensions can use UNION and the content will be same as caches used by planner.

Sure, if someone tries to modify the system tables directly, then that would have to be fixed.

Re: [Proposal] Global temporary tables

Julien Rouhaud
In reply to this post by Pavel Stehule
On Sat, Nov 2, 2019 at 8:23 AM Pavel Stehule <[hidden email]> wrote:

>
> so 2. 11. 2019 v 8:18 odesílatel Julien Rouhaud <[hidden email]> napsal:
>>
>> On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <[hidden email]> wrote:
>> >
>> > pá 1. 11. 2019 v 17:09 odesílatel Konstantin Knizhnik <[hidden email]> napsal:
>> >>
>> >> On 01.11.2019 18:26, Robert Haas wrote:
>> >> > On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
>> >> > <[hidden email]> wrote:
>> >> >> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>> >> >> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>> >> >> I wonder if there are some pitfalls of such approach?
>> >> > That sounds pretty hackish. You'd have to be very careful, for
>> >> > example, that if the tables were dropped or re-analyzed, all of the
>> >> > old entries got removed --
>> >>
>> >> I have checked it:
>> >> - when table is reanalyzed, then cache entries are replaced.
>> >> - when table is dropped, then cache entries are removed.
>> >>
>> >> > and then it would still fail if any code
>> >> > tried to access the statistics directly from the table, rather than
>> >> > via the caches. My assumption is that the statistics ought to be
>> >> > stored in some backend-private data structure designed for that
>> >> > purpose, and that the code that needs the data should be taught to
>> >> > look for it there when the table is a GTT.
>> >>
>> >> Yes, if you do "select * from pg_statistic" then you will not see
>> >> statistic for GTT in this case.
>> >> But I do not think that it is so critical. I do not believe that anybody
>> >> is trying to manually interpret values in this table.
>> >> And optimizer is retrieving statistic through sys-cache mechanism and so
>> >> is able to build correct plan in this case.
>> >
>> >
>> > Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.
>> >
>> > I have another idea. Can be pg_statistics view instead a table?
>> >
>> > Some like
>> >
>> > SELECT * FROM pg_catalog.pg_statistics_rel
>> > UNION ALL
>> > SELECT * FROM pg_catalog.pg_statistics_gtt();
>> >
>> > Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.
>>
>> It'd be a loss if you lose the ability to see the statistics, as there
>> are valid use cases where you need to see the stats, eg. understanding
>> why you don't get the plan you wanted.  There's also at least one
>> extension [1] that allows you to backup and use restored statistics,
>> so there are definitely people interested in it.
>>
>> [1]: https://github.com/ossc-db/pg_dbms_stats
>
>
> I don't think - the extensions can use UNION and the content will be same as caches used by planner.

Yes, I agree that changing pg_statistics to be a view as you showed
would fix the problem.  I was answering Konstantin's point:

>> >> But I do not think that it is so critical. I do not believe that anybody
>> >> is trying to manually interpret values in this table.
>> >> And optimizer is retrieving statistic through sys-cache mechanism and so
>> >> is able to build correct plan in this case.

which is IMHO a wrong assumption.



Re: [Proposal] Global temporary tables

konstantin knizhnik
In reply to this post by Julien Rouhaud


On 02.11.2019 10:19, Julien Rouhaud wrote:

> On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <[hidden email]> wrote:
>> pá 1. 11. 2019 v 17:09 odesílatel Konstantin Knizhnik <[hidden email]> napsal:
>>> On 01.11.2019 18:26, Robert Haas wrote:
>>>> On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
>>>> <[hidden email]> wrote:
>>>>> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>>>>> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>>>>> I wonder if there are some pitfalls of such approach?
>>>> That sounds pretty hackish. You'd have to be very careful, for
>>>> example, that if the tables were dropped or re-analyzed, all of the
>>>> old entries got removed --
>>> I have checked it:
>>> - when table is reanalyzed, then cache entries are replaced.
>>> - when table is dropped, then cache entries are removed.
>>>
>>>> and then it would still fail if any code
>>>> tried to access the statistics directly from the table, rather than
>>>> via the caches. My assumption is that the statistics ought to be
>>>> stored in some backend-private data structure designed for that
>>>> purpose, and that the code that needs the data should be taught to
>>>> look for it there when the table is a GTT.
>>> Yes, if you do "select * from pg_statistic" then you will not see
>>> statistic for GTT in this case.
>>> But I do not think that it is so critical. I do not believe that anybody
>>> is trying to manually interpret values in this table.
>>> And optimizer is retrieving statistic through sys-cache mechanism and so
>>> is able to build correct plan in this case.
>>
>> Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.
>>
>> I have another idea. Can be pg_statistics view instead a table?
>>
>> Some like
>>
>> SELECT * FROM pg_catalog.pg_statistics_rel
>> UNION ALL
>> SELECT * FROM pg_catalog.pg_statistics_gtt();
>>
>> Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.
> It'd be a loss if you lose the ability to see the statistics, as there
> are valid use cases where you need to see the stats, eg. understanding
> why you don't get the plan you wanted.  There's also at least one
> extension [1] that allows you to backup and use restored statistics,
> so there are definitely people interested in it.
>
> [1]: https://github.com/ossc-db/pg_dbms_stats
It seems to make no sense to back up and restore statistics for
temporary tables whose lifetime is limited to the lifetime of a backend,
doesn't it?





Re: [Proposal] Global temporary tables

konstantin knizhnik
In reply to this post by Pavel Stehule


On 02.11.2019 8:30, Pavel Stehule wrote:


On Fri, Nov 1, 2019 at 17:09, Konstantin Knizhnik <[hidden email]> wrote:


On 01.11.2019 18:26, Robert Haas wrote:
> On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
> <[hidden email]> wrote:
>> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>> I wonder if there are some pitfalls of such approach?
> That sounds pretty hackish. You'd have to be very careful, for
> example, that if the tables were dropped or re-analyzed, all of the
> old entries got removed --

I have checked it:
- when table is reanalyzed, then cache entries are replaced.
- when table is dropped, then cache entries are removed.

> and then it would still fail if any code
> tried to access the statistics directly from the table, rather than
> via the caches. My assumption is that the statistics ought to be
> stored in some backend-private data structure designed for that
> purpose, and that the code that needs the data should be taught to
> look for it there when the table is a GTT.

Yes, if you do "select * from pg_statistic" then you will not see
statistic for GTT in this case.
But I do not think that it is so critical. I do not believe that anybody
is trying to manually interpret values in this table.
And optimizer is retrieving statistic through sys-cache mechanism and so
is able to build correct plan in this case.

Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.

I have another idea. Can be pg_statistics view instead a table?

Some like

SELECT * FROM pg_catalog.pg_statistics_rel
UNION ALL
SELECT * FROM pg_catalog.pg_statistics_gtt();

And pg_catalog.pg_statistics_gtt() is a set-returning function?
I am afraid that this is not an acceptable solution from a performance point of view: the pg_statistic table is accessed by key (<relid>, <attpos>, <inh>).
If that lookup cannot be done using an index scan, it can cause a significant performance slowdown.
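The keyed access in question is a point lookup like the one below, which the planner performs through the syscache, backed by the unique index pg_statistic_relid_att_inh_index. A plain set-returning function in a UNION view could not be served by that index for the GTT rows. ('my_table' here is only a placeholder relation name.)

```sql
-- How pg_statistic is consulted: a point lookup by the three-column key
-- (starelid, staattnum, stainherit), normally served by the unique index
-- pg_statistic_relid_att_inh_index.
SELECT stanullfrac, stawidth, stadistinct
FROM pg_catalog.pg_statistic
WHERE starelid = 'my_table'::regclass   -- <relid>
  AND staattnum = 1                     -- <attpos>
  AND stainherit = false;               -- <inh>
```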


Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.

Or crazy idea - today we can implement own types of heaps. Is possible to create engine where result can be combination of some shared data and local data. So union will be implemented on heap level.
This implementation can be simple, just scanning pages from shared buffers and from local buffers. For these tables we don't need complex metadata. It's crazy idea, and I think so union with table function should be best.

Frankly speaking, implementing a special heap access method for pg_statistic just to handle the case of global temp tables seems to be overkill
from my point of view. It requires a lot of coding (or at least copying a lot of code from heapam). Also, as I wrote above, we would still need an index for efficient lookup of statistics.



Re: [Proposal] Global temporary tables

Julien Rouhaud
In reply to this post by konstantin knizhnik
On Sat, Nov 2, 2019 at 4:09 PM Konstantin Knizhnik
<[hidden email]> wrote:

>
> On 02.11.2019 10:19, Julien Rouhaud wrote:
> > On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <[hidden email]> wrote:
> >> pá 1. 11. 2019 v 17:09 odesílatel Konstantin Knizhnik <[hidden email]> napsal:
> >>> On 01.11.2019 18:26, Robert Haas wrote:
> >>>> On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
> >>>> <[hidden email]> wrote:
> >>>>> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
> >>>>> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
> >>>>> I wonder if there are some pitfalls of such approach?
> >>>> That sounds pretty hackish. You'd have to be very careful, for
> >>>> example, that if the tables were dropped or re-analyzed, all of the
> >>>> old entries got removed --
> >>> I have checked it:
> >>> - when table is reanalyzed, then cache entries are replaced.
> >>> - when table is dropped, then cache entries are removed.
> >>>
> >>>> and then it would still fail if any code
> >>>> tried to access the statistics directly from the table, rather than
> >>>> via the caches. My assumption is that the statistics ought to be
> >>>> stored in some backend-private data structure designed for that
> >>>> purpose, and that the code that needs the data should be taught to
> >>>> look for it there when the table is a GTT.
> >>> Yes, if you do "select * from pg_statistic" then you will not see
> >>> statistic for GTT in this case.
> >>> But I do not think that it is so critical. I do not believe that anybody
> >>> is trying to manually interpret values in this table.
> >>> And optimizer is retrieving statistic through sys-cache mechanism and so
> >>> is able to build correct plan in this case.
> >>
> >> Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.
> >>
> >> I have another idea. Can be pg_statistics view instead a table?
> >>
> >> Some like
> >>
> >> SELECT * FROM pg_catalog.pg_statistics_rel
> >> UNION ALL
> >> SELECT * FROM pg_catalog.pg_statistics_gtt();
> >>
> >> Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.
> > It'd be a loss if you lose the ability to see the statistics, as there
> > are valid use cases where you need to see the stats, eg. understanding
> > why you don't get the plan you wanted.  There's also at least one
> > extension [1] that allows you to backup and use restored statistics,
> > so there are definitely people interested in it.
> >
> > [1]: https://github.com/ossc-db/pg_dbms_stats
> It seems to have completely no sense to backup and restore statistic for
> temporary tables which life time is limited to life time of backend,
> doesn't it?

In general yes, I agree, but not if the goal is to understand why,
even after an ANALYZE on the temporary table, your query is still
behaving poorly.  It can be useful to allow reproduction, or just to
give someone else the statistics to see what's going on.



Re: [Proposal] Global temporary tables

Pavel Stehule
In reply to this post by konstantin knizhnik



And pg_catalog.pg_statistics_gtt() is a set-returning function?

Yes.

I am afraid that it is not an acceptable solution from a performance point of view: the pg_statistic table is accessed by key (<relid>, <attpos>, <inh>).

I don't think that is a problem. Any component that needs fast access can use a special function that checks the index or some memory buffers.

If that lookup cannot be done using an index scan, it can cause a significant performance slowdown.

Where do you need fast access when you use SQL access? Inside Postgres, the optimizer uses caches everywhere, and the statistics cache would have to know to check both the index and the memory buffers.

The proposed view would not be used by the optimizer, but it can be used by higher layers. I think there is agreement that GTT metadata should not be stored in the system catalogue; whether it is stored in a syscache or somewhere else is not important at this moment. But it would be nice if GTT metadata were not a black hole for the user. I think it is better to change some current tables into views than to use special functions specialized for GTT (those functions would have to exist in both variants anyway). When I think about it, this is important not just for the functionality we expect from GTT; it matters for the consistency of the Postgres catalog: how different should GTTs be from other types of tables in the system catalogue, from the user's perspective?



Re: [Proposal] Global temporary tables

曾文旌(义从)
In reply to this post by Robert Haas
Dear Hackers


I have attached my patch with a GTT implementation, based on PG12.
The GTT design came from my first email.
Some limitations in the patch will be eliminated in later versions.

Later, I will comment on Konstantin's patch and make some proposals for cooperation.
Looking forward to your feedback.

Thanks.

Zeng Wenjing






> On Oct 29, 2019, at 12:40 AM, Robert Haas <[hidden email]> wrote:
>
> On Mon, Oct 28, 2019 at 9:37 AM Konstantin Knizhnik
> <[hidden email]> wrote:
>> Sorry, but both statements are not true.
>
> Well, I think they are true.
>
>> I am not sure how vital is lack of aborts for transactions working with
>> GTT at replica.
>> Some people said that there is no sense in aborts of read-only
>> transactions at replica (despite to the fact that them are saving
>> intermediate results in GTT).
>> Some people said something similar with your's "dead on arrival".
>> But inconsistency is not possible: if such transaction is really
>> aborted, then backend is terminated and nobody can see this inconsistency.
>
> Aborting the current transaction is a very different thing from
> terminating the backend.
>
> Also, the idea that there is no sense in aborts of read-only
> transactions on a replica seems totally wrong. Suppose that you insert
> a row into the table and then you go to insert a row in each index,
> but one of the index inserts fails - duplicate key, out of memory
> error, I/O error, whatever. Now the table and the index are
> inconsistent. Normally, we're protected against this by MVCC, but if
> you use a solution that breaks MVCC by using the same XID for all
> transactions, then it can happen.
>
>> Concerning second alternative: you can check yourself that it is not
>> *extremely* complicated and invasive.
>> I extracted changes which are related with handling transactions at
>> replica and attached them to this mail.
>> It is just 500 lines (including diff contexts). Certainly there are some
>> limitation of this implementation: number of  transactions working with
>> GTT at replica is limited by 2^32
>> and since GTT tuples are not frozen, analog of GTT CLOG kept in memory
>> is never truncated.
>
> I admit that this patch is not lengthy, but there remains the question
> of whether it is correct. It's possible that the problem isn't as
> complicated as I think it is, but I do think there are quite a number
> of reasons why this patch wouldn't be considered acceptable...
>
>> I agree with it and think that implementation of GTT with private
>> buffers and no replica access is good starting point.
>
> ...but given that we seem to agree on this point, perhaps it isn't
> necessary to argue about those things right now.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


Attachment: global_temporary_table_v1.patch (189K)

Re: [Proposal] Global temporary tables

konstantin knizhnik


On 06.11.2019 16:24, 曾文旌(义从) wrote:

> Dear Hackers
>
>
> I attached the patch of GTT implementationI base on PG12.
> The GTT design came from my first email.
> Some limitations in patch will be eliminated in later versions.
>
> Later, I will comment on Konstantin's patch and make some proposals for cooperation.
> Looking forward to your feedback.
>
> Thanks.
>
> Zeng Wenjing
>
Thank you for this patch.
My first comments:

1. I have ported your patch to the latest Postgres version (my patch is
attached).
2. Your patch supports only the B-Tree index for GTT. All other indexes
(hash, gin, gist, brin, ...) are not currently supported.
3. I do not understand the reason for the following limitation:
"We allow to create index on global temp table only this session use it"

First of all, it seems to significantly reduce the usefulness of global temp tables.
Why do we need GTT at all? Mostly because we need to access temporary
data in more than one backend; otherwise we could just use a normal table.
If a temp table is expected to be large enough that we need to create an
index for it, then it is hard to believe that it will be needed in only
one backend.

Maybe the assumption is that all indexes have to be created before the GTT
starts to be used.
But right now this check is not working correctly in any case: if you
insert some data into the table, then
you can not create an index any more:

postgres=# create global temp table gtt(x integer primary key, y integer);
CREATE TABLE
postgres=# insert into gtt values (generate_series(1,100000),
generate_series(1,100000));
INSERT 0 100000
postgres=# create index on gtt(y);
ERROR:  can not create index when have one or more backend attached this
global temp table

I wonder why you need such a restriction?


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment: global_temporary_table_v1-pg13.patch (132K)

Re: [Proposal] Global temporary tables

曾文旌(义从)


> On Nov 7, 2019, at 12:08 AM, Konstantin Knizhnik <[hidden email]> wrote:
>
>
>
> On 06.11.2019 16:24, 曾文旌(义从) wrote:
>> Dear Hackers
>>
>>
>> I attached the patch of GTT implementationI base on PG12.
>> The GTT design came from my first email.
>> Some limitations in patch will be eliminated in later versions.
>>
>> Later, I will comment on Konstantin's patch and make some proposals for cooperation.
>> Looking forward to your feedback.
>>
>> Thanks.
>>
>> Zeng Wenjing
>>
>
> Thank you for this patch.
> My first comments:
>
> 1. I have ported your patch to the latest Postgres version (my patch is attached).
> 2. Your patch supports only B-Tree indexes for GTT. All other index types (hash, gin, gist, brin, ...) are not currently supported.
Currently I only support btree indexes.
I noticed that your patch supports more index types; this is an area where I'd like to work with you.

> 3. I do not understand the reason for the following limitation:
> "We allow to create index on global temp table only this session use it"
>
> First of all, it seems to significantly reduce the usefulness of global temp tables.
> Why do we need GTT at all? Mostly because we need to access temporary data in more than one backend; otherwise we can just use a normal table.
> If a temp table is expected to be large enough that we need to create an index for it, then it is hard to believe that it will be needed in only one backend.
>
> Maybe the assumption is that all indexes have to be created before the GTT starts to be used.
Yes. Currently, an index on a GTT can only be created while the table is empty and no other session is using it.
There are two possible improvements:
1. An index can be created on GTT(A) when GTT(A) in the current session is not empty, as long as the GTT is empty in every other session.
Index_build needs to be done in the current session, just like for a normal table. This improvement is relatively easy.

2. An index can be created on GTT(A) when more than one session is using GTT(A).
The problem is that when I finish creating an index on the GTT in this session and mark it as valid, that is not true for the GTT in the other sessions:
indexes on the GTT in other sessions would require a "rebuild_index" before use.
I don't have a better solution right now; maybe you have some suggestions.


> But right now this check is not working correctly in any case: if you insert some data into the table,
> you cannot create an index any more:
>
> postgres=# create global temp table gtt(x integer primary key, y integer);
> CREATE TABLE
> postgres=# insert into gtt values (generate_series(1,100000), generate_series(1,100000));
> INSERT 0 100000
> postgres=# create index on gtt(y);
> ERROR:  can not create index when have one or more backend attached this global temp table
>
> I wonder why you need such a restriction?
>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
> <global_temporary_table_v1-pg13.patch>




Re: [Proposal] Global temporary tables

Pavel Stehule


On Thu, Nov 7, 2019 at 10:30 AM, 曾文旌(义从) <[hidden email]> wrote:


> On Nov 7, 2019, at 12:08 AM, Konstantin Knizhnik <[hidden email]> wrote:
>
>
>
> On 06.11.2019 16:24, 曾文旌(义从) wrote:
>> Dear Hackers
>>
>>
>> I attached the patch of my GTT implementation, based on PG12.
>> The GTT design came from my first email.
>> Some limitations in patch will be eliminated in later versions.
>>
>> Later, I will comment on Konstantin's patch and make some proposals for cooperation.
>> Looking forward to your feedback.
>>
>> Thanks.
>>
>> Zeng Wenjing
>>
>
> Thank you for this patch.
> My first comments:
>
> 1. I have ported your patch to the latest Postgres version (my patch is attached).
> 2. Your patch supports only B-Tree indexes for GTT. All other index types (hash, gin, gist, brin, ...) are not currently supported.
Currently I only support btree indexes.
I noticed that your patch supports more index types; this is an area where I'd like to work with you.

> 3. I do not understand the reason for the following limitation:
> "We allow to create index on global temp table only this session use it"
>
> First of all, it seems to significantly reduce the usefulness of global temp tables.
> Why do we need GTT at all? Mostly because we need to access temporary data in more than one backend; otherwise we can just use a normal table.
> If a temp table is expected to be large enough that we need to create an index for it, then it is hard to believe that it will be needed in only one backend.
>
> Maybe the assumption is that all indexes have to be created before the GTT starts to be used.
Yes. Currently, an index on a GTT can only be created while the table is empty and no other session is using it.
There are two possible improvements:
1. An index can be created on GTT(A) when GTT(A) in the current session is not empty, as long as the GTT is empty in every other session.
Index_build needs to be done in the current session, just like for a normal table. This improvement is relatively easy.

2. An index can be created on GTT(A) when more than one session is using GTT(A).
The problem is that when I finish creating an index on the GTT in this session and mark it as valid, that is not true for the GTT in the other sessions:
indexes on the GTT in other sessions would require a "rebuild_index" before use.
I don't have a better solution right now; maybe you have some suggestions.

I think DDL operations can be implemented in some reduced form: DDL is active only for one session, and is invisible to other sessions. What matters is the state of the GTT object at session start.

For example, ALTER TABLE DROP COLUMN can have a fatal impact on other sessions. So I think the best pattern for GTT may be: the structure of a GTT is immutable for any session that doesn't do DDL operations.



> But right now this check is not working correctly in any case: if you insert some data into the table,
> you cannot create an index any more:
>
> postgres=# create global temp table gtt(x integer primary key, y integer);
> CREATE TABLE
> postgres=# insert into gtt values (generate_series(1,100000), generate_series(1,100000));
> INSERT 0 100000
> postgres=# create index on gtt(y);
> ERROR:  can not create index when have one or more backend attached this global temp table
>
> I wonder why you need such a restriction?
>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
> <global_temporary_table_v1-pg13.patch>


Re: [Proposal] Global temporary tables

曾文旌(义从)
In reply to this post by 曾文旌(义从)


On Nov 7, 2019, at 5:30 PM, 曾文旌(义从) <[hidden email]> wrote:



On Nov 7, 2019, at 12:08 AM, Konstantin Knizhnik <[hidden email]> wrote:



On 06.11.2019 16:24, 曾文旌(义从) wrote:
Dear Hackers


I attached the patch of my GTT implementation, based on PG12.
The GTT design came from my first email.
Some limitations in patch will be eliminated in later versions.

Later, I will comment on Konstantin's patch and make some proposals for cooperation.
Looking forward to your feedback.

Thanks.

Zeng Wenjing


Thank you for this patch.
My first comments:

1. I have ported your patch to the latest Postgres version (my patch is attached).
2. Your patch supports only B-Tree indexes for GTT. All other index types (hash, gin, gist, brin, ...) are not currently supported.
Currently I only support btree indexes.
I noticed that your patch supports more index types; this is an area where I'd like to work with you.

3. I do not understand the reason for the following limitation:
"We allow to create index on global temp table only this session use it"

First of all, it seems to significantly reduce the usefulness of global temp tables.
Why do we need GTT at all? Mostly because we need to access temporary data in more than one backend; otherwise we can just use a normal table.
If a temp table is expected to be large enough that we need to create an index for it, then it is hard to believe that it will be needed in only one backend.

Maybe the assumption is that all indexes have to be created before the GTT starts to be used.
Yes. Currently, an index on a GTT can only be created while the table is empty and no other session is using it.
There are two possible improvements:
1. An index can be created on GTT(A) when GTT(A) in the current session is not empty, as long as the GTT is empty in every other session.
Index_build needs to be done in the current session, just like for a normal table. This improvement is relatively easy.
This part of the improvement has been completed.
A new patch is attached.


2. An index can be created on GTT(A) when more than one session is using GTT(A).
The problem is that when I finish creating an index on the GTT in this session and mark it as valid, that is not true for the GTT in the other sessions:
indexes on the GTT in other sessions would require a "rebuild_index" before use.
I don't have a better solution right now; maybe you have some suggestions.


But right now this check is not working correctly in any case: if you insert some data into the table,
you cannot create an index any more:

postgres=# create global temp table gtt(x integer primary key, y integer);
CREATE TABLE
postgres=# insert into gtt values (generate_series(1,100000), generate_series(1,100000));
INSERT 0 100000
postgres=# create index on gtt(y);
ERROR:  can not create index when have one or more backend attached this global temp table

An index can now be created on GTT(A) when GTT(A) in the current session is not empty.
But the GTT still has to be empty in the other sessions.

I wonder why you need such a restriction?


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

<global_temporary_table_v1-pg13.patch>


Zeng Wenjing




Attachment: global_temporary_table_v2.patch (189K)

Re: [Proposal] Global temporary tables

曾文旌(义从)
In reply to this post by Pavel Stehule


On Nov 7, 2019, at 5:40 PM, Pavel Stehule <[hidden email]> wrote:



On Thu, Nov 7, 2019 at 10:30 AM, 曾文旌(义从) <[hidden email]> wrote:


> On Nov 7, 2019, at 12:08 AM, Konstantin Knizhnik <[hidden email]> wrote:
>
>
>
> On 06.11.2019 16:24, 曾文旌(义从) wrote:
>> Dear Hackers
>>
>>
>> I attached the patch of my GTT implementation, based on PG12.
>> The GTT design came from my first email.
>> Some limitations in patch will be eliminated in later versions.
>>
>> Later, I will comment on Konstantin's patch and make some proposals for cooperation.
>> Looking forward to your feedback.
>>
>> Thanks.
>>
>> Zeng Wenjing
>>
>
> Thank you for this patch.
> My first comments:
>
> 1. I have ported your patch to the latest Postgres version (my patch is attached).
> 2. Your patch supports only B-Tree indexes for GTT. All other index types (hash, gin, gist, brin, ...) are not currently supported.
Currently I only support btree indexes.
I noticed that your patch supports more index types; this is an area where I'd like to work with you.

> 3. I do not understand the reason for the following limitation:
> "We allow to create index on global temp table only this session use it"
>
> First of all, it seems to significantly reduce the usefulness of global temp tables.
> Why do we need GTT at all? Mostly because we need to access temporary data in more than one backend; otherwise we can just use a normal table.
> If a temp table is expected to be large enough that we need to create an index for it, then it is hard to believe that it will be needed in only one backend.
>
> Maybe the assumption is that all indexes have to be created before the GTT starts to be used.
Yes. Currently, an index on a GTT can only be created while the table is empty and no other session is using it.
There are two possible improvements:
1. An index can be created on GTT(A) when GTT(A) in the current session is not empty, as long as the GTT is empty in every other session.
Index_build needs to be done in the current session, just like for a normal table. This improvement is relatively easy.

2. An index can be created on GTT(A) when more than one session is using GTT(A).
The problem is that when I finish creating an index on the GTT in this session and mark it as valid, that is not true for the GTT in the other sessions:
indexes on the GTT in other sessions would require a "rebuild_index" before use.
I don't have a better solution right now; maybe you have some suggestions.

I think DDL operations can be implemented in some reduced form: DDL is active only for one session, and is invisible to other sessions. What matters is the state of the GTT object at session start.

For example, ALTER TABLE DROP COLUMN can have a fatal impact on other sessions. So I think the best pattern for GTT may be: the structure of a GTT is immutable for any session that doesn't do DDL operations.
Yes, DDL that needs to rewrite data files will have this problem.
This is why I disabled ALTER on GTT in the current version.
It can be improved; for example, ALTER on a GTT could also be allowed when only the current session is using it.
Users can also choose to kick off the other sessions that are using the GTT and then ALTER it.
I provide a function, pg_gtt_attached_pid(relation, schema), to query which sessions a GTT is being used by.
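As an illustration only, a session-management workflow with such a function might look like the sketch below. The set-returning use of pg_gtt_attached_pid and its argument order are assumptions based on the description above, not details taken from the patch; pg_terminate_backend and pg_backend_pid are standard PostgreSQL functions.

```sql
-- Hypothetical: list the backends currently using the GTT "gtt2".
SELECT pid FROM pg_gtt_attached_pid('gtt2', 'public') AS t(pid);

-- Hypothetical: kick the other sessions off, then run the ALTER.
SELECT pg_terminate_backend(pid)
  FROM pg_gtt_attached_pid('gtt2', 'public') AS t(pid)
 WHERE pid <> pg_backend_pid();

ALTER TABLE gtt2 ALTER COLUMN b TYPE varchar(64);
```

Whether the function returns a set of PIDs usable in exactly this way depends on the final patch.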




> But right now this check is not working correctly in any case: if you insert some data into the table,
> you cannot create an index any more:
>
> postgres=# create global temp table gtt(x integer primary key, y integer);
> CREATE TABLE
> postgres=# insert into gtt values (generate_series(1,100000), generate_series(1,100000));
> INSERT 0 100000
> postgres=# create index on gtt(y);
> ERROR:  can not create index when have one or more backend attached this global temp table
>
> I wonder why you need such a restriction?
>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
> <global_temporary_table_v1-pg13.patch>



Re: [Proposal] Global temporary tables

Pavel Stehule


On Thu, Nov 7, 2019 at 1:17 PM, 曾文旌(义从) <[hidden email]> wrote:


On Nov 7, 2019, at 5:40 PM, Pavel Stehule <[hidden email]> wrote:



On Thu, Nov 7, 2019 at 10:30 AM, 曾文旌(义从) <[hidden email]> wrote:


> On Nov 7, 2019, at 12:08 AM, Konstantin Knizhnik <[hidden email]> wrote:
>
>
>
> On 06.11.2019 16:24, 曾文旌(义从) wrote:
>> Dear Hackers
>>
>>
>> I attached the patch of my GTT implementation, based on PG12.
>> The GTT design came from my first email.
>> Some limitations in patch will be eliminated in later versions.
>>
>> Later, I will comment on Konstantin's patch and make some proposals for cooperation.
>> Looking forward to your feedback.
>>
>> Thanks.
>>
>> Zeng Wenjing
>>
>
> Thank you for this patch.
> My first comments:
>
> 1. I have ported your patch to the latest Postgres version (my patch is attached).
> 2. Your patch supports only B-Tree indexes for GTT. All other index types (hash, gin, gist, brin, ...) are not currently supported.
Currently I only support btree indexes.
I noticed that your patch supports more index types; this is an area where I'd like to work with you.

> 3. I do not understand the reason for the following limitation:
> "We allow to create index on global temp table only this session use it"
>
> First of all, it seems to significantly reduce the usefulness of global temp tables.
> Why do we need GTT at all? Mostly because we need to access temporary data in more than one backend; otherwise we can just use a normal table.
> If a temp table is expected to be large enough that we need to create an index for it, then it is hard to believe that it will be needed in only one backend.
>
> Maybe the assumption is that all indexes have to be created before the GTT starts to be used.
Yes. Currently, an index on a GTT can only be created while the table is empty and no other session is using it.
There are two possible improvements:
1. An index can be created on GTT(A) when GTT(A) in the current session is not empty, as long as the GTT is empty in every other session.
Index_build needs to be done in the current session, just like for a normal table. This improvement is relatively easy.

2. An index can be created on GTT(A) when more than one session is using GTT(A).
The problem is that when I finish creating an index on the GTT in this session and mark it as valid, that is not true for the GTT in the other sessions:
indexes on the GTT in other sessions would require a "rebuild_index" before use.
I don't have a better solution right now; maybe you have some suggestions.

I think DDL operations can be implemented in some reduced form: DDL is active only for one session, and is invisible to other sessions. What matters is the state of the GTT object at session start.

For example, ALTER TABLE DROP COLUMN can have a fatal impact on other sessions. So I think the best pattern for GTT may be: the structure of a GTT is immutable for any session that doesn't do DDL operations.
Yes, DDL that needs to rewrite data files will have this problem.
This is why I disabled ALTER on GTT in the current version.
It can be improved; for example, ALTER on a GTT could also be allowed when only the current session is using it.

I think it is an acceptable solution for the first steps, but I cannot imagine this behavior being good for production usage. It can, however, be good enough for some time.

Regards

Pavel

Users can also choose to kick off the other sessions that are using the GTT and then ALTER it.
I provide a function, pg_gtt_attached_pid(relation, schema), to query which sessions a GTT is being used by.




> But right now this check is not working correctly in any case: if you insert some data into the table,
> you cannot create an index any more:
>
> postgres=# create global temp table gtt(x integer primary key, y integer);
> CREATE TABLE
> postgres=# insert into gtt values (generate_series(1,100000), generate_series(1,100000));
> INSERT 0 100000
> postgres=# create index on gtt(y);
> ERROR:  can not create index when have one or more backend attached this global temp table
>
> I wonder why you need such a restriction?
>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
> <global_temporary_table_v1-pg13.patch>



Re: [Proposal] Global temporary tables

konstantin knizhnik
In reply to this post by 曾文旌(义从)


On 07.11.2019 12:30, 曾文旌(义从) wrote:

>
>> Maybe the assumption is that all indexes have to be created before the GTT starts to be used.
> Yes. Currently, an index on a GTT can only be created while the table is empty and no other session is using it.
> There are two possible improvements:
> 1. An index can be created on GTT(A) when GTT(A) in the current session is not empty, as long as the GTT is empty in every other session.
> Index_build needs to be done in the current session, just like for a normal table. This improvement is relatively easy.
>
> 2. An index can be created on GTT(A) when more than one session is using GTT(A).
> The problem is that when I finish creating an index on the GTT in this session and mark it as valid, that is not true for the GTT in the other sessions:
> indexes on the GTT in other sessions would require a "rebuild_index" before use.
> I don't have a better solution right now; maybe you have some suggestions.
It is possible to create the index on demand:

Buffer
_bt_getbuf(Relation rel, BlockNumber blkno, int access)
{
    Buffer      buf;

    if (blkno != P_NEW)
    {
        /* Read an existing block of the relation */
        buf = ReadBuffer(rel, blkno);
        /* Session temporary relation may not yet be initialized for this backend. */
        if (blkno == BTREE_METAPAGE &&
            GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
        {
            Relation heap = RelationIdGetRelation(rel->rd_index->indrelid);

            ReleaseBuffer(buf);
            DropRelFileNodeLocalBuffers(rel->rd_node, MAIN_FORKNUM, blkno);
            btbuild(heap, rel, BuildIndexInfo(rel));
            RelationClose(heap);
            buf = ReadBuffer(rel, blkno);
            LockBuffer(buf, access);
        }
        else
        {
            LockBuffer(buf, access);
            _bt_checkpage(rel, buf);
        }
    }
    ...


This code initializes the B-Tree and loads data into it when a GTT index is
accessed but is not yet initialized.
It looks a little bit hackish, but it works.

I also wonder why you are keeping information about GTT in shared memory.
It looks like the only information we really need to share is the table's
metadata, but that is already shared through the catalog. All other
GTT-related information is private to the backend, so I do not see a
reason to place it in shared memory.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

曾文旌(义从)


> On Nov 8, 2019, at 12:32 AM, Konstantin Knizhnik <[hidden email]> wrote:
>
>
>
> On 07.11.2019 12:30, 曾文旌(义从) wrote:
>>
>>> Maybe the assumption is that all indexes have to be created before the GTT starts to be used.
>> Yes. Currently, an index on a GTT can only be created while the table is empty and no other session is using it.
>> There are two possible improvements:
>> 1. An index can be created on GTT(A) when GTT(A) in the current session is not empty, as long as the GTT is empty in every other session.
>> Index_build needs to be done in the current session, just like for a normal table. This improvement is relatively easy.
>>
>> 2. An index can be created on GTT(A) when more than one session is using GTT(A).
>> The problem is that when I finish creating an index on the GTT in this session and mark it as valid, that is not true for the GTT in the other sessions:
>> indexes on the GTT in other sessions would require a "rebuild_index" before use.
>> I don't have a better solution right now; maybe you have some suggestions.
> It is possible to create index on demand:
>
> Buffer
> _bt_getbuf(Relation rel, BlockNumber blkno, int access)
> {
>     Buffer        buf;
>
>     if (blkno != P_NEW)
>     {
>         /* Read an existing block of the relation */
>         buf = ReadBuffer(rel, blkno);
>         /* Session temporary relation may be not yet initialized for this backend. */
>         if (blkno == BTREE_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
>         {
>             Relation heap = RelationIdGetRelation(rel->rd_index->indrelid);
>             ReleaseBuffer(buf);
>             DropRelFileNodeLocalBuffers(rel->rd_node, MAIN_FORKNUM, blkno);
>             btbuild(heap, rel, BuildIndexInfo(rel));
>             RelationClose(heap);
>             buf = ReadBuffer(rel, blkno);
>             LockBuffer(buf, access);
>         }
>         else
>         {
>             LockBuffer(buf, access);
>             _bt_checkpage(rel, buf);
>         }
>     }
>     ...
In my opinion, it is not a good idea to trigger a btbuild from a SELECT or DML statement, since its cost depends on the amount of data in the GTT.

>
>
> This code initializes the B-Tree and loads data into it when a GTT index is accessed but is not yet initialized.
> It looks a little bit hackish, but it works.
>
> I also wonder why you are keeping information about GTT in shared memory. It looks like the only information we really need to share is the table's metadata.
> But that is already shared through the catalog. All other GTT-related information is private to the backend, so I do not see a reason to place it in shared memory.
The shared hash structure tracks which backends have initialized a GTT's storage, in order to implement DDL on GTTs.
For a GTT there is only one definition (including indexes on it), but each backend may have its own data.
For the implementation of DROP on a GTT, I assume that all data and definitions need to be deleted.

>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
>




Attachment: global_temporary_table_v2-pg13.patch (189K)

Re: [Proposal] Global temporary tables

konstantin knizhnik


On 08.11.2019 10:50, 曾文旌(义从) wrote:
> In my opinion, it is not a good idea to trigger a btbuild from a SELECT or DML statement, since its cost depends on the amount of data in the GTT.
IMHO it is better than returning an error.
Also, the index will be used only if the cost of the plan with the index
is considered better than the cost of the plan without it. If you do not
have the index, then you have to scan the whole table, and the time of
such a scan is comparable with the time of building the index.

Yes, I agree that indexes for a GTT are normally created together with
the table itself, before it is used by any application.
But if a DBA later recognizes that efficient execution of queries
requires some more indexes, it would be strange and dangerous to prevent
him from adding such an index until all clients which have accessed this
table drop their connections.
Also, maintaining information about attached backends in shared memory
seems to be overkill.

>>
>> This code initializes the B-Tree and loads data into it when a GTT index is accessed but is not yet initialized.
>> It looks a little bit hackish, but it works.
>>
>> I also wonder why you are keeping information about GTT in shared memory. It looks like the only information we really need to share is the table's metadata.
>> But that is already shared through the catalog. All other GTT-related information is private to the backend, so I do not see a reason to place it in shared memory.
> The shared hash structure tracks which backends have initialized a GTT's storage, in order to implement DDL on GTTs.
Sorry, I do not understand this argument.
DDL is performed on shared metadata present in the global catalog, and
the standard Postgres invalidation mechanism is used to notify all
backends about schema changes.
Why do we need to maintain some extra information in shared memory?
Can you give me an example of DDL which doesn't work without such a shared hash?

> For a GTT there is only one definition (including indexes on it), but each backend may have its own data.
> For the implementation of DROP on a GTT, I assume that all data and definitions need to be deleted.

Data of a dropped GTT is removed on normal backend termination, or
cleaned up at server restart in case of abnormal shutdown (as it is done
for local temp tables).
I have not used any shared control structures for GTT in my
implementation, and that is why I wonder why you need them and what the
expected problems with my implementation are.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [Proposal] Global temporary tables

曾文旌(义从)
In reply to this post by konstantin knizhnik
My comments for global_private_temp-4.patch:

Good points:
1. Lots of index types on GTT. I think we need support for all kinds of indexes.
2. serial columns on GTT.
3. INHERITS with GTT.
4. PARTITION with GTT.

I didn't choose to support these in the first release, but you did.

Issues:
1. Case: create global temp table gtt2(a int primary key, b text) on commit delete rows;
I think you've lost the meaning of the ON COMMIT DELETE ROWS clause:
after the GTT is created, the other sessions see it as an ON COMMIT PRESERVE ROWS GTT.

2. Truncating a GTT: maybe this is a bug in DropRelFileNodeBuffers.
The GTT's local buffers are not released.
Case:
postgres=# insert into gtt2 values(1,'xx');
INSERT 0 1
postgres=# truncate gtt2;
TRUNCATE TABLE
postgres=# insert into gtt2 values(1,'xx');
ERROR:  unexpected data beyond EOF in block 0 of relation base/13579/t3_16384
HINT:  This has been seen to occur with buggy kernels; consider updating your system.

3. Lock type when truncating a GTT.
I don't think it's a good idea to hold a big lock for TRUNCATE on a GTT, because it only needs to process private data.

4. GTT DDL: DDL that needs to rewrite data files may need attention.
We discussed this in a previous email; this is why I used a shared hash to track the GTT files.


5. There will be problems with DDL that changes the relfilenode, such as CLUSTER or VACUUM FULL on a GTT.
If one session completes VACUUM FULL gtt(a), the other sessions would immediately start reading and writing the new storage file, and their existing data would be lost.
I disable them in my current version.

6. DROP GTT:
I think DROP on a GTT should clean up all storage files and definitions. What do you think?

7. MVCC visibility and clog cleanup:
GTT data follows the same visibility rules as regular tables, so a GTT also needs the clog.
We need to prevent the clog that a GTT still needs from being truncated.
At the same time, a GTT is not autovacuumed, and retaining too-old data would cause wraparound data loss.
I have given a solution in my design.


Zeng Wenjing

On Nov 1, 2019, at 11:15 PM, Konstantin Knizhnik <[hidden email]> wrote:



On 25.10.2019 20:00, Pavel Stehule wrote:

>
>> So except for the limitation mentioned above (which I do not consider critical), there is only one problem which was not addressed: maintaining statistics for GTT.
>> If all of the following conditions are true:
>>
>> 1) GTT are used in joins
>> 2) There are indexes defined for GTT
>> 3) Size and histogram of a GTT can significantly vary between backends.
>> 4) ANALYZE was explicitly called for the GTT
>>
>> then a query execution plan built in one backend will also be used in other backends, where it can be inefficient.
>> I also do not consider this problem a "show stopper" for adding GTT to Postgres.
> I think that's *definitely* a show stopper.
Well, if both you and Pavel think that it is really a "show stopper", then
this problem really has to be addressed.
I am slightly confused by this opinion, because Pavel has told me
himself that 99% of users never create indexes on temp tables
or run ANALYZE on them. And without those, this problem is not a problem
at all.


Users don't ANALYZE temp tables in 99% of cases. It's true. But the second fact is that those users have lots of problems. It's very similar to wrong statistics on persistent tables: when the data is small, it is not a problem for users, although from my perspective it's not optimal; when the data is not small, the problem can be brutal. Temporary tables are not an exception. And users and developers are people: we only hear about the fatal problems. There are lots of unoptimized queries, but because the problem is not fatal, it is not reported. And lots of people have no idea how fast databases can be. The state of users' and app developers' knowledge is a sad story.

Pavel

It seems to me that I have found quite an elegant solution for per-backend statistics for GTT: I just insert them into the backend's catalog cache, but not into the pg_statistic table itself.
To do this I had to add InsertSysCache/InsertCatCache functions which insert a pinned entry into the corresponding cache.
I wonder if there are any pitfalls to such an approach?

New patch for GTT is attached.
-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
<global_private_temp-4.patch>
