Matview size - space increased on concurrently refresh

ncontu1
Hello,
we noticed that refreshing one of our simple matviews with the CONCURRENTLY option increases its size by about 120 MB every time.
This only happens if I am reading from that matview while I am refreshing it.

cmdv3=# SELECT pg_size_pretty(pg_relation_size('public.matview_nm_connections'::regclass));
pg_size_pretty
----------------
133 MB
(1 row)
 
cmdv3=# refresh materialized view matview_nm_connections;
REFRESH MATERIALIZED VIEW
cmdv3=# SELECT pg_size_pretty(pg_relation_size('public.matview_nm_connections'::regclass));
pg_size_pretty
----------------
133 MB
(1 row)
 
cmdv3=# \! date
Fri Jul 12 13:52:51 GMT 2019
 
cmdv3=# refresh materialized view matview_nm_connections;
REFRESH MATERIALIZED VIEW
cmdv3=# SELECT pg_size_pretty(pg_relation_size('public.matview_nm_connections'::regclass));
pg_size_pretty
----------------
133 MB
(1 row)
 
 
Now let's try CONCURRENTLY:
 
cmdv3=# refresh materialized view CONCURRENTLY matview_nm_connections;
REFRESH MATERIALIZED VIEW
cmdv3=# SELECT pg_size_pretty(pg_relation_size('public.matview_nm_connections'::regclass));
pg_size_pretty
----------------
261 MB
(1 row)


The matview is barely used and there is nothing unusual about it, but it grew to 12 GB because we refresh it once an hour.
Its free percent was at 97%.
I understand that with CONCURRENTLY it needs to keep a copy of the data while it is being read, but this seems like too much space.

Is this a bug? Can anyone help us understand this?

Thanks a lot,
Nicola

Re: Matview size - space increased on concurrently refresh

ncontu1
P.S.: I am on PostgreSQL 11.3.


Re: Matview size - space increased on concurrently refresh

Kaixi Luo

This is normal and to be expected. When refreshing the materialized view, the new data is written to disk and then the two tables are diffed. After the refresh finishes, your view size should go back to normal.
 

Re: Matview size - space increased on concurrently refresh

ncontu1
It does not. That's the issue.
It always increases by 120 MB, and it reached 12 GB instead of just 180 MB.


Re: Matview size - space increased on concurrently refresh

Tom Lane-2
[ please do not top-post in your replies, it makes the conversation hard
  to follow ]

Nicola Contu <[hidden email]> writes:
> On Sun, 14 Jul 2019, 21:34 Kaixi Luo <[hidden email]> wrote:
>> This is normal and something to be expected. When refreshing the
>> materialized view, the new data is written to disk and then the two
>> tables are diffed. After the refresh finishes, your view size should go
>> back to normal.

> It does not. That's the issue.
> It always increases by 120 MB and it reached 12 GB instead of just 180 MB.

A concurrent matview refresh will necessarily leave behind two copies
of any rows it changes, just like any other row-update operation in
Postgres.  Once there are no concurrent transactions that can "see"
the old row copies, they should be reclaimable by vacuum.
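
One way to see how much of the matview is taken up by dead rows or free space is the pgstattuple contrib extension. This is only a sketch: it assumes the extension is available on the server and that you are allowed to create it, and it uses the matview name from the original post.

```sql
-- Assumption: the pgstattuple contrib module is installed on this server.
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Scans the whole relation, so it can be slow on a large matview.
SELECT pg_size_pretty(table_len) AS total_size,
       tuple_count,
       dead_tuple_count,
       dead_tuple_percent,
       free_percent
FROM pgstattuple('public.matview_nm_connections'::regclass);
```

A high dead_tuple_percent would point at old row copies that vacuum has not removed yet, while a high free_percent would suggest the space has already been freed for reuse inside the file but not returned to the operating system.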

Since you're not seeing autovacuum reclaim the space automatically,
I hypothesize that you've got autovacuum turned off or dialed down
to unrealistically non-aggressive settings.  Or possibly you have
old open transactions that are preventing reclaiming dead rows
(because they can still possibly "see" those rows).  Either of those
explanations should imply that you're getting similar bloat in every
other table and matview, though.
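
Both hypotheses can be checked from psql. The queries below are a rough sketch, not a complete diagnosis; they use the matview name from the original post and only standard catalog views.

```sql
-- Is autovacuum enabled, and how aggressive are the global settings?
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'autovacuum%'
ORDER BY name;

-- Per-relation storage options set on the matview, if any
-- (for example autovacuum_enabled=false).
SELECT relname, reloptions
FROM pg_class
WHERE oid = 'public.matview_nm_connections'::regclass;

-- Sessions that currently have a transaction open, oldest first.
SELECT pid, usename, state, xact_start,
       now() - xact_start AS xact_age,
       backend_xmin
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start;
```

A session whose xact_age is hours or days, especially one in the "idle in transaction" state, would be a candidate for the old open transaction described above.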

You might want to look into pg_stat_all_tables to see what it says
about the last_autovacuum time etc. for that matview.  Another source
of insight is to do a manual "vacuum verbose" on the matview and see
what that says about removable and nonremovable rows.
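
A minimal sketch of those two checks, assuming the matview lives in the public schema as in the original post:

```sql
-- Dead-row estimate and vacuum history for the matview.
SELECT relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum, autovacuum_count
FROM pg_stat_all_tables
WHERE schemaname = 'public'
  AND relname = 'matview_nm_connections';

-- Manual vacuum; the VERBOSE output reports removable and
-- nonremovable row versions.
VACUUM (VERBOSE) public.matview_nm_connections;
```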

                        regards, tom lane