Watching for view changes

classic Classic list List threaded Threaded
31 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Watching for view changes

Mitar
Hi!

I am trying to see how I could use NOTIFY/LISTEN to watch changes of a
complicated SELECT query which spans multiple tables. Ideally, I would
like to leave to PostgreSQL to determine when some data (and which
data) in the result of the SELECT query has changed. So I am thinking
that creating a temporary view using that query could be a way, only
if I would find a way to watch such view for changes somehow.

But it seems this is not really possible. I looked into two mechanisms:

- Logical replication. Instead of NOTIFY/LISTEN I could simply create
a publication over a view and then subscribe to it. But it seems
logical replication can be done only over base tables and not views.
[1]
- Using "after" trigger on the view to get notification when the view
gets changed. I could even use  transition relations to have
information what changed. But sadly it seems that this is possible
only if there is also INSTEAD OF trigger on the view. But I would like
to get notification when the view has changed because underlying
tables have changed, and not because of an UPDATE query on the view
itself. Moreover, I do not really need writable views. [2]

So I wonder if I am missing anything. Is there some other best
practice how to get notifications when result of a query changes in
real-time? And information what changed?

How hard it would be to implement such triggers on a view for whenever
a view changes? Is there a process to make a feature request?

(Also, I have not really managed to get statement level "after"
triggers to be run on a view for at all. Because if I rewrite a query
with INSTEAD OF then triggers on those tables are triggered, not
really view's. So not sure what is even expected use there.)

[1] https://www.postgresql.org/docs/devel/logical-replication-restrictions.html
[2] https://www.postgresql.org/docs/devel/trigger-definition.html


Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m

Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

Ron-2
On 12/20/18 3:17 AM, Mitar wrote:
> Hi!
>
> I am trying to see how I could use NOTIFY/LISTEN to watch changes of a
> complicated SELECT query which spans multiple tables. Ideally, I would
> like to leave to PostgreSQL to determine when some data (and which
> data) in the result of the SELECT query has changed. So I am thinking
[snip]

There was a LONG thread on this list a few months ago about monitoring for
DDL changes.  It should prove helpful (or cause you to despair).


--
Angular momentum makes the world go 'round.

Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

Mitar
Hi!

Thanks. Care for a link or at least thread subject?

Also, Googling around this seems a pretty popular request. I am quite
surprised that there is not something out-of-box available for this,
efficient and scalable.


Mitar

On Thu, Dec 20, 2018 at 7:33 AM Ron <[hidden email]> wrote:

>
> On 12/20/18 3:17 AM, Mitar wrote:
> > Hi!
> >
> > I am trying to see how I could use NOTIFY/LISTEN to watch changes of a
> > complicated SELECT query which spans multiple tables. Ideally, I would
> > like to leave to PostgreSQL to determine when some data (and which
> > data) in the result of the SELECT query has changed. So I am thinking
> [snip]
>
> There was a LONG thread on this list a few months ago about monitoring for
> DDL changes.  It should prove helpful (or cause you to despair).
>
>
> --
> Angular momentum makes the world go 'round.
>


--
http://mitar.tnode.com/
https://twitter.com/mitar_m

Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

Adrian Klaver-4
On 12/20/18 7:54 AM, Mitar wrote:
> Hi!
>
> Thanks. Care for a link or at least thread subject?

https://www.postgresql.org/message-id/CA%2BFnnTxqScf6mDw_7HLMfu7YTagPrsYUw-bc%3DOKsBQ0TqprvEA%40mail.gmail.com

>
> Also, Googling around this seems a pretty popular request. I am quite
> surprised that there is not something out-of-box available for this,
> efficient and scalable.
>
>
> Mitar
>
> On Thu, Dec 20, 2018 at 7:33 AM Ron <[hidden email]> wrote:
>>
>> On 12/20/18 3:17 AM, Mitar wrote:
>>> Hi!
>>>
>>> I am trying to see how I could use NOTIFY/LISTEN to watch changes of a
>>> complicated SELECT query which spans multiple tables. Ideally, I would
>>> like to leave to PostgreSQL to determine when some data (and which
>>> data) in the result of the SELECT query has changed. So I am thinking
>> [snip]
>>
>> There was a LONG thread on this list a few months ago about monitoring for
>> DDL changes.  It should prove helpful (or cause you to despair).
>>
>>
>> --
>> Angular momentum makes the world go 'round.
>>
>
>


--
Adrian Klaver
[hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

Mike Rylander
In reply to this post by Mitar
On Thu, Dec 20, 2018 at 4:17 AM Mitar <[hidden email]> wrote:

>
> Hi!
>
> I am trying to see how I could use NOTIFY/LISTEN to watch changes of a
> complicated SELECT query which spans multiple tables. Ideally, I would
> like to leave to PostgreSQL to determine when some data (and which
> data) in the result of the SELECT query has changed. So I am thinking
> that creating a temporary view using that query could be a way, only
> if I would find a way to watch such view for changes somehow.
>

If, as I assume, the FROM clause of the SELECT is essentially static,
just with varying WHERE conditions, you could just use an AFTER
INSERT/UPDATE/DELETE trigger on each of the tables involved to send a
NOTIFY whenever data that is exposed by the SELECT is changed.  You
can deliver a payload, such as the table name, primary key value and,
with a little work, even a list of fields that were modified as a JSON
blob, and let the application do whatever needs to be done to react
the the changes -- issue other queries, etc.

Of course that depends on your application knowing when it's
appropriate to NOTIFY, or being able to handle spurious NOTIFYs.

HTH,

--
Mike Rylander
 | Executive Director
 | Equinox Open Library Initiative
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  [hidden email]
 | web:  http://equinoxinitiative.org

> But it seems this is not really possible. I looked into two mechanisms:
>
> - Logical replication. Instead of NOTIFY/LISTEN I could simply create
> a publication over a view and then subscribe to it. But it seems
> logical replication can be done only over base tables and not views.
> [1]
> - Using "after" trigger on the view to get notification when the view
> gets changed. I could even use  transition relations to have
> information what changed. But sadly it seems that this is possible
> only if there is also INSTEAD OF trigger on the view. But I would like
> to get notification when the view has changed because underlying
> tables have changed, and not because of an UPDATE query on the view
> itself. Moreover, I do not really need writable views. [2]
>
> So I wonder if I am missing anything. Is there some other best
> practice how to get notifications when result of a query changes in
> real-time? And information what changed?
>
> How hard it would be to implement such triggers on a view for whenever
> a view changes? Is there a process to make a feature request?
>
> (Also, I have not really managed to get statement level "after"
> triggers to be run on a view for at all. Because if I rewrite a query
> with INSTEAD OF then triggers on those tables are triggered, not
> really view's. So not sure what is even expected use there.)
>
> [1] https://www.postgresql.org/docs/devel/logical-replication-restrictions.html
> [2] https://www.postgresql.org/docs/devel/trigger-definition.html
>
>
> Mitar
>
> --
> http://mitar.tnode.com/
> https://twitter.com/mitar_m
>

Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

Mitar
In reply to this post by Adrian Klaver-4
Hi!

Ah, that was a thread about table changes. I am interested in data
changes (results from a query).

Thanks.


Mitar

On Thu, Dec 20, 2018 at 8:01 AM Adrian Klaver <[hidden email]> wrote:

>
> On 12/20/18 7:54 AM, Mitar wrote:
> > Hi!
> >
> > Thanks. Care for a link or at least thread subject?
>
> https://www.postgresql.org/message-id/CA%2BFnnTxqScf6mDw_7HLMfu7YTagPrsYUw-bc%3DOKsBQ0TqprvEA%40mail.gmail.com
>
> >
> > Also, Googling around this seems a pretty popular request. I am quite
> > surprised that there is not something out-of-box available for this,
> > efficient and scalable.
> >
> >
> > Mitar
> >
> > On Thu, Dec 20, 2018 at 7:33 AM Ron <[hidden email]> wrote:
> >>
> >> On 12/20/18 3:17 AM, Mitar wrote:
> >>> Hi!
> >>>
> >>> I am trying to see how I could use NOTIFY/LISTEN to watch changes of a
> >>> complicated SELECT query which spans multiple tables. Ideally, I would
> >>> like to leave to PostgreSQL to determine when some data (and which
> >>> data) in the result of the SELECT query has changed. So I am thinking
> >> [snip]
> >>
> >> There was a LONG thread on this list a few months ago about monitoring for
> >> DDL changes.  It should prove helpful (or cause you to despair).
> >>
> >>
> >> --
> >> Angular momentum makes the world go 'round.
> >>
> >
> >
>
>
> --
> Adrian Klaver
> [hidden email]



--
http://mitar.tnode.com/
https://twitter.com/mitar_m

lup
Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

lup


> On Dec 20, 2018, at 1:04 PM, Mitar <[hidden email]> wrote:
>
> Hi!
>
> Ah, that was a thread about table changes. I am interested in data
> changes (results from a query).
>
> Thanks.
>
Are you hoping to see the difference in the returned values for successive calls to the same query?


Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

Mitar
In reply to this post by Mike Rylander
Hi!

On Thu, Dec 20, 2018 at 8:11 AM Mike Rylander <[hidden email]> wrote:
> If, as I assume, the FROM clause of the SELECT is essentially static,
> just with varying WHERE conditions, you could just use an AFTER
> INSERT/UPDATE/DELETE trigger on each of the tables involved to send a
> NOTIFY whenever data that is exposed by the SELECT is changed.

You mean that for every SQL query I am making to a database, I would
manually determine which tables are involved and then setup triggers
with NOTIFY that the SELECT might have changed?

I am trying to see if this could be something I could abstract out
that it would be done automatically for any query. I have looked into
using EXPLAIN to get a list of tables involved in a query, but the
issue is that it look non-trivial to determine what has really changed
as a consequence of those tables changing. I would have to cache
myself what was the result of a query and then myself do a diff? Are
there any ways to do this inside PostgreSQL without having to cache
results on the client and do it there?

> You can deliver a payload, such as the table name, primary key value and,
> with a little work, even a list of fields that were modified as a JSON
> blob, and let the application do whatever needs to be done to react
> the the changes -- issue other queries, etc.

But this would mean that I would have to know how changes on involved
tables influence query results. I would like to not have to do SQL
query parsing and understanding on the client. So ideally, I would get
information directly from PostgreSQL. For me, an API where I could do
AFTER UPDATE trigger on FOR EACH ROW on a view would be perfect. In
that trigger I could get information which rows of the view changed
and then use NOTIFY to inform the client. Or even use transition
relations to get old and new state in the case FOR EACH STATEMENT (but
then I would still have to diff it probably myself). And view could
represent any query, without me having to try to understand and parse
it.


Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m

Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

Mitar
In reply to this post by lup
Hi!

On Thu, Dec 20, 2018 at 12:07 PM Rob Sargent <[hidden email]> wrote:
> Are you hoping to see the difference in the returned values for successive calls to the same query?

i would like to in as close to real-time as possible get notification
when results of a query would have changed if I would do the same
query again, without me having to do polling or without me having to
do diffing.

So if my query was SELECT * FROM table then having after update
trigger on that table would give me such information on a row to row
basis, or statement basis. But I am not doing just such queries. So I
was thinking that I could create a view and then do SELECT * FROM view
to get update trigger on changes on the view. So changes would happen
because of changes do underlying tables. And I would like to be able
to know when and what in that query has changed when underlying tables
have changed.

I have found this package [1] which seems to be close in the idea, but
it does not seem to work in all cases. There is also an older
different package. [2] I would like to see if something like this
could be done inside database itself.

[1] https://github.com/nothingisdead/pg-live-query
[2] https://github.com/numtel/pg-live-select


Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m

Reply | Threaded
Open this post in threaded view
|

RE: Watching for view changes

Kevin Brannen
From: Mitar <[hidden email]>

> On Thu, Dec 20, 2018 at 12:07 PM Rob Sargent <[hidden email]> wrote:
> > Are you hoping to see the difference in the returned values for successive calls to the same query?
>
> i would like to in as close to real-time as possible get notification when results of a query would have changed if I would do the same query again, without me having to do polling or without me having to do diffing.


Hmm, I guess I could see that as long as the DB wasn't too [write] busy, else you'd be flooded with notifications.

Some years ago when I was working on a web app in Perl (which uses the DBI module for all communication to the DB), I subclassed DBI in order to see all calls to the DB. For most statements, I just let all the calls go thru. But for inserts, I wrote to a file the data that was being inserted; for deletes I wrote what was being deleted; and for updates I wrote the before and after values. (For the last 2 I changed the delete/update into a select to get the data.) It made it much easier to see how data changed -- especially when I was new to the app.

You could do something like that, where you have an interceptor that reports on data changes, filtering/searching for just the parts you want as you see fit. Of course, that would be just for your app, it wouldn't catch changes made from psql and other tools.

Maybe it's a useful idea for you ... or maybe not. 😊

If you had to have all statements no matter what tool was used to change data, I'd probably change log_statements to "all", "tail -f" the Pg log, and "do the right thing" (which could be non-trivial).

Kevin
This e-mail transmission, and any documents, files or previous e-mail messages attached to it, may contain confidential information. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are hereby notified that any disclosure, distribution, review, copy or use of any of the information contained in or attached to this message is STRICTLY PROHIBITED. If you have received this transmission in error, please immediately notify us by reply e-mail, and destroy the original transmission and its attachments without reading them or saving them to disk. Thank you.
Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

Ron-2
In reply to this post by Mitar
On 12/20/18 2:20 PM, Mitar wrote:
> Hi!
>
> On Thu, Dec 20, 2018 at 12:07 PM Rob Sargent <[hidden email]> wrote:
>> Are you hoping to see the difference in the returned values for successive calls to the same query?
> i would like to in as close to real-time as possible get notification
> when results of a query would have changed if I would do the same
> query again, without me having to do polling or without me having to
> do diffing.

That's nothing like what you wrote in the Subject line.

--
Angular momentum makes the world go 'round.

Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

George Neuner
In reply to this post by Mitar
On Thu, 20 Dec 2018 12:12:14 -0800, Mitar <[hidden email]> wrote:


>On Thu, Dec 20, 2018 at 8:11 AM Mike Rylander <[hidden email]> wrote:
>> If, as I assume, the FROM clause of the SELECT is essentially static,
>> just with varying WHERE conditions, you could just use an AFTER
>> INSERT/UPDATE/DELETE trigger on each of the tables involved to send a
>> NOTIFY whenever data that is exposed by the SELECT is changed.
>
>You mean that for every SQL query I am making to a database, I would
>manually determine which tables are involved and then setup triggers
>with NOTIFY that the SELECT might have changed?

You can just put an update trigger on every table.  You aren't forced
to listen for notifications.


>I am trying to see if this could be something I could abstract out
>that it would be done automatically for any query. I have looked into
>using EXPLAIN to get a list of tables involved in a query, but the
>issue is that it look non-trivial to determine what has really changed
>as a consequence of those tables changing. I would have to cache
>myself what was the result of a query and then myself do a diff? Are
>there any ways to do this inside PostgreSQL without having to cache
>results on the client and do it there?

You don't need to cache anything on the client.  An update trigger on
a table can notify a listening client when data is changed.  

The difficulty is that views are not guaranteed to be updateable.  As
a technical matter, you can put a trigger on a view, but it may never
fire.

AFAIK, update and delete triggers do work on materialized views,
because they really are physical tables.  Depending on your use case,
materialized views may or may not be of help to you.



>> You can deliver a payload, such as the table name, primary key value and,
>> with a little work, even a list of fields that were modified as a JSON
>> blob, and let the application do whatever needs to be done to react
>> the the changes -- issue other queries, etc.
>
>But this would mean that I would have to know how changes on involved
>tables influence query results.

The trigger function itself is agnostic WRT the format of the table -
the old and new row data are provided generically as records, and you
can to convert the record data, e.g., to JSON or XML, without knowing
its format.

AFAIHS, you really only need to know the table format to inspect or
modify the row data.


>I would like to not have to do SQL query parsing and understanding
>on the client. So ideally, I would get information directly from
>PostgreSQL. For me, an API where I could do AFTER UPDATE trigger on
>FOR EACH ROW on a view would be perfect. In that trigger I could get
>information which rows of the view changed and then use NOTIFY to
>inform the client.

You might want to use BEFORE UPDATE so you get both the old and new
row data.


YMMV,
George


Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

Mitar
In reply to this post by Kevin Brannen
Hi!

On Thu, Dec 20, 2018 at 12:54 PM Kevin Brannen <[hidden email]> wrote:
Hmm, I guess I could see that as long as the DB wasn't too [write] busy, else you'd be flooded with notifications.

Sure. But this could be resolved by allowing notifications to be batched together. Debounce them. So could maybe configure how often you want such notifications and if they are more often they would be combined together into one.
 
Maybe it's a useful idea for you ... or maybe not. 😊

Thanks. Yes, this is one approach to do it. Hooking into every modify call at the app level and in this way have some information what is changing. I would prefer doing it in the database though, so that it could be independent from the source of the change. Moreover, not all UPDATE queries really do end up updating the data.


Mitar

--
Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

Mitar
In reply to this post by George Neuner
Hi!

Thanks for more responses.

On Thu, Dec 20, 2018 at 6:28 PM George Neuner <[hidden email]> wrote:
> >You mean that for every SQL query I am making to a database, I would
> >manually determine which tables are involved and then setup triggers
> >with NOTIFY that the SELECT might have changed?
>
> You can just put an update trigger on every table.  You aren't forced
> to listen for notifications.

You are right. It is easier to just have triggers in advance.

> The difficulty is that views are not guaranteed to be updateable.

What you mean by that? I mean, just to be clear. I care only about
read-only views. The changes to views I care about change view because
underlying tables are updated. I do not care about UPDATE queries
against views themselves.

So you mean here that it is not guaranteed that you can make an UPDATE
query against a view? Yes, that I know. But if underlying tables
change, view is always updated, no? I mean, this is the main purpose
of a view. :-)

> As a technical matter, you can put a trigger on a view, but it may never fire.

If fires only for UPDATE queries against views themselves (and only if
INSTEAD OF trigger does not invalidate the update). But it never fires
for updates which happen because of changes to the underlying tables.
I would like to know:

a) Do people agree/think that would be a good API for my use case?
b) How hard would it be to implement/develop something like that? Is
this something PostgreSQL already knows internally and it is just a
question of exposing it?
c) Is there some better way to achieve this?

> AFAIK, update and delete triggers do work on materialized views,
> because they really are physical tables.  Depending on your use case,
> materialized views may or may not be of help to you.

Yes, materialized views are too heavy for me. But having update and
delete triggers only memory-only temporary views would be perfect.

Also, materialized views have to be manually refreshed, no? So it is
not really true that they get updated automatically (and that update
triggers would run) as soon as underling tables are modified?

> >I would like to not have to do SQL query parsing and understanding
> >on the client. So ideally, I would get information directly from
> >PostgreSQL. For me, an API where I could do AFTER UPDATE trigger on
> >FOR EACH ROW on a view would be perfect. In that trigger I could get
> >information which rows of the view changed and then use NOTIFY to
> >inform the client.
>
> You might want to use BEFORE UPDATE so you get both the old and new
> row data.

Not sure how this helps. If I have a query like (or a view defined
with such query):

SELECT "_id", "body", (SELECT row_to_json(posts) FROM posts WHERE
posts."_id"=comments."postId") AS "post" FROM comments

If I get a trigger notification that some row in "comments" table has
changed. How do I get an updated row in the query results (or the view
defined with such query). I would have to parse the SQL and figure out
how to do transformation myself, no?

So, I am not sure how triggers on underlying tables can really inform
how to know what in the view has been updated?


Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m

Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

George Neuner
On Fri, 21 Dec 2018 20:49:23 -0800, Mitar <[hidden email]> wrote:

>On Thu, Dec 20, 2018 at 6:28 PM George Neuner <[hidden email]> wrote:
>
>> The difficulty is that views are not guaranteed to be updateable.
>
>What you mean by that? I mean, just to be clear. I care only about
>read-only views. The changes to views I care about change view because
>underlying tables are updated. I do not care about UPDATE queries
>against views themselves.
>
>So you mean here that it is not guaranteed that you can make an UPDATE
>query against a view? Yes, that I know. But if underlying tables
>change, view is always updated, no? I mean, this is the main purpose
>of a view. :-)

Terminology. <grin>  UPDATE vs update.

I'm certain you know this already, but to be clear:


A normal view really is just a SELECT whose results reflect the
current data in the underlying tables.  That is, a normal view is
*code*, NOT data, and its result is an anonymous *virtual* table that
exists only in the transaction that applied the view.  Thus the
results can't be monitored for changes without employing some kind of
deliberate caching.

A materialized view IS exactly such a deliberate cache of results from
applying a view.  It is a real table that can be monitored for changes
using INSERT, UPDATE and/or DELETE triggers.  


>> As a technical matter, you can put a trigger on a view, but it may never fire.
>
>If fires only for UPDATE queries against views themselves (and only if
>INSTEAD OF trigger does not invalidate the update). But it never fires
>for updates which happen because of changes to the underlying tables.

Right.  A trigger on a (normal) view actually is on the virtual result
table - but that table doesn't exist outside of the transaction that
applied the view.

I'm not certain offhand, but I *believe* that - like a SELECT - view
results will change on the fly during a transaction if the underlying
tables change and the isolation settings allow to see it.

But even if a view will update while open, to use this you'd have to
hold the result set open (with a cursor) while also keeping the
transaction open.  Once the result set is closed, the view results are
gone.


With a materialized view, you must apply the view code again (call
REFRESH) to see changes to the underlying tables - it doesn't happen
automagically.  But when refreshed, triggers on the cache table would
react to changes.


>I would like to know:
>
>a) Do people agree/think that would be a good API for my use case?
>b) How hard would it be to implement/develop something like that? Is
>this something PostgreSQL already knows internally and it is just a
>question of exposing it?

It probably would be possible to take a trigger set on a view and
transitively trigger on the base tables underlying it.  

But to what end?  The results (of a normal view) don't persist beyond
the current transaction, and allowing them to do so would, AFAICS, be
a violation of the SQL standard.  Postgresql is not going to do that
(certainly not for a new feature, only for backward compatibility).

>c) Is there some better way to achieve this?
>  :
>Yes, materialized views are too heavy for me. But having update and
>delete triggers only memory-only temporary views would be perfect.

Unfortunately, materialized views are the only reasonable server side
solution that I can think of.


>Also, materialized views have to be manually refreshed, no? So it is
>not really true that they get updated automatically (and that update
>triggers would run) as soon as underling tables are modified?

Yes.  But you can do that with triggers on the base tables.  Or timers
if "most current" information is not critical.  

Perhaps you can use synchronized normal and materialized views.  Use
the normal view for current information in transactions, and the
materialized view to asynchronously update passive LISTEN clients with
delta information.


>> >I would like to not have to do SQL query parsing and understanding
>> >on the client.

That I understand. <grin>  I'm a compiler and language hobbiest ...
even just parsing modern SQL can be painful.


>> So ideally, I would get information directly from
>> >PostgreSQL. For me, an API where I could do AFTER UPDATE trigger on
>> >FOR EACH ROW on a view would be perfect. In that trigger I could get
>> >information which rows of the view changed and then use NOTIFY to
>> >inform the client.
>>
>> You might want to use BEFORE UPDATE so you get both the old and new
>> row data.
>
>Not sure how this helps.

If you know specifically what columns have changed, only they would
need be communicated to a LISTEN client.  Not necessary to send the
whole row (which may be much larger).

AFTER shows you only the new row.  BEFORE shows you both the old and
new rows so you can determine what changed (and how, if relevant).


>So, I am not sure how triggers on underlying tables can really inform
>how to know what in the view has been updated?

They can't - if any results are computed you'd have to apply the view
again to see the changes.  And then you'd have to a whole new view and
not just updates to the old.

I don't see any way out of this that doesn't involve a materialized
view.  It's the only way I can see to get just updates and not deal
with the whole result set again.


YMMV,
George


Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

Mitar
Hi!

On Fri, Dec 21, 2018 at 11:10 PM George Neuner <[hidden email]> wrote:
> A materialized view IS exactly such a deliberate cache of results from
> applying a view.  It is a real table that can be monitored for changes
> using INSERT, UPDATE and/or DELETE triggers.

Caching is needed if you want to compute a difference between previous
version and new. But if you want to just know new value, then I could
imagine that (a simple implementation would) on every change to any
underlying table check if this change matches selectors of the query
and if such apply its operations/projections and produce the new
value.

So yes, you need caching if you want to decrease CPU use, but you
could also see it as new values being computed again and again through
query. Would such caching you are mentioning really improve
performance, I do not know, so it might be premature optimization?

If we do not go down the cache path, then it seems there is no other
way to have this "apply this query again on those updated rows from
table". In a way I see query as a transformation (in relational
algebra) of original tables to results and I would need to be able to
apply that transformation again on new rows. And if some additional
data is necessary (some rows for other non-changed tables) it would
just recompute that again, instead of using cache.

If we do go down the cache path, then I agree, materialized views
seems nice, but I would prefer temporary materialized views: they
should be cleaned up at the end of the session. Moreover, they should
be ideally just in memory, not really on disk. Materialized views are
currently stored to disk, no?

> With a materialized view, you must apply the view code again (call
> REFRESH) to see changes to the underlying tables - it doesn't happen
> automagically.  But when refreshed, triggers on the cache table would
> react to changes.

So you are saying I could use triggers on the cache table to know what
really changed instead of having to compute diff myself? Interesting.


Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m

Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

Ricardo Martin Gomez
In reply to this post by Mitar
Hi, perhaps you can use triggers for some tables.
Regards.


From: Mitar <[hidden email]>
Sent: Saturday, December 22, 2018 1:21:49 AM
To: Kevin Brannen
Cc: [hidden email]
Subject: Re: Watching for view changes
 
Hi!

On Thu, Dec 20, 2018 at 12:54 PM Kevin Brannen <[hidden email]> wrote:
Hmm, I guess I could see that as long as the DB wasn't too [write] busy, else you'd be flooded with notifications.

Sure. But this could be resolved by allowing notifications to be batched together. Debounce them. So could maybe configure how often you want such notifications and if they are more often they would be combined together into one.
 
Maybe it's a useful idea for you ... or maybe not. 😊

Thanks. Yes, this is one approach to do it. Hooking into every modify call at the app level and in this way have some information what is changing. I would prefer doing it in the database though, so that it could be independent from the source of the change. Moreover, not all UPDATE queries really do end up updating the data.


Mitar

--
lup
Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

lup
Is this of theoretical interest (up to and including a specification/requirement) or this a practical concern (i.e. need to know when to update somebody’s dashboard widget (but the query is too slow to simply refresh on-demand)?


On Dec 22, 2018, at 9:42 AM, Ricardo Martin Gomez <[hidden email]> wrote:

Hi, perhaps you can use triggers for some tables.
Regards.


From: Mitar <[hidden email]>
Sent: Saturday, December 22, 2018 1:21:49 AM
To: Kevin Brannen
Cc: [hidden email]
Subject: Re: Watching for view changes
 
Hi!

On Thu, Dec 20, 2018 at 12:54 PM Kevin Brannen <[hidden email]> wrote:
Hmm, I guess I could see that as long as the DB wasn't too [write] busy, else you'd be flooded with notifications.

Sure. But this could be resolved by allowing notifications to be batched together. Debounce them. So could maybe configure how often you want such notifications and if they are more often they would be combined together into one.
 
Maybe it's a useful idea for you ... or maybe not. 😊

Thanks. Yes, this is one approach to do it. Hooking into every modify call at the app level and in this way have some information what is changing. I would prefer doing it in the database though, so that it could be independent from the source of the change. Moreover, not all UPDATE queries really do end up updating the data.


Mitar

--

Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

Mitar
Hi!

This is of very practical concern. :-) I have many apps I developed in Meteor [1] using MongoDB. The whole Meteor stack is about reactive programming where web UI automatically rerenders as data in the database is changing. Meteor achieves this using complicated server-side code which tails MongoDB oplog and then maps this to reactive queries and maps how they update based on changes it observes in the oplog. This is then pushed to the client which rerenders.

This approach has scalability issues and also it costs a lot of resources on the server side to first copy data from the DB into server-side component and then keep that state in the server-side component in sync with the DB. It generally has to reimplement oplog parsing, query parsing and evaluation, to be able to do all that.

I like this declerative style of programming. Where you define reactive queries where you select data from DB, define a transformation, and then render it in UI. As data in DB is changing, everything else gets updated automatically. It is a pretty nice way of programming. Without having to think about which all places might be updating DB and how to update UI based on all those places. Maybe not for everyone and all tasks, but in my case I generally work with collaborative online tools where such real-time aspect of working together is pretty neat.

So I like PostgreSQL and I have used in other apps. And now I am trying to see if I could find an efficient way for PostgreSQL to have such reactive query and send me data as the query is changing. I think DB already has to deal with most of such logic and wiring it together in the DB instead of server-side of the app might allow better performance and scaling here.

For example, CREATE PUBLICATION seems a reasonable API as well (instead of a trigger + notification + working around a limit on how much data can be send in a notification), but it does not work on materialized views. Why is that? I thought materialized views are the same as tables, just that there is nicer API to copy a query into those tables when wanted. Currently it seems what is there is very similar to what MongoDB provides: publication/observe on a table level. So if I would not be using joins I could SUBSCRIBE to the PUBLICATION. I would still have to implement logic how to map those changes to changes to a result of a query though, to know how to update results. If I could publish a materialized view, PostgreSQL could do that mapping for me. And I could also do joins.

So it seems materialized views are close to this, but not perfect. So I have some questions:

- Is there a technical reason why PUBLICATION cannot be done on a materialized views? I mean, I could manually create/simulate materialized views through regular tables probably. What are differences between regular tables and materialized views?
- Are there limits on how many subscribers to a PUBLICATION can there effectively be?
- For my case it would be great if materialized views could be TEMPORARY, in-memory (and if PostgreSQL runs out of buffer space for it, I would prefer an error), and UNLOGGED. Any particular reasons which would prevent them to be implemented as such?



Mitar

On Sat, Dec 22, 2018 at 1:16 PM Rob Sargent <[hidden email]> wrote:
Is this of theoretical interest (up to and including a specification/requirement) or this a practical concern (i.e. need to know when to update somebody’s dashboard widget (but the query is too slow to simply refresh on-demand)?


On Dec 22, 2018, at 9:42 AM, Ricardo Martin Gomez <[hidden email]> wrote:

Hi, perhaps you can use triggers for some tables.
Regards.


From: Mitar <[hidden email]>
Sent: Saturday, December 22, 2018 1:21:49 AM
To: Kevin Brannen
Cc: [hidden email]
Subject: Re: Watching for view changes
 
Hi!

On Thu, Dec 20, 2018 at 12:54 PM Kevin Brannen <[hidden email]> wrote:
Hmm, I guess I could see that as long as the DB wasn't too [write] busy, else you'd be flooded with notifications.

Sure. But this could be resolved by allowing notifications to be batched together. Debounce them. So could maybe configure how often you want such notifications and if they are more often they would be combined together into one.
 
Maybe it's a useful idea for you ... or maybe not. 😊

Thanks. Yes, this is one approach to do it. Hooking into every modify call at the app level and in this way have some information what is changing. I would prefer doing it in the database though, so that it could be independent from the source of the change. Moreover, not all UPDATE queries really do end up updating the data.


Mitar

--



--
Reply | Threaded
Open this post in threaded view
|

Re: Watching for view changes

George Neuner
In reply to this post by Mitar
On Fri, 21 Dec 2018 23:41:16 -0800, Mitar <[hidden email]> wrote:

>Hi!
>
>On Fri, Dec 21, 2018 at 11:10 PM George Neuner <[hidden email]> wrote:
>> A materialized view IS exactly such a deliberate cache of results from
>> applying a view.  It is a real table that can be monitored for changes
>> using INSERT, UPDATE and/or DELETE triggers.
>
>Caching is needed if you want to compute a difference between previous
>version and new. But if you want to just know new value, then I could
>imagine that (a simple implementation would) on every change to any
>underlying table check if this change matches selectors of the query
>and if such apply its operations/projections and produce the new
>value.

Yes, that could be done.  But it isn't.  In effect you are asking the
DBMS also to be a spreadsheet: i.e. change this cell and everything
that depends on it gets recomputed.

A spreadsheet is an order of magnitude simpler to implement than a
DBMS, but the combination would be an order of magnitude (or more)
harder.  Even object graph databases don't offer spreadsheet
functionality, and it would be a lot easier to do there than in a
table relational system.


>So yes, you need caching if you want to decrease CPU use, but you
>could also see it as new values being computed again and again through
>query. Would such caching you are mentioning really improve
>performance, I do not know, so it might be premature optimization?

It may take only 3 cycles to multiply two numbers, but it can take
thousands of cycles [or millions if the data is on disk] to get those
two numbers into the multiplier.

There always are exceptions, but the general rule is that whenever the
result requires:
 - significant computation,
 - significant resources, or
 - significant time
then you should cache the result instead of recomputing it.

Joins and sorts can take a whole lot of memory (and spill onto disk if
they overflow the work buffer).  A fetch of a table or index not in
memory is simple but takes a lot of time - as well as maybe pushing
something else out (increasing the complexity of a competing query).


>If we do go down the cache path, then I agree, materialized views
>seems nice, but I would prefer temporary materialized views: they
>should be cleaned up at the end of the session. Moreover, they should
>be ideally just in memory, not really on disk. Materialized views are
>currently stored to disk, no?

In PG, all *named* tables are backed on disk - even temporary tables.
Only anonymous tables of query results can exist entirely in memory
[and even they can spill onto disk when necessary].

With enough memory you can cache all your tables in shared buffers and
have enough extra that you never run out of work buffers and never
overflow a work buffer.  But that is the best you can achieve with PG.

George


12