Performance killed with FDW when using CAST.

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Performance killed with FDW when using CAST.

Jorge Torralba
We setup a FDW server for a large table that has an hstore data type as on eof the coums.

When we run the query directly on the FDW server

select * from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 10;

The response is 2ms . which is awesome.

When we run it from the remote server where we have the server defined, the query never returns.

Our server definition is as follows.

(host 'fdwserver', dbname 'mydb', port '5432', fetch_size '1000', use_remote_estimate 'true', extensions 'hstore')

We have been playing with this for hours to no avail. 

Any clues ?

Thanks

--
Thanks,

Jorge Torralba
----------------------------

Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Reply | Threaded
Open this post in threaded view
|

Re: Performance killed with FDW when using CAST.

Ian Barwick-3
On 4/17/19 9:21 AM, Jorge Torralba wrote:

> We setup a FDW server for a large table that has an hstore data type as on eof the coums.
>
> When we run the query directly on the FDW server
>
> select * from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 10;
>
> The response is 2ms . which is awesome.
>
> When we run it from the remote server where we have the server defined, the query never returns.
>
> Our server definition is as follows.
>
> (host 'fdwserver', dbname 'mydb', port '5432', fetch_size '1000', use_remote_estimate 'true', extensions 'hstore')
>
> We have been playing with this for hours to no avail.
>
> Any clues ?

At a guess the FDW is not able to push down the entire query to the FDW server
and might be doing something like fetching all the remote rows and casting
them locally, or something along those lines.

It would help to provide the following info:

- PostgreSQL versions of both servers
- "EXPLAIN" output of the query on both servers
- "EXPLAIN ANALYZE" output of the query executed directly on the FDW server
- approximate number of rows in the "recent_events" table


Regards

Ian Barwick

--
  Ian Barwick                   https://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Performance killed with FDW when using CAST.

Jorge Torralba
Thanks for taking the time to look.

Both servers are on ....

                                                             version
----------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.2 (Debian 11.2-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit

I have reduced the query to select only two columns for simplicity with the same results.

The table contain about 50 million rows

On the server hosting the table ..... Hiding private data.

alertsdb_recent_events=#  select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
                  id                  | ?column?
--------------------------------------+----------
 ***** | 2617116
 ***** | 2617116
 ***** | 2617116
 ***** | 2617116
 ***** | 2617116
(5 rows)

Time: 82.868 ms


# explain  select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..3.77 rows=5 width=48)
   ->  Index Scan using recent_event_account_id_attributes_account_incident_id_idx on recent_events  (cost=0.56..56.31 rows=87 width=48)
         Index Cond: ((account_id = 1) AND (((attributes -> 'account_incident_id'::text))::integer = 2617116))
(3 rows)

Time: 71.907 ms

# explain  analyze select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
                                                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..3.77 rows=5 width=48) (actual time=0.019..0.025 rows=5 loops=1)
   ->  Index Scan using recent_event_account_id_attributes_account_incident_id_idx on recent_events  (cost=0.56..56.31 rows=87 width=48) (actual time=0.018..0.023 rows=5 loops=1)
         Index Cond: ((account_id = 1) AND (((attributes -> 'account_incident_id'::text))::integer = 2617116))
 Planning Time: 0.124 ms
 Execution Time: 0.038 ms
(5 rows)

Time: 80.782 ms


On the server that communicates with the FDW server .....

# explain  select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Limit  (cost=100.00..426.27 rows=5 width=48)
   ->  Foreign Scan on recent_events  (cost=100.00..6663659.61 rows=102117 width=48)
         Filter: (((attributes -> 'account_incident_id'::text))::integer = 2617116)
(3 rows)

Time: 85.276 ms


# explain  analyze select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=100.00..426.27 rows=5 width=48) (actual time=21242.087..26387.257 rows=5 loops=1)
   ->  Foreign Scan on recent_events  (cost=100.00..6663703.90 rows=102117 width=48) (actual time=21242.086..26387.252 rows=5 loops=1)
         Filter: (((attributes -> 'account_incident_id'::text))::integer = 2617116)
         Rows Removed by Filter: 724249
 Planning Time: 1.164 ms
 Execution Time: 26387.851 ms
(6 rows)

Time: 26528.113 ms (00:26.528)

The query killer is the ...

AND (attributes -> 'account_incident_id')::integer = 2617116

Run the query this way ...

select id, attributes -> 'account_incident_id' from recent_events where account_id = 1  limit 5;

and the results is only 10ms slower than on the hosting server directly which is what we are expecting.  It's like the casting of the hstore column is just not playing nice.

Thanks


On Tue, Apr 16, 2019 at 6:56 PM Ian Barwick <[hidden email]> wrote:
On 4/17/19 9:21 AM, Jorge Torralba wrote:
> We setup a FDW server for a large table that has an hstore data type as on eof the coums.
>
> When we run the query directly on the FDW server
>
> select * from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 10;
>
> The response is 2ms . which is awesome.
>
> When we run it from the remote server where we have the server defined, the query never returns.
>
> Our server definition is as follows.
>
> (host 'fdwserver', dbname 'mydb', port '5432', fetch_size '1000', use_remote_estimate 'true', extensions 'hstore')
>
> We have been playing with this for hours to no avail.
>
> Any clues ?

At a guess the FDW is not able to push down the entire query to the FDW server
and might be doing something like fetching all the remote rows and casting
them locally, or something along those lines.

It would help to provide the following info:

- PostgreSQL versions of both servers
- "EXPLAIN" output of the query on both servers
- "EXPLAIN ANALYZE" output of the query executed directly on the FDW server
- approximate number of rows in the "recent_events" table


Regards

Ian Barwick

--
  Ian Barwick                   https://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Thanks,

Jorge Torralba
----------------------------

Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Reply | Threaded
Open this post in threaded view
|

Re: Performance killed with FDW when using CAST.

Ron-2


On 4/16/19 9:40 PM, Jorge Torralba wrote:

> Thanks for taking the time to look.
>
> Both servers are on ....
>
>    version
> ----------------------------------------------------------------------------------------------------------------------------------
>  PostgreSQL 11.2 (Debian 11.2-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled
> by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
>
> I have reduced the query to select only two columns for simplicity with
> the same results.
>
> The table contain about 50 million rows
>
> On the server hosting the table ..... Hiding private data.
>
> alertsdb_recent_events=#  select id, attributes -> 'account_incident_id'
> from recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
>                   id                  | ?column?
> --------------------------------------+----------
>  ***** | 2617116
>  ***** | 2617116
>  ***** | 2617116
>  ***** | 2617116
>  ***** | 2617116
> (5 rows)
>
> Time: 82.868 ms
>
>
> # explain  select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
>      QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.56..3.77 rows=5 width=48)
>    ->  Index Scan using
> recent_event_account_id_attributes_account_incident_id_idx on
> recent_events  (cost=0.56..56.31 rows=87 width=48)
>          Index Cond: ((account_id = 1) AND (((attributes ->
> 'account_incident_id'::text))::integer = 2617116))
> (3 rows)
>
> Time: 71.907 ms
>
> # explain  analyze select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
>                           QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.56..3.77 rows=5 width=48) (actual time=0.019..0.025 rows=5
> loops=1)
>    ->  Index Scan using
> recent_event_account_id_attributes_account_incident_id_idx on
> recent_events  (cost=0.56..56.31 rows=87 width=48) (actual
> time=0.018..0.023 rows=5 loops=1)
>          Index Cond: ((account_id = 1) AND (((attributes ->
> 'account_incident_id'::text))::integer = 2617116))
>  Planning Time: 0.124 ms
>  Execution Time: 0.038 ms
> (5 rows)
>
> Time: 80.782 ms
>
>
> On the server that communicates with the FDW server .....
>
> # explain  select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
>                                      QUERY PLAN
> -------------------------------------------------------------------------------------
>  Limit  (cost=100.00..426.27 rows=5 width=48)
>    ->  Foreign Scan on recent_events (cost=100.00..6663659.61 rows=102117
> width=48)
>          Filter: (((attributes -> 'account_incident_id'::text))::integer =
> 2617116)
> (3 rows)
>
> Time: 85.276 ms
>
>
> # explain  analyze select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
>     QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=100.00..426.27 rows=5 width=48) (actual
> time=21242.087..26387.257 rows=5 loops=1)
>    ->  Foreign Scan on recent_events (cost=100.00..6663703.90 rows=102117
> width=48) (actual time=21242.086..26387.252 rows=5 loops=1)
>          Filter: (((attributes -> 'account_incident_id'::text))::integer =
> 2617116)
>          Rows Removed by Filter: 724249
>  Planning Time: 1.164 ms
>  Execution Time: 26387.851 ms
> (6 rows)
>
> Time: 26528.113 ms (00:26.528)
>
> The query killer is the ...
>
> AND (attributes -> 'account_incident_id')::integer = 2617116
>
> Run the query this way ...
>
> select id, attributes -> 'account_incident_id' from recent_events where
> account_id = 1  limit 5;
>
> and the results is only 10ms slower than on the hosting server directly
> which is what we are expecting.  It's like the casting of the hstore
> column is just not playing nice.

Casting the left side of a predicate is not recommended.  What if you cast
2617166 to be the same type as attributes -> 'account_incident_id'?


--
Angular momentum makes the world go 'round.


Reply | Threaded
Open this post in threaded view
|

Re: Performance killed with FDW when using CAST.

Jorge Torralba
I tried without casting and still the same problem on the server accessing remote data.

On the server hosting the data, even with casting the query executes in 78ms.

fty, I have run analyze on both the servers as well.

Thanks

JT

On Tue, Apr 16, 2019 at 8:00 PM Ron <[hidden email]> wrote:


On 4/16/19 9:40 PM, Jorge Torralba wrote:
> Thanks for taking the time to look.
>
> Both servers are on ....
>
>    version
> ----------------------------------------------------------------------------------------------------------------------------------
>  PostgreSQL 11.2 (Debian 11.2-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled
> by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
>
> I have reduced the query to select only two columns for simplicity with
> the same results.
>
> The table contain about 50 million rows
>
> On the server hosting the table ..... Hiding private data.
>
> alertsdb_recent_events=#  select id, attributes -> 'account_incident_id'
> from recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
>                   id                  | ?column?
> --------------------------------------+----------
>  ***** | 2617116
>  ***** | 2617116
>  ***** | 2617116
>  ***** | 2617116
>  ***** | 2617116
> (5 rows)
>
> Time: 82.868 ms
>
>
> # explain  select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
>      QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.56..3.77 rows=5 width=48)
>    ->  Index Scan using
> recent_event_account_id_attributes_account_incident_id_idx on
> recent_events  (cost=0.56..56.31 rows=87 width=48)
>          Index Cond: ((account_id = 1) AND (((attributes ->
> 'account_incident_id'::text))::integer = 2617116))
> (3 rows)
>
> Time: 71.907 ms
>
> # explain  analyze select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
>                           QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.56..3.77 rows=5 width=48) (actual time=0.019..0.025 rows=5
> loops=1)
>    ->  Index Scan using
> recent_event_account_id_attributes_account_incident_id_idx on
> recent_events  (cost=0.56..56.31 rows=87 width=48) (actual
> time=0.018..0.023 rows=5 loops=1)
>          Index Cond: ((account_id = 1) AND (((attributes ->
> 'account_incident_id'::text))::integer = 2617116))
>  Planning Time: 0.124 ms
>  Execution Time: 0.038 ms
> (5 rows)
>
> Time: 80.782 ms
>
>
> On the server that communicates with the FDW server .....
>
> # explain  select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
>                                      QUERY PLAN
> -------------------------------------------------------------------------------------
>  Limit  (cost=100.00..426.27 rows=5 width=48)
>    ->  Foreign Scan on recent_events (cost=100.00..6663659.61 rows=102117
> width=48)
>          Filter: (((attributes -> 'account_incident_id'::text))::integer =
> 2617116)
> (3 rows)
>
> Time: 85.276 ms
>
>
> # explain  analyze select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
>     QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=100.00..426.27 rows=5 width=48) (actual
> time=21242.087..26387.257 rows=5 loops=1)
>    ->  Foreign Scan on recent_events (cost=100.00..6663703.90 rows=102117
> width=48) (actual time=21242.086..26387.252 rows=5 loops=1)
>          Filter: (((attributes -> 'account_incident_id'::text))::integer =
> 2617116)
>          Rows Removed by Filter: 724249
>  Planning Time: 1.164 ms
>  Execution Time: 26387.851 ms
> (6 rows)
>
> Time: 26528.113 ms (00:26.528)
>
> The query killer is the ...
>
> AND (attributes -> 'account_incident_id')::integer = 2617116
>
> Run the query this way ...
>
> select id, attributes -> 'account_incident_id' from recent_events where
> account_id = 1  limit 5;
>
> and the results is only 10ms slower than on the hosting server directly
> which is what we are expecting.  It's like the casting of the hstore
> column is just not playing nice.

Casting the left side of a predicate is not recommended.  What if you cast
2617166 to be the same type as attributes -> 'account_incident_id'?


--
Angular momentum makes the world go 'round.




--
Thanks,

Jorge Torralba
----------------------------

Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Reply | Threaded
Open this post in threaded view
|

Re: Performance killed with FDW when using CAST.

Laurenz Albe
In reply to this post by Jorge Torralba
Jorge Torralba wrote:

> We setup a FDW server for a large table that has an hstore data type as on eof the coums.
>
> When we run the query directly on the FDW server
>
> select * from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 10;
>
> The response is 2ms . which is awesome.
>
> When we run it from the remote server where we have the server defined, the query never returns.
>
> Our server definition is as follows.
>
> (host 'fdwserver', dbname 'mydb', port '5432', fetch_size '1000', use_remote_estimate 'true', extensions 'hstore')

After debugging into this, it seems that the hstore operator -> cannot
be pushed down because of collation problems.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com



Reply | Threaded
Open this post in threaded view
|

Re: Performance killed with FDW when using CAST.

Thomas Kellerer
Laurenz Albe schrieb am 17.04.2019 um 07:03:

>> We setup a FDW server for a large table that has an hstore data type as on eof the coums.
>>
>> When we run the query directly on the FDW server
>>
>> select * from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 10;
>>
>> The response is 2ms . which is awesome.
>>
>> When we run it from the remote server where we have the server defined, the query never returns.
>>
>> Our server definition is as follows.
>>
>> (host 'fdwserver', dbname 'mydb', port '5432', fetch_size '1000', use_remote_estimate 'true', extensions 'hstore')
>
> After debugging into this, it seems that the hstore operator -> cannot
> be pushed down because of collation problems.

Do you happen to know if the JSONB operator -> (or ->>) can be pushed down?

Although I realize it would be a major change in the database design, maybe that would be a solution for Jorge.

Thomas





Reply | Threaded
Open this post in threaded view
|

Re: Performance killed with FDW when using CAST.

Tom Lane-2
Thomas Kellerer <[hidden email]> writes:
> Laurenz Albe schrieb am 17.04.2019 um 07:03:
>> After debugging into this, it seems that the hstore operator -> cannot
>> be pushed down because of collation problems.

> Do you happen to know if the JSONB operator -> (or ->>) can be pushed down?

A bit of experimentation says that jsonb -> integer can be pushed down,
but not any of the variants involving a text fieldname or result.
Presumably this is because of the heuristic that says not to push down
a collation that didn't arise from the remote column.  jsonb -> text
isn't really collation-sensitive, of course, but postgres_fdw has no
good way to know that, since the core code (outside of that operator
itself) doesn't know it either.  The assumption is that any function
with at least one input of a collatable type is collation-sensitive.
Here you're getting a default collation from the text literal, and
postgres_fdw doesn't want to assume that the remote end would choose
the same collation.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Performance killed with FDW when using CAST.

Jorge Torralba
Thanks for the info. Is there a way to match the collations and make it work?

On Wed, Apr 17, 2019, 07:02 Tom Lane <[hidden email]> wrote:
Thomas Kellerer <[hidden email]> writes:
> Laurenz Albe schrieb am 17.04.2019 um 07:03:
>> After debugging into this, it seems that the hstore operator -> cannot
>> be pushed down because of collation problems.

> Do you happen to know if the JSONB operator -> (or ->>) can be pushed down?

A bit of experimentation says that jsonb -> integer can be pushed down,
but not any of the variants involving a text fieldname or result.
Presumably this is because of the heuristic that says not to push down
a collation that didn't arise from the remote column.  jsonb -> text
isn't really collation-sensitive, of course, but postgres_fdw has no
good way to know that, since the core code (outside of that operator
itself) doesn't know it either.  The assumption is that any function
with at least one input of a collatable type is collation-sensitive.
Here you're getting a default collation from the text literal, and
postgres_fdw doesn't want to assume that the remote end would choose
the same collation.

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Performance killed with FDW when using CAST.

Jorge Torralba
In reply to this post by Tom Lane-2
I made a copy of the table and altered the column from hstore to jsonb.

Ran the following query with the same performance issues. 

SELECT id, attributes FROM xxx WHERE account_id = 1 AND timestamp >= '2019-01-16 22:34:28.584' AND  CAST(attributes ->> 'account_incident_id' as integer) = 2617116  order by timestamp desc limit 10;

Things to Note.

Remove the CAST on the attributes column and the order by results in quick performance

Add order by performance dies

Add CAST without the order by you can go out for dinner and still be waiting for a result set.





On Wed, Apr 17, 2019 at 7:02 AM Tom Lane <[hidden email]> wrote:
Thomas Kellerer <[hidden email]> writes:
> Laurenz Albe schrieb am 17.04.2019 um 07:03:
>> After debugging into this, it seems that the hstore operator -> cannot
>> be pushed down because of collation problems.

> Do you happen to know if the JSONB operator -> (or ->>) can be pushed down?

A bit of experimentation says that jsonb -> integer can be pushed down,
but not any of the variants involving a text fieldname or result.
Presumably this is because of the heuristic that says not to push down
a collation that didn't arise from the remote column.  jsonb -> text
isn't really collation-sensitive, of course, but postgres_fdw has no
good way to know that, since the core code (outside of that operator
itself) doesn't know it either.  The assumption is that any function
with at least one input of a collatable type is collation-sensitive.
Here you're getting a default collation from the text literal, and
postgres_fdw doesn't want to assume that the remote end would choose
the same collation.

                        regards, tom lane




--
Thanks,

Jorge Torralba
----------------------------

Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.