The wrong (?) query plan for queries with remote (postgres_fdw) tables

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

The wrong (?) query plan for queries with remote (postgres_fdw) tables

Vitaly Baranovsky

Hello,


I have an inefficient query execution for queries with postgres_fdw.


I have an ineffective query with remote tables (postgres_fdw) that works for about 1 second. The same query with local tables (with the same data) instead of foreign ones has execution time less than 5 ms. So, the difference is almost 200 times.


So, the query works with 3 tables. 2 of them are remote, 1 is a local one.


Remote db (works on the same Postgres instance):

foreign_table

foreign_filter_table


Local db:

local_table


The idea that local_table and foreign_table have the same structure. They have 2 columns: primary key (primary_uuid) and foreign key (fkey_uuid).


Also, we have foreign_filter_table, that is a master table for 2 tables mentioned above. In addition to the primary key (primary_uuid), it also has a column filter_uuid.


What is the aim of a query: to filter the master table by filter_uuid, and then select corresponding data from unioned (union all) local_table and foreign_table, and make pagination sorted by the foreign key.


The master table (foreign_filter_table) contains 100K records. Slave tables contain about 100K records each, where they refer to about 5% of master table (each slave table contains 20 rows for 5% of master table rows).


Note, use_remote_estimate is true for the foreign server.


Logically, query execution should be: select rows from the master query and make merge join between them. It works this way when I use local tables instead of remote ones.


But with foreign tables it union child tables first, and then make a nested loop for each row with the mater table. As a result, the query becomes very slow...


So, the query is:

select *

from

(select * from local_table lt

union all

select * from foreign_server.foreign_table ft) a

join foreign_server.foreign_filter_table on a.fkey_uuid = foreign_server.foreign_filter_table.primary_uuid

where foreign_server.foreign_filter_table.filter_uuid between '56c77b02-8309-42f1-ae02-8d6922ea7dba' and '67c77b02-8309-42f1-ae02-8d6922ea7dba'

order by a.fkey_uuid

limit 10 offset 90


A query plan is:

"Limit  (cost=527.23..563.45 rows=10 width=80) (actual time=915.919..920.302 rows=10 loops=1)"

"  Output: lt.fkey_uuid, lt.primary_uuid, foreign_filter_table.primary_uuid, foreign_filter_table.filter_uuid, lt.fkey_uuid"

"  Buffers: shared hit=949"

"  -> Nested Loop  (cost=201.28..20859148.42 rows=5759398 width=80) (actual time=118.138..920.282 rows=100 loops=1)"

"        Output: lt.fkey_uuid, lt.primary_uuid, foreign_filter_table.primary_uuid, foreign_filter_table.filter_uuid, lt.fkey_uuid"

"        Buffers: shared hit=949"

"        -> Merge Append  (cost=100.85..19272.58 rows=192108 width=32) (actual time=1.133..16.119 rows=1864 loops=1)"

"              Sort Key: lt.fkey_uuid"

"              Buffers: shared hit=949"

"              -> Index Scan using fkey_uuid_idx on public.local_table lt  (cost=0.42..8937.22 rows=96054 width=32) (actual time=0.021..5.159 rows=940 loops=1)"

"                    Output: lt.fkey_uuid, lt.primary_uuid"

"                    Buffers: shared hit=949"

"              -> Foreign Scan on foreign_server.foreign_table ft  (cost=100.42..8414.27 rows=96054 width=32) (actual time=1.109..9.756 rows=925 loops=1)"

"                    Output: ft.fkey_uuid, ft.primary_uuid"

"                    Remote SQL: SELECT fkey_uuid, primary_uuid FROM public.foreign_table ORDER BY fkey_uuid ASC NULLS LAST"

"        -> Foreign Scan on foreign_server.foreign_filter_table  (cost=100.43..108.47 rows=1 width=32) (actual time=0.380..0.380 rows=0 loops=1864)"

"              Output: foreign_filter_table.primary_uuid, foreign_filter_table.filter_uuid"

"              Remote SQL: SELECT primary_uuid, filter_uuid FROM public.foreign_filter_table WHERE ((filter_uuid >= '56c77b02-8309-42f1-ae02-8d6922ea7dba'::uuid)) AND ((filter_uuid <= '67c77b02-8309-42f1-ae02-8d6922ea7dba'::uuid)) AND (($1::uuid = primary_u (...)"

"Planning Time: 1.825 ms"

"Execution Time: 920.617 ms"


But, when I do the same locally on a remote database with a query:

select *

from

(select * from foreign_table ft) a

join foreign_filter_table on a.fkey_uuid = foreign_filter_table.primary_uuid

where foreign_filter_table.filter_uuid between '57c77b02-8309-42f1-ae02-8d6922ea7dba' and '67c77b02-8309-42f1-ae02-8d6922ea7dba'

order by a.fkey_uuid

limit 10 offset 90


I get a query plan:

"Limit  (cost=248.72..272.37 rows=10 width=80) (actual time=4.366..4.384 rows=10 loops=1)"

"  Output: ft.fkey_uuid, ft.primary_uuid, foreign_filter_table.primary_uuid, foreign_filter_table.filter_uuid, ft.fkey_uuid"

"  -> Merge Join  (cost=35.91..13665.71 rows=5764 width=80) (actual time=0.558..4.378 rows=100 loops=1)"

"        Output: ft.fkey_uuid, ft.primary_uuid, foreign_filter_table.primary_uuid, foreign_filter_table.filter_uuid, ft.fkey_uuid"

"        Inner Unique: true"

"        Merge Cond: (ft.fkey_uuid = foreign_filter_table.primary_uuid)"

"        -> Index Scan using fkey_uuid_idx on public.foreign_table ft  (cost=0.42..6393.19 rows=96054 width=32) (actual time=0.005..2.556 rows=1297 loops=1)"

"              Output: ft.fkey_uuid, ft.primary_uuid"

"        -> Index Scan using filter_table_pk on public.foreign_filter_table  (cost=0.42..6994.83 rows=5996 width=32) (actual time=0.043..1.684 rows=85 loops=1)"

"              Output: foreign_filter_table.primary_uuid, foreign_filter_table.filter_uuid"

"              Filter: ((foreign_filter_table.filter_uuid >= '57c77b02-8309-42f1-ae02-8d6922ea7dba'::uuid) AND (foreign_filter_table.filter_uuid <= '67c77b02-8309-42f1-ae02-8d6922ea7dba'::uuid))"

"              Rows Removed by Filter: 1095"

"Planning Time: 1.816 ms"

"Execution Time: 4.605 ms"



So, why the behavior of the Postgres is like this? How can I optimize such a query? It looks like query optimizer builds an ineffective plan, but maybe I’m wrong


Thank you.


P.S.: scripts are attached

Pagination query problem.sql (2K) Download Attachment