Performance on JSONB select

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

Performance on JSONB select

PegoraroF10
select * from MyTable where
  ((JsonBField->>'status'='descartada' and
To_Date(JsonBField->'descartada'->>'data','yyyy-mm-dd') > Current_Date) or
   (JsonBField->>'status'='contrato' and
To_Date(JsonBField->'contrato'->>'data','yyyy-mm-dd') > Current_Date-7) or
   (JsonBField->>'status'='naoatribuido'));

Considering this table has a million records, I would like to use a proper
index, but how can I use an index when using operator >.




--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


Reply | Threaded
Open this post in threaded view
|

Re: Performance on JSONB select

Michael Lewis
Much of indexing strategy depends on knowing the data like how many distinct values and what the distribution is like. Is JsonBField->>'status' always set? Are those three values mentioned in this query common or rare? Can you re-write this query to avoid using an OR in the where clause? Are you just wanting to add a GIN index for the jsonb paths? Or do you want indexed like below that are a bit stylized to this query?

CREATE INDEX idx_MyTable_status USING btree( JsonBField->>'status' );
CREATE INDEX idx_MyTable_descartada_date USING btree( To_Date(JsonBField->'descartada'->>'data','yyyy-mm-dd') );
CREATE INDEX idx_MyTable_contrato_date USING btree( To_Date(JsonBField->'contrato'->>'data','yyyy-mm-dd') );
Reply | Threaded
Open this post in threaded view
|

Re: Performance on JSONB select

PegoraroF10
ok, my select performed better but I had to create 8 indices to speed up my
query.

I would love to create just one index using GIN(JsonBField jsonb_ops) but
using version 11 I cannot use operators like > and  <.

I see on docs that version 12 has jsonpath Filter Expression Elements and
they include > and <. So, the question is, will be possible to have just one
index and use it with "jsonpath Filter Expression Elements ?



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


Reply | Threaded
Open this post in threaded view
|

Re: Performance on JSONB select

Stephen Frost
In reply to this post by Michael Lewis
Greetings,

* Michael Lewis ([hidden email]) wrote:
> Much of indexing strategy depends on knowing the data like how many
> distinct values and what the distribution is like. Is JsonBField->>'status'
> always set? Are those three values mentioned in this query common or rare?
> Can you re-write this query to avoid using an OR in the where clause? Are
> you just wanting to add a GIN index for the jsonb paths? Or do you want
> indexed like below that are a bit stylized to this query?

If you know a field is going to always be there, you're better off, by
far, by just having a regular column for that value and a straight up
btree for it.  This saves a significant amount of space and makes it
much easier to index and work with.

Thanks,

Stephen

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Performance on JSONB select

PegoraroF10
In reply to this post by PegoraroF10
I think I solved my problem. I didn´t know it was possible but I´ve created
an almost dynamic index, because it´ll index for status and corresponding
datevalue of an object with that status value.

Postgres, I love you.
 
Just one B-Tree index with 2 fields, Status and DateTime of that respective
Status object. My Json has always a status and a respective object of that
status with other values. So I know if it has that status it has a Date
Value or not on its correspondent object.

{
    "status": "visitadescartada",
    "contrato": {},
    "atribuido": {"datevalue": "2019-09-05 14:47:11.149095-03"},
    "trabalhando": {},
    "visitaagendada": {"datevalue": "2019-09-05
15:06:24.255548-03","caption": "Agendado"},
    "visitadescartada": {"datevalue": "2019-09-12
11:47:17.45782-03","caption": "Desagendado"},
    "digitacaodescartada": {}
}

create index IndexByStatusAndDate on MyTable (
(JsonBField->>$$status$$),
castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue'))
where (JsonBField ? $$status$$);

select * from MyTable where (JsonBField ? $$status$$) and
case when JsonBField->>$$status$$=$$cadastrodescartado$$ then
castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue')
> castimmutabletimestamp($$2019-10-01$$)
     when JsonBField->>$$status$$=$$visitadescartada$$ then
castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue')
> castimmutabletimestamp($$2019-10-01$$)
     when JsonBField->>$$status$$=$$contrato$$ then
castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'data') >
castimmutabletimestamp($$2019-10-01$$)
     when JsonBField->>$$status$$=$$naoatribuido$$ then True end;

And performance now is great because I´m using both fields on index.




--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


Reply | Threaded
Open this post in threaded view
|

Re: Performance on JSONB select

Pavel Stehule
In reply to this post by PegoraroF10
Hi

so 5. 10. 2019 v 13:34 odesílatel PegoraroF10 <[hidden email]> napsal:
I think I solved my problem. I didn´t know it was possible but I´ve created
an almost dynamic index, because it´ll index for status and corresponding
datevalue of an object with that status value.

you created multicolumn functional index (there are no any dynamic index :))

Pavel
 

Postgres, I love you.

Just one B-Tree index with 2 fields, Status and DateTime of that respective
Status object. My Json has always a status and a respective object of that
status with other values. So I know if it has that status it has a Date
Value or not on its correspondent object.

{
    "status": "visitadescartada",
    "contrato": {},
    "atribuido": {"datevalue": "2019-09-05 14:47:11.149095-03"},
    "trabalhando": {},
    "visitaagendada": {"datevalue": "2019-09-05
15:06:24.255548-03","caption": "Agendado"},
    "visitadescartada": {"datevalue": "2019-09-12
11:47:17.45782-03","caption": "Desagendado"},
    "digitacaodescartada": {}
}

create index IndexByStatusAndDate on MyTable (
(JsonBField->>$$status$$),
castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue'))
where (JsonBField ? $$status$$);

select * from MyTable where (JsonBField ? $$status$$) and
case when JsonBField->>$$status$$=$$cadastrodescartado$$ then
castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue')
> castimmutabletimestamp($$2019-10-01$$)
     when JsonBField->>$$status$$=$$visitadescartada$$ then
castimmutabletimestamp(JsonBField->(JsonBField->>$$status$$)->>'datevalue')
> castimmutabletimestamp($$2019-10-01$$)
     when JsonBField->>$$status$$=$$contrato$$ then
castimmutabletimestamp(funilvendas->(JsonBField->>$$status$$)->>'data') >
castimmutabletimestamp($$2019-10-01$$)
     when JsonBField->>$$status$$=$$naoatribuido$$ then True end;

And performance now is great because I´m using both fields on index.



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


Reply | Threaded
Open this post in threaded view
|

Re: Performance on JSONB select

PegoraroF10
I told it was almost dynamic because it uses DateValue from an object or
another, depending on value of staus key.



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


Reply | Threaded
Open this post in threaded view
|

Re: Performance on JSONB select

Fabrízio de Royes Mello-2

Em sáb, 5 de out de 2019 às 11:49, PegoraroF10 <[hidden email]> escreveu:
I told it was almost dynamic because it uses DateValue from an object or
another, depending on value of staus key.


Actually it’s named partial index.

Regards,
--
   Fabrízio de Royes Mello         Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Reply | Threaded
Open this post in threaded view
|

Re: Performance on JSONB select

PegoraroF10
No, no, no, partial index is the where clause of it and I´m not talking about that.
I´m talking about datevalue, which will be used one or another, depending on status value

This record iindex will be status and datevalue from "visitadescartada" object
{
  "status": "visitadescartada",
  "atribuido": {"datevalue": "2019-09-05 14:47:11"},
  "visitadescartada": {"datevalue": "2019-09-12 11:47:17"},
  "digitacaodescartada": {}
}

And here the index record will be status and datevalue from "atribuido" object
{
  "status": "atribuido",
  "atribuido": {"datevalue": "2019-09-05 14:47:11"},
  "visitadescartada": {"datevalue": "2019-09-12 11:47:17"},
  "digitacaodescartada": {}
}

Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Reply | Threaded
Open this post in threaded view
|

Re: Performance on JSONB select

Michael Lewis
My Json has always a status

Why declare the "where" clause when creating the index? It would not seem needed if status is always set and so your index will reference all rows in the table.

Thanks for sharing the trick of having the second column in the index determine the key based on the first column. I don't know if/when I might need this, but an interesting solution. Generically, it seems like the below-

create index idx_mytable_jsonb_dependent_fields on public.mytable ( (JsonBField->>'primary_field_to_filter_on'), JsonBField->(JsonBField->>'primary_field_to_filter_on')->>'secondary_field' );
Reply | Threaded
Open this post in threaded view
|

Re: Performance on JSONB select

PegoraroF10
Sorry, I told you that the status was always populated but not, I need that
filter for the index because not all records are using that json structure.
When json is not null then yes, Status is always there.
I have maybe 20 or 25% of records having json populated, so, I really need
that filter.

Yes, that solution is fine. Selectivity is fine, speed is fine and the index
is "almost dynamic"



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html