get counts of multiple field values in a jsonb column

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

get counts of multiple field values in a jsonb column

Martin Norbäck Olivers
HI!
I'm using postgres to store unstructured fields in a jsonb column. I also have a quite complicated query on the table, joining it with other tables etc, and given that query I want to get a count of all the values for a number of keys in the data.

Currently I'm doing one query for each key, like this
select data->>'field1', count(*) from COMPLICATED QUERY group by 1
select data->>'field2', count(*) from COMPLICATED QUERY group by 1
...
select data->>'fieldN', count(*) from COMPLICATED QUERY group by 1

field1, field2, ..., fieldN are known at query time.

But as the number of keys I want to count for increases, so does the time it takes to run all these queries. I think the main problem is that COMPLICATED QUERY is complicated and takes time to run each time. I would very much like to run only one query that counts all the values of all the fields, but I'm not quite sure how to do that. I'm looking at all the aggregation functions but can't quite find one that suits this purpose.

I would love to get some input on ways to make this faster.

Regards,

Martin
Reply | Threaded
Open this post in threaded view
|

Re: get counts of multiple field values in a jsonb column

David G Johnston
On Sat, Oct 17, 2020 at 8:00 AM Martin Norbäck Olivers <[hidden email]> wrote:
I'm looking at all the aggregation functions but can't quite find one that suits this purpose.

I would love to get some input on ways to make this faster.

If you want counts the count function is your goto aggregate function.  What you are missing is performing conditional counting.

SELECT fld, count(*) FILTER (WHERE expression) FROM query GROUP BY fld;


David J.

lup
Reply | Threaded
Open this post in threaded view
|

Re: get counts of multiple field values in a jsonb column

lup
In reply to this post by Martin Norbäck Olivers


> On Oct 17, 2020, at 9:00 AM, Martin Norbäck Olivers <[hidden email]> wrote:
>
> HI!
> I'm using postgres to store unstructured fields in a jsonb column. I also have a quite complicated query on the table, joining it with other tables etc, and given that query I want to get a count of all the values for a number of keys in the data.
>
> Currently I'm doing one query for each key, like this
> select data->>'field1', count(*) from COMPLICATED QUERY group by 1
> select data->>'field2', count(*) from COMPLICATED QUERY group by 1
> ...
> select data->>'fieldN', count(*) from COMPLICATED QUERY group by 1
>
> field1, field2, ..., fieldN are known at query time.
>
> But as the number of keys I want to count for increases, so does the time it takes to run all these queries. I think the main problem is that COMPLICATED QUERY is complicated and takes time to run each time. I would very much like to run only one query that counts all the values of all the fields, but I'm not quite sure how to do that. I'm looking at all the aggregation functions but can't quite find one that suits this purpose.
>
> I would love to get some input on ways to make this faster.
>
> Regards,
>
> Martin

If the “COMPLICATED QUERY” is repeated verbatim each time, run it once into a temp table and do the counts against the temp table.

create temp table complicate as COMPLICATED_QUERY;
/*maybe index*/
select data->>'field1', count(*) from COMPLICATED QUERY group by 1

You can put the whole thing in a function and drop the temp table each run.


A CTE based on COMPLICATED_QUERY might work too.

Reply | Threaded
Open this post in threaded view
|

Re: get counts of multiple field values in a jsonb column

Simon Riggs
In reply to this post by Martin Norbäck Olivers
On Sat, 17 Oct 2020 at 16:00, Martin Norbäck Olivers <[hidden email]> wrote:

>
> HI!
> I'm using postgres to store unstructured fields in a jsonb column. I also have a quite complicated query on the table, joining it with other tables etc, and given that query I want to get a count of all the values for a number of keys in the data.
>
> Currently I'm doing one query for each key, like this
> select data->>'field1', count(*) from COMPLICATED QUERY group by 1
> select data->>'field2', count(*) from COMPLICATED QUERY group by 1
> ...
> select data->>'fieldN', count(*) from COMPLICATED QUERY group by 1
>
> field1, field2, ..., fieldN are known at query time.
>
> But as the number of keys I want to count for increases, so does the time it takes to run all these queries. I think the main problem is that COMPLICATED QUERY is complicated and takes time to run each time. I would very much like to run only one query that counts all the values of all the fields, but I'm not quite sure how to do that. I'm looking at all the aggregation functions but can't quite find one that suits this purpose.
>
> I would love to get some input on ways to make this faster.

You want something like this...

select key, count(*) from (select (jsonb_each_text(data)).key from
COMPLICATED QUERY) as kv group by 1;

--
Simon Riggs                http://www.EnterpriseDB.com/