libpq pipelining

libpq pipelining

Samuel Williams
Hello,

Using the asynchronous interface of libpq, is it possible to pipeline
multiple queries?

i.e.

PQsendQuery(query1)
PQsendQuery(query2)

followed by

query1_results = PQgetResult(...)
query2_results = PQgetResult(...)

I tried it, but got an "another command is already in progress" error.

So, maybe it's not supported, or maybe I'm doing something wrong.

Thanks
Samuel



Re: libpq pipelining

David G Johnston
On Friday, June 26, 2020, Samuel Williams <[hidden email]> wrote:

The documentation seems to leave zero ambiguity:

 After successfully calling PQsendQuery, call PQgetResult one or more times to obtain the results. PQsendQuery cannot be called again (on the same connection) until PQgetResult has returned a null pointer, indicating that the command is done.

David J.


Re: libpq pipelining

Samuel Williams
Thanks David,

You are correct.

I was giving an example of what I was hoping to achieve, not what I
expected to work with the current interface.

I found some past discussion relating to batch processing, which
appears to support some kind of pipelining:

https://2ndquadrant.github.io/postgres/libpq-batch-mode.html

However, it seems to have been abandoned.

Kind regards,
Samuel

On Sat, 27 Jun 2020 at 16:15, David G. Johnston
<[hidden email]> wrote:


Re: libpq pipelining

David G Johnston
On Friday, June 26, 2020, Samuel Williams <[hidden email]> wrote:

What about, as it says, sending multiple statements in a single sendQuery and then polling for multiple results?

David J.


Re: libpq pipelining

Samuel Williams
> What about, as it says, sending multiple statements in a single sendQuery and then polling for multiple results?

I tried this, and even in single-row streaming mode, I found that
there are cases where the results would not be streamed until all the
queries were sent.

From the user's point of view, they may issue multiple queries in a
loop without caring about the results, so pipeline/batch processing is
ideal for avoiding one round trip per loop iteration; if database
access is slow, this can be a significant source of latency.

Kind regards,
Samuel



Re: libpq pipelining

David G Johnston
On Friday, June 26, 2020, Samuel Williams <[hidden email]> wrote:

I don't have any insight into the bigger picture, but I'd concur that no other option is documented, so what you desire is not possible.

David J.


Re: libpq pipelining

Samuel Williams
Here is a short example:

https://gist.github.com/ioquatix/2f08f78699418f65971035785c80cf18

It makes 10 queries in one PQsendQuery call and sets single-row mode,
but all the results come back at once, as shown by the timestamps.

Next I'm planning to investigate streaming large recordsets to see if
it works better/incrementally.



Re: libpq pipelining

Tom Lane
Samuel Williams <[hidden email]> writes:
> Here is a short example:
> https://gist.github.com/ioquatix/2f08f78699418f65971035785c80cf18
> It makes 10 queries in one "PQsendQuery" and sets single row mode. But
> all the results come back at once as shown by the timestamps.

That looks to be less about what libpq will do than what the Ruby
interface code will do.

The volume of return data may also be an issue.  I don't think the
backend will flush data out to the client except when it (a) reaches
an idle state or (b) fills the output buffer.  Ten occurrences of
a short query result aren't gonna be enough for (b) --- from memory,
that buffer is probably 8KB.

                        regards, tom lane



Re: libpq pipelining

Samuel Williams
I think libmariadb has a nicer interface for this.

Essentially what you do is send your query, and then read a result set
(one result set per query), and then you stream individual rows using:

mysql_fetch_row_start
mysql_fetch_row_cont

Those methods don't seem to have an equivalent in libpq: you can use
PQgetResult, but it buffers all the rows. Using single-row mode
produces many results for each query (which seems like a big
overhead). The difference between this and MariaDB is that
mysql_fetch_row still operates within one logical set of results,
whereas single-row mode breaks the single logical result set into
lots of individual results.

Maybe the statement about efficiency is incorrect, but it would be
nice if you could incrementally stream a single result set more
easily.


On Sun, 28 Jun 2020 at 02:40, Tom Lane <[hidden email]> wrote:

>
> Samuel Williams <[hidden email]> writes:
> > Here is a short example:
> > https://gist.github.com/ioquatix/2f08f78699418f65971035785c80cf18
> > It makes 10 queries in one "PQsendQuery" and sets single row mode. But
> > all the results come back at once as shown by the timestamps.
>
> That looks to be less about what libpq will do than what the Ruby
> interface code will do.
>
> The volume of return data may also be an issue.  I don't think the
> backend will flush data out to the client except when it (a) reaches
> an idle state or (b) fills the output buffer.  Ten occurrences of
> a short query result aren't gonna be enough for (b) --- from memory,
> that buffer is probably 8KB.
>
>                         regards, tom lane



Re: libpq pipelining

Stephen Frost
Greetings,

* Samuel Williams ([hidden email]) wrote:
> Here is a short example:
>
> https://gist.github.com/ioquatix/2f08f78699418f65971035785c80cf18
>
> It makes 10 queries in one "PQsendQuery" and sets single row mode. But
> all the results come back at once as shown by the timestamps.

If you have 10 queries that you want to make in a given transaction
and you care about latency, then the best option is really to wrap
it all in a single PL/pgSQL function on the server side and make one call.

> Next I'm planning to investigate streaming large recordsets to see if
> it works better/incrementally.

If you want to stream large data sets to/from PG, you should consider
using COPY.

Thanks,

Stephen


Re: libpq pipelining

Tom Lane
Samuel Williams <[hidden email]> writes:
> Those methods don't seem to have an equivalent in libpq - you can use
> PQgetResult but it buffers all the rows. Using single row mode results
> in many results for each query (seems like a big overhead).

Have you got any actual evidence for that?  Sure, the overhead is
more than zero, but does it mean anything in comparison to the other
costs of data transmission?

> Maybe the statement about efficiency is incorrect, but it would be
> nice if you could incrementally stream a single result set more
> easily.

More easily than what?  If we did not construct a PGresult then we would
need some other abstraction for access to the returned row, dealing with
error cases, etc etc.  That would mean a lot of very duplicative API code
in libpq, and a painful bunch of adjustments in client code.

                        regards, tom lane



Re: libpq pipelining

Samuel Williams
Tom, I'm implementing a small abstraction layer for event-driven
result streaming on PostgreSQL and MariaDB for Ruby, and I'll endeavor
to report back with some numbers once I have enough of it working to
benchmark something meaningful.

Thanks for your patience and help.

Kind regards,
Samuel

On Tue, 30 Jun 2020 at 02:06, Tom Lane <[hidden email]> wrote:

>
> Samuel Williams <[hidden email]> writes:
> > Those methods don't seem to have an equivalent in libpq - you can use
> > PQgetResult but it buffers all the rows. Using single row mode results
> > in many results for each query (seems like a big overhead).
>
> Have you got any actual evidence for that?  Sure, the overhead is
> more than zero, but does it mean anything in comparison to the other
> costs of data transmission?
>
> > Maybe the statement about efficiency is incorrect, but it would be
> > nice if you could incrementally stream a single result set more
> > easily.
>
> More easily than what?  If we did not construct a PGresult then we would
> need some other abstraction for access to the returned row, dealing with
> error cases, etc etc.  That would mean a lot of very duplicative API code
> in libpq, and a painful bunch of adjustments in client code.
>
>                         regards, tom lane