Quantcast

walsender & parallelism

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
21 messages Options
12
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

walsender & parallelism

Andres Freund
Hi,

Since [1] walsender (not receiver as commit message says) can execute
SQL queries.  While doing some testing of [2] I noticed that SQL queries
in walsender get stuck if parallelism is used - I have not investigated
why that is yet, but it surely is an issue.  On first blush I'd suspect
that some signalling is not wired up correctly (cf. am_walsender branches
in PostgresMain() and such).

This'd be easier to look at if the fairly separate facility of allowing
walsender to execute SQL commands had been committed separately, rather
than as part of a fairly large commit of largely unrelated things.

Greetings,

Andres Freund

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7c4f52409a8c7d85ed169bbbc1f6092274d03920
[2] http://archives.postgresql.org/message-id/30242bc6-eca4-b7bb-670e-8d0458753a8c%402ndquadrant.com


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Petr Jelinek-4
On 21/04/17 03:40, Andres Freund wrote:
>
> Since [1] walsender (not receiver as commit message says) can execute
> SQL queries.  While doing some testing of [2] I noticed that SQL queries
> in walsender get stuck if parallelism is used - I have not investigated
> why that is yet, but it surely is an issue.  On first blush I'd suspect
> that some signalling is not wired up correctly (cf. am_walsender branches
> in PostgresMain() and such).

Looks like SIGUSR1 being different is problem here - it's normally used
to . I also noticed that we don't handle SIGINT (query cancel).

I'll write proper patch but can you try to just use
procsignal_sigusr1_handler for SIGUSR1 in walsender.c to see if it fixes
the issue?

BTW while looking at the code, I don't understand why we call
latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just
the SetLatch be enough (they both end up calling sendSelfPipeByte()
eventually)?

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Craig Ringer-3
On 21 April 2017 at 10:20, Petr Jelinek <[hidden email]> wrote:

> On 21/04/17 03:40, Andres Freund wrote:
>>
>> Since [1] walsender (not receiver as commit message says) can execute
>> SQL queries.  While doing some testing of [2] I noticed that SQL queries
>> in walsender get stuck if parallelism is used - I have not investigated
>> why that is yet, but it surely is an issue.  On first blush I'd suspect
>> that some signalling is not wired up correctly (cf. am_walsender branches
>> in PostgresMain() and such).
>
> Looks like SIGUSR1 being different is problem here - it's normally used
> to . I also noticed that we don't handle SIGINT (query cancel).
>
> I'll write proper patch but can you try to just use
> procsignal_sigusr1_handler for SIGUSR1 in walsender.c to see if it fixes
> the issue?

That's what my recovery conflicts for logical decoding on standby
patch does, FWIW.

I haven't found any issues yet..

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Petr Jelinek-4
On 21/04/17 04:32, Craig Ringer wrote:

> On 21 April 2017 at 10:20, Petr Jelinek <[hidden email]> wrote:
>> On 21/04/17 03:40, Andres Freund wrote:
>>>
>>> Since [1] walsender (not receiver as commit message says) can execute
>>> SQL queries.  While doing some testing of [2] I noticed that SQL queries
>>> in walsender get stuck if parallelism is used - I have not investigated
>>> why that is yet, but it surely is an issue.  On first blush I'd suspect
>>> that some signalling is not wired up correctly (cf. am_walsender branches
>>> in PostgresMain() and such).
>>
>> Looks like SIGUSR1 being different is problem here - it's normally used
>> to . I also noticed that we don't handle SIGINT (query cancel).
>>
>> I'll write proper patch but can you try to just use
>> procsignal_sigusr1_handler for SIGUSR1 in walsender.c to see if it fixes
>> the issue?
>
> That's what my recovery conflicts for logical decoding on standby
> patch does, FWIW.
>
> I haven't found any issues yet..
>

Ah I knew I've seen that change somewhere. I thought it was either in my
patch or master which is why I thought it's working fine already.

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Petr Jelinek-4
On 21/04/17 04:37, Petr Jelinek wrote:

> On 21/04/17 04:32, Craig Ringer wrote:
>> On 21 April 2017 at 10:20, Petr Jelinek <[hidden email]> wrote:
>>> On 21/04/17 03:40, Andres Freund wrote:
>>>>
>>>> Since [1] walsender (not receiver as commit message says) can execute
>>>> SQL queries.  While doing some testing of [2] I noticed that SQL queries
>>>> in walsender get stuck if parallelism is used - I have not investigated
>>>> why that is yet, but it surely is an issue.  On first blush I'd suspect
>>>> that some signalling is not wired up correctly (cf. am_walsender branches
>>>> in PostgresMain() and such).
>>>
>>> Looks like SIGUSR1 being different is problem here - it's normally used
>>> to . I also noticed that we don't handle SIGINT (query cancel).
>>>
>>> I'll write proper patch but can you try to just use
>>> procsignal_sigusr1_handler for SIGUSR1 in walsender.c to see if it fixes
>>> the issue?
>>
>> That's what my recovery conflicts for logical decoding on standby
>> patch does, FWIW.
>>
>> I haven't found any issues yet..
>>
>
> Ah I knew I've seen that change somewhere. I thought it was either in my
> patch or master which is why I thought it's working fine already.
>
Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it
does not break anything for existing walsender usage.

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Handle-signals-correctly-in-walsender.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Andres Freund
In reply to this post by Petr Jelinek-4
On 2017-04-21 04:20:26 +0200, Petr Jelinek wrote:
> Looks like SIGUSR1 being different is problem here - it's normally used
> to . I also noticed that we don't handle SIGINT (query cancel).

I think we really need to unify the paths between walsender and normal
backends to a much larger degree.


> BTW while looking at the code, I don't understand why we call
> latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just
> the SetLatch be enough (they both end up calling sendSelfPipeByte()
> eventually)?

Historic raisins - there didn't use to be a SetLatch in
procsignal_sigusr1_handler. That changed when I whacked around catchup &
notify to be based on latches ([1] and following).

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59f71a0d0b56b2df48db4bf1738aece5551f7a47

- Andres


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Andres Freund
In reply to this post by Petr Jelinek-4
Hi,

On 2017-04-22 17:53:19 +0200, Petr Jelinek wrote:
> Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it
> does not break anything for existing walsender usage.

> diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
> index 761fbfa..e9dd886 100644
> --- a/src/backend/replication/logical/launcher.c
> +++ b/src/backend/replication/logical/launcher.c
> @@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const char *subname, Oid userid,
>   BackgroundWorker bgw;
>   BackgroundWorkerHandle *bgw_handle;
>   int i;
> - int slot;
> + int slot = 0;
>   LogicalRepWorker   *worker = NULL;
>   int nsyncworkers = 0;
>   TimestampTz now = GetCurrentTimestamp();

That seems independent and unnecessary?


> -/* SIGUSR1: set flag to send WAL records */
> -static void
> -WalSndXLogSendHandler(SIGNAL_ARGS)
> -{
> - int save_errno = errno;
> -
> - latch_sigusr1_handler();
> -
> - errno = save_errno;
> -}

>   pqsignal(SIGPIPE, SIG_IGN);
> - pqsignal(SIGUSR1, WalSndXLogSendHandler); /* request WAL sending */
> + pqsignal(SIGUSR1, procsignal_sigusr1_handler);

Those aren't actually entirely equivalent. I'm not sure it matters much,
but WalSndXLogSendHandler didn't include a SetLatch(), but
procsignal_sigusr1_handler() does.  Normally redundant SetLatch() calls
will just be elided, but what if walsender already woke up and did it's
work?

I think it'd be better to have PostgresMain() set up signals by default,
and then have WalSndSignals() overwrites the ones it needs separate.
WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler
currently sets a different variable from postgres.c, but that seems like
a bad idea, because afaics we'll plainly ignore SIGHUPS unless in
WalSndLoop, WalSndWriteData,WalSndWaitForWal.  That actually seems like
an active bug to me?

Greetings,

Andres Freund


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Andres Freund
Hi,

On 2017-04-23 16:59:41 -0700, Andres Freund wrote:

> Hi,
>
> On 2017-04-22 17:53:19 +0200, Petr Jelinek wrote:
> > Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it
> > does not break anything for existing walsender usage.
>
> > diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
> > index 761fbfa..e9dd886 100644
> > --- a/src/backend/replication/logical/launcher.c
> > +++ b/src/backend/replication/logical/launcher.c
> > @@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const char *subname, Oid userid,
> >   BackgroundWorker bgw;
> >   BackgroundWorkerHandle *bgw_handle;
> >   int i;
> > - int slot;
> > + int slot = 0;
> >   LogicalRepWorker   *worker = NULL;
> >   int nsyncworkers = 0;
> >   TimestampTz now = GetCurrentTimestamp();
>
> That seems independent and unnecessary?
>
>
> > -/* SIGUSR1: set flag to send WAL records */
> > -static void
> > -WalSndXLogSendHandler(SIGNAL_ARGS)
> > -{
> > - int save_errno = errno;
> > -
> > - latch_sigusr1_handler();
> > -
> > - errno = save_errno;
> > -}
>
> >   pqsignal(SIGPIPE, SIG_IGN);
> > - pqsignal(SIGUSR1, WalSndXLogSendHandler); /* request WAL sending */
> > + pqsignal(SIGUSR1, procsignal_sigusr1_handler);
>
> Those aren't actually entirely equivalent. I'm not sure it matters much,
> but WalSndXLogSendHandler didn't include a SetLatch(), but
> procsignal_sigusr1_handler() does.  Normally redundant SetLatch() calls
> will just be elided, but what if walsender already woke up and did it's
> work?
>
> I think it'd be better to have PostgresMain() set up signals by default,
> and then have WalSndSignals() overwrites the ones it needs separate.
> WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler
> currently sets a different variable from postgres.c, but that seems like
> a bad idea, because afaics we'll plainly ignore SIGHUPS unless in
> WalSndLoop, WalSndWriteData,WalSndWaitForWal.  That actually seems like
> an active bug to me?

Oh, and one more point: There desperately need to be tests exercising
"SQL via walsender". Including the issue of parallelism here, the missed
cancel handler, timeouts, and such.  One way to do so is to use
pg_regress with an adjusted connection string (specifying
replication=database), doing the same for isolationtester, or using
dblink.

- Andres


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Petr Jelinek-4
In reply to this post by Andres Freund
On 24/04/17 01:59, Andres Freund wrote:

> Hi,
>
> On 2017-04-22 17:53:19 +0200, Petr Jelinek wrote:
>> Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it
>> does not break anything for existing walsender usage.
>
>> diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
>> index 761fbfa..e9dd886 100644
>> --- a/src/backend/replication/logical/launcher.c
>> +++ b/src/backend/replication/logical/launcher.c
>> @@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const char *subname, Oid userid,
>>   BackgroundWorker bgw;
>>   BackgroundWorkerHandle *bgw_handle;
>>   int i;
>> - int slot;
>> + int slot = 0;
>>   LogicalRepWorker   *worker = NULL;
>>   int nsyncworkers = 0;
>>   TimestampTz now = GetCurrentTimestamp();
>
> That seems independent and unnecessary?
>

Yeah that leaked from different patch I was working on in parallel.

>
>> -/* SIGUSR1: set flag to send WAL records */
>> -static void
>> -WalSndXLogSendHandler(SIGNAL_ARGS)
>> -{
>> - int save_errno = errno;
>> -
>> - latch_sigusr1_handler();
>> -
>> - errno = save_errno;
>> -}
>
>>   pqsignal(SIGPIPE, SIG_IGN);
>> - pqsignal(SIGUSR1, WalSndXLogSendHandler); /* request WAL sending */
>> + pqsignal(SIGUSR1, procsignal_sigusr1_handler);
>
> Those aren't actually entirely equivalent. I'm not sure it matters much,
> but WalSndXLogSendHandler didn't include a SetLatch(), but
> procsignal_sigusr1_handler() does.  Normally redundant SetLatch() calls
> will just be elided, but what if walsender already woke up and did it's
> work?
>

They aren't exactly same no, but I don't see harm in what
procsignal_sigusr1_handler. I mean worst case scenario is latch is set
so walsender checks for work.

> I think it'd be better to have PostgresMain() set up signals by default,
> and then have WalSndSignals() overwrites the ones it needs separate.

Hmm that sounds like a good idea.

> WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler
> currently sets a different variable from postgres.c, but that seems like
> a bad idea, because afaics we'll plainly ignore SIGHUPS unless in
> WalSndLoop, WalSndWriteData,WalSndWaitForWal.  That actually seems like
> an active bug to me?
>

Hmm I don't think good solution is to use same variable though,
walsender needs to do slightly different thing on SIGHUP, plus the
got_SIGHUP is not the type of variable we want to export. I wonder if it
would be enough to just add check for got_SIGHUP somewhere at the
beginning of exec_replication_command().

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Petr Jelinek-4
In reply to this post by Andres Freund
On 24/04/17 01:43, Andres Freund wrote:

>
>> BTW while looking at the code, I don't understand why we call
>> latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just
>> the SetLatch be enough (they both end up calling sendSelfPipeByte()
>> eventually)?
>
> Historic raisins - there didn't use to be a SetLatch in
> procsignal_sigusr1_handler. That changed when I whacked around catchup &
> notify to be based on latches ([1] and following).
>
> [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59f71a0d0b56b2df48db4bf1738aece5551f7a47
>

Okay, but why call both SetLatch and latch_sigusr1_handler? What does
that buy us?

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Petr Jelinek-4
In reply to this post by Andres Freund
On 24/04/17 02:04, Andres Freund wrote:
> Hi,
>
> Oh, and one more point: There desperately need to be tests exercising
> "SQL via walsender". Including the issue of parallelism here, the missed
> cancel handler, timeouts, and such.  One way to do so is to use
> pg_regress with an adjusted connection string (specifying
> replication=database), doing the same for isolationtester, or using
> dblink.
>

Well, there are some tests (the logical replication uses that
functionality) and it's not quite clear to me what all to run there.
I assume you don't want to rerun whole regression suite against
walsender given the time it would take in normal tests (although I could
see doing that optionally somehow) but then what to pick from there.

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Andres Freund
In reply to this post by Petr Jelinek-4
On 2017-04-24 04:26:16 +0200, Petr Jelinek wrote:
> > WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler
> > currently sets a different variable from postgres.c, but that seems like
> > a bad idea, because afaics we'll plainly ignore SIGHUPS unless in
> > WalSndLoop, WalSndWriteData,WalSndWaitForWal.  That actually seems like
> > an active bug to me?
> >
>
> Hmm I don't think good solution is to use same variable though,
> walsender needs to do slightly different thing on SIGHUP,

Hm? You mean SyncRepInitConfig()?  I don't think that matters, because
it's only needed if config is changed while streaming.

> plus the
> got_SIGHUP is not the type of variable we want to export. I wonder if it
> would be enough to just add check for got_SIGHUP somewhere at the
> beginning of exec_replication_command().

I don't think that'd be sufficient, because we really want config
changes to get picked up while idle, and that won't work from within
exec_replication_command().  And that pretty much means it'll have to
happen outside of walsender.c.

FWIW, while it's not pretty, I actually see very little reason not to
share got_SIGHUP (with a bit less generic name name) between different
types of processes (i.e. putting it in miscadmin.h or such). It's not
exactly pretty, but there's also no benefit in duplicating it
everywhere, and without it you run into the issue presented here.

Greetings,

Andres Freund


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Andres Freund
In reply to this post by Petr Jelinek-4
On 2017-04-24 04:27:58 +0200, Petr Jelinek wrote:

> On 24/04/17 01:43, Andres Freund wrote:
> >
> >> BTW while looking at the code, I don't understand why we call
> >> latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just
> >> the SetLatch be enough (they both end up calling sendSelfPipeByte()
> >> eventually)?
> >
> > Historic raisins - there didn't use to be a SetLatch in
> > procsignal_sigusr1_handler. That changed when I whacked around catchup &
> > notify to be based on latches ([1] and following).
> >
> > [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59f71a0d0b56b2df48db4bf1738aece5551f7a47
> >
>
> Okay, but why call both SetLatch and latch_sigusr1_handler? What does
> that buy us?

Nothing.  It's how the code evolved, we can change that.


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Petr Jelinek-4
In reply to this post by Petr Jelinek-4
On 24/04/17 04:31, Petr Jelinek wrote:

> On 24/04/17 02:04, Andres Freund wrote:
>> Hi,
>>
>> Oh, and one more point: There desperately need to be tests exercising
>> "SQL via walsender". Including the issue of parallelism here, the missed
>> cancel handler, timeouts, and such.  One way to do so is to use
>> pg_regress with an adjusted connection string (specifying
>> replication=database), doing the same for isolationtester, or using
>> dblink.
>>
>
> Well, there are some tests (the logical replication uses that
> functionality) and it's not quite clear to me what all to run there.
> I assume you don't want to rerun whole regression suite against
> walsender given the time it would take in normal tests (although I could
> see doing that optionally somehow) but then what to pick from there.
>
So actually maybe running regression tests through it might be
reasonable approach if we add new make target for it.

Attached patch shows how it could be done.

Patch adds new make target check-walsender-sql which runs serial
schedule of regression tests using replication connection (there is new
command line argument --replication-connection for pg_regress as well).
Reason for running serial schedule is that the max_walsenders setting
becomes problem when running parallel one.

Note that the first patch is huge. That's because I needed to add
alternative output for largeobject test because it uses fastpath
function calls which are not allowed over replication protocol.

As part of this I realized that the parser fallback that I wrote
originally for handling SQL in walsender is not robust enough as the
lexer would fail on some valid SQL statements. So there is also second
patch that does this using different approach which allows any SQL
statement.

Thoughts?

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

0002-Improve-walsender-parser-for-non-replication-command.patch.gz (2K) Download Attachment
0001-Add-check-walsender-sql-make-target.patch.gz (314K) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Andres Freund


On April 23, 2017 10:31:18 PM PDT, Petr Jelinek <[hidden email]> wrote:
>On 24/04/17 04:31, Petr Jelinek wrote:
>So actually maybe running regression tests through it might be
>reasonable approach if we add new make target for it.

That sounds like a good plan.


>Note that the first patch is huge. That's because I needed to add
>alternative output for largeobject test because it uses fastpath
>function calls which are not allowed over replication protocol.

There's no need for that restriction, is there?  At least for db walsenders...

>As part of this I realized that the parser fallback that I wrote
>originally for handling SQL in walsender is not robust enough as the
>lexer would fail on some valid SQL statements. So there is also second
>patch that does this using different approach which allows any SQL
>statement.

Haven't looked at the new thing yet, but I'm not particularly surprised...  Wonder of there's a good way to fully integrate both grammars, without exploding keywords.

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Petr Jelinek-4
On 24/04/17 07:42, Andres Freund wrote:

>
>
> On April 23, 2017 10:31:18 PM PDT, Petr Jelinek <[hidden email]> wrote:
>> On 24/04/17 04:31, Petr Jelinek wrote:
>> So actually maybe running regression tests through it might be
>> reasonable approach if we add new make target for it.
>
> That sounds like a good plan.
>
>
>> Note that the first patch is huge. That's because I needed to add
>> alternative output for largeobject test because it uses fastpath
>> function calls which are not allowed over replication protocol.
>
> There's no need for that restriction, is there?  At least for db walsenders...
>

No, there is no real need to restring the extended protocol either but
we do so currently. The point of allowing SQL was to allow logical
replication to work, not to merge walsender completely into normal
backend code. And I don't know about differences well enough to go for
the full merge, especially not at this stage of PG10 dev.

>> As part of this I realized that the parser fallback that I wrote
>> originally for handling SQL in walsender is not robust enough as the
>> lexer would fail on some valid SQL statements. So there is also second
>> patch that does this using different approach which allows any SQL
>> statement.
>
> Haven't looked at the new thing yet, but I'm not particularly surprised...  Wonder of there's a good way to fully integrate both grammars, without exploding keywords.
>

I think we'd need to keyword the first word of every replication command
if we wanted to merge both grammar files together. I am not sure if
there is all that much need for that, the pass-through for everything
that is not replication command seems to work just fine. Obviously it
means walsender is still special but as I said, my plan was to make it
work for logical replication not to merge it completely with existing
backends.

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Andres Freund
On 2017-04-24 18:29:51 +0200, Petr Jelinek wrote:

> On 24/04/17 07:42, Andres Freund wrote:
> >
> >
> > On April 23, 2017 10:31:18 PM PDT, Petr Jelinek <[hidden email]> wrote:
> >> On 24/04/17 04:31, Petr Jelinek wrote:
> >> So actually maybe running regression tests through it might be
> >> reasonable approach if we add new make target for it.
> >
> > That sounds like a good plan.
> >
> >
> >> Note that the first patch is huge. That's because I needed to add
> >> alternative output for largeobject test because it uses fastpath
> >> function calls which are not allowed over replication protocol.
> >
> > There's no need for that restriction, is there?  At least for db walsenders...
> >
>
> No, there is no real need to restring the extended protocol either but
> we do so currently. The point of allowing SQL was to allow logical
> replication to work, not to merge walsender completely into normal
> backend code.

Well, that's understandable, but there's also the competing issue that
we need something that is well defined and behaved.


> Obviously it
> means walsender is still special but as I said, my plan was to make it
> work for logical replication not to merge it completely with existing
> backends.

Yea, and I don't think that's an argument for anything on its own,
sorry.

- Andres


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Petr Jelinek-4
On 24/04/17 20:00, Andres Freund wrote:

> On 2017-04-24 18:29:51 +0200, Petr Jelinek wrote:
>> On 24/04/17 07:42, Andres Freund wrote:
>>>
>>>
>>> On April 23, 2017 10:31:18 PM PDT, Petr Jelinek <[hidden email]> wrote:
>>>> On 24/04/17 04:31, Petr Jelinek wrote:
>>>> So actually maybe running regression tests through it might be
>>>> reasonable approach if we add new make target for it.
>>>
>>> That sounds like a good plan.
>>>
>>>
>>>> Note that the first patch is huge. That's because I needed to add
>>>> alternative output for largeobject test because it uses fastpath
>>>> function calls which are not allowed over replication protocol.
>>>
>>> There's no need for that restriction, is there?  At least for db walsenders...
>>>
>>
>> No, there is no real need to restring the extended protocol either but
>> we do so currently. The point of allowing SQL was to allow logical
>> replication to work, not to merge walsender completely into normal
>> backend code.
>
> Well, that's understandable, but there's also the competing issue that
> we need something that is well defined and behaved.
>

It's well defined, it supports simple protocol queries, that's it.

>
>> Obviously it
>> means walsender is still special but as I said, my plan was to make it
>> work for logical replication not to merge it completely with existing
>> backends.
>
> Yea, and I don't think that's an argument for anything on its own,
> sorry.
>

It's not meant argument, it's plain statement of my intentions. I am not
stopping you from doing more if you want, however I don't see that it's
needed or any arguments about why it is needed.

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Andres Freund
In reply to this post by Petr Jelinek-4
Hi,

On 2017-04-24 07:31:18 +0200, Petr Jelinek wrote:
> The previous coding tried to run the unknown string throur lexer which
> could fail for some valid SQL statements as the replication command
> parser is too simple to handle all the complexities of SQL language.
>
> Instead just fall back to SQL parser on any unknown command.
>
> Also remove SHOW command handling from walsender as it's handled by the
> simple query code path.

This break SHOW var; over the plain replication connections though
(which was quite intentionally supported), so I don't think that's ok?


I don't like much how SHOW and walsender understanding SQL statements
turned out, I think code like

                                        if (am_walsender)
                                        {
                                                if (!exec_replication_command(query_string))
                                                        exec_simple_query(query_string);
                                        }
                                        else
                                                exec_simple_query(query_string);

shows some of the issues here.


> Checks the SQL over walsender interface by running the standard set of
> regression tests against it.

I like that approach a lot.


> New alternative output for largeobject test has been added as the
> replication connection does not support fastpath function calls.

I think just supporting fastpath over the replication protocol is less
work than maintaining the alternative file.


> --- a/src/Makefile.global.in
> +++ b/src/Makefile.global.in
> @@ -321,6 +321,7 @@ BZIP2 = bzip2
>  # Testing
>  
>  check: temp-install
> +check-walsender-sql: temp-install

This doesn't strike me as something that should go into
a/src/Makefile.global.in - I'd rather put it in
b/src/test/regress/GNUmakefile.  check is in .global because it's used
in a lot of different makefiles, but that's not true for this target, right?



Greetings,

Andres Freund


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: walsender & parallelism

Petr Jelinek-4
On 25/04/17 01:25, Andres Freund wrote:

> Hi,
>
> On 2017-04-24 07:31:18 +0200, Petr Jelinek wrote:
>> The previous coding tried to run the unknown string throur lexer which
>> could fail for some valid SQL statements as the replication command
>> parser is too simple to handle all the complexities of SQL language.
>>
>> Instead just fall back to SQL parser on any unknown command.
>>
>> Also remove SHOW command handling from walsender as it's handled by the
>> simple query code path.
>
> This break SHOW var; over the plain replication connections though
> (which was quite intentionally supported), so I don't think that's ok?
>
Hmm I guess we don't use this anywhere in tests (which is not surprising
as it would have to be used in plain walsender connection). Fixed.

>
> I don't like much how SHOW and walsender understanding SQL statements
> turned out, I think code like
>
> if (am_walsender)
> {
> if (!exec_replication_command(query_string))
> exec_simple_query(query_string);
> }
> else
> exec_simple_query(query_string);
>
> shows some of the issues here.
>
Not sure how it's related to SHOW, but in any case, it's just result of
the parsing fallback.

>
>> New alternative output for largeobject test has been added as the
>> replication connection does not support fastpath function calls.
>
> I think just supporting fastpath over the replication protocol is less
> work than maintaining the alternative file.
>
>
>> --- a/src/Makefile.global.in
>> +++ b/src/Makefile.global.in
>> @@ -321,6 +321,7 @@ BZIP2 = bzip2
>>  # Testing
>>  
>>  check: temp-install
>> +check-walsender-sql: temp-install
>
> This doesn't strike me as something that should go into
> a/src/Makefile.global.in - I'd rather put it in
> b/src/test/regress/GNUmakefile.  check is in .global because it's used
> in a lot of different makefiles, but that's not true for this target, right?
>
Shows how well I know our build system :)

Anyway, I gave it a try to support as much as possible and the result is
attached.

First patch adds the new make target that runs regression tests through
walsender. I had to change one test to run SHOW TIMEZONE instead of SHOW
TIME ZONE, otherwise no tests need to be changed.

The second patch unifies the sighup handling across all kinds of
processes. This is mostly replacing got_SIGHuP with new variable and
removing local SIGHUP handlers in favor of generic one where possible.

And the third patch changes walsender. There are 3 things it does a) it
overhauls signal handling there, b) it allows both fast path function
calls and extended protocol messages (once I did function fast path I
didn't really see any difference in terms of extended protocol, please
correct me if I am wrong in that) and c) improves the fallback in parser.

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

0001-Add-check-walsender-sql-make-target.patch (6K) Download Attachment
0002-Unify-SIGHUP-hadnling-across-all-types-of-processes.patch (33K) Download Attachment
0003-Make-walsender-behave-more-like-normal-backend.patch (14K) Download Attachment
12
Previous Thread Next Thread
Loading...