Hi,
I would like to propose getting the callstack of a postgres process by connecting to the server. This helps in diagnosing problems in a customer environment when a process is hung or has been running for a long time.

The idea here is to implement & expose pg_print_callstack function, internally what this function does is, the connected backend will send SIGUSR1 signal by setting PMSIGNAL_BACKTRACE_EMIT to the postmaster process. Postmaster process will send a SIGUSR1 signal to the process by setting PROCSIG_BACKTRACE_PRINT if the process has access to ProcSignal. As syslogger process & Stats process don't have access to ProcSignal, multiplexing with SIGUSR1 is not possible for these processes, hence SIGUSR2 signal will be sent for these processes. Once the process receives this signal it will log the backtrace of the process.

Attached is a WIP patch for the same.
Thoughts?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
vignesh C <[hidden email]> writes:
> The idea here is to implement & expose pg_print_callstack function,
> internally what this function does is, the connected backend will send
> SIGUSR1 signal by setting PMSIGNAL_BACKTRACE_EMIT to the postmaster
> process. Postmaster process will send a SIGUSR1 signal to the process
> by setting PROCSIG_BACKTRACE_PRINT if the process has access to
> ProcSignal. As syslogger process & Stats process don't have access to
> ProcSignal, multiplexing with SIGUSR1 is not possible for these
> processes, hence SIGUSR2 signal will be sent for these processes. Once
> the process receives this signal it will log the backtrace of the
> process.

Surely this is *utterly* unsafe. You can't do that sort of stuff in
a signal handler.

It might be all right to set a flag that would cause the next
CHECK_FOR_INTERRUPTS to print a backtrace, but I'm not sure
how useful that really is.

The proposed postmaster.c addition seems quite useless, as there
is exactly one stack trace it could ever log.

I would like to see some discussion of the security implications
of such a feature, as well. ("There aren't any" is the wrong
answer.)

			regards, tom lane
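For illustration, a minimal sketch of the flag-plus-CHECK_FOR_INTERRUPTS() approach suggested above. This is not the attached patch; the function and flag names are hypothetical, and it assumes the usual backend headers:

    /*
     * Hypothetical sketch: the signal handler only sets a flag; the
     * backtrace is emitted later, from the CHECK_FOR_INTERRUPTS() path,
     * where calling elog()/palloc() is safe.
     */
    #include "postgres.h"

    #include <execinfo.h>

    static volatile sig_atomic_t PrintBacktracePending = false;

    /* called from the signal handler: only sets a flag (async-signal-safe) */
    static void
    HandleBacktraceRequest(void)
    {
        PrintBacktracePending = true;
        /* the handler would also set InterruptPending / the process latch */
    }

    /* called from ProcessInterrupts(), i.e. via CHECK_FOR_INTERRUPTS() */
    static void
    ProcessBacktraceRequest(void)
    {
        void       *frames[100];
        int         nframes;
        char      **symbols;

        PrintBacktracePending = false;

        nframes = backtrace(frames, lengthof(frames));
        symbols = backtrace_symbols(frames, nframes);
        if (symbols != NULL)
        {
            for (int i = 0; i < nframes; i++)
                elog(LOG, "backtrace frame %d: %s", i, symbols[i]);
            free(symbols);
        }
    }

Whether the flag is noticed promptly is exactly the limitation discussed below: a backend stuck somewhere that never reaches CHECK_FOR_INTERRUPTS() will not respond.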
On Sun, Nov 22, 2020 at 11:55 AM Tom Lane <[hidden email]> wrote:
> vignesh C <[hidden email]> writes:
> > The idea here is to implement & expose pg_print_callstack function,
> > internally what this function does is, the connected backend will send
> > SIGUSR1 signal by setting PMSIGNAL_BACKTRACE_EMIT to the postmaster
> > process. Postmaster process will send a SIGUSR1 signal to the process
> > by setting PROCSIG_BACKTRACE_PRINT if the process has access to
> > ProcSignal. As syslogger process & Stats process don't have access to
> > ProcSignal, multiplexing with SIGUSR1 is not possible for these
> > processes, hence SIGUSR2 signal will be sent for these processes. Once
> > the process receives this signal it will log the backtrace of the
> > process.
>
> Surely this is *utterly* unsafe. You can't do that sort of stuff in
> a signal handler.
>
> It might be all right to set a flag that would cause the next
> CHECK_FOR_INTERRUPTS to print a backtrace, but I'm not sure
> how useful that really is.
>
> The proposed postmaster.c addition seems quite useless, as there
> is exactly one stack trace it could ever log.
>
> I would like to see some discussion of the security implications
> of such a feature, as well. ("There aren't any" is the wrong
> answer.)

Hi Hackers,

Any thoughts on the security implications of this feature?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
Tom Lane <[hidden email]> writes:
> Surely this is *utterly* unsafe. You can't do that sort of stuff in
> a signal handler.

Not safely, anyway. The signal handler could be called in the middle of a
malloc(), a pfree(), or all sorts of other exciting circumstances. It'd
have to be extremely careful to use only local resources on the stack, and
I don't see how that's feasible here. It'll work - most of the time. But it
could explode messily and excitingly just when you actually need it to work
properly, which is rarely what you want from features clearly intended for
production debugging.

(It'd be interesting to add some test infrastructure that helps us fire
signal handlers at awkward times in a controlled manner, so we could
predictably test signal handler re-entrancy and concurrency behaviour.)

> It might be all right to set a flag that would cause the next
> CHECK_FOR_INTERRUPTS to print a backtrace, but I'm not sure
> how useful that really is.

I find that when I most often want a backtrace of a running, live backend,
it's because the backend is doing something that isn't passing a
CHECK_FOR_INTERRUPTS(), so it's not responding to signals. So it wouldn't
help if a backend is waiting on an LWLock, busy in a blocking call to some
loaded library, a blocking syscall, etc. But there are enough other times I
want live backtraces, and I'm not the only one whose needs matter.

It'd be kinda handy when collecting samples of some backend's activity when
it's taking an excessively long time doing something indeterminate. I
generally use a gdb script for that, because unfortunately the Linux trace
tool situation is so hopeless that I can't rely on perf or systemtap being
present, working, and understanding the same command line arguments across
various distros and versions, so something built-in would be convenient.
That convenience would probably be counterbalanced by the hassle of
extracting the results from the log files, unless it's possible to use a
function in one backend to query the stack of another instead of just
writing it to the log.

So a weak +1 from me, assuming we print stacks from CHECK_FOR_INTERRUPTS()
to the log. We already have most of the infrastructure for that so the
change is trivial, and we already trust anyone who can read the log, so it
seems like a pretty low-cost, low-risk change. I'd be somewhat more in
favour if the results could be obtained from a function or view in another
backend, but then the security implications get a bit more exciting if we
let non-superusers do it.

You may recall that I wanted to do something similar a while ago in order
to request MemoryContextStats() without needing to attach gdb and invoke a
function manually using ptrace(). Also changes to support reading TCP
socket state for a process. So I find this sort of thing useful in general.

If we're querying one backend from another, we could read its stack with
ptrace() and unwind it with libunwind within the requesting backend, which
would be a whole lot safer to execute and would work fine even when the
target is blocked in syscalls or synchronous library calls. See the
eu-stack command from elfutils. If the target backend had shared libraries
loaded that the requesting backend didn't, libunwind could load the
debuginfo for us if available. The downsides are that many system lockdowns
disable ptrace() for non-root even for same-user-owned processes, and that
we'd need to depend on libunwind (or other platform equivalents), so it'd
probably be a contrib. In which case you have to wonder if it's that much
better than running `system("eu-stack $pid")` in plperl or a trivial custom
C extension function.
> I would like to see some discussion of the security implications
> of such a feature, as well. ("There aren't any" is the wrong
> answer.)

If the stack only goes to the log, I actually don't think there are
significant security implications beyond those we already have with our
existing backtrace printing features. We already trust anyone who can read
the log almost completely, and we can already emit stacks to the log.

But I'd still want it to be gated superuser-only, or a role that's
GRANTable by superuser only by default, since it exposes arbitrary
internals of the server. "same user id" matching is not sufficient. A
less-privileged session user might be calling into SECURITY DEFINER code,
code from a privileged view, or sensitive C library code. Checking for the
effective user is no better, because the effective user might be less
privileged at the moment the bt is requested, but the state up-stack might
reveal sensitive information from a more privileged user.
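As a rough illustration of the ptrace()-plus-libunwind idea mentioned above (a sketch only, under the assumption that libunwind's ptrace accessors are available; it is not part of any proposed patch, and error handling is omitted):

    /*
     * Sketch: print the stack of another process by attaching with ptrace()
     * and unwinding remotely with libunwind (link with -lunwind-ptrace
     * -lunwind-generic).  Roughly what eu-stack does.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ptrace.h>
    #include <sys/wait.h>
    #include <libunwind-ptrace.h>

    int
    main(int argc, char **argv)
    {
        pid_t       pid;
        unw_addr_space_t as;
        void       *upt;
        unw_cursor_t cursor;

        if (argc < 2)
            return 1;
        pid = (pid_t) atoi(argv[1]);

        /* stop the target so its stack doesn't change while we read it */
        ptrace(PTRACE_ATTACH, pid, NULL, NULL);
        waitpid(pid, NULL, 0);

        as = unw_create_addr_space(&_UPT_accessors, 0);
        upt = _UPT_create(pid);
        unw_init_remote(&cursor, as, upt);

        do
        {
            unw_word_t  ip, off;
            char        name[256];

            unw_get_reg(&cursor, UNW_REG_IP, &ip);
            if (unw_get_proc_name(&cursor, name, sizeof(name), &off) == 0)
                printf("%#lx: %s+%#lx\n", (unsigned long) ip, name,
                       (unsigned long) off);
            else
                printf("%#lx: ??\n", (unsigned long) ip);
        } while (unw_step(&cursor) > 0);

        _UPT_destroy(upt);
        unw_destroy_addr_space(as);
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
        return 0;
    }

As noted above, kernel lockdowns (e.g. yama ptrace_scope) can forbid the attach even for same-user processes.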
Craig Ringer <[hidden email]> writes:
>> I would like to see some discussion of the security implications
>> of such a feature, as well. ("There aren't any" is the wrong
>> answer.)

> If the stack only goes to the log, I actually don't think there are
> significant security implications beyond those we already have with
> our existing backtrace printing features. We already trust anyone who
> can read the log almost completely, and we can already emit stacks to
> the log. But I'd still want it to be gated superuser-only, or a role
> that's GRANTable by superuser only by default, since it exposes
> arbitrary internals of the server.

The concerns that I had were that the patch as submitted provides a
mechanism that causes ALL processes in the system to dump backtraces,
not a targeted request; and that it allows any user to issue such
requests at an unbounded rate. That seems like a really easy route to
denial of service. There's also a question of whether you'd even get
intelligible results from dozens of processes simultaneously dumping
many-line messages to the same place. (This might work out all right
if you're using our syslogger, but it probably would not with any
other logging technology.)

I'd feel better about it if the mechanism had you specify exactly
one target process, and were restricted to a superuser requestor.

I'm not excited about adding on frammishes like letting one process
extract another's stack trace. I think that just adds more points of
failure, which is a bad thing in a feature that you're only going to
care about when things are a bit pear-shaped already.

			regards, tom lane
On Tue, 1 Dec 2020 at 07:04, Tom Lane <[hidden email]> wrote:
> I'd feel better about it if the mechanism had you specify exactly
> one target process, and were restricted to a superuser requestor.

Er, rather. I actually assumed the former was the case already, not having
looked closely yet.
Hi,
On 2020-11-22 01:25:08 -0500, Tom Lane wrote:
> Surely this is *utterly* unsafe. You can't do that sort of stuff in
> a signal handler.

That's of course true for the current implementation - but I don't think
it's a fundamental constraint. With a bit of care, backtrace() and
backtrace_symbols_fd() themselves can be signal safe:

> backtrace() and backtrace_symbols_fd() don't call malloc() explicitly,
> but they are part of libgcc, which gets loaded dynamically when first
> used. Dynamic loading usually triggers a call to malloc(3). If you need
> certain calls to these two functions to not allocate memory (in signal
> handlers, for example), you need to make sure libgcc is loaded beforehand.

It should be quite doable to emit such backtraces directly to stderr,
instead of using appendStringInfoString()/elog(). Or even use a static
buffer.

It does have quite some appeal to be able to debug production workloads
where queries can't be cancelled etc. And knowing that backtraces reliably
work in case of SIGQUIT etc. is also nice...

> I would like to see some discussion of the security implications
> of such a feature, as well. ("There aren't any" is the wrong
> answer.)

+1

Greetings,

Andres Freund
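A minimal sketch of the signal-safe variant described above (names and placement are illustrative assumptions, not a patch): load libgcc once at startup, then use only backtrace() and backtrace_symbols_fd(), which avoid malloc(), inside the handler:

    #include <execinfo.h>
    #include <unistd.h>

    /* Call once early in process startup, outside any signal handler. */
    static void
    preload_backtrace_support(void)
    {
        void       *dummy[4];

        /* the first backtrace() call loads libgcc (and may allocate);
         * later calls won't */
        (void) backtrace(dummy, 4);
    }

    /* Safe to call from a signal handler once the preload above has run. */
    static void
    emit_backtrace_to_fd(int fd)
    {
        void       *frames[64];
        int         nframes;

        nframes = backtrace(frames, 64);
        /* writes symbolized frames straight to fd; no malloc() involved */
        backtrace_symbols_fd(frames, nframes, fd);
    }

    /* e.g. emit_backtrace_to_fd(STDERR_FILENO) from the SIGQUIT handler */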
Hi,
On 2020-11-30 13:35:46 +0800, Craig Ringer wrote:
> I find that when I most often want a backtrace of a running, live
> backend, it's because the backend is doing something that isn't
> passing a CHECK_FOR_INTERRUPTS() so it's not responding to signals. So
> it wouldn't help if a backend is waiting on an LWLock, busy in a
> blocking call to some loaded library, a blocking syscall, etc. But
> there are enough other times I want live backtraces, and I'm not the
> only one whose needs matter.

Random thought: Wonder if it could be worth adding a conditionally
compiled mode where we track what the longest time between two
CHECK_FOR_INTERRUPTS() calls is (with some extra logic for client IO).

Obviously the regression tests don't tend to hit the worst cases of
CHECK_FOR_INTERRUPTS()-less code, but even if they did, we currently
wouldn't know from running the regression tests.

Greetings,

Andres Freund
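A sketch of what such a conditionally compiled mode might look like (the TRACK_CFI_LATENCY symbol and the function name are made up for illustration; nothing like this is in any patch):

    /*
     * Sketch only: under a compile-time option, record the longest interval
     * observed between two CHECK_FOR_INTERRUPTS() calls.
     */
    #ifdef TRACK_CFI_LATENCY

    #include <time.h>

    static struct timespec last_cfi_time;
    static double max_cfi_gap_ms = 0.0;

    static inline void
    track_cfi_gap(void)
    {
        struct timespec now;
        double      gap_ms;

        clock_gettime(CLOCK_MONOTONIC, &now);
        if (last_cfi_time.tv_sec != 0 || last_cfi_time.tv_nsec != 0)
        {
            gap_ms = (now.tv_sec - last_cfi_time.tv_sec) * 1000.0 +
                (now.tv_nsec - last_cfi_time.tv_nsec) / 1e6;
            if (gap_ms > max_cfi_gap_ms)
                max_cfi_gap_ms = gap_ms;    /* report e.g. at backend exit */
        }
        last_cfi_time = now;
    }

    /* CHECK_FOR_INTERRUPTS() would additionally call track_cfi_gap() here */
    #endif                          /* TRACK_CFI_LATENCY */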
On Tue, 1 Dec 2020 at 11:31, Andres Freund <[hidden email]> wrote:
> Hi,
>
> On 2020-11-30 13:35:46 +0800, Craig Ringer wrote:
> > I find that when I most often want a backtrace of a running, live
> > backend, it's because the backend is doing something that isn't
> > passing a CHECK_FOR_INTERRUPTS() so it's not responding to signals. So
> > it wouldn't help if a backend is waiting on an LWLock, busy in a
> > blocking call to some loaded library, a blocking syscall, etc. But
> > there are enough other times I want live backtraces, and I'm not the
> > only one whose needs matter.
>
> Random thought: Wonder if it could be worth adding a conditionally
> compiled mode where we track what the longest time between two
> CHECK_FOR_INTERRUPTS() calls is (with some extra logic for client IO).
>
> Obviously the regression tests don't tend to hit the worst cases of
> CHECK_FOR_INTERRUPTS()-less code, but even if they did, we currently
> wouldn't know from running the regression tests.

We can probably determine that just as well with a perf or systemtap run on
an --enable-dtrace build. Just tag CHECK_FOR_INTERRUPTS() with an SDT
marker then record the timings.

It might be convenient to have it built-in, I guess, but if we tag the site
and do the timing/tracing externally we don't have to bother about
conditional compilation and special builds.
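For illustration, a generic userspace SDT marker as provided by <sys/sdt.h> (systemtap-sdt-devel); PostgreSQL's own probes are normally declared in probes.d and used via the generated TRACE_POSTGRESQL_* macros under --enable-dtrace, so the provider and probe names below are just placeholders:

    #include <sys/sdt.h>

    void
    some_long_running_loop(void)
    {
        for (long i = 0; i < 100000000; i++)
        {
            /* fires each time this "interrupt check" site is reached;
             * perf, systemtap or bpftrace can attach here and record
             * timestamps to measure the gaps between hits */
            DTRACE_PROBE(myapp, check_for_interrupts);

            /* ... actual work ... */
        }
    }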
Andres Freund <[hidden email]> writes:
> It should be quite doable to emit such backtraces directly to stderr,
> instead of using appendStringInfoString()/elog().

No, please no.

(1) On lots of logging setups (think syslog), anything that goes to
stderr is just going to wind up in the bit bucket. I realize that
we have that issue already for memory context dumps on OOM errors,
but that doesn't make it a good thing.

(2) You couldn't really write "to stderr", only to fileno(stderr),
creating issues about interleaving of the output with regular stderr
output. For instance it's quite likely that the backtrace would
appear before stderr output that had actually been emitted earlier,
which'd be tremendously confusing.

(3) This isn't going to do anything good for my concerns about interleaved
output from different processes, either.

			regards, tom lane
On Tue, Dec 1, 2020 at 9:31 AM Tom Lane <[hidden email]> wrote:
> Andres Freund <[hidden email]> writes:
> > It should be quite doable to emit such backtraces directly to stderr,
> > instead of using appendStringInfoString()/elog().
>
> No, please no.
>
> (1) On lots of logging setups (think syslog), anything that goes to
> stderr is just going to wind up in the bit bucket. I realize that
> we have that issue already for memory context dumps on OOM errors,
> but that doesn't make it a good thing.
>
> (2) You couldn't really write "to stderr", only to fileno(stderr),
> creating issues about interleaving of the output with regular stderr
> output. For instance it's quite likely that the backtrace would
> appear before stderr output that had actually been emitted earlier,
> which'd be tremendously confusing.
>
> (3) This isn't going to do anything good for my concerns about interleaved
> output from different processes, either.

If we don't agree on logging to stderr, then even with a static buffer
we might not be able to log, as send_message_to_server_log calls
appendStringInfo. I feel that doing it from CHECK_FOR_INTERRUPTS may
be better.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
On Tue, Dec 1, 2020 at 2:15 PM vignesh C <[hidden email]> wrote:
>
> On Tue, Dec 1, 2020 at 9:31 AM Tom Lane <[hidden email]> wrote:
> >
> > Andres Freund <[hidden email]> writes:
> > > It should be quite doable to emit such backtraces directly to stderr,
> > > instead of using appendStringInfoString()/elog().
> >
> > No, please no.
> >
> > (1) On lots of logging setups (think syslog), anything that goes to
> > stderr is just going to wind up in the bit bucket. I realize that
> > we have that issue already for memory context dumps on OOM errors,
> > but that doesn't make it a good thing.
> >
> > (2) You couldn't really write "to stderr", only to fileno(stderr),
> > creating issues about interleaving of the output with regular stderr
> > output. For instance it's quite likely that the backtrace would
> > appear before stderr output that had actually been emitted earlier,
> > which'd be tremendously confusing.
> >
> > (3) This isn't going to do anything good for my concerns about interleaved
> > output from different processes, either.
>
> If we don't agree on logging to stderr, then even with a static buffer
> we might not be able to log, as send_message_to_server_log calls
> appendStringInfo. I feel that doing it from CHECK_FOR_INTERRUPTS may
> be better.

I have implemented printing of backtrace based on handling it in
CHECK_FOR_INTERRUPTS. This patch also includes the change to allow
getting backtrace of any particular process based on the suggestions.
Attached patch has the implementation for the same.
Thoughts?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
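For illustration only (this is not the attached patch): the requesting side of such a feature might look roughly like the following, reusing the existing procsignal machinery. The function name and PROCSIG_BACKTRACE_PRINT follow the naming in the original proposal; the reason code would have to be added to the ProcSignalReason enum, and the rest is an assumption:

    #include "postgres.h"

    #include "fmgr.h"
    #include "miscadmin.h"
    #include "storage/procsignal.h"

    /* hypothetical built-in: ask a backend (by pid) to log its backtrace */
    Datum
    pg_print_callstack(PG_FUNCTION_ARGS)
    {
        int         pid = PG_GETARG_INT32(0);

        /* restrict to superusers, per the discussion above */
        if (!superuser())
            ereport(ERROR,
                    (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                     errmsg("must be a superuser to print the backtrace of a process")));

        /* the target emits its backtrace at its next CHECK_FOR_INTERRUPTS() */
        if (SendProcSignal(pid, PROCSIG_BACKTRACE_PRINT, InvalidBackendId) < 0)
            ereport(WARNING,
                    (errmsg("could not signal process %d", pid)));

        PG_RETURN_BOOL(true);
    }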
On 2020-12-08 10:38, vignesh C wrote:
> I have implemented printing of backtrace based on handling it in
> CHECK_FOR_INTERRUPTS. This patch also includes the change to allow
> getting backtrace of any particular process based on the suggestions.
> Attached patch has the implementation for the same.
> Thoughts?

Are we willing to use up a signal for this?
On 2021-01-15 09:53:05 +0100, Peter Eisentraut wrote:
> On 2020-12-08 10:38, vignesh C wrote:
> > I have implemented printing of backtrace based on handling it in
> > CHECK_FOR_INTERRUPTS. This patch also includes the change to allow
> > getting backtrace of any particular process based on the suggestions.
> > Attached patch has the implementation for the same.
> > Thoughts?
>
> Are we willing to use up a signal for this?

Why is a full signal needed? Seems the procsignal infrastructure should
suffice?
On Sat, Jan 16, 2021 at 1:40 AM Andres Freund <[hidden email]> wrote:
>
> On 2021-01-15 09:53:05 +0100, Peter Eisentraut wrote:
> > On 2020-12-08 10:38, vignesh C wrote:
> > > I have implemented printing of backtrace based on handling it in
> > > CHECK_FOR_INTERRUPTS. This patch also includes the change to allow
> > > getting backtrace of any particular process based on the suggestions.
> > > Attached patch has the implementation for the same.
> > > Thoughts?
> >
> > Are we willing to use up a signal for this?
>
> Why is a full signal needed? Seems the procsignal infrastructure should
> suffice?

Most of the processes have access to ProcSignal; for these processes,
printing of the callstack is handled using ProcSignal. The pgstat and
syslogger processes do not have access to ProcSignal, so multiplexing
with SIGUSR1 is not possible for them; I handled printing the callstack
for pgstat and syslogger using the SIGUSR2 signal.
This is because shared memory is detached before the pgstat and syslogger
processes are started, by way of the following:

    /* Drop our connection to postmaster's shared memory, as well */
    dsm_detach_all();
    PGSharedMemoryDetach();

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
Hi,
On Sat, Jan 16, 2021, at 09:34, vignesh C wrote:
> On Sat, Jan 16, 2021 at 1:40 AM Andres Freund <[hidden email]> wrote:
> >
> > On 2021-01-15 09:53:05 +0100, Peter Eisentraut wrote:
> > > On 2020-12-08 10:38, vignesh C wrote:
> > > > I have implemented printing of backtrace based on handling it in
> > > > CHECK_FOR_INTERRUPTS. This patch also includes the change to allow
> > > > getting backtrace of any particular process based on the suggestions.
> > > > Attached patch has the implementation for the same.
> > > > Thoughts?
> > >
> > > Are we willing to use up a signal for this?
> >
> > Why is a full signal needed? Seems the procsignal infrastructure should
> > suffice?
>
> Most of the processes have access to ProcSignal; for these processes,
> printing of the callstack is handled using ProcSignal. The pgstat and
> syslogger processes do not have access to ProcSignal, so multiplexing
> with SIGUSR1 is not possible for them; I handled printing the callstack
> for pgstat and syslogger using the SIGUSR2 signal.
> This is because shared memory is detached before the pgstat and syslogger
> processes are started, by way of the following:
>
>     /* Drop our connection to postmaster's shared memory, as well */
>     dsm_detach_all();
>     PGSharedMemoryDetach();

Sure. But why is it important enough to support those that we are willing
to dedicate a signal to the task? Their backtraces aren't often
interesting, so I think we should just ignore them here.

Andres
vignesh C <[hidden email]> writes:
> On Sat, Jan 16, 2021 at 1:40 AM Andres Freund <[hidden email]> wrote:
>> Why is a full signal needed? Seems the procsignal infrastructure should
>> suffice?

> Most of the processes have access to ProcSignal; for these processes,
> printing of the callstack is handled using ProcSignal. The pgstat and
> syslogger processes do not have access to ProcSignal, so multiplexing
> with SIGUSR1 is not possible for them; I handled printing the callstack
> for pgstat and syslogger using the SIGUSR2 signal.

I'd argue that backtraces for those processes aren't really essential,
and indeed that trying to make the syslogger report its own backtrace
is damn dangerous.

(Personally, I think this whole patch fails the safety-vs-usefulness
tradeoff, but I expect I'll get shouted down.)

			regards, tom lane
On Sat, Jan 16, 2021 at 11:10 PM Andres Freund <[hidden email]> wrote:
>
> Hi,
>
> On Sat, Jan 16, 2021, at 09:34, vignesh C wrote:
> > On Sat, Jan 16, 2021 at 1:40 AM Andres Freund <[hidden email]> wrote:
> > >
> > > On 2021-01-15 09:53:05 +0100, Peter Eisentraut wrote:
> > > > On 2020-12-08 10:38, vignesh C wrote:
> > > > > I have implemented printing of backtrace based on handling it in
> > > > > CHECK_FOR_INTERRUPTS. This patch also includes the change to allow
> > > > > getting backtrace of any particular process based on the suggestions.
> > > > > Attached patch has the implementation for the same.
> > > > > Thoughts?
> > > >
> > > > Are we willing to use up a signal for this?
> > >
> > > Why is a full signal needed? Seems the procsignal infrastructure should
> > > suffice?
> >
> > Most of the processes have access to ProcSignal; for these processes,
> > printing of the callstack is handled using ProcSignal. The pgstat and
> > syslogger processes do not have access to ProcSignal, so multiplexing
> > with SIGUSR1 is not possible for them; I handled printing the callstack
> > for pgstat and syslogger using the SIGUSR2 signal.
> > This is because shared memory is detached before the pgstat and syslogger
> > processes are started, by way of the following:
> >
> >     /* Drop our connection to postmaster's shared memory, as well */
> >     dsm_detach_all();
> >     PGSharedMemoryDetach();
>
> Sure. But why is it important enough to support those that we are willing
> to dedicate a signal to the task? Their backtraces aren't often
> interesting, so I think we should just ignore them here.

Thanks for your comments Andres, I will ignore it for the processes
which do not have access to ProcSignal. I will make the changes and
post a patch for this soon.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
On Mon, 18 Jan 2021 at 00:56, vignesh C <[hidden email]> wrote:
I think that's sensible.

I've had variable results with glibc's backtrace(), especially on older
platforms and/or with external debuginfo, but it's much better than
nothing. It's often not feasible to get someone to install gdb and run
commands on their production systems - they can be isolated and
firewalled, or hobbled by painful change policies. Something built in to
postgres, even if basic, is likely to come in very handy.
On Sun, Jan 17, 2021 at 10:26 PM vignesh C <[hidden email]> wrote:
>
> On Sat, Jan 16, 2021 at 11:10 PM Andres Freund <[hidden email]> wrote:
> >
> > Hi,
> >
> > On Sat, Jan 16, 2021, at 09:34, vignesh C wrote:
> > > On Sat, Jan 16, 2021 at 1:40 AM Andres Freund <[hidden email]> wrote:
> > > >
> > > > On 2021-01-15 09:53:05 +0100, Peter Eisentraut wrote:
> > > > > On 2020-12-08 10:38, vignesh C wrote:
> > > > > > I have implemented printing of backtrace based on handling it in
> > > > > > CHECK_FOR_INTERRUPTS. This patch also includes the change to allow
> > > > > > getting backtrace of any particular process based on the suggestions.
> > > > > > Attached patch has the implementation for the same.
> > > > > > Thoughts?
> > > > >
> > > > > Are we willing to use up a signal for this?
> > > >
> > > > Why is a full signal needed? Seems the procsignal infrastructure should
> > > > suffice?
> > >
> > > Most of the processes have access to ProcSignal; for these processes,
> > > printing of the callstack is handled using ProcSignal. The pgstat and
> > > syslogger processes do not have access to ProcSignal, so multiplexing
> > > with SIGUSR1 is not possible for them; I handled printing the callstack
> > > for pgstat and syslogger using the SIGUSR2 signal.
> > > This is because shared memory is detached before the pgstat and syslogger
> > > processes are started, by way of the following:
> > >
> > >     /* Drop our connection to postmaster's shared memory, as well */
> > >     dsm_detach_all();
> > >     PGSharedMemoryDetach();
> >
> > Sure. But why is it important enough to support those that we are willing
> > to dedicate a signal to the task? Their backtraces aren't often
> > interesting, so I think we should just ignore them here.
>
> Thanks for your comments Andres, I will ignore it for the processes
> which do not have access to ProcSignal. I will make the changes and
> post a patch for this soon.
>
> Regards,
> Vignesh
> EnterpriseDB: http://www.enterprisedb.com