Unified logging system for command-line programs

classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

Unified logging system for command-line programs

Peter Eisentraut-6
I have developed a patch that unifies the various ad hoc logging
(message printing, error printing) systems used throughout the
command-line programs.

Examples:


- fprintf(stderr, _("%s: could not open file \"%s\" for writing: %s\n"),
-         progname, path, strerror(errno));
+ pg_log_error("could not open file \"%s\" for writing: %m", path);


-           if (debug)
-               fprintf(stderr,
-                       _("%s: file \"%s\" would be removed\n"),
-                       progname, WALFilePath);
+           pg_log_debug("file \"%s\" would be removed", WALFilePath);


Features:

- Program name is automatically prefixed.

- Message string does not end with newline.  This removes a common
source of inconsistencies and omissions.

- Additionally, a final newline is automatically stripped, simplifying
use of PQerrorMessage() etc., another common source of mistakes.

- I converted error message strings to use %m where possible.  (I had
originally intended to implement %m here like elog used to do, but that
was thankfully already done elsewhere.)

- As a result of the above several points, more translatable message
strings can be shared between different components and between frontends
and backend, without gratuitous punctuation or whitespace differences.

- There is support for setting a "log level".  This is not meant to be
user-facing, but can be used internally to implement debug or verbose
modes, as in the above example.

- Lazy argument evaluation, so no significant overhead if logging at
some level is disabled.

- Bonus: Some color in the messages, similar to gcc and clang.  Export
PG_COLOR=auto to try it out.  The colors are currently hardcoded, so
some configuration there might be added.

- Common files (common/, fe_utils/, etc.) can handle logging much more
simply by just using one API without worrying too much about the context
of the calling program, requiring callbacks, or having to pass
"progname" around everywhere.


Soft goals:

- Reduces vertical space use and visual complexity of error reporting in
the source code.

- Encourages more deliberate classification of messages.  For example,
in some cases it wasn't clear without analyzing the surrounding code
whether a message was meant as an error or just an info.

- Concepts and terms are vaguely aligned with popular logging frameworks
such as log4j and Python logging.

- Future possibilities.  Maybe something like log_line_prefix or
different log formats could be added.  Just a theory right now, but this
would make it easier.


Non-goals/out of scope:

- Flow control.  This is all just about printing stuff out.  Nothing
affects program flow (e.g., fatal exits).  The uses are just too varied
to do that.  Some existing code had wrappers that do some kind of
print-and-exit, and I adapted those.  It didn't seem worth going any
further.


It's not fully complete but most of it works well.  I didn't do
pg_upgrade and pg_ctl yet.  pg_dump has some remaining special cases to
work through.  I tried to keep the output mostly the same, but there is
a lot of historical baggage to unwind and special cases to consider, and
I might not always have succeeded.  One significant change is that
pg_rewind used to write all error messages to stdout.  That is now
changed to stderr.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v1-0001-Unified-logging-system-for-command-line-programs.patch (661K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Tom Lane-2
Peter Eisentraut <[hidden email]> writes:
> I have developed a patch that unifies the various ad hoc logging
> (message printing, error printing) systems used throughout the
> command-line programs.

I've not read the patch in any detail, but +1 for making this more
uniform.

> - Common files (common/, fe_utils/, etc.) can handle logging much more
> simply by just using one API without worrying too much about the context
> of the calling program, requiring callbacks, or having to pass
> "progname" around everywhere.

It seems like a shame that src/common files still need to have
#ifdef FRONTEND variant code to deal with frontend vs. backend
conventions.  I wonder how hard it would be to layer some subset of
ereport() functionality on top of what you have here, so as to get
rid of those #ifdef stanzas.

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Alvaro Herrera-9
On 2018-Dec-30, Tom Lane wrote:

> Peter Eisentraut <[hidden email]> writes:
> > I have developed a patch that unifies the various ad hoc logging
> > (message printing, error printing) systems used throughout the
> > command-line programs.
>
> I've not read the patch in any detail, but +1 for making this more
> uniform.

Agreed, and the compactness is a good bonus too.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Andres Freund
In reply to this post by Tom Lane-2
Hi,

On 2018-12-30 14:45:23 -0500, Tom Lane wrote:
> I wonder how hard it would be to layer some subset of
> ereport() functionality on top of what you have here, so as to get
> rid of those #ifdef stanzas.

+many.  I think we should aim to unify the use (in contrast to the
implementation) of logging as much as possible, rather than having a
separate API for it for client programs. Not just because that facilitates
code reuse in frontend programs, but also because that's one less thing to
learn when getting started with PG.

Further down the line I think we should also port the PG_CATCH logic to
client programs.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Alvaro Herrera-9
In reply to this post by Peter Eisentraut-6
Ah, one more thing -- there's a patch by Marina Polyakova (in CC) to
make pgbench logging more regular.  Maybe that stuff should be
considered now too.  I'm not saying to patch pgbench in this commit, but
rather to have pgbench in mind while discussing the API.  I think the
last version of that was here:

https://postgr.es/m/a1bd32671a6777b78dd67b95eb68ff82@...

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Peter Eisentraut-6
In reply to this post by Tom Lane-2
On 30/12/2018 20:45, Tom Lane wrote:
> It seems like a shame that src/common files still need to have
> #ifdef FRONTEND variant code to deal with frontend vs. backend
> conventions.  I wonder how hard it would be to layer some subset of
> ereport() functionality on top of what you have here, so as to get
> rid of those #ifdef stanzas.

The patch does address that in some places:

@@ -39,12 +45,7 @@ pgfnames(const char *path)
    dir = opendir(path);
    if (dir == NULL)
    {
-#ifndef FRONTEND
-       elog(WARNING, "could not open directory \"%s\": %m", path);
-#else
-       fprintf(stderr, _("could not open directory \"%s\": %s\n"),
-               path, strerror(errno));
-#endif
+       pg_log_warning("could not open directory \"%s\": %m", path);
        return NULL;
    }

It's worth noting that less than 5 files are of concern for this, so
creating a more elaborate system would probably be more code than you'd
save at the other end.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Peter Eisentraut-6
In reply to this post by Andres Freund
On 31/12/2018 16:55, Andres Freund wrote:
> I think we should aim to unify the use (in contrast to the
> implementation) of logging as much as possible, rather than having a
> separate API for it for client programs.

I opted against doing that, for mainly two reasons: One, I think the
ereport() API is too verbose for this purpose, an invocation is usually
two to three lines.  My goal was to make logging smaller and more
compact.  Two, I think tying error reporting to flow control does not
always work well and leads to bad code and a bad user experience.
Relatedly, rewriting all the frontend programs to exception style would
end up being a 10x project to rewrite everything for no particular
benefit.  Going from 8 or so APIs to 2 is already an improvement, I
think.  If someone wants to try going further, it can be considered, but
it would be an entirely different project.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Andres Freund
Hi,

On 2019-01-03 14:28:51 +0100, Peter Eisentraut wrote:
> On 31/12/2018 16:55, Andres Freund wrote:
> > I think we should aim to unify the use (in contrast to the
> > implementation) of logging as much as possible, rather than having a
> > separate API for it for client programs.
>
> I opted against doing that, for mainly two reasons: One, I think the
> ereport() API is too verbose for this purpose, an invocation is usually
> two to three lines.

Well, then elog() could be used.


> My goal was to make logging smaller and more
> compact.  Two, I think tying error reporting to flow control does not
> always work well and leads to bad code and a bad user experience.

Not sure I can buy that, given that we seem to be doing quite OK in the backend.


> Relatedly, rewriting all the frontend programs to exception style would
> end up being a 10x project to rewrite everything for no particular
> benefit.  Going from 8 or so APIs to 2 is already an improvement, I
> think.  If someone wants to try going further, it can be considered, but
> it would be an entirely different project.

Why would it be 10x the effort, if you already touch all the relevant
log invocations? This'll just mean that the same lines will
mechanically need to be changed again.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Peter Eisentraut-6
On 03/01/2019 19:03, Andres Freund wrote:
>> My goal was to make logging smaller and more
>> compact.  Two, I think tying error reporting to flow control does not
>> always work well and leads to bad code and a bad user experience.
>
> Not sure I can buy that, given that we seem to be doing quite OK in the backend.

Consider the numerous places where we do elog(LOG) for an *error*
because we don't want to jump away.

>> Relatedly, rewriting all the frontend programs to exception style would
>> end up being a 10x project to rewrite everything for no particular
>> benefit.  Going from 8 or so APIs to 2 is already an improvement, I
>> think.  If someone wants to try going further, it can be considered, but
>> it would be an entirely different project.
>
> Why would it be 10x the effort,

Because you would have to rewrite all the programs to handle elog(ERROR)
jumping somewhere else.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Tom Lane-2
Peter Eisentraut <[hidden email]> writes:
> On 03/01/2019 19:03, Andres Freund wrote:
>>> Relatedly, rewriting all the frontend programs to exception style would
>>> end up being a 10x project to rewrite everything for no particular
>>> benefit.  Going from 8 or so APIs to 2 is already an improvement, I
>>> think.  If someone wants to try going further, it can be considered, but
>>> it would be an entirely different project.

>> Why would it be 10x the effort,

> Because you would have to rewrite all the programs to handle elog(ERROR)
> jumping somewhere else.

FWIW, this argument has nothing to do with what I was actually
proposing.  I envisioned that we'd have a wrapper in which
non-error ereports() map directly onto what you're calling
pg_log_debug, pg_log_warning, etc, while ereport(ERROR) has the
effect of writing a message and then calling exit(1).  We would
use ereport(ERROR) in exactly the places where we're now writing
a message and calling exit(1).  No change at all in program
flow control, but an opportunity to consolidate code in places
that are currently doing this sort of thing:

#ifndef FRONTEND
        ereport(ERROR,
                (errcode_for_file_access(),
                 errmsg("could not open file \"%s\" for reading: %m",
                        ControlFilePath)));
#else
    {
        fprintf(stderr, _("%s: could not open file \"%s\" for reading: %s\n"),
                progname, ControlFilePath, strerror(errno));
        exit(EXIT_FAILURE);
    }
#endif

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Peter Eisentraut-6
On 03/01/2019 22:01, Tom Lane wrote:
> I envisioned that we'd have a wrapper in which
> non-error ereports() map directly onto what you're calling
> pg_log_debug, pg_log_warning, etc,

My code does that, but the other way around.  (It's easier that way than
to unpack ereport() invocations.)

> while ereport(ERROR) has the
> effect of writing a message and then calling exit(1).

The problem is that in majority of cases the FRONTEND code, as it is
written today, doesn't want to exit() after an error.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Tom Lane-2
Peter Eisentraut <[hidden email]> writes:
> On 03/01/2019 22:01, Tom Lane wrote:
>> while ereport(ERROR) has the
>> effect of writing a message and then calling exit(1).

> The problem is that in majority of cases the FRONTEND code, as it is
> written today, doesn't want to exit() after an error.

Right, so for that you'd use ereport(WARNING) or LOG or whatever.

We'd probably need a bit of care about which ereport levels produce
exactly what output, but I don't think that's insurmountable.  We
do not need all the backend-side message levels to exist for frontend.

                        regards, tom lane

Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Andres Freund
Hi,

On 2019-01-03 17:03:43 -0500, Tom Lane wrote:
> Peter Eisentraut <[hidden email]> writes:
> > On 03/01/2019 22:01, Tom Lane wrote:
> >> while ereport(ERROR) has the
> >> effect of writing a message and then calling exit(1).
>
> > The problem is that in majority of cases the FRONTEND code, as it is
> > written today, doesn't want to exit() after an error.
>
> Right, so for that you'd use ereport(WARNING) or LOG or whatever.

Or we could just add an ERROR variant that doesn't exit. Years back
I'd proposed that we make the log level a bitmask, but it could also
just be something like CALLSITE_ERROR or something roughly along those
lines.  There's a few cases in backend code where that'd be beneficial
too.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Donald Dong
I think this patch is a nice improvement!

On Jan 3, 2019, at 2:08 PM, Andres Freund <[hidden email]> wrote:
> Or we could just add an ERROR variant that doesn't exit. Years back
> I'd proposed that we make the log level a bitmask, but it could also
> just be something like CALLSITE_ERROR or something roughly along those
> lines.  There's a few cases in backend code where that'd be beneficial
> too.

I think the logging system can also be applied on pg_regress. Perhaps even
for the external frontend applications?

The patch cannot be applied directly on HEAD. So I patched it on top of
60d99797bf. When I call pg_log_error() in initdb, I see

Program received signal SIGSEGV, Segmentation fault.
__strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
62      ../sysdeps/x86_64/multiarch/strlen-avx2.S: No such file or directory.
(gdb) bt
#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
#1  0x0000555555568f96 in dopr.constprop ()
#2  0x0000555555569ddb in pg_vsnprintf ()
#3  0x0000555555564236 in pg_log_generic ()
#4  0x000055555555c240 in main ()

I'm not sure what would be causing this behavior. I would appreciate
references or docs for testing and debugging patches more efficiently.
Now I'm having difficulties loading symbols of initdb in gdb.

Thank you,
Donald Dong

Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Peter Eisentraut-6
On 10/01/2019 05:57, Donald Dong wrote:
> I think the logging system can also be applied on pg_regress. Perhaps even
> for the external frontend applications?

Could be done, yes.  A bit at a time. ;-)

> The patch cannot be applied directly on HEAD. So I patched it on top of
> 60d99797bf.

Here is an updated patch with the merge conflicts of my own design
resolved.  No functionality changes.

> When I call pg_log_error() in initdb, I see
>
> Program received signal SIGSEGV, Segmentation fault.
> __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
> 62      ../sysdeps/x86_64/multiarch/strlen-avx2.S: No such file or directory.
> (gdb) bt
> #0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
> #1  0x0000555555568f96 in dopr.constprop ()
> #2  0x0000555555569ddb in pg_vsnprintf ()
> #3  0x0000555555564236 in pg_log_generic ()
> #4  0x000055555555c240 in main ()
What do you mean exactly by "I call pg_log_error()"?  The existing calls
in initdb clearly work, at least some of them, that is covered by the
test suite.  Are you adding new calls?

> I'm not sure what would be causing this behavior. I would appreciate
> references or docs for testing and debugging patches more efficiently.
> Now I'm having difficulties loading symbols of initdb in gdb.

The above looks like you'd probably get a better insight by compiling
with -O0 or some other lower optimization setting.

There is also this:
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v2-0001-Unified-logging-system-for-command-line-programs.patch (660K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Unified logging system for command-line programs

Donald Dong

> On Jan 11, 2019, at 9:14 AM, Peter Eisentraut <[hidden email]> wrote:
>
>> The patch cannot be applied directly on HEAD. So I patched it on top of
>> 60d99797bf.
>
> Here is an updated patch with the merge conflicts of my own design
> resolved.  No functionality changes.
>
>> When I call pg_log_error() in initdb, I see
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
>> 62      ../sysdeps/x86_64/multiarch/strlen-avx2.S: No such file or directory.
>> (gdb) bt
>> #0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
>> #1  0x0000555555568f96 in dopr.constprop ()
>> #2  0x0000555555569ddb in pg_vsnprintf ()
>> #3  0x0000555555564236 in pg_log_generic ()
>> #4  0x000055555555c240 in main ()
>
> What do you mean exactly by "I call pg_log_error()"?  The existing calls
> in initdb clearly work, at least some of them, that is covered by the
> test suite.  Are you adding new calls?

Thank you. I did add a new call for my local testing. There are no more errors
after re-applying the patch on master.

>> I'm not sure what would be causing this behavior. I would appreciate
>> references or docs for testing and debugging patches more efficiently.
>> Now I'm having difficulties loading symbols of initdb in gdb.
>
> The above looks like you'd probably get a better insight by compiling
> with -O0 or some other lower optimization setting.
>
> There is also this:
> https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

Thank you for the reference. That's very helpful!


I noticed in some places such as

        pg_log_error("no data directory specified");
        fprintf(stderr,
                                        _("You must identify the directory where the data for this database system\n"
                                        ...

and

        pg_log_warning("enabling \"trust\" authentication for local connections");
        fprintf(stderr, _("You can change this by editing pg_hba.conf or using the option -A, or\n"
                                        "--auth-local and --auth-host, the next time you run initdb.\n"));

, pg_log  does not completely replace fprintf. Would it be better to use pg_log
so the logging level can also filter these messages?