Hi,
I'm going to propose a libpq debug log for analysis of queries on the application side. I think it is useful for determining whether the cause is on the application side or the server side when a slow query occurs.

The provided information is: the "date and time" at which execution of processing started, the "query", the "application side processing" (picked up from the information PQtrace() provides), and a "connection id" that uniquely identifies the connection.

To collect the log, set the connection string or environment variables:
- logdir or PGLOGDIR : directory where the log file is written
- logsize or PGLOGSIZE : maximum log size

What do you think about this? Do you think such a feature is necessary?

Regards,
Aya Iwata
"Iwata, Aya" <[hidden email]> writes:
> I'm going to propose libpq debug log for analysis of queries on the application side.
> I think that it is useful to determine whether the cause is on the application side or the server side when a slow query occurs.

Hm, how will you tell that really? And what's the advantage over the existing server-side query logging capability?

> The provided information is "date and time" at which execution of processing is started, "query", "application side processing", which is picked up information from PQtrace(), and "connection id", which is for uniquely identifying the connection.

PQtrace() is utterly useless for anything except debugging libpq internals, and it's not tremendously useful even for that. Don't bother with that part.

Where will you get a "unique connection id" from?

How will you deal with asynchronously-executed queries --- either the PQgetResult style, or the single-row-at-a-time style?

			regards, tom lane
In reply to this post by Iwata, Aya
On Fri, 24 Aug 2018 04:38:22 +0000
"Iwata, Aya" <[hidden email]> wrote:

> Hi,
>
> I'm going to propose libpq debug log for analysis of queries on the application side.
> I think that it is useful to determine whether the cause is on the application side or the server side when a slow query occurs.

Do you mean you want to monitor the protocol message exchange at the client side to analyze performance issues, right? Actually, this might be useful to determine where the problem is, for example, in the client application, the backend PostgreSQL, or somewhere in the network.

Such logging can be implemented in the application, but if libpq provides a standard way, it would be helpful for resolving a problem without modifying the application code.

> The provided information is "date and time" at which execution of processing is started, "query", "application side processing", which is picked up information from PQtrace(), and "connection id", which is for uniquely identifying the connection.

I couldn't imagine what this looks like. Could you show us a sample of the log lines you have in mind?

> To collect the log, set the connection string or environment variable.
> - logdir or PGLOGDIR : directory where log file written
> - logsize or PGLOGSIZE : maximum log size

How can we specify the log file name? What should happen if the file size exceeds PGLOGSIZE?

Regards,

-- 
Yugo Nagata <[hidden email]>
In reply to this post by Tom Lane-2
On Fri, Aug 24, 2018 at 9:48 AM, Tom Lane <[hidden email]> wrote:
> PQtrace() is utterly useless for anything except debugging libpq
> internals, and it's not tremendously useful even for that. Don't
> bother with that part.

I think that improving the output format could help with that a lot. What it currently produces is almost unreadable; adjusting it to emit one line per protocol message would, I think, help a lot. There are probably other improvements that could be made at the same time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <[hidden email]> writes:
> On Fri, Aug 24, 2018 at 9:48 AM, Tom Lane <[hidden email]> wrote:
>> PQtrace() is utterly useless for anything except debugging libpq
>> internals, and it's not tremendously useful even for that. Don't
>> bother with that part.

> I think that improving the output format could help with that a lot.
> What it currently produces is almost unreadable; adjusting it to emit
> one line per protocol message would, I think, help a lot. There are
> probably other improvements that could be made at the same time.

I wouldn't mind throwing it out and reimplementing it ;-) ... tracing at the logical message level rather than the byte level would help. But still, I'm not really convinced that it has very much to do with what you'd want in a user-level debug log.

Part of what's really bad about the PQtrace output is that in v2 protocol the same output can be repeated several times as we try to parse a message and conclude we don't have it all yet. I believe that problem is gone in v3, but it may be hard to do a consistent redesign until we nuke libpq's v2 support. Still, it might be past time for the latter, seeing that we no longer promise pre-7.4 compatibility in either psql or pg_dump.

			regards, tom lane
In reply to this post by Tom Lane-2
> "Iwata, Aya" <[hidden email]> writes:
> > I'm going to propose libpq debug log for analysis of queries on the application side.
> > I think that it is useful to determine whether the cause is on the application side or the server side when a slow query occurs.
>
> Hm, how will you tell that really? And what's the advantage over the existing
> server-side query logging capability?

The log I would like to propose is used when a performance issue happens: the system administrator, who knows the internal processing of the application, checks whether there is any problem. "debug" is not the correct description of the feature; the correct one should be "trace". Should I create another thread?

> > The provided information is "date and time" at which execution of processing
> > is started, "query", "application side processing", which is picked up
> > information from PQtrace(), and "connection id", which is for uniquely
> > identifying the connection.
>
> PQtrace() is utterly useless for anything except debugging libpq internals,
> and it's not tremendously useful even for that. Don't bother with that part.

My initial intention was to get only the useful information from PQtrace(), since it acquires a lot of information. Is there another way to obtain application-side information besides PQtrace()?

> Where will you get a "unique connection id" from?

When collecting the trace log file on the application side, I think it is necessary to identify different connections. To do this, when a new empty PGconn structure is created, a new connection id is also created. We then output it in the trace log file for one application.

> How will you deal with asynchronously-executed queries --- either the
> PQgetResult style, or the single-row-at-a-time style?

In my understanding, the PQgetResult style outputs logs for multiple query results at once, while the single-row-at-a-time style outputs a log for each row. Is this correct? I think the PQgetResult style is better, because this style traces the internal processing of the application.
Regards,
Aya Iwata
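[Editor's sketch] The connection-id scheme described above — a new id assigned when each connection object is created — could be as simple as a process-local counter. The struct and function names here are illustrative, not libpq's:

```c
/* Give every connection object a process-local id at creation time,
 * so trace lines from concurrent connections can be told apart in
 * one log file. */

static unsigned long next_conn_id = 0;

typedef struct TraceConn
{
    unsigned long conn_id;      /* unique within this process */
} TraceConn;

static TraceConn trace_conn_create(void)
{
    TraceConn c;

    /* A real implementation would need an atomic increment here,
     * since applications may open connections from many threads. */
    c.conn_id = ++next_conn_id;
    return c;
}
```

Note this id is only unique within one process; logs from several application processes would also need the pid (or similar) to disambiguate.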
In reply to this post by Yugo Nagata
> > I'm going to propose libpq debug log for analysis of queries on the application side.
> > I think that it is useful to determine whether the cause is on the application side or the server side when a slow query occurs.
>
> Do you mean you want to monitor the protocol message exchange at the client
> side to analyze performance issues, right? Actually, this might be useful to
> determine where the problem is, for example, in the client application, the backend
> PostgreSQL, or somewhere in the network.
>
> Such logging can be implemented in the application, but if libpq provides a
> standard way, it would be helpful for resolving a problem without modifying the
> application code.

Since I'd like to monitor the information the server and the client exchange, I think monitoring protocol messages is good.

When a slow query occurs, we check this client-side trace log. The purpose of this log acquisition is to identify where the problem is: the server side, the application side, or the network. And if the problem is on the application side, we check the trace log to identify what the problem is.

> > The provided information is "date and time" at which execution of processing
> > is started, "query", "application side processing", which is picked up
> > information from PQtrace(), and "connection id", which is for uniquely
> > identifying the connection.
>
> I couldn't imagine what this looks like. Could you show us a sample of the log
> lines you have in mind?

I am roughly thinking as follows:

...
START           : 2018/09/03 18:16:35.357 CONNECTION(1)
STATUS          : Connection
SEND MESSAGE    : 2018/09/03 18:16:35.358
<information send to backend...>
RECEIVE MESSAGE : 2018/09/03 18:16:35.359
<information receive from backend...>
END             : 2018/09/03 18:16:35.360
...
START           : 2018/09/03 18:16:35.357 CONNECTION(1)
QUERY           : DECLARE myportal CURSOR FOR select * from pg_database
SEND MESSAGE    : 2018/09/03 18:16:35.358
<information send to backend...>
RECEIVE MESSAGE : 2018/09/03 18:16:35.359
<information receive from backend...>
END             : 2018/09/03 18:16:35.360
...
> > To collect the log, set the connection string or environment variable.
> > - logdir or PGLOGDIR : directory where log file written
> > - logsize or PGLOGSIZE : maximum log size
>
> How can we specify the log file name? What should happen if the file size
> exceeds PGLOGSIZE?

The log file name is determined as follows:

libpq-%ApplicationName-%Y-%m-%d_%H%M%S.log

When the log file size exceeds PGLOGSIZE, the log is output to another file.

Regards,
Aya Iwata
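[Editor's sketch] The rotation rule above — switch to a fresh timestamped file once the current one exceeds PGLOGSIZE — could be implemented along these lines. The function name, file-name pattern, and error handling are illustrative assumptions:

```c
#include <stdio.h>
#include <time.h>

/* Before a write, check the current file size against the limit and
 * open a new timestamped file when the limit has been reached.
 * Returns the (possibly new) stream, or NULL if reopening failed. */
static FILE *
rotate_if_needed(FILE *logfile, long max_bytes, const char *logdir)
{
    long pos = ftell(logfile);

    if (pos >= 0 && pos >= max_bytes)
    {
        char       path[512];
        time_t     now = time(NULL);
        struct tm *tm = localtime(&now);

        fclose(logfile);
        snprintf(path, sizeof(path),
                 "%s/libpq-%04d-%02d-%02d_%02d%02d%02d.log",
                 logdir,
                 tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday,
                 tm->tm_hour, tm->tm_min, tm->tm_sec);
        logfile = fopen(path, "a");
    }
    return logfile;
}
```

One design question this exposes: second-resolution timestamps in the file name mean a rotation happening twice within one second would append to the same file, so a sequence number may also be needed.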
"Iwata, Aya" <[hidden email]> writes:
> The purpose of this log acquisition I thought is to identify where is the problem:
> server side, application side or traffic.

TBH, I think the sort of logging you're proposing would be expensive enough that *it* would be the bottleneck in a lot of cases. A lot of people find that the existing server-side "log_statement" support is too expensive to keep turned on in production --- and that logs only received SQL queries, not the returned data, and certainly not every message passed over the wire.

			regards, tom lane
In reply to this post by Iwata, Aya
On 04/09/2018 02:29, Iwata, Aya wrote:
> Since I'd like to monitor the information the server and the client exchange,
> I think monitoring protocol messages is good.
>
> When a slow query occurs, we check this client side trace log.
> The purpose of this log acquisition is to identify where the problem is:
> server side, application side or traffic.
> And if the problem is in the application side, checking the trace log to identify what the problem is.

Between perf/systemtap/dtrace and wireshark, you can already do pretty much all of that. Have you looked at those and found anything missing?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
In reply to this post by Tom Lane-2
Let me explain this trace log in a bit more detail.
This log is not turned on in the system by default. Turn it on when an issue occurs and you want to check the information in the application in order to clarify the cause.

I will present three use cases for this log.

1. Confirming the cause of an out-of-memory condition
We assume that an out-of-memory error occurred in the process of storing the data received from the database, but the contents or length of the data are not known. A trace log is obtained to find these out, and to see what kind of data is present in which part (i.e. the query result when receiving from the database).

2. Protocol error confirmation
When there is an error in the protocol messages transmitted from the client and an error occurs in the database server, the protocol messages sent by the client are checked. When the network is unstable, the log is checked to see whether the database server is receiving the protocol messages.

3. Processing delay investigation
If the processing in the application is slow while the processing in the database is fast, investigate the cause of the processing delay. Four kinds of timestamps can be obtained from the log:
- Timestamp when SQL started
- Timestamp when information began to be sent to the backend
- Timestamp when information was successfully received in the application
- Timestamp when SQL ended
The differences between these times are then checked to find out which part of the process takes time.

Let me restate the function I proposed. To get the trace log, set PGLOGDIR/logdir and PGLOGSIZE/logsize. These parameters are set in the environment variables or the connection service file.
- logdir or PGLOGDIR : directory where the log file is written
- logsize or PGLOGSIZE : maximum log size. When the log file size exceeds PGLOGSIZE, the log is output to another file.

The log file name is determined as follows:

libpq-%ConnectionID-%Y-%m-%d_%H%M%S.log

This is a trace log example:

...
Start: 2018-09-03 18:16:35.357 Connection(1)
Status: Connection
Send message: 2018-09-03 18:16:35.358
<information send to backend...>
Receive message: 2018-09-03 18:16:35.359
<information receive from backend...>
End: 2018-09-03 18:16:35.360
...
Start: 2018-09-03 18:16:35.357 Connection(1)                  ...(1), (2)
Query: DECLARE myportal CURSOR FOR select * from pg_database  ...(3)
Send message: 2018-09-03 18:16:35.358                         ...(4)
<information send to backend...>                              ...(5)
Receive message: 2018-09-03 18:16:35.359                      ...(6)
<information receive from backend...>                         ...(7)
End: 2018-09-03 18:16:35.360                                  ...(8)
...

(1) Timestamp when SQL started
(2) Connection ID (identifies the connection)
(3) SQL statement
(4) Timestamp when information began to be sent to the backend
(5) Message sent to the backend (query, protocol messages)
(6) Timestamp when information was successfully received in the application
(7) Message received from the backend (query result, protocol messages)
(8) Timestamp when SQL ended

Regards,
Iwata Aya
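[Editor's sketch] Producing the millisecond timestamps shown in the sample above ("2018-09-03 18:16:35.357") might look like this. gettimeofday() gives microsecond resolution, of which three digits are kept; the helper name is illustrative:

```c
#include <stdio.h>
#include <sys/time.h>
#include <time.h>

/* Write "YYYY-MM-DD HH:MM:SS.mmm" (local time) into buf. */
static void
trace_timestamp(char *buf, size_t buflen)
{
    struct timeval tv;
    struct tm     *tm;
    char           datebuf[32];

    gettimeofday(&tv, NULL);
    tm = localtime(&tv.tv_sec);
    strftime(datebuf, sizeof(datebuf), "%Y-%m-%d %H:%M:%S", tm);
    snprintf(buf, buflen, "%s.%03ld", datebuf, (long) tv.tv_usec / 1000);
}
```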
In reply to this post by Peter Eisentraut-6
Hi,
Sorry for my late response.

> Between perf/systemtap/dtrace and wireshark, you can already do pretty much
> all of that. Have you looked at those and found anything missing?

These commands provide detailed information to us. However, I think the trace log is necessary for the following reasons.

1. Ease of use for users
It is necessary to provide information that is easy to understand for database users. This trace log is retrieved on the application server side; not only the database developer but also the application developer will get and check this log. Also, some of these commands return detailed PostgreSQL function names. The trace log would be useful for users who do not know the inside of PostgreSQL (e.g. application developers).

2. The same information can be obtained on all OSes
Some commands depend on the OS. I think it is important that the trace log information is the same on each OS.

Regards,
Aya Iwata
Hi,
I created a first libpq trace log patch. In this patch:

- All messages that PQtrace() gets are output to the libpq trace log file (I may select more useful messages in a future patch)
- The trace log output style is changed slightly from what was previously proposed

This patch does not include documentation, but you can see the parameter details and how to use it by looking at my previous e-mail.

To get the trace log, set PGLOGDIR/logdir and PGLOGSIZE/logsize. These parameters are set in the environment variables or the connection service file.
- logdir or PGLOGDIR : directory where the log file is written
- logsize or PGLOGSIZE : maximum log size (MB). When the log file size exceeds PGLOGSIZE, the log is output to another file.

The log file name is determined as follows:

libpq-%ProcessID-%Y-%m-%d_%H%M%S.log

Trace log example:

Start : 2018/10/30 08:02:24.433                                 ... time(a)
Query: SELECT pg_catalog.set_config('search_path', '', false)
To backend> Msg Q
To backend> "SELECT pg_catalog.set_config('search_path', '', false)"
To backend> Msg complete, length 60
Start sending message to backend: 2018/10/30 08:02:24.433       ... time(b)
End sending message to backend: 2018/10/30 08:02:24.433         ... time(c)
Start receiving message from backend: 2018/10/30 08:02:24.434   ... time(d)
End receiving message from backend: 2018/10/30 08:02:24.434     ... time(e)
From backend> T
From backend (#4)> 35
From backend (#2)> 1
From backend> "set_config"
From backend (#4)> 0
From backend (#2)> 0
From backend (#4)> 25
From backend (#2)> 65535
From backend (#4)> -1
From backend (#2)> 0
From backend> D
From backend (#4)> 10
From backend (#2)> 1
From backend (#4)> 0
From backend> C
From backend (#4)> 13
From backend> "SELECT 1"
From backend> Z
From backend (#4)> 5
From backend> Z
From backend (#4)> 5
From backend> I
End : 2018/10/30 08:02:24.435                                   ... time(f)

From time(a) to time(b): time for libpq processing
From time(b) to time(c): time for traffic
From time(c) to time(d): time for backend processing
From time(d) to time(e): time for traffic
From time(e) to time(f): time for libpq processing

Regards,
Aya Iwata
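[Editor's sketch] The breakdown above amounts to subtracting pairs of captured timestamps. A minimal helper for that, assuming the times are held as struct timeval:

```c
#include <sys/time.h>

/* Elapsed milliseconds between two captured timevals, e.g.
 * time(b)..time(c) for network time or time(c)..time(d) for
 * backend processing time. */
static long
elapsed_ms(const struct timeval *start, const struct timeval *end)
{
    return (end->tv_sec - start->tv_sec) * 1000L
         + (end->tv_usec - start->tv_usec) / 1000L;
}
```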
Greetings,
This is my first attempt at a patch review, so I will take a pass at the low-hanging fruit.

Initial Pass
============

+ Patch applies
+ Patch builds
+ Patch behaves as described in the thread

I tried a few small things:

When I set a relative path for `PGLOGDIR`, the files were correctly written to the directory.

When I set a path for `PGLOGDIR` that didn't exist or was not write-able, the patch writes no files, and does not alert the user that no files are being written.

Performance
===========

I ran two permutations of make check, one with the patch applied but not activated, and the other with the files being written to disk. Each permutation was run ten times, and the stats are below (times are in seconds):

              min    max    median   mean
not logging   50.4   57.6   53.3     53.4
logging       58.3   77.7   65.0     65.8

Cheers,
Jim Doty
In reply to this post by Iwata, Aya
On Tue, Oct 30, 2018 at 2:39 AM Iwata, Aya <[hidden email]> wrote:
> I create a first libpq trace log patch.

Couple additional thoughts from a read-through of the patch:

- PQtrace() and the new trace-logging machinery overlap in some places but not others -- and if both are set, PQtrace() will take precedence. It seems like the two should not be separate.

- It isn't immediately clear to me how the new information in the logs is intended to be used in concert with the old information. Maybe this goes back to the comments by Tom and Robert higher in the thread -- that an overhaul of the PQtrace system is due. This patch as presented would make things a little worse before they got better, I think.

That said, I think the overall idea -- application performance information that can be enabled via the environment, without requiring debugging privileges on a machine or the need to manually correlate traces made by other applications -- is a good one, and something I would use.

--Jacob
In reply to this post by Iwata, Aya
On Tue, Oct 30, 2018 at 8:38 PM Iwata, Aya <[hidden email]> wrote:

Hi,

Thanks for the patch. I have some comments related to the trace output that is getting printed.

The amount of log it generates may not be understandable to many application developers. IMO, by default this should print only the necessary information that can be understood by anyone, with the full log available via additional configuration?

Regards,
Haribabu Kommi
Fujitsu Australia
In reply to this post by Jim Doty
Hi Jim Doty san,
Thank you for the review! I'm sorry for my late reply...

> Initial Pass
> ============
>
> + Patch applies
> + Patch builds
> + Patch behaves as described in the thread

Thank you for your check.

> When I set a path for `PGLOGDIR` that didn't exist or was not write-able,
> the patch writes no files, and does not alert the user that no files are being
> written.

I understand. I think this means it is necessary to be able to confirm whether the setting is working. There is no warning method when the connection string or the environment variable is wrong, so I added the following documentation:

+ If the setting of the file path by the connection string or the environment variable is
+ incorrect, the log file is not created in the intended location.
+ The maximum log file size you set is output to the beginning of the file, so you can check it.

I also added the corresponding processing. Please see my v2 patch.

> Performance
> ===========
>
> I ran two permutations of make check, one with the patch applied but not
> activated, and the other with the files being written to disk. Each
> permutation was run ten times, and the stats are below (times are in
> seconds):
>
>               min    max    median   mean
> not logging   50.4   57.6   53.3     53.4
> logging       58.3   77.7   65.0     65.8

I will work on minimizing the impact of logging on performance.

Regards,
Aya Iwata
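[Editor's sketch] The check Jim's review asks for — warn instead of silently writing nothing when the directory is bad — could be done at connect time with access(2). The function name and message wording are illustrative:

```c
#include <stdio.h>
#include <unistd.h>

/* Verify the configured trace-log directory is usable.
 * Returns 1 if logging can proceed, 0 (with a warning on stderr)
 * if it should be disabled. */
static int
check_logdir(const char *logdir)
{
    if (logdir == NULL || access(logdir, W_OK | X_OK) != 0)
    {
        fprintf(stderr,
                "libpq: trace log directory \"%s\" is missing or "
                "not writable; trace logging disabled\n",
                logdir ? logdir : "(null)");
        return 0;
    }
    return 1;
}
```

An open question is where such a warning should go for applications that do not have a visible stderr; libpq's existing convention of surfacing problems through the connection's error message may fit better.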
In reply to this post by Haribabu Kommi-2
Hi Hari san,

Thank you for your comment! And sorry for my late reply...

> I have some comments related to the trace output that is getting
> printed. The amount of log it is generating may not be understood
> to many of the application developers. IMO, this should print
> only the necessary information that can understood by any one
> by default and full log with more configuration?

I understand, and I agree with your opinion. I will add a feature called "log level" that changes the amount of log output information in a future version of the patch.

Regards,
Aya Iwata |
In reply to this post by Jacob Champion
Hi Jacob san,
Thank you for your comment! And sorry for the late reply...

> Couple additional thoughts from a read-through of the patch:
>
> - PQtrace() and the new trace-logging machinery overlap in some places but
> not others -- and if both are set, PQtrace() will take precedence.
> It seems like the two should not be separate.

I understand. This log is acquired for the purpose of investigating which part (server side or application side) is the cause when performance is bad. So I will research whether covering the other places PQtrace() handles is necessary or not. I will reply after the research; please wait a little while.

> - It isn't immediately clear to me how the new information in the logs is
> intended to be used in concert with the old information. Maybe this goes back
> to the comments by Tom and Robert higher in the thread -- that an overhaul
> of the PQtrace system is due. This patch as presented would make things a
> little worse before they got better, I think.
>
> That said, I think the overall idea -- application performance information
> that can be enabled via the environment, without requiring debugging
> privileges on a machine or the need to manually correlate traces made by other
> applications -- is a good one, and something I would use.

Regards,
Aya Iwata
Hi,
I created a new version of the patch. Please find it attached.

Changes:

Since the v2 patch (I forgot to write the details of the v2 patch changes, so I list them here):
- Fixed the "Start receiving message from backend:" message location, because I found that the message was located outside of "retry4".
- Changed the location where the "Start :"/"End :" messages and timestamps are output. The reason for the change is that the v1 patch did not handle Asynchronous Command Processing.
- Added documentation.
- Added documentation on how to check for mistakes in logdir and/or logsize (based on Jim san's review comment).

Since the v3 patch:
- Fixed my mistake in fe-connect.c. Time and messages were output at a place where the original PQtrace() does not output them. These messages are needed only for the new trace log, so I fixed it.
- Fixed two points so that the new trace log covers all the places PQtrace() does (based on Jacob san's review comment).

TODO:
I will add the feature called "logging level" in the next version of the patch. I will attach it as soon as possible.

I'm marking it as "Needs review".

Regards,
Aya Iwata
On 27/11/2018 08:42, Iwata, Aya wrote:
> I created a new version of the patch. Please find it attached.

This does not excite me. It seems mostly redundant with using tcpdump. If I were to debug networking problems, I wouldn't even trust this very much, because it relies on the willpower of all future PostgreSQL developers to keep this accurately up to date, whereas tcpdump gives me the truth from the kernel.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services