Stats collector on rampage (8.2.3)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Stats collector on rampage (8.2.3)

Gaetano Mendola
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,
it seems that the stats collector on my box is using more CPU than
it did in the past.

This is what I'm observing:

CPU usage for the stat process: 25% flat

$ psql -c "select version()"
                                                version
- -------------------------------------------------------------------------------------------------------
 PostgreSQL 8.2.3 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3)
(1 row)


$ strace -tt -p 10773
[...]
09:47:37.867655 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
09:47:37.867738 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 823) = 823
09:47:37.867820 close(3)                = 0
09:47:37.867862 munmap(0xb7ced000, 4096) = 0
09:47:37.867906 rename("global/pgstat.tmp", "global/pgstat.stat") = 0
09:47:37.868188 poll([{fd=7, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
09:47:37.868245 recv(7, "\1\0\0\0\320\3\0\0\rg\0\0\16\0\0\0\0\0\0\0\1\0\0\0\347"..., 1000, 0) = 976
09:47:37.868317 setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 500}}, NULL) = 0
09:47:37.868372 poll([{fd=7, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
09:47:37.868428 recv(7, "\1\0\0\0\320\3\0\0\rg\0\0\16\0\0\0\0\0\0\0\0\0\0\0g\n\0"..., 1000, 0) = 976
09:47:37.868501 poll([{fd=7, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
09:47:37.868559 recv(7, "\1\0\0\0\320\3\0\0\rg\0\0\16\0\0\0\0\0\0\0\0\0\0\0i\n\0"..., 1000, 0) = 976
09:47:37.868629 poll([{fd=7, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
09:47:37.868687 recv(7, "\1\0\0\0\4\3\0\0\rg\0\0\v\0\0\0\0\0\0\0\0\0\0\0^\n\0\0"..., 1000, 0) = 772
09:47:37.868757 poll([{fd=7, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
09:47:37.868815 recv(7, "\1\0\0\0\240\0\0\0\0\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0\354"..., 1000, 0) = 160
09:47:37.868886 poll([{fd=7, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
09:47:37.868944 recv(7, "\1\0\0\0\240\0\0\0\rg\0\0\2\0\0\0\1\0\0\0\0\0\0\0<\n\0"..., 1000, 0) = 160
09:47:37.869012 poll([{fd=7, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
09:47:37.869070 recv(7, "\1\0\0\0l\1\0\0\0\0\0\0\5\0\0\0\0\0\0\0\0\0\0\0\354\4\0"..., 1000, 0) = 364
09:47:37.869141 poll([{fd=7, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
09:47:37.869198 recv(7, "\1\0\0\0\300\2\0\0\rg\0\0\n\0\0\0\1\0\0\0\0\0\0\0007\n"..., 1000, 0) = 704
09:47:37.869267 poll([{fd=7, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
09:47:37.869329 recv(7, "\1\0\0\0(\1\0\0\rg\0\0\4\0\0\0\1\0\0\0\0\0\0\0/\n\0\0\0"..., 1000, 0) = 296
09:47:37.869398 poll([{fd=7, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
09:47:37.869456 recv(7, "\1\0\0\0\320\3\0\0\rg\0\0\16\0\0\0\0\0\0\0\1\0\0\0\347"..., 1000, 0) = 976
09:47:37.869524 --- SIGALRM (Alarm clock) @ 0 (0) ---
09:47:37.869575 sigreturn()             = ? (mask now [])
09:47:37.869659 getppid()               = 10768
09:47:37.869702 open("global/pgstat.tmp", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
09:47:37.869775 fstat64(3, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
09:47:37.869871 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7ced000
09:47:37.869928 write(3, "\226\274\245\1D\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
09:47:37.870252 write(3, ";\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
[...]

and doing the statistics on the system calls:


$ time strace -c -p 10773
Process 10773 attached - interrupt to quit
Process 10773 detached
% time     seconds  usecs/call     calls    errors syscall
- ------ ----------- ----------- --------- --------- ----------------
 62.70    4.980721          16    307851           write
 25.72    2.043299         156     13058      3039 poll
  9.61    0.763046         248      3078           rename
  0.64    0.050992          17      3079           open
  0.49    0.038819           4     10019           recv
  0.26    0.020469           7      3078           munmap
  0.13    0.010344           3      3078           close
  0.12    0.009425           3      3079           mmap2
  0.11    0.008353           3      3079           setitimer
  0.09    0.007114           2      3079      3039 sigreturn
  0.07    0.005923           2      3079           fstat64
  0.06    0.004734           2      3079           getppid
- ------ ----------- ----------- --------- --------- ----------------
100.00    7.943239                358636      6078 total

real    0m16.313s
user    0m1.428s
sys     0m3.802s


so instead of 32 or such rename it did the rename 3K times.

To solve the problem is it possible to kill that process? (will it be respawned?)


Regards
Gaetano Mendola











-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHJZ9I7UpzwH2SGd4RAqwrAJ9vyt5fd1rdEu+uTnef6QpFYVBbhwCePB69
b0jA7Ko85TyEfMqAmVVRy/w=
=ax+l
-----END PGP SIGNATURE-----

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org
Reply | Threaded
Open this post in threaded view
|

Re: Stats collector on rampage (8.2.3)

Magnus Hagander-2
Gaetano Mendola wrote:
> Hi all,
> it seems that the stats collector on my box is using more CPU than
> it did in the past.

This is a known bug that was fixed in 8.2.4, so you need to upgrade.

//Magnus


---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

                http://www.postgresql.org/about/donate
Reply | Threaded
Open this post in threaded view
|

Re: Stats collector on rampage (8.2.3)

hubert depesz lubaczewski-2
In reply to this post by Gaetano Mendola
On Mon, Oct 29, 2007 at 09:52:24AM +0100, Gaetano Mendola wrote:
> it seems that the stats collector on my box is using more CPU than
> it did in the past.

it's well known bug, and it was fixed in 8.2.4:
http://www.postgresql.org/docs/current/interactive/release-8-2-4.html
...
Prevent the statistics collector from writing to disk too frequently
(Tom)
...

best regards,

depesz

--
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [hidden email] so that your
       message can get through to the mailing list cleanly
Reply | Threaded
Open this post in threaded view
|

Re: Stats collector on rampage (8.2.3)

Gaetano Mendola
hubert depesz lubaczewski wrote:

> On Mon, Oct 29, 2007 at 09:52:24AM +0100, Gaetano Mendola wrote:
>> it seems that the stats collector on my box is using more CPU than
>> it did in the past.
>
> it's well known bug, and it was fixed in 8.2.4:
> http://www.postgresql.org/docs/current/interactive/release-8-2-4.html
> ...
> Prevent the statistics collector from writing to disk too frequently
> (Tom)
> ...

I saw that, upgrading the DB at this very moment is not doable, killing
that process will the postmaster respawn another one? BTW I discover that
it was triggered by the time change due the daylight saving.

Regards
Gaetano Mendola

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [hidden email] so that your
       message can get through to the mailing list cleanly
Reply | Threaded
Open this post in threaded view
|

Re: Stats collector on rampage (8.2.3)

Gaetano Mendola
In reply to this post by hubert depesz lubaczewski-2
hubert depesz lubaczewski wrote:

> > On Mon, Oct 29, 2007 at 09:52:24AM +0100, Gaetano Mendola wrote:
>> >> it seems that the stats collector on my box is using more CPU than
>> >> it did in the past.
> >
> > it's well known bug, and it was fixed in 8.2.4:
> > http://www.postgresql.org/docs/current/interactive/release-8-2-4.html
> > ...
> > Prevent the statistics collector from writing to disk too frequently
> > (Tom)
> > ...

I saw that, upgrading the DB at this very moment is not doable, killing
that process will the postmaster respawn another one? BTW I discover that
it was triggered by the time change due the daylight saving.

Regards
Gaetano Mendola

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [hidden email] so that your
       message can get through to the mailing list cleanly
Reply | Threaded
Open this post in threaded view
|

Re: Stats collector on rampage (8.2.3)

Magnus Hagander-2
In reply to this post by Gaetano Mendola
Gaetano Mendola wrote:

> hubert depesz lubaczewski wrote:
>> On Mon, Oct 29, 2007 at 09:52:24AM +0100, Gaetano Mendola wrote:
>>> it seems that the stats collector on my box is using more CPU than
>>> it did in the past.
>> it's well known bug, and it was fixed in 8.2.4:
>> http://www.postgresql.org/docs/current/interactive/release-8-2-4.html
>> ...
>> Prevent the statistics collector from writing to disk too frequently
>> (Tom)
>> ...
>
> I saw that, upgrading the DB at this very moment is not doable, killing
> that process will the postmaster respawn another one? BTW I discover that
> it was triggered by the time change due the daylight saving.

IIRC, it will. You need to change postgresql.conf and disable the stats
collector. If you do that, it won't be started.

Shouldn't be trigged by DST.

//Magnus

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
Reply | Threaded
Open this post in threaded view
|

Re: Stats collector on rampage (8.2.3)

Andrew Dunstan
In reply to this post by Gaetano Mendola


Gaetano Mendola wrote:

> hubert depesz lubaczewski wrote:
>  
>>> On Mon, Oct 29, 2007 at 09:52:24AM +0100, Gaetano Mendola wrote:
>>>      
>>>>> it seems that the stats collector on my box is using more CPU than
>>>>> it did in the past.
>>>>>          
>>> it's well known bug, and it was fixed in 8.2.4:
>>> http://www.postgresql.org/docs/current/interactive/release-8-2-4.html
>>> ...
>>> Prevent the statistics collector from writing to disk too frequently
>>> (Tom)
>>> ...
>>>      
>
> I saw that, upgrading the DB at this very moment is not doable, killing
> that process will the postmaster respawn another one? BTW I discover that
> it was triggered by the time change due the daylight saving.
>
>
>  

you do realize that this upgrade wouldn't require a dump/restore, don't
you? It would be close to instantaneous.

cheers

andrew

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
Reply | Threaded
Open this post in threaded view
|

Re: Stats collector on rampage (8.2.3)

Gaetano Mendola
In reply to this post by Magnus Hagander-2
Magnus Hagander wrote:

> Gaetano Mendola wrote:
>> hubert depesz lubaczewski wrote:
>>> On Mon, Oct 29, 2007 at 09:52:24AM +0100, Gaetano Mendola wrote:
>>>> it seems that the stats collector on my box is using more CPU than
>>>> it did in the past.
>>> it's well known bug, and it was fixed in 8.2.4:
>>> http://www.postgresql.org/docs/current/interactive/release-8-2-4.html
>>> ...
>>> Prevent the statistics collector from writing to disk too frequently
>>> (Tom)
>>> ...
>> I saw that, upgrading the DB at this very moment is not doable, killing
>> that process will the postmaster respawn another one? BTW I discover that
>> it was triggered by the time change due the daylight saving.
>
> IIRC, it will. You need to change postgresql.conf and disable the stats
> collector. If you do that, it won't be started.
>
> Shouldn't be trigged by DST.

The high cpu usage started at that time, may be that is another problem of some
applications stuck on it. I will investigate further.

Gaetano

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
Reply | Threaded
Open this post in threaded view
|

Re: Stats collector on rampage (8.2.3)

Gaetano Mendola
In reply to this post by Andrew Dunstan
Andrew Dunstan wrote:

>
>
> Gaetano Mendola wrote:
>> hubert depesz lubaczewski wrote:
>>  
>>>> On Mon, Oct 29, 2007 at 09:52:24AM +0100, Gaetano Mendola wrote:
>>>>      
>>>>>> it seems that the stats collector on my box is using more CPU than
>>>>>> it did in the past.
>>>>>>          
>>>> it's well known bug, and it was fixed in 8.2.4:
>>>> http://www.postgresql.org/docs/current/interactive/release-8-2-4.html
>>>> ...
>>>> Prevent the statistics collector from writing to disk too frequently
>>>> (Tom)
>>>> ...
>>>>      
>>
>> I saw that, upgrading the DB at this very moment is not doable, killing
>> that process will the postmaster respawn another one? BTW I discover that
>> it was triggered by the time change due the daylight saving.
>>
>>
>>  
>
> you do realize that this upgrade wouldn't require a dump/restore, don't
> you? It would be close to instantaneous.

Sure I do, for me it's even easier than that, I use DRBD so I can just shut
down one node upgrade it and then upgrade the other one; however upgrade the
DB means for me "touch" an entire satellite trasmission platform, I have to
schedule a platform maintenance for that...

Regards
Gaetano Mendola

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq