--MimeMultipartBoundary
Content-Type: text/plain; charset=us-ascii
Hi
Michael said:
> Something is very wrong there.
hmm.
Attaching to the thread:
aio_thread_loop (ptr=0x809c060) at aiops.c:217
217 }
(gdb) n
211 abstime.tv_sec = squid_curtime + 3;
(gdb) n
212 abstime.tv_nsec = 0;
(gdb) n
213 threadp->status = _THREAD_WAITING;
(gdb) n
214 ret = pthread_cond_timedwait(&(threadp->cond),
(gdb)
217 }
(gdb)
211 abstime.tv_sec = squid_curtime + 3;
(gdb)
212 abstime.tv_nsec = 0;
(gdb)
213 threadp->status = _THREAD_WAITING;
(gdb)
214 ret = pthread_cond_timedwait(&(threadp->cond),
Doing an 'strace' of the thread (with kill -CONT...)
gettimeofday({887695778, 975004}, NULL) = 0
gettimeofday({887695779, 65626}, NULL) = 0
gettimeofday({887695779, 66551}, NULL) = 0
gettimeofday({887695779, 67453}, NULL) = 0
gettimeofday({887695779, 68356}, NULL) = 0
gettimeofday({887695779, 69259}, NULL) = 0
gettimeofday({887695779, 70162}, NULL) = 0
gettimeofday({887695779, 71065}, NULL) = 0
gettimeofday({887695779, 71968}, NULL) = 0
gettimeofday({887695779, 72871}, NULL) = 0
gettimeofday({887695779, 73772}, NULL) = 0
gettimeofday({887695779, 74675}, NULL) = 0
gettimeofday({887695779, 75618}, NULL) = 0
gettimeofday({887695779, 76523}, NULL) = 0
gettimeofday({887695779, 77424}, NULL) = 0
gettimeofday({887695779, 78385}, NULL) = 0
gettimeofday({887695779, 79287}, NULL) = 0
gettimeofday({887695779, 80189}, NULL) = 0
gettimeofday({887695779, 81090}, NULL) = 0
gettimeofday({887695779, 81991}, NULL) = 0
gettimeofday({887695779, 82892}, NULL) = 0
gettimeofday({887695779, 83793}, NULL) = 0
gettimeofday({887695779, 84695}, NULL) = 0
newt:/usr/local/squid/logs # ./wall-time < /home/oskar/squid/squid-1.2.beta15/src/1088.out
System call Count Total time Time per call % total
gettimeofday 4254 20.0968 0.004724 100.00000
Total time elapsed: 20.09676
The problem seems to be that aio_thread_loop is going a little moggy :) The
only function it's calling is gettimeofday...
I am going to recompile libpthread with debugging and see if it actually enters
the functions...
pthread_cond_timedwait (cond=0x809c080, mutex=0x809c068, abstime=0xbf5ffef0)
at condvar.c:137
(hmm. I hope you don't mind extracting the source somewhere...)
(gdb) print now.tv_sec
$15 = 887697370
....
(gdb) print abstime->tv_sec
$12 = 887696514
...
(gdb) print reltime.tv_sec
$10 = -857
Ok, fine... it's because I can't hit 'enter' fast enough, right?
Or is it that the 'hard-tuned' number just isn't big enough on a slow
machine like mine... meaning that it goes beserk and slows down even more.
and therefore never actually returns within the wait time.
To test I put 2 variables in to aiops.c... one that's set when
pthread_cond_timedwait returns ETIMEDOUT, one whenever it isn't.
Check this after a minute or so..
(gdb) attach 2362
Attaching to program `/usr/local/squid/bin/squid', process 2362
Reading symbols from /lib/libpthread.so.0...done.
Reading symbols from /lib/libm.so.5...done.
Reading symbols from /lib/libc.so.5...done.
Reading symbols from /lib/ld-linux.so.1...done.
0x804ee50 in aio_thread_loop (ptr=0x809c978) at aiops.c:222
222 if (ret == ETIMEDOUT) {
(gdb) s
223 errc++;
(gdb)
224 } else {
(gdb)
228 }
(gdb) print errc
$1 = 985256
(gdb) print loopc
$2 = 0
it's always returning ETIMEDOUT... means it just uses MORE cpu power
and other threads also never return before ETIMEDOUT...
I could be misunderstanding something with the code here.. it's been
a looong night :(
What about some kind of 'exponential backoff'? It's better to save CPU than
'keep the threads hot in ram'... and when the machine gets really loaded
it means that things will still work... (as long as to have incrementation,
at least linear).
Oskar
--- "Haven't slept at all. I don't see why people insist on sleeping. You feel so much better if you don't. And how can anyone want to lose a minute - a single minute of being alive?" -- Think Twice --MimeMultipartBoundary--Received on Tue Jul 29 2003 - 13:15:46 MDT
This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:11:43 MST