Gideon Glass wrote:
>
>
> Hmm. Thanks for your words. The target filesystem in this case is EXT2,
> and there are six cache-disks. The entire disk subsystem (including
> cache) never exceeds 400-600 blocks of throughput per second (about half
> in and half out, at peak)... Physically, the disks spend most of each
> second idle.
>
> Whatever is causing the problem is throttling us to the tune of maybe 40
> requests/second (or more). Despite not being (apparently) disk bound,
> I'm tempted to try async-io and see what happens.
>
> If you're not using async I/O, try this for kicks. Put gettimeofday()
> calls around open/close/read/write/unlink calls on disk objects and
> record the latencies of these calls. I suspect you'll find that open
> dominates the others.
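>
> Something like this, off the top of my head (an untested sketch; the
> helper and the stderr logging are placeholders, wire it into whatever
> debug output you prefer):
>
>     /* untested sketch: time a disk open() with gettimeofday() */
>     #include <sys/types.h>
>     #include <sys/time.h>
>     #include <fcntl.h>
>     #include <stdio.h>
>
>     static long usec_between(struct timeval *a, struct timeval *b)
>     {
>         return (b->tv_sec - a->tv_sec) * 1000000L
>              + (b->tv_usec - a->tv_usec);
>     }
>
>     int timed_open(const char *path, int flags, mode_t mode)
>     {
>         struct timeval t0, t1;
>         int fd;
>
>         gettimeofday(&t0, NULL);
>         fd = open(path, flags, mode);
>         gettimeofday(&t1, NULL);
>         fprintf(stderr, "open(%s) = %d, %ld usec\n",
>                 path, fd, usec_between(&t0, &t1));
>         return fd;
>     }
>
> Same treatment for close/read/write/unlink, then compare the numbers.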
>
> gid
>
>
> D
>
>
> Andres Kroonmaa wrote:
> >
> > On 26 Apr 99, at 8:34, Dancer <dancer@zeor.simegen.com> wrote:
> > > (When replying, make sure I get a copy. The list removed me from
> > > squid-dev back a while ago, when I moved states and email addresses)
> >
> > > Checking the access logs reveals that connection durations climbed
> > > rapidly to 50,000 ms and more after the critical mark, and requests are
> > > still being processed, completed and logged for a minute or two after
> > > the client has ceased (check the server... still serving, yes).
> >
> > > Run test at 100/sec. Runs for some time. Symptoms repeat. I'm quietly
> > > watching 'vmstat 5' this whole time, and watching the disk buffers
> > > filling (490 MB now... squid's cache_mem is set at 8 MB just as a reference
> >
> >
> > > Now:
> > >
> > > ASSUMPTION: Squid's lookup time for a given object does not grow
> > > significantly as the number of objects increases. (It'll grow, but not
> > > hugely)
> >
> > Agree.
> >
> > > GUESS: The slowdown is the operating system looking through a
> > > half-gigabyte of cache-blocks to find objects, and actually managing to
> > > exceed the time it would take to fetch the object from the physical
> > > disk.
> >
> > Agree that it is more an OS issue than a squid issue.
> >
> > But I don't think it's cache-block searching that's bogging you down.
> > Cache blocks are supposedly handled by the VM system in round-robin
> > fashion, flushing/freeing dirty blocks and allocating new ones for
> > buffering. It should all be fairly fast, as long as there is no need
> > to wait for disk io and there are enough kernel internal resources.
> >
> > There might be some notion of a max pageout rate that is limiting the
> > disk io rate. I'm not a Linux guy, but on Solaris they intentionally
> > limit the max pageout rate with a kernel variable, so as not to
> > saturate the disks.
> >
> > So, if your request rate exceeds the max pageout rate, things add up
> > until there is no free cache left and disk thrashing starts.
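> >
> > (If memory serves, the Solaris knob for this is maxpgio, set in
> > /etc/system -- the value below is only a placeholder, size it for
> > your spindle count:)
> >
> >     * /etc/system: raise the pageout ceiling (example value only)
> >     set maxpgio = 1024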
> >
> > > Requesting clue-transfer,
> >
> > On most UNIXes there is a notion of a file (page) cache, and a
> > _separate_ metadata buffer cache. The file cache is allowed to fill
> > all unused RAM, but the buffer cache is used for UFS metadata io, is
> > limited in max size, and is sometimes fixed in the kernel at compile
> > time. This cache is where file and free-space allocation, file open,
> > close and unlink happen. Usually, metadata io is not write-cached but
> > synchronous. This is what limits the number of file creations per
> > second to disk performance levels.
> > If you have async metadata updates, then the size of this cache
> > determines for how long the OS can sustain delaying this disk io,
> > after which things get slow again.
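> >
> > A crude way to see that ceiling (my throwaway sketch, nothing from
> > squid itself -- the path and loop count are made up, point it at one
> > of your cache disks):
> >
> >     /* create+unlink small files in a loop and report the sustained
> >      * rate; with synchronous metadata updates this tends to track
> >      * what the disk can do, not what the CPU can do. */
> >     #include <fcntl.h>
> >     #include <stdio.h>
> >     #include <sys/time.h>
> >     #include <unistd.h>
> >
> >     int main(void)
> >     {
> >         struct timeval t0, t1;
> >         char name[128];
> >         int i, n = 2000;
> >         double secs;
> >
> >         gettimeofday(&t0, NULL);
> >         for (i = 0; i < n; i++) {
> >             int fd;
> >             sprintf(name, "/cache1/tmp/bench.%d", i); /* made-up path */
> >             fd = open(name, O_CREAT | O_WRONLY, 0644);
> >             if (fd >= 0)
> >                 close(fd);
> >             unlink(name);
> >         }
> >         gettimeofday(&t1, NULL);
> >         secs = (t1.tv_sec - t0.tv_sec)
> >              + (t1.tv_usec - t0.tv_usec) / 1000000.0;
> >         printf("%d create+unlink cycles in %.2f s (%.0f/sec)\n",
> >                n, secs, n / secs);
> >         return 0;
> >     }
> >
> > If that rate comes out far below what the raw disks should manage,
> > the synchronous metadata updates are the likely ceiling.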
> >
> > I don't think UFS itself is the direct culprit of what you're seeing,
> > but indirectly it might be. It depends on how the OS handles multiple
> > UFS cylinder groups and how it allocates its metadata cache for them.
> > I'd imagine that as long as your test runs within a single CG, all the
> > needed metadata might fit into the buffer cache and things might look
> > much faster, but once a critical number of CG's comes into play, a
> > sharp change in speed might occur. There are lots of OS tunables to
> > tweak for high-end squid-cache operation.
> >
> > Without these, a general OS is expected to service some 20-100 users
> > at a time, which is not what you're after. Look at the inode cache,
> > the directory name lookup cache, and the metadata buffer cache.
> > Some parts of UFS caching are kept locked for some time after use,
> > to prevent fast reuse and to increase the probability of a hit on
> > them. This might tie up all available slots and cause severe delays
> > in UFS metadata operations, without any actual disk io, just waiting
> > for timeouts.
> > Then again, you could simply have hit a bug in your OS...
> >
> > Of course, all this is rant and speculation. You should look deeper
> > into what is going on to know better.
> >
> > Arjan de Vet wrote a patch for Squid some time ago to measure the
> > time taken by the OS to service certain system calls and report it
> > in the debug logs. Very useful for this kind of stuff.
> >
> > You should look closer at what happens with the system service times
> > for calls like open/read/write/close/unlink, for disk and network
> > separately. Then you'll have a much better hint about what to look at
> > further.
> >
> > You might see disk-file open/close(/unlink) call times skyrocket
> > when you hit the trouble scenario.
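> >
> > Something along these lines would do it (just to give the flavour --
> > this is NOT Arjan's actual patch, and the macro leans on the gcc
> > statement-expression and typeof extensions):
> >
> >     #include <stdio.h>
> >     #include <sys/time.h>
> >
> >     /* flag anything slower than 20 ms; pick your own threshold */
> >     #define SLOW_USEC 20000L
> >
> >     #define TIMED_CALL(label, expr) ({                               \
> >         struct timeval _a, _b;                                       \
> >         long _us;                                                    \
> >         gettimeofday(&_a, NULL);                                     \
> >         typeof(expr) _ret = (expr);                                  \
> >         gettimeofday(&_b, NULL);                                     \
> >         _us = (_b.tv_sec - _a.tv_sec) * 1000000L                     \
> >             + (_b.tv_usec - _a.tv_usec);                             \
> >         if (_us > SLOW_USEC)                                         \
> >             fprintf(stderr, "%s: %ld usec\n", (label), _us);         \
> >         _ret;                                                        \
> >     })
> >
> >     /* e.g. in the disk path:
> >      *     fd = TIMED_CALL("disk open", open(path, O_RDONLY));
> >      *     TIMED_CALL("disk unlink", unlink(path));
> >      * and use different labels in the network path so the two show
> >      * up separately in the logs. */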
> >
> > ----------------------------------------------------------------------
> > Andres Kroonmaa mail: andre@online.ee
> > Senior Network Engineer
> > Organization: MicroLink Online Tel: 6308 909
> > Tallinn, Sakala 19 Pho: +372 6308 909
> > Estonia, EE0001 http://www.online.ee Fax: +372 6308 901
> > ----------------------------------------------------------------------
>
>
If I can rustle up the time, I'll certainly try that.
D