Re: scsi drive error - squid blocking - lost swap.state

From: Dancer <dancer@dont-contact.us>
Date: Wed, 07 Apr 1999 10:37:47 +1000

I don't yet _quite_ have a squid2.x that meets my stability requirements
for the new deployment. Devel3 is _close_. If devel4/penultimate doesn't
make it soon, I might try a test round with devel3+henrik_magic(tm).

Accordingly, our currently deployed boxen are squid1.1+dancer_magic(tm).

D

tom minchin wrote:
>
> I've had similar probs with the aic7xxx driver and Adaptec hardware. In some
> cases I've had to reduce the speed (in the Adaptec BIOS) from 40megs/sec to
> 20megs/sec - otherwise I got too many SCSI timeouts.
>
> From reading this disaster, do you store a single swap.state for all your
> disks? I've found it's better to leave it at the default and store swap.state
> in each cache disk (only items in that cache disk are in that swap.state).
> Then if you lose a cache disk, you only lose the swap.state file that went
> with it.
>
> tom@interact.net.au
>
> On Wed, Apr 07, 1999 at 08:44:39AM +1000, Dancer wrote:
> >
> > Ahh, yes. We used to get a lot of those on our remote proxies.
> > Guaranteed to screw the machine up (or hang it solid in an SMP
> > configuration). It's a problem with the type of hardware or the driver
> > (we only ever get this with Adaptec stuff, the aic7xxx driver, to be
> > precise).
> >
> > We don't have a cure, and the glitch is not necessarily connected to
> > load (although on our worst machines you could virtually guarantee it by
> > doing an mke2fs). Kernel 2.0.36 has (so far) given us the lowest
> > incidence of the fault. Our twin PPro machines suffer most, while our
> > twin Xeon boxen hardly (if ever) do this.
> >
> > My heartfelt sympathy on this one. It sucks pond-scum.
> >
> > D
> >
> > Alastair Waddell wrote:
> > >
> > > I suffered a SCSI failure today, of this variety:
> > >
> > > Apr 6 20:37:07 kernel: scsi : aborting command due to timeout : pid 9737610, scsi0, channel 0, id 1, lun 0 Write (6) 00 00 41 02 00
> > >
> > > I was unable to stop squid, which was blocked permanently (shutdown
> > > and kill/kill -9 were ineffective), so a physical reboot was needed.
> > > Bummer.
> > >
> > > I unmounted and commented out (in squid.conf) the particular drive and
> > > restarted squid with my two remaining spindles.
> > >
> > > Later I twice fsck'd the drive with a check for bad blocks prior to
> > > mounting and restarting squid.
> > >
> > > The problem is twofold:
> > >
> > > 1: How can something like this happen, and is there a way I could have
> > > avoided the reboot? (The servers are remotely located.)
> > >
> > > 2: How come my swap.state file has been truncated (fsck with bad block
> > > check? running squid with 2/3 of its spindles???)
> > >
> > > I'm particularly upset, read pissed, because I only just rebuilt the
> > > storage in this 8.7GB drive after losing it a month ago (when my IDE/OS
> > > drive failed).
> > >
> > > Finally, do I have to delete the 6.5 GB on this drive now that the
> > > swap.state file is 'altered'?
> > >
> > > (any comments on the IBM drives shown below ??)
> > >
> > > Filesystem  1024-blocks     Used Available Capacity Mounted on
> > > /dev/sda1       8589226  6387716   1756110      78% /mnt/cache1
> > > -rw-r--r-- 1 squid squid 114672 Apr 7 02:33 swap.state
> > >
> > > Linux 2.2.2 (RH5.2 with all updates)
> > > vendor_id : GenuineIntel
> > > model name : Celeron (Covington)
> > > cpu MHz : 412.371201
> > > MemTotal: 258196 kB
> > > SwapTotal: 133048 kB
> > >
> > > Squid Cache: Version 2.1.PATCH2
> > >
> > > (scsi0) <Adaptec AIC-7895 Ultra SCSI host adapter> found at PCI 20/0
> > > (scsi0) Wide Channel A, SCSI ID=7, 255/255 SCBs
> > > (scsi0) Warning - detected auto-termination
> > > (scsi0) Please verify driver detected settings are correct.
> > > (scsi0) If not, then please properly set the device termination
> > > (scsi0) in the Adaptec SCSI BIOS by hitting CTRL-A when prompted
> > > (scsi0) during machine bootup.
> > > (scsi0) Cables present (Int-50 NO, Int-68 NO, Ext-68 NO)
> > > (scsi0) Downloading sequencer code... 394 instructions downloaded
> > > (scsi1) <Adaptec AIC-7895 Ultra SCSI host adapter> found at PCI 20/1
> > > (scsi1) Wide Channel B, SCSI ID=7, 255/255 SCBs
> > > (scsi1) Warning - detected auto-termination
> > > (scsi1) Please verify driver detected settings are correct.
> > > (scsi1) If not, then please properly set the device termination
> > > (scsi1) in the Adaptec SCSI BIOS by hitting CTRL-A when prompted
> > > (scsi1) during machine bootup.
> > > (scsi1) Cables present (Int-50 NO, Int-68 NO, Ext-68 NO)
> > > (scsi1) Downloading sequencer code... 394 instructions downloaded
> > > scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.10/3.2.4
> > > <Adaptec AIC-7895 Ultra SCSI host adapter>
> > > scsi1 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.10/3.2.4
> > > <Adaptec AIC-7895 Ultra SCSI host adapter>
> > > scsi : 2 hosts.
> > > Vendor: IBM Model: DGVS09U Rev: 0350
> > > Type: Direct-Access ANSI SCSI revision: 03
> > > Detected scsi disk sda at scsi0, channel 0, id 1, lun 0
> > > Vendor: IBM Model: DDRS-34560W Rev: S97B
> > > Type: Direct-Access ANSI SCSI revision: 02
> > > Detected scsi disk sdb at scsi1, channel 0, id 5, lun 0
> > > Vendor: IBM Model: DDRS-34560W Rev: S97B
> > > Type: Direct-Access ANSI SCSI revision: 02
> > > Detected scsi disk sdc at scsi1, channel 0, id 6, lun 0
> > > (scsi0:0:1:0) Synchronous at 40.0 Mbyte/sec, offset 8.
> > > SCSI device sda: hdwr sector= 512 bytes. Sectors= 17829870 [8705 MB] [8.7 GB]
> > > sda: sda1
> > > (scsi1:0:5:0) Synchronous at 40.0 Mbyte/sec, offset 8.
> > > SCSI device sdb: hdwr sector= 512 bytes. Sectors= 8925000 [4357 MB] [4.4 GB]
> > > sdb: sdb1
> > > (scsi1:0:6:0) Synchronous at 40.0 Mbyte/sec, offset 8.
> > > SCSI device sdc: hdwr sector= 512 bytes. Sectors= 8925000 [4357 MB] [4.4 GB]
> > > sdc: sdc1
> > >
> > > --
> > > ----------------------------------------------------------------------
> > > Alastair Waddell o Tel +61 3 96400400
> > > Legion Internet
> > > Queen Street, Melbourne o Full featured VISP Facility
> > >
> > > Virtual Services + DNS Maintenance + ISP Co-location + Internetworking
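
For remote boxes it may help to watch for the timeout message quoted above
before squid wedges. What follows is only a rough sketch, not something
anyone in this thread actually ran: it counts "aborting command due to
timeout" lines per SCSI target. The regex is modelled on the single log line
in this thread, and the function name count_scsi_timeouts is made up for
illustration; other kernel versions may word the message differently.

```python
import re
from collections import Counter

# Assumption: this pattern is modelled on the one kernel message quoted in
# this thread. Other kernel versions may format the line differently.
ABORT_RE = re.compile(
    r"scsi : aborting command due to timeout : pid\s+(?P<pid>\d+),\s*"
    r"scsi(?P<host>\d+)\s*,\s*channel (?P<channel>\d+),\s*"
    r"id (?P<id>\d+),\s*lun (?P<lun>\d+)"
)


def count_scsi_timeouts(lines):
    """Count timeout aborts per (host, channel, id, lun) tuple.

    `lines` is any iterable of log lines, e.g. an open syslog file.
    (Hypothetical helper, named here for illustration only.)
    """
    counts = Counter()
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            key = tuple(
                int(match.group(name))
                for name in ("host", "channel", "id", "lun")
            )
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # The sample is the log line from this thread, joined onto one line.
    sample = [
        "Apr 6 20:37:07 kernel: scsi : aborting command due to timeout : "
        "pid 9737610, scsi0, channel 0, id 1, lun 0 Write (6) 00 00 41 02 00",
    ]
    for (host, channel, target, lun), n in count_scsi_timeouts(sample).items():
        print(f"scsi{host} channel {channel} id {target} lun {lun}: {n} timeout(s)")
```

In the boot log above, scsi0 channel 0 id 1 lun 0 is the disk detected as
sda, so a count against that tuple points at the 8.7 GB cache drive.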
Received on Tue Apr 06 1999 - 18:21:11 MDT
