I'll pass along the questions below, but answers may not be returned.
Please note that I am 400 miles from the system, and have never physically
touched it, and that various aspects are handled by various volunteers.
That's one of the "interesting" aspects of Eisner. Another is that people
do have "day jobs".
To (briefly) answer another posting - yes, there are various automatic
system utilities that try to warn us of impending problems. And there's a
UPS (unfortunately, it turns out, with a bad battery). We don't have a
utility that warns of an upcoming power failure in the neighborhood.
<***@yahoo.com> wrote in message news:firstname.lastname@example.org...
Just for clarification of the configuration - you've got a DS20 with a
Mylex (swxcr) SCSI RAID controller using 2 channels on a split SCSI
bus on the stack of 7 drive slots going up the right side. These 7
drives are 9GB bricks in a RAID-5 and from that you've sliced it into
6 logical drives DRA0-5. The drive in slot #1 on the second bus has
failed so the swxcr has marked DRA0-5 as "DEGRADED". Then the 3rd
channel of the controller is used for DRA6 on an external shelf.
I've seen the "%DRA, drives=0, optimal = 4294967290, degraded = 6,
failed = 0 " before. I think it's a combination of a firmware bug and
a slightly confused controller. I wish I could remember exactly what
I did to fix it but it was a few years ago when I had the problem.
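For what it's worth, 4294967290 is exactly 2^32 - 6, which is what -6
looks like when it is stored in an unsigned 32-bit field. Below is a
minimal sketch of that arithmetic. It assumes (a guess, not something
confirmed from the firmware) that the "optimal" count is computed as
drives minus degraded minus failed; the function name is made up for
illustration.

```python
def wrapped_optimal(drives: int, degraded: int, failed: int) -> int:
    """Hypothetical formula: optimal = drives - degraded - failed.

    The mask emulates storing the result in an unsigned 32-bit field,
    so a negative result wraps around instead of staying negative.
    """
    return (drives - degraded - failed) & 0xFFFFFFFF


# Values taken from the quoted console message.
print(wrapped_optimal(drives=0, degraded=6, failed=0))  # 4294967290
```

If that guess is right, the odd number would fit a controller that has
lost its drive count (drives=0) while still counting six degraded
logical drives.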
I would assume that you've tried replacing the bad disk mentioned with
another? The swxcr may not automatically do a rebuild. You may have
to go into the swxcrmgr utility and tell the controller that the drive
has been replaced, mark it as good, and then tell it to rebuild the array.
Did the external shelf become disconnected or lose power? It doesn't
look like the swxcr sees it at the moment. That might be enough to
hang it, though if that's the case it should show as failed. I'm not
sure why there is no mention of it in the startup messages. It may be
irrelevant if the DRA0-5 array is rebuilt.
I know you're just passing on info second hand, but a little more info
on what's been tried and failed and what the failure messages/results
were would help.