Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!linus!security!genrad!grkermit!masscomp!clyde!...@BRL-VGR.ARPA
From: roode%u...@BRL-VGR.ARPA
Newsgroups: net.unix-wizards
Subject: 4.2 abrupt halts
Message-ID: <15161@sri-arpa.UUCP>
Date: Thu, 5-Jan-84 19:24:19 EST
Article-I.D.: sri-arpa.15161
Posted: Thu Jan 5 19:24:19 1984
Date-Received: Sun, 8-Jan-84 00:53:55 EST
Lines: 44

From: Dana Roode <roode%uci-750a@BRL-VGR.ARPA>

We are experiencing mysterious halts on our 750 system. If we had not just installed 4.2BSD, and had we seen the problem before, we would swear they were caused by hardware. The system will be running fine, and out of nowhere, we halt:

    800202CA 04

The documentation says the "04" halt code indicates "interrupt stack not valid or unable to read SCB". The address corresponds to "_dumpsys+.9e" in our kernel, which appears to be a harmless "pushaf" of an argument for printf. Of course, the fact that we are in "dumpsys" probably indicates we were trying to crash anyway, but why, I don't know. Nothing appears on the console before the halt, and the system does not try to continue despite the fact that the console switch is in its normal "restart" position.

After some of these crashes, we were unable to reboot at all without powering the CPU off and on. We would type the boot command to the front end and receive a micro verify check failure (single "%" or "%O"). This led us to believe we had a hardware problem. DEC replaced our L0002 CPU module, which they said included the microcode hardware involved. We have had another abrupt halt since then, but this time the system responded properly to a boot command.

Has anyone seen a problem like this one with 4.2? (or 4.1?) Hardware or software? If there was an original problem that triggered entry into the dumpsys routine, how do we find what the problem was, given that nothing is printing on the console?
Since we are in great need to get our system back on its feet as soon as possible, please send a copy of all replies directly to me.

Thanks,

Dana Roode
University of California, Irvine
roode.uci@rand-relay -or- ucbvax!ucivax!roode
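
The address lookup described in the post above (800202CA resolving to "_dumpsys+.9e") amounts to finding the nearest kernel symbol at or below the halt address and printing the remaining offset. A minimal sketch in Python, assuming the ".9e" is a hexadecimal offset, which would put _dumpsys at 0x8002022C. The neighbouring symbols and their addresses are hypothetical placeholders, not taken from any real kernel:

```python
import bisect

# Hypothetical symbol table: (start address, name), sorted by address.
# Only the _dumpsys entry is derived from the post (0x800202CA - 0x9E);
# the entries around it are made up for illustration.
SYMBOLS = [
    (0x80020100, "_hypothetical_before"),
    (0x8002022C, "_dumpsys"),
    (0x80020400, "_hypothetical_after"),
]


def resolve(addr, symbols=SYMBOLS):
    """Return 'symbol+.offset' for the nearest symbol at or below addr.

    Returns None when addr lies below every symbol in the table.
    """
    starts = [start for start, _ in symbols]
    # Index of the last symbol whose start address is <= addr.
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    return "%s+.%x" % (name, addr - start)


if __name__ == "__main__":
    print(resolve(0x800202CA))
```

With a real symbol table in place of the placeholder list, the same lookup can be applied to any halt address the console reports.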

Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83 (MC830713); site erix.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!harpo!decvax!mcvax!enea!erix!mike
From: mike@erix.UUCP (Mike Williams)
Newsgroups: net.unix-wizards
Subject: 4.2 died on our VAX.
Message-ID: <301@erix.UUCP>
Date: Wed, 14-Mar-84 09:00:12 EST
Article-I.D.: erix.301
Posted: Wed Mar 14 09:00:12 1984
Date-Received: Thu, 15-Mar-84 07:15:47 EST
Organization: L M Ericsson, Stockholm, Sweden
Lines: 16

Normally when UNIX panics we get a dump which we sometimes look at to see what happened. The other day our VAX just died. It continued to run, echoed text from terminals etc. but just hung if you tried to give it any commands.

It turned out that one of our disk controllers was playing up. We swap and page on two disks (hp0 and hp1) and hp1 just gave up. This was quickly repaired and we were up again after an hour.

Is there any way to force a dump in these conditions? Why did the VAX just play dumb?

Mike Williams
{decvax,philabs}!mcvax!enea!erix!mike
or