2.2.13 wait_on_bh lockups on SMP

To: linux-kernel@
Subject: 2.2.13 wait_on_bh lockups on SMP 
From: Mark van Walraven  
Date: Mon, 3 Jan 2000 21:59:14 +1300 
Sender: owner-linux-kernel@

Hi,

I am getting lockups on a production server with a high network load
once or twice a day.  After a hang earlier this evening, the same message
was repeating off the console screen:

	wait_on_bh, CPU 0:
	irq: 0 [0 0]
	bh:  1 [0 1]
	<[c010af05]> <[c015c994]> <[c016a18e]> <[c014c4e6]>

(This is slightly different to what ursus@xxxxxxx reported last month
http://www.deja.com/=dnc/[ST_rn=ps]/getdoc.xp?AN=563124486&fmt=text .)

Another hang happened about an hour later, this time with nothing written
to the console.

From my system map:

	c010aecc T synchronize_bh
	c014c4ac T sock_recvmsg
	c015c800 T tcp_recvmsg
	c016a100 T inet_recvmsg
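
The trace addresses can be matched to these entries by taking the nearest symbol at or below each address. The following is a minimal Python sketch of that lookup, assuming only the four System.map entries quoted above (a real System.map has thousands of entries, and ksymoops is the proper tool for this). The helper name `resolve` is hypothetical.

```python
import bisect

# The four System.map entries quoted above (address, symbol).
# A real System.map has thousands of entries; using only this subset
# is an assumption made for illustration.
SYSTEM_MAP = sorted([
    (0xc010aecc, "synchronize_bh"),
    (0xc014c4ac, "sock_recvmsg"),
    (0xc015c800, "tcp_recvmsg"),
    (0xc016a100, "inet_recvmsg"),
])
ADDRESSES = [addr for addr, _ in SYSTEM_MAP]


def resolve(addr):
    """Return 'symbol+0xoffset' for the nearest symbol at or below addr."""
    i = bisect.bisect_right(ADDRESSES, addr) - 1
    if i < 0:
        return hex(addr)  # below the first known symbol
    base, name = SYSTEM_MAP[i]
    return "%s+0x%x" % (name, addr - base)


# Addresses from the wait_on_bh trace, in the order printed.
TRACE = [0xc010af05, 0xc015c994, 0xc016a18e, 0xc014c4e6]

for a in TRACE:
    print("%08x  %s" % (a, resolve(a)))
```

Read in order, this suggests the path sock_recvmsg -> inet_recvmsg -> tcp_recvmsg -> synchronize_bh, with the caveat that the offsets are only meaningful if no other symbol lies between the listed address and the trace address.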

Another system, with identical hardware and kernel, but not quite so heavily
loaded, has been running flawlessly for a couple of weeks.

Probably-irrelevant details: Dell PowerEdge 2300 with AMI MegaRAID;
kernel built from the Debian kernel-source-2.2.13_2.2.13-2 package,
to which I added freeswan-1.1 - otherwise a vanilla Debian 2.1 system;
eth0 and eth1 are eepro100 (module), though only eth0 is up;
CONFIG_M686, CONFIG_X86_GOOD_APIC, CONFIG_1GB, CONFIG_MTRR, CONFIG_SMP.

Of course, further config details are available on request.

I'd really appreciate any assistance - this is interrupting our services,
as well as my so-called holiday.

Thanks,

Mark.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxxx
Please read the FAQ at http://www.tux.org/lkml/

From: Manfred Spraul <manfr...@colorfullife.com>
Subject: Re: 2.2.13 wait_on_bh lockups on SMP
Date: 2000/01/03
Message-ID: <fa.gvuju1v.1u6kdj6@ifi.uio.no>#1/1
Original-Date: Mon, 03 Jan 2000 14:05:24 +0100
Sender: owner-linux-ker...@vger.rutgers.edu
Original-Message-ID: <38709E94.B6AAAB2@colorfullife.com>
References: <fa.g1p84pv.1074t81@ifi.uio.no>
To: Mark van Walraven <ma...@wave.co.nz>
Original-References: <20000103215914.A3...@mail.wave.co.nz>
Newsgroups: fa.linux.kernel

Mark van Walraven wrote:
> 
> Hi,
> 
> I am getting lockups on a production server with a high network load
> once or twice a day.  After a hang earlier this evening, the same message
> was repeating off the console screen:
> 
>         wait_on_bh, CPU 0:
>         irq: 0 [0 0]
>         bh:  1 [0 1]
>         <[c010af05]> <[c015c994]> <[c016a18e]> <[c014c4e6]>
> 
Unfortunately wait_on_bh() doesn't print a complete back trace if you
use modules. Could you apply the patch below? It will print a complete
back trace.
Parse the result through ksymoops.

--
	Manfred



From: Mark van Walraven <ma...@wave.co.nz>
Subject: Re: 2.2.13 wait_on_bh lockups on SMP
Date: 2000/01/06
Message-ID: <fa.g59o39v.16ngv86@ifi.uio.no>#1/1
Original-Date: Thu, 6 Jan 2000 12:24:48 +1300
Sender: owner-linux-ker...@vger.rutgers.edu
Original-Message-ID: <20000106122448.C8464@mail.wave.co.nz>
References: <fa.gvuju1v.1u6kdj6@ifi.uio.no>
To: linux-ker...@vger.rutgers.edu
Original-References: <20000103215914.A3...@mail.wave.co.nz> <38709E94.B6AA...@colorfullife.com>
Newsgroups: fa.linux.kernel

On Mon, Jan 03, 2000 at 02:05:24PM +0100, Manfred Spraul wrote:
> Unfortunately wait_on_bh() doesn't print a complete back trace if you
> use modules. Could you apply the patch below? It will print a complete
> back trace.

I did this several days ago.  After going into hiding for a bit, the
hangs have re-appeared, but I get nothing on the console at all.

It's possible that there are two (or more) separate problems.  I suspect
the eepro100 driver and will investigate that.  I'll post here if I get
any more oopses.

Thanks,

Mark.


From: Alan Cox <a...@lxorguk.ukuu.org.uk>
Subject: Re: 2.2.13 wait_on_bh lockups on SMP
Date: 2000/01/06
Message-ID: <fa.fll6vmv.tlmnqs@ifi.uio.no>#1/1
Original-Date: Thu, 6 Jan 2000 00:39:20 +0000 (GMT)
Sender: owner-linux-ker...@vger.rutgers.edu
Original-Message-Id: <E1260xV-0005Cb-00@the-village.bc.nu>
References: <fa.g59o39v.16ngv86@ifi.uio.no>
To: ma...@wave.co.nz (Mark van Walraven)
Newsgroups: fa.linux.kernel

> It's possible that there are two (or more) separate problems.  I suspect
> the eepro100 driver and will investigate that.  I'll post here if I get
> any more oopses.

Someone else reported a TCP hang in wait_on_bh and is using an eepro100.
Right now I can't see a cause in the eepro100 driver; it just happens to match.



From: "Terry Katz" <k...@advanced.org>
Subject: RE: 2.2.13 wait_on_bh lockups on SMP
Date: 2000/01/06
Message-ID: <fa.mjjdh7v.1p5kjgt@ifi.uio.no>#1/1
Original-Date: Wed, 5 Jan 2000 21:42:07 -0500
Sender: owner-linux-ker...@vger.rutgers.edu
Original-Message-ID: <NDBBIHAIKLCAPCNCPEJNIELHCBAA.katz@advanced.org>
References: <fa.fll6vmv.tlmnqs@ifi.uio.no>
To: "Alan Cox" <a...@lxorguk.ukuu.org.uk>, "Mark van Walraven" <ma...@wave.co.nz>
Newsgroups: fa.linux.kernel

Were there updates to the eepro driver from 2.2.12 to 2.2.13?  We have been
running a whole bunch of SMP systems since 2.2.12 was released and haven't
had a single crash ... in fact, we've had a system up for 50 days straight,
which receives about a million web hits a day...

-Terry

> -----Original Message-----
> From: owner-linux-ker...@vger.rutgers.edu
> [mailto:owner-linux-ker...@vger.rutgers.edu]On Behalf Of Alan Cox
> Sent: Wednesday, January 05, 2000 7:39 PM
> To: Mark van Walraven
> Cc: linux-ker...@vger.rutgers.edu; Manfred Spraul
> Subject: Re: 2.2.13 wait_on_bh lockups on SMP
>
>
> > It's possible that there are two (or more) separate problems.  I suspect
> > the eepro100 driver and will investigate that.  I'll post here if I get
> > any more oopses.
>
> Someone else reported a TCP hang in wait_on_bh and is using an eepro100.
> Right now I can't see a cause in the eepro100 driver; it just happens to match.



From: ur...@usa.net
Subject: Re: 2.2.13 wait_on_bh lockups on SMP
Date: 2000/01/09
Message-ID: <s7gn1tiqoj8159@corp.supernews.com>
References: <fa.g59o39v.16ngv86@ifi.uio.no> <fa.fll6vmv.tlmnqs@ifi.uio.no>
Newsgroups: fa.linux.kernel

In article <fa.fll6vmv.tlm...@ifi.uio.no>, Alan Cox wrote ...
>
> Someone else reported a TCP hang in wait_on_bh and is using an eepro100.
> Right now I can't see a cause in the eepro100 driver; it just happens to match.

Alan:

Been hammering away on this &@#%$ timer_bh problem for weeks :(

I reported the same "wait_on_bh" hangs under heavy network load
with Linux 2.2.13[stock/ac3/aa6]-SMP as Mark van Walraven reported.
I'm using Compaq 6400R machines (2xPIII-500/2G memory) for webserving
(need to handle 10+ million hits/day/server) with Linux+Apache:
Apache 1.3.9 with an updated version of Dean Gaudet's TOP_FUEL
patch. I also need raid-0.90 support (meaning the patch, of course),
as my web content is on a raid-0.90 style 2-disk RAID-0 stripe.
The NIC is an Intel EtherExpress Pro/100, with eepro100.c v1.09l
at first and later v1.09t (more explanation on eepro100 below).

OT: I needed to increase the size of rt_cache via /proc/sys/net
otherwise I was getting TONS of "dst_cache_overflow" in syslog
during heavy webserver load. Thanks to Alexey Kuznetsov for
an older USENET post on the SysCtl stuff to tune routing code.
But the ipv4 rt_cache tuning needs much better documentation!
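
For readers who want to inspect the routing-cache tunables, here is a minimal Python sketch. The post only says "/proc/sys/net", so the directory `/proc/sys/net/ipv4/route` and the file names `max_size` and `gc_thresh` are assumptions about which knobs were meant, and the helper names are hypothetical. Writing a value needs root and should be tried on a test machine first.

```python
import os

# Assumed location of the IPv4 routing-cache tunables. The post only says
# "/proc/sys/net"; this path and the file names used below are assumptions.
ROUTE_DIR = "/proc/sys/net/ipv4/route"


def read_tunables(names, base=ROUTE_DIR):
    """Return {name: int value} for each tunable file that exists under base."""
    values = {}
    for name in names:
        path = os.path.join(base, name)
        try:
            with open(path) as f:
                values[name] = int(f.read().split()[0])
        except (OSError, ValueError, IndexError):
            continue  # missing, unreadable or non-numeric: skip it
    return values


def write_tunable(name, value, base=ROUTE_DIR):
    """Write a new integer value (needs root on a live system)."""
    with open(os.path.join(base, name), "w") as f:
        f.write("%d\n" % value)


if __name__ == "__main__":
    # Only reads; prints whatever subset of the assumed files exists.
    print(read_tunables(["max_size", "gc_thresh"]))
```

Taking the base directory as a parameter keeps the sketch testable against a scratch directory before it is pointed at /proc.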

Anyway, back to the "wait_on_bh" issue ...

Was getting too many crashes in SMP mode, even with Andrea's
"aa6" patchset for kernel 2.2.13, plus raid-0.90, plus the
incremental "set-blocksize" patch on top of the raid-0.90 patch.
So I tried to run the SMP kernel with "nosmp noapic" options
on the boot prompt (via LILO append="nosmp noapic"). Started
seeing kernel crashes without any console or syslog errors and
with the keyboard mostly dead, except I could still get some EIPs
(via <ALT>+<SysRq>+P) ... all of which seem to refer to
"timer_bh" according to the System.map matching this kernel.

Also contacted Andrea Arcangeli directly about this problem,
he has a lot of crash traces and info I fed him via E-mail;
if anyone thinks they can make use of the traces I'll repost
or he can simply summarize them [better than I could anyway].

Decided to compile for true UniProcessor as a last-ditch option;
worst case is I "waste" the second CPU until the kernel bug is squashed.
No SMP "wait_on_bh"-style hangs, but I immediately saw retransmit
errors and hangs from eepro100 ... the same thing people have been
complaining about on the eepro100 mailing list for months. Didn't
see it myself until now (2.2.13aa6+raid-up). Strange.

Upgraded to the "test" v1.09t of the eepro100.c driver from CESDIS,
after a posting mentioned some possible workarounds Don Becker
put into the 1.09t version. Recompiled 2.2.13aa6+raid-0.90-up.
Voila! The eepro100 retransmit errors went away completely.
This incident makes me uneasy about the stability/performance of
eepro100.c in general. Going to try testing 3Com cards ...

Started getting crashes with 2.2.13aa6+raid-0.90-up under load,
again without any usable output to console/syslog and with only
some SysRq keys still working (e.g. <ALT>+<SysRq>+P); got some EIPs
which I sent to Andrea. He suggested I grab his IKD patchset
and enable the in-kernel debugger (KDB) to get better trace info.

I applied Andrea's 2.2.13-ikd1 patchset pretty successfully
(albeit with a few small rejects, caused by BIGMEM breaking the
patch, which I fixed by hand) and ran the new kernel on 10 servers.
They were still crashing under load every few days, but I was
able to press <PAUSE> to get into the kernel debugger (KDB)
and see the current process backtrace. Sent these to
Andrea Arcangeli as well. From the KDB backtraces, the hangs
under kernel 2.2.13aa6+raid+ikd-up looked to him like an
IRQ flood, possibly related to TCP_DELAY_ACK ...

Andrea sent me a one-line patch to disable TCP_DELAY_ACK,
which I applied to see if this is the offending code.
I compiled 2.2.13aa6+raid+ikd+NO_DELAY_ACK for UniProcessor,
and have been running this kernel for 4 days straight
under load without any crashes. I'm not assuming the patch
solved the problem; maybe I've just been lucky this week
(and these servers could crash any moment, so says Murphy).

Recently I found some disturbing errors in /var/log/messages
on a few of these machines, regarding the RAID device:

  "kernel: fs warning (device md(9,1)):
   ext2_free_blocks: bit already cleared for block 4408088"

These repeat a few times every hour or so, during heavy
filesystem activity on the md device (RAID-0 stripe).
I'm going to watch this webserver cluster like a hawk,
and if/when I can capture a crash (and KDB backtrace)
I will copy this information to Andrea and the list.
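
To track how often these warnings occur and which blocks they name, a log scan along the following lines may help. This is a minimal Python sketch: the pattern is modelled on the warning quoted above and assumes syslog writes it on one line, the function name `scan` is hypothetical, and the sample timestamps, hostname and second block number are made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the warning quoted above. It assumes syslog writes the
# warning on a single line; the quoted text is wrapped across two.
WARNING = re.compile(
    r"fs warning \(device (?P<dev>\S+?)\):\s*"
    r"ext2_free_blocks: bit already cleared for block (?P<block>\d+)"
)


def scan(lines):
    """Return (Counter of warnings per device, sorted list of distinct blocks)."""
    per_device = Counter()
    blocks = set()
    for line in lines:
        m = WARNING.search(line)
        if m:
            per_device[m.group("dev")] += 1
            blocks.add(int(m.group("block")))
    return per_device, sorted(blocks)


# Hypothetical log lines: timestamps and hostname are made up, and only the
# first block number comes from the post.
SAMPLE = [
    "Jan  9 03:12:45 web1 kernel: fs warning (device md(9,1)): "
    "ext2_free_blocks: bit already cleared for block 4408088",
    "Jan  9 03:12:46 web1 kernel: eth0: some unrelated message",
    "Jan  9 04:02:10 web1 kernel: fs warning (device md(9,1)): "
    "ext2_free_blocks: bit already cleared for block 4408089",
]

counts, blocks = scan(SAMPLE)
print(dict(counts), blocks)
```

On a real system the same function could be fed the lines of /var/log/messages instead of the sample list.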

Also I want to find out if any changes in 2.2.13->2.2.14
will help me (changes which haven't been addressed by the "aa6"
patchset already). Are there any of Andrea's fixes in aa6
which have NOT been folded into 2.2.14? Finally, I plan to try
a different netcard (3Com?) in case eepro100.c is buggy.

As mentioned, these servers need to run rock-stable, each
handling in excess of 10 million hits per day. I will be
deploying 100 of these Linux/Apache webservers to handle
> 1 billion transactions per day.
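
As a rough check of these figures, the daily totals can be converted to average request rates. This is only a back-of-the-envelope Python sketch: it assumes load spread evenly over 24 hours, which real web traffic is not, so peak rates would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

hits_per_server_per_day = 10_000_000  # "in excess of 10 million hits per day"
servers = 100

# Average rates, assuming traffic is spread evenly over the day.
per_server_rate = hits_per_server_per_day / SECONDS_PER_DAY
cluster_per_day = hits_per_server_per_day * servers
cluster_rate = cluster_per_day / SECONDS_PER_DAY

print("per server: about %.0f hits/s" % per_server_rate)
print("cluster:    %d hits/day, about %.0f hits/s" % (cluster_per_day, cluster_rate))
```

So each server averages a little over a hundred hits per second, and the 100-server total matches the 1 billion per day quoted.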

Any/all ideas/comments/patches/advice greatly appreciated.
Thanks to all of those who've actually read this far :)

PS:	As a high-performance linux/webserver issue ...
	Has anyone here had success with the "10xpatches"
	from SGI for Apache (http://oss.sgi.com/apache/)?
	They seem similar to the TOP_FUEL Apache patches
	(from Dean Gaudet or Dan Kegel? can't remember now)

-- 
RW <ur...@usa.net>