Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
    news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
    ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Message-ID: <3C2CD326.100@athlon.maya.org>
Original-Date: Fri, 28 Dec 2001 21:16:38 +0100
From: Andreas Hartmann <andihartm...@freenet.de>
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.7+) Gecko/20011225
X-Accept-Language: en-us
MIME-Version: 1.0
To: Kernel-Mailingliste <linux-ker...@vger.kernel.org>
Subject: [2.4.17/18pre] VM and swap - it's really unusable
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Fri, 28 Dec 2001 20:19:40 GMT
Message-ID: <fa.djbc0rv.1ogs82n@ifi.uio.no>
Lines: 32

Hello all,

Again, I did a rsync-operation as described in
"[2.4.17rc1] Swapping" MID <3C1F4014.2010...@athlon.maya.org>.

This time, the kernel had a swappartition which was about 200MB. As the
swap-partition was fully used, the kernel killed all processes of knode.
Nearly 50% of RAM had been used for buffers at this moment. Why is there
so much memory used for buffers?

I know I repeat it, but please:

Fix the VM-management in kernel 2.4.x. It's unusable. Believe
me! As comparison: kernel 2.2.19 didn't need nearly any swap for
the same operation!

Please consider that I'm using 512 MB of RAM. This should, or better:
must be enough to do the rsync-operation nearly without any swapping -
kernel 2.2.19 does it!

The performance of kernel 2.4.18pre1 is very poor, which is no surprise,
because the machine swaps nearly nonstop.


Regards,
Andreas Hartmann

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
    news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news.tele.dk!
    small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!ifi.uio.no!
    internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: Fri, 28 Dec 2001 18:32:12 -0200 (BRST)
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender: <r...@duckman.distro.conectiva>
To: Andreas Hartmann <andihartm...@freenet.de>
Cc: <linux-ker...@vger.kernel.org>
Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable
In-Reply-To: <3C2CD326.100@athlon.maya.org>
Original-Message-ID: <Pine.LNX.4.33L.0112281827000.12225-100000@duckman.distro.conectiva>
X-spambait: aardv...@kernelnewbies.org
X-spammeplease: aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Fri, 28 Dec 2001 20:33:52 GMT
Message-ID: <fa.ookmi1v.1hmtj7@ifi.uio.no>
References: <fa.djbc0rv.1ogs82n@ifi.uio.no>
Lines: 29

On Fri, 28 Dec 2001, Andreas Hartmann wrote:

> Fix the VM-management in kernel 2.4.x. It's unusable. Believe
> me! As comparison: kernel 2.2.19 didn't need nearly any swap for
> the same operation!

If you feel adventurous you can try my rmap based VM, the
latest version is on:

    http://surriel.com/patches/2.4/2.4.17-rmap-8

This VM should behave a bit better (it does on my machines),
but isn't yet bug-free enough to be used on production
machines.

Also, the changes it introduces are, IMHO, too big for a
stable kernel series ;)

regards,

Rik
--
DMCA, SSSCA, W3C? Who cares?
http://thefreeworld.net/ http://www.surriel.com/ http://distro.conectiva.com/
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
    news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
    ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Message-ID: <3C2CE373.3000806@athlon.maya.org>
Original-Date: Fri, 28 Dec 2001 22:26:11 +0100
From: Andreas Hartmann <andihartm...@freenet.de>
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.7+) Gecko/20011225
X-Accept-Language: en-us
MIME-Version: 1.0
To: Andrew Morton <a...@zip.com.au>
CC: Kernel-Mailingliste <linux-ker...@vger.kernel.org>
Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable
Original-References: <3C2CD326....@athlon.maya.org> <3C2CD9EC.1D6C7...@zip.com.au>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Fri, 28 Dec 2001 21:34:08 GMT
Message-ID: <fa.ebv0e5v.10385b4@ifi.uio.no>
References: <fa.e1utl5v.2n6spf@ifi.uio.no>
Lines: 48

Andrew Morton wrote:

> Andreas Hartmann wrote:
>
>>Hello all,
>>
>>Again, I did a rsync-operation as described in
>>"[2.4.17rc1] Swapping" MID <3C1F4014.2010...@athlon.maya.org>.
>>
>>This time, the kernel had a swappartition which was about 200MB. As the
>>swap-partition was fully used, the kernel killed all processes of knode.
>>Nearly 50% of RAM had been used for buffers at this moment. Why is there
>>so much memory used for buffers?
>>
>
> It's very strange. The large amount of buffercache usage is to
> be expected from statting 20 gigs worth of files, but the kernel
> should (and normally does) free up that memory on demand.
>
> Which filesystem(s) are you using?
>
> Are you using NFS/NBD/SMBFS or anything like that?
>

Basically, I'm using NFS and reiserfs. But I didn't use any file on NFS
since the last reboot - and the NFS-shares haven't been mounted.
There are 2 IDE-Harddisks in this machine:

hda: WDC WD205AA, ATA DISK drive
(40079088 sectors (20520 MB) w/2048KiB cache, CHS=2494/255/63, UDMA(66))
hdb: WDC WD450AA-00BAA0, ATA DISK drive
(87930864 sectors (45021 MB) w/2048KiB Cache, CHS=5473/255/63, UDMA(66))

On hda, I have got 7 partitions (plus one little "boot"-partition, which
isn't mounted, and a 200MB swap partition).
On hdb, I have got 12 partitions plus one more swap partition, which is
now 1GB.

All partitions are formatted with reiserfs.

Regards,
Andreas Hartmann
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
    news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news.tele.dk!
    small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!ifi.uio.no!
    internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable
To: andihartm...@freenet.de (Andreas Hartmann)
Original-Date: Sat, 29 Dec 2001 00:30:51 +0000 (GMT)
Cc: linux-ker...@vger.kernel.org (Kernel-Mailingliste)
In-Reply-To: <3C2CD326.100@athlon.maya.org> from "Andreas Hartmann" at Dec 28, 2001 09:16:38 PM
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Original-Message-Id: <E16K7Om-0002QI-00@the-village.bc.nu>
From: Alan Cox <a...@lxorguk.ukuu.org.uk>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sat, 29 Dec 2001 00:21:49 GMT
Message-ID: <fa.g0136fv.n4eo16@ifi.uio.no>
References: <fa.djbc0rv.1ogs82n@ifi.uio.no>
Lines: 13

> Fix the VM-management in kernel 2.4.x. It's unusable. Believe
> me! As comparison: kernel 2.2.19 didn't need nearly any swap for
> the same operation!

> The performance of kernel 2.4.18pre1 is very poor, which is no surprise,
> because the machine swaps nearly nonstop.

Does the 2.4.9 Red Hat kernel (if you are using RH) or 2.4.12-ac8 show the
same problem?
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
    news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
    ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Message-ID: <3C2DC1AA.2070106@athlon.maya.org>
Original-Date: Sat, 29 Dec 2001 14:14:18 +0100
From: Andreas Hartmann <andihartm...@freenet.de>
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.7+) Gecko/20011225
X-Accept-Language: en-us
MIME-Version: 1.0
To: Kernel-Mailingliste <linux-ker...@vger.kernel.org>
Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable
Original-References: <3C2CD326....@athlon.maya.org>
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sat, 29 Dec 2001 13:18:13 GMT
Message-ID: <fa.eaucgmv.133soom@ifi.uio.no>
References: <fa.djbc0rv.1ogs82n@ifi.uio.no>
Lines: 67

Andreas Hartmann wrote:

> Hello all,
>
> Again, I did a rsync-operation as described in
> "[2.4.17rc1] Swapping" MID <3C1F4014.2010...@athlon.maya.org>.
>

Some other examples:

I just did a
    cp -Rd linux-2.4.16 linux-2.4.17
(with object-files).
Before starting this action, I had about 120 MB of free RAM.
During copying (I did nothing else in the meantime), 2MB of swap was
used and 12 MB of RAM were free. The biggest part of memory was used
for caching, which is OK.
After copying, only 10 MB of memory were freed again. 490MB of RAM were
now in use (most of it for caching).

Starting from this situation, I started another little cp-action:
    cp -Rd linux-2.4.18pre1 linux-2.4.test
(again including object files).

Result: the swap usage stayed nearly constant; nevertheless there were
6 accesses to swap.
Now, I deleted the linux-2.4.test-directory with
    rm -R linux-2.4.test
This action was very fast (approximately 1s). Afterwards, a big part of
the cache memory was freed (about 100MB). Now, 122MB of RAM were free
again.

Next example (running after the last):
SuSE run-crons have been running. This means:
-> updatedb
-> sort
-> frcode
-> find
-> mandb

47MB swap used, 2/3 of memory is used for buffers (Don't forget: I've
got 512MB of RAM) and about 30MB of RAM are free.

My observation:
Why does the kernel swap to get free memory for caching / buffering? I
can't see any sense in this action. Wouldn't it be better to shrink the
caching / buffering RAM to the amount of memory which is obviously free?

Swapping should principally be used when RAM runs out for real memory
(memory which is used by running applications). First of all, the
memory usage of cache and buffers should be reduced before starting to
swap, IMHO.

Or would it be possible to implement more than one swapping strategy,
which could be configured during make menuconfig? This would give the
user the chance to find the best swapping strategy for his purpose.

Regards,
Andreas Hartmann
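The figures reported in this message (free RAM, buffers, cache, swap in use) correspond to fields that the kernel exports in /proc/meminfo. As a rough illustration only, here is a minimal Python sketch that turns such a snapshot into the ratios being discussed. It assumes lines of the form `Name: <number> kB` and skips everything else; the function names and the sample figures are hypothetical and are not measurements from the thread.

```python
def parse_meminfo(text):
    """Return {field: kilobytes} for lines shaped like 'Name:   123 kB'."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip anything that is not a plain '<number> kB' entry.
        if sep and len(parts) == 2 and parts[0].isdigit() and parts[1] == "kB":
            fields[name.strip()] = int(parts[0])
    return fields


def summarize(fields):
    """Share of RAM held by buffers and page cache, plus swap in use (kB)."""
    total = fields["MemTotal"]
    return {
        "buffers_pct": 100.0 * fields["Buffers"] / total,
        "cached_pct": 100.0 * fields["Cached"] / total,
        "free_kb": fields["MemFree"],
        "swap_used_kb": fields["SwapTotal"] - fields["SwapFree"],
    }


# Hypothetical sample, loosely shaped like a 512 MB machine; not measured data.
SAMPLE = """\
MemTotal:       512000 kB
MemFree:         30000 kB
Buffers:        256000 kB
Cached:         100000 kB
SwapTotal:      200000 kB
SwapFree:       152000 kB
"""

if __name__ == "__main__":
    print(summarize(parse_meminfo(SAMPLE)))
```

On a live system the same functions could be fed with the contents of /proc/meminfo, for example `parse_meminfo(open("/proc/meminfo").read())`. Taking snapshots before and after a large copy would show whether buffers and cache shrink or swap grows.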
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
    news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
    ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: Thu, 3 Jan 2002 14:23:01 -0600
From: Ken Brownfield <brown...@irridia.com>
To: Andreas Hartmann <andihartm...@freenet.de>
Cc: Kernel-Mailingliste <linux-ker...@vger.kernel.org>
Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable
Original-Message-ID: <20020103142301.C4759@asooo.flowerfire.com>
Original-References: <3C2CD326....@athlon.maya.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
In-Reply-To: <3C2CD326.100@athlon.maya.org>; from andihartmann@freenet.de on Fri, Dec 28, 2001 at 09:16:38PM +0100
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Thu, 3 Jan 2002 20:24:43 GMT
Message-ID: <fa.j74mi6v.1e5qhho@ifi.uio.no>
References: <fa.djbc0rv.1ogs82n@ifi.uio.no>
Lines: 78

Unfortunately, I lost the response that basically said "2.4 looks stable
to me", but let me count the ways in which I agree with Andreas'
sentiment:

A) VM has major issues
    1) about a dozen recent OOPS reports in VM code
    2) VM falls down on large-memory machines with a
       high inode count (slocate/updatedb, i/dcache)
    3) Memory allocation failures and OOM triggers
       even though caches remain full.
    4) Other bugs fixed in -aa and others
B) Live- and dead-locks that I'm seeing on all 2.4 production
   machines > 2.4.9, possibly related to A. But how will I
   ever find out?
C) IO-APIC code that requires noapic on any and all SMP
   machines that I've ever run on.

I don't have anything against anyone here -- I think everyone is doing a
fine job. It's an issue of acceptance of the problem and focus.
These issues are all showstoppers for me, and while I don't represent
the 90% of the Linux market that is UP desktops, IMHO future work on the
kernel will be degraded by basic functionality that continues to cause
problems.

I think seeing some of Andrea's and Andrew's et al patches actually
*happen* would be a good thing, since 2.4 kernels are decidedly not
ready for production here. I am forced to apply 26 distinct patch sets
to my kernels, and I am NOT the right person to make these judgements.
Which is why I was interested in an LKML summary source, though I
haven't yet had a chance to catch up on that thread of comment.

Having a glitch in the radeon driver is one thing; having persistent,
fatal, and reproducible failures in universal kernel code is entirely
another.

--
Ken.
brown...@irridia.com

On Fri, Dec 28, 2001 at 09:16:38PM +0100, Andreas Hartmann wrote:
| Hello all,
|
| Again, I did a rsync-operation as described in
| "[2.4.17rc1] Swapping" MID <3C1F4014.2010...@athlon.maya.org>.
|
| This time, the kernel had a swappartition which was about 200MB. As the
| swap-partition was fully used, the kernel killed all processes of knode.
| Nearly 50% of RAM had been used for buffers at this moment. Why is there
| so much memory used for buffers?
|
| I know I repeat it, but please:
|
| Fix the VM-management in kernel 2.4.x. It's unusable. Believe
| me! As comparison: kernel 2.2.19 didn't need nearly any swap for
| the same operation!
|
| Please consider that I'm using 512 MB of RAM. This should, or better:
| must be enough to do the rsync-operation nearly without any swapping -
| kernel 2.2.19 does it!
|
| The performance of kernel 2.4.18pre1 is very poor, which is no surprise,
| because the machine swaps nearly nonstop.
|
|
| Regards,
| Andreas Hartmann
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
    news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news.tele.dk!
    small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!ifi.uio.no!
    internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: Thu, 3 Jan 2002 18:50:10 -0200 (BRST)
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender: <r...@imladris.surriel.com>
To: Ken Brownfield <brown...@irridia.com>
Cc: Andreas Hartmann <andihartm...@freenet.de>,
    Kernel-Mailingliste <linux-ker...@vger.kernel.org>
Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable
In-Reply-To: <20020103142301.C4759@asooo.flowerfire.com>
Original-Message-ID: <Pine.LNX.4.33L.0201031848060.24031-100000@imladris.surriel.com>
X-spambait: aardv...@kernelnewbies.org
X-spammeplease: aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Thu, 3 Jan 2002 20:52:01 GMT
Message-ID: <fa.o032nuv.u2o3gi@ifi.uio.no>
References: <fa.j74mi6v.1e5qhho@ifi.uio.no>
Lines: 37

On Thu, 3 Jan 2002, Ken Brownfield wrote:

> A) VM has major issues
>     1) about a dozen recent OOPS reports in VM code
>     2) VM falls down on large-memory machines with a
>        high inode count (slocate/updatedb, i/dcache)
>     3) Memory allocation failures and OOM triggers
>        even though caches remain full.
>     4) Other bugs fixed in -aa and others
> B) Live- and dead-locks that I'm seeing on all 2.4 production
>    machines > 2.4.9, possibly related to A. But how will I
>    ever find out?

I've spent ages trying to fix these bugs in the -ac kernel,
but they got all backed out in search of better performance.

Right now I'm developing a VM again, but I have no interest
at all in fixing the livelocks in the main kernel, they'll
just get removed again after a while.
If you want to test my VM stuff, you can get patches from
http://surriel.com/patches/ or direct access at the bitkeeper
tree on http://linuxvm.bkbits.net/

cheers,

Rik
--
Shortwave goes a long way: irc.starchat.net #swl

http://www.surriel.com/ http://distro.conectiva.com/
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
    newsfeed.direct.ca!look.ca!cpk-news-hub1.bbnplanet.com!news.gtei.net!
    newsfeed1.cidera.com!Cidera!news2.dg.net.ua!bn.utel.com.ua!
    carrier.kiev.ua!not-for-mail
From: Dieter =?iso-8859-15?q?N=FCtzel?= <Dieter.Nuet...@hamburg.de>
Newsgroups: lucky.linux.kernel
Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable
Date: Tue, 8 Jan 2002 03:06:25 +0000 (UTC)
Organization: DN
Lines: 25
Sender: n...@horse.lucky.net
Approved: newsmas...@lucky.net
Message-ID: <20020108030420Z287595-13997+1799@vger.kernel.org>
NNTP-Posting-Host: horse.lucky.net
Mime-Version: 1.0
Content-Type: text/plain;
Content-Transfer-Encoding: 8bit
X-Trace: horse.lucky.net 1010459185 89122 193.193.193.118 (8 Jan 2002 03:06:25 GMT)
X-Complaints-To: usenet@horse.lucky.net
NNTP-Posting-Date: Tue, 8 Jan 2002 03:06:25 +0000 (UTC)
X-Mailer: KMail [version 1.3.2]
X-Mailing-List: linux-kernel@vger.kernel.org
X-Comment-To: Marcelo Tosatti

Is it possible to decide, now what should go into 2.4.18 (maybe -pre3) -aa or
-rmap?

Andrew Morton's read-latency.patch is a clear winner for me, too.
What about 00_nanosleep-5 and bootmem?
The O(1) scheduler?
Maybe preemption? It is disengageable so nobody should be harmed but we get
the chance for wider testing.

Any comments?

Thanks,
    Dieter
--
Dieter Nützel
Graduate Student, Computer Science

University of Hamburg
Department of Computer Science
@home: Dieter.Nuet...@hamburg.de
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
    newsfeed.direct.ca!look.ca!hub1.nntpserver.com!news-out.spamkiller.net!
    propagator-la!news-in-la.newsfeeds.com!news-in.superfeed.net!
    news.exit.com!gehenna.pell.portland.or.us!nntp-server.caltech.edu!
    nntp-server.caltech.edu!mail2news96
Newsgroups: mlist.linux.kernel
Date: Tue, 8 Jan 2002 11:55:59 +0100 (CET)
From: Luigi Genoni <ker...@Expansa.sns.it>
X-To: Dieter =?iso-8859-15?q?N=FCtzel?= <Dieter.Nuet...@hamburg.de>
X-cc: Marcelo Tosatti <marc...@conectiva.com.br>,
    Andrea Arcangeli <and...@suse.de>,
    Rik van Riel <r...@conectiva.com.br>,
    Linux Kernel List <linux-ker...@vger.kernel.org>,
    Andrew Morton <a...@zip.com.au>, Robert Love <r...@tech9.net>
Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable
Message-ID: <linux.kernel.Pine.LNX.4.33.0201081153310.29480-100000@Expansa.sns.it>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=iso-8859-1
Approved: n...@nntp-server.caltech.edu
Lines: 24

On Tue, 8 Jan 2002, Dieter Nützel wrote (passim):

> Is it possible to decide, now what should go into 2.4.18 (maybe -pre3) -aa or
> -rmap?
[...]
> Maybe preemption? It is disengageable so nobody should be harmed but we get
> the chance for wider testing.
>
> Any comments?

preemption?? this is eventually 2.5 stuff, and should not be integrated
into 2.4 stable tree. Of course a backport is possible, when/if it will be
quite well tested and well working on 2.5
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
    newsfeed.direct.ca!look.ca!newshub2.rdc1.sfba.home.com!news.home.com!
    newshub1-work.rdc1.sfba.home.com!gehenna.pell.portland.or.us!
    nntp-server.caltech.edu!nntp-server.caltech.edu!mail2news96
Newsgroups: mlist.linux.kernel
Date: Tue, 8 Jan 2002 14:21:17 +0100
From: Andrea Arcangeli <and...@suse.de>
X-To: Luigi Genoni <ker...@Expansa.sns.it>
X-Cc: Dieter =?iso-8859-1?Q?N=FCtzel?= <Dieter.Nuet...@hamburg.de>,
    Marcelo Tosatti <marc...@conectiva.com.br>,
    Rik van Riel <r...@conectiva.com.br>,
    Linux Kernel List <linux-ker...@vger.kernel.org>,
    Andrew Morton <a...@zip.com.au>, Robert Love <r...@tech9.net>
Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable
Message-ID: <linux.kernel.20020108142117.F3221@inspiron.school.suse.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Approved: n...@nntp-server.caltech.edu
Lines: 50

On Tue, Jan 08, 2002 at 11:55:59AM +0100, Luigi Genoni wrote:
>
>
> On Tue, 8 Jan 2002, Dieter Nützel wrote (passim):
>
> > Is it possible to decide, now what should go into 2.4.18 (maybe -pre3) -aa or
> > -rmap?
> [...]
> > Maybe preemption? It is disengageable so nobody should be harmed but we get
> > the chance for wider testing.
> >
> > Any comments?
> preemption?? this is eventually 2.5 stuff, and should not be integrated

indeed ("eventually" in the Italian sense btw, obvious to me, but not
for l-k).

I'm not against preemption (I can see the benefits about the mean
latency for real time DSP) but the claims about preemption making the
kernel faster don't make sense to me. More frequent scheduling,
overhead of branches in the locks (you've to conditional_schedule after
the last preemption lock is released and the cachelines for the per-cpu
preemption locks) and the other preemption stuff can only make the
kernel slower.
Furthermore, for multimedia playback any sane kernel out there with
lowlatency fixes applied will work as well as a preemption kernel that
pays for all the preemption overhead.

About the other claim that as the kernel becomes more granular
performance will increase with preemption in kernel, that's obviously
wrong as well, it's clearly the other way around. Maybe it was meant
"latency will decrease further", that's right, but also performance will
decrease if anything.

So yes, mean latency will decrease with preemptive kernel, but your CPU
is definitely paying something for it.

> into 2.4 stable tree. Of course a backport is possible, when/if it will be
> quite well tested and well working on 2.5
>
>
>

Andrea
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
    news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
    ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: Wed, 9 Jan 2002 00:33:35 +1100
From: Anton Blanchard <an...@samba.org>
To: Andrea Arcangeli <and...@suse.de>
Cc: Luigi Genoni <ker...@Expansa.sns.it>,
    Dieter Nützel <Dieter.Nuet...@hamburg.de>,
    Marcelo Tosatti <marc...@conectiva.com.br>,
    Rik van Riel <r...@conectiva.com.br>,
    Linux Kernel List <linux-ker...@vger.kernel.org>,
    Andrew Morton <a...@zip.com.au>, Robert Love <r...@tech9.net>
Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable
Original-Message-ID: <20020108133335.GB26307@krispykreme>
Original-References: <20020108030420Z287595-13997+1...@vger.kernel.org> <Pine.LNX.4.33.0201081153310.29480-100...@Expansa.sns.it> <20020108142117.F3...@inspiron.school.suse.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20020108142117.F3221@inspiron.school.suse.de>
User-Agent: Mutt/1.3.25i
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 8 Jan 2002 13:39:05 GMT
Message-ID: <fa.gpe55mv.ajebbs@ifi.uio.no>
References: <fa.i5nsc8v.5m6fgr@ifi.uio.no>
Lines: 13

> So yes, mean latency will decrease with preemptive kernel, but your CPU
> is definitely paying something for it.

And Andrew Morton's work suggests he can do a much better job of
reducing latency than -preempt.

Anton
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
    newsfeed.direct.ca!look.ca!newsfeed1.cidera.com!Cidera!
    news2.dg.net.ua!bn.utel.com.ua!carrier.kiev.ua!not-for-mail
From: Daniel Phillips <phill...@bonn-fries.net>
Newsgroups: lucky.linux.kernel
Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable
Date: Tue, 8 Jan 2002 14:59:47 +0000 (UTC)
Organization: unknown
Lines: 40
Sender: n...@horse.lucky.net
Approved: newsmas...@lucky.net
Message-ID: <E16Nxjg-00009W-00@starship.berlin>
References: <20020108030420Z287595-13997+1799@vger.kernel.org> <20020108142117.F3221@inspiron.school.suse.de> <20020108133335.GB26307@krispykreme>
NNTP-Posting-Host: horse.lucky.net
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
X-Trace: horse.lucky.net 1010501987 36212 193.193.193.118 (8 Jan 2002 14:59:47 GMT)
X-Complaints-To: usenet@horse.lucky.net
NNTP-Posting-Date: Tue, 8 Jan 2002 14:59:47 +0000 (UTC)
X-Mailer: KMail [version 1.3.2]
In-Reply-To: <20020108133335.GB26307@krispykreme>
X-Mailing-List: linux-kernel@vger.kernel.org
X-Comment-To: Anton Blanchard

On January 8, 2002 02:33 pm, Anton Blanchard wrote:
> Andrea Arcangeli [apparently] wrote:
> > So yes, mean latency will decrease with preemptive kernel, but your CPU
> > is definitely paying something for it.
>
> And Andrew Morton's work suggests he can do a much better job of
> reducing latency than -preempt.

That's not a particularly clueful comment, Anton. Obviously, any
latency-busting hacks that Andrew does could also be patched into a
-preempt kernel.

What a preemptible kernel can do that a non-preemptible kernel can't is:
reschedule exactly as often as necessary, instead of having lots of extra
schedule points inserted all over the place, firing when *they* think the
time is right, which may well be earlier than necessary.
The preemptible approach is much less of a maintenance headache, since
people don't have to be constantly doing audits to see if something changed,
and going in to fiddle with scheduling points.

Finally, with preemption, rescheduling can be forced with essentially zero
latency in response to an arbitrary interrupt such as IO completion, whereas
the non-preemptive kernel will have to 'coast to a stop'. In other words,
the non-preemptive kernel will have little lags between successive IOs,
whereas the preemptive kernel can submit the next IO immediately. So there
are bound to be loads where the preemptive kernel turns in better latency
*and throughput* than the scheduling point hack.

Mind you, I'm not devaluing Andrew's work, it's good and valuable. However
it's good to be aware of why that approach can't equal the latency-busting
performance of the preemptive approach.

--
Daniel
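The "scheduling point" technique that this message contrasts with preemption can be sketched outside the kernel as a long-running loop that tests a flag and yields only when asked. This is a toy model, not kernel code: it assumes a per-task flag comparable to the 2.4 kernel's `current->need_resched`, and the names `Task`, `copy_pages` and `on_yield` are hypothetical.

```python
class Task:
    """Toy stand-in for a kernel task; only the reschedule flag is modelled."""

    def __init__(self):
        self.need_resched = False  # would be set by a timer tick or a wakeup


def copy_pages(task, pages, on_yield):
    """Process `pages` units of work with one scheduling point per unit."""
    done = 0
    for _ in range(pages):
        done += 1  # stand-in for the real work on one page
        if task.need_resched:  # scheduling point: a cheap flag test
            task.need_resched = False
            on_yield(done)  # stand-in for calling the scheduler
    return done


if __name__ == "__main__":
    task = Task()
    task.need_resched = True  # pretend a wakeup arrived before the loop
    yields = []
    print(copy_pages(task, 5, yields.append), yields)
```

In this model the loop gives up the CPU only when the flag is already set, and a pending request waits at most until the next flag test. Code paths that contain no such test are the ones that have to be found and audited by hand.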
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
    news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news.tele.dk!
    small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!ifi.uio.no!
    internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: vasquez.zip.com.au: Host r...@zipperii.zip.com.au [61.8.0.87] claimed to be zip.com.au
Original-Message-ID: <3C3B4CB7.FEAAF5FC@zip.com.au>
Original-Date: Tue, 08 Jan 2002 11:47:03 -0800
From: Andrew Morton <a...@zip.com.au>
X-Mailer: Mozilla 4.77 [en] (X11; U; Linux 2.4.18pre1 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Daniel Phillips <phill...@bonn-fries.net>
CC: Anton Blanchard <an...@samba.org>, Andrea Arcangeli <and...@suse.de>,
    Luigi Genoni <ker...@Expansa.sns.it>,
    Dieter Nützel <Dieter.Nuet...@hamburg.de>,
    Marcelo Tosatti <marc...@conectiva.com.br>,
    Rik van Riel <r...@conectiva.com.br>,
    Linux Kernel List <linux-ker...@vger.kernel.org>,
    Robert Love <r...@tech9.net>
Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable
Original-References: <20020108030420Z287595-13997+1...@vger.kernel.org> <20020108142117.F3...@inspiron.school.suse.de> <20020108133335.GB26307@krispykreme>, <20020108133335.GB26307@krispykreme> <E16Nxjg-00009W...@starship.berlin>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 8 Jan 2002 19:54:56 GMT
Message-ID: <fa.do1pjuv.1oguu2j@ifi.uio.no>
References: <fa.hlj1q9v.1akmgbl@ifi.uio.no>
Lines: 87

Daniel Phillips wrote:
>
> On January 8, 2002 02:33 pm, Anton Blanchard wrote:
> > Andrea Arcangeli [apparently] wrote:
> > > So yes, mean latency will decrease with preemptive kernel, but your CPU
> > > is definitely paying something for it.
> >
> > And Andrew Morton's work suggests he can do a much better job of
> > reducing latency than -preempt.
>
> That's not a particularly clueful comment, Anton. Obviously, any
> latency-busting hacks that Andrew does could also be patched into a
> -preempt kernel.

Yes. The important part is the implicit dropping of the BKL across
schedule().

> What a preemptible kernel can do that a non-preemptible kernel can't is:
> reschedule exactly as often as necessary, instead of having lots of extra
> schedule points inserted all over the place, firing when *they* think the
> time is right, which may well be earlier than necessary.

Nope. `if (current->need_resched)' -> the time is right (beyond right,
actually).

> The preemptible approach is much less of a maintenance headache, since
> people don't have to be constantly doing audits to see if something changed,
> and going in to fiddle with scheduling points.

Except it doesn't work. The full-on low-latency patch has ~60 rescheduling
points. Of these, ~40 involve popping spinlocks.

Really, the only significant latency sources which the preemptible kernel
solves are generic_file_read() and generic_file_write().

So preemptible kernel needs lock-break to be useful. And then it's basically
the same thing, with the same maintainability problems. And believe me, these
are considerable. Mainly because the areas which need busting up exactly
coincide with the areas where there has been most churn in the kernel.

> Finally, with preemption, rescheduling can be forced with essentially zero
> latency in response to an arbitrary interrupt such as IO completion, whereas
> the non-preemptive kernel will have to 'coast to a stop'. In other words,
> the non-preemptive kernel will have little lags between successive IOs,
> whereas the preemptive kernel can submit the next IO immediately. So there
> are bound to be loads where the preemptive kernel turns in better latency
> *and throughput* than the scheduling point hack.

Latency yes. Throughput no.

I don't think the "preempt slows down the kernel" argument is very valid
really.
Let's invert the argument - Linux is multitasking, and that has a cost.
There's no reason why certain bits of the kernel need to violate that
just to get a bit more throughput. If it really worries you, set HZ=10
and increase all the timeslices, etc.

Now, there *may* be overheads added due to losing the implicit locking
which per-CPU data gives you.

The main cost of preempt IMO is in complexity and stability risks.

(BTW: I took a weird oops testing the preempt patch on an SMP NFS
client. The fault address was 0x0aXXXXXX. No useful backtrace,
unfortunately).

> Mind you, I'm not devaluing Andrew's work, it's good and valuable. However
> it's good to be aware of why that approach can't equal the latency-busting
> performance of the preemptive approach.

There's no point in just merging the preempt patch and saying "there,
that's done". It doesn't do anything.

Instead, a decision needs to be made: "Linux will henceforth be a
low-latency kernel". Now, IF we can come to this decision, then
internal preemption is the way to do it. But it affects ALL kernel
developers. Because we'll need to introduce a new rule:

"it is a bug to spend more than five milliseconds holding any locks".

So. Do we want a low-latency kernel? Are we prepared to mandate the
five-millisecond rule? It can be done, but won't be easy, and we'll
never get complete coverage. But I don't see the will around here.

-

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
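
Editorial sketch: the message above describes rescheduling points that test `current->need_resched` and, in most cases, have to "pop" a spinlock before they can reschedule. The userspace C toy below only illustrates that control flow. It is not kernel code and is not taken from the low-latency patch; every name in it (`toy_task`, `toy_lock`, `toy_schedule`, `toy_scan`) is hypothetical, and the "interrupt" is simulated by setting the flag every `period` items.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; none of these are kernel APIs. */
struct toy_task { bool need_resched; };
struct toy_lock { bool held; unsigned long drops; };

static void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);           /* a real spinlock would spin here instead */
    l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Stand-in for schedule(): another task runs, then the flag is clear. */
static void toy_schedule(struct toy_task *t)
{
    t->need_resched = false;
}

/*
 * Walk n items under a lock. Every `period` items a simulated interrupt
 * marks the task as needing a reschedule (period == 0 means never).
 * Returns how many times the loop dropped the lock and rescheduled.
 */
static unsigned long toy_scan(struct toy_task *cur, struct toy_lock *lock,
                              size_t n, size_t period)
{
    unsigned long rescheds = 0;

    toy_lock_acquire(lock);
    for (size_t i = 0; i < n; i++) {
        if (period && (i + 1) % period == 0)
            cur->need_resched = true;   /* simulated interrupt */

        /* ... per-item work that needs the lock would go here ... */

        if (cur->need_resched) {        /* the rescheduling point */
            toy_lock_release(lock);     /* "pop" the lock first */
            lock->drops++;
            toy_schedule(cur);
            rescheds++;
            toy_lock_acquire(lock);     /* state may need revalidating */
        }
    }
    toy_lock_release(lock);
    return rescheds;
}
```

The point of the toy is the ordering inside the `if`: the lock has to be released before yielding and retaken afterwards, and anything protected by it may have changed in between. That revalidation is the part the thread describes as the maintenance burden, whichever patch is used.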
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com! news.tele.dk!small.news.tele.dk!128.230.129.106!news.maxwell.syr.edu! netnews.com!xfer02.netnews.com!newsfeed1.cidera.com!Cidera! news2.dg.net.ua!bn.utel.com.ua!carrier.kiev.ua!not-for-mail From: Marcelo Tosatti <marc...@conectiva.com.br> Newsgroups: lucky.linux.kernel Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable Date: Tue, 8 Jan 2002 15:15:42 +0000 (UTC) Organization: unknown Lines: 42 Sender: n...@horse.lucky.net Approved: newsmas...@lucky.net Message-ID: <Pine.LNX.4.21.0201081153160.19178-100000@freak.distro.conectiva> NNTP-Posting-Host: horse.lucky.net Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 Content-Transfer-Encoding: 8BIT X-Trace: horse.lucky.net 1010502942 37547 193.193.193.118 (8 Jan 2002 15:15:42 GMT) X-Complaints-To: usenet@horse.lucky.net NNTP-Posting-Date: Tue, 8 Jan 2002 15:15:42 +0000 (UTC) In-Reply-To: <20020108030431.0099F38C58@perninha.conectiva.com.br> X-Mailing-List: linux-kernel@vger.kernel.org X-Comment-To: Dieter =?iso-8859-15?q?N=FCtzel?= On Tue, 8 Jan 2002, Dieter [iso-8859-15] Nützel wrote: > Is it possible to decide, now what should go into 2.4.18 (maybe -pre3) -aa or > -rmap? -rmap is 2.5 stuff. I would really like to integrate -aa stuff as soon as I can understand _why_ Andrea is doing those changes. Note that people will _always_ complain about VM: It will always be possible to optimize it to some case and cause harm to other cases. I'm not saying that VM is perfect right now: It for sure has problems. > Andrew Morten`s read-latency.patch is a clear winner for me, too. AFAIK Andrew's code simply adds schedule points around the kernel, right? If so, nope, I do not plan to integrate it. > What about 00_nanosleep-5 and bootmem? What is 00_nanosleep-5 and bootmem ? > The O(1) scheduler? 2.5 stuff. > Maybe preemption? It is disengageable so nobody should be harmed but we get > the chance for wider testing. 2.5 too. 
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com! newsfeed.direct.ca!look.ca!newshub2.rdc1.sfba.home.com!news.home.com! newshub1-work.rdc1.sfba.home.com!gehenna.pell.portland.or.us! nntp-server.caltech.edu!nntp-server.caltech.edu!mail2news96 Newsgroups: mlist.linux.kernel Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable X-To: marc...@conectiva.com.br (Marcelo Tosatti) Date: Tue, 8 Jan 2002 15:46:20 +0000 (GMT) X-Cc: Dieter.Nuet...@hamburg.de (Dieter =?iso-8859-15?q?N=FCtzel?=), and...@suse.de (Andrea Arcangeli), r...@conectiva.com.br (Rik van Riel), linux-ker...@vger.kernel.org (Linux Kernel List), a...@zip.com.au (Andrew Morton), r...@tech9.net (Robert Love) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Message-ID: <linux.kernel.E16NySC-0006pc-00@the-village.bc.nu> From: Alan Cox <a...@lxorguk.ukuu.org.uk> Approved: n...@nntp-server.caltech.edu Lines: 18

> > Andrew Morten`s read-latency.patch is a clear winner for me, too.
>
> AFAIK Andrew's code simply adds schedule points around the kernel, right?
>
> If so, nope, I do not plan to integrate it.

Yep. It has the most wonderful effect on system latency without actually
breaking any semantics. Pre-empt is a trickier one because it does change
actual behaviour a lot more, although it should be preserving locking rules.

Alan
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com! newsfeed.direct.ca!look.ca!newsfeed1.cidera.com!Cidera!news2.dg.net.ua! bn.utel.com.ua!carrier.kiev.ua!not-for-mail From: Andrea Arcangeli <and...@suse.de> Newsgroups: lucky.linux.kernel Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable Date: Tue, 8 Jan 2002 15:34:20 +0000 (UTC) Organization: unknown Lines: 62 Sender: n...@horse.lucky.net Approved: newsmas...@lucky.net Message-ID: <20020108162930.E1894@inspiron.school.suse.de> References: <20020108030420Z287595-13997+1799@vger.kernel.org> <20020108142117.F3221@inspiron.school.suse.de> <20020108133335.GB26307@krispykreme> <E16Nxjg-00009W-00@starship.berlin> NNTP-Posting-Host: horse.lucky.net Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: horse.lucky.net 1010504060 39295 193.193.193.118 (8 Jan 2002 15:34:20 GMT) X-Complaints-To: usenet@horse.lucky.net NNTP-Posting-Date: Tue, 8 Jan 2002 15:34:20 +0000 (UTC) Content-Disposition: inline User-Agent: Mutt/1.3.12i In-Reply-To: <E16Nxjg-00009W-00@starship.berlin>; from phillips@bonn-fries.net on Tue, Jan 08, 2002 at 04:00:11PM +0100 X-GnuPG-Key-URL: http://e-mind.com/~andrea/aa.gnupg.asc X-PGP-Key-URL: http://e-mind.com/~andrea/aa.asc X-Mailing-List: linux-kernel@vger.kernel.org X-Comment-To: Daniel Phillips On Tue, Jan 08, 2002 at 04:00:11PM +0100, Daniel Phillips wrote: > On January 8, 2002 02:33 pm, Anton Blanchard wrote: > > Andrea Arcangeli [apparently] wrote: > > > So yes, mean latency will decrease with preemptive kernel, but your CPU > > > is definitely paying something for it. > > > > And Andrew Morton's work suggests he can do a much better job of > > reducing latency than -preempt. > > That's not a particularly clueful comment, Anton. Obviously, any > latency-busting hacks that Andrew does could also be patched into a > -preempt kernel. 
>
> What a preemptible kernel can do that a non-preemptible kernel can't is:
> reschedule exactly as often as necessary, instead of having lots of extra
> schedule points inserted all over the place, firing when *they* think the
> time is right, which may well be earlier than necessary.

"extra schedule points all over the place", that's the -preempt kernel,
not the lowlatency kernel! (oh yeah, you don't see them in the source,
but ask your CPU if it sees them)

> The preemptible approach is much less of a maintainance headache, since
> people don't have to be constantly doing audits to see if something changed,
> and going in to fiddle with scheduling points.

this yes, it requires less maintenance, but you should still keep in
mind the details about the spinlocks: things like the checks the VM does
in shrink_cache are also needed with a preemptive kernel.

> Finally, with preemption, rescheduling can be forced with essentially zero
> latency in response to an arbitrary interrupt such as IO completion, whereas
> the non-preemptive kernel will have to 'coast to a stop'. In other words,
> the non-preemptive kernel will have little lags between successive IOs,
> whereas the preemptive kernel can submit the next IO immediately. So there
> are bound to be loads where the preemptive kernel turns in better latency
> *and throughput* than the scheduling point hack.

The I/O pipeline is big enough that a few msec sooner or later in a
submit_bh shouldn't make a difference. The batch logic in the
ll_rw_block layer also tries to reduce rescheduling. And last but not
least, if the task is I/O bound, a preemptive kernel or not won't make
any difference to the submit_bh latency, because no task is eating CPU
and the latency will be that of a pure schedule call.

> Mind you, I'm not devaluing Andrew's work, it's good and valuable. However
> it's good to be aware of why that approach can't equal the latency-busting
> performance of the preemptive approach.
I also don't want to devalue the preemptive kernel approach (the mean
latency it can reach is lower than that of the lowlat kernel; however,
I personally care only about worst-case latency, and this is why I don't
feel the need for -preempt), but I just wanted to make clear that the
idea floating around that the preemptive kernel is all goodness is
very far from reality: you get very low mean latency, but at a price.

Andrea
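
Editorial sketch: the distinction drawn above between mean latency and worst-case latency can be made concrete with a few lines of C. The helper below computes both figures for a set of samples. The two sample arrays are made-up numbers chosen only to show that the figures can rank two systems in opposite orders; they are not measurements of any kernel or patch, and the names `latency_summary`, `summarize`, `toy_a_us` and `toy_b_us` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct latency_summary {
    double mean_us;          /* arithmetic mean of the samples */
    unsigned long worst_us;  /* largest single sample */
};

/* Made-up illustrative samples in microseconds, NOT measurements. */
static const unsigned long toy_a_us[] = { 10, 10, 10, 10, 2000 };
static const unsigned long toy_b_us[] = { 500, 500, 500, 500, 600 };

/* Summarize n > 0 latency samples given in microseconds. */
static struct latency_summary summarize(const unsigned long *samples_us,
                                        size_t n)
{
    struct latency_summary s = { 0.0, 0 };
    unsigned long long total = 0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        total += samples_us[i];
        if (samples_us[i] > s.worst_us)
            s.worst_us = samples_us[i];
    }
    s.mean_us = (double)total / (double)n;
    return s;
}
```

With these invented samples, set A has the lower mean (408 us against 520 us) but the higher worst case (2000 us against 600 us). A reader who cares about the mean and one who cares about the worst case would therefore prefer different sets, which is the shape of the disagreement in this thread.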
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com! newsfeed.direct.ca!look.ca!newshub2.rdc1.sfba.home.com!news.home.com! newshub1-work.rdc1.sfba.home.com!gehenna.pell.portland.or.us! nntp-server.caltech.edu!nntp-server.caltech.edu!mail2news96 Newsgroups: mlist.linux.kernel Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable X-To: a...@zip.com.au (Andrew Morton) Date: Tue, 8 Jan 2002 20:13:49 +0000 (GMT) X-Cc: phill...@bonn-fries.net (Daniel Phillips), an...@samba.org (Anton Blanchard), and...@suse.de (Andrea Arcangeli), ker...@Expansa.sns.it (Luigi Genoni), Dieter.Nuet...@hamburg.de (Dieter Nützel), marc...@conectiva.com.br (Marcelo Tosatti), r...@conectiva.com.br (Rik van Riel), linux-ker...@vger.kernel.org (Linux Kernel List), r...@tech9.net (Robert Love) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Message-ID: <linux.kernel.E16O2d3-0007VF-00@the-village.bc.nu> From: Alan Cox <a...@lxorguk.ukuu.org.uk> Approved: n...@nntp-server.caltech.edu Lines: 20

> low-latency kernel". Now, IF we can come to this decision, then
> internal preemption is the way to do it. But it affects ALL kernel

The pre-empt patches just make things much much harder to debug. They
remove some of the predictability and the normal call chain following
goes out of the window because you end up seeing crashes in a thread
with no idea what ran the microsecond before.

Some of that happens now but this makes it vastly worse.

The low latency patches don't change the basic predictability and
debuggability but allow you to hit a 1 ms pre-empt target for the
general case.

Alan
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com! newsfeed.direct.ca!look.ca!newsfeed1.cidera.com!Cidera!news2.dg.net.ua! bn.utel.com.ua!carrier.kiev.ua!not-for-mail From: Daniel Phillips <phill...@bonn-fries.net> Newsgroups: lucky.linux.kernel Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable Date: Tue, 8 Jan 2002 15:56:32 +0000 (UTC) Organization: unknown Lines: 62 Sender: n...@horse.lucky.net Approved: newsmas...@lucky.net Message-ID: <E16Nyaf-0000A5-00@starship.berlin> References: <20020108030420Z287595-13997+1799@vger.kernel.org> <E16Nxjg-00009W-00@starship.berlin> <20020108162930.E1894@inspiron.school.suse.de> NNTP-Posting-Host: horse.lucky.net Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7BIT X-Trace: horse.lucky.net 1010505392 42041 193.193.193.118 (8 Jan 2002 15:56:32 GMT) X-Complaints-To: usenet@horse.lucky.net NNTP-Posting-Date: Tue, 8 Jan 2002 15:56:32 +0000 (UTC) X-Mailer: KMail [version 1.3.2] In-Reply-To: <20020108162930.E1894@inspiron.school.suse.de> X-Mailing-List: linux-kernel@vger.kernel.org X-Comment-To: Andrea Arcangeli On January 8, 2002 04:29 pm, Andrea Arcangeli wrote: > > The preemptible approach is much less of a maintainance headache, since > > people don't have to be constantly doing audits to see if something changed, > > and going in to fiddle with scheduling points. > > this yes, it requires less maintainance, but still you should keep in > mind the details about the spinlocks, things like the checks the VM does > in shrink_cache are needed also with preemptive kernel. Yes of course, the spinlock regions still have to be analyzed and both patches have to be maintained for that. Long duration spinlocks are bad by any measure, and have to be dealt with anyway. 
> > Finally, with preemption, rescheduling can be forced with essentially zero > > latency in response to an arbitrary interrupt such as IO completion, whereas > > the non-preemptive kernel will have to 'coast to a stop'. In other words, > > the non-preemptive kernel will have little lags between successive IOs, > > whereas the preemptive kernel can submit the next IO immediately. So there > > are bound to be loads where the preemptive kernel turns in better latency > > *and throughput* than the scheduling point hack. > > The I/O pipeline is big enough that a few msec before or later in a > submit_bh shouldn't make a difference, the batch logic in the > ll_rw_block layer also try to reduce the reschedule, and last but not > the least if the task is I/O bound preemptive kernel or not won't make > any difference in the submit_bh latency because no task is eating cpu > and latency will be the one of pure schedule call. That's not correct. For one thing, you don't know that no task is eating CPU, or that nobody is hogging the kernel. Look at the above, and consider the part about the little lags between IOs. > > Mind you, I'm not devaluing Andrew's work, it's good and valuable. However > > it's good to be aware of why that approach can't equal the latency-busting > > performance of the preemptive approach. > > I also don't want to devaluate the preemptive kernel approch (the mean > latency it can reach is lower than the one of the lowlat kernel, however > I personally care only about worst case latency and this is why I don't > feel the need of -preempt), This is exactly the case that -preempt handles well. On the other hand, trying to show that scheduling hacks satisfy any given latency bound is equivalent to solving the halting problem. I thought you had done some real time work? > but I just wanted to make clear that the > idea that is floating around that preemptive kernel is all goodness is > very far from reality, you get very low mean latency but at a price. 
A price lots of people are willing to pay.

By the way, have you measured the cost of -preempt in practice?

--
Daniel
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com! newsfeed.direct.ca!look.ca!feed2.news.rcn.net!rcn!dca6-feed2.news.digex.net! intermedia!newsfeed1.cidera.com!Cidera!news2.dg.net.ua!bn.utel.com.ua! carrier.kiev.ua!not-for-mail From: Andrew Morton <a...@zip.com.au> Newsgroups: lucky.linux.kernel Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable Date: Tue, 8 Jan 2002 20:24:24 +0000 (UTC) Organization: unknown Lines: 55 Sender: n...@horse.lucky.net Approved: newsmas...@lucky.net Message-ID: <3C3B5305.267EFC14@zip.com.au> References: <20020108030431.0099F38C58@perninha.conectiva.com.br> <Pine.LNX.4.21.0201081153160.19178-100000@freak.distro.conectiva> NNTP-Posting-Host: horse.lucky.net Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Trace: horse.lucky.net 1010521464 72669 193.193.193.118 (8 Jan 2002 20:24:24 GMT) X-Complaints-To: usenet@horse.lucky.net NNTP-Posting-Date: Tue, 8 Jan 2002 20:24:24 +0000 (UTC) X-Authentication-Warning: vasquez.zip.com.au: Host r...@zipperii.zip.com.au [61.8.0.87] claimed to be zip.com.au X-Mailer: Mozilla 4.77 [en] (X11; U; Linux 2.4.18pre1 i686) X-Accept-Language: en X-Mailing-List: linux-kernel@vger.kernel.org X-Comment-To: Marcelo Tosatti Marcelo Tosatti wrote: > > > Andrew Morten`s read-latency.patch is a clear winner for me, too. > > AFAIK Andrew's code simply adds schedule points around the kernel, right? > > If so, nope, I do not plan to integrate it. I haven't sent it to you yet :) It improves the kernel. That's good, isn't it? (There are already forty or fifty open-coded rescheduling points in the kernel. That patch just adds the missing (and most important) ten). BTW, with regard to the "preempt and low-lat improve disk throughput" argument. I have occasionally seen small throughput improvements, but I think these may be just request-merging flukes. Certainly they were very small. 
The one area where it sometimes makes a huuuuuge throughput improvement
is software RAID. Much of the VM and dirty buffer writeout code assumes
that submit_bh() starts I/O. Guess what? RAID's submit_bh() sometimes
*doesn't* start I/O. Because the IO is started by a different thread.

With the Riel VM I had a test case in which software RAID completely and
utterly collapsed because of this. The machine was spending huge amounts
of time spinning in page_launder(), madly submitting I/O, but never
yielding, so the I/O wasn't being started.

-aa VM has an open-coded yield in shrink_cache() which prevents that
particular collapse. But I had a report yesterday that the mini-ll patch
triples throughput on a complex RAID stack in 2.4.17. Same reason.

Arguably, this is a RAID problem - raidN_make_request() should be
yielding. But it's better to do this in one nice, single, reviewable
place - submit_bh(). However that won't prevent wait_for_buffers() from
starving the raid thread.

RAID is not alone. ksoftirqd, keventd and loop_thread() also need
reasonably good response times.

But given the number of people who have been providing feedback on this
patch, and on the disk-read-latency patch, none of this is going
anywhere, and mine will be the only Linux machines which don't suck.
(Takes ball, goes home).

-
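
Editorial sketch: the software RAID collapse described above comes down to a submitter that keeps queueing I/O without ever letting the thread that actually starts it run. The deterministic C toy below models only that interaction. It is an assumption-laden illustration, not the RAID or VM code: `toy_io_queue`, `toy_io_thread_run`, `toy_submit` and `toy_writeout` are hypothetical names, and the "yield" is modelled as a direct call to the I/O thread's work function.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: requests pile up in `queued` until the I/O thread runs. */
struct toy_io_queue {
    unsigned queued;     /* submitted but not yet started */
    unsigned started;    /* actually handed to the "disk" */
    unsigned max_depth;  /* deepest backlog seen so far */
};

/* Stand-in for the separate thread that really starts the I/O. */
static void toy_io_thread_run(struct toy_io_queue *q)
{
    q->started += q->queued;
    q->queued = 0;
}

/* Stand-in for a submit path; it yields to the I/O thread only if asked. */
static void toy_submit(struct toy_io_queue *q, bool yield_after_submit)
{
    assert(q);
    q->queued++;
    if (q->queued > q->max_depth)
        q->max_depth = q->queued;
    if (yield_after_submit)
        toy_io_thread_run(q);   /* models yielding the CPU at submit time */
}

/* Submit n requests in a tight loop, as a spinning writeout path might. */
static void toy_writeout(struct toy_io_queue *q, unsigned n,
                         bool yield_after_submit)
{
    for (unsigned i = 0; i < n; i++)
        toy_submit(q, yield_after_submit);
}
```

Running `toy_writeout()` with `yield_after_submit` false leaves every request queued and none started for as long as the loop spins; with it true, the backlog never exceeds one request. That is only the scheduling shape of the problem; the real code paths named in the message are far more involved.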
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com! news-x2.support.nl!news-x.support.nl!surfnet.nl!newsfeed.media.kyoto-u.ac.jp! newshub2.rdc1.sfba.home.com!news.home.com!newshub1-work.rdc1.sfba.home.com! gehenna.pell.portland.or.us!nntp-server.caltech.edu!nntp-server.caltech.edu! mail2news96 Newsgroups: mlist.linux.kernel Date: Wed, 9 Jan 2002 00:02:48 +0100 (CET) From: Luigi Genoni <ker...@Expansa.sns.it> X-To: Daniel Phillips <phill...@bonn-fries.net> X-cc: Andrea Arcangeli <and...@suse.de>, Anton Blanchard <an...@samba.org>, Dieter N?tzel <Dieter.Nuet...@hamburg.de>, Marcelo Tosatti <marc...@conectiva.com.br>, Rik van Riel <r...@conectiva.com.br>, Linux Kernel List <linux-ker...@vger.kernel.org>, Andrew Morton <a...@zip.com.au>, Robert Love <r...@tech9.net> Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable Message-ID: <linux.kernel.Pine.LNX.4.33.0201082351020.1185-100000@Expansa.sns.it> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Approved: n...@nntp-server.caltech.edu Lines: 81 On Tue, 8 Jan 2002, Daniel Phillips wrote: > On January 8, 2002 04:29 pm, Andrea Arcangeli wrote: > > > The preemptible approach is much less of a maintainance headache, since > > > people don't have to be constantly doing audits to see if something changed, > > > and going in to fiddle with scheduling points. > > > > this yes, it requires less maintainance, but still you should keep in > > mind the details about the spinlocks, things like the checks the VM does > > in shrink_cache are needed also with preemptive kernel. > > Yes of course, the spinlock regions still have to be analyzed and both > patches have to be maintained for that. Long duration spinlocks are bad > by any measure, and have to be dealt with anyway. 
> > > > Finally, with preemption, rescheduling can be forced with essentially zero > > > latency in response to an arbitrary interrupt such as IO completion, whereas > > > the non-preemptive kernel will have to 'coast to a stop'. In other words, > > > the non-preemptive kernel will have little lags between successive IOs, > > > whereas the preemptive kernel can submit the next IO immediately. So there > > > are bound to be loads where the preemptive kernel turns in better latency > > > *and throughput* than the scheduling point hack. > > > > The I/O pipeline is big enough that a few msec before or later in a > > submit_bh shouldn't make a difference, the batch logic in the > > ll_rw_block layer also try to reduce the reschedule, and last but not > > the least if the task is I/O bound preemptive kernel or not won't make > > any difference in the submit_bh latency because no task is eating cpu > > and latency will be the one of pure schedule call. > > That's not correct. For one thing, you don't know that no task is eating > CPU, or that nobody is hogging the kernel. Look at the above, and consider > the part about the little lags between IOs. > > > > Mind you, I'm not devaluing Andrew's work, it's good and valuable. However > > > it's good to be aware of why that approach can't equal the latency-busting > > > performance of the preemptive approach. > > > > I also don't want to devaluate the preemptive kernel approch (the mean > > latency it can reach is lower than the one of the lowlat kernel, however > > I personally care only about worst case latency and this is why I don't > > feel the need of -preempt), > > This is exactly the case that -preempt handles well. On the other hand, > trying to show that scheduling hacks satisfy any given latency bound is > equivalent to solving the halting problem. > > I thought you had done some real time work? 
>
> > but I just wanted to make clear that the
> > idea that is floating around that preemptive kernel is all goodness is
> > very far from reality, you get very low mean latency but at a price.
>
> A price lots of people are willing to pay

Sometimes that is probably not a good deal. In reality preempt is good
in many scenarios, as I said, and I agree that on desktops, and on
dedicated servers where just one application runs and the CPU is
probably idle most of the time, users do get a feeling of speed.
But please consider heavily loaded servers with 40 or more users: some
are running gcc, others g77 or g++ compilations, someone runs pine or
mutt or kmail, and netscape, and mozilla, and emacs (some from xterm,
KDE or GNOME), and so on. You can also have 4 or 8 CPUs, but they are
not infinite ;) (though I am mainly thinking of dual Athlon systems).
There is a lot of memory and disk I/O.
This is not an unusual scenario on the interactive servers used at SNS.
Here preempt has too high a price.

> > By the way, have you measured the cost of -preempt in practice?
>

Yes, I did a lot of tests, and with the current preempt patch I was
definitely seeing too big a performance loss.
Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com! news.tele.dk!small.news.tele.dk!newsfeed4.cidera.com!newsfeed1.cidera.com! Cidera!news2.dg.net.ua!bn.utel.com.ua!carrier.kiev.ua!not-for-mail From: Dieter =?iso-8859-15?q?N=FCtzel?= <Dieter.Nuet...@hamburg.de> Newsgroups: lucky.linux.kernel Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable Date: Wed, 9 Jan 2002 00:16:25 +0000 (UTC) Organization: DN Lines: 104 Sender: n...@horse.lucky.net Approved: newsmas...@lucky.net Message-ID: <20020109001450Z288633-13996+2793@vger.kernel.org> References: <Pine.LNX.4.33.0201082351020.1185-100000@Expansa.sns.it> NNTP-Posting-Host: horse.lucky.net Mime-Version: 1.0 Content-Type: text/plain; Content-Transfer-Encoding: 8bit X-Trace: horse.lucky.net 1010535385 94833 193.193.193.118 (9 Jan 2002 00:16:25 GMT) X-Complaints-To: usenet@horse.lucky.net NNTP-Posting-Date: Wed, 9 Jan 2002 00:16:25 +0000 (UTC) X-Mailer: KMail [version 1.3.2] In-Reply-To: <Pine.LNX.4.33.0201082351020.1185-100000@Expansa.sns.it> X-Mailing-List: linux-kernel@vger.kernel.org X-Comment-To: Luigi Genoni On Wednesday, 9. January 2002 00:02, Luigi Genoni wrote: > On Tue, 8 Jan 2002, Daniel Phillips wrote: > > On January 8, 2002 04:29 pm, Andrea Arcangeli wrote: [-] > > > I also don't want to devaluate the preemptive kernel approch (the mean > > > latency it can reach is lower than the one of the lowlat kernel, > > > however I personally care only about worst case latency and this is why > > > I don't feel the need of -preempt), > > > > This is exactly the case that -preempt handles well. On the other hand, > > trying to show that scheduling hacks satisfy any given latency bound is > > equivalent to solving the halting problem. > > > > I thought you had done some real time work? > > > > > but I just wanted to make clear that the > > > idea that is floating around that preemptive kernel is all goodness is > > > very far from reality, you get very low mean latency but at a price. 
> >
> > A price lots of people are willing to pay
>
> Probably sometimes they are not making a good business. In the reality
> preempt is good in many scenarios, as I said, and I agree that for
> desktops, and dedicated servers where just one application runs, and
> probably the CPU is idle the most of the time,

OK, good. You are pretty much on the same line as I am.
Should we start to differentiate not only between UP and SMP systems
but also between desktops and (big) servers?

I remember someone saying, "Think, this patch is worth it only for
~0.05% of the Linux users..." (He meant the multi-CPU SMP system users.)

Almost 99.95% of Linux users are running desktops, and I am somewhat
tired of saying, "sorry, Linux is under development..."

Look at the imprint of the famous German c't magazine (they are not even
known as Linux bashers...;-). It shows little penguins falling like
domino stones (starting with 2.4.17).

Let me rephrase it:
I appreciate all your great work, and I know "only" a little of its
internals, but we should do some interactivity improvements for the 2.4
kernel, too.

I know what Andrew's (lowlatency patch) and Robert's (George Anzinger's)
preempt patch are worth. In short, the system (bigger desktop) flies.
The holy grail would be a combination of preempt+lock-break plus
lowlatency and Ingo's O(1) scheduler.

My main focus lies on 3D graphics, not the kernel, and I use KDE (yes, a
little luxury :-) 'cause KDE is C++ and most visualization systems are C
and later C++. Without the above patches even my 1 GHz Athlon II, 640 MB,
feels sluggish. But I don't forget to think about throughput, which is
useful even for "heavy" compiler runs...

> indeed users have a speed
> feeling. Please consider that on eavilly loaded servers, with 40 and more
> users, some are running gcc, others g77, others g++ compilations, someone
> runs pine or mutt or kmail, and netscape, and mozilla, and emacs (someone
> form xterm kde or gnome), and and
> and... You can have also 4/8 CPU butthey are not infinite ;) (but I talk
> mainly thinking of dualAthlon systems).
> there is a lot of memory and disk I/O.
> This is not a strange scenary on the interactive servers used at SNS.
> Here preempt has a too high price

That's why preempt is a compile time option, btw.

> > By the way, have you measured the cost of -preempt in practice?
>
> Yes, I did a lot of tests, and with current preempt patch definitelly
> I was seeing a too big performance loss.

Have you tried with stock 2.4.17 or with additional patches?
2.4.17-rc2aa2 (10_vm-21)?

The latter makes big differences in throughput for me (with and without
preempt).

I am preparing some numbers. Does anybody want some special tests?

dbench (yes, I know...) with and without MP3 during run latencytest0.42-png bonnie++ getc_putc

Thank you for all your serious answers.
This was definitely not intended as a flamewar start.

-Dieter

--
Dieter Nützel
Graduate Student, Computer Science

University of Hamburg
Department of Computer Science
@home: Dieter.Nuet...@hamburg.de