Minix Scheduler

Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!ames!umd5!uvaarpa!mcnc!
gatech!uflorida!codas!alxfin!rtmvax!johnc
From: jo...@rtmvax.UUCP (John Connin)
Newsgroups: comp.os.minix
Subject: Minix Scheduler
Message-ID: <1002@rtmvax.UUCP>
Date: 9 Mar 88 22:09:59 GMT
Organization: AFS Inc. REMOTE Site -rtmvax- Orlando,FL
Lines: 13
Keywords: minix, scheduler, multitasking

It appears to me that the Minix scheduling policy, although facilitating
a simple implementation, does not adequately support smooth operation
of concurrent applications.  Additionally, based on various comp.os.minix
messages it appears that others have make (in one form or another) similar
observations.

Before I possibly go off the deepend, I would be appreciative of others
(a) sharing any definitive work done in this area, and lacking such (b)
the sharing of ideas as to how the Minix scheduler should be changed,
both respect to policy and implementation.

    ------------------------------------------------------
     johnc!rtmvax!alxfin!codas          John Connin
    ------------------------------------------------------

Path: utzoo!mnetor!uunet!munnari!otc!metro!ipso!runx!brucee
From: bru...@runx.ips.oz (Bruce Evans)
Newsgroups: comp.os.minix
Subject: Re: Minix Scheduler (and tty performance)
Message-ID: <1432@runx.ips.oz>
Date: 16 Mar 88 16:30:40 GMT
References: <1002@rtmvax.UUCP>
Reply-To: bru...@runx.OZ (Bruce Evans)
Organization: RUNX  Un*x Timeshare. Sydney, Australia.
Lines: 97

In the referenced article jo...@rtmvax.UUCP (John Connin) writes:

> It appears to me that the Minix scheduling policy, although facilitating
> simple implementation, does not adequately support smooth operation
> of concurrent applications.  ...

I am doing a major rewrite of the scheduler in order to get better performance
from the new tty driver. Already performance has been improved from something
below 1200 baud to over 4800 baud on a 5 MHz PC. Some handshaking is still
necessary at this speed, but the interrupt handler and task switching is no
longer a major bottleneck. Zmodem runs on Minix and gives no errors at 4800
baud with a block size of 1024 over a cable linked to an Xenix machine.
Actual throughput is only 2400 baud.

This was mostly just a re-implementation, without major design changes. I'm
quite happy with the design of the scheduler. As Andy says, the complications
are mainly in the device drivers. The scheduler just has to be fast to handle
the heavy message traffic, and anything more complicated is likely to just
slow the system down.

I think the main problem is that fs is not really multi-tasking. While a
device driver is busy, fs is held busy and no tasks other than the one that
called fs can do I/O. They tend to do a little calculation, then line up on
a long queue waiting for fs. The drivers themselves are also a bit slow.
For instance, I have a floppy driver for another machine which reads 14 of 16
sectors per cylinder every revolution (after the seek). The Minix ones
seem to read 2 of 9 (1 block). They must be doing too much calculation for
the inter-sector gap (more likely, fs is). I have less experience with hard
disk drivers, but no doubt similar improvements are possible. DOS already
has them :-(. (On my new 386, Minix runs and the hard disk is about as fast
as for DOS :-). The disk has become the bottleneck.)

Summary of changes to speed up ttys (semi-chronological order)
--------------------------------------------------------------
(1) Modified clock handler to avoid unnecessary message passing and even
    save/restore. This alone gave a system-wide improvement of 14%.
(2) Eliminated unecessary locks and restores. Only the task switching runs
    with interrupts off (plus a few critical regions). This didn't make
    much difference.
(3) Re-coded tty interrupt handlers. The ttys started to work properly on
    input at 1200 baud.
(4) Changed system calls and interrupts to save the state on the user stacks
    instead of in the process table.  (The stack pointers alone are put in
    the table). mpx88.s was completely rewritten. It's much cleaner and a
    bit shorter now, and keeps the stack pointer valid at all times so the
    switch can be single-stepped by a debugger.
    fork() and exec() in mm had to be changed. The modularity of the system
    just got in the way.
(5) Introduced a new message structure for interrupt messages. It's only 10
    bytes instead of 24 and can be copied by structure assignment instead
    of cp_mess(). Made interrupt messages mostly read-only so they could be
    initialised at compile time. Achieved 2400 baud.
(6) Re-wrote interrupt() with subroutine calls to mini_send(), ready(),
    and pick_proc() in-line. It's not much longer because the subroutines
    collapsed into 3 or 4 C instructions each, but is less maintainable.
    Improved the subroutines as well. This didn't help the tty drivers
    but gave a few percent on another benchmark.
(7) Replaced the 1-deep message pointer queues in interrupt() by proper
    queues built in a small pool of interrupt messages. This was to fix
    a bug. The keyboard interrupt and tty input and output interrupts all
    used to compete for the same tty task. That's up to 3 messages when
    only 2 could be handled (it's worse with multiple ttys).
(8) Put in official mods to avoid deadlocks. No obvious problems.
(9) Moved flushing of the new interrupt message queue out of interrupt()
    into mini_rec(). It became much simpler, just 5 lines or so with
    much the same logic as reception of ordinary queued messages. At last!
    4800 baud. The flushing in interrupt() depended on repeated interrupts
    from the clock to keep it going. This gave 16 msec latency for some
    interrupts by was probably not the problem here. It was probably just
    the overhead.

Results
-------
Interrupt processing time for the tty input interrupt is now 170 machine
instructions.  This is before the high-level tty driver does *anything*. At
4800 baud, this consumes about 40% of the power a 5 MHz PC and leaves enough
time for the high level driver (and not much else). Zmodem only works
(at a reduced throughput) because the tty driver's buffers are larger than
zmodem's block size, so zmodem's handshaking takes effect.

To be done
----------
(1) Improve high-level tty driver.
(2) Make handshaking work (both XON/XOFF and RTS/CTS/DTR). Now there is some
    handshaking in the low level driver, but it doesn't mesh with the high
    level one. I like to run all local terminals at the same baud rate
    (9600 now), even if handshaking throttles it way down. It saves on stty's.
(3) Consider allowing interrupts while kernel is switching tasks. This
    could reduce latency to well under 100 instructions but requires more
    bookkeeping.
(4) Accumulate several characters in the low level tty driver before calling
    interrupt(). This requires some sort of timer interrupt to flush the
    queue (busy-waiting is no good).
(5) Convince the manufacturer to do (4) in hardware :-).

Bruce Evans
Internet: bru...@runx.ips.oz.au    UUCP: uunet!runx.ips.oz.au!brucee

Path: utzoo!mnetor!uunet!husc6!purdue!umd5!mimsy!chris
From: ch...@mimsy.UUCP (Chris Torek)
Newsgroups: comp.os.minix
Subject: Re: Minix Scheduler (and tty performance)
Message-ID: <10721@mimsy.UUCP>
Date: 19 Mar 88 18:22:08 GMT
References: <1002@rtmvax.UUCP> <1432@runx.ips.oz>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 15

In article <1...@runx.ips.oz> bru...@runx.ips.oz (Bruce Evans) writes:
>(4) Changed system calls and interrupts to save the state on the user stacks
>    instead of in the process table.

Saving syscall state in the user stack is arguably correct (although
4BSD Unix, at least, leaves it in a per-process kernel stack).  Saving
interrupt state on the user stack, however, is `wrong'.  coroutines
(among others) may want to leave stack stuff undisturbed at times.

One reason not to use user stacks for syscalls is the old `what if you
run out of space' argument.  I have no idea whether Minix tries to
handle this, but it is something worth considering.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	ch...@mimsy.umd.edu	Path:	uunet!mimsy!chris