Subject: Signal handler compatibility?
Date: Tue, 10 Dec 91 02:47:13 -0500
From: tytso@ATHENA.MIT.EDU (Theodore Ts'o)
To: Linux-activists@joker.cs.hut.fi
Reply-To: tytso@athena.mit.edu

Well, it looks like I have job control pretty much functioning with a version of bash that understands job control. One or two more days while I do more testing and packaging of diffs, and I'll be sending the kernel changes to Linus, and making them available on TSX-11 so that more adventurous souls who want job control sooner rather than later can grab it and try it out.

One of the problems which I've run into during my testing is that when a read is interrupted by a ^Z and the process is later restarted, the system call that was in progress returns -EINTR, and the program is supposed to understand the EINTR and restart the system call. Well, the problem is that most programs *don't* bother to understand EINTR, and so it is highly desirable to have restartable system calls a la BSD 4.3.

Unfortunately, in order to do that, it will be necessary to change the signal trampoline code so that when the signal handler exits, a sigreturn system call is executed. This in fact will result in cleaner and clearer code in the kernel, but since the signal trampoline code is part of libc.a, it would mean that code which used signal handlers would need to be recompiled. In other words, a kernel which used the new way of doing signal handling would not be binary compatible with binaries that used signal handlers.

Now, I can think of ways that would probably work in making the resulting kernel backwards compatible, but they all involve really gross hacks that would be really painful, and I'm not sure that it would be a good thing to be introducing ugly hacks to maintain backwards compatibility when Linux is still in beta test.

How do people feel about this one? How much of the code which has been ported actually uses signal handlers? Would people be willing to recompile it with a new libc.a?

Linus, would you be willing to take a change in the signal handling code that might not be backwards compatible?

- Ted
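
For reference, the burden that EINTR places on user programs looks roughly like the following. This is a minimal sketch, assuming a POSIX-style read() that returns -1 with errno set to EINTR when a signal interrupts it before any data is transferred; the name read_retry is hypothetical and does not come from the message or from libc.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (illustration only): call read() again whenever
 * it fails with EINTR, so the caller never sees the interruption. */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);

    /* n is now a byte count, 0 at end of file, or -1 with errno set
     * to some error other than EINTR. */
    return n;
}
```

Every call site that can block needs a loop of this kind when the kernel does not restart the call itself, which is why most programs simply omit it.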
Subject: Looking for alpha testers for job control changes
Date: Thu, 12 Dec 91 13:30:42 -0500
From: tytso@ATHENA.MIT.EDU (Theodore Ts'o)
To: linux-activists@joker.cs.hut.fi
Reply-To: tytso@athena.mit.edu

I'm looking for a few people who are willing to take a look at my changes to add job control to Linux and test them out. The ideal alpha tester is someone who is willing to try out these changes and see 1) whether they can find some way of using job control which breaks my implementation, 2) whether the changes conflict with changes they are planning to make to the kernel (mostly changes in exit.c and signal.c, with additional changes in tty_io.c, tty_ioctl.c, sys.c, and open.c), and/or 3) whether they reasonably correspond with other implementations of job control, particularly BSD 4.3 and Sys V.

I make no guarantees about the correctness of these patches, and any future patches which I send out will be relative to Linux 0.11, NOT to these alpha patches. The main reason why I want people to try them out is that I've implemented job control almost entirely from the POSIX spec, and I would like other people to check whether my interpretation of the POSIX spec is compatible with the rest of the world.

If you feel up to trying them out, please let me know. If you want to look at the changes before deciding, they can be found in TSX-11:~ftp/ALPHA/jobcontrol.

One generic bug you will hit when you try this out is that there is apparently a problem with how gcc handles signals. If you send gcc a SIGCONT (you don't even need to stop it first), it will die with an IOT trap. I suspect gcc needs to be recompiled with a recent libc.a to fix this problem.

- Ted
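
The SIGCONT behaviour described above can be checked with a small test: by default, SIGCONT sent to a process that is not stopped should have no visible effect. The following is a minimal sketch, assuming the standard POSIX fork(), kill(), and waitpid() calls; the function name and the exit code 7 are hypothetical and are not part of the patches.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of the original patches.
 * Forks a child that blocks in read(), sends it SIGCONT while it is
 * running (not stopped), then releases it and returns its wait status.
 * Returns -1 if any step fails. */
int sigcont_running_child(void)
{
    int fds[2];
    int status = 0;
    int ok = 1;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        char c;
        close(fds[1]);
        /* Child: wait for one byte, then exit with the illustrative code 7. */
        _exit(read(fds[0], &c, 1) == 1 ? 7 : 1);
    }

    assert(pid > 0);                 /* parent from here on */
    close(fds[0]);
    if (kill(pid, SIGCONT) == -1)    /* default action: ignored unless stopped */
        ok = 0;
    if (write(fds[1], "x", 1) != 1)  /* release the child */
        ok = 0;
    close(fds[1]);
    if (waitpid(pid, &status, 0) != pid)
        ok = 0;
    return ok ? status : -1;
}
```

On a correct implementation WIFEXITED(status) is true and WEXITSTATUS(status) is 7; a child killed by the signal would report WIFSIGNALED(status) instead.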
Subject: Final version of the job control patches....
Date: Thu, 2 Jan 92 19:00:19 -0500
From: tytso@ATHENA.MIT.EDU (Theodore Ts'o)
To: Linux-activists@joker.cs.hut.fi
Reply-To: tytso@athena.mit.edu

The final copy of my job control patches to Linux 0.11 has been sent to Linus; if you want an early peek at them, look in tsx-11.mit.edu:/ALPHA/jobcontrol. The following is the NOTES file from that directory:

New Features added by these patches

* Job control (setpgrp(), tcsetpgrp(), signals, wait(), etc.)
* gethostname(), sethostname() - try using the included hostname program!
* getrusage() - not completely implemented, but the skeleton is in place.
* getrlimit(), setrlimit() - try using the ulimit command in bash!
  - nothing looks at the limits yet, though
* gettimeofday(), settimeofday() - for that microsecond accuracy some people demand :-) (well, 0.01 second accuracy, actually....)

Notes about patches:

1) I integrated in John Kohl's patches to tty_io.c, since I had also made changes, and at least one of his patches depended on my patch being made first. I have included the other patches which he sent to me as separate patches in separate files, in case you did not get a copy from him.

2) I changed the name of system_call.s to sys_call.s, since system_call.s is too long for RCS and !@#$! 14 character filenames. You should either also make this change (and modify the Makefile appropriately) or edit the patch file to change the filename back to system_call.s.

3) Currently, MAXHOSTNAMELEN is set at 8 characters. I think it should be changed to 64 characters (change in sys/param.h), to make it like BSD. Unfortunately, doing this will require recompiling libc.a and the shell, since it uses the uname() system call. Not a big deal, but it means that when you boot with the new kernel, you need to make sure you have the right version of /bin/sh installed.

4) When we add time zone support into the library, we will need to do something about how Linux sets the time from the CMOS registers. The problem is that the CMOS clock ticks local time, and POSIX specifies that we should be ticking GMT. Currently, we're not, so it's not a problem; but once we start including TZ routines in the libc, we will need to deal with it. My attempt to solve the problem can be found in sys.c: adjust_time(). I don't like the solution all that much, but it's the best I could think of at the time.

Things to think about:

A) Perhaps the task_struct should go in malloc()'ed memory. This will free close to 1k for the kernel stack. We may not need to do it now, but as more things get added to the task_struct (like BSD-style group lists, etc.), and as the kernel gets more complicated (for example, when networking code is added), I suspect we will need to do it sooner or later.

B) Speaking of which, I haven't looked at the new VM stuff that does paging. Does the malloc() routine still work, or will it need to be modified to coexist with paging?

C) Another good idea would be to unify the buffer management and the free pool memory management, a la SunOS and the latest Sys VR4 design. This allows free memory pages to be used as buffers and vice versa.

D) Currently, gid_t is an unsigned char. I would strongly suggest that we change it to be an unsigned short, and represent it internally in the kernel as an unsigned int. This will make life much easier if (ha, ha) we ever decide to add the Andrew Filesystem (AFS) to Linux. Since this affects the stat structure, this will probably require recompiling large numbers of programs.

E) If a Linux machine stays up for more than 248 days, the number of clock ticks will exceed 2**31, and jiffies will overflow. I suspect something really messy will happen then. We can make things better by making jiffies (and all places that refer to offsets off jiffies) unsigned longs instead of signed longs, but even so, it is still conceivable for a machine to be up for more than 500 days. This isn't a problem now, since Linux is in beta test, but saying that Unix systems are too unstable to worry about what happens when they've been up for more than 500 days seems a bit wrong to me.

- Ted
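
The uptime figures in item E can be reproduced with a little arithmetic. This is a minimal sketch, assuming a 32-bit counter and 100 ticks per second (consistent with the 0.01 second resolution mentioned in the feature list); the names days_until_wrap and tick_after are hypothetical. The wrap-tolerant comparison is a commonly used idiom, shown only as one possible way to handle "offsets off jiffies"; it is not something the message proposes.

```c
#include <assert.h>
#include <stdint.h>

#define HZ 100  /* assumption: 100 clock ticks per second */

/* Days of uptime before a tick counter with `bits` value bits wraps. */
double days_until_wrap(unsigned bits)
{
    assert(bits > 0 && bits <= 32);
    double ticks = (double)(UINT64_C(1) << bits);
    return ticks / HZ / 86400.0;   /* 86400 seconds per day */
}

/* Nonzero if tick value a is later than b, even across a wrap, provided
 * the two values are less than 2**31 ticks apart. Relies on the usual
 * two's-complement conversion from uint32_t to int32_t. */
int tick_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}
```

Under these assumptions days_until_wrap(31) is about 248.6 and days_until_wrap(32) is about 497.1, which matches the 248-day figure and the "more than 500 days" remark. A timeout computed shortly before the counter wraps still compares correctly with tick_after(), whereas a plain a > b test would not.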