Path: sparky!uunet!dtix!darwin.sura.net!jvnc.net!newsserver.technet.sg!ntuix!ntrc25.ntrc.ntu.ac.sg!othman
From: oth...@ntrc25.ntrc.ntu.ac.sg (othman (EEE/Div 4))
Newsgroups: comp.unix.bsd
Subject: Shared lib benchmarks, and experiences
Message-ID: <1992Dec3.071056.27426@ntuix.ntu.ac.sg>
Date: 3 Dec 92 07:10:56 GMT
Sender: n...@ntuix.ntu.ac.sg (USENET News System)
Organization: Nanyang Technological University - Singapore
Lines: 41
Nntp-Posting-Host: ntrc25.ntrc.ntu.ac.sg
X-Newsreader: TIN [version 1.1 PL6]

I've installed Joerg's shared lib with little trouble.

The improvement in code size is significant for small programs but can be worse for larger programs such as cc1: cc1 built with the shared lib is actually larger than with the static lib, though the SysV manual warns of this problem. There is little hard-disk space that I managed to save, even after installing the compressed man pages: at most 20 megabytes. That is too little, but then there is still X386 with xview3, which takes up 32 Mbytes. No wonder the linux guys refused to give me figures for comparison.

Does anyone use shared libs for X applications, or even the server? I'll do it later, but it would help a lot if someone could share their experiences with me. There is, however, some saving in virtual memory size, about 20 Kbytes.

386dx25, no cache, no 387.

Shared lib:
ld -o dhry1 /usr/lib/crt0_s.o dhry-1.1.o -lc_s -lgnulib
-rwxr-xr-x  1 root  13744 Dec  2 15:49 dhry1
maxtor200# dhry1
Dhrystone(1.1) time for 500000 passes = 106
This machine benchmarks at 4675 dhrystones/second

Static lib:
ld -o dhry1.st /usr/lib/crt0.o dhry-1.1.o -lc -lgnulib
-rwxr-xr-x  1 root  23361 Dec  2 15:54 dhry1.st
maxtor200# dhry1.st
Dhrystone(1.1) time for 500000 passes = 115
This machine benchmarks at 4805 dhrystones/second

UID   PID  PPID CPU PRI NI VSZ RSS WCHAN STAT TT   TIME    COMMAND
  0  1904   110  20  33  0  88   0 -     R    p1   0:30.96 (dhry1)
  0  1922   110  13  31  0  96   0 -     R    p1   0:04.59 (dhry1.st)

--
Othman bin Ahmad, School of EEE, Nanyang Technological University, Singapore 2263.
Internet Email: eoah...@ntuix.ntu.ac.sg
Path: sparky!uunet!elroy.jpl.nasa.gov!sdd.hp.com!zaphod.mps.ohio-state.edu!uwm.edu!ogicse!news.u.washington.edu!serval!hlu
From: h...@eecs.wsu.edu (H.J. Lu)
Newsgroups: comp.unix.bsd
Subject: Re: Shared lib benchmarks, and experiences
Message-ID: <1992Dec9.213442.18980@serval.net.wsu.edu>
Date: 9 Dec 92 21:34:42 GMT
Article-I.D.: serval.1992Dec9.213442.18980
References: <1992Dec3.071056.27426@ntuix.ntu.ac.sg>
Sender: n...@serval.net.wsu.edu (USENET News System)
Organization: School of EECS, Washington State University
Lines: 15

In article <1992Dec3.071056.27...@ntuix.ntu.ac.sg>, oth...@ntrc25.ntrc.ntu.ac.sg (othman (EEE/Div 4)) writes:
|>
|> I've installed Joerg's shared lib with little problem.
|>
|> The improvement in code size is significant for small programs but can be
|> worse for larger programs such as cc1. cc1 using shared lib is actually
|> larger than static lib, but SysV manual warns of this problem.

Very strange. I thought shared libs always won. At least it is true under Linux: no matter how large or small the program is, the code size is always smaller if the program is linked with the shared lib.

[..]

H.J.
Newsgroups: comp.unix.bsd
Path: sparky!uunet!mcsun!news.funet.fi!hydra!klaava!torvalds
From: torva...@klaava.Helsinki.FI (Linus Torvalds)
Subject: Re: Shared lib benchmarks, and experiences
Message-ID: <1992Dec9.233940.5174@klaava.Helsinki.FI>
Organization: University of Helsinki
References: <1992Dec3.071056.27426@ntuix.ntu.ac.sg>
Date: Wed, 9 Dec 1992 23:39:40 GMT
Lines: 129

In article <1992Dec3.071056.27...@ntuix.ntu.ac.sg> oth...@ntrc25.ntrc.ntu.ac.sg (othman (EEE/Div 4)) writes:
>
>I've installed Joerg's shared lib with little problem.
>
>The improvement in code size is significant for small programs but can be
>worse for larger programs such as cc1. cc1 using shared lib is actually
>larger than static lib, but SysV manual warns of this problem.
> There is little hard-disk that I manage to save even after installing
>the compressed man pages.
> At most 20 megabytes. That is too little but then there is still the
>X386 with xview3, which takes up 32Mbyte. No wonder the linux guys refused to
>give me figures for comparison.

A couple of comments here:

 - the linux way of doing shared libraries is different from the Joerg type, which seems to follow sysvr4 and SunOS. Under linux, shared binaries are *never* bigger than their unshared counterparts, as linux binaries don't have to contain any linking information at all. It's handled by a jump-table in the shared library image itself. This is one reason I prefer the linux way, although others seem to feel it's less dynamic and thus worse. Linux shared binaries are probably smaller than 386bsd's in all cases due to this.

 - The reason "the linux guys" "refused" to give you the information you wanted was probably that nobody cared, either due to your attitude or due to the fact that shared libraries under linux are by now the standard thing, and nobody much uses static binaries any more.

 - did you try X11 binaries? The difference is dramatic.

Just to clarify: shared libraries do indeed save *a lot* of disk-space if done right. And it's been tested: the linux rootdisk relies on shared libraries to pack in as many programs as it does into 1.2MB.

I didn't find any static binaries on my system (except for some uninteresting ones like "update" which is dated from last year..), so just to show an example, I made one. The binary in question is the openlook virtual window manager (olvwm), and here are the sizes (both are stripped):

# ll olvwm olvwm.static
-rwxr-xr-x   1 root     root       209920 Dec  5 17:36 olvwm
-rwxr-xr-x   1 root     root       427012 Dec 10 00:43 olvwm.static

As you can see, there is a doubling in size when linking statically, and you can guess which I want to use on my system with 2 40MB harddisks (I'm not kidding you).

I'd like to point out that 'olvwm' is not an extreme case: quite the reverse. It was just a binary that I could easily re-link, as I had the object files from a couple of days ago (as can be seen from the dates). With other binaries the savings are usually even more apparent: many X binaries contain mostly X library code when compiled statically, and can shrink to about 5-10% of their static size when recompiled to use shared libraries.

Just for fun I just checked my /usr/bin/X11 directory: it contains 75 binaries, of which 21 have the minimal linux binary size of 9220: this is 1kB of binary header, one page of code, and one page of data.
They could be shrunk yet more by linking them with the -N flag, which packs all the data together, but they haven't been (it's not the default gcc option under linux, as it means the binary won't get demand-loaded). Of the rest, 12 or 13 have a size of 13316 (one page more for either code or data), 6 more are yet another page bigger, and only 6 are more than 100kB in size.

I'd be willing to bet that that isn't true under 386bsd without shared libraries: X11 binaries without shared libraries have a tendency to be >300kB in size regardless of what they do. Thus the shared libraries mean that I can have a pretty good complement of the standard small X utilities without worrying about disk space.

In case somebody wants to actually check the sizes against the 386bsd binaries, here are a couple of examples (more or less randomly picked from the X binaries). I can't say how big the static versions are: I don't have them.

-rwxr-xr-x   1 root     root         9220 Oct  2 03:58 xclock
-rwxr-xr-x   1 root     root        13316 Oct  2 04:40 xsetroot
-rwxr-xr-x   1 root     root        21508 Oct  2 03:40 puzzle
-rwxr-xr-x   1 root     root        21508 Oct  2 03:56 xcalc
-rwxr-xr-x   1 root     root        37892 Oct  2 03:36 ico
-rwxr-xr-x   1 root     root       111620 Oct  2 03:51 twm

Feel free to come to your own conclusions. And yes, it's most obvious with X binaries, but it shows clearly even on normal binaries too.. A couple of small examples (yes, here I've used the -N flag to press them under the 9220 mark):

-rwxr-xr-x   1 root     root         4376 Sep  8 05:35 cat
-rwxr-xr-x   1 root     root         3888 Nov  9 19:12 printf
-rwxr-xr-x   1 root     root         3636 Nov  9 19:12 id

Note that the above are GNU binaries, not something that I've hacked up to be as small as possible. How big are they with static libs? I don't know, and I'm just as happy that way.

> Anyone uses shared-lib for X applications or even server?
>I'll do it later but it will help a lot if someone can share with me their
>experiences.
> However there is some saving in virtual memory size, about 20Kbyte.

Right now, running X11 with a couple of clients (3 xterms, xeyes, oclock, xgas), /proc/meminfo gives me (pasted from another xterm):

# cat /proc/meminfo
        total:    used:     free:    shared:  buffers:
Mem:  15355904  14942208    413696   2629632   6291456
Swap:  5521408         0   5521408

As you can see, out of 15+MB (16MB minus kernel memory) 6MB is used for buffers (it's dynamic, and I put an upper limit of 6MB on it so that it never grows to any more than that). About 9MB is used by user-level binaries: 3.6MB of this is the X-server itself (probably much of it due to the background 1024x768 pixmap of Calvin & Hobbes). And due to page sharing, I have 2.5MB more virtual memory than the amount of memory actually used. Not all of it is shared libraries (the shell binaries are probably sharing normal code pages as well), but most of it probably is. The above aren't doctored numbers: I've seen more than 3MB shared, but I've also seen less.

Anyway, for me the disk-space saved is more important. As to the timing checks you made: yes, shared libraries may slow things down. On the other hand, they can also speed things up: less memory used, less need to load in pages from disk etc.. I don't think the speed difference is much of an issue, but I haven't actually tested it at all.

		Linus
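[Editor's note: the fixed-slot jump-table scheme Linus describes above can be illustrated with a small C sketch. The slot names, table size and the split into "library" and "application" sides below are invented for illustration only; in the real linux scheme the table sits at an assigned virtual address inside the library image and is generated by special tools, whereas this stand-alone sketch uses an ordinary array so that it can actually be compiled and run.]

/* jump_table_sketch.c -- illustrative only, not the real linux libc layout.
 * The "library" exports a table of entry points whose slot numbers are
 * frozen forever; a caller uses only slot numbers, so it needs no symbol or
 * relocation information about the library, and a new library image can be
 * dropped in behind the same slots without relinking the caller. */
#include <stdio.h>

typedef long (*slot_fn)(long, long);

/* "library" side: the implementations may move around freely from one
 * version to the next, as long as the slot assignments never change. */
static long lib_add(long a, long b) { return a + b; }
static long lib_mul(long a, long b) { return a * b; }

#define SLOT_ADD 0
#define SLOT_MUL 1
#define SLOT_MAX 64      /* padding reserved for future entry points */

slot_fn jump_table[SLOT_MAX] = { lib_add, lib_mul };

/* "application" side: calls go through the table, never to the functions'
 * real addresses inside the library image. */
int main(void)
{
	printf("3 + 4 = %ld\n", jump_table[SLOT_ADD](3, 4));
	printf("3 * 4 = %ld\n", jump_table[SLOT_MUL](3, 4));
	return 0;
}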
Newsgroups: comp.unix.bsd
Path: sparky!uunet!spooky!witr
From: w...@rwwa.COM (Robert Withrow)
Subject: Re: Shared lib benchmarks, and experiences
Message-ID: <1992Dec10.150750.2106@rwwa.COM>
Sender: n...@rwwa.COM (News Administrator)
Nntp-Posting-Host: spooky
Reply-To: w...@rwwa.com
Organization: R.W. Withrow Associates
References: <1992Dec3.071056.27426@ntuix.ntu.ac.sg> <1992Dec9.233940.5174@klaava.Helsinki.FI>
Distribution: usa
Date: Thu, 10 Dec 1992 15:07:50 GMT
Lines: 45

In article <1992Dec9.233940.5...@klaava.Helsinki.FI>, torva...@klaava.Helsinki.FI (Linus Torvalds) writes:

[A number of things about the disk space savings of shared libraries that everyone agrees are true. It is true of SVR3 style shared libraries, it is true of SVR4 (SunOS) shared libraries, and it is just as true of linux shared libraries and the Joerg shared libraries. Only a few people deny the utility of shared libraries, and I often think they are wearing blinkers. I don't understand why the shared library debate has to be so acrimonious, since the issue is really limited to just a very few, but very important, technical aspects.]

| - the linux way of doing shared libraries is different from the Joerg
|   type, which seems to follow sysvr4 and SunOS. Under linux, shared
|   binaries are *never* bigger than their unshared counterparts, as
|   linux binaries don't have to contain any linking information at all.
|   It's handled by a jump-table in the shared library image itself. This
|   is one reason I prefer the linux way, although others seem to feel
|   it's less dynamic and thus worse. Linux shared binaries are probably
|   smaller than 386bsd's in all cases due to this.

Absolute minimum binary size is not the most important criterion for judging a shared library implementation. The *technical* areas of criticism of linux (and Joerg, and SVR3) shared libraries center around two factors:

1) Address space pollution: these shared libraries are *assigned* fixed addresses. This pollutes the address space of *all* processes and requires address-space configuration management to assign these addresses. This is significant in the real world. I frequently link against 20 or more libraries.

2) Versioning and mutability: Changing code in a shared library requires the relinking of all programs that use the shared library. Changing the size of a shared library will require the re-linking of code using *other* shared libraries.

An example of an important operation that is *impossible* using these shared library implementations but *is* possible using SVR4-SunOS implementations is to build Xaw3d and begin to use it without relinking *any* program on the system. I do this frequently. I consider this a crucial requirement for any shared library implementation.

--
Robert Withrow, Tel: +1 617 598 4480, Fax: +1 617 598 4430, Net: w...@rwwa.COM
R.W. Withrow Associates, 21 Railroad Ave, Swampscott MA 01907-1821 USA
Path: sparky!uunet!zaphod.mps.ohio-state.edu!uwm.edu!ogicse!news.u.washington.edu!serval!hlu
From: h...@eecs.wsu.edu (H.J. Lu)
Newsgroups: comp.unix.bsd
Subject: Re: Shared lib benchmarks, and experiences
Message-ID: <1992Dec10.200232.5557@serval.net.wsu.edu>
Date: 10 Dec 92 20:02:32 GMT
Article-I.D.: serval.1992Dec10.200232.5557
References: <1992Dec3.071056.27426@ntuix.ntu.ac.sg> <1992Dec9.233940.5174@klaava.Helsinki.FI> <1992Dec10.150750.2106@rwwa.COM>
Sender: n...@serval.net.wsu.edu (USENET News System)
Distribution: usa
Organization: School of EECS, Washington State University
Lines: 69

In article <1992Dec10.150750.2...@rwwa.COM>, w...@rwwa.COM (Robert Withrow) writes:
|> In article <1992Dec9.233940.5...@klaava.Helsinki.FI>,
|> torva...@klaava.Helsinki.FI (Linus Torvalds) writes:
|>
|> [A number of things about the disk space savings of shared libraries
|> that everyone agrees are true. It is true of SVR3 style shared libraries,
|> it is true of SVR4 (SunOS) shared libraries, and it is just as true of
|> linux shared libraries and the Joerg shared libraries. Only a few people
|> deny the utility of shared libraries, and I often think they are wearing
|> blinkers. I don't understand why the shared library debate has to be so
|> acrimonious, since the issue is really limited to just a very few, but
|> very important, technical aspects.]
|>
|> | - the linux way of doing shared libraries is different from the Joerg
|> |   type, which seems to follow sysvr4 and SunOS. Under linux, shared
|> |   binaries are *never* bigger than their unshared counterparts, as
|> |   linux binaries don't have to contain any linking information at all.
|> |   It's handled by a jump-table in the shared library image itself. This
|> |   is one reason I prefer the linux way, although others seem to feel
|> |   it's less dynamic and thus worse. Linux shared binaries are probably
|> |   smaller than 386bsd's in all cases due to this.
|>
|> Absolute minimum binary size is not the most important criterion for judging
|> a shared library implementation. The *technical* areas of criticism of
|> linux (and Joerg, and SVR3) shared libraries center around two factors:

I don't think the shared lib in Linux is the best in the technical aspects. But it serves its purpose. I am still interested in the other ways to implement it. I agree with Linus that PIC is too much for CPUs like the 386 with just a few registers.

|> 1) Address space pollution: these shared libraries are *assigned* fixed
|> addresses. This pollutes the address space of *all* processes and requires
|> address-space configuration management to assign these addresses. This
|> is significant in the real world. I frequently link against 20 or more
|> libraries.

So? If you don't want to share your binary, you can take almost any address you want. Otherwise, you have to be assigned an address for each library you are going to build. For most people, this is not a problem.

|> 2) Versioning and mutability: Changing code in a shared library requires
|> the relinking of all programs that use the shared library. Changing the
|> size of a shared library will require the re-linking of code using *other*
|> shared libraries.

FYI, we have been doing

cp xxxx.so.x.y /lib
cd /lib
ln -sf xxxx.so.x.y xxxx.so.x

for quite a while under linux.

|> An example of an important operation that is *impossible* using these shared
|> library implementations but *is* possible using SVR4-SunOS implementations
|> is to build Xaw3d and begin to use it without relinking *any* program on
|> the system.
|> I do this frequently. I consider this a crucial requirement
|> for any shared library implementation.

It is not impossible under Linux. But that belongs to another story.

H.J.
Path: sparky!uunet!math.fu-berlin.de!unidui!du9ds3!veit
From: v...@du9ds3.fb9dv.uni-duisburg.de (Holger Veit)
Newsgroups: comp.unix.bsd
Subject: Re: Shared lib benchmarks, and experiences
Date: 11 Dec 92 08:56:55 GMT
Organization: Uni-Duisburg FB9 Datenverarbeitung
Lines: 67
Distribution: usa
Message-ID: <veit.724064215@du9ds3>
References: <1992Dec3.071056.27426@ntuix.ntu.ac.sg> <1992Dec9.233940.5174@klaava.Helsinki.FI> <1992Dec10.150750.2106@rwwa.COM> <1992Dec10.200232.5557@serval.net.wsu.edu>
Reply-To: v...@du9ds3.fb9dv.uni-duisburg.de
NNTP-Posting-Host: du9ds3.fb9dv.uni-duisburg.de

In <1992Dec10.200232.5...@serval.net.wsu.edu> h...@eecs.wsu.edu (H.J. Lu) writes:

>In article <1992Dec10.150750.2...@rwwa.COM>, w...@rwwa.COM (Robert Withrow) writes:

[...some notes by Linus on Linux shared libs deleted...]

>|> 1) Address space pollution: these shared libraries are *assigned* fixed
>|> addresses. This pollutes the address space of *all* processes and requires
>|> address-space configuration management to assign these addresses. This
>|> is significant in the real world. I frequently link against 20 or more
>|> libraries.
>|>
>So? If you don't want to share your binary, you can take almost any address
>you want. Otherwise, you have to be assigned an address for each library you
>are going to build. For most people, this is not a problem.

You see this from the Linux hacker's aspect only. There are many people who cannot afford to recompile everything from scratch all the time, but want to have binaries. This is crucial in particular for X11. We already have different versions of X11 out (without shared libraries); what we certainly do not need are different versions of applications and libraries, such that every new program brings in its own special set of shared libs. You may distribute object files, to be linked on the local system, but there are even people out there who do not have the ~7MB of space for kernel sources and objects available; they run into problems with, for instance, an XServer link kit as well.

>|> 2) Versioning and mutability: Changing code in a shared library requires
>|> the relinking of all programs that use the shared library. Changing the
>|> size of a shared library will require the re-linking of code using *other*
>|> shared libraries.

>FYI, we have been doing
>
>cp xxxx.so.x.y /lib
>cd /lib
>ln -sf xxxx.so.x.y xxxx.so.x
>
>for quite a while under linux.

This means three patches in the libc, and you have three versions of it on the disk (for the really old, the middle old, and the new applications). If you play this game with the X libraries,... I thought you wanted to reduce your disk space requirements :-)

>|> An example of an important operation that is *impossible* using these shared
>|> library implementations but *is* possible using SVR4-SunOS implementations
>|> is to build Xaw3d and begin to use it without relinking *any* program on
>|> the system. I do this frequently. I consider this a crucial requirement
>|> for any shared library implementation.
>|>
>It is not impossible under Linux. But that belongs to another story.

It is not impossible under 386bsd either. But in contrast to Linux, we do not recommend hacking like hell for the casual user, and it won't be necessary in the near future.

>H.J.

H.V.

--
|  | /  Dr. Holger Veit          | INTERNET: v...@du9ds3.fb9dv.uni-duisburg.de
|__| /  University of Duisburg   |
| | | / Dept. of Electr. Eng.    | "Understand me correctly:
| |/    Inst. f. Dataprocessing  |  I'm NOT the WIZARD OF OS" (Holger)
Path: sparky!uunet!enterpoop.mit.edu!snorkelwacker.mit.edu!ai-lab!hal.gnu.ai.mit.edu!ericy
From: er...@hal.gnu.ai.mit.edu (Eric Youngdale)
Newsgroups: comp.unix.bsd
Subject: Re: Shared lib benchmarks, and experiences
Date: 12 Dec 1992 22:10:03 GMT
Organization: /etc/organization
Lines: 58
Distribution: usa
Message-ID: <1gdnvrINNp80@life.ai.mit.edu>
References: <1992Dec10.150750.2106@rwwa.COM> <1992Dec10.200232.5557@serval.net.wsu.edu> <veit.724064215@du9ds3>
NNTP-Posting-Host: hal.gnu.ai.mit.edu

In article <veit.724064215@du9ds3> v...@du9ds3.fb9dv.uni-duisburg.de writes:
>You see this from the Linux hacker's aspect only. There are many people
>who cannot afford to recompile everything from scratch all the time, but want
>to have binaries. This is crucial in particular for X11. We already have
>different versions of X11 out (without shared libraries); what we certainly
>do not need are different versions of applications and libraries, such that
>every new program brings in its own special set of shared libs. You may
>distribute object files, to be linked on the local system, but there are even
>people out there who do not have the ~7MB of space for kernel sources and
>objects available; they run into problems with, for instance, an XServer
>link kit as well.

Perhaps you do not understand. The way our libraries are made, you can just drop a new version into the /lib directory, and add a symlink, and you are ready to run the same binaries with the new sharable library. There is no need to relink, and you can delete the old version of the sharable library any time you wish.

>>FYI, we have been doing
>>
>>cp xxxx.so.x.y /lib
>>cd /lib
>>ln -sf xxxx.so.x.y xxxx.so.x
>>
>>for quite a while under linux.
>
>This means three patches in the libc, and you have three versions of it on
>the disk (for the really old, the middle old, and the new applications).
>If you play this game with the X libraries,...
>I thought you wanted to reduce your disk space requirements :-)

This was true for older versions, but H.J. was trying to say that we have come up with a way of making "plug compatible" libraries, so you can drop in a new library and delete the old version. We use jump tables to ensure that an entry point for each function is always located at the same address, and we have tools to ensure that each global data item also remains at a fixed address. We have recently overcome the obstacles that had prevented us from preparing the sharable X libraries in a similar way, and I expect that the next X release for linux will also be of a plug compatible variety. These will be made *without* modifications to the X library source code, btw. The worst case would be that you might have to patch each X Makefile, and it is unclear if that would even be needed.

>>It is not impossible under Linux. But that belongs to another story.
>
>It is not impossible under 386bsd either. But in contrast to Linux, we do
>not recommend hacking like hell for the casual user, and it won't be
>necessary in the near future.

Where did that come from? I don't think that anyone recommends a lot of hacking for the casual user. It is true that the first sharable libraries under linux were done so that everything needed to be relinked with a new version, but as you might imagine, that got old real fast. Even hackers don't get off on something quite so mundane. The plug compatible libraries were developed in response to this, and so far I think that most people have been quite satisfied with how it turned out.

-Eric
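[Editor's note: continuing the hypothetical sketch given after Linus's post, a "plug compatible" update of such a library keeps every existing slot (and every exported data item) at its old position and only consumes the reserved padding, which is why a binary linked against one version can run unchanged once the symlink points at the new image. The function and slot names below are again invented for illustration.]

/* Version 2 of the hypothetical library from the earlier sketch.
 * lib_add has new internals and lib_sub is brand new, but SLOT_ADD and
 * SLOT_MUL keep their old numbers, so existing callers are unaffected;
 * only the header advertising SLOT_SUB has to change. */
typedef long (*slot_fn)(long, long);

static long lib_add(long a, long b) { return b + a; }  /* reimplemented, same interface */
static long lib_mul(long a, long b) { return a * b; }
static long lib_sub(long a, long b) { return a - b; }  /* appended entry point */

#define SLOT_ADD 0   /* unchanged */
#define SLOT_MUL 1   /* unchanged */
#define SLOT_SUB 2   /* newly assigned, taken from the reserved padding */
#define SLOT_MAX 64

slot_fn jump_table[SLOT_MAX] = { lib_add, lib_mul, lib_sub };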
Newsgroups: comp.unix.bsd
Path: sparky!uunet!spooky!witr
From: w...@rwwa.COM (Robert Withrow)
Subject: Re: Shared lib benchmarks, and experiences
Message-ID: <1992Dec12.235116.7484@rwwa.COM>
Sender: n...@rwwa.COM (News Administrator)
Nntp-Posting-Host: spooky
Reply-To: w...@rwwa.com
Organization: R.W. Withrow Associates
References: <1992Dec10.150750.2106@rwwa.COM> <1992Dec10.200232.5557@serval.net.wsu.edu> <veit.724064215@du9ds3> <1gdnvrINNp80@life.ai.mit.edu>
Distribution: usa
Date: Sat, 12 Dec 1992 23:51:16 GMT
Lines: 27

In article <1gdnvrINN...@life.ai.mit.edu>, er...@hal.gnu.ai.mit.edu (Eric Youngdale) writes:

| Perhaps you do not understand. The way our libraries are made, you can
| just drop a new version into the /lib directory, and add a symlink, and you are
| ready to run the same binaries with the new sharable library. There is no need
| to relink[...]

According to private correspondence, I'm told you can even do this without the symlink by using environment variables. But there are severe restrictions:

 1) The two libraries must have identical ``assigned'' addresses, and
 2) The two libraries must be substantially identical.

By #2 I mean that if the second library is, say, built from a completely different set of object files, and a completely different set of ``internal'' routines, and is of a substantially different size, it won't work (even though the ``interface'' is the same). This is why I raised the example of Xaw3d -vs- Xaw. On SVR4 each process can elect to use one or the other of these libraries with the same binaries (say xterm) at the same time. I still don't think this is the case with linux, but correct me if I am wrong.

--
Robert Withrow, Tel: +1 617 598 4480, Fax: +1 617 598 4430, Net: w...@rwwa.COM
R.W. Withrow Associates, 21 Railroad Ave, Swampscott MA 01907-1821 USA
Path: sparky!uunet!enterpoop.mit.edu!snorkelwacker.mit.edu!ai-lab!hal.gnu.ai.mit.edu!ericy
From: er...@hal.gnu.ai.mit.edu (Eric Youngdale)
Newsgroups: comp.unix.bsd
Subject: Re: Shared lib benchmarks, and experiences
Date: 14 Dec 1992 17:02:37 GMT
Organization: /etc/organization
Lines: 80
Distribution: usa
Message-ID: <1giendINNgku@life.ai.mit.edu>
References: <veit.724064215@du9ds3> <1gdnvrINNp80@life.ai.mit.edu> <1992Dec12.235116.7484@rwwa.COM>
NNTP-Posting-Host: hal.gnu.ai.mit.edu

In article <1992Dec12.235116.7...@rwwa.COM> w...@rwwa.com writes:
>In article <1gdnvrINN...@life.ai.mit.edu>,
>er...@hal.gnu.ai.mit.edu (Eric Youngdale) writes:
>
>| Perhaps you do not understand. The way our libraries are made, you can
>| just drop a new version into the /lib directory, and add a symlink, and you are
>| ready to run the same binaries with the new sharable library. There is no need
>| to relink[...]
>
>According to private correspondence, I'm told you can even do this without
>the symlink by using environment variables. But there are severe restrictions:
>
> 1) The two libraries must have identical ``assigned'' addresses, and
> 2) The two libraries must be substantially identical.

The first point is correct. I should point out that there is no reason why we cannot have two different libraries assigned to the same address - you just will not be able to use both at the same time in the same process. The way that it works is that there is something very analogous to a global constructor in C++, which looks for certain "special" variables. These variables contain the name of the sharable library, the assigned address, and the version number. When crt0 sees these variables, the requested libraries are mapped into the address space of the process.

The second one depends upon how you define "substantially".

>By #2 I mean that if the second library is, say, built from a completely
>different set of object files, and a completely different set of ``internal''
>routines, and is of a substantially different size, it won't work (even though
>the ``interface'' is the same). This is why I raised the example of
>Xaw3d -vs- Xaw. On SVR4 each process can elect to use one or the other of
>these libraries with the same binaries (say xterm) at the same time. I
>still don't think this is the case with linux, but correct me if I am wrong.

The example that you provide of Xaw/Xaw3d is not one that I am terribly familiar with. As a rule of thumb, as long as you can provide identical assigned addresses, you can generate plug compatible libraries. The limitations have less to do with the design of the library itself, and more to do with the tools that we have available to ensure that the various addresses remain the same from one version to the next. The tricky bit in the past had been how you assign identical addresses to global data, and up until recently we had simply placed all global data in a separate file. This worked fine for libc, but did not work for the X libraries (so I am told) because it would have meant modifying the X source code heavily. Thus the libraries had to be nearly identical so that individual data items did not get moved around in memory, and what you were told in private correspondence is essentially correct for things as they stand now with the publicly available libraries that we have.
I recently came up with a way of "extracting" the global variable declarations from the main files and shunting them off to separate files, so we now have complete freedom to arrange the global data any way we wish (as long as two variables do not overlap). All of this is handled at the assembly code level, so there are no source code modifications to the library required. If we assume that Xaw and Xaw3d have a common interface for the Xaw part, then there is no reason in principle why one cannot have sharable libraries for the two cases that have an identical interface.

I want to add in passing that there were a lot of people who wanted some kind of dynamic linking instead of what we ended up with. Clearly there are pros and cons to either approach, and we had to balance these to come up with an answer. As I recall, the biggest drawbacks to dynamic linking were the need for a new assembler and linker, the need for more extensive kernel mods, larger binaries and more overhead to load a program. In its favor were the greater likelihood that old binaries would still run with a new library, and no need for assigning addresses ahead of time.

Finally, some people are wondering why we are discussing the linux shared libraries on the comp.unix.bsd list. The original question had to do with shared experiences, and some people naturally were wondering how we do this under linux. This is somewhat of a moving target, although most of this is not visible to the regular user. Since I have been close to some of the discussions relating to the development, I wanted to give my perspective and let you know where we currently are. You will ultimately decide one thing or another, and the only thing that I can guarantee is that some people will not be happy. In our experience, the malcontents will eventually forget about it.
Newsgroups: comp.unix.bsd
Path: sparky!uunet!spooky!witr
From: w...@rwwa.COM (Robert Withrow)
Subject: Re: Shared lib benchmarks, and experiences
Message-ID: <1992Dec14.231025.12627@rwwa.COM>
Sender: n...@rwwa.COM (News Administrator)
Nntp-Posting-Host: spooky
Reply-To: w...@rwwa.com
Organization: R.W. Withrow Associates
References: <veit.724064215@du9ds3> <1gdnvrINNp80@life.ai.mit.edu> <1992Dec12.235116.7484@rwwa.COM> <1giendINNgku@life.ai.mit.edu>
Distribution: usa
Date: Mon, 14 Dec 1992 23:10:25 GMT
Lines: 74

In article <1giendINN...@life.ai.mit.edu>, er...@hal.gnu.ai.mit.edu (Eric Youngdale) writes:

| Finally, some people are wondering why we are discussing the linux
| shared libraries on the comp.unix.bsd list. The original question had to do
| with shared experiences, and some people naturally were wondering how we do
| this under linux.

And I think that this discussion is good to have in C.U.B. so that the technical tradeoffs of shared libraries get a good airing...

| > 1) The two libraries must have identical ``assigned'' addresses, and
| > 2) The two libraries must be substantially identical.
|
| The first point is correct. I should point out that there is no reason
| why we cannot have two different libraries assigned to the same address - you
| just will not be able to use both at the same time in the same process.

The reason why I bring this up is that I suspect that it is difficult to assign ``compatible'' ``assigned'' addresses except in the case where the libraries are ``substantially identical''. For example, if the latter library has twice as many entrypoints as the former, this is likely to be a difficult problem and probably has no general solution.

| The second one depends upon how you define "substantially".
[...]
| As a rule of thumb, as long as you can provide identical
| assigned addresses, you can generate plug compatible libraries. The
| limitations have less to do with the design of the library itself, and more
| to do with the tools that we have available to ensure that the various
| addresses remain the same from one version to the next.

This is the caveat that worries me. How do you handle the following situations?

 1) The second library has (many) more entrypoints than the former library.

 2) The ordering of the entrypoints in the objects is different.

 3) There are changed inter- and intra-calling relationships between routines in this and other libraries.

 4) What about run-time library loading, as is done with resolvers on SVR4?

| As I recall, the biggest drawbacks to dynamic linking were
| the need for a new assembler and linker, the need for more extensive kernel
| mods, larger binaries and more overhead to load a program.

Let's handle these in turn.

1) Need for a new assembler and linker: If you mean that you need a compilation system that can generate PI code, then yes, you need these. Since the GCC system generates PI code, I don't see why this is a problem. If you mean that you have to extensively modify the compilation system in other ways, this is not correct. You can handle all the needed functions in the CRTL startup code. You may want to have the linker do other things for efficiency reasons, but it is not otherwise required.

2) Kernel mods: Dynamic shared libs can be done without kernel mods, depending on how code space is protected. Or you can use an mmap primitive to speed things up. Or you can add additional kernel code to make it all more efficient. Extensive kernel mods are not *required*.
3) Larger binaries: Not significantly, and perhaps not at all. It depends on the details. This should be weighed against the benefits.

4) More overhead to load a program: This also depends on the details. On my SVR4 system the additional time varies depending on whether the library has already been accessed by another process. For X programs, which access about a dozen shared libraries, the time seems to be swamped by other factors, such as widget creation. I don't notice it.

--
Robert Withrow, Tel: +1 617 598 4480, Fax: +1 617 598 4430, Net: w...@rwwa.COM
R.W. Withrow Associates, 21 Railroad Ave, Swampscott MA 01907-1821 USA
Newsgroups: comp.unix.bsd
Path: sparky!uunet!spool.mu.edu!darwin.sura.net!ra!tantalus.nrl.navy.mil!eric
From: e...@tantalus.nrl.navy.mil (Eric Youngdale)
Subject: Re: Shared lib benchmarks, and experiences
Message-ID: <BzAEnE.GKq@ra.nrl.navy.mil>
Sender: use...@ra.nrl.navy.mil
Organization: Naval Research Laboratory
References: <1992Dec12.235116.7484@rwwa.COM> <1giendINNgku@life.ai.mit.edu> <1992Dec14.231025.12627@rwwa.COM>
Distribution: usa
Date: Tue, 15 Dec 1992 06:14:01 GMT
Lines: 207

In article <1992Dec14.231025.12...@rwwa.COM> w...@rwwa.com writes:
>| > 1) The two libraries must have identical ``assigned'' addresses, and
>| > 2) The two libraries must be substantially identical.
>|
>| The first point is correct. I should point out that there is no reason
>| why we cannot have two different libraries assigned to the same address - you
>| just will not be able to use both at the same time in the same process.
>
>The reason why I bring this up is that I suspect that it is difficult to
>assign ``compatible'' ``assigned'' addresses except in the case where the
>libraries are ``substantially identical''. For example, if the latter library
>has twice as many entrypoints as the former, this is likely to be a difficult
>problem and probably has no general solution.

Offhand, I do not see why the number of entry points represents a problem. The general structure of a sharable library under linux is that we start with the jump table itself. This is usually padded to some large size to accommodate future expansion. Directly after this comes the global data, and everything is fixed at specific addresses. After this come the regular text and data sections. As long as you do not overfill the section of memory that was set aside for jump vectors, there will not be a problem. The first time you build a sharable library, you select how much memory you want for the jump vectors - a wise maintainer will always allow a lot of room for future expansion if there is *any* possibility that it will be needed. There is no reason a priori that we even need to use up the jump vector slots in any particular order. We could use them randomly if we wanted to, although it would serve no useful purpose to do so.

>| The second one depends upon how you define "substantially".
>[...]
>| As a rule of thumb, as long as you can provide identical
>| assigned addresses, you can generate plug compatible libraries. The
>| limitations have less to do with the design of the library itself, and more
>| to do with the tools that we have available to ensure that the various
>| addresses remain the same from one version to the next.
>
>This is the caveat that worries me. How do you handle the following
>situations?
>
> 1) The second library has (many) more entrypoints than the former library.

This I have already discussed above.

> 2) The ordering of the entrypoints in the objects is different.

We have complete freedom to select whatever ordering of entry points we wish when we first build the library. If there are two libraries that are supposed to share the same interface, then you simply have to provide identical lists of functions to the program that generates the jump vector module.

> 3) There are changed inter- and intra-calling relationships between
> routines in this and other libraries.

An example of this might be our X libraries.
Naturally the X libraries require various functions in libc, and since the X libraries are linked to the sharable jump-table version of libc, we can simply replace libc if there is a new version, and the sharable X libraries will automatically use the new version. Inter-calling (calls within the library) is all resolved by the linker without having to even look at the jump table. If the inter-calling relationships change, it will not be a problem, as long as the external interface remains fixed (i.e. the jump vectors and global data all remain at the same addresses).

There are some things that could break a sharable library, such as a change in the number of arguments for a given function. In the past we have treated this in the following fashion: we leave the older N-argument function in the library with its jump vector in the jump-table, but we fix things so that anytime we link to the library, the linker will only see the new N+1-argument version of the function. The N+1 version of the function has its own distinct slot in the jump table, so there is never confusion about which function we are talking about. Naturally, the header file changes at the same time we change the library. The advantage of doing this is that we can allow a gradual changeover to the new way of doing things without suddenly breaking a lot of different programs all at one time. After a suitable period of time, and perhaps after some warnings have been posted, the old version of the function will be deleted and the jump slot changed to point to a routine that simply tells you that you must recompile. We did something similar when we went from 16 bit inode numbers to 32 bit inode numbers in the stat structure (yet another minixism that bit the dust). We do this kind of stuff to avoid breaking people's binaries, but it is a bit of a nuisance to do this kind of thing. The more mature the library is to begin with, the better the chance that you will never have to even worry about this sort of thing. I am not sure at all how easy or clean it would be to try and treat this sort of situation with dynamic linking.

> 4) What about run-time library loading, as is done with resolvers
> on SVR4?

I do not know how SVR4 does it (even though I use it at work). The way it is handled under linux is that there is a special data element in the binary which contains the following bits of info:

 1) The full path of the library to be loaded.
 2) An ascii string which is more descriptive than the pathname.
 3) The version number of the sharable library.
 4) The virtual address at which it should be loaded.

This is spotted by crt0 (in this respect, it is similar to a global constructor under C++), and it basically does some checking (i.e. it makes sure that the version number of the library linked against is consistent with the version number of the library found at the pathname, and that the virtual address at which we are requesting the library be loaded is the same as the address the library itself wants to be loaded at). I am not sure, but I think that it simply amounts to some kind of mmap, and the pages are demand loaded as required. If you wanted to know for sure, you would have to ask Linus about this.

>| As I recall, the biggest drawbacks to dynamic linking were
>| the need for a new assembler and linker, the need for more extensive kernel
>| mods, larger binaries and more overhead to load a program.
>
>Let's handle these in turn.
>
>1) Need for a new assembler and linker: If you mean that you need a compilation
>system that can generate PI code, then yes, you need these. Since the GCC
>system generates PI code, I don't see why this is a problem.

The compiler is not the problem. The assembler, gas, does not understand (yet) the special syntax that GCC generates when writing PI code. Out of curiosity I tried compiling something with PIC, and I got gobs of assembler errors. As I recall, this was probably the most formidable stumbling block, although in retrospect we probably could have solved the problem by running some kind of postprocessor on the assembly code. We are also using the GNU ld, and depending upon how you do the implementation, changes may have to be made here as well.

There was another objection that has been raised in the past by various people, and that is that in the 3/486 architecture there are relatively few machine registers compared to something like a VAX. The PI code that I have seen GCC generate always seems to use up one register as a reference pointer of some kind or another, and when you reserve this register (usually ebx) for this purpose, it is not available to the compiler for other uses, and this could lead to poorer performance. I have not seen any numbers to back this up, but the objection has been raised.

>If you mean that you have to extensively modify the compilation system
>in other ways, this is not correct. You can handle all the needed functions
>in the CRTL startup code. You may want to have the linker do other things
>for efficiency reasons, but it is not otherwise required.

Ah, yes, but we probably would want to have the linker do other things for efficiency reasons - if you were to compare a quick and dirty dynamic linking implementation to the linux style fixed-address libraries, the fixed-address libraries would come out looking quite good indeed. In a proper implementation of dynamic linking we would probably want the list of external symbols arranged in such a way that they take up a minimum amount of space in each binary and in such a way that the externals are easy to resolve quickly. If efficiency were no concern, we could probably just use the output from "ld -r" and build a mini-linker into crt0 to finish the job.

>2) Kernel mods: Dynamic shared libs can be done without kernel mods,
>depending on how code space is protected. Or you can use an mmap primitive
>to speed things up. Or you can add additional kernel code to make it
>all more efficient. Extensive kernel mods are not *required*.

I had of course forgotten that the linking could be done by crt0. Nonetheless, there is some programming involved, either in the kernel or in crt0, before you can start to use dynamic linking.

>3) Larger binaries: Not significantly, and perhaps not at all. It depends
>on the details. This should be weighed against the benefits.

I doubt that there would be any binaries that would be no larger with dynamic linking, but I have no doubt that you could achieve something where the additional space was not very much.

>4) More overhead to load a program: This also depends on the details.
>On my SVR4 system the additional time varies depending on whether the library
>has already been accessed by another process. For X programs, which access
>about a dozen shared libraries, the time seems to be swamped by other factors,
>such as widget creation. I don't notice it.

Again, I don't know the grungy details of how things work under SVR4.
I use it at work, and it seems fast enough to me, so it is obviously possible to do dynamic linking in a workable way. The question always boils down to the tradeoffs involved, and to what tools need to be developed in order to implement one scheme or the other. The biggest technical obstacle at the time for us was probably the assembler, although I think that we probably would have wanted to muck with the linker as well for efficiency. There were some people who felt that we should try and use the off-the-shelf as and ld from the FSF instead of trying to maintain our own variant version. There was a fairly long debate about the whole thing, and in the end we realized that it would not be that tough to implement the fixed-address type of libraries.

Compatibility from one version to the next has always been the hard part about this type of implementation, and this is where we have been spending most of our effort to refine the process. In contrast, with dynamic linking I would imagine that most of the refinement would be in making it efficient, since version-to-version compatibility is relatively easy to provide once you have a basic operating principle that is functional.

Anyway, we have been refining the concept for about 6 months, and we now have it to a point where the drawbacks are quite minimal. Given the proper tools it is not that tough to actually build a sharable jump-table type of library, although it may be true that it is a little easier to generate a dynamic linking type of library instead (this depends a lot on the implementation as well). If we had decided to go with dynamic linking in one way or another, we would probably have needed to spend more time upfront before we would have gotten anything out the door.

-Eric
--
Eric Youngdale
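[Editor's note: Eric's description earlier in this post of how a linux binary names each shared library (full path, descriptive string, version, assigned load address) and of crt0 mapping the image in might be sketched roughly as below. The structure layout, field names, the library path and the load address are all invented for illustration; the real crt0 embeds this information differently and does more checking than shown here.]

/* crt0_map_sketch.c -- a rough, hypothetical illustration of the
 * startup-time mapping step described above, not the actual linux crt0. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

struct shlib_desc {
	const char *path;       /* 1) full path of the library image       */
	const char *name;       /* 2) human-readable description           */
	int         version;    /* 3) major version the binary expects     */
	void       *load_addr;  /* 4) fixed virtual address to map it at   */
};

/* Map one library image at its assigned address; the kernel then
 * demand-loads the pages as they are touched. */
static void map_shared_lib(const struct shlib_desc *d)
{
	int fd;
	off_t size;
	void *p;

	fd = open(d->path, O_RDONLY);
	if (fd < 0) {
		fprintf(stderr, "crt0: cannot open %s\n", d->path);
		exit(1);
	}
	size = lseek(fd, 0, SEEK_END);
	p = mmap(d->load_addr, (size_t)size, PROT_READ | PROT_EXEC,
		 MAP_PRIVATE | MAP_FIXED, fd, 0);
	if (p == MAP_FAILED) {
		fprintf(stderr, "crt0: cannot map %s at %p\n",
			d->name, d->load_addr);
		exit(1);
	}
	close(fd);
	/* A real implementation would also check d->version against the
	 * version stamped inside the image before letting the program run. */
}

int main(void)
{
	/* A hypothetical descriptor, as a binary might carry for its libc;
	 * both the path and the address are made up for this sketch. */
	static struct shlib_desc libc_desc = {
		"/lib/libc.so.4", "standard C library", 4, (void *)0x60000000
	};

	map_shared_lib(&libc_desc);
	return 0;
}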
Newsgroups: comp.unix.bsd
Path: sparky!uunet!spooky!witr
From: w...@rwwa.COM (Robert Withrow)
Subject: Re: Shared lib benchmarks, and experiences
Message-ID: <1992Dec15.141455.14369@rwwa.COM>
Sender: n...@rwwa.COM (News Administrator)
Nntp-Posting-Host: spooky
Reply-To: w...@rwwa.com
Organization: R.W. Withrow Associates
References: <1992Dec12.235116.7484@rwwa.COM> <1giendINNgku@life.ai.mit.edu> <1992Dec14.231025.12627@rwwa.COM> <BzAEnE.GKq@ra.nrl.navy.mil>
Distribution: usa
Date: Tue, 15 Dec 1992 14:14:55 GMT
Lines: 70

In article <BzAEnE....@ra.nrl.navy.mil>, e...@tantalus.nrl.navy.mil (Eric Youngdale) writes:

| In article <1992Dec14.231025.12...@rwwa.COM> w...@rwwa.com writes:
| >The reason why I bring this up is that I suspect that it is difficult to
| >assign ``compatible'' ``assigned'' addresses except in the case where the
| >libraries are ``substantially identical''. For example, if the latter library
| >has twice as many entrypoints as the former, this is likely to be a difficult
| >problem and probably has no general solution.

| The first time
| you build a sharable library, you select how much memory you want for the jump
| vectors[...]

So as long as you reserve enough extra space and you cause (using tools) the jump table to be ordered appropriately, your libraries can be made forward compatible. The following restrictions still apply:

 1) You must reserve enough space, which means wasting address space, and, depending on how dynamically the library changes, you may eventually ``run out'' of space unless you are very liberal in what you reserve. Neither of these problems is very major, assuming some care and forethought in the creation of the library.

 2) You still ``pollute'' the address space of all processes, due to the assigned addresses. I personally think this is a serious drawback because this problem only grows worse with time, and cannot ever be reduced without creating incompatibility. And it requires a central authority...

As a note, the jump table method improves load-time performance at the expense of per-call run-time overhead. Most dynamically loaded library implementations improve run-time performance at the expense of load-time overhead by using other methods. I somewhat prefer the latter because most processes I use (interactively) are long lived.

| Inter-calling (calls within the library) is all resolved by the
| linker without having to even look at the jump table.

Which, BTW, is a restriction. A statically loaded replacement for such a routine will not be called by the library code. Typical examples are malloc() routines...

| There was another objection that has been raised in the past by various
| people, and that is that in the 3/486 architecture there are relatively few
| machine registers compared to something like a VAX. The PI code that I have
| seen GCC generate always seems to use up one register as a reference pointer of
| some kind or another, and when you reserve this register (usually ebx) for this
| purpose, it is not available to the compiler for other uses, and this could
| lead to poorer performance.

Given GCC's code generation strategy, this is likely to cause more frequent reloading. Smart optimization can reduce this, but there is likely to be a micro-level performance hit using PIC. *Macro* level performance is not affected to the degree that code examination would tend to indicate, due to program dynamics, and other savings that PIC can enable.
This gets into nitty-gritty details, but suffice it to say that in real systems PIC is roughly performance neutral...

| Anyway, we have been refining the concept for about 6 months, and we
| now have it to a point where the drawbacks are quite minimal.

I agree. I still think that the benefits of DSLs make them worth the effort though. I remember my sigh of relief when I went from SVR3 to SVR4 shared libraries...

--
Robert Withrow, Tel: +1 617 598 4480, Fax: +1 617 598 4430, Net: w...@rwwa.COM
R.W. Withrow Associates, 21 Railroad Ave, Swampscott MA 01907-1821 USA
Newsgroups: comp.unix.bsd
Path: sparky!uunet!zaphod.mps.ohio-state.edu!darwin.sura.net!ra!tantalus.nrl.navy.mil!eric
From: e...@tantalus.nrl.navy.mil (Eric Youngdale)
Subject: Re: Shared lib benchmarks, and experiences
Message-ID: <BzBG6q.21J@ra.nrl.navy.mil>
Sender: use...@ra.nrl.navy.mil
Organization: Naval Research Laboratory
References: <1992Dec14.231025.12627@rwwa.COM> <BzAEnE.GKq@ra.nrl.navy.mil> <1992Dec15.141455.14369@rwwa.COM>
Distribution: usa
Date: Tue, 15 Dec 1992 19:44:49 GMT
Lines: 78

In article <1992Dec15.141455.14...@rwwa.COM> w...@rwwa.com writes:
>So as long as you reserve enough extra space and you cause (using tools) the
>jump table to be ordered appropriately, your libraries can be made forward
>compatible. The following restrictions still apply:
>
> 1) You must reserve enough space, which means wasting address space,
> and, depending on how dynamically the library changes, you may eventually
> ``run out'' of space unless you are very liberal in what you reserve.
> Neither of these problems is very major, assuming some care and
> forethought in the creation of the library.
>
> 2) You still ``pollute'' the address space of all processes, due to the
> assigned addresses. I personally think this is a serious drawback
> because this problem only grows worse with time, and cannot ever
> be reduced without creating incompatibility. And it requires a central
> authority...

It is a drawback, but I feel that it is a relatively minor one. Under linux we have reserved 1.5Gb for sharable libraries, and at most we are using about 0.25% of that right now. If you assume that, because of padding and allowances for future expansion, the maximum usable fraction of this would be no less than 25%, then this would mean that we could use at least 375Mb of sharable libraries. By today's standards this seems ridiculous, but if we are looking 5-10 years down the road it might not seem so silly. Still, this type of usage would be extremely taxing on the 3/486 architecture as we know it today, and I suspect that any application that needs this much VM would probably be best run on a 64 bit machine.

You are also correct about the central authority. This is all behind the scenes for most users, but someone does need to coordinate. Theoretically we could end up with a situation where one library grew to the point where it encroached upon another library's VM, and this could be dealt with by dividing the library into two separate but interdependent sharable images. Clearly, given enough time and enough expansion, things could become a real mess.

As a corollary to this, does anyone have a sense of how the dynamic linking time grows as the library grows? For example, would it go as N, N^2 (where N is the number of symbols), or something in between? If you assume that function sizes remain roughly constant (so that human programmers can easily manage them), then conceivably the number of entry points would be proportional to N, and the number of externals that need to be resolved could also be proportional to N. We could also assume that machines will be M times faster, and this would mean that the dynamic link time could go as N*N/M. My point is that the time to dynamically link to 100Mb worth of sharable libraries could conceivably grow to unacceptable levels. Right now all we can do is speculate and extrapolate.

>| Inter-calling (calls within the library) is all resolved by the
>| linker without having to even look at the jump table.
>
>Which, BTW, is a restriction.
>A statically loaded replacement for such
>a routine will not be called by the library code. Typical examples
>are malloc() routines...

Yes, we do run up against this. Emacs presents exactly this problem with malloc. I know that it can be solved, because we all (who use emacs, anyway) use a version that is linked to a sharable library, but this is clearly something that does need to be considered in certain cases.

>| Anyway, we have been refining the concept for about 6 months, and we
>| now have it to a point where the drawbacks are quite minimal.
>
>I agree. I still think that the benefits of DSLs make them worth the
>effort though. I remember my sigh of relief when I went from SVR3 to
>SVR4 shared libraries...

There are pluses and minuses to both approaches. As long as the details are handled properly, I think that either approach will work just fine as far as the end user is concerned. Now that I think about it, I recall that in our case the people who wanted the DSLs were unwilling/unable to write the code to make it work. The people who wanted the fixed-address libraries were willing to do the programming, and this tipped the balance in favor of the fixed-address libraries. Given that we both inhabit similar types of "democracies", something like this could well be the deciding factor for 386bsd as well :-).

--
Eric Youngdale