What makes a computer slow
I've become increasingly frustrated with the speed of computers lately, or rather, lack thereof. After thinking about it, I came up with three reasons why I think computers have gotten slow. It's a bit long. Should I call this a rant? Oh well, let's just start with the first one:
#3: Multithreading
Multithreading, overall, has been good for computer usability. I'm glad that I no longer have to choose to run either my editor or my program, or shell out from my editor and remember that I've done so (ah, BRIEF). While multithreading on the CPU is good, though, it isn't so good for the hard disk. In fact, it makes hard disk performance suck, because seeks are expensive and even a moderate amount of seeking causes throughput to plummet. The problem is that when you add multithreading to the mix, you have multiple threads submitting disk requests, which then causes the disk to thrash all over the place. This is especially fun when you boot your computer and all of the garbage that every program has placed in the startup folder decides to load at the same time. It's gotten worse with faster CPUs, because now programmers think they can speed things up by doing tasks in the "background," only to increase the amount of contention on the one poor disk, which ends up lowering overall throughput. (*cough*Intellisense*cough*)
My fondest memory of the effect of multithreading on disk performance was from my Amiga days, where attempting to read two files at the same time would cause the floppy disk drive to seek for every single sector read. As you can imagine, that was very loud and very slow. (Gronk, gronk, gronk, gronk, gronk....) Newer operating systems do buffering and read-ahead, but you can still see some pretty huge performance drops in VirtualDub if you are doing a high bandwidth disk operation and have something else hitting the disk at the same time. Supposedly this might be better in Vista, as the Vista I/O manager doesn't need to split I/Os into 64K chunks, but I haven't tested that.
#2: Virtual memory (paging)
Ah, virtual memory. Let's use disk space to emulate extra memory... oh wait, the disk is at least three orders of magnitude slower. Ouch.
Contrary to popular opinion, virtual memory can help performance. In a good operating system, any unused memory is used for disk cache, which means that enabling virtual memory can improve performance by allowing unused data to be swapped out and letting the disk cache grow. In the days of Windows 95 there was a big difference between having a 500K cache and a 2MB cache... or no cache at all. Nowadays, it feels like this is very counterproductive, because Windows will frequently swap out applications to grow the disk cache from 800MB to 1GB, which barely increases cache hit rate at all, and just incurs huge delays when application data has to be swapped back in. It's especially silly to have every single application on your system incrementally swapped out just because you did a dir /s/b/a-d c:.
In the old Win9x days, you could set a system.ini parameter to clamp the size of the disk cache to fix this problem; with Windows NT platforms, it seems you have to pay $30 for some dorky program that calls undocumented APIs periodically. I wouldn't laugh, Linux folks -- I've heard that your VM is scarcely better than Windows, and the VMs of both OSes are trounced by the one in FreeBSD. Of course, I can count the number of people I know that run FreeBSD on one hand.
For better or worse, virtual memory has made out-of-memory problems almost a non-issue. Sure, you can still hit it if you exhaust address space or get absolutely outrageous with your memory requests, and the system will start running reaaaalllllyyyy slooooowwwww when you overcommit by a lot, but usually, the user has ample warning as the system gradually slows down. Virtual memory has also let software vendors be a bit loose with their system requirements -- you're a moron if you think a program will run well with 256MB when that's what it says on the box -- but at least the program will work, somewhat, until you can get more RAM.
That isn't the worst problem, though. Care to guess what my #1 is?
#1: Garbage collection
Time for my real pet peeve, garbage collection -- which has been introduced to Windows applications in spades by the .NET Framework.
What is garbage collection? It's an implicit memory management mechanism, versus the usual explicit mode. That is, instead of a program directly saying that it is done with memory, the system detects that memory is unused when the program has no more references to it. The primary advantages of using GC are that the program doesn't have to manually manage memory, which involves both CPU and programmer overhead, and true memory leaks are impossible. Memory heap related problems, such as leaks, double frees, etc. are responsible for a lot of program failures and can be a huge headache when programming, thus the interest in avoiding the problem entirely.
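The reachability idea above can be made concrete. Below is a toy mark-and-sweep pass in Java -- purely an illustration, not how any real collector is implemented, and all names (`ToyGC`, `Obj`, `sweep`) are invented. Objects that the roots can't reach are exactly what gets reclaimed, without the program ever saying it is done with them:

```java
import java.util.*;

// Toy mark-and-sweep collector over an explicit object graph.
// "Obj" nodes reference other nodes; anything unreachable from the
// roots is garbage. All names here are invented for illustration.
class ToyGC {
    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Mark phase: walk everything reachable from a root.
    static void mark(Obj o) {
        if (o.marked) return;
        o.marked = true;
        for (Obj r : o.refs) mark(r);
    }

    // Sweep phase: anything left unmarked is unreachable, i.e. garbage.
    static List<String> sweep(List<Obj> heap, List<Obj> roots) {
        for (Obj r : roots) mark(r);
        List<String> freed = new ArrayList<>();
        for (Obj o : heap)
            if (!o.marked) freed.add(o.name);
        return freed;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        a.refs.add(b);  // a -> b is reachable from the root
        // c is allocated but never referenced from a root
        System.out.println(sweep(Arrays.asList(a, b, c), Arrays.asList(a)));
    }
}
```

Note that the sweep walks the entire heap list; that "touch everything" property is what matters later in this post.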
The classic argument against GC is that it can eat a lot of CPU time doing its sweeps, and it does so disruptively in spurts. Sure enough, in older programs with GC, you could see them noticeably halt every once in a while as the GC did its work. Yet, even though I'm a diehard I-love-performance, native-code-all-the-way kind of guy, I think this problem is mostly a non-issue at this point. For one thing, it's been a while since I've seen a program with pauses due to periodic GC CPU usage being a problem, and it seems that in modern Java and C# environments it's easy to get decent GC performance as long as you aren't unnaturally stupid with your allocation behavior. Heck, most old BASIC interpreters used it, and if garbage collection worked quickly enough for a 1MHz 6502 with 64K of RAM, it should work just fine on modern systems. Sure, you wouldn't want to use garbage collection in a program with real-time requirements, but you're better off avoiding any dynamic memory allocation in that case.
No, the two problems are memory usage and locality.
Programs that use GC as their main memory allocation strategy consume more memory. Yeah, a good compacting heap can get better packing than a malloc heap, but GC-based programs generally allocate more memory than non-GC ones so that the GC doesn't have to run all the time, and in general I see GC programs take a lot more memory than non-GC ones. This is part of the reason I hate the new trend of writing video card control programs in the .NET Framework. I swore up and down that I would never buy another video card from a particular vendor once I saw their awful new .NET-based control panel that takes more than 50MB of memory, and of course the other major IHV went and did the same thing so I'm now screwed both ways. Repeat with dozens of other applications that have gone .NET, and it starts to add up very quickly.
You might say: why does it matter, since modern computers have so much memory? The memory would go unused otherwise, so you want apps to use it all up, right? Absolutely not. First, having a dozen programs that all claim extra memory adds up really fast; second, as I've said above, the disk cache would use that memory otherwise. (In fact, in a way, the disk cache itself uses garbage collection.) If five applications each eat up an extra 100MB, that's half a gig that isn't available for disk caching. When you're working with big datasets, like editing a big sound file or linking a program in Visual Studio, that smaller cache can become noticeable.
The real problem, though, is access locality. I'm not talking about something like low-level cache behavior here -- I don't think explicit memory management has an advantage here, and GC or not, that's best taken care of by preallocating your data in clumps and avoiding allocation in your inner loop. I'm talking about the well-known problem that GCs don't interact well with paging because a full garbage collection pass requires access to the entire memory heap. It's this, I claim, that is currently ushering in the new Era of Slow(tm). Let's say you have a control panel icon in the corner of the taskbar, say, with a red icon that has three letters on it... say, an A, a T, and let's throw in an i for good measure. You haven't used it in a while and it doesn't do anything while idle, so the OS has happily paged most of it out to disk. Now it becomes active, and the garbage collector decides to do a gen.2 garbage collection sweep... which causes every single page in that application's heap to swap back in. The result is an agonizingly long wait as the hard disk goes nuts bringing thousands and thousands of 4K pages back in one at a time. Even better, if the reason your system started swapping is because you're already low on memory, this impressive amount of swap traffic causes even more swapping in other programs, grinding everything to a complete halt for as long as several minutes. Argh!
Let's be honest: garbage collection is here to stay. It's quite powerful for certain data structures, most notably string heaps, and it has undeniable benefits in other areas such as sandboxed execution environments and concurrent programming. What I think aggravates the problems, though, are languages and programming environments that insist on putting everything in the GC heap. It is possible to do garbage collection in C++ even without runtime help, and it's extremely valuable for caches. A compacting garbage collector like the one that the .NET Framework uses, though, has the irritating behavior that it wants to see everything so it can track and relocate pointers. The result is that it's so painful to deal with external memory that you end up with lameness like keeping 1MB images in GCed arrays. Add to this languages like C# that have very poor support for explicit memory management (doing manual reference counting without C++ style smart pointers sucks), and you get tons of programs that have much poorer memory behavior than they should. It seems that Java fares a little better -- in particular, Azureus felt much snappier than I had expected for a Java app -- but I'm sure there are counterexamples.
The usual argument for GC is that it relieves the programmer of the burden of managing memory. That really isn't true, and I think the current trend of relying on the GC to handle everything costs a bit too much performance-wise. Even with garbage collection, it's still a good idea to carefully manage object lifetimes and minimize allocations. I try to allocate as little memory as possible in VirtualDub's inner render loop, and I would do the same even if I switched to garbage collection -- not that I have any intention of doing so, ever. It's OK to give up some performance for rapid development and robustness, but I frequently see new applications go well over 100ms and into the seconds range for non-interactivity, and I believe that garbage collection is largely responsible for it.
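The "minimize allocations in the inner loop" advice above looks like this in practice. This is a hypothetical Java sketch (the workload is made up; the pattern is what matters) contrasting a loop that allocates a scratch buffer every iteration with one that allocates it once and reuses it:

```java
// Minimizing allocation in an inner loop: reuse one preallocated buffer
// instead of allocating per iteration. The "frame processing" here is a
// stand-in; only the allocation pattern is the point.
class RenderLoop {
    // Allocates a fresh scratch array every frame: steady GC pressure.
    static long sumAllocating(int frames, int size) {
        long total = 0;
        for (int f = 0; f < frames; f++) {
            int[] scratch = new int[size];        // new garbage each frame
            for (int i = 0; i < size; i++) scratch[i] = f + i;
            for (int v : scratch) total += v;
        }
        return total;
    }

    // Same work, one allocation up front: nothing for the GC to chase.
    static long sumReusing(int frames, int size) {
        long total = 0;
        int[] scratch = new int[size];            // allocated once, reused
        for (int f = 0; f < frames; f++) {
            for (int i = 0; i < size; i++) scratch[i] = f + i;
            for (int v : scratch) total += v;
        }
        return total;
    }
}
```

Both produce identical results; the second simply gives the collector nothing to do while the loop runs.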
The worst thing is that some C and C++ programming books say to avoid the use of free(), because you have a huge amount of memory that you'll never use up and, if you're lucky enough, the OS will provide a sort of GC for you.
Kwisatz Haderach - 23 10 07 - 03:35
I've been avoiding the red-coloured manufacturer's graphics cards precisely because somebody had a brilliant idea to write the managing software in .NET. Which is the other major IHV that did the same?
And, speaking of .NET, nothing is better than double-clicking a 32k exe file just to be greeted by 30 seconds of disk thrashing while the environment loads...
ender - 23 10 07 - 10:17
Time to write your own OS. :) I dream of the day when I will boot into a VirtualDub partition...
Or at least Explorer replacement. Or LiveCD :)
GrofLuigi - 23 10 07 - 10:55
I so agree with this. Surely the point of having faster computers is to get tedious stuff done quicker? What you said about virtual memory makes sense, but Windows seems to get very upset if you turn it off. Do you know a safe way to do this? (Obviously not safe if you use all your memory, but I have 2 gigs so I should be ok...)
Chris Ovenden (link) - 23 10 07 - 12:55
The problems you have with GC are mostly limited to collectors that use sweeps. Systems with GC implemented as reference counting plus cycle detection (like CPython) don't usually have the growing memory usage and swapping problems that the other runtime named after coffee tends to demonstrate.
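The reference-counting scheme described in this comment can be sketched minimally -- a toy Java illustration of CPython-style counting, without the cycle detector, where the class name and `freed` flag are invented for the example. The key property is that reclamation happens deterministically, the instant the count hits zero:

```java
// Toy reference counting: each incref/decref pair tracks one reference,
// and the object is "freed" immediately when the count reaches zero.
// (Cycles would never hit zero -- that's why CPython adds a cycle detector.)
class RefCounted {
    private int count = 1;   // the creator holds the first reference
    boolean freed = false;

    void incref() { count++; }

    void decref() {
        if (--count == 0) {
            freed = true;    // deterministic, immediate reclamation
        }
    }
}
```

Compare with a tracing collector, where nothing is reclaimed until a sweep happens to run.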
theeth - 23 10 07 - 13:45
I also disable VM in windows. I have a 2GB machine, and never ever run out of memory. Even when creating and compiling a HL2DM bsp map file that used about 700mb at full compile.
MarC - 23 10 07 - 13:53
may be of interest - a gc implementation which interacts with the page cache to minimise the problem you indicate.
Pete Kirkham (link) - 23 10 07 - 14:03
I hate applications being swapped out for a bit of cache. I try to run with no swap at all, but my 1GB doesn't cut it for that--once I go to 3GB I'll do it again. I'd like to just run 8GB, but I don't know about running 2003 Server for that. I wish other versions of Windows would at least support >4GB memory for disk cache, if not full PAE support.
I have separate data and system drives. My apps and Windows are on a 70G Raptor--small, but plenty for applications and no games--and my games and large media are on a second, larger, slower drive. Most data-intensive stuff like VD's happens on that second drive, which is rarely multitasked, so it doesn't fight with apps.
I can't agree with placing GC over swapping. Things like poor locality cause a general slowdown akin to having a slower CPU, and I find it far more disruptive and annoying when I tab to VC and it spends five seconds redrawing parts of the screen as various parts of the IDE are swapped back in, and then swaps a bit more when I open a menu or scroll down as other parts come back. If something in the background is accessing the disk, #3 kicks in too, and it's all an order of magnitude worse.
(I do have my swap on my app drive and not my data drive--I don't remember why. Probably because that drive is faster, but I should move it off of there, since there's less contention for the other drive.)
Glenn Maynard - 23 10 07 - 17:30
Multitasking (shouldn't it be this instead of "multithreading"?) has been around so long that this is almost like rehashing debates about MultiFinder. What can we do about it?
On that note, virtual memory hasn't really bit me hard since we entered the 64-bit era. If I need the RAM, I go with an x64 in the first place. So many times in the bad old 32-bit days of IRIX, I went to Fry's while I waited for my machine to recover from a swapfest. Never again.
As for your number 1, I'll take a poorly written C# app that has performance problems over a poorly written C++ app that core dumps all over the place without reason.
I contend that anti-virus and corporate "policy orchestrator" software has far more of an adverse effect on performance than any of the other three you mention. I have a quad core monster at work that seems slower than my Mac Mini at home -- running Vista (!) -- because of all the McAfee crap the IT department puts on there to monitor me.
You know... MacOS X uses a non-GCed sorta dynamic language (Objective-C). Maybe it's time for you to switch?
Trimbo (link) - 24 10 07 - 01:48
i'm very curious to hear the author's responses to everyone's comments....
johny why (link) - 24 10 07 - 03:53
"Cobol so corrupts the minds of those that use it, teaching it should be a capital offence."
.NET is the new Cobol.
IanB - 24 10 07 - 09:29
My pet peeve (which relates to both #1 and #2) is: In this age of cheap hard drives, why do most machines still have only one? CPUs and RAM have exploded in speed, but hard drives just get bigger. Standard (and easy to use) RAID 5/10 arrays would do a lot to reduce this bottleneck.
Rob Scott - 24 10 07 - 14:48
The "cool" thing about GC is that if you have a timer function (which allocates some memory of course), you will see the VM Size growing and growing until it drops back, just to start again, and I'm talking about ranges above 100MB now. I've never seen so many page faults in taskman as in my .net apps. There is of course no way to dispose of language-native arrays; they just hang around for a while to make sure the memory manager cannot reuse their space.
Gabest - 25 10 07 - 01:15
Mainly because the aim is to shrink the PC size while offering as much as possible. For instance, four years ago, it was commonplace to be able to acquire a desktop machine having more than 2 slots for hard disks specifically. Nowadays, you are lucky to find three slots, generally one or two slots are available for hard disks, and one of those slots is either a floppy drive or a media reader if the case was designed badly. The need to increase the size of disks instead of adding more is spurred by this trend. Oh, and laptops can only have 1 integrated drive, so that is important too...
King InuYasha (link) - 25 10 07 - 20:34
Whee, lots of comments. Let's see....
Write an OS? A long time ago, I started writing a DOS clone in asm. I think I got as far as writing the alloc/free functions in the int 21h handler before I got bored. I have no plans of becoming the next Linus Torvalds (at a minimum, I would have to be more witty in responses).
I know where Trimbo works. I'll make arrangements to have him punished.
There are lots of experimental GCs that solve various problems, but they all introduce other problems and don't yet work well in general. One often overlooked cost is that of changing references -- more elaborate tracking or caching mechanisms often make simple reference mutations more expensive. What we need to see is results. People kept saying that GC pauses were a problem, and the way that they got shut up was for incremental GCs to arrive as constructive proof that the problem was solved. I won't believe that the paging problem is resolved until someone actually ships a general solution that solves it. Given how invasive it is to hook into the OS page fault handler, I think that'll take a while. In the mean time, pretty much all of the mark-and-sweep GCs that people actually use still stink under paging.
Three problems with adding more drives. The first problem, as King InuYasha pointed out, is that laptops don't take multiple drives, and for that matter, it also isn't viable anywhere where space, noise, or power draw is a concern -- which it increasingly is across the board. The second problem is that adding more drives doesn't help latency nearly as much as it does throughput. The third problem is that more drives means more failures.
Throwing more hardware at the problem, in general, isn't the way to go. For one thing, the gravy train is slowing down, and CPU speeds aren't ramping up as fast as they used to. Burning more CPU power to make problems simpler is fine, except when it outstrips the rate at which hardware is speeding up... and people rightfully get upset when their faster computers actually run slower due to inefficient software. It doesn't bother me much when things don't get much faster, but it's shameful to have a barnstorming machine with a dual core CPU and loads of RAM that pauses for seconds at a time in a text editor.
Phaeron - 26 10 07 - 00:25
Some food for thought: is Native Command Queueing a help against hard disk thrashing? My Linux system seems to think so; Windows XP (a dual-boot partition I always forget to delete) just doesn't use it (I think it can't enable it due to a BIOS problem), but under Linux, it's detected and activated - on my oldish Hitachi SATA-II 250 GB, loading 6 apps and 4 different toolkits (32- and 64-bit versions of both Gtk and Qt) doesn't seem too slow.
Of course, you could also use Raid-0.
About VM: I have swap partitions configured on all my machines - but they rarely get more than a few megabytes of use. Why? Simply put: Linux (I do mean the kernel) will start paging out unused memory pages only when RAM has reached a certain amount (around 80%) of filling. However, it's true that once the swap partition starts to be in heavy use, the system gets very, very slow.
On the other hand, you shouldn't use the swap when there's no need for it...
Now, it also depends on the allocator - and right now, Linux gives you three different ones. You are free to choose the one you prefer, but my (limited) tests on SLAB seem to show lower RAM use and a nimbler system - YMMV - compared with SLUB (the newer one), especially when used with increased process pre-emption.
Mitch 74 - 26 10 07 - 09:35
I'm using Vista 64-bit on a laptop with 4 gigs of RAM and a ridiculously slow 200-gig, 4200rpm hard drive, and I have noticed some stuff that I like. The proactive memory management helps a lot. I/O priority does work. I expected worse because once you issue a read/write, the drive is gone for ~10ms while it seeks, but the system is actually perfectly usable while a defragmentation is in progress, with the hard drive active 100% of the time.
It also appears to do something genius: if most everything is paged out due to an application using, say, 3.8 gigs of memory in my case, when the application deallocates the memory, Vista immediately starts to read back the page file in disk sequential order, which is ridiculously fast compared to random hard page faults, and a few seconds later all the system is fast again.
I have also noticed that I/O priority has excellent mapping for all I/O operations, in particular the MFT and the page file. If a low-I/O-priority app is memory-thrashing a lot, its hard page faults are served at low priority too. Same with the MFT if it tries to list or create many files (such as your dir /s/b/a-d c:).
I have yet to test huge file operations to see if that hurts as much as it did in previous versions (both with the seeking and the cache-eating-all-the-RAM problems), but preliminary testing suggests it does not.
Optimistic John - 26 10 07 - 15:04
If speaking of ATI:
Download the latest driver ONLY, and download "ATI Tray Tools". I bet you'll have everything you need, WITHOUT .NET.
Or go with "Omega Driver" for ATI.
Matyas - 26 10 07 - 16:22
i'm still catching up on comments here, but the following might be of interest. they claim garbage collection need not be a memory hog-- at least with D:
johny why (link) - 27 10 07 - 00:32
Optimistic John - I have to agree that Vista is much better. If you have Vista (or XP-x64 or Server 2003) and a UPS (or laptop) and your system is really stable, you can also go to your disk drive in the Device Manager and turn on the 'extended performance' caching option which enables less-safe (but higher performance) write buffering.
Derek - 27 10 07 - 13:19
I've got a few issues with the Java GC - I was working on some robot classes for a Braitenberg simulator, and wanted to make my 'bots print out a "I've been eaten!" message to let me keep track of what was going on. Now Java has no destructors, because it's got no way of knowing that the last reference to an object is being destroyed. So, I poked around in the APIs, and found "finalizers" which looked like they would do what I wanted.
Except the finalizers get called when (if) the GC decides to collect your object. Which, if you've got plenty of memory and not too many objects, may never happen. Even if you tell the GC to go and eat objects, it may decide to chomp them without calling your finalizers. It may even decide that you've got memory to spare, and do nothing. And finalizers aren't guaranteed to be called on exit, even if you tell the runtime to do so, because apparently the GC may chomp the objects that your finalizers have references to.
So, they're effectively useless. And anything that needs to cleanup after itself has to provide a close() method or similar, and pray that the user calls it. I think with my program I decided the only way to make them work would be to code my own close()-style methods and add code to the simulator to call them. Which required the code for the simulator, which I didn't have.
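The explicit close()-style workaround described above might look roughly like this in Java with try/finally (the `Resource` and `Demo` names are invented for the sketch). The point is that cleanup is deterministic and caller-driven, instead of being left to the whims of the collector:

```java
// Explicit cleanup instead of finalizers: the class exposes close(),
// and the caller guarantees it runs via try/finally.
class Resource {
    boolean closed = false;

    void use() {
        if (closed) throw new IllegalStateException("already closed");
        // ... do work with the underlying resource ...
    }

    void close() { closed = true; }   // deterministic, caller-driven cleanup
}

class Demo {
    static boolean run() {
        Resource r = new Resource();
        try {
            r.use();
        } finally {
            r.close();                // runs even if use() throws
        }
        return r.closed;
    }
}
```

This is exactly the "provide a close() method and pray the user calls it" situation: the pattern works, but nothing forces the caller to write the try/finally.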
Thomas (link) - 27 10 07 - 13:52
It's the same for C# with the Dispose methods; if you are interacting with unmanaged code, then you have to write more memory management code than ever before. I wish it was more like COM, where the object is destroyed right after its refcount reaches 0. For me, having used COM a lot, the GC nicely solved the circular reference problem but added another instead.
Gabest - 27 10 07 - 16:20
Finalizers are so finicky that they're almost useless -- not only are they driven by memory pressure, which is useless for resources like file handles, but they also run in parallel threads. My advice is to avoid them, in either Java or .NET.
Fast deterministic collection of objects in arbitrary graphs is the holy grail problem of garbage collection -- if someone found a good way to do this, GC would suddenly become so much incredibly better it's not even funny.
Phaeron - 28 10 07 - 15:40
A funny thing with Vista is that they've added a number of services that run in the background at all times, and sometimes kick in even when the system is not idle. These services (SuperFetch, ReadyBoost, the indexer, the defragger, file caches that need to be swapped out when the RAM is needed by something important, etc.) are there to make the computer *run faster*. So they've added a lot of stuff that causes drive access and eats memory in order to make it faster. I think it's easy to see how they're just digging themselves deeper into the RAM hole. It's like RAM is no longer the precious commodity it once was.
At least on Linux, I can see the system is using about 300 MB tops on my system, rather than Vistas what -- 1.5 GB of my 2 GB? No wonder it sometimes feels like I'm still just on a 1 GB system on Windows XP.
"Finalizers are so finicky that they're almost useless -- not only are they driven by memory pressure, which is useless for resources like file handles, but they also run in parallel threads. My advice is to avoid them, in either Java or .NET."
Agreed on this, this is where C#'s "using" keyword is supposed to be used -- on stuff like file handles and FileStreams. But still, yeah, you need to nanny the garbage collector quite a bit in certain scenarios for optimum performance there.
Jugalator - 28 10 07 - 17:22
The evil came over the computer world with Microsoft Basic on home computers like the VC20, the C64 and some others. The interpreter wasn't intended for handling huge programs - and they didn't test all of the scenarios users/programmers can invent. I dorked the Basic with a 2000-line program (that was impossible on the VC20) - it hung frequently, and suddenly the code was modified. Conclusion: the GC itself was buggy! Then I wrote assembler programs, and never had such problems in the rest of the C64 world.
Later MS found it cool to torture 386SX owners with a bad copy of a mainframe technology called 'virtual memory'. VM is not bad in itself; what's bad is the concept of running programs in a virtual machine without any memory restrictions, plus braindead paging models. You cannot even switch it off in XP; it always swaps somewhere, no matter if you have 4 gigs of memory.
Actually I find it funny that they care about cleaning up the memory, but no one cares about cleaning up c:\temp, and no one really cares for good programming if bad programming with a powerful wizard-driven SDK does the same work - except that you need a quad-core to run all of that scrap.
BTW, A** is not the only manufacturer using .NET for some simple settings, but it's possible to install the pure driver from the CD without this .NET scrap. Vodafone has a dial-in manager for wireless broadband connections via UMTS and CDMA that also comes with .NET. That brought the poor Pentium 3/800 that manages my internet connection to its knees.
shadowmaster - 28 10 07 - 17:37
_Finalizers are so finicky that they’re almost useless—not only are they driven by memory pressure, which is useless for resources like file handles, but they also run in parallel threads. My advice is to avoid them, in either Java or .NET._
Absolutely agreed. I find that in general I love C# -- I like the syntax, and for smaller, less memory-intensive programs it is a nice language; plus, for graphical applications, the ability to create partial classes hugely cleans up the code. However, in both Java and C# I find that I frequently create a memory manager in which all objects are referenced in a hashtable which includes "destroy" functionality. While there is no guarantee that the GC will eat objects after the pointers are unset, it is much, much more likely.
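The registry pattern described in this comment might look roughly like this in Java (the `ObjectRegistry` name and methods are invented for the sketch). destroy() drops the map's reference, which is what makes the object eligible for collection -- though the GC still decides when, if ever, to actually reclaim it:

```java
import java.util.*;

// A hashtable of live objects with explicit "destroy" functionality,
// as described in the comment. Removing the entry drops the strong
// reference, making the object collectable on the next GC pass.
class ObjectRegistry {
    private final Map<String, Object> live = new HashMap<>();

    void register(String id, Object o) { live.put(id, o); }

    // Dropping our reference is all we can do; when the GC actually
    // reclaims the object is still entirely up to the runtime.
    void destroy(String id) { live.remove(id); }

    int liveCount() { return live.size(); }
}
```

The obvious caveat: if any other code still holds a reference, destroy() frees nothing, which is why this only helps when the registry is the dominant owner.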
WhIteSidE - 28 10 07 - 18:40
@IanB: COBOL.NET actually exists.
Nicolas - 01 11 07 - 15:13
well, let us all hope that with "flash" hard drives reaching the 80+ gig range (so far mostly in portables, but you can aftermarket them in desktops) some of these problems will solve themselves.
Josh Lemon - 03 11 07 - 04:54
Until flash drives get much faster, they aren't a real solution. Sure, XP Embedded boots up in a few seconds on a 1GHz Geode with 512MB RAM, but once Explorer starts, you have to wait a lot while programs in Startup are being loaded.
ender - 05 11 07 - 14:56
Modern garbage collectors are incremental and generational; they tend to avoid moving large objects and either preserve initial object locality or improve it dynamically by object collocation.
Oleg - 05 11 07 - 23:08
Vista will attempt to use almost all of your RAM at all times by trying to anticipate the data you'll need based on previous usage patterns. If something is needed that isn't in this RAM cache then it is retrieved from the hard drive and the RAM cache drops the least likely to be used data. That means a much smaller portion is actually currently needed data. For example my system uses about 500MB of RAM with normal usage. The other 500MB is filled with a cache.
It's actually a little surprising that it only uses 1.5GB of 2GB. Apparently Vista ran out of things that it thinks you might do.
Eric - 06 11 07 - 04:42
You are correct. That still doesn't change the fact that in order to do a full garbage collection, the collector needs to check all memory that might contain references. Incremental collection reduces the frequency of full collections, but you still have to do them occasionally, and that will force memory to be paged back in. There are also well known issues with incremental collection where the GC prematurely promotes objects to seldom-collected generations, and the GC performs unexpectedly poorly.
Vista does seem a bit better at reducing disk access, but I'm pretty sure it still suffers from the same behavior of paging out application memory to enlarge the cache. Even if it has technology to attempt to page data in before a page fault occurs, you still have problems with disk contention and the loss of performance when it guesses wrong or otherwise can't page data back in in time. I'm sure there are people who would err more on the side of having more unused memory if it reduces paging.
Phaeron - 07 11 07 - 00:02
I had problems like those while playing Stepmania.
It looks like full screen apps speed up dramatically once you alt+tab to them.
Josh - 10 11 07 - 23:28
For #1, I'm using C# to write mission-critical services which run 24/7, and I'm quite happy that I don't have those freaking memory leaks that I needed to debug and clean up in the past with C++. I agree, C++ is faster and more beautiful and maybe more geeky, but if I need to write a system service which is supposed to send and receive approx. 1 million SMS messages per day, in 2 weeks, well, I have no option but to go for a GC environment. I can't make it fast, robust, and safe at the same time in that time frame.
The only point about using unusually excess memory for some programs is: Our programmer says: "Well there's GC in any way, so why not allocate more memory. The GC will eventually recover the memory if needed"
GC doesn't mean that we programmers aren't responsible for the memory our programs use. It's just helping us in case we "forget" to free a reference.
If you write a C# program as carefully as a C++ program, the GC won't have anything to do until your program ends.
About finalizers: don't use them if avoidable. Finalizers cause your dead object to keep living until another GC pass. (In the first GC, the finalizer is triggered; in the second, the object's memory is reclaimed.)
And if you have to use finalizers: "Don't reference anything outside the scope of that object in the finalizer code. The referenced object may be dead and buried already!"
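The two-collection lifetime described above can be modeled with a toy collector (a hypothetical sketch of the .NET-style finalization queue, not real CLR code): an unreachable object with a finalizer survives one collection so its finalizer can run, and is only reclaimed by the next.

```python
# Toy model of finalization: a dead object with a pending finalizer is
# kept alive through one GC pass and reclaimed on the following pass.

class ToyGC:
    def __init__(self):
        self.heap = []

    def alloc(self, name, finalizer=False):
        obj = {"name": name, "finalizer": finalizer,
               "reachable": True, "finalized": False}
        self.heap.append(obj)
        return obj

    def collect(self):
        survivors = []
        for obj in self.heap:
            if obj["reachable"]:
                survivors.append(obj)
            elif obj["finalizer"] and not obj["finalized"]:
                obj["finalized"] = True   # finalizer runs; object lives on
                survivors.append(obj)
            # else: memory is reclaimed immediately
        self.heap = survivors

gc_ = ToyGC()
plain = gc_.alloc("plain")
fancy = gc_.alloc("fancy", finalizer=True)
plain["reachable"] = fancy["reachable"] = False  # both become garbage

gc_.collect()
print([o["name"] for o in gc_.heap])  # ['fancy'] -- survived the first GC
gc_.collect()
print([o["name"] for o in gc_.heap])  # [] -- reclaimed on the second
```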
GC is a magic bullet if you're working on in-house projects or projects with a limited audience (how many copies of an SMS server can be sold?) and limited time to deliver. If the program is destined for the masses, better to go with C++, I guess.
Salih - 15 11 07 - 16:28
It's about time somebody said it! It can take almost 30 seconds to switch tabs in Firefox on this "new" laptop. Don't get me started with laptop design...
Dan Deibler (Voting for Ron Paul in the Republican presidential primary) (link) - 17 11 07 - 16:59
As far as GC goes, it can be a ticking bomb for sure, and Windows (XP at least) has too many problems with the memory manager as it is (e.g. making sure you have tons of free (wasted) memory) even if you've got plenty of gigs.
It's a shame that the good old days of the Amiga and C64 are over; today's programmers have a lot to learn from the oldies. This doesn't mean you have to write programs as efficiently as if they were supposed to run on a C64, but it also doesn't mean you shouldn't care about proper memory management. ;)
I could not agree more with your comments ;)
Waxhead (link) - 25 11 07 - 11:49
Do you know that this page provided me with a quick, precise education about the GC and its stupidity? Thank you, people.
What I hate most about managed environments is how they've made lots of people so lazy they can't program any more. I tend to liken them to the VB bunch out there. While managed environments are good for business-related problems, they make people forget many things. Everyone, to my knowledge, grows up learning how to deal with his/her garbage and how to be responsible. To me, these managed environments are teaching programmers to be the other way around.
PS - I really love the gling glong my Amiga floppy does every time I switch it on.. :-p
Morlac - 27 11 07 - 08:01
You didn't list my "favorite" cause of slowness, though you did allude to it a few times.
In the last ten years, HDD capacity has grown a thousand-fold and HDD I/O speeds ten-fold, but access times have only about halved. Needless to say, the CPU speed improvements in that same period are far, far greater. Nowadays one can get nice 6 Go/s I/O rates from DDR2 SDRAM, with access times around 30 ns (an estimate; I didn't calculate or benchmark, but that's the ballpark). Compare that to the values for a modern desktop hard drive (60 Mo/s and 13 ms)¹ and you'll quickly notice that RAM isn't just three times as fast as HDDs, as indicated in the article, but 100 times as fast in linear I/O and a whopping 400,000 times faster on random access. There's a big difference between those two, so it is worth checking which one you are actually getting...
Check the average file size on your system. DLLs might be in the megabyte class on average; configuration files, XML files (surprisingly popular nowadays even as applications' proprietary formats), icons, etc. are just kilobytes each. On Linux the issue is even worse, as there is a clear trend toward smaller files (that are presumably easier to modify). A quick check on my system disk revealed that the average file size is just 2.6 Ko. With the help of a good disk cache I'd estimate that accessing a random file needs just two random seeks on average. After a cold boot or with an ineffective filesystem you need a lot more, of course (first reading all the parent directories, which are stored as files themselves, and then the actual file; and reading a file involves several indirections due to UNIX-style allocation trees).
Do you have an idea of how large a file has to be before the I/O speed, rather than the access time, dominates the read operation? Knowledgeable people buying HDDs always look at the I/O speed benchmark, as the figure is easy to understand, but far too often completely ignore the access time (a combination of average seek time and rotational latency, as printed in the specifications). This is quite sad, as files need to be about a megabyte or larger before the I/O speed dominates. For smaller files, and also for fragmented big files (something you downloaded with BitTorrent²), the access time is the only thing that matters. The filesystem can do very little to avoid fragmentation between files, as it has no way of knowing in which sequence the files will be loaded.³
So, by far the biggest reason for software slowing down is programmers using files that are too small. The next time you want to use a few dozen XML files, separate icon files, sound clips or whatever, be aware that reading all of these can easily take ten seconds even if your application is quite small, while packing them into larger files lets your application load at up to 50 Mo/s.
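The arithmetic behind these claims is easy to check with the same rounded figures the comment uses (60 MB/s throughput, 13 ms access time, two seeks per file):

```python
# Back-of-the-envelope disk math: where does seek time stop dominating?

THROUGHPUT = 60e6    # bytes/second, linear read
ACCESS     = 13e-3   # seconds per random access (seek + latency)

# File size at which transfer time equals one access time -- below this,
# seeking dominates; above it, throughput dominates.
crossover = THROUGHPUT * ACCESS
print(round(crossover / 1e6, 2))   # ~0.78 MB, i.e. "about a megabyte"

# Loading 1000 small files of 2.6 KB each, two seeks apiece:
seek_time = 1000 * 2 * ACCESS
data_time = 1000 * 2.6e3 / THROUGHPUT
print(round(seek_time, 1))   # 26.0 seconds spent seeking
print(round(data_time, 3))   # 0.043 seconds transferring the actual data
```

The seeks outweigh the data transfer by a factor of several hundred, which is exactly the "ten seconds for a small application" effect described above.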
Note: I have rounded some values above quite liberally because my point is about the general speed of hardware, not the exact speed difference of some particular products.
¹) Of course there is that reader who has a WD Raptor or some SCSI/SAS disk whose access time is only a quarter of that of a desktop disk. Even in that case, though, random access is still some five orders of magnitude slower than RAM, so it isn't all that different.
²) A good filesystem driver should take sparse allocation into account and try to store the file linearly even if it is actually written in random order, but in practice I don't see many filesystems doing this. NTFS on Windows, in particular, fails at this.
³) Windows apparently does some usage pattern tracking, but I don't see that working very well either.
P.S. My second favorite cause is quickly becoming the use of XML files, because they can only be read a character at a time (each character must be read before it is known whether there will be more characters or what the next one means) and linearly (it is not possible to seek within XML). For most uses (DOM) they also need to be loaded completely into RAM before any processing can occur. Compare this to a typical binary file format (e.g. AVI or Matroska) that stores chunks of data with the size of each chunk in its header, so that an entire chunk can be loaded at once and it is known what will be found at each offset inside it. Such files often also have an index that allows quick seeking to any spot in the file. Formats without a separate index often use syncwords to allow access from arbitrary positions (e.g. MPEG streams). Neither approach can be used in XML without horribly breaking the format in the process.
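A minimal sketch of the chunk idea (RIFF/AVI-style, heavily simplified — the tags and layout here are made up for illustration): every chunk carries a 4-byte tag plus a size header, so a reader can hop straight over chunks it doesn't care about without parsing anything in between, which is exactly what XML cannot offer.

```python
# Tag-and-size chunk format: payloads can be skipped without being read.
import io
import struct

def write_chunk(f, tag, payload):
    f.write(struct.pack("<4sI", tag, len(payload)))  # 4-byte tag, u32 size
    f.write(payload)

def find_chunk(f, wanted):
    """Walk the file header-by-header, skipping payloads we don't want."""
    f.seek(0)
    while True:
        header = f.read(8)
        if len(header) < 8:
            return None                      # end of file, chunk not found
        tag, size = struct.unpack("<4sI", header)
        if tag == wanted:
            return f.read(size)
        f.seek(size, io.SEEK_CUR)            # skip payload without reading it

buf = io.BytesIO()
write_chunk(buf, b"hdr ", b"small header")
write_chunk(buf, b"data", b"x" * 1000)       # big chunk we can hop over
write_chunk(buf, b"idx ", b"the index")
print(find_chunk(buf, b"idx "))   # b'the index'
```

With a real disk file, the `seek()` call turns the 1000-byte `data` chunk into zero bytes of I/O; an XML parser would have to scan every one of those bytes to find the element after it.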
P.P.S. Linux actually handles the swap/cache balance very well in the CK patchset and recently also in the mainline (since 2.6.20 or so), thanks to the adaptive swappiness patch. Another major issue, a crappy scheduler causing long pauses when multiple busy processes compete for a CPU, was also fixed first in the CK patchset and only very recently in mainline (2.6.23). I bet the bad things you've heard are from before these fixes.
Tronic - 21 12 07 - 06:09
Mhm. GC is not that bad; the only problem is that it is not widespread enough. GC programs consume so much heap space because they have to manage their own, so you end up with a GC heap and a non-GC one. If everything in your OS were garbage collected, that problem wouldn't exist. As for the two-phase problem and the interaction of the collector with the VM, it's of course a problem that has been studied: all major GC implementations are incremental, they don't sweep the entire heap, and there are VM-friendly GCs. In fact GC can be much faster than explicit memory management, because you can move blocks of memory, improving locality, and it generates a lot less fragmentation. A GC malloc is usually much faster than a non-GC one. It just has to be better integrated into operating systems, and I think that if it becomes widespread, it will prove to be an improvement.
Angelo Pesce - 28 12 07 - 07:45
Sorry, Angelo, you've been fed the party line.
The non-GC heap in Windows does not have to be of any significant size. The entire VM allocation size for vdub.exe is 580K on my system, which is orders of magnitude lower than the overhead of most GC systems. GC systems generally *have* to waste space, since that's the main way they amortize the cost of collection sweeps. All you have to do is run the CLR profiler and look at the size of the allocation ramps. The gap between the heap size before and after a collection is memory that is temporarily wasted, and the longer it takes for that space to be reclaimed, the greater the cumulative loss of that memory's utility to the system.
GC is faster than *bad* explicit memory management. Good programs don't spend all of their time allocating memory, GC or not. The best programs barely allocate memory at all during performance-critical operation. I don't care if you can use operator new() a million times faster in a GC heap.
Modern GCs are incremental in that they don't have to halt program execution in order to reclaim memory. That does not mean that they don't have to sweep the entire heap when a full collection pass occurs. If this weren't the case, then GC paging wouldn't be a problem.
Yes, there are VM-friendly GCs, as others have pointed out. They are not in production, probably because they have other issues, such as requiring cooperation with the kernel's paging logic (security problems) or increasing the cost of elementary operations. Show me a major, high-profile GC-based language that uses a VM-friendly collector. The main implementations of Java and .NET don't.
You're also forgetting the one huge problem left with the GC algorithms you describe: they are not deterministic. If GC becomes integrated into the OS, then hard real-time becomes IMPOSSIBLE. There are a lot of people working on multimedia applications who would become very upset at this.
Phaeron - 28 12 07 - 15:54
Phaeron, what do you have to say about Photoshop VM?
Igor (link) - 07 01 08 - 21:38
Not much. It's a specialized VM solely in user space for an application designed to own the whole machine in order to edit huge files, implemented because it uses application-specific knowledge to outperform the generic VM (or overcome the lack thereof). The main upside of the Photoshop VM system is that it has a hard upper bound that's user tunable. I have seen cases where having Photoshop open can cause mysterious failures in other applications, but I doubt that's directly related to memory usage.
Phaeron - 07 01 08 - 23:28
To be honest about slow .NET apps, GC is certainly a factor, but don't forget that there may be other factors as well. You'd have to debug or profile the .NET app to be definite about what is causing the .NET app to be slow.
Yuhong Bao - 23 03 08 - 18:16
"It can take almost 30 seconds to switch tabs in Firefox on this “new” laptop."
Again, to be definite about why that is, you would have to use a debugger; the source code for Firefox is readily available.
Yuhong Bao - 23 03 08 - 18:24
"While multithreading on the CPU is good, though, it isn't so good for the hard disk. In fact, it makes hard disk performance suck, because seeks are expensive and even a moderate amount of seeking causes throughput to plummet. The problem is that when you add multithreading to the mix, you have multiple threads submitting disk requests, which then causes the disk to thrash all over the place."
RAID can help here.
Yuhong Bao - 23 03 08 - 18:41