Mac OS X – when it doesn’t "just work"


My wife, mother-in-law, father-in-law, and sister-in-law all grew up using Macs.  I grew up using various Unix machines.  I have to say that I was excited, if for no other reason than family tech support, to see OS 9 leave our home and theirs and be replaced with something sensible and UNIX-based.


This weekend, that sentiment bit me, badly.


The sister-in-law is going to graphic design school, and naturally that requires a Mac and the Adobe Creative Suite.  She was able to rustle up a G3 powerbook (Lombard, maybe?) at 400 or 500 MHz.  It works well enough and runs OS X and all her mandatory apps.  I was impressed at how easily she used it for various computery tasks: working in Illustrator, dragging PDFs to a network share (the other OS X machine in the house), and printing from there (network printing wasn’t set up).


Because my in-laws are so appreciative and generally wonderful, I don’t mind doing occasional tech support for the mother and sister in law (the father in law is himself technically brilliant, but short on time).  A week ago, the sister-in-law called with something of a dilemma.


Her powerbook had stopped booting.  It would hard lock upon bootup, on the white-apple-logo screen with the spinning cursor (i love that apple added a spinning cursor during boot; it reminds me of 20-year-old Sun machines with their spinning ASCII cursor).  Her dad had already given the machine a once-over: the standard apple tricks (PMU reset, PRAM zapping, etc.) and some new special OS X ones (deleting the kext cache! – do mac people actually know what that means? 🙂).


His conclusion – hardware issue.  Uh oh.  Sister-in-law doesn’t need any money-expending situations, so a hardware issue would be bad.  Her calls to the various paid Apple support people all ended in proposals that she give them money to do something to her computer that her father had already tried.


I asked her to try a few more things over the phone.  The machine oddly enough booted and ran correctly in safe mode (hold down SHIFT during bootup).  The machine would hard lock in the same place trying to boot from CD (uh oh).


I asked her to boot single user (hold down option-S during power-on).  This drops sister-in-law into a single-user bash prompt, running as root.  Frankly, i’m not sure this is what mac users want their debugging experience to be like.  With old macs, you just kicked them or threw them away when they stopped working.  Now that there’s an actual operating system under the covers, people such as myself foolishly believe we can revive these machines with enough messing around.


Indeed, i told her i’d fix it for her when i saw her in person next (they’re a 3hr drive from us). 


I spent about 10 hours this weekend working over the powerbook.  I now know an exhaustive amount about the OS X boot procedure.  It’s a somewhat unix-style boot, with some weird apple stuff thrown in at the end.


When you boot single user, /etc/rc.boot is executed.  All this really does is run some rudimentary fscks and then drop you to sh.  Good.


Multi-user boot runs /etc/rc.


Both scripts look at /etc/hostconfig.


The last line of /etc/rc is something called “SystemStarter”.  Uh oh.  Capital letters mean it’s not unix, it’s NeXT/Apple stuff.  Missing are both the SysV rcN.d/Sxxfeaturename system and the “it’s all in a big rc” BSD-style approach.


No, SystemStarter examines /System/Library/StartupItems and /Library/StartupItems, looking for stuff to do. 


I modified rc so that it didn’t call SystemStarter, and instead gave me another sh prompt.  Running SystemStarter manually (they did provide a non-daemon mode, a verbose mode, and a non-GUI mode) showed that things were generally ok (except for a problem starting CrashReporter) until you tried to boot the GUI.  SystemStarter starts LoginWindow as one of its last jobs, so it was clear that something in the SystemStarter->LoginWindow path wasn’t right.


Oddly enough, the SystemStarter binary was version 177.3 on this mac.  The other mac, also running 10.3.4 with the latest patches, had version 177.2.  How does that happen?  One of these machines has the wrong file version on it, clearly, but both feel they’ve got all the latest critical updates.  Interesting.


I did some more reading about what safe mode did, since safe mode booted fine.  Turns out that OS X retains Apple’s ideas about “Extensions”.  Some extensions are “bad”, and don’t get loaded during safe boot.  We’ve got an analogous concept in Windows, where the registry defines service/driver sets for the various configs (CurrentControlSet, etc.).  In OS X, the system extensions are based on filesystem structure.


You’ve got /System/Library/Extensions, and an extension is foo.kext.  foo.kext is a subdirectory, and in this directory (under Contents) you’ve got Info.plist, which is an XML or NeXT prop-list document describing the extension’s dependencies and, as i found out eventually, which boot modes the extension is loaded under.  Also you’ve got MacOS/foo, the actual binary image of the loadable kernel module.


Now, it turns out that kexts are loaded with kextload.  Loaded kexts can be queried with kextstat.  And there’s a daemon which dynamically loads kexts as needed, called kextd.


Kextd is started in /etc/rc (the multi-user script).  If safe boot is enabled, kextd -x is called, which (as near as i can tell) tells kextd to just pass -x to kextload.  kextload -x means “only load safe-boot or root extensions”.


I decided i’d try to isolate the problem to kexts.  I modified /etc/rc so that kextd -x was called regardless of whether we were safe booting or not.


After making this change, normal multi-user boot succeeded.  The machine worked great, apart from having no sound, modem, firewire, etc 🙂


Clearly, there’s an extension that gets loaded outside of the safe-boot set that hangs bootup somehow.  But how to isolate it ?


On this laptop, there were 183 extensions in /System/Library/Extensions.  100+ were required for a safe boot. 


OS X has the notion that an extension can be required for the root filesystem, and furthermore, can be required for certain types of root filesystems (i.e. local, network, or cd).  The set of extensions actually loaded during a safe boot is any extension marked required for safe boot (in its Info.plist, in the OSBundleRequired property), and any extension which is needed for root or root-filesystem mounting. 


It is impossible to discern this, by the way, without using csh and grep.  Eventually i constructed some output files using csh, awk, grep, vi, and comm to figure out a list of which extensions were NOT used in safe mode.
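

The same sorting can be sketched in Python instead of csh and awk.  This is a minimal sketch, not the commands actually used on the powerbook.  It assumes each bundle keeps its property list at Contents/Info.plist (with a flat Info.plist as a fallback), and that any kext declaring OSBundleRequired belongs to the safe-boot/root set described above.  The function name classify_kexts is made up for illustration.  Python’s plistlib reads XML and binary plists only, so old NeXT-style text plists end up in the “unreadable” pile and need checking by hand.

```python
import plistlib
from pathlib import Path


def classify_kexts(extensions_dir):
    """Sort *.kext bundles by whether Info.plist declares OSBundleRequired.

    Returns (required, optional, unreadable):
      required   - dict mapping kext name to its OSBundleRequired value
      optional   - names with a readable plist but no OSBundleRequired
      unreadable - names whose plist is missing or could not be parsed
    """
    required, optional, unreadable = {}, [], []
    for kext in sorted(Path(extensions_dir).glob("*.kext")):
        # Assumed layout: Contents/Info.plist, with a flat Info.plist fallback.
        candidates = [kext / "Contents" / "Info.plist", kext / "Info.plist"]
        plist_path = next((p for p in candidates if p.is_file()), None)
        if plist_path is None:
            unreadable.append(kext.name)
            continue
        try:
            with plist_path.open("rb") as fh:
                info = plistlib.load(fh)
            value = info.get("OSBundleRequired")
        except Exception:
            # Not an XML/binary plist, or not a dictionary at the top level.
            unreadable.append(kext.name)
            continue
        if value:
            required[kext.name] = value
        else:
            optional.append(kext.name)
    return required, optional, unreadable
```

Calling classify_kexts("/System/Library/Extensions") and printing the second list gives the candidates that are NOT part of a safe boot, which is the list worth bisecting.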


In the old world of macs, people would drag extensions out of the extensions folder into Extensions (Disabled) and drag them back in until the box worked.  They had then identified the misbehaving extension.


Doing the same thing with csh foreach() loops and mv is just as good. 


I started by removing every non-required kext from the default kext directory, and then reverted the kextd -x change and did a normal multi-user boot. 


Awesome.  The machine booted, but all video was 4 shades of grey (i was pleased to see this bit of NeXTSTEP was still lurking under the covers).


Next i started adding in groups of kexts.  I’d make a text file saying which ones i wanted to add, then add them in a foreach loop, sync the disks, and reboot.  If it booted cleanly, i’d add another batch.
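

For illustration, one round of that loop might look like the Python sketch below.  It is an assumption-laden stand-in for the csh foreach loop, not the original commands: the helper names read_batch and restore_batch are hypothetical, and so is the idea of a single “disabled” folder holding the kexts that were moved aside.

```python
import os
import shutil
from pathlib import Path


def read_batch(list_file):
    """Return kext names from a text file, one per line.

    Blank lines and lines starting with # are skipped.
    """
    names = []
    for line in Path(list_file).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.append(line)
    return names


def restore_batch(names, disabled_dir, extensions_dir):
    """Move the named bundles from disabled_dir back into extensions_dir.

    Returns the names actually moved, so a failed boot can be blamed on
    exactly this batch.
    """
    moved = []
    for name in names:
        src = Path(disabled_dir) / name
        dst = Path(extensions_dir) / name
        if src.exists() and not dst.exists():
            shutil.move(str(src), str(dst))
            moved.append(name)
    if hasattr(os, "sync"):
        os.sync()  # flush to disk before rebooting
    return moved
```

If the next boot hangs, the culprit is somewhere in the list restore_batch returned; move that batch back out and re-add it in halves.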


Working with the mac in greyscale mode kind of stinks, so i wanted to sort that out in short order.  However, i found that moving all of the ATI video kexts back in caused a non-boot condition.  Interesting.  I had noted a few messages coming from the ATIRage128.kext driver as it loaded on previous boots (it can load in text mode and not crash), and i also noticed that trying to unload it caused a hang.  I removed the ATIRage128.kext driver and the box booted again, albeit with a grey screen again.  Doh.  I started trying to add more kexts and noticed a folder that wasn’t a kext at all – AppleNDRV.  In this folder were ATIRuntime bundles – neat.  Adding that folder back to its original location gave me a booting mac complete with color.


After adding back all the other kexts, it would seem that the ATIRage128.kext module was responsible for the hang.  Presumably, this also explains why the boot CD crashed, as i believe it’s a stock driver.  One also wonders how i’m getting color video with no kernel extension for the video hardware, but, whatever 🙂


The 10.3.4 update (which was applied a day or two before the box stopped working) included massive updates for ATI and NVidia chipsets.  The checksum on the actual module matched that of another ATI-equipped machine (which worked, by the way).


So, is it a hardware problem?  I dunno.  When the 128 driver loads, it prints a kernel message saying it’s unhappy about the firmware on the video board in the powerbook (a RageM3).  Without the kext loaded, the box works perfectly, even for doing opaque drags and work in illustrator and photoshop.  What was it doing in the first place?


The point of all of this is – are mac users really supposed to be able to fix this themselves?  I lived and breathed unix all through high school and college, and it took me the better part of 10 hours to get this machine booting multi-user with working audio and so on.  Will the next system update drop a new ATIRage128.kext into the default place, perhaps causing this problem again?


It seems that the process of debugging boot failures ought to be a bit easier than this.  I run my Windows machines with /SOS and the other nice switches that give you textual boot progress, and OS X supports “verbose boot” (hold down option-V during boot), but that was no good in this case.  The crash didn’t occur until the video hardware tried to do something clever: merely loading the kernel module for the ATIRage128 didn’t cause it, and only starting the window server with this module loaded triggered the hang.


What are the odds that Apple’s paid tech support would have resolved this without a format ? (or at all?)


 

Comments (38)

  1. Kai Cherry says:

    See..you aren’t trained as an Apple tech 🙂

    First rule: Run the hardware diagnostics. Without even seeing this hardware, I’m gonna take a guess that it’s probably the RAM.

    To Non-Apple hardware people, this notion seems…nuts 🙂

    And Desktop or Powerbook cert’d folks with at least a couple of years under their belt are probably nodding "mmmmhmmm"

    I’ve seen machines with spotty RAM do some really insane things, like the optical drive not working, keys sending the wrong character, etc.

    I’d try removing any 3rd party ram, switching the ram in the machine for "known good" and starting from there.

    -K

  2. Rob says:

    I had those symptoms on an old-style iMac running 10.3. Not having any of the unix knowledge or background you do, after running through the usual reboot reset fsck etc., I cut my losses and archived all the docs and apps off via Firewire (did you try booting the PowerBook into Firewire target mode?–best option if you don’t have an external firewire drive available and don’t want to or can’t transfer files over Ethernet).

    Then reinstall clean system. I’m sure some extension/file/or another was the culprit, but it didn’t matter.

    The archive and reinstall technique is pretty much SOP, and near 100% successful on non-hardware issues, for middling-level mac tech heads like me. So to answer your question, no, mac users don’t have to have that many Unix skills to solve problems running OS X and get back up and running.

  3. Rob says:

    I just re-read and noticed you tried booting from CD (to install new system?). Again, booting the PB into Firewire Target mode would make its HD available as a potential boot volume on the other Mac OS X machine in the house, and you could have installed a new system on the PB HD that way.

    I remember now fixing another balky PB once by pulling its HD into an open bay on my G4. (Some PB HDs are a breeze to pull, while iBook is almost impossible.)

  4. Matt Evans says:

    Thanks to those that asked about HW diags or being suspicious of the RAM.

    I’ve had shoddy Pbook ram before (my wife has a wallstreet series 2 that i foolishly put some CompUSA ram in once upon a time 🙂

    The father-in-law said he ran all the hw-diag stuff he could and everything checked out ok. Not having run it myself, and not necessarily knowing the "proper" hw-diag tools, i’m not ruling out the possibility that some hw-diag toolset might have found / still might find an issue..

    Re: firewire target mode..

    i didn’t realize such a thing even existed until i was doing some additional post-fixing-it searching.

    Even so, i was more interested in understanding what was broken (to a point, at least) than just getting the box working again, and re-installing the OS always seems heavy-handed and unfortunate..

  5. Ryan says:

    Quick question: what version of OS X was on the install CD? The fact that the computer hangs when booting from the hd and from the generic install cd hints strongly at a hardware problem. If it booted fine from the cd then I would almost certainly blame the ATI driver.

    One option (which, I agree, is heavy-handed) is to boot the powerbook in firewire target disk mode and reinstall using the "Archive and Install" option — this will preserve all of the existing files, settings, users and applications while giving the system a clean slate. I doubt this will do the trick, though, since booting off of the CD didn’t work…

    Did anything happen *physically* to the machine before she noticed problems?

  6. Kim Helliwell says:

    Actually, I had a similar problem (not the exact symptoms, but failure to boot after installing the latest OS X update). This was on a Quicksilver Power Mac.

    I did call Apple (the machine is under AppleCare), and the tech there was suspecting a scrambled disk image. We did all the "usual" things. He recommended getting DiskWarrior (which in general is a good idea anyway), but by the time we exhausted the disk repair stuff you can do with Disk Utility, I was fairly well convinced that the disk structure was OK, but some data in the sectors was scrambled in the OS installation. The final thing the tech suggested was to reinstall the OS with an "archive and install," which puts the old instance of the OS out to pasture, and puts a new one in place without erasing the disk. So your files are preserved intact (and yes, I did some backups first to be extra safe….). This fixed the problem, and the next attempt to update went fine. Dunno what happened the first time.

    I do agree with your point, though. I wouldn’t have known to do all you tried (and I’m a longtime Unix user; just not as savvy about boot sequences as you apparently are). I truly doubt the Apple tech would have been able to lead you through all that process.

    Just to provide some contrast, though: the same week my Mac blew, my wife’s XP box became so badly infected with viruses and spyware that it became unusable on the internet. I had to back up her user dir, wipe the disk and restore the OS and then reinstall all the apps. Much more painful than the Mac experience! And, in fact, I leveraged that experience to persuade my wife to switch to a Mac, and we went out and got her a 20" iMac which now has pride of place on her desk.

  7. Matt says:

    Well, you’re a UNIX guy and all that delving around wasn’t really necessary.

    There’s a Mac way of fixing things. And it wouldn’t have required a reformat or piddling around with the extensions.

  8. Just a small correction: cmd-S is single user mode, not opt-S. Same with verbose mode: cmd-v.

  9. Bob says:

    When weirdness happens, ALWAYS try Disk Warrior! Especially before getting all UNIXy. The mightiest sword you will ever buy…

  10. daniel Eran says:

    Next time find out what’s wrong before you start "fixing."

    If your car doesn’t start, would you disassemble the whole thing in your garage? Possibly if you knew nothing about cars.

    Since Macs, unlike PCs, can easily boot from a second volume, the smartest thing to do is try booting from a known good disk – a CD or a FireWire HD with a known good installation. You could even boot the questionable Mac using your other Mac in target mode. That would help tell you if its hardware or software.

    Futzing around with just enough knowledge to be dangerous is, well, a waste of time. You can’t blame "Macs" for your lack of knowledge in how to go about troubleshooting. Apple has a good amount of troubleshooting steps on their website as well.

  11. Matt Evans says:

    Daniel,

    please re-read carefully. The powerbook wouldn’t boot from an install CD either.

    As to which version of the OS X CD was used – I don’t know, as it wasn’t me that attempted the CD boots. It was >= 10.3, but i don’t know what subversion of 10.3 is committed to disc at my in-laws house.

  12. Don says:

    "I asked her to boot single user (hold down option-S during power-on).  This drops sister-in-law into a single-user bash prompt, running as root.  Frankly, i’m not sure this is what mac users want their debugging experience to be like."

    Frankly, that isn’t what the debugging experience is supposed to be like. I have 19 years experience with Macs and now OS X and I still don’t bother with all that noodly Unix stuff, yet I can get out of bad situations. If a GUI utility doesn’t fix it, it’s usually a better use of my time to just restore from my complete bootable backup.

    What you have to remember is that while you might have found a stupid problem with a video driver, it’s gotta be a very rare stupid problem with that video driver, otherwise millions of non-Unix-savvy Mac users would have already given up and quit.

  13. MacDuff says:

    I have used OS X like CRAZY for almost two years.

    I have installed tons of apps and hacked the system’s appearance and behavior via GUI shareware.

    I do use Carbon Copy Cloner to backup my boot disk but, short of the occasional operator error with data files, have never had to restore my System.

    I have never suffered such a catastrophic event (*touches wood*).

    I’m STILL running on my original Jag install, updated via Archive & Install all the way up to 10.3.4.

    I don’t know a single line of UNIX. About as close as I get to it is booting to the prompt, running the disk check on the boot disk, then rebooting back into Aqua.

    This is one guy’s experience. YMMV — but I KNOW I’m far from alone in this positive experience.

  14. Brett Johnson says:

    "Archive and Reinstall" …

    "Backup, Reformat, Reinstall" …

    "Delete ALL the preference files" …

    "Always run DiskWarrior/Cocktail/Drive X" …

    "Why get all UNIXy?"…

    I’ve used NeXTSTEP, OpenStep, and Mac OS X as my primary desktop and development environment since 1989 (yes, I started on NeXTSTEP 0.8). I have also developed software for Solaris, SunOS 4, HP/UX, Irix, AIX, OSF1, xBSD, Linux, and a dozen flavors of Unix you probably never heard of. So I guess you could call me "UNIXy" – I love to get all UNIXy.

    Yes, 10 hours to isolate a problematic kernel extension is a long time. However, no amount of disk diagnostics, reformatting the drive, deleting preference files, or re-installing the OS would fix this particular problem. The best you could hope for would be reinstalling a previous version of the OS to avoid the broken 10.3.4 ATI kext. All the other mentioned "solutions" would preserve or reinstall the same broken driver.

    When running NeXTstep and OpenStep, I effectively used the same installation of the OS from 1990 until 2002. I upgraded the OS revision-by-revision from 1.0 until 4.2. Although I formatted newer and faster disks for use in newer and faster machines, I simply copied my installation from my old disk to the new one (using dd or ditto).

    I have formatted my disk, and "reinstalled" Mac OS X exactly twice since 10.0:

    Once was to replace Developer Previews of OS X with 10.0.

    The second time was after a catastrophic disk failure made recovery of the system files untenable.

    What Matt has brought to light is that the Apple Service techs (especially telephone support) are not really interested in finding the root cause of problems. The length of Matt’s 10 hour troubleshooting session can most likely be attributed to his relative inexperience in the OS X boot process (BTW SystemStarter is all Apple – it didn’t exist in NeXT’s OSes). Theoretically, an Apple Service tech practiced in troubleshooting boot problems should probably have narrowed the problem down in under an hour (or two).

    However, no call center service manager will allow a technician to spend two hours on a single call. By simply telling the caller to "back up your data, format the disk, and reinstall the OS", the time-consuming onus is now thrust back upon the user.

    This is a very Windows-like support strategy. MS Windows does experience system rot, system file corruption, registry corruption, and malware infestation. In many of these cases, a cleanroom reinstallation is the most sensible solution.

    However NeXTstep, OpenStep, and Mac OS X do not suffer maladies of that nature. (Not that they suffer no maladies – they have plenty of unique "challenges" of their own.) In this environment, proper and useful troubleshooting skills can help isolate the problems much more effectively than the sledgehammer approach.

    Apple customers that have coughed up the dough for AppleCare should get this level of competent troubleshooting from the Apple experts. Ideally, Apple call center experts should have the ability to remotely administer problematic machines. This would simply require instructing the user to enable remote manufacturer administration (disabled by default for security reasons). Obviously, this doesn’t help for boot problems. However the Apple stores and Apple-certified service retailers can provide the hands-on technical help. They were called "Mac Gurus" or something.

    The Mac OS X startup sequence minimizes the role of /etc/rc* to the most fundamental system bootstrap. SystemStarter helps "organize" the rest of the OS startup. Prerelease versions of OS X had newer BSD-style startup directories ordered numerically. SystemStarter’s dependency mechanism is easier to manage (from a makefile/fink/rpm style point of view).

    I think SystemStarter should be enhanced with "binary-search" troubleshooting capability. When invoked in troubleshoot mode, SystemStarter would drop crumbs during the boot process: "Got here" and "Expect to get Here". If during a boot, SystemStarter encounters these crumbs, it means that the boot process never reached the "Expect to get Here" point. It should redrop a new "Expect" crumb half-way between "Got Here" and the failed Expect point, then continue booting. If the boot sequence arrives at the "Expect to get Here" point, it should redrop a new pair of cookies.

  15. Coombs says:

    Great detective work Matt !! While I agree that it would be time-saving to just ‘archive and install’, Matt’s approach is commendable in that it points out the exact problem. A similar approach on a number of different machines, particularly older ones such as the Lombard, would then identify if only a subset of extensions are the usual culprits. One may then be able to fashion a simpler fix to such problems.

    Oh, one more thing. The Lombard does not have firewire ports-so no target disk mode is possible. Once again-great job Matt!

  16. Brett Johnson says:

    I also found that my systems seem to be much more stable than other OS X users. I think I can attribute this to one or both of the following:

    I NEVER run Classic. NEVER.

    I modified /etc/crontab to run daily, weekly, monthly maintenance during working hours. By default, they run in the middle of the night, but my machines are asleep then.

  17. Thomas Lunde says:

    You wrote:

    After adding back all the other kexts, it would seem that the ATIRage128.kext module was responsible for the hang. Presumably, this also explains why the boot CD crashed, as i believe it’s a stock driver. One also wonders how i’m getting color video with no kernel extension for the video hardware, but, whatever 🙂

    I believe that the ATIRage128.kext only provides acceleration support, not basic functionality. The NDRV covers the basic functionality.

    I have a Wallstreet I machine (250mhz, 13.3" LCD). It has a non-Pro ATI Rage chipset. Thus, it works under Panther (‘tho unsupported) because the ATI Rage NDRV provides enough functionality to get color video. Graphics benchmarks show that it is very slow. Forcing the ATI kext for the Rage Pro chipset (which would, if I had a Wallstreet II machine, be the appropriate acceleration kernel extension) to load results in strange color shifts and lots of video artifacts. They do, however, appear at a much faster rate! 😐

    Thanks for the detailed post on the OS X boot sequence. I’ve bumped across SystemStarter a couple of times (e.g. when wanting a MTA to run at boot on a OS X client machine) but never had the time to properly explore it.

  18. dood says:

    Here’s a handy tip for next time, in response to Brett Johnson:

    try SystemStarter -nd, it runs through its usual process in a verbose debug mode, without actually kicking off all the usual processes. Very handy for understanding what it is doing.

  19. Johnathan says:

    I’ll admit I would have fixed it the "Mac way" for lack of a better term. More than likely I would have tried Archive/Install first, and if that wasn’t sufficient, booted it in safe mode and made a backup of all the data then reformat/reinstall fresh.

    But isn’t it nice that you can also fix it the "Unix way" now? At least you have the option of doing so.

    (BTW, Brett, what you describe in the comment above as a "binary search" sounds a lot like what Conflict Catcher did in OS 7-9. Basically it kept trying various combinations of extensions (control panels, inits, etc.) until one of them loaded properly. It’d be nice if Apple would implement the same thing for OS X, but I think their official position is that they are not responsible for *.kext that are non-Apple provided. It’d be a decent opportunity for Aladdin or another third party developer though (to develop a similar tool).)

  20. Mark Twomey says:

    "One also wonders how im getting color video with no kernel extension for the video hardware, but, whatever :)"

    It’s the stuff in NDRV. An NDRV is a minimalist driver stored in the NVRAM of Mac video cards, which allows for minimal display functionality but no acceleration.

    Don’t rule out the fact that it could be some bad RAM in the system. Problems which might not have been apparent in MacOS 9 & Jaguar could very well show up in Panther. I’d say hit it with memtest, which you can get from:

    http://friskythecat.tripod.com/

    ..and see if that coughs up any hairballs for you.

  21. klam says:

    My sister had a similar problem although we were not able to get to the root cause as it seems you were able to. Fortunately, we were able to boot from a CD (yes, I know you tried that) and use the Archive and Install option. It worked really well, and in your case, I think that a bootable external hard drive would have helped you do something similar.

    In reference to your last question, I think that the answer is ‘no, users should not have to go through this level of troubleshooting, but for most instances, they should be able to fix these sorts of things themselves.’ Granted, it sounds like this was an isolated case and the guys at Apple did make a good attempt to avoid problems (they probably got the driver from ATI and did some rudimentary testing). However, Apple makes mistakes too and has provided numerous easy ways of recovering from a catastrophe.

    Anyway, here is a great website for finding answers to your OS X questions and a page that provides useful troubleshooting steps:

    http://www.macosxhints.com

    http://www.macosxhints.com/article.php?story=2004011205473937

  22. randy chariker says:

    i’ve run into this same problem on a lombard…the ram had come loose…if a laptop was running fine one minute then ka-pow the next …grill the owner about his laptop carrying(?) habits!…always do the simplest thing first

  23. AppleGuy says:

    Why did this thing ever get published. Trying to get the greatest Desktop OS on the planet (right now) to run on 5 year old hardware and then complaining about it is just nuts. Now, if you had this kind of trouble with a year old Mac, then I could understand. Can you also publish an article on a successful install of Windows XP on some outdated PC hardware. Yes, call me a Mac fan boy, but come on, let’s get back to some common sense.

  24. Ryan Peterson says:

    Sometimes I am ashamed to be a Mac user. Those of you who simply flame Matt for his comments should be ashamed. I think he makes a valid point about OS X. While sometimes there is an easier way, he went in the hard way and I was impressed. I personally love to tinker sometimes, and with OS X it is now possible, but as pointed out by many, not always necessary. Having had some low-level issues with OS X myself, I would love to see a better, cleaner way to work with the system on a lower level. So the guy dove in head first, so what. So what if it is not the traditional Mac way; I found the article interesting and informative about the OS X boot process. Don’t be so cynical and cruel, people, it’s not good for anyone. Learn what you can and move on; that is what it is all about.

  25. Joh Momma says:

    You have a very specific, extremely isolated issue. To be critical of an entire platform due to your one isolated experience is ludicrous. There are many far simpler solutions to this problem, but idiots that are too smart for their own good start digging around where they needn’t. Maybe you should read about Mac troubleshooting (an area you are sadly inexperienced with) instead of applying your knowledge from some other platform that is not applicable in the same way. You may know a lot about linux/unix, but that really doesn’t mean that you know shit about mac os.

  26. Rosyna says:

    Matt, I think you were looking WAY too much into the problem and making the fix far more difficult than it actually is.

    You could have skipped a large amount of debugging by just typing kextstat in the terminal, looking at the list and then seeing which ones didn’t start with com.apple and removing them.

    Also, if you had another machine you could just ssh in and have a look in /Library/Logs (for crash reports and console.log) or /var/log (for system.log). They would have very likely said something.

    My guess? Prebinding failed. Downloading and installing the 10.3.4 combined update would have fixed that issue. Why would it work then when disabling some kexts? Well, disabling them probably caused different code paths to be followed.

    OS X takes an EXTREMELY long time to boot off a CD (upwards of 10 to 20 minutes on some computers). The only way to determine if it is actually booting is to boot off the CD into verbose mode (insert CD, reboot, hold option, wait for CD to appear, click on CD, hold command v, then click on the arrow or press return).

  27. <data masked> says:

    Lombard = "not supported."

  28. Thomas L. Ferrell says:

    A little searching on the web is often the best way to find the solution to a Mac problem. For those who are familiar with Mac tinkering, the web site http://www.xlr8yourmac.com is a great resource. For some history on the Lombard and its initially unsupported (earlier OS X) ATI Rage LT Pro chip, see

    <http://www.xlr8yourmac.com/OSX/osx_ragepro_driver_tip.html>

    Supposedly, OS X version 10.3 added graphics acceleration for the Lombard video chip set. This link showed up under Google as the second item after a search for "OS X" and Lombard. Note that problems were observed by Lombard owners under 10.2.8 if more than 256MB of RAM was installed. See

    <http://musox.com/article38.html>

  29. Matt Evans says:

    Wow this has really brought people out of the woodwork 🙂

    A few comments

    1. The powerbook is a G3 model, with built-in firewire and no SCSI. As far as I know, that makes it a Pismo. The graphics chipset is a RageProM3, according to system profiler.

    2. This isn’t a diatribe against Mac OS X. I love mac OS X, it is worlds better than OS 9. The point of this posting was to chronicle my experience trying to fix a relative’s powerbook. I found that the nature of the problem was such that traditional Mac fixit techniques were ineffective. The gap between the usual approaches and what I actually needed to do to resolve the problem was quite large, and indicates that consumer operating systems – mac os x included – have a long way to go before they really "just work" all the time.

    3. I’ve seen lots of suggestions of what people think happened. Please keep in mind the following:

    – 10.3.4 was already installed

    – booting from the CD froze in the same logical spot in the process – the spinning cursor STOPPED SPINNING and never started again

    – I used kextstat heavily to help troubleshoot this. However, this was an apple-supplied kext (even if it was just repackaged ATI code), and many essential and non-essential kexts have com.apple in their name. Using a bundle name of com.apple.* is not effective in determining which kexts are critical, optional, or likely to have problems. Only inspecting the plist files is.

    – I checked the syslog entries first (I’m a unix person, remember); they were not helpful. The error wasn’t occurring on kext load – only when the windowsystem tried to use the accelerated path apparently provided by the loaded kext. I even did a kextload -t on all of the extensions – try this on your mac, you may be surprised at some of the errors you see. None of the errors I got, however, pointed at ATIRage128.kext or any of its dependencies.

    – the TIL was helpful for understanding what SystemStarter and kextd looked for, as were the man pages for both of these. I don’t mean to imply that apple did a bad job documenting this stuff, although it was a lot of diving into docs compared to what i think the expectation is for fixing macs. It was great of apple to have included man pages for SystemStarter, kextd, kextload, kextstat, and so on. I could easily see many of those tools not having much in the way of available documentation, as realistically they are uninteresting to most users, even power users.

    Thanks for all of the feedback. I’ve got some good URLs for tools to try for future mac problems, and firewire target mode sounds really cool, so i’ll have to go investigate that at some point.
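The plist inspection mentioned in the comment above can be sketched in Python. It is a rough illustration, not the procedure Matt actually followed. It assumes the usual bundle layout of `<name>.kext/Contents/Info.plist` and reads three keys: `CFBundleIdentifier`, `OSBundleRequired` (which, as I understand Apple's kext documentation, marks a kext as needed during early boot) and `OSBundleLibraries` (its declared dependencies). `write_demo_kext` and every `com.example` identifier are hypothetical, included only so the sketch runs anywhere.

```python
import plistlib
import tempfile
from pathlib import Path


def kext_summary(kext_path):
    """Read <kext>/Contents/Info.plist and return the keys of interest."""
    info_path = Path(kext_path) / "Contents" / "Info.plist"
    with open(info_path, "rb") as handle:
        info = plistlib.load(handle)
    return {
        "identifier": info.get("CFBundleIdentifier"),
        # None here means the key is absent from the plist.
        "required": info.get("OSBundleRequired"),
        # OSBundleLibraries maps dependency identifiers to versions.
        "depends_on": sorted(info.get("OSBundleLibraries", {})),
    }


def write_demo_kext(parent_dir):
    """Create a made-up kext bundle (hypothetical values) for demonstration."""
    contents = Path(parent_dir) / "Demo.kext" / "Contents"
    contents.mkdir(parents=True)
    info = {
        "CFBundleIdentifier": "com.example.driver.Demo",
        "OSBundleRequired": "Safe Boot",
        "OSBundleLibraries": {"com.example.lib.Support": "1.0"},
    }
    with open(contents / "Info.plist", "wb") as handle:
        plistlib.dump(info, handle)
    return contents.parent


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(kext_summary(write_demo_kext(tmp)))
```

Pointing `kext_summary` at each bundle under /System/Library/Extensions would give a per-kext view of what is marked as boot-critical and what depends on what, which is the information a bundle name alone does not provide.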

  30. kenh says:

    I have over two years on my iBook, now on a Jaguar 10.2.8 install (not even a clean install over the original), and I have never had anything to do with UNIX. I repair disk permissions occasionally, and I run Virtual PC on it, which really sucks up the memory, but it still works.

    I do agree, though, with someone’s comment about Classic. I keep Classic only because I still run Photoshop version 3 (copied from an earlier clamshell iBook), and that version has to be 10 years old.

    I think Classic is responsible for 90% of everything that can happen on OSX.

  31. macangels says:

    Hi

    I’d just like to say that I am very grateful for the work Matt put in as it has produced the most helpful comprehensible introduction into how Mac OS X boots up and works that I have ever read to date. Good words from Brett too.

    And, note, it required a "non-Mac Guy" to do it.

    I am really sorry for the flame from the so-called "Mac Guys" that went on to disprove the big Mac Myth … that we are all dumb but pretty … by being dumb and ugly!

    For sure, no Apple tech support would have gone further than nuke and paving the HD either and been none the wiser. Most techs have not a clue what is going on in there, it is just TV now.

    In fact, we are no longer even technicians in the true sense, just fitters for whom Disk Warrior is our St Christopher medallion.

    Yeah, sure, if it broke .. jus’ buy a new one and chuck the ole one away.

    Thanks for a real world view – and come Back Cassidy and Green, all is forgiven …

    MA

  32. ckahrl says:

    Now this is really weird. I have a 550 Ti-Book (G4) and my daughter uses my old 333 Lombard G3, both with 512MB RAM. We are both running 10.3.4. Recently I had trouble (it’s a hardware problem of some sort), and so I re-installed the system because Apple told me to. (I lost no files on my non-boot partitions.)

    But while I was doing this, I found that I could no longer boot from an OS 9.1 CD, or boot with the disk the machine came with! I can only boot with 9.2.1 or 9.2.2 from a CD.

    It turns out this is now true of my daughter’s Lombard. So, I’m thinking that maybe installing 10.3.x has somehow altered the firmware. Huh?

    Sometimes, when the Lombard crashes one has to boot from a CD to bless the startup disk or you get a slashcircle.

    Anyway, it is pretty amazing that the system runs on the machines with the old ROM anyway.

  33. macbooks says:

    Interesting read about the problem, and shame on the guys who say it’s always easy to fix, or that you shouldn’t bother. One of the things I miss about earlier macs is that it was always possible to "know" the system. Since there were a fairly limited number of files to keep track of, it was usually possible in OS’s 7-9 to keep a map in your head of the major pieces of the system and what they did. With X, there are literally tens of thousands of files in a basic system install, which means that it takes some serious reading to get an idea of what the pieces are doing, and makes it much tougher to remember where all the pieces go without doing some very involved research.

  34. <data masked> says:

    Revise original post: Pismo = "Supported."

  35. <data masked> says:

    Code is never perfect. OSX started out with a limited feature set and slowly developed to 10.3.4 as user requests and bugs were addressed. That said: Matt has shown an ugly side of OSX. Drag and Drop back-up "Just" doesn’t "work" as it did in the old days. Shove in the CD and reinstall, apparently, is not as foolproof as in the Classic days.

    Mac has moved closer to an IT support model because the "personal computer" fell into the margins. We have a need for Matts now and I have to change my evangelical method. It Just Works is marketing, not reality.