Sunday, March 23, 2008

Hardware woes, a day in a life

An adventure in (literally) many pieces...
  • It starts with power-on problems:
    • This morning, my main computer went bust. It hung, and when I tried to reboot it, it simply refused to turn on.
    • Unplugged the power cord, waited a minute for it to cool: nothing (not even the fans).
    • OK, it's the power supply -- that's simple, I had a spare one.
    • A few minutes later (open the case, find the spare): Hmmm... the old spare has only the big flat motherboard connector. The installed one also had another small rectangular one. So the spare is older than I need...
    • No problem: a quick drive to a local store (buying food for the cat on the way), left 130 NIS (~$37) at the store and got an even nicer PS (with a big downward-facing fan).
  • The power-supply is fixed:
    • Connected it to the MB, tested power-button, fans are moving, good, problem solved.
    • Yeah, sure... the PS is good, but the motherboard looks dead -- no beep, no signal to the monitor.
    • Disconnected all disks, removed an unnecessary card, repeated -- same result.
    • So it looks like the MoBo is a goner as well.
  • Let's get a new MoBo:
    • Went to an even more local store (identity hidden to protect the guilty).
    • Asked for a replacement (for a Pentium 4, 3 GHz); turns out it would cost 400 NIS (more than $115).
    • Maybe it's better to simply buy a new MoBo+CPU+RAM and have an upgrade? I've always wanted an Intel Core 2 Duo (yes, kvm here we come ;-)
    • The seller shows me several options. I insist on an Intel chipset (free drivers) and he tries to sell me an Intel board, with an Intel graphics chipset (yay) but some obscure Realtek Gigabit adapter. That's not what I want. He tries to convince me that "everybody" uses Realtek network chips and that it's my only way (I don't really like that).
    • He got busy with more important customers and I quickly went back to the first store. Back to the original plan (a replacement MoBo).
    • They only had a single model of those socket 478 boards. Some no-name with a SiS chipset (Oouuch) for ~300 NIS. Problem: this board has no SATA. No-Problem: they have a 45 NIS SATA/PATA adapter. The seller reminded me about the thermal paste for the CPU (an extra 10 NIS) -- all totalling 364 NIS (~$100) including VAT (the board alone is some 25% less than the first guy wanted for a similar MoBo).
    • Back home, reconnect quickly. I don't have a spare connector for some old PATA drives I had (the SATA converter occupies its location). Never mind, they are old backups. I'll look at this later.
    • Test. Good. Now we are stuck in GRUB!
  • Groovy adventures:
    • You see, due to the SATA/PATA flip-flop, drives changed positions and that killed grub.
    • That should be one rescue CD away...
    • Boot old Knoppix (3.8.2) twice, no way, bad media.
    • In my CD pack I also had an old RIP; booted it, and it has "Boot to GRUB" in its menu. Yeah.
    • Doesn't work. Tested another menu item. Looks like bad media. Again? No, it cannot be the drive. It must be old media.
    • Next is Fedora-8-KDE-Live. This boots OK.
    • As expected grub-install doesn't like to be run from the wrong root directory. A little more work:
      vgscan; vgchange -a y VolGroup00; mount /dev/VolGroup00/fc8_slash /mnt/root
    • Now chroot into /mnt/root; grub-install still doesn't like me. Never mind -- run grub and use setup from the prompt.
    • It took me a few tries to get this right (set the "root" directive before "setup") but finally it's done without errors.
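For the record, here is a minimal sketch of that root-then-setup order for GRUB legacy. The device names (hd0,0) and (hd0) are placeholders I made up for illustration, not the ones from my box; at the grub prompt, "find /boot/grub/stage1" (or "find /grub/stage1" when /boot is its own partition) should tell you the right partition. The helper name grub_setup_commands is hypothetical too.

```shell
#!/bin/sh
# Emit GRUB legacy commands in the order that worked: "root" first, then "setup".
# Both arguments are examples only; substitute your own devices.
grub_setup_commands() {
    boot_part=$1    # partition holding the GRUB stage files, e.g. "(hd0,0)"
    target_disk=$2  # disk whose MBR receives GRUB, e.g. "(hd0)"
    printf 'root %s\nsetup %s\nquit\n' "$boot_part" "$target_disk"
}

# Print the commands so they can be reviewed before anything is written.
grub_setup_commands '(hd0,0)' '(hd0)'

# To apply them for real (as root, inside the chroot):
#   grub_setup_commands '(hd0,0)' '(hd0)' | grub --batch
```

Printing first and piping into grub --batch only afterwards keeps the destructive step explicit, which would have saved me a try or two.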
  • Boot to the new system
    • And you thought it would be that easy...
    • Grub boots the kernel... "No Volume groups found"....."panic"
    • OK, so maybe putting '/' in a VG wasn't the smartest idea, but what's the problem?
    • Could it be that my games with grub somehow trashed the VG partition? Cannot be.
    • Boot in a hurry back to Fedora-Live, vgdisplay shows everything is OK (Pheeeuuuu). Maybe the stupid machine remembers this VG was somewhere else? Check with vgdisplay -- no, everything looks fine with the new location.
    • OK, another test -- vgexport and vgimport again -- that should erase all previous memories of the old location. Boot again... Same problem. A few more flip-flops like that and then it hits me -- it must be the stupid initrd.
    • Yes, the error message wasn't during the kernel boot but shortly afterwards when Fedora runs "nash".
    • Boot again to Fedora-Live, chroot again, mkinitrd, boot again...
    • Now the stupid beast tries to mount /dev/root without any reference to VGs and panics right away. That's an improvement (I've saved the old initrd) since it proves this is where the problem is located.
    • Boot again to Fedora-Live, now configure networking and STFW a bit for more details. Looks like I'm OK, but some options should be passed to do it right.
    • I decide to go the easy way: chroot to the OS and rpmquery --scripts kernel to see how the post install script runs mkinitrd. Well, they do it via new-kernel-pkg.
    • Now that's nice:
      new-kernel-pkg --mkinitrd --depmod --update 2.6.24.3-34.fc8
    • Boot to an almost working OS.
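Put together, the live-CD rescue sequence above looks roughly like the sketch below. It is a dry run by default: it only prints the commands. The VG/LV names and the kernel version are the ones from this post; the run and rebuild_initrd helpers and the bind mounts of /dev, /proc and /sys are my additions (an assumption that the tools inside the chroot need them), and if /boot lives on its own partition it has to be mounted under the chroot as well.

```shell
#!/bin/sh
# Dry-run sketch: prints the rescue commands instead of executing them.
# Set DRY_RUN=0 and run as root from the live CD to execute for real.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

rebuild_initrd() {
    vg=$1; lv=$2; kver=$3; mnt=$4
    run vgscan
    run vgchange -a y "$vg"
    run mkdir -p "$mnt"
    run mount "/dev/$vg/$lv" "$mnt"
    # Assumption: bind mounts so tools inside the chroot can see devices.
    # A separate /boot partition would also need mounting at $mnt/boot here.
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$mnt/$fs"
    done
    # Same call the kernel package's post-install script makes.
    run chroot "$mnt" new-kernel-pkg --mkinitrd --depmod --update "$kver"
}

rebuild_initrd VolGroup00 fc8_slash 2.6.24.3-34.fc8 /mnt/root
```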
  • Final tidbits:
    • Network won't come up because /etc/sysconfig/network-scripts/ifcfg-eth0 binds the interface to the HWADDR -- fix it (I also maintain /etc/ethers, so fix that as well).
    • Update time on the MoBo -- I did this during one of the boots to Fedora-Live: system-config-date, select the timezone, exit and then
      ntpdate 0.fedora.pool.ntp.org; hwclock -u -w
    • No swap. The swap used to be on one of the old PATA devices I had. The main VG is full, but its physical disk is not, so: fdisk, create new partition, another VG (bad for performance, but who cares now...), a new LV in this VG, mkswap, update /etc/fstab and we are done.
    • Audio: system-config-soundcard does not know how to unload the previous Intel drivers (who loaded them anyway? maybe /etc/modprobe.conf, but they are not there anymore). An easy one -- fuser /dev/snd/*, kill them all (pulseaudio, arts, amarok, kmix), rmmod recursively like an idiot (rmmod/modprobe could have a useful recursive option...). Rerun system-config-soundcard. No sound. Easy. Speaker cable was out. Still no sound. Easy. Mixers were at zero volume. Now everything is good.
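The HWADDR fix from the first tidbit is a one-liner with an editor, but a small sketch may help the next time a board gets swapped. The set_hwaddr helper is hypothetical, and both MAC addresses in the demonstration are made up; the demo runs on a scratch file so nothing under /etc is touched.

```shell
#!/bin/sh
# Rewrite the HWADDR= line of an ifcfg-style file to a new MAC address.
set_hwaddr() {
    cfg=$1; mac=$2
    tmp=$(mktemp) || return 1
    # Edit via a temporary file so a failed sed never truncates the config.
    sed "s/^HWADDR=.*/HWADDR=$mac/" "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Demonstration on a scratch file with a made-up old address.
demo=$(mktemp)
printf 'DEVICE=eth0\nHWADDR=00:11:22:33:44:55\nONBOOT=yes\n' > "$demo"
set_hwaddr "$demo" 'AA:BB:CC:DD:EE:FF'
cat "$demo"
rm -f "$demo"

# On the real system (as root), assuming the new NIC shows up as eth0,
# the address can be read from sysfs:
#   set_hwaddr /etc/sysconfig/network-scripts/ifcfg-eth0 "$(cat /sys/class/net/eth0/address)"
```

The matching entry in /etc/ethers still needs the same update by hand.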

  • Final thoughts:
    • Damage:
      • Hardware costs: ~500-550 NIS
      • Lost work day
      • Frustration, frustration, frustration...
    • Root partition in a VG comes with some complexities (and some conveniences as well).
    • Boot from arbitrary disks is hindered by:
      • Grub
      • Initrd
That's all folks, it's been a long day.