Hard disk failure imminent!

Original picture by http://www.flickr.com/photos/tenorshare-data-recovery/

A couple of weeks ago, Windows 7 informed me that my hard disk was failing and that I should back up immediately. Skeptical of Windows messages about things like this, I made a backup with Carbonite and then decided to run Spinrite on the disk.

Apparently I should not have done that. The better course of action would have been to copy the whole hard disk to a new one first, and only then run Spinrite on the old disk.

By running Spinrite first, I apparently pushed the disk just far enough that it ran out of spare sectors for relocating bad tracks. I found this out after I got a new 1TB disk, placed it in the same computer, and tried to copy the old disk to the new disk.

Copying the disk turned out to be an adventure as well. At first I tried to do it the old-fashioned way: boot into Linux and run dd from the old disk to the new disk. This works great on healthy disks, but by default dd stops at the first read error, so a failing disk cuts the copy short…
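
One way to keep dd going on a damaged disk is to tell it to continue past read errors. This is a minimal sketch rather than what I actually ran; rescue_copy is a hypothetical helper name and the device names in the example are placeholders:

```shell
#!/bin/sh
# rescue_copy SRC DST: copy SRC to DST, continuing past read errors.
#   conv=noerror  keep going after a read error instead of aborting
#   conv=sync     pad short or failed reads with zeros so offsets stay aligned
# A small block size limits how much data is lost around each bad spot.
rescue_copy() {
    dd if="$1" of="$2" bs=4096 conv=noerror,sync
}

# Example (placeholder device names; double-check them before running):
#   rescue_copy /dev/sdX /dev/sdY
```

Unreadable blocks end up as zeros in the copy, so this recovers what it can rather than everything. A purpose-built tool such as GNU ddrescue is generally a better fit for badly damaged disks.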

Finally I ran across CloneZilla and tuxboot. Tuxboot is actually a great utility that lets you turn a CloneZilla Live, DRBL Live, GParted Live or Tux2live installation into a bootable USB or DVD disk, without too much thought. CloneZilla is the open source equivalent of Norton Ghost.

After I downloaded tuxboot and selected the CloneZilla Live installation, the download quickly completed and I had a USB stick to boot from. I took it back to the PC with the failing hard drive, rebooted, and voilà: CloneZilla.

CloneZilla has a lot of options, and there are great step-by-step instructions available here. I chose the disk-to-disk clone option, and after tweaking the options a bit, was able to recover most of the hard disk. There were a number of clusters that CloneZilla complained about not being able to recover, but it seems they are not in any files that I use on a regular basis.

Lessons learned from this:

  1. Make sure you always have a backup. I use Carbonite, but there are several other great options out there.
  2. Use Spinrite for maintenance, not for diagnosis. It may work as a last-ditch recovery attempt, but it’s better to know in advance that your disk is failing than to be notified by Windows and then scramble around.
  3. Don’t keep using a disk that is failing. Get a replacement disk as soon as possible, and use CloneZilla to copy your whole installation.

Forgotten password in Linux

Picture by Freddie the Boy

It happens to me every so often, that I have to do maintenance on a Linux box, and I’ve changed my passwords around and can’t remember the password I used on that particular box (or even the user name…). I always have to hunt around the net, hoping I find something, so I thought I’d capture it on my own blog:

  1. Reboot the computer
  2. At the GRUB or LILO prompt, press escape.
  3. Go to the line that would normally boot, and press e to edit
  4. Go to the end of the command line, and add rw init=/bin/bash to it
  5. Press Enter, then press b to boot
  6. You should now be dropped into a root shell, with no password required
  7. Either set your password with passwd <username>, or see a list of users with cat /etc/passwd.
  8. Reboot again
  9. Happy times!
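
Inside the root shell from step 6, the commands look roughly like this. This is a sketch, assuming a standard /etc/passwd layout; list_users is a hypothetical helper name, and the remount is only needed if the root file system came up read-only:

```shell
#!/bin/sh
# list_users [FILE]: print the login names from a passwd-format file
# (the first colon-separated field of each line). Defaults to /etc/passwd.
list_users() {
    cut -d: -f1 "${1:-/etc/passwd}"
}

# Inside the init=/bin/bash shell (typed by hand, not run as a script):
#   mount -o remount,rw /     # only if / is mounted read-only
#   list_users                # find the account name
#   passwd <username>         # set a new password
#   sync                      # flush the change to disk before rebooting
```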

Hope this helps me (and maybe someone else) in the future…

Ubuntu and the Symantec Backup client

At our company we’re using Symantec Backup Exec to back up all our servers, including some Linux machines. I set up a newer Ubuntu install (9.04) on VMware, and was pretty confident I would be able to get the backup working (using the Legacy agent). Well, that was not as easy as it seemed…

I followed the instructions on installing the legacy agent. After some tweaking here and there, the server was now showing up in the list of legacy agents on the Backup Exec server. However, every time I tried to pull up information on the server, Backup Exec returned an error, saying the server refused connections and may be running out of available network connections. Nonsense. But something was not right apparently.

Googling around for instructions, and suspecting a firewall or security setting, I tried tweaking access to the ports Backup Exec is using. No effect. Then I hit upon something: luckily I had an older Ubuntu server running with the legacy Backup Exec client, and could compare settings. It turns out that Ubuntu installs a hosts file (/etc/hosts) with the following content:

127.0.0.1 localhost
127.0.1.1 fqdn.domain.com fqdn

Why is there a difference in the third octet? On the older server, both lines referred to 127.0.0.1. I decided to change the second line on the newer server to 127.0.0.1 and reboot. After giving everything some time to publicize itself on the network, the server appeared in Backup Exec, which asked for the credentials to use for it. I selected the correct user, and to my astonishment, the complete directory structure appeared.
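
The change to the second line can also be scripted. This is a sketch, assuming a sed that supports in-place editing with -i; fix_hosts is a hypothetical helper name, and it keeps a .bak copy of the original file:

```shell
#!/bin/sh
# fix_hosts FILE: point the 127.0.1.1 entry at 127.0.0.1 instead.
# Only lines that start with 127.0.1.1 followed by whitespace are touched;
# the original file is saved as FILE.bak.
fix_hosts() {
    sed -i.bak 's/^127\.0\.1\.1\([[:space:]]\)/127.0.0.1\1/' "$1"
}

# Example (run as root for the real file):
#   fix_hosts /etc/hosts
```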

I don’t know why there are two different entries for the local machine, but it definitely broke the legacy Backup Exec client. Now I can only hope I never have to use our backups…!

Firefox equivalent of Internet Explorer’s Every Time I visit the Webpage

We have a little rotating web page setup in our break room, and have been using a dial indicator to show our performance in bookings and shipments. However, due to the nature of the set-up (a page, showing a flash file, that is configured by an XML file), it turned out to be necessary in Internet Explorer to use the option “Every time I visit the webpage” on the “Check for newer versions of stored pages:” setting in the Temporary Internet Files and History Settings.

Unfortunately, Internet Explorer 7 still can’t handle CSS properly, so some of the tables looked horrible. Switching to Firefox fixed that problem. But now the old data was showing. And where is that “Newer versions of stored pages” setting in Firefox???

It’s hiding in the config. In the address bar, type

about:config

Then find the setting browser.cache.check_doc_frequency, and change it to 1. This will duplicate the Internet Explorer behavior (as far as loading cached pages goes, mind you!).

The options for this setting are as follows:

Value  Description
0      Check for a new version of a page once per session
1      Check for a new version every time a page is loaded
2      Never check for a new version; always load the page from cache
3      Check for a new version when the page is out of date (default)
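
The preference can also be set outside about:config by adding it to a user.js file in the Firefox profile directory, which Firefox reads at startup. This is a sketch; set_cache_check is a hypothetical helper name and the profile path in the example is a placeholder:

```shell
#!/bin/sh
# set_cache_check PROFILE_DIR [VALUE]: append the preference to user.js.
# VALUE defaults to 1 (check every time a page is loaded).
set_cache_check() {
    printf 'user_pref("browser.cache.check_doc_frequency", %d);\n' "${2:-1}" \
        >> "$1/user.js"
}

# Example (placeholder profile directory):
#   set_cache_check ~/.mozilla/firefox/xxxxxxxx.default
```

This is handy for a kiosk-style machine like the one in our break room, since the setting survives a reset of the preferences through the UI.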

Nagios – how to determine the name of a service in Windows

I’ve recently set up Nagios on one of our test servers, and the Windows client for Nagios allows you to monitor services (whether they are started, stopped, etc.). However, the name of the service to monitor isn’t always the same as the name shown in the Services application in Administrative Tools.

To find out the name of the service, you’ll have to look at the registry:

  1. Open up regedit (Run, regedit)
  2. Navigate to HKEY_LOCAL_MACHINE
  3. Navigate to SYSTEM
  4. Navigate to CurrentControlSet
  5. Navigate to Services
  6. Find the service you plan on monitoring. The name of the node is the name you need to enter on the Nagios server as the name of the service.

Upgrade Ubuntu server 6.06 LTS to 8.04 LTS

When upgrading an Ubuntu server to 8.04 LTS, you should use the new and improved server upgrade system. Use the following steps to activate this upgrade process:

  1. Enable the “dapper-updates” repository
  2. Install the “update-manager-core” package. Dependencies include python-apt, python-gnupginterface and python2.4-apt.
  3. Run the command “sudo do-release-upgrade” in a terminal window
  4. Follow the prompts on-screen.
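
On the command line, steps 2 to 4 come down to the following. This is a sketch with a dry-run switch; run and upgrade_to_hardy are hypothetical helper names, and it assumes the dapper-updates repository from step 1 is already enabled in /etc/apt/sources.list:

```shell
#!/bin/sh
# run CMD...: execute the command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

upgrade_to_hardy() {
    # 1. refresh the package lists (picks up dapper-updates)
    # 2. install update-manager-core, which provides do-release-upgrade
    # 3. start the upgrade, then follow the on-screen prompts
    run sudo apt-get update &&
        run sudo apt-get install update-manager-core &&
        run sudo do-release-upgrade
}

# Preview the commands without changing anything:
#   DRY_RUN=1 upgrade_to_hardy
```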

Cursor keys not working in Ubuntu 8.10 VMware client

After last week’s problem with the Terminal Services client, I also experienced problems with the cursor keys in the VMware client for Ubuntu 8.10. This was particularly annoying, since I use Quicken in a VMware machine, and the selection of categories is much easier with the cursor keys than with the mouse.

It turns out that this is a bug in Ubuntu 8.10 (Intrepid). None of the patches since October 2008 have addressed this issue, but the manual fix is very simple:

  • Open a terminal window
  • Type the following in the terminal:

    # >> appends, so any settings already in the file are kept
    echo 'xkeymap.nokeycodeMap = true' >> ~/.vmware/config

After entering this, you have to restart your VMware client session. If that doesn’t work, try restarting the VMware server. Since this is a client setting, though, it shouldn’t affect the server or be affected by server settings.

Cursor keys not working in Ubuntu Terminal Services client

I’m using VPN and the Terminal Services client pretty frequently to access my computer at work. As discussed in this article, the Ubuntu VPN setup is much simpler in 8.10. However, when using the Terminal Services client, I had some problems with some of the function keys, in particular the cursor keys.

Now, being an old Unix hack, I managed to use the hjkl keys to navigate around the vi screen, but in some of the applications, this became an issue. With some Googling, the solution turned out to be fairly simple.

When you start up the Terminal Services client, go to the Local Resources tab. Under the keyboard heading, select en-us instead of us as your keyboard language. Now connect to the remote computer, and all the keys should function as intended.

Hope this helps somebody!

Ubuntu 8.10 connect to Cisco VPN through vpnc

One of the things I need to do with my home machine is occasionally connect to our VPN at work. In 6.06 LTS this required downloading the Cisco VPN client, compiling it, installing it, and hoping it would still work after the next kernel update. On top of that, you had to run a script to create the VPN connection.

In 8.10 Intrepid Ibex this is much simpler, and much more elegant. First, you need to install the VPN Connection Manager (VPNC) package. When you do this through Add/Remove Applications, it should install three packages:

  • vpnc
  • resolvconf
  • network-manager-vpnc

The first two are essential, but the third one is the kicker in 8.10: it allows you to manage your VPN certificates, and choose which connections to make and break.

After you’ve installed these three packages, do the following:

  • Right-click on the Network Manager applet.
  • Choose Edit connections
  • Click the VPN tab
  • You should have the options to Add a connection manually, or to Import a VPN certificate.
  • Since our network admin provided me with a certificate, I chose Import, and selected the certificate file.
  • The import will try to get as much information as possible out of the selected file. In most cases, you need to provide the group and user password.
  • If the group password is encrypted, it can be determined by taking the encrypted string and running it through the Cisco decoder at http://www.unix-ag.uni-kl.de/~massar/bin/cisco-decode
  • Save your changes
  • Close the Edit Connections screen
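
To get the encrypted group password for the decoder, it helps to pull it out of the file first. This is a sketch, assuming the file from your network admin is a Cisco .pcf profile that stores the value under the enc_GroupPwd key; pcf_value is a hypothetical helper name:

```shell
#!/bin/sh
# pcf_value FILE KEY: print the value of KEY from a KEY=value style file.
# Matching is case-insensitive, and carriage returns are stripped because
# these files often use Windows line endings.
pcf_value() {
    grep -i "^$2=" "$1" | head -n 1 | cut -d= -f2- | tr -d '\r'
}

# Example (placeholder file name):
#   pcf_value work.pcf enc_GroupPwd
```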

You should now be able to left-click on the Network Manager applet, select VPN connections, and click on the newly added connection. The Network icon will show a circling star for a couple of seconds, and then indicate that the VPN connection is established by showing a yellow padlock in the bottom right of the icon.

Disconnecting is just as easy: left-click on the applet, select VPN connections, and select Disconnect VPN.

Ubuntu 8.10 installation – GRUB error 18

Over the Christmas break I installed Ubuntu 8.10 on my main machine. The existing installation was not upgradeable without some serious wizardry (the /boot partition was too small, and increasing that on a full disk is not easy), so I decided to do a fresh install.

After going through all the installation steps, and booting up Ubuntu 8.10 for the first time, I was greeted with a GRUB error 18. Some Googling revealed that this was caused by the hard disk being too large for the BIOS to handle. And there was even a helpful post that described a three-step process:

  • Set your hard disk for LBA mode
  • Install Ubuntu
  • Set your hard disk back to normal

Unfortunately, this didn’t work for me. The installation resulted in the same GRUB error. However, there is an easier fix.

GRUB error 18 actually means that the kernel cannot be found in the first 1023 cylinders. You can change that by creating a /boot partition that is completely within those first 1023 cylinders. So, after the first try at installing, and failing with the GRUB error, try this:

  • Restart your machine and boot from the Ubuntu CD.
  • Install Ubuntu as normal, until you get to the partition information.
  • Select Manual from the partition options.
  • The only thing you need to change is the main partition (/). Delete the one that is on the disk now. The partitioner may tell you it needs to write changes to the disk – by all means, let it write them.
  • Next, create a partition at the very beginning of the hard disk, of sufficient size but not too big. I decided on 1GB, but it may be better to go with 512MB or even smaller (though not too small, since I couldn’t do an upgrade on my 128MB boot partition). Choose ext2 as the file system; you won’t need journaling or anything fancy on that partition. Set the mount point to /boot.
  • Finally, create the main partition, covering the remainder of the hard disk. Make the file system ext3 – you want the journaling etc. on this one.
  • You should now have a /boot partition at the beginning of your disk, a / partition for most of the rest, and a small swap partition (about twice the size of your memory). If not, you need to manually adjust the partitions until you have all three.
  • Continue with the rest of the installation.
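
For a sense of how much room the first 1023 cylinders give you: with the classic BIOS translation of 255 heads and 63 sectors per track (an assumption; your BIOS may report a different geometry) and 512-byte sectors, they cover about 8.4 GB. A small sketch of the arithmetic, with chs_bytes as a hypothetical helper name:

```shell
#!/bin/sh
# chs_bytes CYLINDERS HEADS SECTORS: bytes addressable with that geometry,
# assuming 512-byte sectors.
chs_bytes() {
    echo $(( $1 * $2 * $3 * 512 ))
}

chs_bytes 1023 255 63   # prints 8414461440
```

Under that assumption, a /boot partition of a few hundred megabytes at the very start of the disk sits comfortably inside the limit.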

Your mileage may vary, but this worked for me (it’s also the trick I used when installing 6.06 LTS). Hope this helps someone!