Race With Time – Server Uptime

How long can your system stay up?
We’ll see…

03-09-10 status update: After almost 27 days of solid uptime, the server unexpectedly crashed. I’ve been reviewing the logs, but have been unable to find anything pointing to the cause of the crash. So far, the only suspicious thing I’ve found is the avahi-daemon – it logged an error right about the time the server stopped responding.
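For narrowing a log down to the minutes around a crash, a minimal sketch follows. It assumes traditional syslog-style lines that begin with a timestamp such as "Mar  9 14:05:11"; the function name log_window and the example prefix are made up for illustration and are not taken from the actual logs.

```shell
#!/bin/sh
# Print only the lines of a log file that start with a given timestamp prefix.
# Example prefix: "Mar  9 14:" selects everything logged from 14:00 to 14:59.
# log_window is a hypothetical helper name, not a standard tool.
log_window() {
    file="$1"
    prefix="$2"
    # index(...) == 1 is a literal "starts with" test, so no regex escaping is needed
    awk -v p="$prefix" 'index($0, p) == 1' "$file"
}
```

Something like log_window /var/log/syslog "Mar  9 14:" | grep -i avahi would then show only the avahi-daemon messages from that hour, assuming the system logs to /var/log/syslog.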

The network shares (mounted via AFP with Netatalk) suddenly disappeared on my Mac. This led me to believe that the Mac might have lost its network connection – but that wasn’t the case. Pings and ssh attempts to the server both timed out. All vital signs on the server seemed to be okay – the power was on, fans were spinning, hard drives were spinning, etc. – but there was no response from the keyboard, just a black screen on my monitor, as if the screensaver was on.

Unplugging and replugging the keyboard didn’t help; there was still no way to get anything to show up on the screen, and trying another keyboard didn’t do the trick either. The NIC LEDs were on, but solid, which was unusual. There was no visible hard drive activity… Not really sure what happened.

After 15 – 20 minutes of troubleshooting, and before powering the server off, I realized that the system may have simply locked up or crashed, and it seems that this is what happened.

A few minutes after the hard shutdown, everything seemed normal again. I took advantage of the downtime to install the latest kernel, which had been sitting in my “pending” updates for quite a few days.

So, the race against time starts again.

Current server uptime? 1:12

Help! Deleted /var folder contents

How did it happen?

Thinking that it would help me free up space on my /var partition, I logged on as root and, from inside /var, ran: rm -R *
It took a few seconds, and gave me an error about being unable to delete the “lock” and another folder.

Shortly after that, I tried to run an update using aptitude and apt-get, but got this error:

What am I to do?
Well, for starters – learn my lesson: Never, EVER delete anything from those system partitions without knowing what I’m deleting. Next time, a quick Google search should help me clean out my /var folder.
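For next time, here is a rough sketch of looking before deleting: it lists which entries under a directory take the most space, so that only the real offenders get cleaned up. The function name largest_entries is made up for this example; du, sort, and head are the standard tools.

```shell
#!/bin/sh
# List the ten largest first-level entries under a directory, biggest first.
# Defaults to /var when no directory is given.
# largest_entries is a hypothetical helper name for this sketch.
largest_entries() {
    dir="${1:-/var}"
    # -x: stay on one filesystem, -s: one total per entry, -k: sizes in KiB
    du -xsk "$dir"/* 2>/dev/null | sort -rn | head -n 10
}
```

Running largest_entries /var as root shows where the space actually went. On Debian and Ubuntu, sudo apt-get clean removes the downloaded package files under /var/cache/apt/archives, which is usually a much safer way to reclaim space than rm -R.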

Next thing that I’ll do is run the Ubuntu Live CD and try to recover the system from there. I’ll add results later.


After searching the Internet and browsing the Ubuntu forums without much luck, I found a post which helped me get apt-get and aptitude working.

What I did: Create each folder that apt-get and aptitude were asking for.
For the /var/lib/dpkg/status file, I had to “touch” a new, empty file. Only then did sudo aptitude update work.
$ sudo touch /var/lib/dpkg/status
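The steps above can be sketched as a small script. The directory list is an assumption based on the usual Debian/Ubuntu layout, not the exact set of folders apt-get and aptitude asked for in my case, and restore_apt_skeleton is a made-up name. The optional prefix argument allows a dry run in a scratch directory before touching the real /var.

```shell
#!/bin/sh
# Recreate the directories that dpkg and apt typically expect under /var.
# Assumption: standard Debian/Ubuntu paths; adjust to whatever the error
# messages on your system actually name.
# $1 is an optional prefix for a dry run; leave it empty (and run as root)
# to act on the real system.
restore_apt_skeleton() {
    root="${1:-}"
    for d in \
        var/lib/dpkg/info \
        var/lib/dpkg/updates \
        var/lib/apt/lists/partial \
        var/cache/apt/archives/partial \
        var/log/apt
    do
        mkdir -p "$root/$d" || return 1
    done
    # Create an empty status file only if none exists. An empty file tells
    # dpkg that no packages are installed, which is why the package list is lost.
    [ -e "$root/var/lib/dpkg/status" ] || : > "$root/var/lib/dpkg/status"
}
```

Trying it first with a scratch directory, for example restore_apt_skeleton /tmp/scratch, shows exactly what would be created.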

Now the only problem is that the system no longer “remembers” which programs are installed, but at least apt-get and aptitude are operational once again. From further reading, it seems that after deleting the contents of /var, a reinstall of the operating system may be necessary. This isn’t much of an inconvenience for me, since I’ve been thinking of reinstalling for quite a few days now.