When managing a production server, one of the most important concerns is the tradeoff between server downtime and keeping the server's software updated.
While most updates can be applied with little to no downtime, a kernel update is always problematic, since it typically requires a full reboot and therefore significant downtime. To avoid that, many servers do not apply kernel updates as often as they should, especially cheap rented servers.
On the other hand, some server owners like to boast of a high uptime. While that might look good, it is in fact quite the opposite: a high uptime means the server's software, the kernel in particular, might not have been updated!
So I will introduce kexec, along with a benchmark to show how it can reduce downtime by shortening reboot time. But first, let's look at how a Unix-like system boots and shuts down.
In a typical boot/shutdown cycle, these are (approximately) the steps the machine goes through:
- Boot
  - BIOS stage
  - Bootloader load
  - Kernel load
- INIT
  - Kernel init
  - Hardware initialisation
  - Checking and mounting partitions
  - Start services
- Shutdown
  - Stop services
  - Sync discs
  - Unmount partitions
  - Hardware stop
  - Hardware power off
By using kexec, some of those steps are skipped, since it switches kernels from within a running system. These are (approximately) the steps for a kexec reboot:
- INIT
  - Kernel init
  - Checking and (re)mounting partitions
  - Start services
To prove that reboot time decreased, I wrote a small bash script to measure downtime (testTime.sh) and tested it on my personal server, which runs Gentoo.
To use the provided script, you must run it after Apache has been stopped, with:
time ./testTime.sh SERVER_WWW_URI > /dev/null 2>&1
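The contents of testTime.sh are not shown here. As a rough idea of what such a script could look like, the following is a minimal sketch that assumes the measurement is simply "poll the URL until the web server answers again". The function names, the use of curl and the one-second polling interval are assumptions made for illustration, not the original script:

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for testTime.sh (not the original script).

# probe URL
# Succeeds once the server answers with a successful HTTP status.
probe() {
    curl --silent --fail --max-time 2 --output /dev/null "$1"
}

# wait_for_up URL [INTERVAL_SECONDS]
# Blocks until probe succeeds, sleeping between attempts.
wait_for_up() {
    url=$1
    interval=${2:-1}
    until probe "$url"; do
        sleep "$interval"
    done
}

# When a URL is given on the command line, wait for it to come back.
if [ "$#" -gt 0 ]; then
    wait_for_up "$1"
else
    echo "usage: testTime.sh SERVER_WWW_URI" >&2
fi
```

Started right after Apache has been stopped and wrapped in `time`, a script like this returns as soon as the site answers again, so the `real` figure approximates how long the site was unreachable.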
The commands I used for this benchmark (via SSH) were:
Normal reboot: /etc/init.d/apache2 stop && echo "Now you can exec time measurement script" && reboot
kexec reboot: kexec -l KERNELIMAGE --reuse-cmdline && /etc/init.d/apache2 stop && echo "Now you can exec time measurement script" && kexec -e
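When the load step and the jump into the new kernel are separated in time (for example, staging the kernel now and rebooting in a later maintenance window), it can help to confirm that an image is actually staged before stopping services. The kernel reports this in /sys/kernel/kexec_loaded, which reads 1 once an image has been loaded. The sketch below keeps the same load, stop, exec order as the command above; the helper names and the optional path argument are hypothetical and exist only for illustration:

```bash
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical, not part of kexec-tools.

# kexec_is_loaded [FLAG_FILE]
# Succeeds when a kernel image is staged. The optional argument exists
# only so the check can be tried against an ordinary file.
kexec_is_loaded() {
    [ "$(cat "${1:-/sys/kernel/kexec_loaded}" 2>/dev/null)" = "1" ]
}

# kexec_reboot KERNELIMAGE
# Load the image, verify it is staged, stop Apache, then jump into it.
# Must be run as root; nothing here calls it automatically.
kexec_reboot() {
    kexec -l "$1" --reuse-cmdline || return 1
    kexec_is_loaded || { echo "no kernel staged, aborting" >&2; return 1; }
    /etc/init.d/apache2 stop && kexec -e
}
```

With these functions sourced in a root shell, `kexec_reboot KERNELIMAGE` would replace the one-line command, refusing to stop Apache if no kernel is staged.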
These are the results I got:
Full reboot:
real 1m21.996s
user 0m3.241s
sys 0m2.833s
kexec reboot:
real 0m31.415s
user 0m1.872s
sys 0m1.684s
So to sum up, although a kernel update still takes time, the downtime is reduced significantly (from about 82 seconds to about 31 seconds in this test), so for most servers out there, downtime is no longer an excuse for leaving the system without updates!
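To put an exact number on the saving, the two `real` figures can be converted to seconds and compared. This is a small sketch using awk; the function names are made up for illustration:

```bash
#!/usr/bin/env bash
# Sketch: compare two `real` values as printed by bash's `time`.

# to_seconds 1m21.996s  ->  81.996
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# report_saving FULL_REBOOT_TIME KEXEC_REBOOT_TIME
report_saving() {
    awk -v f="$(to_seconds "$1")" -v k="$(to_seconds "$2")" \
        'BEGIN { printf "saved %.3f s (%.1f%% of the full reboot)\n", f - k, (f - k) / f * 100 }'
}

report_saving 1m21.996s 0m31.415s
# prints: saved 50.581 s (61.7% of the full reboot)
```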