Howto setup a KVM server the fast way

This is a very short howto on getting a KVM server up and running. It assumes that

  • you want to run a KVM server with at least one virtual machine,
  • your KVM server gets an ip address in your network,
  • your virtual machine(s) get an ip address from your network, so you can use bridging instead of NAT (using NAT instead of bridging is easy, too, but not part of this howto),
  • you can use lvm for disk space allocation on your KVM master (using other disk space allocation methods like image files is easy, too, but not part of this howto).

Get the server running

I assume you are able to install an Ubuntu server from scratch and set up an lvm environment. This can mostly be done by accepting the defaults during the Ubuntu server setup. I’d suggest you install Ubuntu Lucid 10.04 or newer.
If you continue reading here, you should have a running, up-to-date Ubuntu server with network connectivity and preferably access via ssh.

Get the network up and running

For the bridged network you need to install the bridge utilities and change your network configuration. First install the package:
$ sudo apt-get install bridge-utils
Now add a bridge named „br0“ (this only has to be done once):
$ sudo brctl addbr br0
Now change your /etc/network/interfaces so it uses the bridge br0. This step actually sets up br0 instead of eth0. Think of eth0 as being just a physical transport added to the virtual bridge interface.
# The loopback network interface
auto lo
iface lo inet loopback
 
auto eth0
iface eth0 inet manual
 
auto br0
iface br0 inet static
address 192.168.1.100
netmask 255.255.255.0
network 192.168.1.0
broadcast 192.168.1.255
gateway 192.168.1.1
bridge_ports eth0
bridge_fd 9
bridge_hello 2
bridge_maxage 12
bridge_stp off

Please make sure you don’t forget to set your „eth0“ to „iface eth0 inet manual“ as shown above. This prevents eth0 from fetching an address via dhcp while keeping it available as the physical layer for your bridge. After you have set up the bridge, either restart your network (sudo /etc/init.d/networking restart) or reboot your server. If you are already accessing your server by ssh, be warned that a misconfiguration might lock you out.
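
As a misconfigured interfaces file can lock you out over ssh, it may help to sanity-check the file before restarting the network. The following is only a rough sketch: the function name check_bridge_config is made up for this example, and it merely looks for the stanzas shown above. It does not validate addresses or anything else.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): checks that an interfaces
# file sets eth0 to "manual" and attaches it to br0 via bridge_ports.
check_bridge_config() {
    file="$1"
    grep -Eq '^[[:space:]]*iface[[:space:]]+eth0[[:space:]]+inet[[:space:]]+manual' "$file" \
        || { echo "eth0 is not set to 'inet manual'"; return 1; }
    grep -Eq '^[[:space:]]*iface[[:space:]]+br0[[:space:]]+inet' "$file" \
        || { echo "no br0 stanza found"; return 1; }
    grep -Eq '^[[:space:]]*bridge_ports[[:space:]]+.*eth0' "$file" \
        || { echo "br0 does not list eth0 in bridge_ports"; return 1; }
    echo "bridge config looks plausible"
}

# Usage: check_bridge_config /etc/network/interfaces
```

After the restart, „brctl show“ should list br0 with eth0 as its attached interface.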

Install KVM

Now it’s time to install kvm and some useful helper applications:
$ sudo apt-get install qemu-kvm ubuntu-vm-builder uml-utilities \
  virtinst

That’s all: You already have a kvm server now. Time to…

Install your first virtual machine

We are going to set up a 100 GB logical volume for the guest, download Ubuntu and create a machine with 2 GB of RAM and 4 cores:

# create an empty 100 GB logical volume in volume group vg0
$ sudo lvcreate --size 100G vg0 --name guest1
# download Ubuntu iso
$ wget http://..../
# create machine (the volume created above shows up as /dev/vg0/guest1)
$ sudo virt-install --connect qemu:///system -n guest1 -r 2048 \
 --vcpus=4 -f /dev/vg0/guest1 --network=bridge:br0 \
 --vnc --accelerate -v -c ./SOMEUBUNTUISO.iso \
 --os-type=linux --os-variant=ubuntuKarmic --noautoconsole
# please note: "ubuntuKarmic" is currently the most recent
# virt-install defaults scheme - just use this if in doubt.
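
If you plan to create more than one guest, it can be handy to generate the two commands from a few parameters. This is just a sketch that prints the commands instead of running them. The function name guest_commands and its argument order are made up for this example, and it assumes your volume group is called vg0 so that the volume appears as /dev/vg0/NAME.

```shell
#!/bin/sh
# Print (not run) the commands for creating a guest on volume group vg0.
# guest_commands and its argument order are made up for this sketch.
guest_commands() {
    NAME="$1"; DISK_GB="$2"; RAM_MB="$3"; VCPUS="$4"; ISO="$5"
    echo "sudo lvcreate --size ${DISK_GB}G --name ${NAME} vg0"
    echo "sudo virt-install --connect qemu:///system -n ${NAME} -r ${RAM_MB}" \
         "--vcpus=${VCPUS} -f /dev/vg0/${NAME} --network=bridge:br0" \
         "--vnc --accelerate -v -c ${ISO}" \
         "--os-type=linux --os-variant=ubuntuKarmic --noautoconsole"
}

# Example: a second, smaller guest
guest_commands guest2 50 1024 2 ./SOMEUBUNTUISO.iso
```

Review the printed commands before pasting them into your shell.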

Get a VNC connection

KVM uses VNC to give you a graphical interface to your machine. The good thing about this is that it lets you use graphical installers (and yes, even Windows) without problems. As even Ubuntu server boots into a graphical mode in the beginning, VNC is a good fit here.

I assume you are working on a remote server. KVM gives every guest it launches a new vnc instance with a new, incremented port. It starts with 5900. So let’s tunnel via ssh:

ssh user@remotekvmhost -L 5900:localhost:5900

You connect to your remote kvm host via ssh and open an ssh tunnel for port 5900. Now start your preferred VNC client locally and let it connect to either display „0“ or port 5900, which means the same in VNC (duh…).
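
With more than one guest running you need the port that belongs to each display. A small sketch of the arithmetic (port = 5900 + display number) follows. The function names are made up for this example, and „user@remotekvmhost“ is the same placeholder as above.

```shell
#!/bin/sh
# Convert a VNC display (":1", "1" or "127.0.0.1:1") into its TCP port.
vnc_port() {
    display="${1##*:}"          # keep only what follows the last colon
    echo $((5900 + $display))
}

# Print (not run) the ssh tunnel command for a given display.
tunnel_command() {
    port=$(vnc_port "$1")
    echo "ssh user@remotekvmhost -L ${port}:localhost:${port}"
}

tunnel_command :1
# On the KVM host, "virsh vncdisplay guest1" reports the display of a guest.
```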

From now on you should see your server on a VNC display. Install it like you’d install every other server. The networking is bridged, so you could even use dhcp if that is offered in your network.

Please make sure you install the package „acpid“ inside your kvm guest, otherwise you won’t be able to stop the guest from the master (as the shutdown request is sent via acpi):

# make sure "acpid" is installed in the *guest* machine
$ sudo apt-get install acpid

After installation you can manage your kvm guest by using the following commands:

# list running instances
$ virsh list
# start an instance
$ virsh start INSTANCENAME
# stop an instance politely (sends an acpi shutdown request)
$ virsh shutdown INSTANCE
# immediately destroy a running instance
$ virsh destroy INSTANCE
# edit the config file for an instance
$ virsh edit INSTANCE

Mounting the LVM volumes

As you might have noticed, your virtual guest’s lvm volumes cannot be mounted directly in the master as they contain their own partition table. If you need access to the guest’s filesystem from the master, though, you have to create some device nodes. There is a great tool called „kpartx“ that can create and delete device nodes for you. It’s as easy as this:

# install kpartx
$ sudo apt-get install kpartx
# make sure the virtual guest is switched off!
# create device nodes
$ sudo kpartx -a /dev/vg0/guest1
# check /dev/mapper for new device nodes and mount/unmount them
# after you are done, delete the nodes
$ sudo kpartx -d /dev/vg0/guest1

Please note that this method also works with other block devices like image files containing partition tables. You might only run into trouble when your lvm volume contains its own lvm. If that is the case, play around with pvscan, vgscan and lvscan after using kpartx. Be brave, but be warned that backing up data is always a great idea.

Alternative Management Interfaces

In case you really need a gui for your management needs, check „virt-manager“. You can install this on your desktop and remotely manage running instances:

$ sudo apt-get install virt-manager

You should check RedHat’s „Virtual Machine Manager“ page, though. It might be a good idea to manually compile and install a more recent version and rely on the setup howtos. Personally I prefer using a plain text console here, as it lets me act quickly and from everywhere when problems occur.

Conclusion

Nowadays it’s fairly easy to set up a KVM server. As KVM/libvirt enabled guests are quite fast, it’s a nice and easy way of hosting virtual machines. I have been running about a dozen virtual machines on three hardware servers for two years now without any serious problems.

Detecting and Removing Unused Indexes in MySQL

Preface: The following post is a backup of a post first published on the Moviepilot Techblog, which is going to be replaced by the Moviepilot Labs Blog. The content is a bit outdated, as the way to go today is using MariaDB instead of OurDelta. The core content about the UserStats plugin and using it for detecting and removing unused indexes is still valid, though – and a nice way of getting rid of performance killers…

MySQL performance depends on a balanced usage of MySQL indexes. While it is easy to add an index and to identify queries not using indexes via EXPLAIN during development or via slow.log, it is a lot harder to get rid of unused indexes. Finding and removing them might be crucial for your performance, as indexes can create a remarkable cpu and i/o overhead during updates to tables (INSERT/UPDATE/DELETE).

The default MySQL community edition server from mysql.com or your Linux/BSD distribution (which you shouldn’t use for a lot of reasons anyway) is not yet helpful in this regard. There are, however, unofficial patches for advanced statistics that provide the details needed for optimizing your list of indexes. The easiest way to get started with a patched MySQL server is using a pre-patched binary. At Moviepilot, OurDelta’s pre-patched MySQL 5.0 server, which includes the UserStats patch, has been running fine for about a year now.

Let’s assume you already installed OurDelta’s MySQL 5.0, which takes hardly more than adding and using an apt source in Debian/Ubuntu or similar in rpm-based distributions. After installation the MySQL server behaves

Enable UserStats‘ Enhanced Statistics

As stated on the official patch originator’s (Percona) documentation, UserStats is enabled by setting the global variable „userstat_running“ to „on“. You can do this on the fly by entering your mysql command line interface and issuing „SET GLOBAL userstat_running = 1;“ as shown below:

mysql> SET GLOBAL userstat_running = 1;
Query OK, 0 rows affected (0.00 sec)

The UserStats counter is now running and only has a slight impact on your cpu performance. For us it’s fine to run it by default, but you might prefer to enable it on demand only.

Grab Statistics

The UserStats statistics can be retrieved in two ways. The simple way is using „SHOW INDEX_STATISTICS“. This will provide you with an unsorted list of all indexes that have been used so far, together with their read counts.

mysql> show index_statistics;
+-------------+----------+--------------------------+---------+
|Table_schema |Table_name|Index_name                |Rows_read|
+-------------+----------+--------------------------+---------+
|de_moviepilot|broadcasts|movie_id_and_ends_at_index|  7244936|
|fr_moviepilot|place_keyw|lft_and_rgt               |    46965|
|de_moviepilot|mushes916 |index_mushes_on_user_id_an|   310538|
|de_moviepilot|mushes567 |top                       |   137855|
|de_moviepilot|mushes402 |PRIMARY                   |  3033119|
...
|pl_moviepilot|u_settings|index_user_settings_on_use|   469600|
|de_moviepilot|answers   |answerable_id_and_answerab| 11162446|
|es_moviepilot|cinema_the|PRIMARY                   |    76805|
|de_moviepilot|list_items|PRIMARY                   |    14208|
+-------------+----------+--------------------------+---------+
10689 rows in set (0.03 sec)

This table is already quite useful as it gives you handy details about your indexes. As „SHOW“ only processes WHERE-clauses, ignores LIKE-clauses and rejects ORDER you should rather query the virtual table in information_schema like this:

mysql> select * from information_schema.INDEX_STATISTICS\
ORDER BY Rows_read DESC LIMIT 0,10;
+-------------+----------+-------------------------+------------+
|TABLE_SCHEMA |TABLE_NAME|INDEX_NAME               |ROWS_READ   |
+-------------+----------+-------------------------+------------+
|de_moviepilot|images    |parent_id_and_thumbnail_o|138769917931|
|de_moviepilot|ratings   |PRIMARY                  |116200730622|
|de_moviepilot|ratings   |top_on_ratings           |111350089590|
|de_moviepilot|events    |index_events_on_parent_id| 97002618593|
|de_moviepilot|ratings   |movie_id_and_user_id_and_| 45962792087|
|de_moviepilot|neighbours|PRIMARY                  | 34403784465|
|de_moviepilot|plot_keywo|lft_and_rgt              | 30943317768|
|de_moviepilot|comments  |index_comments_on_comment| 26576184065|
|de_moviepilot|comments  |commentable_type_and_comm| 25467669528|
|moviepilot   |users     |type_and_id_idx          | 21950479057|
+-------------+----------+-------------------------+------------+
10 rows in set (0.02 sec)

You just got the list of the ten most used MyISAM/InnoDB indexes in your database. See tables TABLE_STATISTICS, CLIENT_STATISTICS and USER_STATISTICS in information_schema for further details on table, client and user stats. Feel free to check your InnoDB tables for ones with few writes that maybe should be migrated to MyISAM, or write-heavy MyISAM tables for the reverse.

Detect Unused Indexes

But our task for this post is detecting unused indexes. As you already might have noticed, INDEX_STATISTICS only shows indexes that have been used at least once. If you need a list of unused indexes, meaning indexes that have been accessed zero times, you can get them by comparing the list of available indexes and the list of used indexes on a per table base.

select distinct INDEX_NAME from information_schema.STATISTICS
where INDEX_NAME != 'PRIMARY' and TABLE_SCHEMA = '${DB}'
and TABLE_NAME = '${TABLE}' and INDEX_NAME not in (
select INDEX_NAME from information_schema.INDEX_STATISTICS
where TABLE_SCHEMA = '${DB}' and TABLE_NAME = '${TABLE}');

${DB} and ${TABLE} are placeholders for usage in shell scripts. Just replace them with a database and table name of your choice.
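
To show how the placeholders are meant to be filled in from a shell script, here is a minimal sketch. The function name unused_index_query is made up for this example. It assumes INDEX_STATISTICS exposes the columns TABLE_SCHEMA, TABLE_NAME and INDEX_NAME as in the output above, and it does not escape its arguments, so only pass trusted names.

```shell
#!/bin/sh
# Print the "unused indexes" query for one database/table pair.
# unused_index_query is a name made up for this sketch.
unused_index_query() {
    DB="$1"
    TABLE="$2"
    cat <<EOF
SELECT DISTINCT INDEX_NAME FROM information_schema.STATISTICS
WHERE INDEX_NAME != 'PRIMARY' AND TABLE_SCHEMA = '${DB}'
AND TABLE_NAME = '${TABLE}' AND INDEX_NAME NOT IN (
  SELECT INDEX_NAME FROM information_schema.INDEX_STATISTICS
  WHERE TABLE_SCHEMA = '${DB}' AND TABLE_NAME = '${TABLE}');
EOF
}

# Example: print the query for the table moviepilot.comments.
unused_index_query moviepilot comments
# To run it, pipe the output into the mysql client, for instance:
#   unused_index_query moviepilot comments | mysql --batch --skip-column-names
```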

Putting it all together

As the query above only works on a table basis (I am sure, there are better queries for this issue), and you might want to run this on a regular basis, we wrote a little shell script called „unused_indexes.sh“, available on our snippets repo on github. The script checks all tables in all or a specific database:

$ ./unused_indexes.sh
usage: -d DATABASE (OR -a for all databases) [-f TABLENAMEFILTER]
# check all databases/tables
$ ./unused_indexes.sh -a
# check all tables in database "moviepilot"
$ ./unused_indexes.sh -d moviepilot

The output looks similar to

unused indexes in table moviepilot.stat_promo:
referrer_index mandant_index
---------------------------------------
unused indexes in table moviepilot.stat_promo_del:
c_i_m
---------------------------------------
unused indexes in table mp.comments:
comment_id meta

As we „sharded“ some large tables by splitting them we’d also like to be able to exclude tables:

# check all tables in all databases not matching "%mushes%"
$ ./unused_indexes.sh -a -f mushes
# check all tables in database "moviepilot" not matching "%mushes%"
$ ./unused_indexes.sh -d moviepilot -f mushes

Pitfalls

Please keep in mind that you should enable UserStats for a period long enough to grab statistics that show an average usage of your application and database setup. Also keep in mind that you might have indexes that are only used a few times when running scheduled jobs like importers and therefore might seem to be unused but are important anyway. Also consider flushing your statistics from time to time. As your application’s behaviour changes through deployments your index usage does, too. It might be a good idea to flush UserStats after every deployment.

The current version of unused_indexes.sh ignores all indexes that have been used at least once. It might be a good idea also checking indexes that have been used fewer than n times – just use the SELECT … ORDER BY from above.
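
A rough sketch of such a threshold query, again generated from the shell. The function name rarely_used_query is made up for this example. Keep in mind that ROWS_READ counts rows read via an index, not the number of lookups, so the threshold is only a proxy for „rarely used“.

```shell
#!/bin/sh
# Print a query listing indexes with fewer than N rows read so far.
# rarely_used_query is a name made up for this sketch.
rarely_used_query() {
    THRESHOLD="$1"
    case "$THRESHOLD" in
        ''|*[!0-9]*)
            echo "threshold must be a non-negative integer" >&2
            return 1 ;;
    esac
    echo "SELECT TABLE_SCHEMA, TABLE_NAME, INDEX_NAME, ROWS_READ" \
         "FROM information_schema.INDEX_STATISTICS" \
         "WHERE ROWS_READ < ${THRESHOLD} ORDER BY ROWS_READ ASC;"
}

# Example: indexes with fewer than 1000 rows read
rarely_used_query 1000
```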

Recovering Linux file permissions

I recently ran into a server where somebody had accidentally issued a „chown -R www-data:www-data /var“. So all files and directories within /var were chowned to www-data, which actually means a complete system fuckup, as everything from logging over mail and caching to databases relies on a correct setup there. Sadly this was a remote production server, so I had to find a quick solution to get at least a state good enough for the next days.

I started looking into the possibility of resetting file permissions based on .deb package details. There are at least approaches (the method there misses a pre-download of all installed .deb packages) to do this (and I remember running a program years ago that checked file permissions based on .deb files, I just did not find it via apt-get). Nonetheless this approach cannot handle files created by applications. Files in /var/log, for instance, don’t have to be declared in a .deb file but urgently need the right file permissions.

So I came to a different approach: cloning permissions. By chance we had a quite similar server running meaning same Linux distribution and nearly the same services installed. I wrote a one liner to save the file permissions on the healthy server:

$ find /var -printf "%p;%u;%g;%m\n" > permissions.txt

The command writes a text file with the following format:

dir/filename;user;group;mode

Please note, I started out using „:“ as a separator but noticed that at least some Perl related files have a double colon in their name.

Now I only needed a simple shell script that sets the file permissions on the broken server based on the text file we just generated. It came down to this:

#!/bin/bash

# read permissions.txt line by line, so file names containing spaces
# survive; the fields are separated by ";"
while IFS=';' read -r FILE OWNER GROUP MODE
do
	chown "${OWNER}:${GROUP}" "${FILE}"
	chmod "${MODE}" "${FILE}"
done < permissions.txt

The script reads every line of the text file, splits its content into variables and sets the user and group via „chown“ as well as the mode via „chmod“. It doesn’t check if a directory/file exists before chowning/chmodding it, as it actually doesn’t matter. If it’s not there, chown and chmod just print an error and nothing harmful happens.
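
Before changing anything on a production box you might want a dry run first. The following sketch only lists entries whose current owner, group or mode differ from the recorded ones. The function name check_permissions is made up for this example, and it assumes GNU stat as shipped with Debian/Ubuntu.

```shell
#!/bin/sh
# Dry run: list entries whose current owner, group or mode differ from
# the recorded list, without changing anything. Assumes GNU stat.
# check_permissions is a name made up for this sketch.
check_permissions() {
    while IFS=';' read -r FILE OWNER GROUP MODE
    do
        [ -e "$FILE" ] || continue                 # skip missing files
        CURRENT=$(stat -c '%U;%G;%a' "$FILE")
        if [ "$CURRENT" != "$OWNER;$GROUP;$MODE" ]; then
            echo "$FILE: is $CURRENT, should be $OWNER;$GROUP;$MODE"
        fi
    done < "$1"
}

# Usage: check_permissions permissions.txt
```

Running it again after the restore gives you a quick list of everything that still deviates from the healthy server.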

After you’ve run this, it’s a good idea to restart all services and start watching log files. You have to take care of all services that rely on fast changing files in /var. For instance, a mail daemon puts a lot of unique file names into /var/spool and the script above won’t be able to take care of that. You have to double check database directories like /var/lib/mysql, hosted repositories and so on. But the script will provide you with a state where most services are at least running, and you get an idea of how to switch back the remaining directories. It might be helpful to search for suspicious files, like

$ find /var -user www-data

RubyGems 9.9.9 packaged – Fake install RubyGems on Debian/Ubuntu

For a lot of reasons I often rely on a mixture of a Debian/Ubuntu pre packaged Ruby with a self compiled RubyGems. It helps you in situations where you don’t care that much about the Ruby interpreter itself but need an up to date RubyGems. While this is easy to install, you might run into trouble when installing packages that depend on Ruby and RubyGems, namely packages like „rubygems“, „rubygems1.8“ and „rubygems1.9“.

After unsuccessfully playing around with dpkg for a while (you can put packages on „hold“, which prevents them from being installed automatically), I came to the conclusion that the best way is to install a fake package that is empty but satisfies dependencies.

So, here it is: The shiny new RubyGems 9.9.9 which delivers rubygems, rubygems1.8 and rubygems1.9 right away. Just install it (e.g. with dpkg) and you’ll be able to install packages that rely on a rubygems package.

In case you want to play around with the package and customize it to your needs, e.g. only deliver rubygems1.8 or rubygems1.9, take the following steps:

1. Install equivs

$ sudo apt-get install equivs

2. create a control file

$ equivs-control rubygems

3. edit the control file

$ vim rubygems

You can compare the default settings in the control file with the output of e.g. „apt-cache show rubygems“. The crucial field is „Provides:“ where you can put a comma separated list of packages you want to fake install. Choose a high version for the „Version:“ field, as this marks the package as newer than the distribution’s own package. This prevents the package manager from replacing it.

Section: universe/interpreters
Priority: optional
Homepage: http://www.screenage.de/blog/
Standards-Version: 3.6.2
 
Package: rubygems
Version: 9.9.9
Maintainer: Caspar Clemens Mierau <[email protected]>
Provides: rubygems1.8,rubygems1.9,rubygems
Architecture: all
Description: Fake RubyGems replacement
 This is a fake meta package satisfying rubygems dependencies.
 .
 This package can be used when you installed a packaged ruby but want
 to use rubygems from source and still rely on software that depends
 on ruby and rubygems

4. build the package

$ equivs-build rubygems

p.s.: You can also use equivs for easily building meta packages containing a list of packages you want to install at a glance, e.g. for semi automated server bootstrapping.

Bootstrapping a Puppet agent/master on Ubuntu

Though it’s really great that Puppet made it into Ubuntu’s main repository, the provided version is rather outdated which prevents you from using advanced language features when writing your manifests. So sooner or later you end up installing Puppet manually. In order to speed up installation I stripped it down to the following:

install agent:

$ bash < <(wget -qO - https://bit.ly/install-puppet-agent)

install master:

$ bash < <(wget -qO - https://bit.ly/install-puppet-master)

The call fetches the most recent version of the install script from github, installs Ubuntu’s Ruby (which is good enough for running Puppet), fetches an upstream version of gem itself and updates it to the most recent version and finally installs the Puppet gem.

You can, of course, also download, review and run the scripts manually. Just have a look at https://github.com/moviepilot/puppet/tree/master/tools

slides from the ‚From MySQL to MariaDB‘ presentation

As announced, I held a short talk on switching from MySQL community edition (especially 5.1) to MariaDB (currently 5.2.6) at this year’s LinuxTag in Berlin.

Here are the (German) slides for reference:

(In case you cannot see the embedded presentation, you can also click here)

Please note: There are a lot of good English slides around. If you want to give a talk on MariaDB, the „Beginner’s Guide“ might be a good start:

A Beginner’s Guide to MariaDB Presentation

Short talk on MariaDB at Linuxtag 2011

If you happen to be around at this year’s LinuxTag in Berlin/Germany, you are invited to attend my short talk on MariaDB as a drop-in replacement for MySQL. The talk focusses on differences between MySQL Community Edition and MariaDB (e.g. XtraDB, Aria, userstats), shows some features live and explains how to switch. I’ll probably post the slides here afterwards.

The talk will be held in German and is scheduled for Friday, the 13th of May, 16:30. The official announcement can be found here.

Using backuppc as a dirty distributed shell

Backuppc is a neat server-based backup solution. In Linux environments it is often used in combination with rsync over ssh and, let’s be honest, often with fairly lazy sudo or root rights for the rsync over ssh connection. This has a lot of disadvantages, but at least you can use this setup as a cheap distributed shell, as a well-maintained backuppc server might have access to a lot of your servers.

I wrote a small wrapper, that reads the (especially Debian/Ubuntu packaged) backuppc configuration and iterates through the hosts, allowing you to issue commands on every valid connection. I used it so far for listing used ssh keys, os patch levels and even small system manipulations.

#!/bin/bash
# ssh option pointing to backuppc's private key
SSH_KEY="-i /var/lib/backuppc/.ssh/id_rsa"
# build a list of root@host logins from the backuppc hosts file
SSH_LOGINS=( $(grep "root" /etc/backuppc/hosts | \
 awk '{print "root@"$1}') )
 
for SSH_LOGIN in "${SSH_LOGINS[@]}"
do
 # strip the "root@" prefix to get the plain host name
 HOST="${SSH_LOGIN#*@}"
 echo "--------------------------------------------"
 echo "checking host: ${HOST}"
 # run the command given as first argument, never ask for a password
 ssh -C -qq -o "NumberOfPasswordPrompts=0" \
 -o "PasswordAuthentication=no" ${SSH_KEY} "${SSH_LOGIN}" "$1"
done

You can easily change this to your needs (e.g. changing login user, adding sudo and so on).
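
The grep for „root“ above matches any line containing that string, including comments. If you want to be a bit stricter, the host list could be built as in the following sketch. It assumes the usual hosts file layout with the columns „host dhcp user moreUsers“, and the function name backuppc_logins is made up for this example.

```shell
#!/bin/sh
# Print one "root@host" login per backuppc host whose user column is root.
# Assumes the hosts file layout "host dhcp user [moreUsers]";
# backuppc_logins is a name made up for this sketch.
backuppc_logins() {
    awk '!/^[[:space:]]*#/ && $3 == "root" { print "root@" $1 }' "$1"
}

# Usage: backuppc_logins /etc/backuppc/hosts
```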

$ ./exec_remote_command.sh "date"
--------------------------------------------
checking host: a.b.com
Mo 9. Mai 15:40:26 CEST 2011
--------------------------------------------
checking host: b.b.com
[...]

Make sure to quote your command, especially when using commands with options, so the script can handle the command line as one argument.

A younger sister of the script is the following ssh key checker that lists and sorts the ssh keys used on systems by their key comment (feel free to include the key itself):

#!/bin/bash
 
# ssh option pointing to backuppc's private key
SSH_KEY="-i /var/lib/backuppc/.ssh/id_rsa"
# build a list of root@host logins from the backuppc hosts file
SSH_LOGINS=( $(grep "root" /etc/backuppc/hosts | \
 awk '{print "root@"$1}') )
 
for SSH_LOGIN in "${SSH_LOGINS[@]}"
do
 # strip the "root@" prefix to get the plain host name
 HOST="${SSH_LOGIN#*@}"
 echo "--------------------------------------------"
 echo "checking host: ${HOST}"
 # key comments from the authorized_keys files of all users
 ssh -C -qq -o "NumberOfPasswordPrompts=0" \
 -o "PasswordAuthentication=no" ${SSH_KEY} "${SSH_LOGIN}" \
 "cut -d: -f6 /etc/passwd | xargs -i{} egrep -s \
 '^ssh-' {}/.ssh/authorized_keys {}/.ssh/authorized_keys2" | \
 cut -f 3- -d " " | sort
 # key comments from /etc/skel (copied into new home directories)
 ssh -C -qq -o "NumberOfPasswordPrompts=0" \
 -o "PasswordAuthentication=no" ${SSH_KEY} "${SSH_LOGIN}" \
 "egrep -s '^ssh-' /etc/skel/.ssh/authorized_keys \
 /etc/skel/.ssh/authorized_keys2" | cut -f 3- -d " " | sort
done

A sample output of the script:

$ ./check_keys.sh 2>/dev/null
--------------------------------------------
checking host: a.b.com
[email protected] 
backuppc@localhost
some random key comment
--------------------------------------------
checking host: b.b.com
[...]

That’s all for now. Don’t blame me for doing it this way – I am only the messenger :)

Ubuntu (Berlin) Global Jam at c-base and Daniel Holbach’s notebook

Members of „Ubuntu Berlin“ met yesterday at c-base as part of the Ubuntu Global Jam. While it was nice seeing new and international faces showing up and introducing newcomers to advanced Launchpad usage, my main attraction of the day was Daniel Holbach’s notebook. He asserted it runs Maverick and starts up within five seconds, which made me laugh at first, as my netbook’s startup time tripled from Lucid to Maverick to roughly 45 seconds (which I assume will at least change back by the release).

Ubuntu Berlin at Ubuntu Global Jam (c-base) - August 2010


To make it short: Between bug triaging and patching, Daniel showed the startup procedure two or three times on his X61s (with a solid state disk, one has to add) and as promised it started up in five seconds after Grub. Actually this isn’t more than a fast booting notebook, but it shows the results of focussed efforts from the last one and a half years. Remember the initial „10s“ posting and the bunch of changes it took.

So I am happy looking forward to improvements for Maverick on my netbook. And yes: I am happy with 10 seconds, too.

[update]

Daniel noted that it’s an X61s, not a T61. Changed.

Desktop Summit 2011 in Berlin

I am happy to announce that Berlin has been chosen as location for the Desktop Summit 2011. If you don’t know so far: Desktop Summit is a 1000+ developer conference co-hosting KDE’s „Akademy“ and GNOME’s „GUADEC“ at the same time:

Read the press release: Desktop Summit 2011 Announced

As an Ubuntu member and head member of c-base e.V. I am part of the Berlin team, together with Claudia Rauch from KDE e.V. and Mirko Boehm of KDE. Let me quote Mirko:

„We are honored and proud that our proposal was selected. What we look forward to the most is the inspiration our communities will draw from having the Desktop Summit together again, but also from visiting our bustling, welcoming city. We would like to thank all the supporters of the proposal, and will work hard to make the conference a big success.“

I am sure this event will become a success. And it’s a great opportunity to meet and greet across the letters before the „U“ in „Ubuntu“.