Tuesday, May 14, 2013

How to get started with libvirt on Debian


If you like hacking and have a few machines you use for development, chances are you know what I am about to talk about here. You start from this new idea, install a few tools, peek at some existing source code, try to compile it, get something running... and eventually move on to the next project.

At least until your laptop becomes a giant meatball of services running for who knows what reason, you can't remember which machine you were actually using for that test, or half-assed scripts you have no memory of keep creeping up in your PATH.

My first approach at finding a solution was based on chroots. The idea was simple: only develop on my laptop, but create a self-contained environment for each project in which to install all the needed dependencies and tools, and in which to run all my crazy experiments. The holy grail of the time was chroots, and during those years I became good friends with rsync, debootstrap, mount --rbind and sometimes even pivot_root.

This worked well for a while. Until, well, I ran into the limitations of chroots: they can't really simulate networking or run different kernels (or OSes), and they don't help much if you need to work on something boot related or that has to do with userspace and kernel interactions.

Guess what was my second approach? I started using Virtual Machines.

At first it was only one. A good old image created from scratch I would run with qemu and a tap device. A few tens of lines of shell script to get it up as needed, and I was back in business with my hacking.

Fast forward a few years, and I have > 10 different VMs on my laptop, this shell script has grown to almost 1k lines of an unmaintainable entanglement of relatively simple commands and images to run, and I am afraid of even thinking of what to use for my next project. My own spaghetti VMs.

A few weekends ago I finally built up the courage to fix this, and well, discovered how easy it is to manage VMs with libvirt. So, here's what I learned...

Setup

You start by installing the needed tools. On a Debian system:
$ sudo -s
# apt-get install libvirt-bin virtinst

This should get a "libvirtd" binary running on your machine:
$ ps u -C libvirtd
USER    PID %CPU %MEM    VSZ  RSS TTY STAT START TIME COMMAND
root  11950  0.0  0.1 111928 7544 ?   Sl   Apr19 1:29 /usr/sbin/libvirtd -d

The role of libvirtd is quite important: it takes care of managing the VMs running on your host. It is the daemon that starts them up, stops them and prepares the environment that they need. You control libvirtd by using virsh from the shell, or virt-manager to have a graphical interface. I am generally not fond of graphical interfaces, so I will talk about virsh for the rest of the post.

First few steps with libvirt

Before anything else, you should know that libvirt and virsh not only allow you to manage VMs running on your own system, but can control VMs running on remote systems or a cluster of physical machines. Every time you use virsh you need to specify some sort of URI to tell libvirt which sets of virtual machines you want to control.

For example, let's say you want to control a Xen virtual machine running on a remote server called "myserver.com". When using virsh, you can refer to that VM by providing a URI like "xen+ssh://root@myserver.com/", indicating that you want to use ssh to connect as root to the server myserver.com, and control Xen virtual machines running there.

With QEMU (and KVM), which is what I use, there are two URIs you need to be aware of:
  • qemu://xxxx/system, to indicate all the system VMs running on server xxxx. 
  • qemu://xxxx/session, to indicate all the VMs belonging to the user that is running the virsh command.
That's right: each user can have their own set of VMs and networks and, if allowed to do so, can also control the set of global system VMs. Session VMs run as the user that started them, while system VMs generally run as a dedicated, unprivileged user: libvirt-qemu on Debian systems.

If you omit xxxx, with URIs like qemu:///system or qemu:///session, you are referring to the system and session VMs running on the machine you are running the command on, localhost.

Note that if you use virsh as root, and do not specify which sets of VMs you want to control, it will default to controlling the system VMs, the global ones. If you run virsh as a different user instead, it will default to controlling the session VMs, the ones that only belong to you.

This is a common mistake and a good source of confusion when you get started, so it is a good idea to explicitly specify which VMs you want to work on with the -c option, which you will see in a few minutes.
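As a rough illustration of keeping the URI explicit, the helper below builds the qemu URIs described above. The function name qemu_uri is made up for this sketch, and the qemu+ssh form for remote hosts is an assumption by analogy with the xen+ssh example.

```shell
# Build a qemu URI suitable for "virsh -c".
# Hypothetical helper for this sketch, not part of libvirt.
#   $1: "system" or "session"
#   $2: optional remote host (user@host); empty means the local machine
qemu_uri() {
    scope="$1"
    host="${2:-}"
    case "$scope" in
        system|session) ;;
        *)
            echo "qemu_uri: scope must be system or session" >&2
            return 1
            ;;
    esac
    if [ -n "$host" ]; then
        # Remote machines are reached over ssh, like in the xen+ssh example.
        printf 'qemu+ssh://%s/%s\n' "$host" "$scope"
    else
        printf 'qemu:///%s\n' "$scope"
    fi
}

# Usage (requires libvirt), always naming the set of VMs explicitly:
#   virsh -c "$(qemu_uri system)" list --all
#   virsh -c "$(qemu_uri session)" list --all
```

If typing -c every time gets tedious, virsh also reads the LIBVIRT_DEFAULT_URI environment variable, so exporting it once per shell has the same effect.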

Managing system VMs

On a Debian machine, for a user to be allowed to manage system VMs, that user needs to be able to send commands to libvirtd. By default, libvirtd listens on a unix domain socket in /var/run/libvirt, and to be able to write to that socket the user needs to belong to the libvirt group.

If you edit /etc/libvirt/libvirtd.conf, you can configure libvirtd to wait for commands using a variety of different mechanisms, including for example SSL encrypted TCP sockets.

Given that I only wanted to manage system local virtual machines, I just added my user, rabexc, to the group libvirt so I didn't have to be root to manage these machines:
usermod -a -G libvirt rabexc
# alternatively, use vigr and vigr -s
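One thing that is easy to miss: a new group membership only applies to login sessions started after the change. Here is a small sketch to check whether the current session already has the group. The has_group helper is hypothetical, written only for this example.

```shell
# Return 0 if the space-separated group list in $1 contains the group $2.
# Hypothetical helper for this sketch, not a libvirt tool.
has_group() {
    # $1 is left unquoted on purpose, to split it into one word per group.
    for g in $1; do
        if [ "$g" = "$2" ]; then
            return 0
        fi
    done
    return 1
}

# Usage: "id -nG" prints the groups of the current login session.
#   has_group "$(id -nG)" libvirt || echo "not in libvirt yet: log out and back in"
```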

Defining a network

Each VM you define will likely need some sort of network connectivity, and some sort of storage to use.
Each object in libvirt, be it a network, a pool of disks to use, or a VM, is defined by an XML file.

Let's start by looking at the default network configuration. Run:
$ virsh -c qemu:///system net-list
Name                 State      Autostart
-----------------------------------------
This means that there are no active virtual networks. Try one more time adding --all:
$ virsh -c qemu:///system net-list --all
Name                 State      Autostart
-----------------------------------------
default              inactive   no
and notice the default network.
If you want to inspect or change the configuration of the network, you can use either net-dumpxml or net-edit, like:
$ virsh -c qemu:///system net-dumpxml default
<network>
  <name>default</name>
  <uuid>ee49713c-d1c8-e08b-b007-6401efd145fe</uuid>
  <forward mode="nat"/>
  <bridge delay="0" name="virbr0" stp="on"/>
  <ip address="192.168.122.1" netmask="255.255.255.0">
    <dhcp>
      <range end="192.168.122.254" start="192.168.122.2"/>
    </dhcp>
  </ip>
</network>

The output is pretty much self explanatory: 192.168.122.1 will be assigned to the virbr0 interface as the address of the gateway, virtual machines will be assigned addresses between 192.168.122.2 and 192.168.122.254 using DHCP, and traffic from those virtual machines will be forwarded to the outside world by using NAT, i.e., by hiding their IP addresses behind the address of your host.

A bridge device (virbr0) allows Virtual Machines to communicate with each other, as if they were connected to their own dedicated network. You can configure networking in many different ways, with nat, with bridging, with simple gateway forwarding, ... You can find full documentation on the parameters here: http://libvirt.org/formatnetwork.html, and change the definition by using net-edit. Other handy commands:
  • "net-undefine default", for example, to forever eliminate the default network.
  • "net-define file.xml", to define a new network starting from an .xml file. I usually start from the xml of another network, by using "virsh ... net-dumpxml default > file.xml", edit edit edit, and then "virsh ... net-define file.xml".

Starting and stopping networks

Once you have a network defined, you need to start it, or well, tell virsh that you want it started automatically. In our case, the commands would be:
  • "net-start default", to start the default network.
  • "net-destroy default", to stop the default network. Despite the name, the definition is kept, so you can start it again in the future.
  • "net-autostart default", to automatically start the default network at boot.
Now... what happens exactly when we start a network? My laptop has quite a few iptables rules and various other random network configurations. So, let's try:

$ virsh -c qemu:///system net-start default
Network default started
And have a look at the system:
$ ps faux
[...]
root   1799 0.0 0.6 109688 6508 ? Sl May01 0:00 /usr/sbin/libvirtd -d
nobody 4246 0.0 0.0   4608  896 ?  S 08:35 0:00 /usr/sbin/dnsmasq --strict-order --bind-interfaces --pid-file=/var/run/libvirt/network/default.pid --conf-file= --except-interface lo --listen-address 192.168.122.1 --dhcp-range 192.168.122.2,192.168.122.254 --dhcp-leasefile=/var/lib/libvirt/dnsmasq/default.leases --dhcp-lease-max=253 --dhcp-no-override
# netstat -nulp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address  Foreign Address PID/Program name
udp        0      0 192.168.0.1:53 0.0.0.0:*       4246/dnsmasq
udp        0      0 0.0.0.0:67     0.0.0.0:*       4246/dnsmasq

# netstat -ntlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address   Foreign Address  State   PID/Program name
tcp        0      0 192.168.0.1:53  0.0.0.0:*        LISTEN  4246/dnsmasq
tcp        0      0 0.0.0.0:22      0.0.0.0:*        LISTEN  2108/sshd
libvirt started dnsmasq, which is a simple dhcp server with the ability to also provide DNS names. Note that the command line parameters seem to match what we had in the default xml file.
$ ip address show
1: lo:  mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:2e:72:8b brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.86/24 brd 192.168.100.255 scope global eth0
    inet6 fe80::5054:ff:fe2e:728b/64 scope link
       valid_lft forever preferred_lft forever
4: virbr0:  mtu 1500 qdisc noqueue state DOWN
    link/ether 8a:3c:6e:11:28:85 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
This shows that a new device, virbr0, has been created, and assigned 192.168.122.1 as an address.
$ sudo iptables -nvL
Chain INPUT (policy ACCEPT 565 packets, 38728 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     udp  --  virbr0 *       0.0.0.0/0            0.0.0.0/0            udp dpt:53
    0     0 ACCEPT     tcp  --  virbr0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:53
    0     0 ACCEPT     udp  --  virbr0 *       0.0.0.0/0            0.0.0.0/0            udp dpt:67
    0     0 ACCEPT     tcp  --  virbr0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:67

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     all  --  *      virbr0  0.0.0.0/0            192.168.122.0/24     state RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  virbr0 *       192.168.122.0/24     0.0.0.0/0
    0     0 ACCEPT     all  --  virbr0 virbr0  0.0.0.0/0            0.0.0.0/0
    0     0 REJECT     all  --  *      virbr0  0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable
    0     0 REJECT     all  --  virbr0 *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable

Chain OUTPUT (policy ACCEPT 376 packets, 124K bytes)
 pkts bytes target     prot opt in     out     source               destination

$ cat /proc/sys/net/ipv4/ip_forward
1
Firewalling rules have also been installed. In particular, the first 4 rules allow querying of dnsmasq from the virtual network. Here they are meaningless: the default iptables policy is ACCEPT. But had my real iptables rules been loaded, these new rules would have been inserted before them, and would have allowed that traffic.

Forwarding rules, instead, allow all replies to come back in (packets belonging to RELATED and ESTABLISHED sessions), and allow communications from the virtual network to any other network, as long as the source IP is in 192.168.122.0/24.
Note also that ip forwarding has either been enabled, or was already enabled by default.
$ sudo iptables -t nat -nvL
Chain PREROUTING (policy ACCEPT 1 packets, 32 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain INPUT (policy ACCEPT 1 packets, 32 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 1 packets, 1500 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain POSTROUTING (policy ACCEPT 1 packets, 1500 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MASQUERADE  tcp  --  *      *       192.168.122.0/24    !192.168.122.0/24     masq ports: 1024-65535
    0     0 MASQUERADE  udp  --  *      *       192.168.122.0/24    !192.168.122.0/24     masq ports: 1024-65535
    0     0 MASQUERADE  all  --  *      *       192.168.122.0/24    !192.168.122.0/24
Finally, note that rules to perform NAT have been installed. Those rules are added by scripts when the network is set up. Some documentation is provided here: http://wiki.libvirt.org/page/Networking#Forwarding_Incoming_Connections.

If you want to, you can also add arbitrary rules to filter traffic to virtual machines, and have libvirt install and remove them automatically. As for the network commands, the main commands are: nwfilter-define, nwfilter-undefine, ...-edit, ...-list, ...-dumpxml. You can read more about firewalling on the libvirt site: http://libvirt.org/firewall.html

Managing storage

Now that we have a network running for our VMs, we need to worry about storage. There are many ways to get some disk space, ranging from dedicated partitions or LVM volumes to simple files.

The main idea is to create a "pool" from which you can draw space, and create "volumes" in it. Not very original, is it? On my system, I just dedicated a directory to storing images and "volumes".

You can start with:
$ virsh -c qemu:///system \
    pool-define-as devel \
    dir --target /opt/kvms/pools/devel

This creates a pool called devel in the directory /opt/kvms/pools/devel. I can see this pool with:
$ virsh -c qemu:///system pool-list --all
Name                 State      Autostart 
-----------------------------------------
devel                inactive   no        

Note the --all parameter: without it, you would only see started pools. As before, you can mark the pool to be started automatically by using:
$ virsh -c qemu:///system pool-autostart devel

and start it with:
$ virsh -c qemu:///system pool-start devel

To create and manage volumes you can use vol-create, vol-delete, vol-resize, ... all the vol commands that "virsh help" shows you. Or, you can just let virsh manage the volumes for you, as we will see in a second. The one command you will find useful is vol-list, to have the list of volumes in a pool.

For example:
$ virsh -c qemu:///system vol-list devel
Name                 Path
-----------------------------------------
This shows that there are no volumes. Don't forget that the pool has to be active for most of the vol- commands to work.
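The pool commands in this section can be chained into one small function. This is a sketch under a few assumptions: setup_dir_pool is a made-up name, the URI falls back to qemu:///system when LIBVIRT_DEFAULT_URI is not set, and the extra pool-build step is there to create the target directory in case it does not exist yet.

```shell
# Define, build, start and autostart a directory-backed storage pool.
# Hypothetical helper for this sketch.
#   $1: pool name
#   $2: directory that will hold the volumes
setup_dir_pool() {
    pool_name="$1"
    pool_target="$2"
    pool_uri="${LIBVIRT_DEFAULT_URI:-qemu:///system}"

    # Stop at the first failing step, so a half-configured pool is noticed.
    virsh -c "$pool_uri" pool-define-as "$pool_name" dir --target "$pool_target" &&
        virsh -c "$pool_uri" pool-build "$pool_name" &&
        virsh -c "$pool_uri" pool-start "$pool_name" &&
        virsh -c "$pool_uri" pool-autostart "$pool_name"
}

# Usage (requires libvirt):
#   setup_dir_pool devel /opt/kvms/pools/devel
#   virsh -c qemu:///system vol-list devel
```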

Installing a virtual machine

Now you are finally ready to create a new virtual machine. The main command to use is "virt-install". Let's look at a typical invocation:
virt-install -n debian-testing \
             --ram 2048 --vcpus=2 \
             --cpu=host \
             -c ./netinst/debian-6.0.7-amd64-netinst.iso \
             --os-type=linux --os-variant=debiansqueeze \
             --disk=pool=devel,size=2,format=qcow2 \
             -w network=devel --graphics=vnc

and go over the command line for a minute:

  • -n debian-testing is just a name. I am calling this VM "debian-testing".
  • --ram 2048 --vcpus=2 should also be no surprise: give it 2 GB of RAM, and 2 CPUs.
  • --cpu=host means that I do not want to emulate any specific CPU, the VM should just be provided the same CPU as my physical machine. This is generally fast, but can mean trouble if you want to be able to migrate your VMs to a less capable machine. The fact is, however, that I don't care about migrating my VMs, and prefer them to be fast :).
  • -c ./netinst... means that the VM should be configured to have a "CD-ROM" with the specified .iso file in it. This is just an installation image of Debian.
  • --os-type, --os-variant are optional, but in theory allow libvirt to configure the VM with the optimal parameters for your operating system.
The most interesting part to me comes from:
  • --disk=pool=devel,size=2,format=qcow2, which asks libvirt to automatically allocate 2 GB of space from the devel pool. Do you remember? The pool we defined just a few sections ago. The format parameter indicates how to store this VM's disks. The qcow2 format is probably the most common format for KVM and QEMU, and provides a great deal of flexibility. Look at the man page for more details.
  • -w network=devel means that the VM should be connected to the libvirt network named devel. To use the default network we started earlier in this article, pass -w network=default instead.
  • --graphics=vnc just means that you want to have a vnc window to control the VM.
Of course, you need to get suitable installation media in advance: the file specified with -c ./netinst.... I generally use CD or USB images suitable for a network install, which means a minimal system, with most of it downloaded from the network. virt-install also supports fetching the image directly from an http, ftp, or nfs server, in which case you should use the -l option and read the man page, man virt-install. Don't forget that the architecture of the image must match the CPU you specify with --cpu.

Converting an existing virtual machine

In my case I had many existing VMs on my system. I did not want to maintain the same network setup: in fact, the default DHCP and NAT setup with a bridge provided by libvirt was better than what I had before. To import the VMs, I followed a simple procedure:

  1. Copied the image in the directory of the pool: cp my-vm.qcow2 /opt/kvms/pools/devel
  2. Refreshed the pool, just in case: virsh -c qemu:///system pool-refresh devel
  3. Created a new VM based on that image, by using virt-install with the --import option, for example:
    virt-install --connect qemu:///system --ram 1024 -n my-vm --os-type=linux --os-variant=debianwheezy --disk vol=devel/my-vm.qcow2,device=disk,format=qcow2 --vcpus=1 --vnc --import

    Note the devel/my-vm.qcow2 (pool name, then volume name) indicating the file to use, and --import.
Of course, once the import was completed I had to connect to the VM and change the network parameters to use DHCP instead of a static address.

Managing Virtual Machines

You may have noticed that once you run virt-install, your virtual machine is started. The main commands to manage virtual machines are:
  • virt-viewer my-vm - to have the screen of your VM opened up in a vnc client.
  • virsh start my-vm - to start your VM.
  • virsh destroy my-vm - to stop your VM violently, like pulling the power cord. It is generally much better to run "shutdown" from inside your VM, or better...
  • virsh shutdown my-vm - to send your VM a "shutdown request", as if you had pressed the power button on your server. Note that it is then up to the OS installed and its configuration to decide what to do. Some desktop environments, for example, will pop up a window asking you what you want to do, and not really shut down the machine.
  • virt-clone --original my-vm --auto-clone - to make an exact copy of your VM.
  • virsh autostart my-vm - to automatically start your vm at boot.
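Since a shutdown request can be ignored by the guest, a common pattern is to ask nicely first and only pull the plug after a timeout. A minimal sketch, assuming that "virsh domstate" prints "shut off" for a stopped VM. The function name stop_vm and the 60 second default are arbitrary choices for this example.

```shell
# Ask a VM to shut down, wait up to a timeout, then stop it forcefully.
# Hypothetical helper for this sketch.
#   $1: VM name
#   $2: seconds to wait before falling back to destroy (default 60)
stop_vm() {
    vm_name="$1"
    vm_timeout="${2:-60}"
    vm_uri="${LIBVIRT_DEFAULT_URI:-qemu:///system}"

    virsh -c "$vm_uri" shutdown "$vm_name" || return 1

    vm_waited=0
    while [ "$vm_waited" -lt "$vm_timeout" ]; do
        # domstate reports the current state, e.g. "running" or "shut off".
        if [ "$(virsh -c "$vm_uri" domstate "$vm_name")" = "shut off" ]; then
            return 0
        fi
        sleep 1
        vm_waited=$((vm_waited + 1))
    done

    # The guest ignored the request: stop it the violent way.
    virsh -c "$vm_uri" destroy "$vm_name"
}

# Usage (requires libvirt):
#   stop_vm my-vm 30
```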

A few other random notes

VNC console from remote machine with no libvirt tools

I had to connect to the VNC console of my virtual machines from a remote desktop that did not have virt-viewer installed, so I could not use the -c and URI parameters. A simple port forwarding got me what I wanted:
$ ssh rabexc@server -L 5905:localhost:5900
$ vncviewer :5

This forwards port 5900 on the server, used by the first VM running VNC, to the local port 5905, and asks vncviewer to connect directly to the 5th VNC display locally (5900 + 5 = 5905).
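With more than one VM running, the remote port is not always 5900. The display of a given VM can be read on the server with "virsh vncdisplay my-vm", and the arithmetic above turned into a tiny helper. The name vnc_port is made up for this sketch, and it assumes the display looks like ":1" or "127.0.0.1:1".

```shell
# Convert a VNC display such as ":1" or "127.0.0.1:1" into its TCP port.
# Hypothetical helper for this sketch.
vnc_port() {
    # Keep only what follows the last colon: the display number.
    vnc_display="${1##*:}"
    echo $((5900 + vnc_display))
}

# Usage (requires ssh access to a server running libvirt):
#   display=$(ssh rabexc@server virsh -c qemu:///system vncdisplay my-vm)
#   ssh rabexc@server -L 5905:localhost:"$(vnc_port "$display")"
#   vncviewer :5
```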

virsh snapshots and qcow2

The first time I used "virsh snapshot-create my-vm" to take a snapshot of all the volumes used by my VM, I could not find where the data was stored. It turns out that qcow2 files have direct support for snapshots, which are saved internally within the same file. To see them, besides the virsh commands, you can use: qemu-img info /opt/kvms/pools/devel/my-vm.qcow2.
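A minimal sketch of taking a named snapshot and listing the result, using the virsh snapshot-create-as and snapshot-list commands. The function name snapshot_vm and the snapshot name before-upgrade are only examples.

```shell
# Take a named snapshot of a VM, then list all of its snapshots.
# Hypothetical helper for this sketch.
#   $1: VM name
#   $2: snapshot name
snapshot_vm() {
    snap_vm="$1"
    snap_name="$2"
    snap_uri="${LIBVIRT_DEFAULT_URI:-qemu:///system}"

    virsh -c "$snap_uri" snapshot-create-as "$snap_vm" "$snap_name" &&
        virsh -c "$snap_uri" snapshot-list "$snap_vm"
}

# Usage (requires libvirt):
#   snapshot_vm my-vm before-upgrade
# To go back to that state later:
#   virsh -c qemu:///system snapshot-revert my-vm before-upgrade
# To look at the snapshots stored inside the file, preferably with the VM off:
#   qemu-img snapshot -l /opt/kvms/pools/devel/my-vm.qcow2
```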

Moving qcow2 images around

If you created qcow2 images based on other images, by using -o backing_file=... to only record the differences, moving the images around breaks the derived image: it can no longer find its original backing file. A quick fix was to use:

qemu-img rebase -u -b original_backing_file_in_new_path.img \
    derived_image.qcow2

Note that -u (unsafe mode) is only appropriate if the only thing that really changed is the path of the backing file, and not its content.

Sending qemu monitor commands directly

Before switching to libvirt I was used to managing kvm / qemu VMs by using the monitor interface. Despite what the documentation claims, it is possible to send commands through this interface directly by using:

$ virsh -c qemu:///system \
    qemu-monitor-command \
    --hmp debian-testing "help"

for example.

Finding the IP address of your VM

When a VM starts with the default network configuration it will be assigned an IP via DHCP by dnsmasq. This IP can change. For some reason, I was sort of expecting that dnsmasq, also capable of behaving as a simple DNS server, would maintain a mapping from VM name to IP, and accept DNS queries to resolve the name of the VM. Turns out this is not the case, unless you explicitly add mappings between names and the MAC addresses of your VMs in the network configuration. Or at least, I could not find a better way to do it.

The only reliable way to find the IP of your VM is to either provide a static mapping, or look into /var/lib/libvirt/dnsmasq/default.leases for the MAC address of your VM, where default is the name of your network.

You can find the MAC address of your VM by looking at its xml definition, with something like:
virsh dumpxml debian-modxslt |grep "mac address"

You can find plenty of shell scripts on google to do this automatically for you.
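Here is one such script, as a minimal sketch. It assumes the usual dnsmasq lease file layout of one lease per line (expiry time, MAC address, IP address, hostname, client id) and the lease file path mentioned above. The function name ip_for_mac is made up for this example.

```shell
# Print the IP address leased to a MAC address, or fail if there is none.
# Hypothetical helper for this sketch.
#   $1: MAC address of the VM
#   $2: lease file (defaults to the one of the "default" network)
ip_for_mac() {
    lease_mac="$1"
    lease_file="${2:-/var/lib/libvirt/dnsmasq/default.leases}"

    # Field 2 is the MAC, field 3 the IP. The comparison ignores case, and
    # the last matching line wins in case the file has more than one.
    awk -v mac="$lease_mac" '
        tolower($2) == tolower(mac) { ip = $3 }
        END { if (ip != "") print ip; else exit 1 }
    ' "$lease_file"
}

# Usage (requires libvirt), extracting the first MAC from the VM definition:
#   mac=$(virsh -c qemu:///system dumpxml my-vm |
#         sed -n "s/.*<mac address='\([^']*\)'.*/\1/p" | head -n 1)
#   ip_for_mac "$mac"
```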

Conclusions

Switching to libvirt took me only a few hours, and I am no longer afraid of having to deal with multiple VMs on my laptop :). Creating them, cloning temporarily, or removing them has become an extremely simple task.
