Saturday, July 30, 2016

Enable PXE netboot in KVM guests for testing UEFI

Many sysadmins and infrastructure engineers use Oracle VirtualBox + Vagrant to quickly test new system configurations without having to test on bare metal. VirtualBox is very convenient and its host bridge networking just works out of the box, but unfortunately VirtualBox doesn't support UEFI PXE netboot (it only supports legacy BIOS PXE).

On the other hand, the virt-manager GUI front-end for libvirt does support UEFI through Intel's TianoCore open-source UEFI implementation. But when I created a VM using virt-manager's default macvtap host bridge to my wired interface, the VM could not communicate with my DHCP server to get an IP for PXE netboot.

It turns out that this is a very common problem for libvirt users. The solution is to create an entirely new bridge interface, add your wired interface as a slave of the new bridge, and set the packet forwarding delay on the bridge to 0 (the default forwarding delay is quite high, around 15 seconds, which is why my KVM VMs couldn't reach the DHCP server in time). Optionally, you can also disable Spanning Tree Protocol on the bridge.

Here is a simple script I wrote to set up a bridge for KVM guests to communicate with the host:

#!/bin/bash
# setup-bridge.sh
# Last Updated: 2016-07-28
# Jun Go gojun077

# This script creates a linux bridge with a single ethernet
# iface as a slave. It takes 2 arguments:
# (1) Name of bridge
# (2) wired interface name (to be slave of bridge)
# The ip address for the bridge will be assigned by the script
# 'setup_ether.sh'

# This script must be executed as root and requires the
# packages bridge-utils and iproute2 to be installed.

# USAGE: sudo ./setup-bridge.sh <bridge-name> <wired-iface>
# (ex) sudo ./setup-bridge.sh br0 enp1s0

if [ -z "$1" ]; then
  echo "Must enter bridge name"
  exit 1
elif [ -z "$2" ]; then
  echo "Must enter name of wired iface to be used as slave for bridge"
  exit 1
else
  # create bridge and change its state to UP
  ip link add name "$1" type bridge
  ip link set "$1" up
  # set bridge forwarding delay to 0
  brctl setfd "$1" 0
  # Add wired interface to the bridge
  ip link set "$2" up
  ip link set "$2" master "$1"
  # Show active interfaces
  ip addr show up
fi
==========================================
Update 2016-10-08
You can also create the bridge interface statically using systemd-networkd conf files. Here is a sample br0.netdev, which should be placed in /etc/systemd/network/:

[NetDev]
Name=br0
Kind=bridge

[Bridge]
ForwardDelaySec=0

You would then add the IP address, Gateway, DNS, etc. info to br0.network in the same path as br0.netdev:

[Match]
Name=br0

[Network]
Address=192.168.95.95/24
Broadcast=192.168.95.255
DNS=168.126.63.1

[Route]
Gateway=192.168.95.97

And finally, you need to make your wired interface a slave of the bridge device. Assuming your wired iface is named enp2s0, here is a sample systemd-networkd conf file, enp2s0.network, with the required settings:

[Match]
Name=enp2s0

[Network]
Bridge=br0

Finally, to apply these settings, run systemctl start systemd-networkd and systemctl enable systemd-networkd.
==========================================

If you do not set the forwarding delay to 0, your KVM guest will not be able to communicate with your host.

Now you can assign an IP address to your new bridge.

The next step is to install the OVMF UEFI firmware for QEMU/KVM if you don't already have it. On Fedora 22+, OVMF is a dependency of libvirtd, so it is probably already installed and should just work. The default nvram setting for the OVMF firmware in /etc/libvirt/qemu.conf on Fedora is as follows:

nvram = [
   "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd"
]

This will normally be commented out as it is the default setting.

The name of the OVMF package in Fedora is edk2-ovmf and it contains the following files:

$ rpm -ql edk2-ovmf
/usr/share/doc/edk2-ovmf
/usr/share/doc/edk2-ovmf/README
/usr/share/edk2
/usr/share/edk2/ovmf
/usr/share/edk2/ovmf/EnrollDefaultKeys.efi
/usr/share/edk2/ovmf/OVMF_CODE.fd
/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd
/usr/share/edk2/ovmf/OVMF_VARS.fd
/usr/share/edk2/ovmf/Shell.efi
/usr/share/edk2/ovmf/UefiShell.iso

On Arch Linux, there is also an ovmf package in the default repos, but it doesn't contain all the image files necessary for UEFI. Instead, you need to install the ovmf-git package from the Arch User Repository (AUR). ovmf-git contains the following files:

[archjun@pinkS310 bin]$ pacman -Ql ovmf-git
ovmf-git /usr/
ovmf-git /usr/share/
ovmf-git /usr/share/ovmf/
ovmf-git /usr/share/ovmf/ia32/
ovmf-git /usr/share/ovmf/ia32/ovmf_code_ia32.bin
ovmf-git /usr/share/ovmf/ia32/ovmf_ia32.bin
ovmf-git /usr/share/ovmf/ia32/ovmf_vars_ia32.bin
ovmf-git /usr/share/ovmf/x64/
ovmf-git /usr/share/ovmf/x64/ovmf_code_x64.bin
ovmf-git /usr/share/ovmf/x64/ovmf_vars_x64.bin
ovmf-git /usr/share/ovmf/x64/ovmf_x64.bin

You probably noticed that the filenames are slightly different from those on Fedora. Likewise, the nvram setting in /etc/libvirt/qemu.conf is different on Arch Linux:

# uefi nvram files from ovmf-git AUR
nvram = [
   "/usr/share/ovmf/x64/ovmf_x64.bin:/usr/share/ovmf/x64/ovmf_vars_x64.bin"
]

After changing this setting, you need to restart the libvirtd systemd service with sudo systemctl restart libvirtd.
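Once the nvram mapping is in place and you create a guest with UEFI firmware, the relevant part of the guest's domain XML (viewable with virsh edit) looks roughly like the sketch below. The machine type, guest name, and firmware path here are illustrative, not taken from a real guest; libvirt creates the per-guest _VARS copy automatically based on the nvram setting above:

```xml
<os>
  <type arch='x86_64' machine='pc-i440fx-2.5'>hvm</type>
  <!-- read-only UEFI firmware code image, mapped as pflash -->
  <loader readonly='yes' type='pflash'>/usr/share/ovmf/x64/ovmf_x64.bin</loader>
  <!-- writable per-guest copy of the UEFI variable store -->
  <nvram>/var/lib/libvirt/qemu/nvram/myguest_VARS.fd</nvram>
</os>
```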

You should now be able to install Linux into KVM guests over UEFI PXE netboot. Note that after completing your PXE install and rebooting for the first time, you will have to enter the UEFI firmware menu and select your new Linux boot partition in the boot menu. To access the TianoCore UEFI firmware menu, press ESC when you see the TianoCore logo at POST. I have attached screenshots below.

1. TianoCore UEFI boot screen

2. Grub2 boot menu for UEFI PXE

3. Start UEFI PXE installation for RHEL 7.2

4. vncserver on KVM guest trying to reverse-connect to listening vncclient on my host

5. vnc reverse-connect succeeded; now viewing automated installation in GUI over vnc


6. After reboot, press ESC at POST to enter the TianoCore UEFI firmware menu

7. Enter the Boot Manager menu (through OVMF Platform Config, if I recall)
The first highlighted entry in the boot manager is the bridge device.

8. Go down to UEFI Misc Device to select your newly-installed Linux EFI boot partition.

9. You can also specify that the UEFI misc device be used on the next reboot


References:

http://www.tianocore.org/ovmf/
http://wiki.libvirt.org/page/PXE_boot_(or_dhcp)_on_guest_failed
https://bugzilla.redhat.com/show_bug.cgi?id=533684
https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#Configuring_libvirt

Saturday, July 23, 2016

PXE netboot installation differences between Ubuntu and RHEL

PXE netboot is the preferred method for fully automated/unattended installations of Linux to many machines at once. I have written several posts about PXE netboot for Linux, this one (PXE for UEFI and Legacy BIOS) being the most recent. In the RHEL world, fine-grained system settings are defined in Kickstart files. While Ubuntu supports a subset of Kickstart commands, it is better to use D-I (Debian Installer) preseed files to automate PXE netboot installs of Debian/Ubuntu.

Below is a partial list of the differences I have observed in PXE netboot for Ubuntu and RHEL:

Remote monitoring of PXE installation

RHEL supports monitoring remote installs via VNC or VNC reverse-connect, but Ubuntu does not support VNC. Ubuntu does allow remote installation via SSH, but only for manual installs; you cannot monitor a fully automated Ubuntu install remotely.

To enable VNC for a PXE install on RHEL 7.0, for example, you would add the following options to the APPEND line of a menu entry in your PXE default config file (for legacy BIOS):

LABEL RHEL7.0
  MENU LABEL Boot RHEL 7.0 (ks install)
  KERNEL images/rhel7.0/vmlinuz
  INITRD images/rhel7.0/initrd.img
  APPEND ip=dhcp inst.repo=http://192.168.95.97:8080 inst.vnc ksdevice=link inst.ks=http://192.168.95.97:8000/rhel7.0-ks.cfg

If you want to enable VNC reverse-connect, you would add something like inst.vncconnect=1.2.3.4:5500 after inst.vnc on the APPEND line.
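Putting it together, a full menu entry with reverse-connect enabled might look like the sketch below. The label name is made up, the addresses reuse the illustrative 192.168.95.97 host from the entry above, and 5500 is the conventional listening port for a VNC viewer in listen mode:

```
LABEL RHEL7.0-vncconnect
  MENU LABEL Boot RHEL 7.0 (ks install, vnc reverse-connect)
  KERNEL images/rhel7.0/vmlinuz
  INITRD images/rhel7.0/initrd.img
  APPEND ip=dhcp inst.repo=http://192.168.95.97:8080 inst.vnc inst.vncconnect=192.168.95.97:5500 inst.ks=http://192.168.95.97:8000/rhel7.0-ks.cfg
```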

In the case of Ubuntu, you would enable a D-I preseed install from your PXE default config as follows:

LABEL xenial16.04_netinst-Netconsole
  MENU LABEL Boot Ubuntu 16.04 (manual netinst over ssh)
  KERNEL images/xenial_x64/linux
  INITRD  images/xenial_x64/initrd.gz
  APPEND ip=dhcp auto=true priority=critical locale=en_US.UTF-8 kbd-chooser/method=us netcfg/choose_interface=auto url=http://192.168.95.97:8000/netconsole.cfg

It is important that you specify auto=true for an automated install and priority=critical so that D-I will not prompt the user for information (except in absolutely critical cases). You can specify the location of the D-I preseed file over HTTP (among other methods) with url=.

The preseed file must then contain instructions to start sshd so that you can make an SSH connection to the Debian Installer and control the installation manually. Here is my D-I preseed file for an Ubuntu PXE install over SSH:

# D-I preseed file to enable remote installs of Ubuntu/Debian
# over SSH

#d-i debian-installer/locale string ko_KR.UTF-8
d-i debian-installer/language string en
d-i debian-installer/country string KR
d-i debian-installer/locale string en_US.UTF-8
d-i keyboard-configuration/xkb-keymap select us
d-i debconf/priority                   select critical
d-i auto-install/enabled               boolean true
d-i netcfg/choose_interface            select auto
d-i netcfg/get_hostname                string unassigned-hostname

### Network console
# Use the following settings if you wish to make use of the network-console
# component for remote installation over SSH. This only makes sense if you
# intend to perform the remainder of the installation manually.
d-i anna/choose_modules string network-console
d-i network-console/password           password foofoo
d-i network-console/password-again     password foofoo
d-i preseed/early_command string anna-install network-console


Kickstart syntax support
RHEL and its variants natively support Kickstart syntax for automated installs (although the syntax varies between RHEL versions; I strongly recommend checking your syntax with ksvalidator), but Ubuntu only supports a subset of Kickstart commands. For example, installing to multiple disks with something like

part /boot --fstype=ext4    --ondisk=sda --size=512
part /     --fstype=ext4    --ondisk=sda --size=1 --grow
part /var  --fstype=ext4    --ondisk=sdb --size=1 --grow

is NOT supported in Ubuntu-flavored Kickstart files; you can only install to a single disk when using Ubuntu Kickstart syntax. There are many more such gotchas, so it's a good idea to stick with D-I preseed, the native Debian/Ubuntu automated install solution.
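For comparison, disk setup in a preseed file goes through partman. Below is a minimal sketch of guided partitioning of a single disk; the disk name and the choice of the atomic (everything-in-one-partition) recipe are illustrative, and multi-disk layouts require a hand-written expert_recipe, which is beyond the scope of this post:

```
# Guided partitioning of one disk via partman (sketch)
d-i partman-auto/disk string /dev/sda
d-i partman-auto/method string regular
d-i partman-auto/choose_recipe select atomic
d-i partman-partitioning/confirm_write_new_label boolean true
d-i partman/choose_partition select finish
d-i partman/confirm boolean true
d-i partman/confirm_nooverwrite boolean true
```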


DNS
For all PXE installs, you need a DHCP server to assign IPs to netboot clients. But a fully automated Ubuntu PXE D-I preseed installation also requires a running DNS server. If there is no name server on the PXE netboot subnet, D-I will stop and ask you to either enter the address of a name server or press ENTER for no name server.

Since I use dnsmasq, I made sure that the following lines were commented out in /etc/dnsmasq.conf so that the DNS function would run:

# Listen on this specific port instead of the standard DNS port
# (53). Setting this to zero completely disables DNS function,
# leaving only DHCP and/or TFTP.
#port=5353
#port=0

I am sure there are many more differences, but these are the ones I've encountered so far. If you know of more, please let me know in the comments.

Saturday, July 16, 2016

OpenStack with SELinux

Normally, before installing OpenStack (be it Devstack, RDO, or some other flavor), I set SELinux to permissive. Yesterday, however, I was at a client IDC and they requested that SELinux be set to enforcing before installing RDO 6 Juno.

This was the first time I had received such a request. I edited /etc/selinux/config, set SELINUX=enforcing, and then had SELinux relabel the entire filesystem with fixfiles relabel (answering y to deleting all the files in /tmp). After the relabeling, you must reboot the system.

However, after enabling SELinux, the system integration company working on the project complained that they were getting permission-denied errors for the kvm kernel module. I was able to get things working again by editing /etc/libvirt/qemu.conf and uncommenting the line:

security_driver = "selinux"

and then restarting libvirtd with systemctl restart libvirtd.

In hindsight, I think the client might have previously run into the following issue when trying to install the Red Hat Distribution of OpenStack (RDO) with SELinux disabled:

PackStack fails if SELinux is disabled
The solution is to enable SELinux in permissive mode (if there is a reason not to have it in enforcing mode).
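If you go the permissive route, the change is a single line in /etc/selinux/config, followed by a reboot (or setenforce 0 for the running system). A sketch of the file, with SELINUXTYPE=targeted shown for context as the Fedora/RHEL default:

```
# /etc/selinux/config (sketch)
SELINUX=permissive
SELINUXTYPE=targeted
```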

References:

https://www.centos.org/docs/5/html/5.2/Deployment_Guide/sec-sel-fsrelabel.html

https://www.rdoproject.org/install/selinux/

Thursday, July 7, 2016

Google Cloud Platform billing shock (or don't forget to delete networking resources, too!)

I have played with public cloud services on AWS, Digital Ocean and, recently, Google Cloud Platform (GCP). GCP has a user-friendly, clean interface. One thing I really like is that estimated usage charges appear on-screen when you launch a compute instance.

I followed one of the GCP Kubernetes tutorials from the following link:

https://cloud.google.com/container-engine/docs/tutorials

I created a Kubernetes cluster composed of several Docker containers and a load balancer to split traffic among containers running a hello-world Node.js app. Once you are done with the tutorial, you should of course delete the resources you were using. For some reason, however, I only deleted the compute resources used in the tutorial (the Kubernetes cluster and its constituent containers) and forgot to delete the load balancer. As a result, I was surprised when I got my GCP bill for June 2016:


The item Network Load Balancing Forwarding Rule Minimum Service Charge in APAC indicates that the load balancer ran for 696 hours, or 4 weeks and one day! This cost me over $17.

As soon as I finished the Kubernetes tutorial I made sure to delete the Kubernetes cluster, and as a result I only used 0.65 hours of compute resources costing me just 4 cents. Unfortunately, however, I neglected to delete the external load balancer (doh!) ...

The steps for deleting the Kubernetes pods and the LB are detailed here:

http://kubernetes.io/docs/hellonode/#thats-it-time-to-tear-it-down

That’s it for the demo! So you don’t leave this all running and incur charges, let’s learn how to tear things down.
Delete the Deployment (which also deletes the running pods) and Service (which also deletes your external load balancer):
kubectl delete service,deployment hello-node
I think I missed the load balancer because I deleted resources through the GCP web dashboard instead of from the command line.

To delete networking resources from the dashboard, click the hamburger icon (☰) at top left and select Networking.



Then from the Networking screen, select Load Balancing:


If you have any existing load balancers, they will appear in this screen and you can delete them. Hopefully other people can learn from my dumb mistake and make sure to delete their networking resources in addition to their compute resources when they are done using GCP.

Friday, July 1, 2016

Problems compiling motion 3.2.12 on Fedora 24

Note: On July 17, 2016, Leigh Scott created a new RPM Fusion package, motion-3.3.0.trunkREV561-2.fc24.x86_64, for the motion webcam daemon, which contains a patch for ffmpeg-3. You can also check out the rpmfusion/motion repository on GitHub to see the changes that must be made to ffmpeg.c and configure.in for motion to compile properly.

https://github.com/rpmfusion/motion/blob/master/api-update_libav10.patch
https://github.com/rpmfusion/motion/blob/master/api-update_ffmpeg-2.9.patch

============================================

I upgraded from F23 to F24 as soon as the latter was released. Quite a few packages that are available for F23 from the RPM Fusion community repo are not yet available for F24.

When compiling motion, the first issue I encountered after running

./configure
make

was the following error:

------------------
fatal error: linux/videodev.h: No such file or directory
compilation terminated.

I fixed this problem by installing the Fedora packages libv4l-devel and v4l-utils-devel-tools.

Then I created the following symlink:

sudo ln -s /usr/include/libv4l1-videodev.h   /usr/include/linux/videodev.h 

I tried to compile again:
./configure
make clean
make

The next problem, however, is that motion looks for PIX_FMT_YUV420P in ffmpeg's pixfmt.h header (provided by ffmpeg-3.0.2-1.fc24.x86_64 from RPM Fusion), but this constant has been renamed to AV_PIX_FMT_YUV420P.

I need to find which files to edit with new variable names in order to compile motion... 
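Since the rename is mechanical, one way to experiment is a bulk sed over the sources that reference the old constants. The sketch below runs on a stand-in file so it is safe to try; in the real motion tree the target would be files like ffmpeg.c (illustrative), and this is my workaround idea rather than the upstream fix:

```shell
# Create a stand-in source file containing an old-style constant
printf 'int fmt = PIX_FMT_YUV420P;\n' > /tmp/motion-pixfmt-sample.c

# Map the removed PIX_FMT_* enum names to their AV_PIX_FMT_* replacements;
# the \b word boundary keeps already-renamed AV_PIX_FMT_* names untouched
sed -i 's/\bPIX_FMT_\([A-Z0-9_]*\)/AV_PIX_FMT_\1/g' /tmp/motion-pixfmt-sample.c

cat /tmp/motion-pixfmt-sample.c   # -> int fmt = AV_PIX_FMT_YUV420P;
```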


