Wednesday, April 29, 2015

Linux/Unix text parsing with awk for spreadsheet data

I keep some basic personal health data in a spreadsheet: bedtime, wake-up time, mealtimes, and so on. Simple analysis (average, average deviation, etc.) is easy to do using built-in spreadsheet functions like AVERAGE().

For conditional calculations, however, you will have to start using IF() statements, which can get complicated if you have several conditions you want to check for.

Consider the following Tab-separated data from a spreadsheet covering the month of September 2014:

Bed Wake DoW Condition nap (hrs) Rest (hrs)
9/2/2014 0:45:00 9/2/2014 5:50:00 1 5.08
9/3/2014 0:10:00 9/3/2014 5:50:00 2 5.67
9/4/2014 1:00:00 9/4/2014 5:50:00 3 4.83
9/5/2014 2:00:00 9/5/2014 11:30:00 4 pretty hung over today 9.50
9/6/2014 2:00:00 9/6/2014 9:45:00 5 1.5 9.25
9/7/2014 2:00:00 9/7/2014 9:45:00 6 7.75
9/8/2014 3:00:00 9/8/2014 10:40:00 0 Chuseok day 7.67
9/9/2014 2:00:00 9/9/2014 11:50:00 1 9.83
9/10/2014 0:30:00 9/10/2014 5:50:00 2 came into the office to study; caught a cold 5.33
9/10/2014 23:30:00 9/11/2014 5:50:00 3 cold sore appears on lip 6.33
9/11/2014 23:15:00 9/12/2014 6:20:00 4 7.08
9/13/2014 1:30:00 9/13/2014 11:00:00 5 9.50
9/14/2014 1:30:00 9/14/2014 11:00:00 6 9.50
9/15/2014 1:00:00 9/15/2014 5:50:00 0 common cold has moved to the chest; phlegm comes out 4.83
9/16/2014 0:00:00 9/16/2014 5:00:00 1 woke up 40 minutes early b/c of bad cough, condition is better than it was on Monday 5.00
9/17/2014 0:00:00 9/17/2014 5:50:00 2 5.83
9/18/2014 0:00:00 9/18/2014 6:00:00 3 6.00
9/19/2014 0:00:00 9/19/2014 5:50:00 4 5.83
9/19/2014 23:00:00 9/20/2014 9:30:00 5 10.50
9/21/2014 0:30:00 9/21/2014 10:00:00 6 9.50
9/22/2014 0:00:00 9/22/2014 5:55:00 0 5.92
9/23/2014 0:00:00 9/23/2014 9:00:00 1 9.00
9/24/2014 12:30:00 9/24/2014 17:30:00 2 5.00
9/24/2014 23:40:00 9/25/2014 5:50:00 3 6.17
9/25/2014 23:30:00 9/26/2014 5:50:00 4 6.33
9/27/2014 1:30:00 9/27/2014 10:00:00 5 1 9.50
9/28/2014 0:30:00 9/28/2014 10:00:00 6 9.50
9/29/2014 0:30:00 9/29/2014 5:55:00 0 5.42
9/29/2014 23:40:00 9/30/2014 5:55:00 1 6.25
9/30/2014 23:50:00 10/1/2014 5:55:00 2 6.08 

The 3rd field, DoW (Day of Week), takes values from 0 to 6, with 0 being Monday and 6 being Sunday. Getting the average hours slept over the whole week (Monday through Sunday) is trivial, as I can simply use AVERAGE() on the 6th column, Rest.

But what if I want to find the average number of hours slept on the weekends (when DoW is 5 or 6)? Doing it the spreadsheet way would require an IF() statement checking if the 3rd field, DoW, is either 5 or 6 and then taking the average of the values in the 6th field, Rest, in the case that the IF conditions are satisfied.

Using Linux/UNIX text parsing tools is simpler, in my opinion. First I will copy the above TSV data into a text file named sept2014.txt

I will now print all lines to stdout (or I could redirect output to a file with > filename) satisfying the condition that the 3rd field contains a 5 or a 6.

$ cat sept2014.txt | awk -F'\t' '$3 == "5" || $3 == "6"'

The -F flag designates the field separator, which here is TAB, written \t (single-quoted so the shell passes it to awk unchanged). By default awk splits fields on any run of whitespace, spaces and TABs alike. That default would not work here, because the Bed, Wake, and Condition fields themselves contain spaces, so the separator must be set to TAB explicitly.

$N, where N is a positive integer, denotes the Nth field. $3 == "5" checks whether the 3rd field has the value 5, while $3 == "6" checks whether it has the value 6. When the combined condition is true and no action is given, awk prints the whole line.
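To see why the separator matters for this data, here is a minimal sketch that runs awk on a single row twice, once with the default separator and once with TAB. The row is copied from the table above; the variable names are only for illustration.

```shell
# One row from the data above, with real TAB characters between fields.
row=$(printf '9/6/2014 2:00:00\t9/6/2014 9:45:00\t5\t\t1.5\t9.25')

# Default separator: splits on spaces as well as TABs, so field 3 is the wake date.
default_third=$(printf '%s\n' "$row" | awk '{ print $3 }')

# TAB separator: each date and time stay together, so field 3 is DoW.
tab_third=$(printf '%s\n' "$row" | awk -F'\t' '{ print $3 }')

echo "default FS: $default_third"
echo "TAB FS:     $tab_third"
```

With the default separator the third field is 9/6/2014 rather than 5, so the weekend filter would never match.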

The output of the above one-liner is:

9/6/2014 2:00:00 9/6/2014 9:45:00 5 1.5 9.25
9/7/2014 2:00:00 9/7/2014 9:45:00 6 7.75
9/13/2014 1:30:00 9/13/2014 11:00:00 5 9.50
9/14/2014 1:30:00 9/14/2014 11:00:00 6 9.50
9/19/2014 23:00:00 9/20/2014 9:30:00 5 10.50
9/21/2014 0:30:00 9/21/2014 10:00:00 6 9.50
9/27/2014 1:30:00 9/27/2014 10:00:00 5 1 9.50
9/28/2014 0:30:00 9/28/2014 10:00:00 6 9.50

As you can see, only rows corresponding to DoW 5 or 6 (Saturday or Sunday) are printed. I can now copy and paste this data into a new sheet in the existing spreadsheet and calculate the average hours of sleep for the weekends. I think that sometimes quick text parsing with Linux/Unix text utilities is much faster than trying to write your own spreadsheet macro or create a multiply-nested spreadsheet formula.
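If you would rather skip the copy-and-paste step, awk can also compute the average itself. The following is only a sketch: the function name weekend_avg is my own, and it assumes Rest really is the 6th TAB-separated field on every row.

```shell
# Print the mean of column 6 (Rest) for rows whose column 3 (DoW) is 5 or 6.
# Reads TAB-separated data from the file(s) given as arguments, or from stdin.
weekend_avg() {
  awk -F'\t' '
    $3 == "5" || $3 == "6" { sum += $6; n++ }   # accumulate weekend rows only
    END { if (n > 0) printf "%.3f\n", sum / n } # print nothing if no rows matched
  ' "$@"
}

# Usage: weekend_avg sept2014.txt
```

The header row needs no special handling because its third field is the text DoW, which is neither 5 nor 6. The eight weekend Rest values shown above sum to 75, so on that data the result should be 9.375.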

Wednesday, April 22, 2015

Reconfiguring multipath devices on a production system

About two weeks ago, I was working my way down a server health checklist during the 2 ~ 5 A.M. maintenance window on an early Tuesday morning. Among four application servers running RHEL 6.4, one of them showed an incorrect number of LUNs (Logical Unit Numbers) from the 3PAR SAN connected over fibre channel (fc).

Since there were 8 disks in the array on the SAN and 8 fc paths from SAN-to-switch and switch-to-server, 8 x 8 = 64 LUN paths (one sd device per disk per path) should have appeared when I invoked multipath -ll (or more accurately, multipath -ll | grep sd | wc -l), but only 38 showed up.

Inside /etc/multipath.conf there is a blacklist {} section containing WWIDs for the devices multipathd is supposed to ignore, like local disks. In addition to blacklisting individual devices by WWID, you can also specify classes of devices that multipathd should ignore by using the devnode keyword. Unfortunately, there was a problem with one particular devnode blacklist:

devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st|sda)[0-9]*"

Can you find a problem with the above device blacklist statement?

Earlier I mentioned that an 8-disk array was connected to a Linux server by fc cables constituting 8 paths, so 64 sd devices should appear on the server. Assuming that the local disks are /dev/sda, /dev/sdb, /dev/sdc, and /dev/sdd, the 64 SAN devices will take up the device names /dev/sde through /dev/sdbp (22 names from sde to sdz, 26 from sdaa to sdaz, and 16 from sdba to sdbp).

The devnode blacklist above correctly blacklists /dev/sda, which is a local disk, but because the pattern is not anchored at the end, it also matches every name that begins with sda: /dev/sdaa, /dev/sdab, ... /dev/sdaz, which comes to 26 devices. Therefore only 64 - 26 = 38 were appearing.
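The arithmetic can be checked without touching a SAN. The sketch below generates the sd device names for the 64 SAN paths and counts how many the blacklist pattern matches. The helper sd_names is my own, and I am assuming that multipath's devnode matching behaves like an unanchored POSIX extended regex, which is how grep -E treats the pattern.

```shell
# Print Linux sd device names for zero-based disk indexes $1..$2
# (0 -> sda, 25 -> sdz, 26 -> sdaa, ...).
sd_names() {
  awk -v first="$1" -v last="$2" 'BEGIN {
    letters = "abcdefghijklmnopqrstuvwxyz"
    for (i = first + 0; i <= last + 0; i++) {
      n = i; name = ""
      while (n >= 0) {
        name = substr(letters, n % 26 + 1, 1) name   # prepend the next letter
        n = int(n / 26) - 1
      }
      print "sd" name
    }
  }'
}

blacklist='^(ram|raw|loop|fd|md|dm-|sr|scd|st|sda)[0-9]*'

# Indexes 0-3 are the local disks sda-sdd; the 64 SAN paths are indexes 4-67.
blocked=$(sd_names 4 67 | grep -Ec "$blacklist")
echo "blacklisted: $blocked, visible: $((64 - blocked))"
```

This prints "blacklisted: 26, visible: 38", which matches what multipath -ll was showing.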

After removing 'sda' from the above devnode blacklist statement, I had to notify our client that they would have to stop any applications running on the SAN multipath devices so that I could unmount the partitions on the SAN disks.

After the dev team brought their applications down, I tried to umount the partitions, but umount complained the devices were still busy. Apparently there were still some leftover threads locking files on the SAN devices. To find the PIDs of the processes with files open on the SAN disks, I used lsof /foo, which should return something like the following:

# lsof /foo
COMMAND   PID  USER   FD   TYPE DEVICE SIZE/OFF      NODE NAME
bash    34266 user1  cwd    DIR 253,14     4096 376176642 /foo/my_app/data01
...

The PID can be seen in the second field of the output lines. Simply kill -15 34266 (or kill -9 ... if that doesn't work) and retry umount /foo
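When many processes hold files open, picking PIDs out by eye gets tedious. Here is a small sketch that pulls the distinct PIDs out of lsof output; the function name pids_from_lsof is my own invention. (lsof also has a -t option that prints only PIDs, which may be all you need.)

```shell
# Read lsof output on stdin and print each distinct PID once.
# NR > 1 skips the header row; the PID is the second field.
# The numeric check drops any line whose second field is not a PID.
pids_from_lsof() {
  awk 'NR > 1 && $2 ~ /^[0-9]+$/ { print $2 }' | sort -un
}

# Usage (/foo is the placeholder mount point from above):
#   lsof /foo | pids_from_lsof
```

I would still review the list before passing it to kill, since it may include your own shell if its working directory is on the mount.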

Despite unmounting all the mount points on the multipath devices, when I tried to flush all unused multipath device maps with multipath -F to proceed with reconfiguring the device mappings, I got the following:

# multipath -F
Apr 07 05:44:24 | mpathh: map in use
Apr 07 05:44:24 | mpathg: map in use
Apr 07 05:44:24 | mpathf: map in use
Apr 07 05:44:24 | mpathe: map in use
Apr 07 05:44:24 | mpathd: map in use
Apr 07 05:44:24 | mpathc: map in use
Apr 07 05:44:24 | mpathb: map in use
Apr 07 05:44:24 | mpatha: map in use

Although the mount points on the multipath devices are no longer in use, if there are LVM physical volumes on the SAN disks, the volume group and logical volumes are probably still active and holding the maps open. You can deactivate each LV individually with lvchange -an /dev/VGname/LVname, but it is easier to deactivate the entire VG residing on the SAN disks with vgchange -an VGname

Now I can flush the multipath config data with multipath -F, reload the changes in /etc/multipath.conf with service multipathd reload and finally restart the multipath daemon with service multipathd restart

Now I want to check if all 64 paths show up. I run multipath -v2, which rebuilds the device maps and prints them as it goes (2 is the default verbosity level; -v3 adds debug output), then retry multipath -ll

Now that all 8 LUNs appear over 8 paths each (64 sd devices in total), it is time to reactivate the Volume Group with vgchange -ay VGname and to remount the file systems.
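For reference, the order of operations described above can be written down as a dry-run script. This is only a sketch of my own sequence, not a general-purpose tool: VGname and /foo are the placeholders used in this post, the run and reconfigure_multipath helpers are made up for illustration, and the final mount assumes the file system has an /etc/fstab entry.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 = volume group name, $2 = mount point (both placeholders).
# The && chain stops at the first command that fails.
reconfigure_multipath() {
  vg=$1; mnt=$2
  run umount "$mnt" &&
  run vgchange -an "$vg" &&
  run multipath -F &&
  run service multipathd reload &&
  run service multipathd restart &&
  run multipath -v2 &&
  run multipath -ll &&
  run vgchange -ay "$vg" &&
  run mount "$mnt"
}

reconfigure_multipath VGname /foo
```

Run as-is, this only prints the nine commands prefixed with "+". Set DRY_RUN=0 only on a machine where you actually intend to execute them, and only after the applications have been stopped.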

*Note: My fellow engineers sometimes use dmsetup remove dm-name to flush the device-mapper cache if vgchange -an fails to deactivate an LVM Volume Group. Apparently dmsetup remove does not delete any data; all partitions are still intact on disk, but it allows you to flush the device-mapper data. This is useful if multipathd complains that device maps are still in use on disks that you want to reconfigure for multipathing.

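Using this method, once you have fixed the multipath setup, the LVM metadata should still be on disk, so rescanning with vgscan and reactivating with vgchange -ay VGname should normally be enough. If you do have to recreate the PV, VG, and LVs, they must be exactly the same as the originals, and you must not run mkfs on the "new" LVs, as the old data is still intact on disk. Simply remount the LVs and everything should be OK.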

Monday, April 6, 2015

glibc patch for non-LTS Ubuntu 12.10

A few weeks ago when the glibc GHOST vulnerability (CVE-2015-0235) was announced, sysadmins and system engineers the world over frantically began patching systems. Although most of the servers my company manages have been patched by now, I got a weird request from a client: they have an old development machine still running Ubuntu 12.10 Quantal Quetzal, a non-LTS release that went out of support in 2014. The Ubuntu security advisory for glibc (known as eglibc in Ubuntu and provided by the libc6 package) indicates that patches are available for 12.04 and 10.04, but 12.10 is left out in the cold.

Taking a look at the packages depending on libc6 in Ubuntu 12.04 reveals 19 packages including libc6 itself:

libc6 (mandatory)
libc-bin (mandatory)
libc6-i386
libc6-dbg
libc6-dev
libc6-dev-i386
linux-libc-dev (req'd by libc6-dev, libc6-dev-i386)
libc-dev-bin
libc6-pic
libc6-prof
glibc-doc
nscd

libc6-amd64 (i386)
libc6-dev-amd64 (i386)
libc6-xen (i386)
libnss-files-udeb (debian installer build only!)
libnss-dns-udeb (debian installer build only!)
libc6-udeb (debian installer build only!)
multiarch-support (dummy pkg)

All of the above packages must be upgraded to version 2.15-0ubuntu10.10 or above for Ubuntu 12.04!

Since my client is running 12.10 64-bit on x86 hardware, packages for the i386 architecture (marked "(i386)" in the list above) can be ignored. The libc6-i386 and libc6-dev-i386 packages cannot be ignored, however, as they are multiarch 32-bit glibc packages for 64-bit Ubuntu. The debian installer build packages with the '-udeb' suffix and the dummy package can be ignored as well.

To check the current eglibc version in use by 12.10 Quantal Quetzal, run the following from the commandline:

ldd --version
Ubuntu eglibc 2.15-0ubuntu20

Hmm... the unpatched version of eglibc in Ubuntu 12.10 is nominally higher than that of the patched version (2.15-0ubuntu10.10) in Ubuntu 12.04. But the higher version doesn't mean we are safe because the Quantal Quetzal packages don't receive updates!
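A quick way to confirm which of two version strings sorts higher is GNU sort's version ordering. This is only a rough stand-in: sort -V approximates dpkg's comparison rules but does not implement them, and on a Debian or Ubuntu machine dpkg --compare-versions is the authoritative check. The helper name newer_version is my own.

```shell
# Print the higher of two version strings using GNU sort's version ordering.
# This only approximates dpkg's rules for Debian package versions.
newer_version() {
  printf '%s\n' "$1" "$2" | sort -V | tail -n 1
}

# Unpatched 12.10 version vs. patched 12.04 version
newer_version 2.15-0ubuntu20 2.15-0ubuntu10.10
```

This prints 2.15-0ubuntu20, so moving to the patched 12.04 packages really is a downgrade as far as the package manager is concerned.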

One solution is to manually downgrade all the 12.10 eglibc packages to patched 12.04 versions.

First we need to find which glibc/eglibc packages are currently installed on the Ubuntu 12.10 machine, because during the patch we don't want to install any unnecessary packages.

Enter the following bash for-loop on the command-line:

# libc6 also matches libc6-i386, libc6-dev, etc.;
# libc-dev matches libc-dev-bin and linux-libc-dev
for i in libc6 libc-bin libc-dev glibc-doc nscd; do
  dpkg -l | grep -- "$i"
done

Installed packages that match an entry in the list are displayed one per line. Here's what I get when I run the above command on a minimal 12.10 install:

ii  libc6:amd64                        2.15-0ubuntu20             amd64        Embedded GNU C Library: Shared libraries
ii  libc-bin                           2.15-0ubuntu20             amd64        Embedded GNU C Library: Binaries


Only two glibc-related packages are installed but on a development machine it would not be surprising for more packages to be returned.
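On a machine with more packages installed, it helps to reduce the dpkg -l listing to just names and versions. The sketch below does this with awk; the function name installed_glibc_pkgs and the exact list of name prefixes are my own choices based on the package list earlier in this post.

```shell
# Read dpkg -l output on stdin; print "name version" for installed ("ii")
# packages whose name starts with one of the glibc-related prefixes.
installed_glibc_pkgs() {
  awk '$1 == "ii" && $2 ~ /^(libc6|libc-bin|libc-dev|linux-libc-dev|glibc-doc|nscd)/ {
    print $2, $3
  }'
}

# Usage: dpkg -l | installed_glibc_pkgs
```

The names printed here are the ones for which you need to fetch 12.04 .deb files.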

Updated 12.04 LTS packages can still be downloaded from the web (which is unfortunately no longer the case for 12.10, as it is no longer supported). At the following link you can download the latest libc6 for 12.04:

http://packages.ubuntu.com/precise/amd64/libc6

And from packages.ubuntu.com you can also search for the other packages you need (listed in the security advisory link presented earlier).

OK, so now you have downloaded all the packages you need to some directory on your 12.10 box. It's time to "downgrade" your 12.10 libc6-related packages to those from 12.04.

Assuming all the downloaded .deb files for downgrade exist in the same folder, you can run the following:

$ sudo dpkg -i *.deb
dpkg: warning: downgrading libc6:amd64 from 2.15-0ubuntu20 to 2.15-0ubuntu10.11
...

Although you are downgrading the packages, you are downgrading to patched versions from 12.04 LTS.

Now if you reboot and run ldd --version, you will see that your system is running the patched version of eglibc:

$ ldd --version
ldd (Ubuntu EGLIBC 2.15-0ubuntu10.11) 2.15


A Note about enabling apt-get for unsupported Ubuntu versions

If you try to run sudo apt-get install foo in Ubuntu 12.10 you will get a message that this version is no longer supported. But what if you want to upgrade to 13.04 and from there to 14.04 LTS? Or what if you plan to stay at 12.10 but just want to download additional packages from that version?

First of all, you need to edit your /etc/apt/sources.list file and change the URL for the package repository from us.archive.ubuntu.com to old-releases.ubuntu.com

Sure, you could do this manually, copy-pasting multiple times, but I suggest you use vi's global find-replace for this task: Enter the following in the vi buffer while editing sources.list:

:%s:us.archive.ubuntu.com:old-releases.ubuntu.com:g

where % applies the command to every line
s means substitute
g means replace every occurrence on a line
and : is used as the delimiter instead of the usual /
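The same substitution can be done non-interactively with sed. This is a sketch under a couple of assumptions: the function name to_old_releases is my own, and the second expression, which also rewrites security.ubuntu.com, is an addition that only matters if your sources.list contains such entries.

```shell
# Rewrite repository hosts for an end-of-life release.
# Reads stdin and writes stdout, so the result can be reviewed
# before it replaces /etc/apt/sources.list.
to_old_releases() {
  sed -e 's/us\.archive\.ubuntu\.com/old-releases.ubuntu.com/g' \
      -e 's/security\.ubuntu\.com/old-releases.ubuntu.com/g'
}

# Usage: to_old_releases < /etc/apt/sources.list > /tmp/sources.list.new
```

Writing to a separate file first keeps the original sources.list intact until you have compared the two with diff.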

Now refresh the package index from the new repository:

sudo apt-get update

You will find that you can now use apt-get to install packages from 12.10 or even upgrade the distro to supported (and patched) versions of Ubuntu.