Saturday, September 24, 2016

Using Korean UTF8 fonts in LaTeX with koTeX

Although I am not an academic, I sometimes use LaTeX to write documents containing mathematical expressions. My resume is also formatted as a .tex document which I then export to PDF using pdflatex. One inconvenience, however, is that pdflatex cannot typeset Korean UTF8 text out of the box.

When I searched the Internet for ways to add Korean text to .tex documents, I first came across the package texlive-langcjk, which can be invoked in your document with \usepackage {CJKutf8}. In my tests, however, it failed to render Korean text correctly with the pdflatex rendering engine.

I then tried the koTeX package which can be found in the default Fedora 24 repositories as texlive-kotex-* and in the default Archlinux repositories as texlive-langkorean. In your LaTeX preamble simply add

\usepackage {kotex}

and you will be able to use any UTF8 Korean fonts you have installed on your system. Note that you must use xetex instead of pdflatex as your tex to pdf rendering engine, however.
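
As a minimal sketch of the setup described above, the following Python snippet builds the smallest .tex source I would expect to work with koTeX and the command line to compile it. The function names, the sample sentence, and the file name hello.tex are made up for illustration; xelatex is the LaTeX front end of the XeTeX engine.

```python
def minimal_kotex_source(korean_text):
    """Return a minimal LaTeX document that loads koTeX (illustrative helper)."""
    return "\n".join([
        r"\documentclass{article}",
        r"\usepackage{kotex}",      # the only line koTeX needs in the preamble
        r"\begin{document}",
        korean_text,                # UTF8 text goes straight into the body
        r"\end{document}",
        "",
    ])


def xelatex_command(tex_path):
    """Command line that compiles tex_path with XeLaTeX instead of pdflatex."""
    return ["xelatex", "-interaction=nonstopmode", tex_path]


source = minimal_kotex_source("안녕하세요, LaTeX!")
print(source)
print(" ".join(xelatex_command("hello.tex")))
```

Saving the printed source as hello.tex and running the printed command should produce hello.pdf, provided a Korean font that koTeX can find is installed.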

To use xetex on Fedora, you must have the texlive-xetex-bin package installed. On Archlinux, xetex can be found in the package texlive-bin. If you find that you are missing some .sty package files needed to render your LaTeX document to PDF, you will have to install those separately. For example, on Fedora I had to separately install texlive-isodate, texlive-substr, texlive-textpos, and texlive-titlesec before my resume template could render into PDF using xetex.

Saturday, September 17, 2016

Openstack Mitaka multinode install using Packstack

I recently did a four-node install of Openstack Mitaka (RDO 9) using a pre-edited Packstack answer file. Of the four nodes, two are Nova compute nodes, one is a control node (Keystone, Horizon, Nova Controller, Neutron, etc) and one is a storage node (Glance and Cinder).

The external and internal network interfaces are br-ex and br-eno2, respectively. Before running

packstack --answer-file [name of answerfile]

you must create these interfaces manually in /etc/sysconfig/network-scripts.

On my servers, I made eno1 a slave of br-ex and used an Open vSwitch bridge instead of the built-in Linux bridge. Here is my network config file for eno1 (ifcfg-eno1):

NAME=eno1
UUID=a5802af4-1400-4d4f-964c-2eae6e20905f
DEVICE=eno1
DEVICETYPE=ovs
ONBOOT=yes
TYPE=OVSPort
BOOTPROTO=none
OVS_BRIDGE=br-ex


And here is my network config file for br-ex (ifcfg-br-ex):

NAME=br-ex
DEVICE=br-ex
DEVICETYPE=ovs
TYPE=OVSBridge
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.10.116
NETMASK=255.255.255.0
GATEWAY=192.168.10.1
PEERDNS=yes
DNS1=168.126.63.1
DNS2=168.126.63.2


You will also need to make similar settings for eno2 and br-eno2 (which will be used for the internal or management network). In my case the external network is on 192.168.10.x, while my internal network is on 192.168.95.x.
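
The port/bridge pattern shown above for eno1 and br-ex can be sketched as a small Python generator. This is only an illustration: the function names are mine, the address 192.168.95.116 is a made-up example on the internal subnet, and I am assuming that the UUID line can be omitted and that the internal bridge needs no GATEWAY or DNS entries.

```python
def ovs_port_ifcfg(nic, bridge):
    """Text of ifcfg-<nic> for a NIC attached as a port of an OVS bridge."""
    lines = [
        f"NAME={nic}",
        f"DEVICE={nic}",
        "DEVICETYPE=ovs",
        "ONBOOT=yes",
        "TYPE=OVSPort",
        "BOOTPROTO=none",
        f"OVS_BRIDGE={bridge}",
    ]
    return "\n".join(lines) + "\n"


def ovs_bridge_ifcfg(bridge, ipaddr, netmask="255.255.255.0"):
    """Text of ifcfg-<bridge> for an OVS bridge with a static address."""
    lines = [
        f"NAME={bridge}",
        f"DEVICE={bridge}",
        "DEVICETYPE=ovs",
        "TYPE=OVSBridge",
        "ONBOOT=yes",
        "BOOTPROTO=static",
        f"IPADDR={ipaddr}",
        f"NETMASK={netmask}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical internal-network pair; the IP address is an assumption.
print(ovs_port_ifcfg("eno2", "br-eno2"))
print(ovs_bridge_ifcfg("br-eno2", "192.168.95.116"))
```

The printed text would go into ifcfg-eno2 and ifcfg-br-eno2 in /etc/sysconfig/network-scripts.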

To apply the new settings, run systemctl restart network. Also make sure that you stop and disable NetworkManager before running Packstack.

In the Packstack answer file, I used br-ex and br-eno2 in the following four settings:

CONFIG_NEUTRON_L3_EXT_BRIDGE=br-ex

CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS=extnet:br-ex,physnet1:br-eno2

CONFIG_NEUTRON_OVS_BRIDGE_IFACES=br-ex:eno1,br-eno2:eno2

CONFIG_NEUTRON_OVS_BRIDGES_COMPUTE=br-ex,br-eno2

I also specified that extnet and physnet1 use flat networking instead of vxlan or vlan so that instances can communicate with my existing physical network.

In order to install Glance and Cinder on a node separate from the control node, you must enable unsupported parameters as follows:

# Specify 'y' if you want to use unsupported parameters. This should
# be used only if you know what you are doing. Issues caused by using
# unsupported options will not be fixed before the next major release.
# ['y', 'n']
CONFIG_UNSUPPORTED=y


then specify the storage node IP which will host Glance and Cinder:

# (Unsupported!) Server on which to install OpenStack services
# specific to storage servers such as Image or Block Storage services.
CONFIG_STORAGE_HOST=192.168.95.11


It is highly recommended to define hostnames for each node so that Openstack can refer to nodes without using IP addresses (which can change), but in my test installation I didn't specify hostnames and IPs in /etc/hosts for each node. Because I simply use IP addresses as identifiers, after the Packstack install completes I have to edit /etc/nova/nova.conf on both compute nodes and change the following setting to use an IP address instead of a hostname:

vncserver_proxyclient_address=192.168.10.115

Note that this address must be in the same subnet as the VNC base URL in nova.conf.
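
The subnet requirement can be checked with a few lines of Python. This is a sketch under assumptions: the base URL below (host, port and path) is a hypothetical value standing in for whatever your nova.conf contains, and the /24 prefix comes from the NETMASK=255.255.255.0 used in the interface files above. It only works when the URL contains an IP address rather than a hostname.

```python
import ipaddress
from urllib.parse import urlsplit


def same_subnet(proxyclient_address, base_url, prefix_len=24):
    """True if proxyclient_address lies in the subnet of the host in base_url."""
    host = urlsplit(base_url).hostname
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{host}/{prefix_len}", strict=False)
    return ipaddress.ip_address(proxyclient_address) in network


# Hypothetical VNC base URL; replace it with the value from your nova.conf.
base_url = "http://192.168.10.116:6080/vnc_auto.html"
print(same_subnet("192.168.10.115", base_url))  # external subnet
print(same_subnet("192.168.95.11", base_url))   # internal subnet
```

The first call prints True and the second prints False, because 192.168.95.11 is on the internal network rather than the external one.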

You can see my entire Openstack Mitaka Packstack answer file for four-node installation at the following link:

https://gist.githubusercontent.com/gojun077/104e431f17694e2ba90e186c2fadaaaa/raw/69357ae21be2d2120e18aa7cfa967ad760aa9013/rdo-9-packstack-sample-four-node-answer-file.cfg



Saturday, September 10, 2016

Preparing a Win7 VM for use in Openstack Mitaka

In the project I am currently working on, the client wants to use a variety of OS instances on top of Openstack. Running recent Linux VM's (kernel versions 2.6 and above support the virtio drivers used in Openstack) on Nova just works out of the box, but running Windows VM's on Openstack requires some preparation -- namely, you must first install Redhat's virtio drivers into a Win7 VM and then import the VM image into Glance.

Since I already have a Win7 VM I use for testing in Virtualbox (downloaded legally from Microsoft at modernie), I thought it would be a simple task to just install virtio-win drivers into my Virtualbox Win7 instance. This idea didn't pan out, however.

If you follow the Redhat guide for installing virtio drivers into Windows VM's, your VM needs to have the following System Devices in Device Manager:

  • PCI standard RAM Controller (for memory balloon driver)
  • PCI Simple Communication Controller (for vioserial driver)
  • virtio network (for netKVM driver)
  • virtio storage (for viostor driver)

Virtualbox by default supports the virtio-net network adapter type, but it doesn't support virtio storage devices, memory ballooning, or vioserial, so the corresponding virtio drivers cannot be installed there.

In the list of System Devices of a Win7 modernie VM running on Virtualbox, the required devices noted above do not exist.

I therefore had to do things the hard way. First I downloaded a Windows 7 installation iso, then I created a new KVM virtual machine using virt-manager. The virtual machine has the following devices:


  • 2 IDE CD-ROM's - the first CD-ROM must mount the virtio-win iso file while the second CD-ROM will mount the Windows 7 iso.
  • 1 VirtIO hard disk
  • 2 VirtIO network adapters (1 for the private subnet, 1 for the public subnet for floating IP's in Openstack)

As of September 2016, you also have to change the video type from QXL to Cirrus, otherwise the VM will get stuck on "Starting Windows". This is a bug in qemu and might have been fixed by the time you read this article.

The Win7 installer will not be able to find any disks (because the disk uses virtio), so when it asks you for the location of the disk, click the install drivers button and select the CD-ROM containing the Redhat virtio drivers for Windows.

I followed a guide for these steps. Although the guide is written for VMWare, it works just as well for KVM. However, its last step about adding the virtio drivers to the registry with regedit is not necessary for KVM.

Saturday, September 3, 2016

pdfshuffler for adding and removing pages from a PDF file

I used to use pdfmod for Linux to concatenate multiple PDF files or to remove pages from a PDF file. However, pdfmod depends on the mono libraries (an open source implementation of MS .NET), which take up over 100 MB on disk when installed.

I have found a nice, lightweight replacement: pdfshuffler has just a handful of python dependencies and takes up less than 5 MB on disk. It has all the functionality of pdfmod but is much lighter. It can be found in the default repos on Fedora 24 and in the Arch User Repository. Highly recommended!



Note that you may run into a python bug when running pdfshuffler; you simply need to change ~2 lines of Python source as a workaround:

https://sourceforge.net/p/pdfshuffler/bugs/39/?limit=25#30ad/6f47

Saturday, August 27, 2016

Finding PID for programs - Why 'pidof foo' is less trustworthy than 'ps -eF | grep foo'

I often use pidof to find the PID of a running program. It works well with dhclient, dnsmasq and other executable binaries. But pidof runs into problems when trying to find the PID of a script that is run by an interpreter. Take, for example, the deluge-gtk BitTorrent client, which is a Python 2 program.

[archjun@pinkS310 bin]$ pidof deluge-gtk
[archjun@pinkS310 bin]$ ps -eF | grep "deluge*" | grep -v grep
archjun  25862     1  3 289160 89272  1 16:47 ?        00:01:30 /usr/bin/python2 /usr/bin/deluge-gtk /MULTIMEDIA/Torrents/CentOS-5.5-x86_64-bin-DVD.torrent

In the first case, pidof fails to return any PID for the deluge-gtk executable file. In the second case, grepping for deluge-gtk in the output of ps -eF (all processes, extra full format) correctly returns the PID of the BitTorrent client which is executed by Python 2.

Let's take a look at the contents of the deluge-gtk executable file:

[archjun@pinkS310 bin]$ cat /usr/bin/deluge-gtk
#!/usr/bin/python2
# EASY-INSTALL-ENTRY-SCRIPT: 'deluge==1.3.13.dev0','gui_scripts','deluge-gtk'
__requires__ = 'deluge==1.3.13.dev0'
import sys
from pkg_resources import load_entry_point

if __name__ == '__main__':
    sys.exit(
        load_entry_point('deluge==1.3.13.dev0', 'gui_scripts', 'deluge-gtk')()
    )

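Because the file starts with a #!/usr/bin/python2 line, the process that actually runs is /usr/bin/python2, and /usr/bin/deluge-gtk is only its first argument. pidof looks for a process named deluge-gtk and finds none, while ps -eF prints the full command line of every process, so grep can match the script name there. pgrep -f deluge-gtk, which also matches against the full command line, is another option.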
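
A minimal sketch of the same lookup in Python, assuming a Linux /proc filesystem: each /proc/<pid>/cmdline file holds the NUL-separated argument list of a process, so a script can be found by looking at the first two entries (the executable and, for interpreted programs, the script). The function names are mine, and the sketch would miss interpreters started with extra options before the script path.

```python
import os


def argv_from_cmdline(raw):
    """Split the NUL-separated bytes of /proc/<pid>/cmdline into strings."""
    return [arg.decode("utf-8", "replace") for arg in raw.split(b"\0") if arg]


def runs_program(argv, name):
    """True if name is the basename of the executable or of its first argument."""
    return any(os.path.basename(arg) == name for arg in argv[:2])


def find_pids(name, proc_root="/proc"):
    """PIDs whose executable or script basename equals name (Linux only)."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as handle:
                argv = argv_from_cmdline(handle.read())
        except OSError:
            continue  # the process exited or is not readable
        if runs_program(argv, name):
            pids.append(int(entry))
    return sorted(pids)


# The command line from the ps output above, as it would appear in /proc.
sample = (b"/usr/bin/python2\0/usr/bin/deluge-gtk\0"
          b"/MULTIMEDIA/Torrents/CentOS-5.5-x86_64-bin-DVD.torrent\0")
print(runs_program(argv_from_cmdline(sample), "deluge-gtk"))
```

Calling find_pids("deluge-gtk") on the machine above would be expected to return the PID that ps reported.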

Saturday, August 20, 2016

Web scraping using lynx and shell utilities

In 2016, many people would probably think of using Python modules such as BeautifulSoup, urllib, or requests for scraping and parsing web pages. While this is a good choice, in some cases it can be quicker to scrape web pages using the text browser lynx and parse the results using grep, awk, and sed.

My use case is as follows: I want to programmatically generate a list of rpm packages from Fedora's EPEL X (5, 6, 7), CentOS vault, CentOS mirror, and HP DL server firmware sites. I want this list to be comparable to the output of rpm -qa on RHEL machines. Here are some sample URL's for sites showing rpm package lists:

http://vault.centos.org/5.7/updates/x86_64/RPMS/
http://mirror.centos.org/centos-5/5.11/os/x86_64/CentOS/
https://dl.fedoraproject.org/pub/epel/6/x86_64/
http://mirror.centos.org/centos-7/7.2.1511/updates/x86_64/Packages/
http://downloads.linux.hpe.com/repo/spp/rhel/6/x86_64/2016.04.0_supspp_rhel6.8_x86_64/

If you visit any of these links you will find that the basic format is the same -- from the left, the first field is an icon, the second field is the rpm filename, the third field is the date in YYYY-MM-DD, the fourth field is time in HH:MM, and the fifth field is file size.

Here is my bash script which parses file list html pages into a simple text file:


You can see that lynx renders the page from HTML into regular text and dumps this output to a file if you pass the -dump option. But this is not enough, because lynx by default inserts a newline character in lines greater than 79 characters. To avoid this problem, you must manually set the line width to something larger. The maximum width in lynx is 990 characters, so I specified this value through the option -width=990. Finally the -nolist option removes the list of links that lynx inserts at the bottom of the page.

Using grep I then extract just the lines containing the string ".rpm". Next I replace all tabs with 4 spaces using sed and then use awk to print just the filename field. Finally I use sed to remove the ".rpm" extension from the filenames to make the output identical to the format of rpm -qa:

sed "s:\(\.rpm\)::g" "${F3}" > "$2"

If you don't escape .rpm with backslashes, '.' will be interpreted as a regex "match any character" which would match strings like "-rpm", ".rpm", "redhat-rpm-config", etc. This is undesirable.
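
The effect of the unescaped dot is easy to reproduce. The sketch below uses Python's re module rather than sed, and the file name and the sample listing line are illustrative values I made up, not actual lynx output. It also anchors the pattern at the end of the name, which is slightly stricter than the sed expression above.

```python
import re

name = "redhat-rpm-config-9.0.3-42.el6.noarch.rpm"

# Unescaped: '.' matches any character, so "-rpm" is removed as well.
print(re.sub(r".rpm", "", name))

# Escaped and anchored: only the trailing ".rpm" extension is removed.
print(re.sub(r"\.rpm$", "", name))


def package_name(dump_line):
    """Return the rpm -qa style name from one listing line, or None."""
    for field in dump_line.split():
        if field.endswith(".rpm"):
            return re.sub(r"\.rpm$", "", field)
    return None


# Hypothetical line in the icon / filename / date / time / size layout.
print(package_name("[ ] redhat-rpm-config-9.0.3-42.el6.noarch.rpm 2013-02-22 01:36 59K"))
```

The first print shows redhat-config-9.0.3-42.el6.noarch, which no longer names the right package; the other two show redhat-rpm-config-9.0.3-42.el6.noarch.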

BTW this script is for informational purposes. It would actually be easier to skip the data munging steps of replacing tabs with spaces using sed and just invoke lynx with the following options:

lynx -dump -listonly ...

which will return only links to rpm files from EPEL, CentOS mirror, etc. Then you can return just the filename from each link path with awk:

awk -F'/' '{ print $NF }'
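
The same post-processing can be sketched in Python. I am assuming here that the link list consists of numbered lines with the URL as the last field; the file name in the sample is made up, and the function name is mine.

```python
import posixpath
from urllib.parse import urlsplit


def rpm_names_from_links(lines):
    """Extract rpm -qa style names from lines whose last field is a URL."""
    names = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        # Same idea as awk -F'/' '{ print $NF }': keep the last path component.
        filename = posixpath.basename(urlsplit(fields[-1]).path)
        if filename.endswith(".rpm"):
            names.append(filename[:-len(".rpm")])
    return names


sample = [
    "References",
    "",
    "   1. http://vault.centos.org/5.7/updates/x86_64/RPMS/?C=N;O=D",
    "   2. http://vault.centos.org/5.7/updates/x86_64/RPMS/example-1.0-1.el5.x86_64.rpm",
]
print(rpm_names_from_links(sample))
```

Links that do not end in .rpm, such as the column-sorting links of the index page, are skipped, so no separate grep step is needed.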




Saturday, August 13, 2016

Differences in binary file sizes between RHEL and CentOS

CentOS maintains binary compatibility with Red Hat Enterprise Linux, so applications which run on certain versions of RHEL should be able to run without changes on analogous versions of CentOS. Recently, however, a client asked me why executable binaries from the initscripts package (which contains /bin/ipcalc, /bin/usleep, etc) on RHEL 6.X have slightly different file sizes from those in the CentOS 6.X initscripts package.

First, I needed to verify that the source code in the initscripts srpm's for RHEL and CentOS was identical.

I downloaded initscripts-9.03.46-1.el6.src.rpm for RHEL 6.6 from the Redhat partner site and I downloaded initscripts-9.03.46-1.el6.centos.src.rpm from CentOS vault at the following url:

http://vault.centos.org/6.6/os/Source/SPackages/initscripts-9.03.46-1.el6.centos.src.rpm

I then unpacked the RHEL 6.6 initscripts source rpm as follows:

[root@localhost srpm]# rpm2cpio initscripts-9.03.46-1.el6.src.rpm | cpio -idmv
initscripts-9.03.46.tar.bz2
initscripts.spec
3146 blocks
[root@localhost srpm]# ls
initscripts-9.03.46-1.el6.src.rpm  initscripts-9.03.46.tar.bz2  initscripts.spec
[root@localhost srpm]# tar -xvf initscripts-9.03.46.tar.bz2
initscripts-9.03.46/
initscripts-9.03.46/.gitignore
initscripts-9.03.46/.tx/
initscripts-9.03.46/.tx/config
initscripts-9.03.46/COPYING
initscripts-9.03.46/Makefile
...

I also did the same for the CentOS 6.6 initscripts package. I then renamed the directories for the extracted srpm's and then used the meld GUI diff tool to compare the .../src as well as the entire extracted initscripts srpm directories for RHEL 6.6 and CentOS 6.6.

The comparison showed that the contents of the two srpm's are identical.

Compiler options are contained within .../src/Makefile and they are identical as well. So the binary size differences are not due to differences in the source code, compiler options, or rpm spec file between RHEL and CentOS.

Next, I did a simple C program compilation test of my own using gcc on a stock installation of RHEL 6.6 and CentOS 6.6.

Here is a simple hello world program I have named hello.c:

#include <stdio.h>

int main(void)
{
  printf("hello world!\n");
}

If I compile it with gcc with the following options

gcc hello.c -O0 -std=c99 -Wall -Werror -o hello

I still get slightly different file sizes on RHEL and CentOS:

RHEL 6.6
[root@localhost pset1]# ls -al hello
-rwxr-xr-x. 1 root root 6473 Aug  9 08:31 hello

CentOS 6.6
[root@localhost pset1]# ls -al hello
-rwxr-xr-x. 1 root root 6425 Aug 11 06:15 hello

This is a difference of 48 bytes.

I then used objdump from binutils to examine the contents of the compiled hello binaries, which I renamed hello_rhel66 and hello_cent66, respectively. I used the -s option so that objdump prints the full contents of every section, with an ASCII rendering next to the hex bytes.

[fedjun@u36jcFedora Downloads]$ objdump -s hello_rhel66 > hello_rhel66.dump
[fedjun@u36jcFedora Downloads]$ objdump -s hello_cent66 > hello_cent66.dump
[fedjun@u36jcFedora Downloads]$ diff -u hello_rhel66.dump hello_cent66.dump
--- hello_rhel66.dump 2016-08-13 10:07:48.893239117 +0900
+++ hello_cent66.dump 2016-08-13 10:08:02.078435160 +0900
@@ -1,5 +1,5 @@

-hello_rhel66:     file format elf64-x86-64
+hello_cent66:     file format elf64-x86-64

 Contents of section .interp:
  400200 2f6c6962 36342f6c 642d6c69 6e75782d  /lib64/ld-linux-
@@ -9,8 +9,8 @@
  40022c 00000000 02000000 06000000 12000000  ................
 Contents of section .note.gnu.build-id:
  40023c 04000000 14000000 03000000 474e5500  ............GNU.
- 40024c 4cc6b3fd d6ec9bb6 e4540da0 aba4807f  L........T......
- 40025c 0f84997f                             ....            
+ 40024c 69320cbb e7408021 2c646e86 8344b173  i2...@.!,dn..D.s
+ 40025c 5e478671                             ^G.q            
 Contents of section .gnu.hash:
  400260 01000000 01000000 01000000 00000000  ................
  400270 00000000 00000000 00000000           ............    
@@ -137,7 +137,4 @@
 Contents of section .comment:
  0000 4743433a 2028474e 55292034 2e342e37  GCC: (GNU) 4.4.7
  0010 20323031 32303331 33202852 65642048   20120313 (Red H
- 0020 61742034 2e342e37 2d313029 00474343  at 4.4.7-10).GCC
- 0030 3a202847 4e552920 342e342e 37203230  : (GNU) 4.4.7 20
- 0040 31323033 31332028 52656420 48617420  120313 (Red Hat 
- 0050 342e342e 372d3131 2900               4.4.7-11).      
+ 0020 61742034 2e342e37 2d313129 00        at 4.4.7-11).

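The diff shows that the .note.gnu.build-id and .comment sections differ between the two binaries (.interp appears only as unchanged context). The build ID has the same length in both files, but the .comment section of the RHEL binary holds two GCC version strings (4.4.7-10 and 4.4.7-11) while the CentOS binary holds only one. I believe the same holds true for each of the individual binaries from the initscripts package on RHEL 6.6 and CentOS 6.6: the compiled files embed different toolchain comments and build IDs, which leads to slightly different binary file sizes.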
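
As a rough check, the size of the .comment section in each binary can be worked out from the strings visible in the dump. This is a sketch of the arithmetic only; the function name is mine.

```python
# GCC identification strings as they appear in the .comment dumps above.
GCC_4_4_7_10 = "GCC: (GNU) 4.4.7 20120313 (Red Hat 4.4.7-10)"
GCC_4_4_7_11 = "GCC: (GNU) 4.4.7 20120313 (Red Hat 4.4.7-11)"


def comment_section_size(strings):
    """Bytes needed to store the strings, each followed by a NUL terminator."""
    return sum(len(s.encode("ascii")) + 1 for s in strings)


rhel_comment = comment_section_size([GCC_4_4_7_10, GCC_4_4_7_11])
centos_comment = comment_section_size([GCC_4_4_7_11])
comment_diff = rhel_comment - centos_comment
file_diff = 6473 - 6425  # file sizes reported by ls -al above

print(rhel_comment, centos_comment, comment_diff, file_diff)
```

This prints 90 45 45 48: the extra GCC string accounts for 45 of the 48 bytes. The remaining 3 bytes may be alignment padding after the section, but that is a guess I have not verified.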