Enable network namespaces in CentOS 6.4

By default, CentOS 6.4 does not support network namespaces. If one wants to test the new virtualization platforms (Docker, OpenStack, etc.) on a CentOS server, not all features will be available.
For OpenStack, for example, Neutron won't work as expected, since it needs network namespaces to create networks.

Fortunately, RedHat – through RDO – provides a kernel that has this feature backported.

So, before updating the kernel, if one runs :

#> ip netns list

s/he will be presented with the following error message : Object "netns" is unknown, try "ip help".

The following steps need to be performed to install the new kernel and enable the network namespace feature :

#> yum install -y http://rdo.fedorapeople.org/rdo-release.rpm
#> yum install kernel iproute
#> reboot

And that’s it. Really.

Now one can run

#> ip netns add spredzy
#> ip netns list

spredzy should get displayed.

If everything is working, one should have the following kernel and iproute packages installed :

kernel-2.6.32-358.123.2.openstack.el6.x86_64
kernel-firmware-2.6.32-358.123.2.openstack.el6.noarch
iproute-2.6.32-130.el6ost.netns.2.x86_64

Note : the openstack tag on the kernel packages and the netns tag on the iproute package.
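
Beyond checking package versions, one can verify namespace support directly: a minimal sketch that checks for /proc/self/ns/net, which only exists on kernels with network namespace support.

```shell
# Quick sanity check, independent of package names: the /proc/<pid>/ns/net
# entry only exists when the running kernel supports network namespaces.
if [ -e /proc/self/ns/net ]; then
    echo "network namespaces: supported"
else
    echo "network namespaces: NOT supported"
fi
```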


The Foreman PXE provisioning with Libvirt

More than just a Puppet management interface, The Foreman can handle the whole lifecycle of servers, from their creation and provisioning (PXE + kickstart/preseed) to their management (Puppet). Today's blog post will highlight how to use the provisioning feature of The Foreman with Libvirt's DHCP server (dnsmasq), for local testing purposes.

Prerequisite

  • A VM running The Foreman on libvirt; for this post, version 1.3.0 of The Foreman is used, and CentOS 6.4 will be deployed.

Create the Operating System (The Foreman)

The Operating System

First, simply fill in the first four fields and click Submit. We will get back to it at a later point.

Path : More -> Provisioning -> Operating Systems -> New Operating System

Edit OS

The Architecture

Add an architecture one will be supporting for a set of OSes

Path : More -> Provisioning -> Architectures -> New Architecture

Edit Architecture

The Installation Media

In our case, the CentOS installation media already exists; one still has to click on CentOS and specify RedHat as the Operating System family.

If you have a local mirror of the CentOS repositories, you could simply make the path point to it; installation will be much faster.

Path : More -> Provisioning -> Installation Media

Edit Installation Media

The Partition Table

A RedHat default partition table is already present; for the purpose of the demo we will be using it, but you might want to create your own. Do not forget to specify the Operating System family.

Path : More -> Provisioning -> Partition Tables

Edit Partition Tables

The Templates

The provisioning template section is where one defines one's kickstart/preseed, PXE, gPXE, etc. scripts.

One can define snippets that can be embedded within scripts.

For the purpose of the demo, we will be using two pre-existing scripts :

  • Kickstart Default PXELinux (PXELinux)
  • Kickstart Default (provision)

Once one clicks on a template, one needs to go to the Association tab on the presented page to associate it with the proper OS. Here it needs to be done twice : for the Kickstart Default PXELinux and for the Kickstart Default scripts.

Path : More -> Provisioning -> Provisioning Templates

Edit Provisioning Template

The Operating System

And back to the Operating System to bind it all together.

Path : More -> Provisioning -> Operating Systems -> CentOS 6.4

First you should be presented with the following page; pick the right options (Architecture, Partition Tables, Installation Media) for your OS.

Edit OS – OS

Now go to the Templates tab and associate the templates accordingly.

Edit OS – Templates

You can now save the OS.

Create the domain (The Foreman)

Here, nothing fancy; simply fill in what is prompted. In the current scenario we don't use The Foreman as a DNS server.

Path : More -> Provisioning -> Domains -> New Domain

Edit Domain

Create the Subnet (The Foreman)

Here the Network Address is the one from your libvirt dnsmasq configuration. Normally you can guess it from a simple ifconfig eth0; otherwise, on the host, run virsh net-dumpxml default, assuming you run the default network. The same applies to the Network Mask.

Select the appropriate domain (cf. Create the Domain) and then, most importantly, make sure the smart proxy name is selected in the TFTP Proxy box.

Path : More -> Provisioning -> Subnets -> New Subnet

Edit Subnet

Create the VM with PXE boot (Libvirt)

Create the New VM with a PXE boot

node1 – PXE

For now you can stop the VM, since the DHCP server is not configured yet. Please note the MAC address of the virtual machine; it will be needed in a later section.

Configure dnsmasq for IP attribution and PXE boot (Libvirt)

Note your foreman VM and your node1 VM MAC addresses.

Stop your foreman VM now.

1. Destroy the network

virsh net-destroy default

2. Edit the current network to assign static ip

virsh net-edit default

Replace

<ip address='192.168.100.1' netmask='255.255.255.0'>
  <dhcp>
    <range start='192.168.100.128' end='192.168.100.254' />
  </dhcp>
</ip>

by

<ip address='192.168.100.1' netmask='255.255.255.0'>
  <dhcp>
    <range start='192.168.100.128' end='192.168.100.254' />
    <host mac='52:54:00:CB:C3:C6' name='foreman' ip='192.168.100.169' />
    <host mac='52:54:00:89:2A:7E' name='node1' ip='192.168.100.170' />
    <bootp file='pxelinux.0' server='192.168.100.169' />
  </dhcp>
</ip>

3. Restart the network

virsh net-start default

What is being done here at step 2 is a static assignment of IP addresses by the DHCP server and the configuration of the PXE boot.

Static Assignment of IP Addresses

<host mac='52:54:00:CB:C3:C6' name='foreman' ip='192.168.100.169' />

Here we tell dnsmasq that the device with MAC address '52:54:00:CB:C3:C6' will always be assigned the IP '192.168.100.169'.

PXE Boot Configuration

<bootp file='pxelinux.0' server='192.168.100.169' />

Here we tell devices that wish to PXE boot to get the file pxelinux.0 from the TFTP server running on 192.168.100.169.
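
A quick way to double-check both settings is to grep the edited network XML. On the host the live definition would come from `virsh net-dumpxml default`; in this sketch the XML from above is written to a temp file so the grep can be shown standalone.

```shell
# Write the edited <dhcp> section to a temp file (stand-in for the
# output of `virsh net-dumpxml default`) and count the static host
# and bootp entries: two hosts + one bootp line = 3.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<ip address='192.168.100.1' netmask='255.255.255.0'>
  <dhcp>
    <range start='192.168.100.128' end='192.168.100.254' />
    <host mac='52:54:00:CB:C3:C6' name='foreman' ip='192.168.100.169' />
    <host mac='52:54:00:89:2A:7E' name='node1' ip='192.168.100.170' />
    <bootp file='pxelinux.0' server='192.168.100.169' />
  </dhcp>
</ip>
EOF
grep -c -E '<host |<bootp ' "$tmp"   # prints 3
rm -f "$tmp"
```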

You can now start the foreman VM, but not node1 yet.

Create the Host (The Foreman)

Here, fill in the information as needed; the parts specific to PXE provisioning are the Network and Operating System tabs.

  • In the Network tab, fill in the MAC address, the configured domain and subnet, and the IP address assigned in the DHCP server.
  • In the Operating System tab, select the Operating System you want your VM to run. (cf. Create the Operating System)

Path : Hosts -> New Host

Edit Network Host

Edit Operating System Host

Start the VM (Libvirt)

Simply start the node1 VM; it will be assigned the static IP address and will retrieve pxelinux.0 from the foreman server, as specified in the DHCP server. It might take some time while the installation is running.

Once the VM has automatically rebooted, one can go to the foreman > hosts page and see that node1 is in a 'No Changes' state, meaning the build was successful and Puppet connected. The VM is now fully managed by The Foreman.

Conclusion

One can configure as many OSes as one wants, with fully configurable kickstart/preseed scripts, themselves dynamically parametrizable. As of today, The Foreman is a solid solution to manage the whole lifecycle of servers, from creation to provisioning to management, providing the user with detailed – filterable – reports of what is going on. On a personal note, I would say that if you are managing Puppet servers and you are not using The Foreman, you are doing it wrong. QED.


Samba standalone + OpenLDAP

On the web there are many tutorials about setting up a Samba server as one's Domain Controller (DC), but very few about setting up a standalone Samba server relying on an external OpenLDAP for authentication. While actually quite a simple process, it needs a lot of configuration on both ends, the Samba server and the OpenLDAP one, before it can be functional.

This post shows how to set up a Samba 3.6 server to rely on an external OpenLDAP 2.4 server, both being hosted on CentOS 6.4.

The Samba Server

Authorize the use of LDAP system-wide

In order for the Samba server to be able to rely on the OpenLDAP one, the use of LDAP needs to be enabled system-wide. To do so, the authconfig configuration needs to be updated the following way :

authconfig --enableldap --update

This simply edits the /etc/nsswitch.conf file and appends ldap to the passwd, shadow, group, netgroup and automount entries.

Install the samba packages

Simply run

yum install samba samba-common

Note : This article is about Samba version 3.6 and not Samba4; do install the samba* packages and not the samba4* packages.

Copy and install the Samba schema in the OpenLDAP server

Note : Since these steps need to be done before the smb.conf configuration, this section appears here, even if logically it belongs to "The OpenLDAP server".

By default, the OpenLDAP server doesn't speak the Samba language: one needs to add the Samba LDAP schema to it. From the Samba server, once the samba packages are installed, simply copy the samba.ldif file located at /usr/share/doc/samba-3.6.9/LDAP/samba.ldif to your OpenLDAP cn=schema directory :

scp /usr/share/doc/samba-3.6.9/LDAP/samba.ldif user@openldap:/etc/openldap/slapd.d/cn=config/cn=schema

On the OpenLDAP server, the file needs to be renamed with the pattern – cn={X}samba.ldif – where X represents the highest number available + 1. On a default OpenLDAP installation, the highest number available is 11 (cn={11}collective.ldif) thus, the samba.ldif file needs to be renamed cn={12}samba.ldif
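
The rename step can be sketched as follows; it is simulated in a temp directory here for illustration, while on the real server the directory is /etc/openldap/slapd.d/cn=config/cn=schema and a `service slapd restart` follows.

```shell
# Simulate the schema rename pattern in a scratch directory.
schema_dir=$(mktemp -d)
touch "$schema_dir/samba.ldif"     # stands in for the freshly copied schema
next=12                            # highest existing {N} (11 by default) + 1
mv "$schema_dir/samba.ldif" "$schema_dir/cn={${next}}samba.ldif"
ls "$schema_dir"                   # prints cn={12}samba.ldif
rm -rf "$schema_dir"
```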

Edit the cn={12}samba.ldif file at lines 1 and 3 so it looks like this :

dn: cn={12}samba
objectClass: olcSchemaConfig
cn: {12}samba

Finally, restart the slapd service so the new schema can be loaded correctly.

The smb.conf

In Samba there are three password backends available by default :

  • smbpasswd – deprecated,
  • tdbsam – the one enabled by default. It relies on a local database of users, filled via the smbpasswd -a command,
  • ldapsam – relies on an external LDAP directory.

To make your standalone Samba server rely on OpenLDAP, simply change this chunk of configuration :

security = user
passdb backend = tdbsam

by

security = user
passdb backend = ldapsam:ldap://ldap.serv.er.ip/
ldap suffix = dc=wordpress,dc=com
ldap admin dn = cn=admin,dc=wordpress,dc=com

  • ldap suffix : the suffix of your DIT
  • ldap admin dn : This is optional. If the OpenLDAP server denies anonymous requests, then one needs to specify an admin dn entry. Also, if your LDAP tree does not have a SambaDomain entry yet, specifying the ldap admin dn configuration will create it automatically. If using ldap admin dn, one needs to set the admin dn password by running smbpasswd -W.

Save and exit the file, then restart the smb service. After a few seconds one can run net getlocalsid and will be presented with a line looking like :

SID for domain SAMBA-SERVER is: S-1-5-21-2844801791-3392433664-1093953107

If you set ldap admin dn in smb.conf, the SambaDomain was created automatically and net getlocalsid returns this value; if you set it manually, net getlocalsid should return your SambaDomain information.

Set Samba to start automatically at boot time – chkconfig smb on – and the Samba server is all set to receive requests from existing LDAP users.

The OpenLDAP server

In order for an OpenLDAP server to be Samba-aware, some attributes need to be added to the appropriate entries. Make sure the Samba schema has been loaded into OpenLDAP, as explained earlier.

SambaDomain

This entry can be automatically created by the Samba server – if one wants – and contains general information about the Samba behavior. The most important piece of information found here is the SID, the Security IDentifier of the domain. It will be needed for the configuration of the Samba group and user entries.

SambaGroupMapping

This is an auxiliary objectClass that should be added to every posixGroup entry one wants to work with in Samba. It has only two mandatory attributes: the SambaSID, a unique ID within the SambaDomain, and the SambaGroupType, which defines the type of the group.

The SambaSID is composed of the SID + RID

  • SID : From the SambaDomain entry
  • RID : Relative IDentifier, a unique id within the SambaDomain
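
This composition is plain string concatenation; a small sketch, using the example SID from this article and an arbitrary RID :

```shell
# A SambaSID is the domain SID with the entry's RID appended.
SID="S-1-5-21-2844801791-3392433664-1093953107"   # from the SambaDomain entry
RID=1001                                          # arbitrary unique number
echo "${SID}-${RID}"
# prints S-1-5-21-2844801791-3392433664-1093953107-1001
```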

The defined SambaGroupType values are :

  • 2: Domain Group
  • 4: Local Group (alias)
  • 5: Builtin

SambaSamAccount

This is probably the trickiest, yet most scriptable, part. This is the auxiliary objectClass that should be added to every posixAccount entry one wants to work with in Samba. It contains the Samba credentials. For Samba to authenticate an LDAP-hosted user, the latter needs to have the following attributes set :

  • SambaAcctFlags : defines the user type (permissions)
  • SambaLMPassword : The LanMan password
  • SambaNTPassword : The NT password
  • SambaPwdLastSet : Timestamp of the last password update
  • SambaSID : The unique identifier within the SambaDomain

To obtain these values, one can run this script; it needs the Perl module Crypt-SmbHash to be installed.

Usage : ./script username password

This will give the following outputs

:0:47F9DBCCD37D6B40AAD3B435B51404EE:82E6D500C194BA5B9716495691FB7DD6:[U          ]:LCT-4C18B9FC

  |            LMPassword          |         NTPassword             |   AcctFlags |

For the SambaSID value, refer to the SambaGroupMapping section; the same logic applies here.

Once the SambaDomain, SambaGroupMapping and SambaSamAccount entries are applied where needed, the Samba server is ready to authenticate against the OpenLDAP server.

Conclusion

Making a standalone Samba server rely on an external OpenLDAP is not a difficult process, but it does involve quite a lot of configuration. In this article, neither the iptables nor the SELinux side of things has been addressed, but you should definitely set them up accordingly. Go ahead, add people to your DIT and see how they can access their own Samba share. QED


Effective backup/recovery process for OpenLDAP

Making sure to never lose any piece of data is a really difficult task. A point-in-time backup (snapshot) of a permanently living and changing environment does not meet loss-less expectations.

In today’s post the focus will be put on OpenLDAP backup/recovery process in order to never lose a bit of data – well maybe the last transaction in case of a power outage.

Most online resources refer to the OpenLDAP backup/recovery process as :

  • For Backup : running a slapcat command and sending the output to a backup server in a cron job
  • For Recovery : getting the last meaningful backup from the backup server and reload it with a slapadd command

Simple, isn't it? Well, it is simple, but it simply does not prevent important data loss. Let's highlight two cases that demonstrate the limits of this backup plan.

Case 1

Let's take a moderately busy service that inserts an average of 1,000 new users per day into its directory. Backups are made (using the slapcat command) every day at midnight. Now, for some reason, one day at 8.00pm a hard drive crashes (no RAID), or the filesystem gets corrupted, or whatever reason you want to come up with… It is time for recovery. We set up a new VM or a new drive, set up OpenLDAP again, fetch the last meaningful backup and load it with a slapadd command. The OpenLDAP server is back to its yesterday state, but what about the 900 entries that got inserted today? Simply gone. That is why you must have a redundant set of OpenLDAP servers via replication. But replication is not a backup plan in itself.

Case 2

As a precaution you set up a master/slave schema (a.k.a. provider/consumer in LDAP terms). So even if the main OpenLDAP server crashes, you do have an up-to-date copy. Now, since to err is human, if an employee inadvertently removes an important set of data, this change will be replicated to all your slave OpenLDAP servers and the data won't be recoverable. Recovering yesterday's backup will leave you in the same state as Case 1, and data will have been lost.

Solution

Design of an infrastructure effective for backup/recovery process

To be able to almost never lose a bit of OpenLDAP data, the infrastructure to deploy will heavily rely on the accesslog overlay provided by OpenLDAP.

The accesslog overlay is used to keep track of all or selected operations on a particular DIT (the target DIT) by writing details of the operations as entries to another DIT (the accesslog DIT). The accesslog DIT can be searched using standard LDAP queries. Accesslog overlay parameters control whether to log all or a subset of LDAP operations (logops) on the target DIT, to save related information such as the previous contents of attributes or entries (logold and logoldattr) and when to remove log entries from the accesslog DIT

Definition from http://www.zytrax.com/books/ldap/ch6/accesslog.html

Accesslogs are mainly used for replication/audit purposes. In the above schema, our slaves will never be masters of any other OpenLDAP server; they use the accesslog as a real-time accesslog backup in case the master OpenLDAP server becomes unavailable for any reason.

Backup Process

As simply as it is described by most resources out there, the backup process will be a slapcat command – run as a cron job – of the needed DITs and their respective accesslog DITs :

#> slapcat -n 2 > maindit-bk.ldif
#> slapcat -n 3 > maindital-bk.ldif
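
As a cron job, this could look like the following hypothetical root crontab entry (the /backup path and the backup-server host are placeholders to adapt) :

```
# Nightly dump of the main DIT (database 2) and its accesslog DIT
# (database 3), then shipped to a backup server
0 0 * * * slapcat -n 2 > /backup/maindit-bk.ldif && slapcat -n 3 > /backup/maindital-bk.ldif && rsync -a /backup/ backup-server:/srv/ldap-backups/
```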

Recovery Process

This is how the recovery process would work :

  1. Load the last meaningful backup of the needed DIT with the slapadd command
  2. Load the accesslog from either the backup or the slave accesslog – whichever fits best – and do not forget to clean the accesslog if you are trying to recover from an erroneous action
  3. Set the DIT to be a consumer of the freshly loaded accesslog

Practice

Step 1 : Simulate data loss

#> service slapd stop
#> slapcat -n 2 > maindit-backup.ldif
#> service slapd start
#> ldapadd -x -w 'test' -D 'cn=Manager,dc=domain,dc=com' -f user.ldif
#> service slapd stop
#> slapcat -n 3 > maindit-accesslog-backup.ldif

At this time, there are two backup files :

  • maindit-backup.ldif, which has everything but the last entry
  • maindit-accesslog-backup.ldif, which does include the addition of the user.

Step 2 : Recovering a clean OpenLDAP server

  1. Install a new VM with the appropriate packages and configuration [only if necessary]
  2. If you are reusing a corrupted OpenLDAP server, move all the bdb files of your corrupted database (mv /var/lib/ldap/{yourdbname}/*.bdb /backup/ldap/{yourdbname})
  3. Enable accesslog and syncprov modules
  4. Reload the needed DIT with slapadd
  5. Create an Accesslog db that will be used as provider
  6. Reload the accesslog db with its backup
  7. Configure syncrepl on the main DIT to be a consumer of the accesslog provider
  8. Restart slapd
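
Step 7 can be sketched as a slapd.conf fragment; this is a hypothetical delta-syncrepl-style consumer configuration (rid, provider URL, credentials and suffixes are placeholders to adapt) :

```
# Make the main DIT database a consumer of the reloaded accesslog provider
database bdb
suffix   "dc=domain,dc=com"

syncrepl rid=001
         provider=ldap://127.0.0.1
         bindmethod=simple
         binddn="cn=Manager,dc=domain,dc=com"
         credentials=secret
         searchbase="dc=domain,dc=com"
         logbase="cn=accesslog"
         logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
         syncdata=accesslog
         type=refreshAndPersist
         retry="60 +"
```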

At this time your OpenLDAP server is back up-to-date data-wise and no data has been lost.

Conclusion

Not that simple, right? It needs a bit more than two lines of shell script. A long-observed behavior is that people and companies do backups but do not test recovery: recoveries are tested when the backup plan is created, but then left aside and almost never exercised. Some companies, on the other hand, take recovery to its extreme and deploy last night's backup to production every day; this way the recovery process is well tested and they don't fear failure. Either way one decides to go, make sure to always have a data loss-less backup/recovery plan, up-to-date documentation that goes along with it, and your Nagios check_ldap plugin up and running. QED


Network Access Server with a RaspberryPi : Part 1 – DNS

The RaspberryPi did not land in the market unnoticed. For about $35 you get a ready-to-work computer.
Many people have done amazing things with it – from IoT to distributed computation – others use it as a full-stack home media player. Others surely have a spare RaspberryPi and don't know what to do with it; the answer is an SMB-grade Network Access Server (NAS).

This 3-part series intends to show how to use a RaspberryPi as a Network Access Server with enterprise services.

This specific blog post will be about providing a LAN with a local DNS resolver using dnsmasq, which will improve the overall internet speed of the clients on the LAN and allow a network administrator to configure host names in an easy fashion.

Note: The RaspberryPi is running Raspbian as its Operating System

DNS Primer

It is taken for granted that the reader knows the basic function of a DNS server: translating a name to an IP address.

In order to be effective, a DNS server has to match two criteria :

  • Proximity
  • Cacheability

Proximity

In the wild, there are two kinds of DNS server network types.

The first one is the anycast network type. With anycast, several geographically separated DNS servers listen on the same IP address; the DNS server closest to you in terms of hops will answer your query, providing you with the lowest latency.

The second one is the unicast network type. With unicast, a single server listens on a single IP address. This means that if you live in California and your DNS provider has its servers in California you will have low latency, but if a European countryside resident uses the same DNS server, s/he will have a much higher latency.

Bottom line on proximity: the closer the better. The closest you can get to your computer – beyond 127.0.0.1 – is your LAN. Having a DNS resolver on your LAN provides one with the second lowest possible latency.

Cacheability

One of the biggest challenges of public DNS resolvers is cacheability, more precisely shared cacheability. Due to the scale of the infrastructure deployed by those public DNS resolvers, maintaining a common cache is a big technical challenge in itself.

When you clicked on spredzy.wordpress.com, a DNS server answered with the IP corresponding to the hostname and then cached the IP <> hostname association for TTL time. So a user might think that next time s/he hits the DNS server for the exact same host name (within the TTL) the DNS query will be faster; well, not necessarily.

Be it a unicast or an anycast network type, nothing ensures one will end up on the exact same server two times in a row (i.e. load balancers, etc.).

Bottom line on cacheability: by caching locally on a single server (the Pi) you won't need to worry about shared cache; it will always be in sync with itself.

A note on dnsmasq name server feature

By deploying a DNS resolver on one's local network, both the proximity and cacheability issues are tackled. Last but not least, dnsmasq deployed on one's local network will act as an authoritative source for local device names defined in /etc/hosts. No more need to deal with BIND and DNS records such as 'router A XXX.XXX.XXX.XXX'; simply by inserting the line 'XXX.XXX.XXX.XXX router' in your hosts file, your DNS server will provide the correct IP address.

Installation & Configuration

Installation

sudo apt-get install dnsmasq dnsmasq-base
sudo update-rc.d dnsmasq defaults

Configuration

As with most programs, the dnsmasq configuration can be edited in the /etc/dnsmasq.conf file or by dropping configuration rules in the /etc/dnsmasq.d directory.

In order to keep a clean configuration, only the listen-address parameter will be edited in /etc/dnsmasq.conf

listen-address=PI.IP.Addr.ess

Then, the extra configuration will be written in specific files under /etc/dnsmasq.d/

dns.conf

server=208.67.222.222                       # Primary DNS Server
server=208.67.220.220                       # Secondary DNS Server

server=/mydomain.com/Other.dns.ip.address   # Specific DNS server for a given domain name

bogus-nxdomain=67.215.65.132                # Return NXDOMAIN as it should (IP applies to OpenDNS)

all-servers                                 # Query all listed DNS servers; the fastest answer will be picked

Make the RaspberryPi your computer's default DNS server

Once everything is set up, you need to let your computer know which DNS server to use. For this, there are several options :

  • Configure it directly in your DHCP server if you have access to it (recommended)
  • In Linux, either configure NetworkManager or your /etc/resolv.conf file to have the right DNS server
  • In Windows configure your connection accordingly to use the right DNS

Also the /etc/hosts file will be edited to highlight the name server feature of dnsmasq

192.168.42.41    printer    printer.localdomain
192.168.42.1     router     router.localdomain
192.168.42.13    storage    storage.localdomain

Test

To test the performance of using the RaspberryPi as a DNS server, the following script was run 10 times from a laptop connected to the router via WiFi.
Using the RaspberryPi as DNS server

#!/bin/sh

sleep 2 && dig wordpress.com | grep 'Query time:'
yguenane@laptop:~$ repeat 10 ./dns.sh
;; Query time: 102 msec
;; Query time: 31 msec
;; Query time: 28 msec
;; Query time: 29 msec
;; Query time: 32 msec
;; Query time: 29 msec
;; Query time: 29 msec
;; Query time: 30 msec
;; Query time: 28 msec
;; Query time: 29 msec

Using OpenDNS as DNS server

#!/bin/sh

sleep 2 && dig wordpress.com @208.67.222.222 | grep 'Query time:'
yguenane@laptop:~$ repeat 10 ./dns.sh
;; Query time: 103 msec
;; Query time: 131 msec
;; Query time: 133 msec
;; Query time: 132 msec
;; Query time: 134 msec
;; Query time: 131 msec
;; Query time: 131 msec
;; Query time: 133 msec
;; Query time: 134 msec
;; Query time: 133 msec

Using Google PublicDNS as DNS server

#!/bin/sh

sleep 2 && dig wordpress.com @8.8.8.8 | grep 'Query time:'
yguenane@laptop:~$ repeat 10 ./dns.sh
;; Query time: 136 msec
;; Query time: 135 msec
;; Query time: 131 msec
;; Query time: 131 msec
;; Query time: 131 msec
;; Query time: 132 msec
;; Query time: 132 msec
;; Query time: 136 msec
;; Query time: 131 msec
;; Query time: 131 msec

One can see the big response time difference between the RaspberryPi and the public DNS servers once the entry is cached.

For the name server feature, one can simply ping printer and see that 192.168.42.41 is pinged.

The cache can be tuned via the cache-size, no-negcache, local-ttl and neg-ttl options. Refer to the man pages for more details.
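
Following the drop-in convention used above, such tuning could live in its own file; a hypothetical example (the file name and values are illustrative, not recommendations) :

```
# /etc/dnsmasq.d/cache.conf
cache-size=1000   # number of names kept in cache (default is 150)
no-negcache       # do not cache "no such domain" answers
local-ttl=60      # TTL returned for /etc/hosts entries (default 0)
```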

Conclusion

BIND is a great product; it does well what it has been conceived for, but the entrance barrier might be high for a non networking-related profile. Dnsmasq is a lightweight yet mature alternative for SMBs. It allows one, totally unfamiliar with DNS records, to easily set up a name server for an entire network.
In this first part we only focused on the DNS feature of dnsmasq, but it has much more to provide. The next part will focus on the DHCP and PXE server features.


Monitor your cluster of Tomcat applications with Logstash and Kibana

A month ago I wrote a post about the potential of Logstash + ElasticSearch + Kibana used all together. Back then the example used was fairly simple, so today's goal is to see how
one can make the most out of those tools in an IT infrastructure with real-life problems. The objective will be to show how to monitor, in a central place, logs coming from a cluster of Tomcat servers.

Problem : Monitoring a cluster of Tomcat applications

Let's take a cluster of 3 identical nodes, each of which hosts 3 Tomcat applications; this ends up as 9 applications to monitor. In front of this cluster stands a load balancer, so customers can be on any node at any time.

Now if an error happens in the user applications, unless one has a log management system in place, one will need to log into each and every node of the cluster and analyze the logs – I know it can be scripted, but you get the point, it's not the optimal way.

This post aims to show how this problem can be tackled using Logstash, Redis, ElasticSearch and Kibana to build a strong – highly scalable and customizable – log management system.

Tools

Here we will apply the following scheme from the Logstash website. The only difference is that Kibana will be used instead of the embedded Logstash web interface.

getting-started-centralized-overview-diagram

What does what

  • Logstash: As a log shipper and a log indexer
  • Redis : As a broker – used as a queuing system
  • ElasticSearch : As a log indexer – store and index logs
  • Kibana : As a front-end viewer – a nice UI with useful extra features

Installation

Logstash

Logstash comes as a jar; it is bundled with everything it needs to run.
The jar file is available here

To run it simply execute :
java -jar logstash-1.1.9-monolithic.jar agent -f CONFFILE -l LOGFILE

This will start a Logstash instance that will act based on the CONFFILE it has been started with. To make this a bit cleaner, it is recommended to daemonize it so it can be started/stopped and launched at boot time with traditional tools. The java-service-wrapper libraries will let you daemonize Logstash in no time; a traditional initd script works too.

Logstash needs to be installed both on the cluster nodes (shippers) and on the central server where logs will be gathered, stored and indexed (indexer).

Redis

Redis is a highly scalable key-value store; it will be used as the broker here and will be installed in the central location.
Redis installation is pretty straightforward: packages are available for every major Linux distribution (CentOS users will need to install EPEL first).

  1. Edit /etc/redis.conf, change bind 127.0.0.1 to bind YOUR.IP.ADDR.ESS
  2. Make sure Redis is configured to start at boot (chkconfig/update-rc.d)

Important : Make sure a specific set of firewall rules is set so the Logstash shippers can communicate with Redis.

Elastic Search

Unfortunately, ElasticSearch cannot be found in the package repositories of most Linux distributions yet. Debian users are a bit luckier, since the team at elasticsearch.org provides them with a .deb; for users of other distributions, installation will need to be manual. ElasticSearch will be installed in the central location.

Get the source from ElasticSearch.org and, as with Logstash, I would recommend using the java-service-wrapper libraries to daemonize it.

You need to edit the configuration file elasticsearch.yml, uncomment network.host: 192.168.0.1 and replace it with network.host: YOUR.IP.ADDR.ESS. The rest of the configuration needs to be edited based on the expected workload.

Kibana

Kibana does not have packages yet; the source code needs to be retrieved from GitHub or the Kibana website itself. Installation is straightforward :

  1. wget https://github.com/rashidkpc/Kibana/archive/v0.2.0.tar.gz
  2. tar xzf v0.2.0.tar.gz && cd Kibana-0.2.0
  3. gem install bundler
  4. bundle install
  5. vim KibanaConfig.rb
  6. bundle exec ruby kibana.rb

Main fields to configure in KibanaConfig.rb:

  • Elasticsearch : the URL of your ES server
  • KibanaPort : the PORT to reach Kibana
  • KibanaHost : The URL Kibana is bound to
  • Default_fields : The fields you’d like to see on your dashboard
  • (Extra) Smart_index_pattern : The index Kibana should look into

Kibana will be installed on the central location. Look into `sample/kibana` and `kibana-daemon.rb` for how to daemonize it.

Configuration

Tomcat Servers

Here, Logstash will monitor two kinds of logs: the application logs and the access logs.

Access Logs

In order to enable access logs in Tomcat, edit your /usr/share/tomcat7/conf/server.xml and add the AccessLog valve :

<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
    prefix="localhost_access_log." suffix=".txt" renameOnRotate="true"
    pattern="%h %l %u %t &quot;%r&quot; %s %b" />

Application Logs

Here, a library that outputs log4j messages directly in the Logstash json_event format will be used – special thanks to @lusis for the hard work – so no grokking will be required

Configure log4j.xml

Edit the /usr/share/tomcat7/webapps/myapp1/WEB-INF/classes/log4j.xml

<appender name="MYLOGFILE" class="org.apache.log4j.DailyRollingFileAppender">
    <param name="File" value="/path/to/my/log.log"/>
    <param name="Append" value="false"/>
    <param name="DatePattern" value="'.'yyyy-MM-dd"/>
    <layout class="net.logstash.log4j.JSONEventLayout"/>
</appender>

Logstash file

Shipper

An example of what a Logstash shipper config file could look like

input {

  file {
    path => '/path/to/my/log.log'
    format => 'json_event'
    type => 'log4j'
    tags => 'myappX-nodeX'
  }

  file {
    path => '/var/log/tomcat7/localhost_access_log..txt'
    format => 'plain'
    type => 'access-log'
    tags => 'nodeX'
  }

}

filter {

  grok {
    type => "access-log"
    pattern => "%{IP:client} \- \- \[%{DATA:datestamp}\] \"%{WORD:method} %{URIPATH:uri_path}%{URIPARAM:params} %{DATA:protocol}\" %{NUMBER:code} %{NUMBER:bytes}"
  }

  kv {
    type => "access-log"
    fields => ["params"]
    field_split => "&?"
  }

  urldecode {
    type => "access-log"
    all_fields => true
  }

}

output {

  redis {
    host => "YOUR.IP.ADDR.ESS"
    data_type => "list"
    key => "logstash"
  }

}

Indexer

An example of what a Logstash indexer config file could look like

input {

  redis {
    host => "YOUR.IP.ADDR.ESS"
    type => "redis-input"
    data_type => "list"
    key => "logstash"
    format => "json_event"
  }

}

output {

  elasticsearch {
    host => "YOUR.IP.ADDR.ESS"
  }

}

Testing

  1. Make sure Redis + ElasticSearch + LogStash(indexer) + Kibana are started
  2. Make sure all LogStash (shipper) are started
  3. Go to YOUR.IP.ADDR.ESS:5601 and enjoy a nice structured workflow of logs
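If nothing shows up, each stage of the pipeline can be checked individually; a quick sketch (the addresses, list key and ports are the ones assumed throughout this post):

```shell
# the shippers push onto the Redis list; a growing/draining length means they work
redis-cli -h YOUR.IP.ADDR.ESS llen logstash

# the indexer writes into ElasticSearch; list the indices/aliases
curl -s 'http://YOUR.IP.ADDR.ESS:9200/_aliases?pretty'

# Kibana itself should answer on its configured port
curl -sI 'http://YOUR.IP.ADDR.ESS:5601/'
```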

The Lucene query language can be used in the header text box to query/filter results. Kibana will get a new interface soon that will let one customize actual dashboards of logs; take a peek at the demo, it does look promising.

Find below some screenshots (current version of Kibana) of what the configuration described in this post provides :

Log access analysis
Application log analysis
Application log analysis – details

You can see that the tags are marked ‘pentaho-node1’, so it is now easy to know which application (pentaho) on which node (node1) produced the error.

Kibana has some excellent features; take time to get to know them.

Conclusion

Last month’s Twitter example was not oriented toward real-life problems, but this post shows the real power behind these tools. Once they are set up correctly, one can gather all the logs of a cluster, explore them, and easily figure out where issues are coming from, all at one’s fingertips. QED


java-service-wrapper or how to daemonize your java services for all major OSes

Java services (java programs in general) come in different flavors: either as a jar file, or with a shell script that executes the java program after parsing your options, or in some other form. More often than not, java services do not come with a handy init.d script, leaving you the responsibility of writing and maintaining one on your own. Writing init.d scripts isn’t the most attractive task one can be asked to do, and it can get really cumbersome. It also raises the question ‘What if I want to run it on another OS ?’ – the script will need to be adapted each and every time. In this blog post, Java-Service-Wrapper (JSW) will be introduced; JSW allows you to make your java services run as native Unix/Linux/Windows services in a painless way.

In this blog post, in order to show how java-service-wrapper works, Logstash (a log management tool, see this post if you want more details) will be installed as a Linux service.

Getting Logstash and running it manually

First thing first, let’s download the logstash jar here. It contains all the dependencies Logstash needs to run.

Then create the configuration file (ie. /etc/logstash/test.conf) Logstash will be run against :

input {
    file {
        path => '/var/log/secure'
        type => 'secure'
    }
}
output {
    file {
        path => '/tmp/test.json'
    }
}

Finally, simply run : java -jar logstash-1.1.9-monolithic.jar agent -f /etc/logstash/test.conf
After a few seconds (maybe a minute or two), if you ssh to the box where Logstash is running, you should see the new /var/log/secure log lines appear in your /tmp/test.json file
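To trigger a test entry without waiting for real activity, one can log to the authpriv syslog facility, which lands in /var/log/secure on CentOS; a quick sketch, run on the Logstash box:

```shell
# write a line to /var/log/secure through syslog, then check the output file
logger -p authpriv.info "logstash pipeline test"
tail -1 /tmp/test.json
```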

We have a working version of Logstash, but you can already see that starting and stopping it without a proper service will be a pain.

In order to make things the clean way we will move and rename logstash-1.1.9-monolithic.jar to /usr/local/bin/logstash.jar

Getting java-service-wrapper

For this step simply download the appropriate tar ball from here

Once untarred, you’ll be presented with some files and directories; out of all of them, only 5 will be useful here

  • lib/libwrapper.so: copy it to /usr/local/lib
  • lib/wrapper.jar: copy it to /usr/local/lib
  • bin/wrapper: copy it to /usr/local/bin
  • src/bin/sh.script.in: copy it and rename it to /usr/local/bin/myservice_wrapper and add the executable bit permission if this is not already the case (ie. /usr/local/bin/logstash_wrapper)
  • conf/wrapper.conf: copy it and rename it to /etc/myservice.conf (ie. /etc/logstash.conf)
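Assuming the commands are run from inside the extracted wrapper directory, the five copies above can be sketched as:

```shell
cp lib/libwrapper.so    /usr/local/lib/
cp lib/wrapper.jar      /usr/local/lib/
cp bin/wrapper          /usr/local/bin/
cp src/bin/sh.script.in /usr/local/bin/logstash_wrapper
chmod +x /usr/local/bin/logstash_wrapper
cp conf/wrapper.conf    /etc/logstash.conf
```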

And that is it for the installation of java-service-wrapper. From now on, if you need to add other java services, all you’ll need to do is copy sh.script.in and wrapper.conf again with the appropriate names.

Configuring sh.script.in and wrapper.conf accordingly

sh.script.in aka myservice_wrapper

The changes in this file are pretty straightforward; they concern the details of the way your service will be run.

These are the minimum changes one needs to make :

  • APP_NAME: your service name (ie. logstash)
  • APP_LONG_NAME: if you have a longer description
  • WRAPPER_CONF: /etc/myservice.conf (ie. /etc/logstash.conf)
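For the Logstash example, the edited variables in /usr/local/bin/logstash_wrapper would look like this (APP_LONG_NAME is just an illustrative description):

```shell
# minimal edits to the copied sh.script.in
APP_NAME="logstash"
APP_LONG_NAME="Logstash log management agent"   # illustrative description
WRAPPER_CONF="/etc/logstash.conf"
```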

Later, more in-depth changes can be made :

  • PRIORITY: specify nice value
  • PIDDIR: the path of where the pid file should be stored
  • RUN_AS_USER: specify the user the service should be run as
  • USE_UPSTART: flag for using upstart

Also you can define the run levels when your service should be started/stopped in this specific file

wrapper.conf aka myservice.conf

Here, the configuration is a bit more complex. This file defines the way your java program will be called (classpath, libraries, parameters, etc…). We will see some aspects of the configuration here, but for full details of what is possible please refer to the official doc

wrapper.java.command

Simply specify the java executable path

wrapper.java.command=/usr/bin/java

wrapper.java.mainclass

Specify the class to execute when the Wrapper starts the application. There are 4 possibilities; refer to the doc for more information.

Basically, if your application comes as a jar use WrapperJarApp; if it comes in a different form use WrapperSimpleApp. Since Logstash comes as a jar we will be using the WrapperJarApp class

wrapper.java.mainclass=org.tanukisoftware.wrapper.WrapperJarApp

wrapper.logfile

Log file to which all output to the console will be logged

wrapper.logfile=/var/log/logstash_wrapper.log

wrapper.java.library.path.x

Java library path to use

wrapper.java.library.path.1=/usr/local/lib

wrapper.java.classpath.x

Java classpath to use

wrapper.java.classpath.1=/usr/local/lib/wrapper.jar
wrapper.java.classpath.2=/usr/local/bin/logstash.jar

wrapper.java.additional.x

Additional Java parameters to pass to Java when it is launched. These are not parameters for your application, but rather parameters for the JVM.

wrapper.java.additional.1=-Xms1G
wrapper.java.additional.2=-Xmx1G

wrapper.app.parameter.x

And finally the parameter we want to pass to our service.
We’ve seen previously that we ran logstash the following way : java -jar /usr/local/bin/logstash.jar agent -f /etc/logstash/test.conf

This will be translated with the following configuration

wrapper.app.parameter.1=/usr/local/bin/logstash.jar
wrapper.app.parameter.2=agent
wrapper.app.parameter.3=-f
wrapper.app.parameter.4=/etc/logstash/test.conf

And we are done !

Testing

The first step of testing is to verify that the wrapper shell script works correctly.
Run : /usr/local/bin/logstash_wrapper console

If that works you should see something similar to this

wrapper  | --> Wrapper Started as Console
wrapper  | Java Service Wrapper Community Edition 32-bit 3.5.17
wrapper  |   Copyright (C) 1999-2012 Tanuki Software, Ltd. All Rights Reserved.
wrapper  |     http://wrapper.tanukisoftware.com
wrapper  | 
wrapper  | Launching a JVM...
jvm 1    | WrapperManager: Initializing...
jvm 1    | {:message=>"Read config", :level=>:info, :file=>"/home/vagrant/logstash-1.1.9-monolithic.jar!/logstash/agent.rb", :line=>"329", :method=>"run"}
jvm 1    | {:message=>"Start thread", :level=>:info, :file=>"/home/vagrant/logstash-1.1.9-monolithic.jar!/logstash/agent.rb", :line=>"332", :method=>"run"}

Finally, create a symbolic link /etc/init.d/logstash pointing to /usr/local/bin/logstash_wrapper, and you are done.
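On CentOS, the link and (optionally) the boot registration can be sketched as:

```shell
ln -s /usr/local/bin/logstash_wrapper /etc/init.d/logstash

# optional: register the service for boot (RHEL/CentOS)
chkconfig --add logstash
chkconfig logstash on
```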

service logstash start
service logstash status
service logstash stop

All three commands should be available.

Now, if you change OSes, all you have to do is edit the file paths in wrapper.conf (ie. /etc/logstash.conf) to reflect the actual paths on the new OS, and nothing else. You will be up and running in no time.

Conclusion

Java-service-wrapper is one out of several options for this specific task. I am not claiming it is the solution that will solve all your problems, but it is a strong option: it solves, in an understandable way, both the java-service daemonization issue and the porting to multiple OSes. JSW saves you time and effort in the long run. Now write the configuration file and daemonize your services everywhere. QED