Monitor your cluster of Tomcat applications with Logstash and Kibana

A month ago I wrote a post about the potential of Logstash, ElasticSearch and Kibana used together. The example back then was fairly simple, so today’s goal is to see how to make the most of those tools in an IT infrastructure with real-life problems. The objective is to show how to monitor, in one central place, the logs coming from a cluster of Tomcat servers.

Problem : Monitor a cluster of Tomcat applications

Let’s take a cluster of 3 identical nodes, each of which hosts 3 Tomcat applications, so there are 9 applications to monitor. A load balancer stands in front of the cluster, so customers can be on any node at any time.

Now if an error happens in one of the user-facing applications and no log management system is in place, one needs to log into each and every node of the cluster and analyze the logs. I know it can be scripted, but you get the point: it is not the optimal way.

This post aims to show how this problem can be tackled using Logstash, Redis, ElasticSearch and Kibana to build a strong, highly scalable and customizable log management system.


Here we will apply the following scheme from the Logstash website. The only difference is that Kibana will be used instead of the embedded Logstash web interface.


What does what

  • Logstash: As a log shipper and a log indexer
  • Redis : As a broker – used as a queuing system
  • ElasticSearch : As a log indexer – store and index logs
  • Kibana : As a front-end viewer – a nice UI with useful extra features



Logstash comes as a single jar that bundles everything it needs to run.
The jar file is available here

To run it simply execute :
java -jar logstash-1.1.9-monolithic.jar agent -f CONFFILE -l LOGFILE

This will start a Logstash instance that acts according to the CONFFILE it has been started with. To make this a bit cleaner, it is recommended to daemonize it so it can be started, stopped and launched at boot time with the traditional tools. The java-service-wrapper libraries will let you daemonize Logstash in no time; a traditional init.d script works too.

Logstash needs to be installed on both the cluster nodes (shippers) and the central server where the logs will be gathered, stored and indexed (indexer).


Redis is a highly scalable key-value store; it will be used as a broker here. It will be installed on the central location.
Redis installation is pretty straightforward, as packages are available for every major Linux distribution. (CentOS users will need to install EPEL first.)

  1. Edit /etc/redis.conf, change bind to bind YOUR.IP.ADDR.ESS
  2. Make sure Redis is configured to start at boot (chkconfig/update-rc.d)

Important : Make sure the firewall rules allow the Logstash shippers to communicate with Redis

Elastic Search

Unfortunately ElasticSearch cannot be found in the package repositories of most Linux distributions yet. Debian users are a bit luckier, since a .deb is provided for them; users of other distributions will need to install it manually. ElasticSearch will be installed on the central location.

Get the source and, as with Logstash, I would recommend using the java-service-wrapper libraries to daemonize it.

You need to edit the configuration file elasticsearch.yml and uncomment and replace it to The rest of the configuration needs to be edited based on the workload that is expected.


Kibana does not have packages yet, so the source code needs to be retrieved from GitHub or from the Kibana website itself. Installation is straightforward.

  1. wget
  2. tar xzf v0.2.0.tar.gz && cd Kibana-0.2.0
  3. gem install bundler
  4. bundle install
  5. vim KibanaConfig.rb
  6. bundle exec ruby kibana.rb

Main fields to configure in KibanaConfig.rb:

  • Elasticsearch : the URL of your ES server
  • KibanaPort : the PORT to reach Kibana
  • KibanaHost : The URL Kibana is bound to
  • Default_fields : The fields you’d like to see on your dashboard
  • (Extra) Smart_index_pattern : The index Kibana should look into

Kibana will be installed on the central location. Look into `sample/kibana` and `kibana-daemon.rb` for how to daemonize it.


Tomcat Servers

Here, Logstash will monitor two kinds of logs: the application logs and the access logs.

Access Logs

In order to enable access logs in Tomcat, edit your /usr/share/tomcat7/conf/server.xml and add the AccessLogValve valve

<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
    prefix="localhost_access_log." suffix=".txt" renameOnRotate="true"
    pattern="%h %l %u %t &quot;%r&quot; %s %b" />

Application Logs

Here a library that outputs log4j messages directly in the Logstash json_event format will be used, so no grokking will be required. Special thanks to @lusis for the hard work.

Configure log4j.xml

Edit the /usr/share/tomcat7/webapps/myapp1/WEB-INF/classes/log4j.xml

<appender name="MYLOGFILE" class="org.apache.log4j.DailyRollingFileAppender">
    <param name="File" value="/path/to/my/log.log"/>
    <param name="Append" value="false"/>
    <param name="DatePattern" value="'.'yyyy-MM-dd"/>
    <layout class="net.logstash.log4j.JSONEventLayout"/>
</appender>

Logstash file


An example of what a Logstash shipper config file could look like

	input {

		file {
			path => '/path/to/my/log.log'
			format => 'json_event'
			type => 'log4j'
			tags => 'myappX-nodeX'
		}

		file {
			path => '/var/log/tomcat7/localhost_access_log..txt'
			format => 'plain'
			type => 'access-log'
			tags => 'nodeX'
		}

	}

	filter {

		grok {
			type => "access-log"
			pattern => "%{IP:client} \- \- \[%{DATA:datestamp}\] \"%{WORD:method} %{URIPATH:uri_path}%{URIPARAM:params} %{DATA:protocol}\" %{NUMBER:code} %{NUMBER:bytes}"
		}

		kv {
			type => "access-log"
			fields => ["params"]
			field_split => "&?"
		}

		urldecode {
			type => "access-log"
			all_fields => true
		}

	}

	output {

		redis {
			host => "YOUR.IP.ADDR.ESS"
			data_type => "list"
			key => "logstash"
		}

	}
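
To get a feel for what the grok, kv and urldecode filters above extract from an access log line, here is a minimal Python sketch. It is only an approximation written with the standard `re` and `urllib.parse` modules, not how Logstash implements these filters. The sample line is made up, and unlike the grok pattern the regular expression tolerates a missing query string and a `-` byte count.

```python
import re
from urllib.parse import parse_qsl

# Rough regex counterpart of the grok pattern above (an approximation,
# not Logstash's own implementation).
ACCESS_LINE = re.compile(
    r'(?P<client>\S+) - - \[(?P<datestamp>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<uri_path>[^?\s]+)(?P<params>\?\S*)? (?P<protocol>[^"]+)" '
    r'(?P<code>\d+) (?P<bytes>\d+|-)'
)


def parse_access_line(line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    match = ACCESS_LINE.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Like kv + urldecode: split the query string into decoded key/value pairs.
    query = (event.pop("params") or "").lstrip("?")
    event["params"] = dict(parse_qsl(query))
    return event


# Hypothetical sample line in the access log format configured in the valve.
sample = '10.0.0.7 - - [02/Mar/2013:11:22:48 +0100] "GET /myapp1/search?q=log%20stash&page=2 HTTP/1.1" 200 5120'
print(parse_access_line(sample))
```

Each named group plays the role of one grok capture (client, method, uri_path and so on), which is what makes the fields individually searchable once they reach ElasticSearch.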



An example of what a Logstash indexer config file could look like

	input {

		redis {
			host => "YOUR.IP.ADDR.ESS"
			type => "redis-input"
			data_type => "list"
			key => "logstash"
			format => "json_event"
		}

	}

	output {

		elasticsearch {
			host => "YOUR.IP.ADDR.ESS"
		}

	}
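
The broker role of Redis in these two configurations can be pictured as a simple FIFO queue: shippers append JSON events to the list stored under the key "logstash" and the indexer takes them from the other end. The Python sketch below only illustrates that idea; a `collections.deque` stands in for the Redis list, and the function names `ship` and `index_next` are hypothetical.

```python
import json
from collections import deque

# In-memory stand-in for the Redis list stored under the key "logstash".
broker = deque()


def ship(event):
    """Shipper side: serialize the event and append it to the tail of the list."""
    broker.append(json.dumps(event))


def index_next():
    """Indexer side: take the oldest event from the head, or None if the list is empty."""
    if not broker:
        return None
    return json.loads(broker.popleft())


ship({"type": "log4j", "tags": ["myappX-nodeX"], "message": "first"})
ship({"type": "access-log", "tags": ["nodeX"], "message": "second"})
print(index_next()["message"])
print(index_next()["message"])
```

Because the queue sits between the two sides, shippers can keep pushing events even when the indexer is briefly slower than they are.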



  1. Make sure Redis + ElasticSearch + Logstash (indexer) + Kibana are started
  2. Make sure all Logstash shippers are started
  3. Go to YOUR.IP.ADDR.ESS:5601 and enjoy a nice structured flow of logs

The Lucene query language can be used in the header text box to query and filter results. Kibana will soon have a new interface that lets one build customized log dashboards; take a peek at the demo, it does look promising.

Find below some screenshots (from the current version of Kibana) of what the configuration described in this post provides :

[Screenshot: Log access analysis]
[Screenshot: Application log analysis]
[Screenshot: Application log analysis, details]

You can see that the events are tagged ‘pentaho-node1’, so it is now easy to know which application (pentaho) on which node (node1) produced the error.

Kibana has some excellent features; take the time to get to know them.


Last month’s Twitter example was not oriented towards real-life problems, but this post shows all the power behind these tools. Once they are set up correctly, one can gather all the logs of a cluster, explore them and easily figure out where issues are coming from, all at one’s fingertips. QED


java-service-wrapper or how to daemonize your java services for all major OSes

Java services (and Java programs in general) come in different flavors: as a jar file, as a shell script that runs the Java program after parsing your options, or in some other form. More often than not, Java services do not come with a handy init.d script, leaving you responsible for writing and maintaining one on your own. Writing init.d scripts isn’t the most attractive task one can be asked to do, and it can get really cumbersome. It also raises the question ‘What if I want to run it on another OS ?’, since the script will need to be adapted each and every time. This blog post introduces Java-Service-Wrapper (JSW), which lets you run your Java services as native Unix/Linux/Windows services in a painless way.

To show how java-service-wrapper works, Logstash (a log management tool, see this post if you want more details) will be installed as a Linux service.

Getting Logstash and running it manually

First things first, let’s download the Logstash jar here. It contains all the dependencies Logstash needs to run.

Then create the configuration file (e.g. /etc/logstash/test.conf) that Logstash will be run against :

input {
    file {
        path => '/var/log/secure'
        type => 'secure'
    }
}
output {
    file {
        path => '/tmp/test.json'
    }
}

Finally simply run : java -jar logstash-1.1.9-monolithic.jar agent -f /etc/logstash/test.conf
After a few seconds (maybe a minute or two), if you ssh to the box where Logstash is running, you should see the corresponding /var/log/secure log lines appear in your /tmp/test.json file.

We have a working version of Logstash, but you can already see that starting and stopping it without a proper service will be a pain.

To keep things clean, we will move logstash-1.1.9-monolithic.jar and rename it to /usr/local/bin/logstash.jar

Getting java-service-wrapper

For this step simply download the appropriate tar ball from here

Once untarred, you’ll be presented with some files and directories, out of which only 5 will be useful here

  • lib/ copy it to /usr/local/lib
  • lib/wrapper.jar: copy it to /usr/local/lib
  • bin/wrapper: copy it to /usr/local/bin
  • src/bin/ copy it and rename it to /usr/local/bin/myservice_wrapper and add the executable bit permission if this is not already the case (ie. /usr/local/bin/logstash_wrapper)
  • conf/wrapper.conf: copy it and rename it to /etc/myservice.conf (ie. /etc/logstash.conf)

And that is it for the installation of java-service-wrapper. From now on, if you need to add other Java services, all you’ll need to do is copy the wrapper script and wrapper.conf again under the appropriate names.

Configuring the wrapper script and wrapper.conf accordingly

The wrapper script aka myservice_wrapper

The changes in this file are pretty straightforward; they concern the details of how your service will be run.

These are the minimum changes one needs to make :

  • APP_NAME: your service name (ie. logstash)
  • APP_LONG_NAME: if you have a longer description
  • WRAPPER_CONF: /etc/myservice.conf (ie. /etc/logstash.conf)

Later, more in-depth changes can be made :

  • PRIORITY: specify nice value
  • PIDDIR: the path of where the pid file should be stored
  • RUN_AS_USER: specify the user the service should be run as
  • USE_UPSTART: flag for using upstart

Also you can define the run levels when your service should be started/stopped in this specific file

wrapper.conf aka myservice.conf

Here, the configuration is a bit more complex. This file defines the way your Java program will be called (classpath, libraries, parameters, etc…). We will see some aspects of the configuration here, but for full details of what is possible please refer to the official doc

Simply specify the java executable path

Specify the class to execute when the Wrapper starts the application. There are 4 possibilities; refer to the doc for more information.

Basically, if your application comes as a jar use WrapperJarApp, and if it comes in a different form use WrapperSimpleApp. Since Logstash comes as a jar, we will be using the WrapperJarApp class


Log file to which all output to the console will be logged

Java library path to use

Java classpath to use

Additional Java parameters to pass to Java when it is launched. These are not parameters for your application, but rather parameters for the JVM.

And finally the parameter we want to pass to our service.
We’ve seen previously that we ran Logstash the following way : java -jar /usr/local/bin/logstash.jar agent -f /tmp/test.conf

This will be translated with the following configuration
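
As a rough illustration of that translation, the Python sketch below turns the command line into numbered application parameters. It assumes, based on my reading of the JSW documentation, that WrapperJarApp expects the jar path as its first application parameter and that each following argument goes into its own `wrapper.app.parameter.N` property; the helper name `wrapper_app_parameters` is made up for this example, so check the generated lines against the official doc before using them.

```python
import shlex


def wrapper_app_parameters(command):
    """Turn 'java -jar <jar> [args...]' into numbered wrapper.app.parameter lines."""
    tokens = shlex.split(command)
    if tokens[:2] != ["java", "-jar"]:
        raise ValueError("expected a command of the form: java -jar <jar> [args...]")
    # Everything after 'java -jar' (the jar path, then its arguments) gets one line each.
    return [
        "wrapper.app.parameter.%d=%s" % (index, token)
        for index, token in enumerate(tokens[2:], start=1)
    ]


command = "java -jar /usr/local/bin/logstash.jar agent -f /tmp/test.conf"
for line in wrapper_app_parameters(command):
    print(line)
```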

And we are done !


First step of testing will be to verify the wrapper shell script does work correctly.
Run : /usr/local/bin/logstash_wrapper console

If that works you should see something similar to this

wrapper  | --> Wrapper Started as Console
wrapper  | Java Service Wrapper Community Edition 32-bit 3.5.17
wrapper  |   Copyright (C) 1999-2012 Tanuki Software, Ltd. All Rights Reserved.
wrapper  |
wrapper  | 
wrapper  | Launching a JVM...
jvm 1    | WrapperManager: Initializing...
jvm 1    | {:message=>"Read config", :level=>:info, :file=>"/home/vagrant/logstash-1.1.9-monolithic.jar!/logstash/agent.rb", :line=>"329", :method=>"run"}
jvm 1    | {:message=>"Start thread", :level=>:info, :file=>"/home/vagrant/logstash-1.1.9-monolithic.jar!/logstash/agent.rb", :line=>"332", :method=>"run"}

Finally create a symbolic link /etc/init.d/logstash pointing to your /usr/local/bin/logstash_wrapper, and you are done.

service logstash start
service logstash status
service logstash stop

All three commands should be available.

Now if you change OSes, all you have to do is edit the file paths in the wrapper.conf (e.g. /etc/logstash.conf) to reflect the actual paths on the new OS and nothing else. You will be up and running in no time.


Java-service-wrapper is one of several options for this specific task. I am not claiming that it is the solution to all your problems, but it is a strong option: it solves, in an understandable way, both the Java service daemonization issue and the porting across multiple OSes. JSW saves you time and effort in the long run. Now write the configuration file and daemonize it everywhere. QED

Powerful Analysis Tool using Logstash + ElasticSearch + Kibana

Reading about Logstash for the first time, I thought ‘Yet Another Log Management Tool’, but I was totally wrong.

As its author claims a log is nothing more than :

date + content = LOG

So sure, all our system logs look that way (apache, nginx, mail, mysql, auth, etc…), but they are not the only ones. What about git commits, tweets, Facebook statuses, Nike+ runs, purchases, etc… ?

  • Git : A git commit includes a timestamp with a message + commit details
  • Tweet : A tweet is a message posted at a specific point-in-time
  • Facebook Status : A Facebook status is a message posted at a specific point-in-time
  • Nike+ run : A run ends at a specific point-in-time and conveys extra data (Distance, Length, GPS tracks)
  • A purchase : A purchase is made at a specific point-in-time and conveys extra data (Total amount, quantity of product bought, etc…)
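
The "date + content" idea can be sketched in a few lines of Python. The helper `make_event` and the sample values are invented for illustration; the point is only that very different things share the same shape once they carry a timestamp.

```python
from datetime import datetime, timezone


def make_event(timestamp, content, **extra):
    """A 'log' in the broad sense: a point in time plus some content and optional extra data."""
    event = {"timestamp": timestamp.isoformat(), "content": content}
    event.update(extra)
    return event


events = [
    make_event(datetime(2013, 3, 2, 11, 22, tzinfo=timezone.utc), "Fix typo in README", kind="git-commit"),
    make_event(datetime(2013, 3, 2, 9, 5, tzinfo=timezone.utc), "Morning run", kind="run", distance_km=8.4),
    make_event(datetime(2013, 3, 2, 10, 40, tzinfo=timezone.utc), "Order placed", kind="purchase", total=59.9, quantity=3),
]

# Because every event carries a timestamp, they can all be put on one timeline.
for event in sorted(events, key=lambda e: e["timestamp"]):
    print(event["timestamp"], event["kind"], event["content"])
```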

So more than a simple log management tool, Logstash with the help of Kibana and ElasticSearch can form a really powerful and fast analysis tool.

Installation & Demo (~10 minutes)


  • Java 1.6+ needs to be installed
  • The bundler gem needs to be installed

Download logstash


Create a logstash-twitter.conf

input {
    twitter {
        type           => "twitter"
        user           => "username"
        password       => "password"
        message_format => "json"
        keywords       => ["kibana", "logstash", "elasticsearch"]
    }
}

output {
    elasticsearch {
        embedded       => true
    }
}

Run logstash

java -jar logstash-1.1.9-monolithic.jar agent -f logstash-twitter.conf

Download & Install Kibana

curl -L | tar -xzvf -
cd Kibana-0.2.0
bundle install
ruby kibana.rb

Now access it via http://localhost:5601/

Screenshot (since today was the CouchDB conference in Berlin, I expected to get more input by tracking the couchdb keyword)

Events list

[Screenshot: Kibana example]

Event detail

[Screenshot: Kibana event detail]

And done: every time someone tweets about Kibana, Logstash or ElasticSearch, you will have all the information about that tweet in a nice UI

Technical Explanations

Logstash –

Logstash works as a pipeline system : inputs | filters | outputs. In our simple example we used only the inputs | outputs part of the pipe. By default Logstash offers 26 different inputs and 45 different outputs. Here the twitter input relies on the Twitter Streaming API to retrieve the tweets containing the keywords we mentioned, then sends them directly to our ElasticSearch instance, which stores and indexes them.
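
The inputs | filters | outputs idea can be sketched in Python as follows. This is a toy model, not Logstash's implementation: `run_pipeline` and `keep_keywords` are hypothetical names, the tweets are made up, and in the real setup the keyword selection is done by the Twitter Streaming API rather than by a filter.

```python
def run_pipeline(inputs, filters, outputs):
    """Push every event from every input through the filters, then to every output."""
    for source in inputs:
        for event in source:
            for apply_filter in filters:
                event = apply_filter(event)
                if event is None:  # a filter may drop the event
                    break
            else:
                # Reached only when no filter dropped the event.
                for emit in outputs:
                    emit(event)


def keep_keywords(event):
    """Keep only events mentioning one of the tracked keywords."""
    keywords = ("kibana", "logstash", "elasticsearch")
    text = event["message"].lower()
    return event if any(keyword in text for keyword in keywords) else None


stored = []  # stands in for the ElasticSearch output
tweets = [{"message": "Trying out logstash"}, {"message": "Lunch time"}]
run_pipeline(inputs=[tweets], filters=[keep_keywords], outputs=[stored.append])
print(stored)
```

With an empty filter list the events go straight from the inputs to the outputs, which is the shape of the Twitter example in this post.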

ElasticSearch –

ElasticSearch is a distributed, RESTful search server based on Apache Lucene. It fully supports the near real-time search of Apache Lucene. Its role here is to index and store all the events it is fed. The server supports the Lucene Query Language.

Note : here the embedded ElasticSearch component is used. I would advise setting up a standalone ElasticSearch instance instead.

Kibana –

Kibana is the UI that sits on top of ElasticSearch. It will give you the interface to explore your data, select them, drill into them, filter them, group them etc… Even though it is pretty basic, it lets you get the most out of your data.


Even if this example is really basic, and mostly useless, it shows how quickly a powerful analysis tool can be set up using this trio. If the input you are looking for does not exist yet, simply create it (the same applies to outputs).
Do not forget that everything with a date (specific point-in-time) and a content is in some way a log. Now you know how to analyze and measure it. QED

Gitlab + Custom Hooks

With Gitlab (and with Github too) it is straightforward to add post-receive web-hooks so that actions can be taken after a push event. Unlike Github, Gitlab is normally self-hosted, which opens up interesting possibilities with custom post-receive (or any other) hooks. Unfortunately it is not possible to add custom hooks directly from the web interface; it needs to be done under the hood.

Gitlab relies on Gitolite for its authorization process, and we will make it rely on Gitolite for git hook management as well. We will stick to the Gitolite way of decoupling hooks described in the doc, in the section on hook chaining

How to make Gitlab custom hooks aware ?

Remember during Gitlab’s installation the following step – copying Gitlab custom post-receive hook to Gitolite hooks directory :

cp ./lib/hooks/post-receive /home/git/.gitolite/hooks/common/post-receive

In order to make Gitlab custom post-receive hook aware, you need to edit the /home/git/.gitolite/hooks/common/post-receive file so it looks like this :

#!/usr/bin/env bash

# This file was placed here by GitLab. It makes sure that your pushed commits
# will be processed properly.

while read oldrev newrev ref
do
  path_to_hook="/home/git/.gitolite/hooks/common/post-receive.secondary.d"  # directory holding the per-project hooks
  pwd=`pwd`
  reponame=`basename "$pwd" | sed s/\.git$//`
  env -i redis-cli rpush "resque:gitlab:queue:post_receive" "{\"class\":\"PostReceive\",\"args\":[\"$reponame\",\"$oldrev\",\"$newrev\",\"$ref\",\"$GL_USER\"]}" > /dev/null 2>&1
  if [ -x "$path_to_hook/$reponame" ];then
    "$path_to_hook/$reponame" "$reponame" "$oldrev" "$newrev" "$ref" "$GL_USER"
  fi
done

How does it work ?

Explanation of the differences with the previous version of the file

  • Line 8 : indicates the directory where the post-receive hooks will be stored
  • Lines 12-14 : if a post-receive hook exists for this project, execute it

In practice

In practice there will be one post-receive hook per project, and the hook should be named after the project. (Do not forget to make it executable.)

And that’s about it: from now on, every time you push your project to Gitlab, it will execute the post-receive script located in $path_to_hook and named after the project itself.
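
The dispatch rule described above (run the hook named after the project, if it exists and is executable) can be sketched in Python as below. This only mirrors the logic of the shell script for illustration; the function name `run_project_hook` is made up and the real hook stays a shell script.

```python
import os
import subprocess


def run_project_hook(hook_dir, reponame, oldrev, newrev, ref, user):
    """Run <hook_dir>/<reponame> if it is an executable file; return its exit code, else None."""
    hook = os.path.join(hook_dir, reponame)
    if not (os.path.isfile(hook) and os.access(hook, os.X_OK)):
        return None  # no hook for this project: nothing to do
    # Same arguments, in the same order, as in the shell version.
    return subprocess.call([hook, reponame, oldrev, newrev, ref, user])


# With a directory that does not exist there is no hook to run, so this prints None.
print(run_project_hook("/nonexistent-hook-dir", "customhooks", "0" * 40, "1" * 40, "refs/heads/master", "john"))
```

A project without a matching file is simply skipped, so hooks can be added one project at a time.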


Project name : customhooks
Post-Receive Hook: location is /home/git/.gitolite/hooks/common/post-receive.secondary.d/customhooks


GIT_WORK_TREE=/var/www/blog git checkout -f

Note : the post-receive scripts can be written in any scripting language, be it Shell, Ruby, Python, Perl, etc…

After pushing the customhooks project, I will have a copy of my actual project in the directory /var/www/blog. It’s up to you now to make your hooks as sophisticated as your needs require.


This post shows how to do it specifically for the post-receive hook, but the same logic can be applied to the other available hooks. Remember, Gitolite manages them, not Gitlab directly.
Even though Gitlab does not natively let you add custom hooks, it is an easy feature to add. QED

Gitolite + OpenLDAP

While for a small project one can easily manage Gitolite authorization permissions manually, this task can get really cumbersome as the project grows and different roles get different permissions (e.g. devel, qa, etc…)

Companies traditionally rely on a centralized system, LDAP being one of them, to handle their users, the groups they belong to and as much information as they actually need (or not). The purpose of this post is to see how to make Gitolite rely on information stored in an LDAP DIT to allow users to perform specific actions on the git repositories.

Prerequisite : In order to follow this post you will need a working Gitolite installation (v3.0+) and a reachable LDAP directory.

This is the LDIF file that will be used to handle authentication :

dn: cn=john,ou=group,dc=yanisguenane,dc=fr
cn: john
gidNumber: 20001
objectClass: top
objectClass: posixGroup
memberUid: john

dn: cn=jane,ou=group,dc=yanisguenane,dc=fr
cn: jane
gidNumber: 20002
objectClass: top
objectClass: posixGroup
memberUid: jane

dn: cn=devel,ou=group,dc=yanisguenane,dc=fr
cn: devel
gidNumber: 20003
objectClass: top
objectClass: posixGroup
memberUid: john

dn: uid=jane,ou=people,dc=yanisguenane,dc=fr
uid: jane
uidNumber: 10000
gidNumber: 10000
cn: jane
sn: jane
objectClass: top
objectClass: person
objectClass: posixAccount
objectClass: shadowAccount
loginShell: /bin/bash
homeDirectory: /home/jane

dn: uid=john,ou=people,dc=yanisguenane,dc=fr
uid: john
uidNumber: 10001
gidNumber: 10001
cn: john
sn: john
objectClass: top
objectClass: person
objectClass: posixAccount
objectClass: shadowAccount
loginShell: /bin/bash
homeDirectory: /home/john

Make Gitolite LDAP aware

Though by default Gitolite is unaware of LDAP (and of any authentication system), its author left a door open for Gitolite to query whichever authentication system one wants, be it LDAP or any other queryable system.

There are three rules to make that happen :

  • The query to the authentication system should be done via a script
  • The script should take the username as its only parameter
  • The script should return a space-separated list of the groups the given user belongs to

An example of an LDAP script can be found here
Note : It should be edited to match your LDAP DIT configuration; the linked script matches the LDIF used for this post
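
To make the three rules concrete, here is a minimal Python sketch of such a script. It is not the linked script: instead of querying LDAP it looks the user up in a hard-coded dictionary copied from the posixGroup entries of the LDIF above, and the names `GROUP_MEMBERS`, `groups_for` and `main` are invented for this example.

```python
import sys

# Stand-in for the directory: posixGroup name -> memberUid values, copied from the LDIF above.
# A real script would query the LDAP server instead.
GROUP_MEMBERS = {
    "john": ["john"],
    "jane": ["jane"],
    "devel": ["john"],
}


def groups_for(username):
    """Return the sorted names of the groups that list username as a member."""
    return sorted(group for group, members in GROUP_MEMBERS.items() if username in members)


def main(argv):
    """Take the username as the only parameter and print its groups, space separated."""
    if len(argv) != 2:
        print("usage: %s USERNAME" % argv[0], file=sys.stderr)
        return 1
    print(" ".join(groups_for(argv[1])))
    return 0


# Simulated invocation; a real script would end with sys.exit(main(sys.argv)).
main(["ldap-query-groups-script", "john"])
```

With the LDIF of this post, john belongs to his own group and to devel, while jane only belongs to her own group.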

In order to make Gitolite LDAP aware one needs to edit the file located at $GITOLITE_HOME/.gitolite.rc. Inside the %RC hash, add the following line :

In v3

GROUPLIST_PGM           =>  '/path/to/ldap-query-groups-script',

In v2

$GL_GET_MEMBERSHIPS_PGM = '/path/to/ldap-query-groups-script';

And … done ! Your Gitolite installation is LDAP aware !

How to use it

  • Add the authorized users to the Gitolite keychain
    As you would do with a regular Gitolite setup, you need to add the user to the Gitolite keychain. The name of the public key file (.pub) should match the LDAP username you want to set up.

    Here, there are two ways to deal with it

    • Full LDAP : get the SSH keys by querying your LDAP DIT, if they are stored there for each user
    • Basic : copy the user’s public key file via your preferred way

  • Define the repositories and permissions
    Important : Remember that for a given username, the script will return the list of groups the user belongs to. Hence, your repository configuration should be group based and not user based. A good practice is for each user to have an individual group, so you can grant access to an individual user.

    repo test-ldap-devel
        RW+    =    @devel
    repo test-ldap-jane
        RW+    =    @john @jane
  • Finally push the changes
    Once configured to your needs simply push the changes.


Session 1 – john

john@workstation-john: ssh-keygen -t rsa -b 1024 -N '' -f ~/.ssh/john
john@workstation-john: scp ~/.ssh/ git add && git commit -m "" && git push origin master
john@workstation-john: git clone
Cloning into test-ldap-devel...
warning: You appear to have cloned an empty repository.

Session 2 – jane

jane@workstation-jane: ssh-keygen -t rsa -b 1024 -N '' -f ~/.ssh/jane
jane@workstation-jane: scp ~/.ssh/ git add && git commit -m "" && git push origin master
jane@workstation-jane: git clone
Cloning into test-ldap-devel...
FATAL: R any test-ldap-devel jane DENIED by fallthru
(or you mis-spelled the reponame)
fatal: The remote end hung up unexpectedly

jane@workstation-jane: git clone
Cloning into test-ldap-jane...
warning: You appear to have cloned an empty repository.


As we can see in Jane’s session, her attempt to clone test-ldap-devel was denied, but cloning test-ldap-jane did work. QED
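
The outcome of both sessions can be reproduced with a small Python model of group-based access. This is a deliberately simplified picture of what Gitolite does, with hypothetical names (`REPO_GROUPS`, `can_access`); the repositories and group lists are the ones used in this post.

```python
# Repository -> groups granted RW+, mirroring the Gitolite configuration in this post.
REPO_GROUPS = {
    "test-ldap-devel": {"devel"},
    "test-ldap-jane": {"john", "jane"},
}


def can_access(repo, user_groups):
    """True if at least one of the user's groups is granted access to the repository."""
    return bool(REPO_GROUPS.get(repo, set()) & set(user_groups))


# Group lists as the LDAP script would return them for the LDIF in this post.
john_groups = ["john", "devel"]
jane_groups = ["jane"]

print(can_access("test-ldap-devel", john_groups))
print(can_access("test-ldap-devel", jane_groups))
print(can_access("test-ldap-jane", jane_groups))
```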