Monday, 23 December 2013

Log4j 2 Memory-Mapped File Appender

During the weekend I dug into Java NIO, specifically, mapping files to memory to reduce I/O time. What's more, since I had a lot of free time on my hands, I developed a Log4j 2 memory-mapped file appender. On my machine, performance tests running on a single thread using the MemoryMappedFile appender show an improvement by a factor of 3 when compared to the RandomAccessFile appender.
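The appender's core NIO technique is small: map a region of the log file into memory and write bytes straight into the mapped buffer, so writes hit the OS page cache rather than going through `write()` calls. A minimal stand-alone sketch (file name and region size are arbitrary, not taken from the appender):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MappedWriter {
    public static void main(String[] args) throws IOException {
        Path log = Paths.get("app.log");
        try (FileChannel channel = FileChannel.open(log,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map a fixed-size region of the file into memory; puts on the
            // buffer write to the page cache instead of issuing syscalls.
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, 1024);
            buffer.put("log event\n".getBytes(StandardCharsets.UTF_8));
            // force() flushes the dirty pages to disk, similar to fsync.
            buffer.force();
        }
        System.out.println(new String(Files.readAllBytes(log), StandardCharsets.UTF_8).trim());
    }
}
```

The real appender additionally has to deal with remapping when the region fills up and with thread safety, but the buffer-put above is where the I/O savings come from.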

Wednesday, 11 December 2013

Apache Kafka for Event Sourcing

Event Sourcing is a pattern intended for "capturing all changes to an application state as a sequence of events". As explained by Fowler, the pattern is useful when you want the ability to completely rebuild the application state, perform temporal querying, or replay events. The LMAX platform is a famous example where Event Sourcing is applied to keep all application state in memory, contributing to the system's remarkably high throughput and low latency.

While investigating the architectural components of Samza, I came across a component that can be of great help when implementing Event Sourcing: Apache Kafka. Created by the folks at LinkedIn as a solution to their log processing requirements, Kafka is a broker with message replay built-in.

Kafka consumers receive messages from publish/subscribe channels known as topics. A topic is divided into user-defined partitions where a partition can serve messages only to a single consumer process. Balancing the message load between consumers is a matter of adding more partitions to the topic, assigning those partitions to other consumer instances, and finally, publishing messages to all topic partitions in a round robin fashion.

What fascinates me about Kafka is that, at any point in time, a consumer can rewind through the history of messages and re-consume them from a particular offset. For example, Consumer B can consume the latest messages or replay messages starting from, say, offset 1.

At face value, we could be forgiven for thinking that a broker with built-in message replay would have trouble achieving high throughput for large message volumes. After all, Kafka retains consumed as well as unconsumed messages on disk: presumably costlier than simply keeping unconsumed messages in memory. However, a few clever design decisions, such as relying on the OS page cache and minimising random disk I/O, gave LinkedIn engineers impressive throughput results when comparing Kafka against both ActiveMQ and RabbitMQ.

With the basic concepts and performance considerations out of the way, let me illustrate my point about Kafka's suitability for Event Sourcing by giving a code example:
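The original listing isn't reproduced here; a sketch along the same lines, written against the 0.8-era producer API (class and property names from that era; the broker address and message count are assumptions), might look like:

```java
import java.util.Date;
import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class OssandmeProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "localhost:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
        for (int i = 0; i < 10; i++) {
            // Topic, a key that routes the message to partition 0,
            // and the message content: the time it was published.
            producer.send(new KeyedMessage<String, String>(
                    "ossandme", "0", new Date().toString()));
        }
        producer.close();
    }
}
```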

The producer publishes a message to the topic ossandme on partition 0 a number of times. In particular, it creates each message by instantiating the KeyedMessage class with the following parameters:
  • Name of the topic to which the message is published.
  • ID of the partition the message will sit on.
  • Message content, in this case, the time the message was published.
The following consumer pulls messages from the topic ossandme:
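The consumer listing isn't reproduced here either; a sketch against the 0.8 SimpleConsumer API (host, port, timeouts, and client ID are assumptions) might look like:

```java
import java.nio.ByteBuffer;

import kafka.api.FetchRequest;
import kafka.api.FetchRequestBuilder;
import kafka.javaapi.FetchResponse;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.message.MessageAndOffset;

public class OssandmeConsumer {
    public static void main(String[] args) {
        SimpleConsumer consumer =
                new SimpleConsumer("localhost", 9092, 100000, 64 * 1024, "ossandme-consumer");
        long offset = 0;
        while (true) {
            FetchRequest request = new FetchRequestBuilder()
                    .clientId("ossandme-consumer")
                    .addFetch("ossandme", 0, offset, 100000)
                    .build();
            FetchResponse response = consumer.fetch(request);
            for (MessageAndOffset messageAndOffset : response.messageSet("ossandme", 0)) {
                ByteBuffer payload = messageAndOffset.message().payload();
                byte[] content = new byte[payload.limit()];
                payload.get(content);
                try {
                    // "Process" the message: print its offset and content.
                    System.out.println(messageAndOffset.offset() + ": " + new String(content, "UTF-8"));
                    offset = messageAndOffset.nextOffset();
                } catch (Exception e) {
                    // Replay: rewind to the first message in the partition
                    // and re-attempt to process everything from the start.
                    offset = 0;
                    break;
                }
            }
        }
    }
}
```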

For each message received, the application outputs the message's offset on partition 0 in addition to its content. The first thing to observe is that I've programmed against Kafka's low-level SimpleConsumer API. Alternatively, I could have opted for the High Level Consumer API to reduce development time. I chose the former because, with the latter, I was unable to find a way to replay any set of messages I wanted.

Event Sourcing comes into play when an exception occurs. On exception, the application rewinds to the first message in the partition and re-attempts to process all of the partition's messages. I like the fact that, to get this behaviour, I didn't have to introduce a database; I simply leveraged the broker's message replay capability. Without a database, I have one less moving part to think about in my architecture.

Kafka is a young project and I'm interested to see how it matures. I'm keen to hear people's experiences using Kafka and whether it proved to be the right solution for them. As the project matures, I suspect we'll hear more often about Kafka in our technical discussions along with the other, more established, open source brokers.

Tuesday, 27 August 2013

Bridging Mule and MSMQ with ZeroMQ

Hearing the words Mule and Microsoft's MSMQ in the same sentence sends a shiver down my spine. I remember once, Mule guru John D'Emic and I spent a considerable amount of time and patience getting Mule and MSMQ to talk to each other through DCOM. The major factor contributing to this unpleasant experience was our ignorance of the numerous security measures imposed by Windows to restrict DCOM access. The moral of the story: unless you have a veteran Windows administrator at your disposal, avoid the DCOM route.

So what choices do we have other than DCOM? JNI sounds promising, but then you sacrifice Mule's platform independence. Here's an idea: introduce a messaging bridge between Mule and MSMQ. The bridge can be implemented in any language that facilitates interaction with MSMQ; C# is an attractive option.

We still have to consider which middleware to use for exchanging messages between the bridge and Mule. There are many alternatives and among them is ZeroMQ. I think ZeroMQ is a good candidate for several reasons:
  • It supports asynchronous communication
  • You're not adding another component to your architecture
  • It's well documented in addition to having a low learning curve
  • A ZeroMQ transport [1] and binding are available for Mule and C# respectively
  • It will more than likely satisfy your message throughput requirements
In as little as 15 minutes I developed a simple bridge in C#:

The above code should be self-explanatory but I've put comments for your convenience.

Here's the Mule 3 app dispatching messages to the bridge:
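The original config isn't reproduced here; a sketch of what such a flow might look like follows. The namespace declarations are omitted, and the ZeroMQ endpoint attribute names are from memory of the transport's documentation, so treat them as assumptions:

```xml
<flow name="msmq-bridge">
    <http:inbound-endpoint host="localhost" port="8081"
                           path="bridge" exchange-pattern="request-response"/>
    <!-- Push the request payload asynchronously to the C# bridge -->
    <zeromq:outbound-endpoint address="tcp://localhost:5558"
                              socket-operation="push"
                              exchange-pattern="one-way"/>
</flow>
```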

On receiving an HTTP request, Mule leverages the ZeroMQ transport to send asynchronously the request's payload to the bridge.

In all likelihood, the illustrated bridge code for Mule-MSMQ interoperability won't serve all your needs. I can think of a dozen features a developer would want, such as destination queues resolved at run-time and an agreed format for message content. But hey, at least it's a start :-)

1: I've recently replaced the transport's ZeroMQ C++ library with a pure Java implementation of ZeroMQ.

Monday, 19 August 2013

JRuby Complex Classes in Java Method Signatures

As documented in the JRuby wiki, java_signature changes a method's signature to match the signature string passed to it:
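The original listing isn't reproduced here; a minimal sketch of java_signature with primitive types (class and method names are mine) looks like:

```ruby
require 'java'

class Calculator
  # The generated Java method will have the signature: int add(int, int)
  java_signature 'int add(int, int)'
  def add(a, b)
    a + b
  end
end
```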

Observe that the classes in the method signature are primitive. What if we use a complex class as a parameter type?
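A sketch of the failing attempt, assuming a user-defined class referenced directly in the signature string:

```ruby
require 'java'

class MyClass
end

class Worker
  # Referencing a complex (non-primitive) class in the signature string
  java_signature 'void run(int, MyClass)'
  def run(count, my_object)
  end
end
```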

Running the above code will give you the following NoMethodError message:

The way I went about using complex classes in signatures is to utilise add_method_signature instead of java_signature:
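A sketch of the working approach, following the description below (the exact argument forms may vary between JRuby versions):

```ruby
require 'java'

class MyClass
end

# Materialise MyClass into a real Java class; the false flag makes JRuby
# load it through its main class loader.
MyClass.become_java!(false)

class Worker
  # First argument: the method name. Second argument: a list of classes,
  # starting with the return class followed by the parameter classes.
  add_method_signature('run', [java.lang.Void::TYPE, Java::int, MyClass])
  def run(count, my_object)
  end
end
```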

add_method_signature expects the first argument to be the name of the method that will have its signature changed. The second argument is a list of classes: the list's first item is the return class (e.g., void) while the subsequent items are the signature's parameter classes (e.g., int and MyClass). Note that I invoke become_java! on the complex class. This tells MyClass to materialize itself into a Java class. The false flag is needed so that JRuby's main class loader is used to load the class. Without it, you'll be greeted by a java.lang.ClassNotFoundException.

Saturday, 10 August 2013

JRuby CXF: A Gem for Creating SOAP Web Services

It seems to me that Web Services don't receive much love from Rubyists. In fact, of the two Ruby projects I know that add Web Service support (SOAP4R and ActionWebService), both appear to be inactive. Someone might say that if Web Services are a must, then avoid Ruby or put an integration layer between your Ruby application and the client/service. From my experience, life is not always that simple and these solutions might not be applicable. 

The Java ecosystem has a popular and well-supported open source project that is used to build SOAP Web Services and clients. This project is called Apache CXF. On one fine sunny day I asked myself: "Wouldn't it be great if I could publish a Web Service from Ruby using Apache CXF?". Almost immediately I put that thought away. Trying to integrate a Java library into Ruby is, well, hard in my books. But then JRuby popped into my mind. JRuby is the Ruby language implemented in Java. This means that Ruby and Java objects talk to each other with relative ease.

Seeing the potential in the idea, last week I set about developing a JRuby wrapper gem for CXF. I must admit it was more challenging than I thought, but in the end I was happy with the results. The bulk of the work was customising the Aegis data binder so that it could map RubyObject instances.

The first step to using the gem is installing it:
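Assuming the gem is published under the name jruby-cxf, installation is the usual one-liner:

```shell
gem install jruby-cxf
```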

A code example is in order here:
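The original listing isn't reproduced here; a plain Ruby class along the lines the post describes (method bodies are mine for illustration):

```ruby
class HelloWorld
  def say_hello(name)
    "Hello #{name}"
  end

  def give_age(name, age)
    "#{name} is #{age} years old"
  end
end
```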

Publishing the above class as a Web Service means requiring the gem and including the module CXF::WebServiceServlet:
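A sketch of the change (the require name is a guess based on the gem's name):

```ruby
require 'jruby_cxf'

class HelloWorld
  include CXF::WebServiceServlet

  def say_hello(name)
    "Hello #{name}"
  end
end
```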

Including WebServiceServlet causes the class to become a regular Java servlet. This implies that any servlet container can load the Web Service. For this example, I'll load the Web Service using an embedded Jetty:
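A sketch of the Jetty bootstrap, calling the Jetty 8 servlet API from JRuby (path and port are assumptions):

```ruby
require 'java'

server = org.eclipse.jetty.server.Server.new(8080)
context = org.eclipse.jetty.servlet.ServletContextHandler.new(server, '/')
# Because HelloWorld is now a regular Java servlet, it can be mounted
# in any servlet container.
context.add_servlet(org.eclipse.jetty.servlet.ServletHolder.new(HelloWorld.new), '/hello-world')
server.start
server.join
```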

Running the example requires two libraries to be available in the Java classpath: CXF 2.7.6 and Jetty 8.

Accessing the URL http://localhost:8080/hello-world?wsdl with a browser will display the following WSDL:

You'll note that the operations are missing from the WSDL. This is because I didn't tell CXF to expose any of the methods in the class HelloWorld as Web Service operations. Let me do that now:
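A sketch of the expose call; the exact option syntax is from memory of the gem's README, so treat it as illustrative:

```ruby
class HelloWorld
  include CXF::WebServiceServlet

  def say_hello(name)
    "Hello #{name}"
  end
  # :expects is an ordered list of parameter-name => type hashes;
  # :returns is the operation's return type.
  expose :say_hello, :expects => [{:name => :string}], :returns => :string
end
```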

expose tells CXF to publish the method denoted by the first argument (i.e., :say_hello). The second argument in expose is a map. It should have at a minimum the following entries:
  1.  expects - maps to an ordered list of hashes where each hash corresponds to a method parameter and its expected type.

  2.  returns - maps to the expected return type (e.g., :string).
Re-executing the example will give out this WSDL:

The gem supports various options to customise the WSDL. For instance, the service name and namespace can be changed:

The complete list of options is found in the project repository's README file.

Till now I've assumed that a Web Service operation will only accept simple types. In the real world we're more likely to be using complex types:
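A sketch of the complex-type version, following the description below (member argument order and the exposed signatures are assumptions):

```ruby
class Animal
  include CXF::ComplexType

  member :name, :string
end

class Person
  include CXF::ComplexType

  member :name, :string
  member :age, :int
  # :required defaults to true when not specified
  member :pet, Animal, :required => false
end

class HelloWorld
  include CXF::WebServiceServlet

  def say_hello(person)
    "Hello #{person.name}"
  end
  expose :say_hello, :expects => [{:person => Person}], :returns => :string

  def give_age(person)
    person.age
  end
  expose :give_age, :expects => [{:person => Person}], :returns => :int
end
```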

I've added two classes in the example: Animal and Person. It is necessary to include the CXF::ComplexType module so that CXF can derive an XML schema from these classes and embed the schema in the WSDL. A complex type element is declared using the method member. A member needs at least a name for the element and its type. You could also declare whether a property is required as seen in the member pet. The required option defaults to true if not specified.

Note that now say_hello and give_age are expecting a Person object instead of primitive types and they are accessing the object via accessors. Behind the scenes the gem creates an accessor for each member that is declared.

I hope I've given you enough info to get started with the gem. My plan is to maintain JRuby CXF as I believe it could be useful for those who aren't happy with the current alternatives. Of course, if you find an issue with the gem, I'd be more than happy to accept code contributions ;-).

Wednesday, 24 July 2013

The Rails Way to Search Filtering

It's been a while since my last post. Numerous changes have taken place in my professional life since then. I decided to take a break from my career and focus my energy on learning new technologies outside my area of expertise. One biggie is Ruby on Rails 3.

To test my RoR as well as my UI skills, I created evplore, a website for sharing, discovering and creating events. A requirement for evplore is to allow users to filter events by country, locality, tag, or date.

Before delving into the nitty-gritty of implementing search filtering, let's have a glance at the EventsController:

The controller does what you might expect: it invokes the Event model's find method to retrieve the events that match the user's search criteria.

Here's a dirty way of implementing the find method inside the Event model:
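A reconstruction sketch of the "dirty" approach; the schema (a country column, a tags association) is assumed, and only the country/tag combinations are spelled out:

```ruby
class Event < ActiveRecord::Base
  # A branch for every combination of submitted filters: 2^n branches
  # for n filters. Imagine this with locality and date added as well.
  def self.find(params)
    if params[:country].present? && params[:tag].present?
      where(:country => params[:country]).joins(:tags).where(:tags => { :name => params[:tag] })
    elsif params[:country].present?
      where(:country => params[:country])
    elsif params[:tag].present?
      joins(:tags).where(:tags => { :name => params[:tag] })
    else
      scoped
    end
  end
end
```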

Any programmer worth his salt will note that this solution is tough to maintain. The find method has a branch for every possible combination of search filters (2^n branches for n filters). Having more than a few filters causes an explosion in the number of elsif branches.

The Rails way to solving this problem is to use query chaining along with the scoped method:
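A sketch of the chained version; in_country comes from the description below, while the other method names are my own stand-ins:

```ruby
class Event < ActiveRecord::Base
  # Each filter gets its own class method. Blank input means the user
  # didn't apply that filter, so return the relation unchanged via
  # scoped (Rails 3); otherwise narrow the relation with a query.
  def self.in_country(country)
    country.present? ? where(:country => country) : scoped
  end

  def self.in_locality(locality)
    locality.present? ? where(:locality => locality) : scoped
  end

  def self.tagged_with(tag)
    tag.present? ? joins(:tags).where(:tags => { :name => tag }) : scoped
  end

  def self.on_date(date)
    date.present? ? where(:date => date) : scoped
  end
end
```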

The Event model now avoids issuing a different query for each way search filters are applied. Instead, every query corresponding to a filter is put in its own class method. All methods do the same thing more or less. For example, in_country takes as a parameter the country submitted by the user. If the parameter is nil or blank, this means the user did not submit a country filter so the method should skip the query and return the relation as is: this is what the scoped method is for. Otherwise, the filter query is applied on the current relation and the new relation is returned.

What's left is for the EventsController to chain together the query methods so that the result of each query is passed on to the next query method for further filtering:
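The chaining itself might look like this in the controller (action and method names assumed to match the sketch of the model above):

```ruby
class EventsController < ApplicationController
  def index
    # Each call returns a relation, which the next call filters further.
    @events = Event.in_country(params[:country])
                   .in_locality(params[:locality])
                   .tagged_with(params[:tag])
                   .on_date(params[:date])
  end
end
```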

The complete code for this demo is available on GitHub.

Monday, 25 March 2013

How To Publish a WADL with Mule's REST Router Module

It seems to me that Mule's REST Router module, which I've previously blogged about, is gaining popularity as a viable alternative to the Jersey module. Yet, the REST Router has a serious limitation in that it's unable to generate a WADL: the REST equivalent of a WSDL. Recently, I had to overcome this limitation owing to a customer requirement for producing client code from the WADL.

Consider the Mule config below:
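The original config isn't reproduced here; a sketch of its shape, with namespace declarations omitted and the element/attribute names recalled from the REST Router module's documentation (treat them as assumptions):

```xml
<flow name="comment-service">
    <http:inbound-endpoint host="localhost" port="8081"
                           exchange-pattern="request-response"/>
    <rest-router:router templateUri="/comments/{commentId}">
        <rest-router:get>
            <!-- handle GET /comments/{commentId} -->
            <component class="org.ossandme.GetCommentComponent"/>
        </rest-router:get>
        <rest-router:put>
            <!-- handle PUT /comments/{commentId} -->
            <component class="org.ossandme.UpdateCommentComponent"/>
        </rest-router:put>
    </rest-router:router>
    <!-- Non-matching requests fall through to this static component -->
    <static-component>
        <return-data>Resource not found</return-data>
    </static-component>
</flow>
```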

The REST Router processes HTTP requests that have a URL path matching the templateURI pattern attribute. Non-matching requests are ignored by the router and passed on to the subsequent message processor which in this case is a static component. Each nested block in the router corresponds to an HTTP method. A GET request is processed by the rest-router:get element, a PUT request is processed by the rest-router:put element, and so on.

WADL generation isn't a feature supported by the REST Router. If I was using the Jersey module in this example, I'd get the service WADL on hitting the URL http://localhost:8081/application.wadl. Although I'd like to avoid using Jersey for various reasons, I could leverage the library's WADL generation feature:

The above modified Mule config routes WADL requests to Jersey and all other requests to the REST Router. Upon receiving a WADL request, Jersey returns the WADL derived from the JAX-RS annotated org.ossandme.CommentResource. CommentResource describes in Java the same RESTful interface described by the REST Router. Since all service logic is performed by the REST Router, I implement the resource as a Java interface instead of a class:
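A sketch of such an interface; the resource path and operations are assumptions chosen to mirror a GET/PUT comment service:

```java
package org.ossandme;

import javax.ws.rs.Consumes;
import javax.ws.rs.GET;
import javax.ws.rs.PUT;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;

// Interface only: Jersey derives the WADL from these annotations while
// the REST Router performs the actual service logic.
@Path("/comments/{commentId}")
public interface CommentResource {

    @GET
    @Produces("application/xml")
    String getComment(@PathParam("commentId") String commentId);

    @PUT
    @Consumes("application/xml")
    void updateComment(@PathParam("commentId") String commentId, String comment);
}
```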

This example is a Maven project hosted on GitHub. To run it, (1) open your console and, from the project's root directory, enter:

(2) Then deploy the package on Mule 3.3.x standalone runtime. Accessing the URL http://localhost:8081/application.wadl will give you the following:

Saturday, 2 March 2013

Posting & Processing JSON with jQuery & Spring MVC

Consider an HTML form containing numerous input fields. When the user clicks on the form's submit button, the fields need to be sent as JSON to a service that under the hood is implemented in Spring MVC.

A jQuery function transforms and posts the form's inputs:
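The original listing isn't reproduced here; a sketch built from the calls described below (the target URL is an assumption):

```javascript
$(document).ready(function() {
  // Intercept the form submission and post the inputs as JSON instead.
  $('form').submit(function(event) {
    event.preventDefault();
    $.ajax({
      type: 'POST',
      url: 'create-client',
      contentType: 'application/json',
      // serializeArray() turns the inputs into [{name: ..., value: ...}, ...]
      data: JSON.stringify($(this).serializeArray()),
      success: function() { alert('Client created'); }
    });
  });
});
```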

Through the $('form').submit(...) function, jQuery intercepts any click on the submit button and posts the form with the $.ajax(...) function. Before sending the data, jQuery transforms the form's inputs to JSON using JSON.stringify($(this).serializeArray()). This outputs a JSON string consisting of a list of key-value pairs:
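For example, for a form with firstName and lastName inputs (field names chosen for illustration), the string looks like:

```json
[{"name":"firstName","value":"John"},{"name":"lastName","value":"Smith"}]
```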

On the service side, I have the controller to process the form:
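The original controller isn't reproduced here; a sketch built from the method names and annotations described below (request paths and the return values are assumptions):

```java
package org.ossandme;

import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
public class ClientController {

    @RequestMapping(value = "/create-client", method = RequestMethod.GET)
    public String getCreateClientForm() {
        // Returns the view holding the HTML form.
        return "create-client";
    }

    @RequestMapping(value = "/create-client", method = RequestMethod.POST,
                    headers = "Content-Type=application/json")
    @ResponseBody
    public String processSubmitCreateClient(@RequestBody List<Map<String, String>> client) {
        // Merge the list of single-entry maps into one map for easier processing.
        Map<String, String> form = new HashMap<String, String>();
        for (Map<String, String> pair : client) {
            form.put(pair.get("name"), pair.get("value"));
        }
        // ... create the client from the merged form fields ...
        return "OK";
    }
}
```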

getCreateClientForm() returns the HTML form to the user. The more interesting method is processSubmitCreateClient(...).

The headers annotation attribute tells Spring that processSubmitCreateClient(...) should be invoked only if the HTTP request header Content-Type is set to application/json. Furthermore, @RequestBody tells Spring to bind the JSON data to the client parameter, which is a list of maps. processSubmitCreateClient(...) iterates through each element to merge the individual maps into a single map to facilitate processing.

I added the Jackson library to the project's POM since Spring requires a JSON deserializer to perform the data binding:
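For the Spring 3 era this would be the Codehaus Jackson artifact; the version shown is an assumption:

```xml
<dependency>
    <groupId>org.codehaus.jackson</groupId>
    <artifactId>jackson-mapper-asl</artifactId>
    <version>1.9.12</version>
</dependency>
```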

You can grab the complete example from GitHub and run it from the project root directory using the following command:

From your browser, enter "http://localhost:8080/jq-springmvc-json/create-client.html" to access the form.

Sunday, 27 January 2013

ReplyTo in ZeroMQ using WS-Addressing

In my last post, I mentioned how one could leverage SOAP along with WS-Addressing to implement the return address pattern in ØMQ (known in the JMS world as ReplyTo and in the MSMQ world as ResponseQueue). In this post, I'll go into more detail by giving a code example using CXF's ØMQ transport.

Below is a WSDL that is similar in many respects to the WSDL of the previous post:

Like the other WSDL, this one declares that the service has an operation named sayHi. It states that ØMQ is to be used for transporting the SOAP messages, and it declares the service and consumer communication parameters: zmq:address, zmq:serviceConfig and zmq:clientConfig. Unlike the other WSDL, the socket types are push for the client and pull for the service. This means that communication between the client and service is one-way, so the client won't receive a reply, at least not on the same channel. Remember, since we're implementing return address, the service sends its reply on a different channel.

This is the shell command I entered for generating the SEI and proxy classes:
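The exact command isn't reproduced here; it would have taken roughly this shape (the output directory and WSDL file name are assumptions, the -asyncMethods switch is the one described below):

```shell
wsdl2java -asyncMethods -d src/main/java hello_zmq.wsdl
```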

The -asyncMethods switch tells CXF's wsdl2java to generate, in the SEI, for each Web Service operation an asynchronous method in addition to a synchronous one. In the consumer implementation, I use the generated SEI:

The sequence of events in doOperation() is as follows:
  1. It publishes the endpoint that processes the service's async reply.

  2. It creates a client that has WS-Addressing enabled.

  3. It adds the ReplyTo element to the request's SOAP header. The element contains the callback URL that the service reads in order to know to whom it should deliver its response and how it should deliver it.

  4. It calls the SEI method sayHiAsync() asynchronously. If the consumer invoked the synchronous method (i.e., sayHi()) instead, it would block indefinitely after sending out the request, waiting for a reply that will never arrive on that channel.
The endpoint consuming the service's reply is a plain Web Service:

The callback terminates the program as soon as it processes the reply.

I re-run wsdl2java to generate another SEI but using the following arguments:

I use the SEI generated from the above command to implement the service:

Nothing special about the implementation except the @Addressing annotation, which is required for CXF to process the request's WS-Addressing headers. A launcher class triggers the server and consumer into action:

The complete application is on GitHub. The instructions for running the demo are almost identical to the ones of the previous post. First enter the following Maven command from the project's root directory:

Then run the demo using the java command:

You should then get the following output:

Tuesday, 15 January 2013

SOAP over ZeroMQ with Apache CXF

You're probably asking yourself one of these two questions on reading this post's title:
  1. Who in his right mind would use SOAP as a data format for ØMQ?
  2. Why the heck would I want to have SOAP on top of ØMQ as opposed to HTTP?
For the first question, consider a system requiring:
  • A return address pattern such as the one offered by JMS and MSMQ.
  • Guaranteed message delivery. 
  • Encryption and authentication of messages.
ØMQ wasn't designed to cover these requirements so it's up to the developer to come up with a solution. The developer can re-invent the wheel and build the solution from the ground up. But why do that when other folks have already taken the trouble of writing and implementing standards that address these concerns independent of the transport used? I'm speaking of WS-*. Using a SOAP engine, we can leverage WS-* standards like WS-Addressing, WS-Security, and WS-ReliableMessaging to address the above requirements and more. 

Focusing on the second question, several motivations exist for using SOAP on top of ØMQ. The first that may pop into mind is the desire to reduce latency as ØMQ incurs less overhead than HTTP. For me, strong motivations include:
  • Leveraging ØMQ's low-level and high-level patterns such as publish-subscribe and pipeline. 
  • Dispatching messages to services while they're in a disconnected state.

With the "Why?" out of the way, let's figure out the "How?". That is, how to process SOAP messages over ØMQ. Apache CXF is an open-source Java SOAP engine that operates over transports such as JMS, UDP, and of course HTTP. During the holidays, since I had a lot of free time on my hands ;-), I extended CXF to support ØMQ as an additional transport.

What follows is an example of a Web Service and a client talking with each other over the transport [1]. Consider this WSDL fragment that I wrote, and which the service and client are based on:

The contract describes a service that publishes a single operation called sayHi. sayHi accepts an empty message and returns a string. The WSDL informs us that the service expects sayHi to be invoked using SOAP over ØMQ, gives the service address, and provides the additional details necessary for the client and service to establish communication. Observe that I'm using socket types req and rep for the client and service respectively. This means that the communication between client and service occurs in RPC style.

I use the wsdl2java tool that comes with the Apache CXF distribution to generate SEI and proxy classes from the above WSDL. These classes help me develop the WSDL service implementation and the client.

Next, I implement the service:
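The original listing isn't reproduced here; given that sayHi takes no input and returns a string, the implementation would be a thin class over the generated SEI (interface, package, and service names here are assumptions):

```java
package org.ossandme;

import javax.jws.WebService;

// A minimal implementation of the SEI generated by wsdl2java.
@WebService(endpointInterface = "org.ossandme.HelloWorldPortType",
            serviceName = "HelloWorldService")
public class HelloWorldImpl implements HelloWorldPortType {

    @Override
    public String sayHi() {
        return "Hello from ZeroMQ!";
    }
}
```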

The service is published using the following code:

From the consumer-side, here's the code:

I write an additional class that triggers the execution of the service and its consumer:

I use Maven from the project's root directory to build the example:

Finally, to run the app:

The output I get is:

You'll find the complete example on GitHub. This example is basic; its purpose is to get you up and running with the transport. In my next post, I'll give a more real-world example where I'll show how WS-Addressing's ReplyTo element can be leveraged to implement the return address pattern in ØMQ.

1: You need the Java binding for ØMQ installed on your machine to run the example.