Samuel Cozannet
on 7 May 2015

Universal Modeling Language for Service-Oriented Architectures: Part 2

In the first part of this two-part series we looked at why Canonical believes a new language is needed for modeling modern applications in the cloud.

In this second post we will apply these high-level concepts to build a modular and scalable sentiment analysis application with Juju, using components such as Kafka, ZooKeeper, Storm and Node.js. We’ll then use Juju to swap out Storm for Spark seamlessly!

Modeling in action

Context

One of the first applications I created at Canonical was a Twitter Sentiment Analysis solution. I found a great article by Kenny Ballou, who built a similar application with Kafka and Storm. Kenny didn’t stop at Storm, though: he also coded the very same demo for Spark Streaming.

This is a perfect example for our new modeling language. Business owners generally don’t care what technologies are used for the backend of their sentiment analysis application. The use of Spark or Storm is an engineering choice, and one that engineers may want to re-evaluate over time, as both options are valid. From a service perspective, however, the two are similar, and business owners do not and should not concern themselves with the engineering specifics.

Let’s see how we can represent this application with our universal modeling language, build it from scratch, then modify it and scale the components independently.

Service Model

Our application’s purpose is to collect tweets matching one or more hashtags, perform sentiment analysis on the stream, then publish the result on a dashboard.

At the highest level, the services we need are:

Tweet Reader: connects to Twitter and grabs tweets from the streaming API, then makes them available to other services;
Tweet Processor: consumes tweets and extracts the value out of them. In our case, a positive and a negative score for each tweet;
Visualization: consumes the tweets and their sentiment scores to display the result in a human readable format.
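
To make the three roles concrete, here is a minimal Python sketch of the service model as three chained functions. It is an illustration only, not the code used in the demo: the word lists, the function names and the in-memory list standing in for the Twitter streaming API are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List

# Toy word lists, assumed for illustration; a real processor would use a proper lexicon.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "awful", "hate", "sad"}


@dataclass
class ScoredTweet:
    text: str
    positive: int
    negative: int


def tokens(text: str) -> List[str]:
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,!?") for word in text.lower().split()]


def read_tweets(raw: Iterable[str], hashtag: str) -> Iterator[str]:
    """Tweet Reader: keep only the tweets that mention the hashtag."""
    wanted = hashtag.lower()
    return (text for text in raw if wanted in tokens(text))


def process_tweets(tweets: Iterable[str]) -> Iterator[ScoredTweet]:
    """Tweet Processor: attach a positive and a negative score to each tweet."""
    for text in tweets:
        words = tokens(text)
        yield ScoredTweet(
            text,
            positive=sum(w in POSITIVE for w in words),
            negative=sum(w in NEGATIVE for w in words),
        )


def render(scored: Iterable[ScoredTweet]) -> List[str]:
    """Visualization: turn scored tweets into human-readable lines."""
    return [f"+{s.positive} -{s.negative} {s.text}" for s in scored]


def run_pipeline(raw: Iterable[str], hashtag: str) -> List[str]:
    """Chain the three services: reader, processor, visualization."""
    return render(process_tweets(read_tweets(raw, hashtag)))
```

Each function only depends on the output of the previous one, which is what lets any single stage be replaced later without touching the other two.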

First Approach: Storm

Charm Model

In Juju, a micro-service is represented by a charm. Technically a charm is a set of scripts that follow a strict naming convention to describe how to install, run, scale and integrate the individual micro-service. A service is a collection of charms orchestrated together.

Let’s see how we can break down our 3 services:

Tweet Reader

Kenny’s solution is based on Apache Kafka. Kafka is a pub/sub messaging system built at LinkedIn that can work at scale and process big data pipes. It has recently been added to the major Hadoop distributions beside Apache Flume.

Kafka’s model requires 3 sub-services:

A producer to read the tweets
A broker to make them available to other services
A service management tool, called ZooKeeper

Tweet Processor

Storm Cluster

Storm needs 3 services to run as well:

A Service Management Tool
A Storm Server, called Nimbus, that is used to ship Storm Apps (topologies) to…
Storm Workers, or agents, that will get simple tasks to do, run them and report to their Server

Those services are related to one another. Nimbus knows how to speak to its agents, the agents know how to speak to Nimbus, and all of them know how to speak to the Service Management System.
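
These relations can be pictured as a small undirected graph. The Python sketch below is only a way to reason about the model; the service labels are illustrative and are not actual charm names, and the helper functions are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Illustrative labels for the three Storm services described above.
STORM_RELATIONS = [
    ("nimbus", "storm-worker"),    # Nimbus and its agents speak to each other
    ("nimbus", "zookeeper"),       # Nimbus speaks to the service management tool
    ("storm-worker", "zookeeper"), # so do the workers
]


def build_graph(relations: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected graph: a relation lets both ends speak to each other."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for left, right in relations:
        graph[left].add(right)
        graph[right].add(left)
    return dict(graph)


def services_missing(graph: Dict[str, Set[str]], hub: str) -> List[str]:
    """List the services that have no relation to the given hub service."""
    return sorted(name for name in graph if name != hub and hub not in graph[name])


storm_graph = build_graph(STORM_RELATIONS)
```

Calling `services_missing(storm_graph, "zookeeper")` returns an empty list here, which is a quick way to check that every service in the cluster can reach the service management tool.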

Code injector

We introduce a fourth service in our model: the code injector. In any Big Data application, infrastructure and applications are two distinct objects. The role of the infrastructure is to provide standard plugs to developers so they can ship code and data in, and collect intelligence (information) out. Think of the Big Data infrastructure as the Lego box and the application as the manual.

We have modeled that by creating a service which connects to raw code and ships it into the infrastructure automatically. We call this the Pluggable Architecture Model.

The code we ship is dynamic and open to developers. Changing it does not affect the model or the infrastructure. The only things the code needs to know are how to speak to the visualization engine and where to find the tweets.
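
One way to picture this contract is a small Python sketch in which the injector hands the same two endpoints to whatever code the developer ships. All names and addresses below are hypothetical and only illustrate the idea; this is not how the actual charm is implemented.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Endpoints:
    """The only two things the shipped code needs to know."""
    tweet_source: str  # where to find the tweets (hypothetical address)
    dashboard: str     # how to reach the visualization engine (hypothetical address)


# Any callable that accepts the endpoints counts as pluggable processing code.
ProcessingCode = Callable[[Endpoints], str]


def inject(code: ProcessingCode, endpoints: Endpoints) -> str:
    """Code injector: pass the infrastructure's endpoints to the developer's code."""
    return code(endpoints)


def sentiment_job(endpoints: Endpoints) -> str:
    """Example developer code: describe a sentiment job wired to the endpoints."""
    return f"sentiment: {endpoints.tweet_source} -> {endpoints.dashboard}"


def word_count_job(endpoints: Endpoints) -> str:
    """A different job that plugs into the very same endpoints."""
    return f"word-count: {endpoints.tweet_source} -> {endpoints.dashboard}"


endpoints = Endpoints(tweet_source="broker:9092/tweets", dashboard="http://dashboard/post")
```

Swapping `sentiment_job` for `word_count_job` in the call to `inject` changes the application without changing the endpoints, which mirrors the separation between code and infrastructure described above.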

Visualization

We are going to need a service to expose a website on the Internet (a web server), and a cool dashboard solution to grab the results.

Kenny was familiar with Node.js but you can use any number of other technologies provided they expose the same method to collect tweets and their results.

End Result

OK, we have modeled our application through three high-level services, which we broke down into sub-services. We made our infrastructure flexible by connecting it to the outside world in three places (tweets, processing code, visualization).

In the Juju GUI, this is what it looks like:

Second Approach: Spark

This version of the application fulfills the same goal: connect to Twitter, grab tweets, process them and display the result.

Furthermore, since we already have a perfectly working Tweet Reader and visualization layer, we’d like to keep those as-is and focus on the processing engine. Let’s see how our model can make that happen.
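
As a rough illustration of the goal, the Python sketch below represents the model as a mapping from high-level services to their components and swaps only the processing engine. The component labels are assumptions made for this example (in particular, "spark" is a single placeholder and does not describe the real Spark deployment), and `swap_components` is a hypothetical helper, not a Juju command.

```python
from typing import Dict, List

Model = Dict[str, List[str]]

# The Storm-based model, with illustrative component labels.
STORM_MODEL: Model = {
    "tweet-reader": ["kafka-producer", "kafka-broker", "zookeeper"],
    "tweet-processor": ["zookeeper", "nimbus", "storm-worker"],
    "visualization": ["web-server", "dashboard"],
}


def swap_components(model: Model, service: str, components: List[str]) -> Model:
    """Return a copy of the model with one service's components replaced."""
    if service not in model:
        raise KeyError(f"unknown service: {service}")
    updated = {name: list(parts) for name, parts in model.items()}
    updated[service] = list(components)
    return updated


# Placeholder for the Spark-based processing engine.
spark_model = swap_components(STORM_MODEL, "tweet-processor", ["spark"])
```

The Tweet Reader and visualization entries are identical in both models; only the processor entry differs, which is the property we want the real deployment to have.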