Microservice API on Elastic Beanstalk with Jetty, Jersey, Guice, and Mongodb

This post outlines how to ship microservice APIs on Elastic Beanstalk with Jetty, Jersey, Guice and Mongodb. The code is written in Java and came together over a weekend, when I finally found a common thread running through some of the work I did in the past. If you are reading this, be fully aware that this stack is no longer cool and hip since the arrival of RxJava and Akka 😀 http://blog.circleci.com/its-the-future/

Code lives here. https://github.com/yveshwang/api-example

Why Elastic Beanstalk

Platform as a service (PaaS) is rather nice. A little like Heroku, an Elastic Beanstalk (EB) environment can be tailored to run Docker containers, amongst other things. Where EB truly shines is that the platform is accessible and completely configurable, as if it were IaaS. This is made possible by ebextensions. For example, each EB instance ships by default with a local Nginx instance, as designed by AWS. If you really want to switch it out, in theory you can do so by hijacking the initialisation process through ebextensions, uninstalling Nginx, and installing something like Varnish instead 😀 Yep, you can well and truly extend and tweak the platform to your heart’s content and potentially break everything.

Shipping to Elastic Beanstalk is fairly straightforward, as it mostly involves a Dockerfile or a Docker image hosted in a public or private docker.io repository. If you have either of those and are a bit savvy with the eb CLI tool, you are sorted and ready to deploy to Elastic Beanstalk.

Having a Docker container also makes testing repeatable and standardised. See the previous build pipeline blog series. The infrastructure side of setting up EB is intentionally left out of this post, as I believe Elastic Beanstalk deserves a blog post of its own. So let’s keep it old school and talk UML and code a little bit in this one.

Why Jetty

Jetty is a lightweight container, chosen here as a naive attempt to defeat the monolithic J2EE containers. Let’s be honest: most of us do not use half the functionality in J2EE, and clustering these containers goes against the very principle of microservices, in my opinion. Instead, we should stick to HTTP and RESTful-API everything! Note that Jetty is most certainly cool, but it is not RxNetty or Akka-http cool.

Why Guice

Inversion of control is neat. On a grander scale, Guice can be used to inject mock layers en masse. For example, you can use Mockito to configure an entire mock data access layer and inject it during unit or integration testing, which allows more tests to be written with fewer dependencies. Guice also helps with separation of concerns by keeping configuration away from business logic. Lastly, being able to @Inject anywhere is powerful: it lets us build a lot of templates and basically scale out horizontally through scaffolding code. When used properly, this is the little unsung hero of the Java world, in my opinion.
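To make the testing point concrete, here is a minimal sketch of constructor injection in plain Java, with neither Guice nor Mockito on the classpath. The names UserDao, UserFacade and StubUserDao are made up for illustration and are not taken from api-example. With Guice, the constructor would carry @Inject and a test module would bind UserDao to the stub (or to a Mockito mock) instead of wiring it by hand.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical data access contract; the real project would back this with Morphia.
interface UserDao {
    String findNameById(String id); // returns null when the id is unknown
}

// Business logic depends only on the interface, never on a concrete database class.
class UserFacade {
    private final UserDao dao;

    // With Guice this constructor would be annotated with @Inject.
    UserFacade(UserDao dao) {
        this.dao = dao;
    }

    String greet(String id) {
        String name = dao.findNameById(id);
        return name == null ? "Hello, stranger" : "Hello, " + name;
    }
}

// Hand-written stand-in for the data layer, playing the role a Mockito mock would play.
class StubUserDao implements UserDao {
    private final Map<String, String> names = new HashMap<>();

    // Registers a canned answer and returns this stub for chaining.
    StubUserDao with(String id, String name) {
        names.put(id, name);
        return this;
    }

    @Override
    public String findNameById(String id) {
        return names.get(id);
    }
}
```

A test can then build new UserFacade(new StubUserDao().with("42", "Ada")) and exercise greet without a running Mongodb.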

Why Mongodb

Expect endless devops discussion on this very topic. Ever since the Hacker News trolls came out of the woodwork against 10gen, the discussion has never ended. I like Mongo. I like it because it is fast for banging out a prototype.

DBs can vastly differ in ACID properties and thus address different combinations of CAP. I think I will save my opinion on Mongodb for another blog post another time. For now, Morphia is nice to work with in Java.

Why Jersey

Jersey is a pretty well structured way to write RESTful endpoints.

Putting it all together

Busting out some sick UML skills here.

Class diagram for api-example

Some basic principles, by convention, are as follows:

  • Each entity lives in its own collection in Mongodb
  • Each entity has one data access object (DAO)
  • Facade pattern is applied and should only contain business logic (no DB related operations)
  • Each DAO then, at the very least, has its own facade that can be extended to support business logic
  • You can freely inject other facades into one another.
  • Each facade maps to one HTTP resource supporting typical CRUD routines for that entity’s
    RESTful interfaces, GET, PUT, POST, DELETE and PATCH (ha!)
  • Caching headers, ETag, IMF headers can live in filters
  • Basic auth is also supported here as an example, and it should live in filters too
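The conventions above can be sketched roughly as follows. This is a simplified, plain-Java illustration rather than the code in api-example: Entity, Dao, InMemoryDao, BaseFacade, Book and BookFacade are hypothetical names, an in-memory map stands in for a Mongodb collection, and the Jersey resource and filter layers are left out.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Every entity exposes an id; in the real stack this would be a Morphia-mapped class.
interface Entity {
    String getId();
}

// One DAO per entity: the only layer that knows how data is stored.
interface Dao<T extends Entity> {
    void save(T entity);
    Optional<T> findById(String id);
    boolean delete(String id);
}

// Stand-in for a Mongodb-backed DAO, keyed by entity id.
class InMemoryDao<T extends Entity> implements Dao<T> {
    private final Map<String, T> store = new ConcurrentHashMap<>();

    @Override public void save(T entity) { store.put(entity.getId(), entity); }
    @Override public Optional<T> findById(String id) { return Optional.ofNullable(store.get(id)); }
    @Override public boolean delete(String id) { return store.remove(id) != null; }
}

// Base facade: generic CRUD passthrough that entity-specific facades extend.
abstract class BaseFacade<T extends Entity> {
    protected final Dao<T> dao;

    protected BaseFacade(Dao<T> dao) { this.dao = dao; }

    public void create(T entity) { dao.save(entity); }
    public Optional<T> read(String id) { return dao.findById(id); }
    public boolean remove(String id) { return dao.delete(id); }
}

class Book implements Entity {
    private final String id;
    private final String title;

    Book(String id, String title) { this.id = id; this.title = title; }

    @Override public String getId() { return id; }
    String getTitle() { return title; }
}

// Entity-specific facade: business rules only, storage stays behind the DAO.
class BookFacade extends BaseFacade<Book> {
    BookFacade(Dao<Book> dao) { super(dao); }

    @Override
    public void create(Book book) {
        if (book.getTitle() == null || book.getTitle().trim().isEmpty()) {
            throw new IllegalArgumentException("a book needs a title");
        }
        super.create(book);
    }
}
```

In this sketch, swapping the data store means providing another Dao implementation; BookFacade and anything built on top of it stay untouched.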

Benefits

  • Loose coupling between the layers. You can replace Mongo quickly by just replacing the DAO implementations.
  • Most code is scaffolding or pure business logic. All connector code and basic CRUD support, including PATCH, lives in the respective base classes for the entity, DAO, facade and resource layers, which can be easily extended and reused.
  • Easy to test. All layers are tested, all entities are tested, and the test code can be easily extended.
  • You can ship this code to an offshore team and expect them to create new entities and new HTTP endpoints in a short time by simply copy-pasting some scaffolding code, reusing some templated CRUD classes, following the basic CRUD routines, and pumping out some basic business logic 🙂 Good times!
  • If you actually have a good team to work with, then this stack is very easy to extend by simply following the Facade pattern. Build cool stuff like your own in-memory RRDs for statistics, then inject those statistics into other business logic!
  • Clustering is easy because this stack speaks HTTP and you would simply need a load balancer and some minor (or major?) Mongodb config.

Cons

  • At the core, the Facade pattern is used liberally. This is not an event-driven or reactive approach at all. When using facades, think shared memory, which means threading and parallelism will require due diligence. This is one reason I believe an event-driven, message-based approach would improve on the Facade pattern.
  • The stack compiles against JDK7. It would work fine with JDK8.
  • Not reactive, and by definition, not hipster enough.
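As a small illustration of the shared-memory caveat: assuming a facade that keeps in-memory statistics is shared by every request thread, its state has to be safe under concurrent access. The sketch below uses a hypothetical StatsFacade (not part of api-example) built on ConcurrentHashMap and LongAdder from the JDK; it needs Java 8 or later because of LongAdder and the lambda.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical facade holding per-endpoint hit counters in shared memory.
class StatsFacade {
    private final ConcurrentMap<String, LongAdder> hits = new ConcurrentHashMap<>();

    // Safe to call from many request threads at once:
    // computeIfAbsent creates the counter atomically, LongAdder handles contention.
    void recordHit(String endpoint) {
        hits.computeIfAbsent(endpoint, key -> new LongAdder()).increment();
    }

    // Returns 0 for endpoints that have never been hit.
    long hitsFor(String endpoint) {
        LongAdder counter = hits.get(endpoint);
        return counter == null ? 0L : counter.sum();
    }
}
```

A plain HashMap with a get-then-put update can lose increments under the same load, which is exactly the kind of bug the due diligence is about.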

Building on varnish-agent2

The existing varnish-agent2 code base is pretty solid, rather beautiful I might add. These simple words, keep it simple, have been realised through the following rules of thumb.

– Close to 0 configuration
– “Just works”
– Maintainable
– Generic
– Stateless

For those that are keen to get started on the varnish-agent2 code base, I hope that this document will be of some use. I have used a tiny subset of the Concurrent Object Modeling and architectural design mEThod (COMET), particularly the Requirements Modeling, Analysis Modeling and snippets of the actual Design Model.

Lastly, this document mostly serves as a mental note for myself 🙂

Requirements Modeling

The requirement for vagent2 is to provide an extensible RESTful interface for Varnish Cache. In addition, vagent2 acts as the point of integration between Varnish Cache and other systems, e.g. an administrative or monitoring system.

The use cases are simple right now, and are depicted in the use case diagram below.

vagent2 use cases

Analysis Modeling

vagent2 is designed to support the full spectrum of HTTP method types. A user of vagent2 issues these HTTP requests and receives JSON data in response where applicable. Furthermore, vagent2 is built with modules in mind to address a potentially expanding feature set. Lastly, each module should be able to communicate with, and be reused by, other modules.

IPC lies at the heart of the varnish-agent2 code base and message passing is the norm here for implementing an event driven model. Each module follows vagent2’s plugin paradigm, and comes equipped with the module’s own private data set. The plugin therefore follows the IPC set of callback methods, such as ipc_start and ipc_run. These methods are assigned to the appropriate functions within each module.

For a module to be exposed through the RESTful interface, a simple httpd_register method hooks in the module’s reply method of choice and exposes it appropriately.

For any module, the basic dependencies are depicted below.

Module breakdown

basic module dependencies

Static Modeling

varnish-agent2 ships with a few core modules, such as the logger, shmlog access, httpd and ipc modules. vstatus and vadmin provide access to the shmlog via the Varnish API. Note that at the time of writing, Varnish 3.0.3 was used.

These aforementioned modules provide the building blocks for managing varnish cache. For an overview of the static models, see the class diagram below.

Static Model

Static overview of vagent2

Dynamic Modeling

Initialisation
The process of initialising a module is rather straightforward. First, add a new init method for the new module in plugins.h, ensure that you have called the init method in main.c, and of course, allocated some memory for the new plugin in main.c too.

This new module must provide an implementation of the new init method. See diagram below depicting vban’s initialisation process.

vban initialisation process

Once initialised, the new module hooks onto the struct agent_plugin* structure and follows the IPC life cycle.

plugin->start  = ipc_start;
plugin->ipc->priv = your_private_plugin_data;
plugin->ipc->cb = your_plugin_callback;

plugin->start is called when your plugin starts. Note that you need to assign a start method if you want the IPC to execute your callback.

plugin->ipc->priv refers to persistent data for your plugin. This can be anything, and as a rule of thumb, this is a good place to hold references to other modules.

plugin->ipc->cb refers to the callback that is invoked when another module issues ipc_run.

A sample execution path

To tie it all together, the collaboration diagram below illustrates the execution path of issuing a ban to vagent2. Note that ipc is used to reach the vadmin module.

Issue a ban