Thursday, February 23, 2012

Understand Java benchmarking

A really good article about Java benchmarking that explains why you should not blindly trust numbers.
Never blindly trust any numbers. Know how they were obtained.

Robust Java Benchmark

  • Execution-time measurement
  • Code warmup
  • Dynamic optimization
  • Resource reclamation
  • Caching
  • Preparation
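
A few of the points in this list (warmup, resource reclamation, execution-time measurement) can be shown in a tiny hand-rolled harness. This is only a sketch under assumptions: the names SimpleBenchmark, averageNanos and sink are made up for illustration and do not come from the article, and the code relies on Java 8 lambdas. For numbers you intend to publish, a dedicated framework such as JMH is the safer route.

```java
import java.util.function.LongSupplier;

// Hypothetical helper, not taken from the article: warm up first, then measure.
final class SimpleBenchmark {

    // Written at the end of each run so the task results are not dead code for the JIT.
    static volatile long sink;

    // Returns the mean time per call, in nanoseconds, over measuredRuns calls.
    static double averageNanos(LongSupplier task, int warmupRuns, int measuredRuns) {
        long acc = 0;
        // Code warmup: give the JIT a chance to compile and optimize the task.
        for (int i = 0; i < warmupRuns; i++) {
            acc += task.getAsLong();
        }
        // Resource reclamation: only a hint, the JVM is free to ignore it.
        System.gc();
        // Execution-time measurement: nanoTime is meant for measuring elapsed time.
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            acc += task.getAsLong();
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return (double) elapsed / measuredRuns;
    }
}
```

A call could look like SimpleBenchmark.averageNanos(() -> someWork(), 10_000, 10_000), where someWork stands for whatever you want to time. Even then, treat the result as one sample and not as the truth.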

Tuesday, February 14, 2012

A very agile project layout with Maven, aka a multi-tenant, multi-version development environment


Introduction

As a developer, it is common to switch from an ‘urgent’ task to a ‘mega urgent’ one. Sometimes it is necessary because a fix must be made quickly, sometimes because someone needs help, etc.

The difficulty we face each time is the data. The database is out of sync with the code to debug, the code to test, or the code to run a demo on. A developer should not spend time setting up a development environment. That time should go to building new features, fixing bugs and so on. Switching the code itself is not a problem because your favorite SCM does it for you.

Every time I lose time setting up my dev environment, I take some time to improve it. With a multi-tenant application, multiple versions and multiple releases a week, I had had enough. So I looked for a project layout that would meet my needs. The layout should be modular and scalable.

The following is a description of this layout using Maven.

Analysis

We will describe how the data and the code should be versioned together.

Versioning

The version of the source code is managed by an SCM such as SVN, Git or whatever you like. The data must be compatible with the source code, so the data are versioned along with it. This way, when you work on a certain version of the application, your data are always up to date with the source code.

Data

We can distinguish 3 kinds of data to run with :
  • memory : the data are only in memory
  • local : the data are persisted in some repository on localhost, e.g. a database or a web service
  • remote : the data are persisted in some repository on a remote host, e.g. a database or a web service


A dataset can be created for each feature, such as feature1-memory. This way we can test features that require data in different states. Of course, the fewer datasets per feature the better, to limit the amount of data and the complexity.

Source code

We can distinguish 5 kinds of source code :
  • main : the production code, configuration must be externalized;
  • test : unit test code (should run very quickly);
  • it : integration (functional) test code (may take some time to run);
  • perf : performance test code, microbenchmark;
  • dev : code to speed up development by tweaking developer environment.

Maven layout

This layout is built with Maven because it offers a lot of extension mechanisms. We mainly use profiles to provide different configurations, coupled with configuration files and filters. It is important to know that all of this works by convention only. There is no plug-in that provides this structure end to end.

Data module

To set up datasets that live in a database we use sql-maven-plugin (the same can be done with a mock web service).

Possible actions should be :
  • create : create a schema
  • init : initialize constant data
  • insert : insert the dataset
  • clear : clear the data
  • drop : drop the schema


With a data module, datasets can be shared among the dev, test, it and perf source code. It simplifies the reuse of data for each use case of a feature.

Locally create a schema and set up default dataset :
# mvn verify -Plocal,create,insert -Ddataset=default

Locally create a schema and set up feature-foo dataset :
# mvn verify -Plocal,create,insert -Ddataset=feature-foo

Locally drop a schema :
# mvn verify -Plocal,drop

Remotely clear and set up default dataset on desk5:8080 host :
# mvn verify -Premote,clear,insert -Ddata.host=desk5:8080

Application module (web, desktop...)

To run the web application we use the jetty-maven-plugin.

Since 99% of the time is spent developing and not releasing, the tooling should be oriented towards easy development (it should be easy to release and deploy as well :-p).

Running

The application must check the data version before it runs and report whether the schema and dataset are compatible (because we can run against data from a different branch).
The data are loaded from the data module.
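
The version check could be as small as the guard below. It is a sketch resting on two assumptions of mine: both the code and the dataset expose a version string, and "compatible" simply means "equal". The names DataVersionCheck and requireCompatible are hypothetical, and a real project may prefer to accept a range of versions.

```java
// Hypothetical startup guard; class and method names are illustrative only.
final class DataVersionCheck {

    // Throws if the dataset was built for another data version than the code expects.
    // Assumption: compatibility means strict equality of the two version strings.
    static void requireCompatible(String expectedVersion, String actualVersion) {
        if (expectedVersion == null || !expectedVersion.equals(actualVersion)) {
            throw new IllegalStateException(
                "Data version mismatch: code expects " + expectedVersion
                + " but the dataset reports " + actualVersion);
        }
    }
}
```

Calling it once at startup, before anything touches the data, makes a mismatch fail fast with a readable message.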

Memory (default)

The default command runs the application with the dev and main classes and the default memory dataset :
# mvn jetty:run
is equivalent to
# mvn jetty:run -Pmemory -Ddataset=default

Runs with the feature-foo memory dataset :
# mvn jetty:run -Ddataset=feature-foo

Choosing a dataset on the command line is only possible for memory data, because local and remote data are persistent. To change those, run a set up command from the data module first.

Local

Runs with the current local dataset :
# mvn jetty:run -Plocal

Remote

Does not run (should fail), because there should be no default remote host :
# mvn jetty:run -Premote

Runs against the remote host desk5 on port 8080 with its current dataset :
# mvn jetty:run -Premote -Ddata.host=desk5:8080
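
On the application side, the rules above (memory needs no host, local has an obvious default, remote must be given one) could be enforced along these lines. This is a sketch under assumptions: only the data.host property name comes from the commands above. The class DataHostResolver, the mode strings and the way the mode reaches the application are hypothetical.

```java
import java.util.Properties;

// Hypothetical resolver; only the data.host property name comes from the commands above.
final class DataHostResolver {

    // Returns the host to connect to, or null when the data live in memory.
    static String resolve(String mode, Properties props) {
        switch (mode) {
            case "memory":
                return null; // nothing to connect to
            case "local":
                return props.getProperty("data.host", "localhost");
            case "remote":
                String host = props.getProperty("data.host");
                if (host == null || host.isEmpty()) {
                    // No default remote host: fail fast instead of guessing one.
                    throw new IllegalStateException("remote data requires -Ddata.host=<host:port>");
                }
                return host;
            default:
                throw new IllegalArgumentException("unknown data mode: " + mode);
        }
    }
}
```

Failing on a missing remote host is deliberate: a silent default is how one ends up clearing the wrong database.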

Unit test

Run the unit tests from src/test against the memory dataset, since their assertions depend on data that must not change :
# mvn test

Integration test

Run the integration tests from src/it against the memory dataset, since their assertions depend on data that must not change :
# mvn verify

Packaging

Package the application as for production. Because the configuration is externalized, there should be only one kind of packaging :
# mvn package
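
Externalized configuration means the packaged artifact reads its settings from outside itself. One possible way to do that, sketched with the standard java.util.Properties class, is shown below. The class name ExternalConfig is hypothetical, and where the file lives is left to the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical loader: reads settings from a file that lives outside the packaged artifact.
final class ExternalConfig {

    // Loads key=value pairs from the given file; the caller decides where that file is.
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        return props;
    }
}
```

The path itself could come from a system property (for example a made-up -Dapp.config=...), so the same package runs unchanged in dev, integration and production.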

Release module

To make a release we use maven-assembly-plugin.

A release profile is required to clearly distinguish a production release from any other. There is no default, so a choice must be made explicitly. Profiles are only shortcuts: the configuration itself should be done with Maven ${param} properties and resource filters.

Make a release-prod.zip with artifacts and production configuration :
# mvn package -Prelease,prod
should be equivalent to something like :
# mvn package -Prelease -Dapplication.param1=something ...
where parameters are set in prod profile.

Make a release-int.zip with artifacts and integration configuration :
# mvn package -Prelease,int

Make a release-dev.zip with artifacts and development configuration :
# mvn package -Prelease,dev

Make a release-standalone.zip with artifacts and standalone configuration :
# mvn package -Prelease,standalone
We can use winstone-maven-plugin to embed the webapp and a web server into a single JAR.


Conclusion

Following this layout with Maven is not very difficult, but it requires discipline. As most of the work is done by profiles, extending it should not be very difficult either.

You will now see how you stop losing time and can focus on features only, and how quick it becomes to fix a bug in a given version and to share and set up datasets.

Moreover, you can simplify these tasks by using Hudson/Jenkins CI. This reduces getting the last stable release to a single download click ;-)

Good hack.