Java containerization for modular PF4j applications

No, I couldn't use jib

by Marco Micera | July 23, 2021

Here’s a brief story about container optimization that came about due to frustration over long docker build times.

The existing software architecture

We’re dealing with a modular PF4J application whose containers contain (apologies for the alliteration):

  • an entrypoint JAR file;
  • exactly one plugin JAR.

The former interacts with Kafka and triggers the latter upon receiving records belonging to a specific topic:

services:
    one-of-the-plugins:
        image: ${ENTRYPOINT_JAR_IMAGE}
        command: [
            "--kafka_server", "${KAFKA_INTERNAL_ADDR}",
            "--plugin_list", "OPAL", # always one :/
            "--topic", "OPAL=input.topic.name",
            ...
        ]

An excerpt from the argument list of a docker-compose plugin service

It looks like the entrypoint JAR was originally meant to orchestrate multiple plugins in a non-dockerized environment: keeping it around was probably the fastest way of containerizing plugins without breaking things. This plugin-loading setup also rules out jib, which packages a single application image and has no natural place for separately built plugin JARs.

Let’s speed up the dev process!

I’m a big fan of building applications in Dockerfile stages, as it allows me to reach a higher level of automation during development (e.g., the docker-compose build service configuration, and skaffold for Kubernetes).

The containerization begins

Unfortunately, jib is out of play due to the presence of PF4J, so we’ll have to dockerize the JARs ourselves in order not to break things.
One could simply reuse the single-line mvn install command from the CI:

#############################################
# Stage 1a: building all the necessary JARs #
#############################################

FROM maven:3.8.1-jdk-11-slim AS build-jars
COPY . /home/app
RUN mvn --file /home/app/pom.xml install --projects server,:my-plugin --also-make

But that would be a disaster due to Docker cache invalidation.
A brute-force solution would be to build all Maven projects individually: that’s what we’re gonna do.

Slow and steady wins the race

What does my plugin need, exactly?

$ mvn validate --projects :my-plugin --also-make
[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO]
[INFO] parent                                                             [pom]
[INFO] core                                                               [jar]
[INFO] plugin                                                             [pom]
[INFO] my-plugin                                                          [jar]

Two JAR modules, two POM modules, plus server as the CI command says… roger!
Let’s build these projects in this order to make the most of the Docker cache:

#############################################
# Stage 1a: building all the necessary JARs #
#############################################

FROM maven:3.8.1-jdk-11-slim AS build-jars

# Building parent
COPY pom.xml /home/app/
RUN mvn --file /home/app/pom.xml install --projects :parent

# Building core
COPY core/ /home/app/core
RUN mvn --file /home/app/pom.xml install --projects core

# Building server
COPY server/ /home/app/server
RUN mvn --file /home/app/pom.xml install --projects :server

# Building the generic plugin
RUN mvn --file /home/app/pom.xml install --projects plugin

# Building my plugin
COPY plugins/my-plugin/ /home/app/plugins/my-plugin
RUN mvn --file /home/app/pom.xml install --projects :my-plugin

If only it were that simple!
We now enter some Maven-specific madness: building parent needs the pom.xml files of core, server, and all plugins, because Maven reads every <module> entry of the root pom when constructing the reactor!
To cover the first two:

COPY core/pom.xml /home/app/core/
COPY server/pom.xml /home/app/server/

But what about all the other plugins?
Copying the entire folder containing all plugins would defeat the purpose of this optimization since a change in our plugin’s source code would invalidate this (and all the following) layers!

What to do?
To follow, more madness.

As promised, more madness

We want to copy all plugins’ pom.xml files without having to copy their source code too.
Unluckily, Docker’s COPY instruction can’t do this in one shot: a glob like COPY plugins/*/pom.xml /home/app/plugins/ flattens every match into the destination directory, losing each plugin’s subfolder (the files would even overwrite one another). We can use another stage to get around this:

###############################################################################
# Stage 0: layer with plugins' pom.xml files only. Used for caching purposes. #
###############################################################################

FROM alpine:3.14.0 AS list-plugins-pom-files

# Copying the entirety of all plugins
COPY plugins /home

# Finding and removing non-pom.xml files (depth options go first, to avoid find's ordering warning)
RUN find /home -mindepth 2 -maxdepth 2 \! -name "pom.xml" -print | xargs rm -rf
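The depth filter deserves a quick sanity check. Here’s a minimal sketch (directory and file names are made up) showing that the find command keeps the depth-1 aggregator pom and each plugin’s pom.xml, while deleting everything else at depth 2, including entire source directories:

```shell
# Build a throwaway tree mimicking the plugins/ folder (names are hypothetical)
demo=$(mktemp -d)
mkdir -p "$demo/plugins/plugin-a" "$demo/plugins/plugin-b/src"
touch "$demo/plugins/pom.xml"                # depth 1: untouched by the filter
touch "$demo/plugins/plugin-a/pom.xml"       # depth 2, named pom.xml: kept
touch "$demo/plugins/plugin-a/README.md"     # depth 2, not pom.xml: removed
touch "$demo/plugins/plugin-b/pom.xml"       # depth 2, named pom.xml: kept
touch "$demo/plugins/plugin-b/src/Main.java" # inside a depth-2 dir: removed with it

# Depth options come before the name test to avoid find's ordering warning
find "$demo/plugins" -mindepth 2 -maxdepth 2 \! -name "pom.xml" -print | xargs rm -rf

# Only the plugin directories and the three pom.xml files survive
find "$demo/plugins" | sort
```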

We then add the following to stage 1a:

COPY --from=list-plugins-pom-files /home/ /home/app/plugins/
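Putting the pieces together, the pom-only copies slot in before any module sources. A sketch of how stage 1a ends up, reassembled from the snippets above (untested against the real project):

```dockerfile
#############################################
# Stage 1a: building all the necessary JARs #
#############################################

FROM maven:3.8.1-jdk-11-slim AS build-jars

# All pom.xml files first: these layers are invalidated only by pom changes
COPY pom.xml /home/app/
COPY core/pom.xml /home/app/core/
COPY server/pom.xml /home/app/server/
COPY --from=list-plugins-pom-files /home/ /home/app/plugins/
RUN mvn --file /home/app/pom.xml install --projects :parent

# Module sources next, in reactor order, so a change in my-plugin's
# source code only invalidates the last two layers
COPY core/ /home/app/core
RUN mvn --file /home/app/pom.xml install --projects core

COPY server/ /home/app/server
RUN mvn --file /home/app/pom.xml install --projects :server

RUN mvn --file /home/app/pom.xml install --projects plugin

COPY plugins/my-plugin/ /home/app/plugins/my-plugin
RUN mvn --file /home/app/pom.xml install --projects :my-plugin
```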

Evaluation

I’ll build my-plugin twice: once from scratch, and then after making a change exclusively in its source code, without touching the other Maven projects it depends on (as normally happens during development).

I’ll run this experiment first on a pre-optimization commit, then repeat it on the optimized version.

Note that, for the first build of each experiment, I’ll do a mvn clean, a docker system prune, and a docker build with the --no-cache flag. I’ll also skip tests as they represent an equal overhead in both cases.

Results

Pre-optimization build times (skipping tests):

  • First build (--no-cache): 2min 3s
  • After a change in the plugin’s source code: 🐢 1min 44s 🐢

Post-optimization build times (skipping tests):

  • First build (--no-cache): 2min 44s
  • After a change in the plugin’s source code: 🚀 21.5s 🚀

Conclusions

There is a noticeable increase in the time of the first build, 122.9 seconds vs. 163.9 seconds: this is likely because the post-optimization Dockerfile produces far more layers than the pre-optimization one (one COPY and one RUN per Maven module instead of a single pair).

This is justified, however, by the great achieved time saving:

               Pre-optimization   Post-optimization
First build    122.9s             163.9s
Second build   103.6s             21.5s
Time saved     19.3s              🚀 142.4s 🚀

The cache only saved us (122.9 - 103.6)s = 19.3s before the optimization, versus (163.9 - 21.5)s = 142.4s after it.
Comparing the two warm builds directly, that’s (103.6 - 21.5)s ≈ 1 minute and 22 seconds saved every time one wants to containerize the plugin after making a change exclusively in its source code.

Not bad!

A very last, very stupid mistake

The CI takes almost 10 minutes to package my application, do a docker build, push it to the registry, etc., in different steps: 🤔 what the heck?
Ok, it runs tests, but still…

TL;DR

It wasn’t using BuildKit.

Explanation

To maintain backward compatibility, the CI docker build command must expect already-built JARs, packaged in a previous step.
That’s easily achievable using a different last stage… so what’s the matter?

It turns out that the classic docker build --target package-ci was executing all stages, including the unused ones, no matter what.

Don’t forget to enable BuildKit by setting DOCKER_BUILDKIT=1!
It now only takes 5 minutes (tests included)… phew.
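For reference, enabling BuildKit is a single environment variable (the package-ci target name comes from the Dockerfile above; the image tag is made up):

```shell
# With BuildKit, `docker build --target` only builds the stages the
# target actually depends on, instead of every stage in the file
export DOCKER_BUILDKIT=1

# The actual build (commented out here, since it needs a Docker daemon):
# docker build --target package-ci -t my-app:ci .

echo "DOCKER_BUILDKIT=$DOCKER_BUILDKIT"
```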