September 22nd, 2015

Resilience Testing Nginx with Docker, dnsmasq and Muxy

Using Docker, dnsmasq and Muxy, we create a repeatable Nginx ecosystem that enables us to test its functionality, performance and resilience under a range of conditions.

— Matthew Fellows —

Nginx is arguably the best HTTP server and reverse proxy available – it's super fast, scales easily, and gives you lots of knobs and levers to tune it to your needs. However, it is not perfect – there are a couple of things that seem to constantly trip people up, in particular:

  1. DNS is cached forever, meaning if you are proxying a host with dynamic IPs (say, an AWS ELB) then you will eventually start sending traffic to a black hole
  2. Its configuration is not easily testable

I’ve been using Nginx a lot recently, and have come to enjoy a simple development pattern that addresses both of the above shortcomings whilst keeping us in line with 12-factor applications – ensuring development and production environment parity.

That pattern involves combining Nginx with dnsmasq, configured as its resolver, in a multi-container Docker setup. In this article, I’m going to use this setup as a basis for a battery of tests, including functional integration testing, performance testing using Gatling, and resilience testing using Muxy. You can follow along by pulling down https://github.com/mefellows/nginx-docker-setup.

TL;DR: When combined with dnsmasq and Docker, Nginx can be easily integration, load and resilience tested locally without the need for complex environments. I present an example setup you can steal.

Setting up Nginx + dnsmasq with Docker Compose

We are going to build a simulation environment that looks a bit like this:

Nginx – Docker + dnsmasq setup

  1. We will create a separate container that runs a suite of Golang functional tests – this will test things like proxy headers and routing behaviour
  2. The Tests container will issue HTTP requests to Nginx, which will be configured as an HTTP Reverse Proxy
  3. Nginx will use dnsmasq to resolve DNS queries, as Nginx won’t honour host entries – which is how we substitute real upstreams in development/test. dnsmasq will read host entries, which we can have Docker Compose inject during setup
  4. Finally, we will configure an HTTP Mock/Echo server that will read headers from the request and echo them back to us – this allows us to assert a lot about what is happening when requests pass through Nginx

To configure Nginx to use dnsmasq as our resolver, we set the resolver directive:
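Something along the following lines (a sketch – the repository’s nginx.conf is authoritative, and the 5 second cache TTL is an assumption):

# Resolve upstream names via the linked dnsmasq container;
# valid=5s caps how long Nginx caches each answer.
resolver dnsmasq valid=5s;

server {
    listen 80;
    server_name foo.com;

    location / {
        # Referencing the upstream via a variable forces Nginx to
        # re-resolve it through the resolver at request time, rather
        # than resolving it once at startup and caching it forever.
        set $upstream_host api.foo.local;
        proxy_pass http://$upstream_host;
    }
}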

We then create our docker-compose.yml file, which links all of our containers together in the above configuration (sans testing), importantly setting host entries. Note that this has been simplified for readability:

nginx:
  build: .
  ports:
    - "8001:80"
  volumes:
    - "/var/log/nginx:/var/log/nginx/"
  links:
    - dns:dnsmasq
    - api:api.foo.local

api:
  build: test/mockapi
  ports:
    - "8002:80"

dns:
  build: ./dnsmasq
  ports:
    - "5353:53"
  links:
    - api:api.foo.local

Running “docker-compose up” should start our environment, and you should then be able to hit the API via Nginx:
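For example (8001 being the host port mapped in the compose file above; if you are on docker-machine/boot2docker, substitute localhost with your Docker VM’s IP):

$ docker-compose up -d
$ curl -H "Host: foo.com" http://localhost:8001/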

Writing Tests

Now that we have a functioning environment, we should write some tests! So what should we test?

In our sample application, we have 2 configurations:

  1. a “foo.com” proxy, that proxies across the two backends “api.foo.local” and “api-backup.foo.local”, setting some headers along the way
  2. a “myfandangledwebsite.com” proxy, that sends traffic to “bar.com” unless a specific cookie is set, in which case it instead proxies “newbar.com”.

In the first scenario, we are setting an “X-Request-ID” header as a UUID if not already present, along with other standard proxy headers, so we should confirm these are working as expected.

In the second scenario, we are proxying different backends depending on the presence of a cookie – we also want to test this. We will get to resilience and performance testing in due course; for now, we are interested in testing the behaviour of our Nginx proxy configurations.

The trick in testing this setup – in addition to the use of dnsmasq – is to create a small echo/mock server that will reply to a request for a particular header with the value of that header as it was received. We can then issue requests to it via the Nginx proxy, and see what headers it received. Here is our (simplified) echo server:
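In sketch form (Go; the handler details are illustrative – the real server lives under test/mockapi in the repository):

package main

import (
	"fmt"
	"net/http"
	"strings"
)

func main() {
	http.HandleFunc("/header/", func(w http.ResponseWriter, r *http.Request) {
		// Extract the header name from the path, e.g. "/header/host" -> "host".
		name := strings.TrimPrefix(r.URL.Path, "/header/")

		// Go stores the Host header on the Request itself, not in r.Header.
		if strings.EqualFold(name, "host") {
			fmt.Fprint(w, r.Host)
			return
		}
		fmt.Fprint(w, r.Header.Get(name))
	})

	http.ListenAndServe(":80", nil)
}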

A request for “/header/host” should return the “Host” header in the response body, so we can see exactly what the server received. We can now test all of the headers, including what happens if we supply an “X-Request-Id”.

I opted to write these tests in Golang, in part because I already had a Golang container in the setup, which reduces setup time – but you can use whatever you like; you just need to be able to issue HTTP requests and return a non-zero exit code if the tests fail. You can see the tests at https://github.com/mefellows/nginx-docker-setup/blob/master/test/integration-test/server_generic_test.go.
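For flavour, a test for the X-Request-ID behaviour might look something like this (a sketch – it assumes the test container is linked to Nginx as “foo.com”, and the real assertions live in the repository):

package main

import (
	"io/ioutil"
	"net/http"
	"testing"
)

func TestXRequestIDIsGenerated(t *testing.T) {
	// Ask the echo server (via Nginx) what X-Request-Id it received.
	res, err := http.Get("http://foo.com/header/x-request-id")
	if err != nil {
		t.Fatal(err)
	}
	defer res.Body.Close()

	body, err := ioutil.ReadAll(res.Body)
	if err != nil {
		t.Fatal(err)
	}

	// Nginx should have injected a UUID, as we didn't supply one.
	if len(body) == 0 {
		t.Error("expected Nginx to inject an X-Request-ID header, got none")
	}
}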

This approach also works for the second scenario, where we change the upstream depending on the presence of a cookie. In one test, we send through no cookies, and assert that the “Host” the backend receives is “bar.com”. In the other test, we send through our secret cookie, and assert that the “Host” it receives is “newbar.com”. Easy. Here is our other test: https://github.com/mefellows/nginx-docker-setup/blob/master/test/integration-test/router_test.go.
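In the same test file, the cookie case might look like this (the cookie name “version” is a stand-in for whatever the real configuration keys off):

func TestCookieRouting(t *testing.T) {
	req, _ := http.NewRequest("GET", "http://myfandangledwebsite.com/header/host", nil)

	// Supply the "secret" cookie that should flip the upstream to newbar.com.
	req.AddCookie(&http.Cookie{Name: "version", Value: "newbar"})

	res, err := http.DefaultClient.Do(req)
	if err != nil {
		t.Fatal(err)
	}
	defer res.Body.Close()

	body, _ := ioutil.ReadAll(res.Body)
	if string(body) != "newbar.com" {
		t.Errorf("expected upstream Host of newbar.com, got %q", body)
	}
}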

Now that we have our tests, we need to create a Docker Compose setup to glue it all together:
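A sketch of what that compose file might look like (simplified – the repository holds the authoritative version):

test:
  build: test/integration-test
  links:
    - nginx:foo.com
    - nginx:myfandangledwebsite.com

# nginx, api and dns services as per the earlier docker-compose.yml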

Notice how we point the “test” container at Nginx for all of the hostnames it needs to hit. You can run these tests from the repository like so:
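Something like the following should do it (the docker-compose-test.yml file name is illustrative – check the repository’s README for the exact incantation):

$ docker-compose -f docker-compose-test.yml build
$ docker-compose -f docker-compose-test.yml up test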

Performance Testing with Gatling

Taking the approach from above, we can swap out the integration tests for a different suite of tests using Gatling, an open-source load testing framework. If you haven’t played with Gatling before, check out the quickstart guide. To keep things simple, we are just going to hammer the system with 10,000 users each making 10 requests. Here is our scenario:
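In sketch form (Scala; it assumes the Gatling container is linked to Nginx as “foo.com”, in the same way as the test container – the repository has the real simulation):

import io.gatling.core.Predef._
import io.gatling.http.Predef._

class NginxSimulation extends Simulation {

  // All requests go through the Nginx proxy.
  val httpConf = http.baseURL("http://foo.com")

  // Each user makes 10 requests to the echo endpoint.
  val scn = scenario("Hammer Nginx")
    .repeat(10) {
      exec(
        http("echo host header")
          .get("/header/host")
          .check(status.is(200)))
    }

  // Inject 10,000 users at once.
  setUp(scn.inject(atOnceUsers(10000)).protocols(httpConf))
}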

Running these tests should yield a Gatling report, with response-time percentiles, throughput and error counts for the run.

At this point, we aren’t super concerned about the absolute numbers – after all, this is likely running in a VirtualBox VM with minimal memory/CPU allocated – but what we are interested in is tweaking our Nginx configuration, our topology, what happens if the server goes down and so on. The key thing is we now have a platform in which to do these things. So go… Play!

Mucking with Muxy

So we’ve been clever, and have set up Nginx backup servers in case a primary server goes down. As a bonus, we can now start to test this feature with an (overly simple) Muxy configuration that will inject a 500 status code whenever the primary API is called. This should trigger Nginx to fail over to our backup API, which does not have Muxy proxying it, and we can ensure that no non-20x status codes are received. Here is our Muxy configuration file:
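In sketch form, based on Muxy’s documented proxy/middleware layout (exact keys and values may differ from the repository’s file):

# muxy.yml: sit in front of the primary API and misbehave
proxy:
  - name: http_proxy
    config:
      host: 0.0.0.0
      port: 80
      proxy_host: api.foo.local
      proxy_port: 80

middleware:
  # Verbose logging of everything passing through the proxy
  - name: logger

  # Disrupt raw packets: add latency, drop packets, shape bandwidth
  - name: network_shape
    config:
      latency: 250
      target_bw: 750
      packet_loss: 0.2

  # Rewrite the response status to 500
  - name: http_tamperer
    config:
      response:
        status: 500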

As you can see, we have an HTTP proxy set up in front of the main API, plus two middlewares in addition to verbose logging:

  1. A network tamperer that slows and disrupts the raw network packets on the network device (adds latency, packet loss, shapes the bandwidth and so on)
  2. An HTTP tamperer that will simply set the response code to 500

Our expected behaviour is that we only get HTTP 200s, so let’s test that by firing 100 requests at it:
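A sketch of such a test (again in Go; “foo.com” is assumed to be linked to Nginx, with Muxy sitting between Nginx and the primary API):

package main

import (
	"fmt"
	"net/http"
	"sync"
	"testing"
)

func TestNoErrorsWithMuxyMisbehaving(t *testing.T) {
	var wg sync.WaitGroup
	errs := make(chan error, 100)

	// Fire off 100 concurrent requests through Nginx.
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			res, err := http.Get("http://foo.com/")
			if err != nil {
				errs <- err
				return
			}
			defer res.Body.Close()

			// Nginx should have failed over to the backup, so we
			// expect only 200s despite Muxy returning 500s upstream.
			if res.StatusCode != http.StatusOK {
				errs <- fmt.Errorf("expected 200, got %d", res.StatusCode)
			}
		}()
	}

	wg.Wait()
	close(errs)

	for err := range errs {
		t.Error(err)
	}
}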

This simple test case fires off 100 goroutines and ensures no errors occur.

Of course, this is an overly simplified example – Nginx will receive a 500 response code on the first request and immediately switch to the backup server. But the approach can be improved: if you were to remove the HTTP tamperer and test with the network tamperer alone, you would be able to see what happens when some random variation is added to the mix. Importantly, this is non-deterministic, and that can be a dangerous thing, so we need to tread a careful line – asserting not exactly what will happen, but what should happen. That is to say, we only care about the end result – that we send 200 response codes with the correct response body – not whether the request had to go to the backup server exactly 6 times.

upstream { article down; }

So hopefully you have learned that it’s both possible and easy to set up a repeatable environment in which you can run Nginx and a battery of tests against it, without resorting to modifying its configuration to do so. Whilst these examples are all fairly simple, you could imagine some more interesting scenarios in which we test a matrix of different workloads, under different conditions – including randomness – and assert the expected behaviour with more confidence before migrating to Production.