Behold, Dockerfiles!

I must apologize. Last week I left you with a story about a web ecosystem, but no source code to help with your own. Well, good news: this week is different.

For those who prefer to read code first, head to GitHub and dive in. The docs are minimal right now, but should be enough to get you started:

github.com/mcskinner/docker-nginx
github.com/mcskinner/docker-nginx-php
github.com/mcskinner/docker-wordpress

For those who want the backstory and breakdown, read on.

NGINX in Docker

The first step, of course, is getting NGINX to work in Docker. There’s an image for that. Easy, right?

$ docker run -d -p 80:80 nginx

And just like that, you have a running Docker instance that will serve up the NGINX boilerplate.

So what’s wrong? Nothing really. You could start from this image and be just fine. But I happened to notice that this guy is based on Debian instead of Ubuntu. See for yourself:

$ docker run nginx /bin/cat /proc/version
Linux version 4.1.19-boot2docker (root@bcad5a346f31) (gcc version 4.9.2 (Debian 4.9.2-10) ) #1 SMP Thu Apr 7 02:41:05 UTC 2016

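Again, that’s not fundamentally a bad thing. (One caveat: /proc/version describes the kernel, which a container shares with its host, so that output is only a hint. The official nginx image really is Debian-based, though.) But it does mean you might spend a lot of time chasing down portability bugs, because from what I can tell nearly every Docker example out there is set in Ubuntu, not Debian.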
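For a check that looks at the image itself rather than the kernel, /etc/os-release is the usual place. Here is a minimal sketch of reading one. The sample file contents are a hypothetical stand-in so the snippet runs anywhere:

```shell
#!/bin/sh
# Sketch: identify a distro from an os-release file.
set -eu
workdir=$(mktemp -d)
sample="$workdir/os-release"

# Hypothetical sample contents, trimmed to the fields used below.
cat > "$sample" <<'EOF'
ID=debian
VERSION_ID="8"
EOF

# os-release is written as shell-compatible KEY=value lines, so it can be sourced.
distro=$(. "$sample" && echo "$ID")
echo "distro: $distro"
```

Against a real image, the same file can be read with something like: docker run nginx cat /etc/os-release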

So if we want Ubuntu, we’ll have to set up the configuration ourselves. That’s why we’re here, right?

Let’s start with the basic image:

FROM ubuntu:14.04

Fresh starts are great! Now let’s layer on NGINX. It’s available via the typical apt-get route:

RUN apt-get update && \
  apt-get -y upgrade && \
  apt-get -y install nginx

As mentioned in a previous post, the default configs are quite nice. So this install is actually very close to complete. But there is one tweak that’s essential, and a few more that are nice to have:

RUN sed -i -e "s/keepalive_timeout\s*65/keepalive_timeout 2/" /etc/nginx/nginx.conf && \
  sed -i -e "s/keepalive_timeout 2/keepalive_timeout 2;\n\tclient_max_body_size 100m/" /etc/nginx/nginx.conf && \
  echo "\ndaemon off;" >> /etc/nginx/nginx.conf && \
  chown -R www-data:www-data /var/lib/nginx

The most important part here is turning off daemon mode, because we want a foreground process in our container. The rest is just some tuning (a shorter keepalive and a larger upload limit) and a bit of permissions boilerplate.
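To see what those edits actually do, here is a rough sketch that replays them on a throwaway file. The sample config is a hypothetical stand-in for the real Ubuntu nginx.conf, and GNU sed is assumed (as it is inside the image):

```shell
#!/bin/sh
# Sketch: replay the Dockerfile's nginx.conf edits on a scratch copy.
set -eu
workdir=$(mktemp -d)
conf="$workdir/nginx.conf"

# Hypothetical, heavily trimmed stand-in for the packaged nginx.conf.
cat > "$conf" <<'EOF'
user www-data;
http {
    keepalive_timeout 65;
}
EOF

# Shorten the keepalive timeout (GNU sed: \s matches whitespace).
sed -i -e "s/keepalive_timeout\s*65/keepalive_timeout 2/" "$conf"
# Add a larger upload limit on a new line right after the keepalive setting.
sed -i -e "s/keepalive_timeout 2/keepalive_timeout 2;\n\tclient_max_body_size 100m/" "$conf"
# Keep NGINX in the foreground so the container has a long-lived process.
printf '\ndaemon off;\n' >> "$conf"

cat "$conf"
```

Note that the second sed leans on the semicolon already sitting at the end of the original line, which ends up terminating the new client_max_body_size directive.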

After that, we just need to expose the right ports and tell Docker how to spin things up:

EXPOSE 80
EXPOSE 443
CMD ["nginx"]

The complete source for this, with examples, is available at github.com/mcskinner/docker-nginx. Or if you just want the image, that’s at hub.docker.com/r/mcskinner/nginx.

PHP? PHP!

Next up is to layer on support for PHP, because WordPress is written in PHP. The examples I’ve seen lean toward PHP-FPM, so we’ll go that route.

The process looks a lot like it did for our first image. Starting from there:

FROM mcskinner/nginx

Install the requirements:

RUN apt-get update && \
  apt-get -y upgrade && \
  apt-get -y install php5-fpm php5-mysql php-apc python-setuptools

Then make some config tweaks:

RUN sed -i -e "s/;cgi.fix_pathinfo\s*=\s*1/cgi.fix_pathinfo = 0/g" /etc/php5/fpm/php.ini && \
  sed -i -e "s/;daemonize\s*=\s*yes/daemonize = no/g" /etc/php5/fpm/php-fpm.conf

As before, we need to update the configuration so everything runs as a foreground process instead of a daemon. The other bit regarding cgi.fix_pathinfo is there to close a security hole related to URI rewriting. With the default setting, when the requested script doesn’t exist, PHP walks back up the path looking for the nearest file that does and runs that through the interpreter instead. That’s sketchy and dangerous, so best practice is to just say no.
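One thing worth knowing about this style of edit: sed exits successfully even when its pattern matches nothing, so a typo would silently leave the defaults in place. A minimal sketch of pairing each edit with a check, using hypothetical one-line stand-ins for the stock files:

```shell
#!/bin/sh
# Sketch: apply the PHP tweaks to sample files, then verify they took effect.
set -eu
workdir=$(mktemp -d)
ini="$workdir/php.ini"
fpm="$workdir/php-fpm.conf"

# Hypothetical one-line stand-ins for the stock Ubuntu files.
echo ';cgi.fix_pathinfo=1' > "$ini"
echo ';daemonize = yes' > "$fpm"

sed -i -e "s/;cgi.fix_pathinfo\s*=\s*1/cgi.fix_pathinfo = 0/g" "$ini"
sed -i -e "s/;daemonize\s*=\s*yes/daemonize = no/g" "$fpm"

# sed exits 0 even when nothing matched, so check the result explicitly.
grep -q '^cgi.fix_pathinfo = 0$' "$ini" || { echo "php.ini not updated" >&2; exit 1; }
grep -q '^daemonize = no$' "$fpm" || { echo "php-fpm.conf not updated" >&2; exit 1; }
echo "both settings applied"
```

The same grep checks could be chained onto the RUN line in a Dockerfile so a failed substitution breaks the build instead of shipping quietly.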

Now for a bit of the complexity I mentioned in that previous post. PHP and NGINX both need to be running at the same time for this whole thing to work. There are lots of ways to do that, probably including some that keep each process in its own container. I used Supervisor:

RUN /usr/bin/easy_install supervisor
RUN /usr/bin/easy_install supervisor-stdout
ADD ./supervisord.conf /etc/supervisord.conf

The first two lines install the software, and the last adds our configuration. The config could probably be a bit more minimal; mostly it just says how to run NGINX and PHP-FPM. After that, Supervisor will make sure they start running and keep running.
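For a sense of how small that config can be, here is a bare-bones sketch written out to a scratch directory. This is not the file from the repo, and the binary paths are assumptions for Ubuntu 14.04:

```shell
#!/bin/sh
# Sketch: write a bare-bones supervisord.conf to a scratch directory.
# Binary paths are assumptions for Ubuntu 14.04; not the repo's actual file.
set -eu
workdir=$(mktemp -d)
conf="$workdir/supervisord.conf"

cat > "$conf" <<'EOF'
[supervisord]
nodaemon=true

[program:php5-fpm]
command=/usr/sbin/php5-fpm
autorestart=true

[program:nginx]
command=/usr/sbin/nginx
autorestart=true
EOF

# Count the programs Supervisor would manage.
programs=$(grep -c '^\[program:' "$conf")
echo "programs defined: $programs"
```

Both programs have to stay in the foreground for this to work, which is why the daemon settings for NGINX and PHP-FPM were switched off earlier.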

Unfortunately that can fight a bit with the default Ubuntu init handler, Upstart. We’ll just replace the offending piece with an always-successful nop (i.e. /bin/true):

RUN dpkg-divert --local --rename --add /sbin/initctl
RUN ln -sf /bin/true /sbin/initctl

And then set up our default command to spin up Supervisor and point it to the installed config:

CMD ["/usr/local/bin/supervisord", "-n", "-c", "/etc/supervisord.conf"]

Once again, the corresponding code with examples can be found at github.com/mcskinner/docker-nginx-php, or just the image at hub.docker.com/r/mcskinner/nginx-php.

On to WordPress

And now, with all the prerequisites set up, we can get WordPress specific. Okay wait, actually first we let Docker know those are our prereqs:

FROM mcskinner/nginx-php

And then install a whole bunch of PHP add-ons that WordPress needs:

RUN apt-get update && \
  apt-get -y upgrade && \
  apt-get -y install php5-curl php5-gd php5-intl php-pear php5-imagick php5-imap php5-mcrypt php5-memcache php5-ming php5-ps php5-pspell php5-recode php5-sqlite php5-tidy php5-xmlrpc php5-xsl

We’ll also want to tweak the configs a bit. Always with the configuration. In this case, we just want to tidy up the comments in the PHP .ini files, rewriting any #-style comment lines as ;-style ones, since PHP has deprecated # comments in INI files:

RUN find /etc/php5/cli/conf.d/ -name "*.ini" -exec sed -i -re 's/^(\s*)#(.*)/\1;\2/g' {} \;
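To see exactly what that one-liner does to a file, here is a small sketch that runs it against a scratch directory. The sample .ini contents are made up for illustration:

```shell
#!/bin/sh
# Sketch: run the same find/sed one-liner against a scratch .ini file.
set -eu
workdir=$(mktemp -d)
ini="$workdir/example.ini"

# Hypothetical sample: two #-style comment lines and one real setting.
printf '# configuration for a php module\nextension=example.so\n  # indented note\n' > "$ini"

# Swap a leading '#' for ';' on each comment line, preserving indentation.
find "$workdir" -name "*.ini" -exec sed -i -re 's/^(\s*)#(.*)/\1;\2/g' {} \;

cat "$ini"
```

The setting line is left alone; only lines whose first non-blank character is # get rewritten.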

Alright, now we’re ready for the WordPress install. We’ll add the tarball to the image, unpack it into our serving root, and hand ownership over to the www-data user that NGINX runs as:

ADD https://wordpress.org/latest.tar.gz /var/www/latest.tar.gz
RUN cd /var/www/ && tar xvf latest.tar.gz && rm latest.tar.gz && \
  rm -rf /var/www/html && mv /var/www/wordpress /var/www/html && \
  mv /usr/share/nginx/html/50x.html /var/www/html/ && \
  chown -R www-data:www-data /var/www/html
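The explicit tar step is there because ADD only auto-extracts local archives, not ones fetched from a URL. As a rough sketch, the unpack-and-swap sequence can be rehearsed in a scratch directory. The tiny tarball built here is a stand-in for the real download:

```shell
#!/bin/sh
# Sketch: rehearse the unpack-and-swap steps in a scratch directory.
set -eu
root=$(mktemp -d)   # stands in for /var/www
mkdir -p "$root/html" "$root/src/wordpress"
echo 'stock page' > "$root/html/index.html"
echo '<?php // placeholder' > "$root/src/wordpress/index.php"

# Build a tiny tarball shaped like latest.tar.gz (top-level wordpress/ dir).
tar czf "$root/latest.tar.gz" -C "$root/src" wordpress
rm -rf "$root/src"

# Same sequence as the Dockerfile: unpack, drop the tarball, replace html/.
cd "$root" && tar xf latest.tar.gz && rm latest.tar.gz
rm -rf "$root/html" && mv "$root/wordpress" "$root/html"

ls "$root/html"
```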

Speaking of NGINX, the default configuration will no longer work for WordPress, so we’ll need to add our own. The easiest way to do that is to clobber the default and keep everything else the same:

ADD ./nginx-wordpress-site.conf /etc/nginx/sites-available/default

And that’s it! The Supervisor configuration from mcskinner/nginx-php will work just fine for WordPress, so we don’t need a new CMD here.

You can find the code at github.com/mcskinner/docker-wordpress and the finished image at hub.docker.com/r/mcskinner/wordpress.

More Configuration

Okay so maybe that’s not quite it. The NGINX config from above deserves a little bit of explanation. But only a little bit, because it’s not really that special. Most of it is stolen from the NGINX defaults. See for yourself:

$ docker run mcskinner/nginx /bin/cat /etc/nginx/sites-enabled/default

But there are a few differences to point out. First, we’ve extended the default location to use index.php as a catchall instead of a 404 not found error. Second, we’ve uncommented the already-written location block for PHP-FPM and enhanced it with a bit more protection against URI rewrites.
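Those two changes might look roughly like the following. This is a pared-down sketch written to a scratch file, not the actual config from the repo; the directives follow common NGINX and PHP-FPM setups, and the socket path is the one that shows up in the logs further down:

```shell
#!/bin/sh
# Sketch: a pared-down WordPress server block, written to a scratch file.
# Not the repo's actual config; directives follow common NGINX + PHP-FPM setups.
set -eu
workdir=$(mktemp -d)
site="$workdir/default"

cat > "$site" <<'EOF'
server {
    listen 80;
    root /var/www/html;
    index index.php index.html;

    # Fall through to WordPress instead of returning 404.
    location / {
        try_files $uri $uri/ /index.php?$args;
    }

    # Hand PHP to PHP-FPM, but only for files that actually exist.
    location ~ \.php$ {
        try_files $uri =404;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/var/run/php5-fpm.sock;
    }
}
EOF

# Show where the two fallback rules ended up.
grep -n 'try_files' "$site"
```

The try_files inside the PHP block is the extra protection: a request for a script that does not exist on disk gets a 404 before it ever reaches the interpreter.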

Last, and most interesting, we see this snippet has been added:

location = /xmlrpc.php {
  deny all;
}

As it turns out, if you launch a WordPress installation on AWS, you will get hit by scanners looking for remote exploits. I found this out because NGINX threw me an error today when I tried to visit this site.

Okay, great. Time to look at the logs. For our install, those are in /var/log/nginx/. I found my access.log to be empty, but error.log had a lot of lines like this:

2016/04/16 20:58:09 [error] 19#0: *1364243 connect() to unix:/var/run/php5-fpm.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: 185.106.92.55, server: localhost, request: "POST /xmlrpc.php HTTP/1.0", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "52.27.97.12"

And I do mean a lot. Perhaps 10 per second, from only 3-4 IPs. The *1364243 in that line is NGINX’s connection counter, so the server had already handled 1.3M+ connections by then. Time to do some research.
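A quick way to get that kind of per-IP picture is to tally the client field in error.log. A minimal sketch, using shortened sample lines; the 203.0.113.7 address is a made-up documentation IP, not one from my logs:

```shell
#!/bin/sh
# Sketch: count failed xmlrpc.php requests per client IP in an NGINX error log.
set -eu
workdir=$(mktemp -d)
log="$workdir/error.log"

# Shortened sample lines; 203.0.113.7 is a made-up documentation address.
cat > "$log" <<'EOF'
2016/04/16 20:58:09 [error] 19#0: *1364243 connect() to unix:/var/run/php5-fpm.sock failed, client: 185.106.92.55, request: "POST /xmlrpc.php HTTP/1.0"
2016/04/16 20:58:09 [error] 19#0: *1364244 connect() to unix:/var/run/php5-fpm.sock failed, client: 203.0.113.7, request: "POST /xmlrpc.php HTTP/1.0"
2016/04/16 20:58:10 [error] 19#0: *1364245 connect() to unix:/var/run/php5-fpm.sock failed, client: 185.106.92.55, request: "POST /xmlrpc.php HTTP/1.0"
EOF

# Pull out the "client: <ip>" field, then count occurrences, busiest first.
tally=$(grep 'xmlrpc.php' "$log" | grep -o 'client: [0-9.]*' | sort | uniq -c | sort -rn)
echo "$tally"
```

Pointed at the real /var/log/nginx/error.log, the same pipeline lists the noisiest clients at the top.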

And we’re back. Fortunately the research did not take long. At some point in time WordPress had a security vulnerability involving this /xmlrpc.php endpoint. So, naturally, random asshats on the internet like to spam at it. Since it’s only used for some subset of features that I don’t understand or use, the fix is to deny it completely. That will save our resources for legitimate requests instead of spam.

After putting that fix in, my error.log is a lot cleaner. It still contains an echo of the denial, but no more failures. Even better, the access.log is now tracking these sketchy requests:

185.106.92.55 - - [16/Apr/2016:21:23:24 +0000] "POST /xmlrpc.php HTTP/1.0" 502 537 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"

The next step from here would be to set up something like fail2ban so these IPs get shut down completely. I haven’t gotten there just yet, but expect to hear more on that soon.

Hello, Experiment!

Not so long ago I had this idea that I wanted to build my own little web ecosystem. I dreamt of a minimal setup. A convenient place to host my ideas and a blog about how they came to be. Almost as if I was just running it all locally, with me as the main user.

So we’re talking a pretty minimal traffic load, which means it’s best if everything runs together and cheaply. And I want to control it like a local machine, so no black boxes like Heroku or AppEngine. Docker claims to be the right abstraction and, having not yet done devops for a living, I’m willing to believe them. At first I’ll host it all on AWS, but because it’s containers all the way down I can always move it later.

It’s a good dream. But let’s back up a bit, because I’m not quite there yet. What I’ve done so far is booted up WordPress-in-Docker and started yammering. On my local machine. It’s actually not much of an accomplishment at all. In fact, the internet is chock full of ways to get WordPress up and running quickly.

WordPress maintains their own Docker images, Docker has a nice example with Compose, and you could follow an Amazon tutorial and end up on AWS ASAP.

These are all great starting points. But I’m me, and if this is going to be like my localhost, then I’m going to want to do a few things differently.

To start, what if I don’t want to run my database in a container? The experts seem to think that data persistence in and around Docker is a pain point, so I don’t want to go there. Maybe I should do it and mount my DB-in-Docker onto some EBS-backed drives, but I chose not to fight that battle today. Today I opt for a hosted solution, which for SQL-on-AWS means an RDS instance. Presumably I got that right and this blog will persist even as the containers running it come and go.

Next up, I got the idea to use NGINX instead of Apache. I think event-driven servers are great, and I think “engine-x” is a great name. I even read the intro docs and everything made sense, so I dove into best practices and examples. At which point I struggled with a lot of complexity that I think I could have avoided.

Before that happens to you, spin up a copy of their Docker image. Follow the examples and note that everything just works. Then open up a shell inside the image and navigate to /etc/nginx. There you will find a working default configuration, with the nginx.conf centerpiece that everyone else talks about.

The trick for me was realizing that nginx.conf usually dispatches elsewhere for most of the per-installation configuration. The default uses both conf.d/ and sites-enabled/, which works with every tutorial I’ve seen. I imagine that sometimes you might actually need to tweak the root nginx.conf, but I don’t know when that is because I haven’t had to yet. That helped quite a lot in interpreting the various tutorials, though some were still better than others.
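That dispatching is easy to spot once you know to look for include lines. A small sketch, run here against an abbreviated stand-in for the packaged nginx.conf (the layout is assumed from the default described above):

```shell
#!/bin/sh
# Sketch: list the include directives in an nginx.conf.
set -eu
workdir=$(mktemp -d)
conf="$workdir/nginx.conf"

# Abbreviated stand-in for the packaged nginx.conf (assumed layout).
cat > "$conf" <<'EOF'
http {
    include /etc/nginx/mime.types;
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}
EOF

# Print the target of each include directive.
includes=$(grep -E '^[[:space:]]*include[[:space:]]' "$conf" | awk '{print $2}')
echo "$includes"
```

Inside a running container, the same grep against /etc/nginx/nginx.conf shows where the per-site configuration actually lives.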

Unfortunately, the docs supplied by WordPress did not seem very good. They even contained a few not-so-best-practices according to the NGINX docs. Digital Ocean, on the other hand, has a nice write-up on How To Install WordPress with Nginx on Ubuntu 14.04 and a working config to match. It further links to an article on getting the prerequisite LEMP stack installed. In our case the requirement is just to run PHP-FPM as a service alongside the NGINX service.

So those lead pretty naturally to a mostly-working Dockerfile. But just mostly. The first problem is how to tell WordPress about the RDS credentials from earlier without baking them into the Docker image. For now I solved this by manually finalizing the WordPress config outside of Docker, stowing it someplace safe, and then mounting it into the container at launch. I will need to do something better if I want to automate the release, but fortunately I’m not there yet.
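The mount-at-launch approach might look something like the sketch below. The host path is hypothetical, and the script only prints the command so it is safe to run without Docker; the container path assumes WordPress is served from /var/www/html:

```shell
#!/bin/sh
# Sketch: assemble a docker run command that mounts wp-config.php read-only.
set -eu
# Hypothetical host path; point this at wherever the finished config is stowed.
host_config="/srv/secrets/wp-config.php"
# Assumes WordPress is served from /var/www/html inside the container.
container_config="/var/www/html/wp-config.php"

cmd="docker run -d -p 80:80 -v $host_config:$container_config:ro mcskinner/wordpress"
# Print rather than execute, so the sketch runs without Docker installed.
echo "$cmd"
```

The :ro suffix keeps the container from modifying the config, and the credentials never end up in an image layer.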

The second problem was how to actually run two required services inside one container. The common answer appears to be supervisord, which will maintain the various processes for you. Since this was yet another tool to learn, I must admit that I borrowed my config from elsewhere.

In fact, as it turns out, very early in this quest I found an existing and probably working example for Docker+NGINX+WordPress. And I didn’t understand it at all. The configs were alien, it involved extra tools like supervisord, and MySQL seemed very embedded. Then I started to understand it, and I didn’t like it. Supervisord felt like cheating at containers, the WordPress install needs a mounted volume, and MySQL still seemed embedded. And then I finally understood it and that I needed most of it. In fact I had already rebuilt most of it.

Oh well, at least I know what most of it means and I could borrow the supervisord config. I’ll polish up my implementation a bit and then go over it in an upcoming post.