Friday, November 14, 2014

Apache MPMs – Prefork, Worker, and Event

If you're still using Apache while the world slowly moves to NGINX, you're looking for every optimization you can find to keep up. You might tweak which modules are loaded, play with KeepAlive, fiddle with content negotiation, FollowSymLinks, and AllowOverride; you might even be throwing more hardware at it and pretending you didn't. However, if you're running a ridiculously busy site and don't want your web server to topple over due to memory usage, you should really look into which MPM you're using.
The MPM, or Multi-Processing Module, you use is responsible for just about the entire HTTP session: listening on the network, taking requests in, and most importantly, deciding how to handle children. No, we're not talking about 5-year-olds; we're talking about child processes and threads. For Unix-based machines, Apache offers three MPMs to choose from: Prefork, Worker, and Event. While there are other MPMs available for Apache on different systems, we're going to focus on Linux and what you're most likely going to see (and what I have experience with). These MPMs manage the processes, and potentially threads, that the Apache web server uses to accept, process, and serve HTTP requests.
A process can loosely be thought of as a particular running instance of a program. Each process is completely self-contained and executes separately and in isolation from the others in terms of address space, variables, memory, etc. For the scope of this article, you can think of it this way: 5 Apache processes are 5 different instances of Apache running. While there are quite a few caveats to that, it's safe enough to keep in mind for the purpose of this article.
A thread, on the other hand, is created from and owned by a process. A process can have multiple threads, and those threads are not completely independent: they share the same state and address space as the process that owns them.
In short, a process is an instance of a program; it tells the entire system that it exists and needs resources, and it can execute on its own. A thread is owned by a process and can only really execute stuff. Since a thread doesn't need to establish itself as the entire application like a process does, it inherently uses fewer system resources, like RAM.
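The difference is easy to see in a few lines of Python. This is only a minimal sketch and has nothing to do with Apache's actual code: the `counter` dictionary and the function names are made up for illustration, and `os.fork()` assumes a Unix-like system. A thread changes the data its process owns, while a forked child process only changes its own private copy.

```python
import os
import threading

# Hypothetical state owned by this process.
counter = {"value": 0}

def bump():
    counter["value"] += 1

def bump_in_thread():
    """Run bump() in a thread; the thread shares our address space."""
    worker = threading.Thread(target=bump)
    worker.start()
    worker.join()
    return counter["value"]

def bump_in_child_process():
    """Run bump() in a forked child; the child works on its own copy of memory."""
    pid = os.fork()
    if pid == 0:
        bump()        # only the child's copy changes
        os._exit(0)   # leave the child right away, skipping cleanup handlers
    os.waitpid(pid, 0)
    return counter["value"]

if __name__ == "__main__":
    print("after thread:", bump_in_thread())         # 1: the thread changed our data
    print("after child: ", bump_in_child_process())  # still 1: the child changed its own copy
```

The isolation that keeps the child's change invisible to the parent is also the reason a process costs more than a thread.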
Now that we know what processes and threads are, let's dive into how Apache handles them with each MPM. Regardless of which MPM you use, when you start your Apache web server, a single process launches. I usually like to call this guy 'the coordinator'. This guy is responsible for launching the other processes that actually listen for requests, process them, and serve them. So you've started Apache and the coordinator process has started a few child processes. Now what?


The Prefork MPM is non-threaded. It doesn't use threads at all. An entire process is dedicated to each HTTP request. When an HTTP request comes in, say for your cat picture, that entire process is tied up and responsible for that request from that one person. If another person browses to your cat picture at the same time, a whole different process has to be used.
Prefork is great in that it's fast and stable. It has a slight edge in response times as it doesn't have to deal with running different threads inside of its process. It's stable in that if something goes wrong with a particular request, other requests are not affected: the entire process is dedicated to that one request, and the others are handled by other processes.
Prefork is also what you use if you're running an Apache module that doesn't deal with threading all that well. The most common is mod_php (despite its latest efforts with ZTS). You can get around this issue by having PHP and/or other scripts handled not by Apache, but by something like PHP-FPM.
However, if you're dealing with a large number of concurrent requests, this can eat up resources like crazy. Remember, each process has to establish itself as a full instance of Apache, meaning it will load all modules and be a full web server for each and every request. If you have a lot of requests, the coordinator will have to launch as many of these processes as there are concurrent requests (up to whatever limit you've configured), quickly driving RAM usage through the roof.
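To put rough numbers on that, here is a back-of-the-envelope sketch. The function names are made up, and the 30 MB per process figure is purely an assumption for illustration; measure your own Apache children before trusting any of it.

```python
def prefork_memory_mb(concurrent_requests, mb_per_process):
    """Rough RAM needed when every concurrent request gets its own process."""
    return concurrent_requests * mb_per_process

def max_prefork_processes(available_mb, mb_per_process):
    """How many full Apache processes fit into the RAM you can spare."""
    return available_mb // mb_per_process

if __name__ == "__main__":
    # Assumed figure: 30 MB per Apache child process.
    for requests in (10, 100, 500):
        print(requests, "requests ->", prefork_memory_mb(requests, 30), "MB")
    print("processes that fit in 2048 MB:", max_prefork_processes(2048, 30))
```

The second function is the kind of arithmetic you would do before picking a process limit (MaxRequestWorkers in Apache 2.4, MaxClients in 2.2), so the server queues requests instead of swapping.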


The Worker MPM uses threading, which does absolute wonders for memory usage during times of high concurrency. With the Worker MPM, a smaller number of processes are launched: rather than each and every request needing its own process like in Prefork, Worker threads existing processes, serving multiple connections within a single process by means of those threads. New connections in Worker simply have to wait for an available thread, rather than an available process like in the Prefork MPM.
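As a toy model of that idea (again, not Apache's code; `handle_request` and `serve_with_threads` are hypothetical names), the sketch below serves eight pretend requests from one process with a pool of four threads. Every result carries the same process ID, and requests beyond the fourth simply wait for a free thread.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor

def handle_request(request_id):
    """Pretend to serve one request; report which process and thread did it."""
    return request_id, os.getpid(), threading.current_thread().name

def serve_with_threads(request_ids, threads_per_process=4):
    """Serve every request inside this one process using a small thread pool."""
    with ThreadPoolExecutor(max_workers=threads_per_process) as pool:
        # map() hands each request to the next free thread and keeps input order.
        return list(pool.map(handle_request, request_ids))

if __name__ == "__main__":
    for request_id, pid, thread_name in serve_with_threads(range(8)):
        print(f"request {request_id}: pid {pid}, {thread_name}")
```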


The Event MPM is very new. In fact, it was only declared stable as of Apache 2.4. The Event MPM works the exact same way as the Worker MPM when it comes to processes and threads. The big difference is that the Event MPM dedicates a thread to a request, not to the whole HTTP connection.
This is useful in a situation where you like the idea of threading, but have an application that uses rather long KeepAlive timeouts. With the Worker MPM, the thread would be bound to the connection and stay tied up regardless of whether a request was being processed or not.
With the Event MPM, the thread is only used for requests and frees back up immediately after the request is fulfilled, regardless of the actual HTTP connection, which is handed off to a dedicated listener thread in each process. Since the thread frees up immediately after the request is fulfilled, it can be used for other requests. Meaning fewer threads!
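How many fewer? A rough estimate, under loud assumptions: each connection carries a single request and then sits idle for the full KeepAlive timeout, and the average number of busy threads is simply arrival rate times the time each thread is held. The function names and the traffic figures are invented for illustration.

```python
def threads_needed(requests_per_second, ms_held_per_request):
    """Average busy threads = arrival rate x hold time, rounded up."""
    return (requests_per_second * ms_held_per_request + 999) // 1000

def worker_threads(requests_per_second, request_ms, keepalive_idle_ms):
    # Worker: the thread stays with the connection through the idle KeepAlive wait.
    return threads_needed(requests_per_second, request_ms + keepalive_idle_ms)

def event_threads(requests_per_second, request_ms):
    # Event: the thread is released as soon as the request is fulfilled.
    return threads_needed(requests_per_second, request_ms)

if __name__ == "__main__":
    # Assumed traffic: 100 requests/s, 50 ms per request, 5 s of KeepAlive idle time.
    print("Worker:", worker_threads(100, 50, 5000))  # 505
    print("Event: ", event_threads(100, 50))         # 5
```

Real traffic reuses connections for several requests, so the gap is usually smaller than this, but the direction holds: the longer connections sit idle, the more Event saves.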


So, which should you use? To be perfectly honest, in my opinion, everyone should switch over to the Worker MPM, as it handles concurrency very well and is easy on RAM. If you have compatibility issues, though, you may need to stick with Prefork.
If you're running the latest version of Apache, go ahead and use the new Event MPM.
If you're using PHP, you MUST use PHP-FPM alongside the Worker or Event MPM. Your PHP application will most likely break in some form or another if you use mod_php with a threaded MPM. In reality, you should probably be using PHP-FPM regardless of the Apache MPM.
At the end of the day, this post is more informational than anything. There will be times when using a non-threaded MPM is the way to go, and others when a threaded MPM is the best route.
If you're installing Apache from your distribution's repositories, it's likely that it is configured to use the Prefork MPM by default. Check back in a couple of days and I'll go through a full setup of a WordPress site using the Worker MPM and PHP-FPM.
