Wednesday, September 21, 2011

FastCGI with a PHP APC Opcode Cache

 
 

via Brandon's Blog by Brandon on 7/6/09

Hosting PHP web applications in a shared environment usually involves a choice between two exclusive options: host a fast application by using a persistent opcode cache, or host an application that your shared neighbors can't snoop around or destroy. In this post I discuss a way to get the best of both worlds, by combining FastCGI with a single opcode cache per user.

This is a long post. Ready to jump right in? Skip the history!

The evolution of mod_php to FastCGI

In the early days of all-you-can-eat shared hosting, administrators served PHP via mod_php. mod_php loads the PHP interpreter into every web server process during server startup, which avoids the expense of starting an interpreter each time a script executes. This allowed PHP scripts to execute relatively quickly.

mod_php came with a few drawbacks:

  • Every server process, even those serving static files such as images and CSS stylesheets, contained the PHP interpreter. This caused a lot of bloat in the web server's memory footprint. It also ruled out multi-threaded web servers, as many PHP extensions are not thread safe.
  • Every PHP script ran as the same user as the web server. While web servers typically run as a non-privileged user such as nobody, multiple mutually untrusting shared accounts could easily access, disrupt or destroy each other by executing a PHP script.

FastCGI loads the PHP interpreter into a separate process. This process is still persistent across connections, but, using a mechanism such as suEXEC, can run as a different user. Static files can be served by a lightweight multi-threaded web server process while PHP scripts are served by a single-threaded FastCGI process. What's more, if PHP crashes, it doesn't bring down the entire web server.

In the shared hosting context, each user's PHP scripts are executed with the user's credentials. This leads to a more secure environment for both the host and the shared user.

The opcode cache

One of the easiest and most effective things you can do to speed up your PHP scripts is to enable an opcode cache such as APC, XCache or eAccelerator. An opcode cache caches the compiled state of PHP scripts in shared memory. Thus each time a PHP script is run, the server doesn't have to waste time compiling the source code. Opcode caches can speed up execution of scripts by up to 5 times and decrease server load.

In my opinion running PHP on a webserver without an opcode cache is like restarting your car's engine at every stop sign. You can still get where you're going but it's going to take longer and put a lot more wear and tear on your engine. An opcode cache is so important that APC is going to be included in the core of PHP 6.

An opcode cache requires that the PHP interpreter process persist between connections. Both mod_php and FastCGI satisfy this requirement. An opcode cache requires RAM, a precious commodity on a shared hosting server. By default, each cache allocates 30MB of shared memory. This can be easily configured up or down depending on the scripts you are running.

Combining FastCGI with an opcode cache

So if we agree that FastCGI and opcode caches are good (a must IMHO), why do most shared hosting providers only enable one? The answer is two-fold:

  1. RAM. Each opcode cache is typically 30MB. Each PHP process gets its own opcode cache. Each user must run their own PHP process for security. Thus each user requires at least 30MB of RAM on top of the RAM required for the PHP interpreter (a lot). All-you-can-eat shared hosting companies typically oversell their servers. Overselling usually works when it comes to bandwidth, I/O and CPU time; overselling RAM is harder. Remember that the PHP processes stay in memory between connections, so a small site getting only 100 hits a day still hogs the same amount of RAM as a busy site. This breaks the overselling model.
  2. FastCGI. In a typical configuration, FastCGI spawns many separate PHP processes per user. Each PHP process needs its own opcode cache. Instead of maintaining one opcode cache (per user), the server maintains multiple caches. This reduces the effectiveness of the cache and increases the strain on server resources.
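To put rough numbers on problem #2, here is a back-of-the-envelope sketch in shell. The function names and the example figures (100 users, 5 PHP processes per user) are illustrative assumptions; only the 30MB default comes from the discussion above, and the memory used by the interpreter itself is ignored.

```sh
#!/bin/sh
# Rough estimate of opcode-cache RAM in MB (cache memory only).

# One cache per PHP process: users * processes_per_user * cache_mb
cache_ram_per_process() {
    echo $(( $1 * $2 * $3 ))
}

# One cache per user: users * cache_mb
cache_ram_per_user() {
    echo $(( $1 * $2 ))
}

# Example for a hypothetical host with 100 users, 5 processes each, 30MB caches:
#   cache_ram_per_process 100 5 30
#   cache_ram_per_user 100 30
```

With those assumed figures the first call prints 15000 and the second prints 3000, which is the gap that a single cache per user is meant to close.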

Solving problem #1 is hard. Some have suggested a single cache that can be shared across multiple processes and users and still provide assurance that different users cannot mess with each other. This blog post is not about #1. There are many reasons to use unlimited shared hosting providers. Opcode caches are not one.

This blog post is about how to solve problem #2. The goal is to have a reasonable system that utilizes suEXEC, FastCGI and the APC opcode cache. Each user should have one and only one opcode cache. The administrator should be able to adjust the size of the cache for each individual user based on their needs (and monthly fee). Finally, the solution should decrease script load time and increase server performance while maintaining security and privacy between accounts.

mod_fastcgi vs mod_fcgid

I run Apache on my server. Many people suggest running a more lightweight server such as lighttpd. One day I may switch, but for now I've tuned my Apache server to be as fast as I need.

There are two modules to implement FastCGI on Apache – mod_fastcgi and the newer mod_fcgid. Both are binary compatible with each other and do basically the same thing. mod_fcgid sports better control over spawning processes and error detection. mod_fastcgi has been around longer. Both support suEXEC, and both separate PHP from Apache, thus allowing Apache to run threaded workers if desired.

As I mentioned in the "Combining FastCGI with an opcode cache" section, the typical behavior of FastCGI is to spawn multiple PHP interpreters. The FastCGI process monitors each child process, kicking out processes with errors, restarting failed processes and sending incoming requests to the least busy child. This is usually the preferred behavior, and mod_fcgid implements it particularly well.

Opcode caches throw a wrench in this, however, because they cannot share a cache across separate FastCGI processes. Hopefully one day this will be remedied. Luckily, in the meantime, PHP is capable of playing "process manager": a single PHP process can spawn several children to handle requests. This way the parent PHP process can instantiate the opcode cache and its children can share it. You'll see this later when we set the PHP_FCGI_CHILDREN environment variable.

Both mod_fcgid and mod_fastcgi can be told to limit the number of PHP processes to 1 per user. The PHP process can then be told how many children to spawn. Unfortunately mod_fcgid will only send one request per child process. The fact that PHP spawns its own children is ignored by mod_fcgid. If we use mod_fcgid with our setup, we can only handle one concurrent PHP request. This is not good. A long running request could easily block multiple smaller requests.

mod_fastcgi will send multiple simultaneous requests to a single PHP process if the PHP process has children that can handle it. This is the reason we must use mod_fastcgi to achieve our goal of one cache per user.

Implementation

This section describes the steps I took to enable suEXEC FastCGI with a single APC opcode cache per user on Apache 2.2. These instructions may vary by Linux distribution and are not intended to be a cut-and-paste howto. I use Gentoo, so most steps will be geared towards a Gentoo install but the general idea should work on any distribution.

1. Install php-cgi and disable mod_php

The PHP interpreter can run in three different modes: as an Apache module, as a CGI binary or as a command line command. Typically, separate binaries are built for the CGI and CLI modes, php-cgi and php respectively. On Gentoo, each mode is associated with a USE flag: apache2 for mod_php, cgi for a CGI binary, and cli for command-line PHP. The cgi USE flag must be enabled. If it isn't, add it to /etc/make.conf or /etc/portage/package.use and recompile PHP. On other distributions, search for a php-cgi binary.
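If you are unsure whether the binary you found is the CGI/FastCGI build, its version banner is a quick hint. The sketch below assumes the banner names the server API in parentheses (for example "cgi-fcgi" for a FastCGI-capable build); the function name is made up for illustration, and the check is only a heuristic on that text.

```sh
#!/bin/sh
# Reads a PHP version banner on stdin and succeeds if it mentions the
# "cgi-fcgi" server API. This is a text heuristic, not a functional test.
looks_like_fastcgi_build() {
    grep -q 'cgi-fcgi'
}

# Possible usage (the path is an assumption, adjust for your distribution):
#   /usr/bin/php-cgi -v | looks_like_fastcgi_build && echo "FastCGI-capable build"
```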

You will want to disable mod_php (if it was enabled) before implementing FastCGI. This can be done by commenting out the appropriate LoadModule line in your Apache configuration file:

# LoadModule php5_module modules/libphp5.so

On Gentoo, this can be easily done by removing PHP5 from the APACHE2_OPTS variable in /etc/conf.d/apache2.
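On distributions where you edit the configuration by hand, the same change can be previewed with sed before touching the real file. This is a minimal sketch: the function name is hypothetical, and it assumes the module is loaded with a php5_module LoadModule line like the one shown above.

```sh
#!/bin/sh
# Print CONF_FILE ($1) with any active "LoadModule php5_module ..." line
# commented out. Output goes to stdout, so the original file is untouched.
comment_out_mod_php() {
    sed 's/^[[:space:]]*LoadModule[[:space:]][[:space:]]*php5_module/# &/' "$1"
}

# Possible usage (the path is an assumption):
#   comment_out_mod_php /etc/apache2/httpd.conf | diff /etc/apache2/httpd.conf -
```

Lines that are already commented out are left alone because the pattern only matches lines that begin with LoadModule.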

2. Install and enable mod_fastcgi Apache module

We already discussed why we must use mod_fastcgi instead of mod_fcgid. On Gentoo, installing mod_fastcgi can easily be done by running:

$ sudo emerge mod_fastcgi

For other distributions, try installing a mod_fastcgi package or see the FastCGI Installation Notes.

Make sure your Apache conf file contains the line:

LoadModule fastcgi_module modules/mod_fastcgi.so

On Gentoo, this line is found in /etc/apache2/modules.d/20_mod_fastcgi.conf. mod_fastcgi is enabled by adding FASTCGI to the APACHE2_OPTS variable in /etc/conf.d/apache2.

3. Install and configure the APC Opcode Cache

To install APC on Gentoo, simply run:

$ sudo emerge pecl-apc

For other distributions, see the Alternative PHP Cache installation instructions.

Once installed, look for the apc.ini file in your PHP extension configuration directory (e.g. /etc/php/cgi-php5/ext-active). The default apc.ini works with one exception: you need to comment out apc.shm_size="30" (line 5 below). Commenting out this line lets us set the cache size per user later.

My apc.ini file looks like:

extension=apc.so
apc.enabled="1"
apc.shm_segments="1"
;commenting this out allows you to set it in each fastcgi process
;apc.shm_size="30"
apc.num_files_hint="1024"
apc.ttl="7200"
apc.user_ttl="7200"
apc.gc_ttl="3600"
apc.cache_by_default="1"
;apc.filters=""
apc.mmap_file_mask="/tmp/apcphp5.XXXXXX"
apc.slam_defense="0"
apc.file_update_protection="2"
apc.enable_cli="0"
apc.max_file_size="1M"
apc.stat="1"
apc.write_lock="1"
apc.report_autofilter="0"
apc.include_once_override="0"
apc.rfc1867="0"
apc.rfc1867_prefix="upload_"
apc.rfc1867_name="APC_UPLOAD_PROGRESS"
apc.rfc1867_freq="0"
apc.localcache="0"
apc.localcache.size="512"
apc.coredump_unmap="0"
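A small check can confirm that no global apc.shm_size is still active in the file. This is a sketch under the assumption that the setting appears on a line of its own; the function name is invented for this example.

```sh
#!/bin/sh
# Succeeds (exit 0) if INI_FILE ($1) has no active apc.shm_size line,
# i.e. the setting is absent or commented out with ";".
# Returns 2 if the file cannot be read.
shm_size_is_unset() {
    [ -r "$1" ] || return 2
    ! grep -Eq '^[[:space:]]*apc\.shm_size[[:space:]]*=' "$1"
}

# Possible usage (the path is an assumption based on the directory above):
#   shm_size_is_unset /etc/php/cgi-php5/ext-active/apc.ini || echo "still set globally"
```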

4. Install/enable Apache suEXEC

Apache 2.2 contains built-in support for executing CGI programs as a different user id and group id than the webserver. This support must be compiled into Apache. On Gentoo, use the suexec USE flag and recompile apache. On other distributions, see Configuring & Installing suEXEC.

5. Create wrapper scripts

The Apache suEXEC security model requires that the CGI binary meet some pretty stringent requirements concerning file ownership and permissions. Rather than copying the php-cgi binary for each user, we create multiple wrapper scripts around the php-cgi binary. These wrapper scripts allow us to set options on a per-user basis.

I keep my wrapper scripts in /var/www/bin, though you may keep yours wherever you want. Each user has a directory in /var/www/bin, for example:

$ ls -l /var/www/bin
dr-xr-xr-x 2 bob bob 104 Jun 24 13:56 bob/
dr-xr-xr-x 2 sue sue 104 Jun 24 13:56 sue/
dr-xr-xr-x 2 joe joe 104 Jun 24 13:53 joe/

Inside each user's bin directory is a single wrapper script, php-fastcgi:

$ ls -l /var/www/bin/bob/
-r-xr-x--- 1 bob bob 145 Jun 24 13:56 php-fastcgi

I've shown the ls -l output to show the file and directory ownership and permissions. These are important, and Apache suEXEC will not work correctly if the owner and permissions are not correct.
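One of the suEXEC requirements is that neither the program nor its directory is writable by anyone other than the owner. The sketch below checks only those write bits with find; it does not check ownership, which suEXEC also verifies. The function name is hypothetical.

```sh
#!/bin/sh
# Succeeds if the path ($1) exists and is not writable by group or others.
# Returns 2 if the path does not exist.
not_writable_by_others() {
    [ -e "$1" ] || return 2
    # -prune keeps find from descending; -perm -020 / -002 test the
    # group-write and other-write bits of the path itself.
    [ -z "$(find "$1" -prune \( -perm -020 -o -perm -002 \) -print)" ]
}

# Possible usage with the layout shown above:
#   not_writable_by_others /var/www/bin/bob || echo "directory too permissive"
#   not_writable_by_others /var/www/bin/bob/php-fastcgi || echo "wrapper too permissive"
```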

The contents of the php-fastcgi file in each user's bin directory (see below for an explanation):

#!/bin/sh

PHP_FCGI_CHILDREN=5
export PHP_FCGI_CHILDREN
PHP_FCGI_MAX_REQUESTS=500
export PHP_FCGI_MAX_REQUESTS

umask 0022
exec /usr/bin/php-cgi -d apc.shm_size=25

PHP_FCGI_CHILDREN
This variable tells PHP how many child processes it should spawn. As we discussed earlier, our PHP process will act as "process manager" and pass incoming requests to its children. The parent will maintain a single opcode cache which each child will share. Another way to think of this value is as the number of concurrent PHP requests that can be handled per user.

PHP_FCGI_MAX_REQUESTS
PHP is known for memory leaks in long running processes. This variable causes each child process to be restarted once it has served a given number of requests (e.g. 500). Only the child process is restarted; the parent process remains. Since the parent process maintains the opcode cache, the opcode cache persists.

umask 0022
This sets the umask the PHP binary will run under. Some people may prefer a stricter umask such as 0077. However, I've found 0022 works best, as it allows the Apache server running as nobody to read static files written earlier by a suEXEC'd PHP process. Some PHP applications (WordPress plugins, for example) do not do a good job with permissions, and a strict umask can cause applications to fail.

exec /usr/bin/php-cgi -d apc.shm_size=25
This line calls the php-cgi binary and sets the APC cache size. It is possible to point each user at a separate php.ini file instead of setting configuration parameters on the command line. However, I like being able to share a single php.ini file.
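Since the wrappers differ only in a couple of values, they can be generated rather than written by hand. The following is a minimal sketch: print_wrapper is a hypothetical helper, and the default of 5 children and the fixed 500 requests simply mirror the script above.

```sh
#!/bin/sh
# Print a php-fastcgi wrapper for one account on stdout.
#   $1 = APC cache size in MB
#   $2 = number of PHP children (defaults to 5)
print_wrapper() {
    cache_mb=$1
    children=${2:-5}
    cat <<EOF
#!/bin/sh

PHP_FCGI_CHILDREN=$children
export PHP_FCGI_CHILDREN
PHP_FCGI_MAX_REQUESTS=500
export PHP_FCGI_MAX_REQUESTS

umask 0022
exec /usr/bin/php-cgi -d apc.shm_size=$cache_mb
EOF
}

# Possible usage for a user who pays for a 64MB cache:
#   print_wrapper 64 > /var/www/bin/sue/php-fastcgi
```

The generated file still needs the ownership and permissions described earlier in this step before suEXEC will run it.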

6. Edit global Apache settings

There are two sets of settings you must configure in Apache: those that affect all users and those that affect a specific user. This section describes global settings that affect all users.

I like to keep my global settings in my /etc/apache2/modules.d/20_mod_fastcgi.conf file, but these can go in any part of your httpd.conf file. Most of the time you do not want these in a VirtualHost section. My global mod_fastcgi settings look like this (see below for an explanation):

<IfDefine FASTCGI>
LoadModule fastcgi_module modules/mod_fastcgi.so

FastCgiConfig -idle-timeout 20 -maxClassProcesses 1
FastCgiWrapper On

AddHandler php5-fcgi .php
Action php5-fcgi /cgi-bin/php-fastcgi

<Location "/cgi-bin/php-fastcgi">
   Order Deny,Allow
   Deny from All
   Allow from env=REDIRECT_STATUS
   Options ExecCGI
   SetHandler fastcgi-script
</Location>

</IfDefine>

FastCgiConfig
The FastCgiConfig configuration directive sets parameters for all dynamic FastCGI processes. The idle-timeout causes FastCGI to abort a request if there is no activity for more than 20 seconds. The maxClassProcesses option is very important: it tells FastCGI to only spawn one php-cgi process regardless of how many requests are pending. Remember that our PHP process will spawn its own children, so FastCGI only needs to spawn one. Until this APC bug is fixed, this is necessary to allow sharing the APC cache among children.

FastCgiWrapper
The FastCgiWrapper configuration directive is needed to allow suEXEC to work.

AddHandler / Action
The AddHandler and Action configuration directives tell Apache to handle all files ending in .php with the php-fastcgi script in cgi-bin. In the next step, you'll see how we alias this cgi-bin directory for each individual user.

Location
The Location directive tells Apache how to handle requests to /cgi-bin/php-fastcgi. The Allow from env=REDIRECT_STATUS on line 13 prevents users from executing this script directly. With this line, the only way to execute php-fastcgi is by requesting a file ending in .php.
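Once Apache is running with these settings, one way to sanity-check -maxClassProcesses 1 is to count, per user, the php-cgi processes whose parent is not itself php-cgi. With the wrapper from step 5 one would expect a single such parent plus its children. This is a sketch: the function name is made up, and it assumes ps reports the command name as php-cgi.

```sh
#!/bin/sh
# Reads "PID PPID COMMAND" lines on stdin and prints how many php-cgi
# processes have a parent that is not another php-cgi process.
count_php_masters() {
    awk '$3 == "php-cgi" { ppid[$1] = $2; is_php[$1] = 1 }
         END {
             n = 0
             for (p in ppid) if (!(ppid[p] in is_php)) n++
             print n
         }'
}

# Possible usage for the account "sue":
#   ps -u sue -o pid= -o ppid= -o comm= | count_php_masters
```

A result above 1 would suggest that more than one opcode cache is being kept for that user.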

7. Edit per-user Apache settings

On my host, every virtual host is associated with one user. And every user has exactly one opcode cache. A single user can have multiple virtual hosts, but these virtual hosts share the same opcode cache.

For each virtual host, I add the following lines, customized for the user associated with that virtual host:

<VirtualHost *:80>
ServerName www.sue.bltweb.net
...
<IfModule mod_fastcgi.c>
   SuexecUserGroup sue sue
   Alias /cgi-bin/ /var/www/bin/sue/
</IfModule>
...
</VirtualHost>

When combined with the global apache settings and the wrapper scripts, this will launch the php-cgi binary using suEXEC to execute as the appropriate user and group whenever a .php file is requested.
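Because the per-user lines follow a fixed pattern, they can be generated alongside the wrapper scripts. A minimal sketch, assuming the user and group share the account name as in the example above; print_vhost_fastcgi is a hypothetical helper.

```sh
#!/bin/sh
# Print the per-user FastCGI lines for a virtual host on stdout.
#   $1 = account name (used for both user and group)
#   $2 = base directory holding the wrapper directories (default /var/www/bin)
print_vhost_fastcgi() {
    user=$1
    base=${2:-/var/www/bin}
    cat <<EOF
<IfModule mod_fastcgi.c>
   SuexecUserGroup $user $user
   Alias /cgi-bin/ $base/$user/
</IfModule>
EOF
}

# Possible usage:
#   print_vhost_fastcgi sue
```

The output is meant to be pasted inside the matching VirtualHost section; it does not create the wrapper directory itself.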

There are several different ways to call the FastCGI binary. On my hosts, users don't have access to their cgi-bin directory: the /var/www/bin directory is not accessible by ordinary users. This doesn't have to be the case; the cgi-bin directory could be stored in the user's directory. It is important to note that allowing users to modify php.ini values allows them to modify their opcode cache size, which could have severe repercussions on RAM usage.

Pros and Cons

The implementation described above is only one of many ways to implement FastCGI and APC. In my opinion, it is the best way to meet my goals, but in this section I'll try to outline some of the advantages and disadvantages of my setup.

Advantages
  • Different users can have different APC cache sizes
  • Multiple concurrent PHP requests can be handled simultaneously
  • RAM usage is predictable as a product of the number of users on the host
  • Server is better secured against attacks from the inside since PHP processes run as the user who owns the script
  • Resource usage can be monitored since each user has a separate PHP process
  • A PHP crash doesn't mean an Apache crash. If a PHP process crashes it is restarted automatically.
Disadvantages
  • The process manager built into mod_fastcgi isn't used. One of the motivations behind mod_fcgid was to improve upon the process manager in mod_fastcgi.
  • The newer mod_fcgid cannot be used as it will only send one request at a time to the PHP process, thus multiple requests can't be handled simultaneously
  • Maintaining separate opcode caches per user uses a considerably larger amount of RAM than a single opcode cache used with mod_php
  • Users cannot alter php.ini files
  • If a PHP script crashes it has potential to take down all of the PHP requests currently being processed for that user

Performance

In my next post I'll try to cover RAM usage, performance, benchmarks, compatibility and best practices. This post is already way too long; I'm surprised you are even still reading it!

Stay tuned for more information on using FastCGI with a PHP APC opcode cache. In the meantime let me know what you think of this approach. Have you tried it? Know of a better way? Found any bugs or problems? Leave a comment below!


 
 
