Tuesday, October 20, 2009

Enable MySQL Compression

garym@teledyn.com - October 22, 2004 - 00:55
Project: Drupal
Version: 7.x-dev
Component: database system
Category: feature request
Priority: normal
Assigned: Unassigned
Status: postponed

Description

A tiny change to the MySQL connect call can result in roughly twice the overall speed on connections; I went from an average of 2.7 sec per cached homepage to just over 1.4 sec by simply adding this switch.

Apparently MySQL has had this MYSQL_CLIENT_COMPRESS flag for a long time, but yesterday was the first I'd ever heard of it.
Attachment Size
mysql-compress.patch 1.19 KB
#1
killes@www.drop.org - October 22, 2004 - 09:37

It is not clear to me why this should help. Most of us do run mysql and apache on the same server. Can you explain that to me?
#2
garym@teledyn.com - October 24, 2004 - 01:23

I'm not completely hip to the inner workings of unix sockets, but I do know that a socket is a file handle and I know MySQL has a hard limit on the number of connections; any change that minimizes the length of a transaction means you tie up a precious file-handle for less time.

Is it fair to say "Most of us do run mysql and apache on the same server"? I run 7 Drupals myself, all of them on a webhost who locates MySQL on a dedicated machine, and I admin on 3 Drupals for clients, all of whom have MySQL on other machines, either for the same webhosts-rules reason, or in one case because they have large dreams.

Also, when a site becomes popular, moving MySQL to another machine is the easiest way for Drupal to be served over a multi-headed cluster (i.e. two Drupal servers fed from a third MySQL server). The patch allows Drupal to scale.

The patch, so far as I know, is at worst benign on a single host; it should still show a performance boost, at the cost of more CPU consumption, due to less traffic down the interprocess pipes. It appears to boost the page speed on my dev server, but this machine is too under-powered and otherwise too loaded to really say by how much.

At the very least, as with the pconnect advice, perhaps this connect flag option should be mentioned in the comments.
#3
bertboerland@ww... - October 24, 2004 - 12:29

I'll have to say that anything that makes it easier and faster to run Drupal on a two- (or three-) tier setup makes it better. Indeed, for most users with relatively small sites it will be enough to have the web server and database server on the same host (along with lots of other processes). But for serious hosting, scaling out is the easiest way of taking load.

Scaling out means having more hosts instead of faster hosts. It's easier for operations: you can take one web server offline while the farm, with a load balancer in front of it, still handles all requests.

If Drupal wants to grow, layout, content and logic must be more separable, also in hardware. So I am for optional compression from the web layer towards the database layer, and hope it makes it into the code.
#4
Anonymous - October 25, 2004 - 15:29

It is seriously unclear to me why this would ever be a good idea except on very slow connections between web server and database. And if you have a very slow connection, you shouldn't be trying to run a cluster of HTTP servers for your site, as scaling the web end of it will be the least of your worries.

It should be clear to everyone that provided the connection is quick enough (and 100/1000Mbit ethernet should be plenty) compression will merely slow things down - it's just extra overhead for both the MySQL server and the PHP client - you still need to pass as much data around, you're just adding the overhead of compressing and decompressing it when you shouldn't need to, because the link in the middle is plenty fast enough to cope with it being uncompressed. A cached page is unlikely to be more than about 100Kb of HTML. Pulling 100Kb over a 100Mbit link should take you about 1/100th of a second. This is seriously unlikely to be your bottleneck. :-)
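A quick back-of-the-envelope check of the arithmetic above, sketched in Python for illustration (the 100 KB page size is the commenter's assumption, not a measured figure):

```python
# Moving ~100 KB of HTML across a 100 Mbit/s link takes on the order of
# a hundredth of a second, so on a healthy LAN the wire is rarely the
# bottleneck for a single cached page.
page_bytes = 100 * 1024        # ~100 KB cached page (assumed size)
link_bits_per_sec = 100e6      # 100 Mbit/s Ethernet
transfer_sec = page_bytes * 8 / link_bits_per_sec
print(f"{transfer_sec * 1000:.1f} ms")  # about 8 ms
```

Of course this ignores protocol overhead, retransmissions, and contention, which is exactly where later comments suggest the real-world difference comes from.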

If you're running the MySQL and HTTP processes on the same box then you will be compressing/decompressing stuff over what is effectively an infinite bandwidth pipe, which is extremely stupid because it will always slow things down. If you benchmark this and genuinely find that there is a speed increase in this case I would be astonished.

If you have quick servers that are idle enough that they can compress things faster than they can send them over ethernet, and the bottleneck really is a 100Mbit connection in the middle, then we have a serious issue with the amount of stuff Drupal is pulling off the database and should look into that. Additionally, your servers evidently won't be under enough load for page generation to take more than a few ms anyway.
#5
robertDouglass - October 25, 2004 - 15:37

There is not really much to discuss here, only things to benchmark. It is clear that this needs to be benchmarked for at least two cases: MySQL on the same box and MySQL on a different box. I listen to benchmarks much more closely than I listen to arguments why something may or may not be better than something else. The only question is, who is going to do the benchmarks and how? I am not volunteering as I don't have the right setup to test both cases objectively.
#6
Anonymous - October 25, 2004 - 16:03

I think this discussion is fairly irrelevant, as would be most benchmarks.

Let me be a little more specific here.

A) Any person with enough brain cells left alive should be able to crack open the code, and add the 'MYSQL_CLIENT_COMPRESS' parameter to the end of the mysql_connect function; if you’re already optimizing, you will know about the existence of the feature since you’ve read the documentation.

B) Compression is a tradeoff between transfer time (size of data being transferred) and CPU usage (the cycles spent on compressing and decompressing the data), the compression is not free; even if it looks like a good idea when loading *one* page (due to less data exchanged, and much CPU available), it will not be a good idea when the CPU is loaded like hell on a busy server.
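The tradeoff described in (B) is easy to see with a small sketch, here in Python's zlib for illustration only (the payload and compression level are arbitrary choices, not anything Drupal or MySQL actually does): the bytes on the wire shrink a great deal, but every request pays the compress/decompress CPU cost on both ends.

```python
import zlib

# A repetitive HTML-like payload compresses very well, trading CPU
# cycles for a much smaller transfer.
payload = b"<div class='node'><p>Lorem ipsum dolor sit amet</p></div>\n" * 2000
compressed = zlib.compress(payload, 6)
assert zlib.decompress(compressed) == payload  # lossless round trip
print(f"{len(payload)} -> {len(compressed)} bytes")
```

Whether that trade wins depends entirely on whether the link or the CPU is the scarcer resource, which is the whole argument of this thread.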

C) As far as the unix sockets and blocking argument in the thread, I'd say it's mostly false, and in any case should be resolved by using persistent connections (mysql_pconnect) rather than by trying to 'shorten the time we spend connected to the socket'.

-rL
#8
garym@teledyn.com - October 26, 2004 - 04:08

Hmmm ... I certainly have 100Mbit between these Solaris servers, and a physical distance of only a few meters. While I'm not privy to the mysql server, the webserver generally runs pretty warm, with unix load generally at least 3 and often between 10 and 20. And yet, using the Apache Benchmark on the live loaded server (which is also seeing other traffic), I clearly see a 100% increase in speed (i.e., pages generated in half the time or better) with just this one change.

Curiously also, prior to adding this flag, users complained of "click doing nothing" by which I assume they mean they click on a link and the browser's busy-icon whirrs a bit, and then stops because the connection timed out or was aborted somewhere along the path. With just this one switch change the frequency of these no-action clicks dropped from about three out of four to less than one in twenty. Monitoring the load (using 'top') I also would normally see the baseline server load of 1+ climb to 5+ when I enable my drupal path, but with this flag the load only increased to 3+ ... this last result could simply be coincidence as my site was often assailed by periodic swarms of both spammers and RSS-hounds.

Given what's posted above, I've insufficient brain cells to explain why all this performance improvement should be so.

But again, I'm no expert and these are not controlled laboratory benchmarks, just an observed change in behaviour on a production server under live-load, real-world conditions, and I can only really say that it worked for me. pconnect, on the other hand, and while counter-intuitive to its definition, made matters dramatically worse, leaving open file handles scattered in all directions, eventually causing the servers to completely seize up and reboot. My webhost still hates me for that little experiment ...
#9
Anonymous - October 26, 2004 - 14:30

I suspect these results are specific to your scenario; the supposed performance improvement is probably a result of an 'unexpected' interaction in your specific setup. It could be an indication of network problems (and thus, for example, the compression results in less data transferred, thus fewer retransmissions, thus seemingly better performance, etc.), or of some artifacts created by your operating system (leaky management of sockets, file handles, memory, or that army of small gray elves who move the bits and bytes across the busses) which MYSQL_CLIENT_COMPRESS happens, as a side effect, to make disappear.

With all that said, if it works for you - and if you lack the desire/time/resources to reproduce this in a clean setup with no unknowns to get to the bottom of this - then well, you found a clever hack which works in your situation! Keep that engineering spirit alive!

This however is not a reason to implement this as part of the default Drupal setup. The outcome, although positive for your scenario, still defies "the way things work" (TM) in more ways than one.

Imho, this should be left at that, and can serve people in the future who stumble across this thread in search of 'things I can try to make things better'.

- rL
#10
Steven - December 24, 2004 - 06:37

This isn't going to be patched in, so marking as won'tfix.
#11
kbahey - December 22, 2007 - 05:21
Version: x.y.z » 7.x-dev
Status: won't fix » needs review

I ran into a situation where a client who runs MySQL and PHP on different servers was experiencing long-running queries. Namely, some cache queries took hundreds of milliseconds. The link between the servers was misconfigured (100 Mbps instead of 1000 Mbps), and this did show up when a lot of data was sent from the server to the client.
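Some rough numbers for this scenario, sketched in Python for illustration (the 500 KB cache-row size is an assumed figure, not from the report): raw transfer alone is ten times slower on the misconfigured link, before counting retransmissions or contention.

```python
# Transfer time for one large serialized cache row over the
# misconfigured 100 Mbps link versus the intended 1000 Mbps one.
row_bytes = 500 * 1024  # assumed row size for illustration
for mbps in (100, 1000):
    ms = row_bytes * 8 / (mbps * 1e6) * 1000
    print(f"{mbps} Mbps: {ms:.0f} ms")
```

Tens of milliseconds per row on the slow link, multiplied across the many cache reads in a page request, is consistent with the hundreds-of-milliseconds queries observed.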

Perhaps this would not go into core as is, but:

1. We can have a settings.php flag to say compress/don't compress, and those who need it can turn it on.

2. Others can find it useful as is and apply the patch to their setup.

Here is the patch. It is just a one liner and a few comments.

Index: includes/database.mysql.inc
===================================================================
RCS file: /cvs/drupal/drupal/includes/database.mysql.inc,v
retrieving revision 1.85
diff -u -r1.85 database.mysql.inc
--- includes/database.mysql.inc 19 Dec 2007 13:03:16 -0000 1.85
+++ includes/database.mysql.inc 22 Dec 2007 05:13:05 -0000
@@ -77,9 +77,12 @@
// mysql_connect() was called before with the same parameters.
// This is important if you are using two databases on the same
// server.
- // - 2 means CLIENT_FOUND_ROWS: return the number of found
+ // - MYSQL_CLIENT_FOUND_ROWS: return the number of found
// (matched) rows, not the number of affected rows.
- $connection = @mysql_connect($url['host'], $url['user'], $url['pass'], TRUE, 2);
+ // - MYSQL_CLIENT_COMPRESS: compress the data sent from MySQL server
+ // to the client. This can speed things when MySQL and PHP are on
+ // different servers and have a relatively slow link.
+ $connection = @mysql_connect($url['host'], $url['user'], $url['pass'], TRUE, MYSQL_CLIENT_FOUND_ROWS|MYSQL_CLIENT_COMPRESS);
if (!$connection || !mysql_select_db(substr($url['path'], 1))) {
// Show error screen otherwise
_db_error_page(mysql_error());
#12
KarenS - December 22, 2007 - 17:50

subscribing.

I am on a shared host that has the mysql server on a different machine than the web server and I have no information or control over how that is configured, probably not an uncommon scenario. And I also am plagued with sporadic timeout problems, mostly in some administrative pages that load lots of stuff and in places where big caches are being retrieved or saved (like the Views and CCK caches).

I had no idea this might be my problem or that this was something I could control, so at least documenting it would be beneficial if it sometimes helps. When I get a chance (won't be right now with the holidays) I'll try this out on my setup.
#13
birdmanx35 - February 19, 2008 - 18:25

If this has performance benefits, it's worth talking about.
#14
fysa - May 13, 2008 - 22:59

subscribed, awaiting adventurous benchmarkee
#15
fysa - May 14, 2008 - 02:00

FYI: 11.11 requests per second -> 7.77 requests per second with both apache2 and mysql on the same 5.7 box. ;)
#16
Crell - August 21, 2008 - 22:40
Status: needs review » needs work

The patch is rather guaranteed not to apply now. :-)

However, with the new array-based DB configuration it should be possible to add support for a "compressed" => TRUE flag that only MySQL pays attention to. If it's set, run the code to enable compression. If not, don't.

If there is sometimes a benefit and sometimes not, a documented toggle sounds like the best way to go.
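The suggested toggle could look something like the following sketch, written in Python purely for illustration (the function name and settings shape are hypothetical; the two constants are the real CLIENT_FOUND_ROWS and CLIENT_COMPRESS capability values from the MySQL client/server protocol):

```python
# Map an optional 'compress' key in the connection settings onto the
# MySQL client flags, leaving compression off by default.
MYSQL_CLIENT_FOUND_ROWS = 2   # CLIENT_FOUND_ROWS in the MySQL protocol
MYSQL_CLIENT_COMPRESS = 32    # CLIENT_COMPRESS in the MySQL protocol

def connect_flags(settings):
    flags = MYSQL_CLIENT_FOUND_ROWS
    if settings.get('compress'):
        flags |= MYSQL_CLIENT_COMPRESS
    return flags

print(connect_flags({}))                  # 2
print(connect_flags({'compress': True}))  # 34
```

The point of the design is that only the MySQL driver inspects the key; other database backends simply ignore it.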
#17
chx - August 22, 2008 - 08:56

Until http://bugs.php.net/bug.php?id=44135&edit=1 is fixed, I doubt this one can be done.
#18
ajayg - January 4, 2009 - 05:40

For whatever it is worth: this small change significantly helped performance, as documented here: http://groups.drupal.org/node/17913. Using Drupal 5.12.
#19
Dave Reid - January 4, 2009 - 07:26
Status: needs work » postponed

I like the idea, but I've confirmed there's currently no way to enable MySQL compression with PDO, so marking as postponed.
#20
kbahey - January 4, 2009 - 15:34

Adding link where it says it is not supported.

Bummer ...
#21
ajayg - January 4, 2009 - 18:07

Hmm. It worked fine for mysqli and mysql. So what is PDO? (sorry for a newbie question but I have always used either mysql or mysqli with drupal).
#22
kbahey - January 4, 2009 - 18:33

PDO is the new database layer for Drupal 7.x.
#23
mrfelton - September 19, 2009 - 14:08

I too would like to see this benchmarked and tested.
