Friday, April 17, 2009

set global innodb fast=true;

By: Raj Thukral
So you ran into some basic limitations with MyISAM when your site got busier. Even single row updates would lock the whole table and slow things down to a crawl. Then you updated to InnoDB to get the benefit of row-level locking, but now the site is even slower than before. What gives?

Well, we had exactly the same issue with one of our clients, a large online community site for models and photographers. Updates were killing the system under MyISAM, and converting the tables to InnoDB got rid of the table locks but made the site slower than ever. The buffer pool hit ratio was 1000/1000 and there were no obvious bottlenecks, yet everything was excruciatingly slow.

Here’s what was happening: MySQL sets autocommit=1 by default for every connection. That isn’t so bad for MyISAM, since its writes aren’t guaranteed to disk anyway, but InnoDB has to flush to disk for each implicit commit. That means every insert/update/delete on an InnoDB table results in a flush to disk.

What’s so bad about flushes to disk? Nothing at all. They ensure that any commit is guaranteed to be there when you restart the database after a crash. The only penalty for that is the physical speed of the commit. Since the disk has to write the data to a platter before the commit can return, it can take a while to do.

Assuming even an average seek time of 9ms per disk write, you’re limited to roughly 67 commits/sec [1], which is really slow. And while the disk is busy seeking to the sector to be written, it is not doing reads for you. In effect, things just hang. InnoDB may work around part of this limitation by grouping some commits together, but even so, the ceiling still exists, just at a slightly higher commit rate.
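The ceiling is simple arithmetic. A quick sketch — the 9 ms seek comes from the post; the extra 6 ms for rotational latency and transfer time is an assumption added here to round out a ~15 ms total per synchronous write:

```python
# Back-of-the-envelope ceiling on synchronous commits for a single disk.
# seek_ms is from the post; the 6 ms of rotational latency + transfer
# time is an illustrative assumption, not a measured figure.
seek_ms = 9.0
rotational_and_transfer_ms = 6.0  # assumed
commit_ms = seek_ms + rotational_and_transfer_ms
max_commits_per_sec = round(1000 / commit_ms)
print(max_commits_per_sec)  # 67
```

Any real number depends on the hardware (see footnote 1), but whatever the per-write latency is, its reciprocal is your hard commit ceiling.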

So what are the workarounds? Putting the InnoDB log files on a dedicated disk might help some: the frequent flushes go to a different disk, so they won’t block selects against the data disk [2]. However, the physical limit on the number of flushes you can do still remains. A battery-backed write cache on the RAID controller can help as well, but that may be overkill for what you want to do: those cards can be expensive, and eventually they too have to flush to disk.
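The dedicated-log-disk workaround is just a path change in my.cnf. A sketch — the mount points here are made-up examples, not the client’s actual layout:

```ini
# my.cnf sketch: redo logs on their own spindle, data elsewhere
# (paths are illustrative assumptions)
innodb_data_home_dir      = /var/lib/mysql/data
innodb_log_group_home_dir = /mnt/logdisk/mysql-logs
```

Commits only touch the log files (see footnote 2), so moving just the logs is enough to keep the flush traffic off the data disk.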

So let’s go back to why we went to InnoDB in the first place: we wanted to stop the database from doing table-level locks. Data integrity on a crash was never guaranteed with MyISAM, and it was not the reason we were switching to InnoDB anyway. So while a commit guaranteed to disk is nice to have, it’s not essential for this particular application. In the rare event of a database crash, we are willing to lose a few transactions.

Remember, this client is not a financial site, so the parameter of interest becomes innodb_flush_log_at_trx_commit. Setting this to 0 means that InnoDB no longer flushes every commit to disk. It still does a flush roughly every second anyway, so in the worst case you may lose about one second’s worth of data. But on the other side, your disk is now freed up to do reads for you at full speed. The result — the site is fast again. Or, in the words of a user on the site: “OMFG someone fed the Hamster…the wheel is spinning again!”
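The change itself is one line. On a running server you can flip it dynamically (a sketch — verify the variable is dynamic on your MySQL version before relying on this):

```sql
-- Trade up to ~1 s of durability for commit throughput.
-- Do NOT do this for financial or other must-not-lose data.
SET GLOBAL innodb_flush_log_at_trx_commit = 0;
```

Put the same line in my.cnf to make it survive a restart.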

1. This figure is illustrative only; it depends on disk speed, disk layout, RAID controller, etc. The point is that eventually you run into a commit-rate limit.

2. (Commits are not to the datafiles, only to the logfiles; datafiles only ever get written at a checkpoint… but that’s for another blog — look up the InnoDB double write buffer if you’re impatient!)

Jun 21, 2007
Category: Group Blog Posts, MySQL

2 Responses
Xaprb says:
June 21, 2007 at 1:11 pm
A better “workaround” might be to disable AutoCommit and group logically related statements together in transactions, rather than letting InnoDB arbitrarily decide which bits of work are committed together as a unit. Unless you really are doing a bunch of rapid-fire single-statement transactions, in which case yeah, you’ve hit the ceiling of your commit rate and that won’t help. Another good solution is to use a RAID controller with a battery-backed write cache and let it group flushes together from the cache.

$conn->query("SET autocommit=0");
$conn->query("INSERT INTO table_name (id, name) VALUES (1, 'myname')");
$conn->query("COMMIT");
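The same grouping idea, sketched with Python’s stdlib sqlite3 purely as a stand-in for MySQL/InnoDB (the table and row data are made up for illustration) — the principle is identical: one explicit transaction means one commit, and therefore one flush, instead of one per statement:

```python
import sqlite3

# sqlite3 used only to illustrate the principle; with MySQL you would
# do the same via SET autocommit=0 ... COMMIT or START TRANSACTION.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photos (id INTEGER PRIMARY KEY, name TEXT)")

rows = [(i, f"photo-{i}") for i in range(1000)]

# One transaction, one commit -> one flush, instead of 1000.
with conn:  # opens a transaction; commits on success, rolls back on error
    conn.executemany("INSERT INTO photos (id, name) VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM photos").fetchone()[0]
print(count)  # 1000
```

This is exactly the case Xaprb describes: if your writes are logically related, batching them recovers most of the lost throughput without giving up durability.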

Raj Thukral suggests setting the innodb_flush_log_at_trx_commit parameter in my.cnf to 0, thereby causing InnoDB to flush commits to disk once a second rather than with every commit. This will speed up InnoDB when frequent commits are interfering with other activity.

He does make the point that using this really depends on whether you are comfortable with losing one second’s worth of data or not.

I am not sure I have that level of intestinal fortitude, and I have had a few bad experiences with 0, so I prefer 1.

I would nonetheless strongly recommend reading the documentation on InnoDB’s parameters, specifically on innodb_flush_log_at_trx_commit:

When innodb_flush_log_at_trx_commit is set to 0, the log buffer is written out to the log file once per second and the flush to disk operation is performed on the log file, but nothing is done at a transaction commit. When this value is 1 (the default), the log buffer is written out to the log file at each transaction commit and the flush to disk operation is performed on the log file. When set to 2, the log buffer is written out to the file at each commit, but the flush to disk operation is not performed on it. However, the flushing on the log file takes place once per second also when the value is 2. Note that the once-per-second flushing is not 100% guaranteed to happen every second, due to process scheduling issues.

The default value of this variable is 1, which is the value that is required for ACID compliance. You can achieve better performance by setting the value different from 1, but then you can lose at most one second worth of transactions in a crash. If you set the value to 0, then any mysqld process crash can erase the last second of transactions. If you set the value to 2, then only an operating system crash or a power outage can erase the last second of transactions. However, InnoDB’s crash recovery is not affected and thus crash recovery does work regardless of the value. Note that many operating systems and some disk hardware fool the flush-to-disk operation. They may tell mysqld that the flush has taken place, even though it has not. Then the durability of transactions is not guaranteed even with the setting 1, and in the worst case a power outage can even corrupt the InnoDB database. Using a battery-backed disk cache in the SCSI disk controller or in the disk itself speeds up file flushes, and makes the operation safer. You can also try using the Unix command hdparm to disable the caching of disk writes in hardware caches, or use some other command specific to the hardware vendor.

Note: For the greatest possible durability and consistency in a replication setup using InnoDB with transactions, you should use innodb_flush_log_at_trx_commit=1 and sync_binlog=1 in your master server my.cnf file.

I did spend some time with MySQL tech support on this and came up with the following as an optimal configuration for InnoDB for a large installation:

# InnoDB options, see:

# Data directory, and data file
innodb_data_home_dir = /var/lib/mysql/data
innodb_data_file_path = ibdata1:10M:autoextend

# Use one file per table
innodb_file_per_table = 1

# Buffer pool size
#innodb_buffer_pool_size = 1G
innodb_buffer_pool_size = 10G
innodb_additional_mem_pool_size = 32M

# Transaction log location and sizes
innodb_log_group_home_dir = /var/lib/mysql/logs
innodb_log_files_in_group = 4
#innodb_log_file_size = 50M
innodb_log_file_size = 512M
innodb_log_buffer_size = 8M

# Maximum percentage of dirty (unwritten) pages in the buffer pool
innodb_max_dirty_pages_pct = 80

# Transaction commit policy
innodb_flush_log_at_trx_commit = 1

# Timeout to wait for a lock before rolling back a transaction
innodb_lock_wait_timeout = 50

# Flush method
innodb_flush_method = O_DIRECT

# Number of concurrent threads to run
innodb_thread_concurrency = 32

# Prevent extra locking, we can only use this if we have row
# level replication, see ‘binlog_format’

# Enable fast innodb shutdown (skip full purge and insert buffer merge)
innodb_fast_shutdown = 1
#innodb_fast_shutdown = 0

# Don't delay insert, update and delete operations when purge
# operations are lagging
innodb_max_purge_lag = 0
#innodb_max_purge_lag = 1

# Force recovery mode, do not mess with unless you really know
# what you are doing, see:
#innodb_force_recovery = 4
