* Suicide
o having no backups
o depending on slaves for backup
o keeping backups on same SAN
o having a single DBA - Frank didn't like this one at all
o not keeping binlogs
* Restoring from backup
o how much time?
o uncompressed backup ready to mount?
o separate network for recovery?
* At Fotolog, 1TB of data was severely hit.
o first problem: backup was highly compressed (tar.gz)
o uncompressing took hours
o so keep uncompressed backups (at least for the last N days)
o the backup should be mountable in place, rather than something that has to be transferred first
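To make the "how much time?" question concrete, here is a back-of-the-envelope sketch. The function name `restore_hours` and the throughput figures are my own assumptions for illustration, not numbers from the talk. It also assumes the copy and the decompression run one after the other and that both rates are measured against the uncompressed size.

```python
def restore_hours(size_gb, transfer_mb_s=None, decompress_mb_s=None):
    """Rough estimate, in hours, of the time to get a backup into place.

    size_gb         -- size of the uncompressed data set in GB
    transfer_mb_s   -- network copy rate in MB/s, or None if the backup
                       can simply be mounted where it is
    decompress_mb_s -- decompression output rate in MB/s, or None if the
                       backup is stored uncompressed
    """
    size_mb = size_gb * 1024
    seconds = 0.0
    if transfer_mb_s:
        seconds += size_mb / transfer_mb_s
    if decompress_mb_s:
        seconds += size_mb / decompress_mb_s
    return seconds / 3600


# Hypothetical rates: 1TB copied at 100 MB/s, then unpacked at 50 MB/s.
print(round(restore_hours(1024, transfer_mb_s=100, decompress_mb_s=50), 1))  # 8.7
# Uncompressed and mountable: neither step applies.
print(restore_hours(1024))  # 0.0
```

Measure the real rates on your own hardware. The point is only that both terms disappear when the backup is uncompressed and mountable.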
* Frank goes over the InnoDB forced recovery modes documented at http://dev.mysql.com/doc/refman/5.0/en/forcing-recovery.html
* Row by row recovery
o get the range of ids, then recover one row at a time
o custom scripts
o may not be able to use the primary key
o foreign key based retrieval is faster
o lose 4 seconds for each crashed record (at Fotolog, for some reason some values were crashing mysqld)
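The talk did not show the actual scripts, so the following is only a minimal sketch of the idea: read a range of ids, and when a read fails, split the range until the damaged ids are isolated. `salvage` and `fetch_range` are hypothetical names. In practice `fetch_range` would run a SELECT over the id range and raise when the server dies, and the script would wait for mysqld to come back before retrying. Here a fake fetcher stands in for the database.

```python
def salvage(fetch_range, lo, hi):
    """Recover rows with ids in [lo, hi], isolating ids that cannot be read.

    fetch_range(start, end) is a hypothetical callable that returns the rows
    for that inclusive id range, or raises when the read fails (for example
    because the server crashed on a damaged record).
    Returns (rows, bad_ids).
    """
    rows, bad_ids = [], []
    pending = [(lo, hi)]
    while pending:
        start, end = pending.pop()
        try:
            rows.extend(fetch_range(start, end))
        except Exception:
            if start == end:
                bad_ids.append(start)      # a single unreadable id: record it and move on
            else:
                mid = (start + end) // 2   # split the range and retry each half
                pending.append((mid + 1, end))
                pending.append((start, mid))
    return rows, sorted(bad_ids)


def fake_fetch(start, end, damaged=frozenset({3, 7})):
    """Stand-in for a real query: fails if the range touches a damaged id."""
    if any(i in damaged for i in range(start, end + 1)):
        raise RuntimeError("simulated crash")
    return [(i, "row %d" % i) for i in range(start, end + 1)]


rows, bad = salvage(fake_fetch, 1, 10)
print([r[0] for r in rows])  # [1, 2, 4, 5, 6, 8, 9, 10]
print(bad)                   # [3, 7]
```

Splitting ranges keeps the number of failed reads low when damage is rare. That matters if every failure costs a few seconds of server restart.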
* Lessons
o SANs make sense (in some environments)
o try to replicate the whole SAN (at Fotolog, a SAN actually failed because of a bug in its maintenance program)
o everything will fail at some point
o back up everything (cron jobs, my.cnf, custom scripts)
o keep backups in a form that is ready to restore
o don't count replication as a backup
o be worried about 'routine' operations
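As a small illustration of "back up everything" in a ready-to-restore form, here is a sketch that collects configuration files into an uncompressed tar archive. The function name `archive_configs` and the example paths are assumptions of mine, not something from the talk.

```python
import os
import tarfile


def archive_configs(paths, archive_path):
    """Write the given files into an uncompressed tar archive.

    Paths that do not exist are skipped and returned, so they can be
    reported instead of being silently ignored.
    """
    missing = []
    with tarfile.open(archive_path, "w") as tar:  # mode "w" means no compression
        for path in paths:
            if os.path.exists(path):
                tar.add(path)
            else:
                missing.append(path)
    return missing


# Hypothetical usage; adjust the list to wherever your files actually live:
# missing = archive_configs(["/etc/my.cnf", "/etc/crontab"], "/backup/configs.tar")
# if missing:
#     print("not backed up:", missing)
```

Returning the missing paths is deliberate: a backup job that quietly skips a file is one of those 'routine' operations to be worried about.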
* Peter Zaitsev of Percona takes the stage to talk about his homegrown tools for InnoDB recovery
o innodb-tools - can recover data even if mysqld won't start, for example if half of a RAID0 fails or somebody deleted some data. It recovers from the InnoDB tablespaces.
* We're out of time
● ● ●
Artem Russakovskii is a San Francisco programmer, blogger, and future millionaire (that last part is in the works). Follow Artem on Twitter (@ArtemR) or subscribe to the RSS feed.
Sunday, July 5, 2009