Wednesday, March 25, 2009

[support] MyISAM vs InnoDB

[support] MyISAM vs InnoDB
Kieran Lal kieran at acquia.com
Sat Mar 21 20:25:29 UTC 2009

Previous message: [support] MyISAM vs InnoDB
Next message: [support] MyISAM vs InnoDB
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

--------------------------------------------------------------------------------

On Sat, Mar 21, 2009 at 1:16 PM, Daniel Carrera
wrote:
> Kieran Lal wrote:
>>
>> Tell us more about your site. Number of nodes, users, traffic. What
>> kind of hosting environment, etc?
>>
>> InnoDB is safer, but consumes more resources.
>
>
> Hmm... let's see... 7,152 nodes, 4,300 users, and if I read the monthly log
> correctly, it seems like it averages 15,300 hits/day with a peak of 18,000
> hits/day on wednesdays. But I think that might include all the files that go
> into making a web page (CSS, logo, etc).
>
> We have a shared hosting environment, of the type that doesn't over-sell
> disk space, so it is somewhat intermediate between a typical shared hosting
> and a dedicated host.
>
> The servers run RHEL, PHP 5.2.8 and MySQL 5.1.30.
>
> Uhm... I can't think of anything else I can say about the site. Does any of
> this help answer my question?

MySQL ISAM engine is lightweight, and not transactional, but has been
know to corrupt, although I think it's much more stable in MySQL 5.1.
For high performance, and transaction integrity many people move their
database engine to InnoDB.

InnoDB uses ~5x the disk space and ~3x more memory. It can be faster,
and more reliable. But the down side is that if big tables like node,
watchdog are InnoDB you can run into problems. Based on your site
size, I don't forsee moving your entire site to InnoDB to be a
problem.

Here's a Drupal performance tuning checklist:
http://tag1consulting.com/files/checklist.pdf

If you see a slowdown, let me know and I'll help you out.

Cheers,
Kieran

>
> Daniel.
> --
> [ Drupal support list | http://lists.drupal.org/ ]


I use InnoDB myself. I suggest the innodb_file_per_table configuration
option to help control the size issue. If you create a table and then
remove it this option will return all the disk space, otherwise you
continue to use it.

--
Earnie http://r-feed.com
Make a Drupal difference and review core patches.


It really depends on what you need to do. If you don't know it I'd
go for MyISAM since most of the features you may be looking for in
InnoDB are not really fully exploited by stock Drupal.

The reasons that may make you chose InnoDB may be the same to
consider to add to the list PostgreSQL, but it really depends on
what is your specific need.

Consider that anyway PostgreSQL support is (was) not as good as
MySQL, anyway I never had serious issues.

The main annoyance was waiting some patch got included into core and
having to re-patch every time a new point release comes out.

But in my experience the problems you may run into can be patched in
5 min.

Summing it up:

Drupal doesn't have support for transaction or referential integrity
but you may need it if you're developing your own modules.
If you've several writes or high concurrency and data integrity is
important PostgreSQL may be an interesting option.

If you've a lot of read, and you don't mind having to deal with
some inconsistency , MyISAM and replication may be fine.

I think InnoDB is "in between" without shining for any aspect.
If I were looking for transactions and data integrity I'd take the
InnoDB route just if I already invested a lot in MySQL.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it


I just wrote another post that explains my hosting environment. I am
interested in speed and reliability (isn't everybody?). My understanding
is that:

1) InnoDB is more reliable.
2) InnoDB reads faster, if the primary keys and indexes are set right.
3) InnoDB writes slower.
4) InnoDB requires more disk space and more RAM.

Am I right so far? I'm not very familiar with this topic.

I'm not very concerned about disk space. I'm happy to make the DB 20%
bigger if it'll improve read speed and reliability. But I don't fully
understand the implications of InnoDB vs MyISAM, so I thought I'd ask here.


> The reasons that may make you chose InnoDB may be the same to
> consider to add to the list PostgreSQL, but it really depends on
> what is your specific need.

I thought that the main benefit of PostgreSQL was advanced features
(procedures, triggers, etc). I don't use any of those. I'm just looking
for speed an reliability.
================

PostgreSQL is a way better than MySQL, about all topics.
The only thing on which MySQL could be better, is maybe it's faster on
read operations on small databases.
But, in fact, PostgreSQL can handle correctly really BIG databases, with
a really big amount of data, and is easier to cluster.
PostgreSQL is more stable, secure, and a more efficient and flawless
with stocked procedures and triggers.

If you have to choose between MyISAM and InnoDB, if I were you, I'd
choose InnoDB, it's a lot more stable. MyISAM may be faster and eat less
disk space, but you may have really bad surprises with your data
integrity.

In the company I work for, we used to make a MySQL intensive use, the
first monthes, we abandon MyISAM for a lot of obvious reasons. And right
now, we are abandoning MySQL in flavor of PostgreSQL.

We still use MySQL for Drupal, because a lot of module developpers still
can't write standard SQL, but, if all modules were PostgreSQL aware, we
would not use MySQL at all.

This a my opinion, from my experience, I may be wrong.

Do the right choice, don't do IT :)

============

It's true, PostgreSQL really has a lot of cool features. But it's also
true that it provides a fully ACID (Atomicity, Consistency, Isolation,
Durability) compliant database, which MySQL does not until the future 6
release.
In PG database, every request, even a single simple one, is executed in
a transactionnal context. This is also true with triggers and
procedures.

I quote "PostgreSQL is fully ACID-compliant, while MySQL's InnoDB
storage engine provides engine-level ACID-compliance" (see
http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL#ACID_Compliance).

Even with clustering, PostgreSQL seems to be a lot more powerfull, since
you can use cross databases constraints between tables from more than
one PG instance. You can also do table partitioning, and a lot of other
cool features to help you when you start having *A LOT* of data :)
=================

MyISAM, InnoDB and PostgreSQL perform differently on different
situation.
Ease of development and safety of data may have an indirect impact
on speed as well.

For some tasks MyISAM is reliable enough and "failure" won't impact
the "average" speed. Once you add replication it may become your
best solution.

I think that outside that scenario the only 2 good reason to chose
InnoDB over PostgreSQL are:
- you invested a lot in MySQL (your staff is made of MySQL black
belt)
- you want to lower as much as possible the cost of maintaining
drupal sites and still data integrity and frequent writes aren't
your first priority but you're starting to feel the pain of
MyISAM. In my view most of the time this is going to be a
"diplomatic" choice... still not a "we seriously would like a bit
more data integrity/concurrency but we can't afford PostgreSQL"
choice. Once you're in the territory of "a bit more data integrity
and concurrency" you've already passed the threshold that will
make maintaining MySQL cheaper.

So most of the times the real reason to keep using InnoDB vs.
PostgreSQL is your knowledge of the tool.

In most if not all the situations where you may chose InnoDB over
MyISAM just on technical merits, PostgreSQL would be a better
candidate... since there are other constraint that may influence
your choice you may still prefer InnoDB.

And... as Michael Prasuhn wrote most of the sites don't have to deal
with the problems InnoDB try to solve and when they run into them
they have built up so much (too much?) MySQL knowledge and invested
so much in MySQL tuning/coding that their best move is to switch to
InnoDB.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it

=================
A lot of confusion here, from others, not just Daniel. Overall the
biggest benefit is that InnoDB supports row-level locking versus table-
locking on writes. For high traffic sites this can make some obvious
benefits for tables like access logs, sessions, users, comments,
node_comment_statistics and the like.

The downside is that for reading large amounts of simple data, InnoDB
is slower than MyISAM. In fact MyISAM is probably one of the fastest
engines out there in terms of reading data.

In short, unless you know you have large waits doing common tasks such
as insert comments when the load on the site is high, or you have a
way to examine how many connections are waiting on table locks before
completion, you are probably best off sticking with MyISAM.

Also, while InnoDB is widely available these days, you may still run
into hosts that don't support it, but do support MyISAM.


-Mike
============
I don't think speed is really the problem here. There are a lot of wide
architectures that uses PostgreSQL or other DBMS's under very high load.
There are now a lot of solution to support high loads, and most of them
are based on redondancy and large amount of memory. Use a fast DBMS, ok,
but use a REAL DBMS, they are, in fact, fast enough.

The fact MyISAM reads fast, is the concequencies of its lack of security
and features.
Don't kill the kitten, don't use MyISAM.

Indeed, keeping such engine alive is an heresy, because it's not ACID
compliant, even if Drupal does not use transactions or referencial
integrity, this is a very bad habit to continue to use it.

Your conclusion makes sens, some hosters continues to use outdated
technos such as MyISAM. But in case you are starting a new architecture
of your own somewhere, I really disencourage you to use MyISAM.
==================
> Could you confirm that row-level locking means that you can do
> multiple INSERTs at the same time?
>
> Suppose that you are going to do 50 INSERTs in one transaction (my
> non-Drupal site does this at one place). Can two users do the 50-
> INSERT operation at the same time? (the application is a markbook/
> gradebook - the inserts are the marks/grades for 50 students).


Well there are some gotcha conditions, if you have a WHERE clause in
your update, it must be using an index, otherwise it has to do a table
scan, and locks the whole table. Also, any inserts using an auto-
increment column will lock the entire table to prevent duplicate values.

> I thought that InnoDB was supposed to be faster for reading. At
> least, for reading rows that are related by index (e.g. users 1-20
> assuming that "user id" is an index).


Please see http://2bits.com/articles/mysql-innodb-performance-gains-as-well-as-some-pitfalls.html
for some examples.

-Mike
==================

> Michael Prasuhn wrote:
>> A lot of confusion here, from others, not just Daniel. Overall the
>> biggest benefit is that InnoDB supports row-level locking versus
>> table-locking on writes.
>
> Could you confirm that row-level locking means that you can do
> multiple INSERTs at the same time?
>
> Suppose that you are going to do 50 INSERTs in one transaction (my
> non-Drupal site does this at one place). Can two users do the
> 50-INSERT operation at the same time? (the application is a
> markbook/gradebook - the inserts are the marks/grades for 50
> students).
>

If the operation is transactionalized and the users are updating some
of the same rows the one who updates last looses because the engine is
smart enough to know that the second update cannot be allowed to happen
because the signature of the row has changed; the user can then be
warned that the data changed prior to his update. Or, in some cases,
the rows can be locked up front so the second user waits on the first
user to release the data rows before he can even view the data. In the
non-transactinalized case the second update wins and the original
update is lost without warning; we see this in the issue queue quite
often with people changing the status column from the changed value to
previous value incidentally.

====================

> It really depends on what you need to do. If you don't know it
> > I'd go for MyISAM since most of the features you may be looking
> > for in InnoDB are not really fully exploited by stock Drupal.

> Well, not yet anyway. http://drupal.org/node/301362

Supporting MyISAM, sqlite and advanced (I'd say standard) features
of serious RDBMS doesn't look a viable way to develop drupal (unless
the DB abstraction layer is outsourced, but this would(?) be more
suited for another flame ;) )

I consider normal to add "ON DELETE CASCADE" actions, but to
replicate it transparently in PHP it is a hell of a work.
And that's just one example.

My point is:
- if you're not using "advanced" features and you're not constrained
by concurrent writes MyISAM is good enough. I'd say MyISAM may be
even better suited to Drupal than InnoDB. None of the two engines
is famous for data integrity after all.
- if you're using advanced features, you're constrained by
concurrent writes, you can't afford data loss and you didn't
invest too much in MySQL there are better engines to look at.

Choosing InnoDB would be just a matter of how much you invested
in MySQL.

I don't think that once you're faced with these problems your hosting
offer is a real concern.
==============
Data integrity is understood as respecting the relationships between tables.
So, if there are several tables in a one to many relationship with the node
table, say, for a CCK based content type, if the node is deleted, then the
child records would also be deleted.

and you would not be allowed to delete a parent without also deleting
children.

===========
yep... but doing this in PHP is a pain, doing it in PHP and in SQL
is a double pain.
And unless support for MyISAM and sqlite is going to be dropped, and
I don't think right now is a good choice, using more "advanced"
DB features in core would be a double pain.

Another thing is helping people to exploit more advanced features in
their modules. Modules may have more exotic requirements than core.

=========
I don't know about sqlite but MyISAM would continue to work even if we
code for the transactional update. It would just lock tables instead
of rows and the cascading would not occur. If warned it would be up to
the user to make a determination to continue with MyISAM or move to a
new provider.

No comments: