Monday, May 25, 2009

LENGTH() and UTF-8

> Hi!
> We are storing UTF-8 data in our MySQL database and we need to get the
> length of the data. But length() doesn't return the number of characters
> but the pure number of bytes.
> SELECT LENGTH('köter') => 6
> Currently we are doing something like that:
> SELECT LENGTH(CONVERT('köter' USING 'ucs2'))/2;
> This works fine but a "real" solution like CHAR_LENGTH() or something like
> that would be really appreciated.



CHAR_LENGTH() already exists. From the MySQL manual entry for CHAR_LENGTH(str):

Returns the length of the string str, measured in characters. A multi-byte
character counts as a single character. This means that for a string
containing five two-byte characters, LENGTH() returns 10, whereas
CHAR_LENGTH() returns 5.

Look at OCTET_LENGTH() and CHAR_LENGTH(). (While OCTET_LENGTH() is a
synonym for LENGTH(), it is the SQL standard way of getting the length
of a string in bytes.)
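
A minimal sketch comparing the three functions on the string from the question. To keep the result independent of the client's connection character set, it builds 'köter' from its UTF-8 bytes (6B C3 B6 74 65 72) instead of typing the literal. That hex construction is an assumption added for illustration and is not part of the original question.

```sql
-- 0x6BC3B6746572 is 'köter' encoded as UTF-8: the ö takes two bytes (C3 B6).
-- CONVERT(... USING utf8) tells the server to read those bytes as UTF-8.
SELECT
    LENGTH(CONVERT(0x6BC3B6746572 USING utf8))       AS length_bytes,  -- 6
    OCTET_LENGTH(CONVERT(0x6BC3B6746572 USING utf8)) AS octet_bytes,   -- 6
    CHAR_LENGTH(CONVERT(0x6BC3B6746572 USING utf8))  AS char_count;    -- 5
```

If the connection character set is already UTF-8, SELECT CHAR_LENGTH('köter') should give the same 5, with no need for the ucs2 conversion and division by 2.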

